The optimistic principle applied to function optimization and planning

Tuesday, 04/29/2014 12:00pm to 1:00pm

Remi Munos

Computer Science Building, Room 151

Faculty Host: Sridhar Mahadevan

Abstract: I will show how the "optimism in the face of uncertainty" principle developed in multi-armed bandits can be extended to address large-scale decision-making problems. This work was initially motivated by the empirical success of Monte-Carlo tree search methods, which were popularized in computer Go and later extended to many other optimization problems. I will report elements of theory that characterize the complexity of the underlying search problems and describe efficient algorithms for function optimization and planning with performance guarantees.

Bio: Remi Munos received his PhD in Cognitive Science from EHESS, France, in 1997 and later did a postdoc at CMU under the supervision of Andrew Moore. From 2000 to 2006 he was Assistant Professor in the department of Applied Mathematics at Ecole Polytechnique. In 2006 he joined the French research institute INRIA as a Senior Researcher and co-created the project-team SequeL (Sequential Learning), which gathers approximately 25 people. His research interests cover several fields of Statistical Learning, including Reinforcement Learning, Optimization, and Bandit Theory.