Improving Exploratory Behavior by Leveraging Prior Knowledge

Wednesday, 03/28/2018 3:30pm to 5:00pm
Computer Science Building, Room 140
Ph.D. Dissertation Proposal Defense

"Improving Exploratory Behavior by Leveraging Prior Knowledge"

The area of reinforcement learning (RL) has seen a surge in interest in recent years. One of the central problems in RL is the exploration-exploitation dilemma: when should a learning agent act to gather more information about the environment it interacts with (exploration), and when should it act according to what it has learned so far (exploitation)? There has been a significant amount of research on how best to explore given the information gathered so far about the task the agent is currently trying to solve. However, the existing body of work largely ignores the possibility that an agent might have experienced similar tasks in the past.
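
To make the dilemma concrete, below is a minimal sketch of epsilon-greedy action selection on a toy multi-armed bandit: with probability epsilon the agent explores a random action, and otherwise it exploits its current value estimates. This is a standard textbook baseline shown only to illustrate the trade-off, not a method from the proposal; all names and parameters (N_ARMS, EPSILON, and so on) are illustrative.

    import random

    N_ARMS = 5      # number of actions
    EPSILON = 0.1   # probability of taking a random (exploratory) action
    N_STEPS = 10000

    # Unknown true mean reward of each arm; the agent must discover these.
    true_means = [random.gauss(0.0, 1.0) for _ in range(N_ARMS)]

    estimates = [0.0] * N_ARMS  # running estimate of each arm's value
    counts = [0] * N_ARMS       # number of pulls per arm

    for _ in range(N_STEPS):
        if random.random() < EPSILON:
            arm = random.randrange(N_ARMS)  # explore
        else:
            arm = max(range(N_ARMS), key=lambda a: estimates[a])  # exploit
        reward = random.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental mean update: Q <- Q + (r - Q) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    print("best arm:", max(range(N_ARMS), key=lambda a: true_means[a]))
    print("most pulled arm:", max(range(N_ARMS), key=lambda a: counts[a]))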

The ability to adapt previous knowledge to new situations is a hallmark of animal intelligence, and if we ever hope to build truly autonomous learning agents, it is a critical component that must not be overlooked. Given prior experience with related situations, a learning agent should be able to guide its exploratory efforts using that acquired knowledge.

This thesis proposal will introduce ideas on how to leverage prior knowledge to guide exploration efforts. First, we propose using trajectory samples to identify and exploit recurrent behavior found in optimal policies. Second, we propose an MDP formulation for directly optimizing how an agent should behave during exploration. Finally, we propose integrating external information through co-occurrence graphs to make informed decisions when exploring new environments.
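
As a purely hypothetical illustration of the third idea (this is not the proposal's actual method, and all objects and counts below are made up), external co-occurrence statistics could be used to rank which objects an agent tries to interact with first in a new environment:

    # Co-occurrence counts gathered from some external corpus (made-up numbers).
    co_occurrence = {
        ("key", "door"): 120,
        ("key", "chest"): 45,
        ("torch", "door"): 8,
        ("torch", "cave"): 90,
    }

    def priority(obj, goal):
        """Score an object by how often it co-occurs with the goal object."""
        return co_occurrence.get((obj, goal), 0) + co_occurrence.get((goal, obj), 0)

    # With the goal of opening a "door", the agent would try the "key"
    # before the "torch" when deciding what to explore.
    ranked = sorted(["key", "torch"], key=lambda o: priority(o, "door"), reverse=True)
    print(ranked)  # ['key', 'torch']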

Advisor: Philip Thomas