Rigorous Experimentation for Reinforcement Learning

Friday, 09/16/2022 2:00pm to 4:00pm
Zoom and A215
PhD Thesis Defense
Speaker: Scott Jordan

Abstract: 

Scientific fields advance by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although such experiments are common, it is often challenging to generate accurate knowledge from them because inherent randomness in execution and confounding variables can obscure the correct interpretation of the results. As such, researchers need to hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge generated from them dubious. This dissertation proposes methods to address central issues in RL experimentation.

Evaluating the performance of an RL algorithm is the most common type of experiment in the RL literature. Most performance evaluations are incapable of answering a specific research question and produce misleading results. Thus, the first issue we address is how to create a performance evaluation procedure that holds up to scientific standards.

Despite their prevalence, performance evaluations produce limited knowledge: they can show only how well an algorithm worked, not how or why, and they require significant amounts of time and computational resources to complete. As an alternative, this dissertation proposes scientific experimentation, the process of conducting carefully controlled experiments designed to further the understanding of how an algorithm works. This type of experimentation overcomes many challenges in developing rigorous performance evaluations and can serve as a primary source of knowledge generation.

Lastly, this dissertation provides a case study on policy gradient methods, showing how and when each experimentation method can generate knowledge. As a result, this dissertation can motivate others in the field to adopt more rigorous experimental practices. 

Advisor: Philip Thomas
