Robust Fitted-Q-Evaluation and Iteration Under Sequentially Exogenous Unobserved Confounders

Thursday, 09/14/2023 12:00pm to 1:00pm
Computer Science Building, Room 150/151; Virtual via Zoom
Machine Learning and Friends Lunch

Abstract: Offline reinforcement learning is important in domains such as medicine, economics, and e-commerce, where online experimentation is costly, dangerous, or unethical, and where the true model is unknown. However, most methods assume that all covariates used in the behavior policy's action decisions are observed. This untestable assumption may be incorrect. We study robust policy evaluation and policy optimization in the presence of unobserved confounders. We assume that the extent of possible unobserved confounding can be bounded by a sensitivity model, and that the unobserved confounders are sequentially exogenous. We propose and analyze an (orthogonalized) robust fitted-Q-iteration that uses closed-form solutions of the robust Bellman operator to derive a loss minimization problem for the robust Q function. Our algorithm enjoys the computational ease of fitted-Q-iteration and statistical improvements (reduced dependence on quantile estimation error) from orthogonalization. We provide sample complexity bounds and insights, and show effectiveness in simulations.

Bio: Angela is an Assistant Professor at the University of Southern California Marshall School of Business in Data Sciences and Operations. Her research interests are in statistical machine learning for data-driven sequential decision making under uncertainty, causal inference, and the interplay of statistics and optimization. She is particularly interested in applications-motivated methodology with guarantees in order to bridge method and practice. She was a co-program chair for the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO).

For more information about this event or to obtain the Zoom link, please see the event announcements from MLFL on the college email lists or contact Wenlong Zhao.