PhD Dissertation Proposal Defense: Shiv Shankar, Counterfactual Inference in the Era of User Privacy and Gen-AI Models
Content
Speaker
Shiv Shankar
Abstract
The optimization of user experience through online A/B testing has long been a cornerstone of digital business strategy, driving revenue and informing decision-making. However, the digital ecosystem is rapidly evolving, introducing new challenges that disrupt traditional A/B testing paradigms. Privacy regulations, the deprecation of third-party and first-party cookies, device fragmentation, and the rise of generative AI models are reshaping the landscape, complicating causal inference and necessitating novel methodologies for estimating treatment effects. This dissertation addresses these emerging challenges, proposing innovative approaches to adapt A/B testing methodologies to modern constraints while maintaining robust causal inference.
First, we examine the impact of first-party cookie deprecation, where user tracking across visits is restricted. We establish that the GATE is not identifiable in this setting and propose an alternative methodology, DIET, inspired by the Cox model, which strategically combines sparse tracked user data with larger untracked datasets to improve treatment effect estimation.
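As a rough illustration of this idea (not the DIET estimator itself, which is developed in the dissertation), the sketch below combines a noisy user-level estimate from a small tracked cohort with a lower-variance session-level estimate from a much larger untracked pool via precision weighting; all data, sample sizes, and effect values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a small tracked cohort (user-level outcomes) and a
# much larger untracked pool (session-level outcomes with no user linkage).
y_tracked_t = rng.normal(1.2, 1.0, size=200)       # tracked users, treatment
y_tracked_c = rng.normal(1.0, 1.0, size=200)       # tracked users, control
y_sessions_t = rng.normal(1.15, 1.5, size=20000)   # untracked sessions, treatment
y_sessions_c = rng.normal(1.00, 1.5, size=20000)   # untracked sessions, control

def diff_in_means(t, c):
    """Difference-in-means estimate and its sampling variance."""
    est = t.mean() - c.mean()
    var = t.var(ddof=1) / len(t) + c.var(ddof=1) / len(c)
    return est, var

# Unbiased but noisy estimate from the sparse tracked data.
ate_tracked, var_tracked = diff_in_means(y_tracked_t, y_tracked_c)
# Low-variance (possibly biased) estimate from the large untracked pool.
ate_sessions, var_sessions = diff_in_means(y_sessions_t, y_sessions_c)

# Precision-weighted combination of the two estimates.
w = (1 / var_tracked) / (1 / var_tracked + 1 / var_sessions)
ate_combined = w * ate_tracked + (1 - w) * ate_sessions
print(f"tracked={ate_tracked:.3f}  sessions={ate_sessions:.3f}  combined={ate_combined:.3f}")
```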
Next, we generalize this analysis to the setting of device and/or identity fragmentation, where a user's online persona is split across multiple devices. Assuming access to a superset of each user's devices, we analyze two scenarios: (i) linear models with graph motifs and (ii) non-linear models with additive effects. For both, we derive novel estimators that operate without requiring additional measurements, ensuring robust treatment effect estimation despite fragmented user identities.
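To give a flavor of the fragmentation problem, the toy example below assumes a purely additive linear outcome model at the device level and, for simplicity, the true device grouping rather than a superset; it is not one of the estimators derived in the dissertation, and all quantities are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, devices_per_user, tau = 5000, 3, 0.5

# Hypothetical fragmented data: each user owns several devices, each device is
# independently assigned to treatment, and the device-level outcome is additive
# in that device's own treatment (a deliberately simple linear model).
z = rng.integers(0, 2, size=(n_users, devices_per_user))   # device-level assignments
y_dev = 1.0 + tau * z + rng.normal(0, 1, size=z.shape)      # device-level outcomes

# Without per-user tracking we only observe device groups, so we aggregate
# outcomes within each group and regress the group total on the number of
# treated devices in the group; under the additive model the slope recovers tau.
y_group = y_dev.sum(axis=1)
k_treated = z.sum(axis=1)
X = np.column_stack([np.ones(n_users), k_treated])
beta, *_ = np.linalg.lstsq(X, y_group, rcond=None)
print(f"true per-device effect: {tau}, estimated: {beta[1]:.3f}")
```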
Finally, we explore the implications of generative AI for A/B testing. The proliferation of AI-generated content accelerates testing cycles while introducing unprecedented complexity in treatment variations. Traditional methods struggle to efficiently evaluate these high-dimensional treatments, prompting the need for simulation-based approaches. We propose leveraging large language models (LLMs) as synthetic evaluators, assessing their potential and limitations in treatment effect estimation. Our findings reveal that while off-the-shelf LLMs are insufficient, fine-tuning strategies can enhance their utility in A/B testing applications.
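The schematic sketch below illustrates the LLM-as-synthetic-evaluator idea, not the evaluation protocol studied in the dissertation: `query_llm` is a stub standing in for whatever (possibly fine-tuned) model interface is used, and the personas, prompts, and variants are invented for illustration.

```python
import numpy as np

def query_llm(prompt: str) -> float:
    """Placeholder for an LLM call that returns a numeric preference score.

    In practice this would wrap a (possibly fine-tuned) model; here it is
    stubbed with a prompt-seeded random number so the sketch runs end to end.
    """
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return float(rng.uniform(0, 1))

personas = [
    "a price-sensitive shopper browsing on mobile",
    "a returning customer comparing subscription plans",
    "a first-time visitor arriving from a search ad",
]
variants = {
    "control": "Headline: 'Start your free trial today'",
    "treatment": "Headline: 'Join 1 million users -- free for 30 days'",
}

# Ask the synthetic evaluator to rate each (persona, variant) pair, then
# compare mean scores across variants as a crude simulated treatment effect.
scores = {
    name: np.mean([
        query_llm(f"You are {p}. Rate how likely you are to click:\n{text}")
        for p in personas
    ])
    for name, text in variants.items()
}
print(scores, "simulated lift:", scores["treatment"] - scores["control"])
```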
This dissertation contributes both theoretical and practical advancements to causal inference in evolving digital environments, ensuring that A/B testing remains viable and effective amid technological and regulatory shifts. Our proposed methodologies offer scalable, privacy-compliant, and computationally efficient solutions, bridging the gap between traditional experimentation and the demands of modern digital ecosystems.
Advisor
Ina Fiterau