Content

Speaker:

Shiv Shankar

Abstract:

The optimization of user experience through online A/B testing has long been a cornerstone of digital business strategy, driving revenue and informing decision-making. However, the digital ecosystem is rapidly evolving, introducing new challenges that disrupt traditional A/B testing paradigms. Privacy regulations, the deprecation of third-party and first-party cookies, device fragmentation, and the rise of generative AI models are reshaping the landscape, complicating causal inference and necessitating novel methodologies for estimating treatment effects. This dissertation addresses these emerging challenges, proposing innovative approaches to adapt A/B testing methodologies to modern constraints while maintaining robust causal inference.

First, we examine the impact of first-party cookie deprecation, where user tracking across visits is restricted. We establish the non-identifiability of the global average treatment effect (GATE) in this setting and propose DIET, a method inspired by the Cox model that strategically combines sparse tracked user data with larger untracked datasets to improve treatment effect estimation.

Next, we generalize this to the setting of device and/or identity fragmentation, where a user’s online persona is split across multiple devices. Assuming access to a superset of a user’s devices, we analyze two scenarios: (i) linear models with graph motifs and (ii) non-linear models with additive effects. For both, we derive novel estimators that operate without requiring additional measurements, ensuring robust treatment effect estimation despite fragmented user identities.

Finally, we explore the implications of foundation models for A/B testing. Traditional methods struggle to efficiently evaluate the high-dimensional treatments that such models make possible, prompting the need for simulation-based approaches. We propose leveraging large foundation models as synthetic evaluators, assessing their potential and limitations in treatment effect estimation. Our findings reveal that while off-the-shelf LLMs are insufficient, fine-tuning strategies can enhance their utility in A/B testing applications. We measure the efficacy of the proposed method through a retrospective analysis of a real drug trial.

This dissertation contributes both theoretical and practical advancements to causal inference in evolving digital environments, ensuring that A/B testing remains viable and effective amid technological and regulatory shifts. Our proposed methodologies offer scalable, privacy-compliant, and computationally efficient solutions, bridging the gap between traditional experimentation and the demands of modern digital ecosystems.

Advisor: 

Ina Fiterau