Content

Speaker

Shahrooz Pouryousef

Abstract

The optimization of user experience through online A/B testing has long been a cornerstone of digital business strategy, driving revenue and informing decision-making. However, the digital ecosystem is rapidly evolving, introducing new challenges that disrupt traditional A/B testing paradigms. Privacy regulations, the deprecation of third-party and first-party cookies, device fragmentation, and the rise of generative AI models are reshaping the landscape, complicating causal inference and necessitating novel methodologies for estimating treatment effects. This dissertation addresses these emerging challenges, proposing innovative approaches to adapt A/B testing methodologies to modern constraints while maintaining robust causal inference.

First, we examine the impact of first-party cookie deprecation, where user tracking across visits is restricted. We establish the non-identifiability of GATE in this setting and propose DIET, an alternative method inspired by the Cox model that strategically combines sparse tracked user data with larger untracked datasets to improve treatment effect estimation.

Next, we generalize this to the arena of device and/or identity fragmentation, where a user’s online persona is split across multiple devices. Assuming access to a superset of a user’s devices, we analyze two scenarios: (i) linear models with graph motifs and (ii) non-linear models with additive effects. For both, we derive novel estimators that operate without requiring additional measurements, ensuring robust treatment effect estimation despite fragmented user identities.

Finally, we explore the implications of generative AI on A/B testing. The proliferation of AI-generated content accelerates testing cycles while introducing unprecedented complexity in treatment variations. Traditional methods struggle to efficiently evaluate these high-dimensional treatments, prompting the need for simulation-based approaches. We propose leveraging large language models (LLMs) as synthetic evaluators, assessing their potential and limitations in treatment effect estimation. Our findings reveal that while off-the-shelf LLMs are insufficient, fine-tuning strategies can enhance their utility in A/B testing applications.

This dissertation contributes both theoretical and practical advancements to causal inference in evolving digital environments, ensuring that A/B testing remains viable and effective amid technological and regulatory shifts. Our proposed methodologies offer scalable, privacy-compliant, and computationally efficient solutions, bridging the gap between traditional experimentation and the demands of modern digital ecosystems.

Advisor

Don Towsley

Hybrid event posted in PhD Thesis Defense