Zoom Meeting: https://umass-amherst.zoom.us/j/8679070608
This thesis studies different approaches to learning with limited data, including multi-armed bandits, causal inference, and concentration bounds for small samples.

In the first project, we study the effects of approximate inference on the performance of Thompson sampling in the k-armed bandit problem. Thompson sampling is a successful algorithm for online decision-making, but it requires posterior inference, which often must be approximated in practice. We show that even a small constant inference error (in alpha-divergence) can lead to poor performance (linear regret) because the approximation under-explores (for alpha < 1) or over-explores (for alpha > 0). While for alpha > 0 this is unavoidable, for alpha <= 0 the regret can be improved by adding a small amount of forced exploration, even when the inference error is a large constant.

Second, we consider the problem of designing a randomized experiment on a source population to estimate the Average Treatment Effect (ATE) on a target population. We propose a novel approach that explicitly considers the target when designing the experiment on the source. Under the covariate shift assumption, we design an unbiased importance-weighted estimator for the target population's ATE. To reduce the variance of this estimator, we introduce a covariate balance condition (Target Balance) between the treatment and control groups based on the target population. We show that Target Balance achieves a greater variance reduction asymptotically than methods that do not consider the target population during the design phase. Our experiments illustrate that Target Balance reduces the variance even for small sample sizes.

Finally, we examine confidence intervals. Upper and lower confidence bounds on the mean of an unknown distribution, and their associated confidence intervals, are widely used in the scientific literature to characterize the uncertainty of mean estimates. Historically, to bound the mean for small sample sizes,
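The first project's remedy for under-exploration can be illustrated concretely. The sketch below is not the thesis's algorithm: it runs standard Bernoulli Thompson sampling with exact Beta posteriors (the thesis concerns approximate posteriors), and it implements "forced exploration" as a simple epsilon-probability uniform pull, which is one plausible reading of the idea. All names, arm means, and the epsilon value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def thompson_with_forced_exploration(true_means, horizon=5000, epsilon=0.01):
    """Bernoulli Thompson sampling with Beta(1, 1) priors, plus a small
    forced-exploration rate: with probability epsilon, pull a uniformly
    random arm instead of the posterior-sampled one. (Illustrative sketch;
    the thesis's exact forced-exploration scheme may differ.)"""
    k = len(true_means)
    successes = np.ones(k)   # Beta posterior alpha parameters
    failures = np.ones(k)    # Beta posterior beta parameters
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = int(rng.integers(k))                  # forced exploration
        else:
            samples = rng.beta(successes, failures)     # posterior sampling
            arm = int(np.argmax(samples))
        reward = float(rng.random() < true_means[arm])  # Bernoulli reward
        successes[arm] += reward
        failures[arm] += 1.0 - reward
        regret += best - true_means[arm]
    return regret
```

With exact inference the forced exploration is essentially harmless; the thesis's point is that when the posterior is replaced by an alpha-divergence approximation with alpha <= 0, this small uniform component restores sublinear regret.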
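The second project's importance-weighted estimator can be sketched for the simplest case. This is a generic inverse-propensity-weighted estimator reweighted by the target-to-source density ratio, under a known ratio and a fair coin-flip design; it does not include the Target Balance condition itself, and all names and distributions below are illustrative assumptions, not the thesis's experiment.

```python
import numpy as np

rng = np.random.default_rng(1)

def weighted_ate(x, t, y, density_ratio, p=0.5):
    """Importance-weighted ATE estimate for the target population under
    covariate shift: each source unit is reweighted by
    w(x) = p_target(x) / p_source(x), then combined with the usual
    inverse-propensity terms (treatment probability p)."""
    w = density_ratio(x)
    return float(np.mean(w * (t * y / p - (1.0 - t) * y / (1.0 - p))))

# Hypothetical synthetic setup: source covariates x ~ N(0, 1), target
# covariates ~ N(1, 1), outcome y = 2*t + x + noise, so the true ATE is 2
# on both populations.
n = 20000
x = rng.normal(0.0, 1.0, n)               # source covariates
t = rng.integers(0, 2, n).astype(float)   # randomized treatment, p = 0.5
y = 2.0 * t + x + rng.normal(0.0, 0.1, n)

# For these two Gaussians the density ratio is known in closed form:
# N(1, 1)(x) / N(0, 1)(x) = exp(x - 1/2).
ratio = lambda z: np.exp(z - 0.5)
est = weighted_ate(x, t, y, ratio)
```

The weights w(x) make the estimator unbiased for the target ATE but inflate its variance; reducing that variance by balancing covariates with respect to the target distribution at design time is the role of Target Balance.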