
Practical Methods for High-Dimensional Data Publication with Differential Privacy

Wednesday, 03/30/2022 1:00pm to 3:00pm
Hybrid - LGRC A215 and Zoom
PhD Thesis Defense
Speaker: Ryan McKenna

Abstract:

In recent years, differential privacy has seen significant growth and has been widely embraced by the research community as the dominant privacy definition. Much progress has been made on designing theoretically principled and practically sound privacy mechanisms. There have even been some real-world deployments of differential privacy, although it has not yet seen widespread adoption. One challenge is that for some problems, there is a gap between the privacy budget required for a meaningful privacy guarantee and the budget required to retain data utility. A second challenge is that many privacy mechanisms have trouble scaling to high-dimensional data, limiting their applicability to real-world data.

In this work, we take significant steps towards addressing these challenges by designing mechanisms and tools that mitigate this gap and scale effectively to high-dimensional settings. This thesis consists of three high-level contributions. In Chapter 2, we present HDMM, a mechanism for linear query answering under differential privacy that scales effectively to large multi-dimensional domains while providing more utility than a large body of prior work. In Chapter 3, we present PrivatePGM, a general-purpose post-processing tool that can estimate a discrete data distribution from noisy observations, improving the utility and scalability of many existing mechanisms at no cost to privacy. In Chapter 4, we present AIM, a mechanism for differentially private synthetic data generation that leverages PrivatePGM to scale to high-dimensional settings, while introducing a number of novel components to overcome the utility limitations of prior work.

Advisor: Gerome Miklau

Join via Zoom