Emerging Trustworthiness Issues in Distributed Learning Systems

11 Apr

Tuesday, 04/11/2023 10:00am to 12:00pm

Zoom

PhD Thesis Defense

A distributed learning system allocates learning processes onto several workstations to enable faster learning algorithms. Federated Learning (FL) is an increasingly popular type of distributed learning which allows mutually untrusted clients to collaboratively train a common machine learning model without sharing their private/proprietary training data with each other. In this dissertation, we aim to address emerging trustworthiness issues in distributed learning systems, particularly in the field of FL.

First, we tackle the issue of robustness in FL and demonstrate its susceptibility by presenting a comprehensive analysis of the various poisoning attacks and defensive aggregation rules proposed in the literature and connecting them under a common framework. To address this issue, we propose Federated Rank Learning (FRL) which reduces the space of client updates from a continuous space of float numbers in standard FL to a discrete space of integer values, limiting the adversary's options for poisoning attacks.

Next, we address the privacy concerns in FL, including access privacy and data privacy. An adversarial server in FL gets information about the data distribution of a target client by monitoring either I) local updates that the target submits throughout the FL training or II) the access pattern of the target, which can be privacy sensitive in many real-world scenarios. To preserve access privacy, we design Heterogeneous Private Information Retrieval (HPIR), which allows clients to fetch their specific model parameters from untrusted servers without leaking any information. We believe that HPIR will enable new application scenarios for private distributed learning systems, as well as improve the usability of some of the known applications of PIR. To preserve data privacy, we show that local rankings leak less information about private training data. We conduct a comprehensive investigation on the privacy of rankings in FRL to measure data leakage compared to weight parameter updates in standard FL in presence of the state-of-the-art white-box membership inference attack.

Finally, we address the issue of fairness in FL where a single model cannot represent all clients equally due to heterogeneity in their data distributions.To alleviate this issue, we propose Equal and Equitable Federated Learning (E2FL). E2FL produces fair federated learning models by preserving both equity and equality among the participating clients based on learning on parameter rankings where multiple global models are learned so that each group of clients can benefit from their personalized model.

Advisor: Amir Houmansadr

Join via Zoom

Emerging Trustworthiness Issues in Distributed Learning Systems

Subscribe to the CICS eNewsletter