
Communication-Efficient and Byzantine-Robust Distributed Learning

Wednesday, 04/08/2020 12:15pm to 1:15pm
Virtual
Theory Seminar
Speaker: Raj Kumar Maity

To view this live seminar via Zoom visit: https://umass-amherst.zoom.us/j/440256942

Abstract: We develop a communication-efficient distributed learning algorithm that is robust against Byzantine worker machines. We propose and analyze a distributed gradient-descent algorithm that performs simple thresholding based on gradient norms to mitigate Byzantine failures. We show that the (statistical) error rate of our algorithm matches that of Yin et al. (2018), which uses more complicated schemes (such as coordinate-wise median or trimmed mean), and is therefore optimal. Furthermore, for communication efficiency, we consider a generic class of δ-approximate compressors from Karimireddy et al. (2019b) that encompasses sign-based compressors and top-k sparsification. Our algorithm uses compressed gradients for aggregation and gradient norms for Byzantine removal. We establish the statistical error rate of the algorithm for arbitrary (convex or non-convex) smooth loss functions. We show that, in the regime where the compression factor δ is constant and the dimension of the parameter space is fixed, the rate of convergence is not affected by the compression operation, and hence we effectively get the compression for free. Moreover, we analyze the compressed gradient descent algorithm with error feedback (proposed in Karimireddy et al. (2019b)) in the distributed setting and in the presence of adversarial worker machines, and show that exploiting error feedback indeed improves the statistical error rate. Finally, we experimentally validate our results and show good convergence performance on convex (least-squares regression) and non-convex (neural network training) problems.
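To make the aggregation idea concrete, below is a minimal Python sketch of one round of the scheme described in the abstract: each worker sends a compressed gradient plus its gradient norm, and the center discards the workers reporting the largest norms before averaging. Top-k sparsification stands in for the δ-approximate compressor; the function names (`topk_compress`, `robust_aggregate`), the toy least-squares data, and all constants are illustrative assumptions, not details from the talk or paper.

```python
import numpy as np

def topk_compress(grad, k):
    """Illustrative delta-approximate compressor (top-k sparsification):
    keep the k largest-magnitude coordinates, zero out the rest."""
    out = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    out[idx] = grad[idx]
    return out

def robust_aggregate(worker_grads, worker_norms, byzantine_frac):
    """Norm-based thresholding: drop the workers reporting the largest
    gradient norms, then average the remaining compressed gradients."""
    m = len(worker_grads)
    num_keep = m - int(np.ceil(byzantine_frac * m))
    keep = np.argsort(worker_norms)[:num_keep]  # smallest norms are kept
    return np.mean([worker_grads[i] for i in keep], axis=0)

# One toy round with synthetic least-squares data (hypothetical setup).
rng = np.random.default_rng(0)
d, m, k = 20, 10, 5                      # dimension, workers, sparsity
theta = np.zeros(d)
A = rng.normal(size=(m, 50, d))          # each worker's local features
b = rng.normal(size=(m, 50))             # each worker's local targets
grads, norms = [], []
for i in range(m):
    g = A[i].T @ (A[i] @ theta - b[i]) / 50   # local least-squares gradient
    if i < 2:                                  # two hypothetical Byzantine workers
        g = 100.0 * rng.normal(size=d)         # send an arbitrary corrupted vector
    grads.append(topk_compress(g, k))          # compressed gradient for aggregation
    norms.append(np.linalg.norm(g))            # norm for Byzantine removal
theta -= 0.1 * robust_aggregate(grads, norms, byzantine_frac=0.2)
```

In this sketch the corrupted workers report abnormally large norms and are filtered out before averaging, which is the thresholding behavior the abstract refers to.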

Joint work with Avishek Ghosh, Swanand Kadhe, Arya Mazumdar and Kannan Ramchandran.