Privacy-Preserving Machine Learning

Research Mentor: Caiwen Ding

Federated learning (FL) has evolved into an important branch of trustworthy AI that allows multiple clients to jointly train a global machine learning model on their private data. In FL, instead of transferring local raw data, clients exchange their local training outcomes with a server, which aggregates them into a global model. Although this design addresses some data privacy issues, recent studies show that a third party can infer a client's private training data from the publicly shared models. Secure FL solutions tackle this problem, but most existing ones require a trusted model aggregator that can learn the local training outcomes of each client, while others require a large fraction of clients to participate in every training iteration at the cost of protocol complexity.

In this work, we will investigate a secure and highly scalable federated learning framework that (1) eliminates the need for any trusted entity; (2) achieves similar or even better model accuracy compared with secure FL that relies on a trusted aggregator; and (3) produces a correct model even when a majority of clients drop out of the protocol. To improve computation and communication performance and to scale our framework to large ML models, especially deep neural networks (DNNs), we propose an optimization based on structured-matrix compression. This compression integrates with our encryption scheme and simultaneously reduces the volume of local updates and the weight storage.
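
To make the FL exchange concrete, below is a minimal sketch of one federated-averaging round under simplifying assumptions: clients train a small linear-regression model locally and share weight vectors as NumPy arrays. The names local_update and aggregate are illustrative only, not part of the framework described above.

import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    # One client's local training: a few gradient steps on its private
    # data; only the resulting weights leave the client, never (X, y).
    w = global_weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def aggregate(client_weights, client_sizes):
    # Server-side aggregation: a data-size-weighted average of the local
    # models (plain FedAvg, with no privacy protection; the shared
    # weights themselves can still leak information about training data).
    total = sum(client_sizes)
    return sum(n / total * w for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(20):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = aggregate(updates, [len(y) for _, y in clients])
print(global_w)  # approaches true_w after enough rounds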
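
The need for a trusted aggregator arises because plain FedAvg exposes each client's update to the server. The toy sketch below uses pairwise additive masking, in the spirit of secure-aggregation protocols such as Bonawitz et al.; it is one standard way to let the server learn only the sum of updates and is shown here purely for illustration, not as this framework's construction. Tolerating dropped-out clients, as the framework targets, requires additional machinery (e.g., recoverable masks) not shown here.

import numpy as np

rng = np.random.default_rng(1)
n_clients, dim = 4, 3
updates = [rng.normal(size=dim) for _ in range(n_clients)]

# Pairwise masks: for each pair (i, j), client i adds a shared random
# mask and client j subtracts the same mask.
masked = [u.copy() for u in updates]
for i in range(n_clients):
    for j in range(i + 1, n_clients):
        mask = rng.normal(size=dim)
        masked[i] += mask
        masked[j] -= mask

# Each masked update looks random on its own, but the masks cancel in
# the sum, so the server recovers exactly the aggregate it needs.
assert np.allclose(sum(masked), sum(updates))
print(sum(masked))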
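
The structured-matrix idea can be illustrated with a circulant matrix, used here as an assumed stand-in for whatever structure the framework ultimately adopts. A d x d circulant matrix is fully determined by its first column, so weight storage and per-round update traffic drop from O(d^2) to O(d), and matrix-vector products run in O(d log d) via the FFT.

import numpy as np

d = 8
c = np.random.default_rng(2).normal(size=d)  # d parameters instead of d*d

def circulant_matvec(c, x):
    # Multiply the circulant matrix with first column c by x using the
    # convolution theorem: C @ x == ifft(fft(c) * fft(x)).
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

# Verify against the explicit dense circulant matrix C[i, j] = c[(i - j) mod d].
C = np.array([np.roll(c, k) for k in range(d)]).T
x = np.random.default_rng(3).normal(size=d)
assert np.allclose(C @ x, circulant_matvec(c, x))
print("dense params:", d * d, "vs structured params:", d)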