
Robust and privacy preserving distributed machine learning

Authors
  • Talbi, Rania
Publication Date
Nov 19, 2021
Source
HAL
Language
English
License
Unknown
Abstract

With the pervasiveness of digital services, huge amounts of data are continuously generated and collected. Machine Learning (ML) algorithms allow the extraction of hidden yet valuable knowledge from these data and have been applied in numerous domains, such as health care assistance, transportation, user behavior prediction, and many others. In many of these applications, data is collected from different sources, and distributed training is required to learn global models over it. However, when the data is sensitive, running traditional ML algorithms over it can lead to serious privacy breaches by leaking sensitive information about data owners and data users. In this thesis, we propose mechanisms that enhance privacy preservation and robustness in distributed machine learning.

The first contribution of this thesis falls into the category of cryptography-based privacy-preserving machine learning. Many state-of-the-art works propose cryptography-based solutions to ensure privacy preservation in distributed machine learning, but these solutions are known to induce huge overheads in both time and space. In this line of work, we propose PrivML, an outsourced Homomorphic Encryption-based Privacy-Preserving Collaborative Machine Learning framework that optimizes runtime and bandwidth consumption for widely used ML algorithms, using techniques such as ciphertext packing, approximate computation, and parallel computing.

The other contributions of this thesis address robustness issues in Federated Learning, the first framework to ensure privacy by design for distributed machine learning. It has nonetheless been shown that federated learning remains vulnerable to many attacks, among them poisoning attacks, in which participants deliberately use faulty training data to provoke misclassification at inference time. We demonstrate that state-of-the-art poisoning mitigation mechanisms fail to detect some of these attacks, and we propose ARMOR, a poisoning mitigation mechanism for Federated Learning that successfully detects them without hurting model utility.
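The first contribution relies on homomorphic encryption for outsourced, privacy-preserving learning. PrivML's actual protocol and its optimizations (ciphertext packing, approximate computation, parallel computing) are not detailed in this record; as a minimal, insecure illustration of the underlying idea that an untrusted party can compute on encrypted values, the Python sketch below implements textbook Paillier encryption, whose additive homomorphism lets encrypted contributions be summed without decryption. The tiny primes and the integer "updates" are illustrative assumptions, not part of PrivML.

```python
# Textbook Paillier encryption: additively homomorphic, so an untrusted
# aggregator can add ciphertexts without seeing the plaintexts.
# Toy key sizes only; real systems use production-grade HE libraries.
from math import gcd
import random

def lcm(a, b):
    return a * b // gcd(a, b)

def keygen(p=101, q=113):                     # illustrative tiny primes
    n = p * q
    lam = lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                      # modular inverse; valid because g = n + 1
    return (n,), (lam, mu, n)

def encrypt(pk, m):
    (n,) = pk
    n2 = n * n
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    lam, mu, n = sk
    n2 = n * n
    l = (pow(c, lam, n2) - 1) // n            # L(x) = (x - 1) / n
    return (l * mu) % n

pk, sk = keygen()
parts = [12, 30, 7]                           # e.g. per-client contributions, scaled to integers
ciphertexts = [encrypt(pk, m) for m in parts]

# Homomorphic addition: multiplying ciphertexts adds the underlying plaintexts.
n2 = pk[0] * pk[0]
agg = 1
for c in ciphertexts:
    agg = (agg * c) % n2

assert decrypt(sk, agg) == sum(parts)
print("decrypted sum:", decrypt(sk, agg))
```

The remaining contributions concern poisoning in Federated Learning. ARMOR's detection mechanism is not described in the abstract; the sketch below only simulates the threat model it refers to: several clients run federated averaging, one of them poisons its local labels, and the server applies a naive update-distance filter as a stand-in for a real mitigation mechanism. The function names, synthetic data, and threshold are assumptions made for illustration.

```python
# Toy federated averaging with one poisoned participant and a naive
# norm-based filter. This is NOT the ARMOR mechanism from the thesis,
# only a generic sketch of the poisoning threat model.
import numpy as np

rng = np.random.default_rng(0)
dim, n_clients, rounds = 5, 10, 20
true_w = rng.normal(size=dim)

def client_data(seed):
    """Small synthetic local dataset for one client."""
    r = np.random.default_rng(seed)
    X = r.normal(size=(50, dim))
    y = X @ true_w + 0.1 * r.normal(size=50)
    return X, y

datasets = [client_data(s) for s in range(n_clients)]

def local_update(w, X, y, poisoned=False, lr=0.1):
    """One local gradient step; a poisoned client flips its labels."""
    if poisoned:
        y = -y                                 # simple data poisoning
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

w_global = np.zeros(dim)
for _ in range(rounds):
    updates = [local_update(w_global, X, y, poisoned=(i == 0))
               for i, (X, y) in enumerate(datasets)]
    # Naive defence: drop updates whose distance to the global model is far
    # above the median distance (a placeholder for real mitigation logic).
    dists = np.array([np.linalg.norm(u - w_global) for u in updates])
    keep = dists <= 2.0 * np.median(dists)
    w_global = np.mean([u for u, k in zip(updates, keep) if k], axis=0)

print("model error:", np.linalg.norm(w_global - true_w))
```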
