-
Immersion and Invariance-based Coding for Privacy-Preserving Federated Learning
Authors:
Haleh Hayati,
Carlos Murguia,
Nathan van de Wouw
Abstract:
Federated learning (FL) has emerged as a method to preserve privacy in collaborative distributed learning. In FL, clients train AI models directly on their devices rather than sharing data with a centralized server, which can pose privacy risks. However, it has been shown that despite FL's partial protection of local data privacy, information about clients' data can still be inferred from shared m…
▽ More
Federated learning (FL) has emerged as a method to preserve privacy in collaborative distributed learning. In FL, clients train AI models directly on their devices rather than sharing data with a centralized server, which can pose privacy risks. However, it has been shown that despite FL's partial protection of local data privacy, information about clients' data can still be inferred from shared model updates during training. In recent years, several privacy-preserving approaches have been developed to mitigate this privacy leakage in FL, though they often provide privacy at the cost of model performance or system efficiency. Balancing these trade-offs presents a significant challenge in implementing FL schemes. In this manuscript, we introduce a privacy-preserving FL framework that combines differential privacy and system immersion tools from control theory. The core idea is to treat the optimization algorithms used in standard FL schemes (e.g., gradient-based algorithms) as a dynamical system that we seek to immerse into a higher-dimensional system (referred to as the target optimization algorithm). The target algorithm's dynamics are designed such that, first, the model parameters of the original algorithm are immersed in its parameters; second, it operates on distorted parameters; and third, it converges to an encoded version of the true model parameters from the original algorithm. These encoded parameters can then be decoded at the server to retrieve the original model parameters. We demonstrate that the proposed privacy-preserving scheme can be tailored to offer any desired level of differential privacy for both local and global model parameters, while maintaining the same accuracy and convergence rate as standard FL algorithms.
△ Less
Submitted 25 November, 2024; v1 submitted 25 September, 2024;
originally announced September 2024.
-
Privacy in Cloud Computing through Immersion-based Coding
Authors:
Haleh Hayati,
Nathan van de Wouw,
Carlos Murguia
Abstract:
Cloud computing enables users to process and store data remotely on high-performance computers and servers by sharing data over the Internet. However, transferring data to clouds causes unavoidable privacy concerns. Here, we present a synthesis framework to design coding mechanisms that allow sharing and processing data in a privacy-preserving manner without sacrificing data utility and algorithmi…
▽ More
Cloud computing enables users to process and store data remotely on high-performance computers and servers by sharing data over the Internet. However, transferring data to clouds causes unavoidable privacy concerns. Here, we present a synthesis framework to design coding mechanisms that allow sharing and processing data in a privacy-preserving manner without sacrificing data utility and algorithmic performance. We consider the setup where the user aims to run an algorithm in the cloud using private data. The cloud then returns some data utility back to the user (utility refers to the service that the algorithm provides, e.g., classification, prediction, AI models, etc.). To avoid privacy concerns, the proposed scheme provides tools to co-design: 1) coding mechanisms to distort the original data and guarantee a prescribed differential privacy level; 2) an equivalent-but-different algorithm (referred here to as the target algorithm) that runs on distorted data and produces distorted utility; and 3) a decoding function that extracts the true utility from the distorted one with a negligible error. Then, instead of sharing the original data and algorithm with the cloud, only the distorted data and target algorithm are disclosed, thereby avoiding privacy concerns. The proposed scheme is built on the synergy of differential privacy and system immersion tools from control theory. The key underlying idea is to design a higher-dimensional target algorithm that embeds all trajectories of the original algorithm and works on randomly encoded data to produce randomly encoded utility. We show that the proposed scheme can be designed to offer any level of differential privacy without degrading the algorithm's utility. We present two use cases to illustrate the performance of the developed tools: privacy in optimization/learning algorithms and a nonlinear networked control system.
△ Less
Submitted 9 August, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Infinite Horizon Privacy in Networked Control Systems: Utility/Privacy Tradeoffs and Design Tools
Authors:
Haleh Hayati,
Nathan van de Wouw,
Carlos Murguia
Abstract:
We address the problem of synthesizing distorting mechanisms that maximize infinite horizon privacy for Networked Control Systems (NCSs). We consider stochastic LTI systems where information about the system state is obtained through noisy sensor measurements and transmitted to a (possibly adversarial) remote station via unsecured/public communication networks to compute control actions (a remote…
▽ More
We address the problem of synthesizing distorting mechanisms that maximize infinite horizon privacy for Networked Control Systems (NCSs). We consider stochastic LTI systems where information about the system state is obtained through noisy sensor measurements and transmitted to a (possibly adversarial) remote station via unsecured/public communication networks to compute control actions (a remote LQR controller). Because the network/station is untrustworthy, adversaries might access sensor and control data and estimate the system state. To mitigate this risk, we pass sensor and control data through distorting (privacy-preserving) mechanisms before transmission and send the distorted data through the communication network. These mechanisms consist of a linear coordinate transformation and additive-dependent Gaussian vectors. We formulate the synthesis of the distorting mechanisms as a convex program. In this convex program, we minimize the infinite horizon mutual information (our privacy metric) between the system state and its optimal estimate at the remote station for a desired upper bound on the control performance degradation (LQR cost) induced by the distortion mechanism.
△ Less
Submitted 27 July, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Immersion and Invariance-based Coding for Privacy in Remote Anomaly Detection
Authors:
Haleh Hayati,
Nathan van de Wouw,
Carlos Murguia
Abstract:
We present a framework for the design of coding mechanisms that allow remotely operating anomaly detectors in a privacy-preserving manner. We consider the following problem setup. A remote station seeks to identify anomalies based on system input-output signals transmitted over communication networks. However, it is not desired to disclose true data of the system operation as it can be used to inf…
▽ More
We present a framework for the design of coding mechanisms that allow remotely operating anomaly detectors in a privacy-preserving manner. We consider the following problem setup. A remote station seeks to identify anomalies based on system input-output signals transmitted over communication networks. However, it is not desired to disclose true data of the system operation as it can be used to infer private information. To prevent adversaries from eavesdropping on the network or at the remote station itself to access private data, we propose a privacy-preserving coding scheme to distort signals before transmission. As a next step, we design a new anomaly detector that runs on distorted signals and produces distorted diagnostics signals, and a decoding scheme that allows extracting true diagnostics data from distorted signals without error. The proposed scheme is built on the synergy of matrix encryption and system Immersion and Invariance (I&I) tools from control theory. The idea is to immerse the anomaly detector into a higher-dimensional system (the so-called target system). The dynamics of the target system is designed such that: the trajectories of the original anomaly detector are immersed/embedded in its trajectories, it works on randomly encoded input-output signals, and produces an encoded version of the original anomaly detector alarm signals, which are decoded to extract the original alarm at the user side. We show that the proposed privacy-preserving scheme provides the same anomaly detection performance as standard Kalman filter-based chi-squared anomaly detectors while revealing no information about system data.
△ Less
Submitted 21 November, 2022;
originally announced November 2022.
-
Privacy-Preserving Federated Learning via System Immersion and Random Matrix Encryption
Authors:
Haleh Hayati,
Carlos Murguia,
Nathan van de Wouw
Abstract:
Federated learning (FL) has emerged as a privacy solution for collaborative distributed learning where clients train AI models directly on their devices instead of sharing their data with a centralized (potentially adversarial) server. Although FL preserves local data privacy to some extent, it has been shown that information about clients' data can still be inferred from model updates. In recent…
▽ More
Federated learning (FL) has emerged as a privacy solution for collaborative distributed learning where clients train AI models directly on their devices instead of sharing their data with a centralized (potentially adversarial) server. Although FL preserves local data privacy to some extent, it has been shown that information about clients' data can still be inferred from model updates. In recent years, various privacy-preserving schemes have been developed to address this privacy leakage. However, they often provide privacy at the expense of model performance or system efficiency, and balancing these tradeoffs is a crucial challenge when implementing FL schemes. In this manuscript, we propose a Privacy-Preserving Federated Learning (PPFL) framework built on the synergy of matrix encryption and system immersion tools from control theory. The idea is to immerse the learning algorithm, a Stochastic Gradient Decent (SGD), into a higher-dimensional system (the so-called target system) and design the dynamics of the target system so that: the trajectories of the original SGD are immersed/embedded in its trajectories, and it learns on encrypted data (here we use random matrix encryption). Matrix encryption is reformulated at the server as a random change of coordinates that maps original parameters to a higher-dimensional parameter space and enforces that the target SGD converges to an encrypted version of the original SGD optimal solution. The server decrypts the aggregated model using the left inverse of the immersion map. We show that our algorithm provides the same level of accuracy and convergence rate as the standard FL with a negligible computation cost while revealing no information about the clients' data.
△ Less
Submitted 7 September, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Finite Horizon Privacy of Stochastic Dynamical Systems: A Synthesis Framework for Dependent Gaussian Mechanisms
Authors:
Haleh Hayati,
Carlos Murguia,
Nathan van de Wouw
Abstract:
We address the problem of synthesizing distorting mechanisms that maximize privacy of stochastic dynamical systems. Information about the system state is obtained through sensor measurements. This data is transmitted to a remote station through an unsecured/public communication network. We aim to keep part of the system state private (a private output); however, because the network is unsecured, a…
▽ More
We address the problem of synthesizing distorting mechanisms that maximize privacy of stochastic dynamical systems. Information about the system state is obtained through sensor measurements. This data is transmitted to a remote station through an unsecured/public communication network. We aim to keep part of the system state private (a private output); however, because the network is unsecured, adversaries might access sensor data and input signals, which can be used to estimate private outputs. To prevent an accurate estimation, we pass sensor data and input signals through a distorting (privacy-preserving) mechanism before transmission, and send the distorted data to the trusted user. These mechanisms consist of a coordinate transformation and additive dependent Gaussian vectors. We formulate the synthesis of the distorting mechanisms as a convex program, where we minimize the mutual information (our privacy metric) between an arbitrarily large sequence of private outputs and the disclosed distorted data for desired distortion levels -- how different actual and distorted data are allowed to be.
△ Less
Submitted 3 August, 2021;
originally announced August 2021.