-
Incentivize Contribution and Learn Parameters Too: Federated Learning with Strategic Data Owners
Authors:
Drashthi Doshi,
Aditya Vema Reddy Kesari,
Swaprava Nath,
Avishek Ghosh,
Suhas S Kowshik
Abstract:
Classical federated learning (FL) assumes that the clients have a limited amount of noisy data with which they voluntarily participate and contribute towards learning a global, more accurate model in a principled manner. The learning happens in a distributed fashion without sharing the data with the center. However, these methods do not consider the incentive of an agent for participating and contributing to the process, given that data collection and running a distributed algorithm are costly for the clients. The question of rationality of contribution has been asked recently in the literature and some results exist that consider this problem. This paper addresses the question of simultaneous parameter learning and incentivizing contribution, which distinguishes it from the extant literature. Our first mechanism incentivizes each client to contribute to the FL process at a Nash equilibrium and simultaneously learn the model parameters. However, this equilibrium outcome can be far from the optimal one, where clients contribute with their full data and the algorithm learns the optimal parameters. We propose a second mechanism with monetary transfers that is budget balanced and enables the full data contribution along with optimal parameter learning. Large scale experiments with real (federated) datasets (CIFAR-10, FeMNIST, and Twitter) show that these algorithms converge quite fast in practice, yield good welfare guarantees, and deliver better model performance for all agents.
Submitted 17 May, 2025;
originally announced May 2025.
-
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Authors:
Suhas S Kowshik,
Abhishek Divekar,
Vijit Malik
Abstract:
Large language models (LLMs) have demonstrated remarkable performance in diverse tasks using zero-shot and few-shot prompting. Even though their capabilities of data synthesis have been studied well in recent years, the generated data suffers from a lack of diversity, less adherence to the prompt, and potential biases that creep into the data from the generator model. In this work, we tackle the challenge of generating datasets with high diversity, upon which a student model is trained for downstream tasks. Taking the route of decoding-time guidance-based approaches, we propose CorrSynth, which generates data that is more diverse and faithful to the input prompt using a correlated sampling strategy. Further, our method overcomes the complexity drawbacks of some other guidance-based techniques like classifier-based guidance. With extensive experiments, we show the effectiveness of our approach and substantiate our claims. In particular, we perform intrinsic evaluation to show the improvements in diversity. Our experiments show that CorrSynth improves both student metrics and intrinsic metrics upon competitive baselines across four datasets, showing the innate advantage of our method.
Submitted 13 November, 2024;
originally announced November 2024.
-
Improved bounds for the many-user MAC
Authors:
Suhas S Kowshik
Abstract:
The many-user MAC is an important model for understanding energy efficiency of massive random access in 5G and beyond. The model was introduced in Polyanskiy'2017 for the AWGN channel; subsequent works have provided improved bounds on the asymptotic minimum energy-per-bit required to achieve a target per-user error at a given user density and payload, going beyond the AWGN setting. The best known rigorous bounds use spatially coupled codes along with the optimal AMP algorithm. But these bounds are infeasible to compute beyond a few (around 10) bits of payload. In this paper, we provide new achievability bounds for the many-user AWGN and quasi-static Rayleigh fading MACs using the spatially coupled codebook design along with a scalar AMP algorithm. The obtained bounds are computable even up to 100 bits and outperform the previous ones at this payload.
Submitted 3 January, 2022;
originally announced January 2022.
-
Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
Authors:
Prateek Jain,
Suhas S Kowshik,
Dheeraj Nagaraj,
Praneeth Netrapalli
Abstract:
We consider the setting of vector valued non-linear dynamical systems $X_{t+1} = φ(A^* X_t) + η_t$, where $η_t$ is unbiased noise and $φ: \mathbb{R} \to \mathbb{R}$ is a known link function that satisfies a certain {\em expansivity property}. The goal is to learn $A^*$ from a single trajectory $X_1,\cdots,X_T$ of {\em dependent or correlated} samples. While the problem is well-studied in the linear case, where $φ$ is identity, with optimal error rates even for non-mixing systems, existing results in the non-linear case hold only for mixing systems. In this work, we improve existing results for learning nonlinear systems in a number of ways: a) we provide the first offline algorithm that can learn non-linear dynamical systems without the mixing assumption, b) we significantly improve upon the sample complexity of existing results for mixing systems, c) in the much harder one-pass, streaming setting we study an SGD with Reverse Experience Replay ($\mathsf{SGD-RER}$) method, and demonstrate that for mixing systems, it achieves the same sample complexity as our offline algorithm, d) we justify the expansivity assumption by showing that for the popular ReLU link function -- a non-expansive but easy to learn link function with i.i.d. samples -- any method would require exponentially many samples (with respect to dimension of $X_t$) from the dynamical system. We validate our results via simulations and demonstrate that a naive application of SGD can be highly sub-optimal. Indeed, our work demonstrates that for correlated data, specialized methods designed for the dependency structure in data can significantly outperform standard SGD based methods.
Submitted 1 December, 2021; v1 submitted 24 May, 2021;
originally announced May 2021.
-
Streaming Linear System Identification with Reverse Experience Replay
Authors:
Prateek Jain,
Suhas S Kowshik,
Dheeraj Nagaraj,
Praneeth Netrapalli
Abstract:
We consider the problem of estimating a linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms, which is encountered in several applications including reinforcement learning (RL) and time-series analysis. While the LTI system estimation problem is well-studied in the {\em offline} setting, the practically important streaming/online setting has received little attention. Standard streaming methods like stochastic gradient descent (SGD) are unlikely to work since streaming points can be highly correlated. In this work, we propose a novel streaming algorithm, SGD with Reverse Experience Replay ($\mathsf{SGD}-\mathsf{RER}$), that is inspired by the experience replay (ER) technique popular in the RL literature. $\mathsf{SGD}-\mathsf{RER}$ divides data into small buffers and runs SGD backwards on the data stored in the individual buffers. We show that this algorithm exactly deconstructs the dependency structure and obtains information theoretically optimal guarantees for both parameter error and prediction error. Thus, we provide the first -- to the best of our knowledge -- optimal SGD-style algorithm for the classical problem of linear system identification with a first order oracle. Furthermore, $\mathsf{SGD}-\mathsf{RER}$ can be applied to more general settings like sparse LTI identification with known sparsity pattern, and non-linear dynamical systems. Our work demonstrates that the knowledge of data dependency structure can aid us in designing statistically and computationally efficient algorithms which can "decorrelate" streaming samples.
Submitted 1 December, 2021; v1 submitted 10 March, 2021;
originally announced March 2021.
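Illustration for the entry above: the abstract says SGD-RER splits the stream into small buffers and runs SGD backwards within each buffer. The following is a minimal sketch of that idea for a scalar system $x_{t+1} = a^* x_t + η_t$, not the authors' implementation. The gap between buffers, the tail averaging of iterates, the step size, and the names `simulate_lti` and `sgd_rer` are all assumptions made for this example.

```python
import random


def simulate_lti(a_star, T, seed=0):
    """Generate one trajectory of the scalar system x_{t+1} = a_star * x_t + noise."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(T):
        xs.append(a_star * xs[-1] + rng.gauss(0.0, 1.0))
    return xs


def sgd_rer(xs, buffer_size=50, gap=10, step=0.02):
    """Sketch of SGD with reverse experience replay on a scalar trajectory.

    Each buffer holds `buffer_size` consecutive (x_t, x_{t+1}) pairs and is
    processed from its newest pair to its oldest. `gap` pairs are skipped
    between buffers (an assumption of this sketch). The returned estimate is
    the average of the second half of the iterates (also an assumption).
    """
    n_pairs = len(xs) - 1
    if n_pairs < buffer_size:
        raise ValueError("trajectory is shorter than one buffer")
    a = 0.0
    iterates = []
    stride = buffer_size + gap
    for start in range(0, n_pairs - buffer_size + 1, stride):
        # Reverse order inside the buffer: newest pair first.
        for t in reversed(range(start, start + buffer_size)):
            # Gradient of 0.5 * (a * x_t - x_{t+1})^2 with respect to a.
            grad = (a * xs[t] - xs[t + 1]) * xs[t]
            a -= step * grad
            iterates.append(a)
    tail = iterates[len(iterates) // 2:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    trajectory = simulate_lti(a_star=0.5, T=20000, seed=1)
    print("estimate of a*:", sgd_rer(trajectory))
```

Replacing `reversed(range(...))` with `range(...)` gives plain forward SGD over the same buffers, which makes it easy to compare the two orderings on the same trajectory.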
-
Energy efficient coded random access for the wireless uplink
Authors:
Suhas S Kowshik,
Kirill Andreev,
Alexey Frolov,
Yury Polyanskiy
Abstract:
We discuss the problem of designing channel access architectures for enabling fast, low-latency, grant-free and uncoordinated uplink for densely packed wireless nodes. Specifically, we study random-access codes, previously introduced for the AWGN multiple-access channel (MAC) by Polyanskiy'2017, in the practically more relevant case of users subject to Rayleigh fading, when channel gains are unknown to the decoder. We propose a random coding achievability bound, which we analyze both non-asymptotically (at finite blocklength) and asymptotically. As a candidate practical solution, we propose an explicit sparse-graph based coding scheme together with an alternating belief-propagation decoder. The latter's performance is found to be surprisingly close to the finite-blocklength bounds. Our main findings are twofold. First, just like in the AWGN MAC we see that jointly decoding a large number of users leads to a surprising phase transition effect, where at spectral efficiencies below a critical threshold (5-15 bps/Hz depending on reliability) a perfect multi-user interference cancellation is possible. Second, while the presence of Rayleigh fading significantly increases the minimal required energy-per-bit $E_b/N_0$ (from about 0-2 dB to about 8-11 dB), the inherent randomization introduced by the channel makes it much easier to attain the optimal performance via iterative schemes.
In all, it is hoped that a principled definition of the random-access model together with our information-theoretic analysis will open the road towards unified benchmarking and performance comparison of various random-access solutions, such as the currently discussed candidates (MUSA, SCMA, RSMA) for 5G/6G.
Submitted 22 July, 2019;
originally announced July 2019.
-
Fundamental limits of many-user MAC with finite payloads and fading
Authors:
Suhas S Kowshik,
Yury Polyanskiy
Abstract:
Consider a (multiple-access) wireless communication system where users are connected to a unique base station over shared-spectrum radio links. Each user has a fixed number $k$ of bits to send to the base station, and his signal gets attenuated by a random channel gain (quasi-static fading). In this paper we consider the many-user asymptotics of Chen-Chen-Guo'2017, where the number of users grows linearly with the blocklength. Differently, though, we adopt a per-user probability of error (PUPE) criterion (as opposed to the classical joint-error probability criterion). Under PUPE, finite energy-per-bit communication is possible, and we are able to derive bounds on the tradeoff between energy and spectral efficiencies. We reconfirm the curious behaviour (previously observed for non-fading MAC) of the possibility of almost perfect multi-user interference (MUI) cancellation for user densities below a critical threshold. Further, we demonstrate the suboptimality of standard solutions such as orthogonalization (i.e., TDMA/FDMA) and treating interference as noise (i.e., pseudo-random CDMA without multi-user detection). Notably, the problem treated here can be seen as a variant of support recovery in compressed sensing for the unusual definition of sparsity with one non-zero entry per contiguous section of $2^k$ coordinates. This identifies our problem with that of the sparse regression codes (SPARCs) and hence our results can be equivalently understood in the context of SPARCs with sections of length $2^{100}$. Finally, we discuss the relation of the almost perfect MUI cancellation property and the replica-method predictions.
Submitted 14 June, 2021; v1 submitted 20 January, 2019;
originally announced January 2019.
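Note on the two error criteria contrasted in the entry above: a sketch in notation assumed here rather than taken from the abstract, with $K$ users, $W_j$ the message of user $j$, and $\hat{W}_j$ the decoder's estimate of it.

$$P_{e}^{\mathrm{PUPE}} = \frac{1}{K}\sum_{j=1}^{K} \mathbb{P}\big[W_j \neq \hat{W}_j\big], \qquad P_{e}^{\mathrm{joint}} = \mathbb{P}\Big[\bigcup_{j=1}^{K}\big\{W_j \neq \hat{W}_j\big\}\Big].$$

Each term in the PUPE average is at most the joint-error probability, so $P_{e}^{\mathrm{PUPE}} \le P_{e}^{\mathrm{joint}}$: PUPE is the weaker requirement, which is consistent with finite energy-per-bit becoming achievable under it.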