Distributed Learning of Distributions via Social Sampling

Abstract: A protocol for distributed estimation of discrete distributions is proposed. Each agent begins with a single sample from the distribution, and the goal is to learn the empirical distribution of the samples. The protocol is based on a simple message-passing model motivated by communication in social networks. Agents sample a message randomly from their current estimates of the distribution, resulting in a protocol with quantized messages. Using tools from stochastic approximation, the algorithm is shown to converge almost surely. Examples illustrate three regimes with different consensus phenomena. Simulations demonstrate this convergence and give some insight into the effect of network topology.

Submitted 5 June, 2014; v1 submitted 20 May, 2013; originally announced May 2013.

Comments: 17 pages, accepted to IEEE Transactions on Automatic Control
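
The message-passing idea in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's update rule: the averaging step, the decaying step size `1 / (t + 1)`, the neighbor lists, and the function names `one_hot` and `social_sampling` are all hypothetical choices made for illustration. What it does take from the abstract is that each agent starts from a single sample and sends one symbol drawn from its current estimate, so every message is quantized.

```python
import random


def one_hot(symbol, num_symbols):
    """Return the point-mass distribution on `symbol`."""
    return [1.0 if k == symbol else 0.0 for k in range(num_symbols)]


def social_sampling(samples, num_symbols, neighbors, rounds, seed=0):
    """Toy simulation of sampled (quantized) message passing.

    Assumed update (NOT taken from the paper): each agent mixes its
    estimate with the average of its neighbors' sampled messages,
    using a decaying step size in the style of stochastic approximation.
    """
    rng = random.Random(seed)
    # Each agent starts from its own single sample.
    estimates = [one_hot(x, num_symbols) for x in samples]
    for t in range(1, rounds + 1):
        step = 1.0 / (t + 1)
        # Every agent draws one symbol from its current estimate.
        messages = [
            rng.choices(range(num_symbols), weights=q)[0] for q in estimates
        ]
        updated = []
        for i, q in enumerate(estimates):
            # Empirical distribution of the messages this agent receives.
            received = [0.0] * num_symbols
            for j in neighbors[i]:
                received[messages[j]] += 1.0 / len(neighbors[i])
            updated.append(
                [(1 - step) * qk + step * rk for qk, rk in zip(q, received)]
            )
        estimates = updated
    return estimates


# Example: six agents on a ring, three symbols.
ring = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
final = social_sampling([0, 0, 1, 2, 1, 0], 3, ring, rounds=200)
```

Because each update is a convex combination of two probability vectors, every agent's estimate remains a valid distribution throughout the run. Whether and where the estimates agree depends on the assumed update and is not claimed here.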

arXiv:1209.2755 [pdf, ps, other]

Relaxing the Gaussian AVC

Authors: Anand D. Sarwate, Michael Gastpar

Abstract: The arbitrarily varying channel (AVC) is a conservative way of modeling an unknown interference, and the corresponding capacity results are pessimistic. We reconsider the Gaussian AVC by relaxing the classical model and thereby weakening the adversarial nature of the interference. We examine three different relaxations. First, we show how a very small amount of common randomness between transmitter and receiver is sufficient to achieve the rates of fully randomized codes. Second, akin to the dirty paper coding problem, we study the impact of an additional interference known to the transmitter. We provide partial capacity results that differ significantly from the standard AVC. Third, we revisit a Gaussian MIMO AVC in which the interference is arbitrary but of limited dimension.

Submitted 12 September, 2012; originally announced September 2012.

Comments: Submitted to the IEEE Transactions on Information Theory

arXiv:1207.2812 [pdf, other]

Near-Optimal Algorithms for Differentially-Private Principal Components

Authors: Kamalika Chaudhuri, Anand D. Sarwate, Kaushik Sinha

Abstract: Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.

Submitted 7 August, 2013; v1 submitted 11 July, 2012; originally announced July 2012.

Comments: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research, preliminary version was at NIPS 2012

arXiv:1204.2587 [pdf, ps, other]

doi 10.1109/TIT.2013.2245721

Upper Bounds on the Capacity of Binary Channels with Causal Adversaries

Authors: Bikash Kumar Dey, Sidharth Jaggi, Michael Langberg, Anand D. Sarwate

Abstract: In this work we consider the communication of information in the presence of a causal adversarial jammer. In the setting under study, a sender wishes to communicate a message to a receiver by transmitting a codeword $(x_1,...,x_n)$ bit-by-bit over a communication channel. The sender and the receiver do not share common randomness. The adversarial jammer can view the transmitted bits $x_i$ one at a time, and can change up to a $p$-fraction of them. However, the decisions of the jammer must be made in a causal manner. Namely, for each bit $x_i$ the jammer's decision on whether to corrupt it or not must depend only on $x_j$ for $j \leq i$. This is in contrast to the "classical" adversarial jamming situations in which the jammer has no knowledge of $(x_1,...,x_n)$, or knows $(x_1,...,x_n)$ completely. In this work, we present upper bounds (that hold under both the average and maximal probability of error criteria) on the capacity which hold for both deterministic and stochastic encoding schemes.

Submitted 13 December, 2012; v1 submitted 11 April, 2012; originally announced April 2012.

Comments: To appear in the IEEE Transactions on Information Theory; shortened version appeared at ISIT 2012

arXiv:0912.0071 [pdf, ps, other]

Differentially Private Empirical Risk Minimization

Authors: Kamalika Chaudhuri, Claire Monteleoni, Anand D. Sarwate

Abstract: Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $ε$-differential privacy definition due to Dwork et al. (2006). First we apply the output perturbation ideas of Dwork et al. (2006), to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.

Submitted 16 February, 2011; v1 submitted 30 November, 2009; originally announced December 2009.

Comments: 40 pages, 7 figures, accepted to the Journal of Machine Learning Research
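
The output perturbation idea mentioned in the abstract above (train first, then add noise to the learned classifier) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the noise density proportional to exp(-beta * ||b||) and the scale beta = n * lambda * epsilon / 2 follow the commonly cited form for regularized ERM with unit-norm features and a 1-Lipschitz loss, and should be checked against the paper before any real use. The function names `sample_noise` and `output_perturbation` are hypothetical, and the trained weight vector is taken as given.

```python
import math
import random


def sample_noise(dim, beta, rng):
    """Draw b in R^dim with density proportional to exp(-beta * ||b||).

    The norm of such a vector is Gamma(shape=dim, scale=1/beta) and its
    direction is uniform on the unit sphere.
    """
    norm = rng.gammavariate(dim, 1.0 / beta)
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    length = math.sqrt(sum(g * g for g in direction))
    return [norm * g / length for g in direction]


def output_perturbation(weights, n_samples, reg_lambda, epsilon, rng=None):
    """Add noise to an already-trained regularized ERM solution.

    ASSUMPTION: beta = n * lambda * epsilon / 2, which presumes feature
    vectors of norm at most 1 and a 1-Lipschitz convex loss.
    """
    rng = rng or random.Random()
    beta = n_samples * reg_lambda * epsilon / 2.0
    noise = sample_noise(len(weights), beta, rng)
    return [w + b for w, b in zip(weights, noise)]


# Example: perturb a (hypothetical) trained 5-dimensional classifier.
private_w = output_perturbation(
    [0.3, -0.1, 0.0, 0.7, 0.2],
    n_samples=1000, reg_lambda=0.01, epsilon=1.0,
    rng=random.Random(0),
)
```

Under these assumptions the expected noise norm is dim / beta, so the perturbation shrinks as the sample size, the regularization strength, or the privacy parameter epsilon grows. Objective perturbation, the method the abstract proposes, instead perturbs the objective before optimization and is not shown here.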

arXiv:0907.1413

Privacy constraints in regularized convex optimization

Authors: Kamalika Chaudhuri, Anand D. Sarwate

Abstract: This paper is withdrawn due to some errors, which are corrected in arXiv:0912.0071v4 [cs.LG].

Submitted 21 June, 2011; v1 submitted 9 July, 2009; originally announced July 2009.

Comments: This paper has been withdrawn by the authors due to some errors. Corrections have been included in arXiv:0912.0071v4

arXiv:0810.2513 [pdf, ps, other]

The Impact of Mobility on Gossip Algorithms

Authors: Anand D. Sarwate, Alexandros G. Dimakis

Abstract: The influence of node mobility on the convergence time of averaging gossip algorithms in networks is studied. It is shown that a small number of fully mobile nodes can yield a significant decrease in convergence time. A method is developed for deriving lower bounds on the convergence time by merging nodes according to their mobility pattern. This method is used to show that if the agents have one-dimensional mobility in the same direction the convergence time is improved by at most a constant. Upper bounds are obtained on the convergence time using techniques from the theory of Markov chains and show that simple models of mobility can dramatically accelerate gossip as long as the mobility paths significantly overlap. Simulations verify that different mobility patterns can have significantly different effects on the convergence of distributed algorithms.

Submitted 21 June, 2011; v1 submitted 14 October, 2008; originally announced October 2008.

Comments: Revised version submitted to IEEE Transactions on Information Theory

arXiv:0711.3926 [pdf, ps, other]

Rateless codes for AVC models

Authors: Anand D. Sarwate, Michael Gastpar

Abstract: The arbitrarily varying channel (AVC) is a channel model whose state is selected maliciously by an adversary. Fixed-blocklength coding assumes a worst-case bound on the adversary's capabilities, which leads to pessimistic results. This paper defines a variable-length perspective on this problem, for which achievable rates are shown that depend on the realized actions of the adversary. Specifically, rateless codes are constructed which require a limited amount of common randomness. These codes are constructed for two kinds of AVC models. In the first the channel state cannot depend on the channel input, and in the second it can. As a byproduct, the randomized coding capacity of the AVC with state depending on the transmitted codeword is found and shown to be achievable with a small amount of common randomness. The results for this model are proved using a randomized strategy based on list decoding.

Submitted 5 October, 2009; v1 submitted 25 November, 2007; originally announced November 2007.

Comments: 14 pages, double column, extended version of paper to appear in the IEEE Transactions on Information Theory

arXiv:0711.0237 [pdf, ps, other]

doi 10.1109/TIT.2009.2034779

Zero-rate feedback can achieve the empirical capacity

Authors: Krishnan Eswaran, Anand D. Sarwate, Anant Sahai, Michael Gastpar

Abstract: The utility of limited feedback for coding over an individual sequence of DMCs is investigated. This study complements recent results showing how limited or noisy feedback can boost the reliability of communication. A strategy with fixed input distribution $P$ is given that asymptotically achieves rates arbitrarily close to the mutual information induced by $P$ and the state-averaged channel. When the capacity achieving input distribution is the same over all channel states, this achieves rates at least as large as the capacity of the state averaged channel, sometimes called the empirical capacity.

Submitted 10 August, 2009; v1 submitted 1 November, 2007; originally announced November 2007.

Comments: Revised version of paper originally submitted to IEEE Transactions on Information Theory, Nov. 2007. This version contains further revisions and clarifications

arXiv:0709.3921 [pdf, ps, other]

doi 10.1109/TSP.2007.908946

Geographic Gossip: Efficient Averaging for Sensor Networks

Authors: Alexandros G. Dimakis, Anand D. Sarwate, Martin J. Wainwright

Abstract: Gossip algorithms for distributed computation are attractive due to their simplicity, distributed nature, and robustness in noisy and uncertain environments. However, using standard gossip algorithms can lead to a significant waste in energy by repeatedly recirculating redundant information. For realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is related to the slow mixing times of random walks on the communication graph. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing geographic routing combined with a simple resampling method, we demonstrate substantial gains over previously proposed gossip protocols. For regular graphs such as the ring or grid, our algorithm improves standard gossip by factors of $n$ and $\sqrt{n}$ respectively. For the more challenging case of random geometric graphs, our algorithm computes the true average to accuracy $ε$ using $O(\frac{n^{1.5}}{\sqrt{\log n}} \log ε^{-1})$ radio transmissions, which yields a $\sqrt{\frac{n}{\log n}}$ factor improvement over standard gossip algorithms. We illustrate these theoretical results with experimental comparisons between our algorithm and standard methods as applied to various classes of random fields.

Submitted 25 September, 2007; originally announced September 2007.

Comments: To appear, IEEE Transactions on Signal Processing
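
To get a feel for the scaling quoted in the abstract above, the two expressions can be evaluated numerically. This is a rough sketch: the constants hidden by the O(·) notation are dropped, so only the growth rates are meaningful, not the absolute numbers. The function names are hypothetical, and the "implied" standard-gossip cost is simply the product of the two quantities stated in the abstract, which works out to n^2 / log n · log(1/ε).

```python
import math


def geographic_gossip_cost(n, eps):
    """n^1.5 / sqrt(log n) * log(1/eps), with O(.) constants dropped."""
    return n ** 1.5 / math.sqrt(math.log(n)) * math.log(1.0 / eps)


def improvement_factor(n):
    """The sqrt(n / log n) factor quoted for random geometric graphs."""
    return math.sqrt(n / math.log(n))


def implied_standard_gossip_cost(n, eps):
    """Cost implied for standard gossip: the product of the two above."""
    return geographic_gossip_cost(n, eps) * improvement_factor(n)


# Example: how the quoted factor grows with the network size.
for n in (100, 1_000, 10_000):
    print(n, round(improvement_factor(n), 1))
```

The factor grows slightly slower than the square root of n, so the quoted savings become more pronounced as the network grows.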

arXiv:cs/0701146 [pdf, ps, other]

State constraints and list decoding for the AVC

Authors: Anand D. Sarwate, Michael Gastpar

Abstract: List decoding for arbitrarily varying channels (AVCs) under state constraints is investigated. It is shown that rates within $ε$ of the randomized coding capacity of AVCs with input-dependent state can be achieved under maximal error with list decoding using lists of size $O(1/ε)$. Under average error an achievable rate region and converse bound are given for lists of size $L$. These bounds are based on two different notions of symmetrizability and do not coincide in general. An example is given that shows that for list size $L$ the capacity may be positive but strictly smaller than the randomized coding capacity. This behavior is different than the situation without state constraints.

Submitted 5 October, 2009; v1 submitted 23 January, 2007; originally announced January 2007.

Comments: 22 pages, significantly changed version submitted to IEEE Transactions on Information Theory

arXiv:cs/0602071 [pdf, ps, other]

Geographic Gossip: Efficient Aggregation for Sensor Networks

Authors: Alexandros G. Dimakis, Anand D. Sarwate, Martin J. Wainwright

Abstract: Gossip algorithms for aggregation have recently received significant attention for sensor network applications because of their simplicity and robustness in noisy and uncertain environments. However, gossip algorithms can waste significant energy by essentially passing around redundant information multiple times. For realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is caused by slow mixing times of random walks on those graphs. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing a simple resampling method, we can demonstrate substantial gains over previously proposed gossip protocols. In particular, for random geometric graphs, our algorithm computes the true average to accuracy $1/n^a$ using $O(n^{1.5}\sqrt{\log n})$ radio transmissions, which reduces the energy consumption by a $\sqrt{\frac{n}{\log n}}$ factor over standard gossip algorithms.

Submitted 19 February, 2006; originally announced February 2006.

Comments: 8 pages total; to appear in Information Processing in Sensor Networks (IPSN) 2006

Showing 51–62 of 62 results for author: Sarwate, A