Search | arXiv e-print repository

Differentially Private Secure Multiplication with Erasures and Adversaries

Abstract: We consider a private distributed multiplication problem involving $N$ computation nodes and $T$ colluding nodes. Shamir's secret sharing algorithm provides perfect information-theoretic privacy, while requiring an honest majority, i.e., $N \ge 2T + 1$. Recent work has investigated approximate computation and characterized privacy-accuracy trade-offs for the honest minority setting $N \le 2T$ for real-valued data, quantifying privacy leakage via the differential privacy (DP) framework and accuracy via the mean squared error. However, it does not incorporate the error correction capabilities of Shamir's secret-sharing algorithm. This paper develops a new polynomial-based coding scheme for secure multiplication with an honest minority, and characterizes its achievable privacy-utility tradeoff, showing that the tradeoff can approach the converse bound as closely as desired. Unlike previous schemes, the proposed scheme inherits the capability of the Reed-Solomon (RS) code to tolerate erasures and adversaries. We utilize a modified Berlekamp-Welch algorithm over the real number field to detect adversarial nodes.

Submitted 29 April, 2025; originally announced April 2025.

Comments: A short version of this article was accepted at ISIT 2025

arXiv:2410.05540 [pdf, other]

Game of Coding: Sybil Resistant Decentralized Machine Learning with Minimal Trust Assumption

Authors: Hanzaleh Akbari Nodehi, Viveck R. Cadambe, Mohammad Ali Maddah-Ali

Abstract: Coding theory plays a crucial role in ensuring data integrity and reliability across various domains, from communication to computation and storage systems. However, its reliance on trust assumptions for data recovery poses significant challenges, particularly in emerging decentralized systems where trust is scarce. To address this, the game of coding framework was introduced, offering insights into strategies for data recovery within incentive-oriented environments. The focus of the earliest version of the game of coding was limited to scenarios involving only two nodes. This paper investigates the implications of increasing the number of nodes in the game of coding framework, particularly focusing on scenarios with one honest node and multiple adversarial nodes. We demonstrate that despite the increased flexibility for the adversary with an increasing number of adversarial nodes, having more power is not beneficial for the adversary and is not detrimental to the data collector, making this scheme sybil-resistant. Furthermore, we outline optimal strategies for the data collector in terms of accepting or rejecting the inputs, and characterize the optimal noise distribution for the adversary.

Submitted 16 October, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

arXiv:2407.03289 [pdf, other]

Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation

Authors: Sajani Vithana, Viveck R. Cadambe, Flavio P. Calmon, Haewon Jeong

Abstract: Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of $d$-dimensional vectors held by $n$ users while ensuring $(ε,δ)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SA) are the most common notions of DP used in DP-DME settings with an untrusted server. LDP provides strong resilience to dropouts, colluding users, and adversarial attacks, but suffers from poor utility. In contrast, SA-based DP-DME achieves an $O(n)$ utility gain over LDP in DME, but requires increased communication and computation overheads and complex multi-round protocols to handle dropouts and attacks. In this work, we present a generalized framework for DP-DME that captures LDP and SA-based mechanisms as extreme cases. Our framework provides a foundation for developing and analyzing a variety of DP-DME protocols that leverage correlated privacy mechanisms across users. To this end, we propose CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism that spans the gap between DME with LDP and distributed DP. We prove that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion. We provide an information-theoretic analysis of CorDP-DME, and derive theoretical guarantees for utility under any given privacy parameters and dropout/colluding user thresholds. Our results demonstrate that (anti) correlated Gaussian DP mechanisms can significantly improve utility in mean estimation tasks compared to LDP -- even in adversarial settings -- while maintaining better resilience to dropouts and attacks compared to distributed DP.

Submitted 8 January, 2025; v1 submitted 3 July, 2024; originally announced July 2024.

arXiv:2405.06641 [pdf, other]

On Existence of Latency Optimal Uncoded Storage Schemes in Geo-Distributed Data Storage Systems

Authors: Srivathsa Acharya, P. Vijay Kumar, Viveck R. Cadambe

Abstract: We consider the problem of geographically distributed data storage in a network of servers (or nodes) where the nodes are connected to each other via communication links having certain round-trip times (RTTs). Each node serves a specific set of clients, where a client can request any of the files available in the distributed system. The parent node provides the requested file if available locally; else it contacts other nodes that have the data needed to retrieve the requested file. This inter-node communication incurs a delay resulting in a certain latency in servicing the data request. The worst-case latency incurred at a servicing node and the system average latency are important performance metrics of a storage system, which depend not only on inter-node RTTs, but also on how the data is stored across the nodes. Data files could be placed in the nodes as they are, i.e., in uncoded fashion, or can be coded and placed. This paper provides the necessary and sufficient conditions for the existence of uncoded storage schemes that are optimal in terms of both per-node worst-case latency and system average latency. In addition, the paper provides efficient binary storage codes for a specific case where optimal uncoded schemes do not exist.

Submitted 13 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

arXiv:2401.16643 [pdf, other]

Game of Coding: Beyond Honest-Majority Assumptions

Authors: Hanzaleh Akbari Nodehi, Viveck R. Cadambe, Mohammad Ali Maddah-Ali

Abstract: Coding theory revolves around the incorporation of redundancy into transmitted symbols, computation tasks, and stored data to guard against adversarial manipulation. However, error correction in coding theory is contingent upon a strict trust assumption. In the context of computation and storage, it is required that honest nodes outnumber adversarial ones by a certain margin. However, in several emerging real-world cases, particularly, in decentralized blockchain-oriented applications, such assumptions are often unrealistic. Consequently, despite the important role of coding in addressing significant challenges within decentralized systems, its applications become constrained. Still, in decentralized platforms, a distinctive characteristic emerges, offering new avenues for secure coding beyond the constraints of conventional methods. In these scenarios, the adversary benefits when the legitimate decoder recovers the data, and preferably with a high estimation error. This incentive motivates them to act rationally, trying to maximize their gains. In this paper, we propose a game theoretic formulation for coding, called the game of coding, that captures this unique dynamic where each of the adversaries and the data collector (decoder) have respective utility functions to optimize. The utility functions reflect the fact that both the data collector and the adversary are interested in increasing the chance of data being recoverable by the data collector. Moreover, the utility functions express the interest of the data collector to estimate the input with lower estimation error, but the opposite interest of the adversary. As a first, still highly non-trivial step, we characterize the equilibrium of the game for the repetition code with a repetition factor of 2 for a wide class of utility functions with minimal assumptions.

Submitted 9 September, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2309.16105 [pdf, other]

Differentially Private Secure Multiplication: Hiding Information in the Rubble of Noise

Authors: Viveck R. Cadambe, Ateet Devulapalli, Haewon Jeong, Flavio P. Calmon

Abstract: We consider the problem of private distributed multi-party multiplication. It is well-established that Shamir secret-sharing coding strategies can enable perfect information-theoretic privacy in distributed computation via the celebrated algorithm of Ben-Or, Goldwasser and Wigderson (the "BGW algorithm"). However, perfect privacy and accuracy require an honest majority, that is, $N \geq 2t+1$ compute nodes are required to ensure privacy against any $t$ colluding adversarial nodes. By allowing for some controlled amount of information leakage and approximate multiplication instead of exact multiplication, we study coding schemes for the setting where the number of honest nodes can be a minority, that is, $N < 2t+1$. We develop a tight characterization of the privacy-accuracy trade-off for cases where $N < 2t+1$ by measuring information leakage using differential privacy instead of perfect privacy, and using the mean squared error metric for accuracy. A novel technical aspect is an intricately layered noise distribution that merges ideas from differential privacy and Shamir secret-sharing at different layers.

Submitted 17 January, 2025; v1 submitted 27 September, 2023; originally announced September 2023.

Comments: Extended version of papers presented in IEEE ISIT 2022, IEEE ISIT 2023 and TPDP 2023
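
The honest-majority requirement $N \geq 2t+1$ quoted in the abstract above can be illustrated with a short sketch. It is not the paper's scheme and it omits BGW's degree-reduction step: it only shows the standard degree argument, namely that multiplying two degree-$t$ Shamir shares node by node yields points on a degree-$2t$ polynomial, so $2t+1$ points are needed to interpolate the product. The prime modulus `P`, the evaluation points $1, \dots, n$, and all function names are illustrative assumptions.

```python
import random

P = 2_147_483_647  # prime modulus 2**31 - 1 (illustrative choice, not from the paper)

def share(secret, t, n, rng):
    """Evaluate a random degree-t polynomial with constant term `secret` at x = 1..n."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t)]
    return [sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
            for x in range(1, n + 1)]

def reconstruct_at_zero(points):
    """Lagrange-interpolate through `points` (x, y) and return the value at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def secure_product(a, b, t, n, seed=0):
    """Multiply two secrets from node-local share products (hypothetical helper)."""
    if n < 2 * t + 1:
        # The local products lie on a degree-2t polynomial, so fewer than
        # 2t + 1 nodes cannot determine its constant term.
        raise ValueError("need n >= 2t + 1 nodes for exact recovery")
    rng = random.Random(seed)
    shares_a = share(a, t, n, rng)
    shares_b = share(b, t, n, rng)
    # Node x holds one share of a and one of b and multiplies them locally.
    products = [(x, sa * sb % P)
                for x, (sa, sb) in enumerate(zip(shares_a, shares_b), start=1)]
    return reconstruct_at_zero(products[: 2 * t + 1])
```

For example, `secure_product(6, 7, t=2, n=5)` recovers the product from five nodes, while `n=4` is rejected because four points cannot pin down a degree-4 polynomial.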

arXiv:2111.12009 [pdf, other]

LEGOStore: A Linearizable Geo-Distributed Store Combining Replication and Erasure Coding

Authors: Hamidreza Zare, Viveck R. Cadambe, Bhuvan Urgaonkar, Chetan Sharma, Praneet Soni, Nader Alfares, Arif Merchant

Abstract: We design and implement LEGOStore, an erasure coding (EC) based linearizable data store over geo-distributed public cloud data centers (DCs). For such a data store, the confluence of the following factors opens up opportunities for EC to be latency-competitive with replication: (a) the necessity of communicating with remote DCs to tolerate entire DC failures and implement linearizability; and (b) the emergence of DCs near most large population centers. LEGOStore employs an optimization framework that, for a given object, carefully chooses among replication and EC, as well as among various DC placements to minimize overall costs. To handle workload dynamism, LEGOStore employs a novel agile reconfiguration protocol. Our evaluation using a LEGOStore prototype spanning 9 Google Cloud Platform DCs demonstrates the efficacy of our ideas. We observe cost savings ranging from moderate (5-20\%) to significant (60\%) over baselines representing the state of the art while meeting tail latency SLOs. Our reconfiguration protocol is able to transition key placements in 3 to 4 inter-DC RTTs ($<$ 1s in our experiments), allowing for agile adaptation to dynamic conditions.

Submitted 3 July, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

Comments: Extended version of paper to appear in PVLDB 2022

arXiv:2105.01973 [pdf, other]

$ε$-Approximate Coded Matrix Multiplication is Nearly Twice as Efficient as Exact Multiplication

Authors: Haewon Jeong, Ateet Devulapalli, Viveck R. Cadambe, Flavio Calmon

Abstract: We study coded distributed matrix multiplication from an approximate recovery viewpoint. We consider a system of $P$ computation nodes where each node stores $1/m$ of each multiplicand via linear encoding. Our main result shows that the matrix product can be recovered with $ε$ relative error from any $m$ of the $P$ nodes for any $ε> 0$. We obtain this result through a careful specialization of MatDot codes -- a class of matrix multiplication codes previously developed in the context of exact recovery ($ε=0$). Since prior results showed that MatDot codes achieve the best exact recovery threshold for a class of linear coding schemes, our result shows that allowing for mild approximations leads to a system that is nearly twice as efficient as exact reconstruction. As an additional contribution, we develop an optimization framework based on alternating minimization that enables the discovery of new codes for approximate matrix multiplication.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2102.13310 [pdf, other]

CausalEC: A Causally Consistent Data Storage Algorithm based on Cross-Object Erasure Coding

Authors: Viveck R. Cadambe, Shihang Lyu

Abstract: Current causally consistent data storage algorithms use partial or full replication to ensure data access to clients over a distributed setting. We develop, for the first time, an erasure coding-based algorithm called CausalEC that ensures causal consistency for a collection of read-write objects stored in a distributed set of nodes over an asynchronous message-passing system. CausalEC can use an arbitrary linear erasure code for data storage and ensures liveness, fault-tolerance, and storage properties prescribed by the erasure code. CausalEC retains a key benefit of previous replication-based algorithms - every write operation is "local", that is, a server performs only local actions before returning to a client that issued a write operation. For servers that store certain objects in an uncoded manner, read operations to those objects also return locally. In general, a read operation to an object can be returned by a server on contacting a small subset of other servers so long as the underlying erasure code allows for the object to be decoded from that subset. Notably, unlike previous consistent erasure coding-based algorithms, CausalEC is compatible with cross-object erasure coding, where nodes encode values across multiple objects. CausalEC navigates the technical challenges of cross-object erasure coding, in particular, pertaining to re-encoding when writes update the values and ensuring that concurrent reads are served in a non-blocking manner during the transition to storing codeword symbols corresponding to the updated values.

Submitted 20 May, 2023; v1 submitted 26 February, 2021; originally announced February 2021.

Comments: Extended version of a brief announcement at ACM PODC 2023

arXiv:1910.13598 [pdf, other]

Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization

Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Viveck R. Cadambe

Abstract: Communication overhead is one of the key challenges that hinders the scalability of distributed optimization algorithms. In this paper, we study local distributed SGD, where data is partitioned among computation nodes, and the computation nodes perform local updates while periodically exchanging the model among the workers to perform averaging. While local SGD is empirically shown to provide promising results, a theoretical understanding of its performance remains open. We strengthen convergence analysis for local SGD, and show that local SGD can be far less expensive and applied far more generally than current theory suggests. Specifically, we show that for loss functions that satisfy the Polyak-Łojasiewicz condition, $O((pT)^{1/3})$ rounds of communication suffice to achieve a linear speed up, that is, an error of $O(1/pT)$, where $T$ is the total number of model updates at each worker. This is in contrast with previous work, which required a higher number of communication rounds and was limited to strongly convex loss functions, for a similar asymptotic performance. We also develop an adaptive synchronization scheme that provides a general condition for linear speed up. Finally, we validate the theory with experimental results, running over AWS EC2 clouds and an internal GPU cluster.

Submitted 14 May, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

Comments: Paper accepted to NeurIPS 2019 - We fixed a flaw in the earlier version regarding the dependency on constants but this change does not affect the communication complexity
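
The update pattern described in the abstract above (local steps at each worker, with the models averaged only every few steps) can be sketched as follows. This is a simplified illustration, not the paper's algorithm or analysis: it uses exact gradients of the hypothetical quadratic losses $f_i(w) = \frac{1}{2}(w - c_i)^2$ instead of stochastic gradients, and the names `centers`, `tau`, and `lr` are assumptions made for the example.

```python
def local_gd_periodic_averaging(centers, steps, tau, lr=0.1, w0=0.0):
    """Run `steps` local updates per worker, averaging models every `tau` steps.

    Worker i minimizes the illustrative loss 0.5 * (w - centers[i])**2.
    Returns the final averaged model and the number of communication rounds.
    """
    workers = [w0] * len(centers)
    rounds = 0
    for step in range(1, steps + 1):
        # Local update: each worker follows only its own gradient (w - c_i).
        workers = [w - lr * (w - c) for w, c in zip(workers, centers)]
        if step % tau == 0:
            # Communication round: replace every local model with the average.
            average = sum(workers) / len(workers)
            workers = [average] * len(workers)
            rounds += 1
    return sum(workers) / len(workers), rounds
```

With `centers=[1.0, 2.0, 6.0]`, `steps=200`, and `tau=10`, the averaged model approaches the global minimizer 3.0 using 20 communication rounds instead of the 200 that averaging after every step would need. The sketch only shows the communication pattern; it does not reproduce the $O((pT)^{1/3})$ bound.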

arXiv:1903.08326 [pdf, other]

Numerically Stable Polynomially Coded Computing

Authors: Mohammad Fahim, Viveck R. Cadambe

Abstract: We study the numerical stability of polynomial based encoding methods, which have emerged to be a powerful class of techniques for providing straggler and fault tolerance in the area of coded computing. Our contributions are as follows: 1) We construct new codes for matrix multiplication that achieve the same fault/straggler tolerance as the previously constructed MatDot Codes and Polynomial Codes. Unlike previous codes that use polynomials expanded in a monomial basis, our codes use a basis of orthogonal polynomials. 2) We show that the condition number of every $m \times m$ sub-matrix of an $m \times n, n \geq m$ Chebyshev-Vandermonde matrix, evaluated on the $n$-point Chebyshev grid, grows as $O(n^{2(n-m)})$ for $n > m$. An implication of this result is that, when Chebyshev-Vandermonde matrices are used for coded computing, for a fixed number of redundant nodes $s=n-m,$ the condition number grows at most polynomially in the number of nodes $n$. 3) By specializing our orthogonal polynomial based constructions to Chebyshev polynomials, and using our condition number bound for Chebyshev-Vandermonde matrices, we construct new numerically stable techniques for coded matrix multiplication. We empirically demonstrate that our constructions have significantly lower numerical errors compared to previous approaches which involve inversion of Vandermonde matrices. We generalize our constructions to explore the trade-off between computation/communication and fault-tolerance. 4) We propose a numerically stable specialization of Lagrange coded computing. Motivated by our condition number bound, our approach involves the choice of evaluation points and a suitable decoding procedure that involves inversion of an appropriate Chebyshev-Vandermonde matrix. Our approach is demonstrated empirically to have lower numerical errors as compared to standard methods.

Submitted 22 May, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

Comments: 31 pages, 13 figures, to be presented in part at the IEEE International Symposium on Information Theory (ISIT), July 2019

arXiv:1810.01527

Harnessing Correlations in Distributed Erasure Coded Key-Value Stores

Authors: Ramy E. Ali, Viveck Cadambe

Abstract: Motivated by applications of distributed storage systems to cloud-based key-value stores, the multi-version coding problem has been recently formulated to efficiently store frequently updated data in asynchronous decentralized storage systems. Inspired by consistency requirements in distributed systems, the main goal in multi-version coding is to ensure that the latest possible version of the data is decodable, even if the data updates have not reached some servers in the system. In this paper, we study the storage cost of ensuring consistency for the case where the data versions are correlated, in contrast to previous work where data versions were treated as being independent. We provide multi-version code constructions that show that the storage cost can be significantly smaller than the previous constructions depending on the degree of correlation between the different versions of the data. Our achievability results are based on Reed-Solomon codes and random binning. Through an information-theoretic converse, we show that our multi-version codes are nearly-optimal in certain regimes.

Submitted 9 March, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

Comments: arXiv admin note: substantial text overlap with arXiv:1708.06042. We will update a new version of arXiv:1708.06042 instead.

Journal ref: IEEE Transactions on Communications 2018

arXiv:1806.06140 [pdf, other]

Straggler-Resilient and Communication-Efficient Distributed Iterative Linear Solver

Authors: Farzin Haddadpour, Yaoqing Yang, Malhar Chaudhari, Viveck R Cadambe, Pulkit Grover

Abstract: We propose a novel distributed iterative linear inverse solver method. Our method, PolyLin, has significantly lower communication cost, both in terms of number of rounds as well as number of bits, in comparison with the state of the art at the cost of higher computational complexity and storage. Our algorithm also has a built-in resilience to straggling and faulty computation nodes. We develop a natural variant of our main algorithm that trades off communication cost for computational complexity. Our method is inspired by ideas in error correcting codes.

Submitted 15 June, 2018; originally announced June 2018.

Comments: 15 pages, 3 figures and 2 tables

arXiv:1805.04337 [pdf, other]

Fundamental Limits of Erasure-Coded Key-Value Stores with Side Information

Authors: Ramy E. Ali, Viveck Cadambe, Jaime Llorca, Antonia Tulino

Abstract: In applications of distributed storage systems to modern key-value stores, the stored data is highly dynamic due to frequent updates. The multi-version coding problem was formulated to study the cost of storing dynamic data in distributed storage systems. Previous work on multi-version coding considered a completely decentralized and asynchronous system assuming that the servers are not aware of which versions of the data are received by the other servers. In this paper, we relax this assumption and study a system where a server may acquire side information of the data versions propagated to some other servers based on the network topology. Specifically, we study a storage system with $n$ servers over a directed graph that store $ν$ totally ordered versions of a message. Each server receives a subset of these $ν$ versions. A server is aware of which versions have been received by its neighbors in the network graph. We show that the side information can result in a better storage cost as compared with the case where there is no side information for some regimes at the expense of the additional latency associated with exchanging the side information. Through an information-theoretic converse, we identify surprising scenarios where the side information may not help in improving the worst-case storage cost beyond the case where servers have no side information. Finally, we present a case study over Amazon web services (AWS) that demonstrates the potential cost reductions that may be obtained by our constructions.

Submitted 18 May, 2019; v1 submitted 11 May, 2018; originally announced May 2018.

Comments: Extended version of the ISIT 2018 paper that generalizes the code constructions and converse for arbitrary graphs, and a case study based on Amazon web services pricing and latency information

arXiv:1805.03727 [pdf, other]

ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage

Authors: Nicolas Nicolaou, Viveck Cadambe, N. Prakash, Andria Trigeorgi, Kishori M. Konwar, Nancy Lynch, Muriel Medard

Abstract: Atomicity or strong consistency is one of the fundamental, most intuitive, and hardest to provide primitives in distributed shared memory emulations. To ensure survivability, scalability, and availability of a storage service in the presence of failures, traditional approaches for atomic memory emulation, in message passing environments, replicate the objects across multiple servers. Compared to replication based algorithms, erasure code-based atomic memory algorithms have much lower storage and communication costs, but usually, they are harder to design. The difficulty of designing atomic memory algorithms further grows when the set of servers may be changed to ensure survivability of the service over software and hardware upgrades, while avoiding service interruptions. Atomic memory algorithms for performing server reconfiguration, in the replicated systems, are very few, complex, and are still part of an active area of research; reconfigurations of erasure-code based algorithms are non-existent. In this work, we present ARES, an algorithmic framework that allows reconfiguration of the underlying servers, and is particularly suitable for erasure-code based algorithms emulating atomic objects. ARES introduces new configurations while keeping the service available. To use with ARES we also propose a new, and to our knowledge, the first two-round erasure code based algorithm TREAS, for emulating multi-writer, multi-reader (MWMR) atomic objects in asynchronous, message-passing environments, with near-optimal communication and storage costs. Our algorithms can tolerate crash failures of any client and some fraction of servers, and yet, guarantee safety and liveness properties. Moreover, by bringing together the advantages of ARES and TREAS, we propose an optimized algorithm where new configurations can be installed without the object values passing through the reconfiguration clients.

Submitted 28 May, 2021; v1 submitted 9 May, 2018; originally announced May 2018.

arXiv:1801.10292 [pdf, other]

On the Optimal Recovery Threshold of Coded Matrix Multiplication

Authors: Sanghamitra Dutta, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck Cadambe, Pulkit Grover

Abstract: We provide novel coded computation strategies for distributed matrix-matrix products that outperform the recent "Polynomial code" constructions in recovery threshold, i.e., the required number of successful workers. When a $1/m$ fraction of each matrix can be stored in each worker node, Polynomial codes require $m^2$ successful workers, while our MatDot codes only require $2m-1$ successful workers, albeit at a higher communication cost from each worker to the fusion node. We also provide a systematic construction of MatDot codes. Further, we propose "PolyDot" coding that interpolates between Polynomial codes and MatDot codes to trade off communication cost and recovery threshold. Finally, we demonstrate a coding technique for multiplying $n$ matrices ($n \geq 3$) by applying MatDot and PolyDot coding ideas.

Submitted 16 May, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

Comments: Extended version of the paper that appeared at Allerton 2017 (October 2017), including full proofs and further results. Submitted to IEEE Transactions on Information Theory
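
The $2m-1$ recovery threshold quoted in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the usual polynomial description of MatDot (A split into m column blocks, B into m row blocks, and the product AB read off as the coefficient of x^(m-1) in a degree 2m-2 matrix polynomial), it uses exact rational arithmetic, and every function name is hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def combine(mats, weights):
    """Weighted sum of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * M[r][c] for M, w in zip(mats, weights)) for c in range(cols)]
            for r in range(rows)]


def lagrange_coeff(points, k, power):
    """Coefficient of x**power in the k-th Lagrange basis polynomial."""
    coeffs = [Fraction(1)]
    for j, xj in enumerate(points):
        if j == k:
            continue
        denom = Fraction(points[k] - xj)
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] += -xj * c / denom      # constant part of (x - xj) / denom
            new[i + 1] += c / denom        # x part of (x - xj) / denom
        coeffs = new
    return coeffs[power]


def matdot(A, B, m, worker_points, responders):
    """Recover A @ B from the first 2m-1 responding workers (assumed scheme)."""
    width = len(A[0]) // m  # inner dimension is assumed divisible by m
    A_blocks = [[row[i * width:(i + 1) * width] for row in A] for i in range(m)]
    B_blocks = [B[i * width:(i + 1) * width] for i in range(m)]

    def worker(x):
        # Each worker stores one encoded block of A and one of B and multiplies them.
        pa = combine(A_blocks, [Fraction(x) ** i for i in range(m)])
        pb = combine(B_blocks, [Fraction(x) ** (m - 1 - j) for j in range(m)])
        return matmul(pa, pb)

    pts = [worker_points[r] for r in responders][:2 * m - 1]
    if len(pts) < 2 * m - 1:
        raise ValueError("need at least 2m-1 successful workers")
    outputs = [worker(x) for x in pts]
    # A @ B is the coefficient of x**(m-1) of the interpolated product polynomial.
    weights = [lagrange_coeff(pts, k, m - 1) for k in range(len(pts))]
    return combine(outputs, weights)
```

With m = 2 and five workers, any three responses suffice in this sketch, whereas a scheme with threshold $m^2 = 4$ would need four.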

arXiv:1708.06042 [pdf, other]

Harnessing Correlations in Distributed Erasure-Coded Key-Value Stores

Authors: Ramy E. Ali, Viveck R. Cadambe

Abstract: Motivated by applications of distributed storage systems to key-value stores, the multi-version coding problem was formulated to efficiently store frequently updated data in asynchronous decentralized storage systems. Inspired by consistency requirements in distributed systems, the main goal in the multi-version coding problem is to ensure that the latest possible version of the data is decodable, even if the data updates have not reached some servers in the system. In this paper, we study the storage cost of ensuring consistency for the case where the data versions are correlated, in contrast to previous work where data versions were treated as being independent. We provide multi-version code constructions that show that the storage cost can be significantly smaller than the previous constructions depending on the degree of correlation, despite the asynchrony and the decentralized nature. Our achievability results are based on Reed-Solomon codes and random binning. Through an information-theoretic converse, we show that our multi-version codes are nearly optimal, within a factor of $2$, in certain interesting regimes.

Submitted 9 March, 2019; v1 submitted 20 August, 2017; originally announced August 2017.

Journal ref: IEEE Transactions on Communications 2019

arXiv:1705.03875 [pdf, other]

Coded convolution for parallel and distributed computing within a deadline

Authors: Sanghamitra Dutta, Viveck Cadambe, Pulkit Grover

Abstract: We consider the problem of computing the convolution of two long vectors using parallel processing units in the presence of "stragglers". Stragglers refer to the small fraction of faulty or slow processors that delays the entire computation in time-critical distributed systems. We first show that splitting the vectors into smaller pieces and using a linear code to encode these pieces provides better resilience against stragglers than replication-based schemes under a simple, worst-case straggler analysis. We then demonstrate that under commonly used models of computation time, coding can dramatically improve the probability of finishing the computation within a target "deadline" time. As opposed to the more commonly used technique of expected computation time analysis, we quantify the exponents of the probability of failure in the limit of large deadlines. Our exponent metric captures the probability of failing to finish before a specified deadline time, i.e., the behavior of the "tail". Moreover, our technique also allows for simple closed-form expressions for more general models of computation time, e.g., shifted Weibull models instead of only shifted exponentials. Thus, through this problem of coded convolution, we establish the utility of a novel asymptotic failure exponent analysis for distributed systems.

Submitted 10 May, 2017; originally announced May 2017.

Comments: To appear in ISIT 2017

arXiv:1705.02704 [pdf, other]

Linear Network Coding for Two-Unicast-$Z$ Networks: A Commutative Algebraic Perspective and Fundamental Limits

Authors: Mohammad Fahim, Viveck Cadambe

Abstract: We consider a two-unicast-$Z$ network over a directed acyclic graph of unit capacitated edges; the two-unicast-$Z$ network is a special case of two-unicast networks where one of the destinations has a priori side information of the unwanted (interfering) message. In this paper, we settle open questions on the limits of network coding for two-unicast-$Z$ networks by showing that the generalized network sharing bound is not tight, vector linear codes outperform scalar linear codes, and non-linear codes outperform linear codes in general. We also develop a commutative algebraic approach to deriving linear network coding achievability results, and demonstrate our approach by providing an alternate proof to the previous results of C. Wang et al., I. Wang et al. and Shenvi et al. regarding feasibility of rate $(1,1)$ in the network.

Submitted 25 May, 2018; v1 submitted 7 May, 2017; originally announced May 2017.

Comments: A short version of this paper is published in the Proceedings of The IEEE International Symposium on Information Theory (ISIT), June 2017

arXiv:1704.05181 [pdf, other]

"Short-Dot": Computing Large Linear Transforms Distributedly Using Coded Short Dot Products

Authors: Sanghamitra Dutta, Viveck Cadambe, Pulkit Grover

Abstract: Faced with saturation of Moore's law and increasing dimension of data, system designers have increasingly resorted to parallel and distributed computing. However, distributed computing is often bottlenecked by a small fraction of slow processors called "stragglers" that reduce the speed of computation because the fusion node has to wait for all processors to finish. To combat the effect of stragglers, recent literature introduces redundancy in computations across processors, e.g., using repetition-based strategies or erasure codes. The fusion node can exploit this redundancy by completing the computation using outputs from only a subset of the processors, ignoring the stragglers. In this paper, we propose a novel technique -- that we call "Short-Dot" -- to introduce redundant computations in a coding-theory-inspired fashion, for computing linear transforms of long vectors. Instead of computing long dot products as required in the original linear transform, we construct a larger number of redundant and short dot products that can be computed faster and more efficiently at individual processors. Compared with schemes that introduce redundancy to tackle stragglers, Short-Dot reduces the cost of computation, storage and communication since shorter portions are stored and computed at each processor, and shorter portions of the input are communicated to each processor. We demonstrate through probabilistic analysis as well as experiments that Short-Dot offers significant speed-up compared to existing techniques. We also derive trade-offs between the length of the dot-products and the resilience to stragglers (number of processors to wait for), for any such strategy and compare it to that achieved by our strategy.

Submitted 17 April, 2017; originally announced April 2017.

Comments: Presented at NIPS 2016, Barcelona, Spain

arXiv:1605.06844 [pdf, ps, other]

Information-Theoretic Lower Bounds on the Storage Cost of Shared Memory Emulation

Authors: Viveck R. Cadambe, Zhiying Wang, Nancy Lynch

Abstract: The focus of this paper is to understand storage costs of emulating an atomic shared memory over an asynchronous, distributed message passing system. Previous literature has developed several shared memory emulation algorithms based on replication and erasure coding techniques. In this paper, we present information-theoretic lower bounds on the storage costs incurred by shared memory emulation algorithms. Our storage cost lower bounds are universally applicable, that is, we make no assumption on the structure of the algorithm or the method of encoding the data. We consider an arbitrary algorithm $A$ that implements an atomic multi-writer single-reader (MWSR) shared memory variable whose values come from a finite set $\mathcal{V}$ over a system of $N$ servers connected by point-to-point asynchronous links. We require that in every fair execution of algorithm $A$ where the number of server failures is smaller than a parameter $f$, every operation invoked at a non-failing client terminates. We define the storage cost of a server in algorithm $A$ as the logarithm (to base 2) of the number of states it can take on; the total storage cost of algorithm $A$ is the sum of the storage costs of all servers. Our results are as follows. (i) We show that if algorithm $A$ does not use server gossip, then the total storage cost is lower bounded by $2 \frac{N}{N-f+1}\log_2|\mathcal{V}|-o(\log_2|\mathcal{V}|)$. (ii) The total storage cost is at least $2 \frac{N}{N-f+2} \log_{2}|\mathcal{V}|-o(\log_{2}|\mathcal{V}|)$ even if the algorithm uses server gossip. (iii) We consider algorithms where the write protocol sends information about the value in at most one phase. We show that the total storage cost is at least $\nu^* \frac{N}{N-f+\nu^*-1} \log_2(|\mathcal{V}|) - o(\log_2(|\mathcal{V}|))$, where $\nu^*$ is the minimum of $f+1$ and the number of active write operations of an execution.

Submitted 24 May, 2016; v1 submitted 22 May, 2016; originally announced May 2016.

MSC Class: 68P30; 68W15 ACM Class: C.2.4; D.4.2
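
To get a feel for the three lower bounds stated in the abstract above, the sketch below evaluates only their leading coefficients, i.e., the factor multiplying $\log_2|\mathcal{V}|$, and ignores the $o(\cdot)$ terms. The function names are hypothetical and the sample values of N and f are arbitrary.

```python
from fractions import Fraction


def no_gossip_coeff(N, f):
    """Leading coefficient of bound (i): 2N / (N - f + 1)."""
    return Fraction(2 * N, N - f + 1)


def gossip_coeff(N, f):
    """Leading coefficient of bound (ii): 2N / (N - f + 2)."""
    return Fraction(2 * N, N - f + 2)


def single_phase_coeff(N, f, active_writes):
    """Leading coefficient of bound (iii), with nu* = min(f + 1, active writes)."""
    nu = min(f + 1, active_writes)
    return Fraction(nu * N, N - f + nu - 1)


if __name__ == "__main__":
    N, f = 5, 2
    print("no gossip:", no_gossip_coeff(N, f))
    print("with gossip:", gossip_coeff(N, f))
    for writes in (1, 2, 3):
        print("single-phase, active writes =", writes, "->",
              single_phase_coeff(N, f, writes))
```

Note that with two active writes the single-phase expression coincides with the no-gossip expression, since both reduce to 2N / (N - f + 1).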

arXiv:1506.00684 [pdf, ps, other]

Multi-Version Coding - An Information Theoretic Perspective of Consistent Distributed Storage

Authors: Zhiying Wang, Viveck R. Cadambe

Abstract: In applications of distributed storage systems to distributed computing and implementation of key-value stores, the following property, usually referred to as consistency in computer science and engineering, is an important requirement: as the data stored changes, the latest version of the data must be accessible to a client that connects to the storage system. An information theoretic formulation called multi-version coding is introduced in the paper, in order to study storage costs of consistent distributed storage systems. Multi-version coding is characterized by ν totally ordered versions of a message, and a storage system with n servers. At each server, values corresponding to an arbitrary subset of the ν versions are received and encoded. For any subset of c servers in the storage system, the value corresponding to the latest common version, or a later version as per the total ordering, among the c servers is required to be decodable. An achievable multi-version code construction via linear coding and a converse result that shows that the construction is approximately tight, are provided. An implication of the converse is that there is an inevitable price, in terms of storage cost, to ensure consistency in distributed storage systems.

Submitted 19 October, 2015; v1 submitted 1 June, 2015; originally announced June 2015.

Comments: 30 Pages. Extended version of conference publications in ISIT 2014 and Allerton 2014. Revision adds a section, Section VII, and corrects minor typographical errors in the rest of the document

arXiv:1504.01690 [pdf, ps, other]

doi 10.1109/TIT.2016.2593633

Expanding the Compute-and-Forward Framework: Unequal Powers, Signal Levels, and Multiple Linear Combinations

Authors: Bobak Nazer, Viveck Cadambe, Vasilis Ntranos, Giuseppe Caire

Abstract: The compute-and-forward framework permits each receiver in a Gaussian network to directly decode a linear combination of the transmitted messages. The resulting linear combinations can then be employed as an end-to-end communication strategy for relaying, interference alignment, and other applications. Recent efforts have demonstrated the advantages of employing unequal powers at the transmitters and decoding more than one linear combination at each receiver. However, neither of these techniques fits naturally within the original formulation of compute-and-forward. This paper proposes an expanded compute-and-forward framework that incorporates both of these possibilities and permits an intuitive interpretation in terms of signal levels. Within this framework, recent achievability and optimality results are unified and generalized.

Submitted 28 June, 2016; v1 submitted 7 April, 2015; originally announced April 2015.

Comments: 30 pages, 10 figures, to appear in IEEE Transactions on Information Theory

arXiv:1502.07830 [pdf, ps, other]

File Updates Under Random/Arbitrary Insertions And Deletions

Authors: Qiwen Wang, Viveck Cadambe, Sidharth Jaggi, Moshe Schwartz, Muriel Médard

Abstract: A client/encoder edits a file, as modeled by an insertion-deletion (InDel) process. An old copy of the file is stored remotely at a data-centre/decoder, and is also available to the client. We consider the problem of throughput- and computationally-efficient communication from the client to the data-centre, to enable the server to update its copy to the newly edited file. We study two models for the source files/edit patterns: the random pre-edit sequence left-to-right random InDel (RPES-LtRRID) process, and the arbitrary pre-edit sequence arbitrary InDel (APES-AID) process. In both models, we consider the regime in which the number of insertions/deletions is a small (but constant) fraction of the original file. For both models we prove information-theoretic lower bounds on the best possible compression rates that enable file updates. Conversely, our compression algorithms use dynamic programming (DP) and entropy coding, and achieve rates that are approximately optimal.

Submitted 27 February, 2015; originally announced February 2015.

Comments: The paper is an extended version of our paper to appear at ITW 2015

arXiv:1502.00656 [pdf, other]

Alignment based Network Coding for Two-Unicast-Z Networks

Authors: Weifei Zeng, Viveck R. Cadambe, Muriel Medard

Abstract: In this paper, we study the wireline two-unicast-Z communication network over directed acyclic graphs. The two-unicast-Z network is a two-unicast network where the destination intending to decode the second message has a priori side information of the first message. We make three contributions in this paper: 1. We describe a new linear network coding algorithm for two-unicast-Z networks over directed acyclic graphs. Our approach includes the idea of interference alignment as one of its key ingredients. For graphs of a bounded degree, our algorithm has linear complexity in terms of the number of vertices, and polynomial complexity in terms of the number of edges. 2. We prove that our algorithm achieves the rate-pair (1, 1) whenever it is feasible in the network. Our proof serves as an alternative, albeit restricted to two-unicast-Z networks over directed acyclic graphs, to an earlier result of Wang et al. which studied necessary and sufficient conditions for feasibility of the rate pair (1, 1) in two-unicast networks. 3. We provide a new proof of the classical max-flow min-cut theorem for directed acyclic graphs.

Submitted 8 February, 2015; v1 submitted 2 February, 2015; originally announced February 2015.

Comments: The paper is an extended version of our earlier paper at ITW 2014

arXiv:1407.4167 [pdf, ps, other]

A Coded Shared Atomic Memory Algorithm for Message Passing Architectures

Authors: Viveck R. Cadambe, Nancy Lynch, Muriel Médard, Peter Musial

Abstract: This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains three main contributions: (1) We present an atomic shared-memory emulation algorithm that we call Coded Atomic Storage (CAS). This algorithm uses erasure coding methods. In a storage system with $N$ servers that is resilient to $f$ server failures, we show that the communication cost of CAS is $\frac{N}{N-2f}$. The storage cost of CAS is unbounded. (2) We present a modification of the CAS algorithm known as CAS with Garbage Collection (CASGC). The CASGC algorithm is parametrized by an integer $δ$ and has a bounded storage cost. We show that in every execution where the number of write operations that are concurrent with a read operation is no bigger than $δ$, the CASGC algorithm with parameter $δ$ satisfies atomicity and liveness. We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS. (3) We describe an algorithm known as the Communication Cost Optimal Atomic Storage (CCOAS) algorithm that achieves a smaller communication cost than CAS and CASGC. In particular, CCOAS incurs read and write communication costs of $\frac{N}{N-f}$ measured in terms of number of object values. We also discuss drawbacks of CCOAS as compared with CAS and CASGC.

Submitted 15 July, 2014; originally announced July 2014.

Comments: Part of the results to appear in IEEE Network Computing and Applications (NCA), Aug 2014. This report supersedes MIT CSAIL technical report MIT-CSAIL-TR-2013-016
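
The two communication-cost expressions quoted in the abstract above, $\frac{N}{N-2f}$ for CAS and $\frac{N}{N-f}$ for CCOAS, can be compared numerically. The sketch below only evaluates those formulas; the function names and the sample (N, f) pairs are assumptions for illustration, and the requirement N > 2f is inferred from the CAS denominator rather than stated in the abstract.

```python
from fractions import Fraction


def cas_cost(N, f):
    """CAS communication cost N / (N - 2f), in units of object values."""
    if N <= 2 * f:
        raise ValueError("the CAS expression needs N > 2f")
    return Fraction(N, N - 2 * f)


def ccoas_cost(N, f):
    """CCOAS communication cost N / (N - f), in units of object values."""
    if N <= f:
        raise ValueError("the CCOAS expression needs N > f")
    return Fraction(N, N - f)


if __name__ == "__main__":
    for N, f in [(5, 1), (7, 2), (11, 5)]:
        print(f"N={N}, f={f}: CAS={cas_cost(N, f)}, CCOAS={ccoas_cost(N, f)}")
```

For every valid pair with f > 0 the CCOAS value is strictly smaller, which matches the abstract's claim that CCOAS achieves a smaller communication cost.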

arXiv:1308.3200 [pdf, other]

doi 10.1109/NetCod.2013.6570829

An Upper Bound On the Size of Locally Recoverable Codes

Authors: Viveck Cadambe, Arya Mazumdar

Abstract: In a locally recoverable or repairable code, any symbol of a codeword can be recovered by reading only a small (constant) number of other symbols. The notion of local recoverability is important in the area of distributed storage where the most frequent error event is a single storage node failure (erasure). A common objective is to repair the node by downloading data from as few other storage nodes as possible. In this paper, we bound the minimum distance of a code in terms of its length, size and locality. Unlike previous bounds, our bound follows from a significantly simpler analysis and depends on the size of the alphabet being used. It turns out that the binary Simplex codes satisfy our bound with equality; hence the Simplex codes are the first example of an optimal binary locally repairable code family. We also provide achievability results based on random coding and concatenated codes that are numerically verified to be close to our bounds.

Submitted 26 March, 2015; v1 submitted 14 August, 2013; originally announced August 2013.

Comments: A shorter version has appeared in IEEE NetCod, 2013

MSC Class: 94B65; 94B25; 94A15
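
The local-recoverability property described in the abstract above can be checked directly on a small binary Simplex code. The sketch below builds the length-7, dimension-3 Simplex code (one coordinate per nonzero vector of F_2^3) and shows that every symbol equals the XOR of two other symbols. It illustrates locality only and does not evaluate the paper's bound; the function names are hypothetical.

```python
def simplex_codeword(msg, k=3):
    """Encode a k-bit message (as an int) into the binary Simplex code.

    Coordinate v-1 holds the inner product <msg, v> over F_2, for v = 1 .. 2^k - 1.
    """
    return [bin(msg & v).count("1") % 2 for v in range(1, 2 ** k)]


def repair_pair(i, k=3):
    """Return two positions (a, b), both different from i, with c[i] = c[a] XOR c[b]."""
    v = i + 1                      # nonzero vector labelling position i
    for a in range(1, 2 ** k):
        if a != v:
            b = v ^ a              # a XOR b = v, so the inner products add up
            return a - 1, b - 1
    raise ValueError("no repair pair exists for k < 2")


def repair(codeword, i, k=3):
    """Recover symbol i by reading only two other symbols."""
    a, b = repair_pair(i, k)
    return codeword[a] ^ codeword[b]


if __name__ == "__main__":
    c = simplex_codeword(0b101)
    print("codeword:", c)
    print("repairs :", [repair(c, i) for i in range(len(c))])
```

Because the inner product is linear over F_2, the repair works for every message and every position.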

arXiv:1210.0293 [pdf, other]

Feedback Interference Alignment: Exact Alignment for Three Users in Two Time Slots

Authors: Vasilis Ntranos, Viveck R. Cadambe, Bobak Nazer, Giuseppe Caire

Abstract: We study the three-user interference channel where each transmitter has local feedback of the signal from its targeted receiver. We show that in the important case where the channel coefficients are static, exact alignment can be achieved over two time slots using linear schemes. This is in contrast with the interference channel where no feedback is utilized, where it seems that either an infinite number of channel extensions or infinite precision is required for exact alignment. We also demonstrate, via simulations, that our scheme significantly outperforms time-sharing even at finite SNR.

Submitted 1 October, 2012; originally announced October 2012.

arXiv:1205.1483 [pdf, other]

Index Coding - An Interference Alignment Perspective

Authors: Hamed Maleki, Viveck R. Cadambe, Syed A. Jafar

Abstract: The index coding problem is studied from an interference alignment perspective, providing new results as well as new insights into, and generalizations of, previously known results. An equivalence is established between multiple unicast index coding where each message is desired by exactly one receiver, and multiple groupcast index coding where a message can be desired by multiple receivers, which settles the heretofore open question of insufficiency of linear codes for the multiple unicast index coding problem by equivalence with multiple groupcast settings where this question has previously been answered. Necessary and sufficient conditions for the achievability of rate half per message are shown to be a natural consequence of interference alignment constraints, and generalizations to feasibility of rate $\frac{1}{L+1}$ per message when each destination desires at least $L$ messages, are similarly obtained. Finally, capacity optimal solutions are presented to a series of symmetric index coding problems inspired by the local connectivity and local interference characteristics of wireless networks. The solutions are based on vector linear coding.

Submitted 7 May, 2012; originally announced May 2012.

arXiv:1106.1634 [pdf, other]

Repair Optimal Erasure Codes through Hadamard Designs

Authors: Dimitris S. Papailiopoulos, Alexandros G. Dimakis, Viveck R. Cadambe

Abstract: In distributed storage systems that employ erasure coding, the issue of minimizing the total communication required to exactly rebuild a storage node after a failure arises. This repair bandwidth depends on the structure of the storage code and the repair strategies used to restore the lost data. Designing high-rate maximum-distance separable (MDS) codes that achieve the optimum repair communication has been a well-known open problem. In this work, we use Hadamard matrices to construct the first explicit 2-parity MDS storage code with optimal repair properties for all single node failures, including the parities. Our construction relies on a novel method of achieving perfect interference alignment over finite fields with a finite file size, or number of extensions. We generalize this construction to design $m$-parity MDS codes that achieve the optimum repair communication for single systematic node failures and show that there is an interesting connection between our $m$-parity codes and the systematic-repair optimal permutation-matrix based codes of Tamo et al. and Cadambe et al.

Submitted 8 June, 2011; originally announced June 2011.

Comments: 19 pages, 9 figures

arXiv:1106.1250 [pdf, ps, other]

Optimal Repair of MDS Codes in Distributed Storage via Subspace Interference Alignment

Authors: Viveck R. Cadambe, Cheng Huang, Syed A. Jafar, Jin Li

Abstract: It is well known that an (n,k) code can be used to store 'k' units of information in 'n' unit-capacity disks of a distributed data storage system. If the code used is maximum distance separable (MDS), then the system can tolerate any (n-k) disk failures, since the original information can be recovered from any k surviving disks. The focus of this paper is the design of a systematic MDS code with the additional property that a single disk failure can be repaired with minimum repair bandwidth, i.e., with the minimum possible amount of data to be downloaded for recovery of the failed disk. Previously, a lower bound of (n-1)/(n-k) units has been established by Dimakis et al. on the repair bandwidth for a single disk failure in an (n,k) MDS code. Recently, the existence of asymptotic codes achieving this lower bound for arbitrary (n,k) has been established by drawing connections to interference alignment. While the existence of asymptotic constructions achieving this lower bound has been shown, finite code constructions achieving this lower bound existed in previous literature only for the special (high-redundancy) scenario where $k \leq \max(n/2,3)$. The question of existence of finite codes for arbitrary values of (n,k) achieving the lower bound on the repair bandwidth remained open. In this paper, by using permutation coding sub-matrices, we provide the first known finite MDS code which achieves the optimal repair bandwidth of (n-1)/(n-k) for arbitrary (n,k), for recovery of a failed systematic disk. We also generalize our permutation matrix based constructions by developing a novel framework for repair-bandwidth-optimal MDS codes based on the idea of subspace interference alignment, a concept previously introduced by Suh and Tse in the context of wireless cellular networks.

Submitted 6 June, 2011; originally announced June 2011.

Comments: To be presented in part at ISIT 2011
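
The (n-1)/(n-k) repair-bandwidth bound quoted in the abstract above can be compared with the baseline of downloading k full disks, which is enough to rebuild anything because the data is recoverable from any k surviving disks. The sketch below only evaluates the two quantities in disk units; the function names and the sample (n, k) pairs are assumptions for illustration.

```python
from fractions import Fraction


def optimal_repair_bandwidth(n, k):
    """Lower bound (n - 1) / (n - k) on data downloaded to repair one disk."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return Fraction(n - 1, n - k)


def naive_repair_bandwidth(n, k):
    """Baseline: download k whole disks and re-encode the lost one."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return Fraction(k)


if __name__ == "__main__":
    for n, k in [(6, 4), (14, 10)]:
        best = optimal_repair_bandwidth(n, k)
        naive = naive_repair_bandwidth(n, k)
        print(f"(n,k)=({n},{k}): bound={best}, naive={naive}, ratio={best / naive}")
```

The gap widens as the number of parity disks n-k grows, since each of the n-1 helpers then contributes only a 1/(n-k) fraction of its contents.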

arXiv:1004.4299 [pdf, other]

Distributed Data Storage with Minimum Storage Regenerating Codes - Exact and Functional Repair are Asymptotically Equally Efficient

Authors: Viveck R. Cadambe, Syed A. Jafar, Hamed Maleki

Abstract: We consider a setup where a file of size M is stored in n distributed storage nodes, using an (n,k) minimum storage regenerating (MSR) code, i.e., a maximum distance separable (MDS) code that also allows efficient exact-repair of any failed node. The problem of interest in this paper is to minimize the repair bandwidth B for exact regeneration of a single failed node, i.e., the minimum data to be downloaded by a new node to replace the failed node by its exact replica. Previous work has shown that a bandwidth of B=[M(n-1)]/[k(n-k)] is necessary and sufficient for functional (not exact) regeneration. It has also been shown that if k <= max(n/2, 3), then there is no extra cost of exact regeneration over functional regeneration. The practically relevant setting of low-redundancy, i.e., k/n>1/2 remains open for k>3 and it has been shown that there is an extra bandwidth cost for exact repair over functional repair in this case. In this work, we adopt into the distributed storage context an asymptotically optimal interference alignment scheme previously proposed by Cadambe and Jafar for large wireless interference networks. With this scheme we solve the problem of repair bandwidth minimization for (n,k) exact-MSR codes for all (n,k) values including the previously open case of k > max(n/2, 3). Our main result is that, for any (n,k), and sufficiently large file sizes, there is no extra cost of exact regeneration over functional regeneration in terms of the repair bandwidth per bit of regenerated data. More precisely, we show that in the limit as M approaches infinity, the ratio B/M = (n-1)/(k(n-k)).

Submitted 24 April, 2010; originally announced April 2010.
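
As a worked illustration of the bandwidth expression quoted in the abstract above, the following minimal sketch evaluates B = M(n-1)/(k(n-k)) with exact rational arithmetic. The function name `repair_bandwidth` and the sample values of M, n and k are assumptions chosen for illustration; they do not come from the paper.

```python
from fractions import Fraction


def repair_bandwidth(file_size, n, k):
    """Evaluate B = M(n-1) / (k(n-k)) exactly, as quoted in the abstract."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return Fraction(file_size * (n - 1), k * (n - k))


# Hypothetical parameters, chosen so that k > max(n/2, 3).
M, n, k = 12, 6, 4
B = repair_bandwidth(M, n, k)
print(B, float(B / M))  # prints: 15/2 0.625
```

For these sample values the repair download is 15/2 units, compared with the 12 units needed to fetch the whole file.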

arXiv:1001.4120 [pdf, ps, other]

Sum-Capacity and the Unique Separability of the Parallel Gaussian MAC-Z-BC Network

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: It is known that the capacity of parallel (e.g., multi-carrier) Gaussian point-to-point, multiple access and broadcast channels can be achieved by separate encoding for each subchannel (carrier) subject to a power allocation across carriers. Recent results have shown that parallel interference channels are not separable, i.e., joint coding is needed to achieve capacity in general. This work studies the separability, from a sum-capacity perspective, of single hop Gaussian interference networks with independent messages and an arbitrary number of transmitters and receivers. The main result is that the only network that is always (for all values of channel coefficients) separable from a sum-capacity perspective is the MAC-Z-BC network, i.e., a network where a MAC component and a BC component are linked by a Z component. The sum capacity of this network is explicitly characterized.

Submitted 22 January, 2010; originally announced January 2010.

Comments: Submitted to ISIT 2010

arXiv:0912.3029 [pdf, ps, other]

Interference Alignment and a Noisy Interference Regime for Many-to-One Interference Channels

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: We study the capacity of discrete memoryless many-to-one interference channels, i.e., K user interference channels where only one receiver faces interference. For a class of many-to-one interference channels, we identify a noisy interference regime, i.e., a regime where random coding and treating interference as noise achieves sum-capacity. Specializing our results to the Gaussian MIMO many-to-one interference channel, which is a special case of the class of channels considered, we obtain new capacity results. Firstly, we extend the noisy interference regime, previously studied for (many-to-one) interference channels with average power constraints on the inputs, to a more general class of inputs. This more general class includes the practical scenario of inputs being restricted to fixed finite-size constellations such as PSK or QAM. Secondly, we extend noisy interference results previously studied in SISO interference channels with full channel state information (CSI) at all nodes, to MIMO and parallel Gaussian many-to-one interference channels, and to fading Gaussian many-to-one interference channels without CSI at the transmitters. While the many-to-one interference channel requires interference alignment, which in turn requires structured codes in general, we argue that in the noisy interference regime, interference is implicitly aligned by random coding irrespective of the input distribution. As a byproduct of our study, we identify a second class of many-to-one interference channels (albeit deterministic) where random coding is optimal (though interference is not treated as noise). The optimality of random coding in this second class of channels is due to an interference resolvability condition which precludes interference alignment and hence obviates the need for structured codes.

Submitted 15 December, 2009; originally announced December 2009.

Comments: 21 pages. Partially presented at 47th Allerton Conference on Communication, Control, and Computing, Sep, 2009

arXiv:0904.0274 [pdf, other]

Interference Alignment with Asymmetric Complex Signaling - Settling the Host-Madsen-Nosratinia Conjecture

Authors: Viveck R. Cadambe, Syed A. Jafar, Chenwei Wang

Abstract: It has been conjectured by Host-Madsen and Nosratinia that complex Gaussian interference channels with constant channel coefficients have only one degree-of-freedom regardless of the number of users. While several examples are known of constant channels that achieve more than 1 degree of freedom, these special cases only span a subset of measure zero. In other words, for almost all channel coefficient values, it is not known if more than 1 degree-of-freedom is achievable. In this paper, we settle the Host-Madsen-Nosratinia conjecture in the negative. We show that at least 1.2 degrees-of-freedom are achievable for all values of complex channel coefficients except for a subset of measure zero. For the class of linear beamforming and interference alignment schemes considered in this paper, it is also shown that 1.2 is the maximum number of degrees of freedom achievable on the complex Gaussian 3 user interference channel with constant channel coefficients, for almost all values of channel coefficients. To establish the achievability of 1.2 degrees of freedom we introduce the novel idea of asymmetric complex signaling - i.e., the inputs are chosen to be complex but not circularly symmetric. It is shown that unlike Gaussian point-to-point, multiple-access and broadcast channels where circularly symmetric complex Gaussian inputs are optimal, for interference channels optimal inputs are in general asymmetric. With asymmetric complex signaling, we also show that the 2 user complex Gaussian X channel with constant channel coefficients achieves the outer bound of 4/3 degrees-of-freedom, i.e., the assumption of time-variations/frequency-selectivity used in prior work to establish the same result, is not needed.

Submitted 1 April, 2009; originally announced April 2009.

Journal ref: IEEE Transactions on Information Theory, Sep. 2010, Vol. 56, Issue: 9, Pages: 4552-4565

arXiv:0810.4741 [pdf, ps, other]

On the Capacity and Generalized Degrees of Freedom of the X Channel

Authors: Chiachi Huang, Viveck R. Cadambe, Syed A. Jafar

Abstract: We explore the capacity and generalized degrees of freedom of the two-user Gaussian X channel, i.e. a generalization of the 2 user interference channel where there is an independent message from each transmitter to each receiver. There are three main results in this paper. First, we characterize the sum capacity of the deterministic X channel model under a symmetric setting. Second, we characterize the generalized degrees of freedom of the Gaussian X channel under a similar symmetric model. Third, we extend the noisy interference capacity characterization previously obtained for the interference channel to the X channel. Specifically, we show that the X channel associated with a noisy (very weak) interference channel has the same sum capacity as the noisy interference channel.

Submitted 27 October, 2008; originally announced October 2008.

arXiv:0803.3816 [pdf, ps, other]

doi 10.1109/GLOCOM.2008.ECP.817

Approaching the Capacity of Wireless Networks through Distributed Interference Alignment

Authors: Krishna Gomadam, Viveck R. Cadambe, Syed A. Jafar

Abstract: Recent results establish the optimality of interference alignment to approach the Shannon capacity of interference networks at high SNR. However, the extent to which interference can be aligned over a finite number of signalling dimensions remains unknown. Another important concern for interference alignment schemes is the requirement of global channel knowledge. In this work we provide examples of iterative algorithms that utilize the reciprocity of wireless networks to achieve interference alignment with only local channel knowledge at each node. These algorithms also provide numerical insights into the feasibility of interference alignment that are not yet available in theory.

Submitted 26 March, 2008; originally announced March 2008.

Comments: 10 pages 2 columns

Journal ref: IEEE Transactions on Information Theory, Vol. 57, No. 6, June, 2011, Pages: 3309-3322

arXiv:0802.2125 [pdf, ps, other]

doi 10.1109/GLOCOM.2008.ECP.904

Multiple Access Outerbounds and the Inseparability of Parallel Interference Channels

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: It is known that the capacity of parallel (multi-carrier) Gaussian point-to-point, multiple access and broadcast channels can be achieved by separate encoding for each subchannel (carrier) subject to a power allocation across carriers. In this paper we show that such a separation does not apply to parallel Gaussian interference channels in general. A counter-example is provided in the form of a 3 user interference channel where separate encoding can only achieve a sum capacity of $\log({SNR})+o(\log({SNR}))$ per carrier while the actual capacity, achieved only by joint-encoding across carriers, is $\frac{3}{2}\log({SNR})+o(\log({SNR}))$ per carrier. As a byproduct of our analysis, we propose a class of multiple-access-outerbounds on the capacity of the 3 user interference channel.

Submitted 14 February, 2008; originally announced February 2008.

Journal ref: IEEE Transactions on Information Theory, Vol. 55, No. 9, Sep. 2009,Pages: 3983-3990

arXiv:0802.0534 [pdf, ps, other]

Capacity of Wireless Networks within o(log(SNR)) - the Impact of Relays, Feedback, Cooperation and Full-Duplex Operation

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: Recent work has characterized the sum capacity of time-varying/frequency-selective wireless interference networks and $X$ networks within $o(\log({SNR}))$, i.e., with an accuracy approaching 100% at high SNR (signal to noise power ratio). In this paper, we seek similar capacity characterizations for wireless networks with relays, feedback, full duplex operation, and transmitter/receiver cooperation through noisy channels. First, we consider a network with $S$ source nodes, $R$ relay nodes and $D$ destination nodes with random time-varying/frequency-selective channel coefficients and global channel knowledge at all nodes. We allow full-duplex operation at all nodes, as well as causal noise-free feedback of all received signals to all source and relay nodes. The sum capacity of this network is characterized as $\frac{SD}{S+D-1}\log({SNR})+o(\log({SNR}))$. The implication of the result is that the capacity benefits of relays, causal feedback, transmitter/receiver cooperation through physical channels and full duplex operation become a negligible fraction of the network capacity at high SNR. Some exceptions to this result are also pointed out in the paper. Second, we consider a network with $K$ full duplex nodes with an independent message from every node to every other node in the network. We find that the sum capacity of this network is bounded below by $\frac{K(K-1)}{2K-2}\log({SNR})+o(\log({SNR}))$ and bounded above by $\frac{K(K-1)}{2K-3}\log({SNR})+o(\log({SNR}))$.

Submitted 4 February, 2008; originally announced February 2008.

Journal ref: IEEE Transactions on Information Theory, Vol. 55, No. 5, May 2009, Pages: 2334-2344

arXiv:0711.2824 [pdf, ps, other]

Degrees of Freedom of Wireless X Networks

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: We explore the degrees of freedom of $M\times N$ user wireless $X$ networks, i.e. networks of $M$ transmitters and $N$ receivers where every transmitter has an independent message for every receiver. We derive a general outerbound on the degrees of freedom \emph{region} of these networks. When all nodes have a single antenna and all channel coefficients vary in time or frequency, we show that the \emph{total} number of degrees of freedom of the $X$ network is equal to $\frac{MN}{M+N-1}$ per orthogonal time and frequency dimension. Achievability is proved by constructing interference alignment schemes for $X$ networks that can come arbitrarily close to the outerbound on degrees of freedom. For the case where either $M=2$ or $N=2$ we find that the outerbound is exactly achievable. While $X$ networks have significant degrees of freedom benefits over interference networks when the number of users is small, our results show that as the number of users increases, this advantage disappears. Thus, for large $K$, the $K\times K$ user wireless $X$ network loses half the degrees of freedom relative to the $K\times K$ MIMO outerbound achievable through full cooperation. Interestingly, when there are few transmitters sending to many receivers ($N\gg M$) or many transmitters sending to few receivers ($M\gg N$), $X$ networks are able to approach the $\min(M,N)$ degrees of freedom possible with full cooperation on the $M\times N$ MIMO channel. Similar to the interference channel, we also construct an example of a 2 user $X$ channel with propagation delays where the outerbound on degrees of freedom is achieved through interference alignment based on a simple TDMA strategy.

Submitted 18 November, 2007; originally announced November 2007.

Comments: 26 pages
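
To make the scaling claims in the abstract above concrete, the following minimal sketch evaluates the degrees-of-freedom expression $\frac{MN}{M+N-1}$ for a few network sizes. The function name `x_network_dof` and the sample sizes are assumptions chosen for illustration; only the formula itself is taken from the abstract.

```python
from fractions import Fraction


def x_network_dof(M, N):
    """Total degrees of freedom MN / (M + N - 1) of an M x N user X network."""
    if M < 1 or N < 1:
        raise ValueError("need at least one transmitter and one receiver")
    return Fraction(M * N, M + N - 1)


# K x K case: the ratio to the K degrees of freedom of full cooperation
# (K x K MIMO) is K / (2K - 1), which tends to 1/2 as K grows.
for K in (2, 10, 100):
    dof = x_network_dof(K, K)
    print(K, dof, float(dof / K))

# Few transmitters, many receivers: the value approaches min(M, N) = 2.
print(float(x_network_dof(2, 1000)))
```

For M = N = 2 the function returns 4/3, and for K = 100 the ratio to full cooperation is 100/199, already close to one half.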

arXiv:0711.2547 [pdf, ps, other]

Interference Alignment on the Deterministic Channel and Application to Fully Connected AWGN Interference Networks

Authors: Viveck Cadambe, Syed A. Jafar, Shlomo Shamai

Abstract: An interference alignment example is constructed for the deterministic channel model of the $K$ user interference channel. The deterministic channel example is then translated into the Gaussian setting, creating the first known example of a fully connected Gaussian $K$ user interference network with single antenna nodes, real, non-zero and constant channel coefficients, and no propagation delays where the degrees of freedom outerbound is achieved. An analogy is drawn between the propagation delay based interference alignment examples and the deterministic channel model, which also allows similar constructions for the 2 user $X$ channel.

Submitted 15 November, 2007; originally announced November 2007.

arXiv:0707.0323 [pdf, ps, other]

Interference Alignment and the Degrees of Freedom for the K User Interference Channel

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: While the best known outerbound for the K user interference channel states that there cannot be more than K/2 degrees of freedom, it has been conjectured that in general the constant interference channel with any number of users has only one degree of freedom. In this paper, we explore the spatial degrees of freedom per orthogonal time and frequency dimension for the K user wireless interference channel where the channel coefficients take distinct values across frequency slots but are fixed in time. We answer five closely related questions. First, we show that K/2 degrees of freedom can be achieved by channel design, i.e. if the nodes are allowed to choose the best constant, finite and nonzero channel coefficient values. Second, we show that if channel coefficients cannot be controlled by the nodes but are selected by nature, i.e., randomly drawn from a continuous distribution, the total number of spatial degrees of freedom for the K user interference channel is almost surely K/2 per orthogonal time and frequency dimension. Thus, only half the spatial degrees of freedom are lost due to distributed processing of transmitted and received signals on the interference channel. Third, we show that interference alignment and zero forcing suffice to achieve all the degrees of freedom in all cases. Fourth, we show that the degrees of freedom $D$ directly lead to an $\mathcal{O}(1)$ capacity characterization of the form $C(SNR)=D\log(1+SNR)+\mathcal{O}(1)$ for the multiple access channel, the broadcast channel, the 2 user interference channel, the 2 user MIMO X channel and the 3 user interference channel with M>1 antennas at each node. Fifth, we characterize the degree of freedom benefits from cognitive sharing of messages on the 3 user interference channel.

Submitted 10 July, 2007; v1 submitted 3 July, 2007; originally announced July 2007.

Comments: 30 pages. Revision extends the 3 user proof to K users

arXiv:0706.1399 [pdf, ps, other]

Duality and Stability Regions of Multi-rate Broadcast and Multiple Access Networks

Authors: Viveck R. Cadambe, Syed A. Jafar

Abstract: We characterize stability regions of two-user fading Gaussian multiple access (MAC) and broadcast (BC) networks with centralized scheduling. The data to be transmitted to the users is encoded into codewords of fixed length. The rates of the codewords used are restricted to a fixed set of finite cardinality. With successive decoding and interference cancellation at the receivers, we find the set of arrival rates that can be stabilized over the MAC and BC networks. In MAC and BC networks with average power constraints, we observe that the duality property that relates the MAC and BC information theoretic capacity regions extends to their stability regions as well. In MAC and BC networks with peak power constraints, the union of stability regions of dual MAC networks is found to be strictly contained in the BC stability region.

Submitted 11 June, 2007; originally announced June 2007.

Comments: 12 pages, 11 figures, submitted to IEEE Trans. Information Theory for review
