-
Independence Properties of Generalized Submodular Information Measures
Authors:
Himanshu Asnani,
Jeff Bilmes,
Rishabh Iyer
Abstract:
Recently a class of generalized information measures was defined on sets of items parametrized by submodular functions. In this paper, we propose and study various notions of independence between sets with respect to such information measures, and connections thereof. Since entropy can also be used to parametrize such measures, we derive interesting independence properties for the entropy of sets of random variables. We also study the notion of multi-set independence and its properties. Finally, we present optimization algorithms for obtaining a set that is independent of another given set, and also discuss the implications and applications of combinatorial independence.
Submitted 6 August, 2021;
originally announced August 2021.
-
ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data
Authors:
Arnab Kumar Mondal,
Himanshu Asnani,
Parag Singla,
Prathosh AP
Abstract:
Clustering single-cell RNA sequence (scRNA-seq) data poses statistical and computational challenges due to its high dimensionality and data sparsity, also known as 'dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping from the high-dimensional data space to a low-dimensional latent space and vice versa, simultaneously imposing a distributional prior on the latent space, which brings in a regularization effect. This paper argues that RAEs suffer from the infamous problem of bias-variance trade-off in their naive formulation. While a simple AE without latent regularization results in data over-fitting, a very strong prior leads to under-representation and thus bad clustering. To address the above issues, we propose a modified RAE framework (called scRAE) for effective clustering of single-cell RNA sequencing data. scRAE consists of a deterministic AE with a flexibly learnable prior generator network, which is jointly trained with the AE. This allows scRAE to trade off better between the bias and variance in the latent space. We demonstrate the efficacy of the proposed method through extensive experimentation on several real-world single-cell gene expression datasets.
Submitted 16 July, 2021;
originally announced July 2021.
-
A Unified Framework for Generic, Query-Focused, Privacy Preserving and Update Summarization using Submodular Information Measures
Authors:
Vishal Kaushal,
Suraj Kothawade,
Ganesh Ramakrishnan,
Jeff Bilmes,
Himanshu Asnani,
Rishabh Iyer
Abstract:
We study submodular information measures as a rich framework for generic, query-focused, privacy-sensitive, and update summarization tasks. While past work generally treats these problems differently (e.g., different models are often used for generic and query-focused summarization), the submodular information measures allow us to study each of these problems via a unified approach. We first show that several previous query-focused and update summarization techniques have, unknowingly, used various instantiations of the aforesaid submodular information measures, providing evidence for the benefit and naturalness of these models. We then carefully study and demonstrate the modelling capabilities of the proposed functions in different settings and empirically verify our findings on both a synthetic dataset and an existing real-world image collection dataset, which has been extended with concept annotations for each image to make it suitable for this task and will be publicly released. We employ a max-margin framework to learn a mixture model built using the proposed instantiations of submodular information measures and demonstrate the effectiveness of our approach. While our experiments are in the context of image summarization, our framework is generic and can be easily extended to other summarization settings (e.g., videos or documents).
Submitted 12 October, 2020;
originally announced October 2020.
-
Submodular Combinatorial Information Measures with Applications in Machine Learning
Authors:
Rishabh Iyer,
Ninad Khargonkar,
Jeff Bilmes,
Himanshu Asnani
Abstract:
Information-theoretic quantities like entropy and mutual information have found numerous uses in machine learning. It is well known that there is a strong connection between these entropic quantities and submodularity since entropy over a set of random variables is submodular. In this paper, we study combinatorial information measures that generalize independence, (conditional) entropy, (conditional) mutual information, and total correlation defined over sets of (not necessarily random) variables. These measures strictly generalize the corresponding entropic measures since they are all parameterized via submodular functions that themselves strictly generalize entropy. Critically, we show that, unlike entropic mutual information in general, the submodular mutual information is actually submodular in one argument, holding the other fixed, for a large class of submodular functions whose third-order partial derivatives satisfy a non-negativity property. This turns out to include a number of practically useful cases such as the facility location and set-cover functions. We study specific instantiations of the submodular information measures on these, as well as the probabilistic coverage, graph-cut, and saturated coverage functions, and see that they all have mathematically intuitive and practically useful expressions. Regarding applications, we connect the maximization of submodular (conditional) mutual information to problems such as mutual-information-based, query-based, and privacy-preserving summarization -- and we connect optimizing the multi-set submodular mutual information to clustering and robust partitioning.
Submitted 2 March, 2021; v1 submitted 27 June, 2020;
originally announced June 2020.
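As a rough illustration of the submodular mutual information mentioned in the abstract above, the sketch below evaluates it for a toy set-cover function. It assumes the definition I_f(A;B) = f(A) + f(B) - f(A ∪ B), which the abstract does not state explicitly; the `COVERS` data and the function names are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical toy data: each item covers a set of concept ids.
COVERS: Dict[str, Set[int]] = {
    "a": {1, 2},
    "b": {2, 3},
    "c": {4},
}


def set_cover(items: Iterable[str]) -> int:
    """Set-cover function f(A): number of distinct concepts covered by A."""
    covered: Set[int] = set()
    for item in items:
        covered |= COVERS[item]
    return len(covered)


def submodular_mi(f: Callable[[Set[str]], int], a: Iterable[str], b: Iterable[str]) -> int:
    """Assumed definition: I_f(A;B) = f(A) + f(B) - f(A union B)."""
    a_set, b_set = set(a), set(b)
    return f(a_set) + f(b_set) - f(a_set | b_set)


if __name__ == "__main__":
    # For set cover this counts the concepts covered by both sets.
    print(submodular_mi(set_cover, {"a"}, {"b"}))
    print(submodular_mi(set_cover, {"a"}, {"c"}))
```

Under this assumed definition, the value for set cover reduces to the number of concepts covered by both sets, so two sets with disjoint coverage have zero submodular mutual information.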
-
To Regularize or Not To Regularize? The Bias Variance Trade-off in Regularized AEs
Authors:
Arnab Kumar Mondal,
Himanshu Asnani,
Parag Singla,
Prathosh AP
Abstract:
Regularized Auto-Encoders (RAEs) form a rich class of neural generative models. They effectively model the joint distribution between the data and the latent space using an Encoder-Decoder combination, with regularization imposed in terms of a prior over the latent space. Despite their advantages, such as stability in training, the performance of AE based models has not reached the standard of other generative models such as Generative Adversarial Networks (GANs). Motivated by this, we examine the effect of the latent prior on the generation quality of deterministic AE models in this paper. Specifically, we consider the class of RAEs with deterministic Encoder-Decoder pairs, Wasserstein Auto-Encoders (WAE), and show that having a fixed prior distribution, a priori, oblivious to the dimensionality of the 'true' latent space, will lead to the infeasibility of the optimization problem considered. Further, we show that, in the finite data regime, despite knowing the correct latent dimensionality, there exists a bias-variance trade-off with any arbitrary prior imposition. As a remedy to both the issues mentioned above, we introduce an additional state space in the form of flexibly learnable latent priors, in the optimization objective of the WAEs. We implicitly learn the distribution of the latent prior jointly with the AE training, which not only makes the learning objective feasible but also facilitates operation on different points of the bias-variance curve. We show the efficacy of our model, called FlexAE, through several experiments on multiple datasets, and demonstrate that it is the new state-of-the-art for the AE based generative models.
Submitted 19 September, 2020; v1 submitted 10 June, 2020;
originally announced June 2020.
-
C-MI-GAN : Estimation of Conditional Mutual Information using MinMax formulation
Authors:
Arnab Kumar Mondal,
Arnab Bhattacharya,
Sudipto Mukherjee,
Prathosh AP,
Sreeram Kannan,
Himanshu Asnani
Abstract:
Estimation of information theoretic quantities such as mutual information and its conditional variant has drawn interest in recent times owing to their multifaceted applications. Newly proposed neural estimators for these quantities have overcome severe drawbacks of classical $k$NN-based estimators in high dimensions. In this work, we focus on conditional mutual information (CMI) estimation by utilizing its formulation as a minmax optimization problem. Such a formulation leads to a joint training procedure similar to that of generative adversarial networks. We find that our proposed estimator provides better estimates than the existing approaches on a variety of simulated data sets comprising linear and non-linear relations between variables. As an application of CMI estimation, we deploy our estimator for conditional independence (CI) testing on real data and obtain better results than state-of-the-art CI testers.
Submitted 23 July, 2020; v1 submitted 17 May, 2020;
originally announced May 2020.
-
MaskAAE: Latent space optimization for Adversarial Auto-Encoders
Authors:
Arnab Kumar Mondal,
Sankalan Pal Chowdhury,
Aravind Jayendran,
Parag Singla,
Himanshu Asnani,
Prathosh AP
Abstract:
The field of neural generative models is dominated by the highly successful Generative Adversarial Networks (GANs) despite their challenges, such as training instability and mode collapse. Auto-Encoders (AE) with regularized latent space provide an alternative framework for generative models, although their performance has not reached that of GANs. In this work, we hypothesise that the dimensionality of the AE model's latent space has a critical effect on the quality of generated data. Under the assumption that nature generates data by sampling from a "true" generative latent space followed by a deterministic function, we show that the optimal performance is obtained when the dimensionality of the latent space of the AE model matches that of the "true" generative latent space. Further, we propose an algorithm called the Mask Adversarial Auto-Encoder (MaskAAE), in which the dimensionality of the latent space of an adversarial auto-encoder is brought closer to that of the "true" generative latent space, via a procedure to mask the spurious latent dimensions. We demonstrate through experiments on synthetic and several real-world datasets that the proposed formulation improves generation quality.
Submitted 17 May, 2020; v1 submitted 10 December, 2019;
originally announced December 2019.
-
Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels
Authors:
Yihan Jiang,
Hyeji Kim,
Himanshu Asnani,
Sreeram Kannan,
Sewoong Oh,
Pramod Viswanath
Abstract:
Designing codes that combat the noise in a communication medium has remained a significant area of research in information theory as well as wireless communications. Asymptotically optimal channel codes have been developed by mathematicians for communicating under canonical models after over 60 years of research. On the other hand, in many non-canonical channel settings, optimal codes do not exist and the codes designed for canonical models are adapted via heuristics to these channels and are thus not guaranteed to be optimal. In this work, we make significant progress on this problem by designing a fully end-to-end jointly trained neural encoder and decoder, namely, Turbo Autoencoder (TurboAE), with the following contributions: (a) under moderate block lengths, TurboAE approaches state-of-the-art performance under canonical channels; (b) moreover, TurboAE outperforms the state-of-the-art codes under non-canonical settings in terms of reliability. TurboAE shows that the development of channel coding design can be automated via deep learning, with near-optimal performance.
Submitted 7 November, 2019;
originally announced November 2019.
-
CCMI : Classifier based Conditional Mutual Information Estimation
Authors:
Sudipto Mukherjee,
Himanshu Asnani,
Sreeram Kannan
Abstract:
Conditional Mutual Information (CMI) is a measure of conditional dependence between random variables X and Y, given another random variable Z. It can be used to quantify conditional dependence among variables in many data-driven inference problems such as graphical models, causal learning, feature selection and time-series analysis. While k-nearest neighbor (kNN) based estimators as well as kernel-based methods have been widely used for CMI estimation, they suffer severely from the curse of dimensionality. In this paper, we leverage advances in classifiers and generative models to design methods for CMI estimation. Specifically, we introduce an estimator for KL-Divergence based on the likelihood ratio by training a classifier to distinguish the observed joint distribution from the product distribution. We then show how to construct several CMI estimators using this basic divergence estimator by drawing ideas from conditional generative models. We demonstrate that the estimates from our proposed approaches do not degrade in performance with increasing dimension and obtain significant improvement over the widely used KSG estimator. Finally, as an application of accurate CMI estimation, we use our best estimator for conditional independence testing and achieve better performance than the state-of-the-art tester on both simulated and real datasets.
Submitted 5 June, 2019;
originally announced June 2019.
-
DeepTurbo: Deep Turbo Decoder
Authors:
Yihan Jiang,
Hyeji Kim,
Himanshu Asnani,
Sreeram Kannan,
Sewoong Oh,
Pramod Viswanath
Abstract:
Present-day communication systems routinely use codes that approach the channel capacity when coupled with a computationally efficient decoder. However, the decoder is typically designed for the Gaussian noise channel and is known to be sub-optimal for non-Gaussian noise distribution. Deep learning methods offer a new approach for designing decoders that can be trained and tailored for arbitrary channel statistics. We focus on Turbo codes and propose DeepTurbo, a novel deep learning based architecture for Turbo decoding.
The standard Turbo decoder (Turbo) iteratively applies the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm with an interleaver in the middle. A neural architecture for Turbo decoding, termed NeuralBCJR, was proposed recently. There, the key idea is to create a module that imitates the BCJR algorithm using supervised learning, and to use the interleaver architecture along with this module, which is then fine-tuned using end-to-end training. However, knowledge of the BCJR algorithm is required to design such an architecture, which also constrains the resulting learned decoder. Here we remedy this requirement and propose a fully end-to-end trained neural decoder, the Deep Turbo Decoder (DeepTurbo). With a novel learnable decoder structure and training methodology, DeepTurbo shows superior performance under both AWGN and non-AWGN settings compared to the other two decoders, Turbo and NeuralBCJR. Furthermore, among all three, DeepTurbo exhibits the lowest error floor.
Submitted 24 April, 2019; v1 submitted 6 March, 2019;
originally announced March 2019.
-
LEARN Codes: Inventing Low-latency Codes via Recurrent Neural Networks
Authors:
Yihan Jiang,
Hyeji Kim,
Himanshu Asnani,
Sreeram Kannan,
Sewoong Oh,
Pramod Viswanath
Abstract:
Designing channel codes under low-latency constraints is one of the most demanding requirements in 5G standards. However, a sharp characterization of the performance of traditional codes is available only in the large block-length limit. Guided by such asymptotic analysis, code designs require large block lengths as well as latency to achieve the desired error rate. Tail-biting convolutional codes and other recent state-of-the-art short block codes, while promising reduced latency, are neither robust to channel-mismatch nor adaptive to varying channel conditions. When the codes designed for one channel (e.g., Additive White Gaussian Noise (AWGN) channel) are used for another (e.g., non-AWGN channels), heuristics are necessary to achieve non-trivial performance.
In this paper, we first propose an end-to-end learned neural code, obtained by jointly designing a Recurrent Neural Network (RNN) based encoder and decoder. This code outperforms canonical convolutional code under block settings. We then leverage this experience to propose a new class of codes under low-latency constraints, which we call Low-latency Efficient Adaptive Robust Neural (LEARN) codes. These codes outperform state-of-the-art low-latency codes and exhibit robustness and adaptivity properties. LEARN codes show the potential to design new versatile and universal codes for future communications via tools of modern deep learning coupled with communication engineering insights.
Submitted 24 July, 2020; v1 submitted 30 November, 2018;
originally announced November 2018.
-
Estimators for Multivariate Information Measures in General Probability Spaces
Authors:
Arman Rahimzamani,
Himanshu Asnani,
Pramod Viswanath,
Sreeram Kannan
Abstract:
Information theoretic quantities play an important role in various settings in machine learning, including causality testing, structure inference in graphical models, time-series problems, feature selection as well as in providing privacy guarantees. A key quantity of interest is the mutual information and generalizations thereof, including conditional mutual information, multivariate mutual information, total correlation and directed information. While the aforementioned information quantities are well defined in arbitrary probability spaces, existing estimators add or subtract entropies (we term them $ΣH$ methods). These methods work only in purely discrete space or purely continuous case since entropy (or differential entropy) is well defined only in that regime.
In this paper, we define a general graph divergence measure ($\mathbb{GDM}$), as a measure of incompatibility between the observed distribution and a given graphical model structure. This generalizes the aforementioned information measures and we construct a novel estimator via a coupling trick that directly estimates these multivariate information measures using the Radon-Nikodym derivative. These estimators are proven to be consistent in a general setting which includes several cases where the existing estimators fail, thus providing the only known estimators for the following settings: (1) the data has some discrete and some continuous-valued components (2) some (or all) of the components themselves are discrete-continuous mixtures (3) the data is real-valued but does not have a joint density on the entire space, rather is supported on a low-dimensional manifold. We show that our proposed estimators significantly outperform known estimators on synthetic and real datasets.
Submitted 26 October, 2018;
originally announced October 2018.
-
ClusterGAN : Latent Space Clustering in Generative Adversarial Networks
Authors:
Sudipto Mukherjee,
Himanshu Asnani,
Eugene Lin,
Sreeram Kannan
Abstract:
Generative Adversarial networks (GANs) have obtained remarkable success in many unsupervised learning tasks and unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space.
In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
Submitted 26 January, 2019; v1 submitted 10 September, 2018;
originally announced September 2018.
-
Mimic and Classify : A meta-algorithm for Conditional Independence Testing
Authors:
Rajat Sen,
Karthikeyan Shanmugam,
Himanshu Asnani,
Arman Rahimzamani,
Sreeram Kannan
Abstract:
Given independent samples generated from the joint distribution $p(\mathbf{x},\mathbf{y},\mathbf{z})$, we study the problem of Conditional Independence (CI-Testing), i.e., whether the joint equals the CI distribution $p^{CI}(\mathbf{x},\mathbf{y},\mathbf{z})= p(\mathbf{z}) p(\mathbf{y}|\mathbf{z})p(\mathbf{x}|\mathbf{z})$ or not. We cast this problem under the purview of the proposed, provable meta-algorithm, "Mimic and Classify", which is realized in two steps: (a) Mimic the CI distribution close enough to recover the support, and (b) Classify to distinguish the joint and the CI distribution. Thus, as long as we have a good generative model and a good classifier, we potentially have a sound CI Tester. With this modular paradigm, CI Testing becomes amenable to state-of-the-art generative and classification methods from the modern advances in Deep Learning, which in general can handle issues related to the curse of dimensionality and operation in the small-sample regime. We present extensive numerical experiments on synthetic and real datasets where new mimic methods, such as conditional GANs and Regression with Neural Nets, outperform the current best CI Testing performance in the literature. Our theoretical results provide analysis on the estimation of the null distribution as well as allow for general measures, i.e., when either some of the random variables are discrete and some are continuous or when one or more of them are discrete-continuous mixtures.
Submitted 25 June, 2018;
originally announced June 2018.
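To make the CI distribution in the abstract above concrete, the following sketch computes the conditional mutual information I(X;Y|Z) for a small discrete joint distribution as the KL divergence between the joint and $p^{CI}(x,y,z) = p(z)\,p(y|z)\,p(x|z)$. This is only an exact calculation on a known probability table, not the sample-based "Mimic and Classify" procedure; the function name and the dictionary representation of the joint are assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Tuple

Joint = Dict[Tuple[Hashable, Hashable, Hashable], float]


def conditional_mutual_information(joint: Joint) -> float:
    """Return I(X;Y|Z) in nats for a joint pmf given as {(x, y, z): prob}."""
    p_z = defaultdict(float)
    p_xz = defaultdict(float)
    p_yz = defaultdict(float)
    # Accumulate the marginals needed to build the CI distribution.
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p

    cmi = 0.0
    for (x, y, z), p in joint.items():
        if p > 0.0:
            # p_CI(x, y, z) = p(z) p(x|z) p(y|z) = p(x, z) p(y, z) / p(z)
            p_ci = p_xz[(x, z)] * p_yz[(y, z)] / p_z[z]
            cmi += p * math.log(p / p_ci)
    return cmi


if __name__ == "__main__":
    # X and Y are identical fair bits and Z is constant: strongly dependent.
    dependent = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}
    # X and Y are independent fair bits and Z is constant: CI holds.
    independent = {(x, y, 0): 0.25 for x in (0, 1) for y in (0, 1)}
    print(conditional_mutual_information(dependent))
    print(conditional_mutual_information(independent))
```

The joint equals the CI distribution exactly when this quantity is zero, which is the null hypothesis a CI tester tries to assess from samples.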
-
Capacity of a POST Channel with and without Feedback
Authors:
Haim H. Permuter,
Himanshu Asnani,
Tsachy Weissman
Abstract:
We consider finite state channels where the state of the channel is its previous output. We refer to these as POST (Previous Output is the STate) channels. We first focus on POST($α$) channels. These channels have binary inputs and outputs, where the state determines if the channel behaves as a $Z$ or an $S$ channel, both with parameter $α$. We show that the non-feedback capacity of the POST($α$) channel equals its feedback capacity, despite the memory of the channel. The proof of this surprising result is based on showing that the induced output distribution, when maximizing the directed information in the presence of feedback, can also be achieved by an input distribution that does not utilize the feedback. We show that this is a sufficient condition for the feedback capacity to equal the non-feedback capacity for any finite state channel. We show that the result carries over from the POST($α$) channel to a binary POST channel where the previous output determines whether the current channel will be binary with parameters $(a,b)$ or $(b,a)$. Finally, we show that, in general, feedback may increase the capacity of a POST channel.
Submitted 21 September, 2013;
originally announced September 2013.
-
Network Compression: Worst-Case Analysis
Authors:
Himanshu Asnani,
Ilan Shomorony,
A. Salman Avestimehr,
Tsachy Weissman
Abstract:
We study the problem of communicating a distributed correlated memoryless source over a memoryless network, from source nodes to destination nodes, under quadratic distortion constraints. We establish the following two complementary results: (a) for an arbitrary memoryless network, among all distributed memoryless sources of a given correlation, Gaussian sources are least compressible, that is, they admit the smallest set of achievable distortion tuples, and (b) for any memoryless source to be communicated over a memoryless additive-noise network, among all noise processes of a given correlation, Gaussian noise admits the smallest achievable set of distortion tuples. We establish these results constructively by showing how schemes for the corresponding Gaussian problems can be applied to achieve similar performance for (source or noise) distributions that are not necessarily Gaussian but have the same covariance.
Submitted 5 April, 2013;
originally announced April 2013.
-
Worst-Case Source for Distributed Compression with Quadratic Distortion
Authors:
Ilan Shomorony,
A. Salman Avestimehr,
Himanshu Asnani,
Tsachy Weissman
Abstract:
We consider the k-encoder source coding problem with a quadratic distortion measure. We show that among all source distributions with a given covariance matrix K, the jointly Gaussian source requires the highest rates in order to meet a given set of distortion constraints.
Submitted 8 August, 2012;
originally announced August 2012.
-
Information Embedding on Actions
Authors:
Behzad Ahmadi,
Himanshu Asnani,
Osvaldo Simeone,
Haim H. Permuter
Abstract:
The problem of optimal actuation for channel and source coding was recently formulated and solved in a number of relevant scenarios. In this class of models, actions are taken at encoders or decoders, either to acquire side information in an efficient way or to control or probe effectively the channel state. In this paper, the problem of embedding information on the actions is studied for both the source and the channel coding set-ups. In both cases, a decoder is present that observes only a function of the actions taken by an encoder or a decoder of an action-dependent point-to-point link. For the source coding model, this decoder wishes to reconstruct a lossy version of the source being transmitted over the point-to-point link, while for the channel coding problem the decoder wishes to retrieve a portion of the message conveyed over the link. For the problem of source coding with actions taken at the decoder, a single-letter characterization of the set of all achievable tuples of rate, distortions at the two decoders and action cost is derived, under the assumption that the mentioned decoder observes a function of the actions non-causally, strictly causally or causally. A special case of the problem in which the actions are taken by the encoder is also solved. A single-letter characterization of the achievable capacity-cost region is then obtained for the channel coding set-up with actions. Examples are provided that shed light on the effect of information embedding on the actions for the action-dependent source and channel coding problems.
Submitted 25 July, 2012;
originally announced July 2012.
-
Lossy Compression of Quality Values via Rate Distortion Theory
Authors:
Himanshu Asnani,
Dinesh Bharadia,
Mainak Chowdhury,
Idoia Ochoa,
Itai Sharon,
Tsachy Weissman
Abstract:
Motivation: Next Generation Sequencing technologies revolutionized many fields in biology by enabling the fast and cheap sequencing of large amounts of genomic data. The ever increasing sequencing capacities enabled by current sequencing machines hold great promise for future applications of these technologies, but also create increasing computational challenges related to the analysis and storage of these data. A typical sequencing data file may occupy tens or even hundreds of gigabytes of disk space, prohibitively large for many users. Raw sequencing data consists of both the DNA sequences (reads) and per-base quality values that indicate the level of confidence in the readout of these sequences. Quality values account for about half of the required disk space in the commonly used FASTQ format and therefore their compression can significantly reduce storage requirements and speed up analysis and transmission of these data.
Results: In this paper we present a framework for the lossy compression of the quality value sequences of genomic read files. Numerical experiments with reference-based alignment using these quality values suggest that we can achieve significant compression with little compromise in performance for several downstream applications of interest, as is consistent with our theoretical analysis. Our framework also allows compression in a regime, below one bit per quality value, for which there are no existing compressors.
Submitted 21 July, 2012;
originally announced July 2012.
-
Successive Refinement with Decoder Cooperation and its Channel Coding Duals
Authors:
Himanshu Asnani,
Haim Permuter,
Tsachy Weissman
Abstract:
We study cooperation in multi-terminal source coding models involving successive refinement. Specifically, we study the case of a single encoder and two decoders, where the encoder provides a common description to both the decoders and a private description to only one of the decoders. The decoders cooperate via cribbing, i.e., the decoder with access only to the common description is allowed to observe, in addition, a deterministic function of the reconstruction symbols produced by the other. We characterize the fundamental performance limits in the respective settings of non-causal, strictly-causal and causal cribbing. We use a new coding scheme, referred to as Forward Encoding and Block Markov Decoding, which is a variant of one recently used by Cuff and Zhao for coordination via implicit communication. Finally, we use the insight gained to introduce and solve some dual channel coding scenarios involving Multiple Access Channels with cribbing.
Submitted 21 March, 2012;
originally announced March 2012.
-
Multi-Terminal Source Coding With Action Dependent Side Information
Authors:
Yeow-Khiang Chia,
Himanshu Asnani,
Tsachy Weissman
Abstract:
We consider multi-terminal source coding with a single encoder and multiple decoders where either the encoder or the decoders can take cost-constrained actions which affect the quality of the side information present at the decoders. For the scenario where decoders take actions, we characterize the rate-cost trade-off region for lossless source coding, and give an achievability scheme for lossy source coding for two decoders which is optimum for a variety of special cases of interest. For the case where the encoder takes actions, we characterize the rate-cost trade-off for a class of lossless source coding scenarios with multiple decoders. Finally, we also consider extensions to other multi-terminal source coding settings with actions, and characterize the rate-distortion-cost trade-off for a case of successive refinement with actions.
Submitted 31 October, 2011;
originally announced October 2011.
-
On Real Time Coding with Limited Lookahead
Authors:
Himanshu Asnani,
Tsachy Weissman
Abstract:
A real-time coding system with lookahead consists of a memoryless source, a memoryless channel, an encoder, which encodes the source symbols sequentially with knowledge of future source symbols up to a fixed finite lookahead, d, with or without feedback of the past channel output symbols, and a decoder, which sequentially constructs the source symbols using the channel output. The objective is to minimize the expected per-symbol distortion. For a fixed finite lookahead d>=1 we invoke the theory of controlled Markov chains to obtain an average cost optimality equation (ACOE), the solution of which, denoted by D(d), is the minimum expected per-symbol distortion. With increasing d, D(d) bridges the gap between causal encoding, d=0, where symbol-by-symbol encoding-decoding is optimal, and the infinite lookahead case, d=\infty, where Shannon-theoretic arguments show that separation is optimal. We extend the analysis to a system with finite state decoders, with or without noise-free feedback. For a Bernoulli source and binary symmetric channel, under Hamming loss, we compute the optimal distortion for various source and channel parameters, and thus obtain computable bounds on D(d). We also identify regions of source and channel parameters where symbol-by-symbol encoding-decoding is suboptimal. Finally, we demonstrate the wide applicability of our approach by applying it in additional coding scenarios, such as the case where the sequential decoder can take cost-constrained actions affecting the quality or availability of side information about the source.
Submitted 29 May, 2011;
originally announced May 2011.
-
Multiple Access Channel with Partial and Controlled Cribbing Encoders
Authors:
Haim Permuter,
Himanshu Asnani
Abstract:
In this paper we consider a multiple access channel (MAC) with partial cribbing encoders. This means that each of two encoders obtains a deterministic function of the other encoder output with or without delay. The partial cribbing scheme is especially motivated by the additive noise Gaussian MAC since perfect cribbing results in the degenerate case of full cooperation between the encoders and requires an infinite entropy link. We derive a single-letter characterization of the capacity of the MAC with partial cribbing for the cases of causal and strictly causal partial cribbing. Several numerical examples, such as quantized cribbing, are presented. We further consider and derive the capacity region where the cribbing depends on actions that are functions of the previous cribbed observations. In particular, we consider a scenario where the action is "to crib or not to crib" and show that a naive time-sharing strategy is not optimal.
Submitted 21 March, 2011;
originally announced March 2011.
-
To Feed or Not to Feed Back
Authors:
Himanshu Asnani,
Haim Permuter,
Tsachy Weissman
Abstract:
We study the communication over Finite State Channels (FSCs), where the encoder and the decoder can control the availability or the quality of the noise-free feedback. Specifically, the instantaneous feedback is a function of an action taken by the encoder, an action taken by the decoder, and the channel output. Encoder and decoder actions take values in finite alphabets, and may be subject to average cost constraints. We prove capacity results for such a setting by constructing a sequence of achievable rates, using a simple scheme based on 'code tree' generation, that generates channel input symbols along with encoder and decoder actions. We prove that the limit of this sequence exists. For a given block length and probability of error, we give an upper bound on the maximum achievable rate. Our upper and lower bounds coincide and hence yield the capacity for the case where the probability of initial state is positive for all states. Further, for stationary indecomposable channels without intersymbol interference (ISI), the capacity is given as the limit of normalized directed information between the input and output sequence, maximized over an appropriate set of causally conditioned distributions. As an important special case, we consider the framework of 'to feed or not to feed back' where either the encoder or the decoder takes binary actions, which determine whether current channel output will be fed back to the encoder, with a constraint on the fraction of channel outputs that are fed back. As another special case of our framework, we characterize the capacity of 'coding on the backward link' in FSCs, i.e. when the decoder sends limited-rate instantaneous coded noise-free feedback on the backward link. Finally, we propose an extension of the Blahut-Arimoto algorithm for evaluating the capacity when actions can be cost constrained, and demonstrate its application on a few examples.
Submitted 11 August, 2012; v1 submitted 6 November, 2010;
originally announced November 2010.
-
Probing Capacity
Authors:
Himanshu Asnani,
Haim Permuter,
Tsachy Weissman
Abstract:
We consider the problem of optimal probing of states of a channel by transmitter and receiver for maximizing the rate of reliable communication. The channel is discrete memoryless (DMC) with i.i.d. states. The encoder takes probing actions dependent on the message. It then uses the state information obtained from probing causally or non-causally to generate channel input symbols. The decoder may also take channel probing actions as a function of the observed channel output and use the channel state information thus acquired, along with the channel output, to estimate the message. We refer to the maximum achievable rate for reliable communication for such systems as the 'Probing Capacity'. We characterize this capacity when the encoder and decoder actions are cost constrained. To motivate the problem, we begin by characterizing the trade-off between the capacity and the fraction of channel states the encoder is allowed to observe, while the decoder is aware of channel states. In this setting of 'to observe or not to observe' state at the encoder, we compute certain numerical examples and note a pleasing phenomenon, where the encoder can observe a relatively small fraction of states and yet communicate at the maximum rate, i.e. the rate achieved when observing states at the encoder is not cost constrained.
Submitted 6 October, 2010;
originally announced October 2010.
-
Asymptotic Capacity of Wireless Ad Hoc Networks with Realistic Links under a Honey Comb Topology
Authors:
Himanshu Asnani,
Abhay Karandikar
Abstract:
We consider the effects of Rayleigh fading and lognormal shadowing in the physical interference model for all the successful transmissions of traffic across the network. New bounds are derived for the capacity of a given random ad hoc wireless network that reflect the packet drop or capture probability of the transmission links. These bounds are based on a simplified network topology, termed the honey-comb topology, under a given routing and scheduling scheme.
Submitted 4 October, 2010; v1 submitted 10 November, 2007;
originally announced November 2007.