-
On Sampling from Ising Models with Spectral Constraints
Authors:
Andreas Galanis,
Alkis Kalavasis,
Anthimos Vardis Kandiros
Abstract:
We consider the problem of sampling from the Ising model when the underlying interaction matrix has eigenvalues lying within an interval of length $γ$. Recent work in this setting has shown various algorithmic results that apply roughly when $γ< 1$, notably with nearly-linear running times based on the classical Glauber dynamics. However, the optimality of the range of $γ$ was not clear since previous inapproximability results developed for the antiferromagnetic case (where the matrix has entries $\leq 0$) apply only for $γ>2$.
To this end, Kunisky (SODA'24) recently provided evidence that the problem becomes hard already when $γ>1$ based on the low-degree hardness for an inference problem on random matrices. Based on this, he conjectured that sampling from the Ising model in the same range of $γ$ is NP-hard.
Here we confirm this conjecture, complementing in particular the known algorithmic results by showing NP-hardness results for approximately counting and sampling when $γ>1$, with strong inapproximability guarantees; we also obtain a more refined hardness result for matrices where only a constant number of entries per row are allowed to be non-zero. The main observation in our reductions is that, for $γ>1$, Glauber dynamics mixes slowly when the interactions are all positive (ferromagnetic) for the complete and random regular graphs, due to a bimodality in the underlying distribution. While ferromagnetic interactions typically preclude NP-hardness results, here we work around this by introducing mild antiferromagnetism in an appropriate way, keeping the spectrum roughly within the same range. This allows us to exploit the bimodality of the aforementioned graphs and show the target NP-hardness by suitably adapting previous inapproximability techniques developed for antiferromagnetic systems.
Submitted 10 July, 2024;
originally announced July 2024.
-
Learning Hard-Constrained Models with One Sample
Authors:
Andreas Galanis,
Alkis Kalavasis,
Anthimos Vardis Kandiros
Abstract:
We consider the problem of estimating the parameters of a Markov Random Field with hard constraints using a single sample. As our main running examples, we use the $k$-SAT and the proper coloring models, as well as general $H$-coloring models; for all of these we obtain both positive and negative results. In contrast to the soft-constrained case, we show in particular that single-sample estimation is not always possible, and that the existence of an estimator is related to the existence of non-satisfiable instances.
Our algorithms are based on the pseudo-likelihood estimator. We show variance bounds for this estimator using coupling techniques inspired, in the case of $k$-SAT, by Moitra's sampling algorithm (JACM, 2019); our positive results for colorings build on this new coupling approach. For $q$-colorings on graphs with maximum degree $d$, we give a linear-time estimator when $q>d+1$, whereas the problem is non-identifiable when $q\leq d+1$. For general $H$-colorings, we show that standard conditions that guarantee sampling, such as Dobrushin's condition, are insufficient for one-sample learning; on the positive side, we provide a general condition that is sufficient to guarantee linear-time learning and obtain applications for proper colorings and permissive models. For the $k$-SAT model on formulas with maximum degree $d$, we provide a linear-time estimator when $k\gtrsim 6.45\log d$, whereas the problem becomes non-identifiable when $k\lesssim \log d$.
Submitted 6 November, 2023;
originally announced November 2023.
-
Learning and Testing Latent-Tree Ising Models Efficiently
Authors:
Davin Choo,
Yuval Dagan,
Constantinos Daskalakis,
Anthimos Vardis Kandiros
Abstract:
We provide time- and sample-efficient algorithms for learning and testing latent-tree Ising models, i.e. Ising models that may only be observed at their leaf nodes. On the learning side, we obtain efficient algorithms for learning a tree-structured Ising model whose leaf node distribution is close in total variation distance, improving on the results of prior work. On the testing side, we provide an efficient algorithm with fewer samples for testing whether two latent-tree Ising models have leaf-node distributions that are close or far in total variation distance. We obtain our algorithms by showing novel localization results for the total variation distance between the leaf-node distributions of tree-structured Ising models, in terms of their marginals on pairs of leaves.
Submitted 10 July, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
EM's Convergence in Gaussian Latent Tree Models
Authors:
Yuval Dagan,
Constantinos Daskalakis,
Anthimos Vardis Kandiros
Abstract:
We study the optimization landscape of the log-likelihood function and the convergence of the Expectation-Maximization (EM) algorithm in latent Gaussian tree models, i.e. tree-structured Gaussian graphical models whose leaf nodes are observable and non-leaf nodes are unobservable. We show that the unique non-trivial stationary point of the population log-likelihood is its global maximum, and establish that the EM algorithm is guaranteed to converge to it in the single latent variable case. Our results for the landscape of the log-likelihood function in general latent tree models provide support for the extensive practical use of maximum-likelihood-based methods in this setting. Our results for the EM algorithm extend an emerging line of work on obtaining global convergence guarantees for this celebrated algorithm. We show our results for the non-trivial stationary points of the log-likelihood by arguing that a certain system of polynomial equations obtained from the EM updates has a unique non-trivial solution. The global convergence of the EM algorithm follows by arguing that all trivial fixed points are higher-order saddle points.
Submitted 23 November, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Statistical Estimation from Dependent Data
Authors:
Yuval Dagan,
Constantinos Daskalakis,
Nishanth Dikkala,
Surbhi Goel,
Anthimos Vardis Kandiros
Abstract:
We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where e.g. these observations are collected on a spatial domain, a temporal domain, or a social network, which induce dependencies. We model these dependencies in the language of Markov Random Fields and, importantly, allow these dependencies to be substantial, i.e. we do not assume that the Markov Random Field capturing these dependencies is in high temperature. As our main contribution we provide algorithms and statistically efficient estimation rates for this model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network settings with dependent data. Our estimation guarantees follow from novel results for estimating the parameters (i.e. external fields and interaction strengths) of Ising models from a single sample. We evaluate our estimation approach on real networked data, showing that it outperforms standard regression approaches that ignore dependencies, across three text classification datasets: Cora, Citeseer and Pubmed.
Submitted 20 July, 2021;
originally announced July 2021.
-
Learning Ising models from one or multiple samples
Authors:
Yuval Dagan,
Constantinos Daskalakis,
Nishanth Dikkala,
Anthimos Vardis Kandiros
Abstract:
There have been two separate lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples.
Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the afore-described multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper.
Submitted 10 December, 2020; v1 submitted 20 April, 2020;
originally announced April 2020.