-
Comparison Theorems for the Mixing Times of Systematic and Random Scan Dynamics
Authors:
Jason Gaitonde,
Elchanan Mossel
Abstract:
A popular method for sampling from high-dimensional distributions is the Gibbs sampler, which iteratively resamples sites from the conditional distribution of the desired measure given the values of the other coordinates. But to what extent does the order of site updates matter in the mixing time? Two natural choices are (i) standard, or random scan, Glauber dynamics where the updated variable is chosen uniformly at random, and (ii) the systematic scan dynamics where variables are updated in a fixed, cyclic order. We first show that for systems of dimension $n$, one round of the systematic scan dynamics has spectral gap at most a factor of order $n$ worse than the corresponding spectral gap of a single step of Glauber dynamics, tightening existing bounds in the literature by He, et al. [NeurIPS '16] and Chlebicka, Łatuszyński, and Miasodejow [Ann. Appl. Probab. '24]. This result is sharp even for simple spin systems by an example of Roberts and Rosenthal [Int. J. Statist. Prob. '15]. We complement this with a converse statement: if all, or even just one scan order rapidly mixes, the Glauber dynamics has a polynomially related mixing time, resolving a question of Chlebicka, Łatuszyński, and Miasodejow. Our arguments are simple and only use elementary linear algebra and probability.
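The two update orders compared in this abstract can be illustrated with a minimal sketch. It assumes a ferromagnetic Ising model with uniform coupling `beta` on a cycle graph; the model, the function names, and the parameters are illustrative choices and are not taken from the paper.

```python
import math
import random


def conditional_prob_plus(spins, neighbors, beta, i):
    """P(spin i = +1 | all other spins) for pi(s) proportional to exp(beta * sum_{i~j} s_i s_j)."""
    field = beta * sum(spins[j] for j in neighbors[i])
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def resample_site(spins, neighbors, beta, i, rng):
    """Gibbs update: redraw spin i from its conditional distribution."""
    p_plus = conditional_prob_plus(spins, neighbors, beta, i)
    spins[i] = 1 if rng.random() < p_plus else -1


def random_scan_step(spins, neighbors, beta, rng):
    """One step of (random scan) Glauber dynamics: update a uniformly random site."""
    resample_site(spins, neighbors, beta, rng.randrange(len(spins)), rng)


def systematic_scan_round(spins, neighbors, beta, rng, order=None):
    """One round of systematic scan: update every site once, in a fixed order."""
    for i in (order if order is not None else range(len(spins))):
        resample_site(spins, neighbors, beta, i, rng)


# Hypothetical instance: cycle graph on n sites (an assumption for illustration).
n = 8
neighbors = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
rng = random.Random(0)
spins = [rng.choice([-1, 1]) for _ in range(n)]

for _ in range(n):  # n random-scan steps do the same number of updates as one round
    random_scan_step(spins, neighbors, 0.3, rng)
systematic_scan_round(spins, neighbors, 0.3, rng)
```

One systematic round performs `n` site updates, which is why the abstract compares a full round of systematic scan against a single step of Glauber dynamics.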
Submitted 14 October, 2024;
originally announced October 2024.
-
Bypassing the Noisy Parity Barrier: Learning Higher-Order Markov Random Fields from Dynamics
Authors:
Jason Gaitonde,
Ankur Moitra,
Elchanan Mossel
Abstract:
We consider the problem of learning graphical models, also known as Markov random fields (MRFs) from temporally correlated samples. As in many traditional statistical settings, fundamental results in the area all assume independent samples from the distribution. However, these samples generally will not directly correspond to more realistic observations from nature, which instead evolve according to some stochastic process. From the computational lens, even generating a single sample from the true MRF distribution is intractable unless $\mathsf{NP}=\mathsf{RP}$, and moreover, any algorithm to learn from i.i.d. samples requires prohibitive runtime due to hardness reductions to the parity with noise problem. These computational barriers for sampling and learning from the i.i.d. setting severely lessen the utility of these breakthrough results for this important task; however, dropping this assumption typically only introduces further algorithmic and statistical complexities.
In this work, we surprisingly demonstrate that the direct trajectory data from a natural evolution of the MRF overcomes the fundamental computational lower bounds to efficient learning. In particular, we show that given a trajectory with $\widetilde{O}_k(n)$ site updates of an order $k$ MRF from the Glauber dynamics, a well-studied, natural stochastic process on graphical models, there is an algorithm that recovers the graph and the parameters in $\widetilde{O}_k(n^2)$ time. By contrast, all prior algorithms for learning order $k$ MRFs inherently suffer from $n^{\Theta(k)}$ runtime even in sparse instances due to the reductions to sparse parity with noise. Our results thus surprisingly show that this more realistic, but intuitively less tractable, model for MRFs actually leads to efficiency far beyond what is known and believed to be true in the traditional i.i.d. case.
Submitted 4 November, 2024; v1 submitted 8 September, 2024;
originally announced September 2024.
-
Sample-Efficient Linear Regression with Self-Selection Bias
Authors:
Jason Gaitonde,
Elchanan Mossel
Abstract:
We consider the problem of linear regression with self-selection bias in the unknown-index setting, as introduced in recent work by Cherapanamjeri, Daskalakis, Ilyas, and Zampetakis [STOC 2023]. In this model, one observes $m$ i.i.d. samples $(\mathbf{x}_{\ell},z_{\ell})_{\ell=1}^m$ where $z_{\ell}=\max_{i\in [k]}\{\mathbf{x}_{\ell}^T\mathbf{w}_i+\eta_{i,\ell}\}$, but the maximizing index $i_{\ell}$ is unobserved. Here, the $\mathbf{x}_{\ell}$ are assumed to be $\mathcal{N}(0,I_n)$ and the noise distribution $\boldsymbol{\eta}_{\ell}\sim \mathcal{D}$ is centered and independent of $\mathbf{x}_{\ell}$. We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1,\ldots,\mathbf{w}_k\in \mathbb{R}^n$ up to additive $\ell_2$-error $\varepsilon$ with polynomial sample complexity $\tilde{O}(n)\cdot \mathsf{poly}(k,1/\varepsilon)$ and significantly improved time complexity $\mathsf{poly}(n,k,1/\varepsilon)+O(\log(k)/\varepsilon)^{O(k)}$. When $k=O(1)$, our algorithm runs in $\mathsf{poly}(n,1/\varepsilon)$ time, generalizing the polynomial guarantee of an explicit moment matching algorithm of Cherapanamjeri, et al. for $k=2$ and when it is known that $\mathcal{D}=\mathcal{N}(0,I_k)$. Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression where the added noise is taken outside the maximum. For this problem, our algorithm is efficient in a much larger range of $k$ than the state-of-the-art due to Ghosh, Pananjady, Guntuboyina, and Ramchandran [IEEE Trans. Inf. Theory 2022] for not too small $\varepsilon$, and leads to improved algorithms for any $\varepsilon$ by providing a warm start for existing local convergence methods.
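The observation model in this abstract can be made concrete with a short data-generating sketch. It covers only the sampling model, not the recovery algorithm. The Gaussian noise with adjustable `noise_std` is an assumption (the abstract only requires centered noise independent of the covariates), and the function name is hypothetical.

```python
import random


def sample_self_selection(ws, m, rng, noise_std=1.0):
    """Draw m samples (x, z) with z = max_i (x . w_i + eta_i); the argmax index is not returned.

    ws: list of k weight vectors, each of length n.
    noise_std: standard deviation of the assumed Gaussian noise (illustrative choice).
    """
    k, n = len(ws), len(ws[0])
    samples = []
    for _ in range(m):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # x ~ N(0, I_n)
        eta = [rng.gauss(0.0, noise_std) for _ in range(k)]  # centered noise, independent of x
        values = [sum(xj * wj for xj, wj in zip(x, w)) + e for w, e in zip(ws, eta)]
        samples.append((x, max(values)))                    # only the maximum is observed
    return samples


# Hypothetical instance with k = 2 regressors in dimension n = 3.
ws = [[1.0, 0.0, 0.0], [0.0, -1.0, 2.0]]
data = sample_self_selection(ws, m=5, rng=random.Random(1))
```

The learner sees only the pairs in `data`; which of the `k` regressors produced each `z` is hidden, which is what distinguishes the unknown-index setting.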
Submitted 21 February, 2024;
originally announced February 2024.
-
A Unified Approach to Learning Ising Models: Beyond Independence and Bounded Width
Authors:
Jason Gaitonde,
Elchanan Mossel
Abstract:
We revisit the problem of efficiently learning the underlying parameters of Ising models from data. Current algorithmic approaches achieve essentially optimal sample complexity when given i.i.d. samples from the stationary measure and the underlying model satisfies "width" bounds on the total $\ell_1$ interaction involving each node. We show that a simple existing approach based on node-wise logistic regression provably succeeds at recovering the underlying model in several new settings where these assumptions are violated:
(1) Given dynamically generated data from a wide variety of local Markov chains, like block or round-robin dynamics, logistic regression recovers the parameters with optimal sample complexity up to $\log\log n$ factors. This generalizes the specialized algorithm of Bresler, Gamarnik, and Shah [IEEE Trans. Inf. Theory'18] for structure recovery in bounded degree graphs from Glauber dynamics.
(2) For the Sherrington-Kirkpatrick model of spin glasses, given $\mathsf{poly}(n)$ independent samples, logistic regression recovers the parameters in most of the known high-temperature regime via a simple reduction to weaker structural properties of the measure. This improves on recent work of Anari, Jain, Koehler, Pham, and Vuong [ArXiv'23] which gives distribution learning at higher temperature.
(3) As a simple byproduct of our techniques, logistic regression achieves an exponential improvement in learning from samples in the M-regime of data considered by Dutt, Lokhov, Vuffray, and Misra [ICML'21] as well as novel guarantees for learning from the adversarial Glauber dynamics of Chin, Moitra, Mossel, and Sandon [ArXiv'23].
Our approach thus significantly generalizes the elegant analysis of Wu, Sanghavi, and Dimakis [NeurIPS'19] without any algorithmic modification.
Submitted 15 November, 2023;
originally announced November 2023.
-
Budget Pacing in Repeated Auctions: Regret and Efficiency without Convergence
Authors:
Jason Gaitonde,
Yingkai Li,
Bar Light,
Brendan Lucier,
Aleksandrs Slivkins
Abstract:
We study the aggregate welfare and individual regret guarantees of dynamic pacing algorithms in the context of repeated auctions with budgets. Such algorithms are commonly used as bidding agents in Internet advertising platforms, adaptively learning to shade bids by a tunable linear multiplier in order to match a specified budget. We show that when agents simultaneously apply a natural form of gradient-based pacing, the liquid welfare obtained over the course of the learning dynamics is at least half the optimal expected liquid welfare obtainable by any allocation rule. Crucially, this result holds without requiring convergence of the dynamics, allowing us to circumvent known complexity-theoretic obstacles of finding equilibria. This result is also robust to the correlation structure between agent valuations and holds for any core auction, a broad class of auctions that includes first-price, second-price, and generalized second-price auctions as special cases. For individual guarantees, we further show such pacing algorithms enjoy dynamic regret bounds for individual value maximization, with respect to the sequence of budget-pacing bids, for any auction satisfying a monotone bang-for-buck property. To complement our theoretical findings, we provide semi-synthetic numerical simulations based on auction data from the Bing Advertising platform.
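As a rough illustration of bid shading with a gradient-updated multiplier, the sketch below follows one common form of adaptive pacing: bid `value / (1 + mu)`, then move `mu` by a projected gradient step toward a per-round spend target. It may differ from the exact rule analyzed in the paper. The second-price setting, the function names, and the parameters `step_size` and `mu_max` are all assumptions made for the example.

```python
def pacing_bid(value, mu):
    """Shade the bid by the multiplier 1 / (1 + mu); mu = 0 means bidding the full value."""
    return value / (1.0 + mu)


def update_multiplier(mu, spend, target_spend, step_size, mu_max):
    """Projected gradient step: raise mu after overspending, lower it after underspending."""
    mu_next = mu - step_size * (target_spend - spend)
    return min(max(mu_next, 0.0), mu_max)


def run_second_price(values, competing_bids, budget, step_size=0.1, mu_max=10.0):
    """Simulate one paced bidder against a fixed sequence of highest competing bids."""
    target = budget / len(values)  # per-round spend target
    mu, remaining, total_value = 0.0, budget, 0.0
    for v, d in zip(values, competing_bids):
        bid = min(pacing_bid(v, mu), remaining)  # never bid more than the remaining budget
        won = bid > d
        spend = d if won else 0.0                # second-price payment on a win
        if won:
            total_value += v
        remaining -= spend
        mu = update_multiplier(mu, spend, target, step_size, mu_max)
    return total_value, budget - remaining, mu


# Hypothetical inputs, chosen only to exercise the update rule.
total_value, spent, final_mu = run_second_price(
    values=[1.0, 0.8, 1.2, 0.5, 0.9, 1.1],
    competing_bids=[0.6, 0.7, 0.5, 0.4, 0.8, 0.3],
    budget=1.5,
)
```

Capping each bid at the remaining budget guarantees that total spend never exceeds the budget in a second-price auction, since the payment is always below the winning bid.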
Submitted 13 November, 2024; v1 submitted 17 May, 2022;
originally announced May 2022.
-
Eigenstripping, Spectral Decay, and Edge-Expansion on Posets
Authors:
Jason Gaitonde,
Max Hopkins,
Tali Kaufman,
Shachar Lovett,
Ruizhe Zhang
Abstract:
We study the relationship between the underlying structure of posets and the spectral and combinatorial properties of their higher-order random walks. While fast mixing of random walks on hypergraphs has led to myriad breakthroughs throughout theoretical computer science in the last five years, many other important applications (e.g. locally testable codes, 2-2 games) rely on the more general non-simplicial structures. These works make it clear that the global expansion properties of posets depend strongly on their underlying architecture (e.g. simplicial, cubical, linear algebraic), but the overall phenomenon remains poorly understood. In this work, we quantify the advantage of different architectures, highlighting how structural regularity controls the spectral decay and edge-expansion of corresponding random walks.
In particular, we show the spectra of walks on expanding posets (Dikstein, Dinur, Filmus, Harsha RANDOM 2018) concentrate in strips around a small number of approximate eigenvalues controlled by the poset's regularity. This gives a simple condition to identify architectures (e.g. the Grassmann) that exhibit fast (exponential) decay of eigenvalues, versus architectures like hypergraphs with slow (linear) decay -- a crucial distinction in applications to hardness of approximation and agreement testing such as the recent proof of the 2-2 Games Conjecture (Khot, Minzer, Safra FOCS 2018). We show these results lead to a tight variance-based characterization of edge-expansion on eposets, generalizing the result of Bafna, Hopkins, Kaufman, and Lovett (SODA 2022), and pay special attention to the case of the Grassmann, where we show our results are tight for a natural set of sparsifications of the Grassmann graphs. We note for clarity that our results do not recover the characterization used in the proof of the 2-2 Games Conjecture, which relies on $\ell_\infty$ rather than $\ell_2$-structure.
Submitted 12 March, 2023; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Polarization in Geometric Opinion Dynamics
Authors:
Jason Gaitonde,
Jon Kleinberg,
Éva Tardos
Abstract:
In light of increasing recent attention to political polarization, understanding how polarization can arise poses an important theoretical question. While more classical models of opinion dynamics seem poorly equipped to study this phenomenon, a recent novel approach by Hązła, Jin, Mossel, and Ramnarayan (HJMR) proposes a simple geometric model of opinion evolution that provably exhibits strong polarization in specialized cases. Moreover, polarization arises quite organically in their model: in each time step, each agent updates opinions according to their correlation/response with an issue drawn at random. However, their techniques do not seem to extend beyond a set of special cases they identify, which benefit from fragile symmetry or contractiveness assumptions, leaving open how general this phenomenon really is.
In this paper, we further the study of polarization in related geometric models. We show that the exact form of polarization in such models is quite nuanced: even when strong polarization does not hold, it is possible for weaker notions of polarization to nonetheless hold. We provide a concrete example where weak polarization holds, but strong polarization provably fails. However, we show that strong polarization provably holds in many variants of the HJMR model, which are also robust to a wider array of distributions of random issues -- this indicates that the form of polarization introduced by HJMR is more universal than suggested by their special cases. We also show that the weaker notions connect more readily to the theory of Markov chains on general state spaces.
Submitted 23 June, 2021;
originally announced June 2021.
-
Virtues of Patience in Strategic Queuing Systems
Authors:
Jason Gaitonde,
Eva Tardos
Abstract:
We consider the problem of selfish agents in discrete-time queuing systems, where competitive queues try to get their packets served. In this model, a queue gets to send a packet each step to one of the servers, which will attempt to serve the oldest arriving packet, and unprocessed packets are returned to each queue. We model this as a repeated game where queues compete for the capacity of the servers, but where the state of the game evolves as the length of each queue varies, resulting in a highly dependent random process. Earlier work by the authors [EC'20] shows that with no-regret learners, the system needs twice the capacity that would be required in the coordinated setting to ensure queue lengths remain stable despite the selfish behavior of the queues. In this paper, we demonstrate that this way of evaluating outcomes is myopic: if more patient queues choose strategies that selfishly maximize their long-run success rate, stability can be ensured with just $\frac{e}{e-1}\approx 1.58$ times extra capacity, better than what is possible assuming the no-regret property.
As these systems induce highly dependent processes, our analysis draws heavily on techniques from probability theory. Though these systems are random under any fixed policies by the queues, we show that, surprisingly, these systems have deterministic and explicit asymptotic behavior. We show that the asymptotic growth rates of queues can be written as a ratio of a submodular and modular function, which provides significant game-theoretic properties. Our equilibrium analysis then relies on a novel deformation argument towards a more analyzable solution that differs significantly from previous price of anarchy results. While the intermediate points will not be equilibria, this analytic structure will ensure that this deformation is monotonic along this continuous path.
Submitted 19 November, 2020;
originally announced November 2020.
-
Fractional Pseudorandom Generators from Any Fourier Level
Authors:
Eshan Chattopadhyay,
Jason Gaitonde,
Chin Ho Lee,
Shachar Lovett,
Abhishek Shetty
Abstract:
We prove new results on the polarizing random walk framework introduced in recent works of Chattopadhyay et al. [CHHL19,CHLT19] that exploit $L_1$ Fourier tail bounds for classes of Boolean functions to construct pseudorandom generators (PRGs). We show that given a bound on the $k$-th level of the Fourier spectrum, one can construct a PRG with a seed length whose quality scales with $k$. This interpolates previous works, which either require Fourier bounds on all levels [CHHL19], or have polynomial dependence on the error parameter in the seed length [CHLT19], and thus answers an open question in [CHLT19]. As an example, we show that for polynomial error, Fourier bounds on the first $O(\log n)$ levels are sufficient to recover the seed length in [CHHL19], which requires bounds on the entire tail.
We obtain our results by an alternate analysis of fractional PRGs using Taylor's theorem and bounding the degree-$k$ Lagrange remainder term using multilinearity and random restrictions. Interestingly, our analysis relies only on the level-$k$ unsigned Fourier sum, which is potentially a much smaller quantity than the $L_1$ notion in previous works. By generalizing a connection established in [CHH+20], we give a new reduction from constructing PRGs to proving correlation bounds. Finally, using these improvements we show how to obtain a PRG for $\mathbb{F}_2$ polynomials with seed length close to the state-of-the-art construction due to Viola [Vio09], which was not known to be possible using this framework.
Submitted 7 November, 2020; v1 submitted 4 August, 2020;
originally announced August 2020.
-
Adversarial Perturbations of Opinion Dynamics in Networks
Authors:
Jason Gaitonde,
Jon Kleinberg,
Eva Tardos
Abstract:
We study the connections between network structure, opinion dynamics, and an adversary's power to artificially induce disagreements. We approach these questions by extending models of opinion formation in the social sciences to represent scenarios, familiar from recent events, in which external actors seek to destabilize communities through sophisticated information warfare tactics via fake news and bots. In many instances, the intrinsic goals of these efforts are not necessarily to shift the overall sentiment of the network, but rather to induce discord. These perturbations diffuse via opinion dynamics on the underlying network, through mechanisms that have been analyzed and abstracted through work in computer science and the social sciences. We investigate the properties of such attacks, considering optimal strategies both for the adversary seeking to create disagreement and for the entities tasked with defending the network from attack. We show that for different formulations of these types of objectives, different regimes of the spectral structure of the network will limit the adversary's capacity to sow discord; this enables us to qualitatively describe which networks are most vulnerable to these perturbations. We then consider the algorithmic task of a network defender to mitigate these sorts of adversarial attacks by insulating nodes heterogeneously; we show that, by considering the geometry of this problem, this optimization task can be efficiently solved via convex programming. Finally, we generalize these results to allow for two network structures, where the opinion dynamics process and the measurement of disagreement become uncoupled, and determine how the adversary's power changes; for instance, this may arise when opinion dynamics are controlled by an online community via social media, while disagreement is measured along "real-world" connections.
Submitted 13 July, 2020; v1 submitted 16 March, 2020;
originally announced March 2020.
-
Stability and Learning in Strategic Queuing Systems
Authors:
Jason Gaitonde,
Eva Tardos
Abstract:
Bounding the price of anarchy, which quantifies the damage to social welfare due to selfish behavior of the participants, has been an important area of research. In this paper, we study this phenomenon in the context of a game modeling queuing systems: routers compete for servers, where packets that do not get service will be resent at future rounds, resulting in a system where the number of packets at each round depends on the success of the routers in the previous rounds. We model this as an (infinitely) repeated game, where the system holds a state (number of packets held by each queue) that arises from the results of the previous round. We assume that routers satisfy the no-regret condition, e.g. they use learning strategies to identify the server where their packets get the best service.
Classical work on repeated games makes the strong assumption that the subsequent rounds of the repeated games are independent (beyond the influence on learning from past history). The carryover effect caused by packets remaining in this system makes learning in our context result in a highly dependent random process. We analyze this random process and find that if the capacity of the servers is high enough to allow a centralized and knowledgeable scheduler to get all packets served even with double the packet arrival rate, and queues use no-regret learning algorithms, then the expected number of packets in the queues will remain bounded throughout time, assuming older packets have priority. This paper is the first to study the effect of selfish learning in a queuing system, where the learners compete for resources, but rounds are not all independent: the number of packets to be routed at each round depends on the success of the routers in the previous rounds.
Submitted 15 March, 2020;
originally announced March 2020.