-
Unweighted Layered Graph Traversal: Passing a Crown via Entropy Maximization
Authors:
Xingjian Bai,
Christian Coester,
Romain Cosson
Abstract:
Introduced by Papadimitriou and Yannakakis in 1989, layered graph traversal is a central problem in online algorithms and mobile computing that has been studied for several decades, and which now is essentially resolved in its original formulation. In this paper, we demonstrate that what appears to be an innocuous modification of the problem actually leads to a drastic (exponential) reduction of the competitive ratio. Specifically, we present an algorithm that is $O(\log^2 w)$-competitive for traversing unweighted layered graphs of width $w$. Our algorithm chooses the agent's position simply according to the probability distribution over the current layer that maximizes the sum of entropies of the induced distributions in the preceding layers.
Submitted 20 October, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Barely Random Algorithms and Collective Metrical Task Systems
Authors:
Romain Cosson,
Laurent Massoulié
Abstract:
We consider metrical task systems on general metric spaces with $n$ points, and show that any fully randomized algorithm can be turned into a randomized algorithm that uses only $2\log n$ random bits, and achieves the same competitive ratio up to a factor $2$. This provides the first order-optimal barely random algorithms for metrical task systems, i.e., which use a number of random bits that does not depend on the number of requests addressed to the system. We discuss implications on various aspects of online decision-making such as: distributed systems, advice complexity, and transaction costs, suggesting broad applicability. We put forward an equivalent view that we call collective metrical task systems where $k$ agents in a metrical task system team up, and suffer the average cost paid by each agent. Our results imply that such a team can be $O(\log^2 n)$-competitive as soon as $k\geq n^2$. In comparison, a single agent is always $Ω(n)$-competitive.
Submitted 7 November, 2024; v1 submitted 17 March, 2024;
originally announced March 2024.
-
Ariadne and Theseus: Exploration and Rendezvous with Two Mobile Agents in an Unknown Graph
Authors:
Romain Cosson
Abstract:
We investigate two fundamental problems in mobile computing: exploration and rendezvous, with two distinct mobile agents in an unknown graph. The agents may communicate by reading and writing information on whiteboards that are located at all nodes. They both move along one adjacent edge at every time-step. In the exploration problem, the agents start from the same arbitrary node and must traverse all the edges. We present an algorithm achieving collective exploration in $m$ time-steps, where $m$ is the number of edges of the graph. This improves over the guarantee of depth-first search, which requires $2m$ time-steps. In the rendezvous problem, the agents start from different nodes of the graph and must meet as fast as possible. We present an algorithm guaranteeing rendezvous in at most $\frac{3}{2}m$ time-steps. This improves over the so-called `wait for Mommy' algorithm which is based on depth-first search and which also requires $2m$ time-steps. Importantly, all our guarantees are derived from a more general asynchronous setting in which the speeds of the agents are controlled by an adversary at all times. Our guarantees generalize to weighted graphs, when replacing the number of edges $m$ with the sum of all edge lengths. We show that our guarantees are met with matching lower-bounds in the asynchronous setting.
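The $2m$ baseline quoted above can be illustrated with a minimal sketch of single-agent depth-first exploration, in which every edge is walked once forward and once backward. This is only the baseline the abstract compares against, not the paper's two-agent algorithm; the function name `dfs_exploration_steps` and the adjacency-list input are assumptions made for illustration.

```python
def dfs_exploration_steps(adj, start):
    """Simulate one agent exploring a connected undirected graph by DFS.

    adj maps each node to a list of neighbours (hypothetical input format).
    Returns (number of moves made, number of distinct edges traversed).
    """
    visited_nodes = {start}
    explored_edges = set()
    steps = 0

    def visit(u):
        nonlocal steps
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge in explored_edges:
                continue
            explored_edges.add(edge)
            steps += 1              # move along the edge from u to v
            if v not in visited_nodes:
                visited_nodes.add(v)
                visit(v)            # keep exploring from the new node
            steps += 1              # walk back from v to u

    visit(start)
    return steps, len(explored_edges)


# Example graph (an assumption): a triangle 0-1-2 with a pendant node 3.
example_graph = {0: [1, 2], 1: [0, 2], 2: [1, 0, 3], 3: [2]}
steps, m = dfs_exploration_steps(example_graph, 0)
```

On a connected simple graph this walk uses exactly two moves per edge, which is the $2m$ figure that the $m$ and $\frac{3}{2}m$ guarantees improve upon.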
Submitted 4 July, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Collective Tree Exploration via Potential Function Method
Authors:
Romain Cosson,
Laurent Massoulié
Abstract:
We study the problem of collective tree exploration (CTE) where a team of $k$ agents is tasked to traverse all the edges of an unknown tree as fast as possible, assuming complete communication between the agents. In this paper, we present an algorithm performing collective tree exploration in only $2n/k+O(kD)$ rounds, where $n$ is the number of nodes in the tree, and $D$ is the tree depth. This leads to a competitive ratio of $O(\sqrt{k})$ for collective tree exploration, the first polynomial improvement over the initial $O(k/\log(k))$ ratio of [FGKP06]. Our analysis relies on a game with robots at the leaves of a continuously growing tree, which is presented in a similar manner as the `evolving tree game' of [BCR22], though its analysis and applications differ significantly. This game extends the `tree-mining game' (TM) of [Cos23] and leads to guarantees for an asynchronous extension of collective tree exploration (ACTE). Another surprising consequence of our results is the existence of algorithms $\{A_k\}_{k\in \mathbb{N}}$ for layered tree traversal (LTT) with cost at most $2L/k+O(kD)$, where $L$ is the sum of edge lengths and $D$ is the tree depth. For the case of layered trees of width $w$ and unit edge lengths, our guarantee is thus in $O(\sqrt{w}D)$.
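One standard way to see how a $2n/k+O(kD)$ runtime yields an $O(\sqrt{k})$ competitive ratio is to run the algorithm with only $k'=\lceil\sqrt{k}\rceil$ of the $k$ agents. The computation below is a sketch under that assumption (rounding is ignored, and the paper's own argument may differ); it compares against the offline cost $\max\{2n/k,2D\}$ stated in the related abstracts.

```latex
\frac{2n}{k'} + O(k' D)
  \;=\; \frac{2n}{\sqrt{k}} + O(\sqrt{k}\, D)
  \;=\; \sqrt{k}\,\Bigl(\frac{2n}{k} + O(D)\Bigr)
  \;=\; O(\sqrt{k})\cdot \max\Bigl\{\frac{2n}{k},\, 2D\Bigr\}.
```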
Submitted 2 November, 2023;
originally announced November 2023.
-
Breaking the k/log k Barrier in Collective Tree Exploration via Tree-Mining
Authors:
Romain Cosson
Abstract:
In collective tree exploration, a team of $k$ mobile agents is tasked to go through all edges of an unknown tree as fast as possible. An edge of the tree is revealed to the team when one agent becomes adjacent to that edge. The agents start from the root and all move synchronously along one adjacent edge in each round. Communication between the agents is unrestricted, and they are, therefore, centrally controlled by a single exploration algorithm. The algorithm's guarantee is typically compared to the number of rounds required by the agents to go through all edges if they had known the tree in advance. This quantity is at least $\max\{2n/k,2D\}$ where $n$ is the number of nodes and $D$ is the tree depth. Since the introduction of the problem by [FGKP04], two types of guarantees have emerged: the first takes the form $r(k)(n/k+D)$, where $r(k)$ is called the competitive ratio, and the other takes the form $2n/k+f(k,D)$, where $f(k,D)$ is called the competitive overhead. In this paper, we present the first algorithm with linear-in-$D$ competitive overhead, thereby reconciling both approaches. Specifically, our bound is in $2n/k + O(k^{\log_2(k)-1} D)$ and leads to a competitive ratio in $O(k/\exp(\sqrt{\ln 2\ln k}))$. This is the first improvement over $O(k/\ln k)$ since the introduction of the problem, twenty years ago. Our algorithm is developed for an asynchronous generalization of collective tree exploration (ACTE). It belongs to a broad class of locally-greedy exploration algorithms that we define. We show that the analysis of locally-greedy algorithms can be seen through the lens of a 2-player game that we call the tree-mining game and which could be of independent interest.
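The step from the overhead bound to the stated competitive ratio can be checked by hand. The sketch below assumes the algorithm is run with only $k' \le k$ agents, where $k'$ is chosen so that $k'^{\log_2 k'} = k$ (rounding is ignored, and the paper's own derivation may differ).

```latex
k'^{\log_2 k'} = k
  \;\Longleftrightarrow\; \frac{(\ln k')^2}{\ln 2} = \ln k
  \;\Longleftrightarrow\; k' = \exp\bigl(\sqrt{\ln 2 \,\ln k}\bigr),
\qquad\text{so}\qquad
\frac{2n}{k'} + O\bigl(k'^{\log_2 k' - 1} D\bigr)
  \;=\; \frac{k}{k'}\Bigl(\frac{2n}{k} + O(D)\Bigr)
  \;=\; O\Bigl(\frac{k}{\exp(\sqrt{\ln 2\,\ln k})}\Bigr)\Bigl(\frac{n}{k} + D\Bigr).
```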
Submitted 30 October, 2023; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Breadth-First Depth-Next: Optimal Collaborative Exploration of Trees with Low Diameter
Authors:
Romain Cosson,
Laurent Massoulié,
Laurent Viennot
Abstract:
We consider the problem of collaborative tree exploration posed by Fraigniaud, Gasieniec, Kowalski, and Pelc where a team of $k$ agents is tasked to collectively go through all the edges of an unknown tree as fast as possible. Denoting by $n$ the total number of nodes and by $D$ the tree depth, the $\mathcal{O}(n/\log(k)+D)$ algorithm of Fraigniaud et al. achieves the best-known competitive ratio with respect to the cost of offline exploration which is $Θ(\max{\{2n/k,2D\}})$. Brass, Cabrera-Mora, Gasparri, and Xiao consider an alternative performance criterion, namely the additive overhead with respect to $2n/k$, and obtain a $2n/k+\mathcal{O}((D+k)^k)$ runtime guarantee. In this paper, we introduce `Breadth-First Depth-Next' (BFDN), a novel and simple algorithm that performs collaborative tree exploration in time $2n/k+\mathcal{O}(D^2\log(k))$, thus outperforming Brass et al. for all values of $(n,D)$ and being order-optimal for all trees with depth $D=o_k(\sqrt{n})$. Moreover, a recent result from Disser et al. implies that no exploration algorithm can achieve a $2n/k+\mathcal{O}(D^{2-ε})$ runtime guarantee. The dependency in $D^2$ of our bound is in this sense optimal. The proof of our result crucially relies on the analysis of an associated two-player game. We extend the guarantees of BFDN to: scenarios with limited memory and communication, adversarial setups where robots can be blocked, and exploration of classes of non-tree graphs. Finally, we provide a recursive version of BFDN with a runtime of $\mathcal{O}_\ell(n/k^{1/\ell}+\log(k) D^{1+1/\ell})$ for parameter $\ell\ge 1$, thereby improving performance for trees with large depth.
Submitted 30 January, 2023;
originally announced January 2023.
-
Gradient Descent for Low-Rank Functions
Authors:
Romain Cosson,
Ali Jadbabaie,
Anuran Makur,
Amirhossein Reisizadeh,
Devavrat Shah
Abstract:
Several recent empirical studies demonstrate that important machine learning tasks, e.g., training deep neural networks, exhibit low-rank structure, where the loss function varies significantly in only a few directions of the input space. In this paper, we leverage such low-rank structure to reduce the high computational cost of canonical gradient-based methods such as gradient descent (GD). Our proposed Low-Rank Gradient Descent (LRGD) algorithm finds an $ε$-approximate stationary point of a $p$-dimensional function by first identifying $r \leq p$ significant directions, and then estimating the true $p$-dimensional gradient at every iteration by computing directional derivatives only along those $r$ directions. We establish that the "directional oracle complexities" of LRGD for strongly convex and non-convex objective functions are $\mathcal{O}(r \log(1/ε) + rp)$ and $\mathcal{O}(r/ε^2 + rp)$, respectively. When $r \ll p$, these complexities are smaller than the known complexities of $\mathcal{O}(p \log(1/ε))$ and $\mathcal{O}(p/ε^2)$ of GD in the strongly convex and non-convex settings, respectively. Thus, LRGD significantly reduces the computational cost of gradient-based methods for sufficiently low-rank functions. In the course of our analysis, we also formally define and characterize the classes of exact and approximately low-rank functions.
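The two phases described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation: it assumes the significant directions can be recovered by orthonormalizing full gradients taken at $r$ probe points (costing $rp$ directional derivatives), and all function names, the finite-difference oracle, and the test function `f` are hypothetical.

```python
import math


def directional_derivative(f, x, u, h=1e-6):
    """Central finite-difference estimate of the derivative of f at x along u."""
    x_plus = [xi + h * ui for xi, ui in zip(x, u)]
    x_minus = [xi - h * ui for xi, ui in zip(x, u)]
    return (f(x_plus) - f(x_minus)) / (2 * h)


def full_gradient(f, x):
    """Full gradient via p directional derivatives along the coordinate axes."""
    p = len(x)
    grad = []
    for i in range(p):
        axis = [0.0] * p
        axis[i] = 1.0
        grad.append(directional_derivative(f, x, axis))
    return grad


def orthonormalize(vectors, tol=1e-8):
    """Gram-Schmidt; vectors that are (numerically) dependent are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def low_rank_gd(f, x0, probe_points, step, iters):
    # Phase 1 (assumed heuristic): r full gradients -> r orthonormal directions.
    directions = orthonormalize([full_gradient(f, z) for z in probe_points])
    # Phase 2: each iteration uses only r directional derivatives.
    x = list(x0)
    for _ in range(iters):
        coeffs = [directional_derivative(f, x, u) for u in directions]
        grad = [sum(c * u[i] for c, u in zip(coeffs, directions))
                for i in range(len(x))]
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x, directions


def f(x):
    """Hypothetical rank-2 function of p = 4 variables."""
    a = x[0] + x[1] - 1.0
    b = x[2] - x[3] + 2.0
    return a * a + b * b


x_final, dirs = low_rank_gd(f, [0.0] * 4,
                            [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
                            step=0.1, iters=200)
```

Here `f` depends on its input only through two linear combinations, so two directions suffice and each iteration costs $r=2$ oracle calls instead of $p=4$.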
Submitted 16 June, 2022;
originally announced June 2022.
-
Universal Online Learning with Unbounded Losses: Memory Is All You Need
Authors:
Moïse Blanchard,
Romain Cosson,
Steve Hanneke
Abstract:
We resolve an open problem of Hanneke on the subject of universally consistent online learning with non-i.i.d. processes and unbounded losses. The notion of an optimistically universal learning rule was defined by Hanneke in an effort to study learning theory under minimal assumptions. A given learning rule is said to be optimistically universal if it achieves a low long-run average loss whenever the data generating process makes this goal achievable by some learning rule. Hanneke posed as an open problem whether, for every unbounded loss, the family of processes admitting universal learning are precisely those having a finite number of distinct values almost surely. In this paper, we completely resolve this problem, showing that this is indeed the case. As a consequence, this also offers a dramatically simpler formulation of an optimistically universal learning rule for any unbounded loss: namely, the simple memorization rule already suffices. Our proof relies on constructing random measurable partitions of the instance space and could be of independent interest for solving other open questions. We extend the results to the non-realizable setting thereby providing an optimistically universal Bayes consistent learning rule.
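A memorization rule of the kind mentioned above can be sketched as follows, assuming a noiseless (realizable) stream of `(instance, response)` pairs; the function name, the `default` prediction, and the example stream are illustrative assumptions rather than the paper's formal construction.

```python
def memorization_mistakes(stream, default=0):
    """Online memorization: predict the stored response for a seen instance,
    and fall back to `default` for an instance never seen before.

    Returns the number of wrong predictions made along the stream.
    """
    memory = {}
    mistakes = 0
    for x, y in stream:
        prediction = memory.get(x, default)
        if prediction != y:
            mistakes += 1
        memory[x] = y  # store the revealed response for future rounds
    return mistakes


# Hypothetical process taking only three distinct values, with response x*x + 1.
xs = [0, 1, 0, 2, 1, 2, 0, 1]
mistakes = memorization_mistakes([(x, x * x + 1) for x in xs])
```

When the process takes finitely many distinct values and responses are a fixed function of the instance, the rule errs at most once per distinct value, so its average loss vanishes over a long run.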
Submitted 21 January, 2022;
originally announced January 2022.
-
Universal Online Learning with Bounded Loss: Reduction to Binary Classification
Authors:
Moïse Blanchard,
Romain Cosson
Abstract:
We study universal consistency of non-i.i.d. processes in the context of online learning. A stochastic process is said to admit universal consistency if there exists a learner that achieves vanishing average loss for any measurable response function on this process. When the loss function is unbounded, Blanchard et al. showed that the only processes admitting strong universal consistency are those taking a finite number of values almost surely. However, when the loss function is bounded, the class of processes admitting strong universal consistency is much richer and its characterization could be dependent on the response setting (Hanneke). In this paper, we show that this class of processes is independent from the response setting thereby closing an open question (Hanneke, Open Problem 3). Specifically, we show that the class of processes that admit universal online learning is the same for binary classification as for multiclass classification with countable number of classes. Consequently, any output setting with bounded loss can be reduced to binary classification. Our reduction is constructive and practical. Indeed, we show that the nearest neighbor algorithm is transported by our construction. For binary classification on a process admitting strong universal learning, we prove that nearest neighbor successfully learns at least all finite unions of intervals.
Submitted 15 July, 2022; v1 submitted 29 December, 2021;
originally announced December 2021.
-
Quantifying Variational Approximation for the Log-Partition Function
Authors:
Romain Cosson,
Devavrat Shah
Abstract:
Variational approximations, such as mean-field (MF) and tree-reweighted (TRW), provide a computationally efficient approximation of the log-partition function for a generic graphical model. TRW provably provides an upper bound, but the approximation ratio is generally not quantified.
As the primary contribution of this work, we provide an approach to quantify the approximation ratio through the property of the underlying graph structure. Specifically, we argue that (a variant of) TRW produces an estimate that is within factor $\frac{1}{\sqrt{κ(G)}}$ of the true log-partition function for any discrete pairwise graphical model over graph $G$, where $κ(G) \in (0,1]$ captures how far $G$ is from tree structure with $κ(G) = 1$ for trees and $2/N$ for the complete graph over $N$ vertices. As a consequence, the approximation ratio is $1$ for trees, $\sqrt{(d+1)/2}$ for any graph with maximum average degree $d$, and $\stackrel{β\to\infty}{\approx} 1+1/(2β)$ for graphs with girth (shortest cycle) at least $β\log N$. In general, $κ(G)$ is the solution of a max-min problem associated with $G$ that can be evaluated in polynomial time for any graph.
Using samples from the uniform distribution over the spanning trees of $G$, we provide a near linear-time variant that achieves an approximation ratio equal to the inverse of the square root of the minimal (across edges) effective resistance of the graph. We connect our results to the graph partition-based approximation method and thus provide a unified perspective.
Keywords: variational inference, log-partition function, spanning tree polytope, minimum effective resistance, min-max spanning tree, local inference
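As a quick consistency check of the figures quoted above, the two extreme cases can be worked out directly from the stated values of $κ(G)$. The only added assumption is the elementary fact that the complete graph $K_N$ has maximum average degree $d = N-1$.

```latex
\text{Trees: } \kappa(G) = 1 \;\Rightarrow\; \frac{1}{\sqrt{\kappa(G)}} = 1,
\qquad
\text{Complete graph: } \kappa(K_N) = \frac{2}{N} \;\Rightarrow\;
\frac{1}{\sqrt{\kappa(K_N)}} = \sqrt{\frac{N}{2}} = \sqrt{\frac{(N-1)+1}{2}} = \sqrt{\frac{d+1}{2}}.
```

The complete graph therefore meets the general $\sqrt{(d+1)/2}$ bound with equality.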
Submitted 19 August, 2021; v1 submitted 19 February, 2021;
originally announced February 2021.