-
The Structural Complexity of Matrix-Vector Multiplication
Authors:
Emile Anand,
Jan van den Brand,
Rose McCarty
Abstract:
We consider the problem of preprocessing an $n\times n$ matrix M, and supporting queries that, for any vector v, returns the matrix-vector product Mv. This problem has been extensively studied in both theory and practice: on one side, practitioners have developed algorithms that are highly efficient in practice, whereas theoreticians have proven that the problem cannot be solved faster than naive…
▽ More
We consider the problem of preprocessing an $n\times n$ matrix M, and supporting queries that, for any vector v, returns the matrix-vector product Mv. This problem has been extensively studied in both theory and practice: on one side, practitioners have developed algorithms that are highly efficient in practice, whereas theoreticians have proven that the problem cannot be solved faster than naive multiplication in the worst-case. This lower bound holds even in the average-case, implying that existing average-case analyses cannot explain this gap between theory and practice. Therefore, we study the problem for structured matrices. We show that for $n\times n$ matrices of VC-dimension d, the matrix-vector multiplication problem can be solved with $\tilde{O}(n^2)$ preprocessing and $\tilde O(n^{2-1/d})$ query time. Given the low constant VC-dimensions observed in most real-world data, our results posit an explanation for why the problem can be solved so much faster in practice. Moreover, our bounds hold even if the matrix does not have a low VC-dimension, but is obtained by (possibly adversarially) corrupting at most a subquadratic number of entries of any unknown low VC-dimension matrix. Our results yield the first non-trivial upper bounds for many applications. In previous works, the online matrix-vector hypothesis (conjecturing that quadratic time is needed per query) was used to prove many conditional lower bounds, showing that it is impossible to compute and maintain high-accuracy estimates for shortest paths, Laplacian solvers, effective resistance, and triangle detection in graphs subject to node insertions and deletions in subquadratic time. Yet, via a reduction to our matrix-vector-multiplication result, we show we can maintain the aforementioned problems efficiently if the input is structured, providing the first subquadratic upper bounds in the high-accuracy regime.
△ Less
Submitted 6 March, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
-
Towards the Pseudorandomness of Expander Random Walks for Read-Once ACC0 circuits
Authors:
Emile Anand
Abstract:
Expander graphs are among the most useful combinatorial objects in theoretical computer science. A line of work studies random walks on expander graphs for their pseudorandomness against various classes of test functions, including symmetric functions, read-only branching programs, permutation branching programs, and $\mathrm{AC}^0$ circuits. The promising results of pseudorandomness of expander r…
▽ More
Expander graphs are among the most useful combinatorial objects in theoretical computer science. A line of work studies random walks on expander graphs for their pseudorandomness against various classes of test functions, including symmetric functions, read-only branching programs, permutation branching programs, and $\mathrm{AC}^0$ circuits. The promising results of pseudorandomness of expander random walks against $\mathrm{AC}^0$ circuits indicate a robustness of expander random walks beyond symmetric functions, motivating the question of whether expander random walks can fool more robust \emph{asymmetric} complexity classes, such as $\mathrm{ACC}^0$. In this work, we make progress towards this question by considering certain two-layered circuit compositions of $\mathrm{MOD}[k]$ gates, where we show that these family of circuits are fooled by expander random walks with total variation distance error $O(λ)$, where $λ$ is the second largest eigenvalue of the underlying expander graph. For $k\geq 3$, these circuits can be highly asymmetric with complicated Fourier characters. In this context, our work takes a step in the direction of fooling more complex asymmetric circuits. Separately, drawing from the learning-theory literature, we construct an explicit threshold circuit in the circuit family $\mathrm{TC}^0$, and show that it is \emph{not} fooled by expander random walk, providing an upper bound on the set of functions fooled by expander random walks.
△ Less
Submitted 22 January, 2025; v1 submitted 13 January, 2025;
originally announced January 2025.
-
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Authors:
Emile Anand,
Ishani Karmarkar,
Guannan Qu
Abstract:
Designing efficient algorithms for multi-agent reinforcement learning (MARL) is fundamentally challenging because the size of the joint state and action spaces grows exponentially in the number of agents. These difficulties are exacerbated when balancing sequential global decision-making with local agent interactions. In this work, we propose a new algorithm $\texttt{SUBSAMPLE-MFQ}$ (…
▽ More
Designing efficient algorithms for multi-agent reinforcement learning (MARL) is fundamentally challenging because the size of the joint state and action spaces grows exponentially in the number of agents. These difficulties are exacerbated when balancing sequential global decision-making with local agent interactions. In this work, we propose a new algorithm $\texttt{SUBSAMPLE-MFQ}$ ($\textbf{Subsample}$-$\textbf{M}$ean-$\textbf{F}$ield-$\textbf{Q}$-learning) and a decentralized randomized policy for a system with $n$ agents. For any $k\leq n$, our algorithm learns a policy for the system in time polynomial in $k$. We prove that this learned policy converges to the optimal policy on the order of $\tilde{O}(1/\sqrt{k})$ as the number of subsampled agents $k$ increases. In particular, this bound is independent of the number of agents $n$.
△ Less
Submitted 20 May, 2025; v1 submitted 30 November, 2024;
originally announced December 2024.
-
Peer-to-Peer Learning Dynamics of Wide Neural Networks
Authors:
Shreyas Chaudhari,
Srinivasa Pranav,
Emile Anand,
José M. F. Moura
Abstract:
Peer-to-peer learning is an increasingly popular framework that enables beyond-5G distributed edge devices to collaboratively train deep neural networks in a privacy-preserving manner without the aid of a central server. Neural network training algorithms for emerging environments, e.g., smart cities, have many design considerations that are difficult to tune in deployment settings -- such as neur…
▽ More
Peer-to-peer learning is an increasingly popular framework that enables beyond-5G distributed edge devices to collaboratively train deep neural networks in a privacy-preserving manner without the aid of a central server. Neural network training algorithms for emerging environments, e.g., smart cities, have many design considerations that are difficult to tune in deployment settings -- such as neural network architectures and hyperparameters. This presents a critical need for characterizing the training dynamics of distributed optimization algorithms used to train highly nonconvex neural networks in peer-to-peer learning environments. In this work, we provide an explicit characterization of the learning dynamics of wide neural networks trained using popular distributed gradient descent (DGD) algorithms. Our results leverage both recent advancements in neural tangent kernel (NTK) theory and extensive previous work on distributed learning and consensus. We validate our analytical results by accurately predicting the parameter and error dynamics of wide neural networks trained for classification tasks.
△ Less
Submitted 21 April, 2025; v1 submitted 23 September, 2024;
originally announced September 2024.
-
Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale
Authors:
Emile Anand,
Guannan Qu
Abstract:
We study reinforcement learning for global decision-making in the presence of local agents, where the global decision-maker makes decisions affecting all local agents, and the objective is to learn a policy that maximizes the joint rewards of all the agents. Such problems find many applications, e.g. demand response, EV charging, queueing, etc. In this setting, scalability has been a long-standing…
▽ More
We study reinforcement learning for global decision-making in the presence of local agents, where the global decision-maker makes decisions affecting all local agents, and the objective is to learn a policy that maximizes the joint rewards of all the agents. Such problems find many applications, e.g. demand response, EV charging, queueing, etc. In this setting, scalability has been a long-standing challenge due to the size of the state space which can be exponential in the number of agents. This work proposes the \texttt{SUBSAMPLE-Q} algorithm where the global agent subsamples $k\leq n$ local agents to compute a policy in time that is polynomial in $k$. We show that this learned policy converges to the optimal policy in the order of $\tilde{O}(1/\sqrt{k}+ε_{k,m})$ as the number of sub-sampled agents $k$ increases, where $ε_{k,m}$ is the Bellman noise. Finally, we validate the theory through numerical simulations in a demand-response setting and a queueing setting.
△ Less
Submitted 22 October, 2024; v1 submitted 29 February, 2024;
originally announced March 2024.
-
The Bit Complexity of Dynamic Algebraic Formulas and their Determinants
Authors:
Emile Anand,
Jan van den Brand,
Mehrdad Ghadiri,
Daniel Zhang
Abstract:
Many iterative algorithms in optimization, computational geometry, computer algebra, and other areas of computer science require repeated computation of some algebraic expression whose input changes slightly from one iteration to the next. Although efficient data structures have been proposed for maintaining the solution of such algebraic expressions under low-rank updates, most of these results a…
▽ More
Many iterative algorithms in optimization, computational geometry, computer algebra, and other areas of computer science require repeated computation of some algebraic expression whose input changes slightly from one iteration to the next. Although efficient data structures have been proposed for maintaining the solution of such algebraic expressions under low-rank updates, most of these results are only analyzed under exact arithmetic (real-RAM model and finite fields) which may not accurately reflect the complexity guarantees of real computers. In this paper, we analyze the stability and bit complexity of such data structures for expressions that involve the inversion, multiplication, addition, and subtraction of matrices under the word-RAM model. We show that the bit complexity only increases linearly in the number of matrix operations in the expression. In addition, we consider the bit complexity of maintaining the determinant of a matrix expression. We show that the required bit complexity depends on the logarithm of the condition number of matrices instead of the logarithm of their determinant. We also discuss rank maintenance and its connections to determinant maintenance. Our results have wide applications ranging from computational geometry (e.g., computing the volume of a polytope) to optimization (e.g., solving linear programs using the simplex algorithm).
△ Less
Submitted 22 January, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
Pseudorandomness of the Sticky Random Walk
Authors:
Emile Anand,
Chris Umans
Abstract:
We extend the pseudorandomness of random walks on expander graphs using the sticky random walk. Building on prior works, it was recently shown that expander random walks can fool all symmetric functions in total variation distance (TVD) upto an $O(λ(\frac{p}{\min f})^{O(p)})$ error, where $λ$ is the second largest eigenvalue of the expander, $p$ is the size of the arbitrary alphabet used to label…
▽ More
We extend the pseudorandomness of random walks on expander graphs using the sticky random walk. Building on prior works, it was recently shown that expander random walks can fool all symmetric functions in total variation distance (TVD) upto an $O(λ(\frac{p}{\min f})^{O(p)})$ error, where $λ$ is the second largest eigenvalue of the expander, $p$ is the size of the arbitrary alphabet used to label the vertices, and $\min f = \min_{b\in[p]} f_b$, where $f_b$ is the fraction of vertices labeled $b$ in the graph. Golowich and Vadhan conjecture that the dependency on the $(\frac{p}{\min f})^{O(p)}$ term is not tight. In this paper, we resolve the conjecture in the affirmative for a family of expanders. We present a generalization of the sticky random walk for which Golowich and Vadhan predict a TVD upper bound of $O(λp^{O(p)})$ using a Fourier-analytic approach. For this family of graphs, we use a combinatorial approach involving the Krawtchouk functions to derive a strengthened TVD of $O(λ)$. Furthermore, we present equivalencies between the generalized sticky random walk, and, using linear-algebraic techniques, show that the generalized sticky random walk parameterizes an infinite family of expander graphs.
△ Less
Submitted 24 April, 2025; v1 submitted 18 July, 2023;
originally announced July 2023.
-
Identifying Chemicals Through Dimensionality Reduction
Authors:
Emile Anand,
Charles Steinhardt,
Martin Hansen
Abstract:
Civilizations have tried to make drinking water safe to consume for thousands of years. The process of determining water contaminants has evolved with the complexity of the contaminants due to pesticides and heavy metals. The routine procedure to determine water safety is to use targeted analysis which searches for specific substances from some known list; however, we do not explicitly know which…
▽ More
Civilizations have tried to make drinking water safe to consume for thousands of years. The process of determining water contaminants has evolved with the complexity of the contaminants due to pesticides and heavy metals. The routine procedure to determine water safety is to use targeted analysis which searches for specific substances from some known list; however, we do not explicitly know which substances should be on this list. Before experimentally determining which substances are contaminants, how do we answer the sampling problem of identifying all the substances in the water? Here, we present an approach that builds on the work of Jaanus Liigand et al., which used non-targeted analysis that conducts a broader search on the sample to develop a random-forest regression model, to predict the names of all the substances in a sample, as well as their respective concentrations[1]. This work utilizes techniques from dimensionality reduction and linear decompositions to present a more accurate model using data from the European Massbank Metabolome Library to produce a global list of chemicals that researchers can then identify and test for when purifying water.
△ Less
Submitted 24 April, 2025; v1 submitted 26 November, 2022;
originally announced November 2022.