Showing 1–19 of 19 results for author: Sonthalia, R

  1. arXiv:2410.16295

    physics.comp-ph cond-mat.dis-nn cond-mat.stat-mech cs.LG stat.ML

    Identification of Mean-Field Dynamics using Transformers

    Authors: Shiba Biswal, Karthik Elamvazhuthi, Rishi Sonthalia

    Abstract: This paper investigates the use of transformer architectures to approximate the mean-field dynamics of interacting particle systems exhibiting collective behavior. Such systems are fundamental in modeling phenomena across physics, biology, and engineering, including gas dynamics, opinion formation, biological networks, and swarm robotics. The key characteristic of these systems is that the particl…

    Submitted 6 October, 2024; originally announced October 2024.

  2. arXiv:2410.13991

    math.ST cs.LG stat.ML

    Generalization for Least Squares Regression With Simple Spiked Covariances

    Authors: Jiping Li, Rishi Sonthalia

    Abstract: Random matrix theory has proven to be a valuable tool in analyzing the generalization of linear models. However, the generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood. To understand the generalization performance of such networks, it is crucial to characterize the spectrum of the feature matrix at the hidden layer. Recent work has made…

    Submitted 17 October, 2024; originally announced October 2024.

  3. arXiv:2406.04425

    cs.LG math.OC math.ST stat.ML

    On Regularization via Early Stopping for Least Squares Regression

    Authors: Rishi Sonthalia, Jackie Lok, Elizaveta Rebrova

    Abstract: A fundamental problem in machine learning is understanding the effect of early stopping on the parameters obtained and the generalization capabilities of the model. Even for linear models, the effect is not fully understood for arbitrary learning rates and data. In this paper, we analyze the dynamics of discrete full batch gradient descent for linear regression. With minimal assumptions, we charac…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2406.03696

    stat.ML cs.LG math.OC

    Error dynamics of mini-batch gradient descent with random reshuffling for least squares regression

    Authors: Jackie Lok, Rishi Sonthalia, Elizaveta Rebrova

    Abstract: We study the discrete dynamics of mini-batch gradient descent with random reshuffling for least squares regression. We show that the training and generalization errors depend on a sample cross-covariance matrix $Z$ between the original features $X$ and a set of new features $\widetilde{X}$ in which each feature is modified by the mini-batches that appear before it during the learning process in an…

    Submitted 3 February, 2025; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: 33 pages. Accepted at ALT 2025

  5. arXiv:2403.07264

    stat.ML cs.LG

    Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization

    Authors: Yutong Wang, Rishi Sonthalia, Wei Hu

    Abstract: We study the generalization capability of nearly-interpolating linear regressors: $\boldsymbol{\beta}$'s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix $\boldsymbol{\Sigma}$, we demonstrate that any near-interpolator exhibits rapid norm growth: for $\tau$ fix…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024

  6. arXiv:2310.00729

    cs.LG math.AP math.NA stat.ML

    Spectral Neural Networks: Approximation Theory and Optimization Landscape

    Authors: Chenghui Li, Rishi Sonthalia, Nicolas Garcia Trillos

    Abstract: There is a large variety of machine learning methodologies that are based on the extraction of spectral geometric information from data. However, the implementations of many of these methods often depend on traditional eigensolvers, which present limitations when applied in practical online big data scenarios. To address some of these challenges, researchers have proposed different strategies for…

    Submitted 1 October, 2023; originally announced October 2023.

  7. arXiv:2305.17297

    cs.LG math.ST stat.ML

    Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers

    Authors: Chinmaya Kausik, Kashvi Srivastava, Rishi Sonthalia

    Abstract: Despite the importance of denoising in modern machine learning and ample empirical work on supervised denoising, its theoretical understanding is still relatively scarce. One concern about studying supervised denoising is that one might not always have noiseless training data from the test distribution. It is more reasonable to have access to noiseless training data from a different dataset than t…

    Submitted 14 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Complete overhaul of presentation, many new results

  8. arXiv:2305.14689

    stat.ML cs.LG math.ST

    Least Squares Regression Can Exhibit Under-Parameterized Double Descent

    Authors: Xinyue Li, Rishi Sonthalia

    Abstract: The relationship between the number of training data points, the number of parameters, and the generalization capabilities of models has been widely studied. Previous work has shown that double descent can occur in the over-parameterized regime and that the standard bias-variance trade-off holds in the under-parameterized regime. These works provide multiple reasons for the existence of the peak.…

    Submitted 24 October, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  9. arXiv:2305.14632

    math.CO cs.CC cs.DM cs.LG math.OC

    Supermodular Rank: Set Function Decomposition and Optimization

    Authors: Rishi Sonthalia, Anna Seigal, Guido Montufar

    Abstract: We define the supermodular rank of a function on a lattice. This is the smallest number of terms needed to decompose it into a sum of supermodular functions. The supermodular summands are defined with respect to different partial orders. We characterize the maximum possible value of the supermodular rank and describe the functions with fixed supermodular rank. We analogously define the submodular…

    Submitted 23 May, 2023; originally announced May 2023.

  10. Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network

    Authors: Mario Krenn, Lorenzo Buffoni, Bruno Coutinho, Sagi Eppel, Jacob Gates Foster, Andrew Gritsevskiy, Harlin Lee, Yichao Lu, Joao P. Moutinho, Nima Sanjabi, Rishi Sonthalia, Ngoc Mai Tran, Francisco Valente, Yangxinyu Xie, Rose Yu, Michael Kopp

    Abstract: A tool that could suggest new personalized research directions and ideas by taking insights from the scientific literature could significantly accelerate the progress of science. A field that might benefit from such an approach is artificial intelligence (AI) research, where the number of scientific publications has been growing exponentially over the last years, making it challenging for human re…

    Submitted 23 September, 2022; originally announced October 2022.

    Comments: 13 pages, 7 figures. Comments welcome!

    Journal ref: Nature Machine Intelligence 5, 1326 (2023)

  11. ICLR 2022 Challenge for Computational Geometry and Topology: Design and Results

    Authors: Adele Myers, Saiteja Utpala, Shubham Talbar, Sophia Sanborn, Christian Shewmake, Claire Donnat, Johan Mathe, Umberto Lupo, Rishi Sonthalia, Xinyue Cui, Tom Szwagier, Arthur Pignet, Andri Bergsson, Soren Hauberg, Dmitriy Nielsen, Stefan Sommer, David Klindt, Erik Hermansen, Melvin Vaupel, Benjamin Dunn, Jeffrey Xiong, Noga Aharony, Itsik Pe'er, Felix Ambellan, Martin Hanik, et al. (3 additional authors not shown)

    Abstract: This paper presents the computational challenge on differential geometry and topology that was hosted within the ICLR 2022 workshop "Geometric and Topological Representation Learning". The competition asked participants to provide implementations of machine learning algorithms on manifolds that would respect the API of the open-source software Geomstats (manifold part) and Scikit-Learn (machine l…

    Submitted 26 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

  12. arXiv:2110.11430

    cs.CG cs.LG

    How can classical multidimensional scaling go wrong?

    Authors: Rishi Sonthalia, Gregory Van Buskirk, Benjamin Raichel, Anna C. Gilbert

    Abstract: Given a matrix $D$ describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method to do this. However, theoretical analysis of the robustness of the algorithm and an in-depth analysis of its performance on non-Euclidean metrics are lacking. In this paper, we deriv…

    Submitted 28 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021

  13. arXiv:2110.04932

    cs.SI cs.CL

    An Analysis of COVID-19 Knowledge Graph Construction and Applications

    Authors: Dominic Flocco, Bryce Palmer-Toy, Ruixiao Wang, Hongyu Zhu, Rishi Sonthalia, Junyuan Lin, Andrea L. Bertozzi, P. Jeffrey Brantingham

    Abstract: The construction and application of knowledge graphs have seen a rapid increase across many disciplines in recent years. Additionally, the problem of uncovering relationships between developments in the COVID-19 pandemic and social media behavior is of great interest to researchers hoping to curb the spread of the disease. In this paper we present a knowledge graph constructed from COVID-19 relate…

    Submitted 10 October, 2021; originally announced October 2021.

  14. arXiv:2012.03126

    cs.CG math.PR

    Dual Regularized Optimal Transport

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: In this paper, we present a new formulation of unbalanced optimal transport called Dual Regularized Optimal Transport (DROT). We argue that regularizing the dual formulation of optimal transport results in a version of unbalanced optimal transport that leads to sparse solutions and that gives us control over mass creation and destruction. We build intuition behind such control and present theoreti…

    Submitted 5 December, 2020; originally announced December 2020.

  15. arXiv:2005.03853

    cs.LG math.OC stat.ML

    Project and Forget: Solving Large-Scale Metric Constrained Problems

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: Given a set of dissimilarity measurements amongst data points, determining what metric representation is most "consistent" with the input measurements or the metric that best captures the relevant geometric features of the data is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or small problem sizes because of the large number of metric…

    Submitted 26 September, 2022; v1 submitted 8 May, 2020; originally announced May 2020.

  16. arXiv:2005.03847

    cs.LG math.MG stat.ML

    Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding

    Authors: Rishi Sonthalia, Anna C. Gilbert

    Abstract: Given data, finding a faithful low-dimensional hyperbolic embedding of the data is a key method by which we can extract hierarchical information or learn representative geometric features of the data. In this paper, we explore a new method for learning hyperbolic representations by taking a metric-first approach. Rather than determining the low-dimensional hyperbolic embedding directly, we learn a…

    Submitted 22 October, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: Code available at https://github.com/rsonthal/TreeRep

  17. arXiv:1908.08411

    cs.DS cs.CG

    Generalized Metric Repair on Graphs

    Authors: Chenglin Fan, Anna C. Gilbert, Benjamin Raichel, Rishi Sonthalia, Gregory Van Buskirk

    Abstract: Many modern data analysis algorithms either assume or are considerably more efficient if the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimension reduction. As real data sets are noisy, distances often fail to satisfy a metric. For this reason, Gilbert and Jain and Fan et al. introduced the closely related sparse metric repair and…

    Submitted 21 August, 2019; originally announced August 2019.

    Comments: arXiv admin note: text overlap with arXiv:1807.08078

  18. arXiv:1807.07619

    cs.DS

    Generalized Metric Repair on Graphs

    Authors: Anna C. Gilbert, Rishi Sonthalia

    Abstract: Many modern data analysis algorithms either assume that or are considerably more efficient if the distances between the data points satisfy a metric. These algorithms include metric learning, clustering, and dimensionality reduction. Because real data sets are noisy, the similarity measures often fail to satisfy a metric. For this reason, Gilbert and Jain [11] and Fan, et al. [8] introduce the clo…

    Submitted 19 July, 2018; originally announced July 2018.

  19. Unsupervised Metric Learning in Presence of Missing Data

    Authors: Anna C. Gilbert, Rishi Sonthalia

    Abstract: For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using standard dimension reduction algorithms such as ISOMAP, LAPLACIAN EIGENMAPS or LLES. This approach, however, does not always wo…

    Submitted 3 March, 2019; v1 submitted 19 July, 2018; originally announced July 2018.

    Journal ref: 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)