Showing 1–30 of 30 results for author: Steinerberger, S

  1. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  2. arXiv:2402.11758  [pdf, other]

    math.CO cs.DM math.OC math.SP

    Conformally rigid graphs

    Authors: Stefan Steinerberger, Rekha R. Thomas

    Abstract: Given a finite, simple, connected graph $G=(V,E)$ with $|V|=n$, we consider the associated graph Laplacian matrix $L = D - A$ with eigenvalues $0 = λ_1 < λ_2 \leq \dots \leq λ_n$. One can also consider the same graph equipped with positive edge weights $w:E \rightarrow \mathbb{R}_{> 0}$ normalized to $\sum_{e \in E} w_e = |E|$ and the associated weighted Laplacian matrix $L_w$. We say that $G$ is…

    Submitted 5 April, 2025; v1 submitted 18 February, 2024; originally announced February 2024.

  3. arXiv:2306.06204  [pdf, other]

    cs.DM math.CO math.OC

    Spectrahedral Geometry of Graph Sparsifiers

    Authors: Catherine Babecki, Stefan Steinerberger, Rekha R. Thomas

    Abstract: We propose an approach to graph sparsification based on the idea of preserving the smallest $k$ eigenvalues and eigenvectors of the Graph Laplacian. This is motivated by the fact that small eigenvalues and their associated eigenvectors tend to be more informative of the global structure and geometry of the graph than larger eigenvalues and their eigenvectors. The set of all weighted subgraphs of a…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 34 pages, 17 figures, 3 tables

  4. arXiv:2208.06676  [pdf, other]

    cs.LG

    May the force be with you

    Authors: Yulan Zhang, Anna C. Gilbert, Stefan Steinerberger

    Abstract: Modern methods in dimensionality reduction are dominated by nonlinear attraction-repulsion force-based methods (this includes t-SNE, UMAP, ForceAtlas2, LargeVis, and many more). The purpose of this paper is to demonstrate that all such methods, by design, come with an additional feature that is being automatically computed along the way, namely the vector field associated with these forces. We sho…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: 23 pages, 17 figures

  5. arXiv:2206.05346  [pdf, other]

    math.CO cs.DM math.OC

    Random Walks, Equidistribution and Graphical Designs

    Authors: Stefan Steinerberger, Rekha R. Thomas

    Abstract: Let $G=(V,E)$ be a $d$-regular graph on $n$ vertices and let $μ_0$ be a probability measure on $V$. The act of moving to a randomly chosen neighbor leads to a sequence of probability measures supported on $V$ given by $μ_{k+1} = A D^{-1} μ_k$, where $A$ is the adjacency matrix and $D$ is the diagonal matrix of vertex degrees of $G$. Ordering the eigenvalues of $A D^{-1}$ as…

    Submitted 10 June, 2022; originally announced June 2022.
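
    Note: the update $μ_{k+1} = A D^{-1} μ_k$ quoted in the abstract above can be sketched in a few lines. This is only an illustration, assuming the graph is stored as an adjacency-list dictionary; the function name `random_walk_step` and the triangle example are hypothetical and not taken from the paper.

```python
def random_walk_step(adj, mu):
    """Apply one step mu -> A D^{-1} mu of the simple random walk.

    adj: dict mapping each vertex to the list of its neighbors (assumed format).
    mu:  dict mapping each vertex to its probability mass.
    """
    new_mu = {v: 0.0 for v in adj}
    for u, mass in mu.items():
        # D^{-1}: split the mass at u evenly among its deg(u) neighbors.
        share = mass / len(adj[u])
        # A: send each share along an edge (u, v).
        for v in adj[u]:
            new_mu[v] += share
    return new_mu


# Illustrative 2-regular graph: a triangle on the vertices 0, 1, 2.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
mu0 = {0: 1.0, 1: 0.0, 2: 0.0}
mu1 = random_walk_step(triangle, mu0)
mu2 = random_walk_step(triangle, mu1)
```

    Starting from a point mass, each step keeps the total mass equal to 1; on this small non-bipartite example the measures approach the uniform distribution.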

  6. arXiv:2204.13278  [pdf, other]

    math.CO cs.DM

    Sums of Distances on Graphs and Embeddings into Euclidean Space

    Authors: Stefan Steinerberger

    Abstract: Let $G=(V,E)$ be a finite, connected graph. We consider a greedy selection of vertices: given a list of vertices $x_1, \dots, x_k$, take $x_{k+1}$ to be any vertex maximizing the sum of distances to the existing vertices and iterate: we keep adding the 'most remote' vertex. The frequency with which the vertices of the graph appear in this sequence converges to a set of probability measures with ni…

    Submitted 5 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.
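
    Note: the greedy selection described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the graph is unweighted and stored as an adjacency-list dictionary, distances are shortest-path distances computed by breadth-first search, and ties are broken by taking the smallest vertex label (the abstract allows any maximizer). The names `bfs_distances` and `greedy_remote_sequence` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_remote_sequence(adj, start, length):
    """Repeatedly append a vertex maximizing the sum of distances to all
    vertices chosen so far (counted with multiplicity)."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    sequence = [start]
    # total[v] = sum of distances from v to every vertex already in the sequence.
    total = {v: dist[start][v] for v in adj}
    while len(sequence) < length:
        # Assumed tie-break: max() returns the first maximizer in sorted order.
        best = max(sorted(adj), key=lambda v: total[v])
        sequence.append(best)
        for v in adj:
            total[v] += dist[best][v]
    return sequence


# Illustrative example: a path on five vertices 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
selected = greedy_remote_sequence(path, start=0, length=5)
```

    On this path the sequence alternates between the two endpoints, so the empirical frequencies concentrate on vertices 0 and 4.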

  7. A common variable minimax theorem for graphs

    Authors: Ronald R. Coifman, Nicholas F. Marshall, Stefan Steinerberger

    Abstract: Let $\mathcal{G} = \{G_1 = (V, E_1), \dots, G_m = (V, E_m)\}$ be a collection of $m$ graphs defined on a common set of vertices $V$ but with different edge sets $E_1, \dots, E_m$. Informally, a function $f :V \rightarrow \mathbb{R}$ is smooth with respect to $G_k = (V,E_k)$ if $f(u) \sim f(v)$ whenever $(u, v) \in E_k$. We study the problem of understanding whether there exists a nonconstant funct…

    Submitted 30 July, 2021; originally announced July 2021.

    Comments: 21 pages, 11 figures

  8. arXiv:2104.14404  [pdf, ps, other]

    cs.DS

    A 0.502$\cdot$MaxCut Approximation using Quadratic Programming

    Authors: Stefan Steinerberger

    Abstract: We study the MaxCut problem for graphs $G=(V,E)$. The problem is NP-hard; there are two main approximation algorithms with theoretical guarantees: (1) the Goemans & Williamson algorithm uses semi-definite programming to provide a 0.878$\cdot$MaxCut approximation (which, if the Unique Games Conjecture is true, is the best that can be done in polynomial time) and (2) Trevisan proposed an algorithm using s…

    Submitted 29 April, 2021; originally announced April 2021.

  9. arXiv:2102.13009  [pdf, other]

    cs.LG

    t-SNE, Forceful Colorings and Mean Field Limits

    Authors: Yulan Zhang, Stefan Steinerberger

    Abstract: t-SNE is one of the most commonly used force-based nonlinear dimensionality reduction methods. This paper has two contributions: the first is forceful colorings, an idea that is also applicable to other force-based methods (UMAP, ForceAtlas2,...). In every equilibrium, the attractive and repulsive forces acting on a particle cancel out: however, both the size and the direction of the attractive (o…

    Submitted 25 February, 2021; originally announced February 2021.

  10. arXiv:2102.04931  [pdf, other]

    math.OC cs.DS

    Max-Cut via Kuramoto-type Oscillators

    Authors: Stefan Steinerberger

    Abstract: We consider the Max-Cut problem. Let $G = (V,E)$ be a graph with adjacency matrix $(a_{ij})_{i,j=1}^{n}$. Burer, Monteiro & Zhang proposed to find, for $n$ angles $\left\{θ_1, θ_2, \dots, θ_n\right\} \subset [0, 2π]$, minima of the energy $$ f(θ_1, \dots, θ_n) = \sum_{i,j=1}^{n} a_{ij} \cos{(θ_i - θ_j)}$$ because configurations achieving a global minimum lead to a partition of size 0.878 Max-Cut(…

    Submitted 9 February, 2021; originally announced February 2021.

  11. arXiv:2012.08465  [pdf, other]

    cs.LG math.CA

    Neural Collapse with Cross-Entropy Loss

    Authors: Jianfeng Lu, Stefan Steinerberger

    Abstract: We consider the variational problem of cross-entropy loss with $n$ feature vectors on a unit hypersphere in $\mathbb{R}^d$. We prove that when $d \geq n - 1$, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that as $n \rightarrow \infty$ with fixed $d$, the minimizing points will distribute uniformly on the hypersphere…

    Submitted 18 January, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

  12. arXiv:2007.13288  [pdf, other]

    math.NA cs.LG math.OC stat.ML

    On the Regularization Effect of Stochastic Gradient Descent applied to Least Squares

    Authors: Stefan Steinerberger

    Abstract: We study the behavior of stochastic gradient descent applied to $\|Ax -b \|_2^2 \rightarrow \min$ for invertible $A \in \mathbb{R}^{n \times n}$. We show that there is an explicit constant $c_{A}$ depending (mildly) on $A$ such that…

    Submitted 1 September, 2020; v1 submitted 26 July, 2020; originally announced July 2020.

  13. arXiv:2004.01163  [pdf, other]

    math.CO cs.CG cs.DM math.SP

    A Spectral Approach to the Shortest Path Problem

    Authors: Stefan Steinerberger

    Abstract: Let $G=(V,E)$ be a simple, connected graph. One is often interested in a short path between two vertices $u,v$. We propose a spectral algorithm: construct the function $φ:V \rightarrow \mathbb{R}_{\geq 0}$ $$ φ = \arg\min_{f:V \rightarrow \mathbb{R} \atop f(u) = 0, f \not\equiv 0} \frac{\sum_{(w_1, w_2) \in E}{(f(w_1)-f(w_2))^2}}{\sum_{w \in V}{f(w)^2}}.$$ $φ$ can also be understood as the smallest…

    Submitted 16 April, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  14. arXiv:2003.09969  [pdf, other]

    math.PR cs.DM cs.LG math.SP stat.ML

    Spectral Clustering Revisited: Information Hidden in the Fiedler Vector

    Authors: Adela DePavia, Stefan Steinerberger

    Abstract: We are interested in the clustering problem on graphs: it is known that if there are two underlying clusters, then the signs of the eigenvector corresponding to the second largest eigenvalue of the adjacency matrix can reliably reconstruct the two clusters. We argue that the vertices for which the eigenvector has the largest and the smallest entries, respectively, are unusually strongly connected…

    Submitted 22 March, 2020; originally announced March 2020.

  15. arXiv:2003.07331  [pdf, other]

    math.ST cs.IT

    Randomly Aggregated Least Squares for Support Recovery

    Authors: Ofir Lindenbaum, Stefan Steinerberger

    Abstract: We study the problem of exact support recovery: given an (unknown) vector $θ \in \left\{-1,0,1\right\}^D$, we are given access to the noisy measurement $$ y = Xθ + ω,$$ where $X \in \mathbb{R}^{N \times D}$ is a (known) Gaussian matrix and the noise $ω \in \mathbb{R}^N$ is an (unknown) Gaussian vector. How small can we choose $N$ and still reliably recover the support of $θ$? We present RAWLS (Random…

    Submitted 9 November, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

  16. arXiv:2002.12317  [pdf, other]

    cs.LG stat.ML

    The Spectral Underpinning of word2vec

    Authors: Ariel Jaffe, Yuval Kluger, Ofir Lindenbaum, Jonathan Patsenker, Erez Peterfreund, Stefan Steinerberger

    Abstract: word2vec due to Mikolov et al. (2013) is a word embedding method that is widely used in natural language processing. Despite its great success and frequent use, theoretical justification is still lacking. The main contribution of our paper is to propose a rigorous analysis of the highly nonlinear functional of word2vec. Our results suggest that word2vec may be primarily driven by an under…

    Submitted 9 November, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

  17. arXiv:2001.01322  [pdf, other]

    math.AP cs.CG math.MG

    Non-Convex Planar Harmonic Maps

    Authors: Shahar Z. Kovalsky, Noam Aigerman, Ingrid Daubechies, Michael Kazhdan, Jianfeng Lu, Stefan Steinerberger

    Abstract: We formulate a novel characterization of a family of invertible maps between two-dimensional domains. Our work follows two classic results: The Radó-Kneser-Choquet (RKC) theorem, which establishes the invertibility of harmonic maps into a convex planar domain; and Tutte's embedding theorem for planar graphs - RKC's discrete counterpart - which proves the invertibility of piecewise linear maps of t…

    Submitted 5 January, 2020; originally announced January 2020.

  18. arXiv:1912.08327  [pdf, ps, other]

    math.CO cs.DM math.SP

    Extreme Values of the Fiedler Vector on Trees

    Authors: Roy R. Lederman, S. Steinerberger

    Abstract: Let $G$ be a connected tree on $n$ vertices and let $L = D-A$ denote the Laplacian matrix on $G$. The second-smallest eigenvalue $λ_{2}(G) > 0$, also known as the algebraic connectivity, as well as the associated eigenvector $φ_2$ have been of substantial interest. We investigate the question of when the maxima and minima of $φ_2$ are assumed at the endpoints of the longest path in $G$. Our result…

    Submitted 10 March, 2023; v1 submitted 17 December, 2019; originally announced December 2019.

  19. Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations

    Authors: Dmitry Kobak, George Linderman, Stefan Steinerberger, Yuval Kluger, Philipp Berens

    Abstract: T-distributed stochastic neighbour embedding (t-SNE) is a widely used data visualisation technique. It differs from its predecessor SNE by the low-dimensional similarity kernel: the Gaussian kernel was replaced by the heavy-tailed Cauchy kernel, solving the "crowding problem" of SNE. Here, we develop an efficient implementation of t-SNE for a $t$-distribution kernel with an arbitrary degree of fre…

    Submitted 4 April, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

    Journal ref: ECML PKDD 2019

  20. arXiv:1806.11096  [pdf, other]

    stat.ML cs.LG math.FA

    Recovering Trees with Convex Clustering

    Authors: Eric C. Chi, Stefan Steinerberger

    Abstract: Convex clustering refers, for given $\left\{x_1, \dots, x_n\right\} \subset \mathbb{R}^p$, to the minimization of $$ u(γ) = \underset{u_1, \dots, u_n}{\arg\min}\;\sum_{i=1}^{n}{\lVert x_i - u_i \rVert^2} + γ \sum_{i,j=1}^{n}{w_{ij} \lVert u_i - u_j\rVert}, $$ where $w_{ij} \geq 0$ is an affinity that quantifies the similarity between $x_i$ and $x_j$. We prove that…

    Submitted 28 June, 2018; v1 submitted 28 June, 2018; originally announced June 2018.

    Comments: 26 pages, 7 figures

  21. arXiv:1804.09816  [pdf, other]

    eess.SP cs.LG math.AP math.SP

    On the Dual Geometry of Laplacian Eigenfunctions

    Authors: Alexander Cloninger, Stefan Steinerberger

    Abstract: We discuss the geometry of Laplacian eigenfunctions $-Δφ = λφ$ on compact manifolds $(M,g)$ and combinatorial graphs $G=(V,E)$. The 'dual' geometry of Laplacian eigenfunctions is well understood on $\mathbb{T}^d$ (identified with $\mathbb{Z}^d$) and $\mathbb{R}^n$ (which is self-dual). The dual geometry plays a tremendous role in various fields of pure and applied mathematics. The purpose of our pape…

    Submitted 25 April, 2018; originally announced April 2018.

    MSC Class: 35J05; 35P05; 42C10; 65T60; 81Q50; 94A11

  22. arXiv:1803.06989  [pdf, other]

    math.ST cs.LG math.NA stat.ML

    Numerical Integration on Graphs: where to sample and how to weigh

    Authors: George C. Linderman, Stefan Steinerberger

    Abstract: Let $G=(V,E,w)$ be a finite, connected graph with weighted edges. We are interested in the problem of finding a subset $W \subset V$ of vertices and weights $a_w$ such that $$ \frac{1}{|V|}\sum_{v \in V}{f(v)} \sim \sum_{w \in W}{a_w f(w)}$$ for functions $f:V \rightarrow \mathbb{R}$ that are 'smooth' with respect to the geometry of the graph. The main applications are problems where $f$ is know…

    Submitted 19 March, 2018; originally announced March 2018.

  23. Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding

    Authors: George C. Linderman, Manas Rachh, Jeremy G. Hoskins, Stefan Steinerberger, Yuval Kluger

    Abstract: t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high dimensional data-points. We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)…

    Submitted 24 December, 2017; originally announced December 2017.

  24. arXiv:1711.04712  [pdf, other]

    math.CO cs.DM cs.DS math.PR stat.ML

    Randomized Near Neighbor Graphs, Giant Components, and Applications in Data Science

    Authors: George C. Linderman, Gal Mishne, Yuval Kluger, Stefan Steinerberger

    Abstract: If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $k$-nearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in $[0,1]^d$ it suffices to connect every point to $c_{d,1} \log{\log{n}}$ points chosen randomly among its $c_{d,2} \log{n}$-nearest neighbors to ensure a giant component of size…

    Submitted 13 November, 2017; originally announced November 2017.

  25. Stability, Fairness and Random Walks in the Bargaining Problem

    Authors: Jakob Kapeller, Stefan Steinerberger

    Abstract: We study the classical bargaining problem and its two canonical solutions (Nash and Kalai-Smorodinsky) from a novel point of view: we ask for stability of the solution if both players are able to distort the underlying bargaining process by reference to a third party (e.g. a court). By exploring the simplest case, where decisions of the third party are made randomly, we obtain a stable solution, whe…

    Submitted 8 July, 2017; originally announced July 2017.

    Comments: to appear in Physica A

  26. arXiv:1706.02582  [pdf, other]

    cs.LG stat.ML

    Clustering with t-SNE, provably

    Authors: George C. Linderman, Stefan Steinerberger

    Abstract: t-distributed Stochastic Neighborhood Embedding (t-SNE), a clustering and visualization method proposed by van der Maaten & Hinton in 2008, has rapidly become a standard tool in a number of natural sciences. Despite its overwhelming success, there is a distinct lack of mathematical foundations and the inner workings of the algorithm are not well understood. The purpose of this paper is to prove th…

    Submitted 8 June, 2017; originally announced June 2017.

  27. arXiv:1705.01883  [pdf, other]

    math.CO cs.DM math.DS

    Ulam Sequences and Ulam Sets

    Authors: Noah Kravitz, Stefan Steinerberger

    Abstract: The Ulam sequence is given by $a_1 =1, a_2 = 2$, and then, for $n \geq 3$, the element $a_n$ is defined as the smallest integer that can be written as the sum of two distinct earlier elements in a unique way. This gives the sequence $1, 2, 3, 4, 6, 8, 11, 13, 16, \dots$, which has a mysterious quasi-periodic behavior that is not understood. Ulam's definition naturally extends to higher dimensions:…

    Submitted 27 August, 2018; v1 submitted 4 May, 2017; originally announced May 2017.

  28. arXiv:1507.00267  [pdf, other]

    math.CO cs.DM math.NT

    A Hidden Signal in the Ulam sequence

    Authors: Stefan Steinerberger

    Abstract: The Ulam sequence is defined as $a_1 =1, a_2 = 2$ and $a_n$ being the smallest integer that can be written as the sum of two distinct earlier elements in a unique way. This gives $$1, 2, 3, 4, 6, 8, 11, 13, 16, 18, 26, 28, 36, 38, 47, \dots$$ Ulam remarked that understanding the sequence, which has been described as 'quite erratic', seems difficult and indeed nothing is known. We report the empiri…

    Submitted 5 July, 2016; v1 submitted 1 July, 2015; originally announced July 2015.
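
    Note: the definition of the Ulam sequence quoted in entries 27 and 28 can be checked with a short brute-force sketch. It assumes, as is standard, that each new element is also larger than all earlier ones; the function name `ulam_sequence` is hypothetical and the approach is not taken from either paper.

```python
def ulam_sequence(count):
    """Return the first `count` terms of the Ulam sequence starting 1, 2."""
    seq = [1, 2]
    while len(seq) < count:
        members = set(seq)
        candidate = seq[-1] + 1
        while True:
            # Count representations candidate = a + b with a < b, both earlier terms.
            representations = sum(
                1 for a in seq if a < candidate - a and (candidate - a) in members
            )
            if representations == 1:
                break
            candidate += 1
        seq.append(candidate)
    return seq[:count]


first_terms = ulam_sequence(15)
```

    The first fifteen terms produced this way agree with the values listed in the abstract of entry 28.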

  29. arXiv:1411.1638  [pdf, other]

    cs.DM

    A filtering technique for Markov chains with applications to spectral embedding

    Authors: Stefan Steinerberger

    Abstract: Spectral methods have proven to be a highly effective tool in understanding the intrinsic geometry of a high-dimensional data set $\left\{x_i \right\}_{i=1}^{n} \subset \mathbb{R}^d$. The key ingredient is the construction of a Markov chain on the set, where transition probabilities depend on the distance between elements, for example where for every $1 \leq j \leq n$ the probability of going from…

    Submitted 5 November, 2014; originally announced November 2014.

    Comments: 9 pages, 19 figures

  30. arXiv:1403.8002  [pdf, ps, other]

    math.NA cs.DM math.MG

    A Remark on Disk Packings and Numerical Integration of Harmonic Functions

    Authors: Stefan Steinerberger

    Abstract: We are interested in the following problem: given an open, bounded domain $Ω \subset \mathbb{R}^2$, what is the largest constant $α = α(Ω) > 0$ such that there exists an infinite sequence of disks $B_1, B_2, \dots, B_N, \dots \subset \mathbb{R}^2$ and a sequence $(n_i)$ with $n_i \in \left\{1,2\right\}$ such that…

    Submitted 7 December, 2014; v1 submitted 31 March, 2014; originally announced March 2014.

    Comments: to appear in Journal of Complexity