Showing 1–13 of 13 results for author: Larsen, K G

Searching in archive math.

  1. arXiv:2502.16462 [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Improved Margin Generalization Bounds for Voting Classifiers

    Authors: Mikael Møller Høgsgaard, Kasper Green Larsen

    Abstract: In this paper we establish a new margin-based generalization bound for voting classifiers, refining existing results and yielding tighter generalization guarantees for widely used boosting algorithms such as AdaBoost (Freund and Schapire, 1997). Furthermore, the new margin-based generalization bound enables the derivation of an optimal weak-to-strong learner: a Majority-of-3 large-margin classifie…

    Submitted 3 June, 2025; v1 submitted 23 February, 2025; originally announced February 2025.

  2. arXiv:2502.13692 [pdf, ps, other]

    cs.LG math.ST

    Tight Generalization Bounds for Large-Margin Halfspaces

    Authors: Kasper Green Larsen, Natascha Schalburg

    Abstract: We prove the first generalization bound for large-margin halfspaces that is asymptotically tight in the tradeoff between the margin, the fraction of training points with the given margin, the failure probability and the number of training points.

    Submitted 19 February, 2025; originally announced February 2025.

  3. arXiv:2409.17567 [pdf, ps, other]

    cs.LG cs.CC cs.DS math.ST

    Derandomizing Multi-Distribution Learning

    Authors: Kasper Green Larsen, Omar Montasser, Nikita Zhivotovskiy

    Abstract: Multi-distribution or collaborative learning involves learning a single predictor that works well across multiple data distributions, using samples from each during training. Recent research on multi-distribution learning, focusing on binary loss and finite VC dimension classes, has shown near-optimal sample complexity that is achieved with oracle efficient algorithms. That is, these algorithms ar…

    Submitted 26 September, 2024; originally announced September 2024.

  4. arXiv:2407.19777 [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Revisiting Agnostic PAC Learning

    Authors: Steve Hanneke, Kasper Green Larsen, Nikita Zhivotovskiy

    Abstract: PAC learning, dating back to Valiant'84 and Vapnik and Chervonenkis'64,'74, is a classic model for studying supervised learning. In the agnostic setting, we have access to a hypothesis set $\mathcal{H}$ and a training set of labeled samples $(x_1,y_1),\dots,(x_n,y_n) \in \mathcal{X} \times \{-1,1\}$ drawn i.i.d. from an unknown distribution $\mathcal{D}$. The goal is to produce a classifier…

    Submitted 29 July, 2024; originally announced July 2024.

  5. arXiv:2403.08831 [pdf, ps, other]

    stat.ML cs.LG math.ST

    Majority-of-Three: The Simplest Optimal Learner?

    Authors: Ishaq Aden-Ali, Mikael Møller Høgsgaard, Kasper Green Larsen, Nikita Zhivotovskiy

    Abstract: Developing an optimal PAC learning algorithm in the realizable setting, where empirical risk minimization (ERM) is suboptimal, was a major open problem in learning theory for decades. The problem was finally resolved by Hanneke a few years ago. Unfortunately, Hanneke's algorithm is quite complex as it returns the majority vote of many ERM classifiers that are trained on carefully selected subsets…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 22 pages

  6. arXiv:2301.01924 [pdf, other]

    math.CO cs.CC cs.DM math.LO

    Diagonalization Games

    Authors: Noga Alon, Olivier Bousquet, Kasper Green Larsen, Shay Moran, Shlomo Moran

    Abstract: We study several variants of a combinatorial game which is based on Cantor's diagonal argument. The game is between two players called Kronecker and Cantor. The names of the players are motivated by the known fact that Leopold Kronecker did not appreciate Georg Cantor's arguments about the infinite, and even referred to him as a "scientific charlatan". In the game Kronecker maintains a list of m…

    Submitted 22 January, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: 11 pages, added acknowledgements

  7. arXiv:1909.09912 [pdf, ps, other]

    cs.DS cs.DM cs.LG math.CO

    Optimal Learning of Joint Alignments with a Faulty Oracle

    Authors: Kasper Green Larsen, Michael Mitzenmacher, Charalampos E. Tsourakakis

    Abstract: We consider the following problem, which is useful in applications such as joint image and shape alignment. The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$; specifically, with probability $\frac{1}{k}+δ$ for some $δ> 0$ one obtains the correct answer, and…

    Submitted 21 September, 2019; originally announced September 2019.

    Comments: 10 pages

  8. arXiv:1805.08539 [pdf, other]

    cs.LG cs.DS math.FA stat.ML

    Fully Understanding the Hashing Trick

    Authors: Casper Benjamin Freksen, Lior Kamma, Kasper Green Larsen

    Abstract: Feature hashing, also known as the hashing trick, introduced by Weinberger et al. (2009), is one of the key techniques used in scaling-up machine learning algorithms. Loosely speaking, feature hashing uses a random sparse projection matrix $A : \mathbb{R}^n \to \mathbb{R}^m$ (where $m \ll n$) in order to reduce the dimension of the data from $n$ to $m$ while approximately preserving the Eucl…

    Submitted 22 May, 2018; originally announced May 2018.

  9. arXiv:1711.02860 [pdf, ps, other]

    cs.DS cs.DM math.CO

    Constructive Discrepancy Minimization with Hereditary L2 Guarantees

    Authors: Kasper Green Larsen

    Abstract: In discrepancy minimization problems, we are given a family of sets $\mathcal{S} = \{S_1,\dots,S_m\}$, with each $S_i \in \mathcal{S}$ a subset of some universe $U = \{u_1,\dots,u_n\}$ of $n$ elements. The goal is to find a coloring $χ: U \to \{-1,+1\}$ of the elements of $U$ such that each set $S \in \mathcal{S}$ is colored as evenly as possible. Two classic measures of discrepancy are…

    Submitted 13 December, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

  10. arXiv:1709.07308 [pdf, other]

    cs.DS cs.DM cs.LG cs.SI math.CO

    Predicting Positive and Negative Links with Noisy Queries: Theory & Practice

    Authors: Charalampos E. Tsourakakis, Michael Mitzenmacher, Kasper Green Larsen, Jarosław Błasiok, Ben Lawson, Preetum Nakkiran, Vasileios Nakos

    Abstract: Social networks involve both positive and negative relationships, which can be captured in signed graphs. The edge sign prediction problem aims to predict whether an interaction between a pair of nodes will be positive or negative. We provide theoretical results for this problem that motivate natural improvements to recent heuristics. The edge sign prediction problem is related to correlat…

    Submitted 6 December, 2020; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: arXiv admin note: text overlap with arXiv:1609.00750

  11. arXiv:1706.10110 [pdf, ps, other]

    math.FA cs.CC cs.DS

    On Using Toeplitz and Circulant Matrices for Johnson-Lindenstrauss Transforms

    Authors: Casper Benjamin Freksen, Kasper Green Larsen

    Abstract: The Johnson-Lindenstrauss lemma is one of the cornerstone results in dimensionality reduction. It says that given $N$, for any set of $N$ vectors $X \subset \mathbb{R}^n$, there exists a mapping $f : X \to \mathbb{R}^m$ such that $f(X)$ preserves all pairwise distances between vectors in $X$ to within $(1 \pm \varepsilon)$ if $m = O(\varepsilon^{-2} \lg N)$. Much effort has gone into developing f…

    Submitted 8 November, 2017; v1 submitted 30 June, 2017; originally announced June 2017.

  12. arXiv:1609.02094 [pdf, ps, other]

    cs.IT cs.CG cs.DS math.FA

    Optimality of the Johnson-Lindenstrauss Lemma

    Authors: Kasper Green Larsen, Jelani Nelson

    Abstract: For any integers $d, n \geq 2$ and $1/({\min\{n,d\}})^{0.4999} < \varepsilon<1$, we show the existence of a set of $n$ vectors $X\subset \mathbb{R}^d$ such that any embedding $f:X\rightarrow \mathbb{R}^m$ satisfying $$ \forall x,y\in X,\ (1-\varepsilon)\|x-y\|_2^2\le \|f(x)-f(y)\|_2^2 \le (1+\varepsilon)\|x-y\|_2^2 $$ must have $$ m = Ω(\varepsilon^{-2} \lg n). $$ This lower bound matches the uppe…

    Submitted 8 November, 2017; v1 submitted 7 September, 2016; originally announced September 2016.

    Comments: v2: simplified proof, also added reference to Lev83

  13. arXiv:1411.2404 [pdf, ps, other]

    cs.IT cs.CG cs.DS math.FA

    The Johnson-Lindenstrauss lemma is optimal for linear dimensionality reduction

    Authors: Kasper Green Larsen, Jelani Nelson

    Abstract: For any $n>1$ and $0<\varepsilon<1/2$, we show the existence of an $n^{O(1)}$-point subset $X$ of $\mathbb{R}^n$ such that any linear map from $(X,\ell_2)$ to $\ell_2^m$ with distortion at most $1+\varepsilon$ must have $m = Ω(\min\{n, \varepsilon^{-2}\log n\})$. Our lower bound matches the upper bounds provided by the identity matrix and the Johnson-Lindenstrauss lemma, improving the previous low…

    Submitted 10 November, 2014; originally announced November 2014.