Showing 1–6 of 6 results for author: Benning, F

Searching in archive math.
  1. arXiv:2506.22048

    math.ST

    Schoenberg characterization of continuous non-stationary isotropic positive definite kernels

    Authors: Felix Benning, Max David Schölpple

    Abstract: We provide a characterization for the continuous positive definite kernels on $\mathbb R^d$ that are invariant to linear isometries, i.e. invariant under the orthogonal group $O(d)$. Furthermore, we provide necessary and sufficient conditions for these kernels to be strictly positive definite. This class of isotropic kernels is fairly general: First, it unifies stationary isotropic and dot product…

    Submitted 27 June, 2025; originally announced June 2025.

    MSC Class: 33C50; 33C55; 42A82; 42C10; 43A35; 60G15; 68T07

  2. arXiv:2504.08867

    cs.LG math.PR stat.ML

    In almost all shallow analytic neural network optimization landscapes, efficient minimizers have strongly convex neighborhoods

    Authors: Felix Benning, Steffen Dereich

    Abstract: Whether or not a local minimum of a cost function has a strongly convex neighborhood greatly influences the asymptotic convergence rate of optimizers. In this article, we rigorously analyze the prevalence of this property for the mean squared error induced by shallow, 1-hidden layer neural networks with analytic activation functions when applied to regression problems. The parameter space is divid…

    Submitted 11 April, 2025; originally announced April 2025.

    MSC Class: 60G15; 60G60; 62J02; 62M45; 68T07

  3. arXiv:2504.08513

    math.PR math.ST

    Measure Theory of Conditionally Independent Random Function Evaluation

    Authors: Felix Benning

    Abstract: The next evaluation point $x_{n+1}$ of a random function $\mathbf f = (\mathbf f(x))_{x\in \mathbb X}$ (a.k.a. stochastic process or random field) is often chosen based on the filtration of previously seen evaluations $\mathcal F_n := \sigma(\mathbf f(x_0),\dots, \mathbf f(x_n))$. This turns $x_{n+1}$ into a random variable $X_{n+1}$ and thereby $\mathbf f(X_{n+1})$ into a complex measure theoretical o…

    Submitted 11 April, 2025; originally announced April 2025.

    MSC Class: 60A10; 60G05; 60G15; 60G60

  4. arXiv:2410.09973

    stat.ML cs.LG math.OC math.PR

    Gradient Span Algorithms Make Predictable Progress in High Dimension

    Authors: Felix Benning, Leif Döring

    Abstract: We prove that all 'gradient span algorithms' have asymptotically deterministic behavior on scaled Gaussian random functions as the dimension tends to infinity. In particular, this result explains the counterintuitive phenomenon that different training runs of many large machine learning models result in approximately equal cost curves despite random initialization on a complicated non-convex lands…

    Submitted 13 October, 2024; originally announced October 2024.

    MSC Class: 60F99; 68T01; 82D30

  5. arXiv:2305.01377

    math.OC cs.LG stat.ML

    Random Function Descent

    Authors: Felix Benning, Leif Döring

    Abstract: Classical worst-case optimization theory neither explains the success of optimization in machine learning, nor does it help with step size selection. In this paper we demonstrate the viability and advantages of replacing the classical 'convex function' framework with a 'random function' framework. With complexity $\mathcal{O}(n^3d^3)$, where $n$ is the number of steps and $d$ the number of dimensi…

    Submitted 15 October, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

    Journal ref: Advances in Neural Information Processing Systems, Vol. 37. Vancouver, Canada: Curran Associates, Inc., 2024

  6. arXiv:2112.15392

    math.OC stat.ML

    High Dimensional Optimization through the Lens of Machine Learning

    Authors: Felix Benning

    Abstract: This thesis reviews numerical optimization methods with machine learning problems in mind. Since machine learning models are highly parametrized, we focus on methods suited for high dimensional optimization. We build intuition on quadratic models to figure out which methods are suited for non-convex optimization, and develop convergence proofs on convex functions for this selection of methods. Wit…

    Submitted 31 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: text overlap with arXiv:1606.04838 by other authors