Showing 1–10 of 10 results for author: Holzmüller, D

  1. arXiv:2506.22429  [pdf, ps, other]

    stat.ML cs.LG

    Beyond ReLU: How Activations Affect Neural Kernels and Random Wide Networks

    Authors: David Holzmüller, Max Schölpple

    Abstract: While the theory of deep learning has made some progress in recent years, much of it is limited to the ReLU activation function. In particular, while the neural tangent kernel (NTK) and neural network Gaussian process kernel (NNGP) have given theoreticians tractable limiting cases of fully connected neural networks, their properties for most activation functions except for powers of the ReLU funct…

    Submitted 27 June, 2025; originally announced June 2025.

  2. arXiv:2312.01416  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci stat.ML

    Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

    Authors: Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, Johannes Kästner

    Abstract: Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning, which uses biased or unbiased molecular dynamics (MD) to generate candidate pools, aims to address this objective. Existing biased and unbiased MD-simulation methods, however, are prone to miss either rare events or extrapolative regio…

    Submitted 2 November, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Journal ref: npj Comput. Mater. 10, 83 (2024)

  3. arXiv:2312.01414  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci stat.ML

    Predicting Properties of Periodic Systems from Cluster Data: A Case Study of Liquid Water

    Authors: Viktor Zaverkin, David Holzmüller, Robin Schuldt, Johannes Kästner

    Abstract: The accuracy of the training data limits the accuracy of bulk properties from machine-learned potentials. For example, hybrid functionals or wave-function-based quantum chemical methods are readily available for cluster data but effectively out-of-scope for periodic structures. We show that local, atom-centred descriptors for machine-learned potentials enable the prediction of bulk properties from…

    Submitted 3 December, 2023; originally announced December 2023.

    Journal ref: J. Chem. Phys. 156, 114103 (2022)

  4. arXiv:2305.14077  [pdf, other]

    stat.ML cs.LG math.ST

    Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension

    Authors: Moritz Haas, David Holzmüller, Ulrike von Luxburg, Ingo Steinwart

    Abstract: The success of over-parameterized neural networks trained to near-zero training error has caused great interest in the phenomenon of benign overfitting, where estimators are statistically consistent even though they interpolate noisy training data. While benign overfitting in fixed dimension has been established for some learning methods, current literature suggests that for regression with typica…

    Submitted 6 November, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Compared to the NeurIPS version (v2), this version strengthens Assumption (K) from d/2<s<=3d/4 to d/2<s<3d/4 and corrects Lemma B.2 by posing additional assumptions. This does not affect any other statements. We provide Python code to reproduce all of our experimental results at https://github.com/moritzhaas/mind-the-spikes

  5. arXiv:2303.03237  [pdf, other]

    stat.ML cs.LG math.ST stat.CO

    Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation

    Authors: David Holzmüller, Francis Bach

    Abstract: Sampling from Gibbs distributions $p(x) \propto \exp(-V(x)/\varepsilon)$ and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical physics. However, while efficient algorithms are known for convex potentials $V$, the situation is much more difficult in the non-convex case, where algorithms necessarily suffer from the curse of dimensionality i…

    Submitted 1 August, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: Changes in v3: Minor corrections and improvements. Plots can be reproduced using the code at https://github.com/dholzmueller/sampling_experiments

  6. arXiv:2212.03916  [pdf, other]

    physics.comp-ph stat.ML

    Transfer learning for chemically accurate interatomic neural network potentials

    Authors: Viktor Zaverkin, David Holzmüller, Luca Bonfirraro, Johannes Kästner

    Abstract: Developing machine learning-based interatomic potentials from ab-initio electronic structure methods remains a challenging task for computational chemistry and materials science. This work studies the capability of transfer learning, in particular discriminative fine-tuning, for efficiently generating chemically accurate interatomic neural network potentials on organic molecules from the MD17 and…

    Submitted 28 January, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

  7. arXiv:2203.09410  [pdf, other]

    stat.ML cs.LG cs.NE

    A Framework and Benchmark for Deep Batch Active Learning for Regression

    Authors: David Holzmüller, Viktor Zaverkin, Johannes Kästner, Ingo Steinwart

    Abstract: The acquisition of labels for supervised learning can be expensive. To improve the sample efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods. Our framework encompasses many e…

    Submitted 1 August, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published at the Journal of Machine Learning Research (JMLR). Changes in v4: Improvements in writing and other minor changes. Accompanying code can be found at https://github.com/dholzmueller/bmdal_reg

    Journal ref: Journal of Machine Learning Research, 24(164):1-81, 2023

  8. arXiv:2109.09569  [pdf, other]

    physics.comp-ph stat.ML

    Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments

    Authors: Viktor Zaverkin, David Holzmüller, Ingo Steinwart, Johannes Kästner

    Abstract: Artificial neural networks (NNs) are one of the most frequently used machine learning approaches to construct interatomic potentials and enable efficient large-scale atomistic simulations with almost ab initio accuracy. However, the simultaneous training of NNs on energies and forces, which are a prerequisite for, e.g., molecular dynamics simulations, can be demanding. In this work, we present an…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: Manuscript accepted for publication in J. Chem. Theory Comput.; Code published at https://gitlab.com/zaverkin_v/gmnn

  9. arXiv:2010.01851  [pdf, other]

    stat.ML cs.LG cs.NE math.ST

    On the Universality of the Double Descent Peak in Ridgeless Regression

    Authors: David Holzmüller

    Abstract: We prove a non-asymptotic distribution-independent lower bound for the expected mean squared generalization error caused by label noise in ridgeless linear regression. Our lower bound generalizes a similar known result to the overparameterized (interpolating) regime. In contrast to most previous works, our analysis applies to a broad class of input distributions with almost surely full-rank featur…

    Submitted 1 August, 2023; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021. 9 pages + 34 pages appendix. Changes in v8: Small corrections. Experimental results can be reproduced using the code at https://github.com/dholzmueller/universal_double_descent

  10. arXiv:2002.04861  [pdf, other]

    stat.ML cs.LG

    Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent

    Authors: David Holzmüller, Ingo Steinwart

    Abstract: We prove that two-layer (Leaky)ReLU networks initialized by e.g. the widely used method proposed by He et al. (2015) and trained using gradient descent on a least-squares loss are not universally consistent. Specifically, we describe a large class of one-dimensional data-generating distributions for which, with high probability, gradient descent only finds a bad local minimum of the optimization l…

    Submitted 8 June, 2022; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: To appear in Journal of Machine Learning Research (JMLR). Changes in v3: Added new Section 10 with extensive experimental evaluation. Code available at https://github.com/dholzmueller/nn_inconsistency