Showing 1–25 of 25 results for author: Kutyniok, G

Searching in archive stat.
  1. arXiv:2505.21423  [pdf, other]

    cs.LG stat.ML

    Conflicting Biases at the Edge of Stability: Norm versus Sharpness Regularization

    Authors: Vit Fojtik, Maria Matveev, Hung-Hsu Chou, Gitta Kutyniok, Johannes Maly

    Abstract: A widely believed explanation for the remarkable generalization capacities of overparameterized neural networks is that the optimization algorithms used for training induce an implicit bias towards benign solutions. To grasp this theoretically, recent works examine gradient descent and its variants in simplified training settings, often assuming vanishing learning rates. These studies reveal vario…

    Submitted 27 May, 2025; originally announced May 2025.

  2. arXiv:2504.18433  [pdf, other]

    cs.LG stat.ML

    An Axiomatic Assessment of Entropy- and Variance-based Uncertainty Quantification in Regression

    Authors: Christopher Bülte, Yusuf Sale, Timo Löhr, Paul Hofman, Gitta Kutyniok, Eyke Hüllermeier

    Abstract: Uncertainty quantification (UQ) is crucial in machine learning, yet most (axiomatic) studies of uncertainty measures focus on classification, leaving a gap in regression settings with limited formal justification and evaluations. In this work, we introduce a set of axioms to rigorously assess measures of aleatoric, epistemic, and total uncertainty in supervised regression. By utilizing a predictiv…

    Submitted 16 May, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

  3. arXiv:2307.02301  [pdf, other]

    cs.LG cs.CL stat.ML

    Sumformer: Universal Approximation for Efficient Transformers

    Authors: Silas Alberti, Niclas Dern, Laura Thesing, Gitta Kutyniok

    Abstract: Natural language processing (NLP) made an impressive jump with the introduction of Transformers. ChatGPT is one of the most famous examples, changing the perception of the possibilities of AI even outside the research community. However, besides the impressive performance, the quadratic time and space complexity of Transformers with respect to sequence length pose significant limitations for handl…

    Submitted 5 July, 2023; originally announced July 2023.

  4. arXiv:2205.15117  [pdf, other]

    cs.LG cs.AI math.NA stat.ML

    OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs

    Authors: Yangze Zhou, Gitta Kutyniok, Bruno Ribeiro

    Abstract: This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs) -- such as Graph Neural Networks (GNNs) -- to perform inductive out-of-distribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove non-asymptotic bounds showing that link predictors based on permutation-equivariant (struct…

    Submitted 9 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Accepted at NeurIPS 2022

  5. arXiv:2203.08890  [pdf, other]

    cs.LG math.HO stat.ML

    The Mathematics of Artificial Intelligence

    Authors: Gitta Kutyniok

    Abstract: We currently witness the spectacular success of artificial intelligence in both science and public life. However, the development of a rigorous mathematical foundation is still at an early stage. In this survey article, which is based on an invited lecture at the International Congress of Mathematicians 2022, we will in particular focus on the current "workhorse" of artificial intelligence, namely…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: 16 pages, 7 figures

    MSC Class: Primary 68T07; Secondary 41A25; 42C15; 35C20; 65D18

  6. The Modern Mathematics of Deep Learning

    Authors: Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen

    Abstract: We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surpr…

    Submitted 8 February, 2023; v1 submitted 9 May, 2021; originally announced May 2021.

    Comments: A version of this review paper appears as a chapter in the book "Mathematical Aspects of Deep Learning" by Cambridge University Press

    Journal ref: Mathematical Aspects of Deep Learning, pp. 1-111. Cambridge University Press, 2022

  7. arXiv:2012.04477  [pdf, ps, other]

    cs.LG stat.ML

    Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory?

    Authors: Mariia Seleznova, Gitta Kutyniok

    Abstract: Neural Tangent Kernel (NTK) theory is widely used to study the dynamics of infinitely-wide deep neural networks (DNNs) under gradient descent. But do the results for infinitely-wide networks give us hints about the behavior of real finite-width ones? In this paper, we study empirically when NTK theory is valid in practice for fully-connected ReLU and sigmoid DNNs. We find out that whether a networ…

    Submitted 1 February, 2022; v1 submitted 8 December, 2020; originally announced December 2020.

    Journal ref: Proceedings of Machine Learning Research vol 145:1-28, 2021 2nd Annual Conference on Mathematical and Scientific Machine Learning

  8. arXiv:2007.04759  [pdf, other]

    cs.LG math.FA stat.ML

    Expressivity of Deep Neural Networks

    Authors: Ingo Gühring, Mones Raslan, Gitta Kutyniok

    Abstract: In this review paper, we give a comprehensive overview of the large variety of approximation results for neural networks. Approximation rates for classical function spaces as well as benefits of deep neural networks over shallow ones for specifically structured function classes are discussed. While the main body of existing results is for general feedforward architectures, we also depict approximat…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: This review paper will appear as a book chapter in the book "Theory of Deep Learning" by Cambridge University Press

    MSC Class: 41-02; 41-03; 68T07; 68Q32

  9. arXiv:2007.00758  [pdf, other]

    cs.LG stat.ML

    In-Distribution Interpretability for Challenging Modalities

    Authors: Cosmas Heiß, Ron Levie, Cinjon Resnick, Gitta Kutyniok, Joan Bruna

    Abstract: It is widely recognized that the predictions of deep neural networks are difficult to parse relative to simpler approaches. However, the development of methods to investigate the mode of operation of such models has advanced rapidly in the past few years. Recent work introduced an intuitive framework which utilizes generative models to improve on the meaningfulness of such explanations. In this wo…

    Submitted 7 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  10. arXiv:2007.00479  [pdf, ps, other]

    stat.ML cs.LG math.NA

    The Restricted Isometry of ReLU Networks: Generalization through Norm Concentration

    Authors: Alex Goeßmann, Gitta Kutyniok

    Abstract: While regression tasks aim at interpolating a relation on the entire input space, they often have to be solved with a limited amount of training data. Still, if the hypothesis functions can be sketched well with the data, one can hope for identifying a generalizing model. In this work, we introduce with the Neural Restricted Isometry Property (NeuRIP) a uniform concentration event, in which all…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: 27 pages, 5 figures

    MSC Class: G.3

    ACM Class: F.2; G.3

  11. arXiv:2006.05397  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    Real-time Localization Using Radio Maps

    Authors: Çağkan Yapar, Ron Levie, Gitta Kutyniok, Giuseppe Caire

    Abstract: This paper deals with the problem of localization in a cellular network in a dense urban scenario. Global Navigation Satellite System typically performs poorly in urban environments when there is no line-of-sight between the devices and the satellites, and thus alternative localization methods are often required. We present a simple yet effective method for localization based on pathloss. In our a…

    Submitted 9 June, 2020; originally announced June 2020.

  12. arXiv:2004.12131  [pdf, other]

    math.NA cs.LG stat.ML

    Numerical Solution of the Parametric Diffusion Equation by Deep Neural Networks

    Authors: Moritz Geist, Philipp Petersen, Mones Raslan, Reinhold Schneider, Gitta Kutyniok

    Abstract: We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory predicts that the performance of the model should depend only very mildly…

    Submitted 25 April, 2020; originally announced April 2020.

    MSC Class: 35J99; 41A25; 41A30; 68T05; 65N30

  13. arXiv:2003.11566  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Interval Neural Networks: Uncertainty Scores

    Authors: Luis Oala, Cosmas Heiß, Jan Macdonald, Maximilian März, Wojciech Samek, Gitta Kutyniok

    Abstract: We propose a fast, non-Bayesian method for producing uncertainty scores in the output of pre-trained deep neural networks (DNNs) using a data-driven interval propagating network. This interval neural network (INN) has interval valued parameters and propagates its input using interval arithmetic. The INN produces sensible lower and upper bounds encompassing the ground truth. We provide theoretical…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: LO and CH contributed equally

    ACM Class: I.5.1; I.4.5; J.3; I.2.m

  14. arXiv:2002.12388  [pdf, other]

    math.NA cs.LG math.DS quant-ph stat.ML

    Tensor network approaches for learning non-linear dynamical laws

    Authors: A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, J. Eisert

    Abstract: Given observations of a physical system, identifying the underlying non-linear governing equation is a fundamental task, necessary both for gaining understanding and generating deterministic future predictions. Of most practical relevance are automated approaches to theory building that scale efficiently for complex systems with many degrees of freedom. To date, available scalable methods aim at a…

    Submitted 27 February, 2020; originally announced February 2020.

    Comments: 17 pages, 8 figures

  15. arXiv:1911.09002  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    RadioUNet: Fast Radio Map Estimation with Convolutional Neural Networks

    Authors: Ron Levie, Çağkan Yapar, Gitta Kutyniok, Giuseppe Caire

    Abstract: In this paper we propose a highly efficient and very accurate deep learning method for estimating the propagation pathloss from a point $x$ (transmitter location) to any point $y$ on a planar domain. For applications such as user-cell site association and device-to-device link scheduling, an accurate knowledge of the pathloss function for all pairs of transmitter-receiver locations is very importa…

    Submitted 22 December, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

  16. arXiv:1907.12972  [pdf, other]

    cs.LG stat.ML

    Transferability of Spectral Graph Convolutional Neural Networks

    Authors: Ron Levie, Wei Huang, Lorenzo Bucci, Michael M. Bronstein, Gitta Kutyniok

    Abstract: This paper focuses on spectral graph convolutional neural networks (ConvNets), where filters are defined as elementwise multiplication in the frequency domain of a graph. In machine learning settings where the dataset consists of signals defined on many different graphs, the trained ConvNet should generalize to signals on graphs unseen in the training set. It is thus important to transfer ConvNets…

    Submitted 12 June, 2021; v1 submitted 30 July, 2019; originally announced July 2019.

  17. arXiv:1905.11092  [pdf, other]

    cs.LG cs.CC cs.IT stat.ML

    A Rate-Distortion Framework for Explaining Neural Network Decisions

    Authors: Jan Macdonald, Stephan Wäldchen, Sascha Hauch, Gitta Kutyniok

    Abstract: We formalise the widespread idea of interpreting neural network decisions as an explicit optimisation problem in a rate-distortion framework. A set of input features is deemed relevant for a classification decision if the expected classifier score remains nearly constant when randomising the remaining features. We discuss the computational complexity of finding small sets of relevant features and…

    Submitted 27 May, 2019; originally announced May 2019.

  18. arXiv:1905.01208  [pdf, other]

    math.FA cs.NE stat.ML

    Approximation spaces of deep neural networks

    Authors: Rémi Gribonval, Gitta Kutyniok, Morten Nielsen, Felix Voigtlaender

    Abstract: We study the expressivity of deep neural networks. Measuring a network's complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be…

    Submitted 17 July, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

  19. arXiv:1904.00377  [pdf, ps, other]

    math.NA cs.LG math.FA stat.ML

    A Theoretical Analysis of Deep Neural Networks and Parametric PDEs

    Authors: Gitta Kutyniok, Philipp Petersen, Mones Raslan, Reinhold Schneider

    Abstract: We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by classical neural network approximation results. C…

    Submitted 14 May, 2020; v1 submitted 31 March, 2019; originally announced April 2019.

    MSC Class: 35A35; 35J99; 41A25; 41A46; 68T05; 65N30

  20. arXiv:1901.10524  [pdf, other]

    cs.LG stat.ML

    On the Transferability of Spectral Graph Filters

    Authors: Ron Levie, Elvin Isufi, Gitta Kutyniok

    Abstract: This paper focuses on spectral filters on graphs, namely filters defined as elementwise multiplication in the frequency domain of a graph. In many graph signal processing settings, it is important to transfer a filter from one graph to another. One example is in graph convolutional neural networks (ConvNets), where the dataset consists of signals defined on many different graphs, and the learned f…

    Submitted 29 January, 2019; originally announced January 2019.

  21. arXiv:1901.05744  [pdf, ps, other]

    cs.LG stat.ML

    The Oracle of DLphi

    Authors: Dominik Alfke, Weston Baines, Jan Blechschmidt, Mauricio J. del Razo Sarmina, Amnon Drory, Dennis Elbrächter, Nando Farchmin, Matteo Gambara, Silke Glas, Philipp Grohs, Peter Hinz, Danijel Kivaranovic, Christian Kümmerle, Gitta Kutyniok, Sebastian Lunz, Jan Macdonald, Ryan Malthaner, Gregory Naisat, Ariel Neufeld, Philipp Christian Petersen, Rafael Reisenhofer, Jun-Da Sheng, Laura Thesing, Philipp Trunschke, Johannes von Lindheim, et al. (2 additional authors not shown)

    Abstract: We present a novel technique based on deep learning and set theory which yields exceptional classification and prediction results. Having access to a sufficiently large amount of labelled training data, our methodology is capable of predicting the labels of the test data almost always even if the training data is entirely unrelated to the test data. In other words, we prove in a specific setting t…

    Submitted 27 January, 2019; v1 submitted 17 January, 2019; originally announced January 2019.

    MSC Class: 68T05; 82C32

  22. arXiv:1901.01388  [pdf, other]

    eess.IV cs.LG eess.SP stat.ML

    Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks

    Authors: Héctor Andrade-Loarca, Gitta Kutyniok, Ozan Öktem, Philipp Petersen

    Abstract: Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images, which combines data-based and model-based method…

    Submitted 10 July, 2019; v1 submitted 5 January, 2019; originally announced January 2019.

    MSC Class: 35A18; 65T60; 68T10

  23. arXiv:1808.06329  [pdf, ps, other]

    math.ST stat.ML

    The Mismatch Principle: The Generalized Lasso Under Large Model Uncertainties

    Authors: Martin Genzel, Gitta Kutyniok

    Abstract: We study the estimation capacity of the generalized Lasso, i.e., least squares minimization combined with a (convex) structural constraint. While Lasso-type estimators were originally designed for noisy linear regression problems, it has recently turned out that they are in fact robust against various types of model uncertainties and misspecifications, most notably, non-linearly distorted observat…

    Submitted 11 September, 2019; v1 submitted 20 August, 2018; originally announced August 2018.

    MSC Class: 68T37; 60D05; 90C25; 62F30; 62F35

  24. arXiv:1608.08852  [pdf, other]

    stat.ML cs.LG math.ST

    A Mathematical Framework for Feature Selection from Real-World Data with Non-Linear Observations

    Authors: Martin Genzel, Gitta Kutyniok

    Abstract: In this paper, we study the challenge of feature selection based on a relatively small collection of sample pairs $\{(x_i, y_i)\}_{1 \leq i \leq m}$. The observations $y_i \in \mathbb{R}$ are thereby supposed to follow a noisy single-index model, depending on a certain set of signal variables. A major difficulty is that these variables usually cannot be observed directly, but rather arise as hidde…

    Submitted 31 August, 2016; originally announced August 2016.

  25. Sparse Proteomics Analysis - A compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data

    Authors: Tim Conrad, Martin Genzel, Nada Cvetkovic, Niklas Wulkow, Alexander Leichtle, Jan Vybiral, Gitta Kutyniok, Christof Schütte

    Abstract: Background: High-throughput proteomics techniques, such as mass spectrometry (MS)-based approaches, produce very high-dimensional data-sets. In a clinical setting one is often interested in how mass spectra differ between patients of different classes, for example spectra from healthy patients vs. spectra from patients having a particular disease. Machine learning algorithms are needed to (a) iden…

    Submitted 26 November, 2016; v1 submitted 11 June, 2015; originally announced June 2015.

    Journal ref: BMC Bioinform. 18 (2017), 160