Showing 1–29 of 29 results for author: Köhler, M

  1. arXiv:2506.10589  [pdf, ps, other]

    eess.SY math.OC

    Transient performance of MPC for tracking without terminal constraints

    Authors: Nadine Ehmann, Matthias Köhler, Frank Allgöwer

    Abstract: Model predictive control (MPC) for tracking is a recently introduced approach, which extends standard MPC formulations by incorporating an artificial reference as an additional optimization variable, in order to track external and potentially time-varying references. In this work, we analyze the performance of such an MPC for tracking scheme without a terminal cost and terminal constraints. We der…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2504.08489  [pdf, other]

    math.ST cs.LG stat.ML

    Statistically guided deep learning

    Authors: Michael Kohler, Adam Krzyzak

    Abstract: We present a theoretically well-founded deep learning algorithm for nonparametric regression. It uses over-parametrized deep neural networks with logistic activation function, which are fitted to the given data via gradient descent. We propose a special topology of these networks, a special random initialization of the weights, and a data-dependent choice of the learning rate and the number of gra…

    Submitted 11 April, 2025; originally announced April 2025.

    Comments: arXiv admin note: text overlap with arXiv:2504.03405

  3. arXiv:2504.03405  [pdf, ps, other]

    math.ST

    On the rate of convergence of an over-parametrized deep neural network regression estimate learned by gradient descent

    Authors: Michael Kohler

    Abstract: Nonparametric regression with random design is considered. The $L_2$ error with integration with respect to the design measure is used as the error criterion. An over-parametrized deep neural network regression estimate with logistic activation function is defined, where all weights are learned by gradient descent. It is shown that the estimate achieves a nearly optimal rate of con…

    Submitted 4 April, 2025; originally announced April 2025.

  4. arXiv:2501.10853  [pdf, other]

    math.AP math-ph

    Quasiconvex relaxation of planar Biot-type energies and the role of determinant constraints

    Authors: Robert J. Martin, Ionel-Dumitrel Ghiba, Maximilian Köhler, Daniel Balzani, Oliver Sander, Patrizio Neff

    Abstract: We derive the quasiconvex relaxation of the Biot-type energy density $\lVert\sqrt{\operatorname{D}\varphi^T \operatorname{D}\varphi}-I_2\rVert^2$ for planar mappings $\varphi\colon\mathbb{R}^2\to \mathbb{R}^2$ in two different scenarios. First, we consider the case $\operatorname{D}\varphi\in\textrm{GL}^+(2)$, in which the energy can be expressed as the squared Euclidean distance…

    Submitted 18 January, 2025; originally announced January 2025.

    MSC Class: 74A05; 74A60; 74B20; 74G65

  5. arXiv:2404.07128  [pdf, ps, other]

    math.ST

    Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization

    Authors: Michael Kohler, Adam Krzyzak, Alisha Sänger

    Abstract: Image classification from independent and identically distributed random variables is considered. Image classifiers are defined which are based on a linear combination of deep convolutional networks with max-pooling layer. Here all the weights are learned by stochastic gradient descent. A general result is presented which shows that the image classifiers are able to approximate the best possible d…

    Submitted 5 March, 2025; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.17007

  6. arXiv:2312.17007  [pdf, ps, other]

    cs.LG math.ST stat.ML

    On the rate of convergence of an over-parametrized Transformer classifier learned by gradient descent

    Authors: Michael Kohler, Adam Krzyzak

    Abstract: One of the most recent and fascinating breakthroughs in artificial intelligence is ChatGPT, a chatbot which can simulate human conversation. ChatGPT is an instance of GPT4, which is a language model based on generative pre-trained transformers. So if one wants to study from a theoretical point of view, how powerful such artificial intelligence can be, one approach is to consider transformer network…

    Submitted 20 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  7. arXiv:2304.03002  [pdf, ps, other]

    eess.SY math.OC

    Distributed Model Predictive Control for Periodic Cooperation of Multi-Agent Systems

    Authors: Matthias Köhler, Matthias A. Müller, Frank Allgöwer

    Abstract: We consider multi-agent systems with heterogeneous, nonlinear agents subject to individual constraints that want to achieve a periodic, dynamic cooperative control goal which can be characterised by a set and a suitable cost. We propose a sequential distributed model predictive control (MPC) scheme in which agents sequentially solve an individual optimisation problem to track an artificial periodi…

    Submitted 6 April, 2023; originally announced April 2023.

  8. Transient Performance of MPC for Tracking

    Authors: Matthias Köhler, Lisa Krügel, Lars Grüne, Matthias A. Müller, Frank Allgöwer

    Abstract: We analyse the closed-loop performance of a model predictive control (MPC) for tracking formulation with artificial references. It has been shown that such a scheme guarantees closed-loop stability and recursive feasibility for any externally supplied reference, even if it is unreachable or time-varying. The basic idea is to consider an artificial reference as an additional decision variable and t…

    Submitted 24 January, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Journal ref: IEEE Control Systems Letters, vol. 7, pp. 2545-2550, 2023

  9. arXiv:2211.14318  [pdf, other]

    cs.CE math.NA

    Multidimensional rank-one convexification of incremental damage models at finite strains

    Authors: Daniel Balzani, Maximilian Köhler, Timo Neumeier, Malte A. Peter, Daniel Peterseim

    Abstract: This paper presents computationally feasible rank-one relaxation algorithms for the efficient simulation of a time-incremental damage model with nonconvex incremental stress potentials in multiple spatial dimensions. While the standard model suffers from numerical issues due to the lack of convexity, the relaxation by rank-one convexification prevents non-existence of minimizers and mesh dependenc…

    Submitted 9 February, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  10. arXiv:2210.01443  [pdf, ps, other]

    math.ST

    Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent

    Authors: Michael Kohler, Adam Krzyzak

    Abstract: Estimation of a regression function from independent and identically distributed random variables is considered. The $L_2$ error with integration with respect to the design measure is used as an error criterion. Over-parametrized deep neural network estimates are defined where all the weights are learned by the gradient descent. It is shown that the expected $L_2$ error of these estimates converge…

    Submitted 4 October, 2022; originally announced October 2022.

  11. Evolving Microstructures in Relaxed Continuum Damage Mechanics for Strain Softening

    Authors: Maximilian Köhler, Daniel Balzani

    Abstract: A new relaxation approach is proposed which allows for the description of stress- and strain-softening at finite strains. The model is based on the construction of a convex hull replacing the originally non-convex incremental stress potential which in turn represents damage in terms of the classical $(1-D)$ approach. This convex hull is given as the linear convex combination of weakly and strongly…

    Submitted 31 August, 2022; originally announced August 2022.

  12. arXiv:2208.14283  [pdf, ps, other]

    math.ST

    On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent

    Authors: Selina Drews, Michael Kohler

    Abstract: Estimation of a multivariate regression function from independent and identically distributed data is considered. An estimate is defined which fits a deep neural network consisting of a large number of fully connected neural networks, which are computed in parallel, via gradient descent to the data. The estimate is over-parametrized in the sense that the number of its parameters is much larger tha…

    Submitted 30 August, 2022; originally announced August 2022.

  13. Data-driven distributed MPC of dynamically coupled linear systems

    Authors: Matthias Köhler, Julian Berberich, Matthias A. Müller, Frank Allgöwer

    Abstract: In this paper, we present a data-driven distributed model predictive control (MPC) scheme to stabilise the origin of dynamically coupled discrete-time linear systems subject to decoupled input constraints. The local optimisation problems solved by the subsystems rely on a distributed adaptation of the Fundamental Lemma by Willems et al., allowing to parametrise system trajectories using only measu…

    Submitted 11 August, 2023; v1 submitted 25 February, 2022; originally announced February 2022.

    Journal ref: IFAC-PapersOnLine, 55(30), 365-370 (2022)

  14. arXiv:2111.14574  [pdf, other]

    math.ST cs.LG

    On the rate of convergence of a classifier based on a Transformer encoder

    Authors: Iryna Gurevych, Michael Kohler, Gözde Gül Sahin

    Abstract: Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability…

    Submitted 29 November, 2021; originally announced November 2021.

  15. arXiv:2107.09550  [pdf, other]

    math.ST

    Convergence rates for shallow neural networks learned by gradient descent

    Authors: Alina Braun, Michael Kohler, Sophie Langer, Harro Walk

    Abstract: In this paper we analyze the $L_2$ error of neural network regression estimates with one hidden layer. Under the assumption that the Fourier transform of the regression function decays suitably fast, we show that an estimate, where all initial weights are chosen according to proper uniform distributions and where the weights are learned by gradient descent, achieves a rate of convergence of…

    Submitted 18 August, 2023; v1 submitted 20 July, 2021; originally announced July 2021.

    MSC Class: 62G05 (Primary); 62G20 (Secondary)

  16. arXiv:2107.09532  [pdf, ps, other]

    math.ST stat.ML

    Estimation of a regression function on a manifold by fully connected deep neural networks

    Authors: Michael Kohler, Sophie Langer, Ulrich Reif

    Abstract: Estimation of a regression function from independent and identically distributed data is considered. The $L_2$ error with integration with respect to the distribution of the predictor variable is used as the error criterion. The rate of convergence of least squares estimates based on fully connected spaces of deep neural networks with ReLU activation function is analyzed for smooth regression func…

    Submitted 20 July, 2021; originally announced July 2021.

    MSC Class: 62G05 (Primary); 62G20 (Secondary)

  17. arXiv:2012.10113  [pdf, other]

    math.ST

    On the density estimation problem for uncertainty propagation with unknown input distributions

    Authors: Sebastian Kersting, Michael Kohler

    Abstract: In this article we study the problem of quantifying the uncertainty in an experiment with a technical system. We propose new density estimates which combine observed data of the technical system and simulated data from an (imperfect) simulation model based on estimated input distributions. We analyze the rate of convergence of these estimates. The finite sample size performance of the estimates is…

    Submitted 18 December, 2020; originally announced December 2020.

    Comments: 46 pages, 2 figures

  18. arXiv:2011.13602  [pdf, other]

    math.ST

    Statistical theory for image classification using deep convolutional neural networks with cross-entropy loss under the hierarchical max-pooling model

    Authors: Michael Kohler, Sophie Langer

    Abstract: Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to also improve the theoretical understanding of neural networks. Nevertheless, it seems limited when these networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper,…

    Submitted 29 April, 2024; v1 submitted 27 November, 2020; originally announced November 2020.

    Comments: arXiv admin note: text overlap with arXiv:2003.01526

  19. arXiv:2003.02088  [pdf, other]

    cs.MS math.NA

    Matrix Equations, Sparse Solvers: M-M.E.S.S.-2.0.1 -- Philosophy, Features and Application for (Parametric) Model

    Authors: Peter Benner, Martin Köhler, Jens Saak

    Abstract: Matrix equations are omnipresent in (numerical) linear algebra and systems theory. Especially in model order reduction (MOR) they play a key role in many balancing based reduction methods for linear dynamical systems. When these systems arise from spatial discretizations of evolutionary partial differential equations, their coefficient matrices are typically large and sparse. Moreover, the numbers…

    Submitted 9 May, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

    Comments: 18 pages, 4 figures, 5 tables

  20. arXiv:1912.05436  [pdf, ps, other]

    math.ST

    Analysis of the rate of convergence of neural network regression estimates which are easy to implement

    Authors: Alina Braun, Michael Kohler, Adam Krzyzak

    Abstract: Recent results in nonparametric regression show that for deep learning, i.e., for neural network estimates with many hidden layers, we are able to achieve good rates of convergence even in case of high-dimensional predictor variables, provided suitable assumptions on the structure of the regression function are imposed. The estimates are defined by minimizing the empirical $L_2$ risk over a class…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1912.03921

  21. arXiv:1912.03925  [pdf, ps, other]

    math.ST

    Over-parametrized deep neural networks do not generalize well

    Authors: Michael Kohler, Adam Krzyzak

    Abstract: Recently it was shown in several papers that backpropagation is able to find the global minimum of the empirical risk on the training data using over-parametrized deep neural networks. In this paper a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not genera…

    Submitted 14 January, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

  22. arXiv:1912.03921  [pdf, ps, other]

    math.ST

    On the rate of convergence of a neural network regression estimate learned by gradient descent

    Authors: Alina Braun, Michael Kohler, Harro Walk

    Abstract: Nonparametric regression with random design is considered. Estimates are defined by minimizing a penalized empirical $L_2$ risk over a suitably chosen class of neural networks with one hidden layer via gradient descent. Here, the gradient descent procedure is repeated several times with randomly chosen starting values for the weights, and from the list of constructed estimates the one with the mini…

    Submitted 9 December, 2019; originally announced December 2019.

  23. arXiv:1908.11140  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Estimation of a function of low local dimensionality by deep neural networks

    Authors: Michael Kohler, Adam Krzyzak, Sophie Langer

    Abstract: Deep neural networks (DNNs) achieve impressive results for complicated tasks like object detection on images and speech recognition. Motivated by this practical success, there is now a strong interest in showing good theoretical properties of DNNs. To describe for which tasks DNNs perform well and when they fail, it is a key challenge to understand their performance. The aim of this paper is to co…

    Submitted 15 June, 2020; v1 submitted 29 August, 2019; originally announced August 2019.

  24. arXiv:1908.11133  [pdf, ps, other]

    stat.ML cs.LG math.ST

    On the rate of convergence of fully connected very deep neural network regression estimates

    Authors: Michael Kohler, Sophie Langer

    Abstract: Recent results in nonparametric regression show that deep learning, i.e., neural network estimates with many hidden layers, are able to circumvent the so-called curse of dimensionality in case that suitable restrictions on the structure of the regression function hold. One key feature of the neural networks used in these results is that their network architecture has a further constraint, namely t…

    Submitted 29 September, 2020; v1 submitted 29 August, 2019; originally announced August 2019.

    MSC Class: Primary 62G08; secondary 41A25; 82C32

  25. arXiv:1705.05933  [pdf, other]

    cs.LG math.OC stat.ML

    Sub-sampled Cubic Regularization for Non-convex Optimization

    Authors: Jonas Moritz Kohler, Aurelien Lucchi

    Abstract: We consider the minimization of non-convex functions that typically arise in machine learning. Specifically, we focus our attention on a variant of trust region methods known as cubic regularization. This approach is particularly attractive because it escapes strict saddle points and it provides stronger convergence guarantees than first- and second-order as well as classical trust region methods.…

    Submitted 1 July, 2017; v1 submitted 16 May, 2017; originally announced May 2017.

    Comments: Proceedings of the 34th International Conference on Machine Learning

  26. On data-based optimal stopping under stationarity and ergodicity

    Authors: Michael Kohler, Harro Walk

    Abstract: The problem of optimal stopping with finite horizon in discrete time is considered in view of maximizing the expected gain. The algorithm proposed in this paper is completely nonparametric in the sense that it uses observed data from the past of the process up to time $-n+1$, $n\in\mathbb{N}$, not relying on any specific model assumption. Kernel regression estimation of conditional expectations an…

    Submitted 23 July, 2013; originally announced July 2013.

    Comments: Published at http://dx.doi.org/10.3150/12-BEJ439 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Report number: IMS-BEJ-BEJ439

    Journal ref: Bernoulli 2013, Vol. 19, No. 3, 931-953

  27. arXiv:1203.5952  [pdf, other]

    math.OC math.DS

    Optimal $RH_2$-- and $RH_\infty$--Approximation of Unstable Descriptor Systems

    Authors: Marcus Köhler

    Abstract: Stability preservation is an important topic in the approximation of systems, e.g. model reduction. If the original system is stable, we often want the approximation to be stable. But even if an algorithm preserves stability, the resulting system could be unstable in practice because of round-off errors. Our approach is to approximate this unstable reduced system by a stable system. More precisely, we co…

    Submitted 1 August, 2012; v1 submitted 27 March, 2012; originally announced March 2012.

  28. arXiv:1101.5702  [pdf, ps, other]

    math.OA math.KT

    Universal coefficient theorems for C*-algebras over finite topological spaces

    Authors: Rasmus Bentmann, Manuel Köhler

    Abstract: We determine the class of finite T_0-spaces allowing for a universal coefficient theorem computing equivariant KK-theory by filtrated K-theory.

    Submitted 7 April, 2011; v1 submitted 29 January, 2011; originally announced January 2011.

    Comments: 50 pages, 6 figures; several minor changes. arXiv admin note: text overlap with arXiv:0810.0096 by other authors

    Report number: CPH-SYM-00

    MSC Class: 19K35 (Primary) 46L35; 46L80; 46M18; 46M20 (Secondary)

  29. A dynamic look-ahead Monte Carlo algorithm for pricing Bermudan options

    Authors: Daniel Egloff, Michael Kohler, Nebojsa Todorovic

    Abstract: Under the assumption of no-arbitrage, the pricing of American and Bermudan options can be cast as optimal stopping problems. We propose a new adaptive simulation-based algorithm for the numerical solution of optimal stopping problems in discrete time. Our approach is to recursively compute the so-called continuation values. They are defined as regression functions of the cash flow, which wou…

    Submitted 19 October, 2007; originally announced October 2007.

    Comments: Published at http://dx.doi.org/10.1214/105051607000000249 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AAP-AAP436

    MSC Class: 91B28; 60G40; 93E20 (Primary) 65C05; 93E24; 62G05 (Secondary)

    Journal ref: Annals of Applied Probability 2007, Vol. 17, No. 4, 1138-1171