Showing 1–4 of 4 results for author: Herberg, E

  1. arXiv:2311.15995

    cs.LG math.OC

    Sensitivity-Based Layer Insertion for Residual and Feedforward Neural Networks

    Authors: Evelyn Herberg, Roland Herzog, Frederik Köhne, Leonie Kreis, Anton Schiela

    Abstract: The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic method to insert new layers during the training process, which eliminates the need to choose a fixed network size before training. Our technique borrows techniques from constrained optimization and is based on first-order sensitivity information of the objective with respec…

    Submitted 27 November, 2023; originally announced November 2023.

  2. arXiv:2306.16111

    cs.LG math.OC

    Time Regularization in Optimal Time Variable Learning

    Authors: Evelyn Herberg, Roland Herzog, Frederik Köhne

    Abstract: Recently, optimal time variable learning in deep neural networks (DNNs) was introduced in arXiv:2204.08528. In this manuscript we extend the concept by introducing a regularization term that directly relates to the time horizon in discrete dynamical systems. Furthermore, we propose an adaptive pruning approach for Residual Neural Networks (ResNets), which reduces network complexity without comprom…

    Submitted 6 December, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

  3. arXiv:2304.05133

    cs.LG math.OC

    Lecture Notes: Neural Network Architectures

    Authors: Evelyn Herberg

    Abstract: These lecture notes provide an overview of Neural Network architectures from a mathematical point of view. Especially, Machine Learning with Neural Networks is seen as an optimization problem. Covered are an introduction to Neural Networks and the following architectures: Feedforward Neural Network, Convolutional Neural Network, ResNet, and Recurrent Neural Network.

    Submitted 18 April, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: added more references

    MSC Class: 68T07

  4. arXiv:2204.08528

    math.OC cs.LG math.NA

    An Optimal Time Variable Learning Framework for Deep Neural Networks

    Authors: Harbir Antil, Hugo Díaz, Evelyn Herberg

    Abstract: Feature propagation in Deep Neural Networks (DNNs) can be associated to nonlinear discrete dynamical systems. The novelty, in this paper, lies in letting the discretization parameter (time step-size) vary from layer to layer, which needs to be learned, in an optimization framework. The proposed framework can be applied to any of the existing networks such as ResNet, DenseNet or Fractional-DNN. Thi…

    Submitted 18 April, 2022; originally announced April 2022.

    MSC Class: 34A08; 49J15; 68T05; 82C32