-
Learning mechanical systems from real-world data using discrete forced Lagrangian dynamics
Authors:
Martine Dyring Hansen,
Elena Celledoni,
Benjamin Kwanen Tapley
Abstract:
We introduce a data-driven method for learning the equations of motion of mechanical systems directly from position measurements, without requiring access to velocity data. This is particularly relevant in system identification tasks where only positional information is available, such as motion capture, pixel data or low-resolution tracking. Our approach takes advantage of the discrete Lagrange-d'Alembert principle and the forced discrete Euler-Lagrange equations to construct a physically grounded model of the system's dynamics. We decompose the dynamics into conservative and non-conservative components, which are learned separately using feed-forward neural networks. In the absence of external forces, our method reduces to a variational discretization of the action principle naturally preserving the symplectic structure of the underlying Hamiltonian system. We validate our approach on a variety of synthetic and real-world datasets, demonstrating its effectiveness compared to baseline methods. In particular, we apply our model to (1) measured human motion data and (2) latent embeddings obtained via an autoencoder trained on image sequences. We demonstrate that we can faithfully reconstruct and separate both the conservative and forced dynamics, yielding interpretable and physically consistent predictions.
Submitted 26 May, 2025;
originally announced May 2025.
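For orientation, the forced discrete Euler-Lagrange equations named in the abstract above are usually written in the following textbook form. This is a sketch of the standard formulation, not a formula taken from the paper: the symbols $L_d$ (discrete Lagrangian), $f_d^{\pm}$ (discrete forces), $D_i$ (derivative with respect to the $i$-th argument) and the index range are assumed notation, and the paper's own discretisation may differ.

```latex
% Forced discrete Euler-Lagrange equations (standard textbook form; notation assumed).
% Only positions q_{k-1}, q_k, q_{k+1} appear, so no velocity data is needed.
D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1})
  + f_d^{+}(q_{k-1}, q_k) + f_d^{-}(q_k, q_{k+1}) = 0,
\qquad k = 1, \dots, N-1.
```

Setting $f_d^{\pm} = 0$ recovers the unforced discrete Euler-Lagrange equations, which matches the abstract's remark that the method reduces to a variational discretization when external forces are absent.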
-
Approximation properties of neural ODEs
Authors:
Arturo De Marinis,
Davide Murari,
Elena Celledoni,
Nicola Guglielmi,
Brynjulf Owren,
Francesco Tudisco
Abstract:
We study the approximation properties of shallow neural networks whose activation function is defined as the flow of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters are required to satisfy some constraints. In particular, we constrain the Lipschitz constant of the flow of the neural ODE to increase the stability of the shallow neural network, and we restrict the norm of the weight matrices of the linear layers to one to make sure that the restricted expansivity of the flow is not compensated by the increased expansivity of the linear layers. For this setting, we prove approximation bounds that tell us the accuracy to which we can approximate a continuous function with a shallow neural network with such constraints. We prove that the UAP holds if we consider only the constraint on the Lipschitz constant of the flow or the unit norm constraint on the weight matrices of the linear layers.
Submitted 19 March, 2025;
originally announced March 2025.
-
Designing Stable Neural Networks using Convex Analysis and ODEs
Authors:
Ferdia Sherry,
Elena Celledoni,
Matthias J. Ehrhardt,
Davide Murari,
Brynjulf Owren,
Carola-Bibiane Schönlieb
Abstract:
Motivated by classical work on the numerical integration of ordinary differential equations we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.
Submitted 18 April, 2024; v1 submitted 29 June, 2023;
originally announced June 2023.
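To make the non-expansiveness claim in the abstract above concrete, here is a minimal NumPy sketch of one well-known way to build a 1-Lipschitz residual layer: an explicit gradient-descent step on a convex potential. The layer form `x - tau * W.T @ relu(W @ x + b)`, the step size `tau = 2 / ||W||_2^2`, the ReLU activation and the function name `nonexpansive_layer` are all assumptions made for illustration; the abstract does not spell out the paper's exact layer.

```python
import numpy as np

def nonexpansive_layer(x, W, b):
    """Hypothetical residual layer: x -> x - tau * W^T relu(W x + b).

    This is a gradient step on the convex function
    f(x) = 0.5 * sum(relu(W x + b)**2), whose gradient is Lipschitz with
    constant ||W||_2^2, so any tau in [0, 2 / ||W||_2^2] gives a
    1-Lipschitz map.
    """
    spectral_norm = np.linalg.norm(W, ord=2)  # largest singular value of W
    tau = 2.0 / spectral_norm**2
    return x - tau * W.T @ np.maximum(W @ x + b, 0.0)

# Empirical check on random input pairs: the output distance should never
# exceed the input distance.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 5))
b = rng.standard_normal(8)

ratios = []
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    out_dist = np.linalg.norm(
        nonexpansive_layer(x, W, b) - nonexpansive_layer(y, W, b)
    )
    ratios.append(out_dist / np.linalg.norm(x - y))
max_ratio = max(ratios)
```

Composing such layers keeps the whole network 1-Lipschitz, whereas a plain residual layer `x + W2 @ relu(W1 @ x + b)` only admits the bound `1 + ||W1|| * ||W2||` per layer, which compounds with depth.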
-
Learning Dynamical Systems from Noisy Data with Inverse-Explicit Integrators
Authors:
Håkon Noren,
Sølve Eidnes,
Elena Celledoni
Abstract:
We introduce the mean inverse integrator (MII), a novel approach to increase the accuracy when training neural networks to approximate vector fields of dynamical systems from noisy data. This method can be used to average multiple trajectories obtained by numerical integrators such as Runge-Kutta methods. We show that the class of mono-implicit Runge-Kutta methods (MIRK) has particular advantages when used in connection with MII. When training vector field approximations, explicit expressions for the loss functions are obtained when inserting the training data in the MIRK formulae, unlocking symmetric and high-order integrators that would otherwise be implicit for initial value problems. The combined approach of applying MIRK within MII yields a significantly lower error compared to the plain use of the numerical integrator without averaging the trajectories. This is demonstrated with experiments using data from several (chaotic) Hamiltonian systems. Additionally, we perform a sensitivity analysis of the loss functions under normally distributed perturbations, supporting the favorable performance of MII.
Submitted 6 June, 2023;
originally announced June 2023.
-
Predictions Based on Pixel Data: Insights from PDEs and Finite Differences
Authors:
Elena Celledoni,
James Jackaman,
Davide Murari,
Brynjulf Owren
Abstract:
As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is (residual) convolutional networks. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
Submitted 21 June, 2024; v1 submitted 1 May, 2023;
originally announced May 2023.
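The link between discrete convolution and finite difference operators mentioned in the abstract above can be illustrated in one dimension. The sketch below is not from the paper: the grid, the test function and the function names are assumptions chosen for illustration. It shows that the second-order central difference stencil is the same operation as a convolution with the kernel `[1, -2, 1] / h**2`.

```python
import numpy as np

def laplacian_fd(u, h):
    """Second-order central difference of u at the interior grid points."""
    return (u[:-2] - 2.0 * u[1:-1] + u[2:]) / h**2

def laplacian_conv(u, h):
    """The same stencil applied as a discrete convolution (no padding)."""
    kernel = np.array([1.0, -2.0, 1.0]) / h**2
    return np.convolve(u, kernel, mode="valid")

# Uniform grid on [0, 1] and a smooth test function.
x = np.linspace(0.0, 1.0, 101)
h = x[1] - x[0]
u = np.sin(2.0 * np.pi * x)

fd = laplacian_fd(u, h)
cv = laplacian_conv(u, h)

# The two discrete operators agree up to floating-point rounding.
max_diff = np.max(np.abs(fd - cv))

# Both approximate the exact second derivative with an O(h^2) error.
exact = -(2.0 * np.pi) ** 2 * u[1:-1]
max_err = np.max(np.abs(cv - exact))
```

A convolutional layer with this fixed kernel therefore reproduces the spatial part of a method-of-lines discretisation of the heat equation exactly, which is the kind of correspondence the abstract refers to.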
-
Dynamical systems' based neural networks
Authors:
Elena Celledoni,
Davide Murari,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.
Submitted 31 August, 2023; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Deep neural networks on diffeomorphism groups for optimal shape reparameterization
Authors:
Elena Celledoni,
Helge Glöckner,
Jørgen Riseth,
Alexander Schmeding
Abstract:
One of the fundamental problems in shape analysis is to align curves or surfaces before computing geodesic distances between their shapes. Finding the optimal reparametrization realizing this alignment is a computationally demanding task, typically done by solving an optimization problem on the diffeomorphism group. In this paper, we propose an algorithm for constructing approximations of orientation-preserving diffeomorphisms by composition of elementary diffeomorphisms. The algorithm is implemented using PyTorch, and is applicable for both unparametrized curves and surfaces. Moreover, we show universal approximation properties for the constructed architectures, and obtain bounds for the Lipschitz constants of the resulting diffeomorphisms.
Submitted 30 August, 2023; v1 submitted 22 July, 2022;
originally announced July 2022.
-
Learning Hamiltonians of constrained mechanical systems
Authors:
Elena Celledoni,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.
Submitted 27 June, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Equivariant neural networks for inverse problems
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of $\mathbf R^d$, but other examples are the scaling symmetry of $\mathbf R^d$ and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with a little extra computational cost at training time but without any extra cost at test time.
Submitted 23 February, 2021;
originally announced February 2021.
-
Structure preserving deep learning
Authors:
Elena Celledoni,
Matthias J. Ehrhardt,
Christian Etmann,
Robert I McLachlan,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.
Submitted 5 June, 2020;
originally announced June 2020.
-
Signatures in Shape Analysis: an Efficient Approach to Motion Identification
Authors:
Elena Celledoni,
Pål Erik Lystad,
Nikolas Tapia
Abstract:
Signatures provide a succinct description of certain features of paths in a reparametrization invariant way. We propose a method for classifying shapes based on signatures, and compare it to current approaches based on the SRV transform and dynamic programming.
Submitted 14 June, 2019;
originally announced June 2019.
-
Deep learning as optimal control problems: models and numerical methods
Authors:
Martin Benning,
Elena Celledoni,
Matthias J. Ehrhardt,
Brynjulf Owren,
Carola-Bibiane Schönlieb
Abstract:
We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
Submitted 30 September, 2019; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Shape Analysis on Lie Groups with Applications in Computer Animation
Authors:
Elena Celledoni,
Markus Eslitzbichler,
Alexander Schmeding
Abstract:
Shape analysis methods have in the past few years become very popular, both for theoretical exploration as well as from an application point of view. Originally developed for planar curves, these methods have been expanded to higher dimensional curves, surfaces, activities, character motions and many other objects. In this paper, we develop a framework for shape analysis of curves in Lie groups for problems of computer animations. In particular, we will use these methods to find cyclic approximations of non-cyclic character animations and interpolate between existing animations to generate new ones.
Submitted 19 May, 2016; v1 submitted 2 June, 2015;
originally announced June 2015.