-
Approximation theory for 1-Lipschitz ResNets
Authors:
Davide Murari,
Takashi Furuya,
Carola-Bibiane Schönlieb
Abstract:
1-Lipschitz neural networks are fundamental for generative modelling, inverse problems, and robust classifiers. In this paper, we focus on 1-Lipschitz residual networks (ResNets) based on explicit Euler steps of negative gradient flows and study their approximation capabilities. Leveraging the Restricted Stone-Weierstrass Theorem, we first show that these 1-Lipschitz ResNets are dense in the set of scalar 1-Lipschitz functions on any compact domain when width and depth are allowed to grow. We also show that these networks can exactly represent scalar piecewise affine 1-Lipschitz functions. We then prove a stronger statement: by inserting norm-constrained linear maps between the residual blocks, the same density holds when the hidden width is fixed. Because every layer obeys simple norm constraints, the resulting models can be trained with off-the-shelf optimisers. This paper provides the first universal approximation guarantees for 1-Lipschitz ResNets, laying a rigorous foundation for their practical use.
Submitted 17 May, 2025;
originally announced May 2025.
-
From Euler-Jacobi to Bogoyavlensky and back
Authors:
Davide Murari,
Nicola Sansonetto
Abstract:
This work focuses on two notions of non-Hamiltonian integrable systems: B-integrability and Euler-Jacobi integrability. We first show that the first notion is stronger. We then investigate which possible "non-evident" properties one can add to the Euler-Jacobi Theorem to make the dynamics B-integrable.
Submitted 27 March, 2025;
originally announced March 2025.
-
Approximation properties of neural ODEs
Authors:
Arturo De Marinis,
Davide Murari,
Elena Celledoni,
Nicola Guglielmi,
Brynjulf Owren,
Francesco Tudisco
Abstract:
We study the approximation properties of shallow neural networks whose activation function is defined as the flow of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters are required to satisfy some constraints. In particular, we constrain the Lipschitz constant of the flow of the neural ODE to increase the stability of the shallow neural network, and we restrict the norm of the weight matrices of the linear layers to one to make sure that the restricted expansivity of the flow is not compensated by the increased expansivity of the linear layers. For this setting, we prove approximation bounds that tell us the accuracy to which we can approximate a continuous function with a shallow neural network with such constraints. We prove that the UAP holds if we consider only the constraint on the Lipschitz constant of the flow or the unit norm constraint on the weight matrices of the linear layers.
Submitted 19 March, 2025;
originally announced March 2025.
-
Hamiltonian Matching for Symplectic Neural Integrators
Authors:
Priscilla Canizares,
Davide Murari,
Carola-Bibiane Schönlieb,
Ferdia Sherry,
Zakhar Shumaylov
Abstract:
Hamilton's equations of motion form a fundamental framework in various branches of physics, including astronomy, quantum mechanics, particle physics, and climate science. Classical numerical solvers are typically employed to compute the time evolution of these systems. However, when the system spans multiple spatial and temporal scales, numerical errors can accumulate, leading to reduced accuracy. To address the challenges of evolving such systems over long timescales, we propose SympFlow, a novel neural network-based symplectic integrator, which is the composition of a sequence of exact flow maps of parametrised time-dependent Hamiltonian functions. This architecture allows for a backward error analysis: we can identify an underlying Hamiltonian function of the architecture and use it to define a Hamiltonian matching objective function, which we use for training. In numerical experiments, we show that SympFlow exhibits promising results, with qualitative energy conservation behaviour similar to that of time-stepping symplectic integrators.
Submitted 23 October, 2024;
originally announced October 2024.
-
Parallel-in-Time Solutions with Random Projection Neural Networks
Authors:
Marta M. Betcke,
Lisa Maria Kreusser,
Davide Murari
Abstract:
This paper considers one of the fundamental parallel-in-time methods for the solution of ordinary differential equations, Parareal, and extends it by adopting a neural network as a coarse propagator. We provide a theoretical analysis of the convergence properties of the proposed algorithm and show its effectiveness for several examples, including Lorenz and Burgers' equations. In our numerical simulations, we further specialize the underpinning neural architecture to Random Projection Neural Networks (RPNNs), two-layer neural networks in which the first-layer weights are drawn at random rather than optimized. This restriction substantially increases the efficiency of fitting the RPNN weights compared with a standard feedforward network, without negatively impacting the accuracy, as demonstrated in the SIR system example.
Submitted 19 August, 2024;
originally announced August 2024.
-
Neural networks for the approximation of Euler's elastica
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Sigrid Leyendecker,
Davide Murari,
Brynjulf Owren,
Rodrigo T. Sato Martín de Almagro,
Martina Stavole
Abstract:
Euler's elastica is a classical model of flexible slender structures, relevant in many industrial applications. Static equilibrium equations can be derived via a variational principle. The accurate approximation of solutions of this problem can be challenging due to nonlinearity and constraints. We present two neural-network-based approaches for the simulation of Euler's elastica. Starting from a data set of solutions of the discretised static equilibria, we train the neural networks to produce solutions for unseen boundary conditions. We present a $\textit{discrete}$ approach that learns discrete solutions from the discrete data. We then consider a $\textit{continuous}$ approach that uses the same training data set but learns continuous solutions to the problem. We present numerical evidence that the proposed neural networks can effectively approximate configurations of the planar Euler's elastica for a range of different boundary conditions.
Submitted 4 June, 2024; v1 submitted 1 December, 2023;
originally announced December 2023.
-
Predictions Based on Pixel Data: Insights from PDEs and Finite Differences
Authors:
Elena Celledoni,
James Jackaman,
Davide Murari,
Brynjulf Owren
Abstract:
As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is the (residual) convolutional network. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with the approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
Submitted 21 June, 2024; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Dynamical systems' based neural networks
Authors:
Elena Celledoni,
Davide Murari,
Brynjulf Owren,
Carola-Bibiane Schönlieb,
Ferdia Sherry
Abstract:
Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.
Submitted 31 August, 2023; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Learning Hamiltonians of constrained mechanical systems
Authors:
Elena Celledoni,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.
Submitted 27 June, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Dynamics of the N-fold Pendulum in the framework of Lie Group Integrators
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Since their introduction, Lie group integrators have become a method of choice in many application areas. Various formulations of these integrators exist, and in this work we focus on Runge–Kutta–Munthe-Kaas methods. First, we briefly introduce this class of integrators, considering some of the practical aspects of their implementation, such as adaptive time stepping. We then present some mathematical background that allows us to apply them to some families of Lagrangian mechanical systems. We conclude with an application to a nontrivial mechanical system: the N-fold 3D pendulum.
Submitted 25 September, 2021;
originally announced September 2021.
-
Lie Group integrators for mechanical systems
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Since they were introduced in the 1990s, Lie group integrators have become a method of choice in many application areas. These include multibody dynamics, shape analysis, data science, image registration and biophysical simulations. Two important classes of intrinsic Lie group integrators are the Runge–Kutta–Munthe-Kaas methods and the commutator-free Lie group integrators.
We give a short introduction to these classes of methods. The Hamiltonian framework is attractive for many mechanical problems, and in particular we shall consider Lie group integrators for problems on cotangent bundles of Lie groups where a number of different formulations are possible. There is a natural symplectic structure on such manifolds and through variational principles one may derive symplectic Lie group integrators. We also consider the practical aspects of the implementation of Lie group integrators, such as adaptive time stepping. The theory is illustrated by applying the methods to two nontrivial applications in mechanics. One is the N-fold spherical pendulum where we introduce the restriction of the adjoint action of the group $SE(3)$ to $TS^2$, the tangent bundle of the two-dimensional sphere. Finally, we show how Lie group integrators can be applied to model the controlled path of a payload being transported by two rotors. This problem is modeled on $\mathbb{R}^6\times \left(SO(3)\times \mathfrak{so}(3)\right)^2\times (TS^2)^2$ and put in a format where Lie group integrators can be applied.
Submitted 14 October, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.