Search | arXiv e-print repository

arXiv:2505.12003 [pdf, ps, other]

Approximation theory for 1-Lipschitz ResNets

Authors: Davide Murari, Takashi Furuya, Carola-Bibiane Schönlieb

Abstract: 1-Lipschitz neural networks are fundamental for generative modelling, inverse problems, and robust classifiers. In this paper, we focus on 1-Lipschitz residual networks (ResNets) based on explicit Euler steps of negative gradient flows and study their approximation capabilities. Leveraging the Restricted Stone-Weierstrass Theorem, we first show that these 1-Lipschitz ResNets are dense in the set of scalar 1-Lipschitz functions on any compact domain when width and depth are allowed to grow. We also show that these networks can exactly represent scalar piecewise affine 1-Lipschitz functions. We then prove a stronger statement: by inserting norm-constrained linear maps between the residual blocks, the same density holds when the hidden width is fixed. Because every layer obeys simple norm constraints, the resulting models can be trained with off-the-shelf optimisers. This paper provides the first universal approximation guarantees for 1-Lipschitz ResNets, laying a rigorous foundation for their practical use.

Submitted 17 May, 2025; originally announced May 2025.

MSC Class: 68T07
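
As a rough illustration of the kind of layer this abstract refers to, the sketch below implements one explicit Euler step of a negative gradient flow, x - h W^T relu(W x + b), and checks non-expansiveness numerically. The specific layer form, the step-size rule h * ||W||_2^2 <= 2, and all names (`residual_step`, `worst_ratio`) are assumptions made for this example and are not taken from the listing; the Frobenius norm is used as a cheap upper bound on the spectral norm.

```python
import math
import random

def relu(v):
    return [max(0.0, t) for t in v]

def matvec(W, x):
    return [sum(w * t for w, t in zip(row, x)) for row in W]

def mat_t_vec(W, z):
    # Computes W^T z for a matrix stored as a list of rows.
    return [sum(W[i][j] * z[i] for i in range(len(W))) for j in range(len(W[0]))]

def residual_step(x, W, b, h):
    """One explicit Euler step of x' = -W^T relu(W x + b) (assumed layer form)."""
    z = relu([a + c for a, c in zip(matvec(W, x), b)])
    return [xi - h * gi for xi, gi in zip(x, mat_t_vec(W, z))]

def dist(x, y):
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, y)))

rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
b = [rng.gauss(0, 1) for _ in range(5)]

# ||W||_2 <= ||W||_F, so this step size guarantees h * ||W||_2^2 <= 2.
frob_sq = sum(w * w for row in W for w in row)
h = 2.0 / frob_sq

# Empirical check: the step never increases the distance between two inputs.
worst_ratio = 0.0
for _ in range(1000):
    x = [rng.gauss(0, 1) for _ in range(3)]
    y = [rng.gauss(0, 1) for _ in range(3)]
    ratio = dist(residual_step(x, W, b, h), residual_step(y, W, b, h)) / dist(x, y)
    worst_ratio = max(worst_ratio, ratio)
```

Composing several such steps keeps the overall map 1-Lipschitz, since a composition of non-expansive maps is non-expansive.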

arXiv:2503.21950 [pdf, ps, other]

From Euler-Jacobi to Bogoyavlensky and back

Authors: Davide Murari, Nicola Sansonetto

Abstract: This work focuses on two notions of non-Hamiltonian integrable systems: B-integrability and Euler-Jacobi integrability. We first show that the first notion is stronger. We then investigate which possible "non-evident" properties one can add to the Euler-Jacobi Theorem to make the dynamics B-integrable.

Submitted 27 March, 2025; originally announced March 2025.

MSC Class: 37J35; 37J60; 37J15; 70H06

arXiv:2503.17797 [pdf, ps, other]

Enhancing Fourier Neural Operators with Local Spatial Features

Authors: Chaoyu Liu, Davide Murari, Lihao Liu, Yangming Li, Chris Budd, Carola-Bibiane Schönlieb

Abstract: Partial Differential Equation (PDE) problems often exhibit strong local spatial structures, and effectively capturing these structures is critical for approximating their solutions. Recently, the Fourier Neural Operator (FNO) has emerged as an efficient approach for solving these PDE problems. By using parametrization in the frequency domain, FNOs can efficiently capture global patterns. However, this approach inherently overlooks the critical role of local spatial features, as frequency-domain parameterized convolutions primarily emphasize global interactions without encoding comprehensive localized spatial dependencies. Although several studies have attempted to address this limitation, their extracted Local Spatial Features (LSFs) remain insufficient, and computational efficiency is often compromised. To address this limitation, we introduce a convolutional neural network (CNN)-based feature pre-extractor to capture LSFs directly from input data, resulting in a hybrid architecture termed Conv-FNO. Furthermore, we introduce two novel resizing schemes to make our Conv-FNO resolution invariant. In this work, we focus on demonstrating the effectiveness of incorporating LSFs into FNOs by conducting both a theoretical analysis and extensive numerical experiments. Our findings show that this simple yet impactful modification enhances the representational capacity of FNOs and significantly improves performance on challenging PDE benchmarks.

Submitted 3 June, 2025; v1 submitted 22 March, 2025; originally announced March 2025.

arXiv:2503.15696 [pdf, other]

Approximation properties of neural ODEs

Authors: Arturo De Marinis, Davide Murari, Elena Celledoni, Nicola Guglielmi, Brynjulf Owren, Francesco Tudisco

Abstract: We study the approximation properties of shallow neural networks whose activation function is defined as the flow of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters are required to satisfy some constraints. In particular, we constrain the Lipschitz constant of the flow of the neural ODE to increase the stability of the shallow neural network, and we restrict the norm of the weight matrices of the linear layers to one to make sure that the restricted expansivity of the flow is not compensated by the increased expansivity of the linear layers. For this setting, we prove approximation bounds that tell us the accuracy to which we can approximate a continuous function with a shallow neural network with such constraints. We prove that the UAP holds if we consider only the constraint on the Lipschitz constant of the flow or the unit norm constraint on the weight matrices of the linear layers.

Submitted 19 March, 2025; originally announced March 2025.

Comments: 30 pages, 11 figures, 2 tables

arXiv:2412.16787 [pdf, other]

Symplectic Neural Flows for Modeling and Discovery

Authors: Priscilla Canizares, Davide Murari, Carola-Bibiane Schönlieb, Ferdia Sherry, Zakhar Shumaylov

Abstract: Hamilton's equations are fundamental for modeling complex physical systems, where preserving key properties such as energy and momentum is crucial for reliable long-term simulations. Geometric integrators are widely used for this purpose, but neural network-based methods that incorporate these principles remain underexplored. This work introduces SympFlow, a time-dependent symplectic neural network designed using parameterized Hamiltonian flow maps. This design allows for backward error analysis and ensures the preservation of the symplectic structure. SympFlow allows for two key applications: (i) providing a time-continuous symplectic approximation of the exact flow of a Hamiltonian system, purely based on the differential equations it satisfies, and (ii) approximating the flow map of an unknown Hamiltonian system relying on trajectory data. We demonstrate the effectiveness of SympFlow on diverse problems, including chaotic and dissipative systems, showing improved energy conservation compared to general-purpose numerical methods and accurate…

Submitted 21 December, 2024; originally announced December 2024.

Comments: 26 pages, 14 figures

arXiv:2410.18262 [pdf, other]

Hamiltonian Matching for Symplectic Neural Integrators

Authors: Priscilla Canizares, Davide Murari, Carola-Bibiane Schönlieb, Ferdia Sherry, Zakhar Shumaylov

Abstract: Hamilton's equations of motion form a fundamental framework in various branches of physics, including astronomy, quantum mechanics, particle physics, and climate science. Classical numerical solvers are typically employed to compute the time evolution of these systems. However, when the system spans multiple spatial and temporal scales, numerical errors can accumulate, leading to reduced accuracy. To address the challenges of evolving such systems over long timescales, we propose SympFlow, a novel neural network-based symplectic integrator, which is the composition of a sequence of exact flow maps of parametrised time-dependent Hamiltonian functions. This architecture allows for a backward error analysis: we can identify an underlying Hamiltonian function of the architecture and use it to define a Hamiltonian matching objective function, which we use for training. In numerical experiments, we show that SympFlow exhibits promising results, with qualitative energy conservation behaviour similar to that of time-stepping symplectic integrators.

Submitted 23 October, 2024; originally announced October 2024.

Comments: NeurReps 2024

arXiv:2408.09756 [pdf, other]

Parallel-in-Time Solutions with Random Projection Neural Networks

Authors: Marta M. Betcke, Lisa Maria Kreusser, Davide Murari

Abstract: This paper considers one of the fundamental parallel-in-time methods for the solution of ordinary differential equations, Parareal, and extends it by adopting a neural network as a coarse propagator. We provide a theoretical analysis of the convergence properties of the proposed algorithm and show its effectiveness for several examples, including Lorenz and Burgers' equations. In our numerical simulations, we further specialize the underpinning neural architecture to Random Projection Neural Networks (RPNNs), a 2-layer neural network where the first layer weights are drawn at random rather than optimized. This restriction substantially increases the efficiency of fitting RPNN's weights in comparison to a standard feedforward network without negatively impacting the accuracy, as demonstrated in the SIR system example.

Submitted 19 August, 2024; originally announced August 2024.
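
For readers unfamiliar with Parareal, the following is a minimal sketch of the classical iteration on the scalar test problem y' = -y. It is not the algorithm of the paper above: the coarse propagator here is a single explicit Euler step standing in for the neural network the abstract describes, and the function names (`coarse`, `fine`, `parareal`) and the test problem are assumptions chosen for illustration.

```python
def coarse(y, dt, lam=-1.0):
    """Cheap propagator: one explicit Euler step for y' = lam * y."""
    return y + dt * lam * y

def fine(y, dt, lam=-1.0, substeps=100):
    """Accurate propagator: many small explicit Euler steps over one time slice."""
    h = dt / substeps
    for _ in range(substeps):
        y = y + h * lam * y
    return y

def parareal(y0, t_end, n_slices, n_iters):
    dt = t_end / n_slices
    # Initial guess from a serial sweep of the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], dt))
    for _ in range(n_iters):
        # The fine solves depend only on the previous iterate,
        # so they could run in parallel across time slices.
        f_old = [fine(u[n], dt) for n in range(n_slices)]
        g_old = [coarse(u[n], dt) for n in range(n_slices)]
        new = [y0]
        for n in range(n_slices):
            # Correction: new coarse prediction plus the fine-minus-coarse defect.
            new.append(coarse(new[n], dt) + f_old[n] - g_old[n])
        u = new
    return u
```

With as many iterations as time slices, the iterate coincides (up to rounding) with the serial fine solution; the method is useful when far fewer iterations already give acceptable accuracy.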

arXiv:2312.00644 [pdf, other]

Neural networks for the approximation of Euler's elastica

Authors: Elena Celledoni, Ergys Çokaj, Andrea Leone, Sigrid Leyendecker, Davide Murari, Brynjulf Owren, Rodrigo T. Sato Martín de Almagro, Martina Stavole

Abstract: Euler's elastica is a classical model of flexible slender structures, relevant in many industrial applications. Static equilibrium equations can be derived via a variational principle. The accurate approximation of solutions of this problem can be challenging due to nonlinearity and constraints. We here present two neural network-based approaches for the simulation of Euler's elastica. Starting from a data set of solutions of the discretised static equilibria, we train the neural networks to produce solutions for unseen boundary conditions. We present a discrete approach learning discrete solutions from the discrete data. We then consider a continuous approach using the same training data set, but learning continuous solutions to the problem. We present numerical evidence that the proposed neural networks can effectively approximate configurations of the planar Euler's elastica for a range of different boundary conditions.

Submitted 4 June, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: v2: Improved the rigorousness of the experiments considering training, validation and test sets. Fixed language and notation

arXiv:2311.06942 [pdf, other]

Resilient Graph Neural Networks: A Coupled Dynamical Systems Approach

Authors: Moshe Eliasof, Davide Murari, Ferdia Sherry, Carola-Bibiane Schönlieb

Abstract: Graph Neural Networks (GNNs) have established themselves as a key component in addressing diverse graph-based tasks. Despite their notable successes, GNNs remain susceptible to input perturbations in the form of adversarial attacks. This paper introduces an innovative approach to fortify GNNs against adversarial perturbations through the lens of coupled dynamical systems. Our method introduces graph neural layers based on differential equations with contractive properties, which, as we show, improve the robustness of GNNs. A distinctive feature of the proposed approach is the simultaneous learned evolution of both the node features and the adjacency matrix, yielding an intrinsic enhancement of model robustness to perturbations in the input features and the connectivity of the graph. We mathematically derive the underpinnings of our novel architecture and provide theoretical insights to reason about its expected behavior. We demonstrate the efficacy of our method through numerous real-world benchmarks, reaching on-par or improved performance compared to existing methods.

Submitted 11 September, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: ECAI 2024

arXiv:2306.17332 [pdf, other]

doi 10.1016/j.physd.2024.134159

Designing Stable Neural Networks using Convex Analysis and ODEs

Authors: Ferdia Sherry, Elena Celledoni, Matthias J. Ehrhardt, Davide Murari, Brynjulf Owren, Carola-Bibiane Schönlieb

Abstract: Motivated by classical work on the numerical integration of ordinary differential equations, we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.

Submitted 18 April, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

Comments: 34 pages, 6 figures. This is the accepted version of a paper published in Physica D: Nonlinear Phenomena

arXiv:2305.00723 [pdf, other]

Predictions Based on Pixel Data: Insights from PDEs and Finite Differences

Authors: Elena Celledoni, James Jackaman, Davide Murari, Brynjulf Owren

Abstract: As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is (residual) convolutional networks. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.

Submitted 21 June, 2024; v1 submitted 1 May, 2023; originally announced May 2023.
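
The connection between discrete convolution and finite difference operators mentioned in this abstract can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the paper's construction: it slides the standard second-difference stencil [1, -2, 1] / dx^2 over samples of u(x) = x^2 and recovers u'' = 2 at the interior grid points. The helper name `correlate_valid` and the grid are hypothetical.

```python
def correlate_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding (valid cross-correlation)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

dx = 0.1
grid = [i * dx for i in range(11)]
u = [x * x for x in grid]  # u(x) = x^2, so u''(x) = 2 everywhere

# Central second-difference stencil written as a convolution kernel.
laplacian_kernel = [1.0 / dx**2, -2.0 / dx**2, 1.0 / dx**2]

# One value per interior grid point; the two boundary points are dropped.
u_xx = correlate_valid(u, laplacian_kernel)
```

Because the stencil is symmetric, cross-correlation and convolution coincide here; for a quadratic the central difference is exact up to rounding.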

arXiv:2210.02373 [pdf, other]

Dynamical systems' based neural networks

Authors: Elena Celledoni, Davide Murari, Brynjulf Owren, Carola-Bibiane Schönlieb, Ferdia Sherry

Abstract: Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.

Submitted 31 August, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

MSC Class: 65L05; 65L06; 37M15

arXiv:2201.13254 [pdf, other]

Learning Hamiltonians of constrained mechanical systems

Authors: Elena Celledoni, Andrea Leone, Davide Murari, Brynjulf Owren

Abstract: Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.

Submitted 27 June, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

Comments: 21 pages, Conference proceeding for NUMDIFF-16

arXiv:2109.12325 [pdf, other]

Dynamics of the N-fold Pendulum in the framework of Lie Group Integrators

Authors: Elena Celledoni, Ergys Çokaj, Andrea Leone, Davide Murari, Brynjulf Owren

Abstract: Since their introduction, Lie group integrators have become a method of choice in many application areas. Various formulations of these integrators exist, and in this work we focus on Runge–Kutta–Munthe-Kaas methods. First, we briefly introduce this class of integrators, considering some of the practical aspects of their implementation, such as adaptive time stepping. We then present some mathematical background that allows us to apply them to some families of Lagrangian mechanical systems. We conclude with an application to a nontrivial mechanical system: the N-fold 3D pendulum.

Submitted 25 September, 2021; originally announced September 2021.

arXiv:2102.12778 [pdf, other]

doi 10.1080/00207160.2021.1966772

Lie Group integrators for mechanical systems

Authors: Elena Celledoni, Ergys Çokaj, Andrea Leone, Davide Murari, Brynjulf Owren

Abstract: Since they were introduced in the 1990s, Lie group integrators have become a method of choice in many application areas. These include multibody dynamics, shape analysis, data science, image registration and biophysical simulations. Two important classes of intrinsic Lie group integrators are the Runge–Kutta–Munthe-Kaas methods and the commutator-free Lie group integrators. We give a short introduction to these classes of methods. The Hamiltonian framework is attractive for many mechanical problems, and in particular we shall consider Lie group integrators for problems on cotangent bundles of Lie groups where a number of different formulations are possible. There is a natural symplectic structure on such manifolds and through variational principles one may derive symplectic Lie group integrators. We also consider the practical aspects of the implementation of Lie group integrators, such as adaptive time stepping. The theory is illustrated by applying the methods to two nontrivial applications in mechanics. One is the N-fold spherical pendulum where we introduce the restriction of the adjoint action of the group $SE(3)$ to $TS^2$, the tangent bundle of the two-dimensional sphere. Finally, we show how Lie group integrators can be applied to model the controlled path of a payload being transported by two rotors. This problem is modeled on $\mathbb{R}^6\times \left(SO(3)\times \mathfrak{so}(3)\right)^2\times (TS^2)^2$ and put in a format where Lie group integrators can be applied.

Submitted 14 October, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

Comments: 35 pages

MSC Class: 65L05; 70E55

Showing 1–15 of 15 results for author: Murari, D