Search | arXiv e-print repository

Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Authors: Utkarsh Utkarsh, Pengfei Cai, Alan Edelman, Rafael Gomez-Bombarelli, Christopher Vincent Rackauckas

Abstract: Deep generative models have recently been applied to physical systems governed by partial differential equations (PDEs), offering scalable simulation and uncertainty-aware inference. However, enforcing physical constraints, such as conservation laws (linear and nonlinear) and physical consistencies, remains challenging. Existing methods often rely on soft penalties or architectural biases that fail to guarantee hard constraints. In this work, we propose Physics-Constrained Flow Matching (PCFM), a zero-shot inference framework that enforces arbitrary nonlinear constraints in pretrained flow-based generative models. PCFM continuously guides the sampling process through physics-based corrections applied to intermediate solution states, while remaining aligned with the learned flow and satisfying physical constraints. Empirically, PCFM outperforms both unconstrained and constrained baselines on a range of PDEs, including those with shocks, discontinuities, and sharp features, while ensuring exact constraint satisfaction at the final solution. Our method provides a general framework for enforcing hard constraints in both scientific and general-purpose generative models, especially in applications where constraint satisfaction is essential.

Submitted 4 June, 2025; originally announced June 2025.

Comments: 27 pages, 9 figures, 4 tables

arXiv:2505.20515 [pdf, ps, other]

Semi-Explicit Neural DAEs: Learning Long-Horizon Dynamical Systems with Algebraic Constraints

Authors: Avik Pal, Alan Edelman, Christopher Rackauckas

Abstract: Despite the promise of scientific machine learning (SciML) in combining data-driven techniques with mechanistic modeling, existing approaches for incorporating hard constraints in neural differential equations (NDEs) face significant limitations. Scalability issues and poor numerical properties prevent these neural models from being used for modeling physical systems with complicated conservation laws. We propose Manifold-Projected Neural ODEs (PNODEs), a method that explicitly enforces algebraic constraints by projecting each ODE step onto the constraint manifold. This framework arises naturally from semi-explicit differential-algebraic equations (DAEs), and includes both a robust iterative variant and a fast approximation requiring a single Jacobian factorization. We further demonstrate that prior works on relaxation methods are special cases of our approach. PNODEs consistently outperform baselines across six benchmark problems achieving a mean constraint violation error below $10^{-10}$. Additionally, PNODEs consistently achieve lower runtime compared to other methods for a given level of error tolerance. These results show that constraint projection offers a simple strategy for learning physically consistent long-horizon dynamics.

Submitted 26 May, 2025; originally announced May 2025.

arXiv:2501.07701 [pdf, other]

Active Learning Enhanced Surrogate Modeling of Jet Engines in JuliaSim

Authors: Anas Abdelrehim, Dhairya Gandhi, Sharan Yalburgi, Ashutosh Bharambe, Ranjan Anantharaman, Chris Rackauckas

Abstract: Surrogate models are effective tools for accelerated design of complex systems. The result of a design optimization procedure using surrogate models can be used to initialize an optimization routine using the full order system. High accuracy of the surrogate model can be advantageous for fast convergence. In this work, we present an active learning approach to produce a very high accuracy surrogate model of a turbofan jet engine that demonstrates 0.1% relative error for all quantities of interest. We contrast this with a surrogate model produced using a more traditional brute-force data generation approach.

Submitted 13 January, 2025; originally announced January 2025.

arXiv:2412.14362 [pdf, other]

A Fully Adaptive Radau Method for the Efficient Solution of Stiff Ordinary Differential Equations at Low Tolerances

Authors: Shreyas Ekanathan, Oscar Smith, Christopher Rackauckas

Abstract: Radau IIA methods, specifically the adaptive order Radau method in Fortran due to Hairer, are known to be state-of-the-art for the high-accuracy solution of highly stiff ordinary differential equations (ODEs). However, the traditional implementation was specialized to a specific range of tolerance, in particular only supporting 5th, 9th, and 13th order versions of the tableau and only derived in double precision floating point, thus limiting the ability to be truly general purpose for high fidelity scenarios. To alleviate these constraints, we implement an adaptive-time adaptive-order Radau method which can derive the coefficients for the Radau IIA embedded tableau to any order on the fly to any precision. Additionally, our Julia-based implementation includes many modernizations to improve performance, including improvements to the order adaptation scheme and improved linear algebra integrations. In a head-to-head benchmark against the classic Fortran implementation, we demonstrate our implementation is approximately 2x faster across a range of stiff ODEs. We benchmark our algorithm against several well-reputed numerical integrators for stiff ODEs and find state-of-the-art performance on several test problems, with a 1.5-times speed-up over common numerical integrators for stiff ODEs when low error tolerance is required. The newly implemented method is distributed in open source software for free usage on stiff ODEs.

Submitted 13 May, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

arXiv:2306.07961 [pdf, other]

Differentiating Metropolis-Hastings to Optimize Intractable Densities

Authors: Gaurav Arya, Ruben Seyer, Frank Schäfer, Kartik Chandra, Alexander K. Lew, Mathieu Huot, Vikash K. Mansinghka, Jonathan Ragan-Kelley, Christopher Rackauckas, Moritz Schauer

Abstract: We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers, allowing us to differentiate through probabilistic inference, even if the model has discrete components within it. Our approach fuses recent advances in stochastic automatic differentiation with traditional Markov chain coupling schemes, providing an unbiased and low-variance gradient estimator. This allows us to apply gradient-based optimization to objectives expressed as expectations over intractable target densities. We demonstrate our approach by finding an ambiguous observation in a Gaussian mixture model and by maximizing the specific heat in an Ising model.

Submitted 30 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: 6 pages, 6 figures; accepted at Differentiable Almost Everything Workshop of ICML 2023

arXiv:2306.06992 [pdf, other]

doi 10.21105/jcon.00133

Extending JumpProcess.jl for fast point process simulation with time-varying intensities

Authors: Guilherme Augusto Zagatti, Samuel A. Isaacson, Christopher Rackauckas, Vasily Ilin, See-Kiong Ng, Stéphane Bressan

Abstract: Point processes model the occurrence of a countable number of random points over some support. They can model diverse phenomena, such as chemical reactions, stock market transactions and social interactions. We show that JumpProcesses.jl is a fast, general-purpose library for simulating point processes. JumpProcesses.jl was first developed for simulating jump processes via stochastic simulation algorithms (SSAs) (including Doob's method, Gillespie's methods, and Kinetic Monte Carlo methods). Historically, jump processes have been developed in the context of dynamical systems to describe dynamics with discrete jumps. In contrast, the development of point processes has been more focused on describing the occurrence of random events. In this paper, we bridge the gap between the treatment of point and jump process simulation. The algorithms previously included in JumpProcesses.jl can be mapped to three general methods developed in statistics for simulating evolutionary point processes. Our comparative exercise revealed that the library initially lacked an efficient algorithm for simulating processes with variable intensity rates. We, therefore, extended JumpProcesses.jl with a new simulation algorithm, Coevolve, that enables the rapid simulation of processes with locally-bounded variable intensity rates. It is now possible to efficiently simulate any point process on the real line with a non-negative, left-continuous, history-adapted and locally bounded intensity rate coupled or not with differential equations. This extension significantly improves the computational performance of JumpProcesses.jl when simulating such processes, enabling it to become one of the few readily available, fast, general-purpose libraries for simulating evolutionary point processes.

Submitted 24 July, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

MSC Class: 60G55 ACM Class: G.3; G.4

Journal ref: The Proceedings of the JuliaCon Conferences, 6(58), 133 (2024)

arXiv:2304.06835 [pdf, other]

doi 10.1016/j.cma.2023.116591

Automated Translation and Accelerated Solving of Differential Equations on Multiple GPU Platforms

Authors: Utkarsh Utkarsh, Valentin Churavy, Yingbo Ma, Tim Besard, Prakitr Srisuma, Tim Gymnich, Adam R. Gerlach, Alan Edelman, George Barbastathis, Richard D. Braatz, Christopher Rackauckas

Abstract: We demonstrate a high-performance vendor-agnostic method for massively parallel solving of ensembles of ordinary differential equations (ODEs) and stochastic differential equations (SDEs) on GPUs. The method is integrated with a widely used differential equation solver library in a high-level language (Julia's DifferentialEquations.jl) and enables GPU acceleration without requiring code changes by the user. Our approach achieves state-of-the-art performance compared to hand-optimized CUDA-C++ kernels while performing 20--100$\times$ faster than the vectorizing map (vmap) approach implemented in JAX and PyTorch. Performance evaluation on NVIDIA, AMD, Intel, and Apple GPUs demonstrates performance portability and vendor-agnosticism. We show composability with MPI to enable distributed multi-GPU workflows. The implemented solvers are fully featured -- supporting event handling, automatic differentiation, and incorporation of datasets via the GPU's texture memory -- allowing scientists to take advantage of GPU acceleration on all major current architectures without changing their model code and without loss of performance. We distribute the software as an open-source library https://github.com/SciML/DiffEqGPU.jl

Submitted 13 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: 14 figures

Journal ref: Computer Methods in Applied Mechanics and Engineering, Volume 419, 2024

arXiv:2304.04752 [pdf, other]

A Practitioner's Guide to Bayesian Inference in Pharmacometrics using Pumas

Authors: Mohamed Tarek, Jose Storopoli, Casey Davis, Chris Elrod, Julius Krumbiegel, Chris Rackauckas, Vijay Ivaturi

Abstract: This paper provides a comprehensive tutorial for Bayesian practitioners in pharmacometrics using Pumas workflows. We start by giving a brief motivation of Bayesian inference for pharmacometrics highlighting limitations in existing software that Pumas addresses. We then follow with a description of all the steps of a standard Bayesian workflow for pharmacometrics using code snippets and examples. This includes: model definition, prior selection, sampling from the posterior, prior and posterior simulations and predictions, counter-factual simulations and predictions, convergence diagnostics, visual predictive checks, and finally model comparison with cross-validation. Finally, the background and intuition behind many advanced concepts in Bayesian statistics are explained in simple language. This includes many important ideas and precautions that users need to keep in mind when performing Bayesian analysis. Many of the algorithms, codes, and ideas presented in this paper are highly applicable to clinical research and statistical learning at large but we chose to focus our discussions on pharmacometrics in this paper to have a narrower scope in mind and given the nature of Pumas as a software primarily for pharmacometricians.

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2303.13555 [pdf, other]

Efficient hybrid modeling and sorption model discovery for non-linear advection-diffusion-sorption systems: A systematic scientific machine learning approach

Authors: Vinicius V. Santana, Erbet Costa, Carine M. Rebello, Ana Mafalda Ribeiro, Chris Rackauckas, Idelfonso B. R. Nogueira

Abstract: This study presents a systematic machine learning approach for creating efficient hybrid models and discovering sorption uptake models in non-linear advection-diffusion-sorption systems. It demonstrates an effective method to train these complex systems using gradient based optimizers, adjoint sensitivity analysis, and JIT-compiled vector Jacobian products, combined with spatial discretization and adaptive integrators. Sparse and symbolic regression were employed to identify missing functions in the artificial neural network. The robustness of the proposed method was tested on an in-silico data set of noisy breakthrough curve observations of fixed-bed adsorption, resulting in a well-fitted hybrid model. The study successfully reconstructed sorption uptake kinetics using sparse and symbolic regression, and accurately predicted breakthrough curves using identified polynomials, highlighting the potential of the proposed framework for discovering sorption kinetic law structures.

Submitted 25 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: GitHub repo made available

arXiv:2303.02262 [pdf, other]

Locally Regularized Neural Differential Equations: Some Black Boxes Were Meant to Remain Closed!

Authors: Avik Pal, Alan Edelman, Chris Rackauckas

Abstract: Implicit layer deep learning techniques, like Neural Differential Equations, have become an important modeling framework due to their ability to adapt to new problems automatically. Training a neural differential equation is effectively a search over a space of plausible dynamical systems. However, controlling the computational cost for these models is difficult since it relies on the number of steps the adaptive solver takes. Most prior works have used higher-order methods to reduce prediction timings while greatly increasing training time or reducing both training and prediction timings by relying on specific training algorithms, which are harder to use as a drop-in replacement due to strict requirements on automatic differentiation. In this manuscript, we use internal cost heuristics of adaptive differential equation solvers at stochastic time points to guide the training toward learning a dynamical system that is easier to integrate. We "close the black-box" and allow the use of our method with any adjoint technique for gradient calculations of the differential equation solution. We perform experimental studies to compare our method to global regularization to show that we attain similar performance numbers without compromising the flexibility of implementation on ordinary differential equations (ODEs) and stochastic differential equations (SDEs). We develop two sampling strategies to trade off between performance and training time. Our method reduces the number of function evaluations to 0.556-0.733x and accelerates predictions by 1.3-2x.

Submitted 2 June, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023

arXiv:2303.02159 [pdf, ps, other]

Robust Parameter Estimation for Rational Ordinary Differential Equations

Authors: Oren Bassik, Yosef Berman, Soo Go, Hoon Hong, Ilia Ilmer, Alexey Ovchinnikov, Chris Rackauckas, Pedro Soto, Chee Yap

Abstract: We present a new approach for estimating parameters in rational ODE models from given (measured) time series data. In typical existing approaches, an initial guess for the parameter values is made from a given search interval. Then, in a loop, the corresponding outputs are computed by solving the ODE numerically, followed by computing the error from the given time series data. If the error is small, the loop terminates and the parameter values are returned. Otherwise, heuristics/theories are used to possibly improve the guess and continue the loop. These approaches tend to be non-robust in the sense that their accuracy depends on the search interval and the true parameter values; furthermore, they cannot handle the case where the parameters are locally identifiable. In this paper, we propose a new approach, which does not suffer from the above non-robustness. In particular, it does not require making good initial guesses for the parameter values or specifying search intervals. Instead, it uses differential algebra, interpolation of the data using rational functions, and multivariate polynomial system solving. We also compare the performance of the resulting software with several other estimation software packages.

Submitted 17 December, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: Updates regarding robustness

arXiv:2301.04027 [pdf]

doi 10.1038/s43017-023-00450-9

Differentiable modeling to unify machine learning and physical models and advance Geosciences

Authors: Chaopeng Shen, Alison P. Appling, Pierre Gentine, Toshiyuki Bandai, Hoshin Gupta, Alexandre Tartakovsky, Marco Baity-Jesi, Fabrizio Fenicia, Daniel Kifer, Li Li, Xiaofeng Liu, Wei Ren, Yi Zheng, Ciaran J. Harman, Martyn Clark, Matthew Farthing, Dapeng Feng, Praveen Kumar, Doaa Aboelyazeed, Farshid Rahmani, Hylke E. Beck, Tadd Bindas, Dipankar Dwivedi, Kuai Fang, Marvin Höge, et al. (5 additional authors not shown)

Abstract: Process-Based Modeling (PBM) and Machine Learning (ML) are often perceived as distinct paradigms in the geosciences. Here we present differentiable geoscientific modeling as a powerful pathway toward dissolving the perceived barrier between them and ushering in a paradigm shift. For decades, PBM offered benefits in interpretability and physical consistency but struggled to efficiently leverage large datasets. ML methods, especially deep networks, presented strong predictive skills yet lacked the ability to answer specific scientific questions. While various methods have been proposed for ML-physics integration, an important underlying theme -- differentiable modeling -- is not sufficiently recognized. Here we outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG). "Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables, critically enabling the learning of high-dimensional unknown relationships. DG refers to a range of methods connecting varying amounts of prior knowledge to neural networks and training them together, capturing a different scope than physics-guided machine learning and emphasizing first principles. Preliminary evidence suggests DG offers better interpretability and causality than ML, improved generalizability and extrapolation capability, and strong potential for knowledge discovery, while approaching the performance of purely data-driven ML. DG models require less training data while scaling favorably in performance and efficiency with increasing amounts of data. With DG, geoscientists may be better able to frame and investigate questions, test hypotheses, and discover unrecognized linkages.

Submitted 26 December, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

Journal ref: Nat Rev Earth Environ 4, 552-567 (2023)

arXiv:2210.08572 [pdf, other]

Automatic Differentiation of Programs with Discrete Randomness

Authors: Gaurav Arya, Moritz Schauer, Frank Schäfer, Chris Rackauckas

Abstract: Automatic differentiation (AD), a technique for constructing new programs which compute the derivative of an original program, has become ubiquitous throughout scientific computing and deep learning due to the improved performance afforded by gradient-based optimization. However, AD systems have been restricted to the subset of programs that have a continuous dependence on parameters. Programs that have discrete stochastic behaviors governed by distribution parameters, such as flipping a coin with probability $p$ of being heads, pose a challenge to these systems because the connection between the result (heads vs tails) and the parameters ($p$) is fundamentally discrete. In this paper we develop a new reparameterization-based methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program. We showcase how this method gives an unbiased and low-variance estimator which is as automated as traditional AD mechanisms. We demonstrate unbiased forward-mode AD of discrete-time Markov chains, agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD of a particle filter. Our code package is available at https://github.com/gaurav-arya/StochasticAD.jl.

Submitted 9 January, 2023; v1 submitted 16 October, 2022; originally announced October 2022.

Comments: In Proceedings of NeurIPS 2022

arXiv:2208.12879 [pdf, ps, other]

DelayDiffEq: Generating Delay Differential Equation Solvers via Recursive Embedding of Ordinary Differential Equation Solvers

Authors: David Widmann, Chris Rackauckas

Abstract: Traditional solvers for delay differential equations (DDEs) are designed around only a single method and do not effectively use the infrastructure of their more-developed ordinary differential equation (ODE) counterparts. In this work we present DelayDiffEq, a Julia package for numerically solving delay differential equations (DDEs) which leverages the multitude of numerical algorithms in OrdinaryDiffEq for solving both stiff and non-stiff ODEs, and manages to solve challenging stiff DDEs. We describe how compiling the ODE integrator within itself, and accounting for discontinuity propagation, leads to a design that is effective for DDEs while using all of the ODE internals. We highlight some difficulties that a numerical DDE solver has to address, and explain how DelayDiffEq deals with these problems. We show how DelayDiffEq is able to solve difficult equations, how its stiff DDE solvers give efficiency on problems with time-scale separation, and how the design allows for generality and flexibility in usage such as being repurposed for generating solvers for stochastic delay differential equations.

Submitted 26 August, 2022; originally announced August 2022.

Comments: 8 pages, 3 figures

arXiv:2207.08135 [pdf, other]

Parallelizing Explicit and Implicit Extrapolation Methods for Ordinary Differential Equations

Authors: Utkarsh, Chris Elrod, Yingbo Ma, Christopher Rackauckas

Abstract: Numerically solving ordinary differential equations (ODEs) is a naturally serial process and as a result the vast majority of ODE solver software is serial. In this manuscript we developed a set of parallelized ODE solvers using extrapolation methods which exploit "parallelism within the method" so that arbitrary user ODEs can be parallelized. We describe the specific choices made in the implementation of the explicit and implicit extrapolation methods which allow for generating low overhead static schedules to then exploit with optimized multi-threaded implementations. We demonstrate that while the multi-threading gives a noticeable acceleration on both explicit and implicit problems, the explicit parallel extrapolation methods gave no significant improvement over state-of-the-art even with a multi-threading advantage against current optimized high order Runge-Kutta tableaus. However, we demonstrate that the implicit parallel extrapolation methods are able to achieve state-of-the-art performance (2x-4x) on standard multicore x86 CPUs for systems of $<200$ stiff ODEs solved at low tolerance, a typical setup for a vast majority of users of high level language equation solver suites. The resulting method is distributed as the first widely available open source software for within-method parallel acceleration targeting typical modest compute architectures.

Submitted 10 September, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

Comments: 6 figures

arXiv:2204.08775 [pdf, other]

Plots.jl -- a user extendable plotting API for the julia programming language

Authors: Simon Christ, Daniel Schwabeneder, Christopher Rackauckas, Michael Krabbe Borregaard, Thomas Breloff

Abstract: There are plenty of excellent plotting libraries. Each excels at a different use case: one is good for printed 2D publication figures, the other at interactive 3D graphics, a third has excellent LaTeX integration or is good for creating dashboards on the web. The aim of Plots.jl is to enable the user to use the same syntax to interact with many different plotting libraries, such that it is possible to change the library "backend" without needing to touch the code that creates the content -- and without having to learn yet another application programming interface (API). This is achieved by the separation of the plot specification from the implementation of the actual graphical backend. These plot specifications may be extended by a "recipe" system, which allows package authors and users to define how to plot any new type (be it a statistical model, a map, a phylogenetic tree or the solution to a system of differential equations) and create new types of plots -- without depending on the Plots.jl package. This supports a modular ecosystem structure for plotting and yields a high reuse potential across the entire julia package ecosystem. Plots.jl is publicly available at https://github.com/JuliaPlots/Plots.jl.

Submitted 17 June, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: 22 pages, 6 figures, 6 code listings

ACM Class: I.3.3

arXiv:2204.05117 [pdf, other]

ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models

Authors: Francesco Martinuzzi, Chris Rackauckas, Anas Abdelrehim, Miguel D. Mahecha, Karin Mora

Abstract: We introduce ReservoirComputing.jl, an open source Julia library for reservoir computing models. The software offers a great number of algorithms presented in the literature, and allows users to expand on them with both internal and external tools in a simple way. The implementation is highly modular, fast and comes with comprehensive documentation, which includes reproduced experiments from the literature. The code and documentation are hosted on GitHub under an MIT license: https://github.com/SciML/ReservoirComputing.jl.

Submitted 8 April, 2022; originally announced April 2022.

Journal ref: Journal of Machine Learning Research 23 (2022) 1-8

arXiv:2201.12468 [pdf, ps, other]

Symbolic-Numeric Integration of Univariate Expressions based on Sparse Regression

Authors: Shahriar Iravanian, Carl Julius Martensen, Alessandro Cheli, Shashi Gowda, Anand Jain, Yingbo Ma, Chris Rackauckas

Abstract: Most computer algebra systems (CAS) support symbolic integration as core functionality. The majority of the integration packages use a combination of heuristic algebraic and rule-based (integration table) methods. In this paper, we present a hybrid (symbolic-numeric) methodology to calculate the indefinite integrals of univariate expressions. The primary motivation for this work is to add symbolic integration functionality to a modern CAS (the symbolic manipulation packages of SciML, the Scientific Machine Learning ecosystem of the Julia programming language), which is mainly designed toward numerical and machine learning applications and has a different set of features than traditional CAS. The symbolic part of our method is based on the combination of candidate terms generation (borrowed from the Homotopy operators theory) with rule-based expression transformations provided by the underlying CAS. The numeric part is based on sparse-regression, a component of Sparse Identification of Nonlinear Dynamics (SINDy) technique. We show that this system can solve a large variety of common integration problems using only a few dozen basic integration rules.

Submitted 6 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: 8 pages. submitted to ISSAC 2022. Code at https://github.com/SciML/SymbolicNumericIntegration.jl

ACM Class: I.1.0; I.1.2

arXiv:2201.12240 [pdf, other]

Continuous Deep Equilibrium Models: Training Neural ODEs faster by integrating them to Infinity

Authors: Avik Pal, Alan Edelman, Christopher Rackauckas

Abstract: Implicit models separate the definition of a layer from the description of its solution process. While implicit layers allow features such as depth to adapt to new scenarios and inputs automatically, this adaptivity makes its computational expense challenging to predict. In this manuscript, we increase the "implicitness" of the DEQ by redefining the method in terms of an infinite time neural ODE, which paradoxically decreases the training cost over a standard neural ODE by 2-4x. Additionally, we address the question: is there a way to simultaneously achieve the robustness of implicit layers while allowing the reduced computational expense of an explicit layer? To solve this, we develop Skip and Skip Reg. DEQ, an implicit-explicit (IMEX) layer that simultaneously trains an explicit prediction followed by an implicit correction. We show that training this explicit predictor is free and even decreases the training time by 1.11-3.19x. Together, this manuscript shows how bridging the dichotomy of implicit and explicit deep learning can combine the advantages of both techniques.

Submitted 3 March, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

arXiv:2111.05841 [pdf, other]

doi 10.1038/s42256-023-00761-y

Physics-enhanced deep surrogates for partial differential equations

Authors: Raphaël Pestourie, Youssef Mroueh, Chris Rackauckas, Payel Das, Steven G. Johnson

Abstract: Many physics and engineering applications demand Partial Differential Equations (PDE) property evaluations that are traditionally computed with resource-intensive high-fidelity numerical solvers. Data-driven surrogate models provide an efficient alternative but come with a significant cost of training. Emerging applications would benefit from surrogates with an improved accuracy-cost tradeoff, while studied at scale. Here we present a "physics-enhanced deep-surrogate" ("PEDS") approach towards developing fast surrogate models for complex physical systems, which are described by PDEs. Specifically, a combination of a low-fidelity, explainable physics simulator and a neural network generator is proposed, which is trained end-to-end to globally match the output of an expensive high-fidelity numerical solver. Experiments on three exemplar testcases, diffusion, reaction-diffusion, and electromagnetic scattering models, show that a PEDS surrogate can be up to 3$\times$ more accurate than an ensemble of feedforward neural networks with limited data ($\approx 10^3$ training points), and reduces the training data need by at least a factor of 100 to achieve a target error of 5%. Experiments reveal that PEDS provides a general, data-driven strategy to bridge the gap between a vast array of simplified physical models with corresponding brute-force numerical solvers modeling complex systems, offering accuracy, speed, data efficiency, as well as physical insights into the process.

Submitted 14 December, 2023; v1 submitted 10 November, 2021; originally announced November 2021.

arXiv:2109.12449 [pdf, other]

AbstractDifferentiation.jl: Backend-Agnostic Differentiable Programming in Julia

Authors: Frank Schäfer, Mohamed Tarek, Lyndon White, Chris Rackauckas

Abstract: No single Automatic Differentiation (AD) system is the optimal choice for all problems. This means informed selection of an AD system and combinations can be a problem-specific variable that can greatly impact performance. In the Julia programming language, the major AD systems target the same input and thus in theory can compose. Hitherto, switching between AD packages in the Julia Language required end-users to familiarize themselves with the user-facing API of the respective packages. Furthermore, implementing a new, usable AD package required AD package developers to write boilerplate code to define convenience API functions for end-users. As a response to these issues, we present AbstractDifferentiation.jl for the automatized generation of an extensive, unified, user-facing API for any AD package. By splitting the complexity between AD users and AD developers, AD package developers only need to implement one or two primitive definitions to support various utilities for AD users like Jacobians, Hessians and lazy product operators from native primitives such as pullbacks or pushforwards, thus removing tedious -- but so far inevitable -- boilerplate code, and enabling the easy switching and composing between AD implementations for end-users.

Submitted 4 February, 2022; v1 submitted 25 September, 2021; originally announced September 2021.

Comments: 3 figures, 2 tables, 15 pages

arXiv:2107.09443 [pdf]

NeuralPDE: Automating Physics-Informed Neural Networks (PINNs) with Error Approximations

Authors: Kirill Zubov, Zoe McCarthy, Yingbo Ma, Francesco Calisto, Valerio Pagliarino, Simone Azeglio, Luca Bottero, Emmanuel Luján, Valentin Sulzer, Ashutosh Bharambe, Nand Vinchhi, Kaushik Balakrishnan, Devesh Upadhyay, Chris Rackauckas

Abstract: Physics-informed neural networks (PINNs) are an increasingly powerful way to solve partial differential equations, generate digital twins, and create neural surrogates of physical models. In this manuscript we detail the inner workings of NeuralPDE.jl and show how a formulation structured around numerical quadrature gives rise to new loss functions which allow for adaptivity towards bounded error tolerances. We describe the various ways one can use the tool, detailing mathematical techniques like using extended loss functions for parameter estimation and operator discovery, to help potential users adopt these PINN-based techniques into their workflow. We showcase how NeuralPDE uses a purely symbolic formulation so that all of the underlying training code is generated from an abstract formulation, and show how to make use of GPUs and solve systems of PDEs. Afterwards we give a detailed performance analysis which showcases the trade-off between training techniques on a large set of PDEs. We end by focusing on a complex multiphysics example, the Doyle-Fuller-Newman (DFN) Model, and showcase how this PDE can be formulated and solved with NeuralPDE. Together this manuscript is meant to be a detailed and approachable technical report to help potential users of the technique quickly get a sense of the real-world performance trade-offs and use cases of the PINN techniques.

Submitted 19 July, 2021; originally announced July 2021.

Comments: 74 pages, 20+ figures, 20+ tables

arXiv:2105.05946 [pdf, other]

Composing Modeling and Simulation with Machine Learning in Julia

Authors: Chris Rackauckas, Ranjan Anantharaman, Alan Edelman, Shashi Gowda, Maja Gwozdz, Anand Jain, Chris Laughman, Yingbo Ma, Francesco Martinuzzi, Avik Pal, Utkarsh Rajput, Elliot Saba, Viral B. Shah

Abstract: In this paper we introduce JuliaSim, a high-performance programming environment designed to blend traditional modeling and simulation with machine learning. JuliaSim can build accelerated surrogates from component-based models, such as those conforming to the FMI standard, using continuous-time echo state networks (CTESN). The foundation of this environment, ModelingToolkit.jl, is an acausal modeling language which can compose the trained surrogates as components within its staged compilation process. As a complementary factor we present the JuliaSim model library, a standard library with differential-algebraic equations and pre-trained surrogates, which can be composed using the modeling system for design, optimization, and control. We demonstrate the effectiveness of the surrogate-accelerated modeling and simulation approach on HVAC dynamics by showing that the CTESN surrogates accurately capture the dynamics of an HVAC cycle at less than 4% error while accelerating its simulation by 340x. We illustrate the use of surrogate acceleration in the design process via global optimization of simulation parameters using the embedded surrogate, yielding a speedup of two orders of magnitude to find the optimum. We showcase the surrogate deployed in a co-simulation loop, as a drop-in replacement for one of the coupled FMUs, allowing engineers to effectively explore the design space of a coupled system. Together this demonstrates a workflow for automating the integration of machine learning techniques into traditional modeling and simulation processes.

Submitted 12 May, 2021; originally announced May 2021.

arXiv:2105.03949 [pdf, other]

High-performance symbolic-numerics via multiple dispatch

Authors: Shashi Gowda, Yingbo Ma, Alessandro Cheli, Maja Gwozdz, Viral B. Shah, Alan Edelman, Christopher Rackauckas

Abstract: As mathematical computing becomes more democratized in high-level languages, high-performance symbolic-numeric systems are necessary for domain scientists and engineers to get the best performance out of their machine without deep knowledge of code optimization. Naturally, users need different term types either to have different algebraic properties for them, or to use efficient data structures. To this end, we developed Symbolics.jl, an extendable symbolic system which uses dynamic multiple dispatch to change behavior depending on the domain needs. In this work we detail an underlying abstract term interface which allows for speed without sacrificing generality. We show that by formalizing a generic API on actions independent of implementation, we can retroactively add optimized data structures to our system without changing the pre-existing term rewriters. We showcase how this can be used to optimize term construction and give a 113x acceleration on general symbolic transformations. Further, we show that such a generic API allows for complementary term-rewriting implementations. We demonstrate the ability to swap between classical term-rewriting simplifiers and e-graph-based term-rewriting simplifiers. We showcase an e-graph ruleset which minimizes the number of CPU cycles during expression evaluation, and demonstrate how it simplifies a real-world reaction-network simulation to halve the runtime. Additionally, we show a reaction-diffusion partial differential equation solver which is able to be automatically converted into symbolic expressions via multiple dispatch tracing, which is subsequently accelerated and parallelized to give a 157x simulation speedup. Together, this presents Symbolics.jl as a next-generation symbolic-numeric computing environment geared towards modeling and simulation.

Submitted 5 February, 2022; v1 submitted 9 May, 2021; originally announced May 2021.

ACM Class: D.3.3; I.1.1; I.1.3

arXiv:2105.03918 [pdf, other]

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Authors: Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Abstract: Democratization of machine learning requires architectures that automatically adapt to new problems. Neural Differential Equations (NDEs) have emerged as a popular modeling framework by removing the need for ML practitioners to choose the number of layers in a recurrent model. While we can control the computational cost by choosing the number of layers in standard architectures, in NDEs the number of neural network evaluations for a forward pass can depend on the number of steps of the adaptive ODE solver. But, can we force the NDE to learn the version with the least steps while not increasing the training cost? Current strategies to overcome slow prediction require high order automatic differentiation, leading to significantly higher training time. We describe a novel regularization method that uses the internal cost heuristics of adaptive differential equation solvers combined with discrete adjoint sensitivities to guide the training process towards learning NDEs that are easier to solve. This approach opens up the blackbox numerical analysis behind the differential equation solver's algorithm and directly uses its local error estimates and stiffness heuristics as cheap and accurate cost estimates. We incorporate our method without any change in the underlying NDE framework and show that our method extends beyond Ordinary Differential Equations to accommodate Neural Stochastic Differential Equations. We demonstrate how our approach can halve the prediction time and, unlike other methods which can increase the training time by an order of magnitude, we demonstrate similar reduction in training times. Together this showcases how the knowledge embedded within state-of-the-art equation solvers can be used to enhance machine learning.

Submitted 4 February, 2022; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: Proceedings of the 38th International Conference on Machine Learning, 2021

Journal ref: International Conference on Machine Learning, 139 (2021), 8325-8335

arXiv:2103.15341 [pdf, other]

doi 10.1063/5.0060697

Stiff Neural Ordinary Differential Equations

Authors: Suyong Kim, Weiqi Ji, Sili Deng, Yingbo Ma, Christopher Rackauckas

Abstract: Neural Ordinary Differential Equations (ODEs) are a promising approach to learning dynamic models from time-series data in science and engineering applications. This work aims at learning neural ODEs for stiff systems, which usually arise from chemical kinetic modeling in chemical and biological systems. We first show the challenges of learning neural ODEs in the classical stiff ODE system of Robertson's problem and propose techniques to mitigate the challenges associated with scale separations in stiff systems. We then present successful demonstrations in stiff systems of Robertson's problem and an air pollution problem. The demonstrations show that the usage of deep networks with rectified activations, proper scaling of the network outputs as well as loss functions, and stabilized gradient calculations are the key techniques enabling the learning of stiff neural ODEs. The success of learning stiff neural ODEs opens up possibilities of using neural ODEs in applications with widely varying time-scales, like chemical dynamics in energy conversion, environmental engineering, and the life sciences.

Submitted 14 September, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

arXiv:2103.05244 [pdf]

ModelingToolkit: A Composable Graph Transformation System For Equation-Based Modeling

Authors: Yingbo Ma, Shashi Gowda, Ranjan Anantharaman, Chris Laughman, Viral Shah, Chris Rackauckas

Abstract: Getting good performance out of numerical equation solvers requires that the user has provided stable and efficient functions representing their model. However, users should not be trusted to write good code. In this manuscript we describe ModelingToolkit (MTK), a symbolic equation-based modeling system which allows for composable transformations to generate stable, efficient, and parallelized model implementations. MTK blurs the lines of traditional symbolic computing by acting directly on a user's numerical code. We show the ability to apply graph algorithms for automatically parallelizing and performing index reduction on code written for differential-algebraic equation (DAE) solvers, "fixing" the performance and stability of the model without requiring any changes on the user's part. We demonstrate how composable model transformations can be combined with automated data-driven surrogate generation techniques, allowing machine learning methods to generate accelerated approximate models within an acausal modeling framework. These reduced models are shown to outperform the Dymola Modelica compiler on an HVAC model by 590x at 3% error. Together, this demonstrates MTK as a system for bringing the latest research in graph transformations directly to modeling applications.

Submitted 9 February, 2022; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 10 pages, 3 figures, 1 table

arXiv:2012.07244 [pdf, other]

Bayesian Neural Ordinary Differential Equations

Authors: Raj Dandekar, Karen Chung, Vaibhav Dixit, Mohamed Tarek, Aslan Garcia-Valadez, Krishna Vishal Vemula, Chris Rackauckas

Abstract: Recently, Neural Ordinary Differential Equations have emerged as a powerful framework for modeling physical simulations without explicitly defining the ODEs governing the system, but instead learning them via machine learning. However, the question: "Can Bayesian learning frameworks be integrated with Neural ODEs to robustly quantify the uncertainty in the weights of a Neural ODE?" remains unanswered. In an effort to address this question, we primarily evaluate the following categories of inference methods: (a) The No-U-Turn MCMC sampler (NUTS), (b) Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) and (c) Stochastic Langevin Gradient Descent (SGLD). We demonstrate the successful integration of Neural ODEs with the above Bayesian inference frameworks on classical physical systems, as well as on standard machine learning datasets like MNIST, using GPU acceleration. On the MNIST dataset, we achieve a posterior sample accuracy of 98.5% on the test ensemble of 10,000 images. Subsequently, for the first time, we demonstrate the successful integration of variational inference with normalizing flows and Neural ODEs, leading to a powerful Bayesian Neural ODE object. Finally, considering a predator-prey model and an epidemiological system, we demonstrate the probabilistic identification of model specification in partially-described dynamical systems using universal ordinary differential equations. Together, this gives a scientific machine learning tool for probabilistic estimation of epistemic uncertainties.

Submitted 6 February, 2022; v1 submitted 13 December, 2020; originally announced December 2020.

Comments: 16 pages, 10 figures, 3 tables; added new inference methods, substantially improved MNIST accuracy, revised author affiliations

arXiv:2011.04426 [pdf, other]

AutoMat: Accelerated Computational Electrochemical systems Discovery

Authors: Emil Annevelink, Rachel Kurchin, Eric Muckley, Lance Kavalsky, Vinay I. Hegde, Valentin Sulzer, Shang Zhu, Jiankun Pu, David Farina, Matthew Johnson, Dhairya Gandhi, Adarsh Dave, Hongyi Lin, Alan Edelman, Bharath Ramsundar, James Saal, Christopher Rackauckas, Viral Shah, Bryce Meredig, Venkatasubramanian Viswanathan

Abstract: Large-scale electrification is vital to addressing the climate crisis, but several scientific and technological challenges remain to fully electrify both the chemical industry and transportation. In both of these areas, new electrochemical materials will be critical, but their development currently relies heavily on human-time-intensive experimental trial and error and computationally expensive first-principles, meso-scale and continuum simulations. We present an automated workflow, AutoMat, that accelerates these computational steps by introducing both automated input generation and management of simulations across scales from first principles to continuum device modeling. Furthermore, we show how to seamlessly integrate multi-fidelity predictions such as machine learning surrogates or automated robotic experiments "in-the-loop". The automated framework is implemented with design space search techniques to dramatically accelerate the overall materials discovery pipeline by implicitly learning design features that optimize device performance across several metrics. We discuss the benefits of AutoMat using examples in electrocatalysis and energy storage and highlight lessons learned.

Submitted 13 May, 2022; v1 submitted 3 November, 2020; originally announced November 2020.

Comments: v1-3: 4 pages, 1 figure, accepted to NeurIPS Climate Change and AI Workshop 2020, updating acknowledgements and citations; v4: substantially updated content and author list, accepted to MRS Bulletin

arXiv:2010.04004 [pdf, other]

Accelerating Simulation of Stiff Nonlinear Systems using Continuous-Time Echo State Networks

Authors: Ranjan Anantharaman, Yingbo Ma, Shashi Gowda, Chris Laughman, Viral Shah, Alan Edelman, Chris Rackauckas

Abstract: Modern design, control, and optimization often require simulation of highly nonlinear models, leading to prohibitive computational costs. These costs can be amortized by evaluating a cheap surrogate of the full model. Here we present a general data-driven method, the continuous-time echo state network (CTESN), for generating surrogates of nonlinear ordinary differential equations with dynamics at widely separated timescales. We empirically demonstrate near-constant time performance using our CTESNs on a physically motivated scalable model of a heating system whose full execution time increases exponentially, while maintaining a relative error within 0.2%. We also show that our model captures fast transients as well as slow dynamics effectively, while other techniques such as physics informed neural networks have difficulties trying to train and predict the highly nonlinear behavior of these models.

Submitted 24 March, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

arXiv:2007.12158 [pdf, other]

Signal Enhancement for Magnetic Navigation Challenge Problem

Authors: Albert R. Gnadt, Joseph Belarge, Aaron Canciani, Glenn Carl, Lauren Conger, Joseph Curro, Alan Edelman, Peter Morales, Aaron P. Nielsen, Michael F. O'Keeffe, Christopher V. Rackauckas, Jonathan Taylor, Allan B. Wollaber

Abstract: Harnessing the magnetic field of the Earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field measurements from the magnetometer encompass the magnetic field from not just the Earth, but also from the vehicle on which it is mounted. It is difficult to separate the Earth magnetic anomaly field, which is crucial for navigation, from the total magnetic field reading from the sensor. The purpose of this challenge problem is to decouple the Earth and aircraft magnetic signals in order to derive a clean signal from which to perform magnetic navigation. Baseline testing on the dataset has shown that the Earth magnetic field can be extracted from the total magnetic field using machine learning (ML). The challenge is to remove the aircraft magnetic field from the total magnetic field using a trained model. This challenge offers an opportunity to construct an effective model for removing the aircraft magnetic field from the dataset by using a scientific machine learning (SciML) approach comprised of an ML algorithm integrated with the physics of magnetic navigation.

Submitted 6 January, 2023; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: 12 pages, 2 figures. See https://github.com/MIT-AI-Accelerator/MagNav.jl for accompanying data and code

arXiv:2001.04385 [pdf, other]

Universal Differential Equations for Scientific Machine Learning

Authors: Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan, Alan Edelman

Abstract: In the context of science, the well-known adage "a picture is worth a thousand words" might well be "a model is worth a thousand datasets." In this manuscript we introduce the SciML software ecosystem as a tool for mixing the information of physical laws and scientific models with data-driven machine learning approaches. We describe a mathematical object, which we denote universal differential equations (UDEs), as the unifying framework connecting the ecosystem. We show how a wide variety of applications, from automatically discovering biological mechanisms to solving high-dimensional Hamilton-Jacobi-Bellman equations, can be phrased and efficiently handled through the UDE formalism and its tooling. We demonstrate the generality of the software tooling to handle stochasticity, delays, and implicit constraints. This funnels the wide variety of SciML applications into a core set of training mechanisms which are highly optimized, stabilized for stiff equations, and compatible with distributed parallelism and GPU accelerators.

Submitted 2 November, 2021; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 5 figures, 2 tables, 11 supplemental figures, 29 pages, 25 supplemental pages

arXiv:1907.07587 [pdf, other]

A Differentiable Programming System to Bridge Machine Learning and Scientific Computing

Authors: Mike Innes, Alan Edelman, Keno Fischer, Chris Rackauckas, Elliot Saba, Viral B Shah, Will Tebbutt

Abstract: Scientific computing is increasingly incorporating the advancements in machine learning and the ability to work with large amounts of data. At the same time, machine learning models are becoming increasingly sophisticated and exhibit many features often seen in scientific computing, stressing the capabilities of machine learning frameworks. Just as the disciplines of scientific computing and machine learning have shared common underlying infrastructure in the form of numerical linear algebra, we now have the opportunity to further share new computational infrastructure, and thus ideas, in the form of Differentiable Programming. We describe Zygote, a Differentiable Programming system that is able to take gradients of general program structures. We implement this system in the Julia programming language. Our system supports almost all language constructs (control flow, recursion, mutation, etc.) and compiles high-performance code without requiring any user intervention or refactoring to stage computations. This enables an expressive programming model for deep learning, but more importantly, it enables us to incorporate a large ecosystem of libraries in our models in a straightforward way. We discuss our approach to automatic differentiation, including its support for advanced techniques such as mixed-mode, complex and checkpointed differentiation, and present several examples of differentiating programs.

Submitted 18 July, 2019; v1 submitted 17 July, 2019; originally announced July 2019.

Comments: Submitted to NeurIPS 2019

arXiv:1902.02376 [pdf, other]

DiffEqFlux.jl - A Julia Library for Neural Differential Equations

Authors: Chris Rackauckas, Mike Innes, Yingbo Ma, Jesse Bettencourt, Lyndon White, Vaibhav Dixit

Abstract: DiffEqFlux.jl is a library for fusing neural networks and differential equations. In this work we describe differential equations from the viewpoint of data science and discuss the complementary nature between machine learning models and differential equations. We demonstrate the ability to incorporate DifferentialEquations.jl-defined differential equation problems into a Flux-defined neural network, and vice versa. The advantages of being able to use the entire DifferentialEquations.jl suite for this purpose are demonstrated by counterexamples where simple integration strategies fail, but the sophisticated integration strategies provided by the DifferentialEquations.jl library succeed. This is followed by a demonstration of delay differential equations and stochastic differential equations inside of neural networks. We show high-level functionality for defining neural ordinary differential equations (neural networks embedded into the differential equation) and describe the extra models in the Flux model zoo, which include neural stochastic differential equations. We conclude by discussing the various adjoint methods used for backpropagation of the differential equation solvers. DiffEqFlux.jl is an important contribution to the area, as it allows the full weight of the differential equation solvers developed from decades of research in the scientific computing field to be readily applied to the challenges posed by machine learning and data science.

Submitted 6 February, 2019; originally announced February 2019.

Comments: Julialang Blog post, DiffEqFlux.jl

arXiv:1807.06430 [pdf, other]

Confederated Modular Differential Equation APIs for Accelerated Algorithm Development and Benchmarking

Authors: Christopher Rackauckas, Qing Nie

Abstract: Performant numerical solving of differential equations is required for large-scale scientific modeling. In this manuscript we focus on two questions: (1) how can researchers empirically verify theoretical advances and consistently compare methods in production software settings and (2) how can users (scientific domain experts) keep up with the state-of-the-art methods to select those which are most appropriate? Here we describe how the confederated modular API of DifferentialEquations.jl addresses these concerns. We detail the package-free API which allows numerical methods researchers to readily utilize and benchmark any compatible method directly in full-scale scientific applications. In addition, we describe how the complexity of the method choices is abstracted via a polyalgorithm. We show how scientific tooling built on top of DifferentialEquations.jl, such as packages for dynamical systems quantification and quantum optics simulation, both benefit from this structure and provide themselves as convenient benchmarking tools.

Submitted 17 July, 2018; originally announced July 2018.

Comments: 4 figures, 3 algorithms

Showing 1–35 of 35 results for author: Rackauckas, C