-
Multi-fidelity learning for interatomic potentials: Low-level forces and high-level energies are all you need
Authors:
Mitchell Messerly,
Sakib Matin,
Alice E. A. Allen,
Benjamin Nebgen,
Kipton Barros,
Justin S. Smith,
Nicholas Lubbers,
Richard Messerly
Abstract:
The promise of machine learning interatomic potentials (MLIPs) has led to an abundance of public quantum mechanical (QM) training datasets. The quality of an MLIP is directly limited by the accuracy of the energies and atomic forces in the training dataset. Unfortunately, most of these datasets are computed with relatively low-accuracy QM methods, e.g., density functional theory with a moderate basis set. Due to the increased computational cost of more accurate QM methods, e.g., coupled-cluster theory with a complete basis set extrapolation, most high-accuracy datasets are much smaller and often do not contain atomic forces. The lack of high-accuracy atomic forces is quite troubling, as training with force data greatly improves the stability and quality of the MLIP compared to training to energy alone. Because most datasets are computed with a unique level of theory, traditional single-fidelity learning is not capable of leveraging the vast amounts of published QM data. In this study, we apply multi-fidelity learning to train an MLIP to multiple QM datasets of different levels of accuracy, i.e., levels of fidelity. Specifically, we perform three test cases to demonstrate that multi-fidelity learning with both low-level forces and high-level energies yields an extremely accurate MLIP -- far more accurate than a single-fidelity MLIP trained solely to high-level energies and almost as accurate as a single-fidelity MLIP trained directly to high-level energies and forces. Therefore, multi-fidelity learning greatly alleviates the need for generating large and expensive datasets containing high-accuracy atomic forces and allows for more effective training to existing high-accuracy energy-only datasets. Indeed, low-accuracy atomic forces and high-accuracy energies are all that are needed to achieve a high-accuracy MLIP with multi-fidelity learning.
Submitted 26 June, 2025; v1 submitted 2 May, 2025;
originally announced May 2025.
-
Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials
Authors:
Sakib Matin,
Emily Shinkle,
Yulia Pimonova,
Galen T. Craven,
Aleksandra Pachalieva,
Ying Wai Li,
Kipton Barros,
Nicholas Lubbers
Abstract:
The quality of machine learning interatomic potentials (MLIPs) strongly depends on the quantity of training data as well as the quantum chemistry (QC) level of theory used. Datasets generated with high-fidelity QC methods are typically restricted to small molecules and may be missing energy gradients, which makes it difficult to train accurate MLIPs. We present an ensemble knowledge distillation (EKD) method to improve MLIP accuracy when trained to energy-only datasets. First, multiple teacher models are trained to QC energies and then generate atomic forces for all configurations in the dataset. Next, the student MLIP is trained to both QC energies and to ensemble-averaged forces generated by the teacher models. We apply this workflow on the ANI-1ccx dataset, where the configuration energies are computed at the coupled cluster level of theory. The resulting student MLIPs achieve new state-of-the-art accuracy on the COMP6 benchmark and show improved stability for molecular dynamics simulations.
Submitted 12 June, 2025; v1 submitted 18 March, 2025;
originally announced March 2025.
-
Teacher-student training improves accuracy and efficiency of machine learning interatomic potentials
Authors:
Sakib Matin,
Alice E. A. Allen,
Emily Shinkle,
Aleksandra Pachalieva,
Galen T. Craven,
Benjamin Nebgen,
Justin S. Smith,
Richard Messerly,
Ying Wai Li,
Sergei Tretiak,
Kipton Barros,
Nicholas Lubbers
Abstract:
Machine learning interatomic potentials (MLIPs) are revolutionizing the field of molecular dynamics (MD) simulations. Recent MLIPs have tended towards more complex architectures trained on larger datasets. The resulting increase in computational and memory costs may prohibit the application of these MLIPs to perform large-scale MD simulations. Here, we present a teacher-student training framework in which the latent knowledge from the teacher (atomic energies) is used to augment the students' training. We show that the light-weight student MLIPs have faster MD speeds at a fraction of the memory footprint compared to the teacher models. Remarkably, the student models can even surpass the accuracy of the teachers, even though both are trained on the same quantum chemistry dataset. Our work highlights a practical method for MLIPs to reduce the resources required for large-scale MD simulations.
Submitted 12 June, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Sunny.jl: A Julia Package for Spin Dynamics
Authors:
David Dahlbom,
Hao Zhang,
Cole Miles,
Sam Quinn,
Alin Niraula,
Bhushan Thipe,
Matthew Wilson,
Sakib Matin,
Het Mankad,
Steven Hahn,
Daniel Pajerowski,
Steve Johnston,
Zhentao Wang,
Harry Lane,
Ying Wai Li,
Xiaojian Bai,
Martin Mourigal,
Cristian D. Batista,
Kipton Barros
Abstract:
Sunny is a Julia package designed to serve the needs of the quantum magnetism community. It supports the specification of a very broad class of spin models and a diverse suite of numerical solvers. These include powerful methods for simulating spin dynamics both in and out of equilibrium. Uniquely, it features a broad generalization of classical and semiclassical approaches to SU(N) coherent states, which is useful for studying systems exhibiting strong spin-orbit coupling or local entanglement effects. Sunny also offers a well-developed framework for calculating the dynamical spin structure factor, enabling direct comparison with scattering experiments. Ease of use is a priority, with tools for symmetry-guided modeling and interactive visualization.
Submitted 23 January, 2025; v1 submitted 22 January, 2025;
originally announced January 2025.
-
Thermodynamic Transferability in Coarse-Grained Force Fields using Graph Neural Networks
Authors:
Emily Shinkle,
Aleksandra Pachalieva,
Riti Bahl,
Sakib Matin,
Brendan Gifford,
Galen T. Craven,
Nicholas Lubbers
Abstract:
Coarse-graining is a molecular modeling technique in which an atomistic system is represented in a simplified fashion that retains the most significant system features that contribute to a target output, while removing the degrees of freedom that are less relevant. This reduction in model complexity allows coarse-grained molecular simulations to reach increased spatial and temporal scales compared to corresponding all-atom models. A core challenge in coarse-graining is to construct a force field that represents the interactions in the new representation in a way that preserves the atomistic-level properties. Many approaches to building coarse-grained force fields have limited transferability between different thermodynamic conditions as a result of averaging over internal fluctuations at a specific thermodynamic state point. Here, we use a graph-convolutional neural network architecture, the Hierarchically Interacting Particle Neural Network with Tensor Sensitivity (HIP-NN-TS), to develop a highly automated training pipeline for coarse-grained force fields, which allows for studying the transferability of coarse-grained models based on the force-matching approach. We show that this approach not only yields highly accurate force fields, but also that these force fields are more transferable through a variety of thermodynamic conditions. These results illustrate the potential of machine learning techniques such as graph neural networks to improve the construction of transferable coarse-grained force fields.
Submitted 18 November, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Machine learning potentials with Iterative Boltzmann Inversion: training to experiment
Authors:
Sakib Matin,
Alice Allen,
Justin S. Smith,
Nicholas Lubbers,
Ryan B. Jadrich,
Richard A. Messerly,
Benjamin T. Nebgen,
Ying Wai Li,
Sergei Tretiak,
Kipton Barros
Abstract:
Methodologies for training machine learning potentials (MLPs) to quantum-mechanical simulation data have recently seen tremendous progress. Experimental data has a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a training procedure based on Iterative Boltzmann Inversion that produces a pair potential correction to an existing MLP, using equilibrium radial distribution function data. By applying these corrections to an MLP for pure aluminum based on Density Functional Theory, we observe that the resulting model largely addresses previous overstructuring in the melt phase. Interestingly, the corrected MLP also exhibits improved performance in predicting experimental diffusion constants, which are not included in the training procedure. The presented method does not require auto-differentiating through a molecular dynamics solver, and does not make assumptions about the MLP architecture. The results suggest a practical framework for incorporating experimental data into machine learning models to improve the accuracy of molecular dynamics simulations.
Submitted 10 July, 2023;
originally announced July 2023.
-
Learning Together: Towards foundational models for machine learning interatomic potentials with meta-learning
Authors:
Alice E. A. Allen,
Nicholas Lubbers,
Sakib Matin,
Justin Smith,
Richard Messerly,
Sergei Tretiak,
Kipton Barros
Abstract:
The development of machine learning models has led to an abundance of datasets containing quantum mechanical (QM) calculations for molecular and material systems. However, traditional training methods for machine learning models are unable to leverage the plethora of data available as they require that each dataset be generated using the same QM method. Taking machine learning interatomic potentials (MLIPs) as an example, we show that meta-learning techniques, a recent advancement from the machine learning community, can be used to fit multiple levels of QM theory in the same training process. Meta-learning changes the training procedure to learn a representation that can be easily re-trained to new tasks with small amounts of data. We then demonstrate that meta-learning enables simultaneously training to multiple large organic molecule datasets. As a proof of concept, we examine the performance of an MLIP refit to a small drug-like molecule and show that pre-training potentials to multiple levels of theory with meta-learning improves performance. This difference in performance can be seen both in the reduced error and in the improved smoothness of the potential energy surface produced. We therefore show that meta-learning can utilize existing datasets with inconsistent QM levels of theory to produce models that are better at specializing to new datasets. This opens new routes for creating pre-trained, foundational models for interatomic potentials.
Submitted 8 July, 2023;
originally announced July 2023.
-
Scaling of causal neural avalanches in a neutral model
Authors:
Sakib Matin,
Thomas Tenzin,
W. Klein
Abstract:
Neural avalanches are collective firings of neurons that exhibit emergent scale-free behavior. Understanding the nature and distribution of these avalanches is an important element in understanding how the brain functions. We study a model of neural avalanches for which the dynamics are governed by neutral theory. The neural avalanches are defined using causal connections between the firing neurons. We analyze the scaling of causal neural avalanches as the critical point is approached from the absorbing phase. By using cluster analysis tools from percolation theory, we characterize the critical properties of the neural avalanches. We identify the tuning parameters consistent with experiments. The scaling hypothesis provides a unified explanation of the power laws which characterize the critical point. The critical exponents characterizing the avalanche distributions and divergence of the response functions are consistent with the predictions of the scaling hypothesis. We use a universal scaling function for the avalanche profile to find that the firing rates for avalanches of different durations show data collapse after appropriate rescaling. We also find data collapse for the avalanche distribution functions, which is stronger evidence of criticality than just the existence of power laws. Critical slowing-down and power law relaxation of avalanches are observed as the system is tuned to its critical point. We discuss how our results motivate future empirical studies of criticality in the brain.
Submitted 14 January, 2021; v1 submitted 15 September, 2019;
originally announced September 2019.
-
Prediction in a driven-dissipative system displaying a continuous phase transition
Authors:
Chon-Kit Pun,
Sakib Matin,
W. Klein,
Harvey Gould
Abstract:
Prediction in complex systems at criticality is believed to be very difficult, if not impossible. Of particular interest is whether earthquakes, whose distribution follows a power law (Gutenberg-Richter) distribution, are in principle unpredictable. We study the predictability of event sizes in the Olami-Feder-Christensen model at different proximities to criticality using a convolutional neural network. The distribution of event sizes satisfies a power law with a cutoff for large events. We find that prediction decreases as criticality is approached and that prediction is possible only for large, non-scaling events. Our results suggest that earthquake faults that satisfy Gutenberg-Richter scaling are difficult to forecast.
Submitted 26 July, 2019;
originally announced July 2019.
-
Genetic drift in range expansions is very sensitive to density feedback in dispersal and growth
Authors:
Gabriel Birzu,
Sakib Matin,
Oskar Hallatschek,
Kirill S. Korolev
Abstract:
Theory predicts rapid genetic drift during invasions, yet many expanding populations maintain high genetic diversity. We find that genetic drift is dramatically suppressed when dispersal rates increase with the population density because many more migrants from the diverse, high-density regions arrive at the expansion edge. When density-dependence is weak or negative, the effective population size of the front scales only logarithmically with the carrying capacity. The dependence, however, switches to a sublinear power law and then to a linear increase as the density-dependence becomes strongly positive. We develop a unified framework revealing that the transitions between different regimes of diversity loss are controlled by a single, universal parameter: the ratio of the expansion velocity to the geometric mean of dispersal and growth rates at the expansion edge. Our results suggest that positive density-dependence could dramatically alter evolution in expanding populations even when its contribution to the expansion velocity is small.
Submitted 27 March, 2019;
originally announced March 2019.
-
Universal fluctuations in growth dynamics of economic systems
Authors:
Nathan C. Frey,
Sakib Matin,
H. Eugene Stanley,
Michael Salinger
Abstract:
The growth of business firms is an example of a system of complex interacting units that resembles complex interacting systems in nature such as earthquakes. Remarkably, work in econophysics has provided evidence that the statistical properties of the growth of business firms follow the same sorts of power laws that characterize physical systems near their critical points. Given how economies change over time, whether these statistical properties are persistent, robust, and universal like those of physical systems remains an open question. Here, we show that the scaling properties of firm growth previously demonstrated for publicly-traded U.S. manufacturing firms from 1974 to 1993 apply to the same sorts of firms from 1993 to 2015, to firms in other broad sectors (such as materials), and to firms in new sectors (such as Internet services). We measure virtually the same scaling exponent for manufacturing for the 1993 to 2015 period as for the 1974 to 1993 period and virtually the same scaling exponent for other sectors as for manufacturing. Furthermore, we show that fluctuations of the growth rate for new industries self-organize into a power law over relatively short time scales.
Submitted 21 May, 2018; v1 submitted 5 December, 2017;
originally announced December 2017.