-
A Foundation Model for Material Fracture Prediction
Authors:
Agnese Marcato,
Aleksandra Pachalieva,
Ryley G. Hill,
Kai Gao,
Xiaoyu Wang,
Esteban Rougier,
Zhou Lei,
Vinamra Agrawal,
Janel Chua,
Qinjun Kang,
Jeffrey D. Hyman,
Abigail Hunter,
Nathan DeBardeleben,
Earl Lawrence,
Hari Viswanathan,
Daniel O'Malley,
Javier E. Santos
Abstract:
Accurately predicting when and how materials fail is critical to designing safe, reliable structures, mechanical systems, and engineered components that operate under stress. Yet, fracture behavior remains difficult to model across the diversity of materials, geometries, and loading conditions in real-world applications. While machine learning (ML) methods show promise, most models are trained on narrow datasets, lack robustness, and struggle to generalize. Meanwhile, physics-based simulators offer high-fidelity predictions but are fragmented across specialized methods and require substantial high-performance computing resources to explore the input space. To address these limitations, we present a data-driven foundation model for fracture prediction, a transformer-based architecture that operates across simulators, a wide range of materials (including plastic-bonded explosives, steel, aluminum, shale, and tungsten), and diverse loading conditions. The model supports both structured and unstructured meshes, combining them with large language model embeddings of textual input decks specifying material properties, boundary conditions, and solver settings. This multimodal input design enables flexible adaptation across simulation scenarios without changes to the model architecture. The trained model can be fine-tuned with minimal data on diverse downstream tasks, including time-to-failure estimation, modeling fracture evolution, and adapting to combined finite-discrete element method simulations. It also generalizes to unseen materials such as titanium and concrete, requiring as few as a single sample, dramatically reducing data needs compared to standard ML. Our results show that fracture prediction can be unified under a single model architecture, offering a scalable, extensible alternative to simulator-specific workflows.
Submitted 30 July, 2025;
originally announced July 2025.
-
Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials
Authors:
Sakib Matin,
Emily Shinkle,
Yulia Pimonova,
Galen T. Craven,
Aleksandra Pachalieva,
Ying Wai Li,
Kipton Barros,
Nicholas Lubbers
Abstract:
The quality of machine learning interatomic potentials (MLIPs) strongly depends on the quantity of training data as well as the quantum chemistry (QC) level of theory used. Datasets generated with high-fidelity QC methods are typically restricted to small molecules and may be missing energy gradients, which makes it difficult to train accurate MLIPs. We present an ensemble knowledge distillation (EKD) method to improve MLIP accuracy when trained to energy-only datasets. First, multiple teacher models are trained to QC energies and then generate atomic forces for all configurations in the dataset. Next, the student MLIP is trained to both QC energies and to ensemble-averaged forces generated by the teacher models. We apply this workflow on the ANI-1ccx dataset, where the configuration energies are computed at the coupled cluster level of theory. The resulting student MLIPs achieve new state-of-the-art accuracy on the COMP6 benchmark and show improved stability for molecular dynamics simulations.
Submitted 12 June, 2025; v1 submitted 18 March, 2025;
originally announced March 2025.
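The student training step described in the abstract above (QC energies plus ensemble-averaged teacher forces) can be sketched as a two-term loss. This is a minimal illustration and not the authors' implementation: the squared-error form, the `force_weight` parameter, and both function names are assumptions introduced here.

```python
from statistics import fmean


def ensemble_mean_forces(teacher_forces):
    """Average per-atom force vectors over an ensemble of teachers.

    teacher_forces: list over teachers, each a list over atoms of [fx, fy, fz].
    """
    n_atoms = len(teacher_forces[0])
    return [
        [fmean([teacher[a][k] for teacher in teacher_forces]) for k in range(3)]
        for a in range(n_atoms)
    ]


def student_loss(e_pred, e_qc, f_pred, f_teacher_mean, force_weight=1.0):
    """Hypothetical student loss: energy error plus weighted force error.

    The energy term compares against the QC reference energy; the force term
    compares against the ensemble-averaged teacher forces, since the dataset
    itself is assumed to contain no QC forces.
    """
    energy_term = (e_pred - e_qc) ** 2
    squared_force_errors = [
        (p - t) ** 2
        for atom_pred, atom_teacher in zip(f_pred, f_teacher_mean)
        for p, t in zip(atom_pred, atom_teacher)
    ]
    return energy_term + force_weight * fmean(squared_force_errors)
```

In practice the relative weighting of the two terms would be a tuning choice; the value 1.0 here is only a placeholder.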
-
Teacher-student training improves accuracy and efficiency of machine learning interatomic potentials
Authors:
Sakib Matin,
Alice E. A. Allen,
Emily Shinkle,
Aleksandra Pachalieva,
Galen T. Craven,
Benjamin Nebgen,
Justin S. Smith,
Richard Messerly,
Ying Wai Li,
Sergei Tretiak,
Kipton Barros,
Nicholas Lubbers
Abstract:
Machine learning interatomic potentials (MLIPs) are revolutionizing the field of molecular dynamics (MD) simulations. Recent MLIPs have tended towards more complex architectures trained on larger datasets. The resulting increase in computational and memory costs may prohibit the application of these MLIPs to perform large-scale MD simulations. Here, we present a teacher-student training framework in which the latent knowledge from the teacher (atomic energies) is used to augment the students' training. We show that the light-weight student MLIPs have faster MD speeds at a fraction of the memory footprint compared to the teacher models. Remarkably, the student models can even surpass the accuracy of the teachers, even though both are trained on the same quantum chemistry dataset. Our work highlights a practical method for MLIPs to reduce the resources required for large-scale MD simulations.
Submitted 12 June, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Developing a Foundation Model for Predicting Material Failure
Authors:
Agnese Marcato,
Javier E. Santos,
Aleksandra Pachalieva,
Kai Gao,
Ryley Hill,
Esteban Rougier,
Qinjun Kang,
Jeffrey Hyman,
Abigail Hunter,
Janel Chua,
Earl Lawrence,
Hari Viswanathan,
Daniel O'Malley
Abstract:
Understanding material failure is critical for designing stronger and lighter structures by identifying weaknesses that could be mitigated. Existing full-physics numerical simulation techniques involve trade-offs between speed, accuracy, and the ability to handle complex features like varying boundary conditions, grid types, resolution, and physical models. We present the first foundation model specifically designed for predicting material failure, leveraging large-scale datasets and a high parameter count (up to 3B) to significantly improve the accuracy of failure predictions. In addition, a large language model provides rich context embeddings, enabling our model to make predictions across a diverse range of conditions. Unlike traditional machine learning models, which are often tailored to specific systems or limited to narrow simulation conditions, our foundation model is designed to generalize across different materials and simulators. This flexibility enables the model to handle a range of material properties and conditions, providing accurate predictions without the need for retraining or adjustments for each specific case. Our model is capable of accommodating diverse input formats, such as images and varying simulation conditions, and producing a range of outputs, from simulation results to effective properties. It supports both Cartesian and unstructured grids, with design choices that allow for seamless updates and extensions as new data and requirements emerge. Our results show that increasing the scale of the model leads to significant performance gains (loss scales as $N^{-1.6}$, compared to language models which often scale as $N^{-0.5}$).
Submitted 13 November, 2024;
originally announced November 2024.
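To make the scaling exponents quoted in the abstract above concrete, the sketch below compares how much the loss would shrink for a tenfold increase in model size. It assumes a pure power law $L(N) = c\,N^{-\alpha}$ with a constant prefactor and reads $N$ as the parameter count; both are assumptions made here, and `loss_ratio` is an illustrative helper.

```python
def loss_ratio(scale_factor: float, exponent: float) -> float:
    """Ratio L(scale_factor * N) / L(N) for a power law L(N) = c * N**(-exponent)."""
    return scale_factor ** (-exponent)


if __name__ == "__main__":
    for name, exponent in [
        ("reported fracture model", 1.6),
        ("typical language model", 0.5),
    ]:
        ratio = loss_ratio(10.0, exponent)
        print(f"{name}: 10x larger model -> loss multiplied by {ratio:.4f}")
```

Under these assumptions a tenfold increase in $N$ multiplies the loss by roughly 0.025 for an exponent of 1.6, against roughly 0.32 for an exponent of 0.5.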
-
Thermodynamic Transferability in Coarse-Grained Force Fields using Graph Neural Networks
Authors:
Emily Shinkle,
Aleksandra Pachalieva,
Riti Bahl,
Sakib Matin,
Brendan Gifford,
Galen T. Craven,
Nicholas Lubbers
Abstract:
Coarse-graining is a molecular modeling technique in which an atomistic system is represented in a simplified fashion that retains the most significant system features that contribute to a target output, while removing the degrees of freedom that are less relevant. This reduction in model complexity allows coarse-grained molecular simulations to reach increased spatial and temporal scales compared to corresponding all-atom models. A core challenge in coarse-graining is to construct a force field that represents the interactions in the new representation in a way that preserves the atomistic-level properties. Many approaches to building coarse-grained force fields have limited transferability between different thermodynamic conditions as a result of averaging over internal fluctuations at a specific thermodynamic state point. Here, we use a graph-convolutional neural network architecture, the Hierarchically Interacting Particle Neural Network with Tensor Sensitivity (HIP-NN-TS), to develop a highly automated training pipeline for coarse-grained force fields, which allows for studying the transferability of coarse-grained models based on the force-matching approach. We show that this approach not only yields highly accurate force fields, but also that these force fields are more transferable across a variety of thermodynamic conditions. These results illustrate the potential of machine learning techniques such as graph neural networks to improve the construction of transferable coarse-grained force fields.
Submitted 18 November, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Learning the Factors Controlling Mineralization for Geologic Carbon Sequestration
Authors:
Aleksandra Pachalieva,
Jeffrey D. Hyman,
Daniel O'Malley,
Hari Viswanathan,
Gowri Srinivasan
Abstract:
We perform a set of flow and reactive transport simulations within three-dimensional fracture networks to learn the factors controlling mineral reactions. CO$_2$ mineralization requires CO$_2$-laden water and dissolution of a mineral that then leads to precipitation of a CO$_2$-bearing mineral. Our discrete fracture networks (DFN) are partially filled with quartz that gradually dissolves until it reaches a quasi-steady state. At the end of the simulation, we measure the quartz remaining in each fracture within the domain. We observe that a small backbone of fractures exists where the quartz is fully dissolved, which leads to increased flow and transport. However, depending on the DFN topology and the rate of dissolution, we observe a large variability of these changes, which indicates an interplay between the fracture network structure and the impact of geochemical dissolution. In this work, we developed a machine learning framework to extract the important features that support mineralization in the form of dissolution. In addition, we use structural and topological features of the fracture network to predict the remaining quartz volume in quasi-steady state conditions. As a first step to characterizing carbon mineralization, we study dissolution with this framework. We studied a variety of reaction and fracture parameters and their impact on the dissolution of quartz in fracture networks. We found that the dissolution reaction rate constant of quartz and the distance to the flowing backbone in the fracture network are the two most important features that control the amount of quartz left in the system. For the first time, we use a combination of a finite-volume reservoir model and a graph-based approach to study reactive transport in a complex fracture network to determine the key features that control dissolution.
Submitted 20 December, 2023;
originally announced December 2023.
-
Impact of artificial topological changes on flow and transport through fractured media due to mesh resolution
Authors:
Aleksandra A. Pachalieva,
Matthew R. Sweeney,
Hari Viswanathan,
Emily Stein,
Rosie Leone,
Jeffrey D. Hyman
Abstract:
We performed a set of numerical simulations to characterize the interplay of fracture network topology, upscaling, and mesh refinement on flow and transport properties in fractured porous media. We generated a set of generic three-dimensional discrete fracture networks at various densities, where the radii of the fractures were sampled from a truncated power-law distribution, and whose parameters were loosely based on field site characterizations. We also considered five network densities, which were defined using a dimensionless version of density based on percolation theory. Once the networks were generated, we upscaled them into a single continuum model using the upscaled discrete fracture matrix model presented by Sweeney et al. We considered steady, isothermal pressure-driven flow through each domain and then simulated conservative, decaying, and adsorbing tracers using a pulse injection into the domain. For each simulation, we calculated the effective permeability and solute breakthrough curves as quantities of interest to compare between network realizations. We found that selecting a mesh resolution such that the global topology of the upscaled mesh matches the fracture network is essential. If the upscaled mesh has a connected pathway of fracture (higher permeability) cells but the fracture network does not, then the estimates for effective permeability and solute breakthrough will be incorrect. False connections cannot be eliminated entirely, but they can be managed by choosing appropriate mesh resolution and refinement for a given network. Adopting octree meshing to obtain sufficient levels of refinement leads to fewer computational cells (up to a 90% reduction in overall cell count) when compared to using a uniform resolution grid and can result in a more accurate continuum representation of the true fracture network.
Submitted 6 February, 2023;
originally announced February 2023.
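The abstract above states that fracture radii were sampled from a truncated power-law distribution but gives no parameters. As a rough sketch of how such sampling can work, the function below applies inverse-CDF sampling for a density proportional to $r^{-\alpha}$ on $[r_{\min}, r_{\max}]$. The density convention, the function name, and the example values of `alpha`, `r_min`, and `r_max` are all assumptions made here, not values from the paper.

```python
import random


def sample_truncated_power_law(alpha, r_min, r_max, u):
    """Inverse-CDF sample for a density proportional to r**(-alpha) on [r_min, r_max].

    u is a uniform random number in [0, 1]. With k = 1 - alpha, the CDF is
    F(r) = (r**k - r_min**k) / (r_max**k - r_min**k), which is inverted below.
    """
    if alpha == 1.0:
        raise ValueError("alpha == 1 needs the logarithmic form of the CDF")
    k = 1.0 - alpha
    lower, upper = r_min ** k, r_max ** k
    return (lower + u * (upper - lower)) ** (1.0 / k)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters chosen only for illustration.
    radii = [sample_truncated_power_law(2.6, 1.0, 10.0, rng.random()) for _ in range(5)]
    print(radii)
```

Small radii dominate the samples when `alpha` is larger than 1, which is the usual motivation for truncating the distribution at both ends.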
-
Physics-informed machine learning with differentiable programming for heterogeneous underground reservoir pressure management
Authors:
Aleksandra Pachalieva,
Daniel O'Malley,
Dylan Robert Harp,
Hari Viswanathan
Abstract:
Avoiding over-pressurization in subsurface reservoirs is critical for applications like CO$_2$ sequestration and wastewater injection. Managing the pressures by controlling injection/extraction is challenging because of complex heterogeneity in the subsurface. The heterogeneity typically requires high-fidelity physics-based models to make predictions on CO$_2$ fate. Furthermore, characterizing the heterogeneity accurately is fraught with parametric uncertainty. Accounting for both heterogeneity and uncertainty makes this a computationally intensive problem that is challenging for current reservoir simulators. To tackle this, we use differentiable programming with a full-physics model and machine learning to determine the fluid extraction rates that prevent over-pressurization at critical reservoir locations. We use the DPFEHM framework, which has trustworthy physics based on the standard two-point flux finite volume discretization and is also automatically differentiable like machine learning models. Our physics-informed machine learning framework uses convolutional neural networks to learn an appropriate extraction rate based on the permeability field. We also perform a hyperparameter search to improve the model's accuracy. Training and testing scenarios are executed to evaluate the feasibility of using physics-informed machine learning to manage reservoir pressures. We constructed and tested a sufficiently accurate simulator that is 400000 times faster than the underlying physics-based simulator, allowing for near real-time analysis and robust uncertainty quantification.
Submitted 21 June, 2022;
originally announced June 2022.
-
Connecting lattice Boltzmann methods to physical reality by coarse-graining Molecular Dynamics simulations
Authors:
Aleksandra Pachalieva,
Alexander J. Wagner
Abstract:
The success of lattice Boltzmann methods has been attributed to their mesoscopic nature as a method derivable from a physically consistent microscopic model. Original lattice Boltzmann methods were Boltzmann averages of an underlying lattice gas. In the transition to modern lattice Boltzmann methods, this link was broken, and the frequently used over-relaxation to achieve high Reynolds numbers has been seen as lacking physical motivation. While this approach has undeniable utility, it appeared to break the link to any underlying physical reality, putting into question the special place of lattice Boltzmann methods among fluid simulation methods. In this letter, we show that over-relaxation arises naturally from physical lattice gases that are derived as a coarse-graining of Molecular Dynamics simulations, thereby re-affirming the firm foundation of lattice Boltzmann methods in physical reality.
Submitted 22 September, 2021; v1 submitted 10 September, 2021;
originally announced September 2021.
-
Molecular dynamics lattice gas equilibrium distribution function for Lennard-Jones particles
Authors:
Aleksandra Pachalieva,
Alexander J. Wagner
Abstract:
The molecular dynamics lattice gas method maps a molecular dynamics simulation onto a lattice gas using a coarse-graining procedure. This is a novel fundamental approach to derive the lattice Boltzmann method by taking a Boltzmann average over the molecular dynamics lattice gas. A key property of the lattice Boltzmann method is the equilibrium distribution function, which was originally derived by assuming that the particle displacements in the molecular dynamics simulation are Boltzmann distributed. However, we recently discovered that a single Gaussian distribution function is not sufficient to describe the particle displacements in a broad transition regime between free particles and particles undergoing many collisions in one time step. In a recent publication, we proposed a Poisson weighted sum of Gaussians which shows better agreement with the molecular dynamics data. We derive a lattice Boltzmann equilibrium distribution function from the Poisson weighted sum of Gaussians model and compare it to a measured equilibrium distribution function from molecular dynamics data and to an analytical approximation of the equilibrium distribution function from a single Gaussian probability distribution function.
Submitted 22 March, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
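The "Poisson weighted sum of Gaussians" named in the abstract above can be illustrated with a generic one-dimensional mixture density. This is only a sketch of the general idea: the zero-mean components, the truncation at `c_max`, and the caller-supplied `sigma_of_c` function (the width of the component for index `c`) are assumptions introduced here, since the abstract does not state how the component widths are defined.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean normal density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def poisson_weighted_gaussian_pdf(x, lam, sigma_of_c, c_max=60):
    """Mixture density sum_c Poisson(c; lam) * Normal(x; 0, sigma_of_c(c)**2).

    sigma_of_c is a hypothetical placeholder supplied by the caller. The sum
    is truncated at c_max, which is adequate when lam is much smaller than c_max.
    """
    total = 0.0
    weight = math.exp(-lam)  # Poisson weight for c = 0
    for c in range(c_max + 1):
        total += weight * gaussian_pdf(x, sigma_of_c(c))
        weight *= lam / (c + 1)  # recurrence giving the weight for c + 1
    return total
```

When every component has the same width, the mixture collapses to a single Gaussian because the Poisson weights sum to one; the mixture differs from a single Gaussian only when the widths vary with `c`.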
-
Non-Gaussian distribution of displacements for Lennard-Jones particles in equilibrium
Authors:
Aleksandra Pachalieva,
Alexander J. Wagner
Abstract:
Most meso-scale simulation methods assume Gaussian distributions of velocity-like quantities. These quantities are not true velocities, however, but rather time-averaged velocities or displacements of particles. We show that there is a large range of coarse-graining scales where the assumption of a Gaussian distribution of these displacements fails, and a more complex distribution is required to adequately express these distribution functions of displacements.
Submitted 9 June, 2020;
originally announced June 2020.
-
Validity of the Molecular-Dynamics-Lattice-Gas Global Equilibrium Distribution Function
Authors:
M. Reza Parsa,
Aleksandra Pachalieva,
Alexander J. Wagner
Abstract:
The MDLG method establishes a direct link between a lattice-gas method and the coarse-graining of a Molecular Dynamics approach. Due to its connection to Molecular Dynamics, the MDLG rigorously recovers the hydrodynamics and allows the behavior of the lattice-gas or lattice-Boltzmann methods to be validated directly without using the standard kinetic theory approach. In this paper, we show that the analytical definition of the equilibrium distribution function remains valid even for very high volume fractions.
Submitted 26 February, 2019; v1 submitted 20 November, 2018;
originally announced November 2018.