-
A collaborative digital twin built on FAIR data and compute infrastructure
Authors:
Thomas M. Deucher,
Juan C. Verduzco,
Michael Titus,
Alejandro Strachan
Abstract:
The integration of machine learning with automated experimentation in self-driving laboratories (SDL) offers a powerful approach to accelerate discovery and optimization tasks in science and engineering applications. When supported by findable, accessible, interoperable, and reusable (FAIR) data infrastructure, SDLs with overlapping interests can collaborate more effectively. This work presents a distributed SDL implementation built on nanoHUB services for online simulation and FAIR data management. In this framework, geographically dispersed collaborators conducting independent optimization tasks contribute raw experimental data to a shared central database. These researchers can then benefit from analysis tools and machine learning models that automatically update as additional data become available. New data points are submitted through a simple web interface and automatically processed using a nanoHUB Sim2L, which extracts derived quantities and indexes all inputs and outputs in a FAIR data repository called ResultsDB. A separate nanoHUB workflow enables sequential optimization using active learning, where researchers define the optimization objective and machine learning models are trained on-the-fly with all existing data, guiding the selection of future experiments. Inspired by the concept of a "frugal twin", the optimization task seeks the optimal recipe for combining food dyes to achieve a desired target color. With easily accessible and inexpensive materials, researchers and students can set up their own experiments, share data with collaborators, and explore the combination of FAIR data, predictive ML models, and sequential optimization. The tools introduced are generally applicable and can easily be extended to other optimization problems.
Submitted 24 June, 2025;
originally announced July 2025.
-
Accelerating active learning materials discovery with FAIR data and workflows: a case study for alloy melting temperatures
Authors:
Mohnish Harwani,
Juan C. Verduzco,
Brian H. Lee,
Alejandro Strachan
Abstract:
Active learning (AL) is a powerful sequential optimization approach that has shown great promise in the discovery of new materials. However, a major challenge remains the acquisition of the initial data and the development of workflows to generate new data at each iteration. In this study, we demonstrate a significant speedup in an optimization task by reusing a published simulation workflow available for online simulations and its associated data repository, where the results of each workflow run are automatically stored. Both the workflow and its data follow FAIR (findable, accessible, interoperable, and reusable) principles using nanoHUB's infrastructure. The workflow employs molecular dynamics to calculate the melting temperature of multi-principal component alloys. We leveraged all prior data not only to develop an accurate machine learning model to start the sequential optimization but also to optimize the simulation parameters and accelerate convergence. Prior work showed that finding the alloy composition with the highest melting temperature required testing 15 alloy compositions, and establishing the melting temperature for each composition took, on average, 4 simulations. By developing a workflow that utilizes the FAIR data in the nanoHUB database, we reduced the number of simulations per composition to one and found the alloy with the lowest melting temperature testing only three compositions. This second optimization, therefore, shows a speedup of 10x as compared to models that do not access the FAIR databases.
Submitted 20 November, 2024;
originally announced November 2024.
-
GPT-4 as an interface between researchers and computational software: improving usability and reproducibility
Authors:
Juan C. Verduzco,
Ethan Holbrook,
Alejandro Strachan
Abstract:
Large language models (LLMs) are playing an increasingly important role in science and engineering. For example, their ability to parse and understand human and computer languages makes them powerful interpreters, and their use in applications like code generation is well documented. We explore the ability of the GPT-4 LLM to ameliorate two major challenges in computational materials science: i) the high barriers to adoption of scientific software associated with the use of custom input languages, and ii) the poor reproducibility of published results due to insufficient detail in the description of simulation methods. We focus on a widely used software package for molecular dynamics simulations, the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and quantify the usefulness of input files generated by GPT-4 from task descriptions in English and its ability to generate detailed descriptions of computational tasks from input files. We find that GPT-4 can generate correct and ready-to-use input files for relatively simple tasks and useful starting points for more complex, multi-step simulations. In addition, GPT-4's description of computational tasks from input files can be tuned from a detailed set of step-by-step instructions to a summary description appropriate for publications. Our results show that GPT-4 can reduce the number of routine tasks performed by researchers, accelerate the training of new users, and enhance reproducibility.
Submitted 4 October, 2023;
originally announced October 2023.
-
nanoHUB services for FAIR simulations and data: ResultsDB and Sim2Ls
Authors:
Daniel Mejia,
Steven Clark,
Juan Carlos Verduzco,
Michael Zentner,
Lynn Zentner,
Gerhard Klimeck,
Alejandro Strachan
Abstract:
nanoHUB is an open cyber platform for online simulation, data, and education that seeks to make scientific software and associated data widely available and useful. This paper describes recent developments in our simulation infrastructure to address modern data needs. nanoHUB's Sim2Ls (pronounced sim tools) make simulation, modeling, and data workflows discoverable and accessible to all users for cloud computing using standard APIs. In addition, published tools are findable (with digital object identifiers), reusable (via documented requirements and services), and reproducible via containerization. In addition, all Sim2L runs are automatically cached, and their results indexed into a global and queryable database (ResultsDB). We believe this infrastructure significantly lowers the barriers towards making simulation/data workflows and their data findable, accessible, interoperable, and reusable (FAIR). This frictionless access to simulations and data enables researchers, instructors, and students to focus on the application of these products to advance their fields.
Submitted 19 June, 2023;
originally announced June 2023.
-
Atomistic mechanisms underlying the maximum in diffusivity in doped Li$_7$La$_3$Zr$_2$O$_{12}$
Authors:
Juan C. Verduzco,
Ernesto E. Marinero,
Alejandro Strachan
Abstract:
Doped lithium lanthanum zirconium oxide (LLZO) is a promising class of solid electrolytes for lithium-ion batteries due to their good electrochemical stability and compatibility with Li metal anodes. Ionic diffusivity in these ceramics is known to occur via correlated, vacancy-mediated jumps of Li+ between alternating tetrahedral and octahedral sites. Aliovalent doping at the Zr site increases the concentration of vacancies in the Li+ sublattice and the cation diffusivity, but such an increase is universally followed by a decrease for Li+ concentrations lower than 6.3 to 6.5 Li molar content. Molecular dynamics simulations based on density functional theory show that the maximum in diffusivity originates from competing effects between the increased vacancy concentration and the increasing occupancy of the low-energy tetrahedral sites by Li+, which increases the overall activation energy associated with diffusion. For the relatively high temperatures of our simulations, Li+ concentration plays a dominant role in transport as compared to dopant chemistry.
Submitted 15 May, 2023;
originally announced May 2023.
-
Mapping microstructure to shock-induced temperature fields using deep learning
Authors:
Chunyu Li,
Juan Carlos Verduzco,
Brian H. Lee,
Robert J. Appleton,
Alejandro Strachan
Abstract:
The response of materials to dynamical, or shock, loading is important to planetary science, aerospace engineering, and energetic materials. Thermally activated processes, including chemical reactions and phase transitions, are significantly accelerated by the localization of the deposited energy into hotspots. These hotspots result from the interaction of a supersonic wave with the material's microstructure and are governed by complex, coupled processes, including the collapse of porosity, interfacial friction, and localized plastic deformation. These mechanisms are not fully understood, and today we lack models that can, for example, predict the shock-to-detonation transition from chemistry and microstructure alone. We demonstrate that deep learning techniques can be used to predict the resulting shock-induced temperature fields in complex composite materials obtained from large-scale molecular dynamics simulations with the initial microstructure as the only input. The accuracy of the Microstructure-Informed Shock-induced Temperature net (MISTnet) model is higher than the current state of the art at a fraction of the computational cost.
Submitted 30 March, 2023;
originally announced March 2023.
-
Active learning and molecular dynamics simulations to find high melting temperature alloys
Authors:
David E. Farache,
Juan C. Verduzco,
Zachary D. McClure,
Saaketh Desai,
Alejandro Strachan
Abstract:
Active learning (AL) can drastically accelerate materials discovery; its power has been shown in various classes of materials and target properties. Prior efforts have used machine learning models for the optimal selection of physical experiments or physics-based simulations. However, the latter efforts have been mostly limited to the use of electronic structure calculations and properties that can be obtained at the unit cell level and with negligible noise. We couple AL with molecular dynamics simulations to identify multiple principal component alloys (MPCAs) with high melting temperatures. Building on cloud computing services through nanoHUB, we present a fully autonomous workflow for the efficient exploration of the high dimensional compositional space of MPCAs. We characterize how uncertainties arising from the stochastic nature of the simulations and the acquisition functions used to select simulations affect the convergence of the approach. Interestingly, we find that relatively short simulations with significant uncertainties can be used to efficiently find the desired alloys as the random forest models used for AL average out fluctuations.
Submitted 15 October, 2021;
originally announced October 2021.
-
Ionic conductivity optimization of composite polymer electrolytes through filler particle chemical modification
Authors:
Andres Villa,
Juan Carlos Verduzco,
Joseph A. Libera,
Ernesto E. Marinero
Abstract:
The addition of filler particles to polymer electrolytes is known to increase their ionic conductivity (IC). A detailed understanding of how the interactions between the constituent materials are responsible for the enhancement remains to be developed. A significant contribution is ascribed to an increase of the polymer amorphous fraction, induced by the fillers, resulting in the formation of higher ionic conductivity channels in the polymer matrix. However, the dependence of IC on the particle weight load and its composition on the polymer morphology is not fully understood. This work investigates Li ion transport in composite polymer electrolytes (CPE) comprising Bi-doped LLZO particles embedded in PEO:LiTFSI matrices. We find that the IC is maximized at very low particle weight loads (5 to 10%) and that both its magnitude and the load required strongly depend on the garnet particle composition. Based on structural characterization results and electrochemical impedance spectroscopy, a mechanism is proposed to explain these findings. It is suggested that the Li molar content in the garnet particle controls its interactions with the polymer matrix, resulting, at the optimum loads reported, in the formation of high ionic conductivity channels. We propose that filler particle chemical manipulation of the polymer morphology is a promising avenue for the further development of composite polymer electrolytes.
Submitted 17 March, 2021;
originally announced March 2021.
-
Hybrid Polymer-Garnet Materials for All-Solid-State Energy Storage Devices
Authors:
Juan C. Verduzco,
John N. Vergados,
Alejandro Strachan,
Ernesto E. Marinero
Abstract:
Hybrid electrolyte materials comprising polymer-ionic salt matrices embedded with garnet particles constitute a promising class of materials for the realization of all-solid-state batteries. In addition to providing solutions to the safety issues inherent to current liquid electrolytes, hybrid polymer electrolytes offer advantages over other solid-state electrolytes. This is because their functional properties, such as ionic conductivity, electrochemical stability, and mechanical and thermal properties, can be tailored to a particular application by independently optimizing the properties of the constituent materials, thereby providing a rational approach to solving the bottlenecks that currently prevent solid-state electrolytes from practical implementation in battery devices. This review starts with a survey of solid-state electrolytes, focusing on their materials and ion transport limitations. Next, we summarize the current understanding of transport mechanisms in composite polymer electrolytes (CPEs) with the purpose of identifying materials solutions for further improving their properties. The overall goal of the review is to foster heightened research interest in these hybrid structures to rapidly advance development of future all-solid-state battery devices.
Submitted 30 April, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.