-
Designing Corrosion-Resistant CoCrNi Medium Entropy Alloys via Short-Range Order Modification
Authors:
Elaf A. Anber,
Debashish Sur,
Annie K. Barnett,
Daniel L. Foley,
Andrew M. Minor,
Brian L. DeCost,
Howie Joress,
Anatoly I. Frenkel,
Michael L. Falk,
John R. Scully,
Mitra L. Taheri
Abstract:
Equiatomic CoCrNi medium entropy alloys are known for their unique properties linked to chemical short-range order (CSRO), crucial in both percolation processes and/or nucleation and growth processes influencing alloy passivation in aqueous environments. This study combines extended x-ray absorption fine structure, atomistic simulations, electrochemical methods, x-ray photoelectron spectroscopy, and transmission electron microscopy to explore CSRO evolution, passive film formation, as well as its characteristics in the as-homogenized CoCrNi condition, both before and after aging treatment. Results reveal a shift in local alloying element bonding environments post-aging, with simulations indicating increased Cr-Cr CSRO in 2nd nearest neighbor shells. Enhanced passive film formation kinetics and superior protection of the aged alloy in harsh acidified 3 mol/L NaCl solution indicate improved aqueous passivation correlated with Cr-Cr CSRO. This work establishes a direct connection between alloy CSRO and aqueous passivation in CoCrNi, highlighting its potential for tailored corrosion-resistant applications.
Submitted 11 June, 2025;
originally announced June 2025.
-
Intrinsic Direct Air Capture
Authors:
Austin McDannald,
Daniel W. Siderius,
Brian DeCost,
Kamal Choudhary,
Diana L. Ortiz-Montalvo
Abstract:
We present new metrics to evaluate solid sorbent materials for Direct Air Capture (DAC). These new metrics provide a theoretical upper bound on CO2 captured per energy as well as a theoretical upper limit on the purity of the captured CO2. These new metrics are based entirely on intrinsic material properties and are therefore agnostic to the design of the DAC system. These metrics apply to any adsorption-refresh cycle design. In this work we demonstrate the use of these metrics with the example of temperature-pressure swing refresh cycles. The main requirement for applying these metrics is to describe the equilibrium uptake (along with a few other materials properties) of each species in terms of the thermodynamic variables (e.g. temperature, pressure). We derive these metrics from thermodynamic energy balances. To apply these metrics on a set of examples, we first generated approximations of the necessary materials properties for 11 660 metal-organic framework materials (MOFs). We find that the performance of the sorbents is highly dependent on the path through thermodynamic parameter space. These metrics allow for: 1) finding the optimum materials given a particular refresh cycle, and 2) finding the optimum refresh cycles given a particular sorbent. Applying these metrics to the database of MOFs led to the following insights: 1) start cold - the equilibrium uptake of CO2 diverges from that of N2 at lower temperatures, and 2) selectivity of CO2 vs other gases at any one point in the cycle does not matter - what matters is the relative change in uptake along the cycle.
Submitted 8 January, 2025;
originally announced January 2025.
-
Probing out-of-distribution generalization in machine learning for materials
Authors:
Kangming Li,
Andre Niyongabo Rubungo,
Xiangyun Lei,
Daniel Persaud,
Kamal Choudhary,
Brian DeCost,
Adji Bousso Dieng,
Jason Hattrick-Simpers
Abstract:
Scientific machine learning (ML) endeavors to develop generalizable models with broad applicability. However, the assessment of generalizability is often based on heuristics. Here, we demonstrate in the materials science setting that heuristics-based evaluations lead to substantially biased conclusions of ML generalizability and benefits of neural scaling. We evaluate generalization performance in over 700 out-of-distribution tasks that feature new chemistry or structural symmetry not present in the training data. Surprisingly, good performance is found in most tasks and across various ML models including simple boosted trees. Analysis of the materials representation space reveals that most tasks contain test data that lie in regions well covered by training data, while poorly-performing tasks contain mainly test data outside the training domain. For the latter case, increasing training set size or training time has marginal or even adverse effects on the generalization performance, contrary to what the neural scaling paradigm assumes. Our findings show that most heuristically-defined out-of-distribution tests are not genuinely difficult and evaluate only the ability to interpolate. Evaluating on such tasks rather than the truly challenging ones can lead to an overestimation of generalizability and benefits of scaling.
Submitted 10 June, 2024;
originally announced June 2024.
-
Efficient first principles based modeling via machine learning: from simple representations to high entropy materials
Authors:
Kangming Li,
Kamal Choudhary,
Brian DeCost,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
High-entropy materials (HEMs) have recently emerged as a significant category of materials, offering highly tunable properties. However, the scarcity of HEM data in existing density functional theory (DFT) databases, primarily due to computational expense, hinders the development of effective modeling strategies for computational materials discovery. In this study, we introduce an open DFT dataset of alloys and employ machine learning (ML) methods to investigate the material representations needed for HEM modeling. Utilizing high-throughput DFT calculations, we generate a comprehensive dataset of 84k structures, encompassing both ordered and disordered alloys across a spectrum of up to seven components and the entire compositional range. We apply descriptor-based models and graph neural networks to assess how material information is captured across diverse chemical-structural representations. We first evaluate the in-distribution performance of ML models to confirm their predictive accuracy. Subsequently, we demonstrate the capability of ML models to generalize between ordered and disordered structures, between low-order and high-order alloys, and between equimolar and non-equimolar compositions. Our findings suggest that ML models can generalize from cost-effective calculations of simpler systems to more complex scenarios. Additionally, we discuss the influence of dataset size and reveal that the information loss associated with the use of unrelaxed structures could significantly degrade the generalization performance. Overall, this research sheds light on several critical aspects of HEM modeling and offers insights for data-driven atomistic modeling of HEMs.
Submitted 22 March, 2024;
originally announced March 2024.
-
Leveraging Domain Adaptation for Accurate Machine Learning Predictions of New Halide Perovskites
Authors:
Dipannoy Das Gupta,
Zachary J. L. Bare,
Suxuen Yew,
Santosh Adhikari,
Brian DeCost,
Qi Zhang,
Charles Musgrave,
Christopher Sutton
Abstract:
We combine graph neural networks (GNN) with an inexpensive and reliable structure generation approach based on the bond-valence method (BVM) to train accurate machine learning models for screening 222,960 halide perovskites using statistical estimates of the DFT/PBE formation energy (Ef), and the PBE and HSE band gaps (Eg). The GNNs were fine-tuned using domain adaptation (DA) from a source model, which yields a factor of 1.8 times improvement in Ef and 1.2 - 1.35 times improvement in HSE Eg compared to direct training (i.e., without DA). Using these two ML models, 48 compounds were identified out of 222,960 candidates as both stable and having an HSE Eg relevant for photovoltaic applications. Of this subset, only 8 have been reported to date, indicating that 40 compounds remain unexplored to the best of our knowledge and therefore offer opportunities for potential experimental examination.
Submitted 19 January, 2024;
originally announced January 2024.
-
Learning material synthesis-process-structure-property relationship by data fusion: Bayesian Coregionalization N-Dimensional Piecewise Function Learning
Authors:
A. Gilad Kusne,
Austin McDannald,
Brian DeCost
Abstract:
Autonomous materials research labs require the ability to combine and learn from diverse data streams. This is especially true for learning material synthesis-process-structure-property relationships, key to accelerating materials optimization and discovery as well as accelerating mechanistic understanding. We present the Synthesis-process-structure-property relAtionship coreGionalized lEarner (SAGE) algorithm, a fully Bayesian algorithm that uses multimodal coregionalization to merge knowledge across data sources to learn synthesis-process-structure-property relationships. SAGE outputs a probabilistic posterior for the relationships including the most likely relationships given the data.
Submitted 20 August, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Approaches for Uncertainty Quantification of AI-predicted Material Properties: A Comparison
Authors:
Francesca Tavazza,
Kamal Choudhary,
Brian DeCost
Abstract:
The development of large databases of material properties, together with the availability of powerful computers, has allowed machine learning (ML) modeling to become a widely used tool for predicting material performances. While confidence intervals are commonly reported for such ML models, prediction intervals, i.e., the uncertainty on each prediction, are not as frequently available. Here, we investigate three easy-to-implement approaches to determine such individual uncertainty, comparing them across ten ML quantities spanning energetics, mechanical, electronic, optical, and spectral properties. Specifically, we focused on the Quantile approach, the direct machine learning of the prediction intervals and Ensemble methods.
Submitted 19 October, 2023;
originally announced October 2023.
-
Accelerating Defect Predictions in Semiconductors Using Graph Neural Networks
Authors:
Md Habibur Rahman,
Prince Gollapalli,
Panayotis Manganaris,
Satyesh Kumar Yadav,
Ghanshyam Pilania,
Brian DeCost,
Kamal Choudhary,
Arun Mannodi-Kanakkithodi
Abstract:
Here, we develop a framework for the prediction and screening of native defects and functional impurities in a chemical space of Group IV, III-V, and II-VI zinc blende (ZB) semiconductors, powered by crystal Graph-based Neural Networks (GNNs) trained on high-throughput density functional theory (DFT) data. Using an innovative approach of sampling partially optimized defect configurations from DFT calculations, we generate one of the largest computational defect datasets to date, containing many types of vacancies, self-interstitials, anti-site substitutions, impurity interstitials and substitutions, as well as some defect complexes. We applied three types of established GNN techniques, namely Crystal Graph Convolutional Neural Network (CGCNN), Materials Graph Network (MEGNET), and Atomistic Line Graph Neural Network (ALIGNN), to rigorously train models for predicting defect formation energy (DFE) in multiple charge states and chemical potential conditions. We find that ALIGNN yields the best DFE predictions with root mean square errors around 0.3 eV, which represents a prediction accuracy of 98 % given the range of values within the dataset, improving significantly on the state-of-the-art. Models are tested for different defect types as well as for defect charge transition levels. We further show that GNN-based defective structure optimization can take us close to DFT-optimized geometries at a fraction of the cost of full DFT. DFT-GNN models enable prediction and screening across thousands of hypothetical defects based on both unoptimized and partially-optimized defective structures, helping identify electronically active defects in technologically-important semiconductors.
Submitted 13 September, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Emulating Expert Insight: A Robust Strategy for Optimal Experimental Design
Authors:
Matthew R. Carbone,
Hyeong Jin Kim,
Chandima Fernando,
Shinjae Yoo,
Daniel Olds,
Howie Joress,
Brian DeCost,
Bruce Ravel,
Yugang Zhang,
Phillip M. Maffettone
Abstract:
The challenge of optimal design of experiments (DOE) pervades materials science, physics, chemistry, and biology. Bayesian optimization has been used to address this challenge in vast sample spaces, although it requires framing experimental campaigns through the lens of maximizing some observable. This framing is insufficient for epistemic research goals that seek to comprehensively analyze a sample space, without an explicit scalar objective (e.g., the characterization of a wafer or sample library). In this work, we propose a flexible formulation of scientific value that recasts a dataset of input conditions and higher-dimensional observable data into a continuous, scalar metric. Intuitively, the scientific value function measures where observables change significantly, emulating the perspective of experts driving an experiment, and can be used in collaborative analysis tools or as an objective for optimization techniques. We demonstrate this technique by exploring simulated phase boundaries from different observables, autonomously driving a variable temperature measurement of a ferroelectric material, and providing feedback from a nanoparticle synthesis campaign. The method is seamlessly compatible with existing optimization tools, can be extended to multi-modal and multi-fidelity experiments, and can integrate existing models of an experimental system. Because of its flexibility, it can be deployed in a range of experimental settings for autonomous or accelerated experiments.
Submitted 25 July, 2023;
originally announced July 2023.
-
Recent progress in the JARVIS infrastructure for next-generation data-driven materials design
Authors:
Daniel Wines,
Ramya Gurunathan,
Kevin F. Garrity,
Brian DeCost,
Adam J. Biacchi,
Francesca Tavazza,
Kamal Choudhary
Abstract:
The Joint Automated Repository for Various Integrated Simulations (JARVIS) infrastructure at the National Institute of Standards and Technology (NIST) is a large-scale collection of curated datasets and tools with more than 80000 materials and millions of properties. JARVIS uses a combination of electronic structure, artificial intelligence (AI), advanced computation and experimental methods to accelerate materials design. Here we report some of the new features that were recently included in the infrastructure such as: 1) doubling the number of materials in the database since its first release, 2) including more accurate electronic structure methods such as Quantum Monte Carlo, 3) including graph neural network-based materials design, 4) development of unified force-field, 5) development of a universal tight-binding model, 6) addition of computer-vision tools for advanced microscopy applications, 7) development of a natural language processing tool for text-generation and analysis, 8) debuting a large-scale benchmarking endeavor, 9) including quantum computing algorithms for solids, 10) integrating several experimental datasets and 11) staging several community engagement and outreach events. New classes of materials, properties, and workflows added to the database include superconductors, two-dimensional (2D) magnets, magnetic topological materials, metal-organic frameworks, defects, and interface systems. The rich and reliable datasets, tools, documentation, and tutorials make JARVIS a unique platform for modern materials design. JARVIS ensures openness of data and tools to enhance reproducibility and transparency and to promote a healthy and collaborative scientific environment.
Submitted 25 October, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
AutoEIS: automated Bayesian model selection and analysis for electrochemical impedance spectroscopy
Authors:
Runze Zhang,
Robert Black,
Debashish Sur,
Parisa Karimi,
Kangming Li,
Brian DeCost,
John Scully,
Jason Hattrick-Simpers
Abstract:
Electrochemical Impedance Spectroscopy (EIS) is a powerful tool for electrochemical analysis; however, its data can be challenging to interpret. Here, we introduce a new open-source tool named AutoEIS that assists EIS analysis by automatically proposing statistically plausible equivalent circuit models (ECMs). AutoEIS does this without requiring an exhaustive mechanistic understanding of the electrochemical systems. We demonstrate the generalizability of AutoEIS by using it to analyze EIS datasets from three distinct electrochemical systems, including thin-film oxygen evolution reaction (OER) electrocatalysis, corrosion of self-healing multi-principal components alloys, and a carbon dioxide reduction electrolyzer device. In each case, AutoEIS identified competitive or in some cases superior ECMs to those recommended by experts and provided statistical indicators of the preferred solution. The results demonstrated AutoEIS's capability to facilitate EIS analysis without expert labels while diminishing user bias in a high-throughput manner. AutoEIS provides a generalized automated approach to facilitate EIS analysis spanning a broad suite of electrochemical applications with minimal prior knowledge of the system required. This tool holds great potential in improving the efficiency, accuracy, and ease of EIS analysis and thus creates an avenue to the widespread use of EIS in accelerating the development of new electrochemical materials and devices.
Submitted 24 May, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
On the redundancy in large material datasets: efficient and robust learning with less data
Authors:
Kangming Li,
Daniel Persaud,
Kamal Choudhary,
Brian DeCost,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
Extensive efforts to gather materials data have largely overlooked potential data redundancy. In this study, we present evidence of a significant degree of redundancy across multiple large datasets for various material properties, by revealing that up to 95 % of data can be safely removed from machine learning training with little impact on in-distribution prediction performance. The redundant data is related to over-represented material types and does not mitigate the severe performance degradation on out-of-distribution samples. In addition, we show that uncertainty-based active learning algorithms can construct much smaller but equally informative datasets. We discuss the effectiveness of informative data in improving prediction performance and robustness and provide insights into efficient data acquisition and machine learning training. This work challenges the "bigger is better" mentality and calls for attention to the information richness of materials data rather than a narrow emphasis on data volume.
Submitted 25 July, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Why is EXAFS analysis for multicomponent metals so hard? Challenges and opportunities for measuring ordering in complex concentrated alloys using x-ray absorption spectroscopy
Authors:
Howie Joress,
Bruce Ravel,
Elaf Anber,
Jonathan Hollenbach,
Debashish Sur,
Jason Hattrick-Simpers,
Mitra L. Taheri,
Brian DeCost
Abstract:
Short range order is a critical driver of properties (e.g. corrosion resistance and tensile strength) in multicomponent alloys such as complex concentrated alloys (CCAs). Extended x-ray absorption fine structure (EXAFS) is a powerful technique well suited for quantifying this short range order. Here, we describe in detail the characteristics of CCAs that make the already challenging task of analyzing EXAFS data even more difficult. We then illustrate novel paths towards robust and scalable quantitative SRO analysis which will accelerate the scientific understanding and development of CCAs.
Submitted 22 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
An experimental high-throughput to high-fidelity study towards discovering Al-Cr containing corrosion-resistant compositionally complex alloys
Authors:
Debashish Sur,
Emily F. Holcombe,
William H. Blades,
Elaf A. Anber,
Daniel L. Foley,
Brian L. DeCost,
Jing Liu,
Jason Hattrick-Simpers,
Karl Sieradzki,
Howie Joress,
John R. Scully,
Mitra L. Taheri
Abstract:
Compositionally complex alloys hold the promise of simultaneously attaining superior combinations of properties, such as corrosion resistance, light-weighting, and strength. Achieving this goal is a challenge due in part to a large number of possible compositions and structures in the vast alloy design space. High-throughput methods offer a path forward, but a strong connection between the synthesis of an alloy of a given composition and structure with its properties has not been fully realized to date. Here, we present the rapid identification of corrosion-resistant alloys based on combinations of Al and Cr in a base Al-Co-Cr-Fe-Ni alloy. Previously unstudied alloy stoichiometries were identified using a combination of high-throughput experimental screening coupled with key metallurgical and electrochemical corrosion tests, identifying alloys with excellent passivation behavior. The alloy native oxide performance and its self-healing attributes were probed using rapid tests in deaerated 0.1 mol/L H2SO4. Importantly, a correlation was found between the electrochemical impedance modulus of the exposure-modified air-formed film and self-healing rate of the CCAs. Multi-element extended x-ray absorption fine structure analyses connected more ordered type chemical short-range order in the Ni-Al 1st nearest-neighbor shell to poorer corrosion resistance. This report underscores the utility of high throughput exploration of compositionally complex alloys for the identification and rapid screening of a vast stoichiometric space.
Submitted 19 March, 2024; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Self-driving Multimodal Studies at User Facilities
Authors:
Phillip M. Maffettone,
Daniel B. Allan,
Stuart I. Campbell,
Matthew R. Carbone,
Thomas A. Caswell,
Brian L. DeCost,
Dmitri Gavrilov,
Marcus D. Hanwell,
Howie Joress,
Joshua Lynch,
Bruce Ravel,
Stuart B. Wilkins,
Jakub Wlodek,
Daniel Olds
Abstract:
Multimodal characterization is commonly required for understanding materials. User facilities possess the infrastructure to perform these measurements, albeit in serial over days to months. In this paper, we describe a unified multimodal measurement of a single sample library at distant instruments, driven by a concert of distributed agents that use analysis from each modality to inform the direction of the other in real time. Powered by the Bluesky project at the National Synchrotron Light Source II, this experiment is a world's first for beamline science, and provides a blueprint for future approaches to multimodal and multifidelity experiments at user facilities.
Submitted 22 January, 2023;
originally announced January 2023.
-
AtomVision: A machine vision library for atomistic images
Authors:
Kamal Choudhary,
Ramya Gurunathan,
Brian DeCost,
Adam Biacchi
Abstract:
Computer vision techniques have immense potential for materials design applications. In this work, we introduce an integrated and general-purpose AtomVision library that can be used to generate and curate scanning tunneling microscopy (STM) and scanning transmission electron microscopy (STEM) datasets and apply machine learning techniques. To demonstrate the applicability of this library, we 1) generate and curate an atomistic image dataset of about 10000 materials, 2) develop and compare convolutional and graph neural network models to classify the Bravais lattices, 3) develop a fully convolutional neural network using the U-Net architecture to pixelwise classify atom vs background, 4) use a generative adversarial network for super-resolution, 5) curate a natural language processing based image dataset using the open-access arXiv dataset, and 6) integrate the computational framework with experimental microscopy tools. The AtomVision library is available at https://github.com/usnistgov/atomvision.
Submitted 5 December, 2022;
originally announced December 2022.
-
A critical examination of robustness and generalizability of machine learning prediction of materials properties
Authors:
Kangming Li,
Brian DeCost,
Kamal Choudhary,
Michael Greenwood,
Jason Hattrick-Simpers
Abstract:
Recent advances in machine learning (ML) methods have led to substantial improvement in materials property prediction against community benchmarks, but an excellent benchmark score may not imply good generalization of performance. Here we show that ML models trained on the Materials Project 2018 (MP18) dataset can have severely degraded prediction performance on new compounds in the Materials Project 2021 (MP21) dataset. We document performance degradation in graph neural networks and traditional descriptor-based ML models for both quantitative and qualitative predictions. We find that the predictive degradation is due to the distribution shift between the MP18 and MP21 versions. This is revealed by the uniform manifold approximation and projection (UMAP) of the feature space. We then show that the performance degradation issue can be foreseen using a few simple tools. Firstly, the UMAP can be used to investigate the connectivity and relative proximity of the training and test data within feature space. Secondly, the disagreement between multiple ML models on the test data can illuminate out-of-distribution samples. We demonstrate that the simple yet efficient UMAP-guided and query-by-committee acquisition strategies can greatly improve prediction accuracy through adding only 1 % of the test data. We believe this work provides valuable insights for building materials databases and ML models that enable better prediction robustness and generalizability.
△ Less
Submitted 24 October, 2022;
originally announced October 2022.
-
Unified Graph Neural Network Force-field for the Periodic Table
Authors:
Kamal Choudhary,
Brian DeCost,
Lily Major,
Keith Butler,
Jeyan Thiyagalingam,
Francesca Tavazza
Abstract:
Classical force fields (FF) based on machine learning (ML) methods show great potential for large scale simulations of materials. MLFFs have hitherto largely been designed and fitted for specific systems and are not usually transferable to chemistries beyond the specific training set. We develop a unified atomistic line graph neural network-based FF (ALIGNN-FF) that can model both structurally and chemically diverse materials with any combination of 89 elements from the periodic table. To train the ALIGNN-FF model, we use the JARVIS-DFT dataset which contains around 75000 materials and 4 million energy-force entries, out of which 307113 are used in the training. We demonstrate the applicability of this method for fast optimization of atomic structures in the Crystallography Open Database and by predicting accurate crystal structures using a genetic algorithm for alloys.
Submitted 16 September, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Reproducible Sorbent Materials Foundry for Carbon Capture at Scale
Authors:
Austin McDannald,
Howie Joress,
Brian DeCost,
Avery E. Baumann,
A. Gilad Kusne,
Kamal Choudhary,
Taner Yildirim,
Daniel W. Siderius,
Winnie Wong-Ng,
Andrew J. Allen,
Christopher M. Stafford,
Diana Ortiz-Montalvo
Abstract:
We envision an autonomous sorbent materials foundry (SMF) for rapidly evaluating materials for direct air capture of carbon dioxide (CO2), specifically targeting novel metal organic framework materials. Our proposed SMF is hierarchical, simultaneously addressing the most critical gaps in the inter-related space of sorbent material synthesis, processing, properties, and performance. The ability to collect these critical data streams in an agile, coordinated, and automated fashion will enable efficient end-to-end sorbent materials design through a machine-learning-driven research framework.
Submitted 25 July, 2022;
originally announced July 2022.
-
Development of an automated millifluidic platform and data-analysis pipeline for rapid electrochemical corrosion measurements: a pH study on Zn-Ni
Authors:
Howie Joress,
Brian DeCost,
Najlaa Hassan,
Trevor M. Braun,
Justin M. Gorham,
Jason Hattrick-Simpers
Abstract:
We describe the development of a millifluidic-based scanning droplet cell platform for rapid and automated corrosion measurements. This system allows for measurement of corrosion properties (e.g., open circuit potential, corrosion current through Tafel and linear polarization resistance measurements, and cyclic voltammograms) on a localized section of a planar sample. Our system is highly automated and flexible, allowing for scripted changing and mixing of solutions and point-to-point motion on the sample. We have also created an automated data analysis pipeline. Here we demonstrate this tool by corroding a plate of electroplated Zn$_{85}$Ni$_{15}$ alloy over a range of pH values and correlate our results with XPS measurements and literature.
Submitted 1 April, 2022;
originally announced April 2022.
-
Towards automated design of corrosion resistant alloy coatings with an autonomous scanning droplet cell
Authors:
Brian DeCost,
Howie Joress,
Suchismita Sarker,
Apurva Mehta,
Jason Hattrick-Simpers
Abstract:
We present an autonomous scanning droplet cell platform designed for on-demand alloy electrodeposition and real-time electrochemical characterization for investigating the corrosion-resistance properties of multicomponent alloys. Automation and machine learning are currently driving rapid innovation in high throughput and autonomous materials design and discovery. We present two alloy design case studies: one focusing on a multi-objective corrosion resistant alloy optimization, and a case study highlighting the complexity of the multimodal characterization needed to provide insight into the underlying structural and chemical factors that drive observed material behavior. This motivates a close coupling between autonomous research platforms and scientific machine learning methodology that blends mechanistic physical models and black box machine learning models. This emerging research area presents new opportunities to accelerate materials synthesis, evaluation, and hence discovery and design.
Submitted 31 March, 2022;
originally announced March 2022.
-
Reflections on the future of machine learning for materials research
Authors:
Naohiro Fujinuma,
Brian L. DeCost,
Jason Hattrick-Simpers,
Samuel E. Lofland
Abstract:
Applied machine learning (ML) has rapidly spread throughout the physical sciences; in fact, ML-based data analysis and experimental decision-making has become commonplace. We suggest a shift in the conversation from proving that ML can be used to evaluating how to equitably and effectively implement ML for science. We advocate a shift from a "more data, more compute" mentality to a model-oriented approach that prioritizes using machine learning to support the ecosystem of computational models and experimental measurements. We also recommend an open conversation about dataset bias to stabilize productive research through careful model interrogation and deliberate exploitation of known biases. Further, we encourage the community to develop ML methods that connect experiments with theoretical models to increase scientific understanding rather than incrementally optimizing materials. Moreover, we envision a future of radical materials innovations enabled by computational creativity tools combined with online visualization and analysis tools that support active outside-the-box thinking inside the scientific knowledge feedback loop. Finally, as a community we must acknowledge ethical issues that can arise from blindly following machine learning predictions and the issues of social equity that will arise if data, code, and computational resources are not readily available to all.
Submitted 17 December, 2021;
originally announced December 2021.
-
Physics in the Machine: Integrating Physical Knowledge in Autonomous Phase-Mapping
Authors:
A. Gilad Kusne,
Austin McDannald,
Brian DeCost,
Corey Oses,
Cormac Toher,
Stefano Curtarolo,
Apurva Mehta,
Ichiro Takeuchi
Abstract:
Application of artificial intelligence (AI), and more specifically machine learning, to the physical sciences has expanded significantly over the past decades. In particular, science-informed AI, also known as scientific AI or inductive bias AI, has grown from a focus on data analysis to now controlling experiment design, simulation, execution and analysis in closed-loop autonomous systems. The CAMEO (closed-loop autonomous materials exploration and optimization) algorithm employs scientific AI to address two tasks: learning a material system's composition-structure relationship and identifying materials compositions with optimal functional properties. By integrating these, accelerated materials screening across compositional phase diagrams was demonstrated, resulting in the discovery of a best-in-class phase change memory material. Key to this success is the ability to guide subsequent measurements to maximize knowledge of the composition-structure relationship, or phase map. In this work we investigate the benefits of incorporating varying levels of prior physical knowledge into CAMEO's autonomous phase-mapping. This includes the use of ab-initio phase boundary data from the AFLOW repositories, which has been shown to optimize CAMEO's search when used as a prior.
Submitted 16 February, 2022; v1 submitted 14 November, 2021;
originally announced November 2021.
-
Recent Advances and Applications of Deep Learning Methods in Materials Science
Authors:
Kamal Choudhary,
Brian DeCost,
Chi Chen,
Anubhav Jain,
Francesca Tavazza,
Ryan Cohn,
Cheol Woo Park,
Alok Choudhary,
Ankit Agrawal,
Simon J. L. Billinge,
Elizabeth Holm,
Shyue Ping Ong,
Chris Wolverton
Abstract:
Deep learning (DL) is one of the fastest growing topics in materials data science, with rapidly emerging applications spanning atomistic, image-based, spectral, and textual data modalities. DL allows analysis of unstructured data and automated identification of features. Recent development of large materials databases has fueled the application of DL methods in atomistic prediction in particular. In contrast, advances in image and spectral data have largely leveraged synthetic data enabled by high quality forward models as well as by generative unsupervised DL methods. In this article, we present a high-level overview of deep-learning methods followed by a detailed discussion of recent developments of deep learning in atomistic simulation, materials imaging, spectral analysis, and natural language processing. For each modality we discuss applications involving both theoretical and experimental data, typical modeling approaches with their strengths and limitations, and relevant publicly available software and datasets. We conclude the review with a discussion of recent cross-cutting work related to uncertainty quantification in this field and a brief perspective on limitations, challenges, and potential growth areas for DL methods in materials science. The application of DL methods in materials science presents an exciting avenue for future materials discovery and design.
Submitted 27 October, 2021;
originally announced October 2021.
-
Atomistic Line Graph Neural Network for Improved Materials Property Predictions
Authors:
Kamal Choudhary,
Brian DeCost
Abstract:
Graph neural networks (GNN) have been shown to provide substantial performance improvements for atomistic material representation and modeling compared with descriptor-based machine learning models. While most existing GNN models for atomistic predictions are based on atomic distance information, they do not explicitly incorporate bond angles, which are critical for distinguishing many atomic structures. Furthermore, many material properties are known to be sensitive to slight changes in bond angles. We present an Atomistic Line Graph Neural Network (ALIGNN), a GNN architecture that performs message passing on both the interatomic bond graph and its line graph corresponding to bond angles. We demonstrate that angle information can be explicitly and efficiently included, leading to improved performance on multiple atomistic prediction tasks. We train ALIGNN models for predicting 52 solid-state and molecular properties available in the JARVIS-DFT, Materials Project, and QM9 databases. ALIGNN can outperform some previously reported GNN models on atomistic prediction tasks by up to 85% in accuracy with better or comparable model training speed.
Submitted 6 April, 2022; v1 submitted 3 June, 2021;
originally announced June 2021.
-
The Joint Automated Repository for Various Integrated Simulations (JARVIS) for data-driven materials design
Authors:
Kamal Choudhary,
Kevin F. Garrity,
Andrew C. E. Reid,
Brian DeCost,
Adam J. Biacchi,
Angela R. Hight Walker,
Zachary Trautt,
Jason Hattrick-Simpers,
A. Gilad Kusne,
Andrea Centrone,
Albert Davydov,
Jie Jiang,
Ruth Pachter,
Gowoon Cheon,
Evan Reed,
Ankit Agrawal,
Xiaofeng Qian,
Vinit Sharma,
Houlong Zhuang,
Sergei V. Kalinin,
Bobby G. Sumpter,
Ghanshyam Pilania,
Pinar Acar,
Subhasish Mandal,
Kristjan Haule
, et al. (3 additional authors not shown)
Abstract:
The Joint Automated Repository for Various Integrated Simulations (JARVIS) is an integrated infrastructure to accelerate materials discovery and design using density functional theory (DFT), classical force-fields (FF), and machine learning (ML) techniques. JARVIS is motivated by the Materials Genome Initiative (MGI) principles of developing open-access databases and tools to reduce the cost and development time of materials discovery, optimization, and deployment. The major features of JARVIS are: JARVIS-DFT, JARVIS-FF, JARVIS-ML, and JARVIS-Tools. To date, JARVIS consists of 40,000 materials and 1 million calculated properties in JARVIS-DFT, 1,500 materials and 110 force-fields in JARVIS-FF, and 25 ML models for material-property predictions in JARVIS-ML, all of which are continuously expanding. JARVIS-Tools provides scripts and workflows for running and analyzing various simulations. We compare our computational data to experiments or high-fidelity computational methods wherever applicable to evaluate error/uncertainty in predictions. In addition to the existing workflows, the infrastructure can support a wide variety of other technologically important applications as part of the data-driven materials design paradigm. The JARVIS datasets and tools are publicly available at the website: https://jarvis.nist.gov .
Submitted 11 July, 2021; v1 submitted 3 July, 2020;
originally announced July 2020.
-
On-the-fly Closed-loop Autonomous Materials Discovery via Bayesian Active Learning
Authors:
A. Gilad Kusne,
Heshan Yu,
Changming Wu,
Huairuo Zhang,
Jason Hattrick-Simpers,
Brian DeCost,
Suchismita Sarker,
Corey Oses,
Cormac Toher,
Stefano Curtarolo,
Albert V. Davydov,
Ritesh Agarwal,
Leonid A. Bendersky,
Mo Li,
Apurva Mehta,
Ichiro Takeuchi
Abstract:
Active learning, the field of machine learning (ML) dedicated to optimal experiment design, has played a part in science as far back as the 18th century when Laplace used it to guide his discovery of celestial mechanics [1]. In this work we focus a closed-loop, active learning-driven autonomous system on another major challenge, the discovery of advanced materials against the exceedingly complex synthesis-processes-structure-property landscape. We demonstrate autonomous research methodology (i.e. autonomous hypothesis definition and evaluation) that can place complex, advanced materials in reach, allowing scientists to fail smarter, learn faster, and spend fewer resources in their studies, while simultaneously improving trust in scientific results and machine learning tools. Additionally, this robot science enables science-over-the-network, reducing the economic impact of scientists being physically separated from their labs. We used the real-time closed-loop, autonomous system for materials exploration and optimization (CAMEO) at the synchrotron beamline to accelerate the fundamentally interconnected tasks of rapid phase mapping and property optimization, with each cycle taking seconds to minutes, resulting in the discovery of a novel epitaxial nanocomposite phase-change memory material.
Submitted 10 November, 2020; v1 submitted 10 June, 2020;
originally announced June 2020.
-
Scientific AI in materials science: a path to a sustainable and scalable paradigm
Authors:
Brian DeCost,
Jason Hattrick-Simpers,
Zachary Trautt,
Aaron Kusne,
Eva Campo,
Martin Green
Abstract:
Recently there has been an ever-increasing trend in the use of machine learning (ML) and artificial intelligence (AI) methods by the materials science, condensed matter physics, and chemistry communities. This perspective article identifies key scientific, technical, and social opportunities that the materials community must prioritize to consistently develop and leverage Scientific AI to provide a credible path towards the advancement of current materials-limited technologies. Here we highlight the intersections of these opportunities with a series of proposed paths forward. The opportunities are roughly sorted from scientific/technical (e.g., development of robust, physically meaningful multiscale material representations) to social (e.g., promoting an AI-ready workforce). The proposed paths forward range from developing new infrastructure and capabilities to deploying them in industry and academia. We provide a brief introduction to AI in materials science and engineering, followed by detailed discussions of each of the opportunities and paths forward.
Submitted 18 March, 2020;
originally announced March 2020.
-
A high-throughput structural and electrochemical study of metallic glass formation in Ni-Ti-Al
Authors:
Howie Joress,
Brian L. DeCost,
Suchismita Sarker,
Trevor M. Braun,
Sidra Jilani,
Ryan Smith,
Logan Ward,
Kevin J. Laws,
Apurva Mehta,
Jason Hattrick-Simpers
Abstract:
Based on a set of machine learning predictions of glass formation in the Ni-Ti-Al system, we have undertaken a high-throughput experimental study of that system. We utilized rapid synthesis followed by high-throughput structural and electrochemical characterization. Using this dual-modality approach, we are able to better classify the amorphous portion of the library, which we found to be the portion with a full-width-half-maximum (FWHM) of 0.42 A$^{-1}$ for the first sharp x-ray diffraction peak. We demonstrate that the FWHM and corrosion resistance are correlated but that, while chemistry still plays a role, a large FWHM is necessary for the best corrosion resistance.
Submitted 19 December, 2019;
originally announced December 2019.
-
Accelerating Photovoltaic Materials Development via High-Throughput Experiments and Machine-Learning-Assisted Diagnosis
Authors:
Shijing Sun,
Noor T. P. Hartono,
Zekun D. Ren,
Felipe Oviedo,
Antonio M. Buscemi,
Mariya Layurova,
De Xin Chen,
Tofunmi Ogunfunmi,
Janak Thapa,
Savitha Ramasamy,
Charles Settens,
Brian L. DeCost,
Aaron Gilad Kusne,
Zhe Liu,
Siyu I. P. Tian,
I. Marius Peters,
Juan-Pablo Correa-Baena,
Tonio Buonassisi
Abstract:
Accelerating the experimental cycle for new materials development is vital for addressing the grand energy challenges of the 21st century. We fabricate and characterize 75 unique halide perovskite-inspired solution-based thin-film materials within a two-month period, with 87% exhibiting band gaps between 1.2 eV and 2.4 eV that are of interest for energy-harvesting applications. This increased throughput is enabled by streamlining experimental workflows, developing a set of precursors amenable to high-throughput synthesis, and developing machine-learning assisted diagnosis. We utilize a deep neural network to classify compounds based on experimental X-ray diffraction data into 0D, 2D, and 3D structures more than 10 times faster than human analysis and with 90% accuracy. We validate our methods using lead-halide perovskites and extend the application to novel lead-free compositions. The wider synthesis window and faster cycle of learning enable three noteworthy scientific findings: (1) we realize four inorganic layered perovskites, A3B2Br9 (A = Cs, Rb; B = Bi, Sb) in thin-film form via one-step liquid deposition; (2) we report a multi-site lead-free alloy series that was not previously described in literature, Cs3(Bi1-xSbx)2(I1-xBrx)9; and (3) we reveal the effect on bandgap (reduction to <2 eV) and structure upon simultaneous alloying on the B-site and X-site of Cs3Bi2I9 with Sb and Br. This study demonstrates that combining an accelerated experimental cycle of learning and machine-learning based diagnosis represents an important step toward realizing fully-automated laboratories for materials discovery and development.
Submitted 25 November, 2018;
originally announced December 2018.
-
Fast and interpretable classification of small X-ray diffraction datasets using data augmentation and deep neural networks
Authors:
Felipe Oviedo,
Zekun Ren,
Shijing Sun,
Charlie Settens,
Zhe Liu,
Noor Titan Putri Hartono,
Ramasamy Savitha,
Brian L. DeCost,
Siyu I. P. Tian,
Giuseppe Romano,
Aaron Gilad Kusne,
Tonio Buonassisi
Abstract:
X-ray diffraction (XRD) data acquisition and analysis is among the most time-consuming steps in the development cycle of novel thin-film materials. We propose a machine-learning-enabled approach to predict crystallographic dimensionality and space group from a limited number of thin-film XRD patterns. We overcome the scarce-data problem intrinsic to novel materials development by coupling a supervised machine learning approach with a model agnostic, physics-informed data augmentation strategy using simulated data from the Inorganic Crystal Structure Database (ICSD) and experimental data. As a test case, 115 thin-film metal halides spanning 3 dimensionalities and 7 space-groups are synthesized and classified. After testing various algorithms, we develop and implement an all convolutional neural network, with cross validated accuracies for dimensionality and space-group classification of 93% and 89%, respectively. We propose average class activation maps, computed from a global average pooling layer, to allow high model interpretability by human experimentalists, elucidating the root causes of misclassification. Finally, we systematically evaluate the maximum XRD pattern step size (data acquisition rate) before loss of predictive accuracy occurs, and determine it to be 0.16°, which enables an XRD pattern to be obtained and classified in 5.5 minutes or less.
Submitted 23 April, 2019; v1 submitted 20 November, 2018;
originally announced November 2018.
-
Machine learning with force-field inspired descriptors for materials: fast screening and mapping energy landscape
Authors:
Kamal Choudhary,
Brian DeCost,
Francesca Tavazza
Abstract:
We present a complete set of chemo-structural descriptors to significantly extend the applicability of machine-learning (ML) in material screening and mapping energy landscape for multicomponent systems. These new descriptors allow differentiating between structural prototypes, which is not possible using the commonly used chemical-only descriptors. Specifically, we demonstrate that the combination of pairwise radial, nearest neighbor, bond-angle, dihedral-angle and core-charge distributions plays an important role in predicting formation energies, bandgaps, static refractive indices, magnetic properties, and modulus of elasticity for three-dimensional (3D) materials as well as exfoliation energies of two-dimensional (2D) layered materials. The training data consists of 24549 bulk and 616 monolayer materials taken from the JARVIS-DFT database. We obtained very accurate ML models using a gradient boosting algorithm. Then we use the trained models to discover exfoliable 2D-layered materials satisfying specific property requirements. Additionally, we integrate our formation energy ML model with a genetic algorithm for structure search to verify if the ML model reproduces the DFT convex hull. This verification establishes a more stringent evaluation metric for the ML model than what is commonly used in data sciences. Our learnt model is publicly available on the JARVIS-ML website (https://www.ctcms.nist.gov/jarvisml) for property predictions of generalized materials.
Submitted 19 July, 2018; v1 submitted 18 May, 2018;
originally announced May 2018.
-
Building Data-driven Models with Microstructural Images: Generalization and Interpretability
Authors:
Julia Ling,
Maxwell Hutchinson,
Erin Antono,
Brian DeCost,
Elizabeth A. Holm,
Bryce Meredig
Abstract:
As data-driven methods rise in popularity in materials science applications, a key question is how these machine learning models can be used to understand microstructure. Given the importance of process-structure-property relations throughout materials science, it seems logical that models that can leverage microstructural data would be more capable of predicting property information. While there have been some recent attempts to use convolutional neural networks to understand microstructural images, these early studies have focused only on which featurizations yield the highest machine learning model accuracy for a single data set. This paper explores the use of convolutional neural networks for classifying microstructure with a more holistic set of objectives in mind: generalization between data sets, number of features required, and interpretability.
Submitted 1 November, 2017;
originally announced November 2017.
-
Exploring the microstructure manifold: image texture representations applied to ultrahigh carbon steel microstructures
Authors:
Brian L. DeCost,
Toby Francis,
Elizabeth A. Holm
Abstract:
We introduce a microstructure informatics dataset focusing on complex, hierarchical structures found in a single Ultrahigh carbon steel under a range of heat treatments. Applying image representations from contemporary computer vision research to these microstructures, we discuss how both supervised and unsupervised machine learning techniques can be used to yield insight into microstructural trends and their relationship to processing conditions. We evaluate and compare keypoint-based and convolutional neural network representations by classifying microstructures according to their primary microconstituent, and by classifying a subset of the microstructures according to the annealing conditions that generated them. Using t-SNE, a nonlinear dimensionality reduction and visualization technique, we demonstrate graphical methods of exploring microstructure and processing datasets, and for understanding and interpreting high-dimensional microstructure representations.
Submitted 9 February, 2017; v1 submitted 3 February, 2017;
originally announced February 2017.