-
Perspective on Utilizing Foundation Models for Laboratory Automation in Materials Research
Authors:
Kan Hatakeyama-Sato,
Toshihiko Nishida,
Kenta Kitamura,
Yoshitaka Ushiku,
Koichi Takahashi,
Yuta Nabae,
Teruaki Hayakawa
Abstract:
This review explores the potential of foundation models to advance laboratory automation in the materials and chemical sciences. It emphasizes the dual roles of these models: cognitive functions for experimental planning and data analysis, and physical functions for hardware operations. While traditional laboratory automation has relied heavily on specialized, rigid systems, foundation models offer adaptability through their general-purpose intelligence and multimodal capabilities. Recent advancements have demonstrated the feasibility of using large language models (LLMs) and multimodal robotic systems to handle complex and dynamic laboratory tasks. However, significant challenges remain, including precision manipulation of hardware, integration of multimodal data, and ensuring operational safety. This paper outlines a roadmap highlighting future directions, advocating for close interdisciplinary collaboration, benchmark establishment, and strategic human-AI integration to realize fully autonomous experimental laboratories.
Submitted 13 June, 2025;
originally announced June 2025.
-
SPACIER: On-Demand Polymer Design with Fully Automated All-Atom Classical Molecular Dynamics Integrated into Machine Learning Pipelines
Authors:
Shun Nanjo,
Arifin,
Hayato Maeda,
Yoshihiro Hayashi,
Kan Hatakeyama-Sato,
Ryoji Himeno,
Teruaki Hayakawa,
Ryo Yoshida
Abstract:
Machine learning has rapidly advanced the design and discovery of new materials with targeted applications in various systems. First-principles calculations and other computer experiments have been integrated into material design pipelines to address the lack of experimental data and the limitations of interpolative machine learning predictors. However, the enormous computational costs and technical challenges of automating computer experiments for polymeric materials have limited the availability of open-source automated polymer design systems that integrate molecular simulations and machine learning. To overcome these challenges, we developed SPACIER, an open-source software program that integrates RadonPy, a Python library for fully automated polymer property calculations based on all-atom classical molecular dynamics, into a Bayesian optimization-based polymer design system. As a proof-of-concept study, we successfully synthesized optical polymers that surpass the Pareto boundary formed by the tradeoff between the refractive index and Abbe number.
Submitted 9 August, 2024;
originally announced August 2024.
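The SPACIER abstract above refers to polymers that surpass the Pareto boundary between refractive index and Abbe number. The following is a minimal sketch of what that test means for two objectives that are both maximized. It is not SPACIER's or RadonPy's code: the function names and the numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (refractive_index, abbe_number); both objectives are treated as "higher is better".
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, in their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def surpasses_front(candidate: Point, front: List[Point]) -> bool:
    """True if no point on the existing front equals or dominates the candidate."""
    return not any(q == candidate or dominates(q, candidate) for q in front)


# Hypothetical (made-up) property values for known polymers.
known = [(1.50, 55.0), (1.60, 30.0), (1.55, 40.0), (1.52, 35.0), (1.58, 28.0)]
front = pareto_front(known)

# A hypothetical new candidate that lies beyond the existing tradeoff curve.
print(surpasses_front((1.58, 38.0), front))
```

A candidate that passes this check extends the front: adding it to the known set would place it among the non-dominated points.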
-
A Generative Model for Extrapolation Prediction in Materials Informatics
Authors:
Kan Hatakeyama-Sato,
Kenichi Oyaizu
Abstract:
We report a deep generative model for regression tasks in materials informatics. The model is introduced as a component of a data imputer, and predicts more than 20 diverse experimental properties of organic molecules. The imputer is designed to predict material properties by "imagining" the missing data in the database, enabling the use of incomplete material data. Even removing 60% of the data does not diminish the prediction accuracy in a model task. Moreover, the model excels at extrapolation prediction, where target values of the test data are out of the range of the training data. Such extrapolation has been regarded as an essential technique for exploring novel materials, but has hardly been studied to date due to its difficulty. We demonstrate that the prediction performance can be improved by >30% by using the imputer compared with traditional linear regression and boosting models. The benefit becomes especially pronounced with few records for an experimental property (< 100 cases) when prediction would be difficult by conventional methods. The presented approach can be used to more efficiently explore functional materials and break through previous performance limits.
Submitted 27 February, 2021;
originally announced March 2021.
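The abstract above defines extrapolation prediction as the case where the target values of the test data lie outside the range of the training data. One simple way to build such an evaluation split is to sort records by target value and hold out the top fraction. This is a sketch under that assumption, not the authors' protocol; the function name `extrapolation_split` and the record layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# A record is (features, target); the layout is an assumption for this sketch.
Record = Tuple[Sequence[float], float]


def extrapolation_split(
    records: List[Record], test_fraction: float = 0.2
) -> Tuple[List[Record], List[Record]]:
    """Hold out the records with the highest targets as the test set.

    Every test target is then at least as large as every training target,
    so a model must predict beyond the range it was trained on.
    """
    if len(records) < 2:
        raise ValueError("need at least two records to split")
    ordered = sorted(records, key=lambda r: r[1])
    # Keep at least one record on each side of the split.
    n_test = min(max(1, int(len(ordered) * test_fraction)), len(ordered) - 1)
    return ordered[:-n_test], ordered[-n_test:]


# Toy data: one feature per record, target equal to the feature.
data = [((float(i),), float(i)) for i in range(10)]
train, test = extrapolation_split(data, test_fraction=0.2)
print(max(t for _, t in train), min(t for _, t in test))
```

If several records share the target value at the split boundary, the test minimum can equal the training maximum; deduplicating or thresholding on the target value avoids that.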
-
Tackling the challenge of a huge materials science search space with quantum-inspired annealing
Authors:
Kan Hatakeyama-Sato,
Takahiro Kashikawa,
Koichi Kimura,
Kenichi Oyaizu
Abstract:
Efficient screening of chemicals is essential for exploring new materials. However, the search space is astronomically large, making calculations with conventional computers infeasible. For example, an $N$-component system of organic molecules generates >$10^{60N}$ candidates. Here, a quantum-inspired annealing machine is used to tackle the challenge of the large search space. The prototype system extracts candidate chemicals and their composites with desirable parameters, such as melting temperature and ionic conductivity. The system can be at least $10^4$-$10^7$ times faster than conventional approaches. Such exponential acceleration is critical for exploring the enormous search space in virtual screening.
Submitted 7 August, 2020;
originally announced August 2020.