LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction

Rubungo, Andre Niyongabo; Li, Kangming; Hattrick-Simpers, Jason; Dieng, Adji Bousso

Condensed Matter > Materials Science

arXiv:2411.00177 (cond-mat)

[Submitted on 31 Oct 2024 (v1), last revised 30 Nov 2024 (this version, v3)]

Abstract: Large language models (LLMs) are increasingly being used in materials science. However, little attention has been given to benchmarking and standardized evaluation for LLM-based materials property prediction, which hinders progress. We present LLM4Mat-Bench, the largest benchmark to date for evaluating the performance of LLMs in predicting the properties of crystalline materials. LLM4Mat-Bench contains about 1.9M crystal structures in total, collected from 10 publicly available materials data sources, and 45 distinct properties. LLM4Mat-Bench features different input modalities: crystal composition, CIF, and crystal text description, with 4.7M, 615.5M, and 3.1B tokens in total for each modality, respectively. We use LLM4Mat-Bench to fine-tune models with different sizes, including LLM-Prop and MatBERT, and provide zero-shot and few-shot prompts to evaluate the property prediction capabilities of LLM-chat-like models, including Llama, Gemma, and Mistral. The results highlight the challenges of general-purpose LLMs in materials science and the need for task-specific predictive models and task-specific instruction-tuned LLMs in materials property prediction.
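
To give a feel for how different the three input modalities are in length, the totals quoted in the abstract can be divided by the approximate structure count. This is a rough sketch only: it assumes every one of the roughly 1.9M structures appears once in each modality, the function and variable names are illustrative, and the abstract does not say which tokenizer produced the counts.

```python
# Totals quoted in the abstract; the structure count is approximate ("about 1.9M").
NUM_STRUCTURES = 1.9e6
TOKENS_PER_MODALITY = {
    "composition": 4.7e6,
    "cif": 615.5e6,
    "description": 3.1e9,
}


def mean_tokens_per_structure(total_tokens: float,
                              num_structures: float = NUM_STRUCTURES) -> float:
    """Rough average input length per structure.

    Assumption (not stated in the abstract): each structure is counted
    exactly once in the modality's token total.
    """
    return total_tokens / num_structures


averages = {
    name: mean_tokens_per_structure(total)
    for name, total in TOKENS_PER_MODALITY.items()
}

for name, avg in averages.items():
    print(f"{name:>12}: ~{avg:,.1f} tokens per structure")
```

Under these assumptions the averages come out to roughly 2.5 tokens for a composition, about 324 for a CIF file, and about 1,632 for a text description, so the description modality is several hundred times longer per sample than the composition modality.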

Comments:	Accepted at NeurIPS 2024-AI4Mat Workshop. The Benchmark and code can be found at this https URL
Subjects:	Materials Science (cond-mat.mtrl-sci); Computation and Language (cs.CL)
Cite as:	arXiv:2411.00177 [cond-mat.mtrl-sci]
	(or arXiv:2411.00177v3 [cond-mat.mtrl-sci] for this version)
	https://doi.org/10.48550/arXiv.2411.00177

Submission history

From: Andre Niyongabo Rubungo
[v1] Thu, 31 Oct 2024 19:48:12 UTC (311 KB)
[v2] Fri, 8 Nov 2024 16:42:18 UTC (316 KB)
[v3] Sat, 30 Nov 2024 14:01:56 UTC (323 KB)
