Showing 1–5 of 5 results for author: Polak, M P

  1. arXiv:2503.12326  [pdf, other]

    cs.CV cond-mat.mtrl-sci cs.AI

    Leveraging Vision Capabilities of Multimodal LLMs for Automated Data Extraction from Plots

    Authors: Maciej P. Polak, Dane Morgan

    Abstract: Automated data extraction from research texts has been steadily improving, with the emergence of large language models (LLMs) accelerating progress even further. Extracting data from plots in research papers, however, has been such a complex task that it has predominantly been confined to manual data extraction. We show that current multimodal large language models, with proper instructions and en…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: 8 pages, 3 figures

  2. arXiv:2409.06756  [pdf]

    cs.LG cond-mat.mtrl-sci cs.AI

    Beyond designer's knowledge: Generating materials design hypotheses via large language models

    Authors: Quanliang Liu, Maciej P. Polak, So Yeon Kim, MD Al Amin Shuvo, Hrishikesh Shridhar Deodhar, Jeongsoo Han, Dane Morgan, Hyunseok Oh

    Abstract: Materials design often relies on human-generated hypotheses, a process inherently limited by cognitive constraints such as knowledge gaps and limited ability to integrate and extract knowledge implications, particularly when multidisciplinary expertise is required. This work demonstrates that large language models (LLMs), coupled with prompt engineering, can effectively generate non-trivial materi…

    Submitted 10 September, 2024; originally announced September 2024.

  3. arXiv:2409.06080  [pdf]

    cond-mat.mtrl-sci cs.LG

    Regression with Large Language Models for Materials and Molecular Property Prediction

    Authors: Ryan Jacobs, Maciej P. Polak, Lane E. Schultz, Hamed Mahdavi, Vasant Honavar, Dane Morgan

    Abstract: We demonstrate the ability of large language models (LLMs) to perform material and molecular property regression tasks, a significant deviation from the conventional LLM use case. We benchmark the Large Language Model Meta AI (LLaMA) 3 on several molecular properties in the QM9 dataset and 24 materials properties. Only composition-based input strings are used as the model input and we fine tune on…

    Submitted 9 September, 2024; originally announced September 2024.

  4. arXiv:2303.05352  [pdf, other]

    cs.CL cond-mat.mtrl-sci

    Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering

    Authors: Maciej P. Polak, Dane Morgan

    Abstract: There has been a growing effort to replace manual extraction of data from research papers with automated data extraction based on natural language processing, language models, and recently, large language models (LLMs). Although these methods enable efficient extraction of data from large sets of research papers, they require a significant amount of up-front effort, expertise, and coding. In this…

    Submitted 21 February, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: 7 pages, 3 figures, 1 table

    Journal ref: Nature Communications (2024) 15:1569

  5. arXiv:2302.04914  [pdf, other]

    cond-mat.mtrl-sci cs.AI cs.CL

    Flexible, Model-Agnostic Method for Materials Data Extraction from Text Using General Purpose Language Models

    Authors: Maciej P. Polak, Shrey Modi, Anna Latosinska, Jinming Zhang, Ching-Wen Wang, Shaonan Wang, Ayan Deep Hazra, Dane Morgan

    Abstract: Accurate and comprehensive material databases extracted from research papers are crucial for materials science and engineering, but their development requires significant human effort. With large language models (LLMs) transforming the way humans interact with text, LLMs provide an opportunity to revolutionize data extraction. In this study, we demonstrate a simple and efficient method for extract…

    Submitted 12 June, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: 13 pages, 4 figures

    Journal ref: Digital Discovery, 2024, 3, 1221-1235