Showing 1–15 of 15 results for author: Baker, N

  1. arXiv:2508.11860

    cs.AI cs.CL

    LARC: Towards Human-level Constrained Retrosynthesis Planning through an Agentic Framework

    Authors: Frazier N. Baker, Daniel Adu-Ampratwum, Reza Averly, Botao Yu, Huan Sun, Xia Ning

    Abstract: Large language model (LLM) agent evaluators leverage specialized tools to ground the rational decision-making of LLMs, making them well-suited to aid in scientific discoveries, such as constrained retrosynthesis planning. Constrained retrosynthesis planning is an essential, yet challenging, process within chemistry for identifying synthetic routes from commercially available starting materials to…

    Submitted 15 August, 2025; originally announced August 2025.

    Comments: 24 pages, 5 figures

  2. arXiv:2502.13959

    cs.CL

    LIDDIA: Language-based Intelligent Drug Discovery Agent

    Authors: Reza Averly, Frazier N. Baker, Ian A. Watson, Xia Ning

    Abstract: Drug discovery is a long, expensive, and complex process, relying heavily on human medicinal chemists, who can spend years searching the vast space of potential therapies. Recent advances in artificial intelligence for chemistry have sought to expedite individual drug discovery tasks; however, there remains a critical need for an intelligent agent that can navigate the drug discovery process. Towa…

    Submitted 16 August, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: Preprint

  3. Robustness tests for biomedical foundation models should tailor to specifications

    Authors: R. Patrick Xian, Noah R. Baker, Tom David, Qiming Cui, A. Jay Holmgren, Stefan Bauer, Madhumita Sushil, Reza Abbasi-Asl

    Abstract: The rise of biomedical foundation models creates new hurdles in model testing and authorization, given their broad capabilities and susceptibility to complex distribution shifts. We suggest tailoring robustness tests according to task-dependent priorities and propose to integrate granular notions of robustness in a predefined specification to guide implementation. Our approach facilitates the stan…

    Submitted 14 August, 2025; v1 submitted 14 February, 2025; originally announced February 2025.

    Comments: 17 pages, accepted version with SI, repo at https://github.com/RealPolitiX/bfm-robust

    Journal ref: npj Digital Medicine 8, 557 (2025)

  4. arXiv:2411.07228

    cs.AI cs.CE

    ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving

    Authors: Botao Yu, Frazier N. Baker, Ziru Chen, Garrett Herb, Boyu Gou, Daniel Adu-Ampratwum, Xia Ning, Huan Sun

    Abstract: To enhance large language models (LLMs) for chemistry problem solving, several LLM-based agents augmented with tools have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemToolAgent, an enhanced chemistry agent over ChemCrow, a…

    Submitted 26 May, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

    Comments: Accepted to NAACL 2025 Findings. Previous title: Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving. Based on the camera ready version, this version adds more experimental results. We also release the toolkit in ChemMCP (https://osu-nlp-group.github.io/ChemMCP), which is a continuously updated and MCP-compatible chemistry toolkit

  5. arXiv:2410.05080

    cs.CL cs.AI cs.LG

    ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

    Authors: Ziru Chen, Shijie Chen, Yuting Ning, Qianheng Zhang, Boshi Wang, Botao Yu, Yifei Li, Zeyi Liao, Chen Wei, Zitong Lu, Vishal Dey, Mingyi Xue, Frazier N. Baker, Benjamin Burns, Daniel Adu-Ampratwum, Xuhui Huang, Xia Ning, Song Gao, Yu Su, Huan Sun

    Abstract: The advancements of large language models (LLMs) have piqued growing interest in developing LLM-based language agents to automate scientific discovery end-to-end, which has sparked both excitement and skepticism about their true capabilities. In this work, we call for rigorous assessment of agents on individual tasks in a scientific workflow before making bold claims on end-to-end automation. To t…

    Submitted 31 March, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: ICLR 2025. 60 pages

  6. arXiv:2402.09391

    cs.AI cs.CE cs.CL

    LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

    Authors: Botao Yu, Frazier N. Baker, Ziqi Chen, Xia Ning, Huan Sun

    Abstract: Chemistry plays a crucial role in many domains, such as drug discovery and material science. While large language models (LLMs) such as GPT-4 exhibit remarkable capabilities on natural language processing tasks, existing research indicates that their performance on chemistry tasks is discouragingly low. In this paper, however, we demonstrate that our developed LLMs can achieve very strong results…

    Submitted 10 August, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted by COLM 2024

  7. arXiv:2309.02671

    cs.LG cs.AI

    RLSynC: Offline-Online Reinforcement Learning for Synthon Completion

    Authors: Frazier N. Baker, Ziqi Chen, Daniel Adu-Ampratwum, Xia Ning

    Abstract: Retrosynthesis is the process of determining the set of reactant molecules that can react to form a desired product. Semi-template-based retrosynthesis methods, which imitate the reverse logic of synthesis reactions, first predict the reaction centers in the products, and then complete the resulting synthons back into reactants. We develop a new offline-online reinforcement learning method RLSynC…

    Submitted 29 March, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 32 pages, 5 figures, 4 tables

  8. arXiv:2208.13301

    cs.DC

    ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP

    Authors: Thomas Huber, Swaroop Pophale, Nolan Baker, Michael Carr, Nikhil Rao, Jaydon Reap, Kristina Holsapple, Joshua Hoke Davis, Tobias Burnus, Seyong Lee, David E. Bernholdt, Sunita Chandrasekaran

    Abstract: The OpenMP language continues to evolve with every new specification release, as does the need to validate and verify the new features that have been introduced. With the release of OpenMP 5.0 and OpenMP 5.1, plenty of new target offload and host-based features have been introduced to the programming model. While OpenMP continues to grow in maturity, there is an observable growth in the number of…

    Submitted 14 November, 2022; v1 submitted 28 August, 2022; originally announced August 2022.

  9. arXiv:2201.00692

    cs.IR cs.LG

    Validation and Transparency in AI systems for pharmacovigilance: a case study applied to the medical literature monitoring of adverse events

    Authors: Bruno Ohana, Jack Sullivan, Nicole Baker

    Abstract: Recent advances in artificial intelligence applied to biomedical text are opening exciting opportunities for improving pharmacovigilance activities currently burdened by the ever growing volumes of real world data. To fully realize these opportunities, existing regulatory guidance and industry best practices should be taken into consideration in order to increase the overall trustworthiness of the…

    Submitted 21 December, 2021; originally announced January 2022.

  10. arXiv:1904.01131

    quant-ph cs.ET physics.chem-ph physics.comp-ph

    Q# and NWChem: Tools for Scalable Quantum Chemistry on Quantum Computers

    Authors: Guang Hao Low, Nicholas P. Bauman, Christopher E. Granade, Bo Peng, Nathan Wiebe, Eric J. Bylaska, Dave Wecker, Sriram Krishnamoorthy, Martin Roetteler, Karol Kowalski, Matthias Troyer, Nathan A. Baker

    Abstract: Fault-tolerant quantum computation promises to solve outstanding problems in quantum chemistry within the next decade. Realizing this promise requires scalable tools that allow users to translate descriptions of electronic structure problems to optimized quantum gate sequences executed on physical hardware, without requiring specialized quantum computing knowledge. To this end, we present a quantu…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: 36 pages, 5 figures. Examples and data in ancillary files folder

  11. arXiv:1710.02238

    stat.ML cs.AI cs.CV cs.LG

    How Much Chemistry Does a Deep Neural Network Need to Know to Make Accurate Predictions?

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: The meteoric rise of deep learning models in computer vision research, having achieved human-level accuracy in image recognition tasks is firm evidence of the impact of representation learning of deep neural networks. In the chemistry domain, recent advances have also led to the development of similar CNN models, such as Chemception, that is trained to predict chemical properties using images of m…

    Submitted 18 March, 2018; v1 submitted 5 October, 2017; originally announced October 2017.

    Comments: In Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)

  12. arXiv:1706.06689

    stat.ML cs.AI cs.CE cs.CV cs.LG

    Chemception: A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

    Authors: Garrett B. Goh, Charles Siegel, Abhinav Vishnu, Nathan O. Hodas, Nathan Baker

    Abstract: In the last few years, we have seen the transformative impact of deep learning in many applications, particularly in speech recognition and computer vision. Inspired by Google's Inception-ResNet deep convolutional neural network (CNN) for image classification, we have developed "Chemception", a deep CNN for the prediction of chemical properties, using just the images of 2D drawings of molecules. W…

    Submitted 20 June, 2017; originally announced June 2017.

    Comments: Submitted to a chemistry peer-reviewed journal

  13. arXiv:1606.02711

    cs.HC

    ChinMotion Rapidly Enables 3D Computer Interaction after Tetraplegia

    Authors: Ferran Galán, Stuart N. Baker, Monica A. Perez

    Abstract: Individuals with severe paralysis require hands-free interfaces to control assistive devices that can improve their quality of life. We present ChinMotion, an interface that noninvasively harnesses preserved chin, lip and tongue sensorimotor function after tetraplegia to convey intuitive control commands. After two hours of practice, ChinMotion enables superior point-and-click performance over exi…

    Submitted 8 June, 2016; originally announced June 2016.

    Comments: The .ps file contains main manuscript and supplementary information. The .ps file is accompanied with ancillary files (supplementary files)

  14. arXiv:1606.02596

    physics.comp-ph cs.CE q-bio.QM

    Data-driven parameterization of the generalized Langevin equation

    Authors: Huan Lei, Nathan Baker, Xiantao Li

    Abstract: We present a data-driven approach to determine the memory kernel and random noise in generalized Langevin equations. To facilitate practical implementations, we parameterize the kernel function in the Laplace domain by a rational function, with coefficients directly linked to the equilibrium statistics of the coarse-grain variables. We show that such an approximation can be constructed to arbitrar…

    Submitted 9 June, 2016; v1 submitted 8 June, 2016; originally announced June 2016.

  15. arXiv:1202.5519

    cs.DC

    Context-Aware Service Utilisation in the Clouds and Energy Conservation

    Authors: Saad Liaquat Kiani, Ashiq Anjum, Nick Antonopoulos, Michael Knappmeyer, Nigel Baker, Richard McClatchey

    Abstract: Ubiquitous computing environments are characterised by smart, interconnected artefacts embedded in our physical world that are projected to provide useful services to human inhabitants unobtrusively. Mobile devices are becoming the primary tools of human interaction with these embedded artefacts and utilisation of services available in smart computing environments such as clouds. Advancements in c…

    Submitted 24 February, 2012; originally announced February 2012.

    Comments: 27 pages; 17 figures; 2 tables. Under review at the Journal of Ambient Intelligence and Humanized Computing. 2011