
Showing 1–9 of 9 results for author: Foster, I

Searching in archive stat.
  1. arXiv:2502.07237  [pdf, other]

    cs.LG cs.CL q-bio.BM stat.ML

    DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization

    Authors: Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens

    Abstract: Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research delves into the realm of drug optimization and introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model, enhancing the original drug across target objectives while retaining the beneficial chemical properties of the original drug.…

    Submitted 10 February, 2025; originally announced February 2025.

  2. arXiv:2410.00709  [pdf, other]

    q-bio.QM cs.AI stat.ML

    Binding Affinity Prediction: From Conventional to Machine Learning-Based Approaches

    Authors: Xuefeng Liu, Songhao Jiang, Xiaotian Duan, Archit Vasan, Chong Liu, Chih-chan Tien, Heng Ma, Thomas Brettin, Fangfang Xia, Ian T. Foster, Rick L. Stevens

    Abstract: Protein-ligand binding is the process by which a small molecule (drug or inhibitor) attaches to a target protein. The binding affinity, which refers to the strength of this interaction, is central to many important problems in bioinformatics such as drug design. An extensive amount of work has been devoted to predicting binding affinity over the past decades due to its significance. In this paper,…

    Submitted 29 September, 2024; originally announced October 2024.

  3. arXiv:2406.06348  [pdf, other]

    cs.LG cs.DC stat.ME

    Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning

    Authors: Ashka Shah, Adela DePavia, Nathaniel Hudson, Ian Foster, Rick Stevens

    Abstract: The aim in many sciences is to understand the mechanisms that underlie the observed distribution of variables, starting from a set of initial hypotheses. Causal discovery allows us to infer mechanisms as sets of cause and effect relationships in a generalized way -- without necessarily tailoring to a specific domain. Causal discovery algorithms search over a structured hypothesis space, defined by…

    Submitted 3 March, 2025; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: TMLR 03/2025

  4. arXiv:2101.06813  [pdf, other]

    cs.LG cs.AI stat.AP

    Fast and accurate learned multiresolution dynamical downscaling for precipitation

    Authors: Jiali Wang, Zhengchun Liu, Ian Foster, Won Chang, Rajkumar Kettimuthu, Rao Kotamarthi

    Abstract: This study develops a neural network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use a combination of low- and high-resolution simulations to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables…

    Submitted 17 January, 2021; originally announced January 2021.

  5. arXiv:2007.00784  [pdf, other]

    cs.LG cs.DC stat.ML

    Convolutional Neural Network Training with Distributed K-FAC

    Authors: J. Gregory Pauloski, Zhao Zhang, Lei Huang, Weijia Xu, Ian T. Foster

    Abstract: Training neural networks with many processors can reduce time-to-solution; however, it is challenging to maintain convergence and efficiency at large scales. The Kronecker-factored Approximate Curvature (K-FAC) was recently proposed as an approximation of the Fisher Information Matrix that can be used in natural gradient optimizers. We investigate here a scalable K-FAC design and its applicability…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: To be published in the proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20)

  6. arXiv:2006.02431  [pdf, other]

    q-bio.BM cs.LG q-bio.QM stat.ML

    Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release

    Authors: Yadu Babuji, Ben Blaiszik, Tom Brettin, Kyle Chard, Ryan Chard, Austin Clyde, Ian Foster, Zhi Hong, Shantenu Jha, Zhuozhao Li, Xuefeng Liu, Arvind Ramanathan, Yi Ren, Nicholaus Saint, Marcus Schwarting, Rick Stevens, Hubertus van Dam, Rick Wagner

    Abstract: Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One promising approach is to train machine learning (ML) and artificial intelligence (AI) tools to screen large numbers of small molecules. As a contribution to that effort,…

    Submitted 27 May, 2020; originally announced June 2020.

    Comments: 11 pages, 5 figures

  7. arXiv:1907.03222  [pdf, other]

    physics.comp-ph cs.LG stat.ML

    IRNet: A General Purpose Deep Residual Regression Framework for Materials Discovery

    Authors: Dipendra Jha, Logan Ward, Zijiang Yang, Christopher Wolverton, Ian Foster, Wei-keng Liao, Alok Choudhary, Ankit Agrawal

    Abstract: Materials discovery is crucial for making scientific advances in many domains. Collections of data from experiments and first-principle computations have spurred interest in applying machine learning methods to create predictive models capable of mapping from composition and crystal structures to materials properties. Generally, these are regression problems with the input being a 1D vector compos…

    Submitted 7 July, 2019; originally announced July 2019.

    Comments: 9 pages, under publication at KDD'19

  8. arXiv:1906.03233  [pdf]

    physics.comp-ph cond-mat.mtrl-sci physics.chem-ph stat.ML

    Machine Learning Prediction of Accurate Atomization Energies of Organic Molecules from Low-Fidelity Quantum Chemical Calculations

    Authors: Logan Ward, Ben Blaiszik, Ian Foster, Rajeev S. Assary, Badri Narayanan, Larry Curtiss

    Abstract: Recent studies illustrate how machine learning (ML) can be used to bypass a core challenge of molecular modeling: the tradeoff between accuracy and computational cost. Here, we assess multiple ML approaches for predicting the atomization energy of organic molecules. Our resulting models learn the difference between low-fidelity, B3LYP, and high-accuracy, G4MP2, atomization energies, and predict th…

    Submitted 7 June, 2019; originally announced June 2019.

  9. arXiv:1811.11213  [pdf, other]

    cs.LG cs.DC stat.ML

    DLHub: Model and Data Serving for Science

    Authors: Ryan Chard, Zhuozhao Li, Kyle Chard, Logan Ward, Yadu Babuji, Anna Woodard, Steve Tuecke, Ben Blaiszik, Michael J. Franklin, Ian Foster

    Abstract: While the Machine Learning (ML) landscape is evolving rapidly, there has been a relative lag in the development of the "learning systems" needed to enable broad adoption. Furthermore, few such systems are designed to support the specialized requirements of scientific ML. Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and ser…

    Submitted 27 November, 2018; originally announced November 2018.

    Comments: 10 pages, 8 figures, conference paper