Showing 1–50 of 56 results for author: Stevens, R

  1. arXiv:2505.04846  [pdf, ps, other]

    cs.IR cs.CE cs.CL cs.DC cs.LG

    HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights

    Authors: Ozan Gokdemir, Carlo Siebenschuh, Alexander Brace, Azton Wells, Brian Hsu, Kyle Hippe, Priyanka V. Setty, Aswathy Ajith, J. Gregory Pauloski, Varuni Sastry, Sam Foreman, Huihuo Zheng, Heng Ma, Bharat Kale, Nicholas Chia, Thomas Gibbs, Michael E. Papka, Thomas Brettin, Francis J. Alexander, Anima Anandkumar, Ian Foster, Rick Stevens, Venkatram Vishwanath, Arvind Ramanathan

    Abstract: The volume of scientific literature is growing exponentially, leading to underutilized discoveries, duplicated efforts, and limited cross-disciplinary collaboration. Retrieval Augmented Generation (RAG) offers a way to assist scientists by improving the factuality of Large Language Models (LLMs) in processing this influx of information. However, scaling RAG to handle millions of articles introduce…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: This paper has been accepted at the Platform for Advanced Scientific Computing Conference (PASC 25), June 16-18, 2025, Brugg-Windisch, Switzerland

    ACM Class: H.3.3; I.2.7

  2. arXiv:2505.01435  [pdf, other]

    cs.IR cs.CL cs.DC cs.LG

    AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine

    Authors: Carlo Siebenschuh, Kyle Hippe, Ozan Gokdemir, Alexander Brace, Arham Khan, Khalid Hossain, Yadu Babuji, Nicholas Chia, Venkatram Vishwanath, Rick Stevens, Arvind Ramanathan, Ian Foster, Robert Underwood

    Abstract: Language models for scientific tasks are trained on text from scientific publications, most distributed as PDFs that require parsing. PDF parsing approaches range from inexpensive heuristics (for simple documents) to computationally intensive ML-driven systems (for complex or degraded ones). The choice of the "best" parser for a particular document depends on its computational cost and the accurac…

    Submitted 23 April, 2025; originally announced May 2025.

    Comments: This paper has been accepted at the Eighth Annual Conference on Machine Learning and Systems (MLSys 2025)

  3. arXiv:2504.04770  [pdf, other]

    cs.LG cs.AI q-bio.MN

    Bidirectional Hierarchical Protein Multi-Modal Representation Learning

    Authors: Xuefeng Liu, Songhao Jiang, Chih-chan Tien, Jinbo Xu, Rick Stevens

    Abstract: Protein representation learning is critical for numerous biological tasks. Recently, large transformer-based protein language models (pLMs) pretrained on large scale protein sequences have demonstrated significant success in sequence-based tasks. However, pLMs lack structural information. Conversely, graph neural networks (GNNs) designed to leverage 3D structural information have shown promising g…

    Submitted 7 April, 2025; originally announced April 2025.

  4. arXiv:2503.14356  [pdf, other]

    cs.LG q-bio.QM

    Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis

    Authors: Alexander Partin, Priyanka Vasanthakumari, Oleksandr Narykov, Andreas Wilke, Natasha Koussa, Sara E. Jones, Yitan Zhu, Jamie C. Overbeek, Rajeev Jain, Gayara Demini Fernando, Cesar Sanchez-Villalobos, Cristina Garcia-Cardona, Jamaludin Mohd-Yusof, Nicholas Chia, Justin M. Wozniak, Souparno Ghosh, Ranadip Pal, Thomas S. Brettin, M. Ryan Weil, Rick L. Stevens

    Abstract: Deep learning (DL) and machine learning (ML) models have shown promise in drug response prediction (DRP), yet their ability to generalize across datasets remains an open question, raising concerns about their real-world applicability. Due to the lack of standardized benchmarking approaches, model evaluations and comparisons often rely on inconsistent datasets and evaluation criteria, making it dif…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: 18 pages, 9 figures

  5. arXiv:2502.20309  [pdf, other]

    cs.AI

    EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants

    Authors: Franck Cappello, Sandeep Madireddy, Robert Underwood, Neil Getty, Nicholas Lee-Ping Chia, Nesar Ramachandra, Josh Nguyen, Murat Keceli, Tanwi Mallick, Zilinghan Li, Marieme Ngom, Chenhui Zhang, Angel Yanguas-Gil, Evan Antoniuk, Bhavya Kailkhura, Minyang Tian, Yufeng Du, Yuan-Sen Ting, Azton Wells, Bogdan Nicolae, Avinash Maurya, M. Mustafa Rafique, Eliu Huerta, Bo Li, Ian Foster, et al. (1 additional author not shown)

    Abstract: Recent advancements have positioned AI, and particularly Large Language Models (LLMs), as transformative tools for scientific research, capable of addressing complex tasks that require reasoning, problem-solving, and decision-making. Their exceptional capabilities suggest their potential as scientific research assistants but also highlight the need for holistic, rigorous, and domain-specific evalu…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 33 pages, 18 figures

  6. arXiv:2502.10631  [pdf, other]

    cs.LG cs.AI q-bio.BM

    ControllableGPT: A Ground-Up Designed Controllable GPT for Molecule Optimization

    Authors: Xuefeng Liu, Songhao Jiang, Bo Li, Rick Stevens

    Abstract: Large Language Models (LLMs) employ three popular training approaches: Masked Language Models (MLM), Causal Language Models (CLM), and Sequence-to-Sequence Models (seq2seq). However, each approach has its strengths and limitations, and faces challenges in addressing specific tasks that require controllable and bidirectional generation, such as drug optimization. To address this challenge, inspired…

    Submitted 14 February, 2025; originally announced February 2025.

  7. arXiv:2502.07937  [pdf, other]

    cs.LG stat.ML

    Active Advantage-Aligned Online Reinforcement Learning with Offline Data

    Authors: Xuefeng Liu, Hung T. C. Le, Siyu Chen, Rick Stevens, Zhuoran Yang, Matthew R. Walter, Yuxin Chen

    Abstract: Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but often produces suboptimal results due to limited data coverage. Recent efforts have sought to integrate offline and online RL in order to harness the advantages…

    Submitted 11 February, 2025; originally announced February 2025.

  8. arXiv:2502.07237  [pdf, other]

    cs.LG cs.CL q-bio.BM stat.ML

    DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization

    Authors: Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens

    Abstract: Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research delves into the realm of drug optimization and introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model, enhancing the original drug across target objectives while retaining the beneficial chemical properties of the original drug.…

    Submitted 10 February, 2025; originally announced February 2025.

  9. arXiv:2502.06891  [pdf, other]

    q-bio.BM cs.CL cs.LG

    ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization

    Authors: Xuefeng Liu, Songhao Jiang, Ian Foster, Jinbo Xu, Rick Stevens

    Abstract: Drug optimization has become increasingly crucial in light of fast-mutating virus strains and drug-resistant cancer cells. Nevertheless, it remains challenging as it necessitates retaining the beneficial properties of the original drug while simultaneously enhancing desired attributes beyond its scope. In this work, we aim to tackle this challenge by introducing ScaffoldGPT, a novel Generative Pre…

    Submitted 11 April, 2025; v1 submitted 9 February, 2025; originally announced February 2025.

  10. arXiv:2501.15370  [pdf]

    cs.CV cs.AI

    Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis

    Authors: Robinson Umeike, Neil Getty, Fangfang Xia, Rick Stevens

    Abstract: Large language models (LLMs) have demonstrated immense capabilities in understanding textual data and are increasingly being adopted to help researchers accelerate scientific discovery through knowledge extraction (information retrieval), knowledge distillation (summarizing key findings and methodologies into concise forms), and knowledge synthesis (aggregating information from multiple scientific…

    Submitted 25 January, 2025; originally announced January 2025.

    Comments: 4 Pages, 4 Figures, 1 Table

    ACM Class: I.2.7; I.4.9

  11. arXiv:2412.06700  [pdf, other]

    cs.CR

    Facade: High-Precision Insider Threat Detection Using Deep Contextual Anomaly Detection

    Authors: Alex Kantchelian, Casper Neo, Ryan Stevens, Hyungwon Kim, Zhaohao Fu, Sadegh Momeni, Birkett Huber, Elie Bursztein, Yanis Pavlidis, Senaka Buthpitiya, Martin Cochran, Massimiliano Poletto

    Abstract: We present Facade (Fast and Accurate Contextual Anomaly DEtection): a high-precision deep-learning-based anomaly detection system deployed at Google (a large technology company) as the last line of defense against insider threats since 2018. Facade is an innovative unsupervised action-context system that detects suspicious actions by considering the context surrounding each action, including relev…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: Under review

  12. arXiv:2410.00709  [pdf, other]

    q-bio.QM cs.AI stat.ML

    Binding Affinity Prediction: From Conventional to Machine Learning-Based Approaches

    Authors: Xuefeng Liu, Songhao Jiang, Xiaotian Duan, Archit Vasan, Chong Liu, Chih-chan Tien, Heng Ma, Thomas Brettin, Fangfang Xia, Ian T. Foster, Rick L. Stevens

    Abstract: Protein-ligand binding is the process by which a small molecule (drug or inhibitor) attaches to a target protein. The binding affinity, which refers to the strength of this interaction, is central to many important problems in bioinformatics such as drug design. An extensive amount of work has been devoted to predicting binding affinity over the past decades due to its significance. In this paper,…

    Submitted 29 September, 2024; originally announced October 2024.

  13. arXiv:2409.12215  [pdf, other]

    q-bio.BM cs.LG

    Assessing Reusability of Deep Learning-Based Monotherapy Drug Response Prediction Models Trained with Omics Data

    Authors: Jamie C. Overbeek, Alexander Partin, Thomas S. Brettin, Nicholas Chia, Oleksandr Narykov, Priyanka Vasanthakumari, Andreas Wilke, Yitan Zhu, Austin Clyde, Sara Jones, Rohan Gnanaolivu, Yuanhang Liu, Jun Jiang, Chen Wang, Carter Knutson, Andrew McNaughton, Neeraj Kumar, Gayara Demini Fernando, Souparno Ghosh, Cesar Sanchez-Villalobos, Ruibo Zhang, Ranadip Pal, M. Ryan Weil, Rick L. Stevens

    Abstract: Cancer drug response prediction (DRP) models present a promising approach towards precision oncology, tailoring treatments to individual patient profiles. While deep learning (DL) methods have shown great potential in this area, models that can be successfully translated into clinical practice and shed light on the molecular mechanisms underlying treatment response will likely emerge from collabor…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: 12 pages, 2 figures

  14. arXiv:2406.07025  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    Entropy-Reinforced Planning with Large Language Models for Drug Discovery

    Authors: Xuefeng Liu, Chih-chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens

    Abstract: The objective of drug discovery is to identify chemical compounds that possess specific pharmaceutical properties toward a binding target. Existing large language models (LLMs) can achieve high token matching scores in terms of likelihood for molecule generation. However, relying solely on LLM decoding often results in the generation of molecules that are either invalid due to a single misused tok…

    Submitted 29 March, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Published in ICML2024

  15. arXiv:2406.06348  [pdf, other]

    cs.LG cs.DC stat.ME

    Causal Discovery over High-Dimensional Structured Hypothesis Spaces with Causal Graph Partitioning

    Authors: Ashka Shah, Adela DePavia, Nathaniel Hudson, Ian Foster, Rick Stevens

    Abstract: The aim in many sciences is to understand the mechanisms that underlie the observed distribution of variables, starting from a set of initial hypotheses. Causal discovery allows us to infer mechanisms as sets of cause and effect relationships in a generalized way -- without necessarily tailoring to a specific domain. Causal discovery algorithms search over a structured hypothesis space, defined by…

    Submitted 3 March, 2025; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: TMLR 03/2025

  16. arXiv:2402.03480  [pdf, other]

    cs.LG cs.AI cs.DC

    Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision

    Authors: Nathaniel Hudson, J. Gregory Pauloski, Matt Baughman, Alok Kamatar, Mansi Sakarvadia, Logan Ward, Ryan Chard, André Bauer, Maksim Levental, Wenyi Wang, Will Engler, Owen Price Skelly, Ben Blaiszik, Rick Stevens, Kyle Chard, Ian Foster

    Abstract: Deep learning methods are transforming research, enabling new techniques, and ultimately leading to new discoveries. As the demand for more capable AI models continues to grow, we are now entering an era of Trillion Parameter Models (TPM), or models with more than a trillion parameters -- such as Huawei's PanGu-$Σ$. We describe a vision for the ecosystem of TPM users and providers that caters to t…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 10 pages, 3 figures, accepted for publication in the proceedings of the 10th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT2023)

  17. arXiv:2312.10188  [pdf, other]

    cs.LG

    WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

    Authors: Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang

    Abstract: We introduce WordScape, a novel pipeline for the creation of cross-disciplinary, multilingual corpora comprising millions of pages with annotations for document layout detection. Relating visual and textual items on document pages has gained further significance with the advent of multimodal models. Various approaches proved effective for visual question answering or layout segmentation. However,…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks

  18. arXiv:2310.04610  [pdf, other]

    cs.AI cs.LG

    DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

    Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, et al. (67 additional authors not shown)

    Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique…

    Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  19. arXiv:2310.03899  [pdf, other]

    cs.LG

    CrysFormer: Protein Structure Prediction via 3d Patterson Maps and Partial Structure Attention

    Authors: Chen Dun, Qiutai Pan, Shikai Jin, Ria Stevens, Mitchell D. Miller, George N. Phillips, Jr., Anastasios Kyrillidis

    Abstract: Determining the structure of a protein has been a decades-long open question. A protein's three-dimensional structure often poses nontrivial computation costs, when classical simulation algorithms are utilized. Advances in the transformer neural network architecture -- such as AlphaFold2 -- achieve significant improvements for this problem, by learning from a large dataset of sequence information…

    Submitted 5 October, 2023; originally announced October 2023.

  20. arXiv:2310.01737  [pdf, other]

    cs.LG cs.AI stat.ML

    Blending Imitation and Reinforcement Learning for Robust Policy Improvement

    Authors: Xuefeng Liu, Takuma Yoneda, Rick L. Stevens, Matthew R. Walter, Yuxin Chen

    Abstract: While reinforcement learning (RL) has shown promising performance, its sample complexity continues to be a substantial hurdle, restricting its broader application across a variety of domains. Imitation learning (IL) utilizes oracles to improve sample efficiency, yet it is often constrained by the quality of the oracles deployed. which actively interleaves between IL and RL based on an online estim…

    Submitted 4 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  21. arXiv:2308.09793  [pdf, other]

    cs.RO

    Towards a Modular Architecture for Science Factories

    Authors: Rafael Vescovi, Tobias Ginsburg, Kyle Hippe, Doga Ozgulbas, Casey Stone, Abraham Stroka, Rory Butler, Ben Blaiszik, Tom Brettin, Kyle Chard, Mark Hereld, Arvind Ramanathan, Rick Stevens, Aikaterini Vriza, Jie Xu, Qingteng Zhang, Ian Foster

    Abstract: Advances in robotic automation, high-performance computing (HPC), and artificial intelligence (AI) encourage us to conceive of science factories: large, general-purpose computation- and AI-enabled self-driving laboratories (SDLs) with the generality and scale needed both to tackle large discovery problems and to support thousands of scientists. Science factories require modular hardware and softwa…

    Submitted 17 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

  22. arXiv:2308.01921  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Transferable Graph Neural Fingerprint Models for Quick Response to Future Bio-Threats

    Authors: Wei Chen, Yihui Ren, Ai Kagawa, Matthew R. Carbone, Samuel Yen-Chi Chen, Xiaohui Qu, Shinjae Yoo, Austin Clyde, Arvind Ramanathan, Rick L. Stevens, Hubertus J. J. van Dam, Deyu Lu

    Abstract: Fast screening of drug molecules based on the ligand binding affinity is an important step in the drug discovery pipeline. Graph neural fingerprint is a promising method for developing molecular docking surrogates with high throughput and great fidelity. In this study, we built a COVID-19 drug docking dataset of about 300,000 drug candidates on 23 coronavirus protein targets. With this dataset, we…

    Submitted 14 September, 2023; v1 submitted 17 July, 2023; originally announced August 2023.

    Comments: 8 pages, 5 figures, 2 tables, accepted by ICLMA2023

    ACM Class: I.2.1

  23. arXiv:2304.03210  [pdf, other]

    q-bio.MN cs.DC

    Causal Discovery and Optimal Experimental Design for Genome-Scale Biological Network Recovery

    Authors: Ashka Shah, Arvind Ramanathan, Valerie Hayot-Sasson, Rick Stevens

    Abstract: Causal discovery of genome-scale networks is important for identifying pathways from genes to observable traits - e.g. differences in cell function, disease, drug resistance and others. Causal learners based on graphical models rely on interventional samples to orient edges in the network. However, these models have not been shown to scale up the size of the genome, which are on the order of 1e3-1…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: To be published in Platform for Advanced Scientific Computing 2023 (PASC23) conference proceedings

  24. arXiv:2303.07470  [pdf, other]

    cs.LG cs.AR

    X-Former: In-Memory Acceleration of Transformers

    Authors: Shrihari Sridharan, Jacob R. Stevens, Kaushik Roy, Anand Raghunathan

    Abstract: Transformers have achieved great success in a wide variety of natural language processing (NLP) tasks due to the attention mechanism, which assigns an importance score for every word relative to other words in a sequence. However, these models are very large, often reaching hundreds of billions of parameters, and therefore require a large number of DRAM accesses. Hence, traditional deep neural net…

    Submitted 13 March, 2023; originally announced March 2023.

  25. arXiv:2303.04630  [pdf]

    cs.LG q-bio.QM stat.AP

    Mining the contribution of intensive care clinical course to outcome after traumatic brain injury

    Authors: Shubhayu Bhattacharyay, Pier Francesco Caruso, Cecilia Åkerlund, Lindsay Wilson, Robert D Stevens, David K Menon, Ewout W Steyerberg, David W Nelson, Ari Ercole, the CENTER-TBI investigators/participants

    Abstract: Existing methods to characterise the evolving condition of traumatic brain injury (TBI) patients in the intensive care unit (ICU) do not capture the context necessary for individualising treatment. Here, we integrate all heterogenous data stored in medical records (1,166 pre-ICU and ICU variables) to model the individualised contribution of clinical course to six-month functional outcome on the Gl…

    Submitted 1 August, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Journal ref: npj Digit. Med. 6, 154 (2023)

  26. arXiv:2211.10442  [pdf, other]

    q-bio.QM cs.LG

    Deep learning methods for drug response prediction in cancer: predominant and emerging trends

    Authors: Alexander Partin, Thomas S. Brettin, Yitan Zhu, Oleksandr Narykov, Austin Clyde, Jamie Overbeek, Rick L. Stevens

    Abstract: Cancer claims millions of lives yearly worldwide. While many therapies have been made available in recent years, by and large cancer remains unsolved. Exploiting computational predictive models to study and treat cancer holds great promise in improving drug development and personalized design of treatment plans, ultimately suppressing tumors, alleviating suffering, and prolonging lives of patients.…

    Submitted 17 November, 2022; originally announced November 2022.

  27. arXiv:2207.06030  [pdf, other]

    cs.LG cs.AI stat.ML

    Contextual Active Model Selection

    Authors: Xuefeng Liu, Fangfang Xia, Rick L. Stevens, Yuxin Chen

    Abstract: While training models and labeling data are resource-intensive, a wealth of pre-trained models and unlabeled data exists. To effectively utilize these resources, we present an approach to actively select pre-trained models while minimizing labeling costs. We frame this as an online contextual active model selection problem: At each round, the learner receives an unlabeled data point as a context.…

    Submitted 9 February, 2025; v1 submitted 13 July, 2022; originally announced July 2022.

  28. The leap to ordinal: detailed functional prognosis after traumatic brain injury with a flexible modelling approach

    Authors: Shubhayu Bhattacharyay, Ioan Milosevic, Lindsay Wilson, David K. Menon, Robert D. Stevens, Ewout W. Steyerberg, David W. Nelson, Ari Ercole, the CENTER-TBI investigators/participants

    Abstract: When a patient is admitted to the intensive care unit (ICU) after a traumatic brain injury (TBI), an early prognosis is essential for baseline risk adjustment and shared decision making. TBI outcomes are commonly categorised by the Glasgow Outcome Scale-Extended (GOSE) into 8, ordered levels of functional recovery at 6 months after injury. Existing ICU prognostic models predict binary outcomes at…

    Submitted 4 May, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: 68 pages, 4 figures, 4 tables, 1 appendix, 6 supplementary figures, 4 supplementary tables, 3 supplementary methods, 1 supplementary result

    ACM Class: J.3; I.2.0; I.5.1

    Journal ref: PLOS ONE 17:7 (2022) e0270973

  29. arXiv:2111.13786  [pdf, other]

    cs.LG cs.AI

    Learning from learning machines: a new generation of AI technology to meet the needs of science

    Authors: Luca Pion-Tonachini, Kristofer Bouchard, Hector Garcia Martin, Sean Peisert, W. Bradley Holtz, Anil Aswani, Dipankar Dwivedi, Haruko Wainwright, Ghanshyam Pilania, Benjamin Nachman, Babetta L. Marrone, Nicola Falco, Prabhat, Daniel Arnold, Alejandro Wolf-Yadlin, Sarah Powers, Sharlee Climer, Quinn Jackson, Ty Carlson, Michael Sohn, Petrus Zwart, Neeraj Kumar, Amy Justice, Claire Tomlin, Daniel Jacobson, et al. (11 additional authors not shown)

    Abstract: We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and…

    Submitted 26 November, 2021; originally announced November 2021.

  30. arXiv:2106.07036  [pdf, other]

    q-bio.BM cs.LG

    Protein-Ligand Docking Surrogate Models: A SARS-CoV-2 Benchmark for Deep Learning Accelerated Virtual Screening

    Authors: Austin Clyde, Thomas Brettin, Alexander Partin, Hyunseung Yoo, Yadu Babuji, Ben Blaiszik, Andre Merzky, Matteo Turilli, Shantenu Jha, Arvind Ramanathan, Rick Stevens

    Abstract: We propose a benchmark to study surrogate model accuracy for protein-ligand docking. We share a dataset consisting of 200 million 3D complex structures and 2D structure scores across a consistent set of 13 million "in-stock" molecules over 15 receptors, or binding sites, across the SARS-CoV-2 proteome. Our work shows surrogate docking models have six orders of magnitude more throughput than standa…

    Submitted 30 June, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

  31. arXiv:2106.02190  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery

    Authors: Yulun Wu, Mikaela Cashman, Nicholas Choma, Érica T. Prates, Verónica G. Melesse Vergara, Manesh Shah, Andrew Chen, Austin Clyde, Thomas S. Brettin, Wibe A. de Jong, Neeraj Kumar, Martha S. Head, Rick L. Stevens, Peter Nugent, Daniel A. Jacobson, James B. Brown

    Abstract: We developed Distilled Graph Attention Policy Network (DGAPN), a reinforcement learning model to generate novel graph-structured chemical representations that optimize user-defined objectives by efficiently navigating a physically constrained domain. The framework is examined on the task of generating molecules that are designed to bind, noncovalently, to functional sites of SARS-CoV-2 proteins. W…

    Submitted 11 May, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  32. arXiv:2105.00324  [pdf, other]

    cs.LG cs.AI cs.NE

    Neko: a Library for Exploring Neuromorphic Learning Rules

    Authors: Zixuan Zhao, Nathan Wycoff, Neil Getty, Rick Stevens, Fangfang Xia

    Abstract: The field of neuromorphic computing is in a period of active exploration. While many tools have been developed to simulate neuronal dynamics or convert deep networks to spiking models, general software libraries for learning rules remain underexplored. This is partly due to the diverse, challenging nature of efforts to design new learning rules, which range from encoding methods to gradient approx…

    Submitted 13 August, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

    Comments: Accepted by International Conference on Neuromorphic Systems (ICONS'21)

  33. arXiv:2103.10836  [pdf, other]

    cs.AR

    GNNerator: A Hardware/Software Framework for Accelerating Graph Neural Networks

    Authors: Jacob R. Stevens, Dipankar Das, Sasikanth Avancha, Bharat Kaul, Anand Raghunathan

    Abstract: Graph Neural Networks (GNNs) use a fully-connected layer to extract features from the nodes of a graph and aggregate these features using message passing between nodes, combining two distinct computational patterns: dense, regular computations and sparse, irregular computations. To address this challenge, we propose GNNerator, an accelerator with heterogeneous compute engines optimized for these…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: To appear in Proceedings of the 58th Design Automation Conference (DAC '21)

  34. arXiv:2103.09301  [pdf, other]

    cs.AR

    Softermax: Hardware/Software Co-Design of an Efficient Softmax for Transformers

    Authors: Jacob R. Stevens, Rangharajan Venkatesan, Steve Dai, Brucek Khailany, Anand Raghunathan

    Abstract: Transformers have transformed the field of natural language processing. This performance is largely attributed to the use of stacked self-attention layers, each of which consists of matrix multiplies as well as softmax operations. As a result, unlike other neural networks, the softmax operation accounts for a significant fraction of the total run-time of Transformers. To address this, we propose S…

    Submitted 16 March, 2021; originally announced March 2021.

    Comments: To appear in Proceedings of the 58th Design Automation Conference (DAC '21)

  35. arXiv:2103.06867  [pdf, other]

    cs.LG

    Scaffold Embeddings: Learning the Structure Spanned by Chemical Fragments, Scaffolds and Compounds

    Authors: Austin Clyde, Arvind Ramanathan, Rick Stevens

    Abstract: Molecules have seemed like a natural fit to deep learning's tendency to handle a complex structure through representation learning, given enough data. However, this often continuous representation is not natural for understanding chemical space as a domain and is particular to samples and their differences. We focus on exploring a natural structure for representing chemical space as a structured d…

    Submitted 11 March, 2021; originally announced March 2021.

  36. arXiv:2103.02843  [pdf]

    cs.DC cs.CE cs.LG physics.bio-ph q-bio.QM

    Pandemic Drugs at Pandemic Speed: Infrastructure for Accelerating COVID-19 Drug Discovery with Hybrid Machine Learning- and Physics-based Simulations on High Performance Computers

    Authors: Agastya P. Bhati, Shunzhou Wan, Dario Alfè, Austin R. Clyde, Mathis Bode, Li Tan, Mikhail Titov, Andre Merzky, Matteo Turilli, Shantenu Jha, Roger R. Highfield, Walter Rocchia, Nicola Scafuri, Sauro Succi, Dieter Kranzlmüller, Gerald Mathias, David Wifling, Yann Donon, Alberto Di Meglio, Sofia Vallecorsa, Heng Ma, Anda Trifan, Arvind Ramanathan, Tom Brettin, Alexander Partin, et al. (4 additional authors not shown)

    Abstract: The race to meet the challenges of the global pandemic has served as a reminder that the existing drug discovery process is expensive, inefficient and slow. There is a major bottleneck screening the vast number of potential small molecules to shortlist lead compounds for antiviral drug development. New opportunities to accelerate drug discovery lie at the interface between machine learning methods…

    Submitted 4 September, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Journal ref: Interface Focus. 2021. 11 (6): 20210018

  37. arXiv:2011.12466  [pdf, other]

    q-bio.QM cs.LG

    Learning Curves for Drug Response Prediction in Cancer Cell Lines

    Authors: Alexander Partin, Thomas Brettin, Yvonne A. Evrard, Yitan Zhu, Hyunseung Yoo, Fangfang Xia, Songhao Jiang, Austin Clyde, Maulik Shukla, Michael Fonstein, James H. Doroshow, Rick Stevens

    Abstract: Motivated by the size of cell line drug sensitivity data, researchers have been developing machine learning (ML) models for predicting drug response to advance cancer treatment. As drug sensitivity studies continue generating data, a common question is whether the proposed predictors can further improve the generalization performance with more training data. We utilize empirical learning curves fo…

    Submitted 24 November, 2020; originally announced November 2020.

    Comments: 14 pages, 7 figures

  38. arXiv:2010.10517  [pdf, other]

    cs.DC cs.CE

    Scalable HPC and AI Infrastructure for COVID-19 Therapeutics

    Authors: Hyungro Lee, Andre Merzky, Li Tan, Mikhail Titov, Matteo Turilli, Dario Alfe, Agastya Bhati, Alex Brace, Austin Clyde, Peter Coveney, Heng Ma, Arvind Ramanathan, Rick Stevens, Anda Trifan, Hubertus Van Dam, Shunzhou Wan, Sean Wilkinson, Shantenu Jha

    Abstract: COVID-19 has claimed more than 1 million lives and resulted in over 40 million infections. There is an urgent need to identify drugs that can inhibit SARS-CoV-2. In response, the DOE recently established the Medical Therapeutics project as part of the National Virtual Biotechnology Laboratory, and tasked it with creating the computational infrastructure and methods necessary to advance therapeutics dev…

    Submitted 20 October, 2020; originally announced October 2020.

  39. arXiv:2010.06574  [pdf, other]

    cs.DC cs.CE q-bio.QM

    IMPECCABLE: Integrated Modeling PipelinE for COVID Cure by Assessing Better LEads

    Authors: Aymen Al Saadi, Dario Alfe, Yadu Babuji, Agastya Bhati, Ben Blaiszik, Thomas Brettin, Kyle Chard, Ryan Chard, Peter Coveney, Anda Trifan, Alex Brace, Austin Clyde, Ian Foster, Tom Gibbs, Shantenu Jha, Kristopher Keipert, Thorsten Kurth, Dieter Kranzlmüller, Hyungro Lee, Zhuozhao Li, Heng Ma, Andre Merzky, Gerald Mathias, Alexander Partin, Junqi Yin, et al. (11 additional authors not shown)

    Abstract: The drug discovery process currently employed in the pharmaceutical industry typically requires about 10 years and $2-3 billion to deliver one new drug. This is both too expensive and too slow, especially in emergencies like the COVID-19 pandemic. In silico methodologies need to be improved to better select lead compounds that can proceed to later stages of the drug discovery protocol accelerating…

    Submitted 13 October, 2020; originally announced October 2020.

  40. arXiv:2010.03688  [pdf, other]

    cs.CL cs.AI cs.LG

    AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models

    Authors: Amrit Nagarajan, Sanchari Sen, Jacob R. Stevens, Anand Raghunathan

    Abstract: Transformers have greatly advanced the state-of-the-art in Natural Language Processing (NLP) in recent years, but present very large computation and storage requirements. We observe that the design process of Transformers (pre-train a foundation model on a large dataset in a self-supervised manner, and subsequently fine-tune it for different downstream tasks) leads to task-specific models that are…

    Submitted 9 June, 2022; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: International Joint Conference on Neural Networks (IJCNN) 2022

  41. arXiv:2006.02431  [pdf, other]

    q-bio.BM cs.LG q-bio.QM stat.ML

    Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release

    Authors: Yadu Babuji, Ben Blaiszik, Tom Brettin, Kyle Chard, Ryan Chard, Austin Clyde, Ian Foster, Zhi Hong, Shantenu Jha, Zhuozhao Li, Xuefeng Liu, Arvind Ramanathan, Yi Ren, Nicholaus Saint, Marcus Schwarting, Rick Stevens, Hubertus van Dam, Rick Wagner

    Abstract: Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One promising approach is to train machine learning (ML) and artificial intelligence (AI) tools to screen large numbers of small molecules. As a contribution to that effort,…

    Submitted 27 May, 2020; originally announced June 2020.

    Comments: 11 pages, 5 figures

  42. arXiv:2006.01171  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Regression Enrichment Surfaces: a Simple Analysis Technique for Virtual Drug Screening Models

    Authors: Austin Clyde, Xiaotian Duan, Rick Stevens

    Abstract: We present a new method for understanding the performance of a model in virtual drug screening tasks. While most virtual screening problems present as a mix between ranking and classification, the models are typically trained as regression models presenting a problem requiring either a choice of a cutoff or ranking measure. Our method, regression enrichment surfaces (RES), is based on the goal of…

    Submitted 1 June, 2020; originally announced June 2020.

  43. arXiv:2005.09572  [pdf]

    q-bio.QM cs.LG

    Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response

    Authors: Yitan Zhu, Thomas Brettin, Yvonne A. Evrard, Alexander Partin, Fangfang Xia, Maulik Shukla, Hyunseung Yoo, James H. Doroshow, Rick Stevens

    Abstract: Transfer learning has been shown to be effective in many applications in which training data for the target problem are limited but data for a related (source) problem are abundant. In this paper, we apply transfer learning to the prediction of anti-cancer drug response. Previous transfer learning studies for drug response prediction focused on building models that predict the response of tumor ce…

    Submitted 13 May, 2020; originally announced May 2020.

  44. arXiv:2005.05431  [pdf]

    eess.IV cs.CV cs.LG

    Deep Medical Image Analysis with Representation Learning and Neuromorphic Computing

    Authors: Neil Getty, Thomas Brettin, Dong Jin, Rick Stevens, Fangfang Xia

    Abstract: We explore three representative lines of research and demonstrate the utility of our methods on a classification benchmark of brain cancer MRI data. First, we present a capsule network that explicitly learns a representation robust to rotation and affine transformation. This model requires less training data and outperforms both the original convolutional baseline and a previous capsule network im…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: 8 pages, 7 figures

  45. arXiv:2005.00095  [pdf, other]

    cs.LG q-bio.GN q-bio.QM

    A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning

    Authors: Austin Clyde, Tom Brettin, Alexander Partin, Maulik Shaulik, Hyunseung Yoo, Yvonne Evrard, Yitan Zhu, Fangfang Xia, Rick Stevens

    Abstract: By combining various cancer cell line (CCL) drug screening panels, the size of the data has grown significantly to begin understanding how advances in deep learning can advance drug response predictions. In this paper we train >35,000 neural network models, sweeping over common featurization techniques. We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 f…

    Submitted 4 May, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

  46. arXiv:2004.09673  [pdf, other]

    q-bio.QM cs.LG eess.IV

    Neural Network Segmentation of Cell Ultrastructure Using Incomplete Annotation

    Authors: John Paul Francis, Hongzhi Wang, Kate White, Tanveer Syeda-Mahmood, Raymond Stevens

    Abstract: The Pancreatic beta cell is an important target in diabetes research. For scalable modeling of beta cell ultrastructure, we investigate automatic segmentation of whole cell imaging data acquired through soft X-ray tomography. During the course of the study, both complete and partial ultrastructure annotations were produced manually for different subsets of the data. To more effectively use existin…

    Submitted 20 April, 2020; originally announced April 2020.

  47. A Physiology-Driven Computational Model for Post-Cardiac Arrest Outcome Prediction

    Authors: Han B. Kim, Hieu Nguyen, Qingchu Jin, Sharmila Tamby, Tatiana Gelaf Romer, Eric Sung, Ran Liu, Joseph Greenstein, Jose I. Suarez, Christian Storm, Raimond Winslow, Robert D. Stevens

    Abstract: Patients resuscitated from cardiac arrest (CA) face a high risk of neurological disability and death, however pragmatic methods are lacking for accurate and reliable prognostication. The aim of this study was to build computational models to predict post-CA outcome by leveraging high-dimensional patient data available early after admission to the intensive care unit (ICU). We hypothesized that mod…

    Submitted 11 February, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: 51 pages, 7 figures, 4 supplementary figures

    ACM Class: J.3; I.2.1; I.6.4; G.3

    Journal ref: Anaesthesia Critical Care & Pain Medicine 41.1 (2022): 101015

  48. Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research

    Authors: Prasanna Balaprakash, Romain Egele, Misha Salim, Stefan Wild, Venkatram Vishwanath, Fangfang Xia, Tom Brettin, Rick Stevens

    Abstract: Cancer is a complex disease, the understanding and treatment of which are being aided through increases in the volume of collected data and in the scale of deployed computing power. Consequently, there is a growing need for the development of data-driven and, in particular, deep learning methods for various tasks such as cancer diagnosis, detection, prognosis, and prediction. Despite recent succes…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: SC '19: IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis, November 17-22, 2019, Denver, CO

  49. arXiv:1810.05093  [pdf, other]

    cs.DL

    Measuring Expert Performance at Manually Classifying Domain Entities under Upper Ontology Classes

    Authors: Robert Stevens, Phillip Lord, James Malone, Nicolas Matentzoglu

    Abstract: Classifying entities in domain ontologies under upper ontology classes is a recommended task in ontology engineering to facilitate semantic interoperability and modelling consistency. Integrating upper ontologies this way is difficult and, despite emerging automated methods, remains a largely manual task. Little is known about how well experts perform at upper ontology integration. To develop me…

    Submitted 11 October, 2018; originally announced October 2018.

  50. arXiv:1804.11002  [pdf, other]

    cs.AI cs.CY

    Precision Medicine as an Accelerator for Next Generation Cognitive Supercomputing

    Authors: Edmon Begoli, Jim Brase, Bambi DeLaRosa, Penelope Jones, Dimitri Kusnezov, Jason Paragas, Rick Stevens, Fred Streitz, Georgia Tourassi

    Abstract: In the past several years, we have taken advantage of a number of opportunities to advance the intersection of next generation high-performance computing AI and big data technologies through partnerships in precision medicine. Today we are in the throes of piecing together what is likely the most unique convergence of medical data and computer technologies. But more deeply, we observe that the tra…

    Submitted 29 April, 2018; originally announced April 2018.

    ACM Class: I.2.1; C.3

    Journal ref: SUPERCOMPUTING FRONTIERS AND INNOVATIONS, 2018