Showing 1–13 of 13 results for author: Koes, D R

Searching in archive cs.
  1. arXiv:2505.00169  [pdf, other]

    cs.LG cs.AI

    GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation

    Authors: Filipp Nikitin, Ian Dunn, David Ryan Koes, Olexandr Isayev

    Abstract: Deep generative models have shown significant promise in generating valid 3D molecular structures, with the GEOM-Drugs dataset serving as a key benchmark. However, current evaluation protocols suffer from critical flaws, including incorrect valency definitions, bugs in bond order calculations, and reliance on force fields inconsistent with the reference data. In this work, we revisit GEOM-Drugs an…

    Submitted 15 May, 2025; v1 submitted 30 April, 2025; originally announced May 2025.

  2. arXiv:2411.16644  [pdf, other]

    cs.LG q-bio.BM

    Exploring Discrete Flow Matching for 3D De Novo Molecule Generation

    Authors: Ian Dunn, David R. Koes

    Abstract: Deep generative models that produce novel molecular structures have the potential to facilitate chemical discovery. Flow matching is a recently proposed generative modeling framework that has achieved impressive performance on a variety of tasks including those on biomolecular structures. The seminal flow matching framework was developed only for continuous data. However, de novo molecular design…

    Submitted 25 November, 2024; originally announced November 2024.

    Comments: Presented at the NeurIPS 2024 Machine Learning for Structural Biology Workshop

  3. arXiv:2411.15418  [pdf, other]

    q-bio.BM cs.LG

    Scaling Structure Aware Virtual Screening to Billions of Molecules with SPRINT

    Authors: Andrew T. McNutt, Abhinav K. Adduri, Caleb N. Ellington, Monica T. Dayao, Eric P. Xing, Hosein Mohimani, David R. Koes

    Abstract: Virtual screening of small molecules against protein targets can accelerate drug discovery and development by predicting drug-target interactions (DTIs). However, structure-based methods like molecular docking are too slow to allow for broad proteome-scale screens, limiting their application in screening for off-target effects or new molecular mechanisms. Recently, vector-based methods using prote…

    Submitted 20 January, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

  4. arXiv:2404.19739  [pdf, other]

    q-bio.BM cs.LG

    Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation

    Authors: Ian Dunn, David Ryan Koes

    Abstract: Deep generative models that produce novel molecular structures have the potential to facilitate chemical discovery. Diffusion models currently achieve state of the art performance for 3D molecule generation. In this work, we explore the use of flow matching, a recently proposed generative modeling framework that generalizes diffusion models, for the task of de novo molecule generation. Flow matchi…

    Submitted 30 April, 2024; originally announced April 2024.

  5. arXiv:2311.13466  [pdf, other]

    q-bio.BM cs.LG

    Accelerating Inference in Molecular Diffusion Models with Latent Representations of Protein Structure

    Authors: Ian Dunn, David Ryan Koes

    Abstract: Diffusion generative models have emerged as a powerful framework for addressing problems in structural biology and structure-based drug design. These models operate directly on 3D molecular structures. Due to the unfavorable scaling of graph neural networks (GNNs) with graph size as well as the relatively slow inference speeds inherent to diffusion models, many existing molecular diffusion models…

    Submitted 8 May, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: This paper appeared as a spotlight paper at the NeurIPS 2023 Generative AI and Biology Workshop

  6. arXiv:2110.15200  [pdf, other]

    q-bio.QM cs.LG

    Generating 3D Molecules Conditional on Receptor Binding Sites with Deep Generative Models

    Authors: Matthew Ragoza, Tomohide Masuda, David Ryan Koes

    Abstract: The goal of structure-based drug discovery is to find small molecules that bind to a given target protein. Deep learning has been used to generate drug-like molecules with certain cheminformatic properties, but has not yet been applied to generating 3D molecules predicted to bind to proteins by sampling the conditional distribution of protein-ligand binding interactions. In this work, we describe…

    Submitted 26 January, 2022; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: Main: 12 pages, 7 figures; Supplement: 6 pages, 14 figures

  7. arXiv:2010.14442  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Generating 3D Molecular Structures Conditional on a Receptor Binding Site with Deep Generative Models

    Authors: Tomohide Masuda, Matthew Ragoza, David Ryan Koes

    Abstract: Deep generative models have been applied with increasing success to the generation of two dimensional molecules as SMILES strings and molecular graphs. In this work we describe for the first time a deep generative model that can generate 3D molecular structures conditioned on a three-dimensional (3D) binding pocket. Using convolutional neural networks, we encode atomic density grids into separate…

    Submitted 23 November, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

  8. arXiv:2010.08687  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    Learning a Continuous Representation of 3D Molecular Structures with Deep Generative Models

    Authors: Matthew Ragoza, Tomohide Masuda, David Ryan Koes

    Abstract: Machine learning in drug discovery has been focused on virtual screening of molecular libraries using discriminative models. Generative models are an entirely different approach that learn to represent and optimize molecules in a continuous latent space. These methods have been increasingly successful at generating two dimensional molecules as SMILES strings and molecular graphs. In this work, we…

    Submitted 14 November, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: Camera-ready submission to NeurIPS 2020 MLSB workshop

  9. arXiv:2010.08162  [pdf, other]

    q-bio.BM cs.LG

    SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning

    Authors: Jonathan E. King, David Ryan Koes

    Abstract: Despite recent advancements in deep learning methods for protein structure prediction and representation, little focus has been directed at the simultaneous inclusion and prediction of protein backbone and sidechain structure information. We present SidechainNet, a new dataset that directly extends the ProteinNet dataset. SidechainNet includes angle and atomic coordinate information capable of des…

    Submitted 15 November, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: 8 pages, 2 figures, 1 table, Accepted for the Machine Learning for Structural Biology Workshop at the 34th Conference on Neural Information Processing Systems (MLSB NeurIPS 2020)

  10. arXiv:1912.04822  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications

    Authors: Jocelyn Sunseri, David Ryan Koes

    Abstract: There are many ways to represent a molecule as input to a machine learning model and each is associated with loss and retention of certain kinds of information. In the interest of preserving three-dimensional spatial information, including bond angles and torsions, we have developed libmolgrid, a general-purpose library for representing three-dimensional molecules using multidimensional arrays. Th…

    Submitted 10 December, 2019; originally announced December 2019.

  11. arXiv:1803.02398  [pdf, other]

    stat.ML cs.LG q-bio.BM

    Visualizing Convolutional Neural Network Protein-Ligand Scoring

    Authors: Joshua Hochuli, Alec Helbling, Tamar Skaist, Matthew Ragoza, David Ryan Koes

    Abstract: Protein-ligand scoring is an important step in a structure-based drug design pipeline. Selecting a correct binding pose and predicting the binding affinity of a protein-ligand complex enables effective virtual screening. Machine learning techniques can make use of the increasing amounts of structural data that are becoming publicly available. Convolutional neural network (CNN) scoring functions in…

    Submitted 6 March, 2018; originally announced March 2018.

  12. arXiv:1710.07400  [pdf, other]

    stat.ML cs.LG q-bio.BM

    Ligand Pose Optimization with Atomic Grid-Based Convolutional Neural Networks

    Authors: Matthew Ragoza, Lillian Turner, David Ryan Koes

    Abstract: Docking is an important tool in computational drug discovery that aims to predict the binding pose of a ligand to a target protein through a combination of pose scoring and optimization. A scoring function that is differentiable with respect to atom positions can be used for both scoring and gradient-based optimization of poses for docking. Using a differentiable grid-based atomic representation a…

    Submitted 19 October, 2017; originally announced October 2017.

    Comments: 10 pages

  13. arXiv:1612.02751  [pdf, other]

    stat.ML cs.LG q-bio.BM

    Protein-Ligand Scoring with Convolutional Neural Networks

    Authors: Matthew Ragoza, Joshua Hochuli, Elisa Idrobo, Jocelyn Sunseri, David Ryan Koes

    Abstract: Computational approaches to drug discovery can reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables the use of deep machine learning techniques for protei…

    Submitted 8 December, 2016; originally announced December 2016.