
Showing 1–17 of 17 results for author: Aspuru-Guzik, A

Searching in archive stat.
  1. arXiv:2410.09290  [pdf, other]

    cs.LG cs.AI stat.ML

    Ranking over Regression for Bayesian Optimization and Molecule Selection

    Authors: Gary Tom, Stanley Lo, Samantha Corapi, Alan Aspuru-Guzik, Benjamin Sanchez-Lengeling

    Abstract: Bayesian optimization (BO) has become an indispensable tool for autonomous decision-making across diverse applications from autonomous vehicle control to accelerated drug and materials discovery. With the growing interest in self-driving laboratories, BO of chemical systems is crucial for machine learning (ML) guided experimental planning. Typically, BO employs a regression surrogate model to pred…

    Submitted 11 October, 2024; originally announced October 2024.

    Comments: 14 + 4 pages, 5 + 3 figures

  2. arXiv:2103.03391  [pdf, other]

    stat.ML cs.LG

    Gemini: Dynamic Bias Correction for Autonomous Experimentation and Molecular Simulation

    Authors: Riley J. Hickman, Florian Häse, Loïc M. Roch, Alán Aspuru-Guzik

    Abstract: Bayesian optimization has emerged as a powerful strategy to accelerate scientific discovery by means of autonomous experimentation. However, expensive measurements are required to accurately estimate materials properties, and can quickly become a hindrance to exhaustive materials discovery campaigns. Here, we introduce Gemini: a data-driven model capable of using inexpensive measurements as proxie…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 12 pages, 5 figures, 2 tables

  3. arXiv:2011.02004  [pdf, other]

    cs.LG math.OC stat.ML

    Bayesian Variational Optimization for Combinatorial Spaces

    Authors: Tony C. Wu, Daniel Flam-Shepherd, Alán Aspuru-Guzik

    Abstract: This paper focuses on Bayesian Optimization in combinatorial spaces. In many applications in the natural sciences, including the study of molecules, proteins, DNA, device structures and quantum circuit designs, optimization over combinatorial categorical spaces is needed to find optimal or Pareto-optimal solutions. However, only a limited number of methods have been proposed t…

    Submitted 3 November, 2020; originally announced November 2020.

  4. arXiv:2010.04153  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Olympus: a benchmarking framework for noisy optimization and experiment planning

    Authors: Florian Häse, Matteo Aldeghi, Riley J. Hickman, Loïc M. Roch, Melodie Christensen, Elena Liles, Jason E. Hein, Alán Aspuru-Guzik

    Abstract: Research challenges encountered across science, engineering, and economics can frequently be formulated as optimization tasks. In chemistry and materials science, recent growth in laboratory digitization and automation has sparked interest in optimization-guided autonomous discovery and closed-loop experimentation. Experiment planning strategies based on off-the-shelf optimization algorithms can b…

    Submitted 30 March, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: 15 pages, 4 figures, 4 tables (with SI: 22 pages, 11 figures, 15 tables). Changes: minor fixes to text and references. Two paragraphs added in Sec. III

    Journal ref: Mach. Learn.: Sci. Technol. 2 (2021) 035021

  5. arXiv:2003.12127  [pdf, other]

    stat.ML cs.LG physics.app-ph

    Gryffin: An algorithm for Bayesian optimization of categorical variables informed by expert knowledge

    Authors: Florian Häse, Matteo Aldeghi, Riley J. Hickman, Loïc M. Roch, Alán Aspuru-Guzik

    Abstract: Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters despite the urge to devise effi…

    Submitted 28 May, 2021; v1 submitted 26 March, 2020; originally announced March 2020.

    Comments: 19 pages, 6 figures (SI: 16 pages, 14 figures). Expanded background, discussion, minor fixes and changes

    Journal ref: Appl. Phys. Rev. 8 (2021) 031406

  6. arXiv:2002.10413  [pdf, other]

    cs.LG stat.ML

    Neural Message Passing on High Order Paths

    Authors: Daniel Flam-Shepherd, Tony Wu, Pascal Friederich, Alan Aspuru-Guzik

    Abstract: Graph neural networks have achieved impressive results in predicting molecular properties, but they do not directly account for local and hidden structures in the graph such as functional groups and molecular geometry. At each propagation step, GNNs aggregate only over first order neighbours, ignoring important information contained in subsequent neighbours as well as the relationships between thos…

    Submitted 24 February, 2020; originally announced February 2020.

  7. arXiv:2002.07087  [pdf, other]

    cs.LG stat.ML

    Graph Deconvolutional Generation

    Authors: Daniel Flam-Shepherd, Tony Wu, Alan Aspuru-Guzik

    Abstract: Graph generation is an extremely important task, as graphs are found throughout different areas of science and engineering. In this work, we focus on the modern equivalent of the Erdos-Renyi random graph model: the graph variational autoencoder (GVAE). This model assumes edges and nodes are independent in order to generate entire graphs at a time using a multi-layer perceptron decoder. As a result…

    Submitted 13 February, 2020; originally announced February 2020.

  8. arXiv:1910.10685  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules

    Authors: Benjamin Sanchez-Lengeling, Jennifer N. Wei, Brian K. Lee, Richard C. Gerkin, Alán Aspuru-Guzik, Alexander B. Wiltschko

    Abstract: Predicting the relationship between a molecule's structure and its odor remains a difficult, decades-old task. This problem, termed quantitative structure-odor relationship (QSOR) modeling, is an important challenge in chemistry, impacting human nutrition, manufacture of synthetic fragrance, the environment, and sensory neuroscience. We propose the use of graph neural networks for QSOR, and show t…

    Submitted 25 October, 2019; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: 18 pages, 13 figures

  9. arXiv:1905.13741  [pdf, other]

    cs.LG physics.chem-ph quant-ph stat.ML

    Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation

    Authors: Mario Krenn, Florian Häse, AkshatKumar Nigam, Pascal Friederich, Alán Aspuru-Guzik

    Abstract: The discovery of novel materials and functional molecules can help to solve some of society's most urgent challenges, ranging from efficient energy harvesting and storage to uncovering novel pharmaceutical drug candidates. Traditionally matter engineering -- generally denoted as inverse design -- was based massively on human intuition and high-throughput virtual screening. The last few years have…

    Submitted 4 March, 2020; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: 6+3 pages, 6+1 figures

    Journal ref: Machine Learning: Science and Technology 1, 045024 (2020)

  10. arXiv:1811.12823  [pdf, other]

    cs.LG cs.AI cs.DB stat.ML

    Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

    Authors: Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alan Aspuru-Guzik, Alex Zhavoronkov

    Abstract: Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is unclear how to compare an…

    Submitted 28 October, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

  11. arXiv:1801.01469  [pdf, other]

    stat.ML physics.chem-ph

    PHOENICS: A universal deep Bayesian optimizer

    Authors: Florian Häse, Loïc M. Roch, Christoph Kreisbeck, Alán Aspuru-Guzik

    Abstract: In this work we introduce PHOENICS, a probabilistic global optimization algorithm combining ideas from Bayesian optimization with concepts from Bayesian kernel density estimation. We propose an inexpensive acquisition function balancing the explorative and exploitative behavior of the algorithm. This acquisition function enables intuitive sampling strategies for an efficient parallel search of glo…

    Submitted 4 January, 2018; originally announced January 2018.

  12. arXiv:1707.06338  [pdf, other]

    physics.chem-ph stat.ML

    Machine Learning for Quantum Dynamics: Deep Learning of Excitation Energy Transfer Properties

    Authors: Florian Häse, Christoph Kreisbeck, Alán Aspuru-Guzik

    Abstract: Understanding the relationship between the structure of light-harvesting systems and their excitation energy transfer properties is of fundamental importance in many applications including the development of next generation photovoltaics. Natural light harvesting in photosynthesis shows remarkable excitation energy transfer properties, which suggests that pigment-protein complexes could serve as b…

    Submitted 19 July, 2017; originally announced July 2017.

  13. arXiv:1706.01825  [pdf, other]

    stat.ML

    Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space

    Authors: José Miguel Hernández-Lobato, James Requeima, Edward O. Pyzer-Knapp, Alán Aspuru-Guzik

    Abstract: Chemical space is so large that brute force searches for new interesting molecules are infeasible. High-throughput virtual screening via computer cluster simulations can speed up the discovery process by collecting very large amounts of data in parallel, e.g., up to hundreds or thousands of parallel measurements. Bayesian optimization (BO) can produce additional acceleration by sequentially identi…

    Submitted 6 June, 2017; originally announced June 2017.

    Comments: Accepted for publication in the proceedings of the 2017 ICML conference

  14. arXiv:1705.10843  [pdf, other]

    stat.ML cs.LG

    Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models

    Authors: Gabriel Lima Guimaraes, Benjamin Sanchez-Lengeling, Carlos Outeiral, Pedro Luis Cunha Farias, Alán Aspuru-Guzik

    Abstract: In unsupervised data generation tasks, besides the generation of a sample based on previous observations, one would often like to give hints to the model in order to bias the generation towards desirable metrics. We propose a method that combines Generative Adversarial Networks (GANs) and reinforcement learning (RL) in order to accomplish exactly that. While RL biases the data generation process t…

    Submitted 6 February, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: 10 pages, 7 figures

  15. arXiv:1608.06296  [pdf, other]

    physics.chem-ph q-bio.QM stat.ML

    Neural networks for the prediction organic chemistry reactions

    Authors: Jennifer N. Wei, David Duvenaud, Alán Aspuru-Guzik

    Abstract: Reaction prediction remains one of the major challenges for organic chemistry, and is a pre-requisite for efficient synthetic planning. It is desirable to develop algorithms that, like humans, "learn" from being exposed to examples of the application of the rules of organic chemistry. We explore the use of neural networks for predicting reaction types, using a new reaction fingerprinting method. W…

    Submitted 17 October, 2016; v1 submitted 22 August, 2016; originally announced August 2016.

    Comments: 21 pages, 5 figures

    Journal ref: ACS.Cent.Sci. 2 (2016) 725-732

  16. arXiv:1608.05747  [pdf, other]

    stat.ML physics.chem-ph

    Space-Filling Curves as a Novel Crystal Structure Representation for Machine Learning Models

    Authors: Dipti Jasrasaria, Edward O. Pyzer-Knapp, Dmitrij Rappoport, Alan Aspuru-Guzik

    Abstract: A fundamental problem in applying machine learning techniques for chemical problems is to find suitable representations for molecular and crystal structures. While the structure representations based on atom connectivities are prevalent for molecules, two-dimensional descriptors are not suitable for describing molecular crystals. In this work, we introduce the SFC-M family of feature representatio…

    Submitted 19 August, 2016; originally announced August 2016.

  17. arXiv:1509.09292  [pdf, other]

    cs.LG cs.NE stat.ML

    Convolutional Networks on Graphs for Learning Molecular Fingerprints

    Authors: David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams

    Abstract: We introduce a convolutional neural network that operates directly on graphs. These networks allow end-to-end learning of prediction pipelines whose inputs are graphs of arbitrary size and shape. The architecture we present generalizes standard molecular feature extraction methods based on circular fingerprints. We show that these data-driven features are more interpretable, and have better predic…

    Submitted 3 November, 2015; v1 submitted 30 September, 2015; originally announced September 2015.

    Comments: 9 pages, 5 figures. To appear in Neural Information Processing Systems (NIPS)