-
Prefix-Tree Decoding for Predicting Mass Spectra from Molecules
Authors:
Samuel Goldman,
John Bradshaw,
Jiayi Xin,
Connor W. Coley
Abstract:
Computational predictions of mass spectra from molecules have enabled the discovery of clinically relevant metabolites. However, such predictive tools are still limited as they occupy one of two extremes, either operating (a) by fragmenting molecules combinatorially with overly rigid constraints on potential rearrangements and poor time complexity or (b) by decoding lossy and nonphysical discretized spectra vectors. In this work, we use a new intermediate strategy for predicting mass spectra from molecules by treating mass spectra as sets of molecular formulae, which are themselves multisets of atoms. After first encoding an input molecular graph, we decode a set of molecular subformulae, each of which specifies a predicted peak in the mass spectrum, the intensities of which are predicted by a second model. Our key insight is to overcome the combinatorial possibilities for molecular subformulae by decoding the formula set using a prefix tree structure, atom-type by atom-type, representing a general method for ordered multiset decoding. We show promising empirical results on mass spectra prediction tasks.
Submitted 3 December, 2023; v1 submitted 11 March, 2023;
originally announced March 2023.
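The atom-type-by-atom-type decoding described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name decode_subformulae, the keep_child callback (a stand-in for a learned model that decides which counts to expand) and the element ordering are all assumptions made for illustration.

```python
def decode_subformulae(parent, keep_child, atom_order):
    """Enumerate subformulae of `parent` level by level, as in a prefix tree.

    parent:     dict mapping element symbol -> atom count in the full molecule
    keep_child: hypothetical callback (prefix, atom, count) -> bool; a stand-in
                for a learned model that chooses which branches to expand
    atom_order: assumed fixed ordering of elements (one tree level per element)
    """
    frontier = [()]  # each prefix is a tuple of counts chosen so far
    for atom in atom_order:
        next_frontier = []
        for prefix in frontier:
            # A subformula can hold at most as many atoms as the parent formula.
            for count in range(parent.get(atom, 0) + 1):
                if keep_child(prefix, atom, count):
                    next_frontier.append(prefix + (count,))
        frontier = next_frontier
    # Each surviving root-to-leaf path is one predicted subformula.
    return [dict(zip(atom_order, path)) for path in frontier]


if __name__ == "__main__":
    parent = {"C": 2, "H": 4, "O": 1}
    # Toy rule (an assumption, not chemistry): keep only even counts.
    formulas = decode_subformulae(
        parent, lambda prefix, atom, count: count % 2 == 0, ["C", "H", "O"]
    )
    for formula in formulas:
        print(formula)
```

With the toy rule, only 6 of the 30 possible count combinations are expanded, which shows how pruning at each tree level avoids enumerating every subformula.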
-
Barking up the right tree: an approach to search over molecule synthesis DAGs
Authors:
John Bradshaw,
Brooks Paige,
Matt J. Kusner,
Marwin H. S. Segler,
José Miguel Hernández-Lobato
Abstract:
When designing new molecules with particular properties, it is not only important what to make but crucially how to make it. These instructions form a synthesis directed acyclic graph (DAG), describing how a large vocabulary of simple building blocks can be recursively combined through chemical reactions to create more complicated molecules of interest. In contrast, many current deep generative models for molecules ignore synthesizability. We therefore propose a deep generative model that better represents the real-world process, by directly outputting molecule synthesis DAGs. We argue that this provides sensible inductive biases, ensuring that our model searches over the same chemical space that chemists would also have access to, as well as interpretability. We show that our approach is able to model chemical space well, producing a wide range of diverse molecules, and allows for unconstrained optimization of an inherently constrained problem: maximize certain chemical properties such that discovered molecules are synthesizable.
Submitted 21 December, 2020;
originally announced December 2020.
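The idea of a synthesis DAG in the abstract above, building blocks combined recursively into more complicated molecules, can be sketched as a small data structure. This is an illustrative assumption rather than the paper's model: the Molecule class, the synthesis_order helper and the placeholder labels "A", "B", "AB" and "ABA" are hypothetical and do not denote real compounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """A node in a synthesis DAG (hypothetical structure for illustration)."""
    name: str              # placeholder label, not a real chemical identifier
    reactants: tuple = ()  # empty tuple means the node is a building block


def synthesis_order(target):
    """Return node names in an order where reactants precede their products."""
    order, seen = [], set()

    def visit(mol):
        if mol.name in seen:
            return  # shared intermediates are listed only once
        seen.add(mol.name)
        for reactant in mol.reactants:
            visit(reactant)
        order.append(mol.name)

    visit(target)
    return order


if __name__ == "__main__":
    a = Molecule("A")
    b = Molecule("B")
    ab = Molecule("AB", (a, b))
    # "A" is reused in a second step, so the structure is a DAG, not a tree.
    target = Molecule("ABA", (ab, a))
    print(synthesis_order(target))
```

Because every product lists its reactants, a post-order traversal yields a valid sequence of synthesis steps starting from the building blocks.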
-
50/500 or 100/1000 debate is not about the time frame - Reply to Rosenfeld
Authors:
Richard Frankham,
Corey J. A. Bradshaw,
Barry W. Brook
Abstract:
The Letter from Rosenfeld (2014, Biological Conservation) in response to Jamieson and Allendorf (2012, Trends in Ecology and Evolution) and Frankham et al. (2014, Biological Conservation) and related papers is misleading in places and requires clarification and correction. We provide those here.
Submitted 30 June, 2014; v1 submitted 24 June, 2014;
originally announced June 2014.