-
Towards Interpreting Zoonotic Potential of Betacoronavirus Sequences With Attention
Authors:
Kahini Wadhawan,
Payel Das,
Barbara A. Han,
Ilya R. Fischhoff,
Adrian C. Castellanos,
Arvind Varsani,
Kush R. Varshney
Abstract:
Current methods for viral discovery target evolutionarily conserved proteins that accurately identify virus families but remain unable to distinguish the zoonotic potential of newly discovered viruses. Here, we apply an attention-enhanced long-short-term memory (LSTM) deep neural net classifier to a highly conserved viral protein target to predict zoonotic potential across betacoronaviruses. The c…
▽ More
Current methods for viral discovery target evolutionarily conserved proteins that accurately identify virus families but remain unable to distinguish the zoonotic potential of newly discovered viruses. Here, we apply an attention-enhanced long-short-term memory (LSTM) deep neural net classifier to a highly conserved viral protein target to predict zoonotic potential across betacoronaviruses. The classifier performs with a 94% accuracy. Analysis and visualization of attention at the sequence and structure-level features indicate possible association between important protein-protein interactions governing viral replication in zoonotic betacoronaviruses and zoonotic transmission.
△ Less
Submitted 18 August, 2021;
originally announced August 2021.
-
Optimizing Molecules using Efficient Queries from Property Evaluations
Authors:
Samuel Hoffman,
Vijil Chenthamarakshan,
Kahini Wadhawan,
Pin-Yu Chen,
Payel Das
Abstract:
Machine learning based methods have shown potential for optimizing existing molecules with more desirable properties, a critical step towards accelerating new chemical discovery. Here we propose QMO, a generic query-based molecule optimization framework that exploits latent embeddings from a molecule autoencoder. QMO improves the desired properties of an input molecule based on efficient queries,…
▽ More
Machine learning based methods have shown potential for optimizing existing molecules with more desirable properties, a critical step towards accelerating new chemical discovery. Here we propose QMO, a generic query-based molecule optimization framework that exploits latent embeddings from a molecule autoencoder. QMO improves the desired properties of an input molecule based on efficient queries, guided by a set of molecular property predictions and evaluation metrics. We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules for drug-likeness and solubility under similarity constraints. We also demonstrate significant property improvement using QMO on two new and challenging tasks that are also important in real-world discovery problems: (i) optimizing existing potential SARS-CoV-2 Main Protease inhibitors toward higher binding affinity; and (ii) improving known antimicrobial peptides towards lower toxicity. Results from QMO show high consistency with external validations, suggesting effective means to facilitate material optimization problems with design constraints.
△ Less
Submitted 18 October, 2021; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics
Authors:
Payel Das,
Tom Sercu,
Kahini Wadhawan,
Inkit Padhi,
Sebastian Gehrmann,
Flaviu Cipcigan,
Vijil Chenthamarakshan,
Hendrik Strobelt,
Cicero dos Santos,
Pin-Yu Chen,
Yi Yan Yang,
Jeremy Tan,
James Hedrick,
Jason Crain,
Aleksandra Mojsilovic
Abstract:
De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints, e.g., high broad-spectrum potency and low toxicity. We propose CLaSS (Controlled Latent attribute Space Sampling) - an efficient computational method for attribute-controlled generation of molecules, which leverages guidance from classifiers trained on an informative latent space of molecules modeled u…
▽ More
De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints, e.g., high broad-spectrum potency and low toxicity. We propose CLaSS (Controlled Latent attribute Space Sampling) - an efficient computational method for attribute-controlled generation of molecules, which leverages guidance from classifiers trained on an informative latent space of molecules modeled using a deep generative autoencoder. We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations. The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency, which are emerging drug candidates for tackling antibiotic resistance. Synthesis and testing of only twenty designed sequences identified two novel and minimalist AMPs with high potency against diverse Gram-positive and Gram-negative pathogens, including one multidrug-resistant and one antibiotic-resistant K. pneumoniae, via membrane pore formation. Both antimicrobials exhibit low in vitro and in vivo toxicity and mitigate the onset of drug resistance. The proposed approach thus presents a viable path for faster and efficient discovery of potent and selective broad-spectrum antimicrobials.
△ Less
Submitted 25 February, 2021; v1 submitted 22 May, 2020;
originally announced May 2020.
-
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
Authors:
Payel Das,
Kahini Wadhawan,
Oscar Chang,
Tom Sercu,
Cicero Dos Santos,
Matthew Riemer,
Vijil Chenthamarakshan,
Inkit Padhi,
Aleksandra Mojsilovic
Abstract:
Given the emerging global threat of antimicrobial resistance, new methods for next-generation antimicrobial design are urgently needed. We report a peptide generation framework PepCVAE, based on a semi-supervised variational autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP) sequences. Our model learns a rich latent space of the biological peptide context by taking advantage…
▽ More
Given the emerging global threat of antimicrobial resistance, new methods for next-generation antimicrobial design are urgently needed. We report a peptide generation framework PepCVAE, based on a semi-supervised variational autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP) sequences. Our model learns a rich latent space of the biological peptide context by taking advantage of abundant, unlabeled peptide sequences. The model further learns a disentangled antimicrobial attribute space by using the feedback from a jointly trained AMP classifier that uses limited labeled instances. The disentangled representation allows for controllable generation of AMPs. Extensive analysis of the PepCVAE-generated sequences reveals superior performance of our model in comparison to a plain VAE, as PepCVAE generates novel AMP sequences with higher long-range diversity, while being closer to the training distribution of biological peptides. These features are highly desired in next-generation antimicrobial design.
△ Less
Submitted 13 November, 2018; v1 submitted 17 October, 2018;
originally announced October 2018.