-
CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening
Authors:
Gen Zhou,
Sugitha Janarthanan,
Yutong Lu,
Pingzhao Hu
Abstract:
Due to the rise in antimicrobial resistance, identifying novel compounds with antibiotic potential is crucial for combatting this global health issue. However, traditional drug development methods are costly and inefficient. Recognizing the pressing need for more effective solutions, researchers have turned to machine learning techniques to streamline the prediction and development of novel antibiotic compounds. While foundation models have shown promise in antibiotic discovery, current mainstream efforts still fall short of fully leveraging the potential of multimodal molecular data. Recent studies suggest that contrastive learning frameworks utilizing multimodal data exhibit excellent performance in representation learning across various domains. Building upon this, we introduce CL-MFAP, an unsupervised contrastive learning (CL)-based multimodal foundation (MF) model specifically tailored for discovering small molecules with potential antibiotic properties (AP) using three types of molecular data. This model employs 1.6 million bioactive molecules with drug-like properties from the ChEMBL dataset to jointly pretrain three encoders: (1) a transformer-based encoder with rotary position embedding for processing SMILES strings; (2) another transformer-based encoder, incorporating a novel bi-level routing attention mechanism to handle molecular graph representations; and (3) a Morgan fingerprint encoder using a multilayer perceptron, to achieve the contrastive learning purpose. The CL-MFAP outperforms baseline models in antibiotic property prediction by effectively utilizing different molecular modalities and demonstrates superior domain-specific performance when fine-tuned for antibiotic-related property prediction tasks.
Submitted 16 February, 2025;
originally announced February 2025.
-
BioNeMo Framework: a modular, high-performance library for AI model development in drug discovery
Authors:
Peter St. John,
Dejun Lin,
Polina Binder,
Malcolm Greaves,
Vega Shah,
John St. John,
Adrian Lange,
Patrick Hsu,
Rajesh Illango,
Arvind Ramanathan,
Anima Anandkumar,
David H Brookes,
Akosua Busia,
Abhishaike Mahajan,
Stephen Malina,
Neha Prasad,
Sam Sinai,
Lindsay Edwards,
Thomas Gaudelet,
Cristian Regep,
Martin Steinegger,
Burkhard Rost,
Alexander Brace,
Kyle Hippe,
Luca Naef,
et al. (63 additional authors not shown)
Abstract:
Artificial Intelligence models encoding biology and chemistry are opening new routes to high-throughput and high-quality in-silico drug development. However, their training increasingly relies on computational scale, with recent protein language models (pLM) training on hundreds of graphics processing units (GPUs). We introduce the BioNeMo Framework to facilitate the training of computational biology and chemistry AI models across hundreds of GPUs. Its modular design allows the integration of individual components, such as data loaders, into existing workflows and is open to community contributions. We detail technical features of the BioNeMo Framework through use cases such as pLM pre-training and fine-tuning. On 256 NVIDIA A100s, BioNeMo Framework trains a three billion parameter BERT-based pLM on over one trillion tokens in 4.2 days. The BioNeMo Framework is open-source and free for everyone to use.
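To put the quoted training run in perspective, the figures in the abstract (256 GPUs, over one trillion tokens, 4.2 days) can be turned into an approximate per-GPU throughput. This is a back-of-the-envelope sketch only: the function name is hypothetical, and because the abstract says "over one trillion", the result is a lower bound on the order of 10^4 tokens per second per GPU.

```python
def tokens_per_second_per_gpu(total_tokens: float, days: float, num_gpus: int) -> float:
    """Average training throughput per GPU, assuming the run used all GPUs for the full duration."""
    seconds = days * 24 * 60 * 60
    return total_tokens / seconds / num_gpus


# Figures taken from the abstract; 1e12 is a lower bound ("over one trillion tokens").
rate = tokens_per_second_per_gpu(total_tokens=1e12, days=4.2, num_gpus=256)
print(f"{rate:,.0f} tokens/s per GPU (lower bound)")
```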
Submitted 15 November, 2024;
originally announced November 2024.
-
S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search
Authors:
Gengmo Zhou,
Zhen Wang,
Feng Yu,
Guolin Ke,
Zhewei Wei,
Zhifeng Gao
Abstract:
Virtual Screening is an essential technique in the early phases of drug discovery, aimed at identifying promising drug candidates from vast molecular libraries. Recently, ligand-based virtual screening has garnered significant attention due to its efficacy in conducting extensive database screenings without relying on specific protein-binding site information. Obtaining binding affinity data for complexes is highly expensive, resulting in a limited amount of available data that covers a relatively small chemical space. Moreover, these datasets contain a significant amount of inconsistent noise. It is challenging to identify an inductive bias that consistently maintains the integrity of molecular activity during data augmentation. To tackle these challenges, we propose S-MolSearch, to our knowledge the first framework that leverages molecular 3D information and affinity information in semi-supervised contrastive learning for ligand-based virtual screening. Drawing on the principles of inverse optimal transport, S-MolSearch efficiently processes both labeled and unlabeled data, training molecular structural encoders while generating soft labels for the unlabeled data. This design allows S-MolSearch to adaptively utilize unlabeled data within the learning process. Empirically, S-MolSearch demonstrates superior performance on the widely used benchmarks LIT-PCBA and DUD-E. It surpasses both structure-based and ligand-based virtual screening methods in AUROC, BEDROC, and EF.
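For readers unfamiliar with the EF metric named above, the following is a minimal sketch of the enrichment factor as it is commonly defined in virtual screening: the hit rate among the top-ranked fraction of the library divided by the hit rate of the whole library. The function name, the default fraction, and the toy data are illustrative assumptions and are not taken from the paper.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at the given top fraction.

    scores: predicted scores, higher means more likely active.
    labels: 1 for active, 0 for inactive (same order as scores).
    """
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one molecule and one active")
    n_top = max(1, int(round(n * fraction)))
    # Rank molecules from highest to lowest score and count actives in the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)


# Toy library of 10 molecules with 2 actives, both ranked first.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

A value of 1 corresponds to random ranking; larger values mean actives are concentrated near the top of the ranked list.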
Submitted 21 November, 2024; v1 submitted 27 August, 2024;
originally announced September 2024.
-
Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction
Authors:
Eric Alcaide,
Zhifeng Gao,
Guolin Ke,
Yaqi Li,
Linfeng Zhang,
Hang Zheng,
Gengmo Zhou
Abstract:
In recent years, machine learning (ML) methods have emerged as promising alternatives for molecular docking, offering the potential for high accuracy without incurring prohibitive computational costs. However, recent studies have indicated that these ML models may overfit to quantitative metrics while neglecting the physical constraints inherent in the problem. In this work, we present Uni-Mol Docking V2, which demonstrates a remarkable improvement in performance, accurately predicting the binding poses of 77+% of ligands in the PoseBusters benchmark with an RMSD value of less than 2.0 Å, and 75+% passing all quality checks. This represents a significant increase from the 62% achieved by the previous Uni-Mol Docking model. Notably, our Uni-Mol Docking approach generates chemically accurate predictions, circumventing issues such as chirality inversions and steric clashes that have plagued previous ML models. Furthermore, we observe enhanced performance in terms of high-quality predictions (RMSD values of less than 1.0 Å and 1.5 Å) and physical soundness when Uni-Mol Docking is combined with more physics-based methods like Uni-Dock. Our results represent a significant advancement in the application of artificial intelligence for scientific research, adopting a holistic approach to ligand docking that is well-suited for industrial applications in virtual screening and drug design. The code, data and service for Uni-Mol Docking are publicly available for use and further development at https://github.com/dptech-corp/Uni-Mol.
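The success criterion quoted above (RMSD below 2.0 Å) can be illustrated with a small sketch. It assumes predicted and reference ligand coordinates are already in the same receptor frame with matching atom order, so no superposition is applied, and it ignores symmetry correction, which real benchmarks typically handle. The function names and toy coordinates are hypothetical and do not come from the Uni-Mol code base.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms, in Å."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for a, b in zip(pred, ref) for p, q in zip(a, b))
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of poses whose RMSD is strictly below the threshold."""
    return sum(1 for value in rmsds if value < threshold) / len(rmsds)


# Toy example: every atom is displaced by exactly 1 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(predicted, reference))
print(success_rate([0.5, 1.9, 2.0, 3.0]))
```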
Submitted 20 May, 2024;
originally announced May 2024.
-
Graph schemas as abstractions for transfer learning, inference, and planning
Authors:
J. Swaroop Guntupalli,
Rajkumar Vasudeva Raju,
Shrinu Kushagra,
Carter Wendelken,
Danny Sawyer,
Ishan Deshpande,
Guangyao Zhou,
Miguel Lázaro-Gredilla,
Dileep George
Abstract:
Transferring latent structure from one environment or problem to another is a mechanism by which humans and animals generalize with very little data. Inspired by cognitive and neurobiological insights, we propose graph schemas as a mechanism of abstraction for transfer learning. Graph schemas start with latent graph learning where perceptually aliased observations are disambiguated in the latent space using contextual information. Latent graph learning is also emerging as a new computational model of the hippocampus to explain map learning and transitive inference. Our insight is that a latent graph can be treated as a flexible template -- a schema -- that models concepts and behaviors, with slots that bind groups of latent nodes to the specific observations or groundings. By treating learned latent graphs (schemas) as prior knowledge, new environments can be quickly learned as compositions of schemas and their newly learned bindings. We evaluate graph schemas on two previously published challenging tasks: the memory & planning game and one-shot StreetLearn, which are designed to test rapid task solving in novel environments. Graph schemas can be learned in far fewer episodes than previous baselines, and can model and plan in a few steps in novel variations of these tasks. We also demonstrate learning, matching, and reusing graph schemas in more challenging 2D and 3D environments with extensive perceptual aliasing and size variations, and show how different schemas can be composed to model larger and more complex environments. To summarize, our main contribution is a unified system, inspired and grounded in cognitive science, that facilitates rapid transfer learning of new environments using schemas via map-induction and composition that handles perceptual aliasing.
Submitted 12 December, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation?
Authors:
Gengmo Zhou,
Zhifeng Gao,
Zhewei Wei,
Hang Zheng,
Guolin Ke
Abstract:
Molecular conformation generation (MCG) is a fundamental and important problem in drug discovery. Many traditional methods have been developed to solve the MCG problem, such as systematic searching, model-building, random searching, distance geometry, molecular dynamics, Monte Carlo methods, etc. However, they have some limitations depending on the molecular structures. Recently, many deep learning based MCG methods have appeared, claiming to largely outperform the traditional methods. However, to our surprise, we design a simple and cheap (parameter-free) algorithm based on the traditional methods and find that it is comparable to or even outperforms deep learning based MCG methods on the widely used GEOM-QM9 and GEOM-Drugs benchmarks. In particular, our algorithm is simply the clustering of RDKit-generated conformations. We hope our findings can help the community to revise the deep learning methods for MCG. The code of the proposed algorithm can be found at https://gist.github.com/ZhouGengmo/5b565f51adafcd911c0bc115b2ef027c.
Submitted 27 March, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Space is a latent sequence: Structured sequence learning as a unified theory of representation in the hippocampus
Authors:
Rajkumar Vasudeva Raju,
J. Swaroop Guntupalli,
Guangyao Zhou,
Miguel Lázaro-Gredilla,
Dileep George
Abstract:
Fascinating and puzzling phenomena, such as landmark vector cells, splitter cells, and event-specific representations to name a few, are regularly discovered in the hippocampus. Without a unifying principle that can explain these divergent observations, each experiment seemingly discovers a new anomaly or coding type. Here, we provide a unifying principle that the mental representation of space is an emergent property of latent higher-order sequence learning. Treating space as a sequence resolves myriad phenomena, and suggests that the place-field mapping methodology where sequential neuron responses are interpreted in spatial and Euclidean terms might itself be a source of anomalies. Our model, called Clone-structured Causal Graph (CSCG), uses a specific higher-order graph scaffolding to learn latent representations by mapping sensory inputs to unique contexts. Learning to compress sequential and episodic experiences using CSCGs results in the emergence of cognitive maps - mental representations of spatial and conceptual relationships in an environment that are suited for planning, introspection, consolidation, and abstraction. We demonstrate that over a dozen different hippocampal phenomena, ranging from those reported in classic experiments to the most recent ones, are succinctly and mechanistically explained by our model.
Submitted 2 December, 2022;
originally announced December 2022.
-
Predicting Protein-Ligand Binding Affinity via Joint Global-Local Interaction Modeling
Authors:
Yang Zhang,
Gengmo Zhou,
Zhewei Wei,
Hongteng Xu
Abstract:
The prediction of protein-ligand binding affinity is of great significance for discovering lead compounds in drug research. Facing this challenging task, most existing prediction methods rely on the topological and/or spatial structure of molecules and the local interactions while ignoring the multi-level inter-molecular interactions between proteins and ligands, which often leads to sub-optimal performance. To solve this issue, we propose a novel global-local interaction (GLI) framework to predict protein-ligand binding affinity. In particular, our GLI framework considers the inter-molecular interactions between proteins and ligands, which involve not only the high-energy short-range interactions between closed atoms but also the low-energy long-range interactions between non-bonded atoms. For each pair of protein and ligand, our GLI embeds the long-range interactions globally and aggregates local short-range interactions, respectively. Such a joint global-local interaction modeling strategy helps to improve prediction accuracy, and the whole framework is compatible with various neural network-based modules. Experiments demonstrate that our GLI framework outperforms state-of-the-art methods with simple neural network architectures and moderate computational costs.
Submitted 18 September, 2022;
originally announced September 2022.
-
An integrated recurrent neural network and regression model with spatial and climatic couplings for vector-borne disease dynamics
Authors:
Zhijian Li,
Jack Xin,
Guofa Zhou
Abstract:
We developed an integrated recurrent neural network and nonlinear regression spatio-temporal model for vector-borne disease evolution. We take into account climate data and seasonality as external factors that correlate with disease-transmitting insects (e.g. flies), as well as spill-over infections from neighboring regions surrounding a region of interest. The climate data is encoded into the model through a quadratic embedding scheme motivated by recommendation systems. The neighboring regions' influence is modeled by a long short-term memory neural network. The integrated model is trained by stochastic gradient descent and tested on leishmaniasis data in Sri Lanka from 2013-2018, where infection outbreaks occurred. Our model outperformed ARIMA models across a number of regions with high infections, and an associated ablation study lends support to our modeling hypothesis and ideas.
Submitted 23 January, 2022;
originally announced January 2022.
-
Drug-Target Interaction Prediction with Graph Attention networks
Authors:
Haiyang Wang,
Guangyu Zhou,
Siqi Liu,
Jyun-Yu Jiang,
Wei Wang
Abstract:
Motivation: Predicting Drug-Target Interaction (DTI) is a well-studied topic in bioinformatics due to its relevance in the fields of proteomics and pharmaceutical research. Although many machine learning methods have been successfully applied in this task, few of them aim at leveraging the inherent heterogeneous graph structure in the DTI network to address the challenge. For better learning and interpreting the DTI topological structure and the similarity, it is desirable to have methods specifically for predicting interactions from the graph structure.
Results: We present an end-to-end framework, DTI-GAT (Drug-Target Interaction prediction with Graph Attention networks) for DTI predictions. DTI-GAT incorporates a deep neural network architecture that operates on graph-structured data with the attention mechanism, which leverages both the interaction patterns and the features of drug and protein sequences. DTI-GAT facilitates the interpretation of the DTI topological structure by assigning different attention weights to each node with the self-attention mechanism. Experimental evaluations show that DTI-GAT outperforms various state-of-the-art systems on the binary DTI prediction problem. Moreover, the independent study results further demonstrate that our model can be generalized better than other conventional methods.
Availability: The source code and all datasets are available at https://github.com/Haiyang-W/DTI-GRAPH
Submitted 10 July, 2021;
originally announced July 2021.
-
A Recurrent Neural Network and Differential Equation Based Spatiotemporal Infectious Disease Model with Application to COVID-19
Authors:
Zhijian Li,
Yunling Zheng,
Jack Xin,
Guofa Zhou
Abstract:
The outbreaks of Coronavirus Disease 2019 (COVID-19) have impacted the world significantly. Modeling the trend of infection and real-time forecasting of cases can help decision making and control of the disease spread. However, data-driven methods such as recurrent neural networks (RNN) can perform poorly due to limited daily samples in time. In this work, we develop an integrated spatiotemporal model based on the epidemic differential equations (SIR) and RNN. The former, after simplification and discretization, is a compact model of the temporal infection trend of a region, while the latter models the effect of nearest neighboring regions and captures latent spatial information. We trained and tested our model on COVID-19 data in Italy, and show that it outperforms existing temporal models (fully connected NN, SIR, ARIMA) in 1-day, 3-day, and 1-week ahead forecasting, especially in the regime of limited training data.
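As background for the SIR component mentioned above, here is a minimal sketch of the textbook SIR equations discretized with a forward-Euler step. It is not the authors' simplified model, which the abstract does not spell out, and it omits the RNN coupling to neighboring regions; the function name and all parameter values are hypothetical.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=1.0):
    """Forward-Euler discretization of dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(int(days / dt)):
        # New infections are capped so the susceptible count never goes negative.
        new_infected = min(beta * s * i / n * dt, s)
        new_recovered = gamma * i * dt
        s -= new_infected
        i += new_infected - new_recovered
        r += new_recovered
        history.append((s, i, r))
    return history


# Hypothetical parameters chosen only for illustration.
trajectory = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=60)
peak_infected = max(i for _, i, _ in trajectory)
print(f"peak infected: {peak_infected:.0f}")
```

The update moves people between compartments without creating or removing any, so S + I + R stays equal to the initial population at every step.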
Submitted 17 September, 2020; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Intervention Pathway Discovery via Context-Dependent Dynamic Sensitivity Analysis
Authors:
Gaoxiang Zhou,
Kai-Wen Liang,
Natasa Miskov-Zivanov
Abstract:
The sensitivity analysis of biological system models can significantly contribute to identifying and explaining influences of internal or external changes on a model and its elements. We propose here a comprehensive framework to study sensitivity of intra-cellular networks and to identify key intervention pathways, by performing both static and dynamic sensitivity analysis. While the static sensitivity analysis focuses on the impact of network topology and update functions, the dynamic analysis accounts for context-dependent transient state distributions. To study sensitivity, we use discrete models, where each element is represented as a discrete variable and assigned an update rule, which is a function of the element's known direct and indirect regulators. Our sensitivity analysis framework allows for assessing the effect of context on individual element sensitivity, as well as on element criticality in reaching preferred outcomes. The framework also enables discovery of the most influential pathways in the model that are essential for satisfying important system properties, and thus, could be used for interventions. We discuss the role of nine different network attributes in identifying key elements and intervention pathways, and evaluate their performance using a model checking method. Finally, we apply our methods to a model of naive T cell differentiation, and further demonstrate the importance of context-based sensitivity analysis in identifying the most influential elements and pathways.
Submitted 8 February, 2019;
originally announced February 2019.
-
Screening of Fungi for the Application of Self-Healing Concrete
Authors:
Rakenth R. Menon,
Jing Luo,
Xiaobo Chen,
Hui Zhou,
Zhiyong Liu,
Guangwen Zhou,
Ning Zhang,
Congrui Jin
Abstract:
Concrete is susceptible to cracking owing to drying shrinkage, freeze-thaw cycles, delayed ettringite formation, reinforcement corrosion, creep and fatigue, etc. Since maintenance and inspection of concrete infrastructure require onerous labor and high costs, self-healing of harmful cracks without human interference or intervention could be of great attraction. The goal of this study is to explore a new self-healing approach in which fungi are used as a self-healing agent to promote calcium carbonate precipitation to fill the cracks in concrete structures. Recent research results in the field of geomycology have shown that many species of fungi could play an important role in promoting calcium carbonate mineralization, but their application in self-healing concrete has not been reported. Therefore, a screening of different species of fungi has been conducted in this study. Our results showed that, despite the drastic pH increase owing to the leaching of calcium hydroxide from concrete, Aspergillus nidulans (MAD1445), a pH regulatory mutant, could grow on concrete plates and promote calcium carbonate precipitation.
Submitted 16 July, 2018; v1 submitted 23 November, 2017;
originally announced November 2017.
-
Interactions of Fungi with Concrete: Significant Importance for Bio-Based Self-Healing Concrete
Authors:
Jing Luo,
Xiaobo Chen,
Jada Crump,
Hui Zhou,
David G. Davies,
Guangwen Zhou,
Ning Zhang,
Congrui Jin
Abstract:
The goal of this study is to explore a new self-healing concept in which fungi are used as a self-healing agent to promote calcium mineral precipitation to fill the cracks in concrete. An initial screening of different species of fungi has been conducted. Fungal growth medium was overlaid onto a cured concrete plate. Mycelial discs were aseptically deposited at the plate center. The results showed that, due to the dissolving of Ca(OH)2 from concrete, the pH of the growth medium increased from its original value of 6.5 to 13.0. Despite the drastic pH increase, Trichoderma reesei (ATCC13631) spores germinated into hyphal mycelium and grew equally well with or without concrete. X-ray diffraction (XRD) and scanning electron microscopy (SEM) confirmed that the crystals precipitated on the fungal hyphae were composed of calcite. These results indicate that T. reesei has great potential to be used in bio-based self-healing concrete for sustainable infrastructure.
Submitted 24 December, 2017; v1 submitted 3 August, 2017;
originally announced August 2017.
-
A maximum-caliber approach to predicting perturbed folding kinetics due to mutations
Authors:
Vincent A. Voelz,
Guangfeng Zhou,
Hongbin Wan
Abstract:
We present a maximum-caliber method for inferring transition rates of a Markov State Model (MSM) with perturbed equilibrium populations, given estimates of state populations and rates for an unperturbed MSM. It is similar in spirit to previous approaches but given the inclusion of prior information it is more robust and simple to implement. We examine its performance in simple biased diffusion models of kinetics, and then apply the method to predicting changes in folding rates for several highly non-trivial protein folding systems for which non-native interactions play a significant role, including (1) tryptophan variants of GB1 hairpin, (2) salt-bridge mutations of Fs peptide helix, and (3) MSMs built from ultra-long folding trajectories of FiP35 and GTT variants of WW domain. In all cases, the method correctly predicts changes in folding rates, suggesting the wide applicability of maximum-caliber approaches to efficiently predict how mutations perturb protein conformational dynamics.
Submitted 25 May, 2016;
originally announced May 2016.
-
Complexes Detection in Biological Networks via Diversified Dense Subgraphs Mining
Authors:
Xiuli Ma,
Guangyu Zhou,
Jingjing Wang,
Jian Peng,
Jiawei Han
Abstract:
Protein-protein interaction (PPI) networks, providing a comprehensive landscape of protein interacting patterns, enable us to explore biological processes and cellular components at multiple resolutions. For a biological process, a number of proteins need to work together to perform the job. Proteins densely interact with each other, forming large molecular machines or cellular building blocks. Identification of such densely interconnected clusters or protein complexes from PPI networks enables us to obtain a better understanding of the hierarchy and organization of biological processes and cellular components. Most existing methods apply efficient graph clustering algorithms on PPI networks, often failing to detect possible densely connected subgraphs and overlapped subgraphs. Besides clustering-based methods, dense subgraph enumeration methods have also been used, which aim to find all densely connected protein sets. However, such methods are not practically tractable even on a small yeast PPI network, due to high computational complexity. In this paper, we introduce a novel approximate algorithm to efficiently enumerate putative protein complexes from biological networks. The key insight of our algorithm is that we do not need to enumerate all dense subgraphs. Instead we only need to find a small subset of subgraphs that cover as many proteins as possible. The problem is formulated as finding a diverse set of dense subgraphs, where we develop highly effective pruning techniques to guarantee efficiency. To handle large networks, we take a divide-and-conquer approach to speed up the algorithm in a distributed manner. By comparing with existing clustering and dense subgraph-based algorithms on several human and yeast PPI networks, we demonstrate that our method can detect more putative protein complexes and achieves better prediction accuracy.
Submitted 12 April, 2016;
originally announced April 2016.
-
Oscillation in microRNA Feedback Loop
Authors:
Bin Ao,
Sheng Zhang,
Caiyong Ye,
Lei Chang,
Guangming Zhou,
Lei Yang
Abstract:
The dynamic behaviors of microRNA and mRNA under external stress are studied with biological experiments and mathematical models. In this study, we developed a mathematical model to describe the biological phenomenon and, for the first time, reported that the expression levels of microRNA and mRNA show sustained oscillation in response to external stress. The period of the oscillation is much shorter than that of several reported transcriptional regulation negative feedback loops.
Submitted 16 December, 2013; v1 submitted 22 April, 2013;
originally announced April 2013.
-
Warburg Effect due to Exposure to Different Types of Radiation
Authors:
Zhitong Bing,
Bin Ao,
Yanan Zhang,
Fengling Wang,
Caiyong Ye,
Jinpeng He,
Jintu Sun,
Jie Xiong,
Nan Ding,
Xiao-fei Gao,
Ji Qi,
Sheng Zhang,
Guangming Zhou,
Lei Yang
Abstract:
Cancer cells maintain a high level of aerobic glycolysis (the Warburg effect), which is associated with their rapid proliferation. Many studies have reported that the suppression of glycolysis and activation of oxidative phosphorylation can repress the growth of cancer cells through regulation of key regulators. Can the Warburg effect of cancer cells be switched by some other environmental stimulus? Herein, we report an interesting phenomenon in which cells alternated between glycolysis and mitochondrial respiration depending on the type of radiation they were exposed to. We observed enhanced glycolysis and mitochondrial respiration in HeLa cells exposed to 2-Gy X-ray and 2-Gy carbon ion radiation, respectively. This discovery may provide novel insights for tumor therapy.
Submitted 10 March, 2013;
originally announced March 2013.