-
ILETIA: An AI-enhanced method for individualized trigger-oocyte pickup interval estimation of progestin-primed ovarian stimulation protocol
Authors:
Binjian Wu,
Qian Li,
Zhe Kuang,
Hongyuan Gao,
Xinyi Liu,
Haiyan Guo,
Qiuju Chen,
Xinyi Liu,
Yangruizhe Jiang,
Yuqi Zhang,
Jinyin Zha,
Mingyu Li,
Qiuhan Ren,
Sishuo Feng,
Haicang Zhang,
Xuefeng Lu,
Jian Zhang
Abstract:
In vitro fertilization-embryo transfer (IVF-ET) stands as one of the most prevalent treatments for infertility. During an IVF-ET cycle, the time interval between trigger shot and oocyte pickup (OPU) is a pivotal period for follicular maturation, which determines mature oocyte yield and impacts the success of subsequent procedures. However, accurately predicting this interval is severely hindered by the variability of clinicians' experience, which often leads to suboptimal oocyte retrieval rates. To address this challenge, we propose ILETIA, the first machine learning-based method that could predict the optimal trigger-OPU interval for patients receiving the progestin-primed ovarian stimulation (PPOS) protocol. Specifically, ILETIA leverages a Transformer to learn representations from clinical tabular data, and then employs gradient-boosted trees for interval prediction. For model training and evaluation, we compiled a dataset, PPOS-DS, of nearly ten thousand patients receiving the PPOS protocol, the largest such dataset to our knowledge. Experimental results demonstrate that our method achieves strong performance (AUROC = 0.889), outperforming both clinicians and other widely used computational models. Moreover, ILETIA also supports premature ovulation risk prediction at a specific OPU time (AUROC = 0.838). Collectively, by enabling more precise and individualized decisions, ILETIA has the potential to improve clinical outcomes and lay the foundation for future IVF-ET research.
Submitted 25 January, 2025;
originally announced January 2025.
-
Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach
Authors:
Jingyi Zhao,
Yuxuan Ou,
Austin Tripp,
Morteza Rasoulianboroujeni,
José Miguel Hernández-Lobato
Abstract:
Ionizable lipids are essential in developing lipid nanoparticles (LNPs) for effective messenger RNA (mRNA) delivery. While traditional methods for designing new ionizable lipids are typically time-consuming, deep generative models have emerged as a powerful solution, significantly accelerating the molecular discovery process. However, a practical challenge arises as the molecular structures generated can often be difficult or infeasible to synthesize. This project explores Monte Carlo tree search (MCTS)-based generative models for synthesizable ionizable lipids. Leveraging a synthetically accessible lipid building block dataset and two specialized predictors to guide the search through chemical space, we introduce a policy network guided MCTS generative model capable of producing new ionizable lipids with available synthesis pathways.
Submitted 1 December, 2024;
originally announced December 2024.
-
Multiscale differential geometry learning for protein flexibility analysis
Authors:
Hongsong Feng,
Jeffrey Y. Zhao,
Guo-Wei Wei
Abstract:
Protein flexibility is crucial for understanding protein structures, functions, and dynamics, and it can be measured through experimental methods such as X-ray crystallography. Theoretical approaches have also been developed to predict B-factor values, which reflect protein flexibility. Previous models have made significant strides in analyzing B-factors by fitting experimental data. In this study, we propose a novel approach for B-factor prediction using differential geometry theory, based on the assumption that the intrinsic properties of proteins reside on a family of low-dimensional manifolds embedded within the high-dimensional space of protein structures. By analyzing the mean and Gaussian curvatures of a set of kernel-function-defined low-dimensional manifolds, we develop effective and robust multiscale differential geometry (mDG) models. Our mDG model demonstrates a 27% increase in accuracy compared to the classical Gaussian network model (GNM) in predicting B-factors for a dataset of 364 proteins. Additionally, by incorporating both global and local protein features, we construct a highly effective machine learning model for the blind prediction of B-factors. Extensive least-squares approximations and machine learning-based blind predictions validate the effectiveness of the mDG modeling approach for B-factor prediction.
Submitted 5 November, 2024;
originally announced November 2024.
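The classical Gaussian network model (GNM) that the abstract above uses as its baseline can be sketched as follows. This is the textbook GNM, not the mDG model itself; the function name, the 7 Å default cutoff, and the toy coordinates in the usage note are assumptions made for illustration.

```python
import numpy as np

def gnm_bfactors(coords, cutoff=7.0):
    """Relative B-factors from a Gaussian network model (up to a scale factor).

    coords: (N, 3) array of C-alpha positions in angstroms.
    cutoff: contact distance; 7.0 is a commonly used value, assumed here.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances between all residues.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Kirchhoff (connectivity) matrix: -1 for contacts, degree on the diagonal.
    kirchhoff = -(dists <= cutoff).astype(float)
    np.fill_diagonal(kirchhoff, 0.0)
    np.fill_diagonal(kirchhoff, -kirchhoff.sum(axis=1))
    # Predicted fluctuations are proportional to the diagonal of the pseudo-inverse.
    return np.diag(np.linalg.pinv(kirchhoff))
```

For a straight five-residue chain such as `gnm_bfactors([[3.8 * i, 0, 0] for i in range(5)], cutoff=4.0)`, the terminal residues receive larger values than the central one, which matches the intuition that chain ends are more flexible.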
-
NP-TCMtarget: a network pharmacology platform for exploring mechanisms of action of Traditional Chinese medicine
Authors:
Aoyi Wang,
Yingdong Wang,
Haoyang Peng,
Haoran Zhang,
Caiping Cheng,
Jinzhong Zhao,
Wuxia Zhang,
Jianxin Chen,
Peng Li
Abstract:
The biological targets of traditional Chinese medicine (TCM) are the core effectors mediating the interaction between TCM and the human body. Identification of TCM targets is essential to elucidate the chemical basis and mechanisms of TCM for treating diseases. Given the chemical complexity of TCM, both in silico high-throughput drug-target interaction predicting models and biological profile-based methods have been commonly applied for identifying TCM targets based on the structural information of TCM chemical components and biological information, respectively. However, the existing methods lack the integration of TCM chemical and biological information, resulting in difficulty in the systematic discovery of TCM action pathways. To solve this problem, we propose a novel target identification model NP-TCMtarget to explore the TCM target path by combining the overall chemical and biological profiles. First, NP-TCMtarget infers TCM effect targets by calculating associations between drug/disease inducible gene expression profiles and specific gene signatures for 8,233 targets. Then, NP-TCMtarget utilizes a constructed binary classification model to predict binding targets of herbal ingredients. Finally, we can distinguish TCM direct and indirect targets by comparing the effect targets and binding targets to establish the action pathways of herbal components-direct targets-indirect targets by mapping TCM targets in the biological molecular network. We apply NP-TCMtarget to the formula XiaoKeAn to demonstrate the power of revealing the action pathways of herbal formula. We expect that this novel model could provide a systematic framework for exploring the molecular mechanisms of TCM at the target level. NP-TCMtarget is available at http://www.bcxnfz.top/NP-TCMtarget.
Submitted 17 August, 2024;
originally announced August 2024.
-
Deciphering interventional dynamical causality from non-intervention systems
Authors:
Jifan Shi,
Yang Li,
Juan Zhao,
Siyang Leng,
Kazuyuki Aihara,
Luonan Chen,
Wei Lin
Abstract:
Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational criterion, Interventional Embedding Entropy (IEE), to quantify causality. The IEE criterion theoretically and numerically enables the deciphering of IntDC solely from observational (non-interventional) time-series data, without requiring any knowledge of dynamical models or real interventions in the considered system. Demonstrations of performance showed the accuracy and robustness of IEE on benchmark simulated systems as well as real-world systems, including the neural connectomes of C. elegans, COVID-19 transmission networks in Japan, and regulatory networks surrounding key circadian genes.
Submitted 28 June, 2024;
originally announced July 2024.
-
Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Survey
Authors:
Meng Cui,
Xubo Liu,
Haohe Liu,
Jinzheng Zhao,
Daoliang Li,
Wenwu Wang
Abstract:
Digital aquaculture leverages advanced technologies and data-driven methods, providing substantial benefits over traditional aquaculture practices. This paper presents a comprehensive review of three interconnected digital aquaculture tasks, namely, fish tracking, counting, and behaviour analysis, using a novel and unified approach. Unlike previous reviews which focused on single modalities or individual tasks, we analyse vision-based (i.e. image- and video-based), acoustic-based, and biosensor-based methods across all three tasks. We examine their advantages, limitations, and applications, highlighting recent advancements and identifying critical cross-cutting research gaps. The review also includes emerging ideas such as applying multi-task learning and large language models to address various aspects of fish monitoring, an approach not previously explored in aquaculture literature. We identify the major obstacles hindering research progress in this field, including the scarcity of comprehensive fish datasets and the lack of unified evaluation standards. To overcome the current limitations, we explore the potential of using emerging technologies such as multimodal data fusion and deep learning to improve the accuracy, robustness, and efficiency of integrated fish monitoring systems. In addition, we provide a summary of existing datasets available for fish tracking, counting, and behaviour analysis. This holistic perspective offers a roadmap for future research, emphasizing the need for comprehensive datasets and evaluation standards to facilitate meaningful comparisons between technologies and to promote their practical implementations in real-world settings.
Submitted 1 March, 2025; v1 submitted 20 June, 2024;
originally announced June 2024.
-
Drug-target interaction prediction by integrating heterogeneous information with mutual attention network
Authors:
Yuanyuan Zhang,
Yingdong Wang,
Chaoyong Wu,
Lingmin Zhan,
Aoyi Wang,
Caiping Cheng,
Jinzhong Zhao,
Wuxia Zhang,
Jianxin Chen,
Peng Li
Abstract:
Identification of drug-target interactions is an indispensable part of drug discovery. While conventional shallow machine learning and recent deep learning methods based on chemogenomic properties of drugs and target proteins have pushed this prediction performance improvement to a new level, these methods are still difficult to adapt to novel structures. Alternatively, large-scale biological and pharmacological data provide new ways to accelerate drug-target interaction prediction. Here, we propose DrugMAN, a deep learning model for predicting drug-target interaction by integrating multiplex heterogeneous functional networks with a mutual attention network (MAN). DrugMAN uses a graph attention network-based integration algorithm to learn network-specific low-dimensional features for drugs and target proteins by integrating four drug networks and seven gene/protein networks, respectively. DrugMAN then captures interaction information between drug and target representations by a mutual attention network to improve drug-target prediction. DrugMAN achieves the best prediction performance under four different scenarios, especially in real-world scenarios. DrugMAN spotlights heterogeneous information to mine drug-target interactions and can be a powerful tool for drug discovery and drug repurposing.
Submitted 2 April, 2024;
originally announced April 2024.
-
A genome-scale deep learning model to predict gene expression changes of genetic perturbations from multiplex biological networks
Authors:
Lingmin Zhan,
Yuanyuan Zhang,
Yingdong Wang,
Aoyi Wang,
Caiping Cheng,
Jinzhong Zhao,
Wuxia Zhang,
Peng Li,
Jianxin Chen
Abstract:
Systematic characterization of the biological effects of genetic perturbation is essential to the application of molecular biology and biomedicine. However, exhaustively testing genetic perturbations experimentally on a genome-wide scale is challenging. Here, we present TranscriptionNet, a deep learning model that integrates multiple biological networks to systematically predict transcriptional profiles for three types of genetic perturbations, RNA interference (RNAi), clustered regularly interspaced short palindromic repeats (CRISPR), and overexpression (OE), based on transcriptional profiles induced by genetic perturbations in the L1000 project. TranscriptionNet performs better than existing approaches in predicting inducible gene expression changes for all three types of genetic perturbations. TranscriptionNet can predict transcriptional profiles for all genes in existing biological networks and expands the coverage of perturbational gene expression changes for each type of genetic perturbation from a few thousand to 26,945 genes. TranscriptionNet demonstrates strong generalization ability when comparing predicted and true gene expression changes on different external tasks. Overall, TranscriptionNet can systematically predict transcriptional consequences induced by perturbing genes on a genome-wide scale and thus holds promise to systematically detect gene function and enhance drug development and target discovery.
Submitted 5 March, 2024;
originally announced March 2024.
-
Renal function changes in chronic hepatitis B patients
Authors:
Jinhua Zhao,
Lili Wu,
Xiaoan Yang,
Zhilaing Gao,
Hong Deng
Abstract:
The best way to treat chronic hepatitis B is with pegylated interferon alone or with oral antiviral drugs. There is limited research comparing the renal safety of entecavir and tenofovir when used with pegylated interferon. This study compares changes in renal function in chronic hepatitis B patients treated with pegylated interferon and either entecavir or tenofovir. The study included a cohort of 836 patients with chronic hepatitis B (CHB) who received treatment with pegylated interferon (IFN) either alone or in combination with entecavir (ETV) and tenofovir (TDF) between the years 2018 and 2021. Of these patients, 713 were included in a matched analysis comparing outcomes between those who were cured and those who were uncured, while 123 patients received IFN alone as a control group for comparison with the ETV and TDF treatment groups. The primary outcome measured was the change in renal function, specifically estimated glomerular filtration rate (eGFR), cystatin C (CysC), and inorganic phosphorus (IPHOS). Patients were categorized into stage 1 or stage 2 based on a baseline eGFR of less than 90 ml/min/m^2. Results: 125 CHB patients were matched 1:1 in both the combined treatment and cured groups. Baseline eGFR, CysC, and IPHOS levels were similar between the groups. Renal function in stage 1 and stage 2 groups showed a decreasing trend at 48 weeks after an initial increase. Correlation analysis showed significant relationships between changes in ALT and eGFR values at 12 weeks in both non-cured and cured groups. Conclusions: Over the 48-week duration of combined treatment in patients with chronic hepatitis B (CHB), it was found that both Tenofovir Disoproxil Fumarate (TDF) and Entecavir (ETV) did not lead to an increase in renal injury.
Submitted 4 March, 2024;
originally announced March 2024.
-
Pre-Training Protein Bi-level Representation Through Span Mask Strategy On 3D Protein Chains
Authors:
Jiale Zhao,
Wanru Zhuang,
Jia Song,
Yaqi Li,
Shuqi Lu
Abstract:
In recent years, there has been a surge in the development of 3D structure-based pre-trained protein models, representing a significant advancement over pre-trained protein language models in various downstream tasks. However, most existing structure-based pre-trained models primarily focus on the residue level, i.e., alpha carbon atoms, while ignoring other atoms like side chain atoms. We argue that modeling proteins at both residue and atom levels is important since the side chain atoms can also be crucial for numerous downstream tasks, for example, molecular docking. Nevertheless, we find that naively combining residue and atom information during pre-training typically fails. We identify a key reason: the information leakage caused by the inclusion of atom structure in the input, which renders residue-level pre-training tasks trivial and results in insufficiently expressive residue representations. To address this issue, we introduce a span mask pre-training strategy on 3D protein chains to learn meaningful representations of both residues and atoms. This leads to a simple yet effective approach to learning protein representation suitable for diverse downstream tasks. Extensive experimental results on binding site prediction and function prediction tasks demonstrate that our proposed pre-training approach significantly outperforms other methods. Our code will be made public.
Submitted 2 June, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Exploring the Impacts of Land Use/Cover Change on Ecosystem Services in Multiple Scenarios --The Case of Sichuan-Chongqing Region, China
Authors:
Ran Chen,
Jing Zhao,
Xiaomin Luo,
Xinxue Yan,
Xi Zheng,
Yijun Mao,
Xiaoping Fu,
Xueqi Yao,
Sijia Jiang
Abstract:
To improve the environment of the ecosystem, China has implemented the Green-for-Grain Program for two decades, which has resulted in an imbalance among ecology, economy and food. This study focuses on the "ecology-food" imbalance problem, taking Sichuan-Chongqing Region as an example, to set up future scenarios to predict the distribution of ecosystem services (ESs). We first forecast land use/cover change in 2050 under four different scenarios: Natural Development Scenarios; Arable Land Conservation Scenarios; Ecological Priority Scenarios; Ecology-Arable Land Harmonization Scenarios. Then we assess changes in five ESs: habitat quality, crop production, soil conservation, water yield, and carbon storage from 1990 to 2020 and 2050. Finally, we reveal the spatial distribution of ESs. The following conclusions are obtained: (1) From 1990-2020, CS, SC, and HQ reveal an increasing trend with growth rates of 1.68%, 0.08%, and 0.46%; CP reveals a reduction rate of 2.75%. (2) S4 has an increase in arable land, and CP has increased by 7.56% compared to S1, reversing the trend of reduced CP under S1. (3) The high-high anomalies area of CP under S4 is basically the same as that under S2, which proves that S4 is a scenario policy that can be referred to for future development.
Submitted 19 December, 2023;
originally announced January 2024.
-
Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics
Authors:
Chen Zhao,
Kuan-Jui Su,
Chong Wu,
Xuewei Cao,
Qiuying Sha,
Wu Li,
Zhe Luo,
Tian Qin,
Chuan Qiu,
Lan Juan Zhao,
Anqi Liu,
Lindong Jiang,
Xiao Zhang,
Hui Shen,
Weihua Zhou,
Hong-Wen Deng
Abstract:
Background: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies. Method: In this study, we propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites. Our approach utilizes a multi-view variational autoencoder to jointly model the burden score, polygenetic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and missing metabolomics data imputation. By learning the latent representations of both omics data, our method can effectively impute missing metabolomics values based on genomic information. Results: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority compared to conventional imputation techniques. Using 35 template metabolites derived burden scores, PGS and LD-pruned SNPs, the proposed methods achieved R^2-scores > 0.01 for 71.55% of metabolites. Conclusion: The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of utilizing WGS data for metabolomics data imputation and underscore the importance of leveraging multi-modal data integration in precision medicine research.
Submitted 12 March, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
A Radiomics-Incorporated Deep Ensemble Learning Model for Multi-Parametric MRI-based Glioma Segmentation
Authors:
Yang Chen,
Zhenyu Yang,
Jingtong Zhao,
Justus Adamson,
Yang Sheng,
Fang-Fang Yin,
Chunhao Wang
Abstract:
We developed a deep ensemble learning model with a radiomics spatial encoding execution for improved glioma segmentation accuracy using multi-parametric MRI (mp-MRI). This model was developed using 369 glioma patients with a 4-modality mp-MRI protocol: T1, contrast-enhanced T1 (T1-Ce), T2, and FLAIR. In each modality volume, a 3D sliding kernel was implemented across the brain to capture image heterogeneity: fifty-six radiomic features were extracted within the kernel, resulting in a 4th order tensor. Each radiomic feature can then be encoded as a 3D image volume, namely a radiomic feature map (RFM). PCA was employed for data dimension reduction and the first 4 PCs were selected. Four deep neural networks as sub-models following the U-Net architecture were trained for the segmenting of a region-of-interest (ROI): each sub-model utilizes the mp-MRI and 1 of the 4 PCs as a 5-channel input for a 2D execution. The 4 softmax probability results given by the U-net ensemble were superimposed and binarized by Otsu method as the segmentation result. Three ensemble models were trained to segment enhancing tumor (ET), tumor core (TC), and whole tumor (WT). The adopted radiomics spatial encoding execution enriches the image heterogeneity information that leads to the successful demonstration of the proposed deep ensemble model, which offers a new tool for mp-MRI based medical image segmentation.
Submitted 18 March, 2023;
originally announced March 2023.
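The final fusion step described in the abstract above, combining the sub-models' softmax outputs and binarizing with Otsu's method, might look roughly like the sketch below. Averaging is assumed here as the way of "superimposing" the maps, and the function names, bin count, and probability range are hypothetical choices rather than details taken from the paper.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's method: pick the cut that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)            # probability mass at or below each bin
    w1 = 1.0 - w0                      # probability mass above each bin
    mu = np.cumsum(weights * centers)  # cumulative (unnormalized) mean
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mu[-1] * w0 - mu) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    # Use the upper edge of the best bin, so that bin stays in the background.
    return edges[1:][np.argmax(between)]

def ensemble_mask(prob_maps):
    """Average foreground probabilities from several models, then binarize.

    Averaging is an assumption; the abstract only says the maps are superimposed.
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return mean_prob >= otsu_threshold(mean_prob.ravel())
```

Calling `ensemble_mask` with four 2D probability maps of equal shape returns a boolean mask of that shape. Summing instead of averaging would give the same mask once the histogram range is adjusted, since Otsu's criterion depends only on how the values separate into two groups.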
-
Federated attention consistent learning models for prostate cancer diagnosis and Gleason grading
Authors:
Fei Kong,
Xiyue Wang,
Jinxi Xiang,
Sen Yang,
Xinran Wang,
Meng Yue,
Jun Zhang,
Junhan Zhao,
Xiao Han,
Yuhan Dong,
Biyue Zhu,
Fang Wang,
Yueping Liu
Abstract:
Artificial intelligence (AI) holds significant promise in transforming medical imaging, enhancing diagnostics, and refining treatment strategies. However, the reliance on extensive multicenter datasets for training AI models poses challenges due to privacy concerns. Federated learning provides a solution by facilitating collaborative model training across multiple centers without sharing raw data. This study introduces a federated attention-consistent learning (FACL) framework to address challenges associated with large-scale pathological images and data heterogeneity. FACL enhances model generalization by maximizing attention consistency between local clients and the server model. To ensure privacy and validate robustness, we incorporated differential privacy by introducing noise during parameter transfer. We assessed the effectiveness of FACL in cancer diagnosis and Gleason grading tasks using 19,461 whole-slide images of prostate cancer from multiple centers. In the diagnosis task, FACL achieved an area under the curve (AUC) of 0.9718, outperforming seven centers with an average AUC of 0.9499 when categories are relatively balanced. For the Gleason grading task, FACL attained a Kappa score of 0.8463, surpassing the average Kappa score of 0.7379 from six centers. In conclusion, FACL offers a robust, accurate, and cost-effective AI training model for prostate cancer pathology while maintaining effective data safeguards.
Submitted 28 March, 2024; v1 submitted 12 February, 2023;
originally announced February 2023.
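The abstract above mentions adding noise during parameter transfer for differential privacy. One common way to do this is to clip each client's update and add Gaussian noise before averaging on the server, sketched below. The clipping step, the Gaussian mechanism, the default values, and the function names are all assumptions for illustration and are not taken from the FACL paper.

```python
import numpy as np

def privatize_update(params, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip a client's parameter vector and add Gaussian noise before upload.

    clip_norm and noise_std are illustrative values, not calibrated to any
    particular privacy budget.
    """
    rng = np.random.default_rng() if rng is None else rng
    params = np.asarray(params, dtype=float)
    norm = np.linalg.norm(params)
    # Scale down only when the vector exceeds the clipping norm.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return params * scale + rng.normal(0.0, noise_std, size=params.shape)

def federated_average(client_params):
    """Server-side step: element-wise mean of the uploaded client parameters."""
    return np.mean(np.stack(client_params, axis=0), axis=0)
```

A round would then amount to `federated_average([privatize_update(p) for p in client_params])`. Larger `noise_std` gives stronger privacy at the cost of a noisier global model, which is the trade-off the robustness evaluation in the abstract is concerned with.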
-
Deep Learning Approach to Predict Hemorrhage in Moyamoya Disease
Authors:
Meng Zhao,
Yonggang Ma,
Qian Zhang,
Jizong Zhao
Abstract:
Objective: Reliable tools to predict moyamoya disease (MMD) patients at risk for hemorrhage could have significant value. The aim of this paper is to develop three machine learning classification algorithms to predict hemorrhage in moyamoya disease.
Methods: Clinical data of consecutive MMD patients who were admitted to our hospital between 2009 and 2015 were reviewed. Demographics, clinical, radiographic data were analyzed to develop artificial neural network (ANN), support vector machine (SVM), and random forest models.
Results: We extracted 33 parameters, including 11 demographic and 22 radiographic features as input for model development. Of all compared classification results, ANN achieved the highest overall accuracy of 75.7% (95% CI, 68.6%-82.8%), followed by SVM with 69.2% (95% CI, 56.9%-81.5%) and random forest with 70.0% (95% CI, 57.0%-83.0%).
Conclusions: The proposed ANN framework can be a potential effective tool to predict the possibility of hemorrhage among adult MMD patients based on clinical information and radiographic features.
Submitted 31 January, 2023;
originally announced February 2023.
-
Efficient Cavity Searching for Gene Network of Influenza A Virus
Authors:
Junjie Li,
Jietong Zhao,
Yanqing Su,
Jiahao Shen,
Yaohua Liu,
Xinyue Fan,
Zheng Kou
Abstract:
High order structures (cavities and cliques) of the gene network of influenza A virus reveal tight associations among viruses during evolution and are key signals that indicate viral cross-species infection and cause pandemics. As indicators for sensing the dynamic changes of viral genes, these higher order structures have been the focus of attention in the field of virology. However, the size of the viral gene network is usually huge, and searching these structures in the networks introduces unacceptable delay. To mitigate this issue, in this paper, we propose a simple-yet-effective model named HyperSearch based on deep learning to search cavities in a computable complex network for influenza virus genetics. Extensive experiments conducted on a public influenza virus dataset demonstrate the effectiveness of HyperSearch over other advanced deep-learning methods without any elaborated model crafting. Moreover, HyperSearch can finish the search works in minutes while 0-1 programming takes days. Since the proposed method is simple and easy to be transferred to other complex networks, HyperSearch has the potential to facilitate the monitoring of dynamic changes in viral genes and help humans keep up with the pace of virus mutations.
Submitted 5 November, 2022;
originally announced November 2022.
-
Metabolic scaling is governed by Murray's network in animals and by hydraulic conductance and photosynthesis in plants
Authors:
Jinkui Zhao
Abstract:
The prevailing theory for metabolic scaling is based on area-preserved, space-filling fractal vascular networks. However, it's known both theoretically and experimentally that animals' vascular systems obey Murray's cubic branching law. Area-preserved branching conflicts with energy minimization and hence the least-work principle. Additionally, while Kleiber's law is the dominant rule for both animals and plants, small animals are observed to follow the 2/3-power law, large animals have larger than 3/4 scaling exponents, and small plants have near-linear scaling behaviors. No known theory explains all the observations. Here, I show that animals' metabolism is determined by their Murray's vascular systems. For plants, the scaling is determined by the trunks' hydraulic conductance and the leaves' photosynthesis. Both analyses agree with data of various body sizes. Animals' scaling has a concave curvature while plants have a convex one. The empirical power laws are approximations within selected mass ranges. Generally, the 3/4-power law applies to animals of ~15 g to 10,000 kg and the 2/3-power law to those of ~1 g to 10 kg. For plants, the scaling exponent is 1 for small plants and decreases to 3/4 for those greater than ~10 kg.
Submitted 30 May, 2022;
originally announced May 2022.
-
Retro Drug Design: From Target Properties to Molecular Structures
Authors:
Yuhong Wang,
Sam Michael,
Ruili Huang,
Jinghua Zhao,
Katlin Recabo,
Danielle Bougie,
Qiang Shu,
Paul Shinn,
Hongmao Sun
Abstract:
To generate drug molecules of desired properties with computational methods is the holy grail in pharmaceutical research. Here we describe an AI strategy, retro drug design, or RDD, to generate novel small molecule drugs from scratch to meet predefined requirements, including but not limited to biological activity against a drug target, and an optimal range of physicochemical and ADMET properties. Traditional predictive models were first trained over experimental data for the target properties, using an atom typing based molecular descriptor system, ATP. A Monte Carlo sampling algorithm was then used to find the solutions in the ATP space defined by the target properties, and the Seq2Seq deep learning model was employed to decode molecular structures from the solutions. To test the feasibility of the algorithm, we challenged RDD to generate novel drugs that can activate the μ opioid receptor (MOR) and penetrate the blood brain barrier (BBB). Starting from vectors of random numbers, RDD generated 180,000 chemical structures, of which 78% were chemically valid. About 42,000 (31%) of the valid structures fell into the property space defined by MOR activity and BBB permeability. Out of the 42,000 structures, only 267 chemicals were commercially available, indicating a high extent of novelty of the AI-generated compounds. We purchased and assayed 96 compounds, of which 25 were found to be MOR agonists. These compounds also have excellent BBB scores. The results presented in this paper illustrate that RDD has the potential to revolutionize the current drug discovery process and create novel structures with multiple desired properties, including biological functions and ADMET properties. The availability of an AI-enabled fast track in drug discovery is essential to cope with emergent public health threats, such as the COVID-19 pandemic.
Submitted 11 May, 2021;
originally announced May 2021.
-
Closed-loop spiking control on a neuromorphic processor implemented on the iCub
Authors:
Jingyue Zhao,
Nicoletta Risi,
Marco Monforte,
Chiara Bartolozzi,
Giacomo Indiveri,
Elisa Donati
Abstract:
Although neuromorphic engineering promises the deployment of low-latency, adaptive, and low-power systems that can lead to the design of truly autonomous artificial agents, a fully neuromorphic artificial agent has yet to be developed. While neuromorphic sensing and perception, as well as decision-making systems, are now mature, the control and actuation part is lagging behind. In this paper, we present a closed-loop motor controller implemented on mixed-signal analog-digital neuromorphic hardware using a spiking neural network. The network performs a proportional control action by encoding target, feedback, and error signals using a spiking relational network. It continuously calculates the error through a connectivity pattern, which relates the three variables by means of feed-forward connections. Recurrent connections within each population are used to speed up the convergence, decrease the effect of mismatch, and improve selectivity. The neuromorphic motor controller is interfaced with the iCub robot simulator. We tested our spiking P controller in a single joint control task, specifically for the robot head yaw. The spiking controller sends the target positions, reads the motor state from its encoder, and sends back the motor commands to the joint. The performance of the spiking controller is tested in a step response experiment and in a target pursuit task. In this work, we optimize the network structure to make it more robust to noisy inputs and device mismatch, which leads to better control performance.
Submitted 1 September, 2020;
originally announced September 2020.
-
A matter of time: Using dynamics and theory to uncover mechanisms of transcriptional bursting
Authors:
Nicholas C. Lammers,
Yang Joon Kim,
Jiaxi Zhao,
Hernan G. Garcia
Abstract:
Eukaryotic transcription generally occurs in bursts of activity lasting minutes to hours; however, state-of-the-art measurements have revealed that many of the molecular processes that underlie bursting, such as transcription factor binding to DNA, unfold on timescales of seconds. This temporal disconnect lies at the heart of a broader challenge in physical biology of predicting transcriptional outcomes and cellular decision-making from the dynamics of underlying molecular processes. Here, we review how new dynamical information about the processes underlying transcriptional control can be combined with theoretical models that predict not only averaged transcriptional dynamics, but also their variability, to formulate testable hypotheses about the molecular mechanisms underlying transcriptional bursting and control.
Submitted 17 December, 2020; v1 submitted 20 August, 2020;
originally announced August 2020.
-
Human Mobility Trends during the COVID-19 Pandemic in the United States
Authors:
Minha Lee,
Jun Zhao,
Qianqian Sun,
Yixuan Pan,
Weiyi Zhou,
Chenfeng Xiong,
Lei Zhang
Abstract:
In March of this year, COVID-19 was declared a pandemic and it continues to threaten public health. This global health crisis imposes limitations on daily movements, which have deteriorated every sector in our society. Understanding public reactions to the virus and the non-pharmaceutical interventions should be of great help to fight COVID-19 in a strategic way. We aim to provide tangible evidence of the human mobility trends by comparing the day-by-day variations across the U.S. Large-scale public mobility at an aggregated level is observed by leveraging mobile device location data and the measures related to social distancing. Our study captures spatial and temporal heterogeneity as well as the sociodemographic variations regarding the pandemic propagation and the non-pharmaceutical interventions. All mobility metrics adapted capture decreased public movements after the national emergency declaration. The population staying home has increased in all states and becomes more stable after the stay-at-home order with a smaller range of fluctuation. There exists overall mobility heterogeneity between the income or population density groups. The public had been taking active responses, voluntarily staying home more, to the in-state confirmed cases while the stay-at-home orders stabilize the variations. The study suggests that the public mobility trends conform with the government message urging to stay home. We anticipate our data-driven analysis offers integrated perspectives and serves as evidence to raise public awareness and, consequently, reinforce the importance of social distancing while assisting policymakers.
Submitted 6 May, 2020; v1 submitted 3 May, 2020;
originally announced May 2020.
-
Modeling Epidemic Spreading through Public Transit using Time-Varying Encounter Network
Authors:
Baichuan Mo,
Kairui Feng,
Yu Shen,
Clarence Tam,
Daqing Li,
Yafeng Yin,
Jinhua Zhao
Abstract:
Passenger contact in public transit (PT) networks can be a key mediator in the spreading of infectious diseases. This paper proposes a time-varying weighted PT encounter network to model the spreading of infectious diseases through PT systems. Social activity contacts at both local and global levels are also considered. We select the epidemiological characteristics of coronavirus disease 2019 (COVID-19) as a case study, along with smart card data from Singapore, to illustrate the model at the metropolitan level. A scalable and lightweight theoretical framework is derived to capture the time-varying and heterogeneous network structures, which makes it possible to solve the problem at the whole population level with low computational costs. Different control policies from both the public health side and the transportation side are evaluated. We find that people's preventative behavior is one of the most effective measures to control the spreading of epidemics. From the transportation side, partial closure of bus routes helps to slow down but cannot fully contain the spreading of epidemics. Identifying "influential passengers" using the smart card data and isolating them at an early stage can also effectively reduce the epidemic spreading.
Submitted 21 November, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
Enhanced diffusion and enzyme dissociation
Authors:
Ah-Young Jee,
Kuo Chen,
Tsvi Tlusty,
Jiang Zhao,
Steve Granick
Abstract:
The concept that catalytic enzymes can act as molecular machines transducing chemical activity into motion has conceptual and experimental support, but much of the claimed support comes from experimental conditions where the substrate concentration is higher than biologically relevant and accordingly exceeds kM, the Michaelis-Menten constant. Moreover, many of the enzymes studied experimentally to date are oligomeric. Urease, a hexamer of subunits, has been considered to be the gold standard demonstrating enhanced diffusion. Here we show that urease and certain other oligomeric enzymes of high catalytic activity above kM dissociate into their smaller subunit fragments that diffuse more rapidly, thus providing a simple physical mechanism of enhanced diffusion in this regime of concentrations. Mindful that this conclusion may be controversial, our findings are supported by four independent analytical techniques, static light scattering, dynamic light scattering (DLS), size-exclusion chromatography (SEC), and fluorescence correlation spectroscopy (FCS). Data for urease are presented in the main text and the conclusion is validated for hexokinase and acetylcholinesterase with data presented in supplementary information. For substrate concentration regimes below kM at which these enzymes do not dissociate, our findings from both FCS and DLS validate that enzymatic catalysis does lead to the enhanced diffusion phenomenon.
Submitted 30 September, 2019; v1 submitted 7 September, 2019;
originally announced September 2019.
-
Shared Causal Paths underlying Alzheimer's dementia and Type 2 Diabetes
Authors:
Zixin Hu,
Rong Jiao,
Jiucun Wang,
Panpan Wang,
Yun Zhu,
Jinying Zhao,
Phil De Jager,
David A Bennett,
Li Jin,
Momiao Xiong
Abstract:
Background: Although Alzheimer's disease (AD) is a central nervous system disease and type 2 diabetes mellitus (T2DM) is a metabolic disorder, an increasing number of genetic epidemiological studies show a clear link between AD and T2DM. The current approach to uncovering the shared pathways between AD and T2DM involves association analysis; however, such analyses lack the power to discover the mechanisms of the diseases. Methods: We develop novel statistical methods to shift the current paradigm of genetic analysis from association analysis to deep causal inference for uncovering the shared mechanisms between AD and T2DM, and develop pipelines to infer multilevel omics causal networks, which shift the current paradigm from genetic analysis alone to integrated causal genomic, epigenomic, transcriptional and phenotypic data analysis. To discover common causal paths from genetic variants to AD and T2DM, we also develop algorithms that can automatically search the causal paths from genetic variants to diseases and Results: The proposed methods and algorithms are applied to the ROSMAP dataset with 432 individuals who simultaneously had genotype, RNA-seq, DNA methylation and some phenotypes. We construct multi-omics causal networks and identify 13 shared causal genes, 16 shared causal pathways between AD and T2DM, and 754 gene expression and 101 gene methylation nodes that were connected to both AD and T2DM in multi-omics causal networks. Conclusions: Applying the proposed pipelines for identifying causal paths to real data on AD and T2DM provided strong evidence to support the link between AD and T2DM and unraveled a causal mechanism to explain this link.
Submitted 16 January, 2019;
originally announced January 2019.
-
Dynamics of cardiac re-entry in micro-CT and serial histological sections based models of mammalian hearts
Authors:
Girish S. Ramlugun,
Belvin Thomas,
Vadim N. Biktashev,
Diane P. Fraser,
Ian J. LeGrice,
Bruce H. Smaill,
Jichao Zhao,
Irina V. Biktasheva
Abstract:
Cardiac re-entry regimes of self-organised abnormal synchronisation underlie dangerous arrhythmias and fatal fibrillation. Recent advances in the theory of dissipative vortices, experimental studies, and anatomically realistic computer simulations have elucidated the role of cardiac re-entry interaction with fine anatomical features in the heart, and anatomy induced drift. The fact that the anatomy and structural anisotropy of the heart are consistent within a species suggests a possible functional effect on spontaneous drift of cardiac re-entry. A comparative study of the anatomy induced drift could be used to predict the evolution of atrial arrhythmia, and to improve low-voltage defibrillation protocols and ablation strategies. Here, in a micro-CT based model of rat pulmonary vein wall, and in sheep atria models based on high resolution serial histological sections, we demonstrate the effects of heart geometry and anisotropy on cardiac re-entry anatomy induced drift and its pinning to fluctuations of thickness in the layer. The data sets of sheep atria and rat pulmonary vein wall are incorporated into the BeatBox High Performance Computing simulation environment. Re-entry is initiated at prescribed locations in the spatially homogeneous mono-domain models of cardiac tissue. Excitation is described by FitzHugh-Nagumo kinetics. In the in-silico models, isotropic and anisotropic conduction show specific anatomy effects and the interplay between anatomy and anisotropy of the heart. The main objective is to demonstrate the functional role of the species' heart geometry and anisotropy in cardiac re-entry anatomy induced drift. In the case of the rat pulmonary vein wall with ~90 degree transmural fibre rotation, it is shown that the joint effect of the PV wall geometry and anisotropy turns a plane excitation wave into a re-entry pinned to a small fluctuation of thickness in the wall.
Submitted 4 September, 2018;
originally announced September 2018.
-
Molecular Computing for Markov Chains
Authors:
Chuan Zhang,
Ziyuan Shen,
Wei Wei,
Jing Zhao,
Zaichen Zhang,
Xiaohu You
Abstract:
In this paper, we present a methodology for implementing arbitrarily constructed time-homogeneous Markov chains with biochemical systems. Both discrete-time and continuous-time Markov chains can be computed. By employing chemical reaction networks (CRNs) as a programmable language, molecular concentrations serve to denote both input and output values. One reaction network is elaborately designed for each chain. The evolution of species' concentrations over time closely matches the transient solutions of the target continuous-time Markov chain, while equilibrium concentrations indicate the steady-state probabilities. Additionally, second-order Markov chains are considered for implementation, with bimolecular reactions rather than unary ones. An original scheme is put forward to compile unimolecular systems to DNA strand displacement reactions for the sake of future physical implementations. Deterministic, stochastic and DNA simulations are provided to demonstrate correctness, validity and feasibility.
Submitted 14 February, 2018;
originally announced February 2018.
-
Computational Modelling of Aquaporin Co-regulation in Cancer
Authors:
Junzhe Zhao,
Benjamin A. Hall
Abstract:
A computational model of aquaporin regulation in cancer cells has been constructed as a Qualitative Network in the software BioModelAnalyzer (BMA). The model connects some important aquaporins expressed in human cancer to common phenotypes via a number of fundamental, dysregulated signalling pathways. Based on over 60 publications, this model can not only reproduce the results reported in a discrete, qualitative manner, but also reconcile the seemingly incompatible phenotype with research consensus by suggesting molecular mechanisms accountable for it. Novel predictions have also been made by mimicking real-life experiments in the model.
Submitted 4 January, 2018; v1 submitted 21 December, 2017;
originally announced December 2017.
-
Kinetic Modelling and Inference of Hyperpolarized 13C Molecules in Cancer Metabolism
Authors:
Junzhe Zhao
Abstract:
Hyperpolarized 13C-MRI allows real time observation of metabolism in vivo. Imaging sequences have been developed to follow the metabolism of [1-13C] pyruvate and extract reaction kinetics, which can show tumour treatment response. We applied the fitting model and algorithm for the imaging data of mice tumour models and determined error estimates for the parameters of interest. Data was least-squares fitted onto a two-site exchange model in MATLAB, followed by statistic computation to assess model performance. Inference through the application of MCMC was also performed. The modelling and inference process extracted quantitative information satisfactorily and reproducibly, demonstrating metabolic activity and intratumour heterogeneity. Finally, novel fitting methods were evaluated and further recommendations were made.
Submitted 4 January, 2018; v1 submitted 20 December, 2017;
originally announced December 2017.
-
Predicting potential treatments for complex diseases based on miRNA and tissue specificity
Authors:
Liang Yu,
Jin Zhao,
Lin Gao
Abstract:
Drug repositioning is the task of finding new uses for existing drugs to treat more patients. Cumulative studies demonstrate that mature miRNAs as well as their precursors can be targeted by small molecular drugs. At the same time, human diseases result from the disordered interplay of tissue- and cell lineage-specific processes. However, few computational studies predict potential drug-disease relationships based on miRNA data and tissue specificity. Therefore, based on miRNA data and the tissue specificity of diseases, we propose a new method named miTS to predict potential treatments for diseases. First, based on miRNA data, target genes and information on FDA-approved drugs, we evaluate the relationships between miRNAs and drugs in the tissue-specific PPI network. Then, we construct a tripartite network: drug-miRNA-disease. Finally, we obtain the potential drug-disease associations based on the tripartite network. In this paper, we take breast cancer as a case study and focus on the top-30 predicted drugs. Of these, 25 (83.3%) have known connections with breast cancer in the CTD benchmark, and the other 5 are potential drugs for breast cancer. We further evaluate the 5 newly predicted drugs using clinical records, literature mining, KEGG pathway enrichment analysis and overlapping genes between enriched pathways. For each of the 5 new drugs, strong supporting evidence can be found in three or more of these aspects. In particular, Regorafenib has 15 overlapping KEGG pathways with breast cancer, and their p-values are all very small. In addition, in both literature curation and clinical validation, Regorafenib has a strong correlation with breast cancer. These results suggest that Regorafenib is likely to be a truly effective drug, worthy of further study, and that our method miTS is effective and practical for predicting new drug indications.
Submitted 30 June, 2017;
originally announced August 2017.
-
Exponential distance distribution of connected neurons in simulations of two-dimensional in vitro neural network development
Authors:
Zhi-Song lv,
Chen-Ping Zhu,
Pei Nie,
Jing Zhao,
Hui-Jie Yang,
Yan-Jun Wang,
Chin-Kun Hu
Abstract:
The distribution of the geometric distances of connected neurons is a practical factor underlying neural networks in the brain. It can affect the brain's dynamic properties at the ground level. Karbowski derived a power-law decay distribution that has not yet been verified by experiment. In this work, we check its validity using simulations with a phenomenological model. Based on the in vitro two-dimensional development of neural networks in culture vessels by Ito, we match the synapse number saturation time to obtain suitable parameters for the development process, then determine the distribution of distances between connected neurons under such conditions. Our simulations obtain a clear exponential distribution instead of a power-law one, which indicates that Karbowski's conclusion is invalid, at least for the case of in vitro neural network development in two-dimensional culture vessels.
Submitted 11 February, 2017;
originally announced February 2017.
-
Predicting the kinetics of RNA oligonucleotides using Markov state models
Authors:
Giovanni Pinamonti,
Jianbo Zhao,
David E. Condon,
Fabian Paul,
Frank Noé,
Douglas H. Turner,
Giovanni Bussi
Abstract:
Nowadays different experimental techniques, such as single molecule or relaxation experiments, can provide dynamic properties of biomolecular systems, but the amount of detail obtainable with these methods is often limited in terms of time or spatial resolution. Here we use state-of-the-art computational techniques, namely atomistic molecular dynamics and Markov state models, to provide insight into the rapid dynamics of short RNA oligonucleotides, in order to elucidate the kinetics of stacking interactions. Analysis of multiple microsecond-long simulations indicates that the main relaxation modes of such molecules can consist of transitions between alternative folded states, rather than between random coils and native structures. After properly removing structures that are artificially stabilized by known inaccuracies of the current RNA AMBER force field, the kinetic properties predicted are consistent with the timescales of previously reported relaxation experiments.
Submitted 22 December, 2016;
originally announced December 2016.
-
Nonequilibrium and nonlinear kinetics as key determinants for bistability in fission yeast G2-M transition
Authors:
De Zhao,
Teng Wang,
Jian Zhao,
Dianjie Li,
Zhili Lin,
Zeyan Chen,
Qi Ouyang,
Hong Qian,
Yu V. Fu,
Fangting Li
Abstract:
A living cell is an open, nonequilibrium biochemical system where ATP hydrolysis serves as the energy source for a wide range of intracellular processes, possibly including the assurance for decision-making. In the fission yeast cell cycle, the transition from G2 to M phase is driven by the activation of Cdc13/Cdc2 and Cdc25 and the deactivation of Wee1 through phosphorylation-dephosphorylation cycles with feedback loops. Here, we present a kinetic description of the G2-M circuit which reveals that both cellular ATP level and ATP hydrolysis free energy critically control Cdc2 activation. Using fission yeast nucleoplasmic extract (YNPE), we experimentally verify that increased ATP level drives the activation of Cdc2 which exhibits bistability and hysteresis in response to changes in cellular ATP level and ATP hydrolysis energy. These findings suggest that cellular ATP level and ATP hydrolysis energy are determinants of the bistability and robustness of Cdc2 activation during G2-M transition.
Submitted 6 May, 2024; v1 submitted 30 October, 2016;
originally announced October 2016.
-
A common origin for 3/4- and 2/3-power rules in metabolic scaling
Authors:
Jinkui Zhao
Abstract:
A central debate in biology has been the allometric scaling of metabolic rate. Kleiber's observation that animals' basal metabolic rate scales with the 3/4 power of body mass (Kleiber's rule) has been the prevailing hypothesis for the last eight decades. Increasingly, evidence supports the alternative 2/3-power scaling rule, especially for smaller animals. The 2/3-rule dates back to before Kleiber's time and was thought to originate from the surface-to-volume relationship in Euclidean geometry. In this study, we show that both the 3/4- and 2/3-scaling rules in fact have one common origin. They are governed by animals' nutrient supply networks, that is, their vascular systems, which obey Murray's law. Murray's law describes the branching pattern of an energy-optimized vascular network under laminar flow. It is generally regarded as being closely followed by blood vessels. Our analysis agrees with experimental observations and recent numerical analyses that showed a curvature in metabolic scaling. When applied to metabolic data, our model accurately produces the observed 2/3-scaling rule for small animals of ~10 kg or less and the 3/4-rule for all animals excluding the smallest ones (~15 g). The model has broad implications for the ongoing debate. It proves that both the 3/4- and 2/3-exponents are phenomenological approximations of the same scaling rule within their applicable mass ranges, and that the 2/3-rule does not originate from the classical surface law.
Submitted 29 September, 2015;
originally announced September 2015.
-
Description and comparison of algorithms for correcting anisotropic magnification in cryo-EM images
Authors:
Jianhua Zhao,
Marcus A. Brubaker,
Samir Benlekbir,
John L. Rubinstein
Abstract:
Single particle electron cryomicroscopy (cryo-EM) allows for structures of proteins and protein complexes to be determined from images of non-crystalline specimens. Cryo-EM data analysis requires electron microscope images of randomly oriented ice-embedded protein particles to be rotated and translated to allow for coherent averaging when calculating three-dimensional (3D) structures. Rotation of 2D images is usually done with the assumption that the magnification of the electron microscope is the same in all directions. However, due to electron optical aberrations, this condition is not met with some electron microscopes when used with the settings necessary for cryo-EM with a direct detector device (DDD) camera. Correction of images by linear interpolation in real space has allowed high-resolution structures to be calculated from cryo-EM images for symmetric particles. Here we describe and compare a simple real space method, a simple Fourier space method, and a somewhat more sophisticated Fourier space method to correct images for a measured anisotropy in magnification. Further, anisotropic magnification causes contrast transfer function (CTF) parameters estimated from image power spectra to have an apparent systematic astigmatism. To address this problem we develop an approach to adjust CTF parameters measured from distorted images so that they can be used with corrected images. The effect of anisotropic magnification on CTF parameters provides a simple way of detecting magnification anisotropy in cryo-EM datasets.
Submitted 7 August, 2015; v1 submitted 23 January, 2015;
originally announced January 2015.
-
Structuring research methods and data with the Research Object model: genomics workflows as a case study
Authors:
Kristina M. Hettne,
Harish Dharuri,
Jun Zhao,
Katherine Wolstencroft,
Khalid Belhajjame,
Stian Soiland-Reyes,
Eleni Mina,
Mark Thompson,
Don Cruickshank,
Lourdes Verdes-Montenegro,
Julian Garrido,
David de Roure,
Oscar Corcho,
Graham Klyne,
Reinout van Schouwen,
Peter A. C. 't Hoen,
Sean Bechhofer,
Carole Goble,
Marco Roos
Abstract:
One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide necessary meta-data for a scientist to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows.
Submitted 19 September, 2014; v1 submitted 12 November, 2013;
originally announced November 2013.
-
Three faces of metabolites: Pathways, localizations and network positions
Authors:
Jing Zhao,
Petter Holme
Abstract:
To understand the system-wide organization of metabolism, different lines of study have devised different categorizations of metabolites. The relationships and differences between categories can provide new insights for a more detailed description of the organization of metabolism. In this study, we investigate the relative organization of three categorizations of metabolites -- pathways, subcellular localizations and network clusters -- by block-model techniques borrowed from social-network studies, and further characterize the categories from a topological point of view. The picture of metabolism we obtain is that of peripheral modules, characterized both by being dense network clusters and by being localized to organelles, connected by a central, highly connected core. Pathways typically run through several network clusters and localizations, connecting them laterally. The strong overlap between organelles and network clusters suggests that these are natural "modules" -- relatively independent sub-systems. The different categorizations divide the core metabolism differently, suggesting that the core should not, if possible, be treated as a module on par with the organelles. Although they overlap more than expected by chance, none of the pathways corresponds very closely to a network cluster or localization. This, we believe, highlights the benefits of different orthogonal classifications and of future experimental categorizations based on simple principles.
Submitted 22 July, 2009;
originally announced July 2009.
-
Response delay as a strategy for survival in fluctuating environment
Authors:
Xiao chuan Xue,
Jinhua Zhao,
Fei Liu,
Zhong-can Ou-Yang
Abstract:
Response time-delay is a ubiquitous phenomenon in biological systems. Here we use a simple stochastic population model with time-delayed switching-rate conversion to quantitatively study the influence of the response time-delay on the survival fitness of cells living in periodically fluctuating and in stochastic environments. Our calculations and simulations show that for cells with a slow rate of transition into the fit phenotype and a fast rate of transition into the unfit phenotype, the response time-delay can always enhance their fitness during environmental change. In particular, in periodic or stochastic environments with small variance, the optimal fitness achieved by these cells is superior to that of cells with the reverse switching rates, even if the latter respond rapidly. These results suggest that the response time-delay may be utilized by cells to enhance their adaptation to fluctuating environments.
Submitted 12 April, 2009;
originally announced April 2009.
-
Modular co-evolution of metabolic networks
Authors:
Jing Zhao,
Guo-Hui Ding,
Lin Tao,
Hong Yu,
Zhong-Hao Yu,
Jian-Hua Luo,
Zhi-Wei Cao,
Yi-Xue Li
Abstract:
The architecture of biological networks has been reported to exhibit a high level of modularity, and to some extent, topological modules of networks overlap with known functional modules. However, how the modular topology of the molecular network affects the evolution of its member proteins remains unclear. In this work, the functional and evolutionary modularity of the Homo sapiens (H. sapiens) metabolic network was investigated from a topological point of view. Network decomposition shows that the metabolic network is organized in a highly modular core-periphery way, in which the core modules are tightly linked together and perform basic metabolic functions, whereas the periphery modules interact with only a few modules and accomplish relatively independent and specialized functions. Moreover, over half of the modules exhibit co-evolutionary features and belong to specific evolutionary ages. Peripheral modules tend to evolve more cohesively and faster than core modules do. The correlation between functional, evolutionary and topological modularity suggests that the evolutionary history and functional requirements of metabolic systems have been imprinted in the architecture of metabolic networks. Such systems-level analysis could demonstrate how the evolution of genes may be placed in a genome-scale network context, giving a novel perspective on molecular evolution.
Submitted 6 September, 2007;
originally announced September 2007.
-
Exploring the assortativity-clustering space of a network's degree sequence
Authors:
Petter Holme,
Jing Zhao
Abstract:
Nowadays there is a multitude of measures designed to capture different aspects of network structure. To be able to say whether the structure of a certain network is expected or not, one needs a reference model (null model). One frequently used null model is the ensemble of graphs with the same set of degrees as the original network. In this paper we argue that this ensemble can be more than just a null model -- it also carries information about the original network and factors that affect its evolution. By mapping out this ensemble in the space of some low-level network structure -- in our case those measured by the assortativity and clustering coefficients -- one can for example study how close to the valid region of the parameter space the observed networks are. Such analysis suggests which quantities are actively optimized during the evolution of the network. We use four very different biological networks to exemplify our method. Among other things, we find that high clustering might be a force in the evolution of protein interaction networks. We also find that all four networks are conspicuously robust to both random errors and targeted attacks.
Submitted 6 November, 2006;
originally announced November 2006.
-
Hierarchical modularity of nested bow-ties in metabolic networks
Authors:
Jing Zhao,
Hong Yu,
Jian-Hua Luo,
Zhi-Wei Cao,
Yi-Xue Li
Abstract:
The exploration of the structural topology and the organizing principles of genome-based large-scale metabolic networks is essential for studying possible relations between structure and functionality of metabolic networks. Topological analysis of graph models has often been applied to study the structural characteristics of complex metabolic networks. In this work, metabolic networks of 75 organisms were investigated from a topological point of view. Network decomposition of three microbes (Escherichia coli, Aeropyrum pernix and Saccharomyces cerevisiae) shows that almost all of the sub-networks exhibit a highly modularized bow-tie topological pattern similar to that of the global metabolic networks. Moreover, these small bow-ties are hierarchically nested into larger ones and collectively integrated into a large metabolic network, and important features of this modularity are not observed in the randomly shuffled network. In addition, such a bow-tie pattern appears to be present in certain chemically isolated functional modules and spatially separated modules, including carbohydrate metabolism, cytosol and mitochondrion, respectively. The highly modularized bow-tie pattern is present at different levels and scales, and in different chemical and spatial modules of metabolic networks, which is likely the result of the evolutionary process rather than a random accident. Identification and analysis of such a pattern is helpful for understanding the design principles of metabolic networks and facilitates their modelling.
Submitted 31 August, 2006; v1 submitted 30 April, 2006;
originally announced May 2006.
-
Complex networks theory for analyzing metabolic networks
Authors:
Jing Zhao,
Hong Yu,
Jianhua Luo,
Z. W. Cao,
Yi-Xue Li
Abstract:
One of the main tasks of post-genomic informatics is to systematically investigate all molecules and their interactions within a living cell so as to understand how these molecules and the interactions between them relate to the function of the organism. Networks are an appropriate abstract description of all kinds of interactions. In the past few years, great progress has been made in developing the theory of complex networks to reveal the organizing principles that govern the formation and evolution of various complex biological, technological and social networks. This paper reviews the accomplishments in constructing genome-based metabolic networks and describes how the theory of complex networks is applied to analyze metabolic networks.
Submitted 13 August, 2006; v1 submitted 15 March, 2006;
originally announced March 2006.