
Showing 1–50 of 68 results for author: Yang, Z

Searching in archive q-bio.
  1. arXiv:2507.07201  [pdf, ps, other]

    q-bio.BM cs.AI cs.LG

    MODA: A Unified 3D Diffusion Framework for Multi-Task Target-Aware Molecular Generation

    Authors: Dong Xu, Zhangfan Yang, Sisi Yuan, Jenna Xinyi Yao, Jiangqiang Li, Junkai Ji

    Abstract: Three-dimensional molecular generators based on diffusion models can now reach near-crystallographic accuracy, yet they remain fragmented across tasks. SMILES-only inputs, two-stage pretrain-finetune pipelines, and one-task-one-model practices hinder stereochemical fidelity, task alignment, and zero-shot transfer. We introduce MODA, a diffusion framework that unifies fragment growing, linker desig…

    Submitted 9 July, 2025; originally announced July 2025.

  2. arXiv:2507.05656  [pdf, ps, other]

    eess.IV cs.CV cs.LG q-bio.QM

    ADPv2: A Hierarchical Histological Tissue Type-Annotated Dataset for Potential Biomarker Discovery of Colorectal Disease

    Authors: Zhiyuan Yang, Kai Li, Sophia Ghamoshi Ramandi, Patricia Brassard, Hakim Khellaf, Vincent Quoc-Huy Trinh, Jennifer Zhang, Lina Chen, Corwyn Rowsell, Sonal Varma, Kostas Plataniotis, Mahdi S. Hosseini

    Abstract: Computational pathology (CoPath) leverages histopathology images to enhance diagnostic precision and reproducibility in clinical pathology. However, publicly available datasets for CoPath that are annotated with extensive histological tissue type (HTT) taxonomies at a granular level remain scarce due to the significant expertise and high annotation costs required. Existing datasets, such as the At…

    Submitted 9 July, 2025; v1 submitted 8 July, 2025; originally announced July 2025.

    ACM Class: I.2.10; I.2.1

  3. arXiv:2506.14488  [pdf, ps, other]

    q-bio.BM cs.LG

    Reimagining Target-Aware Molecular Generation through Retrieval-Enhanced Aligned Diffusion

    Authors: Dong Xu, Zhangfan Yang, Ka-chun Wong, Zexuan Zhu, Jiangqiang Li, Junkai Ji

    Abstract: Breakthroughs in high-accuracy protein structure prediction, such as AlphaFold, have established receptor-based molecule design as a critical driver for rapid early-phase drug discovery. However, most approaches still struggle to balance pocket-specific geometric fit with strict valence and synthetic constraints. To resolve this trade-off, a Retrieval-Enhanced Aligned Diffusion termed READ is intr…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 13 pages, 5 figures

  4. arXiv:2506.01833  [pdf, ps, other]

    cs.LG q-bio.GN

    SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model

    Authors: Zhao Yang, Jiwei Zhu, Bing Su

    Abstract: Inspired by the success of unsupervised pre-training paradigms, researchers have applied these approaches to DNA pre-training. However, we argue that these approaches alone yield suboptimal results because pure DNA sequences lack sufficient information, since their functions are regulated by genomic profiles like chromatin accessibility. Here, we demonstrate that supervised training for genomic pr…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML 2025

  5. arXiv:2505.08850  [pdf, ps, other]

    q-bio.BM cond-mat.mtrl-sci

    High-throughput Screening of the Mechanical Properties of Peptide Assemblies

    Authors: Sarah K. Yorke, Zhenze Yang, Aviad Levin, Alice Ray, Jeremy Owusu Boamah, Tuomas P. J. Knowles, Markus J. Buehler

    Abstract: Peptides are recognized for their varied self-assembly behaviors, forming a wide array of structures and geometries, such as spheres, fibers, and hydrogels, each presenting a unique set of material properties. The functionalities of these materials hold exceptional interest for applications in biology, medicine, photonics, nanotechnology and the food industry. In specific, the ability to exploit p…

    Submitted 13 May, 2025; originally announced May 2025.

  6. arXiv:2505.07053  [pdf, ps, other]

    q-bio.MN physics.bio-ph q-bio.SC

    The Dynamics of Inducible Genetic Circuits

    Authors: Zitao Yang, Rebecca J. Rousseau, Sara D. Mahdavi, Hernan G. Garcia, Rob Phillips

    Abstract: Genes are connected in complex networks of interactions where often the product of one gene is a transcription factor that alters the expression of another. Many of these networks are based on a few fundamental motifs leading to switches and oscillators of various kinds. And yet, there is more to the story than which transcription factors control these various circuits. These transcription factors…

    Submitted 11 May, 2025; originally announced May 2025.

    Comments: 57 pages, 38 figures

  7. arXiv:2504.06316  [pdf, ps, other]

    q-bio.QM cs.LG

    DeepGDel: Deep Learning-based Gene Deletion Prediction Framework for Growth-Coupled Production in Genome-Scale Metabolic Models

    Authors: Ziwei Yang, Takeyuki Tamura

    Abstract: In genome-scale constraint-based metabolic models, gene deletion strategies are crucial for achieving growth-coupled production, where cell growth and target metabolite production are simultaneously achieved. While computational methods for calculating gene deletions have been widely explored and contribute to developing gene deletion strategy databases, current approaches are limited in leveragin…

    Submitted 19 June, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

  8. arXiv:2503.07981  [pdf, other]

    cs.LG q-bio.GN

    Regulatory DNA sequence Design with Reinforcement Learning

    Authors: Zhao Yang, Bing Su, Chuan Cao, Ji-Rong Wen

    Abstract: Cis-regulatory elements (CREs), such as promoters and enhancers, are relatively short DNA sequences that directly regulate gene expression. The fitness of CREs, measured by their ability to modulate gene expression, highly depends on the nucleotide sequences, especially specific motifs known as transcription factor binding sites (TFBSs). Designing high-fitness CREs is crucial for therapeutic and b…

    Submitted 10 March, 2025; originally announced March 2025.

  9. arXiv:2503.05031  [pdf, other]

    eess.IV cs.AI cs.CV q-bio.NC

    Enhancing Alzheimer's Diagnosis: Leveraging Anatomical Landmarks in Graph Convolutional Neural Networks on Tetrahedral Meshes

    Authors: Yanxi Chen, Mohammad Farazi, Zhangsihao Yang, Yonghui Fan, Nicholas Ashton, Eric M Reiman, Yi Su, Yalin Wang

    Abstract: Alzheimer's disease (AD) is a major neurodegenerative condition that affects millions around the world. As one of the main biomarkers in the AD diagnosis procedure, brain amyloid positivity is typically identified by positron emission tomography (PET), which is costly and invasive. Brain structural magnetic resonance imaging (sMRI) may provide a safer and more convenient solution for the AD diagno…

    Submitted 6 March, 2025; originally announced March 2025.

  10. arXiv:2502.10807  [pdf, other]

    cs.LG cs.AI q-bio.GN

    HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model

    Authors: Mingqian Ma, Guoqing Liu, Chuan Cao, Pan Deng, Tri Dao, Albert Gu, Peiran Jin, Zhao Yang, Yingce Xia, Renqian Luo, Pipi Hu, Zun Wang, Yuan-Jyue Chen, Haiguang Liu, Tao Qin

    Abstract: Advances in natural language processing and large language models have sparked growing interest in modeling DNA, often referred to as the "language of life". However, DNA modeling poses unique challenges. First, it requires the ability to process ultra-long DNA sequences while preserving single-nucleotide resolution, as individual nucleotides play a critical role in DNA function. Second, success i…

    Submitted 17 February, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

    Comments: Project page: https://hybridna-project.github.io/HybriDNA-Project/

  11. arXiv:2502.07237  [pdf, other]

    cs.LG cs.CL q-bio.BM stat.ML

    DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization

    Authors: Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens

    Abstract: Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research delves into the realm of drug optimization and introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model, enhancing the original drug across target objectives while retaining the beneficial chemical properties of the original drug.…

    Submitted 10 February, 2025; originally announced February 2025.

  12. arXiv:2501.08356  [pdf, other]

    q-bio.QM physics.data-an q-bio.CB

    A large population of cell-specific action potential models replicating fluorescence recordings of voltage in rabbit ventricular myocytes

    Authors: Radostin D. Simitev, Rebecca J. Gilchrist, Zhechao Yang, Rachel Myles, Francis Burton, Godfrey L. Smith

    Abstract: Recent high-throughput experiments unveil substantial electrophysiological diversity among uncoupled healthy myocytes under identical conditions. To quantify inter-cell variability, the values of a subset of the parameters in a well-regarded mathematical model of the action potential of rabbit ventricular myocytes are estimated from fluorescence voltage measurements of a large number of cells. Sta…

    Submitted 13 January, 2025; originally announced January 2025.

    Comments: Accepted for publication in Royal Society Open Science (ISSN:2054-5703) on 2025-01-13. Supplementary material available as open source from Royal Society Open Science

    Journal ref: Royal Society Open Science (2025) 12: 241539

  13. arXiv:2501.06271  [pdf, other]

    q-bio.QM cs.AI cs.CE

    Large Language Models for Bioinformatics

    Authors: Wei Ruan, Yanjun Lyu, Jing Zhang, Jiazhang Cai, Peng Shu, Yang Ge, Yao Lu, Shang Gao, Yue Wang, Peilong Wang, Lin Zhao, Tao Wang, Yufang Liu, Luyang Fang, Ziyu Liu, Zhengliang Liu, Yiwei Li, Zihao Wu, Junhao Chen, Hanqi Jiang, Yi Pan, Zhenyuan Yang, Jingyuan Chen, Shizhe Liang, Wei Zhang, et al. (30 additional authors not shown)

    Abstract: With the rapid advancements in large language model (LLM) technology and the emergence of bioinformatics-specific language models (BioLMs), there is a growing need for a comprehensive analysis of the current landscape, computational characteristics, and diverse applications. This survey aims to address this need by providing a thorough review of BioLMs, focusing on their evolution, classification,…

    Submitted 9 January, 2025; originally announced January 2025.

    Comments: 64 pages, 1 figure

  14. arXiv:2501.05644  [pdf, ps, other]

    q-bio.BM cs.LG

    Interpretable Enzyme Function Prediction via Residue-Level Detection

    Authors: Zhao Yang, Bing Su, Jiahao Chen, Ji-Rong Wen

    Abstract: Predicting multiple functions labeled with Enzyme Commission (EC) numbers from the enzyme sequence is of great significance but remains a challenge due to its sparse multi-label classification nature, i.e., each enzyme is typically associated with only a few labels out of more than 6000 possible EC numbers. However, existing machine learning algorithms generally learn a fixed global representation…

    Submitted 5 June, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

  15. arXiv:2411.08077  [pdf, other]

    q-bio.QM cs.DB

    DBgDel: Database-Enhanced Gene Deletion Framework for Growth-Coupled Production in Genome-Scale Metabolic Models

    Authors: Ziwei Yang, Takeyuki Tamura

    Abstract: When simulating metabolite productions with genome-scale constraint-based metabolic models, gene deletion strategies are necessary to achieve growth-coupled production, which means cell growth and target metabolite production occur simultaneously. Since obtaining gene deletion strategies for large genome-scale models suffers from significant computational time, it is necessary to develop methods t…

    Submitted 26 March, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

  16. arXiv:2410.21591  [pdf, other]

    cs.AI cs.CL q-bio.GN q-bio.QM

    Can Large Language Models Replace Data Scientists in Biomedical Research?

    Authors: Zifeng Wang, Benjamin Danek, Ziwei Yang, Zheng Chen, Jimeng Sun

    Abstract: Data science plays a critical role in biomedical research, but it requires professionals with expertise in coding and medical data analysis. Large language models (LLMs) have shown great potential in supporting medical tasks and performing well in general coding tests. However, existing evaluations fail to assess their capability in biomedical data science, particularly in handling diverse data ty…

    Submitted 8 April, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

  17. arXiv:2410.20852  [pdf, other]

    cs.SD cs.CE eess.AS q-bio.QM

    Atrial Fibrillation Detection System via Acoustic Sensing for Mobile Phones

    Authors: Xuanyu Liu, Jiao Li, Haoxian Liu, Zongqi Yang, Yi Huang, Jin Zhang

    Abstract: Atrial fibrillation (AF) is characterized by irregular electrical impulses originating in the atria, which can lead to severe complications and even death. Due to the intermittent nature of the AF, early and timely monitoring of AF is critical for patients to prevent further exacerbation of the condition. Although ambulatory ECG Holter monitors provide accurate monitoring, the high cost of these d…

    Submitted 28 October, 2024; originally announced October 2024.

    Comments: This paper has been submitted to ACM Transactions on Sensor Networks (TOSN)

  18. arXiv:2409.12080  [pdf, other]

    q-bio.BM

    Design of Ligand-Binding Proteins with Atomic Flow Matching

    Authors: Junqi Liu, Shaoning Li, Chence Shi, Zhi Yang, Jian Tang

    Abstract: Designing novel proteins that bind to small molecules is a long-standing challenge in computational biology, with applications in developing catalysts, biosensors, and more. Current computational methods rely on the assumption that the binding pose of the target molecule is known, which is not always feasible, as conformations of novel targets are often unknown and tend to change upon binding. In…

    Submitted 18 September, 2024; originally announced September 2024.

  19. arXiv:2409.02143  [pdf, ps, other]

    q-bio.GN cs.LG

    MLOmics: Cancer Multi-Omics Database for Machine Learning

    Authors: Ziwei Yang, Rikuto Kotoge, Xihao Piao, Zheng Chen, Lingwei Zhu, Peng Gao, Yasuko Matsubara, Yasushi Sakurai, Jimeng Sun

    Abstract: Framing the investigation of diverse cancers as a machine learning problem has recently shown significant potential in multi-omics analysis and cancer research. Empowering these successful machine learning models are the high-quality training datasets with sufficient data volume and adequate preprocessing. However, while there exist several public data portals, including The Cancer Genome Atlas (T…

    Submitted 16 June, 2025; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: This work has been published in Scientific Data

  20. arXiv:2407.12897  [pdf, other]

    q-bio.QM stat.ML

    Generative models of MRI-derived neuroimaging features and associated dataset of 18,000 samples

    Authors: Sai Spandana Chintapalli, Rongguang Wang, Zhijian Yang, Vasiliki Tassopoulou, Fanyang Yu, Vishnu Bashyam, Guray Erus, Pratik Chaudhari, Haochang Shou, Christos Davatzikos

    Abstract: Availability of large and diverse medical datasets is often challenged by privacy and data sharing restrictions. For successful application of machine learning techniques for disease diagnosis, prognosis, and precision medicine, large amounts of data are necessary for model building and optimization. To help overcome such limitations in the context of brain MRI, we present GenMIND: a collection of…

    Submitted 1 October, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

  21. arXiv:2402.03781  [pdf, other]

    q-bio.QM cs.AI cs.LG

    MolTC: Towards Molecular Relational Modeling In Language Models

    Authors: Junfeng Fang, Shuai Zhang, Chang Wu, Zhengyi Yang, Zhiyuan Liu, Sihang Li, Kun Wang, Wenjie Du, Xiang Wang

    Abstract: Molecular Relational Learning (MRL), aiming to understand interactions between molecular pairs, plays a pivotal role in advancing biochemical research. Recently, the adoption of large language models (LLMs), known for their vast knowledge repositories and advanced logical inference capabilities, has emerged as a promising way for efficient and effective MRL. Despite their potential, these methods…

    Submitted 10 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ACL 2024

  22. arXiv:2401.09517  [pdf]

    cs.LG eess.IV q-bio.QM

    Dimensional Neuroimaging Endophenotypes: Neurobiological Representations of Disease Heterogeneity Through Machine Learning

    Authors: Junhao Wen, Mathilde Antoniades, Zhijian Yang, Gyujoon Hwang, Ioanna Skampardoni, Rongguang Wang, Christos Davatzikos

    Abstract: Machine learning has been increasingly used to obtain individualized neuroimaging signatures for disease diagnosis, prognosis, and response to treatment in neuropsychiatric and neurodegenerative disorders. Therefore, it has contributed to a better understanding of disease heterogeneity by identifying disease subtypes that present significant differences in various brain phenotypic measures. In thi…

    Submitted 17 January, 2024; originally announced January 2024.

  23. arXiv:2311.04837  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Identifying Semantic Component for Robust Molecular Property Prediction

    Authors: Zijian Li, Zunhong Xu, Ruichu Cai, Zhenhui Yang, Yuguang Yan, Zhifeng Hao, Guangyi Chen, Kun Zhang

    Abstract: Although graph neural networks have achieved great success in the task of molecular property prediction in recent years, their generalization ability under out-of-distribution (OOD) settings is still under-explored. Different from existing methods that learn discriminative representations for prediction, we propose a generative model with semantic-components identifiability, named SCI. We demonstr…

    Submitted 8 November, 2023; originally announced November 2023.

  24. arXiv:2308.09725  [pdf]

    q-bio.GN cs.AI cs.LG

    MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling

    Authors: Ziwei Yang, Zheng Chen, Yasuko Matsubara, Yasushi Sakurai

    Abstract: Precision medicine fundamentally aims to establish causality between dysregulated biochemical mechanisms and cancer subtypes. Omics-based cancer subtyping has emerged as a revolutionary approach, as different levels of omics record the biochemical products of multistep processes in cancers. This paper focuses on fully exploiting the potential of multi-omics data to improve cancer subtyping outcome…

    Submitted 24 August, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: CIKM'23 Long/Full Papers

  25. arXiv:2308.01941  [pdf]

    q-bio.NC cs.AI cs.NE

    Digital twin brain: a bridge between biological intelligence and artificial intelligence

    Authors: Hui Xiong, Congying Chu, Lingzhong Fan, Ming Song, Jiaqi Zhang, Yawei Ma, Ruonan Zheng, Junyang Zhang, Zhengyi Yang, Tianzi Jiang

    Abstract: In recent years, advances in neuroscience and artificial intelligence have paved the way for unprecedented opportunities for understanding the complexity of the brain and its emulation by computational systems. Cutting-edge advancements in neuroscience research have revealed the intricate relationship between brain structure and function, while the success of artificial neural networks highlights…

    Submitted 2 August, 2023; originally announced August 2023.

    Journal ref: Intell Comput. 2023;2:0055

  26. arXiv:2307.08848  [pdf]

    physics.bio-ph q-bio.QM

    Microbiome-derived bile acids contribute to elevated antigenic response and bone erosion in rheumatoid arthritis

    Authors: Xiuli Su, Xiaona Li, Yanqin Bian, Qing Ren, Leiguang Li, Xiaohao Wu, Hemi Luan, Bing He, Xiaojuan He, Hui Feng, Xingye Cheng, Pan-Jun Kim, Leihan Tang, Aiping Lu, Lianbo Xiao, Liang Tian, Zhu Yang, Zongwei Cai

    Abstract: Rheumatoid arthritis (RA) is a chronic, disabling and incurable autoimmune disease. It has been widely recognized that gut microbial dysbiosis is an important contributor to the pathogenesis of RA, although distinct alterations in microbiota have been associated with this disease. Yet, the metabolites that mediate the impacts of the gut microbiome on RA are less well understood. Here, with microbi…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 38 pages, 6 figures

  27. arXiv:2307.07443  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Can Large Language Models Empower Molecular Property Prediction?

    Authors: Chen Qian, Huayi Tang, Zhirui Yang, Hong Liang, Yong Liu

    Abstract: Molecular property prediction has gained significant attention due to its transformative potential in multiple scientific disciplines. Conventionally, a molecule graph can be represented either as graph-structured data or as a SMILES text. Recently, the rapid development of Large Language Models (LLMs) has revolutionized the field of NLP. Although it is natural to utilize LLMs to assist in understa…

    Submitted 14 July, 2023; originally announced July 2023.

  28. arXiv:2306.09629  [pdf, other]

    eess.IV cs.CV q-bio.NC

    Fusing Structural and Functional Connectivities using Disentangled VAE for Detecting MCI

    Authors: Qiankun Zuo, Yanfei Zhu, Libin Lu, Zhi Yang, Yuhui Li, Ning Zhang

    Abstract: Brain network analysis is a useful approach to studying human brain disorders because it can distinguish patients from healthy people by detecting abnormal connections. Due to the complementary information from multiple modal neuroimages, multimodal fusion technology has a lot of potential for improving prediction performance. However, effective fusion of multimodal medical images to achieve compl…

    Submitted 21 August, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 4 figures

  29. arXiv:2306.04886  [pdf, other]

    q-bio.BM cs.LG

    Multi-task Bioassay Pre-training for Protein-ligand Binding Affinity Prediction

    Authors: Jiaxian Yan, Zhaofeng Ye, Ziyi Yang, Chengqiang Lu, Shengyu Zhang, Qi Liu, Jiezhong Qiu

    Abstract: Protein-ligand binding affinity (PLBA) prediction is the fundamental task in drug discovery. Recently, various deep learning-based models predict binding affinity by incorporating the three-dimensional structure of protein-ligand complexes as input and achieving astounding progress. However, due to the scarcity of high-quality training data, the generalization ability of current models is still li…

    Submitted 20 December, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 21 pages, 7 figures

  30. arXiv:2303.10533  [pdf]

    q-bio.QM cs.CV

    A Radiomics-Incorporated Deep Ensemble Learning Model for Multi-Parametric MRI-based Glioma Segmentation

    Authors: Yang Chen, Zhenyu Yang, Jingtong Zhao, Justus Adamson, Yang Sheng, Fang-Fang Yin, Chunhao Wang

    Abstract: We developed a deep ensemble learning model with a radiomics spatial encoding execution for improved glioma segmentation accuracy using multi-parametric MRI (mp-MRI). This model was developed using 369 glioma patients with a 4-modality mp-MRI protocol: T1, contrast-enhanced T1 (T1-Ce), T2, and FLAIR. In each modality volume, a 3D sliding kernel was implemented across the brain to capture image het…

    Submitted 18 March, 2023; originally announced March 2023.

  31. arXiv:2301.10772  [pdf]

    q-bio.QM cs.LG eess.IV

    Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering

    Authors: Zhijian Yang, Junhao Wen, Ahmed Abdulkadir, Yuhan Cui, Guray Erus, Elizabeth Mamourian, Randa Melhem, Dhivya Srinivasan, Sindhuja T. Govindarajan, Jiong Chen, Mohamad Habes, Colin L. Masters, Paul Maruff, Jurgen Fripp, Luigi Ferrucci, Marilyn S. Albert, Sterling C. Johnson, John C. Morris, Pamela LaMontagne, Daniel S. Marcus, Tammie L. S. Benzinger, David A. Wolk, Li Shen, Jingxuan Bao, Susan M. Resnick, et al. (3 additional authors not shown)

    Abstract: Disease heterogeneity has been a critical challenge for precision diagnosis and treatment, especially in neurologic and neuropsychiatric diseases. Many diseases can display multiple distinct brain phenotypes across individuals, potentially reflecting disease subtypes that can be captured using MRI and machine learning methods. However, biological interpretability and treatment relevance are limite…

    Submitted 25 January, 2023; originally announced January 2023.

  32. arXiv:2301.03424  [pdf, other]

    q-bio.BM cs.AI cs.LG

    An open unified deep graph learning framework for discovering drug leads

    Authors: Yueming Yin, Haifeng Hu, Zhen Yang, Jitao Yang, Chun Ye, Jiansheng Wu, Wilson Wen Bin Goh

    Abstract: Computational discovery of ideal lead compounds is a critical process for modern drug discovery. It comprises multiple stages: hit screening, molecular property prediction, and molecule optimization. Current efforts are disparate, involving the establishment of models for each stage, followed by multi-stage multi-model integration. However, this is non-ideal, as clumsy integration of incompatible…

    Submitted 20 January, 2023; v1 submitted 5 December, 2022; originally announced January 2023.

    Comments: This article is used as the preliminary studies for the application of Lee Kuan Yew Postdoctoral Fellowship (LKYPDF) 2023 in Singapore. All rights reserved

  33. arXiv:2211.10419  [pdf, other]

    q-bio.NC cs.AI cs.LG

    A Neural Active Inference Model of Perceptual-Motor Learning

    Authors: Zhizhuo Yang, Gabriel J. Diaz, Brett R. Fajen, Reynold Bailey, Alexander Ororbia

    Abstract: The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience that can produce human-like behavior through reward-based learning. In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans through the systematic investigation of a visual-motor task that has been well-explored…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 16 pages including references, 6 figures. Submitted to Frontiers in Computational Neuroscience

  34. arXiv:2210.13225  [pdf, other]

    cs.NE cs.LG q-bio.NC

    Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks

    Authors: Zhile Yang, Shangqi Guo, Ying Fang, Jian K. Liu

    Abstract: One stream of reinforcement learning research is exploring biologically plausible models and algorithms to simulate biological intelligence and fit neuromorphic hardware. Among them, reward-modulated spike-timing-dependent plasticity (R-STDP) is a recent branch with good potential in energy efficiency. However, current R-STDP methods rely on heuristic designs of local learning rules, thus requirin…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to BMVC 2022

  35. arXiv:2210.06512  [pdf]

    q-bio.QM cs.CV eess.IV

    Quantifying U-Net Uncertainty in Multi-Parametric MRI-based Glioma Segmentation by Spherical Image Projection

    Authors: Zhenyu Yang, Kyle Lafata, Eugene Vaios, Zongsheng Hu, Trey Mullikin, Fang-Fang Yin, Chunhao Wang

    Abstract: The projection of planar MRI data onto a spherical surface is equivalent to a nonlinear image transformation that retains global anatomical information. By incorporating this image transformation process in our proposed spherical projection-based U-Net (SPU-Net) segmentation model design, multiple independent segmentation predictions can be obtained from a single MRI. The final segmentation is the…

    Submitted 12 August, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 31 pages, 9 figures, 1 table

  36. arXiv:2209.13492  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Unraveling Key Elements Underlying Molecular Property Prediction: A Systematic Study

    Authors: Jianyuan Deng, Zhibo Yang, Hehe Wang, Iwao Ojima, Dimitris Samaras, Fusheng Wang

    Abstract: Artificial intelligence (AI) has been widely applied in drug discovery with a major task as molecular property prediction. Despite booming techniques in molecular representation learning, key elements underlying molecular property prediction remain largely unexplored, which impedes further advancements in this field. Herein, we conduct an extensive evaluation of representative models using various…

    Submitted 2 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

  37. arXiv:2206.10801  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Automated Cancer Subtyping via Vector Quantization Mutual Information Maximization

    Authors: Zheng Chen, Lingwei Zhu, Ziwei Yang, Takashi Matsubara

    Abstract: Cancer subtyping is crucial for understanding the nature of tumors and providing suitable therapy. However, existing labelling methods are medically controversial, and have driven the process of subtyping away from teaching signals. Moreover, cancer genetic expression profiles are high-dimensional, scarce, and have complicated dependence, thereby posing a serious challenge to existing subtyping mo…

    Submitted 14 November, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: accepted by ECML-PKDD 2022

  38. arXiv:2205.09548  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    ODBO: Bayesian Optimization with Search Space Prescreening for Directed Protein Evolution

    Authors: Lixue Cheng, Ziyi Yang, Changyu Hsieh, Benben Liao, Shengyu Zhang

    Abstract: Directed evolution is a versatile technique in protein engineering that mimics the process of natural selection by iteratively alternating between mutagenesis and screening in order to search for sequences that optimize a given property of interest, such as catalytic activity and binding affinity to a specified target. However, the space of possible proteins is too large to search exhaustively in…

    Submitted 1 May, 2024; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: 27 pages, 13 figures

  39. arXiv:2204.09840  [pdf, other]

    eess.SP cs.LG q-bio.NC

    Multi-Tier Platform for Cognizing Massive Electroencephalogram

    Authors: Zheng Chen, Lingwei Zhu, Ziwei Yang, Renyuan Zhang

    Abstract: An end-to-end platform assembling multiple tiers is built for precisely cognizing brain activities. Being fed massive electroencephalogram (EEG) data, the time-frequency spectrograms are conventionally projected into the episode-wise feature matrices (seen as tier-1). A spiking neural network (SNN) based tier is designed to distill the principle information in terms of spike-streams from the rare…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 7 pages, accepted by IJCAI 2022

  40. arXiv:2204.02278  [pdf, other]

    cs.LG q-bio.GN

    Cancer Subtyping via Embedded Unsupervised Learning on Transcriptomics Data

    Authors: Ziwei Yang, Lingwei Zhu, Zheng Chen, Ming Huang, Naoaki Ono, MD Altaf-Ul-Amin, Shigehiko Kanaya

    Abstract: Cancer is one of the deadliest diseases worldwide. Accurate diagnosis and classification of cancer subtypes are indispensable for effective clinical treatment. Promising results on automatic cancer subtyping systems have been published recently with the emergence of various deep learning methods. However, such automatic systems often overfit the data due to the high dimensionality and scarcity. In…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: 4 pages, accepted for EMBC 2022

  41. arXiv:2203.08648  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG q-bio.NC

    Artificial Intelligence Enables Real-Time and Intuitive Control of Prostheses via Nerve Interface

    Authors: Diu Khue Luu, Anh Tuan Nguyen, Ming Jiang, Markus W. Drealan, Jian Xu, Tong Wu, Wing-kin Tam, Wenfeng Zhao, Brian Z. H. Lim, Cynthia K. Overstreet, Qi Zhao, Jonathan Cheng, Edward W. Keefer, Zhi Yang

    Abstract: Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines. Methods: Here we present a neuroprosthetic system to demonstrate that principle by employing an artificial intelligence (AI) agent to translate the amputee's movement intent through a peripheral nerve interface. The AI agent is designed…

    Submitted 16 March, 2022; originally announced March 2022.

  42. arXiv:2203.00628  [pdf]

    q-bio.QM cs.LG eess.IV

    A Neural Ordinary Differential Equation Model for Visualizing Deep Neural Network Behaviors in Multi-Parametric MRI based Glioma Segmentation

    Authors: Zhenyu Yang, Zongsheng Hu, Hangjie Ji, Kyle Lafata, Scott Floyd, Fang-Fang Yin, Chunhao Wang

    Abstract: Purpose: To develop a neural ordinary differential equation (ODE) model for visualizing deep neural network (DNN) behavior during multi-parametric MRI (mp-MRI) based glioma segmentation as a method to enhance deep learning explainability. Methods: By hypothesizing that deep feature extraction can be modeled as a spatiotemporally continuous process, we designed a novel deep learning model, neural O…

    Submitted 23 March, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 30 pages, 7 figures, 2 tables

  43. arXiv:2111.08008  [pdf, other]

    q-bio.QM cs.LG

    SPLDExtraTrees: Robust machine learning approach for predicting kinase inhibitor resistance

    Authors: Ziyi Yang, Zhaofeng Ye, Yijia Xiao, Changyu Hsieh, Shengyu Zhang

    Abstract: Drug resistance is a major threat to the global health and a significant concern throughout the clinical treatment of diseases and drug development. The mutation in proteins that is related to drug binding is a common cause for adaptive drug resistance. Therefore, quantitative estimations of how mutations would affect the interaction between a drug and the target protein would be of vital signific…

    Submitted 14 January, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

    Comments: 14 pages, 5 figures

    MSC Class: machine learning

  44. arXiv:2110.11347  [pdf]

    q-bio.NC cs.LG

    Multidimensional representations in late-life depression: convergence in neuroimaging, cognition, clinical symptomatology and genetics

    Authors: Junhao Wen, Cynthia H. Y. Fu, Duygu Tosun, Yogasudha Veturi, Zhijian Yang, Ahmed Abdulkadir, Elizabeth Mamourian, Dhivya Srinivasan, Jingxuan Bao, Guray Erus, Haochang Shou, Mohamad Habes, Jimit Doshi, Erdem Varol, Scott R Mackin, Aristeidis Sotiras, Yong Fan, Andrew J. Saykin, Yvette I. Sheline, Li Shen, Marylyn D. Ritchie, David A. Wolk, Marilyn Albert, Susan M. Resnick, Christos Davatzikos

    Abstract: Late-life depression (LLD) is characterized by considerable heterogeneity in clinical manifestation. Unraveling such heterogeneity would aid in elucidating etiological mechanisms and pave the road to precision and individualized medicine. We sought to delineate, cross-sectionally and longitudinally, disease-related heterogeneity in LLD linked to neuroanatomy, cognitive functioning, clinical sympto…

    Submitted 25 October, 2021; v1 submitted 20 October, 2021; originally announced October 2021.

  45. arXiv:2102.12582  [pdf]

    cs.LG eess.IV q-bio.NC q-bio.QM

    Disentangling brain heterogeneity via semi-supervised deep-learning and MRI: dimensional representations of Alzheimer's Disease

    Authors: Zhijian Yang, Ilya M. Nasrallah, Haochang Shou, Junhao Wen, Jimit Doshi, Mohamad Habes, Guray Erus, Ahmed Abdulkadir, Susan M. Resnick, David Wolk, Christos Davatzikos

    Abstract: Heterogeneity of brain diseases is a challenge for precision diagnosis/prognosis. We describe and validate Smile-GAN (SeMI-supervised cLustEring-Generative Adversarial Network), a novel semi-supervised deep-clustering method, which dissects neuroanatomical heterogeneity, enabling identification of disease subtypes via their imaging signatures relative to controls. When applied to MRIs (2 studies;…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 37 pages, 11 figures

  46. arXiv:2012.15418  [pdf]

    q-bio.GN

    EPIHC: Improving Enhancer-Promoter Interaction Prediction by using Hybrid features and Communicative learning

    Authors: Shuai Liu, Xinran Xu, Zhihao Yang, Xiaohan Zhao, Wen Zhang

    Abstract: Enhancer-promoter interactions (EPIs) regulate the expression of specific genes in cells, and EPIs are important for understanding gene regulation, cell differentiation and disease mechanisms. EPI identification through the wet experiments is costly and time-consuming, and computational methods are in demand. In this paper, we propose a deep neural network-based method EPIHC based on sequence-deri…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: 7 pages, 9 figures, 2 tables

  47. arXiv:2007.00975  [pdf]

    q-bio.BM physics.chem-ph

    Molcontroller: a VMD Graphical User Interface for Manipulating Molecules

    Authors: ChenChen Wu, Shengtang Liu, Shitong Zhang, Zaixing Yang

    Abstract: Visual Molecular Dynamics (VMD) is one of the most widely used molecular graphics software in the community of theoretical simulations. So far, however, it still lacks a graphical user interface (GUI) for molecular manipulations when doing some modeling tasks. For instance, translation or rotation of a selected molecule(s) or part(s) of a molecule, which are currently only can be achieved using tc…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 7 pages, 3 figures

  48. arXiv:2006.15255  [pdf, other]

    q-bio.QM cs.LG eess.IV stat.ML

    Smile-GANs: Semi-supervised clustering via GANs for dissecting brain disease heterogeneity from medical images

    Authors: Zhijian Yang, Junhao Wen, Christos Davatzikos

    Abstract: Machine learning methods applied to complex biomedical data has enabled the construction of disease signatures of diagnostic/prognostic value. However, less attention has been given to understanding disease heterogeneity. Semi-supervised clustering methods can address this problem by estimating multiple transformations from a (e.g. healthy) control (CN) group to a patient (PT) group, seeking to ca…

    Submitted 26 June, 2020; originally announced June 2020.

  49. arXiv:2005.10951  [pdf, other]

    q-bio.QM q-bio.TO

    A machine learning approach to using Quality-of-Life patient scores in guiding prostate radiation therapy dosing

    Authors: Zhijian Yang, Daniel Olszewski, Chujun He, Giulia Pintea, Jun Lian, Tom Chou, Ronald Chen, Blerta Shtylla

    Abstract: Thanks to advancements in diagnosis and treatment, prostate cancer patients have high long-term survival rates. Currently, an important goal is to preserve quality-of-life during and after treatment. The relationship between the radiation a patient receives and the subsequent side effects he experiences is complex and difficult to model or predict. Here, we use machine learning algorithms and stat…

    Submitted 21 May, 2020; originally announced May 2020.

  50. Using single-cell entropy to describe the dynamics of reprogramming and differentiation of induced pluripotent stem cells

    Authors: Yusong Ye, Zhuoqin Yang, Jinzhi Lei

    Abstract: Induced pluripotent stem cells (iPSCs) provide a great model to study the process of reprogramming and differentiation of stem cells. Single-cell RNA sequencing (scRNA-seq) enables us to investigate the reprogramming process at single-cell level. Here, we introduce single-cell entropy (scEntropy) as a macroscopic variable to quantify the cellular transcriptome from scRNA-seq data during reprogramm…

    Submitted 16 April, 2020; originally announced April 2020.

    Comments: 12 pages, 5 figures