Showing 1–50 of 149 results for author: Zhang, L

Searching in archive q-bio.
  1. arXiv:2507.06326  [pdf, ps, other]

    cs.LG cs.AI eess.SY q-bio.NC

    Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease

    Authors: Harsh Ravivarapu, Gaurav Bagwe, Xiaoyong Yuan, Chunxiu Yu, Lan Zhang

    Abstract: Deep brain stimulation (DBS) is an established intervention for Parkinson's disease (PD), but conventional open-loop systems lack adaptability, are energy-inefficient due to continuous stimulation, and provide limited personalization to individual neural dynamics. Adaptive DBS (aDBS) offers a closed-loop alternative, using biomarkers such as beta-band oscillations to dynamically modulate stimulati…

    Submitted 8 July, 2025; originally announced July 2025.

    Comments: Accepted by IEEE IMC 2025

  2. arXiv:2507.02734  [pdf, ps, other]

    q-bio.QM

    Leveraging Transformer Models to Capture Multi-Scale Dynamics in Biomolecules by nano-GPT

    Authors: Wenqi Zeng, Lu Zhang, Yuan Yao

    Abstract: Long-term biomolecular dynamics are critical for understanding key evolutionary transformations in molecular systems. However, capturing these processes requires extended simulation timescales that often exceed the practical limits of conventional models. To address this, shorter simulations, initialized with diverse perturbations, are commonly used to sample phase space and explore a wide range o…

    Submitted 3 July, 2025; originally announced July 2025.

  3. arXiv:2506.17310  [pdf, ps, other]

    q-bio.NC cs.CL cs.NE

    PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding

    Authors: Kangcong Li, Peng Ye, Chongjun Tu, Lin Zhang, Chunfeng Song, Jiamin Wu, Tao Yang, Qihao Zheng, Tao Chen

    Abstract: While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural activations causing information decay and unstructured feed-forward network (FFN) weights leading to semantic fragmentation. Inspired by the brain's working memory and cortical modularity, we propose PaceLLM, featuring two innovations: (1) a Persistent A…

    Submitted 18 June, 2025; originally announced June 2025.

  4. arXiv:2506.00936  [pdf, ps, other]

    cs.LG cs.AI q-bio.QM

    Uncertainty-Aware Metabolic Stability Prediction with Dual-View Contrastive Learning

    Authors: Peijin Guo, Minghui Li, Hewen Pan, Bowen Chen, Yang Wu, Zikang Guo, Leo Yu Zhang, Shengshan Hu, Shengqing Hu

    Abstract: Accurate prediction of molecular metabolic stability (MS) is critical for drug research and development but remains challenging due to the complex interplay of molecular interactions. Despite recent advances in graph neural networks (GNNs) for MS prediction, current approaches face two critical limitations: (1) incomplete molecular modeling due to atom-centric message-passing mechanisms that disre…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: This manuscript has been accepted for publication at ECML-PKDD 2025. The final version will be published in the conference proceedings

  5. arXiv:2505.16492  [pdf, ps, other]

    q-bio.QM

    Learning collective multi-cellular dynamics from temporal scRNA-seq via a transformer-enhanced Neural SDE

    Authors: Qi Jiang, Lei Zhang, Longquan Li, Lin Wan

    Abstract: Time-series single-cell RNA-sequencing (scRNA-seq) datasets offer unprecedented insights into the dynamics and heterogeneity of cellular systems. These systems exhibit multiscale collective behaviors driven by intricate intracellular gene regulatory networks and intercellular interactions of molecules. However, inferring interacting cell population dynamics from time-series scRNA-seq data remains…

    Submitted 22 May, 2025; originally announced May 2025.

  6. arXiv:2505.07896  [pdf, ps, other]

    q-bio.GN cs.AI

    Bridging Large Language Models and Single-Cell Transcriptomics in Dissecting Selective Motor Neuron Vulnerability

    Authors: Douglas Jiang, Zilin Dai, Luxuan Zhang, Qiyi Yu, Haoqi Sun, Feng Tian

    Abstract: Understanding cell identity and function through single-cell level sequencing data remains a key challenge in computational biology. We present a novel framework that leverages gene-specific textual annotations from the NCBI Gene database to generate biologically contextualized cell embeddings. For each cell in a single-cell RNA sequencing (scRNA-seq) dataset, we rank genes by expression level, re…

    Submitted 11 May, 2025; originally announced May 2025.

  7. arXiv:2503.19823  [pdf, other]

    q-bio.NC cs.AI cs.CV

    GyralNet Subnetwork Partitioning via Differentiable Spectral Modularity Optimization

    Authors: Yan Zhuang, Minheng Chen, Chao Cao, Tong Chen, Jing Zhang, Xiaowei Yu, Yanjun Lyu, Lu Zhang, Tianming Liu, Dajiang Zhu

    Abstract: Understanding the structural and functional organization of the human brain requires a detailed examination of cortical folding patterns, among which the three-hinge gyrus (3HG) has been identified as a key structural landmark. GyralNet, a network representation of cortical folding, models 3HGs as nodes and gyral crests as edges, highlighting their role as critical hubs in cortico-cortical connect…

    Submitted 31 March, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

    Comments: 10 pages, 3 figures

  8. arXiv:2503.17656  [pdf, other]

    q-bio.QM cs.AI cs.LG

    NaFM: Pre-training a Foundation Model for Small-Molecule Natural Products

    Authors: Yuheng Ding, Bo Qiang, Yiran Zhou, Jie Yu, Qi Li, Liangren Zhang, Yusong Wang, Zhenmin Liu

    Abstract: Natural products, as metabolites from microorganisms, animals, or plants, exhibit diverse biological activities, making them crucial for drug discovery. Nowadays, existing deep learning methods for natural products research primarily rely on supervised learning approaches designed for specific downstream tasks. However, such one-model-for-a-task paradigm often lacks generalizability and leaves sig…

    Submitted 18 May, 2025; v1 submitted 22 March, 2025; originally announced March 2025.

  9. arXiv:2503.16278  [pdf, other]

    cs.LG cond-mat.mtrl-sci q-bio.BM

    Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens

    Authors: Shuqi Lu, Haowei Lin, Lin Yao, Zhifeng Gao, Xiaohong Ji, Weinan E, Linfeng Zhang, Guolin Ke

    Abstract: Recent advancements in large language models and their multi-modal extensions have demonstrated the effectiveness of unifying generation and understanding through autoregressive next-token prediction. However, despite the critical role of 3D structural generation and understanding (3D GU) in AI for science, these tasks have largely evolved independently, with autoregressive methods remaining under…

    Submitted 21 March, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

  10. arXiv:2503.15438  [pdf, other]

    cs.CL cs.AI q-bio.QM

    VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning

    Authors: Yang Tan, Chen Liu, Jingyuan Gao, Banghao Wu, Mingchen Li, Ruilin Wang, Lingrong Zhang, Huiqun Yu, Guisheng Fan, Liang Hong, Bingxin Zhou

    Abstract: Natural language processing (NLP) has significantly influenced scientific domains beyond human language, including protein engineering, where pre-trained protein language models (PLMs) have demonstrated remarkable success. However, interdisciplinary adoption remains limited due to challenges in data collection, task benchmarking, and application. This work presents VenusFactory, a versatile engine…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 12 pages, 1 figure, 8 tables

  11. arXiv:2503.14655  [pdf, other]

    q-bio.NC cs.AI cs.CV eess.IV

    Core-Periphery Principle Guided State Space Model for Functional Connectome Classification

    Authors: Minheng Chen, Xiaowei Yu, Jing Zhang, Tong Chen, Chao Cao, Yan Zhuang, Yanjun Lyu, Lu Zhang, Tianming Liu, Dajiang Zhu

    Abstract: Understanding the organization of human brain networks has become a central focus in neuroscience, particularly in the study of functional connectivity, which plays a crucial role in diagnosing neurological disorders. Advances in functional magnetic resonance imaging and machine learning techniques have significantly improved brain network analysis. However, traditional machine learning approaches…

    Submitted 18 March, 2025; originally announced March 2025.

  12. arXiv:2503.13522  [pdf, ps, other]

    q-bio.BM cs.AI cs.LG

    Advanced Deep Learning Methods for Protein Structure Prediction and Design

    Authors: Yichao Zhang, Ningyuan Deng, Xinyuan Song, Ziqian Bi, Tianyang Wang, Zheyu Yao, Keyu Chen, Ming Li, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Li Zhang, Xuanhe Pan, Jinlang Wang, Pohsun Feng, Yizhu Wen, Lawrence KQ Yan, Hongming Tseng, Yan Zhong, Yunze Wang, Ziyuan Qin, Bowen Jing, Junjie Yang, et al. (3 additional authors not shown)

    Abstract: After AlphaFold won the Nobel Prize, protein prediction with deep learning once again became a hot topic. We comprehensively explore advanced deep learning methods applied to protein structure prediction and design. It begins by examining recent innovations in prediction architectures, with detailed discussions on improvements such as diffusion based frameworks and novel pairwise attention modules…

    Submitted 29 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

  13. arXiv:2503.10489  [pdf, other]

    q-bio.BM cs.LG

    Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling

    Authors: Shuqi Lu, Xiaohong Ji, Bohang Zhang, Lin Yao, Siyuan Liu, Zhifeng Gao, Linfeng Zhang, Guolin Ke

    Abstract: Molecular pretrained representations (MPR) have emerged as a powerful approach for addressing the challenge of limited supervised data in applications such as drug discovery and material design. While early MPR methods relied on 1D sequences and 2D graphs, recent advancements have incorporated 3D conformational information to capture rich atomic interactions. However, these prior models treat molec…

    Submitted 18 March, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

  14. arXiv:2503.07640  [pdf]

    cs.LG cs.AI q-bio.NC

    BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification

    Authors: Jing Zhang, Xiaowei Yu, Tong Chen, Chao Cao, Mingheng Chen, Yan Zhuang, Yanjun Lyu, Lu Zhang, Li Su, Tianming Liu, Dajiang Zhu

    Abstract: Lewy body dementia (LBD) is the second most common neurodegenerative dementia after Alzheimer's disease (AD). Early differentiation between AD and LBD is crucial because they require different treatment approaches, but this is challenging due to significant clinical overlap, heterogeneity, complex pathogenesis, and the rarity of LBD. While recent advances in artificial intelligence (AI) demons…

    Submitted 5 March, 2025; originally announced March 2025.

  15. arXiv:2503.04851  [pdf]

    q-bio.QM

    VenusMutHub: A systematic evaluation of protein mutation effect predictors on small-scale experimental data

    Authors: Liang Zhang, Hua Pang, Chenghao Zhang, Song Li, Yang Tan, Fan Jiang, Mingchen Li, Yuanxi Yu, Ziyi Zhou, Banghao Wu, Bingxin Zhou, Hao Liu, Pan Tan, Liang Hong

    Abstract: In protein engineering, while computational models are increasingly used to predict mutation effects, their evaluations primarily rely on high-throughput deep mutational scanning (DMS) experiments that use surrogate readouts, which may not adequately capture the complex biochemical properties of interest. Many proteins and their functions cannot be assessed through high-throughput methods due to t…

    Submitted 10 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

  16. arXiv:2502.18888  [pdf]

    q-bio.QM stat.ME

    Association of normalization, non-differentially expressed genes and data source with machine learning performance in intra-dataset or cross-dataset modelling of transcriptomic and clinical data

    Authors: Fei Deng, Lanjing Zhang

    Abstract: Cross-dataset testing is critical for examining a machine learning (ML) model's performance. However, most studies on modelling transcriptomic and clinical data only conducted intra-dataset testing. It is also unclear whether normalization and non-differentially expressed genes (NDEG) can improve cross-dataset modeling performance of ML. We thus aim to understand whether normalization, NDEG and data…

    Submitted 26 February, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

  17. arXiv:2502.15867  [pdf]

    q-bio.OT cs.AI

    Strategic priorities for transformative progress in advancing biology with proteomics and artificial intelligence

    Authors: Yingying Sun, Jun A, Zhiwei Liu, Rui Sun, Liujia Qian, Samuel H. Payne, Wout Bittremieux, Markus Ralser, Chen Li, Yi Chen, Zhen Dong, Yasset Perez-Riverol, Asif Khan, Chris Sander, Ruedi Aebersold, Juan Antonio Vizcaíno, Jonathan R Krieger, Jianhua Yao, Han Wen, Linfeng Zhang, Yunping Zhu, Yue Xuan, Benjamin Boyang Sun, Liang Qiao, Henning Hermjakob, et al. (37 additional authors not shown)

    Abstract: Artificial intelligence (AI) is transforming scientific research, including proteomics. Advances in mass spectrometry (MS)-based proteomics data quality, diversity, and scale, combined with groundbreaking AI techniques, are unlocking new challenges and opportunities in biological discovery. Here, we highlight key areas where AI is driving innovation, from data analysis to new biological insights.…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: 28 pages, 2 figures, perspective in AI proteomics

  18. arXiv:2502.09392  [pdf, ps, other]

    q-bio.OT

    Conditional Success of Adaptive Therapy: The Role of Treatment-Holiday Thresholds Revealed by Mathematical Modeling

    Authors: Lanfei Sun, Haifeng Zhang, Kai Kang, Xiaoxin Wang, Leyi Zhang, Yanan Cai, Changjing Zhuge, Lei Zhang

    Abstract: Adaptive therapy (AT) improves cancer treatment by controlling the competition between sensitive and resistant cells through treatment holidays. This study highlights the critical role of treatment-holiday thresholds in AT for tumors composed of drug-sensitive and resistant cells. Using a Lotka-Volterra model, the research compares AT with maximum tolerated dose therapy and intermittent therapy, s…

    Submitted 15 February, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    MSC Class: 92-10

  19. arXiv:2501.16409  [pdf]

    eess.IV cs.AI q-bio.NC

    Classification of Mild Cognitive Impairment Based on Dynamic Functional Connectivity Using Spatio-Temporal Transformer

    Authors: Jing Zhang, Yanjun Lyu, Xiaowei Yu, Lu Zhang, Chao Cao, Tong Chen, Minheng Chen, Yan Zhuang, Tianming Liu, Dajiang Zhu

    Abstract: Dynamic functional connectivity (dFC) using resting-state functional magnetic resonance imaging (rs-fMRI) is an advanced technique for capturing the dynamic changes of neural activities, and can be very useful in the studies of brain diseases such as Alzheimer's disease (AD). Yet, existing studies have not fully leveraged the sequential information embedded within dFC that can potentially provide…

    Submitted 27 January, 2025; originally announced January 2025.

  20. arXiv:2501.14248  [pdf]

    q-bio.QM q-bio.GN stat.CO stat.ME

    Normalization and selecting non-differentially expressed genes improve machine learning modelling of cross-platform transcriptomic data

    Authors: Fei Deng, Catherine H Feng, Nan Gao, Lanjing Zhang

    Abstract: Normalization is a critical step in quantitative analyses of biological processes. Recent works show that cross-platform integration and normalization enable machine learning (ML) training on RNA microarray and RNA-seq data, but no independent datasets were used in their studies. Therefore, it is unclear how to improve ML modelling performance on independent RNA array and RNA-seq based datasets. I…

    Submitted 24 January, 2025; originally announced January 2025.

    Comments: 35 pages, 5 figures, 2 tables

    Report number: https://media.sciltp.com/articles/2505000683/2505000683.pdf

    Journal ref: Trans. Artif. Intell. 2025, 1(1), 5

  21. arXiv:2501.08567  [pdf, other]

    q-bio.OT

    A new perspective on brain stimulation interventions: Optimal stochastic tracking control of brain network dynamics

    Authors: Kangli Dong, Siya Chen, Ying Dan, Lu Zhang, Xinyi Li, Wei Liang, Yue Zhao, Yu Sun

    Abstract: Network control theory (NCT) has recently been utilized in neuroscience to facilitate our understanding of brain stimulation effects. A particularly useful branch of NCT is optimal control, which focuses on applying theoretical and computational principles of control theory to design optimal strategies to achieve specific goals in neural processes. However, most existing research focuses on optima…

    Submitted 16 January, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

    Comments: Supplementary materials can be found at: https://zjueducn-my.sharepoint.com/:b:/g/personal/dongkl_zju_edu_cn/EbG817wduDFIgqqS3zt2d4gB0OZXM9wt-v18Xr41zXS1Fg?e=XOGNwG

  22. arXiv:2501.06271  [pdf, other]

    q-bio.QM cs.AI cs.CE

    Large Language Models for Bioinformatics

    Authors: Wei Ruan, Yanjun Lyu, Jing Zhang, Jiazhang Cai, Peng Shu, Yang Ge, Yao Lu, Shang Gao, Yue Wang, Peilong Wang, Lin Zhao, Tao Wang, Yufang Liu, Luyang Fang, Ziyu Liu, Zhengliang Liu, Yiwei Li, Zihao Wu, Junhao Chen, Hanqi Jiang, Yi Pan, Zhenyuan Yang, Jingyuan Chen, Shizhe Liang, Wei Zhang, et al. (30 additional authors not shown)

    Abstract: With the rapid advancements in large language model (LLM) technology and the emergence of bioinformatics-specific language models (BioLMs), there is a growing need for a comprehensive analysis of the current landscape, computational characteristics, and diverse applications. This survey aims to address this need by providing a thorough review of BioLMs, focusing on their evolution, classification,…

    Submitted 9 January, 2025; originally announced January 2025.

    Comments: 64 pages, 1 figure

  23. arXiv:2501.01462  [pdf]

    cs.LG cs.AI q-bio.GN

    Pan-infection Foundation Framework Enables Multiple Pathogen Prediction

    Authors: Lingrui Zhang, Haonan Wu, Nana Jin, Chenqing Zheng, Jize Xie, Qitai Cai, Jun Wang, Qin Cao, Xubin Zheng, Jiankun Wang, Lixin Cheng

    Abstract: Host-response-based diagnostics can improve the accuracy of diagnosing bacterial and viral infections, thereby reducing inappropriate antibiotic prescriptions. However, the existing cohorts with limited sample size and coarse infection types are unable to support the exploration of an accurate and generalizable diagnostic model. Here, we curate the largest infection host-response transcriptome da…

    Submitted 31 December, 2024; originally announced January 2025.

    Comments: 15 pages, 8 figures

  24. arXiv:2412.12229  [pdf]

    q-bio.NC

    Efficacy of Temporal Interference Electrical Stimulation for Spinal Cord Injury Rehabilitation: A Case Series

    Authors: Ruidong Cheng, Yuling Shao, Xi Li, Li Zhang, Zehao Sheng, Chenyang Li, Xu Xie, Huilin Mou, Weidong Chen, Shaomin Zhang, Yuchen Xu, Minmin Wang

    Abstract: Spinal cord injury (SCI) is a debilitating condition that often results in significant motor and sensory deficits, impacting the quality of life. Current rehabilitation methods, including physical therapy and electrical stimulation, offer variable outcomes and often require invasive procedures. Temporal interference (TI) stimulation has emerged as a novel, non-invasive neuromodulation technique ca…

    Submitted 16 December, 2024; originally announced December 2024.

    Comments: 19 pages, 1 table

  25. arXiv:2412.09661  [pdf]

    q-bio.QM cs.AI

    Language model driven: a PROTAC generation pipeline with dual constraints of structure and property

    Authors: Jinsong Shao, Qineng Gong, Zeyu Yin, Yu Chen, Yajie Hao, Lei Zhang, Linlin Jiang, Min Yao, Jinlong Li, Fubo Wang, Li Wang

    Abstract: The imperfect modeling of ternary complexes has limited the application of computer-aided drug discovery tools in PROTAC research and development. In this study, an AI-assisted approach for PROTAC molecule design pipeline named LM-PROTAC was developed, which stands for language model driven Proteolysis Targeting Chimera, by embedding a transformer-based generative model with dual constraints on st…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: 61 pages, 12 figures

    ACM Class: I.2.7; D.3.2

  26. arXiv:2412.02915  [pdf, other]

    cs.CL q-bio.GN

    Single-Cell Omics Arena: A Benchmark Study for Large Language Models on Cell Type Annotation Using Single-Cell Data

    Authors: Junhao Liu, Siwei Xu, Lei Zhang, Jing Zhang

    Abstract: Over the past decade, the revolution in single-cell sequencing has enabled the simultaneous molecular profiling of various modalities across thousands of individual cells, allowing scientists to investigate the diverse functions of complex tissues and uncover underlying disease mechanisms. Among all the analytical steps, assigning individual cells to specific types is fundamental for understanding…

    Submitted 3 December, 2024; originally announced December 2024.

  27. arXiv:2412.00651  [pdf, other]

    cs.CV q-bio.GN

    Towards Unified Molecule-Enhanced Pathology Image Representation Learning via Integrating Spatial Transcriptomics

    Authors: Minghao Han, Dingkang Yang, Jiabei Cheng, Xukun Zhang, Linhao Qu, Zizhi Chen, Lihua Zhang

    Abstract: Recent advancements in multimodal pre-training models have significantly advanced computational pathology. However, current approaches predominantly rely on visual-language models, which may impose limitations from a molecular perspective and lead to performance bottlenecks. Here, we introduce a Unified Molecule-enhanced Pathology Image REpresentation Learning framework (UMPIRE). UMPIRE aims to l…

    Submitted 30 November, 2024; originally announced December 2024.

    Comments: 21 pages, 11 figures, 7 tables

  28. arXiv:2411.11875  [pdf, other]

    cs.IR cs.AI cs.CL q-bio.BM

    Exploring Optimal Transport-Based Multi-Grained Alignments for Text-Molecule Retrieval

    Authors: Zijun Min, Bingshuai Liu, Liang Zhang, Jia Song, Jinsong Su, Song He, Xiaochen Bo

    Abstract: The field of bioinformatics has seen significant progress, making the cross-modal text-molecule retrieval task increasingly vital. This task focuses on accurately retrieving molecule structures based on textual descriptions, by effectively aligning textual descriptions and molecules to assist researchers in identifying suitable molecular candidates. However, many existing approaches overlook the d…

    Submitted 4 November, 2024; originally announced November 2024.

    Comments: BIBM 2024 Regular Paper

  29. arXiv:2410.23326  [pdf, other]

    q-bio.QM cs.LG

    MassSpecGym: A benchmark for the discovery and identification of molecules

    Authors: Roman Bushuiev, Anton Bushuiev, Niek F. de Jonge, Adamo Young, Fleming Kretschmer, Raman Samusevich, Janne Heirman, Fei Wang, Luke Zhang, Kai Dührkop, Marcus Ludwig, Nils A. Haupt, Apurva Kalia, Corinna Brungs, Robin Schmid, Russell Greiner, Bo Wang, David S. Wishart, Li-Ping Liu, Juho Rousu, Wout Bittremieux, Hannes Rost, Tytus D. Mak, Soha Hassoun, Florian Huber, et al. (5 additional authors not shown)

    Abstract: The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts. As a resu…

    Submitted 14 February, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

  30. arXiv:2410.20644  [pdf]

    q-bio.SC

    ZIF-90 treats fungal keratitis by promoting macrophage apoptosis and inhibiting inflammatory response

    Authors: Xueyun Fu, Jing Lin, Qian Wang, Lina Zhang, Ziyi Wang, Menghui Chi, Daohao Li, Guiqiu Zhao, Cui Li

    Abstract: Fungal keratitis is a severe vision-threatening corneal infection with a prognosis influenced by fungal virulence and the host's immune defense mechanisms. The immune system, through its regulation of the inflammatory response, ensures cells and tissues can effectively activate defense mechanisms in response to infection and injury. However, there is still a lack of effective drugs that attenuate…

    Submitted 27 October, 2024; originally announced October 2024.

  31. arXiv:2410.00920  [pdf]

    q-bio.TO

    Zeolitic Imidazolate Framework-8 offers an anti-inflammatory and antifungal method in the treatment of Aspergillus fungus keratitis in vitro and in vivo

    Authors: Xueyun Fu, Xue Tian, Jing Lin, Qian Wang, Lingwen Gu, Ziyi Wang, Menghui Chi, Bing Yu, Zhuhui Feng, Wenyao Liu, Lina Zhang, Cui Li, Guiqiu Zhao

    Abstract: Background: Fungal keratitis is a serious blinding eye disease. Traditional drugs used to treat fungal keratitis commonly have the disadvantages of low bioavailability, poor dispersion, and limited permeability. Purpose: To develop a new method for the treatment of fungal keratitis with improved bioavailability, dispersion, and permeability. …

    Submitted 29 October, 2024; v1 submitted 29 September, 2024; originally announced October 2024.

    Comments: 25 pages, 8 figures; this paper has been received by the International Journal of Nanomedicine

  32. arXiv:2408.14202  [pdf, other]

    q-bio.PE math.CO

    Bounding the number of reticulation events for displaying multiple trees in a phylogenetic network

    Authors: Yufeng Wu, Louxin Zhang

    Abstract: Reconstructing a parsimonious phylogenetic network that displays multiple phylogenetic trees is an important problem in the theory of phylogenetics, where the complexity of the inferred networks is measured by reticulation numbers. The reticulation number for a set of trees is defined as the minimum number of reticulations in a phylogenetic network that displays those trees. A mathematical problem is…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 9 figures, 18 pages

    MSC Class: 5C30; ACM Class: J.3

  33. arXiv:2408.12413  [pdf, other]

    q-bio.BM cs.AI

    Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures

    Authors: Ce Liu, Jun Wang, Zhiqiang Cai, Yingxu Wang, Huizhen Kuang, Kaihui Cheng, Liwei Zhang, Qingkun Su, Yining Tang, Fenglei Cao, Limei Han, Siyu Zhu, Yuan Qi

    Abstract: Despite significant progress in static protein structure collection and prediction, the dynamic behavior of proteins, one of their most vital characteristics, has been largely overlooked in prior research. This oversight can be attributed to the limited availability, diversity, and heterogeneity of dynamic protein datasets. To address this gap, we propose to enhance existing prestigious static 3D…

    Submitted 18 September, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  34. arXiv:2408.06150  [pdf, other]

    cs.CL physics.chem-ph q-bio.BM

    LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

    Authors: Tianhao Yu, Cai Yao, Zhuorui Sun, Feng Shi, Lin Zhang, Kangjie Lyu, Xuan Bai, Andong Liu, Xicheng Zhang, Jiali Zou, Wenshou Wang, Chris Lai, Kai Wang

    Abstract: In this study, we generate and maintain a database of 10 million virtual lipids through METiS's in-house de novo lipid generation algorithms and lipid virtual screening techniques. These virtual lipids serve as a corpus for pre-training, lipid representation learning, and downstream task knowledge transfer, culminating in state-of-the-art LNP property prediction performance. We propose LipidBERT,…

    Submitted 3 May, 2025; v1 submitted 12 August, 2024; originally announced August 2024.

  35. arXiv:2407.19852  [pdf]

    quant-ph cs.LG q-bio.BM

    Quantum Long Short-Term Memory for Drug Discovery

    Authors: Liang Zhang, Yin Xu, Mohan Wu, Liang Wang, Hua Xu

    Abstract: Quantum computing combined with machine learning (ML) is an extremely promising research area, with numerous studies demonstrating that quantum machine learning (QML) is expected to solve scientific problems more effectively than classical ML. In this work, we successfully apply QML to drug discovery, showing that QML can significantly improve model performance and achieve faster convergence compa…

    Submitted 29 July, 2024; originally announced July 2024.

  36. arXiv:2407.09540  [pdf, other]

    eess.IV cs.CE cs.CV cs.LG q-bio.TO

    Prompting Whole Slide Image Based Genetic Biomarker Prediction

    Authors: Ling Zhang, Boxiang Yun, Xingran Xie, Qingli Li, Xinxing Li, Yan Wang

    Abstract: Prediction of genetic biomarkers, e.g., microsatellite instability and BRAF in colorectal cancer is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, a…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures, MICCAI2024

  37. arXiv:2406.15514  [pdf, other]

    physics.soc-ph q-bio.PE stat.ME

    How big does a population need to be before demographers can ignore individual-level randomness in demographic events?

    Authors: John Bryant, Tahu Kukutai, Junni L. Zhang

    Abstract: When studying a national-level population, demographers can safely ignore the effect of individual-level randomness on age-sex structure. When studying a single community, or group of communities, however, the potential importance of individual-level randomness is less clear. We seek to measure the effect of individual-level randomness in births and deaths on standard summary indicators of age-sex…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 28 pages, 8 figures, 3 tables

    MSC Class: 91-XX

  38. arXiv:2406.13133  [pdf, other]

    cs.CL cs.LG q-bio.GN

    PathoLM: Identifying pathogenicity from the DNA sequence through the Genome Foundation Model

    Authors: Sajib Acharjee Dip, Uddip Acharjee Shuvo, Tran Chau, Haoqiu Song, Petra Choi, Xuan Wang, Liqing Zhang

    Abstract: Pathogen identification is pivotal in diagnosing, treating, and preventing diseases, crucial for controlling infections and safeguarding public health. Traditional alignment-based methods, though widely used, are computationally intense and reliant on extensive reference databases, often failing to detect novel pathogens due to their low sensitivity and specificity. Similarly, conventional machine…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures

  39. arXiv:2406.09817  [pdf, other]

    physics.chem-ph q-bio.BM

    Efficient and Precise Force Field Optimization for Biomolecules Using DPA-2

    Authors: Junhan Chang, Duo Zhang, Yuqing Deng, Hongrui Lin, Zhirong Liu, Linfeng Zhang, Hang Zheng, Xinyan Wang

    Abstract: Molecular simulations are essential tools in computational chemistry, enabling the prediction and understanding of molecular interactions and thermodynamic properties of biomolecules. However, traditional force fields face significant challenges in accurately representing novel molecules and complex chemical environments due to the labor-intensive process of manually setting optimization parameter…

    Submitted 14 June, 2024; originally announced June 2024.

  40. arXiv:2406.00085  [pdf, other]

    eess.IV cs.LG q-bio.NC

    Augmentation-based Unsupervised Cross-Domain Functional MRI Adaptation for Major Depressive Disorder Identification

    Authors: Yunling Ma, Chaojun Zhang, Xiaochuan Wang, Qianqian Wang, Liang Cao, Limei Zhang, Mingxia Liu

    Abstract: Major depressive disorder (MDD) is a common mental disorder that typically affects a person's mood, cognition, behavior, and physical health. Resting-state functional magnetic resonance imaging (rs-fMRI) data are widely used for computer-aided diagnosis of MDD. While multi-site fMRI data can provide more data for training reliable diagnostic models, significant cross-site data heterogeneity would…

    Submitted 6 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

  41. arXiv:2405.19420  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Learning Human-Aligned Representations with Contrastive Learning and Generative Similarity

    Authors: Raja Marjieh, Sreejan Kumar, Declan Campbell, Liyi Zhang, Gianluca Bencomo, Jake Snell, Thomas L. Griffiths

    Abstract: Humans rely on effective representations to learn from few examples and abstract useful information from sensory data. Inducing such representations in machine learning models has been shown to improve their performance on various benchmarks such as few-shot learning and robustness. However, finding effective training procedures to achieve that goal can be challenging as psychologically rich train…

    Submitted 31 January, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

  42. arXiv:2405.11769  [pdf, other]

    q-bio.BM cs.LG physics.bio-ph

    Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction

    Authors: Eric Alcaide, Zhifeng Gao, Guolin Ke, Yaqi Li, Linfeng Zhang, Hang Zheng, Gengmo Zhou

    Abstract: In recent years, machine learning (ML) methods have emerged as promising alternatives for molecular docking, offering the potential for high accuracy without incurring prohibitive computational costs. However, recent studies have indicated that these ML models may overfit to quantitative metrics while neglecting the physical constraints inherent in the problem. In this work, we present Uni-Mol Doc…

    Submitted 20 May, 2024; originally announced May 2024.

  43. arXiv:2405.07110  [pdf, other]

    q-bio.PE cs.DS math.CO

    A Vector Representation for Phylogenetic Trees

    Authors: Cedric Chauve, Caroline Colijn, Louxin Zhang

    Abstract: Good representations for phylogenetic trees and networks are important for optimizing storage efficiency and implementation of scalable methods for the inference and analysis of evolutionary trees for genes, genomes and species. We introduce a new representation for rooted phylogenetic trees that encodes a binary tree on n taxa as a vector of length 2n in which each taxon appears exactly twice. Us…

    Submitted 11 May, 2024; originally announced May 2024.

    MSC Class: 05C05; 92B10 ACM Class: G.2.2; J.3

  44. arXiv:2404.06691  [pdf]

    q-bio.BM cs.LG cs.NE

    Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation

    Authors: Ningfeng Liu, Jie Yu, Siyu Xiu, Xinfang Zhao, Siyu Lin, Bo Qiang, Ruqiu Zheng, Hongwei Jin, Liangren Zhang, Zhenming Liu

    Abstract: Molecular generation, an essential method for identifying new drug structures, has been supported by advancements in machine learning and computational technology. However, challenges remain in multi-objective generation, model adaptability, and practical application in drug discovery. In this study, we developed a versatile 'plug-in' molecular generation model that incorporates multiple objective…

    Submitted 9 April, 2024; originally announced April 2024.

  45. arXiv:2402.02004  [pdf]

    q-bio.BM

    Enhancing the efficiency of protein language models with minimal wet-lab data through few-shot learning

    Authors: Ziyi Zhou, Liang Zhang, Yuanxi Yu, Mingchen Li, Liang Hong, Pan Tan

    Abstract: Accurately modeling the protein fitness landscapes holds great importance for protein engineering. Recently, due to their capacity and representation ability, pre-trained protein language models have achieved state-of-the-art performance in predicting protein fitness without experimental data. However, their predictions are limited in accuracy as well as interpretability. Furthermore, such deep le…

    Submitted 2 February, 2024; originally announced February 2024.

  46. arXiv:2311.18218  [pdf, other]

    q-bio.PE

    Computing the Bounds of the Number of Reticulations in a Tree-Child Network That Displays a Set of Trees

    Authors: Yufeng Wu, Louxin Zhang

    Abstract: Phylogenetic network is an evolutionary model that uses a rooted directed acyclic graph (instead of a tree) to model an evolutionary history of species in which reticulate events (e.g., hybrid speciation or horizontal gene transfer) occurred. Tree-child network is a kind of phylogenetic network with structural constraints. Existing approaches for tree-child network reconstruction can be slow for l…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 7 figures, 1 table, 33 pages

    MSC Class: 05C30 ACM Class: J.3

  47. arXiv:2311.01276  [pdf, other]

    cs.LG q-bio.QM

    Neural Atoms: Propagating Long-range Interaction in Molecular Graphs through Efficient Communication Channel

    Authors: Xuan Li, Zhanke Zhou, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han

    Abstract: Graph Neural Networks (GNNs) have been widely adopted for drug discovery with molecular graphs. Nevertheless, current GNNs mainly excel in leveraging short-range interactions (SRI) but struggle to capture long-range interactions (LRI), both of which are crucial for determining molecular properties. To tackle this issue, we propose a method to abstract the collective information of atomic groups in…

    Submitted 31 March, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  48. arXiv:2309.07701  [pdf]

    cs.HC eess.SP q-bio.NC

    Semantic reconstruction of continuous language from MEG signals

    Authors: Bo Wang, Xiran Xu, Longxiang Zhang, Boda Xiao, Xihong Wu, Jing Chen

    Abstract: Decoding language from neural signals holds considerable theoretical and practical importance. Previous research has indicated the feasibility of decoding text or speech from invasive neural signals. However, when using non-invasive neural signals, significant challenges are encountered due to their low quality. In this study, we proposed a data-driven approach for decoding semantic of language fr…

    Submitted 14 September, 2023; originally announced September 2023.

  49. arXiv:2309.03907  [pdf, other]

    q-bio.BM cs.LG

    DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

    Authors: Youwei Liang, Ruiyi Zhang, Li Zhang, Pengtao Xie

    Abstract: A ChatGPT-like system for drug compounds could be a game-changer in pharmaceutical research, accelerating drug discovery, enhancing our understanding of structure-activity relationships, guiding lead optimization, aiding drug repurposing, reducing the failure rate, and streamlining clinical trials. In this work, we make an initial attempt towards enabling ChatGPT-like capabilities on drug molecule…

    Submitted 18 May, 2023; originally announced September 2023.

  50. Deep neural network improves the estimation of polygenic risk scores for breast cancer

    Authors: Adrien Badré, Li Zhang, Wellington Muchero, Justin C. Reynolds, Chongle Pan

    Abstract: Polygenic risk scores (PRS) estimate the genetic risk of an individual for a complex disease based on many genetic variants across the whole genome. In this study, we compared a series of computational models for estimation of breast cancer PRS. A deep neural network (DNN) was found to outperform alternative machine learning techniques and established statistical algorithms, including BLUP, BayesA…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 28 pages, 7 figures, 2 Tables

    Journal ref: A. Badré, L. Zhang, W. Muchero, J.C. Reynolds, C. Pan (2021). Deep neural network improves the estimation of polygenic risk scores for breast cancer. Journal of Human Genetics, 66(4), 359-369