Showing 1–50 of 240 results for author: Wang, Z

Searching in archive q-bio.
  1. arXiv:2507.03044  [pdf]

    physics.geo-ph q-bio.MN

    Positive effects and mechanisms of simulated lunar low-magnetic environment on earthworm-improved lunar soil simulant as a cultivation substrate

    Authors: Sihan Hou, Zhongfu Wang, Yuting Zhu, Hong Liu, Jiajie Feng

    Abstract: With the advancement of crewed deep-space missions, Bioregenerative Life Support Systems (BLSS) for lunar bases face stresses from lunar environmental factors. While microgravity and radiation are well-studied, the low-magnetic field's effects remain unclear. Earthworms ("soil scavengers") improve lunar soil simulant and degrade plant waste, as shown in our prior studies. We tested earthworms in l…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: 28 pages, 6 figures

  2. arXiv:2507.02379  [pdf]

    cs.AI q-bio.BM

    An AI-native experimental laboratory for autonomous biomolecular engineering

    Authors: Mingyu Wu, Zhaoguo Wang, Jiabin Wang, Zhiyuan Dong, Jingkai Yang, Qingting Li, Tianyu Huang, Lei Zhao, Mingqiang Li, Fei Wang, Chunhai Fan, Haibo Chen

    Abstract: Autonomous scientific research, capable of independently conducting complex experiments and serving non-specialists, represents a long-held aspiration. Achieving it requires a fundamental paradigm shift driven by artificial intelligence (AI). While autonomous experimental systems are emerging, they remain confined to areas featuring singular objectives and well-defined, simple experimental workflo…

    Submitted 3 July, 2025; originally announced July 2025.

  3. arXiv:2507.02064  [pdf, ps, other]

    q-bio.NC

    REMI: Reconstructing Episodic Memory During Intrinsic Path Planning

    Authors: Zhaoze Wang, Genela Morris, Dori Derdikman, Pratik Chaudhari, Vijay Balasubramanian

    Abstract: Grid cells in the medial entorhinal cortex (MEC) are believed to path integrate speed and direction signals to activate at triangular grids of locations in an environment, thus implementing a population code for position. In parallel, place cells in the hippocampus (HC) fire at spatially confined locations, with selectivity tuned not only to allocentric position but also to environmental contexts,…

    Submitted 2 July, 2025; originally announced July 2025.

  4. arXiv:2507.01485  [pdf, ps, other]

    cs.RO cs.AI cs.MA q-bio.QM

    BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments

    Authors: Yibo Qiu, Zan Huang, Zhiyu Wang, Handi Liu, Yiling Qiao, Yifeng Hu, Shu'ang Sun, Hangke Peng, Ronald X Xu, Mingzhai Sun

    Abstract: Large language models (LLMs) and vision-language models (VLMs) have the potential to transform biological research by enabling autonomous experimentation. Yet, their application remains constrained by rigid protocol design, limited adaptability to dynamic lab conditions, inadequate error handling, and high operational complexity. Here we introduce BioMARS (Biological Multi-Agent Robotic System), a…

    Submitted 2 July, 2025; originally announced July 2025.

  5. arXiv:2506.21000  [pdf]

    q-bio.NC

    Modulating task outcome value to mitigate real-world procrastination via noninvasive brain stimulation

    Authors: Zhiyi Chen, Zhilin Ren, Wei Li, ZhenZhen Huo, ZhuangZheng Wang, Ye Liu, Bowen Hu, Wanting Chen, Ting Xu, Artemiy Leonov, Chenyan Zhang, Bernhard Hommel, Tingyong Feng

    Abstract: Procrastination represents one of the most prevalent behavioral problems affecting individual health and societal productivity. Although it is often conceptualized as a form of self-control failure, its underlying neurocognitive mechanisms are poorly understood. A leading model posits that procrastination arises from imbalanced competing motivations: the avoidance of negative task aversiveness and…

    Submitted 26 June, 2025; originally announced June 2025.

  6. arXiv:2506.19266  [pdf]

    q-bio.NC cs.CV eess.IV

    Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans

    Authors: Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang

    Abstract: The organization and connectivity of the arcuate fasciculus (AF) in nonhuman primates remain contentious, especially concerning how its anatomy diverges from that of humans. Here, we combined cross-scale single-neuron tracing - using viral-based genetic labeling and fluorescence micro-optical sectioning tomography in macaques (n = 4; age 3 - 11 years) - with whole-brain tractography from 11.7T dif…

    Submitted 2 July, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

    Comments: 34 pages, 6 figures

  7. arXiv:2506.11634  [pdf]

    q-bio.NC

    Differences in Neurovascular Coupling in Patients with Major Depressive Disorder: Evidence from Simultaneous Resting-State EEG-fNIRS

    Authors: Feng Yan, Xiaobin Wang, Yao Zhao, Shuyi Yang, Zhiren Wang

    Abstract: Neurovascular coupling (NVC) refers to the process by which local neural activity, through energy consumption, induces changes in regional cerebral blood flow to meet the metabolic demands of neurons. Event-related studies have shown that the hemodynamic response typically lags behind neural activation by 4-6 seconds. However, little is known about how NVC is altered in patients with major depress…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: 19 pages, 9 figures

  8. arXiv:2506.10271  [pdf, ps, other]

    q-bio.QM cs.LG q-bio.GN

    Predicting function of evolutionarily implausible DNA sequences

    Authors: Shiyu Jiang, Xuyin Liu, Zitong Jerry Wang

    Abstract: Genomic language models (gLMs) show potential for generating novel, functional DNA sequences for synthetic biology, but doing so requires them to learn not just evolutionary plausibility, but also sequence-to-function relationships. We introduce a set of prediction tasks called Nullsettes, which assesses a model's ability to predict loss-of-function mutations created by translocating key control e…

    Submitted 4 July, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

    Comments: 13 pages, 6 figures, accepted to ICML 2025 Generative AI and Biology Workshop

  9. arXiv:2506.07459  [pdf, ps, other]

    cs.LG q-bio.QM

    ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

    Authors: Ziwen Wang, Jiajun Fan, Ruihan Guo, Thao Nguyen, Heng Ji, Ge Liu

    Abstract: Protein generative models have shown remarkable promise in protein design but still face limitations in success rate, due to the scarcity of high-quality protein datasets for supervised pretraining. We present ProteinZero, a novel framework that enables scalable, automated, and continuous self-improvement of the inverse folding model through online reinforcement learning. To achieve computationall…

    Submitted 10 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  10. arXiv:2506.04303  [pdf]

    q-bio.GN cs.AI cs.LG

    Knowledge-guided Contextual Gene Set Analysis Using Large Language Models

    Authors: Zhizheng Wang, Chi-Ping Day, Chih-Hsuan Wei, Qiao Jin, Robert Leaman, Yifan Yang, Shubo Tian, Aodong Qiu, Yin Fang, Qingqing Zhu, Xinghua Lu, Zhiyong Lu

    Abstract: Gene set analysis (GSA) is a foundational approach for interpreting genomic data of diseases by linking genes to biological processes. However, conventional GSA methods overlook clinical context of the analyses, often generating long lists of enriched pathways with redundant, nonspecific, or irrelevant results. Interpreting these requires extensive, ad-hoc manual effort, reducing both reliability…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: 56 pages, 9 figures, 1 table

  11. arXiv:2506.00410  [pdf, ps, other]

    cs.LG q-bio.GN stat.ML

    JojoSCL: Shrinkage Contrastive Learning for single-cell RNA sequence Clustering

    Authors: Ziwen Wang

    Abstract: Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular processes by enabling gene expression analysis at the individual cell level. Clustering allows for the identification of cell types and the further discovery of intrinsic patterns in single-cell data. However, the high dimensionality and sparsity of scRNA-seq data continue to challenge existing clustering model…

    Submitted 31 May, 2025; originally announced June 2025.

  12. arXiv:2505.11823  [pdf, ps, other]

    cs.LG math.OC q-bio.QM

    Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action

    Authors: Yuhao Sun, Zhenyi Zhang, Zihan Wang, Tiejun Li, Peijie Zhou

    Abstract: Recovering the dynamics from a few snapshots of a high-dimensional system is a challenging task in statistical physics and machine learning, with important applications in computational biology. Many algorithms have been developed to tackle this problem, based on frameworks such as optimal transport and the Schrödinger bridge. A notable recent framework is Regularized Unbalanced Optimal Transport…

    Submitted 17 May, 2025; originally announced May 2025.

  13. arXiv:2505.11197  [pdf, ps, other]

    cs.LG math.OC q-bio.QM

    Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge

    Authors: Zhenyi Zhang, Zihan Wang, Yuhao Sun, Tiejun Li, Peijie Zhou

    Abstract: Modeling the dynamics from sparsely time-resolved snapshot data is crucial for understanding complex cellular processes and behavior. Existing methods leverage optimal transport, Schrödinger bridge theory, or their variants to simultaneously infer stochastic, unbalanced dynamics from snapshot data. However, these approaches remain limited in their ability to account for cell-cell interactions. Thi…

    Submitted 1 June, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  14. arXiv:2505.09643  [pdf]

    q-bio.NC cs.LG

    A Computational Approach to Epilepsy Treatment: An AI-optimized Global Natural Product Prescription System

    Authors: Zhixuan Wang

    Abstract: Epilepsy is a prevalent neurological disease with millions of patients worldwide. Many patients have turned to alternative medicine due to the limited efficacy and side effects of conventional antiepileptic drugs. In this study, we developed a computational approach to optimize herbal epilepsy treatment through AI-driven analysis of global natural products and statistically validated randomized co…

    Submitted 10 May, 2025; originally announced May 2025.

  15. arXiv:2505.08581  [pdf, other]

    cs.CV eess.IV q-bio.TO

    ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking

    Authors: Haofeng Liu, Mingqi Gao, Xuxiao Luo, Ziyue Wang, Guanyi Qin, Junde Wu, Yueming Jin

    Abstract: Surgical scene segmentation is critical in computer-assisted surgery and is vital for enhancing surgical quality and patient outcomes. Recently, referring surgical segmentation is emerging, given its advantage of providing surgeons with an interactive experience to segment the target object. However, existing methods are limited by low efficiency and short-term tracking, hindering their applicabil…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: Early accepted by MICCAI 2025

  16. arXiv:2505.08254  [pdf, other]

    q-bio.NC math.PR

    Efficient, simulation-free estimators of firing rates with Markovian surrogates

    Authors: Zhongyi Wang, Louis Tao, Zhuo-Cheng Xiao

    Abstract: Spiking neural networks (SNNs) are powerful mathematical models that integrate the biological details of neural systems, but their complexity often makes them computationally expensive and analytically untractable. The firing rate of an SNN is a crucial first-order statistic to characterize network activity. However, estimating firing rates analytically from even simplified SNN models is challengi…

    Submitted 14 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: 9 pages, 5 figures

  17. arXiv:2505.05874  [pdf, ps, other]

    cs.LG physics.chem-ph q-bio.BM

    A 3D pocket-aware and evolutionary conserved interaction guided diffusion model for molecular optimization

    Authors: Anjie Qiao, Hao Zhang, Qianmu Yuan, Qirui Deng, Jingtian Su, Weifeng Huang, Huihao Zhou, Guo-Bo Li, Zhen Wang, Jinping Lei

    Abstract: Generating molecules that bind to specific protein targets via diffusion models has shown good promise for structure-based drug design and molecule optimization. Especially, the diffusion models with binding interaction guidance enables molecule generation with high affinity through forming favorable interaction within protein pocket. However, the generated molecules may not form interactions with…

    Submitted 9 May, 2025; originally announced May 2025.

  18. arXiv:2505.05736  [pdf]

    q-bio.QM cs.CL cs.CV cs.LG

    Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications

    Authors: Da Wu, Zhanliang Wang, Quan Nguyen, Zhuoran Xu, Kai Wang

    Abstract: The scarcity of high-quality multimodal biomedical data limits the ability to effectively fine-tune pretrained Large Language Models (LLMs) for specialized biomedical tasks. To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through pref…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: First Draft

  19. arXiv:2504.18559  [pdf]

    physics.bio-ph cond-mat.soft physics.chem-ph q-bio.BM

    Molecular Determinants of Orthosteric-allosteric Dual Inhibition of PfHT1 by Computational Assessment

    Authors: Decheng Kong, Jinlong Ren, Zhuang Li, Guangcun Shan, Zhongjian Wang, Ruiqin Zhang, Wei Huang, Kunpeng Dou

    Abstract: To overcome antimalarial drug resistance, carbohydrate derivatives as selective PfHT1 inhibitor have been suggested in recent experimental work with orthosteric and allosteric dual binding pockets. Inspired by this promising therapeutic strategy, herein, molecular dynamics simulations are performed to investigate the molecular determinants of co-administration on orthosteric and allosteric inhibit…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 21 pages, 15 figures, FOP revised

  20. arXiv:2504.16504  [pdf]

    q-bio.NC cs.HC

    Intelligent Depression Prevention via LLM-Based Dialogue Analysis: Overcoming the Limitations of Scale-Dependent Diagnosis through Precise Emotional Pattern Recognition

    Authors: Zhenguang Zhong, Zhixuan Wang

    Abstract: Existing depression screening predominantly relies on standardized questionnaires (e.g., PHQ-9, BDI), which suffer from high misdiagnosis rates (18-34% in clinical studies) due to their static, symptom-counting nature and susceptibility to patient recall bias. This paper presents an AI-powered depression prevention system that leverages large language models (LLMs) to analyze real-time conversatio…

    Submitted 23 April, 2025; originally announced April 2025.

  21. arXiv:2504.12351  [pdf, other]

    cs.GR cs.AI eess.IV q-bio.TO

    Prototype-Guided Diffusion for Digital Pathology: Achieving Foundation Model Performance with Minimal Clinical Data

    Authors: Ekaterina Redekop, Mara Pleasure, Vedrana Ivezic, Zichen Wang, Kimberly Flores, Anthony Sisk, William Speier, Corey Arnold

    Abstract: Foundation models in digital pathology use massive datasets to learn useful compact feature representations of complex histology images. However, there is limited transparency into what drives the correlation between dataset size and performance, raising the question of whether simply adding more data to increase performance is always necessary. In this study, we propose a prototype-guided diffusi…

    Submitted 15 April, 2025; originally announced April 2025.

  22. arXiv:2504.10525  [pdf]

    q-bio.QM cs.CL cs.IR

    BioChemInsight: An Open-Source Toolkit for Automated Identification and Recognition of Optical Chemical Structures and Activity Data in Scientific Publications

    Authors: Zhe Wang, Fangtian Fu, Wei Zhang, Lige Yan, Yan Meng, Jianping Wu, Hui Wu, Gang Xu, Si Chen

    Abstract: Automated extraction of chemical structures and their bioactivity data is crucial for accelerating drug discovery and enabling data-driven pharmaceutical research. Existing optical chemical structure recognition (OCSR) tools fail to autonomously associate molecular structures with their bioactivity profiles, creating a critical bottleneck in structure-activity relationship (SAR) analysis. Here, we…

    Submitted 12 April, 2025; originally announced April 2025.

    Comments: 20 pages, 7 figures

  23. arXiv:2504.08201  [pdf, other]

    q-bio.NC cs.AI cs.LG

    Neural Encoding and Decoding at Scale

    Authors: Yizi Zhang, Yanchen Wang, Mehdi Azabou, Alexandre Andre, Zixuan Wang, Hanrui Lyu, The International Brain Laboratory, Eva Dyer, Liam Paninski, Cole Hurwitz

    Abstract: Recent work has demonstrated that large-scale, multi-animal models are powerful tools for characterizing the relationship between neural activity and behavior. Current large-scale approaches, however, focus exclusively on either predicting neural activity from behavior (encoding) or predicting behavior from neural activity (decoding), limiting their ability to capture the bidirectional relationshi…

    Submitted 24 May, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

  24. arXiv:2504.04647  [pdf, other]

    cs.LG q-bio.QM

    Sub-Clustering for Class Distance Recalculation in Long-Tailed Drug Classification

    Authors: Yujia Su, Xinjie Li, Lionel Z. Wang

    Abstract: In the real world, long-tailed data distributions are prevalent, making it challenging for models to effectively learn and classify tail classes. However, we discover that in the field of drug chemistry, certain tail classes exhibit higher identifiability during training due to their unique molecular structural features, a finding that significantly contrasts with the conventional understanding th…

    Submitted 6 April, 2025; originally announced April 2025.

  25. arXiv:2504.02698  [pdf, other]

    cs.LG cs.AI q-bio.QM

    SCMPPI: Supervised Contrastive Multimodal Framework for Predicting Protein-Protein Interactions

    Authors: Shengrui XU, Tianchi Lu, Zikun Wang, Jixiu Zhai

    Abstract: Protein-protein interaction (PPI) prediction plays a pivotal role in deciphering cellular functions and disease mechanisms. To address the limitations of traditional experimental methods and existing computational approaches in cross-modal feature fusion and false-negative suppression, we propose SCMPPI-a novel supervised contrastive multimodal framework. By effectively integrating sequence-based…

    Submitted 27 April, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

    Comments: 20 pages, 9 figures, conference

    MSC Class: 92C40; 68T07 ACM Class: I.2.6; J.3

  26. arXiv:2504.00334  [pdf]

    q-bio.QM

    Pharmacokinetic characteristics of Jinhong tablets in normal, chronic superficial gastritis and intestinal microbial disorder rats

    Authors: Tingyu Zhang, Jian Feng, Xia Gao, Xialin Chen, Hongyu Peng, Xiaoxue Fan, Xin Meng, Mingke Yin, Zhenzhong Wang, Bo Zhang, Liang Cao

    Abstract: Jinhong tablet (JHT), a traditional Chinese medicine made from four herbs, effectively treats chronic superficial gastritis (CSG) by soothing the liver, relieving depression, regulating qi, and promoting blood circulation. However, its pharmacokinetics are underexplored. This study investigates JHT's pharmacokinetics in normal rats and its differences in normal, CSG, and intestinal microbial disor…

    Submitted 31 March, 2025; originally announced April 2025.

  27. arXiv:2503.20179  [pdf, other]

    cs.CL cs.IR q-bio.QM

    ProtoBERT-LoRA: Parameter-Efficient Prototypical Finetuning for Immunotherapy Study Identification

    Authors: Shijia Zhang, Xiyu Ding, Kai Ding, Jacob Zhang, Kevin Galinsky, Mengrui Wang, Ryan P. Mayers, Zheyu Wang, Hadi Kharrazi

    Abstract: Identifying immune checkpoint inhibitor (ICI) studies in genomic repositories like Gene Expression Omnibus (GEO) is vital for cancer research yet remains challenging due to semantic ambiguity, extreme class imbalance, and limited labeled data in low-resource settings. We present ProtoBERT-LoRA, a hybrid framework that combines PubMedBERT with prototypical networks and Low-Rank Adaptation (LoRA) fo…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Submitted to AMIA 2025 Annual Symposium

  28. arXiv:2503.17007  [pdf, other]

    q-bio.BM

    RiboFlow: Conditional De Novo RNA Sequence-Structure Co-Design via Synergistic Flow Matching

    Authors: Runze Ma, Zhongyue Zhang, Zichen Wang, Chenqing Hua, Zhuomin Zhou, Fenglei Cao, Jiahua Rao, Shuangjia Zheng

    Abstract: Ribonucleic acid (RNA) binds to molecules to achieve specific biological functions. While generative models are advancing biomolecule design, existing methods for designing RNA that target specific ligands face limitations in capturing RNA's conformational flexibility, ensuring structural validity, and overcoming data scarcity. To address these challenges, we introduce RiboFlow, a synergistic flow…

    Submitted 23 March, 2025; v1 submitted 21 March, 2025; originally announced March 2025.

  29. arXiv:2503.14512  [pdf]

    q-bio.QM cs.LG stat.AP stat.ML

    Machine learning algorithms to predict stroke in China based on causal inference of time series analysis

    Authors: Qizhi Zheng, Ayang Zhao, Xinzhu Wang, Yanhong Bai, Zikun Wang, Xiuying Wang, Xianzhang Zeng, Guanghui Dong

    Abstract: Participants: This study employed a combination of Vector Autoregression (VAR) model and Graph Neural Networks (GNN) to systematically construct dynamic causal inference. Multiple classic classification algorithms were compared, including Random Forest, Logistic Regression, XGBoost, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Gradient Boosting, and Multi Layer Perceptron (MLP). The SMO…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 17 pages

  30. arXiv:2503.12286  [pdf]

    cs.CL cs.AI q-bio.GN q-bio.QM

    Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

    Authors: Da Wu, Zhanliang Wang, Quan Nguyen, Kai Wang

    Abstract: Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. These studies typically use Human Phenotype Ontology (HPO) terms to prompt foundation models like GPT and LLaMA to predict candidate genes. However, in real-world settings, foundation models are not optimized for domain-specific tasks like clinical diagnosis, yet…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: 31 pages, 3 figures

  31. arXiv:2503.08179  [pdf, other]

    q-bio.BM cs.AI

    ProtTeX: Structure-In-Context Reasoning and Editing of Proteins with Large Language Models

    Authors: Zicheng Ma, Chuanliu Fan, Zhicong Wang, Zhenyu Chen, Xiaohan Lin, Yanheng Li, Shihao Feng, Jun Zhang, Ziqiang Cao, Yi Qin Gao

    Abstract: Large language models have made remarkable progress in the field of molecular science, particularly in understanding and generating functional small molecules. This success is largely attributed to the effectiveness of molecular tokenization strategies. In protein science, the amino acid sequence serves as the sole tokenizer for LLMs. However, many fundamental challenges in protein science are inh…

    Submitted 13 March, 2025; v1 submitted 11 March, 2025; originally announced March 2025.

    Comments: 26 pages, 9 figures

  32. arXiv:2503.07203  [pdf]

    q-bio.MN

    POINT: a web-based platform for pharmacological investigation enhanced by multi-omics networks and knowledge graphs

    Authors: Zihao He, Liu Liu, Dongchen Han, Kai Gao, Lei Dong, Dechao Bu, Peipei Huo, Zhihao Wang, Wenxin Deng, Jingjia Liu, Jin-cheng Guo, Yi Zhao, Yang Wu

    Abstract: Network pharmacology (NP) explores pharmacological mechanisms through biological networks. Multi-omics data enable multi-layer network construction under diverse conditions, requiring integration into NP analyses. We developed POINT, a novel NP platform enhanced by multi-omics biological networks, advanced algorithms, and knowledge graphs (KGs) featuring network-based and KG-based analytical funct…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 45 pages, 7 figures

  33. arXiv:2503.04490  [pdf, ps, other]

    cs.CL q-bio.GN

    Large Language Models in Bioinformatics: A Survey

    Authors: Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li

    Abstract: Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey provides a systematic review of recent advancements, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. Meanwhile, we also discuss several key challenges, including data scarci…

    Submitted 31 May, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: Accepted by ACL 2025

  34. arXiv:2503.04362  [pdf, other]

    cs.LG cs.AI q-bio.BM

    A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery

    Authors: Yiheng Zhu, Mingyang Li, Junlong Liu, Kun Fu, Jiansheng Wu, Qiuyi Li, Mingze Yin, Jieping Ye, Jian Wu, Zheng Wang

    Abstract: Structure-based drug discovery (SBDD) is a systematic scientific process that develops new drugs by leveraging the detailed physical structure of the target protein. Recent advancements in pre-trained models for biomolecules have demonstrated remarkable success across various biochemical applications, including drug discovery and protein engineering. However, in most approaches, the pre-trained mo…

    Submitted 6 March, 2025; originally announced March 2025.

  35. arXiv:2503.00586  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Cross-Attention Fusion of MRI and Jacobian Maps for Alzheimer's Disease Diagnosis

    Authors: Shijia Zhang, Xiyu Ding, Brian Caffo, Junyu Chen, Cindy Zhang, Hadi Kharrazi, Zheyu Wang

    Abstract: Early diagnosis of Alzheimer's disease (AD) is critical for intervention before irreversible neurodegeneration occurs. Structural MRI (sMRI) is widely used for AD diagnosis, but conventional deep learning approaches primarily rely on intensity-based features, which require large datasets to capture subtle structural changes. Jacobian determinant maps (JSM) provide complementary information by enco…

    Submitted 1 March, 2025; originally announced March 2025.

    Comments: Submitted to MICCAI 2025

  36. arXiv:2503.00089  [pdf, ps, other]

    q-bio.QM cs.AI cs.LG

    Protein Structure Tokenization: Benchmarking and New Recipe

    Authors: Xinyu Yuan, Zichen Wang, Marcus Collins, Huzefa Rangwala

    Abstract: Recent years have witnessed a surge in the development of protein structural tokenization methods, which chunk protein 3D structures into discrete or continuous representations. Structure tokenization enables the direct application of powerful techniques like language modeling for protein structures, and large multimodal models to integrate structures with protein sequences and functional texts. D…

    Submitted 24 June, 2025; v1 submitted 28 February, 2025; originally announced March 2025.

    Comments: Accepted at ICML 2025

  37. arXiv:2502.10807  [pdf, other]

    cs.LG cs.AI q-bio.GN

    HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model

    Authors: Mingqian Ma, Guoqing Liu, Chuan Cao, Pan Deng, Tri Dao, Albert Gu, Peiran Jin, Zhao Yang, Yingce Xia, Renqian Luo, Pipi Hu, Zun Wang, Yuan-Jyue Chen, Haiguang Liu, Tao Qin

    Abstract: Advances in natural language processing and large language models have sparked growing interest in modeling DNA, often referred to as the "language of life". However, DNA modeling poses unique challenges. First, it requires the ability to process ultra-long DNA sequences while preserving single-nucleotide resolution, as individual nucleotides play a critical role in DNA function. Second, success i…

    Submitted 17 February, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

    Comments: Project page: https://hybridna-project.github.io/HybriDNA-Project/

  38. arXiv:2502.08070  [pdf]

    q-bio.NC

    Normative Cerebral Perfusion Across the Lifespan

    Authors: Xinglin Zeng, Yiran Li, Lin Hua, Ruoxi Lu, Lucas Lemos Franco, Peter Kochunov, Shuo Chen, John A Detre, Ze Wang

    Abstract: Cerebral perfusion plays a crucial role in maintaining brain function and is tightly coupled with neuronal activity. While previous studies have examined cerebral perfusion trajectories across development and aging, precise characterization of its lifespan dynamics has been limited by small sample sizes and methodological inconsistencies. In this study, we construct the first comprehensive normati…

    Submitted 11 February, 2025; originally announced February 2025.

  39. arXiv:2502.07272  [pdf, other]

    cs.CL q-bio.GN

    GENERator: A Long-Context Generative Genomic Foundation Model

    Authors: Wei Wu, Qiuyi Li, Mingyang Li, Kun Fu, Fuli Feng, Jieping Ye, Hui Xiong, Zheng Wang

    Abstract: Advancements in DNA sequencing technologies have significantly improved our ability to decode genomic sequences. However, the prediction and interpretation of these sequences remain challenging due to the intricate nature of genetic material. Large language models (LLMs) have introduced new opportunities for biological sequence analysis. Recent developments in genomic language models have undersco…

    Submitted 31 March, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  40. arXiv:2502.06846  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Prot2Chat: Protein LLM with Early-Fusion of Text, Sequence and Structure

    Authors: Zhicong Wang, Zicheng Ma, Ziqiang Cao, Changlong Zhou, Jun Zhang, Yiqin Gao

    Abstract: Motivation: Proteins are of great significance in living organisms. However, understanding their functions encounters numerous challenges, such as insufficient integration of multimodal information, a large number of training parameters, limited flexibility of classification-based methods, and the lack of systematic evaluation metrics for protein Q&A systems. To tackle these issues, we propose the…

    Submitted 22 May, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

    Comments: 8 pages, 3 figures

  41. arXiv:2502.06274  [pdf, other]

    cs.LG cs.AI q-bio.MN

    HODDI: A Dataset of High-Order Drug-Drug Interactions for Computational Pharmacovigilance

    Authors: Zhaoying Wang, Yingdan Shi, Xiang Liu, Can Chen, Jun Wen, Ren Wang

    Abstract: Drug-side effect research is vital for understanding adverse reactions arising in complex multi-drug therapies. However, the scarcity of higher-order datasets that capture the combinatorial effects of multiple drugs severely limits progress in this field. Existing resources such as TWOSIDES primarily focus on pairwise interactions. To fill this critical gap, we introduce HODDI, the first Higher-Or…

    Submitted 10 February, 2025; originally announced February 2025.

  42. arXiv:2502.00410  [pdf, ps, other]

    q-bio.PE math.AP math.DS

    Effects and biological consequences of the predator-mediated apparent competition I: ODE models

    Authors: Yuan Lou, Weirun Tao, Zhi-An Wang

    Abstract: This paper is devoted to investigating the effects and biological consequences of the predator-mediated apparent competition based on a two prey species (one is native and the other is invasive) and one predator model with Holling type I and II functional response functions. Through the analytical results and case studies alongside numerical simulations, we find that the initial mass of the invasi…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: 29 pages

    MSC Class: 34D05; 34D23; 92-10; 92D25

  43. arXiv:2501.12615  [pdf, other]

    q-bio.NC cs.AI

    GATE: Adaptive Learning with Working Memory by Information Gating in Multi-lamellar Hippocampal Formation

    Authors: Yuechen Liu, Zishun Wang, Chen Qiao, Zongben Xu

    Abstract: Hippocampal formation (HF) can rapidly adapt to varied environments and build flexible working memory (WM). To mirror the HF's mechanism on generalization and WM, we propose a model named Generalization and Associative Temporary Encoding (GATE), which deploys a 3-D multi-lamellar dorsoventral (DV) architecture, and learns to build up internal representation from externally driven information lay…

    Submitted 21 January, 2025; originally announced January 2025.

  44. arXiv:2501.12421  [pdf, ps, other]

    cs.LG cs.AI q-bio.QM

    Tackling Small Sample Survival Analysis via Transfer Learning: A Study of Colorectal Cancer Prognosis

    Authors: Yonghao Zhao, Changtao Li, Chi Shu, Qingbin Wu, Hong Li, Chuan Xu, Tianrui Li, Ziqiang Wang, Zhipeng Luo, Yazhou He

    Abstract: Survival prognosis is crucial for medical informatics. Practitioners often confront small-sized clinical data, especially cancer patient cases, which can be insufficient to induce useful patterns for survival predictions. This study deals with small sample survival analysis by leveraging transfer learning, a useful machine learning technique that can enhance the target analysis with related knowle…

    Submitted 21 January, 2025; originally announced January 2025.

  45. arXiv:2412.20014  [pdf, other]

    cs.LG cs.AI q-bio.BM

    ProtCLIP: Function-Informed Protein Multi-Modal Learning

    Authors: Hanjing Zhou, Mingze Yin, Wei Wu, Mingyang Li, Kun Fu, Jintai Chen, Jian Wu, Zheng Wang

    Abstract: Multi-modality pre-training paradigm that aligns protein sequences and biological descriptions has learned general protein representations and achieved promising performance in various downstream applications. However, these works were still unable to replicate the extraordinary success of language-supervised visual foundation models due to the ineffective usage of aligned protein-text paired data…

    Submitted 27 December, 2024; originally announced December 2024.

    Journal ref: AAAI 2025

  46. arXiv:2412.16235  [pdf, other]

    cs.LG math-ph q-bio.QM stat.ML

    Utilizing Causal Network Markers to Identify Tipping Points ahead of Critical Transition

    Authors: Shirui Bian, Zezhou Wang, Siyang Leng, Wei Lin, Jifan Shi

    Abstract: Early-warning signals of delicate design are always used to predict critical transitions in complex systems, which makes it possible to render the systems far away from the catastrophic state by introducing timely interventions. Traditional signals including the dynamical network biomarker (DNB), based on statistical properties such as variance and autocorrelation of nodal dynamics, overlook direc…

    Submitted 19 December, 2024; originally announced December 2024.

    Comments: 16 pages, 4 figures

  47. arXiv:2412.11082  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    EquiFlow: Equivariant Conditional Flow Matching with Optimal Transport for 3D Molecular Conformation Prediction

    Authors: Qingwen Tian, Yuxin Xu, Yixuan Yang, Zhen Wang, Ziqi Liu, Pengju Yan, Xiaolin Li

    Abstract: Molecular 3D conformations play a key role in determining how molecules interact with other molecules or protein surfaces. Recent deep learning advancements have improved conformation prediction, but slow training speeds and difficulties in utilizing high-degree features limit performance. We propose EquiFlow, an equivariant conditional flow matching model with optimal transport. EquiFlow uniquely…

    Submitted 15 December, 2024; originally announced December 2024.

    Comments: 11 pages, 5 figures

  48. arXiv:2412.01836  [pdf]

    q-bio.NC

    Eye dominance and testing order effects in the circularly-oriented macular pigment optical density measurements that rely on the perception of structured light-based stimuli

    Authors: Mukhit Kulmaganbetov, Taranjit Singh, Dmitry Pushin, Pinki Chahal, David Cory, Davis Garrad, Connor Kapahi, Melanie Mungalsingh, Iman Salehi, Andrew Silva, Ben Thompson, Zhangting Wang, Dusan Sarenac

    Abstract: Psychophysical discrimination of structured light (SL) stimuli may be useful in screening for various macular disorders, including degenerative macular diseases. The circularly-oriented macular pigment optical density (coMPOD), calculated from the discrimination performance of SL-induced entoptic phenomena, may reveal a novel functional biomarker of macular health. In this study, we investigated t…

    Submitted 16 November, 2024; originally announced December 2024.

  49. arXiv:2412.01564  [pdf, other]

    cs.LG q-bio.BM

    Tokenizing 3D Molecule Structure with Quantized Spherical Coordinates

    Authors: Kaiyuan Gao, Yusong Wang, Haoxiang Guan, Zun Wang, Qizhi Pei, John E. Hopcroft, Kun He, Lijun Wu

    Abstract: The application of language models (LMs) to molecular structure generation using line notations such as SMILES and SELFIES has been well-established in the field of cheminformatics. However, extending these models to generate 3D molecular structures presents significant challenges. Two primary obstacles emerge: (1) the difficulty in designing a 3D line notation that ensures SE(3)-invariant atomic…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 17 pages, 6 figures, preprint

  50. arXiv:2411.16793  [pdf, other]

    cs.CV q-bio.GN

    ST-Align: A Multimodal Foundation Model for Image-Gene Alignment in Spatial Transcriptomics

    Authors: Yuxiang Lin, Ling Luo, Ying Chen, Xushi Zhang, Zihui Wang, Wenxian Yang, Mengsha Tong, Rongshan Yu

    Abstract: Spatial transcriptomics (ST) provides high-resolution pathological images and whole-transcriptomic expression profiles at individual spots across whole-slide scales. This setting makes it an ideal data source to develop multimodal foundation models. Although recent studies attempted to fine-tune visual encoders with trainable gene encoders based on spot-level, the absence of a wider slide perspect…

    Submitted 25 November, 2024; originally announced November 2024.