Search | arXiv e-print repository

An AI-native experimental laboratory for autonomous biomolecular engineering

Authors: Mingyu Wu, Zhaoguo Wang, Jiabin Wang, Zhiyuan Dong, Jingkai Yang, Qingting Li, Tianyu Huang, Lei Zhao, Mingqiang Li, Fei Wang, Chunhai Fan, Haibo Chen

Abstract: Autonomous scientific research, capable of independently conducting complex experiments and serving non-specialists, represents a long-held aspiration. Achieving it requires a fundamental paradigm shift driven by artificial intelligence (AI). While autonomous experimental systems are emerging, they remain confined to areas featuring singular objectives and well-defined, simple experimental workflows, such as chemical synthesis and catalysis. We present an AI-native autonomous laboratory, targeting highly complex scientific experiments for applications like autonomous biomolecular engineering. This system autonomously manages instrumentation, formulates experiment-specific procedures and optimization heuristics, and concurrently serves multiple user requests. Founded on a co-design philosophy of models, experiments, and instruments, the platform supports the co-evolution of AI models and the automation system. This establishes an end-to-end, multi-user autonomous laboratory that handles complex, multi-objective experiments across diverse instrumentation. Our autonomous laboratory supports fundamental nucleic acid functions, including synthesis, transcription, amplification, and sequencing. It also enables applications in fields such as disease diagnostics, drug development, and information storage. Without human intervention, it autonomously optimizes experimental performance to match state-of-the-art results achieved by human scientists. In multi-user scenarios, the platform significantly improves instrument utilization and experimental efficiency. This platform paves the way for advanced biomaterials research to overcome dependencies on experts and resource barriers, establishing a blueprint for science-as-a-service at scale.

Submitted 3 July, 2025; originally announced July 2025.

arXiv:2506.16921

EHCube4P: Learning Epistatic Patterns Through Hypercube Graph Convolution Neural Network for Protein Fitness Function Estimation

Authors: Muhammad Daud, Philippe Charton, Cedric Damour, Jingbo Wang, Frederic Cadet

Abstract: Understanding the relationship between protein sequences and their functions is fundamental to protein engineering, but this task is hindered by the combinatorially vast sequence space and the experimental noise inherent in fitness measurements. In this study, we present a novel framework that models the sequence landscape as a hypercube $H(k,2)$ and integrates wavelet-based signal denoising with a graph convolutional neural network (GCN) to predict protein fitness across rugged fitness landscapes. Using a dataset of 419 experimentally measured mutant sequences of the Tobacco 5-Epi-Aristolochene Synthase (TEAS) enzyme, we preprocess the fitness signals using a 1-D discrete wavelet transform with a Daubechies-3 basis to suppress experimental noise while preserving local epistatic patterns. Our model comprises two GCN layers, allowing for beyond pairwise aggregation, followed by a multi-layer perceptron (MLP). We show that our approach, EHCube4P, generalizes well across different enzyme activity datasets and effectively captures higher-order mutational interactions. Performance varies with the ruggedness of the fitness landscape, with smoother signals yielding higher test set $r^2$ scores. These results demonstrate that combining wavelet preprocessing with graph-based deep learning enhances the robustness and generalization of fitness prediction, particularly for sparse and noisy biological datasets. The approach provides a scalable and interpretable framework for protein fitness estimation applicable to a broad range of combinatorial biological systems.

Submitted 20 June, 2025; originally announced June 2025.

Comments: 12 pages, 4 figures, 1 table
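
Illustrative note on the EHCube4P entry above: the abstract models the sequence landscape as the hypercube $H(k,2)$, whose vertices are the $2^k$ binary strings of length $k$ and whose edges join strings that differ in exactly one position. The sketch below builds that neighbor structure and applies one round of unweighted mean aggregation over each vertex and its neighbors. It is a simplified stand-in for a graph-convolution step, not the authors' model: the function names `hypercube_neighbors` and `aggregate`, the toy signal, and the absence of learned weights and wavelet denoising are all assumptions made for illustration.

```python
from itertools import product


def hypercube_neighbors(k):
    """Map each vertex of H(k,2) (a k-bit tuple) to its k Hamming-distance-1 neighbors."""
    neighbors = {}
    for vertex in product((0, 1), repeat=k):
        # Flipping bit i yields the neighbor that differs only at position i.
        neighbors[vertex] = [
            vertex[:i] + (1 - vertex[i],) + vertex[i + 1:] for i in range(k)
        ]
    return neighbors


def aggregate(signal, neighbors):
    """One round of mean aggregation over each vertex and its neighbors.

    This mimics the neighborhood-averaging part of a graph convolution;
    a real GCN layer would also apply learned weights and a nonlinearity.
    """
    smoothed = {}
    for vertex, adjacent in neighbors.items():
        values = [signal[vertex]] + [signal[n] for n in adjacent]
        smoothed[vertex] = sum(values) / len(values)
    return smoothed


# Toy example: k = 3 mutation sites, hypothetical fitness = number of mutations.
nbrs = hypercube_neighbors(3)
toy_signal = {v: float(sum(v)) for v in nbrs}
once = aggregate(toy_signal, nbrs)
twice = aggregate(once, nbrs)  # two rounds reach vertices two mutations away
```

Applying the aggregation twice lets each vertex draw on information from sequences two mutations away, which is one way to read the abstract's remark that two GCN layers allow "beyond pairwise aggregation".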

arXiv:2506.15190

Learning Task-Agnostic Skill Bases to Uncover Motor Primitives in Animal Behaviors

Authors: Jiyi Wang, Jingyang Ke, Bo Dai, Anqi Wu

Abstract: Animals flexibly recombine a finite set of core motor primitives to meet diverse task demands, but existing behavior-segmentation methods oversimplify this process by imposing discrete syllables under restrictive generative assumptions. To reflect the animal behavior generation procedure, we introduce skill-based imitation learning (SKIL) for behavior understanding, a reinforcement learning-based imitation framework that (1) infers interpretable skill sets, i.e., latent basis functions of behavior, by leveraging representation learning on transition probabilities, and (2) parameterizes policies as dynamic mixtures of these skills. We validate our approach on a simple grid world, a discrete labyrinth, and unconstrained videos of freely moving animals. Across tasks, it identifies reusable skill components, learns continuously evolving compositional policies, and generates realistic trajectories beyond the capabilities of traditional discrete models. By exploiting generative behavior modeling with compositional representations, our method offers a concise, principled account of how complex animal behaviors emerge from dynamic combinations of fundamental motor primitives.

Submitted 18 June, 2025; originally announced June 2025.

Comments: 9 pages and 4 figures for the main text

arXiv:2506.10271

Predicting function of evolutionarily implausible DNA sequences

Authors: Shiyu Jiang, Xuyin Liu, Zitong Jerry Wang

Abstract: Genomic language models (gLMs) show potential for generating novel, functional DNA sequences for synthetic biology, but doing so requires them to learn not just evolutionary plausibility, but also sequence-to-function relationships. We introduce a set of prediction tasks called Nullsettes, which assesses a model's ability to predict loss-of-function mutations created by translocating key control elements in synthetic expression cassettes. Across 12 state-of-the-art models, we find that mutation effect prediction performance strongly correlates with the predicted likelihood of the nonmutant. Furthermore, the range of likelihood values predictive of strong model performance is highly dependent on sequence length. Our work highlights the importance of considering both sequence likelihood and sequence length when using gLMs for mutation effect prediction.

Submitted 4 July, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

Comments: 13 pages, 6 figures, accepted to ICML 2025 Generative AI and Biology Workshop

arXiv:2506.07553

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

Authors: Jingchao Wang, Haote Yang, Jiang Wu, Yifan He, Xingjian Wei, Yinfan Wang, Chengjin Liu, Lingli Ge, Lijun Wu, Bin Wang, Dahua Lin, Conghui He

Abstract: Optical Chemical Structure Recognition (OCSR) is crucial for digitizing chemical knowledge by converting molecular images into machine-readable formats. While recent vision-language models (VLMs) have shown potential in this task, their image-captioning approach often struggles with complex molecular structures and inconsistent annotations. To overcome these challenges, we introduce GTR-Mol-VLM, a novel framework featuring two key innovations: (1) the Graph Traversal as Visual Chain of Thought mechanism that emulates human reasoning by incrementally parsing molecular graphs through sequential atom-bond predictions, and (2) the data-centric principle of Faithfully Recognize What You've Seen, which addresses the mismatch between abbreviated structures in images and their expanded annotations. To support model development, we constructed GTR-CoT-1.3M, a large-scale instruction-tuning dataset with meticulously corrected annotations, and introduced MolRec-Bench, the first benchmark designed for a fine-grained evaluation of graph-parsing accuracy in OCSR. Comprehensive experiments demonstrate that GTR-Mol-VLM achieves superior results compared to specialist models, chemistry-domain VLMs, and commercial general-purpose VLMs. Notably, in scenarios involving molecular images with functional group abbreviations, GTR-Mol-VLM outperforms the second-best baseline by approximately 14 percentage points, both in SMILES-based and graph-based metrics. We hope that this work will drive OCSR technology to more effectively meet real-world needs, thereby advancing the fields of cheminformatics and AI for Science. We will release GTR-CoT at https://github.com/opendatalab/GTR-CoT.

Submitted 9 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

arXiv:2506.06915

Graph Neural Networks in Modern AI-aided Drug Discovery

Authors: Odin Zhang, Haitao Lin, Xujun Zhang, Xiaorui Wang, Zhenxing Wu, Qing Ye, Weibo Zhao, Jike Wang, Kejun Ying, Yu Kang, Chang-yu Hsieh, Tingjun Hou

Abstract: Graph neural networks (GNNs), as topology/structure-aware models within deep learning, have emerged as powerful tools for AI-aided drug discovery (AIDD). By directly operating on molecular graphs, GNNs offer an intuitive and expressive framework for learning the complex topological and geometric features of drug-like molecules, cementing their role in modern molecular modeling. This review provides a comprehensive overview of the methodological foundations and representative applications of GNNs in drug discovery, spanning tasks such as molecular property prediction, virtual screening, molecular generation, biomedical knowledge graph construction, and synthesis planning. Particular attention is given to recent methodological advances, including geometric GNNs, interpretable models, uncertainty quantification, scalable graph architectures, and graph generative frameworks. We also discuss how these models integrate with modern deep learning approaches, such as self-supervised learning, multi-task learning, meta-learning and pre-training. Throughout this review, we highlight the practical challenges and methodological bottlenecks encountered when applying GNNs to real-world drug discovery pipelines, and conclude with a discussion on future directions.

Submitted 7 June, 2025; originally announced June 2025.

arXiv:2506.05768

AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation

Authors: Wenyu Zhu, Jianhui Wang, Bowen Gao, Yinjun Jia, Haichuan Tan, Ya-Qin Zhang, Wei-Ying Ma, Yanyan Lan

Abstract: Virtual screening (VS) is a critical component of modern drug discovery, yet most existing methods, whether physics-based or deep learning-based, are developed around holo protein structures with known ligand-bound pockets. Consequently, their performance degrades significantly on apo or predicted structures such as those from AlphaFold2, which are more representative of real-world early-stage drug discovery, where pocket information is often missing. In this paper, we introduce an alignment-and-aggregation framework to enable accurate virtual screening under structural uncertainty. Our method comprises two core components: (1) a tri-modal contrastive learning module that aligns representations of the ligand, the holo pocket, and cavities detected from structures, thereby enhancing robustness to pocket localization error; and (2) a cross-attention based adapter for dynamically aggregating candidate binding sites, enabling the model to learn from activity data even without precise pocket annotations. We evaluated our method on a newly curated benchmark of apo structures, where it significantly outperforms state-of-the-art methods in the blind apo setting, improving the early enrichment factor (EF1%) from 11.75 to 37.19. Notably, it also maintains strong performance on holo structures. These results demonstrate the promise of our approach in advancing first-in-class drug discovery, particularly in scenarios lacking experimentally resolved protein-ligand complexes.

Submitted 6 June, 2025; originally announced June 2025.
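
Illustrative note on the AANet entry above: the abstract reports results as the early enrichment factor EF1%. A commonly used definition is the hit rate among the top-scoring 1% of a ranked library divided by the hit rate of the whole library, so EF1% = 37.19 would mean actives are about 37 times more concentrated in the top 1% than under random selection. The sketch below computes that quantity under this assumed definition; the paper's exact convention (rounding of the cutoff, tie handling) is not stated in the abstract, and the function name `enrichment_factor` and the toy library are hypothetical.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked screening library.

    scores: predicted scores, higher meaning "more likely active".
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("enrichment factor is undefined without any actives")

    n = len(scores)
    n_top = max(1, round(n * fraction))  # assumed cutoff convention
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    # Hit rate in the selected top slice relative to the library-wide hit rate.
    return (hits_in_top / n_top) / (total_actives / n)


# Toy library: 200 compounds, 10 actives, all of them ranked at the top.
toy_scores = list(range(200))
toy_labels = [0] * 190 + [1] * 10
ef1 = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
```

In this toy case the top 1% holds two compounds, both active, against a library-wide active rate of 5%, so the factor equals its maximum possible value for this library.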

arXiv:2506.01456

GenDMR: A dynamic multimodal role-swapping network for identifying risk gene phenotypes

Authors: Lina Qin, Cheng Zhu, Chuqi Zhou, Yukun Huang, Jiayi Zhu, Ping Liang, Jinju Wang, Yixing Huang, Cheng Luo, Dezhong Yao, Ying Tan

Abstract: Recent studies have shown that integrating multimodal data fusion techniques for imaging and genetic features is beneficial for the etiological analysis and predictive diagnosis of Alzheimer's disease (AD). However, there are several critical flaws in current deep learning methods. Firstly, there has been insufficient discussion and exploration regarding the selection and encoding of genetic information. Secondly, due to the significantly superior classification value of AD imaging features compared to genetic features, many studies in multimodal fusion emphasize the strengths of imaging features, actively mitigating the influence of weaker features, thereby diminishing the learning of the unique value of genetic features. To address this issue, this study proposes the dynamic multimodal role-swapping network (GenDMR). In GenDMR, we develop a novel approach to encode the spatial organization of single nucleotide polymorphisms (SNPs), enhancing the representation of their genomic context. Additionally, to adaptively quantify the disease risk of SNPs and brain regions, we propose a multi-instance attention module to enhance model interpretability. Furthermore, we introduce a dominant modality selection module and a contrastive self-distillation module, combining them to achieve a dynamic teacher-student role exchange mechanism based on dominant and auxiliary modalities for bidirectional co-updating of different modal data. Finally, GenDMR achieves state-of-the-art performance on the ADNI public dataset and visualizes attention to different SNPs, focusing on confirming 12 potential high-risk genes related to AD, including the most classic APOE and recently highlighted significant risk genes. This demonstrates GenDMR's interpretable analytical capability in exploring AD genetic features, providing new insights and perspectives for the development of multimodal data fusion techniques.

Submitted 2 June, 2025; originally announced June 2025.

Comments: 31 pages, 9 figures

arXiv:2505.09664

KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference

Authors: Rui Peng, Yuchen Lu, Qichen Sun, Yuxing Lu, Chi Zhang, Ziru Liu, Jinzhuo Wang

Abstract: Gene regulatory network (GRN) inference serves as a cornerstone for deciphering cellular decision-making processes. Early approaches rely exclusively on gene expression data, thus their predictive power remains fundamentally constrained by the vast combinatorial space of potential gene-gene interactions. Subsequent methods integrate prior knowledge to mitigate this challenge by restricting the solution space to biologically plausible interactions. However, we argue that the effectiveness of these approaches is contingent upon the precision of prior information and the reduction in the search space will circumscribe the models' potential for novel biological discoveries. To address these limitations, we introduce KINDLE, a three-stage framework that decouples GRN inference from prior knowledge dependencies. KINDLE trains a teacher model that integrates prior knowledge with temporal gene expression dynamics and subsequently distills this encoded knowledge to a student model, enabling accurate GRN inference solely from expression data without access to any prior. KINDLE achieves state-of-the-art performance across four benchmark datasets. Notably, it successfully identifies key transcription factors governing mouse embryonic development and precisely characterizes their functional roles. In mouse hematopoietic stem cell data, KINDLE accurately predicts fate transition outcomes following knockout of two critical regulators (Gata1 and Spi1). These biological validations demonstrate our framework's dual capability in maintaining topological inference precision while preserving discovery potential for novel biological mechanisms.

Submitted 14 May, 2025; originally announced May 2025.

arXiv:2505.09656

VIGIL: Vision-Language Guided Multiple Instance Learning Framework for Ulcerative Colitis Histological Healing Prediction

Authors: Zhengxuan Qiu, Bo Peng, Xiaoying Tang, Jiankun Wang, Qin Guo

Abstract: Objective: Ulcerative colitis (UC), characterized by chronic inflammation with alternating remission-relapse cycles, requires precise histological healing (HH) evaluation to improve clinical outcomes. To overcome the limitations of annotation-intensive deep learning methods and suboptimal multi-instance learning (MIL) in HH prediction, we propose VIGIL, the first vision-language guided MIL framework integrating white light endoscopy (WLE) and endocytoscopy (EC). Methods: VIGIL begins with a dual-branch MIL module KS-MIL based on top-K typical frames selection and similarity metric adaptive learning to learn relationships among frame features effectively. By integrating the diagnostic report text and specially designed multi-level alignment and supervision between image-text pairs, VIGIL establishes joint image-text guidance during training to capture richer disease-related semantic information. Furthermore, VIGIL employs a multi-modal masked relation fusion (MMRF) strategy to uncover the latent diagnostic correlations of two endoscopic image representations. Results: Comprehensive experiments on a real-world clinical dataset demonstrate VIGIL's superior performance, achieving 92.69% accuracy and 94.79% AUC, outperforming existing state-of-the-art methods. Conclusion: The proposed VIGIL framework successfully establishes an effective vision-language guided MIL paradigm for UC HH prediction, reducing annotation burdens while improving prediction reliability. Significance: The research outcomes provide new insights for non-invasive UC diagnosis and hold theoretical significance and clinical value for advancing intelligent healthcare development.

Submitted 13 May, 2025; originally announced May 2025.

arXiv:2505.03121

AutoLoop: a novel autoregressive deep learning method for protein loop prediction with high accuracy

Authors: Tianyue Wang, Xujun Zhang, Langcheng Wang, Odin Zhang, Jike Wang, Ercheng Wang, Jialu Wu, Renling Hu, Jingxuan Ge, Shimeng Li, Qun Su, Jiajun Yu, Chang-Yu Hsieh, Tingjun Hou, Yu Kang

Abstract: Protein structure prediction is a critical and longstanding challenge in biology, garnering widespread interest due to its significance in understanding biological processes. A particular area of focus is the prediction of missing loops in proteins, which are vital in determining protein function and activity. To address this challenge, we propose AutoLoop, a novel computational model designed to automatically generate accurate loop backbone conformations that closely resemble their natural structures. AutoLoop employs a bidirectional training approach while merging atom- and residue-level embedding, thus improving robustness and precision. We compared AutoLoop with twelve established methods, including FREAD, NGK, AlphaFold2, and AlphaFold3. AutoLoop consistently outperforms other methods, achieving a median RMSD of 1.12 Angstrom and a 2-Angstrom success rate of 73.23% on the CASP15 dataset, while maintaining strong performance on the HOMSTARD dataset. It demonstrates the best performance across nearly all loop lengths and secondary structural types. Beyond accuracy, AutoLoop is computationally efficient, requiring only 0.10 s per generation. A post-processing module for side-chain packing and energy minimization further improves results slightly, confirming the reliability of the predicted backbone. A case study also highlights AutoLoop's potential for precise predictions based on dominant loop conformations. These advances hold promise for protein engineering and drug discovery.

Submitted 5 May, 2025; originally announced May 2025.

Comments: 34 pages, 7 figures
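
Illustrative note on the AutoLoop entry above: the abstract summarizes accuracy with a median RMSD and a "2-Angstrom success rate". The sketch below shows how such summary numbers could be computed from per-loop coordinates. It assumes that predicted and reference atoms are listed in the same order and already share a coordinate frame (no superposition step), and that "success" means an RMSD strictly below the cutoff; the abstract specifies neither detail, and the names `rmsd` and `summarize` are hypothetical.

```python
import math
from statistics import median


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


def summarize(rmsd_values, cutoff=2.0):
    """Return (median RMSD, fraction of loops with RMSD below the cutoff)."""
    successes = sum(1 for value in rmsd_values if value < cutoff)
    return median(rmsd_values), successes / len(rmsd_values)


# Toy check: every atom is displaced by exactly 1 Angstrom along z.
toy_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])

# Hypothetical per-loop RMSD values in Angstrom, not data from the paper.
toy_median, toy_success_rate = summarize([0.5, 1.0, 3.0, 1.5])
```

Reporting the median alongside a thresholded success rate is useful because a few badly mispredicted loops can inflate a mean RMSD without changing how many loops are modeled acceptably.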

arXiv:2504.17162

A Comprehensive Review on RNA Subcellular Localization Prediction

Authors: Cece Zhang, Xuehuan Zhu, Nick Peterson, Jieqiong Wang, Shibiao Wan

Abstract: The subcellular localization of RNAs, including long non-coding RNAs (lncRNAs), messenger RNAs (mRNAs), microRNAs (miRNAs) and other smaller RNAs, plays a critical role in determining their biological functions. For instance, lncRNAs are predominantly associated with chromatin and act as regulators of gene transcription and chromatin structure, while mRNAs are distributed across the nucleus and cytoplasm, facilitating the transport of genetic information for protein synthesis. Understanding RNA localization sheds light on processes like gene expression regulation with spatial and temporal precision. However, traditional wet lab methods for determining RNA localization, such as in situ hybridization, are often time-consuming, resource-demanding, and costly. To overcome these challenges, computational methods leveraging artificial intelligence (AI) and machine learning (ML) have emerged as powerful alternatives, enabling large-scale prediction of RNA subcellular localization. This paper provides a comprehensive review of the latest advancements in AI-based approaches for RNA subcellular localization prediction, covering various RNA types and focusing on sequence-based, image-based, and hybrid methodologies that combine both data types. We highlight the potential of these methods to accelerate RNA research, uncover molecular pathways, and guide targeted disease treatments. Furthermore, we critically discuss the challenges in AI/ML approaches for RNA subcellular localization, such as data scarcity and lack of benchmarks, and opportunities to address them. This review aims to serve as a valuable resource for researchers seeking to develop innovative solutions in the field of RNA subcellular localization and beyond.

Submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.07881

An LLM-Driven Multi-Agent Debate System for Mendelian Diseases

Authors: Xinyang Zhou, Yongyong Ren, Qianqian Zhao, Daoyi Huang, Xinbo Wang, Tingting Zhao, Zhixing Zhu, Wenyuan He, Shuyuan Li, Yan Xu, Yu Sun, Yongguo Yu, Shengnan Wu, Jian Wang, Guangjun Yu, Dake He, Bo Ban, Hui Lu

Abstract: Accurate diagnosis of Mendelian diseases is crucial for precision therapy and assistance in preimplantation genetic diagnosis. However, existing methods often fall short of clinical standards or depend on extensive datasets to build pretrained machine learning models. To address this, we introduce an innovative LLM-Driven multi-agent debate system (MD2GPS) with natural language explanations of the diagnostic results. It utilizes a language model to transform results from data-driven and knowledge-driven agents into natural language, and then fosters a debate between these two specialized agents. This system has been tested on 1,185 samples across four independent datasets, enhancing the TOP1 accuracy from 42.9% to 66% on average. Additionally, in a challenging cohort of 72 cases, MD2GPS identified potential pathogenic genes in 12 patients, reducing the diagnostic time by 90%. The methods within each module of this multi-agent debate system are also replaceable, facilitating its adaptation for diagnosing and researching other complex diseases.

Submitted 11 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

Comments: 21 pages, 5 figures, 1 table

arXiv:2503.21788

PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation

Authors: Yizhen Luo, Jiashuo Wang, Siqi Fan, Zaiqing Nie

Abstract: Structural biology relies on accurate three-dimensional biomolecular structures to advance our understanding of biological functions, disease mechanisms, and therapeutics. While recent advances in deep learning have enabled the development of all-atom foundation models for molecular modeling and generation, existing approaches face challenges in generalization due to the multi-modal nature of atomic data and the lack of comprehensive analysis of training and sampling strategies. To address these limitations, we propose PharMolixFM, a unified framework for constructing all-atom foundation models based on multi-modal generative techniques. Our framework includes three variants using state-of-the-art multi-modal generative models. By formulating molecular tasks as a generalized denoising process with task-specific priors, PharMolixFM achieves robust performance across various structural biology applications. Experimental results demonstrate that PharMolixFM-Diff achieves competitive prediction accuracy in protein-small-molecule docking (83.9% vs. 90.2% RMSD < 2Å, given pocket) with significantly improved inference speed. Moreover, we explore the empirical inference scaling law by introducing more sampling repeats or steps. Our code and model are available at https://github.com/PharMolix/OpenBioMed.

Submitted 31 March, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

arXiv:2503.17738

Tumor-associated CD19$^+$ macrophages induce immunosuppressive microenvironment in hepatocellular carcinoma

Authors: Junli Wang, Wanyue Cao, Jinyan Huang, Yu Zhou, Rujia Zheng, Yu Lou, Jiaqi Yang, Jianghui Tang, Mao Ye, Zhengtao Hong, Jiangchao Wu, Haonan Ding, Yuquan Zhang, Jianpeng Sheng, Xinjiang Lu, Pinglong Xu, Xiongbin Lu, Xueli Bai, Tingbo Liang, Qi Zhang

Abstract: Tumor-associated macrophages are a key component that contributes to the immunosuppressive microenvironment in human cancers. However, therapeutic targeting of macrophages has been a challenge in the clinic due to the limited understanding of their heterogeneous subpopulations and distinct functions. Here, we identify a unique and clinically relevant CD19$^+$ subpopulation of macrophages that is enriched in many types of cancer, particularly in hepatocellular carcinoma (HCC). The CD19$^+$ macrophages exhibit increased levels of PD-L1 and CD73, enhanced mitochondrial oxidation, and compromised phagocytosis, indicating their immunosuppressive functions. Targeting CD19$^+$ macrophages with anti-CD19 chimeric antigen receptor T (CAR-T) cells inhibited HCC tumor growth. We identify PAX5 as a primary driver of up-regulated mitochondrial biogenesis in CD19$^+$ macrophages, which depletes cytoplasmic Ca$^{2+}$, leading to lysosomal deficiency and consequent accumulation of CD73 and PD-L1. Inhibiting CD73 or mitochondrial oxidation enhanced the efficacy of immune checkpoint blockade therapy in treating HCC, suggesting great promise for CD19$^+$ macrophage-targeting therapeutics.

Submitted 22 March, 2025; originally announced March 2025.

Comments: 7 figures

arXiv:2503.16582 [pdf]

Machine Learning-Based Genomic Linguistic Analysis (Gene Sequence Feature Learning): A Case Study on Predicting Heavy Metal Response Genes in Rice

Authors: Ruiqi Yang, Jianxu Wang, Wei Yuan, Xun Wang, Mei Li

Abstract: This study explores the application of machine learning-based genetic linguistics for identifying heavy metal response genes in rice (Oryza sativa). By integrating convolutional neural networks and random forest algorithms, we developed a hybrid model capable of extracting and learning meaningful features from gene sequences, such as k-mer frequencies and physicochemical properties. The model was trained and tested on datasets of genes, achieving high predictive performance (precision: 0.89, F1-score: 0.82). RNA-seq and qRT-PCR experiments conducted on rice leaves exposed to Hg0 revealed differential expression of genes associated with heavy metal responses, which validated the model's predictions. Co-expression network analysis identified 103 related genes, and a literature review indicated that these genes are highly likely to be involved in heavy metal-related biological processes. By integrating and comparing the analysis results with those of differentially expressed genes (DEGs), the validity of the new machine learning method was further demonstrated. This study highlights the efficacy of combining machine learning with genetic linguistics for large-scale gene prediction. It demonstrates a cost-effective and efficient approach for uncovering molecular mechanisms underlying heavy metal responses, with potential applications in developing stress-tolerant crop varieties.

Submitted 20 March, 2025; originally announced March 2025.
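
The abstract above names k-mer frequencies as one of the sequence features the model learns from. As a rough illustration of what such a feature looks like, here is a minimal Python sketch; the function name `kmer_frequencies` and the default `k=3` are assumptions made for this example and are not taken from the paper, whose actual feature pipeline is not described in the listing.

```python
from collections import Counter


def kmer_frequencies(sequence, k=3):
    """Return the relative frequency of each overlapping k-mer in a sequence.

    Hypothetical helper for illustration only; not the authors' code.
    """
    seq = sequence.upper()
    if k <= 0 or len(seq) < k:
        return {}
    # Slide a window of width k across the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Normalise counts so the frequencies sum to 1.
    return {kmer: n / total for kmer, n in counts.items()}


if __name__ == "__main__":
    print(kmer_frequencies("ATGATG", k=3))
```

A frequency vector of this kind has a fixed length for a given k (at most 4^k entries for DNA), which is one reason it is a common input to classifiers such as random forests.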

arXiv:2503.13522 [pdf, ps, other]

Advanced Deep Learning Methods for Protein Structure Prediction and Design

Authors: Yichao Zhang, Ningyuan Deng, Xinyuan Song, Ziqian Bi, Tianyang Wang, Zheyu Yao, Keyu Chen, Ming Li, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Li Zhang, Xuanhe Pan, Jinlang Wang, Pohsun Feng, Yizhu Wen, Lawrence KQ Yan, Hongming Tseng, Yan Zhong, Yunze Wang, Ziyuan Qin, Bowen Jing, Junjie Yang, et al. (3 additional authors not shown)

Abstract: After AlphaFold won the Nobel Prize, protein prediction with deep learning once again became a hot topic. We comprehensively explore advanced deep learning methods applied to protein structure prediction and design. It begins by examining recent innovations in prediction architectures, with detailed discussions on improvements such as diffusion based frameworks and novel pairwise attention modules. The text analyses key components including structure generation, evaluation metrics, multiple sequence alignment processing, and network architecture, thereby illustrating the current state of the art in computational protein modelling. Subsequent chapters focus on practical applications, presenting case studies that range from individual protein predictions to complex biomolecular interactions. Strategies for enhancing prediction accuracy and integrating deep learning techniques with experimental validation are thoroughly explored. The later sections review the industry landscape of protein design, highlighting the transformative role of artificial intelligence in biotechnology and discussing emerging market trends and future challenges. Supplementary appendices provide essential resources such as databases and open source tools, making this volume a valuable reference for researchers and students.

Submitted 29 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

arXiv:2503.13465 [pdf, ps, other]

A novel Fourier Adjacency Transformer for advanced EEG emotion recognition

Authors: Jinfeng Wang, Yanhao Huang, Sifan Song, Boqian Wang, Jionglong Su, Jiaman Ding

Abstract: EEG emotion recognition faces significant hurdles due to noise interference, signal nonstationarity, and the inherent complexity of brain activity, which make accurate emotion classification difficult. In this study, we present the Fourier Adjacency Transformer, a novel framework that seamlessly integrates Fourier-based periodic analysis with graph-driven structural modeling. Our method first leverages novel Fourier-inspired modules to extract periodic features from embedded EEG signals, effectively decoupling them from aperiodic components. Subsequently, we employ an adjacency attention scheme to reinforce universal inter-channel correlation patterns, coupling these patterns with their sample-based counterparts. Empirical evaluations on SEED and DEAP datasets demonstrate that our method surpasses existing state-of-the-art techniques, achieving an improvement of approximately 6.5% in recognition accuracy. By unifying periodicity and structural insights, this framework offers a promising direction for future research in EEG emotion analysis.

Submitted 27 February, 2025; originally announced March 2025.

arXiv:2503.10195 [pdf, other]

ST-FlowNet: An Efficient Spiking Neural Network for Event-Based Optical Flow Estimation

Authors: Hongze Sun, Jun Wang, Wuque Cai, Duo Chen, Qianqian Liao, Jiayi He, Yan Cui, Dezhong Yao, Daqing Guo

Abstract: Spiking Neural Networks (SNNs) have emerged as a promising tool for event-based optical flow estimation tasks due to their ability to leverage spatio-temporal information and low-power capabilities. However, the performance of SNN models is often constrained, limiting their application in real-world scenarios. In this work, we address this gap by proposing a novel neural network architecture, ST-FlowNet, specifically tailored for optical flow estimation from event-based data. The ST-FlowNet architecture integrates ConvGRU modules to facilitate cross-modal feature augmentation and temporal alignment of the predicted optical flow, improving the network's ability to capture complex motion dynamics. Additionally, to overcome the challenges associated with training SNNs, we introduce a novel approach to derive SNN models from pre-trained artificial neural networks (ANNs) through ANN-to-SNN conversion or our proposed BISNN method. Notably, the BISNN method alleviates the complexities involved in biological parameter selection, further enhancing the robustness of SNNs in optical flow estimation tasks. Extensive evaluations on three benchmark event-based datasets demonstrate that the SNN-based ST-FlowNet model outperforms state-of-the-art methods, delivering superior performance in accurate optical flow estimation across a diverse range of dynamic visual scenes. Furthermore, the inherent energy efficiency of SNN models is highlighted, establishing a compelling advantage for their practical deployment.
Overall, our work presents a novel framework for optical flow estimation using SNNs and event-based data, contributing to the advancement of neuromorphic vision applications.

Submitted 27 April, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

Comments: 13 pages, 6 figures, 6 tables; This work has been submitted to Neural Networks for possible publication

arXiv:2503.03783 [pdf, other]

Passive Heart Rate Monitoring During Smartphone Use in Everyday Life

Authors: Shun Liao, Paolo Di Achille, Jiang Wu, Silviu Borac, Jonathan Wang, Xin Liu, Eric Teasley, Lawrence Cai, Yuzhe Yang, Yun Liu, Daniel McDuff, Hao-Wei Su, Brent Winslow, Anupam Pathak, Shwetak Patel, James A. Taylor, Jameson K. Rogers, Ming-Zher Poh

Abstract: Resting heart rate (RHR) is an important biomarker of cardiovascular health and mortality, but tracking it longitudinally generally requires a wearable device, limiting its availability. We present PHRM, a deep learning system for passive heart rate (HR) and RHR measurements during everyday smartphone use, using facial video-based photoplethysmography. Our system was developed using 225,773 videos from 495 participants and validated on 185,970 videos from 205 participants in laboratory and free-living conditions, representing the largest validation study of its kind. Compared to reference electrocardiogram, PHRM achieved a mean absolute percentage error (MAPE) < 10% for HR measurements across three skin tone groups of light, medium and dark pigmentation; MAPE for each skin tone group was non-inferior versus the others. Daily RHR measured by PHRM had a mean absolute error < 5 bpm compared to a wearable HR tracker, and was associated with known risk factors. These results highlight the potential of smartphones to enable passive and equitable heart health monitoring.

Submitted 21 March, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

Comments: Updated author list
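
The abstract above reports accuracy as a mean absolute percentage error (MAPE) for HR and a mean absolute error (MAE) in bpm for RHR. The following Python sketch shows the standard definitions of both metrics; the function names and the sample readings are invented for illustration and do not come from the paper.

```python
def mean_absolute_error(reference, estimate):
    """MAE: average of |estimate - reference|, in the unit of the inputs (here bpm)."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for r, e in zip(reference, estimate)) / len(reference)


def mean_absolute_percentage_error(reference, estimate):
    """MAPE: average of |estimate - reference| / |reference|, as a percentage."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(e - r) / abs(r) for r, e in zip(reference, estimate)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical reference (e.g. ECG) and estimated heart rates in bpm.
    ecg_hr = [60.0, 80.0]
    video_hr = [66.0, 76.0]
    print(mean_absolute_error(ecg_hr, video_hr))
    print(mean_absolute_percentage_error(ecg_hr, video_hr))
```

MAPE normalises each error by the reference value, so a 6 bpm error at 60 bpm counts more than the same error at 120 bpm, whereas MAE treats both equally.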

arXiv:2503.03780 [pdf, other]

doi 10.1016/j.medengphy.2022.103822

Weighted Combination and Singular Spectrum Analysis Based Remote Photoplethysmography Pulse Extraction in Low-light Environments

Authors: Lin Xi, Xingming Wu, Weihai Chen, Jianhua Wang, Changchen Zhao

Abstract: Camera-based vital signs monitoring has attracted a growing number of researchers in recent years, and the results are promising. However, few research works focus on heart rate extraction under extremely low illumination environments. In this paper, we propose a novel framework for remote heart rate estimation under low-light conditions. This method uses singular spectrum analysis (SSA) to decompose the filtered signal into several reconstructed components. A spectral masking algorithm is utilized to refine the preliminary candidate components on the basis of a reference heart rate. The contributive components are fused into the final pulse signal. To evaluate the performance of our framework in low-light conditions, the proposed approach is tested on a large-scale multi-illumination HR dataset (named MIHR). The test results verify that the proposed method has stronger robustness to low illumination than state-of-the-art methods, effectively improving the signal-to-noise ratio and heart rate estimation precision. We further perform experiments on the PUlse RatE detection (PURE) dataset, which is recorded under normal light conditions, to demonstrate the generalization of our method. The experiment results show that our method can stably detect pulse rate and achieve comparable results. The proposed method pioneers a new solution to remote heart rate estimation in low-light conditions.

Submitted 4 March, 2025; originally announced March 2025.

Comments: 14 pages, 8 figures; Published at Medical Engineering & Physics (MEP)

arXiv:2502.20275 [pdf, other]

How cancer emerges: Data-driven universal insights into tumorigenesis via hallmark networks

Authors: Jiahe Wang, Yan Wu, Yuke Hou, Yang Li, Dachuan Xu, Changjing Zhuge, Yue Han

Abstract: Cancer is a complex disease driven by dynamic regulatory shifts that cannot be fully captured by individual molecular profiling. We employ a data-driven approach to construct a coarse-grained dynamic network model based on hallmark interactions, integrating stochastic differential equations with gene regulatory network data to explore key macroscopic dynamic changes in tumorigenesis. Our analysis reveals that network topology undergoes significant reconfiguration before hallmark expression shifts, serving as an early indicator of malignancy. A pan-cancer examination across $15$ cancer types uncovers universal patterns, where Tissue Invasion and Metastasis exhibits the most significant difference between normal and cancer states, while the differences in Reprogramming Energy Metabolism are the least pronounced, consistent with the characteristic features of tumor biology. These findings reinforce the systemic nature of cancer evolution, highlighting the potential of network-based systems biology methods for understanding critical transitions in tumorigenesis.

Submitted 27 February, 2025; originally announced February 2025.

arXiv:2502.16446 [pdf, other]

Auxiliary Discriminator Sequence Generative Adversarial Networks (ADSeqGAN) for Few Sample Molecule Generation

Authors: Haocheng Tang, Jing Long, Junmei Wang

Abstract: In this work, we introduce Auxiliary Discriminator Sequence Generative Adversarial Networks (ADSeqGAN), a novel approach for molecular generation in small-sample datasets. Traditional generative models often struggle with limited training data, particularly in drug discovery, where molecular datasets for specific therapeutic targets, such as nucleic acid binders and central nervous system (CNS) drugs, are scarce. ADSeqGAN addresses this challenge by integrating an auxiliary random forest classifier as an additional discriminator into the GAN framework, which significantly improves molecular generation quality and class specificity. Our method incorporates a pretrained generator and Wasserstein distance to enhance training stability and diversity. We evaluate ADSeqGAN on a dataset comprising nucleic acid-targeting and protein-targeting small molecules, demonstrating its superior ability to generate nucleic acid binders compared to baseline models such as SeqGAN, ORGAN, and MolGPT. Through an oversampling strategy, ADSeqGAN also significantly improves CNS drug generation, achieving a higher yield than traditional de novo models. Critical assessments, including docking simulations and molecular property analysis, confirm that ADSeqGAN-generated molecules exhibit strong binding affinities, enhanced chemical diversity, and improved synthetic feasibility. Overall, ADSeqGAN presents a novel framework for generative molecular design in data-scarce scenarios, offering potential applications in computational drug discovery.
We have demonstrated the successful applications of ADSeqGAN in generating synthetic nucleic acid-targeting and CNS drugs in this work.

Submitted 23 February, 2025; originally announced February 2025.

arXiv:2502.10425 [pdf, other]

Neuron Platonic Intrinsic Representation From Dynamics Using Contrastive Learning

Authors: Wei Wu, Can Liao, Zizhen Deng, Zhengrui Guo, Jinzhuo Wang

Abstract: The Platonic Representation Hypothesis suggests a universal, modality-independent reality representation behind different data modalities. Inspired by this, we view each neuron as a system and detect its multi-segment activity data under various peripheral conditions. We assume there's a time-invariant representation for the same neuron, reflecting its intrinsic properties like molecular profiles, location, and morphology. The goal of obtaining these intrinsic neuronal representations has two criteria: (I) segments from the same neuron should have more similar representations than those from different neurons; (II) the representations must generalize well to out-of-domain data. To meet these, we propose the NeurPIR (Neuron Platonic Intrinsic Representation) framework. It uses contrastive learning, with segments from the same neuron as positive pairs and those from different neurons as negative pairs. In implementation, we use VICReg, which focuses on positive pairs and separates dissimilar samples via regularization. We tested our method on Izhikevich model-simulated neuronal population dynamics data. The results accurately identified neuron types based on preset hyperparameters. We also applied it to two real-world neuron dynamics datasets with neuron type annotations from spatial transcriptomics and neuron locations. Our model's learned representations accurately predicted neuron types and locations and were robust on out-of-domain data (from unseen animals).
This shows the potential of our approach for understanding neuronal systems and future neuroscience research.

Submitted 18 February, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

Comments: Accepted by ICLR'2025

arXiv:2502.06881 [pdf, other]

A Comprehensive Review of Protein Language Models

Authors: Lei Wang, Xudong Li, Han Zhang, Jinyi Wang, Dingkang Jiang, Zhidong Xue, Yan Wang

Abstract: At the intersection of the rapidly growing biological data landscape and advancements in Natural Language Processing (NLP), protein language models (PLMs) have emerged as a transformative force in modern research. These models have achieved remarkable progress, highlighting the need for timely and comprehensive overviews. However, much of the existing literature focuses narrowly on specific domains, often missing a broader analysis of PLMs. This study provides a systematic review of PLMs from a macro perspective, covering key historical milestones and current mainstream trends. We focus on the models themselves and their evaluation metrics, exploring aspects such as model architectures, positional encoding, scaling laws, and datasets. In the evaluation section, we discuss benchmarks and downstream applications. To further support ongoing research, we introduce relevant mainstream tools. Lastly, we critically examine the key challenges and limitations in this rapidly evolving field.

Submitted 8 February, 2025; originally announced February 2025.

arXiv:2502.01430 [pdf, other]

Molecular Odor Prediction Based on Multi-Feature Graph Attention Networks

Authors: HongXin Xie, JianDe Sun, Yi Shao, Shuai Li, Sujuan Hou, YuLong Sun, Jian Wang

Abstract: Olfactory perception plays a critical role in both human and organismal interactions, yet understanding of its underlying mechanisms and influencing factors remains insufficient. Molecular structures influence odor perception through intricate biochemical interactions, and accurately quantifying structure-odor relationships presents significant challenges. The Quantitative Structure-Odor Relationship (QSOR) task, which involves predicting the associations between molecular structures and their corresponding odors, seeks to address these challenges. To this end, we propose a method for QSOR, utilizing Graph Attention Networks to model molecular structures and capture both local and global features. Unlike conventional QSOR approaches reliant on predefined descriptors, our method leverages diverse molecular feature extraction techniques to automatically learn comprehensive representations. This integration enhances the model's capacity to handle complex molecular information and improves prediction accuracy. Our approach demonstrates clear advantages in QSOR prediction tasks, offering valuable insights into the application of deep learning in cheminformatics.

Submitted 3 February, 2025; originally announced February 2025.

arXiv:2501.08363 [pdf]

TopoLa: A Universal Framework to Enhance Cell Representations for Single-cell and Spatial Omics through Topology-encoded Latent Hyperbolic Geometry

Authors: Kai Zheng, Shaokai Wang, Yunpei Xu, Qiming Lei, Qichang Zhao, Xiao Liang, Qilong Feng, Yaohang Li, Min Li, Jinhui Xu, Jianxin Wang

Abstract: Recent advances in cellular research demonstrate that scRNA-seq characterizes cellular heterogeneity, while spatial transcriptomics reveals the spatial distribution of gene expression. Cell representation is the fundamental issue in the two fields. Here, we propose Topology-encoded Latent Hyperbolic Geometry (TopoLa), a computational framework enhancing cell representations by capturing fine-grained intercellular topological relationships. The framework introduces a new metric, TopoLa distance (TLd), which quantifies the geometric distance between cells within latent hyperbolic space, capturing the network's topological structure more effectively. With this framework, the cell representation can be enhanced considerably by performing convolution on its neighboring cells. Performance evaluation across seven biological tasks, including scRNA-seq data clustering and spatial transcriptomics domain identification, shows that TopoLa significantly improves the performance of several state-of-the-art models. These results underscore the generalizability and robustness of TopoLa, establishing it as a valuable tool for advancing both biological discovery and computational methodologies.

Submitted 14 January, 2025; originally announced January 2025.

Comments: 116 pages, 53 figures

arXiv:2501.02176 [pdf]

Molecule-dynamic-based Aging Clock and Aging Roadmap Forecast with Sundial

Authors: Wei Wu, Zizhen Deng, Chi Zhang, Can Liao, Jinzhuo Wang

Abstract: Addressing the unavoidable bias inherent in supervised aging clocks, we introduce Sundial, a novel framework that models molecular dynamics through a diffusion field, capturing both the population-level aging process and the individual-level relative aging order. Sundial enables unbiased estimation of biological age and the forecast of the aging roadmap. Faster-aging individuals from Sundial exhibit a higher disease risk compared to those identified from supervised aging clocks. This framework opens new avenues for exploring key topics, including age- and sex-specific aging dynamics and faster yet healthy aging paths.

Submitted 3 January, 2025; originally announced January 2025.

arXiv:2501.01462 [pdf]

Pan-infection Foundation Framework Enables Multiple Pathogen Prediction

Authors: Lingrui Zhang, Haonan Wu, Nana Jin, Chenqing Zheng, Jize Xie, Qitai Cai, Jun Wang, Qin Cao, Xubin Zheng, Jiankun Wang, Lixin Cheng

Abstract: Host-response-based diagnostics can improve the accuracy of diagnosing bacterial and viral infections, thereby reducing inappropriate antibiotic prescriptions. However, the existing cohorts, with limited sample sizes and coarse infection types, are unable to support the exploration of an accurate and generalizable diagnostic model. Here, we curate the largest infection host-response transcriptome data, including 11,247 samples across 89 blood transcriptome datasets from 13 countries and 21 platforms. We build a diagnostic model for pathogen prediction starting from a pan-infection model as foundation (AUC = 0.97) based on the pan-infection dataset. Then, we utilize knowledge distillation to efficiently transfer the insights from this "teacher" model to four lightweight pathogen "student" models, i.e., staphylococcal infection (AUC = 0.99), streptococcal infection (AUC = 0.94), HIV infection (AUC = 0.93), and RSV infection (AUC = 0.94), as well as a sepsis "student" model (AUC = 0.99). The proposed knowledge distillation framework not only facilitates the diagnosis of pathogens using pan-infection data, but also enables an across-disease study from pan-infection to sepsis. Moreover, the framework enables high-degree lightweight design of diagnostic models, which is expected to be adaptively deployed in clinical settings.

Submitted 31 December, 2024; originally announced January 2025.

Comments: 15 pages, 8 figures

arXiv:2412.19831 [pdf, ps, other]

Leslie Population Models in Predator-prey and Competitive populations: theory and applications by machine learning

Authors: Pico Gilman, Steven J. Miller, Daeyoung Son, Saad Waheed, Janine Wang

Abstract: We introduce a new predator-prey model by replacing the growth and predation constants with a square matrix, and the population density with a population vector. The classical Lotka-Volterra model describes a population that either modulates or converges. Stability analysis of such models has been extensively studied in the works of Merdan (https://doi.org/10.1016/j.chaos.2007.06.062). The new model adds complexity by introducing an age group structure where the population of each age group evolves as prescribed by the Leslie matrix. The added complexity changes the behavior of the model such that the population displays roughly exponential growth or decay. We first provide an exact equation that describes the time evolution and use analytic techniques to obtain an approximate growth factor. We also discuss the variants of the Leslie model, i.e., the complex value predator-prey model and the competitive model. We then prove the Last Species Standing theorem that determines the dominant population in the large time limit. The recursive structure of the model denies the application of simple regression. We discuss a machine learning scheme that allows an admissible fit for the population evolution of Paramecium Aurelia and Paramecium Caudatum. Another potential avenue to simplify the computation is to use the machinery of quantum operators. We demonstrate the potential of this approach by computing the Hamiltonian of a simple Leslie system.

Submitted 20 December, 2024; originally announced December 2024.
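
The abstract above builds on the Leslie matrix, in which each age group's population evolves through per-class fecundity and survival rates. The Python sketch below shows one time step of the classical single-species Leslie model only; it is not the paper's predator-prey or competitive extension, and the function name `leslie_step` and all numeric rates are assumptions chosen for illustration.

```python
def leslie_step(fecundity, survival, population):
    """Advance an age-structured population by one time step.

    fecundity[i]: offspring produced per individual in age class i
    survival[i]:  fraction of age class i that survives into class i + 1
    population[i]: current number of individuals in age class i

    Equivalent to multiplying the population vector by a Leslie matrix whose
    first row is `fecundity` and whose sub-diagonal is `survival`.
    """
    if len(fecundity) != len(population) or len(survival) != len(population) - 1:
        raise ValueError("need one fecundity per class and one survival per transition")
    newborns = sum(f * n for f, n in zip(fecundity, population))
    # zip stops at the shorter list, so the oldest class has no successor.
    aged = [s * n for s, n in zip(survival, population)]
    return [newborns] + aged


if __name__ == "__main__":
    # Hypothetical two-class population: juveniles and adults.
    state = [10, 4]
    for _ in range(3):
        state = leslie_step([0, 2], [0.5], state)
        print(state)
```

Iterating the step many times makes the total population grow or decay roughly geometrically, at a rate set by the dominant eigenvalue of the Leslie matrix, which matches the roughly exponential behavior the abstract describes.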

arXiv:2412.18056 [pdf, ps, other]

doi 10.1016/j.jbiomech.2025.112794

A Fluid-Structure Interaction Model of the Zebrafish Aortic Valve

Authors: Alexander D. Kaiser, Jing Wang, Aaron L. Brown, Enbo Zhu, Tzung Hsiai, Alison L. Marsden

Abstract: The zebrafish is a valuable model organism for studying cardiac development and diseases due to its many shared aspects of genetics and anatomy with humans and ease of experimental manipulations. Computational fluid-structure interaction (FSI) simulations are an efficient and highly controllable means to study the function of cardiac valves in development and diseases. Due to their small scales, little is known about the mechanical properties of zebrafish cardiac valves, limiting existing computational studies of zebrafish valves and their interaction with blood. To circumvent these limitations, we took a largely first-principles approach called design-based elasticity that allows us to derive valve geometry, fiber orientation and material properties. In FSI simulations of an adult zebrafish aortic valve, these models produce realistic flow rates when driven by physiological pressures and demonstrate the spatiotemporal dynamics of valvular mechanical properties. These models can be used for future studies of zebrafish cardiac hemodynamics, development, and disease.

Submitted 19 June, 2025; v1 submitted 23 December, 2024; originally announced December 2024.

MSC Class: 92C35 (Primary); 92C10; 76Z05 (Secondary) ACM Class: J.3.1

arXiv:2412.17043 [pdf, other]

Optimal signal transmission and timescale diversity in a model of human brain operating near criticality

Authors: Yang Qi, Jiexiang Wang, Weiyang Ding, Gustavo Deco, Viktor Jirsa, Wenlian Lu, Jianfeng Feng

Abstract: Cortical neurons exhibit a hierarchy of timescales across brain regions in response to input stimuli, which is thought to be crucial for information processing at different temporal scales. Modeling studies suggest that both intra-regional circuit dynamics and the cross-regional connectome may contribute to this timescale diversity. Equally important to diverse timescales is the ability to transmit sensory signals reliably across the whole brain. Therefore, the brain must be able to generate diverse timescales while simultaneously minimizing signal attenuation. To understand the dynamical mechanism behind these phenomena, we develop a second-order mean field model of the human brain by applying moment closure and coarse-graining to a digital twin brain model endowed with whole brain structural connectome. Cross-regional coupling strength is found to induce a phase transition from asynchronous activity to synchronous oscillation. By analyzing the input-response properties of the model, we reveal criticality as a unifying mechanism for simultaneously enabling optimal signal transmission and timescale diversity. We show how structural connectome and criticality jointly shape intrinsic timescale hierarchy across the brain.

Submitted 22 December, 2024; originally announced December 2024.

arXiv:2412.16220 [pdf, other]

Cross-Attention Graph Neural Networks for Inferring Gene Regulatory Networks with Skewed Degree Distribution

Authors: Jiaqi Xiong, Nan Yin, Shiyang Liang, Haoyang Li, Yingxu Wang, Duo Ai, Fang Pan, Jingjie Wang

Abstract: Inferring Gene Regulatory Networks (GRNs) from gene expression data is a pivotal challenge in systems biology, and several innovative computational methods have been introduced. However, most of these studies have not considered the skewed degree distribution of genes. Specifically, some genes may regulate multiple target genes while some genes may be regulated by multiple regulator genes. Such a skewed degree distribution significantly complicates the application of directed graph embedding methods. To tackle this issue, we propose the Cross-Attention Complex Dual Graph Embedding Model (XATGRN). XATGRN employs a cross-attention mechanism to effectively capture intricate gene interactions from gene expression profiles. Additionally, it uses a Dual Complex Graph Embedding approach to manage the skewed degree distribution, thereby ensuring precise prediction of regulatory relationships and their directionality. Our model consistently outperforms existing state-of-the-art methods across various datasets, underscoring its efficacy in elucidating complex gene regulatory mechanisms. The code used in this paper is publicly available at https://github.com/kikixiong/XATGRN.

Submitted 9 January, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

Comments: 11 pages, 6 figures, 1 table

arXiv:2412.10659 [pdf, other]

MEATRD: Multimodal Anomalous Tissue Region Detection Enhanced with Spatial Transcriptomics

Authors: Kaichen Xu, Qilong Wu, Yan Lu, Yinan Zheng, Wenlin Li, Xingjie Tang, Jun Wang, Xiaobo Sun

Abstract: The detection of anomalous tissue regions (ATRs) within affected tissues is crucial in clinical diagnosis and pathological studies. Conventional automated ATR detection methods, primarily based on histology images alone, falter in cases where ATRs and normal tissues have subtle visual differences. The recent spatial transcriptomics (ST) technology profiles gene expressions across tissue regions, offering a molecular perspective for detecting ATRs. However, there is a dearth of ATR detection methods that effectively harness complementary information from both histology images and ST. To address this gap, we propose MEATRD, a novel ATR detection method that integrates histology image and ST data. MEATRD is trained to reconstruct image patches and gene expression profiles of normal tissue spots (inliers) from their multimodal embeddings, followed by learning a one-class classification AD model based on latent multimodal reconstruction errors. This strategy harmonizes the strengths of reconstruction-based and one-class classification approaches. At the heart of MEATRD is an innovative masked graph dual-attention transformer (MGDAT) network, which not only facilitates cross-modality and cross-node information sharing but also addresses the model over-generalization issue commonly seen in reconstruction-based AD methods.
Additionally, we demonstrate that modality-specific, task-relevant information is collated and condensed in multimodal bottleneck encoding generated in MGDAT, marking the first theoretical analysis of the informational properties of multimodal bottleneck encoding. Extensive evaluations across eight real ST datasets reveal MEATRD's superior performance in ATR detection, surpassing various state-of-the-art AD methods. Remarkably, MEATRD also proves adept at discerning ATRs that only show slight visual deviations from normal tissues.

Submitted 13 December, 2024; originally announced December 2024.

Comments: AAAI 2025. Code: https://github.com/wqlzuel/MEATRD

arXiv:2412.07236 [pdf, other]

CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding

Authors: Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Haiteng Jiang, Shijian Li, Tao Li, Gang Pan

Abstract: Electroencephalography (EEG) is a non-invasive technique to measure and record brain electrical activity, widely used in various BCI and healthcare applications. Early EEG decoding methods rely on supervised learning, limited by specific tasks and datasets, hindering model performance and generalizability. With the success of large language models, there is a growing body of studies focusing on EEG foundation models. However, these studies still leave challenges. First, most existing EEG foundation models employ a full EEG modeling strategy, which models the spatial and temporal dependencies between all EEG patches together but ignores that these dependencies are heterogeneous due to the unique structural characteristics of EEG signals. Second, existing EEG foundation models have limited generalizability across a wide range of downstream BCI tasks because the varying formats of EEG data make adaptation challenging. To address these challenges, we propose a novel foundation model called CBraMod. Specifically, we devise a criss-cross transformer as the backbone to thoroughly leverage the structural characteristics of EEG signals, which can model spatial and temporal dependencies separately through two parallel attention mechanisms. We also utilize an asymmetric conditional positional encoding scheme, which can encode positional information of EEG patches and be easily adapted to EEG with diverse formats. CBraMod is pre-trained on a very large corpus of EEG through patch-based masked EEG reconstruction.
We evaluate CBraMod on up to 10 downstream BCI tasks (12 public datasets). CBraMod achieves state-of-the-art performance across this wide range of tasks, proving its strong capability and generalizability. The source code is publicly available at https://github.com/wjq-learning/CBraMod.

Submitted 13 April, 2025; v1 submitted 10 December, 2024; originally announced December 2024.

Comments: Accepted by The Thirteenth International Conference on Learning Representations (ICLR 2025)

arXiv:2412.07136 [pdf]

A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction

Authors: Meixu Chen, Kai Wang, Payal Kapur, James Brugarolas, Raquibul Hannan, Jing Wang

Abstract: Purpose: A reliable cancer prognosis model for clear cell renal cell carcinoma (ccRCC) can enhance personalized treatment. We developed a multi-modal ensemble model (MMEM) that integrates pretreatment clinical data, multi-omics data, and histopathology whole slide image (WSI) data to predict overall survival (OS) and disease-free survival (DFS) for ccRCC patients. Methods: We analyzed 226 patients from The Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) dataset, which includes OS and DFS follow-up data and five data modalities: clinical data, WSIs, and three multi-omics datasets (mRNA, miRNA, and DNA methylation). Separate survival models were built for OS and DFS. A Cox proportional hazards (CPH) model with forward feature selection was used for clinical and multi-omics data. Features from WSIs were extracted using ResNet and three general-purpose foundation models. A deep learning-based CPH model predicted survival using encoded WSI features. Risk scores from all models were combined based on training performance. Results: Performance was assessed using concordance index (C-index) and AUROC. The clinical feature-based CPH model received the highest weight for both OS and DFS tasks. Among WSI-based models, the general-purpose foundation model (UNI) achieved the best performance. The final MMEM model surpassed single-modality models, achieving C-indices of 0.820 (OS) and 0.833 (DFS), and AUROC values of 0.831 (3-year patient death) and 0.862 (cancer recurrence).
Using predicted risk medians to stratify high- and low-risk groups, log-rank tests showed improved performance in both OS and DFS compared to single-modality models. Conclusion: MMEM is the first multi-modal model for ccRCC patients, integrating five data modalities. It outperformed single-modality models in prognostic ability and has the potential to assist in ccRCC patient management if independently validated.

Submitted 9 December, 2024; originally announced December 2024.

Comments: 10 pages, 3 figures, 4 tables

arXiv:2412.06847 [pdf, other]

M$^{3}$-20M: A Large-Scale Multi-Modal Molecule Dataset for AI-driven Drug Design and Discovery

Authors: Siyuan Guo, Lexuan Wang, Chang Jin, Jinxian Wang, Han Peng, Huayang Shi, Wengen Li, Jihong Guan, Shuigeng Zhou

Abstract: This paper introduces M$^{3}$-20M, a large-scale Multi-Modal Molecule dataset that contains over 20 million molecules, with the data mainly integrated from existing databases and partially generated by large language models. Designed to support AI-driven drug design and discovery, M$^{3}$-20M contains 71 times as many molecules as the largest existing dataset, providing an unprecedented scale that can highly benefit the training or fine-tuning of models, including large language models for drug design and discovery tasks. This dataset integrates one-dimensional SMILES, two-dimensional molecular graphs, three-dimensional molecular structures, physicochemical properties, and textual descriptions collected through web crawling and generated using GPT-3.5, offering a comprehensive view of each molecule. To demonstrate the power of M$^{3}$-20M in drug design and discovery, we conduct extensive experiments on two key tasks: molecule generation and molecular property prediction, using large language models including GLM4, GPT-3.5, GPT-4, and Llama3-8b. Our experimental results show that M$^{3}$-20M can significantly boost model performance in both tasks. Specifically, it enables the models to generate more diverse and valid molecular structures and achieve higher property prediction accuracy than existing single-modal datasets, which validates the value and potential of M$^{3}$-20M in supporting AI-driven drug design and discovery. The dataset is available at https://github.com/bz99bz/M-3.

Submitted 16 March, 2025; v1 submitted 7 December, 2024; originally announced December 2024.

arXiv:2411.17206 [pdf, other]

Energy Consumption Optimization, Response Time Differences and Indicators in Cortical Working Memory Revealed by Nonequilibrium

Authors: Xiaochen Wang, Yuxuan Wu, Feng Zhang, Jin Wang

Abstract: The neocortex, a complex system driving multi-region interactions, remains a core puzzle in neuroscience. Despite quantitative insights across brain scales, understanding the mechanisms underlying neural activities is challenging. Advances from Hopfield networks to large-scale cortical models have deepened neural network theory, yet these models often fall short of capturing global brain functions. In large-scale cortical networks, an intriguing hierarchy of timescales reflects diverse information processing speeds across spatial regions. As a non-equilibrium system, the brain incurs significant energy costs, with long-distance connectivity suggesting an evolutionary spatial organization. To explore these complexities, we introduce a nonequilibrium landscape flux approach to analyze cortical networks. This allows us to quantify potential landscapes and principal transition paths, uncovering dynamical characteristics across timescales. We examine whether temporal hierarchies correlate with stimuli distribution and how hierarchical networks exhibit differential responses. Furthermore, our analysis quantifies the thermodynamic cost of sustaining cognition, highlighting a link to network connectivity. These findings provide insights into energy consumption during cognitive processes and emphasize the spatial benefits for working memory tasks. Experimental validation is challenging due to evolutionary variability, making our theoretical approach valuable for quantifying complex dynamics.
By assessing time irreversibility and critical slowdown, we gain predictive insights into network bifurcations and state transitions, offering practical tools for identifying cortical state changes. These results advance our understanding of cortical dynamics.

Submitted 26 November, 2024; originally announced November 2024.

arXiv:2411.14743 [pdf, other]

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

Authors: Zhengrui Guo, Conghao Xiong, Jiabo Ma, Qichen Sun, Lishuang Feng, Jinzhuo Wang, Hao Chen

Abstract: Few-shot learning presents a critical solution for cancer diagnosis in computational pathology (CPath), addressing fundamental limitations in data availability, particularly the scarcity of expert annotations and patient privacy constraints. A key challenge in this paradigm stems from the inherent disparity between the limited training set of whole slide images (WSIs) and the enormous number of contained patches, where a significant portion of these patches lacks diagnostically relevant information, potentially diluting the model's ability to learn and focus on critical diagnostic features. While recent works attempt to address this by incorporating additional knowledge, several crucial gaps hinder further progress: (1) despite the emergence of powerful pathology foundation models (FMs), their potential remains largely untapped, with most approaches limiting their use to basic feature extraction; (2) current language guidance mechanisms attempt to align text prompts with vast numbers of WSI patches all at once, struggling to leverage rich pathological semantic information. To this end, we introduce the knowledge-enhanced adaptive visual compression framework, dubbed FOCUS, which uniquely combines pathology FMs with language prior knowledge to enable a focused analysis of diagnostically relevant regions by prioritizing discriminative WSI patches.
Our approach implements a progressive three-stage compression strategy: we first leverage FMs for global visual redundancy elimination, and integrate compressed features with language prompts for semantic relevance assessment, then perform neighbor-aware visual token filtering while preserving spatial coherence. Extensive experiments on pathological datasets spanning breast, lung, and ovarian cancers demonstrate its superior performance in few-shot pathology diagnosis. Codes are available at https://github.com/dddavid4real/FOCUS.

Submitted 20 March, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

Comments: Accepted by CVPR'2025

arXiv:2411.05244 [pdf]

Nonperfused Retinal Capillaries -- A New Method Developed on OCT and OCTA

Authors: Min Gao, Yukun Guo, Tristan T. Hormel, Jie Wang, Elizabeth White, Dong-Wouk Park, Thomas S. Hwang, Steven T. Bailey, Yali Jia

Abstract: To develop a new method to quantify nonperfused retinal capillaries (NPCs) by using co-registered optical coherence tomography (OCT) and OCT angiography (OCTA), and to evaluate NPCs in eyes with age-related macular degeneration (AMD) and diabetic retinopathy (DR). Multiple consecutive 3x3-mm OCT/OCTA scans were obtained using a commercial device (Solix; Visionix/Optovue, Inc., California, USA). We averaged multiple registered OCT/OCTA scans to create high-definition volumes. The deep capillary plexus slab was defined and segmented. A novel deep learning denoising algorithm removed tissue background noise from capillaries in the en face OCT/OCTA. The algorithm segmented NPCs by identifying capillaries from OCT without corresponding flow signals in the OCTA. We then investigated the relationships between NPCs and known features in AMD and DR. The denoised en face OCT/OCTA revealed the structure and flow of the capillaries. The automatically segmented NPC achieved an accuracy of 88.2% compared to manual grading of DR. Compared to healthy controls, both the mean number and total length (mm) of NPCs were significantly increased in eyes with AMD and eyes with DR (P < 0.001). Compared to early and intermediate AMD, the number and total length of NPCs were significantly higher in advanced AMD (number: P < 0.001, P < 0.001; total length: P = 0.002, P = 0.003). Geographic atrophy, macular neovascularization, drusen volume, and extrafoveal avascular area (EAA) significantly correlated with increased NPCs (P < 0.05).
In eyes with DR, NPCs correlated with the number of microaneurysms and EAA (P < 0.05). The presence of fluid did not significantly correlate with NPCs in AMD and DR. Conclusions: A deep learning-based algorithm can segment and quantify retinal capillaries that lack flow using colocalized OCT/OCTA. This novel biomarker may be useful in AMD and DR.

Submitted 7 November, 2024; originally announced November 2024.

arXiv:2410.20132 [pdf, ps, other]

On-Site Precise Screening of SARS-CoV-2 Systems Using a Channel-Wise Attention-Based PLS-1D-CNN Model with Limited Infrared Signatures

Authors: Wenwen Zhang, Zhouzhuo Tang, Yingmei Feng, Xia Yu, Qi Jie Wang, Zhiping Lin

Abstract: During the early stages of respiratory virus outbreaks, such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the efficient use of limited nasopharyngeal swabs for rapid and accurate screening is crucial for public health. In this study, we present a methodology that integrates attenuated total reflection-Fourier transform infrared spectroscopy (ATR-FTIR) with the adaptive iteratively reweighted penalized least squares (airPLS) preprocessing algorithm and a channel-wise attention-based partial least squares one-dimensional convolutional neural network (PLS-1D-CNN) model, enabling accurate screening of infected individuals within 10 minutes. Two cohorts of nasopharyngeal swab samples, comprising 126 and 112 samples from suspected SARS-CoV-2 Omicron variant cases, were collected at Beijing You'an Hospital for verification. Given that ATR-FTIR spectra are highly sensitive to variations in experimental conditions, which can affect their quality, we propose a biomolecular importance (BMI) evaluation method to assess signal quality across different conditions, validated by comparing BMI with PLS-GBM and PLS-RF results. For the ATR-FTIR signals in cohort 2, which exhibited a higher BMI, airPLS was utilized for signal preprocessing, followed by the application of the channel-wise attention-based PLS-1D-CNN model for screening.
The experimental results demonstrate that our model outperforms recently reported methods in the field of respiratory virus spectrum detection, achieving a recognition screening accuracy of 96.48%, a sensitivity of 96.24%, a specificity of 97.14%, an F1-score of 96.12%, and an AUC of 0.99. It meets the World Health Organization (WHO) recommended criteria for an acceptable product: sensitivity of 95.00% or greater and specificity of 97.00% or greater for testing prior SARS-CoV-2 infection in moderate to high volume scenarios.

Submitted 26 October, 2024; originally announced October 2024.

arXiv:2410.17620 [pdf]

Holistic structure of neural pathways underlies brain perceptual rivalry: Physical mechanism of auditory stream segregation

Authors: Yuxuan Wu, Jinling Gao, Xiaona Fang, Jin Wang

Abstract: Brain perceptual rivalry, exemplified by auditory stream segregation of competing tones (A_, B__, ABA_), serves as a core mechanism of brain perception formation. While increasingly recognized as determined by neural connections rather than specific neural groups, the mechanism of brain perception remains uncertain. We demonstrate that auditory stream segregation arises from the topological structure of holistic neural pathways. By constructing a holistic pathway model using existing neurophysiological data, combining nonlinear neural dynamics and nonequilibrium physics, we uncover the biophysical mechanism of perceptual phase transitions from integrated (ABA_) to segregated streams (A_ or B_), as well as the mechanism of temporal dynamics, perceptual switching path, and attention regulation underlying these transitions. Further, we demonstrate how our framework reveals energy consumption of the auditory system and combines it with neuroelectrophysiology. Two psycho-acoustic experiments validate our predictions of perception alternation and attention modulation. Our framework provides a transformative perspective on how brain networks generate complex perceptual experiences, emphasizing the significance of neural pathway structure in the process of brain function realization.

Submitted 7 March, 2025; v1 submitted 23 October, 2024; originally announced October 2024.

Comments: 26 pages, 8 figures

arXiv:2410.13872 [pdf, other]

BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation

Authors: Zhengrui Guo, Fangxu Zhou, Wei Wu, Qichen Sun, Lishuang Feng, Jinzhuo Wang, Hao Chen

Abstract: Modeling the nonlinear dynamics of neuronal populations represents a key pursuit in computational neuroscience. Recent research has increasingly focused on jointly modeling neural activity and behavior to unravel their interconnections. Despite significant efforts, these approaches often necessitate either intricate model designs or oversimplified assumptions. Given the frequent absence of perfectly paired neural-behavioral datasets in real-world scenarios when deploying these models, a critical yet understudied research question emerges: how to develop a model that performs well using only neural activity as input at inference, while benefiting from the insights gained from behavioral signals during training? To this end, we propose BLEND, the behavior-guided neural population dynamics modeling framework via privileged knowledge distillation. By considering behavior as privileged information, we train a teacher model that takes both behavior observations (privileged features) and neural activities (regular features) as inputs. A student model is then distilled using only neural activity. Unlike existing methods, our framework is model-agnostic and avoids making strong assumptions about the relationship between behavior and neural activity. This allows BLEND to enhance existing neural dynamics modeling architectures without developing specialized models from scratch.
Extensive experiments across neural population activity modeling and transcriptomic neuron identity prediction tasks demonstrate strong capabilities of BLEND, reporting over 50% improvement in behavioral decoding and over 15% improvement in transcriptomic neuron identity prediction after behavior-guided distillation. Furthermore, we empirically explore various behavior-guided distillation strategies within the BLEND framework and present a comprehensive analysis of effectiveness and implications for model performance.

Submitted 6 February, 2025; v1 submitted 2 October, 2024; originally announced October 2024.

Comments: Accepted by ICLR'2025

arXiv:2410.07919 [pdf, other]

InstructBioMol: Advancing Biomolecule Understanding and Design Following Human Instructions

Authors: Xiang Zhuang, Keyan Ding, Tianwen Lyu, Yinuo Jiang, Xiaotong Li, Zhuoyi Xiang, Zeyuan Wang, Ming Qin, Kehua Feng, Jike Wang, Qiang Zhang, Huajun Chen

Abstract: Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology, and enzyme engineering. Recent breakthroughs in Artificial Intelligence (AI) have revolutionized biomolecular research, achieving remarkable accuracy in biomolecular prediction and design. However, a critical gap remains between AI's computational power and researchers' intuition, using natural language to align molecular complexity with human intentions. Large Language Models (LLMs) have shown potential to interpret human intentions, yet their application to biomolecular research remains nascent due to challenges including specialized knowledge requirements, multimodal data integration, and semantic alignment between natural language and biomolecules. To address these limitations, we present InstructBioMol, a novel LLM designed to bridge natural language and biomolecules through a comprehensive any-to-any alignment of natural language, molecules, and proteins. This model can integrate multimodal biomolecules as input, and enable researchers to articulate design goals in natural language, providing biomolecular outputs that meet precise biological needs. Experimental results demonstrate InstructBioMol can understand and design biomolecules following human instructions.
Notably, it can generate drug molecules with a 10% improvement in binding affinity and design enzymes that achieve an ESP Score of 70.4, making it the only method to surpass the enzyme-substrate interaction threshold of 60.0 recommended by the ESP developer. This highlights its potential to transform real-world biomolecular research.

Submitted 10 October, 2024; originally announced October 2024.

arXiv:2410.04351 [pdf]

An asymmetric surface coating strategy for promotes rapid endothelialization in the rabbit carotid artery

Authors: Lili Tan, Zhiyi Ye, Suhua Yu, Jinxuan Wang, Chenxi Ouyang, Zhengcai Zhang, Robert Guidoin, Guixue Wang

Abstract: Studying surface modification has long been a key area for enhancing the effects of vascular stents after surgery. The study aimed to develop an asymmetric drug-eluting stent (ADES) with differential drug loading on its inner and outer surfaces, hypothesizing that this design would enhance drug delivery efficacy for percutaneous coronary interventions (PCIs) compared to uniformly coated drug-eluting stents (UDES). An ultrasonic atomization spraying device was utilized to fabricate the ADES, which was subsequently evaluated for drug release patterns, hemocompatibility, and biocompatibility. In vitro assessments demonstrated favorable hemocompatibility and showed targeted drug delivery capabilities of ADES within artificial blood vessels. Furthermore, in vivo testing using a rabbit carotid artery model revealed significant endothelialization on stented segments treated with the ADES. These findings suggest that the ADES holds promise as a minimally invasive platform for improving cardiovascular disease treatment outcomes by addressing thrombus formation and neointima proliferation more effectively than traditional stents.

Submitted 6 October, 2024; originally announced October 2024.

Comments: 24 pages, 7 figures, 1 table

arXiv:2409.08022 [pdf, other]

De novo design of high-affinity protein binders with AlphaProteo

Authors: Vinicius Zambaldi, David La, Alexander E. Chu, Harshnira Patani, Amy E. Danson, Tristan O. C. Kwan, Thomas Frerix, Rosalia G. Schneider, David Saxton, Ashok Thillaisundaram, Zachary Wu, Isabel Moraes, Oskar Lange, Eliseo Papa, Gabriella Stanton, Victor Martin, Sukhdeep Singh, Lai H. Wong, Russ Bates, Simon A. Kohl, Josh Abramson, Andrew W. Senior, Yilmaz Alguel, Mary Y. Wu, Irene M. Aspalter, et al. (7 additional authors not shown)

Abstract: Computational design of protein-binding proteins is a fundamental capability with broad utility in biomedical research and biotechnology. Recent methods have made strides against some target proteins, but on-demand creation of high-affinity binders without multiple rounds of experimental testing remains an unsolved challenge. This technical report introduces AlphaProteo, a family of machine learning models for protein design, and details its performance on the de novo binder design problem. With AlphaProteo, we achieve 3- to 300-fold better binding affinities and higher experimental success rates than the best existing methods on seven target proteins. Our results suggest that AlphaProteo can generate binders "ready-to-use" for many research applications using only one round of medium-throughput screening and no further optimization.

Submitted 12 September, 2024; originally announced September 2024.

Comments: 45 pages, 17 figures

arXiv:2408.12413 [pdf, other]

Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures

Authors: Ce Liu, Jun Wang, Zhiqiang Cai, Yingxu Wang, Huizhen Kuang, Kaihui Cheng, Liwei Zhang, Qingkun Su, Yining Tang, Fenglei Cao, Limei Han, Siyu Zhu, Yuan Qi

Abstract: Despite significant progress in static protein structure collection and prediction, the dynamic behavior of proteins, one of their most vital characteristics, has been largely overlooked in prior research. This oversight can be attributed to the limited availability, diversity, and heterogeneity of dynamic protein datasets. To address this gap, we propose to enhance existing prestigious static 3D protein structural databases, such as the Protein Data Bank (PDB), by integrating dynamic data and additional physical properties. Specifically, we introduce a large-scale dataset, Dynamic PDB, encompassing approximately 12.6K proteins, each subjected to all-atom molecular dynamics (MD) simulations lasting 1 microsecond to capture conformational changes. Furthermore, we provide a comprehensive suite of physical properties, including atomic velocities and forces, potential and kinetic energies of proteins, and the temperature of the simulation environment, recorded at 1 picosecond intervals throughout the simulations. For benchmarking purposes, we evaluate state-of-the-art methods on the proposed dataset for the task of trajectory prediction. To demonstrate the value of integrating richer physical properties in the study of protein dynamics and related model design, we base our approach on the SE(3) diffusion model and incorporate these physical properties into the trajectory prediction process. Preliminary results indicate that this straightforward extension of the SE(3) model yields improved accuracy, as measured by MAE and RMSD, when the proposed physical properties are taken into consideration. https://fudan-generative-vision.github.io/dynamicPDB/

Submitted 18 September, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

arXiv:2407.12296 [pdf]

Discovery of novel antimicrobial peptides with notable antibacterial potency by a LLM-based foundation model

Authors: Jike Wang, Jianwen Feng, Yu Kang, Peichen Pan, Jingxuan Ge, Yan Wang, Mingyang Wang, Zhenxing Wu, Xingcai Zhang, Jiameng Yu, Xujun Zhang, Tianyue Wang, Lirong Wen, Guangning Yan, Yafeng Deng, Hui Shi, Chang-Yu Hsieh, Zhihui Jiang, Tingjun Hou

Abstract: Large language models (LLMs) have shown remarkable advancements in chemistry and biomedical research, acting as versatile foundation models for various tasks. We introduce AMP-Designer, an LLM-based approach for swiftly designing novel antimicrobial peptides (AMPs) with desired properties. Within 11 days, AMP-Designer achieved the de novo design of 18 AMPs with broad-spectrum activity against Gram-negative bacteria. In vitro validation revealed a 94.4% success rate, with two candidates demonstrating exceptional antibacterial efficacy, minimal hemotoxicity, stability in human plasma, and low potential to induce resistance, as evidenced by significant bacterial load reduction in murine lung infection experiments. The entire process, from design to validation, concluded in 48 days. AMP-Designer excels in creating AMPs targeting specific strains despite limited data availability, with a top candidate displaying a minimum inhibitory concentration of 2.0 μg/ml against Propionibacterium acnes. Integrating advanced machine learning techniques, AMP-Designer demonstrates remarkable efficiency, paving the way for innovative solutions to antibiotic resistance.

Submitted 2 March, 2025; v1 submitted 16 July, 2024; originally announced July 2024.

Comments: 43 pages, 6 figures, 5 tables. Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file

arXiv:2407.09450 [pdf, other]

Human-like Episodic Memory for Infinite Context LLMs

Authors: Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang

Abstract: Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs with no fine-tuning, enabling them to handle practically infinite context lengths while maintaining computational efficiency. EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement in an online fashion. When needed, these events are retrieved through a two-stage memory process, combining similarity-based and temporally contiguous retrieval for efficient and human-like access to relevant information. Experiments on the LongBench and InfiniteBench benchmarks demonstrate EM-LLM's superior performance, consistently outperforming the state-of-the-art retrieval model InfLLM across various baseline LLMs. In addition, EM-LLM outperforms its popular counterpart, RAG, in a wide range of tasks, while requiring similar resources. Notably, EM-LLM's performance even surpasses full-context models in most tasks, while successfully performing retrieval across 10 million tokens - a scale computationally infeasible for such models. Finally, our analysis reveals strong correlations between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart, thereby offering a novel computational framework for exploring human memory mechanisms.

Submitted 25 October, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.07930 [pdf]

Token-Mol 1.0: Tokenized drug design with large language model

Authors: Jike Wang, Rui Qin, Mingyang Wang, Meijing Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

Abstract: Significant interest has recently arisen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduced Token-Mol, a token-only 3D drug design model. This model encodes all molecular information, including 2D and 3D structures, as well as molecular property data, into tokens, which transforms classification and regression tasks in drug discovery into probabilistic prediction problems, thereby enabling learning through a unified paradigm. Token-Mol is built on the transformer decoder architecture and trained using random causal masking techniques. Additionally, we proposed the Gaussian cross-entropy (GCE) loss function to overcome the challenges in regression tasks, significantly enhancing the capacity of LLMs to learn continuous numerical values. Through a combination of fine-tuning and reinforcement learning (RL), Token-Mol achieves performance comparable to or surpassing existing task-specific methods across various downstream tasks, including pocket-based molecular generation, conformation generation, and molecular property prediction. Compared to existing molecular pre-trained models, Token-Mol exhibits superior proficiency in handling a wider range of downstream tasks essential for drug design. Notably, our approach improves regression task accuracy by approximately 30% compared to similar token-only methods. Token-Mol overcomes the precision limitations of token-only models and has the potential to integrate seamlessly with general models such as ChatGPT, paving the way for the development of a universal artificial intelligence drug design model that facilitates rapid and high-quality drug design by experts.

Submitted 19 August, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Showing 1–50 of 247 results for author: Wang, J