
Showing 1–50 of 60 results for author: Zhu, J

Searching in archive q-bio.
  1. arXiv:2506.14021  [pdf, ps, other]

    q-bio.QM

    An 11,000-Study Open-Access Dataset of Longitudinal Magnetic Resonance Images of Brain Metastases

    Authors: Saahil Chadha, David Weiss, Anastasia Janas, Divya Ramakrishnan, Thomas Hager, Klara Osenberg, Klara Willms, Joshua Zhu, Veronica Chiang, Spyridon Bakas, Nazanin Maleki, Durga V. Sritharan, Sven Schoenherr, Malte Westerhoff, Matthew Zawalich, Melissa Davis, Ajay Malhotra, Khaled Bousabarah, Cornelius Deuschl, MingDe Lin, Sanjay Aneja, Mariam S. Aboian

    Abstract: Brain metastases are a common complication of systemic cancer, affecting over 20% of patients with primary malignancies. Longitudinal magnetic resonance imaging (MRI) is essential for diagnosing patients, tracking disease progression, assessing therapeutic response, and guiding treatment selection. However, the manual review of longitudinal imaging is time-intensive, especially for patients with m…

    Submitted 16 June, 2025; originally announced June 2025.

  2. arXiv:2506.01833  [pdf, ps, other]

    cs.LG q-bio.GN

    SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model

    Authors: Zhao Yang, Jiwei Zhu, Bing Su

    Abstract: Inspired by the success of unsupervised pre-training paradigms, researchers have applied these approaches to DNA pre-training. However, we argue that these approaches alone yield suboptimal results because pure DNA sequences lack sufficient information, since their functions are regulated by genomic profiles like chromatin accessibility. Here, we demonstrate that supervised training for genomic pr…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML 2025

  3. arXiv:2506.01456  [pdf]

    q-bio.GN cs.AI cs.LG q-bio.NC

    GenDMR: A dynamic multimodal role-swapping network for identifying risk gene phenotypes

    Authors: Lina Qin, Cheng Zhu, Chuqi Zhou, Yukun Huang, Jiayi Zhu, Ping Liang, Jinju Wang, Yixing Huang, Cheng Luo, Dezhong Yao, Ying Tan

    Abstract: Recent studies have shown that integrating multimodal data fusion techniques for imaging and genetic features is beneficial for the etiological analysis and predictive diagnosis of Alzheimer's disease (AD). However, there are several critical flaws in current deep learning methods. Firstly, there has been insufficient discussion and exploration regarding the selection and encoding of genetic infor…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 31 pages, 9 figures

  4. arXiv:2501.16391  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Inductive-Associative Meta-learning Pipeline with Human Cognitive Patterns for Unseen Drug-Target Interaction Prediction

    Authors: Xiaoqing Lian, Jie Zhu, Tianxu Lv, Shiyun Nie, Hang Fan, Guosheng Wu, Yunjun Ge, Lihua Li, Xiangxiang Zeng, Xiang Pan

    Abstract: Significant differences in protein structures hinder the generalization of existing drug-target interaction (DTI) models, which often rely heavily on pre-learned binding principles or detailed annotations. In contrast, BioBridge designs an Inductive-Associative pipeline inspired by the workflow of scientists who base their accumulated expertise on drawing insights into novel drug-target pairs from…

    Submitted 27 March, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

  5. arXiv:2501.08518  [pdf, other]

    cs.HC cs.AI eess.SP q-bio.QM

    Easing Seasickness through Attention Redirection with a Mindfulness-Based Brain–Computer Interface

    Authors: Xiaoyu Bao, Kailin Xu, Jiawei Zhu, Haiyun Huang, Kangning Li, Qiyun Huang, Yuanqing Li

    Abstract: Seasickness is a prevalent issue that adversely impacts both passenger experiences and the operational efficiency of maritime crews. While techniques that redirect attention have proven effective in alleviating motion sickness symptoms in terrestrial environments, applying similar strategies to manage seasickness poses unique challenges due to the prolonged and intense motion environment associate…

    Submitted 14 January, 2025; originally announced January 2025.

  6. arXiv:2410.11224  [pdf, other]

    q-bio.BM cs.LG

    DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking

    Authors: Jiaxian Yan, Zaixi Zhang, Jintao Zhu, Kai Zhang, Jianfeng Pei, Qi Liu

    Abstract: Molecular docking, a technique for predicting ligand binding poses, is crucial in structure-based drug design for understanding protein-ligand interactions. Recent advancements in docking methods, particularly those leveraging geometric deep learning (GDL), have demonstrated significant efficiency and accuracy advantages over traditional sampling methods. Despite these advancements, current method…

    Submitted 16 October, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS'24

  7. arXiv:2407.15202  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors

    Authors: Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan

    Abstract: Drug-Target binding Affinity (DTA) prediction is essential for drug discovery. Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal. In this work, inspired by the recent success of retrieval methods, we propose $k$NN-DTA, a non-parametric embedding-based retrieval method adopted on a pre-trained DTA prediction model, which can extend the power…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: Accepted by 33rd ACM International Conference on Information and Knowledge Management 2024 (CIKM 2024)

  8. arXiv:2407.08174  [pdf, other]

    cs.HC q-bio.NC

    An Adaptively Weighted Averaging Method for Regional Time Series Extraction of fMRI-based Brain Decoding

    Authors: Jianfei Zhu, Baichun Wei, Jiaru Tian, Feng Jiang, Chunzhi Yi

    Abstract: Brain decoding that classifies cognitive states using the functional fluctuations of the brain can provide insightful information for understanding the brain mechanisms of cognitive functions. Among the common procedures of decoding the brain cognitive states with functional magnetic resonance imaging (fMRI), extracting the time series of each brain region after brain parcellation traditionally av…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 17 pages, 4 figures

    ACM Class: J.3

  9. Systematic evaluation of the isolated effect of tissue environment on the transcriptome using a single-cell RNA-seq atlas dataset

    Authors: Daigo Okada, Jianshen Zhu, Kan Shota, Yuuki Nishimura, Kazuya Haraguchi

    Abstract: Background: Understanding cellular diversity throughout the body is essential for elucidating the complex functions of biological systems. Recently, large-scale single-cell omics datasets, known as omics atlases, have become available. These atlases encompass data from diverse tissues and cell-types, providing insights into the landscape of cell-type-specific gene expression. However, the isolated…

    Submitted 19 December, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Journal ref: BMC Genomics 26, 416 (2025)

  10. arXiv:2406.05797  [pdf, other]

    q-bio.BM cs.AI cs.CE cs.CL cs.LG

    3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

    Authors: Qizhi Pei, Rui Yan, Kaiyuan Gao, Jinhua Zhu, Lijun Wu

    Abstract: The integration of molecular and natural language representations has emerged as a focal point in molecular science, with recent advancements in Language Models (LMs) demonstrating significant potential for comprehensive modeling of both domains. However, existing approaches face notable limitations, particularly in their neglect of three-dimensional (3D) information, which is crucial for understa…

    Submitted 18 March, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by ICLR 2025

  11. arXiv:2405.00513  [pdf]

    q-bio.QM

    3D MR Fingerprinting for Dynamic Contrast-Enhanced Imaging of Whole Mouse Brain

    Authors: Yuran Zhu, Guanhua Wang, Yuning Gu, Walter Zhao, Jiahao Lu, Junqing Zhu, Christina J. MacAskill, Andrew Dupuis, Mark A. Griswold, Dan Ma, Chris A. Flask, Xin Yu

    Abstract: Quantitative MRI enables direct quantification of contrast agent concentrations in contrast-enhanced scans. However, the lengthy scan times required by conventional methods are inadequate for tracking contrast agent transport dynamically in mouse brain. We developed a 3D MR fingerprinting (MRF) method for simultaneous T1 and T2 mapping across the whole mouse brain with 4.3-min temporal resolution.…

    Submitted 5 August, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  12. arXiv:2403.20261  [pdf, other]

    q-bio.BM cs.AI cs.LG

    FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

    Authors: Kaiyuan Gao, Qizhi Pei, Gongbo Zhang, Jinhua Zhu, Kun He, Lijun Wu

    Abstract: Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with…

    Submitted 24 February, 2025; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted for presentation at KDD 2025

  13. arXiv:2403.17513  [pdf, other]

    physics.chem-ph physics.bio-ph q-bio.BM

    A unified framework for coarse grained molecular dynamics of proteins with high-fidelity reconstruction

    Authors: Jinzhen Zhu

    Abstract: Simulating large proteins using traditional molecular dynamics (MD) is computationally demanding. To address this challenge, we propose a novel tree-structured coarse-grained model that efficiently captures protein dynamics. By leveraging a hierarchical protein representation, our model accurately reconstructs high-resolution protein structures, with sub-angstrom precision achieved for a 168-amino…

    Submitted 9 December, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 14 pages, 10 figures

  14. arXiv:2403.01528  [pdf, other]

    cs.CL cs.AI q-bio.BM

    Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Yue Wang, Zun Wang, Tao Qin, Rui Yan

    Abstract: The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology. This approach leverages the rich, multifaceted descriptions of biomolecules contained within textual data sources to enhance our fundamental understanding and enable downstream computational tasks such as biomol…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Survey Paper. 25 pages, 9 figures, and 3 tables

  15. arXiv:2402.17810  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG q-bio.BM

    BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning

    Authors: Qizhi Pei, Lijun Wu, Kaiyuan Gao, Xiaozhuan Liang, Yin Fang, Jinhua Zhu, Shufang Xie, Tao Qin, Rui Yan

    Abstract: Recent research trends in computational biology have increasingly focused on integrating text and bio-entity modeling, especially in the context of molecules and proteins. However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC). This paper intro…

    Submitted 31 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (Findings)

  16. arXiv:2402.12391  [pdf, other]

    q-bio.GN cs.AI cs.LG

    Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data

    Authors: Haoyang Liu, Yijiang Li, Jinglin Jian, Yuxuan Cheng, Jianrong Lu, Shuyi Guo, Jinglei Zhu, Mianchen Zhang, Miantong Zhang, Haohan Wang

    Abstract: Machine learning has emerged as a powerful tool for scientific discovery, enabling researchers to extract meaningful insights from complex datasets. For instance, it has facilitated the identification of disease-predictive genes from gene expression data, significantly advancing healthcare. However, the traditional process for analyzing such datasets demands substantial human effort and expertise…

    Submitted 20 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 18 pages, 2 figures; added contact

  17. arXiv:2402.06772  [pdf, other]

    q-bio.QM cs.AI cs.CE cs.LG

    Retrosynthesis Prediction via Search in (Hyper) Graph

    Authors: Zixun Lan, Binjie Hong, Jiajun Zhu, Zuo Zeng, Zhenfu Liu, Limin Yu, Fei Ma

    Abstract: Predicting reactants from a specified core product stands as a fundamental challenge within organic synthesis, termed retrosynthesis prediction. Recently, semi-template-based methods and graph-edits-based methods have achieved good performance in terms of both interpretability and accuracy. However, due to their mechanisms these methods cannot predict complex reactions, e.g., reactions with multip…

    Submitted 9 February, 2024; originally announced February 2024.

  18. arXiv:2401.10806  [pdf, ps, other]

    q-bio.BM

    DeepRLI: A Multi-objective Framework for Universal Protein–Ligand Interaction Prediction

    Authors: Haoyu Lin, Shiwei Wang, Jintao Zhu, Yibo Li, Jianfeng Pei, Luhua Lai

    Abstract: Protein (receptor)–ligand interaction prediction is a critical component in computer-aided drug design, significantly influencing molecular docking and virtual screening processes. Despite the development of numerous scoring functions in recent years, particularly those employing machine learning, accurately and efficiently predicting binding affinities for protein–ligand complexes remains a for…

    Submitted 19 January, 2024; originally announced January 2024.

  19. arXiv:2311.15201  [pdf, other]

    q-bio.BM

    DiffBindFR: An SE(3) Equivariant Network for Flexible Protein-Ligand Docking

    Authors: Jintao Zhu, Zhonghui Gu, Jianfeng Pei, Luhua Lai

    Abstract: Molecular docking, a key technique in structure-based drug design, plays pivotal roles in protein-ligand interaction modeling, hit identification and optimization, in which accurate prediction of protein-ligand binding mode is essential. Conventional docking approaches perform well in redocking tasks with known protein binding pocket conformation in the complex state. However, in real-world dockin…

    Submitted 19 December, 2023; v1 submitted 26 November, 2023; originally announced November 2023.

  20. arXiv:2310.13468  [pdf, other]

    q-bio.PE physics.soc-ph q-bio.QM

    EpiGeoPop: A Tool for Developing Spatially Accurate Country-level Epidemiological Models

    Authors: Lara Herriott, Henriette L. Capel, Isaac Ellmen, Nathan Schofield, Jiayuan Zhu, Ben Lambert, David Gavaghan, Ioana Bouros, Richard Creswell, Kit Gallagher

    Abstract: Mathematical models play a crucial role in understanding the spread of infectious disease outbreaks and influencing policy decisions. These models aid pandemic preparedness by predicting outcomes under hypothetical scenarios and identifying weaknesses in existing frameworks. However, their accuracy, utility, and comparability are being scrutinized. Agent-based models (ABMs) have emerged as a valua…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 16 pages, 6 figures, 3 supplementary figures

  21. arXiv:2310.07276  [pdf, other]

    cs.CL cs.AI cs.LG q-bio.BM

    BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

    Authors: Qizhi Pei, Wei Zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, Rui Yan

    Abstract: Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery. However, current models exhibit several limitations, such as the generation of invalid molecular SMILES, underutilization of contextual information, and equal treatment of structured and unstructured knowledge. To address these issues, we propose…

    Submitted 28 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted by Empirical Methods in Natural Language Processing 2023 (EMNLP 2023)

  22. arXiv:2310.06763  [pdf, other]

    cs.LG cs.AI q-bio.BM

    FABind: Fast and Accurate Protein-Ligand Binding

    Authors: Qizhi Pei, Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Kun He, Tie-Yan Liu, Rui Yan

    Abstract: Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based meth…

    Submitted 8 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted by Neural Information Processing Systems 2023 (NeurIPS 2023)

  23. arXiv:2309.07165  [pdf]

    q-bio.PE

    Revive, Restore, Revitalize: An Eco-economic Methodology for Maasai Mara

    Authors: Yipeng Xu, He Sun, Junfeng Zhu

    Abstract: The Maasai Mara in Kenya, renowned for its biodiversity, is witnessing ecosystem degradation and species endangerment due to intensified human activities. Addressing this, we introduce a dynamic system harmonizing ecological and human priorities. Our agent-based model replicates the Maasai Mara savanna ecosystem, incorporating 71 animal species, 10 human classifications, and 2 natural resource typ…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 25 pages, 16 figures

  24. arXiv:2307.08576  [pdf]

    q-bio.NC cs.LG

    A Study on the Performance of Generative Pre-trained Transformer (GPT) in Simulating Depressed Individuals on the Standardized Depressive Symptom Scale

    Authors: Sijin Cai, Nanfeng Zhang, Jiaying Zhu, Yanjie Liu, Yongjin Zhou

    Abstract: Background: Depression is a common mental disorder with societal and economic burden. Current diagnosis relies on self-reports and assessment scales, which have reliability issues. Objective approaches are needed for diagnosing depression. Objective: Evaluate the potential of GPT technology in diagnosing depression. Assess its ability to simulate individuals with depression and investigate the inf…

    Submitted 17 July, 2023; originally announced July 2023.

  25. arXiv:2306.05445  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning

    Authors: Shuxin Zheng, Jiyan He, Chang Liu, Yu Shi, Ziheng Lu, Weitao Feng, Fusong Ju, Jiaxi Wang, Jianwei Zhu, Yaosen Min, He Zhang, Shidi Tang, Hongxia Hao, Peiran Jin, Chi Chen, Frank Noé, Haiguang Liu, Tie-Yan Liu

    Abstract: Advances in deep learning have greatly improved structure prediction of molecules. However, many macroscopic observations that are important for real-world applications are not functions of a single molecular structure, but rather determined from the equilibrium distribution of structures. Traditional methods for obtaining these distributions, such as molecular dynamics simulation, are computation…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 80 pages, 11 figures

  26. arXiv:2304.01347  [pdf]

    q-bio.NC cs.LG cs.MM

    Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

    Authors: Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo

    Abstract: The available evidence suggests that dynamic functional connectivity (dFC) can capture time-varying abnormalities in brain activity in resting-state cerebral functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia (SZ) patients. Hence, an advanced dynamic brain network analysis model called the temporal br…

    Submitted 11 September, 2023; v1 submitted 30 March, 2023; originally announced April 2023.

  27. arXiv:2211.08406  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Incorporating Pre-training Paradigm for Antibody Sequence-Structure Co-design

    Authors: Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Tianbo Peng, Yingce Xia, Liang He, Shufang Xie, Tao Qin, Haiguang Liu, Kun He, Tie-Yan Liu

    Abstract: Antibodies are versatile proteins that can bind to pathogens and provide effective protection for the human body. Recently, deep learning-based computational antibody design has attracted popular attention since it automatically mines the antibody patterns from data that could be complementary to human experiences. However, the computational methods heavily rely on high-quality antibody structure data…

    Submitted 17 November, 2022; v1 submitted 26 October, 2022; originally announced November 2022.

  28. arXiv:2209.15408  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Equivariant Energy-Guided SDE for Inverse Molecular Design

    Authors: Fan Bao, Min Zhao, Zhongkai Hao, Peiyao Li, Chongxuan Li, Jun Zhu

    Abstract: Inverse molecular design is critical in material science and drug discovery, where the generated molecules should satisfy certain desirable properties. In this paper, we propose equivariant energy-guided stochastic differential equations (EEGSDE), a flexible framework for controllable 3D molecule generation under the guidance of an energy function in diffusion models. Formally, we show that EEGSDE…

    Submitted 28 February, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

  29. arXiv:2209.13527  [pdf, ps, other]

    q-bio.BM cs.LG math.OC

    Molecular Design Based on Integer Programming and Quadratic Descriptors in a Two-layered Model

    Authors: Jianshen Zhu, Naveed Ahmed Azam, Shengjuan Cao, Ryota Ido, Kazuya Haraguchi, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu

    Abstract: A novel framework has recently been proposed for designing the molecular structure of chemical compounds with a desired chemical property, where design of novel drugs is an important topic in bioinformatics and chemo-informatics. The framework infers a desired chemical graph by solving a mixed integer linear program (MILP) that simulates the computation process of a feature function defined by a t…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2108.10266, arXiv:2107.02381, arXiv:2109.02628

  30. arXiv:2208.06348  [pdf, other]

    q-bio.NC cs.AI cs.CL cs.LG

    Can Brain Signals Reveal Inner Alignment with Human Languages?

    Authors: William Han, Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Douglas Weber, Bo Li, Ding Zhao

    Abstract: Brain Signals, such as Electroencephalography (EEG), and human languages have been widely explored independently for many downstream tasks; however, the connection between them has not been well explored. In this study, we explore the relationship and dependency between EEG and language. To study at the representation level, we introduced MTAM, a Multimodal Transformer …

    Submitted 4 May, 2024; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: EMNLP 2023 Findings

  31. arXiv:2206.09818  [pdf, other]

    q-bio.BM cs.AI cs.LG

    SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

    Authors: Qizhi Pei, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Haiguang Liu, Tie-Yan Liu, Rui Yan

    Abstract: Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep…

    Submitted 17 October, 2023; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted by Briefings in Bioinformatics 2023

  32. arXiv:2205.11016  [pdf, other]

    cs.CV q-bio.QM

    MolMiner: You only look once for chemical structure recognition

    Authors: Youjun Xu, Jinchuan Xiao, Chia-Han Chou, Jianhang Zhang, Jintao Zhu, Qiwan Hu, Hemin Li, Ningsheng Han, Bingyu Liu, Shuaipeng Zhang, Jinyu Han, Zhen Zhang, Shuhao Zhang, Weilin Zhang, Luhua Lai, Jianfeng Pei

    Abstract: Molecular structures are always depicted in 2D printed form in scientific documents like journal papers and patents. However, these 2D depictions are not machine-readable. Due to a backlog of decades and an increasing amount of this printed literature, there is a high demand for the translation of printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recog…

    Submitted 22 May, 2022; originally announced May 2022.

    Comments: 19 pages, 4 figures

  33. arXiv:2204.11840  [pdf, other]

    cs.LG cs.AI eess.SP q-bio.NC

    Dynamic Ensemble Bayesian Filter for Robust Control of a Human Brain-machine Interface

    Authors: Yu Qi, Xinyun Zhu, Kedi Xu, Feixiao Ren, Hongjie Jiang, Junming Zhu, Jianmin Zhang, Gang Pan, Yueming Wang

    Abstract: Objective: Brain-machine interfaces (BMIs) aim to provide direct brain control of devices such as prostheses and computer cursors, which have demonstrated great potential for mobility restoration. One major limitation of current BMIs lies in the unstable performance in online control due to the variability of neural signals, which seriously hinders the clinical availability of BMIs. Method: To dea…

    Submitted 22 April, 2022; originally announced April 2022.

  34. arXiv:2107.02381  [pdf, ps, other]

    cs.LG math.OC q-bio.BM

    An Inverse QSAR Method Based on Linear Regression and Integer Programming

    Authors: Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi, Liang Zhao, Hiroshi Nagamochi, Tatsuya Akutsu

    Abstract: Recently a novel framework has been proposed for designing the molecular structure of chemical compounds using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). In the framework, we first define a feature vector $f(C)$ of a chemical graph $C$ and construct an ANN that maps $x=f(C)$ to a predicted value $η(x)$ of a chemical property $π$ to $C$. After this, we formu…

    Submitted 23 August, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

  35. arXiv:2106.10234  [pdf, other]

    q-bio.QM cs.LG

    Dual-view Molecule Pre-training

    Authors: Jinhua Zhu, Yingce Xia, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

    Abstract: Inspired by its success in natural language processing and computer vision, pre-training has attracted substantial attention in cheminformatics and bioinformatics, especially for molecule based tasks. A molecule can be represented by either a graph (where atoms are connected by bonds) or a SMILES sequence (where depth-first-search is applied to the molecular graph with specific rules). Existing wo…

    Submitted 12 October, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: Add new results of retrosynthesis

  36. arXiv:2101.10643  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM

    Causal inference for observational longitudinal studies using deep survival models

    Authors: Jie Zhu, Blanca Gallego

    Abstract: Causal inference for observational longitudinal studies often requires the accurate estimation of treatment effects on time-to-event outcomes in the presence of time-dependent patient history and time-dependent covariates. To tackle this longitudinal treatment effect estimation problem, we have developed a time-variant causal survival (TCS) model that uses the potential outcomes framework with an…

    Submitted 8 June, 2022; v1 submitted 26 January, 2021; originally announced January 2021.

  37. arXiv:2011.01002  [pdf, other]

    q-bio.QM eess.IV q-bio.TO stat.AP

    RRScell method for automated single-cell profiling of multiplexed immunofluorescence cancer tissue

    Authors: Alvason Zhenhua Li, Karsten Eichholz, Anton Sholukh, Daniel Stone, Michelle A. Loprieno, Keith R. Jerome, Khamsone Phasouk, Kurt Diem, Jia Zhu, Lawrence Corey

    Abstract: Multiplexed immuno-fluorescence tissue imaging, allowing simultaneous detection of molecular properties of cells, is an essential tool for characterizing the complex cellular mechanisms in translational research and clinical practice. New image analysis approaches are needed because tissue sections stained with a mixture of protein, DNA and RNA biomarkers are introducing various complexities, inclu…

    Submitted 18 March, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

    Comments: 8 pages, 6 figures, markerUMAP cell clustering

  38. arXiv:2006.03226  [pdf]

    cs.NE cs.AI q-bio.NC

    Brain-inspired global-local learning incorporated with neuromorphic computing

    Authors: Yujie Wu, Rong Zhao, Jun Zhu, Feng Chen, Mingkun Xu, Guoqi Li, Sen Song, Lei Deng, Guanrui Wang, Hao Zheng, Jing Pei, Youhui Zhang, Mingguo Zhao, Luping Shi

    Abstract: Two main routes of learning methods exist at present including error-driven global learning and neuroscience-oriented local learning. Integrating them into one network may provide complementary learning capabilities for versatile learning scenarios. At the same time, neuromorphic computing holds great promise, but still needs plenty of useful algorithms and algorithm-hardware co-designs for exploi…

    Submitted 21 June, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: 5 figures, 6 tables

  39. arXiv:2004.02689  [pdf, other]

    q-bio.QM cs.IT eess.SP stat.ME stat.ML

    Noisy Pooled PCR for Virus Testing

    Authors: Junan Zhu, Kristina Rivera, Dror Baron

    Abstract: Fast testing can help mitigate the coronavirus disease 2019 (COVID-19) pandemic. Despite their accuracy for single sample analysis, infectious diseases diagnostic tools, like RT-PCR, require substantial resources to test large populations. We develop a scalable approach for determining the viral status of pooled patient samples. Our approach converts group testing to a linear inverse problem, wher…

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: 5 pages, 3 figures; we welcome new collaborators to reach out and help improve this work!

  40. arXiv:2002.09283  [pdf]

    cs.DL cs.LG q-bio.NC

    MODMA dataset: a Multi-modal Open Dataset for Mental-disorder Analysis

    Authors: Hanshu Cai, Yiwen Gao, Shuting Sun, Na Li, Fuze Tian, Han Xiao, Jianxiu Li, Zhengwu Yang, Xiaowei Li, Qinglin Zhao, Zhenyu Liu, Zhijun Yao, Minqiang Yang, Hong Peng, Jing Zhu, Xiaowei Zhang, Guoping Gao, Fang Zheng, Rui Li, Zhihua Guo, Rong Ma, Jing Yang, Lan Zhang, Xiping Hu, Yumin Li, et al. (1 additional author not shown)

    Abstract: According to the World Health Organization, the number of mental disorder patients, especially depression patients, has grown rapidly and become a leading contributor to the global burden of disease. However, the present common practice of depression diagnosis is based on interviews and clinical scales carried out by doctors, which is not only labor-consuming but also time-consuming. One important…

    Submitted 4 March, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Journal ref: Sci Data 9, 178 (2022)

  41. arXiv:1910.08877  [pdf, other]

    stat.ME q-bio.QM stat.ML

    Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis

    Authors: Jie Zhu, Blanca Gallego

    Abstract: The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individua…

    Submitted 22 October, 2019; v1 submitted 19 October, 2019; originally announced October 2019.

    Journal ref: j.jbi.2020.103474

  42. arXiv:1906.11196  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Seq-SetNet: Exploring Sequence Sets for Inferring Structures

    Authors: Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu

    Abstract: Sequence set is a widely-used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all of the existing approaches exploit MSAs in an indirect fashion, i.e., they transform MSAs into position-specific scoring matrices (PSSM) that represen…

    Submitted 6 June, 2019; originally announced June 2019.

  43. arXiv:1810.02037  [pdf, other]

    stat.ME q-bio.GN

    A statistical normalization method and differential expression analysis for RNA-seq data between different species

    Authors: Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang

    Abstract: Background: High-throughput techniques bring novel tools but also statistical challenges to genomic research. Identifying genes with differential expression between different species is an effective way to discover evolutionarily conserved transcriptional responses. To remove systematic variation between different species for a fair comparison, the normalization procedure serves as a crucial pre-p…

    Submitted 3 October, 2018; originally announced October 2018.

  44. arXiv:1809.09553  [pdf]

    physics.med-ph q-bio.QM

    Prediction of Coronary Heart Disease Using Routine Blood Tests

    Authors: Ning Meng, Peng Zhang, Junfeng Li, Jun He, Jin Zhu

    Abstract: Background: The objective of this study was to examine the association of routine blood test results with coronary heart disease (CHD) risk, to incorporate them into coronary prediction models and to compare the discrimination properties of this approach with other prediction functions. Methods and Results: This work was designed as a retrospective, single-center study of a hospital-based cohort…

    Submitted 11 September, 2018; originally announced September 2018.

  45. arXiv:1809.00083  [pdf, other]

    q-bio.BM cs.LG stat.ME

    Predicting protein inter-residue contacts using composite likelihood maximization and deep learning

    Authors: Haicang Zhang, Qi Zhang, Fusong Ju, Jianwei Zhu, Shiwei Sun, Yujuan Gao, Ziwei Xie, Minghua Deng, Shiwei Sun, Wei-Mou Zheng, Dongbo Bu

    Abstract: Accurate prediction of inter-residue contacts of a protein is important to calculating its tertiary structure. Analysis of co-evolutionary events among residues has been proved effective to inferring inter-residue contacts. The Markov random field (MRF) technique, although being widely used for contact prediction, suffers from the following dilemma: the actual likelihood function of MRF is acc…

    Submitted 31 August, 2018; originally announced September 2018.

  46. arXiv:1808.08662  [pdf, other]

    q-bio.PE

    Advances in Computational Methods for Phylogenetic Networks in the Presence of Hybridization

    Authors: R. A. L. Elworth, H. A. Ogilvie, J. Zhu, L. Nakhleh

    Abstract: Phylogenetic networks extend phylogenetic trees to allow for modeling reticulate evolutionary processes such as hybridization. They take the shape of a rooted, directed, acyclic graph, and when parameterized with evolutionary parameters, such as divergence times and population sizes, they form a generative process of molecular sequence evolution. Early work on computational methods for phylogeneti…

    Submitted 26 August, 2018; originally announced August 2018.

  47. arXiv:1805.03327  [pdf, other]

    q-bio.MN cs.LG cs.SI

    Network Enhancement: a general method to denoise weighted biological networks

    Authors: Bo Wang, Armin Pourshafeie, Marinka Zitnik, Junjie Zhu, Carlos D. Bustamante, Serafim Batzoglou, Jure Leskovec

    Abstract: Networks are ubiquitous in biology where they encode connectivity patterns at all scales of organization, from molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signal-to-noise rati…

    Submitted 1 June, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

    Journal ref: Nature Communications, 9:3108, 2018

  48. arXiv:1706.02609  [pdf, other]

    cs.NE q-bio.NC stat.ML

    Spatio-Temporal Backpropagation for Training High-performance Spiking Neural Networks

    Authors: Yujie Wu, Lei Deng, Guoqi Li, Jun Zhu, Luping Shi

    Abstract: Compared with artificial neural networks (ANNs), spiking neural networks (SNNs) are promising to explore the brain-like behaviors since the spikes could encode more spatio-temporal information. Although pre-training from ANN or direct training based on backpropagation (BP) makes the supervised training of SNNs possible, these methods only exploit the networks' spatial domain information which lead…

    Submitted 12 September, 2017; v1 submitted 8 June, 2017; originally announced June 2017.

    Journal ref: Frontiers in neuroscience, 2018, 12

  49. arXiv:1703.07844  [pdf, other]

    q-bio.GN cs.LG q-bio.QM

    SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning

    Authors: Bo Wang, Daniele Ramazzotti, Luca De Sano, Junjie Zhu, Emma Pierson, Serafim Batzoglou

    Abstract: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmar…

    Submitted 18 January, 2018; v1 submitted 21 March, 2017; originally announced March 2017.

  50. arXiv:1611.10252  [pdf, other]

    q-bio.NC cs.AI cs.LG

    SeDMiD for Confusion Detection: Uncovering Mind State from Time Series Brain Wave Data

    Authors: Jingkang Yang, Haohan Wang, Jun Zhu, Eric P. Xing

    Abstract: Understanding how the brain functions has been an intriguing topic for years. With the recent progress on collecting massive data and developing advanced technology, people have become interested in addressing the challenge of decoding brain wave data into meaningful mind states, with many machine learning models and algorithms being revisited and developed, especially the ones that handle time series…

    Submitted 29 November, 2016; originally announced November 2016.

    Comments: 11 pages, 2 figures, NIPS 2016 Time Series Workshop