Showing 1–9 of 9 results for author: Xin, J

Searching in archive q-bio.
  1. arXiv:2404.11068  [pdf, other]

    cs.LG cs.AI cs.DC q-bio.QM

    ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours

    Authors: Feiwen Zhu, Arkadiusz Nowaczynski, Rundong Li, Jie Xin, Yifei Song, Michal Marcinkiewicz, Sukru Burc Eryilmaz, Jun Yang, Michael Andersch

    Abstract: AlphaFold2 has been hailed as a breakthrough in protein folding. It can rapidly predict protein structures with lab-grade accuracy. However, its implementation does not include the necessary training code. OpenFold is the first trainable public reimplementation of AlphaFold. AlphaFold training procedure is prohibitively time-consuming, and gets diminishing benefits from scaling to more compute res…

    Submitted 17 April, 2024; originally announced April 2024.

  2. arXiv:2307.12682  [pdf]

    q-bio.BM

    Pro-PRIME: A general Temperature-Guided Language model to engineer enhanced Stability and Activity in Proteins

    Authors: Fan Jiang, Mingchen Li, Jiajun Dong, Yuanxi Yu, Xinyu Sun, Banghao Wu, Jin Huang, Liqi Kang, Yufeng Pei, Liang Zhang, Shaojie Wang, Wenxue Xu, Jingyao Xin, Wanli Ouyang, Guisheng Fan, Lirong Zheng, Yang Tan, Zhiqiang Hu, Yi Xiong, Yan Feng, Guangyu Yang, Qian Liu, Jie Song, Jia Liu, Liang Hong, et al. (1 additional author not shown)

    Abstract: Designing protein mutants of both high stability and activity is a critical yet challenging task in protein engineering. Here, we introduce PRIME, a deep learning model, which can suggest protein mutants of improved stability and activity without any prior experimental mutagenesis data of the specified protein. Leveraging temperature-aware language modeling, PRIME demonstrated superior predictive…

    Submitted 27 October, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2304.03780

  3. arXiv:2307.08240  [pdf, other]

    q-bio.QM q-bio.BM

    MIST-CF: Chemical formula inference from tandem mass spectra

    Authors: Samuel Goldman, Jiayi Xin, Joules Provenzano, Connor W. Coley

    Abstract: Chemical formula annotation for tandem mass spectrometry (MS/MS) data is the first step toward structurally elucidating unknown metabolites. While great strides have been made toward solving this problem, the current state-of-the-art method depends on time-intensive, proprietary, and expert-parameterized fragmentation tree construction and scoring. In this work we extend our previous spectrum Tran…

    Submitted 17 July, 2023; originally announced July 2023.

  4. arXiv:2304.09344  [pdf]

    cs.DB q-bio.QM

    BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs

    Authors: Jackson Callaghan, Colleen H. Xu, Jiwen Xin, Marco Alvarado Cano, Anders Riutta, Eric Zhou, Rohan Juneja, Yao Yao, Madhumita Narayan, Kristina Hanspers, Ayushi Agrawal, Alexander R. Pico, Chunlei Wu, Andrew I. Su

    Abstract: Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of dr…

    Submitted 18 April, 2023; originally announced April 2023.

  5. arXiv:2303.06470  [pdf, other]

    q-bio.QM cs.LG

    Prefix-Tree Decoding for Predicting Mass Spectra from Molecules

    Authors: Samuel Goldman, John Bradshaw, Jiayi Xin, Connor W. Coley

    Abstract: Computational predictions of mass spectra from molecules have enabled the discovery of clinically relevant metabolites. However, such predictive tools are still limited as they occupy one of two extremes, either operating (a) by fragmenting molecules combinatorially with overly rigid constraints on potential rearrangements and poor time complexity or (b) by decoding lossy and nonphysical discretiz…

    Submitted 3 December, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

  6. arXiv:2302.12563  [pdf, other]

    q-bio.BM cs.LG

    Retrieved Sequence Augmentation for Protein Representation Learning

    Authors: Chang Ma, Haiteng Zhao, Lin Zheng, Jiayi Xin, Qintong Li, Lijun Wu, Zhihong Deng, Yang Lu, Qi Liu, Lingpeng Kong

    Abstract: Protein language models have excelled in a variety of tasks, ranging from structure prediction to protein engineering. However, proteins are highly diverse in functions and structures, and current state-of-the-art models including the latest version of AlphaFold rely on Multiple Sequence Alignments (MSA) to feed in the evolutionary knowledge. Despite their success, heavy computational overheads, a…

    Submitted 24 February, 2023; originally announced February 2023.

  7. arXiv:2201.09394  [pdf, other]

    cs.LG math.NA q-bio.PE q-bio.QM

    An integrated recurrent neural network and regression model with spatial and climatic couplings for vector-borne disease dynamics

    Authors: Zhijian Li, Jack Xin, Guofa Zhou

    Abstract: We developed an integrated recurrent neural network and nonlinear regression spatio-temporal model for vector-borne disease evolution. We take into account climate data and seasonality as external factors that correlate with disease transmitting insects (e.g. flies), also spill-over infections from neighboring regions surrounding a region of interest. The climate data is encoded to the model throu…

    Submitted 23 January, 2022; originally announced January 2022.

  8. arXiv:2007.10929  [pdf, other]

    q-bio.PE cs.LG stat.AP stat.ML

    A Recurrent Neural Network and Differential Equation Based Spatiotemporal Infectious Disease Model with Application to COVID-19

    Authors: Zhijian Li, Yunling Zheng, Jack Xin, Guofa Zhou

    Abstract: The outbreaks of Coronavirus Disease 2019 (COVID-19) have impacted the world significantly. Modeling the trend of infection and real-time forecasting of cases can help decision making and control of the disease spread. However, data-driven methods such as recurrent neural networks (RNN) can perform poorly due to limited daily samples in time. In this work, we develop an integrated spatiotemporal m…

    Submitted 17 September, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

  9. arXiv:q-bio/0411032  [pdf, ps, other]

    q-bio.QM q-bio.TO

    Signal processing of acoustic signals in the time domain with an active nonlinear nonlocal cochlear model

    Authors: M. Drew LaMar, J. Xin, Y. Qi

    Abstract: A two space dimensional active nonlinear nonlocal cochlear model is formulated in the time domain to capture nonlinear hearing effects such as compression, multi-tone suppression and difference tones. The micromechanics of the basilar membrane (BM) are incorporated to model active cochlear properties. An active gain parameter is constructed in the form of a nonlinear nonlocal functional of BM disp…

    Submitted 6 July, 2010; v1 submitted 15 November, 2004; originally announced November 2004.

    Comments: 19 pages, 9 figures