Showing 1–5 of 5 results for author: Church, G

  1. arXiv:2409.17265  [pdf, other]

    cs.LG q-bio.QM

    CodonMPNN for Organism Specific and Codon Optimal Inverse Folding

    Authors: Hannes Stark, Umesh Padia, Julia Balla, Cameron Diao, George Church

    Abstract: Generating protein sequences conditioned on protein structures is an impactful technique for protein engineering. When synthesizing engineered proteins, they are commonly translated into DNA and expressed in an organism such as yeast. One difficulty in this process is that the expression rates can be low due to suboptimal codon sequences for expressing a protein in a host organism. We propose Codo…

    Submitted 25 September, 2024; originally announced September 2024.

    Comments: Appeared at the 2024 ICML AI4Science workshop

  2. arXiv:2409.04481  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Large Language Models in Drug Discovery and Development: From Disease Mechanisms to Clinical Trials

    Authors: Yizhen Zheng, Huan Yee Koh, Maddie Yang, Li Li, Lauren T. May, Geoffrey I. Webb, Shirui Pan, George Church

    Abstract: The integration of Large Language Models (LLMs) into the drug discovery and development field marks a significant paradigm shift, offering novel methodologies for understanding disease mechanisms, facilitating drug discovery, and optimizing clinical trial processes. This review highlights the expanding role of LLMs in revolutionizing various stages of the drug development pipeline. We investigate…

    Submitted 5 September, 2024; originally announced September 2024.

  3. arXiv:1712.03346  [pdf, other]

    q-bio.QM cs.LG

    Variational auto-encoding of protein sequences

    Authors: Sam Sinai, Eric Kelsic, George M. Church, Martin A. Nowak

    Abstract: Proteins are responsible for the most diverse set of functions in biology. The ability to extract information from protein sequences and to predict the effects of mutations is extremely valuable in many domains of biology and medicine. However the mapping between protein sequence and function is complex and poorly understood. Here we present an embedding of natural protein sequences using a Variat…

    Submitted 3 January, 2018; v1 submitted 9 December, 2017; originally announced December 2017.

    Comments: Abstract for oral presentation at NIPS 2017 Workshop on Machine Learning in Computational Biology

  4. arXiv:1502.07816  [pdf, other]

    q-bio.NC cs.CE cs.CV q-bio.QM

    Puzzle Imaging: Using Large-scale Dimensionality Reduction Algorithms for Localization

    Authors: Joshua I. Glaser, Bradley M. Zamft, George M. Church, Konrad P. Kording

    Abstract: Current high-resolution imaging techniques require an intact sample that preserves spatial relationships. We here present a novel approach, "puzzle imaging," that allows imaging a spatially scrambled sample. This technique takes many spatially disordered samples, and then pieces them back together using local properties embedded within the sample. We show that puzzle imaging can efficiently produc…

    Submitted 21 June, 2015; v1 submitted 26 February, 2015; originally announced February 2015.

  5. arXiv:cs/0101016  [pdf, ps, other]

    cs.CE cs.DS

    A Dynamic Programming Approach to De Novo Peptide Sequencing via Tandem Mass Spectrometry

    Authors: Ting Chen, Ming-Yang Kao, Matthew Tepel, John Rush, George M. Church

    Abstract: The tandem mass spectrometry fragments a large number of molecules of the same peptide sequence into charged prefix and suffix subsequences, and then measures mass/charge ratios of these ions. The de novo peptide sequencing problem is to reconstruct the peptide sequence from a given tandem mass spectral data of k ions. By implicitly transforming the spectral data into an NC-spectrum graph G=(V,E…

    Submitted 17 January, 2001; originally announced January 2001.

    Comments: A preliminary version appeared in Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 389--398, 2000

    ACM Class: F.2; J.3