Showing 1–1 of 1 results for author: Nelson, D R

  1. arXiv:2411.06798

    q-bio.GN cs.AI cs.CL q-bio.QM

    LA4SR: illuminating the dark proteome with generative AI

    Authors: David R. Nelson, Ashish Kumar Jaiswal, Noha Ismail, Alexandra Mystikou, Kourosh Salehi-Ashtiani

    Abstract: AI language models (LMs) show promise for biological sequence analysis. We re-engineered open-source LMs (GPT-2, BLOOM, DistilRoBERTa, ELECTRA, and Mamba, ranging from 70M to 12B parameters) for microbial sequence classification. The models achieved F1 scores up to 95 and operated 16,580x faster and at 2.9x the recall of BLASTP. They effectively classified the algal dark proteome - uncharacterized…

    Submitted 11 December, 2024; v1 submitted 11 November, 2024; originally announced November 2024.