Lomics: Generation of Pathways and Gene Sets using Large Language Models for Transcriptomic Analysis

Wong, Chun-Ka; Choo, Ali; Cheng, Eugene C. C.; San, Wing-Chun; Cheng, Kelvin Chak-Kong; Lau, Yee-Man; Lin, Minqing; Li, Fei; Liang, Wei-Hao; Liao, Song-Yan; Ng, Kwong-Man; Hung, Ivan Fan-Ngai; Tse, Hung-Fat; Wong, Jason Wing-Hon


arXiv:2407.09089 (q-bio)

[Submitted on 12 Jul 2024]



Abstract: Interrogation of biological pathways is an integral part of omics data analysis. Large language models (LLMs) enable the generation of custom pathways and gene sets tailored to specific scientific questions. These targeted sets are significantly smaller than traditional pathway enrichment analysis libraries, which reduces the multiple hypothesis testing burden and potentially enhances statistical power. Lomics (Large Language Models for Omics Studies) v1.0 is a Python-based bioinformatics toolkit that streamlines the generation of pathways and gene sets for transcriptomic analysis. It operates in three steps: 1) deriving relevant pathways based on the researcher's scientific question, 2) generating valid gene sets for each pathway, and 3) outputting the results as .GMX files. Lomics also provides explanations for pathway selections. Consistency and accuracy are ensured through iterative processes, JSON format validation, and HUGO Gene Nomenclature Committee (HGNC) gene symbol verification. Lomics serves as a foundation for integrating LLMs into omics research, potentially improving the specificity and efficiency of pathway analysis.
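The validation and output steps described in the abstract can be illustrated with a minimal sketch. This is not the Lomics implementation or its API: the function names `parse_gene_sets` and `to_gmx`, the tiny `APPROVED_SYMBOLS` set standing in for the HGNC symbol table, and the `"na"` description placeholder are all assumptions made for illustration. The sketch assumes the LLM reply is a JSON object mapping pathway names to gene lists, and writes the GMX layout used by GSEA (one column per gene set, with names in row 1 and descriptions in row 2).

```python
import json
from itertools import zip_longest

# Hypothetical stand-in for the HGNC-approved symbol table (assumption:
# a real run would load the full list published by HGNC).
APPROVED_SYMBOLS = {"TP53", "BRCA1", "MYC", "EGFR", "TNF", "IL6"}


def parse_gene_sets(llm_reply: str) -> dict[str, list[str]]:
    """Validate an LLM reply as JSON and keep only approved gene symbols."""
    data = json.loads(llm_reply)  # raises json.JSONDecodeError if malformed
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of pathway -> gene list")

    gene_sets = {}
    for pathway, genes in data.items():
        if not isinstance(genes, list) or not all(isinstance(g, str) for g in genes):
            raise ValueError(f"genes for {pathway!r} must be a list of strings")
        kept, seen = [], set()
        for gene in genes:
            symbol = gene.strip()
            # Drop unrecognised symbols and duplicates, preserving order.
            if symbol in APPROVED_SYMBOLS and symbol not in seen:
                seen.add(symbol)
                kept.append(symbol)
        gene_sets[pathway] = kept
    return gene_sets


def to_gmx(gene_sets: dict[str, list[str]], description: str = "na") -> str:
    """Serialise gene sets as tab-delimited GMX text, one column per set."""
    names = list(gene_sets)
    lines = ["\t".join(names), "\t".join([description] * len(names))]
    # Shorter columns are padded with empty cells so every row has one
    # field per gene set.
    for row in zip_longest(*(gene_sets[n] for n in names), fillvalue=""):
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


reply = '{"Apoptosis": ["TP53", "FAKE1", "TNF", "TP53"], "Inflammation": ["IL6"]}'
gene_sets = parse_gene_sets(reply)
gmx_text = to_gmx(gene_sets)
```

In this example the unrecognised symbol `FAKE1` and the repeated `TP53` are dropped before the file is written. A malformed reply raises an exception, which a caller could catch in order to re-prompt the model, in the spirit of the iterative processes mentioned in the abstract.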

Subjects:	Molecular Networks (q-bio.MN)
Cite as:	arXiv:2407.09089 [q-bio.MN]
	(or arXiv:2407.09089v1 [q-bio.MN] for this version)
	https://doi.org/10.48550/arXiv.2407.09089

Submission history

From: Chun Ka Wong
[v1] Fri, 12 Jul 2024 08:34:45 UTC (683 KB)
