
Showing 1–12 of 12 results for author: Pimentel, M

Searching in archive cs.
  1. arXiv:2503.16565  [pdf, other]

    cs.LG cs.AI cs.CL q-bio.GN

    Gene42: Long-Range Genomic Foundation Model With Dense Attention

    Authors: Kirill Vishniakov, Boulbaba Ben Amor, Engin Tekin, Nancy A. ElNaker, Karthik Viswanathan, Aleksandr Medvedev, Aahan Singh, Maryam Nadeem, Mohammad Amaan Sayeed, Praveenkumar Kanithi, Tiago Magalhaes, Natalia Vassilieva, Dwarikanath Mahapatra, Marco Pimentel, and Shadab Khan

    Abstract: We introduce Gene42, a novel family of Genomic Foundation Models (GFMs) designed to manage context lengths of up to 192,000 base pairs (bp) at a single-nucleotide resolution. Gene42 models utilize a decoder-only (LLaMA-style) architecture with a dense self-attention mechanism. Initially trained on fixed-length sequences of 4,096 bp, our models underwent continuous pretraining to extend the context…

    Submitted 20 March, 2025; originally announced March 2025.

  2. arXiv:2501.09825  [pdf, ps, other]

    cs.CL cs.AI

    Bridging Language Barriers in Healthcare: A Study on Arabic LLMs

    Authors: Nada Saadi, Tathagata Raha, Clément Christophe, Marco AF Pimentel, Ronnie Rajan, Praveen K Kanithi

    Abstract: This paper investigates the challenges of developing large language models (LLMs) proficient in both multilingual understanding and medical knowledge. We demonstrate that simply translating medical data does not guarantee strong performance on clinical tasks in the target language. Our experiments reveal that the optimal language mix in training data varies significantly across different medical t…

    Submitted 16 January, 2025; originally announced January 2025.

  3. arXiv:2410.05046  [pdf, other]

    cs.CL cs.AI

    Named Clinical Entity Recognition Benchmark

    Authors: Wadood M Abdul, Marco AF Pimentel, Muhammad Umar Salman, Tathagata Raha, Clément Christophe, Praveen K Kanithi, Nasir Hayat, Ronnie Rajan, Shadab Khan

    Abstract: This technical report introduces a Named Clinical Entity Recognition Benchmark for evaluating language models in healthcare, addressing the crucial natural language processing (NLP) task of extracting structured information from clinical narratives to support applications like automated coding, clinical trial cohort identification, and clinical decision support. The leaderboard provides a standa…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Technical Report

  4. arXiv:2409.14988  [pdf, other]

    cs.CL

    Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

    Authors: Clément Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveen K Kanithi, Marco AF Pimentel, Shadab Khan

    Abstract: Large Language Models (LLMs) have demonstrated significant potential in transforming clinical applications. In this study, we investigate the efficacy of four techniques in adapting LLMs for clinical use-cases: continuous pretraining, instruct fine-tuning, NEFTune, and prompt engineering. We employ these methods on Mistral 7B and Mixtral 8x7B models, leveraging a large-scale clinical pretraining d…

    Submitted 23 September, 2024; originally announced September 2024.

  5. arXiv:2409.07314  [pdf, other]

    cs.CL cs.AI

    MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications

    Authors: Praveen K Kanithi, Clément Christophe, Marco AF Pimentel, Tathagata Raha, Nada Saadi, Hamza Javed, Svetlana Maslenkova, Nasir Hayat, Ronnie Rajan, Shadab Khan

    Abstract: The rapid development of Large Language Models (LLMs) for healthcare applications has spurred calls for holistic evaluation beyond frequently-cited benchmarks like USMLE, to better reflect real-world performance. While real-world assessments are valuable indicators of utility, they often lag behind the pace of LLM evolution, likely rendering findings obsolete upon deployment. This temporal disconn…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: Technical report

  6. arXiv:2408.06142  [pdf, ps, other]

    cs.CL cs.AI

    Med42-v2: A Suite of Clinical LLMs

    Authors: Clément Christophe, Praveen K Kanithi, Tathagata Raha, Shadab Khan, Marco AF Pimentel

    Abstract: Med42-v2 introduces a suite of clinical large language models (LLMs) designed to address the limitations of generic models in healthcare settings. These models are built on Llama3 architecture and fine-tuned using specialized clinical data. They underwent multi-stage preference alignment to effectively respond to natural prompts. While generic models are often preference-aligned to avoid answering…

    Submitted 12 August, 2024; originally announced August 2024.

  7. arXiv:2407.21072  [pdf, other]

    cs.AI cs.CL

    Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks

    Authors: Marco AF Pimentel, Clément Christophe, Tathagata Raha, Prateek Munjal, Praveen K Kanithi, Shadab Khan

    Abstract: As large language models (LLMs) continue to evolve, the need for robust and standardized evaluation benchmarks becomes paramount. Evaluating the performance of these models is a complex challenge that requires careful consideration of various linguistic tasks, model architectures, and benchmarking methodologies. In recent years, various frameworks have emerged as noteworthy contributions to the fi…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 15 pages, 3 figures

  8. arXiv:2404.14779  [pdf, other]

    cs.CL

    Med42 -- Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches

    Authors: Clément Christophe, Praveen K Kanithi, Prateek Munjal, Tathagata Raha, Nasir Hayat, Ronnie Rajan, Ahmed Al-Mahrooqi, Avani Gupta, Muhammad Umar Salman, Gurpreet Gosal, Bhargav Kanakiya, Charles Chen, Natalia Vassilieva, Boulbaba Ben Amor, Marco AF Pimentel, Shadab Khan

    Abstract: This study presents a comprehensive analysis and comparison of two predominant fine-tuning methodologies - full-parameter fine-tuning and parameter-efficient tuning - within the context of medical Large Language Models (LLMs). We developed and refined a series of LLMs, based on the Llama-2 architecture, specifically designed to enhance medical knowledge retrieval, reasoning, and question-answering…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Published at AAAI 2024 Spring Symposium - Clinical Foundation Models

  9. A Data-Driven Biophysical Computational Model of Parkinson's Disease based on Marmoset Monkeys

    Authors: Caetano M. Ranieri, Jhielson M. Pimentel, Marcelo R. Romano, Leonardo A. Elias, Roseli A. F. Romero, Michael A. Lones, Mariana F. P. Araujo, Patricia A. Vargas, Renan C. Moioli

    Abstract: In this work we propose a new biophysical computational model of brain regions relevant to Parkinson's Disease based on local field potential data collected from the brain of marmoset monkeys. Parkinson's disease is a neurodegenerative disorder, linked to the death of dopaminergic neurons at the substantia nigra pars compacta, which affects the normal dynamics of the basal ganglia-thalamus-cortex…

    Submitted 1 September, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Journal ref: IEEE Access, 2021

  10. arXiv:2005.13228  [pdf, ps, other]

    cs.GT econ.TH

    Oligopoly Dynamics

    Authors: Bernardo Melo Pimentel

    Abstract: The present notes summarise the oligopoly dynamics lectures professor Luís Cabral gave at the Bank of Portugal in September and October 2017. The lectures discuss a set of industrial organisation problems in a dynamic environment, namely learning by doing, switching costs, price wars, networks and platforms, and ladder models of innovation. Methodologically, the materials cover analytical solutions o…

    Submitted 27 May, 2020; originally announced May 2020.

  11. Remote health monitoring and diagnosis in the time of COVID-19

    Authors: Joachim A. Behar, Chengyu Liu, Kevin Kotzen, Kenta Tsutsui, Valentina D. A. Corino, Janmajay Singh, Marco A. F. Pimentel, Philip Warrick, Sebastian Zaunseder, Fernando Andreotti, David Sebag, Georgy Popanitsa, Patrick E. McSharry, Walter Karlen, Chandan Karmakar, Gari D. Clifford

    Abstract: Coronavirus disease (COVID-19) is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that is rapidly spreading across the globe. The clinical spectrum of SARS-CoV-2 pneumonia ranges from mild to critically ill cases and requires early detection and monitoring, within a clinical environment for critical cases and remotely for mild cases. The fear of contamination in clinical…

    Submitted 15 October, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: 36 pages

  12. arXiv:1802.00030  [pdf, other]

    cs.LG stat.ML

    Fusarium Damaged Kernels Detection Using Transfer Learning on Deep Neural Network Architecture

    Authors: Márcio Nicolau, Márcia Barrocas Moreira Pimentel, Casiane Salete Tibola, José Mauricio Cunha Fernandes, Willingthon Pavan

    Abstract: The present work shows the application of transfer learning for a pre-trained deep neural network (DNN), using a small image dataset ($\approx$ 12,000) on a single workstation with enabled NVIDIA GPU card that takes up to 1 hour to complete the training task and achieve an overall average accuracy of $94.7\%$. The DNN presents a $20\%$ score of misclassification for an external test dataset. The a…

    Submitted 31 January, 2018; originally announced February 2018.