Skip to main content

Showing 1–2 of 2 results for author: Gonzalez-Mallo, M

.
  1. arXiv:2505.04388  [pdf, ps, other

    cs.CL cs.AI

    The Aloe Family Recipe for Open and Specialized Healthcare LLMs

    Authors: Dario Garcia-Gasulla, Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés

    Abstract: Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which in… ▽ More

    Submitted 28 May, 2025; v1 submitted 7 May, 2025; originally announced May 2025.

    Comments: Follow-up work from arXiv:2405.01886

  2. arXiv:2405.01886  [pdf, other

    cs.CL cs.AI

    Aloe: A Family of Fine-tuned Open Healthcare LLMs

    Authors: Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Jordi Bayarri-Planas, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Lucia Urcelay-Ganzabal, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés Dario Garcia-Gasulla

    Abstract: As the capabilities of Large Language Models (LLMs) in healthcare and medicine continue to advance, there is a growing need for competitive open-source models that can safeguard public interest. With the increasing availability of highly competitive open base models, the impact of continued pre-training is increasingly uncertain. In this work, we explore the role of instruct tuning, model merging,… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Five appendix