Skip to main content

Showing 1–16 of 16 results for author: Smith, K E

Searching in archive cs. Search in all archives.
.
  1. arXiv:2506.14028  [pdf, ps, other

    cs.CL

    MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation

    Authors: Xueqing Peng, Lingfei Qian, Yan Wang, Ruoyu Xiang, Yueru He, Yang Ren, Mingyang Jiang, Jeff Zhao, Huan He, Yi Han, Yun Feng, Yuechen Jiang, Yupeng Cao, Haohang Li, Yangyang Yu, Xiaoyu Wang, Penglei Gao, Shengyuan Lin, Keyi Wang, Shanshan Yang, Yilun Zhao, Zhiwei Liu, Peng Lu, Jerry Huang, Suyuchen Wang , et al. (19 additional authors not shown)

    Abstract: Recent advances in large language models (LLMs) have accelerated progress in financial NLP and applications, yet existing benchmarks remain limited to monolingual and unimodal settings, often over-relying on simple tasks and failing to reflect the complexity of real-world financial communication. We introduce MultiFinBen, the first multilingual and multimodal benchmark tailored to the global finan… ▽ More

    Submitted 16 June, 2025; originally announced June 2025.

  2. arXiv:2502.11433  [pdf, other

    cs.AI cs.CE q-fin.TR

    FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading

    Authors: Guojun Xiong, Zhiyang Deng, Keyi Wang, Yupeng Cao, Haohang Li, Yangyang Yu, Xueqing Peng, Mingquan Lin, Kaleb E Smith, Xiao-Yang Liu, Jimin Huang, Sophia Ananiadou, Qianqian Xie

    Abstract: Large language models (LLMs) fine-tuned on multimodal financial data have demonstrated impressive reasoning capabilities in various financial tasks. However, they often struggle with multi-step, goal-oriented scenarios in interactive financial markets, such as trading, where complex agentic approaches are required to improve decision-making. To address this, we propose \textsc{FLAG-Trader}, a unif… ▽ More

    Submitted 18 February, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

  3. arXiv:2502.03298  [pdf, other

    cs.CL cs.AI cs.LG

    MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters

    Authors: Amin Dada, Osman Alperen Koras, Marie Bauer, Amanda Butler, Kaleb E. Smith, Jens Kleesiek, Julian Friedrich

    Abstract: While increasing patients' access to medical documents improves medical care, this benefit is limited by varying health literacy levels and complex medical terminology. Large language models (LLMs) offer solutions by simplifying medical information. However, evaluating LLMs for safe and patient-friendly text generation is difficult due to the lack of standardized evaluation resources. To fill this… ▽ More

    Submitted 5 February, 2025; originally announced February 2025.

  4. arXiv:2408.11878  [pdf, ps, other

    cs.CL cs.CE q-fin.CP

    Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

    Authors: Jimin Huang, Mengxi Xiao, Dong Li, Zihao Jiang, Yuzhe Yang, Yifei Zhang, Lingfei Qian, Yan Wang, Xueqing Peng, Yang Ren, Ruoyu Xiang, Zhengyu Chen, Xiao Zhang, Yueru He, Weiguang Han, Shunian Chen, Lihang Shen, Daniel Kim, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram , et al. (19 additional authors not shown)

    Abstract: Financial LLMs hold promise for advancing financial tasks and domain-specific applications. However, they are limited by scarce corpora, weak multimodal capabilities, and narrow evaluations, making them less suited for real-world application. To address this, we introduce \textit{Open-FinLLMs}, the first open-source multimodal financial LLMs designed to handle diverse tasks across text, tabular, t… ▽ More

    Submitted 6 June, 2025; v1 submitted 20 August, 2024; originally announced August 2024.

    Comments: 33 pages, 13 figures

  5. arXiv:2404.05694  [pdf, other

    cs.CL cs.AI cs.LG

    Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding

    Authors: Ahmad Idrissi-Yaghir, Amin Dada, Henning Schäfer, Kamyar Arzideh, Giulia Baldini, Jan Trienes, Max Hasin, Jeanette Bewersdorff, Cynthia S. Schmidt, Marie Bauer, Kaleb E. Smith, Jiang Bian, Yonghui Wu, Jörg Schlötterer, Torsten Zesch, Peter A. Horn, Christin Seifert, Felix Nensa, Jens Kleesiek, Christoph M. Friedrich

    Abstract: Recent advances in natural language processing (NLP) can be largely attributed to the advent of pre-trained language models such as BERT and RoBERTa. While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are commo… ▽ More

    Submitted 8 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024

  6. arXiv:2404.04067  [pdf, other

    cs.CL cs.AI cs.LG

    Does Biomedical Training Lead to Better Medical Performance?

    Authors: Amin Dada, Marie Bauer, Amanda Butler Contreras, Osman Alperen Koraş, Constantin Marc Seibold, Kaleb E Smith, Jens Kleesiek

    Abstract: Large Language Models (LLMs) are expected to significantly contribute to patient care, diagnostics, and administrative processes. Emerging biomedical LLMs aim to address healthcare-specific challenges, including privacy demands and computational constraints. Assessing the models' suitability for this sensitive application area is of the utmost importance. However, biomedical training has not been… ▽ More

    Submitted 17 September, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  7. arXiv:2403.12374  [pdf

    cs.CL

    Improving Generalizability of Extracting Social Determinants of Health Using Large Language Models through Prompt-tuning

    Authors: Cheng Peng, Zehao Yu, Kaleb E Smith, Wei-Hsuan Lo-Ciganic, Jiang Bian, Yonghui Wu

    Abstract: The progress in natural language processing (NLP) using large language models (LLMs) has greatly improved patient information extraction from clinical narratives. However, most methods based on the fine-tuning strategy have limited transfer learning ability for cross-domain applications. This study proposed a novel approach that employs a soft prompt-based learning architecture, which introduces t… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  8. arXiv:2312.06099  [pdf

    cs.CL

    Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need

    Authors: Cheng Peng, Xi Yang, Aokun Chen, Zehao Yu, Kaleb E Smith, Anthony B Costa, Mona G Flores, Jiang Bian, Yonghui Wu

    Abstract: Objective To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 b… ▽ More

    Submitted 10 December, 2023; originally announced December 2023.

  9. arXiv:2310.07321  [pdf, other

    cs.CL cs.AI cs.LG

    On the Impact of Cross-Domain Data on German Language Models

    Authors: Amin Dada, Aokun Chen, Cheng Peng, Kaleb E Smith, Ahmad Idrissi-Yaghir, Constantin Marc Seibold, Jianning Li, Lars Heiliger, Xi Yang, Christoph M. Friedrich, Daniel Truhn, Jan Egger, Jiang Bian, Jens Kleesiek, Yonghui Wu

    Abstract: Traditionally, large language models have been either trained on general web crawls or domain-specific data. However, recent successes of generative large language models, have shed light on the benefits of cross-domain datasets. To examine the significance of prioritizing data diversity over quality, we present a German dataset comprising texts from five domains, along with another dataset aimed… ▽ More

    Submitted 13 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 13 pages, 1 figure, accepted at Findings of the Association for Computational Linguistics: EMNLP 2023

  10. Model Tuning or Prompt Tuning? A Study of Large Language Models for Clinical Concept and Relation Extraction

    Authors: Cheng Peng, Xi Yang, Kaleb E Smith, Zehao Yu, Aokun Chen, Jiang Bian, Yonghui Wu

    Abstract: Objective To develop soft prompt-based learning algorithms for large language models (LLMs), examine the shape of prompts, prompt-tuning using frozen/unfrozen LLMs, transfer learning, and few-shot learning abilities. Methods We developed a soft prompt-based LLM model and compared 4 training strategies including (1) fine-tuning without prompts; (2) hard-prompt with unfrozen LLMs; (3) soft-prompt wi… ▽ More

    Submitted 9 October, 2023; originally announced October 2023.

    Journal ref: Journal of Biomedical Informatics. Volume 153, May 2024, 104630

  11. A Study of Generative Large Language Model for Medical Research and Healthcare

    Authors: Cheng Peng, Xi Yang, Aokun Chen, Kaleb E Smith, Nima PourNejatian, Anthony B Costa, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A Mitchell, Naykky S Ospina, Mustafa M Ahmed, William R Hogan, Elizabeth A Shenkman, Yi Guo, Jiang Bian, Yonghui Wu

    Abstract: There is enormous enthusiasm and concerns in using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language proc… ▽ More

    Submitted 22 May, 2023; originally announced May 2023.

  12. arXiv:2211.07867  [pdf, other

    cs.LG eess.SP q-bio.NC

    Machine Learning Methods Applied to Cortico-Cortical Evoked Potentials Aid in Localizing Seizure Onset Zones

    Authors: Ian G. Malone, Kaleb E. Smith, Morgan E. Urdaneta, Tyler S. Davis, Daria Nesterovich Anderson, Brian J. Phillip, John D. Rolston, Christopher R. Butson

    Abstract: Epilepsy affects millions of people, reducing quality of life and increasing risk of premature death. One-third of epilepsy cases are drug-resistant and require surgery for treatment, which necessitates localizing the seizure onset zone (SOZ) in the brain. Attempts have been made to use cortico-cortical evoked potentials (CCEPs) to improve SOZ localization but none have been successful enough for… ▽ More

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 6 pages

  13. arXiv:2203.03540  [pdf

    cs.CL cs.AI cs.LG

    GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records

    Authors: Xi Yang, Aokun Chen, Nima PourNejatian, Hoo Chang Shin, Kaleb E Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc, Christopher A Harle, Gloria Lipori, Duane A Mitchell, William R Hogan, Elizabeth A Shenkman, Jiang Bian, Yonghui Wu

    Abstract: There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is compar… ▽ More

    Submitted 16 December, 2022; v1 submitted 2 February, 2022; originally announced March 2022.

    Comments: 24 pages, 2 figures, 3 tables

  14. arXiv:2111.04916  [pdf

    cs.SE cs.AI

    Building an AI-ready RSE Workforce

    Authors: Ying Zhang, Matthew A. Gitzendanner, Dan S. Maxwell, Justin W. Richardson, Kaleb E. Smith, Eric A. Stubbs, Brian J. Stucky, Jingchao Zhang, Erik Deumens

    Abstract: Artificial Intelligence has been transforming industries and academic research across the globe, and research software development is no exception. Machine learning and deep learning are being applied in every aspect of the research software development lifecycles, from new algorithm design paradigms to software development processes. In this paper, we discuss our views on today's challenges and o… ▽ More

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: 3 pages. Research Software Engineers in HPC Workshop (RSE-HPC-2021) at SC21

  15. arXiv:2103.01904  [pdf, other

    cs.LG stat.ML

    A Spectral Enabled GAN for Time Series Data Generation

    Authors: Kaleb E. Smith, Anthony O. Smith

    Abstract: Time dependent data is a main source of information in today's data driven world. Generating this type of data though has shown its challenges and made it an interesting research area in the field of generative machine learning. One such approach was that by Smith et al. who developed Time Series Generative Adversarial Network (TSGAN) which showed promising performance in generating time dependent… ▽ More

    Submitted 2 March, 2021; originally announced March 2021.

  16. arXiv:2006.16477  [pdf, other

    cs.LG stat.ML

    Conditional GAN for timeseries generation

    Authors: Kaleb E Smith, Anthony O Smith

    Abstract: It is abundantly clear that time dependent data is a vital source of information in the world. The challenge has been for applications in machine learning to gain access to a considerable amount of quality data needed for algorithm development and analysis. Modeling synthetic data using a Generative Adversarial Network (GAN) has been at the heart of providing a viable solution. Our work focuses on… ▽ More

    Submitted 29 June, 2020; originally announced June 2020.