Showing 1–15 of 15 results for author: Nooralahzadeh, F

  1. arXiv:2503.08323  [pdf, other]

    cs.CL

    Towards Scalable and Cross-Lingual Specialist Language Models for Oncology

    Authors: Morteza Rohanian, Tarun Mehra, Nicola Miglino, Farhad Nooralahzadeh, Michael Krauthammer, Andreas Wicki

    Abstract: Clinical oncology generates vast, unstructured data that often contain inconsistencies, missing information, and ambiguities, making it difficult to extract reliable insights for data-driven decision-making. General-purpose large language models (LLMs) struggle with these challenges due to their lack of domain-specific reasoning, including specialized clinical terminology, context-dependent interp…

    Submitted 11 March, 2025; originally announced March 2025.

  2. arXiv:2502.18285  [pdf, other]

    cs.CL

    Uncertainty Modeling in Multimodal Speech Analysis Across the Psychosis Spectrum

    Authors: Morteza Rohanian, Roya M. Hüppi, Farhad Nooralahzadeh, Noemi Dannecker, Yves Pauli, Werner Surbeck, Iris Sommer, Wolfram Hinzen, Nicolas Langer, Michael Krauthammer, Philipp Homan

    Abstract: Capturing subtle speech disruptions across the psychosis spectrum is challenging because of the inherent variability in speech patterns. This variability reflects individual differences and the fluctuating nature of symptoms in both clinical and non-clinical populations. Accounting for uncertainty in speech data is essential for predicting symptom severity and improving diagnostic precision. Speec…

    Submitted 25 February, 2025; originally announced February 2025.

  3. arXiv:2502.03333  [pdf, other]

    cs.CV cs.AI

    RadVLM: A Multitask Conversational Vision-Language Model for Radiology

    Authors: Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruipérez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas M. Sutter, Julia E. Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Blüthgen, Farhad Nooralahzadeh, Michael Krauthammer

    Abstract: The widespread use of chest X-rays (CXRs), coupled with a shortage of radiologists, has driven growing interest in automated CXR analysis and AI-assisted reporting. While existing vision-language models (VLMs) show promise in specific tasks such as report generation or abnormality detection, they often lack support for interactive diagnostic capabilities. In this work we present RadVLM, a compact,…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: 21 pages, 15 figures

  4. arXiv:2412.18428  [pdf, other]

    cs.AI cs.CL

    Explainable Multi-Modal Data Exploration in Natural Language via LLM Agent

    Authors: Farhad Nooralahzadeh, Yi Zhang, Jonathan Fürst, Kurt Stockinger

    Abstract: International enterprises, organizations, or hospitals collect large amounts of multi-modal data stored in databases, text documents, images, and videos. While there has been recent progress in the separate fields of multi-modal data exploration as well as in database systems that automatically translate natural language questions to database query languages, the research challenge of querying dat…

    Submitted 24 December, 2024; originally announced December 2024.

  5. arXiv:2406.03170  [pdf, other]

    cs.CL

    StatBot.Swiss: Bilingual Open Data Exploration in Natural Language

    Authors: Farhad Nooralahzadeh, Yi Zhang, Ellery Smith, Sabine Maennel, Cyril Matthey-Doret, Raphaël de Fondville, Kurt Stockinger

    Abstract: The potential for improvements brought by Large Language Models (LLMs) in Text-to-SQL systems is mostly assessed on monolingual English datasets. However, LLMs' performance for other languages remains vastly unexplored. In this work, we release the StatBot.Swiss dataset, the first bilingual benchmark for evaluating Text-to-SQL systems based on real-world applications. The StatBot.Swiss dataset con…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: This work is accepted at ACL Findings 2024

  6. arXiv:2402.08349  [pdf, other]

    cs.DB cs.AI cs.CL

    Evaluating the Data Model Robustness of Text-to-SQL Systems Based on Real User Queries

    Authors: Jonathan Fürst, Catherine Kosten, Farhad Nooralahzadeh, Yi Zhang, Kurt Stockinger

    Abstract: Text-to-SQL systems (also known as NL-to-SQL systems) have become an increasingly popular solution for bridging the gap between user capabilities and SQL-based data access. These systems translate user requests in natural language to valid SQL statements for a specific database. Recent Text-to-SQL systems have benefited from the rapid improvement of transformer-based language models. However, whil…

    Submitted 29 November, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  7. arXiv:2311.16764  [pdf, other]

    cs.CL

    Radiology-Aware Model-Based Evaluation Metric for Report Generation

    Authors: Amos Calamida, Farhad Nooralahzadeh, Morteza Rohanian, Koji Fujimoto, Mizuho Nishio, Michael Krauthammer

    Abstract: We propose a new automated evaluation metric for machine-generated radiology reports using the successful COMET architecture adapted for the radiology domain. We train and publish four medically-oriented model checkpoints, including one trained on RadGraph, a radiology knowledge graph. Our results show that our metric correlates moderately to high with established metrics such as BERTscore, BLEU,…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 9 pages

  8. arXiv:2305.04561  [pdf, other]

    cs.CL

    Boosting Radiology Report Generation by Infusing Comparison Prior

    Authors: Sanghwan Kim, Farhad Nooralahzadeh, Morteza Rohanian, Koji Fujimoto, Mizuho Nishio, Ryo Sakamoto, Fabio Rinaldi, Michael Krauthammer

    Abstract: Recent transformer-based models have made significant strides in generating radiology reports from chest X-ray images. However, a prominent challenge remains: these models often lack prior knowledge, resulting in the generation of synthetic reports that mistakenly reference non-existent prior exams. This discrepancy can be attributed to a knowledge gap between radiologists and the generation model…

    Submitted 5 June, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023, BioNLP Workshop

  9. arXiv:2302.04208  [pdf, other]

    cs.LG stat.ML

    Exploratory Analysis of Federated Learning Methods with Differential Privacy on MIMIC-III

    Authors: Aron N. Horvath, Matteo Berchier, Farhad Nooralahzadeh, Ahmed Allam, Michael Krauthammer

    Abstract: Background: Federated learning methods offer the possibility of training machine learning models on privacy-sensitive data sets, which cannot be easily shared. Multiple regulations pose strict requirements on the storage and usage of healthcare data, leading to data being in silos (i.e. locked-in at healthcare facilities). The application of federated algorithms on these datasets could accelerate…

    Submitted 8 February, 2023; originally announced February 2023.

  10. arXiv:2209.02982  [pdf, other]

    cs.CL

    Improving the Cross-Lingual Generalisation in Visual Question Answering

    Authors: Farhad Nooralahzadeh, Rico Sennrich

    Abstract: While several benefits were realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages showed poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross-lingual transfer. In this work, we explore the po…

    Submitted 30 November, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: This work is accepted by the AAAI 2023

  11. arXiv:2102.09777  [pdf, other]

    cs.CL

    Progressive Transformer-Based Generation of Radiology Reports

    Authors: Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer

    Abstract: Inspired by Curriculum Learning, we propose a consecutive (i.e., image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them into finer and coherent texts using a transformer ar…

    Submitted 31 August, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

    Comments: Accepted to findings of EMNLP 2021

  12. arXiv:2011.04372  [pdf, other]

    cs.CL

    Low-Resource Adaptation of Neural NLP Models

    Authors: Farhad Nooralahzadeh

    Abstract: Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data. These resources are often based on language data available in large quantities, such as English newswire. However, in real-world applications of NLP, the textual resources vary across several dimensions, such as language,…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Thesis submitted for the degree of Philosophiae Doctor. Department of Informatics, University of Oslo. https://www.mn.uio.no/ifi/forskning/aktuelt/arrangementer/disputaser/2020/nooralahzadeh.html

  13. arXiv:2003.02739  [pdf, other]

    cs.CL

    Zero-Shot Cross-Lingual Transfer with Meta Learning

    Authors: Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

    Abstract: Learning what to share between tasks has been a topic of great importance recently, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as most languages in the world are under-resourced. Here, we consider the setting of training models on multiple different languages at the same time, when little or…

    Submitted 5 October, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted as long paper in EMNLP2020 main conference

  14. arXiv:1805.11461  [pdf, other]

    cs.CL

    Syntactic Dependency Representations in Neural Relation Classification

    Authors: Farhad Nooralahzadeh, Lilja Øvrelid

    Abstract: We investigate the use of different syntactic dependency representations in a neural relation classification task and compare the CoNLL, Stanford Basic and Universal Dependencies schemes. We further compare with a syntax-agnostic approach and perform an error analysis in order to gain a better understanding of the results.

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: arXiv admin note: text overlap with arXiv:1804.08887

  15. arXiv:1804.08887  [pdf, other]

    cs.CL

    SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers

    Authors: Farhad Nooralahzadeh, Lilja Øvrelid, Jan Tore Lønning

    Abstract: This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers. First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and performs relation classification with differing objective…

    Submitted 24 April, 2018; originally announced April 2018.