Showing 1–18 of 18 results for author: Mahdi, A

  1. arXiv:2504.18919  [pdf, ps, other]

    cs.HC cs.AI cs.CL

    Clinical knowledge in LLMs does not translate to human interactions

    Authors: Andrew M. Bean, Rebecca Payne, Guy Parsons, Hannah Rose Kirk, Juan Ciro, Rafael Mosquera, Sara Hincapié Monsalve, Aruna S. Ekanayaka, Lionel Tarassenko, Luc Rocher, Adam Mahdi

    Abstract: Global healthcare providers are exploring use of large language models (LLMs) to provide medical advice to the public. LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings. We tested if LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (dispo…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: 52 pages, 4 figures

  2. arXiv:2503.02972  [pdf, other]

    cs.CL cs.AI

    LINGOLY-TOO: Disentangling Reasoning from Knowledge with Templatised Orthographic Obfuscation

    Authors: Jude Khouja, Karolina Korgul, Simi Hellsten, Lingyi Yang, Vlad Neacsu, Harry Mayne, Ryan Kearns, Andrew Bean, Adam Mahdi

    Abstract: The expanding knowledge and memorisation capacity of frontier language models allows them to solve many reasoning tasks directly by exploiting prior knowledge, leading to inflated estimates of their reasoning abilities. We introduce LINGOLY-TOO, a challenging reasoning benchmark grounded in natural language and designed to counteract the effect of non-reasoning abilities on reasoning estimates. Us…

    Submitted 28 May, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

  3. arXiv:2411.10168  [pdf, other]

    cs.AI cs.CL

    Evaluating the role of `Constitutions' for learning from AI feedback

    Authors: Saskia Redgate, Andrew M. Bean, Adam Mahdi

    Abstract: The growing capabilities of large language models (LLMs) have led to their use as substitutes for human feedback for training and assessing other LLMs. These methods often rely on `constitutions', written guidelines which a critic model uses to provide feedback and improve generations. We investigate how the choice of constitution affects feedback quality by using four different constitutions to i…

    Submitted 15 November, 2024; originally announced November 2024.

    Comments: 4 pages, 2 figures. In NeurIPS 2024 Workshop on Language Gamification

  4. arXiv:2411.08790  [pdf, other]

    cs.LG cs.AI cs.CL

    Can sparse autoencoders be used to decompose and interpret steering vectors?

    Authors: Harry Mayne, Yushi Yang, Adam Mahdi

    Abstract: Steering vectors are a promising approach to control the behaviour of large language models. However, their underlying mechanisms remain poorly understood. While sparse autoencoders (SAEs) may offer a potential method to interpret steering vectors, recent findings show that SAE-reconstructed vectors often lack the steering properties of the original vectors. This paper investigates why directly ap…

    Submitted 13 November, 2024; originally announced November 2024.

  5. arXiv:2411.06424  [pdf, other]

    cs.LG cs.CL

    Beyond Toxic Neurons: A Mechanistic Analysis of DPO for Toxicity Reduction

    Authors: Yushi Yang, Filip Sondej, Harry Mayne, Adam Mahdi

    Abstract: Safety fine-tuning algorithms are widely used to reduce harmful outputs in language models, but how they achieve this remains unclear. Studying the Direct Preference Optimization (DPO) algorithm for toxicity reduction, current explanations claim that DPO achieves this by dampening the activations of toxic MLP neurons. However, through activation patching, we show that this explanation is incomplete…

    Submitted 13 December, 2024; v1 submitted 10 November, 2024; originally announced November 2024.

    Journal ref: NeurIPS 2024 Workshop on Socially Responsible Language Modelling Research (SoLaR)

  6. arXiv:2410.21868  [pdf, other]

    cs.CL

    Improving In-Context Learning with Small Language Model Ensembles

    Authors: M. Mehdi Mojarradi, Lingyi Yang, Robert McCraith, Adam Mahdi

    Abstract: Large language models (LLMs) have shown impressive capabilities across various tasks, but their performance on domain-specific tasks remains limited. While methods like retrieval augmented generation and fine-tuning can help to address this, they require significant resources. In-context learning (ICL) is a cheap and efficient alternative but cannot match the accuracies of advanced methods. We pre…

    Submitted 20 December, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

    Comments: Presented at NeurIPS 2024 Workshop on Adaptive Foundation Models

  7. arXiv:2410.14185  [pdf, other]

    cs.LG eess.IV

    Combining Hough Transform and Deep Learning Approaches to Reconstruct ECG Signals From Printouts

    Authors: Felix Krones, Ben Walker, Terry Lyons, Adam Mahdi

    Abstract: This work presents our team's (SignalSavants) winning contribution to the 2024 George B. Moody PhysioNet Challenge. The Challenge had two goals: reconstruct ECG signals from printouts and classify them for cardiac diseases. Our focus was the first task. Despite many ECGs being digitally recorded today, paper ECGs remain common throughout the world. Digitising them could help build more diverse dat…

    Submitted 18 October, 2024; originally announced October 2024.

  8. arXiv:2408.07888  [pdf, other]

    cs.CL

    Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering

    Authors: Yushi Yang, Andrew M. Bean, Robert McCraith, Adam Mahdi

    Abstract: Fine-tuning Large Language Models (LLMs) incurs considerable training costs, driving the need for data-efficient training with optimised data ordering. Human-inspired strategies offer a solution by organising data based on human learning practices. This study evaluates the fine-tuning efficiency of five human-inspired strategies across four language models, three datasets, and both human- and LLM-…

    Submitted 5 November, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: NeurIPS 2024 Workshop on Fine-Tuning in Modern Machine Learning: Principles and Scalability (FITML)

  9. arXiv:2403.15475  [pdf, other]

    cs.CY

    Large language models can help boost food production, but be mindful of their risks

    Authors: Djavan De Clercq, Elias Nehring, Harry Mayne, Adam Mahdi

    Abstract: Coverage of ChatGPT-style large language models (LLMs) in the media has focused on their eye-catching achievements, including solving advanced mathematical problems and reaching expert proficiency in medical examinations. But the gradual adoption of LLMs in agriculture, an industry which touches every human life, has received much less public scrutiny. In this short perspective, we examine risks a…

    Submitted 20 March, 2024; originally announced March 2024.

  10. arXiv:2403.07967  [pdf, other]

    cs.LG

    Feasibility of machine learning-based rice yield prediction in India at the district level using climate reanalysis data

    Authors: Djavan De Clercq, Adam Mahdi

    Abstract: Yield forecasting, the science of predicting agricultural productivity before the crop harvest occurs, helps a wide range of stakeholders make better decisions around agricultural planning. This study aims to investigate whether machine learning-based yield prediction models can capably predict Kharif season rice yields at the district level in India several months before the rice harvest takes pl…

    Submitted 12 March, 2024; originally announced March 2024.

  11. arXiv:2403.06027  [pdf, other]

    cs.LG eess.SP

    Multimodal deep learning approach to predicting neurological recovery from coma after cardiac arrest

    Authors: Felix H. Krones, Ben Walker, Guy Parsons, Terry Lyons, Adam Mahdi

    Abstract: This work showcases our team's (The BEEGees) contributions to the 2023 George B. Moody PhysioNet Challenge. The aim was to predict neurological recovery from coma following cardiac arrest using clinical data and time-series such as multi-channel EEG and ECG signals. Our modelling approach is multimodal, based on two-dimensional spectrogram representations derived from numerous EEG channels, alongs…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 5 figures, 2 tables

  12. arXiv:2403.02945  [pdf, other]

    cs.LG

    Unsupervised Learning Approaches for Identifying ICU Patient Subgroups: Do Results Generalise?

    Authors: Harry Mayne, Guy Parsons, Adam Mahdi

    Abstract: The use of unsupervised learning to identify patient subgroups has emerged as a potentially promising direction to improve the efficiency of Intensive Care Units (ICUs). By identifying subgroups of patients with similar levels of medical resource need, ICUs could be restructured into a collection of smaller subunits, each catering to a specific group. However, it is unclear whether common patient…

    Submitted 5 March, 2024; originally announced March 2024.

  13. arXiv:2402.02460  [pdf, other]

    cs.LG cs.AI cs.CV

    Review of multimodal machine learning approaches in healthcare

    Authors: Felix Krones, Umar Marikkar, Guy Parsons, Adam Szmul, Adam Mahdi

    Abstract: Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple sources of information for improved decision making. Clinicians typically rely on a variety of data sources including patients' demographic information, laboratory data, vital signs and various imaging…

    Submitted 11 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: 5 figures, 5 tables

  14. LT-ViT: A Vision Transformer for multi-label Chest X-ray classification

    Authors: Umar Marikkar, Sara Atito, Muhammad Awais, Adam Mahdi

    Abstract: Vision Transformers (ViTs) are widely adopted in medical imaging tasks, and some existing efforts have been directed towards vision-language training for Chest X-rays (CXRs). However, we envision that there still exists a potential for improvement in vision-only training for CXRs using ViTs, by aggregating information from multiple scales, which has been proven beneficial for non-transformer netwo…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures

  15. arXiv:2310.07225  [pdf, ps, other]

    cs.CL

    Do Large Language Models have Shared Weaknesses in Medical Question Answering?

    Authors: Andrew M. Bean, Karolina Korgul, Felix Krones, Robert McCraith, Adam Mahdi

    Abstract: Large language models (LLMs) have made rapid improvement on medical benchmarks, but their unreliability remains a persistent challenge for safe real-world uses. To design for the use of LLMs as a category, rather than for specific models, requires developing an understanding of shared strengths and weaknesses which appear across models. To address this challenge, we benchmark a range of top LLMs and…

    Submitted 11 October, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 8 pages, 10 figures. To appear in NeurIPS 2024 Advancements in Medical Foundation Models Workshop

  16. Dual Bayesian ResNet: A Deep Learning Approach to Heart Murmur Detection

    Authors: Benjamin Walker, Felix Krones, Ivan Kiskin, Guy Parsons, Terry Lyons, Adam Mahdi

    Abstract: This study presents our team PathToMyHeart's contribution to the George B. Moody PhysioNet Challenge 2022. Two models are implemented. The first model is a Dual Bayesian ResNet (DBRes), where each patient's recording is segmented into overlapping log mel spectrograms. These undergo two binary classifications: present versus unknown or absent, and unknown versus present or absent. The classificatio…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 5 pages, 3 figures

    Journal ref: Computing in Cardiology, vol. 49, 2022

  17. arXiv:1709.02495  [pdf, other]

    cs.CV

    DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets

    Authors: Ali Mahdi, Jun Qin

    Abstract: A deep feature based saliency model (DeepFeat) is developed to leverage the understanding of the prediction of human fixations. Traditional saliency models often predict the human visual attention relying on few level image cues. Although such models predict fixations on a variety of image complexities, their approaches are limited to the incorporated features. In this study, we aim to provide an…

    Submitted 7 September, 2017; originally announced September 2017.

    Comments: 9 pages, 7 figures, submitted to IEEE transactions on cognitive developmental systems

  18. arXiv:1706.00396   

    cs.CV

    Line Profile Based Segmentation Algorithm for Touching Corn Kernels

    Authors: Ali Mahdi, Jun Qin

    Abstract: Image segmentation of touching objects plays a key role in providing accurate classification for computer vision technologies. A new line profile based imaging segmentation algorithm has been developed to provide a robust and accurate segmentation of a group of touching corns. The performance of the line profile based algorithm has been compared to a watershed based imaging segmentation algorithm.…

    Submitted 3 August, 2017; v1 submitted 1 June, 2017; originally announced June 2017.

    Comments: We found some results in this paper may not be correct. Therefore, we require to withdraw this paper. Thanks