Showing 1–11 of 11 results for author: Mayer, L

Searching in archive cs.
  1. arXiv:2506.06232

    cs.CV

    Challenging Vision-Language Models with Surgical Data: A New Dataset and Broad Benchmarking Study

    Authors: Leon Mayer, Tim Rädsch, Dominik Michael, Lucas Luttner, Amine Yamlahi, Evangelia Christodoulou, Patrick Godau, Marcel Knopp, Annika Reinke, Fiona Kolbinger, Lena Maier-Hein

    Abstract: While traditional computer vision models have historically struggled to generalize to endoscopic domains, the emergence of foundation models has shown promising cross-domain performance. In this work, we present the first large-scale study assessing the capabilities of Vision Language Models (VLMs) for endoscopic tasks with a specific focus on laparoscopic surgery. Using a diverse set of state-of-…

    Submitted 8 July, 2025; v1 submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2506.02692

    cs.CV

    Large-scale Self-supervised Video Foundation Model for Intelligent Surgery

    Authors: Shu Yang, Fengtao Zhou, Leon Mayer, Fuxiang Huang, Yiliang Chen, Yihui Wang, Sunan He, Yuxiang Nie, Xi Wang, Ömer Sümer, Yueming Jin, Huihui Sun, Shuchang Xu, Alex Qinyang Liu, Zheng Li, Jing Qin, Jeremy YuenChun Teoh, Lena Maier-Hein, Hao Chen

    Abstract: Computer-Assisted Intervention (CAI) has the potential to revolutionize modern surgery, with surgical scene understanding serving as a critical component in supporting decision-making, improving procedural efficacy, and ensuring intraoperative safety. While existing AI-driven approaches alleviate annotation burdens via self-supervised spatial representation learning, their lack of explicit tempora…

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2505.04720

    cs.CV

    False Promises in Medical Imaging AI? Assessing Validity of Outperformance Claims

    Authors: Evangelia Christodoulou, Annika Reinke, Pascaline Andrè, Patrick Godau, Piotr Kalinowski, Rola Houhou, Selen Erkan, Carole H. Sudre, Ninon Burgos, Sofiène Boutaj, Sophie Loizillon, Maëlys Solal, Veronika Cheplygina, Charles Heitz, Michal Kozubek, Michela Antonelli, Nicola Rieke, Antoine Gilson, Leon D. Mayer, Minu D. Tizabi, M. Jorge Cardoso, Amber Simpson, Annette Kopp-Schneider, Gaël Varoquaux, Olivier Colliot, et al. (1 additional author not shown)

    Abstract: Performance comparisons are fundamental in medical imaging Artificial Intelligence (AI) research, often driving claims of superiority based on relative improvements in common performance metrics. However, such claims frequently rely solely on empirical mean performance. In this paper, we investigate whether newly proposed methods genuinely outperform the state of the art by analyzing a representat…

    Submitted 7 May, 2025; originally announced May 2025.

  4. arXiv:2503.00248

    cs.AI cs.HC cs.MA

    Human-AI Collaboration: Trade-offs Between Performance and Preferences

    Authors: Lukas William Mayer, Sheer Karny, Jackie Ayoub, Miao Song, Danyang Tian, Ehsan Moradi-Pari, Mark Steyvers

    Abstract: Despite the growing interest in collaborative AI, designing systems that seamlessly integrate human input remains a major challenge. In this study, we developed a task to systematically examine human preferences for collaborative agents. We created and evaluated five collaborative AI agents with strategies that differ in the manner and degree they adapt to human actions. Participants interacted wi…

    Submitted 28 February, 2025; originally announced March 2025.

    Comments: LW Mayer & S Karny are co-first authors

  5. arXiv:2502.15563

    cs.CV cs.AI cs.CL cs.LG

    Bridging vision language model (VLM) evaluation gaps with a framework for scalable and cost-effective benchmark generation

    Authors: Tim Rädsch, Leon Mayer, Simon Pavicic, A. Emre Kavur, Marcel Knopp, Barış Öztürk, Klaus Maier-Hein, Paul F. Jaeger, Fabian Isensee, Annika Reinke, Lena Maier-Hein

    Abstract: Reliable evaluation of AI models is critical for scientific progress and practical application. While existing VLM benchmarks provide general insights into model capabilities, their heterogeneous designs and limited focus on a few imaging domains pose significant challenges for both cross-domain performance comparison and targeted domain-specific evaluation. To address this, we propose three key c…

    Submitted 21 February, 2025; originally announced February 2025.

  6. arXiv:2409.17763

    cs.CV cs.AI cs.LG

    Confidence intervals uncovered: Are we ready for real-world medical imaging AI?

    Authors: Evangelia Christodoulou, Annika Reinke, Rola Houhou, Piotr Kalinowski, Selen Erkan, Carole H. Sudre, Ninon Burgos, Sofiène Boutaj, Sophie Loizillon, Maëlys Solal, Nicola Rieke, Veronika Cheplygina, Michela Antonelli, Leon D. Mayer, Minu D. Tizabi, M. Jorge Cardoso, Amber Simpson, Paul F. Jäger, Annette Kopp-Schneider, Gaël Varoquaux, Olivier Colliot, Lena Maier-Hein

    Abstract: Medical imaging is spearheading the AI transformation of healthcare. Performance reporting is key to determine which methods should be translated into clinical practice. Frequently, broad conclusions are simply derived from mean performance values. In this paper, we argue that this common practice is often a misleading simplification as it ignores performance variability. Our contribution is three…

    Submitted 27 September, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: Paper accepted at MICCAI 2024 conference

  7. arXiv:2409.04164

    cs.CL cs.LG cs.SE

    Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation

    Authors: Luis Mayer, Christian Heumann, Matthias Aßenmacher

    Abstract: In recent years, large language models (LLMs) have emerged as powerful tools with potential applications in various fields, including software engineering. Within the scope of this research, we evaluate five different state-of-the-art LLMs - Bard, BingChat, ChatGPT, Llama2, and Code Llama - concerning their capabilities for text-to-code generation. In an empirical study, we feed prompts with textu…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: Conference Paper accepted at the 9th SwissText Conference (2024)

  8. arXiv:2401.13835

    cs.LG cs.AI cs.CL cs.HC

    What Large Language Models Know and What People Think They Know

    Authors: Mark Steyvers, Heliodoro Tejeda, Aakriti Kumar, Catarina Belem, Sheer Karny, Xinyue Hu, Lukas Mayer, Padhraic Smyth

    Abstract: As artificial intelligence (AI) systems, particularly large language models (LLMs), become increasingly integrated into decision-making processes, the ability to trust their outputs is crucial. To earn human trust, LLMs must be well calibrated such that they can accurately assess and communicate the likelihood of their predictions being correct. Whereas recent work has focused on LLMs' internal co…

    Submitted 13 February, 2025; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 27 pages, 10 figures. For the journal publication in Nature Machine Intelligence, see https://www.nature.com/articles/s42256-024-00976-7. For the data and code, see https://osf.io/y7pr6/

    Journal ref: Nat Mach Intell (2025)

  9. arXiv:2307.06345

    astro-ph.IM astro-ph.CO astro-ph.GA cs.DS

    Cornerstone: Octree Construction Algorithms for Scalable Particle Simulations

    Authors: Sebastian Keller, Aurélien Cavelan, Rubén Cabezon, Lucio Mayer, Florina M. Ciorba

    Abstract: This paper presents an octree construction method, called Cornerstone, that facilitates global domain decomposition and interactions between particles in mesh-free numerical simulations. Our method is based on algorithms developed for 3D computer graphics, which we extend to distributed high performance computing (HPC) systems. Cornerstone yields global and locally essential octrees and is able to…

    Submitted 12 July, 2023; originally announced July 2023.

    ACM Class: J.2

    Journal ref: PASC '23: Proceedings of the Platform for Advanced Scientific Computing Conference, June 2023, Article No.: 18

  10. arXiv:1905.03344

    physics.comp-ph cs.PF

    SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App

    Authors: Danilo Guerrera, Aurélien Cavelan, Rubén M. Cabezón, David Imbert, Jean-Guillaume Piccinali, Ali Mohammed, Lucio Mayer, Darren Reed, Florina M. Ciorba

    Abstract: Numerical simulations of fluids in astrophysics and computational fluid dynamics (CFD) are among the most computationally-demanding calculations, in terms of sustained floating-point operations per second, or FLOP/s. It is expected that these numerical simulations will significantly benefit from the future Exascale computing infrastructures, that will perform 10^18 FLOP/s. The performance of the S…

    Submitted 29 April, 2019; originally announced May 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1809.08013

  11. arXiv:1809.08013

    physics.comp-ph cs.CE cs.DC

    Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale

    Authors: Danilo Guerrera, Rubén M. Cabezón, Jean-Guillaume Piccinali, Aurélien Cavelan, Florina M. Ciorba, David Imbert, Lucio Mayer, Darren Reed

    Abstract: The smoothed particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations with detailed physics represent computationally-demanding calculations. The parallelization of SPH codes is not trivial due to the absence of a structured grid. Additionally, the perform…

    Submitted 21 September, 2018; originally announced September 2018.

    Comments: 18 pages, 4 figures, 5 tables, 2018 IEEE International Conference on Cluster Computing proceedings for WRAp18