Showing 1–50 of 56 results for author: Sznitman, R

  1. arXiv:2506.11356  [pdf, ps, other]

    cs.CV

    GynSurg: A Comprehensive Gynecology Laparoscopic Surgery Dataset

    Authors: Sahar Nasirihaghighi, Negin Ghamsarian, Leonie Peschek, Matteo Munari, Heinrich Husslein, Raphael Sznitman, Klaus Schoeffmann

    Abstract: Recent advances in deep learning have transformed computer-assisted intervention and surgical video analysis, driving improvements not only in surgical training, intraoperative decision support, and patient outcomes, but also in postoperative documentation and surgical discovery. Central to these developments is the availability of large, high-quality annotated datasets. In gynecologic laparoscopy…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2506.08896  [pdf, ps, other]

    cs.CV

    WetCat: Automating Skill Assessment in Wetlab Cataract Surgery Videos

    Authors: Negin Ghamsarian, Raphael Sznitman, Klaus Schoeffmann, Jens Kowal

    Abstract: To meet the growing demand for systematic surgical training, wetlab environments have become indispensable platforms for hands-on practice in ophthalmology. Yet, traditional wetlab training depends heavily on manual performance evaluations, which are labor-intensive, time-consuming, and often subject to variability. Recent advances in computer vision offer promising avenues for automated skill ass…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 9 pages, 6 figures

  3. arXiv:2505.07691  [pdf, ps, other]

    cs.CV

    Feedback-Driven Pseudo-Label Reliability Assessment: Redefining Thresholding for Semi-Supervised Semantic Segmentation

    Authors: Negin Ghamsarian, Sahar Nasirihaghighi, Klaus Schoeffmann, Raphael Sznitman

    Abstract: Semi-supervised learning leverages unlabeled data to enhance model performance, addressing the limitations of fully supervised approaches. Among its strategies, pseudo-supervision has proven highly effective, typically relying on one or multiple teacher networks to refine pseudo-labels before training a student network. A common practice in pseudo-supervision is filtering pseudo-labels based on pr…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 11 pages, 5 Figures

  4. arXiv:2501.17628  [pdf, other]

    eess.IV cs.CV

    Dual Invariance Self-training for Reliable Semi-supervised Surgical Phase Recognition

    Authors: Sahar Nasirihaghighi, Negin Ghamsarian, Raphael Sznitman, Klaus Schoeffmann

    Abstract: Accurate surgical phase recognition is crucial for advancing computer-assisted interventions, yet the scarcity of labeled data hinders training reliable deep learning models. Semi-supervised learning (SSL), particularly with pseudo-labeling, shows promise over fully supervised methods but often lacks reliable pseudo-label assessment mechanisms. To address this gap, we propose a novel SSL framework…

    Submitted 29 January, 2025; originally announced January 2025.

  5. arXiv:2501.06836  [pdf, other]

    cs.CV

    SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation

    Authors: Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman

    Abstract: This paper addresses the domain adaptation challenge for semantic segmentation in medical imaging. Despite the impressive performance of recent foundational segmentation models like SAM on natural images, they struggle with medical domain images. Beyond this, recent approaches that perform end-to-end fine-tuning of models are simply not computationally tractable. To address this, we propose a nove…

    Submitted 12 January, 2025; originally announced January 2025.

    Comments: WACV25

  6. arXiv:2412.06470  [pdf, other]

    cs.CV cs.LG

    Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation

    Authors: Fei Wu, Pablo Marquez-Neila, Hedyeh Rafii-Tari, Raphael Sznitman

    Abstract: Multi-class semantic segmentation remains a cornerstone challenge in computer vision. Yet, dataset creation remains excessively demanding in time and effort, especially for specialized domains. Active Learning (AL) mitigates this challenge by selecting data points for annotation strategically. However, existing patch-based AL methods often overlook boundary pixels' critical information, essential f…

    Submitted 16 March, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

    Comments: WACV 2025 (Oral), 8 pages

  7. arXiv:2411.13619  [pdf, other]

    cs.CV cs.AI cs.LG

    Non-Linear Outlier Synthesis for Out-of-Distribution Detection

    Authors: Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: The reliability of supervised classifiers is severely hampered by their limitations in dealing with unexpected inputs, leading to great interest in out-of-distribution (OOD) detection. Recently, OOD detectors trained on synthetic outliers, especially those generated by large diffusion models, have shown promising results in defining robust OOD decision boundaries. Building on this progress, we pre…

    Submitted 20 November, 2024; originally announced November 2024.

  8. arXiv:2409.06037  [pdf, other]

    cs.CV

    Online 3D reconstruction and dense tracking in endoscopic videos

    Authors: Michel Hayoz, Christopher Hahne, Thomas Kurmann, Max Allan, Guido Beldi, Daniel Candinas, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: 3D scene reconstruction from stereo endoscopic video data is crucial for advancing surgical interventions. In this work, we present a framework for online, dense 3D scene reconstruction and tracking, aimed at enhancing surgical scene understanding and assisting interventions. Our method dynamically extends a canonical scene representation using Gaussian splatting, while modeling tissue def…

    Submitted 9 September, 2024; originally announced September 2024.

  9. arXiv:2408.03043  [pdf, other]

    cs.CV

    Targeted Visual Prompting for Medical Visual Question Answering

    Authors: Sergio Tascon-Morales, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: With growing interest in recent years, medical visual question answering (Med-VQA) has rapidly evolved, with multimodal large language models (MLLMs) emerging as an alternative to classical model architectures. Specifically, their ability to add visual information to the input of pre-trained LLMs brings new capabilities for image interpretation. However, simple visual errors cast doubt on the actu…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: Accepted at the MICCAI AMAI Workshop 2024

  10. arXiv:2407.11906  [pdf, other]

    cs.CV cs.RO

    SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

    Authors: Hao Ding, Yuqian Zhang, Tuxun Lu, Ruixing Liang, Hongchao Shu, Lalithkumar Seenivasan, Yonghao Long, Qi Dou, Cong Gao, Yicheng Leng, Seok Bong Yoo, Eung-Joo Lee, Negin Ghamsarian, Klaus Schoeffmann, Raphael Sznitman, Zijian Wu, Yuxin Chen, Septimiu E. Salcudean, Samra Irshad, Shadi Albarqouni, Seong Tae Kim, Yueyi Sun, An Wang, Long Bai, Hongliang Ren, et al. (17 additional authors not shown)

    Abstract: Surgical data science has seen rapid advancement due to the excellent performance of end-to-end deep neural networks (DNNs) for surgical video analysis. Despite their successes, end-to-end DNNs have been proven susceptible to even minor corruptions, substantially impairing the model's performance. This vulnerability has become a major concern for the translation of cutting-edge technology, especia…

    Submitted 7 April, 2025; v1 submitted 16 July, 2024; originally announced July 2024.

  11. arXiv:2407.04022  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning Non-Linear Invariants for Unsupervised Out-of-Distribution Detection

    Authors: Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: The inability of deep learning models to handle data drawn from unseen distributions has sparked much interest in unsupervised out-of-distribution (U-OOD) detection, as it is crucial for reliable deep learning models. Despite considerable attention, theoretically-motivated approaches are few and far between, with most methods building on top of some form of heuristic. Recently, U-OOD was formalize…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  12. arXiv:2406.18175  [pdf, other]

    astro-ph.GA astro-ph.IM cs.AI

    Galaxy spectroscopy without spectra: Galaxy properties from photometric images with conditional diffusion models

    Authors: Lars Doorenbos, Eva Sextl, Kevin Heng, Stefano Cavuoti, Massimo Brescia, Olena Torbaniuk, Giuseppe Longo, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: Modern spectroscopic surveys can only target a small fraction of the vast amount of photometrically cataloged sources in wide-field surveys. Here, we report the development of a generative AI method capable of predicting optical galaxy spectra from photometric broad-band images alone. This method draws from the latest advances in diffusion models in combination with contrastive networks. We pass m…

    Submitted 28 October, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by The Astrophysical Journal. Code is available at https://github.com/LarsDoorenbos/generate-spectra

  13. arXiv:2406.02327  [pdf, other]

    cs.CV cs.LG

    Iterative Deployment Exposure for Unsupervised Out-of-Distribution Detection

    Authors: Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: Deep learning models are vulnerable to performance degradation when encountering out-of-distribution (OOD) images, potentially leading to misdiagnoses and compromised patient care. These shortcomings have led to great interest in the field of OOD detection. Existing unsupervised OOD (U-OOD) detection methods typically assume that OOD samples originate from an unconcentrated distribution complement…

    Submitted 19 May, 2025; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted at MICCAI 2025

  14. arXiv:2405.14788  [pdf, other]

    cs.CV

    Masked Image Modelling for retinal OCT understanding

    Authors: Theodoros Pissas, Pablo Márquez-Neila, Sebastian Wolf, Martin Zinkernagel, Raphael Sznitman

    Abstract: This work explores the effectiveness of masked image modelling for learning representations of retinal OCT images. To this end, we leverage Masked Autoencoders (MAE), a simple and scalable method for self-supervised learning, to obtain a powerful and general representation for OCT images by training on 700K OCT images from 41K patients collected under real world clinical settings. We also provide…

    Submitted 23 May, 2024; originally announced May 2024.

  15. arXiv:2312.06295  [pdf, other]

    cs.CV

    Cataract-1K: Cataract Surgery Dataset for Scene Segmentation, Phase Recognition, and Irregularity Detection

    Authors: Negin Ghamsarian, Yosuf El-Shabrawi, Sahar Nasirihaghighi, Doris Putzgruber-Adamitsch, Martin Zinkernagel, Sebastian Wolf, Klaus Schoeffmann, Raphael Sznitman

    Abstract: In recent years, the landscape of computer-assisted interventions and post-operative surgical video analysis has been dramatically reshaped by deep-learning techniques, resulting in significant advancements in surgeons' skills, operation room management, and overall surgical outcomes. However, the progression of deep-learning-powered surgical technologies is profoundly reliant on large-scale datas…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 12 pages, 5 figures, 7 tables

  16. arXiv:2312.03409  [pdf, other]

    cs.CV

    DeepPyramid+: Medical Image Segmentation using Pyramid View Fusion and Deformable Pyramid Reception

    Authors: Negin Ghamsarian, Sebastian Wolf, Martin Zinkernagel, Klaus Schoeffmann, Raphael Sznitman

    Abstract: Semantic Segmentation plays a pivotal role in many applications related to medical image and video analysis. However, designing a neural network architecture for medical image and surgical video segmentation is challenging due to the diverse features of relevant classes, including heterogeneity, deformability, transparency, blunt boundaries, and various distortions. We propose a network architectu…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 13 pages, 3 figures

  17. arXiv:2312.03401  [pdf, other]

    eess.IV cs.CV

    Predicting Postoperative Intraocular Lens Dislocation in Cataract Surgery via Deep Learning

    Authors: Negin Ghamsarian, Doris Putzgruber-Adamitsch, Stephanie Sarny, Raphael Sznitman, Klaus Schoeffmann, Yosuf El-Shabrawi

    Abstract: A critical yet unpredictable complication following cataract surgery is intraocular lens dislocation. Postoperative stability is imperative, as even a tiny decentration of multifocal lenses or inadequate alignment of the torus in toric lenses due to postoperative rotation can lead to a significant drop in visual acuity. Investigating possible intraoperative indicators that can predict post-surgica…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 12 pages, 5 figures

  18. arXiv:2311.08811  [pdf, other]

    cs.CV cs.LG

    Correlation-aware active learning for surgery video segmentation

    Authors: Fei Wu, Pablo Marquez-Neila, Mingyi Zheng, Hedyeh Rafii-Tari, Raphael Sznitman

    Abstract: Semantic segmentation is a complex task that relies heavily on large amounts of annotated image data. However, annotating such data can be time-consuming and resource-intensive, especially in the medical domain. Active Learning (AL) is a popular approach that can help to reduce this burden by iteratively selecting images for annotation to improve the model performance. In the case of video data, i…

    Submitted 11 December, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: WACV 2024, 8 pages, 7 supplementary pages

  19. arXiv:2311.04081  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Learning Super-Resolution Ultrasound Localization Microscopy from Radio-Frequency Data

    Authors: Christopher Hahne, Georges Chabouh, Olivier Couture, Raphael Sznitman

    Abstract: Ultrasound Localization Microscopy (ULM) enables imaging of vascular structures in the micrometer range by accumulating contrast agent particle locations over time. Precise and efficient target localization accuracy remains an active research topic in the ULM field to further push the boundaries of this promising medical imaging technology. Existing work incorporates Delay-And-Sum (DAS) beamformin…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: IEEE International Ultrasonics Symposium (IUS), 2023

  20. arXiv:2310.01545  [pdf, other]

    cs.CG cs.CV physics.med-ph

    RF-ULM: Ultrasound Localization Microscopy Learned from Radio-Frequency Wavefronts

    Authors: Christopher Hahne, Georges Chabouh, Arthur Chavignon, Olivier Couture, Raphael Sznitman

    Abstract: In Ultrasound Localization Microscopy (ULM), achieving high-resolution images relies on the precise localization of contrast agent particles across a series of beamformed frames. However, our study uncovers an enormous potential: The process of delay-and-sum beamforming leads to an irreversible reduction of Radio-Frequency (RF) channel data, while its implications for localization remain largely u…

    Submitted 5 April, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  21. arXiv:2308.13279  [pdf, other]

    cs.LG cs.AI

    Hyperbolic Random Forests

    Authors: Lars Doorenbos, Pablo Márquez-Neila, Raphael Sznitman, Pascal Mettes

    Abstract: Hyperbolic space is becoming a popular choice for representing data due to the hierarchical structure - whether implicit or explicit - of many real-world datasets. Along with it comes a need for algorithms capable of solving fundamental tasks, such as classification, in hyperbolic space. Recently, multiple papers have investigated hyperbolic alternatives to hyperplane-based classifiers, such as lo…

    Submitted 24 June, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted at TMLR. Code available at https://github.com/LarsDoorenbos/HoroRF

  22. arXiv:2308.12009  [pdf, other]

    cs.CV eess.IV physics.geo-ph

    StofNet: Super-resolution Time of Flight Network

    Authors: Christopher Hahne, Michel Hayoz, Raphael Sznitman

    Abstract: Time of Flight (ToF) is a prevalent depth sensing technology in the fields of robotics, medical imaging, and non-destructive testing. Yet, ToF sensing faces challenges from complex ambient conditions making an inverse modelling from the sparse temporal information intractable. This paper highlights the potential of modern super-resolution techniques to learn varying surroundings for a reliable and…

    Submitted 23 December, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: pre-print

  23. arXiv:2307.16660  [pdf, other]

    cs.CV

    Domain Adaptation for Medical Image Segmentation using Transformation-Invariant Self-Training

    Authors: Negin Ghamsarian, Javier Gamazo Tejero, Pablo Márquez Neila, Sebastian Wolf, Martin Zinkernagel, Klaus Schoeffmann, Raphael Sznitman

    Abstract: Models capable of leveraging unlabelled data are crucial in overcoming large distribution gaps between the acquired datasets across different imaging devices and configurations. In this regard, self-training techniques based on pseudo-labeling have been shown to be highly effective for semi-supervised domain adaptation. However, the unreliability of pseudo labels can hinder the capability of self-…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: 11 pages, 5 figures, accepted at 26th international conference on Medical Image Computing & Computer Assisted Intervention (MICCAI 2023)

  24. A reinforcement learning approach for VQA validation: an application to diabetic macular edema grading

    Authors: Tatiana Fountoukidou, Raphael Sznitman

    Abstract: Recent advances in machine learning models have greatly increased the performance of automated methods in medical image analysis. However, the internal functioning of such models is largely hidden, which hinders their integration in clinical practice. Explainability and trust are viewed as important aspects of modern methods, for the latter's widespread use in clinical communities. As such, valida…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 16 pages (+ 23 pages supplementary material)

    Journal ref: Medical image analysis 87 (2023): 102822

  25. arXiv:2307.01067  [pdf, other]

    cs.CV

    Localized Questions in Medical Visual Question Answering

    Authors: Sergio Tascon-Morales, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: Visual Question Answering (VQA) models aim to answer natural language questions about given images. Due to its ability to ask questions that differ from those used when training the model, medical VQA has received substantial attention in recent years. However, existing medical VQA models typically focus on answering questions that refer to an entire image rather than where the relevant content ma…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Appears in Medical Image Computing and Computer Assisted Interventions (MICCAI), 2023

  26. arXiv:2306.15548  [pdf, other]

    cs.CV cs.LG

    Geometric Ultrasound Localization Microscopy

    Authors: Christopher Hahne, Raphael Sznitman

    Abstract: Contrast-Enhanced Ultra-Sound (CEUS) has become a viable method for non-invasive, dynamic visualization in medical diagnostics, yet Ultrasound Localization Microscopy (ULM) has enabled a revolutionary breakthrough by offering ten times higher resolution. To date, Delay-And-Sum (DAS) beamformers are used to render ULM frames, ultimately determining the image resolution capability. To take full adva…

    Submitted 18 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Pre-print accepted for MICCAI 2023

  27. arXiv:2304.08023  [pdf, other]

    cs.CV

    Learning How To Robustly Estimate Camera Pose in Endoscopic Videos

    Authors: Michel Hayoz, Christopher Hahne, Mathias Gallardo, Daniel Candinas, Thomas Kurmann, Maximilian Allan, Raphael Sznitman

    Abstract: Purpose: Surgical scene understanding plays a critical role in the technology stack of tomorrow's intervention-assisting systems in endoscopic surgeries. For this, tracking the endoscope pose is a key component, but remains challenging due to illumination conditions, deforming tissues and the breathing motion of organs. Method: We propose a solution for stereo endoscopes that estimates depth and o…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted at IPCAI 2023

  28. Unsupervised out-of-distribution detection for safer robotically guided retinal microsurgery

    Authors: Alain Jungo, Lars Doorenbos, Tommaso Da Col, Maarten Beelen, Martin Zinkernagel, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: Purpose: A fundamental problem in designing safe machine learning systems is identifying when samples presented to a deployed model differ from those observed at training time. Detecting so-called out-of-distribution (OoD) samples is crucial in safety-critical applications such as robotically guided retinal microsurgery, where distances between the instrument and the retina are derived from sequen…

    Submitted 3 May, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Accepted at IPCAI 2023

  29. arXiv:2303.11678  [pdf, other]

    cs.CV

    Full or Weak annotations? An adaptive strategy for budget-constrained annotation campaigns

    Authors: Javier Gamazo Tejero, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman, Pablo Márquez Neila

    Abstract: Annotating new datasets for machine learning tasks is tedious, time-consuming, and costly. For segmentation applications, the burden is particularly high as manual delineations of relevant image content are often extremely expensive or can only be done by experts with domain-specific knowledge. Thanks to developments in transfer learning and training with weak supervision, segmentation models can…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: CVPR23

  30. arXiv:2303.09427  [pdf, other]

    cs.CV

    Logical Implications for Visual Question Answering Consistency

    Authors: Sergio Tascon-Morales, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: Despite considerable recent progress in Visual Question Answering (VQA) models, inconsistent or contradictory answers continue to cast doubt on their true reasoning capabilities. However, most proposed methods use indirect strategies or strong assumptions on pairs of questions and answers to enforce model consistency. Instead, we propose a novel strategy intended to improve model performance by di…

    Submitted 16 March, 2023; originally announced March 2023.

  31. arXiv:2303.08888  [pdf, other]

    cs.CV

    Stochastic Segmentation with Conditional Categorical Diffusion Models

    Authors: Lukas Zbinden, Lars Doorenbos, Theodoros Pissas, Adrian Thomas Huber, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: Semantic segmentation has made significant progress in recent years thanks to deep neural networks, but the common objective of generating a single segmentation output that accurately matches the image's content may not be suitable for safety-critical domains such as medical diagnostics and autonomous driving. Instead, multiple possible correct segmentation maps may be required to reflect the true…

    Submitted 11 September, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted at ICCV 2023. Code available at https://github.com/LarsDoorenbos/ccdm-stochastic-segmentation

  32. ULISSE: A Tool for One-shot Sky Exploration and its Application to Active Galactic Nuclei Detection

    Authors: Lars Doorenbos, Olena Torbaniuk, Stefano Cavuoti, Maurizio Paolillo, Giuseppe Longo, Massimo Brescia, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: Modern sky surveys are producing ever larger amounts of observational data, which makes the application of classical approaches for the classification and analysis of objects challenging and time-consuming. However, this issue may be significantly mitigated by the application of automatic machine and deep learning methods. We propose ULISSE, a new deep learning tool that, starting from a single pr…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted for publication in A&A

    Journal ref: A&A 666, A171 (2022)

  33. arXiv:2207.01453  [pdf, other]

    cs.CV

    DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

    Authors: Negin Ghamsarian, Mario Taschwer, Raphael Sznitman, Klaus Schoeffmann

    Abstract: Semantic segmentation in cataract surgery has a wide range of applications contributing to surgical outcome enhancement and clinical risk reduction. However, the varying issues in segmenting the different relevant structures in these surgeries make the designation of a unique network quite challenging. This paper proposes a semantic segmentation network, termed DeepPyramid, that can deal with thes…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: 11 pages, 4 figures, accepted at 25th international conference on Medical Image Computing & Computer Assisted Intervention (MICCAI 2022). arXiv admin note: substantial text overlap with arXiv:2109.05352

  34. arXiv:2206.13296  [pdf, other]

    cs.CV cs.LG

    Consistency-preserving Visual Question Answering in Medical Imaging

    Authors: Sergio Tascon-Morales, Pablo Márquez-Neila, Raphael Sznitman

    Abstract: Visual Question Answering (VQA) models take an image and a natural-language question as input and infer the answer to the question. Recently, VQA systems in medical imaging have gained popularity thanks to potential advantages such as patient engagement and second opinions for clinicians. While most research efforts have been focused on improving architectures and overcoming data-related limitatio…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Appears in Medical Image Computing and Computer Assisted Interventions (MICCAI), 2022

  35. arXiv:2111.13362  [pdf, other]

    cs.CV

    Data Invariants to Understand Unsupervised Out-of-Distribution Detection

    Authors: Lars Doorenbos, Raphael Sznitman, Pablo Márquez-Neila

    Abstract: Unsupervised out-of-distribution (U-OOD) detection has recently attracted much attention due to its importance in mission-critical systems and broader applicability over its supervised counterpart. Despite this increase in attention, U-OOD methods suffer from important shortcomings. By performing a large-scale evaluation on different benchmarks and image modalities, we show in this work that most pop…

    Submitted 21 July, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: ECCV 2022

  36. arXiv:2107.08394  [pdf, other]

    cs.CV

    A Positive/Unlabeled Approach for the Segmentation of Medical Sequences using Point-Wise Supervision

    Authors: Laurent Lejeune, Raphael Sznitman

    Abstract: The ability to quickly annotate medical imaging data plays a critical role in training deep learning frameworks for segmentation. Doing so for image volumes or video sequences is even more pressing as annotating these is particularly burdensome. To alleviate this problem, this work proposes a new method to efficiently segment medical imaging volumes or videos using point-wise annotations only. Thi…

    Submitted 18 July, 2021; originally announced July 2021.

  37. arXiv:2106.11048  [pdf, other]

    eess.IV cs.CV

    CataNet: Predicting remaining cataract surgery duration

    Authors: Andrés Marafioti, Michel Hayoz, Mathias Gallardo, Pablo Márquez Neila, Sebastian Wolf, Martin Zinkernagel, Raphael Sznitman

    Abstract: Cataract surgery is a sight-saving surgery that is performed over 10 million times each year around the world. With such a large demand, the ability to organize surgical wards and operating rooms efficiently is critical to delivering this therapy in routine clinical care. In this context, estimating the remaining surgical duration (RSD) during procedures is one way to help streamline patient through…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted at MICCAI 2021

  38. arXiv:2101.01133  [pdf, other]

    cs.CV

    Stereo Correspondence and Reconstruction of Endoscopic Data Challenge

    Authors: Max Allan, Jonathan Mcleod, Congcong Wang, Jean Claude Rosenthal, Zhenglei Hu, Niklas Gard, Peter Eisert, Ke Xue Fu, Trevor Zeffiro, Wenyao Xia, Zhanshi Zhu, Huoling Luo, Fucang Jia, Xiran Zhang, Xiaohong Li, Lalith Sharan, Tom Kurmann, Sebastian Schmid, Raphael Sznitman, Dimitris Psychogyios, Mahdi Azizian, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel

    Abstract: The stereo correspondence and reconstruction of endoscopic data sub-challenge was organized during the Endovis challenge at MICCAI 2019 in Shenzhen, China. The task was to perform dense depth estimation using 7 training datasets and 2 test sets of structured light data captured using porcine cadavers. These were provided by a team at Intuitive Surgical. 10 teams participated in the challenge day.…

    Submitted 28 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

  39. arXiv:2011.02284  [pdf, other]

    cs.CY cs.CV cs.LG eess.IV

    Surgical Data Science -- from Concepts toward Clinical Translation

    Authors: Lena Maier-Hein, Matthias Eisenmann, Duygu Sarikaya, Keno März, Toby Collins, Anand Malpani, Johannes Fallert, Hubertus Feussner, Stamatia Giannarou, Pietro Mascagni, Hirenkumar Nakawala, Adrian Park, Carla Pugh, Danail Stoyanov, Swaroop S. Vedula, Kevin Cleary, Gabor Fichtinger, Germain Forestier, Bernard Gibaud, Teodor Grantcharov, Makoto Hashizume, Doreen Heckmann-Nötzel, Hannes G. Kenngott, Ron Kikinis, Lars Mündermann, et al. (25 additional authors not shown)

    Abstract: Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applica…

    Submitted 30 July, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

  40. arXiv:2003.08760  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    A Question-Centric Model for Visual Question Answering in Medical Imaging

    Authors: Minh H. Vu, Tommy Löfstedt, Tufve Nyholm, Raphael Sznitman

    Abstract: Deep learning methods have proven extremely effective at performing a variety of medical image analysis tasks. With their potential use in clinical routine, their lack of transparency has however been one of their few weak points, raising concerns regarding their behavior and failure modes. While most research to infer model behavior has focused on indirect strategies that estimate prediction unce…

    Submitted 2 March, 2020; originally announced March 2020.

    Comments: Accepted at IEEE Transactions on Medical Imaging

  41. arXiv:1907.06955  [pdf, other]

    cs.CV cs.LG

    Fused Detection of Retinal Biomarkers in OCT Volumes

    Authors: Thomas Kurmann, Pablo Márquez-Neila, Siqing Yu, Marion Munk, Sebastian Wolf, Raphael Sznitman

    Abstract: Optical Coherence Tomography (OCT) is the primary imaging modality for detecting pathological biomarkers associated to retinal diseases such as Age-Related Macular Degeneration. In practice, clinical diagnosis and treatment strategies are closely linked to biomarkers visible in OCT volumes and the ability to identify these plays an important role in the development of ophthalmic pharmaceutical pro…

    Submitted 16 July, 2019; originally announced July 2019.

  42. arXiv:1907.06414  [pdf, other]

    cs.LG eess.IV stat.ML

    Concept-Centric Visual Turing Tests for Method Validation

    Authors: Tatiana Fountoukidou, Raphael Sznitman

    Abstract: Recent advances in machine learning for medical imaging have led to impressive increases in model complexity and overall capabilities. However, the ability to discern the precise information a machine learning method is using to make decisions has lagged behind and it is often unclear how these performances are in fact achieved. Conventional evaluation metrics that reduce method performance to a s…

    Submitted 15 March, 2020; v1 submitted 15 July, 2019; originally announced July 2019.

    Comments: 9 pages, 8 figures

  43. arXiv:1907.04563  [pdf, other]

    cs.CV cs.LG

    Deep Multi Label Classification in Affine Subspaces

    Authors: Thomas Kurmann, Pablo Marquez Neila, Sebastian Wolf, Raphael Sznitman

    Abstract: Multi-label classification (MLC) problems are becoming increasingly popular in the context of medical imaging. This has in part been driven by the fact that acquiring annotations for MLC is far less burdensome than for semantic segmentation and yet provides more expressiveness than multi-class classification. However, to train MLCs, most methods have resorted to similar objective functions as with…

    Submitted 10 July, 2019; originally announced July 2019.

  44. arXiv:1810.04114  [pdf, other]

    cs.LG stat.ML

    Discovering General-Purpose Active Learning Strategies

    Authors: Ksenia Konyushkova, Raphael Sznitman, Pascal Fua

    Abstract: We propose a general-purpose approach to discovering active learning (AL) strategies from data. These strategies are transferable from one domain to another and can be used in conjunction with many machine learning models. To this end, we formalize the annotation process as a Markov decision process, design universal state and action spaces and introduce a new reward function that precisely model…

    Submitted 2 April, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

  45. arXiv:1809.00970  [pdf, other]

    cs.CV cs.AI

    Iterative multi-path tracking for video and volume segmentation with sparse point supervision

    Authors: Laurent Lejeune, Jan Grossrieder, Raphael Sznitman

    Abstract: Recent machine learning strategies for segmentation tasks have shown great ability when trained on large pixel-wise annotated image datasets. It remains a major challenge however to aggregate such datasets, as the time and monetary cost associated with collecting extensive annotations is extremely high. This is particularly the case for generating precise pixel-wise annotations in video and volume…

    Submitted 27 August, 2018; originally announced September 2018.

  46. arXiv:1805.02475  [pdf, other]

    cs.CV

    Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery

    Authors: Sebastian Bodenstedt, Max Allan, Anthony Agustinos, Xiaofei Du, Luis Garcia-Peraza-Herrera, Hannes Kenngott, Thomas Kurmann, Beat Müller-Stich, Sebastien Ourselin, Daniil Pakhomov, Raphael Sznitman, Marvin Teichmann, Martin Thoma, Tom Vercauteren, Sandrine Voros, Martin Wagner, Pamela Wochner, Lena Maier-Hein, Danail Stoyanov, Stefanie Speidel

    Abstract: Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are…

    Submitted 7 May, 2018; originally announced May 2018.

  47. Simultaneous Recognition and Pose Estimation of Instruments in Minimally Invasive Surgery

    Authors: Thomas Kurmann, Pablo Marquez Neila, Xiaofei Du, Pascal Fua, Danail Stoyanov, Sebastian Wolf, Raphael Sznitman

    Abstract: Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their pa…

    Submitted 18 October, 2017; originally announced October 2017.

    Comments: 8 pages, 2 figures, MICCAI 2017

  48. arXiv:1707.04931  [pdf, other]

    cs.CV

    Pathological OCT Retinal Layer Segmentation using Branch Residual U-shape Networks

    Authors: Stefanos Apostolopoulos, Sandro De Zanet, Carlos Ciller, Sebastian Wolf, Raphael Sznitman

    Abstract: The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dila…

    Submitted 16 July, 2017; originally announced July 2017.

    Comments: 9 pages, 5 figures, MICCAI 2017

  49. arXiv:1707.04905  [pdf, other]

    cs.CV

    Expected exponential loss for gaze-based video and volume ground truth annotation

    Authors: Laurent Lejeune, Mario Christoudias, Raphael Sznitman

    Abstract: Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework to allow annotators to simply observe the object to segment and record…

    Submitted 16 July, 2017; originally announced July 2017.

    Comments: 9 pages, 5 figures, MICCAI 2017 - LABELS Workshop

  50. arXiv:1703.03365  [pdf, other]

    cs.LG

    Learning Active Learning from Data

    Authors: Ksenia Konyushkova, Raphael Sznitman, Pascal Fua

    Abstract: In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from…

    Submitted 14 July, 2017; v1 submitted 9 March, 2017; originally announced March 2017.