Search | arXiv e-print repository

Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery

Authors: Sebastian Bodenstedt, Max Allan, Anthony Agustinos, Xiaofei Du, Luis Garcia-Peraza-Herrera, Hannes Kenngott, Thomas Kurmann, Beat Müller-Stich, Sebastien Ourselin, Daniil Pakhomov, Raphael Sznitman, Marvin Teichmann, Martin Thoma, Tom Vercauteren, Sandrine Voros, Martin Wagner, Pamela Wochner, Lena Maier-Hein, Danail Stoyanov, Stefanie Speidel

Abstract: Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are… ▽ More Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are common image data sets for consistent evaluation and benchmarking of algorithms against each other. The paper presents a comparative validation study of different vision-based methods for instrument segmentation and tracking in the context of robotic as well as conventional laparoscopic surgery. The contribution of the paper is twofold: we introduce a comprehensive validation data set that was provided to the study participants and present the results of the comparative validation study. Based on the results of the validation study, we arrive at the conclusion that modern deep learning approaches outperform other methods in instrument segmentation tasks, but the results are still not perfect. Furthermore, we show that merging results from different methods actually significantly increases accuracy in comparison to the best stand-alone method. On the other hand, the results of the instrument tracking task show that this is still an open challenge, especially during challenging scenarios in conventional laparoscopic surgery. △ Less

Submitted 7 May, 2018; originally announced May 2018.

arXiv:1710.06668 [pdf, other]

doi 10.1007/978-3-319-66185-8_57

Simultaneous Recognition and Pose Estimation of Instruments in Minimally Invasive Surgery

Authors: Thomas Kurmann, Pablo Marquez Neila, Xiaofei Du, Pascal Fua, Danail Stoyanov, Sebastian Wolf, Raphael Sznitman

Abstract: Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their pa… ▽ More Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their parts. We use a Convolutional Neural Network architecture to embody our model and show that the cross-entropy loss is well suited to optimize its parameters which can be trained in an end-to-end fashion. An additional advantage of our approach is that instrument detection at test time is achieved while avoiding the need for scale-dependent sliding window evaluation. This allows our approach to be relatively parameter free at test time and shows good performance for both instrument detection and tracking. We show that our approach surpasses state-of-the-art results on in-vivo retinal microsurgery image data, as well as ex-vivo laparoscopic sequences. △ Less

Submitted 18 October, 2017; originally announced October 2017.

Comments: 8 pages, 2 figures, MICCAI 2017

arXiv:1707.04931 [pdf, other]

Pathological OCT Retinal Layer Segmentation using Branch Residual U-shape Networks

Authors: Stefanos Apostolopoulos, Sandro De Zanet, Carlos Ciller, Sebastian Wolf, Raphael Sznitman

Abstract: The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dila… ▽ More The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dilated residual blocks in an asymmetric U-shape configuration, and can segment multiple layers of highly pathological eyes in one shot. We validate our approach on a dataset of late-stage AMD patients and demonstrate lower computational costs and higher performance compared to other state-of-the-art methods. △ Less

Submitted 16 July, 2017; originally announced July 2017.

Comments: 9 pages, 5 figures, MICCAI 2017

arXiv:1707.04905 [pdf, other]

Expected exponential loss for gaze-based video and volume ground truth annotation

Authors: Laurent Lejeune, Mario Christoudias, Raphael Sznitman

Abstract: Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework to allow annotators to simply observe the object to segment and record… ▽ More Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework to allow annotators to simply observe the object to segment and record where they have looked at with a \$200 eye gaze tracker. Our method then estimates pixel-wise probabilities for the presence of the object throughout the sequence from which we train a classifier in semi-supervised setting using a novel Expected Exponential loss function. We show that our framework provides superior performances on a wide range of medical image settings compared to existing strategies and that our method can be combined with current crowd-sourcing paradigms as well. △ Less

Submitted 16 July, 2017; originally announced July 2017.

Comments: 9 pages, 5 figues, MICCAI 2017 - LABELS Workshop

arXiv:1703.03365 [pdf, other]

Learning Active Learning from Data

Authors: Ksenia Konyushkova, Raphael Sznitman, Pascal Fua

Abstract: In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from… ▽ More In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from previous AL outcomes. We show that a strategy can be learnt either from simple synthetic 2D datasets or from a subset of domain-specific data. Our method yields strategies that work well on real data from a wide range of domains. △ Less

Submitted 14 July, 2017; v1 submitted 9 March, 2017; originally announced March 2017.

arXiv:1610.03628 [pdf, other]

RetiNet: Automatic AMD identification in OCT volumetric data

Authors: Stefanos Apostolopoulos, Carlos Ciller, Sandro I. De Zanet, Sebastian Wolf, Raphael Sznitman

Abstract: Optical Coherence Tomography (OCT) provides a unique ability to image the eye retina in 3D at micrometer resolution and gives ophthalmologist the ability to visualize retinal diseases such as Age-Related Macular Degeneration (AMD). While visual inspection of OCT volumes remains the main method for AMD identification, doing so is time consuming as each cross-section within the volume must be inspec… ▽ More Optical Coherence Tomography (OCT) provides a unique ability to image the eye retina in 3D at micrometer resolution and gives ophthalmologist the ability to visualize retinal diseases such as Age-Related Macular Degeneration (AMD). While visual inspection of OCT volumes remains the main method for AMD identification, doing so is time consuming as each cross-section within the volume must be inspected individually by the clinician. In much the same way, acquiring ground truth information for each cross-section is expensive and time consuming. This fact heavily limits the ability to acquire large amounts of ground truth, which subsequently impacts the performance of learning-based methods geared at automatic pathology identification. To avoid this burden, we propose a novel strategy for automatic analysis of OCT volumes where only volume labels are needed. That is, we train a classifier in a semi-supervised manner to conduct this task. Our approach uses a novel Convolutional Neural Network (CNN) architecture, that only needs volume-level labels to be trained to automatically asses whether an OCT volume is healthy or contains AMD. Our architecture involves first learning a cross-section pathology classifier using pseudo-labels that could be corrupted and then leverage these towards a more accurate volume-level classification. We then show that our approach provides excellent performances on a publicly available dataset and outperforms a number of existing automatic techniques. △ Less

Submitted 12 October, 2016; originally announced October 2016.

Comments: 14 pages, 10 figures, Code available

arXiv:1606.09029 [pdf, other]

doi 10.1016/j.cviu.2019.01.007

Geometry in Active Learning for Binary and Multi-class Image Segmentation

Authors: Ksenia Konyushkova, Raphael Sznitman, Pascal Fua

Abstract: We propose an active learning approach to image segmentation that exploits geometric priors to speed up and streamline the annotation process. It can be applied for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels o… ▽ More We propose an active learning approach to image segmentation that exploits geometric priors to speed up and streamline the annotation process. It can be applied for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels or voxels are the most informative, and thus should to be annotated next. For multi-class settings, we additionally introduce two novel criteria for uncertainty. In the 3D case, we use the resulting uncertainty measure to select voxels lying on a planar patch, which makes batch annotation much more convenient for the end user compared to the setting where voxels are randomly distributed in a volume. The planar patch is found using a branch-and-bound algorithm that looks for a 2D patch in a 3D volume where the most informative instances are located. We evaluate our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on regular images of horses and faces. We demonstrate a substantial performance increase over other approaches thanks to the use of geometric priors. △ Less

Submitted 4 April, 2019; v1 submitted 29 June, 2016; originally announced June 2016.

Comments: Extension of our previous paper arXiv:1508.04955

Journal ref: Published in "Computer Vision and Image Understanding" journal, 1077-3142, 2019

arXiv:1512.00747 [pdf, other]

Active Learning for Delineation of Curvilinear Structures

Authors: Agata Mosinska, Raphael Sznitman, Przemysław Głowacki, Pascal Fua

Abstract: Many recent delineation techniques owe much of their increased effectiveness to path classification algorithms that make it possible to distinguish promising paths from others. The downside of this development is that they require annotated training data, which is tedious to produce. In this paper, we propose an Active Learning approach that considerably speeds up the annotation process. Unlike… ▽ More Many recent delineation techniques owe much of their increased effectiveness to path classification algorithms that make it possible to distinguish promising paths from others. The downside of this development is that they require annotated training data, which is tedious to produce. In this paper, we propose an Active Learning approach that considerably speeds up the annotation process. Unlike standard ones, it takes advantage of the specificities of the delineation problem. It operates on a graph and can reduce the training set size by up to 80% without compromising the reconstruction quality. We will show that our approach outperforms conventional ones on various biomedical and natural image datasets, thus showing that it is broadly applicable. △ Less

Submitted 2 December, 2015; originally announced December 2015.

arXiv:1508.04955 [pdf, other]

Introducing Geometry in Active Learning for Image Segmentation

Authors: Ksenia Konyushkova, Raphael Sznitman, Pascal Fua

Abstract: We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes. To this end, we use these priors not only to select voxels most in need of annotation but to guarantee that they lie on 2D planar patch, which makes it much easier to annotate than if they were randomly distributed in the volume. A si… ▽ More We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes. To this end, we use these priors not only to select voxels most in need of annotation but to guarantee that they lie on 2D planar patch, which makes it much easier to annotate than if they were randomly distributed in the volume. A simplified version of this approach is effective in natural 2D images. We evaluated our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on natural images. Comparing our approach against several accepted baselines demonstrates a marked performance increase. △ Less

Submitted 20 August, 2015; originally announced August 2015.

arXiv:1007.1398 [pdf, other]

doi 10.1371/journal.pone.0011631

Multi-environment model estimation for motility analysis of Caenorhabditis Elegans

Authors: Raphael Sznitman, Manaswi Gupta, Gregory D. Hager, Paulo E. Arratia, Josue Sznitman

Abstract: The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across… ▽ More The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across a wide range of environments, including crawling on substrates, swimming in fluids, and locomoting through microfluidic substrates. However, each environment often requires customized image processing tools relying on heuristic parameter tuning. In the present study, we propose a novel Multi-Environment Model Estimation (MEME) framework for automated image segmentation that is versatile across various environments. The MEME platform is constructed around the concept of Mixture of Gaussian (MOG) models, where statistical models for both the background environment and the nematode appearance are explicitly learned and used to accurately segment a target nematode. Our method is designed to simplify the burden often imposed on users; here, only a single image which includes a nematode in its environment must be provided for model learning. In addition, our platform enables the extraction of nematode `skeletons' for straightforward motility quantification. We test our algorithm on various locomotive environments and compare performances with an intensity-based thresholding method. Overall, MEME outperforms the threshold-based approach for the overwhelming majority of cases examined. Ultimately, MEME provides researchers with an attractive platform for C. elegans' segmentation and `skeletonizing' across a wide range of motility assays. △ Less

Submitted 8 July, 2010; originally announced July 2010.

Comments: 21 pages, 8 figures, accepted in: PLoS ONE (2010)

arXiv:1003.5249 [pdf, other]

doi 10.1109/TPAMI.2010.106

Active Testing for Face Detection and Localization

Authors: Raphael Sznitman, Bruno Jedynak

Abstract: We provide a novel search technique, which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels. We provide a novel search technique, which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels. △ Less

Submitted 26 March, 2010; originally announced March 2010.

Comments: 16 pages, 5 figures, accepted in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2010

Showing 51–61 of 61 results for author: Sznitman, R