-
Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery
Authors:
Sebastian Bodenstedt,
Max Allan,
Anthony Agustinos,
Xiaofei Du,
Luis Garcia-Peraza-Herrera,
Hannes Kenngott,
Thomas Kurmann,
Beat Müller-Stich,
Sebastien Ourselin,
Daniil Pakhomov,
Raphael Sznitman,
Marvin Teichmann,
Martin Thoma,
Tom Vercauteren,
Sandrine Voros,
Martin Wagner,
Pamela Wochner,
Lena Maier-Hein,
Danail Stoyanov,
Stefanie Speidel
Abstract:
Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are…
▽ More
Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are common image data sets for consistent evaluation and benchmarking of algorithms against each other. The paper presents a comparative validation study of different vision-based methods for instrument segmentation and tracking in the context of robotic as well as conventional laparoscopic surgery. The contribution of the paper is twofold: we introduce a comprehensive validation data set that was provided to the study participants and present the results of the comparative validation study. Based on the results of the validation study, we arrive at the conclusion that modern deep learning approaches outperform other methods in instrument segmentation tasks, but the results are still not perfect. Furthermore, we show that merging results from different methods actually significantly increases accuracy in comparison to the best stand-alone method. On the other hand, the results of the instrument tracking task show that this is still an open challenge, especially during challenging scenarios in conventional laparoscopic surgery.
△ Less
Submitted 7 May, 2018;
originally announced May 2018.
-
Simultaneous Recognition and Pose Estimation of Instruments in Minimally Invasive Surgery
Authors:
Thomas Kurmann,
Pablo Marquez Neila,
Xiaofei Du,
Pascal Fua,
Danail Stoyanov,
Sebastian Wolf,
Raphael Sznitman
Abstract:
Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their pa…
▽ More
Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their parts. We use a Convolutional Neural Network architecture to embody our model and show that the cross-entropy loss is well suited to optimize its parameters which can be trained in an end-to-end fashion. An additional advantage of our approach is that instrument detection at test time is achieved while avoiding the need for scale-dependent sliding window evaluation. This allows our approach to be relatively parameter free at test time and shows good performance for both instrument detection and tracking. We show that our approach surpasses state-of-the-art results on in-vivo retinal microsurgery image data, as well as ex-vivo laparoscopic sequences.
△ Less
Submitted 18 October, 2017;
originally announced October 2017.
-
Pathological OCT Retinal Layer Segmentation using Branch Residual U-shape Networks
Authors:
Stefanos Apostolopoulos,
Sandro De Zanet,
Carlos Ciller,
Sebastian Wolf,
Raphael Sznitman
Abstract:
The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dila…
▽ More
The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully Convolutional Neural Network (CNN) architecture which combines dilated residual blocks in an asymmetric U-shape configuration, and can segment multiple layers of highly pathological eyes in one shot. We validate our approach on a dataset of late-stage AMD patients and demonstrate lower computational costs and higher performance compared to other state-of-the-art methods.
△ Less
Submitted 16 July, 2017;
originally announced July 2017.
-
Expected exponential loss for gaze-based video and volume ground truth annotation
Authors:
Laurent Lejeune,
Mario Christoudias,
Raphael Sznitman
Abstract:
Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework to allow annotators to simply observe the object to segment and record…
▽ More
Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework to allow annotators to simply observe the object to segment and record where they have looked at with a \$200 eye gaze tracker. Our method then estimates pixel-wise probabilities for the presence of the object throughout the sequence from which we train a classifier in semi-supervised setting using a novel Expected Exponential loss function. We show that our framework provides superior performances on a wide range of medical image settings compared to existing strategies and that our method can be combined with current crowd-sourcing paradigms as well.
△ Less
Submitted 16 July, 2017;
originally announced July 2017.
-
Learning Active Learning from Data
Authors:
Ksenia Konyushkova,
Raphael Sznitman,
Pascal Fua
Abstract:
In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from…
▽ More
In this paper, we suggest a novel data-driven approach to active learning (AL). The key idea is to train a regressor that predicts the expected error reduction for a candidate sample in a particular learning state. By formulating the query selection procedure as a regression problem we are not restricted to working with existing AL heuristics; instead, we learn strategies based on experience from previous AL outcomes. We show that a strategy can be learnt either from simple synthetic 2D datasets or from a subset of domain-specific data. Our method yields strategies that work well on real data from a wide range of domains.
△ Less
Submitted 14 July, 2017; v1 submitted 9 March, 2017;
originally announced March 2017.
-
RetiNet: Automatic AMD identification in OCT volumetric data
Authors:
Stefanos Apostolopoulos,
Carlos Ciller,
Sandro I. De Zanet,
Sebastian Wolf,
Raphael Sznitman
Abstract:
Optical Coherence Tomography (OCT) provides a unique ability to image the eye retina in 3D at micrometer resolution and gives ophthalmologist the ability to visualize retinal diseases such as Age-Related Macular Degeneration (AMD). While visual inspection of OCT volumes remains the main method for AMD identification, doing so is time consuming as each cross-section within the volume must be inspec…
▽ More
Optical Coherence Tomography (OCT) provides a unique ability to image the eye retina in 3D at micrometer resolution and gives ophthalmologist the ability to visualize retinal diseases such as Age-Related Macular Degeneration (AMD). While visual inspection of OCT volumes remains the main method for AMD identification, doing so is time consuming as each cross-section within the volume must be inspected individually by the clinician. In much the same way, acquiring ground truth information for each cross-section is expensive and time consuming. This fact heavily limits the ability to acquire large amounts of ground truth, which subsequently impacts the performance of learning-based methods geared at automatic pathology identification. To avoid this burden, we propose a novel strategy for automatic analysis of OCT volumes where only volume labels are needed. That is, we train a classifier in a semi-supervised manner to conduct this task. Our approach uses a novel Convolutional Neural Network (CNN) architecture, that only needs volume-level labels to be trained to automatically asses whether an OCT volume is healthy or contains AMD. Our architecture involves first learning a cross-section pathology classifier using pseudo-labels that could be corrupted and then leverage these towards a more accurate volume-level classification. We then show that our approach provides excellent performances on a publicly available dataset and outperforms a number of existing automatic techniques.
△ Less
Submitted 12 October, 2016;
originally announced October 2016.
-
Geometry in Active Learning for Binary and Multi-class Image Segmentation
Authors:
Ksenia Konyushkova,
Raphael Sznitman,
Pascal Fua
Abstract:
We propose an active learning approach to image segmentation that exploits geometric priors to speed up and streamline the annotation process. It can be applied for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels o…
▽ More
We propose an active learning approach to image segmentation that exploits geometric priors to speed up and streamline the annotation process. It can be applied for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels or voxels are the most informative, and thus should to be annotated next. For multi-class settings, we additionally introduce two novel criteria for uncertainty. In the 3D case, we use the resulting uncertainty measure to select voxels lying on a planar patch, which makes batch annotation much more convenient for the end user compared to the setting where voxels are randomly distributed in a volume. The planar patch is found using a branch-and-bound algorithm that looks for a 2D patch in a 3D volume where the most informative instances are located. We evaluate our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on regular images of horses and faces. We demonstrate a substantial performance increase over other approaches thanks to the use of geometric priors.
△ Less
Submitted 4 April, 2019; v1 submitted 29 June, 2016;
originally announced June 2016.
-
Active Learning for Delineation of Curvilinear Structures
Authors:
Agata Mosinska,
Raphael Sznitman,
Przemysław Głowacki,
Pascal Fua
Abstract:
Many recent delineation techniques owe much of their increased effectiveness to path classification algorithms that make it possible to distinguish promising paths from others. The downside of this development is that they require annotated training data, which is tedious to produce.
In this paper, we propose an Active Learning approach that considerably speeds up the annotation process. Unlike…
▽ More
Many recent delineation techniques owe much of their increased effectiveness to path classification algorithms that make it possible to distinguish promising paths from others. The downside of this development is that they require annotated training data, which is tedious to produce.
In this paper, we propose an Active Learning approach that considerably speeds up the annotation process. Unlike standard ones, it takes advantage of the specificities of the delineation problem. It operates on a graph and can reduce the training set size by up to 80% without compromising the reconstruction quality.
We will show that our approach outperforms conventional ones on various biomedical and natural image datasets, thus showing that it is broadly applicable.
△ Less
Submitted 2 December, 2015;
originally announced December 2015.
-
Introducing Geometry in Active Learning for Image Segmentation
Authors:
Ksenia Konyushkova,
Raphael Sznitman,
Pascal Fua
Abstract:
We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes. To this end, we use these priors not only to select voxels most in need of annotation but to guarantee that they lie on 2D planar patch, which makes it much easier to annotate than if they were randomly distributed in the volume. A si…
▽ More
We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes. To this end, we use these priors not only to select voxels most in need of annotation but to guarantee that they lie on 2D planar patch, which makes it much easier to annotate than if they were randomly distributed in the volume. A simplified version of this approach is effective in natural 2D images. We evaluated our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on natural images. Comparing our approach against several accepted baselines demonstrates a marked performance increase.
△ Less
Submitted 20 August, 2015;
originally announced August 2015.
-
Multi-environment model estimation for motility analysis of Caenorhabditis Elegans
Authors:
Raphael Sznitman,
Manaswi Gupta,
Gregory D. Hager,
Paulo E. Arratia,
Josue Sznitman
Abstract:
The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across…
▽ More
The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across a wide range of environments, including crawling on substrates, swimming in fluids, and locomoting through microfluidic substrates. However, each environment often requires customized image processing tools relying on heuristic parameter tuning. In the present study, we propose a novel Multi-Environment Model Estimation (MEME) framework for automated image segmentation that is versatile across various environments. The MEME platform is constructed around the concept of Mixture of Gaussian (MOG) models, where statistical models for both the background environment and the nematode appearance are explicitly learned and used to accurately segment a target nematode. Our method is designed to simplify the burden often imposed on users; here, only a single image which includes a nematode in its environment must be provided for model learning. In addition, our platform enables the extraction of nematode `skeletons' for straightforward motility quantification. We test our algorithm on various locomotive environments and compare performances with an intensity-based thresholding method. Overall, MEME outperforms the threshold-based approach for the overwhelming majority of cases examined. Ultimately, MEME provides researchers with an attractive platform for C. elegans' segmentation and `skeletonizing' across a wide range of motility assays.
△ Less
Submitted 8 July, 2010;
originally announced July 2010.
-
Active Testing for Face Detection and Localization
Authors:
Raphael Sznitman,
Bruno Jedynak
Abstract:
We provide a novel search technique, which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels.
We provide a novel search technique, which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels.
△ Less
Submitted 26 March, 2010;
originally announced March 2010.