Dissecting Self-Supervised Learning Methods for Surgical Computer Vision

Ramesh, Sanat; Srivastav, Vinkle; Alapatt, Deepak; Yu, Tong; Murali, Aditya; Sestini, Luca; Nwoye, Chinedu Innocent; Hamoud, Idris; Sharma, Saurav; Fleurentin, Antoine; Exarchakis, Georgios; Karargyris, Alexandros; Padoy, Nicolas


arXiv:2207.00449 (cs)

[Submitted on 1 Jul 2022 (v1), last revised 31 May 2023 (this version, v3)]


Authors: Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy


Abstract: The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing useful representations to be learned from unlabeled data alone. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL (up to 7.4% on phase recognition and 20% on tool presence detection), as well as over state-of-the-art semi-supervised phase recognition approaches by up to 14%. Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties. The code is available at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2207.00449 [cs.CV]
	(or arXiv:2207.00449v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2207.00449

Submission history

From: Sanat Ramesh
[v1] Fri, 1 Jul 2022 14:17:11 UTC (6,575 KB)
[v2] Wed, 8 Feb 2023 12:52:49 UTC (9,610 KB)
[v3] Wed, 31 May 2023 09:08:11 UTC (9,609 KB)
