Surgeons vs. Computer Vision: A comparative analysis on surgical phase recognition capabilities

Mezzina, Marco; De Backer, Pieter; Vercauteren, Tom; Blaschko, Matthew; Mottrie, Alexandre; Tuytelaars, Tinne

doi:10.1007/s11548-025-03383-4

arXiv:2504.18954 (eess)

[Submitted on 26 Apr 2025]

Abstract: Purpose: Automated Surgical Phase Recognition (SPR) uses Artificial Intelligence (AI) to segment the surgical workflow into its key events, serving as a building block for efficient video review, surgical education, and skill assessment. Previous research has focused on short, linear surgical procedures and has not explored whether temporal context improves experts' ability to classify surgical phases. This research addresses these gaps, focusing on Robot-Assisted Partial Nephrectomy (RAPN) as a highly non-linear procedure. Methods: Urologists of varying expertise were grouped and asked to indicate the surgical phase for RAPN on both single frames and video snippets using a custom-made web platform. Participants reported their confidence levels and the visual landmarks used in their decision-making. AI architectures with and without temporal context, previously trained and benchmarked on the Cholec80 dataset, were subsequently trained on this RAPN dataset. Results: Video snippets and the presence of specific visual landmarks improved phase classification accuracy across all groups. Surgeons displayed high confidence in their classifications and outperformed novices, who struggled to discriminate between phases. The performance of the AI models was comparable to that of the surgeons in the survey, with improvements in both cases when temporal context was incorporated. Conclusion: SPR is an inherently complex task for both expert surgeons and computer vision, which perform equally well when given the same context. Performance increases when temporal information is provided. Surgical tools and organs form the key landmarks for human interpretation and are expected to shape the future of automated SPR.

Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2504.18954 [eess.IV]
	(or arXiv:2504.18954v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2504.18954
Related DOI:	https://doi.org/10.1007/s11548-025-03383-4

Submission history

From: Marco Mezzina Ir.
[v1] Sat, 26 Apr 2025 15:37:22 UTC (6,066 KB)
