DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

Ghamsarian, Negin; Taschwer, Mario; Sznitman, Raphael; Schoeffmann, Klaus

arXiv:2207.01453 (cs)

[Submitted on 4 Jul 2022]

Abstract: Semantic segmentation in cataract surgery has a wide range of applications that contribute to better surgical outcomes and reduced clinical risk. However, the relevant structures in these surgeries pose different segmentation challenges, which makes designing a single network for all of them difficult. This paper proposes a semantic segmentation network, termed DeepPyramid, that addresses these challenges with three novelties: (1) a Pyramid View Fusion module, which provides a varying-angle global view of the surrounding region centered at each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module, which enables a wide deformable receptive field that can adapt to geometric transformations in the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. Combined, we show that these modules can effectively boost semantic segmentation performance, especially in the case of transparency, deformability, scalability, and blunt edges in objects. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods by a large margin (3.66% overall improvement in intersection over union compared to the best rival approach).
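
The abstract reports its headline result in intersection over union (IoU). As a reminder of what that metric measures, here is a minimal sketch for two binary masks. It is not taken from the paper: the function name `iou`, the nested-list mask format, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def iou(pred, target):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (an assumed format for this sketch).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Assumed convention: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks: 2 pixels overlap, 4 pixels are covered in total.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = iou(pred, target)  # 2 / 4 = 0.5
```

For a multi-class segmentation task, a score like this would typically be computed per class and then averaged; how the paper aggregates its scores is not stated in the abstract.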

Comments:	11 pages, 4 figures, accepted at the 25th International Conference on Medical Image Computing & Computer Assisted Intervention (MICCAI 2022). arXiv admin note: substantial text overlap with arXiv:2109.05352
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2207.01453 [cs.CV]
	(or arXiv:2207.01453v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2207.01453

Submission history

From: Negin Ghamsarian
[v1] Mon, 4 Jul 2022 14:41:45 UTC (13,499 KB)
