PAT: Pixel-wise Adaptive Training for Long-tailed Segmentation

Do, Khoi; Nguyen, Duong; Tran, Nguyen H.; Nguyen, Viet Dung

Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.05393 (cs)

[Submitted on 8 Apr 2024 (v1), last revised 20 Oct 2024 (this version, v4)]

Abstract: Beyond class frequency, we recognize the impact of class-wise relationships among various class-specific predictions and the imbalance in label masks on long-tailed segmentation learning. To address these challenges, we propose an innovative Pixel-wise Adaptive Training (PAT) technique tailored for long-tailed segmentation. PAT has two key features: 1) class-wise gradient magnitude homogenization, and 2) pixel-wise class-specific loss adaptation (PCLA). First, the class-wise gradient magnitude homogenization helps alleviate the imbalance among label masks by ensuring equal consideration of the class-wise impact on model updates. Second, PCLA tackles the detrimental impact of both rare classes within the long-tailed distribution and inaccurate predictions from previous training stages by encouraging the learning of classes with low prediction confidence and guarding against forgetting classes with high confidence. This combined approach fosters robust learning while preventing the model from forgetting previously learned knowledge. PAT exhibits significant performance improvements, surpassing the current state of the art by 2.2% on the NYU dataset. Moreover, it enhances overall pixel-wise accuracy by 2.85% and intersection over union by 2.07%, with a particularly notable decline of 0.39% in detecting rare classes compared to Balance Logits Variation, as demonstrated on three popular datasets, i.e., OxfordPetIII, CityScape, and NYU.
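
The two ideas named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' formulation: the abstract gives no equations, so the weighting function `confidence_weight`, its `temperature` parameter, and the per-class averaging used here as a loss-level stand-in for gradient magnitude homogenization are all assumptions chosen for illustration. The sketch only shows the general shape of a loss in which low-confidence pixels receive more weight, confident pixels still contribute, and each class present in the label mask counts equally regardless of how many pixels it covers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weight(confidence, temperature=1.0):
    """Hypothetical per-pixel weight (an assumption, not taken from the paper).

    Lower confidence in the true class gives a larger weight; the weight
    stays above 1 so confidently predicted pixels keep contributing.
    """
    return 1.0 + math.exp(-confidence / temperature)


def class_balanced_adaptive_loss(pixel_logits, pixel_labels, num_classes,
                                 temperature=1.0):
    """Mean over present classes of the per-class mean weighted cross-entropy."""
    class_sums = [0.0] * num_classes
    class_counts = [0] * num_classes
    for logits, label in zip(pixel_logits, pixel_labels):
        confidence = softmax(logits)[label]
        cross_entropy = -math.log(max(confidence, 1e-12))
        class_sums[label] += confidence_weight(confidence, temperature) * cross_entropy
        class_counts[label] += 1
    # Averaging inside each class first makes the result independent of
    # how many pixels each class occupies in the label mask.
    per_class = [s / n for s, n in zip(class_sums, class_counts) if n > 0]
    return sum(per_class) / len(per_class) if per_class else 0.0


# Toy input: three pixels of a frequent class and one pixel of a rare class.
toy_logits = [[2.0, 0.0]] * 3 + [[0.0, 1.0]]
toy_labels = [0, 0, 0, 1]
print(class_balanced_adaptive_loss(toy_logits, toy_labels, num_classes=2))
```

Because the loss is averaged within each class before averaging across classes, adding more pixels of the frequent class to the toy input leaves the value unchanged, which is the balancing behaviour the sketch is meant to show.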

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.05393 [cs.CV]
	(or arXiv:2404.05393v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.05393

Submission history

From: Khoi Do
[v1] Mon, 8 Apr 2024 10:52:29 UTC (31,814 KB)
[v2] Tue, 9 Apr 2024 09:52:32 UTC (15,911 KB)
[v3] Wed, 10 Jul 2024 06:26:35 UTC (15,947 KB)
[v4] Sun, 20 Oct 2024 16:20:16 UTC (6,235 KB)
