Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

Souibgui, Mohamed Ali; Biswas, Sanket; Mafla, Andres; Biten, Ali Furkan; Fornés, Alicia; Kessentini, Yousri; Lladós, Josep; Gomez, Lluis; Karatzas, Dimosthenis


arXiv:2203.04814 (cs)

[Submitted on 9 Mar 2022 (v1), last revised 18 Aug 2022 (this version, v4)]


Abstract: In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks: text recognition (handwritten or scene text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives, optimized during pre-training without using labeled data. Each of the pretext objectives is specifically tailored to the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit the limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state of the art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at http://Upon_Acceptance.
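As a rough illustration of what a degradation-based pretext objective can look like, the sketch below builds (degraded input, clean target) pairs from an unlabeled image and sums a reconstruction error over them. The abstract does not name the three pretext tasks, so the masking, blurring, and additive-noise degradations here are illustrative assumptions, as are all function names and parameters; no transformer is included, and `reconstruct` stands in for whatever model would be trained.

```python
import random

def mask_patches(img, patch=2, ratio=0.5, rng=None):
    """Assumed degradation 1: zero out a random subset of patch x patch blocks."""
    rng = rng or random.Random(0)
    out = [row[:] for row in img]
    h, w = len(img), len(img[0])
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            if rng.random() < ratio:
                for y in range(top, min(top + patch, h)):
                    for x in range(left, min(left + patch, w)):
                        out[y][x] = 0.0
    return out

def box_blur(img):
    """Assumed degradation 2: 3x3 mean filter; borders average existing neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def add_noise(img, sigma=0.1, rng=None):
    """Assumed degradation 3: additive Gaussian noise, clipped to [0, 1]."""
    rng = rng or random.Random(0)
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in img]

def mse(a, b):
    """Mean squared error between two equally sized 2D images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def make_pretext_pairs(img, rng=None):
    """Return {task: (degraded, target)}; the clean image is its own label."""
    rng = rng or random.Random(0)
    return {
        "mask": (mask_patches(img, rng=rng), img),
        "blur": (box_blur(img), img),
        "noise": (add_noise(img, rng=rng), img),
    }

def pretext_loss(reconstruct, img, rng=None):
    """Sum the reconstruction error of `reconstruct` over all pretext tasks."""
    pairs = make_pretext_pairs(img, rng)
    return sum(mse(reconstruct(deg), target) for deg, target in pairs.values())

# Usage: a 4x4 checkerboard as a stand-in for an unlabeled text image.
image = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
identity_loss = pretext_loss(lambda degraded: degraded, image)
```

The point of the sketch is only that the supervision signal comes from the image itself: a model that leaves the degraded input unchanged incurs a positive loss, while one that recovers the clean image drives it to zero.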

Comments:	Preprint
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.04814 [cs.CV]
	(or arXiv:2203.04814v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.04814

Submission history

From: Mohamed Ali Souibgui
[v1] Wed, 9 Mar 2022 15:44:36 UTC (3,816 KB)
[v2] Thu, 10 Mar 2022 17:39:02 UTC (3,816 KB)
[v3] Wed, 16 Mar 2022 15:12:56 UTC (9,754 KB)
[v4] Thu, 18 Aug 2022 14:29:56 UTC (16,414 KB)
