Few Shots Are All You Need: A Progressive Few Shot Learning Approach for Low Resource Handwritten Text Recognition

Souibgui, Mohamed Ali; Fornés, Alicia; Kessentini, Yousri; Megyesi, Beáta

doi:10.1016/j.patrec.2022.06.003

Computer Science > Computer Vision and Pattern Recognition

arXiv:2107.10064 (cs)

[Submitted on 21 Jul 2021 (v1), last revised 13 Jun 2022 (this version, v3)]

Title: Few Shots Are All You Need: A Progressive Few Shot Learning Approach for Low Resource Handwritten Text Recognition

Authors: Mohamed Ali Souibgui, Alicia Fornés, Yousri Kessentini, Beáta Megyesi

Abstract: Handwritten text recognition in low-resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. The main difficulty comes from the scarcity of annotated data and the limited linguistic information (e.g., dictionaries and language models). We therefore propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation effort, requiring only a few images of each alphabet symbol. The method consists of detecting all the symbols of a given alphabet in a text-line image and decoding the obtained similarity scores into the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from any alphabet, even one that differs from the target domain. A second training step is then applied to reduce the gap between the source and target data. Since this retraining would require annotating thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unannotated data. The evaluation on different manuscript datasets shows that our model can achieve competitive results with a significant reduction in human effort. The code will be publicly available in this repository: this https URL
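
The abstract names two steps without giving their details: decoding per-symbol similarity scores into a transcription, and assigning pseudo-labels to unannotated data. The following is a minimal sketch of what such steps could look like, not the authors' implementation. The greedy thresholded decoding rule, the confidence cut-off, and all names (`decode_similarity`, `select_pseudo_labels`, `threshold`, `min_confidence`) are assumptions introduced here for illustration.

```python
from typing import Dict, List, Optional, Tuple


def decode_similarity(
    scores: Dict[str, List[float]], threshold: float = 0.5
) -> List[str]:
    """Greedily decode per-symbol similarity scores along a text line.

    `scores` maps each alphabet symbol to one similarity value per
    horizontal position (hypothetical layout). At every position the
    best-scoring symbol is kept if it reaches `threshold`; a run of the
    same symbol collapses to one occurrence, and a below-threshold
    position ends the current run.
    """
    if not scores:
        return []
    width = len(next(iter(scores.values())))
    decoded: List[str] = []
    previous: Optional[str] = None  # symbol at the previous position, None for a gap
    for x in range(width):
        symbol, value = max(
            ((s, row[x]) for s, row in scores.items()), key=lambda p: p[1]
        )
        current = symbol if value >= threshold else None
        if current is not None and current != previous:
            decoded.append(current)
        previous = current
    return decoded


def select_pseudo_labels(
    predictions: List[Tuple[str, str, float]], min_confidence: float
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split predictions into accepted pseudo-labels and still-unlabeled ids.

    Each prediction is (sample_id, predicted_symbol, confidence). Samples
    at or above `min_confidence` receive a pseudo-label; the rest are left
    for a later round (assumed selection rule).
    """
    accepted = [(i, s) for i, s, c in predictions if c >= min_confidence]
    remaining = [i for i, _, c in predictions if c < min_confidence]
    return accepted, remaining


if __name__ == "__main__":
    toy_scores = {
        "a": [0.9, 0.8, 0.1, 0.2, 0.1],
        "b": [0.1, 0.2, 0.1, 0.9, 0.7],
    }
    print(decode_similarity(toy_scores))

    toy_predictions = [("img1", "a", 0.95), ("img2", "b", 0.40), ("img3", "a", 0.70)]
    print(select_pseudo_labels(toy_predictions, min_confidence=0.6))
```

In a progressive scheme of this kind, the accepted pseudo-labels would be added to the training set and the model retrained before the remaining samples are scored again; how the paper schedules these rounds is not stated in the abstract.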

Comments:	Accepted in Pattern Recognition Letters
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2107.10064 [cs.CV]
	(or arXiv:2107.10064v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2107.10064
Related DOI:	https://doi.org/10.1016/j.patrec.2022.06.003

Submission history

From: Mohamed Ali Souibgui
[v1] Wed, 21 Jul 2021 13:18:21 UTC (2,369 KB)
[v2] Fri, 28 Jan 2022 14:59:38 UTC (2,500 KB)
[v3] Mon, 13 Jun 2022 11:22:21 UTC (2,500 KB)
