Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition

Xie, Xudong; Fu, Ling; Zhang, Zhifei; Wang, Zhaowen; Bai, Xiang

Abstract: Artistic text recognition is an extremely challenging task with a wide range of applications. However, current scene text recognition methods mainly focus on irregular text and have not explored artistic text specifically. The challenges of artistic text recognition include varied appearances arising from specially designed fonts and effects, complex connections and overlaps between characters, and severe interference from background patterns. To alleviate these problems, we propose to recognize artistic text at three levels. First, corner points are used to guide the extraction of local features inside characters, since corner structures are robust to changes in appearance and shape. The discreteness of the corner points cuts off the connections between characters, and their sparsity improves robustness to background interference. Second, we design a character contrastive loss to model character-level features, improving the feature representation for character classification. Third, we use a Transformer to learn global image-level features and to model the global relationships among the corner points, with the assistance of a corner-query cross-attention mechanism. In addition, we provide an artistic text dataset to benchmark performance. Experimental results verify the significant superiority of our proposed method on artistic text recognition, and the method also achieves state-of-the-art performance on several blurred and perspective datasets.
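
The corner-query cross-attention mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulation, so this assumes, as the name suggests, that corner-point features act as queries while image features supply keys and values. It uses plain single-head scaled dot-product attention with no learned projections; the names `cross_attention`, `corner_feats`, and `image_feats` are hypothetical and the feature values are toy numbers, not the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (no learned projections).

    queries: list of d-dim vectors (assumed here: corner-point features)
    keys, values: lists of vectors (assumed here: image features)
    Returns the attended outputs and the attention weights per query.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Toy data: two corner queries attending over three image-feature positions.
corner_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
fused, attn = cross_attention(corner_feats, image_feats, image_feats)
```

In this toy setup each corner query places most of its attention on the image position whose feature points in the same direction, which is the sense in which sparse corner cues could steer where the recognizer looks. A real model would add learned query, key, and value projections and multiple heads.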

Comments:	Accepted by ECCV 2022 as an oral paper. The dataset and code are available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2208.00438 [cs.CV]
	(or arXiv:2208.00438v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2208.00438
