Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation

R, Amarnath; Nagabhushan, P.

doi:10.5120/ijca2017915133


arXiv:1708.05545 (cs)

[Submitted on 18 Aug 2017]

Title: Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation

Authors: Amarnath R, P. Nagabhushan


Abstract: Line separators are used to segregate text-lines from one another in document image analysis. Finding the separator points at every line terminal in a document image would enable text-line segmentation. Identifying the separators in handwritten text is particularly demanding, and doing so directly in the compressed version of a document image is more challenging still; that is the objective of this research. Working in the compressed domain avoids the computational burden of decompressing a document for text-line segmentation. Since document images are generally compressed using the run-length encoding (RLE) technique as per the CCITT standards, the first column in the RLE representation is a white column. The value (depth) in the white column is very low when a particular line is part of a text line, and the depth can be larger at the point of text-line separation. A longer consecutive sequence of such larger depths indicates the gap between text lines, which provides the separator region. For over-separation and under-separation, corrective actions of deletion and insertion, respectively, are suggested. Extensive experiments are conducted on compressed images from the benchmark datasets of ICDAR13 and Alireza et al. [17] to demonstrate the efficacy of the approach.
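
The core idea in the abstract (scan the first white-run column of the RLE data and look for long stretches of large depth) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `encode_row` and `find_separators`, the parameters `depth_threshold` and `min_rows`, and the toy image are all assumptions made for illustration, and the corrective deletion and insertion steps are omitted.

```python
from typing import List


def encode_row(pixels: List[int]) -> List[int]:
    """Run-length encode one binary row (0 = white, 1 = black).

    Runs alternate white, black, white, ... and always start with a
    white run, so a row that begins with black gets a leading 0.
    """
    runs, current, count = [], 0, 0
    for p in pixels:
        if p == current:
            count += 1
        else:
            runs.append(count)
            current, count = p, 1
    runs.append(count)
    return runs


def find_separators(rle_rows: List[List[int]],
                    depth_threshold: int,
                    min_rows: int) -> List[int]:
    """Return one row index per candidate gap between text lines.

    depth_threshold and min_rows are hypothetical tuning parameters:
    a row counts as "deep" when its first white run is at least
    depth_threshold, and a gap is reported only when at least
    min_rows consecutive rows are deep.
    """
    depths = [row[0] if row else 0 for row in rle_rows]
    separators, start = [], None
    # The trailing -1 is a sentinel that closes a run ending at the last row.
    for y, depth in enumerate(depths + [-1]):
        if depth >= depth_threshold:
            if start is None:
                start = y
        elif start is not None:
            if y - start >= min_rows:
                separators.append((start + y - 1) // 2)  # middle row of the gap
            start = None
    return separators


# Toy 7x6 image: two "text lines" separated by two blank rows.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
rle_rows = [encode_row(row) for row in image]
separators = find_separators(rle_rows, depth_threshold=4, min_rows=2)
print(separators)
```

Only the first entry of each RLE row is read, so the scan never touches decompressed pixels. Raising `min_rows` suppresses short gaps, which loosely corresponds to guarding against over-separation.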

Comments:	Line separators, Document image analysis, Handwritten text, Compression and decompression, RLE, CCITT. Line separator points at every line terminal in compressed handwritten document images, enabling text-line segmentation
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1708.05545 [cs.CV]
	(or arXiv:1708.05545v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1708.05545
Journal reference:	International Journal of Computer Applications 172(4): 40-47 (2017)
Related DOI:	https://doi.org/10.5120/ijca2017915133

Submission history

From: Amarnath R
[v1] Fri, 18 Aug 2017 09:51:17 UTC (858 KB)
