Showing 1–6 of 6 results for author: Kohút, J

  1. arXiv:2503.19658

    cs.CV cs.AI cs.LG

    BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction

    Authors: Jan Kohút, Martin Dočekal, Michal Hradiš, Marek Vaško

    Abstract: Manual digitization of bibliographic metadata is time consuming and labor intensive, especially for historical and real-world archives with highly variable formatting across documents. Despite advances in machine learning, the absence of dedicated datasets for metadata extraction hinders automation. To address this gap, we introduce BiblioPage, a dataset of scanned title pages annotated with struc…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Submitted to ICDAR2025 conference

  2. arXiv:2503.19546

    cs.CV

    Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts

    Authors: Jan Kohút, Michal Hradiš

    Abstract: A common use case for OCR applications involves users uploading documents and progressively correcting automatic recognition to obtain the final transcript. This correction phase presents an opportunity for progressive adaptation of the OCR model, making it crucial to adapt early, while ensuring stability and reliability. We demonstrate that state-of-the-art transformer-based models can effectivel…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Submitted to ICDAR2025 conference

  3. arXiv:2302.06318

    cs.CV

    Towards Writing Style Adaptation in Handwriting Recognition

    Authors: Jan Kohút, Michal Hradiš, Martin Kišš

    Abstract: One of the challenges of handwriting recognition is to transcribe a large number of vastly different writing styles. State-of-the-art approaches do not explicitly use information about the writer's style, which may be limiting overall accuracy due to various ambiguities. We explore models with writer-dependent parameters which take the writer's identity as an additional input. The proposed models…

    Submitted 30 April, 2025; v1 submitted 13 February, 2023; originally announced February 2023.

  4. arXiv:2302.06308

    cs.CV

    Fine-tuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

    Authors: Jan Kohút, Michal Hradiš

    Abstract: In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple fine-tuning with data augmentation works surprisingly well in such scenarios and that…

    Submitted 30 April, 2025; v1 submitted 13 February, 2023; originally announced February 2023.

  5. arXiv:2201.09575

    cs.CV

    Importance of Textlines in Historical Document Classification

    Authors: Martin Kišš, Jan Kohút, Karel Beneš, Michal Hradiš

    Abstract: This paper describes a system prepared at Brno University of Technology for ICDAR 2021 Competition on Historical Document Classification, experiments leading to its design, and the main findings. The solved tasks include script and font classification, document origin localization, and dating. We combined patch-level and line-level approaches, where the line-level system utilizes an existing, publ…

    Submitted 30 March, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

    Comments: 13 pages, 7 figures, 5 tables

    MSC Class: 68T07; 68T10

  6. TS-Net: OCR Trained to Switch Between Text Transcription Styles

    Authors: Jan Kohút, Michal Hradiš

    Abstract: Users of OCR systems, from different institutions and scientific disciplines, prefer and produce different transcription styles. This presents a problem for training of consistent text recognition neural networks on real-world data. We propose to extend existing text recognition networks with a Transcription Style Block (TSB) which can learn from data to switch between multiple transcription style…

    Submitted 13 February, 2023; v1 submitted 9 March, 2021; originally announced March 2021.

    Journal ref: ICDAR 2021: Proceedings, Part IV 16 (pp. 478-493)