Showing 1–16 of 16 results for author: Wick, C

  1. arXiv:2110.05909  [pdf, other]

    cs.CV

    Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes

    Authors: Christoph Wick, Jochen Zöllner, Tobias Grüning

    Abstract: In contrast to Connectionist Temporal Classification (CTC) approaches, Sequence-To-Sequence (S2S) models for Handwritten Text Recognition (HTR) suffer from errors such as skipped or repeated words which often occur at the end of a sequence. In this paper, to combine the best of both approaches, we propose to use the CTC-Prefix-Score during S2S decoding. Hereby, during beam search, paths that are i…

    Submitted 29 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: 15 pages, 6 tables, 3 figures

  2. arXiv:2106.07881  [pdf]

    cs.CV

    Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning

    Authors: Christian Reul, Christoph Wick, Maximilian Nöth, Andreas Büttner, Maximilian Wehner, Uwe Springmann

    Abstract: In order to apply Optical Character Recognition (OCR) to historical printings of Latin script fully automatically, we report on our efforts to construct a widely-applicable polyfont recognition model yielding text with a Character Error Rate (CER) around 2% when applied out-of-the-box. Moreover, we show how this model can be further finetuned to specific classes of printings with little manual and…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: submitted to HIP'21

  3. Optimizing small BERTs trained for German NER

    Authors: Jochen Zöllner, Konrad Sperfeld, Christoph Wick, Roger Labahn

    Abstract: Currently, the most widespread neural network architecture for training language models is the so called BERT which led to improvements in various Natural Language Processing (NLP) tasks. In general, the larger the number of parameters in a BERT model, the better the results obtained in these NLP tasks. Unfortunately, the memory consumption and the training duration drastically increases with the…

    Submitted 1 November, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    Journal ref: MDPI Information 2021, vol. 12 nr. 11, article-nr. 443

  4. OCR4all -- An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings

    Authors: Christian Reul, Dennis Christ, Alexander Hartelt, Nico Balbach, Maximilian Wehner, Uwe Springmann, Christoph Wick, Christine Grundig, Andreas Büttner, Frank Puppe

    Abstract: Optical Character Recognition (OCR) on historical printings is a challenging task mainly due to the complexity of the layout and the highly variant typography. Nevertheless, in the last few years great progress has been made in the area of historical OCR, resulting in several powerful open-source tools for preprocessing, layout recognition and segmentation, character recognition and post-processin…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: submitted to MDPI - Applied Sciences

    Journal ref: https://www.mdpi.com/2076-3417/9/22/4853/htm

  5. arXiv:1905.06009  [pdf]

    physics.chem-ph cond-mat.mes-hall physics.comp-ph

    Structural Characterization of an Ionic Liquid in bulk and in nano-confined environment from MD simulations

    Authors: Natasa Vucemilovic-Alagic, Radha D. Banhatti, Robert Stepic, Christian R. Wick, Daniel Berger, Mario Gaimann, Andreas Bear, Jens Harting, David M. Smith, Ana-Suncana Smith

    Abstract: This article contains data on structural characterization of the [C2Mim][NTf2] in bulk and in nano-confined environment obtained using MD simulations. These data supplement those presented in the paper Insights from Molecular Dynamics Simulations on Structural Organization and Diffusive Dynamics of an Ionic Liquid at Solid and Vacuum Interfaces, where force fields with three different charge metho…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: 13 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:1903.09450

  6. arXiv:1903.09450  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Insights from Molecular Dynamics Simulations on Structural Organization and Diffusive Dynamics of an Ionic Liquid at Solid and Vacuum Interfaces

    Authors: Natasa Vucemilovic-Alagic, Radha D. Banhatti, Robert Stepic, Christian R. Wick, Daniel Berger, Mario U. Gaimann, Andreas Baer, Jens Harting, David M. Smith, Ana-Suncana Smith

    Abstract: Hypothesis: A prototypical modelling approach is required for a full characterisation of the static and equilibrium dynamical properties of confined ionic liquids (ILs), in order to gain predictive power of properties that are difficult to extract from experiments. Such a protocol needs to be constructed by benchmarking molecular dynamics simulations against available experiments. Simulations: We…

    Submitted 22 March, 2019; originally announced March 2019.

    Comments: 14 pages, 9 figures in main text and 14 figures in Supporting Information

    Journal ref: Journal of Colloid and Interface Science 553, 350-363 (2019)

  7. arXiv:1810.03436  [pdf]

    cs.CV

    State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines

    Authors: Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

    Abstract: In this paper we evaluate Optical Character Recognition (OCR) of 19th century Fraktur scripts without book-specific training using mixed models, i.e. models trained to recognize a variety of fonts and typesets from previously unseen sources. We describe the training process leading to strong mixed OCR models and compare them to freely available models of the popular open source engines OCRopus and…

    Submitted 8 October, 2018; originally announced October 2018.

    Comments: Submitted to DHd 2019 (https://dhd2019.org/) which demands a... creative... submission format. Consequently, some captions might look weird and some links aren't clickable. Extended version with more technical details and some fixes to follow

  8. arXiv:1807.02004  [pdf]

    cs.CV

    Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition

    Authors: Christoph Wick, Christian Reul, Frank Puppe

    Abstract: Optical Character Recognition (OCR) on contemporary and historical data is still in the focus of many researchers. Especially historical prints require book specific trained OCR models to achieve applicable results (Springmann and Lüdeling, 2016, Reul et al., 2017a). To reduce the human effort for manually annotating ground truth (GT) various techniques such as voting and pretraining have shown to…

    Submitted 6 August, 2018; v1 submitted 5 July, 2018; originally announced July 2018.

    Comments: 11 pages, 3 figures

    Journal ref: Digital Humanities Quarterly 14 (2), 2020

  9. arXiv:1802.10038  [pdf, other]

    cs.CV

    Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning

    Authors: Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

    Abstract: We combine three methods which significantly improve the OCR accuracy of OCR models trained on early printed books: (1) The pretraining method utilizes the information stored in already existing models trained on a variety of typesets (mixed models) instead of starting the training from scratch. (2) Performing cross fold training on a single set of ground truth data (line images and their transcri…

    Submitted 28 February, 2018; v1 submitted 27 February, 2018; originally announced February 2018.

    Comments: Submitted to JLCL Volume 33 (2018), Issue 1: Special Issue on Automatic Text and Layout Recognition

  10. arXiv:1802.10033  [pdf, other]

    cs.CV cs.DL

    Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

    Authors: Christoph Wick, Christian Reul, Frank Puppe

    Abstract: This paper proposes a combination of a convolutional and a LSTM network to improve the accuracy of OCR on early printed books. While the standard model of line based OCR uses a single LSTM layer, we utilize a CNN- and Pooling-Layer combination in advance of an LSTM layer. Due to the higher amount of trainable parameters the performance of the network relies on a high amount of training examples to…

    Submitted 27 February, 2018; originally announced February 2018.

    Comments: 16 pages, 4 figures, 8 tables, submitted to JLCL Volume 33 (2018), Issue 1

  11. arXiv:1712.05586  [pdf]

    cs.CV

    Transfer Learning for OCRopus Model Training on Early Printed Books

    Authors: Christian Reul, Christoph Wick, Uwe Springmann, Frank Puppe

    Abstract: A method is presented that significantly reduces the character error rates for OCR text obtained from OCRopus models trained on early printed books when only small amounts of diplomatic transcriptions are available. This is achieved by building from already existing models during training instead of starting from scratch. To overcome the discrepancies between the set of characters of the pretraine…

    Submitted 21 December, 2017; v1 submitted 15 December, 2017; originally announced December 2017.

  12. arXiv:1712.00967  [pdf, other]

    cs.CV

    Leaf Identification Using a Deep Convolutional Neural Network

    Authors: Christoph Wick, Frank Puppe

    Abstract: Convolutional neural networks (CNNs) have become popular especially in computer vision in the last few years because they achieved outstanding performance on different tasks, such as image classifications. We propose a nine-layer CNN for leaf identification using the famous Flavia and Foliage datasets. Usually the supervised learning of deep CNNs requires huge datasets for training. However, the u…

    Submitted 4 December, 2017; originally announced December 2017.

  13. Improving OCR Accuracy on Early Printed Books by utilizing Cross Fold Training and Voting

    Authors: Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

    Abstract: In this paper we introduce a method that significantly reduces the character error rates for OCR text obtained from OCRopus models trained on early printed books. The method uses a combination of cross fold training and confidence based voting. After allocating the available ground truth in different subsets several training processes are performed, each resulting in a specific OCR model. The OCR…

    Submitted 27 November, 2017; originally announced November 2017.

  14. Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images

    Authors: Christoph Wick, Frank Puppe

    Abstract: We propose a high-performance fully convolutional neural network (FCN) for historical document segmentation that is designed to process a single page in one step. The advantage of this model beside its speed is its ability to directly learn from raw pixels instead of using preprocessing steps e. g. feature computation or superpixel generation. We show that this network yields better results than e…

    Submitted 15 February, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

    Comments: 6 pages, 7 figures, conference

  15. arXiv:1706.04338  [pdf, other]

    physics.pop-ph

    Playing Music in Just Intonation - A Dynamically Adapting Tuning Scheme

    Authors: Karolin Stange, Christoph Wick, Haye Hinrichsen

    Abstract: We investigate a dynamically adapting tuning scheme for microtonal tuning of musical instruments, allowing the performer to play music in just intonation in any key. Unlike other methods, which are based on a procedural analysis of the chordal structure, the tuning scheme continually solves a system of linear equations without making explicit decisions. In complex situations, where not all interva…

    Submitted 11 June, 2018; v1 submitted 14 June, 2017; originally announced June 2017.

    Comments: 22 pages, 7 figures

  16. arXiv:1508.01652  [pdf, other]

    quant-ph cond-mat.stat-mech

    Entanglement formation under random interactions

    Authors: Christoph Wick, Jaegon Um, Haye Hinrichsen

    Abstract: The temporal evolution of the entanglement between two qubits evolving by random interactions is studied analytically and numerically. Two different types of randomness are investigated. Firstly we analyze an ensemble of systems with randomly chosen but time-independent interaction Hamiltonians. Secondly we consider the case of a temporally fluctuating Hamiltonian, where the unitary evolution can…

    Submitted 19 October, 2015; v1 submitted 7 August, 2015; originally announced August 2015.

    Comments: Latex, 24 pages, 4 figures, Supplement material in source archive