Showing 1–5 of 5 results for author: Schulz, K U

  1. arXiv:1803.04312  [pdf, other]

    cs.FL

    Space-Efficient Bimachine Construction Based on the Equalizer Accumulation Principle

    Authors: Stefan Gerdjikov, Stoyan Mihov, Klaus U. Schulz

    Abstract: Algorithms for building bimachines from functional transducers found in the literature in a run of the bimachine imitate one successful path of the input transducer. Each single bimachine output exactly corresponds to the output of a single transducer transition. Here we introduce an alternative construction principle where bimachine steps take alternative parallel transducer paths into account, m…

    Submitted 27 February, 2018; originally announced March 2018.

  2. arXiv:1606.05157  [pdf, other]

    cs.DL

    Automatic quality evaluation and (semi-) automatic improvement of OCR models for historical printings

    Authors: U. Springmann, F. Fink, K. U. Schulz

    Abstract: Good OCR results for historical printings rely on the availability of recognition models trained on diplomatic transcriptions as ground truth, which is both a scarce resource and time-consuming to generate. Instead of having to train a separate model for each historical typeface, we propose a strategy to start from models trained on a combined set of available transcriptions in a variety of fonts.…

    Submitted 20 October, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

  3. arXiv:1602.05772  [pdf, other]

    cs.CL

    Corpus analysis without prior linguistic knowledge - unsupervised mining of phrases and subphrase structure

    Authors: Stefan Gerdjikov, Klaus U. Schulz

    Abstract: When looking at the structure of natural language, "phrases" and "words" are central notions. We consider the problem of identifying such "meaningful subparts" of language of any length and underlying composition principles in a completely corpus-based and language-independent way without using any kind of prior linguistic knowledge. Unsupervised methods for identifying "phrases", mining subphrase…

    Submitted 18 February, 2016; originally announced February 2016.

  4. arXiv:1301.0722  [pdf, ps, other]

    cs.CL cs.DS

    Good parts first - a new algorithm for approximate search in lexica and string databases

    Authors: Stefan Gerdjikov, Stoyan Mihov, Petar Mitankin, Klaus U. Schulz

    Abstract: We present a new efficient method for approximate search in electronic lexica. Given an input string (the pattern) and a similarity threshold, the algorithm retrieves all entries of the lexicon that are sufficiently similar to the pattern. Search is organized in subsearches that always start with an exact partial match where a substring of the input pattern is aligned with a substring of a lexicon…

    Submitted 3 December, 2015; v1 submitted 4 January, 2013; originally announced January 2013.

  5. arXiv:cs/0602004  [pdf, ps, other]

    cs.DB cs.AI cs.CC cs.LO

    Conjunctive Queries over Trees

    Authors: Georg Gottlob, Christoph Koch, Klaus U. Schulz

    Abstract: We study the complexity and expressive power of conjunctive queries over unranked labeled trees represented using a variety of structure relations such as "child", "descendant", and "following" as well as unary relations for node labels. We establish a framework for characterizing structures representing trees for which conjunctive queries can be evaluated efficiently. Then we completely c…

    Submitted 2 February, 2006; originally announced February 2006.

    Comments: 36 pages, 12 figures, 2 tables, long version of PODS 2004 papers. To appear in Journal of the ACM 53(2), March 2006

    ACM Class: E.1; F.1.3; F.2.2; H.2.3; H.2.4; I.7.2