
Showing 1–5 of 5 results for author: Brychcín, T

  1. arXiv:1807.04175  [pdf, other]

    cs.CL

    Cross-lingual Word Analogies using Linear Transformations between Semantic Spaces

    Authors: Tomáš Brychcín, Stephen Eugene Taylor, Lukáš Svoboda

    Abstract: We generalize the word analogy task across languages, to provide a new intrinsic evaluation method for cross-lingual semantic spaces. We experiment with six languages within different language families, including English, German, Spanish, Italian, Czech, and Croatian. State-of-the-art monolingual semantic spaces are transformed into a shared space using dictionaries of word translations. We compar…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: 11 pages. arXiv admin note: text overlap with arXiv:1807.04172

  2. arXiv:1807.04172  [pdf, other]

    cs.CL

    Linear Transformations for Cross-lingual Semantic Textual Similarity

    Authors: Tomáš Brychcín

    Abstract: Cross-lingual semantic textual similarity systems estimate the degree of meaning similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ machine translation and combine a vast number of features, making the approach strongly supervised, resource-rich, and difficult to use for poorly-resourced languages. In this paper, we study linear transform…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: 11 pages

  3. arXiv:1612.06572  [pdf, other]

    cs.CL

    Unsupervised Dialogue Act Induction using Gaussian Mixtures

    Authors: Tomáš Brychcín, Pavel Král

    Abstract: This paper introduces a new unsupervised approach for dialogue act induction. Given a sequence of dialogue utterances, the task is to assign them labels representing their function in the dialogue. Utterances are represented as real-valued vectors encoding their meaning. We model the dialogue as a hidden Markov model with emission probabilities estimated by Gaussian mixtures. We use Gibbs sa…

    Submitted 8 February, 2017; v1 submitted 20 December, 2016; originally announced December 2016.

    Comments: Accepted to EACL 2017

  4. arXiv:1608.00789  [pdf, other]

    cs.CL

    New word analogy corpus for exploring embeddings of Czech words

    Authors: Lukáš Svoboda, Tomáš Brychcín

    Abstract: Word embedding methods have proven very useful in many NLP (Natural Language Processing) tasks. Much has been investigated about word embeddings of English words and phrases, but little attention has been paid to other languages. Our goal in this paper is to explore the behavior of state-of-the-art word embedding methods on Czech, a language that is characterized by…

    Submitted 2 August, 2016; originally announced August 2016.

    Comments: Paper accepted at the CICLing 2016 conference; to be published by Springer

  5. arXiv:1607.07057  [pdf, other]

    cs.CL

    Latent Tree Language Model

    Authors: Tomas Brychcin

    Abstract: In this paper, we introduce the Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes the syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first one is based on Gibbs sampling. It is fast, but d…

    Submitted 5 September, 2016; v1 submitted 24 July, 2016; originally announced July 2016.

    Comments: Accepted to EMNLP 2016