Showing 1–6 of 6 results for author: Majewska, O

  1. Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation

    Authors: Olga Majewska, Evgeniia Razumovskaia, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen

    Abstract: Multilingual task-oriented dialogue (ToD) facilitates access to services and information for many (communities of) speakers. Nevertheless, the potential of this technology is not fully realised, as current datasets for multilingual ToD - both for modular and end-to-end modelling - suffer from severe limitations. 1) When created from scratch, they are usually small in scale and fail to cover many p…

    Submitted 31 January, 2022; originally announced January 2022.

  2. arXiv:2104.08570  [pdf, other]

    cs.CL

    Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems

    Authors: Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo M. Ponti, Anna Korhonen, Ivan Vulić

    Abstract: In task-oriented dialogue (ToD), a user holds a conversation with an artificial agent to complete a concrete task. Although this technology represents one of the central objectives of AI and has been the focus of ever more intense research and development efforts, it is currently limited to a few narrow domains (e.g., food ordering, ticket booking) and a handful of languages (e.g., English, Chines…

    Submitted 25 May, 2022; v1 submitted 17 April, 2021; originally announced April 2021.

  3. arXiv:2012.15421  [pdf, other]

    cs.CL

    Verb Knowledge Injection for Multilingual Event Processing

    Authors: Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen

    Abstract: In parallel to their overwhelming success across NLP tasks, language ability of deep Transformer networks, pretrained via language modeling (LM) objectives has undergone extensive scrutiny. While probing revealed that these models encode a range of syntactic and semantic properties of a language, they are still prone to fall back on superficial cues and simple heuristics to solve downstream tasks,…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: 19 pages, 1 figure, 8 tables

    Journal ref: Proceedings of ACL-IJCNLP 2021 Volume 1 Long Papers 6952-6969

  4. arXiv:2005.11787  [pdf, ps, other]

    cs.CL

    Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

    Authors: Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš

    Abstract: Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models. While on the one hand, joint pretraining (i.e., training from scratch, adding objectives based on external knowledge to the primary LM objective) may be prohibitively comput…

    Submitted 11 October, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

    Comments: EMNLP 2020 - DeeLIO, ECML 2020 - DECODEML, 5 pages, 4 tables, 3 references

  5. arXiv:2005.00333  [pdf, other]

    cs.CL

    XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning

    Authors: Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen

    Abstract: In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects. Moreover, they should be able to generalise the acquired world knowledge to new languages, modulo cultural differences. Advances in machine reasoning and cross-lingual transfer depend on the availability of…

    Submitted 26 October, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

  6. arXiv:2003.04866  [pdf, other]

    cs.CL

    Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

    Authors: Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen

    Abstract: We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language dataset is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pa…

    Submitted 10 March, 2020; originally announced March 2020.

    Comments: Data and guidelines available at https://multisimlex.com/