Showing 1–1 of 1 results for author: Salmela, A

  1. arXiv:2102.07396  [pdf, other]

    cs.CL

    Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers

    Authors: Liina Repo, Valtteri Skantsi, Samuel Rönnqvist, Saara Hellström, Miika Oinonen, Anna Salmela, Douglas Biber, Jesse Egbert, Sampo Pyysalo, Veronika Laippala

    Abstract: We explore cross-lingual transfer of register classification for web documents. Registers, that is, text varieties such as blogs or news, are one of the primary predictors of linguistic variation and thus affect the automatic processing of language. We introduce two new register-annotated corpora, FreCORE and SweCORE, for French and Swedish. We demonstrate that deep pre-trained language models perf…

    Submitted 15 February, 2021; originally announced February 2021.