Showing 1–2 of 2 results for author: Alegria, I
-
A Comparison of Feature-Based and Neural Scansion of Poetry
Authors:
Manex Agirrezabal,
Iñaki Alegria,
Mans Hulden
Abstract:
Automatic analysis of poetic rhythm is a challenging task that involves linguistics, literature, and computer science. When the language to be analyzed is known, rule-based systems or data-driven methods can be used. In this paper, we analyze poetic rhythm in English and Spanish. We show that the representations of data learned from character-based neural models are more informative than the ones from hand-crafted features, and that a Bi-LSTM+CRF model produces state-of-the-art accuracy on scansion of poetry in two languages. Results also show that the information about whole word structure, and not just independent syllables, is highly informative for performing scansion.
Submitted 2 November, 2017;
originally announced November 2017.
-
Different Issues in the Design of a Lemmatizer/Tagger for Basque
Authors:
I. Aduriz,
I. Alegria,
J. M. Arriola,
X. Artola,
A. Diaz de Illarraza,
N. Ezeiza,
K. Gojenola,
M. Maritxalar
Abstract:
This paper presents relevant issues that have been considered in the design of a general-purpose lemmatizer/tagger for Basque (EUSLEM). The lemmatizer/tagger is conceived as a basic tool necessary for other linguistic applications. It uses the lexical database and the morphological analyzer previously developed and implemented. Due to the characteristics of the language, the tagset proposed here is structured in four levels, so that each level is a refinement of the previous one in the sense that it adds more detailed information. We will focus on the problems found in designing this tagset and on the strategies for morphological disambiguation that will be used.
Submitted 20 March, 1995;
originally announced March 1995.