Showing 1–3 of 3 results for author: Widiger, A

  1. arXiv:cs/0609065

    cs.CL cs.IR

    Geocoding multilingual texts: Recognition, disambiguation and visualisation

    Authors: Bruno Pouliquen, Marco Kimler, Ralf Steinberger, Camelia Ignat, Tamara Oellinger, Ken Blackler, Flavio Fuart, Wajdi Zaghouani, Anna Widiger, Ann-Charlotte Forslund, Clive Best

    Abstract: We are presenting a method to recognise geographical references in free text. Our tool must work on various languages with a minimum of language-dependent resources, except a gazetteer. The main difficulty is to disambiguate these place names by distinguishing places from persons and by selecting the most likely place out of a list of homographic place names world-wide. The system uses a number…

    Submitted 12 September, 2006; originally announced September 2006.

    Comments: 6 pages

    ACM Class: H.3.1; H.3.3; H.3.4

    Journal ref: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC-2006), pp. 53-58. Genoa, Italy, 24-26 May 2006

  2. arXiv:cs/0609058

    cs.CL

    The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages

    Authors: Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaz Erjavec, Dan Tufis, Daniel Varga

    Abstract: We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, with additional documents being available in the languages of the EU candidate countries. The corpus consists of almost 8,000 documents per language, with an average size of nearly 9 million words per language. Pair-wise par…

    Submitted 12 September, 2006; originally announced September 2006.

    Comments: A multilingual textual resource with meta-data freely available for download at http://langtech.jrc.it/JRC-Acquis.html

    ACM Class: H.3.1; H.3.6

    Journal ref: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006), pp. 2142-2147. Genoa, Italy, 24-26 May 2006

  3. arXiv:cs/0609051

    cs.CL cs.IR

    Multilingual person name recognition and transliteration

    Authors: Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, Irina Temnikova, Anna Widiger, Wajdi Zaghouani, Jan Zizka

    Abstract: We present an exploratory tool that extracts person names from multilingual news collections, matches name variants referring to the same person, and infers relationships between people based on the co-occurrence of their names in related news. A novel feature is the matching of name variants across languages and writing systems, including names written with the Greek, Cyrillic and Arabic writin…

    Submitted 11 September, 2006; originally announced September 2006.

    Comments: Explains the technology behind the JRC's NewsExplorer application, which is freely accessible at http://press.jrc.it/NewsExplorer

    ACM Class: H.3.1; H.3.3; H.3.4; H.3.5

    Journal ref: Journal CORELA - Cognition, Representation, Langage. Numeros speciaux, Le traitement lexicographique des noms propres. December 2005. ISSN 1638-5748