Showing 1–7 of 7 results for author: Aubin, S

  1. arXiv:2012.08325

    q-bio.OT cs.DB

    39 Hints to Facilitate the Use of Semantics for Data on Agriculture and Nutrition

    Authors: Caterina Caracciolo, Sophie Aubin, Clement Jonquet, Emna Amdouni, Romain David, Leyla Garcia, Brandon Whitehead, Catherine Roussey, Armando Stellato, Ferdinando Villa

    Abstract: In this paper, we report on the outputs and adoption of the Agrisemantics Working Group of the Research Data Alliance (RDA), consisting of a set of recommendations to facilitate the adoption of semantic technologies and methods for the purpose of data interoperability in the field of agriculture and nutrition. From 2016 to 2019, the group gathered researchers and practitioners at the crossing poin…

    Submitted 15 December, 2020; originally announced December 2020.

    Journal ref: CODATA Data Science Journal, Committee on Data for Science and Technology (CODATA), 2020, 19 (1)

  2. Building Large Lexicalized Ontologies from Text: a Use Case in Automatic Indexing of Biotechnology Patents

    Authors: Claire Nédellec, Wiktoria Golik, Sophie Aubin, Robert Bossy

    Abstract: This paper presents a tool, TyDI, and methods experimented in the building of a termino-ontology, i.e. a lexicalized ontology aimed at fine-grained indexation for semantic search applications. TyDI provides facilities for knowledge engineers and domain experts to efficiently collaborate to validate, organize and conceptualize corpus extracted terms. A use case on biotechnology patent search demons…

    Submitted 2 October, 2020; originally announced October 2020.

    Journal ref: International Conference on Knowledge Engineering and Knowledge Management. EKAW 2010. Lecture Notes in Computer Science, vol 6317. (pp. 514-523) Springer, Berlin, Heidelberg

  3. arXiv:0706.4375

    cs.AI

    A Robust Linguistic Platform for Efficient and Domain specific Web Content Analysis

    Authors: Thierry Hamon, Adeline Nazarenko, Thierry Poibeau, Sophie Aubin, Julien Derivière

    Abstract: Web semantic access in specific domains calls for specialized search engines with enhanced semantic querying and indexing capacities, which pertain both to information retrieval (IR) and to information extraction (IE). A rich linguistic analysis is required either to identify the relevant semantic units to index and weight them according to linguistic specific statistical distribution, or as the…

    Submitted 29 June, 2007; originally announced June 2007.

    Journal ref: Proceedings of RIAO 2007 (30/05/2007)

  4. arXiv:cs/0609135

    cs.AI cs.IR

    Event-based Information Extraction for the biomedical domain: the Caderige project

    Authors: Erick Alphonse, Sophie Aubin, Philippe Bessières, Gilles Bisson, Thierry Hamon, Sandrine Lagarrigue, Adeline Nazarenko, Alain-Pierre Manine, Claire Nédellec, Mohamed Ould Abdel Vetah, Thierry Poibeau, Davy Weissenbacher

    Abstract: This paper gives an overview of the Caderige project. This project involves teams from different areas (biology, machine learning, natural language processing) in order to develop high-level analysis tools for extracting structured information from biological bibliographical databases, especially Medline. The paper gives an overview of the approach and compares it to the state of the art.

    Submitted 24 September, 2006; originally announced September 2006.

    ACM Class: H.3.1

    Journal ref: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications (COLING'04), Switzerland (2004) 43-39

  5. arXiv:cs/0609019

    cs.CL

    Improving Term Extraction with Terminological Resources

    Authors: Sophie Aubin, Thierry Hamon

    Abstract: Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performances when applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to augment the quality of the extraction. The tool we…

    Submitted 6 September, 2006; originally announced September 2006.

    Journal ref: Advances in Natural Language Processing 5th International Conference on NLP, FinTAL 2006 (2006) 380

  6. arXiv:cs/0606119

    cs.CL cs.IR

    Lexical Adaptation of Link Grammar to the Biomedical Sublanguage: a Comparative Evaluation of Three Approaches

    Authors: Sampo Pyysalo, Tapio Salakoski, Sophie Aubin, Adeline Nazarenko

    Abstract: We study the adaptation of Link Grammar Parser to the biomedical sublanguage with a focus on domain terms not found in a general parser lexicon. Using two biomedical corpora, we implement and evaluate three approaches to addressing unknown words: automatic lexicon expansion, the use of morphological clues, and disambiguation using a part-of-speech tagger. We evaluate each approach separately for…

    Submitted 28 June, 2006; originally announced June 2006.

    ACM Class: H.4

    Journal ref: Proceedings of the Second International Symposium on Semantic Mining in Biomedicine (SMBM 2006) (2006) 60-67

  7. arXiv:cs/0606118

    cs.CL cs.IR

    Adapting a general parser to a sublanguage

    Authors: Sophie Aubin, Adeline Nazarenko, Claire Nédellec

    Abstract: In this paper, we propose a method to adapt a general parser (Link Parser) to sublanguages, focusing on the parsing of texts in biology. Our main proposal is the use of terminology (identification and analysis of terms) in order to reduce the complexity of the text to be parsed. Several other strategies are explored and finally combined, among which text normalization, lexicon and morpho-guessing m…

    Submitted 28 June, 2006; originally announced June 2006.

    ACM Class: H.4

    Journal ref: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP'05) (2005) 89-93