-
Using WordNet to Complement Training Information in Text Categorization
Abstract: Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed through the use of a set of manually classified documents, a training collection. We suggest the utilization of additional resources like lexical databases to increase the amount of information that TC systems make use of, and thus, to improve their performance. Our ap…
Submitted 17 September, 1997; originally announced September 1997.
Comments: 16 pages, 1 figure, 3 tables, previously with RANLP LaTeX style
Journal ref: Second International Conference on Recent Advances in Natural Language Processing, 1997
-
Integrating a Lexical Database and a Training Collection for Text Categorization
Abstract: Automatic text categorization is a complex and useful task for many natural language processing applications. Recent approaches to text categorization focus more on algorithms than on the resources involved in this operation. In contrast to this trend, we present an approach based on the integration of widely available resources such as lexical databases and training collections to overcome current limit…
Submitted 15 September, 1997; originally announced September 1997.
Comments: 12 pages, 3 figures (2 tables)
Journal ref: ACL/EACL Workshop on Automatic Extraction and Building of Lexical Semantic Resources for Natural Language Applications, 1997