Showing 1–7 of 7 results for author: McKeown, K R

  1. arXiv:cs/0206007

    cs.CL cs.DL

    Using the Annotated Bibliography as a Resource for Indicative Summarization

    Authors: Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown

    Abstract: We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well-covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We deta…

    Submitted 4 June, 2002; originally announced June 2002.

    Comments: 8 pages, 3 figures

    ACM Class: I.2.7

    Journal ref: Proceedings of LREC 2002, Las Palmas, Spain. pp. 1746-1752

  2. arXiv:cs/0107019

    cs.CL

    Applying Natural Language Generation to Indicative Summarization

    Authors: Min-Yen Kan, Kathleen R. McKeown, Judith L. Klavans

    Abstract: The task of creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from a generation perspective, by first analyzing its required content via published guidelines and corpus analysis. We show how these summaries can be factored into a set of document features, and how an implemente…

    Submitted 16 July, 2001; v1 submitted 16 July, 2001; originally announced July 2001.

    Comments: 8 pages, published in Proc. of 8th European Workshop on NLG

    ACM Class: I.2.7

  3. arXiv:cs/9810014

    cs.CL

    Resources for Evaluation of Summarization Techniques

    Authors: Judith L. Klavans, Kathleen R. McKeown, Min-Yen Kan, Susan Lee

    Abstract: We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of the corpora, methods used in the collection of user judgments, and an overview of the application of the corpora to evaluating the component system. Finally, we discuss the problems and issues with…

    Submitted 13 October, 1998; originally announced October 1998.

    Comments: LaTeX source, 5 pages, US Letter, uses lrec98.sty

    ACM Class: I.2.7

    Journal ref: in Proc. of First International Conference on Language Resources and Evaluation, Rubio, Gallardo, Castro, and Tejada (eds.), Granada, Spain, 1998

  4. arXiv:cs/9809020

    cs.CL

    Linear Segmentation and Segment Significance

    Authors: Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown

    Abstract: We present a new method for discovering a segmental discourse structure of a document while categorizing segment function. We demonstrate how retrieval of noun phrases and pronominal forms, along with a zero-sum weighting scheme, determines topicalized segmentation. Furthermore, we use term distribution to aid in identifying the role that the segment performs in the document. Finally, we present…

    Submitted 15 September, 1998; originally announced September 1998.

    Comments: 9 pages, US Letter, 4 figures. Software License can be found at http://www.cs.columbia.edu/nlp/licenses/segmenterLicenseDownload.html

    ACM Class: I.2.7

    Journal ref: Proceedings of 6th International Workshop of Very Large Corpora (WVLC-6), Montreal, Quebec, Canada: Aug. 1998. pp. 197-205

  5. Building a Generation Knowledge Source using Internet-Accessible Newswire

    Authors: Dragomir R. Radev, Kathleen R. McKeown

    Abstract: In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it all…

    Submitted 25 February, 1997; originally announced February 1997.

    Comments: 8 pages, uses epsf

    Journal ref: To appear in Proceedings of the 5th Conference on Applied Natural Language Processing, Washington DC, 31 March - 3 April, 1997.

  6. arXiv:cmp-lg/9610002

    cs.CL

    Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

    Authors: Eric V. Siegel, Kathleen R. McKeown

    Abstract: This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use f…

    Submitted 21 October, 1996; originally announced October 1996.

    Comments: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers (eds.)

  7. arXiv:cmp-lg/9408007

    cs.CL

    Emergent Linguistic Rules from Inducing Decision Trees: Disambiguating Discourse Clue Words

    Authors: Eric V. Siegel, Kathleen R. McKeown

    Abstract: We apply decision tree induction to the problem of discourse clue word sense disambiguation with a genetic algorithm. The automatic partitioning of the training set which is intrinsic to decision tree induction gives rise to linguistically viable rules.

    Submitted 13 August, 1994; originally announced August 1994.

    Journal ref: AAAI94 proceedings