Showing 1–1 of 1 results for author: Ge, A C

  1. arXiv:1906.08470  [pdf, other]

    cs.DL cs.IR

    Cleaning Noisy and Heterogeneous Metadata for Record Linking Across Scholarly Big Datasets

    Authors: Athar Sefid, Jian Wu, Allen C. Ge, Jing Zhao, Lu Liu, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles

    Abstract: Automatically extracted metadata from scholarly documents in PDF formats is usually noisy and heterogeneous, often containing incomplete fields and erroneous values. One common way of cleaning metadata is to use a bibliographic reference dataset. The challenge is to match records between corpora with high precision. The existing solution which is based on information retrieval and string similarit…

    Submitted 20 June, 2019; originally announced June 2019.