Showing 1–2 of 2 results for author: Ding, C C

  1. arXiv:2402.16795  [pdf, other]

    cs.HC cs.AI cs.CL cs.LG

    If in a Crowdsourced Data Annotation Pipeline, a GPT-4

    Authors: Zeyu He, Chieh-Yang Huang, Chien-Kuang Cornelia Ding, Shaurya Rohatgi, Ting-Hao 'Kenneth' Huang

    Abstract: Recent studies indicated GPT-4 outperforms online crowd workers in data labeling accuracy, notably workers from Amazon Mechanical Turk (MTurk). However, these studies were criticized for deviating from standard crowdsourcing practices and emphasizing individual workers' performances over the whole data-annotation process. This paper compared GPT-4 and an ethical and well-executed MTurk pipeline, w…

    Submitted 28 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by CHI 2024

  2. arXiv:2005.02367  [pdf, other]

    cs.CL cs.HC

    CODA-19: Using a Non-Expert Crowd to Annotate Research Aspects on 10,000+ Abstracts in the COVID-19 Open Research Dataset

    Authors: Ting-Hao 'Kenneth' Huang, Chieh-Yang Huang, Chien-Kuang Cornelia Ding, Yen-Chia Hsu, C. Lee Giles

    Abstract: This paper introduces CODA-19, a human-annotated dataset that codes the Background, Purpose, Method, Finding/Contribution, and Other sections of 10,966 English abstracts in the COVID-19 Open Research Dataset. CODA-19 was created by 248 crowd workers from Amazon Mechanical Turk within 10 days, and achieved labeling quality comparable to that of experts. Each abstract was annotated by nine different…

    Submitted 17 September, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

    Comments: Accepted by the NLP COVID-19 Workshop at ACL 2020. (The data, code, and model are available at: https://github.com/windx0303/CODA-19)