Showing 1–1 of 1 results for author: Nenadic, G

  1. arXiv:2303.04526

    cs.CL cs.IT math.NA stat.AP

    Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

    Authors: Serge Gladkoff, Lifeng Han, Goran Nenadic

    Abstract: In natural language processing (NLP) we always rely on human judgement as the golden quality evaluation method. However, there has been an ongoing debate on how to better evaluate inter-rater reliability (IRR) levels for certain evaluation tasks, such as translation quality evaluation (TQE), especially when the data samples (observations) are very scarce. In this work, we first introduce the study…

    Submitted 9 July, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Accepted to RANLP2023: Recent Advances in Natural Language Processing, Varna, Bulgaria. 30 Aug - 8 Sep. https://ranlp.org/ranlp2023/