Showing 1–1 of 1 results for author: Kanwatchara, K

  1. arXiv:1912.01580

    cs.LG cs.CL stat.ML

    A Comparative Study of Pretrained Language Models on Thai Social Text Categorization

    Authors: Thanapapas Horsuwan, Kasidis Kanwatchara, Peerapon Vateekul, Boonserm Kijsirikul

    Abstract: The ever-growing volume of data of user-generated content on social media provides a nearly unlimited corpus of unlabeled data even in languages where resources are scarce. In this paper, we demonstrate that state-of-the-art results on two Thai social text categorization tasks can be realized by pretraining a language model on a large noisy Thai social media corpus of over 1.26 billion tokens and…

    Submitted 17 December, 2019; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 12 pages, conference