
Showing 1–15 of 15 results for author: Nutanong, S

  1. arXiv:2505.14157  [pdf, ps, other]

    cs.CL cs.AI

    Prior Prompt Engineering for Reinforcement Fine-Tuning

    Authors: Pittawat Taveekitworachai, Potsawee Manakul, Sarana Nutanong, Kunat Pipatanakul

    Abstract: This paper investigates prior prompt engineering (pPE) in the context of reinforcement fine-tuning (RFT), where language models (LMs) are incentivized to exhibit behaviors that maximize performance through reward signals. While existing RFT research has primarily focused on algorithms, reward shaping, and data curation, the design of the prior prompt--the instructions prepended to queries during t…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 25 pages, 42 figures

  2. arXiv:2504.05898  [pdf, other]

    cs.CL

    Assessing Thai Dialect Performance in LLMs with Automatic Benchmarks and Human Evaluation

    Authors: Peerat Limkonchotiwat, Kanruethai Masuk, Surapon Nonesung, Chalermpun Mai-On, Sarana Nutanong, Wuttikorn Ponwitayarat, Potsawee Manakul

    Abstract: Large language models show promising results in various NLP tasks. Despite these successes, the robustness and consistency of LLMs in underrepresented languages remain largely unexplored, especially concerning local dialects. Existing benchmarks also focus on main dialects, neglecting LLMs' ability on local dialect texts. In this paper, we introduce a Thai local dialect benchmark covering Northern…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: Datasets and codes are available at https://github.com/mrpeerat/Thai_local_benchmark

  3. arXiv:2502.17956  [pdf, other]

    cs.CL

    Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments

    Authors: Patomporn Payoungkhamdee, Pume Tuchinda, Jinheon Baek, Samuel Cahyawijaya, Can Udomcharoenchaikit, Potsawee Manakul, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Multi-step reasoning is essential for large language models (LLMs), yet multilingual performance remains challenging. While Chain-of-Thought (CoT) prompting improves reasoning, it struggles with non-English languages due to the entanglement of reasoning and execution. Program-of-Thought (PoT) prompting separates reasoning from execution, offering a promising alternative but shifting the challenge…

    Submitted 22 May, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

  4. arXiv:2502.10868  [pdf, other]

    cs.CL

    NitiBench: A Comprehensive Study of LLM Framework Capabilities for Thai Legal Question Answering

    Authors: Pawitsapak Akarajaradwong, Pirat Pothavorn, Chompakorn Chaksangchaichot, Panuthep Tasawong, Thitiwat Nopparatbundit, Sarana Nutanong

    Abstract: The application of large language models (LLMs) in the legal domain holds significant potential for information retrieval and question answering, yet Thai legal QA systems face challenges due to a lack of standardized evaluation benchmarks and the complexity of Thai legal structures. This paper introduces NitiBench, a benchmark comprising two datasets: the NitiBench-CCL, covering general Thai fina…

    Submitted 8 March, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

  5. arXiv:2407.19164  [pdf, other]

    cs.CL

    Addressing Topic Leakage in Cross-Topic Evaluation for Authorship Verification

    Authors: Jitkapat Sawatphol, Can Udomcharoenchaikit, Sarana Nutanong

    Abstract: Authorship verification (AV) aims to identify whether a pair of texts has the same author. We address the challenge of evaluating AV models' robustness against topic shifts. The conventional evaluation assumes minimal topic overlap between training and test data. However, we argue that there can still be topic leakage in test data, causing misleading model performance and unstable rankings. To add…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in Transactions of the Association for Computational Linguistics

  6. arXiv:2406.03125  [pdf, other]

    cs.CL

    Space Decomposition for Sentence Embedding

    Authors: Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Determining sentence pair similarity is crucial for various NLP tasks. A common technique to address this is typically evaluated on a continuous semantic textual similarity scale from 0 to 5. However, based on a linguistic observation in STS annotation guidelines, we found that the score in the range [4,5] indicates an upper-range sample, while the rest are lower-range samples. This necessitates a…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL Findings 2024. The code and pre-trained models are available at https://github.com/KornWtp/MixSP

  7. arXiv:2403.16127  [pdf, other]

    cs.CL cs.AI

    WangchanLion and WangchanX MRC Eval

    Authors: Wannaphong Phatthiyaphaibun, Surapon Nonesung, Patomporn Payoungkhamdee, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Jitkapat Sawatphol, Chompakorn Chaksangchaichot, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: This technical report describes the development of WangchanLion, an instruction fine-tuned model focusing on Machine Reading Comprehension (MRC) in the Thai language. Our model is based on SEA-LION and a collection of instruction following datasets. To promote open research and reproducibility, we publicly release all training data, code, and the final model weights under the Apache-2 license. To…

    Submitted 23 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

  8. arXiv:2311.03228  [pdf, other]

    cs.CL cs.AI

    An Efficient Self-Supervised Cross-View Training For Sentence Embedding

    Authors: Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Lalita Lowphansirikul, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Self-supervised sentence representation learning is the task of constructing an embedding space for sentences without relying on human annotation efforts. One straightforward approach is to finetune a pretrained language model (PLM) with a representation learning method such as contrastive learning. While this approach achieves impressive performance on larger PLMs, the performance rapidly degrade…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted to TACL. The code and pre-trained models are available at https://github.com/mrpeerat/SCT

  9. arXiv:2306.10348  [pdf, other]

    cs.IR cs.CL

    Typo-Robust Representation Learning for Dense Retrieval

    Authors: Panuthep Tasawong, Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Dense retrieval is a basic building block of information retrieval applications. One of the main challenges of dense retrieval in real-world settings is the handling of queries containing misspelled words. A popular approach for handling misspelled queries is minimizing the representations discrepancy between misspelled queries and their pristine ones. Unlike the existing approaches, which only fo…

    Submitted 17 June, 2023; originally announced June 2023.

    Comments: 5 pages, 2 figures

    ACM Class: I.2.7

  10. arXiv:2208.04799  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Thai Wav2Vec2.0 with CommonVoice V8

    Authors: Wannaphong Phatthiyaphaibun, Chompakorn Chaksangchaichot, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong

    Abstract: Recently, Automatic Speech Recognition (ASR), a system that converts audio into text, has caught a lot of attention in the machine learning community. Thus, a lot of publicly available models were released in HuggingFace. However, most of these ASR models are available in English; only a minority of the models are available in Thai. Additionally, most of the Thai ASR models are closed-sourced, and…

    Submitted 9 August, 2022; originally announced August 2022.

  11. arXiv:2101.09635  [pdf, ps, other]

    cs.CL

    WangchanBERTa: Pretraining transformer-based Thai Language Models

    Authors: Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong

    Abstract: Transformer-based language models, more specifically BERT-based architectures have achieved state-of-the-art performance in many downstream tasks. However, for a relatively low-resource language such as Thai, the choices of models are limited to training a BERT-based model based on a much smaller dataset or finetuning multi-lingual models, both of which yield suboptimal downstream performance. Mor…

    Submitted 20 March, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

    Comments: 24 pages, edited the citation of the syllable-level tokenizer from [Chormai et al., 2020] to [Phatthiyaphaibun et al., 2020] as the authors used the syllable-level tokenizer from PyThaiNLP [Phatthiyaphaibun et al., 2020] in the experiments

  12. scb-mt-en-th-2020: A Large English-Thai Parallel Corpus

    Authors: Lalita Lowphansirikul, Charin Polpanumas, Attapol T. Rutherford, Sarana Nutanong

    Abstract: The primary objective of our work is to build a large-scale English-Thai dataset for machine translation. We construct an English-Thai machine translation dataset with over 1 million segment pairs, curated from various sources, namely news, Wikipedia articles, SMS messages, task-based dialogs, web-crawled data and government documents. Methodology for gathering data, building parallel texts and re…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 35 pages, 4 figures

  13. arXiv:2002.03118  [pdf, other]

    cs.GT math.OC

    Shipper Cooperation in Stochastic Drone Delivery: A Dynamic Bayesian Game Approach

    Authors: Suttinee Sawadsitang, Dusit Niyato, Tan Puay Siew, Ping Wang, Sarana Nutanong

    Abstract: With the recent technological innovation, unmanned aerial vehicles, known as drones, have found numerous applications including package and parcel delivery for shippers. Drone delivery offers benefits over conventional ground-based vehicle delivery in terms of faster speed, lower cost, more environment-friendly, and less manpower needed. However, most of existing studies on drone delivery planning…

    Submitted 8 February, 2020; originally announced February 2020.

    Comments: 15 Pages, 10 figures, 2 tables. This paper is still under review

  14. arXiv:1908.07406  [pdf, ps, other]

    eess.SP

    Multi-Objective Optimization for Drone Delivery

    Authors: Suttinee Sawadsitang, Dusit Niyato, Puay Siew Tan, Sarana Nutanong

    Abstract: Recently, an unmanned aerial vehicle (UAV), as known as drone, has become an alternative means of package delivery. Although the drone delivery scheduling has been studied in recent years, most existing models are formulated as a single objective optimization problem. However, in practice, the drone delivery scheduling has multiple objectives that the shipper has to achieve. Moreover, drone delive…

    Submitted 24 July, 2019; originally announced August 2019.

    Comments: 5 pages, 4 figures

    Journal ref: 2019 IEEE 90th Vehicular Technology Conference: VTC2019-Fall

  15. arXiv:cs/0402018  [pdf]

    cs.DC

    P2P Networks for Content Sharing

    Authors: Choon Hoong Ding, Sarana Nutanong, Rajkumar Buyya

    Abstract: Peer-to-peer (P2P) technologies have been widely used for content sharing, popularly called "file-swapping" networks. This chapter gives a broad overview of content sharing P2P technologies. It starts with the fundamental concept of P2P computing followed by the analysis of network topologies used in peer-to-peer systems. Next, three milestone peer-to-peer technologies: Napster, Gnutella, and Fa…

    Submitted 10 February, 2004; originally announced February 2004.

    Comments: 35 pages, 26 figures

    Report number: GRIDS-TR-2003-7 ACM Class: C.2.4

    Journal ref: Technical Report, GRIDS-TR-2003-7, Grid Computing and Distributed Systems Laboratory, University of Melbourne, Australia, December 2003