Showing 1–24 of 24 results for author: Cheng-Kuang

  1. arXiv:2505.17407  [pdf, ps, other]

    cs.CL

    Language Matters: How Do Multilingual Input and Reasoning Paths Affect Large Reasoning Models?

    Authors: Zhi Rui Tam, Cheng-Kuang Wu, Yu Ying Chiu, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee

    Abstract: Large reasoning models (LRMs) have demonstrated impressive performance across a range of reasoning tasks, yet little is known about their internal reasoning processes in multilingual settings. We begin with a critical question: "In which language do these models reason when solving problems presented in different languages?" Our findings reveal that, despite multilingual training, LRMs tend to…

    Submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2504.13603  [pdf, other]

    cs.CL

    Continual Pre-Training is (not) What You Need in Domain Adaption

    Authors: Pin-Er Chen, Da-Chen Lian, Shu-Kai Hsieh, Sieh-Chuen Huang, Hsuan-Lei Shao, Jun-Wei Chiu, Yang-Hsien Lin, Zih-Ching Chen, Cheng-Kuang, Eddie TC Huang, Simon See

    Abstract: The recent advances in Legal Large Language Models (LLMs) have transformed the landscape of legal research and practice by automating tasks, enhancing research precision, and supporting complex decision-making processes. However, effectively adapting LLMs to the legal domain remains challenging due to the complexity of legal reasoning, the need for precise interpretation of specialized language, a…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 11 pages, 2 figures

  3. arXiv:2503.01550  [pdf, other]

    cs.CY cs.CL

    None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering

    Authors: Zhi Rui Tam, Cheng-Kuang Wu, Chieh-Yen Lin, Yun-Nung Chen

    Abstract: Multiple-choice exam questions with "None of the above" (NA) options have been extensively studied in educational testing, in which existing research suggests that they better assess true knowledge. However, their impact on Large Language Models (LLMs) evaluation remains underexplored. Through systematic experiments with 28 LLMs on the MMLU benchmark, we examine how NA options affect model perform…

    Submitted 3 March, 2025; originally announced March 2025.

  4. arXiv:2503.01332  [pdf, other]

    cs.CL cs.AI

    Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models

    Authors: Cheng-Kuang Wu, Zhi Rui Tam, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee

    Abstract: Knowing when to answer or refuse is crucial for safe and reliable decision-making language agents. Although prior work has introduced refusal strategies to boost LMs' reliability, how these models adapt their decisions to different risk levels remains underexplored. We formalize the task of risk-aware decision-making, expose critical weaknesses in existing LMs, and propose skill-decomposition solu…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: preprint

  5. arXiv:2408.02442  [pdf, other]

    cs.CL

    Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models

    Authors: Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen

    Abstract: Structured generation, the process of producing content in standardized formats like JSON and XML, is widely utilized in real-world applications to extract key output information from large language models (LLMs). This study investigates whether such constraints on generation space impact LLMs' abilities, including reasoning and domain knowledge comprehension. Specifically, we evaluate LLMs perform…

    Submitted 14 October, 2024; v1 submitted 5 August, 2024; originally announced August 2024.

    Comments: 18 pages

  6. arXiv:2407.14767  [pdf, other]

    cs.CL cs.AI

    I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation

    Authors: Cheng-Kuang Wu, Zhi Rui Tam, Chao-Chung Wu, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen

    Abstract: This study explores the proactive ability of LLMs to seek user support. We propose metrics to evaluate the trade-off between performance improvements and user burden, and investigate whether LLMs can determine when to request help under varying information availability. Our experiments show that without external feedback, many LLMs struggle to recognize their need for user support. The findings hi…

    Submitted 29 September, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: Accepted by EMNLP 2024 Main Conference

  7. arXiv:2407.10603  [pdf, other]

    eess.AS cs.CL cs.SD

    Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data

    Authors: Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-yi Lee

    Abstract: Recent advances in automatic speech recognition (ASR) often rely on large speech foundation models for generating high-quality transcriptions. However, these models can be impractical due to limited computing resources. The situation is even more severe in terms of more realistic or difficult scenarios, such as code-switching ASR (CS-ASR). To address this, we present a framework for developing mor…

    Submitted 15 July, 2024; originally announced July 2024.

  8. arXiv:2407.01911  [pdf, other]

    cs.CL cs.HC cs.SD eess.AS

    Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model

    Authors: Yu-Kuan Fu, Cheng-Kuang Lee, Hsiu-Hsuan Wang, Hung-yi Lee

    Abstract: Recent efforts in Spoken Dialogue Modeling aim to synthesize spoken dialogue without the need for direct transcription, thereby preserving the wealth of non-textual information inherent in speech. However, this approach faces a challenge when speakers talk simultaneously, requiring stereo dialogue data with speakers recorded on separate channels, a notably scarce resource. To address this, we have…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Submitted to Interspeech 2024

  9. arXiv:2406.08747  [pdf, other]

    cs.CL

    StreamBench: Towards Benchmarking Continuous Improvement of Language Agents

    Authors: Cheng-Kuang Wu, Zhi Rui Tam, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee

    Abstract: Recent works have shown that large language model (LLM) agents are able to improve themselves from experience, which is an important ability for continuous enhancement post-deployment. However, existing benchmarks primarily evaluate their innate capabilities and do not assess their ability to improve over time. To address this gap, we introduce StreamBench, a pioneering benchmark designed to evalu…

    Submitted 30 October, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Track on Datasets and Benchmarks

  10. arXiv:2406.03009  [pdf, other]

    cs.CL cs.AI

    Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models

    Authors: Sheng-Lun Wei, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen

    Abstract: In this paper, we investigate the phenomena of "selection biases" in Large Language Models (LLMs), focusing on problems where models are tasked with choosing the optimal option from an ordered sequence. We delve into biases related to option order and token usage, which significantly impact LLMs' decision-making processes. We also quantify the impact of these biases through an extensive empirical…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted as a long findings paper at ACL 2024

  11. arXiv:2405.13629  [pdf, other]

    cs.LG

    Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow

    Authors: Chen-Hao Chao, Chien Feng, Wei-Fang Sun, Cheng-Kuang Lee, Simon See, Chun-Yi Lee

    Abstract: Existing Maximum-Entropy (MaxEnt) Reinforcement Learning (RL) methods for continuous action spaces are typically formulated based on actor-critic frameworks and optimized through alternating steps of policy evaluation and policy improvement. In the policy evaluation steps, the critic is updated to capture the soft Q-function. In the policy improvement steps, the actor is adjusted in accordance wit…

    Submitted 26 October, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Published at NeurIPS 2024. Code: https://github.com/ChienFeng-hub/meow

  12. arXiv:2310.14981  [pdf, other]

    cs.CL

    Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation

    Authors: Wei-Lin Chen, Cheng-Kuang Wu, Hsin-Hsi Chen, Chung-Chi Chen

    Abstract: In this paper, we address the hallucination problem commonly found in natural language generation tasks. Language models often generate fluent and convincing content but can lack consistency with the provided source, resulting in potential inaccuracies. We propose a new decoding method called Fidelity-Enriched Contrastive Search (FECS), which augments the contrastive search framework with context-…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted as a short paper at EMNLP 2023

  13. arXiv:2307.08922  [pdf, other]

    cs.CL

    Large Language Models Perform Diagnostic Reasoning

    Authors: Cheng-Kuang Wu, Wei-Lin Chen, Hsin-Hsi Chen

    Abstract: We explore the extension of chain-of-thought (CoT) prompting to medical reasoning for the task of automatic diagnosis. Motivated by doctors' underlying reasoning process, we present Diagnostic-Reasoning CoT (DR-CoT). Empirical results demonstrate that by simply prompting large language models trained only on general text corpus with two DR-CoT exemplars, the diagnostic accuracy improves by 15% com…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted as a Tiny Paper at ICLR 2023 (10 pages, 5 figures)

  14. arXiv:2306.02430  [pdf, other]

    cs.MA cs.LG

    A Unified Framework for Factorizing Distributional Value Functions for Multi-Agent Reinforcement Learning

    Authors: Wei-Fang Sun, Cheng-Kuang Lee, Simon See, Chun-Yi Lee

    Abstract: In fully cooperative multi-agent reinforcement learning (MARL) settings, environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of other agents. To address the above issues, we proposed a unified framework, called DFAC, for integrating distributional RL with value function factorization methods. This framework generalizes expected v…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: JMLR 2023. Extended version of arXiv:2102.07936

  15. arXiv:2305.15035  [pdf, other]

    cs.CL

    Self-ICL: Zero-Shot In-Context Learning with Self-Generated Demonstrations

    Authors: Wei-Lin Chen, Cheng-Kuang Wu, Yun-Nung Chen, Hsin-Hsi Chen

    Abstract: Large language models (LLMs) have exhibited striking in-context learning (ICL) ability to adapt to target tasks with a few input-output demonstrations. For better ICL, different methods are proposed to select representative demonstrations from existing training corpora. However, such settings are not aligned with real-world practices, as end-users usually query LMs without access to demonstration…

    Submitted 23 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted as a long paper at EMNLP 2023

  16. arXiv:2305.07355  [pdf, other]

    cs.CL

    ZARA: Improving Few-Shot Self-Rationalization for Small Language Models

    Authors: Wei-Lin Chen, An-Zi Yen, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen

    Abstract: Language models (LMs) that jointly generate end-task answers as well as free-text rationales are known as self-rationalization models. Recent works demonstrate great performance gain for self-rationalization by few-shot prompting LMs with rationale-augmented exemplars. However, the ability to benefit from explanations only emerges with large-scale LMs, which have poor accessibility. In this work,…

    Submitted 23 October, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted as a long paper at EMNLP Findings 2023

  17. arXiv:2212.08830  [pdf, other]

    cs.CV

    Inductive Attention for Video Action Anticipation

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz

    Abstract: Anticipating future actions based on spatiotemporal observations is essential in video understanding and predictive computer vision. Moreover, a model capable of anticipating the future has important applications: it can benefit precautionary systems to react before an event occurs. However, unlike in the action recognition task, future information is inaccessible at observation time -- a model ca…

    Submitted 18 March, 2023; v1 submitted 17 December, 2022; originally announced December 2022.

  18. arXiv:2206.10869  [pdf, other]

    cs.CV

    NVIDIA-UNIBZ Submission for EPIC-KITCHENS-100 Action Anticipation Challenge 2022

    Authors: Tsung-Ming Tai, Oswald Lanz, Giuseppe Fiameni, Yi-Kwan Wong, Sze-Sen Poon, Cheng-Kuang Lee, Ka-Chun Cheung, Simon See

    Abstract: In this report, we describe the technical details of our submission for the EPIC-KITCHENS-100 action anticipation challenge. Our models, the higher-order recurrent space-time transformer and the message-passing neural network with edge learning, are both recurrent-based architectures which observe only 2.5 seconds inference context to form the action anticipation prediction. By averaging the pre…

    Submitted 22 June, 2022; originally announced June 2022.

  19. arXiv:2206.01009  [pdf, other]

    cs.CV

    Unified Recurrence Modeling for Video Action Anticipation

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz

    Abstract: Forecasting future events based on evidence of current conditions is an innate skill of human beings, and key for predicting the outcome of any decision making. In artificial vision for example, we would like to predict the next human action before it happens, without observing the future video frames associated to it. Computer vision models for action anticipation are expected to collect the subt…

    Submitted 2 June, 2022; originally announced June 2022.

  20. arXiv:2110.09930  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning

    Authors: Yi-Chen Chen, Shu-wen Yang, Cheng-Kuang Lee, Simon See, Hung-yi Lee

    Abstract: Speech representation learning plays a vital role in speech processing. Among them, self-supervised learning (SSL) has become an important research direction. It has been shown that an SSL pretraining model can achieve excellent performance in various downstream tasks of speech processing. On the other hand, supervised multi-task learning (MTL) is another representation learning paradigm, which ha…

    Submitted 18 October, 2021; originally announced October 2021.

  21. arXiv:2105.03070  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    SpeechNet: A Universal Modularized Model for Speech Processing Tasks

    Authors: Yi-Chen Chen, Po-Han Chi, Shu-wen Yang, Kai-Wei Chang, Jheng-hao Lin, Sung-Feng Huang, Da-Rong Liu, Chi-Liang Liu, Cheng-Kuang Lee, Hung-yi Lee

    Abstract: There is a wide variety of speech processing tasks ranging from extracting content information from speech signals to generating speech signals. For different tasks, model networks are usually designed and tuned separately. If a universal model can perform multiple speech processing tasks, some tasks might be improved with the related abilities learned from other tasks. The multi-task learning of…

    Submitted 31 May, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

  22. arXiv:2104.08665  [pdf, other]

    cs.CV

    Higher Order Recurrent Space-Time Transformer for Video Action Prediction

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Oswald Lanz

    Abstract: Endowing visual agents with predictive capability is a key step towards video intelligence at scale. The predominant modeling paradigm for this is sequence learning, mostly implemented through LSTMs. Feed-forward Transformer architectures have replaced recurrent model designs in ML applications of language processing and also partly in computer vision. In this paper we investigate on the competiti…

    Submitted 21 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

  23. arXiv:2102.07936  [pdf, other]

    cs.MA cs.LG

    DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

    Authors: Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee

    Abstract: In fully cooperative multi-agent reinforcement learning (MARL) settings, the environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of the other agents. To address the above issues, we integrate distributional RL and value function factorization methods by proposing a Distributional Value Function Factorization (DFAC) framework to g…

    Submitted 22 December, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  24. arXiv:2005.07029  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

    Authors: Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee

    Abstract: In previous works, only parameter weights of ASR models are optimized under fixed-topology architecture. However, the design of successful model architecture has always relied on human experience and intuition. Besides, many hyperparameters related to model architecture need to be manually tuned. Therefore in this paper, we propose an ASR approach with efficient gradient-based architecture search,…

    Submitted 25 July, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    Comments: Accepted at INTERSPEECH 2020