Showing 1–50 of 495 results for author: Nguyen, Q

  1. arXiv:2507.01543  [pdf, ps, other]

    cs.CL

    Is External Information Useful for Stance Detection with LLMs?

    Authors: Quang Minh Nguyen, Taegyoon Kim

    Abstract: In the stance detection task, a text is classified as either favorable, opposing, or neutral towards a target. Prior work suggests that the use of external information, e.g., excerpts from Wikipedia, improves stance detection performance. However, whether or not such information can benefit large language models (LLMs) remains an unanswered question, despite their wide adoption in many reasoning t…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: ACL Findings 2025

  2. arXiv:2507.00920  [pdf, ps, other]

    cs.LG eess.SP

    Privacy-Preserving Quantized Federated Learning with Diverse Precision

    Authors: Dang Qua Nguyen, Morteza Hashemi, Erik Perrins, Sergiy A. Vorobyov, David J. Love, Taejoon Kim

    Abstract: Federated learning (FL) has emerged as a promising paradigm for distributed machine learning, enabling collaborative training of a global model across multiple local devices without requiring them to share raw data. Despite its advancements, FL is limited by factors such as: (i) privacy risks arising from the unprotected transmission of local model updates to the fusion center (FC) and (ii) decrea…

    Submitted 1 July, 2025; originally announced July 2025.

  3. arXiv:2506.23524  [pdf]

    cs.CL cs.AI

    NEU-ESC: A Comprehensive Vietnamese dataset for Educational Sentiment analysis and topic Classification toward multitask learning

    Authors: Phan Quoc Hung Mai, Quang Hung Nguyen, Phuong Giang Duong, Hong Hanh Nguyen, Nguyen Tuan Long

    Abstract: In the field of education, understanding students' opinions through their comments is crucial, especially in the Vietnamese language, where resources remain limited. Existing educational datasets often lack domain relevance and student slang. To address these gaps, we introduce NEU-ESC, a new Vietnamese dataset for Educational Sentiment Classification and Topic Classification, curated from univers…

    Submitted 30 June, 2025; originally announced June 2025.

  4. arXiv:2506.23273  [pdf, ps, other]

    cs.AI

    FinStat2SQL: A Text2SQL Pipeline for Financial Statement Analysis

    Authors: Quang Hung Nguyen, Phuong Anh Trinh, Phan Quoc Hung Mai, Tuan Phong Trinh

    Abstract: Despite the advancements of large language models, text2sql still faces many challenges, particularly with complex and domain-specific queries. In finance, database designs and financial reporting layouts vary widely between financial entities and countries, making text2sql even more challenging. We present FinStat2SQL, a lightweight text2sql pipeline enabling natural language queries over financi…

    Submitted 29 June, 2025; originally announced June 2025.

  5. arXiv:2506.21920  [pdf, ps, other]

    cs.CV

    SepFormer: Coarse-to-fine Separator Regression Network for Table Structure Recognition

    Authors: Nam Quan Nguyen, Xuan Phong Pham, Tuan-Anh Tran

    Abstract: The automated reconstruction of the logical arrangement of tables from image data, termed Table Structure Recognition (TSR), is fundamental for semantic data extraction. Recently, researchers have explored a wide range of techniques to tackle this problem, demonstrating significant progress. Each table is a set of vertical and horizontal separators. Following this realization, we present SepFormer…

    Submitted 27 June, 2025; originally announced June 2025.

  6. arXiv:2506.19261  [pdf, ps, other]

    cs.CV

    Automated Image Recognition Framework

    Authors: Quang-Binh Nguyen, Trong-Vu Hoang, Ngoc-Do Tran, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: While the efficacy of deep learning models heavily relies on data, gathering and annotating data for specific tasks, particularly when addressing novel or sensitive subjects lacking relevant datasets, poses significant time and resource challenges. In response to this, we propose a novel Automated Image Recognition (AIR) framework that harnesses the power of generative AI. AIR empowers end-users t…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: ICCCI 2025

  7. arXiv:2506.18493  [pdf, ps, other]

    cs.CV

    ShowFlow: From Robust Single Concept to Condition-Free Multi-Concept Generation

    Authors: Trong-Vu Hoang, Quang-Binh Nguyen, Thanh-Toan Do, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Customizing image generation remains a core challenge in controllable image synthesis. For single-concept generation, maintaining both identity preservation and prompt alignment is challenging. In multi-concept scenarios, relying solely on a prompt without additional conditions like layout boxes or semantic masks often leads to identity loss and concept omission. In this paper, we introduce ShowF…

    Submitted 23 June, 2025; originally announced June 2025.

  8. arXiv:2506.18448  [pdf, ps, other]

    cs.RO

    GraspMAS: Zero-Shot Language-driven Grasp Detection with Multi-Agent System

    Authors: Quang Nguyen, Tri Le, Huy Nguyen, Thieu Vo, Tung D. Ta, Baoru Huang, Minh N. Vu, Anh Nguyen

    Abstract: Language-driven grasp detection has the potential to revolutionize human-robot interaction by allowing robots to understand and execute grasping tasks based on natural language commands. However, existing approaches face two key challenges. First, they often struggle to interpret complex text instructions or operate ineffectively in densely cluttered environments. Second, most methods require a tr…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 8 pages, accepted to IROS 2025

  9. arXiv:2506.18024  [pdf, ps, other]

    cs.DC cs.RO

    Leveraging Cloud-Fog Automation for Autonomous Collision Detection and Classification in Intelligent Unmanned Surface Vehicles

    Authors: Thien Tran, Quang Nguyen, Jonathan Kua, Minh Tran, Toan Luu, Thuong Hoang, Jiong Jin

    Abstract: Industrial Cyber-Physical Systems (ICPS) technologies are foundational in driving maritime autonomy, particularly for Unmanned Surface Vehicles (USVs). However, onboard computational constraints and communication latency significantly restrict real-time data processing, analysis, and predictive modeling, hence limiting the scalability and responsiveness of maritime ICPS. To overcome these challeng…

    Submitted 22 June, 2025; originally announced June 2025.

    Comments: 6 pages, 5 figures, accepted at the 23rd IEEE International Conference on Industrial Informatics (INDIN), July 12-15, 2025, Kunming, China

  10. arXiv:2506.17292  [pdf, ps, other]

    cs.CR cs.AI

    Theoretically Unmasking Inference Attacks Against LDP-Protected Clients in Federated Vision Models

    Authors: Quan Nguyen, Minh N. Vu, Truc Nguyen, My T. Thai

    Abstract: Federated Learning enables collaborative learning among clients via a coordinating server while avoiding direct data sharing, offering a perceived solution to preserve privacy. However, recent studies on Membership Inference Attacks (MIAs) have challenged this notion, showing high success rates against unprotected training data. While local differential privacy (LDP) is widely regarded as a gold s…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML 2025

  11. arXiv:2506.03763  [pdf, ps, other]

    cs.CL

    ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations

    Authors: Quang Hieu Pham, Thuy Duong Nguyen, Tung Pham, Anh Tuan Luu, Dat Quoc Nguyen

    Abstract: The capabilities of large language models (LLMs) have been enhanced by training on data that reflects human thought processes, such as the Chain-of-Thought format. However, evidence suggests that the conventional scheme of next-word prediction may not fully capture how humans learn to think. Inspired by how humans generalize mathematical reasoning, we propose a new approach named ClozeMath to fine…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted to ACL 2025 Findings

  12. arXiv:2506.02539  [pdf, ps, other]

    cs.LG

    VerificAgent: Integrating Expert Knowledge and Fact-Checked Memory for Robust Domain-Specific Task Planning

    Authors: Thong Q. Nguyen, Shubhang Desai, Yash Jain, Tanvir Aumi, Vishal Chowdhary

    Abstract: Continual memory augmentation allows computer-use agents (CUAs) to learn from past interactions and refine their task-solving strategies over time. However, unchecked memory accumulation can introduce spurious or hallucinated "learnings" that degrade agent performance, particularly in domain-specific workflows such as productivity software. We present a novel framework, VerificAgent, that effectiv…

    Submitted 3 June, 2025; originally announced June 2025.

    Journal ref: ICML Workshop on Computer Use Agents 2025

  13. arXiv:2506.01322  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Zero-Shot Text-to-Speech for Vietnamese

    Authors: Thi Vu, Linh The Nguyen, Dat Quoc Nguyen

    Abstract: This paper introduces PhoAudiobook, a newly curated dataset comprising 941 hours of high-quality audio for Vietnamese text-to-speech. Using PhoAudiobook, we conduct experiments on three leading zero-shot TTS models: VALL-E, VoiceCraft, and XTTS-V2. Our findings demonstrate that PhoAudiobook consistently enhances model performance across various metrics. Moreover, VALL-E and VoiceCraft exhibit supe…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: To appear in Proceedings of ACL 2025 (Main conference paper)

  14. arXiv:2506.01305  [pdf, ps, other]

    cs.CL

    VM14K: First Vietnamese Medical Benchmark

    Authors: Thong Nguyen, Duc Nguyen, Minh Dang, Thai Dao, Long Nguyen, Quan H. Nguyen, Dat Nguyen, Kien Tran, Minh Tran

    Abstract: Medical benchmarks are indispensable for evaluating the capabilities of language models in healthcare for non-English-speaking communities, thereby helping to ensure the quality of real-life applications. However, not every community has sufficient resources and standardized methods to effectively build and design such benchmarks, and available non-English medical data is normally fragmented and diffi…

    Submitted 13 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

  15. arXiv:2505.24472  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    VietMix: A Naturally Occurring Vietnamese-English Code-Mixed Corpus with Iterative Augmentation for Machine Translation

    Authors: Hieu Tran, Phuong-Anh Nguyen-Le, Huy Nghiem, Quang-Nhan Nguyen, Wei Ai, Marine Carpuat

    Abstract: Machine translation systems fail when processing code-mixed inputs for low-resource languages. We address this challenge by curating VietMix, a parallel corpus of naturally occurring code-mixed Vietnamese text paired with expert English translations. Augmenting this resource, we developed a complementary synthetic data generation pipeline. This pipeline incorporates filtering mechanisms to ensure…

    Submitted 30 May, 2025; originally announced May 2025.

  16. arXiv:2505.24068  [pdf, ps, other]

    cs.RO

    DiffCoTune: Differentiable Co-Tuning for Cross-domain Robot Control

    Authors: Lokesh Krishna, Sheng Cheng, Junheng Li, Naira Hovakimyan, Quan Nguyen

    Abstract: The deployment of robot controllers is hindered by modeling discrepancies due to necessary simplifications for computational tractability or inaccuracies in data-generating simulators. Such discrepancies typically require ad-hoc tuning to meet the desired performance, thereby ensuring successful transfer to a target domain. We propose a framework for automated, gradient-based tuning to enhance per…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 8 pages, 8 figures

  17. arXiv:2505.19080  [pdf, ps, other]

    cs.RO

    ReFineVLA: Reasoning-Aware Teacher-Guided Transfer Fine-Tuning

    Authors: Tuan Van Vo, Tan Quang Nguyen, Khang Minh Nguyen, Duy Ho Minh Nguyen, Minh Nhat Vu

    Abstract: Vision-Language-Action (VLA) models have gained much attention from the research community thanks to their strength in translating multimodal observations with linguistic instructions into robotic actions. Despite their recent advancements, VLAs often overlook the explicit reasoning and only learn the functional input-action mappings, omitting these crucial logical steps for interpretability and g…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: 10 pages

  18. arXiv:2505.19054  [pdf, ps, other]

    cs.LG

    Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning

    Authors: Zhuochen Liu, Rahul Jain, Quan Nguyen

    Abstract: Recent advancements in reinforcement learning (RL) have leveraged neural networks to achieve state-of-the-art performance across various control tasks. However, these successes often come at the cost of significant computational resources, as training deep neural networks requires substantial time and data. In this paper, we introduce an actor-critic algorithm that utilizes randomized neural netwo…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: 8 pages main, 12 pages total, 6 figures

  19. arXiv:2505.15009  [pdf, ps, other]

    cs.LG cs.AI

    One-Layer Transformers are Provably Optimal for In-context Reasoning and Distributional Association Learning in Next-Token Prediction Tasks

    Authors: Quan Nguyen, Thanh Nguyen-Tang

    Abstract: We study the approximation capabilities and on-convergence behaviors of one-layer transformers on the noiseless and noisy in-context reasoning of next-token prediction. Existing theoretical results focus on understanding the in-context reasoning behaviors for either the first gradient step or when the number of samples is infinite. Furthermore, no convergence rates nor generalization abilities wer…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 27 pages

  20. arXiv:2505.13944  [pdf, ps, other]

    cs.CL

    Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting

    Authors: Bao-Ngoc Dao, Quang Nguyen, Luyen Ngo Dinh, Minh Le, Nam Le, Linh Ngo Van

    Abstract: Memory-based approaches have shown strong performance in Continual Relation Extraction (CRE). However, storing examples from previous tasks increases memory usage and raises privacy concerns. Recently, prompt-based methods have emerged as a promising alternative, as they do not rely on storing past samples. Despite this progress, current prompt-based techniques face several core challenges in CRE,…

    Submitted 20 May, 2025; originally announced May 2025.

  21. arXiv:2505.11014  [pdf, ps, other]

    stat.ME cs.LG econ.EM

    A Cautionary Tale on Integrating Studies with Disparate Outcome Measures for Causal Inference

    Authors: Harsh Parikh, Trang Quynh Nguyen, Elizabeth A. Stuart, Kara E. Rudolph, Caleb H. Miles

    Abstract: Data integration approaches are increasingly used to enhance the efficiency and generalizability of studies. However, a key limitation of these methods is the assumption that outcome measures are identical across datasets -- an assumption that often does not hold in practice. Consider the following opioid use disorder (OUD) studies: the XBOT trial and the POAT study, both evaluating the effect of…

    Submitted 16 May, 2025; originally announced May 2025.

  22. arXiv:2505.09007  [pdf, ps, other]

    cs.IT

    Vendi Information Gain: An Alternative To Mutual Information For Science And Machine Learning

    Authors: Quan Nguyen, Adji Bousso Dieng

    Abstract: In his 1948 seminal paper A Mathematical Theory of Communication that birthed information theory, Claude Shannon introduced mutual information (MI), which he called "rate of transmission", as a way to quantify information gain (IG) and defined it as the difference between the marginal and conditional entropy of a random variable. While MI has become a standard tool in science and engineering, it h…

    Submitted 16 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

  23. arXiv:2505.08662  [pdf, ps, other]

    cs.CL cs.LG econ.GN

    Revealing economic facts: LLMs know more than they say

    Authors: Marcus Buckmann, Quynh Anh Nguyen, Edward Hill

    Abstract: We investigate whether the hidden states of large language models (LLMs) can be used to estimate and impute economic and financial statistics. Focusing on county-level (e.g. unemployment) and firm-level (e.g. total assets) variables, we show that a simple linear model trained on the hidden states of open-source LLMs outperforms the models' text outputs. This suggests that hidden states capture ric…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 34 pages, 17 figures

    ACM Class: I.2.7

  24. arXiv:2505.07689  [pdf, ps, other]

    cs.CV

    Anatomical Attention Alignment representation for Radiology Report Generation

    Authors: Quang Vinh Nguyen, Minh Duc Nguyen, Thanh Hoang Son Vo, Hyung-Jeong Yang, Soo-Hyung Kim

    Abstract: Automated Radiology report generation (RRG) aims at producing detailed descriptions of medical images, reducing radiologists' workload and improving access to high-quality diagnostic services. Existing encoder-decoder models only rely on visual features extracted from raw input images, which can limit the understanding of spatial structures and semantic relationships, often resulting in suboptimal…

    Submitted 12 May, 2025; originally announced May 2025.

  25. arXiv:2505.05736  [pdf]

    q-bio.QM cs.CL cs.CV cs.LG

    Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications

    Authors: Da Wu, Zhanliang Wang, Quan Nguyen, Zhuoran Xu, Kai Wang

    Abstract: The scarcity of high-quality multimodal biomedical data limits the ability to effectively fine-tune pretrained Large Language Models (LLMs) for specialized biomedical tasks. To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through pref…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: First Draft

  26. arXiv:2504.21214  [pdf, other]

    cs.CL cs.AI eess.AS

    Pretraining Large Brain Language Model for Active BCI: Silent Speech

    Authors: Jinzhao Zhou, Zehong Cao, Yiqun Duan, Connor Barkley, Daniel Leong, Xiaowei Jiang, Quoc-Toan Nguyen, Ziyi Zhao, Thomas Do, Yu-Cheng Chang, Sheng-Fu Liang, Chin-teng Lin

    Abstract: This paper explores silent speech decoding in active brain-computer interface (BCI) systems, which offer more natural and flexible communication than traditional BCI applications. We collected a new silent speech dataset of over 120 hours of electroencephalogram (EEG) recordings from 12 subjects, capturing 24 commonly used English words for language model pretraining and decoding. Following the re…

    Submitted 3 May, 2025; v1 submitted 29 April, 2025; originally announced April 2025.

  27. arXiv:2504.17701  [pdf, other]

    cs.SI cond-mat.stat-mech physics.data-an

    Network Sampling: An Overview and Comparative Analysis

    Authors: Quoc Chuong Nguyen

    Abstract: Network sampling is a crucial technique for analyzing large or partially observable networks. However, the effectiveness of different sampling methods can vary significantly depending on the context. In this study, we empirically compare representative methods from three main categories: node-based, edge-based, and exploration-based sampling. We used two real-world datasets for our analysis: a sci…

    Submitted 2 May, 2025; v1 submitted 24 April, 2025; originally announced April 2025.

    Comments: 11 pages, 7 figures, 2 tables

  28. arXiv:2504.09876  [pdf, other]

    cs.CV cs.AI

    HDC: Hierarchical Distillation for Multi-level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation

    Authors: Tran Quoc Khanh Le, Nguyen Lan Vi Vu, Ha-Hieu Pham, Xuan-Loc Huynh, Tien-Huy Nguyen, Minh Huu Nhat Le, Quan Nguyen, Hien D. Nguyen

    Abstract: Transvaginal ultrasound is a critical imaging modality for evaluating cervical anatomy and detecting physiological changes. However, accurate segmentation of cervical structures remains challenging due to low contrast, shadow artifacts, and indistinct boundaries. While convolutional neural networks (CNNs) have demonstrated efficacy in medical image segmentation, their reliance on large-scale annot…

    Submitted 16 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

  29. arXiv:2504.09797  [pdf, ps, other]

    cs.CV cs.AI

    IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework under Limited Annotation Scheme

    Authors: Dinh Dai Quan Tran, Hoang-Thien Nguyen, Thanh-Huy Nguyen, Gia-Van To, Tien-Huy Nguyen, Quan Nguyen

    Abstract: Semi-Supervised Semantic Segmentation (SSSS) aims to improve segmentation accuracy by leveraging a small set of labeled images alongside a larger pool of unlabeled data. Recent advances primarily focus on pseudo-labeling, consistency regularization, and co-training strategies. However, existing methods struggle to balance global semantic representation with fine-grained local feature extraction. T…

    Submitted 23 May, 2025; v1 submitted 13 April, 2025; originally announced April 2025.

    Comments: 10 pages, 5 figures

  30. arXiv:2504.09298  [pdf, other]

    cs.CV

    A Lightweight Moment Retrieval System with Global Re-Ranking and Robust Adaptive Bidirectional Temporal Search

    Authors: Tinh-Anh Nguyen-Nhu, Huu-Loc Tran, Nguyen-Khang Le, Minh-Nhat Nguyen, Tien-Huy Nguyen, Hoang-Long Nguyen-Huu, Huu-Phong Phan-Nguyen, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

    Abstract: The exponential growth of digital video content has posed critical challenges in moment-level video retrieval, where existing methodologies struggle to efficiently localize specific segments within an expansive video corpus. Current retrieval systems are constrained by computational inefficiencies, temporal context limitations, and the intrinsic complexity of navigating video content. In this pape…

    Submitted 12 April, 2025; originally announced April 2025.

  31. arXiv:2504.09297  [pdf, other]

    cs.CV

    Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection

    Authors: Huu-Phong Phan-Nguyen, Anh Dao, Tien-Huy Nguyen, Tuan Quang, Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

    Abstract: Nowadays, smartphones are ubiquitous, and almost everyone owns one. At the same time, the rapid development of AI has spurred extensive research on applying deep learning techniques to image classification. However, due to the limited resources available on mobile devices, significant challenges remain in balancing accuracy with computational efficiency. In this paper, we propose a novel training…

    Submitted 12 April, 2025; originally announced April 2025.

  32. arXiv:2504.08384  [pdf, other]

    cs.CV

    Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking

    Authors: Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huu-Phong Phan-Nguyen, Tien-Huy Nguyen, Nhat-Minh Nguyen-Dich, Anh Dao, Huy-Duc Do, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

    Abstract: Long-form video understanding presents significant challenges for interactive retrieval systems, as conventional methods struggle to process extensive video content efficiently. Existing approaches often rely on single models, inefficient storage, unstable temporal search, and context-agnostic reranking, limiting their effectiveness. This paper presents a novel framework to enhance interactive vid…

    Submitted 11 April, 2025; originally announced April 2025.

  33. arXiv:2503.24301  [pdf, other]

    cs.ET

    QUADRO: A Hybrid Quantum Optimization Framework for Drone Delivery

    Authors: James B. Holliday, Darren Blount, Hoang Quan Nguyen, Samee U. Khan, Khoa Luu

    Abstract: Quantum computing holds transformative potential for optimizing large-scale drone fleet operations, yet its near-term limitations necessitate hybrid approaches blending classical and quantum techniques. This work introduces Quantum Unmanned Aerial Delivery Routing Optimization (QUADRO), a novel hybrid framework addressing the Energy-Constrained Capacitated Unmanned Aerial Vehicle Routing Problem a…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: submitted to QCE 2025

  34. arXiv:2503.19149  [pdf, other]

    cs.LG cs.CV

    Out-of-distribution evaluations of channel agnostic masked autoencoders in fluorescence microscopy

    Authors: Christian John Hurry, Jinjie Zhang, Olubukola Ishola, Emma Slade, Cuong Q. Nguyen

    Abstract: Developing computer vision for high-content screening is challenging due to various sources of distribution-shift caused by changes in experimental conditions, perturbagens, and fluorescent markers. The impacts of different sources of distribution-shift are confounded in typical evaluations of models based on transfer learning, which limits interpretations of how changes to model design and trainin…

    Submitted 24 March, 2025; originally announced March 2025.

    Comments: 13 pages, 5 figures

  35. arXiv:2503.16510  [pdf, other]

    cs.HC cs.CY

    Combating the Effects of Cyber-Psychosis: Using Object Security to Facilitate Critical Thinking

    Authors: Robert H. Thomson, Quan Nguyen, Essien Ayanam, Matthew Canham, Thomas C. Schmidt, Matthias Wählisch, Eric Osterweil

    Abstract: Humanity is currently facing an existential crisis about the nature of truth and reality driven by the availability of information online which overloads and overwhelms our cognitive capabilities, which we call Cyber-Psychosis. The results of this Cyber-Psychosis include the decline of critical thinking coupled with deceptive influences on the Internet which have become so prolific that they are c…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: 13 pages, 3 figures, under submission

    ACM Class: K.4.0; J.4; C.2.2

  36. arXiv:2503.12286  [pdf]

    cs.CL cs.AI q-bio.GN q-bio.QM

    Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

    Authors: Da Wu, Zhanliang Wang, Quan Nguyen, Kai Wang

    Abstract: Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. These studies typically use Human Phenotype Ontology (HPO) terms to prompt foundation models like GPT and LLaMA to predict candidate genes. However, in real-world settings, foundation models are not optimized for domain-specific tasks like clinical diagnosis, yet…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: 31 pages, 3 figures

  37. arXiv:2503.10530  [pdf, ps, other]

    cs.CV cs.AI

    Lightweight Models for Emotional Analysis in Video

    Authors: Quoc-Tien Nguyen, Hong-Hai Nguyen, Van-Thong Huynh

    Abstract: In this study, we present an approach for efficient spatiotemporal feature extraction using MobileNetV4 and a multi-scale 3D MLP-Mixer-based temporal aggregation module. MobileNetV4, with its Universal Inverted Bottleneck (UIB) blocks, serves as the backbone for extracting hierarchical feature representations from input image sequences, ensuring both computational efficiency and rich semantic enco…

    Submitted 24 March, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

    Comments: https://github.com/PRVSL/abaw-8th

  38. arXiv:2503.09707  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Revisiting semi-supervised learning in the era of foundation models

    Authors: Ping Zhang, Zheda Mai, Quang-Huy Nguyen, Wei-Lun Chao

    Abstract: Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. As vision foundation models (VFMs) increasingly serve as the backbone of vision applications, it remains unclear how SSL interacts with these pre-trained models. To address this gap, we develop new SSL benchmark datasets where frozen VFMs underperform and systematically evaluate rep…

    Submitted 7 June, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

  39. arXiv:2503.06796  [pdf, other]

    cs.RO

    RoboDesign1M: A Large-scale Dataset for Robot Design Understanding

    Authors: Tri Le, Toan Nguyen, Quang Tran, Quang Nguyen, Baoru Huang, Hoan Nguyen, Minh Nhat Vu, Tung D. Ta, Anh Nguyen

    Abstract: Robot design is a complex and time-consuming process that requires specialized expertise. Gaining a deeper understanding of robot design data can enable various applications, including automated design generation, retrieving example designs from text, and developing AI-powered design assistants. While recent advancements in foundation models present promising approaches to addressing these challen…

    Submitted 9 March, 2025; originally announced March 2025.

    Comments: 8 pages

  40. BERT-based model for Vietnamese Fact Verification Dataset

    Authors: Bao Tran, T. N. Khanh, Khang Nguyen Tuong, Thien Dang, Quang Nguyen, Nguyen T. Thinh, Vo T. Hung

    Abstract: The rapid advancement of information and communication technology has facilitated easier access to information. However, this progress has also necessitated more stringent verification measures to ensure the accuracy of information, particularly within the context of Vietnam. This paper introduces an approach to address the challenges of Fact Verification using the Vietnamese dataset by integratin…

    Submitted 1 March, 2025; originally announced March 2025.

    Comments: accepted for Oral Presentation in CITA 2024 (The 13th Conference on Information Technology and Its Applications) and will be published in VOLUME 1 OF CITA 2024 (Volume of the Lecture Notes in Network and Systems, Springer)

    Journal ref: CITA 2024, LNNS, vol. 882, Springer, 2024

  41. arXiv:2502.18821  [pdf, other]

    cs.LG

    CAMEx: Curvature-aware Merging of Experts

    Authors: Dung V. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, Rachel S. Y. Teo, Tan M. Nguyen, Linh Duy Tran

    Abstract: Existing methods for merging experts during model training and fine-tuning predominantly rely on Euclidean geometry, which assumes a flat parameter space. This assumption can limit the model's generalization ability, especially during the pre-training phase, where the parameter manifold might exhibit more complex curvature. Curvature-aware merging methods typically require additional information a…

    Submitted 3 March, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: 10 pages, 5 Figures, 7 Tables. Published at ICLR 2025

  42. arXiv:2502.16747  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    SQLong: Enhanced NL2SQL for Longer Contexts with LLMs

    Authors: Dai Quoc Nguyen, Cong Duy Vu Hoang, Duy Vu, Gioacchino Tangari, Thanh Tien Vu, Don Dharmasiri, Yuan-Fang Li, Long Duong

    Abstract: Open-weight large language models (LLMs) have significantly advanced performance in the Natural Language to SQL (NL2SQL) task. However, their effectiveness diminishes when dealing with large database schemas, as the context length increases. To address this limitation, we present SQLong, a novel and efficient data augmentation framework designed to enhance LLM performance in long-context scenarios…

    Submitted 20 May, 2025; v1 submitted 23 February, 2025; originally announced February 2025.

    Comments: Accepted to Table Representation Learning Workshop at ACL 2025

  43. arXiv:2502.16152  [pdf, other]

    cs.LG cs.GT

    DUPRE: Data Utility Prediction for Efficient Data Valuation

    Authors: Kieu Thao Nguyen Pham, Rachael Hwee Ling Sim, Quoc Phong Nguyen, See Kiong Ng, Bryan Kian Hsiang Low

    Abstract: Data valuation is increasingly used in machine learning (ML) to decide the fair compensation for data owners and identify valuable or harmful data for improving ML models. Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility (e.g., validation accuracy) and retraining the ML model for multiple data subsets. While most existing works on efficient e…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: 16 pages, 7 figures; accepted at AAMAS 2025

  44. arXiv:2502.12982  [pdf, other]

    cs.CL cs.AI cs.LG

    Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

    Authors: Longxu Dou, Qian Liu, Fan Zhou, Changyu Chen, Zili Wang, Ziqi Jin, Zichen Liu, Tongyao Zhu, Cunxiao Du, Penghui Yang, Haonan Wang, Jiaheng Liu, Yongchi Zhao, Xiachong Feng, Xin Mao, Man Tsung Yeung, Kunat Pipatanakul, Fajri Koto, Min Si Thu, Hynek Kydlíček, Zeyi Liu, Qunshu Lin, Sittipong Sripaisarnmongkol, Kridtaphad Sae-Khow, Nirattisai Thongchim, et al. (16 additional authors not shown)

    Abstract: Sailor2 is a family of cutting-edge multilingual language models for South-East Asian (SEA) languages, available in 1B, 8B, and 20B sizes to suit diverse applications. Building on Qwen2.5, Sailor2 undergoes continuous pre-training on 500B tokens (400B SEA-specific and 100B replay tokens) to support 13 SEA languages while retaining proficiency in Chinese and English. Sailor2-20B model achieves a 50…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: 49 pages, 16 figures. Technical Report of Sailor2: https://sea-sailor.github.io/blog/sailor2/

  45. arXiv:2502.12175  [pdf, other]

    cs.LG cs.AI

    Spatiotemporal Graph Neural Networks in short term load forecasting: Does adding Graph Structure in Consumption Data Improve Predictions?

    Authors: Quoc Viet Nguyen, Joaquin Delgado Fernandez, Sergio Potenciano Menci

    Abstract: Short term Load Forecasting (STLF) plays an important role in traditional and modern power systems. Most STLF models predominantly exploit temporal dependencies from historical data to predict future consumption. Nowadays, with the widespread deployment of smart meters, their data can contain spatiotemporal dependencies. In particular, their consumption data is not only correlated to historical va…

    Submitted 13 February, 2025; originally announced February 2025.

    Comments: 13 pages, conference

  46. arXiv:2502.08326  [pdf, other]

    cs.LG cs.DB cs.DS cs.IR

    Model-Free Counterfactual Subset Selection at Scale

    Authors: Minh Hieu Nguyen, Viet Hung Doan, Anh Tuan Nguyen, Jun Jo, Quoc Viet Hung Nguyen

    Abstract: Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical const…

    Submitted 12 February, 2025; originally announced February 2025.

  47. arXiv:2502.08143  [pdf, ps, other]

    cs.LG

    Data-dependent Bounds with $T$-Optimal Best-of-Both-Worlds Guarantees in Multi-Armed Bandits using Stability-Penalty Matching

    Authors: Quan Nguyen, Shinji Ito, Junpei Komiyama, Nishant A. Mehta

    Abstract: Existing data-dependent and best-of-both-worlds regret bounds for multi-armed bandit problems have limited adaptivity, as they are either data-dependent but not best-of-both-worlds (BOBW), BOBW but not data-dependent, or have a sub-optimal $O(\sqrt{T\ln{T}})$ worst-case guarantee in the adversarial regime. To overcome these limitations, we propose real-time stability-penalty matching (SPM), a new met…

    Submitted 12 February, 2025; originally announced February 2025.

  48. arXiv:2502.07409  [pdf, other]

    cs.CV cs.LG

    MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification

    Authors: Anh-Tien Nguyen, Duy Minh Ho Nguyen, Nghiem Tuong Diep, Trung Quoc Nguyen, Nhat Ho, Jacqueline Michelle Metsch, Miriam Cindy Maurer, Daniel Sonntag, Hanibal Bohnenberger, Anne-Christin Hauschild

    Abstract: Whole slide pathology image classification presents challenges due to gigapixel image sizes and limited annotation labels, hindering model generalization. This paper introduces a prompt learning method to adapt large vision-language models for few-shot pathology classification. We first extend the Prov-GigaPath vision foundation model, pre-trained on 1.3 billion pathology image tiles, into a visio…

    Submitted 14 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  49. arXiv:2502.07188  [pdf, other]

    cs.CL

    A Large-Scale Benchmark for Vietnamese Sentence Paraphrases

    Authors: Sang Quang Nguyen, Kiet Van Nguyen

    Abstract: This paper presents ViSP, a high-quality Vietnamese dataset for sentence paraphrasing, consisting of 1.2M original-paraphrase pairs collected from various domains. The dataset was constructed using a hybrid approach that combines automatic paraphrase generation with manual evaluation to ensure high quality. We conducted experiments using methods such as back-translation, EDA, and baseline models l…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: Accepted in NAACL 2025 Findings

  50. arXiv:2502.06907  [pdf, other]

    cs.LG cs.AI

    Can ChatGPT Diagnose Alzheimer's Disease?

    Authors: Quoc-Toan Nguyen, Linh Le, Xuan-The Tran, Thomas Do, Chin-Teng Lin

    Abstract: Can ChatGPT diagnose Alzheimer's Disease (AD)? AD is a devastating neurodegenerative condition that affects approximately 1 in 9 individuals aged 65 and older, profoundly impairing memory and cognitive function. This paper utilises 9300 electronic health records (EHRs) with data from Magnetic Resonance Imaging (MRI) and cognitive tests to address an intriguing question: As a general-purpose task s…

    Submitted 9 February, 2025; originally announced February 2025.

    Comments: 14 pages, 5 figures, 5 tables