
Showing 1–50 of 499 results for author: Nguyen, M

Searching in archive cs. Search in all archives.
  1. arXiv:2505.10207  [pdf, other]

    cs.DM

    How to Color Temporal Graphs to Ensure Proper Transitions

    Authors: Allen Ibiapina, Minh Hang Nguyen, Mikaël Rabie, Cléophée Robin

    Abstract: Graph Coloring consists in assigning colors to vertices ensuring that two adjacent vertices do not have the same color. In dynamic graphs, this notion is not well defined, as we need to decide if different colors for adjacent vertices must happen all the time or not, and how to go from a coloring in one time to the next one. In this paper, we define a coloring notion for Temporal Graphs where at… ▽ More

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 20 pages, 9 figures

  2. arXiv:2505.09114  [pdf, ps, other]

    cs.AI cs.LG

    Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

    Authors: Minh Hoang Nguyen, Linh Le Pham Van, Thommen George Karimpanal, Sunil Gupta, Hung Le

    Abstract: Decision Transformers (DT) play a crucial role in modern reinforcement learning, leveraging offline datasets to achieve impressive results across various domains. However, DT requires high-quality, comprehensive data to perform optimally. In real-world applications, the lack of training data and the scarcity of optimal behaviours make training on offline datasets challenging, as suboptimal data ca… ▽ More

    Submitted 13 May, 2025; originally announced May 2025.

  3. Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting

    Authors: Minh-Duc Nguyen, Hyung-Jeong Yang, Soo-Hyung Kim, Ji-Eun Shin, Seung-Won Kim

    Abstract: The dyadic reaction generation task involves synthesizing responsive facial reactions that align closely with the behaviors of a conversational partner, enhancing the naturalness and effectiveness of human-like interaction simulations. This paper introduces a novel approach, the Latent Behavior Diffusion Model, comprising a context-aware autoencoder and a diffusion-based conditional generator that… ▽ More

    Submitted 12 May, 2025; originally announced May 2025.

    Journal ref: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15325. Springer, Cham

  4. arXiv:2505.07689  [pdf, ps, other]

    cs.CV

    Anatomical Attention Alignment representation for Radiology Report Generation

    Authors: Quang Vinh Nguyen, Minh Duc Nguyen, Thanh Hoang Son Vo, Hyung-Jeong Yang, Soo-Hyung Kim

    Abstract: Automated Radiology report generation (RRG) aims at producing detailed descriptions of medical images, reducing radiologists' workload and improving access to high-quality diagnostic services. Existing encoder-decoder models only rely on visual features extracted from raw input images, which can limit the understanding of spatial structures and semantic relationships, often resulting in suboptimal… ▽ More

    Submitted 12 May, 2025; originally announced May 2025.

  5. arXiv:2505.07416  [pdf, ps, other]

    cs.CL

    ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation

    Authors: Truc Mai-Thanh Nguyen, Dat Minh Nguyen, Son T. Luu, Kiet Van Nguyen

    Abstract: Multimodal Review Helpfulness Prediction (MRHP) is an essential task in recommender systems, particularly in E-commerce platforms. Determining the helpfulness of user-generated reviews enhances user experience and improves consumer decision-making. However, existing datasets focus predominantly on English and Indonesian, resulting in a lack of linguistic diversity, especially for low-resource lang… ▽ More

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted at NLDB 2025

  6. arXiv:2505.06874  [pdf]

    cs.LG cs.AI

    Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers

    Authors: Thanh Son Nguyen, Van Thanh Nguyen, Dang Minh Duc Nguyen

    Abstract: Time series forecasting has attracted significant attention, leading to the development of a wide range of approaches, from traditional statistical methods to advanced deep learning models. Among them, the Auto-Regressive Integrated Moving Average (ARIMA) model remains a widely adopted linear technique due to its effectiveness in modeling temporal dependencies in economic, industrial, and social… ▽ More

    Submitted 11 May, 2025; originally announced May 2025.

  7. arXiv:2505.03770  [pdf, other]

    cs.AI

    Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

    Authors: Mouad Abrini, Omri Abend, Dina Acklin, Henny Admoni, Gregor Aichinger, Nitay Alon, Zahra Ashktorab, Ashish Atreja, Moises Auron, Alexander Aufreiter, Raghav Awasthi, Soumya Banerjee, Joe M. Barnby, Rhea Basappa, Severin Bergsmann, Djallel Bouneffouf, Patrick Callaghan, Marc Cavazza, Thierry Chaminade, Sonia Chernova, Mohamed Chetouan, Moumita Choudhury, Axel Cleeremans, Jacek B. Cywinski, Fabio Cuzzolin, et al. (83 additional authors not shown)

    Abstract: This volume includes a selection of papers presented at the Workshop on Advancing Artificial Intelligence through Theory of Mind held at AAAI 2025 in Philadelphia US on 3rd March 2025. The purpose of this volume is to provide an open access and curated anthology for the ToM and AI research community.

    Submitted 28 April, 2025; originally announced May 2025.

    Comments: workshop proceedings

  8. arXiv:2505.03445  [pdf, other]

    cs.CV

    Polar Coordinate-Based 2D Pose Prior with Neural Distance Field

    Authors: Qi Gan, Sao Mai Nguyen, Eric Fenaux, Stephan Clémençon, Mounîm El Yacoubi

    Abstract: Human pose capture is essential for sports analysis, enabling precise evaluation of athletes' movements. While deep learning-based human pose estimation (HPE) models from RGB videos have achieved impressive performance on public datasets, their effectiveness in real-world sports scenarios is often hindered by motion blur, occlusions, and domain shifts across different pose representations. Fine-tu… ▽ More

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: This paper is accepted by CVPRW 2025

  9. arXiv:2505.02974  [pdf, other]

    cs.LG

    Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning

    Authors: Fabien Casenave, Xavier Roynard, Brian Staber, William Piat, Michele Alessandro Bucci, Nissrine Akkari, Abbas Kabalan, Xuan Minh Vuong Nguyen, Luca Saverio, Raphaël Carpintero Perez, Anthony Kalaydjian, Samy Fouché, Thierry Gonon, Ghassan Najjar, Emmanuel Menier, Matthieu Nastorg, Giovanni Catalani, Christian Rey

    Abstract: Machine learning-based surrogate models have emerged as a powerful tool to accelerate simulation-driven scientific workflows. However, their widespread adoption is hindered by the lack of large-scale, diverse, and standardized datasets tailored to physics-based simulations. While existing initiatives provide valuable contributions, many are limited in scope, focusing on specific physics domains, re… ▽ More

    Submitted 8 May, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

  10. arXiv:2505.02508  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces

    Authors: Yang Lyu, Yuchun Qian, Tan Minh Nguyen, Xin T. Tong

    Abstract: Diffusion models are a popular computational tool for generating new data samples. They use a forward diffusion process that adds noise to the data distribution and then a reverse process that removes the noise to produce samples from the data distribution. However, when the empirical data distribution consists of $n$ data points, using the empirical diffusion model will necessarily produce one of the… ▽ More

    Submitted 6 May, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

  11. arXiv:2505.01163  [pdf]

    cs.LG

    Empirical Comparison of Lightweight Forecasting Models for Seasonal and Non-Seasonal Time Series

    Authors: Thanh Son Nguyen, Dang Minh Duc Nguyen, Van Thanh Nguyen

    Abstract: Accurate time series forecasting is essential in many real-time applications that demand both high predictive accuracy and computational efficiency. This study provides an empirical comparison between a Polynomial Classifier and a Radial Basis Function Neural Network (RBFNN) across four real-world time series datasets (weather conditions, gold prices, crude oil prices, and beer production volumes)… ▽ More

    Submitted 2 May, 2025; originally announced May 2025.

  12. arXiv:2505.00968  [pdf, ps, other]

    cs.LG cs.AI

    Tree-Sliced Wasserstein Distance with Nonlinear Projection

    Authors: Thanh Tran, Viet-Hoang Tran, Thanh Chu, Trang Pham, Laurent El Ghaoui, Tam Le, Tan M. Nguyen

    Abstract: Tree-Sliced methods have recently emerged as an alternative to the traditional Sliced Wasserstein (SW) distance, replacing one-dimensional lines with tree-based metric spaces and incorporating a splitting mechanism for projecting measures. This approach enhances the ability to capture the topological structures of integration domains in Sliced Optimal Transport while maintaining low computational… ▽ More

    Submitted 1 May, 2025; originally announced May 2025.

    Comments: Accepted at ICML 2025

  13. arXiv:2504.20452  [pdf, other]

    cs.IR cs.AI

    Enhancing News Recommendation with Hierarchical LLM Prompting

    Authors: Hai-Dang Kieu, Delvin Ce Zhang, Minh Duc Nguyen, Min Xu, Qiang Wu, Dung D. Le

    Abstract: Personalized news recommendation systems often struggle to effectively capture the complexity of user preferences, as they rely heavily on shallow representations, such as article titles and abstracts. To address this problem, we introduce a novel method, namely PNR-LLM, for Large Language Models for Personalized News Recommendation. Specifically, PNR-LLM harnesses the generation capabilities of L… ▽ More

    Submitted 29 April, 2025; originally announced April 2025.

  14. arXiv:2504.20073  [pdf, other]

    cs.LG cs.AI cs.CL

    RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

    Authors: Zihan Wang, Kangrui Wang, Qineng Wang, Pingyue Zhang, Linjie Li, Zhengyuan Yang, Kefan Yu, Minh Nhat Nguyen, Licheng Liu, Eli Gottlieb, Monica Lam, Yiping Lu, Kyunghyun Cho, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li

    Abstract: Training large language models (LLMs) as interactive agents presents unique challenges including long-horizon decision making and interacting with stochastic environment feedback. While reinforcement learning (RL) has enabled progress in static tasks, multi-turn agent RL training remains underexplored. We propose StarPO (State-Thinking-Actions-Reward Policy Optimization), a general framework for t… ▽ More

    Submitted 24 April, 2025; originally announced April 2025.

  15. arXiv:2504.17787  [pdf, other]

    cs.CV

    The Fourth Monocular Depth Estimation Challenge

    Authors: Anton Obukhov, Matteo Poggi, Fabio Tosi, Ripudaman Singh Arora, Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden, Shuaihang Wang, Zhenxin Ma, Weijie Chen, Baobei Xu, Fengyu Sun, Di Xie, Jiang Zhu, Mykola Lavreniuk, Haining Guan, Qun Wu, Yupei Zeng, Chao Lu, Huanran Wang, Guangyuan Zhou, Haotian Zhang, Jianxiong Wang, Qiang Rao, et al. (32 additional authors not shown)

    Abstract: This paper presents the results of the fourth edition of the Monocular Depth Estimation Challenge (MDEC), which focuses on zero-shot generalization to the SYNS-Patches benchmark, a dataset featuring challenging environments in both natural and indoor settings. In this edition, we revised the evaluation protocol to use least-squares alignment with two degrees of freedom to support disparity and aff… ▽ More

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: To appear in CVPRW2025

  16. arXiv:2504.14898  [pdf, other]

    stat.ML cs.LG

    Expected Free Energy-based Planning as Variational Inference

    Authors: Bert de Vries, Wouter Nuijten, Thijs van de Laar, Wouter Kouw, Sepideh Adamiat, Tim Nisslbeck, Mykola Lukashchuk, Hoang Minh Huu Nguyen, Marco Hidalgo Araya, Raphael Tresor, Thijs Jenneskens, Ivana Nikoloska, Raaja Ganapathy Subramanian, Bart van Erp, Dmitry Bagaev, Albert Podusenko

    Abstract: We address the problem of planning under uncertainty, where an agent must choose actions that not only achieve desired outcomes but also reduce uncertainty. Traditional methods often treat exploration and exploitation as separate objectives, lacking a unified inferential foundation. Active inference, grounded in the Free Energy Principle, provides such a foundation by minimizing Expected Free Ener… ▽ More

    Submitted 23 April, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: 18 pages

  17. arXiv:2504.13866  [pdf, other]

    cs.HC cs.AI cs.CV cs.RO

    Skeleton-Based Transformer for Classification of Errors and Better Feedback in Low Back Pain Physical Rehabilitation Exercises

    Authors: Aleksa Marusic, Sao Mai Nguyen, Adriana Tapus

    Abstract: Physical rehabilitation exercises suggested by healthcare professionals can help recovery from various musculoskeletal disorders and prevent re-injury. However, patients' engagement tends to decrease over time without direct supervision, which is why there is a need for an automated monitoring system. In recent years, there has been great progress in quality assessment of physical rehabilitation e… ▽ More

    Submitted 28 March, 2025; originally announced April 2025.

    Comments: ICORR 2025 - 19th IEEE/RAS-EMBS International Conference on Rehabilitation Robotics, INTERNATIONAL CONSORTIUM FOR REHABILITATION ROBOTICS, May 2025, Michigan, USA

  18. Multi-goal Rapidly Exploring Random Tree with Safety and Dynamic Constraints for UAV Cooperative Path Planning

    Authors: Thu Hang Khuat, Duy-Nam Bui, Hoa TT. Nguyen, Mien L. Trinh, Minh T. Nguyen, Manh Duong Phung

    Abstract: Cooperative path planning is gaining importance due to the increasing demand for using multiple unmanned aerial vehicles (UAVs) for complex missions. This work addresses the problem by introducing a new algorithm named MultiRRT that extends the rapidly exploring random tree (RRT) to generate paths for a group of UAVs to reach multiple goal locations at the same time. We first derive the dynamic… ▽ More

    Submitted 16 April, 2025; originally announced April 2025.

    Journal ref: IEEE Transactions on Vehicular Technology, 2025

  19. arXiv:2504.10512  [pdf, other]

    cs.IR cs.AI cs.CL

    JEPA4Rec: Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture

    Authors: Minh-Anh Nguyen, Dung D. Le

    Abstract: Language representation learning has emerged as a promising approach for sequential recommendation, thanks to its ability to learn generalizable representations. However, despite its advantages, this approach still struggles with data sparsity and a limited understanding of common-sense user preferences. To address these limitations, we propose $\textbf{JEPA4Rec}$, a framework that combines… ▽ More

    Submitted 9 April, 2025; originally announced April 2025.

  20. arXiv:2504.09298  [pdf, other]

    cs.CV

    A Lightweight Moment Retrieval System with Global Re-Ranking and Robust Adaptive Bidirectional Temporal Search

    Authors: Tinh-Anh Nguyen-Nhu, Huu-Loc Tran, Nguyen-Khang Le, Minh-Nhat Nguyen, Tien-Huy Nguyen, Hoang-Long Nguyen-Huu, Huu-Phong Phan-Nguyen, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

    Abstract: The exponential growth of digital video content has posed critical challenges in moment-level video retrieval, where existing methodologies struggle to efficiently localize specific segments within an expansive video corpus. Current retrieval systems are constrained by computational inefficiencies, temporal context limitations, and the intrinsic complexity of navigating video content. In this pape… ▽ More

    Submitted 12 April, 2025; originally announced April 2025.

  21. arXiv:2504.07655  [pdf, other]

    cs.AI cs.CY

    Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents

    Authors: Manh Hung Nguyen, Victor-Alexandru Pădurean, Alkis Gotovos, Sebastian Tschiatschek, Adish Singla

    Abstract: Generative AI is transforming computing education by enabling the automatic generation of personalized content and feedback. We investigate its capabilities in providing high-quality programming tasks to students. Despite promising advancements in task generation, a quality gap remains between AI-generated and expert-created tasks. The AI-generated tasks may not align with target programming conce… ▽ More

    Submitted 10 April, 2025; originally announced April 2025.

    Comments: AIED'25 paper

  22. arXiv:2504.02283  [pdf]

    cs.LG

    Ga$_2$O$_3$ TCAD Mobility Parameter Calibration using Simulation Augmented Machine Learning with Physics Informed Neural Network

    Authors: Le Minh Long Nguyen, Edric Ong, Matthew Eng, Yuhao Zhang, Hiu Yung Wong

    Abstract: In this paper, we demonstrate the possibility of performing automatic Technology Computer-Aided-Design (TCAD) parameter calibration using machine learning, verified with experimental data. The machine only needs to be trained by TCAD data. Schottky Barrier Diode (SBD) fabricated with emerging ultra-wide-bandgap material, Gallium Oxide (Ga$_2$O$_3$), is measured and its current-voltage (IV) is used… ▽ More

    Submitted 3 April, 2025; originally announced April 2025.

    Comments: 4 pages, 3 figures

  23. arXiv:2504.00977  [pdf, ps, other]

    cs.CL

    Chinese Grammatical Error Correction: A Survey

    Authors: Mengyang Qiu, Qingyu Gao, Linxuan Yang, Yang Gu, Tran Minh Nguyen, Zihao Huang, Jungyeul Park

    Abstract: Chinese Grammatical Error Correction (CGEC) is a critical task in Natural Language Processing, addressing the growing demand for automated writing assistance in both second-language (L2) and native (L1) Chinese writing. While L2 learners struggle with mastering complex grammatical structures, L1 users also benefit from CGEC in academic, professional, and formal contexts where writing precision is… ▽ More

    Submitted 1 April, 2025; originally announced April 2025.

  24. arXiv:2503.22017  [pdf, other]

    cs.AR

    Performance Characterizations and Usage Guidelines of Samsung CXL Memory Module Hybrid Prototype

    Authors: Jianping Zeng, Shuyi Pei, Da Zhang, Yuchen Zhou, Amir Beygi, Xuebin Yao, Ramdas Kachare, Tong Zhang, Zongwang Li, Marie Nguyen, Rekha Pitchumani, Yang Soek Ki, Changhee Jung

    Abstract: The growing prevalence of data-intensive workloads, such as artificial intelligence (AI), machine learning (ML), high-performance computing (HPC), in-memory databases, and real-time analytics, has exposed limitations in conventional memory technologies like DRAM. While DRAM offers low latency and high throughput, it is constrained by high costs, scalability challenges, and volatility, making it le… ▽ More

    Submitted 27 March, 2025; originally announced March 2025.

  25. arXiv:2503.18221  [pdf, other]

    cs.RO

    Decentralized Navigation of a Cable-Towed Load using Quadrupedal Robot Team via MARL

    Authors: Wen-Tse Chen, Minh Nguyen, Zhongyu Li, Guo Ning Sue, Koushil Sreenath

    Abstract: This work addresses the challenge of enabling a team of quadrupedal robots to collaboratively tow a cable-connected load through cluttered and unstructured environments while avoiding obstacles. Leveraging cables allows the multi-robot system to navigate narrow spaces by maintaining slack when necessary. However, this introduces hybrid physical interactions due to alternating taut and slack states… ▽ More

    Submitted 23 March, 2025; originally announced March 2025.

  26. arXiv:2503.12722  [pdf, other]

    cs.AI cs.CL cs.GT cs.MA

    Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering

    Authors: Kenneth J. K. Ong, Lye Jia Jun, Hieu Minh "Jord" Nguyen, Seong Hah Cho, Natalia Pérez-Campanero Antolín

    Abstract: As Large Language Models (LLMs) gain autonomous capabilities, their coordination in multi-agent settings becomes increasingly important. However, they often struggle with cooperation, leading to suboptimal outcomes. Inspired by Axelrod's Iterated Prisoner's Dilemma (IPD) tournaments, we explore how personality traits influence LLM cooperation. Using representation engineering, we steer Big Five tr… ▽ More

    Submitted 16 March, 2025; originally announced March 2025.

    Comments: Poster, Technical AI Safety Conference 2025

  27. arXiv:2503.11249  [pdf, other]

    cs.LG cs.AI

    Spherical Tree-Sliced Wasserstein Distance

    Authors: Viet-Hoang Tran, Thanh T. Chu, Khoi N. M. Nguyen, Trang Pham, Tam Le, Tan M. Nguyen

    Abstract: Sliced Optimal Transport (OT) simplifies the OT problem in high-dimensional spaces by projecting supports of input measures onto one-dimensional lines and then exploiting the closed-form expression of the univariate OT to reduce the computational burden of OT. Recently, the Tree-Sliced method has been introduced to replace these lines with more intricate structures, known as tree systems. This app… ▽ More

    Submitted 20 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

  28. arXiv:2503.11244  [pdf, other]

    cs.PF cs.DC cs.LG

    LLMPerf: GPU Performance Modeling meets Large Language Models

    Authors: Khoi N. M. Nguyen, Hoang Duy Nguyen Do, Huyen Thao Le, Thanh Tuan Dao

    Abstract: Performance modeling, a pivotal domain in program cost analysis, currently relies on manually crafted models constrained by various program and hardware limitations, especially in the intricate landscape of GPGPU. Meanwhile, Large Language Models (LLMs) have demonstrated their effectiveness in addressing diverse programming challenges. Our work establishes a connection between LLMs and performance… ▽ More

    Submitted 14 March, 2025; originally announced March 2025.

  29. arXiv:2503.11144  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling

    Authors: Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: Large-scale pre-training of deep models, followed by fine-tuning them, has become the cornerstone of natural language processing (NLP). The prevalence of data coupled with computational resources has led to large models with a considerable number of parameters. While the massive size of these models has led to remarkable success in many NLP tasks, a detriment is the expense required to retrain all… ▽ More

    Submitted 14 March, 2025; originally announced March 2025.

  30. arXiv:2503.11050  [pdf, other]

    cs.LG cs.AI

    Distance-Based Tree-Sliced Wasserstein Distance

    Authors: Hoang V. Tran, Khoi N. M. Nguyen, Trang Pham, Thanh T. Chu, Tam Le, Tan M. Nguyen

    Abstract: To overcome computational challenges of Optimal Transport (OT), several variants of Sliced Wasserstein (SW) have been developed in the literature. These approaches exploit the closed-form expression of the univariate OT by projecting measures onto (one-dimensional) lines. However, projecting measures onto low-dimensional spaces can lead to a loss of topological information. Tree-Sliced Wasserstein… ▽ More

    Submitted 13 March, 2025; originally announced March 2025.

  31. arXiv:2503.10728  [pdf, other]

    cs.CL cs.AI cs.CY

    DarkBench: Benchmarking Dark Patterns in Large Language Models

    Authors: Esben Kran, Hieu Minh "Jord" Nguyen, Akash Kundu, Sami Jawhar, Jinsuk Park, Mateusz Maria Jurewicz

    Abstract: We introduce DarkBench, a comprehensive benchmark for detecting dark design patterns--manipulative techniques that influence user behavior--in interactions with large language models (LLMs). Our benchmark comprises 660 prompts across six categories: brand bias, user retention, sycophancy, anthropomorphism, harmful generation, and sneaking. We evaluate models from five leading companies (OpenAI, An… ▽ More

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: Accepted as an Oral paper at ICLR 2025

  32. arXiv:2503.04790  [pdf, other]

    cs.CL cs.AI

    SuperRAG: Beyond RAG with Layout-Aware Graph Modeling

    Authors: Jeff Yang, Duy-Khanh Vu, Minh-Tien Nguyen, Xuan-Quang Nguyen, Linh Nguyen, Hung Le

    Abstract: This paper introduces layout-aware graph modeling for multimodal RAG. Different from traditional RAG methods that mostly deal with flat text chunks, the proposed method takes into account the relationship of multimodalities by using a graph structure. To do that, a graph modeling structure is defined based on document layout parsing. The structure of an input document is retained with the connecti… ▽ More

    Submitted 28 February, 2025; originally announced March 2025.

    Comments: NAACL 2025, Industry Track

  33. arXiv:2503.00687  [pdf, other]

    cs.LG

    Transformer Meets Twicing: Harnessing Unattended Residual Information

    Authors: Laziz Abdullaev, Tan M. Nguyen

    Abstract: Transformer-based deep learning models have achieved state-of-the-art performance across numerous language and vision tasks. While the self-attention mechanism, a core component of transformers, has proven capable of handling complex data patterns, it has been observed that the representational capacity of the attention matrix degrades significantly across transformer layers, thereby hurting its o… ▽ More

    Submitted 7 March, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

    Comments: 10 pages in the main text. Published at ICLR 2025

  34. arXiv:2502.20525  [pdf, other]

    cs.LG cs.AI

    Revisiting Kernel Attention with Correlated Gaussian Process Representation

    Authors: Long Minh Bui, Tho Tran Huu, Duy Dinh, Tan Minh Nguyen, Trong Nghia Hoang

    Abstract: Transformers have increasingly become the de facto method to model sequential data with state-of-the-art performance. Due to its widespread use, being able to estimate and calibrate its modeling uncertainty is important to understand and design robust transformer models. To achieve this, previous works have used Gaussian processes (GPs) to perform uncertainty calibration for the attention units of… ▽ More

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 21 pages, 4 figures

    Journal ref: The 40th Conference on Uncertainty in Artificial Intelligence, 2024

  35. arXiv:2502.19752  [pdf, other]

    cs.LG cs.AI

    Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data

    Authors: Pei-Yau Weng, Minh Hoang, Lam M. Nguyen, My T. Thai, Tsui-Wei Weng, Trong Nghia Hoang

    Abstract: Fine-tuning pre-trained models is a popular approach in machine learning for solving complex tasks with moderate data. However, fine-tuning the entire pre-trained model is ineffective in federated data scenarios where local data distributions are diversely skewed. To address this, we explore integrating federated learning with a more effective prompt-tuning method, optimizing for a small set of in… ▽ More

    Submitted 26 February, 2025; originally announced February 2025.

    Comments: Accepted at NeurIPS-24

  36. arXiv:2502.18821  [pdf, other]

    cs.LG

    CAMEx: Curvature-aware Merging of Experts

    Authors: Dung V. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, Rachel S. Y. Teo, Tan M. Nguyen, Linh Duy Tran

    Abstract: Existing methods for merging experts during model training and fine-tuning predominantly rely on Euclidean geometry, which assumes a flat parameter space. This assumption can limit the model's generalization ability, especially during the pre-training phase, where the parameter manifold might exhibit more complex curvature. Curvature-aware merging methods typically require additional information a… ▽ More

    Submitted 3 March, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: 10 pages, 5 Figures, 7 Tables. Published at ICLR 2025

  37. arXiv:2502.15315  [pdf, other]

    cs.LG

    Tight Clusters Make Specialized Experts

    Authors: Stefan K. Nielsen, Rachel S. Y. Teo, Laziz U. Abdullaev, Tan M. Nguyen

    Abstract: Sparse Mixture-of-Experts (MoE) architectures have emerged as a promising approach to decoupling model capacity from computational cost. At the core of the MoE model is the router, which learns the underlying clustering structure of the input distribution in order to send input tokens to appropriate experts. However, latent clusters may be unidentifiable in high dimension, which causes slow conver… ▽ More

    Submitted 1 March, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

  38. arXiv:2502.14412  [pdf, other]

    cs.CV cs.CR cs.LG

    Evaluating Precise Geolocation Inference Capabilities of Vision Language Models

    Authors: Neel Jay, Hieu Minh Nguyen, Trung Dung Hoang, Jacob Haimes

    Abstract: The prevalence of Vision-Language Models (VLMs) raises important questions about privacy in an era where visual information is increasingly available. While foundation VLMs demonstrate broad knowledge and learned capabilities, we specifically investigate their ability to infer geographic location from previously unseen image data. This paper introduces a benchmark dataset collected from Google Str… ▽ More

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: AAAI 2025 Workshop DATASAFE

  39. arXiv:2502.08326  [pdf, other]

    cs.LG cs.DB cs.DS cs.IR

    Model-Free Counterfactual Subset Selection at Scale

    Authors: Minh Hieu Nguyen, Viet Hung Doan, Anh Tuan Nguyen, Jun Jo, Quoc Viet Hung Nguyen

    Abstract: Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical const… ▽ More

    Submitted 12 February, 2025; originally announced February 2025.

  40. arXiv:2502.07409  [pdf, other]

    cs.CV cs.LG

    MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification

    Authors: Anh-Tien Nguyen, Duy Minh Ho Nguyen, Nghiem Tuong Diep, Trung Quoc Nguyen, Nhat Ho, Jacqueline Michelle Metsch, Miriam Cindy Maurer, Daniel Sonntag, Hanibal Bohnenberger, Anne-Christin Hauschild

    Abstract: Whole slide pathology image classification presents challenges due to gigapixel image sizes and limited annotation labels, hindering model generalization. This paper introduces a prompt learning method to adapt large vision-language models for few-shot pathology classification. We first extend the Prov-GigaPath vision foundation model, pre-trained on 1.3 billion pathology image tiles, into a visio… ▽ More

    Submitted 14 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  41. arXiv:2502.06470  [pdf, ps, other]

    cs.CL cs.AI

    A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks

    Authors: Hieu Minh "Jord" Nguyen

    Abstract: Theory of Mind (ToM), the ability to attribute mental states to others and predict their behaviour, is fundamental to social intelligence. In this paper, we survey studies evaluating behavioural and representational ToM in Large Language Models (LLMs), identify important safety risks from advanced LLM ToM capabilities, and suggest several research directions for effective evaluation and mitigation… ▽ More

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: Advancing Artificial Intelligence through Theory of Mind Workshop, AAAI 2025

  42. arXiv:2502.03029  [pdf, other]

    cs.LG

    On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation

    Authors: Nghiem T. Diep, Huy Nguyen, Chau Nguyen, Minh Le, Duy M. H. Nguyen, Daniel Sonntag, Mathias Niepert, Nhat Ho

    Abstract: The LLaMA-Adapter has recently emerged as an efficient fine-tuning technique for LLaMA models, leveraging zero-initialized attention to stabilize training and enhance performance. However, despite its empirical success, the theoretical foundations of zero-initialized attention remain largely unexplored. In this paper, we provide a rigorous theoretical analysis, establishing a connection between ze…

    Submitted 22 March, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

    Comments: 43 pages, 5 tables, 6 figures

  43. arXiv:2502.02118  [pdf, other]

    cs.LG cs.CV

    BRIDLE: Generalized Self-supervised Learning with Quantization

    Authors: Hoang M. Nguyen, Satya N. Shukla, Qiang Zhang, Hanchao Yu, Sreya D. Roy, Taipeng Tian, Lingjiong Zhu, Yuchen Liu

    Abstract: Self-supervised learning has been a powerful approach for learning meaningful representations from unlabeled data across various domains, reducing the reliance on large labeled datasets. Inspired by BERT's success in capturing deep bidirectional contexts in natural language processing, similar frameworks have been adapted to other modalities such as audio, with models like BEATs extending the bidi…

    Submitted 4 February, 2025; originally announced February 2025.

  44. arXiv:2502.00973  [pdf, other]

    cs.LG eess.SP

    A Wearable Device Dataset for Mental Health Assessment Using Laser Doppler Flowmetry and Fluorescence Spectroscopy Sensors

    Authors: Minh Ngoc Nguyen, Khai Le-Duc, Tan-Hanh Pham, Trang Nguyen, Quang Minh Luu, Ba Kien Tran, Truong-Son Hy, Viktor Dremin, Sergei Sokolovsky, Edik Rafailov

    Abstract: In this study, we introduce a novel method to predict mental health by building machine learning models for a non-invasive wearable device equipped with Laser Doppler Flowmetry (LDF) and Fluorescence Spectroscopy (FS) sensors. Besides, we present the corresponding dataset to predict mental health, e.g. depression, anxiety, and stress levels via the DAS-21 questionnaire. To our best knowledge, this…

    Submitted 2 February, 2025; originally announced February 2025.

    Comments: Preprint, 55 pages

  45. arXiv:2501.18530  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.IT cs.LG

    Optimal generalisation and learning transition in extensive-width shallow neural networks near interpolation

    Authors: Jean Barbier, Francesco Camilli, Minh-Toan Nguyen, Mauro Pastore, Rudy Skerk

    Abstract: We consider a teacher-student model of supervised learning with a fully-trained two-layer neural network whose width $k$ and input dimension $d$ are large and proportional. We provide an effective theory for approximating the Bayes-optimal generalisation error of the network for any activation function in the regime of sample size $n$ scaling quadratically with the input dimension, i.e., around th…

    Submitted 1 April, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

    Comments: v2: 9 pages + appendix, 10 figures, 3 tables; added discussion on Gaussian inner weights (Fig. 2, 5 + Appendix H); added discussion on algorithmic complexity of specialisation (Appendix I and figures therein)

  46. arXiv:2501.15120  [pdf, other]

    cs.IR cs.DB cs.ET cs.LG

    Technology Mapping with Large Language Models

    Authors: Minh Hieu Nguyen, Hien Thu Pham, Hiep Minh Ha, Ngoc Quang Hung Le, Jun Jo

    Abstract: In today's fast-evolving business landscape, having insight into the technology stacks that organizations use is crucial for forging partnerships, uncovering market openings, and informing strategic choices. However, conventional technology mapping, which typically hinges on keyword searches, struggles with the sheer scale and variety of data available, often failing to capture nascent technologie…

    Submitted 25 January, 2025; originally announced January 2025.

    Comments: Technical Report

  47. arXiv:2501.14653  [pdf, other]

    cs.LG cs.AI cs.DC cs.MA

    Federated Domain Generalization with Data-free On-server Gradient Matching

    Authors: Trong-Binh Nguyen, Minh-Duong Nguyen, Jinsun Park, Quoc-Viet Pham, Won Joo Hwang

    Abstract: Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. One of the key approaches in DG is training an encoder which generates domain-invariant representations. However, this approach is not applicable in Federated Domain Generalization (FDG), where data from various domains are distributed across different clients. In…

    Submitted 24 January, 2025; originally announced January 2025.

    Comments: 26 pages, 15 figures, ICLR

    MSC Class: 68Q32; 68Q32

    ACM Class: I.4.0; I.2.11

  48. arXiv:2501.14279  [pdf, other]

    eess.IV cs.CV

    Deep Learning-Powered Classification of Thoracic Diseases in Chest X-Rays

    Authors: Yiming Lei, Michael Nguyen, Tzu Chia Liu, Hyounkyun Oh

    Abstract: Chest X-rays play a pivotal role in diagnosing respiratory diseases such as pneumonia, tuberculosis, and COVID-19, which are prevalent and present unique diagnostic challenges due to overlapping visual features and variability in image quality. Severe class imbalance and the complexity of medical images hinder automated analysis. This study leverages deep learning techniques, including transfer le…

    Submitted 24 January, 2025; originally announced January 2025.

  49. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90\% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  50. arXiv:2501.09937  [pdf, other]

    cs.RO

    Adaptive Twisting Sliding Control for Integrated Attack UAV's Autopilot and Guidance

    Authors: Minh Tu Nguyen, Van Truong Hoang, Manh Duong Phung, Van Hoa Doan

    Abstract: This paper investigates an adaptive sliding-mode control for an integrated UAV autopilot and guidance system. First, a two-dimensional mathematical model of the system is derived by considering the incorporated lateral dynamics and relative kinematics of the UAV and its potential target of attack. Then, a sliding surface is derived utilizing the zero-effort miss distance. An adaptive twisting slid…

    Submitted 16 January, 2025; originally announced January 2025.

    Comments: in Proceedings of the 2025 International Conference on Energy, Infrastructure and Environmental Research (EIER2025)