Showing 1–50 of 70 results for author: Le, K

Searching in archive cs.
  1. arXiv:2506.16600  [pdf, ps, other]

    cs.LG cs.AI

    FLAME: Towards Federated Fine-Tuning Large Language Models Through Adaptive SMoE

    Authors: Khiem Le, Tuan Tran, Ting Hua, Nitesh V. Chawla

    Abstract: Existing resource-adaptive LoRA federated fine-tuning methods enable clients to fine-tune models using compressed versions of global LoRA matrices, in order to accommodate various compute resources across clients. This compression requirement will lead to suboptimal performance due to information loss. To address this, we propose FLAME, a novel federated learning framework based on the Sparse Mixt…

    Submitted 19 June, 2025; originally announced June 2025.

  2. arXiv:2506.15929  [pdf, ps, other]

    cs.CV cs.AI eess.IV

    MoiréXNet: Adaptive Multi-Scale Demoiréing with Linear Attention Test-Time Training and Truncated Flow Matching Prior

    Authors: Liangyan Li, Yimo Ning, Kevin Le, Wei Dong, Yunzhe Li, Jun Chen, Xiaohong Liu

    Abstract: This paper introduces a novel framework for image and video demoiréing by integrating Maximum A Posteriori (MAP) estimation with advanced deep learning techniques. Demoiréing addresses inherently nonlinear degradation processes, which pose significant challenges for existing methods. Traditional supervised learning approaches either fail to remove moiré patterns completely or produce overly smoo…

    Submitted 18 June, 2025; originally announced June 2025.

  3. arXiv:2506.03590  [pdf, ps, other]

    cs.LG

    VCDiag: Classifying Erroneous Waveforms for Failure Triage Acceleration

    Authors: Minh Luu, Surya Jasper, Khoi Le, Evan Pan, Michael Quinn, Aakash Tyagi, Jiang Hu

    Abstract: Failure triage in design functional verification is critical but time-intensive, relying on manual specification reviews, log inspections, and waveform analyses. While machine learning (ML) has improved areas like stimulus generation and coverage closure, its application to RTL-level simulation failure triage, particularly for large designs, remains limited. VCDiag offers an efficient, adaptable a…

    Submitted 6 July, 2025; v1 submitted 4 June, 2025; originally announced June 2025.

  4. arXiv:2505.24229  [pdf, other]

    cs.CL cs.SD eess.AS

    Dynamic Context-Aware Streaming Pretrained Language Model For Inverse Text Normalization

    Authors: Luong Ho, Khanh Le, Vinh Pham, Bao Nguyen, Tan Tran, Duc Chau

    Abstract: Inverse Text Normalization (ITN) is crucial for converting spoken Automatic Speech Recognition (ASR) outputs into well-formatted written text, enhancing both readability and usability. Despite its importance, the integration of streaming ITN within streaming ASR remains largely unexplored due to challenges in accuracy, efficiency, and adaptability, particularly in low-resource and limited-context…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: Accepted to INTERSPEECH 2025

  5. arXiv:2505.03117  [pdf, ps, other]

    cs.HC

    Do ATCOs Need Explanations, and Why? Towards ATCO-Centered Explainable AI for Conflict Resolution Advisories

    Authors: Katherine Fennedy, Brian Hilburn, Thaivalappil N. M. Nadirsha, Sameer Alam, Khanh-Duy Le, Hua Li

    Abstract: Interest in explainable artificial intelligence (XAI) is surging. Prior research has primarily focused on systems' ability to generate explanations, often guided by researchers' intuitions rather than end-users' needs. Unfortunately, such approaches have not yielded favorable outcomes when compared to a black-box baseline (i.e., no explanation). To address this gap, this paper advocates a human-ce…

    Submitted 3 June, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

    Comments: 2025 ATRD US-Europe Air Transportation Research & Development Symposium

  6. arXiv:2504.18942  [pdf, other]

    cs.CL cs.AI cs.LG

    LawFlow : Collecting and Simulating Lawyers' Thought Processes

    Authors: Debarati Das, Khanh Chi Le, Ritik Sachin Parkar, Karin De Langis, Brendan Madson, Chad M. Berryman, Robin M. Willis, Daniel H. Moses, Brett McDonnell, Daniel Schwarcz, Dongyeop Kang

    Abstract: Legal practitioners, particularly those early in their careers, face complex, high-stakes tasks that require adaptive, context-sensitive reasoning. While AI holds promise in supporting legal work, current datasets and models are narrowly focused on isolated subtasks and fail to capture the end-to-end decision-making required in real-world practice. To address this gap, we introduce LawFlow, a data…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: submitted to COLM 2025

  7. arXiv:2504.14582  [pdf, other]

    cs.CV

    NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, et al. (86 additional authors not shown)

    Abstract: This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. The objective is to develop effective network designs or solutions that ach…

    Submitted 28 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

    Comments: NTIRE 2025 webpage: https://www.cvlai.net/ntire/2025. Code: https://github.com/zhengchen1999/NTIRE2025_ImageSR_x4

  8. arXiv:2504.09876  [pdf, other]

    cs.CV cs.AI

    HDC: Hierarchical Distillation for Multi-level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation

    Authors: Tran Quoc Khanh Le, Nguyen Lan Vi Vu, Ha-Hieu Pham, Xuan-Loc Huynh, Tien-Huy Nguyen, Minh Huu Nhat Le, Quan Nguyen, Hien D. Nguyen

    Abstract: Transvaginal ultrasound is a critical imaging modality for evaluating cervical anatomy and detecting physiological changes. However, accurate segmentation of cervical structures remains challenging due to low contrast, shadow artifacts, and indistinct boundaries. While convolutional neural networks (CNNs) have demonstrated efficacy in medical image segmentation, their reliance on large-scale annot…

    Submitted 16 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

  9. arXiv:2504.02789  [pdf, other]

    cs.CL

    A Framework for Robust Cognitive Evaluation of LLMs

    Authors: Karin de Langis, Jong Inn Park, Bin Hu, Khanh Chi Le, Andreas Schramm, Michael C. Mensink, Andrew Elfenbein, Dongyeop Kang

    Abstract: Emergent cognitive abilities in large language models (LLMs) have been widely observed, but their nature and underlying mechanisms remain poorly understood. A growing body of research draws on cognitive science to investigate LLM cognition, but standard methodologies and experimental pipelines have not yet been established. To address this gap we develop CognitivEval, a framework for systematical…

    Submitted 3 April, 2025; originally announced April 2025.

  10. arXiv:2503.05039  [pdf, other]

    cs.HC

    Bridging the AI Adoption Gap: Designing an Interactive Pedagogical Agent for Higher Education Instructors

    Authors: Si Chen, Reid Metoyer, Khiem Le, Adam Acunin, Izzy Molnar, Alex Ambrose, James Lang, Nitesh Chawla, Ronald Metoyer

    Abstract: Instructors play a pivotal role in integrating AI into education, yet their adoption of AI-powered tools remains inconsistent. Despite this, limited research explores how to design AI tools that support broader instructor adoption. This study applies a human-centered design approach, incorporating qualitative methods, to investigate the design of interactive pedagogical agents that provide instruc…

    Submitted 6 March, 2025; originally announced March 2025.

  11. arXiv:2502.15158  [pdf, other]

    cs.SD eess.AS

    Improving Streaming Speech Recognition With Time-Shifted Contextual Attention And Dynamic Right Context Masking

    Authors: Khanh Le, Duc Chau

    Abstract: Chunk-based inference stands out as a popular approach in developing real-time streaming speech recognition, valued for its simplicity and efficiency. However, because it restricts the model's focus to only the history and current chunk context, it may result in performance degradation in scenarios that demand consideration of future context. Addressing this, we propose a novel approach featuring…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: INTERSPEECH 2024

  12. arXiv:2502.14685  [pdf, other]

    cs.SD eess.AS

    SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition

    Authors: Khanh Le, Tuan Vu Ho, Dung Tran, Duc Thanh Chau

    Abstract: RNN-Transducer (RNN-T) is a widely adopted architecture in speech recognition, integrating acoustic and language modeling in an end-to-end framework. However, the RNN-T predictor tends to over-rely on consecutive word dependencies in training data, leading to high deletion error rates, particularly with less common or out-of-domain phrases. Existing solutions, such as regularization and data augme…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: Accepted to ICASSP 2025

  13. arXiv:2502.14673  [pdf, other]

    cs.SD eess.AS

    ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

    Authors: Khanh Le, Tuan Vu Ho, Dung Tran, Duc Thanh Chau

    Abstract: Deploying ASR models at an industrial scale poses significant challenges in hardware resource management, especially for long-form transcription tasks where audio may last for hours. Large Conformer models, despite their capabilities, are limited to processing only 15 minutes of audio on an 80GB GPU. Furthermore, variable input lengths worsen inefficiencies, as standard batching leads to excessive…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: Accepted to ICASSP 2025

  14. arXiv:2501.14166  [pdf, other]

    cs.CV cs.AI

    Enhancing Multimodal Entity Linking with Jaccard Distance-based Conditional Contrastive Learning and Contextual Visual Augmentation

    Authors: Cong-Duy Nguyen, Xiaobao Wu, Thong Nguyen, Shuai Zhao, Khoi Le, Viet-Anh Nguyen, Feng Yichao, Anh Tuan Luu

    Abstract: Previous research on multimodal entity linking (MEL) has primarily employed contrastive learning as the primary objective. However, using the rest of the batch as negative samples without careful consideration, these studies risk leveraging easy features and potentially overlook essential details that make entities unique. In this work, we propose JD-CCL (Jaccard Distance-based Conditional Contras…

    Submitted 23 January, 2025; originally announced January 2025.

  15. arXiv:2412.09843  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Learning Structural Causal Models from Ordering: Identifiable Flow Models

    Authors: Minh Khoa Le, Kien Do, Truyen Tran

    Abstract: In this study, we address causal inference when only observational data and a valid causal ordering from the causal graph are available. We introduce a set of flow models that can recover component-wise, invertible transformation of exogenous variables. Our flow-based methods offer flexible model design while maintaining causal consistency regardless of the number of discretization steps. We propo…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: Accepted at AAAI 2025

  16. arXiv:2411.04475  [pdf, other]

    cs.CV

    Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis

    Authors: Trong-Nhan Phan, Hoang-Hai Nguyen, Thi-Thu-Hien Ha, Huy-Tan Thai, Kim-Hung Le

    Abstract: Visual inspections of bridges are critical to ensure their safety and identify potential failures early. This inspection process can be rapidly and accurately automated by using unmanned aerial vehicles (UAVs) integrated with deep learning models. However, choosing an appropriate model that is lightweight enough to integrate into the UAV and fulfills the strict requirements for inference time and…

    Submitted 7 November, 2024; originally announced November 2024.

  17. arXiv:2410.13147  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization

    Authors: Khiem Le, Ting Hua, Nitesh V. Chawla

    Abstract: Molecular optimization -- modifying a given molecule to improve desired properties -- is a fundamental task in drug discovery. While LLMs hold the potential to solve this task using natural language to drive the optimization, straightforward prompting achieves limited accuracy. In this work, we propose AgentDrug, an agentic workflow that leverages LLMs in a structured refinement process to achieve…

    Submitted 8 June, 2025; v1 submitted 16 October, 2024; originally announced October 2024.

  18. Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition

    Authors: Kha Nhat Le, Hoang-Tuan Nguyen, Hung Tien Tran, Thanh Duc Ngo

    Abstract: Unsupervised domain adaptation (UDA) has become increasingly prevalent in scene text recognition (STR), especially where training and testing data reside in different domains. The efficacy of existing UDA approaches tends to degrade when there is a large gap between the source and target domains. To deal with this problem, gradually shifting or progressively learning to shift from domain to domain…

    Submitted 29 October, 2024; v1 submitted 13 October, 2024; originally announced October 2024.

    Comments: [WACV 2025] 15 pages, 12 figures, 5 tables, include supplementary materials, source code: https://github.com/KhaLee2307/StrDA

    Journal ref: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (pp. 8990-9000)

  19. arXiv:2410.02845  [pdf, other]

    cs.LG cs.AI

    Towards Layer-Wise Personalized Federated Learning: Adaptive Layer Disentanglement via Conflicting Gradients

    Authors: Minh Duong Nguyen, Khanh Le, Khoi Do, Nguyen H. Tran, Duc Nguyen, Chien Trinh, Zhaohui Yang

    Abstract: In personalized Federated Learning (pFL), high data heterogeneity can cause significant gradient divergence across devices, adversely affecting the learning process. This divergence, especially when gradients from different users form an obtuse angle during aggregation, can negate progress, leading to severe weight and gradient update degradation. To address this issue, we introduce a new approach…

    Submitted 3 October, 2024; originally announced October 2024.

  20. arXiv:2410.02221  [pdf, other]

    cs.HC cs.CV cs.LG cs.RO eess.SP

    Capturing complex hand movements and object interactions using machine learning-powered stretchable smart textile gloves

    Authors: Arvin Tashakori, Zenan Jiang, Amir Servati, Saeid Soltanian, Harishkumar Narayana, Katherine Le, Caroline Nakayama, Chieh-ling Yang, Z. Jane Wang, Janice J. Eng, Peyman Servati

    Abstract: Accurate real-time tracking of dexterous hand movements and interactions has numerous applications in human-computer interaction, metaverse, robotics, and tele-health. Capturing realistic hand movements is challenging because of the large number of articulations and degrees of freedom. Here, we report accurate and dynamic tracking of articulated hand and finger movements using stretchable, washabl…

    Submitted 3 October, 2024; originally announced October 2024.

    Journal ref: Nature Machine Intelligence 6 (2024) 106-118

  21. arXiv:2407.03788  [pdf, other]

    cs.CV cs.CL

    MAMA: Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

    Authors: Thong Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi Le, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Data quality stands at the forefront of deciding the effectiveness of video-language representation learning. However, video-text pairs in previous data typically do not align perfectly with each other, which might lead to video-language representations that do not accurately reflect cross-modal semantics. Moreover, previous data also possess an uneven distribution of concepts, thereby hampering t…

    Submitted 9 October, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  22. arXiv:2406.09717  [pdf, other]

    cs.CL

    UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages

    Authors: Trinh Pham, Khoi M. Le, Luu Anh Tuan

    Abstract: In this paper, we introduce UniBridge (Cross-Lingual Transfer Learning with Optimized Embeddings and Vocabulary), a comprehensive approach developed to improve the effectiveness of Cross-Lingual Transfer Learning, particularly in languages with limited resources. Our approach tackles two essential elements of a language model: the initialization of embeddings and the optimal vocabulary size. Speci…

    Submitted 20 August, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: First two authors contribute equally. Accepted at ACL 2024

  23. arXiv:2406.06777  [pdf, ps, other]

    cs.CV cs.AI

    MolX: Enhancing Large Language Models for Molecular Understanding With A Multi-Modal Extension

    Authors: Khiem Le, Zhichun Guo, Kaiwen Dong, Xiaobao Huang, Bozhao Nan, Roshni Iyer, Xiangliang Zhang, Olaf Wiest, Wei Wang, Ting Hua, Nitesh V. Chawla

    Abstract: Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields, moving beyond natural language understanding. However, their proficiency within the chemistry domain remains restricted, especially in solving molecule-related tasks. This challenge is attributed to their inherent limitations in comprehending molecules using onl…

    Submitted 7 July, 2025; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: MLoG-GenAI@KDD '25

  24. arXiv:2406.02624  [pdf, other]

    cs.CR cs.SE

    Take a Step Further: Understanding Page Spray in Linux Kernel Exploitation

    Authors: Ziyi Guo, Dang K Le, Zhenpeng Lin, Kyle Zeng, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, Adam Doupé, Xinyu Xing

    Abstract: Recently, a novel method known as Page Spray emerges, focusing on page-level exploitation for kernel vulnerabilities. Despite the advantages it offers in terms of exploitability, stability, and compatibility, comprehensive research on Page Spray remains scarce. Questions regarding its root causes, exploitation model, comparative benefits over other exploitation techniques, and possible mitigation…

    Submitted 8 November, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Published on 33rd USENIX Security Symposium (USENIX Security 24), see https://www.usenix.org/conference/usenixsecurity24/presentation/guo-ziyi

  25. arXiv:2405.20431  [pdf, other]

    cs.LG cs.CV

    Exploring the Practicality of Federated Learning: A Survey Towards the Communication Perspective

    Authors: Khiem Le, Nhan Luong-Ha, Manh Nguyen-Duc, Danh Le-Phuoc, Cuong Do, Kok-Seng Wong

    Abstract: Federated Learning (FL) is a promising paradigm that offers significant advancements in privacy-preserving, decentralized machine learning by enabling collaborative training of models across distributed devices without centralizing data. However, the practical deployment of FL systems faces a significant bottleneck: the communication overhead caused by frequently exchanging large model updates bet…

    Submitted 30 May, 2024; originally announced May 2024.

  26. arXiv:2405.15308  [pdf, other]

    cs.CR cs.HC

    Nudging Users to Change Breached Passwords Using the Protection Motivation Theory

    Authors: Yixin Zou, Khue Le, Peter Mayer, Alessandro Acquisti, Adam J. Aviv, Florian Schaub

    Abstract: We draw on the Protection Motivation Theory (PMT) to design nudges that encourage users to change breached passwords. Our online experiment ($n$=$1,386$) compared the effectiveness of a threat appeal (highlighting negative consequences of breached passwords) and a coping appeal (providing instructions on how to change the breached password) in a 2x2 factorial design. Compared to the control condit…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Manuscript under review at ACM Transactions on Computer-Human Interaction

  27. arXiv:2403.15605  [pdf, other]

    cs.CV cs.LG

    Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

    Authors: Khiem Le, Long Ho, Cuong Do, Danh Le-Phuoc, Kok-Seng Wong

    Abstract: Domain shift is a formidable issue in Machine Learning that causes a model to suffer from performance degradation when tested on unseen domains. Federated Domain Generalization (FedDG) attempts to train a global model using collaborative clients in a privacy-preserving manner that can generalize well to unseen clients possibly with domain shift. However, most existing FedDG methods either cause ad…

    Submitted 22 March, 2024; originally announced March 2024.

  28. A Study of Vulnerability Repair in JavaScript Programs with Large Language Models

    Authors: Tan Khang Le, Saba Alimadadi, Steven Y. Ko

    Abstract: In recent years, JavaScript has become the most widely used programming language, especially in web development. However, writing secure JavaScript code is not trivial, and programmers often make mistakes that lead to security vulnerabilities in web applications. Large Language Models (LLMs) have demonstrated substantial advancements across multiple domains, and their evolving capabilities indicat…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: camera-ready version accepted to the short paper track at WWW'24

  29. arXiv:2403.08876  [pdf, other]

    cs.CV

    ARtVista: Gateway To Empower Anyone Into Artist

    Authors: Trong-Vu Hoang, Quang-Binh Nguyen, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Drawing is an art that enables people to express their imagination and emotions. However, individuals usually face challenges in drawing, especially when translating conceptual ideas into visually coherent representations and bridging the gap between mental visualization and practical execution. In response, we propose ARtVista - a novel system integrating AR and generative AI technologies. ARtVis…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CHI 2024

  30. arXiv:2403.08746  [pdf, other]

    cs.CV

    iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer

    Authors: Dinh-Khoi Vo, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Creating thematic collections in industries demands innovative designs and cohesive concepts. Designers may face challenges in maintaining thematic consistency when drawing inspiration from existing objects, landscapes, or artifacts. While AI-powered graphic design tools offer help, they often fail to generate cohesive sets based on specific thematic concepts. In response, we introduce iCONTRA, an…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CHI 2024

  31. Object Detection in Thermal Images Using Deep Learning for Unmanned Aerial Vehicles

    Authors: Minh Dang Tu, Kieu Trang Le, Manh Duong Phung

    Abstract: This work presents a neural network model capable of recognizing small and tiny objects in thermal images collected by unmanned aerial vehicles. Our model consists of three parts, the backbone, the neck, and the prediction head. The backbone is developed based on the structure of YOLOv5 combined with the use of a transformer encoder at the end. The neck includes a BI-FPN block combined with the us…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Published in: 2024 IEEE/SICE International Symposium on System Integration (SII)

  32. arXiv:2401.07278  [pdf, other]

    cs.CV cs.AI

    Semi-Supervised Semantic Segmentation using Redesigned Self-Training for White Blood Cells

    Authors: Vinh Quoc Luu, Duy Khanh Le, Huy Thanh Nguyen, Minh Thanh Nguyen, Thinh Tien Nguyen, Vinh Quang Dinh

    Abstract: Artificial Intelligence (AI) in healthcare, especially in white blood cell cancer diagnosis, is hindered by two primary challenges: the lack of large-scale labeled datasets for white blood cell (WBC) segmentation and outdated segmentation methods. These challenges inhibit the development of more accurate and modern techniques to diagnose cancer relating to white blood cells. To address the first c…

    Submitted 23 February, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

  33. LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training

    Authors: Khoi M. Le, Trinh Pham, Tho Quan, Anh Tuan Luu

    Abstract: Paraphrases are texts that convey the same meaning while using different words or sentence structures. It can be used as an automatic data augmentation tool for many Natural Language Processing tasks, especially when dealing with low-resource languages, where data shortage is a significant problem. To generate a paraphrase in multilingual settings, previous studies have leveraged the knowledge fro…

    Submitted 23 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: First two authors contribute equally. Accepted at AAAI 2024

  34. arXiv:2401.03917  [pdf, other]

    cs.MS

    Toward a comprehensive simulation framework for hypergraphs: a Python-base approach

    Authors: Quoc Chuong Nguyen, Trung Kien Le

    Abstract: Hypergraphs, or generalization of graphs such that edges can contain more than two nodes, have become increasingly prominent in understanding complex network analysis. Unlike graphs, hypergraphs have relatively few supporting platforms, and such dearth presents a barrier to more widespread adaptation of hypergraph computational toolboxes that could enable further research in several areas. Here, w…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: 13 pages, 3 figures

  35. arXiv:2312.07035  [pdf, other]

    cs.LG cs.AI

    HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts

    Authors: Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Bint T. Nguyen, Chenghao Liu, Savitha Ramasamy, Xiaoli Li, Steven Hoi

    Abstract: By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models. Recent findings suggest that fixing the routers can achieve competitive performance by alleviating the collapsing problem, where all experts eventually learn similar representations. However, this strategy has two key limitations: (i) the policy derived from rando…

    Submitted 12 December, 2023; originally announced December 2023.

  36. arXiv:2312.06950  [pdf, other]

    cs.CV cs.CL

    READ: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling

    Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization. With a growing number of tasks and limited training data, such full fine-tuning approach leads to costly model storage and unstable training. To overcome these shortcomings, we introduce lightweight adapte…

    Submitted 5 October, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI 2024

  37. arXiv:2309.03506  [pdf, other]

    cs.CV cs.AI

    Towards Robust Natural-Looking Mammography Lesion Synthesis on Ipsilateral Dual-Views Breast Cancer Analysis

    Authors: Thanh-Huy Nguyen, Quang Hien Kha, Thai Ngoc Toan Truong, Ba Thinh Lam, Ba Hung Ngo, Quang Vinh Dinh, Nguyen Quoc Khanh Le

    Abstract: In recent years, many mammographic image analysis methods have been introduced for improving cancer classification tasks. Two major issues of mammogram classification tasks are leveraging multi-view mammographic information and class-imbalance handling. In the first problem, many multi-view methods have been released for concatenating features of two or more views for the training and inference st…

    Submitted 7 September, 2023; originally announced September 2023.

  38. arXiv:2308.13798  [pdf, other]

    cs.CV

    DM-VTON: Distilled Mobile Real-time Virtual Try-On

    Authors: Khoi-Nguyen Nguyen-Ngoc, Thanh-Tung Phan-Nguyen, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: The fashion e-commerce industry has witnessed significant growth in recent years, prompting exploring image-based virtual try-on techniques to incorporate Augmented Reality (AR) experiences into online shopping platforms. However, existing research has primarily overlooked a crucial aspect - the runtime of the underlying machine-learning model. While existing methods prioritize enhancing output qu…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: Accepted to ISMAR 2023 (Poster paper)

  39. arXiv:2308.13795  [pdf, other]

    cs.CV

    VIDES: Virtual Interior Design via Natural Language and Visual Guidance

    Authors: Minh-Hien Le, Chi-Bien Chu, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Interior design is crucial in creating aesthetically pleasing and functional indoor spaces. However, developing and editing interior design concepts requires significant time and expertise. We propose Virtual Interior DESign (VIDES) system in response to this challenge. Leveraging cutting-edge technology in generative AI, our system can assist users in generating and editing indoor scene concepts…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: Accepted to ISMAR 2023 (Poster paper)

  40. arXiv:2307.01844  [pdf, other]

    cs.CV

    Advancing Wound Filling Extraction on 3D Faces: Auto-Segmentation and Wound Face Regeneration Approach

    Authors: Duong Q. Nguyen, Thinh D. Le, Phuong D. Nguyen, Nga T. K. Le, H. Nguyen-Xuan

    Abstract: Facial wound segmentation plays a crucial role in preoperative planning and optimizing patient outcomes in various medical applications. In this paper, we propose an efficient approach for automating 3D facial wound segmentation using a two-stream graph convolutional network. Our method leverages the Cir3D-FaIR dataset and addresses the challenge of data imbalance through extensive experimentation…

    Submitted 12 July, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  41. arXiv:2304.06053  [pdf, other]

    cs.CV

    TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, Tuan-Anh Yang, Kim-Phat Tran, Nhu-Vinh Hoang, Minh-Quang Nguyen, E-Ro Nguyen, Minh-Khoi Nguyen-Nhat, Tuan-An To, Trung-Truc Huynh-Le, Nham-Tan Nguyen, Hoang-Chau Luong, et al. (8 additional authors not shown)

    Abstract: 3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC chall…

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers and Graphics (3DOR, Journal Track)

  42. arXiv:2304.05731  [pdf, other]

    cs.CV

    SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Nhat-Quynh Le-Pham, Huu-Phuc Pham, Trong-Vu Hoang, Quang-Binh Nguyen, Trong-Hieu Nguyen-Mau, Tuan-Luc Huynh, Thanh-Danh Le, Ngoc-Linh Nguyen-Ha, Tuong-Vy Truong-Thuy, Truong Hoai Phong, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, et al. (9 additional authors not shown)

    Abstract: The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this…

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers & Graphics (3DOR 2023, Journal track)

  43. arXiv:2304.05723  [pdf, ps, other]

    eess.SY cs.RO

    Distributed Coverage Control of Constrained Constant-Speed Unicycle Multi-Agent Systems

    Authors: Qingchen Liu, Zengjie Zhang, Nhan Khanh Le, Jiahu Qin, Fangzhou Liu, Sandra Hirche

    Abstract: This paper proposes a novel distributed coverage controller for a multi-agent system with constant-speed unicycle robots (CSUR). The work is motivated by the limitation of the conventional method that does not ensure the satisfaction of hard state- and input-dependent constraints and leads to feasibility issues for multi-CSUR systems. In this paper, we solve these problems by designing a novel cov…

    Submitted 14 March, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

  44. arXiv:2303.16507  [pdf, other]

    cs.CV

    Improving Object Detection in Medical Image Analysis through Multiple Expert Annotators: An Empirical Investigation

    Authors: Hieu H. Pham, Khiem H. Le, Tuan V. Tran, Ha Q. Nguyen

    Abstract: The work discusses the use of machine learning algorithms for anomaly detection in medical image analysis and how the performance of these algorithms depends on the number of annotators and the quality of labels. To address the issue of subjectivity in labeling with a single annotator, we introduce a simple and effective approach that aggregates annotations from multiple annotators with varying le…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: This is a short version submitted to the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA

  45. arXiv:2303.09115  [pdf, other]

    cs.CV

    Learning for Amalgamation: A Multi-Source Transfer Learning Framework For Sentiment Classification

    Authors: Cuong V. Nguyen, Khiem H. Le, Anh M. Tran, Quang H. Pham, Binh T. Nguyen

    Abstract: Transfer learning plays an essential role in Deep Learning, which can remarkably improve the performance of the target domain, whose training data is not sufficient. Our work explores beyond the common practice of transfer learning with a single pre-trained model. We focus on the task of Vietnamese sentiment classification and propose LIFA, a framework to learn a unified embedding from several pre…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Information Sciences

  46. arXiv:2302.02050  [pdf, other]

    cs.HC

    Location-based AR for Social Justice: Case Studies, Lessons, and Open Challenges

    Authors: Hope Schroeder, Rob Tokanel, Kyle Qian, Khoi Le

    Abstract: Dear Visitor and Charleston Reconstructed were location-based augmented reality (AR) experiences created between 2018 and 2020 dealing with two controversial monument sites in the US. The projects were motivated by the ability of AR to 1) link layers of context to physical sites in ways that are otherwise difficult or impossible and 2) to visualize changes to physical spaces, potentially inspiring…

    Submitted 3 February, 2023; originally announced February 2023.

  47. arXiv:2208.07088  [pdf, other]

    cs.CV

    Enhancing Deep Learning-based 3-lead ECG Classification with Heartbeat Counting and Demographic Data Integration

    Authors: Khiem H. Le, Hieu H. Pham, Thao B. T. Nguyen, Tu A. Nguyen, Cuong D. Do

    Abstract: Nowadays, an increasing number of people are being diagnosed with cardiovascular diseases (CVDs), the leading cause of death globally. The gold standard for identifying these heart problems is via electrocardiogram (ECG). The standard 12-lead ECG is widely used in clinical practice and the majority of current research. However, using a lower number of leads can make ECG more pervasive as it can be…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2207.12381

  48. Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks

    Authors: Thao Nguyen, Hieu H. Pham, Huy Khiem Le, Anh Tu Nguyen, Ngoc Tien Thanh, Cuong Do

    Abstract: The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need to develop novel tools to provide rapid and cost-effective screening and diagnosis. Clinical reports indicated that COVID-19 infection may cause cardiac injury, and electrocardiograms (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to utilize ECG signals to detect COVID-19…

    Submitted 5 October, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted with minor revision by Plos One

  49. arXiv:2207.12381  [pdf, other]

    cs.CV cs.AI

    LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification

    Authors: Khiem H. Le, Hieu H. Pham, Thao BT. Nguyen, Tu A. Nguyen, Tien N. Thanh, Cuong D. Do

    Abstract: Cardiovascular diseases (CVDs) are a group of heart and blood vessel disorders that is one of the most serious dangers to human health, and the number of such patients is still growing. Early and accurate detection plays a key role in successful treatment and intervention. Electrocardiogram (ECG) is the gold standard for identifying a variety of cardiovascular abnormalities. In clinical practices…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Under review at Biomedical Signal Processing and Control

  50. arXiv:2205.06404  [pdf, other]

    cs.LG stat.ML

    Fast Conditional Network Compression Using Bayesian HyperNetworks

    Authors: Phuoc Nguyen, Truyen Tran, Ky Le, Sunil Gupta, Santu Rana, Dang Nguyen, Trong Nguyen, Shannon Ryan, Svetha Venkatesh

    Abstract: We introduce a conditional compression problem and propose a fast framework for tackling it. The problem is how to quickly compress a pretrained large neural network into optimal smaller networks given target contexts, e.g. a context involving only a subset of classes or a context where only limited compute resource is available. To solve this, we propose an efficient Bayesian framework to compres…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper at ECML 2021