Showing 1–18 of 18 results for author: Nguyen, K A

  1. arXiv:2506.21546  [pdf, ps, other]

    cs.CV cs.AI cs.CL cs.LG

    HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation

    Authors: Xinzhuo Li, Adheesh Juvekar, Xingyou Liu, Muntasir Wahed, Kiet A. Nguyen, Ismini Lourentzou

    Abstract: Recent progress in vision-language segmentation has significantly advanced grounded visual understanding. However, these models often exhibit hallucinations by producing segmentation masks for objects not grounded in the image content or by incorrectly labeling irrelevant regions. Existing evaluation protocols for segmentation hallucination primarily focus on label or textual hallucinations withou…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Project webpage: https://plan-lab.github.io/hallusegbench/

  2. arXiv:2506.17212  [pdf, ps, other]

    cs.CV cs.AI cs.LG cs.RO

    Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting

    Authors: Tianjiao Yu, Vedant Shah, Muntasir Wahed, Ying Shen, Kiet A. Nguyen, Ismini Lourentzou

    Abstract: Articulated objects are common in the real world, yet modeling their structure and motion remains a challenging task for 3D reconstruction methods. In this work, we introduce Part$^{2}$GS, a novel framework for modeling articulated digital twins of multi-part objects with high-fidelity geometry and physically consistent articulation. Part$^{2}$GS leverages a part-aware 3D Gaussian representation t…

    Submitted 20 June, 2025; originally announced June 2025.

  3. arXiv:2503.10628  [pdf, other]

    cs.AI cs.LG

    Uncertainty in Action: Confidence Elicitation in Embodied Agents

    Authors: Tianjiao Yu, Vedant Shah, Muntasir Wahed, Kiet A. Nguyen, Adheesh Juvekar, Tal August, Ismini Lourentzou

    Abstract: Expressing confidence is challenging for embodied agents navigating dynamic multimodal environments, where uncertainty arises from both perception and decision-making processes. We present the first work investigating embodied confidence elicitation in open-ended multimodal environments. We introduce Elicitation Policies, which structure confidence assessment across inductive, deductive, and abduc…

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: Project page: https://plan-lab.github.io/ece/

  4. arXiv:2503.03285  [pdf, other]

    cs.CV cs.LG

    Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations

    Authors: Khoi Anh Nguyen, Linh Yen Vu, Thang Dinh Duong, Thuan Nguyen Duong, Huy Thanh Nguyen, Vinh Quang Dinh

    Abstract: Visual Question Answering (VQA) is a multimodal task requiring reasoning across textual and visual inputs, which becomes particularly challenging in low-resource languages like Vietnamese due to linguistic variability and the lack of high-quality datasets. Traditional methods often rely heavily on extensive annotated datasets, computationally expensive pipelines, and large pre-trained models, spec…

    Submitted 6 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

    Comments: 10 pages, 3 figures, AAAI-25 Workshop on Document Understanding and Intelligence

  5. arXiv:2412.19331  [pdf, other]

    cs.CV cs.AI cs.LG

    CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models

    Authors: Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu, Muntasir Wahed, Ismini Lourentzou

    Abstract: Recent advances in Large Vision-Language Models (LVLMs) have enabled general-purpose vision tasks through visual instruction tuning. While existing LVLMs can generate segmentation masks from text prompts for single images, they struggle with segmentation-grounded reasoning across images, especially at finer granularities such as object parts. In this paper, we introduce the new task of part-focuse…

    Submitted 3 April, 2025; v1 submitted 26 December, 2024; originally announced December 2024.

    Comments: Accepted to CVPR 2025. Project page: https://plan-lab.github.io/calico/

  6. arXiv:2412.15209  [pdf, other]

    cs.CV cs.AI cs.LG

    PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation

    Authors: Muntasir Wahed, Kiet A. Nguyen, Adheesh Sunil Juvekar, Xinzhuo Li, Xiaona Zhou, Vedant Shah, Tianjiao Yu, Pinar Yanardag, Ismini Lourentzou

    Abstract: Despite significant advancements in Large Vision-Language Models (LVLMs), existing pixel-grounding models operate on single-image settings, limiting their ability to perform detailed, fine-grained comparisons across multiple images. Conversely, current multi-image understanding models lack pixel-level grounding. Our work addresses this gap by introducing the task of multi-image pixel-grounded reas…

    Submitted 19 December, 2024; originally announced December 2024.

    Comments: Project page: https://plan-lab.github.io/prima

  7. Conformalised data synthesis

    Authors: Julia A. Meister, Khuong An Nguyen

    Abstract: With the proliferation of increasingly complicated Deep Learning architectures, data synthesis is a highly promising technique to address the demand of data-hungry models. However, reliably assessing the quality of a 'synthesiser' model's output is an open research question with significant associated risks for high-stake domains. To address this challenge, we propose a unique synthesis algorithm…

    Submitted 10 January, 2025; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in the Machine Learning journal special issue "Conformal Prediction and Distribution-Free Uncertainty Quantification"

    Report number: Meister, J.A., Nguyen, K.A. Conformalised data synthesis. Machine Learning 114, 57 (2025)

    MSC Class: 68T37

  8. A novel Deep Learning approach for one-step Conformal Prediction approximation

    Authors: Julia A. Meister, Khuong An Nguyen, Stelios Kapetanakis, Zhiyuan Luo

    Abstract: Deep Learning predictions with measurable confidence are increasingly desirable for real-world problems, especially in high-risk settings. The Conformal Prediction (CP) framework is a versatile solution that guarantees a maximum error rate given minimal constraints. In this paper, we propose a novel conformal loss function that approximates the traditionally two-step CP approach in a single step.…

    Submitted 7 August, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: 34 pages, 15 figures, 5 tables

    Journal ref: Annals of Mathematics and Artificial Intelligence, 1-28 (2023)

  9. arXiv:2104.07128  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Audio feature ranking for sound-based COVID-19 patient detection

    Authors: Julia A. Meister, Khuong An Nguyen, Zhiyuan Luo

    Abstract: Audio classification using breath and cough samples has recently emerged as a low-cost, non-invasive, and accessible COVID-19 screening method. However, a comprehensive survey shows that no application has been approved for official use at the time of writing, due to the stringent reliability and accuracy requirements of the critical healthcare setting. To support the development of Machine Learni…

    Submitted 23 November, 2022; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: 12 pages, 3 figures, 6 tables

    Journal ref: In EPIA Conference on Artificial Intelligence (pp. 146-158). Springer, Cham (2022)

  10. arXiv:2006.02251  [pdf, other]

    eess.SP cs.LG

    A review of smartphones based indoor positioning: challenges and applications

    Authors: Khuong An Nguyen, Zhiyuan Luo, Guang Li, Chris Watkins

    Abstract: The continual proliferation of mobile devices has encouraged much effort in using the smartphones for indoor positioning. This article is dedicated to review the most recent and interesting smartphones based indoor navigation systems, ranging from electromagnetic to inertia to visible light ones, with an emphasis on their unique challenges and potential real-world applications. A taxonomy of smart…

    Submitted 3 June, 2020; originally announced June 2020.

  11. arXiv:2006.00046  [pdf, other]

    cs.CY eess.SP

    Epidemic contact tracing with smartphone sensors

    Authors: Khuong An Nguyen, Zhiyuan Luo, Chris Watkins

    Abstract: Contact tracing is widely considered as an effective procedure in the fight against epidemic diseases. However, one of the challenges for technology based contact tracing is the high number of false positives, questioning its trust-worthiness and efficiency amongst the wider population for mass adoption. To this end, this paper proposes a novel, yet practical smartphone-based contact tracing appro…

    Submitted 25 July, 2020; v1 submitted 29 May, 2020; originally announced June 2020.

  12. arXiv:1810.13097  [pdf, other]

    cs.CL

    Attentive Neural Network for Named Entity Recognition in Vietnamese

    Authors: Kim Anh Nguyen, Ngan Dong, Cam-Tu Nguyen

    Abstract: We propose an attentive neural network for the task of named entity recognition in Vietnamese. The proposed attentive neural model makes use of character-based language models and word embeddings to encode words as vector representations. A neural network architecture of encoder, attention, and decoder layers is then utilized to encode knowledge of input sentences and to label entity tags. The exp…

    Submitted 9 June, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  13. arXiv:1804.05388  [pdf, other]

    cs.CL

    Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness

    Authors: Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

    Abstract: We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges. The two datasets are verified through standard co…

    Submitted 19 April, 2018; v1 submitted 15 April, 2018; originally announced April 2018.

    Comments: The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2018)

  14. arXiv:1707.07273  [pdf, other]

    cs.CL

    Hierarchical Embeddings for Hypernymy Detection and Directionality

    Authors: Kim Anh Nguyen, Maximilian Köper, Sabine Schulte im Walde, Ngoc Thang Vu

    Abstract: We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality. While previous embeddings have shown limitations on prototypical hypernyms, HyperVec represents an unsupervised measure where embeddings are learned in a specific order and capture the hypernym$-$hyponym distributional hierarchy. Moreover, our model is able to generalize over unsee…

    Submitted 23 July, 2017; originally announced July 2017.

    Comments: 11 pages, accepted as long paper at EMNLP 2017

  15. arXiv:1704.00148  [pdf, other]

    cs.CY cs.SI

    Co-location Epidemic Tracking on London Public Transports Using Low Power Mobile Magnetometer

    Authors: Khuong An Nguyen, Chris Watkins, Zhiyuan Luo

    Abstract: The public transports provide an ideal means to enable contagious diseases transmission. This paper introduces a novel idea to detect co-location of people in such environment using just the ubiquitous geomagnetic field sensor on the smart phone. Essentially, given that all passengers must share the same journey between at least two consecutive stations, we have a long window to match the user tra…

    Submitted 1 April, 2017; originally announced April 2017.

  16. arXiv:1701.02962  [pdf, other]

    cs.CL

    Distinguishing Antonyms and Synonyms in a Pattern-based Neural Network

    Authors: Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

    Abstract: Distinguishing between antonyms and synonyms is a key task to achieve high performance in NLP systems. While they are notoriously difficult to distinguish by distributional co-occurrence models, pattern-based methods have proven effective to differentiate between the relations. In this paper, we present a novel neural network model AntSynNET that exploits lexico-syntactic patterns from syntactic p…

    Submitted 11 January, 2017; originally announced January 2017.

    Comments: EACL 2017, 10 pages

    Journal ref: EACL2017

  17. arXiv:1610.01874  [pdf, other]

    cs.CL

    Neural-based Noise Filtering from Word Embeddings

    Authors: Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

    Abstract: Word embeddings have been demonstrated to benefit NLP tasks impressively. Yet, there is room for improvement in the vector representations, because current word embeddings typically contain unnecessary information, i.e., noise. We propose two novel models to improve word embeddings by unsupervised learning, in order to yield word denoising embeddings. The word denoising embeddings are obtained by…

    Submitted 6 October, 2016; originally announced October 2016.

    Comments: 9 pages, 4 figures, COLING 2016

  18. arXiv:1605.07766  [pdf, other]

    cs.CL

    Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction

    Authors: Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

    Abstract: We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity. The improved vectors significantly outperform standard models and distinguish antonyms from synonyms with an average precision of 0.66-0.76 across word classes (adjectives, nouns, verbs). Moreover, we integrate t…

    Submitted 25 May, 2016; originally announced May 2016.

    Comments: 6 pages, 4 figures, InProc ACL 2016