
Showing 1–5 of 5 results for author: Van Nguyen, Q

Searching in archive cs.

  1. arXiv:2405.17002

    cs.CV

    UIT-DarkCow team at ImageCLEFmedical Caption 2024: Diagnostic Captioning for Radiology Images Efficiency with Transformer Models

    Authors: Quan Van Nguyen, Huy Quang Pham, Dan Quang Tran, Thang Kien-Bao Nguyen, Nhat-Hao Nguyen-Dang, Bao-Thien Nguyen-Tat

    Abstract: Purpose: This study focuses on the development of automated text generation from radiology images, termed diagnostic captioning, to assist medical professionals in reducing clinical errors and improving productivity. The aim is to provide tools that enhance report quality and efficiency, which can significantly impact both clinical practice and deep learning research in the biomedical field. Metho…

    Submitted 27 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2404.18397

    cs.CV

    ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images

    Authors: Huy Quang Pham, Thang Kien-Bao Nguyen, Quan Van Nguyen, Dan Quang Tran, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Optical Character Recognition - Visual Question Answering (OCR-VQA) is the task of answering questions about text contained in images, and it has only recently seen significant development for the English language. However, there are limited studies of this task in low-resource languages such as Vietnamese. To this end, we introduce a novel dataset, ViOCRVQA (Vietnamese Optical Character Recogniti…

    Submitted 28 April, 2024; originally announced April 2024.

  3. arXiv:2404.10652

    cs.CL

    ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images

    Authors: Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images. This task was initially researched with a focus on developing methods to help machines understand objects and scene contexts in images. However, some scene text that carries explicit information about the full content of the image is not mentioned. Along wit…

    Submitted 16 May, 2025; v1 submitted 16 April, 2024; originally announced April 2024.

  4. arXiv:2210.04393

    cs.CV

    LAPFormer: A Light and Accurate Polyp Segmentation Transformer

    Authors: Mai Nguyen, Tung Thanh Bui, Quan Van Nguyen, Thanh Tung Nguyen, Toan Van Pham

    Abstract: Polyp segmentation is still known as a difficult problem due to the large variety of polyp shapes and of scanning and labeling modalities. This prevents deep learning models from generalizing well to unseen data. However, Transformer-based approaches have recently achieved some remarkable results, with the ability to extract global context better than CNN-based architectures and yet lead to bett…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: 7 pages, 7 figures, ACL 2023 under review

  5. arXiv:2209.14599

    cs.CV

    Online pseudo labeling for polyp segmentation with momentum networks

    Authors: Toan Pham Van, Linh Bao Doan, Thanh Tung Nguyen, Duc Trung Tran, Quan Van Nguyen, Dinh Viet Sang

    Abstract: Semantic segmentation is an essential task in developing medical image diagnosis systems. However, building an annotated medical dataset is expensive. Thus, semi-supervised methods are significant in this circumstance. In semi-supervised learning, the quality of labels plays a crucial role in model performance. In this work, we present a new pseudo labeling strategy that enhances the quality of ps…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted in KSE 2022