Skip to main content

Showing 1–9 of 9 results for author: Diep, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2502.16806  [pdf, other

    cs.CL

    CoT2Align: Cross-Chain of Thought Distillation via Optimal Transport Alignment for Language Models with Different Tokenizers

    Authors: Anh Duc Le, Tu Vu, Nam Le Hai, Nguyen Thi Ngoc Diep, Linh Ngo Van, Trung Le, Thien Huu Nguyen

    Abstract: Large Language Models (LLMs) achieve state-of-the-art performance across various NLP tasks but face deployment challenges due to high computational costs and memory constraints. Knowledge distillation (KD) is a promising solution, transferring knowledge from large teacher models to smaller student models. However, existing KD methods often assume shared vocabularies and tokenizers, limiting their… ▽ More

    Submitted 1 March, 2025; v1 submitted 23 February, 2025; originally announced February 2025.

  2. arXiv:2502.07409  [pdf, other

    cs.CV cs.LG

    MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification

    Authors: Anh-Tien Nguyen, Duy Minh Ho Nguyen, Nghiem Tuong Diep, Trung Quoc Nguyen, Nhat Ho, Jacqueline Michelle Metsch, Miriam Cindy Maurer, Daniel Sonntag, Hanibal Bohnenberger, Anne-Christin Hauschild

    Abstract: Whole slide pathology image classification presents challenges due to gigapixel image sizes and limited annotation labels, hindering model generalization. This paper introduces a prompt learning method to adapt large vision-language models for few-shot pathology classification. We first extend the Prov-GigaPath vision foundation model, pre-trained on 1.3 billion pathology image tiles, into a visio… ▽ More

    Submitted 14 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  3. arXiv:2502.03029  [pdf, ps, other

    cs.LG

    On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation

    Authors: Nghiem T. Diep, Huy Nguyen, Chau Nguyen, Minh Le, Duy M. H. Nguyen, Daniel Sonntag, Mathias Niepert, Nhat Ho

    Abstract: The LLaMA-Adapter has recently emerged as an efficient fine-tuning technique for LLaMA models, leveraging zero-initialized attention to stabilize training and enhance performance. However, despite its empirical success, the theoretical foundations of zero-initialized attention remain largely unexplored. In this paper, we provide a rigorous theoretical analysis, establishing a connection between ze… ▽ More

    Submitted 18 June, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

    Comments: Accepted at ICML 2025

  4. arXiv:2410.02615  [pdf, ps, other

    cs.LG

    EXGRA-MED: Extended Context Graph Alignment for Medical Vision- Language Models

    Authors: Duy M. H. Nguyen, Nghiem T. Diep, Trung Q. Nguyen, Hoang-Bao Le, Tai Nguyen, Tien Nguyen, TrungTin Nguyen, Nhat Ho, Pengtao Xie, Roger Wattenhofer, James Zou, Daniel Sonntag, Mathias Niepert

    Abstract: State-of-the-art medical multi-modal LLMs (med-MLLMs), such as LLAVA-MED and BIOMEDGPT, primarily depend on scaling model size and data volume, with training driven largely by autoregressive objectives. However, we reveal that this approach can lead to weak vision-language alignment, making these models overly dependent on costly instruction-following data. To address this, we introduce EXGRA-MED,… ▽ More

    Submitted 17 June, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: Version 2

  5. arXiv:2407.04489  [pdf, other

    cs.CV

    Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model

    Authors: Duy M. H. Nguyen, An T. Le, Trung Q. Nguyen, Nghiem T. Diep, Tai Nguyen, Duy Duong-Tran, Jan Peters, Li Shen, Mathias Niepert, Daniel Sonntag

    Abstract: Prompt learning methods are gaining increasing attention due to their ability to customize large vision-language models to new domains using pre-trained contextual knowledge and minimal training data. However, existing works typically rely on optimizing unified prompt inputs, often struggling with fine-grained classification tasks due to insufficient discriminative attributes. To tackle this, we c… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Version 1

  6. arXiv:2311.11096  [pdf, other

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. It showcases impressive learning abilities across different tasks with the need for… ▽ More

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  7. arXiv:2306.11925  [pdf, other

    cs.CV

    LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching

    Authors: Duy M. H. Nguyen, Hoang Nguyen, Nghiem T. Diep, Tan N. Pham, Tri Cao, Binh T. Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Obtaining large pre-trained models that can be fine-tuned to new tasks with limited annotated samples has remained an open challenge for medical imaging data. While pre-trained deep networks on ImageNet and vision-language foundation models trained on web-scale data are prevailing approaches, their effectiveness on medical tasks is limited due to the significant domain shift between natural and me… ▽ More

    Submitted 18 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

  8. arXiv:2304.06053  [pdf, other

    cs.CV

    TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, Tuan-Anh Yang, Kim-Phat Tran, Nhu-Vinh Hoang, Minh-Quang Nguyen, E-Ro Nguyen, Minh-Khoi Nguyen-Nhat, Tuan-An To, Trung-Truc Huynh-Le, Nham-Tan Nguyen, Hoang-Chau Luong , et al. (8 additional authors not shown)

    Abstract: 3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC chall… ▽ More

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers and Graphics (3DOR, Journal Track)

  9. arXiv:2304.05731  [pdf, other

    cs.CV

    SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Nhat-Quynh Le-Pham, Huu-Phuc Pham, Trong-Vu Hoang, Quang-Binh Nguyen, Trong-Hieu Nguyen-Mau, Tuan-Luc Huynh, Thanh-Danh Le, Ngoc-Linh Nguyen-Ha, Tuong-Vy Truong-Thuy, Truong Hoai Phong, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran , et al. (9 additional authors not shown)

    Abstract: The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this… ▽ More

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers & Graphics (3DOR 2023, Journal track)