Showing 1–4 of 4 results for author: Saiyin, J

  1. arXiv:2506.23785  [pdf, ps, other]

    cs.CV

    Visual Textualization for Image Prompted Object Detection

    Authors: Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Yan Xu

    Abstract: We propose VisTex-OVLM, a novel image prompted object detection method that introduces visual textualization -- a process that projects a few visual exemplars into the text feature space to enhance Object-level Vision-Language Models' (OVLMs) capability in detecting rare categories that are difficult to describe textually and nearly absent from their pre-training data, while preserving their pre-t…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: Accepted by ICCV 2025

  2. AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models

    Authors: Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yan Xu

    Abstract: Large-scale visual-language pre-trained models (VLPMs) have demonstrated exceptional performance in downstream object detection through text prompts for natural scenes. However, their application to zero-shot nuclei detection on histopathology images remains relatively unexplored, mainly due to the significant gap between the characteristics of medical images and the web-originated text-image pair…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: This article has been accepted for publication in a future issue of IEEE Transactions on Medical Imaging (TMI), but has not been fully edited. Content may change prior to final publication. Citation information: DOI: https://doi.org/10.1109/TMI.2024.3473745 . Code: https://github.com/wuyongjianCODE/AttriPrompter

  3. arXiv:2407.11414  [pdf, other]

    cs.CV

    SDPT: Synchronous Dual Prompt Tuning for Fusion-based Visual-Language Pre-trained Models

    Authors: Yang Zhou, Yongjian Wu, Jiya Saiyin, Bingzheng Wei, Maode Lai, Eric Chang, Yan Xu

    Abstract: Prompt tuning methods have achieved remarkable success in parameter-efficient fine-tuning on large pre-trained models. However, their application to dual-modal fusion-based visual-language pre-trained models (VLPMs), such as GLIP, has encountered issues. Existing prompt tuning methods have not effectively addressed the modal mapping and aligning problem for tokens in different modalities, leading…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  4. arXiv:2306.17659  [pdf, other]

    cs.CV

    Zero-shot Nuclei Detection via Visual-Language Pre-trained Models

    Authors: Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yubo Fan, Yan Xu

    Abstract: Large-scale visual-language pre-trained models (VLPM) have proven their excellent performance in downstream object detection for natural scenes. However, zero-shot nuclei detection on H&E images via VLPMs remains underexplored. The large gap between medical images and the web-originated text-image pairs used for pre-training makes it a challenging task. In this paper, we attempt to explore the po…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: This article has been accepted by MICCAI 2023, but has not been fully edited. Content may change prior to final publication