
Showing 1–12 of 12 results for author: Yeo, S Y

  1. arXiv:2505.15504  [pdf, ps, other]

    cs.CV cs.AI

    Beyond Linearity: Squeeze-and-Recalibrate Blocks for Few-Shot Whole Slide Image Classification

    Authors: Conghao Xiong, Zhengrui Guo, Zhe Xu, Yifei Zhang, Raymond Kai-Yu Tong, Si Yong Yeo, Hao Chen, Joseph J. Y. Sung, Irwin King

    Abstract: Deep learning has advanced computational pathology but expert annotations remain scarce. Few-shot learning mitigates annotation burdens yet suffers from overfitting and discriminative feature mischaracterization. In addition, the current few-shot multiple instance learning (MIL) approaches leverage pretrained vision-language models to alleviate these issues, but at the cost of complex preprocessin…

    Submitted 21 May, 2025; originally announced May 2025.

  2. arXiv:2503.17069  [pdf, other]

    cs.CV cs.AI

    PVChat: Personalized Video Chat with One-Shot Learning

    Authors: Yufei Shi, Weilong Yan, Gang Xu, Yumeng Li, Yuchen Li, Zhenxi Li, Fei Richard Yu, Ming Li, Si Yong Yeo

    Abstract: Video large language models (ViLLMs) excel in general video understanding, e.g., recognizing activities like talking and eating, but struggle with identity-aware comprehension, such as "Wilson is receiving chemotherapy" or "Tom is discussing with Sarah", limiting their applicability in smart healthcare and smart home environments. To address this limitation, we propose a one-shot learning framewor…

    Submitted 21 March, 2025; originally announced March 2025.

  3. arXiv:2503.06565  [pdf, other]

    cs.CV

    Future-Aware Interaction Network For Motion Forecasting

    Authors: Shijie Li, Xun Xu, Si Yong Yeo, Xulei Yang

    Abstract: Motion forecasting is a crucial component of autonomous driving systems, enabling the generation of accurate and smooth future trajectories to ensure safe navigation to the destination. In previous methods, potential future trajectories are often absent in the scene encoding stage, which may lead to suboptimal outcomes. Additionally, prior approaches typically employ transformer architectures for…

    Submitted 9 March, 2025; originally announced March 2025.

  4. arXiv:2503.01019  [pdf, other]

    cs.CV cs.AI

    MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations

    Authors: Ziyang Zhang, Yang Yu, Yucheng Chen, Xulei Yang, Si Yong Yeo

    Abstract: Despite significant progress in Vision-Language Pre-training (VLP), current approaches predominantly emphasize feature extraction and cross-modal comprehension, with limited attention to generating or transforming visual content. This gap hinders the model's ability to synthesize coherent and novel visual representations from textual prompts, thereby reducing the effectiveness of multi-modal learn…

    Submitted 20 April, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: To be published in CVPR 2025

  5. arXiv:2501.00775  [pdf, other]

    cs.HC

    MindCoder: Automated and Controllable Reasoning Chain in Qualitative Analysis

    Authors: Jie Gao, Zhiyao Shu, Shun Yi Yeo

    Abstract: Extracting insights from qualitative analysis involves a series of reasoning steps, such as open coding, grouping, and identifying themes. We introduce the MindCoder reasoning chain, built on Chain-of-Thought (CoT) prompting, to support the insight extraction process step by step, including topic clustering, code labeling, conceptualization, and reporting. We designed the MindCoder web application…

    Submitted 16 April, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

    Comments: 17 pages for main content, 3 pages for references, 10 pages for appendix

  6. arXiv:2408.10222  [pdf, other]

    eess.SP

    Near-Orthogonal Overlay Communications in LoS Channel Enabled by Novel OAM Beams without Central Energy Voids: An Experimental Study

    Authors: Yufei Zhao, Xiaoyan Ma, Yong Liang Guan, Yile Liu, Afkar Mohamed Ismail, Xiaobei Liu, Siew Yam Yeo, Chau Yuen

    Abstract: This paper introduces a groundbreaking Line-of-Sight (LoS) Multiple-Input Multiple-Output (MIMO) communication architecture leveraging non-traditional Orbital Angular Momentum (OAM) beams. Challenging the conventional paradigm of hollow-emitting OAM beams, this study presents an innovative OAM transmitter design that produces directional OAM beams without central energy voids, aligning their radia…

    Submitted 27 July, 2024; originally announced August 2024.

  7. arXiv:2407.14230  [pdf, other]

    cs.CV cs.LG

    ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading

    Authors: Zhiyuan Yang, Bo Zhang, Yufei Shi, Ningze Zhong, Johnathan Loh, Huihui Fang, Yanwu Xu, Si Yong Yeo

    Abstract: Glaucoma is one of the leading causes of vision impairment. Digital imaging techniques, such as color fundus photography (CFP) and optical coherence tomography (OCT), provide quantitative and noninvasive methods for glaucoma diagnosis. Recently, in the field of computer-aided glaucoma diagnosis, multi-modality methods that integrate the CFP and OCT modalities have achieved greater diagnostic accur…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted by Ophthalmic Medical Image Analysis Workshop at MICCAI'24

  8. arXiv:2407.05307  [pdf]

    eess.IV

    Edge-guided and Cross-scale Feature Fusion Network for Efficient Multi-contrast MRI Super-Resolution

    Authors: Zhiyuan Yang, Bo Zhang, Zhiqiang Zeng, Si Yong Yeo

    Abstract: In recent years, MRI super-resolution techniques have achieved great success, especially multi-contrast methods that extract texture information from reference images to guide the super-resolution reconstruction. However, current methods primarily focus on texture similarities at the same scale, neglecting cross-scale similarities that provide comprehensive information. Moreover, the misalignment…

    Submitted 24 August, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: Submitted to ICPR 2024

  9. Help Me Reflect: Leveraging Self-Reflection Interface Nudges to Enhance Deliberativeness on Online Deliberation Platforms

    Authors: Shun Yi Yeo, Gionnieve Lim, Jie Gao, Weiyu Zhang, Simon Tangi Perrault

    Abstract: The deliberative potential of online platforms has been widely examined. However, little is known about how various interface-based reflection nudges impact the quality of deliberation. This paper presents two user studies with 12 and 120 participants, respectively, to investigate the impacts of different reflective nudges on the quality of deliberation. In the first study, we examined five distin…

    Submitted 19 January, 2024; originally announced January 2024.

  10. arXiv:2309.17143  [pdf, other]

    cs.CV cs.AI

    Revisiting Cephalometric Landmark Detection from the view of Human Pose Estimation with Lightweight Super-Resolution Head

    Authors: Qian Wu, Si Yong Yeo, Yufei Chen, Jun Liu

    Abstract: Accurate localization of cephalometric landmarks holds great importance in the fields of orthodontics and orthognathics due to its potential for automating key point labeling. In the context of landmark detection, particularly in cephalometrics, it has been observed that existing methods often lack standardized pipelines and well-designed bias reduction processes, which significantly impact their…

    Submitted 29 September, 2023; originally announced September 2023.

  11. arXiv:2204.08129  [pdf, other]

    cs.CV

    Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding

    Authors: Xun Long Ng, Kian Eng Ong, Qichen Zheng, Yun Ni, Si Yong Yeo, Jun Liu

    Abstract: Understanding animals' behaviors is significant for a wide range of applications. However, existing animal behavior datasets have limitations in multiple aspects, including limited numbers of animal classes, data samples and provided tasks, and also limited variations in environmental conditions and viewpoints. To address these limitations, we create a large and diverse dataset, Animal Kingdom, th…

    Submitted 3 June, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR2022 (Oral). Dataset: https://sutdcv.github.io/Animal-Kingdom

  12. arXiv:1703.01025  [pdf]

    cs.CV

    A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification

    Authors: Xulei Yang, Zeng Zeng, Si Yong Yeo, Colin Tan, Hong Liang Tey, Yi Su

    Abstract: In this study, a multi-task deep neural network is proposed for skin lesion analysis. The proposed multi-task learning model solves different tasks (e.g., lesion segmentation and two independent binary lesion classifications) at the same time by exploiting commonalities and differences across tasks. This results in improved learning efficiency and potential prediction accuracy for the task-specifi…

    Submitted 2 March, 2017; originally announced March 2017.

    Comments: Submission to support ISIC 2017 challenge results