
Showing 1–11 of 11 results for author: Fung, A

Searching in archive cs.
  1. arXiv:2503.22592  [pdf, other]

    eess.IV cs.AI cs.CV

    KEVS: Enhancing Segmentation of Visceral Adipose Tissue in Pre-Cystectomy CT with Gaussian Kernel Density Estimation

    Authors: Thomas Boucher, Nicholas Tetlow, Annie Fung, Amy Dewar, Pietro Arina, Sven Kerneis, John Whittle, Evangelos B. Mazomenos

    Abstract: Purpose: The distribution of visceral adipose tissue (VAT) in cystectomy patients is indicative of the incidence of post-operative complications. Existing VAT segmentation methods for computed tomography (CT) employing intensity thresholding have limitations relating to inter-observer variability. Moreover, the difficulty in creating ground-truth masks limits the development of deep learning (DL)…

    Submitted 28 March, 2025; originally announced March 2025.

    Comments: Preprint for submission to IPCAI special edition of IJCARS 2025, version prior to any peer review

  2. arXiv:2502.00114  [pdf]

    cs.RO cs.CV

    Mobile Robot Navigation Using Hand-Drawn Maps: A Vision Language Model Approach

    Authors: Aaron Hao Tan, Angus Fung, Haitong Wang, Goldie Nejat

    Abstract: Hand-drawn maps can be used to convey navigation instructions between humans and robots in a natural and efficient manner. However, these maps can often contain inaccuracies such as scale distortions and missing landmarks which present challenges for mobile robot navigation. This paper introduces a novel Hand-drawn Map Navigation (HAM-Nav) architecture that leverages pre-trained vision language mo…

    Submitted 28 April, 2025; v1 submitted 31 January, 2025; originally announced February 2025.

    Comments: 8 pages, 8 figures

  3. arXiv:2412.00103  [pdf]

    cs.RO cs.AI cs.LG

    MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models

    Authors: Angus Fung, Aaron Hao Tan, Haitong Wang, Beno Benhabib, Goldie Nejat

    Abstract: Robotic search of people in human-centered environments, including healthcare settings, is challenging as autonomous robots need to locate people without complete or any prior knowledge of their schedules, plans or locations. Furthermore, robots need to be able to adapt to real-time events that can influence a person's plan in an environment. In this paper, we present MLLM-Search, a novel zero-sho…

    Submitted 27 November, 2024; originally announced December 2024.

  4. arXiv:2410.00388  [pdf]

    cs.RO

    Find Everything: A General Vision Language Model Approach to Multi-Object Search

    Authors: Daniel Choi, Angus Fung, Haitong Wang, Aaron Hao Tan

    Abstract: The Multi-Object Search (MOS) problem involves navigating to a sequence of locations to maximize the likelihood of finding target objects while minimizing travel costs. In this paper, we introduce a novel approach to the MOS problem, called Finder, which leverages vision language models (VLMs) to locate multiple objects across diverse environments. Specifically, our approach introduces multi-chann…

    Submitted 1 March, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: 8 pages, 5 figures

  5. arXiv:2402.08774  [pdf]

    cs.CV cs.RO

    LDTrack: Dynamic People Tracking by Service Robots using Diffusion Models

    Authors: Angus Fung, Beno Benhabib, Goldie Nejat

    Abstract: Tracking of dynamic people in cluttered and crowded human-centered environments is a challenging robotics problem due to the presence of intraclass variations including occlusions, pose deformations, and lighting variations. This paper introduces a novel deep learning architecture, using conditional latent diffusion models, the Latent Diffusion Track (LDTrack), for tracking multiple dynamic people…

    Submitted 6 November, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  6. arXiv:2307.14433  [pdf, other]

    cs.CV

    ProtoASNet: Dynamic Prototypes for Inherently Interpretable and Uncertainty-Aware Aortic Stenosis Classification in Echocardiography

    Authors: Hooman Vaseli, Ang Nan Gu, S. Neda Ahmadi Amiri, Michael Y. Tsang, Andrea Fung, Nima Kondori, Armin Saadat, Purang Abolmaesumi, Teresa S. M. Tsang

    Abstract: Aortic stenosis (AS) is a common heart valve disease that requires accurate and timely diagnosis for appropriate treatment. Most current automatic AS severity detection methods rely on black-box models with a low level of trustworthiness, which hinders clinical adoption. To address this issue, we propose ProtoASNet, a prototypical network that directly detects AS from B-mode echocardiography video…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: To be published in MICCAI 2023

  7. arXiv:2304.02478  [pdf]

    cs.CL cs.AI cs.CY

    Exploring AI-Generated Text in Student Writing: How Does AI Help?

    Authors: David James Woo, Hengky Susanto, Chi Ho Yeung, Kai Guo, April Ka Yeng Fung

    Abstract: English as a foreign language (EFL) students' use of text generated from artificial intelligence (AI) natural language generation (NLG) tools may improve their writing quality. However, it remains unclear to what extent AI-generated text in these students' writing might lead to higher-quality writing. We explored 23 Hong Kong secondary school students' attempts to write stories comprising their own words…

    Submitted 31 December, 2023; v1 submitted 10 March, 2023; originally announced April 2023.

    Comments: 45 pages, 11 figures, 3 tables

    ACM Class: J.5; K.3.1

    Journal ref: Language Learning and Technology 28(2) (2024) 183–209

  8. arXiv:2203.00187  [pdf]

    cs.RO cs.CV

    Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations

    Authors: Angus Fung, Beno Benhabib, Goldie Nejat

    Abstract: Robotic detection of people in crowded and/or cluttered human-centered environments including hospitals, long-term care, stores and airports is challenging as people can become occluded by other people or objects, and deform due to variations in clothing or pose. There can also be loss of discriminative visual features due to poor lighting. In this paper, we present a novel multimodal person detec…

    Submitted 13 February, 2024; v1 submitted 28 February, 2022; originally announced March 2022.

  9. arXiv:2103.02484  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    DeepFN: Towards Generalizable Facial Action Unit Recognition with Deep Face Normalization

    Authors: Javier Hernandez, Daniel McDuff, Ognjen Rudovic, Alberto Fung, Mary Czerwinski

    Abstract: Facial action unit recognition has many applications from market research to psychotherapy and from image captioning to entertainment. Despite its recent progress, deployment of these models has been impeded due to their limited generalization to unseen people and demographics. This work conducts an in-depth analysis of performance across several dimensions: individuals (40 subjects), genders (male…

    Submitted 3 March, 2021; originally announced March 2021.

    Journal ref: 2022 10th International Conference on Affective Computing and Intelligent Interaction (ACII)

  10. arXiv:1911.05946  [pdf, other]

    cs.CV

    A Scalable Approach for Facial Action Unit Classifier Training Using Noisy Data for Pre-Training

    Authors: Alberto Fung, Daniel McDuff

    Abstract: Machine learning systems are being used to automate many types of laborious labeling tasks. Facial action coding is an example of such a labeling task that requires copious amounts of time and a beyond average level of human domain expertise. In recent years, the use of end-to-end deep neural networks has led to significant improvements in action unit recognition performance and many network archit…

    Submitted 14 November, 2019; originally announced November 2019.

  11. Scanning of Rich Web Applications for Parameter Tampering Vulnerabilities

    Authors: Adonis P. H. Fung, Tielei Wang, K. W. Cheung, T. Y. Wong

    Abstract: Web applications require exchanging parameters between a client and a server to function properly. In real-world systems such as online banking transfer, traversing multiple pages with parameters contributed by both the user and server is a must, and hence the applications have to enforce workflow and parameter dependency controls across multiple requests. An application that applies insufficient…

    Submitted 5 June, 2014; v1 submitted 5 April, 2012; originally announced April 2012.

    Comments: 12 pages, 2 tables, 3 figures. To appear in ACM ASIA CCS'14, Kyoto, Japan

    ACM Class: H.3.5; K.4.4