Showing 1–13 of 13 results for author: Men, Q

Searching in archive cs.
  1. arXiv:2503.15414  [pdf, other]

    eess.IV cs.CV

    Federated Continual 3D Segmentation With Single-round Communication

    Authors: Can Peng, Qianhui Men, Pramit Saha, Qianye Yang, Cheng Ouyang, J. Alison Noble

    Abstract: Federated learning seeks to foster collaboration among distributed clients while preserving the privacy of their local data. Traditionally, federated learning methods assume a fixed setting in which client data and learning objectives remain constant. However, in real-world scenarios, new clients may join, and existing clients may expand the segmentation label set as task requirements evolve. In s…

    Submitted 19 March, 2025; originally announced March 2025.

  2. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  3. arXiv:2408.09931  [pdf, other]

    eess.IV cs.CV

    Pose-GuideNet: Automatic Scanning Guidance for Fetal Head Ultrasound from Pose Estimation

    Authors: Qianhui Men, Xiaoqing Guo, Aris T. Papageorghiou, J. Alison Noble

    Abstract: 3D pose estimation from a 2D cross-sectional view enables healthcare professionals to navigate through the 3D space, and such techniques initiate automatic guidance in many image-guided radiology applications. In this work, we investigate how estimating 3D fetal pose from freehand 2D ultrasound scanning can guide a sonographer to locate a head standard plane. Fetal head pose is estimated by the pr…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted by MICCAI2024

  4. arXiv:2408.03761  [pdf, other]

    cs.CV

    MMSummary: Multimodal Summary Generation for Fetal Ultrasound Video

    Authors: Xiaoqing Guo, Qianhui Men, J. Alison Noble

    Abstract: We present the first automated multimodal summary generation system, MMSummary, for medical imaging video, particularly with a focus on fetal ultrasound analysis. Imitating the examination process performed by a human sonographer, MMSummary is designed as a three-stage pipeline, progressing from keyframe detection to keyframe captioning and finally anatomy segmentation and measurement. In the keyf…

    Submitted 30 October, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: MICCAI 2024

  5. arXiv:2406.07212  [pdf, other]

    cs.CL cs.AI cs.HC

    Trustworthy and Practical AI for Healthcare: A Guided Deferral System with Large Language Models

    Authors: Joshua Strong, Qianhui Men, Alison Noble

    Abstract: Large language models (LLMs) offer a valuable technology for various applications in healthcare. However, their tendency to hallucinate and the existing reliance on proprietary systems pose challenges in environments concerning critical decision-making and strict data privacy regulations, such as healthcare, where the trust in such systems is paramount. Through combining the strengths and discount…

    Submitted 25 February, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: AAAI-AISI 2025

  6. arXiv:2304.00858  [pdf, other]

    cs.CV

    Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition

    Authors: Qianhui Men, Edmond S. L. Ho, Hubert P. H. Shum, Howard Leung

    Abstract: Learning view-invariant representation is a key to improving feature discrimination power for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint due to the implicit view-dependent representations. In this work, we propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL), which significantly suppresses t…

    Submitted 3 April, 2023; originally announced April 2023.

  7. arXiv:2208.08848  [pdf, other]

    cs.CV

    A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction

    Authors: Manli Zhu, Qianhui Men, Edmond S. L. Ho, Howard Leung, Hubert P. H. Shum

    Abstract: Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals and the evaluations may not always be objective. To facilitate early diagnosis, recent deep learning-based methods have shown promising results for automated analysis, w…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Journal of Medical Systems

  8. arXiv:2208.00774  [pdf, other]

    cs.GR cs.CV

    Interaction Mix and Match: Synthesizing Close Interaction using Conditional Hierarchical GAN with Multi-Hot Class Embedding

    Authors: Aman Goel, Qianhui Men, Edmond S. L. Ho

    Abstract: Synthesizing multi-character interactions is a challenging task due to the complex and varied interactions between the characters. In particular, precise spatiotemporal alignment between characters is required in generating close interactions such as dancing and fighting. Existing work in generating multi-character interactions focuses on generating a single type of reactive motion for a given seq…

    Submitted 4 August, 2022; v1 submitted 23 July, 2022; originally announced August 2022.

    Comments: Accepted to SCA 2022 (will be published in CGF)

  9. arXiv:2207.12833  [pdf, other]

    cs.CV

    Multimodal-GuideNet: Gaze-Probe Bidirectional Guidance in Obstetric Ultrasound Scanning

    Authors: Qianhui Men, Clare Teng, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble

    Abstract: Eye trackers can provide visual guidance to sonographers during ultrasound (US) scanning. Such guidance is potentially valuable for less experienced operators to improve their scanning skills on how to manipulate the probe to achieve the desired plane. In this paper, a multimodal guidance approach (Multimodal-GuideNet) is proposed to capture the stepwise dependency between a real-world US video si…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Early accepted by MICCAI 2022

  10. arXiv:2207.09425  [pdf, other]

    cs.CV

    Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos

    Authors: Tanqiu Qiao, Qianhui Men, Frederick W. B. Li, Yoshiki Kubotani, Shigeo Morishima, Hubert P. H. Shum

    Abstract: Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focusing on visual features usually suffer from occlusion in the real-world scenarios. Such a problem will be further complicated when multiple people and objects are involved in HOIs. Consider that geometric features such as human pose and object position provide meaningful informati…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV 2022

  11. arXiv:2110.00380  [pdf, other]

    cs.GR cs.CV

    GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction

    Authors: Qianhui Men, Hubert P. H. Shum, Edmond S. L. Ho, Howard Leung

    Abstract: Creating realistic characters that can react to the users' or another character's movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task due to the many different ways two humans can interact. While there are a number of successful researches in adapting the generative adversarial netwo…

    Submitted 1 October, 2021; originally announced October 2021.

  12. arXiv:2108.04740  [pdf, other]

    cs.CV

    Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction

    Authors: Ben A. Rainbow, Qianhui Men, Hubert P. H. Shum

    Abstract: Predicting the movement trajectories of multiple classes of road users in real-world scenarios is a challenging task due to the diverse trajectory patterns. While recent works of pedestrian trajectory prediction successfully modelled the influence of surrounding neighbours based on the relative distances, they are ineffective on multi-class trajectory prediction. This is because they ignore the im…

    Submitted 10 August, 2021; originally announced August 2021.

  13. arXiv:2106.04471  [pdf, other]

    cs.CV cs.LG eess.IV

    Interpreting Deep Learning based Cerebral Palsy Prediction with Channel Attention

    Authors: Manli Zhu, Qianhui Men, Edmond S. L. Ho, Howard Leung, Hubert P. H. Shum

    Abstract: Early prediction of cerebral palsy is essential as it leads to early treatment and monitoring. Deep learning has shown promising results in biomedical engineering thanks to its capacity of modelling complicated data with its non-linear architecture. However, due to their complex structure, deep learning models are generally not interpretable by humans, making it difficult for clinicians to rely on…

    Submitted 8 June, 2021; originally announced June 2021.