-
Multimodal, Multi-Disease Medical Imaging Foundation Model (MerMED-FM)
Authors:
Yang Zhou,
Chrystie Wan Ning Quek,
Jun Zhou,
Yan Wang,
Yang Bai,
Yuhe Ke,
Jie Yao,
Laura Gutierrez,
Zhen Ling Teo,
Darren Shu Jeng Ting,
Brian T. Soetikno,
Christopher S. Nielsen,
Tobias Elze,
Zengxiang Li,
Linh Le Dinh,
Lionel Tim-Ee Cheng,
Tran Nguyen Tuan Anh,
Chee Leong Cheng,
Tien Yin Wong,
Nan Liu,
Iain Beehuat Tan,
Tony Kiat Hon Lim,
Rick Siow Mong Goh,
Yong Liu,
Daniel Shu Wei Ting
Abstract:
Current artificial intelligence models for medical imaging are predominantly single modality and single disease. Attempts to create multimodal and multi-disease models have resulted in inconsistent clinical accuracy. Furthermore, training these models typically requires large, labour-intensive, well-labelled datasets. We developed MerMED-FM, a state-of-the-art multimodal, multi-specialty foundation model trained using self-supervised learning and a memory module. MerMED-FM was trained on 3.3 million medical images from over ten specialties and seven modalities, including computed tomography (CT), chest X-rays (CXR), ultrasound (US), pathology patches, color fundus photography (CFP), optical coherence tomography (OCT) and dermatology images. MerMED-FM was evaluated across multiple diseases and compared against existing foundational models. Strong performance was achieved across all modalities, with AUROCs of 0.988 (OCT); 0.982 (pathology); 0.951 (US); 0.943 (CT); 0.931 (skin); 0.894 (CFP); 0.858 (CXR). MerMED-FM has the potential to be a highly adaptable, versatile, cross-specialty foundation model that enables robust medical imaging interpretation across diverse medical disciplines.
Submitted 30 June, 2025;
originally announced July 2025.
-
Building Footprint Extraction in Dense Areas using Super Resolution and Frame Field Learning
Authors:
Vuong Nguyen,
Anh Ho,
Duc-Anh Vu,
Nguyen Thi Ngoc Anh,
Tran Ngoc Thang
Abstract:
Despite notable results on standard aerial datasets, current state-of-the-art methods fail to produce accurate building footprints in dense areas because of the challenging properties of these areas and limited data availability. In this paper, we propose a framework to address these issues in polygonal building extraction. First, super resolution is employed to enhance the spatial resolution of aerial imagery, allowing finer details to be captured. The enhanced imagery serves as input to a multitask learning module, which consists of a segmentation head and a frame field learning head to effectively handle irregular building structures. Our model is supervised with adaptive loss weighting, enabling the extraction of sharp edges and fine-grained polygons, which are difficult to obtain because of overlapping buildings and low data quality. Extensive experiments on a slum area in India that mimics a dense area demonstrate that our proposed approach outperforms the current state-of-the-art methods by a large margin.
Submitted 4 September, 2023;
originally announced September 2023.
-
SemiMemes: A Semi-supervised Learning Approach for Multimodal Memes Analysis
Authors:
Pham Thai Hoang Tung,
Nguyen Tan Viet,
Ngo Tien Anh,
Phan Duy Hung
Abstract:
The prevalence of memes on social media has created the need to analyze the sentiment of their underlying meanings in order to censor harmful content. Machine-learning-based meme censoring systems call for a semi-supervised learning solution that takes advantage of the large number of unlabeled memes available on the internet and makes the annotation process less challenging. Moreover, the approach needs to utilize multimodal data, as memes' meanings usually come from both images and texts. This research proposes a multimodal semi-supervised learning approach that outperforms other state-of-the-art multimodal semi-supervised and supervised learning models on two datasets, the Multimedia Automatic Misogyny Identification and Hateful Memes datasets. Building on the insights gained from Contrastive Language-Image Pre-training, which is an effective multimodal learning technique, this research introduces SemiMemes, a novel training method that combines an auto-encoder with a classification task to make use of the abundant unlabeled data.
Submitted 16 May, 2023; v1 submitted 31 March, 2023;
originally announced April 2023.
-
Integrated Science, Technology, Engineering and Mathematics (STEM) Education through Active Experience of Designing Technical Toys in Vietnamese Schools
Authors:
Le Xuan Quang,
Le Huy Hoang,
Vu Dinh Chuan,
Nguyen Hoai Nam,
Nguyen Thi Tu Anh,
Vu Thi Hong Nhung
Abstract:
STEM has attracted considerable attention. The purposes of this research are to: (1) study STEM education, (2) explore STEM education through creative and experiential activities, and (3) suggest applying STEM education by designing technical toys for middle school students. This study used a qualitative approach to carry out integrated teaching for STEM education. The study was applied to technology teaching in Vietnamese middle schools. The design was carried out at the Faculty of Technology Education, Hanoi National University of Education, Vietnam, in April 2015. This study used the integrated approach to design subjects for STEM education. Two procedures for integration were undertaken and analyzed. A sample technical toy project was consistent with developing students' competencies. An integrated approach to STEM education through designing technical toys is possible. Recently, there has been booming interest in Integrated Science, Technology, Engineering and Mathematics (STEM) education, but the approaches to STEM remain controversial in diverse educational contexts. This study addressed this issue by exploring STEM education with the use of creative and experiential activities in a Vietnamese educational context. It also proposed a practical model for integrating STEM into technology teaching in secondary schools by designing technical toys. The implementation of the practical model suggests that it is feasible to use the integrated approach to STEM education through designing technical toys for middle school students in Vietnam. By applying subject knowledge to solve real-world problems, the students can experience the benefits of concrete and active learning in a meaningful and practical context. The multidisciplinary and interdisciplinary integration approaches are consistent with the development of students' competencies.
Submitted 13 September, 2015;
originally announced September 2015.