Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models
Authors:
Cao Yuxuan,
Wu Jiayang,
Alistair Cheong Liang Chuen,
Bryan Shan Guanrong,
Theodore Lee Chong Jen,
Sherman Chann Zhi Shen
Abstract:
Traditional online content moderation systems struggle to classify modern multimodal means of communication such as memes, a highly nuanced and information-dense medium. The task is especially hard in a culturally diverse society like Singapore, where low-resource languages are used and extensive knowledge of local context is needed to interpret online content. We curate a large collection of 112K memes labeled by GPT-4V for fine-tuning a vision-language model (VLM) to classify offensive memes in the Singapore context. We show the effectiveness of fine-tuned VLMs on our dataset, and propose a pipeline containing OCR, translation, and a 7-billion-parameter-class VLM. Our solutions reach 80.62% accuracy and 0.8192 AUROC on a held-out test set, and can greatly aid humans in moderating online content. The dataset, code, and model weights have been open-sourced at https://github.com/aliencaocao/vlm-for-memes-aisg.
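For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and AUROC can be computed from binary labels and model scores. It is not taken from the authors' repository: the function names, the 0.5 decision threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples whose thresholded score matches the label.

    The 0.5 threshold is an illustrative assumption, not the paper's setting.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is fine
    for a small illustration but slow for large test sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = offensive, 0 = not offensive.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.3]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_labels, toy_scores))
    print("AUROC:", auroc(toy_labels, toy_scores))
```

Accuracy depends on the chosen threshold, while AUROC summarizes ranking quality across all thresholds, which is why the two are commonly reported together.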
Submitted 8 March, 2025; v1 submitted 25 February, 2025; originally announced February 2025.