Improving Multilingual Social Media Insights: Aspect-based Comment Analysis

Zhang, Longyin; Zou, Bowei; Aw, Ai Ti

arXiv:2505.23037 (cs)

[Submitted on 29 May 2025]

Abstract: The inherent nature of social media posts, characterized by the freedom of language use with a disjointed array of diverse opinions and topics, poses significant challenges to downstream NLP tasks such as comment clustering, comment summarization, and social media opinion analysis. To address this, we propose a granular level of identifying and generating aspect terms from individual comments to guide model attention. Specifically, we leverage multilingual large language models with supervised fine-tuning for comment aspect term generation (CAT-G), further aligning the model's predictions with human expectations through DPO. We demonstrate the effectiveness of our method in enhancing the comprehension of social media discourse on two NLP tasks. Moreover, this paper contributes the first multilingual CAT-G test set on English, Chinese, Malay, and Bahasa Indonesian. As LLM capabilities vary among languages, this test set allows for a comparative analysis of performance across languages with varying levels of LLM proficiency.

Comments:	The paper was peer-reviewed
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2505.23037 [cs.CL]
	(or arXiv:2505.23037v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2505.23037

Submission history

From: Longyin Zhang
[v1] Thu, 29 May 2025 03:24:39 UTC (704 KB)
