-
Involvement drives complexity of language in online debates
Authors:
Eleonora Amadori,
Daniele Cirulli,
Edoardo Di Martino,
Jacopo Nudo,
Maria Sahakyan,
Emanuele Sangiorgio,
Arnaldo Santoro,
Simon Zollo,
Alessandro Galeazzi,
Niccolò Di Marco
Abstract:
Language is a fundamental aspect of human societies, continuously evolving in response to various stimuli, including societal changes and intercultural interactions. Technological advancements have profoundly transformed communication, with social media emerging as a pivotal force that merges entertainment-driven content with complex social dynamics. As these platforms reshape public discourse, an…
▽ More
Language is a fundamental aspect of human societies, continuously evolving in response to various stimuli, including societal changes and intercultural interactions. Technological advancements have profoundly transformed communication, with social media emerging as a pivotal force that merges entertainment-driven content with complex social dynamics. As these platforms reshape public discourse, analyzing the linguistic features of user-generated content is essential to understanding their broader societal impact. In this paper, we examine the linguistic complexity of content produced by influential users on Twitter across three globally significant and contested topics: COVID-19, COP26, and the Russia-Ukraine war. By combining multiple measures of textual complexity, we assess how language use varies along four key dimensions: account type, political leaning, content reliability, and sentiment. Our analysis reveals significant differences across all four axes, including variations in language complexity between individuals and organizations, between profiles with sided versus moderate political views, and between those associated with higher versus lower reliability scores. Additionally, profiles producing more negative and offensive content tend to use more complex language, with users sharing similar political stances and reliability levels converging toward a common jargon. Our findings offer new insights into the sociolinguistic dynamics of digital platforms and contribute to a deeper understanding of how language reflects ideological and social structures in online spaces.
△ Less
Submitted 27 June, 2025;
originally announced June 2025.
-
Quantifying Polarization: A Comparative Study of Measures and Methods
Authors:
Edoardo Di Martino,
Matteo Cinelli,
Roy Cerqueti,
Walter Quattrociocchi
Abstract:
Political polarization, a key driver of social fragmentation, has drawn increasing attention for its role in shaping online and offline discourse. Despite significant efforts, accurately measuring polarization within ideological distributions remains a challenge. This study evaluates five widely used polarization measures, testing their strengths and weaknesses with synthetic datasets and a real-w…
▽ More
Political polarization, a key driver of social fragmentation, has drawn increasing attention for its role in shaping online and offline discourse. Despite significant efforts, accurately measuring polarization within ideological distributions remains a challenge. This study evaluates five widely used polarization measures, testing their strengths and weaknesses with synthetic datasets and a real-world case study on YouTube discussions during the 2020 U.S. Presidential Election. Building on these findings, we present a novel adaptation of Kleinberg's burst detection algorithm to improve mode detection in polarized distributions. By offering both a critical review and an innovative methodological tool, this work advances the analysis of ideological patterns in social media discourse.
△ Less
Submitted 13 January, 2025;
originally announced January 2025.
-
Ideological Fragmentation of the Social Media Ecosystem: From echo chambers to echo platforms
Authors:
Edoardo Di Martino,
Alessandro Galeazzi,
Michele Starnini,
Walter Quattrociocchi,
Matteo Cinelli
Abstract:
The entertainment-driven nature of social media encourages users to engage with like-minded individuals and consume content aligned with their beliefs, limiting exposure to diverse perspectives. Simultaneously, users migrate between platforms, either due to moderation policies like de-platforming or in search of environments better suited to their preferences. These dynamics drive the specializati…
▽ More
The entertainment-driven nature of social media encourages users to engage with like-minded individuals and consume content aligned with their beliefs, limiting exposure to diverse perspectives. Simultaneously, users migrate between platforms, either due to moderation policies like de-platforming or in search of environments better suited to their preferences. These dynamics drive the specialization of the social media ecosystem, shifting from internal echo chambers to "echo platforms"--entire platforms functioning as ideologically homogeneous niches. To systematically analyze this phenomenon in political discussions, we propose a quantitative approach based on three key dimensions: platform centrality, news consumption, and user base composition. We analyze 117 million posts related to the 2020 US Presidential elections from nine social media platforms--Facebook, Reddit, Twitter, YouTube, BitChute, Gab, Parler, Scored, and Voat. Our findings reveal significant differences among platforms in their centrality within the ecosystem, the reliability of circulated news, and the ideological diversity of their users, highlighting a clear divide between mainstream and alt-tech platforms. The latter occupy a peripheral role, feature a higher prevalence of unreliable content, and exhibit greater ideological uniformity. These results highlight the key dimensions shaping the fragmentation and polarization of the social media landscape.
△ Less
Submitted 4 June, 2025; v1 submitted 25 November, 2024;
originally announced November 2024.
-
Tiny Transformers for Environmental Sound Classification at the Edge
Authors:
David Elliott,
Carlos E. Otero,
Steven Wyatt,
Evan Martino
Abstract:
With the growth of the Internet of Things and the rise of Big Data, data processing and machine learning applications are being moved to cheap and low size, weight, and power (SWaP) devices at the edge, often in the form of mobile phones, embedded systems, or microcontrollers. The field of Cyber-Physical Measurements and Signature Intelligence (MASINT) makes use of these devices to analyze and exp…
▽ More
With the growth of the Internet of Things and the rise of Big Data, data processing and machine learning applications are being moved to cheap and low size, weight, and power (SWaP) devices at the edge, often in the form of mobile phones, embedded systems, or microcontrollers. The field of Cyber-Physical Measurements and Signature Intelligence (MASINT) makes use of these devices to analyze and exploit data in ways not otherwise possible, which results in increased data quality, increased security, and decreased bandwidth. However, methods to train and deploy models at the edge are limited, and models with sufficient accuracy are often too large for the edge device. Therefore, there is a clear need for techniques to create efficient AI/ML at the edge. This work presents training techniques for audio models in the field of environmental sound classification at the edge. Specifically, we design and train Transformers to classify office sounds in audio clips. Results show that a BERT-based Transformer, trained on Mel spectrograms, can outperform a CNN using 99.85% fewer parameters. To achieve this result, we first tested several audio feature extraction techniques designed for Transformers, using ESC-50 for evaluation, along with various augmentations. Our final model outperforms the state-of-the-art MFCC-based CNN on the office sounds dataset, using just over 6,000 parameters -- small enough to run on a microcontroller.
△ Less
Submitted 22 March, 2021;
originally announced March 2021.