-
The Rise of Bluesky
Authors:
Ozgur Can Seckin,
Filipi Nascimento Silva,
Bao Tran Truong,
Sangyeon Kim,
Fan Huang,
Nick Liu,
Alessandro Flammini,
Filippo Menczer
Abstract:
This study investigates the rapid growth and evolving network structure of Bluesky from August 2023 to February 2025. Through multiple waves of user migrations, the platform has reached a stable, persistently active user base. The growth process has given rise to a dense follower network with clustering and hub features that favor viral information diffusion. These developments highlight engagement and structural similarities between Bluesky and established platforms.
Submitted 17 April, 2025;
originally announced April 2025.
-
Labeled Datasets for Research on Information Operations
Authors:
Ozgur Can Seckin,
Manita Pote,
Alexander Nwala,
Lake Yin,
Luca Luceri,
Alessandro Flammini,
Filippo Menczer
Abstract:
Social media platforms have become a hub for political activities and discussions, democratizing participation in these endeavors. However, they have also become an incubator for manipulation campaigns, like information operations (IOs). Some social media platforms have released datasets related to such IOs originating from different countries. However, we lack comprehensive control data that can enable the development of IO detection methods. To bridge this gap, we present new labeled datasets about 26 campaigns, which contain both IO posts verified by a social media platform and over 13M posts by 303k accounts that discussed similar topics in the same time frames (control data). The datasets will facilitate the study of narratives, network interactions, and engagement strategies employed by coordinated accounts across various campaigns and countries. By comparing these coordinated accounts against organic ones, researchers can develop and benchmark IO detection algorithms.
Submitted 19 November, 2024; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Implicit degree bias in the link prediction task
Authors:
Rachith Aiyappa,
Xin Wang,
Munjung Kim,
Ozgur Can Seckin,
Jisung Yoon,
Yong-Yeol Ahn,
Sadamori Kojaku
Abstract:
Link prediction, the task of distinguishing actual hidden edges from random unconnected node pairs, is one of the quintessential tasks in graph machine learning. Despite being widely accepted as a universal benchmark and a downstream task for representation learning, the validity of the link prediction benchmark itself has rarely been questioned. Here, we show that the common edge sampling procedure in the link prediction task has an implicit bias toward high-degree nodes and produces a highly skewed evaluation that favors methods overly dependent on node degree, to the extent that a "null" link prediction method based solely on node degree can yield nearly optimal performance. We propose a degree-corrected link prediction task that offers a more reasonable assessment that aligns better with the performance in the recommendation task. Finally, we demonstrate that the degree-corrected benchmark can more effectively train graph machine-learning models by reducing overfitting to node degrees and facilitating the learning of relevant structures in graphs.
Submitted 29 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
A Multi-Platform Collection of Social Media Posts about the 2022 U.S. Midterm Elections
Authors:
Rachith Aiyappa,
Matthew R. DeVerna,
Manita Pote,
Bao Tran Truong,
Wanying Zhao,
David Axelrod,
Aria Pessianzadeh,
Zoher Kachwala,
Munjung Kim,
Ozgur Can Seckin,
Minsuk Kim,
Sunny Gandhi,
Amrutha Manikonda,
Francesco Pierri,
Filippo Menczer,
Kai-Cheng Yang
Abstract:
Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Therefore, data from social media has enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections based on a comprehensive list of keywords and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.
Submitted 26 March, 2023; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Academic Support Network Reflects Doctoral Experience and Productivity
Authors:
Ozgur Can Seckin,
Onur Varol
Abstract:
Current practices of quantifying performance by productivity raise serious concerns for the psychological well-being of doctoral students, and the influence of the research environment is often neglected in research evaluations. Acknowledgements in dissertations reflect the student experience and provide an opportunity to thank the people who support them. We conduct textual analysis of acknowledgements to build the "academic support network," uncovering five distinct communities: Academic, Administration, Family, Friends & Colleagues, and Spiritual, each of which is acknowledged differently across genders and disciplines. Female students mention fewer people from each community except for their families, and the total number of people mentioned in acknowledgements allows disciplines to be categorized as either individual science or team science. We also show that the number of people mentioned from the academic community is positively correlated with productivity. Institutional rankings are found to be correlated with productivity and with the size of academic support networks but show no effect on the sentiment students express in acknowledgements. Our results indicate the importance of academic support networks by explaining how they differ and how they influence productivity.
Submitted 7 March, 2022;
originally announced March 2022.