-
Characterizing Nodes and Edges in Dynamic Attributed Networks: A Social-based Approach
Authors:
Thiago H. P. Silva,
Alberto H. F. Laender,
Pedro O. S. Vaz de Melo
Abstract:
How to characterize nodes and edges in dynamic attributed networks based on social aspects? We address this problem by exploring the strength of the ties between actors and their associated attributes over time, thus capturing the social roles of the actors and the meaning of their dynamic interactions in different social network scenarios. For this, we apply social concepts to promote a better understanding of the underlying complexity that involves actors and their social motivations. More specifically, we explore the notion of social capital given by the strategic positioning of a particular actor in a social structure by means of the concepts of brokerage, the ability of creating bridges with diversified patterns, and closure, the ability of aggregating nodes with similar patterns. As a result, we unveil the differences of social interactions in distinct academic coauthorship networks and questions & answers communities. We also statistically validate our social definitions considering the importance of the nodes and edges in a social structure by means of network properties.
Submitted 14 July, 2022;
originally announced July 2022.
-
Prediction-Free, Real-Time Flexible Control of Tidal Lagoons through Proximal Policy Optimisation: A Case Study for the Swansea Lagoon
Authors:
Túlio Marcondes Moreira,
Jackson Geraldo de Faria Jr,
Pedro O. S. Vaz de Melo,
Luiz Chaimowicz,
Gilberto Medeiros-Ribeiro
Abstract:
Tidal Range Structures (TRS) have been considered for large-scale electricity generation for their potential ability to produce reasonably predictable energy without the emission of greenhouse gases. Since the main forcing components driving the tides have deterministic dynamics, the available energy in a given TRS has been estimated, through analytical and numerical optimisation routines, as a mostly predictable event. This constraint forces state-of-the-art flexible operation methods to rely on tidal predictions to infer the best operational strategies for TRS, with the additional cost of having to run optimisation routines for every new tide. In this paper, a Deep Reinforcement Learning approach (Proximal Policy Optimisation through Unity ML-Agents) is introduced to perform automatic operation of TRS. For validation, the performance of the proposed method is compared with six different operation optimisation approaches devised from the literature, utilising the Swansea Bay Tidal Lagoon as a case study. We show that our approach is successful in maximising energy generation through an optimised operational policy of turbines and sluices, yielding results competitive with state-of-the-art optimisation strategies, with the clear advantages of requiring training only once and performing real-time automatic control of TRS with measured ocean data only.
Submitted 23 January, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Overcoming Bias in Community Detection Evaluation
Authors:
Jeancarlo Campos Leão,
Alberto H. F. Laender,
Pedro O. S. Vaz de Melo
Abstract:
Community detection is a key task to further understand the function and the structure of complex networks. Therefore, a strategy used to assess this task must be able to avoid biased and incorrect results that might invalidate further analyses or applications that rely on such communities. Two widely used strategies to assess this task are generally known as structural and functional. The structural strategy basically consists in detecting and assessing such communities by using multiple methods and structural metrics. On the other hand, the functional strategy might be used when ground truth data are available to assess the detected communities. However, the evaluation of communities based on such strategies is usually done in experimental configurations that are largely susceptible to biases, a situation that is inherent to algorithms, metrics and network data used in this task. Furthermore, such strategies are not systematically combined in a way that allows for the identification and mitigation of bias in the algorithms, metrics or network data to converge into more consistent results. In this context, the main contribution of this article is an approach that supports a robust quality evaluation when detecting communities in real-world networks. In our approach, we measure the quality of a community by applying the structural and functional strategies, and the combination of both, to obtain different pieces of evidence. Then, we consider the divergences and the consensus among the pieces of evidence to identify and overcome possible sources of bias in community detection algorithms, evaluation metrics, and network data. Experiments conducted with several real and synthetic networks provided results that show the effectiveness of our approach to obtain more consistent conclusions about the quality of the detected communities.
Submitted 5 February, 2021;
originally announced February 2021.
-
A Multi-Strategy Approach to Overcoming Bias in Community Detection Evaluation
Authors:
Jeancarlo Campos Leão,
Alberto H. F. Laender,
Pedro O. S. Vaz de Melo
Abstract:
Community detection is key to understand the structure of complex networks. However, the lack of appropriate evaluation strategies for this specific task may produce biased and incorrect results that might invalidate further analyses or applications based on such networks. In this context, the main contribution of this paper is an approach that supports a robust quality evaluation when detecting communities in real-world networks. In our approach, we use multiple strategies that capture distinct aspects of the communities. The conclusion on the quality of these communities is based on the consensus among the strategies adopted for the structural evaluation, as well as on the comparison with communities detected by different methods and with their existing ground truths. In this way, our approach allows one to overcome biases in network data, detection algorithms and evaluation metrics, thus providing more consistent conclusions about the quality of the detected communities. Experiments conducted with several real and synthetic networks provided results that show the effectiveness of our approach.
Submitted 21 September, 2019;
originally announced September 2019.
-
Can WhatsApp Counter Misinformation by Limiting Message Forwarding?
Authors:
Philipe de Freitas Melo,
Carolina Coimbra Vieira,
Kiran Garimella,
Pedro O. S. Vaz de Melo,
Fabrício Benevenuto
Abstract:
WhatsApp is the most popular messaging app in the world. The closed nature of the app, in addition to the ease of transferring multimedia and sharing information to large-scale groups, makes WhatsApp unique among other platforms, where an anonymous encrypted message can become viral, reaching multiple users in a short period of time. The personal feeling and immediacy of messages directly delivered to the user's phone on WhatsApp were extensively abused to spread unfounded rumors and create misinformation campaigns during recent elections in Brazil and India. WhatsApp has been deploying measures to mitigate this problem, such as reducing the limit for forwarding a message to at most five users at once. Despite the welcomed effort to counter the problem, there is no evidence so far on the real effectiveness of such restrictions. In this work, we propose a methodology to evaluate the effectiveness of such measures on the spreading of misinformation circulating on WhatsApp. We use an epidemiological model and real data gathered from WhatsApp in Brazil, India and Indonesia to assess the impact of limiting virality features in this kind of network. Our results suggest that the current efforts deployed by WhatsApp can offer significant delays on the information spread, but they are ineffective in blocking the propagation of misinformation campaigns through public groups when the content has a high viral nature.
Submitted 23 September, 2019; v1 submitted 18 September, 2019;
originally announced September 2019.
-
Improving Community Detection by Mining Social Interactions
Authors:
Jeancarlo Campos Leão,
Michele Amaral Brandão,
Pedro O. S. Vaz de Melo,
Alberto H. F. Laender
Abstract:
Social relationships can be divided into different classes based on the regularity with which they occur and the similarity among them. Thus, rare and somewhat similar relationships are random and cause noise in a social network, thus hiding the actual structure of the network and preventing an accurate analysis of it. In this context, in this paper we propose a process to handle social network data that exploits temporal features to improve the detection of communities by existing algorithms. By removing random interactions, we observe that social networks converge to a topology with more purely social relationships and more modular communities.
Submitted 4 October, 2018; v1 submitted 3 October, 2018;
originally announced October 2018.
-
Fast Estimation of Causal Interactions using Wold Processes
Authors:
Flavio Figueiredo,
Guilherme Borges,
Pedro O. S. Vaz de Melo,
Renato M. Assunção
Abstract:
We here focus on the task of learning Granger causality matrices for multivariate point processes. In order to accomplish this task, our work is the first to explore the use of Wold processes. By doing so, we are able to develop asymptotically fast MCMC learning algorithms. With $N$ being the total number of events and $K$ the number of processes, our learning algorithm has an $O(N(\,\log(N)\,+\,\log(K)))$ cost per iteration. This is much faster than the $O(N^3\,K^2)$ or $O(K^3)$ for the state of the art. Our approach, called GrangerBusca, is validated on nine datasets. This is an advance in relation to most prior efforts, which focus mostly on subsets of the Memetracker data. Regarding accuracy, GrangerBusca is three times more accurate (in Precision@10) than the state of the art for the commonly explored subsets of Memetracker. Due to GrangerBusca's much lower training complexity, our approach is the only one able to train models for larger, full sets of data.
Submitted 2 December, 2018; v1 submitted 12 July, 2018;
originally announced July 2018.
-
When Politicians Talk About Politics: Identifying Political Tweets of Brazilian Congressmen
Authors:
Lucas S. Oliveira,
Pedro O. S. Vaz de Melo,
Marcelo S. Amaral,
José Antônio. G. Pinho
Abstract:
Since June 2013, when Brazil faced the largest and most significant mass protests in a generation, a political crisis has been under way. In the midst of this crisis, Brazilian politicians use social media to communicate with the electorate in order to retain or to grow their political capital. The problem is that many controversial topics are ongoing and deputies may prefer to avoid such themes in their messages. To characterize this behavior, we propose a method to accurately identify political and non-political tweets independently of the deputy who posted them and of the time they were posted. Moreover, we collected tweets of all congressmen who were active on Twitter and worked in the Brazilian parliament from October 2013 to October 2017. To evaluate our method, we used word clouds and a topic model to identify the main political and non-political latent topics in parliamentarian tweets. Both results indicate that our proposal is able to accurately distinguish political from non-political tweets. Moreover, our analyses revealed a striking fact: more than half of the messages posted by Brazilian deputies are non-political.
Submitted 3 May, 2018;
originally announced May 2018.
-
GRM: Group Regularity Mobility Model
Authors:
Ivan O. Nunes,
Clayson Celes,
Michael D. Silva,
Pedro O. S. Vaz de Melo,
Antonio A. F. Loureiro
Abstract:
In this work we propose, implement, and evaluate GRM, a novel mobility model that accounts for the role of group meeting dynamics and regularity in human mobility. Specifically, we show that existing mobility models for humans do not capture the regularity of human group meetings which is present in real mobility traces. Next, we characterize the statistical properties of such group meetings in real mobility traces and design GRM accordingly. We show that GRM maintains the typical pairwise contact properties of real traces, such as contact duration and inter-contact time distributions. In addition, GRM accounts for the role of group mobility, presenting group meetings regularity and social communities' structure. Finally, we evaluate state-of-art social-aware protocols for opportunistic routing using a synthetic contact trace generated by our model. The results show that the behavior of such protocols in our model is similar to their behavior in real mobility traces.
Submitted 24 June, 2017;
originally announced June 2017.
-
GROUPS-NET: Group Meetings Aware Routing in Multi-Hop D2D Networks
Authors:
Ivan O. Nunes,
Clayson Celes,
Pedro O. S. Vaz de Melo,
Antonio A. F. Loureiro
Abstract:
In the next generation cellular networks, device-to-device (D2D) communication is already considered a fundamental feature. A problem of multi-hop D2D networks is how to define forwarding algorithms that achieve, at the same time, high delivery ratio and low network overhead. In this paper we aim to understand group meetings' properties by looking at their structure and regularity with the final goal of applying such knowledge in the design of a forwarding algorithm for D2D multi-hop networks. We introduce a forwarding protocol, namely GROUPS-NET, which is aware of social group meetings and their evolution over time. Our algorithm is parameter-calibration free and does not require any knowledge about the social network structure of the system. In particular, different from the state-of-the-art algorithms, GROUPS-NET does not need community detection, which is a complex and expensive task. We validate our algorithm using different publicly available data sources. In real large-scale scenarios, our algorithm achieves approximately the same delivery ratio as the state-of-the-art solution with up to 40% less network overhead.
Submitted 14 August, 2017; v1 submitted 24 May, 2016;
originally announced May 2016.
-
Burstiness Scale: a highly parsimonious model for characterizing random series of events
Authors:
Rodrigo A S Alves,
Renato Assunção,
Pedro O S Vaz de Melo
Abstract:
The problem of accurately and parsimoniously characterizing random series of events (RSEs) present in the Web, such as e-mail conversations or Twitter hashtags, is not trivial. Reports found in the literature reveal two apparently conflicting visions of how RSEs should be modeled. On one side are the Poissonian processes, in which consecutive events follow each other at relatively regular times and should not be correlated. On the other side are the self-exciting processes, which are able to generate bursts of correlated events and periods of inactivity. The existence of many and sometimes conflicting approaches to model RSEs is a consequence of the unpredictability of the aggregated dynamics of our individual and routine activities, which sometimes show simple patterns but sometimes result in irregular rising and falling trends. In this paper we propose a highly parsimonious way to characterize general RSEs, namely the Burstiness Scale (BuSca) model. BuSca views each RSE as a mix of two independent processes: a Poissonian and a self-exciting one. Here we describe a fast method to extract the two parameters of BuSca that, together, give the burstiness scale, which represents how much of the RSE is due to bursty and viral effects. We validated our method on eight diverse and large datasets containing real random series of events seen in Twitter, Yelp, e-mail conversations, Digg, and online forums. Results showed that, even using only two parameters, BuSca is able to accurately describe RSEs seen in these diverse systems, which can benefit many applications.
Submitted 20 February, 2016;
originally announced February 2016.
-
Group Mobility: Detection, Tracking and Characterization
Authors:
Ivan Oliveira Nunes,
Pedro O. S. Vaz de Melo,
Antonio A. F. Loureiro
Abstract:
In the era of mobile computing, understanding human mobility patterns is crucial in order to better design protocols and applications. Many studies focus on different aspects of human mobility such as people's points of interest, routes, traffic, individual mobility patterns, among others. In this work, we propose to look at human mobility through a social perspective, i.e., analyze the impact of social groups on mobility patterns. We use the MIT Reality Mining proximity trace to detect, track and investigate groups' evolution over time. Our results show that group meetings happen periodically and present daily and weekly periodicity. We analyze how groups' dynamics change over the hours of the day and find that group meetings lasting longer are those with fewer changes in member composition and with members having stronger social bonds with each other. Our findings can be used to propose meeting prediction algorithms, opportunistic routing and information diffusion protocols, taking advantage of those revealed properties.
Submitted 15 December, 2015;
originally announced December 2015.
-
How Many Political Parties Should Brazil Have? A Data-driven Method to Assess and Reduce Fragmentation in Multi-Party Political Systems
Authors:
Pedro O. S. Vaz de Melo
Abstract:
In June 2013, Brazil faced the largest and most significant mass protests in a generation. These were exacerbated by the population's disenchantment with its highly fragmented party system, which is composed of a very large number of political parties. Under these circumstances, presidents are constrained by informal coalition governments, bringing very harmful consequences to the country. In this work I propose ARRANGE, a dAta dRiven method foR Assessing and reduciNG party fragmEntation in a country. ARRANGE uses as input the roll call data for congress votes on bills and amendments as a proxy for political preferences and ideology. With that, ARRANGE finds the minimum number of parties required to house all congressmen without decreasing party discipline. When applied to Brazil's historical roll call data, ARRANGE was able to generate 23 distinct configurations that, compared with the status quo, have (i) a significantly smaller number of parties, (ii) a higher discipline of partisans towards their parties and (iii) a more even distribution of partisans into parties. ARRANGE is fast and parsimonious, relying on a single, intuitive parameter.
Submitted 26 October, 2015; v1 submitted 28 September, 2015;
originally announced September 2015.
-
Breaking the News: First Impressions Matter on Online News
Authors:
Julio Reis,
Fabrício Benevenuto,
Pedro O. S. Vaz de Melo,
Raquel Prates,
Haewoon Kwak,
Jisun An
Abstract:
A growing number of people are changing the way they consume news, replacing the traditional physical newspapers and magazines with their virtual online versions and/or weblogs. The interactivity and immediacy present in online news are changing the way news is being produced and exposed by media corporations. News websites have to create effective strategies to catch people's attention and attract their clicks. In this paper we investigate possible strategies used by online news corporations in the design of their news headlines. We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum of eight consecutive months in 2014. In order to discover strategies that could be used to attract clicks, we extracted features from the text of the news headlines related to the sentiment polarity of the headline. We discovered that the sentiment of the headline is strongly related to the popularity of the news and also to the dynamics of the posted comments on that particular news.
Submitted 16 April, 2015; v1 submitted 26 March, 2015;
originally announced March 2015.
-
You are What you Eat (and Drink): Identifying Cultural Boundaries by Analyzing Food & Drink Habits in Foursquare
Authors:
Thiago H Silva,
Pedro O S Vaz de Melo,
Jussara Almeida,
Mirco Musolesi,
Antonio Loureiro
Abstract:
Food and drink are two of the most basic needs of human beings. However, as society evolved, food and drink also became a strong cultural aspect, able to describe strong differences among people. Traditional methods used to analyze cross-cultural differences are mainly based on surveys and, for this reason, can hardly represent a significant statistical sample at a global scale. In this paper, we propose a new methodology to identify cultural boundaries and similarities across populations at different scales based on the analysis of Foursquare check-ins. This approach might be useful not only for economic purposes, but also to support existing and novel marketing and social applications. Our methodology consists of the following steps. First, we map food and drink related check-ins extracted from Foursquare into users' cultural preferences. Second, we identify particular individual preferences, such as the taste for a certain type of food or drink, e.g., pizza or sake, as well as temporal habits, such as the time and day of the week when an individual goes to a restaurant or a bar. Third, we show how to analyze this information to assess the cultural distance between two countries, cities or even areas of a city. Fourth, we apply a simple clustering technique, using this cultural distance measure, to draw cultural boundaries across countries, cities and regions.
Submitted 3 April, 2014;
originally announced April 2014.
-
Universal and Distinct Properties of Communication Dynamics: How to Generate Realistic Inter-event Times
Authors:
Pedro O. S. Vaz de Melo,
Christos Faloutsos,
Renato Assunção,
Rodrigo Alves,
Antonio A. F. Loureiro
Abstract:
With the advancement of information systems, means of communication are becoming cheaper, faster and more available. Today, millions of people carrying smart-phones or tablets are able to communicate at practically any time and anywhere they want. Among other things, they can access their e-mails, comment on weblogs, watch and post comments on videos, and make phone calls or send text messages almost ubiquitously. Given this scenario, in this paper we tackle a fundamental aspect of this new era of communication: how do the time intervals between communication events behave for different technologies and means of communication? Are there universal patterns for the inter-event time distribution (IED)? In which ways do inter-event times behave differently among particular technologies? To answer these questions, we analyze eight different datasets from real and modern communication data and find four well-defined patterns that are seen in all eight datasets. Moreover, we propose the use of the Self-Feeding Process (SFP) to generate inter-event times between communications. The SFP is an extremely parsimonious point process that requires at most two parameters and is able to generate inter-event times with all the universal properties we observed in the data. We show the potential application of SFP by proposing a framework to generate a synthetic dataset containing realistic communication events of any one of the analyzed means of communication (e.g., phone calls, e-mails, comments on blogs) and an algorithm to detect anomalies.
Submitted 19 March, 2014;
originally announced March 2014.