-
Join the Chat: How Curiosity Sparks Participation in Telegram Groups
Authors:
Giordano Paoletti,
Jussara M. Almeida,
Luca Vassio,
Marcos André Gonçalves,
Marco Mellia
Abstract:
This study delves into the mechanisms that spark user curiosity driving active engagement within public Telegram groups. By analyzing approximately 6 million messages from 29,196 users across 409 groups, we identify and quantify the key factors that stimulate users to actively participate (i.e., send messages) in group discussions. These factors include social influence, novelty, complexity, uncertainty, and conflict, all measured through metrics derived from message sequences and user participation over time. After clustering the messages, we apply explainability techniques to assign meaningful labels to the clusters. This approach uncovers macro categories representing distinct curiosity stimulation profiles, each characterized by a unique combination of various stimuli. Social influence from peers and influencers drives engagement for some users, while for others, rare media types or a diverse range of senders and media sparks curiosity. Analyzing patterns, we found that user curiosity stimuli are mostly stable, but, as the time between the initial message increases, curiosity occasionally shifts. A graph-based analysis of influence networks reveals that users motivated by direct social influence tend to occupy more peripheral positions, while those who are not stimulated by any specific factors are often more central, potentially acting as initiators and conversation catalysts. These findings contribute to understanding information dissemination and spread processes on social media networks, potentially contributing to more effective communication strategies.
Submitted 17 March, 2025;
originally announced March 2025.
-
Topic-wise Exploration of the Telegram Group-verse
Authors:
Alessandro Perlo,
Giordano Paoletti,
Nikhil Jha,
Luca Vassio,
Jussara Almeida,
Marco Mellia
Abstract:
Although Telegram is currently one of the most popular instant messaging apps in the world, previous studies have mainly focused on analysing discussions on specific angles and topics. In this paper, we present a broad analysis of publicly accessible groups that cover a wide range of discussions, including Education, Erotic, Politics, and Cryptocurrencies. How do people interact with different topic groups? Is there any common or peculiar behaviour? We engineer and offer an open-source tool to automate the collection of messages from Telegram groups, a non-straightforward problem. We use it to collect more than 51 million messages from 669 groups. Here, we present a first-of-its-kind, per-topic analysis, contrasting the users' activity patterns from different angles -- the language, the presence of bots, the type and volume of shared media content, links to external platforms, etc. Our results confirm some anecdotal evidence, e.g., indications of spamming behaviour, and unveil some unexpected findings, e.g., the different sharing patterns of video and message length in groups of different topics. Our research provides a horizontal analysis of public groups in Telegram across various general topics, establishing a foundation for future studies that can delve deeper into user interactions and content dynamics within this unique messaging environment.
Submitted 17 March, 2025; v1 submitted 4 September, 2024;
originally announced September 2024.
-
Dynamic Cluster Analysis to Detect and Track Novelty in Network Telescopes
Authors:
Kai Huang,
Luca Gioacchini,
Marco Mellia,
Luca Vassio
Abstract:
In the context of cybersecurity, tracking the activities of coordinated hosts over time is a daunting task because both participants and their behaviours evolve at a fast pace. We address this scenario by solving a dynamic novelty discovery problem with the aim of both re-identifying patterns seen in the past and highlighting new patterns. We focus on traffic collected by Network Telescopes, a primary and noisy source for cybersecurity analysis. We propose a 3-stage pipeline: (i) we learn compact representations (embeddings) of hosts through their traffic in a self-supervised fashion; (ii) via clustering, we distinguish groups of hosts performing similar activities; (iii) we track the cluster temporal evolution to highlight novel patterns. We apply our methodology to 20 days of telescope traffic during which we observe more than 8 thousand active hosts. Our results show that we efficiently identify 50-70 well-shaped clusters per day, 60-70% of which we associate with already analysed cases, while we pinpoint 10-20 previously unseen clusters per day. These correspond to activity changes and new incidents, of which we document some. In short, our novelty discovery methodology enormously simplifies the manual analysis the security analysts have to conduct to gain insights to interpret novel coordinated activities.
Submitted 10 February, 2025; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks
Authors:
Giordano Paoletti,
Luca Gioacchini,
Marco Mellia,
Luca Vassio,
Jussara M. Almeida
Abstract:
In dynamic complex networks, entities interact and form network communities that evolve over time. Among the many static Community Detection (CD) solutions, the modularity-based Louvain, or Greedy Modularity Algorithm (GMA), is widely employed in real-world applications due to its intuitiveness and scalability. Nevertheless, addressing CD in dynamic graphs remains an open problem, since the evolution of the network connections may poison the identification of communities, which may be evolving at a slower pace. Hence, naively applying GMA to successive network snapshots may lead to temporal inconsistencies in the communities. Two evolutionary adaptations of GMA, sGMA and $α$GMA, have been proposed to tackle this problem. Yet, evaluating the performance of these methods and understanding to which scenarios each one is better suited is challenging because of the lack of a comprehensive set of metrics and a consistent ground truth. To address these challenges, we propose (i) a benchmarking framework for evolutionary CD algorithms in dynamic networks and (ii) a generalised modularity-based approach (NeGMA). Our framework allows us to generate synthetic community-structured graphs and design evolving scenarios with nine basic graph transformations occurring at different rates. We evaluate performance through three metrics we define, i.e. Correctness, Delay, and Stability. Our findings reveal that $α$GMA is well-suited for detecting intermittent transformations, but struggles with abrupt changes; sGMA achieves superior stability, but fails to detect emerging communities; and NeGMA appears a well-balanced solution, excelling in responsiveness and instantaneous transformations detection.
Submitted 11 January, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
LogPrécis: Unleashing Language Models for Automated Malicious Log Analysis
Authors:
Matteo Boffa,
Rodolfo Vieira Valentim,
Luca Vassio,
Danilo Giordano,
Idilio Drago,
Marco Mellia,
Zied Ben Houidi
Abstract:
The collection of security-related logs holds the key to understanding attack behaviors and diagnosing vulnerabilities. Still, their analysis remains a daunting challenge. Recently, Language Models (LMs) have demonstrated unmatched potential in understanding natural and programming languages. The question arises whether and how LMs could be also useful for security experts since their logs contain intrinsically confused and obfuscated information. In this paper, we systematically study how to benefit from the state-of-the-art in LM to automatically analyze text-like Unix shell attack logs. We present a thorough design methodology that leads to LogPrécis. It receives as input raw shell sessions and automatically identifies and assigns the attacker tactic to each portion of the session, i.e., unveiling the sequence of the attacker's goals. We demonstrate LogPrécis capability to support the analysis of two large datasets containing about 400,000 unique Unix shell attacks. LogPrécis reduces them into about 3,000 fingerprints, each grouping sessions with the same sequence of tactics. The abstraction it provides lets the analyst better understand attacks, identify fingerprints, detect novelty, link similar attacks, and track families and mutations. Overall, LogPrécis, released as open source, paves the way for better and more responsive defense against cyberattacks.
Submitted 22 March, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
GCNH: A Simple Method For Representation Learning On Heterophilous Graphs
Authors:
Andrea Cavallo,
Claas Grohnfeldt,
Michele Russo,
Giulio Lovisotto,
Luca Vassio
Abstract:
Graph Neural Networks (GNNs) are well-suited for learning on homophilous graphs, i.e., graphs in which edges tend to connect nodes of the same type. Yet, achievement of consistent GNN performance on heterophilous graphs remains an open research problem. Recent works have proposed extensions to standard GNN architectures to improve performance on heterophilous graphs, trading off model simplicity for prediction accuracy. However, these models fail to capture basic graph properties, such as neighborhood label distribution, which are fundamental for learning. In this work, we propose GCN for Heterophily (GCNH), a simple yet effective GNN architecture applicable to both heterophilous and homophilous scenarios. GCNH learns and combines separate representations for a node and its neighbors, using one learned importance coefficient per layer to balance the contributions of center nodes and neighborhoods. We conduct extensive experiments on eight real-world graphs and a set of synthetic graphs with varying degrees of heterophily to demonstrate how the design choices for GCNH lead to a sizable improvement over a vanilla GCN. Moreover, GCNH outperforms state-of-the-art models of much higher complexity on four out of eight benchmarks, while producing comparable results on the remaining datasets. Finally, we discuss and analyze the lower complexity of GCNH, which results in fewer trainable parameters and faster training times than other methods, and show how GCNH mitigates the oversmoothing problem.
Submitted 21 April, 2023;
originally announced April 2023.
-
Recommendation Systems in Libraries: an Application with Heterogeneous Data Sources
Authors:
Alessandro Speciale,
Greta Vallero,
Luca Vassio,
Marco Mellia
Abstract:
The Reading&Machine project exploits the support of digitalization to increase the attractiveness of libraries and improve the users' experience. The project implements an application that helps the users in their decision-making process, providing recommendation system (RecSys)-generated lists of books the users might be interested in, and showing them through an interactive Virtual Reality (VR)-based Graphical User Interface (GUI). In this paper, we focus on the design and testing of the recommendation system, employing data about all users' loans over the past 9 years from the network of libraries located in Turin, Italy. In addition, we use data collected by the Anobii online social community of readers, who share their feedback and additional information about books they read. Armed with this heterogeneous data, we build and evaluate Content Based (CB) and Collaborative Filtering (CF) approaches. Our results show that the CF outperforms the CB approach, improving by up to 47% the relevant recommendations provided to a reader. However, the performance of the CB approach is heavily dependent on the number of books the reader has already read, and it can work even better than CF for users with a large history. Finally, our evaluations highlight that the performances of both approaches are significantly improved if the system integrates and leverages the information from the Anobii dataset, which allows us to include more user readings (for CF) and richer book metadata (for CB).
Submitted 21 March, 2023;
originally announced March 2023.
-
Modeling communication asymmetry and content personalization in online social networks
Authors:
Franco Galante,
Luca Vassio,
Michele Garetto,
Emilio Leonardi
Abstract:
The increasing popularity of online social networks (OSNs) has attracted growing interest in modeling social interactions. On online social platforms, a few individuals, commonly referred to as influencers, produce the majority of content consumed by users and hegemonize the landscape of the social debate. However, classical opinion models do not capture this communication asymmetry. We develop an opinion model inspired by observations on social media platforms with two main objectives: first, to describe this inherent communication asymmetry in OSNs, and second, to model the effects of content personalization. We derive a Fokker-Planck equation for the temporal evolution of users' opinion distribution and analytically characterize the stationary system behavior. Analytical results, confirmed by Monte-Carlo simulations, show how strict forms of content personalization tend to radicalize user opinion, leading to the emergence of echo chambers, and favor structurally advantaged influencers. As an example application, we apply our model to Facebook data during the Italian government crisis in the summer of 2019. Our work provides a flexible framework to evaluate the impact of content personalization on the opinion formation process, focusing on the interaction between influential individuals and regular users. This framework is interesting in the context of marketing and advertising, misinformation spreading, politics and activism.
Submitted 18 July, 2023; v1 submitted 4 January, 2023;
originally announced January 2023.
-
2-hop Neighbor Class Similarity (2NCS): A graph structural metric indicative of graph neural network performance
Authors:
Andrea Cavallo,
Claas Grohnfeldt,
Michele Russo,
Giulio Lovisotto,
Luca Vassio
Abstract:
Graph Neural Networks (GNNs) achieve state-of-the-art performance on graph-structured data across numerous domains. Their underlying ability to represent nodes as summaries of their vicinities has proven effective for homophilous graphs in particular, in which same-type nodes tend to connect. On heterophilous graphs, in which different-type nodes are likely connected, GNNs perform less consistently, as neighborhood information might be less representative or even misleading. On the other hand, GNN performance is not inferior on all heterophilous graphs, and there is a lack of understanding of what other graph properties affect GNN performance.
In this work, we highlight the limitations of the widely used homophily ratio and the recent Cross-Class Neighborhood Similarity (CCNS) metric in estimating GNN performance. To overcome these limitations, we introduce 2-hop Neighbor Class Similarity (2NCS), a new quantitative graph structural property that correlates with GNN performance more strongly and consistently than alternative metrics. 2NCS considers two-hop neighborhoods as a theoretically derived consequence of the two-step label propagation process governing GCN's training-inference process. Experiments on one synthetic and eight real-world graph datasets confirm consistent improvements over existing metrics in estimating the accuracy of GCN- and GAT-based architectures on the node classification task.
Submitted 26 December, 2022;
originally announced December 2022.
-
User Value in Modern Payment Platforms: A Graph Approach
Authors:
Laura Arditti,
Martino Trevisan,
Luca Vassio,
Alberto De Lazzari,
Alberto Danese
Abstract:
Payment platforms have significantly evolved in recent years to keep pace with the proliferation of online and cashless payments. These platforms are increasingly aligned with online social networks, allowing users to interact with each other and transfer small amounts of money in a Peer-to-Peer fashion. This poses new challenges for analysing payment data, as traditional methods are only user-centric or business-centric and neglect the network users build during the interaction. This paper proposes a first methodology for measuring user value in modern payment platforms. We combine quantitative user-centric metrics with an analysis of the graph created by users' activities and its topological features inspired by the evolution of opinions in social networks. We showcase our approach using a dataset from a large operational payment platform and show how it can support business decisions and marketing campaign design, e.g., by targeting specific users.
Submitted 20 October, 2022;
originally announced October 2022.
-
On the Dynamics of Political Discussions on Instagram: A Network Perspective
Authors:
Carlos H. G. Ferreira,
Fabricio Murai,
Ana P. C. Silva,
Jussara M. Almeida,
Martino Trevisan,
Luca Vassio,
Marco Mellia,
Idilio Drago
Abstract:
Instagram has been increasingly used as a source of information especially among the youth. As a result, political figures now leverage the platform to spread opinions and political agenda. We here analyze online discussions on Instagram, notably in political topics, from a network perspective. Specifically, we investigate the emergence of communities of co-commenters, that is, groups of users who often interact by commenting on the same posts and may be driving the ongoing online discussions. In particular, we are interested in salient co-interactions, i.e., interactions of co-commenters that occur more often than expected by chance and under independent behavior. Unlike casual and accidental co-interactions which normally happen in large volumes, salient co-interactions are key elements driving the online discussions and, ultimately, the information dissemination. We base our study on the analysis of 10 weeks of data centered around major elections in Brazil and Italy, following both politicians and other celebrities. We extract and characterize the communities of co-commenters in terms of topological structure, properties of the discussions carried out by community members, and how some community properties, notably community membership and topics, evolve over time. We show that communities discussing political topics tend to be more engaged in the debate by writing longer comments, using more emojis, hashtags and negative words than in other subjects. Also, communities built around political discussions tend to be more dynamic, although top commenters remain active and preserve community membership over time. Moreover, we observe a great diversity in discussed topics over time: whereas some topics attract attention only momentarily, others, centered around more fundamental political discussions, remain consistently active over time.
Submitted 13 September, 2022; v1 submitted 19 September, 2021;
originally announced September 2021.
-
The Internet with Privacy Policies: Measuring The Web Upon Consent
Authors:
Nikhil Jha,
Martino Trevisan,
Luca Vassio,
Marco Mellia
Abstract:
To protect users' privacy, legislators have regulated the usage of tracking technologies, mandating the acquisition of users' consent before collecting data. Consequently, websites started showing more and more consent management modules -- i.e., Privacy Banners -- the visitors have to interact with to access the website content. They challenge the automatic collection of Web measurements, primarily to monitor the extensiveness of tracking technologies but also to measure Web performance in the wild. Privacy Banners in fact limit crawlers from observing the actual website content. In this paper, we present a thorough measurement campaign focusing on popular websites in Europe and the US, visiting both landing and internal pages from different countries around the world. We engineer Priv-Accept, a Web crawler able to accept the privacy policies, as most users would do in practice. This lets us compare how webpages change before and after acceptance. Our results show that all measurements performed without dealing with the Privacy Banners offer a very biased and partial view of the Web. After accepting the privacy policies, we observe an increase of up to 70 trackers, which in turn slows down the webpage load time by a factor of 2x-3x.
Submitted 13 September, 2022; v1 submitted 1 September, 2021;
originally announced September 2021.
-
z-anonymity: Zero-Delay Anonymization for Data Streams
Authors:
Nikhil Jha,
Thomas Favale,
Luca Vassio,
Martino Trevisan,
Marco Mellia
Abstract:
With the advent of big data and the birth of the data markets that sell personal information, individuals' privacy is of utmost importance. The classical response is anonymization, i.e., sanitizing the information that can directly or indirectly allow users' re-identification. The most popular solution in the literature is k-anonymity. However, it is hard to achieve k-anonymity on a continuous stream of data, as well as when the number of dimensions becomes high. In this paper, we propose a novel anonymization property called z-anonymity. Unlike k-anonymity, it can be achieved with zero delay on data streams and it is well suited for high-dimensional data. The idea at the base of z-anonymity is to release an attribute (an atomic piece of information) about a user only if at least z - 1 other users have presented the same attribute in a past time window. z-anonymity is weaker than k-anonymity since it does not work on the combinations of attributes, but treats them individually. In this paper, we present a probabilistic framework to map z-anonymity into the k-anonymity property. Our results show that a proper choice of the z-anonymity parameters allows the data curator to likely obtain a k-anonymized dataset, with a precisely measurable probability. We also evaluate a real use case, in which we consider the website visits of a population of users and show that z-anonymity can work in practice for obtaining k-anonymity too.
Submitted 14 June, 2021;
originally announced June 2021.
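The release rule stated in the abstract above can be sketched in a few lines. This is only an illustrative reading of that one sentence, not the authors' implementation: the class name `ZAnonymizer`, the `observe` method, the sliding-window bookkeeping, and the assumption that observations arrive in non-decreasing time order are all hypothetical choices made here.

```python
from collections import defaultdict, deque


class ZAnonymizer:
    """Illustrative sketch: release an attribute only if at least z - 1
    OTHER users presented it within the past `window` time units."""

    def __init__(self, z, window):
        self.z = z
        self.window = window
        # attribute -> deque of (timestamp, user) observations, oldest first
        self.seen = defaultdict(deque)

    def observe(self, t, user, attribute):
        """Record (user, attribute) at time t and decide, with no delay,
        whether the observation may be released (True) or not (False).
        Assumes calls arrive with non-decreasing t."""
        obs = self.seen[attribute]
        # Forget observations that have fallen out of the time window.
        while obs and obs[0][0] <= t - self.window:
            obs.popleft()
        obs.append((t, user))
        # Count distinct users other than the current one in the window.
        others = {u for _, u in obs if u != user}
        return len(others) >= self.z - 1


anonymizer = ZAnonymizer(z=3, window=10)
decisions = [
    anonymizer.observe(0, "alice", "example.org"),
    anonymizer.observe(1, "bob", "example.org"),
    anonymizer.observe(2, "carol", "example.org"),
    anonymizer.observe(20, "alice", "example.org"),
]
print(decisions)
```

Each attribute is handled independently, which mirrors the abstract's remark that z-anonymity treats attributes individually rather than in combination.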
-
Debate on Online Social Networks at the Time of COVID-19: An Italian Case Study
Authors:
Martino Trevisan,
Luca Vassio,
Danilo Giordano
Abstract:
The COVID-19 pandemic is not only having a heavy impact on healthcare but also changing people's habits and the society we live in. Countries such as Italy have enforced a total lockdown lasting several months, with most of the population forced to remain at home. During this time, online social networks, more than ever, have represented an alternative solution for social life, allowing users to interact and debate with each other. Hence, it is of paramount importance to understand the changing use of social networks brought about by the pandemic. In this paper, we analyze how the interaction patterns around popular influencers in Italy changed during the first six months of 2020, within Instagram and Facebook social networks. We collected a large dataset for this group of public figures, including more than 54 million comments on over 140 thousand posts for these months. We analyze and compare engagement on the posts of these influencers and provide quantitative figures for aggregated user activity. We further show the changes in the patterns of usage before and during the lockdown, which demonstrate a growth of activity and sizable daily and weekly variations. We also analyze the user sentiment through the psycholinguistic properties of comments, and the results attest to the rapid boom and disappearance of topics related to the pandemic. To support further analyses, we release the anonymized dataset.
Submitted 2 June, 2021;
originally announced June 2021.
-
RL-IoT: Reinforcement Learning to Interact with IoT Devices
Authors:
Giulia Milan,
Luca Vassio,
Idilio Drago,
Marco Mellia
Abstract:
Our lives are increasingly filled with Internet of Things (IoT) devices. These devices often rely on closed or poorly documented protocols, with unknown formats and semantics. Learning how to interact with such devices in an autonomous manner is the key for interoperability and automatic verification of their capabilities. In this paper, we propose RL-IoT, a system that explores how to automatically interact with possibly unknown IoT devices. We leverage reinforcement learning (RL) to recover the semantics of protocol messages and to take control of the device to reach a given goal, while minimizing the number of interactions. We assume knowledge of only a database of possible IoT protocol messages, whose semantics are however unknown. RL-IoT exchanges messages with the target IoT device, learning those commands that are useful to reach the given goal. Our results show that RL-IoT is able to solve both simple and complex tasks. With properly tuned parameters, RL-IoT learns how to perform actions with the target device, a Yeelight smart bulb in our case study, completing non-trivial patterns with as few as 400 interactions. RL-IoT paves the road for automatic interactions with poorly documented IoT protocols, thus enabling interoperable systems.
Submitted 10 September, 2021; v1 submitted 3 May, 2021;
originally announced May 2021.
-
A network analysis on cloud gaming: Stadia, GeForce Now and PSNow
Authors:
Andrea Di Domenico,
Gianluca Perna,
Martino Trevisan,
Luca Vassio,
Danilo Giordano
Abstract:
Cloud gaming is a new class of services that promises to revolutionize the videogame market. It allows the user to play a videogame with basic equipment while using a remote server for the actual execution. The multimedia content is streamed through the network from the server to the user. This service requires low latency and a large bandwidth to work properly with low response time and high-definition video. Three leading tech companies (Google, Sony and NVIDIA) have entered this market with their own products, and others, like Microsoft and Amazon, are planning to launch their own platforms in the near future. However, these companies have so far released little information about their cloud gaming operations and how they utilize the network. In this work, we study these new cloud gaming services from the network point of view. We collect more than 200 packet traces under different application settings and network conditions for 3 cloud gaming services, namely Stadia from Google, GeForce Now from NVIDIA and PS Now from Sony. We analyze the employed protocols and the workload they impose on the network. We find that GeForce Now and Stadia use the RTP protocol to stream the multimedia content, with the latter relying on the standard WebRTC APIs. Both are bandwidth-hungry and consume up to 45 Mbit/s, depending on the network and video quality. PS Now instead uses only undocumented protocols and never exceeds 13 Mbit/s.
Submitted 21 October, 2021; v1 submitted 12 December, 2020;
originally announced December 2020.
-
Characterizing web pornography consumption from passive measurements
Authors:
Andrea Morichetta,
Martino Trevisan,
Luca Vassio
Abstract:
Web pornography represents a large fraction of Internet traffic, with thousands of websites and millions of users. Studying web pornography consumption helps in understanding human behavior and is crucial for medical and psychological research. However, given the lack of public data, these works typically build on surveys, which are limited by different factors, e.g., unreliable answers that volunteers may (involuntarily) provide. In this work, we collect anonymized accesses to pornography websites using HTTP-level passive traces. Our dataset includes about 15 000 broadband subscribers over a period of 3 years. We use it to provide quantitative information about the interactions of users with pornographic websites, focusing on time and frequency of use, habits, and trends. We distribute our anonymized dataset to the community to ease reproducibility and allow further studies.
Submitted 4 May, 2021; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Towards Understanding Political Interactions on Instagram
Authors:
Martino Trevisan,
Luca Vassio,
Idilio Drago,
Marco Mellia,
Fabricio Murai,
Flavio Figueiredo,
Ana Paula Couto da Silva,
Jussara M. Almeida
Abstract:
Online Social Networks (OSNs) allow personalities and companies to communicate directly with the public, bypassing the filters of traditional media. As people rely on OSNs to stay up-to-date, the political debate has moved online too. We witness the sudden explosion of harsh political debates and the dissemination of rumours in OSNs. Identifying such behaviour requires a deep understanding of how people interact via OSNs during political debates. We present a preliminary study of interactions in a popular OSN, namely Instagram. We take Italy as a case study in the period before the 2019 European Elections. We observe the activity of top Italian Instagram profiles in different categories: politics, music, sport and show. We record their posts for more than two months, tracking "likes" and comments from users. Results suggest that profiles of politicians attract markedly different interactions than other categories. People tend to comment more, with longer comments, debating for a longer time, with a large number of replies, most of which are not explicitly solicited. Moreover, comments tend to come from a small group of very active users. Finally, we witness substantial differences when comparing profiles of different parties.
Submitted 4 May, 2021; v1 submitted 26 April, 2019;
originally announced April 2019.
-
You, the Web and Your Device: Longitudinal Characterization of Browsing Habits
Authors:
Luca Vassio,
Idilio Drago,
Marco Mellia,
Zied Ben Houidi,
Mohamed Lamine Lamali
Abstract:
Understanding how people interact with the web is key for a variety of applications, e.g., from the design of effective web pages to the definition of successful online marketing campaigns. Browsing behavior has been traditionally represented and studied by means of clickstreams, i.e., graphs whose vertices are web pages, and edges are the paths followed by users. Obtaining large and representative data to extract clickstreams is however challenging. The evolution of the web raises the question of whether browsing behavior is changing and, as a consequence, whether properties of clickstreams are changing. This paper presents a longitudinal study of clickstreams from 2013 to 2016. We evaluate an anonymized dataset of HTTP traces captured in a large ISP, where thousands of households are connected. We first propose a methodology to identify actual URLs requested by users from the massive set of requests automatically fired by browsers when rendering web pages. Then, we characterize web usage patterns and clickstreams, taking into account both the temporal evolution and the impact of the device used to explore the web. Our analyses precisely quantify various aspects of clickstreams and uncover interesting patterns, such as the typical short paths followed by people while navigating the web, the fast-increasing trend in browsing from mobile devices and the different roles of search engines and social networks in promoting content. Finally, we contribute a dataset of anonymized clickstreams to the community to foster new studies (anonymized clickstreams are available to the public at http://bigdata.polito.it/clickstream).
Submitted 4 May, 2021; v1 submitted 19 June, 2018;
originally announced June 2018.
-
The Exploitation of Web Navigation Data: Ethical Issues and Alternative Scenarios
Authors:
Luca Vassio,
Hassan Metwalley,
Danilo Giordano
Abstract:
Nowadays, users' browsing activity on the Internet is not completely private, because many entities collect and use such data, for either legitimate or illegal goals. The implications are serious, from a person who unknowingly exposes private information to an unknown third-party entity, to a company that is unable to control the information it exposes to the outside world. As a result, users have lost control over their private data on the Internet. In this paper, we present the entities involved in users' data collection and usage. Then, we highlight the ethical issues that arise for users, companies, scientists and governments. Finally, we present some alternative scenarios and suggestions for the entities to address such ethical issues.
Submitted 2 May, 2016; v1 submitted 10 December, 2015;
originally announced December 2015.
-
A hybrid swarm-based algorithm for single-objective optimization problems involving high-cost analyses
Authors:
Enrico Ampellio,
Luca Vassio
Abstract:
In many technical fields, single-objective optimization procedures in continuous domains involve expensive numerical simulations. In this context, an improvement of the Artificial Bee Colony (ABC) algorithm, called the Artificial super-Bee enhanced Colony (AsBeC), is presented. AsBeC is designed to provide fast convergence speed, high solution accuracy and robust performance over a wide range of problems. It implements enhancements of the ABC structure and hybridizations with interpolation strategies. The latter are inspired by the quadratic trust region approach for local investigation and by an efficient global optimizer for separable problems. Each modification and their combined effects are studied with appropriate metrics on a numerical benchmark, which is also used for comparing AsBeC with some effective ABC variants and other derivative-free algorithms. In addition, the presented algorithm is validated on two recent benchmarks adopted for competitions in international conferences. Results show remarkable competitiveness and robustness for AsBeC.
△ Less
Submitted 2 May, 2016; v1 submitted 24 February, 2014;
originally announced February 2014.
-
Multidiscipinary Optimization For Gas Turbines Design
Authors:
Francesco Bertini,
Lorenzo Dal Mas,
Luca Vassio,
Enrico Ampellio
Abstract:
State-of-the-art aeronautic Low Pressure gas Turbines (LPTs) are already characterized by high quality standards, thus they offer very narrow margins of improvement. A typical design process starts with a Concept Design (CD) phase, defined using mean-line 1D and other low-order tools, and evolves through a Preliminary Design (PD) phase, which allows a detailed geometric definition. In this framework, multidisciplinary optimization is the only way to properly handle the complicated peculiarities of the design. The authors present different strategies and algorithms that have been implemented, using the PD phase as a realistic design benchmark to illustrate results. The purpose of this work is to describe the optimization techniques, their settings and how to implement them effectively in a multidisciplinary environment. Starting from a basic gradient method and a semi-random second order method, the authors have introduced an Artificial Bee Colony-like optimizer, a multi-objective Genetic Diversity Evolutionary Algorithm [1] and a multi-objective response surface approach based on Artificial Neural Network, parallelizing and customizing them for the gas turbine study. Moreover, speedup and improvement arrangements are embedded in different hybrid strategies with the aim of finding the best solutions for the different kinds of problems that arise in this field.
Submitted 3 February, 2014;
originally announced February 2014.
-
Message passing optimization of Harmonic Influence Centrality
Authors:
Luca Vassio,
Fabio Fagnani,
Paolo Frasca,
Asuman Ozdaglar
Abstract:
This paper proposes a new measure of node centrality in social networks, the Harmonic Influence Centrality, which emerges naturally in the study of social influence over networks. Using an intuitive analogy between social and electrical networks, we introduce a distributed message passing algorithm to compute the Harmonic Influence Centrality of each node. Although its design is based on theoretical results that assume the network has no cycles, the algorithm can also be successfully applied to general graphs.
Submitted 15 January, 2014; v1 submitted 30 September, 2013;
originally announced October 2013.