-
The Urban Impact of AI: Modeling Feedback Loops in Next-Venue Recommendation
Authors:
Giovanni Mauro,
Marco Minici,
Luca Pappalardo
Abstract:
Next-venue recommender systems are increasingly embedded in location-based services, shaping individual mobility decisions in urban environments. While their predictive accuracy has been extensively studied, less attention has been paid to their systemic impact on urban dynamics. In this work, we introduce a simulation framework to model the human-AI feedback loop underpinning next-venue recommend…
▽ More
Next-venue recommender systems are increasingly embedded in location-based services, shaping individual mobility decisions in urban environments. While their predictive accuracy has been extensively studied, less attention has been paid to their systemic impact on urban dynamics. In this work, we introduce a simulation framework to model the human-AI feedback loop underpinning next-venue recommendation, capturing how algorithmic suggestions influence individual behavior, which in turn reshapes the data used to retrain the models. Our simulations, grounded in real-world mobility data, systematically explore the effects of algorithmic adoption across a range of recommendation strategies. We find that while recommender systems consistently increase individual-level diversity in visited venues, they may simultaneously amplify collective inequality by concentrating visits on a limited subset of popular places. This divergence extends to the structure of social co-location networks, revealing broader implications for urban accessibility and spatial segregation. Our framework operationalizes the feedback loop in next-venue recommendation and offers a novel lens through which to assess the societal impact of AI-assisted mobility-providing a computational tool to anticipate future risks, evaluate regulatory interventions, and inform the design of ethic algorithmic systems.
△ Less
Submitted 10 April, 2025;
originally announced April 2025.
-
Characterizing User Behavior: The Interplay Between Mobility Patterns and Mobile Traffic
Authors:
Anne Josiane Kouam,
Aline Carneiro Viana,
Mariano G. Beiró,
Leo Ferres,
Luca Pappalardo
Abstract:
Mobile devices have become essential for capturing human activity, and eXtended Data Records (XDRs) offer rich opportunities for detailed user behavior modeling, which is useful for designing personalized digital services. Previous studies have primarily focused on aggregated mobile traffic and mobility analyses, often neglecting individual-level insights. This paper introduces a novel approach th…
▽ More
Mobile devices have become essential for capturing human activity, and eXtended Data Records (XDRs) offer rich opportunities for detailed user behavior modeling, which is useful for designing personalized digital services. Previous studies have primarily focused on aggregated mobile traffic and mobility analyses, often neglecting individual-level insights. This paper introduces a novel approach that explores the dependency between traffic and mobility behaviors at the user level. By analyzing 13 individual features that encompass traffic patterns and various mobility aspects, we enhance the understanding of how these behaviors interact. Our advanced user modeling framework integrates traffic and mobility behaviors over time, allowing for fine-grained dependencies while maintaining population heterogeneity through user-specific signatures. Furthermore, we develop a Markov model that infers traffic behavior from mobility and vice versa, prioritizing significant dependencies while addressing privacy concerns. Using a week-long XDR dataset from 1,337,719 users across several provinces in Chile, we validate our approach, demonstrating its robustness and applicability in accurately inferring user behavior and matching mobility and traffic profiles across diverse urban contexts.
△ Less
Submitted 24 March, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
Characterizing Model Collapse in Large Language Models Using Semantic Networks and Next-Token Probability
Authors:
Daniele Gambetta,
Gizem Gezici,
Fosca Giannotti,
Dino Pedreschi,
Alistair Knott,
Luca Pappalardo
Abstract:
As synthetic content increasingly infiltrates the web, generative AI models may experience an autophagy process, where they are fine-tuned using their own outputs. This autophagy could lead to a phenomenon known as model collapse, which entails a degradation in the performance and diversity of generative AI models over successive generations. Recent studies have explored the emergence of model col…
▽ More
As synthetic content increasingly infiltrates the web, generative AI models may experience an autophagy process, where they are fine-tuned using their own outputs. This autophagy could lead to a phenomenon known as model collapse, which entails a degradation in the performance and diversity of generative AI models over successive generations. Recent studies have explored the emergence of model collapse across various generative AI models and types of data. However, the current characterizations of model collapse tend to be simplistic and lack comprehensive evaluation. In this article, we conduct a thorough investigation of model collapse across three text datasets, utilizing semantic networks to analyze text repetitiveness and diversity, while employing next-token probabilities to quantify the loss of diversity. We also examine how the proportions of synthetic tokens affect the severity of model collapse and perform cross-dataset evaluations to identify domain-specific variations. By proposing metrics and strategies for a more detailed assessment of model collapse, our study provides new insights for the development of robust generative AI systems.
△ Less
Submitted 2 February, 2025; v1 submitted 16 October, 2024;
originally announced October 2024.
-
Geospatial Road Cycling Race Results Data Set
Authors:
Bram Janssens,
Luca Pappalardo,
Jelle De Bock,
Matthias Bogaert,
Steven Verstockt
Abstract:
The field of cycling analytics has only recently started to develop due to limited access to open data sources. Accordingly, research and data sources are very divergent, with large differences in information used across studies. To improve this, and facilitate further research in the field, we propose the publication of a data set which links thousands of professional race results from the period…
▽ More
The field of cycling analytics has only recently started to develop due to limited access to open data sources. Accordingly, research and data sources are very divergent, with large differences in information used across studies. To improve this, and facilitate further research in the field, we propose the publication of a data set which links thousands of professional race results from the period 2017-2023 to detailed geographic information about the courses, an essential aspect in road cycling analytics. Initial use cases are proposed, showcasing the usefulness in linking these two data sources.
△ Less
Submitted 26 September, 2024;
originally announced October 2024.
-
Future Directions in Human Mobility Science
Authors:
Luca Pappalardo,
Ed Manley,
Vedran Sekara,
Laura Alessandretti
Abstract:
We provide a brief review of human mobility science and present three key areas where we expect to see substantial advancements. We start from the mind and discuss the need to better understand how spatial cognition shapes mobility patterns. We then move to societies and argue the importance of better understanding new forms of transportation. We conclude by discussing how algorithms shape mobilit…
▽ More
We provide a brief review of human mobility science and present three key areas where we expect to see substantial advancements. We start from the mind and discuss the need to better understand how spatial cognition shapes mobility patterns. We then move to societies and argue the importance of better understanding new forms of transportation. We conclude by discussing how algorithms shape mobility behaviour and provide useful tools for modellers. Finally, we discuss how progress in these research directions may help us address some of the challenges our society faces today.
△ Less
Submitted 1 August, 2024;
originally announced August 2024.
-
Navigation services amplify concentration of traffic and emissions in our cities
Authors:
Giuliano Cornacchia,
Mirco Nanni,
Dino Pedreschi,
Luca Pappalardo
Abstract:
The proliferation of human-AI ecosystems involving human interaction with algorithms, such as assistants and recommenders, raises concerns about large-scale social behaviour. Despite evidence of such phenomena across several contexts, the collective impact of GPS navigation services remains unclear: while beneficial to the user, they can also cause chaos if too many vehicles are driven through the…
▽ More
The proliferation of human-AI ecosystems involving human interaction with algorithms, such as assistants and recommenders, raises concerns about large-scale social behaviour. Despite evidence of such phenomena across several contexts, the collective impact of GPS navigation services remains unclear: while beneficial to the user, they can also cause chaos if too many vehicles are driven through the same few roads. Our study employs a simulation framework to assess navigation services' influence on road network usage and CO2 emissions. The results demonstrate a universal pattern of amplified conformity: increasing adoption rates of navigation services cause a reduction of route diversity of mobile travellers and increased concentration of traffic and emissions on fewer roads, thus exacerbating an unequal distribution of negative externalities on selected neighbourhoods. Although navigation services recommendations can help reduce CO2 emissions when their adoption rate is low, these benefits diminish or even disappear when the adoption rate is high and exceeds a certain city- and service-dependent threshold. We summarize these discoveries in a non-linear function that connects the marginal increase of conformity with the marginal reduction in CO2 emissions. Our simulation approach addresses the challenges posed by the complexity of transportation systems and the lack of data and algorithmic transparency.
△ Less
Submitted 29 July, 2024;
originally announced July 2024.
-
A survey on the impact of AI-based recommenders on human behaviours: methodologies, outcomes and future directions
Authors:
Luca Pappalardo,
Emanuele Ferragina,
Salvatore Citraro,
Giuliano Cornacchia,
Mirco Nanni,
Giulio Rossetti,
Gizem Gezici,
Fosca Giannotti,
Margherita Lalli,
Daniele Gambetta,
Giovanni Mauro,
Virginia Morini,
Valentina Pansanella,
Dino Pedreschi
Abstract:
Recommendation systems and assistants (in short, recommenders) are ubiquitous in online platforms and influence most actions of our day-to-day lives, suggesting items or providing solutions based on users' preferences or requests. This survey analyses the impact of recommenders in four human-AI ecosystems: social media, online retail, urban mapping and generative AI ecosystems. Its scope is to sys…
▽ More
Recommendation systems and assistants (in short, recommenders) are ubiquitous in online platforms and influence most actions of our day-to-day lives, suggesting items or providing solutions based on users' preferences or requests. This survey analyses the impact of recommenders in four human-AI ecosystems: social media, online retail, urban mapping and generative AI ecosystems. Its scope is to systematise a fast-growing field in which terminologies employed to classify methodologies and outcomes are fragmented and unsystematic. We follow the customary steps of qualitative systematic review, gathering 144 articles from different disciplines to develop a parsimonious taxonomy of: methodologies employed (empirical, simulation, observational, controlled), outcomes observed (concentration, model collapse, diversity, echo chamber, filter bubble, inequality, polarisation, radicalisation, volume), and their level of analysis (individual, item, model, and systemic). We systematically discuss all findings of our survey substantively and methodologically, highlighting also potential avenues for future research. This survey is addressed to scholars and practitioners interested in different human-AI ecosystems, policymakers and institutional stakeholders who want to understand better the measurable outcomes of recommenders, and tech companies who wish to obtain a systematic view of the impact of their recommenders.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
Popularity-based Alternative Routing
Authors:
Giuliano Cornacchia,
Ludovico Lemma,
Luca Pappalardo
Abstract:
Alternative routing is crucial to minimize the environmental impact of urban transportation while enhancing road network efficiency and reducing traffic congestion. Existing methods neglect information about road popularity, possibly leading to unintended consequences such as increasing emissions and congestion. This paper introduces Polaris, an alternative routing algorithm that exploits road pop…
▽ More
Alternative routing is crucial to minimize the environmental impact of urban transportation while enhancing road network efficiency and reducing traffic congestion. Existing methods neglect information about road popularity, possibly leading to unintended consequences such as increasing emissions and congestion. This paper introduces Polaris, an alternative routing algorithm that exploits road popularity to optimize traffic distribution and reduce CO2 emissions. Polaris leverages the novel concept of K-road layers, which mitigates the feedback loop effect where redirecting vehicles to less popular roads could increase their popularity in the future. We conduct experiments in three cities to evaluate Polaris against state-of-the-art alternative routing algorithms. Our results demonstrate that Polaris significantly reduces the overuse of highly popular road edges and traversed regulated intersections, showcasing its ability to generate efficient routes and distribute traffic more evenly. Furthermore, Polaris achieves substantial CO2 reductions, outperforming existing alternative routing strategies. Finally, we compare Polaris to an algorithm that coordinates vehicles centrally to distribute them more evenly on the road network. Our findings reveal that Polaris performs comparably well, even with much less information, highlighting its potential as an efficient and sustainable solution for urban traffic management.
△ Less
Submitted 8 June, 2024;
originally announced June 2024.
-
Mixing Individual and Collective Behaviours to Predict Out-of-Routine Mobility
Authors:
Sebastiano Bontorin,
Simone Centellegher,
Riccardo Gallotti,
Luca Pappalardo,
Bruno Lepri,
Massimiliano Luca
Abstract:
Predicting human displacements is crucial for addressing various societal challenges, including urban design, traffic congestion, epidemic management, and migration dynamics. While predictive models like deep learning and Markov models offer insights into individual mobility, they often struggle with out-of-routine behaviours. Our study introduces an approach that dynamically integrates individual…
▽ More
Predicting human displacements is crucial for addressing various societal challenges, including urban design, traffic congestion, epidemic management, and migration dynamics. While predictive models like deep learning and Markov models offer insights into individual mobility, they often struggle with out-of-routine behaviours. Our study introduces an approach that dynamically integrates individual and collective mobility behaviours, leveraging collective intelligence to enhance prediction accuracy. Evaluating the model on millions of privacy-preserving trajectories across three US cities, we demonstrate its superior performance in predicting out-of-routine mobility, surpassing even advanced deep learning methods. Spatial analysis highlights the model's effectiveness near urban areas with a high density of points of interest, where collective behaviours strongly influence mobility. During disruptive events like the COVID-19 pandemic, our model retains predictive capabilities, unlike individual-based models. By bridging the gap between individual and collective behaviours, our approach offers transparent and accurate predictions, crucial for addressing contemporary mobility challenges.
△ Less
Submitted 6 August, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Human-AI Coevolution
Authors:
Dino Pedreschi,
Luca Pappalardo,
Emanuele Ferragina,
Ricardo Baeza-Yates,
Albert-Laszlo Barabasi,
Frank Dignum,
Virginia Dignum,
Tina Eliassi-Rad,
Fosca Giannotti,
Janos Kertesz,
Alistair Knott,
Yannis Ioannidis,
Paul Lukowicz,
Andrea Passarella,
Alex Sandy Pentland,
John Shawe-Taylor,
Alessandro Vespignani
Abstract:
Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices on online pla…
▽ More
Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices on online platforms. The interaction between users and AI results in a potentially endless feedback loop, wherein users' choices generate data to train AI models, which, in turn, shape subsequent user preferences. This human-AI feedback loop has peculiar characteristics compared to traditional human-machine interaction and gives rise to complex and often ``unintended'' social outcomes. This paper introduces Coevolution AI as the cornerstone for a new field of study at the intersection between AI and complexity science focused on the theoretical, empirical, and mathematical investigation of the human-AI feedback loop. In doing so, we: (i) outline the pros and cons of existing methodologies and highlight shortcomings and potential ways for capturing feedback loop mechanisms; (ii) propose a reflection at the intersection between complexity science, AI and society; (iii) provide real-world examples for different human-AI ecosystems; and (iv) illustrate challenges to the creation of such a field of study, conceptualising them at increasing levels of abstraction, i.e., technical, epistemological, legal and socio-political.
△ Less
Submitted 3 May, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
-
One-Shot Traffic Assignment with Forward-Looking Penalization
Authors:
Giuliano Cornacchia,
Mirco Nanni,
Luca Pappalardo
Abstract:
Traffic assignment (TA) is crucial in optimizing transportation systems and consists in efficiently assigning routes to a collection of trips. Existing TA algorithms often do not adequately consider real-time traffic conditions, resulting in inefficient route assignments. This paper introduces METIS, a cooperative, one-shot TA algorithm that combines alternative routing with edge penalization and…
▽ More
Traffic assignment (TA) is crucial in optimizing transportation systems and consists in efficiently assigning routes to a collection of trips. Existing TA algorithms often do not adequately consider real-time traffic conditions, resulting in inefficient route assignments. This paper introduces METIS, a cooperative, one-shot TA algorithm that combines alternative routing with edge penalization and informed route scoring. We conduct experiments in several cities to evaluate the performance of METIS against state-of-the-art one-shot methods. Compared to the best baseline, METIS significantly reduces CO2 emissions by 18% in Milan, 28\% in Florence, and 46% in Rome, improving trip distribution considerably while still having low computational time. Our study proposes METIS as a promising solution for optimizing TA and urban transportation systems.
△ Less
Submitted 23 June, 2023;
originally announced June 2023.
-
How Routing Strategies Impact Urban Emissions
Authors:
Giuliano Cornacchia,
Matteo Böhm,
Giovanni Mauro,
Mirco Nanni,
Dino Pedreschi,
Luca Pappalardo
Abstract:
Navigation apps use routing algorithms to suggest the best path to reach a user's desired destination. Although undoubtedly useful, navigation apps' impact on the urban environment (e.g., carbon dioxide emissions and population exposure to pollution) is still largely unclear. In this work, we design a simulation framework to assess the impact of routing algorithms on carbon dioxide emissions withi…
▽ More
Navigation apps use routing algorithms to suggest the best path to reach a user's desired destination. Although undoubtedly useful, navigation apps' impact on the urban environment (e.g., carbon dioxide emissions and population exposure to pollution) is still largely unclear. In this work, we design a simulation framework to assess the impact of routing algorithms on carbon dioxide emissions within an urban environment. Using APIs from TomTom and OpenStreetMap, we find that settings in which either all vehicles or none of them follow a navigation app's suggestion lead to the worst impact in terms of CO2 emissions. In contrast, when just a portion (around half) of vehicles follow these suggestions, and some degree of randomness is added to the remaining vehicles' paths, we observe a reduction in the overall CO2 emissions over the road network. Our work is a first step towards designing next-generation routing principles that may increase urban well-being while satisfying individual needs.
△ Less
Submitted 4 July, 2022;
originally announced July 2022.
-
Enhancing crowd flow prediction in various spatial and temporal granularities
Authors:
Marco Cardia,
Massimiliano Luca,
Luca Pappalardo
Abstract:
Thanks to the diffusion of the Internet of Things, nowadays it is possible to sense human mobility almost in real time using unconventional methods (e.g., number of bikes in a bike station). Due to the diffusion of such technologies, the last years have witnessed a significant growth of human mobility studies, motivated by their importance in a wide range of applications, from traffic management t…
▽ More
Thanks to the diffusion of the Internet of Things, nowadays it is possible to sense human mobility almost in real time using unconventional methods (e.g., number of bikes in a bike station). Due to the diffusion of such technologies, the last years have witnessed a significant growth of human mobility studies, motivated by their importance in a wide range of applications, from traffic management to public security and computational epidemiology. A mobility task that is becoming prominent is crowd flow prediction, i.e., forecasting aggregated incoming and outgoing flows in the locations of a geographic region. Although several deep learning approaches have been proposed to solve this problem, their usage is limited to specific types of spatial tessellations and cannot provide sufficient explanations of their predictions. We propose CrowdNet, a solution to crowd flow prediction based on graph convolutional networks. Compared with state-of-the-art solutions, CrowdNet can be used with regions of irregular shapes and provide meaningful explanations of the predicted crowd flows. We conduct experiments on public data varying the spatio-temporal granularity of crowd flows to show the superiority of our model with respect to existing methods, and we investigate CrowdNet's reliability to missing or noisy input data. Our model is a step forward in the design of reliable deep learning models to predict and explain human displacements in urban environments.
△ Less
Submitted 12 March, 2022;
originally announced March 2022.
-
Trajectory Test-Train Overlap in Next-Location Prediction Datasets
Authors:
Massimiliano Luca,
Luca Pappalardo,
Bruno Lepri,
Gianni Barlacchi
Abstract:
Next-location prediction, consisting of forecasting a user's location given their historical trajectories, has important implications in several fields, such as urban planning, geo-marketing, and disease spreading. Several predictors have been proposed in the last few years to address it, including last-generation ones based on deep learning. This paper tests the generalization capability of these…
▽ More
Next-location prediction, consisting of forecasting a user's location given their historical trajectories, has important implications in several fields, such as urban planning, geo-marketing, and disease spreading. Several predictors have been proposed in the last few years to address it, including last-generation ones based on deep learning. This paper tests the generalization capability of these predictors on public mobility datasets, stratifying the datasets by whether the trajectories in the test set also appear fully or partially in the training set. We consistently discover a severe problem of trajectory overlapping in all analyzed datasets, highlighting that predictors memorize trajectories while having limited generalization capacities. We thus propose a methodology to rerank the outputs of the next-location predictors based on spatial mobility patterns. With these techniques, we significantly improve the predictors' generalization capability, with a relative improvement on the accuracy up to 96.15% on the trajectories that cannot be memorized (i.e., low overlap with the training set).
△ Less
Submitted 7 March, 2022;
originally announced March 2022.
-
Generating Synthetic Mobility Networks with Generative Adversarial Networks
Authors:
Giovanni Mauro,
Massimiliano Luca,
Antonio Longa,
Bruno Lepri,
Luca Pappalardo
Abstract:
The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city's entire mobility network, a weighted directed graph in which nodes are geographic locations an…
▽ More
The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city's entire mobility network, a weighted directed graph in which nodes are geographic locations and weighted edges represent people's movements between those locations, thus describing the entire mobility set flows within a city. Our solution is MoGAN, a model based on Generative Adversarial Networks (GANs) to generate realistic mobility networks. We conduct extensive experiments on public datasets of bike and taxi rides to show that MoGAN outperforms the classical Gravity and Radiation models regarding the realism of the generated networks. Our model can be used for data augmentation and performing simulations and what-if analysis.
△ Less
Submitted 14 December, 2022; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Living in a pandemic: adaptation of individual mobility and social activity in the US
Authors:
Lorenzo Lucchini,
Simone Centellegher,
Luca Pappalardo,
Riccardo Gallotti,
Filippo Privitera,
Bruno Lepri,
Marco De Nadai
Abstract:
The non-pharmaceutical interventions (NPIs), aimed at reducing the diffusion of the COVID-19 pandemic, has dramatically influenced our behaviour in everyday life. In this work, we study how individuals adapted their daily movements and person-to-person contact patterns over time in response to the COVID-19 pandemic and the NPIs. We leverage longitudinal GPS mobility data of hundreds of thousands o…
▽ More
The non-pharmaceutical interventions (NPIs), aimed at reducing the diffusion of the COVID-19 pandemic, has dramatically influenced our behaviour in everyday life. In this work, we study how individuals adapted their daily movements and person-to-person contact patterns over time in response to the COVID-19 pandemic and the NPIs. We leverage longitudinal GPS mobility data of hundreds of thousands of anonymous individuals in four US states and empirically show the dramatic disruption in people's life. We find that local interventions did not just impact the number of visits to different venues but also how people experience them. Individuals spend less time in venues, preferring simpler and more predictable routines and reducing person-to-person contact activities. Moreover, we show that the stringency of interventions alone does explain the number and duration of visits to venues: individual patterns of visits seem to be influenced by the local severity of the pandemic and a risk adaptation factor, which increases the people's mobility regardless of the stringency of interventions.
△ Less
Submitted 17 August, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Gross polluters and vehicles' emissions reduction
Authors:
Matteo Böhm,
Mirco Nanni,
Luca Pappalardo
Abstract:
Vehicles' emissions produce a significant share of cities' air pollution, with a substantial impact on the environment and human health. Traditional emission estimation methods use remote sensing stations, missing vehicles' full driving cycle, or focus on a few vehicles. We use GPS traces and a microscopic model to analyse the emissions of four air pollutants from thousands of private vehicles in…
▽ More
Vehicles' emissions produce a significant share of cities' air pollution, with a substantial impact on the environment and human health. Traditional emission estimation methods use remote sensing stations, missing vehicles' full driving cycle, or focus on a few vehicles. We use GPS traces and a microscopic model to analyse the emissions of four air pollutants from thousands of private vehicles in three European cities. We find that the emissions across the vehicles and roads are well approximated by heavy-tailed distributions and thus discover the existence of gross polluters, vehicles responsible for the greatest quantity of emissions, and grossly polluted roads, which suffer the greatest amount of emissions. Our simulations show that emissions reduction policies targeting gross polluters are way more effective than those limiting circulation based on a non-informed choice of vehicles. Our study contributes to shaping the discussion on how to measure emissions with digital data.
△ Less
Submitted 17 March, 2022; v1 submitted 21 April, 2021;
originally announced July 2021.
-
Coach2vec: autoencoding the playing style of soccer coaches
Authors:
Paolo Cintia,
Luca Pappalardo
Abstract:
Capturing the playing style of professional soccer coaches is a complex, and yet barely explored, task in sports analytics. Nowadays, the availability of digital data describing every relevant spatio-temporal aspect of soccer matches, allows for capturing and analyzing the playing style of players, teams, and coaches in an automatic way. In this paper, we present coach2vec, a workflow to capture t…
▽ More
Capturing the playing style of professional soccer coaches is a complex, and yet barely explored, task in sports analytics. Nowadays, the availability of digital data describing every relevant spatio-temporal aspect of soccer matches, allows for capturing and analyzing the playing style of players, teams, and coaches in an automatic way. In this paper, we present coach2vec, a workflow to capture the playing style of professional coaches using match event streams and artificial intelligence. Coach2vec extracts ball possessions from each match, clusters them based on their similarity, and reconstructs the typical ball possessions of coaches. Then, it uses an autoencoder, a type of artificial neural network, to obtain a concise representation (encoding) of the playing style of each coach. Our experiments, conducted on soccer-logs describing the last four seasons of the Italian first division, reveal interesting similarities between prominent coaches, paving the road to the simulation of playing styles and the quantitative comparison of professional coaches.
△ Less
Submitted 29 June, 2021;
originally announced June 2021.
-
Understanding peacefulness through the world news
Authors:
Vasiliki Voukelatou,
Ioanna Miliou,
Fosca Giannotti,
Luca Pappalardo
Abstract:
Peacefulness is a principal dimension of well-being and is the way out of inequity and violence. Thus, its measurement has drawn the attention of researchers, policymakers, and peacekeepers. During the last years, novel digital data streams have drastically changed the research in this field. The current study exploits information extracted from a new digital database called Global Data on Events,…
▽ More
Peacefulness is a principal dimension of well-being and is the way out of inequity and violence. Thus, its measurement has drawn the attention of researchers, policymakers, and peacekeepers. During the last years, novel digital data streams have drastically changed the research in this field. The current study exploits information extracted from a new digital database called Global Data on Events, Location, and Tone (GDELT) to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use explainable AI techniques to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by researchers, policymakers, and peacekeepers, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peacefulness.
△ Less
Submitted 26 October, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
An interactive dashboard for searching and comparing soccer performance scores
Authors:
Paolo Cintia,
Giovanni Mauro,
Luca Pappalardo,
Paolo Ferragina
Abstract:
The performance of soccer players is one of most discussed aspects by many actors in the soccer industry: from supporters to journalists, from coaches to talent scouts. Unfortunately, the dashboards available online provide no effective way to compare the evolution of the performance of players or to find players behaving similarly on the field. This paper describes the design of a web dashboard t…
▽ More
The performance of soccer players is one of most discussed aspects by many actors in the soccer industry: from supporters to journalists, from coaches to talent scouts. Unfortunately, the dashboards available online provide no effective way to compare the evolution of the performance of players or to find players behaving similarly on the field. This paper describes the design of a web dashboard that interacts via APIs with a performance evaluation algorithm and provides graphical tools that allow the user to perform many tasks, such as to search or compare players by age, role or trend of growth in their performance, find similar players based on their pitching behavior, change the algorithm's parameters to obtain customized performance scores. We also describe an example of how a talent scout can interact with the dashboard to find young, promising talents.
△ Less
Submitted 11 May, 2021; v1 submitted 16 April, 2021;
originally announced May 2021.
-
A Survey on Deep Learning for Human Mobility
Authors:
Massimiliano Luca,
Gianni Barlacchi,
Bruno Lepri,
Luca Pappalardo
Abstract:
The study of human mobility is crucial due to its impact on several aspects of our society, such as disease spreading, urban planning, well-being, pollution, and more. The proliferation of digital mobility data, such as phone records, GPS traces, and social media posts, combined with the predictive power of artificial intelligence, triggered the application of deep learning to human mobility. Exis…
▽ More
The study of human mobility is crucial due to its impact on several aspects of our society, such as disease spreading, urban planning, well-being, pollution, and more. The proliferation of digital mobility data, such as phone records, GPS traces, and social media posts, combined with the predictive power of artificial intelligence, triggered the application of deep learning to human mobility. Existing surveys focus on single tasks, data sources, mechanistic or traditional machine learning approaches, while a comprehensive description of deep learning solutions is missing. This survey provides a taxonomy of mobility tasks, a discussion on the challenges related to each task and how deep learning may overcome the limitations of traditional models, a description of the most relevant solutions to the mobility tasks described above and the relevant challenges for the future. Our survey is a guide to the leading deep learning solutions to next-location prediction, crowd flow prediction, trajectory generation, and flow generation. At the same time, it helps deep learning scientists and practitioners understand the fundamental concepts and the open challenges of the study of human mobility.
△ Less
Submitted 27 June, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Deep Gravity: enhancing mobility flows generation with deep neural networks and geographic information
Authors:
Filippo Simini,
Gianni Barlacchi,
Massimiliano Luca,
Luca Pappalardo
Abstract:
The movements of individuals within and among cities influence critical aspects of our society, such as well-being, the spreading of epidemics, and the quality of the environment. When information about mobility flows is not available for a particular region of interest, we must rely on mathematical models to generate them. In this work, we propose the Deep Gravity model, an effective method to ge…
▽ More
The movements of individuals within and among cities influence critical aspects of our society, such as well-being, the spreading of epidemics, and the quality of the environment. When information about mobility flows is not available for a particular region of interest, we must rely on mathematical models to generate them. In this work, we propose the Deep Gravity model, an effective method to generate flow probabilities that exploits many variables (e.g., land use, road network, transport, food, health facilities) extracted from voluntary geographic data, and uses deep neural networks to discover non-linear relationships between those variables and mobility flows. Our experiments, conducted on mobility flows in England, Italy, and New York State, show that Deep Gravity has good geographic generalization capability, achieving a significant increase in performance (especially in densely populated regions of interest) with respect to the classic gravity model and models that do not use deep neural networks or geographic data. We also show how flows generated by Deep Gravity may be explained in terms of the geographic features using explainable AI techniques.
△ Less
Submitted 21 January, 2022; v1 submitted 1 December, 2020;
originally announced December 2020.
-
An individual-level ground truth dataset for home location detection
Authors:
Luca Pappalardo,
Leo Ferres,
Manuel Sacasa,
Ciro Cattuto,
Loreto Bravo
Abstract:
Home detection, assigning a phone device to its home antenna, is a ubiquitous part of most studies in the literature on mobile phone data. Despite its widespread use, home detection relies on a few assumptions that are difficult to check without ground truth, i.e., where the individual that owns the device resides. In this paper, we provide an unprecedented evaluation of the accuracy of home detec…
▽ More
Home detection, assigning a phone device to its home antenna, is a ubiquitous part of most studies in the literature on mobile phone data. Despite its widespread use, home detection relies on a few assumptions that are difficult to check without ground truth, i.e., where the individual that owns the device resides. In this paper, we provide an unprecedented evaluation of the accuracy of home detection algorithms on a group of sixty-five participants for whom we know their exact home address and the antennas that might serve them. Besides, we analyze not only Call Detail Records (CDRs) but also two other mobile phone streams: eXtended Detail Records (XDRs, the ``data'' channel) and Control Plane Records (CPRs, the network stream). These data streams vary not only in their temporal granularity but also they differ in the data generation mechanism', e.g., CDRs are purely human-triggered while CPR is purely machine-triggered events. Finally, we quantify the amount of data that is needed for each stream to carry out successful home detection for each stream. We find that the choice of stream and the algorithm heavily influences home detection, with an hour-of-day algorithm for the XDRs performing the best, and with CPRs performing best for the amount of data needed to perform home detection. Our work is useful for researchers and practitioners in order to minimize data requests and to maximize the accuracy of home antenna location.
△ Less
Submitted 17 October, 2020;
originally announced October 2020.
-
Automatic Pass Annotation from Soccer VideoStreams Based on Object Detection and LSTM
Authors:
Danilo Sorano,
Fabio Carrara,
Paolo Cintia,
Fabrizio Falchi,
Luca Pappalardo
Abstract:
Soccer analytics is attracting increasing interest in academia and industry, thanks to the availability of data that describe all the spatio-temporal events that occur in each match. These events (e.g., passes, shots, fouls) are collected by human operators manually, constituting a considerable cost for data providers in terms of time and economic resources. In this paper, we describe PassNet, a m…
▽ More
Soccer analytics is attracting increasing interest in academia and industry, thanks to the availability of data that describe all the spatio-temporal events that occur in each match. These events (e.g., passes, shots, fouls) are collected by human operators manually, constituting a considerable cost for data providers in terms of time and economic resources. In this paper, we describe PassNet, a method to recognize the most frequent events in soccer, i.e., passes, from video streams. Our model combines a set of artificial neural networks that perform feature extraction from video streams, object detection to identify the positions of the ball and the players, and classification of frame sequences as passes or not passes. We test PassNet on different scenarios, depending on the similarity of conditions to the match used for training. Our results show good classification results and significant improvement in the accuracy of pass detection with respect to baseline classifiers, even when the match's video conditions of the test and training sets are considerably different. PassNet is the first step towards an automated event annotation system that may break the time and the costs for event annotation, enabling data collections for minor and non-professional divisions, youth leagues and, in general, competitions whose matches are not currently annotated by data providers.
△ Less
Submitted 13 July, 2020;
originally announced July 2020.
-
Modelling Human Mobility considering Spatial,Temporal and Social Dimensions
Authors:
Giuliano Cornacchia,
Giulio Rossetti,
Luca Pappalardo
Abstract:
Modelling human mobility is crucial in several areas, from urban planning to epidemic modeling, traffic forecasting, and what-if analysis. On the one hand, existing models focus mainly on reproducing the spatial and temporal dimensions of human mobility, while the social aspect, though it influences human movements significantly, is often neglected. On the other hand, those models that capture som…
▽ More
Modelling human mobility is crucial in several areas, from urban planning to epidemic modeling, traffic forecasting, and what-if analysis. On the one hand, existing models focus mainly on reproducing the spatial and temporal dimensions of human mobility, while the social aspect, though it influences human movements significantly, is often neglected. On the other hand, those models that capture some social aspects of human mobility have trivial and unrealistic spatial and temporal mechanisms. In this paper, we propose STS-EPR, a modeling framework that embeds mechanisms to capture the spatial, temporal, and social aspects together. Our experiments show that STS-EPR outperforms existing spatial-temporal or social models on a set of standard mobility metrics and that it can be used with a limited amount of information without any significant loss of realism. STS-EPR, which is open-source and tested on open data, is a step towards the design of mechanistic models that can capture all the aspects of human mobility in a comprehensive way.
△ Less
Submitted 5 July, 2020;
originally announced July 2020.
-
The relationship between human mobility and viral transmissibility during the COVID-19 epidemics in Italy
Authors:
Paolo Cintia,
Luca Pappalardo,
Salvatore Rinzivillo,
Daniele Fadda,
Tobia Boschi,
Fosca Giannotti,
Francesca Chiaromonte,
Pietro Bonato,
Francesco Fabbri,
Francesco Penone,
Marcello Savarese,
Francesco Calabrese,
Giorgio Guzzetta,
Flavia Riccardo,
Valentina Marziano,
Piero Poletti,
Filippo Trentini,
Antonino Bella,
Xanthi Andrianou,
Martina Del Manso,
Massimo Fabiani,
Stefania Bellino,
Stefano Boros,
Alberto Mateo Urdiales,
Maria Fenicia Vescio
, et al. (7 additional authors not shown)
Abstract:
In 2020, countries affected by the COVID-19 pandemic implemented various non-pharmaceutical interventions to contrast the spread of the virus and its impact on their healthcare systems and economies. Using Italian data at different geographic scales, we investigate the relationship between human mobility, which subsumes many facets of the population's response to the changing situation, and the sp…
▽ More
In 2020, countries affected by the COVID-19 pandemic implemented various non-pharmaceutical interventions to contrast the spread of the virus and its impact on their healthcare systems and economies. Using Italian data at different geographic scales, we investigate the relationship between human mobility, which subsumes many facets of the population's response to the changing situation, and the spread of COVID-19. Leveraging mobile phone data from February through September 2020, we find a striking relationship between the decrease in mobility flows and the net reproduction number. We find that the time needed to switch off mobility and bring the net reproduction number below the critical threshold of 1 is about one week. Moreover, we observe a strong relationship between the number of days spent above such threshold before the lockdown-induced drop in mobility flows and the total number of infections per 100k inhabitants. Estimating the statistical effect of mobility flows on the net reproduction number over time, we document a 2-week lag positive association, strong in March and April, and weaker but still significant in June. Our study demonstrates the value of big mobility data to monitor the epidemic and inform control interventions during its unfolding.
△ Less
Submitted 1 April, 2021; v1 submitted 4 June, 2020;
originally announced June 2020.
-
Mobile phone data analytics against the COVID-19 epidemics in Italy: flow diversity and local job markets during the national lockdown
Authors:
Pietro Bonato,
Paolo Cintia,
Francesco Fabbri,
Daniele Fadda,
Fosca Giannotti,
Pier Luigi Lopalco,
Sara Mazzilli,
Mirco Nanni,
Luca Pappalardo,
Dino Pedreschi,
Francesco Penone,
Salvatore Rinzivillo,
Giulio Rossetti,
Marcello Savarese,
Lara Tavoschi
Abstract:
Understanding collective mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in stand-by to fight the diffusion of the epidemics. In this report, we use mobile phone data to infer the movements of people between Italian provinces and municipalities, and we analyze the incoming, outcoming and internal mobility flows before and during the n…
▽ More
Understanding collective mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in stand-by to fight the diffusion of the epidemics. In this report, we use mobile phone data to infer the movements of people between Italian provinces and municipalities, and we analyze the incoming, outcoming and internal mobility flows before and during the national lockdown (March 9th, 2020) and after the closure of non-necessary productive and economic activities (March 23th, 2020). The population flow across provinces and municipalities enable for the modelling of a risk index tailored for the mobility of each municipality or province. Such an index would be a useful indicator to drive counter-measures in reaction to a sudden reactivation of the epidemics. Mobile phone data, even when aggregated to preserve the privacy of individuals, are a useful data source to track the evolution in time of human mobility, hence allowing for monitoring the effectiveness of control measures such as physical distancing. We address the following analytical questions: How does the mobility structure of a territory change? Do incoming and outcoming flows become more predictable during the lockdown, and what are the differences between weekdays and weekends? Can we detect proper local job markets based on human mobility flows, to eventually shape the borders of a local outbreak?
△ Less
Submitted 23 April, 2020;
originally announced April 2020.
-
Disease State Prediction From Single-Cell Data Using Graph Attention Networks
Authors:
Neal G. Ravindra,
Arijit Sehanobish,
Jenna L. Pappalardo,
David A. Hafler,
David van Dijk
Abstract:
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into both healthy systems and diseases, it has not been used for disease prediction or diagnostics. Graph Attention Networks (GAT) have proven to be versatile for a wide range of tasks by lea…
▽ More
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into both healthy systems and diseases, it has not been used for disease prediction or diagnostics. Graph Attention Networks (GAT) have proven to be versatile for a wide range of tasks by learning from both original features and graph structures. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that can be difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. We achieve 92 % accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network and a random forest classifier. Further, we use the learned graph attention model to get insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allow us to infer a new feature space for the cells that emphasizes the differences between the two conditions. Finally we use the attention weights to learn a new low-dimensional embedding that can be visualized. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.
△ Less
Submitted 12 March, 2020; v1 submitted 14 February, 2020;
originally announced February 2020.
-
Weak nodes detection in urban transport systems: Planning for resilience in Singapore
Authors:
Michele Ferretti,
Gianni Barlacchi,
Luca Pappalardo,
Lorenzo Lucchini,
Bruno Lepri
Abstract:
The availability of massive data-sets describing human mobility offers the possibility to design simulation tools to monitor and improve the resilience of transport systems in response to traumatic events such as natural and man-made disasters (e.g. floods terroristic attacks, etc...). In this perspective, we propose ACHILLES, an application to model people's movements in a given transport system…
▽ More
The availability of massive data-sets describing human mobility offers the possibility to design simulation tools to monitor and improve the resilience of transport systems in response to traumatic events such as natural and man-made disasters (e.g. floods terroristic attacks, etc...). In this perspective, we propose ACHILLES, an application to model people's movements in a given transport system mode through a multiplex network representation based on mobility data. ACHILLES is a web-based application which provides an easy-to-use interface to explore the mobility fluxes and the connectivity of every urban zone in a city, as well as to visualize changes in the transport system resulting from the addition or removal of transport modes, urban zones, and single stops. Notably, our application allows the user to assess the overall resilience of the transport network by identifying its weakest node, i.e. Urban Achilles Heel, with reference to the ancient Greek mythology. To demonstrate the impact of ACHILLES for humanitarian aid we consider its application to a real-world scenario by exploring human mobility in Singapore in response to flood prevention.
△ Less
Submitted 20 September, 2018;
originally announced September 2018.
-
Open the Black Box Data-Driven Explanation of Black Box Decision Systems
Authors:
Dino Pedreschi,
Fosca Giannotti,
Riccardo Guidotti,
Anna Monreale,
Luca Pappalardo,
Salvatore Ruggieri,
Franco Turini
Abstract:
Black box systems for automated decision making, often based on machine learning over (big) data, map a user's features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases hidden in the algorithms, due to human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong…
▽ More
Black box systems for automated decision making, often based on machine learning over (big) data, map a user's features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases hidden in the algorithms, due to human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We introduce the local-to-global framework for black box explanation, a novel approach with promising early results, which paves the road for a wide spectrum of future developments along three dimensions: (i) the language for expressing explanations in terms of highly expressive logic-based rules, with a statistical and causal interpretation; (ii) the inference of local explanations aimed at revealing the logic of the decision adopted for a specific instance by querying and auditing the black box in the vicinity of the target instance; (iii), the bottom-up generalization of the many local explanations into simple global ones, with algorithms that optimize the quality and comprehensibility of explanations.
△ Less
Submitted 26 June, 2018;
originally announced June 2018.
-
PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach
Authors:
Luca Pappalardo,
Paolo Cintia,
Paolo Ferragina,
Emanuele Massucco,
Dino Pedreschi,
Fosca Giannotti
Abstract:
The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this pap…
▽ More
The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this paper, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework by deploying a massive dataset of soccer-logs and consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players' evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by {\sf PlayeRank} and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. At the end, we explore some applications of PlayeRank -- i.e. searching players and player versatility --- showing its flexibility and efficiency, which makes it worth to be used in the design of a scalable platform for soccer analytics.
△ Less
Submitted 25 January, 2019; v1 submitted 14 February, 2018;
originally announced February 2018.
-
Prediction of next career moves from scientific profiles
Authors:
Charlotte James,
Luca Pappalardo,
Alina Sirbu,
Filippo Simini
Abstract:
Changing institution is a scientist's key career decision, which plays an important role in education, scientific productivity, and the generation of scientific knowledge. Yet, our understanding of the factors influencing a relocation decision is very limited. In this paper we investigate how the scientific profile of a scientist determines their decision to move (i.e., change institution). To thi…
▽ More
Changing institution is a scientist's key career decision, which plays an important role in education, scientific productivity, and the generation of scientific knowledge. Yet, our understanding of the factors influencing a relocation decision is very limited. In this paper we investigate how the scientific profile of a scientist determines their decision to move (i.e., change institution). To this aim, we describe a scientist's profile by three main aspects: the scientist's recent scientific career, the quality of their scientific environment and the structure of their scientific collaboration network. We then design and implement a two-stage predictive model: first, we use data mining to predict which researcher will move in the next year on the basis of their scientific profile; second we predict which institution they will choose by using a novel social-gravity model, an adaptation of the traditional gravity model of human mobility. Experiments on a massive dataset of scientific publications show that our approach performs well in both the stages, resulting in a 85% reduction of the prediction error with respect to the state-of-the-art approaches.
△ Less
Submitted 13 February, 2018;
originally announced February 2018.
-
Human Perception of Performance
Authors:
Luca Pappalardo,
Paolo Cintia,
Dino Pedreschi,
Fosca Giannotti,
Albert-Laszlo Barabasi
Abstract:
Humans are routinely asked to evaluate the performance of other individuals, separating success from failure and affecting outcomes from science to education and sports. Yet, in many contexts, the metrics driving the human evaluation process remain unclear. Here we analyse a massive dataset capturing players' evaluations by human judges to explore human perception of performance in soccer, the wor…
▽ More
Humans are routinely asked to evaluate the performance of other individuals, separating success from failure and affecting outcomes from science to education and sports. Yet, in many contexts, the metrics driving the human evaluation process remain unclear. Here we analyse a massive dataset capturing players' evaluations by human judges to explore human perception of performance in soccer, the world's most popular sport. We use machine learning to design an artificial judge which accurately reproduces human evaluation, allowing us to demonstrate how human observers are biased towards diverse contextual features. By investigating the structure of the artificial judge, we uncover the aspects of the players' behavior which attract the attention of human judges, demonstrating that human evaluation is based on a noticeability heuristic where only feature values far from the norm are considered to rate an individual's performance.
△ Less
Submitted 5 December, 2017;
originally announced December 2017.
-
Next Basket Prediction using Recurring Sequential Patterns
Authors:
Riccardo Guidotti,
Giulio Rossetti,
Luca Pappalardo,
Fosca Giannotti,
Dino Pedreschi
Abstract:
Nowadays, a hot challenge for supermarket chains is to offer personalized services for their customers. Next basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable to capture at the same time the different factors influencing the customer's decision process: co-occurrency, se…
▽ More
Nowadays, a hot challenge for supermarket chains is to offer personalized services for their customers. Next basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable to capture at the same time the different factors influencing the customer's decision process: co-occurrency, sequentuality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
△ Less
Submitted 23 February, 2017;
originally announced February 2017.
-
Data-driven generation of spatio-temporal routines in human mobility
Authors:
Luca Pappalardo,
Filippo Simini
Abstract:
The generation of realistic spatio-temporal trajectories of human mobility is of fundamental importance in a wide range of applications, such as the developing of protocols for mobile ad-hoc networks or what-if analysis in urban ecosystems. Current generative algorithms fail in accurately reproducing the individuals' recurrent schedules and at the same time in accounting for the possibility that i…
▽ More
The generation of realistic spatio-temporal trajectories of human mobility is of fundamental importance in a wide range of applications, such as the developing of protocols for mobile ad-hoc networks or what-if analysis in urban ecosystems. Current generative algorithms fail in accurately reproducing the individuals' recurrent schedules and at the same time in accounting for the possibility that individuals may break the routine during periods of variable duration. In this article we present DITRAS (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility. DITRAS operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. We propose a data-driven algorithm which constructs a diary generator from real data, capturing the tendency of individuals to follow or break their routine. We also propose a trajectory generator based on the concept of preferential exploration and preferential return. We instantiate DITRAS with the proposed diary and trajectory generators and compare the resulting algorithm with real data and synthetic data produced by other generative algorithms, built by instantiating DITRAS with several combinations of diary and trajectory generators. We show that the proposed algorithm reproduces the statistical properties of real trajectories in the most accurate way, making a step forward the understanding of the origin of the spatio-temporal patterns of human mobility.
△ Less
Submitted 9 December, 2017; v1 submitted 16 July, 2016;
originally announced July 2016.
-
An analytical framework to nowcast well-being using mobile phone data
Authors:
Luca Pappalardo,
Maarten Vanhoof,
Lorenzo Gabrielli,
Zbigniew Smoreda,
Dino Pedreschi,
Fosca Giannotti
Abstract:
An intriguing open question is whether measurements made on Big Data recording human activities can yield us high-fidelity proxies of socio-economic development and well-being. Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data? In this paper, we design a data-driven analytical framework that uses…
▽ More
An intriguing open question is whether measurements made on Big Data recording human activities can yield us high-fidelity proxies of socio-economic development and well-being. Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data? In this paper, we design a data-driven analytical framework that uses mobility measures and social measures extracted from mobile phone data to estimate indicators for socio-economic development and well-being. We discover that the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits (i) significant correlation with two different socio-economic indicators and (ii) the highest importance in predictive models built to predict the socio-economic indicators. Our analytical framework opens an interesting perspective to study human behavior through the lens of Big Data by means of new statistical indicators that quantify and possibly "nowcast" the well-being and the socio-economic development of a territory.
△ Less
Submitted 16 March, 2016;
originally announced June 2016.