Search | arXiv e-print repository

Detecting and Mitigating Bias in Algorithms Used to Disseminate Information in Social Networks

Authors: Vedran Sekara, Ivan Dotu, Manuel Cebrian, Esteban Moro, Manuel Garcia-Herranz

Abstract: Social connections are conduits through which individuals communicate, information propagates, and diseases spread. Identifying individuals who are more likely to adopt ideas and spread them is essential in order to develop effective information campaigns, maximize the reach of resources, and fight epidemics. Influence maximization algorithms are used to identify sets of influencers. Based on extensive computer simulations on synthetic and ten diverse real-world social networks we show that seeding information using these methods creates information gaps. Our results show that these algorithms select influencers who do not disseminate information equitably, threatening to create an increasingly unequal society. To overcome this issue we devise a multi-objective algorithm which maximizes influence and information equity. Our results demonstrate it is possible to reduce vulnerability at a relatively low trade-off with respect to spread. This highlights that in our search for maximizing information we do not need to compromise on information equality.

Submitted 30 October, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2404.13127 [pdf, other]

Uncovering large inconsistencies between machine learning derived gridded settlement datasets

Authors: Vedran Sekara, Andrea Martini, Manuel Garcia-Herranz, Do-Hyung Kim

Abstract: High-resolution human settlement maps provide detailed delineations of where people live and are vital for scientific and practical purposes, such as rapid disaster response, allocation of humanitarian resources, and international development. The increased availability of high-resolution satellite imagery, combined with powerful techniques from machine learning and artificial intelligence, has spurred the creation of a wealth of settlement datasets. However, the precise agreement and alignment between these datasets is not known. Here we quantify the overlap of high-resolution settlement maps for 42 African countries developed by Google (Open Buildings), Meta (High Resolution Population Maps) and GRID3 (Geo-Referenced Infrastructure and Demographic Data for Development). Across all studied countries we find large disagreement between datasets on how much area is considered settled. We demonstrate that there are considerable geographic and socio-economic factors at play and build a machine learning model to predict for which areas datasets disagree. It is vital to understand the shortcomings of AI-derived high-resolution settlement layers as international organizations, governments, and NGOs are already experimenting with incorporating these into programmatic work. As such, we anticipate our work to be a starting point for more critical and detailed analyses of AI-derived datasets for humanitarian, planning, policy, and scientific purposes.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 14 pages, 4 figures

arXiv:2312.14692 [pdf, other]

Socioeconomic reorganization of communication and mobility networks in response to external shocks

Authors: Ludovico Napoli, Vedran Sekara, Manuel García-Herranz, Márton Karsai

Abstract: Socioeconomic segregation patterns in networks usually evolve gradually, yet they can change abruptly in response to external shocks. The recent COVID-19 pandemic and the subsequent government policies induced several interruptions in societies, potentially disadvantaging the socioeconomically most vulnerable groups. Using large-scale digital behavioral observations as a natural laboratory, here we analyze how lockdown interventions lead to the reorganization of socioeconomic segregation patterns simultaneously in communication and mobility networks in Sierra Leone. We find that while segregation in mobility clearly increased during lockdown, the social communication network reorganized into a less segregated configuration as compared to reference periods. Moreover, due to differences in adaption capacities, the effects of lockdown policies varied across socioeconomic groups, leading to different or even opposite segregation patterns between the lower and higher socioeconomic classes. Such secondary effects of interventions need to be considered for better and more equitable policies.

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2310.03557 [pdf, other]

Mobility Segregation Dynamics and Residual Isolation During Pandemic Interventions

Authors: Rafiazka Millanida Hilman, Manuel García-Herranz, Vedran Sekara, Márton Karsai

Abstract: External shocks embody an unexpected and disruptive impact on the regular life of people. This was the case during the COVID-19 outbreak that rapidly led to changes in the typical mobility patterns in urban areas. In response, people reorganised their daily errands throughout space. However, these changes might not have been the same across socioeconomic classes, leading to possible additional detrimental effects on inequality due to the pandemic. In this paper we study the reorganisation of mobility segregation networks due to external shocks and show that the diversity of visited places in terms of locations and socioeconomic status is affected by the enforcement of mobility restrictions during the pandemic. We use the case of COVID-19 as a natural experiment in several cities to observe not only the effect of external shocks but also its mid-term consequences and residual effects. We build on anonymised and privacy-preserved mobility data in four cities: Bogota, Jakarta, London, and New York. We couple mobility data with socioeconomic information to capture inequalities in mobility among different socioeconomic groups and see how it changes dynamically before, during, and after different lockdown periods. We find that the first lockdowns induced considerable increases in mobility segregation in each city, while loosening mobility restrictions did not necessarily diminish isolation between different socioeconomic groups, as mobility mixing has not recovered fully to its pre-pandemic level even weeks after the interruption of interventions. Our results suggest that a one-size-fits-all policy does not equally affect the way people adjust their mobility, which calls for socioeconomically informed intervention policies in the future.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 27 pages, 14 figures

arXiv:2307.01891 [pdf, other]

doi 10.1142/S0219525924400022

Are machine learning technologies ready to be used for humanitarian work and development?

Authors: Vedran Sekara, Márton Karsai, Esteban Moro, Dohyung Kim, Enrique Delamonica, Manuel Cebrian, Miguel Luengo-Oroz, Rebeca Moreno Jiménez, Manuel Garcia-Herranz

Abstract: Novel digital data sources and tools like machine learning (ML) and artificial intelligence (AI) have the potential to revolutionize data about development and can contribute to monitoring and mitigating humanitarian problems. The potential of applying novel technologies to solving some of humanity's most pressing issues has garnered interest outside the traditional disciplines studying and working on international development. Today, scientific communities in fields like Computational Social Science, Network Science, Complex Systems, Human Computer Interaction, Machine Learning, and the broader AI field are increasingly starting to pay attention to these pressing issues. However, are sophisticated data driven tools ready to be used for solving real-world problems with imperfect data and of staggering complexity? We outline the current state-of-the-art and identify barriers, which need to be surmounted in order for data-driven technologies to become useful in humanitarian and development contexts. We argue that, without organized and purposeful efforts, these new technologies risk at best falling short of promised goals, at worst they can increase inequality, amplify discrimination, and infringe upon human rights.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 15 pages, 2 figures

arXiv:2112.12521 [pdf, other]

Biases in human mobility data impact epidemic modeling

Authors: Frank Schlosser, Vedran Sekara, Dirk Brockmann, Manuel Garcia-Herranz

Abstract: Large-scale human mobility data is a key resource in data-driven policy making and across many scientific fields. Most recently, mobility data was extensively used during the COVID-19 pandemic to study the effects of governmental policies and to inform epidemic models. Large-scale mobility is often measured using digital tools such as mobile phones. However, it remains an open question how truthfully these digital proxies represent the actual travel behavior of the general population. Here, we examine mobility datasets from multiple countries and identify two fundamentally different types of bias caused by unequal access to, and unequal usage of mobile phones. We introduce the concept of data generation bias, a previously overlooked type of bias, which is present when the amount of data that an individual produces influences their representation in the dataset. We find evidence for data generation bias in all examined datasets in that high-wealth individuals are overrepresented, with the richest 20% contributing over 50% of all recorded trips, substantially skewing the datasets. This inequality is consequential, as we find mobility patterns of different wealth groups to be structurally different, where the mobility networks of high-wealth users are denser and contain more long-range connections. To mitigate the skew, we present a framework to debias data and show how simple techniques can be used to increase representativeness. Using our approach we show how biases can severely impact outcomes of dynamic processes such as epidemic simulations, where biased data incorrectly estimates the severity and speed of disease transmission. Overall, we show that a failure to account for biases can have detrimental effects on the results of studies and urge researchers and practitioners to account for data-fairness in all future studies of human mobility.

Submitted 23 December, 2021; originally announced December 2021.

arXiv:2102.13349 [pdf, other]

Contact Tracing: Computational Bounds, Limitations and Implications

Authors: Quyu Kong, Manuel Garcia-Herranz, Ivan Dotu, Manuel Cebrian

Abstract: Contact tracing has been extensively studied from different perspectives in recent years. However, there is no clear indication of why this intervention has proven effective in some epidemics (SARS) and mostly ineffective in some others (COVID-19). Here, we perform an exhaustive evaluation of random testing and contact tracing on novel superspreading random networks to try to identify which epidemics are more containable with such measures. We also explore the suitability of positive rates as a proxy of the actual infection statuses of the population. Moreover, we propose novel ideal strategies to explore the potential limits of both testing and tracing strategies. Our study counsels caution, both at assuming epidemic containment and at inferring the actual epidemic progress, with current testing or tracing strategies. However, it also brings a ray of light for the future, with the promise of the potential of novel testing strategies that can achieve great effectiveness.

Submitted 26 February, 2021; originally announced February 2021.

arXiv:1909.11190 [pdf, ps, other]

doi 10.1007/978-3-030-12554-7_3

Mobile Phone Data for Children on the Move: Challenges and Opportunities

Authors: Vedran Sekara, Elisa Omodei, Laura Healy, Jan Beise, Claus Hansen, Danzhen You, Saskia Blume, Manuel Garcia-Herranz

Abstract: Today, 95% of the global population has 2G mobile phone coverage and the number of individuals who own a mobile phone is at an all time high. Mobile phones generate rich data on billions of people across different societal contexts and have in the last decade helped redefine how we do research and build tools to understand society. As such, mobile phone data has the potential to revolutionize how we tackle humanitarian problems, such as the many suffered by refugees all over the world. While promising, mobile phone data and the new computational approaches bring both opportunities and challenges. Mobile phone traces contain detailed information regarding people's whereabouts, social life, and even financial standing. Therefore, developing and adopting strategies that open data up to the wider humanitarian and international development community for analysis and research while simultaneously protecting the privacy of individuals is of paramount importance. Here we outline the challenging situation of children on the move and actions UNICEF is pushing in helping displaced children and youth globally, and discuss opportunities where mobile phone data can be used. We identify three key challenges: data access, data and algorithmic bias, and operationalization of research, which need to be addressed if mobile phone data is to be successfully applied in humanitarian contexts.

Submitted 24 September, 2019; originally announced September 2019.

Comments: 13 pages, book chapter

arXiv:1606.06343 [pdf, other]

Twitter as a Source of Global Mobility Patterns for Social Good

Authors: Mark Dredze, Manuel García-Herranz, Alex Rutherford, Gideon Mann

Abstract: Data on human spatial distribution and movement is essential for understanding and analyzing social systems. However, existing sources for this data are lacking in various ways: they are difficult to access, biased, have poor geographical or temporal resolution, or are significantly delayed. In this paper, we describe how geolocation data from Twitter can be used to estimate global mobility patterns and address these shortcomings. These findings will inform how this novel data source can be harnessed to address humanitarian and development efforts.

Submitted 20 June, 2016; originally announced June 2016.

Comments: Presented at 2016 ICML Workshop on #Data4Good: Machine Learning in Social Good Applications, New York, NY

arXiv:1606.04012 [pdf, other]

Inferring Mechanisms for Global Constitutional Progress

Authors: Alex Rutherford, Yonatan Lupu, Manuel Cebrian, Iyad Rahwan, Brad LeVeck, Manuel Garcia-Herranz

Abstract: Constitutions help define domestic political orders, but are known to be influenced by two international mechanisms: one that reflects global temporal trends in legal development, and another that reflects international network dynamics such as shared colonial history. We introduce the provision space: the growing set of all legal provisions existing in the world's constitutions over time. Through this we uncover a third mechanism influencing constitutional change: hierarchical dependencies between legal provisions, under which the adoption of essential, fundamental provisions precedes more advanced provisions. This third mechanism appears to play an especially important role in the emergence of new political rights, and may therefore provide a useful roadmap for advocates of those rights. We further characterise each legal provision in terms of the strength of these mechanisms.

Submitted 13 July, 2017; v1 submitted 13 June, 2016; originally announced June 2016.

arXiv:1509.08368 [pdf, other]

Limits of Friendship Networks in Predicting Epidemic Risk

Authors: Lorenzo Coviello, Massimo Franceschetti, Manuel Garcia-Herranz, Iyad Rahwan

Abstract: The spread of an infection on a real-world social network is determined by the interplay of two processes: the dynamics of the network, whose structure changes over time according to the encounters between individuals, and the dynamics on the network, whose nodes can infect each other after an encounter. Physical encounter is the most common vehicle for the spread of infectious diseases, but detailed information about encounters is often unavailable because it is expensive, impractical to collect, or privacy sensitive. We ask whether the friendship ties between the individuals in a social network successfully predict who is at risk. Using a dataset from a popular online review service, we build a time-varying network that is a proxy of physical encounter between users and a static network based on reported friendship. Through computer simulations, we compare infection processes on the resulting networks and show that, whereas distance on the friendship network is correlated to epidemic risk, friendship provides a poor identification of the individuals at risk if the infection is driven by physical encounter. This limit is not due to the randomness of the infection, but to the structural differences of the two networks. In contrast to the macroscopic similarity between processes spreading on different networks, the differences in local connectivity determined by the two definitions of edges result in striking differences between the dynamics at a microscopic level. Despite the limits highlighted, we show that periodical and relatively infrequent monitoring of the real infection on the encounter network makes it possible to correct the predicted infection on the friendship network and to achieve satisfactory prediction accuracy. In addition, the friendship network contains valuable information to effectively contain epidemic outbreaks when a limited budget is available for immunization.

Submitted 27 October, 2015; v1 submitted 28 September, 2015; originally announced September 2015.

Comments: 74 pages, 28 figures, 12 tables

arXiv:1411.3140 [pdf, other]

doi 10.1371/journal.pone.0128692

Social media fingerprints of unemployment

Authors: Alejandro Llorente, Manuel Garcia-Herranz, Manuel Cebrian, Esteban Moro

Abstract: Recent wide-spread adoption of electronic and pervasive technologies has enabled the study of human behavior at an unprecedented level, uncovering universal patterns underlying human activity, mobility, and inter-personal communication. In the present work, we investigate whether deviations from these universal patterns may reveal information about the socio-economical status of geographical regions. We quantify the extent to which deviations in diurnal rhythm, mobility patterns, and communication styles across regions relate to their unemployment incidence. For this we examine a country-scale publicly articulated social media dataset, where we quantify individual behavioral features from over 145 million geo-located messages distributed among more than 340 different Spanish economic regions, inferred by computing communities of cohesive mobility fluxes. We find that regions exhibiting more diverse mobility fluxes, earlier diurnal rhythms, and more correct grammatical styles display lower unemployment rates. As a result, we provide a simple model able to produce accurate, easily interpretable reconstruction of regional unemployment incidence from their social-media digital fingerprints alone. Our results show that cost-effective economical indicators can be built based on publicly-available social media datasets.

Submitted 19 November, 2014; v1 submitted 12 November, 2014; originally announced November 2014.

Comments: 19 pages (8 main article, 11 Supplementary Information)

arXiv:1211.6512 [pdf]

Using Friends as Sensors to Detect Global-Scale Contagious Outbreaks

Authors: Manuel Garcia-Herranz, Esteban Moro Egido, Manuel Cebrian, Nicholas A. Christakis, James H. Fowler

Abstract: Recent research has focused on the monitoring of global-scale online data for improved detection of epidemics, mood patterns, movements in the stock market, political revolutions, box-office revenues, consumer behaviour and many other important phenomena. However, privacy considerations and the sheer scale of data available online are quickly making global monitoring infeasible, and existing methods do not take full advantage of local network structure to identify key nodes for monitoring. Here, we develop a model of the contagious spread of information in a global-scale, publicly-articulated social network and show that a simple method can yield not just early detection, but advance warning of contagious outbreaks. In this method, we randomly choose a small fraction of nodes in the network and then we randomly choose a "friend" of each node to include in a group for local monitoring. Using six months of data from most of the full Twittersphere, we show that this friend group is more central in the network and it helps us to detect viral outbreaks of the use of novel hashtags about 7 days earlier than we could with an equal-sized randomly chosen group. Moreover, the method actually works better than expected due to network structure alone because highly central actors are both more active and exhibit increased diversity in the information they transmit to others. These results suggest that local monitoring is not just more efficient, it is more effective, and it is possible that other contagious processes in global-scale networks may be similarly monitored.

Submitted 27 November, 2012; originally announced November 2012.

Comments: Press embargo in place until publication
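
Illustrative sketch: the sampling step described in the abstract above (choose random nodes, then a random "friend" of each) might look roughly like the following. This is not the authors' code. It assumes an undirected friendship graph given as an edge list, whereas the paper works with Twitter data whose link semantics may differ, and the names `build_adjacency`, `sample_control_and_sensors`, and `mean_degree` are hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (assumed input format)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def sample_control_and_sensors(adj, fraction, rng):
    """Pick a random control group, then one random friend of each member.

    Returns (control, sensors); both lists have the same length, and
    sensors[i] is a neighbour of control[i].
    """
    nodes = sorted(adj)
    k = max(1, int(fraction * len(nodes)))
    control = rng.sample(nodes, k)
    # Every node in adj has at least one neighbour because adj is built from edges.
    sensors = [rng.choice(sorted(adj[node])) for node in control]
    return control, sensors


def mean_degree(adj, group):
    """Average number of neighbours over a group (repeats counted as sampled)."""
    return sum(len(adj[node]) for node in group) / len(group)


if __name__ == "__main__":
    # Toy star network: node 0 is connected to 50 otherwise isolated nodes.
    toy_edges = [(0, leaf) for leaf in range(1, 51)]
    toy_adj = build_adjacency(toy_edges)
    control, sensors = sample_control_and_sensors(toy_adj, 0.2, random.Random(0))
    print("control mean degree:", mean_degree(toy_adj, control))
    print("sensor mean degree:", mean_degree(toy_adj, sensors))
```

On the toy star network the friend group always has a higher mean degree than the control group, which is the structural effect the abstract relies on; how strongly this holds on real networks depends on their degree distribution.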

Showing 1–13 of 13 results for author: García-Herranz, M