-
Information Retrieval for Climate Impact
Authors:
Maarten de Rijke,
Bart van den Hurk,
Flora Salim,
Alaa Al Khourdajie,
Nan Bai,
Renato Calzone,
Declan Curran,
Getnet Demil,
Lesley Frew,
Noah Gießing,
Mukesh Kumar Gupta,
Maria Heuss,
Sanaa Hobeichi,
David Huard,
Jingwei Kang,
Ana Lucic,
Tanwi Mallick,
Shruti Nath,
Andrew Okem,
Barbara Pernici,
Thilina Rajapakse,
Hira Saleem,
Harry Scells,
Nicole Schneider,
Damiano Spina,
et al. (6 additional authors not shown)
Abstract:
The purpose of the MANILA24 Workshop on information retrieval for climate impact was to bring together researchers from academia, industry, governments, and NGOs to identify and discuss core research problems in information retrieval to assess climate change impacts. The workshop aimed to foster collaboration by bringing communities together that have so far not been very well connected -- information retrieval, natural language processing, systematic reviews, impact assessments, and climate science. The workshop brought together a diverse set of researchers and practitioners interested in contributing to the development of a technical research agenda for information retrieval to assess climate change impacts.
Submitted 1 April, 2025;
originally announced April 2025.
-
Analyzing social media with crowdsourcing in Crowd4SDG
Authors:
Carlo Bono,
Mehmet Oğuz Mülâyim,
Cinzia Cappiello,
Mark Carman,
Jesus Cerquides,
Jose Luis Fernandez-Marquez,
Rosy Mondardini,
Edoardo Ramalli,
Barbara Pernici
Abstract:
Social media have the potential to provide timely information about emergency situations and sudden events. However, finding relevant information among the millions of posts published every day can be difficult, and developing a data analysis project usually requires time and technical skills. This study presents an approach that provides flexible support for analyzing social media, particularly during emergencies. Different use cases in which social media analysis can be adopted are introduced, and the challenges of retrieving information from large sets of posts are discussed.
The focus is on analyzing the images and text contained in social media posts, and on a set of automatic data processing tools for filtering, classification, and geolocation of content, with a human-in-the-loop approach to support the data analyst. Such support includes both feedback and suggestions to configure automated tools, and crowdsourcing to gather inputs from citizens. The results are validated by discussing three case studies developed within the Crowd4SDG H2020 European project.
Submitted 4 August, 2022;
originally announced August 2022.
-
Extracting Large Scale Spatio-Temporal Descriptions from Social Media
Authors:
Carlo Bono,
Barbara Pernici
Abstract:
The ability to track large-scale events as they happen is essential for understanding them and coordinating reactions in an appropriate and timely manner. This is true, for example, in emergency management and decision-making support, where the constraints on both quality and latency of the extracted information can be stringent. In some contexts, real-time and large-scale sensor data and forecasts may be available. We are exploring the hypothesis that this kind of data can be augmented with the ingestion of semi-structured data sources, like social media. Social media can diffuse valuable knowledge, such as direct witness accounts or expert opinions, although their noisy nature makes them nontrivial to manage. This knowledge can be used to complement and confirm other spatio-temporal descriptions of events, highlighting previously unseen or undervalued aspects. The critical aspects of this investigation, such as event sensing, multilingualism, selection of visual evidence, and geolocation, are currently being studied as a foundation for a unified spatio-temporal representation of multi-modal descriptions. The paper presents, together with an introduction to the topics, the work done so far on this line of research, along with case studies relevant to the posed challenges, focusing on emergencies caused by natural disasters.
Submitted 27 June, 2022;
originally announced June 2022.
-
TriggerCit: Early Flood Alerting using Twitter and Geolocation -- a comparison with alternative sources
Authors:
Carlo Bono,
Barbara Pernici,
Jose Luis Fernandez-Marquez,
Amudha Ravi Shankar,
Mehmet Oğuz Mülâyim,
Edoardo Nemni
Abstract:
Rapid impact assessment in the immediate aftermath of a natural disaster is essential to provide adequate information to international organisations, local authorities, and first responders. Social media can support emergency response with evidence-based content posted by citizens and organisations during ongoing events. In the paper, we propose TriggerCit: an early flood alerting tool with a multilanguage approach focused on timeliness and geolocation. The paper focuses on assessing the reliability of the approach as a triggering system, comparing it with alternative sources for alerts, and evaluating the quality and amount of complementary information gathered. Geolocated visual evidence extracted from Twitter by TriggerCit was analysed in two case studies on floods in Thailand and Nepal in 2021.
Submitted 5 March, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
Knowledge-driven Data Ecosystems Towards Data Transparency
Authors:
Sandra Geisler,
Maria-Esther Vidal,
Cinzia Cappiello,
Bernadette Farias Lóscio,
Avigdor Gal,
Matthias Jarke,
Maurizio Lenzerini,
Paolo Missier,
Boris Otto,
Elda Paja,
Barbara Pernici,
Jakob Rehof
Abstract:
A Data Ecosystem offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data governance and management, trustability is still affected by the absence of transparent and traceable data-driven pipelines. In this work, we focus on the requirements and challenges that data ecosystems face when ensuring data transparency. Requirements are derived from data and organizational management, as well as from broader legal and ethical considerations. We propose a novel knowledge-driven data ecosystem architecture, providing the pillars for satisfying the analyzed requirements. We illustrate the potential of our proposal in a real-world scenario. Lastly, we discuss and rate the potential of the proposed architecture in the fulfillment of these requirements.
Submitted 21 May, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
Image-based Social Sensing: Combining AI and the Crowd to Mine Policy-Adherence Indicators from Twitter
Authors:
Virginia Negri,
Dario Scuratti,
Stefano Agresti,
Donya Rooein,
Gabriele Scalia,
Amudha Ravi Shankar,
Jose Luis Fernandez Marquez,
Mark James Carman,
Barbara Pernici
Abstract:
Social media provides a trove of information that, if aggregated and analysed appropriately, can provide important statistical indicators to policy makers. In some situations these indicators are not available through other mechanisms. For example, given the ongoing COVID-19 outbreak, it is essential for governments to have access to reliable data on policy adherence with regard to mask wearing, social distancing, and other hard-to-measure quantities. In this paper we investigate whether it is possible to obtain such data by aggregating information from images posted to social media. The paper presents VisualCit, a pipeline for image-based social sensing combining recent advances in image recognition technology with geocoding and crowdsourcing techniques. Our aim is to discover in which countries, and to what extent, people are following COVID-19-related policy directives. We compared the results with the indicators produced within the CovidDataHub behavior tracker initiative. Preliminary results show that social media images can produce reliable indicators for policy makers.
Submitted 5 March, 2021; v1 submitted 6 October, 2020;
originally announced October 2020.
-
Evaluating Scalable Uncertainty Estimation Methods for DNN-Based Molecular Property Prediction
Authors:
Gabriele Scalia,
Colin A. Grambow,
Barbara Pernici,
Yi-Pei Li,
William H. Green
Abstract:
Advances in deep neural network (DNN) based molecular property prediction have recently led to the development of models of remarkable accuracy and generalization ability, with graph convolution neural networks (GCNNs) reporting state-of-the-art performance for this task. However, some challenges remain and one of the most important that needs to be fully addressed concerns uncertainty quantification. DNN performance is affected by the volume and the quality of the training samples. Therefore, establishing when and to what extent a prediction can be considered reliable is just as important as outputting accurate predictions, especially when out-of-domain molecules are targeted. Recently, several methods to account for uncertainty in DNNs have been proposed, most of which are based on approximate Bayesian inference. Among these, only a few scale to the large datasets required in applications. Evaluating and comparing these methods has recently attracted great interest, but results are generally fragmented and absent for molecular property prediction. In this paper, we aim to quantitatively compare scalable techniques for uncertainty estimation in GCNNs. We introduce a set of quantitative criteria to capture different uncertainty aspects, and then use these criteria to compare MC-Dropout, deep ensembles, and bootstrapping, both theoretically in a unified framework that separates aleatoric/epistemic uncertainty and experimentally on the QM9 dataset. Our experiments quantify the performance of the different uncertainty estimation methods and their impact on uncertainty-related error reduction. Our findings indicate that ensembling and bootstrapping consistently outperform MC-Dropout, with different context-specific pros and cons. Our analysis also leads to a better understanding of the role of aleatoric/epistemic uncertainty and highlights the challenge posed by out-of-domain uncertainty.
Submitted 7 October, 2019;
originally announced October 2019.
-
Geolocating social media posts for emergency mapping
Authors:
Barbara Pernici,
Chiara Francalanci,
Gabriele Scalia,
Marco Corsi,
Domenico Grandoni,
Mariano Alfonso Biscardi
Abstract:
The demo will illustrate the features of a webGIS interface to support the rapid mapping activities after a natural disaster, with the goal of providing additional information from social media to the mapping operators. This demo shows the first results of the E2mC H2020 European project, where the goal is to extract precisely located information from available social media sources, providing accurate geolocating functionalities and, starting from posts searched in Twitter, extending the social media exploration to Flickr, YouTube, and Instagram.
Submitted 21 January, 2018;
originally announced January 2018.