-
Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS) challenge results
Authors:
Meritxell Riera-Marin,
Sikha O K,
Julia Rodriguez-Comas,
Matthias Stefan May,
Zhaohong Pan,
Xiang Zhou,
Xiaokun Liang,
Franciskus Xaverius Erick,
Andrea Prenner,
Cedric Hemon,
Valentin Boussot,
Jean-Louis Dillenseger,
Jean-Claude Nunes,
Abdul Qayyum,
Moona Mazher,
Steven A Niederer,
Kaisar Kushibar,
Carlos Martin-Isla,
Petia Radeva,
Karim Lekadir,
Theodore Barfoot,
Luis C. Garcia Peraza Herrera,
Ben Glocker,
Tom Vercauteren,
Lucas Gago, et al. (7 additional authors not shown)
Abstract:
Deep learning (DL) has become the dominant approach for medical image segmentation, yet ensuring the reliability and clinical applicability of these models requires addressing key challenges such as annotation variability, calibration, and uncertainty estimation. This is why we created the Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS), which highlights the critical role of multiple annotators in establishing a more comprehensive ground truth, emphasizing that segmentation is inherently subjective and that leveraging inter-annotator variability is essential for robust model evaluation. Seven teams participated in the challenge, submitting a variety of DL models evaluated using metrics such as Dice Similarity Coefficient (DSC), Expected Calibration Error (ECE), and Continuous Ranked Probability Score (CRPS). By incorporating consensus and dissensus ground truth, we assess how DL models handle uncertainty and whether their confidence estimates align with true segmentation performance. Our findings reinforce the importance of well-calibrated models, as better calibration is strongly correlated with the quality of the results. Furthermore, we demonstrate that segmentation models trained on diverse datasets and enriched with pre-trained knowledge exhibit greater robustness, particularly in cases deviating from standard anatomical structures. Notably, the best-performing models achieved high DSC and well-calibrated uncertainty estimates. This work underscores the need for multi-annotator ground truth, thorough calibration assessments, and uncertainty-aware evaluations to develop trustworthy and clinically reliable DL-based medical image segmentation models.
Submitted 13 May, 2025;
originally announced May 2025.
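The abstract above names the Dice Similarity Coefficient (DSC) and Expected Calibration Error (ECE) as evaluation metrics. The following is a minimal sketch of both for a flattened binary mask, not the challenge's actual evaluation code: the function names, the equal-width binning, the choice of 10 bins, and the use of max(p, 1 - p) as confidence are all assumptions made for illustration.

```python
def dice(pred, target):
    """Dice Similarity Coefficient for two equal-length binary masks (0/1 lists)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE for per-voxel foreground probabilities against binary labels.

    Confidence is max(p, 1 - p); accuracy is whether the thresholded
    prediction (p >= 0.5) matches the label. Bins are equal-width on [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        predicted = 1 if p >= 0.5 else 0
        confidence = p if predicted == 1 else 1.0 - p
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, 1.0 if predicted == y else 0.0))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(a for _, a in members) / len(members)
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += len(members) / n * abs(accuracy - mean_conf)
    return ece
```

A model that is fully confident and always right has an ECE of 0; one that is fully confident but right only half the time has an ECE of 0.5.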
-
Faster Convergence with Less Communication: Broadcast-Based Subgraph Sampling for Decentralized Learning over Wireless Networks
Authors:
Daniel Pérez Herrera,
Zheng Chen,
Erik G. Larsson
Abstract:
Consensus-based decentralized stochastic gradient descent (D-SGD) is a widely adopted algorithm for decentralized training of machine learning models across networked agents. A crucial part of D-SGD is the consensus-based model averaging, which heavily relies on information exchange and fusion among the nodes. Specifically, for consensus averaging over wireless networks, communication coordination is necessary to determine when and how a node can access the channel and transmit (or receive) information to (or from) its neighbors. In this work, we propose $\texttt{BASS}$, a broadcast-based subgraph sampling method designed to accelerate the convergence of D-SGD while considering the actual communication cost per iteration. $\texttt{BASS}$ creates a set of mixing matrix candidates that represent sparser subgraphs of the base topology. In each consensus iteration, one mixing matrix is sampled, leading to a specific scheduling decision that activates multiple collision-free subsets of nodes. The sampling occurs in a probabilistic manner, and the elements of the mixing matrices, along with their sampling probabilities, are jointly optimized. Simulation results demonstrate that $\texttt{BASS}$ enables faster convergence with fewer transmission slots compared to existing link-based scheduling methods. In conclusion, the inherent broadcasting nature of wireless channels offers intrinsic advantages in accelerating the convergence of decentralized optimization and learning.
Submitted 11 February, 2025; v1 submitted 24 January, 2024;
originally announced January 2024.
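The abstract above describes sampling one mixing matrix per consensus iteration from a set of candidates that represent sparser subgraphs. The sketch below illustrates only that sampling-and-averaging step on a hypothetical three-node path graph. It is not BASS itself: the candidate matrices and the uniform sampling probabilities are hand-picked assumptions, whereas the paper optimizes them jointly and ties each matrix to a collision-free broadcast schedule.

```python
import random


def mix(W, x):
    """One consensus step: x <- W x for a square mixing matrix W."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def sampled_consensus(x, candidates, probs, iters, seed=0):
    """Repeatedly sample a mixing matrix and apply it to the node values."""
    rng = random.Random(seed)
    for _ in range(iters):
        W = rng.choices(candidates, weights=probs, k=1)[0]
        x = mix(W, x)
    return x


# Hypothetical candidates on the path graph 0 - 1 - 2. Each matrix is
# symmetric and doubly stochastic, so every step preserves the average.
W_A = [[0.5, 0.5, 0.0],
       [0.5, 0.5, 0.0],
       [0.0, 0.0, 1.0]]  # only link (0, 1) is active
W_B = [[1.0, 0.0, 0.0],
       [0.0, 0.5, 0.5],
       [0.0, 0.5, 0.5]]  # only link (1, 2) is active

values = sampled_consensus([0.0, 3.0, 3.0], [W_A, W_B], [0.5, 0.5], iters=200)
```

Because both candidates are doubly stochastic, the node values approach the initial average (2.0 here) as long as both subgraphs keep being sampled.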
-
Decentralized Learning over Wireless Networks with Broadcast-Based Subgraph Sampling
Authors:
Daniel Pérez Herrera,
Zheng Chen,
Erik G. Larsson
Abstract:
This work centers on the communication aspects of decentralized learning over wireless networks, using consensus-based decentralized stochastic gradient descent (D-SGD). Considering the actual communication cost or delay caused by in-network information exchange in an iterative process, our goal is to achieve fast convergence of the algorithm measured by improvement per transmission slot. We propose BASS, an efficient communication framework for D-SGD over wireless networks with broadcast transmission and probabilistic subgraph sampling. In each iteration, we activate multiple subsets of non-interfering nodes to broadcast model updates to their neighbors. These subsets are randomly activated over time, with probabilities reflecting their importance in network connectivity and subject to a communication cost constraint (e.g., the average number of transmission slots per iteration). During the consensus update step, only bi-directional links are effectively preserved to maintain communication symmetry. In comparison to existing link-based scheduling methods, the inherent broadcasting nature of wireless channels offers intrinsic advantages in speeding up convergence of decentralized learning by creating more communicated links with the same number of transmission slots.
Submitted 24 October, 2023;
originally announced October 2023.
-
Distributed Consensus in Wireless Networks with Probabilistic Broadcast Scheduling
Authors:
Daniel Pérez Herrera,
Zheng Chen,
Erik G. Larsson
Abstract:
We consider distributed average consensus in a wireless network with partial communication to reduce the number of transmissions in every iteration/round. Considering the broadcast nature of wireless channels, we propose a probabilistic approach that schedules a subset of nodes for broadcasting information to their neighbors in every round. We compare several heuristic methods for assigning the node broadcast probabilities under a fixed number of transmissions per round. Furthermore, we introduce a pre-compensation method to correct the bias between the consensus value and the average of the initial values, and suggest possible extensions for our design. Our results are particularly relevant for developing communication-efficient consensus protocols in a wireless environment with limited frequency/time resources.
Submitted 27 January, 2023;
originally announced January 2023.
-
Personalized musically induced emotions of not-so-popular Colombian music
Authors:
Juan Sebastián Gómez-Cañón,
Perfecto Herrera,
Estefanía Cano,
Emilia Gómez
Abstract:
This work presents an initial proof of concept of how Music Emotion Recognition (MER) systems could be intentionally biased with respect to annotations of musically induced emotions in a political context. Specifically, we analyze traditional Colombian music containing politically charged lyrics of two types: (1) vallenatos and social songs from the "left-wing" guerrilla Fuerzas Armadas Revolucionarias de Colombia (FARC) and (2) corridos from the "right-wing" paramilitaries Autodefensas Unidas de Colombia (AUC). We train personalized machine learning models to predict induced emotions for three users with diverse political views - we aim to identify the songs that may induce negative emotions for a particular user, such as anger and fear. To this extent, a user's emotion judgments could be interpreted as problematizing data - subjective emotional judgments could in turn be used to influence the user in a human-centered machine learning environment. In short, highly desired "emotion regulation" applications could potentially deviate to "emotion manipulation" - the recent discredit of emotion recognition technologies might transcend ethical issues of diversity and inclusion.
Submitted 9 December, 2021;
originally announced December 2021.
-
The emotions that we perceive in music: the influence of language and lyrics comprehension on agreement
Authors:
Juan Sebastián Gómez Cañón,
Perfecto Herrera,
Emilia Gómez,
Estefanía Cano
Abstract:
In the present study, we address the relationship between the emotions perceived in pop and rock music (mainly in Euro-American styles with English lyrics) and the language spoken by the listener. Our goal is to understand the influence of lyrics comprehension on the perception of emotions and use this information to improve Music Emotion Recognition (MER) models. Two main research questions are addressed: 1. Are there differences and similarities between the emotions perceived in pop/rock music by listeners raised with different mother tongues? 2. Do personal characteristics have an influence on the perceived emotions for listeners of a given language? Personal characteristics include the listeners' general demographics, familiarity and preference for the fragments, and music sophistication. Our hypothesis is that inter-rater agreement (as defined by Krippendorff's alpha coefficient) from subjects is directly influenced by the comprehension of lyrics.
Submitted 25 October, 2019; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour
Authors:
Emilia Gómez,
Carlos Castillo,
Vicky Charisi,
Verónica Dahl,
Gustavo Deco,
Blagoj Delipetrev,
Nicole Dewandre,
Miguel Ángel González-Ballester,
Fabien Gouyon,
José Hernández-Orallo,
Perfecto Herrera,
Anders Jonsson,
Ansgar Koene,
Martha Larson,
Ramón López de Mántaras,
Bertin Martens,
Marius Miron,
Rubén Moreno-Bote,
Nuria Oliver,
Antonio Puertas Gallardo,
Heike Schweitzer,
Nuria Sebastian,
Xavier Serra,
Joan Serrà,
Songül Tolan, et al. (1 additional author not shown)
Abstract:
This document contains the outcome of the first Human behaviour and machine intelligence (HUMAINT) workshop that took place 5-6 March 2018 in Barcelona, Spain. The workshop was organized in the context of a new research programme at the Centre for Advanced Studies, Joint Research Centre of the European Commission, which focuses on studying the potential impact of artificial intelligence on human behaviour. The workshop gathered an interdisciplinary group of experts to establish the state-of-the-art research in the field and a list of future research challenges to be addressed on the topic of human and machine intelligence, algorithms' potential impact on human cognitive capabilities and decision making, and evaluation and regulation needs. The document consists of short position statements and identification of challenges provided by each expert, and incorporates the result of the discussions carried out during the workshop. In the conclusion section, we provide a list of emerging research topics and strategies to be addressed in the near future.
Submitted 7 June, 2018;
originally announced June 2018.
-
Analysis of the Cuban journal Bibliotecas: Anales de Investigacion
Authors:
C. L. González-Valiente,
S. Núñez Amaro,
J. R. Santovenia Díaz,
M. P. Linares Herrera
Abstract:
The objective of this article is to describe the academic impact, the editorial process quality, and the editorial and visibility strategies of Bibliotecas. Anales de Investigacion (BAI), a scientific Cuban journal edited by the National Library of Cuba Jose Marti. The academic impact is determined through a citation analysis, which uses the Google Scholar database as its reference source. The bibliometric indicators applied are: citation per year, citation vs. self-citation, citable journals vs. non-citable documents, Hirsch Index, and impact factor. The editorial process quality and the visibility strategies are determined through a self-evaluation which takes into account the SciELO, Scopus, CLASE, Redalyc, Latindex, Dialnet, and ERIH PLUS methodologies. The results reveal an ascending citation line that highlights citing journals from the fields of Library and Information Science, Medicine and Health Sciences, and Education. Aspects related to content and format have negatively influenced editorial process quality. Some strategies are proposed to improve scientific visibility through inclusion in databases, directories, and social and academic networks. In general, this study contributes to editorial decision-making, an issue that could augment the impact and scientific visibility of BAI.
Submitted 16 March, 2016;
originally announced March 2016.
-
Characterization and exploitation of community structure in cover song networks
Authors:
Joan Serrà,
Massimiliano Zanin,
Perfecto Herrera,
Xavier Serra
Abstract:
The use of community detection algorithms is explored within the framework of cover song identification, i.e. the automatic detection of different audio renditions of the same underlying musical piece. Until now, this task has been posed as a typical query-by-example task, where one submits a query song and the system retrieves a list of possible matches ranked by their similarity to the query. In this work, we propose a new approach which uses song communities to provide more relevant answers to a given query. Starting from the output of a state-of-the-art system, songs are embedded in a complex weighted network whose links represent similarity (related musical content). Communities inside the network are then recognized as groups of covers and this information is used to enhance the results of the system. In particular, we show that this approach increases both the coherence and the accuracy of the system. Furthermore, we provide insight into the internal organization of individual cover song communities, showing that there is a tendency for the original song to be central within the community. We postulate that the methods and results presented here could be relevant to other query-by-example tasks.
Submitted 12 September, 2011; v1 submitted 29 August, 2011;
originally announced August 2011.
-
Statistical keyword detection in literary corpora
Authors:
Juan P. Herrera,
Pedro A. Pury
Abstract:
Understanding the complexity of human language requires an appropriate analysis of the statistical distribution of words in texts. We consider the information retrieval problem of detecting and ranking the relevant words of a text by means of statistical information referring to the "spatial" use of the words. Shannon's entropy of information is used as a tool for automatic keyword extraction. By using The Origin of Species by Charles Darwin as a representative text sample, we show the performance of our detector and compare it with other proposals in the literature. The randomly shuffled text receives special attention as a tool for calibrating the ranking indices.
Submitted 30 May, 2008; v1 submitted 5 January, 2007;
originally announced January 2007.
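The abstract above uses Shannon entropy over the "spatial" distribution of a word as a keyword signal. The idea can be sketched as follows, assuming the text is cut into equal-sized parts and the entropy is taken over a word's occurrence counts per part. This is not the paper's exact ranking index: the function name, the partitioning scheme, and the normalization by log(n_parts) are assumptions for illustration.

```python
import math


def spatial_entropy(tokens, word, n_parts):
    """Normalized Shannon entropy of a word's occurrences across text parts.

    Returns a value in [0, 1]: near 0 when the word is clustered in a few
    parts (a keyword candidate), near 1 when it is spread evenly.
    Returns None if the word does not occur.
    """
    if n_parts < 2:
        raise ValueError("n_parts must be at least 2")
    size = math.ceil(len(tokens) / n_parts)
    counts = [tokens[i * size:(i + 1) * size].count(word) for i in range(n_parts)]
    total = sum(counts)
    if total == 0:
        return None
    # Entropy over the fraction of occurrences that falls in each part.
    entropy = sum(-(c / total) * math.log(c / total) for c in counts if c)
    return entropy / math.log(n_parts)


tokens = ["a", "b", "a", "b", "k", "k", "a", "b"]
clustered = spatial_entropy(tokens, "k", 4)  # all occurrences in one part
spread = spatial_entropy(tokens, "a", 4)     # occurrences in three of four parts
```

Under this sketch, a word confined to one part scores 0, while a word spread across parts scores higher. Comparing scores against a randomly shuffled copy of the text, as the abstract suggests, would indicate how much of the clustering is due to chance.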