-
A large-scale multicenter breast cancer DCE-MRI benchmark dataset with expert segmentations
Authors:
Lidia Garrucho,
Kaisar Kushibar,
Claire-Anne Reidel,
Smriti Joshi,
Richard Osuala,
Apostolia Tsirikoglou,
Maciej Bobowicz,
Javier del Riego,
Alessandro Catanese,
Katarzyna Gwoździewicz,
Maria-Laura Cosaka,
Pasant M. Abo-Elhoda,
Sara W. Tantawy,
Shorouq S. Sakrana,
Norhan O. Shawky-Abdelfatah,
Amr Muhammad Abdo-Salem,
Androniki Kozana,
Eugen Divjak,
Gordana Ivanac,
Katerina Nikiforaki,
Michail E. Klontzas,
Rosa García-Dosdá,
Meltem Gulsun-Akpinar,
Oğuz Lafcı,
Ritse Mann,
et al. (8 additional authors not shown)
Abstract:
Artificial Intelligence (AI) research in breast cancer Magnetic Resonance Imaging (MRI) faces challenges due to limited expert-labeled segmentations. To address this, we present a multicenter dataset of 1506 pre-treatment T1-weighted dynamic contrast-enhanced MRI cases, including expert annotations of primary tumors and non-mass-enhanced regions. The dataset integrates imaging data from four collections in The Cancer Imaging Archive (TCIA), where only 163 cases with expert segmentations were initially available. To facilitate the annotation process, a deep learning model was trained to produce preliminary segmentations for the remaining cases. These were subsequently corrected and verified by 16 breast cancer experts (averaging 9 years of experience), creating a fully annotated dataset. Additionally, the dataset includes 49 harmonized clinical and demographic variables, as well as pre-trained weights for a baseline nnU-Net model trained on the annotated data. This resource addresses a critical gap in publicly available breast cancer datasets, enabling the development, validation, and benchmarking of advanced deep learning models, thus driving progress in breast cancer diagnostics, treatment response prediction, and personalized care.
Submitted 21 February, 2025; v1 submitted 19 June, 2024;
originally announced June 2024.
-
High-resolution synthesis of high-density breast mammograms: Application to improved fairness in deep learning based mass detection
Authors:
Lidia Garrucho,
Kaisar Kushibar,
Richard Osuala,
Oliver Diaz,
Alessandro Catanese,
Javier del Riego,
Maciej Bobowicz,
Fredrik Strand,
Laura Igual,
Karim Lekadir
Abstract:
Computer-aided detection systems based on deep learning have shown good performance in breast cancer detection. However, high-density breasts show poorer detection performance, since dense tissue can mask or even simulate masses. As a result, the sensitivity of mammography for breast cancer detection can be reduced by more than 20% in dense breasts. Additionally, an increased risk of cancer has been reported for extremely dense breasts compared to low-density breasts. This study aims to improve mass detection performance in high-density breasts by using synthetic high-density full-field digital mammograms (FFDM) as data augmentation during breast mass detection model training. To this end, a total of five cycle-consistent GAN (CycleGAN) models were trained on three FFDM datasets for low-to-high-density image translation in high-resolution mammograms. The training images were split by BI-RADS breast density category, where BI-RADS A denotes almost entirely fatty breasts and BI-RADS D denotes extremely dense breasts. Our results showed that the proposed data augmentation technique improved the sensitivity and precision of mass detection in models trained with small datasets and improved the domain generalization of models trained with large databases. In addition, the clinical realism of the synthetic images was evaluated in a reader study involving two expert radiologists and one surgical oncologist.
Submitted 24 January, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
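As an illustration of the density-based split mentioned in the abstract above, the following minimal sketch groups image identifiers into a low-density and a high-density domain, as one might do before unpaired low-to-high translation. The grouping of BI-RADS A and B as "low" and C and D as "high" is an assumption made for this example, as are the function name `split_by_density` and the sample records; the abstract does not state how the authors grouped the categories.

```python
# Hypothetical grouping: the abstract only says images were split by
# BI-RADS density category; treating A/B as low and C/D as high is an
# assumption made for this sketch.
LOW_DENSITY = {"A", "B"}
HIGH_DENSITY = {"C", "D"}


def split_by_density(records):
    """Group (image_id, birads_category) pairs into two domains.

    Returns a dict with keys "low" and "high", each holding the image
    ids in their original order.
    """
    domains = {"low": [], "high": []}
    for image_id, birads in records:
        category = birads.strip().upper()
        if category in LOW_DENSITY:
            domains["low"].append(image_id)
        elif category in HIGH_DENSITY:
            domains["high"].append(image_id)
        else:
            raise ValueError(f"unknown BI-RADS density category: {birads!r}")
    return domains


# Made-up sample records, for illustration only.
sample_records = [
    ("img_001", "A"),
    ("img_002", "D"),
    ("img_003", "b"),
    ("img_004", "C"),
]
domains = split_by_density(sample_records)
```

The two resulting lists would serve as the source and target domains for an unpaired image translation model; no model training is shown here.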
-
Crawling Facebook for Social Network Analysis Purposes
Authors:
Salvatore A. Catanese,
Pasquale De Meo,
Emilio Ferrara,
Giacomo Fiumara,
Alessandro Provetti
Abstract:
We describe our work on the collection and analysis of massive data describing the connections between participants in online social networks. Alternative approaches to social network data collection are defined and evaluated in practice on the popular Facebook website. Thanks to our ad hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, including degree distribution, centrality measures, scaling laws and distribution of friendship.
Submitted 31 May, 2011;
originally announced May 2011.
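To make two of the graph properties named in the abstract above concrete, here is a minimal sketch that computes the degree distribution and the normalized degree centrality of a small undirected graph. It uses only the Python standard library; the function names and the toy edge list are hypothetical and are not taken from the authors' tools.

```python
from collections import Counter


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_distribution(adjacency):
    """Return {degree: fraction of nodes with that degree}."""
    counts = Counter(len(neighbors) for neighbors in adjacency.values())
    n = len(adjacency)
    return {degree: count / n for degree, count in sorted(counts.items())}


def degree_centrality(adjacency):
    """Return {node: degree / (n - 1)}, the usual normalized degree centrality."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


# Toy anonymized friendship edges, for illustration only.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
graph = build_graph(edges)
distribution = degree_distribution(graph)
centrality = degree_centrality(graph)
```

On a crawl with millions of connections, a dedicated graph library would be a more practical choice; the sketch only shows what the two measures compute.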