Search | arXiv e-print repository

MIAS-SAM: Medical Image Anomaly Segmentation without thresholding

Authors: Marco Colussi, Dragan Ahmetovic, Sergio Mascetti

Abstract: This paper presents MIAS-SAM, a novel approach for the segmentation of anomalous regions in medical images. MIAS-SAM uses a patch-based memory bank to store relevant image features, which are extracted from normal data using the SAM encoder. At inference time, the embedding patches extracted from the SAM encoder are compared with those in the memory bank to obtain the anomaly map. Finally, MIAS-SA… ▽ More This paper presents MIAS-SAM, a novel approach for the segmentation of anomalous regions in medical images. MIAS-SAM uses a patch-based memory bank to store relevant image features, which are extracted from normal data using the SAM encoder. At inference time, the embedding patches extracted from the SAM encoder are compared with those in the memory bank to obtain the anomaly map. Finally, MIAS-SAM computes the center of gravity of the anomaly map to prompt the SAM decoder, obtaining an accurate segmentation from the previously extracted features. Differently from prior works, MIAS-SAM does not require to define a threshold value to obtain the segmentation from the anomaly map. Experimental results conducted on three publicly available datasets, each with a different imaging modality (Brain MRI, Liver CT, and Retina OCT) show accurate anomaly segmentation capabilities measured using DICE score. The code is available at: https://github.com/warpcut/MIAS-SAM △ Less

Submitted 28 May, 2025; originally announced May 2025.

arXiv:2412.00946 [pdf, other]

MapIO: Embodied Interaction for the Accessibility of Tactile Maps Through Augmented Touch Exploration and Conversation

Authors: Matteo Manzoni, Sergio Mascetti, Dragan Ahmetovic, Ryan Crabb, James M. Coughlan

Abstract: For individuals who are blind or have low vision, tactile maps provide essential spatial information but are limited in the amount of data they can convey. Digitally augmented tactile maps enhance these capabilities with audio feedback, thereby combining the tactile feedback provided by the map with an audio description of the touched elements. In this context, we explore an embodied interaction p… ▽ More For individuals who are blind or have low vision, tactile maps provide essential spatial information but are limited in the amount of data they can convey. Digitally augmented tactile maps enhance these capabilities with audio feedback, thereby combining the tactile feedback provided by the map with an audio description of the touched elements. In this context, we explore an embodied interaction paradigm to augment tactile maps with conversational interaction based on Large Language Models, thus enabling users to obtain answers to arbitrary questions regarding the map. We analyze the type of questions the users are interested in asking, engineer the Large Language Model's prompt to provide reliable answers, and study the resulting system with a set of 10 participants, evaluating how the users interact with the system, its usability, and user experience. △ Less

Submitted 1 December, 2024; originally announced December 2024.

Comments: 25 pages

arXiv:2411.17869 [pdf, other]

ReC-TTT: Contrastive Feature Reconstruction for Test-Time Training

Authors: Marco Colussi, Sergio Mascetti, Jose Dolz, Christian Desrosiers

Abstract: The remarkable progress in deep learning (DL) showcases outstanding results in various computer vision tasks. However, adaptation to real-time variations in data distributions remains an important challenge. Test-Time Training (TTT) was proposed as an effective solution to this issue, which increases the generalization ability of trained models by adding an auxiliary task at train time and then us… ▽ More The remarkable progress in deep learning (DL) showcases outstanding results in various computer vision tasks. However, adaptation to real-time variations in data distributions remains an important challenge. Test-Time Training (TTT) was proposed as an effective solution to this issue, which increases the generalization ability of trained models by adding an auxiliary task at train time and then using its loss at test time to adapt the model. Inspired by the recent achievements of contrastive representation learning in unsupervised tasks, we propose ReC-TTT, a test-time training technique that can adapt a DL model to new unseen domains by generating discriminative views of the input data. ReC-TTT uses cross-reconstruction as an auxiliary task between a frozen encoder and two trainable encoders, taking advantage of a single shared decoder. This enables, at test time, to adapt the encoders to extract features that will be correctly reconstructed by the decoder that, in this phase, is frozen on the source domain. Experimental results show that ReC-TTT achieves better results than other state-of-the-art techniques in most domain shift classification challenges. △ Less

Submitted 26 November, 2024; originally announced November 2024.

arXiv:2305.18489 [pdf, other]

A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images

Authors: Mattia Giovanni Campana, Marco Colussi, Franca Delmastro, Sergio Mascetti, Elena Pagani

Abstract: In recent months, the monkeypox (mpox) virus -- previously endemic in a limited area of the world -- has started spreading in multiple countries until being declared a ``public health emergency of international concern'' by the World Health Organization. The alert was renewed in February 2023 due to a persisting sustained incidence of the virus in several countries and worries about possible new o… ▽ More In recent months, the monkeypox (mpox) virus -- previously endemic in a limited area of the world -- has started spreading in multiple countries until being declared a ``public health emergency of international concern'' by the World Health Organization. The alert was renewed in February 2023 due to a persisting sustained incidence of the virus in several countries and worries about possible new outbreaks. Low-income countries with inadequate infrastructures for vaccine and testing administration are particularly at risk. A symptom of mpox infection is the appearance of skin rashes and eruptions, which can drive people to seek medical advice. A technology that might help perform a preliminary screening based on the aspect of skin lesions is the use of Machine Learning for image classification. However, to make this technology suitable on a large scale, it should be usable directly on mobile devices of people, with a possible notification to a remote medical expert. In this work, we investigate the adoption of Deep Learning to detect mpox from skin lesion images. The proposal leverages Transfer Learning to cope with the scarce availability of mpox image datasets. As a first step, a homogenous, unpolluted, dataset is produced by manual selection and preprocessing of available image data. It will also be released publicly to researchers in the field. Then, a thorough comparison is conducted amongst several Convolutional Neural Networks, based on a 10-fold stratified cross-validation. The best models are then optimized through quantization for use on mobile devices; measures of classification quality, memory footprint, and processing times validate the feasibility of our proposal. Additionally, the use of eXplainable AI is investigated as a suitable instrument to both technically and clinically validate classification outcomes. △ Less

Submitted 29 May, 2023; originally announced May 2023.

Comments: Submitted to Pervasive and Mobile Computing

arXiv:2211.12089 [pdf, other]

Ultrasound Detection of Subquadricipital Recess Distension

Authors: Marco Colussi, Gabriele Civitarese, Dragan Ahmetovic, Claudio Bettini, Roberta Gualtierotti, Flora Peyvandi, Sergio Mascetti

Abstract: Joint bleeding is a common condition for people with hemophilia and, if untreated, can result in hemophilic arthropathy. Ultrasound imaging has recently emerged as an effective tool to diagnose joint recess distension caused by joint bleeding. However, no computer-aided diagnosis tool exists to support the practitioner in the diagnosis process. This paper addresses the problem of automatically det… ▽ More Joint bleeding is a common condition for people with hemophilia and, if untreated, can result in hemophilic arthropathy. Ultrasound imaging has recently emerged as an effective tool to diagnose joint recess distension caused by joint bleeding. However, no computer-aided diagnosis tool exists to support the practitioner in the diagnosis process. This paper addresses the problem of automatically detecting the recess and assessing whether it is distended in knee ultrasound images collected in patients with hemophilia. After framing the problem, we propose two different approaches: the first one adopts a one-stage object detection algorithm, while the second one is a multi-task approach with a classification and a detection branch. The experimental evaluation, conducted with $483$ annotated images, shows that the solution based on object detection alone has a balanced accuracy score of $0.74$ with a mean IoU value of $0.66$, while the multi-task approach has a higher balanced accuracy value ($0.78$) at the cost of a slightly lower mean IoU value. △ Less

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2005.06875 [pdf, other]

Developing Accessible Mobile Applications with Cross-Platform Development Frameworks

Authors: Sergio Mascetti, Mattia Ducci, Niccoló Cantù, Paolo Pecis, Dragan Ahmetovic

Abstract: This contribution investigates how cross-platform development frameworks (CPDF) support the creation of mobile applications that are accessible to people with visual impairments through screen readers. We first systematically analyze screen-reader APIs available in native iOS and Android, and we examine whether and at what level the same functionalities are available in two popular CPDF: Xamarin a… ▽ More This contribution investigates how cross-platform development frameworks (CPDF) support the creation of mobile applications that are accessible to people with visual impairments through screen readers. We first systematically analyze screen-reader APIs available in native iOS and Android, and we examine whether and at what level the same functionalities are available in two popular CPDF: Xamarin and React Native. This analysis unveils that there are many functionalities shared between native iOS and Android APIs, but most of them are not available in React Native or Xamarin. In particular, not even all basic APIs are exposed by the examined CPDF. Accessing the unavailable APIs is still possible, but it requires an additional effort by the developers who need to know native APIs and to write platform specific code, hence partially negating the advantages of CPDF. To address this problem, we consider a representative set of native APIs that cannot be directly accessed from React Native and Xamarin and show sample implementations for accessing them. △ Less

Submitted 14 May, 2020; originally announced May 2020.

arXiv:1506.07272 [pdf, other]

Sonification of guidance data during road crossing for people with visual impairments or blindness

Authors: Sergio Mascetti, Lorenzo Picinali, Andrea Gerino, Dragan Ahmetovic, Cristian Bernareggi

Abstract: In the last years several solutions were proposed to support people with visual impairments or blindness during road crossing. These solutions focus on computer vision techniques for recognizing pedestrian crosswalks and computing their relative position from the user. Instead, this contribution addresses a different problem; the design of an auditory interface that can effectively guide the user… ▽ More In the last years several solutions were proposed to support people with visual impairments or blindness during road crossing. These solutions focus on computer vision techniques for recognizing pedestrian crosswalks and computing their relative position from the user. Instead, this contribution addresses a different problem; the design of an auditory interface that can effectively guide the user during road crossing. Two original auditory guiding modes based on data sonification are presented and compared with a guiding mode based on speech messages. Experimental evaluation shows that there is no guiding mode that is best suited for all test subjects. The average time to align and cross is not significantly different among the three guiding modes, and test subjects distribute their preferences for the best guiding mode almost uniformly among the three solutions. From the experiments it also emerges that higher effort is necessary for decoding the sonified instructions if compared to the speech instructions, and that test subjects require frequent `hints' (in the form of speech messages). Despite this, more than 2/3 of test subjects prefer one of the two guiding modes based on sonification. There are two main reasons for this: firstly, with speech messages it is harder to hear the sound of the environment, and secondly sonified messages convey information about the "quantity" of the expected movement. △ Less

Submitted 24 June, 2015; originally announced June 2015.

arXiv:1110.2213 [pdf, ps, other]

doi 10.1613/jair.2136

Supporting Temporal Reasoning by Mapping Calendar Expressions to Minimal Periodic Sets

Authors: C. Bettini, S. Mascetti, X. S. Wang

Abstract: In the recent years several research efforts have focused on the concept of time granularity and its applications. A first stream of research investigated the mathematical models behind the notion of granularity and the algorithms to manage temporal data based on those models. A second stream of research investigated symbolic formalisms providing a set of algebraic operators to define granularitie… ▽ More In the recent years several research efforts have focused on the concept of time granularity and its applications. A first stream of research investigated the mathematical models behind the notion of granularity and the algorithms to manage temporal data based on those models. A second stream of research investigated symbolic formalisms providing a set of algebraic operators to define granularities in a compact and compositional way. However, only very limited manipulation algorithms have been proposed to operate directly on the algebraic representation making it unsuitable to use the symbolic formalisms in applications that need manipulation of granularities. This paper aims at filling the gap between the results from these two streams of research, by providing an efficient conversion from the algebraic representation to the equivalent low-level representation based on the mathematical models. In addition, the conversion returns a minimal representation in terms of period length. Our results have a major practical impact: users can more easily define arbitrary granularities in terms of algebraic operators, and then access granularity reasoning and other services operating efficiently on the equivalent, minimal low-level representation. As an example, we illustrate the application to temporal constraint reasoning with multiple granularities. From a technical point of view, we propose an hybrid algorithm that interleaves the conversion of calendar subexpressions into periodical sets with the minimization of the period length. The algorithm returns set-based granularity representations having minimal period length, which is the most relevant parameter for the performance of the considered reasoning services. Extensive experimental work supports the techniques used in the algorithm, and shows the efficiency and effectiveness of the algorithm. △ Less

Submitted 10 October, 2011; originally announced October 2011.

Journal ref: Journal Of Artificial Intelligence Research, Volume 28, pages 299-348, 2007

arXiv:1007.0408 [pdf, other]

Privacy in geo-social networks: proximity notification with untrusted service providers and curious buddies

Authors: Sergio Mascetti, Dario Freni, Claudio Bettini, X. Sean Wang, Sushil Jajodia

Abstract: A major feature of the emerging geo-social networks is the ability to notify a user when one of his friends (also called buddies) happens to be geographically in proximity with the user. This proximity service is usually offered by the network itself or by a third party service provider (SP) using location data acquired from the users. This paper provides a rigorous theoretical and experimental an… ▽ More A major feature of the emerging geo-social networks is the ability to notify a user when one of his friends (also called buddies) happens to be geographically in proximity with the user. This proximity service is usually offered by the network itself or by a third party service provider (SP) using location data acquired from the users. This paper provides a rigorous theoretical and experimental analysis of the existing solutions for the location privacy problem in proximity services. This is a serious problem for users who do not trust the SP to handle their location data, and would only like to release their location information in a generalized form to participating buddies. The paper presents two new protocols providing complete privacy with respect to the SP, and controllable privacy with respect to the buddies. The analytical and experimental analysis of the protocols takes into account privacy, service precision, and computation and communication costs, showing the superiority of the new protocols compared to those appeared in the literature to date. The proposed protocols have also been tested in a full system implementation of the proximity service. △ Less

Submitted 6 November, 2010; v1 submitted 2 July, 2010; originally announced July 2010.

Showing 1–9 of 9 results for author: Mascetti, S