-
Data Augmentation with Diffusion Models for Colon Polyp Localization on the Low Data Regime: How much real data is enough?
Authors:
Adrian Tormos,
Blanca Llauradó,
Fernando Núñez,
Axel Romero,
Dario Garcia-Gasulla,
Javier Béjar
Abstract:
The scarcity of data in medical domains hinders the performance of Deep Learning models. Data augmentation techniques can alleviate that problem, but they usually rely on functional transformations of the data that are not guaranteed to preserve the original tasks. Approximating the distribution of the data using generative models is a way of reducing that problem and also of obtaining new samples that resemble the original data. Denoising Diffusion models are a promising Deep Learning technique that can learn good approximations of different kinds of data, such as images, time series or tabular data.
Automatic colonoscopy analysis, and specifically polyp localization in colonoscopy videos, is a task that can assist clinical diagnosis and treatment. The annotation of video frames for training a deep learning model is a time-consuming task, and usually only small datasets can be obtained. Fine-tuning application models using a large dataset of generated data could be an alternative to improve their performance. We conduct a set of experiments training different diffusion models that can jointly generate colonoscopy images with localization annotations using a combination of existing open datasets. The generated data is used in various transfer learning experiments on the task of polyp localization with a model based on YOLO v9 in the low-data regime.
Submitted 28 November, 2024;
originally announced November 2024.
-
Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification
Authors:
José Fernando Núñez,
Jamie Arjona,
Javier Béjar
Abstract:
Deep learning models need a sufficient amount of data in order to be able to find the hidden patterns in it. It is the purpose of generative modeling to learn the data distribution, thus allowing us to sample more data and augment the original dataset. In the context of physiological data, and more specifically electrocardiogram (ECG) data, given its sensitive nature and expensive data collection, we can exploit the benefits of generative models in order to enlarge existing datasets and improve downstream tasks, in our case, classification of heart rhythm.
In this work, we explore the usefulness of synthetic data generated with different Deep Learning generative models, namely Diffweave, Time-Diffusion and Time-VQVAE, in order to obtain better classification results for two open source multivariate ECG datasets. Moreover, we also investigate the effects of transfer learning by fine-tuning a synthetically pre-trained model and then progressively adding increasing proportions of real data. We conclude that although the synthetic samples resemble the real ones, the classification improvement when simply augmenting the real dataset is barely noticeable on individual datasets, but when both datasets are merged the results show an increase across all metrics for the classifiers when using synthetic samples as augmented data. In the fine-tuning results, the Time-VQVAE generative model has been shown to be superior to the others but not powerful enough to achieve results close to those of a classifier trained with real data only. In addition, methods and metrics for measuring closeness between synthetic and real data have been explored as a side effect of the main research questions of this study.
Submitted 27 November, 2024;
originally announced November 2024.
-
UruBots Autonomous Cars Team One Description Paper for FIRA 2024
Authors:
Pablo Moraes,
Christopher Peters,
Any Da Rosa,
Vinicio Melgar,
Franco Nuñez,
Maximo Retamar,
William Moraes,
Victoria Saravia,
Hiago Sodre,
Sebastian Barcelona,
Anthony Scirgalea,
Juan Deniz,
Bruna Guterres,
André Kelbouscas,
Ricardo Grando
Abstract:
This document presents the design of an autonomous car developed by the UruBots team for the 2024 FIRA Autonomous Cars Race Challenge. The project involves creating an RC-car-sized electric vehicle capable of navigating race tracks in an autonomous manner. It integrates mechanical and electronic systems alongside artificial-intelligence-based algorithms for navigation and real-time decision-making. The core of our project includes the use of an AI-based algorithm to learn information from a camera and act on the robot to perform the navigation. We show that by creating a dataset with more than five thousand samples and a five-layered CNN we managed to achieve promising performance with our proposed hardware setup. Overall, this paper aims to demonstrate the autonomous capabilities of our car, highlighting its readiness for the 2024 FIRA challenge and helping to contribute to the field of autonomous vehicle research.
Submitted 12 June, 2024;
originally announced June 2024.
-
UruBots Autonomous Car Team Two: Team Description Paper for FIRA 2024
Authors:
William Moraes,
Juan Deniz,
Pablo Moraes,
Christopher Peters,
Vincent Sandin,
Gabriel da Silva,
Franco Nunez,
Maximo Retamar,
Victoria Saravia,
Hiago Sodre,
Sebastian Barcelona,
Anthony Scirgalea,
Bruna Guterres,
Andre Kelbouscas,
Ricardo Grando
Abstract:
This paper proposes a mini autonomous car to be used by the team UruBots for the 2024 FIRA Autonomous Cars Race Challenge. The vehicle is designed with a focus on a low-cost and lightweight setup. Powered by a Raspberry Pi 4 and with a total weight of 1.15 kilograms, we show that our vehicle manages to race a track of approximately 13 meters in 11 seconds at the best evaluation that was carried out, with an average speed of 1.2 m/s. That performance was achieved after training a convolutional neural network with 1500 samples for a total of 60 epochs. Overall, we believe that our vehicle is suited to perform at the FIRA Autonomous Cars Race Challenge 2024, helping the development of the field of study and the category in the competition.
Submitted 12 June, 2024;
originally announced June 2024.
-
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Authors:
Feiqi Cao,
Siwen Luo,
Felipe Nunez,
Zean Wen,
Josiah Poon,
Caren Han
Abstract:
Most TextVQA approaches focus on the integration of objects, scene texts and question words by a simple transformer encoder, but this fails to capture the semantic relations between different modalities. The paper proposes a Scene Graph based co-Attention Network (SceneGATE) for TextVQA, which reveals the semantic relations among the objects, Optical Character Recognition (OCR) tokens and the question words. This is achieved by a TextVQA-based scene graph that discovers the underlying semantics of an image. We created a guided-attention module to capture the intra-modal interplay between the language and the vision as guidance for inter-modal interactions. To explicitly teach the relations between the two modalities, we proposed and integrated two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention. We conducted extensive experiments on two benchmark datasets, Text-VQA and ST-VQA. The results show that our SceneGATE method outperformed existing ones because of the scene graph and its attention modules.
Submitted 7 August, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Input complexity and out-of-distribution detection with likelihood-based generative models
Authors:
Joan Serrà,
David Álvarez,
Vicenç Gómez,
Olga Slizovskaia,
José F. Núñez,
Jordi Luque
Abstract:
Likelihood-based generative models are a promising resource to detect out-of-distribution (OOD) inputs which could compromise the robustness or reliability of a machine learning system. However, likelihoods derived from such models have been shown to be problematic for detecting certain types of inputs that significantly differ from training data. In this paper, we posit that this problem is due to the excessive influence that input complexity has on generative models' likelihoods. We report a set of experiments supporting this hypothesis, and use an estimate of input complexity to derive an efficient and parameter-free OOD score, which can be seen as a likelihood-ratio, akin to Bayesian model comparison. We find this score to perform comparably to, or even better than, existing OOD detection approaches under a wide range of data sets, models, model sizes, and complexity estimates.
Submitted 17 January, 2020; v1 submitted 25 September, 2019;
originally announced September 2019.
-
A Rule-based Model of a Hypothetical Zombie Outbreak: Insights on the role of emotional factors during behavioral adaptation of an artificial population
Authors:
F. Nuñez,
C. Ravello,
H. Urbina,
T. Perez-Acle
Abstract:
Models of infectious diseases have been developed since the first half of the twentieth century. Most models have not considered the role that the emotional factors of the individual may play in the population's behavioral adaptation during the spread of a pandemic disease. Considering that local interactions among individuals generate patterns that, at a large scale, govern the action of masses, we have studied the behavioral adaptation of a population induced by the spread of an infectious disease. To that end, we have developed a rule-based model of a hypothetical zombie outbreak, written in the Kappa language and simulated using Gillespie's stochastic approach. Our study addresses the specificity and heterogeneity of the system at the individual level, a highly desirable characteristic that is mostly overlooked in classic epidemic models. Together with the basic elements of a typical epidemiological model, our model includes an individual representation of the disease progression and the traveling of agents among the cities being affected. It also introduces an approximation to measure the effect of panic in the population as a function of individual situational awareness. In addition, the effect of two possible countermeasures to overcome the zombie threat is considered: the availability of medical treatment and the deployment of special armed forces. However, due to the special characteristics of this hypothetical infectious disease, even using exaggerated numbers of countermeasures, only a small percentage of the population can be saved at the end of the simulations. As expected from a rule-based model approach, the global dynamics of our model were primarily governed by the mechanistic description of local interactions occurring at the individual level. As a whole, people's situational awareness proved essential to modulating the inner dynamics of the system.
Submitted 16 October, 2012;
originally announced October 2012.