-
Personalised Insulin Adjustment with Reinforcement Learning: An In-Silico Validation for People with Diabetes on Intensive Insulin Treatment
Authors:
Maria Panagiotou,
Lorenzo Brigato,
Vivien Streit,
Amanda Hayoz,
Stephan Proennecke,
Stavros Athanasopoulos,
Mikkel T. Olsen,
Elizabeth J. den Brok,
Cecilie H. Svensson,
Konstantinos Makrilakis,
Maria Xatzipsalti,
Andriani Vazeou,
Peter R. Mertens,
Ulrik Pedersen-Bjergaard,
Bastiaan E. de Galan,
Stavroula Mougiakakou
Abstract:
Despite recent advances in insulin preparations and technology, adjusting insulin remains an ongoing challenge for the majority of people with type 1 diabetes (T1D) and longstanding type 2 diabetes (T2D). In this study, we propose the Adaptive Basal-Bolus Advisor (ABBA), a personalised insulin treatment recommendation approach based on reinforcement learning for individuals with T1D and T2D, perfo…
▽ More
Despite recent advances in insulin preparations and technology, adjusting insulin remains an ongoing challenge for the majority of people with type 1 diabetes (T1D) and longstanding type 2 diabetes (T2D). In this study, we propose the Adaptive Basal-Bolus Advisor (ABBA), a personalised insulin treatment recommendation approach based on reinforcement learning for individuals with T1D and T2D, performing self-monitoring blood glucose measurements and multiple daily insulin injection therapy. We developed and evaluated the ability of ABBA to achieve better time-in-range (TIR) for individuals with T1D and T2D, compared to a standard basal-bolus advisor (BBA). The in-silico test was performed using an FDA-accepted population, including 101 simulated adults with T1D and 101 with T2D. An in-silico evaluation shows that ABBA significantly improved TIR and significantly reduced both times below- and above-range, compared to BBA. ABBA's performance continued to improve over two months, whereas BBA exhibited only modest changes. This personalised method for adjusting insulin has the potential to further optimise glycaemic control and support people with T1D and T2D in their daily self-management. Our results warrant ABBA to be trialed for the first time in humans.
△ Less
Submitted 20 May, 2025;
originally announced May 2025.
-
The Role of Artificial Intelligence in Enhancing Insulin Recommendations and Therapy Outcomes
Authors:
Maria Panagiotou,
Knut Stroemmen,
Lorenzo Brigato,
Bastiaan E. de Galan,
Stavroula Mougiakakou
Abstract:
The growing worldwide incidence of diabetes requires more effective approaches for managing blood glucose levels. Insulin delivery systems have advanced significantly, with artificial intelligence (AI) playing a key role in improving their precision and adaptability. AI algorithms, particularly those based on reinforcement learning, allow for personalised insulin dosing by continuously adapting to…
▽ More
The growing worldwide incidence of diabetes requires more effective approaches for managing blood glucose levels. Insulin delivery systems have advanced significantly, with artificial intelligence (AI) playing a key role in improving their precision and adaptability. AI algorithms, particularly those based on reinforcement learning, allow for personalised insulin dosing by continuously adapting to an individual's responses. Despite these advancements, challenges such as data privacy, algorithm transparency, and accessibility still need to be addressed. Continued progress and validation in AI-driven insulin delivery systems promise to improve therapy outcomes further, offering people more effective and individualised management of their diabetes. This paper presents an overview of current strategies, key challenges, and future directions.
△ Less
Submitted 24 March, 2025;
originally announced March 2025.
-
Benchmarking Post-Hoc Unknown-Category Detection in Food Recognition
Authors:
Lubnaa Abdur Rahman,
Ioannis Papathanail,
Lorenzo Brigato,
Stavroula Mougiakakou
Abstract:
Food recognition models often struggle to distinguish between seen and unseen samples, frequently misclassifying samples from unseen categories by assigning them an in-distribution (ID) label. This misclassification presents significant challenges when deploying these models in real-world applications, particularly within automatic dietary assessment systems, where incorrect labels can lead to cas…
▽ More
Food recognition models often struggle to distinguish between seen and unseen samples, frequently misclassifying samples from unseen categories by assigning them an in-distribution (ID) label. This misclassification presents significant challenges when deploying these models in real-world applications, particularly within automatic dietary assessment systems, where incorrect labels can lead to cascading errors throughout the system. Ideally, such models should prompt the user when an unknown sample is encountered, allowing for corrective action. Given no prior research exploring food recognition in real-world settings, in this work we conduct an empirical analysis of various post-hoc out-of-distribution (OOD) detection methods for fine-grained food recognition. Our findings indicate that virtual logit matching (ViM) performed the best overall, likely due to its combination of logits and feature-space representations. Additionally, our work reinforces prior notions in the OOD domain, noting that models with higher ID accuracy performed better across the evaluated OOD detection methods. Furthermore, transformer-based architectures consistently outperformed convolution-based models in detecting OOD samples across various methods.
△ Less
Submitted 24 March, 2025;
originally announced March 2025.
-
Position: There are no Champions in Long-Term Time Series Forecasting
Authors:
Lorenzo Brigato,
Rafael Morand,
Knut Strømmen,
Maria Panagiotou,
Markus Schmidt,
Stavroula Mougiakakou
Abstract:
Recent advances in long-term time series forecasting have introduced numerous complex prediction models that consistently outperform previously published architectures. However, this rapid progression raises concerns regarding inconsistent benchmarking and reporting practices, which may undermine the reliability of these comparisons. Our position emphasizes the need to shift focus away from pursui…
▽ More
Recent advances in long-term time series forecasting have introduced numerous complex prediction models that consistently outperform previously published architectures. However, this rapid progression raises concerns regarding inconsistent benchmarking and reporting practices, which may undermine the reliability of these comparisons. Our position emphasizes the need to shift focus away from pursuing ever-more complex models and towards enhancing benchmarking practices through rigorous and standardized evaluation methods. To support our claim, we first perform a broad, thorough, and reproducible evaluation of the top-performing models on the most popular benchmark by training 3,500+ networks over 14 datasets. Then, through a comprehensive analysis, we find that slight changes to experimental setups or current evaluation metrics drastically shift the common belief that newly published results are advancing the state of the art. Our findings suggest the need for rigorous and standardized evaluation methods that enable more substantiated claims, including reproducible hyperparameter setups and statistical testing.
△ Less
Submitted 19 February, 2025;
originally announced February 2025.
-
A SAM based Tool for Semi-Automatic Food Annotation
Authors:
Lubnaa Abdur Rahman,
Ioannis Papathanail,
Lorenzo Brigato,
Stavroula Mougiakakou
Abstract:
The advancement of artificial intelligence (AI) in food and nutrition research is hindered by a critical bottleneck: the lack of annotated food data. Despite the rise of highly efficient AI models designed for tasks such as food segmentation and classification, their practical application might necessitate proficiency in AI and machine learning principles, which can act as a challenge for non-AI e…
▽ More
The advancement of artificial intelligence (AI) in food and nutrition research is hindered by a critical bottleneck: the lack of annotated food data. Despite the rise of highly efficient AI models designed for tasks such as food segmentation and classification, their practical application might necessitate proficiency in AI and machine learning principles, which can act as a challenge for non-AI experts in the field of nutritional sciences. Alternatively, it highlights the need to translate AI models into user-friendly tools that are accessible to all. To address this, we present a demo of a semi-automatic food image annotation tool leveraging the Segment Anything Model (SAM). The tool enables prompt-based food segmentation via user interactions, promoting user engagement and allowing them to further categorise food items within meal images and specify weight/volume if necessary. Additionally, we release a fine-tuned version of SAM's mask decoder, dubbed MealSAM, with the ViT-B backbone tailored specifically for food image segmentation. Our objective is not only to contribute to the field by encouraging participation, collaboration, and the gathering of more annotated food data but also to make AI technology available for a broader audience by translating AI into practical tools.
△ Less
Submitted 11 October, 2024;
originally announced October 2024.
-
Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets
Authors:
Lorenzo Brigato,
Stavroula Mougiakakou
Abstract:
We introduce Tune without Validation (Twin), a pipeline for tuning learning rate and weight decay without validation sets. We leverage a recent theoretical framework concerning learning phases in hypothesis space to devise a heuristic that predicts what hyper-parameter (HP) combinations yield better generalization. Twin performs a grid search of trials according to an early-/non-early-stopping sch…
▽ More
We introduce Tune without Validation (Twin), a pipeline for tuning learning rate and weight decay without validation sets. We leverage a recent theoretical framework concerning learning phases in hypothesis space to devise a heuristic that predicts what hyper-parameter (HP) combinations yield better generalization. Twin performs a grid search of trials according to an early-/non-early-stopping scheduler and then segments the region that provides the best results in terms of training loss. Among these trials, the weight norm strongly correlates with predicting generalization. To assess the effectiveness of Twin, we run extensive experiments on 20 image classification datasets and train several families of deep networks, including convolutional, transformer, and feed-forward models. We demonstrate proper HP selection when training from scratch and fine-tuning, emphasizing small-sample scenarios.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
-
An Empirical Analysis for Zero-Shot Multi-Label Classification on COVID-19 CT Scans and Uncurated Reports
Authors:
Ethan Dack,
Lorenzo Brigato,
Matthew McMurray,
Matthias Fontanellaz,
Thomas Frauenfelder,
Hanno Hoppe,
Aristomenis Exadaktylos,
Thomas Geiser,
Manuela Funke-Chambour,
Andreas Christe,
Lukas Ebner,
Stavroula Mougiakakou
Abstract:
The pandemic resulted in vast repositories of unstructured data, including radiology reports, due to increased medical examinations. Previous research on automated diagnosis of COVID-19 primarily focuses on X-ray images, despite their lower precision compared to computed tomography (CT) scans. In this work, we leverage unstructured data from a hospital and harness the fine-grained details offered…
▽ More
The pandemic resulted in vast repositories of unstructured data, including radiology reports, due to increased medical examinations. Previous research on automated diagnosis of COVID-19 primarily focuses on X-ray images, despite their lower precision compared to computed tomography (CT) scans. In this work, we leverage unstructured data from a hospital and harness the fine-grained details offered by CT scans to perform zero-shot multi-label classification based on contrastive visual language learning. In collaboration with human experts, we investigate the effectiveness of multiple zero-shot models that aid radiologists in detecting pulmonary embolisms and identifying intricate lung details like ground glass opacities and consolidations. Our empirical analysis provides an overview of the possible solutions to target such fine-grained tasks, so far overlooked in the medical multimodal pretraining literature. Our investigation promises future advancements in the medical image analysis community by addressing some challenges associated with unstructured data and fine-grained multi-label classification.
△ Less
Submitted 6 September, 2023; v1 submitted 4 September, 2023;
originally announced September 2023.
-
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Authors:
Lorenzo Brigato,
Stavroula Mougiakakou
Abstract:
Solving image classification tasks given small training datasets remains an open challenge for modern computer vision. Aggressive data augmentation and generative models are among the most straightforward approaches to overcoming the lack of data. However, the first fails to be agnostic to varying image domains, while the latter requires additional compute and careful design. In this work, we stud…
▽ More
Solving image classification tasks given small training datasets remains an open challenge for modern computer vision. Aggressive data augmentation and generative models are among the most straightforward approaches to overcoming the lack of data. However, the first fails to be agnostic to varying image domains, while the latter requires additional compute and careful design. In this work, we study alternative regularization strategies to push the limits of supervised learning on small image classification datasets. In particular, along with the model size and training schedule scaling, we employ a heuristic to select (semi) optimal learning rate and weight decay couples via the norm of model parameters. By training on only 1% of the original CIFAR-10 training set (i.e., 50 images per class) and testing on ciFAIR-10, a variant of the original CIFAR without duplicated images, we reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
Food Recognition and Nutritional Apps
Authors:
Lubnaa Abdur Rahman,
Ioannis Papathanail,
Lorenzo Brigato,
Elias K. Spanakis,
Stavroula Mougiakakou
Abstract:
Food recognition and nutritional apps are trending technologies that may revolutionise the way people with diabetes manage their diet. Such apps can monitor food intake as a digital diary and even employ artificial intelligence to assess the diet automatically. Although these apps offer a promising solution for managing diabetes, they are rarely used by patients. This chapter aims to provide an in…
▽ More
Food recognition and nutritional apps are trending technologies that may revolutionise the way people with diabetes manage their diet. Such apps can monitor food intake as a digital diary and even employ artificial intelligence to assess the diet automatically. Although these apps offer a promising solution for managing diabetes, they are rarely used by patients. This chapter aims to provide an in-depth assessment of the current status of apps for food recognition and nutrition, to identify factors that may inhibit or facilitate their use, while it is accompanied by an outline of relevant research and development.
△ Less
Submitted 20 June, 2023;
originally announced July 2023.
-
Image Classification with Small Datasets: Overview and Benchmark
Authors:
L. Brigato,
B. Barz,
L. Iocchi,
J. Denzler
Abstract:
Image classification with small datasets has been an active research area in the recent past. However, as research in this scope is still in its infancy, two key ingredients are missing for ensuring reliable and truthful progress: a systematic and extensive overview of the state of the art, and a common benchmark to allow for objective comparisons between published methods. This article addresses…
▽ More
Image classification with small datasets has been an active research area in the recent past. However, as research in this scope is still in its infancy, two key ingredients are missing for ensuring reliable and truthful progress: a systematic and extensive overview of the state of the art, and a common benchmark to allow for objective comparisons between published methods. This article addresses both issues. First, we systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered. Second, we propose a common benchmark that allows for an objective comparison of approaches. It consists of five datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues. Surprisingly, we find that thorough hyper-parameter tuning on held-out validation data results in a highly competitive baseline and highlights a stunted growth of performance over the years. Indeed, only a single specialized method dating back to 2019 clearly wins our benchmark and outperforms the baseline classifier.
△ Less
Submitted 23 December, 2022;
originally announced December 2022.
-
On the Effectiveness of Neural Ensembles for Image Classification with Small Datasets
Authors:
Lorenzo Brigato,
Luca Iocchi
Abstract:
Deep neural networks represent the gold standard for image classification. However, they usually need large amounts of data to reach superior performance. In this work, we focus on image classification problems with a few labeled examples per class and improve data efficiency by using an ensemble of relatively small networks. For the first time, our work broadly studies the existing concept of neu…
▽ More
Deep neural networks represent the gold standard for image classification. However, they usually need large amounts of data to reach superior performance. In this work, we focus on image classification problems with a few labeled examples per class and improve data efficiency by using an ensemble of relatively small networks. For the first time, our work broadly studies the existing concept of neural ensembling in domains with small data, through extensive validation using popular datasets and architectures. We compare ensembles of networks to their deeper or wider single competitors given a total fixed computational budget. We show that ensembling relatively shallow networks is a simple yet effective technique that is generally better than current state-of-the-art approaches for learning from small datasets. Finally, we present our interpretation according to which neural ensembles are more sample efficient because they learn simpler functions.
△ Less
Submitted 29 November, 2021;
originally announced November 2021.
-
A Strong Baseline for the VIPriors Data-Efficient Image Classification Challenge
Authors:
Björn Barz,
Lorenzo Brigato,
Luca Iocchi,
Joachim Denzler
Abstract:
Learning from limited amounts of data is the hallmark of intelligence, requiring strong generalization and abstraction skills. In a machine learning context, data-efficient methods are of high practical importance since data collection and annotation are prohibitively expensive in many domains. Thus, coordinated efforts to foster progress in this area emerged recently, e.g., in the form of dedicat…
▽ More
Learning from limited amounts of data is the hallmark of intelligence, requiring strong generalization and abstraction skills. In a machine learning context, data-efficient methods are of high practical importance since data collection and annotation are prohibitively expensive in many domains. Thus, coordinated efforts to foster progress in this area emerged recently, e.g., in the form of dedicated workshops and competitions. Besides a common benchmark, measuring progress requires strong baselines. We present such a strong baseline for data-efficient image classification on the VIPriors challenge dataset, which is a sub-sampled version of ImageNet-1k with 100 images per class. We do not use any methods tailored to data-efficient classification but only standard models and techniques as well as common competition tricks and thorough hyper-parameter tuning. Our baseline achieves 69.7% accuracy on the VIPriors image classification dataset and outperforms 50% of submissions to the VIPriors 2021 challenge.
△ Less
Submitted 28 September, 2021;
originally announced September 2021.
-
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification
Authors:
Lorenzo Brigato,
Björn Barz,
Luca Iocchi,
Joachim Denzler
Abstract:
Data-efficient image classification using deep neural networks in settings, where only small amounts of labeled data are available, has been an active research area in the recent past. However, an objective comparison between published methods is difficult, since existing works use different datasets for evaluation and often compare against untuned baselines with default hyper-parameters. We desig…
▽ More
Data-efficient image classification using deep neural networks in settings, where only small amounts of labeled data are available, has been an active research area in the recent past. However, an objective comparison between published methods is difficult, since existing works use different datasets for evaluation and often compare against untuned baselines with default hyper-parameters. We design a benchmark for data-efficient image classification consisting of six diverse datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). Using this benchmark, we re-evaluate the standard cross-entropy baseline and eight methods for data-efficient deep learning published between 2017 and 2021 at renowned venues. For a fair and realistic comparison, we carefully tune the hyper-parameters of all methods on each dataset. Surprisingly, we find that tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline, which outperforms all but one specialized method and performs competitively to the remaining one.
△ Less
Submitted 30 August, 2021;
originally announced August 2021.
-
A Close Look at Deep Learning with Small Data
Authors:
L. Brigato,
L. Iocchi
Abstract:
In this work, we perform a wide variety of experiments with different deep learning architectures on datasets of limited size. According to our study, we show that model complexity is a critical factor when only a few samples per class are available. Differently from the literature, we show that in some configurations, the state of the art can be improved using low complexity models. For instance,…
▽ More
In this work, we perform a wide variety of experiments with different deep learning architectures on datasets of limited size. According to our study, we show that model complexity is a critical factor when only a few samples per class are available. Differently from the literature, we show that in some configurations, the state of the art can be improved using low complexity models. For instance, in problems with scarce training samples and without data augmentation, low-complexity convolutional neural networks perform comparably well or better than state-of-the-art architectures. Moreover, we show that even standard data augmentation can boost recognition performance by large margins. This result suggests the development of more complex data generation/augmentation pipelines for cases when data is limited. Finally, we show that dropout, a widely used regularization technique, maintains its role as a good regularizer even when data is scarce. Our findings are empirically validated on the sub-sampled versions of popular CIFAR-10, Fashion-MNIST and, SVHN benchmarks.
△ Less
Submitted 25 October, 2020; v1 submitted 28 March, 2020;
originally announced March 2020.