-
Are Vision-Language Models Ready for Dietary Assessment? Exploring the Next Frontier in AI-Powered Food Image Recognition
Authors:
Sergio Romero-Tapiador,
Ruben Tolosana,
Blanca Lacruz-Pleguezuelos,
Laura Judith Marcos Zambrano,
Guadalupe X. Bazán,
Isabel Espinosa-Salinas,
Julian Fierrez,
Javier Ortega-Garcia,
Enrique Carrillo de Santa Pau,
Aythami Morales
Abstract:
Automatic dietary assessment based on food images remains a challenge, requiring precise food detection, segmentation, and classification. Vision-Language Models (VLMs) offer new possibilities by integrating visual and textual reasoning. In this study, we evaluate six state-of-the-art VLMs (ChatGPT, Gemini, Claude, Moondream, DeepSeek, and LLaVA), analyzing their capabilities in food recognition at different levels. For the experimental framework, we introduce FoodNExTDB, a unique food image database that contains 9,263 expert-labeled images across 10 categories (e.g., "protein source"), 62 subcategories (e.g., "poultry"), and 9 cooking styles (e.g., "grilled"). In total, FoodNExTDB includes 50k nutritional labels generated by seven experts who manually annotated all images in the database. We also propose a novel evaluation metric, Expert-Weighted Recall (EWR), that accounts for inter-annotator variability. Results show that closed-source models outperform open-source ones, achieving over 90% EWR in recognizing food products in images containing a single product. Despite their potential, current VLMs face challenges in fine-grained food recognition, particularly in distinguishing subtle differences in cooking styles and visually similar food items, which limits their reliability for automatic dietary assessment. The FoodNExTDB database is publicly available at https://github.com/AI4Food/FoodNExtDB.
Submitted 9 April, 2025;
originally announced April 2025.
-
Improving the portability of predicting students performance models by using ontologies
Authors:
Javier Lopez Zambrano,
Juan A. Lara,
Cristobal Romero
Abstract:
One of the main current challenges in Educational Data Mining and Learning Analytics is the portability or transferability of predictive models obtained for a particular course so that they can be applied to other courses. One of the foremost obstacles is the models' excessive dependence on the low-level attributes used to train them, which reduces their portability. To solve this issue, the use of high-level attributes with more semantic meaning, such as ontologies, may be very useful. Along this line, we propose the use of an ontology built on a taxonomy of actions that summarises students' interactions with the Moodle learning management system. We compare the results of this approach against our previous results, which used low-level raw attributes obtained directly from Moodle logs. The results indicate that the proposed ontology improves the portability of the models in terms of predictive accuracy. The main contribution of this paper is to show that the ontological models obtained in one source course can be applied to other target courses with similar usage levels without losing prediction accuracy.
Submitted 9 October, 2024;
originally announced October 2024.
-
Characterization of Neural Networks Automatically Mapped on Automotive-grade Microcontrollers
Authors:
Giulia Crocioni,
Giambattista Gruosso,
Danilo Pau,
Davide Denaro,
Luigi Zambrano,
Giuseppe di Giore
Abstract:
Nowadays, Neural Networks represent a major expectation for the realization of powerful Deep Learning algorithms, which can determine several physical systems' behaviors and operations. The computational resources required for modeling, training, and running them are large, especially in relation to the amount of data that Neural Networks typically need to generalize. The latest TinyML technologies allow pre-trained models to be integrated into embedded systems, making computing at the edge faster, cheaper, and safer. Although these technologies originated in the consumer and industrial worlds, many sectors can greatly benefit from them, such as the automotive industry. In this paper, we present a framework for implementing Neural Network-based models on a family of automotive Microcontrollers, showing their efficiency in two case studies applied to vehicles: intrusion detection on the Controller Area Network bus and residual capacity estimation in Lithium-Ion batteries, widely used in Electric Vehicles.
Submitted 27 February, 2021;
originally announced March 2021.