Showing 1–50 of 51 results for author: Bagdanov, A

  1. arXiv:2506.19530  [pdf, ps, other]

    cs.AI

    NTRL: Encounter Generation via Reinforcement Learning for Dynamic Difficulty Adjustment in Dungeons and Dragons

    Authors: Carlo Romeo, Andrew D. Bagdanov

    Abstract: Balancing combat encounters in Dungeons & Dragons (D&D) is a complex task that requires Dungeon Masters (DM) to manually assess party strength, enemy composition, and dynamic player interactions while avoiding interruption of the narrative flow. In this paper, we propose Encounter Generation via Reinforcement Learning (NTRL), a novel approach that automates Dynamic Difficulty Adjustment (DDA) in D…

    Submitted 24 June, 2025; originally announced June 2025.

  2. arXiv:2503.10439  [pdf, other]

    cs.CV

    EFC++: Elastic Feature Consolidation with Prototype Re-balancing for Cold Start Exemplar-free Incremental Learning

    Authors: Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: Exemplar-Free Class Incremental Learning (EFCIL) aims to learn from a sequence of tasks without having access to previous task data. In this paper, we consider the challenging Cold Start scenario in which insufficient data is available in the first task to learn a high-quality backbone. This is especially challenging for EFCIL since it requires high plasticity, resulting in feature drift which is…

    Submitted 15 March, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

    Comments: Under Review since July 2024. Extension of our previous conference paper https://openreview.net/forum?id=7D9X2cFnt1

  3. arXiv:2502.04959  [pdf, ps, other]

    cs.LG

    No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

    Authors: Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper, we investigate the key characteristics of task matrices -- weight update matrices applied to a pre-trained model -- that enable effective merging. We show that…

    Submitted 11 June, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

    Comments: Accepted at ICML 2025

  4. arXiv:2502.04263  [pdf, other]

    cs.CV cs.AI cs.LG

    Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion

    Authors: Marco Mistretta, Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Andrew D. Bagdanov

    Abstract: Pre-trained multi-modal Vision-Language Models like CLIP are widely used off-the-shelf for a variety of applications. In this paper, we show that the common practice of individually exploiting the text or image encoders of these powerful multi-modal models is highly suboptimal for intra-modal tasks like image-to-image retrieval. We argue that this is inherently due to the CLIP-style inter-modal co…

    Submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted for publication at ICLR 2025

  5. arXiv:2501.08669  [pdf, other]

    cs.LG cs.AI

    SPEQ: Offline Stabilization Phases for Efficient Q-Learning in High Update-To-Data Ratio Reinforcement Learning

    Authors: Carlo Romeo, Girolamo Macaluso, Alessandro Sestini, Andrew D. Bagdanov

    Abstract: High update-to-data (UTD) ratio algorithms in reinforcement learning (RL) improve sample efficiency but incur high computational costs, limiting real-world scalability. We propose Offline Stabilization Phases for Efficient Q-Learning (SPEQ), an RL algorithm that combines low-UTD online training with periodic offline stabilization phases. During these phases, Q-functions are fine-tuned with high UT…

    Submitted 18 March, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

  6. arXiv:2412.14326  [pdf, other]

    cs.LG cs.CV

    Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models

    Authors: Dipam Goswami, Simone Magistri, Kai Wang, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: Using pre-trained models has been found to reduce the effect of data heterogeneity and speed up federated learning algorithms. Recent works have investigated the use of first-order statistics and second-order statistics to aggregate local client data distributions at the server and achieve very high performance without any training. In this work we propose a training-free method based on an unbias…

    Submitted 4 February, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

  7. arXiv:2410.17827  [pdf, other]

    cs.AI

    RE-tune: Incremental Fine Tuning of Biomedical Vision-Language Models for Multi-label Chest X-ray Classification

    Authors: Marco Mistretta, Andrew D. Bagdanov

    Abstract: In this paper we introduce RE-tune, a novel approach for fine-tuning pre-trained Multimodal Biomedical Vision-Language models (VLMs) in Incremental Learning scenarios for multi-label chest disease diagnosis. RE-tune freezes the backbones and only trains simple adaptors on top of the Image and Text encoders of the VLM. By engineering positive and negative text prompts for diseases, we leverage the…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: Accepted for publication at Medical Imaging meets NeurIPS (NeurIPS23)

  8. arXiv:2409.18664  [pdf, other]

    cs.LG

    How green is continual learning, really? Analyzing the energy consumption in continual training of vision foundation models

    Authors: Tomaso Trinci, Simone Magistri, Roberto Verdecchia, Andrew D. Bagdanov

    Abstract: With the ever-growing adoption of AI, its impact on the environment is no longer negligible. Despite the potential that continual learning could have towards Green AI, its environmental sustainability remains relatively uncharted. In this work we aim to gain a systematic understanding of the energy efficiency of continual learning algorithms. To that end, we conducted an extensive set of empirical…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: This manuscript has been accepted at the Green FOundation MOdels (GreenFOMO) ECCV 2024 Workshop

  9. arXiv:2409.17851  [pdf, other]

    cs.CV

    ViewpointDepth: A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts

    Authors: Aurel Pjetri, Stefano Caprasecca, Leonardo Taccari, Matteo Simoncini, Henrique Piñeiro Monteagudo, Wallace Walter, Douglas Coimbra de Andrade, Francesco Sambo, Andrew David Bagdanov

    Abstract: Monocular depth estimation is a critical task for autonomous driving and many other computer vision applications. While significant progress has been made in this field, the effects of viewpoint shifts on depth estimation models remain largely underexplored. This paper introduces a novel dataset and evaluation methodology to quantify the impact of different camera positions and orientations on mon…

    Submitted 3 February, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

  10. arXiv:2407.10839  [pdf, other]

    cs.LG cs.AI

    Offline Reinforcement Learning with Imputed Rewards

    Authors: Carlo Romeo, Andrew D. Bagdanov

    Abstract: Offline Reinforcement Learning (ORL) offers a robust solution to training agents in applications where interactions with the environment must be strictly limited due to cost, safety, or lack of accurate simulation environments. Despite its potential to facilitate deployment of artificial agents in the real world, Offline Reinforcement Learning typically requires very many demonstrations annotated…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: RLBRew Workshop @ RLC 2024

  11. arXiv:2407.09415  [pdf, other]

    cs.AI cs.LG

    A Benchmark Environment for Offline Reinforcement Learning in Racing Games

    Authors: Girolamo Macaluso, Alessandro Sestini, Andrew D. Bagdanov

    Abstract: Offline Reinforcement Learning (ORL) is a promising approach to reduce the high sample complexity of traditional Reinforcement Learning (RL) by eliminating the need for continuous environmental interactions. ORL exploits a dataset of pre-collected transitions and thus expands the range of application of RL to tasks in which the excessive environment queries increase training time and decrease effi…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted at IEEE Conference on Games

  12. arXiv:2407.08536  [pdf, other]

    cs.CV

    Exemplar-free Continual Representation Learning via Learnable Drift Compensation

    Authors: Alex Gomez-Villa, Dipam Goswami, Kai Wang, Andrew D. Bagdanov, Bartlomiej Twardowski, Joost van de Weijer

    Abstract: Exemplar-free class-incremental learning using a backbone trained from scratch and starting from a small first task presents a significant challenge for continual representation learning. Prototype-based approaches, when continually updated, face the critical issue of semantic drift due to which the old class prototypes drift to different positions in the new feature space. Through an analysis of…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  13. arXiv:2407.03056  [pdf, other]

    cs.CV cs.AI cs.LG

    Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation

    Authors: Marco Mistretta, Alberto Baldrati, Marco Bertini, Andrew D. Bagdanov

    Abstract: Vision-Language Models (VLMs) demonstrate remarkable zero-shot generalization to unseen tasks, but fall short of the performance of supervised methods in generalizing to downstream tasks with limited data. Prompt learning is emerging as a parameter-efficient method for adapting VLMs, but state-of-the-art approaches require annotated samples. In this paper we propose a novel approach to prompt lear…

    Submitted 30 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted for publication at ECCV24

  14. arXiv:2406.02380  [pdf, other]

    cs.CV

    EUFCC-340K: A Faceted Hierarchical Dataset for Metadata Annotation in GLAM Collections

    Authors: Francesc Net, Marc Folia, Pep Casals, Andrew D. Bagdanov, Lluis Gomez

    Abstract: In this paper, we address the challenges of automatic metadata annotation in the domain of Galleries, Libraries, Archives, and Museums (GLAMs) by introducing a novel dataset, EUFCC340K, collected from the Europeana portal. Comprising over 340,000 images, the EUFCC340K dataset is organized across multiple facets: Materials, Object Types, Disciplines, and Subjects, following a hierarchical structure…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 23 pages, 13 figures

    ACM Class: I.4.9

  15. arXiv:2405.18069  [pdf, other]

    cs.LG

    An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates

    Authors: Albin Soutif-Cormerais, Simone Magistri, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: Broad, open source availability of large pretrained foundation models on the internet through platforms such as HuggingFace has taken the world of practical deep learning by storm. A classical pipeline for neural network training now typically consists of finetuning these pretrained networks on a small target dataset instead of training from scratch. In the case of large models this can be done eve…

    Submitted 19 May, 2025; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: CoLLAs 2024 accepted paper, PMLR 274:996-1012

  16. arXiv:2402.03917  [pdf, other]

    cs.CV cs.LG

    Elastic Feature Consolidation for Cold Start Exemplar-Free Incremental Learning

    Authors: Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: Exemplar-Free Class Incremental Learning (EFCIL) aims to learn from a sequence of tasks without having access to previous task data. In this paper, we consider the challenging Cold Start scenario in which insufficient data is available in the first task to learn a high-quality backbone. This is especially challenging for EFCIL since it requires high plasticity, which results in feature drift which…

    Submitted 30 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted at Twelfth International Conference on Learning Representations (ICLR 2024)

  17. arXiv:2312.09844  [pdf, other]

    cs.LG cs.AI

    Small Dataset, Big Gains: Enhancing Reinforcement Learning by Offline Pre-Training with Model Based Augmentation

    Authors: Girolamo Macaluso, Alessandro Sestini, Andrew D. Bagdanov

    Abstract: Offline reinforcement learning leverages pre-collected datasets of transitions to train policies. It can serve as effective initialization for online algorithms, enhancing sample efficiency and speeding up convergence. However, when such datasets are limited in size and quality, offline pre-training can produce sub-optimal policies and lead to degraded online reinforcement learning performance. In…

    Submitted 19 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

  18. arXiv:2310.20348  [pdf, other]

    cs.CV cs.LG

    Class Incremental Learning with Pre-trained Vision-Language Models

    Authors: Xialei Liu, Xusheng Cao, Haori Lu, Jia-wen Xiao, Andrew D. Bagdanov, Ming-Ming Cheng

    Abstract: With the advent of large-scale pre-trained models, interest in adapting and exploiting them for continual learning scenarios has grown. In this paper, we propose an approach to exploiting pre-trained vision-language models (e.g. CLIP) that enables further adaptation instead of only using zero-shot learning of new tasks. We augment a pre-trained CLIP model with additional layers after the Image E…

    Submitted 31 October, 2023; originally announced October 2023.

  19. arXiv:2308.12510  [pdf, other]

    cs.CV cs.AI cs.LG

    Masked Autoencoders are Efficient Class Incremental Learners

    Authors: Jiang-Tian Zhai, Xialei Liu, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

    Abstract: Class Incremental Learning (CIL) aims to sequentially learn new classes while avoiding catastrophic forgetting of previous knowledge. We propose to use Masked Autoencoders (MAEs) as efficient learners for CIL. MAEs were originally designed to learn useful representations through reconstructive unsupervised learning, and they can be easily integrated with a supervised loss for classification. Moreo…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023

  20. arXiv:2212.08251  [pdf, other]

    cs.CV

    Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning

    Authors: Xialei Liu, Jiang-Tian Zhai, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

    Abstract: Exemplar-free Class Incremental Learning (EFCIL) aims to sequentially learn tasks with access only to data from the current one. EFCIL is of interest because it mitigates concerns about privacy and long-term storage of data, while at the same time alleviating the problem of catastrophic forgetting in incremental learning. In this work, we introduce task-adaptive saliency for EFCIL and propose a ne…

    Submitted 27 March, 2024; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: Accepted at CVPR 2024

  21. arXiv:2211.12292  [pdf, other]

    cs.CV

    Exemplar-free Continual Learning of Vision Transformers via Gated Class-Attention and Cascaded Feature Drift Compensation

    Authors: Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay which can help recalibrate previous task classifiers to the feature drift which occurs when learning new tasks. Ex…

    Submitted 27 July, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

  22. arXiv:2210.01600  [pdf, other]

    cs.CV

    Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification

    Authors: Kai Wang, Chenshen Wu, Andy Bagdanov, Xialei Liu, Shiqi Yang, Shangling Jui, Joost van de Weijer

    Abstract: Lifelong object re-identification incrementally learns from a stream of re-identification tasks. The objective is to learn a representation that can be applied to all tasks and that generalizes to previously unseen re-identification tasks. The main challenge is that at inference time the representation must generalize to previously unseen identities. To address this problem, we apply continual met…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: BMVC 2022

  23. arXiv:2210.00266  [pdf, other]

    cs.CV

    Long-Tailed Class Incremental Learning

    Authors: Xialei Liu, Yu-Song Hu, Xu-Sheng Cao, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

    Abstract: In class incremental learning (CIL) a model must learn new classes in a sequential manner without forgetting old ones. However, conventional CIL methods consider a balanced distribution for each new task, which ignores the prevalence of long-tailed distributions in the real world. In this work we propose two long-tailed CIL scenarios, which we term ordered and shuffled LT-CIL. Ordered LT-CIL consi…

    Submitted 1 October, 2022; originally announced October 2022.

    Comments: Accepted at ECCV 2022

  24. arXiv:2208.07811  [pdf, other]

    cs.SE cs.AI cs.LG

    Towards Informed Design and Validation Assistance in Computer Games Using Imitation Learning

    Authors: Alessandro Sestini, Joakim Bergdahl, Konrad Tollmar, Andrew D. Bagdanov, Linus Gisslén

    Abstract: In games, as in many other domains, design validation and testing is a huge challenge as systems are growing in size and manual testing is becoming infeasible. This paper proposes a new approach to automated game validation and testing. Our method leverages a data-driven imitation learning technique, which requires little effort and time and no knowledge of machine learning or programming, tha…

    Submitted 19 August, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: 10 pages, 8 figures, and 3 tables

  25. arXiv:2202.10057  [pdf, other]

    cs.LG cs.AI

    CCPT: Automatic Gameplay Testing and Validation with Curiosity-Conditioned Proximal Trajectories

    Authors: Alessandro Sestini, Linus Gisslén, Joakim Bergdahl, Konrad Tollmar, Andrew D. Bagdanov

    Abstract: This paper proposes a novel deep reinforcement learning algorithm to perform automatic analysis and detection of gameplay issues in complex 3D navigation environments. The Curiosity-Conditioned Proximal Trajectories (CCPT) method combines curiosity and imitation learning to train agents to methodically explore in the proximity of known trajectories derived from expert demonstrations. We show how C…

    Submitted 21 February, 2022; originally announced February 2022.

  26. arXiv:2202.07993  [pdf, other]

    cs.CV cs.LG

    Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training

    Authors: Simone Zini, Alex Gomez-Villa, Marco Buzzelli, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: Several recent works on self-supervised learning are trained by mapping different augmentations of the same image to the same feature representation. The data augmentations used are of crucial importance to the quality of learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in le…

    Submitted 2 February, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted at Eleventh International Conference on Learning Representations (ICLR 2023)

  27. arXiv:2112.15022  [pdf, other]

    cs.CV

    Continually Learning Self-Supervised Representations with Projected Functional Regularization

    Authors: Alex Gomez-Villa, Bartlomiej Twardowski, Lu Yu, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally -- they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay mec…

    Submitted 2 May, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

    Comments: Accepted at Workshop on Continual Learning in Computer Vision (CVPR 2022)

  28. arXiv:2111.04993  [pdf, other]

    cs.CV

    Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition

    Authors: Kai Wang, Xialei Liu, Andy Bagdanov, Luis Herranz, Shangling Jui, Joost van de Weijer

    Abstract: Most meta-learning approaches assume the existence of a very large set of labeled data available for episodic meta-learning of base knowledge. This contrasts with the more realistic continual learning paradigm in which data arrives incrementally in the form of tasks containing disjoint classes. In this paper we consider this problem of Incremental Meta-Learning (IML) in which classes are presented…

    Submitted 11 November, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

  29. arXiv:2104.10610  [pdf, ps, other]

    cs.LG

    Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents

    Authors: Alessandro Sestini, Alexander Kuhnle, Andrew D. Bagdanov

    Abstract: In this article we study the problem of training intelligent agents using Reinforcement Learning for the purpose of game development. Unlike systems built to replace human players and to achieve super-human performance, our agents aim to produce meaningful interactions with the player, and at the same time demonstrate behavioral traits as desired by game designers. We show how to combine distinct…

    Submitted 21 April, 2021; originally announced April 2021.

  30. arXiv:2102.02005  [pdf, other]

    cs.CV

    Robust pedestrian detection in thermal imagery using synthesized images

    Authors: My Kieu, Lorenzo Berlincioni, Leonardo Galteri, Marco Bertini, Andrew D. Bagdanov, Alberto Del Bimbo

    Abstract: In this paper we propose a method for improving pedestrian detection in the thermal domain using two stages: first, a generative data augmentation approach is used, then a domain adaptation method using generated data adapts an RGB pedestrian detector. Our model, based on the Least-Squares Generative Adversarial Network, is trained to synthesize realistic thermal versions of input RGB images which…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: Accepted at ICPR2020

  31. arXiv:2012.03532  [pdf, other]

    cs.LG cs.AI

    Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games

    Authors: Alessandro Sestini, Alexander Kuhnle, Andrew D. Bagdanov

    Abstract: Recent advances in Deep Reinforcement Learning (DRL) have largely focused on improving the performance of agents with the aim of replacing humans in known and well-defined environments. The use of these techniques as a game design tool for video game production, where the aim is instead to create Non-Player Character (NPC) behaviors, has received relatively little attention until recently. Turn-ba…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: Presented at the AAAI-21 Workshop on Reinforcement Learning in Games

  32. arXiv:2012.02527  [pdf, other]

    cs.LG cs.AI

    Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments

    Authors: Alessandro Sestini, Alexander Kuhnle, Andrew D. Bagdanov

    Abstract: Deep Reinforcement Learning achieves very good results in domains where reward functions can be manually engineered. At the same time, there is growing interest within the community in using games based on Procedural Content Generation (PCG) as benchmark environments since this type of environment is perfect for studying overfitting and generalization of agents under domain shift. Inverse Reinfo…

    Submitted 4 December, 2020; originally announced December 2020.

    Comments: Presented at the AAAI-21 Workshop on Reinforcement Learning in Games

  33. arXiv:2012.01914  [pdf, other]

    cs.LG cs.AI

    DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games

    Authors: Alessandro Sestini, Alexander Kuhnle, Andrew D. Bagdanov

    Abstract: In this paper we introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL). Our aim is to understand whether recent advances in DRL can be used to develop convincing behavioral models for non-player characters in videogames. We begin with an analysis of requirements that such a…

    Submitted 3 December, 2020; originally announced December 2020.

    Comments: Presented at AIIDE-19 Workshop on Experimental Artificial Intelligence in Games

  34. arXiv:2010.15277  [pdf, other]

    cs.LG cs.CV

    Class-incremental learning: survey and performance evaluation on image classification

    Authors: Marc Masana, Xialei Liu, Bartlomiej Twardowski, Mikel Menta, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: For future learning systems, incremental learning is desirable because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning. The main cha…

    Submitted 11 October, 2022; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: Paper accepted for publication at TPAMI 2022. Code publicly available at https://github.com/mmasana/FACIL

  35. arXiv:2007.06271  [pdf, other]

    cs.CV cs.LG

    RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning

    Authors: Riccardo Del Chiaro, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

    Abstract: Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propo…

    Submitted 29 October, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: 9 pages, 4 figures, 8 supplementary pages, 12 supplementary images, to be published in NeurIPS 2020

  36. arXiv:2004.09199  [pdf, other]

    cs.CV cs.LG

    Generative Feature Replay For Class-Incremental Learning

    Authors: Xialei Liu, Chenshen Wu, Mikel Menta, Luis Herranz, Bogdan Raducanu, Andrew D. Bagdanov, Shangling Jui, Joost van de Weijer

    Abstract: Humans are capable of learning new tasks without forgetting previous ones, while neural networks fail due to catastrophic forgetting between new and previously-learned tasks. We consider a class-incremental setting which means that the task-ID is unknown at inference time. The imbalance between old and new classes typically results in a bias of the network towards the newest ones. This imbalance p…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted at CVPR2020: Workshop on Continual Learning in Computer Vision

  37. Visual Question Answering for Cultural Heritage

    Authors: Pietro Bongini, Federico Becattini, Andrew D. Bagdanov, Alberto Del Bimbo

    Abstract: Technology and the fruition of cultural heritage are becoming increasingly more entwined, especially with the advent of smart audio guides, virtual and augmented reality, and interactive installations. Machine learning and computer vision are important components of this ongoing integration, enabling new interaction modalities between user and museum. Nonetheless, the most frequent way of interact…

    Submitted 22 March, 2020; originally announced March 2020.

    Comments: accepted at FlorenceHeritech 2020

  38. Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank

    Authors: Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task f…

    Submitted 17 February, 2019; originally announced February 2019.

    Comments: Accepted at TPAMI. (Keywords: Learning from rankings, image quality assessment, crowd counting, active learning). arXiv admin note: text overlap with arXiv:1803.03095

  39. arXiv:1809.00854  [pdf, other]

    cs.CV

    Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images

    Authors: Dena Bazazian, Dimosthenis Karatzas, Andrew D. Bagdanov

    Abstract: Word spotting in natural scene images has many applications in scene understanding and visual assistance. In this paper we propose a technique to create and exploit an intermediate representation of images based on text attributes which are character probability maps. Our representation extends the concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks t…

    Submitted 11 October, 2019; v1 submitted 4 September, 2018; originally announced September 2018.

    Comments: 9 pages, 10 figures, The Third International Workshop on Egocentric Perception, Interaction and Computing (EPIC) at ECCV2018

  40. arXiv:1803.03095  [pdf, other]

    cs.CV

    Leveraging Unlabeled Data for Crowd Counting by Learning to Rank

    Authors: Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of limited size of existing datasets fo…

    Submitted 8 March, 2018; originally announced March 2018.

    Comments: Accepted by CVPR18

  41. arXiv:1802.02950  [pdf, other]

    cs.CV

    Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting

    Authors: Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio M. Lopez, Andrew D. Bagdanov

    Abstract: In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolida…

    Submitted 12 December, 2018; v1 submitted 8 February, 2018; originally announced February 2018.

    Comments: Accepted at ICPR'18. First two authors contributed equally

  42. arXiv:1709.01041  [pdf, other]

    cs.CV

    Domain-adaptive deep network compression

    Authors: Marc Masana, Joost van de Weijer, Luis Herranz, Andrew D. Bagdanov, Jose M Alvarez

    Abstract: Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work…

    Submitted 6 September, 2017; v1 submitted 4 September, 2017; originally announced September 2017.

    Comments: Accepted at ICCV 2017

  43. Review on Computer Vision Techniques in Emergency Situation

    Authors: Laura Lopez-Fuentes, Joost van de Weijer, Manuel Gonzalez-Hidalgo, Harald Skinnemoen, Andrew D. Bagdanov

    Abstract: In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, U…

    Submitted 9 March, 2018; v1 submitted 24 August, 2017; originally announced August 2017.

    Comments: 25 pages

    Journal ref: Multimedia Tools and Applications, 2017, p. 1-39

  44. arXiv:1707.08347  [pdf, other]

    cs.CV

    RankIQA: Learning from Rankings for No-reference Image Quality Assessment

    Authors: Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labelin…

    Submitted 26 July, 2017; originally announced July 2017.

    Comments: Accepted by ICCV 2017

  45. arXiv:1706.01487  [pdf, other]

    cs.CV

    Visual attention models for scene text recognition

    Authors: Suman K. Ghosh, Ernest Valveny, Andrew D. Bagdanov

    Abstract: In this paper we propose an approach to lexicon-free recognition of text in scene images. Our approach relies on an LSTM-based soft visual attention model learned from convolutional features. A set of feature vectors are derived from an intermediate convolutional layer corresponding to different areas of the image. This permits encoding of spatial information into the image representation. In this…

    Submitted 5 June, 2017; originally announced June 2017.

  46. arXiv:1702.05089  [pdf, other]

    cs.CV

    Improving Text Proposals for Scene Images with Fully Convolutional Networks

    Authors: Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, Andrew D. Bagdanov

    Abstract: Text Proposals have emerged as a class-dependent version of object proposals - efficient approaches to reduce the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield top state of the art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm of Gom…

    Submitted 16 February, 2017; originally announced February 2017.

    Comments: 6 pages, 8 figures, International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop

  47. Bandwidth limited object recognition in high resolution imagery

    Authors: Laura Lopez-Fuentes, Andrew D. Bagdanov, Joost van de Weijer, Harald Skinnemoen

    Abstract: This paper proposes a novel method to optimize bandwidth usage for object detection in critical communication scenarios. We develop two operating models of active information seeking. The first model identifies promising regions in low resolution imagery and progressively requests higher resolution regions on which to perform recognition of higher semantic quality. The second model identifies prom…

    Submitted 16 January, 2017; originally announced January 2017.

    Comments: 9 pages, 9 figures, accepted in WACV

    Journal ref: Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017. p. 1197-1205

  48. Scale Coding Bag of Deep Features for Human Attribute and Action Recognition

    Authors: Fahad Shahbaz Khan, Joost van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

    Abstract: Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional…

    Submitted 26 March, 2018; v1 submitted 14 December, 2016; originally announced December 2016.

    Comments: To appear in Machine Vision and Applications

  49. arXiv:1605.03477  [pdf, ps, other]

    cs.CV

    On-the-fly Network Pruning for Object Detection

    Authors: Marc Masana, Joost van de Weijer, Andrew D. Bagdanov

    Abstract: Object detection with deep neural networks is often performed by passing a few thousand candidate bounding boxes through a deep neural network for each image. These bounding boxes are highly correlated since they originate from the same image. In this paper we investigate how to exploit feature occurrence at the image scale to prune the neural network which is subsequently applied to all bounding…

    Submitted 11 May, 2016; originally announced May 2016.

    Comments: Accepted at ICLR 2016 workshop track as a poster presentation

  50. arXiv:1601.01885  [pdf, other]

    cs.CV

    Visual Script and Language Identification

    Authors: Anguelos Nicolaou, Andrew Bagdanov, Lluis Gomez-Bigorda, Dimosthenis Karatzas

    Abstract: In this paper we introduce a script identification method based on hand-crafted texture features and an artificial neural network. The proposed pipeline achieves near state-of-the-art performance for script identification of video-text and state-of-the-art performance on visual language identification of handwritten text. More than using the deep network as a classifier, the use of its intermediar…

    Submitted 8 January, 2016; originally announced January 2016.