-
ProMi: An Efficient Prototype-Mixture Baseline for Few-Shot Segmentation with Bounding-Box Annotations
Authors:
Florent Chiaroni,
Ali Ayub,
Ola Ahmad
Abstract:
In robotics applications, few-shot segmentation is crucial because it allows robots to perform complex tasks with minimal training data, facilitating their adaptation to diverse, real-world environments. However, pixel-level annotations of even small amount of images is highly time-consuming and costly. In this paper, we present a novel few-shot binary segmentation method based on bounding-box ann…
▽ More
In robotics applications, few-shot segmentation is crucial because it allows robots to perform complex tasks with minimal training data, facilitating their adaptation to diverse, real-world environments. However, pixel-level annotations of even small amount of images is highly time-consuming and costly. In this paper, we present a novel few-shot binary segmentation method based on bounding-box annotations instead of pixel-level labels. We introduce, ProMi, an efficient prototype-mixture-based method that treats the background class as a mixture of distributions. Our approach is simple, training-free, and effective, accommodating coarse annotations with ease. Compared to existing baselines, ProMi achieves the best results across different datasets with significant gains, demonstrating its effectiveness. Furthermore, we present qualitative experiments tailored to real-world mobile robot tasks, demonstrating the applicability of our approach in such scenarios. Our code: https://github.com/ThalesGroup/promi.
△ Less
Submitted 18 May, 2025;
originally announced May 2025.
-
Bag of Tricks for Fully Test-Time Adaptation
Authors:
Saypraseuth Mounsaveng,
Florent Chiaroni,
Malik Boudiaf,
Marco Pedersoli,
Ismail Ben Ayed
Abstract:
Fully Test-Time Adaptation (TTA), which aims at adapting models to data drifts, has recently attracted wide interest. Numerous tricks and techniques have been proposed to ensure robust learning on arbitrary streams of unlabeled data. However, assessing the true impact of each individual technique and obtaining a fair comparison still constitutes a significant challenge. To help consolidate the com…
▽ More
Fully Test-Time Adaptation (TTA), which aims at adapting models to data drifts, has recently attracted wide interest. Numerous tricks and techniques have been proposed to ensure robust learning on arbitrary streams of unlabeled data. However, assessing the true impact of each individual technique and obtaining a fair comparison still constitutes a significant challenge. To help consolidate the community's knowledge, we present a categorization of selected orthogonal TTA techniques, including small batch normalization, stream rebalancing, reliable sample selection, and network confidence calibration. We meticulously dissect the effect of each approach on different scenarios of interest. Through our analysis, we shed light on trade-offs induced by those techniques between accuracy, the computational power required, and model complexity. We also uncover the synergy that arises when combining techniques and are able to establish new state-of-the-art results.
△ Less
Submitted 9 November, 2023; v1 submitted 3 October, 2023;
originally announced October 2023.
-
MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning
Authors:
Julien Nicolas,
Florent Chiaroni,
Imtiaz Ziko,
Ola Ahmad,
Christian Desrosiers,
Jose Dolz
Abstract:
Despite the recent progress in incremental learning, addressing catastrophic forgetting under distributional drift is still an open and important problem. Indeed, while state-of-the-art domain incremental learning (DIL) methods perform satisfactorily within known domains, their performance largely degrades in the presence of novel domains. This limitation hampers their generalizability, and restri…
▽ More
Despite the recent progress in incremental learning, addressing catastrophic forgetting under distributional drift is still an open and important problem. Indeed, while state-of-the-art domain incremental learning (DIL) methods perform satisfactorily within known domains, their performance largely degrades in the presence of novel domains. This limitation hampers their generalizability, and restricts their scalability to more realistic settings where train and test data are drawn from different distributions. To address these limitations, we present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP), which generalizes the paradigm of S-Prompting to handle both in-distribution and out-of-distribution data at inference. In particular, at the training stage we model the features distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain. At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, selecting the correct prompt for the classification task, or from an unseen domain, leveraging a mixture of the prompt-tuned CLIP models. Our empirical evaluation reveals the poor performance of existing DIL methods under domain shift, and suggests that the proposed MoP-CLIP performs competitively in the standard DIL settings while outperforming state-of-the-art methods in OOD scenarios. These results demonstrate the superiority of MoP-CLIP, offering a robust and general solution to the problem of domain incremental learning.
△ Less
Submitted 11 July, 2023;
originally announced July 2023.
-
Parametric Information Maximization for Generalized Category Discovery
Authors:
Florent Chiaroni,
Jose Dolz,
Ziko Imtiaz Masud,
Amar Mitiche,
Ismail Ben Ayed
Abstract:
We introduce a Parametric Information Maximization (PIM) model for the Generalized Category Discovery (GCD) problem. Specifically, we propose a bi-level optimization formulation, which explores a parameterized family of objective functions, each evaluating a weighted mutual information between the features and the latent labels, subject to supervision constraints from the labeled samples. Our form…
▽ More
We introduce a Parametric Information Maximization (PIM) model for the Generalized Category Discovery (GCD) problem. Specifically, we propose a bi-level optimization formulation, which explores a parameterized family of objective functions, each evaluating a weighted mutual information between the features and the latent labels, subject to supervision constraints from the labeled samples. Our formulation mitigates the class-balance bias encoded in standard information maximization approaches, thereby handling effectively both short-tailed and long-tailed data sets. We report extensive experiments and comparisons demonstrating that our PIM model consistently sets new state-of-the-art performances in GCD across six different datasets, more so when dealing with challenging fine-grained problems.
△ Less
Submitted 14 July, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions
Authors:
Florent Chiaroni,
Malik Boudiaf,
Amar Mitiche,
Ismail Ben Ayed
Abstract:
We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, the existing methods focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general m…
▽ More
We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, the existing methods focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general maximum a posteriori (MAP) perspective of clustering distributions, emphasizing that the statistical models underlying the existing distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring data conformity within each cluster to the introduced sBeta density function, whose parameters are constrained and estimated jointly with binary assignment variables. Our versatile formulation approximates various parametric densities for modeling simplex data and enables the control of the cluster-balance bias. This yields highly competitive performances for the unsupervised adjustment of black-box model predictions in various scenarios. Our code and comparisons with the existing simplex-clustering approaches and our introduced softmax-prediction benchmarks are publicly available: https://github.com/fchiaroni/Clustering_Softmax_Predictions.
△ Less
Submitted 30 June, 2024; v1 submitted 30 July, 2022;
originally announced August 2022.
-
Self-supervised classification of dynamic obstacles using the temporal information provided by videos
Authors:
Sid Ali Hamideche,
Florent Chiaroni,
Mohamed-Cherif Rahal
Abstract:
Nowadays, autonomous driving systems can detect, segment, and classify the surrounding obstacles using a monocular camera. However, state-of-the-art methods solving these tasks generally perform a fully supervised learning process and require a large amount of training labeled data. On another note, some self-supervised learning approaches can deal with detection and segmentation of dynamic obstac…
▽ More
Nowadays, autonomous driving systems can detect, segment, and classify the surrounding obstacles using a monocular camera. However, state-of-the-art methods solving these tasks generally perform a fully supervised learning process and require a large amount of training labeled data. On another note, some self-supervised learning approaches can deal with detection and segmentation of dynamic obstacles using the temporal information available in video sequences. In this work, we propose to classify the detected obstacles depending on their motion pattern. We present a novel self-supervised framework consisting of learning offline clusters from temporal patch sequences and considering these clusters as labeled sets to train a real-time image classifier. The presented model outperforms state-of-the-art unsupervised image classification methods on large-scale diverse driving video dataset BDD100K.
△ Less
Submitted 7 June, 2020; v1 submitted 20 October, 2019;
originally announced October 2019.
-
Generating Relevant Counter-Examples from a Positive Unlabeled Dataset for Image Classification
Authors:
Florent Chiaroni,
Ghazaleh Khodabandelou,
Mohamed-Cherif Rahal,
Nicolas Hueber,
Frederic Dufaux
Abstract:
With surge of available but unlabeled data, Positive Unlabeled (PU) learning is becoming a thriving challenge. This work deals with this demanding task for which recent GAN-based PU approaches have demonstrated promising results. Generative adversarial Networks (GANs) are not hampered by deterministic bias or need for specific dimensionality. However, existing GAN-based PU approaches also present…
▽ More
With surge of available but unlabeled data, Positive Unlabeled (PU) learning is becoming a thriving challenge. This work deals with this demanding task for which recent GAN-based PU approaches have demonstrated promising results. Generative adversarial Networks (GANs) are not hampered by deterministic bias or need for specific dimensionality. However, existing GAN-based PU approaches also present some drawbacks such as sensitive dependence to prior knowledge, a cumbersome architecture or first-stage overfitting. To settle these issues, we propose to incorporate a biased PU risk within the standard GAN discriminator loss function. In this manner, the discriminator is constrained to request the generator to converge towards the unlabeled samples distribution while diverging from the positive samples distribution. This enables the proposed model, referred to as D-GAN, to exclusively learn the counter-examples distribution without prior knowledge. Experiments demonstrate that our approach outperforms state-of-the-art PU methods without prior by overcoming their issues.
△ Less
Submitted 4 October, 2019;
originally announced October 2019.
-
Self-supervised learning for autonomous vehicles perception: A conciliation between analytical and learning methods
Authors:
Florent Chiaroni,
Mohamed-Cherif Rahal,
Nicolas Hueber,
Frederic Dufaux
Abstract:
Nowadays, supervised deep learning techniques yield the best state-of-the-art prediction performances for a wide variety of computer vision tasks. However, such supervised techniques generally require a large amount of manually labeled training data. In the context of autonomous vehicles perception, this requirement is critical, as the distribution of sensor data can continuously change and includ…
▽ More
Nowadays, supervised deep learning techniques yield the best state-of-the-art prediction performances for a wide variety of computer vision tasks. However, such supervised techniques generally require a large amount of manually labeled training data. In the context of autonomous vehicles perception, this requirement is critical, as the distribution of sensor data can continuously change and include several unexpected variations. It turns out that a category of learning techniques, referred to as self-supervised learning (SSL), consists of replacing the manual labeling effort by an automatic labeling process. Thanks to their ability to learn on the application time and in varying environments, state-of-the-art SSL techniques provide a valid alternative to supervised learning for a variety of different tasks, including long-range traversable area segmentation, moving obstacle instance segmentation, long-term moving obstacle tracking, or depth map prediction. In this tutorial-style article, we present an overview and a general formalization of the concept of self-supervised learning (SSL) for autonomous vehicles perception. This formalization provides helpful guidelines for developing novel frameworks based on generic SSL principles. Moreover, it enables to point out significant challenges in the design of future SSL systems.
△ Less
Submitted 7 June, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.