-
CALM: Cognitive Assessment using Light-insensitive Model
Authors:
Akhil Meethal,
Anita Paas,
Nerea Urrestilla Anguiozar,
David St-Onge
Abstract:
The demand for cognitive load assessment with low-cost, easy-to-use equipment is increasing, with applications ranging from safety-critical industries to entertainment. Though pupillometry is an attractive solution for cognitive load estimation in such applications, its sensitivity to light makes it less robust under varying lighting conditions. Multimodal data acquisition provides a viable alternative, where pupillometry is combined with electrocardiography (ECG) or electroencephalography (EEG). In this work, we study the sensitivity of pupillometry-based cognitive load estimation to light. By collecting heart rate variability (HRV) data during the same experimental sessions, we analyze how the multimodal data reduces this sensitivity and increases robustness to light conditions. We also compare multimodal performance using HRV data from low-cost fitness-grade equipment against that from clinical-grade equipment, collecting data synchronously from both devices for all task conditions. Our results indicate that multimodal data improves the robustness of cognitive load estimation under changes in light conditions and improves accuracy by more than 20 percentage points over assessment based on pupillometry alone. In addition, the fitness-grade device is observed to be a potential alternative to the clinical-grade one, even in controlled laboratory settings.
Submitted 5 September, 2024;
originally announced September 2024.
-
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors
Authors:
Atif Belal,
Akhil Meethal,
Francisco Perdigon Romero,
Marco Pedersoli,
Eric Granger
Abstract:
Domain adaptation methods for object detection (OD) strive to mitigate the impact of distribution shifts by promoting feature alignment across source and target domains. Multi-source domain adaptation (MSDA) allows leveraging multiple annotated source datasets and unlabeled target data to improve the accuracy and robustness of the detection model. Most state-of-the-art MSDA methods for OD perform feature alignment in a class-agnostic manner. This is challenging since the objects have unique modality information due to variations in object appearance across domains. A recent prototype-based approach proposed a class-wise alignment, yet it suffers from error accumulation caused by noisy pseudo-labels that can negatively affect adaptation with imbalanced data. To overcome these limitations, we propose an attention-based class-conditioned alignment method for MSDA, designed to align instances of each object category across domains. In particular, an attention module combined with an adversarial domain classifier allows learning domain-invariant and class-specific instance representations. Experimental results on multiple benchmarking MSDA datasets indicate that our method outperforms state-of-the-art methods and exhibits robustness to class imbalance, achieved through a conceptually simple class-conditioning strategy. Our code is available at: https://github.com/imatif17/ACIA.
Submitted 11 December, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Multi-Source Domain Adaptation for Object Detection with Prototype-based Mean-teacher
Authors:
Atif Belal,
Akhil Meethal,
Francisco Perdigon Romero,
Marco Pedersoli,
Eric Granger
Abstract:
Adapting visual object detectors to operational target domains is a challenging task, commonly achieved using unsupervised domain adaptation (UDA) methods. Recent studies have shown that when the labeled dataset comes from multiple source domains, treating them as separate domains and performing a multi-source domain adaptation (MSDA) improves the accuracy and robustness over blending these source domains and performing a UDA. For adaptation, existing MSDA methods learn domain-invariant and domain-specific parameters (for each source domain). However, unlike single-source UDA methods, learning domain-specific parameters makes them grow significantly in proportion to the number of source domains. This paper proposes a novel MSDA method called Prototype-based Mean Teacher (PMT), which uses class prototypes instead of domain-specific subnets to encode domain-specific information. These prototypes are learned using a contrastive loss, aligning the same categories across domains and separating different categories far apart. Given the use of prototypes, the number of parameters required for our PMT method does not increase significantly with the number of source domains, thus reducing memory issues and possible overfitting. Empirical studies indicate that PMT outperforms state-of-the-art MSDA methods on several challenging object detection datasets. Our code is available at https://github.com/imatif17/Prototype-Mean-Teacher.
Submitted 31 July, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Density Crop-guided Semi-supervised Object Detection in Aerial Images
Authors:
Akhil Meethal,
Eric Granger,
Marco Pedersoli
Abstract:
One of the important bottlenecks in training modern object detectors is the need for labeled images where bounding box annotations have to be produced for each object present in the image. This bottleneck is further exacerbated in aerial images, where the annotators have to label small objects often distributed in clusters on high-resolution images. Recently, the mean-teacher approach trained with pseudo-labels and weak-strong augmentation consistency has been gaining popularity for semi-supervised object detection. However, a direct adaptation of such semi-supervised detectors to aerial images, where small clustered objects are often present, might not lead to optimal results. In this paper, we propose a density crop-guided semi-supervised detector that identifies clusters of small objects during training and also exploits them to improve performance at inference. During training, image crops of clusters identified from labeled and unlabeled images are used to augment the training set, which in turn increases the chance of detecting small objects and creating good pseudo-labels for small objects on the unlabeled images. During inference, the detector is able to detect not only the objects of interest but also regions with a high density of small objects (density crops), so that detections from the input image and detections from image crops are combined, resulting in an overall more accurate object prediction, especially for small objects. Empirical studies on the popular VisDrone and DOTA benchmark datasets show the effectiveness of our density crop-guided semi-supervised detector, with an average improvement of more than 2% over the basic mean-teacher method in COCO-style AP. Our code is available at: https://github.com/akhilpm/DroneSSOD.
Submitted 9 August, 2023;
originally announced August 2023.
-
Cascaded Zoom-in Detector for High Resolution Aerial Images
Authors:
Akhil Meethal,
Eric Granger,
Marco Pedersoli
Abstract:
Detecting objects in aerial images is challenging because they are typically composed of crowded small objects distributed non-uniformly over high-resolution images. Density cropping is a widely used method to improve this small object detection where the crowded small object regions are extracted and processed in high resolution. However, this is typically accomplished by adding other learnable components, thus complicating the training and inference over a standard detection process. In this paper, we propose an efficient Cascaded Zoom-in (CZ) detector that re-purposes the detector itself for density-guided training and inference. During training, density crops are located, labeled as a new class, and employed to augment the training dataset. During inference, the density crops are first detected along with the base class objects, and then input for a second stage of inference. This approach is easily integrated into any detector, and creates no significant change in the standard detection process, like the uniform cropping approach popular in aerial image detection. Experimental results on the aerial images of the challenging VisDrone and DOTA datasets verify the benefits of the proposed approach. The proposed CZ detector also provides state-of-the-art results over uniform cropping and other density cropping methods on the VisDrone dataset, increasing the detection mAP of small objects by more than 3 points.
Submitted 15 March, 2023;
originally announced March 2023.
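The second inference stage described in the abstract above needs detections found on an upscaled density crop to be expressed in full-image coordinates before they can be combined with the first-stage detections. The following is a minimal sketch of that coordinate mapping only. The names `crop_box_to_image` and `second_stage_boxes` and the single `scale` factor are hypothetical and not taken from the paper, and the simple concatenation assumes a separate de-duplication step (for example, non-maximum suppression) runs afterwards.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def crop_box_to_image(box: Box, crop: Box, scale: float) -> Box:
    """Map a box detected on an upscaled crop back to full-image coordinates.

    box   -- detection in the coordinates of the upscaled crop
    crop  -- the density crop region in full-image coordinates
    scale -- upscaling factor applied to the crop (assumed uniform in x and y)
    """
    x1, y1, x2, y2 = box
    origin_x, origin_y = crop[0], crop[1]
    return (
        origin_x + x1 / scale,
        origin_y + y1 / scale,
        origin_x + x2 / scale,
        origin_y + y2 / scale,
    )


def second_stage_boxes(
    crops: Sequence[Box],
    crop_detections: Sequence[Sequence[Box]],
    scale: float,
) -> List[Box]:
    """Collect all crop-level detections in full-image coordinates."""
    merged: List[Box] = []
    for crop, detections in zip(crops, crop_detections):
        merged.extend(crop_box_to_image(b, crop, scale) for b in detections)
    return merged
```

For a crop whose top-left corner is at (100, 200) and which was upscaled by a factor of 2, a crop-level box (20, 40, 60, 80) maps to (110, 220, 130, 240) in the original image.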
-
Semi-Weakly Supervised Object Detection by Sampling Pseudo Ground-Truth Boxes
Authors:
Akhil Meethal,
Marco Pedersoli,
Zhongwen Zhu,
Francisco Perdigon Romero,
Eric Granger
Abstract:
Semi- and weakly-supervised learning have recently attracted considerable attention in the object detection literature since they can alleviate the cost of annotation needed to successfully train deep learning models. State-of-the-art approaches for semi-supervised learning rely on student-teacher models trained using a multi-stage process and considerable data augmentation. Custom networks have been developed for the weakly-supervised setting, making it difficult to adapt to different detectors. In this paper, a weakly semi-supervised training method is introduced that reduces these training challenges, yet achieves state-of-the-art performance by leveraging only a small fraction of fully-labeled images with information in weakly-labeled images. In particular, our generic sampling-based learning strategy produces pseudo-ground-truth (GT) bounding box annotations in an online fashion, eliminating the need for multi-stage training and student-teacher network configurations. These pseudo GT boxes are sampled from weakly-labeled images based on the categorical score of object proposals accumulated via a score propagation process. Empirical results on the Pascal VOC dataset indicate that the proposed approach improves performance by 5.0% when using VOC 2007 as fully-labeled and VOC 2012 as weakly-labeled data. Also, with 5-10% fully annotated images, we observed an improvement of more than 10% in mAP, showing that a modest investment in image-level annotation can substantially improve detection performance.
Submitted 16 June, 2022; v1 submitted 31 March, 2022;
originally announced April 2022.
-
Unsupervised MKL in Multi-layer Kernel Machines
Authors:
Akhil Meethal,
Asharaf S,
Sumitra S
Abstract:
Kernel-based deep learning using multi-layer kernel machines (MKMs) was proposed by Y. Cho and L. K. Saul. In MKMs, they used only one kernel (the arc-cosine kernel) at each layer for the kernel PCA-based feature extraction. We propose to use multiple kernels in each layer by taking a convex combination of many kernels following an unsupervised learning strategy. An empirical study is conducted on the mnist-back-rand, mnist-back-image and mnist-rot-back-image datasets, generated by adding random noise to the image background of the MNIST dataset. Experimental results indicate that using MKL in MKMs yields a better representation of the raw data and improves the classifier performance.
Submitted 26 November, 2021;
originally announced November 2021.
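The "convex combination of many kernels" mentioned in the abstract above can be illustrated with a small sketch: a combined kernel is a weighted sum of base kernels whose weights are non-negative and sum to one. Everything below is an assumption made for illustration. The base kernels (linear and RBF), the hand-picked weights, and the function names are not from the paper, which uses arc-cosine kernels and learns the weights with an unsupervised strategy that is not reproduced here.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(x: Vector, y: Vector) -> float:
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(
    x: Vector, y: Vector, kernels: Sequence[Kernel], weights: Sequence[float]
) -> float:
    """Convex combination of base kernels: sum_i w_i * k_i(x, y)."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * k(x, y) for w, k in zip(weights, kernels))


def gram_matrix(
    points: Sequence[Vector], kernels: Sequence[Kernel], weights: Sequence[float]
) -> List[List[float]]:
    """Gram matrix of the combined kernel over a set of points."""
    return [
        [combined_kernel(p, q, kernels, weights) for q in points] for p in points
    ]
```

Because each weight is non-negative, the combined Gram matrix stays positive semi-definite whenever the base kernels are, so it can be passed to kernel PCA in place of a single-kernel Gram matrix.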
-
Convolutional STN for Weakly Supervised Object Localization
Authors:
Akhil Meethal,
Marco Pedersoli,
Soufiane Belharbi,
Eric Granger
Abstract:
Weakly supervised object localization is a challenging task in which the object of interest should be localized while learning its appearance. State-of-the-art methods recycle the architecture of a standard CNN by using the activation maps of the last layer for localizing the object. While this approach is simple and works relatively well, object localization relies on different features than classification; thus, a specialized localization mechanism is required during training to improve performance. In this paper, we propose a convolutional, multi-scale spatial localization network that provides accurate localization for the object of interest. Experimental results on the CUB-200-2011 and ImageNet datasets show that our proposed approach provides competitive performance for weakly supervised localization.
Submitted 1 December, 2020; v1 submitted 3 December, 2019;
originally announced December 2019.