doi 10.3390/s24248002

Improving Object Detection for Time-Lapse Imagery Using Temporal Features in Wildlife Monitoring

Authors: Marcus Jenkins, Kirsty A. Franklin, Malcolm A. C. Nicoll, Nik C. Cole, Kevin Ruhomaun, Vikash Tatayah, Michal Mackiewicz

Abstract: Monitoring animal populations is crucial for assessing the health of ecosystems. Traditional methods, which require extensive fieldwork, are increasingly being supplemented by time-lapse camera-trap imagery combined with an automatic analysis of the image data. The latter usually involves some object detector aimed at detecting relevant targets (commonly animals) in each image, followed by some postprocessing to gather activity and population data. In this paper, we show that the performance of an object detector in a single frame of a time-lapse sequence can be improved by including spatio-temporal features from the prior frames. We propose a method that leverages temporal information by integrating two additional spatial feature channels which capture stationary and non-stationary elements of the scene and consequently improve scene understanding and reduce the number of stationary false positives. The proposed technique achieves a significant improvement of 24% in mean average precision (mAP@0.5:0.95) over the baseline (temporal feature-free, single frame) object detector on a large dataset of breeding tropical seabirds. We envisage our method will be widely applicable to other wildlife monitoring applications that use time-lapse imaging.

Submitted 20 December, 2024; originally announced December 2024.

Comments: 18 pages, 13 figures

MSC Class: 68T45 ACM Class: I.4.8

Journal ref: Sensors 2024, 24, 8002

arXiv:2309.06188 [pdf, other]

Computer Vision Pipeline for Automated Antarctic Krill Analysis

Authors: Mazvydas Gudelis, Michal Mackiewicz, Julie Bremner, Sophie Fielding

Abstract: British Antarctic Survey (BAS) researchers launch annual expeditions to the Antarctic in order to estimate Antarctic Krill biomass and assess the change from previous years. These comparisons provide insight into the effects of the current environment on this key component of the marine food chain. In this work we have developed tools for automating the data collection and analysis process, using web-based image annotation tools and deep learning image classification and regression models. We achieve highly accurate krill instance segmentation results with an average 77.28% AP score, as well as separate maturity stage and length estimation of krill specimens with 62.99% accuracy and a 1.98mm length error respectively.

Submitted 12 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: Accepted to MVEO @ BMVC 2023

arXiv:2211.04888 [pdf, other]

Extending Temporal Data Augmentation for Video Action Recognition

Authors: Artjoms Gorpincenko, Michal Mackiewicz

Abstract: Pixel space augmentation has grown in popularity in many Deep Learning areas, due to its effectiveness, simplicity, and low computational cost. Data augmentation for videos, however, still remains an under-explored research topic, as most works have been treating inputs as stacks of static images rather than temporally linked series of data. Recently, it has been shown that involving the time dimension when designing augmentations can be superior to its spatial-only variants for video action recognition. In this paper, we propose several novel enhancements to these techniques to strengthen the relationship between the spatial and temporal domains and achieve a deeper level of perturbations. The video action recognition results of our techniques outperform their respective variants in Top-1 and Top-5 settings on the UCF-101 and the HMDB-51 datasets.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2111.09692 [pdf, other]

SUB-Depth: Self-distillation and Uncertainty Boosting Self-supervised Monocular Depth Estimation

Authors: Hang Zhou, Sarah Taylor, David Greenwood, Michal Mackiewicz

Abstract: We propose SUB-Depth, a universal multi-task training framework for self-supervised monocular depth estimation (SDE). Depth models trained with SUB-Depth outperform the same models trained in a standard single-task SDE framework. By introducing an additional self-distillation task into a standard SDE training framework, SUB-Depth trains a depth network, not only to predict the depth map for an image reconstruction task, but also to distill knowledge from a trained teacher network with unlabelled data. To take advantage of this multi-task setting, we propose homoscedastic uncertainty formulations for each task to penalize areas likely to be affected by teacher network noise, or violate SDE assumptions. We present extensive evaluations on KITTI to demonstrate the improvements achieved by training a range of existing networks using the proposed framework, and we achieve state-of-the-art performance on this task. Additionally, SUB-Depth enables models to estimate uncertainty on depth output.

Submitted 29 November, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

Comments: bmvc version

arXiv:2110.04487 [pdf, other]

Colour augmentation for improved semi-supervised semantic segmentation

Authors: Geoff French, Michal Mackiewicz

Abstract: Consistency regularization describes a class of approaches that have yielded state-of-the-art results for semi-supervised classification. While semi-supervised semantic segmentation proved to be more challenging, a number of successful approaches have been recently proposed. Recent work explored the challenges involved in using consistency regularization for segmentation problems. In their self-supervised work Chen et al. found that colour augmentation prevents a classification network from using image colour statistics as a short-cut for self-supervised learning via instance discrimination. Drawing inspiration from this we find that a similar problem impedes semi-supervised semantic segmentation and offer colour augmentation as a solution, improving semi-supervised semantic segmentation performance on challenging photographic imagery.

Submitted 9 October, 2021; originally announced October 2021.

Comments: 9 pages, 1 figure

arXiv:2103.04068 [pdf, other]

doi 10.1109/JSEN.2020.3032031

Improving Automated Sonar Video Analysis to Notify About Jellyfish Blooms

Authors: Artjoms Gorpincenko, Geoffrey French, Peter Knight, Mike Challiss, Michal Mackiewicz

Abstract: Human enterprise often suffers from direct negative effects caused by jellyfish blooms. The investigation of a prior jellyfish monitoring system showed that it was unable to reliably perform in a cross validation setting, i.e. in new underwater environments. In this paper, a number of enhancements are proposed to the part of the system that is responsible for object classification. First, the training set is augmented by adding synthetic data, making the deep learning classifier able to generalise better. Then, the framework is enhanced by employing a new second stage model, which analyzes the outputs of the first network to make the final prediction. Finally, weighted loss and confidence threshold are added to balance out true and false positives. With all the upgrades in place, the system can correctly classify 30.16% (compared to the initial 11.52%) of all spotted jellyfish, keep the amount of false positives as low as 0.91% (compared to the initial 2.26%) and operate in real-time within the computational constraints of an autonomous embedded platform.

Submitted 6 March, 2021; originally announced March 2021.

Journal ref: IEEE Sensors Journal, 21, 4981-4988 (2021)

arXiv:2008.08369 [pdf, other]

Virtual Adversarial Training in Feature Space to Improve Unsupervised Video Domain Adaptation

Authors: Artjoms Gorpincenko, Geoffrey French, Michal Mackiewicz

Abstract: Virtual Adversarial Training has recently seen a lot of success in semi-supervised learning, as well as unsupervised Domain Adaptation. However, so far it has been used on input samples in the pixel space, whereas we propose to apply it directly to feature vectors. We also discuss the unstable behaviour of entropy minimization and Decision-Boundary Iterative Refinement Training With a Teacher in Domain Adaptation, and suggest substitutes that achieve similar behaviour. By adding the aforementioned techniques to the state of the art model TA$^3$N, we either maintain competitive results or outperform prior art in multiple unsupervised video Domain Adaptation tasks.

Submitted 19 August, 2020; originally announced August 2020.

Comments: Submitted to the EI conference

arXiv:1907.02040 [pdf, other]

Using Deep Learning to Count Albatrosses from Space

Authors: Ellen Bowler, Peter T. Fretwell, Geoffrey French, Michal Mackiewicz

Abstract: In this paper we test the use of a deep learning approach to automatically count Wandering Albatrosses in Very High Resolution (VHR) satellite imagery. We use a dataset of manually labelled imagery provided by the British Antarctic Survey to train and develop our methods. We employ a U-Net architecture, designed for image segmentation, to simultaneously classify and localise potential albatrosses. We aid training with the use of the Focal Loss criterion, to deal with extreme class imbalance in the dataset. Initial results achieve peak precision and recall values of approximately 80%. Finally we assess the model's performance in relation to inter-observer variation, by comparing errors against an image labelled by multiple observers. We conclude model accuracy falls within the range of human counters. We hope that the methods will streamline the analysis of VHR satellite images, enabling more frequent monitoring of a species which is of high conservation concern.

Submitted 3 July, 2019; originally announced July 2019.

Comments: 4 pages, 5 figures, to be presented at IEEE 2019 International Geoscience & Remote Sensing Symposium (IGARSS 2019), scheduled for July 28 - August 2, 2019

arXiv:1906.01916 [pdf, other]

Semi-supervised semantic segmentation needs strong, varied perturbations

Authors: Geoff French, Samuli Laine, Timo Aila, Michal Mackiewicz, Graham Finlayson

Abstract: Consistency regularization describes a class of approaches that have yielded groundbreaking results in semi-supervised classification problems. Prior work has established the cluster assumption - under which the data distribution consists of uniform class clusters of samples separated by low density regions - as important to its success. We analyze the problem of semantic segmentation and find that its distribution does not exhibit low density regions separating classes and offer this as an explanation for why semi-supervised segmentation is a challenging problem, with only a few reports of success. We then identify choice of augmentation as key to obtaining reliable performance without such low-density regions. We find that adapted variants of the recently proposed CutOut and CutMix augmentation techniques yield state-of-the-art semi-supervised semantic segmentation results in standard datasets. Furthermore, given its challenging nature we propose that semantic segmentation acts as an effective acid test for evaluating semi-supervised regularizers. Implementation at: https://github.com/Britefury/cutmix-semisup-seg.

Submitted 11 August, 2020; v1 submitted 5 June, 2019; originally announced June 2019.

Comments: 21 pages, 7 figures, accepted to BMVC 2020

arXiv:1901.08419 [pdf, ps, other]

doi 10.1364/JOSAA.36.000096

Spherical sampling methods for the calculation of metamer mismatch volumes

Authors: Michal Mackiewicz, Hans Jakob Rivertz, Graham D. Finlayson

Abstract: In this paper, we propose two methods of calculating theoretically maximal metamer mismatch volumes. Unlike prior art techniques, our methods do not make any assumptions on the shape of spectra on the boundary of the mismatch volumes. Both methods utilize a spherical sampling approach, but they calculate mismatch volumes in two different ways. The first method uses a linear programming optimization, while the second is a computational geometry approach based on half-space intersection. We show that under certain conditions the theoretically maximal metamer mismatch volume is significantly larger than the one approximated using a prior art method.

Submitted 23 January, 2019; originally announced January 2019.

Comments: One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modifications of this paper are prohibited. Optical Society of America

MSC Class: 68U99

Journal ref: Vol. 36, No. 1 / Jan 2019 / Journal of the Optical Society of America A

arXiv:1706.05208 [pdf, other]

Self-ensembling for visual domain adaptation

Authors: Geoffrey French, Michal Mackiewicz, Mark Fisher

Abstract: This paper explores the use of self-ensembling for visual domain adaptation problems. Our technique is derived from the mean teacher variant (Tarvainen et al., 2017) of temporal ensembling (Laine et al., 2017), a technique that achieved state of the art results in the area of semi-supervised learning. We introduce a number of modifications to their approach for challenging domain adaptation scenarios and evaluate its effectiveness. Our approach achieves state of the art results in a variety of benchmarks, including our winning entry in the VISDA-2017 visual domain adaptation challenge. In small image benchmarks, our algorithm not only outperforms prior art, but can also achieve accuracy that is close to that of a classifier trained in a supervised fashion.

Submitted 23 September, 2018; v1 submitted 16 June, 2017; originally announced June 2017.

Comments: 20 pages, 3 figures, accepted as a poster at ICLR 2018

Showing 1–11 of 11 results for author: Mackiewicz, M