-
EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks
Authors:
Athinoulla Konstantinou,
Georgios Leontidis,
Mamatha Thota,
Aiden Durrant
Abstract:
Learning self-supervised representations that are invariant and equivariant to transformations is crucial for advancing beyond traditional visual classification tasks. However, many methods rely on predictor architectures to encode equivariance, despite evidence that architectural choices, such as capsule networks, inherently excel at learning interpretable pose-aware representations. To explore this, we introduce EquiCaps (Equivariant Capsule Network), a capsule-based approach to pose-aware self-supervision that eliminates the need for a specialised predictor for enforcing equivariance. Instead, we leverage the intrinsic pose-awareness capabilities of capsules to improve performance in pose estimation tasks. To further challenge our assumptions, we increase task complexity via multi-geometric transformations to enable a more thorough evaluation of invariance and equivariance by introducing 3DIEBench-T, an extension of a 3D object-rendering benchmark dataset. Empirical results demonstrate that EquiCaps outperforms prior state-of-the-art equivariant methods on rotation prediction, achieving a supervised-level $R^2$ of 0.78 on the 3DIEBench rotation prediction benchmark and improving upon SIE and CapsIE by 0.05 and 0.04 $R^2$, respectively. Moreover, in contrast to non-capsule-based equivariant approaches, EquiCaps maintains robust equivariant performance under combined geometric transformations, underscoring its generalisation capabilities and the promise of predictor-free capsule architectures.
Submitted 11 June, 2025;
originally announced June 2025.
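For readers unfamiliar with the $R^2$ figures quoted in the abstract above, the following is a minimal sketch of the standard coefficient of determination for scalar targets. The function name `r2_score` and the list-based inputs are illustrative assumptions; the listing does not state how the authors aggregate $R^2$ over rotation parameters, so this is not their evaluation code.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean target.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means perfect prediction, and 0 means the predictor does no better than always outputting the mean target.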
-
Monkey Transfer Learning Can Improve Human Pose Estimation
Authors:
Bradley Scott,
Clarisse de Vries,
Aiden Durrant,
Nir Oren,
Edward Chadwick,
Dimitra Blana
Abstract:
In this study, we investigated whether transfer learning from macaque monkeys could improve human pose estimation. Current state-of-the-art pose estimation techniques, often employing deep neural networks, can match human annotation in non-clinical datasets. However, they underperform in novel situations, limiting their generalisability to clinical populations with pathological movement patterns. Clinical datasets are not widely available for AI training due to ethical challenges and a lack of data collection. We observe that data from other species may be able to bridge this gap by exposing the network to a broader range of motion cues. We found that utilising data from other species and undertaking transfer learning improved human pose estimation in terms of precision and recall compared to the benchmark, which was trained on humans only. Compared to the benchmark, fewer human training examples were needed for the transfer learning approach (1,000 vs 19,185). These results suggest that macaque pose estimation can improve human pose estimation in clinical situations. Future work should further explore the utility of pose estimation trained with monkey data in clinical populations.
Submitted 20 December, 2024;
originally announced December 2024.
-
Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Authors:
Andy Li,
Aiden Durrant,
Milan Markovic,
Tianjin Huang,
Souvik Kundu,
Tianlong Chen,
Lu Yin,
Georgios Leontidis
Abstract:
Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks, crucial for deploying models on memory- and power-constrained devices. While recent sparse learning methods have shown promising performance up to moderate sparsity levels such as 95% and 98%, accuracy quickly deteriorates when pushing sparsities to extreme levels due to unique challenges such as fragile gradient flow. In this work, we explore network performance beyond the commonly studied sparsities, and propose a collection of techniques that enable the continuous learning of networks without accuracy collapse even at extreme sparsities, including 99.90%, 99.95% and 99.99% on ResNet architectures. Our approach combines 1) Dynamic ReLU phasing, where DyReLU initially allows for richer parameter exploration before being gradually replaced by standard ReLU, 2) weight sharing, which reuses parameters within a residual layer while maintaining the same number of learnable parameters, and 3) cyclic sparsity, where both sparsity levels and sparsity patterns evolve dynamically throughout training to better encourage parameter exploration. We evaluate our method, which we term Extreme Adaptive Sparse Training (EAST), at extreme sparsities using ResNet-34 and ResNet-50 on CIFAR-10, CIFAR-100, and ImageNet, achieving significant performance improvements over the state-of-the-art methods we compared against.
Submitted 10 March, 2025; v1 submitted 20 November, 2024;
originally announced November 2024.
-
Capsule Network Projectors are Equivariant and Invariant Learners
Authors:
Miles Everett,
Aiden Durrant,
Mingjun Zhong,
Georgios Leontidis
Abstract:
Learning invariant representations has been the longstanding approach to self-supervised learning. However, recent progress has been made in preserving equivariant properties in representations, yet existing methods do so with highly prescribed architectures. In this work, we propose an invariant-equivariant self-supervised architecture that employs Capsule Networks (CapsNets), which have been shown to capture equivariance with respect to novel viewpoints. We demonstrate that the use of CapsNets in equivariant self-supervised architectures achieves improved downstream performance on equivariant tasks with higher efficiency and fewer network parameters. To accommodate the architectural changes of CapsNets, we introduce a new objective function based on entropy minimisation. This approach, which we name CapsIE (Capsule Invariant Equivariant Network), achieves state-of-the-art performance on the equivariant rotation tasks on the 3DIEBench dataset compared to prior equivariant SSL methods, while performing competitively against supervised counterparts. Our results demonstrate the ability of CapsNets to learn complex and generalised representations for large-scale, multi-task datasets compared to previous CapsNet benchmarks. Code is available at https://github.com/AberdeenML/CapsIE.
Submitted 20 November, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
S-JEA: Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning
Authors:
Alžběta Manová,
Aiden Durrant,
Georgios Leontidis
Abstract:
Self-Supervised Learning (SSL) has recently emerged as a fundamental paradigm for learning image representations and continues to demonstrate high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEAs), where higher-level JEAs take as input the representations of lower-level JEAs. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEAs perform on a similar level to traditional JEAs with comparable parameter counts, and visualise the representation spaces to validate the semantic hierarchies.
Submitted 4 November, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes
Authors:
Aiden Durrant,
Georgios Leontidis
Abstract:
Hyperbolic manifolds for visual representation learning allow for effective learning of semantic class hierarchies by naturally embedding tree-like structures with low distortion within a low-dimensional representation space. The highly separable semantic class hierarchies produced by hyperbolic learning have been shown to be powerful in low-shot tasks; however, their application in self-supervised learning is yet to be explored fully. In this work, we explore the use of hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches. First, we extend the Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space; second, we place prototypes on the ideal boundary of the Poincaré ball. Unlike previous methods, we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic. Empirically, we demonstrate the ability of these methods to perform comparably to Euclidean methods in lower dimensions for linear evaluation tasks, whilst showing improvements in extreme few-shot learning tasks.
Submitted 18 May, 2023;
originally announced May 2023.
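As background for the Poincaré ball model named in the abstract above, the sketch below shows two textbook operations: the exponential map at the origin, which carries a Euclidean vector into the ball, and the geodesic distance on the unit ball. The function names and the curvature parameter `c` are illustrative assumptions; this is not the authors' projection head, and it does not cover their placement of prototypes on the ideal boundary.

```python
import math


def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincaré ball with curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    # tanh keeps the mapped point strictly inside the ball of radius 1/sqrt(c).
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincaré ball (c = 1)."""
    def sq_norm(u):
        return sum(a * a for a in u)

    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * sq_norm(diff) / ((1.0 - sq_norm(x)) * (1.0 - sq_norm(y)))
    return math.acosh(arg)
```

With `c = 1`, the distance from the origin to `expmap0(v)` is twice the Euclidean norm of `v`, which makes a convenient sanity check. Distances grow without bound as points approach the boundary, which is what lets tree-like hierarchies embed with low distortion.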
-
A Bayesian approach to identify changepoints in spatio-temporal ordered categorical data: An application to COVID-19 data
Authors:
Siddharth Rawat,
Abe Durrant,
Adam Simpson,
Grant Nielson,
Candace Berrett,
Soudeep Deb
Abstract:
Although there is substantial literature on identifying structural changes for continuous spatio-temporal processes, the same is not true for categorical spatio-temporal data. This work bridges that gap and proposes a novel spatio-temporal model to identify changepoints in ordered categorical data. The model leverages an additive mean structure with separable Gaussian space-time processes for the latent variable. Our proposed methodology can detect significant changes in the mean structure as well as in the spatio-temporal covariance structures. We implement the model through a Bayesian framework that gives a computational edge over conventional approaches. From an application perspective, our approach's capability to handle ordinal categorical data provides an added advantage in real applications. This is illustrated using county-wise COVID-19 data (converted to categories according to CDC guidelines) from the state of New York in the USA. Our model identifies three changepoints in the transmission levels of COVID-19, which are indeed aligned with the "waves" due to specific variants encountered during the pandemic. The findings also provide interesting insights into the effects of vaccination and the extent of spatial and temporal dependence in different phases of the pandemic.
Submitted 3 May, 2023;
originally announced May 2023.
-
A systematic review of physical-digital play technology and developmentally relevant child behaviour
Authors:
Pablo E. Torres,
Philip I. N. Ulrich,
Veronica Cucuiat,
Mutlu Cukurova,
Maria Fercovic De la Presa,
Rose Luckin,
Amanda Carr,
Thomas Dylan,
Abigail Durrant,
John Vines,
Shaun Lawson
Abstract:
New interactive physical-digital play technologies are shaping the way children play. These technologies refer to digital play technologies that engage children in analogue forms of behaviour, either alone or with others. Current interactive physical-digital play technologies include robots, digital agents, mixed or augmented reality devices, and smart-eye based gaming. Little is known, however, about the ways in which these technologies could promote or damage child development. This systematic review was aimed at understanding if and how these physical-digital play technologies promoted developmentally relevant behaviour in typically developing 0 to 12 year-olds. Psychology, Education, and Computer Science databases were searched, producing 635 papers. A total of 31 papers met the inclusion criteria, of which 17 were of high enough quality to be included for synthesis. Results indicate that these new interactive play technologies could have a positive effect on children's developmentally relevant behaviour. The review indicated specific ways in which different behaviours were promoted. Providing information about their own performance promoted self-monitoring. Slowing interactivity, play interdependency, and joint object accessibility promoted collaboration. Offering delimited choices promoted decision making. Problem solving and physical activity were promoted by requiring children to engage in them to keep playing. Four principles underpinned the ways in which physical-digital play technologies afforded child behaviour. These included social expectations framing play situations, the directiveness of action regulations (inviting, guiding or forcing behaviours), the technical features of play technologies (digital play mechanics and physical characteristics), and the alignment between play goals, play technology and the play behaviours promoted.
Submitted 10 February, 2022; v1 submitted 22 May, 2021;
originally announced May 2021.
-
Hyperspherically Regularized Networks for Self-Supervision
Authors:
Aiden Durrant,
Georgios Leontidis
Abstract:
Bootstrap Your Own Latent (BYOL) introduced an approach to self-supervised learning avoiding the contrastive paradigm and subsequently removing the computational burden of negative sampling associated with such methods. However, we empirically find that the image representations produced under BYOL's self-distillation paradigm are poorly distributed in representation space compared to contrastive methods. This work empirically demonstrates that feature diversity enforced by contrastive losses is beneficial to image representation uniformity when employed in BYOL, and as such, provides greater inter-class representation separability. Additionally, we explore and advocate the use of regularization methods, specifically the layer-wise minimization of hyperspherical energy (i.e. maximization of entropy) of network weights to encourage representation uniformity. We show that directly optimizing a measure of uniformity alongside the standard loss, or regularizing the networks of the BYOL architecture to minimize the hyperspherical energy of neurons, can produce more uniformly distributed and therefore better-performing representations for downstream tasks.
Submitted 27 March, 2022; v1 submitted 29 April, 2021;
originally announced May 2021.
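To make the hyperspherical-energy regularizer mentioned in the abstract above more concrete, here is a minimal sketch of one commonly used form: the Riesz-kernel energy of a layer's neurons after projecting each weight vector onto the unit sphere. The function name, the exponent `s`, and the stabiliser `eps` are assumptions chosen for illustration; the listing does not specify which energy variant the authors minimise.

```python
import math


def hyperspherical_energy(weights, s=1.0, eps=1e-8):
    """Sum over neuron pairs of 1 / ||w_i - w_j||^s on the unit sphere (hypothetical helper).

    `weights` is a list of per-neuron weight vectors for a single layer.
    """
    # Project every neuron onto the unit hypersphere.
    unit = []
    for w in weights:
        norm = math.sqrt(sum(x * x for x in w))
        unit.append([x / (norm + eps) for x in w])

    energy = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(unit[i], unit[j])))
            # Nearby neurons contribute large energy; spread-out neurons little.
            energy += 1.0 / (dist + eps) ** s
    return energy
```

Lower energy corresponds to neurons spread more evenly over the sphere, so adding this quantity to a training loss penalises redundant, clustered weight directions.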
-
The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector
Authors:
Aiden Durrant,
Milan Markovic,
David Matthews,
David May,
Jessica Enright,
Georgios Leontidis
Abstract:
Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in general, but particularly in the agri-food sector. Protectiveness of data is natural in this setting; data is a precious commodity for data owners, which if used properly can provide them with useful insights on operations and processes leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, something that in many scenarios is unrealistic. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution to this issue by providing the infrastructure and underpinning technologies needed to use data from various sources to train models without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with the owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and provide potential use-cases in which such methods can assist in other problem settings. Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.
Submitted 4 May, 2023; v1 submitted 14 April, 2021;
originally announced April 2021.
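The core idea in the abstract above, training a shared model while raw data stay with their owners, can be illustrated with federated averaging (FedAvg), a widely used aggregation rule. This is an assumption for illustration: the listing does not say which aggregation scheme the authors use, and the function name and flat-list parameter layout are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate per-client parameter vectors, weighted by local dataset size.

    Only model parameters are sent to the server; the raw data never leave
    each silo. `client_weights` is a list of equal-length parameter lists.
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("at least one client must hold data")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same model shape")
        for k, w in enumerate(weights):
            # Clients with more local samples contribute proportionally more.
            averaged[k] += (size / total) * w
    return averaged
```

For example, two silos holding 1 and 3 samples with parameters `[1.0, 2.0]` and `[3.0, 4.0]` aggregate to `[2.5, 3.5]`. In a full training loop, the server would broadcast the averaged parameters back to the silos for the next round of local training.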
-
Intensity-dependent dispersion under conditions of electromagnetically induced transparency in coherently prepared multi-state atoms
Authors:
Andrew D. Greentree,
Derek Richards,
J. A. Vaccaro,
A. V. Durrant,
S. R. de Echaniz,
D. M. Segal,
J. P. Marangos
Abstract:
Interest in lossless nonlinearities has focussed on the dispersive properties of $Λ$ systems under conditions of electromagnetically induced transparency (EIT). We generalize the $Λ$ system by introducing further degenerate states to realize a 'Chain $Λ$' atom, where multiple coupling of the probe field significantly enhances the intensity-dependent dispersion without compromising the EIT condition.
Submitted 18 March, 2003; v1 submitted 9 September, 2002;
originally announced September 2002.
-
Resonant and off-resonant transients in electromagnetically induced transparency: turn-on and turn-off dynamics
Authors:
Andrew D. Greentree,
T. B. Smith,
S. R. de Echaniz,
A. V. Durrant,
J. P. Marangos,
D. M. Segal,
J. A. Vaccaro
Abstract:
This paper presents a wide-ranging theoretical and experimental study of non-adiabatic transient phenomena in a $Λ$ EIT system when a strong coupling field is rapidly switched on or off. The theoretical treatment uses a Laplace transform approach to solve the time-dependent density matrix equation. The experiments are carried out in a $^{87}$Rb MOT. The results show transient probe gain in parameter regions not previously studied, and provide insight into the transition dynamics between bare and dressed states.
Submitted 19 September, 2001;
originally announced September 2001.
-
Observation of transient gain without population inversion in a laser-cooled rubidium lambda system
Authors:
S. R. de Echaniz,
Andrew D. Greentree,
A. V. Durrant,
D. M. Segal,
J. P. Marangos,
J. A. Vaccaro
Abstract:
We have observed clear Rabi oscillations of a weak probe in a strongly driven three-level lambda system in laser-cooled rubidium for the first time. When the coupling field is non-adiabatically switched on using a Pockels cell, transient probe gain without population inversion is obtained in the presence of uncoupled absorptions. Our results are supported by three-state computations.
Submitted 17 May, 2001;
originally announced May 2001.
-
Observations of a doubly driven V system probed to a fourth level in laser-cooled rubidium
Authors:
S. R. de Echaniz,
Andrew D. Greentree,
A. V. Durrant,
D. M. Segal,
J. P. Marangos,
J. A. Vaccaro
Abstract:
Observations of a doubly driven V system probed to a fourth level in an N configuration are reported. A dressed state analysis is also presented. The expected three-peak spectrum is explored in a cold rubidium sample in a magneto-optic trap. Good agreement is found between the dressed state theory and the experimental spectra once light shifts and uncoupled absorptions in the rubidium system are taken into account.
Submitted 4 May, 2001; v1 submitted 20 February, 2001;
originally announced February 2001.
-
Prospects for photon blockade in four level systems in the N configuration with more than one atom
Authors:
Andrew D. Greentree,
John A. Vaccaro,
Sebastian R. de Echaniz,
Alan V. Durrant,
Jon P. Marangos
Abstract:
We show that for appropriate choices of parameters it is possible to achieve photon blockade in idealised one-, two- and three-atom systems. We also include realistic parameter ranges for rubidium as the atomic species. Our results circumvent the doubts cast by recent discussion in the literature (Grangier et al., Phys. Rev. Lett. 81, 2833 (1998); Imamoglu et al., Phys. Rev. Lett. 81, 2836 (1998)) on the possibility of photon blockade in multi-atom systems.
Submitted 29 February, 2000;
originally announced February 2000.