Search | arXiv e-print repository

EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks

Authors: Athinoulla Konstantinou, Georgios Leontidis, Mamatha Thota, Aiden Durrant

Abstract: Learning self-supervised representations that are invariant and equivariant to transformations is crucial for advancing beyond traditional visual classification tasks. However, many methods rely on predictor architectures to encode equivariance, despite evidence that architectural choices, such as capsule networks, inherently excel at learning interpretable pose-aware representations. To explore this, we introduce EquiCaps (Equivariant Capsule Network), a capsule-based approach to pose-aware self-supervision that eliminates the need for a specialised predictor for enforcing equivariance. Instead, we leverage the intrinsic pose-awareness capabilities of capsules to improve performance in pose estimation tasks. To further challenge our assumptions, we increase task complexity via multi-geometric transformations to enable a more thorough evaluation of invariance and equivariance by introducing 3DIEBench-T, an extension of a 3D object-rendering benchmark dataset. Empirical results demonstrate that EquiCaps outperforms prior state-of-the-art equivariant methods on rotation prediction, achieving a supervised-level $R^2$ of 0.78 on the 3DIEBench rotation prediction benchmark and improving upon SIE and CapsIE by 0.05 and 0.04 $R^2$, respectively. Moreover, in contrast to non-capsule-based equivariant approaches, EquiCaps maintains robust equivariant performance under combined geometric transformations, underscoring its generalisation capabilities and the promise of predictor-free capsule architectures.

Submitted 11 June, 2025; originally announced June 2025.

Comments: 19 pages, 11 Figures, 13 Tables

arXiv:2412.15966 [pdf, other]

Monkey Transfer Learning Can Improve Human Pose Estimation

Authors: Bradley Scott, Clarisse de Vries, Aiden Durrant, Nir Oren, Edward Chadwick, Dimitra Blana

Abstract: In this study, we investigated whether transfer learning from macaque monkeys could improve human pose estimation. Current state-of-the-art pose estimation techniques, often employing deep neural networks, can match human annotation in non-clinical datasets. However, they underperform in novel situations, limiting their generalisability to clinical populations with pathological movement patterns. Clinical datasets are not widely available for AI training due to ethical challenges and a lack of data collection. We observe that data from other species may be able to bridge this gap by exposing the network to a broader range of motion cues. We found that utilising data from other species and undertaking transfer learning improved human pose estimation in terms of precision and recall compared to the benchmark, which was trained on humans only. Compared to the benchmark, fewer human training examples were needed for the transfer learning approach (1,000 vs 19,185). These results suggest that macaque pose estimation can improve human pose estimation in clinical situations. Future work should further explore the utility of pose estimation trained with monkey data in clinical populations.

Submitted 20 December, 2024; originally announced December 2024.

arXiv:2411.13545 [pdf, other]

Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning

Authors: Andy Li, Aiden Durrant, Milan Markovic, Tianjin Huang, Souvik Kundu, Tianlong Chen, Lu Yin, Georgios Leontidis

Abstract: Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks, crucial for deploying models on memory and power-constrained devices. While recent sparse learning methods have shown promising performance up to moderate sparsity levels such as 95% and 98%, accuracy quickly deteriorates when pushing sparsities to extreme levels due to unique challenges such as fragile gradient flow. In this work, we explore network performance beyond the commonly studied sparsities, and propose a collection of techniques that enable the continuous learning of networks without accuracy collapse even at extreme sparsities, including 99.90%, 99.95% and 99.99% on ResNet architectures. Our approach combines 1) Dynamic ReLU phasing, where DyReLU initially allows for richer parameter exploration before being gradually replaced by standard ReLU, 2) weight sharing, which reuses parameters within a residual layer while maintaining the same number of learnable parameters, and 3) cyclic sparsity, where both sparsity levels and sparsity patterns evolve dynamically throughout training to better encourage parameter exploration. We evaluate our method, which we term Extreme Adaptive Sparse Training (EAST), at extreme sparsities using ResNet-34 and ResNet-50 on CIFAR-10, CIFAR-100, and ImageNet, achieving significant performance improvements over the state-of-the-art methods we compared with.

Submitted 10 March, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

Comments: V3: moderate revisions and overall improvements, 12 pages, 6 figures, 5 tables, including supplementary material

arXiv:2405.14386 [pdf, other]

Capsule Network Projectors are Equivariant and Invariant Learners

Authors: Miles Everett, Aiden Durrant, Mingjun Zhong, Georgios Leontidis

Abstract: Learning invariant representations has been the longstanding approach to self-supervised learning. However, recent progress has been made in preserving equivariant properties in representations, yet existing methods do so with highly prescribed architectures. In this work, we propose an invariant-equivariant self-supervised architecture that employs Capsule Networks (CapsNets), which have been shown to capture equivariance with respect to novel viewpoints. We demonstrate that the use of CapsNets in equivariant self-supervised architectures achieves improved downstream performance on equivariant tasks with higher efficiency and fewer network parameters. To accommodate the architectural changes of CapsNets, we introduce a new objective function based on entropy minimisation. This approach, which we name CapsIE (Capsule Invariant Equivariant Network), achieves state-of-the-art performance on the equivariant rotation tasks on the 3DIEBench dataset compared to prior equivariant SSL methods, while performing competitively against supervised counterparts. Our results demonstrate the ability of CapsNets to learn complex and generalised representations for large-scale, multi-task datasets compared to previous CapsNet benchmarks. Code is available at https://github.com/AberdeenML/CapsIE.

Submitted 20 November, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: V3: Ignore V1 and V2 as we have fixed a bug in our code and results; 15 pages, 5 figures, 8 Tables

arXiv:2305.11701 [pdf, other]

S-JEA: Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning

Authors: Alžběta Manová, Aiden Durrant, Georgios Leontidis

Abstract: The recent emergence of Self-Supervised Learning (SSL) as a fundamental paradigm for learning image representations has demonstrated, and continues to demonstrate, high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEA), where higher-level JEAs take as input the representations of lower-level JEAs. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEAs perform on a similar level to traditional JEAs with comparable parameter counts, and visualise the representation spaces to validate the semantic hierarchies.

Submitted 4 November, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures, 3 tables; V2: fixed typos

arXiv:2305.10926 [pdf, other]

HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes

Authors: Aiden Durrant, Georgios Leontidis

Abstract: Hyperbolic manifolds for visual representation learning allow for effective learning of semantic class hierarchies by naturally embedding tree-like structures with low distortion within a low-dimensional representation space. The highly separable semantic class hierarchies produced by hyperbolic learning have been shown to be powerful in low-shot tasks; however, their application in self-supervised learning is yet to be explored fully. In this work, we explore the use of hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches. First, we extend the Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space; second, we place prototypes on the ideal boundary of the Poincaré ball. Unlike previous methods, we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic. Empirically, we demonstrate the ability of these methods to perform comparably to Euclidean methods in lower dimensions for linear evaluation tasks, whilst showing improvements in extreme few-shot learning tasks.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.01906 [pdf, other]

A Bayesian approach to identify changepoints in spatio-temporal ordered categorical data: An application to COVID-19 data

Authors: Siddharth Rawat, Abe Durrant, Adam Simpson, Grant Nielson, Candace Berrett, Soudeep Deb

Abstract: Although there is substantial literature on identifying structural changes for continuous spatio-temporal processes, the same is not true for categorical spatio-temporal data. This work bridges that gap and proposes a novel spatio-temporal model to identify changepoints in ordered categorical data. The model leverages an additive mean structure with separable Gaussian space-time processes for the latent variable. Our proposed methodology can detect significant changes in the mean structure as well as in the spatio-temporal covariance structures. We implement the model through a Bayesian framework that gives a computational edge over conventional approaches. From an application perspective, our approach's capability to handle ordinal categorical data provides an added advantage in real applications. This is illustrated using county-wise COVID-19 data (converted to categories according to CDC guidelines) from the state of New York in the USA. Our model identifies three changepoints in the transmission levels of COVID-19, which are indeed aligned with the "waves" due to specific variants encountered during the pandemic. The findings also provide interesting insights into the effects of vaccination and the extent of spatial and temporal dependence in different phases of the pandemic.

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2105.10731 [pdf]

doi 10.1016/j.ijcci.2021.100323

A systematic review of physical-digital play technology and developmentally relevant child behaviour

Authors: Pablo E. Torres, Philip I. N. Ulrich, Veronica Cucuiat, Mutlu Cukurova, Maria Fercovic De la Presa, Rose Luckin, Amanda Carr, Thomas Dylan, Abigail Durrant, John Vines, Shaun Lawson

Abstract: New interactive physical-digital play technologies are shaping the way children play. These technologies refer to digital play technologies that engage children in analogue forms of behaviour, either alone or with others. Current interactive physical-digital play technologies include robots, digital agents, mixed or augmented reality devices, and smart-eye based gaming. Little is known, however, about the ways in which these technologies could promote or damage child development. This systematic review was aimed at understanding if and how these physical-digital play technologies promoted developmentally relevant behaviour in typically developing 0 to 12 year-olds. Psychology, Education, and Computer Science databases were searched, producing 635 papers. A total of 31 papers met the inclusion criteria, of which 17 were of high enough quality to be included for synthesis. Results indicate that these new interactive play technologies could have a positive effect on children's developmentally relevant behaviour. The review indicated specific ways in which different behaviours were promoted. Providing information about own performance promoted self-monitoring. Slowing interactivity, play interdependency, and joint object accessibility promoted collaboration. Offering delimited choices promoted decision making. Problem solving and physical activity were promoted by requiring children to engage in them to keep playing. Four principles underpinned the ways in which physical-digital play technologies afforded child behaviour.
These included social expectations framing play situations, the directiveness of action regulations (inviting, guiding or forcing behaviours), the technical features of play technologies (digital play mechanics and physical characteristics), and the alignment between play goals, play technology and the play behaviours promoted.

Submitted 10 February, 2022; v1 submitted 22 May, 2021; originally announced May 2021.

Comments: 11 Tables, 1 Figure, 4 Appendices; Keywords: Systematic review, digital play, child development, child behaviour, child-computer interactions. *Corresponding author info: Faculty of Education, University of Cambridge.

arXiv:2105.00925 [pdf, other]

Hyperspherically Regularized Networks for Self-Supervision

Authors: Aiden Durrant, Georgios Leontidis

Abstract: Bootstrap Your Own Latent (BYOL) introduced an approach to self-supervised learning avoiding the contrastive paradigm and subsequently removing the computational burden of negative sampling associated with such methods. However, we empirically find that the image representations produced under BYOL's self-distillation paradigm are poorly distributed in representation space compared to contrastive methods. This work empirically demonstrates that feature diversity enforced by contrastive losses is beneficial to image representation uniformity when employed in BYOL, and as such, provides greater inter-class representation separability. Additionally, we explore and advocate the use of regularization methods, specifically the layer-wise minimization of hyperspherical energy (i.e. maximization of entropy) of network weights to encourage representation uniformity. We show that directly optimizing a measure of uniformity alongside the standard loss, or regularizing the networks of the BYOL architecture to minimize the hyperspherical energy of neurons, can produce more uniformly distributed and therefore better performing representations for downstream tasks.

Submitted 27 March, 2022; v1 submitted 29 April, 2021; originally announced May 2021.

Comments: 11 pages, 8 figures

arXiv:2104.07468 [pdf, other]

doi 10.1016/j.compag.2021.106648

The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector

Authors: Aiden Durrant, Milan Markovic, David Matthews, David May, Jessica Enright, Georgios Leontidis

Abstract: Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in general, but particularly in the agri-food sector. Protectiveness of data is natural in this setting; data is a precious commodity for data owners, which if used properly can provide them with useful insights on operations and processes leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, something that in many scenarios is unrealistic. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution to this issue by providing the infrastructure and underpinning technologies needed to use data from various sources to train models without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with the owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and provide potential use-cases in which such methods can assist in other problem settings.
Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.

Submitted 4 May, 2023; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: 23 pages, 5 figures, 7 tables || Version 2 fixed typos etc

Journal ref: Computers and Electronics in Agriculture, 2021

arXiv:quant-ph/0209067 [pdf, ps, other]

doi 10.1103/PhysRevA.67.023818

Intensity-dependent dispersion under conditions of electromagnetically induced transparency in coherently prepared multi-state atoms

Authors: Andrew D. Greentree, Derek Richards, J. A. Vaccaro, A. V. Durrant, S. R. de Echaniz, D. M. Segal, J. P. Marangos

Abstract: Interest in lossless nonlinearities has focussed on the dispersive properties of $Λ$ systems under conditions of electromagnetically induced transparency (EIT). We generalize the lambda system by introducing further degenerate states to realize a 'Chain $Λ$' atom where multiple coupling of the probe field significantly enhances the intensity dependent dispersion without compromising the EIT condition.

Submitted 18 March, 2003; v1 submitted 9 September, 2002; originally announced September 2002.

Comments: 6 pages, 7 figures, revtex 4. Two errors corrected (|D_0^{(7)}> and R^{(7)}). Replaced with version accepted in Phys. Rev. A

Journal ref: Phys. Rev. A 67, 023818 (2002)

arXiv:quant-ph/0109090 [pdf, ps, other]

doi 10.1103/PhysRevA.65.053802

Resonant and off-resonant transients in electromagnetically induced transparency: turn-on and turn-off dynamics

Authors: Andrew D. Greentree, T. B. Smith, S. R. de Echaniz, A. V. Durrant, J. P. Marangos, D. M. Segal, J. A. Vaccaro

Abstract: This paper presents a wide-ranging theoretical and experimental study of non-adiabatic transient phenomena in a $Λ$ EIT system when a strong coupling field is rapidly switched on or off. The theoretical treatment uses a Laplace transform approach to solve the time-dependent density matrix equation. The experiments are carried out in a Rb$^{87}$ MOT. The results show transient probe gain in parameter regions not previously studied, and provide insight into the transition dynamics between bare and dressed states.

Submitted 19 September, 2001; originally announced September 2001.

Comments: 19 Pages, 9 Figures, 1 in colour. RevTex 4.0 Submitted to Physical Review A

arXiv:quant-ph/0105084 [pdf, ps, other]

doi 10.1103/PhysRevA.64.055801

Observation of transient gain without population inversion in a laser-cooled rubidium lambda system

Authors: S. R. de Echaniz, Andrew D. Greentree, A. V. Durrant, D. M. Segal, J. P. Marangos, J. A. Vaccaro

Abstract: We have observed clear Rabi oscillations of a weak probe in a strongly driven three-level lambda system in laser-cooled rubidium for the first time. When the coupling field is non-adiabatically switched on using a Pockels cell, transient probe gain without population inversion is obtained in the presence of uncoupled absorptions. Our results are supported by three-state computations.

Submitted 17 May, 2001; originally announced May 2001.

Comments: 4 pages, 4 figures, experimental results

arXiv:quant-ph/0102098 [pdf, ps, other]

doi 10.1103/PhysRevA.64.013812

Observations of a doubly driven V system probed to a fourth level in laser-cooled rubidium

Authors: S. R. de Echaniz, Andrew D. Greentree, A. V. Durrant, D. M. Segal, J. P. Marangos, J. A. Vaccaro

Abstract: Observations of a doubly driven V system probed to a fourth level in an N configuration are reported. A dressed state analysis is also presented. The expected three-peak spectrum is explored in a cold rubidium sample in a magneto-optic trap. Good agreement is found between the dressed state theory and the experimental spectra once light shifts and uncoupled absorptions in the rubidium system are taken into account.

Submitted 4 May, 2001; v1 submitted 20 February, 2001; originally announced February 2001.

Comments: 8 pages, 8 figures, Replaced with version scheduled to appear in 1 July 2001 issue of Physical Review A

Report number: OUPD2001/1

arXiv:quant-ph/0002091 [pdf, ps, other]

doi 10.1088/1464-4266/2/3/306

Prospects for photon blockade in four level systems in the N configuration with more than one atom

Authors: Andrew D. Greentree, John A. Vaccaro, Sebastian R. de Echaniz, Alan V. Durrant, Jon P. Marangos

Abstract: We show that for appropriate choices of parameters it is possible to achieve photon blockade in idealised one, two and three atom systems. We also include realistic parameter ranges for rubidium as the atomic species. Our results circumvent the doubts cast by recent discussion in the literature (Grangier et al., Phys. Rev. Lett. 81, 2833 (1998); Imamoglu et al., Phys. Rev. Lett. 81, 2836 (1998)) on the possibility of photon blockade in multi-atom systems.

Submitted 29 February, 2000; originally announced February 2000.

Comments: 8 pages, revtex, 7 figures, gif. Submitted to Journal of Optics B: Quantum and Semiclassical Optics

Showing 1–15 of 15 results for author: Durrant, A