Showing 1–8 of 8 results for author: Ajanthan, T

  1. arXiv:2006.12807  [pdf, other]

    cs.LG cs.CV stat.ML

    Post-hoc Calibration of Neural Networks by g-Layers

    Authors: Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan, Cristian Sminchisescu, Richard Hartley

    Abstract: Calibration of neural networks is a critical aspect to consider when incorporating machine learning models in real-world decision-making systems where the confidence of decisions are equally important as the decisions themselves. In recent years, there is a surge of research on neural network calibration and the majority of the works can be categorized into post-hoc calibration methods, defined as…

    Submitted 21 February, 2022; v1 submitted 23 June, 2020; originally announced June 2020.

  2. arXiv:2006.12800  [pdf, other]

    cs.LG cs.CV stat.ML

    Calibration of Neural Networks using Splines

    Authors: Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, Richard Hartley

    Abstract: Calibrating neural networks is of utmost importance when employing them in safety-critical applications where the downstream decision making depends on the predicted probabilities. Measuring calibration error amounts to comparing two empirical distributions. In this work, we introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test in which the…

    Submitted 29 December, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: ICLR 2021

  3. arXiv:2006.12169  [pdf, other]

    cs.LG cs.NE stat.ML

    Bidirectionally Self-Normalizing Neural Networks

    Authors: Yao Lu, Stephen Gould, Thalaiyasingam Ajanthan

    Abstract: The problem of vanishing and exploding gradients has been a long-standing obstacle that hinders the effective training of neural networks. Despite various tricks and techniques that have been employed to alleviate the problem in practice, there still lacks satisfactory theories or provable solutions. In this paper, we address the problem from the perspective of high-dimensional probability theory.…

    Submitted 2 December, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

  4. arXiv:2003.11316  [pdf, other]

    cs.LG stat.ML

    Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training

    Authors: Namhoon Lee, Thalaiyasingam Ajanthan, Philip H. S. Torr, Martin Jaggi

    Abstract: We study two factors in neural network training: data parallelism and sparsity; here, data parallelism means processing training data in parallel using distributed systems (or equivalently increasing batch size), so that training can be accelerated; for sparsity, we refer to pruning parameters in a neural network model, so as to reduce computational and memory cost. Despite their promising benefit…

    Submitted 2 April, 2021; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: ICLR 2021

  5. arXiv:1910.08237  [pdf, other]

    cs.LG cs.CV stat.ML

    Mirror Descent View for Neural Network Quantization

    Authors: Thalaiyasingam Ajanthan, Kartik Gupta, Philip H. S. Torr, Richard Hartley, Puneet K. Dokania

    Abstract: Quantizing large Neural Networks (NN) while maintaining the performance is highly desirable for resource-limited devices due to reduced memory and time complexity. It is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. In this work, by interpreting the continuous parameters (unconstrained) as the dual of the quantized ones, we intro…

    Submitted 2 March, 2021; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: This paper was accepted at AISTATS 2021

  6. arXiv:1906.06307  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    A Signal Propagation Perspective for Pruning Neural Networks at Initialization

    Authors: Namhoon Lee, Thalaiyasingam Ajanthan, Stephen Gould, Philip H. S. Torr

    Abstract: Network pruning is a promising avenue for compressing deep neural networks. A typical approach to pruning starts by training a model and then removing redundant parameters while minimizing the impact on what is learned. Alternatively, a recent approach shows that pruning can be done at initialization prior to training, based on a saliency criterion called connection sensitivity. However, it remain…

    Submitted 16 February, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: ICLR 2020

  7. arXiv:1902.10486  [pdf, other]

    cs.LG stat.ML

    On Tiny Episodic Memories in Continual Learning

    Authors: Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K. Dokania, Philip H. S. Torr, Marc'Aurelio Ranzato

    Abstract: In continual learning (CL), an agent learns from a stream of tasks leveraging prior experience to transfer knowledge to future tasks. It is an ideal framework to decrease the amount of supervision in the existing learning algorithms. But for a successful knowledge transfer, the learner needs to remember how to perform previous tasks. One way to endow the learner the ability to perform tasks seen i…

    Submitted 4 June, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: Making the main point of the paper more clear

  8. arXiv:1804.06364  [pdf, other]

    cs.CV stat.ML

    DGPose: Deep Generative Models for Human Body Analysis

    Authors: Rodrigo de Bem, Arnab Ghosh, Thalaiyasingam Ajanthan, Ondrej Miksik, Adnane Boukhayma, N. Siddharth, Philip Torr

    Abstract: Deep generative modelling for human body analysis is an emerging problem with many interesting applications. However, the latent space learned by such approaches is typically not interpretable, resulting in less flexibility. In this work, we present deep generative models for human body analysis in which the body pose and the visual appearance are disentangled. Such a disentanglement allows indepe…

    Submitted 14 February, 2020; v1 submitted 17 April, 2018; originally announced April 2018.

    Comments: IJCV 2020 special issue on 'Generating Realistic Visual Data of Human Behavior' preprint. Keywords: deep generative models, semi-supervised learning, human pose estimation, variational autoencoders, generative adversarial networks