Showing 1–10 of 10 results for author: Ashukha, A

  1. arXiv:2110.13523

    cs.LG cs.AI cs.RO stat.ML

    Automating Control of Overestimation Bias for Reinforcement Learning

    Authors: Arsenii Kuznetsov, Alexander Grishin, Artem Tsypin, Arsenii Ashukha, Artur Kadurin, Dmitry Vetrov

    Abstract: Overestimation bias control techniques are used by the majority of high-performing off-policy reinforcement learning algorithms. However, most of these techniques rely on pre-defined bias correction policies that are either not flexible enough or require environment-specific tuning of hyperparameters. In this work, we present a general data-driven approach for the automatic selection of bias contr…

    Submitted 28 January, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  2. arXiv:2109.07161

    cs.CV eess.IV

    Resolution-robust Large Mask Inpainting with Fourier Convolutions

    Authors: Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, Victor Lempitsky

    Abstract: Modern image inpainting systems, despite the significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa…

    Submitted 10 November, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: Winter Conference on Applications of Computer Vision (WACV 2022)

  3. arXiv:2106.08038

    cs.LG cs.CV

    Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations

    Authors: Arsenii Ashukha, Andrei Atanov, Dmitry Vetrov

    Abstract: Averaging predictions over a set of models -- an ensemble -- is widely used to improve predictive performance and uncertainty estimation of deep learning models. At the same time, many machine learning systems, such as search, matching, and recommendation systems, heavily rely on embeddings. Unfortunately, due to misalignment of features of independently trained models, embeddings cannot be impro…

    Submitted 14 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

  4. arXiv:2002.09103

    stat.ML cs.CV cs.LG

    Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation

    Authors: Dmitry Molchanov, Alexander Lyzhov, Yuliya Molchanova, Arsenii Ashukha, Dmitry Vetrov

    Abstract: Test-time data augmentation -- averaging the predictions of a machine learning model across multiple augmented samples of data -- is a widely used technique that improves the predictive performance. While many advanced learnable data augmentation techniques have emerged in recent years, they are focused on the training phase. Such techniques are not necessarily optimal for test-time augmentation and…

    Submitted 20 June, 2020; v1 submitted 20 February, 2020; originally announced February 2020.
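
    Note: the definition of test-time augmentation quoted in the abstract above (averaging a model's predictions over several augmented copies of one input) can be illustrated with a minimal sketch. Everything named here (`toy_model`, `identity`, `flip`, `tta_predict`) is a hypothetical stand-in chosen for illustration; it is not the paper's code and does not implement its greedy policy search.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def toy_model(x: Sequence[float]) -> Vector:
    """Hypothetical two-class 'model': class scores come from the first and last feature."""
    a, b = abs(x[0]), abs(x[-1])
    total = a + b
    if total == 0:
        return [0.5, 0.5]
    return [a / total, b / total]


def identity(x: Sequence[float]) -> Vector:
    """Augmentation that leaves the input unchanged."""
    return list(x)


def flip(x: Sequence[float]) -> Vector:
    """Augmentation that reverses the feature order (a stand-in for an image flip)."""
    return list(reversed(x))


def tta_predict(
    model: Callable[[Sequence[float]], Vector],
    x: Sequence[float],
    augmentations: Sequence[Callable[[Sequence[float]], Vector]],
) -> Vector:
    """Average the model's predictions over every augmented copy of x."""
    preds = [model(aug(x)) for aug in augmentations]
    # zip(*preds) groups the per-class probabilities across augmentations.
    return [sum(column) / len(column) for column in zip(*preds)]


if __name__ == "__main__":
    sample = [3.0, 1.0, 1.0]
    print(tta_predict(toy_model, sample, [identity, flip]))
```

    The set of augmentations is fixed by hand in this sketch; choosing that set in a learned way is the question the paper itself addresses.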

  5. arXiv:2002.06470

    stat.ML cs.LG

    Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning

    Authors: Arsenii Ashukha, Alexander Lyzhov, Dmitry Molchanov, Dmitry Vetrov

    Abstract: Uncertainty estimation and ensembling methods go hand-in-hand. Uncertainty estimation is one of the main benchmarks for assessment of ensembling performance. At the same time, deep learning ensembles have provided state-of-the-art results in uncertainty estimation. In this work, we focus on in-domain uncertainty for image classification. We explore the standards for its quantification and point ou…

    Submitted 18 July, 2021; v1 submitted 15 February, 2020; originally announced February 2020.

    Journal ref: Eighth International Conference on Learning Representations (ICLR 2020)

  6. arXiv:1905.00505

    stat.ML cs.LG

    Semi-Conditional Normalizing Flows for Semi-Supervised Learning

    Authors: Andrei Atanov, Alexandra Volokhova, Arsenii Ashukha, Ivan Sosnovik, Dmitry Vetrov

    Abstract: This paper proposes a semi-conditional normalizing flow model for semi-supervised learning. The model uses both labeled and unlabeled data to learn an explicit model of the joint distribution over objects and labels. The semi-conditional architecture of the model allows us to efficiently compute the value and gradients of the marginal likelihood for unlabeled objects. The conditional part of the model is b…

    Submitted 22 June, 2020; v1 submitted 1 May, 2019; originally announced May 2019.

  7. arXiv:1810.06943

    stat.ML cs.LG

    The Deep Weight Prior

    Authors: Andrei Atanov, Arsenii Ashukha, Kirill Struminsky, Dmitry Vetrov, Max Welling

    Abstract: Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution. In this work, we propose a new type of prior distribution for convolutional neural networks, the deep weight prior (DWP), that exploits generative models to encourage a specific structure of trained convolutional f…

    Submitted 18 February, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: TL;DR: The deep weight prior learns a generative model for kernels of convolutional neural networks that acts as a prior distribution while training on new datasets

  8. arXiv:1802.07329

    stat.ML cs.LG

    Bayesian Incremental Learning for Deep Neural Networks

    Authors: Max Kochurov, Timur Garipov, Dmitry Podoprikhin, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov

    Abstract: In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptimal solution when trained on only new data as compar…

    Submitted 27 March, 2018; v1 submitted 20 February, 2018; originally announced February 2018.

  9. arXiv:1802.04893

    stat.ML cs.LG

    Uncertainty Estimation via Stochastic Batch Normalization

    Authors: Andrei Atanov, Arsenii Ashukha, Dmitry Molchanov, Kirill Neklyudov, Dmitry Vetrov

    Abstract: In this work, we investigate the Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during training and testing. However, inference becomes computationally ineff…

    Submitted 20 March, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: Under review as a workshop paper at ICLR 2018

    Journal ref: Workshop track - ICLR 2018

  10. arXiv:1701.05369

    stat.ML cs.LG

    Variational Dropout Sparsifies Deep Neural Networks

    Authors: Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov

    Abstract: We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse soluti…

    Submitted 13 June, 2017; v1 submitted 19 January, 2017; originally announced January 2017.

    Comments: Published in ICML 2017