Showing 1–4 of 4 results for author: Struminsky, K

  1. arXiv:1911.10036  [pdf, other]

    cs.LG stat.ML

    Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution

    Authors: Artyom Gadetsky, Kirill Struminsky, Christopher Robinson, Novi Quadrianto, Dmitry Vetrov

    Abstract: Learning models with discrete latent variables using stochastic gradient descent remains a challenge due to the high variance of gradient estimates. Modern variance reduction techniques mostly consider categorical distributions and have limited applicability when the number of possible outcomes becomes large. In this work, we consider models with latent permutations and propose control variates fo…

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: Accepted as a conference paper at AAAI 2020. Shortened version of the paper appears at BDL NeurIPS 2019 workshop

  2. arXiv:1810.11544  [pdf, other]

    cs.LG cs.AI stat.ML

    Quantifying Learning Guarantees for Convex but Inconsistent Surrogates

    Authors: Kirill Struminsky, Simon Lacoste-Julien, Anton Osokin

    Abstract: We study consistency properties of machine learning methods based on minimizing convex surrogates. We extend the recent framework of Osokin et al. (2017) for the quantitative analysis of consistency properties to the case of inconsistent surrogates. Our key technical contribution consists in a new lower bound on the calibration function for the quadratic surrogate, which is non-trivial (not always…

    Submitted 9 January, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: Appears in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). 18 pages

  3. arXiv:1810.06943  [pdf, other]

    stat.ML cs.LG

    The Deep Weight Prior

    Authors: Andrei Atanov, Arsenii Ashukha, Kirill Struminsky, Dmitry Vetrov, Max Welling

    Abstract: Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution. In this work, we propose a new type of prior distributions for convolutional neural networks, deep weight prior (DWP), that exploit generative models to encourage a specific structure of trained convolutional f…

    Submitted 18 February, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: TL;DR: The deep weight prior learns a generative model for kernels of convolutional neural networks that acts as a prior distribution while training on new datasets

  4. arXiv:1611.09226  [pdf, other]

    cs.LG stat.ML

    Robust Variational Inference

    Authors: Michael Figurnov, Kirill Struminsky, Dmitry Vetrov

    Abstract: Variational inference is a powerful tool for approximate inference. However, it mainly focuses on the evidence lower bound as variational objective and the development of other measures for variational inference is a promising area of research. This paper proposes a robust modification of evidence and a lower bound for the evidence, which is applicable when the majority of the training set samples…

    Submitted 28 November, 2016; originally announced November 2016.

    Comments: NIPS 2016 Workshop, Advances in Approximate Bayesian Inference