Showing 1–9 of 9 results for author: Osokin, A

  1. arXiv:1912.03771

    cs.LG stat.ML

    Cost-Sensitive Training for Autoregressive Models

    Authors: Irina Saparina, Anton Osokin

    Abstract: Training autoregressive models to better predict under the test metric, instead of maximizing the likelihood, has been reported to be beneficial in several use cases but brings additional complications, which prevent wider adoption. In this paper, we follow the learning-to-search approach (Daumé III et al., 2009; Leblond et al., 2018) and investigate its several components. First, we propose a way…

    Submitted 8 December, 2019; originally announced December 2019.

  2. arXiv:1902.11088

    cs.LG stat.ML

    Scaling Matters in Deep Structured-Prediction Models

    Authors: Aleksandr Shevchenko, Anton Osokin

    Abstract: Deep structured-prediction energy-based models combine the expressive power of learned representations and the ability of embedding knowledge about the task at hand into the system. A common way to learn parameters of such models consists in a multistage procedure where different combinations of components are trained at different stages. The joint end-to-end training of the whole system is then d…

    Submitted 28 February, 2019; originally announced February 2019.

    Comments: 13 pages

  3. arXiv:1811.08725

    stat.ML cs.LG

    Marginal Weighted Maximum Log-likelihood for Efficient Learning of Perturb-and-Map models

    Authors: Tatiana Shpakova, Francis Bach, Anton Osokin

    Abstract: We consider the structured-output prediction problem through probabilistic approaches and generalize the "perturb-and-MAP" framework to more challenging weighted Hamming losses, which are crucial in applications. While in principle our approach is a straightforward marginalization, it requires solving many related MAP inference problems. We show that for log-supermodular pairwise models these oper…

    Submitted 21 November, 2018; originally announced November 2018.

    Comments: Published in Proceedings of the Conference of Uncertainty in Artificial Intelligence (UAI), 2018

  4. arXiv:1810.11544

    cs.LG cs.AI stat.ML

    Quantifying Learning Guarantees for Convex but Inconsistent Surrogates

    Authors: Kirill Struminsky, Simon Lacoste-Julien, Anton Osokin

    Abstract: We study consistency properties of machine learning methods based on minimizing convex surrogates. We extend the recent framework of Osokin et al. (2017) for the quantitative analysis of consistency properties to the case of inconsistent surrogates. Our key technical contribution consists in a new lower bound on the calibration function for the quadratic surrogate, which is non-trivial (not always…

    Submitted 9 January, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: Appears in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). 18 pages

  5. arXiv:1708.04692

    cs.CV cs.LG stat.ML

    GANs for Biological Image Synthesis

    Authors: Anton Osokin, Anatole Chessel, Rafael E. Carazo Salas, Federico Vaggi

    Abstract: In this paper, we propose a novel application of Generative Adversarial Networks (GAN) to the synthesis of cells imaged by fluorescence microscopy. Compared to natural images, cells tend to have a simpler and more geometric global structure that facilitates image generation. However, the correlation between the spatial pattern of different fluorescent proteins reflects important biological functio…

    Submitted 12 September, 2017; v1 submitted 15 August, 2017; originally announced August 2017.

    Comments: The paper appearing at the International Conference on Computer Vision (ICCV) 2017 + its supplementary materials

  6. arXiv:1706.04499

    cs.LG stat.ML

    SEARNN: Training RNNs with Global-Local Losses

    Authors: Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien

    Abstract: We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropria…

    Submitted 4 March, 2018; v1 submitted 14 June, 2017; originally announced June 2017.

    Comments: Published as a conference paper at ICLR 2018, 16 pages

  7. arXiv:1703.02403

    cs.LG stat.ML

    On Structured Prediction Theory with Calibrated Convex Surrogate Losses

    Authors: Anton Osokin, Francis Bach, Simon Lacoste-Julien

    Abstract: We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent and we prove tight bounds on the so-called "calibration function" relating the excess surrogate risk to the actual risk. In contrast to prio…

    Submitted 29 January, 2018; v1 submitted 7 March, 2017; originally announced March 2017.

    Comments: Appears in: Advances in Neural Information Processing Systems 30 (NIPS 2017). 30 pages

  8. arXiv:1605.09346

    cs.LG math.OC stat.ML

    Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

    Authors: Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet K. Dokania, Simon Lacoste-Julien

    Abstract: In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by BCFW reveal the bl…

    Submitted 30 May, 2016; originally announced May 2016.

    Comments: Appears in Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). 31 pages

    MSC Class: 90C52; 90C90; 90C06; 68T05

    ACM Class: G.1.6; I.2.6

  9. arXiv:1501.03771

    cs.CV math.OC stat.ML

    Submodular relaxation for inference in Markov random fields

    Authors: Anton Osokin, Dmitry Vetrov

    Abstract: In this paper we address the problem of finding the most probable state of a discrete Markov random field (MRF), also known as the MRF energy minimization problem. The task is known to be NP-hard in general and its practical importance motivates numerous approximate algorithms. We propose a submodular relaxation approach (SMR) based on a Lagrangian relaxation of the initial problem. Unlike the dua…

    Submitted 15 January, 2015; originally announced January 2015.

    Comments: This paper is accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence