
Showing 1–8 of 8 results for author: Smola, A

  1. arXiv:1608.06879  [pdf, other]

    math.OC cs.LG stat.ML

    AIDE: Fast and Communication Efficient Distributed Optimization

    Authors: Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczós, Alex Smola

    Abstract: In this paper, we present two new communication-efficient methods for distributed minimization of an average of functions. The first algorithm is an inexact variant of the DANE algorithm that allows any local algorithm to return an approximate solution to a local subproblem. We show that such a strategy does not affect the theoretical guarantees of DANE significantly. In fact, our approach can be…

    Submitted 24 August, 2016; originally announced August 2016.

  2. arXiv:1607.08254  [pdf, other]

    math.OC cs.LG stat.ML

    Stochastic Frank-Wolfe Methods for Nonconvex Optimization

    Authors: Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

    Abstract: We study Frank-Wolfe methods for nonconvex stochastic and finite-sum optimization problems. Frank-Wolfe methods (in the convex case) have gained tremendous recent interest in machine learning and optimization communities due to their projection-free property and their ability to exploit structured constraints. However, our understanding of these algorithms in the nonconvex setting is fairly limite…

    Submitted 29 July, 2016; v1 submitted 27 July, 2016; originally announced July 2016.

  3. arXiv:1605.06900  [pdf, other]

    math.OC cs.LG stat.ML

    Fast Stochastic Methods for Nonsmooth Nonconvex Optimization

    Authors: Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

    Abstract: We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonconvex part is smooth and the nonsmooth part is convex. Surprisingly, unlike the smooth case, our knowledge of this fundamental problem is very limited. For example, it is not known whether the proximal stochastic gradient method with constant minibatch converges to a stationary point. To tackle…

    Submitted 23 May, 2016; originally announced May 2016.

  4. arXiv:1603.06160  [pdf, other]

    math.OC cs.LG cs.NE stat.ML

    Stochastic Variance Reduction for Nonconvex Optimization

    Authors: Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola

    Abstract: We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary po…

    Submitted 4 April, 2016; v1 submitted 19 March, 2016; originally announced March 2016.

    Comments: Minor feedback changes

  5. arXiv:1603.06159  [pdf, other]

    math.OC cs.LG stat.ML

    Fast Incremental Method for Nonconvex Optimization

    Authors: Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alex Smola

    Abstract: We analyze a fast incremental aggregated gradient method for optimizing nonconvex problems of the form $\min_x \sum_i f_i(x)$. Specifically, we analyze the SAGA algorithm within an Incremental First-order Oracle framework, and show that it converges to a stationary point provably faster than both gradient descent and stochastic gradient descent. We also discuss a Polyak's special class of nonconve…

    Submitted 19 March, 2016; originally announced March 2016.

  6. arXiv:1508.05003  [pdf, other]

    stat.ML cs.LG math.OC

    AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization

    Authors: Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

    Abstract: We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients. We discuss, analyze, and experiment with a setup motivated by the behavior of real-world distributed computation networks, where the machines are differently slow at different time. Therefore, we allow the parame…

    Submitted 20 August, 2015; originally announced August 2015.

    Comments: 19 pages

  7. arXiv:0911.0491  [pdf, ps, other]

    math.OC stat.ML

    Slow Learners are Fast

    Authors: John Langford, Alexander Smola, Martin Zinkevich

    Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online lear…

    Submitted 3 November, 2009; originally announced November 2009.

    Comments: Extended version of conference paper - NIPS 2009

    MSC Class: 49M30; 80M50

  8. Kernel methods in machine learning

    Authors: Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola

    Abstract: We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of function has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowin…

    Submitted 1 July, 2008; v1 submitted 30 January, 2007; originally announced January 2007.

    Comments: Published at http://dx.doi.org/10.1214/009053607000000677 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS0290

    MSC Class: 30C40 (Primary) 68T05 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 3, 1171-1220