Showing 1–45 of 45 results for author: Harchaoui, Z

  1. arXiv:2505.07647  [pdf, ps, other]

    math.PR stat.ML

    Langevin Diffusion Approximation to Same Marginal Schrödinger Bridge

    Authors: Medha Agarwal, Zaid Harchaoui, Garrett Mulcahy, Soumik Pal

    Abstract: We introduce a novel approximation to the same marginal Schrödinger bridge using the Langevin diffusion. As $\varepsilon \downarrow 0$, it is known that the barycentric projection (also known as the entropic Brenier map) of the Schrödinger bridge converges to the Brenier map, which is the identity. Our diffusion approximation is leveraged to show that, under suitable assumptions, the difference be…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: These results have been extracted from the first half of arXiv:2406.10823, where they first appeared. The rest of arXiv:2406.10823 will be modified to reflect this change

    MSC Class: 49N99; 49Q22; 60J60

  2. arXiv:2412.07905  [pdf, other]

    stat.ME stat.ML

    Spectral Differential Network Analysis for High-Dimensional Time Series

    Authors: Michael Hellstern, Byol Kim, Zaid Harchaoui, Ali Shojaie

    Abstract: Spectral networks derived from multivariate time series data arise in many domains, from brain science to Earth science. Often, it is of interest to study how these networks change under different conditions. For instance, to better understand epilepsy, it would be interesting to capture the changes in the brain connectivity network as a patient experiences a seizure, using electroencephalography…

    Submitted 10 December, 2024; originally announced December 2024.

    Comments: 23 pages, 13 figures

  3. arXiv:2408.15065  [pdf, other]

    stat.ML cs.LG math.ST

    The Benefits of Balance: From Information Projections to Variance Reduction

    Authors: Lang Liu, Ronak Mehta, Soumik Pal, Zaid Harchaoui

    Abstract: Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g. in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected benefit: variance reduction. We present a non-asymptotic statistical bound that quantifies this variance reduction effect and relates it to the eigenvalue decay…

    Submitted 11 February, 2025; v1 submitted 27 August, 2024; originally announced August 2024.

  4. arXiv:2406.10823  [pdf, other]

    math.PR stat.ML

    Iterated Schrödinger bridge approximation to Wasserstein Gradient Flows

    Authors: Medha Agarwal, Zaid Harchaoui, Garrett Mulcahy, Soumik Pal

    Abstract: We introduce a novel discretization scheme for Wasserstein gradient flows that involves successively computing Schrödinger bridges with the same marginals. This is different from both the forward/geodesic approximation and the backward/Jordan-Kinderlehrer-Otto (JKO) approximations. The proposed scheme has two advantages: one, it avoids the use of the score function, and, two, it is amenable to par…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 36 pages, 1 figure

    MSC Class: 49N99; 49Q22; 60J60

  5. arXiv:2403.10763  [pdf, other]

    stat.ML cs.LG math.OC

    Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization

    Authors: Ronak Mehta, Jelena Diakonikolas, Zaid Harchaoui

    Abstract: We consider the penalized distributionally robust optimization (DRO) problem with a closed, convex uncertainty set, a setting that encompasses learning using $f$-DRO and spectral/$L$-risk minimization. We present Drago, a stochastic primal-dual algorithm that combines cyclic and randomized components with a carefully regularized primal update to achieve dual variance reduction. Owing to its design…

    Submitted 11 February, 2025; v1 submitted 15 March, 2024; originally announced March 2024.

  6. arXiv:2310.13863  [pdf, other]

    stat.ML cs.LG math.OC

    Distributionally Robust Optimization with Bias and Variance Reduction

    Authors: Ronak Mehta, Vincent Roulet, Krishna Pillutla, Zaid Harchaoui

    Abstract: We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized conditional value-at-risk (CVaR) and average top-$k$ loss. We present Prospect, a stochastic gradient-based algorithm that only requires tuning a single learning rate hyperparame…

    Submitted 20 October, 2023; originally announced October 2023.

  7. arXiv:2301.00260  [pdf, other]

    math.ST stat.ML

    Confidence Sets under Generalized Self-Concordance

    Authors: Lang Liu, Zaid Harchaoui

    Abstract: This paper revisits a fundamental problem in statistical inference from a non-asymptotic theoretical viewpoint: the construction of confidence sets. We establish a finite-sample bound for the estimator, characterizing its asymptotic behavior in a non-asymptotic fashion. An important feature of our bound is that its dimension dependency is captured by the effective dimension…

    Submitted 31 December, 2022; originally announced January 2023.

  8. arXiv:2212.05149  [pdf, other]

    stat.ML cs.LG math.OC

    Stochastic Optimization for Spectral Risk Measures

    Authors: Ronak Mehta, Vincent Roulet, Krishna Pillutla, Lang Liu, Zaid Harchaoui

    Abstract: Spectral risk objectives - also called $L$-risks - allow for learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task. We develop stochastic algorithms to optimize these quantities by characterizing their subdifferential and addressing challenges such as biasedness of subgradient estimates and non-smoothnes…

    Submitted 9 December, 2022; originally announced December 2022.

  9. arXiv:2212.04014  [pdf, other]

    stat.ML cs.LG math.ST

    Statistical and Computational Guarantees for Influence Diagnostics

    Authors: Jillian Fisher, Lang Liu, Krishna Pillutla, Yejin Choi, Zaid Harchaoui

    Abstract: Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and in AI domain applications. Influence diagnostics are powerful statistical tools to identify influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approx…

    Submitted 19 September, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: For AISTATS 2023. Software see https://github.com/jfisher52/influence_theory

  10. arXiv:2210.00422  [pdf, ps, other]

    math.PR cs.LG stat.ML

    Stochastic optimization on matrices and a graphon McKean-Vlasov limit

    Authors: Zaid Harchaoui, Sewoong Oh, Soumik Pal, Raghav Somani, Raghavendra Tripathi

    Abstract: We consider stochastic gradient descents on the space of large symmetric matrices of suitable functions that are invariant under permuting the rows and columns using the same permutation. We establish deterministic limits of these random curves as the dimensions of the matrices go to infinity while the entries remain bounded. Under a "small noise" assumption the limit is shown to be the gradient…

    Submitted 27 May, 2024; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: 37 pages+ references, introduction modified and new examples added. Improved presentation

    MSC Class: 05C60; 05C63; 05C80; 68R10; 60K35; 60G09

  11. arXiv:2205.00350  [pdf, other]

    stat.ML cs.IT cs.LG

    Orthogonal Statistical Learning with Self-Concordant Loss

    Authors: Lang Liu, Carlos Cinelli, Zaid Harchaoui

    Abstract: Orthogonal statistical learning and double machine learning have emerged as general frameworks for two-stage statistical prediction in the presence of a nuisance component. We establish non-asymptotic bounds on the excess risk of orthogonal statistical learning methods with a loss function satisfying a self-concordance property. Our bounds improve upon existing bounds by a dimension factor while l…

    Submitted 19 June, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

    Comments: COLT 2022

  12. arXiv:2203.03756  [pdf, other]

    cs.LG math.OC stat.ML

    Flat minima generalize for low-rank matrix recovery

    Authors: Lijun Ding, Dmitriy Drusvyatskiy, Maryam Fazel, Zaid Harchaoui

    Abstract: Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima -- those around which the loss grows slowly -- appear to generalize well. This work takes a step towards understanding this phenomenon by focusing on the simplest class of overparameter…

    Submitted 17 February, 2023; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: 36 pages

  13. arXiv:2112.15595  [pdf, other]

    stat.ML cs.LG math.PR

    Triangular Flows for Generative Modeling: Statistical Consistency, Smoothness Classes, and Fast Rates

    Authors: Nicholas J. Irons, Meyer Scetbon, Soumik Pal, Zaid Harchaoui

    Abstract: Triangular flows, also known as Knöthe-Rosenblatt measure couplings, comprise an important building block of normalizing flow models for generative modeling and density estimation, including popular autoregressive flow models such as real-valued non-volume preserving transformation models (Real NVP). We present statistical guarantees and sample complexity bounds for triangular flow statistical mod…

    Submitted 31 December, 2021; originally announced December 2021.

  14. arXiv:2112.15265  [pdf, other]

    stat.ML cs.LG

    Entropy Regularized Optimal Transport Independence Criterion

    Authors: Lang Liu, Soumik Pal, Zaid Harchaoui

    Abstract: We introduce an independence criterion based on entropy regularized optimal transport. Our criterion can be used to test for independence between two samples. We establish non-asymptotic bounds for our test statistic and study its statistical behavior under both the null hypothesis and the alternative hypothesis. The theoretical results involve tools from U-process theory and optimal transport the…

    Submitted 19 April, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

  15. arXiv:2112.09429  [pdf, other]

    cs.LG math.OC stat.ML

    Federated Learning with Superquantile Aggregation for Heterogeneous Data

    Authors: Krishna Pillutla, Yassine Laguel, Jérôme Malick, Zaid Harchaoui

    Abstract: We present a federated learning framework that is designed to robustly deliver good predictive performance across individual clients with heterogeneous data. The proposed approach hinges upon a superquantile-based learning objective that captures the tail statistics of the error distribution over heterogeneous clients. We present a stochastic training algorithm that interleaves differentially priv…

    Submitted 6 December, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: Machine Learning Journal, Special Issue on Safe and Fair Machine Learning (To appear)

    Journal ref: Machine Learning (2023): 1-68

  16. arXiv:2106.14122  [pdf, other]

    stat.ML cs.LG

    Score-Based Change Detection for Gradient-Based Learning Machines

    Authors: Lang Liu, Joseph Salmon, Zaid Harchaoui

    Abstract: The widespread use of machine learning algorithms calls for automatic change detection algorithms to monitor their behavior over time. As a machine learning algorithm learns from a continuous, possibly evolving, stream of data, it is desirable and often critical to supplement it with a companion change detection algorithm to facilitate its monitoring and control. We present a generic score-based c…

    Submitted 26 June, 2021; originally announced June 2021.

  17. arXiv:2106.07898  [pdf, other]

    stat.ML cs.LG

    Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals

    Authors: Lang Liu, Krishna Pillutla, Sean Welleck, Sewoong Oh, Yejin Choi, Zaid Harchaoui

    Abstract: The spectacular success of deep generative models calls for quantitative tools to measure their statistical performance. Divergence frontiers have recently been proposed as an evaluation framework for generative models, due to their ability to measure the quality-diversity trade-off inherent to deep generative modeling. We establish non-asymptotic bounds on the sample complexity of divergence fron…

    Submitted 11 December, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

  18. arXiv:2012.15458  [pdf, other]

    math.OC cs.LG stat.ML

    Differentiable Programming à la Moreau

    Authors: Vincent Roulet, Zaid Harchaoui

    Abstract: The notion of a Moreau envelope is central to the analysis of first-order optimization algorithms for machine learning. Yet, it has not been developed and extended to be applied to a deep network and, more broadly, to a machine learning system with a differentiable programming implementation. We define a compositional calculus adapted to Moreau envelopes and show how to integrate it within differe…

    Submitted 11 December, 2022; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Short version appeared in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  19. arXiv:2012.06684  [pdf, other]

    cs.LG stat.ML

    Faster Policy Learning with Continuous-Time Gradients

    Authors: Samuel Ainsworth, Kendall Lowrey, John Thickstun, Zaid Harchaoui, Siddhartha Srinivasa

    Abstract: We study the estimation of policy gradients for continuous-time systems with known dynamics. By reframing policy learning in continuous-time, we show that it is possible to construct a more efficient and accurate gradient estimator. The standard back-propagation through time estimator (BPTT) computes exact gradients for a crude discretization of the continuous-time system. In contrast, we approximate…

    Submitted 24 June, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Journal ref: L4DC 2021

  20. arXiv:2011.08963  [pdf, ps, other]

    math.PR stat.ML

    Asymptotics of Discrete Schrödinger Bridges via Chaos Decomposition

    Authors: Zaid Harchaoui, Lang Liu, Soumik Pal

    Abstract: Consider the problem of matching two independent i.i.d. samples of size $N$ from two distributions $P$ and $Q$ in $\mathbb{R}^d$. For an arbitrary continuous cost function, the optimal assignment problem looks for the matching that minimizes the total cost. We consider instead in this paper the problem where each matching is endowed with a Gibbs probability weight proportional to the exponential o…

    Submitted 31 December, 2022; v1 submitted 17 November, 2020; originally announced November 2020.

    MSC Class: 46N10; 60J35; 60F17; 62G20

  21. arXiv:2009.14575  [pdf, other]

    math.OC cs.LG stat.ML

    First-order Optimization for Superquantile-based Supervised Learning

    Authors: Yassine Laguel, Jérôme Malick, Zaid Harchaoui

    Abstract: Classical supervised learning via empirical risk (or negative log-likelihood) minimization hinges upon the assumption that the testing distribution coincides with the training distribution. This assumption can be challenged in modern applications of machine learning in which learning machines may operate at prediction time with testing data whose distribution departs from the one of the training d…

    Submitted 1 October, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: 6 pages, 2 figures, 2 tables, presented at IEEE MLSP

  22. arXiv:2003.12756  [pdf, other]

    stat.ML cs.LG

    Harmonic Decompositions of Convolutional Networks

    Authors: Meyer Scetbon, Zaid Harchaoui

    Abstract: We present a description of the function space and the smoothness class associated with a convolutional network using the machinery of reproducing kernel Hilbert spaces. We show that the mapping associated with a convolutional network expands into a sum involving elementary functions akin to spherical harmonics. This functional decomposition can be related to the functional ANOVA decomposition in…

    Submitted 16 November, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

  23. arXiv:2002.12640  [pdf, other]

    stat.ML cs.LG

    A Spectral Analysis of Dot-product Kernels

    Authors: Meyer Scetbon, Zaid Harchaoui

    Abstract: We present eigenvalue decay estimates of integral operators associated with compositional dot-product kernels. The estimates improve on previous ones established for power series kernels on spheres. This allows us to obtain the volumes of balls in the corresponding reproducing kernel Hilbert spaces. We discuss the consequences on statistical estimation with compositional dot product kernels and hi…

    Submitted 26 February, 2021; v1 submitted 28 February, 2020; originally announced February 2020.

  24. arXiv:2002.11223  [pdf, other]

    stat.ML cs.DC cs.LG math.OC

    Device Heterogeneity in Federated Learning: A Superquantile Approach

    Authors: Yassine Laguel, Krishna Pillutla, Jérôme Malick, Zaid Harchaoui

    Abstract: We propose a federated learning framework to handle heterogeneous client devices which do not conform to the population data distribution. The approach hinges upon a parameterized superquantile-based objective, where the parameter ranges over levels of conformity. We present an optimization algorithm and establish its convergence to a stationary point. We show how to practically implement it using…

    Submitted 25 February, 2020; originally announced February 2020.

    Journal ref: Machine Learning (2023): 1-68

  25. arXiv:2002.09051  [pdf, ps, other]

    cs.LG stat.ML

    An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks

    Authors: Vincent Roulet, Zaid Harchaoui

    Abstract: We present an approach to obtain convergence guarantees of optimization algorithms for deep networks based on elementary arguments and computations. The convergence analysis revolves around the analytical and computational structures of optimization oracles central to the implementation of deep networks in machine learning software. We provide a systematic way to compute estimates of the smoothnes…

    Submitted 29 December, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: The changes from v1 to v2 include i) slightly more general results; ii) slightly more concise proofs; iii) highway and residual networks; iv) implicitly defined network layers; v) additional algorithm boxes and illustration figures

  26. arXiv:1912.13445  [pdf, other]

    stat.ML cs.CR cs.LG

    Robust Aggregation for Federated Learning

    Authors: Krishna Pillutla, Sham M. Kakade, Zaid Harchaoui

    Abstract: Federated learning is the centralized training of statistical models from decentralized data on mobile devices while preserving the privacy of each device. We present a robust aggregation approach to make federated learning robust to settings when a fraction of the devices may be sending corrupted updates to the server. The approach relies on a robust aggregation oracle based on the geometric medi…

    Submitted 17 January, 2022; v1 submitted 31 December, 2019; originally announced December 2019.

    Journal ref: IEEE Transactions on Signal Processing 70 (2022): 1142-1154

  27. Discriminative Clustering with Representation Learning with any Ratio of Labeled to Unlabeled Data

    Authors: Corinne Jones, Vincent Roulet, Zaid Harchaoui

    Abstract: We present a discriminative clustering approach in which the feature representation can be learned from data and moreover leverage labeled data. Representation learning can give a similarity-based clustering method the ability to automatically adapt to an underlying, yet hidden, geometric structure of the data. The proposed approach augments the DIFFRAC method with a representation learning capabi…

    Submitted 17 February, 2023; v1 submitted 30 December, 2019; originally announced December 2019.

    Comments: Published in Statistics and Computing, 2022

    Journal ref: Stat Comput 32, 17 (2022)

  28. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  29. arXiv:1904.03834  [pdf, other]

    stat.ML cs.LG cs.SD eess.AS

    A Statistical Investigation of Long Memory in Language and Music

    Authors: Alexander Greaves-Tunnell, Zaid Harchaoui

    Abstract: Representation and learning of long-range dependencies is a central challenge confronted in modern applications of machine learning to sequence data. Yet despite the prominence of this issue, the basic problem of measuring long-range dependence, either in a given data source or as represented in a trained deep model, remains largely limited to heuristic tools. We contribute a statistical framework…

    Submitted 6 June, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: 29 pages; expanded supplement, added details in background and methods per reviewer feedback, included additional references

  30. arXiv:1903.08131  [pdf, other]

    stat.ML cs.LG math.OC

    Kernel-based Translations of Convolutional Networks

    Authors: Corinne Jones, Vincent Roulet, Zaid Harchaoui

    Abstract: Convolutional Neural Networks, as most artificial neural networks, are commonly viewed as methods different in essence from kernel-based methods. We provide a systematic translation of Convolutional Neural Networks (ConvNets) into their kernel-based counterparts, Convolutional Kernel Networks (CKNs), and demonstrate that this perception is unfounded both formally and empirically. We show that, giv…

    Submitted 19 March, 2019; originally announced March 2019.

  31. arXiv:1902.03228  [pdf, other]

    stat.ML cs.LG math.OC

    A Smoother Way to Train Structured Prediction Models

    Authors: Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui

    Abstract: We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon. Smoothing overcomes the non-smoothness inherent to the maximum margin structured prediction objective, and paves the way for the use of fast primal gradient-based optimization algorithms. We illustrate the proposed framework by developing a novel primal incremental optim…

    Submitted 8 February, 2019; originally announced February 2019.

    Comments: Short version appeared in Neural Information Processing Systems (NeurIPS) 2018

  32. arXiv:1811.08045  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Coupled Recurrent Models for Polyphonic Music Composition

    Authors: John Thickstun, Zaid Harchaoui, Dean P. Foster, Sham M. Kakade

    Abstract: This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music. We propose an efficient new conditional probabilistic factorization of musical scores, viewing a score as a collection of concurrent, coupled sequences: i.e. voices. To model the conditional distributions, we borrow ideas from both convolutional and recurrent neural models; we…

    Submitted 26 November, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: 13 pages; long version of the paper appearing in ISMIR 2019

  33. arXiv:1806.04028  [pdf, ps, other]

    math.ST stat.ML

    Adaptive Denoising of Signals with Local Shift-Invariant Structure

    Authors: Zaid Harchaoui, Anatoli Juditsky, Arkadi Nemirovski, Dmitrii Ostrovskii

    Abstract: We discuss the problem of adaptive discrete-time signal denoising in the situation where the signal to be recovered admits a "linear oracle" -- an unknown linear estimate that takes the form of convolution of observations with a time-invariant filter. It was shown by Juditsky and Nemirovski (2009) that when the $\ell_2$-norm of the oracle filter is small enough, such oracle can be "mimicked" by an…

    Submitted 11 February, 2021; v1 submitted 11 June, 2018; originally announced June 2018.

    Comments: 39 pages

  34. arXiv:1803.11262  [pdf, other]

    math.ST math.OC stat.ML

    Efficient First-Order Algorithms for Adaptive Signal Denoising

    Authors: Dmitrii Ostrovskii, Zaid Harchaoui

    Abstract: We consider the problem of discrete-time signal denoising, focusing on a specific family of non-linear convolution-type estimators. Each such estimator is associated with a time-invariant filter which is obtained adaptively, by solving a certain convex optimization problem. Adaptive convolution-type estimators were demonstrated to have favorable statistical properties. However, the question of the…

    Submitted 12 June, 2018; v1 submitted 29 March, 2018; originally announced March 2018.

    Comments: 27 pages, 5 figures

  35. arXiv:1712.05654  [pdf, other]

    stat.ML math.OC

    Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice

    Authors: Hongzhou Lin, Julien Mairal, Zaid Harchaoui

    Abstract: We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieve acceleration…

    Submitted 19 June, 2018; v1 submitted 15 December, 2017; originally announced December 2017.

    Comments: link to publisher website: http://jmlr.org/papers/volume18/17-748/17-748.pdf

    Journal ref: Journal of Machine Learning Research (JMLR), 18(212):1--54, 2018

  36. arXiv:1711.04845  [pdf, other]

    stat.ML cs.LG cs.SD eess.AS

    Invariances and Data Augmentation for Supervised Music Transcription

    Authors: John Thickstun, Zaid Harchaoui, Dean Foster, Sham M. Kakade

    Abstract: This paper explores a variety of models for frame-based music transcription, with an emphasis on the methods needed to reach state-of-the-art on human recordings. The translation-invariant network discussed in this paper, which combines a traditional filterbank with a convolutional neural network, was the top-performing model in the 2017 MIREX Multiple Fundamental Frequency Estimation evaluation.…

    Submitted 13 November, 2017; originally announced November 2017.

    Comments: 6 pages

  37. arXiv:1703.10993  [pdf, other]

    stat.ML math.OC

    Catalyst Acceleration for Gradient-Based Non-Convex Optimization

    Authors: Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui

    Abstract: We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed approach allows one to use them on weakly convex objectives, which covers a large class of non-convex functions typically appearing in machine learning and sign…

    Submitted 31 December, 2018; v1 submitted 31 March, 2017; originally announced March 2017.

  38. arXiv:1611.09827  [pdf, other]

    stat.ML cs.LG cs.SD

    Learning Features of Music from Scratch

    Authors: John Thickstun, Zaid Harchaoui, Sham Kakade

    Abstract: This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research. MusicNet consists of hundreds of freely-licensed classical music recordings by 10 composers, written for 11 instruments, together with instrument/note annotations resulting in over 1 million temporal labels on 34 hours of chamber music…

    Submitted 5 April, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

    Comments: 14 pages; camera-ready version; updated experiments and related works; additional MIR metrics (Appendix C)

  39. arXiv:1610.00960  [pdf, other]

    stat.ML math.OC

    An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration

    Authors: Hongzhou Lin, Julien Mairal, Zaid Harchaoui

    Abstract: We propose an inexact variable-metric proximal point algorithm to accelerate gradient-based optimization algorithms. The proposed scheme, called QNing, can be notably applied to incremental first-order methods such as the stochastic variance-reduced gradient descent algorithm (SVRG) and other randomized incremental optimization algorithms. QNing is also compatible with composite objectives, meaning…

    Submitted 29 January, 2019; v1 submitted 4 October, 2016; originally announced October 2016.

    Comments: to appear in SIAM Journal on Optimization

  40. arXiv:1608.01264  [pdf, other]

    cs.LG math.OC stat.ML

    Fast and Simple Optimization for Poisson Likelihood Models

    Authors: Niao He, Zaid Harchaoui, Yichen Wang, Le Song

    Abstract: Poisson likelihood models have been prevalently used in imaging, social networks, and time series analysis. We propose fast, simple, theoretically-grounded, and versatile, optimization algorithms for Poisson likelihood modeling. The Poisson log-likelihood is concave but not Lipschitz-continuous. Since almost all gradient-based optimization algorithms rely on Lipschitz-continuity, optimizing Poisso…

    Submitted 3 August, 2016; originally announced August 2016.

  41. arXiv:1607.00567  [pdf, ps, other]

    stat.ML cs.LG

    Rademacher Complexity Bounds for a Penalized Multiclass Semi-Supervised Algorithm

    Authors: Yury Maximov, Massih-Reza Amini, Zaid Harchaoui

    Abstract: We propose Rademacher complexity bounds for multiclass classifiers trained with a two-step semi-supervised model. In the first step, the algorithm partitions the partially labeled data and then identifies dense clusters containing $κ$ predominant classes using the labeled training examples such that the proportion of their non-predominant classes is below a fixed threshold. In the second step, a c…

    Submitted 25 January, 2018; v1 submitted 2 July, 2016; originally announced July 2016.

    Comments: 26 pages, 6 figures

    Journal ref: Journal of Artificial Intelligence Research, v 61, 2018

  42. arXiv:1406.3332  [pdf, ps, other]

    cs.CV cs.LG stat.ML

    Convolutional Kernel Networks

    Authors: Julien Mairal, Piotr Koniusz, Zaid Harchaoui, Cordelia Schmid

    Abstract: An important goal in visual recognition is to devise image representations that are invariant to particular transformations. In this paper, we address this goal with a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel. Unlike traditional approaches where neural networks are learned either to represent data or for solving a classification task, our n…

    Submitted 14 November, 2014; v1 submitted 12 June, 2014; originally announced June 2014.

    Comments: appears in Advances in Neural Information Processing Systems (NIPS), Dec 2014, Montreal, Canada, http://nips.cc

  43. arXiv:1405.6472  [pdf, other]

    cs.CV cs.LG stat.ML

    Fast and Robust Archetypal Analysis for Representation Learning

    Authors: Yuansi Chen, Julien Mairal, Zaid Harchaoui

    Abstract: We revisit a pioneer unsupervised learning technique called archetypal analysis, which is related to successful data analysis methods such as sparse coding and non-negative matrix factorization. Since it was proposed, archetypal analysis did not gain a lot of popularity even though it produces more interpretable models than other alternatives. Because no efficient implementation has ever been made…

    Submitted 26 May, 2014; originally announced May 2014.

    Journal ref: CVPR 2014 - IEEE Conference on Computer Vision & Pattern Recognition (2014)

  44. arXiv:1302.2325  [pdf, other]

    math.OC stat.CO stat.ML

    Conditional Gradient Algorithms for Norm-Regularized Smooth Convex Optimization

    Authors: Zaid Harchaoui, Anatoli Juditsky, Arkadi Nemirovski

    Abstract: Motivated by some applications in signal processing and machine learning, we consider two convex optimization problems where, given a cone $K$, a norm $\|\cdot\|$ and a smooth convex function $f$, we want either 1) to minimize the norm over the intersection of the cone and a level set of $f$, or 2) to minimize over the cone the sum of $f$ and a multiple of the norm. We focus on the case where (a)…

    Submitted 28 March, 2013; v1 submitted 10 February, 2013; originally announced February 2013.

    Comments: 30 pages

  45. arXiv:0804.1026  [pdf, ps, other]

    stat.ML

    Testing for Homogeneity with Kernel Fisher Discriminant Analysis

    Authors: Zaid Harchaoui, Francis Bach, Eric Moulines

    Abstract: We propose to investigate test statistics for testing homogeneity in reproducing kernel Hilbert spaces. Asymptotic null distributions under null hypothesis are derived, and consistency against fixed and local alternatives is assessed. Finally, experimental evidence of the performance of the proposed approach on both artificial data and a speaker verification task is provided.

    Submitted 7 April, 2008; originally announced April 2008.