Showing 1–29 of 29 results for author: Gaiffas, S

  1. arXiv:2309.17316  [pdf, other]

    stat.ML cs.LG

    Robust Stochastic Optimization via Gradient Quantile Clipping

    Authors: Ibrahim Merad, Stéphane Gaïffas

    Abstract: We introduce a clipping strategy for Stochastic Gradient Descent (SGD) which uses quantiles of the gradient norm as clipping thresholds. We prove that this new strategy provides a robust and efficient optimization algorithm for smooth objectives (convex or non-convex), that tolerates heavy-tailed samples (including infinite variance) and a fraction of outliers in the data stream akin to Huber cont…

    Submitted 12 October, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Journal ref: Transactions on Machine Learning Research 2024

  2. arXiv:2307.06048  [pdf, other]

    math.OC cs.LG stat.ML

    Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization

    Authors: Massil Hihat, Stéphane Gaïffas, Guillaume Garrigos, Simon Bussy

    Abstract: We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses. Our motivation is to consider general demands, losses and dynamics to go beyond standard models which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand assumptions. We propo…

    Submitted 12 July, 2023; originally announced July 2023.

  3. arXiv:2306.11497  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Convergence and concentration properties of constant step-size SGD through Markov chains

    Authors: Ibrahim Merad, Stéphane Gaïffas

    Abstract: We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD) and study its properties through the prism of Markov chains. We show that, for unbiased gradient estimates with mildly controlled variance, the iteration converges to an invariant distribution in total variation distance. We also establish this convergence in Wasserstei…

    Submitted 4 July, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

  4. arXiv:2208.05447  [pdf, other]

    stat.ML cs.LG

    Robust Methods for High-Dimensional Linear Learning

    Authors: Ibrahim Merad, Stéphane Gaïffas

    Abstract: We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features $d$ may exceed the sample size $n$. We employ, in a generic learning setting, two algorithms depending on whether the considered loss function is gradient-Lipschitz or not. Then, we instantiate our framework on several applications including vani…

    Submitted 29 May, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: accepted version

  5. arXiv:2201.13372  [pdf, other]

    stat.ML cs.LG math.ST

    Robust supervised learning with coordinate gradient descent

    Authors: Stéphane Gaïffas, Ibrahim Merad

    Abstract: This paper considers the problem of supervised learning with linear methods when both features and labels can be corrupted, in the form of heavy-tailed data and/or corrupted rows. We introduce a combination of coordinate gradient descent as a learning algorithm together with robust estimators of the partial derivatives. This leads to robust statistical learning methods that have a numerical…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 57 pages, 6 figures

  6. arXiv:2109.08010  [pdf, other]

    cs.LG stat.ML

    WildWood: a new Random Forest algorithm

    Authors: Stéphane Gaïffas, Ibrahim Merad, Yiyang Yu

    Abstract: We introduce WildWood (WW), a new ensemble algorithm for supervised learning of Random Forest (RF) type. While standard RF algorithms use bootstrap out-of-bag samples to compute out-of-bag scores, WW uses these samples to produce improved predictions given by an aggregation of the predictions of all possible subtrees of each fully grown tree in the forest. This is achieved by aggregation with expo…

    Submitted 13 June, 2023; v1 submitted 16 September, 2021; originally announced September 2021.

  7. arXiv:2012.01064  [pdf, other]

    cs.LG cs.AI stat.ML

    About contrastive unsupervised representation learning for classification and its convergence

    Authors: Ibrahim Merad, Yiyang Yu, Emmanuel Bacry, Stéphane Gaïffas

    Abstract: Contrastive representation learning has been recently proved to be very efficient for self-supervised training. These methods have been successfully used to train encoders which perform comparably to supervised training on downstream classification tasks. A few works have started to build a theoretical framework around contrastive learning in which guarantees for its performance can be proven. We…

    Submitted 2 December, 2020; originally announced December 2020.

  8. arXiv:1912.10784  [pdf, ps, other]

    math.ST cs.LG stat.ML

    An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

    Authors: Jaouad Mourtada, Stéphane Gaïffas

    Abstract: We introduce a procedure for conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This estimator minimizes a new general excess risk bound for statistical learning. On standard examples, this bound scales as $d/n$ with $d$ the model dimension and $n$ the sample size, and critically remains valid under model misspecification. Being an improper (out-of-…

    Submitted 8 December, 2021; v1 submitted 23 December, 2019; originally announced December 2019.

    Comments: 43 pages, minor revision

  9. arXiv:1911.05346  [pdf, other]

    cs.LG stat.ML

    ZiMM: a deep learning model for long term and blurry relapses with non-clinical claims data

    Authors: Anastasiia Kabeshova, Yiyang Yu, Bertrand Lukacs, Emmanuel Bacry, Stéphane Gaïffas

    Abstract: This paper considers the problems of modeling and predicting a long-term and "blurry" relapse that occurs after a medical act, such as a surgery. The relapse is observed only indirectly, in a "blurry" fashion, through longitudinal prescriptions of drugs over a long period of time after the medical act. We introduce a new model, called ZiMM (Zero-inflated Mixture of Multinomial distributions) i…

    Submitted 25 July, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

  10. arXiv:1906.10529  [pdf, other]

    stat.ML cs.LG math.ST

    AMF: Aggregated Mondrian Forests for Online Learning

    Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

    Abstract: Random Forests (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: a remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to features scaling, a reasonable computational cost for training…

    Submitted 15 May, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

  11. arXiv:1809.01382  [pdf, other]

    stat.ML cs.LG

    On the optimality of the Hedge algorithm in the stochastic regime

    Authors: Jaouad Mourtada, Stéphane Gaïffas

    Abstract: In this paper, we study the behavior of the Hedge algorithm in the online stochastic setting. We prove that anytime Hedge with decreasing learning rate, which is one of the simplest algorithms for the problem of prediction with expert advice, is surprisingly both worst-case optimal and adaptive to the easier stochastic and adversarial with a gap problems. This shows that, in spite of its small, non…

    Submitted 8 July, 2019; v1 submitted 5 September, 2018; originally announced September 2018.

    Journal ref: Journal of Machine Learning Research, 20(83), 2019

  12. arXiv:1807.09821  [pdf, other]

    stat.ML cs.LG

    Comparison of methods for early-readmission prediction in a high-dimensional heterogeneous covariates and time-to-event outcome framework

    Authors: Simon Bussy, Raphaël Veil, Vincent Looten, Anita Burgun, Stéphane Gaïffas, Agathe Guilloux, Brigitte Ranque, Anne-Sophie Jannot

    Abstract: Background: Choosing the best-performing method in terms of outcome prediction or variable selection is a recurring problem in prognosis studies, leading to many publications on methods comparison. But some aspects have received little attention. First, most comparison studies treat prediction performance and variable selection aspects separately. Second, methods are either compared within a bina…

    Submitted 25 July, 2018; originally announced July 2018.

  13. arXiv:1807.03545  [pdf, other]

    stat.ML cs.LG

    Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption

    Authors: Martin Bompaire, Emmanuel Bacry, Stéphane Gaïffas

    Abstract: The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions. For such problems, a large set of stochastic first-order solvers based on the idea of variance reduction are available and combine both computational efficiency and sound theoretical guarantees (linear convergence…

    Submitted 15 December, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

    MSC Class: 90C25; 65K05; 65K10; 49M29

  14. arXiv:1803.05784  [pdf, ps, other]

    stat.ML math.ST

    Minimax optimal rates for Mondrian trees and forests

    Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

    Abstract: Introduced by Breiman, Random Forests are widely used classification and regression algorithms. While being initially designed as batch algorithms, several variants have been proposed to handle online learning. One particular instance of such forests is the Mondrian Forest, whose trees are built using the so-called Mondrian process, therefore allowing to easily update their construction in…

    Submitted 9 April, 2019; v1 submitted 15 March, 2018; originally announced March 2018.

  15. arXiv:1712.08243  [pdf, other]

    stat.AP stat.ME stat.ML

    ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection

    Authors: Maryan Morel, Emmanuel Bacry, Stéphane Gaïffas, Agathe Guilloux, Fanny Leroy

    Abstract: With the increased availability of large databases of electronic health records (EHRs) comes the chance of enhancing health risks screening. Most post-marketing detections of adverse drug reaction (ADR) rely on physicians' spontaneous reports, leading to underreporting. To take up this challenge, we develop a scalable model to estimate the effect of multiple longitudinal features (drug exposures)…

    Submitted 25 January, 2018; v1 submitted 21 December, 2017; originally announced December 2017.

  16. arXiv:1712.02640  [pdf, other]

    stat.ML

    High-dimensional robust regression and outliers detection with SLOPE

    Authors: Alain Virouleau, Agathe Guilloux, Stéphane Gaïffas, Malgorzata Bogdan

    Abstract: The problems of outliers detection and robust regression in a high-dimensional setting are fundamental in statistics, and have numerous applications. Following a recent set of works providing methods for simultaneous robust regression and outliers detection, we consider in this paper a model of linear regression with individual intercepts, in a high-dimensional setting. We introduce a new procedur…

    Submitted 7 December, 2017; originally announced December 2017.

    MSC Class: Primary 62J05; Secondary 62F35; 62J07; 62H15

  17. arXiv:1711.02887  [pdf, other]

    stat.ML

    Universal consistency and minimax rates for online Mondrian Forests

    Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

    Abstract: We establish the consistency of an algorithm of Mondrian Forests, a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm, that considers a fixed lifetime parameter. Indeed, the fact that this parameter is fixed hinders the statistical consistency of the original procedure. Our modified Mondrian Forest algorithm grows trees with…

    Submitted 8 November, 2017; originally announced November 2017.

    Comments: NIPS 2017

  18. arXiv:1707.03010  [pdf, other]

    stat.ML

    Sparse inference of the drift of a high-dimensional Ornstein-Uhlenbeck process

    Authors: Stéphane Gaïffas, Gustaw Matulewicz

    Abstract: Given the observation of a high-dimensional Ornstein-Uhlenbeck (OU) process in continuous time, we proceed to the inference of the drift parameter under a row-sparsity assumption. Towards that aim, we consider the negative log-likelihood of the process, penalized by an $\ell^1$-penalization (Lasso and Adaptive Lasso). We provide both non-asymptotic and asymptotic results for this procedure, by mea…

    Submitted 10 July, 2017; originally announced July 2017.

    MSC Class: 60G15; 62H12; 62M99

  19. arXiv:1707.03003  [pdf, other]

    stat.ML

    Tick: a Python library for statistical learning, with a particular emphasis on time-dependent modelling

    Authors: Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Soren Poulsen

    Abstract: Tick is a statistical learning library for Python 3, with a particular emphasis on time-dependent models, such as point processes, and tools for generalized linear models and survival analysis. The core of the library is an optimization module providing model computational classes, solvers and proximal operators for regularization. tick relies on a C++ implementation and state-of-the-art optimizat…

    Submitted 15 March, 2018; v1 submitted 10 July, 2017; originally announced July 2017.

  20. arXiv:1703.08619  [pdf, other]

    stat.ML

    Binarsity: a penalization for one-hot encoded features in linear supervised learning

    Authors: Mokhtar Z. Alaya, Simon Bussy, Stéphane Gaïffas, Agathe Guilloux

    Abstract: This paper deals with the problem of large-scale linear supervised learning in settings where a large number of continuous features are available. We propose to combine the well-known trick of one-hot encoding of continuous features with a new penalization called binarsity. In each group of binary features coming from the one-hot encoding of a single raw continuous feature, this penalizatio…

    Submitted 9 January, 2019; v1 submitted 24 March, 2017; originally announced March 2017.

  21. arXiv:1610.07407  [pdf, other]

    stat.ML

    C-mix: a high dimensional mixture model for censored durations, with applications to genetic data

    Authors: Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot

    Abstract: We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model. We consider a high-dimensional setting, with datasets containing a large number of biomedical covariates. We therefore penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse…

    Submitted 25 November, 2017; v1 submitted 24 October, 2016; originally announced October 2016.

  22. arXiv:1607.06333  [pdf, other]

    stat.ML cs.LG

    Uncovering Causality from Multivariate Hawkes Integrated Cumulants

    Authors: Massil Achab, Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-Francois Muzy

    Abstract: We design a new nonparametric method that allows one to estimate the matrix of integrated kernels of a multivariate Hawkes process. This matrix not only encodes the mutual influences of each node of the process, but also disentangles the causality relationships between them. Our approach is the first that leads to an estimation of this matrix without any parametric modeling and estimation of the…

    Submitted 29 May, 2017; v1 submitted 21 July, 2016; originally announced July 2016.

  23. arXiv:1510.04822  [pdf, other]

    stat.ML cs.LG

    SGD with Variance Reduction beyond Empirical Risk Minimization

    Authors: Massil Achab, Agathe Guilloux, Stéphane Gaïffas, Emmanuel Bacry

    Abstract: We introduce a doubly stochastic proximal gradient algorithm for optimizing a finite average of smooth convex functions, whose gradients depend on numerically expensive expectations. Our main motivation is the acceleration of the optimization of the regularized Cox partial-likelihood (the core model used in survival analysis), but our algorithm can be used in different settings as well. The propos…

    Submitted 8 November, 2016; v1 submitted 16 October, 2015; originally announced October 2015.

    Comments: 17 pages

  24. arXiv:1507.00513  [pdf, other]

    math.ST stat.ML

    Learning the intensity of time events with change-points

    Authors: Mokhtar Zahdi Alaya, Stéphane Gaïffas, Agathe Guilloux

    Abstract: We consider the problem of learning the inhomogeneous intensity of a counting process, under a sparse segmentation assumption. We introduce a weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval. We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequaliti…

    Submitted 2 July, 2015; originally announced July 2015.

  25. arXiv:1501.00725  [pdf, other]

    stat.ML

    Sparse and low-rank multivariate Hawkes processes

    Authors: Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Jean-François Muzy

    Abstract: We consider the problem of unveiling the implicit network structure of node interactions (such as user interactions in a social network), based only on high-frequency timestamps. Our inference is based on the minimization of the least-squares loss associated with a multivariate Hawkes model, penalized by $\ell_1$ and trace norm of the interaction tensor. We provide a first theoretical analysis for…

    Submitted 24 February, 2020; v1 submitted 4 January, 2015; originally announced January 2015.

  26. arXiv:1412.7705  [pdf, ps, other]

    math.PR stat.ML

    Concentration for matrix martingales in continuous time and microscopic activity of social networks

    Authors: Emmanuel Bacry, Stéphane Gaïffas, Jean-François Muzy

    Abstract: This paper gives new concentration inequalities for the spectral norm of a wide class of matrix martingales in continuous time. These results extend previously established Freedman and Bernstein inequalities for series of random matrices to the class of continuous time processes. Our analysis relies on a new supermartingale property of the trace exponential proved within the framework of stochasti…

    Submitted 27 October, 2016; v1 submitted 24 December, 2014; originally announced December 2014.

  27. arXiv:1401.8017  [pdf, other]

    stat.ML

    Sparse Bayesian Unsupervised Learning

    Authors: Stephane Gaiffas, Bertrand Michel

    Abstract: This paper is about variable selection, clustering and estimation in an unsupervised high-dimensional setting. Our approach is based on fitting constrained Gaussian mixture models, where we learn the number of clusters $K$ and the set of relevant variables $S$ using a generalized Bayesian posterior with a sparsity inducing prior. We prove a sparsity oracle inequality which shows that this procedur…

    Submitted 30 January, 2014; originally announced January 2014.

    MSC Class: 62H30

    ACM Class: G.3; H.3.3; I.5.3

  28. arXiv:1209.3230  [pdf, ps, other]

    stat.ML

    Link Prediction in Graphs with Autoregressive Features

    Authors: Emile Richard, Stephane Gaiffas, Nicolas Vayatis

    Abstract: In this paper, we consider the problem of link prediction in time-evolving graphs. We assume that certain graph features, such as the node degree, follow a vector autoregressive (VAR) model and we propose to use this information to improve the accuracy of prediction. Our strategy involves a joint optimization procedure over the space of adjacency matrices and VAR matrices which takes into account b…

    Submitted 14 September, 2012; originally announced September 2012.

    Comments: NIPS 2012

  29. arXiv:0912.1618  [pdf, other]

    stat.ML

    Hyper-sparse optimal aggregation

    Authors: Stéphane Gaïffas, Guillaume Lecué

    Abstract: In this paper, we consider the problem of "hyper-sparse aggregation". Namely, given a dictionary $F = \{f_1, \dots, f_M\}$ of functions, we look for an optimal aggregation algorithm that writes $\tilde{f} = \sum_{j=1}^M \theta_j f_j$ with as many zero coefficients $\theta_j$ as possible. This problem is of particular interest when $F$ contains many irrelevant functions that should not appear in $\tilde{f}$.…

    Submitted 8 December, 2009; originally announced December 2009.

    Comments: 33 pages