Showing 1–27 of 27 results for author: Biau, G

Searching in archive math.
  1. arXiv:2503.22580  [pdf, other]

    stat.ME math.ST stat.AP

    Optimal treatment regimes for the net benefit of a treatment

    Authors: François Petit, Gérard Biau, Raphaël Porcher

    Abstract: We developed a mathematical setup inspired by Buyse's generalized pairwise comparisons to define a notion of optimal individualized treatment rule (ITR) in the presence of prioritized outcomes in a randomized controlled trial, terming such an ITR pairwise optimal. We present two approaches to estimate pairwise optimal ITRs. The first is a variant of the k-nearest neighbors algorithm. The second is…

    Submitted 28 March, 2025; originally announced March 2025.

  2. arXiv:2502.10485  [pdf, other]

    stat.ML cs.AI cs.LG math.ST stat.AP stat.ME

    Forecasting time series with constraints

    Authors: Nathan Doumèche, Francis Bach, Éloi Bedek, Gérard Biau, Claire Boyer, Yannig Goude

    Abstract: Time series forecasting presents unique challenges that limit the effectiveness of traditional machine learning algorithms. To address these limitations, various approaches have incorporated linear constraints into learning algorithms, such as generalized additive models and hierarchical forecasting. In this paper, we propose a unified framework for integrating and combining linear constraints in…

    Submitted 14 February, 2025; originally announced February 2025.

  3. arXiv:2409.13786  [pdf, other]

    stat.ML cs.LG math.ST

    Physics-informed kernel learning

    Authors: Nathan Doumèche, Francis Bach, Gérard Biau, Claire Boyer

    Abstract: Physics-informed machine learning typically integrates physical priors into the learning process by minimizing a loss function that includes both a data-driven term and a partial differential equation (PDE) regularization. Building on the formulation of the problem as a kernel regression task, we use Fourier methods to approximate the associated kernel, and propose a tractable estimator that minim…

    Submitted 20 September, 2024; originally announced September 2024.

  4. arXiv:2402.07514  [pdf, other]

    cs.AI math.ST

    Physics-informed machine learning as a kernel method

    Authors: Nathan Doumèche, Francis Bach, Gérard Biau, Claire Boyer

    Abstract: Physics-informed machine learning combines the expressiveness of data-based approaches with the interpretability of physical models. In this context, we consider a general regression problem where the empirical risk is regularized by a partial differential equation that quantifies the physical inconsistency. We prove that for linear differential priors, the problem can be formulated as a kernel re…

    Submitted 19 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  5. arXiv:2305.01240  [pdf, other]

    math.ST

    Convergence and error analysis of PINNs

    Authors: Nathan Doumèche, Gérard Biau, Claire Boyer

    Abstract: Physics-informed neural networks (PINNs) are a promising approach that combines the power of neural networks with the interpretability of physical modeling. PINNs have shown good practical performance in solving partial differential equations (PDEs) and in hybrid modeling scenarios, where physical models enhance data-driven approaches. However, it is essential to establish their theoretical proper…

    Submitted 2 May, 2023; originally announced May 2023.

  6. arXiv:2201.02824  [pdf, other]

    stat.ML cs.LG math.ST

    Optimal 1-Wasserstein Distance for WGANs

    Authors: Arthur Stéphanovitch, Ugo Tanielian, Benoît Cadre, Nicolas Klutchnikoff, Gérard Biau

    Abstract: The mathematical forces at work behind Generative Adversarial Networks raise challenging theoretical issues. Motivated by the important question of characterizing the geometrical properties of the generated distributions, we provide a thorough analysis of Wasserstein GANs (WGANs) in both the finite sample and asymptotic regimes. We study the specific case where the latent space is univariate and d…

    Submitted 5 October, 2023; v1 submitted 8 January, 2022; originally announced January 2022.

  7. arXiv:1908.06852  [pdf, other]

    stat.ML cs.LG math.ST

    SIRUS: Stable and Interpretable RUle Set for Classification

    Authors: Clément Bénard, Gérard Biau, Sébastien da Veiga, Erwan Scornet

    Abstract: State-of-the-art learning algorithms, such as random forests or neural networks, are often qualified as "black-boxes" because of the high number and complexity of operations involved in their prediction mechanism. This lack of interpretability is a strong limitation for applications involving critical decisions, typically the analysis of production processes in the manufacturing industry. In such…

    Submitted 16 December, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

  8. arXiv:1707.05023  [pdf, other]

    math.ST cs.LG

    Optimization by gradient boosting

    Authors: Gérard Biau, Benoît Cadre

    Abstract: Gradient boosting is a state-of-the-art prediction technique that sequentially produces a model in the form of linear combinations of simple predictors, typically decision trees, by solving an infinite-dimensional convex optimization problem. We provide in the present paper a thorough analysis of two widespread versions of gradient boosting, and introduce a general framework for studying these a…

    Submitted 17 July, 2017; originally announced July 2017.

  9. arXiv:1604.07143  [pdf, other]

    stat.ML cs.LG math.ST

    Neural Random Forests

    Authors: Gérard Biau, Erwan Scornet, Johannes Welbl

    Abstract: Given an ensemble of randomized regression trees, it is possible to restructure them as a collection of multilayered neural networks with particular connection weights. Following this principle, we reformulate the random forest method of Breiman (2001) into a neural network setting, and in turn propose two new hybrid procedures that we call neural random forests. Both predictors exploit prior know…

    Submitted 3 April, 2018; v1 submitted 25 April, 2016; originally announced April 2016.

  10. arXiv:1511.05741  [pdf, other]

    math.ST stat.ML

    A Random Forest Guided Tour

    Authors: Gérard Biau, Erwan Scornet

    Abstract: The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is ve…

    Submitted 18 November, 2015; originally announced November 2015.

  11. arXiv:1507.00171  [pdf, other]

    math.ST stat.AP stat.ME

    The Statistical Performance of Collaborative Inference

    Authors: Gérard Biau, Kevin Bleakley, Benoit Cadre

    Abstract: The statistical analysis of massive and complex data sets will require the development of algorithms that depend on distributed computing and collaborative inference. Inspired by this, we propose a collaborative framework that aims to estimate the unknown mean $θ$ of a random variable $X$. In the model we present, a certain number of calculation units, distributed across a communication network re…

    Submitted 1 July, 2015; originally announced July 2015.

  12. arXiv:1504.01702  [pdf, other]

    math.ST stat.AP

    Long signal change-point detection

    Authors: Gérard Biau, Kevin Bleakley, David Mason

    Abstract: The detection of change-points in a spatially or time ordered data sequence is an important problem in many fields such as genetics and finance. We derive the asymptotic distribution of a statistic recently suggested for detecting change-points. Simulation of its estimated limit distribution leads to a new and computationally efficient change-point detection algorithm, which can be used on very lo…

    Submitted 30 September, 2015; v1 submitted 7 April, 2015; originally announced April 2015.

  13. arXiv:1410.4029  [pdf, ps, other]

    math.ST

    Cox process functional learning

    Authors: Gérard Biau, Benoît Cadre, Quentin Paris

    Abstract: This article addresses the problem of functional supervised classification of Cox process trajectories, whose random intensity is driven by some exogenous random covariable. The classification task is achieved through a regularized convex empirical risk minimization procedure, and a nonasymptotic oracle inequality is derived. We show that the algorithm provides a Bayes-risk consistent classifier.…

    Submitted 15 October, 2014; originally announced October 2014.

    MSC Class: 62G05; 62G20

  14. arXiv:1407.4373  [pdf, other]

    math.ST stat.ML

    Online Asynchronous Distributed Regression

    Authors: Gérard Biau, Ryad Zenine

    Abstract: Distributed computing offers a high degree of flexibility to accommodate modern learning constraints and the ever increasing size of datasets involved in massive data issues. Drawing inspiration from the theory of distributed computation models developed in the context of gradient-type optimization algorithms, we present a consensus-based asynchronous distributed approach for nonparametric online…

    Submitted 16 July, 2014; originally announced July 2014.

  15. arXiv:1405.2881  [pdf, ps, other]

    math.ST stat.ML

    Consistency of random forests

    Authors: Erwan Scornet, Gérard Biau, Jean-Philippe Vert

    Abstract: Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (2001) 5–32] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical performance, little is known about the mathematical properties of the procedure. This disparity between theory and practice originates in the difficulty to simultane…

    Submitted 8 August, 2015; v1 submitted 12 May, 2014; originally announced May 2014.

    Journal ref: Annals of Statistics, Institute of Mathematical Statistics (IMS), 2015, 43 (4), pp. 1716–1741

  16. arXiv:1311.0587  [pdf, ps, other]

    math.ST

    High-dimensional $p$-norms

    Authors: Gérard Biau, David D. M. Mason

    Abstract: Let $\mathbf{X}=(X_1, \ldots, X_d)$ be a $\mathbb R^d$-valued random vector with i.i.d. components, and let $\Vert\mathbf{X}\Vert_p= (\sum_{j=1}^d|X_j|^p)^{1/p}$ be its $p$-norm, for $p>0$. The impact of letting $d$ go to infinity on $\Vert\mathbf{X}\Vert_p$ has surprising consequences, which may dramatically affect high-dimensional data processing. This effect is usually referred to as the distance concentration…

    Submitted 4 November, 2013; originally announced November 2013.

    Comments: 19 pages

  17. COBRA: A Combined Regression Strategy

    Authors: Gérard Biau, Aurélie Fischer, Benjamin Guedj, James Malley

    Abstract: A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators $r_1,\dots,r_M$, we use them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the re…

    Submitted 23 May, 2019; v1 submitted 9 March, 2013; originally announced March 2013.

    Comments: 42 pages

    Journal ref: Journal of Multivariate Analysis (2016), vol. 146, 18–28

  18. arXiv:1301.4679  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Cellular Tree Classifiers

    Authors: Gérard Biau, Luc Devroye

    Abstract: The cellular tree classifier model addresses a fundamental problem in the design of classifiers for a parallel or distributed computing world: Given a data set, is it sufficient to apply a majority rule for classification, or shall one split the data into two or more parts and send each part to a potentially different computer (or cell) for further processing? At first sight, it seems impossible t…

    Submitted 25 June, 2013; v1 submitted 20 January, 2013; originally announced January 2013.

  19. arXiv:1207.6461  [pdf, ps, other]

    math.ST

    New Insights Into Approximate Bayesian Computation

    Authors: Gérard Biau, Frédéric Cérou, Arnaud Guyader

    Abstract: Approximate Bayesian Computation (ABC for short) is a family of computational techniques which offer an almost automated solution in situations where evaluation of the posterior likelihood is computationally prohibitive, or whenever suitable likelihoods are not available. In the present paper, we analyze the procedure from the point of view of k-nearest neighbor theory and explore the statistical…

    Submitted 3 June, 2013; v1 submitted 27 July, 2012; originally announced July 2012.

  20. arXiv:1201.0586  [pdf, ps, other]

    math.ST

    An Affine Invariant $k$-Nearest Neighbor Regression Estimate

    Authors: Gérard Biau, Luc Devroye, Vida Dujmovic, Adam Krzyzak

    Abstract: We design a data-dependent metric in $\mathbb R^d$ and use it to define the $k$-nearest neighbors of a given point. Our metric is invariant under all affine transformations. We show that, with this metric, the standard $k$-nearest neighbor regression estimate is asymptotically consistent under the usual conditions on $k$, and minimal requirements on the input data.

    Submitted 18 May, 2012; v1 submitted 3 January, 2012; originally announced January 2012.

  21. arXiv:1101.3229  [pdf, other]

    math.ST

    Sparse single-index model

    Authors: Pierre Alquier, Gérard Biau

    Abstract: Let $(\mathbf{X}, Y)$ be a random pair taking values in $\mathbb R^p \times \mathbb R$. In the so-called single-index model, one has $Y=f^{\star}(θ^{\star T}\mathbf{X})+W$, where $f^{\star}$ is an unknown univariate measurable function, $θ^{\star}$ is an unknown vector in $\mathbb R^p$, and $W$ denotes a random noise satisfying $\mathbb E[W|\mathbf{X}]=0$. The single-index model is known to offer a flexible way to…

    Submitted 6 October, 2011; v1 submitted 17 January, 2011; originally announced January 2011.

    Journal ref: Journal of Machine Learning Research 14 (2013) 243-280

  22. Statistical analysis of $k$-nearest neighbor collaborative recommendation

    Authors: Gérard Biau, Benoît Cadre, Laurent Rouvière

    Abstract: Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite…

    Submitted 4 October, 2010; originally announced October 2010.

    Comments: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/09-AOS759

    Report number: IMS-AOS-AOS759

    Journal ref: Annals of Statistics 2010, Vol. 38, No. 3, 1568-1592

  23. arXiv:1005.0208  [pdf, other]

    stat.ML math.ST

    Analysis of a Random Forests Model

    Authors: Gérard Biau

    Abstract: Random forests are a scheme proposed by Leo Breiman in the 2000s for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this paper, we off…

    Submitted 26 March, 2012; v1 submitted 3 May, 2010; originally announced May 2010.

  24. arXiv:1003.5089  [pdf, ps, other]

    math.ST

    PCA-Kernel Estimation

    Authors: Gérard Biau, André Mas

    Abstract: Many statistical estimation techniques for high-dimensional or functional data are based on a preliminary dimension reduction step, which consists in projecting the sample $\mathbf{X}_1, \ldots, \mathbf{X}_n$ onto the first $D$ eigenvectors of the Principal Component Analysis (PCA) associated with the empirical projector $\hat{\Pi}_D$. Classical nonparametric inference methods such as kernel density estimation or…

    Submitted 26 March, 2010; originally announced March 2010.

  25. arXiv:0910.2340  [pdf, ps, other]

    stat.ML math.ST

    A Stochastic Model for Collaborative Recommendation

    Authors: Gérard Biau, Benoit Cadre, Laurent Rouvière

    Abstract: Collaborative recommendation is an information-filtering technique that attempts to present information items (movies, music, books, news, images, Web pages, etc.) that are likely of interest to the Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for…

    Submitted 13 October, 2009; originally announced October 2009.

  26. arXiv:0908.2503  [pdf, ps, other]

    stat.ME math.ST

    Sequential Quantile Prediction of Time Series

    Authors: Gérard Biau, Benoît Patra

    Abstract: Motivated by a broad range of potential applications, we address the quantile prediction problem of real-valued time series. We present a sequential quantile forecasting model based on the combination of a set of elementary nearest neighbor-type predictors called "experts" and show its consistency under a minimum of conditions. Our approach builds on the methodology developed in recent years for p…

    Submitted 31 May, 2010; v1 submitted 18 August, 2009; originally announced August 2009.

  27. arXiv:0801.0327  [pdf, ps, other]

    stat.ME math.PR

    Nonparametric sequential prediction of time series

    Authors: Gérard Biau, Kevin Bleakley, László Györfi, György Ottucsák

    Abstract: Time series prediction covers a vast field of every-day statistical applications in medical, environmental and economic domains. In this paper we develop nonparametric prediction strategies based on the combination of a set of 'experts' and show the universal consistency of these strategies under a minimum of conditions. We perform an in-depth analysis of real-world data sets and show that these…

    Submitted 1 January, 2008; originally announced January 2008.

    Comments: article + 2 figures

    MSC Class: 62G99