-
Identifiability of Deep Polynomial Neural Networks
Authors:
Konstantin Usevich,
Clara Dérand,
Ricardo Borsoi,
Marianne Clausel
Abstract:
Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, and Kruskal-type uniqueness theorems. This yields both generic conditions determined by the architecture, and effective conditions that depend on the network's parameters. We also settle an open conjecture on the expected dimension of PNN's neurovarieties, and provide new bounds on the activation degrees required for it to reach its maximum.
Submitted 20 June, 2025;
originally announced June 2025.
-
Ensembles of Probabilistic Regression Trees
Authors:
Alexandre Seiller,
Éric Gaussier,
Emilie Devijver,
Marianne Clausel,
Sami Alkhoury
Abstract:
Tree-based ensemble methods such as random forests, gradient-boosted trees, and Bayesian additive regression trees have been successfully used for regression problems in many applications and research studies. In this paper, we study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect to a probability distribution. We prove that the ensemble versions of probabilistic regression trees considered are consistent, and experimentally study their bias-variance trade-off and compare them with the state-of-the-art in terms of performance prediction.
Submitted 20 June, 2024;
originally announced June 2024.
-
Wind power predictions from nowcasts to 4-hour forecasts: a learning approach with variable selection
Authors:
Dimitri Bouche,
Rémi Flamary,
Florence d'Alché-Buc,
Riwal Plougonven,
Marianne Clausel,
Jordi Badosa,
Philippe Drobinski
Abstract:
We study short-term prediction of wind speed and wind power (every 10 minutes up to 4 hours ahead). Accurate forecasts for these quantities are crucial to mitigate the negative effects of wind farms' intermittent production on energy systems and markets. We use machine learning to combine outputs from numerical weather prediction models with local observations. The former provide valuable information on higher-scale dynamics while the latter give the model fresher and location-specific data. So as to make the results usable for practitioners, we focus on well-known methods which can handle a high volume of data. We first study variable selection using both a linear technique and a nonlinear one. Then we exploit these results to forecast wind speed and wind power, still with an emphasis on linear models versus nonlinear ones. For the wind power prediction, we also compare the indirect approach (wind speed predictions passed through a power curve) and the direct one (directly predict wind power).
Submitted 13 December, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)
Authors:
Aleksandra Burashnikova,
Yury Maximov,
Marianne Clausel,
Charlotte Laclau,
Franck Iutzeler,
Massih-Reza Amini
Abstract:
This paper is an extended version of [Burashnikova et al., 2021, arXiv: 2012.06910], where we proposed a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items followed by a clicked one for each user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To prevent updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. They affect the decision of RS by shifting the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections with respect to various ranking measures.
Submitted 26 February, 2022;
originally announced February 2022.
-
Nonlinear Functional Output Regression: a Dictionary Approach
Authors:
Dimitri Bouche,
Marianne Clausel,
François Roueff,
Florence d'Alché-Buc
Abstract:
To address functional-output regression, we introduce projection learning (PL), a novel dictionary-based approach that learns to predict a function that is expanded on a dictionary while minimizing an empirical risk based on a functional loss. PL makes it possible to use non-orthogonal dictionaries and can then be combined with dictionary learning; it is thus much more flexible than expansion-based approaches relying on vectorial losses. This general method is instantiated with reproducing kernel Hilbert spaces of vector-valued functions as kernel-based projection learning (KPL). For the functional square loss, two closed-form estimators are proposed, one for fully observed output functions and the other for partially observed ones. Both are backed theoretically by an excess risk analysis. Then, in the more general setting of integral losses based on differentiable ground losses, KPL is implemented using first-order optimization for both fully and partially observed output functions. Finally, several robustness aspects of the proposed algorithms are highlighted on a toy dataset, and a study on two real datasets shows that they are competitive compared to other nonlinear approaches. Notably, using the square loss and a learnt dictionary, KPL enjoys a particularly attractive trade-off between computational cost and performance.
Submitted 26 February, 2021; v1 submitted 3 March, 2020;
originally announced March 2020.
-
Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees
Authors:
Myriam Tami,
Marianne Clausel,
Emilie Devijver,
Adrien Dulac,
Eric Gaussier,
Stefan Janaqi,
Meriam Chebre
Abstract:
Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended in order to deal with uncertainty in the output variable, using for example a quantile loss in Random Forests (Meinshausen, 2006). To the best of our knowledge, no extension has been provided yet for dealing with uncertainties in the input variables, even though such uncertainties are common in practical situations. We propose here such an extension by showing how standard regression trees optimizing a quadratic loss can be adapted and learned while taking into account the uncertainties in the inputs. By doing so, one no longer assumes that an observation lies in a single region of the regression tree, but rather that it belongs to each region with a certain probability. Experiments conducted on several data sets illustrate the good behavior of the proposed extension.
Submitted 18 November, 2018; v1 submitted 27 October, 2018;
originally announced October 2018.
-
New results on approximate Hilbert pairs of wavelet filters with common factors
Authors:
Sophie Achard,
Irène Gannaz,
Marianne Clausel,
François Roueff
Abstract:
In this paper, we consider the design of wavelet filters based on the Thiran common-factor approach proposed in Selesnick [2001]. This approach aims at building finite impulse response filters of a Hilbert-pair of wavelets serving as real and imaginary part of a complex wavelet. Unfortunately it is not possible to construct wavelets which are both finitely supported and analytic. The wavelet filters constructed using the common-factor approach are then approximately analytic. Thus, it is of interest to control their analyticity. The purpose of this paper is to first provide precise and explicit expressions as well as easily exploitable bounds for quantifying the analytic approximation of this complex wavelet. Then, we prove the existence of such filters enjoying the classical perfect reconstruction conditions, with arbitrarily many vanishing moments.
Submitted 25 October, 2017;
originally announced October 2017.