-
Robust oracle estimation and uncertainty quantification for possibly sparse quantiles
Authors:
Eduard Belitser,
Paulo Serra,
Alexandra Vegelien
Abstract:
A general many quantiles + noise model is studied in the robust formulation (allowing non-normal, non-independent observations), where the identifiability requirement for the noise is formulated in terms of quantiles rather than the traditional zero expectation assumption. We propose a penalization method based on the quantile loss function with an appropriately chosen penalty function for making inference on a possibly sparse high-dimensional quantile vector. We apply a local approach to address optimality by comparing procedures to the oracle sparsity structure. We establish that the proposed procedure mimics the oracle in the problems of estimation and uncertainty quantification (under the so-called EBR condition). Adaptive minimax results over the sparsity scale follow from our local results.
Submitted 18 November, 2022;
originally announced November 2022.
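The quantile loss referred to in the abstract above is commonly taken to be the check (pinball) loss. The sketch below shows only that standard loss and how minimizing it over the sample recovers an empirical quantile; it is not the penalized procedure of the paper, and the function names `quantile_loss` and `loss_minimizing_quantile` are illustrative assumptions.

```python
def quantile_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def loss_minimizing_quantile(values, tau):
    """Return the sample point that minimizes the total quantile loss.

    Assumption for illustration: the search is restricted to the observed
    values, which is enough because a minimizer of the summed check loss
    can always be found among the sample points.
    """
    return min(
        values,
        key=lambda c: sum(quantile_loss(y - c, tau) for y in values),
    )


if __name__ == "__main__":
    sample = [5.0, 1.0, 3.0, 2.0, 4.0]
    # With tau = 0.5 the minimizer is the sample median.
    print(loss_minimizing_quantile(sample, 0.5))
```

Replacing the zero-mean requirement on the noise by a quantile requirement corresponds to replacing squared error by this asymmetric loss.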
-
Road traffic estimation and distribution-based route selection
Authors:
Rens Kamphuis,
Michel Mandjes,
Paulo Serra
Abstract:
In route selection problems, the driver's personal preferences will determine whether she prefers a route with a travel time that has a relatively low mean and high variance over one that has relatively high mean and low variance. In practice, however, such risk aversion issues are often ignored, in that a route is selected based on a single-criterion Dijkstra-type algorithm. In addition, the routing decision typically does not take into account the uncertainty in the estimates of the travel time's mean and variance. This paper aims at resolving both issues by setting up a framework for travel time estimation. In our framework, the underlying road network is represented as a graph. Each edge is subdivided into multiple smaller pieces, so as to naturally model the statistical similarity between road pieces that are spatially nearby. Relying on a Bayesian approach, we construct an estimator for the joint per-edge travel time distribution, thus also providing us with an uncertainty quantification of our estimates. Our machinery relies on establishing limit theorems, making the resulting estimation procedure robust in the sense that it effectively does not assume any distributional properties. We present an extensive set of numerical experiments that demonstrate the validity of the estimation procedure and the use of the distributional estimates in the context of data-driven route selection.
Submitted 4 October, 2022;
originally announced October 2022.
-
Gordian Adjacency for Positive Braid Knots
Authors:
Tolson H. Bell,
David C. Luo,
Luke Seaton,
Samuel P. Serra
Abstract:
A knot $K_1$ is said to be Gordian adjacent to a knot $K_2$ if $K_1$ is an intermediate knot on an unknotting sequence of $K_2$. We extend previous results on Gordian adjacency by showing sufficient conditions for Gordian adjacency between classes of positive braid knots through manipulations of braid words. In addition, we explore unknotting sequences of positive braid knots and give a proof that there are only finitely many positive braid knots for a given unknotting number.
Submitted 16 November, 2020; v1 submitted 7 October, 2019;
originally announced October 2019.
-
Adaptive Non-parametric Estimation of Mean and Autocovariance in Regression with Dependent Errors
Authors:
Tatyana Krivobokova,
Paulo Serra,
Francisco Rosales,
Karolina Klockmann
Abstract:
Gaussian processes that can be decomposed into a smooth mean function and a stationary autocorrelated noise process are considered, and a fully automatic nonparametric method for the simultaneous estimation of the mean and auto-covariance functions of such processes is developed. Our empirical Bayes approach is data-driven, numerically efficient and allows for the construction of confidence sets for the mean function. Performance is demonstrated in simulations and real data analysis. The method is implemented in the R package eBsc that accompanies the paper.
Submitted 18 August, 2021; v1 submitted 17 December, 2018;
originally announced December 2018.
-
Estimation of Local Degree Distributions via Local Weighted Averaging and Monte Carlo Cross-Validation
Authors:
Paulo Serra,
Michel Mandjes
Abstract:
Owing to their capability of summarising interactions between elements of a system, networks have become a common type of data in many fields. As networks can be inhomogeneous, in that different regions of the network may exhibit different topologies, an important topic concerns their local properties. This paper focuses on the estimation of the local degree distribution of a vertex in an inhomogeneous network. The contributions are twofold: we propose an estimator based on local weighted averaging, and we set up a Monte Carlo cross-validation procedure to pick the parameters of this estimator. Under a specific modelling assumption we derive an oracle inequality that shows how the model parameters affect the precision of the estimator. We illustrate our method by several numerical experiments, on both real and synthetic data, showing in particular that the approach considerably improves upon the natural, empirical estimator.
Submitted 26 February, 2018;
originally announced February 2018.
-
Dimension Estimation Using Random Connection Models
Authors:
Paulo Serra,
Michel Mandjes
Abstract:
Information about intrinsic dimension is crucial to perform dimensionality reduction, compress information, design efficient algorithms, and do statistical adaptation. In this paper we propose an estimator for the intrinsic dimension of a data set. The estimator is based on binary neighbourhood information about the observations in the form of two adjacency matrices, and does not require any explicit distance information. The underlying graph is modelled according to a subset of a specific random connection model, sometimes referred to as the Poisson blob model. Computationally the estimator scales like n log n, and we specify its asymptotic distribution and rate of convergence. A simulation study on both real and simulated data shows that our approach compares favourably with some competing methods from the literature, including approaches that rely on distance information.
Submitted 8 November, 2017;
originally announced November 2017.
-
Adaptive empirical Bayesian smoothing splines
Authors:
Paulo Serra,
Tatyana Krivobokova
Abstract:
In this paper we develop and study adaptive empirical Bayesian smoothing splines. These are smoothing splines with both smoothing parameter and penalty order determined via the empirical Bayes method from the marginal likelihood of the model. The selected order and smoothing parameter are used to construct adaptive credible sets with good frequentist coverage for the underlying regression function. We use these credible sets as a proxy to show the superior performance of adaptive empirical Bayesian smoothing splines compared to frequentist smoothing splines.
Submitted 17 November, 2015; v1 submitted 25 November, 2014;
originally announced November 2014.
-
Online Tracking of a Predictable Drifting Parameter of a Time Series
Authors:
Eduard Belitser,
Paulo Serra
Abstract:
We propose an online algorithm for tracking a multidimensional time-varying parameter of a time series, which is also allowed to be a predictable process with respect to the underlying time series. The algorithm is driven by a gain function. Under assumptions on the gain, we derive uniform non-asymptotic error bounds on the tracking algorithm in terms of the chosen step size for the algorithm and the variation of the parameter of interest. We also outline how appropriate gain functions can be constructed. We give several examples of different variational setups for the parameter process where our result can be applied. The proposed approach covers many frameworks and models (including the classical Robbins-Monro and Kiefer-Wolfowitz procedures) where stochastic approximation algorithms comprise the main inference tool for the data analysis. We treat in some detail a couple of specific models.
Submitted 14 November, 2013; v1 submitted 3 June, 2013;
originally announced June 2013.
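A gain-driven tracking recursion of the kind described in the abstract above can be sketched as a generic stochastic-approximation update, theta_{k+1} = theta_k + step_size * gain(theta_k, x_k). The code below is a minimal one-dimensional illustration under stated assumptions, not the authors' algorithm: the function name `track` and the default gain `x - theta` (which tracks a drifting mean) are hypothetical choices made for the example.

```python
def track(observations, step_size, theta0=0.0, gain=lambda theta, x: x - theta):
    """Run a constant-step stochastic-approximation recursion.

    Assumed update (illustrative): theta <- theta + step_size * gain(theta, x).
    The default gain x - theta pulls the estimate towards each observation,
    so the recursion acts as an exponentially weighted running mean.
    """
    theta = theta0
    path = []
    for x in observations:
        theta = theta + step_size * gain(theta, x)
        path.append(theta)
    return path


if __name__ == "__main__":
    # The parameter jumps from 0 to 1 halfway through; the estimate follows.
    data = [0.0] * 20 + [1.0] * 20
    estimates = track(data, step_size=0.3)
    print(estimates[19], estimates[-1])
```

A larger step size reacts faster to changes in the parameter but averages less noise, which is the trade-off that error bounds stated in terms of step size and parameter variation quantify.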
-
Rate-optimal Bayesian intensity smoothing for inhomogeneous Poisson processes
Authors:
Eduard Belitser,
Paulo Serra,
Harry van Zanten
Abstract:
We apply nonparametric Bayesian methods to study the problem of estimating the intensity function of an inhomogeneous Poisson process. We exhibit a prior on intensities which both leads to a computationally feasible method and enjoys desirable theoretical optimality properties. The prior we use is based on B-spline expansions with free knots, adapted from well-established methods used in regression, for instance. We illustrate its practical use in the Poisson process setting by analyzing count data coming from a call centre. Theoretically we derive a new general theorem on contraction rates for posteriors in the setting of intensity function estimation. Practical choices that have to be made in the construction of our concrete prior, such as choosing the priors on the number and the locations of the spline knots, are based on these theoretical findings. The results assert that when properly constructed, our approach yields a rate-optimal procedure that automatically adapts to the regularity of the unknown intensity function.
Submitted 27 November, 2013; v1 submitted 22 April, 2013;
originally announced April 2013.
-
Adaptive Priors based on Splines with Random Knots
Authors:
Eduard Belitser,
Paulo Serra
Abstract:
Splines are useful building blocks when constructing priors on nonparametric models indexed by functions. Recently it has been established in the literature that hierarchical priors based on splines with a random number of equally spaced knots and random coefficients in the B-spline basis corresponding to those knots lead, under certain conditions, to adaptive posterior contraction rates over certain smoothness functional classes. In this paper we extend these results to the case where the locations of the knots are also endowed with a prior. This has already been common practice in MCMC applications, where the resulting posterior is expected to be more "spatially adaptive", but a theoretical basis in terms of adaptive contraction rates was missing. Under some mild assumptions, we establish a result that provides sufficient conditions for adaptive contraction rates in a range of models.
Submitted 14 March, 2013;
originally announced March 2013.