-
Data-driven confidence bands for distributed nonparametric regression
Authors:
Valeriy Avanesov
Abstract:
Gaussian Process Regression and Kernel Ridge Regression are popular nonparametric regression approaches. Unfortunately, their high computational complexity renders them inapplicable to modern massive datasets. To address this, a number of approximations have been suggested, some of which allow for a distributed implementation. One of them is the divide-and-conquer approach, which splits the data into a number of partitions, obtains local estimates and finally averages them. In this paper we suggest a novel, computationally efficient, fully data-driven algorithm that quantifies the uncertainty of this method and yields frequentist $L_2$-confidence bands. We rigorously demonstrate the validity of the algorithm. Another contribution of the paper is a minimax-optimal high-probability bound for the averaged estimator, complementing and generalizing the known risk bounds.
Submitted 8 June, 2020; v1 submitted 13 December, 2019;
originally announced December 2019.
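The divide-and-conquer scheme mentioned in the abstract above (partition the data, fit local estimates, average them) can be illustrated with a minimal sketch. It is not the paper's algorithm and it does not construct the confidence bands; the Gaussian kernel, the `bandwidth` and `ridge` values, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=0.2):
    # Gaussian kernel matrix between the rows of a and b (bandwidth is an assumed value).
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def fit_krr(x, y, ridge=1e-3):
    # Local Kernel Ridge Regression: solve (K + n * ridge * I) alpha = y.
    n = x.shape[0]
    alpha = np.linalg.solve(rbf_kernel(x, x) + n * ridge * np.eye(n), y)
    return lambda x_new: rbf_kernel(x_new, x) @ alpha

def divide_and_conquer_krr(x, y, n_partitions, seed=0):
    # Randomly split the sample, fit one local estimator per partition,
    # and return the average of the local predictions.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.shape[0])
    local = [fit_krr(x[part], y[part]) for part in np.array_split(idx, n_partitions)]
    return lambda x_new: np.mean([f(x_new) for f in local], axis=0)

# Synthetic example (assumed data): noisy observations of sin(2 * pi * x).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=(600, 1))
y = np.sin(2.0 * np.pi * x[:, 0]) + 0.1 * rng.standard_normal(600)

predict = divide_and_conquer_krr(x, y, n_partitions=6)
x_test = np.linspace(0.05, 0.95, 50)[:, None]
rmse = np.sqrt(np.mean((predict(x_test) - np.sin(2.0 * np.pi * x_test[:, 0])) ** 2))
```

Each local fit solves a linear system of size n / P instead of n, which is where the computational saving comes from; the partitions can also be processed on separate machines.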
-
How to gamble with non-stationary $\mathcal{X}$-armed bandits and have no regrets
Authors:
Valeriy Avanesov
Abstract:
In the $\mathcal{X}$-armed bandit problem an agent sequentially interacts with an environment which yields a reward based on the vector input the agent provides. The agent's goal is to maximise the sum of these rewards across some number of time steps. The problem and its variations have been the subject of numerous studies, suggesting sub-linear and sometimes optimal strategies. This paper introduces a novel variation of the problem. We consider an environment which can abruptly change its behaviour an unknown number of times. To that end we propose a novel strategy and prove that it attains sub-linear cumulative regret. Moreover, in the case of a highly smooth relation between an action and the corresponding reward, the method is nearly optimal. The theoretical results are supported by an experimental study.
Submitted 17 January, 2021; v1 submitted 20 August, 2019;
originally announced August 2019.
-
Nonparametric Change Point Detection in Regression
Authors:
Valeriy Avanesov
Abstract:
This paper considers the prominent problem of change-point detection in regression. The study suggests a novel testing procedure featuring a fully data-driven calibration scheme. The method is essentially a black box, requiring no tuning from the practitioner. The approach is investigated from both theoretical and practical points of view. The theoretical study demonstrates proper control of first-type error rate under $H_0$ and power approaching $1$ under $H_1$. The experiments conducted on synthetic data fully support the theoretical claims. In conclusion, the method is applied to financial data, where it detects sensible change-points. Techniques for change-point localization are also suggested and investigated.
Submitted 1 July, 2019; v1 submitted 6 March, 2019;
originally announced March 2019.
-
Structural break analysis in high-dimensional covariance structure
Authors:
Valeriy Avanesov
Abstract:
We consider detection and localization of an abrupt break in the covariance structure of high-dimensional random data. The paper proposes a novel testing procedure for this problem. Due to its nature, the approach requires a properly chosen critical level. In this regard we propose a purely data-driven calibration scheme. The approach can be straightforwardly employed in an online setting and is essentially multiscale, allowing for a trade-off between sensitivity and change-point localization (in an online setting, the delay of detection). The description of the algorithm is followed by a formal theoretical study justifying the proposed calibration scheme under mild assumptions and providing guarantees for break detection. All the theoretical results are obtained in a high-dimensional setting (dimensionality $p \gg n$). The results are supported by a simulation study inspired by real-world financial data.
Submitted 14 July, 2019; v1 submitted 1 March, 2018;
originally announced March 2018.
-
Bootstrap for change point detection
Authors:
Nazar Buzun,
Valeriy Avanesov
Abstract:
In the change point detection task, a Likelihood Ratio Test (LRT) is sequentially applied in a sliding window procedure. Its high values indicate changes of the parametric distribution in the data sequence. Correspondingly, the LRT values require a predefined bound for their maximum. The maximum value has an unknown distribution and may be calibrated with the multiplier bootstrap. The bootstrap procedure convolves independent components of the likelihood function with random weights, which makes it possible to estimate the LRT distribution empirically. For this empirical distribution of the LRT we show convergence rates to the real maximum value distribution.
Submitted 19 October, 2017;
originally announced October 2017.
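The sliding-window LRT described in the abstract above can be sketched for the simplest parametric family. The example below assumes Gaussian observations with unit variance and a possible shift in the mean, in which case the statistic for a window of `h` points on each side of a candidate location reduces to `h * (mean_left - mean_right)**2 / 4`. The function name, the window size, and the synthetic data are assumptions; the multiplier-bootstrap calibration of the critical level, which is the subject of the paper, is deliberately left out.

```python
import numpy as np

def sliding_lrt(x, h):
    # For every location t with h points on each side, compute the likelihood
    # ratio statistic of "two means" against "one mean" in the unit-variance
    # Gaussian model: h * (mean_left - mean_right)^2 / 4.
    csum = np.concatenate(([0.0], np.cumsum(x)))
    t = np.arange(h, len(x) - h + 1)
    mean_left = (csum[t] - csum[t - h]) / h    # mean of x[t-h:t]
    mean_right = (csum[t + h] - csum[t]) / h   # mean of x[t:t+h]
    return t, h * (mean_left - mean_right) ** 2 / 4.0

# Synthetic example (assumed data): the mean shifts from 0 to 2 at index 200.
rng = np.random.default_rng(0)
x = np.concatenate([rng.standard_normal(200), 2.0 + rng.standard_normal(200)])

t, stat = sliding_lrt(x, h=40)
t_hat = t[np.argmax(stat)]  # location where the statistic peaks
```

In practice the maximum of `stat` would be compared against a critical level; choosing that level is exactly what the bootstrap procedure is for.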
-
Change-point detection in high-dimensional covariance structure
Authors:
Valeriy Avanesov,
Nazar Buzun
Abstract:
In this paper we introduce a novel approach to the important problem of break detection. Specifically, we are interested in the detection of an abrupt change in the covariance structure of a high-dimensional random process -- a problem which has applications in many areas, e.g., neuroimaging and finance. The developed approach is essentially a testing procedure involving a choice of a critical level. To that end a non-standard bootstrap scheme is proposed and theoretically justified under mild assumptions. The theoretical study features a result providing guarantees for break detection. All the theoretical results are established in a high-dimensional setting (dimensionality $p \gg n$). The multiscale nature of the approach allows for a trade-off between sensitivity of break detection and localization. The approach can be naturally employed in an on-line setting. A simulation study demonstrates that the approach matches the nominal level of false alarm probability and exhibits high power, outperforming a recent approach.
Submitted 29 July, 2020; v1 submitted 12 October, 2016;
originally announced October 2016.