Showing 1–2 of 2 results for author: Ribeiro, V J

Search v0.5.6 released 2020-02-24

arXiv:2109.05047 [pdf, other]

stat.ME math.ST stat.AP stat.ML

PAC Mode Estimation using PPR Martingale Confidence Sequences

Authors: Shubham Anand Jain, Rohan Shah, Sanit Gupta, Denil Mehta, Inderjeet Jayakumar Nair, Jian Vora, Sushil Khyalia, Sourav Das, Vinay J. Ribeiro, Shivaram Kalyanakrishnan

Abstract: We consider the problem of correctly identifying the \textit{mode} of a discrete distribution $\mathcal{P}$ with sufficiently high probability by observing a sequence of i.i.d. samples drawn from $\mathcal{P}$. This problem reduces to the estimation of a single parameter when $\mathcal{P}$ has a support set of size $K = 2$. After noting that this special case is tackled very well by prior-posterior-ratio (PPR) martingale confidence sequences \citep{waudby-ramdas-ppr}, we propose a generalisation to mode estimation, in which $\mathcal{P}$ may take $K \geq 2$ values. To begin, we show that the "one-versus-one" principle to generalise from $K = 2$ to $K \geq 2$ classes is more efficient than the "one-versus-rest" alternative. We then prove that our resulting stopping rule, denoted PPR-1v1, is asymptotically optimal (as the mistake probability is taken to $0$). PPR-1v1 is parameter-free and computationally light, and incurs significantly fewer samples than competitors even in the non-asymptotic regime. We demonstrate its gains in two practical applications of sampling: election forecasting and verification of smart contracts in blockchains.

Submitted 11 April, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

arXiv:math/0611191 [pdf, ps, other]

math.ST

doi 10.1214/074921706000000509

Optimal sampling strategies for multiscale stochastic processes

Authors: Vinay J. Ribeiro, Rudolf H. Riedi, Richard G. Baraniuk

Abstract: In this paper, we determine which non-random sampling of fixed size gives the best linear predictor of the sum of a finite spatial population. We employ different multiscale superpopulation models and use the minimum mean-squared error as our optimality criterion. In multiscale superpopulation tree models, the leaves represent the units of the population, interior nodes represent partial sums of the population, and the root node represents the total sum of the population. We prove that the optimal sampling pattern varies dramatically with the correlation structure of the tree nodes. While uniform sampling is optimal for trees with ``positive correlation progression'', it provides the worst possible sampling with ``negative correlation progression.'' As an analysis tool, we introduce and study a class of independent innovations trees that are of interest in their own right. We derive a fast water-filling algorithm to determine the optimal sampling of the leaves to estimate the root of an independent innovations tree.

Submitted 7 November, 2006; originally announced November 2006.

Comments: Published at http://dx.doi.org/10.1214/074921706000000509 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-LNMS49-LNMS4916

MSC Class: 94A20; 62M30; 60G18 (Primary) 62H11; 62H12; 78M50 (Secondary)

Journal ref: IMS Lecture Notes--Monograph Series 2006, Vol. 49, 266-290
