Search | arXiv e-print repository

Unbiased estimates for products of moments and cumulants for finite and infinite populations

Abstract: Let $F=F_N$ be the distribution of a finite real population of size $N$. Let $\widehat{F}=F_n$ be the empirical distribution of a sample of size $n$ drawn from the population without replacement. We prove the following remarkable {\it inversion principle} for obtaining unbiased estimates. Let $T \left(F_N\right)$ be any product of the moments or cumulants of $F_N$. Let $T_{n, N} \left( F_N \right) = E T \left( F_n \right)$. Then $E T_{N, n} \left( F_n \right) = T \left( F_N \right)$. We also obtain an explicit expression for $T_{n, N} \left(F_N\right)$ for all $T \left( F_N \right)$ of order up to 6. We also prove the following related result. If $F_n$ and $F_N$ are the sample and population distributions, the only functionals for which $E T \left( F_n \right) = λ_{n, N} T \left( F_N \right)$ are noncentral moments, and generalized second and third order central moments. For these three cases the eigenvalues are $λ_{n, N}=1$, $\left( 1 - n^{-1} \right) \left( 1 - N^{-1} \right)^{-1}$, and $\left( 1 - n^{-1} \right) \left( 1 - 2n^{-1} \right) \left( 1 - N^{-1} \right)^{-1} \left( 1 - 2N^{-1} \right)^{-1}$ respectively.

Submitted 27 October, 2014; originally announced October 2014.
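
The eigenvalue claim in the abstract above can be checked numerically on a small case. The following is a minimal sketch, not the authors' method: it assumes ordinary (rather than generalized) central moments with divisor $n$, uses a hypothetical six-point population, and enumerates every sample of size $n$ drawn without replacement using exact rational arithmetic. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def central_moment(values, r):
    """r-th central moment of the empirical distribution of `values` (divisor len(values))."""
    mean = Fraction(sum(values), len(values))
    return sum((Fraction(v) - mean) ** r for v in values) / len(values)


def expected_sample_moment(population, n, r):
    """Exact E[mu_r(F_n)] over all equally likely samples of size n without replacement."""
    samples = list(combinations(population, n))
    return sum(central_moment(s, r) for s in samples) / len(samples)


def eigenvalue(n, N, r):
    """lambda_{n,N} as quoted in the abstract, for r = 2 or r = 3."""
    lam = Fraction(n - 1, n) / Fraction(N - 1, N)
    if r == 3:
        lam *= Fraction(n - 2, n) / Fraction(N - 2, N)
    return lam


population = [1, 2, 4, 7, 11, 16]  # hypothetical data, chosen only for illustration
N, n = len(population), 4

for r in (2, 3):
    lhs = expected_sample_moment(population, n, r)
    rhs = eigenvalue(n, N, r) * central_moment(population, r)
    print(f"r={r}: E mu_r(F_n) = {lhs}, lambda * mu_r(F_N) = {rhs}, equal: {lhs == rhs}")
```

Because the arithmetic is exact, the two sides can be compared with `==` rather than a floating-point tolerance. Full enumeration is only practical for small $N$.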

arXiv:1312.7150 [pdf, ps, other]

The distribution of the maximum of an ARMA(1, 1) process

Authors: C. S. Withers, S. Nadarajah

Abstract: We give the cumulative distribution function of $M_n$, the maximum of a sequence of $n$ observations from an ARMA(1, 1) process. Solutions are first given in terms of repeated integrals and then for the case where the underlying random variables are absolutely continuous. The distribution of $M_n$ is then given as a weighted sum of the $n$th powers of the eigenvalues of a non-symmetric Fredholm kernel. The weights are given in terms of the left and right eigenfunctions of the kernel. These results are large deviations expansions for estimates, since the maximum need not be standardized to have a limit. In fact, such a limit need not exist.

Submitted 26 December, 2013; originally announced December 2013.

Comments: arXiv admin note: text overlap with arXiv:1001.5265

arXiv:1008.0127 [pdf, ps, other]

Nonparametric estimates of low bias

Authors: C. S. Withers, S. Nadarajah

Abstract: We consider the problem of estimating an arbitrary smooth functional of $k \geq 1$ distribution functions (d.f.s) in terms of random samples from them. The natural estimate replaces the d.f.s by their empirical d.f.s. Its bias is generally $\sim n^{-1}$, where $n$ is the minimum sample size, with a {\it $p$th order} iterative estimate of bias $\sim n^{-p}$ for any $p$. For $p \leq 4$, we give an explicit estimate in terms of the first $2p - 2$ von Mises derivatives of the functional evaluated at the empirical d.f.s. These may be used to obtain {\it unbiased} estimates, where these exist and are of known form in terms of the sample sizes; our form for such unbiased estimates is much simpler than that obtained using polykays and tables of the symmetric functions. Examples include functions of a mean vector (such as the ratio of two means and the inverse of a mean), standard deviation, correlation, return times and exceedances. These $p$th order estimates require only $\sim n$ calculations. This is in sharp contrast with computationally intensive bias reduction methods such as the $p$th order bootstrap and jackknife, which require $\sim n^p$ calculations.

Submitted 31 July, 2010; originally announced August 2010.

MSC Class: Primary 62G05; Secondary 62G30

arXiv:1002.4338 [pdf, ps, other]

The distribution and quantiles of functionals of weighted empirical distributions when observations have different distributions

Authors: C. S. Withers, S. Nadarajah

Abstract: This paper extends Edgeworth-Cornish-Fisher expansions for the distribution and quantiles of nonparametric estimates in two ways. Firstly it allows observations to have different distributions. Secondly it allows the observations to be weighted in a predetermined way. The use of weighted estimates has a long history including applications to regression, rank statistics and Bayes theory. However, asymptotic results have generally been only first order (the CLT and weak convergence). We give third order asymptotics for the distribution and percentiles of any smooth functional of a weighted empirical distribution, thus allowing a considerable increase in accuracy over earlier CLT results. Consider independent non-identically distributed ({\it non-iid}) observations $X_{1n}, ..., X_{nn}$ in $R^s$. Let $\hat{F}(x)$ be their {\it weighted empirical distribution} with weights $w_{1n}, ..., w_{nn}$. We obtain cumulant expansions and hence Edgeworth-Cornish-Fisher expansions for $T(\hat{F})$ for any smooth functional $T(\cdot)$ by extending the concepts of von Mises derivatives to signed measures of total measure 1. As an example we give the cumulant coefficients needed for Edgeworth-Cornish-Fisher expansions to $O(n^{-3/2})$ for the sample variance when observations are non-iid.

Submitted 23 February, 2010; originally announced February 2010.

arXiv:1001.5265 [pdf, ps, other]

The distribution of the maximum of a second order autoregressive process: the continuous case

Authors: C. S. Withers, S. Nadarajah

Abstract: We give the distribution function of $M_n$, the maximum of a sequence of $n$ observations from an autoregressive process of order 2. Solutions are first given in terms of repeated integrals and then for the case where the underlying random variables are absolutely continuous. When the correlations are positive, $P(M_n \leq x) = a_{n,x}$, where $a_{n,x} = \sum_{j=1}^\infty β_{jx} ν_{jx}^{n} = O(ν_{1x}^{n})$, $\{ν_{jx}\}$ are the eigenvalues of a non-symmetric Fredholm kernel, and $ν_{1x}$ is the eigenvalue of maximum magnitude. The weights $β_{jx}$ depend on the $j$th left and right eigenfunctions of the kernel. These results are large deviations expansions for estimates, since the maximum need not be standardized to have a limit. In fact, such a limit need not exist.

Submitted 1 February, 2010; v1 submitted 28 January, 2010; originally announced January 2010.

Comments: 8 pages. This version removes an inappropriate note

Report number: 138cor/ar2.tex

arXiv:0802.0529 [pdf, ps, other]

The distribution of the maximum of a first order moving average: the discrete case

Authors: Christopher S. Withers, Saralees Nadarajah

Abstract: We give the distribution of $M_n$, the maximum of a sequence of $n$ observations from a moving average of order 1. Solutions are first given in terms of repeated integrals and then for the case where the underlying independent random variables are discrete. When the correlation is positive, $$ P(M_n \leq x) = \sum_{j=1}^{I} β_{jx} ν_{jx}^{n} \approx B_{x} r_{1x}^{n} $$ where $\{ν_{jx}\}$ are the eigenvalues of a certain matrix, $r_{1x}$ is the maximum magnitude of the eigenvalues, and $I$ depends on the number of possible values of the underlying random variables. The eigenvalues do not depend on $x$, only on its range.

Submitted 6 April, 2009; v1 submitted 4 February, 2008; originally announced February 2008.

Comments: 13 pages. This version gives full solutions to the examples

arXiv:0802.0523 [pdf, ps, other]

The distribution of the maximum of a first order moving average: the continuous case

Authors: Christopher S. Withers, Saralees Nadarajah

Abstract: We give the distribution of $M_n$, the maximum of a sequence of $n$ observations from a moving average of order 1. Solutions are first given in terms of repeated integrals and then for the case where the underlying independent random variables have an absolutely continuous density. When the correlation is positive, $$ P(M_n \leq x) = \sum_{j=1}^\infty β_{jx} ν_{jx}^{n} \approx B_{x} ν_{1x}^{n} $$ where $\{ν_{jx}\}$ are the eigenvalues (singular values) of a Fredholm kernel and $ν_{1x}$ is the eigenvalue of maximum magnitude. A similar result is given when the correlation is negative. The result is analogous to large deviations expansions for estimates, since the maximum need not be standardized to have a limit. For the continuous case the integral equations for the left and right eigenfunctions are converted to first order linear differential equations. The eigenvalues satisfy an equation of the form $$\sum_{i=1}^\infty w_i(λ-θ_i)^{-1}=λ-θ_0$$ for certain known weights $\{w_i\}$ and eigenvalues $\{θ_i\}$ of a given matrix. This can be solved by truncating the sum to an increasing number of terms.

Submitted 6 September, 2009; v1 submitted 4 February, 2008; originally announced February 2008.

Comments: 15 A4 pages. Version 4 corrects (3.8). Version 3 expands Section 2. Version 2 corrected recurrence relation (2.5)
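
The truncation idea mentioned at the end of the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the weights $w_i = 2^{-i}$, the values $θ_i = 1/i$, $θ_0 = 2$, the restriction to positive weights, and the choice of bisection to find the largest real root of the truncated equation $\sum_{i=1}^{K} w_i(λ-θ_i)^{-1}=λ-θ_0$.

```python
def residual(lam, w, theta, theta0):
    """f(lam) = lam - theta0 - sum_i w_i / (lam - theta_i); roots of f solve the equation."""
    return lam - theta0 - sum(wi / (lam - ti) for wi, ti in zip(w, theta))


def largest_root(w, theta, theta0, tol=1e-12):
    """Largest real root of the truncated equation, assuming every w_i > 0.

    With positive weights f is increasing to the right of the largest pole,
    runs from -infinity to +infinity there, and so has exactly one root.
    """
    lo = max(theta)                 # largest pole; f is never evaluated here
    hi = max(lo, theta0) + 1.0
    while residual(hi, w, theta, theta0) <= 0:
        hi = lo + 2.0 * (hi - lo)   # expand until the root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid, w, theta, theta0) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def truncated(K):
    """Hypothetical weights and poles for the first K terms (illustration only)."""
    w = [2.0 ** -i for i in range(1, K + 1)]
    theta = [1.0 / i for i in range(1, K + 1)]
    return w, theta


theta0 = 2.0  # hypothetical value
for K in (1, 2, 4, 8, 16, 32):
    w, theta = truncated(K)
    print(K, largest_root(w, theta, theta0))
```

With these rapidly decaying weights the printed roots settle quickly as $K$ grows. For other weight sequences the rate of convergence, and whether bisection on this interval is appropriate, would need to be checked separately.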

arXiv:0802.0502 [pdf, ps, other]

Fredholm equations for non-symmetric kernels, with applications to iterated integral operators

Authors: Christopher S. Withers, Saralees Nadarajah

Abstract: We give the Jordan form and the Singular Value Decomposition for an integral operator ${\cal N}$ with a non-symmetric kernel $N(y,z)$. This is used to give solutions of Fredholm equations for non-symmetric kernels, and to determine the behaviour of ${\cal N}^n$ and $({\cal N}{\cal N^*})^n$ for large $n$.

Submitted 2 April, 2008; v1 submitted 4 February, 2008; originally announced February 2008.

Comments: 12 A4 pages

Showing 1–8 of 8 results for author: Withers, C