-
Unbiased estimates for products of moments and cumulants for finite and infinite populations
Authors:
C. S. Withers,
S. Nadarajah
Abstract:
Let $F=F_N$ be the distribution of a finite real population of size $N$. Let $\widehat{F}=F_n$ be the empirical distribution of a sample of size $n$ drawn from the population without replacement. We prove the following remarkable {\it inversion principle} for obtaining unbiased estimates. Let $T \left( F_N \right)$ be any product of the moments or cumulants of $F_N$. Let $T_{n, N} \left( F_N \right) = E T \left( F_n \right)$. Then $E T_{N, n} \left( F_n \right) = T \left( F_N \right)$. We also obtain an explicit expression for $T_{n, N} \left( F_N \right)$ for all $T \left( F_N \right)$ of order up to 6.
We also prove the following related result. If $F_n$ and $F_N$ are the sample and population distributions, the only functionals for which $E T \left( F_n \right) = λ_{n, N} T \left( F_N \right)$ are noncentral moments, and generalized second and third order central moments. For these three cases the eigenvalues are $λ_{n, N}=1$, $\left( 1 - n^{-1} \right) \left( 1 - N^{-1} \right)^{-1}$, and $\left( 1 - n^{-1} \right) \left( 1 - 2n^{-1} \right) \left( 1 - N^{-1} \right)^{-1} \left( 1 - 2N^{-1} \right)^{-1}$ respectively.
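The eigenvalue claims and the inversion principle can be checked exactly on a small example. The following is a minimal sketch, not the authors' derivation: the population values and all function names are hypothetical, and the formula used for $T_{n,N}$ when $T(F)$ is the squared mean is the textbook expression for $E \bar{x}^2$ under sampling without replacement, not a formula taken from the paper. Expectations are computed by enumerating every sample, with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def central_moment(values, r):
    """r-th central moment of the distribution with mass 1/len(values) on each value."""
    size = len(values)
    mean = Fraction(sum(values), size)
    return sum((v - mean) ** r for v in values) / size


def expected_over_samples(statistic, population, n):
    """Exact mean of statistic(sample) over all size-n samples without replacement."""
    samples = list(combinations(population, n))
    return sum(statistic(s) for s in samples) / len(samples)


def eigenvalue(n, N, r):
    """lambda_{n,N} quoted in the abstract for central moments of order r = 2 or 3."""
    lam = Fraction(1)
    for k in range(1, r):
        lam *= (1 - Fraction(k, n)) / (1 - Fraction(k, N))
    return lam


def mean_squared_functional(values, a, b):
    """T_{a,b}(F) for T(F) = (mean of F)^2; assumed form, from the formula for E[xbar^2]."""
    size = len(values)
    mean = Fraction(sum(values), size)
    return mean ** 2 + Fraction(b - a, a * (b - 1)) * central_moment(values, 2)


population = [1, 2, 4, 7, 11, 16]   # hypothetical population, N = 6
N, n = len(population), 4

# Eigenvalue property: E m_r(F_n) = lambda_{n,N} m_r(F_N) for r = 2, 3.
for r in (2, 3):
    lhs = expected_over_samples(lambda s: central_moment(s, r), population, n)
    rhs = eigenvalue(n, N, r) * central_moment(population, r)
    print(r, lhs == rhs)

# Inversion: swap the roles of n and N and apply the functional to the sample.
estimate_mean = expected_over_samples(
    lambda s: mean_squared_functional(s, N, n), population, n
)
print(estimate_mean == Fraction(sum(population), N) ** 2)
```

Exhaustive enumeration is only feasible for very small $N$; it is used here because it makes the unbiasedness an exact identity and not a Monte Carlo approximation.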
Submitted 27 October, 2014;
originally announced October 2014.
-
The distribution of the maximum of an ARMA(1, 1) process
Authors:
C. S. Withers,
S. Nadarajah
Abstract:
We give the cumulative distribution function of $M_n$, the maximum of a sequence of $n$ observations from an ARMA(1, 1) process. Solutions are first given in terms of repeated integrals and then for the case where the underlying random variables are absolutely continuous. The distribution of $M_n$ is then given as a weighted sum of the $n$th powers of the eigenvalues of a non-symmetric Fredholm kernel. The weights are given in terms of the left and right eigenfunctions of the kernel.
These results are large deviations expansions for estimates, since the maximum need not be standardized to have a limit. In fact, such a limit need not exist.
Submitted 26 December, 2013;
originally announced December 2013.
-
Nonparametric estimates of low bias
Authors:
C. S. Withers,
S. Nadarajah
Abstract:
We consider the problem of estimating an arbitrary smooth functional of $k \geq 1$ distribution functions (d.f.s) in terms of random samples from them. The natural estimate replaces the d.f.s by their empirical d.f.s. Its bias is generally $\sim n^{-1}$, where $n$ is the minimum sample size, with a {\it $p$th order} iterative estimate of bias $\sim n^{-p}$ for any $p$. For $p \leq 4$, we give an explicit estimate in terms of the first $2p - 2$ von Mises derivatives of the functional evaluated at the empirical d.f.s. These may be used to obtain {\it unbiased} estimates, where these exist and are of known form in terms of the sample sizes; our form for such unbiased estimates is much simpler than that obtained using polykays and tables of the symmetric functions. Examples include functions of a mean vector (such as the ratio of two means and the inverse of a mean), standard deviation, correlation, return times and exceedances. These $p$th order estimates require only $\sim n$ calculations. This is in sharp contrast with computationally intensive bias reduction methods such as the $p$th order bootstrap and jackknife, which require $\sim n^p$ calculations.
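To illustrate the difference between a natural estimate and a second order one, the sketch below uses the inverse of a mean, one of the examples named in the abstract. It is an assumption-laden illustration and not the authors' estimator: the correction is the standard delta-method term $g''(\bar{x})\, m_2 / (2n)$ for $g(\mu) = 1/\mu$, the distribution (uniform on $\{1, 2, 3\}$) is hypothetical, and biases are computed exactly by enumerating all samples.

```python
from itertools import product


def exact_expectation(estimator, support, n):
    """Exact E[estimator(sample)] for n iid draws, each uniform on support."""
    samples = list(product(support, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


def natural_estimate(sample):
    """Plug-in estimate of 1/mean."""
    return len(sample) / sum(sample)


def corrected_estimate(sample):
    """Plug-in estimate minus a plug-in estimate of its leading n^{-1} bias term."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    # For g(mu) = 1/mu the leading bias of g(xbar) is
    # g''(mu) sigma^2 / (2n) = sigma^2 / (n mu^3).
    return 1 / mean - m2 / (n * mean ** 3)


support = [1, 2, 3]                 # hypothetical distribution: uniform on {1, 2, 3}
target = 1 / (sum(support) / len(support))
for n in (4, 8):
    bias_natural = exact_expectation(natural_estimate, support, n) - target
    bias_corrected = exact_expectation(corrected_estimate, support, n) - target
    print(n, bias_natural, bias_corrected)
```

The corrected estimate costs one extra pass over the sample, which is in the spirit of the $\sim n$ calculations mentioned above; the enumeration itself is only there to evaluate the bias exactly.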
Submitted 31 July, 2010;
originally announced August 2010.
-
The distribution and quantiles of functionals of weighted empirical distributions when observations have different distributions
Authors:
C. S. Withers,
S. Nadarajah
Abstract:
This paper extends Edgeworth-Cornish-Fisher expansions for the distribution and quantiles of nonparametric estimates in two ways. Firstly it allows observations to have different distributions. Secondly it allows the observations to be weighted in a predetermined way. The use of weighted estimates has a long history including applications to regression, rank statistics and Bayes theory. However, asymptotic results have generally been only first order (the CLT and weak convergence). We give third order asymptotics for the distribution and percentiles of any smooth functional of a weighted empirical distribution, thus allowing a considerable increase in accuracy over earlier CLT results.
Consider independent non-identically distributed ({\it non-iid}) observations $X_{1n}, ..., X_{nn}$ in $R^s$. Let $\hat{F}(x)$ be their {\it weighted empirical distribution} with weights $w_{1n}, ..., w_{nn}$. We obtain cumulant expansions and hence Edgeworth-Cornish-Fisher expansions for $T(\hat{F})$ for any smooth functional $T(\cdot)$ by extending the concepts of von Mises derivatives to signed measures of total measure 1. As an example we give the cumulant coefficients needed for Edgeworth-Cornish-Fisher expansions to $O(n^{-3/2})$ for the sample variance when observations are non-iid.
Submitted 23 February, 2010;
originally announced February 2010.
-
The distribution of the maximum of a second order autoregressive process: the continuous case
Authors:
C. S. Withers,
S. Nadarajah
Abstract:
We give the distribution function of $M_n$, the maximum of a sequence of $n$ observations from an autoregressive process of order 2. Solutions are first given in terms of repeated integrals and then for the case where the underlying random variables are absolutely continuous. When the correlations are positive, $P(M_n \leq x) = a_{n,x}$, where $a_{n,x} = \sum_{j=1}^\infty β_{jx} ν_{jx}^{n} = O(ν_{1x}^{n})$, $\{ν_{jx}\}$ are the eigenvalues of a non-symmetric Fredholm kernel, and $ν_{1x}$ is the eigenvalue of maximum magnitude. The weights $β_{jx}$ depend on the $j$th left and right eigenfunctions of the kernel.
These results are large deviations expansions for estimates, since the maximum need not be standardized to have a limit. In fact such a limit need not exist.
Submitted 1 February, 2010; v1 submitted 28 January, 2010;
originally announced January 2010.
-
The distribution of the maximum of a first order moving average: the discrete case
Authors:
Christopher S. Withers,
Saralees Nadarajah
Abstract:
We give the distribution of $M_n$, the maximum of a sequence of $n$ observations from a moving average of order 1. Solutions are first given in terms of repeated integrals and then for the case where the underlying independent random variables are discrete. When the correlation is positive, $$ P(M_n = \max^n_{i=1} X_i \leq x) = \sum_{j=1}^I β_{jx} ν_{jx}^{n} \approx B_{x} r_{1x}^{n} $$ where $\{ν_{jx}\}$ are the eigenvalues of a certain matrix, $r_{1x}$ is the maximum magnitude of the eigenvalues, and $I$ depends on the number of possible values of the underlying random variables. The eigenvalues do not depend on $x$ itself, only on its range.
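One way to see why powers of a matrix appear is a transfer-matrix calculation. The sketch below is an illustration under stated assumptions and not necessarily the matrix used in the paper: it takes $X_i = e_i + ρ e_{i-1}$ with hypothetical discrete innovations $e_i$, builds the matrix $A$ with entries $P(e_i = b)\, 1[b + ρ a \leq x]$, and evaluates $P(M_n \leq x)$ as (law of $e_0$)$^T A^n 1$. A brute-force enumeration over all innovation paths is included as a cross-check.

```python
from fractions import Fraction
from itertools import product

support = [0, 1, 2]                                        # hypothetical innovation values
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # hypothetical probabilities
rho = Fraction(1, 2)                                       # hypothetical MA(1) coefficient


def transfer_matrix(x):
    """A[a][b] = P(e_i = support[b]) * 1[support[b] + rho * support[a] <= x]."""
    return [
        [probs[b] if support[b] + rho * support[a] <= x else Fraction(0)
         for b in range(len(support))]
        for a in range(len(support))
    ]


def prob_max_leq(x, n):
    """P(M_n <= x) as (law of e_0)^T A^n 1, via n vector-matrix products."""
    A = transfer_matrix(x)
    v = list(probs)
    for _ in range(n):
        v = [sum(v[a] * A[a][b] for a in range(len(v))) for b in range(len(v))]
    return sum(v)


def prob_max_leq_brute(x, n):
    """Same probability by enumerating every innovation path e_0, ..., e_n."""
    total = Fraction(0)
    for path in product(range(len(support)), repeat=n + 1):
        ok = all(support[path[i]] + rho * support[path[i - 1]] <= x
                 for i in range(1, n + 1))
        if ok:
            weight = Fraction(1)
            for j in path:
                weight *= probs[j]
            total += weight
    return total


for n in (1, 3, 6):
    print(n, prob_max_leq(2, n), prob_max_leq_brute(2, n))
```

When $A$ is diagonalizable, expanding $A^n$ in its eigenvectors turns this product into a weighted sum of $n$th powers of eigenvalues. In this sketch $A$ changes with $x$ only through the pattern of indicator values, which matches the remark that the eigenvalues depend only on the range of $x$.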
Submitted 6 April, 2009; v1 submitted 4 February, 2008;
originally announced February 2008.
-
The distribution of the maximum of a first order moving average: the continuous case
Authors:
Christopher S. Withers,
Saralees Nadarajah
Abstract:
We give the distribution of $M_n$, the maximum of a sequence of $n$ observations from a moving average of order 1. Solutions are first given in terms of repeated integrals and then for the case where the underlying independent random variables have an absolutely continuous density. When the correlation is positive, $$ P(M_n \leq x) = \sum_{j=1}^\infty β_{jx} ν_{jx}^{n} \approx B_{x} ν_{1x}^{n} $$ where $\{ν_{jx}\}$ are the eigenvalues (singular values) of a Fredholm kernel and $ν_{1x}$ is the eigenvalue of maximum magnitude. A similar result is given when the correlation is negative. The result is analogous to large deviations expansions for estimates, since the maximum need not be standardized to have a limit.
For the continuous case the integral equations for the left and right eigenfunctions are converted to first order linear differential equations. The eigenvalues satisfy an equation of the form $$\sum_{i=1}^\infty w_i(λ-θ_i)^{-1}=λ-θ_0$$ for certain known weights $\{w_i\}$ and eigenvalues $\{θ_i\}$ of a given matrix. This can be solved by truncating the sum to an increasing number of terms.
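The truncation idea can be sketched numerically. Everything specific here is an assumption made for illustration: the weights $w_i = 2^{-i}$, the values $θ_i = 1/i$ and $θ_0 = 0$ are hypothetical, the weights are taken to be positive, and only the root to the right of all poles is sought. Under those assumptions $f(λ) = λ - θ_0 - \sum_i w_i (λ - θ_i)^{-1}$ is increasing on that interval, so bisection applies.

```python
def secular_root(weights, thetas, theta0, iterations=200):
    """Largest root of sum_i w_i / (lam - theta_i) = lam - theta0, positive weights assumed."""
    def f(lam):
        return lam - theta0 - sum(w / (lam - t) for w, t in zip(weights, thetas))

    # To the right of the largest pole, f rises monotonically from -inf to +inf,
    # so the root there is unique and bisection finds it.
    lo = max(thetas)
    hi = max(lo, theta0) + sum(weights) + 1.0      # guarantees f(hi) >= 1 > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:                  # interval collapsed to float precision
            break
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


def truncated_problem(m):
    """Hypothetical data: w_i = 2^{-i}, theta_i = 1/i for i = 1..m, theta_0 = 0."""
    weights = [2.0 ** -i for i in range(1, m + 1)]
    thetas = [1.0 / i for i in range(1, m + 1)]
    return weights, thetas, 0.0


# Truncate the infinite sum at an increasing number of terms and watch the root settle.
for m in (5, 10, 20, 40):
    w, t, t0 = truncated_problem(m)
    print(m, secular_root(w, t, t0))
```

How fast the truncated roots stabilize depends on how quickly the weights decay; with the geometric weights assumed here the change between successive truncations shrinks rapidly.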
Submitted 6 September, 2009; v1 submitted 4 February, 2008;
originally announced February 2008.
-
Fredholm equations for non-symmetric kernels, with applications to iterated integral operators
Authors:
Christopher S. Withers,
Saralees Nadarajah
Abstract:
We give the Jordan form and the Singular Value Decomposition for an integral operator ${\cal N}$ with a non-symmetric kernel $N(y,z)$. This is used to give solutions of Fredholm equations for non-symmetric kernels, and to determine the behaviour of ${\cal N}^n$ and $({\cal N}{\cal N}^*)^n$ for large $n$.
Submitted 2 April, 2008; v1 submitted 4 February, 2008;
originally announced February 2008.