-
Comparing two spatial variables with the probability of agreement
Authors:
Jonathan Acosta,
Ronny Vallejos,
Aaron M. Ellison,
Felipe Osorio,
Mario de Castro
Abstract:
Computing the agreement between two continuous sequences is of great interest in statistics when comparing two instruments or one instrument with a gold standard. The probability of agreement (PA) quantifies the similarity between two variables of interest, and it is useful for accounting for what constitutes a practically important difference. In this article we introduce a generalization of the PA for the treatment of spatial variables. Our proposal makes the PA dependent on the spatial lag. As a consequence, for isotropic stationary and nonstationary spatial processes, the conditions under which the PA decays as a function of the distance lag are established. Estimation is addressed through a first-order approximation that guarantees the asymptotic normality of the sample version of the PA. The sensitivity of the PA with respect to the covariance parameters is studied for finite sample sizes. The new method is described and illustrated with real data involving autumnal changes in the green chromatic coordinate (Gcc), an index of "greenness" that captures the phenological stage of tree leaves, is associated with carbon flux from ecosystems, and is estimated from repeated images of forest canopies.
Submitted 14 December, 2022;
originally announced December 2022.
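As a rough illustration of the idea in the abstract above: the probability of agreement is commonly defined as P(|X - Y| <= c), where c is the largest difference still regarded as practically unimportant. The sketch below assumes the difference is Gaussian. To mimic dependence on the spatial lag, it also assumes a hypothetical exponential cross-covariance between the two variables. The function names, the parameters `rho` and `range_`, and the covariance form are illustrative assumptions, not the authors' model or estimator.

```python
import math
from statistics import NormalDist


def probability_of_agreement(mean_diff, sd_diff, c):
    """P(|D| <= c) for a difference D ~ Normal(mean_diff, sd_diff**2)."""
    if sd_diff <= 0 or c < 0:
        raise ValueError("sd_diff must be positive and c non-negative")
    d = NormalDist(mu=mean_diff, sigma=sd_diff)
    return d.cdf(c) - d.cdf(-c)


def pa_at_lag(h, c, sd_x, sd_y, rho, range_, mean_diff=0.0):
    """PA at distance lag h under an ASSUMED exponential cross-covariance.

    Hypothetical model: Cov(X(s), Y(s + h)) = rho * sd_x * sd_y * exp(-h / range_),
    so Var(X(s) - Y(s + h)) = sd_x**2 + sd_y**2 - 2 * Cov.
    """
    cross_cov = rho * sd_x * sd_y * math.exp(-h / range_)
    var_diff = sd_x ** 2 + sd_y ** 2 - 2.0 * cross_cov
    return probability_of_agreement(mean_diff, math.sqrt(var_diff), c)


if __name__ == "__main__":
    # Under these assumptions the PA shrinks as the lag grows.
    for lag in (0.0, 1.0, 5.0):
        print(lag, pa_at_lag(lag, c=1.0, sd_x=1.0, sd_y=1.0, rho=0.8, range_=2.0))
```

With a positive `rho`, the cross-covariance shrinks as the lag grows, the variance of the difference increases, and the PA therefore decreases with distance in this toy setting.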
-
Diagnostic tools for a multivariate negative binomial model for fitting correlated data with overdispersion
Authors:
Lizandra Castilho Fabio,
Cristian Villegas,
Jalmar M. F. Carrasco,
Mário de Castro
Abstract:
We focus on the development of diagnostic tools and an R package called MNB for a multivariate negative binomial (MNB) regression model for detecting atypical and influential subjects. The MNB model is deduced from a Poisson mixed model in which the random intercept follows the generalized log-gamma (GLG) distribution. The MNB model for correlated count data leads to an MNB regression model that inherits the features of a hierarchical model to accommodate the intraclass correlation and the occurrence of overdispersion simultaneously. The asymptotic consistency of the dispersion parameter estimator depends on the asymmetry of the GLG distribution. Inferential procedures for the MNB regression model are simple, although they can provide inconsistent estimates of the asymptotic variance when the correlation structure is misspecified. We propose the randomized quantile residual for checking the adequacy of the multivariate model, and derive global and local influence measures from the multivariate model to assess influential subjects. Finally, two applications are presented in the data analysis section. The code for installing the MNB package and the code used in the two examples are given in the Appendix.
Submitted 15 February, 2021;
originally announced February 2021.
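The randomized quantile residual mentioned in the abstract above can be sketched for a single negative binomial count. For a discrete response, a value u is drawn uniformly between F(y - 1) and F(y), where F is the fitted distribution function, and the residual is the standard normal quantile of u. The code below is a minimal univariate sketch in Python, not the multivariate residual of the paper and not the MNB package interface. The mean and size parameterization and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu and dispersion (size) parameter."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)


def nb_cdf(y, mu, size):
    """P(Y <= y); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    return min(1.0, sum(nb_pmf(k, mu, size) for k in range(y + 1)))


def randomized_quantile_residual(y, mu, size, rng):
    """Draw u ~ Uniform(F(y - 1), F(y)) and map it to a standard normal quantile."""
    lower = nb_cdf(y - 1, mu, size)
    upper = nb_cdf(y, mu, size)
    u = rng.uniform(lower, upper)
    # Keep u strictly inside (0, 1) so the normal quantile stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(u)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the sketch reproducible
    print(randomized_quantile_residual(3, mu=2.0, size=1.5, rng=rng))
```

If the fitted model is adequate, residuals computed this way over many observations should look approximately standard normal, which is what makes them convenient for diagnostic plots.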
-
A new regression model for positive data
Authors:
Marcelo Bourguignon,
Manoel Santos-Neto,
Mário de Castro
Abstract:
In this paper, we propose a regression model where the response variable is beta prime distributed using a new parameterization of this distribution that is indexed by mean and precision parameters. The proposed regression model is useful for situations where the variable of interest is continuous and restricted to the positive real line and is related to other variables through the mean and precision parameters. The variance function of the proposed model has a quadratic form. In addition, the beta prime model has properties that its competitor distributions of the exponential family do not have. Estimation is performed by maximum likelihood. Furthermore, we discuss residuals and influence diagnostic tools. Finally, we also carry out an application to real data that demonstrates the usefulness of the proposed model.
Submitted 20 April, 2018;
originally announced April 2018.
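To make the "mean and precision" parameterization in the abstract above concrete, the sketch below maps a mean mu and precision phi to the usual beta prime shape parameters and checks the resulting moments. The mapping alpha = mu(1 + phi), beta = phi + 2 is an assumption chosen here because it yields mean mu and the quadratic variance mu(1 + mu)/phi. The abstract itself does not state the mapping, so treat it as illustrative.

```python
def beta_prime_shapes(mu, phi):
    """Map (mean, precision) to beta prime shapes under the ASSUMED mapping
    alpha = mu * (1 + phi), beta = phi + 2."""
    if mu <= 0 or phi <= 0:
        raise ValueError("mu and phi must be positive")
    return mu * (1.0 + phi), phi + 2.0


def beta_prime_moments(alpha, beta):
    """Mean and variance of the standard beta prime distribution (needs beta > 2)."""
    if beta <= 2:
        raise ValueError("the variance exists only for beta > 2")
    mean = alpha / (beta - 1.0)
    var = alpha * (alpha + beta - 1.0) / ((beta - 2.0) * (beta - 1.0) ** 2)
    return mean, var


if __name__ == "__main__":
    mu, phi = 2.0, 5.0
    alpha, beta = beta_prime_shapes(mu, phi)
    mean, var = beta_prime_moments(alpha, beta)
    # The variance is quadratic in the mean: mu * (1 + mu) / phi.
    print(mean, var, mu * (1.0 + mu) / phi)
```

Under this mapping, larger values of phi shrink the variance for a fixed mean, which is why phi reads naturally as a precision parameter.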
-
A model selection approach for multiple sequence segmentation and dimensionality reduction
Authors:
Bruno M. de Castro,
Florencia Leonardi
Abstract:
In this paper we consider the problem of segmenting $n$ aligned random sequences of equal length $m$ into a finite number of independent blocks. We propose to use a penalized maximum likelihood criterion to infer simultaneously the number of points of independence as well as the position of each of these points. We show how to compute the estimator efficiently by means of a dynamic programming algorithm with time complexity $O(m^2n)$. We also propose another algorithm, called the hierarchical algorithm, that provides an approximation to the estimator when the sample size increases and runs in time $O(mn)$. Our main theoretical result is the proof of almost sure consistency of the estimator and the convergence of the hierarchical algorithm when the sample size $n$ grows to infinity. We illustrate the convergence of these algorithms through some simulation examples and we apply the method to a real protein sequence alignment of Ebola virus.
Submitted 8 January, 2015;
originally announced January 2015.
-
A Poisson Mixed Model with Nonnormal Random Effect Distribution
Authors:
Lizandra C. Fabio,
Gilberto A. Paula,
Mario de Castro
Abstract:
We propose in this paper a random intercept Poisson model in which the random effect distribution is assumed to follow a generalized log-gamma (GLG) distribution. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are in general required for deriving the marginal models, we obtain the multivariate negative binomial model for a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analyses are proposed and two applications with real data are given for illustration.
Submitted 10 May, 2011;
originally announced May 2011.