doi: 10.1093/biomtc/ujae009

Comparing two spatial variables with the probability of agreement

Authors: Jonathan Acosta, Ronny Vallejos, Aaron M. Ellison, Felipe Osorio, Mario de Castro

Abstract: Computing the agreement between two continuous sequences is of great interest in statistics when comparing two instruments or one instrument with a gold standard. The probability of agreement (PA) quantifies the similarity between two variables of interest, and it is useful for accounting what constitutes a practically important difference. In this article we introduce a generalization of the PA for the treatment of spatial variables. Our proposal makes the PA dependent on the spatial lag. As a consequence, for isotropic stationary and nonstationary spatial processes, the conditions for which the PA decays as a function of the distance lag are established. Estimation is addressed through a first-order approximation that guarantees the asymptotic normality of the sample version of the PA. The sensitivity of the PA is studied for finite sample size, with respect to the covariance parameters. The new method is described and illustrated with real data involving autumnal changes in the green chromatic coordinate (Gcc), an index of "greenness" that captures the phenological stage of tree leaves, is associated with carbon flux from ecosystems, and is estimated from repeated images of forest canopies.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 34 pages, 16 figures

Journal ref: Biometrics 80, ujae009, 2024

arXiv:2102.07752

doi: 10.1080/03610926.2021.1939380

Diagnostic tools for a multivariate negative binomial model for fitting correlated data with overdispersion

Authors: Lizandra Castilho Fabio, Cristian Villegas, Jalmar M. F. Carrasco, Mário de Castro

Abstract: We focus on the development of diagnostic tools and an R package called MNB for a multivariate negative binomial (MNB) regression model for detecting atypical and influential subjects. The MNB model is deduced from a Poisson mixed model in which the random intercept follows the generalized log-gamma (GLG) distribution. The MNB model for correlated count data leads to an MNB regression model that inherits the features of a hierarchical model to accommodate the intraclass correlation and the occurrence of overdispersion simultaneously. The asymptotic consistency of the dispersion parameter estimator depends on the asymmetry of the GLG distribution. Inferential procedures for the MNB regression model are simple, although it can provide inconsistent estimates of the asymptotic variance when the correlation structure is misspecified. We propose the randomized quantile residual for checking the adequacy of the multivariate model, and derive global and local influence measures from the multivariate model to assess influential subjects. Finally, two applications are presented in the data analysis section. The code for installing the MNB package and the code used in the two examples is exhibited in the Appendix.

Submitted 15 February, 2021; originally announced February 2021.

arXiv:1804.07734

A new regression model for positive data

Authors: Marcelo Bourguignon, Manoel Santos-Neto, Mário de Castro

Abstract: In this paper, we propose a regression model where the response variable is beta prime distributed using a new parameterization of this distribution that is indexed by mean and precision parameters. The proposed regression model is useful for situations where the variable of interest is continuous and restricted to the positive real line and is related to other variables through the mean and precision parameters. The variance function of the proposed model has a quadratic form. In addition, the beta prime model has properties that its competitor distributions of the exponential family do not have. Estimation is performed by maximum likelihood. Furthermore, we discuss residuals and influence diagnostic tools. Finally, we also carry out an application to real data that demonstrates the usefulness of the proposed model.

Submitted 20 April, 2018; originally announced April 2018.

Comments: 20 pages and 11 figures

arXiv:1501.01756

A model selection approach for multiple sequence segmentation and dimensionality reduction

Authors: Bruno M. de Castro, Florencia Leonardi

Abstract: In this paper we consider the problem of segmenting $n$ aligned random sequences of equal length $m$, into a finite number of independent blocks. We propose to use a penalized maximum likelihood criterion to infer simultaneously the number of points of independence as well as the position of each one of these points. We show how to compute the estimator efficiently by means of a dynamic programming algorithm with time complexity $O(m^2n)$. We also propose another algorithm, called hierarchical algorithm, that provides an approximation to the estimator when the sample size increases and runs in time $O(mn)$. Our main theoretical result is the proof of almost sure consistency of the estimator and the convergence of the hierarchical algorithm when the sample size $n$ grows to infinity. We illustrate the convergence of these algorithms through some simulation examples and we apply the method to a real protein sequence alignment of Ebola Virus.

Submitted 8 January, 2015; originally announced January 2015.

arXiv:1105.2072

A Poisson Mixed Model with Nonnormal Random Effect Distribution

Authors: Lizandra C. Fabio, Gilberto A. Paula, Mario de Castro

Abstract: We propose in this paper a random intercept Poisson model in which the random effect distribution is assumed to follow a generalized log-gamma (GLG) distribution. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are in general required for deriving the marginal models, we obtain the multivariate negative binomial model for a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis are proposed and two applications with real data are given for illustration.

Submitted 10 May, 2011; originally announced May 2011.

Comments: Submitted to the journal Computational Statistics & Data Analysis

Showing 1–5 of 5 results for author: de Castro, M