-
Analysis of a longitudinal multilevel experiment using GAMLSSs
Authors:
Gustavo Thomas,
Alexandre Igor de Azevedo Pereira,
Cristian Marcelo Villegas Lobos,
Clarice G. B. Demétrio.
Abstract:
The standard procedures for analysing hierarchical or grouped data are (non)linear mixed models or generalized mixed models. However, the generalized additive models for location, scale and shape (GAMLSSs) also allow different types of random effects to be included in the model formulation. Even though already popular in many areas of research, this type of model does not appear to have been used for mixed modeling purposes yet. Therefore, this paper describes the analysis of a plant growth experiment using mixed GAMLSSs, comparing it to a linear mixed model approach.
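For orientation, the usual textbook form of a GAMLSS with random effects can be sketched as follows. The notation ($\theta_k$, $X_k$, $Z_{jk}$, $\gamma_{jk}$, $G_{jk}$) is an assumption introduced here and is not taken from the abstract; the paper's own model may differ in detail.

```latex
% Each distribution parameter theta_k (location, scale, shape) gets its own predictor
g_k(\theta_k) = X_k \beta_k + \sum_{j=1}^{J_k} Z_{jk} \gamma_{jk},
\qquad \gamma_{jk} \sim N\!\left(0,\, G_{jk}^{-1}\right),
\qquad k = 1, \dots, 4,
\qquad (\theta_1, \theta_2, \theta_3, \theta_4) = (\mu, \sigma, \nu, \tau).

% For comparison, a linear mixed model places random effects in the mean only
y = X \beta + Z \gamma + \varepsilon,
\qquad \gamma \sim N(0, G), \qquad \varepsilon \sim N(0, \sigma^2 I).
```

Under this reading, the linear mixed model is the special case with a normal response in which only $\mu$ carries a predictor, while a mixed GAMLSS may also let scale and shape parameters depend on covariates and random effects.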
Submitted 7 October, 2018;
originally announced October 2018.
-
Modeling data with zero inflation and overdispersion using GAMLSSs
Authors:
Gustavo Thomas,
Luiz R. Nakamura,
Rafael A. Moral,
Clarice G. B. Demétrio
Abstract:
Count data with high frequencies of zeros are found in many areas, especially in biology. Statistical models to analyze such data started to be developed in the 1980s and are still a topic of active research. Such models usually assume a response distribution that belongs to the exponential family of distributions and the analysis is performed under the generalized linear models framework. However, the generalized additive models for location, scale and shape (GAMLSSs) represent a more general class of univariate models that can also be used to model zero-inflated data. In this paper, the analysis of a data set with excess zeros and overdispersion is described using GAMLSSs. Specific GAMLSS tools were used in the analysis, which enhanced model comparison and eased the interpretation of results.
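To make "excess zeros and overdispersion" concrete, here is a minimal sketch of a zero-inflated Poisson distribution, one standard example of a zero-inflated count model. The abstract does not say which distributions the paper fits, so the choice of distribution and the function names (`zip_pmf`, `zip_mean`, `zip_variance`) are assumptions made for illustration only.

```python
import math


def zip_pmf(y, mu, pi):
    """P(Y = y) for a zero-inflated Poisson.

    mu > 0 is the mean of the Poisson component and 0 <= pi < 1 is the
    probability of an extra (structural) zero. Illustrative helper only.
    """
    # Poisson probability computed on the log scale for numerical stability.
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    extra_zero = pi if y == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson


def zip_mean(mu, pi):
    """E[Y] = (1 - pi) * mu."""
    return (1.0 - pi) * mu


def zip_variance(mu, pi):
    """Var[Y] = (1 - pi) * mu * (1 + pi * mu), which exceeds the mean when pi > 0."""
    return (1.0 - pi) * mu * (1.0 + pi * mu)
```

Because the variance exceeds the mean whenever `pi > 0`, zero inflation alone already induces overdispersion relative to a plain Poisson model.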
Submitted 5 October, 2018;
originally announced October 2018.
-
Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Data
Authors:
Eduardo E. Ribeiro Jr,
Walmes M. Zeviani,
Wagner H. Bonat,
Clarice G. B. Demétrio,
John Hinde
Abstract:
In the analysis of count data the equidispersion assumption is often unsuitable, and hence the Poisson regression model is inappropriate. As a generalization of the Poisson distribution, the COM-Poisson distribution can deal with under-, equi- and overdispersed count data. It is a member of the exponential family of distributions and has well-known special cases. In spite of the nice properties of the COM-Poisson distribution, its location parameter does not correspond to the expectation, which complicates the interpretation of regression models. In this paper, we propose a straightforward reparametrization of the COM-Poisson distribution based on an approximation to the expectation of this distribution. The main advantage of our new parametrization is the straightforward interpretation of the regression coefficients in terms of the expectation, as usual in the context of generalized linear models. Furthermore, the estimation and inference for the new COM-Poisson regression model can be done based on the likelihood paradigm. We carried out simulation studies to verify the finite sample properties of the maximum likelihood estimators. The results from our simulation study show that the maximum likelihood estimators are unbiased and consistent for both regression and dispersion parameters. We observed that the empirical correlation between the regression and dispersion parameter estimators is close to zero, which suggests that these parameters are orthogonal. We illustrate the application of the proposed model through the analysis of three data sets with over-, under- and equidispersed count data. The study of distribution properties through a consideration of dispersion, zero-inflation and heavy-tail indexes, together with the results of the data analysis, shows the flexibility of the model over standard approaches.
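The idea of a mean-type reparametrization can be sketched numerically. The abstract does not state the approximation it uses, so the sketch below assumes the widely cited asymptotic approximation $E[Y] \approx λ^{1/ν} - (ν - 1)/(2ν)$ for a COM-Poisson with rate $λ$ and dispersion $ν$; the function names and the truncation point `max_y` are hypothetical.

```python
import math


def lambda_from_mu(mu, nu):
    """Invert the assumed approximation mu = lam**(1/nu) - (nu - 1)/(2*nu)."""
    base = mu + (nu - 1.0) / (2.0 * nu)
    if base <= 0.0:
        raise ValueError("mu + (nu - 1)/(2 nu) must be positive")
    return base ** nu


def com_poisson_pmf(lam, nu, max_y=500):
    """COM-Poisson probabilities for y = 0..max_y, with a truncated normalizing sum."""
    # Unnormalized log-probabilities: y*log(lam) - nu*log(y!).
    log_terms = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_y + 1)]
    shift = max(log_terms)  # subtract the maximum to avoid overflow
    weights = [math.exp(t - shift) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


def com_poisson_mean(lam, nu, max_y=500):
    """Mean of the truncated COM-Poisson distribution, computed by direct summation."""
    pmf = com_poisson_pmf(lam, nu, max_y)
    return sum(y * p for y, p in enumerate(pmf))


if __name__ == "__main__":
    # Compare the target mean mu with the numerically computed mean.
    for nu in (0.5, 1.0, 2.0):
        lam = lambda_from_mu(5.0, nu)
        print(nu, com_poisson_mean(lam, nu))
```

With $ν = 1$ the mapping reduces to $λ = μ$ and the Poisson mean is recovered exactly; for other values of $ν$ the numerically computed mean is only approximately equal to $μ$, which is the sense in which regression coefficients can be read in terms of the expectation.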
Submitted 29 January, 2018;
originally announced January 2018.
-
Extended Poisson-Tweedie: properties and regression models for count data
Authors:
Wagner H. Bonat,
Bent Jørgensen,
Célestin C. Kokonendji,
John Hinde,
Clarice G. B. Demétrio
Abstract:
We propose a new class of discrete generalized linear models based on the class of Poisson-Tweedie factorial dispersion models with variance of the form $μ + φμ^p$, where $μ$ is the mean and $φ$ and $p$ are the dispersion and Tweedie power parameters, respectively. The models are fitted by using an estimating function approach obtained by combining the quasi-score and Pearson estimating functions for estimation of the regression and dispersion parameters, respectively. This provides a flexible and efficient regression methodology for a comprehensive family of count models including Hermite, Neyman Type A, Pólya-Aeppli, negative binomial and Poisson-inverse Gaussian. The estimating function approach allows us to extend the Poisson-Tweedie distributions to deal with underdispersed count data by allowing negative values for the dispersion parameter $φ$. Furthermore, the Poisson-Tweedie family can automatically adapt to highly skewed count data with excessive zeros, without the need to introduce zero-inflated or hurdle components, by the simple estimation of the power parameter. Thus, the proposed models offer a unified framework to deal with under-, equi- and overdispersed, zero-inflated and heavy-tailed count data. The computational implementation of the proposed models is fast, relying only on a simple Newton scoring algorithm. Simulation studies showed that the estimating function approach provides unbiased and consistent estimators for both regression and dispersion parameters. We highlight the ability of the Poisson-Tweedie distributions to deal with count data through a consideration of dispersion, zero-inflation and heavy-tail indices, and illustrate their application with four data analyses. We provide an \texttt{R} implementation and the data sets as supplementary materials.
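The variance form $μ + φμ^p$ quoted in the abstract can be explored with a minimal sketch. The function names are hypothetical, and the dispersion index used here (variance divided by mean) is an assumed, common definition; the paper's own indices may be defined differently.

```python
def poisson_tweedie_variance(mu, phi, p):
    """Variance mu + phi * mu**p from the abstract, for mean mu > 0."""
    if mu <= 0.0:
        raise ValueError("mu must be positive")
    variance = mu + phi * mu ** p
    if variance <= 0.0:
        # Negative phi is allowed only while the variance stays positive.
        raise ValueError("phi is too negative for this mu and p")
    return variance


def dispersion_index(mu, phi, p):
    """Variance-to-mean ratio: 1 + phi * mu**(p - 1).

    Values above 1 indicate overdispersion, 1 equidispersion,
    and values below 1 underdispersion.
    """
    return poisson_tweedie_variance(mu, phi, p) / mu
```

With `p = 2` the variance takes the familiar negative binomial form $μ + φμ^2$, `phi = 0` gives the Poisson case, and a small negative `phi` yields an index below one, which matches the abstract's point that negative dispersion values accommodate underdispersion.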
Submitted 11 September, 2016; v1 submitted 24 August, 2016;
originally announced August 2016.
-
A Family of Generalized Linear Models for Repeated Measures with Normal and Conjugate Random Effects
Authors:
Geert Molenberghs,
Geert Verbeke,
Clarice G. B. Demétrio,
Afrânio M. C. Vieira
Abstract:
Non-Gaussian outcomes are often modeled using members of the so-called exponential family. Well-known members are the Bernoulli model for binary data, leading to logistic regression, and the Poisson model for count data, leading to Poisson regression. Two of the main reasons for extending this family are (1) the occurrence of overdispersion, meaning that the variability in the data is not adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of hierarchical structure in the data, stemming from clustering in the data which, in turn, may result from repeatedly measuring the outcome, for various members of the same family, etc. The first issue is dealt with through a variety of overdispersion models, such as the beta-binomial model for grouped binary data and the negative-binomial model for counts. Clustering is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. While both of these phenomena may occur simultaneously, models combining them are uncommon. This paper proposes a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. We place particular emphasis on so-called conjugate random effects at the level of the mean for the first aspect and normal random effects embedded within the linear predictor for the second aspect, even though our family is more general. The binary, count and time-to-event cases are given particular emphasis. Apart from model formulation, we present an overview of estimation methods, and then settle for maximum likelihood estimation with analytic-numerical integration. Implications for the derivation of marginal correlation functions are discussed. The methodology is applied to data from a study of epileptic seizures, a clinical trial on the toenail infection onychomycosis and survival data in children with asthma.
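As a rough illustration of the two sets of random effects, the count case might be written as below. This is a sketch consistent with the abstract's description (a conjugate effect at the level of the mean and a normal effect inside the linear predictor); the symbols $\theta_{ij}$, $\kappa_{ij}$, $\xi$, $b_i$, $D$, $\alpha$ and $\beta$ are assumptions introduced here, not notation confirmed by the abstract.

```latex
% Outcome j of subject i, given both random effects
Y_{ij} \mid b_i, \theta_{ij} \sim \text{Poisson}\!\left(\theta_{ij}\, \kappa_{ij}\right),
\qquad
\kappa_{ij} = \exp\!\left(x_{ij}^{\top} \xi + z_{ij}^{\top} b_i\right),

% Normal random effect for clustering, gamma (Poisson-conjugate) effect for overdispersion
b_i \sim N(0, D),
\qquad
\theta_{ij} \sim \text{Gamma}(\alpha, \beta).
```

Setting $\theta_{ij} \equiv 1$ would leave a Poisson model with normal random effects only, while dropping $b_i$ would leave a gamma-Poisson (negative-binomial type) overdispersion model, so the combined form contains both as special cases.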
Submitted 5 January, 2011;
originally announced January 2011.