-
Bayesian analysis for a class of beta mixed models
Authors:
Wagner Hugo Bonat,
Paulo Justiniano Ribeiro Jr,
Silvia Emiko Shimakura
Abstract:
Generalized linear mixed models (GLMMs) encompass a large class of statistical models with a vast range of application areas. GLMMs extend linear mixed models by allowing for different types of response variable. The three most common data types are continuous, count and binary, and the standard distributions for these types of response variable are the Gaussian, Poisson and binomial, respectively. Despite that flexibility, there are situations where the response variable is continuous but bounded, such as rates, percentages, indexes and proportions. In such situations the usual GLMMs are not adequate because they ignore the bounds, and the beta distribution can be used instead. Likelihood and Bayesian inference for beta mixed models are not straightforward and demand considerable computational overhead. Recently, a new algorithm for Bayesian inference called INLA (Integrated Nested Laplace Approximation) was proposed. INLA allows computation of many Bayesian GLMMs in a reasonable amount of time, allowing extensive comparison among models. We explore Bayesian inference for beta mixed models by INLA. We discuss the choice of prior distributions, sensitivity analysis and model selection measures through a real data set. The results obtained from INLA are compared with those obtained by an MCMC algorithm and by likelihood analysis. We analyze data from a study on a life quality index of industry workers collected according to a hierarchical sampling scheme. Results show that the INLA approach is suitable and faster for fitting the proposed beta mixed models, producing results similar to those of alternative algorithms and with easier handling of modeling alternatives. Sensitivity analysis, measures of goodness of fit and model choice are discussed.
Submitted 10 February, 2014; v1 submitted 13 January, 2014;
originally announced January 2014.
-
The Gamma-count distribution in the analysis of experimental underdispersed data
Authors:
Walmes Marques Zeviani,
Paulo Justiniano Ribeiro Jr.,
Wagner Hugo Bonat,
Silvia Emiko Shimakura,
Joel Augusti Muniz
Abstract:
Event counts are response variables with non-negative integer values representing the number of times that an event occurs within a fixed domain such as a time interval, a geographical area or a cell of a contingency table. Analysis of counts by Gaussian regression models ignores the discreteness, asymmetry and heteroscedasticity and is inefficient, providing unrealistic standard errors or possibly negative predictions of the expected number of events. Poisson regression is the standard model for count data, with underlying assumptions on the generating process which may be implausible in many applications. Statisticians have long recognized the limitation of imposing equidispersion under the Poisson regression model. A typical situation is when the conditional variance exceeds the conditional mean, in which case models allowing for overdispersion are routinely used. Less reported is the case of underdispersion, with fewer modelling alternatives and assessments available in the literature. One such alternative, the Gamma-count model, is adopted here in the analysis of an agronomic experiment designed to investigate the effect of levels of defoliation on different phenological states upon the number of cotton bolls. Results show improvements over the Poisson model and the semiparametric quasi-Poisson model in capturing the observed variability in the data. Estimating rather than assuming the underlying variance process leads to important insights into the process.
Submitted 9 December, 2013;
originally announced December 2013.
-
Likelihood analysis for a class of beta mixed models
Authors:
Wagner H. Bonat,
Paulo J. Ribeiro Jr.,
Walmes Marques Zeviani
Abstract:
Beta regression models are a suitable choice for continuous response variables on the unit interval. Random effects add further flexibility to the models and accommodate data structures such as hierarchical, repeated-measures and longitudinal designs, which typically induce extra variability and/or dependence. Closed-form expressions cannot be obtained for parameter estimation, so numerical methods are required, possibly combined with sampling algorithms. We focus on likelihood inference and related algorithms for the analysis of beta mixed models, motivated by two real problems with grouped data structures. The first is a study on a life quality index of industry workers with data collected according to a hierarchical sampling scheme. The second is a study with a nested and longitudinal data structure assessing the impact of hydroelectric power plants upon measures of water quality indexes upstream of, downstream of and at the reservoirs of the dammed rivers. Relevant scientific hypotheses are investigated by comparing alternative models. The analysis uses different algorithms including data cloning, an alternative to numerical approximations which also assesses identifiability. Confidence intervals based on profile likelihoods are compared to those obtained by asymptotic quadratic approximations, showing relevant differences for parameters related to the random effects.
Submitted 15 January, 2014; v1 submitted 9 December, 2013;
originally announced December 2013.