-
Multivariate Generalised Linear Mixed Models With Graphical Latent Covariance Structure
Authors:
Jeanett S. Pelck,
Rodrigo Labouriau
Abstract:
This paper introduces a method for studying the correlation structure of a range of responses modelled by a multivariate generalised linear mixed model (MGLMM). The methodology requires the existence of clusters of observations and that each of the several responses studied is modelled using a generalised linear mixed model (GLMM) containing random components representing the clusters. We construct a MGLMM by assuming that the distribution of each of the random components representing the clusters is the marginal distribution of a (sufficiently regular) multivariate elliptically contoured distribution. We use an undirected graphical model to represent the correlation structure of the random components representing the clusters of observations for each response. This representation allows us to draw conclusions regarding unknown underlying determining factors related to the clusters of observations. Using a combination of an undirected graph and a directed acyclic graph (DAG), we jointly represent the correlation structure of the responses and the related random components. Applying the theory of graphical models allows us to describe and draw conclusions on the correlation and, in some cases, the dependence between responses of different statistical nature (e.g., following different distributions or having different linear predictors and link functions). We present some simulation studies illustrating the proposed methodology.
Submitted 30 July, 2021;
originally announced July 2021.
-
Conditional Inference for Multivariate Generalised Linear Mixed Models
Authors:
Jeanett S. Pelck,
Rodrigo Labouriau
Abstract:
We propose a method for inference in generalised linear mixed models (GLMMs) and several extensions of these models. First, we extend the GLMM by allowing the distribution of the random components to be non-Gaussian, that is, assuming an absolutely continuous distribution with respect to the Lebesgue measure that is symmetric around zero, unimodal and with finite moments up to the fourth order. Second, we allow the conditional distribution to follow a dispersion model instead of an exponential dispersion model. Finally, we extend these models to a multivariate framework where multiple responses are combined by imposing a multivariate absolutely continuous distribution on the random components representing common clusters of observations in all the marginal models.
Maximum likelihood inference in these models involves evaluating an integral that often cannot be computed in closed form. We suggest an inference method that predicts values of random components and does not involve the integration of conditional likelihood quantities.
The multivariate GLMMs that we study can be constructed from marginal GLMMs of different statistical nature and, at the same time, represent complex dependence structures, providing a rather flexible tool for applications.
Submitted 25 July, 2021;
originally announced July 2021.
-
Multivariate Methods for Detection of Rubbery Rot in Storage Apples by Monitoring Volatile Organic Compounds: An Example of Multivariate Generalised Mixed Models
Authors:
J. S. Pelck,
H. Holthusen,
M. Edelenbos,
A. Luca,
R. Labouriau
Abstract:
This article is a case study illustrating the use of a multivariate statistical method for screening potential chemical markers for early detection of post-harvest disease in storage fruit. We simultaneously measure a range of volatile organic compounds (VOCs) and two measures of severity of disease infection in apples under storage: the number of apples presenting visible symptoms and the lesion area. We use multivariate generalised linear mixed models (MGLMMs) for studying association patterns of those simultaneously observed responses via the covariance structure of random components. Remarkably, those MGLMMs can be used to represent patterns of association between quantities of different statistical nature. In the particular example considered in this paper, there are positive responses (concentrations of VOCs, modelled with Gamma-distribution-based models), positive responses possibly containing observations with zero values (lesion area, modelled with compound-Poisson-distribution-based models) and binomially distributed responses (proportion of apples presenting infection symptoms). We represent patterns of association inferred with the MGLMMs using graphical models (a network represented by a graph), which allow us to eliminate spurious associations due to a cascade of indirect correlations between the responses.
Submitted 23 July, 2021;
originally announced July 2021.
-
A Multivariate Methodology for Analysing Students' Performance Using Register Data
Authors:
Jeanett S. Pelck,
Rafael Pimentel Maia,
Hildete P. Pinheiro,
Rodrigo Labouriau
Abstract:
We present a new method for jointly modelling the students' results in the university's admission exams and their performance in subsequent courses at the university. The case considered involves all the students who enrolled at the University of Campinas in 2014 in evening study programmes in educational branches related to the exact sciences. We collected the number of attempts used for passing the university course in geometry and the results of the admission exams of those students in seven disciplines. The method introduced involves a combination of multivariate generalised linear mixed models (GLMMs) and graphical models for representing the covariance structure of the random components. The models we used allowed us to discuss the association of quantities of very different nature. We used a Gaussian GLMM for modelling the performance in the admission exams and a frailty discrete-time Cox proportional model, represented by a GLMM, to describe the number of attempts for passing geometry.
The analyses were stratified into two populations: the students who received a bonus giving advantages in the university's admission process to compensate for social and racial inequalities, and those who did not receive the compensation. The two populations presented different patterns. Using general properties of graphical models, we argue that, on the one hand, the predicted performance in the Mathematics admission exam could be used as the sole predictor of the performance in geometry for the students who received the bonus. On the other hand, the predicted performance in the Portuguese admission exam could be used as the sole predictor of the performance in geometry for the students who did not receive the bonus.
Submitted 21 February, 2021;
originally announced February 2021.
-
Using Multivariate Generalised Linear Mixed Models for Studying Roots Development: An Example Based on Minirhizotron Observations
Authors:
Jeanett S. Pelck,
Rodrigo Labouriau
Abstract:
The characterisation of the spatial and temporal distribution of the root system in a cultivated field depends on the soil volume occupied by the root systems (the scatter) and the local intensity of the root colonisation in the field (the intensity). We introduce a multivariate generalised linear mixed model for simultaneously describing the scatter and the intensity using data obtained with minirhizotrons (i.e., tubes with observation windows, which are inserted in the soil, enabling direct observation of the roots). The models presented allow the study of intricate spatial and temporal dependence patterns using a graphical model to represent the dependence structure of latent random components.
The scatter is described by a binomial mixed model (presence of roots in observation windows). The number of roots crossing the reference lines in the observation windows of the minirhizotron is used to estimate the intensity through a specially defined Poisson mixed model. We exploit the fact that it is possible to construct multivariate extensions of generalised linear mixed models that simultaneously represent patterns of dependency of the scatter and the intensity over time and space.
We present an example where the intensity and scatter are simultaneously determined at three different time points. A positive association between the intensity and scatter at each time point was found, suggesting that the plants are not compensating for a reduced occupation of the soil by increasing the number of roots per volume of soil. Using the general properties of graphical models, we identify a first-order Markovian dependence pattern between successively observed scatters and intensities. This lack of memory indicates that no long-lasting temporal causal effects are affecting the roots' development. The two dependence patterns described above cannot be detected with univariate models.
Submitted 1 November, 2020;
originally announced November 2020.