Multiple imputation of multilevel missing data: An introduction to the R package pan
Authors: Simon Grund, Oliver Lüdtke, Alexander Robitzsch
Abstract:
The treatment of missing data can be difficult in multilevel research because state-of-the-art procedures such as multiple imputation (MI) may require advanced statistical knowledge or a high degree of familiarity with certain statistical software. In the missing data literature, pan has been recommended for MI of multilevel data. In this article, we provide an introduction to MI of multilevel missing data using the R package pan, and we discuss its possibilities and limitations in accommodating typical questions in multilevel research. In order to make pan more accessible to applied researchers, we make use of the mitml package, which provides a user-friendly interface to the pan package and several tools for managing and analyzing multiply imputed data sets. We illustrate the use of pan and mitml with two empirical examples that represent common applications of multilevel models, and we discuss how these procedures may be used in conjunction with other software.
Submitted 4 November, 2016; originally announced November 2016.
Multiple imputation of missing covariate values in multilevel models with random slopes: A cautionary note
Authors: Simon Grund, Oliver Lüdtke, Alexander Robitzsch
Abstract:
Multiple imputation (MI) has become one of the main procedures used to treat missing data, but the guidelines from the methodological literature are not easily transferred to multilevel research. For models including random slopes, proper MI can be difficult, especially when the covariate values are partially missing. In the present article, we discuss applications of MI in multilevel random-coefficient models, theoretical challenges posed by slope variation, and the current limitations of standard MI software. Our findings from three simulation studies suggest that (a) MI is able to recover most parameters, but is currently not well suited to capture slope variation entirely when covariate values are missing; (b) MI offers reasonable estimates for most parameters, even in smaller samples or when its assumptions are not met; and (c) listwise deletion can be an alternative worth considering when preserving the slope variance is particularly important.
Submitted 16 June, 2016; originally announced June 2016.