SMABS 2004 Jena University

Contributions: Abstract

Multiple imputation of missing data under a multilevel setting

Gert Jacobusse, Stef van Buuren
TNO
Karin Groothuis-Oudshoorn
Roessingh Research and Development
The Netherlands

Missing data are an inevitable by-product of any data gathering technique. For a statistician, these missing data often pose a substantial challenge. Multiple imputation (Rubin, 1987) is a general and practical approach that yields valid inferences from incomplete data. Van Buuren and Oudshoorn (2000) describe a sequential Gibbs sampling method called MICE for imputing multivariate incomplete data. Imputations are drawn from the posterior predictive distribution, which is updated in each iteration. Simulations have shown that the imputations converge towards the appropriate multivariate posterior distribution of the incomplete information.
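The sequential scheme described above can be illustrated with a minimal Python sketch. This is not the C++ software presented here, nor the MICE implementation itself: it assumes that every incomplete variable is continuous and can be imputed with a normal linear model, and the names draw_bayes_linear and chained_equations, as well as the small ridge term, are hypothetical choices made for illustration only.

```python
import numpy as np


def draw_bayes_linear(X_obs, y_obs, X_mis, rng, ridge=1e-5):
    """Draw imputations from the posterior predictive distribution of a
    normal linear model (assumed imputation model for this sketch)."""
    n, p = X_obs.shape
    xtx = X_obs.T @ X_obs
    # Small ridge term (an assumption) keeps the matrix invertible.
    v = np.linalg.inv(xtx + ridge * np.diag(np.diag(xtx)))
    beta_hat = v @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    df = max(n - p, 1)
    # Draw the residual variance, then the coefficients given that variance.
    sigma2 = resid @ resid / rng.chisquare(df)
    chol = np.linalg.cholesky(sigma2 * v)
    beta = beta_hat + chol @ rng.standard_normal(p)
    # Add residual noise so imputations reflect predictive uncertainty.
    noise = rng.normal(0.0, np.sqrt(sigma2), size=X_mis.shape[0])
    return X_mis @ beta + noise


def chained_equations(data, n_iter=10, seed=0):
    """Return one completed copy of `data` (NaN marks missing values)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    n_rows, n_cols = data.shape

    # Starting values: random draws from the observed values of each column.
    for j in range(n_cols):
        observed = data[~missing[:, j], j]
        filled[missing[:, j], j] = rng.choice(observed, size=missing[:, j].sum())

    # Visit each incomplete variable in turn, conditioning on the current
    # values of all other variables, and repeat for n_iter iterations.
    for _ in range(n_iter):
        for j in range(n_cols):
            mis_j = missing[:, j]
            if not mis_j.any():
                continue
            others = np.delete(filled, j, axis=1)
            X = np.column_stack([np.ones(n_rows), others])
            filled[mis_j, j] = draw_bayes_linear(
                X[~mis_j], filled[~mis_j, j], X[mis_j], rng
            )
    return filled
```

Calling chained_equations several times with different seeds gives several completed data sets, which is the starting point for a multiple imputation analysis. Observed values are never altered; only the entries that were missing are replaced by draws.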

In multilevel settings, units cannot be assumed to be independently and identically distributed across classes. Multilevel models have become a popular way of handling such data. We extend the ideas of Rubin and propose a method for imputing missing data in a multilevel setting using a sequential imputation scheme.

We have developed C++ software for Windows that implements the MICE method and our extension for multilevel data. The program provides some helpful diagnostics. The interface is quite basic: it reads and writes tab-delimited ASCII files and can be controlled through a straightforward syntax, although most of the functionality is also available via menus. The presentation will include a demonstration of the software prototype.

References

Rubin, D.B. (1987). Multiple imputation for nonresponse in surveys. New York: John Wiley & Sons.

Van Buuren, S. and Oudshoorn, C.G.M. (2000). Multivariate imputation by chained equations. TNO report PG/VGZ/00.038.