Missing Data Analyses

doi:10.1017/9781316995808.040

33 - Missing Data Analyses

from Part VII - General Analytic Considerations

Published online by Cambridge University Press: 23 March 2020

Amanda N. Baraldi and Craig K. Enders

Edited by Aidan G. C. Wright and Michael N. Hallquist

Aidan G. C. Wright, University of Pittsburgh
Michael N. Hallquist, Pennsylvania State University

Summary

The methodological literature recommends multiple imputation and maximum likelihood estimation as best practices for handling missing data in published research. Relative to older methods such as listwise and pairwise deletion, these approaches are preferable because they rely on a less stringent assumption about how missingness relates to the analysis variables. Furthermore, in contrast to deletion methods, multiple imputation and maximum likelihood estimation enable researchers to include all available data in the analysis, resulting in increased statistical power. This chapter provides an overview of multiple imputation and maximum likelihood estimation for handling missing data. Using an example from a study of predictors of depressive symptoms in children with juvenile rheumatic diseases, the chapter illustrates both methods using a variety of statistical software packages.

Keywords

missing data; multiple imputation; maximum likelihood estimation; missing at random

Information

Type: Chapter
Information: The Cambridge Handbook of Research Methods in Clinical Psychology, pp. 444–466

DOI: https://doi.org/10.1017/9781316995808.040

Publisher: Cambridge University Press

Print publication year: 2020

