33 - Missing Data Analyses

from Part VII - General Analytic Considerations

Published online by Cambridge University Press: 23 March 2020

Aidan G. C. Wright
Affiliation:
University of Pittsburgh
Michael N. Hallquist
Affiliation:
Pennsylvania State University

Summary

The methodological literature recommends multiple imputation and maximum likelihood estimation as best practices for handling missing data in published research. These approaches are preferable to older methods such as listwise and pairwise deletion because they rely on a less stringent assumption about how missingness relates to the analysis variables. Furthermore, unlike deletion methods, multiple imputation and maximum likelihood estimation allow researchers to include all available data in the analysis, which increases statistical power. This chapter provides an overview of both approaches to handling missing data. Using an example from a study of predictors of depressive symptoms in children with juvenile rheumatic diseases, the chapter illustrates multiple imputation and maximum likelihood estimation in a variety of statistical software packages.
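
The contrast between deletion and multiple imputation can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the chapter's analysis: the data are simulated (they are not the juvenile rheumatic disease data), the variable names are hypothetical, and the imputation step uses a simple bootstrap-based stochastic regression imputation as a stand-in for the procedures implemented in dedicated software. It compares the listwise-deletion estimate of a mean with an estimate pooled across imputed data sets using Rubin's rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): y depends on x, and y is
# more likely to be missing when x is high, so missingness depends only on
# an observed variable.
n = 2000
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.6, size=n)  # true mean of y is 1.0
p_missing = 1.0 / (1.0 + np.exp(-1.5 * x))
missing = rng.random(n) < p_missing
y_obs = np.where(missing, np.nan, y)

# Listwise deletion: use only the cases where y was observed.
listwise_mean = np.nanmean(y_obs)


def impute_once(x, y_obs, rng):
    """Fill in missing y values once by stochastic regression imputation.

    The regression is refit on a bootstrap sample of the complete cases so
    that uncertainty about the regression parameters carries into the
    imputed values.
    """
    obs = ~np.isnan(y_obs)
    idx = rng.choice(np.flatnonzero(obs), size=int(obs.sum()), replace=True)
    design = np.column_stack([np.ones(idx.size), x[idx]])
    beta, *_ = np.linalg.lstsq(design, y_obs[idx], rcond=None)
    resid = y_obs[idx] - design @ beta
    sigma = np.sqrt(resid @ resid / (idx.size - 2))
    filled = y_obs.copy()
    n_mis = int((~obs).sum())
    filled[~obs] = beta[0] + beta[1] * x[~obs] + rng.normal(scale=sigma, size=n_mis)
    return filled


# Analyze each imputed data set, then pool with Rubin's rules.
m = 20
estimates = np.empty(m)
variances = np.empty(m)
for j in range(m):
    filled = impute_once(x, y_obs, rng)
    estimates[j] = filled.mean()              # estimate from imputed data set j
    variances[j] = filled.var(ddof=1) / n     # its squared standard error

pooled_mean = estimates.mean()                # pooled point estimate
within = variances.mean()                     # within-imputation variance
between = estimates.var(ddof=1)               # between-imputation variance
total_var = within + (1 + 1 / m) * between    # Rubin's total variance
pooled_se = np.sqrt(total_var)

print(f"listwise mean: {listwise_mean:.3f}")
print(f"pooled MI mean: {pooled_mean:.3f} (SE {pooled_se:.3f})")
```

In this setup the complete cases over-represent low values of x, so the listwise mean is pulled away from the true value, whereas the imputation model uses x for every case. The number of imputations (20) and the simulation settings are arbitrary choices for the sketch.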

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2020

