Nonconstant Variance Response Models

Jamie D. Riggs; Trent L. Lalonde

doi:10.1017/9781316544778.005

Heterogeneity in Response Variance

One of the assumptions of the normal regression model is the homogeneity of variance. That is, it is assumed that each response represents an outcome from a (normal) distribution with the same, constant variation, regardless of the values of associated predictors. However, this property of homoscedasticity is often violated by responses that show fluctuations in variation across values of predictors as was seen in Chapter 3 using the four data sets. In such cases researchers will still be interested in evaluating all of the same questions that can be answered using normal regression models, but must turn to alternative modeling techniques.

For example, education researchers may collect records of standardized test scores for students in multiple classrooms with an interest in comparing performance across different student demographics. However, in analyzing the data it may be seen that students in classes with more experienced instructors show more consistency in their performances, and therefore lower variation in scores than for students in classes with less experienced instructors. As another example, health policy researchers might be interested in connections between family income level and health and medical expenditures. When data are collected, data analysts may see that the health spending varies greatly for families with significant financial resources, but health spending is consistent for families less financial resources. In both examples the variability of the outcome of interest changes depending on different subject characteristics.

Within any normal regression model, inferential methods such as t-tests and F-tests are constructed under the assumption of homoscedasticity. Using this assumption, “pooled” estimates of variation are made using the data, including the random unexplained variation captured by the mean-squared error (MSE). If such inferential methods are applied to heteroscedastic data, the pooled estimates such as MSE can either underestimate or overestimate the true variation in the response, and therefore hypothesis-testing procedures will be unreliable, often in unpredictable ways. Therefore it is important for data analysts to be able to detect and correct for violations of the constant-variance assumption.

Book contents

4 - Nonconstant Variance Response Models

Summary

Access options

Book contents

4 - Nonconstant Variance Response Models

Summary

Access options

Save book to Kindle

Save book to Dropbox

Save book to Google Drive