Multivariate density estimation

Daniel J. Henderson; Christopher F. Parmeter

doi:10.1017/CBO9780511845765.003

In traditional applied econometric settings, we typically have access to several variables. For example, in our growth example presented in Chapter 2, a typical analysis would have access not only to output per worker, but also to physical and human capital stocks, measures of corruption, natural resource levels, institutional quality, and perhaps many other variables. In this sense, univariate density exploration is limited. For example, suppose you view a univariate density estimate and find bimodality to be a plausible feature. Is this bimodality inherent to the variable of interest, or is there some connection with a secondary variable? Jones, Marron, and Sheather (1996) find exactly this pattern in their research. They have a visually bimodal univariate density (202 observations) of lean body mass. Subsequent analysis shows that the bimodal nature of this density is linked to the gender of the individual. Once the data are split into 100 men and 102 women, each subgroup's density is strongly unimodal. Thus, the lean body mass measurements were not inherently bimodal; rather, the sample combined two different subpopulations into what was believed to be a homogeneous population.

To characterize these types of issues properly, multivariate nonparametric methods are needed. The natural extension of the univariate kernel density estimator developed in Chapter 2 is the multivariate kernel density estimator. This estimator looks and operates similarly to the univariate estimator, and so the intuition built in Chapter 2 will prove useful here. However, there are some conceptual issues. How do we conceive of a kernel function in multiple dimensions? Should we have a bandwidth for each dimension or a single bandwidth that smooths all variables equally? What happens to the statistical properties of our estimator if we incorporate more variables into our density?
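One common way to answer the first two questions is a product kernel: a univariate kernel is applied to each dimension separately, each with its own bandwidth, and the results are multiplied together. The sketch below illustrates that idea under stated assumptions; it is not necessarily the estimator as the chapter defines it. The Gaussian kernel, the function names, the toy data, and the bandwidth values are all hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the univariate kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def product_kernel_density(point, data, bandwidths):
    """Estimate a joint density at `point` using a product kernel.

    point      : sequence of q coordinates at which the density is evaluated
    data       : list of n observations, each a sequence of q coordinates
    bandwidths : sequence of q positive bandwidths, one per dimension
    """
    n = len(data)
    # Normalizing constant: sample size times the product of the bandwidths.
    scale = n * math.prod(bandwidths)
    total = 0.0
    for obs in data:
        # Multiply the univariate kernel weights across dimensions.
        weight = 1.0
        for x, xi, h in zip(point, obs, bandwidths):
            weight *= gaussian_kernel((x - xi) / h)
        total += weight
    return total / scale


# Hypothetical toy data with two variables (not taken from the chapter).
data = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0), (0.3, -0.2)]

# One bandwidth per dimension versus a single common bandwidth.
per_dimension = product_kernel_density((0.0, 0.0), data, bandwidths=(0.5, 1.0))
common = product_kernel_density((0.0, 0.0), data, bandwidths=(0.7, 0.7))
print(per_dimension, common)
```

Passing the same value for every entry of `bandwidths` smooths all variables equally, while distinct values let each variable receive its own amount of smoothing. With a single variable, the function reduces to a univariate kernel density estimator with a Gaussian kernel.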

In this chapter, we outline both joint and conditional density estimation. We discuss asymptotic properties as well as bandwidth selection and the presence of irrelevant variables.

3 - Multivariate density estimation