Book contents
- Frontmatter
- Dedication
- Contents
- 1 Introduction
- 2 Univariate density estimation
- 3 Multivariate density estimation
- 4 Inference about the density
- 5 Regression
- 6 Testing in regression
- 7 Smoothing discrete variables
- 8 Regression with discrete covariates
- 9 Semiparametric methods
- 10 Instrumental variables
- 11 Panel data
- 12 Constrained estimation and inference
- Bibliography
- Index
7 - Smoothing discrete variables
Published online by Cambridge University Press: 05 February 2015
Summary
In this chapter, we discuss the intuition underlying the smoothing of discrete variables in the context of a probability density. In virtually all applied economic settings, many variables will be discrete (also termed categorical), which is to say that they take on a countable number of outcomes. For example, when we include a regional indicator in a growth regression, this variable takes on anywhere from four to eight distinct values, depending on how finely we wish to partition the globe. Alternatively, a variable indicating membership in the OECD is a classic dummy variable, taking only two values. In these instances, smoothing with traditional kernels, such as the s-class of kernels described in Chapter 2, is inappropriate. Here we outline kernels that are appropriate for smoothing discrete variables and discuss the interpretation of their smoothing parameters. In doing so, we must distinguish between two types of discrete variables: unordered (such as a region indicator) and ordered (say, year).
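As a rough illustration of what smoothing an unordered discrete variable can look like, the sketch below uses the Aitchison and Aitken kernel, one commonly used choice for unordered data. This summary does not say which kernels the chapter adopts, so the kernel, the function names, the symbol `lam` for the smoothing parameter, and the region codes are all assumptions made for illustration.

```python
import numpy as np

def aitchison_aitken(xi, x, lam, c):
    """Unordered discrete kernel (assumed form): weight 1 - lam when
    xi equals x, and lam / (c - 1) otherwise, for c categories."""
    xi = np.asarray(xi)
    return np.where(xi == x, 1.0 - lam, lam / (c - 1))

def smoothed_probabilities(sample, categories, lam):
    """Kernel estimate of P(X = x) for every x in `categories`."""
    c = len(categories)
    return np.array(
        [aitchison_aitken(sample, x, lam, c).mean() for x in categories]
    )

# Hypothetical region codes for six observations (illustrative data only).
regions = np.array([0, 0, 0, 1, 1, 2])
categories = [0, 1, 2, 3]  # category 3 is never observed in this sample

freq = smoothed_probabilities(regions, categories, lam=0.0)    # no smoothing
smooth = smoothed_probabilities(regions, categories, lam=0.3)  # some smoothing

print(freq)
print(smooth)
```

With `lam = 0` the estimate reduces to the sample frequencies, so the unobserved category receives zero probability. A positive `lam` borrows weight from the other categories while the estimates still sum to one. Under this kernel, `lam` is bounded above by `(c - 1) / c`, at which point every category receives equal probability.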
The elegance of including categorical variables in the empirical analysis is that, while the interpretation and handling of these variables require some care beyond what we covered for continuous variables, the mechanics of the estimators do not vary greatly, requiring nothing more than some additional notation and a generalization of the product kernel. The beauty of discrete variables is that, with respect to data requirements, adding them to the model does not lead to the severe consequences that arise when adding continuous variables. As with all nonparametric estimation, bandwidth selection is of primary importance, and the presence of categorical variables does nothing to change this.
We end the chapter by incorporating region and time into our investigation of cross-country output. We examine the distribution of cross-country output both over time (ordered discrete) and across regions (unordered discrete). This allows us to document how smoothing these variables aids our understanding of the global distribution of cross-country output without resorting to the common frequency approach, which simply splits the data by category.
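The generalized product kernel and its relation to the frequency approach can be sketched as follows. This is a minimal sketch under stated assumptions, not the chapter's implementation: it pairs a Gaussian kernel for one continuous variable with an Aitchison and Aitken kernel for one unordered discrete variable, and the names `mixed_density`, `h`, and `lam`, as well as the simulated data, are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the continuous kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def mixed_density(y, d, y_data, d_data, h, lam, c):
    """Joint density estimate f(y, d) from a product of a continuous
    kernel in y (bandwidth h) and an assumed Aitchison-Aitken kernel
    in the unordered discrete variable d (smoothing parameter lam)."""
    k_cont = gaussian_kernel((y_data - y) / h) / h
    k_disc = np.where(d_data == d, 1.0 - lam, lam / (c - 1))
    return np.mean(k_cont * k_disc)

# Simulated data for illustration only: an outcome and a region code.
rng = np.random.default_rng(0)
y_data = rng.normal(size=200)
d_data = rng.integers(0, 3, size=200)
h, c = 0.5, 3

# lam = 0 uses only observations in region 1 (the frequency approach).
f_split = mixed_density(0.0, 1, y_data, d_data, h, lam=0.0, c=c)

# lam > 0 lets observations from other regions contribute with low weight.
f_smooth = mixed_density(0.0, 1, y_data, d_data, h, lam=0.2, c=c)

print(f_split, f_smooth)
```

Setting `lam = 0` reproduces the estimate obtained by splitting the sample by category, so the frequency approach appears here as a special case of the smoothed estimator. Because the discrete kernel weights sum to one across categories, summing the joint estimate over `d` returns the ordinary kernel density estimate of `y` for any admissible `lam`.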
- Type: Chapter
- Information: Applied Nonparametric Econometrics, pp. 187-204
- Publisher: Cambridge University Press
- Print publication year: 2015