Book contents
- Frontmatter
- Contents
- Preface
- A guide to notation
- 1 Model selection: data examples and introduction
- 2 Akaike's information criterion
- 3 The Bayesian information criterion
- 4 A comparison of some selection methods
- 5 Bigger is not always better
- 6 The focussed information criterion
- 7 Frequentist and Bayesian model averaging
- 8 Lack-of-fit and goodness-of-fit tests
- 9 Model selection and averaging schemes in action
- 10 Further topics
- Overview of data examples
- References
- Author index
- Subject index
6 - The focussed information criterion
Published online by Cambridge University Press: 05 September 2012
Summary
The model selection methods presented earlier (such as AIC and the BIC) have one thing in common: they select a single ‘best model’, which is then used to explain all aspects of the mechanisms underlying the data and to predict all future data points. The tolerance discussion in Chapter 5 showed that one model may be best for estimating one type of estimand while another model is best for another estimand. The point of view expressed via the focussed information criterion (FIC) is that the ‘best model’ should depend on the parameter under focus, such as the mean, the variance, or the value of a regression function at particular covariate values. Thus the FIC allows and encourages different models to be selected for different parameters of interest.
Estimators and notation in submodels
In model selection applications there is a list of models to consider. We shall assume here that there is a ‘smallest’ and a ‘biggest’ model among these, and that the others lie between these two extremes. More concretely, there is first a narrow model, the simplest model that we might plausibly use for the data, having an unknown parameter vector θ of length p. Secondly, there is the wide model, the largest model that we consider, which has an additional q parameters γ = (γ1, …, γq). We assume that the narrow model is a special case of the wide model: there is a value γ0 such that setting γ = γ0 in the wide model yields precisely the narrow model.
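The structure described here can be sketched in code. The following is a minimal illustration (not from the book): each candidate submodel between the narrow and wide models is identified by the subset S of the q extra-parameter indices that it leaves free, with the empty set corresponding to the narrow model (all of γ fixed at γ0) and the full set to the wide model. The function name and setup are hypothetical.

```python
import itertools

def submodels(q):
    """Enumerate candidate submodels between the narrow and wide models.

    Each submodel is represented by the subset S of {1, ..., q} of extra
    parameters gamma_j that are estimated rather than fixed at gamma0_j.
    S = set() is the narrow model; S = {1, ..., q} is the wide model.
    """
    for r in range(q + 1):
        for S in itertools.combinations(range(1, q + 1), r):
            yield set(S)

# With q = 3 extra parameters there are 2**3 = 8 candidate submodels,
# ranging from the narrow model set() to the wide model {1, 2, 3}.
models = list(submodels(3))
```

In practice one often restricts attention to a shorter list of nested or otherwise structured submodels rather than all 2^q subsets, but this enumeration shows the full range the FIC can in principle compare.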
Model Selection and Model Averaging, pp. 145–191
Publisher: Cambridge University Press
Print publication year: 2008