Book contents
- Frontmatter
- Contents
- Preface
- Guide to the chapters
- Acknowledgment of support
- Part I Introduction to the four themes
- Part II Studies on the four themes
- 5 Parametric Inference
- 6 Polytope Propagation on Graphs
- 7 Parametric Sequence Alignment
- 8 Bounds for Optimal Sequence Alignment
- 9 Inference Functions
- 10 Geometry of Markov Chains
- 11 Equations Defining Hidden Markov Models
- 12 The EM Algorithm for Hidden Markov Models
- 13 Homology Mapping with Markov Random Fields
- 14 Mutagenetic Tree Models
- 15 Catalog of Small Trees
- 16 The Strand Symmetric Model
- 17 Extending Tree Models to Splits Networks
- 18 Small Trees and Generalized Neighbor-Joining
- 19 Tree Construction using Singular Value Decomposition
- 20 Applications of Interval Methods to Phylogenetics
- 21 Analysis of Point Mutations in Vertebrate Genomes
- 22 Ultra-Conserved Elements in Vertebrate and Fly Genomes
- References
- Index
9 - Inference Functions
from Part II - Studies on the four themes
Published online by Cambridge University Press: 04 August 2010
Summary
Some of the statistical models introduced in Chapter 1 have the feature that, aside from the observed data, there is hidden information that cannot be determined from an observation. In this chapter we consider graphical models with hidden variables, such as the hidden Markov model and the hidden tree model. A natural problem in such models is to determine, given a particular observation, what is the most likely hidden data (which is called the explanation) for that observation. This problem is called MAP inference (Remark 4.13). Any fixed values of the parameters determine a way to assign an explanation to each possible observation. A map obtained in this way is called an inference function.
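As a minimal sketch of this definition, the Python code below builds the inference function of a tiny hidden Markov model by brute force: for fixed parameters, it maps every observation of length n to a most likely hidden sequence. The parameter values, the function names, and the tie-breaking rule (the first maximizer in lexicographic order is kept) are illustrative assumptions and do not come from the chapter.

```python
from itertools import product

def joint_prob(hidden, observed, init, trans, emit):
    """Joint probability of a hidden path and an observed sequence in an HMM."""
    p = init[hidden[0]] * emit[hidden[0]][observed[0]]
    for t in range(1, len(observed)):
        p *= trans[hidden[t - 1]][hidden[t]] * emit[hidden[t]][observed[t]]
    return p

def inference_function(n, n_states, n_symbols, init, trans, emit):
    """Map each observation of length n to a most likely hidden path.

    Brute force over all hidden paths; ties are broken by keeping the
    first maximizer in lexicographic order (an assumption of this sketch).
    """
    result = {}
    for obs in product(range(n_symbols), repeat=n):
        result[obs] = max(
            product(range(n_states), repeat=n),
            key=lambda h: joint_prob(h, obs, init, trans, emit),
        )
    return result

# Hypothetical parameters: two hidden states, two output symbols.
init = [0.5, 0.5]                  # initial state distribution
trans = [[0.9, 0.1], [0.1, 0.9]]   # transition probabilities
emit = [[0.8, 0.2], [0.2, 0.8]]    # emission probabilities

phi = inference_function(3, 2, 2, init, trans, emit)
for obs, explanation in sorted(phi.items()):
    print(obs, "->", explanation)
```

Changing the parameters can change the dictionary `phi`, and each distinct dictionary obtained this way is a distinct inference function of the model. Brute force is used only to keep the sketch short; for longer sequences the Viterbi algorithm computes the same explanations efficiently.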
Examples of inference functions include gene-finding functions, which were discussed in [Pachter and Sturmfels, 2005, Section 5]. These inference functions of a hidden Markov model are used to identify gene structures in DNA sequences (see Section 4.4). An observation in such a model is a sequence over the alphabet Σ′ = {A, C, G, T}.
After a short introduction to inference functions, we present the main result of this chapter in Section 9.2. We call it the Few Inference Functions Theorem, and it states that in any graphical model the number of inference functions grows polynomially if the number of parameters is fixed. This theorem shows that most functions from the set of observations to possible values of the hidden data cannot be inference functions for any choice of the model parameters.
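To give a sense of scale for the last claim, the following sketch counts all functions from observations to explanations in a model with n observed and n hidden variables. The function name and the choice of binary alphabets are assumptions made for illustration. The count is (number of explanations) raised to the (number of observations), which grows doubly exponentially in n, so any polynomial number of inference functions is a vanishing fraction of it.

```python
def total_functions(n, n_states, n_symbols):
    """Number of all maps from observations to explanations.

    There are n_symbols**n observations and n_states**n explanations,
    so the number of maps is (n_states**n) ** (n_symbols**n).
    """
    return (n_states ** n) ** (n_symbols ** n)

# Binary hidden states and binary observations (illustrative choice).
for n in range(1, 7):
    count = total_functions(n, 2, 2)
    print(f"n={n}: a {len(str(count))}-digit number of candidate functions")
```

For n = 3 this is already 8^8 = 16,777,216 candidate functions.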
- Type: Chapter
- Algebraic Statistics for Computational Biology, pp. 215-225
- Publisher: Cambridge University Press
- Print publication year: 2005