Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Pairwise alignment
- 3 Markov chains and hidden Markov models
- 4 Pairwise alignment using HMMs
- 5 Profile HMMs for sequence families
- 6 Multiple sequence alignment methods
- 7 Building phylogenetic trees
- 8 Probabilistic approaches to phylogeny
- 9 Transformational grammars
- 10 RNA structure analysis
- 11 Background on probability
- Bibliography
- Author index
- Subject index
8 - Probabilistic approaches to phylogeny
Published online by Cambridge University Press: 05 September 2012
Summary
Introduction
Our goal in this chapter is to formulate probabilistic models for phylogeny and show how trees can be inferred from sets of sequences, either by maximum likelihood or by sampling methods. We also review the phylogenetic methods of the previous chapter, and show that they often have probabilistic interpretations, though they are not usually presented this way.
Overview of the probabilistic approach to phylogeny
The basic aim of probability-based phylogeny is to rank trees either according to their likelihood P(data∣tree) or, taking a more Bayesian view, according to their posterior probability P(tree∣data). There may be subsidiary aims, such as finding the likelihood or posterior probability of a particular taxonomic feature, for example the grouping of a set of organisms on a single branch. To achieve any of these aims, we must be able to define and compute P(x•∣T, t•), the probability of a set of data given a tree. Here the data are a set of n sequences x^j for j = 1…n, which we write compactly as x•. T is a tree with n leaves, with sequence j at leaf j, and the t• are the edge lengths of the tree. To define P(x•∣T, t•) we need a model of evolution, i.e. of the mutation and selection events that change sequences along the edges of a tree.
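To make the quantity P(x•∣T, t•) concrete, the following is a minimal sketch for the smallest possible case: a tree with a root and two leaves. The summary above does not specify a model of evolution, so the sketch assumes one for illustration: the Jukes-Cantor substitution model for DNA, a uniform distribution over the root residue, and independent evolution at each aligned site. The function names `jc_prob` and `log_likelihood_two_leaves` are hypothetical and do not come from the chapter.

```python
import math

ALPHABET = "ACGT"

def jc_prob(a, b, t):
    """P(b | a, t) under the Jukes-Cantor model (an assumed model of evolution).

    t is the edge length in expected substitutions per site.
    """
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def log_likelihood_two_leaves(x1, x2, t1, t2):
    """log P(x1, x2 | T, t1, t2) for a root with two leaves.

    Assumes gap-free aligned sequences and independent sites.
    """
    if len(x1) != len(x2):
        raise ValueError("sequences must be aligned to equal length")
    total = 0.0
    for a, b in zip(x1, x2):
        # Sum over the unobserved root residue r, with uniform prior 1/4.
        site = sum(0.25 * jc_prob(r, a, t1) * jc_prob(r, b, t2)
                   for r in ALPHABET)
        total += math.log(site)
    return total

if __name__ == "__main__":
    x1, x2 = "ACGTACGT", "ACGTACGA"
    # Rank candidate edge lengths by their log likelihood.
    for t in (0.01, 0.1, 1.0):
        print(t, log_likelihood_two_leaves(x1, x2, t, t))
```

Ranking trees by likelihood then amounts to evaluating such a function for each candidate tree and set of edge lengths and comparing the results. With more than two leaves, the sum over unobserved ancestral residues has to be organised recursively over the internal nodes rather than written out directly as here.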
- Type: Chapter
- Information: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, pp. 193-233. Publisher: Cambridge University Press. Print publication year: 1998