Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Pairwise alignment
- 3 Markov chains and hidden Markov models
- 4 Pairwise alignment using HMMs
- 5 Profile HMMs for sequence families
- 6 Multiple sequence alignment methods
- 7 Building phylogenetic trees
- 8 Probabilistic approaches to phylogeny
- 9 Transformational grammars
- 10 RNA structure analysis
- 11 Background on probability
- References
- Index
8 - Probabilistic approaches to phylogeny
Published online by Cambridge University Press: 06 January 2010
Summary
Establishing phylogenetic relationships between species is one of the central problems of biological science. Chapter 7 introduced the reader to non-probabilistic methods of building phylogenetic trees for DNA and protein sequences; Chapter 8 continues the subject from the standpoint of a consistent probabilistic methodology. The evolution of biological sequences has largely been viewed as a random process, and several probabilistic models of varying complexity have been proposed. The reconstruction of phylogenetic relationships can therefore be formulated in probabilistic terms as well.
Several introductory BSA problems in Chapter 8 are concerned with the properties of the simplest probabilistic models of evolution, such as the Jukes–Cantor and the Kimura models.
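As an illustration of the two models named above, the following minimal sketch evaluates their substitution probabilities over a time interval t using the standard closed forms. The function names, the default rate values, and the convention that `alpha` is the transition rate and `beta` the transversion rate in the Kimura model are assumptions made for this example, not definitions quoted from the chapter.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}  # C and T are the pyrimidines


def jukes_cantor(a, b, t, alpha=1.0):
    """P(b | a, t) under Jukes-Cantor: every substitution has rate alpha."""
    e = math.exp(-4.0 * alpha * t)
    if a == b:
        return 0.25 * (1.0 + 3.0 * e)
    return 0.25 * (1.0 - e)


def kimura(a, b, t, alpha=1.0, beta=0.5):
    """P(b | a, t) under the Kimura two-parameter model.

    Assumed convention: alpha is the transition rate (A<->G, C<->T),
    beta is the rate of each transversion.
    """
    e_beta = math.exp(-4.0 * beta * t)
    e_mix = math.exp(-2.0 * (alpha + beta) * t)
    s = 0.25 * (1.0 - e_beta)                  # each of the two transversions
    u = 0.25 * (1.0 + e_beta - 2.0 * e_mix)    # the single transition
    if a == b:
        return 1.0 - 2.0 * s - u               # no net change
    if (a in PURINES) == (b in PURINES):
        return u                               # same chemical class: transition
    return s                                   # different class: transversion
```

Two quick consistency checks follow from these formulas: each row of either matrix sums to 1, and setting `alpha == beta` in `kimura` reproduces `jukes_cantor`.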
Given a set of sequences (associated with the leaves of a tree) and a model of the substitution process in a DNA or protein sequence, it is important to know how to compute the likelihood of a tree with a given topology. The Felsenstein algorithm addresses this issue using a post-order traversal of the tree. Felsenstein also developed an EM-type algorithm for finding the optimal (maximum likelihood) lengths of the tree edges. However, as the number of leaves increases, the number of tree topologies grows too quickly for all of them to be examined in a reasonable time.
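The post-order likelihood computation can be sketched for a single alignment column as follows. This is a simplified illustration, not the chapter's own code: it assumes the Jukes-Cantor model with a uniform root distribution, and it uses a hypothetical tree encoding in which a leaf is a one-letter string and an internal node is a tuple `(left, t_left, right, t_right)` holding the two subtrees and their edge lengths.

```python
import math

BASES = "ACGT"


def jc_prob(a, b, t, alpha=1.0):
    """Jukes-Cantor probability P(b | a, t); alpha is an assumed rate."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if a == b else 0.25 * (1.0 - e)


def partial_likelihoods(node):
    """Return {a: P(leaves below node | node has residue a)}.

    Children are processed before their parent (post-order).
    """
    if isinstance(node, str):
        # Leaf: the observed residue has probability 1, all others 0.
        return {a: 1.0 if a == node else 0.0 for a in BASES}
    left, t_left, right, t_right = node
    below_left = partial_likelihoods(left)
    below_right = partial_likelihoods(right)
    result = {}
    for a in BASES:
        # Sum over the unknown residue at each child, then multiply,
        # because the two subtrees evolve independently given a.
        from_left = sum(jc_prob(a, b, t_left) * below_left[b] for b in BASES)
        from_right = sum(jc_prob(a, c, t_right) * below_right[c] for c in BASES)
        result[a] = from_left * from_right
    return result


def site_likelihood(tree):
    """Likelihood of one column, with uniform root frequencies of 1/4."""
    at_root = partial_likelihoods(tree)
    return sum(0.25 * at_root[a] for a in BASES)


# Example: three leaves A, A, G on a small rooted tree.
example_tree = (("A", 0.1, "A", 0.2), 0.05, "G", 0.3)
example_value = site_likelihood(example_tree)
```

Under the usual assumption that sites evolve independently, the likelihood of a whole alignment is the product of `site_likelihood` over its columns (in practice, a sum of logarithms to avoid underflow).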
Finding the optimal tree among all possible trees for a large number of sequences (leaves) is therefore one of the major challenges. The mainstream approach to this problem is sampling from the posterior distribution over the space of trees.
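To see why exhaustive search over topologies becomes infeasible, the short sketch below counts binary tree topologies with n labelled leaves using the standard double-factorial formulas: (2n-5)!! for unrooted trees and (2n-3)!! for rooted trees. The function names are chosen for this example.

```python
def odd_double_factorial(k):
    """Return k!! = k * (k - 2) * ... * 3 * 1 for odd k >= 1 (and 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_topologies(n_leaves):
    """Number of unrooted binary trees with n labelled leaves: (2n-5)!!."""
    if n_leaves < 3:
        raise ValueError("an unrooted binary tree needs at least 3 leaves")
    return odd_double_factorial(2 * n_leaves - 5)


def count_rooted_topologies(n_leaves):
    """Number of rooted binary trees with n labelled leaves: (2n-3)!!."""
    if n_leaves < 2:
        raise ValueError("a rooted binary tree needs at least 2 leaves")
    return odd_double_factorial(2 * n_leaves - 3)
```

For example, `count_unrooted_topologies(10)` returns 2027025 and `count_rooted_topologies(10)` returns 34459425, so even ten sequences already yield millions of candidate trees.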
The tree HMM concept described in BSA could be used to construct phylogenetic trees under the most general models of sequence evolution.
- Type: Chapter
- Information: Problems and Solutions in Biological Sequence Analysis, pp. 218-278. Publisher: Cambridge University Press. Print publication year: 2006