Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Pairwise alignment
- 3 Markov chains and hidden Markov models
- 4 Pairwise alignment using HMMs
- 5 Profile HMMs for sequence families
- 6 Multiple sequence alignment methods
- 7 Building phylogenetic trees
- 8 Probabilistic approaches to phylogeny
- 9 Transformational grammars
- 10 RNA structure analysis
- 11 Background on probability
- References
- Index
6 - Multiple sequence alignment methods
Published online by Cambridge University Press: 06 January 2010
Summary
The theory described in Chapter 5 of Biological Sequence Analysis (BSA; Durbin et al., 1998) suggests that constructing a multiple alignment of several biological sequences should be part of the profile HMM training algorithm. Such an iterative expectation-maximization method estimates the parameters of a profile HMM from unaligned sequences, building the multiple alignment in parallel with the parameter estimation. The resulting alignment can be obtained at the last step of the algorithm by optimally aligning each individual sequence to the newly built profile HMM. However, this impressive theoretical design meets many practical difficulties, discussed in great detail in BSA, and it has not yet been implemented in its pure form as an efficient tool for multiple sequence alignment.
One of the major difficulties on the road to a universal and efficient multiple sequence alignment algorithm is the lack of a gold standard that would help to distinguish a good alignment from a better one. Since both sequence and structure evolve, and ancestral sequences and structures can be reconstructed only by theoretical means, it is impossible to verify either alignments or phylogenies experimentally. Nevertheless, a formal assignment of an alignment score immediately leads to the notion of the best alignment for a given set of sequences, although the implications of an optimal alignment defined in this way must be interpreted with caution. There are several biologically motivated options for the score assignment. For instance, the sum-of-pairs score is computationally convenient and frequently used, but it has well-known theoretical drawbacks (Durbin et al. (1998), p. 141).
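To make the sum-of-pairs idea concrete, the following is a minimal sketch rather than a reference implementation. It scores each alignment column by summing a pairwise score over all pairs of rows, then sums over columns. The function names `pair_score` and `sum_of_pairs`, the gap symbol, and the scoring values (match +1, mismatch -1, residue-gap -2, gap-gap 0) are illustrative assumptions and are not taken from BSA; in practice a substitution matrix and a gap model would replace them.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one aligned pair of symbols (illustrative values, not from BSA)."""
    if a == GAP and b == GAP:
        return 0  # a gap aligned to a gap contributes nothing here
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scoring):
    """Sum-of-pairs score: for every column, add the scores of all row pairs."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of the alignment must have the same length")
    total = 0
    for column in zip(*alignment):
        # combinations(column, 2) enumerates each unordered pair of rows once
        total += sum(pair_score(a, b, **scoring) for a, b in combinations(column, 2))
    return total


if __name__ == "__main__":
    alignment = ["ACG-T", "ACGAT", "A-GAT"]
    print(sum_of_pairs(alignment))  # prints 3
```

Because every pair of rows is scored separately, the same evolutionary event can be counted several times in one column, which is one way to see why the score is convenient to compute but hard to justify probabilistically.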
- Type: Chapter
- Information: Problems and Solutions in Biological Sequence Analysis, pp. 162-182
- Publisher: Cambridge University Press
- Print publication year: 2006