Multiple sequence alignment methods

Mark Borodovsky; Svetlana Ekisheva

doi:10.1017/CBO9780511617829.007

6 - Multiple sequence alignment methods

Published online by Cambridge University Press: 06 January 2010

Mark Borodovsky, Georgia Institute of Technology
Svetlana Ekisheva, Georgia Institute of Technology

Summary

The theory described in Chapter 5 of BSA (Biological Sequence Analysis, Durbin et al., 1998) suggests that constructing a multiple alignment of several biological sequences should be part of the profile HMM training algorithm. Such an iterative expectation-maximization method is meant to estimate the parameters of the profile HMM from unaligned sequences, building the multiple alignment in parallel with the parameter estimation. The resulting alignment can be obtained at the last step of the algorithm by optimally aligning each individual sequence to the newly built profile HMM. However, because this impressive theoretical design runs into many practical difficulties, discussed in great detail in BSA, it has not yet been implemented in its pure form as an efficient tool for multiple sequence alignment.

One of the major difficulties on the road to a universal and efficient multiple sequence alignment algorithm is the lack of a gold standard that would help to distinguish a good alignment from a better one. Since both sequences and structures evolve, and the ancestral sequences and structures can be reconstructed only by theoretical means, it is impossible to verify either alignments or phylogenies experimentally. Formally assigning a score to an alignment does immediately lead to the notion of the best alignment for a given set of sequences; however, the implications of an optimal alignment defined in this way have to be treated with caution. There are several biologically motivated options for the score assignment. For instance, the sum-of-pairs score is computationally convenient and frequently used, but it has well-known theoretical drawbacks (Durbin et al. (1998), p. 141).
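
To make the sum-of-pairs score mentioned above concrete, here is a minimal Python sketch. It is an illustration only, not the chapter's own code: the function names and the toy scoring values (match +1, mismatch -1, gap -2, gap against gap 0) are assumptions chosen for readability, whereas a real analysis would normally use a substitution matrix and a considered gap model.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Toy score for one pair of aligned symbols (all values are assumptions)."""
    if a == "-" and b == "-":
        return 0  # a gap aligned to a gap contributes nothing in this sketch
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scoring):
    """Sum-of-pairs score of an alignment given as equal-length gapped strings.

    Each column is scored by summing pair_score over every pair of rows,
    and the column scores are then added up.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of the alignment must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b, **scoring)
    return total


if __name__ == "__main__":
    example = ["ACG-T", "ACGAT", "A-GAT"]
    print(sum_of_pairs(example))
```

Because every pair of rows is scored independently within a column, the cost of one column grows quadratically with the number of sequences, and each sequence is in effect compared with all the others rather than with a common ancestor.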

Type: Chapter
Information: Problems and Solutions in Biological Sequence Analysis, pp. 162-182

DOI: https://doi.org/10.1017/CBO9780511617829.007

Publisher: Cambridge University Press

Print publication year: 2006
