Markov chains and hidden Markov models

Richard Durbin; Sean R. Eddy; Anders Krogh; Graeme Mitchison

doi:10.1017/CBO9780511790492.004

3 - Markov chains and hidden Markov models

Published online by Cambridge University Press: 05 September 2012

Affiliations:
Richard Durbin: Sanger Centre, Cambridge
Sean R. Eddy: Washington University, Missouri
Anders Krogh: Technical University of Denmark, Lyngby

Summary

Having introduced some methods for pairwise alignment in Chapter 2, we switch the emphasis in this chapter to questions about a single sequence. The main aim of the chapter is to develop the theory for a very general form of probabilistic model for sequences of symbols, called a hidden Markov model (abbreviated HMM). The types of question that we can use HMMs, and their simpler cousins Markov models, to address are: ‘Does this sequence belong to a particular family?’ or ‘Assuming the sequence does come from some family, what can we say about its internal structure?’ An example of the second type of problem would be to try to identify alpha helix or beta sheet regions in a protein sequence.

As well as giving examples from the biological sequence world, we also give the mathematics and algorithms for many of the operations on HMMs in a more general form. These methods, or close analogues of them, are applied in many other sections of the book. This chapter therefore contains a fairly large amount of mathematically technical material. We have tried to organise it so that the first half, approximately, leads the reader through the essential algorithms using a single biological example. In the later sections we introduce a variety of other examples to illustrate more complex extensions of the basic approaches.

In the next chapter, we will see how HMMs can also be applied to the types of alignment problem discussed in Chapter 2; in Chapter 5 they are applied to searching databases for protein families, and in Chapter 6 to the alignment of several sequences simultaneously.

Type: Chapter
Information: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, pp. 47-80

DOI: https://doi.org/10.1017/CBO9780511790492.004

Publisher: Cambridge University Press

Print publication year: 1998
