Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Pairwise alignment
- 3 Markov chains and hidden Markov models
- 4 Pairwise alignment using HMMs
- 5 Profile HMMs for sequence families
- 6 Multiple sequence alignment methods
- 7 Building phylogenetic trees
- 8 Probabilistic approaches to phylogeny
- 9 Transformational grammars
- 10 RNA structure analysis
- 11 Background on probability
- References
- Index
9 - Transformational grammars
Published online by Cambridge University Press: 06 January 2010
Summary
A one-dimensional string is too simple a model to fully reflect the properties of a real biological molecule, which are, after all, determined by its three-dimensional structure as selected in the course of evolution. Physical interactions of amino acids and nucleotides in three-dimensional folds have to be described by models that go beyond the short-range correlations typically targeted by Markov chain models. Long-range correlations are more important for proteins than for DNA, which has a rather uniform double-helix structure. However, the structure of another nucleic acid, RNA, commonly includes a significant number of long-range interactions of a special type, which can be targeted by yet another class of probabilistic models.
Chapter 9 introduces the Chomsky hierarchy of deterministic transformational grammars, models originally developed for natural languages and later applied to computer languages. These grammars can readily be used to describe proteins (a regular grammar can generate the amino acid sequences described by PROSITE patterns) and RNA (a context-free grammar can generate RNA sequences with a given secondary structure).
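To make the context-free case concrete, here is a minimal sketch of a toy grammar for an RNA stem-loop. The grammar itself (S -> a S u | u S a | g S c | c S g | gaaa), the fixed loop sequence, and the function names are assumptions chosen for illustration; they are not taken from the chapter. Each pair production emits two complementary bases at opposite ends of the string, which is the kind of nested long-range dependency a regular grammar cannot express.

```python
# Toy context-free grammar (hypothetical, for illustration only):
#   S -> a S u | u S a | g S c | c S g | gaaa
# Each pair production adds one Watson-Crick base pair around the
# string derived so far; the terminal production emits the loop.

COMPLEMENT = {"a": "u", "u": "a", "g": "c", "c": "g"}
LOOP = "gaaa"  # assumed fixed loop sequence


def generate(stem: str) -> str:
    """Derive a stem-loop by applying one pair production per stem base."""
    closing_strand = "".join(COMPLEMENT[base] for base in reversed(stem))
    return stem + LOOP + closing_strand


def derives(seq: str) -> bool:
    """Return True if the toy grammar can generate seq."""
    if seq == LOOP:
        return True  # S -> gaaa
    if len(seq) >= 2 and COMPLEMENT.get(seq[0]) == seq[-1]:
        return derives(seq[1:-1])  # undo one pair production
    return False
```

For example, `generate("gc")` produces `gcgaaagc`, which `derives` accepts, while changing the last base to `g` breaks the outermost pair and the string is rejected.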
Generalizing these deterministic grammar classes to stochastic ones widens the opportunities for sequence modeling. Stochastic regular grammars can be shown to be equivalent to hidden Markov models. Stochastic context-free grammars (SCFGs) are useful for modeling RNA sequences.
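As a rough illustration of what "stochastic" adds, the sketch below attaches probabilities to the productions of a toy stem-loop grammar and scores a sequence. The grammar (S -> x S y for the four Watson-Crick pairs, or S -> gaaa), the probability values, and the function name are all assumptions made for this example. Because this toy grammar is unambiguous, each sequence has at most one derivation, so its probability is simply the product of the rule probabilities used; a general SCFG would need to sum over all derivations.

```python
import math

# Hypothetical production probabilities for the single nonterminal S.
# They must sum to 1 over all productions of S.
PAIR_PROB = {
    ("a", "u"): 0.15,
    ("u", "a"): 0.15,
    ("g", "c"): 0.25,
    ("c", "g"): 0.25,
}
LOOP = "gaaa"      # assumed fixed loop sequence
LOOP_PROB = 0.20   # probability of the terminating production S -> gaaa

assert math.isclose(sum(PAIR_PROB.values()) + LOOP_PROB, 1.0)


def sequence_probability(seq: str) -> float:
    """Probability that the toy SCFG generates seq (0.0 if it cannot)."""
    prob = 1.0
    while seq != LOOP:
        if len(seq) < 2:
            return 0.0
        rule_prob = PAIR_PROB.get((seq[0], seq[-1]))
        if rule_prob is None:
            return 0.0  # outermost bases are not a permitted pair
        prob *= rule_prob
        seq = seq[1:-1]  # strip the pair and continue inward
    return prob * LOOP_PROB
```

With these assumed values, a stem of two G-C type pairs around the loop scores 0.25 × 0.25 × 0.20, and any sequence the grammar cannot derive scores zero.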
- Type: Chapter
- Information: Problems and Solutions in Biological Sequence Analysis, pp. 279-290
- Publisher: Cambridge University Press
- Print publication year: 2006