Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- Design and conventions of this book
- 1 Introduction: working with the molecules of life in the computer
- 2 Gene technology: cutting DNA
- 3 Gene technology: knocking genes down
- 4 Gene technology: amplifying DNA
- 5 Human disease: when DNA sequences are toxic
- 6 Human disease: iron imbalance and the iron responsive element
- 7 Human disease: cancer as a result of aberrant proteins
- 8 Evolution: what makes us human?
- 9 Evolution: resolving a criminal case
- 10 Evolution: the sad case of the Tasmanian tiger
- 11 A function to every gene: termites, metagenomics and learning about the function of a sequence
- 12 A function to every gene: royal blood and order in the sequence universe
- 13 A function to every gene: a slimy molecule
- 14 Information resources: learning about flu viruses
- 15 Finding genes: going ashore at CpG islands
- 16 Finding genes: in the world of snurps
- 17 Finding genes: hunting for the distant RNA relatives
- 18 Personal genomes: the differences between you and me
- 19 Personal genomes: what’s in my genome?
- 20 Personal genomes: details of family genetics
- Appendix I Brief Unix reference
- Appendix II A selection of biological sequence analysis software
- Appendix III A short Perl reference
- Appendix IV A brief introduction to R
- Index
- References
11 - A function to every gene: termites, metagenomics and learning about the function of a sequence
Published online by Cambridge University Press: 05 August 2012
Summary
Great fleas have little fleas upon their backs to bite ’em, And little fleas have lesser fleas, and so ad infinitum.
(Augustus De Morgan, 1806–1871)
For this chapter, as well as Chapters 12 and 13, we turn to the important genomics and bioinformatics problem of identifying biological function based on nucleotide and amino acid sequences.
Assigning function based on sequence similarity
A common problem in molecular biology is that you are faced with a gene or a gene product and have no clue from experimental studies as to its function. In this context a critical contribution of bioinformatics is to assign a function to the sequence of a gene or gene product. As one example, a genome sequencing project may give rise to tens of thousands of predicted protein sequences. In such a case we want to assign a biological function to as many of these as possible using computational tools, and in this manner avoid many laborious wet-lab experiments. In addition to genome sequencing projects, there are other, more specialized situations where we want to find the functions of genes. For instance, we might have identified genes as being related to a specific genetic trait or disease, or a set of genes as being expressed under certain conditions.
A number of computational tools are available to predict a biological function associated with a protein sequence. In this chapter we will see an example in which we assign a function to a protein based on sequence similarity. Consider the human gene encoding the protein BRCA1, originally sequenced in 1994 (Miki et al., 1994). The BRCA1 protein was found to be related in sequence to the yeast protein RAD9, which is involved in cell cycle control. This observation gave scientists a hint about possible roles of the BRCA1 gene. It is an example of inferring a function from a homology relationship to a protein that has already been functionally characterized. We will see another example of this situation in this chapter, where we will use BLAST to identify a homology relationship. We already encountered BLAST in the context of the BCR–ABL fusion protein in Chapter 7.
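The idea of transferring a function from a characterized protein to a similar uncharacterized one can be illustrated with a minimal Python sketch. It is not BLAST and not the method used in the chapter: it scores each pair with a plain Smith-Waterman local alignment using simple match, mismatch and gap values, where a real search would use a substitution matrix and statistical significance estimates. The sequences, names, annotations and the `min_score` threshold are all invented for illustration and do not correspond to BRCA1, RAD9 or any real protein.

```python
from typing import Dict, Optional, Tuple


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Return the best Smith-Waterman local alignment score of a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def transfer_annotation(
    query: str,
    annotated: Dict[str, Tuple[str, str]],
    min_score: int,
) -> Optional[Tuple[str, str, int]]:
    """Suggest a function for query from its best-scoring annotated sequence.

    annotated maps a name to (sequence, function). Returns
    (name, function, score), or None if no hit reaches min_score.
    """
    best_name, best_score = None, 0
    for name, (seq, _function) in annotated.items():
        score = local_alignment_score(query, seq)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < min_score:
        return None
    return best_name, annotated[best_name][1], best_score


# Hypothetical toy data: invented sequences and annotations.
TOY_ANNOTATED = {
    "toy_checkpoint_protein": ("MKTLLVAGGSWEEPLKR", "cell cycle control"),
    "toy_transporter": ("MDDQPFHHNYCCTRAAV", "membrane transport"),
}
TOY_QUERY = "AAKTLLVAGGTWEEPQQ"

hit = transfer_annotation(TOY_QUERY, TOY_ANNOTATED, min_score=10)
if hit is None:
    print("No sufficiently similar annotated sequence found")
else:
    name, function, score = hit
    print(f"Best hit: {name} (score {score}); suggested function: {function}")
```

The threshold matters as much as the score itself: a function inferred this way is only a hypothesis, and a weak hit should be left unannotated rather than assigned the function of its nearest neighbour.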
- Type: Chapter
- Information: Genomics and Bioinformatics: An Introduction to Programming Tools for Life Scientists, pp. 137–149
- Publisher: Cambridge University Press
- Print publication year: 2012