CORA—Topological fingerprints for protein structural families

CHRISTINE ANNE ORENGO

doi:10.1110/ps.8.4.699

Abstract

CORA is a suite of programs for multiply aligning and analyzing protein structural families to identify the consensus positions and capture their most conserved structural characteristics (e.g., residue accessibility, torsional angles, and global geometry as described by inter-residue vectors/contacts). Knowledge of these structurally conserved positions, which are mostly in the core of the fold, and of their properties significantly improves the identification and classification of newly determined relatives. Information is encoded in a consensus three-dimensional (3D) template, and relatives are found by a sensitive alignment method that employs a new scoring scheme based on conserved residue contacts. By encapsulating these critical “core” features, templates perform more reliably in recognizing distant structural relatives than searches with representative structures.

Parameters for 3D-template generation and alignment were optimized for each structural class (mainly-α, mainly-β, α-β), using representative superfold families. For all families selected, the templates gave significant improvements in sensitivity and selectivity in recognizing distant structural relatives. Furthermore, because templates contain fewer than 70% of fold positions and compare fewer positions when aligning structures, scans are at least an order of magnitude faster than scans using selected structures. CORA was subsequently tested on eight other broad structural families from the CATH database.

Diagnostic plots are generated automatically and provide qualitative assistance for classifying newly determined relatives. They are demonstrated here by application to the large globin-like fold family. CORA templates for both homologous superfamilies and fold families will be stored in CATH and used to improve the classification and analysis of newly determined structures.
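The idea of a consensus template scored by conserved residue contacts can be illustrated with a toy sketch. This is not CORA's algorithm: the function names, the 8 Å contact cutoff, the 75% conservation threshold, and the coordinates below are all assumptions chosen for illustration. The sketch keeps the alignment columns occupied in most family members, records the contacts between those columns that recur across the family, and scores a candidate relative by the fraction of those contacts it reproduces.

```python
import math
from itertools import combinations

def consensus_columns(alignment, min_fraction=0.75):
    """Return alignment columns occupied in at least min_fraction of the rows.

    Each row maps columns to residue indices in that structure (None = gap).
    """
    n_rows = len(alignment)
    return [
        col for col in range(len(alignment[0]))
        if sum(row[col] is not None for row in alignment) / n_rows >= min_fraction
    ]

def in_contact(coords, i, j, cutoff):
    """True when residues i and j lie within cutoff of each other."""
    return math.dist(coords[i], coords[j]) <= cutoff

def conserved_contacts(alignment, structures, columns, cutoff=8.0, min_fraction=0.75):
    """Return column pairs that are in contact in at least min_fraction of structures."""
    counts = {}
    for row, coords in zip(alignment, structures):
        for a, b in combinations(columns, 2):
            i, j = row[a], row[b]
            if i is not None and j is not None and in_contact(coords, i, j, cutoff):
                counts[(a, b)] = counts.get((a, b), 0) + 1
    n_rows = len(alignment)
    return {pair for pair, k in counts.items() if k / n_rows >= min_fraction}

def contact_score(query_row, query_coords, template, cutoff=8.0):
    """Fraction of template contacts that the aligned query reproduces."""
    if not template:
        return 0.0
    hits = 0
    for a, b in template:
        i, j = query_row[a], query_row[b]
        if i is not None and j is not None and in_contact(query_coords, i, j, cutoff):
            hits += 1
    return hits / len(template)

# Hypothetical family of three structures (one representative atom per residue).
structures = [
    [(0, 0, 0), (3, 0, 0), (6, 0, 0), (30, 0, 0)],
    [(0, 0, 0), (3, 1, 0), (6, 0, 0)],
    [(0, 0, 0), (4, 0, 0), (7, 0, 0), (40, 0, 0)],
]
# Hypothetical multiple alignment: the last column is missing in the second structure.
alignment = [
    [0, 1, 2, 3],
    [0, 1, 2, None],
    [0, 1, 2, 3],
]

columns = consensus_columns(alignment)
template = conserved_contacts(alignment, structures, columns)

# A hypothetical candidate relative, already aligned to the template columns.
query_coords = [(0, 0, 0), (3, 0, 0), (20, 0, 0)]
query_row = [0, 1, 2, None]
score = contact_score(query_row, query_coords, template)
print(columns, sorted(template), round(score, 3))
```

Because the template retains only the consensus columns, a scan against it compares fewer positions than a scan against a full representative structure, which is the source of the speed-up the abstract describes. The real method also uses properties such as accessibility and torsional angles, which this sketch omits.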

