The relationship between protein sequences and their gene ontology functions

Publisher
BioMed Central
Publication Date
Dec 12, 2006
Source
PMC
Keywords
Disciplines
  • Biology
  • Computer Science
License
Unknown

Abstract

BMC Bioinformatics, Open Access, Research

Zhong-Hui Duan*1, Brent Hughes1, Lothar Reichel2, Dianne M Perez3 and Ting Shi3

Address: 1Department of Computer Science, University of Akron, Akron, OH, 44325, USA; 2Department of Mathematical Sciences, Kent State University, Kent, OH, 44242, USA; 3Department of Molecular Cardiology, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, 44195, USA

* Corresponding author

Background: One main research challenge in the post-genomic era is to understand the relationship between protein sequences and their biological functions. In recent years, several automated annotation systems have been developed for the functional assignment of uncharacterized proteins. The underlying assumption of these systems is that similar sequences imply similar biological functions. However, it has been noted that matching sequences do not always imply similar functions.

Results: In this paper, we present the correlation between protein sequences and protein functions for the yeast proteome in the context of gene ontology. A novel measure is introduced to define the overall similarity between two protein sequences. The effects of the level as well as the size of a gene ontology group on the degree of similarity were studied. The similarity distributions at different levels of gene ontology trees are presented. To evaluate the theoretical prediction power of similar sequences, we computed the posterior probability of correct predictions.

Conclusion: The results indicate that protein pairs of similar biological functions tend to have higher sequence similarity, although the similarity distribution in each functional group is heterogeneous and
