Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

Publisher
BioMed Central
Publication Date
Apr 16, 2007
Source
PMC
Disciplines
  • Biology
  • Computer Science
License
Unknown

Abstract

BioMed Central. BMC Structural Biology. Open Access. Methodology article.

Ke Chen (1), Lukasz A Kurgan* (1) and Jishou Ruan (2)

Address: (1) Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada; (2) Chern Institute of Mathematics, College of Mathematical Science and LPMC, Nankai University, Tianjin 300071, PRC

* Corresponding author

Background: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role in protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and into 3D structure prediction.

Results: The flexible/rigid regions were defined based on a dataset that includes protein sequences with multiple experimental structures and that was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented using feature sets proposed in prior research, such as PSI-BLAST profiles, composition vectors and binary sequence encoding, and using a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine l
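As an illustration of the representation named in the abstract, the following is a minimal sketch of computing frequencies of k-spaced amino acid pairs. It rests on assumptions that this excerpt does not confirm: a k-spaced pair is taken to be two residues with exactly k other positions between them, frequencies are normalized by the number of such pairs in the sequence, and only the 20 standard residues are reported. The function name `k_spaced_pair_frequencies` and the constant `AMINO_ACIDS` are hypothetical, and the paper's actual definition and normalization may differ.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_frequencies(sequence, k):
    """Return a dict mapping each of the 400 residue pairs to its frequency.

    A pair (a, b) is counted when a occurs at position i and b occurs at
    position i + k + 1, i.e. with exactly k residues in between (assumption).
    """
    n_pairs = len(sequence) - k - 1  # number of k-spaced positions available
    counts = Counter()
    if n_pairs > 0:
        counts.update(sequence[i] + sequence[i + k + 1] for i in range(n_pairs))
    features = {}
    for a, b in product(AMINO_ACIDS, repeat=2):
        pair = a + b
        # Pairs with non-standard letters still count toward n_pairs
        # but are not reported as features.
        features[pair] = counts[pair] / n_pairs if n_pairs > 0 else 0.0
    return features


if __name__ == "__main__":
    # Toy sequence, chosen only for illustration.
    feats = k_spaced_pair_frequencies("ACAC", k=1)
    print({pair: value for pair, value in feats.items() if value > 0})
```

For the toy sequence "ACAC" with k = 1, the only k-spaced pairs are A_A and C_C, so AA and CC each receive a frequency of 0.5. Concatenating the 400-element vectors for several values of k gives a high-dimensional representation, which is consistent with the abstract's remark that feature selection was used to reduce dimensionality.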
