Protein classification using sequential pattern mining.

Authors
  • Exarchos, Themis P
  • Papaloukas, Costas
  • Lampros, Christos
  • Fotiadis, Dimitrios I
Type
Published Article
Journal
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Publication Date
Jan 01, 2006
Volume
1
Pages
5814–5817
Identifiers
PMID: 17945916
Source
Medline
License
Unknown

Abstract

Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work, sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed to analyze protein primary structure. A classifier then uses the extracted sequential patterns to assign proteins of unknown structure to the appropriate fold category. The proposed methodology achieved an overall accuracy of 36% in a multi-class problem with 17 candidate categories. Classification performance reaches up to 65% when the three most probable protein folds are considered.
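The pipeline described in the abstract (mine sequential patterns per fold, rank candidate folds for a new sequence, then score the top-ranked or top-three folds) can be illustrated with a minimal Python sketch. This is not cSPADE and not the authors' classifier: it substitutes a brute-force enumeration of fixed-length patterns under a maximum-gap constraint and a simple pattern-overlap score. The function names, parameters, fold labels, and toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def patterns_in(seq, length=2, max_gap=1):
    """Return the set of ordered residue patterns of the given length in seq,
    allowing at most max_gap skipped residues between consecutive items."""
    found = set()

    def extend(prefix, last):
        if len(prefix) == length:
            found.add(prefix)
            return
        # The next residue may sit 1 .. max_gap + 1 positions after the last.
        for nxt in range(last + 1, min(last + 2 + max_gap, len(seq))):
            extend(prefix + seq[nxt], nxt)

    for start in range(len(seq)):
        extend(seq[start], start)
    return found


def mine_frequent(sequences, length=2, max_gap=1, min_support=0.5):
    """Keep patterns that occur in at least min_support of the sequences."""
    counts = Counter()
    for s in sequences:
        counts.update(patterns_in(s, length, max_gap))
    n = len(sequences)
    return {p for p, c in counts.items() if c / n >= min_support}


def rank_folds(seq, fold_patterns, length=2, max_gap=1):
    """Rank folds by the fraction of each fold's patterns found in seq
    (illustrative scoring only; ties are broken by fold name)."""
    present = patterns_in(seq, length, max_gap)
    scores = {
        fold: (len(present & pats) / len(pats) if pats else 0.0)
        for fold, pats in fold_patterns.items()
    }
    return sorted(scores, key=lambda f: (-scores[f], f))


def top_k_accuracy(rankings, labels, k):
    """Fraction of cases whose true fold is among the k top-ranked folds."""
    hits = sum(1 for r, y in zip(rankings, labels) if y in r[:k])
    return hits / len(labels)


# Toy training data: invented sequences, not real proteins or real folds.
training = {
    "foldA": ["ACDAC", "ACGAC"],
    "foldB": ["KLMKL", "KLKKL"],
}
fold_patterns = {
    fold: mine_frequent(seqs, min_support=1.0) for fold, seqs in training.items()
}
ranking = rank_folds("ACAC", fold_patterns)
```

With k=1, `top_k_accuracy` corresponds to the overall accuracy figure; with k=3 it corresponds to the "three most probable folds" figure. The gap limit loosely mirrors the kind of constraint that constrained pattern miners such as cSPADE support, but the enumeration here scales poorly and is suitable only for short toy inputs.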
