An approach of encoding for prediction of splice sites using SVM

Authors
Journal
Biochimie
ISSN: 0300-9084
Publisher
Elsevier
Publication Date
Volume
88
Issue
7
Identifiers
DOI: 10.1016/j.biochi.2006.03.006
Keywords
  • Splice Sites
  • Coding Sequence
  • Support Vector Machines
Disciplines
  • Computer Science

Abstract

In splice site prediction, accuracy remains below 90% even though the sequences adjacent to splice sites are highly conserved. Efforts to improve prediction accuracy have concentrated on the performance of the algorithms used, and few have addressed a more fundamental issue, namely nucleotide encoding. In this paper, a predictor based on support vector machines (SVM) is constructed to distinguish true from false splice sites in higher eukaryotes. Four types of encoding were applied to generate the input for the SVM: mono-nucleotide (MN) encoding, MN with frequency difference between the true sites and false sites (FDTF) encoding, pair-wise nucleotide (PN) encoding, and PN with FDTF encoding. The results showed that PN with FDTF encoding as input to the SVM led to the most reliable recognition of splice sites: the prediction accuracy was 96.3% and 93.7% for true donor sites and false sites, respectively, and 94.0% and 93.2% for true acceptor sites and false sites, respectively.
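The abstract names the four encodings but does not define them, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that MN encoding is a sparse one-hot vector over A, C, G, T at each position, that PN encoding is the same over the 16 overlapping adjacent pairs, and that the FDTF variants replace each 1 with the per-position frequency of that symbol in the true sites minus its frequency in the false sites. All function names and the toy sequences are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# The 16 possible adjacent pairs (AA, AC, ..., TT).
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def tokens(seq, k):
    """Split a sequence into overlapping k-mers (k=1 for MN, k=2 for PN)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def position_frequencies(seqs, k, alphabet):
    """Per-position relative frequency of each symbol.

    Assumes all sequences are aligned, equally long, and contain only A/C/G/T.
    """
    n_pos = len(seqs[0]) - k + 1
    counts = [{s: 0 for s in alphabet} for _ in range(n_pos)]
    for seq in seqs:
        for pos, tok in enumerate(tokens(seq, k)):
            counts[pos][tok] += 1
    return [{s: c[s] / len(seqs) for s in alphabet} for c in counts]


def fdtf_table(true_seqs, false_seqs, k, alphabet):
    """Assumed FDTF weight: frequency in true sites minus frequency in false sites."""
    freq_true = position_frequencies(true_seqs, k, alphabet)
    freq_false = position_frequencies(false_seqs, k, alphabet)
    return [{s: t[s] - f[s] for s in alphabet}
            for t, f in zip(freq_true, freq_false)]


def encode(seq, k, alphabet, fdtf=None):
    """Sparse encoding: one slot per (position, symbol).

    Without fdtf the observed symbol gets 1.0; with fdtf it gets the
    frequency-difference weight for that position and symbol instead.
    """
    vec = []
    for pos, tok in enumerate(tokens(seq, k)):
        for s in alphabet:
            if s != tok:
                vec.append(0.0)
            elif fdtf is None:
                vec.append(1.0)
            else:
                vec.append(fdtf[pos][s])
    return vec


# Toy, made-up training windows (not data from the paper).
true_sites = ["AGGT", "AGGT", "CGGT"]
false_sites = ["AGTC", "TTGT", "CAGT"]

mn = encode("AGGT", 1, NUCLEOTIDES)
pn = encode("AGGT", 2, PAIRS)
pn_fdtf = encode("AGGT", 2, PAIRS,
                 fdtf_table(true_sites, false_sites, 2, PAIRS))
print(len(mn), len(pn), len(pn_fdtf))
```

Under these assumptions, each vector could be passed as a feature row to any SVM implementation. To avoid leaking label information, the FDTF table would be estimated on the training sequences only.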