Scores of generalized base properties for quantitative sequence-activity modelings for E. coli promoters based on support vector machine

Authors
Journal
Journal of Molecular Graphics and Modelling
1093-3263
Publisher
Elsevier
Publication Date
Volume
26
Issue
1
Identifiers
DOI: 10.1016/j.jmgm.2006.12.004
Keywords
  • Scores of Generalized Base Properties (SGBP)
  • Quantitative Sequence-Activity Modeling (QSAM)
  • Support Vector Machine (SVM)
  • DNA
  • Promoter
Disciplines
  • Design
  • Mathematics

Abstract

A novel base sequence representation technique, SGBP (scores of generalized base properties), was derived from principal component analysis of a matrix of 1209 property parameters, covering 0D, 1D, 2D and 3D information, for the five bases A, C, G, T and U. It was then used to represent the sequence structures of E. coli promoters. The variables used as inputs to partial least squares (PLS) and support vector machine (SVM) models were selected by genetic algorithm-partial least squares. According to a D-optimal design, all samples were divided into a training set, used to develop quantitative sequence-activity models (QSAMs), and a test set, used to validate the predictive power of the resulting models. The PLS-based QSAM showed that the base properties at positions −42, −34, −31, −33, −41, −46 and −29 may have a greater influence on promoter strength, which points toward the design of strong promoters. The SVM parameters were determined by response surface methodology. The results indicated that the SVM-based QSAM had better fitting ability for the internal samples and better predictive ability for the external samples than the PLS-based QSAM. These results show that SGBP is a useful structural representation for QSAMs, offering rich structural information, easy manipulation, and high characterization competence. Moreover, the SGBP-GA-SVM route can be further applied to sequence design and activity prediction for DNA or RNA.
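
The encoding step described in the abstract can be illustrated with a minimal sketch: principal component analysis of a bases-by-properties matrix yields a few scores per base, and a sequence is then represented by concatenating the scores of its bases. Everything specific here is an assumption made for illustration, not the authors' procedure: the property matrix is random placeholder data (the real 1209 parameters are not given in this record), the columns are autoscaled before the decomposition, three components are retained, and the function names `base_scores` and `encode_sequence` are hypothetical.

```python
import numpy as np

BASES = "ACGTU"  # the five bases named in the abstract


def base_scores(properties: np.ndarray, n_components: int) -> np.ndarray:
    """Return PCA scores (one row per base) for a bases x properties matrix."""
    # Assumption: autoscale each property column before PCA.
    centered = properties - properties.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    scaled = centered / std
    # PCA via singular value decomposition: scores = U * S.
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def encode_sequence(seq: str, scores: np.ndarray) -> np.ndarray:
    """Concatenate the per-base score vectors along the sequence."""
    index = {base: i for i, base in enumerate(BASES)}
    return np.concatenate([scores[index[base]] for base in seq.upper()])


# Placeholder data only: random numbers standing in for the 1209 real properties.
rng = np.random.default_rng(0)
placeholder = rng.normal(size=(len(BASES), 1209))

scores = base_scores(placeholder, n_components=3)  # 3 components is an assumption
x = encode_sequence("TTGACA", scores)  # arbitrary six-base example sequence
print(scores.shape, x.shape)
```

With five bases, the centered matrix has rank at most four, so no more than four components carry information. The resulting vector `x` (one block of scores per sequence position) is the kind of descriptor that would then go through variable selection and into a PLS or SVM regression model.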
