Affordable Access

Publisher Website

DNABind: a hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches.

Authors
Type
Published Article
Journal
Proteins Structure Function and Bioinformatics
0887-3585
Publisher
Wiley Blackwell (John Wiley & Sons)
Publication Date
Volume
81
Issue
11
Pages
1885–1899
Identifiers
DOI: 10.1002/prot.24330
PMID: 23737141
Source
Medline
Keywords
  • Dna-Binding Residue
  • Conformational Change
  • Machine Learning
  • Protein-Dna Interaction
  • Structural Analysis
  • Template

Abstract

Accurate prediction of DNA-binding residues has become a problem of increasing importance in structural bioinformatics. Here, we presented DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning- and template-based methods. Our machine learning-based method was based on the probabilistic combination of a structure-based and a sequence-based predictor, both of which were implemented using support vector machines algorithms. The former included our well-designed structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which can effectively quantify the difference between DNA-binding and nonbinding residues. The latter combined evolutionary conservation features with three other sequence attributes. Our template-based method depended on structural alignment and utilized the template structure from known protein-DNA complexes to infer DNA-binding residues. We showed that the template method had excellent performance when reliable templates were found for the query proteins but tended to be strongly influenced by the template quality as well as the conformational changes upon DNA binding. In contrast, the machine learning approach yielded better performance when high-quality templates were not available (about 1/3 cases in our dataset) or the query protein was subject to intensive transformation changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can distinctly improve the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed the state-of-art algorithms by around 10% in terms of Matthews's correlation coefficient. The proposed methodology could also have wide application in various protein functional site annotations. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/.

There are no comments yet on this publication. Be the first to share your thoughts.

Statistics

Seen <100 times
0 Comments
F