# Is it possible to predict the average surface hydrophobicity of a protein using only its amino acid composition?

- Publisher: ELSEVIER SCIENCE BV

## Abstract

Hydrophobicity is one of the most important physicochemical properties of proteins. It plays a fundamental role in hydrophobic interaction chromatography, a separation technique that is currently used in most industrial protein purification processes as well as in laboratory-scale applications. Although there are many ways of assessing the hydrophobicity of a protein, the average surface hydrophobicity (ASH) has recently been shown to be an important tool in protein separation and purification, particularly in protein chromatography. The ASH is calculated from the hydrophobic characteristics of each class of amino acid present on the protein surface, and those characteristics are given by an amino acid hydrophobicity scale. In this work, the scales of Cowan-Whittaker and Berggren were studied. Calculating the ASH, however, requires the three-dimensional structure of the protein. Frequently these data do not exist, and the only information available is the amino acid sequence. In such cases it would be desirable to estimate the ASH from properties extracted from the protein sequence alone. It was found that the ASH of a protein can be predicted to a level acceptable for many practical applications (correlation coefficient > 0.8) using only the amino acid composition. Two predictive tools were built: one based on a simple linear model and the other on a neural network. Both tools were built from the analysis of a set of 1982 non-redundant proteins. The linear model predicted the ASH for an independent subset with a correlation coefficient of 0.769 for the Cowan-Whittaker scale and 0.803 for the Berggren scale. The neural model improved on these results, obtaining correlation coefficients of 0.831 and 0.836, respectively. The neural model was also somewhat more robust than the linear model, in particular because it gave similar correlation coefficients for both hydrophobicity scales tested; moreover, the observed variability did not exceed 6.1% of the mean square error. Finally, we tested our models on a set of nine proteins with known retention times in hydrophobic interaction chromatography. Both models predicted this retention time with correlation coefficients only slightly lower (by 11.5% and 5.5% for the linear and neural network models, respectively) than those of models that use information about the three-dimensional structure of the proteins.
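
To make the idea of a composition-based linear predictor concrete, the following is a minimal sketch and not the authors' implementation. It computes the amino acid composition of a sequence, applies a linear model to it, and evaluates predictions with a Pearson correlation coefficient. The function names, the intercept, and the `DEMO_WEIGHTS` values are assumptions made purely for illustration; the fitted coefficients and the Cowan-Whittaker and Berggren scale values are not given in the abstract and are not reproduced here.

```python
from collections import Counter
from math import sqrt

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical placeholder weights, NOT fitted values from the paper.
DEMO_WEIGHTS = {aa: 0.0 for aa in AMINO_ACIDS}
DEMO_WEIGHTS["L"] = 1.0   # illustrative only
DEMO_WEIGHTS["K"] = -1.0  # illustrative only


def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def predict_ash_linear(sequence, weights, intercept=0.0):
    """Linear model: intercept plus the weighted sum of composition fractions."""
    comp = composition(sequence)
    return intercept + sum(weights[aa] * comp[aa] for aa in AMINO_ACIDS)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Toy sequence; real use would need weights fitted to structure-derived ASH values.
    print(predict_ash_linear("ALLK", DEMO_WEIGHTS))
```

In practice the weights would be obtained by regressing structure-derived ASH values on the 20 composition fractions over a training set, and `pearson_r` would then be applied to predicted versus structure-derived ASH on an independent subset.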
