Site Identification in Proteins Using Support Vector Machines
The growing number of protein three-dimensional (3D) structures determined by X-ray and NMR technologies, together with structures predicted by computational methods, creates a need for automated methods that provide initial annotations. We have developed a new method for recognizing sites in three-dimensional protein structures.
Our method is based on a previously reported algorithm for creating descriptions of protein microenvironments using physical and chemical properties at multiple levels of detail. The recognition method takes three inputs: (1) a set of sites that share some structural or functional role, (2) a set of control non-sites that lack this role, and (3) a single query site. A support vector machine classifier is built from feature vectors in which each component represents a property in a given volume. Validation against an independent test set shows that this recognition approach has high sensitivity and specificity.
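As an illustration only, the following minimal sketch shows one way such feature vectors and a classifier could be put together. It is not the authors' implementation: the property names, the concentric-shell volumes, the shell width, and the plain linear support vector machine trained by subgradient descent are all assumptions made for the example, and the function names are hypothetical.

```python
from math import dist

# Hypothetical property names; the abstract does not list the properties used.
PROPERTIES = ("charge", "hydrophobicity")


def shell_features(center, atoms, shell_width=1.25, n_shells=4):
    """Sum each property over concentric shells around `center`.

    `atoms` is a list of (xyz, {property: value}) pairs. The result is a flat
    vector ordered shell by shell, then property by property, so each
    component is one property in one volume.
    """
    vec = [0.0] * (n_shells * len(PROPERTIES))
    for xyz, props in atoms:
        shell = int(dist(center, xyz) // shell_width)
        if shell >= n_shells:
            continue  # atom lies outside the outermost shell
        for j, name in enumerate(PROPERTIES):
            vec[shell * len(PROPERTIES) + j] += props.get(name, 0.0)
    return vec


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by subgradient descent on the L2-regularised hinge loss.

    Labels in `y` are +1 (site) or -1 (control non-site).
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1.0:  # inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 if the query vector is classified as a site, else -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1
```

In this sketch, the sites and control non-sites would each be converted with `shell_features`, labelled +1 and -1, and passed to `train_linear_svm`; the query site is then converted the same way and passed to `predict`. In practice an established SVM library, possibly with a non-linear kernel, would likely replace the toy trainer.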
We also describe the results of scanning four calcium-binding proteins (with the calcium removed) using a three-dimensional grid of probe points at 1.25 Å spacing. Our results show that property-based descriptions, combined with support vector machines, can be used to recognize protein sites in unannotated structures.