Title: PREDICTION OF PROTEIN STRUCTURE AND FUNCTION ON A GENOMIC SCALE

Abstract

A novel method for predicting protein function, based on the sequence-to-structure-to-function paradigm, has been developed. First, the tertiary structure of the sequence of interest is predicted by either ab initio folding or threading. The resulting structures are then refined using novel techniques. Next, the refined structures are screened against a library of three-dimensional descriptors of protein active sites, termed "fuzzy functional forms" or FFFs. If the geometry and residue types in the predicted structure match an FFF, the protein is predicted to have the specified molecular function. The FFFs correctly identify the active sites in a library of experimental structures as well as in models produced by ab initio folding or threading. This shows that low-to-moderate resolution models, whose alpha-carbon root mean square deviations from the native structure range from 3.5 to 6 Å, are sufficient to identify protein active sites. Finally, the approach is applied to the screening of a number of genomes. In general, the method is found to be robust and is subject to fewer false positive functional predictions than alternative sequence-based approaches.
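The screening step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor layout, the function name find_fff_matches, the residue types, and all distance ranges below are hypothetical and chosen only for illustration. The sketch assumes an FFF can be approximated as a list of required residue types plus allowed alpha-carbon distance ranges between them, and that a model is a list of (residue type, alpha-carbon coordinate) records.

```python
from itertools import product
from math import dist

# Hypothetical descriptor: required residue types and allowed
# CA-CA distance ranges in angstroms. Values are illustrative only.
TOY_FFF = {
    "residues": ["C", "C", "P"],
    "distances": {(0, 1): (4.0, 7.0), (0, 2): (3.0, 5.0)},
}

def find_fff_matches(model, fff):
    """Return every tuple of model residue indices that satisfies the FFF.

    model: list of (one_letter_residue_type, (x, y, z)) alpha-carbon records.
    """
    # For each FFF position, collect model residues of the required type.
    candidates = [
        [i for i, (rtype, _) in enumerate(model) if rtype == wanted]
        for wanted in fff["residues"]
    ]
    hits = []
    for combo in product(*candidates):
        # A model residue may fill only one FFF position.
        if len(set(combo)) != len(combo):
            continue
        # Every listed pair must fall inside its allowed distance range.
        if all(
            lo <= dist(model[combo[a]][1], model[combo[b]][1]) <= hi
            for (a, b), (lo, hi) in fff["distances"].items()
        ):
            hits.append(combo)
    return hits

# Toy model with made-up coordinates, not real structural data.
toy_model = [
    ("C", (0.0, 0.0, 0.0)),
    ("G", (3.8, 0.0, 0.0)),
    ("C", (5.0, 0.0, 0.0)),
    ("P", (0.0, 4.0, 0.0)),
]
matches = find_fff_matches(toy_model, TOY_FFF)
```

A non-empty result would correspond to predicting the descriptor's function for the model. Expressing the constraints as distance ranges rather than exact values is one way to tolerate the coordinate error expected in low-to-moderate resolution models.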

Joint work with Andrzej Kolinski, Daisuke Kihara, Piotr Rotkiewicz, and Bartosz Ilkowski. Laboratory of Computational Genomics, Danforth Plant Science Center, St. Louis, MO 63141. http://bioinformatics.danforthcenter.org

http://www.stat.purdue.edu/research/seminars/biostat.html