Title: "A Scalable Inferential Approach to Protein Backbone Nuclear Magnetic Resonance Assignment"
Speaker: Ms. Olga Vitek; Department of Statistics, Purdue University
Place and time: Smith Hall (SMTH) 108; Tuesday, March 8, 2005; 4:30 pm


Nuclear Magnetic Resonance (NMR) spectroscopy is a key data source for genome-wide studies of the three-dimensional structures of proteins. Like many experiments in molecular biology, NMR spectroscopy generates noisy and incomplete data, and the existing analysis tools are error-prone and do not scale well. These problems can be addressed by developing methods of statistical inference designed specifically for NMR data, combined with new algorithms for carrying out the inference in large and noisy data sets.

The talk introduces a Bayesian approach to one step of the NMR-based procedure: backbone resonance assignment. The approach is based on a Gaussian graphical model in which informative priors are derived from existing NMR databases. The main difficulty lies in exploring the combinatorially large and jagged posterior landscape of candidate graphs. We develop an algorithm that, instead of examining one candidate graph at a time, recursively partitions the graph space into smaller subspaces. The resulting tree structure is searched using adaptive importance sampling, where the importance of the branches is learned from previously visited nodes. We demonstrate the accuracy and scalability of the approach on a range of simulated and experimental data, and show that the results are superior to those obtained with existing assignment methods.

This is a Statistics Bioinformatics COALESCE candidate interview.
