Title: "Shotgun Proteomic Data Analysis and Statistical Approaches to Protein Quantitation"
Speaker: Yuliya Karpievitch, Pacific Northwest National Laboratory, Richland, WA
Place: LILLY G126; October 19, 2010, Tuesday, 4:30pm


Mass spectrometry-based bottom-up proteomics requires complex data preprocessing before any statistical analysis can begin. LC-MS data generated with the Accurate Mass and Time (AMT) tag approach on high-performance mass spectrometers require peaks to be deisotoped and LC-MS features to be detected, aligned, and matched to a database. Scoring criteria are also needed to distinguish correct from incorrect identifications and to serve as a selection cutoff.

Among the peptides deemed correctly identified, many are observed in some samples but not in others, resulting in widespread missing values. Furthermore, a peak often goes unobserved because the peptide is present at an abundance below what the instrument can detect. Because of this informative missingness, care must be taken when handling the missing values to avoid biasing abundance estimates.
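
The bias described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the model presented in the talk: log-intensities are assumed to be normally distributed, and the detection limit is assumed to be a hard threshold. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_censored_mean(true_mean=20.0, sd=1.5, detection_limit=19.0,
                           n=10_000, seed=0):
    """Compare the mean of all intensities with the mean of detected ones.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    # Assumed: log-scale peak intensities follow a normal distribution.
    intensities = [rng.gauss(true_mean, sd) for _ in range(n)]
    # Assumed: anything below the detection limit is recorded as missing.
    observed = [x for x in intensities if x >= detection_limit]
    return {
        "full_mean": statistics.fmean(intensities),
        "observed_mean": statistics.fmean(observed),
        "fraction_missing": 1 - len(observed) / n,
    }


result = simulate_censored_mean()
print(result)
```

Averaging only the detected peaks discards the low-abundance tail, so the observed mean sits above the mean of the full set of intensities. This is the kind of bias that a method ignoring the cause of missingness would carry into abundance estimates.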

I will present (i) the proteomics pipeline used at PNNL, along with potential areas for improvement; (ii) a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference; and (iii) DAnTER, a software tool for proteomic data analysis.

Associated Reading:
Karpievitch Y, Stanley J, Taverner T, Huang J, Adkins JN, Ansong C, Heffron F, Metz TO, Qian WJ, Yoon H, Smith RD, Dabney AR. "A statistical framework for protein quantitation in bottom-up MS-based proteomics." Bioinformatics. 2009 Aug 15;25(16):2028-34.

Karpievitch YV, Taverner T, Adkins JN, Callister SJ, Anderson GA, Smith RD, Dabney AR. "Normalization of peak intensities in bottom-up MS-based proteomics using singular value decomposition." Bioinformatics. 2009 Oct 1;25(19):2573-80.

