Title: "Model Selection Approach for Genome Wide Association Studies"
Speaker: Malgorzata Bogdan, Institute of Mathematics and Computer Science, Wroclaw University of Technology, Poland
Place: LILLY G126; September 14, 2010, Tuesday, 4:30pm

Abstract

In Genome Wide Association Studies (GWAS) hundreds of thousands of genetic markers are assessed for association with a trait of interest. This type of search is typically based on individual tests at each of the markers and requires some correction for multiple testing. It is important to note, however, that in reality all "causal" mutations jointly influence the trait of interest, so statistical methods that model this joint influence seem to be more appropriate for GWAS.

In this talk we will concentrate on the case where the trait can be modeled using a normal distribution. In this context we will discuss the multiple testing approaches to GWAS and compare them with the approach based on the multiple regression model. We will present two model selection criteria, both modifications of the popular Bayesian Information Criterion, which can be used to identify important predictors in GWAS. One of these criteria, mBIC (see [1]), has previously been used successfully for quantitative trait locus (QTL) mapping in experimental populations. We will also present a new, modified version of mBIC, which works similarly to the Extended Bayesian Information Criterion of [2] and seems to be much more appropriate for GWAS. We will present the results of a simulation study showing that, when the trait is influenced by many genes, methods based on single tests have substantial problems with the proper ranking of causal SNPs. This leads to both low power and a high false discovery rate. The regression model combined with the modified version of mBIC performs much better on both of these measures. We will also demonstrate the performance of our multiple regression approach on the e-QTL data set of [3].
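To give a rough feel for why the ordinary BIC is too permissive when the number of markers is very large, the following minimal sketch compares it with a penalty of the extended-BIC form described in [2], as we understand it: the usual BIC plus 2*gamma*log of the number of models of the same size. It is not the speaker's mBIC or its new modification. The function names, the choice to count only markers in k, and the residual sums of squares in the example are all illustrative assumptions.

```python
import math


def bic_gaussian(rss, n, k):
    """Ordinary BIC (up to an additive constant) for a Gaussian linear model.

    rss: residual sum of squares, n: sample size,
    k: number of markers in the model (assumption: the intercept and
    the error variance are not counted, since they are in every model).
    """
    return n * math.log(rss / n) + k * math.log(n)


def ebic_gaussian(rss, n, k, p, gamma=1.0):
    """BIC plus an extended-BIC style penalty, 2*gamma*log C(p, k).

    p: total number of candidate markers. C(p, k) is the number of
    models with exactly k markers, so larger model classes cost more.
    """
    return bic_gaussian(rss, n, k) + 2.0 * gamma * math.log(math.comb(p, k))


# Made-up numbers, for illustration only: adding a third marker lowers
# the residual sum of squares from 500 to 480 in a sample of 500, with
# 300,000 candidate markers.
n, p = 500, 300_000
bic_change = bic_gaussian(480.0, n, 3) - bic_gaussian(500.0, n, 2)
ebic_change = ebic_gaussian(480.0, n, 3, p) - ebic_gaussian(500.0, n, 2, p)

# A negative change means the criterion prefers the larger model.
print("BIC change: ", bic_change)
print("EBIC change:", ebic_change)
```

With these hypothetical numbers the ordinary BIC accepts the third marker while the extended penalty rejects it, because the latter accounts for how many three-marker models could have been tried.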

This is joint work with Florian Frommlet and Felix Ruhaltinger from the University of Vienna and Piotr Twarg.

Associated Reading:
[1] Bogdan, M., Ghosh, J.K., and Doerge, R.W. (2004). Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting quantitative trait loci. Genetics, 167, 989--999.

[2] Chen, J. and Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95, 759--771.

[3] Stranger, B.E., et al. (2007). Population genomics of human gene expression. Nature Genetics, 39, 1217--1224.


