Title: HIERARCHICAL MODELS FOR THE ANALYSIS OF GENE EXPRESSION MICROARRAY DATA

Abstract

Microarray technologies provide a powerful tool for monitoring the expression patterns of thousands of genes simultaneously. Comparing expression arrays from different tissue samples can provide insight into gene function. Statistical problems abound in such comparisons, in part because of the high dimensionality, the small number of replicates, and the numerous sources of variation. In particular, the variation in apparent differential expression is higher for genes expressed at a low level than for genes expressed at a relatively high level. To estimate differential expression and identify significant changes in gene expression, we have developed a hierarchical model approach that accounts for measurement error, fluctuations in absolute gene expression levels, and inherent dependencies between genes. The approach assumes that intensities are Gamma distributed with a constant coefficient of variation. The derived significance levels account for the increased variability at low intensities. An examination of model properties indicates that the Gamma assumption is not necessary for the significance levels to have this property. The Gamma-based model will be compared with a log-normal-based approach. Identifiability of model parameters and a set of general conditions that ensure appropriate significance bounds will be discussed.
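
The constant coefficient of variation assumption can be illustrated with a small simulation. The sketch below is not the model described in this abstract; it only checks the standard fact that a Gamma distribution with fixed shape parameter a has coefficient of variation 1/sqrt(a) whatever its scale. The shape value, the scale values, and the helper name gamma_cv are hypothetical choices made for illustration.

```python
import numpy as np


def gamma_cv(shape, scale, n=200_000, seed=0):
    """Empirical coefficient of variation (sd / mean) of Gamma(shape, scale) draws."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(shape, scale, size=n)
    return x.std() / x.mean()


# Hypothetical shape parameter; it is held fixed while the scale
# (and therefore the mean intensity) varies over several orders of magnitude.
SHAPE = 4.0
SCALES = (10.0, 1_000.0, 100_000.0)

# Empirical CV at each scale, to compare against the theoretical value.
cvs = {scale: gamma_cv(SHAPE, scale) for scale in SCALES}
theoretical_cv = 1.0 / np.sqrt(SHAPE)

for scale, cv in cvs.items():
    print(f"mean intensity ~ {SHAPE * scale:>10.0f}  empirical CV = {cv:.3f}")
print(f"theoretical CV = {theoretical_cv:.3f}")
```

In this sketch the standard deviation grows in proportion to the mean, so the relative spread of a single intensity is the same at every expression level. How this assumption translates into intensity-dependent significance levels depends on the hierarchical structure of the model, which the sketch does not attempt to reproduce.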