Title: ``Challenges and Practical Solutions in the Analysis of Microarray Data''
Speaker: James Lyons-Weiler
Place: Stanley Coulter (SC) 239; Tuesday, 4:30pm


A bewildering number of approaches exist for analyzing data from microarray experiments, and open questions remain about appropriate methods for transformation, normalization, finding differentially expressed genes, identifying co-regulated genes, and predictive classification (e.g., molecular diagnosis and prognosis). Our coarse-grained simulation approach offers partial solutions to many of these questions. With simulation, one can directly examine the effect of each analysis decision by comparing, for example, the sensitivity and specificity of tests for finding differentially expressed genes, and the relative ability of classification algorithms to recover the known sample classification. We can also report on the relative robustness of these approaches to various data distributions and, more importantly, to multiple unwanted sources of variation, such as background, array-specific biases, and confounding. Given the large number of approaches to analysis that have been or could be defined, we have only begun to map the immense method space. The most important results to date include: (1) the J5 test is superior to fold-change and to the t-test in finding differentially expressed genes; (2) only three tests are robust to the effects of confounding; (3) some simple normalization approaches can recover most of the information lost due to the cumulative and compounded effect of array-specific and sample-specific errors. Because so much of the method space remains unexplored, our simulator and the associated online Gene Expression Data Analysis tool are open-source and ready for extension and use by collaborators and independent researchers.
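
The idea of scoring tests against a simulated ground truth can be illustrated with a minimal sketch. This is not the speaker's simulator: the data model (Gaussian noise on a log scale), the parameter values, and every function name below are assumptions made for illustration only. The J5 test is not implemented, since the abstract does not define it; only a log-scale mean difference (a stand-in for fold-change) and a Welch t-statistic are compared.

```python
import random
import statistics

def simulate(n_genes=500, n_de=50, n_per_group=6, effect=2.0, seed=0):
    """Simulate log-scale expression for two groups (hypothetical model).

    The first n_de genes are truly differentially expressed: their mean
    in group B is shifted by `effect`. Returns (group_a, group_b, truth).
    """
    rng = random.Random(seed)
    group_a, group_b, truth = [], [], []
    for g in range(n_genes):
        is_de = g < n_de
        base = rng.gauss(8.0, 1.0)           # gene-specific baseline level
        shift = effect if is_de else 0.0
        group_a.append([rng.gauss(base, 0.5) for _ in range(n_per_group)])
        group_b.append([rng.gauss(base + shift, 0.5) for _ in range(n_per_group)])
        truth.append(is_de)
    return group_a, group_b, truth

def mean_difference(a, b):
    """Per-gene difference of group means (log fold-change on log data)."""
    return [statistics.fmean(y) - statistics.fmean(x) for x, y in zip(a, b)]

def welch_t(a, b):
    """Per-gene Welch t-statistic (unequal-variance two-sample t)."""
    scores = []
    for x, y in zip(a, b):
        se = (statistics.variance(x) / len(x)
              + statistics.variance(y) / len(y)) ** 0.5
        scores.append((statistics.fmean(y) - statistics.fmean(x)) / se)
    return scores

def sensitivity_specificity(scores, truth, n_called):
    """Call the n_called genes with the largest |score|; compare to truth."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    called = set(ranked[:n_called])
    tp = sum(1 for i in called if truth[i])  # true positives among calls
    fp = n_called - tp                       # false positives among calls
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    return tp / n_pos, (n_neg - fp) / n_neg

if __name__ == "__main__":
    a, b, truth = simulate()
    for name, fn in [("mean difference", mean_difference), ("Welch t", welch_t)]:
        sens, spec = sensitivity_specificity(fn(a, b), truth, n_called=50)
        print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because the set of truly changed genes is known by construction, any additional test statistic, normalization step, or simulated source of unwanted variation can be dropped into the same loop and scored in the same way.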