Myra Samuels Memorial Lecture

High Dimensional Statistics in Genomics: Some New Problems and Solutions

Dr. Hongzhe Li
Department of Biostatistics and Epidemiology, University of Pennsylvania

Start Date and Time: Thu, 2 Apr 2009, 4:30 PM

End Date and Time: Thu, 2 Apr 2009, 5:30 PM

Venue: MATH 175

Refreshments: 5:30 PM in HAAS 111


Large-scale systematic genomic datasets have been generated to inform our biological understanding of both the normal workings of organisms and the disrupted pathways that cause human disease. The integrative analysis of these vast and diverse quantitative data has become an increasingly important part of genomics and systems biology research. It poses many interesting statistical problems, largely driven by the complex inter-relationships among these high-dimensional genomic measurements. In this talk, I will present three problems in genomics research that require the development of new statistical methods: (1) identification of active transcription factors in microarray time-course experiments; (2) identification of sub-networks that are associated with clinical outcomes; and (3) identification of genetic variants that explain higher-order gene co-expression modules. I will present several regularized estimation methods to address these questions and demonstrate their applications using real data examples. I will also discuss some theoretical properties of these procedures.

