Min Zhang

Written by: Jeremy Troisi, M.S. candidate in Statistics

Many medical and biological problems demand the analysis of high-dimensional data, and Dr. Min Zhang, Assistant Professor of Statistics, is working to produce results that could benefit these fields. Dr. Zhang received an M.D. from Hebei Medical University, a Ph.D. in Neuroscience from Peking University Health Science Center, and a Ph.D. in Biological Statistics and Computational Biology from Cornell University. She has built her career as a statistician by combining her medical background with her specialties in biology and computation.

One particularly large and important application that Zhang is involved in is a genome-wide association study. "With recent technology the study has been able to genotype around five hundred thousand to a million Single Nucleotide Polymorphisms (SNPs). This is one particular area that caught my interests, because of its significance and my previous experience within clinical medicine. I want to use my statistical tools and expertise to help people in the clinical world identify the biomarkers and risk factors for certain diseases so they can make better clinical decisions in their daily practice."

Identifying the biomarkers and risk factors that indicate the presence or future risk of a disease is difficult because of the large number of SNPs. Although each SNP can be computationally assessed for its individual impact on a certain disease, it is currently difficult to handle all pairwise interactions, let alone higher-order interactions. Without a way to statistically test interactions between SNPs, results are limited to one p-value per SNP, each measuring only the evidence for that SNP's individual association with the disease. Recently, Zhang and her collaborators have developed a penalized orthogonal-component regression (POCRE) approach that can simultaneously evaluate the effects of tens of thousands of predictors. The method has been successfully applied to SNP-based association analysis.
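To give a sense of the scale involved, the number of candidate tests can be worked out directly from the SNP counts quoted above. The short Python sketch below is illustrative only: the helper name `interaction_counts` is hypothetical, and it simply counts combinations rather than reproducing any part of the POCRE method.

```python
from math import comb


def interaction_counts(n_snps: int, max_order: int = 3) -> dict[int, int]:
    """Count the distinct SNP subsets of each size from 1 up to max_order.

    Order 1 is the number of single-SNP tests, order 2 the number of
    pairwise interactions, order 3 the number of three-way interactions.
    """
    return {order: comb(n_snps, order) for order in range(1, max_order + 1)}


# SNP counts taken from the range mentioned in the article.
for n_snps in (500_000, 1_000_000):
    counts = interaction_counts(n_snps)
    print(f"{n_snps:,} SNPs: {counts[1]:,} single tests, "
          f"{counts[2]:,} pairs, {counts[3]:,} triples")
```

With five hundred thousand SNPs there are already about 125 billion possible pairs, which is why exhaustive testing of interactions is impractical and methods that evaluate many predictors simultaneously are attractive.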

Zhang's research is not limited to humans and their diseases. Her genomic research can be, and is, applied to animals and plants, such as sorghum, an emerging biofuel. Aside from having fewer genetic markers, plants have another quality that benefits experimenters: experiments on them raise few or no ethical concerns. With human subjects, and to a similar degree with animals, great moral and ethical considerations shape the experimental procedure to ensure the safety of the subjects. With plants, those standards can be greatly relaxed in the interest of scientific advancement. By working with plants, Zhang gains a better understanding of which high-dimensional data analysis methods are best suited to genome-wide association studies of human diseases.

Zhang has also developed methods for analyzing health care and pharmacy costs, which have been dubbed zero-inflated data because so many individuals avoid health care entirely due to the potential costs. She is actively involved in the Cancer Care Engineering (CCE) project, which will generate a variety of omic data (such as proteomic, genomic, lipidomic, and metabolomic data) and will demand statistical approaches for studying biological systems.

Zhang joined the Department of Statistics at Purdue University in 2005 after receiving her Ph.D. in Biological Statistics and Computational Biology from Cornell University. She is currently teaching Design of Experiments (STAT 514). For more information, please visit her homepage.

November 2009