Congratulations 2009-2010 Graduates - Department of Statistics - Purdue University

Congratulations 2009-2010 Graduates

Ryan Martin (August 2009)

Dr. Ryan Martin's thesis addresses a problem posed by Professor Jayanta Ghosh: a theoretical study of the convergence of a new recursive algorithm, together with its extensions and applications to inference based on high-dimensional mixture models. The algorithm, called Predictive Recursion (PR), was originally proposed by Michael Newton as an approximation to a nonparametric Bayes procedure.

The theoretical part of the thesis deals with these problems, and the results may be regarded as a convergence theory for a special case of non-linear infinite-dimensional stochastic approximation algorithms. The results on convergence and extensions of PR have appeared in Martin and Ghosh (Statistical Science, 2008) and Tokdar, Martin and Ghosh (Annals of Statistics, 2009). More recently, Martin and Tokdar (IUPUI Tech Report, 2009) give results on robustness and rate of convergence for PR. Surya Tokdar is a former doctoral student of Ghosh at Purdue and is now an assistant professor in the Department of Statistical Science at Duke University.
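For readers unfamiliar with PR, the recursion can be sketched in a few lines. This is only an illustrative sketch, not code from the thesis: the grid discretization, the Normal kernel, the weight sequence w_i = (i + 1)^(-gamma), and all function and variable names are assumptions made for the example. Each observation updates the current estimate of the mixing density by averaging it with its one-observation Bayes update.

```python
import numpy as np

def predictive_recursion(x, grid, f0, kernel, gamma=0.67):
    """One pass of predictive recursion over the observations x.

    grid   : equally spaced grid for the mixing variable u (assumption)
    f0     : initial guess of the mixing density, evaluated on grid
    kernel : function (x_i, grid) -> p(x_i | u) evaluated on grid
    gamma  : exponent of the weights w_i = (i + 1) ** (-gamma) (assumed choice)
    """
    du = grid[1] - grid[0]
    f = np.asarray(f0, dtype=float)
    f = f / (f.sum() * du)                 # normalize the initial guess
    for i, xi in enumerate(x, start=1):
        w = (i + 1.0) ** (-gamma)          # decreasing weight for observation i
        post = kernel(xi, grid) * f        # unnormalized one-step Bayes update
        post = post / (post.sum() * du)
        f = (1.0 - w) * f + w * post       # average old estimate and update
    return f

def normal_kernel(xi, grid, sigma=1.0):
    """Normal density of x_i with mean u, for every u on the grid."""
    return np.exp(-0.5 * ((xi - grid) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Simulated data (hypothetical): means at -2 or 2, plus standard Normal noise.
rng = np.random.default_rng(0)
x = rng.choice([-2.0, 2.0], size=500) + rng.normal(size=500)
grid = np.linspace(-6.0, 6.0, 241)
f_hat = predictive_recursion(x, grid, np.ones_like(grid), normal_kernel)
```

Because every step is a convex combination of two densities, the estimate stays a density on the grid throughout the pass. The result depends on the order of the data, so the pass is sometimes repeated over several random orderings and the results averaged.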

The proposed extension of PR, namely PR+, is then applied to analyze microarray data on gene expression. The data are modeled via Efron's two-groups framework, in which one group represents the unexpressed genes and the other the expressed genes. The expressed group is modeled by a fully nonparametric mixture, while the unexpressed group is modeled parametrically. The null density is taken to be Normal, but not the standard Normal favored in the classical testing literature; Efron gives examples showing that the standard Normal may not be the right null distribution in all cases. Surya and Ryan's version of the two-groups model avoids the possibility of non-identifiability, and they analyze several microarray data sets using Efron's local fdr-based decision procedure. Their method of estimating the null and alternative densities is also quite different from Efron's.
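In the two-groups framework, the local fdr at a test statistic z is the posterior probability that the gene belongs to the null group: fdr(z) = pi0 f0(z) / (pi0 f0(z) + (1 - pi0) f1(z)). The sketch below only illustrates that formula. The null proportion, the null mean and standard deviation, the two-component alternative, and the 0.2 cutoff are hypothetical values chosen for the example; in the work described above the null and alternative densities are estimated from the data.

```python
import numpy as np

def normal_pdf(z, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def local_fdr(z, pi0, null_pdf, alt_pdf):
    """Posterior probability of the null group given the statistic z."""
    num = pi0 * null_pdf(z)
    return num / (num + (1.0 - pi0) * alt_pdf(z))

# Hypothetical values: a non-standard Normal null and a symmetric alternative.
pi0, mu0, sigma0 = 0.9, 0.1, 1.2

def null_pdf(z):
    return normal_pdf(z, mu0, sigma0)

def alt_pdf(z):
    return 0.5 * normal_pdf(z, -3.0, 1.0) + 0.5 * normal_pdf(z, 3.0, 1.0)

z = np.array([0.0, 2.0, 4.0])
fdr = local_fdr(z, pi0, null_pdf, alt_pdf)
flagged = fdr < 0.2   # assumed cutoff; genes with small local fdr are flagged
```

With these assumed values only the statistic at z = 4 falls below the cutoff, which shows how a wider-than-standard null makes moderate statistics such as z = 2 unremarkable.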

The gene expression application is expected to lead to a substantial paper, and Ryan and Surya are currently working on applying their empirical Bayes testing framework to other important bioinformatics-related problems, namely inferring gene association networks and analyzing single-nucleotide polymorphisms (SNPs).

After graduation, Dr. Ryan Martin accepted a position as an Assistant Professor at Indiana University - Purdue University Indianapolis (IUPUI) in the Department of Mathematical Sciences. For more information about Dr. Martin, please visit his homepage.

Cherie Ochsenfeld (August 2009)

Pictured left to right: Professor Kristofer Jennings, Dr. Cherie Ochsenfeld, Professor Rebecca W. Doerge

Dr. Cherie Ochsenfeld’s research (“Mixed Models in Quantitative Trait Loci and Association Mapping with Bootstrap Thresholds”) is focused on understanding the complicated process by which the genetic code is translated into complex traits. Associating regions of the genome with complex traits continues to support improved treatments for disease and better breeding programs for both animals and crops.

Quantitative trait loci (QTL) mapping and association mapping are two well-known analytic methodologies for locating regions of the genome that are associated with complex traits. However, determining the actual statistical significance of results from these approaches has at times been difficult, since the intrinsic relationships between the experimental design, the genome, and the environment are complex.

Previously, permutation testing (Churchill and Doerge 1994) successfully provided significance thresholds for QTL analyses. However, because permutation requires exchangeability, permutation thresholds are limited to simple linear models, which restricts the inclusion of cofactors in the analysis. Dr. Ochsenfeld proposes two models: a mixture of mixed models and a mixed mixture model. Both extend QTL interval mapping to include mixed models that incorporate biological cofactors. These extensions include novel applications of the alternating expectation-conditional maximization algorithm and the nested EM algorithm, which provide maximum likelihood estimates for the mixture of mixed models and the mixed mixture model, respectively. A bootstrap threshold algorithm is presented that establishes significance thresholds appropriate for a mixed model QTL analysis. This approach is also extended to association mapping studies that incorporate mixed models to control for population structure. In simulation and real data studies, the proposed mixed models for QTL mapping demonstrate improved detection of additive QTL effects when influential covariates are incorporated into the analysis. Additionally, it is shown empirically that the bootstrap threshold algorithm establishes appropriate significance thresholds for mixed model QTL or association mapping analyses.
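The permutation thresholds mentioned above can be sketched as follows. This is a simplified illustration, not the bootstrap algorithm from the thesis: it uses the largest squared marker-trait correlation as a stand-in for a genome-wide test statistic, and the simulated backcross-style genotypes, the effect size, and all names are assumptions. Shuffling the trait values breaks any marker-trait association, so the upper quantile of the shuffled maxima serves as a genome-wide threshold.

```python
import numpy as np

def max_marker_stat(y, geno):
    """Largest squared correlation between the trait and any marker column."""
    yc = y - y.mean()
    gc = geno - geno.mean(axis=0)
    r = (gc.T @ yc) / (np.sqrt((gc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    return float(np.max(r ** 2))

def permutation_threshold(y, geno, n_perm=200, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the maximum statistic under shuffled traits."""
    rng = np.random.default_rng(seed)
    null_max = np.array(
        [max_marker_stat(rng.permutation(y), geno) for _ in range(n_perm)]
    )
    return float(np.quantile(null_max, 1.0 - alpha))

# Simulated example (hypothetical): 100 individuals, 50 markers coded 0/1,
# with a single marker (column 10) affecting the trait.
rng = np.random.default_rng(1)
geno = rng.integers(0, 2, size=(100, 50)).astype(float)
y = 2.0 * geno[:, 10] + rng.normal(size=100)

threshold = permutation_threshold(y, geno)
observed = max_marker_stat(y, geno)
significant = observed > threshold
```

The shuffle is valid only when individuals are exchangeable under the null hypothesis, which is the limitation that motivates a different resampling scheme once cofactors and random effects enter the model.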

After graduation, Dr. Cherie Ochsenfeld accepted a position as a Senior Analyst with Dow AgroSciences in Indianapolis, IN.

Alexander Lipka (December 2009)

Pictured left to right: Alexander Lipka and Rebecca W. Doerge

Dr. Alexander Lipka's Ph.D. research is focused on a phenomenon known as quasi-separation of points (QSP), which can occur when statistically testing for an association between single nucleotide polymorphisms (SNPs) and binary traits (or diseases). Logistic regression is typically used in association mapping studies, and QSP arises when, for at least one SNP genotype (called a SNP type), all individuals have the same observed binary trait value (e.g., all individuals have the disease), leaving the other category void of observations. A zero category, or an empty disease class, results in infinite maximum likelihood estimates of the logistic regression parameters. Using simulation and real data analyses, Alex investigated the impact of QSP on binary trait association mapping and implemented Firth's penalized likelihood function to obtain penalized maximum likelihood estimates of the logistic regression parameters (additive, dominance, and epistatic) in the presence of QSP. Although this work is framed in a single SNP and single binary trait setting, the long-range impact of this research is most obvious when multiple SNPs are analyzed simultaneously and present combinations of SNP classifications that result in empty categories.
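A small sketch shows how Firth's penalty handles QSP. It is an illustration under stated assumptions rather than the implementation used in the thesis: the toy counts, the indicator coding of genotypes (instead of additive and dominance coding), and the function name are all hypothetical. The fit uses the standard Firth-modified score, X'(y - p + h(0.5 - p)), where h holds the diagonal of the weighted hat matrix, with Fisher scoring steps.

```python
import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    """Firth-penalized logistic regression via modified-score iterations."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        info_inv = np.linalg.inv(X.T @ (w[:, None] * X))   # inverse Fisher information
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)   # hat-matrix diagonal
        score = X.T @ (y - p + h * (0.5 - p))              # Firth-modified score
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data (hypothetical): genotype 2 has cases only, so QSP is present.
genotype = np.repeat([0, 1, 2], [10, 10, 5])
y = np.concatenate([
    np.repeat([1.0, 0.0], [4, 6]),   # genotype 0: 4 cases, 6 controls
    np.repeat([1.0, 0.0], [5, 5]),   # genotype 1: 5 cases, 5 controls
    np.ones(5),                      # genotype 2: 5 cases, 0 controls
])
X = np.column_stack([
    np.ones(genotype.size),
    (genotype == 1).astype(float),
    (genotype == 2).astype(float),
])

beta_firth = firth_logistic(X, y)
```

An unpenalized fit would push the genotype 2 coefficient toward infinity, whereas the penalized estimate stays finite. For this saturated indicator coding, the fitted probability in each genotype class works out to (cases + 0.5) / (n + 1), which is the same as adding half an observation to each cell of the genotype-by-trait table.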

After graduation, Dr. Alexander Lipka joined Cornell University (Ithaca, NY) as a postdoctoral fellow in the laboratory of Professor Ed Buckler.

Past graduates can be seen here.
