I have broad interests, including multiple testing, model selection, and empirical processes; stochastic differential equations, their statistical inference, and their applications to mathematical finance; random matrix theory; and analytic number theory.

Publications, Work in progress, and others:

Xiongzhi Chen and R.W. Doerge (2012). *Generalized estimators for multiple testing: proportion of true nulls and false discovery rate.* Technical report # 12-04, Department of Statistics, Purdue University.

*Abstract*: *For multiple testing based on discrete p-values, we have proposed new estimators of the proportion of true nulls and of the FDR of one-step multiple testing procedures. We have theoretically proved, and verified via extensive simulation studies and real data applications, that the new estimators outperform existing competitors for discrete p-values, and perform equally well for continuous data. We have also established the asymptotic (simultaneous) conservativeness of the new adaptive FDR estimator, and we have proven that the threshold of a general type of adaptive FDR estimator is a stopping time relative to the backward filtration generated by the p-values. This is true regardless of whether the p-values are dependent or independent, discrete or continuous.*

Xiongzhi Chen and R.W. Doerge (2012). *Consistent estimation of the proportion of nonzero normal means under certain strong covariance dependencies*. Technical report # 12-03, Department of Statistics, Purdue University.

*Abstract*: *To better estimate the FDR of one-step multiple testing procedures for strongly dependent test statistics, we assume that the test statistics are jointly normally distributed with a known covariance matrix, and study multiple testing of which normal means are zero. Under this model and using a technique called "principal factor approximation", we have developed a (uniformly) consistent estimator of the proportion of nonzero normal means under certain strong dependence embedded in the covariance matrix. In order to develop this estimator, we have extended the Fourier transform method for estimating the proportion of nonzero normal means to the case of weakly dependent heterogeneous null distributions, and partially developed a theory for partially penalized least squares (PLS) in linear regression with weakly dependent normally distributed errors.*

Xiongzhi Chen and R.W. Doerge (2012). *Variance of normalized number of conditional rejections for multiple testing normal means*. Manuscript.

*Abstract*: *Very recently, the strong law of large numbers (SLLN) for the normalized number of conditional rejections has been derived through a technique called principal factor approximation (PFA), when testing which normal means are zero when the test statistics follow a joint normal distribution whose known correlation matrix is the sample correlation matrix of a random sample of some normal random vector. We reformulate this result under two additional conditions when the correlation matrix is deterministic and caution that the result may not hold for arbitrary correlation matrices, random or deterministic. Our justification provides an integrated view on how the speed of PFA, the linear dependency among components of the normal random vector, and the magnitudes of the normal means should interact with each other in order to validate the SLLN for this random process.*

Xiongzhi Chen. *Lower bound for the proportion of nonnulls under pairwise dependence.* In preparation.

*Abstract*: *It is well known that Prof. Bradley Efron's method of mode matching requires the proportion of true nulls to be at least 0.9, even though it is robust to dependence. We are attempting to remove this restriction in multiple testing of normal means, when the normal test statistics are only pairwise bivariate normal but not jointly normal. The key is to develop random density functions with a given covariance structure, and then find a normalizing sequence for the difference of two empirical processes, which is quite a demanding task...*

Xiongzhi Chen. *Multiple Testing under Discrete and Heterogeneous Null Distributions*. In preparation

*Abstract*: *The generalized estimators provide a partial solution to multiple testing when the null distributions are discrete. However, multiple testing when the null distributions are also heterogeneous has largely been untouched by the literature. With the deluge of low-sample-size discrete data from sequencing, it seems urgent to explore this aspect of multiple testing. We propose a novel method that respects and exploits such heterogeneity. Initial simulation results show that our method yields a significant improvement.*

Xiongzhi Chen. *Random Geometric Operators and Variable Selectimators.* In preparation

*Abstract*: *It seems that modern variable selection techniques can be decomposed into a composition of simple random geometric operators. This point of view may lead to a unified treatment of the microscopic structure of the solution paths of these variable selection techniques.*

Xiongzhi Chen. *Variable selection and multiple testing: shared optimality under dependence*.

*Abstract*: *Optimal multiple testing under dependence in a frequentist paradigm is a long-standing open problem. Fortunately, recent work has revealed a connection between multiple testing and variable selection under dependence. This connection may lead to shared optimality between these two techniques.*

Xiongzhi Chen. *Optimal subsample allocation in multi-stage high-dimensional variable selection*.