Guang Cheng

(Currently on sabbatical leave at Princeton)

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. --- John von Neumann

Associate Professor
Department of Statistics
Purdue University 
Office: HAAS 232
E-mail: chengg at stat dot purdue dot edu
Phone: 765-496-9549

Research Fellow, 2012 -- 2013
Massive Data Program, SAMSI


University of Wisconsin-Madison
Ph.D. in Statistics, September 2003 -- July 2006
Advisor: Michael Kosorok

Tsinghua University
School of Economics and Management
B.A. in Economics, September 1998 -- July 2002

Research Interests (3Ms)

Mathematical Statistics: Semi/Nonparametric Inference; Bernstein-von Mises Theorem; Bootstrap Theory; Empirical Processes
Machine Learning: Classification Stability; Active Learning
Massive Data

Research Assistantship Available
One or two RA positions are available for PhD students in the above research areas. These positions are funded by the National Science Foundation in the Division of Mathematical Sciences (DMS). If you are interested in these positions, please send me your CV or stop by my office.


Simons Fellowship in Mathematics, 2014
Noether Young Scholar Award, 2012
NSF CAREER Award, 2012
Teaching for Tomorrow Award, Purdue, 2013
College of Science Professional Achievement Award, Purdue, 2013


NSF DMS-1418202, 2014-2017, PI
Simons Fellowship in Mathematics, 2014-2015, Sole PI
NSF CAREER Award: DMS-1151692, 2012-2017, Sole PI
NSF DMS-0906497, 2009-2012, Sole PI

Editorial Work

Associate Editor for Electronic Journal of Statistics

Selected Publications

Cheng, G. and Kosorok, M.R. (2008), Higher Order Semiparametric Frequentist Inference with the Profile Sampler, Annals of Statistics, 36, 1786-1818
Cheng, G. and Kosorok, M.R. (2008), General Frequentist Properties of the Posterior Profile Distribution, Annals of Statistics, 36, 1819-1853

Cheng, G. (2009), Semiparametric Additive Isotonic Regression, Journal of Statistical Planning and Inference, 139, 1980-1991
Cheng, G. and Huang, J.Z. (2010), Bootstrap Consistency for General Semiparametric M-estimation, Annals of Statistics, 38, 2884-2915

Cheng, G. and Wang, X. (2011), Semiparametric Additive Transformation Models under Current Status Data, Electronic Journal of Statistics, 5, 1735-1764
Zhang, H., Cheng, G. and Liu, Y. (2011), Linear or Nonlinear? Automatic Discovery for Partially Linear Models, JASA-Theory & Methods, 106, 1099-1112

Leng, C. and Cheng, G. (2012), Discussion on Probabilistic Index Models by Thas, Neve, Clement and Ottoy, JRSS-B, 74, 661-662

Cheng, G., Yu, Z.* and Huang, J.Z. (2013). The Cluster Bootstrap Consistency in Generalized Estimating Equations, Journal of Multivariate Analysis, 115, 33-47 (*: PhD student)
Cheng, G. (2013). How Many Iterations are Sufficient for Efficient Semiparametric Estimation?, Scandinavian Journal of Statistics, 40, 592-618

Shang, Z. and Cheng, G. (2013) Local and Global Asymptotic Inference in Smoothing Spline Models, Annals of Statistics, 41, 2608-2638.
Cheng, G., Zhou, L. and Huang, J.Z. (2014) Efficient Semiparametric Estimation in Generalized Partially Linear Additive Models for Longitudinal/Clustered Data, Bernoulli, 20, 141-163
Cheng, G., Zhang, H.H. and Shang, Z. (2014) Sparse and Efficient Estimation for Partial Spline Models with Increasing Dimension, AISM, To Appear


Collaborative Research

Liang, C., Cheng, G., Wixon, D. and Balse, T. (2011) An Absorbing Markov Chain Approach to Understanding the Microbial Role in Soil Carbon Stabilization, Biogeochemistry, 106, 303-309

Selected Manuscripts


Bootstrapping High Dimensional Time Series (with Zhang, X.)

Nearest Neighbor Classifier with Optimal Stability (with Sun, W. and Qiao, X.)
Active Clinical Trials for Personalized Medicine (with Zhao, Y. and Minsker, S.)

A Simple Averaged Confidence Interval after Model Selection (with Xu, G. and Huang, J.)
Nonparametric Inference in Generalized Functional Linear Models (with Shang, Z.)
Semiparametric Objective Prior (with Yang, Y. and Dunson, D.)
Joint Asymptotics for Semi-Nonparametric Regression Models under Partially Linear Structure (with Shang, Z.)

Talk Slides

T1. Bootstrapping High Dimensional Vector: Interplay between Dependence and Dimensionality. Link (presented by my co-author Zhang at SAMSI workshop on May 13, 2014)
T2. Nonparametric Inference in Functional Data. Link (presented by my co-author Shang at SAMSI workshop)
T3. Nearest Neighbor Classifier with Optimal Stability. Link (presented by my PhD student Sun at ISBIS 2014 and SLDM Meeting)
T4. A Long March towards Joint Asymptotics: My 1st Steps. Link

T5. Semiparametric Model Based Bootstrap. Link
T6. Bootstrap Consistency for General Semiparametric M-Estimate. Link

T7. How Many Iterations are Sufficient for Semiparametric Estimation? Link

T8. Inverse Problems in Semiparametric Statistical Models. Link


Stat 695R: Asymptotic Statistics and Empirical Processes


PhD Students

Wei Sun

Zhuqing Yu

Ching-Wei Cheng
Meimei Liu
Yaowu Liu
Botao Hao
Ritabrata Dutta (2008-2012, co-advised by Prof. Jayanta Ghosh), Currently Postdoc Fellow


Zhiyuan Luo



Selected Invited Talks


December, The 9th ICSA International Conference, Hong Kong
December, Department of Statistics, University of Michigan

October, Department of Statistics and Probability, Michigan State University
October, Department of Mathematics, University of Bochum, Germany
October, Applicable Semiparametrics Conference, Berlin, Germany

June, The 4th IMS-China International Conference, Chengdu, China
May, Department of Statistics, Virginia Tech Univ., Blacksburg, VA

March, Data Seminar, Duke University, Durham, NC



July 2-4, The 2nd Institute of Mathematical Statistics-Asian Pacific Rim Meeting, Tsukuba, Japan
June 5-7, Conference on Statistical Learning and Data Mining, Ann Arbor, MI

April 12, Dept of Statistics and Actuarial Science, University of Iowa, Iowa City, IA

April 5, IAMCS, Texas A&M University, College Station, TX


Sept 23, Dept. of Math., IUPUI, Indianapolis, IN.

June 26-29, ICSA 2011, Applied Statistics Symposium, NYC

May 23-27, Applied Inverse Problems Conference, College Station, TX

March 9, Department of Statistics, Northwestern University, Chicago, IL



July 31-August 5, Section of Nonparametric Statistics, JSM

June 20-23, Session of Semiparametric Modelling, ICSA, Applied Statistics Symposium

May 14, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison

March 12, Department of Statistics, University of Michigan

March 25-26, Conference on Resampling Methods and High Dimensional Data, Texas A&M University



October, Department of Statistics, University of Illinois at Urbana-Champaign

September, Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago

August, Section of Nonparametric Statistics, JSM

June, National Center for Atmospheric Research (NCAR)

May, Symposium on New Directions in Asymptotic Statistics, University of Georgia

April, Department of Statistics, Penn State University



Watson Research Center, IBM; Texas A&M University; Rutgers University; University of Iowa; Purdue University; Michigan State University; UC-Berkeley; North Carolina State University; Iowa State University; Indiana University; 10th New Researcher Conference; 2007 Nonparametric Conference at University of South Carolina; University of Bristol; Temple University; University of Oxford; Hong Kong University; University of Georgia; National University of Singapore; SAMSI.