Session 03 - Department of Statistics - Purdue University

Recent Advances in Sufficient Dimension Reduction

Organizer: Michael Zhu, Professor of Statistics, Purdue University

  • Bing Li, Verne M. Willaman Professor of Statistics; Chair of Graduate Studies, Department of Statistics, Penn State
  • Hanmin Guo, Department of Statistics, Stanford University
  • Wenxuan Zhong, Professor, Department of Statistics, University of Georgia
  • Pang Du, Department of Statistics, Virginia Tech

Speaker Title
Bing Li Nonlinear function-on-function regression by RKHS

Abstract: We propose a nonlinear function-on-function regression model where both the covariate and the response are random functions. The nonlinear regression is carried out in two steps: we first construct Hilbert spaces to accommodate the functional covariate and the functional response, and then build a second-layer Hilbert space for the covariate to capture nonlinearity. The second-layer space is assumed to be a reproducing kernel Hilbert space, which is generated by a positive definite kernel determined by the inner product of the first-layer Hilbert space for X; this structure is known as nested Hilbert spaces. We develop estimation procedures to implement the proposed method, which allow the functional data to be observed at different time points for different subjects. Furthermore, we establish the convergence rate of our estimator as well as the weak convergence of the predicted response in the Hilbert space. Numerical studies, including both simulations and a data application, are conducted to investigate the performance of our estimator in finite samples. (Joint work with Peijun Sang of the University of Waterloo)
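As a rough illustration of the two-layer construction (not the authors' implementation), the sketch below represents curves on a common grid, takes the L2 inner product as the first-layer structure, builds a Gaussian second-layer kernel from first-layer distances, and fits a function-valued kernel ridge regression. The simulation setup, the Gaussian kernel choice, and the parameters `gamma` and `lam` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: every curve is observed on a shared grid; the first-layer space
# is L2[0,1] with a Riemann-sum inner product, and the second layer is an
# RKHS generated by a Gaussian kernel of first-layer distances.
t = np.linspace(0.0, 1.0, 50)   # common observation grid
n = 60                          # number of training curves

def l2_inner(f, g):
    """First-layer L2 inner product, approximated by a Riemann sum."""
    return float(np.sum(f * g)) * (t[1] - t[0])

# Simulate smooth covariate curves (random phase and amplitude) and a
# response that depends nonlinearly on the covariate's L2 norm.
X = np.array([rng.uniform(0.5, 1.5) * np.sin(2 * np.pi * (t + rng.uniform()))
              for _ in range(n)])
Y = np.array([np.exp(-l2_inner(x, x)) * np.cos(2 * np.pi * t) for x in X])
Y += 0.01 * rng.standard_normal(Y.shape)

def second_layer_kernel(A, B, gamma=1.0):
    """Gaussian kernel on first-layer distances: exp(-gamma ||x - x'||_{L2}^2)."""
    K = np.empty((len(A), len(B)))
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            d = a - b
            K[i, j] = np.exp(-gamma * l2_inner(d, d))
    return K

# Kernel ridge regression with a function-valued output: every grid point of
# the response is predicted from the same second-layer kernel weights.
lam = 1e-3
K = second_layer_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), Y)   # (n, len(t)) coefficients

def predict(X_new):
    return second_layer_kernel(X_new, X) @ alpha

train_mse = np.mean((predict(X) - Y) ** 2)
```

The same weights `alpha` serve every grid point of the response, which is what makes the output genuinely function-valued rather than a collection of unrelated scalar regressions.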

Hanmin Guo Minimal σ-field for flexible sufficient dimension reduction

Abstract: Sufficient Dimension Reduction (SDR) has become an important tool for mitigating the curse of dimensionality in high-dimensional regression analysis. Recently, flexible SDR (FSDR) has been proposed to extend SDR by finding lower-dimensional projections of transformed explanatory variables. In this talk, we introduce the σ-field associated with these projections, referred to as the FSDR σ-field. The FSDR σ-field, together with its dimension, fully characterizes FSDR and represents the extent of data reduction that FSDR can achieve. Further, we introduce the concept of the minimal FSDR σ-field; the FSDR projections attaining the minimal σ-field can be considered optimal. We show that the minimal FSDR σ-field exists under mild conditions while attaining the lowest dimensionality at the same time. A two-stage procedure called Generalized Kernel Dimension Reduction (GKDR) is proposed to estimate the minimal FSDR σ-field, and its consistency is partially established under weak conditions. Simulation experiments and a real application to an air pollution data set demonstrate the utility and effectiveness of the proposed GKDR method.
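The gain from projecting transformed predictors can be seen in a toy example (an illustrative construction, not the GKDR procedure itself): when Y depends on X only through x1² + x2², no coordinate direction of X is correlated with Y, yet a single projection of the squared predictors is sufficient.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy model where linear SDR fails but FSDR succeeds: Y depends
# on X only through the single index x1^2 + x2^2, a one-dimensional
# projection of the transformed predictors (x1^2, ..., xp^2).
n, p = 2000, 5
X = rng.standard_normal((n, p))
Y = X[:, 0] ** 2 + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)

def abs_corr(a, b):
    return abs(np.corrcoef(a, b)[0, 1])

# Each coordinate direction of X is nearly uncorrelated with Y, because the
# model is symmetric in X: linear SDR sees no one-dimensional structure.
best_linear = max(abs_corr(Y, X @ w) for w in np.eye(p))

# After the coordinate-wise transformation t(x) = x^2, the projection
# w = (1, 1, 0, ..., 0) of the transformed predictors captures Y.
T = X ** 2
w = np.zeros(p)
w[:2] = 1.0
fsdr_corr = abs_corr(Y, T @ w)
```

Here `best_linear` stays near zero while `fsdr_corr` is close to one, which is the kind of data reduction the FSDR σ-field is meant to characterize.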

Wenxuan Zhong Sufficient dimension reduction for classification using principal optimal transport direction

Abstract: Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may perform unsatisfactorily with a categorical response, especially a binary one. To address this issue, we propose a novel estimation method for the sufficient dimension reduction subspace (SDR subspace) using optimal transport. The proposed method, named principal optimal transport direction (POTD), estimates the basis of the SDR subspace using the principal directions of the optimal transport coupling between the data from different response categories. The proposed method also reveals the relationship among three seemingly unrelated topics: sufficient dimension reduction, support vector machines, and optimal transport. We study the asymptotic properties of POTD and show that when the class labels contain no error, POTD estimates the SDR subspace exclusively. Empirical studies show that POTD outperforms most state-of-the-art linear dimension reduction methods.
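A minimal sketch of the POTD idea for a binary response, under the simplifying assumptions of equal class sizes and uniform weights, where the optimal transport coupling for squared-Euclidean cost reduces to an assignment problem; the simulated data and all parameter choices are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(2)

# Two classes in R^5 separated along the first coordinate, so the SDR
# subspace for classification is span(e1).
n, p = 200, 5
X0 = rng.standard_normal((n, p))          # class 0
X1 = rng.standard_normal((n, p))          # class 1
X1[:, 0] += 3.0                           # mean shift along e1

# With equal sizes and uniform weights, the optimal transport coupling
# between the two empirical distributions is a one-to-one matching that
# minimizes the total squared-Euclidean cost.
cost = ((X0[:, None, :] - X1[None, :, :]) ** 2).sum(axis=2)
rows, cols = linear_sum_assignment(cost)

# Displacement vectors of the matched pairs; their principal directions
# (leading right singular vectors) estimate a basis of the SDR subspace.
D = X1[cols] - X0[rows]
_, _, Vt = np.linalg.svd(D)
leading_direction = Vt[0]                 # should align with e1 up to sign
```

In this sketch the leading singular vector of the displacement matrix recovers the separating direction; with a higher-dimensional SDR subspace one would keep several leading directions.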

Pang Du Sparse graphical modeling of longitudinal data

Abstract: The conditional independence graph is a common tool for describing the relationships among a set of variables. Existing methods such as the dynamic Bayesian network (DBN) treat longitudinal measurements as time series, which often requires a much higher sampling frequency and restricts the correlation structure to be serial. We propose a penalized likelihood method as well as a node-wise regression method, which extends Meinshausen and Bühlmann (2006) to longitudinal data. We use pairwise coordinate descent combined with second-order cone programming (SOCP) to optimize the penalized likelihood and estimate the parameters, so that the edges in the conditional independence graph can be identified.
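The node-wise regression idea can be sketched in the simpler i.i.d. Gaussian setting (an illustrative reduction; the talk's method handles longitudinal correlation, which this sketch does not): regress each variable on all others with a lasso penalty, solved by coordinate descent with soft-thresholding, and connect two nodes when both regressions select each other (the "AND" rule). The chain graph, sample size, and penalty level below are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(3)

def soft(z, lam):
    """Soft-thresholding operator used in lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(A, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent: min 0.5||y - Ab||^2 + n*lam*||b||_1."""
    n, p = A.shape
    beta = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0)
    r = y - A @ beta
    for _ in range(n_iter):
        for j in range(p):
            r += A[:, j] * beta[j]                     # partial residual
            beta[j] = soft(A[:, j] @ r, n * lam) / col_sq[j]
            r -= A[:, j] * beta[j]
    return beta

# Simulate a chain graph 0-1-2-3 through its precision matrix.
p = 4
Omega = np.eye(p) + 0.45 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega), size=500)

# Node-wise regressions: each variable on all the others.
lam = 0.15
B = np.zeros((p, p))
for j in range(p):
    others = [k for k in range(p) if k != j]
    B[j, others] = lasso_cd(X[:, others], X[:, j], lam)

# "AND" rule: keep edge j-k only if both regressions select the partner.
edges = {(j, k) for j in range(p) for k in range(j + 1, p)
         if abs(B[j, k]) > 1e-8 and abs(B[k, j]) > 1e-8}
```

With the chain precision matrix above, the recovered edge set is the true chain {(0,1), (1,2), (2,3)}; the "AND" rule trades a little power for fewer false edges compared with the "OR" rule.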

 

Purdue Department of Statistics, 150 N. University St, West Lafayette, IN 47907

Phone: (765) 494-6030, Fax: (765) 494-0558

© 2023 Purdue University | An equal access/equal opportunity university | Copyright Complaints
