Session 02 - Department of Statistics - Purdue University

Purdue Statistics Alumni

Organizer: Fei Xue, Assistant Professor of Statistics, Purdue University

Speakers

  • Dipak Dey, Board of Trustees Distinguished Professor of Statistics and affiliated faculty member in the Department of Mathematics, University of Connecticut
  • Whitney Huang, Assistant Professor of Statistics, School of Mathematics and Statistical Sciences, Clemson University
  • Jessie Jeng, Associate Professor, Department of Statistics, North Carolina State University
  • Yao Zheng, Assistant Professor, Department of Statistics, University of Connecticut

Speaker: Dipak Dey
Title: Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty

Abstract: With the rapid development of modern technology, massive amounts of data with complex patterns are generated. Gaussian process models, which can easily fit nonlinearity in data, have become increasingly popular. It is often the case that only a few features in such data are important or active. However, unlike in classical linear models, it is challenging to identify active variables in Gaussian process models. One of the most widely used methods for variable selection in Gaussian process models is automatic relevance determination, which is known to be open-ended: there is no rule of thumb for determining the threshold for dropping features, which makes variable selection in Gaussian process models ambiguous. In this work, we propose two variable selection algorithms for Gaussian process models, which use artificial nuisance columns as a baseline for identifying the active features. Moreover, the proposed methods work for both regression and classification problems. The algorithms are demonstrated using comprehensive simulation experiments and an application to multi-subject electroencephalography (EEG) data that studies alcoholism in experimental subjects.
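The nuisance-baseline idea can be sketched with a generic automatic-relevance-determination example (this is an illustration, not the proposed algorithm): append an artificial noise column to the design matrix, fit an anisotropic Gaussian process, and flag as active only the features whose fitted length scales are clearly smaller (i.e., more relevant) than that of the nuisance column. The data, the scikit-learn setup, and the 0.5 cut-off below are all illustrative assumptions.

```python
# Sketch: ARD-style variable selection in a Gaussian process, using an
# artificial nuisance column as a baseline (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n)  # only x0, x1 active

# Append an artificial nuisance column: pure noise, known to be inactive.
X_aug = np.hstack([X, rng.normal(size=(n, 1))])

# Anisotropic RBF kernel: one length scale per feature (ARD).
kernel = RBF(length_scale=np.ones(p + 1)) + WhiteKernel()
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_aug, y)

length_scales = gpr.kernel_.k1.length_scale
nuisance_scale = length_scales[-1]
# Flag a feature as active if its fitted length scale is clearly smaller
# (more relevant) than the nuisance baseline; 0.5 is an arbitrary cut-off.
active = np.where(length_scales[:p] < 0.5 * nuisance_scale)[0]
print("estimated active features:", active)
```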

Speaker: Whitney Huang
Title: A formal invitation to Purdue Statistics Alumni Association: My personal reflection on graduate school

Abstract: In this talk I will share my journey at Purdue Statistics and how my experience at Purdue has shaped my career path. Specifically, Dr. Hao Zhang brought me from Industrial Engineering to the Statistics program and introduced me to the area of environmental statistics that I am currently working in. I also benefited substantially from my involvement in the statistical consulting service led by Dr. Bruce Craig, especially for my interdisciplinary research program. My connection with Purdue Statistics became even stronger after my graduation, first through my SAMSI postdoctoral fellowship and then through my engagement with the Purdue Statistics Alumni Association (PSAA), initiated by Dr. Dennis Lin. I will conclude with a formal invitation to PSAA for all members of the Purdue Statistics family.

Speaker: Jessie Jeng
Title: Transfer Learning with False Negative Control Improves Polygenic Risk Prediction

Abstract: Polygenic risk score (PRS) estimates an individual's genetic predisposition for a trait by aggregating variant effects across the genome. However, mismatches in the ancestral background between base and target data sets are common and can compromise the accuracy of PRS analysis. In response, we propose a transfer learning framework comprising two steps: (1) false negative control (FNC) marginal screening to extract useful knowledge from base data, and (2) joint model training to integrate knowledge with target data for accurate prediction. Our FNC screening method efficiently retains a high proportion of signal variants in base data under arbitrary covariance dependence between variants. This new transfer learning framework with FNC screening provides a novel solution for PRS analysis with mismatched ancestral backgrounds, improving prediction accuracy and facilitating efficient joint-model training.
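A minimal sketch of the two-step template follows, on simulated data and with a plain p-value cut-off standing in for the FNC screening rule (the actual FNC method controls false negatives under arbitrary covariance dependence between variants and is not reproduced here).

```python
# Sketch of the two-step transfer-learning template:
# (1) marginal screening on base data, (2) joint model training on target data.
# The liberal p-value cut-off is a placeholder, not the FNC rule from the talk.
import numpy as np
from scipy import stats
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n_base, n_target, p = 2000, 500, 1000
beta = np.zeros(p)
beta[:20] = 0.3                                   # 20 true signal variants

X_base = rng.normal(size=(n_base, p))
y_base = X_base @ beta + rng.normal(size=n_base)
X_target = rng.normal(size=(n_target, p))
y_target = X_target @ beta + rng.normal(size=n_target)

# Step 1: marginal screening on the base data (simple correlation z-tests).
Xc = (X_base - X_base.mean(0)) / X_base.std(0)
yc = (y_base - y_base.mean()) / y_base.std()
r = Xc.T @ yc / n_base                            # marginal correlations
z = np.sqrt(n_base) * r                           # approx. N(0, 1) under the null
pvals = 2 * stats.norm.sf(np.abs(z))
keep = np.where(pvals < 0.05)[0]                  # liberal cut-off to retain most signals

# Step 2: joint model training on the target data, restricted to retained variants.
model = LassoCV(cv=5).fit(X_target[:, keep], y_target)
print(f"retained {keep.size} variants; nonzero in joint model: {(model.coef_ != 0).sum()}")
```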

Speaker: Yao Zheng
Title: Interpretable and Efficient Infinite-Order Vector Autoregressive Model for High-Dimensional Time Series

Abstract: As a special infinite-order vector autoregressive (VAR) model, the vector autoregressive moving average (VARMA) model can capture much richer temporal patterns than the widely used finite-order VAR model. However, its practicality has long been hindered by its non-identifiability, computational intractability, and difficulty of interpretation. We introduce a novel parsimonious infinite-order VAR model which inherits the temporal patterns of the VARMA model but avoids all of the drawbacks above. As another attractive feature that facilitates interpretation, the temporal and cross-sectional dependence structures of this model are parameterized separately. For the high-dimensional setup, this motivates us to impose sparsity on the parameters corresponding to the cross-sectional dependence, without incurring any loss of temporal information. This model allows us not only to infer Granger causality relationships among the component series but also to distinguish between persistent and nonpersistent lagged effects. For the proposed high-dimensional model, we introduce an ℓ1-regularized estimator and derive the corresponding nonasymptotic error bounds. An efficient block coordinate descent algorithm and a consistent model order selection method are developed. The merit of the proposed approach is supported by simulation studies and a real-world macroeconomic data analysis.
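For illustration, an ℓ1-regularized VAR can be estimated by equation-wise Lasso as in the sketch below; this generic sparse finite-order VAR example stands in for, but is not, the proposed infinite-order model or its block coordinate descent algorithm, and all dimensions and penalty values are made up.

```python
# Generic sketch of l1-regularized VAR estimation via equation-wise Lasso.
# Illustrative only; NOT the proposed infinite-order VAR model.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
T, d, lag = 400, 10, 2                       # series length, dimension, fitted VAR order

# Simulate a stable sparse VAR(1) for demonstration.
A = np.zeros((d, d))
A[np.arange(d), np.arange(d)] = 0.5          # diagonal (own-lag) effects
A[0, 1] = 0.3                                # one cross-sectional effect
y = np.zeros((T, d))
for t in range(1, T):
    y[t] = y[t - 1] @ A.T + 0.5 * rng.normal(size=d)

# Build the lagged design matrix: row for time t stacks y_{t-1}, ..., y_{t-lag}.
X = np.hstack([y[lag - k - 1:T - k - 1] for k in range(lag)])
Y = y[lag:]

# Fit each equation separately with a Lasso penalty (sparse transition matrices).
coefs = np.vstack([Lasso(alpha=0.05).fit(X, Y[:, j]).coef_ for j in range(d)])
print("stacked estimated transition matrices:", coefs.shape)   # (d, d * lag)
```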

Purdue Department of Statistics, 150 N. University St, West Lafayette, IN 47907

Phone: (765) 494-6030, Fax: (765) 494-0558

