Session 06 - Department of Statistics - Purdue University

IU Session

Organizer: Wanzhu Tu, Professor of Biostatistics and Health Data Science, School of Medicine, Indiana University

Speakers

  • Yi Zhao, Assistant Professor of Biostatistics & Health Data Science and Public Health, School of Medicine, Indiana University
  • Zilin Li, Professor of Biostatistics and Health Data Science, School of Medicine, Indiana University
  • Sha Cao, Assistant Professor of Biostatistics and Health Data Science, School of Medicine, Indiana University
  • Sean McCabe, Assistant Professor of Biostatistics and Health Data Science, School of Medicine, Indiana University

Speaker Title
Yi Zhao Beyond massive univariate tests: Covariance regression reveals complex patterns of brain functional connectivity

Abstract: Studies of brain functional connectivity typically involve massive univariate tests, performing statistical analysis on each individual connection. In this study, we consider the problem of regressing covariance matrices on associated covariates. The goal is to use covariates to explain variation in covariance matrices across units. As such, we introduce Covariate Assisted Principal (CAP) regression, an optimization-based method for identifying components associated with the covariates using a generalized linear model approach. For high-dimensional data, a well-conditioned linear shrinkage estimator of the covariance matrix is introduced. With multiple covariance matrices, the shrinkage coefficients are proposed to be common across matrices. Theoretical studies demonstrate that the proposed covariance matrix estimator is optimal, asymptotically achieving the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix. Under regularity conditions, the proposed estimator of the model parameters is consistent. We develop computationally efficient algorithms to jointly search for common linear projections of the covariance matrices, as well as the regression coefficients. The superior performance of the proposed approach over existing methods is illustrated through simulation studies. Applied to resting-state functional magnetic resonance imaging (fMRI) studies, the proposed approach regresses whole-brain functional connectivity on covariates and enables the identification of relevant brain subnetworks.
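
As a rough illustration of the linear shrinkage idea in this abstract, the Python sketch below forms a convex combination of a scaled identity matrix and the sample covariance matrix. The function name `linear_shrinkage`, the hand-picked coefficient `rho`, and the simulated data are assumptions made for illustration. The estimator described in the talk chooses its shrinkage coefficients from the data and shares them across matrices, which this sketch does not attempt.

```python
import numpy as np


def linear_shrinkage(X, rho):
    """Shrink the sample covariance of X (n rows, p columns) toward mu * I.

    rho is a hand-picked weight in [0, 1] (an assumption of this sketch,
    not the data-driven coefficient described in the abstract).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    S = np.cov(X, rowvar=False)          # p x p sample covariance
    p = S.shape[0]
    mu = np.trace(S) / p                 # average variance sets the identity's scale
    return rho * mu * np.eye(p) + (1.0 - rho) * S


# Simulated high-dimensional case: more variables (p = 20) than samples (n = 10).
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 20))
S = np.cov(X, rowvar=False)
S_shrunk = linear_shrinkage(X, rho=0.3)

# Compare the rank of the sample covariance with the smallest eigenvalue
# of the shrunk matrix.
print(np.linalg.matrix_rank(S), np.linalg.eigvalsh(S_shrunk).min())
```

With fewer samples than variables the sample covariance is singular, while any `rho` above zero gives a positive definite matrix with the same trace, which is the sense in which the shrunk estimator is well-conditioned.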

Zilin Li STAARpipeline: An all-in-one rare-variant tool for biobank-scale whole-genome sequencing data

Abstract: Large-scale whole-genome sequencing (WGS) studies have enabled the analysis of rare variant associations with complex human diseases and traits. Variant set analysis is a powerful approach to studying rare variant associations. However, existing methods have limited ability to define the variant set in the genome, especially for the noncoding genome. We propose a computationally efficient and robust rare variant association-detection framework, STAARpipeline, to automatically annotate a WGS study and perform flexible rare variant association analysis, including gene-centric analysis and fixed-window and dynamic-window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline groups coding and noncoding variants based on functional categories of genes and incorporates multiple functional annotations. In non-gene-centric analysis, in addition to fixed-size sliding window analysis, STAARpipeline provides a data-adaptive-size dynamic window analysis. All these variant sets can be automatically defined and selected in STAARpipeline. STAARpipeline also provides an analytical follow-up that dissects association signals independent of known variants via conditional analysis. We applied STAARpipeline to analyze total cholesterol in 30,138 samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. All analyses scale well in computation time and memory. We discovered several potentially new significant associations with lipids, including a finding of rare variants in an intergenic region near JKAMPP1 associated with total cholesterol. In summary, STAARpipeline is a powerful and resource-efficient tool for association analysis of biobank-scale WGS studies.
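
To make the fixed-size sliding window idea concrete, the Python sketch below groups variant positions into overlapping windows. It is not STAARpipeline code or its API: the function name `sliding_window_sets`, the window size of 2000, the step of 1000, and the example positions are all hypothetical, and the sketch covers only the definition of variant sets, not annotation or association testing.

```python
from collections import defaultdict


def sliding_window_sets(positions, window_size, step):
    """Group variant positions into fixed-size, possibly overlapping windows.

    Positions are assumed to be non-negative integers. Returns a dict that
    maps (start, end), with end exclusive, to the sorted positions inside
    that window. Windows containing no variant are omitted.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    sets = defaultdict(list)
    for pos in sorted(positions):
        # Window starts are multiples of `step`; find every start whose
        # window [start, start + window_size) contains pos.
        first = max(0, (pos - window_size) // step + 1)
        last = pos // step
        for k in range(first, last + 1):
            start = k * step
            sets[(start, start + window_size)].append(pos)
    return dict(sets)


# Hypothetical variant positions on one chromosome.
variant_sets = sliding_window_sets([100, 1500, 2500, 5200],
                                   window_size=2000, step=1000)
for window, members in sorted(variant_sets.items()):
    print(window, members)
```

Because the step is smaller than the window size, a variant can fall in more than one set, so each window would be tested as its own variant set in a non-gene-centric scan.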

Sha Cao Statistical model for recovering the low rank structure of spatial transcriptomics data 

Abstract: Currently, methods specifically designed for spatial transcriptomics (ST) data modeling are lacking. Firstly, for most existing methods, cellular and regional expression profiles are typically analyzed first without the spatial information and only later projected back onto the spatial structure for visual inspection of spatial trends. Secondly, similar to single-cell RNA-Seq data, ST-based gene expression data is also plagued by dropout events, a phenomenon where genes expressed in a given cell or region are incorrectly measured as unexpressed. Thirdly, the gene-by-sample expression matrix is no longer retainable for many spatial methods. To address these challenges, we present a regularized maximum likelihood estimator that recovers, from the noisy observed expression matrix, an approximately low-rank and spatially smooth expression matrix under a Poisson distribution. Our method enables spatial clustering by modeling a low-dimensional representation of the count-based gene expression matrix and encouraging neighboring spots to belong to the same cluster via a spatial smoothness penalty term.
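
One way to read the combination of a Poisson likelihood, a low-rank structure, and a spatial smoothness penalty is the objective sketched below in Python. The abstract does not give the form of the estimator, so everything here is assumed: the log link, the factorization into `U` and `V`, the squared-difference penalty over neighboring spots, and the name `penalized_poisson_nll`. It should be read as one possible formulation, not the speaker's method.

```python
import numpy as np


def penalized_poisson_nll(Y, U, V, edges, lam):
    """Negative Poisson log-likelihood (up to a constant) plus a spatial penalty.

    Y     : (spots x genes) matrix of counts
    U, V  : low-rank factors; the log-rate matrix is U @ V.T (assumed log link)
    edges : non-empty list of (i, j) pairs of neighboring spots
    lam   : weight of the smoothness penalty
    """
    log_rate = U @ V.T
    # Poisson negative log-likelihood, dropping the log(Y!) term that does
    # not depend on U or V.
    nll = np.sum(np.exp(log_rate) - Y * log_rate)
    i, j = np.asarray(edges).T
    # Penalize differences between the low-dimensional representations of
    # neighboring spots.
    smooth = np.sum((U[i] - U[j]) ** 2)
    return nll + lam * smooth


# Toy data: 3 spots along a line, 2 genes, rank-1 factors.
Y = np.array([[1, 0], [2, 1], [0, 3]])
V = np.zeros((2, 1))
edges = [(0, 1), (1, 2)]
U_flat = np.zeros((3, 1))                  # identical across spots
U_rough = np.array([[0.0], [1.0], [0.0]])  # differs between neighbors

obj_flat = penalized_poisson_nll(Y, U_flat, V, edges, lam=0.5)
obj_rough = penalized_poisson_nll(Y, U_rough, V, edges, lam=0.5)
print(obj_flat, obj_rough)
```

In this toy case both factor choices give the same likelihood term, so the larger objective for `U_rough` comes entirely from the penalty on differences between neighboring spots.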

Sean McCabe Statistical and ethical considerations in COVID-19 vaccination acceptance research

Abstract: Smartphone-based apps can generate a large amount of data with a vast array of clinical applications. During the Coronavirus Disease 2019 (COVID-19) pandemic, the How We Feel app was developed as a way for users to report their daily symptoms, test results, and protective behaviors. I will discuss the opportunities and challenges, both statistical and ethical, that were presented in developing, analyzing, and distributing the results of a vaccine acceptance questionnaire within the application. We identified sociodemographic and behavioral factors associated with COVID-19 vaccine acceptance and uptake. We found that several vulnerable groups at increased risk of COVID-19 burden, morbidity, and mortality were more likely to be vaccine non-accepting and had lower rates of vaccination. Our findings highlight specific populations in which targeted efforts to develop education and outreach programs are needed to overcome vaccine non-acceptance and improve equitable access, diversity, and inclusion in the national response to COVID-19.

