Session 05 - Department of Statistics - Purdue University

Texas A&M Session

Co-organizers: Brani Vidakovic, Professor and Head, H.O. Hartley Chair, Department of Statistics, Texas A&M

Speakers

  • Nilanjana Laha, Assistant Professor, Department of Statistics, Texas A&M
  • Jesus Arroyo, Assistant Professor, Department of Statistics, Texas A&M
  • Quan Zhou, Assistant Professor, Department of Statistics, Texas A&M
  • Brani Vidakovic, Professor and Head, H.O. Hartley Chair, Department of Statistics, Texas A&M

Speaker Title
Nilanjana Laha

Optimal dynamic treatment regimes via smooth surrogate losses

Abstract: Large health care data repositories such as electronic health records (EHR) open new opportunities to derive individualized treatment strategies for complicated diseases such as sepsis. In this talk, I will discuss the problem of estimating sequential treatment rules tailored to a patient's individual characteristics, often referred to as dynamic treatment regimes (DTRs). Our main objective will be to find the optimal DTR that maximizes a discontinuous value function through direct maximization of Fisher-consistent surrogate loss functions. In this regard, we demonstrate that a large class of concave surrogates fails to be Fisher consistent, a behavior that differs from classical binary classification problems. We further characterize a non-concave family of Fisher-consistent smooth surrogate functions, which can be optimized via gradient descent using off-the-shelf machine learning algorithms. Compared to the existing direct search approach under the support vector machine framework (Zhao et al., 2015), our proposed DTR estimation via surrogate loss optimization (DTRESLO) method is more computationally scalable to large sample sizes and allows for broader functional classes for treatment policies. We establish theoretical properties for our proposed DTR estimator and obtain a sharp upper bound on the regret corresponding to our DTRESLO method. The finite-sample performance of our proposed estimator is evaluated through extensive simulations. Finally, we illustrate the working principles and benefits of our method for estimating an optimal DTR for treating sepsis using EHR data from sepsis patients admitted to intensive care units.

Jesus Arroyo

Joint spectral clustering in multilayer networks

Abstract: Modern network datasets are often composed of multiple layers, either as different views, time-varying observations, or independent sample units. These data require models and methods that are flexible enough to capture local and global differences across the networks, while at the same time being parsimonious and tractable to yield computationally efficient and theoretically sound solutions that are capable of aggregating information across the networks. This talk considers the multilayer degree-corrected stochastic blockmodel, where a collection of networks share the same community structure, but degree-corrections and block connection probability matrices are permitted to be different. We establish the identifiability of this model and propose a spectral clustering algorithm for community detection in this setting. Our theoretical results demonstrate that the misclustering error rate of the algorithm improves exponentially with multiple network realizations, even in the presence of significant layer heterogeneity. Simulation studies show that this approach improves on existing multilayer community detection methods in this challenging regime. Furthermore, in a case study of US airport data from January 2016 to September 2021, we find that this methodology identifies meaningful community structure and trends in airport popularity influenced by pandemic impacts on travel.

Quan Zhou

Complexity analysis of informed MCMC methods for high-dimensional model selection problems

Abstract: Informed Markov chain Monte Carlo (MCMC) methods have been proposed as scalable solutions to Bayesian posterior computation on high-dimensional discrete state spaces, but theoretical results about their convergence behavior in general settings are lacking. In this talk, we introduce a novel and generally applicable framework for studying the complexity of local Metropolis-Hastings (MH) algorithms on discrete spaces. For random walk MH algorithms, our bounds are better than the existing ones in the literature; for informed MH algorithms, our method yields the optimal "dimension-free" mixing rate, which serves as the theoretical justification for the use of informed MCMC methods in practice. One example we will discuss is high-dimensional structure learning, a fundamental problem in causal inference and machine learning. On the algorithmic side, we propose two novel informed MCMC algorithms, one based on Metropolis-Hastings sampling and the other based on importance weighting. The talk is based on joint work with J. Yang, D. Vats, G. Roberts, J. Rosenthal, H. Chang and A. Smith.
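
To make the difference between random walk and informed proposals concrete, here is a minimal Python sketch of one informed Metropolis-Hastings step on binary vectors. It is not taken from the talk and should not be read as either of the speaker's algorithms. The proposal weighting (square root of the posterior ratio, one common "locally balanced" choice), the toy target in `log_target`, and all function names are assumptions made for illustration. A random walk MH step would instead pick the coordinate to flip uniformly at random.

```python
import math
import random


def log_target(x, weights):
    # Hypothetical unnormalized log-posterior on binary vectors:
    # independent coordinates, coordinate j contributes weights[j] when x[j] == 1.
    return sum(w for xi, w in zip(x, weights) if xi)


def flip(x, j):
    # Neighbor of x obtained by flipping coordinate j.
    y = list(x)
    y[j] = 1 - y[j]
    return tuple(y)


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def proposal_log_weights(x, weights):
    # Informed proposal: neighbor y gets log-weight 0.5 * (log pi(y) - log pi(x)).
    lx = log_target(x, weights)
    return [0.5 * (log_target(flip(x, j), weights) - lx) for j in range(len(x))]


def informed_mh_step(x, weights, rng):
    # Draw a neighbor with probability proportional to sqrt(pi(y) / pi(x)).
    lw_x = proposal_log_weights(x, weights)
    norm_x = logsumexp(lw_x)
    probs = [math.exp(v - norm_x) for v in lw_x]
    j = rng.choices(range(len(x)), weights=probs)[0]
    y = flip(x, j)

    # The reverse move flips the same coordinate j back from y to x.
    lw_y = proposal_log_weights(y, weights)
    log_q_xy = lw_x[j] - norm_x
    log_q_yx = lw_y[j] - logsumexp(lw_y)

    # Standard Metropolis-Hastings correction keeps the target invariant.
    log_accept = (log_target(y, weights) - log_target(x, weights)
                  + log_q_yx - log_q_xy)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return y
    return x


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [1.0, -1.0, 0.5]  # illustrative values only
    x = (0, 0, 0)
    for _ in range(1000):
        x = informed_mh_step(x, weights, rng)
    print(x)
```

Each step in this sketch evaluates the target at every neighbor of the current state, so an informed step costs more than a random walk step. Whether that extra cost pays off in faster mixing is the kind of question the complexity analysis in the abstract addresses.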

Brani Vidakovic

Benefits of Noise in Biomedical Research: A Multiscale Point of View

Abstract: Many measured biomedical signals are inherently noisy and, as a rule, the noise carries relevant diagnostic information. In some extreme cases, for example in the analysis of high-frequency pupil diameter measurements, the trends in the measured time series depend on the ambient light and are irrelevant for assessing potential visual impairment: all information is contained in the noise. In this overview talk we discuss statistical inference based on descriptors/summaries distilled from the noisy biomedical data. The signals and images in biometric analysis often exhibit self-similarity and are well modeled by (multi)fractional Brownian motions/fields. A range of fractal and multifractal indices derived from the spectral characteristics of the measurements have proved useful in disease diagnostic tasks, and often represent a testing modality independent of the traditionally used modalities. In the talk we discuss the use and performance of various multiscale spectral measures, their robust counterparts, wavelet-based multifractal indices, and dual wavelet spectra. The methods are illustrated by real-life measurements in the area of diagnostics of breast and ovarian cancers.

