Jayanta K. Ghosh

Written by: Shannon Knapp, Ph.D. candidate in Statistics


Survival analysis is an important tool in the study of diseases and their treatments. It can be used to measure the effectiveness of a new drug or surgical procedure, and it can also identify other variables that contribute to either the recovery of the patient or the progression of the disease. In addition to the treatment group variable (e.g., drug versus placebo or experimental drug versus standard drug), a dozen or more variables may be used in these studies, such as the age, gender, and weight of the patient, the stage of the disease, and health-related metrics (e.g., white blood cell count).

A popular model of survival in studies of diseases and their treatments is the Cox proportional hazards model. However, these analyses, so vital to human health and medical care, may be flawed. In 2002, Leo Breiman, Professor of Statistics and Director of the Statistical Computing Facility at the University of California at Berkeley, gave a lecture at the Institute of Mathematical Statistics meeting in which he discussed the problems of using standard classical methods of variable selection, such as all-subsets testing or forward and backward stepping, in Cox regression models. Breiman believed that epidemiologists were selecting the wrong variables.

Professor Jayanta K. Ghosh and doctoral student Shraddha Mehta are working to rectify these problems. Ghosh attributes the problems in the Cox model to two sources. First, testing dozens of variables leads to multiple testing errors. Ghosh and Mehta propose to solve this issue with a dimension reduction technique known as Sliced Inverse Regression (SIR): they first perform a principal components analysis and then use only the first two principal components in place of the dozen or so original variables. Second, the Cox model assumes that the ratio of the hazard functions of any two individuals is constant over time (determined only by the difference in the values of their covariates), so the hazard rates are proportional. Ghosh and Mehta will generalize the Cox model by relaxing this assumption of proportional hazard rates.

This work with Mehta builds on Ghosh's work with one of his former students, Surya Tokdar (Ph.D. 2006). Tokdar's dissertation, "Exploring Dirichlet Mixture and Logistic Gaussian Process Priors in Density Estimation, Regression, and Sufficient Dimension Reduction," won the Leonard J. Savage Award for best dissertation in Theory and Methods at the 2007 Joint Statistical Meetings. In a single combined computation, Mehta will calculate the first two principal components and use Gaussian process priors to estimate the hazard function. The implementation of this new analysis will involve a sophisticated Markov chain Monte Carlo (MCMC) algorithm. This will be a joint project of Ghosh, Mehta, Tokdar, and Mehta's co-advisor, Professor Bruce A. Craig.
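For readers who want to see the two ideas above in concrete terms, the following is a minimal sketch in Python, not the method Ghosh and Mehta are developing. It uses ordinary principal components on simulated covariates (not SIR itself), a Weibull-type baseline hazard, and made-up coefficients; the function names `first_two_components` and `cox_hazard`, the data, and the values of `beta` are all assumptions chosen for illustration. It shows that, under the standard Cox model, the hazard ratio between two individuals is the same at every time point.

```python
import numpy as np

def first_two_components(X):
    """Project the rows of X onto its first two principal components."""
    Xc = X - X.mean(axis=0)  # center each covariate
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T     # n x 2 matrix of component scores

def cox_hazard(t, z, beta, shape=1.5):
    """Cox hazard h(t | z) = h0(t) * exp(beta . z).

    The Weibull-type baseline h0(t) = shape * t**(shape - 1) is an
    assumption made only so the example has something to evaluate.
    """
    baseline = shape * t ** (shape - 1)
    return baseline * np.exp(z @ beta)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))   # 200 hypothetical patients, 12 covariates
Z = first_two_components(X)      # two derived covariates per patient

beta = np.array([0.8, -0.5])     # illustrative coefficients, not estimates
times = np.array([0.5, 1.0, 2.0, 5.0])

# Hazard ratio between patient 0 and patient 1 at each time point.
ratio = cox_hazard(times, Z[0], beta) / cox_hazard(times, Z[1], beta)

# Under proportional hazards the baseline cancels, leaving a constant.
expected = np.exp((Z[0] - Z[1]) @ beta)
print(ratio, expected)
```

Every entry of `ratio` equals `expected` because the baseline hazard cancels in the division. A model that relaxes the proportional hazards assumption would allow this ratio to change with time.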

Ghosh earned his doctorate from Calcutta University in 1964. In 1987 he became the Director of the Indian Statistical Institute (ISI). Since coming to Purdue in 1989, Ghosh has made an annual migration, spending fall semesters at Purdue and spring semesters in India at ISI. In 2002 Ghosh retired from ISI but has continued his annual migration for family reasons. In his nearly five decades of research, Ghosh has produced over 150 research articles and authored or co-authored four books, most recently An Introduction to Bayesian Analysis: Theory and Methods, published by Springer (2006). He has supervised 20 doctoral students (16 at ISI and 4 at Purdue) and co-advised 5 others.

Ghosh's research has spanned a wide spectrum of topics. His early work focused on sequential analysis and higher-order asymptotics. Over time, Ghosh noted the changing paradigms in statistics, in particular the increasing acceptance and use of Bayesian statistics, in which he has examined everything from prior selection to model selection to asymptotic properties. Ghosh's work has also moved into high-dimensional data analysis. "Today's data is very different," Ghosh says. "In the old days, large data meant the sample size was large. It is almost the opposite now, the dimension of the data is large." In addition to a range of theoretical work, Ghosh has done applied research in reliability theory, statistical quality control, modeling hydrocarbon discoveries, geological mapping, and DNA fingerprinting.

September 2007