Title: "Group variable selection methods for data with interdependent structures"
Speaker: Dr. Jun Xie, Department of Statistics, Purdue University
Place: Mechanical Engineering (ME) 161; September 16, 2008, Tuesday, 4:30pm


Variable selection and regularization are long-standing statistical problems that have recently attracted much attention in the analysis of large-scale genomic data. In gene expression data, it is well known that genes sharing a common biological pathway or a similar function can be very highly correlated. However, most existing variable selection approaches cannot deal with complicated interdependence in the data. We propose several new methods that select groups of highly correlated variables together in regression models. The new methods group and select at the same time, and include approaches based on both a penalty function and a continuous solution path. Simulations show that our proposed methods often outperform other variable selection methods, including LARS (lasso) and elastic net, in terms of prediction error and preserving sparsity of representation. We have applied the proposed methods in analyses of gene expression data. The biological application is a good example of how new statistical methods are motivated by computational biology.
