Dabao Zhang

Written by: Andrea Rau, Ph.D. candidate in Statistics

Metabolites are small molecules that are the intermediates and products of metabolism, and are the end products of gene expression in a biological organism. Metabolomics refers to the quantitative study of the metabolite profiles left behind by specific cellular processes, in order to characterize the phenotypic response of those processes to genetic or nutritional perturbations. A critical step in such an analysis is separating the analytes in the metabolite mixtures extracted from biological tissues. This is done primarily through chromatography, a method in which components are physically separated by their distribution between two phases, the stationary phase and the mobile phase [1].

Two-dimensional (2-D) gas chromatography coupled with time-of-flight mass spectrometry (GC×GC/TOF-MS) is a particularly powerful tool for analyzing complex samples and quantifying their underlying compounds, because it performs 2-D separations using two different stationary phases. Although GC×GC/TOF-MS has been successfully used to analyze petrochemical products, environmental pollutants, and biological metabolites, the very complicated data sets it produces are often a challenge to analyze. To this end, Professor Dabao Zhang, in collaboration with Professors Min Zhang (Department of Statistics) and Fred Regnier (Department of Chemistry), has been working to improve current algorithms so that they can handle the full chromatographic data.

Multivariate tools such as principal components analysis, hierarchical cluster analysis, partial least-squares discriminant analysis, and parallel factor analysis have all been proposed in the past to analyze GC×GC/TOF-MS data, but their successful application depends on high-quality preprocessed data, and in particular on highly repeatable retention times in the two columns of the GC×GC. Because it is preferable to use the entire data set for chemometric analyses, Professor Zhang and his collaborators have worked to establish an alignment algorithm that corrects 2-D retention time shifts without resorting to data-reduction techniques.

In particular, Professor Zhang and his collaborators were able to extend the correlation optimized warping (COW) algorithm to the 2-D case for warping GC×GC data. The COW algorithm "interpolatively stretches and compresses local regions to maximize the correlation between the warped and reference chromatographic profiles," and is consequently more powerful and flexible in correcting retention time shifts [2]. To develop a 2-D version of this algorithm, the researchers first partitioned the raw chromatographic profiles and then applied time warping of the grid points simultaneously along the first and second columns using the established 1-D COW algorithm (see figure below for an example). These shifted grid points could then be used to interpolatively warp nongrid points.
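The 1-D building block described above can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names (`cow_align`, `_corr`), the parameters (`n_segments`, `slack`), and the synthetic Gaussian peaks are all hypothetical, and the published 2-D algorithm applies this kind of warping to grid points along both columns rather than to a single trace. The sketch splits the profile into segments, lets each interior segment boundary move by a few points, and uses dynamic programming to pick the boundaries that maximize the summed correlation between the interpolated sample segments and the reference segments.

```python
import numpy as np


def _corr(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def cow_align(sample, reference, n_segments=4, slack=3):
    """Warp `sample` onto the axis of `reference` (simplified 1-D COW sketch).

    Returns the warped signal and the chosen segment boundaries in `sample`.
    """
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)
    idx = np.arange(len(sample))
    # Segment boundaries: fixed on the reference, nominal on the sample.
    ref_nodes = np.linspace(0, len(reference) - 1, n_segments + 1).round().astype(int)
    nominal = np.linspace(0, len(sample) - 1, n_segments + 1).round().astype(int)
    offsets = np.arange(-slack, slack + 1)

    # best[i, k]: best summed correlation of segments 1..i when boundary i
    # sits at nominal[i] + offsets[k]; back[i, k] remembers the previous choice.
    best = np.full((n_segments + 1, len(offsets)), -np.inf)
    back = np.zeros((n_segments + 1, len(offsets)), dtype=int)
    best[0, slack] = 0.0  # the first boundary is pinned (offset 0)

    for i in range(1, n_segments + 1):
        ref_seg = reference[ref_nodes[i - 1]:ref_nodes[i] + 1]
        for k, off in enumerate(offsets):
            if i == n_segments and off != 0:
                continue  # the last boundary is pinned as well
            end = nominal[i] + off
            for j, prev_off in enumerate(offsets):
                start = nominal[i - 1] + prev_off
                if not np.isfinite(best[i - 1, j]) or end - start < 1:
                    continue
                # Stretch or compress the sample segment by linear interpolation.
                x = np.linspace(start, end, len(ref_seg))
                score = best[i - 1, j] + _corr(np.interp(x, idx, sample), ref_seg)
                if score > best[i, k]:
                    best[i, k], back[i, k] = score, j

    # Backtrack from the pinned last boundary to recover all boundaries.
    k = slack
    nodes = np.empty(n_segments + 1)
    for i in range(n_segments, -1, -1):
        nodes[i] = nominal[i] + offsets[k]
        k = back[i, k]
    # Map every reference index to a (fractional) sample position and resample.
    positions = np.interp(np.arange(len(reference)), ref_nodes, nodes)
    return np.interp(positions, idx, sample), nodes


# Synthetic example (made-up data): the same peak, eluting 3 points late.
t = np.arange(100)
reference = np.exp(-0.5 * ((t - 50) / 4.0) ** 2)
sample = np.exp(-0.5 * ((t - 53) / 4.0) ** 2)
warped, nodes = cow_align(sample, reference)
before = _corr(sample, reference)  # correlation with the reference before warping
after = _corr(warped, reference)   # correlation with the reference after warping
```

Pinning the first and last boundaries keeps the warped profile on the same overall time span as the reference, while the `slack` parameter limits how far any local region may be stretched or compressed. Comparing `before` and `after` gives a quick check of whether the warp brought the sample closer to the reference.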

Professor Zhang joined the Department of Statistics at Purdue in 2005, and is currently an Assistant Professor. While he continues his research on comparative metabolomics in collaboration with his colleagues, he is "developing a new regularized variable selection for high-dimensional data, and applying it to whole-genome association study, whole-genome animal selection, eQTL mapping, and gene-environment interaction." His many research interests include multivariate statistics, shrinkage estimation, Bayesian variable selection, multivariate extreme values, and computational statistics. For more information on Professor Zhang, please visit his homepage.

[1] Chromatography. (2008, June 5). In Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Chromatography&oldid=217247314.

[2] Dabao Zhang, Xiaodong Huang, Fred E. Regnier, and Min Zhang (2008). "Two-Dimensional Correlation Optimized Warping Algorithm for Aligning GC×GC-MS Data." Analytical Chemistry, 80 (8), pp 2664–2671.

Figure 5

Two selected ion chromatograms (SIC) were generated from two serum samples, and consecutively aligned twice using 2-D COW. The alignment coefficients were then applied to the corresponding total ion chromatograms (TIC).

August 2008