Projects
Computational Analysis of Transcriptional Regulation
People
Department of Statistics: Dr. Jing Wu and Dr. Jun Xie
Informatics School at IU Bloomington: Dr. Sun Kim and Dr. Haixu Tang
Department of Veterinary Pathobiology: Dr. Sulma I. Mohammed
Funding
Collaboration in Life Sciences and Informatics Research Pilot Grant, Indiana University Bloomington and Purdue, 2006-2007.
Description
The regulation of gene expression plays a central role in nearly all biological processes in living cells. With many sequenced genomes and advanced genome-wide analytical technologies, it is now possible to begin constructing the transcriptional regulatory networks that control gene expression in biological processes, including cancer. This project focuses on the development of statistical and computational methods for predicting transcriptional regulatory activities, which we define as a network that includes information on the transcription factors (TFs), the transcription factor binding sites in non-coding genomic sequences, the genes bound by the transcription factors, and whether and when the TFs occupy their binding sites. This is an interdisciplinary project involving statisticians (Drs. Wu and Xie) from the Department of Statistics at Purdue, computer scientists (Drs. Kim and Tang) from the Informatics School at IU Bloomington, and an oncologist (Dr. Mohammed) from the Veterinary School of Medicine at Purdue. The project is currently supported by the Collaboration in Life Sciences & Informatics Research (CLSIR) pilot grant from IU Bloomington and Purdue. A proposal to the National Institutes of Health (NIH) is pending.
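The network defined above can be pictured as a small data structure. The following Python sketch is only an illustration of that definition: the class names, field names, and condition labels are hypothetical and are not taken from the project's software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BindingSite:
    """One TF binding site in a non-coding region (hypothetical layout)."""
    tf: str      # transcription factor name
    gene: str    # gene whose non-coding region contains the site
    start: int   # 0-based start offset within that region
    end: int     # exclusive end offset
    # Labels of the conditions (e.g. tissue, time point) in which the TF
    # occupies this site; empty means no occupancy has been observed.
    occupied_in: frozenset = frozenset()


@dataclass
class RegulatoryNetwork:
    """Collection of binding sites linking TFs to the genes they bind."""
    sites: list = field(default_factory=list)

    def add_site(self, site):
        self.sites.append(site)

    def targets_of(self, tf, condition=None):
        """Genes bound by `tf`; with `condition`, only sites occupied then."""
        return sorted({
            s.gene for s in self.sites
            if s.tf == tf and (condition is None or condition in s.occupied_in)
        })

    def regulators_of(self, gene):
        """TFs with at least one binding site assigned to `gene`."""
        return sorted({s.tf for s in self.sites if s.gene == gene})


# Usage with made-up names: one TF bound near two genes, occupied in one condition.
net = RegulatoryNetwork()
net.add_site(BindingSite("TF_A", "gene1", 10, 18, frozenset({"tumor"})))
net.add_site(BindingSite("TF_A", "gene2", 40, 48))
print(net.targets_of("TF_A", condition="tumor"))
```

Keeping occupancy as a set of condition labels on each site is one simple way to record "if and when" a TF is bound; a real implementation would likely attach scores or measurements instead.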
References
Seung-Hee Bae, Haixu Tang, Jing Wu, Jun Xie, Sun Kim, "dPattern: Transcription Factor Binding Site (TFBS) Discovery in Human Genome Using a Discriminate Pattern Analysis", Bioinformatics, 2007, doi:10.1093/bioinformatics/btm288.
Statistical and Computational Methods for Transcriptomic Profile of Drosophila Melanogaster
People
Department of Statistics: Dr. Jing Wu and Dr. Jun Xie
Department of Entomology: Dr. Larry Murdock and Dr. Barry Pittendrigh
Description
Gene expression microarray data are a rich source of information about co-expressed genes. However, the regulatory mechanisms underlying these genes are mostly unknown. For many species, for instance the fruit fly Drosophila, only a small number of transcription factor binding sites are annotated (mainly for embryo developmental TFs). In addition to the sites held in collections of transcription factor binding sites (TFBSs), e.g. TRANSFAC, novel TFBSs can be identified computationally in a group of co-expressed genes. This project is a collaboration between faculty in Statistics (Drs. Wu and Xie) and faculty in Entomology (Drs. Murdock and Pittendrigh) at Purdue. We are developing computational and statistical methods to identify transcriptomic profiles (a set of transcription factors regulating a set of genes) of Drosophila melanogaster genes, especially those associated with the midgut.
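As a rough illustration of how candidate TFBSs might be found in a group of co-expressed genes, the Python sketch below ranks short words (k-mers) by how much more often they occur in the promoters of the co-expressed genes than in a background set. This is a generic enrichment heuristic chosen for brevity, not the method developed in this project; the function names, the pseudocount, and the toy sequences are assumptions.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(coexpressed, background, k=6, pseudocount=1.0):
    """Rank k-mers by the ratio of smoothed presence fractions.

    Returns (k-mer, ratio) pairs, highest ratio first; ties are broken
    alphabetically so that the ordering is deterministic.
    """
    fg = kmer_presence(coexpressed, k)
    bg = kmer_presence(background, k)
    n_fg, n_bg = len(coexpressed), len(background)
    scores = {}
    for kmer, count in fg.items():
        fg_frac = (count + pseudocount) / (n_fg + 2 * pseudocount)
        bg_frac = (bg.get(kmer, 0) + pseudocount) / (n_bg + 2 * pseudocount)
        scores[kmer] = fg_frac / bg_frac
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy promoters (made up): the word TGACGT is shared by all co-expressed genes.
coexpressed = ["AATGACGTTT", "CCTGACGTAA", "GGTGACGTCC"]
background = ["AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG", "ACACACACAC"]
top_kmer, ratio = enriched_kmers(coexpressed, background)[0]
print(top_kmer, round(ratio, 2))
```

A ratio of presence fractions is only a ranking device; a proper analysis would use a statistical test and a position weight matrix rather than exact words.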
References
Hong-Mei Li, Lijie Sun, Omprakash Mittapalli, William M. Muir, Jun Xie, Jing Wu, Brandi Schemerhorn, Weilin Sun, Larry Murdock, and Barry R. Pittendrigh, "Transcriptional Signatures in Response to Wheat Germ Agglutinin and Starvation in Drosophila melanogaster Larval Midgut", Insect Molecular Biology, 2008, doi:10.1111/j.1365-2583.2008.00844.x.
Hong-Mei Li, Grzegorz Buczkowski, Omprakash Mittapalli, Jun Xie, Jing Wu, Rick Westerman, Brandi Schemerhorn, Larry L. Murdock, and Barry R. Pittendrigh, "Transcriptomic profile of Drosophila melanogaster third-instar larval midgut and responses to oxidative stress", Insect Molecular Biology, 2008, Vol. 17(4), 325-339.
Statistical methods in integrative analysis for gene regulatory modules
PI: Dr. Jun Xie
Funding: NSF DMS-604776
Other people: Dr. Jing Wu and Lingmin Zeng, Department of Statistics, Purdue University.
Description
A suite of statistical methods is proposed for inferring cis-regulatory modules, each of which is a combination of several transcription factors binding in a promoter region to regulate gene expression. The approach is an integrative analysis that combines information from multiple types of biological data, including genomic DNA sequences, genome-wide location analysis (ChIP-chip experiments), and gene expression microarrays. More specifically, a hidden Markov model is first developed to predict clusters of transcription factor binding sites in DNA sequences. The predictions are then refined by regression analysis on gene expression microarray data and/or ChIP-chip binding experiments. Within the regression analysis, factor analysis is applied because its statistical model characterizes the modular structure of cis-regulation. When groups of co-expressed genes are available, canonical correlation analysis is further applied to infer relationships between a group of genes and their common set of transcription factors. The multiple data sources provide information on transcriptional regulation from different aspects. The integrative analysis therefore offers a refined prediction of the transcriptional regulatory code and infers potential regulatory networks.
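The first step, locating clusters of binding sites with a hidden Markov model, can be sketched with a toy two-state model. The Python code below is a minimal illustration and not the model from the cited papers: the two states, the binary observations (1 where some motif matches at a position, 0 otherwise), and all probabilities are assumptions chosen for the example. It decodes the most probable state path with the standard Viterbi algorithm in log space.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most probable hidden-state path for `obs` (log-space Viterbi)."""
    # best[t][s]: log-probability of the best path that ends in state s at t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of s on that best path
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def log_table(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Illustrative parameters only (assumed, not estimated from data).
STATES = ("background", "module")
LOG_START = {"background": math.log(0.9), "module": math.log(0.1)}
LOG_TRANS = log_table({
    "background": {"background": 0.9, "module": 0.1},
    "module": {"background": 0.2, "module": 0.8},
})
LOG_EMIT = log_table({
    "background": {0: 0.95, 1: 0.05},  # motif matches are rare in background
    "module": {0: 0.4, 1: 0.6},        # and frequent inside a module
})

# 1 marks a position where some TF motif matches, 0 otherwise (toy input).
hits = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
print(viterbi(hits, STATES, LOG_START, LOG_TRANS, LOG_EMIT))
```

Positions decoded as "module" form the predicted cluster. In the workflow described above, such predictions would then be refined against expression or ChIP-chip data, which this sketch does not attempt.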
References
Lingmin Zeng, Jing Wu, and Jun Xie, "Statistical methods in integrative analysis for gene regulatory modules", Statistical Applications in Genetics and Molecular Biology, 2008, Vol. 7, Iss. 1, Article 28, http://www.bepress.com/sagmb/vol7/iss1/art28.
Jing Wu and Jun Xie, "Computation-Based Discovery of Cis-Regulatory Modules by Hidden Markov Models", Journal of Computational Biology, 2008, Vol. 15, No. 3, 279-290.