Title: "Important Lessons from Complex Genomes"
Speaker: Thomas R. Gingeras, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

Place: LILY G126
Date: October 16, 2012; Tuesday
Time: 4:30pm


Abstract

The approximately three billion base pairs of human DNA represent a storage device encoding information for hundreds of thousands of processes that can go on within and outside of a cell. This information is partially revealed in the RNA, which is potentially composed of 12 billion nucleotides when the strandedness and the allelic content of each of the diploid copies of the genome are considered. Results stemming from the efforts to catalogue and analyze the RNA products made by cells in the human (ENCODE), fly-worm (modENCODE) and mouse ENCODE projects have shed light on both the functional content and how this information is organized. Currently, ~183,000 transcripts annotated within ~55,000 genic regions represent the best manually curated annotation previously available (based on Gencode v13). The results from the ENCODE project point to considerable supplementation of these data, including information concerning the location of regulatory regions, the association of human variation with various types of functional domains, and the role of epigenetic modifications in the functionality of various types of elements. Analyses of both the regulatory and transcriptome data sets have resulted in important and under-appreciated lessons, such as: a) expression ranges follow transcript types and subcellular localization, b) expression of the isoforms of a gene by a cell does not follow a minimalistic strategy, c) genomic characteristics of potential trans-acting enhancer regions are distinguishable from those of other types of cis-acting regulatory regions, d) expression status and expression levels can be predicted with high accuracy by different groups of chromatin features, and e) pervasive genome-wide transcription prompts a need to revise the definition of a gene.
These and other lessons drawn from the ENCODE data sets, while assisting in understanding and organizing what are often seen as dauntingly complex genomes, also point to the need for new and robust computational approaches to carry out analyses across data types.

Associated Reading:
1. Djebali et al. 2012. Landscape of transcription in human cells. Nature. doi:10.1038/nature11233.

2. Thurman et al. 2012. The accessible chromatin landscape of the human genome. Nature. doi:10.1038/nature11232.



For a full schedule of Bioinformatics Seminars, past and present, see www.stat.purdue.edu/~doerge/BIOINFORM.D/SPRING12/sem.html.