Title: "Using RNA-seq to understand variation in the human transcriptome"
Speaker: John Marioni, Department of Human Genetics, University of Chicago
Place: HORT 117; November 3, 2009, Tuesday, 4:30pm

Abstract

Understanding the genetic mechanisms that underlie natural variation in gene expression is a central goal of both medical and evolutionary genetics. Recently, advances in next-generation sequencing technology have allowed transcript variation to be studied at unprecedented resolution. To take advantage of this new resource, we sequenced RNA from 69 lymphoblastoid cell lines (LCLs) derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project.

In this talk I will begin by providing the biological motivation for our study, before briefly outlining the sequencing technology and experimental design that we used. Subsequently, I will focus on a major technical hurdle that arises when mapping short sequencing reads to a reference genome: namely, the impact of SNP variation on the reliability of read mapping. At heterozygous SNPs, our results show that there is a significant bias towards higher mapping rates of the reference allele and, perhaps surprisingly, masking known SNP locations in the reference sequence does not lead to more reliable results overall. Overcoming this problem by filtering out inherently biased SNPs removes 40% of the top signals of allele-specific expression (ASE). Further, we find that the remaining SNPs showing ASE are enriched in genes known to harbour cis-regulatory variation or known to show uniparental imprinting. To conclude, I will describe the results of our analysis of the entire dataset. By pooling all individuals, we identify extensive use of unannotated polyadenylation sites and over 100 novel protein-coding exons. Further, using genotype information, we find many genetic variants that influence overall levels of expression and splicing. Overall, our results show the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing, and allele-specific expression across individuals.

Associated Reading:
Degner J.F., Marioni J.C., Pai A.A., Pickrell J.K., Nkadori E., Gilad Y., Pritchard J.K. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 2009 (advance access online).


