Genetic variation in health and disease

Many common diseases are influenced by multiple genetic and environmental factors. The majority of variants in the genome are single-nucleotide polymorphisms (SNPs), single-base differences in DNA sequence among individuals. The recent production of the human haplotype map, with its tag SNPs, allows comprehensive, genome-wide studies of association with disease. However, genome-wide association studies pose challenges for statisticians and computational biologists. First, the number of SNPs under consideration is very large, much larger than the sample size. Second, there are complex interactions among genes (i.e., among the corresponding SNPs) and between genetic and environmental factors. In addition to SNP data, DNA sequences from the corresponding genomic regions provide more detailed information about specific types of genetic variation. Taking advantage of our expertise in protein and DNA sequence analysis, we want to study genetic variation, including SNPs, sequence variants such as insertions and deletions, and segmental duplications, and to connect these variants to disease phenotypes.