Title: "An efficient algorithm for simulating coalescence with recombination"
Speaker: Dr. Katy Simonsen
Place: SMITH 108; Tuesday, 4:30pm


In population genetics, the coalescent process is an important model by which the variability of DNA sequence data can be understood. Over the last 20 years, coalescent models incorporating genetic recombination have played an important role in understanding the effect of linkage on genetic variability in natural populations, both theoretically and via simulation. For example, coalescence with recombination can be used to simulate the SNP marker data used to detect association with diseases and traits in humans and other non-experimental populations. However, simulation with such models (including that of Simonsen and Churchill, 1997) has suffered from a common problem: the computational complexity (computer time and memory needed) increases exponentially with the number of genetic loci involved, and with the population size and recombination rate. Such simulations have therefore been limited to small numbers of loci encompassing small regions of the genome. This motivates the development of a much more efficient computer algorithm for such simulations, whose complexity is only polynomial in the parameters. I will describe the special structure of the model that makes such efficiency possible, and give some timing results to show that the desired efficiency has been achieved. This new algorithm will enable the simulation of genetic data on a genome-wide scale.
See http://www.stat.purdue.edu/~doerge/BIOINFORM.D/SPRING04/sem.html for a full schedule of BIOINFORMATICS SEMINARS.