Title: Relational Schema Design for DNA Sequencing Laboratory Databases
Speaker: Dr. Mike Wendl, Genome Sequencing Center, School of Medicine, Washington University
Place: LAEB 2280; Tuesday, 4:30pm

Abstract

Technological advances have greatly increased DNA sequencing throughput in recent years, placing commensurately greater demands on database information systems. For large centers, a particularly important aspect of this problem is the effective management of day-to-day operations, such as sample tracking and data quality analysis. Here, we outline some of the relevant database design issues and summarize a relational schema implemented at the Genome Sequencing Center, which contributed approximately 20% of the draft human DNA sequence. Methods of interfacing with the database are discussed, along with sample applications. This system has kept pace with our continued growth in capacity, which now stands at 2 million sequencing reads and 30 megabases of finished sequence per month.