Title: "GeneX: Analysis and Database Server of Gene Expression Data"

Abstract

The post-genomic era is beginning to produce data at a rate that will soon rival that of high-energy physics. Biologists and informaticists face the question of how best to organize and share this information so that it benefits the largest number of researchers.

The GeneX project hopes to contribute to this effort by combining a variety of public gene expression data sets into an Internet-accessible database, accompanied by a diverse set of analysis tools. This will allow researchers to analyze their own data with reference to other public data sets. GeneX includes data derived from different technologies; these will soon be directly comparable where appropriate conditions exist, which should encourage the development of better noise-reduction, equalization, and analytical algorithms.

The open-source system consists of a data model, a query tool for retrieving complete or partial data sets, a curation tool that assists researchers in formatting their own data sets for secure upload to the database, and analytical routines including various clustering procedures, Principal Component Analysis, statistical tests, and static two- and three-dimensional plotting.

I will first present a short background on the production of gene expression data, followed by an introduction to the GeneX project. For the remainder of the seminar, I will present and discuss in more detail two of the analytical tools included in the GeneX system.