iPlant: Scalable and Extensible Cyberinfrastructure for the Life Sciences
Speaker(s)
- Jason Williams (Cold Spring Harbor Laboratory)
- Min Zhang (Purdue University)
Description
iPlant provides cyberinfrastructure for solving data-intensive problems in the life sciences. This workshop highlights some of the computational problems biologists face and introduces some of the resources available through iPlant, including:
- iPlant Discovery Environment: iPlant's extensible web-based system for data analysis and management. The "DE" provides a simple graphical interface that gives users access to copious data storage and scalable high-performance computing. You will see how to perform an analysis and how users can integrate their own tools into the platform.
- iPlant Atmosphere: On-demand cloud computing that allows you to customize a virtual appliance in a Linux-based environment. Using the "Basic R" image, for example, would deliver a virtual computer in 10-15 minutes, pre-configured with standard R packages and with up to 16 CPUs and 32 GB of RAM.
- iPlant Data Store: Cloud data storage that gives researchers terabytes of storage that can be connected to any online computational resource. It also provides the means for high-performance transfer of large files.
Note: iPlant is funded by the National Science Foundation, and its resources are free for you to use, subject to user policies. Please sign up for an account at user.iplantcollaborative.org in advance of the workshop if you wish to follow along.
Schedule
Thur, June 21 - Location: STEW 310
Time | Speaker | Title |
---|---|---|
1:30-2:00PM | Jason Williams | Introduction to iPlant |
2:00-2:30PM | Jason Williams | iPlant Discovery Environment |
2:30-2:50PM | Min Zhang | Efficient Methods for Genome Selection |
Abstract: Recent advances in high-throughput genotyping have motivated genomic selection using high-density markers. However, the increasingly large number of available markers raises both statistical and computational issues and makes it difficult to estimate breeding values. We propose applying the penalized orthogonal-components regression (POCRE) method to estimate breeding values. By grouping highly correlated predictors, POCRE allows for collinear or nearly collinear markers and can efficiently select important markers when constructing each component. In simulation studies, POCRE greatly reduces computing time compared to popular methods. The utility of POCRE is demonstrated in real data analyses. | ||
2:50-3:10PM | Break | |
3:10-3:30PM | Jason Williams | Integrating Tools Into the Discovery Environment |
3:30-4:15PM | Jason Williams | Atmosphere Cloud Computing |
4:15-5:00PM | Jason Williams | iPlant Data Store |