STAT 598A, Fall 2008

Analysis of Massive Dependent Data

Tu Thur 3-4:15, University Hall 203


Instructor: Professor Hao Zhang

Office: MATH 536, Ph: 496-9548; Email:


Office Hours: Monday and Wednesday 1:30-2:30 and by appointment.

Course Description: Owing to technological innovation and an improved ability to acquire and archive data, huge amounts of data are collected in many disciplines, including environmental, agricultural, and public health studies. For example, the EPA has thousands of monitoring stations across the US that measure air quality, ozone levels, and other variables; ecologists attach sensors to animals to track them, and these sensors reveal the animals' whereabouts at any moment; hospitals record admissions of patients with various diseases to monitor disease outbreaks. These data have a spatial attribute (where they are observed) and a time dimension (when they are observed). Such correlated data are more prevalent now than ever before, and they present interesting and challenging statistical and computing problems.

This course covers methods for handling the additional complexity that correlation, or dependence, in space and time brings to modeling and analyzing massive data. Topics include sparse matrices, approximate likelihood-based inference, covariance tapering, spectral methods, separable space-time covariance functions, and process convolution.

The R language and environment for computing will be used extensively in the class; some packages (ff, bigmemory, fields, spam, SparseM, geoR) will be introduced.

There is no textbook for this course. Lecture notes and designated readings will be distributed.

Grading is based on the following: