Purdue University - Department of Statistics - Schedule and Textbooks Information

# Schedule and Textbooks Information

## Fall 2020 Schedule and Textbook Information for STAT 545

DISCLAIMER:

We believe the information about textbooks to be accurate, but the Purdue University Bookstores are the official source of information on textbooks. Please check with them for verification before purchasing texts for a specific academic semester or session.

### STAT 545 - Textbook(s) for Fall 2020

Textbook information is not available at this time.

### STAT 545 - Schedule information for Fall 2020

CRN Section Instructor Day Time Room

#### STAT 545 - Course Outline

General Description: Successful statistical data analysis relies increasingly on computing. As datasets grow in size and dimensionality, the computational element of data analysis takes on a central role. In these cases, statistical inference requires carefully crafted solutions that are computationally efficient and numerically accurate. Moreover, the resulting algorithms need to be implemented efficiently on a computer system. The course starts with an introduction to programming in C and in R, a specialized statistical language for data analysis. This introduction will progress at a fast pace and will assume that students have had some previous exposure to programming. Following that, we will cover several fundamental data structures and algorithms that are directly related to statistical analysis. In the final part of the course, we will study several well-known computational techniques in statistics, such as the EM algorithm, the bootstrap, and Monte Carlo methods. In the second and third parts of the course, students will further develop their computing skills through programming exercises in R and C and a medium-scale programming project.

Contents:
1. Programming for statistics [5 weeks]
   1. Basics of the R and C languages
   2. Interfacing with the operating system; calling C from within R programs
   3. R tools for data visualization
2. Data structures and algorithms for statistics [5 weeks]
   1. Introduction to algorithms in the context of data analysis
   2. Elementary data structures for statistics: linked lists, trees, hash tables
   3. Elementary algorithms for accessing and manipulating statistical data: searching, sorting, dynamic programming, graph algorithms
   4. Convex optimization and its applications in frequentist and Bayesian statistics: gradient descent, Newton's method, conjugate gradient, quasi-Newton methods
   5. Relational databases and their use in large-scale data analysis
3. Selected stochastic methods [4 weeks]
   1. Bootstrap methods
   2. The EM algorithm
   3. Monte Carlo and its applications: integration and sampling from distributions

Prerequisites: STAT 516, STAT 517, and some familiarity with computing. In particular, students should have some programming experience in a language such as C, C++, Pascal, FORTRAN, or Java; they should be able to write, debug, and compile a simple program in one of these languages.

Expected Outcomes: Students should be able to
1. write C and R code, and understand and modify existing code.
2. code and use elementary data structures and algorithms in statistical applications.
3. understand and implement the computational techniques of bootstrap, EM, Monte Carlo integration and sampling.

Grading: The course will include a written final exam, a number of programming assignments, and one medium-sized programming project. The grade is composed of 40% final exam, 30% assignments, and 30% programming project.

Purdue Department of Statistics, 250 N. University St, West Lafayette, IN 47907

Phone: (765) 494-6030, Fax: (765) 494-0558