Statistics 516
Basic Probability and Applications
Fall 2007
Instructor: Professor Thomas Sellke
Office: Math 532
Phone: 765-494-6034
Telephone Office Hours for Off-Campus Students: MWF 12:00-1 pm
Office Hours for On-Campus Students: MWF 1-3 pm and by appointment
Graduate student Nik Tuzov will be available for telephone office hours at 765-494-0025 (office) or, if he is out of the office, 718-877-6352 (cell) at the following times (subject to change depending on holidays, etc.):
Monday 4 pm – 5 pm
Wed 6 – 7:30 pm
Fri 4 pm – 5:30 pm
For the week of Sept. 17th, I will be available on Friday only. Sorry about the inconvenience.
You can email questions to stat516questions@stat.purdue.edu
For off-campus students, homework can be mailed (postmarked on or before the date of the relevant session) to:
Statistics 516 Homework
Department of Statistics
or emailed to Nik Tuzov at ntuzov@purdue.edu (this address is just for homework; questions have to be emailed to stat516questions@stat.purdue.edu)
Text: Probability, by Jim Pitman, Springer Verlag
Lecture Notes: My notes from Fall 2005 are posted on this page.
Prerequisites: You need to know (or learn) that a derivative is a slope, that a one-dimensional definite integral is the area under a curve, and that a two-dimensional definite integral is the volume under a surface. You should be able to compute straightforward derivatives and integrals. It will be helpful to know that an integral is a limit of (Riemann) sums.
But seriously, folks, this course (and the subject of probability in general) is about fractions of a whole (called probabilities) and averages (called expectations, or expected values, or means, all of which mean the same thing). Hence, the material we’ll cover is in a sense very elementary. (The exception is the central limit theorem, which is quite a bit deeper than simple fractions and averages.) As mentioned above, you’ll need to know how to do basic calculus, but the most important prerequisites are the ability to understand simple English and the ability to think.
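As a small illustration of the remark that an integral is a limit of Riemann sums, here is a minimal Python sketch. The function names (riemann_sum, square) and the choice of integrand x^2 on [0, 1] are assumptions made for illustration only; they do not come from the course notes or from Pitman. The exact area under x^2 from 0 to 1 is 1/3, and the sums approach it as the number of subintervals grows.

```python
def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f over [a, b] using n equal subintervals."""
    width = (b - a) / n
    # Add up the areas of n rectangles, each of height f(left endpoint).
    return sum(f(a + i * width) for i in range(n)) * width


def square(x):
    """Illustrative integrand (an assumption, not from the course): f(x) = x^2."""
    return x * x


# Approximate the area under x^2 on [0, 1] with finer and finer partitions.
approximations = {n: riemann_sum(square, 0.0, 1.0, n) for n in (10, 100, 1000, 10000)}

for n, value in approximations.items():
    # The error relative to the exact value 1/3 shrinks as n grows.
    print(n, value, abs(value - 1 / 3))
```

Left endpoints are used here only for simplicity; right endpoints or midpoints give the same limit.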
Exams: There will be two examinations during the semester (see schedule at the bottom) and a comprehensive final exam.
Homework: There is a lot of homework, but you won’t learn probability without doing lots of problems. You are encouraged to collaborate, though each person should write up his or her own solutions. The homework is not worth a huge number of points (see below), so it’s no big deal if you don’t succeed in solving a few of the problems. What is important is that you think about them (hard, if necessary) and then learn the solutions of those you miss. (We will send back solutions when we return your homework.) Note that there are short answers to most odd problems in the back of the book.
Homework of on-campus students is due at the end of class on the days indicated. Homework of off-campus students should be postmarked on the corresponding session day for each student’s off-campus location.
Grades: Here’s the point breakdown.
Homework 50
Exam 1 100
Exam 2 100
Total 400
Old Course Material: Lecture notes and a practice exam from the Fall 2001 rendition of Stat 516 can be found at: http://www.stat.purdue.edu/~tsellke/stat516/2001/
Solutions for practice Exam #1, 2001 are here
Some Comments about Stat 516:
1. Probability is about fractions and averages.
2. “Fractions and averages” may sound trivial, boring, and unimportant, but probability is none of these things. In fact, probability is a key to understanding the world. Here is a quote from an article entitled “The Unexpected Uselessness of Philosophy” by a very smart guy named Steve Sailer:
To this day, philosophers suffer from Plato’s disease: the assumption that reality fundamentally consists of abstract essences best described by words or geometry. (In truth, reality is largely a probabilistic affair best described by statistics.)
3. Probability and statistics are fundamental in the practice of science. How do we learn things scientifically? Typically, we try to measure something, but our measurements almost always have errors. Often we can use probability to study how these errors should affect our conclusions.
A more specific example of the scientific use of probability is the randomized clinical trial in medicine. Does taking multivitamins make you healthier and live longer? Well, suppose you have two uncles, one of whom takes a vitamin pill every day and the other of whom never takes vitamins. If one uncle is healthier than the other, can you conclude anything about the effect of vitamins? No, since the two uncles differ in a zillion ways, aside from whether they take vitamins, and some of these other differences are likely to be involved in their health differences. What if we get data on 100,000 people and find that those who take vitamins tend to be healthier than those who don’t? Again, the people who take vitamins may differ in other ways from people who don’t take vitamins. (Maybe the vitamin-takers tend to exercise more and eat a healthier diet than those who don’t take vitamins, and maybe these other differences in behavior are what are really responsible for differences in health.) So, to learn about the effect of the vitamins, we’d like to compare two groups of people, one group taking vitamins and the other group not, but with the two groups as much alike as possible aside from the taking of vitamins. Then, if we see a difference in health, perhaps we can conclude that the vitamins are responsible. But how can we form our similar experimental groups? One way is to flip a coin for each person: heads means the person gets a daily vitamin pill and tails means no vitamin pill. (Better yet, tails means the person gets a fake vitamin pill, called a placebo, with neither the subjects nor those evaluating health knowing who is in which group. Why is this better?) Now it may happen just by chance that most healthy people end up in one experimental group, and that a subsequent difference in health between the two groups is mostly due to the way the coin flips came out.
However, we can use the laws of probability to show that the two experimental groups will almost certainly be very similar, provided that we’re dealing with a large enough number of experimental subjects.
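The claim that coin-flip assignment produces similar groups when there are many subjects can be checked with a rough simulation. The sketch below is only an illustration under stated assumptions: each subject gets a hypothetical "baseline health" score drawn from a standard normal distribution, the function name mean_gap_after_coin_flips is made up for this example, and the seed is arbitrary. It measures how far apart the two groups' average baseline health is before any vitamins are taken.

```python
import random


def mean_gap_after_coin_flips(num_subjects, rng):
    """Assign each subject to a group by a fair coin flip and return the
    absolute difference in average baseline health between the two groups."""
    vitamin_group, placebo_group = [], []
    for _ in range(num_subjects):
        # Hypothetical baseline health score (an assumption for illustration).
        baseline_health = rng.gauss(0.0, 1.0)
        if rng.random() < 0.5:  # heads: vitamin pill
            vitamin_group.append(baseline_health)
        else:  # tails: placebo
            placebo_group.append(baseline_health)
    if not vitamin_group or not placebo_group:
        # Extremely unlikely for the sizes used below; no comparison possible.
        return float("inf")
    vitamin_mean = sum(vitamin_group) / len(vitamin_group)
    placebo_mean = sum(placebo_group) / len(placebo_group)
    return abs(vitamin_mean - placebo_mean)


rng = random.Random(516)  # fixed seed so the run is reproducible
gaps = {n: mean_gap_after_coin_flips(n, rng) for n in (100, 10_000, 100_000)}

for n, gap in gaps.items():
    print(n, gap)
```

In this model the typical gap is on the order of 2 divided by the square root of the number of subjects, so it tends to shrink as the experiment gets larger, although any single run can deviate from that trend.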
4. The most important idea in biology is evolution by means of natural selection.
Natural selection is a “random” process, described by probability.
5. Insurance is based on probability.
6. Mathematical finance uses probabilistic models of stock prices to figure out the values of stock options and other “derivatives.”
7. Engineers use probability for quality control and for modeling telecommunications networks.
8. Probability theory originated in the study of games of chance during the 1500’s and 1600’s. People have been wagering on dice for thousands of years, and, looking back, it seems strange that it took so long for intelligent analysis to replace ignorant superstition in how games of chance are viewed.
Many people (like me) are fascinated by games of chance, particularly when they have money at stake. Some people (like my parents) find them uninteresting. Some people (like my deceased grandmother, who used to say that card play is the devil’s play) think them sinful. However, for purposes of teaching probability, games of chance have the advantage of offering clean, easy-to-state problems, without the messy, extraneous features present in many more important applications of probability.
9. The course title of Statistics 516 is “Basic Probability and Applications.” I did not come up with this title, and if it had been up to me, I’d have left off the “and Applications” part. My goal (and Pitman’s, I think) is to get you to understand the basic ideas of mathematical probability. The examples and problems will be chosen to do this as efficiently and effectively as possible. Hence, we’ll talk a lot about coins, dice, and cards.
The “applications” to opinion polling, finance, disease clusters, prediction of college grades from SAT scores, etc., will generally be based on made-up examples which illustrate the probabilistic gist of the phenomenon but largely ignore the non-probabilistic complications which would crop up in the real world.
10. You probably think you have some understanding of fractions already. Here’s a (more-or-less) true story.
The
Evans says, “Go sign up for the course at table A.” Students who got their questions right were directed to table B. (See Constance Reid’s biography of Neyman for the original version of this story.)
11. If you’re a table B person, you know (or can figure out, with a little prodding) some basic facts about probability already.
Suppose that the following facts hold for a freshman class:
Men: 60%
Women: 40%
Fraction of men taking a math course: 90%
Fraction of women taking a math course: 80%
Average height of men: 70 inches
Average height of women: 64 inches
Fraction of freshmen taking a biology course: 35%
Fraction of freshmen taking a chemistry course: 25%
Fraction of freshmen taking courses in both biology and chemistry: 10%
Now answer the following questions:
Q1. What fraction of freshmen are not taking biology? (This is the complement rule on page 19 of Pitman.)
Q2. Of all freshmen, what fraction are women who are taking a math course? (This is the multiplication rule for conditional probabilities on page 37 of Pitman.)
Q3. What fraction of freshmen are taking a math course? (This is the rule of average conditional probabilities on page 41 of Pitman.)
Q4. Of the freshmen who are taking a math course, what fraction are women? (This is Bayes rule, found on page 49 of Pitman.)
Q5. What is the average height of freshmen? (This is the rule of average conditional expectation on page 402 of Pitman.)
Q6. What fraction of freshmen are taking biology or chemistry (or both)? (This is the inclusion-exclusion formula on page 22.)
The answers are 65%, 32%, 86%, 16/43, 67.6 inches, and 50%.
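For readers who want to check the six answers, the arithmetic can be written out as a short Python sketch using exact fractions. The variable names are assumptions chosen for this illustration; the numbers are the ones given in the facts above, and each line applies the rule named in the corresponding question.

```python
from fractions import Fraction

# Facts about the freshman class, as stated above.
men = Fraction(60, 100)
women = Fraction(40, 100)
math_given_man = Fraction(90, 100)
math_given_woman = Fraction(80, 100)
height_men = 70    # inches
height_women = 64  # inches
biology = Fraction(35, 100)
chemistry = Fraction(25, 100)
both = Fraction(10, 100)

# Q1: complement rule.
q1 = 1 - biology
# Q2: multiplication rule for conditional probabilities.
q2 = women * math_given_woman
# Q3: rule of average conditional probabilities.
q3 = men * math_given_man + women * math_given_woman
# Q4: Bayes' rule (women taking math, as a fraction of all math-takers).
q4 = q2 / q3
# Q5: rule of average conditional expectation.
q5 = men * height_men + women * height_women
# Q6: inclusion-exclusion.
q6 = biology + chemistry - both

for label, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4), ("Q5", q5), ("Q6", q6)]:
    print(label, value, float(value))
```

Exact fractions are used rather than floating-point numbers so that the answer to Q4 comes out as 16/43 instead of a rounded decimal.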
12. Few people understand much about calculus right after finishing a year of calculus. Mostly, calculus students just memorize a lot of mysterious, magical recipes, along with the cue words that help them figure out which recipe to use. (When the word “slope” appears, you differentiate, and when the word “area” appears, you integrate.)
My friend Steve Lalley says that the only people who understand anything about calculus are the ones who, after taking calculus, go on to take courses where calculus is used, for example in physics, chemistry, engineering, or economics. The second half of Stat 516 will use a lot of calculus. So, if you’re a little hazy about what is really going on in calculus, Stat 516 will help you learn.
On this same topic, note that in Pitman’s preface, he suggests that the instructor zip through chapters 1 and 2 so as to have enough time in chapters 4-6 to teach calculus.
==============================================================================================
Fall 2007
Session 1. Aug. 20 Sections 1.1, 1.2
Session 2. Aug. 22 Section 1.3
Session 3 Aug. 24 Section 1.4
Session 4. Aug. 27 Section 1.5
Session 5. Aug. 29 Section 1.6
Session 6. Aug. 31 Section 2.1
Sep. 3 Labor Day No Class
Session 7. Sep. 5 Section 2.2
Session 8. Sep. 7 Section 2.4
Session 9. Sep. 10 Section 2.5
Session 10. Sep. 12 Section 3.1
Session 11. Sep. 14 Section 3.2
Session 12. Sep. 17 Section 3.2
Session 13 Sep. 19 Section 3.3
Session 14 Sep. 21 Section 3.3
Session 15 Sep. 24 Section 3.3
Session 16 Sep. 26 Section 3.4
Session 17 Sep. 28 Section 3.5
Session 18 Oct. 1 Review of chapters 1-3
Session 19 Oct. 3 Exam 1
Session 20 Oct. 5 Section 4.1
Oct. 8 October Break No Class
Session 21 Oct. 10 Section 4.1
Session 22 Oct. 12 Section 4.2
Session 23 Oct. 15 Section 4.2
Session 24 Oct. 17 Section 4.3
Session 25 Oct. 19 Section 4.5
Session 26 Oct. 22 Sections 4.5 and 4.6
Session 27 Oct. 24 Section 4.4
Session 28 Oct. 26 Section 4.4
Session 29 Oct. 29 Section 5.1
Session 30 Oct. 31 Section 5.2
Session 31 Nov. 2 Section 5.3
Session 32 Nov. 5 Review for Exam 2
Session 33 Nov. 7 Exam 2
Session 34 Nov. 9 Section 5.4
Session 35 Nov. 12 Section 5.4
Session 36 Nov. 14 Section 6.1
Session 37 Nov. 16 Sections 6.2, 6.3
Session 38 Nov. 19 Section 6.3
Nov. 21-24 Thanksgiving Vacation No Class
Session 39 Nov. 26 Section 6.4
Session 40 Nov. 28 Section 6.4
Session 41 Nov. 30 Section 6.5
Session 42 Dec. 3 Section 6.5
Session 43 Dec. 5 Review
Session 44 Dec. 7 Review
* I will be available in the classroom for off-campus students taking exams during cancelled-class times.
Homework 1 – Due August 29 (Session 5)
Section 1.1: 4
Section 1.3: 2, 4, 5, 8
Section 1.4: 2, 3, 4
Homework 2 – Due September 5 (Session 7)
Section 1.4: 5, 6, 7
Section 1.5: 2, 4, 5
Section 1.6: 1, 2, 4, 6, 7
Homework 3 – Due September 10 (Session 9)
Section 2.1: 2, 4, 6, 12
Section 2.2: 2 (Exercise 2 refers to the approximations in Exercise 1), 3, 6, 9, 11, 12 (Use the normal approximation on page 99 for these exercises, not the skew-normal approximation on page 106.)
Important: some work has to be shown in your HWs, not just answers. If a solution looks like the answer key from the book, the corresponding problem will receive no points.
Homework 4 – due Sept 17 (Session 12)
Section 2.4: 1b, 1c, 2, 6, 7
Section 2.5: 1, 2, 4, 8
Section 3.1: 1, 2, 4, 6, 10
Homework 5 – due Sept 24 (Session 15)
Section 3.2: 2, 3, 4, 5, 6, 10
Section 3.3: 2, 3, 7, 8, 9, 13, 14, 16
Off-campus students: when you submit HW via email, please make sure that it can be printed easily, e.g. no multiple files or multiple pages in an Excel file.
Homework 6 – due Oct 1 (Session 18)
Section 3.3: 19, 20
Section 3.4: 1, 2, 3, 4 (Question 4(a) asks for the probability that the 3 coin-toss results are not all the same.), 7a, 7b, 11
Section 3.5: 1, 2, 3, 4, 7, 9
Homework 7 – due Oct 15 (Session 23)
Section 4.1: 1, 2, 3, 5, 6, 8, 11, 12, 13
Homework 8 – due Oct 22 (Session 26)
Section 4.2: 1, 2, 4, 5, 6, 8, 15
Section 4.5: 2, 5
Homework 9 – due Oct 29 (Session 29)
Section 4.5: 6
Section 4.6: 1
Section 4.4: 3, 4, 5, 6, 10
Chapter 4 Review Exercises: 5, 6. Also do Review Exercise 3, except do it with X = max{Y_1, Y_2, Y_3, Y_4}.
Homework 10 – due Nov 5 (Session 32)
Section 5.1: 1, 2, 4, 5, 6
Section 5.2: 2, 3, 4, 5, 7
Homework 11 – due Nov 12 (Session 35)
Section 5.3: 1, 2, 3, 5, 7, 9
Homework 12 – due Nov 19 (Session 38)
Section 5.4: 1, 2, 5, 6, 9, 13
Do problems 9 and 13 by finding the cdf’s of X (in 9) and Z (in 13) and then differentiating. Then do:
T = U, X = UV.
(a) What points in the plane are the possible values of (T, X)?
(b) Find the joint density of (T, X).
(c) Find the density of X from your answer to part (b). Compare with your answer to 5.4.9.
T = X + Y, Z = X – Y.
(a) What points in the plane are the possible values of (T, Z)?
(b) Find the joint density of (T, Z).
(c) Find the density of Z from your answer to part (b). Compare with your answer to 5.4.13.
Homework 13 – due Nov 28 (Session 40)
Section 6.1: 1, 2
Section 6.2: 3, 4a,d
Section 6.3: 1, 2, 4
Homework 14 – due Dec 5 (Session 43)
Section 6.4: 1, 3, 5, 10
Section 6.5: 1, 2, 3, 4
As of now, homeworks have different weights, but at the end of the term each HW grade will be converted to a 0-50 point scale.
Solutions for Exam 1, Fall 2007 are here
Solutions for Exam 2, Fall 2007 (.tiff file, can be opened with MS Office Document Imaging) are here