Statistics 516

Basic Probability and Applications

Fall 2007

Instructor:                  Professor Thomas Sellke

Office:                         Math 532

Phone:                        765-494-6034

E-mail:                        tsellke@stat.purdue.edu

Telephone Office Hours for Off-Campus Students:  MWF 12:00-1 pm

Office Hours for On-Campus Students:       MWF 1-3 pm and by appointment

Graduate student Nik Tuzov will be available for telephone office hours at 765-494-0025 (office) or, if he is out of the office, 718-877-6352 (cell) at the following times (subject to change depending on holidays, etc.):

For the week of Sept 17th I will be available on Friday only. Sorry about the inconvenience.

Mon                4 – 5 pm

Wed                6 – 7:30 pm

Fri                   4 – 5:30 pm

You can email questions to stat516questions@stat.purdue.edu

For off-campus students, homework can be mailed (post-marked on or before the date of the relevant session) to:

Statistics 516 Homework

Department of Statistics

150 N. University Street

West Lafayette IN 47907-2067

or emailed to Nik Tuzov at ntuzov@purdue.edu (this address is for homework only; questions should be emailed to stat516questions@stat.purdue.edu).

HW 2007 solutions:  see below.

Text:   Probability, by Jim Pitman, Springer Verlag

Lecture Notes:  My notes from Fall 2005 are posted on this page .

Prerequisites:  You need to know (or learn) that a derivative is a slope, that a one-dimensional definite integral is the area under a curve, and that a two-dimensional definite integral is the volume under a surface.  You should be able to compute straightforward derivatives and integrals.  It will be helpful to know that an integral is a limit of (Riemann) sums.

But seriously, folks, this course (and the subject of probability in general) is about fractions of a whole (called probabilities) and averages (called expectations, or expected values, or means, all of which mean the same thing).  Hence, the material we’ll cover is in a sense very elementary.  (The exception is the central limit theorem, which is quite a bit deeper than simple fractions and averages.)  As mentioned above, you’ll need to know how to do basic calculus, but the most important prerequisites are the ability to understand simple English and the ability to think.
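The prerequisite remark that an integral is a limit of Riemann sums can be checked numerically. The following is only a minimal sketch: the function names `riemann_sum` and `square` are invented for this illustration, and the choice of the curve y = x^2 on the interval from 0 to 1 (exact area 1/3) is an assumption, not an example taken from Pitman.

```python
def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f on [a, b] using n equal-width rectangles."""
    width = (b - a) / n
    return sum(f(a + i * width) for i in range(n)) * width


def square(x):
    """Illustrative integrand (an assumption for this sketch): y = x^2."""
    return x * x


# The exact area under y = x^2 between 0 and 1 is 1/3.
# As n grows, the sum of rectangle areas should approach that value.
for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(square, 0.0, 1.0, n))
```

Each printed value is the total area of n thin rectangles, and the values should creep toward 1/3 as the rectangles get thinner.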

Exams:  There will be two examinations during the semester (see schedule at the bottom) and a comprehensive final exam.

Homework:  There is a lot of homework, but you won’t learn probability without doing lots of problems.  You are encouraged to collaborate, though each person should write up his or her own solutions.  The homework is not worth a huge number of points (see below), so it’s no big deal if you don’t succeed in solving a few of the problems.  What is important is that you think about them (hard, if necessary) and then learn the solutions of those you miss.  (We will send back solutions when we return your homework.)  Note that there are short answers to most odd-numbered problems in the back of the book.

Homework of on-campus students is due at the end of class on the days indicated.  Homework of off-campus students should be postmarked on the corresponding session day for each student’s off-campus location.

Homework               50

Exam 1                   100

Exam 2                   100

Final Exam              150

Total                       400

Old Course Material:  Lecture notes and a practice exam from the Fall 2001 rendition of Stat 516 can be found at:  http://www.stat.purdue.edu/~tsellke/stat516/2001/

Solutions for practice Exam #1, 2001 are here

1.  Probability is about fractions and averages.

2.  “Fractions and averages” may sound trivial, boring, and unimportant, but probability is none of these things.  In fact, probability is a key to understanding the world.  Here is a quote from an article entitled “The Unexpected Uselessness of Philosophy” by a very smart guy named Steve Sailer:

To this day, philosophers suffer from Plato’s disease: the assumption that reality fundamentally consists of abstract essences best described by words or geometry. (In truth, reality is largely a probabilistic affair best described by statistics.)

3.  Probability and statistics are fundamental in the practice of science.  How do we learn things scientifically?  Typically, we try to measure something, but our measurements almost always have errors.  Often we can use probability to study how these errors should affect our conclusions.

A more specific example of the scientific use of probability is the randomized clinical trial in medicine.  Does taking multivitamins make you healthier and live longer?  Well, suppose you have two uncles, one of whom takes a vitamin pill every day and the other of whom never takes vitamins.  If one uncle is healthier than the other, can you conclude anything about the effect of vitamins?  No, since the two uncles differ in a zillion ways, aside from whether they take vitamins, and some of these other differences are likely to be involved in their health differences.  What if we get data on 100,000 people and find that those who take vitamins tend to be healthier than those who don’t?  Again, the people who take vitamins may differ in other ways from people who don’t take vitamins.  (Maybe the vitamin-takers tend to exercise more and eat a healthier diet than those who don’t take vitamins, and maybe these other differences in behavior are what are really responsible for differences in health.)  So, to learn about the effect of the vitamins, we’d like to compare two groups of people, one group taking vitamins and the other group not, but with the two groups as much alike as possible aside from the taking of vitamins.  Then, if we see a difference in health, perhaps we can conclude that the vitamins are responsible.  But how can we form our similar experimental groups?  One way is to flip a coin for each person:  heads means the person gets a daily vitamin pill and tails means no vitamin pill.  (Better yet, tails means the person gets a fake vitamin pill, called a placebo, with neither the subjects nor those evaluating health knowing who is in which group.  Why is this better?)  Now it may happen just by chance that most healthy people end up in one experimental group, and that a subsequent difference in health between the two groups is mostly due to the way the coin flips came out.  
However, we can use the laws of probability to show that the two experimental groups will almost certainly be very similar, provided that we’re dealing with a large enough number of experimental subjects.
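The claim that coin-flip assignment produces similar groups when there are enough subjects can be explored with a small simulation. This is a rough sketch under stated assumptions, not a proof: the function names, the seed, the 50% baseline rate of "healthy" subjects, and the group sizes are all made up for illustration.

```python
import random


def imbalance(n_subjects, rng):
    """Assign n_subjects to two groups by fair coin flips and return the
    absolute difference between the groups' fractions of healthy subjects."""
    vitamin, placebo = [], []
    for _ in range(n_subjects):
        # Assumption for this sketch: each subject is healthy at baseline
        # with probability 1/2, independently of everything else.
        healthy = rng.random() < 0.5
        # The coin flip: heads -> vitamin group, tails -> placebo group.
        group = vitamin if rng.random() < 0.5 else placebo
        group.append(healthy)
    if not vitamin or not placebo:
        return 1.0  # one group is empty: count it as maximal imbalance
    return abs(sum(vitamin) / len(vitamin) - sum(placebo) / len(placebo))


def mean_imbalance(n_subjects, trials, rng):
    """Average the imbalance over many repetitions of the randomization."""
    return sum(imbalance(n_subjects, rng) for _ in range(trials)) / trials


rng = random.Random(516)  # fixed seed (arbitrary) so the run is repeatable
for n in (20, 200, 2000):
    print(n, round(mean_imbalance(n, 200, rng), 4))
```

The printed average gap between the two groups should shrink as the number of subjects grows, which is the behavior the laws of probability predict.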

4.  The most important idea in biology is evolution by means of natural selection.

Natural selection is a “random” process, described by probability.

5.   Insurance is based on probability.

6.  Mathematical finance uses probabilistic models of stock prices to figure out the values of stock options and other “derivatives.”

7.  Engineers use probability for quality control and for modeling telecommunications networks.

8.  Probability theory originated in the study of games of chance during the 1500’s and 1600’s.  People have been wagering on dice for thousands of years, and, looking back, it seems strange that it took so long for intelligent analysis to replace ignorant superstition in how games of chance are viewed.

Many people (like me) are fascinated by games of chance, particularly when they have money at stake.  Some people (like my parents) find them uninteresting.  Some people (like my deceased grandmother, who used to say that card play is the devil’s play) think them sinful.  However, for purposes of teaching probability, games of chance have the advantage of offering clean, easy-to-state problems, without the messy, extraneous features present in many more important applications of probability.

9.  The course title of Statistics 516 is “Basic Probability and Applications.”  I did not come up with this title, and if it had been up to me, I’d have left off the “and Applications” part.  My goal (and Pitman’s, I think) is to get you to understand the basic ideas of mathematical probability.  The examples and problems will be chosen to do this as efficiently and effectively as possible.  Hence, we’ll talk a lot about coins, dice, and cards.

The “applications” to opinion polling, finance, disease clusters, prediction of college grades from SAT scores, etc., will generally be based on made-up examples which illustrate the probabilistic gist of the phenomenon but largely ignore the non-probabilistic complications which would crop up in the real world.

10.  You probably think you have some understanding of fractions already.  Here’s a (more-or-less) true story.

The University of California at Berkeley brought in a Polish guy named Jerzy Neyman in 1939 to start up their program in statistics.  When Neyman arrived there just before the start of fall classes, Griffith C. Evans, the chairman of the Mathematics Department, grabbed him and said, “Come with me.  You can help with registering new students.”  So Neyman goes with Evans, and there’s a line of students waiting to get into the appropriate math classes.  Evans tells the girl who’s first in line to go to a blackboard and to compute 3/5 + 1/4.  The girl looks at Evans with an expression that clearly says, “What kind of an idiot do you take me for?”  Then she goes to the blackboard and writes:

Evans says, “Go sign up for the course at table A.”  Students who got their questions right were directed to table B.  (See Constance Reid’s biography of Neyman for the original version of this story.)

11.  If you’re a table B person, you know (or can figure out, with a little prodding) some basic facts about probability already.

Suppose that Purdue University collects the following data on first-semester freshmen.

Men                                                                                        60%

Women                                                                                   40%

Fraction of men taking a math course                                        90%

Fraction of women taking a math course                                   80%

Average height of men                                                             70 inches

Average height of women                                                        64 inches

Fraction of freshmen taking a biology course                              35%

Fraction of freshmen taking a chemistry course                          25%

Fraction of freshmen taking courses in both biology and chemistry    10%

Q1. What fraction of freshmen are not taking biology? (This is the complement rule on page 19 of Pitman.)

Q2.  Of all freshmen, what fraction are women who are taking a math course?  (This is the multiplication rule for conditional probabilities on page 37 of Pitman.)

Q3.  What fraction of freshmen are taking a math course?  (This is the rule of average conditional probabilities on page 41 of Pitman.)

Q4.  Of the freshmen who are taking a math course, what fraction are women?  (This is Bayes’ rule, found on page 49 of Pitman.)

Q5.  What is the average height of freshmen? (This is the rule of average conditional expectation on page 402 of Pitman.)

Q6.  What fraction of freshmen are taking biology or chemistry (or both)?  (This is the inclusion-exclusion formula on page 22.)

The answers are 65%, 32%, 86%, 16/43, 67.6 inches, and 50%.
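The six answers can be reproduced with exact fraction arithmetic. The sketch below is one way to do the bookkeeping in Python; the variable names are invented for this illustration, and the rule named in each comment is the one cited in the corresponding question.

```python
from fractions import Fraction

# Data from the table of first-semester freshmen.
men = Fraction(60, 100)
women = Fraction(40, 100)
math_given_man = Fraction(90, 100)
math_given_woman = Fraction(80, 100)
height_men = 70      # inches
height_women = 64    # inches
bio = Fraction(35, 100)
chem = Fraction(25, 100)
bio_and_chem = Fraction(10, 100)

# Q1: complement rule.
q1 = 1 - bio
# Q2: multiplication rule for conditional probabilities.
q2 = women * math_given_woman
# Q3: rule of average conditional probabilities.
q3 = men * math_given_man + women * math_given_woman
# Q4: Bayes' rule, i.e. the women's share of all math-takers.
q4 = q2 / q3
# Q5: rule of average conditional expectation.
q5 = men * height_men + women * height_women
# Q6: inclusion-exclusion formula.
q6 = bio + chem - bio_and_chem

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3),
                    ("Q4", q4), ("Q5", q5), ("Q6", q6)]:
    print(name, value, float(value))
```

Using `Fraction` rather than floating-point numbers keeps Q4 as the exact value 16/43 instead of a rounded decimal.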

12.  Few people understand much about calculus right after finishing a year of calculus.  Mostly, calculus students just memorize a lot of mysterious, magical recipes, along with the cue words that help them figure out which recipe to use.  (When the word “slope” appears, you differentiate, and when the word “area” appears, you integrate.)

My friend Steve Lalley says that the only people who understand anything about calculus are the ones who, after taking calculus, go on to take courses where calculus is used, for example in physics, chemistry, engineering, or economics.  The second half of Stat 516 will use a lot of calculus.  So, if you’re a little hazy about what is really going on in calculus, Stat 516 will help you learn.

On this same topic, note that in Pitman’s preface, he suggests that the instructor zip through chapters 1 and 2 so as to have enough time in chapters 4-6 to teach calculus.

==============================================================================================

Fall 2007

# Week 1

Session 1.                    Aug. 20                       Sections 1.1, 1.2

Session 2.                    Aug. 22                       Section 1.3

Session 3                     Aug. 24                       Section 1.4

# Week 2

Session 4.                    Aug. 27                       Section 1.5

Session 5.                    Aug. 29                       Section 1.6

Session 6.                    Aug. 31                       Section 2.1

# Week 3

Sep. 3                          Labor Day                               No Class

Session 7.                    Sep. 5                          Section 2.2

Session 8.                    Sep. 7                          Section 2.4

# Week 4

Session 9.                    Sep. 10                        Section 2.5

Session 10.                  Sep. 12                        Section 3.1

Session 11.                  Sep. 14                        Section 3.2

# Week 5

Session 12.                  Sep. 17                        Section 3.2

Session 13                   Sep. 19                        Section 3.3

Session 14                   Sep. 21                        Section 3.3

# Week 6

Session 15                   Sep. 24                        Section 3.3

Session 16                   Sep. 26                        Section 3.4

Session 17                   Sep. 28                        Section 3.5

# Week 7

Session 18                   Oct. 1                          Review of chapters 1-3

Session 19                   Oct. 3                          Exam 1

Session 20                   Oct. 5                          Section 4.1

# Week 8

Oct. 8                          October Break                         No Class

Session 21                   Oct. 10                        Section 4.1

Session 22                   Oct. 12                        Section 4.2

# Week 9

Session 23                   Oct. 15                        Section 4.2

Session 24                   Oct. 17                        Section 4.3

Session 25                   Oct. 19                        Section 4.5

# Week 10

Session 26                   Oct. 22                        Sections 4.5 and 4.6

Session 27                   Oct. 24                        Section 4.4

Session 28                   Oct. 26                        Section 4.4

# Week 11

Session 29                   Oct. 29                        Section 5.1

Session 30                   Oct. 31                        Section 5.2

Session 31                   Nov. 2                         Section 5.3

# Week 12

Session 32                   Nov. 5                         Review for Exam 2

Session 33                   Nov. 7                         Exam 2

Session 34                   Nov. 9                         Section 5.4

# Week 13

Session 35                   Nov. 12                       Section 5.4

Session 36                   Nov. 14                       Section 6.1

Session 37                   Nov. 16                       Sections 6.2, 6.3

# Week 14

Session 38                   Nov. 19                       Section 6.3

Nov. 21-24                  Thanksgiving Vacation           No Class

# Week 15

Session 39                   Nov. 26                       Section 6.4

Session 40                   Nov. 28                       Section 6.4

Session 41                   Nov. 30                       Section 6.5

# Week 16

Session 42                   Dec. 3                          Section 6.5

Session 43                   Dec. 5                          Review

Session 44                   Dec. 7                          Review

# Week 17

## Final Exam

* I will be available in the classroom for off-campus students taking exams during cancelled-class times.

Homework 1 – Due August 29 (Session 5)

Section 1.1:     4

Section 1.3:     2, 4, 5, 8

Section 1.4:     2, 3, 4

Homework 2 – Due September 5 (Session 7)

Section 1.4:     5, 6, 7

Section 1.5:     2, 4, 5

Section 1.6:     1, 2, 4, 6, 7

Homework 3 – Due September 10 (Session 9)

Section 2.1:      2, 4, 6, 12

Section 2.2:      2 (Exercise 2 refers to the approximations in Exercise 1), 3, 6, 9, 11, 12 (Use the normal approximation on page 99 for these exercises, not the skew-normal approximation on page 106.)

Important: some work has to be shown in your HWs, not just answers.

If a solution looks like the answer key from the book, the corresponding problem will receive no points.

Homework 4 – due Sept 17 (Session 12)

Section 2.4:      1b, 1c, 2, 6, 7

Section 2.5:      1, 2, 4, 8

Section 3.1:      1, 2, 4, 6, 10

Homework 5 – due Sept 24 (Session 15)

Section 3.2:      2, 3, 4, 5, 6, 10

Section 3.3:      2, 3, 7, 8, 9, 13, 14, 16

Off-campus students: when you submit hw via email, please make sure that it can be printed easily, e.g. no multiple files or multiple pages in an Excel file.

Homework 6 – due Oct 1 (Session 18)

Section 3.3:      19, 20

Section 3.4:      1, 2, 3, 4 (Question 4(a) asks for the probability that the 3 coin-toss results are not all the same.), 7a, 7b, 11

Section 3.5:      1, 2, 3, 4, 7, 9

Homework 7 –  due Oct 15 (Session 23)

Section 4.1:      1, 2, 3, 5, 6, 8, 11, 12, 13

Homework 8 – due Oct 22 (Session 26)

Section 4.2:      1, 2, 4, 5, 6, 8, 15

Section 4.5:      2, 5

Homework 9 – due Oct 29 (Session 29)

Section 4.5:      6

Section 4.6:      1

Section 4.4:      3, 4, 5, 6, 10

Chapter 4 Review Exercises:     5, 6.  Also do Review Exercise 3, except do it with X = max{Y1, Y2, Y3, Y4}.

Homework 10 – due Nov 5 (Session 32)

Section 5.1:      1, 2, 4, 5, 6

Section 5.2:      2, 3, 4, 5, 7

Homework 11 – due Nov 12 (Session 35)

Section 5.3:      1, 2, 3, 5, 7, 9

Homework 12 – due Nov 19 (Session 38)

Section 5.4:      1, 2, 5, 6, 9, 13

Do problems 9 and 13 by finding the cdf’s of X (in 9) and Z (in 13) and then differentiating.  Then do:

E 5.4.9             For U and V independent uniform (0,1) random variables, define

T = U,                               X = UV.

(a)    What points in the plane are the possible values of (T, X)?

(b)    Find the joint density of (T, X).

E 5.4.13           For X and Y independent exponential (λ) random variables, define

T = X + Y,                      Z = X – Y.

(a)    What points in the plane are the possible values of (T, Z)?

(b)    Find the joint density of (T, Z).

Homework 13 – due Nov 28 (Session 40)

Section 6.1:      1, 2

Section 6.2:      3, 4a,d

Section 6.3:      1, 2, 4

Homework 14 – due Dec 5 (Session 43)

Section 6.4:      1, 3, 5, 10

Section 6.5:      1, 2, 3, 4

HW Solutions Fall 2007

As of now, the homeworks have different weights, but at the end of the term each HW grade will be converted to a 0-50 point scale.

HW1 Solution

HW2 Solution

HW3 Solution

HW4 Solution

HW5 Solution

HW6 Solution

Solutions for practice Exam #1, 2001 are here

Solutions for Exam1, Fall 2007 are here

HW7 Solution

HW8 Solution

HW9 Solution

HW10 Solution

HW11 Solution

Solutions for Exam2, Fall07 (.tiff file, can be opened with MS Office Document Imaging) are here