This page contains the R code for STAT 51100, Section 010, Chapter 3: some discrete random variables. We will cover several distributions: the geometric, binomial, hypergeometric, negative binomial, and Poisson distributions.

We can use R to simulate random variables from the above distributions, to calculate the probability mass function (pmf) and cumulative distribution function (cdf), and to draw the probability histogram.

The general function formats are

 d*(x, parameters)
 p*(x, parameters)
 q*(p, parameters)
 r*(sample size, parameters)

Here

1. The d* function calculates the value of the probability mass function at value x. The distribution's parameters must be specified.

2. The p* function calculates the value of the cumulative distribution function at value x, i.e., $$P(X \leq x)$$.

3. The q* function calculates the percentile (quantile) for probability $$p$$, i.e., the smallest x with $$P(X \leq x) \geq p$$. We will return to this function after we learn percentiles in Chapter 4.

4. The r* function simulates a random sample of the specified sample size from the specified distribution.

A useful function, help.search(), can be used to look up how to use the above functions. We will not simulate discrete random variables in this tutorial.
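As a quick illustration of the d*, p*, and q* formats described above, the following sketch uses the binomial family. The parameter values (size 10, probability .4) and the variable names are chosen here only for illustration. The r* function is left out because this tutorial does not simulate.

```r
# Illustrative parameters (an assumption for this example)
n <- 10
p <- 0.4

pmf_at_3 <- dbinom(3, size = n, prob = p)    # P(X = 3)
cdf_at_3 <- pbinom(3, size = n, prob = p)    # P(X <= 3)
median_x <- qbinom(0.5, size = n, prob = p)  # smallest x with P(X <= x) >= 0.5

# The cdf at 3 equals the sum of the pmf over 0, 1, 2, 3
all.equal(cdf_at_3, sum(dbinom(0:3, size = n, prob = p)))
```

Every distribution on this page follows the same pattern; only the suffix (geom, binom, and so on) and the parameter names change.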

### 1. Geometric distribution

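The geometric distribution functions are dgeom, pgeom, qgeom, and rgeom. In R, the geometric random variable counts the number of failures before the first success, so its values start at 0. The following calculates the pmf of a geometric distribution at $$x=0, 1, 2, 3, \cdots, 10$$, with $$p=.4$$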

x<-seq(0, 10)
y<-dgeom(x, prob=.4)
y
##  [1] 0.400000000 0.240000000 0.144000000 0.086400000 0.051840000
##  [6] 0.031104000 0.018662400 0.011197440 0.006718464 0.004031078
## [11] 0.002418647
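To see where these numbers come from, the values can be compared with the formula $$P(X=x) = p(1-p)^x$$, which is the pmf R uses for the geometric distribution (number of failures before the first success). The sketch below is only a check; the name y_formula is ours.

```r
x <- seq(0, 10)
p <- 0.4

# pmf computed directly from the formula p * (1 - p)^x
y_formula <- p * (1 - p)^x

# TRUE when the formula agrees with dgeom up to numerical tolerance
all.equal(y_formula, dgeom(x, prob = p))
```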

We can plot this pmf

plot(x, y, type='h', xlab='X', ylab='', main='Probability Mass Function')

The cdf can be calculated as

z<-pgeom(x, prob=.4)
z
##  [1] 0.4000000 0.6400000 0.7840000 0.8704000 0.9222400 0.9533440 0.9720064
##  [8] 0.9832038 0.9899223 0.9939534 0.9963720

Note that the last entry in z should be the same as the sum of the array y, since both equal $$P(X \leq 10)$$. We will confirm it by calculating sum(y)

sum(y)
## [1] 0.996372
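The sum is slightly less than 1 because the geometric distribution puts some probability on values above 10. As a sketch, the cdf can also be checked against its closed form $$P(X \leq x) = 1-(1-p)^{x+1}$$; the names z_formula and tail_prob are ours.

```r
x <- seq(0, 10)
p <- 0.4

# Closed-form cdf of the geometric distribution (failures before first success)
z_formula <- 1 - (1 - p)^(x + 1)

# TRUE when the closed form agrees with pgeom up to numerical tolerance
all.equal(z_formula, pgeom(x, prob = p))

# Probability left over beyond x = 10, i.e. P(X > 10)
tail_prob <- (1 - p)^11
tail_prob
```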

The cdf can be plotted as

plot(x, z, xlab='X', ylab='', main='Cumulative Distribution Function')

Note that the above plot is NOT the cdf plot in the textbook. We will not show how to draw the plot in this tutorial.

### 2. Binomial distribution

The binomial distribution functions are dbinom, pbinom, qbinom, and rbinom. The following calculates the pmf of a binomial distribution at $$x=0, 1, 2, 3, \cdots, 10$$, with $$p=.4$$ and $$n=10$$.

x<-seq(0, 10)
y<-dbinom(x, size=10, prob=.4)
y
##  [1] 0.0060466176 0.0403107840 0.1209323520 0.2149908480 0.2508226560
##  [6] 0.2006581248 0.1114767360 0.0424673280 0.0106168320 0.0015728640
## [11] 0.0001048576
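These values can be compared with the binomial formula $$P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$$. The sketch below is only a check, using R's choose() for the binomial coefficient; the name y_formula is ours.

```r
x <- seq(0, 10)
n <- 10
p <- 0.4

# pmf computed directly from the binomial formula
y_formula <- choose(n, x) * p^x * (1 - p)^(n - x)

# TRUE when the formula agrees with dbinom up to numerical tolerance
all.equal(y_formula, dbinom(x, size = n, prob = p))
```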

We can plot this pmf

plot(x, y, type='h', xlab='X', ylab='', main='Probability Mass Function')

The cdf can be calculated as

z<-pbinom(x, size=10, prob=.4)
z
##  [1] 0.006046618 0.046357402 0.167289754 0.382280602 0.633103258
##  [6] 0.833761382 0.945238118 0.987705446 0.998322278 0.999895142
## [11] 1.000000000

Note that each entry in z should be the same as a partial sum of the array y. We will confirm it by calculating sum(y[1:6]), the sum of the first 6 entries in y, which should be the same as z[6], i.e., $$P(X \leq 5)$$

sum(y[1:6])
## [1] 0.8337614
z[6]
## [1] 0.8337614
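The same comparison can be made for every entry at once. As a sketch, cumsum() returns all the partial sums of the pmf, which should match the cdf values entry by entry.

```r
x <- seq(0, 10)
y <- dbinom(x, size = 10, prob = 0.4)  # pmf at 0, 1, ..., 10
z <- pbinom(x, size = 10, prob = 0.4)  # cdf at 0, 1, ..., 10

# TRUE when every partial sum of y matches the corresponding entry of z
all.equal(cumsum(y), z)
```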

The cdf can be plotted as

plot(x, z, xlab='X', ylab='', main='Cumulative Distribution Function')