This page contains R code for STAT 51100, Section 010. It covers Chapter 3, some discrete random variables. We will cover several distributions: the geometric, binomial, hypergeometric, negative binomial, and Poisson distributions.

We can use R to simulate random variables from the above distributions, to calculate their probability mass functions and cumulative distribution functions, and to draw probability histograms.

The general function formats are

d*(x, parameters)
p*(x, parameters)
q*(p, parameters)
r*(sample size, parameters)

Here

  1. The d* function calculates the value of the probability mass function at x. The parameters of the distribution must be specified.

  2. The p* function calculates the value of the cumulative distribution function at x, i.e., \(P(X \leq x)\).

  3. The q* function calculates the quantile (percentile) at probability \(p\), i.e., the smallest x with \(P(X \leq x) \geq p\). We will return to this function after we learn about percentiles in Chapter 4.

  4. The r* function simulates a random sample of the specified sample size from the specified distribution.
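
As a small illustration of the naming pattern, the sketch below calls the d*, p*, and q* functions for a binomial distribution. The choice of \(n=10\) and \(p=.4\) is only an example.

```r
# Binomial(n = 10, p = 0.4) used purely as an example distribution
dbinom(3, size = 10, prob = 0.4)    # pmf: P(X = 3)
## [1] 0.2149908
pbinom(3, size = 10, prob = 0.4)    # cdf: P(X <= 3)
## [1] 0.3822806
qbinom(0.5, size = 10, prob = 0.4)  # smallest x with P(X <= x) >= 0.5
## [1] 4
```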

The function help.search() is useful for checking how to use the above functions. We will not simulate discrete random variables in this tutorial.
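
A minimal sketch of looking up documentation is given here. The keyword "geometric" is just an example search term, and the help calls are wrapped in interactive() because they open help pages.

```r
# Open help pages only when R is running interactively
if (interactive()) {
  help.search("geometric")  # search the installed help pages for a keyword
  ?dgeom                    # help page shared by dgeom, pgeom, qgeom, rgeom
}

# args() shows the arguments a function expects
args(dgeom)
```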

1. Geometric distribution

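The geometric distribution functions are dgeom, pgeom, qgeom, and rgeom. In R, a geometric random variable counts the number of failures before the first success, so its possible values start at 0. The following calculates the pmf of a geometric distribution at \(x=0, 1, 2, 3, \ldots, 10\), with \(p=.4\).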

x<-seq(0, 10)
y<-dgeom(x, prob=.4)
y
##  [1] 0.400000000 0.240000000 0.144000000 0.086400000 0.051840000
##  [6] 0.031104000 0.018662400 0.011197440 0.006718464 0.004031078
## [11] 0.002418647
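
As a cross-check, the pmf can be compared with the closed form \(P(X=x) = p(1-p)^x\), which holds when \(X\) counts the failures before the first success (the convention dgeom uses). The variable names below are illustrative only.

```r
# Geometric pmf computed two ways
x <- seq(0, 10)
p <- 0.4
manual  <- p * (1 - p)^x          # closed form p(1-p)^x
builtin <- dgeom(x, prob = p)     # R's built-in pmf
all.equal(manual, builtin)
## [1] TRUE
```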

We can plot this pmf:

plot(x, y, type='h', xlab='X', ylab='', main='Probability Mass Function')

The cdf can be calculated as

z<-pgeom(x, prob=.4)
z
##  [1] 0.4000000 0.6400000 0.7840000 0.8704000 0.9222400 0.9533440 0.9720064
##  [8] 0.9832038 0.9899223 0.9939534 0.9963720

Note that the last entry in z should equal the sum of the array y, because y holds the pmf at every value from 0 to 10. We confirm this by calculating sum(y):

sum(y)
## [1] 0.996372
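
The same check can be made for every entry at once. The sketch below compares pgeom with the running sums of the pmf and with the closed form \(P(X \leq x) = 1-(1-p)^{x+1}\), again assuming \(X\) counts failures before the first success. The variable names are illustrative.

```r
# Geometric cdf computed three ways
x <- seq(0, 10)
p <- 0.4
cdf_from_pmf <- cumsum(dgeom(x, prob = p))  # running sums of the pmf
cdf_closed   <- 1 - (1 - p)^(x + 1)         # closed form
all.equal(cdf_from_pmf, pgeom(x, prob = p))
## [1] TRUE
all.equal(cdf_closed, pgeom(x, prob = p))
## [1] TRUE
```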

The cdf can be drawn as

plot(x, z, xlab='X', ylab='', main='Cumulative Distribution Function')

Note that the above plot is NOT the cdf plot in the textbook. We will not show how to draw the plot in this tutorial.

2. Binomial distribution

The binomial distribution functions are dbinom, pbinom, qbinom, and rbinom. The following calculates the pmf of a binomial distribution at \(x=0, 1, 2, 3, \ldots, 10\), with \(p=.4\) and \(n=10\).

x<-seq(0, 10)
y<-dbinom(x, size=10, prob=.4)
y
##  [1] 0.0060466176 0.0403107840 0.1209323520 0.2149908480 0.2508226560
##  [6] 0.2006581248 0.1114767360 0.0424673280 0.0106168320 0.0015728640
## [11] 0.0001048576
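
As a cross-check, these values can be reproduced from the binomial formula \(P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}\). The variable names below are illustrative only.

```r
# Binomial pmf computed two ways
n <- 10
p <- 0.4
x <- seq(0, n)
manual <- choose(n, x) * p^x * (1 - p)^(n - x)  # formula from the definition
all.equal(manual, dbinom(x, size = n, prob = p))
## [1] TRUE
sum(manual)  # the pmf over all possible values sums to 1
## [1] 1
```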

We can plot this pmf:

plot(x, y, type='h', xlab='X', ylab='', main='Probability Mass Function')

The cdf can be calculated as

z<-pbinom(x, size=10, prob=.4)
z
##  [1] 0.006046618 0.046357402 0.167289754 0.382280602 0.633103258
##  [6] 0.833761382 0.945238118 0.987705446 0.998322278 0.999895142
## [11] 1.000000000

Note that each entry in z should equal a partial sum of the array y. We confirm this by calculating sum(y[1:6]), the sum of the first 6 entries in y, which should equal z[6], i.e., \(P(X \leq 5)\):

sum(y[1:6])
## [1] 0.8337614
z[6]
## [1] 0.8337614
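
The check can also be done for all entries at once with cumsum(). The same idea gives interval probabilities as differences of cdf values, for example \(P(2 < X \leq 5) = P(X \leq 5) - P(X \leq 2)\). The interval chosen here is only an example.

```r
# Binomial(n = 10, p = 0.4): cdf as running sums of the pmf
x <- seq(0, 10)
y <- dbinom(x, size = 10, prob = 0.4)
z <- pbinom(x, size = 10, prob = 0.4)
all.equal(cumsum(y), z)
## [1] TRUE

# P(2 < X <= 5) from the cdf, and from summing the pmf over 3, 4, 5
pbinom(5, size = 10, prob = 0.4) - pbinom(2, size = 10, prob = 0.4)
## [1] 0.6664716
sum(dbinom(3:5, size = 10, prob = 0.4))
## [1] 0.6664716
```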

The cdf can be drawn as

plot(x, z, xlab='X', ylab='', main='Cumulative Distribution Function')