Math10282 Ex05 - An R Session
Probability Distributions in R
You can carry out a variety of calculations involving parametric probability
distributions in R. Some of the common distributions available are
Distribution          R name
Discrete:
  Binomial            binom
  Poisson             pois
  Geometric           geom
  Negative Binomial   nbinom
Continuous:
  Uniform             unif
  Normal              norm
  Exponential         exp
  Gamma               gamma
  Chi-square          chisq
  Beta                beta
  Student t           t
Some of these will already be familiar; others will become familiar
as you study more probability and statistics. R has four functions
available for each distribution. These are
Name                           Description
dname(x= , other arguments)    Density or probability mass function
pname(q= , other arguments)    Cumulative distribution function
qname(p= , other arguments)    Quantile function
rname(n= , other arguments)    Random deviates
i.e. you prefix the R name of the distribution with one of the letters "d",
"p", "q" or "r", depending on what you would like to calculate. You must give
the values of the distribution's parameters in the call to the function
whenever they have no preset default or you want to change them from the
default. Functions may also have other arguments with preset values; you can
use "help" in R to check these.
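As a brief illustration of the naming scheme, here is a minimal sketch that applies all four prefixes to the Poisson distribution. The rate lambda = 3 and the variable names are chosen only for this example.

```r
# Poisson distribution; lambda = 3 is an arbitrary illustrative value
lambda <- 3

dens  <- dpois(x = 2, lambda = lambda)    # pmf: P(X = 2)
cum   <- ppois(q = 2, lambda = lambda)    # cdf: P(X <= 2)
med   <- qpois(p = 0.5, lambda = lambda)  # quantile: the median
draws <- rpois(n = 5, lambda = lambda)    # five random deviates

print(c(pmf = dens, cdf = cum, median = med))
print(draws)
```

The Poisson functions have no preset default for lambda, so it has to be supplied in every call.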
When calculating the pdf or pmf, use the argument x = ... to give a scalar
or vector of values at which the calculation is to be performed. The cdf
calculates P(X ≤ q), so use q = ... to give a scalar or vector of values.
The quantile (or inverse cdf) function uses p = ... to take a scalar or
vector of the probabilities for which you want the quantiles. When
generating random observations, use the n = ... argument to say how many
you want.
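The scalar-or-vector behaviour described above can be sketched with the standard Normal distribution. The input values and variable names below are chosen only for illustration.

```r
# Vector arguments are evaluated element by element
x_vals <- c(-1, 0, 1)

dens   <- dnorm(x = x_vals)                # pdf height at each value
cum    <- pnorm(q = x_vals)                # P(X <= q) for each value
quants <- qnorm(p = c(0.25, 0.5, 0.75))    # lower quartile, median, upper quartile
draws  <- rnorm(n = 4)                     # four random observations

print(dens)
print(cum)
print(quants)
print(draws)
```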
> pbinom(q=25, 50, 0.6)
[1] 0.09780736
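Since the cdf of a discrete distribution is the running total of its pmf, the value above can be cross-checked by summing dbinom over 0, 1, ..., 25. A minimal sketch, with the arguments size and prob written out by name:

```r
# P(X <= 25) for X ~ Bi(50, 0.6), computed two ways
cdf_value <- pbinom(q = 25, size = 50, prob = 0.6)
pmf_sum   <- sum(dbinom(x = 0:25, size = 50, prob = 0.6))

print(cdf_value)
print(all.equal(cdf_value, pmf_sum))  # the two agree up to rounding error
```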
> rbinom(n=8, 1, 0.6)
[1] 1 1 1 0 0 0 1 0
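Random deviates change from run to run, so the output of rbinom above will generally not be reproduced exactly. A minimal sketch of making a simulation repeatable with set.seed; the seed value 2024 is an arbitrary choice for this example, not part of the original session.

```r
# Fixing the seed makes the random draws repeatable
set.seed(2024)  # arbitrary seed, assumed for illustration
first <- rbinom(n = 8, size = 1, prob = 0.6)

set.seed(2024)  # same seed again
second <- rbinom(n = 8, size = 1, prob = 0.6)

print(first)
print(identical(first, second))  # TRUE: same seed, same draws
```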
> dnorm(x=0)
[1] 0.3989423
The default settings are mean=0 and sd=1, so there is no need
to mention these parameters in the call to the function dnorm above.
(This is also the case with the functions pnorm, qnorm and rnorm.)
If we want to calculate the height of the pdf at x = 0 for the N(4, 10^2)
distribution then we use:
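A minimal sketch of such a call, assuming the mean and sd arguments are passed by name as in the qnorm examples (note that sd is the standard deviation 10, not the variance 100). The variable name height is hypothetical.

```r
# Height of the N(4, 10^2) pdf at x = 0
height <- dnorm(x = 0, mean = 4, sd = 10)
print(height)

# Cross-check against the Normal density formula
by_formula <- exp(-(0 - 4)^2 / (2 * 10^2)) / (10 * sqrt(2 * pi))
print(all.equal(height, by_formula))
```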
If we wanted a plot of the pdf curve for this distribution then we use:
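One possible way to draw the curve is to evaluate dnorm on a grid of x-values and join the points with a line. This is a sketch only: the plotting range, the grid size and the axis labels are assumptions made for the example.

```r
# Evaluate the N(4, 10^2) pdf on a grid and draw it as a line plot
x_grid   <- seq(from = -30, to = 38, length.out = 200)  # assumed range around the mean
pdf_vals <- dnorm(x = x_grid, mean = 4, sd = 10)

plot(x_grid, pdf_vals, type = "l",
     xlab = "x", ylab = "f(x)", main = "pdf of N(4, 10^2)")
```

The base R function curve offers an alternative that avoids building the grid by hand.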
The quantile or inverse cdf function is used as follows:
> qnorm(p=0.975)
[1] 1.959964
> qnorm(p=0.975, mean=4, sd=10)
[1] 23.59964
> qnorm(p=0.5, mean=4, sd=10)
[1] 4
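Because the quantile function is the inverse of the cdf, passing a quantile back through pnorm with the same parameters should recover the original probability. A minimal sketch of that check; the variable names are chosen for this example.

```r
# qnorm and pnorm undo each other when given the same parameters
q975   <- qnorm(p = 0.975, mean = 4, sd = 10)
p_back <- pnorm(q = q975, mean = 4, sd = 10)

print(q975)
print(p_back)  # recovers the probability 0.975 up to rounding error
```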
Exercises
(a) (i) Let the discrete random variable X ∼ Bi(n = 100, p = 0.3).
Use the appropriate binom functions to calculate the values of
P(X = 33), P(28 ≤ X ≤ 34) and P(X < 38). Now calculate the
same probabilities using a Normal approximation to the Binomial
distribution with a continuity correction. Compare your two sets
of results.
(ii) Let X ∼ Po(10). In R, calculate and plot the pmf of X for the
x-values {0, 1, 2, ..., 25}. Calculate P(X < 15), P(X ≥ 8) and
P(6 ≤ X ≤ 16). Find the lower quartile, median and upper
quartile of the distribution.
(iii) Let X ∼ Ex(0.2). In R, calculate and plot the pdf of X (a line
plot) for x ∈ (0, 25). Calculate P(X < 12), P(X > 3) and
P(4 < X < 20). Find the 20th, 50th and 80th percentiles
of the distribution.
(iv) Let X ∼ N(20, 7^2). In R, calculate and plot the pdf of X for x ∈
(0, 40). Calculate P(X < 17), P(X > 25) and P(13 < X < 27).
Find the 5th, 10th, 90th and 95th percentiles of this Normal
distribution. Can you find these percentiles as functions of the
corresponding percentiles from the standard Normal distribution?
(b) We will use the same data sets as in R Session 2, i.e. simdat1,
geyser and anorexia.