Practical 5
Goals: This practical has two objectives. First, we shall review some useful notions of
point estimation theory. Second, we shall use R to simulate the sampling distribution of the median.
1 Theoretical Exercises
1.1 Exercise 1
Which of the following are random variables?
1. Population mean.
2. Population size.
3. Sample size.
4. Sample mean.
1.2 Exercise 2
Suppose we are interested in estimating the proportion of households living below the poverty
line in a given Swiss canton. For this purpose, a random sample of households is drawn from
the population.
1. What computations should we carry out on the sample in order to answer our
question? Formalize these steps by proposing an estimator and specifying its distribution.
2. If the proportion of households living below the poverty line is equal to 0.15 at the
population level, what should the sample size be so that the standard deviation of the
estimator defined in point 1 is at most 0.02?
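Before deriving anything, it can help to see how the variability of an estimator changes with the sample size. The sketch below assumes that the estimator is the sample proportion and simulates it with `rbinom`; the sample sizes in `n.values`, the number of replications `n.rep` and the seed are illustrative choices, not part of the exercise.

```r
set.seed(2017)                 # arbitrary seed, for reproducibility only
p <- 0.15                      # population proportion given in the exercise
n.values <- c(50, 200, 800)    # hypothetical sample sizes to compare
n.rep <- 2000                  # simulated samples per sample size

# For each n, simulate n.rep sample proportions and compute their standard deviation
emp.sd <- sapply(n.values, function(n) {
  p.hat <- rbinom(n.rep, size = n, prob = p) / n
  sd(p.hat)
})
names(emp.sd) <- n.values
print(round(emp.sd, 4))
```

The empirical standard deviations shrink as n grows, which is the behaviour the analytical answer should reproduce.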
University of Geneva GSEM
Statistics I Fall 2017
Prof. Eva Cantoni Practical 5
1.3 Exercise 3
The sample mean X̄ of a sample of size n is used to estimate the population mean µ. We
would like to find n such that the absolute error of estimation |X̄ − µ| is at most equal
to a fixed value d with a (large) probability 1 − α (α given). Assume that the
random variables X1 , . . . , Xn are independent and identically distributed (i.i.d.) and drawn from a
N (µ, 4) distribution.
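For a given n and d, the probability P(|X̄ − µ| ≤ d) can be approximated by simulation, which offers a way to check a candidate n obtained analytically. In the sketch below the values of `mu`, `n`, `d` and `n.rep` are hypothetical, and `sigma = 2` assumes that the 4 in N(µ, 4) denotes the variance.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
mu <- 0          # hypothetical population mean
sigma <- 2       # assumes N(mu, 4) is parameterized by its variance
n <- 100         # hypothetical candidate sample size
d <- 0.5         # hypothetical maximal absolute error
n.rep <- 5000    # number of simulated samples

# Simulate n.rep sample means and measure how often the error stays within d
x.bar <- replicate(n.rep, mean(rnorm(n, mean = mu, sd = sigma)))
coverage <- mean(abs(x.bar - mu) <= d)
print(coverage)
```

Comparing `coverage` with the target 1 − α indicates whether the chosen n is large enough for that d.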
2 R simulation
2.1 Distributions in R
R has built-in functions to evaluate quantities associated with many common probability
distributions. You can compute values of the cumulative distribution function (cdf) using
functions with the prefix "p" and quantiles using the prefix "q". Moreover, we can evaluate the
probability density functions (pdf for continuous distributions) or probability mass functions
(pmf for discrete distributions) with the prefix "d" and randomly generate observations
drawn from given distributions using the prefix "r".
The following table summarizes the available functions for some common probability distri-
butions.
where:
• a and b (min = 0 and max = 1): the parameters of the Uniform distribution (beginning and
end of the interval);
• µ and σ (mean = 0 and sd = 1): mean and standard deviation parameters of the Normal
distribution;
• m and p (size and prob): number of trials and probability of success for a Binomial
distribution;
• λ (lambda for the Poisson, rate = 1 for the Exponential): rate parameter of the Poisson
and Exponential distributions.
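The four prefixes can be tried out directly at the console. The calls below are a small illustration with arbitrary argument values; the variable names are only there to keep the results.

```r
# Normal distribution with the default mean = 0 and sd = 1
p.val <- pnorm(1.96)      # cdf: P(X <= 1.96), about 0.975
q.val <- qnorm(0.975)     # quantile of order 0.975, about 1.96
d.val <- dnorm(0)         # density at 0, equal to 1 / sqrt(2 * pi)
set.seed(1)               # arbitrary seed, for reproducibility only
r.val <- rnorm(5)         # five random draws

# Other families follow the same pattern
u.val <- punif(0.3, min = 0, max = 1)       # 0.3
b.val <- dbinom(2, size = 10, prob = 0.5)   # choose(10, 2) / 2^10
po.val <- ppois(0, lambda = 2)              # exp(-2)
e.val <- dexp(0, rate = 2)                  # 2

print(c(p.val, q.val, d.val, u.val, b.val, po.val, e.val))
```

Note that the Normal functions take the standard deviation through `sd`, not the variance.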
1. For X ∼ N (10, 4) compute P (X < 12) and P (10 < X < 12).
Draw a kernel density plot of the sample (see Practical 3). Add the probability density
function of a N (1.5, 4) distribution on the same plot with:
sorted.norm.sample = sort(norm.sample)
lines(sorted.norm.sample, dnorm(sorted.norm.sample,
      mean = 1.5, sd = 2), col = "red")
# Try lines(norm.sample, dnorm(norm.sample, mean = 1.5, sd = 2),
# col = "red") instead of the above. What happens?
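The snippet above relies on a vector `norm.sample` that must already exist. A minimal self-contained sketch, assuming a sample of 100 draws from a N (1.5, 4) distribution (the sample size, the seed and the plot title are illustrative choices), could look like this:

```r
set.seed(123)                                   # arbitrary seed, for reproducibility only
norm.sample <- rnorm(100, mean = 1.5, sd = 2)   # sd = 2 because the variance is 4

# Kernel density estimate of the sample
plot(density(norm.sample), main = "Kernel density and N(1.5, 4) density")

# Overlay the theoretical density evaluated at the sorted sample points
sorted.norm.sample <- sort(norm.sample)
lines(sorted.norm.sample, dnorm(sorted.norm.sample, mean = 1.5, sd = 2),
      col = "red")
```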
F (x) = 1 − exp(−λx)
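For x ≥ 0, this formula is the cumulative distribution function of an exponential distribution with rate λ, which is what `pexp` evaluates. A quick numerical check, with an arbitrary rate and arbitrary evaluation points:

```r
lambda <- 0.5                 # arbitrary rate chosen for the check
x <- c(0.1, 1, 2.5, 10)       # arbitrary non-negative evaluation points

F.formula <- 1 - exp(-lambda * x)      # cdf computed from the formula
F.builtin <- pexp(x, rate = lambda)    # cdf computed by R

print(all.equal(F.formula, F.builtin))
```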
1. Give the population mean E(X) and population median m(λ) of a random variable
X ∼ E(λ).
2. To look at the sampling distribution of the median when λ = 1/2 and sample size is
100, we simulate 500 samples:
Exp.median = numeric(500)
# prepare a vector of size 500 to store the results
for (i in 1:500)
{
  Exp.median[i] = median(rexp(100, rate = 1 / 2))
  # store results at each iteration
}
(a) Explore graphically the sampling distribution of the median (with histograms,
boxplots, etc.).
(b) Can the sampling distribution be considered normal? Check graphically with the
appropriate tool.
(c) Around which value is the sampling distribution of median(X1 , . . . , Xn ) concen-
trated?
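As a starting point for these questions, the simulation can be written more compactly with `replicate` and followed by a few standard plots. This is only one possible set of tools; the seed, the number of histogram bins and the plot titles are illustrative choices.

```r
set.seed(5)   # arbitrary seed, for reproducibility only

# Same simulation as the loop above: 500 medians of samples of size 100
Exp.median <- replicate(500, median(rexp(100, rate = 1 / 2)))

# Graphical exploration of the sampling distribution
hist(Exp.median, breaks = 30, main = "Sampling distribution of the median")
boxplot(Exp.median, horizontal = TRUE, main = "Boxplot of the simulated medians")

# Normal quantile-quantile plot with a reference line
qqnorm(Exp.median)
qqline(Exp.median)

# Numerical summary of the simulated medians
print(summary(Exp.median))
```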