Excel Sample Mean Simulation
SIMULATION
Sampling theory
You absolutely have to complete this simulation before session 3.
Save your results and bring your computer to session 3: your
teacher will check your simulation!
If you have no computer, if Excel does not work, or if you cannot find the
Analysis ToolPak, search on the internet or use a computer at IESEG and send
your work to your teacher before session 3.
Analysis ToolPak
■ Open Excel and click on Options (Mac users: click on Tools and then
Add-ins)
■ Click on Add-ins (compléments) and then select Analysis ToolPak
(utilitaire d’analyse), select « Go », then tick Analysis ToolPak
again and click on OK
■ Click on the « Data » tab (onglet données) and check that the Data
Analysis button now appears on the right
■ If it does not work, search on the internet for how to install the Analysis
ToolPak, or do the simulation in the IESEG IT room
1- Install the Analysis ToolPak in Excel
PART 1: descriptive statistics
2- Over the last 5 years, we have counted how many retakes students had to
take for the market structure and pricing decision course. The table
below gives the number of times students had to retake the course.
We can observe that 68% pass the course without any retake, 12% have
1 retake, and so on (we assume that this distribution is the population
distribution).
x    %
0    0,68
1    0,12
2    0,15
3    0,05*
* depending on your computer, you may have to use a dot (0.3) instead
of a comma (0,3)
3- Copy and paste those data into Excel and display this retake
distribution with a graph
4- Calculate the average number of retakes students have to take
and the standard deviation (be careful: you cannot use the AVERAGE
function of Excel since the data are grouped!). For the average, multiply
each value by its percentage and sum the products:
F    G       H       formula in column H
x    %       x*%
0    0,68    0       =F147*G147
1    0,12    0,12    =F148*G148
2    0,15    0,3     =F149*G149
3    0,05    0,15    =F150*G150
             0,57    =SUM(H147:H150)
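For readers who want to cross-check their Excel result, the grouped calculation can be sketched in Python. This is only an illustration and does not replace the Excel work; the variable names are hypothetical, and the standard deviation is computed as the population standard deviation, assuming (as stated above) that the table is the population distribution.

```python
import math

# Population distribution of retakes from the table: value and proportion
retakes = [0, 1, 2, 3]
proportions = [0.68, 0.12, 0.15, 0.05]

# Grouped (weighted) mean: sum of x * %, the same as column H in Excel
mu = sum(x * p for x, p in zip(retakes, proportions))

# Grouped variance: sum of (x - mu)^2 * %, then the square root
variance = sum((x - mu) ** 2 * p for x, p in zip(retakes, proportions))
sigma = math.sqrt(variance)

print("mean:", round(mu, 2))
print("standard deviation:", round(sigma, 4))
```

The mean printed here should match the 0,57 obtained with =SUM(H147:H150); the same weighting idea, applied to the squared deviations, gives the standard deviation.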
Objectives of the simulation
■ We want to study the differences between the population average µ and the sample mean
■ We want:
– to demonstrate that the sample mean is random: it varies from one sample to
another
– to show that even if the sample is large and well selected, the sample mean is not
necessarily = µ, and sometimes it is even far from µ
– to draw the probability distribution of the sample mean (x̄) in order to appreciate the
probability for the sample mean to be far from or close to the population average.
Method
■ We select a random sample from the population and we calculate the average number of
retakes in the selected sample; note that in this simulation, since we know µ=0.57, we can
appreciate how far or close the point estimate is from µ (in the real world, we do not know
the population average, so we cannot say whether the estimate is close to µ or not!).
■ We will repeat that experiment 3000 times (we select 3000 samples) in order to have 3000
sample means (x̄)
■ Let’s do it!!
PART 2: sampling; in this part we are going to select 1 random
sample of 50 observations from the « retake » distribution; we use
an Excel function that makes a random selection that respects
the percentages we observe in the population
x    %
0    0,68
1    0,12
2    0,15
3    0,05
5- Click on Data Analysis in the Data tab and then select « Random Number
Generation »
Write 50 for the number of variables (1st box) and write 1 for the number of
samples (2nd box).
For the “parameters”, select the retake distribution (without the labels).
Choose a new sheet for the output, click on OK and wait a moment…
You get 1 row with 50 columns. Those columns represent the 50 observations
that have been randomly selected from the population
6- On the right, in column AZ, calculate the mean of the 50 random observations (you
must use the AVERAGE formula of Excel since you have an individual distribution, not a
grouped one as before).
This average is the average number of retakes students have to take in the random
sample of 50 observations you have selected! It is an estimate of µ. Now compare µ
(calculated in part 1) and x̄, and comment
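Steps 5 and 6 can also be sketched outside Excel. The Python snippet below is a minimal illustration, not the method required for the session: it draws one sample of 50 observations according to the population percentages and averages it. The seed value is an arbitrary assumption, used only so that the run is repeatable; your Excel sample will contain different numbers.

```python
import random

# Population distribution of retakes (values and proportions from the table)
retakes = [0, 1, 2, 3]
proportions = [0.68, 0.12, 0.15, 0.05]
mu = 0.57  # population mean calculated in part 1

rng = random.Random(1)  # arbitrary fixed seed, only for repeatability

# One random sample of 50 observations drawn according to the proportions
sample = rng.choices(retakes, weights=proportions, k=50)

# Individual (ungrouped) data, so a plain average is appropriate here
x_bar = sum(sample) / len(sample)

print("sample mean:", x_bar)
print("difference from mu:", round(x_bar - mu, 3))
```

Changing the seed gives another sample and, in general, another sample mean, which is the point of the comparison with µ.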
PART 3: sampling; in this part we are going to select not only one but 3000 random
samples of the same size (50 observations), all from the same « retake »
distribution
WHY? We know that x̄ is a random estimate (not perfect and not necessarily = µ). We need to
appreciate the probability for x̄ to be close to or far from µ, so we need to determine its distribution.
To determine the distribution, we need many different samples in which we will calculate many
different x̄, which we will use to draw the distribution.
7- Click on Data Analysis in the Data tab and then select « Random Number Generation »
Write 50 for the number of variables (1st box) and write 3000 for the number of samples (2nd box).
For the “parameters”, select the retake distribution (without the labels).
Choose a new sheet for the output, click on OK and wait a moment…
You get 3000 rows with 50 columns. 1 row = 1 sample of 50 observations
8- On the right, in column AZ, calculate the mean of the first 50 random observations (first row) (you must use the
AVERAGE formula of Excel since you have an individual distribution, not a grouped one as before) and do the
same for the following 2999 rows (just drag the formula down)
In column AZ, you now have 3000 sample means; each comes from a randomly selected sample from the same
population! What do you observe? Use a histogram to display the distribution of the sample mean and comment!
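As a cross-check of steps 7 and 8, the whole experiment can be sketched in Python. This is an illustrative sketch under stated assumptions, not a substitute for the Excel file your teacher will check: the seed is arbitrary, the bin width of 0.1 is a choice made here for readability, and the histogram is printed as rough text bars.

```python
import random
import statistics
from collections import Counter

# Population distribution of retakes (values and proportions from the table)
retakes = [0, 1, 2, 3]
proportions = [0.68, 0.12, 0.15, 0.05]

rng = random.Random(1)  # arbitrary fixed seed, only for repeatability
n_samples, sample_size = 3000, 50

# Total number of retakes in each of the 3000 samples of 50 observations
totals = [
    sum(rng.choices(retakes, weights=proportions, k=sample_size))
    for _ in range(n_samples)
]

# One sample mean per sample (the equivalent of column AZ)
sample_means = [t / sample_size for t in totals]

print("average of the sample means:", round(statistics.mean(sample_means), 3))
print("std dev of the sample means:", round(statistics.stdev(sample_means), 3))

# Text histogram: 5 retakes in total = 0.1 in the mean when n = 50
bins = Counter(t // 5 for t in totals)
for b in sorted(bins):
    print(f"{b / 10:.1f}-{(b + 1) / 10:.1f} | " + "#" * (bins[b] // 10))
```

Each `#` stands for roughly 10 samples. The printed bars play the role of the Excel histogram: they show where the sample means concentrate and how often a sample mean lands far from µ.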
Save those results and bring your computer to the next session: your teacher will check your
simulation.