Statistics
Sample
Population:
[Figure: dot diagram of a population and a sample drawn from it]
Sample: independent identically distributed (iid)
Examples:
• production
• marketing research
Sample function: a function of the observed values in the sample, used for drawing general conclusions about the entire population.
1 lecture 6
Sample
Mean, mode & median
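Sample mean / average:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
Mode:
the value(s) with the highest frequency
Median: with $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$ the sample in increasing order,
$\tilde{X} = X_{((n+1)/2)}$ for n odd
$\tilde{X} = \frac{X_{(n/2)} + X_{(n/2+1)}}{2}$ for n even
Example: Grade sample:
4, 12, 7, 2, 10, 7, -3, 0, 0, -3, 2, 7, 10
[Figure: histogram of the grade sample; x-axis: grade (-3, 0, 2, 4, 7, 10, 12), y-axis: frequency (0.0 to 3.0)]
Mean: $\bar{x} = 55 / 13 = 4.23$
Mode: m = 7
Median: $\tilde{x} = 4$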
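The grade example can be checked with a few lines of Python. This is only a sketch using the standard library; the helper `modes` is a name introduced here for illustration and is not part of the lecture.

```python
from collections import Counter
from statistics import mean, median

# Grade sample from the slide
grades = [4, 12, 7, 2, 10, 7, -3, 0, 0, -3, 2, 7, 10]

def modes(values):
    """Return every value that attains the highest frequency, sorted."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

sample_mean = mean(grades)      # 55 / 13
sample_modes = modes(grades)    # value(s) with the highest frequency
sample_median = median(grades)  # middle value of the sorted sample (n = 13 is odd)

print(round(sample_mean, 2), sample_modes, sample_median)  # 4.23 [7] 4
```

Returning a list of modes follows the slide's wording "the value(s)": a sample can have more than one mode.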
Sample
Range & variance
Range:
$R = X_{(n)} - X_{(1)}$
Sample variance:
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
Sample standard deviation:
$S = \sqrt{S^2}$
Example cont.:
Range: r = 12 - (-3) = 15
Variance: $s^2 = \frac{1}{13-1} \sum_{i=1}^{13} (x_i - 4.23)^2 = 25.03$
Standard deviation: $s = \sqrt{25.03} = 5.00$
Notice!!
Lower case letters: Based on observations
Upper case letters: Based on random variables
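As a quick check of the numbers in the example, the range, sample variance and sample standard deviation of the grade sample can be computed as follows. This is a minimal sketch with the Python standard library; `statistics.variance` divides by n - 1, which matches the definition of the sample variance used here.

```python
from math import sqrt
from statistics import variance

# Grade sample from the earlier example
grades = [4, 12, 7, 2, 10, 7, -3, 0, 0, -3, 2, 7, 10]

sample_range = max(grades) - min(grades)  # X_(n) - X_(1)
s2 = variance(grades)                     # sum of squared deviations / (n - 1)
s = sqrt(s2)                              # sample standard deviation

print(sample_range, round(s2, 2), round(s, 2))  # 15 25.03 5.0
```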
Sample mean
Normal distribution
Theorem:
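Let X1, X2, ..., Xn be independent, normally distributed random variables with the same mean µ and the same finite variance $\sigma^2$, that is,
$X_i \sim N(\mu, \sigma^2), \quad i = 1, 2, \dots, n$ (iid).
It follows that
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
and then
$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$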
Sample mean
Distribution
The Central Limit Theorem (CLT):
Let X1, X2, ..., Xn be independent, identically distributed random variables with the same mean µ and the same finite variance $\sigma^2$. Then the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
will tend towards the standard normal distribution as n → ∞.
How large should n be before the approximation is good?
• Most distributions: n > 30 (rule of thumb)
• Normal distribution: for all n (the result is exact)
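The CLT can be illustrated with a small simulation. The sketch below is not from the lecture: it assumes an exponential population with rate 1 (so µ = 1 and σ = 1), a sample size of n = 30, and a fixed random seed, all chosen here purely for illustration. If the normal approximation is reasonable, the standardized sample means should have mean near 0, standard deviation near 1, and roughly 95% of them should fall within ±1.96.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def standardized_means(n, reps, rng):
    """Draw `reps` samples of size n from Exp(1) and standardize each sample mean."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of the Exp(1) population
    zs = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((xbar - mu) / (sigma / sqrt(n)))
    return zs

rng = random.Random(0)  # fixed seed so the run is reproducible
z = standardized_means(n=30, reps=5000, rng=rng)

z_mean = mean(z)
z_sd = pstdev(z)
share_within_1_96 = sum(abs(v) < 1.96 for v in z) / len(z)

print(z_mean, z_sd, share_within_1_96)
```

Rerunning with a smaller n (for example n = 3) should show a visibly worse match for this skewed population, which is the point of the n > 30 rule of thumb.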
Sample mean
Example
Problem: Production of light bulbs
A company produces bulbs with a lifetime X, which is approximately normally distributed with
mean: µ = 800 hours
standard deviation: σ = 40 hours
(a) Find the probability that a sample of 16 bulbs has a mean lifetime of less than 775 hours.
(b) If you observe a sample mean lifetime of 775 hours, would you believe that the population mean is in fact 800 hours?
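One way to work part (a) is to standardize the sample mean using the theorem for normally distributed samples and evaluate the standard normal CDF. The sketch below uses only the Python standard library; `normal_cdf` is a helper defined here via `math.erf`, not something named in the lecture.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 800.0, 40.0, 16  # population mean, population sd, sample size
xbar = 775.0                    # sample mean of interest

z = (xbar - mu) / (sigma / sqrt(n))  # (775 - 800) / (40 / 4) = -2.5
p = normal_cdf(z)                    # P(sample mean < 775)

print(z, round(p, 4))  # -2.5 0.0062
```

For part (b), a probability this small suggests that a sample mean of 775 hours would be quite unusual if µ really were 800 hours, which casts doubt on the claimed population mean.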
Two sample means
Comparison
Theorem:
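Assume two independent samples of sizes n1 and n2 are taken from two populations with means µ1 and µ2, respectively, and finite variances $\sigma_1^2$ and $\sigma_2^2$, respectively.
Then for the difference between the two sample means, $\bar{X}_1 - \bar{X}_2$, we have (exactly if both populations are normal, otherwise approximately for large n1 and n2)
$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$
and hence
$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}} \sim N(0, 1)$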
Sample variance
Distribution
Theorem:
Let X1, X2, ..., Xn be independent, normally distributed random variables with mean µ and variance $\sigma^2$.
Then
$\frac{(n-1) S^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \bar{X})^2 \sim \chi^2(n-1)$
When calculating $S^2$, it's usually more convenient to use
$S^2 = \frac{1}{n(n-1)} \left[ n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 \right]$
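To see that the computational formula agrees with the definition of the sample variance, both can be evaluated on the grade sample used earlier in the lecture. This is a small sketch with the standard library, where `statistics.variance` serves as the reference value computed from the definition.

```python
from statistics import variance

# Grade sample from the earlier example
grades = [4, 12, 7, 2, 10, 7, -3, 0, 0, -3, 2, 7, 10]

n = len(grades)
sum_x = sum(grades)                   # sum of X_i      = 55
sum_x2 = sum(x * x for x in grades)   # sum of X_i ** 2 = 533

# Computational formula: [n * sum(X_i^2) - (sum X_i)^2] / (n * (n - 1))
s2_shortcut = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

# Definition: sum of squared deviations from the mean, divided by n - 1
s2_definition = variance(grades)

print(round(s2_shortcut, 2), round(s2_definition, 2))  # 25.03 25.03
```

The shortcut needs only the two running sums, which is why it is convenient for hand calculation.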
Sample variance
Example
Problem: Car batteries
A producer of car batteries claims that the lifetime of their batteries is normally distributed with
mean: µ = 3 years
standard deviation: σ = 1 year
Sample of 5 batteries: 1.9 2.4 3.0 3.5 4.2
(a) Calculate the sample standard deviation.
(b) Do you believe that the standard deviation is 1 year?
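A possible way to approach this problem numerically is sketched below. Part (a) uses `statistics.stdev`; for part (b) the statistic (n - 1)s²/σ² is compared with the χ²(4) distribution. The helper `chi2_cdf_4df` is introduced here for illustration and relies on the closed-form CDF that holds for 4 degrees of freedom only. Judging the result against the central 95% of the distribution is an assumption made here, not a rule stated in the lecture.

```python
from math import exp
from statistics import stdev

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]  # sample of 5 batteries (years)
sigma0 = 1.0                            # claimed population standard deviation

n = len(lifetimes)
s = stdev(lifetimes)                    # (a) sample standard deviation
chi2 = (n - 1) * s ** 2 / sigma0 ** 2   # (n - 1) S^2 / sigma^2, chi-square with n - 1 = 4 df

def chi2_cdf_4df(x):
    """CDF of the chi-square distribution with 4 degrees of freedom (closed form for df = 4 only)."""
    return 1.0 - exp(-x / 2.0) * (1.0 + x / 2.0)

p_below = chi2_cdf_4df(chi2)            # P(chi-square(4) <= observed value)

print(round(s, 3), round(chi2, 2), round(p_below, 3))  # 0.903 3.26 0.485
```

Since the observed value lies near the middle of the χ²(4) distribution, the sample gives no particular reason to doubt σ = 1 year.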
Sample mean
Distribution (unknown variance)
Typically the variance $\sigma^2$ is unknown.
If we replace the unknown variance $\sigma^2$ by the sample variance $S^2$, we obtain:
Theorem:
Let X1, X2, ..., Xn be independent, normally distributed random variables with mean µ and (unknown) variance $\sigma^2$.
Then
$\frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t(n-1)$
Sample mean
Example (unknown variance)
Problem cont.: Car batteries
The producer claims that the lifetime of their batteries is normally distributed with
mean: µ = 3 years
standard deviation: unknown
Sample of 5 batteries: 1.9 2.4 3.0 3.5 4.2
Do you believe that the mean lifetime is 3 years?
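The t statistic for this sample can be computed as sketched below. The critical value 2.776 is the tabulated t value for 4 degrees of freedom with 2.5% in each tail; using a two-sided 5% level is an assumption made here for illustration and is not specified in the problem.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]  # sample of 5 batteries (years)
mu0 = 3.0                               # claimed mean lifetime

n = len(lifetimes)
xbar = mean(lifetimes)
s = stdev(lifetimes)

t = (xbar - mu0) / (s / sqrt(n))        # follows t(n - 1) = t(4) if the claim holds

T_CRIT = 2.776                          # table value, t(4), 2.5% upper tail (assumed level)
consistent_with_claim = abs(t) < T_CRIT

print(round(xbar, 2), round(s, 3), consistent_with_claim)  # 3.0 0.903 True
```

Here the sample mean coincides with the claimed mean, so the t statistic is essentially zero and the sample is fully consistent with µ = 3 years.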
Two sample variances
Comparison
Theorem:
If two independent samples of sizes n1 and n2 are taken from two normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then
$\frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$
Notice!!
$f_{1-\alpha}(n_1, n_2) = \frac{1}{f_{\alpha}(n_2, n_1)}$
E.g. $f_{0.95}(6, 10) = \frac{1}{f_{0.05}(10, 6)}$
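The reciprocal rule can be justified in a few lines. The sketch below assumes the upper-tail convention $P(F > f_{\alpha}(n_1, n_2)) = \alpha$ for the critical values, which the slide does not state explicitly, and uses the fact that if $F \sim F(n_1, n_2)$ then $1/F \sim F(n_2, n_1)$.

```latex
\begin{align*}
1 - \alpha
  &= P\bigl(F > f_{1-\alpha}(n_1, n_2)\bigr)
   = P\!\left(\frac{1}{F} < \frac{1}{f_{1-\alpha}(n_1, n_2)}\right) \\
\Rightarrow\quad
\alpha
  &= P\!\left(\frac{1}{F} > \frac{1}{f_{1-\alpha}(n_1, n_2)}\right),
  \qquad \frac{1}{F} \sim F(n_2, n_1) \\
\Rightarrow\quad
\frac{1}{f_{1-\alpha}(n_1, n_2)}
  &= f_{\alpha}(n_2, n_1)
\end{align*}
```

This is why F tables usually list only upper-tail critical values: lower-tail values follow by swapping the degrees of freedom and taking the reciprocal.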