Statistics


Principles of Statistics for Admin.

(15060105)

Sampling and Sampling Distributions


● An element is the entity on which data are collected.
● A population is the collection of all the elements of interest.
● A sample is a subset of the population.

The sampled population is the population from which the sample is drawn,
and a frame is a list of the elements that the sample will be selected from.
[Figure: a population and its frame, with population elements and sample elements labeled]

Point Estimator
● If the measures are computed for data from a population, they are called population parameters.
● If the measures are computed for data from a sample, they are called sample statistics.

In statistical inference, a sample statistic is referred to as the point estimator of the corresponding population parameter.

The target population is the population we want to make inferences about, while the sampled population is the population from which the sample is actually taken.
Point estimators

Measure               Population parameter    Sample statistic (point estimator)
Mean                  $\mu$                   $\bar{x}$
Variance              $\sigma^2$              $S^2$
Standard deviation    $\sigma$                $S$
Proportion            $p$                     $\bar{p}$
Example
The following marks are from a simple random sample of students: 2, 7, 10, 9, 6, 8
a. What is the point estimate of the population mean?
b. What is the point estimate of the population variance?
c. What is the point estimate of the population standard deviation?
d. What is the point estimate of the population proportion of failed students,
assuming a student fails if his/her mark is less than 5?
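Answer:
a. The point estimate of the population mean $\mu$ is the sample mean:
$$\bar{x} = \frac{\sum x_i}{n} = \frac{2 + 7 + 10 + 9 + 6 + 8}{6} = \frac{42}{6} = 7$$

b. The point estimate of the population variance $\sigma^2$ is the sample variance:
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{25 + 0 + 9 + 4 + 1 + 1}{5} = \frac{40}{5} = 8$$

c. The point estimate of the population standard deviation $\sigma$ is the sample standard deviation:
$$S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{8} \approx 2.83$$

d. The point estimate of the population proportion $p$ of failed students is the sample proportion:
$$\bar{p} = \frac{\sum I_i}{n} = \frac{1}{6} \approx 0.17$$
where $I_i = 1$ if the mark of student $i$ is less than 5, and $I_i = 0$ otherwise. Only the mark 2 is below 5.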
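The point estimates in this example can be checked with a short Python sketch. It relies only on the standard `statistics` module; the variable names are illustrative and not part of the course material.

```python
import statistics

marks = [2, 7, 10, 9, 6, 8]  # the simple random sample from the example

x_bar = statistics.mean(marks)      # point estimate of the population mean
s2 = statistics.variance(marks)     # sample variance (divides by n - 1)
s = statistics.stdev(marks)         # sample standard deviation
# Sample proportion of failed students: a mark below 5 counts as a fail.
p_bar = sum(1 for m in marks if m < 5) / len(marks)

print("mean:", x_bar)
print("variance:", s2)
print("standard deviation:", round(s, 3))
print("proportion failed:", round(p_bar, 3))
```

Note that `statistics.variance` and `statistics.stdev` use the $n - 1$ divisor, which matches the sample formulas; `pvariance` and `pstdev` would divide by $n$ instead.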
Sampling Distributions
The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.
The sample proportion $\bar{p}$ is the point estimator of the population proportion $p$.

Example:
Suppose we have a simple random sample of $n = 30$ managers.
The point estimate of $\mu$ is $\bar{x} = \$51{,}814$ and
the point estimate of $p$ is $\bar{p} = 0.63$.

Suppose we select another simple random sample of $n = 30$ managers and obtain the
following point estimates:
The point estimate of $\mu$ is $\bar{x} = \$52{,}670$ and
the point estimate of $p$ is $\bar{p} = 0.70$.
……
Now suppose we repeat the process of selecting a simple random sample of
$n = 30$ managers many times, obtaining point estimates of $\mu$ and $p$ each time.
The estimates change from sample to sample, so $\bar{x}$ and $\bar{p}$ are themselves random variables.
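The repeated-sampling idea can be sketched with a small simulation. The population below is hypothetical (2,500 salaries generated at random, purely for illustration), and the function name `sample_mean` is an assumption of this sketch, not something from the course.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of manager salaries (illustrative values only).
population = [random.gauss(51_800, 4_000) for _ in range(2_500)]

def sample_mean(pop, n):
    """Mean of one simple random sample of size n, drawn without replacement."""
    return statistics.mean(random.sample(pop, n))

# Repeat the sampling process many times; each repetition gives a new sample mean.
sample_means = [sample_mean(population, 30) for _ in range(500)]

print("population mean     :", round(statistics.mean(population), 1))
print("first three x-bars  :", [round(m, 1) for m in sample_means[:3]])
print("mean of all x-bars  :", round(statistics.mean(sample_means), 1))
```

Each run of `sample_mean` gives a different value, while the average of many sample means stays close to the population mean.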
The probability distribution of any particular sample statistic is called the sampling distribution of the statistic.
The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.

$E(\bar{x})$: the expected value of $\bar{x}$ is the mean of $\bar{x}$ (where $\bar{x}$ is a random variable).

$$E(\bar{x}) = \mu, \quad \text{where } \mu \text{ is the population mean}$$

Standard error of $\bar{x}$ (the standard deviation of $\bar{x}$):

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}, \quad \text{where } \sigma \text{ is the population standard deviation and } n \text{ is the sample size}$$

Note:
• If the population of $x$ is normally distributed, then the sampling distribution of $\bar{x}$ is normally distributed.
• If the sample size is large ($n > 30$), the sampling distribution of $\bar{x}$ is assumed to be approximately normal, whatever the population distribution (central limit theorem).
• If the expected value of a statistic equals the parameter, then
the statistic is an unbiased estimator of the parameter (example: $E(\bar{x}) = \mu$).
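A minimal sketch of the standard error formula $\sigma / \sqrt{n}$ shows how it shrinks as the sample size grows. The value $\sigma = 12$ and the sample sizes are illustrative assumptions, and `standard_error_of_mean` is a hypothetical helper name.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 12  # illustrative population standard deviation
errors = {n: standard_error_of_mean(sigma, n) for n in (9, 36, 144)}

for n, se in errors.items():
    print(f"n = {n:>3}: standard error = {se}")
```

Quadrupling the sample size halves the standard error, because $n$ enters the formula under a square root.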

When either condition holds, the sampling distribution of $\bar{x}$ is treated as normal, and you can compute any probability for $\bar{x}$ using the normal distribution.
Normal Probability Distribution
The most important probability distribution for describing a continuous random variable is the normal probability distribution.

The normal distribution has a bell-shaped curve that is symmetric around the mean (where mean = median = mode).
Normal distribution properties:
1. The normal distribution is determined by two parameters: the mean $\mu$ and the variance $\sigma^2$.
2. The highest point on the normal curve is at the mean, which is also the median and mode of the distribution.
3. The mean of the distribution can be any numerical value: negative, zero, or positive.
4. The normal distribution is symmetric, with the shape of the normal curve to the left of the mean a mirror
image of the shape of the normal curve to the right of the mean. The tails of the normal curve extend to
infinity in both directions and theoretically never touch the horizontal axis. Because it is symmetric, the
normal distribution is not skewed; its skewness measure is zero.
5. The standard deviation determines how flat and wide the normal curve is. Larger values of the standard
deviation result in wider, flatter curves, showing more variability in the data.
6. Probabilities for the normal random variable are given by areas under the normal curve. The total area under
the curve for the normal distribution is 1. Because the distribution is symmetric, the area under the curve to
the left of the mean is 0.50 and the area under the curve to the right of the mean is 0.50.
7. The percentages of values in some commonly used intervals are:
a. 68.3% of the values of a normal random variable are within plus or minus one standard deviation of its mean.
b. 95.4% of the values of a normal random variable are within plus or minus two standard deviations of its mean.
c. 99.7% of the values of a normal random variable are within plus or minus three standard deviations of its mean.
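The percentages in property 7 can be reproduced as areas under the standard normal curve. This is a small sketch using `statistics.NormalDist` from the Python standard library; the dictionary name `areas` is illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations for k = 1, 2, 3.
areas = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.1%}")
```

Because every normal distribution becomes the standard normal after standardizing, the same percentages hold for any mean and standard deviation.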
Standard Normal Probability Distribution (Z)
The standard normal probability distribution is a normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$.

You can convert any normal random variable $x$ with mean $\mu$ and variance $\sigma^2$ to the standard normal variable $z$ by:

$$z = \frac{x - \mu}{\sigma}$$

Probabilities for the normal random variable are given by areas under the normal curve.
You can read these areas from the standard normal table ($z$).
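As a rough illustration of why standardizing works, the sketch below converts a value to a z-score and checks that the probability is the same whether it is computed on the original scale or on the standard normal scale. The numbers (mean 50, standard deviation 10, value 65) and the helper name `z_score` are assumptions made for this example.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

mu, sigma, x = 50.0, 10.0, 65.0  # illustrative values only
z = z_score(x, mu, sigma)

p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < x) on the original scale
p_via_z = NormalDist().cdf(z)            # P(Z < z) on the standard normal scale

print("z =", z)
print("P(X < x) =", round(p_direct, 4), " P(Z < z) =", round(p_via_z, 4))
```

In practice `NormalDist().cdf(z)` plays the role of the printed standard normal table.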
Example:
Based on the standard Normal table (𝑧), compute:

1. $P(Z < 0)$
2. $P(Z < 1.0)$
3. $P(Z < 1.65)$
4. $P(Z < 0.8)$
5. $P(Z < -1.28)$
6. $P(Z > 0)$
7. $P(Z > -1.34)$
8. $P(Z > 2.0)$
9. $P(-0.32 < Z < 0.5)$
10. $P(-1.28 < Z < 1.28)$
11. $P(-1.65 < Z < 1.65)$
12. $P(-1.96 < Z < 1.96)$
13. What is the value $z$ that has 40% of the probability below it, i.e. $P(Z < z) = 0.40$?
14. What is the value $z$ that has 80% of the probability above it, i.e. $P(Z > z) = 0.80$?
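One way to check table look-ups for these exercises is the sketch below, which uses `statistics.NormalDist` in place of the printed table. The helper names are illustrative, and results may differ from table values in the last decimal place because the table is rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def less_than(z):
    return Z.cdf(z)               # area to the left of z

def greater_than(z):
    return 1 - Z.cdf(z)           # area to the right of z

def between(a, b):
    return Z.cdf(b) - Z.cdf(a)    # area between a and b

answers = {
    1: less_than(0),
    2: less_than(1.0),
    3: less_than(1.65),
    4: less_than(0.8),
    5: less_than(-1.28),
    6: greater_than(0),
    7: greater_than(-1.34),
    8: greater_than(2.0),
    9: between(-0.32, 0.5),
    10: between(-1.28, 1.28),
    11: between(-1.65, 1.65),
    12: between(-1.96, 1.96),
    13: Z.inv_cdf(0.40),          # z with 40% of the area below it
    14: Z.inv_cdf(1 - 0.80),      # z with 80% of the area above it
}

for number, value in answers.items():
    print(f"{number:>2}. {value:.4f}")
```

Items 13 and 14 go in the reverse direction: they start from an area and use the inverse of the cumulative distribution to recover $z$.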
An Application using the Normal Distribution:

Example:

Consider a population of 1000 students whose marks are normally distributed with mean 70 and variance 144.

a) Compute the probability that a student's mark is greater than 79, $P(X > 79)$.
b) Compute the probability that a student's mark is between 64 and 85.
c) How many students have marks between 64 and 85?
d) How many students have marks above 85?
e) What is the mark below which 330 students score?
f) A random sample of size 36 is drawn from the population.
Compute the probability that the sample mean ($\bar{X}$) is greater than 75.
Answer:
$N = 1000$, $\mu = 70$, $\sigma^2 = 144 \rightarrow \sigma = 12$

Note: $Z = \frac{X - \mu}{\sigma}$

a) Compute the probability that a student's mark is greater than 79, $P(X > 79)$.

$$P(X > 79) = P\left(Z > \frac{x - \mu}{\sigma}\right) = P\left(Z > \frac{79 - 70}{12}\right) = P(Z > 0.75) = 1 - 0.7734 = 0.2266$$
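Part (a) can be verified numerically. The sketch below assumes the marks follow a normal distribution with mean 70 and standard deviation 12, as stated in the example; the variable names are illustrative.

```python
from statistics import NormalDist

marks = NormalDist(mu=70, sigma=12)  # distribution of student marks

z = (79 - 70) / 12                   # standardized value of the mark 79
p_above_79 = 1 - marks.cdf(79)       # upper-tail probability P(X > 79)

print("z =", z)
print("P(X > 79) =", round(p_above_79, 4))
```

The upper-tail area is obtained as one minus the cumulative probability, the same step as subtracting the table value from 1.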
b) Compute the probability that a student's mark is between 64 and 85, $P(64 < X < 85)$.

$$P(64 < X < 85) = P\left(\frac{x_1 - \mu}{\sigma} < Z < \frac{x_2 - \mu}{\sigma}\right) = P\left(\frac{64 - 70}{12} < Z < \frac{85 - 70}{12}\right)$$
$$= P(-0.5 < Z < 1.25)$$
$$= T(1.25) - T(-0.5)$$
$$= 0.8944 - 0.3085 = 0.5859$$
where $T(z) = P(Z < z)$ is the value read from the standard normal table.
c) How many students have marks between 64 and 85?

Note: number of students = population size × probability
$$N \cdot P(64 < X < 85) = 1000 \times 0.5859 = 585.9$$
About 586 of the 1000 students have marks between 64 and 85.

d) How many students have marks above 85?

$$N \cdot P(X > 85) = 1000 \times (1 - 0.8944) = 1000 \times 0.1056 = 105.6$$
About 106 of the 1000 students have marks above 85.
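Parts (b) to (d) follow the same pattern: find a probability, then scale it by the population size. A minimal sketch, assuming the same normal model with mean 70 and standard deviation 12 (variable names are illustrative):

```python
from statistics import NormalDist

N = 1000                             # population size
marks = NormalDist(mu=70, sigma=12)  # distribution of student marks

p_between = marks.cdf(85) - marks.cdf(64)  # P(64 < X < 85)
p_above = 1 - marks.cdf(85)                # P(X > 85)

# Expected number of students = population size * probability.
n_between = round(N * p_between)
n_above = round(N * p_above)

print("P(64 < X < 85) =", round(p_between, 3), "->", n_between, "students")
print("P(X > 85)      =", round(p_above, 3), "->", n_above, "students")
```

The unrounded probabilities can differ from the table-based figures in the fourth decimal place, since the table values are themselves rounded; the student counts are unaffected.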
e) What is the mark below which 330 students score?

$$330 = 1000 \cdot P(X < x_{330}) \;\rightarrow\; P(X < x_{330}) = 0.33$$

From the Z table, find the $z$ value with area 0.33 to its left:
$$P(Z < -0.44) = 0.33, \quad z = -0.44$$

Then solve $Z = \frac{X - \mu}{\sigma}$ for $x_{330}$:
$$-0.44 = \frac{x_{330} - \mu}{\sigma} = \frac{x_{330} - 70}{12} \;\rightarrow\; x_{330} = 70 + (-0.44)(12) = 64.72$$

→ About 330 students score below the mark 64.72.
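Part (e) runs the table look-up in reverse: it starts from an area and recovers the mark. The sketch below uses `inv_cdf` from `statistics.NormalDist` for that step, again assuming mean 70 and standard deviation 12; the variable names are illustrative.

```python
from statistics import NormalDist

N = 1000
marks = NormalDist(mu=70, sigma=12)  # distribution of student marks

share_below = 330 / N                # required area to the left: 0.33
z = NormalDist().inv_cdf(share_below)  # z value with that area below it
cutoff = marks.inv_cdf(share_below)    # same cut-off on the original scale

print("z =", round(z, 2))
print("cut-off mark =", round(cutoff, 2))
```

`marks.inv_cdf(p)` is equivalent to computing $\mu + z\sigma$ with the $z$ value found from the table.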
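Sampling distribution of the sample mean ($\bar{X}$):
Assume $X$ is normally distributed with mean $\mu$ and variance $\sigma_X^2$.
Then $\bar{X}$ is normally distributed with mean $\mu$ and variance $\sigma_{\bar{X}}^2$, such that:
$$\sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}$$
We can then convert $\bar{X}$ to the standard normal distribution using:
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}, \quad \mu_{\bar{X}} = \mu, \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
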

f) A random sample of size 36 is drawn from the population.
Compute the probability that the sample mean ($\bar{X}$) is greater than 75.
Note: $N = 1000$, $\mu = 70$, $\sigma^2 = 144$, $n = 36$, so $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = 2$. Find $P(\bar{X} > 75)$.
$$P(\bar{X} > 75) = P\left(Z > \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z > \frac{75 - 70}{12 / \sqrt{36}}\right) = P(Z > 2.50) = 0.0062$$
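The calculation in part (f) can be sketched as a small function. The only change from the single-student case is that the standard deviation is replaced by the standard error $\sigma / \sqrt{n}$. The function name `prob_sample_mean_above` is a hypothetical helper, not part of the course material.

```python
import math
from statistics import NormalDist

def prob_sample_mean_above(threshold, mu, sigma, n):
    """P(sample mean > threshold) for samples of size n from a normal population."""
    standard_error = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, standard_error).cdf(threshold)

p_f = prob_sample_mean_above(75, mu=70, sigma=12, n=36)
print("P(sample mean > 75) =", round(p_f, 4))
```

For comparison, a single student's mark exceeds 75 far more often than the mean of 36 marks does, because averaging reduces the spread.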
g) A random sample of size 25 is drawn from the population.
Compute the probability that the sample mean ($\bar{X}$) is greater than 76.
Note: $N = 1000$, $\mu = 70$, $\sigma^2 = 144$, $n = 25$, so $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{25}} = 2.4$. Find $P(\bar{X} > 76)$.
$$P(\bar{X} > 76) = P\left(Z > \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z > \frac{76 - 70}{12 / \sqrt{25}}\right) = P(Z > 2.50)$$
$$= 1 - P(Z < 2.50) = 1 - 0.9938 = 0.0062$$
Sampling distribution of the proportion
The sampling distribution of $\bar{p}$ is the probability distribution of all possible
values of the sample proportion $\bar{p}$.

$E(\bar{p})$: the expected value of $\bar{p}$ is the mean of $\bar{p}$ (where $\bar{p}$ is a random variable).

$$E(\bar{p}) = p, \quad \text{where } p \text{ is the population proportion}$$

Standard error of $\bar{p}$ (the standard deviation of $\bar{p}$):
$$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}, \quad \text{where } p \text{ is the population proportion and } n \text{ is the sample size}$$

Note:
• The sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever $np \ge 5$ and $n(1 - p) \ge 5$.

In this course
we assume the sampling distribution of $\bar{p}$ is normally distributed,
and you can compute any probability for $\bar{p}$ using the normal distribution.
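The rule of thumb for the normal approximation is easy to express as a check. This is only a sketch: `normal_approx_ok` is a hypothetical helper, and the sample sizes and proportions below are illustrative.

```python
def normal_approx_ok(n, p, threshold=5):
    """True when both n*p and n*(1 - p) reach the rule-of-thumb threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Illustrative cases: same sample size, different population proportions.
checks = {
    (30, 0.2): normal_approx_ok(30, 0.2),   # n*p = 6,  n*(1 - p) = 24
    (30, 0.1): normal_approx_ok(30, 0.1),   # n*p = 3,  n*(1 - p) = 27
    (100, 0.1): normal_approx_ok(100, 0.1), # n*p = 10, n*(1 - p) = 90
}

for (n, p), ok in checks.items():
    print(f"n = {n}, p = {p}: normal approximation reasonable? {ok}")
```

When the proportion is close to 0 or 1, a larger sample is needed before the approximation becomes reasonable.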
Example:
Consider a sample of 30 students, where the population proportion of failed students is 0.2.
Compute the expected value of the sample proportion $\bar{p}$ and the standard error of $\bar{p}$.

Answer:
$$E(\bar{p}) = p = 0.2, \quad \text{where } p \text{ is the population proportion}$$

Standard error of $\bar{p}$ (the standard deviation of $\bar{p}$):

$$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.2(1 - 0.2)}{30}} = 0.073$$
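The same result can be reproduced with a few lines of Python. The helper name `proportion_standard_error` is an assumption of this sketch; the inputs are the values from the example.

```python
import math

def proportion_standard_error(p, n):
    """Standard deviation of the sample proportion for samples of size n."""
    return math.sqrt(p * (1 - p) / n)

p, n = 0.2, 30          # population proportion and sample size from the example
expected_value = p      # the sample proportion is unbiased: its mean equals p
se = proportion_standard_error(p, n)

print("expected value:", expected_value)
print("standard error:", round(se, 3))
```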
