05 Statistical Inference Estimation 02052024 113417pm

Lecture

Uploaded by

Deesha Bachani
Copyright
© © All Rights Reserved
BBA – 3 (Spring – 2024)

Statistical Inference

Estimation

Ahmad Jalil Ansari


Contact: [email protected]
Learning Objectives
• Degree of freedom (df)
• Concept of Proportion
• Central Limit Theorem revisited
• Estimation: estimating certain characteristics of a population from a sample
• Point Estimator
• Interval Estimate: Confidence Level vs. Confidence Interval
• Interval Estimate for population mean (µ)
• t-Distribution
• Concept of t-Distribution
• Z-statistic vs. t-statistic
Degree of Freedom
Degrees of freedom are the number of independent values that are free to vary in a statistical analysis. They tell us how many items can be selected at random before constraints fix the remaining ones.

Consider a data set: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}


Example 1: Suppose five integers are required to have an average of six. We can select four of them at random, say {3, 8, 5, 4}; the fifth is then fixed:
(3 + 8 + 5 + 4 + x) / 5 = 6 ⇒ x must be 10
Because the first four numbers can be chosen at random, the degree of freedom is 4.

Example 2: Suppose we are required to select any five integers with no known relationship between them. Because all five can be chosen at random with no limitations, the degree of freedom is five.

Example 3: Suppose we are required to select four positive integers with the restriction that two of them are divisible by 5. The degree of freedom is then two, because only two of the numbers can be selected freely.
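The constraint in Example 1 can be sketched in a few lines of Python. This is only an illustration; the helper name `last_value_for_mean` is hypothetical and not part of the lecture.

```python
def last_value_for_mean(free_values, target_mean):
    """Return the one remaining value that forces the full set to have target_mean."""
    n = len(free_values) + 1              # total number of values in the set
    return target_mean * n - sum(free_values)

free = [3, 8, 5, 4]                       # four values chosen freely (Example 1)
x = last_value_for_mean(free, 6)          # the fifth value is then determined
df = len(free)                            # degrees of freedom = 5 - 1 = 4
print(x, df)                              # prints: 10 4
```

Whatever four values are chosen, the fifth is always fully determined by the required mean, which is why only four values are free.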
Applying Degree of Freedom
Degrees of freedom tell you how many values within a set can be selected without constraint while the set still obeys a given rule.
• Degrees of freedom are denoted df, Df, or ν (nu).
• In statistics, degrees of freedom determine the shape of the distribution used when calculating a probability value (p-value). Different sample sizes give different degrees of freedom, and therefore differently shaped distributions.
• Calculating degrees of freedom is essential when working with the t-distribution, the chi-square statistic, and the F-distribution.
• How degrees of freedom are calculated depends on the distribution used, i.e. whether it is the t-distribution, the chi-square statistic, or the F-distribution.
Proportion
In statistics, a proportion is the ratio of the size, number, or amount of one group to the size, number, or amount of the whole.
Example: If a group of 30 students selected for a research interview contains 12 girls, the proportion of girls is 12/30.
• Proportions can be obtained from samples or populations.
• Notation: $p$ = population proportion, $\hat{p}$ = sample proportion.
• If X = number of population or sample units that possess the characteristic of interest, N = population size, and n = sample size, then
$p = \frac{X}{N}$ or $\hat{p} = \frac{X}{n}$
• The proportion of units that do not possess the characteristic of interest is denoted by $q$ (for a population) and $\hat{q}$ (for a sample).
For a population: $q = \frac{N - X}{N}$ or $q = 1 - p$
For a sample: $\hat{q} = \frac{n - X}{n}$ or $\hat{q} = 1 - \hat{p}$

• A proportion can be expressed as a fraction, decimal, or percentage, e.g.

12/30 = 2/5 = 0.4 = 40%
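As a small illustration of the three equivalent forms, the sketch below uses Python's standard `fractions` module on the 12-out-of-30 example. The function name `proportions` is hypothetical.

```python
from fractions import Fraction

def proportions(x, n):
    """Return (p_hat, q_hat) as exact fractions for x successes out of n units."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = Fraction(x, n)                # automatically reduced, e.g. 12/30 -> 2/5
    return p_hat, 1 - p_hat

p_hat, q_hat = proportions(12, 30)
print(p_hat, float(p_hat), f"{float(p_hat):.0%}")   # prints: 2/5 0.4 40%
```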
Proportion
Example: It is stated that 12% of the pleasure boats in the USA are named Serenity. The 12% is a proportion: of all the pleasure boats in the United States, 12 out of every 100 are named Serenity.
In this case,
p = proportion of boats named Serenity
= 12% = 0.12 = 12/100
q = proportion of boats not named Serenity
= 100% - 12% = 88% = 0.88

Note: Proportions can also represent probabilities.

If a pleasure boat is selected at random, the probability that it is named Serenity is p = 0.12.
The probability that it is not named Serenity is q = 0.88.
Proportion
Example: In a study, 200 people were asked if they were satisfied with their job; 162 said that they were.
In this case, n = 200 and X = 162, so
$\hat{p} = \frac{X}{n} = \frac{162}{200} = 0.81$
Therefore 0.81, or 81%, of those surveyed were satisfied with their job.
The proportion of people who were not satisfied with their job is $\hat{q}$:
$\hat{q} = \frac{n - X}{n} = \frac{200 - 162}{200} = \frac{38}{200} = 0.19$, or 19%.
Note:
i. $\hat{p} + \hat{q} = 1$, or 100%. Therefore $\hat{q} = 1 - \hat{p} = 1 - 0.81 = 0.19$.
ii. If a person is randomly selected from those surveyed, the probability that the person is satisfied with the job is 81%, and the probability that the person is not satisfied is 19%.
Proportion
Example
In a recent survey of 150 households, 54 had central air conditioning. Find $\hat{p}$ and $\hat{q}$, where $\hat{p}$ is the proportion of households that have central air conditioning.
Solution
Since X = 54 and n = 150,
$\hat{p} = \frac{X}{n} = \frac{54}{150} = 0.36$, or 36%

$\hat{q} = \frac{n - X}{n} = \frac{150 - 54}{150} = \frac{96}{150} = 0.64$, or 64%
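The arithmetic in the two survey examples (job satisfaction and central air conditioning) can be checked with a short script. The dictionary layout and variable names are only illustrative.

```python
# (X, n) pairs taken from the two survey examples in the slides.
surveys = {
    "job satisfaction": (162, 200),
    "central air conditioning": (54, 150),
}

results = {}
for name, (x, n) in surveys.items():
    p_hat = x / n                 # sample proportion with the characteristic
    q_hat = (n - x) / n           # sample proportion without it
    results[name] = (p_hat, q_hat)
    print(f"{name}: p_hat = {p_hat:.2f}, q_hat = {q_hat:.2f}")
```

In both cases the two proportions sum to 1, as stated in note (i) of the job satisfaction example.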
Central Limit Theorem revisited
Sampling Distribution
Statistical Inference: the process of drawing conclusions about an underlying population based on a sample.
Sampling Distribution: the probability distribution of a sample statistic computed from all possible random samples of a specific size taken from a population.
The sample may be taken:
• With replacement
• Without replacement
The statistic may be:

Statistic                 Sampling distribution of      To estimate
Mean ($\bar{x}$)          Sample means                  µ
Variance (s²)             Sample variances              σ²
Standard deviation (s)    Sample standard deviations    σ
Proportion ($\hat{p}$)    Sample proportions            p
Central Limit Theorem
• If the population has a normal distribution, then the distribution of sample means is also normal.
• If the population is not normally distributed, then the sampling distribution of the mean approaches a normal distribution as the sample size increases (n > 30).
• The mean of the sample means equals the population mean: $\mu_{\bar{x}} = \mu$.
• The standard deviation of the sample means (the standard error) equals the population standard deviation divided by the square root of the sample size: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
• Notes
1. When the population is finite and the sample is more than 5% of the population, apply the finite population correction:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
2. If the population standard deviation (σ) is not known or the sample size is less than 30, we use the t distribution instead of the normal distribution.
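The statements about the mean and standard deviation of the sample means can be verified by listing every possible sample from a small population. The sketch below uses the data set {1, ..., 10} from the degrees-of-freedom slide and an arbitrarily chosen sample size of n = 2. It checks only the mean and standard error formulas, not the shape of the distribution.

```python
from itertools import combinations, product
from math import isclose, sqrt
from statistics import fmean, pstdev

population = list(range(1, 11))       # the data set {1, ..., 10}
N, n = len(population), 2             # n = 2 is an illustrative sample size
mu, sigma = fmean(population), pstdev(population)

# Every possible ordered sample drawn WITH replacement (N**n samples).
means_with = [fmean(s) for s in product(population, repeat=n)]

# Every possible sample drawn WITHOUT replacement (order ignored).
means_without = [fmean(s) for s in combinations(population, n)]

se_with = pstdev(means_with)          # spread of the sample means
se_without = pstdev(means_without)
fpc = sqrt((N - n) / (N - 1))         # finite population correction factor

print(isclose(fmean(means_with), mu), isclose(fmean(means_without), mu))
print(isclose(se_with, sigma / sqrt(n)))
print(isclose(se_without, sigma / sqrt(n) * fpc))
```

Sampling with replacement reproduces σ/√n, while sampling without replacement from this small population needs the correction factor from note 1.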
Central Limit Theorem
It is reported that children watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children are randomly selected, find the probability that the mean number of hours they watch television will be greater than 26.3 hours.
Solution
The mean of the sample means is $\mu_{\bar{x}} = \mu = 25$, and the standard deviation of the sample means is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671$
The z value is
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{26.3 - 25}{3 / \sqrt{20}} = \frac{1.3}{0.671} = 1.94$
The area to the right of 1.94 is 1.000 - 0.97381 = 0.02619, or 2.62%.
Therefore the probability of obtaining a sample mean larger than 26.3 hours is 2.62% [i.e., $P(\bar{X} > 26.3) = 2.62\%$].
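The same calculation can be reproduced with Python's standard library, where `statistics.NormalDist` takes the place of the printed z table. Rounding z to two decimals before the lookup mimics table use; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, x_bar = 25, 3, 20, 26.3

se = sigma / sqrt(n)                          # standard deviation of the sample means
z = (x_bar - mu) / se                         # z value for the sample mean
p_table = 1 - NormalDist().cdf(round(z, 2))   # area to the right, table-style lookup
p_exact = 1 - NormalDist().cdf(z)             # same area without rounding z

print(f"se = {se:.3f}, z = {z:.2f}, P = {p_table:.5f}")
```

The unrounded probability `p_exact` differs from the table-based value only in the fourth decimal place.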
Central Limit Theorem - Problem: sample size is more than 5% of population size
It is reported that children watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children are randomly selected from a population of 200, find the probability that the mean number of hours they watch television will be greater than 26.3 hours.
Solution
Here n/N = 20/200 = 10%, which is more than 5%, so the finite population correction applies.
The mean of the sample means is $\mu_{\bar{x}} = \mu = 25$, and the standard deviation of the sample means is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{3}{\sqrt{20}} \sqrt{\frac{200 - 20}{200 - 1}} = 0.671 \times 0.9511 = 0.638$
The z value is
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \frac{26.3 - 25}{0.638} = \frac{1.3}{0.638} = 2.04$
The area to the right of 2.04 is 1.000 - 0.97932 = 0.02068, or 2.07%.
Therefore the probability of obtaining a sample mean larger than 26.3 hours is 2.07% [i.e., $P(\bar{X} > 26.3) = 2.07\%$].
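A minimal sketch of the finite population correction follows, assuming the 5% rule from the Central Limit Theorem notes and N = 200 as given in the problem statement. The function `standard_error` and its `population_size` parameter are hypothetical names chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(sigma, n, population_size=None):
    """Standard deviation of the sample mean; the finite population
    correction is applied only when n exceeds 5% of the population."""
    se = sigma / sqrt(n)
    if population_size is not None and n / population_size > 0.05:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se

mu, sigma, n, x_bar = 25, 3, 20, 26.3

se_infinite = standard_error(sigma, n)        # no correction
se_finite = standard_error(sigma, n, 200)     # 20 of 200 is 10% > 5%
z = (x_bar - mu) / se_finite
p_upper = 1 - NormalDist().cdf(z)             # area to the right of z

print(f"se = {se_finite:.4f}, z = {z:.2f}, P = {p_upper:.4f}")
```

The correction always shrinks the standard error, so the same sample mean sits further out in the tail than it would for an effectively infinite population.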
Student t Distribution - Problem
It is reported that children watch an average of 25 hours of television per week. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 children are randomly selected, find the probability that the mean number of hours they watch television will be greater than 26.3 hours.
Solution
The mean of the sample means is $\mu_{\bar{x}} = \mu = 25$, and the standard deviation of the sample means is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671$
The t value is
$t = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{26.3 - 25}{3 / \sqrt{20}} = \frac{1.3}{0.671} = 1.94$
We then look up the area in the t table, instead of the standard normal distribution table, using the degrees of freedom (for a single sample mean, df = n - 1 = 19).
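As a stand-in for the t table, the sketch below integrates the Student t density numerically using only the standard library. It assumes df = n - 1 = 19, the usual convention for a single sample mean; the function names and the Simpson's-rule step count are illustrative choices, not part of the lecture.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and the symmetry of t.
    steps must be even."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3            # half the area lies to the right of 0

p_t = t_upper_tail(1.94, 19)              # area to the right of t = 1.94, df = 19
p_z = 1 - NormalDist().cdf(1.94)          # the normal-table answer for comparison
print(f"t tail = {p_t:.4f}, z tail = {p_z:.4f}")
```

The t tail area comes out larger than the normal tail area for the same value, reflecting the heavier tails of the t distribution at small degrees of freedom.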
Thank you

Ahmad Jalil Ansari


Contact: [email protected]
