Session2 QTII 24

Dr. Pritha Guha
Quantitative Techniques - II
[email protected]

Making Inference From A Sample
Making Inferences From Samples

▪ The most commonly used statistics are:
▪ Sample mean
▪ Sample proportion
▪ Sample variance
▪ For the inferential procedure to be good, the sample statistic should be pretty close to the
unknown population parameter.
▪ To ensure that, we look at the sampling distributions of sample statistics.
Distribution of Sample Mean in a General Case

An online florist offers three different sizes for Mother's Day bouquets: a small arrangement costing Rs. 80, a medium-sized one for Rs. 100, and a large one with a price tag of Rs. 120. 20% of all purchasers choose the small arrangement, 30% choose medium, and 50% choose large (because they really love Mom!). Suppose each customer chooses only one flower arrangement and the choices of the customers are independent of each other.

a) Obtain the probability distribution of the amount spent by a single customer on that day.
b) Obtain the probability distribution of the average amount spent by two customers on that day.
c) Obtain the probability distribution of the average amount spent by 100 customers on that day.
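Parts (a) and (b) can be sanity-checked by direct enumeration. Below is a minimal Python sketch (standard library only; the variable names are mine) that builds both distributions.

```python
from itertools import product
from collections import defaultdict

# (a) Amount spent by a single customer: value -> probability
single = {80: 0.20, 100: 0.30, 120: 0.50}

# (b) Average spent by two independent customers: enumerate all
# (customer 1, customer 2) pairs and accumulate their probabilities.
avg_two = defaultdict(float)
for (x1, p1), (x2, p2) in product(single.items(), repeat=2):
    avg_two[(x1 + x2) / 2] += p1 * p2

for avg in sorted(avg_two):
    print(f"average = Rs. {avg:5.0f}, probability = {avg_two[avg]:.2f}")
```

For part (c), enumerating all 3^100 outcomes is hopeless; that is exactly where the Central Limit Theorem, introduced below, takes over.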
Some Properties Of The Sample Statistic
Sampling Distribution
• The probability distribution of a sample statistic is called its sampling
distribution.

Note that
• It is impossible to collect every possible sample and calculate the
sample statistic for each of them.
• In fact, we will only have one sample with us!
• The sampling distribution will tell us how much a statistic would vary
from sample to sample and will help us to predict how close a statistic
is to the parameter it estimates.
Sampling Distribution Of Sample Mean: Central Limit Theorem (CLT)

$X_1, X_2, \ldots, X_n$: IID sample with mean $\mu$ and variance $\sigma^2$.

For large $n$, $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$ approximately,

OR

for large sample size, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately.
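The CLT is easy to see by simulation. A minimal sketch (the Exponential population here is an arbitrary choice for illustration): draw many samples of size n from a skewed distribution and check that the sample means behave like $N(\mu, \sigma^2/n)$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.0, 1.0, 100   # Exponential(1) has mean 1 and SD 1

# Draw 10,000 samples of size n and compute each sample's mean.
sample_means = rng.exponential(scale=mu, size=(10_000, n)).mean(axis=1)

# CLT prediction: mean of x-bar ~ mu, SD of x-bar ~ sigma / sqrt(n)
print(f"mean of sample means: {sample_means.mean():.4f} (CLT: {mu})")
print(f"SD of sample means:   {sample_means.std():.4f} (CLT: {sigma / np.sqrt(n):.4f})")
```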
Sampling Distribution Of Sample Proportion

• Let p = proportion of a population that has some property.
• The statistic that estimates the parameter p is the sample proportion:
$$\hat{p} = \frac{\text{Number of successes in the sample}}{\text{Total number of elements in the sample}}$$
• The sample proportion can be thought of as the sample mean of indicator variables.
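Concretely, writing $I_i = 1$ if the i-th sampled element has the property and $I_i = 0$ otherwise,
$$\hat{p} = \frac{1}{n}\sum_{i=1}^{n} I_i = \bar{I},$$
so the results for sample means (unbiasedness, the CLT) carry over to $\hat{p}$.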
Sampling Distribution Of Sample Proportion

• Let X = number of successes out of n trials, where n is very large.
• Then $X \sim \mathrm{Bin}(n, p)$ and $\hat{p} = \frac{X}{n}$.
• Using the normal approximation to the binomial, we get
$$X \sim N\big(np,\; np(1-p)\big) \;\Rightarrow\; \hat{p} \sim N\left(p,\; \frac{p(1-p)}{n}\right)$$
Problem

The IT section of a university suggested that students should use long passphrases as their email account passwords to protect their accounts from getting hacked.
If the population proportion of students at the university who use long passphrases as advised by the IT section is 0.1, then what is the probability that the sample proportion based on a random sample of size 200 would be in [0.045, 0.155]? (Use a suitable normal approximation without any correction.)
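Under the approximation above, $\hat{p} \sim N(0.1,\; 0.1 \times 0.9 / 200)$, so the answer is a two-sided normal probability. A quick check of the arithmetic (using scipy, as an illustration):

```python
from math import sqrt
from scipy.stats import norm

p, n = 0.1, 200
se = sqrt(p * (1 - p) / n)          # standard error of p-hat, about 0.0212

# P(0.045 <= p-hat <= 0.155) under the normal approximation
prob = norm.cdf(0.155, loc=p, scale=se) - norm.cdf(0.045, loc=p, scale=se)
print(f"P(0.045 <= p-hat <= 0.155) ~ {prob:.4f}")   # about 0.9905
```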
Making Inferences

Statistical inference
• The process of drawing conclusions about a parameter one is seeking to measure or estimate.
Making Inferences

Three major types of problems arise while making inferences:
• Estimation
• Hypothesis testing
• Relationship
Estimation

What is estimation about?
• The objective of estimation is to approximate the value of a population parameter based on a sample statistic.

The two types of estimators
• Point Estimator
• Interval Estimator
Point Estimation: Basic Idea

• We resort to sampling when we cannot get hold of the entire population, and hence don't know its characteristics (parameters): mean, SD, etc.
• The parameters are estimated through sample statistics: sample mean ($\bar{X}$), sample proportion ($\hat{p}$), sample standard deviation ($s$) / variance ($s^2$), etc.
• You can think of this technique as extrapolation of sample properties to the population.
• This technique is known as (point) estimation.
• The sample statistic is called the (point) estimator of the parameter.
Some Notations and Concepts

• Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with an underlying distribution D with parameter θ.
• The samples are independently and identically distributed (IID).
• Let us denote the estimator of the parameter θ by theta-hat ($\hat{\theta}$).
• An estimate of θ is a function of $X_1, X_2, \ldots, X_n$: $\hat{\theta} = T(X_1, X_2, \ldots, X_n)$; thus it is a random variable.
• The probability distribution of an estimate is called its sampling distribution.
• The standard error of $\hat{\theta}$ is its standard deviation, $\sigma_{\hat{\theta}} = \sqrt{\mathrm{Var}(\hat{\theta})}$.
• If the standard error itself involves unknown parameters whose values can be estimated, substituting these estimates into $\sigma_{\hat{\theta}}$ yields the estimated standard error of $\hat{\theta}$, denoted by $\hat{\sigma}_{\hat{\theta}}$ or $s_{\hat{\theta}}$.
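As a concrete instance, here is a short sketch for the sample mean, where the standard error is $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ and the estimated standard error substitutes the sample SD $s$ for σ (the simulated numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=50, scale=10, size=25)   # one sample of size n = 25

n = len(x)
xbar = x.mean()
s = x.std(ddof=1)                # sample SD (divides by n - 1)

se_true = 10 / np.sqrt(n)        # true SE, known only because we simulated
se_est = s / np.sqrt(n)          # estimated SE: substitute s for sigma

print(f"x-bar = {xbar:.2f}, true SE = {se_true:.2f}, estimated SE = {se_est:.2f}")
```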
How Does This Technique Work?

• For the sample statistic to be a good proxy (estimate) for the population parameter, it is required to have at least some desirable properties.
• Some properties which we would be discussing are:
• Unbiasedness
• Efficiency
• Consistency
Unbiasedness

What is Unbiasedness?

• Unbiasedness requires that the sampling distribution of the estimator be centered at the value of θ, whatever that value might be.
• If we want to estimate a population parameter, θ, by a function $\hat{\theta}$ of the sample, $X_1, X_2, \ldots, X_n$, and $E(\hat{\theta}) = \theta$ whatever the value of θ may be, we say $\hat{\theta}$ is unbiased.

What is Bias?

• $\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$
• The bias of an estimator $\hat{\theta}$ quantifies its accuracy by measuring how far, on average, $\hat{\theta}$ differs from θ.
Problem

The sample mean is an unbiased estimator of the population mean for SRS, i.e., $E(\bar{X}) = \mu$.
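The one-line argument uses linearity of expectation together with $E(X_i) = \mu$ for every i:
$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n\mu = \mu.$$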
Problem (HW)
Sample variance is an unbiased estimator for the population variance for
SRS.
Unbiased Estimators May Not Be Unique!

Problem

Suppose $X_1, X_2, \ldots, X_n$ is a random sample (IID) from a distribution with mean $E(X_i) = \theta$ and $\mathrm{Var}(X_i) = \sigma^2$.
Consider the following two estimators $\hat{\theta}_1$ and $\hat{\theta}_2$ for θ:
$$\hat{\theta}_1 = X_1, \qquad \hat{\theta}_2 = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Are $\hat{\theta}_1$ and $\hat{\theta}_2$ both unbiased estimators of θ?
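Both pass the unbiasedness check:
$$E(\hat{\theta}_1) = E(X_1) = \theta, \qquad E(\hat{\theta}_2) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{n\theta}{n} = \theta.$$
Unbiasedness alone therefore cannot separate them; the next slides use precision (standard error and MSE) to break the tie.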
Bias and Standard Error

• Unbiasedness requires that the sampling distribution of the estimator be centered at the value of θ, whatever that value might be.
• The standard error of $\hat{\theta}$ quantifies its precision by measuring the variability of $\hat{\theta}$ across different possible realizations (i.e., different random samples).
Unbiased Estimators May Not Be Unique!

• Which one is better?
→ Choose the estimate whose sampling distribution is highly concentrated about the true parameter value!
→ Need a measure! (Mean Square Error)


Efficiency

▪ Suppose the population parameter, θ, has two estimators, namely, $\hat{\theta}_1$ and $\hat{\theta}_2$. We say the estimator with a smaller MSE is more efficient.
▪ The MSE of an estimator $\hat{\theta}$ is defined as
$$\mathrm{MSE}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = \mathrm{Var}(\hat{\theta}) + \left[\mathrm{Bias}(\hat{\theta})\right]^2.$$
▪ If $\hat{\theta}$ is an unbiased estimator of θ, then $E(\hat{\theta}) = \theta$ and $\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta})$.
▪ Now let us answer the problem.
Unbiased Estimators May Not Be Unique: Choosing Using Efficiency

• Which one is better?
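Applying this to the earlier problem: both estimators are unbiased, so comparing MSEs reduces to comparing variances,
$$\mathrm{MSE}(\hat{\theta}_1) = \mathrm{Var}(X_1) = \sigma^2, \qquad \mathrm{MSE}(\hat{\theta}_2) = \mathrm{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n},$$
so for any $n > 1$, $\hat{\theta}_2 = \bar{X}$ is the more efficient estimator.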


Consistency

• An estimate $\hat{\theta}$ is said to be a consistent estimate of a parameter θ if the difference between the estimator $\hat{\theta}$ and the population parameter θ becomes smaller as we increase the sample size.
• Note: An unbiased estimator $\hat{\theta}_n$ based on a sample of size n, for the parameter θ, is consistent if
$$\mathrm{Var}(\hat{\theta}_n) \to 0 \quad \text{as } n \to \infty.$$
Problem

• Show that $\bar{X}$ is an unbiased and consistent estimator of the population mean μ.
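Unbiasedness was shown earlier ($E(\bar{X}) = \mu$); consistency follows from the variance check in the note above:
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \to 0 \quad \text{as } n \to \infty.$$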
Why Do We Need Consistency?

• We would like to have the estimates become more and more concentrated near the true value of the parameter being estimated.
• An estimator can be unbiased but not consistent.
• Example: $\hat{\theta}_1 = X_1$ from the earlier problem; it is unbiased, but $\mathrm{Var}(\hat{\theta}_1) = \sigma^2$ does not shrink as n grows.
• An estimator can be biased but consistent.
• Example: $\hat{\theta} = \bar{X} + \frac{1}{n}$; its bias is $\frac{1}{n}$, and both the bias and the variance vanish as $n \to \infty$.
How To Obtain A Point Estimate?

▪ There are several methods. We would be discussing the following two:
▪ Method of Moments Estimator (MME)
▪ Maximum Likelihood Estimator (MLE)
Method of Moments Estimator (MME)

• The k-th population moment of a probability distribution is defined as
$$\mu_k = E(X^k)$$
• If $X_1, X_2, \ldots, X_n$ are IID random variables from a distribution D, then the k-th sample moment is defined as
$$\hat{\mu}_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$$
• We can view $\hat{\mu}_k$ as an estimate of $\mu_k$.
Method of Moments Estimator (MME)

Basic Idea
• Equate the 1st sample moment $\hat{\mu}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$ to the 1st theoretical moment $E(X)$.
• Equate the 2nd sample moment $\hat{\mu}_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$ to the 2nd theoretical moment $E(X^2)$.
• Continue equating sample moments $\hat{\mu}_k$ with the corresponding theoretical moments $E(X^k)$, $k = 3, 4, \ldots$, until you have as many equations as you have parameters.
• Solve for the parameters (a worked sketch follows after the notes below).

Note that
• Such estimates are consistent, i.e., they converge to the true parameter as the sample size gets larger.
• The MME may not be unique.
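As an illustration of the recipe with a one-parameter family (an Exponential(λ) population, chosen here purely for illustration): $E(X) = 1/\lambda$, so equating $\bar{X} = 1/\lambda$ gives the MME $\hat{\lambda} = 1/\bar{X}$. A minimal simulation also shows the consistency noted above:

```python
import numpy as np

rng = np.random.default_rng(2)
true_lambda = 2.0

# MME for Exponential(lambda): E(X) = 1/lambda, so lambda-hat = 1 / x-bar.
for n in (50, 500, 5000):
    x = rng.exponential(scale=1 / true_lambda, size=n)
    lam_hat = 1 / x.mean()
    print(f"n = {n:5d}: lambda-hat = {lam_hat:.3f} (true = {true_lambda})")
```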
Problem

▪ Let $X_1, X_2, \ldots, X_n$ be an IID sample from Uniform[0, b]. Find the MME of b, where b > 0.
▪ Let $X_1, X_2, \ldots, X_n$ be an IID sample from Uniform[-b, b]. Find the MME of b, where b > 0.
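A hint following the recipe above: for Uniform[0, b] the first moment already involves b, but for Uniform[-b, b] the first moment is identically 0 whatever b is, so the second moment must be used instead:
$$\text{Uniform}[0,b]:\; E(X) = \frac{b}{2} = \bar{X} \;\Rightarrow\; \hat{b} = 2\bar{X}; \qquad \text{Uniform}[-b,b]:\; E(X^2) = \frac{b^2}{3} = \hat{\mu}_2 \;\Rightarrow\; \hat{b} = \sqrt{3\hat{\mu}_2}.$$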
