Formula Sheet

This document provides formulas and estimators for common statistical tests involving single and two population parameters like means, proportions, and differences between populations. It includes point estimators, standard errors, and confidence intervals for parameters like the mean when the population variance is known or unknown, and for proportions when the sample size is large. Confidence intervals are provided for single population means, differences between two population means, and differences between paired sample means.

Uploaded by

Lionel Hector
Formula Sheet

Preliminaries:

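Sample standard deviation:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{x})^2}{n-1}} = \sqrt{\frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}} $$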
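The two forms of the sample standard deviation above are algebraically identical. The following is a minimal Python sketch that computes both and can be used to check one against the other; the function names and the data values are made up for illustration and are not part of the formula sheet.

```python
import math

def sample_sd_definitional(xs):
    """s from squared deviations about the sample mean (first form)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def sample_sd_computational(xs):
    """s from the raw sums of X and X^2 (second form)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

# Illustrative data only
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_sd_definitional(data), sample_sd_computational(data))
```

The second form needs only running totals, which is convenient on a calculator, but it can lose precision when the values are large relative to their spread.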

Estimation: Single Population

Common Point Estimators under Simple Random Sampling Without Replacement


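| Parameter | Point Estimator |
| --- | --- |
| Mean: $\mu$ | Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} X_i}{n}$ |
| Standard error of the sample mean: $SE(\bar{X}) = \sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}}$ | Estimator of the standard error of the sample mean: $\widehat{SE}(\bar{X}) = \sqrt{\left(\frac{N-n}{N}\right)\frac{s^2}{n}}$ |
| Proportion: $p$ | Sample proportion: $\hat{p}$ |
| Standard error of the sample proportion: $SE(\hat{p}) = \sqrt{\left(\frac{N-n}{N-1}\right)\frac{p(1-p)}{n}}$ | Estimator of the standard error of the sample proportion: $\widehat{SE}(\hat{p}) = \sqrt{\left(\frac{N-n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n-1}}$ |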
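As a rough illustration of the estimated standard errors under simple random sampling without replacement, the sketch below evaluates both estimators for a made-up population of N = 1000 and sample of n = 100. The function names are hypothetical, chosen only for this example.

```python
import math

def se_mean_srswor(s2, n, N):
    """Estimated SE of the sample mean: sqrt(((N - n) / N) * s^2 / n)."""
    return math.sqrt((N - n) / N * s2 / n)

def se_prop_srswor(p_hat, n, N):
    """Estimated SE of the sample proportion:
    sqrt(((N - n) / N) * p_hat * (1 - p_hat) / (n - 1))."""
    return math.sqrt((N - n) / N * p_hat * (1 - p_hat) / (n - 1))

# Illustrative inputs only
print(se_mean_srswor(s2=25.0, n=100, N=1000))
print(se_prop_srswor(p_hat=0.4, n=100, N=1000))
```

The factor (N - n)/N shrinks the standard error as the sample takes up a larger share of the population; when N is very large relative to n it is close to 1.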

Common Point Estimators under Random Sampling from an Infinite Population


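| Parameter | Point Estimator |
| --- | --- |
| Mean: $\mu$ | Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} X_i}{n}$ |
| Standard error of the sample mean: $SE(\bar{X}) = \sqrt{\frac{\sigma^2}{n}}$ | Estimator of the standard error of the sample mean: $\widehat{SE}(\bar{X}) = \sqrt{\frac{s^2}{n}}$ |
| Proportion: $p$ | Sample proportion: $\hat{p}$ |
| Standard error of the sample proportion: $SE(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$ | Estimator of the standard error of the sample proportion: $\widehat{SE}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ |
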
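Confidence Intervals (Single Population)

(1 − α)100% Confidence Interval Estimates for the Population Mean

| Case | Confidence Interval |
| --- | --- |
| $\sigma^2$ is known | $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ |
| $\sigma^2$ is unknown | $\left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right)$ |
| $\sigma^2$ is unknown but the sample size is sufficiently large (assume n is sufficiently large if n > 30) | $\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$ |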
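All three cases share the shape "estimate ± critical value × standard error", so one helper can cover them. This is a minimal sketch using only the Python standard library: `NormalDist().inv_cdf` supplies z critical values, while a t critical value is assumed to be read from a t table and passed in by hand. The helper names and the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_critical(alpha):
    """Upper alpha/2 point of the standard normal, z_{alpha/2}."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci(x_bar, sd, n, critical):
    """Interval x_bar +/- critical * sd / sqrt(n)."""
    half_width = critical * sd / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# sigma known (illustrative numbers): 95% z interval
lo, hi = mean_ci(x_bar=50.0, sd=10.0, n=25, critical=z_critical(0.05))
print(lo, hi)  # roughly (46.08, 53.92)

# sigma unknown, n = 25: pass t_{0.025, 24} = 2.064 taken from a t table
t_lo, t_hi = mean_ci(x_bar=50.0, sd=10.0, n=25, critical=2.064)
```

With the same inputs the t interval is slightly wider than the z interval, reflecting the extra uncertainty from estimating σ.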

An approximate (1 − α)100% confidence interval estimate for the population proportion p, when p is not expected to be close to 0 or 1 and n is large:

$$ \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) $$

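The large-sample interval for a proportion can be sketched as follows. The function name `proportion_ci` and the counts (40 successes in 100 trials) are made up for illustration.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Approximate (1 - alpha)100% interval: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Illustrative counts only
p_lo, p_hi = proportion_ci(successes=40, n=100)
print(p_lo, p_hi)  # roughly (0.304, 0.496)
```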
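Confidence Intervals (Two Populations)

(1 − α)100% Confidence Interval Estimates for the Difference Between Two Population Means (Independent Samples)

| Case | Confidence Interval Estimate |
| --- | --- |
| $\sigma_x^2$ and $\sigma_y^2$ are known | $\left((\bar{X}-\bar{Y}) - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_1}+\frac{\sigma_y^2}{n_2}},\ (\bar{X}-\bar{Y}) + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_1}+\frac{\sigma_y^2}{n_2}}\right)$ |
| $\sigma_x^2$ and $\sigma_y^2$ are unknown, $\sigma_x^2 = \sigma_y^2$ | $\left((\bar{X}-\bar{Y}) - t_{\alpha/2,\,v}\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)},\ (\bar{X}-\bar{Y}) + t_{\alpha/2,\,v}\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\right)$, where $v = n_1 + n_2 - 2$ and $s_p^2 = \frac{(n_1-1)s_x^2 + (n_2-1)s_y^2}{n_1+n_2-2}$ |
| $\sigma_x^2$ and $\sigma_y^2$ are unknown, $\sigma_x^2 \neq \sigma_y^2$ | $\left((\bar{X}-\bar{Y}) - t_{\alpha/2,\,v}\sqrt{\frac{s_x^2}{n_1}+\frac{s_y^2}{n_2}},\ (\bar{X}-\bar{Y}) + t_{\alpha/2,\,v}\sqrt{\frac{s_x^2}{n_1}+\frac{s_y^2}{n_2}}\right)$, where $v = \frac{\left(\frac{s_x^2}{n_1}+\frac{s_y^2}{n_2}\right)^2}{\frac{\left(s_x^2/n_1\right)^2}{n_1-1}+\frac{\left(s_y^2/n_2\right)^2}{n_2-1}}$ |
| $\sigma_x^2$ and $\sigma_y^2$ are unknown but $n_1 > 30$ and $n_2 > 30$ | $\left((\bar{X}-\bar{Y}) - z_{\alpha/2}\sqrt{\frac{s_x^2}{n_1}+\frac{s_y^2}{n_2}},\ (\bar{X}-\bar{Y}) + z_{\alpha/2}\sqrt{\frac{s_x^2}{n_1}+\frac{s_y^2}{n_2}}\right)$ |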
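The pooled variance and the degrees of freedom for the unequal-variance case are the two pieces of this table that are easiest to miscompute by hand. Below is a minimal sketch of both; the function names and the sample values are assumptions made for illustration.

```python
def pooled_variance(s2_x, n1, s2_y, n2):
    """s_p^2 = ((n1 - 1) s_x^2 + (n2 - 1) s_y^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s2_x + (n2 - 1) * s2_y) / (n1 + n2 - 2)

def unequal_variance_df(s2_x, n1, s2_y, n2):
    """v = (a + b)^2 / (a^2/(n1 - 1) + b^2/(n2 - 1)),
    with a = s_x^2/n1 and b = s_y^2/n2."""
    a, b = s2_x / n1, s2_y / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Illustrative sample variances and sizes
print(pooled_variance(4.0, 10, 9.0, 15))      # 162/23, about 7.04
print(unequal_variance_df(4.0, 10, 9.0, 15))  # about 22.99
```

The value of v is usually not an integer. A common convention, which the sheet does not state, is to round it down before looking up the t critical value.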

1    100% confidence interval estimate for the mean difference in matched or paired samples
𝑠𝑑 𝑠𝑑
(𝑑̅ − (𝑡𝛼 , 𝑛−1 ) √𝑛, 𝑑̅ + (𝑡𝛼 , 𝑛−1 ) √𝑛)
2 2

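An approximate (1 − α)100% confidence interval for $p_1 - p_2$ when the sample sizes are large:

$$ \left((\hat{p}_1-\hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},\ (\hat{p}_1-\hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\right) $$
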
Tests of Hypothesis (Single Population)

Summary of Hypothesis Tests on the Population Mean

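Case: σ is known

Test statistic:

$$ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu = \mu_0$ | $\mu < \mu_0$ | $z < -z_{\alpha}$ |
| $\mu = \mu_0$ | $\mu > \mu_0$ | $z > z_{\alpha}$ |
| $\mu = \mu_0$ | $\mu \neq \mu_0$ | $\lvert z \rvert > z_{\alpha/2}$ |

Case: σ is not known

Test statistic:

$$ t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu = \mu_0$ | $\mu < \mu_0$ | $t < -t_{\alpha,\,n-1}$ |
| $\mu = \mu_0$ | $\mu > \mu_0$ | $t > t_{\alpha,\,n-1}$ |
| $\mu = \mu_0$ | $\mu \neq \mu_0$ | $\lvert t \rvert > t_{\alpha/2,\,n-1}$ |

Case: σ is not known but the sample size is sufficiently large (assume n is sufficiently large if n > 30)

Test statistic:

$$ t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu = \mu_0$ | $\mu < \mu_0$ | $t < -z_{\alpha}$ |
| $\mu = \mu_0$ | $\mu > \mu_0$ | $t > z_{\alpha}$ |
| $\mu = \mu_0$ | $\mu \neq \mu_0$ | $\lvert t \rvert > z_{\alpha/2}$ |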
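The decision rules for the cases that use a normal critical value can be sketched in a few lines of Python. The function names, the `alternative` labels, and the numbers are hypothetical choices for this example; for the small-sample t case the critical value would come from a t table instead of `NormalDist`.

```python
import math
from statistics import NormalDist

def one_sample_statistic(x_bar, mu0, sd, n):
    """(x_bar - mu0) / (sd / sqrt(n)); sd is sigma if known, otherwise S."""
    return (x_bar - mu0) / (sd / math.sqrt(n))

def reject_with_z(stat, alpha, alternative):
    """Apply the rejection regions that use normal critical values."""
    z = NormalDist().inv_cdf
    if alternative == "less":
        return stat < -z(1 - alpha)
    if alternative == "greater":
        return stat > z(1 - alpha)
    if alternative == "two-sided":
        return abs(stat) > z(1 - alpha / 2)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Illustrative numbers: H0: mu = 50, sample mean 52, sd 10, n = 100
stat = one_sample_statistic(x_bar=52.0, mu0=50.0, sd=10.0, n=100)
print(stat, reject_with_z(stat, 0.05, "two-sided"))
```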

Hypothesis Test on the Population Proportion (For large n)

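Test statistic:

$$ Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} $$

where X is the number of successes.

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $p = p_0$ | $p < p_0$ | $z < -z_{\alpha}$ |
| $p = p_0$ | $p > p_0$ | $z > z_{\alpha}$ |
| $p = p_0$ | $p \neq p_0$ | $\lvert z \rvert > z_{\alpha/2}$ |
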
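The test statistic is written in terms of the count X, but dividing numerator and denominator by n gives the same value in terms of the sample proportion. A small sketch, with hypothetical function names and made-up counts, that checks the two forms agree:

```python
import math

def proportion_z_from_count(x, n, p0):
    """Z = (X - n p0) / sqrt(n p0 (1 - p0))."""
    return (x - n * p0) / math.sqrt(n * p0 * (1 - p0))

def proportion_z_from_p_hat(x, n, p0):
    """Same statistic as (p_hat - p0) / sqrt(p0 (1 - p0) / n)."""
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Illustrative counts: 60 successes in 100 trials, H0: p = 0.5
z_count = proportion_z_from_count(60, 100, 0.5)
z_phat = proportion_z_from_p_hat(60, 100, 0.5)
print(z_count, z_phat)  # both close to 2.0
```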
Tests of Hypothesis (Two Populations)

Summary of Tests of Hypotheses for the Difference of Means

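Case: $\sigma_X^2$ and $\sigma_Y^2$ are known

Test statistic:

$$ Z = \frac{(\bar{X}-\bar{Y}) - d_0}{\sqrt{\frac{\sigma_X^2}{n_1}+\frac{\sigma_Y^2}{n_2}}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y < d_0$ | $z < -z_{\alpha}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y > d_0$ | $z > z_{\alpha}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y \neq d_0$ | $\lvert z \rvert > z_{\alpha/2}$ |

Case: $\sigma_X^2 = \sigma_Y^2 = \sigma^2$ is unknown

Test statistic:

$$ t = \frac{(\bar{X}-\bar{Y}) - d_0}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} $$

where

$$ S_p^2 = \frac{(n_1-1)S_X^2 + (n_2-1)S_Y^2}{n_1+n_2-2}, \qquad v = n_1 + n_2 - 2 $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y < d_0$ | $t < -t_{\alpha,\,v}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y > d_0$ | $t > t_{\alpha,\,v}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y \neq d_0$ | $\lvert t \rvert > t_{\alpha/2,\,v}$ |

Case: $\sigma_X^2 \neq \sigma_Y^2$ and both are unknown

Test statistic:

$$ t = \frac{(\bar{X}-\bar{Y}) - d_0}{\sqrt{\frac{S_X^2}{n_1}+\frac{S_Y^2}{n_2}}} $$

where

$$ v = \frac{\left(\frac{S_X^2}{n_1}+\frac{S_Y^2}{n_2}\right)^2}{\frac{\left(S_X^2/n_1\right)^2}{n_1-1}+\frac{\left(S_Y^2/n_2\right)^2}{n_2-1}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y < d_0$ | $t < -t_{\alpha,\,v}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y > d_0$ | $t > t_{\alpha,\,v}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y \neq d_0$ | $\lvert t \rvert > t_{\alpha/2,\,v}$ |

Case: $\sigma_X^2$ and $\sigma_Y^2$ are unknown but $n_1 > 30$ and $n_2 > 30$

Test statistic:

$$ Z = \frac{(\bar{X}-\bar{Y}) - d_0}{\sqrt{\frac{S_X^2}{n_1}+\frac{S_Y^2}{n_2}}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y < d_0$ | $z < -z_{\alpha}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y > d_0$ | $z > z_{\alpha}$ |
| $\mu_X - \mu_Y = d_0$ | $\mu_X - \mu_Y \neq d_0$ | $\lvert z \rvert > z_{\alpha/2}$ |
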
Test for the difference of population means for paired or matched observations

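Test statistic:

$$ t = \frac{\bar{d} - d_0}{S_d/\sqrt{n}} $$

where

$$ \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}, \qquad S_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i-\bar{d})^2}{n-1}} = \sqrt{\frac{n\sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}{n(n-1)}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\mu_D = \mu_X - \mu_Y = d_0$ | $\mu_D < d_0$ | $t < -t_{\alpha,\,n-1}$ |
| $\mu_D = \mu_X - \mu_Y = d_0$ | $\mu_D > d_0$ | $t > t_{\alpha,\,n-1}$ |
| $\mu_D = \mu_X - \mu_Y = d_0$ | $\mu_D \neq d_0$ | $\lvert t \rvert > t_{\alpha/2,\,n-1}$ |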
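For paired data the test reduces to a one-sample t statistic on the differences. The following sketch forms the differences and returns the statistic with its degrees of freedom; `paired_t` and the two small samples are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(x, y, d0=0.0):
    """t = (d_bar - d0) / (S_d / sqrt(n)) with n - 1 degrees of freedom."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation, divisor n - 1
    return (d_bar - d0) / (s_d / math.sqrt(n)), n - 1

# Illustrative paired measurements
before = [12, 15, 11, 14, 13]
after = [10, 14, 10, 11, 12]
t_stat, df = paired_t(before, after)
print(t_stat, df)  # t close to 4.0 with 4 degrees of freedom
```

For a two-sided test at α = 0.05 the statistic would be compared with the tabled value t(0.025, 4) = 2.776.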

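Hypothesis Test on the Difference of Two Population Proportions (for large n1 and n2)

Test statistic:

$$ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} $$

where

$$ \bar{p} = \frac{x_1 + x_2}{n_1 + n_2} $$

with $x_i$ the number of successes and $n_i$ the sample size in sample i.

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $p_1 = p_2$ | $p_1 < p_2$ | $z < -z_{\alpha}$ |
| $p_1 = p_2$ | $p_1 > p_2$ | $z > z_{\alpha}$ |
| $p_1 = p_2$ | $p_1 \neq p_2$ | $\lvert z \rvert > z_{\alpha/2}$ |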
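Note that the test uses the pooled proportion in the standard error, whereas the confidence interval for the difference uses the two sample proportions separately. A minimal sketch of the test statistic, with a hypothetical function name and made-up counts:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z = (p1_hat - p2_hat) / sqrt(p_bar (1 - p_bar) (1/n1 + 1/n2))."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Illustrative counts: 60/100 successes versus 40/100 successes
z_two = two_proportion_z(60, 100, 40, 100)
print(z_two)  # about 2.83
```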

Test for Independence

Ho: The variables are independent


Ha: The variables are not independent

Test statistic:

$$ X^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^2}{E_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{O_{ij}^2}{E_{ij}} - n $$

where $O_{ij}$ is the observed number of cases in the ith row and jth column, and $E_{ij}$ is the expected number of cases in the ith row and jth column.

Region of rejection: $X^2 > \chi^2_{\alpha,\,(r-1)(c-1)}$
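The sheet does not spell out how the expected counts are obtained. The sketch below assumes the usual formula for a test of independence, expected count = (row total × column total) / n, and computes the statistic and its degrees of freedom for a made-up 2 × 2 table. The function name is hypothetical.

```python
def chi_square_independence(table):
    """Return (X^2, degrees of freedom) for an r x c table of observed counts.

    Assumes E_ij = (row total i) * (column total j) / n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Illustrative observed counts
chi2, chi2_df = chi_square_independence([[10, 20], [20, 10]])
print(chi2, chi2_df)  # about 6.67 with 1 degree of freedom
```

Here every expected count is 15, so the shortcut form gives 1000/15 − 60, the same value. At α = 0.05 the tabled critical value with 1 degree of freedom is 3.841.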


Correlation Coefficient

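$$ r = \frac{n\sum_{i=1}^{n} X_iY_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{\sqrt{\left[n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2\right]\left[n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2\right]}} $$

Test statistic:

$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} $$

| Null Hypothesis $H_0$ | Alternative Hypothesis $H_a$ | Region of Rejection |
| --- | --- | --- |
| $\rho = 0$ | $\rho < 0$ | $t < -t_{\alpha,\,n-2}$ |
| $\rho = 0$ | $\rho > 0$ | $t > t_{\alpha,\,n-2}$ |
| $\rho = 0$ | $\rho \neq 0$ | $\lvert t \rvert > t_{\alpha/2,\,n-2}$ |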
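A short sketch of the correlation coefficient in its raw-sums form, together with the t statistic for testing ρ = 0. The function names and the five data pairs are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )

def correlation_t(r, n):
    """t = r sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(r, correlation_t(r, len(xs)))  # about 0.775 and 2.12
```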

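Regression (Least Squares Estimates)

$$ b_1 = \frac{n\sum_{i=1}^{n} X_iY_i - \left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}, \qquad b_0 = \bar{Y} - b_1\bar{X} $$

$$ \hat{Y}_i = b_0 + b_1 X_i $$

$$ \hat{\sigma}^2 = MSE = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2} $$
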
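The least squares estimates and the MSE can be sketched directly from the sums in the formulas above. The function name `least_squares` and the data are made up for this example.

```python
def least_squares(xs, ys):
    """Return (b0, b1, MSE) for the fitted line Y_hat = b0 + b1 * X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b0 = sy / n - b1 * sx / n          # Y_bar - b1 * X_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return b0, b1, sse / (n - 2)       # MSE = SSE / (n - 2)

# Illustrative data only
b0, b1, mse = least_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(b0, b1, mse)  # about 2.2, 0.6 and 0.8
```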
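A (1 − α)100% confidence interval for β0 is given by:

$$ \left(b_0 - t_{\alpha/2,\,n-2}\,S_{b_0},\ b_0 + t_{\alpha/2,\,n-2}\,S_{b_0}\right) $$

where

$$ S_{b_0} = \sqrt{\frac{(MSE)\left(\sum_{i=1}^{n} X_i^2\right)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}} $$

A (1 − α)100% confidence interval for β1 is given by:

$$ \left(b_1 - t_{\alpha/2,\,n-2}\,S_{b_1},\ b_1 + t_{\alpha/2,\,n-2}\,S_{b_1}\right) $$

where

$$ S_{b_1} = \sqrt{\frac{(MSE)(n)}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}} $$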
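As a hedged illustration of the two standard errors and the resulting interval for the slope, the sketch below takes the x values, an MSE, and a slope estimate as given inputs. All numbers are made up, the function name is hypothetical, and the t critical value 3.182 is the tabled t(0.025, 3), passed in by hand because the standard library has no t quantile function.

```python
import math

def coefficient_standard_errors(xs, mse):
    """Return (S_b0, S_b1) from the x values and the MSE."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    denom = n * sxx - sx ** 2
    s_b0 = math.sqrt(mse * sxx / denom)
    s_b1 = math.sqrt(mse * n / denom)
    return s_b0, s_b1

# Illustrative inputs: five x values, MSE = 0.8, slope estimate 0.6
s_b0, s_b1 = coefficient_standard_errors([1, 2, 3, 4, 5], mse=0.8)
t_crit = 3.182  # t_{0.025, n-2} with n - 2 = 3, read from a t table
slope_ci = (0.6 - t_crit * s_b1, 0.6 + t_crit * s_b1)
print(s_b0, s_b1, slope_ci)
```

In this made-up example the interval for the slope contains 0, which matches the two-sided test: 0.6 divided by the standard error is about 2.12, below 3.182.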

Test for β0

Ho: β0 = 0 vs. Ha: β0 ≠ 0

Test statistic:

$$ t = \frac{b_0}{S_{b_0}} $$

Reject the null hypothesis if: $\lvert t \rvert > t_{\alpha/2,\,n-2}$

Test for β1

Ho: β1 = 0 vs. Ha: β1 ≠ 0

Test statistic:

$$ t = \frac{b_1}{S_{b_1}} $$

Reject the null hypothesis if: $\lvert t \rvert > t_{\alpha/2,\,n-2}$
