
Statistics-Help-Card-Formulas

Uploaded by

Ashraf Mahmoud
Copyright
© © All Rights Reserved

STATISTICS HELP CARD

Summary Measures

Sample Mean
$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum x_i}{n}$

Sample Standard Deviation
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

Probability Rules

Complement Rule: $P(A^C) = 1 - P(A)$
Addition Rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Conditional Probability: $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$
Events A and B are independent if $P(A \mid B) = P(A)$
Events A and B are independent if $P(A \text{ and } B) = P(A)P(B)$
If A and B are disjoint events then $P(A \text{ and } B) = 0$

Sample Proportion
$\hat{p} = \dfrac{x}{n}$
Mean: $E(\hat{p}) = p$
Standard Deviation: $\text{s.d.}(\hat{p}) = \sqrt{\dfrac{p(1 - p)}{n}}$

Sampling Distribution of $\hat{p}$
If the sample size n is large enough (namely, $np \ge 10$ and $n(1 - p) \ge 10$), then the distribution of all possible sample proportion values is approximately $N\left(p, \sqrt{\dfrac{p(1 - p)}{n}}\right)$.

Sample Mean
$\bar{x} = \dfrac{\sum x_i}{n}$
Mean: $E(\bar{X}) = \mu$
Standard Deviation: $\text{s.d.}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$
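As a rough illustration of the summary measures and the sample-proportion formulas, the sketch below computes them with the Python standard library. The function names and the data are made up for this example and are not part of the card.

```python
import math

def sample_mean(xs):
    """x-bar: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_sd(xs):
    """s: square root of the summed squared deviations over n - 1."""
    xbar = sample_mean(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1))

def proportion_sd(p, n):
    """s.d.(p-hat) = sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

def normal_approx_ok(p, n):
    """Large-sample condition from the card: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

data = [2, 4, 4, 6, 9]               # hypothetical observations
print(sample_mean(data))             # 5.0
print(sample_sd(data))               # square root of 28 / 4
print(proportion_sd(0.5, 100))       # s.d. of p-hat when p = 0.5, n = 100
print(normal_approx_ok(0.5, 100))    # True
```

Note that the divisor in `sample_sd` is n - 1, matching the card's sample standard deviation rather than a population formula.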
Sampling Distribution of $\bar{X}$
If X has a Normal distribution with mean $\mu$ and standard deviation $\sigma$, then the distribution of all possible sample mean values is $N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$.

Central Limit Theorem
If X follows any distribution with mean $\mu$ and standard deviation $\sigma$ and the sample size n is large enough, then the distribution of all possible sample mean values is approximately $N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$.

General Discrete Random Variable
Mean: $E(X) = \mu = \sum x_i p_i = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$
Standard Deviation: $\text{s.d.}(X) = \sigma = \sqrt{\sum (x_i - \mu)^2 p_i}$

Standard Score
$\text{Standard Score} = \dfrac{\text{Observation} - \text{Mean}}{\text{Standard Deviation}}$

Z Score
If X follows a Normal distribution with mean $\mu$ and standard deviation $\sigma$, then the random variable $Z = \dfrac{X - \mu}{\sigma}$ has a $N(0, 1)$ distribution.
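The discrete random variable formulas, the z score, and the standard deviation of the sample mean can be sketched as follows. This is a minimal example under assumed inputs: the probability table and the numbers passed in are hypothetical, and the function names are chosen only for illustration.

```python
import math

def discrete_mean(values, probs):
    """E(X) = sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def discrete_sd(values, probs):
    """s.d.(X) = sqrt(sum of (x_i - mu)^2 * p_i)."""
    mu = discrete_mean(values, probs)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

def z_score(x, mu, sigma):
    """Standard score: (observation - mean) / standard deviation."""
    return (x - mu) / sigma

def sd_of_sample_mean(sigma, n):
    """s.d.(X-bar) = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

values = [0, 1, 2]            # hypothetical outcomes
probs = [0.25, 0.5, 0.25]     # hypothetical probabilities, summing to 1
print(discrete_mean(values, probs))
print(discrete_sd(values, probs))
print(z_score(130, 100, 15))
print(sd_of_sample_mean(15, 25))
```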
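One Population Proportion
Parameter: $p$
Statistic: $\hat{p}$
Standard Error: $\text{s.e.}(\hat{p}) = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$
Confidence Interval: $\hat{p} \pm z^* \times \text{s.e.}(\hat{p})$
Conservative Confidence Interval: $\hat{p} \pm \dfrac{z^*}{2\sqrt{n}}$
Sample Size: $n = \left(\dfrac{z^*}{2m}\right)^2$, where m = desired margin of error

Difference in Two Population Proportions
Parameter: $p_1 - p_2$
Statistic: $\hat{p}_1 - \hat{p}_2$
Standard Error: $\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Confidence Interval: $(\hat{p}_1 - \hat{p}_2) \pm z^* \times \text{s.e.}(\hat{p}_1 - \hat{p}_2)$

One Population Mean
Parameter: $\mu$
Statistic: $\bar{x}$
Standard Error: $\text{s.e.}(\bar{x}) = \dfrac{s}{\sqrt{n}}$
Confidence Interval: $\bar{x} \pm t^* \times \text{s.e.}(\bar{x})$, $df = n - 1$

Population Mean of Differences
Parameter: $\mu_d$
Statistic: $\bar{d}$ (also written $\bar{x}_d$)
Standard Error: $\text{s.e.}(\bar{x}_d) = \dfrac{s_d}{\sqrt{n}}$
Confidence Interval: $\bar{x}_d \pm t^* \times \text{s.e.}(\bar{x}_d)$, $df = n - 1$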
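To make the one-proportion interval and sample-size formulas concrete, here is a small sketch. It assumes the multiplier z* is supplied by the caller (1.96 is the usual value for 95% confidence), and it rounds the sample size up to the next whole number, which is common practice but is not stated on the card. All names and inputs are illustrative.

```python
import math

def proportion_ci(p_hat, n, z_star):
    """p-hat +/- z* x s.e.(p-hat), with s.e. = sqrt(p-hat(1 - p-hat) / n)."""
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def conservative_ci(p_hat, n, z_star):
    """p-hat +/- z* / (2 sqrt(n)), the widest interval for a given n."""
    margin = z_star / (2 * math.sqrt(n))
    return p_hat - margin, p_hat + margin

def sample_size(z_star, m):
    """n = (z* / (2m))^2, rounded up (rounding is an assumption here)."""
    return math.ceil((z_star / (2 * m)) ** 2)

print(proportion_ci(0.6, 400, 1.96))     # hypothetical p-hat = 0.6, n = 400
print(conservative_ci(0.6, 400, 1.96))
print(sample_size(1.96, 0.03))           # 1068
```

The conservative interval is never narrower than the standard one, because p-hat(1 - p-hat) is at most 1/4.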

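Large Sample z-Test (One Population Proportion)
$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$

Large Sample z-Test (Difference in Two Population Proportions)
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$

One-Sample t-Test
$t = \dfrac{\bar{x} - \mu_0}{\text{s.e.}(\bar{x})}$, $df = n - 1$

Paired t-Test
$t = \dfrac{\bar{x}_d - 0}{\text{s.e.}(\bar{x}_d)}$, $df = n - 1$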
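The test statistics for proportions and for a single mean can be sketched as below. The counts and summary values are invented for the example, and the function names are assumptions. The paired t statistic is the same one-sample calculation applied to the differences with a null value of 0.

```python
import math

def one_proportion_z(p_hat, p0, n):
    """z = (p-hat - p0) / sqrt(p0(1 - p0) / n); the null value p0 is used in the s.e."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def two_proportion_z(x1, n1, x2, n2):
    """z for p1 - p2 using the pooled estimate (n1*p1-hat + n2*p2-hat) / (n1 + n2)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def one_sample_t(xbar, mu0, s, n):
    """t = (x-bar - mu0) / (s / sqrt(n)), with df = n - 1."""
    return (xbar - mu0) / (s / math.sqrt(n))

print(one_proportion_z(0.56, 0.5, 100))     # hypothetical one-sample data
print(two_proportion_z(60, 100, 40, 100))   # hypothetical success counts
print(one_sample_t(52, 50, 5, 25))          # df = 24
```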

Difference in Two Population Means


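Unpooled (Welch’s)
Parameter: $\mu_1 - \mu_2$
Statistic: $\bar{x}_1 - \bar{x}_2$
Standard Error: $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t^* \times \text{s.e.}(\bar{x}_1 - \bar{x}_2)$, df from technology **
Two-Sample t-Test: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\text{s.e.}(\bar{x}_1 - \bar{x}_2)} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, df from technology **

Pooled
Parameter: $\mu_1 - \mu_2$
Statistic: $\bar{x}_1 - \bar{x}_2$
Standard Error: $\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2) = s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t^* \times \left(\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2)\right)$, $df = n_1 + n_2 - 2$
Pooled Two-Sample t-Test: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\text{pooled s.e.}(\bar{x}_1 - \bar{x}_2)} = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, $df = n_1 + n_2 - 2$

** If technology is not available, use conservative df = the minimum of $n_1 - 1$ and $n_2 - 1$.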
Note: A z-distribution is often used in statistical methods in place of a t-distribution when sample sizes are sufficiently large.
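The unpooled and pooled standard errors for a difference in two means might be computed as in the following sketch. The sample standard deviations and sizes are hypothetical, and the helper names are illustrative only. The Welch degrees of freedom that the card leaves to technology are not computed here; only the conservative fallback is shown.

```python
import math

def unpooled_se(s1, n1, s2, n2):
    """s.e.(x1-bar - x2-bar) = sqrt(s1^2 / n1 + s2^2 / n2)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def pooled_sd(s1, n1, s2, n2):
    """s_p = sqrt(((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2 - 2))."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def pooled_se(s1, n1, s2, n2):
    """pooled s.e. = s_p * sqrt(1 / n1 + 1 / n2)."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

def conservative_df(n1, n2):
    """Fallback df when technology is unavailable: min(n1 - 1, n2 - 1)."""
    return min(n1 - 1, n2 - 1)

s1, n1, s2, n2 = 3, 9, 4, 16      # hypothetical summary statistics
diff = 2.0                        # hypothetical x1-bar minus x2-bar
print(diff / unpooled_se(s1, n1, s2, n2), conservative_df(n1, n2))
print(diff / pooled_se(s1, n1, s2, n2), n1 + n2 - 2)
```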
Pearson Correlation and Linear Regression

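Pearson Correlation and its square
$r = \dfrac{1}{n - 1}\sum\left(\dfrac{x - \bar{x}}{s_x}\right)\left(\dfrac{y - \bar{y}}{s_y}\right)$
$r^2 = \dfrac{SSReg}{SSTotal}$, where $SSTotal = \sum (y - \bar{y})^2 = SSReg + SSE$

Estimate of $\sigma$
$s = \sqrt{MSE} = \sqrt{\dfrac{SSE}{n - 2}}$, where $SSE = \sum (y - \hat{y})^2 = \sum e^2$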

Linear Regression Model

Population Version
Mean: $E(Y \mid x) = \beta_0 + \beta_1 x$
Individual: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i$ is $N(0, \sigma)$

Sample Version
Mean: $\hat{y} = b_0 + b_1 x$
Individual: $y_i = b_0 + b_1 x_i + e_i$

Standard Error of the Sample Slope
$\text{s.e.}(b_1) = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$

Confidence Interval for $\beta_1$
$b_1 \pm t^* \times \text{s.e.}(b_1)$, $df = n - 2$

t-Test for $\beta_1$
$t = \dfrac{b_1 - 0}{\text{s.e.}(b_1)}$, $df = n - 2$

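Parameter Estimators
$b_1 = r\,\dfrac{s_y}{s_x}$
$b_0 = \bar{y} - b_1\bar{x}$

Confidence Interval for the Mean Response
$\hat{y} \pm t^* \times \text{s.e.}(\text{fit})$, $df = n - 2$
where $\text{s.e.}(\text{fit}) = s\sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$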
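The correlation and the parameter estimators can be checked numerically with a short sketch like the one below. The five data points are made up, the function names are assumptions, and `statistics.stdev` is used because it divides by n - 1, like the card's sample standard deviation.

```python
import statistics

def pearson_r(xs, ys):
    """r = 1/(n - 1) * sum of standardized x times standardized y."""
    n = len(xs)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

def least_squares(xs, ys):
    """b1 = r * s_y / s_x and b0 = y-bar - b1 * x-bar."""
    b1 = pearson_r(xs, ys) * statistics.stdev(ys) / statistics.stdev(xs)
    b0 = statistics.mean(ys) - b1 * statistics.mean(xs)
    return b0, b1

def r_squared(xs, ys):
    """r^2 = SSReg / SSTotal, with SSReg = SSTotal - SSE."""
    b0, b1 = least_squares(xs, ys)
    ybar = statistics.mean(ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - ybar) ** 2 for y in ys)
    return (ss_total - sse) / ss_total

xs = [1, 2, 3, 4, 5]      # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]      # hypothetical responses
print(pearson_r(xs, ys))
print(least_squares(xs, ys))
print(r_squared(xs, ys))  # agrees with pearson_r(xs, ys) ** 2 up to rounding
```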

Residuals
$e = y - \hat{y} = \text{observed } y - \text{predicted } y$

Prediction Interval for an Individual Response
$\hat{y} \pm t^* \times \text{s.e.}(\text{pred})$, $df = n - 2$
where $\text{s.e.}(\text{pred}) = \sqrt{s^2 + \left(\text{s.e.}(\text{fit})\right)^2}$
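The regression standard errors build on one another, which the following sketch tries to make visible. It assumes the line has already been fitted: the coefficients b0 = 2.2 and b1 = 0.6 are the least-squares values for the made-up data shown, and every name here is illustrative.

```python
import math

def residual_sd(xs, ys, b0, b1):
    """s = sqrt(SSE / (n - 2)), where SSE is the sum of squared residuals."""
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

def sum_sq_x(xs):
    """Sum of (x_i - x-bar)^2, shared by the slope and fit standard errors."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)

def se_slope(s, xs):
    """s.e.(b1) = s / sqrt(sum of (x - x-bar)^2)."""
    return s / math.sqrt(sum_sq_x(xs))

def se_fit(s, xs, x_new):
    """s.e.(fit) = s * sqrt(1/n + (x - x-bar)^2 / sum of (x_i - x-bar)^2)."""
    xbar = sum(xs) / len(xs)
    return s * math.sqrt(1 / len(xs) + (x_new - xbar) ** 2 / sum_sq_x(xs))

def se_pred(s, fit_se):
    """s.e.(pred) = sqrt(s^2 + s.e.(fit)^2)."""
    return math.sqrt(s ** 2 + fit_se ** 2)

xs = [1, 2, 3, 4, 5]          # hypothetical data
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6             # assumed: least-squares fit for this data
s = residual_sd(xs, ys, b0, b1)
print(b1 / se_slope(s, xs))   # t statistic for the slope, df = n - 2 = 3
print(se_fit(s, xs, 3), se_pred(s, se_fit(s, xs, 3)))
```

Because s.e.(pred) adds $s^2$ under the square root, a prediction interval is always wider than the confidence interval for the mean response at the same x.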

Chi-Square Tests
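Test for Goodness of Fit
Expected Count: $\text{Expected} = np_{i0}$
Test Statistic: $\chi^2 = \sum \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$, $df = k - 1$

Test of Independence
Expected Count: $\text{Expected} = \dfrac{(\text{row total})(\text{column total})}{\text{total } n}$
Test Statistic: $\chi^2 = \sum \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$, $df = (r - 1)(c - 1)$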
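Both chi-square statistics share the same sum and differ only in how the expected counts are formed, as the sketch below illustrates. The observed counts, null proportions, and function names are hypothetical and chosen for easy arithmetic.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def gof_expected(n, null_probs):
    """Goodness of fit: expected count = n * p_i0 for each category."""
    return [n * p for p in null_probs]

def independence_expected(table):
    """Independence: expected = (row total)(column total) / total n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Goodness of fit with k = 3 categories, so df = k - 1 = 2
observed = [20, 50, 30]
expected = gof_expected(sum(observed), [0.25, 0.5, 0.25])
print(chi_square(observed, expected))     # 2.0

# Independence for a 2 x 2 table, so df = (2 - 1)(2 - 1) = 1
table = [[10, 20], [30, 40]]
exp_table = independence_expected(table)
flat_obs = [cell for row in table for cell in row]
flat_exp = [cell for row in exp_table for cell in row]
print(chi_square(flat_obs, flat_exp))
```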

Properties of a Chi-Square Distribution


A $\chi^2$ random variable has mean = df and standard deviation = $\sqrt{2\,df}$.
