Module Wise Important Formulae
1. Mean:
$\bar{x} = \dfrac{1}{n}(x_1 + x_2 + x_3 + \cdots + x_n) = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
2. Median:
4. Harmonic Mean:
$H = \dfrac{1}{\frac{1}{n}\sum\left(\frac{1}{X}\right)}$; for a frequency distribution, $H = \dfrac{N}{\sum\left(\frac{f}{X}\right)}$
Where, $N = \sum f$
$X$ = mid-value of the variable or mid-value of the class
$f$ = frequency of $X$
Measures of variability or Dispersion:
1. Range:
Range is the difference between the greatest (maximum) and the smallest (minimum)
observation of the distribution.
𝑅𝑎𝑛𝑔𝑒 = 𝑋𝑚𝑎𝑥 − 𝑋𝑚𝑖𝑛
2. Quartile deviation:
It is a measure of dispersion based on the upper quartile 𝑄3 and the lower quartile 𝑄1.
Quartile deviation $(Q.D) = \dfrac{Q_3 - Q_1}{2}$
Where, $Q_i = l + \dfrac{h}{f}\left(\dfrac{iN}{4} - c\right)$, for $i = 1, 2, 3$
4. Standard Deviation:
$Variance = \dfrac{1}{n}\sum(x_i - \bar{x})^2 = \dfrac{1}{n}\sum x_i^2 - \bar{x}^2$
$Standard\ Deviation = S.D = \sigma = \sqrt{\dfrac{1}{n}\sum(x_i - \bar{x})^2}$ (or) $\sqrt{\dfrac{1}{n}\sum x_i^2 - \bar{x}^2}$
Coefficient of dispersion $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Coefficient of variation $= C.V = 100 \times \dfrac{\sigma}{\bar{x}}$
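As a quick illustration of the dispersion formulae above, the following is a minimal Python sketch using only the standard library; the sample values (and the quartile convention of statistics.quantiles) are assumptions for illustration.

```python
import math
import statistics as st

data = [12, 15, 9, 20, 17, 11, 14, 18, 13, 16]   # illustrative sample only

n = len(data)
mean = sum(data) / n
variance = sum((x - mean) ** 2 for x in data) / n    # (1/n) * sum (x - xbar)^2
sd = math.sqrt(variance)

q1, _, q3 = st.quantiles(data, n=4)                  # lower and upper quartiles
quartile_deviation = (q3 - q1) / 2
coeff_dispersion = (q3 - q1) / (q3 + q1)
coeff_variation = 100 * sd / mean                    # C.V = 100 * sigma / xbar

print(mean, sd, quartile_deviation, coeff_dispersion, coeff_variation)
```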
Skewness:
Mean ($M$), median ($M_d$) and mode ($M_0$) fall at different points, i.e., $Mean \neq Median \neq Mode$
Quartiles are not equidistant from the median
The curve drawn with the help of the given data is not symmetrical but stretched more
to one side than to the other.
Measures of Skewness:
$S_k = M - M_d$
$S_k = M - M_0$
$S_k = (Q_3 - M_d) - (M_d - Q_1)$
$S_k = \dfrac{M - M_0}{\sigma}$, where $\sigma$ is the standard deviation of the distribution.
If the mode is ill-defined, then using the empirical relation $M_0 = 3M_d - 2M$ for a moderately asymmetrical distribution, we get
$S_k = \dfrac{3(M - M_d)}{\sigma}$
$S_k = 0$ if $M = M_0 = M_d$. Hence, for a symmetrical distribution, the mean, median and mode coincide.
Prof. Karl Pearson called this the 'convexity of the frequency curve', or Kurtosis.
Kurtosis measures the flatness or peakedness of the frequency curve.
It is measured by the coefficient $\beta_2$, or its derivative $\gamma_2$, given by
$\beta_2 = \dfrac{\mu_4}{\mu_2^2}$, $\gamma_2 = \beta_2 - 3$
A: Leptokurtic curve (more peaked than the normal curve; $\beta_2 > 3$, i.e., $\gamma_2 > 0$)
B: Normal or Mesokurtic curve (neither flat nor peaked; $\beta_2 = 3$, i.e., $\gamma_2 = 0$)
C: Platykurtic curve (flatter than the normal curve; $\beta_2 < 3$, i.e., $\gamma_2 < 0$)
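A small Python sketch (standard library only) of how the skewness and kurtosis coefficients above can be computed from a sample; the data values are made up for illustration.

```python
import math
import statistics as st

data = [2, 3, 3, 4, 5, 5, 6, 8, 9, 15]   # illustrative sample only

n = len(data)
mean = sum(data) / n
median = st.median(data)
mu2 = sum((x - mean) ** 2 for x in data) / n   # second central moment (variance)
mu4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
sigma = math.sqrt(mu2)

sk = 3 * (mean - median) / sigma   # Sk = 3(M - Md) / sigma
beta2 = mu4 / mu2 ** 2             # kurtosis coefficient beta_2
gamma2 = beta2 - 3                 # > 0 leptokurtic, = 0 mesokurtic, < 0 platykurtic

print(sk, beta2, gamma2)
```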
Module 2
Probability
Probability:
If a random experiment or a trial results in 'n' exhaustive, mutually exclusive and equally likely outcomes, out of which 'm' are favourable to the occurrence of an event E, then the probability 'p' of occurrence or happening of E, usually denoted by P(E), is given by
$P(E) = \dfrac{m}{n}$
Conditional Probability:
Let $S$ be the sample space of a random experiment. Let $C_1 \subset S$ and $C_2 \subset S$; then the conditional probability of $C_2$, given that the event $C_1$ has already occurred, denoted by $P(C_2/C_1)$, is defined as
$P(C_2/C_1) = \dfrac{P(C_2 \cap C_1)}{P(C_1)}$, if $P(C_1) \neq 0$
Or
Note:
Bayes theorem:
A Random Variable which takes on a finite (or) countably infinite number of values is
called a Discrete Random Variable.
The set of ordered pairs $(x, f(x))$ is a probability function or Probability Mass Function of the discrete random variable $X$ if
(i) $f(x) \geq 0$
(ii) $\sum f(x) = 1$
The function 𝑓(𝑥) is a Probability Density Function for the Continuous Random
Variable 𝑥 defined over the set of real numbers 𝑅, 𝑖𝑓
(i) 𝑓(𝑥) ≥ 0, ∀ 𝑥 ∈ 𝑅
(ii) $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
(iii) $P(a < X < b) = \int_a^b f(x)\,dx$
Let $X$ be a random variable with probability distribution $f(x)$; then the mean or mathematical expectation of $X$ is denoted by $E(X)$ and is given by
$E(X) = \sum x\, f(x)$, where $X$ is a discrete random variable
$E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx$, where $X$ is a continuous random variable
Let $X$ be a random variable with pdf $f(x)$ and mean $\mu$; then the variance of $X$ is
$V(X) = \sigma^2 = E[(X - \mu)^2] = \sum (x - \mu)^2 f(x)$, where $X$ is a discrete random variable
$V(X) = \sigma^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$, where $X$ is a continuous random variable
The positive square root of the variance is the standard deviation of $X$. It is denoted by $\sigma$ (S.D).
$E(X^2) = \sum x^2 f(x)$ (Discrete)
$E(X^2) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx$ (Continuous)
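A short Python sketch showing how $E(X)$, $E(X^2)$ and $V(X)$ follow from a probability mass function; the pmf below is a hypothetical example.

```python
# Hypothetical pmf of a discrete random variable X (probabilities must sum to 1)
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

mean = sum(x * p for x, p in pmf.items())       # E(X)   = sum x f(x)
ex2 = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2) = sum x^2 f(x)
variance = ex2 - mean ** 2                      # V(X) = E(X^2) - mu^2
sd = variance ** 0.5                            # sigma

print(mean, ex2, variance, sd)
```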
(iii) $P(a \leq X \leq b,\ c \leq Y \leq d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$
If $(X, Y)$ is a two-dimensional continuous random variable, then the marginal density function of the random variable $X$ is defined as
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
The marginal density function of the random variable $Y$ is defined as
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$
$f(y/x) = \dfrac{f(x, y)}{f_X(x)}$ is the conditional probability function of $Y$ given $X$.
Moments:
The $r^{th}$ moment about the origin of a random variable $X$, denoted by $\mu_r'$, is $E(X^r)$, i.e.,
$\mu_0' = E(X^0) = E(1) = 1$
$\mu_1' = E(X^1) = E(X) = \mu$
The second central moment is $\mu_2 = E(X^2) - (E(X))^2 = E(X^2) - \mu^2$, so that
$E(X^2) = \sigma^2 + \mu^2$
The MGF of the distribution of a random variable completely describes the nature of the
distribution.
Let $X$ be a random variable having pdf $f(x)$; then the MGF of the distribution of $X$ is denoted by $M(t)$ and is defined as $M(t) = E(e^{tX})$.
Thus, the MGF $M(t) = \begin{cases} \sum e^{tx} f(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} e^{tx} f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$
The coefficient of $\dfrac{t^r}{r!}$ in $M(t)$ is the $r^{th}$ moment about the origin, $\mu_r'$.
$M'(t) = \int_{-\infty}^{\infty} x\, e^{tx} f(x)\,dx$
$M''(t) = \int_{-\infty}^{\infty} x^2 e^{tx} f(x)\,dx$, ...
Now at $t = 0$:
$M(0) = E(1) = 1$
$M'(0) = E(X) = \mu$
Mean is $\mu = M'(0)$
Variance is $\sigma^2 = M''(0) - (M'(0))^2$
$\mu_r' = \left[\dfrac{\partial^r}{\partial t^r} M(t)\right]_{t=0};\ r = 0, 1, 2, \ldots$
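A small SymPy sketch (SymPy is an assumption, not part of these notes) that recovers the mean and variance of the exponential(λ) distribution from its MGF, $M(t) = \lambda/(\lambda - t)$ (quoted later in these notes), by differentiating at $t = 0$.

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)

# MGF of the exponential(lam) distribution: M(t) = lam / (lam - t)
M = lam / (lam - t)

mean = sp.diff(M, t).subs(t, 0)                   # mu = M'(0)
second_moment = sp.diff(M, t, 2).subs(t, 0)       # M''(0) = E(X^2)
variance = sp.simplify(second_moment - mean**2)   # sigma^2 = M''(0) - (M'(0))^2

print(mean)       # 1/lam
print(variance)   # 1/lam**2
```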
Characteristic function:
The characteristic function is defined as
$\phi(t) = E(e^{itX}) = \sum_{r=0}^{\infty} \dfrac{(it)^r}{r!}\, \mu_r'$
The coefficient of $\dfrac{(it)^r}{r!}$ is the $r^{th}$ moment about the origin, $\mu_r'$.
Module 3
$r_{XY} = \dfrac{Cov(X, Y)}{\sigma_X \sigma_Y}$
If (𝑥1 , 𝑦1 ), (𝑥2 , 𝑦2 ), (𝑥3 , 𝑦3 ), … , (𝑥𝑛 , 𝑦𝑛 ) are 𝑛 pairs of observations of the variables 𝑋 𝑎𝑛𝑑 𝑌
in a bivariate distribution, then
$Cov(x, y) = \dfrac{1}{n}\sum(x - \bar{x})(y - \bar{y})$; $\sigma_x = \sqrt{\dfrac{1}{n}\sum(x - \bar{x})^2}$, $\sigma_y = \sqrt{\dfrac{1}{n}\sum(y - \bar{y})^2}$
$r = \dfrac{\sum d_x d_y}{\sqrt{\sum d_x^2 \sum d_y^2}}$, where $d_x = x - \bar{x}$ and $d_y = y - \bar{y}$
$r = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2] \times [n\sum Y^2 - (\sum Y)^2]}}$
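A minimal Python sketch of the product-moment form of $r$ above; the paired observations are illustrative only.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8]   # illustrative paired observations
y = [2, 4, 5, 4, 5, 7, 8, 9]

n = len(x)
sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sx2 = sum(a * a for a in x)
sy2 = sum(b * b for b in y)

# r = (n*Sxy - Sx*Sy) / sqrt([n*Sx2 - Sx^2][n*Sy2 - Sy^2])
r = (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
print(r)
```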
Properties of correlation coefficient:
If $u = \dfrac{x - A}{h}$ and $v = \dfrac{y - B}{k}$, where A, B, h and k are constants with $h > 0$, $k > 0$, then the correlation coefficient between $x$ and $y$ is the same as the correlation coefficient between $u$ and $v$, i.e., $r(x, y) = r(u, v)$
$r_{xy} = r_{uv}$
$r_{uv} = \dfrac{\sum(u - \bar{u})(v - \bar{v})}{\sqrt{\sum(u - \bar{u})^2 \sum(v - \bar{v})^2}}$
$r_{uv} = \dfrac{n\sum uv - (\sum u)(\sum v)}{\sqrt{[n\sum u^2 - (\sum u)^2] \times [n\sum v^2 - (\sum v)^2]}}$
$\rho = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$
Where, $d$ is the difference between the ranks of the same individual in the two characteristics and $n$ is the number of pairs.
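A Python sketch of Spearman's formula above, assuming no tied ranks (the correction for repeated ranks is described next); the scores are illustrative.

```python
def rank(values):
    """Return 1-based ranks of the values (ties not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

x = [35, 40, 25, 55, 85, 90, 65, 50]   # illustrative scores, no ties
y = [30, 45, 20, 60, 80, 95, 70, 55]

rx, ry = rank(x), rank(y)
n = len(x)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(rho)
```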
Repeated ranks:
In Spearman's formula, add the factor $\dfrac{m(m^2 - 1)}{12}$ to $\sum d^2$, where $m$ is the number of times a value is repeated. This correction factor is to be added for each repeated value in both the series.
Linear Regression:
Let us suppose that in the bivariate distribution $(x_i, y_i)$; $i = 1, 2, 3, \ldots, n$; $y$ is the dependent variable and $x$ is the independent variable. Let the line of regression of $y$ on $x$ be
$y = a + bx$
Regression coefficients:
$y - \bar{y} = b_{yx}(x - \bar{x})$, where the regression coefficient of $y$ on $x$ is $b_{yx} = r\dfrac{\sigma_y}{\sigma_x} = \dfrac{Cov(x, y)}{\sigma_x^2}$
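A small Python sketch of fitting the line of regression of $y$ on $x$ via the regression coefficient above; the data points are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]                 # illustrative data
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

cov_xy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / n
var_x = sum((a - xbar) ** 2 for a in x) / n

b_yx = cov_xy / var_x        # regression coefficient of y on x
a = ybar - b_yx * xbar       # intercept, from y - ybar = b_yx (x - xbar)
print(f"y = {a:.3f} + {b_yx:.3f} x")
```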
Coefficient of Determination:
Note:
$1 - R_{1.23}^2 = \dfrac{\omega}{\omega_{11}}$
Where, $\omega = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}$
and $\omega_{11} = \begin{vmatrix} 1 & r_{23} \\ r_{32} & 1 \end{vmatrix} = 1 - r_{23}^2$
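A Python sketch of the multiple correlation coefficient $R_{1.23}$ computed from the determinants above; the pairwise correlations are hypothetical values.

```python
import math

# Hypothetical pairwise correlations among variables 1, 2 and 3
r12, r13, r23 = 0.7, 0.6, 0.4

omega = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23   # 3x3 correlation determinant
omega11 = 1 - r23**2                                          # minor of the (1,1) element

R123_sq = 1 - omega / omega11    # coefficient of multiple determination
R123 = math.sqrt(R123_sq)        # multiple correlation of variable 1 on 2 and 3
print(R123_sq, R123)
```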
Module 4
Discrete Probability Distributions
Bernoulli’s Distribution:
A random variable $X$ which takes two values 0 and 1 with probabilities $q$ and $p$ respectively, that is, $P(X = 0) = q$ and $P(X = 1) = p$, $q = 1 - p$, is called a Bernoulli discrete random variable. The probability function of the Bernoulli distribution can be written as
$P(X = x) = p^x q^{1-x},\ x = 0, 1$
Mean of $X$ is
$\mu = E(X) = \sum X_i\, P(X_i) = p$
Variance of $X$ is $\sigma^2 = pq$
For the Binomial distribution, the mean of $X$ is
$\mu = E(X) = np$
Variance $V(X) = E[(X - E(X))^2]$
$\sigma^2 = V(X) = npq$
Poisson parameter, λ = np
Mean: $\mu = E(X) = \lambda = np$
Variance: $V(X) = \sigma^2 = \lambda$
$M(t) = e^{\lambda(e^t - 1)}$
Characteristic function:
$\phi(t) = e^{\lambda(e^{it} - 1)}$
Hypergeometric Distribution:
$P(X = k) = h(k; N, M, n) = \dfrac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}$
Mean is $E(X) = np = \dfrac{nM}{N}$, where $p = \dfrac{M}{N}$
Variance is $var(X) = npq \cdot \dfrac{N - n}{N - 1} = \dfrac{nM(N - M)(N - n)}{N^2(N - 1)}$
Covariance:
Uniform Distribution:
A random variable 𝑋 is said to follow uniform distribution over an interval (a, b), if its
probability density function is constant = k (say), over the entire range of X,
$f(x) = \begin{cases} k, & a < x < b \\ 0, & \text{otherwise} \end{cases}$
$f(x) = \begin{cases} \dfrac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$
$\int_{-\infty}^{\infty} f(x)\,dx = \int_a^b \dfrac{1}{b - a}\,dx = 1$, $a < b$; $a$ and $b$ are the two parameters of the uniform distribution, defined between $x = a$ and $x = b$.
Moments:
Mean $= \dfrac{b + a}{2}$
Variance $= \dfrac{(b - a)^2}{12}$
Normal Distribution:
$P(a < X \leq b) = P\left(\dfrac{a - \mu}{\sigma} < Z \leq \dfrac{b - \mu}{\sigma}\right) = \Phi\left(\dfrac{b - \mu}{\sigma}\right) - \Phi\left(\dfrac{a - \mu}{\sigma}\right)$
𝐹(−𝑍) = 1 − 𝐹(𝑍)
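A minimal Python sketch of the standardisation above, evaluating the standard normal CDF Φ with the error function from the standard library; μ, σ, a and b are illustrative values.

```python
import math

def phi(z):
    """Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 50, 10     # illustrative mean and standard deviation
a, b = 45, 62

prob = phi((b - mu) / sigma) - phi((a - mu) / sigma)   # P(a < X <= b)
print(prob)
```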
Exponential Probability Distribution:
The MGF is $M_X(t) = \dfrac{\lambda}{\lambda - t}$
Mean $= \dfrac{1}{\lambda}$
Variance $= \dfrac{1}{\lambda^2}$
Gamma Distribution:
$f(x) = \begin{cases} \dfrac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, & x \geq 0 \\ 0, & \text{otherwise} \end{cases}$
Note:
When 𝒌 = 𝟏, the distribution is called exponential distribution
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ (since $\int_0^{\infty} x^{k-1} e^{-ax}\,dx = \dfrac{\Gamma(k)}{a^k}$)
The MGF is
$M_X(t) = \left(\dfrac{\lambda}{\lambda - t}\right)^k$
Mean $= \dfrac{k}{\lambda}$
Variance $= \dfrac{k}{\lambda^2}$
Beta Distribution:
Mean $\mu = \dfrac{\alpha}{\alpha + \beta}$
Variance $\sigma^2 = \dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$
Note:
Weibull distribution:
Note:
Variance $= \sigma^2 = \alpha^{-2/\beta}\left\{\Gamma\left(1 + \dfrac{2}{\beta}\right) - \left[\Gamma\left(1 + \dfrac{1}{\beta}\right)\right]^2\right\}$
Cumulative distribution function:
$F(x; \alpha, \beta) = \begin{cases} 1 - e^{-\alpha x^{\beta}}, & x \geq 0 \\ 0, & x < 0 \end{cases}$
Module-6
Hypothesis Testing-I
$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
If $\bar{X}$ is the mean of a random sample of size $n$ taken from a normal population having the mean $\mu$ and the finite variance $\sigma^2$, and $s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$, then
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
is a random variable having the t-distribution with parameter $\nu = n - 1$.
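A short Python sketch computing the one-sample t statistic above; the observations and the hypothesised mean are illustrative.

```python
import math

sample = [48, 52, 51, 49, 53, 47, 50, 52, 49, 51]   # illustrative observations
mu0 = 50                                             # hypothesised population mean

n = len(sample)
xbar = sum(sample) / n
s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # sample variance with n - 1
s = math.sqrt(s2)

t = (xbar - mu0) / (s / math.sqrt(n))   # t with nu = n - 1 degrees of freedom
print(xbar, s, t)
```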
Hypothesis testing:
Hypothesis testing is a method for testing a claim/hypothesis about a parameter in a
population using data measured in a sample.
Statistical hypothesis:
In $H_0$, a statement involving equality ($=, \geq, \leq$)
In $H_1$, a statement involving strict inequality ($\neq, >, <$)
Types of test:
Suppose we test for population mean, then
Null hypothesis 𝐻0 : 𝜇 = 𝜇0
Alternative hypothesis 𝐻1 : 𝜇 ≠ 𝜇0 𝑜𝑟 𝜇 > 𝜇0 𝑜𝑟 𝜇 < 𝜇0
If $H_1: \mu \neq \mu_0$, the test is called a two-tailed test.
If $H_1: \mu > \mu_0$, the test is called a right-tailed test (one-tailed).
If $H_1: \mu < \mu_0$, the test is called a left-tailed test (one-tailed).
If 𝑛 ≥ 30 is a large sample
𝐻0 : 𝜇 = 𝜇0
$Z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}}$
Where, $Q = 1 - P$ and $p = \dfrac{X}{n}$ is the sample proportion in a random sample of size $n$.
Test statistics:
$Z = \dfrac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
If $P$ is not known, an unbiased estimate of $P$ based on both samples, given by
$P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$, is used in place of $P$.
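A minimal Python sketch of the two-proportion Z statistic above, using the pooled estimate of P; the counts are illustrative.

```python
import math

x1, n1 = 45, 100    # illustrative successes / sample size, sample 1
x2, n2 = 60, 120    # illustrative successes / sample size, sample 2

p1, p2 = x1 / n1, x2 / n2
P = (n1 * p1 + n2 * p2) / (n1 + n2)   # pooled estimate, equals (x1 + x2) / (n1 + n2)
Q = 1 - P

Z = (p1 - p2) / math.sqrt(P * Q * (1 / n1 + 1 / n2))
print(p1, p2, P, Z)
```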
Module 7
Hypothesis Testing-II
Student’s t-distribution:
1. Statistic for small sample test concerning one mean:
Null hypothesis: 𝐻0 : 𝜇 = 𝜇0
Test Statistic:
$t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Here, $s^2 = \dfrac{\sum(X_i - \bar{X})^2}{n - 1}$
Null hypothesis: 𝐻0 : 𝜇1 − 𝜇2 = 𝑑
Test Statistic:
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
follows t-distribution with 𝑛1 + 𝑛2 − 2 degrees of freedom.
Where, $s^2 = \dfrac{\sum(x_{1i} - \bar{x}_1)^2 + \sum(x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$
Or
$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$
F-distribution:
F-distribution is used to test the equality of the variances of two populations from
which two samples have been drawn.
Null hypothesis: 𝐻0 : 𝜎1 2 = 𝜎2 2
Test statistics:
$F = \dfrac{s_1^2}{s_2^2}$
Where, $s_1^2 = \dfrac{\sum(x_{1i} - \bar{x}_1)^2}{n_1 - 1}$ and $s_2^2 = \dfrac{\sum(x_{2i} - \bar{x}_2)^2}{n_2 - 1}$
Note:
The larger among 𝒔𝟏 𝟐 𝒂𝒏𝒅 𝒔𝟐 𝟐 will be the numerator.
Here ′𝐹′ follows F-distribution with (𝑛1 − 1, 𝑛2 − 1) degrees of freedom.
The critical region value is 𝐹(𝑛1 −1,𝑛2 −1) .
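A short Python sketch of the F test for equality of two variances, placing the larger sample variance in the numerator as noted above; the samples are illustrative.

```python
def sample_variance(data):
    """Unbiased sample variance with divisor n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

sample1 = [23, 27, 25, 30, 28, 26, 24]   # illustrative samples
sample2 = [31, 35, 29, 38, 34, 33]

s1_sq = sample_variance(sample1)
s2_sq = sample_variance(sample2)

# Larger variance in the numerator; degrees of freedom follow the same order.
if s1_sq >= s2_sq:
    F, df = s1_sq / s2_sq, (len(sample1) - 1, len(sample2) - 1)
else:
    F, df = s2_sq / s1_sq, (len(sample2) - 1, len(sample1) - 1)

print(F, df)   # compare F with the tabulated critical value F_alpha(df)
```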
Chi-square distribution (or) 𝝌𝟐 − 𝐝𝐢𝐬𝐭𝐫𝐢𝐛𝐮𝐭𝐢𝐨𝐧:
Hypothesis concerning one variance
Goodness of fit
Test for independence of attributes
Null hypothesis 𝑯𝟎 : 𝜎 2 = 𝜎0 2
Test statistics:
$\chi^2 = \dfrac{(n - 1)s^2}{\sigma_0^2}$
$\chi^2 = \sum_{i=1}^{n}\left[\dfrac{(O_i - E_i)^2}{E_i}\right]$
Note:
If the data are given as a series of 'n' numbers, then
For two attributes A and B, the 2×2 contingency table of observed frequencies with marginal totals is

A:  a      b      a + b
B:  c      d      c + d
    a + c  b + d  N

The expected frequencies are given by (row total × column total) / N, e.g., $E(a) = \dfrac{(a + b)(a + c)}{N}$.
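A small Python sketch of the chi-square test of independence for a 2×2 table using the expected frequencies above; the cell counts are illustrative.

```python
a, b = 30, 20    # illustrative observed counts, first row (attribute A)
c, d = 25, 45    # second row (attribute B)

N = a + b + c + d
observed = [a, b, c, d]
# Expected frequency of each cell = (row total * column total) / N
expected = [(a + b) * (a + c) / N, (a + b) * (b + d) / N,
            (c + d) * (a + c) / N, (c + d) * (b + d) / N]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)   # compare with the chi-square critical value at (2-1)(2-1) = 1 d.f.
```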
ANOVA:
Analysis of Variance is a hypothesis testing technique used to test the equality of two or more
population means by examining the variances of samples that are taken.
Assumptions of ANOVA:
All populations involved follow a normal distribution.
All populations have the same variances.
The samples are randomly selected and independent of one another or the
observations are independent.
Types of ANOVA:
One-way ANOVA: Completely Randomized Design (CRD)
Two-way ANOVA: Randomized Block Design (RBD)
Three-way ANOVA: Latin Square Design (LSD)
$SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i} y_{ij}^2 - C$
$SSB = \sum_{i=1}^{k} \dfrac{T_i^2}{n_i} - C$
To test the $H_0$ that the $k$ population means are equal, we compare two estimates of $\sigma^2$: one based on the variation between the sample means, and one based on the variation within the samples.
ANOVA table:
Decision:
If $F > F_{\alpha,(k-1,\,N-k)}$, reject the null hypothesis $H_0$.
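A minimal Python sketch of a one-way ANOVA (CRD) built from the sums of squares above; the three treatment groups are illustrative data.

```python
groups = [
    [8, 10, 12, 9],      # treatment 1 (illustrative)
    [14, 15, 13, 16],    # treatment 2
    [11, 9, 10, 12],     # treatment 3
]

k = len(groups)
N = sum(len(g) for g in groups)
grand_total = sum(sum(g) for g in groups)
C = grand_total ** 2 / N                               # correction factor

SST = sum(y ** 2 for g in groups for y in g) - C       # total sum of squares
SSB = sum(sum(g) ** 2 / len(g) for g in groups) - C    # between-treatments sum of squares
SSE = SST - SSB                                        # within (error) sum of squares

MSB = SSB / (k - 1)
MSE = SSE / (N - k)
F = MSB / MSE
print(SST, SSB, SSE, F)   # reject H0 if F exceeds F_alpha(k - 1, N - k)
```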
2. Two-way ANOVA classification:
Treatment sum of squares, $SS(Tr) = \dfrac{\sum_{i=1}^{C} T_{i.}^2}{r} - \text{Correction factor}$
Block sum of squares, $SS(Bl) = C\sum_{j=1}^{r}(\bar{y}_{.j} - \bar{y}_{..})^2$
$SS(Bl) = \dfrac{\sum_{j=1}^{r} T_{.j}^2}{C} - \text{Correction factor}$
Error sum of squares, $SSE = \sum_{i=1}^{C}\sum_{j=1}^{r}(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$
Total sum of squares, $SST = \sum_{i=1}^{C}\sum_{j=1}^{r}(y_{ij} - \bar{y}_{..})^2$
$SST = \sum_{i=1}^{C}\sum_{j=1}^{r} y_{ij}^2 - \text{Correction factor}$
Where, the correction factor is given by $\dfrac{T_{..}^2}{Cr}$
$F_{Tr} = \dfrac{MS(Tr)}{MSE} = \dfrac{SS(Tr)/(C - 1)}{SSE/\left((C - 1)(r - 1)\right)}$
Decision: reject $H_0$ if $F_{Tr} > F_{\alpha,\,(C-1,\,(C-1)(r-1))}$
$F_{Bl} = \dfrac{MS(Bl)}{MSE} = \dfrac{SS(Bl)/(r - 1)}{SSE/\left((C - 1)(r - 1)\right)}$
Decision: reject $H_0$ if $F_{Bl} > F_{\alpha,\,(r-1,\,(C-1)(r-1))}$
Degrees of freedom:
𝑫𝑭𝒓𝒐𝒘𝒔 = 𝒏 − 𝟏
𝑫𝑭𝒄𝒐𝒍𝒖𝒎𝒏𝒔 = 𝒏 − 𝟏
𝑫𝑭𝒕𝒓𝒆𝒂𝒕𝒎𝒆𝒏𝒕𝒔 = 𝒏 − 𝟏
𝑫𝑭𝑬𝒓𝒓𝒐𝒓 = (𝒏 − 𝟏)(𝒏 − 𝟐)
Critical region:
𝐹(𝑛−1,(𝒏−𝟏)(𝒏−𝟐) )
$G = \sum\sum x_{ij}$
Correction factor is $C.F = \dfrac{G^2}{N}$
$SST = \sum\sum x_{ij}^2 - C.F$
Sum of squares:
$SSC = \sum \dfrac{C_j^2}{n} - C.F$, where $C_j$ is the column sum of the $j^{th}$ column.
$SSR = \sum \dfrac{R_i^2}{n} - C.F$, where $R_i$ is the row sum of the $i^{th}$ row.
$SSTr = \sum \dfrac{T_i^2}{n} - C.F$, where $T_i$ is the treatment sum of the $i^{th}$ treatment.
ANOVA table: