Module Wise Important Formulae

Module 1

Introduction to Statistics & Data Analysis

Measures of Central Tendency:

1. Mean:

 The arithmetic mean of a set of observations is their sum divided by the number of observations; for example, the arithmetic mean $\bar{x}$ of $n$ observations $x_1, x_2, x_3, \ldots, x_n$ is given by:

$$\bar{x} = \frac{1}{n}(x_1 + x_2 + x_3 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$$

 In the case of a frequency distribution $x_i \mid f_i,\ i = 1, 2, 3, \ldots, n$, where $f_i$ is the frequency of the value $x_i$,

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n}{f_1 + f_2 + f_3 + \cdots + f_n} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \quad \text{where } N = \sum_{i=1}^{n} f_i$$

 In the case of a grouped or continuous frequency distribution, the arithmetic mean is

$$\bar{x} = A + \frac{h}{N}\sum_{i=1}^{n} f_i d_i$$

where $A$ is the assumed mean, $h$ is the class width, and $d_i = (x_i - A)/h$.

2. Median:

 In case of ungrouped data, if the number of observations is odd then median is


the middle value after the values have been arranged in ascending or descending
order of magnitude.
 In the case of an even number of observations, there are two middle terms, and the median is obtained by taking the arithmetic mean of these two middle terms.
 In the case of a continuous frequency distribution, the class corresponding to the c.f. just greater than $\frac{N}{2}$ is called the median class, and the value of the median is obtained by the following formula:

$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - c\right)$$

where $l$ is the lower limit of the median class,
$f$ is the frequency of the median class,
$h$ is the magnitude (width) of the median class,
$c$ is the c.f. of the class preceding the median class, and
$N = \sum f$.
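As a quick check of these definitions, here is a minimal Python sketch (standard library only) that computes the mean and median of ungrouped data:

```python
# Minimal sketch: mean and median of ungrouped data.
def mean(xs):
    return sum(xs) / len(xs)          # x̄ = (1/n) Σ xᵢ

def median(xs):
    s = sorted(xs)                    # arrange in ascending order
    n = len(s)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the middle value
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2  # even n: mean of the two middle terms

data = [7, 3, 9, 4, 6]
print(mean(data), median(data))       # 5.8 6
```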
3. Geometric mean:

 The geometric mean, usually abbreviated as G.M., of a set of $n$ observations is the $n$th root of their product. Thus, if $X_1, X_2, X_3, \ldots, X_n$ are the $n$ observations, then their G.M. is given by

$$G.M. = \sqrt[n]{X_1 \times X_2 \times X_3 \times \cdots \times X_n} = (X_1 \times X_2 \times X_3 \times \cdots \times X_n)^{1/n}$$

If 𝑛 = 2 i.e., if we take two observations, then 𝐺. 𝑀 = √𝑋1 × 𝑋2

 The logarithm of the G.M. of a set of observations is the arithmetic mean of their logarithms:

$$G.M. = \text{Antilog}\left(\frac{1}{n}\sum \log X\right)$$

4. Harmonic Mean:

 If $X_1, X_2, X_3, \ldots, X_n$ is a given set of $n$ observations, then their harmonic mean, abbreviated as H.M., is

$$H = \frac{1}{\frac{1}{n}\left[\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3} + \cdots + \frac{1}{X_n}\right]} = \frac{1}{\frac{1}{n}\sum\left(\frac{1}{X}\right)}$$

 In the case of a frequency distribution, we have

$$\frac{1}{H} = \frac{1}{N}\left(\frac{f_1}{X_1} + \frac{f_2}{X_2} + \cdots + \frac{f_n}{X_n}\right)$$

where $N = \sum f$,
$X$ = mid-value of the variable or mid-value of the class, and
$f$ = frequency of $X$.
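The logarithmic form of the G.M. is numerically safer than multiplying many values directly. A minimal sketch of both means, assuming strictly positive data:

```python
import math

# Minimal sketch: geometric and harmonic means (assumes all x > 0).
def geometric_mean(xs):
    # G.M. = antilog((1/n) Σ log x) -- avoids overflow from the raw product
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # H = n / Σ (1/x)
    return len(xs) / sum(1 / x for x in xs)

data = [2, 4, 8]
print(geometric_mean(data))   # 4.0
print(harmonic_mean(data))    # ≈ 3.43
```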
Measures of variability or Dispersion:

1. Range:
Range is the difference between the greatest (maximum) and the smallest (minimum)
observation of the distribution.
𝑅𝑎𝑛𝑔𝑒 = 𝑋𝑚𝑎𝑥 − 𝑋𝑚𝑖𝑛

2. Quartile deviation:

It is a measure of dispersion based on the upper quartile 𝑄3 and the lower quartile 𝑄1.

$$Q.D. = \frac{Q_3 - Q_1}{2}$$

where $Q_i = l + \frac{h}{f}\left(\frac{iN}{4} - c\right)$, for $i = 1, 2, 3$, and

$Q_1$ = first quartile
$Q_2$ = second quartile
$Q_3$ = third quartile

3. Mean Deviation (or) Absolute mean deviation:

For ungrouped or raw data:

$$M.D. = \frac{1}{n}\sum_i |x_i - \bar{x}|$$

For a frequency distribution:

$$M.D. = \frac{1}{N}\sum_i f_i |x_i - \bar{x}|$$

4. Standard Deviation:
$$\text{Variance} = \frac{1}{n}\sum(x_i - \bar{x})^2 = \frac{1}{n}\sum x_i^2 - \bar{x}^2$$

$$\text{Standard Deviation} = S.D. = \sigma = \sqrt{\frac{1}{n}\sum(x_i - \bar{x})^2} \;\text{ (or) }\; \sqrt{\frac{1}{n}\sum x_i^2 - \bar{x}^2}$$

$$\text{Coefficient of dispersion} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

$$\text{Coefficient of variation} = C.V. = 100 \times \frac{\sigma}{\bar{x}}$$
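A minimal sketch of the population variance, standard deviation, mean deviation, and coefficient of variation as defined above:

```python
import math

# Minimal sketch: dispersion measures (population forms, dividing by n).
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)   # (1/n) Σ (xᵢ - x̄)²

def std_dev(xs):
    return math.sqrt(variance(xs))

def mean_deviation(xs):
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)     # (1/n) Σ |xᵢ - x̄|

def coeff_of_variation(xs):
    return 100 * std_dev(xs) / (sum(xs) / len(xs))   # C.V. = 100 σ / x̄

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), std_dev(data))   # 4.0 2.0
print(mean_deviation(data))            # 1.5
print(coeff_of_variation(data))        # 40.0
```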
Skewness:
 Mean ($M$), median ($M_d$) and mode ($M_0$) fall at different points, i.e., $\text{Mean} \neq \text{Median} \neq \text{Mode}$.
 The quartiles are not equidistant from the median.
 The curve drawn with the help of the given data is not symmetrical but is stretched more to one side than to the other.

Measures of Skewness:

Various measures of Skewness (𝑆𝑘 ) are:

 𝑆𝑘 = 𝑀 − 𝑀𝑑
 𝑆𝑘 = 𝑀 − 𝑀0
 𝑆𝑘 = (𝑄3 − 𝑀𝑑 ) − (𝑀𝑑 − 𝑄1 )

These are the absolute measures of Skewness.


1. Prof. Karl Pearson’s Coefficient of Skewness:

$$S_k = \frac{M - M_0}{\sigma}$$

where $\sigma$ is the standard deviation of the distribution. If the mode is ill-defined, then using the empirical relation $M_0 = 3M_d - 2M$ for a moderately asymmetrical distribution, we get

$$S_k = \frac{3(M - M_d)}{\sigma}$$
$S_k = 0$ if $M = M_0 = M_d$. Hence, for a symmetrical distribution, the mean, median and mode all coincide.

2. Prof. Bowley’s Coefficient of Skewness:

$$S_k = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$

$S_k = 0$ if $Q_3 - M_d = M_d - Q_1$. Hence, for a symmetrical distribution, the median is equidistant from the upper and lower quartiles.

3. Based upon the moments, coefficient of skewness:


$$S_k = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$

$S_k = 0$ if either $\beta_1 = 0$ or $\beta_2 = -3$.
Kurtosis:

 Prof. Karl Pearson calls it the 'convexity of the frequency curve', or kurtosis.
 Kurtosis measures the flatness or peakedness of the frequency curve.
 It is measured by the coefficient $\beta_2$, or its derived measure $\gamma_2$, given by

$$\beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3$$

A: Leptokurtic curve (more peaked than the normal curve; $\beta_2 > 3$, i.e. $\gamma_2 > 0$)

B: Normal or Mesokurtic curve (neither flat nor peaked; $\beta_2 = 3$, i.e. $\gamma_2 = 0$)

C: Platykurtic curve (flatter than the normal curve; $\beta_2 < 3$, i.e. $\gamma_2 < 0$)
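A minimal sketch of the moment-based quantities above, computing $\beta_2$ (and $\gamma_2$) together with Karl Pearson's skewness coefficient from raw ungrouped data:

```python
import math

# Minimal sketch: central moments, Pearson skewness, and kurtosis (β₂, γ₂).
def central_moment(xs, r):
    m = sum(xs) / len(xs)
    return sum((x - m) ** r for x in xs) / len(xs)   # μᵣ = (1/n) Σ (x - x̄)ʳ

def pearson_skewness(xs):
    # Sk = 3(M - M_d)/σ, the form used when the mode is ill-defined
    s = sorted(xs)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    sigma = math.sqrt(central_moment(xs, 2))
    return 3 * (sum(xs) / n - median) / sigma

def kurtosis(xs):
    beta2 = central_moment(xs, 4) / central_moment(xs, 2) ** 2   # β₂ = μ₄/μ₂²
    return beta2, beta2 - 3                                      # (β₂, γ₂)

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
print(pearson_skewness(data))   # > 0: stretched toward the right
print(kurtosis(data))
```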
Module 2
Probability

 Probability:

If a random experiment or a trial results in 'n' exhaustive, mutually exclusive and equally likely outcomes, out of which 'm' are favourable to the occurrence of an event E, then the probability 'p' of the occurrence or happening of E, usually denoted by P(E), is given by

$$p = P(E) = \frac{\text{No. of favourable cases}}{\text{Total no. of exhaustive cases}} = \frac{m}{n}$$

 Conditional Probability:

Let $S$ be the sample space of a random experiment and let $C_1, C_2 \subset S$. Then the conditional probability of $C_2$ given that $C_1$ has already occurred, denoted by $P(C_2/C_1)$, is defined as

$$P(C_2/C_1) = \frac{P(C_2 \cap C_1)}{P(C_1)}, \quad \text{if } P(C_1) \neq 0$$

or

$$P(C_2 \cap C_1) = P(C_1)\,P(C_2/C_1)$$

Note:

If 𝐶1 , 𝐶2 , 𝐶3 are any three events, then


𝑃(𝐶1 ∩ 𝐶2 ∩ 𝐶3 ) = 𝑃(𝐶1 )𝑃(𝐶2 /𝐶1 )𝑃(𝐶3 /𝐶1 ∩ 𝐶2 ), …

 Bayes theorem:

Let $C_1, C_2, C_3, \ldots, C_n$ be a partition of the sample space and let $C$ be any event which is a subset of $\bigcup_{i=1}^{n} C_i$ such that $P(C) > 0$. Then

$$P(C_i/C) = \frac{P(C_i)\,P(C/C_i)}{\sum_{i=1}^{n} P(C_i)\,P(C/C_i)}$$
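A minimal numeric sketch of Bayes' theorem over a two-event partition; the priors and conditional probabilities below are made-up illustration values (two machines producing 60%/40% of items, with defect rates 2% and 5%):

```python
# Minimal sketch: Bayes' theorem over a partition C₁, ..., Cₙ.
priors = [0.6, 0.4]          # P(Cᵢ): share of items from each machine
likelihoods = [0.02, 0.05]   # P(C/Cᵢ): defect rate of each machine

# Denominator: total probability Σ P(Cᵢ) P(C/Cᵢ)
evidence = sum(p * l for p, l in zip(priors, likelihoods))
# Posterior P(Cᵢ/C) for each cause, given a defective item was observed
posteriors = [p * l / evidence for p, l in zip(priors, likelihoods)]

print(posteriors)   # [0.375, 0.625]
```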

Discrete Random Variable:

A Random Variable which takes on a finite (or) countably infinite number of values is
called a Discrete Random Variable.

Continuous Random Variable:

A Random Variable which takes on an uncountably infinite number of values is called a non-Discrete (or) Continuous Random Variable.

Probability Mass Function (P.M.F):

The set of ordered pairs $(x, f(x))$ is a probability function or Probability Mass Function of a Discrete Random Variable $X$ if, for each possible outcome $x$,

(i) $f(x) \ge 0$
(ii) $\sum f(x) = 1$
(iii) $P(X = x) = f(x)$

The Probability Mass Function is also denoted by 𝑃𝑋 (𝑥) = 𝑃(𝑋 = 𝑥).

Probability Density Function (P.D.F):

The function 𝑓(𝑥) is a Probability Density Function for the Continuous Random
Variable 𝑥 defined over the set of real numbers 𝑅, 𝑖𝑓

(i) $f(x) \ge 0,\ \forall x \in R$
(ii) $\int_{-\infty}^{+\infty} f(x)\,dx = 1$
(iii) $P(a < X < b) = \int_a^b f(x)\,dx$

Cumulative Distribution Function:

The Cumulative Distribution Function of a discrete random variable $X$ with probability distribution function $f(x)$ is defined as

$$F(x) = P(X \le x) = \sum_{t \le x} f(t)$$

Mathematical Expectation, Variance and Standard deviation:

 Let $X$ be a random variable with probability distribution $f(x)$. Then the mean or mathematical expectation of $X$ is denoted by $E(X)$ and is given by

$E(X) = \sum x\,f(x)$, where $X$ is a discrete random variable
$E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx$, where $X$ is a continuous random variable

 Let $X$ be a random variable with pdf $f(x)$ and mean $\mu$. Then the variance of $X$ is
 $V(X) = \sigma^2 = E[(X - \mu)^2] = \sum (x - \mu)^2 f(x)$, where $X$ is a discrete random variable
 $V(X) = \sigma^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$, where $X$ is a continuous random variable
 The positive square root of the variance is the standard deviation of $X$, denoted by $\sigma$ (S.D.).
 $E(X^2) = \sum x^2 f(x)$ (discrete)
 $E(X^2) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx$ (continuous)
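A minimal sketch of $E(X)$, $V(X)$ and $\sigma$ for a discrete random variable given as value/probability pairs:

```python
import math

# Minimal sketch: mean, variance, S.D. of a discrete random variable.
# pmf maps each value x to f(x); the probabilities must sum to 1.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}   # e.g. number of heads in two coin tosses

mean = sum(x * p for x, p in pmf.items())       # E(X) = Σ x f(x)
ex2 = sum(x * x * p for x, p in pmf.items())    # E(X²) = Σ x² f(x)
var = ex2 - mean ** 2                           # V(X) = E(X²) − [E(X)]²
sd = math.sqrt(var)

print(mean, var, sd)   # 1.0 0.5 0.707...
```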

Marginal Probability Distribution

 Let $(X, Y)$ be a two-dimensional discrete random variable. Then the marginal probability function of the random variable $X$ is defined as

$$P(X = x_i) = \sum_{j=1}^{m} P_{ij} = P_{i*}$$

 The marginal probability function of the random variable $Y$ is defined as

$$P(Y = y_j) = \sum_{i=1}^{n} P_{ij} = P_{*j}$$

 The marginal distribution of $X$ is the collection of pairs $(x_i, P_{i*})$ and that of $Y$ is $(y_j, P_{*j})$.

Conditional Probability Distribution

Let (𝑋, 𝑌) be two-dimensional discrete random variable, then

 $P(X = x_i / Y = y_j) = \dfrac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \dfrac{P_{ij}}{P_{*j}}$

 $P(Y = y_j / X = x_i) = \dfrac{P(X = x_i, Y = y_j)}{P(X = x_i)} = \dfrac{P_{ij}}{P_{i*}}$

Continuous random variables 𝑿 𝒂𝒏𝒅 𝒀:

Joint Probability Density function of (𝑿, 𝒀)

Let $(X, Y)$ be a two-dimensional continuous random variable such that

$$P\left(x - \frac{dx}{2} \le X \le x + \frac{dx}{2},\; y - \frac{dy}{2} \le Y \le y + \frac{dy}{2}\right) = f(x, y)\,dx\,dy$$
Then 𝑓(𝑋, 𝑌) is called the joint density function of (𝑋, 𝑌), if it satisfies the following conditions:

(i) $f(x, y) \ge 0$, for all $(x, y) \in R$, where $R$ is the range space.

(ii) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

Moreover, if $(a, b), (c, d) \in R$, then

(iii) $P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$

Marginal Probability Distribution:

When $(X, Y)$ is a two-dimensional continuous random variable, the marginal density function of the random variable $X$ is defined as

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$

The marginal density function of the random variable $Y$ is defined as

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$

Conditional Probability Distribution

Let (𝑋, 𝑌) be two-dimensional continuous random variable, then


$$f(x/y) = \frac{f(x, y)}{f_Y(y)}$$

is the conditional probability function of $X$ given $Y$.

$$f(y/x) = \frac{f(x, y)}{f_X(x)}$$

is the conditional probability function of $Y$ given $X$.

Moments:
 The $r$th moment about the origin of a random variable $X$, denoted by $\mu_r'$, is $E(X^r)$, i.e.,
$\mu_0' = E(X^0) = E(1) = 1$
$\mu_1' = E(X^1) = E(X) = \mu$
The second central moment (the variance) is $\mu_2 = E(X^2) - (E(X))^2 = E(X^2) - \mu^2$, so that
$E(X^2) = \sigma^2 + \mu^2$

Moment Generating function (MGF):

 The MGF of the distribution of a random variable completely describes the nature of the
distribution.

 Let $X$ be a random variable having pdf $f(x)$. Then the MGF of the distribution of $X$ is denoted by $M(t)$ and is defined as $M(t) = E(e^{tX})$.

Thus, the MGF is
$$M(t) = \begin{cases} \sum e^{tx} f(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} e^{tx} f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$

We know that $M(t) = E(e^{tX})$, so expanding the exponential,

$$M(t) = \sum_{r=0}^{\infty} \frac{t^r}{r!}\,\mu_r'$$

The coefficient of $\frac{t^r}{r!}$ is $\mu_r'$, the $r$th moment about the origin.

 If $X$ is a continuous random variable, then the MGF is

$$M(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$$

$$M'(t) = \int_{-\infty}^{\infty} x\,e^{tx} f(x)\,dx$$

$$M''(t) = \int_{-\infty}^{\infty} x^2\,e^{tx} f(x)\,dx, \ \ldots$$

Now at $t = 0$:

$M(0) = E(1) = 1$

$M'(0) = E(X) = \mu$

$M''(0) = E(X^2) = \sigma^2 + \mu^2$

Mean: $\mu = M'(0)$

Variance: $\sigma^2 = M''(0) - \left(M'(0)\right)^2$

$$\mu_r' = \left.\frac{\partial^r}{\partial t^r} M(t)\right|_{t=0}; \quad r = 0, 1, 2, \ldots$$
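As an illustration of reading moments off an MGF, here is a small sympy sketch (assuming sympy is available) applied to $M(t) = \lambda/(\lambda - t)$, the MGF of the exponential distribution derived in Module 5:

```python
import sympy as sp

# Sketch: mean and variance from an MGF by differentiating at t = 0.
t = sp.symbols('t')
lam = sp.symbols('lambda', positive=True)

M = lam / (lam - t)                           # M(t) = E(e^{tX})
mean = sp.diff(M, t).subs(t, 0)               # μ = M'(0) = 1/λ
var = sp.diff(M, t, 2).subs(t, 0) - mean**2   # σ² = M''(0) − (M'(0))² = 1/λ²
print(mean, var)                              # 1/lambda, lambda**(-2)
```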

Characteristic function:
 The characteristic function is defined as

$$\phi_X(t) = E(e^{itX}) = \begin{cases} \sum_x e^{itx} f(x), & \text{for a discrete probability distribution} \\ \int e^{itx} f(x)\,dx, & \text{for a continuous probability distribution} \end{cases}$$

 If $F_X(x)$ is the distribution function of a continuous random variable $X$, then

$$\phi_X(t) = \int_{-\infty}^{\infty} e^{itx}\,dF(x)$$

$$\phi_X(t) = \sum_{r=0}^{\infty} \frac{(it)^r}{r!}\,\mu_r'$$

The coefficient of $\frac{(it)^r}{r!}$ is $\mu_r'$, the $r$th moment about the origin.
Module 3

Correlation and Regression

Karl Pearson’s coefficient of Correlation (Covariance method):

 The correlation coefficient between two variables $X$ and $Y$, usually denoted by $r(X, Y)$, $r_{XY}$ or simply $r$, is a numerical measure of the linear relationship between them and is defined as:

$$r_{XY} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$$

 If (𝑥1 , 𝑦1 ), (𝑥2 , 𝑦2 ), (𝑥3 , 𝑦3 ), … , (𝑥𝑛 , 𝑦𝑛 ) are 𝑛 pairs of observations of the variables 𝑋 𝑎𝑛𝑑 𝑌
in a bivariate distribution, then

$$Cov(x, y) = \frac{1}{n}\sum(x - \bar{x})(y - \bar{y}); \quad \sigma_x = \sqrt{\frac{1}{n}\sum(x - \bar{x})^2}, \quad \sigma_y = \sqrt{\frac{1}{n}\sum(y - \bar{y})^2}$$

 the summation being taken over the $n$ pairs of observations. Hence

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \sum d_y^2}}$$

Where, 𝑑𝑥 = 𝑥 − 𝑥̅ 𝑎𝑛𝑑 𝑑𝑦 = 𝑦 − 𝑦̅.

 Equivalently, $r$ can be computed directly from the raw observations as

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right] \times \left[n\sum Y^2 - (\sum Y)^2\right]}}$$
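A minimal sketch of this computational form:

```python
import math

# Minimal sketch: Karl Pearson's correlation coefficient (raw-sums form).
def pearson_r(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return num / den

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(pearson_r(x, y))   # ≈ 0.775
```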
Properties of correlation coefficient:

 Pearson coefficient cannot exceed 1 numerically. In other words, it lies between -1


and +1 i.e., −1 ≤ 𝑟 ≤ 1
 The correlation coefficient is independent of the change of origin and scale. Mathematically, if $X$ and $Y$ are the given variables and they are transformed to the new variables $u$ and $v$ by a change of origin and scale,

$$u = \frac{x - A}{h} \quad \text{and} \quad v = \frac{y - B}{k}, \quad h > 0,\ k > 0,$$

where $A$, $B$, $h$ and $k$ are constants, then the correlation coefficient between $x$ and $y$ is the same as the correlation coefficient between $u$ and $v$, i.e., $r(x, y) = r(u, v)$, or

$r_{xy} = r_{uv}$

∑(𝑢 − 𝑢̅)(𝑣 − 𝑣̅ )
𝑟𝑢𝑣 =
√∑(𝑢 − 𝑢̅)2 ∑(𝑣 − 𝑣̅ )2

𝑛 ∑ 𝑢𝑣 − (∑ 𝑢)(∑ 𝑣)
𝑟𝑢𝑣 =
√[𝑛 ∑ 𝑢2 − (∑ 𝑢)2 ] × [𝑛 ∑ 𝑣 2 − (∑ 𝑣)2 ]

 Two independent variables are uncorrelated i.e., 𝑟𝑥𝑦 = 0.


 $r(aX + b,\ cY + d) = \dfrac{a \times c}{|a \times c|}\, r(X, Y)$

Rank Correlation method:


 Spearman’s rank correlation coefficient, usually denoted by 𝜌 (Rho) is given by the
formula

6 ∑ 𝑑2
𝜌=1−
𝑛(𝑛2 −1)

Where, 𝑑 is the difference between the pair of ranks of the same individual in the two
characteristics and 𝑛 is the number of pairs.
Repeated ranks:
 In Spearman's formula, add the factor $\frac{m(m^2 - 1)}{12}$ to $\sum d^2$, where $m$ is the number of times a value is repeated. This correction factor is to be added for each repeated value in both series.
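A minimal sketch of Spearman's ρ for the no-ties case (all ranks distinct), on illustrative data:

```python
# Minimal sketch: Spearman's rank correlation (no repeated ranks).
def spearman_rho(xs, ys):
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = rank               # rank 1 = smallest value
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # Σ d²
    return 1 - 6 * d2 / (n * (n * n - 1))

x = [86, 97, 99, 100, 101, 103, 106, 110, 112, 113]
y = [2, 20, 28, 27, 50, 29, 7, 17, 6, 12]
print(spearman_rho(x, y))   # ≈ -0.176
```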

Linear Regression:
 Let us suppose that in the bivariate distribution $(x_i, y_i);\ i = 1, 2, 3, \ldots, n$, $y$ is the dependent variable and $x$ is the independent variable. Let the line of regression of $y$ on $x$ be

$$y = a + bx$$

 The line of regression of 𝑌 𝑜𝑛 𝑋 passes through the point (𝑥̅ , 𝑦̅ )


𝑦̅ = 𝑎 + 𝑏𝑥̅

Regression coefficients:

 The equation of the line of regression of $x$ on $y$ is

$$x - \bar{x} = b_{xy}(y - \bar{y})$$

 The equation of the line of regression of $y$ on $x$ is

$$y - \bar{y} = b_{yx}(x - \bar{x})$$

where $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$ and $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ are the regression coefficients.
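A minimal sketch fitting the line of regression of $y$ on $x$ using $b_{yx} = \sum(x - \bar{x})(y - \bar{y}) / \sum(x - \bar{x})^2$:

```python
# Minimal sketch: line of regression of y on x.
def regression_y_on_x(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b_yx = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    a = my - b_yx * mx        # the line passes through (x̄, ȳ)
    return a, b_yx            # y = a + b_yx · x

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = regression_y_on_x(x, y)
print(a, b)   # 2.2 0.6  →  y = 2.2 + 0.6x
```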

Coefficient of Determination:

 The coefficient is given by the square of the correlation coefficient i.e.,


explained variance
𝑟2 =
total variance

Coefficient of Partial correlation:

 The partial correlation coefficient between 𝑋1 𝑎𝑛𝑑 𝑋2 , usually denoted by 𝑟12.3 is


given by
$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$

$$r_{13.2} = \frac{r_{13} - r_{12}\,r_{32}}{\sqrt{(1 - r_{12}^2)(1 - r_{32}^2)}}$$

$$r_{23.1} = \frac{r_{23} - r_{21}\,r_{31}}{\sqrt{(1 - r_{21}^2)(1 - r_{31}^2)}}$$

 The multiple correlation in terms of total and partial correlations:

$$1 - R_{1.23}^2 = 1 - \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2} = \frac{1 - r_{23}^2 - r_{12}^2 - r_{13}^2 + 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}$$

Note:

$$1 - R_{1.23}^2 = \frac{\omega}{\omega_{11}}$$

where $\omega = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}$

and $\omega_{11} = \begin{vmatrix} 1 & r_{23} \\ r_{32} & 1 \end{vmatrix} = 1 - r_{23}^2$
Module 4
Discrete Probability Distributions

Bernoulli’s Distribution:

 A random variable $X$ which takes the two values 0 and 1 with probabilities $q$ and $p$ respectively, that is, $P(X = 0) = q$ and $P(X = 1) = p$ with $q = 1 - p$, is called a Bernoulli discrete random variable. The probability function of the Bernoulli distribution can be written as

$$P(X) = p^X q^{1-X} = p^X (1 - p)^{1-X}; \quad X = 0, 1$$


Note:

 The mean of the Bernoulli random variable $X$ is

$$\mu = E(X) = \sum X_i\,P(X_i) = p$$

 Variance of 𝑋 is

𝑉(𝑋) = 𝐸(𝑋 2 ) − 𝐸(𝑋)2 = ∑ 𝑋𝑖 2 𝑃(𝑋𝑖 ) − 𝜇 2

= (02 × 𝑞) + (12 × 𝑝) − 𝑝2 = 𝑝 − 𝑝2 = 𝑝(1 − 𝑝) = 𝑝𝑞

 The standard deviation is 𝜎 = √𝑝𝑞


Binomial Distribution:

$$P(X = x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, 3, \ldots, n \\ 0, & \text{otherwise} \end{cases}$$

where $n$ and $p$ are known as the parameters.

 Mean of $X$ is $\mu = E(X) = np$
 Variance is $V(X) = E(X^2) - E(X)^2$, giving $\sigma^2 = V(X) = npq$

MGF Binomial Distribution:

 Let 𝑋~𝐵(𝑛, 𝑝), then


𝑀(𝑡) = 𝑀𝑋 (𝑡) = 𝐸(𝑒 𝑡𝑥 ) = (𝑞 + 𝑝𝑒 𝑡 )𝑛

Characteristic Function of Binomial distribution:

∅𝑋 (𝑡) = 𝐸(𝑒 𝑖𝑡𝑥 ) = (𝑞 + 𝑝𝑒 𝑖𝑡 )𝑛

Cumulative Binomial distribution:

The Binomial probabilities can be obtained from cumulative distribution as follows

𝑏(𝑥; 𝑛, 𝑝) = 𝐵(𝑥; 𝑛, 𝑝) − 𝐵(𝑥 − 1; 𝑛, 𝑝)
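A minimal sketch of the binomial pmf, its CDF, and the difference identity just stated:

```python
from math import comb

# Minimal sketch: binomial pmf b(x; n, p) and CDF B(x; n, p).
def b(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)   # C(n,x) pˣ q^{n−x}

def B(x, n, p):
    return sum(b(k, n, p) for k in range(x + 1))  # Σ_{k ≤ x} b(k; n, p)

n, p = 10, 0.3
print(b(3, n, p))                  # ≈ 0.2668
print(B(3, n, p) - B(2, n, p))     # the same value, via the identity
print(n * p, n * p * (1 - p))      # mean np = 3.0, variance npq = 2.1
```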


Poisson Distribution:
 A random variable X taking on one of the non-negative values with parameter λ, λ >0,
is said to follow Poisson distribution if its probability mass function is given by
$$P(x; \lambda) = P(X = x) = \begin{cases} \dfrac{\lambda^x e^{-\lambda}}{x!}, & x = 0, 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}$$

 Poisson parameter, λ = np

 Mean
𝜇 = 𝐸(𝑋) = λ

𝜇 = λ = np

 Variance

𝑉(𝑋) = 𝐸(𝑋 2 ) − 𝐸(𝑋)2

𝑉(𝑋) = 𝜎 2 = λ

Cumulative Poisson distribution:


$$F(x; \lambda) = P(X \le x) = \sum_{k=0}^{x} \frac{\lambda^k e^{-\lambda}}{k!}$$

Moment generating function:

$$M(t) = e^{\lambda(e^t - 1)}$$

Characteristic function:

$$\phi(t) = e^{\lambda(e^{it} - 1)}$$
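A minimal sketch of the Poisson pmf and its cumulative distribution:

```python
from math import exp, factorial

# Minimal sketch: Poisson pmf and CDF; mean = variance = λ.
def poisson_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)          # λˣ e^{−λ} / x!

def poisson_cdf(x, lam):
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 2.0
print(poisson_pmf(3, lam))   # ≈ 0.1804
print(poisson_cdf(3, lam))   # ≈ 0.8571
```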
Hyper geometric Distribution:

 A discrete random variable $X$ is said to follow the hypergeometric distribution with parameters $N$, $M$ and $n$ if it assumes only non-negative values and its probability mass function is given by

$$P(X = k) = h(k; N, M, n) = \begin{cases} \dfrac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}, & k = 0, 1, 2, 3, \ldots, \min(n, M) \\ 0, & \text{otherwise} \end{cases}$$

where $N$ is a positive integer, $M$ is a positive integer not exceeding $N$, and $n$ is a positive integer that is at most $N$. In combination notation,

$$P(X = k) = h(k; N, M, n) = \frac{{}^{M}C_k \times {}^{N-M}C_{n-k}}{{}^{N}C_n}$$

 Mean is $E(X) = np = \dfrac{nM}{N}$, where $p = \dfrac{M}{N}$
 Variance is $\mathrm{var}(X) = npq \cdot \dfrac{N-n}{N-1} = \dfrac{nM(N-M)(N-n)}{N^2(N-1)}$, where $q = 1 - p$

Covariance:

 The covariance of two random variables X and Y is


𝐶𝑜𝑣(𝑋, 𝑌) = 𝐸(𝑋𝑌) − 𝐸(𝑋)𝐸(𝑌)
Or
𝐶𝑜𝑣(𝑋, 𝑌) = 𝐸[(𝑋 − 𝐸(𝑋))(𝑌 − 𝐸(𝑌))]
Module 5

Continuous Probability Distribution

Uniform Distribution:

 A random variable $X$ is said to follow the uniform distribution over an interval $(a, b)$ if its probability density function is constant, $= k$ (say), over the entire range of $X$:

$$f(x) = \begin{cases} k, & a < x < b \\ 0, & \text{otherwise} \end{cases}$$

Since the total probability is 1, $k = \frac{1}{b-a}$, so

$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$$

 $\int_{-\infty}^{\infty} f(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = 1$, $a < b$; here $a$ and $b$ are the two parameters of the uniform distribution on $(a, b)$.

 Since $F(x)$ is not continuous at $x = a$ and $x = b$, it is not differentiable at these points. Thus $\frac{d}{dx}F(x) = f(x) = \frac{1}{b-a} \neq 0$ exists everywhere except at the points $x = a$ and $x = b$.

Moments:

 Mean $= \dfrac{a+b}{2}$
 Variance $= \dfrac{(b-a)^2}{12}$
Normal Distribution:

 A random variable X is said to have a normal distribution, if its density function or


probability distribution is given by
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$

Where, 𝜇 is the mean and 𝜎 is the standard deviation of 𝑥.

 If a variable $x$ follows the normal distribution with mean $\mu$ and s.d. $\sigma$, the variable $z$ defined as

$$Z = \frac{x - \mu}{\sigma}$$

has the standard normal distribution with mean 0 and standard deviation 1. This is also referred to as the z-score.
 The normal curve is symmetric about mean, the total area under the normal curve is 1,
that is
𝑃(−∞ < 𝑋 < ∞) = 1
Also

𝑃(−∞ < 𝑍 < ∞) = 1


 The standard normal probability in the form of cumulative distribution function
(CDF)

𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = 𝐹(𝑏) − 𝐹(𝑎)


 When 𝑋 is normal distribution with mean 𝜇 and standard deviation 𝜎

$$P(a < X \le b) = P\left(\frac{a-\mu}{\sigma} < Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$

 $F(-z) = 1 - F(z)$
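A minimal sketch of a normal probability via standardization; Φ is computed from math.erf, one standard way to get the standard normal CDF without external libraries:

```python
import math

# Minimal sketch: P(a < X ≤ b) for X ~ N(μ, σ²) via the z-transformation.
def phi(z):
    # Standard normal CDF: Φ(z) = (1 + erf(z/√2)) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    za, zb = (a - mu) / sigma, (b - mu) / sigma
    return phi(zb) - phi(za)

print(normal_prob(-1, 1, 0, 1))     # ≈ 0.6827 (within one σ of the mean)
print(phi(-1.96), 1 - phi(1.96))    # F(−z) = 1 − F(z) ≈ 0.025
```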
Exponential Probability Distribution:

 A continuous random variable 𝑋 is said to follow an exponential distribution with


parameter 𝜆 > 0, if its probability density function is given by
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

 The general form of the exponential distribution, with parameter $a$, is

$$f(x) = \frac{1}{a}\, e^{-x/a}, \quad a > 0,\ x \ge 0$$

MGF of Exponential Distribution:

 The MGF is $M_X(t) = \dfrac{\lambda}{\lambda - t}$, for $t < \lambda$
 Mean $= \dfrac{1}{\lambda}$
 Variance $= \dfrac{1}{\lambda^2}$

 The cumulative distribution function is


$$F(x) = P(X \le x) = \int_0^x f(t)\,dt = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}$$

$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Exponential Distribution possesses memoryless property:

 $P(X > s + t \,/\, X > t) = P(X > s)$, for any $s, t > 0$:

$$P(X > s + t \,/\, X > t) = \frac{P(X > s + t \cap X > t)}{P(X > t)} = \frac{P(X > s + t)}{P(X > t)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda t}} = e^{-\lambda s} = P(X > s)$$
Gamma Distribution:

 A continuous random variable 𝑋 is said to follow general Gamma distribution with


two parameters 𝜆 > 0 and 𝑘 > 0, if its probability density function is given by

$$f(x) = \begin{cases} \dfrac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
Note:
 When 𝒌 = 𝟏, the distribution is called exponential distribution
 $\int_{-\infty}^{\infty} f(x)\,dx = 1$ $\left(\text{since } \int_0^{\infty} x^{k-1} e^{-ax}\,dx = \dfrac{\Gamma(k)}{a^k}\right)$

MGF of Gamma Distribution:

 The probability density function of the general Gamma random variable $X$ is

$$f(x) = \begin{cases} \dfrac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

where $\lambda$ and $k$ are the parameters.

 The MGF is

$$M_X(t) = \left(\frac{\lambda}{\lambda - t}\right)^k$$

 Mean $= \dfrac{k}{\lambda}$
 Variance $= \dfrac{k}{\lambda^2}$
Beta Distribution:

 A continuous random variable $X$ taking values in the interval from 0 to 1 is said to follow the Beta distribution if its probability density is given by

$$f(x) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1 - x)^{\beta-1}, & 0 < x < 1,\ \alpha > 0,\ \beta > 0 \\ 0, & \text{otherwise} \end{cases}$$

 Mean $\mu = \dfrac{\alpha}{\alpha + \beta}$
 Variance $\sigma^2 = \dfrac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$

Note:

 If $\alpha = 1$ and $\beta = 1$, we obtain as a special case the uniform distribution.

Weibull distribution:

 The random variable X is said to follow Weibull distribution, if its probability


distribution is given by
$$f(x) = \begin{cases} \alpha\beta\, x^{\beta-1} e^{-\alpha x^{\beta}}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$

where $\alpha > 0$ and $\beta > 0$ are the two parameters of the Weibull distribution.

Note:

 When 𝛽 = 1, the Weibull distribution reduces to the exponential distribution with


parameter 𝛼.
 Mean $= E(X) = \mu = \alpha^{-1/\beta}\, \Gamma\!\left(1 + \dfrac{1}{\beta}\right)$

 Variance $= \sigma^2 = \alpha^{-2/\beta} \left\{ \Gamma\!\left(1 + \dfrac{2}{\beta}\right) - \left[\Gamma\!\left(1 + \dfrac{1}{\beta}\right)\right]^2 \right\}$
Cumulative distribution function:
$$F(x; \alpha, \beta) = \begin{cases} 1 - e^{-\alpha x^{\beta}}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
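A minimal sketch of the Weibull mean, variance, and CDF using math.gamma, checking the β = 1 reduction to the exponential distribution:

```python
import math

# Minimal sketch: Weibull distribution in the (α, β) parameterization above.
def weibull_mean(alpha, beta):
    return alpha ** (-1 / beta) * math.gamma(1 + 1 / beta)

def weibull_var(alpha, beta):
    g1 = math.gamma(1 + 1 / beta)
    g2 = math.gamma(1 + 2 / beta)
    return alpha ** (-2 / beta) * (g2 - g1 ** 2)

def weibull_cdf(x, alpha, beta):
    return 1 - math.exp(-alpha * x ** beta) if x >= 0 else 0.0

# With β = 1 these reduce to the exponential distribution with parameter α = λ:
print(weibull_mean(2.0, 1.0))       # 0.5  = 1/λ
print(weibull_var(2.0, 1.0))        # 0.25 = 1/λ²
print(weibull_cdf(1.0, 2.0, 1.0))   # 1 − e^{−2} ≈ 0.8647
```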
Module-6
Hypothesis Testing-I

Population Parameters Sample Statistics


Population mean (𝜇) Sample mean (𝑋̅)
Population standard deviation (𝜎) Sample standard deviation (𝑆)
Population size (𝑁) Sample size (𝑛)
Population proportion (𝑃) Sample proportion (𝑝)

Sampling distribution of mean (𝝈 𝒌𝒏𝒐𝒘𝒏):


If $\bar{X}$ is the mean of a random sample of size $n$ taken from a population having the mean $\mu$ and the finite variance $\sigma^2$, then

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is a random variable whose distribution approaches the standard normal distribution as $n \to \infty$.

Sampling distribution of mean (𝝈 𝒖𝒏𝒌𝒏𝒐𝒘𝒏):

If $\bar{X}$ is the mean of a random sample of size $n$ taken from a normal population having the mean $\mu$ and the finite variance $\sigma^2$, and

$$s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1},$$

then

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

is a random variable having the t-distribution with parameter $\nu = n - 1$.

Hypothesis testing:
Hypothesis testing is a method for testing a claim/hypothesis about a parameter in a
population using data measured in a sample.

Statistical hypothesis:
 In $H_0$, a statement involving equality ($=, \ge, \le$)
 In $H_1$, a statement involving inequality ($\neq, >, <$)
Types of test:
 Suppose we test for population mean, then
Null hypothesis 𝐻0 : 𝜇 = 𝜇0
Alternative hypothesis 𝐻1 : 𝜇 ≠ 𝜇0 𝑜𝑟 𝜇 > 𝜇0 𝑜𝑟 𝜇 < 𝜇0
If $H_1: \mu \neq \mu_0$, the test is called a two-tailed test.
If $H_1: \mu > \mu_0$, the test is called a right-tailed test (one-tailed).
If $H_1: \mu < \mu_0$, the test is called a left-tailed test (one-tailed).

Critical values of Z:

Level of significance (α)    One-tailed           Two-tailed
5% (0.05)                    +1.645 or −1.645     ±1.96
1% (0.01)                    +2.33 or −2.33       ±2.58
0.1% (0.001)                 +3.09 or −3.09       ±3.30

 If $n \ge 30$, the sample is a large sample.

If $n < 30$, the sample is a small sample.

1. Test of single mean condition:

 𝐻0 : 𝜇 = 𝜇0

Test statistic: the statistic for a test concerning the mean ($\sigma$ known) is

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

which follows the standard normal distribution.

 Critical regions for testing $\mu = \mu_0$ (standard normal distribution, $\sigma$ known):

Alternative hypothesis $H_1$     Reject null hypothesis if
$\mu < \mu_0$                    $Z < -Z_{\alpha}$
$\mu > \mu_0$                    $Z > Z_{\alpha}$
$\mu \neq \mu_0$                 $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
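A minimal sketch of the single-mean Z-test with a two-tailed decision at α = 0.05 (the 1.96 critical value comes from the table above); the sample numbers are illustrative:

```python
import math

# Minimal sketch: Z-test for H₀: μ = μ₀ with σ known (two-tailed).
def z_test_single_mean(xbar, mu0, sigma, n, z_crit=1.96):
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    reject = abs(z) > z_crit    # reject H₀ if Z < −Z_{α/2} or Z > Z_{α/2}
    return z, reject

z, reject = z_test_single_mean(xbar=52.0, mu0=50.0, sigma=5.0, n=30)
print(round(z, 3), reject)      # 2.191 True → reject H₀ at the 5% level
```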
2. Hypothesis test concerning two means:

$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Inferences concerning Proportions:

1. Test for single Proportion (Large sample):

The test statistic $Z$ is given by

$$Z = \frac{p - P}{\sqrt{\dfrac{PQ}{n}}}$$

where $Q = 1 - P$ and $p = \dfrac{X}{n}$ is the sample proportion in a random sample of size $n$.

2. Test for the difference between two sample


Proportions:

 Let $p_1$ and $p_2$ be the proportions of successes in two large samples of sizes $n_1$ and $n_2$ respectively, drawn from the same population or from two populations with the same proportion $P_1 = P_2 = P$.

Test statistic:

$$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where the population proportion $P$ is known. If $P$ is not known, an unbiased estimate of $P$ based on both samples, given by

$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2},$$

is used in place of $P$.
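A minimal sketch of the two-proportion Z-test with P estimated from the pooled samples; the counts are illustrative:

```python
import math

# Minimal sketch: Z-test for the difference of two proportions.
def z_test_two_proportions(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    P = (n1 * p1 + n2 * p2) / (n1 + n2)   # pooled estimate of the common P
    Q = 1 - P
    return (p1 - p2) / math.sqrt(P * Q * (1 / n1 + 1 / n2))

print(round(z_test_two_proportions(45, 100, 30, 100), 3))   # ≈ 2.191
```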
Module 7

Hypothesis Testing-II

Student’s t-distribution:
1. Statistic for small sample test concerning one mean:

Null hypothesis: 𝐻0 : 𝜇 = 𝜇0

Test statistic:

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

follows the t-distribution with $n - 1$ degrees of freedom. Here,

$$s^2 = \frac{\sum(X_i - \bar{X})^2}{n - 1}$$

is an unbiased estimator of the population variance $\sigma^2$.

2. Test of difference of means:

Null hypothesis: 𝐻0 : 𝜇1 − 𝜇2 = 𝑑

Test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

follows the t-distribution with $n_1 + n_2 - 2$ degrees of freedom, where

$$s^2 = \frac{\sum(x_{1i} - \bar{x}_1)^2 + \sum(x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$$

or

$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$
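A minimal sketch of the pooled two-sample t statistic above (for $d = 0$), on illustrative data:

```python
import math

# Minimal sketch: pooled two-sample t statistic for H₀: μ₁ − μ₂ = 0.
def pooled_t(xs, ys):
    n1, n2 = len(xs), len(ys)
    m1, m2 = sum(xs) / n1, sum(ys) / n2
    ss1 = sum((x - m1) ** 2 for x in xs)
    ss2 = sum((y - m2) ** 2 for y in ys)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)      # pooled variance estimate
    t = (m1 - m2) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2                 # statistic and degrees of freedom

t, df = pooled_t([20, 22, 23, 25], [18, 19, 21, 22])
print(round(t, 3), df)                    # ≈ 1.806, df = 6
```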

F-distribution:

 F-distribution is used to test the equality of the variances of two populations from
which two samples have been drawn.

Null hypothesis: 𝐻0 : 𝜎1 2 = 𝜎2 2

Test statistic:

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2 = \dfrac{\sum(x_{1i} - \bar{x}_1)^2}{n_1 - 1}$ and $s_2^2 = \dfrac{\sum(x_{2i} - \bar{x}_2)^2}{n_2 - 1}$

Note:
 The larger among 𝒔𝟏 𝟐 𝒂𝒏𝒅 𝒔𝟐 𝟐 will be the numerator.
 Here ′𝐹′ follows F-distribution with (𝑛1 − 1, 𝑛2 − 1) degrees of freedom.
 The critical region value is 𝐹(𝑛1 −1,𝑛2 −1) .
Chi-square distribution (or) 𝝌𝟐 − 𝐝𝐢𝐬𝐭𝐫𝐢𝐛𝐮𝐭𝐢𝐨𝐧:
 Hypothesis concerning one variance
 Goodness of fit
 Test for independence of attributes

1. Hypothesis concerning one variance:

Null hypothesis 𝑯𝟎 : 𝜎 2 = 𝜎0 2

Test statistic:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

Where 𝑛 is the sample size

𝑠 2 is the sample variance

𝜎0 2 is the value of 𝜎 2 given by null hypothesis.

The degrees of freedom of a 𝜒 2 −distribution is ′𝑛 − 1′.


2. Goodness of fit:

 The chi-square test of goodness of fit is

$$\chi^2 = \sum_{i=1}^{n} \left[\frac{(O_i - E_i)^2}{E_i}\right]$$

The degrees of freedom (df) for the chi-square distribution is $n - 1$.

Note:
If the data is given in series of ′𝑛′ numbers, then

1. In case of Binomial distribution, 𝑑𝑓 = 𝑛 − 1


2. In case of Poisson distribution, 𝑑𝑓 = 𝑛 − 2
3. In case of Normal distribution, 𝑑𝑓 = 𝑛 − 3.
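A minimal sketch of the goodness-of-fit statistic, using a hypothetical fair-die example (60 rolls, so each expected count is 10):

```python
# Minimal sketch: χ² goodness of fit; O = observed, E = expected counts.
def chi_square(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 9, 19, 5, 8, 11]   # hypothetical counts of faces 1..6
expected = [10] * 6               # fair die: E = 60/6 = 10 per face
print(chi_square(observed, expected))   # 11.6 > 11.07 = χ²₀.₀₅ at df = 5
```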

3. Chi-square test for independence of attributes:

 An attribute may be marked by its presence (possession) or absence in a member of a given population.
 Let us consider two attributes A and B, where A is divided into two classes and B is divided into two classes. The various cell frequencies can be expressed in the following table, known as a 2×2 contingency table.

           B present   B absent   Total
A present      a           b       a + b
A absent       c           d       c + d
Total        a + c       b + d       N

The expected frequencies are given by

$$E(a) = \frac{(a+c)(a+b)}{N} \qquad E(b) = \frac{(b+d)(a+b)}{N}$$

$$E(c) = \frac{(a+c)(c+d)}{N} \qquad E(d) = \frac{(b+d)(c+d)}{N}$$

where $N = a + b + c + d$ is the total frequency.


Design of experiments:

 When comparing means across two samples, we use a Z-test or a t-test.

 If more than two samples are tested for their means, we use ANOVA.

ANOVA:
Analysis of Variance is a hypothesis testing technique used to test the equality of two or more
population means by examining the variances of samples that are taken.

Assumptions of ANOVA:
 All populations involved follow a normal distribution.
 All populations have the same variances.
 The samples are randomly selected and independent of one another or the
observations are independent.

Types of ANOVA:
 One-way ANOVA: Completely Randomized Design (CRD)
 Two-way ANOVA: Randomized Block Design (RBD)
 Three-way ANOVA: Latin Square Design (LSD)

1. Scheme for one-way classification or Completely


Randomized Design (CRD):

$$SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i} y_{ij}^2 - C$$

$$SSB = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - C$$

where $C$, called the correction factor for the mean, is given by

$$C = \frac{G^2}{N}, \qquad G = \sum_{i=1}^{k} T_i, \qquad T_i = \sum_{j=1}^{n_i} y_{ij}, \qquad N = \sum_{i=1}^{k} n_i$$
Test statistic:

 To test the $H_0$ that the $K$ population means are equal, we compare two estimates of $\sigma^2$:

one based on the variation between the sample means, and

one based on the variation within the samples.

 Each sum of squares is first converted to a mean square:

$$\text{Mean square} = \frac{\text{sum of squares}}{\text{degrees of freedom}}$$

Mean sum of squares between samples:

$$MSB = \frac{SSB}{DF_{between}} = \frac{\sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2}{K - 1}$$

Mean sum of squares within samples:

$$MSW = \frac{SSW}{DF_{within}} = \frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2}{N - K}$$
 Test statistic:

$$F = \frac{MSB}{MSW}$$

$F$ follows the F-distribution with $K - 1$ and $N - K$ degrees of freedom.

ANOVA table:

Source of variation   Degrees of freedom   Sum of squares   Mean squares   F
Between groups        K − 1                SSB              MSB            F = MSB/MSW
Within groups         N − K                SSW              MSW
Total                 N − 1                SST

Decision:
If $F > F_{\alpha,(K-1,\,N-K)}$, reject the null hypothesis $H_0$.
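A minimal sketch of the one-way ANOVA computation following the scheme above, on illustrative data:

```python
# Minimal sketch: one-way ANOVA (CRD) from k independent samples.
def one_way_anova(groups):
    k = len(groups)
    N = sum(len(g) for g in groups)
    G = sum(sum(g) for g in groups)                 # grand total
    C = G * G / N                                   # correction factor G²/N
    sst = sum(y * y for g in groups for y in g) - C
    ssb = sum(sum(g) ** 2 / len(g) for g in groups) - C
    ssw = sst - ssb
    msb = ssb / (k - 1)                             # mean square between
    msw = ssw / (N - k)                             # mean square within
    return msb / msw, (k - 1, N - k)                # F and its degrees of freedom

groups = [[6, 8, 4, 5, 3, 4], [8, 12, 9, 11, 6, 8], [13, 9, 11, 8, 7, 12]]
F, df = one_way_anova(groups)
print(round(F, 2), df)   # 9.26 (2, 15) → compare with F₀.₀₅(2, 15) ≈ 3.68
```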
2. Two-way ANOVA classification:

Treatment sum of squares:

$$SS(Tr) = \frac{\sum_{i=1}^{C} T_{i.}^2}{r} - \text{Correction factor}$$

Block sum of squares, $SS(Bl) = C\sum_{j=1}^{r}(\bar{y}_{.j} - \bar{y}_{..})^2$:

$$SS(Bl) = \frac{\sum_{j=1}^{r} T_{.j}^2}{C} - \text{Correction factor}$$

Error sum of squares, $SSE = \sum_{i=1}^{C}\sum_{j=1}^{r}(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$

Total sum of squares, $SST = \sum_{i=1}^{C}\sum_{j=1}^{r}(y_{ij} - \bar{y}_{..})^2$:

$$SST = \sum_{i=1}^{C}\sum_{j=1}^{r} y_{ij}^2 - \text{Correction factor}$$

where the correction factor is given by $\dfrac{T_{..}^2}{Cr}$, and

$T_{i.}$ = the sum of the $r$ observations for the $i$th treatment
$T_{.j}$ = the sum of the $C$ observations for the $j$th block
$T_{..}$ = the grand total of all observations

F-ratio for treatments (between samples):

$$F_{Tr} = \frac{MS(Tr)}{MSE} = \frac{SS(Tr)/(C-1)}{SSE/\left[(C-1)(r-1)\right]}$$

Decision:

Reject $H_0$ if $F_{Tr} > F_{\alpha,\,(C-1,\,(C-1)(r-1))}$

F-ratio for blocks:

$$F_{Bl} = \frac{MS(Bl)}{MSE} = \frac{SS(Bl)/(r-1)}{SSE/\left[(C-1)(r-1)\right]}$$

Decision: Reject $H_0$ if $F_{Bl} > F_{\alpha,\,(r-1,\,(C-1)(r-1))}$

Two-way ANOVA table for results:

Source of variation   Degrees of freedom   Sum of squares   Mean squares                 F
Treatments            C − 1                SS(Tr)           MS(Tr) = SS(Tr)/(C − 1)      F_Tr = MS(Tr)/MSE
Blocks                r − 1                SS(Bl)           MS(Bl) = SS(Bl)/(r − 1)      F_Bl = MS(Bl)/MSE
Error                 (C − 1)(r − 1)       SSE              MSE = SSE/[(C − 1)(r − 1)]
Total                 Cr − 1               SST

3. Latin Square Design (LSD) (or) Three-way ANOVA:

Null hypothesis: There is no significant difference in the means of columns


(Groups), rows (Blocks), and treatments.

Alternative hypothesis: There is at least one mean in column which differs


from others. Also, there is at least one mean in the rows which differs from
others. Similarly, for treatments.

Degrees of freedom:
𝑫𝑭𝒓𝒐𝒘𝒔 = 𝒏 − 𝟏

𝑫𝑭𝒄𝒐𝒍𝒖𝒎𝒏𝒔 = 𝒏 − 𝟏

𝑫𝑭𝒕𝒓𝒆𝒂𝒕𝒎𝒆𝒏𝒕𝒔 = 𝒏 − 𝟏

𝑫𝑭𝑬𝒓𝒓𝒐𝒓 = (𝒏 − 𝟏)(𝒏 − 𝟐)
Critical region:

𝐹(𝑛−1,(𝒏−𝟏)(𝒏−𝟐) )

The grand total is $G = \sum\sum x_{ij}$, and the correction factor is $C.F. = \dfrac{G^2}{N}$.

Sum of squares total:

𝑆𝑆𝑇 = ∑ ∑ 𝑥𝑖𝑗 2 − 𝐶. 𝐹

Column sum of squares:

$$SSC = \sum \frac{C_j^2}{n} - C.F.$$
Where, 𝐶𝑗 is the column sum of the jth column.

$$SSR = \sum \frac{R_i^2}{n} - C.F.$$
Where, 𝑅𝑖 is the row sum of the ith row.

$$SSTr = \sum \frac{T_i^2}{n} - C.F.$$
Where, 𝑇𝑖 is called the treatment sum of ith treatment.

𝑆𝑆𝐸 = 𝑆𝑆𝑇 − 𝑆𝑆𝑅 − 𝑆𝑆𝐶 − 𝑆𝑆𝑇𝑟

ANOVA table:

Source of variation   Sum of squares (SS)   Degrees of freedom   Mean squares (MS)            F
Columns               SSC                   n − 1                MSC = SSC/(n − 1)            F₁ = MSC/MSE
Rows                  SSR                   n − 1                MSR = SSR/(n − 1)            F₂ = MSR/MSE
Treatments            SSTr                  n − 1                MSTr = SSTr/(n − 1)          F₃ = MSTr/MSE
Error                 SSE                   (n − 1)(n − 2)       MSE = SSE/[(n − 1)(n − 2)]
