Chi Square
If x_1, x_2, …, x_n are independent N(μ, σ²) variates, then

∑_{i=1}^{n} ((x_i − μ)/σ)²

is a χ² variate with n degrees of freedom. Therefore the sum of the squares of n independent standard normal variates is a χ² variate with n degrees of freedom, and it follows the χ² distribution with n degrees of freedom.
DERIVATION
Suppose a random sample x_1, x_2, …, x_n of size n is taken from a population with normal distribution N(μ, σ²). Let us define,

χ² = ∑_{i=1}^{n} ((x_i − μ)/σ)² = ∑ z_i²  … (1)   [where z_i ~ N(0, 1)]

Since the x_i are independent, the z_i (and hence the z_i²) are also independent. The MGF of χ² can therefore be written as,

M_{χ²}(t) = M_{∑ z_i²}(t) = M_{z_1² + z_2² + ⋯ + z_n²}(t) = M_{z_1²}(t) · M_{z_2²}(t) ⋯ M_{z_n²}(t)  … (2)
Since for z ~ N(0, 1) we have M_{z²}(t) = E(e^{t z²}) = (1 − 2t)^{−1/2} for t < 1/2, finally,

M_{χ²}(t) = [M_{z_1²}(t)]^n = (1/√(1 − 2t))^n = (1 − 2t)^{−n/2} = ((1/2)/((1/2) − t))^{n/2}

which has the form (a/(a − t))^p with a = 1/2 and p = n/2. This is the moment generating function of a Gamma distribution with parameters 1/2 and n/2. From the uniqueness theorem of the MGF, the pdf of the χ² distribution is,
f(χ²) = [1 / (2^{n/2} Γ(n/2))] e^{−χ²/2} (χ²)^{(n/2) − 1} ;   0 < χ² < ∞
Writing x for χ²,

log f(x) = log[1 / (2^{n/2} Γ(n/2))] + log e^{−x/2} + log x^{(n/2) − 1} = c − x/2 + ((n/2) − 1) log x

where c = −log[2^{n/2} Γ(n/2)] is a constant.
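The derivation above can be checked numerically. The following is a minimal sketch (an illustration only, assuming NumPy and SciPy are available; the choices n = 5 and 200,000 replications are arbitrary) that simulates sums of squares of n standard normal variates and compares them with the χ²_n distribution implied by the pdf just derived.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_df = 5            # degrees of freedom (illustrative choice)
n_sim = 200_000     # number of simulated chi-square values

# Sum of squares of n independent standard normal variates
z = rng.standard_normal((n_sim, n_df))
chi_sq = np.sum(z**2, axis=1)

# Compare simulated mean/variance with the chi2(n) values (mean = n, variance = 2n)
print(chi_sq.mean(), chi_sq.var())          # ~ 5 and ~ 10
print(stats.chi2.mean(n_df), stats.chi2.var(n_df))

# Kolmogorov-Smirnov test of the simulated values against the chi2(n) cdf
print(stats.kstest(chi_sq, "chi2", args=(n_df,)))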
Coefficient of skewness:  γ_1 = √β_1 = √(μ_3² / μ_2³) = √(64n² / 8n³) = √8 / √n > 0   {positively skewed}

Coefficient of kurtosis:  γ_2 = β_2 − 3 = μ_4 / μ_2² − 3 = (48n + 12n²) / 4n² − 3 = 12/n + 3 − 3 = 12/n > 0   {leptokurtic}
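As a quick check of these coefficients, the sketch below (assuming SciPy; the value n = 10 is an arbitrary illustration) compares γ_1 = √(8/n) and γ_2 = 12/n with the skewness and excess kurtosis reported by scipy.stats.chi2.

import numpy as np
from scipy import stats

n = 10  # illustrative degrees of freedom
mean, var, skew, excess_kurt = stats.chi2.stats(n, moments="mvsk")

print(mean, var)                      # n and 2n
print(skew, np.sqrt(8 / n))           # both ~ 0.894
print(excess_kurt, 12 / n)            # both 1.2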
LIMITING FORM OF THE χ² DISTRIBUTION
Let X ~ χ²_n. Then M_X(t) = (1 − 2t)^{−n/2}. Again let z = (X − μ)/σ. For the χ²_n distribution, μ = n and σ = √(2n).

∴ z = (X − μ)/σ = (X − n)/√(2n) = X/√(2n) − √n/√2

Now,

M_z(t) = E(e^{tz}) = E(e^{t(X − n)/√(2n)}) = e^{−t√n/√2} E(e^{tX/√(2n)})

= e^{−t√n/√2} (1 − 2t/√(2n))^{−n/2} = e^{−t√n/√2} (1 − t√2/√n)^{−n/2}

Now, the cumulant generating function is

K_z(t) = log M_z(t) = log[e^{−t√n/√2} × (1 − t√2/√n)^{−n/2}] = log e^{−t√n/√2} + log(1 − t√2/√n)^{−n/2}

= −t√n/√2 − (n/2) log(1 − t√2/√n)

= −t√n/√2 − (n/2)[ −(t√2/√n) − (1/2)(t√2/√n)² − (1/3)(t√2/√n)³ − (1/4)(t√2/√n)⁴ − ⋯ ]

= −t√n/√2 + (n/2)[ (t√2/√n) + (1/2)(t√2/√n)² + (1/3)(t√2/√n)³ + (1/4)(t√2/√n)⁴ + ⋯ ]

= −t√n/√2 + t√n/√2 + t²/2 + t³√2/(3√n) + 2t⁴/(4n) + ⋯ = t²/2 + t³√2/(3√n) + 2t⁴/(4n) + ⋯

∴ lim_{n→∞} K_z(t) = t²/2  ⟹ log M_z(t) → t²/2  ⟹ M_z(t) → e^{t²/2} as n → ∞
which is the MGF of a standard normal variate. Hence, by the uniqueness theorem of the MGF, z is standard normal as n → ∞. In other words, the standardized χ² variate tends to a standard normal variate for large degrees of freedom. Thus the χ² distribution tends to the normal distribution for large degrees of freedom.
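The limiting result can be illustrated numerically. The sketch below (assuming SciPy; df = 500 and the grid of points are arbitrary choices) compares the cdf of the standardized χ² variate z = (X − n)/√(2n) with the standard normal cdf.

import numpy as np
from scipy import stats

n = 500                                # large degrees of freedom (illustrative)
z = np.linspace(-3, 3, 7)              # points at which to compare the two cdfs

# P(Z <= z) where Z = (X - n) / sqrt(2n), X ~ chi2(n)
chi2_cdf = stats.chi2.cdf(n + z * np.sqrt(2 * n), df=n)
normal_cdf = stats.norm.cdf(z)

print(np.max(np.abs(chi2_cdf - normal_cdf)))   # small for large n, shrinking as n grows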
ADDITIVE LAW OF THE χ² DISTRIBUTION
THEOREM: The sum of independent χ² variates is also a χ² variate. If χ²_i (i = 1, 2, 3, …, k) are independent χ² variates with n_1, n_2, …, n_k degrees of freedom respectively, then ∑_{i=1}^{k} χ²_i is a χ² variate with ∑_{i=1}^{k} n_i degrees of freedom.
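A quick simulation check of this theorem is sketched below (an illustration only, assuming NumPy and SciPy; the degrees of freedom 3 and 7 are arbitrary choices): the sum of independent χ²_3 and χ²_7 variates should behave like a χ²_10 variate.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n1, n2, n_sim = 3, 7, 200_000

# Independent chi-square variates and their sum
x1 = rng.chisquare(n1, size=n_sim)
x2 = rng.chisquare(n2, size=n_sim)
s = x1 + x2

# The sum should follow chi2(n1 + n2): compare moments and run a KS test
print(s.mean(), s.var())                       # ~ 10 and ~ 20
print(stats.kstest(s, "chi2", args=(n1 + n2,)))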
(4) The MGF, characteristic function and cumulant generating function of the χ²_n distribution are (1 − 2t)^{−n/2}, (1 − 2it)^{−n/2} and −(n/2) log(1 − 2t) respectively.
(5) For the χ² distribution, β_1 = 8/n, β_2 = 12/n + 3 and γ_1 = √(8/n), γ_2 = 12/n.
(6) The χ² distribution tends to the normal distribution for large degrees of freedom.
APPLICATIONS OF THE CHI-SQUARE DISTRIBUTION
(1) To test if the hypothetical value of the population variance is σ² = σ_0² (say).
(2) To test the goodness of fit (illustrated in the sketch following this list).
(3) To test the independence of attributes (also illustrated below).
(4) To test the homogeneity of independent estimates of the population variance.
(5) To test the homogeneity of independent estimates of the population correlation coefficient.
(6) To combine various probabilities obtained from independent experiments to give a single test of significance.
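The sketch below illustrates applications (2) and (3) with SciPy; the observed counts are made-up illustrative data, not taken from any real study.

import numpy as np
from scipy import stats

# (2) Goodness of fit: do 120 die rolls look uniform? (made-up counts)
observed = np.array([25, 17, 15, 23, 24, 16])
expected = np.full(6, observed.sum() / 6)
stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(stat, p_value)

# (3) Independence of attributes in a 2x2 contingency table (made-up counts)
table = np.array([[30, 10],
                  [20, 40]])
stat, p_value, dof, expected_table = stats.chi2_contingency(table)
print(stat, p_value, dof)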
Limitations of the Chi-square test:
1) Difficult to interpret when variables have many categories.
2) The chi-square test is sensitive to sample size. As sample size decreases, the test becomes less trustworthy.
3) The chi-square test is also sensitive to small expected frequencies in one or more of the cells in the table.
4) With very large samples, relatively trivial relationships may be declared statistically significant.
5) The chi-square test must be applied to frequencies (counts); it is not valid when applied directly to proportions or percentages.
THEOREM-1: If X_1 and X_2 are two independent χ² variates with n_1 and n_2 degrees of freedom respectively, then X_1/X_2 is a β_2(n_1/2, n_2/2) variate.
PROOF: Since X_1 and X_2 are two independent χ² variates with n_1 and n_2 degrees of freedom respectively, the joint probability differential is

dP(x_1, x_2) = dP(x_1) · dP(x_2)

Here,

dP(x_1) = [1 / (2^{n_1/2} Γ(n_1/2))] e^{−x_1/2} x_1^{(n_1/2) − 1} dx_1   and   dP(x_2) = [1 / (2^{n_2/2} Γ(n_2/2))] e^{−x_2/2} x_2^{(n_2/2) − 1} dx_2

∴ dP(x_1, x_2) = [1 / (2^{(n_1+n_2)/2} Γ(n_1/2) Γ(n_2/2))] e^{−(x_1 + x_2)/2} x_1^{(n_1/2) − 1} x_2^{(n_2/2) − 1} dx_1 dx_2
Let us make the transformation u = x_1/x_2 and v = x_2, so that x_1 = uv and x_2 = v. The Jacobian of the transformation is

J = ∂(x_1, x_2)/∂(u, v) = det[ [∂x_1/∂u, ∂x_1/∂v], [∂x_2/∂u, ∂x_2/∂v] ] = det[ [v, u], [0, 1] ] = v,   ⟹ |J| = v
dG(u, v) = [1 / (2^{(n_1+n_2)/2} Γ(n_1/2) Γ(n_2/2))] e^{−v(1+u)/2} (uv)^{(n_1/2) − 1} v^{(n_2/2) − 1} v du dv
= [1 / (2^{(n_1+n_2)/2} Γ(n_1/2) Γ(n_2/2))] e^{−v(1+u)/2} u^{(n_1/2) − 1} v^{((n_1+n_2)/2) − 1} du dv
The marginal probability differential of ‘U’ is
dP(u) = ∫_0^∞ dG(u, v) dv = [1 / (2^{(n_1+n_2)/2} Γ(n_1/2) Γ(n_2/2))] u^{(n_1/2) − 1} [∫_0^∞ e^{−v(1+u)/2} v^{((n_1+n_2)/2) − 1} dv] du

= [1 / (2^{(n_1+n_2)/2} Γ(n_1/2) Γ(n_2/2))] u^{(n_1/2) − 1} Γ((n_1+n_2)/2) [2/(1+u)]^{(n_1+n_2)/2} du

= [Γ((n_1+n_2)/2) / (Γ(n_1/2) Γ(n_2/2))] · u^{(n_1/2) − 1} / (1+u)^{(n_1+n_2)/2} du = [1 / B(n_1/2, n_2/2)] · u^{(n_1/2) − 1} / (1+u)^{(n_1+n_2)/2} du ;   0 < u < ∞

which is the probability differential of the β_2 (beta of the second kind) distribution with parameters n_1/2 and n_2/2.
THEOREM-2: If X_1 and X_2 are two independent χ² variates with n_1 and n_2 degrees of freedom respectively, then X_1/(X_1 + X_2) is a β_1(n_1/2, n_2/2) variate.
Remark: If X ~ χ²_{n_1} and Y ~ χ²_{n_2} are independent chi-square variates, then

(1) X + Y ~ χ²_{n_1 + n_2},   (2) X/Y ~ β_2(n_1/2, n_2/2),   (3) X/(X + Y) ~ β_1(n_1/2, n_2/2)
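These relationships can be checked by simulation; a minimal sketch is given below (assuming NumPy/SciPy; n_1 = 4, n_2 = 6 are arbitrary). In SciPy's parametrisation, the β_2 (beta of the second kind) distribution is scipy.stats.betaprime and the β_1 distribution is scipy.stats.beta.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n1, n2, n_sim = 4, 6, 200_000

x = rng.chisquare(n1, size=n_sim)
y = rng.chisquare(n2, size=n_sim)

# (2) X/Y ~ beta of the second kind with parameters (n1/2, n2/2)
print(stats.kstest(x / y, "betaprime", args=(n1 / 2, n2 / 2)))

# (3) X/(X+Y) ~ beta of the first kind with parameters (n1/2, n2/2)
print(stats.kstest(x / (x + y), "beta", args=(n1 / 2, n2 / 2)))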
LINEAR TRANSFORMATION
Let us suppose that the given set of variables X = (x_1, x_2, …, x_n) is transformed into a new set of variables Y = (y_1, y_2, …, y_n) by means of the linear transformation:

y_1 = a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n
y_2 = a_21 x_1 + a_22 x_2 + ⋯ + a_2n x_n
⋮
y_n = a_n1 x_1 + a_n2 x_2 + ⋯ + a_nn x_n

i.e.  y_i = a_i1 x_1 + a_i2 x_2 + ⋯ + a_in x_n ;   i = 1, 2, 3, …, n
In matrix form, this system of linear transformations can be expressed as Y = AX, where Y is the n×1 column vector of the y_i, X is the n×1 column vector of the x_i, and A = (a_ij) is the n×n matrix of coefficients.
From matrix theory we know that the system has a unique solution iff |A| ≠ 0. In other words, we can then express X uniquely in terms of Y: X = A^{−1}Y, where A^{−1} is the inverse of the matrix A.

The linear transformation Y = AX is said to be a linear orthogonal transformation if A is an orthogonal matrix, i.e. AA′ = A′A = I_n. An orthogonal transformation transforms x_1² + x_2² + ⋯ + x_n² = ∑ x_i² into y_1² + y_2² + ⋯ + y_n² = ∑ y_i².
Now, Y′Y = (AX)′(AX) = X′A′AX = X′X, where

Y′Y = [y_1 y_2 ⋯ y_n] (y_1, y_2, …, y_n)′ = y_1² + y_2² + ⋯ + y_n² = ∑ y_i²
X′X = [x_1 x_2 ⋯ x_n] (x_1, x_2, …, x_n)′ = x_1² + x_2² + ⋯ + x_n² = ∑ x_i²
The joint density function of (x_1, x_2, …, x_n), when the x_i are independent N(0, σ²) variates, is

f(x_1, x_2, …, x_n) = [1/(√(2π) σ)]^n e^{−∑ x_i²/(2σ²)} = [1/(√(2π) σ)]^n e^{−X′X/(2σ²)} ;   −∞ < x_1, x_2, …, x_n < +∞
The joint density function of (Y_1, Y_2, …, Y_n) becomes,
g(y_1, y_2, …, y_n) = [1/(√(2π) σ)]^n e^{−X′X/(2σ²)} |J|
Now, the Jacobian of the transformation is defined by

1/J = ∂(y_1, y_2, …, y_n)/∂(x_1, x_2, …, x_n) = |A|

Since A′A = I, ⟹ |A′A| = |I| = 1, ⟹ |A′||A| = 1, ⟹ |A|² = 1 (∵ |A′| = |A|), ⟹ |A| = ±1, ∴ |J| = 1

∴ g(y_1, y_2, …, y_n) = [1/(√(2π) σ)]^n e^{−X′X/(2σ²)} = [1/(√(2π) σ)]^n e^{−Y′Y/(2σ²)} = [1/(√(2π) σ)]^n e^{−∑ y_i²/(2σ²)}

Hence the y_i are themselves independent N(0, σ²) variates: an orthogonal transformation of independent N(0, σ²) variates produces independent N(0, σ²) variates.
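The sketch below illustrates these facts numerically (assuming NumPy; the orthogonal matrix is generated by a QR decomposition of a random matrix, which is just one convenient way to obtain one): A′A = I, |A| = ±1, and the sum of squares is preserved.

import numpy as np

rng = np.random.default_rng(3)
n = 5

# Build a random orthogonal matrix A via QR decomposition
a, _ = np.linalg.qr(rng.standard_normal((n, n)))

print(np.allclose(a.T @ a, np.eye(n)))     # A'A = I
print(np.linalg.det(a))                    # +1 or -1, so |J| = 1

# An orthogonal transformation preserves the sum of squares: Y'Y = X'X
x = rng.standard_normal(n)
y = a @ x
print(np.allclose(np.sum(y**2), np.sum(x**2)))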
THEOREM-4: Let (X_1, X_2, …, X_n) be a random sample from a normal population with mean μ and variance σ². Then

(1) X̄ ~ N(μ, σ²/n),
(2) ∑(X_i − X̄)²/σ² is a χ² variate with (n − 1) degrees of freedom,
(3) X̄ = (1/n) ∑ X_i and ns²/σ² = ∑(X_i − X̄)²/σ² are independently distributed; X̄ as N(μ, σ²/n) and ns²/σ² as a χ²_{(n−1)} variate.
PROOF: Since X_1, X_2, …, X_n is a random sample from a normal population with mean μ and variance σ², the joint probability differential of X_1, X_2, …, X_n is given by

dP(x_1, x_2, …, x_n) = [1/(√(2π) σ)]^n e^{−∑(x_i − μ)²/(2σ²)} dx_1 dx_2 … dx_n ;   −∞ < x_1, x_2, …, x_n < +∞
Let us transform to the variables Y_i (i = 1, 2, 3, …, n) by means of a linear orthogonal transformation Y = AX such that Y′Y = X′X and A′A = I, where y_i = a_i1 x_1 + a_i2 x_2 + ⋯ + a_in x_n ;  i = 1, 2, 3, …, n.

Let us choose in particular a_11 = a_12 = ⋯ = a_1n = 1/√n

⟹ y_1 = a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n = (1/√n)(x_1 + x_2 + ⋯ + x_n) = (1/√n) · n x̄ = √n x̄  … (1)
Since the transformation is orthogonal, we have Y′Y = X′X, i.e.

y_1² + y_2² + ⋯ + y_n² = x_1² + x_2² + ⋯ + x_n²  ⟹ ∑ y_i² = ∑ x_i² = ∑(x_i − x̄)² + n x̄²

⟹ ∑ y_i² = ∑(x_i − x̄)² + y_1²  ⟹ ∑ y_i² − y_1² = ∑(x_i − x̄)²  ⟹ ∑_{i=2}^{n} y_i² = ∑(x_i − x̄)²  … (2)
Also ∑(x_i − μ)² = ∑(x_i − x̄)² + n(x̄ − μ)², so the joint probability differential of the y_i becomes (with |J| = 1 for an orthogonal transformation)

dG(y_1, y_2, …, y_n) = [1/(√(2π) σ)]^n e^{−∑(x_i − μ)²/(2σ²)} |J| dy_1 dy_2 … dy_n = [1/(√(2π) σ)]^n e^{−n(x̄ − μ)²/(2σ²)} e^{−∑_{i=2}^{n} y_i²/(2σ²)} dy_1 dy_2 … dy_n
We have y_1 = √n x̄. Differentiating with respect to y_1 gives 1 = √n (dx̄/dy_1), ⟹ dy_1 = √n dx̄
dy
1 ( ) 1 ∑
dG(y , y , … , y ) = e √n dx × e dy … dy
√2πσ √2πσ
1 ( ) 1 ∑
= e dx × e dy … dy
√2π σ⁄√n √2πσ
Thus x̄ and ∑_{i=2}^{n} y_i² = ∑(x_i − x̄)² = ns² are independently distributed, which establishes part (3) of the theorem: the first factor shows that x̄ ~ N(μ, σ²/n), and the second factor is the joint density of (n − 1) independent N(0, σ²) variates, so ns²/σ² = ∑_{i=2}^{n} (y_i/σ)² is a χ² variate with (n − 1) degrees of freedom.
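A simulation sketch of parts (2) and (3) is given below (assuming NumPy/SciPy; μ = 2, σ = 3 and the sample size n = 8 are arbitrary choices). Near-zero correlation between x̄ and ns² is only a necessary symptom of independence, not a proof, so this is an illustration rather than a verification.

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu, sigma, n, n_sim = 2.0, 3.0, 8, 100_000

samples = rng.normal(mu, sigma, size=(n_sim, n))
xbar = samples.mean(axis=1)
ns2 = np.sum((samples - xbar[:, None])**2, axis=1)   # n*s^2 = sum (x_i - xbar)^2

# (2): ns^2 / sigma^2 should be chi-square with n-1 degrees of freedom
print(stats.kstest(ns2 / sigma**2, "chi2", args=(n - 1,)))

# (1): xbar should be N(mu, sigma^2/n)
print(xbar.mean(), xbar.var(), sigma**2 / n)

# (3): xbar and ns^2 should be uncorrelated (a symptom of independence)
print(np.corrcoef(xbar, ns2)[0, 1])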
Since for large n, (X − n)/√(2n) ~ N(0, 1), we can conclude that √(2X) − √(2n) ~ N(0, 1) for large n.
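A numerical comparison of this approximation is sketched below (assuming SciPy; n = 100 and the probability points are arbitrary choices): P(X ≤ x) for X ~ χ²_n is compared with Φ(√(2x) − √(2n)).

import numpy as np
from scipy import stats

n = 100                                      # large degrees of freedom (illustrative)
x = stats.chi2.ppf([0.1, 0.5, 0.9], df=n)    # a few representative quantiles

exact = stats.chi2.cdf(x, df=n)
approx = stats.norm.cdf(np.sqrt(2 * x) - np.sqrt(2 * n))
print(np.column_stack([exact, approx]))      # the two columns should be close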