Bivariate Analysis
Correlation and
Regression Analysis
Properties of the correlation coefficient-
• It is independent of change of origin and scale.
• It is independent of unit of measurement.
• The coefficient of correlation is the
geometric mean of the two regression
coefficients.
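The properties above can be checked numerically. The following is a minimal sketch, assuming the X and Y series from Question 1 as sample data; the helper names and the particular shift and scale values are illustrative choices, not part of the original slides. Note that the geometric mean of the two regression coefficients gives the magnitude of r, and that r keeps its sign only when both scale factors are positive.

```python
import math

def deviation_sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of products and squares of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def pearson_r(xs, ys):
    """Karl Pearson correlation coefficient."""
    sxy, sxx, syy = deviation_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

# Sample data taken from Question 1.
xs = [10, 4, 3, 5, 8, 5, 6, 7]
ys = [9, 6, 4, 5, 7, 4, 5, 8]
r = pearson_r(xs, ys)

# Change of origin and scale (arbitrary positive scale factors, assumed for illustration).
us = [(x - 5) / 2 for x in xs]
vs = [(y + 3) * 10 for y in ys]
r_transformed = pearson_r(us, vs)

# Regression coefficients of Y on X and of X on Y, and their geometric mean.
sxy, sxx, syy = deviation_sums(xs, ys)
b_yx, b_xy = sxy / sxx, sxy / syy
geometric_mean = math.sqrt(b_yx * b_xy)

print(round(r, 4), round(r_transformed, 4), round(geometric_mean, 4))
```

All three printed values agree, which is what the first and third properties predict.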
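Correlation Formula
$$r = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2 \, \sum (Y-\bar{Y})^2}}$$
&
$$r = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{N \, \sigma_X \, \sigma_Y}$$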
Question 1-2
1- Find the Karl Pearson correlation
coefficient from the following
data-
X= 10 4 3 5 8 5 6 7
Y= 9 6 4 5 7 4 5 8
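S.No.  X    Y    $X-\bar{X}$  $(X-\bar{X})^2$  $Y-\bar{Y}$  $(Y-\bar{Y})^2$  $(X-\bar{X})(Y-\bar{Y})$
1      10   9    4            16               3            9                12
2      4    6    -2           4                0            0                0
3      3    4    -3           9                -2           4                6
4      5    5    -1           1                -1           1                1
5      8    7    2            4                1            1                2
6      5    4    -1           1                -2           4                2
7      6    5    0            0                -1           1                0
8      7    8    1            1                2            4                2
N=8    48   48                36                            24               25
Here $\sum X = 48$ and $\sum Y = 48$, so $\bar{X} = 48/8 = 6$ and $\bar{Y} = 48/8 = 6$; the column totals are $\sum (X-\bar{X})^2 = 36$, $\sum (Y-\bar{Y})^2 = 24$ and $\sum (X-\bar{X})(Y-\bar{Y}) = 25$.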
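Since Variance = $\sigma^2 = \frac{\sum (X-\bar{X})^2}{N}$, we get $\sigma_X^2 = 36/8 = 4.5$ and $\sigma_Y^2 = 24/8 = 3$.
Therefore Standard Deviation = $\sigma_X = \sqrt{4.5} = 2.12$ and $\sigma_Y = \sqrt{3} = 1.73$.
We know that
$$r = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{N \, \sigma_X \, \sigma_Y} = \frac{25}{8 \times 2.12 \times 1.73} \approx 0.85$$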
Hence there is a high positive correlation
between the two variables.
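The worked solution can be reproduced step by step. This is a sketch that follows the deviation-table route, assuming the population form of the variance (division by N); the variable names are illustrative.

```python
import math

X = [10, 4, 3, 5, 8, 5, 6, 7]
Y = [9, 6, 4, 5, 7, 4, 5, 8]
n = len(X)

mean_x = sum(X) / n  # 48 / 8 = 6
mean_y = sum(Y) / n  # 48 / 8 = 6

# Deviations from the means, as in the table columns.
dx = [x - mean_x for x in X]
dy = [y - mean_y for y in Y]

sum_dx2 = sum(d * d for d in dx)               # 36
sum_dy2 = sum(d * d for d in dy)               # 24
sum_dxdy = sum(a * b for a, b in zip(dx, dy))  # 25

# Standard deviations with N in the denominator.
sd_x = math.sqrt(sum_dx2 / n)
sd_y = math.sqrt(sum_dy2 / n)

r = sum_dxdy / (n * sd_x * sd_y)
print(round(r, 4))  # 0.8505
```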
Spearman’s Rank Correlation
Method
• It uses ranks rather than actual
observation and makes no assumptions
about the population from which the
actual observations are drawn. The
correlation coefficient between two
series of ranks is called ‘Rank
Correlation Coefficient’. It is given by the
formula-
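$$r_s = 1 - \frac{6 \sum D^2}{N(N^2-1)}$$
where $D$ is the difference between the two ranks of an observation and $N$ is the number of pairs of observations.
We know that, with $\sum D^2 = 20$ and $N = 11$,
$$r_s = 1 - \frac{6 \times 20}{11 \times 120} = 1 - \frac{120}{1320} = 0.909$$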
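The rank correlation formula can be sketched in code as follows. The function names are illustrative, the second pair of rank lists is hypothetical data made up for the example, and the formula is assumed to be applied to ranks without ties.

```python
def spearman_from_sum_d2(sum_d2, n):
    """Rank correlation from the sum of squared rank differences and the number of pairs."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

def spearman_from_ranks(ranks_a, ranks_b):
    """Rank correlation from two equally long lists of ranks (no ties assumed)."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rank lists must have the same length")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return spearman_from_sum_d2(sum_d2, len(ranks_a))

# Values from the worked substitution: sum of D^2 = 20, N = 11.
print(round(spearman_from_sum_d2(20, 11), 3))  # 0.909

# Hypothetical ranks given by two judges to five items.
print(round(spearman_from_ranks([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]), 3))  # 0.8
```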
Question 4
If r=-0.8 and N=36, calculate
standard error, probable error and
also state whether the value of r is
significant.
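Solution of Question No. 4
Given that r=-0.8 and N=36
We know that
$$SE(r) = \frac{1-r^2}{\sqrt{N}} = \frac{1-(-0.8)^2}{\sqrt{36}} = \frac{0.36}{6} = 0.06$$
$$PE(r) = 0.6745 \times SE(r) = 0.6745 \times 0.06 = 0.0405$$
Since $|r| = 0.8$ is greater than $6 \times PE(r) = 0.243$, the value of r is significant. It means the
existence of r is practically significant.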
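The standard error and probable error calculations can be wrapped in small helpers. This is a sketch assuming the usual textbook formulas SE = (1 - r^2)/sqrt(N) and PE = 0.6745 x SE, together with the common rule of thumb that r is significant when |r| exceeds 6 x PE and shows no evidence of correlation when |r| is below PE. The function names and the "inconclusive" label for the middle case are assumptions made for the example.

```python
import math

def standard_error(r, n):
    """Standard error of the correlation coefficient r for n pairs."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """Probable error, taken as 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

def interpret(r, n):
    """Apply the rule of thumb based on the probable error."""
    pe = probable_error(r, n)
    if abs(r) > 6 * pe:
        return "significant"
    if abs(r) < pe:
        return "no evidence of correlation"
    return "inconclusive"

# Question 4: r = -0.8, N = 36.
print(round(standard_error(-0.8, 36), 4))  # 0.06
print(round(probable_error(-0.8, 36), 4))  # 0.0405
print(interpret(-0.8, 36))                 # significant

# Question 7: r = 0.3, N = 4.
print(round(probable_error(0.3, 4), 3))    # 0.307
print(interpret(0.3, 4))                   # no evidence of correlation
```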
Coefficient of Determination
• The coefficient of determination is
defined as the ratio of Explained
Variance to the total variance.
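Solution-7 Given that r=0.3 and N=4
$$SE(r) = \frac{1-r^2}{\sqrt{N}} = \frac{1-(0.3)^2}{\sqrt{4}} = \frac{0.91}{2} = 0.455$$
$$PE(r) = 0.6745 \times SE(r) = 0.6745 \times 0.455 = 0.307$$
Since $r = 0.3$ is less than $PE(r) = 0.307$, it means there
is no evidence of correlation.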
Testing the significance of
the Correlation Coefficient
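• The statistical test for the significance of
a correlation coefficient is conducted
using a t-statistic. The hypotheses to be
tested are as below:
$H_0: \rho = 0$ (there is no correlation in the population)
$H_a: \rho \neq 0$
The test statistic, with $n-2$ degrees of freedom, is
$$t_{n-2} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$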
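A short sketch of this test is given below, assuming the standard statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The inputs r = 0.9 and n = 5 are hypothetical, and the 5% two-tailed significance level is an assumption; the critical value 3.182 is the tabulated t value for 3 degrees of freedom at that level.

```python
import math

def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("at least three pairs of observations are needed")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical inputs for illustration.
r, n = 0.9, 5
t_value = t_for_correlation(r, n)

# Tabulated two-tailed critical value at the 5% level for 3 degrees of freedom.
T_CRITICAL_5PCT_3DF = 3.182
reject_h0 = abs(t_value) > T_CRITICAL_5PCT_3DF

print(round(t_value, 3), reject_h0)  # 3.576 True
```

For other sample sizes the critical value has to be read from a t table for the matching degrees of freedom.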
X 1 2 3 4 5
Y 10 20 30 50 40
Correlation coefficient
Total Variation in Y and X
[Figure: total variation in Y, unexplained variation in Y and explained variation in Y]
Total variation = Unexplained variation + Explained variation
$$\sum (Y-\bar{Y})^2 = \sum (Y-Y_C)^2 + \sum (Y_C-\bar{Y})^2$$
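The decomposition can be verified numerically. The sketch below assumes the five-point series X = 1..5, Y = 10, 20, 30, 50, 40 used in this section's worked example and fits the least-squares line of Y on X to obtain the computed values Y_C; the variable names are illustrative.

```python
X = [1, 2, 3, 4, 5]
Y = [10, 20, 30, 50, 40]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Least-squares line of Y on X: Y_C = intercept + slope * X.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
         / sum((x - mean_x) ** 2 for x in X))
intercept = mean_y - slope * mean_x
y_c = [intercept + slope * x for x in X]

total = sum((y - mean_y) ** 2 for y in Y)                 # total variation
unexplained = sum((y - c) ** 2 for y, c in zip(Y, y_c))   # residual variation
explained = sum((c - mean_y) ** 2 for c in y_c)           # variation due to the line

# Coefficient of determination: explained variation over total variation.
r_squared = explained / total

print(total, unexplained, explained, r_squared)  # 1000.0 190.0 810.0 0.81
```

The explained share of 0.81 is the coefficient of determination, which corresponds to a correlation coefficient of 0.9 for this series.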
Testing the Significance of β
The hypothesis to be tested for the slope
coefficient is given as-
$H_0: \beta = 0$
$H_a: \beta \neq 0$
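X          1   2   3   4   5
Sales (Y)  10  20  30  50  40

S.No.  X    Y    $X-\bar{X}$  $(X-\bar{X})^2$  $Y-\bar{Y}$  $(Y-\bar{Y})^2$  $(X-\bar{X})(Y-\bar{Y})$
1      1    10   -2           4                -20          400              40
2      2    20   -1           1                -10          100              10
3      3    30   0            0                0            0                0
4      4    50   1            1                20           400              20
5      5    40   2            4                10           100              20
N=5    15   150               10                            1000             90
$$b_{YX} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (X-\bar{X})^2} = \frac{90}{10} = 9$$
$$Y_C = \bar{Y} + b_{YX}(X-\bar{X}) = 30 + 9(X-3) = 3 + 9X$$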
S.No.  X    Y    $Y_C$  $Y-Y_C$  $(Y-Y_C)^2$
1      1    10   12     -2       4
2      2    20   21     -1       1
3      3    30   30     0        0
4      4    50   39     11       121
5      5    40   48     -8       64
Total                            190
3- Unexplained variation in Y = $\sum (Y-Y_C)^2$
Standard Error of estimate
$$S_{YX} = \sqrt{\frac{\sum (Y-Y_C)^2}{N-2}} = \sqrt{\frac{450}{8}} = 7.5$$
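Solution of Question No.11 (4-5)
To test the significance of the slope coefficient, the
following hypothesis is to be tested-
$H_0: \beta = 0$
$H_a: \beta \neq 0$
Standard error
$$SE(\hat{\beta}) = \frac{S_{YX}}{\sqrt{\sum (X-\bar{X})^2}} = \frac{7.5}{\sqrt{30}} = 1.37$$
The t statistic will be
$$t_{n-2} = \frac{\hat{\beta}-\beta}{SE(\hat{\beta})} = \frac{-10-0}{1.37} = -7.30$$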
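The slope test can be sketched as two small functions. The figures used are those visible in the solution (unexplained variation 450, N - 2 = 8, sum of squared deviations of X equal to 30, estimated slope -10); N = 10 is inferred from N - 2 = 8, and the function names are illustrative.

```python
import math

def standard_error_of_estimate(unexplained_variation, n):
    """S_YX: square root of the unexplained variation divided by n - 2."""
    return math.sqrt(unexplained_variation / (n - 2))

def slope_t_statistic(beta_hat, s_yx, sum_sq_dev_x, beta_null=0.0):
    """Return (t, SE of the slope) for H0: beta = beta_null."""
    se_beta = s_yx / math.sqrt(sum_sq_dev_x)
    return (beta_hat - beta_null) / se_beta, se_beta

# Values taken from the solution; n = 10 is inferred from N - 2 = 8.
s_yx = standard_error_of_estimate(450, 10)
t_value, se_beta = slope_t_statistic(-10, s_yx, 30)

print(round(s_yx, 2), round(se_beta, 2), round(t_value, 2))  # 7.5 1.37 -7.3
```

The computed t is then compared with the tabulated t value for n - 2 = 8 degrees of freedom at the chosen significance level.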