Correlation and Regression Analysis
7 Bivariate Statistics
Learning Outcome
The results of a correlation analysis are interpreted in two ways: by the type of relationship and
by the degree of the relationship. The type of relationship is described as direct (if the result
is positive) or inverse (if the result is negative). A direct relationship means that when the
value of “x” increases ↑, the value of “y” also increases ↑, and when the value of “x”
decreases ↓, the value of “y” also decreases ↓. An inverse relationship means that when the
value of “x” increases ↑, the value of “y” decreases ↓, and vice versa.
The degree and type of a linear relationship between two variables are represented by the
correlation coefficient “r”. The value of “r” ranges from -1 to +1, with -1 implying a
perfect inverse relationship and +1 implying a perfect direct relationship. A value of 0 indicates
no linear correlation.
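The sign rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `relationship_type` and the sample values 0.87 and -0.45 are assumptions made for the example, not part of the module.

```python
def relationship_type(r):
    """Classify the type of linear relationship from the sign of r."""
    # r is only meaningful between -1 and +1
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        return "direct"
    if r < 0:
        return "inverse"
    return "no correlation"


print(relationship_type(0.87))   # direct
print(relationship_type(-0.45))  # inverse
```

The function reports only the type of relationship; the degree still has to be read from an interpretation table.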
The value of “r” is usually a decimal between -1 and +1. The degree of relationship is interpreted
using ranges of these decimal values. The table shows an example of the interpretation.
1. Phi correlation is used to determine the relationship between two variables that are both on a
nominal scale.
2. Point biserial correlation is used to determine the relationship between two variables, one of
which is on a nominal or ordinal scale and the other on an interval or ratio scale.
3. Spearman rank correlation is used to determine the relationship between two variables that are
both on an ordinal scale.
4. Pearson product moment correlation is used to determine the relationship between two variables
that are both on an interval or ratio scale.
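As a rough illustration of item 3, the sketch below computes a Spearman rank correlation in Python using the common shortcut formula 1 - 6∑d²/(n(n² - 1)), which is valid only when there are no tied values. The function names and the two judges' rankings are hypothetical and were made up for this example.

```python
def ranks(values):
    """Return 1-based ranks of the values; this sketch assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation from the squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of five contestants by two judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(spearman_rho(judge_a, judge_b))  # 0.8
```

The result is read the same way as “r”: positive values indicate a direct relationship between the two rankings.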
Correlation coefficient formula:
$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$
where N is the number of data pairs.
Example:
Given the following data, determine the correlation coefficient and interpret the results.
X 3 5 6 8 9 11
Y 2 3 4 8 5 8
X     Y     X²     Y²     XY
3     2     9      4      6
5     3     25     9      15
6     4     36     16     24
8     8     64     64     64
9     5     81     25     45
11    8     121    64     88
∑X = 42   ∑Y = 30   ∑X² = 336   ∑Y² = 182   ∑XY = 242
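$$r = \frac{6(242) - (42)(30)}{\sqrt{\left[6(336) - (42)^2\right]\left[6(182) - (30)^2\right]}} = \frac{192}{\sqrt{(252)(192)}} \approx 0.873$$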
There is a strong, direct relationship between X and Y. When the value of X increases, the
value of Y also tends to increase.
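The manual computation can be checked with a short Python sketch that builds the same column sums and substitutes them into the correlation coefficient formula. The variable names are illustrative; only the data come from the example.

```python
from math import sqrt

# Data from the worked example
x = [3, 5, 6, 8, 9, 11]
y = [2, 3, 4, 8, 5, 8]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(u * v for u, v in zip(x, y))

# Numerator and denominator of the correlation coefficient formula
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 42 30 336 182 242
print(round(r, 3))                           # 0.873
```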
In this calculator model, the following keys will
be identified
Example
Determine the relationship in the given data.
x y
3 2
5 3
6 4
8 8
9 5
11 8
To calculate r,
In this calculator model, the
following keys will be identified
Example
Determine the relationship in the given data.
x y
3 2
5 3
6 4
8 8
9 5
11 8
Enter all the data pairs under x and y.
Note: If only 3 of these items appear, press AC twice and then press SHIFT 1 (STAT) again.
Year and Section: _____________________ Date: _________________
EXERCISE 4.7
1. It was claimed that the taller the father, the taller the eldest son. Given
the data on the heights of fathers and their eldest sons, find the correlation coefficient. Is the
claim true?
Father’s ht (inches) 65 63 66 64 87 62 70 65 68 66
Son’s ht (inches) 68 66 68 65 69 66 68 65 71 67
2. The following table shows the final grades of ten students in Algebra and Statistics.
Find the correlation coefficient and interpret the results.
Algebra (X) 75 80 93 65 87 71 98 68 84 77
Statistics (Y) 82 78 86 72 91 80 95 72 89 74
3. It is generally believed that the number of road accidents is inversely related to
road width. The following data show the results of a study indicating the number of
accidents occurring per hundred thousand vehicle kilometers. Determine whether the data
are consistent with this observation.
4. A men’s tie shop ran ten sales promotions to determine the number of men’s neckties of
a certain type that customers would buy at various prices. Following are the results.
Determine the relationship and interpret the results.
Prices (X) 150 160 175 190 200 220 250 350 400 500
No. sold (Y) 187 149 150 150 120 80 60 50 50 30
4.8 Regression Analysis
Regression analysis is used to find trends in data. It is used to estimate or project future
values and to identify independent predictor(s) of a dependent variable. Linear regression is the
most basic type of regression and is commonly used in predictive analysis.
The linear regression equation is y = a + bx, where y is the predicted (dependent) variable and x
is the given independent variable. The values of a and b are calculated using the following
formulas.
$$a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{N\sum X^2 - (\sum X)^2} \qquad b = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}$$
When ∑X = 0, the formulas simplify to
$$a = \frac{\sum Y}{N} \qquad b = \frac{\sum XY}{\sum X^2}$$
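The shortcut can be derived by substituting ∑X = 0 into the general formulas for a and b. This situation typically arises when the x values are coded as deviations from their mean; that reading is an assumption, since the text does not spell out how the coding is done.

$$a = \frac{(\sum Y)(\sum X^2) - (0)(\sum XY)}{N\sum X^2 - (0)^2} = \frac{(\sum Y)(\sum X^2)}{N\sum X^2} = \frac{\sum Y}{N}$$

$$b = \frac{N\sum XY - (0)(\sum Y)}{N\sum X^2 - (0)^2} = \frac{N\sum XY}{N\sum X^2} = \frac{\sum XY}{\sum X^2}$$

When the shortcut is used, predictions must be made with the coded x values rather than the original ones.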
Example
Predict the sales in 2018 and 2019 given the historical data on sales.
For manual computation, assign the year as the variable x and the sales as the variable y, then
calculate the summations.
Year (x)   Sales (in million pesos) (y)   x²          y²        xy
2012       12.8                           4048144     163.84    25753.6
2013       18.6                           4052169     345.96    37441.8
2014       25.3                           4056196     640.09    50954.2
2015       24.5                           4060225     600.25    49367.5
2016       24.8                           4064256     615.04    49996.8
2017       26.3                           4068289     691.69    53047.1
Total      12087      132.3               24349279    3056.87   266561
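∑X = 12087   ∑Y = 132.3   ∑X² = 24349279   ∑Y² = 3056.87   ∑XY = 266561
$$a = \frac{(132.3)(24349279) - (12087)(266561)}{6(24349279) - (12087)^2} = \frac{-513195.3}{105} = -4887.5743$$
$$b = \frac{6(266561) - (12087)(132.3)}{6(24349279) - (12087)^2} = \frac{255.9}{105} = 2.4371$$
$$y_{2018} = -4887.5743 + (2.4371)(2018) = 30.58$$
$$y_{2019} = -4887.5743 + (2.4371)(2019) = 33.02$$
The predicted values are computed with the unrounded coefficients; substituting the rounded values
shown above gives about 30.49 and 32.93 instead.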
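The regression example can be reproduced with a small Python sketch that applies the formulas for a and b to the sales data. The function name `linear_fit` is an illustrative choice, not something defined in the module.

```python
# Data from the sales example
years = [2012, 2013, 2014, 2015, 2016, 2017]
sales = [12.8, 18.6, 25.3, 24.5, 24.8, 26.3]  # in million pesos


def linear_fit(x, y):
    """Return (a, b) of y = a + bx using the summation formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_xy = sum(u * v for u, v in zip(x, y))
    denom = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b = (n * sum_xy - sum_x * sum_y) / denom
    return a, b


a, b = linear_fit(years, sales)
print(round(a, 4), round(b, 4))  # -4887.5743 2.4371

# Predictions use the unrounded coefficients
for year in (2018, 2019):
    print(year, round(a + b * year, 2))  # 30.58 for 2018, 33.02 for 2019
```

Because a is a large negative number and b multiplies a four-digit year, small rounding errors in b are magnified. Carrying full precision until the final step avoids the 30.49 that results from substituting b = 2.4371 by hand.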
Enter the x, y data pairs using the “,” and “DT” keys
Press 2 and the value of b appears (2.4371)
Press the arrow to the right and this will appear on the screen. This will be used to make the
prediction for 2018 and 2019.
Press 5 then = and the projected value for 2018, which is 30.58, will appear.
Press 5 then = and the projected value for 2019, which is 33.017, will appear.
Name: _______________________________ Score: ________________
Year and Section: _____________________ Date: _________________
EXERCISE 4.8
1. X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
Determine the value of Y when X = 10; X = 15
3. The following is a summary of the total assets, in million pesos, of 12 credit unions. Also
reported are the capital ratio, a financial measure showing the ratio of total equity to total
liabilities, and the profit of each credit union for the year.
Assets (in million pesos)   Capital Ratio   Profit (in thousand pesos)
74.139                      8.759           1,147
69.624                      7.505           589
58.033                      8.185           789
49.235                      9.207           449
37.639                      12.572          414
30.650                      13.537          191
27.070                      13.631          259
22.008                      8.451           105
21.288                      8.441           232
20.153                      7.711           181
19.923                      9.075           180
19.153                      10.808          283