Correlation and Regression Analysis

This document discusses bivariate statistics and correlation analysis. It defines correlation as the relationship between two variables, and describes how the type (direct or inverse) and degree (correlation coefficient r value) of relationship can be determined. It provides examples of calculating r using a formula and calculator, and interpreting the results to describe the relationship between variables. Values of r range from -1 to 1, where -1 is a perfect inverse relationship and 1 is a perfect direct relationship.

4.7 Bivariate Statistics

Learning Outcome

At the end of the lesson, students will be able to:

1. calculate correlation coefficient
2. interpret the type and degree of relationship between variables
3. predict values given historical or past data

4.7.1 Correlation analysis

Correlation analysis is concerned with the relationship between variables. A correlation is a
relationship between two variables, where “x” is usually designated as the independent variable
and “y” as the dependent variable.

The results of correlation analysis are interpreted in two ways: the type of relationship and the
degree of the relationship. The type of relationship is described as direct (if the result is
positive) or inverse (if the result is negative). A direct relationship shows that when the value
of “x” increases ↑, the value of “y” also increases ↑, and when the value of “x” decreases ↓, the
value of “y” also decreases ↓. An inverse relationship shows that when the value of “x”
increases ↑, the value of “y” decreases ↓, and vice versa.

The degree and type of a linear relationship between two variables are represented by the
correlation coefficient “r”. The value of “r” ranges from -1 to +1, with -1 implying a
perfect inverse relationship and +1 implying a perfect direct relationship. A value of 0 indicates
no linear correlation.

[Figure: two scatter plots titled “Perfect direct relationship” and “Perfect inverse relationship”. In each panel the horizontal axis runs from 1 to 6 and the vertical axis from 0 to 7.]

The value of “r” is a decimal number between -1 and +1. The degree of relationship is interpreted
using ranges of these values. The table below shows one example of such an interpretation.

Value of r        Interpretation
+1.00             Perfect direct relationship
+.70 to +.99      Very strong direct relationship
+.40 to +.69      Strong direct relationship
+.30 to +.39      Moderate direct relationship
+.20 to +.29      Weak direct relationship
+.01 to +.19      No or negligible relationship
0                 No relationship [zero order correlation]
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak inverse relationship
-.30 to -.39      Moderate inverse relationship
-.40 to -.69      Strong inverse relationship
-.70 to -.99      Very strong inverse relationship
-1.00             Perfect inverse relationship
Source: http://www.statisticshowto.com/how-to-compute-pearsons-correlation-coefficients/
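
The interpretation table can be expressed as a small lookup. The following is only a sketch in Python: the function name `interpret_r` is hypothetical, and it assumes that r is rounded to two decimal places before it is compared with the ranges, since the table does not say how to treat values that fall between two ranges (for example 0.195).

```python
def interpret_r(r: float) -> str:
    """Label r using the interpretation table (r is rounded to 2 decimals first)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    value = round(r, 2)  # assumption: compare the two-decimal value with the table
    if value == 0:
        return "No relationship"
    direction = "direct" if value > 0 else "inverse"
    size = abs(value)
    if size == 1.0:
        return f"Perfect {direction} relationship"
    if size >= 0.70:
        degree = "Very strong"
    elif size >= 0.40:
        degree = "Strong"
    elif size >= 0.30:
        degree = "Moderate"
    elif size >= 0.20:
        degree = "Weak"
    else:
        # The table gives the same label for both signs in this range.
        return "No or negligible relationship"
    return f"{degree} {direction} relationship"


print(interpret_r(0.45))
print(interpret_r(-0.25))
```

Other references use different cut-off values, so the thresholds here should be read as this lesson's convention only.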

There are several types of correlation tests:

1. Phi correlation is used to determine the relationship between two variables that are both
measured on a nominal scale with two categories (dichotomous).
2. Point biserial correlation is used to determine the relationship between two variables, one of
which is dichotomous (nominal scale with two categories) and the other on an interval or ratio scale.
3. Spearman rank correlation is used to determine the relationship between two variables that are
measured on an ordinal scale.
4. Pearson product moment correlation is used to determine the relationship between two variables
that are measured on an interval or ratio scale.

The value of r is determined using the correlation coefficient formula:

$$ r = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}} $$

Example:

Given the following data, determine the correlation coefficient and interpret the results.
X 3 5 6 8 9 11
Y 2 3 4 8 5 8

Step 1. Calculate the values of the summations (∑)

X      Y      X²     Y²     XY
3      2      9      4      6
5      3      25     9      15
6      4      36     16     24
8      8      64     64     64
9      5      81     25     45
11     8      121    64     88
∑X = 42   ∑Y = 30   ∑X² = 336   ∑Y² = 182   ∑XY = 242

∑X – add all the values of X.
∑Y – add all the values of Y.
∑X² – add all the squared values of X.
∑Y² – add all the squared values of Y.
∑XY – add all the products of the corresponding X and Y values.
N – the number of data pairs.
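
As a cross-check on Step 1, the summation table can be rebuilt with a few lines of Python. This is only an illustrative sketch; the variable names (`rows`, `sum_x2`, and so on) are chosen here and are not part of the lesson.

```python
xs = [3, 5, 6, 8, 9, 11]
ys = [2, 3, 4, 8, 5, 8]

# One row per data pair: X, Y, X squared, Y squared, XY
rows = [(x, y, x * x, y * y, x * y) for x, y in zip(xs, ys)]

# The column totals are the five summations needed by the formula.
sum_x, sum_y, sum_x2, sum_y2, sum_xy = (sum(col) for col in zip(*rows))
n = len(rows)  # number of data pairs

for row in rows:
    print(*row, sep="\t")
print(sum_x, sum_y, sum_x2, sum_y2, sum_xy, sep="\t")
```

The last printed line should match the totals in the table: 42, 30, 336, 182 and 242.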
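Step 2. Substitute the values in the formula.

$$ r = \frac{6(242) - (42)(30)}{\sqrt{\left[6(336) - (42)^2\right]\left[6(182) - (30)^2\right]}} = \frac{192}{\sqrt{(252)(192)}} \approx 0.87 $$

There is a very strong, direct relationship between X and Y, since r ≈ 0.87 falls in the +.70 to
+.99 range of the interpretation table. When the value of X increases, the value of Y also tends
to increase.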
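
The substitution can also be checked numerically. The sketch below assumes the five summations from Step 1 and applies the correlation coefficient formula directly; the function name `r_from_sums` is hypothetical.

```python
import math


def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient computed from the summations of the data pairs."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Summations from the worked example (N = 6 data pairs)
r = r_from_sums(n=6, sum_x=42, sum_y=30, sum_x2=336, sum_y2=182, sum_xy=242)
print(round(r, 2))
```

The result rounds to 0.87, which should agree with the value shown by the calculator procedures that follow.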

Using the calculator to find the value of the correlation coefficient, r.

In this calculator model, the following keys will
be identified

Example
Determine the relationship in the given data.
x y
3 2
5 3
6 4
8 8
9 5
11 8

To use the calculator, open Reg mode

Press MODE twice. This will be displayed.

Press 2 for REG (regression) 1 for Lin (linear)

Enter the x, y data pairs using the “,” and “DT” keys:

Press 3 , 2 [DT] (n=1 will be seen on screen)
Press 5 , 3 [DT] (n=2 will be seen on screen)
Press 6 , 4 [DT]
Press 8 , 8 [DT]
Press 9 , 5 [DT]
Press 11 , 8 [DT] (n=6 will be seen on screen; this shows the total number of data pairs entered)

To calculate r,

Press SHIFT then SVAR

This will be displayed on screen

Press arrow to the right two times


This will be displayed on screen

Press 3 for the value of the correlation coefficient, r.

In this calculator model, the
following keys will be identified

Example
Determine the relationship in the given data.
x y
3 2
5 3
6 4
8 8
9 5
11 8

To use the calculator, press MODE and this will appear.

Press 3 (Stat) and this will appear on the screen.

Press 2 and this will be displayed.

Enter all the data pairs under x and y.

Press SHIFT 1 (STAT) and this will appear.

or

Note: If only 3 of these items appear, press AC twice and then press SHIFT 1 (STAT) again.

Press 7 (Reg) or 5 (Reg) and this will be on the screen

Press 3, then = and the value of r will appear on screen.

Name: _______________________________ Score: ________________
Year and Section: _____________________ Date: _________________

EXERCISE 4.7

1. It was claimed that the taller the father is, the taller the eldest son will be. Given
the data on the heights of fathers and their eldest sons, find the correlation coefficient. Is the
claim true?

Father’s ht (inches)   65 63 66 64 87 62 70 65 68 66
Son’s ht (inches)      68 66 68 65 69 66 68 65 71 67

2. The following table shows the final grades of ten students in Algebra and Statistics.
Find the correlation coefficient and interpret the results.

Algebra (X) 75 80 93 65 87 71 98 68 84 77
Statistics (Y) 82 78 86 72 91 80 95 72 89 74

3. It is generally known that the number of road accidents is inversely proportional to
road width. The following data show the results of a study indicating the number of
accidents occurring per hundred thousand vehicle kilometers. Determine whether the data
are consistent with this observation.

Road width (ft) X      75 52 60 33 22
No. of accidents Y     40 84 55 92 90

4. A men’s tie shop ran ten sales promotions to determine the number of men’s neckties of
a certain type that customers would buy at various prices. Following are the results.
Determine the relationship and interpret the results.
Prices (X) 150 160 175 190 200 220 250 350 400 500
No. sold (Y) 187 149 150 150 120 80 60 50 50 30

4.8 Regression Analysis

Regression analysis is used to find trends in data. It is used to make estimations or projections of
future values. It is also used to identify independent predictor(s) of a dependent variable. Linear
regression is the most basic type of regression and is commonly used in predictive analysis.

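It utilizes the equation of a linear function:

$$ y = a + bx $$

where y is the predicted (dependent) variable and x is the given independent variable. The values of
a and b are calculated using the following formulas:

$$ a = \frac{\left(\sum Y\right)\left(\sum X^2\right) - \left(\sum X\right)\left(\sum XY\right)}{N\sum X^2 - \left(\sum X\right)^2} \qquad b = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{N\sum X^2 - \left(\sum X\right)^2} $$

When ∑X = 0 (that is, when the x values are coded so that they sum to zero), the formulas simplify to:

$$ a = \frac{\sum Y}{N} \qquad b = \frac{\sum XY}{\sum X^2} $$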
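
To see that the simplified formulas agree with the general ones when the x values sum to zero, both can be evaluated on the same data. The sketch below is only an illustration: the function names are hypothetical, and the five data points are made-up values with x coded as -2 to 2, not data from the lesson.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for y = a + b*x using the general summation formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


def shortcut_coefficients(xs, ys):
    """Simplified formulas, valid only when the x values sum to zero."""
    if sum(xs) != 0:
        raise ValueError("the shortcut requires the x values to sum to zero")
    a = sum(ys) / len(ys)
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return a, b


# Hypothetical illustration data (not from the lesson), with x coded so that sum(x) = 0
coded_x = [-2, -1, 0, 1, 2]
ys = [3, 5, 4, 8, 10]

a, b = regression_coefficients(coded_x, ys)
a_short, b_short = shortcut_coefficients(coded_x, ys)
print(a, b)
print(a_short, b_short)
```

Both lines should print the same pair of values, which is why coding the x values (for example, years) around their middle can shorten manual computation.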
Example
Predict the sales in 2018 and 2019 given the historical data on sales.

Year (x)   Sales (in million pesos) (y)
2012 12.8
2013 18.6
2014 25.3
2015 24.5
2016 24.8
2017 26.3
2018 ?
2019 ?

To do manual computation, assign the year as the variable x and the sales as the variable y.
Calculate the summations.

Year (x)   Sales (in million pesos) (y)   x²          y²         xy
2012       12.8                           4048144     163.84     25753.6
2013       18.6                           4052169     345.96     37441.8
2014       25.3                           4056196     640.09     50954.2
2015       24.5                           4060225     600.25     49367.5
2016       24.8                           4064256     615.04     49996.8
2017       26.3                           4068289     691.69     53047.1
Total      12087      132.3               24349279    3056.87    266561

∑X = 12087   ∑Y = 132.3   ∑X² = 24349279   ∑Y² = 3056.87   ∑XY = 266561

Solve for a and b by substituting the values in the formula

$$ a = \frac{\left(\sum Y\right)\left(\sum X^2\right) - \left(\sum X\right)\left(\sum XY\right)}{N\sum X^2 - \left(\sum X\right)^2} = \frac{(132.3)(24349279) - (12087)(266561)}{6(24349279) - (12087)^2} = -4887.5743 $$

$$ b = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{N\sum X^2 - \left(\sum X\right)^2} = \frac{6(266561) - (12087)(132.3)}{6(24349279) - (12087)^2} = 2.4371 $$

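$$ y_{2018} = -4887.5743 + (2.4371)(2018) = 30.58 $$

$$ y_{2019} = -4887.5743 + (2.4371)(2019) = 33.02 $$

These predictions use the unrounded values of a and b, as the calculator does. Substituting the
rounded value b = 2.4371 by hand gives slightly lower figures (about 30.49 and 32.93), because the
small rounding error in b is multiplied by a large x value.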
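
The manual computation can be verified with a short script. This is a sketch that starts from the summations of the sales example; the names `predict` and `rounded_2018` are introduced here for illustration only.

```python
# Summations from the worked example (N = 6 years of sales data)
n = 6
sum_x, sum_y = 12087, 132.3
sum_x2, sum_xy = 24349279, 266561.0

denominator = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
b = (n * sum_xy - sum_x * sum_y) / denominator


def predict(year):
    """Predicted sales (in million pesos) from y = a + b*x."""
    return a + b * year


for year in (2018, 2019):
    print(year, round(predict(year), 2))

# Rounding a and b to four decimals before substituting shifts the result slightly.
rounded_2018 = round(a, 4) + round(b, 4) * 2018
print(round(rounded_2018, 2))
```

Because the x values are large (years), small rounding in b has a visible effect on the prediction, so it is safer to keep full precision until the final step.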

Using the calculator, start by opening REG mode.

Press 2 for REG (regression) 1 for Lin (linear)

Enter the x, y data pairs using the “,” and “DT” keys

Press SHIFT then SVAR

This will be displayed on screen

Press arrow to the right two times


This will be displayed on screen

The value of a is on 1, and the value of b is on 2.


Press 1 and the value of a appears (-4887.5743)
Press 2 and the value of b appears (2.4371)
Press the arrow to the right and this will appear on the screen. This will be used to make the
prediction for 2018 and 2019.

To predict the value at 2018, type 2018. Press SHIFT 2 (SVAR)

Press 5 then = and the projected value for 2018, which is 30.58 will appear.

To predict the value at 2019, type 2019. Press SHIFT 2 (SVAR)

Press 5 then = and the projected value for 2019, which is 33.017 will appear.

Name: _______________________________ Score: ________________
Year and Section: _____________________ Date: _________________

EXERCISE 4.8

1. X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
Determine the value of Y when X = 10; X = 15

2. Project the sales in 2017, 2018, and 2019.

Year   Sales (thousand pesos)
2008 436
2009 469
2010 483
2011 488
2012 443
2013 470
2014 530
2015 528
2016 500

3. The following is a summary of the total assets, in million pesos, of 12 credit unions. Also
reported are the capital ratio, a financial measure showing the ratio of total equity to total
liabilities, and the profit of each credit union for the year.

Assets (in million pesos)   Capital Ratio   Profit (in thousand pesos)
74.139 8.759 1,147
69.624 7.505 589
58.033 8.185 789
49.235 9.207 449
37.639 12.572 414
30.650 13.537 191
27.070 13.631 259
22.008 8.451 105
21.288 8.441 232
20.153 7.711 181
19.923 9.075 180
19.153 10.808 283

a. Determine the relationship between assets and capital ratio.

b. Determine the relationship between assets and profit.

c. Estimate the profit when assets are a) 11.234; b) 10.734.

d. Determine the relationship between capital ratio and profit.

e. Estimate the profit when the capital ratio is a) 14.705; b) 11.608.
