Ch9 - Correlation Regression

class notes of Quantitative Business Methods

Uploaded by

zainab.jh88
Copyright
© © All Rights Reserved

Quantitative Business Methods

QBM
Chapter 9

Correlation and Regression

Two continuous variables
• Correlation and regression are concerned with the investigation of two continuous variables.
• Previously we have only considered a single variable - now we look at two associated variables.
We might wish to know:
• If a relationship exists between those variables
• If so, how strong that relationship is
• What form that relationship takes
• Can we make use of that relationship for predictive purposes, i.e. forecasting?
• Correlation describes the strength of the relationship. It is not concerned with 'cause' and 'effect'.
• Regression describes the relationship itself in the form of a straight line equation which best fits the data.
• Some initial insight into the relationship between two continuous variables can be obtained by plotting a scatter diagram and simply looking at the resulting graph.
• Does the relationship seem to be linear or curved?
• If there appears to be a linear relationship, can it be quantified?
• A correlation coefficient is calculated as the measure of the strength of this relationship.
• Its symbol is 'r' and its value lies between -1 and +1.
• Is the association between the two variables strong enough to be useful?
• If the relationship is found to be significantly strong, its nature can be found using linear regression.
• This defines the equation of the straight line of 'best fit' through the bivariate data, y = a + bx.
• For example, £x spent on Advertising is expected to increase Sales by £y.
• The 'goodness of fit' can be calculated
to see how well the line fits the data.
• Once defined by an equation, the
relationship can be used for
predictive purposes.

Example
'Ice cream Sales' for a particular firm of manufacturers and 'Average Monthly Temperature' are:

Month        Av. Temp (°C)   Sales (£'000)
January            4               73
February           4               57
March              7               81
April              8               94
May               12              110
June              15              124
July              16              134
August            17              139
September         14              124
October           11              103
November           7               81
December           5               80

From this data we need:
• Scatter diagram
• Correlation coefficient
• Regression line
• Goodness of fit
• Prediction
Scatter diagrams
We look for a linear relationship with the bivariate points plotted being reasonably close to the, yet unknown, 'line of best fit'.
• Plot the independent variable, x, on the horizontal axis.
• Plot the potentially dependent variable on the vertical, y, axis.
• (Minitab output shown)
• Looks promising: a straight line relationship, with all points fairly close to a 'line of best fit'.
[Figure: scatter diagram, Sales against Average Monthly Temperature; Sales on the vertical axis, Av. Temp. on the horizontal axis]
• Pearson's Correlation Coefficient (r)
• (for quantitative data only)

• This quantifies the strength of a linear relationship

• Calculation of Correlation coefficient
• Input data to calculator
• Best to use a calculator in 'A+BX type' or 'LR mode', as will be demonstrated in tutorials.
• (Method in specific calculator manual)
• (If without 'A+BX type' or 'LR mode', complex formulae and methods are needed; these are also in the textbook or handout.)
• Correlation coefficient, r, (output from calculator):
r = 0.9833
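
For readers without a calculator in 'LR mode', the same figure can be checked in a few lines of Python. This is only a sketch using the standard product-moment formula r = Sxy / √(Sxx × Syy); the function name `pearson_r` and the variable names are illustrative and not part of the course material.

```python
from math import sqrt

# Data from the ice-cream example: average monthly temperature (°C) and sales (£'000)
temp = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(temp, sales)
print(round(r, 4))  # 0.9833
```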

Is this correlation coefficient, 0.9833, significant?
Hypothesis test for a Pearson’s correlation coefficient
• H0: There is no association between ice-cream sales and
average monthly temperature.
• H1: There is an association between them.
• Critical Value:
• Pearson's correlation coefficient tables, 5%, (n - 2) = 10 degrees of freedom = 0.576
• Test statistic: 0.983
• Conclusion: The test statistic exceeds the critical value
so we reject the Null Hypothesis, H0, and conclude that
there is a significant association between ice-cream sales
and average monthly temperature.
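
The decision rule in this test can be sketched as follows. The critical value 0.576 is simply copied from the slide rather than computed, and the variable names are illustrative assumptions.

```python
r = 0.9833              # correlation coefficient from the calculator
n = 12                  # number of (temperature, sales) pairs
critical_value = 0.576  # 5% table value quoted on the slide, not computed here

degrees_of_freedom = n - 2
# Reject H0 when the size of r exceeds the tabulated critical value
reject_h0 = abs(r) > critical_value

print(degrees_of_freedom, reject_h0)  # 10 True
```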

Regression equation (y = a + bx)
• There is a significant relationship between the two
variables, so the next step is to define it as a
regression equation.

• This can be produced directly from a calculator in 'A+BX type' or 'LR mode' as shown in your manual.
• (If without 'A+BX type' or 'LR mode', complex formulae and methods are needed; these are also in the textbook or handout.)
• The regression line is described, in general, as
the straight line of ‘best fit’ with the equation:
• y = a + bx
• where x and y are the independent and dependent
variables, a the intercept on the y-axis, and b the
slope of the line.
• For this data: a = 45.5, b = 5.45
• Giving the regression equation:
• y = 45.5 + 5.45x
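
As a cross-check on the calculator output, the intercept and slope can be reproduced with the usual least-squares formulae b = Sxy / Sxx and a = ȳ - b x̄. This is a minimal sketch; the helper name `fit_line` is hypothetical.

```python
# Data from the ice-cream example: average monthly temperature (°C) and sales (£'000)
temp = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + bx (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the centroid
    return a, b

a, b = fit_line(temp, sales)
print(round(a, 1), round(b, 2))  # 45.5 5.45
```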
Draw this line on the scatter diagram:
Plot any three points and join them up.
Useful points: (0, a); the centroid (x̄, ȳ); any other points calculated from the regression equation:
E.g. If x = 15; y = 45.5 + 5.45 × 15 = 127.2
For any value of x the corresponding value of y can be found directly from the calculator [ŷ].
[Figure: scatter diagram with regression line, Sales against Average Monthly Temperature; Sales on the vertical axis, Av. Temp. on the horizontal axis]
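
The three plotting points can be worked out as below. This is a sketch using the rounded coefficients a = 45.5 and b = 5.45 from the regression equation; the names `centroid` and `points` are illustrative.

```python
# Rounded regression coefficients from y = 45.5 + 5.45x
a, b = 45.5, 5.45

temp = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

# The centroid is the point (mean of x, mean of y)
centroid = (sum(temp) / len(temp), sum(sales) / len(sales))

# Three points for drawing the line: the intercept, the centroid, and x = 15
points = [(0, a), centroid, (15, a + b * 15)]
for x, y in points:
    print(x, y)
```

For this data the centroid is (10, 100), which lies on the fitted line because 45.5 + 5.45 × 10 = 100.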
Goodness of Fit
• How well does this line fit the data?
• Goodness of fit is measured by (r² × 100)%.
• The correlation coefficient 'r' was 0.983 so Goodness of Fit = (0.983)² × 100 = 96.6% fit.
• This indicates the percentage of the variation in Ice-cream Sales accounted for by the variation in Average monthly temperature.
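
The interpretation of r² as "variation accounted for" can be checked directly from the data. In this sketch (names illustrative), r² is computed twice: as the square of the correlation coefficient, and as 1 - SSE / Syy, where SSE is the sum of squared differences between actual and fitted sales. With unrounded values both give about 96.7%; the 96.6% above comes from squaring r already rounded to 0.983.

```python
temp = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

n = len(temp)
mean_x = sum(temp) / n
mean_y = sum(sales) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, sales))
sxx = sum((x - mean_x) ** 2 for x in temp)
syy = sum((y - mean_y) ** 2 for y in sales)  # total variation in sales

# Route 1: square of Pearson's r
r_squared = sxy ** 2 / (sxx * syy)

# Route 2: proportion of the variation in sales explained by the fitted line
b = sxy / sxx
a = mean_y - b * mean_x
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(temp, sales))
explained = 1 - sse / syy

print(round(r_squared * 100, 1), round(explained * 100, 1))  # 96.7 96.7
```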

Prediction of Sales
• Suppose that the ice-cream manufacturer knows that the estimated average temperature for the following month is 14°C. What would be his expected Sales?
• Substitute 14 for the independent variable, x, and calculate the corresponding value of the Sales, y. This can more easily be produced directly from your calculator: type in 14, find [ŷ].
• Estimated Sales: 45.5 + 5.45 × av. temp.
  45.5 + 5.45 × 14 = 121.8
• Expected sales would be £122,000.
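
The substitution step can be written as a small function. This is a sketch using the rounded coefficients from the regression equation; the function name `predict_sales` is hypothetical.

```python
def predict_sales(av_temp, a=45.5, b=5.45):
    """Estimated sales in £'000 from y = a + bx (illustrative helper)."""
    return a + b * av_temp

estimate = predict_sales(14)
print(round(estimate, 1))  # 121.8, i.e. about £122,000
```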


Further regression modelling
• This lecture has concentrated on the production of a regression model but has not gone on to decide how good this model is.
• At the moment we have only one model but further exploration by residual analysis is essential for comparing models.
• For residual analysis to see if this is a good model see Section 9.8 of Business Statistics for Non-Mathematicians and the computer worksheets.
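
As a first taste of residual analysis, the residuals (actual sales minus fitted sales) can be listed for the ice-cream data. This is only a sketch under the assumption that a residual is defined as y - ŷ; it is not the worksheet method, and the variable names are illustrative.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temp = [4, 4, 7, 8, 12, 15, 16, 17, 14, 11, 7, 5]
sales = [73, 57, 81, 94, 110, 124, 134, 139, 124, 103, 81, 80]

# Least-squares fit of y = a + bx
n = len(temp)
mean_x = sum(temp) / n
mean_y = sum(sales) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, sales))
sxx = sum((x - mean_x) ** 2 for x in temp)
b = sxy / sxx
a = mean_y - b * mean_x

# Residual = actual sales - sales predicted by the line
residuals = [y - (a + b * x) for x, y in zip(temp, sales)]

for month, res in zip(months, residuals):
    print(f"{month}: {res:+.1f}")

# Month with the largest residual in absolute size
worst = max(range(n), key=lambda i: abs(residuals[i]))
print("Largest residual:", months[worst])
```

For a least-squares line the residuals sum to zero, so it is their pattern and size that matter. Here February stands out, with sales about 10 (£'000) below the fitted line.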
• Your computer worksheet enables you to produce two different models and compare their merits. Further regression methods, applicable if your data is not linear, are discussed in Section 9.11.
• All the formulae and methods needed to carry out correlation and regression follow in the Appendices but it is hoped that you see the merits of investing in a calculator that does it all for you!
In this lecture we have concentrated
Next lecture:
Summary

In this Chapter we have looked at


Questions
