Lecture 12 Simple Linear Regression Analysis

This document discusses simple linear regression and correlation analysis. Simple linear regression finds the straight-line equation that best fits the relationship between an independent variable x and a dependent variable y. Correlation analysis measures the strength of the linear association between x and y using Pearson's correlation coefficient r. The coefficient of determination r² indicates the proportion of variation in y that is explained by x.

Uploaded by

Brian Zvekare

Simple Linear Regression and Correlation Analysis


Lecture objectives
• explain the meaning of regression analysis
• identify practical examples where regression analysis
can be used
• construct a simple linear regression model
• use the regression line for prediction purposes
• calculate and interpret the correlation coefficient
• calculate and interpret the coefficient of determination
• conduct a hypothesis test on the regression model to
test for significance
Introduction
• In management, many numeric measures are related
(either strongly or loosely) to one another. For example:
• advertising expenditure is assumed to influence sales
volumes
• a company’s share price is likely to be influenced by its
return on investment
• Regression analysis and correlation analysis are two
statistical methods that, respectively, quantify the
relationship between such variables and measure the
strength of that relationship.
Introduction
• The relationship between any pair of variables – labelled x and y – can be
examined graphically by producing a scatter plot of their data values
Introduction
• The scatter plot illustrates the idea behind regression and
correlation analysis. Each scatter point represents a pair of
data values from the two random variables, x and y. The
pattern of the scatter points indicates the nature of the
relationship, which is represented by the straight line,
calculated by regression analysis.
• The degree of closeness of the scatter points to the straight
line is a measure of the strength of the relationship and is
described by correlation analysis.
• To perform regression and correlation analysis, the data
for both variables must be numeric.
Simple Linear Regression Analysis
• Simple linear regression analysis finds a straight-line
equation that represents the relationship between the
values of two numeric variables
• One variable is called the independent or predictor
variable, x, and the other is called the dependent or
response variable, y.
• The x-variable influences the outcome of the y-variable. Its
values are usually known or easily determined. The
dependent variable, y, is influenced by (or responds to) the
independent variable, x. Values for the dependent variable
are estimated from values of the independent variable.
Simple Linear Regression Analysis
• In simple linear regression, there is only one
independent variable, x, that is used to
estimate or predict values of the dependent
variable, y.
example
solution
• Step 1: Identify the dependent and independent
variables
• An essential first step is to correctly identify the
independent and dependent variables. A useful rule of
thumb is to ask the following question: ‘Which
variable is to be estimated?’
• The answer will identify the dependent variable, y. In
the example:
• x = the number of advertisements placed weekly
• y = the number of flat-screen TVs sold in the week.
solution
• Step 2: Construct a scatter plot of y against x
Step 3: Calculate the linear regression equation
• Regression analysis finds the equation of the straight line that
best fits the scatter points.
• A straight-line graph is defined as follows:
$\hat{y} = b_0 + b_1 x$
• Where: x = values of the independent variable
• $\hat{y}$ = estimated values of the dependent variable
• $b_0$ = y-intercept coefficient (where the regression line
cuts the y-axis)
• $b_1$ = slope (gradient) coefficient of the regression line
Where….
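The coefficient formulas on this slide did not survive extraction. As a hedged reconstruction, the standard least-squares expressions for the slope and intercept, written in the same summation notation as the correlation formula (n paired data points, sums taken over the sample), are:

```latex
b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2},
\qquad
b_0 = \frac{\sum y - b_1 \sum x}{n}
```

These are the usual textbook forms; the original slide may have presented them with different layout or symbols.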
Therefore the equation
• The simple linear regression equation to estimate
flat-screen TV sales is given by:
• $\hat{y} = 12.817 + 4.368x$ for $2 \le x \le 5$
• The interval of x-values (i.e. $2 \le x \le 5$) is called
the domain of x. It represents the set of x-values
that were used to construct the regression line.
Thus, to produce valid estimates of y, only values
of x from within the domain should be substituted
into the regression equation.
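As a minimal sketch of how such an equation and its domain could be obtained, the function below applies the standard least-squares formulas to paired data. The function name `fit_line` and the four data points are hypothetical illustrations; they are not the lecture's data set, which is not reproduced in these notes.

```python
def fit_line(xs, ys):
    """Return (b0, b1, domain) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x-values are identical; the slope is undefined")
    b1 = (n * sum_xy - sum_x * sum_y) / denom   # slope
    b0 = (sum_y - b1 * sum_x) / n               # y-intercept
    domain = (min(xs), max(xs))                 # x-values used to build the line
    return b0, b1, domain


# Hypothetical data (assumption): weekly advertisements and units sold.
ads = [1, 2, 3, 4]
sales = [2, 4, 5, 7]
b0, b1, domain = fit_line(ads, sales)
print(f"y-hat = {b0:.3f} + {b1:.3f}x for {domain[0]} <= x <= {domain[1]}")
```

Returning the domain alongside the coefficients keeps the valid range of x attached to the equation it belongs to.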
Step 4: Estimate y-values using the regression equation
• For x = 3 advertisements:
• $\hat{y} = 12.817 + 4.368(3)$
• = 12.817 + 13.104
• = 25.921
• ≈ 26 (rounded)
• The management of Music Technologies can
therefore expect to sell, on average, 26
flat-screen TVs in a week when three
newspaper advertisements are placed.
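The estimate can be checked with a short sketch that uses the coefficients and domain quoted in the lecture (12.817, 4.368, and 2 to 5 advertisements). The function name `estimate_sales` is a hypothetical choice, and rejecting out-of-domain inputs with an error is one possible way to enforce the domain rule, not something the lecture prescribes.

```python
B0, B1 = 12.817, 4.368   # intercept and slope from the lecture's equation
X_MIN, X_MAX = 2, 5      # domain of x from the lecture


def estimate_sales(ads):
    """Estimate weekly flat-screen TV sales for a given number of advertisements."""
    if not X_MIN <= ads <= X_MAX:
        # Outside the domain the line was never fitted, so refuse to extrapolate.
        raise ValueError(f"x = {ads} lies outside the domain {X_MIN} to {X_MAX}")
    return B0 + B1 * ads


estimate = estimate_sales(3)
print(round(estimate, 3), round(estimate))
```

Calling `estimate_sales(6)` raises a `ValueError`, because six advertisements fall outside the range of x-values used to construct the line.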
Correlation Analysis
• The reliability of the estimate of y depends on
the strength of the relationship between the x
and y variables. A strong relationship implies a
more accurate and reliable estimate of y.
• Correlation analysis measures the strength of
the linear association between two numeric
(ratio-scaled) variables, x and y.
• This measure is called Pearson’s correlation
coefficient. It is represented by the symbol r when
it is calculated from sample data.
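• The following formula is used to calculate the
sample correlation coefficient:
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$
• Where: r = the sample correlation coefficient
• x = the values of the independent variable
• y = the values of the dependent variable
• n = the number of paired data points in the sample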
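A minimal sketch of the calculation, using the standard computational form of Pearson's correlation coefficient built from the sums of x, y, xy, x² and y². The function name `pearson_r` and the data are hypothetical; the lecture's own data set is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient for paired numeric data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


# Hypothetical data (assumption), not the lecture's example.
r = pearson_r([1, 2, 3, 4], [2, 4, 5, 7])
print(f"r = {r:.4f}")
```

For these illustrative values r is close to +1, which would indicate a strong positive linear association.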
From last example r =
Interpretation of the Correlation Coefficient
The r² Coefficient
• When the sample correlation coefficient, r, is
squared (r²), the resultant measure is called the
coefficient of determination.
• The coefficient of determination measures the
proportion (or percentage) of variation in the
dependent variable, y, that is explained by the
independent variable, x. The coefficient of
determination ranges between 0 and 1 (or 0%
and 100%).
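To make "proportion of variation explained" concrete, the sketch below computes r² as 1 minus the ratio of unexplained variation (squared residuals around the fitted line) to total variation (squared deviations of y from its mean). For simple linear regression this equals the square of Pearson's r. The function name and data are hypothetical assumptions, not taken from the lecture.

```python
def coefficient_of_determination(xs, ys):
    """r-squared for the least-squares line fitted to paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)   # total variation in y
    if sxx == 0 or sst == 0:
        raise ValueError("r-squared is undefined when x or y has no variation")
    b1 = sxy / sxx                  # slope of the fitted line
    b0 = mean_y - b1 * mean_x       # intercept of the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    return 1 - sse / sst


# Hypothetical data (assumption), not the lecture's example.
r2 = coefficient_of_determination([1, 2, 3, 4], [2, 4, 5, 7])
print(f"r-squared = {r2:.4f} ({r2:.1%} of the variation in y explained by x)")
```

With these illustrative values, roughly 98% of the variation in y is explained by x.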
The r² Coefficient
• r² is an important indicator of the usefulness of
the regression equation because it measures
how strongly x and y are associated.
• The closer r² is to 1 (or 100%), the stronger
the association between x and y. Alternatively,
the closer r² is to 0, the weaker the association
between x and y.
The r² Coefficient
• r² = 0: There is no association between x and y.
• r² = 1: There is perfect association between x and y.
This occurs when r = +1 or r = −1; in both cases,
y is completely (100%) explained by x.
• 0 < r² < 1: The strength of association depends on how
close r² lies to either 0 or 1.
• When r² lies closer to 0 (or 0%), it indicates a weak
association between x and y. When r² lies closer to 1
(or 100%), it indicates a strong association between x
and y.
