Regression Analysis Handouts
Introduction
REGRESSION ANALYSIS is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables.
Benefits:
• It indicates the strength of the impact of multiple independent variables on a dependent variable.
• It indicates whether there are significant relationships between the dependent variable and the independent variables.
These benefits help market researchers, data analysts, and data scientists evaluate candidate variables and eliminate the weak ones, so that the best set of variables is used for building predictive models.
Regression Model
The REGRESSION LINE is a single line that best fits the data.
• The REGRESSION LINE represents the pattern of the data. It also predicts the change in ‘y’ when ‘x’ increases by one unit. That change in y can be either an INCREASE or a DECREASE.
Types of Regression Analysis
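• SIMPLE regression has two variables: one independent and one dependent.
• MULTIPLE regression has two or more independent variables and one dependent variable.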
• LINEAR REGRESSION’s graph is a straight line, and its regression equation is of degree 1 (first degree).
• NON-LINEAR REGRESSION’s graph is not a straight line, and its regression equation is NOT A FIRST-DEGREE equation.
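As a rough illustration of the types described above, the equation forms might look like the following. The coefficient symbols and the quadratic example are illustrative assumptions, not taken from the handout; the quadratic case is shown only as one example of an equation that is not first-degree.

```latex
\begin{align*}
\text{Simple linear:}   \quad & \hat{y} = b_0 + b_1 x \\
\text{Multiple linear:} \quad & \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \\
\text{Not first-degree (e.g. quadratic):} \quad & \hat{y} = b_0 + b_1 x + b_2 x^2
\end{align*}
```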
Simple Linear Regression
• In statistics, it is a linear regression model with a single explanatory variable.
Example:
REGRESSION MODEL EXAMPLE
Number of TV Ads (x)    Number of Houses Sold (y)
1                       2
2                       4
3                       5
4                       4
5                       5
[Figure: scatter plot of the data, with both axes running from 0 to 6]
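• Determine the mean of the independent (x) and dependent (y) variable values: $\bar{x} = 3$ and $\bar{y} = 4$.
• Subtract the mean of x from each x value, and the mean of y from each y value.
• Slope for the Estimated Regression Equation:
$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$
• y-intercept for the Estimated Regression Equation:
$b_0 = \bar{y} - b_1 \bar{x} = 4 - 0.6(3) = 2.2$
• Estimated Regression Equation:
$\hat{y} = 2.2 + 0.6x$
[Figure: plot with both axes running from 0 to 6]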
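The slope and y-intercept for the TV ads example can be checked with a short Python sketch. This is a minimal illustration using only the standard library; the function name `fit_simple_linear` and the variable names are hypothetical and chosen for this example.

```python
def fit_simple_linear(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of the products of the deviations, and sum of squared x deviations
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx              # slope
    b0 = y_mean - b1 * x_mean   # y-intercept
    return b0, b1


# Data from the regression model example
tv_ads = [1, 2, 3, 4, 5]
houses_sold = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_linear(tv_ads, houses_sold)
print(f"y-hat = {b0:.1f} + {b1:.1f}x")  # prints: y-hat = 2.2 + 0.6x
```

Once `b0` and `b1` are known, an estimate for any x is simply `b0 + b1 * x`.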
Coefficient of Determination
• The coefficient of determination, or R-SQUARED, tells us how well a regression line predicts or estimates actual values.
The COEFFICIENT OF DETERMINATION, or R-SQUARED, is the ratio of the explained variation in y to the total variation in y. It can take any value between 0 and 1. The closer the value is to 1, the greater the explanatory power.
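• Using the estimated regression equation that was solved earlier, $\hat{y} = 2.2 + 0.6x$, substitute x with its corresponding values to get each predicted value $\hat{y}$.
• Square the differences $(\hat{y} - \bar{y})$ and $(y - \bar{y})$, then sum each set of squares.
$r^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{3.6}{6} = 0.6$
• If $r^2 = 1$, it is a perfect fit.
• If $r^2$ approaches zero, there is no relationship at all.
• If $r^2$ approaches 0.9, it is a pretty good fit.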
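The R-squared calculation for the TV ads example can be sketched in Python as follows. This is a minimal illustration, assuming the coefficients 2.2 and 0.6 from the worked example; the function name `r_squared` is hypothetical.

```python
def r_squared(xs, ys, b0, b1):
    """Ratio of explained variation in y to total variation in y."""
    y_mean = sum(ys) / len(ys)
    # Predicted values from the estimated regression equation
    predicted = [b0 + b1 * x for x in xs]
    explained = sum((p - y_mean) ** 2 for p in predicted)  # explained variation
    total = sum((y - y_mean) ** 2 for y in ys)             # total variation
    return explained / total


# Data and coefficients from the worked example (assumed here, not recomputed)
tv_ads = [1, 2, 3, 4, 5]
houses_sold = [2, 4, 5, 4, 5]

r2 = r_squared(tv_ads, houses_sold, b0=2.2, b1=0.6)
print(round(r2, 2))  # prints: 0.6
```

A value of 0.6 means that about 60% of the variation in houses sold is explained by the number of TV ads.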
Sample Exercises.
Show your solutions for each number and box your final answers.
1. The sales of a company (in million pesos) for each year are shown in the table below:
Year Sales
2007 9
2008 20
2009 39
2010 47
2011 54
a. Find the slope and the y-intercept of the estimated regression line.
b. Use the estimated regression equation to estimate the sales of the company in 2015.
2. The values of x and their corresponding values of y are shown in the table below:
x 0 1 2 3 4
y 3 4 5 7 6