Lecture 13 BA

This document covers the fundamentals of linear regression, including its equation, slope formula, and the estimation of values. It explains the relationship between dependent and independent variables, and how regression analysis can be used for prediction and estimation in various managerial contexts. The document also illustrates the application of simple linear regression through a case study involving Butler Trucking Company, detailing the process of estimating travel times based on miles traveled.

Uploaded by

Imam Jan

Business Analytics

Linear Regression

Lecture # 13

TOPICS to be COVERED
01 Linear Regression Equation

02 Slope Formula

03 Estimated Values

04 Standard Error

© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license
distributed with a certain product or service or otherwise on a password-protected website for classroom use.
Types of Probabilistic Models

Probabilistic Models
• Regression Models
• Correlation Models
• Other Models

Regression Models

• Relationship between one dependent variable
and explanatory variable(s)
• Use equation to set up relationship
• Numerical Dependent (Response) Variable
• 1 or More Numerical or Categorical Independent
(Explanatory) Variables
• Used Mainly for Prediction & Estimation

Regression
• Regression: Prediction of one variable from knowledge of one
or more other variables.
• Linear regression aims to fit a straight line to data that for any
value of x gives the best prediction of y.

Example
• Managerial decisions are often based on the relationship between
two or more variables. For example, after considering the
relationship between advertising expenditures and sales, a
marketing manager might attempt to predict sales for a given level
of advertising expenditures.
• In another case, a public utility might use the relationship between
the daily high temperature and the demand for electricity to
predict electricity usage on the basis of next month’s anticipated
daily high temperatures.
• Sometimes a manager will rely on intuition to judge how two
variables are related. However, if data can be obtained, a statistical
procedure called regression analysis can be used to develop an
equation showing how the variables are related.
• In regression terminology, the variable being predicted is called
the dependent variable, or response.
• The variables being used to predict the value of the dependent
variable are called the independent variables, or predictor
variables.

• Simple linear regression: the relationship between
one dependent variable (denoted by y) and one
independent variable (denoted by x) is approximated
by a straight line.
• Multiple linear regression: the relationship
between a dependent variable (y) and two or
more independent variables (x1, x2, …, xq).

What’s Slope?

A slope of 2 means that every 1-unit change in X
yields a 2-unit change in Y.

Model Specification is Based on Theory

• 1. Theory of Field (e.g., Epidemiology)
• 2. Mathematical Theory
• 3. Previous Research
• 4. ‘Common Sense’

Types of Regression Models
• Simple (1 explanatory variable): Linear, Non-Linear
• Multiple (2+ explanatory variables): Linear, Non-Linear

Regression Modeling Steps
1. Hypothesize Deterministic Component
• Estimate Unknown Parameters
2. Specify Probability Distribution of Random
Error Term
• Estimate Standard Deviation of Error
3. Evaluate the fitted Model
4. Use Model for Prediction & Estimation

Best Fit Line, Minimising Sum of Squared Errors
• The line y = mx + c is written in regression notation as ŷ = bx + a, where
– ŷ: predicted value of y
– b: slope of regression line
– a: intercept

[Figure: fitted regression line with legend: ŷ = predicted, y_i = observed, ε = residual]

Residual error (ε): difference between observed and predicted values
of y (i.e., y − ŷ).
Best fit line (values of b and a) is the one that minimises the sum of
squared errors: $SS_{error} = \sum (y - \hat{y})^2$
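
To make the criterion concrete, the sketch below computes the sum of squared errors for two candidate lines over a small invented data set. The function name `sum_squared_errors` and the data are assumptions made for this illustration; the point is only that the least squares line gives a smaller total than another plausible line.

```python
def sum_squared_errors(a, b, xs, ys):
    """Return the sum of (y - y_hat)^2 for the line y_hat = b*x + a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]

# Candidate line 1: a guess that passes exactly through the first three points.
sse_guess = sum_squared_errors(a=1.0, b=2.0, xs=xs, ys=ys)

# Candidate line 2: the least squares line for these data (a = 0.5, b = 2.3).
sse_best = sum_squared_errors(a=0.5, b=2.3, xs=xs, ys=ys)

print(f"SSE for guessed line:       {sse_guess:.2f}")
print(f"SSE for least squares line: {sse_best:.2f}")
```

The guessed line fits three points perfectly but misses the fourth by a full unit; the least squares line spreads smaller errors across all four points and ends up with the lower total.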
• Butler Trucking Company is an independent trucking company
in Southern California.
• A major portion of Butler’s business involves deliveries
throughout its local area.
• To develop better work schedules, the managers want to
estimate the total daily travel times for their drivers.
• The managers believe that the total daily travel times (denoted
by y) are closely related to the number of miles traveled in
making the daily deliveries (denoted by x).
• Using regression analysis, we can develop an equation showing
how the dependent variable y is related to the independent
variable x.
• In the Butler Trucking Company example, a simple linear
regression model hypothesizes that the travel time of a driving
assignment (y) is linearly related to miles traveled (x) as follows:

$y = \beta_0 + \beta_1 x + \varepsilon$ (7.1)

β0 and β1 are population parameters that describe the y-intercept and
slope of the line relating y and x; ε is the error term.
• The values of the population parameters β0 and β1
are not known and must be estimated using sample
data. Sample statistics (denoted b0 and b1) are
computed as estimates of the population parameters
β0 and β1. Substituting the values of the sample
statistics b0 and b1 for β0 and β1 in equation (7.1)
and dropping the error term, we obtain the estimated
regression equation for simple linear regression:

$\hat{y} = b_0 + b_1 x$

Linear Equations
Y = mX + b
m = Slope = (Change in Y) / (Change in X)
b = Y-intercept
[Figure: straight line plotted on X and Y axes]
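
As a small illustration of the slope definition, the sketch below recovers m and b from two points on a line. The helper `slope_and_intercept` and the two points are hypothetical, chosen so that the slope comes out as 2.

```python
def slope_and_intercept(p1, p2):
    """Return (m, b) for the line Y = m*X + b through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("slope is undefined for a vertical line")
    m = (y2 - y1) / (x2 - x1)  # change in Y divided by change in X
    b = y1 - m * x1            # value of Y where the line crosses X = 0
    return m, b


# Two hypothetical points: X rises by 2 while Y rises by 4.
m, b = slope_and_intercept((1, 5), (3, 9))
print(m, b)  # 2.0 3.0
```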

• The graph of the estimated simple linear
regression equation is called the estimated
regression line; b0 is the estimated y-intercept,
and b1 is the estimated slope.

Population & Sample Regression Models
• Population (unknown relationship): $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• Random sample (estimated relationship): $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$
Least Squares Method
• The least squares method is a procedure for using sample data
to find the estimated regression equation.
• To illustrate the least squares method, suppose data were
collected from a sample of 10 Butler Trucking Company driving
assignments.
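
For simple linear regression, the least squares estimates have a standard closed form: b1 = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)² and b0 = ȳ − b1·x̄. The sketch below implements those two formulas directly. The function name `least_squares` and the four data points are assumptions made for illustration; they are not the Butler Trucking sample.

```python
def least_squares(xs, ys):
    """Return (b0, b1) minimising the sum of squared errors for y_hat = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = s_xy / s_xx          # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated y-intercept
    return b0, b1


# Hypothetical (miles, hours) pairs, invented for illustration only.
miles = [50, 60, 80, 100]
hours = [4.5, 5.5, 6.5, 8.5]

b0, b1 = least_squares(miles, hours)
residuals = [y - (b0 + b1 * x) for x, y in zip(miles, hours)]
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

One property worth checking on any least squares fit with an intercept is that the residuals sum to zero (up to rounding), which follows from the formula for b0.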

• In addition, for these data, the relationship between
the travel time and miles traveled appears to be
approximated by a straight line; indeed, a positive
linear relationship is indicated between x and y.
• We therefore choose the simple linear regression
model to represent this relationship.
• Given that choice, our next task is to use the sample
data in Table 7.1 to determine the values of b0 and
b1 in the estimated simple linear regression
equation.
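• For the ith driving assignment, the estimated
regression equation provides

$\hat{y}_i = b_0 + b_1 x_i$

where ŷ_i is the estimated travel time and x_i is the miles traveled for the ith driving assignment.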
• Figure 7.3 is a scatter chart of the data in Table 7.1. Miles
traveled is shown on the horizontal axis, and travel time
(in hours) is shown on
the vertical axis. Scatter charts for regression analysis are
constructed with the independent variable x on the
horizontal axis and the dependent variable y on the
vertical axis.
• The scatter chart enables us to observe the data
graphically and to draw preliminary conclusions about
the possible relationship between the variables.
• The estimated slope is b1 = 0.0678 and the estimated y-intercept is b0 = 1.2739.
• The estimated simple linear regression equation is ŷ = 1.2739 + 0.0678x.
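
The estimated equation can be turned into a small prediction helper. This is a sketch only: the coefficients come from the slide above, while the function name `predict_travel_time` and the 75-mile assignment are hypothetical.

```python
B0 = 1.2739  # estimated y-intercept (hours), from the slide
B1 = 0.0678  # estimated slope (hours per mile), from the slide


def predict_travel_time(miles):
    """Return the estimated travel time in hours for a driving assignment."""
    return B0 + B1 * miles


# Hypothetical assignment of 75 miles.
estimate = predict_travel_time(75)
print(f"Estimated travel time for 75 miles: {estimate:.2f} hours")
```

Predictions are most trustworthy for mileages within the range covered by the sample; outside that range the straight-line relationship is an extrapolation.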

• For the Butler Trucking Company model, we therefore
estimate that, if the length of a driving assignment were
1 mile longer, the mean travel time for that driving
assignment would be 0.0678 hour (or approximately 4
minutes) longer.
• The y-intercept b0 is the estimated value of the
dependent variable y when the independent variable x
is equal to 0.
• For the Butler Trucking Company model, we estimate
that if the driving distance for a driving assignment was
0 units (0 miles), the mean travel time would be 1.2739
units (1.2739 hours, or approximately 76 minutes).
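
The minute figures quoted above follow from converting hours to minutes, which can be checked directly:

```latex
\begin{align*}
b_1 &= 0.0678 \text{ hour} \times 60 \text{ min/hour} = 4.068 \text{ min} \approx 4 \text{ min} \\
b_0 &= 1.2739 \text{ hours} \times 60 \text{ min/hour} = 76.434 \text{ min} \approx 76 \text{ min}
\end{align*}
```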
Thank You !
