
CHAPTER 5 – Linear Regression

Managerial decisions
- are often based on the relationship between variables.
Regression analysis
- is a statistical procedure that uses data to develop an equation showing how the variables are related.
o Dependent (or response/target) variable
- is the variable being predicted.
o Independent (or predictor) variables (or features)
- are variables used to predict the value of the dependent variable.
o Simple linear regression
- is a form of regression analysis in which a ("simple") single independent variable,
x, is used to develop a "linear" relationship (straight line) with the dependent
variable, y.
 simple linear regression model
- is an equation that describes how the dependent variable y is related to the
independent variable x and the error term ε:
- y = β₀ + β₁x + ε
- Where:
 y is the dependent variable.
 x is the independent variable.
 β₀ and β₁ are referred to as the population parameters.
 ε is the error term. It accounts for the variability in y that cannot be
explained by the linear relationship between x and y.


 estimated simple linear regression equation
- ŷ = b₀ + b₁x
- Where:
 ŷ is the point estimator of E(y|x), the mean of y for a given x.
 b₀ is the point estimator of β₀ and the y-intercept of the regression.
The y-intercept b₀ is the estimated value of the dependent
variable y when the independent variable x is equal to 0.
 b₁ is the point estimator of β₁ and the slope of the regression.
The slope b₁ is the estimated change in the value of the
dependent variable y that is associated with a one-unit
increase in the independent variable x.


 least squares method
- is a procedure for using sample data to find the estimated
linear regression equation (see notes for the b₀ and b₁
equations).
- min ∑eᵢ² = min ∑(yᵢ − ŷᵢ)² = min ∑(yᵢ − b₀ − b₁xᵢ)²
- Where:
 eᵢ = yᵢ − ŷᵢ is referred to as the ith residual: the
error made in estimating the value of the
dependent variable for the ith observation.
 xᵢ and yᵢ are the values of the independent and
dependent variables for the ith observation.
 ŷᵢ is the predicted value of the dependent variable
for the ith observation.
 n is the total number of observations.
- Differential calculus can be used to show that the values of
b₀ and b₁ that minimize the least squares criterion are
given by

b₁ = ∑(xᵢ − x̄)(yᵢ − ȳ) / ∑(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
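
To make these formulas concrete, here is a minimal Python sketch (no external libraries; the function name least_squares_fit is illustrative, not from the notes) that computes b₀ and b₁ directly from the expressions above:

```python
def least_squares_fit(x, y):
    """Closed-form least squares estimates for simple linear regression:
    b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2),  b0 = y_bar - b1*x_bar.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    return b0, b1
```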

The Butler Trucking Company Example


The managers believe that the total daily travel times (denoted by y) are closely related
to the number of miles traveled in making the daily deliveries (denoted by x).

o Multiple linear regression


- is a more general form of regression analysis involving two or more independent
variables.

Regression Equation for Butler Trucking

To compute the estimates manually, we use ŷ = b₀ + b₁x and the following table:
Assignment   Miles (x)   Time (y)   xy     x²      ŷᵢ = b₀ + b₁xᵢ   eᵢ = yᵢ − ŷᵢ   eᵢ² = (yᵢ − ŷᵢ)²
1            100         9.3        930    10000   8.0539            1.2461         1.5528
2            50          4.8        240    2500    4.6639            0.1361         0.0185
3            100         8.9        890    10000   8.0539            0.8461         0.7159
4            100         6.5        650    10000   8.0539           -1.5539         2.4146
5            50          4.2        210    2500    4.6639           -0.4639         0.2152
6            80          6.2        496    6400    6.6979           -0.4979         0.2479
7            75          7.4        555    5625    6.3589            1.0411         1.0839
8            65          6.0        390    4225    5.6809            0.3191         0.1018
9            90          7.6        684    8100    7.3759            0.2241         0.0502
10           90          6.1        549    8100    7.3759           -1.2759         1.6279
Totals       ∑x = 800    ∑y = 67    ∑xy = 5594    ∑x² = 67450    ∑ŷ = 66.9790    ∑e = 0.0210    ∑e² = 8.0287
             x̄ = 80      ȳ = 6.7

b₁ = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
b₁ = (10(5594) − (800)(67)) / (10(67450) − (800)²)
b₁ = (55940 − 53600) / (674500 − 640000)
b₁ = 2340 / 34500
b₁ = 0.0678

b₀ = ȳ − b₁x̄
b₀ = 6.7 − (0.0678)(80)
b₀ = 6.7 − 5.424
b₀ = 1.276
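
The hand computation can be checked with a short Python sketch (a rough verification, using the Miles and Time columns from the table as x and y):

```python
miles = [100, 50, 100, 100, 50, 80, 75, 65, 90, 90]          # x
times = [9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1]   # y

n = len(miles)
sum_x = sum(miles)                                  # 800
sum_y = sum(times)                                  # 67.0
sum_xy = sum(x * y for x, y in zip(miles, times))   # 5594.0
sum_x2 = sum(x ** 2 for x in miles)                 # 67450

# Computational form of the least squares slope and intercept.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # 2340/34500 ≈ 0.0678
b0 = sum_y / n - b1 * (sum_x / n)                              # ≈ 1.2739
print(f"y-hat = {b0:.4f} + {b1:.4f} x")
```

Note that carrying the unrounded slope gives b₀ ≈ 1.2739, which matches the fitted values ŷᵢ in the table; rounding b₁ to 0.0678 before computing the intercept, as in the hand computation, gives b₀ = 1.276.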

Experimental Region and Extrapolation

Experimental region
- is defined as the range of values of the independent variables in the data used to
estimate the model.
Extrapolation
- is the prediction of the value of the dependent variable outside the experimental
region; it is risky and should be avoided unless we have empirical evidence
dictating otherwise.
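
As a rough sketch (the helper predict is illustrative, not from the notes), a prediction function might refuse to extrapolate outside the experimental region:

```python
def predict(x_new, b0, b1, x_data):
    """Predict y for x_new, refusing to extrapolate outside the experimental region."""
    lo, hi = min(x_data), max(x_data)
    if not (lo <= x_new <= hi):
        raise ValueError(
            f"x = {x_new} lies outside the experimental region [{lo}, {hi}]; "
            "extrapolation is risky and should be avoided."
        )
    return b0 + b1 * x_new
```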
The Sums of Squares
sum of squares due to error (SSE)

- is a measure of the error that results from using the ŷᵢ values to predict the yᵢ
values.
- SSE = ∑(yᵢ − ŷᵢ)²

total sum of squares (SST)

- is a measure of the error that results from using the sample mean ȳ to predict
the yᵢ values.
- SST = ∑(yᵢ − ȳ)²

sum of squares due to regression (SSR)

- is a measure of how much the ŷᵢ values deviate from the sample mean ȳ.
- SSR = ∑(ŷᵢ − ȳ)²

The relationship between these three sums of squares is SST = SSR + SSE.
Coefficient of Determination
The ratio SSR/SST is called the coefficient of determination, denoted by r².
r² = SSR/SST
The coefficient of determination can only assume values between 0 and 1 and is used to evaluate the
goodness of fit for the estimated regression equation.

A perfect fit exists when yᵢ is identical to ŷᵢ for every observation i, so that all residuals yᵢ − ŷᵢ = 0.

• In such a case, SSE = 0, SSR = SST, and r² = SSR/SST = 1.

Poorer fits between yᵢ and ŷᵢ result in larger values of SSE and lower r² values.

• The poorest fit happens when SSE = SST, SSR = 0, and r² = 0.
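
Continuing the Python sketch from the Butler Trucking example (assuming b0, b1, miles, and times are still defined), the three sums of squares and r² could be computed as follows; the numerical comments are computed from the table data, not stated in the notes:

```python
y_hat = [b0 + b1 * x for x in miles]    # predicted values yhat_i
y_bar = sum(times) / len(times)         # sample mean of y (6.7)

sse = sum((y - yh) ** 2 for y, yh in zip(times, y_hat))  # ≈ 8.03
sst = sum((y - y_bar) ** 2 for y in times)               # 23.9
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # ≈ 15.87

r_squared = ssr / sst   # ≈ 0.66: about 66% of the variability in travel
                        # time is explained by miles traveled
```

Because b0 and b1 here are the least squares estimates, SST = SSR + SSE holds (up to rounding).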

The Estimation Process in Multiple Regression

estimated multiple linear regression equation
ŷ = b₀ + b₁x₁ + b₂x₂ + … + b_q x_q

Where:
ŷ is a point estimate of E(y) for a given set of q independent variables, x₁, x₂, …, x_q.

A simple random sample is used to compute the sample statistics b₀, b₁, b₂, …, b_q that are used as estimates of
β₀, β₁, β₂, …, β_q.

Least Squares Method and Multiple Regression

The least squares method uses the sample data to provide the values of the sample statistics b₀, b₁, b₂, …, b_q
that minimize the sum of squared errors between the yᵢ and the ŷᵢ:

min ∑(yᵢ − ŷᵢ)² = min ∑(yᵢ − (b₀ + b₁x₁ + b₂x₂ + … + b_q x_q))² = min ∑eᵢ²

Where:
yᵢ is the value of the dependent variable for the ith observation.
ŷᵢ is the predicted value of the dependent variable for the ith observation.

Because the formulas for the regression coefficients involve the use of matrix algebra, we rely on computer
software packages to perform the calculations.
The emphasis will be on how to interpret the computer output rather than on how to make the multiple
regression computations.
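
As a minimal numpy sketch of letting software do the matrix algebra (the second independent variable here is hypothetical, added only for illustration, and the variable names are not from the notes):

```python
import numpy as np

# Illustrative data: x1 = miles, plus a hypothetical second variable x2
# (e.g., number of deliveries); y = travel time.
X = np.array([[100, 4], [50, 3], [100, 4], [100, 2], [50, 2],
              [80, 2], [75, 3], [65, 4], [90, 3], [90, 2]], dtype=float)
y = np.array([9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1])

# Prepend a column of ones so the first coefficient is the intercept b0.
X_design = np.column_stack([np.ones(len(X)), X])

# np.linalg.lstsq solves min ||X_design @ b - y||^2 via matrix algebra.
coeffs, _, _, _ = np.linalg.lstsq(X_design, y, rcond=None)
b0, b1, b2 = coeffs
print(f"y-hat = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2")
```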
