Module 10. Simple Linear Regression Regression Analysis

This document discusses simple linear regression analysis. It defines regression analysis and simple linear regression as involving one independent and one dependent variable with the relationship estimated by a straight line. It provides the simple linear regression formula and formulas to calculate the slope, y-intercept, coefficient of correlation, and coefficient of determination. Two examples are worked through demonstrating how to apply the formulas to data sets and interpret the results.

Uploaded by

Gaile Yabut
Copyright
© © All Rights Reserved

Business Statistics: Module 10.

Simple Linear Regression Page 1 of 6

Module 10. Simple Linear Regression

Regression analysis

• A parametric tool used to describe the linear relationship between the independent
and dependent variables.

• Develops a model to predict the values of the dependent variable based on the
values of the independent variables.

Simple linear regression (SLR)

• The simplest type of regression analysis, involving one independent variable and
one dependent variable, in which the relationship between the two variables is
estimated by a straight line.

• The SLR formula is as follows:

$$Y = a + bX \quad \text{or} \quad Y = b_0 + b_1 X$$

where
Y = dependent variable
X = independent variable
a or b0 = y-intercept of the regression line, or the value of Y when X = 0
b or b1 = slope, or the change in Y for every one-unit change in X

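• To develop the linear regression model, you need to solve for the slope and the
y-intercept first. Compute the slope before the y-intercept, because the intercept
formula uses it. The formulas are as follows, where n = number of samples (paired
observations):

$$b = b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

$$a = b_0 = \frac{\sum y - b\sum x}{n}$$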
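The slope and y-intercept formulas can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the four data points are assumptions made for the example and are not part of the module.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) using the sum-based formulas for b and a."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical, so the slope is undefined")

    slope = (n * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_y - slope * sum_x) / n   # uses the slope, so it comes second
    return slope, intercept


# Hypothetical points that lie exactly on Y = 1 + 2X, used as a sanity check.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

Because the sample points sit exactly on a straight line, the function should recover a slope of 2 and an intercept of 1.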

• In addition, the coefficient of correlation and the coefficient of determination are
used in regression analysis to measure the strength of the relationship between the
independent and dependent variables. These two measures were discussed in Module 9.

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

Coefficient of determination = r^2
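A minimal sketch of the same two measures in Python follows. The function name `correlation` and the five data points are hypothetical, chosen only to show that the denominator is the square root of the product of the two bracketed terms.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation r from the sum-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    # Square root of the product of the X term and the Y term.
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical data, for illustration only.
r = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = r ** 2   # coefficient of determination
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```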

For problem illustrations, let us use the same two problems in module 9.

Problem illustrations

1. A group of independent researchers investigated the relationship between the calories
and fat of coffee drinks at two well-known coffee shops. The table below shows the
results of the experiment.
a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the fat of a coffee drink with 300 calories.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Products | Calories (X) | Fat (Y) | X*Y | X^2 | Y^2 |
|---|---|---|---|---|---|
| K iced mocha swirl latte | 240 | 8.0 | 1920 | 57600 | 64 |
| Z coffee Frappuccino blended coffee | 260 | 3.5 | 910 | 67600 | 12.25 |
| K coffee coolatta | 350 | 22.0 | 7700 | 122500 | 484 |
| Z coffee iced coffee mocha espresso | 350 | 20.0 | 7000 | 122500 | 400 |
| K mocha Frappuccino blended coffee | 420 | 16.0 | 6720 | 176400 | 256 |
| Z chocolate brownie Frappuccino blended coffee | 510 | 22.0 | 11220 | 260100 | 484 |
| Z chocolate Frappuccino blended crème | 530 | 19.0 | 10070 | 280900 | 361 |
| Total | 2660 | 110.5 | 45540 | 1087600 | 2061.25 |

In this module, we will set aside hypotheses, levels of significance, critical
values, decision rules, conclusions, and recommendations for the time being.

We will proceed with the computation. Initially, the table shown above had only three
columns (Products, X, and Y); we added three more columns (X*Y, X^2, and Y^2) to
complete our data matrix. For column X*Y, multiply each value of X by the value of Y in
the same row. For column X^2, multiply each value of X by itself. For column Y^2,
multiply each value of Y by itself. Lastly, get the total of each column.
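The column-building steps can be sketched in Python as follows. The X and Y values are taken from the table; the variable names are assumptions made for the example.

```python
# Calories (X) and fat (Y) of the seven coffee drinks, in table order.
calories = [240, 260, 350, 350, 420, 510, 530]
fat = [8.0, 3.5, 22.0, 20.0, 16.0, 22.0, 19.0]

# The three added columns, computed row by row.
xy = [x * y for x, y in zip(calories, fat)]
x_squared = [x ** 2 for x in calories]
y_squared = [y ** 2 for y in fat]

# Column totals, which feed the slope, intercept, and r formulas.
totals = {
    "X": sum(calories),
    "Y": sum(fat),
    "X*Y": sum(xy),
    "X^2": sum(x_squared),
    "Y^2": sum(y_squared),
}

for name, value in totals.items():
    print(f"{name:>4}: {value}")
```

The totals should match the last row of the table: 2660, 110.5, 45540, 1087600, and 2061.25.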

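a. Slope (b or b1):

$$b = \frac{7(45540) - (2660)(110.5)}{7(1087600) - (2660)^2} = \frac{24850}{537600} \approx 0.0462$$

This is 0.05 when rounded to two decimal places. Because X is in the hundreds, we keep
four decimal places in the steps that follow so that rounding does not distort the
intercept and the prediction.

b. Y-intercept (a or b0):

$$a = \frac{110.5 - 0.0462(2660)}{7} \approx -1.77$$

c. Linear regression model: Y = -1.77 + 0.0462X (since it is a model, we retain the X)

d. Y = -1.77 + 0.0462X
= -1.77 + 0.0462(300) = 12.09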
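As a check on the arithmetic in parts a to d, the sketch below recomputes the slope, the y-intercept, and the prediction at X = 300 from the column totals. It keeps full precision until the final printout, so its output can differ from a hand calculation that rounds the slope before reusing it. The variable names are assumptions made for the example.

```python
# Column totals from the coffee drinks table.
n = 7
sum_x, sum_y = 2660, 110.5
sum_xy, sum_x2 = 45540, 1087600

# Slope first, then the intercept, which depends on the slope.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Predicted fat (Y) for a drink with 300 calories (X).
prediction = intercept + slope * 300

print(f"slope      = {slope:.4f}")
print(f"intercept  = {intercept:.2f}")
print(f"prediction = {prediction:.2f}")
```

Since X runs from 240 to 530 here, a difference of 0.004 in the slope shifts the prediction by more than one unit of Y, which is why the slope is worth carrying to several decimal places.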

e. Coefficient of correlation:

$$r = \frac{7(45540) - (2660)(110.5)}{\sqrt{\left[7(1087600) - (2660)^2\right]\left[7(2061.25) - (110.5)^2\right]}} = \frac{24850}{\sqrt{(537600)(2218.5)}} \approx 0.7196$$

The r-value of 0.7196 reflects a high positive relationship between the calories and
fat of the coffee drinks.

f. r^2 = 0.7196^2 = 0.5178 or 51.78%, which means 51.78% of the variation in the fat of
the coffee drinks can be explained by the variation in the calories of the coffee
drinks, and the remaining 48.22% is due to unexplained factors or factors not included
in this study.
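The two coefficients for the coffee data can be recomputed from the column totals with a short Python sketch. The variable names are assumptions made for the example.

```python
import math

# Column totals from the coffee drinks table.
n = 7
sum_x, sum_y = 2660, 110.5
sum_xy, sum_x2, sum_y2 = 45540, 1087600, 2061.25

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

r = numerator / denominator
r_squared = r ** 2
unexplained = 1 - r_squared   # share of variation not explained by calories

print(f"r = {r:.4f}")
print(f"r^2 = {r_squared:.4f} ({r_squared:.2%} explained, {unexplained:.2%} unexplained)")
```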

2. A store manager wants to determine the relationship between the number of
weekend television commercials shown and the sales of stereo and sound
equipment at the store. The table below shows the gathered data.
a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the sales for 8 weekend TV commercials.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

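| Week | Number of weekend TV commercials (X) | Sales (Y) ($100s) | X*Y | X^2 | Y^2 |
|---|---|---|---|---|---|
| 1 | 3 | 52 | 156 | 9 | 2704 |
| 2 | 6 | 58 | 348 | 36 | 3364 |
| 3 | 2 | 43 | 86 | 4 | 1849 |
| 4 | 4 | 55 | 220 | 16 | 3025 |
| 5 | 5 | 56 | 280 | 25 | 3136 |
| 6 | 2 | 40 | 80 | 4 | 1600 |
| 7 | 5 | 64 | 320 | 25 | 4096 |
| 8 | 3 | 49 | 147 | 9 | 2401 |
| 9 | 4 | 60 | 240 | 16 | 3600 |
| 10 | 1 | 39 | 39 | 1 | 1521 |
| Total | 35 | 516 | 1916 | 145 | 27296 |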

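a. Slope (b or b1):

$$b = \frac{10(1916) - (35)(516)}{10(145) - (35)^2} = \frac{1100}{225} \approx 4.89$$

b. Y-intercept (a or b0):

$$a = \frac{516 - 4.89(35)}{10} \approx 34.49$$

c. Linear regression model: Y = 34.49 + 4.89X

d. Y = 34.49 + 4.89X
= 34.49 + 4.89(8) = 73.61

Since sales are recorded in hundreds of dollars, the predicted sales for 8 weekend TV
commercials are about 7,361 dollars.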
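The same slope and intercept can also be obtained from deviations about the means, which is algebraically equivalent to the sum-based formulas used in this module. The sketch below applies that form to the commercials data. The variable names are assumptions made for the example, and the output may differ from the hand calculation in the second decimal place because no intermediate rounding is done.

```python
# Weekend TV commercials (X) and sales in hundreds of dollars (Y), weeks 1 to 10.
commercials = [3, 6, 2, 4, 5, 2, 5, 3, 4, 1]
sales = [52, 58, 43, 55, 56, 40, 64, 49, 60, 39]

n = len(commercials)
mean_x = sum(commercials) / n
mean_y = sum(sales) / n

# Sums of cross-deviations and squared deviations about the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(commercials, sales))
s_xx = sum((x - mean_x) ** 2 for x in commercials)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x   # the fitted line passes through the means

predicted_sales = intercept + slope * 8

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted sales for 8 commercials = {predicted_sales:.2f} (in hundreds of dollars)")
```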

e. Coefficient of correlation:

$$r = \frac{10(1916) - (35)(516)}{\sqrt{\left[10(145) - (35)^2\right]\left[10(27296) - (516)^2\right]}} = \frac{1100}{\sqrt{(225)(6704)}} \approx 0.8956$$

There is a high positive correlation between the number of weekend television
commercials and the sales of stereo and sound equipment.

f. r^2 = 0.8956^2 = 0.8021 or 80.21%, which means 80.21% of the variation in the sales of
stereo and sound equipment of the store can be explained by the changes in the
number of weekend television commercials, and the remaining 19.79% is caused by other
unexplained factors.
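To make the "explained" and "unexplained" shares concrete, the sketch below fits the line to the commercials data and compares the variation left over around the fitted line with the total variation in sales. For a least-squares line with an intercept, one minus that ratio equals r^2. The variable names are assumptions made for the example, and the result agrees with the hand-computed value to about three decimal places because no intermediate rounding is done.

```python
# Weekend TV commercials (X) and sales in hundreds of dollars (Y), weeks 1 to 10.
commercials = [3, 6, 2, 4, 5, 2, 5, 3, 4, 1]
sales = [52, 58, 43, 55, 56, 40, 64, 49, 60, 39]

n = len(commercials)
mean_x = sum(commercials) / n
mean_y = sum(sales) / n

# Least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(commercials, sales)) / sum(
    (x - mean_x) ** 2 for x in commercials
)
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * x for x in commercials]

# Total variation in Y, and the part the line leaves unexplained.
total_variation = sum((y - mean_y) ** 2 for y in sales)
unexplained_variation = sum((y - f) ** 2 for y, f in zip(sales, fitted))

r_squared = 1 - unexplained_variation / total_variation

print(f"total variation       = {total_variation:.1f}")
print(f"unexplained variation = {unexplained_variation:.1f}")
print(f"r^2 = {r_squared:.4f}")
```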

End of Module Exercises

Read and analyze the problems carefully.

1. A production manager has compared the dexterity test scores of seven assembly
line employees with their hourly productivity. The table below shows the result.
a. Determine the slope value.
b. Determine the y-intercept value
c. Develop the linear regression equation or model
d. Predict the hourly productivity for a dexterity score of 20
e. Describe the coefficient of correlation
f. Describe the coefficient of determination

| Employee | Dexterity test score (X) | Hourly productivity (Y) |
|---|---|---|
| A | 13 | 56 |
| B | 15 | 64 |
| C | 18 | 68 |
| D | 17 | 71 |
| E | 12 | 52 |
| F | 14 | 62 |
| G | 16 | 64 |
| Total | | |

2. It has been reported that the average American male consumes 3774 calories per
day and that 72.2% of American males are overweight. This information, along with
data for seven other countries, is shown below.
a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the % overweight for an American male who consumes 4000 calories per
day.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Country | Calories per day (X) | % overweight (Y) |
|---|---|---|
| A | 2214 | 10.6 |
| B | 2975 | 27.7 |
| C | 3458 | 40.4 |
| D | 3523 | 64.7 |
| E | 2560 | 62.8 |
| F | 3257 | 15.3 |
| G | 3885 | 70.0 |
| Total | | |

3. In a certain company, employees who have stayed with the firm for more than five
years are given the opportunity to own stock shares at a discounted price as a form of
reward. The personnel manager wants to know whether an employee's number of years
in service with the firm influences the number of stock shares owned.
a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the number of stock shares owned by an employee with 20 years in service
with the firm.
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.

| Employee | No. of years (X) | No. of stock shares (Y) |
|---|---|---|
| A | 7 | 315 |
| B | 13 | 418 |
| C | 15 | 570 |
| D | 7 | 273 |
| E | 10 | 300 |
| F | 14 | 665 |
| G | 16 | 660 |
| H | 10 | 310 |
| Total | | |

4. The Insurance Institute for Highway Safety has listed the following ratings based on
collision and comprehensive claims for nine makes of midsize four-door cars from
2014-2016 model years. Higher numbers reflect higher claims in the collision and
comprehensive categories of coverage.
a. Determine the slope value.
b. Determine the y-intercept value
c. Develop the linear regression equation or model
d. Predict the comprehensive claim rating for a collision rating of 130
e. Describe the coefficient of correlation
f. Describe the coefficient of determination
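
| Claims | Collision claim rating (X) | Comprehensive claim rating (Y) |
|---|---|---|
| 1 | 113 | 89 |
| 2 | 108 | 91 |
| 3 | 124 | 92 |
| 4 | 131 | 108 |
| 5 | 128 | 108 |
| 6 | 90 | 74 |
| 7 | 99 | 79 |
| 8 | 106 | 86 |
| 9 | 116 | 98 |
| Total | | |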

5. A mail-order catalog business that sells personal computer supplies, software, and
hardware maintains a centralized warehouse for the distribution of products ordered.
Management is currently examining the process of distribution from the warehouse and
is interested in studying the factors that affect warehouse distribution costs.
Currently, a small handling fee is added to the order, regardless of the amount. The
table below shows the data (distribution costs in thousands of dollars and number of
orders) collected for the past 12 months.
a. Determine the slope value.
b. Determine the y-intercept value.
c. Develop the linear regression equation or model.
d. Predict the number of orders for a distribution cost of 90.25 (in thousands of dollars).
e. Describe the coefficient of correlation.
f. Describe the coefficient of determination.
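
| Month | Distribution costs, in thousands of dollars (X) | Number of orders (Y) |
|---|---|---|
| 1 | 52.95 | 4015 |
| 2 | 71.66 | 3806 |
| 3 | 85.58 | 5309 |
| 4 | 63.69 | 4262 |
| 5 | 72.81 | 4296 |
| 6 | 68.44 | 4097 |
| 7 | 52.46 | 3213 |
| 8 | 70.77 | 4809 |
| 9 | 82.03 | 5237 |
| 10 | 74.39 | 4732 |
| 11 | 70.84 | 4413 |
| 12 | 54.08 | 2921 |
| Total | | |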

