05 Linear Regression 2

The document discusses multiple linear regression analysis. It covers estimating multiple regression models using least squares methods and Excel's regression tool. It also discusses checking regression assumptions by examining residual plots, hypothesis testing of regression parameters using t-tests, and determining statistical significance.

MM3425 BUSINESS ANALYTICS

LINEAR REGRESSION PART II

MULTIPLE REGRESSION MODEL

■ y = dependent variable
■ x1, x2,…,xq = independent variables
■ β0, β1,…, βq = parameters (βi represents the change in the mean value of the dependent
variable y that corresponds to a one-unit increase in the independent variable xi, holding
the values of all other independent variables constant)
■ ε = error term (accounts for the variability in y that cannot be explained by
the linear effect of the q independent variables)
ESTIMATED MULTIPLE REGRESSION MODEL

LEAST SQUARES METHOD AND MULTIPLE REGRESSION

■ The least squares method is used to develop the estimated
multiple regression equation, finding the coefficients
b0, b1, b2,…, bq that satisfy

min Σi=1..n (yi − ŷi)² = min Σi=1..n ei²
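The minimization above can be sketched in a few lines of pure Python by solving the normal equations (AᵀA)b = Aᵀy. This is a minimal illustration, not the course's workflow (the deck uses Excel's Regression tool); the data below are made up so that the fit can be checked by eye.

```python
# Minimal pure-Python least squares for multiple regression via the
# normal equations (A^T A) b = A^T y. Illustrative only; the deck's
# estimates come from Excel's Regression tool.

def solve(M, v):
    """Solve M b = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    b = [0.0] * n
    for r in range(n - 1, -1, -1):
        b[r] = (A[r][n] - sum(A[r][c] * b[c] for c in range(r + 1, n))) / A[r][r]
    return b

def fit_least_squares(X, y):
    """Return [b0, b1, ..., bq] minimizing the sum of squared residuals."""
    A = [[1.0] + list(map(float, row)) for row in X]   # column of 1s for b0
    p = len(A[0])
    AtA = [[sum(a[i] * a[j] for a in A) for j in range(p)] for i in range(p)]
    Aty = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(p)]
    return solve(AtA, Aty)

# Noise-free data generated from y = 1 + 2*x1 + 3*x2, so the fit
# should recover those coefficients exactly (up to rounding).
X = [[1, 1], [2, 1], [3, 2], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b0, b1, b2 = fit_least_squares(X, y)
```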

MULTIPLE REGRESSION MODEL

INFERENCE AND REGRESSION


Simple linear regression model Multiple regression model


EXTENSION OF BUTLER TRUCKING COMPANY

■ Butler Trucking Company


– The estimated simple linear regression equation is
ŷi =1.2739+0.0678xi
– The linear effect of the number of miles traveled explains
66.41% of the variability in travel time in the sample
data (r2=0.6641)
– 33.59% of the variability in sample travel times remains
unexplained
– The managers want to consider adding one or more
independent variables, such as number of deliveries, to
the model to explain some of the remaining variability in
the dependent variable

– 300 observations are used this time


Assignment Miles (x1) Deliveries (x2) Time (y)
1 100.0 4.0 9.3
2 50.0 3.0 4.8
3 100.0 4.0 8.9
4 100.0 2.0 6.5
5 50.0 2.0 4.2

290 85.0 2.0 7.8


291 75.0 2.0 6.5
292 70.0 2.0 6.1
293 75.0 4.0 7.2
294 70.0 6.0 8.9
295 95.0 6.0 10.9
296 50.0 4.0 7.2
297 50.0 1.0 3.5
298 85.0 2.0 8.0
299 100.0 2.0 7.8
300 65.0 6.0 10.0
ESTIMATED MULTIPLE REGRESSION MODEL

■ Estimated multiple linear regression with two independent


variables

ŷ = b0 + b1x1 + b2x2

■ ŷ = estimated mean travel time


■ x1= distance travelled
■ x2= number of deliveries

■ SST, SSR, SSE and r2 are computed


EXCEL’S REGRESSION TOOL

ŷ = 0.1273 + 0.0672x1 + 0.6900x2 (r² = SSR/SST = 915.5161/1120.1032 = 81.73%)
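The r² on this slide can be checked directly from the two sums of squares reported in the Excel output:

```python
# Checking the slide's r² from its sums of squares: r² = SSR/SST.
SSR = 915.5161   # regression sum of squares (from the Excel output)
SST = 1120.1032  # total sum of squares
r_squared = SSR / SST   # ≈ 0.8173, i.e. 81.73%
```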


ADJUSTED R2

■ r2 never decreases when a new x variable is added to the model


■ This can be a disadvantage when comparing models
■ What is the net effect of adding a new variable?
– We lose a degree of freedom when a new x variable is added
– Did the new x variable add enough explanatory power to offset the loss of one degree of
freedom?

Adjusted r² = 1 − (1 − r²)(n − 1)/(n − k − 1)
(where n = sample size, k = number of independent variables)

– Interpreted as the percentage of the total sum of squares that can be explained by using
the estimated regression equation, adjusted for the number of x variables used
– Smaller than r2
– Useful in comparing among models
Adjusted r2 = 1 − [(1 − 0.8173)(300 − 1)/(300 − 2 − 1)] = 0.8161 = 81.61%
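The adjusted r² computation above can be reproduced in a couple of lines:

```python
# The adjusted r² formula from this slide, applied to the Butler Trucking
# two-variable fit (r² = 0.8173, n = 300, k = 2).
def adjusted_r2(r2, n, k):
    """Adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

adj = adjusted_r2(0.8173, n=300, k=2)   # ≈ 0.8161, matching the slide
```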

Check residual plots before inference
INFERENCE AND REGRESSION

■ Three regression conditions


– The population of potential error terms ε is normally distributed with a mean of 0
– The population of potential error terms ε has a constant variance
– The values of ε are statistically independent
■ The errors must satisfy these conditions in order for inferences to be valid
■ How to check?
– Residual plots to check for violations of regression conditions
– Residuals vs. ŷ
– Residuals vs. xi
(Residual plot sketches:
– residuals symmetrically distributed around 0 vs. not symmetrically distributed around 0
– constant variance vs. non-constant variance
– independent vs. not independent residuals)
WHEN RESIDUALS DO NOT MEET CONDITIONS

■ An important independent variable has been omitted


■ The functional form of the model is inadequate to explain the
relationships between the independent variables and the
dependent variable

EXCEL’S REGRESSION TOOL

SCATTER CHART OF RESIDUALS AND PREDICTED VALUES OF
THE DEPENDENT VARIABLE

EXCEL’S REGRESSION TOOL

SCATTER CHART OF RESIDUALS AND PREDICTED VALUES OF
THE DEPENDENT VARIABLE

Hypothesis testing
INFERENCE AND REGRESSION

■ Statistical inference
– Process of making estimates and drawing conclusions about one or more
characteristics of a population (the value of one or more
parameters) through the analysis of sample data drawn from the
population
■ Inference is commonly used to estimate and draw conclusions on
– The regression parameters β0,β1,…,βq
– The mean value and/or the predicted value of the dependent variable y
for specific values of the independent variables x1,x2,…,xq
■ Consider both hypothesis testing and interval estimation
HYPOTHESIS TESTING

■ If you can assume that a defendant is either innocent or guilty,
under which assumption is it easier to prove that he is guilty?
■ Claim: The population mean age is 50
■ H0: μ = 50, H1: μ ≠ 50
■ Sample the population and find sample mean

■ Suppose the sample mean (x̄) age was 20
■ This is significantly lower than the claimed mean population age of 50
■ If the null hypothesis were true, the probability of getting such a different sample mean
would be very small
■ Getting a sample mean of 20 would be very unlikely if the population mean were 50
■ You conclude that the population mean must not be 50

■ You reject the null hypothesis (H0: μ = 50)


■ If the sample mean is close to the assumed population mean
– H0 is not rejected
■ If the sample mean is far from the assumed population mean
– H0 is rejected
■ How far is “far enough” to reject H0?
– The critical value of a test statistic is determined for decision making

(Figure: two-tailed test with a region of rejection in each tail, separated from the
non-rejection region by the critical values)
INFERENCE AND REGRESSION

■ Testing individual regression parameters


– t-test
– To determine whether statistically significant relationships exist between
the dependent variable y and each of the independent variables xj
– If βj=0, there is no linear relationship between the dependent variable y
and the independent variable xj
– If βj≠0, there is a linear relationship between y and xj

INFERENCE AND REGRESSION

■ Use a t test to test the hypothesis that a regression parameter βj is equal to zero


– Sbj is the estimated standard deviation of bj
– As the magnitude of t increases (as t deviates from zero in either direction),
we are more likely to reject the hypothesis that the regression parameter βj=0

tSTAT = (bj − 0) / Sbj    (df = n – k – 1)
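The t statistic is just the coefficient over its standard error; using the Miles values reported in the deck's Excel output (b1 = 0.06718172, Sb1 = 0.002454979) reproduces the tSTAT shown on the next slides:

```python
# tSTAT = (bj - 0)/Sbj with df = n - k - 1, using the Miles coefficient
# and standard error from the Excel output reported in this deck.
def t_stat(bj, s_bj):
    return bj / s_bj

t_miles = t_stat(0.06718172, 0.002454979)   # ≈ 27.3655, matching the slide
df = 300 - 2 - 1                            # 297
```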
H0: βj = 0, H1: βj ≠ 0
d.f. = 300 − 2 − 1 = 297, α = 0.05
Excel: =T.INV.2T(0.05, 297) gives tα/2 = 1.97

From the Excel output:
For Miles, tSTAT = 27.3655, with p-value < 0.0001
For Deliveries, tSTAT = 23.3731, with p-value < 0.0001

The test statistic for each variable falls in the rejection region (p-values < 0.05)
Decision: Reject H0 for each variable
Conclusion: There is evidence that both Miles and Deliveries affect travel time at α = 0.05
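The two-tailed decision rule used here can be sketched as a one-line comparison against the critical value (1.97 for α = 0.05 with 297 degrees of freedom, per the Excel formula above):

```python
# Two-tailed decision rule: reject H0 when |tSTAT| exceeds the critical
# value t(alpha/2). Critical value and t statistics are from this slide.
def reject_h0(t_stat, t_crit):
    return abs(t_stat) > t_crit

t_crit_05 = 1.97   # =T.INV.2T(0.05, 297)
decisions = {
    "Miles": reject_h0(27.3655, t_crit_05),
    "Deliveries": reject_h0(23.3731, t_crit_05),
}
```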
H0: βj = 0, H1: βj ≠ 0
d.f. = 300 − 2 − 1 = 297, α = 0.01
Excel: =T.INV.2T(0.01, 297) gives tα/2 = 2.59

From the Excel output:
For Miles, tSTAT = 27.3655, with p-value < 0.0001
For Deliveries, tSTAT = 23.3731, with p-value < 0.0001

The test statistic for each variable falls in the rejection region (p-values < 0.01)
Decision: Reject H0 for each variable
Conclusion: There is evidence that both Miles and Deliveries affect travel time at α = 0.01
P-VALUE

■ p-value
– The probability of obtaining a test statistic equal to or more extreme (< or
>) than the observed sample value, given H0 is true (no linear relationship)
– The p-value is also called the observed level of significance
– Smallest value of α for which H0 can be rejected
■ Compare the p-value with α

– If p-value < α, reject H 0

– If p-value ≥ α, do not reject H0

– If the p-value is low then H0 must go


/2=0.005 /2=0.005

Do not reject H Reject H0


Reject H0
-t α/2
0
tα/2
0
-2.59 2.59

Excel
=T.DIST.2T(D18,297)
-27.3655 0 27.3655

42
Interval estimation
INFERENCE AND REGRESSION

■ Confidence interval
– An estimate of a population parameter that provides an interval
believed to contain the value of the parameter at some level of
confidence

bj ± tα/2 Sbj
■ Confidence level
– Indicates how frequently interval estimates based on samples of the
same size taken from the same population using identical sampling
techniques will contain the true value of the parameter we are
estimating
– 1 − α (where α is the level of significance)
For Miles, with tα/2 = T.INV.2T(0.05, 297) = 1.968:
Lower 95% = 0.06718172 − 1.968 × 0.002454979 = 0.0624
Upper 95% = 0.06718172 + 1.968 × 0.002454979 = 0.0720

bj ± tα/2 Sbj gives 0.0624 ≤ β1 ≤ 0.0720

You have 95% confidence that this interval correctly estimates the
relationship between these variables.
From a hypothesis-testing viewpoint, because this confidence interval does
not include 0, you can conclude that the regression coefficient (β1) has a
significant effect.
F test
INFERENCE AND REGRESSION

■ Testing for an overall regression relationship


■ Use an F test based on the F probability distribution
– H0: β1=β2=…= βq=0 (no linear relationship)
– H1: at least one βi ≠0 (at least one independent variable affects y)

FSTAT = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]
p-value for the F Test

FSTAT = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]

FSTAT = [915.5160626/2]/[204.5871374/(300 − 2 − 1)] = 457.7580313/0.68884558 = 664.5292419
H0: β1 = β2 = 0
H1: β1 and β2 not both zero
df1 = 2, df2 = 297

α = 0.05: F0.05 = 3.03 (Excel: =F.INV.RT(0.05, 2, 297))
α = 0.01: F0.01 = 4.68 (Excel: =F.INV.RT(0.01, 2, 297))

Since the FSTAT test statistic falls in the rejection region at either level, reject H0.
There is evidence that at least one independent variable affects y.
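The F computation and decision above can be checked from the sums of squares:

```python
# Overall F test: FSTAT = (SSR/k) / (SSE/(n - k - 1)), compared against
# the critical values reported on this slide (from Excel's F.INV.RT).
SSR, SSE = 915.5160626, 204.5871374
n, k = 300, 2
f_stat = (SSR / k) / (SSE / (n - k - 1))   # ≈ 664.53
reject_at_05 = f_stat > 3.03   # F0.05 with (2, 297) df
reject_at_01 = f_stat > 4.68   # F0.01 with (2, 297) df
```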


INFERENCE AND REGRESSION

■ Non-significant independent variables


– If practical experience dictates that the non-significant independent variable
has a relationship with the dependent variable, the independent variable
should be kept in the model
– If the model sufficiently explains the dependent variable without the non-
significant independent variable, then consider rerunning the
regression without the non-significant independent variable (results
may change)
– The appropriate treatment of the inclusion or exclusion of the y-intercept when b0
is not statistically significant may require special consideration
– Regression through the origin should not be forced unless there are strong a priori
reasons for believing that the dependent variable is equal to zero when
the values of all independent variables in the model are equal to zero
Multicollinearity
MULTICOLLINEARITY

■ Multicollinearity refers to the correlation among the independent variables in


multiple regression analysis
■ What will happen if independent variables are correlated?
– In t tests for the significance of individual parameters, the difficulty caused by
multicollinearity is that it is possible to conclude that a parameter associated
with one of the multicollinear independent variables is not significantly different
from zero when the independent variable actually has a strong relationship
with the dependent variable
■ This problem is avoided when there is little correlation among the independent
variables

Example: miles traveled and gasoline consumed are strongly related; as miles
traveled (X) goes up, gasoline consumed (Y) goes up.
The primary consequence of multicollinearity is that
it increases the variances and standard errors of the
regression estimates of β0, β1, β2, …, βq and of the predicted
values of the dependent variable, and so inference based on
these estimates is less precise than it should be.
A TEST OF MULTICOLLINEARITY

■ As a rule of thumb, multicollinearity is a
potential problem if the absolute value of the
sample correlation coefficient exceeds
0.7 for any two of the independent
variables
■ Correlation coefficient in Excel
■ =CORREL(array1, array2)

 rMiles, Gasoline Consumption=0.9571 >0.7


– Miles and Gasoline Consumption are
collinear
– Include either Miles or Gasoline
Consumption

 rMiles, Deliveries = 0.0258 < 0.7
– Miles and Deliveries are not collinear
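The rule-of-thumb check above amounts to computing a sample correlation coefficient (what Excel's =CORREL returns) and comparing its absolute value to 0.7. A minimal sketch with illustrative data (the numbers below are made up, not the Butler Trucking sample):

```python
# Rule-of-thumb multicollinearity check: flag any pair of independent
# variables whose sample correlation exceeds 0.7 in absolute value.
# Data are illustrative, not the actual Butler Trucking observations.
def pearson_r(xs, ys):
    """Sample correlation coefficient (same computation as Excel's CORREL)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def collinear(xs, ys, threshold=0.7):
    return abs(pearson_r(xs, ys)) > threshold

miles = [100, 50, 100, 100, 50, 85, 75]
gasoline = [10.2, 5.1, 9.8, 10.5, 4.9, 8.6, 7.4]   # tracks miles closely
deliveries = [4, 3, 4, 2, 2, 2, 2]                  # largely unrelated to miles
```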
Categorical variable
CATEGORICAL INDEPENDENT VARIABLES

🞍 Butler Trucking Company and Rush Hour


– Dependent variable: travel time (y)
– Independent variables: miles traveled (x1) and number of deliveries (x2)
– Categorical variable/dummy variable: rush hour (x3)

🞍 x3=0 if an assignment did not include travel on the congested segment of highway during
afternoon rush hour

🞍 x3=1 if an assignment included travel on the congested segment of highway during


afternoon rush hour
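The rush-hour coding described above maps a yes/no attribute to a 0/1 dummy. A tiny sketch, where the `rush_hour` field name on each assignment record is an assumption made for illustration:

```python
# Dummy coding for the rush-hour categorical variable: x3 = 1 if the
# assignment includes the congested highway segment during afternoon
# rush hour, else 0. The `rush_hour` field name is hypothetical.
def encode_rush_hour(assignment):
    return 1 if assignment["rush_hour"] else 0

a1 = {"miles": 100.0, "deliveries": 4, "rush_hour": True}
a2 = {"miles": 50.0, "deliveries": 2, "rush_hour": False}
x3_a1 = encode_rush_hour(a1)   # 1
x3_a2 = encode_rush_hour(a2)   # 0
```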
CATEGORICAL INDEPENDENT VARIABLES

ei = yi − ŷi (a positive residual means the actual value is larger than the predicted value)
CATEGORICAL INDEPENDENT VARIABLES

ŷ = –0.3302 + 0.0672x1 + 0.6735x2 + 0.9980x3
CATEGORICAL INDEPENDENT VARIABLES

ŷ = –0.3302 + 0.0672x1 + 0.6735x2 + 0.9980x3


■ The model estimates that travel time increases by
– 0.0672 hour for every increase of 1 mile traveled, holding constant the number of
deliveries and whether the driving assignment route requires the driver to travel
on the congested segment of a highway during the afternoon rush hour period
– 0.6735 hour for every delivery, holding constant the number of miles
traveled and whether the driving assignment route requires the driver to
travel on the congested segment of a highway during the afternoon rush
hour period
– 0.9980 hour if the driving assignment route requires the driver to travel on the
congested segment of a highway during the afternoon rush hour period,
holding constant the number of miles traveled and the number of deliveries
■ r2 = 0.8838 indicates that the regression model explains approximately 88.38% of the
variability in travel time for the driving assignments in the sample
CATEGORICAL INDEPENDENT VARIABLES

■ When x3 = 0: ŷ = −0.3302 + 0.0672x1 + 0.6735x2
■ When x3 = 1: ŷ = (−0.3302 + 0.9980) + 0.0672x1 + 0.6735x2 = 0.6678 + 0.0672x1 + 0.6735x2
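The effect of the dummy is a pure intercept shift: for any fixed miles and deliveries, setting x3 = 1 raises the predicted travel time by exactly the dummy's coefficient. A quick check using the fitted equation from this deck:

```python
# Fitted Butler Trucking model with the rush-hour dummy (from this deck).
# Flipping x3 from 0 to 1 shifts the prediction by the dummy coefficient.
def predict(x1, x2, x3):
    return -0.3302 + 0.0672 * x1 + 0.6735 * x2 + 0.9980 * x3

shift = predict(80, 3, 1) - predict(80, 3, 0)   # 0.9980 hours
```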
CATEGORICAL INDEPENDENT VARIABLES

■ If a categorical variable has k levels, k-1 dummy variables are required


■ Suppose a manufacturer of vending machines organized the sales
territories for a particular state into three regions: A, B, and C
■ Suppose the managers believe sales region is one of the important factors in
predicting the number of units sold

CATEGORICAL INDEPENDENT VARIABLES

Region A

Region B

Region C

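A categorical variable with k = 3 levels (regions A, B, C) needs k − 1 = 2 dummies, as stated above. Which level serves as the baseline is a modeling choice; in this sketch region A is the (assumed) baseline, so it is coded as all zeros:

```python
# k - 1 dummy coding for a 3-level categorical variable (regions A, B, C).
# Region A is taken as the baseline here; that choice is an assumption.
def region_dummies(region):
    return {"B": int(region == "B"), "C": int(region == "C")}

dummies = {r: region_dummies(r) for r in ("A", "B", "C")}
```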
APPLICATION 1

https://www.sciencedirect.com/science/article/pii/S2212567112001888

Which variables are significant?
APPLICATION 2

https://www.emerald.com/insight/content/doi/10.1108/01443579910287064/full/html
DATABASES

◾ HKSAR Government: https://data.gov.hk/en/


◾ Statista (PolyU login required): https://www-statista-com.ezproxy.lb.polyu.edu.hk/
◾ World bank: https://data.worldbank.org/
◾ Aviation: https://www.bts.gov/
◾ Shipping (PolyU login required): https://www.lib.polyu.edu.hk/databases/shipping-intelligence-network-individual-title-varies
◾ HK Air Traffic Statistics: https://www.cad.gov.hk/english/statistics.html
◾ Kaggle Dataset: https://www.kaggle.com/datasets
