Emerging Market Institute Fall 2024

Business Data Analysis

Jin Chen, PhD.

School of Statistics,
Beijing Normal University
10/30/2024

Statistics for Business and Economics © Cengage Learning 2017

No dissemination without permission or outside the class.
Lecture 11 Multiple Regression
 Multiple Regression
 Multiple regression model
 Least squares method
 Multiple coefficient of determination
 Model assumptions
 Testing for significance
 Estimation and prediction based on the estimated regression
equation
 Residual analysis
 Categorical independent variables
 Model curvilinear relationship

 Multiple regression model
 Two or more independent variables
 Describes how the dependent variable is related to the independent
variables and an error term:
y = β0 + β1x1 + β2x2 + . . . + βpxp + ε
where: β0, β1, β2, . . . , βp are the parameters, and ε is a random variable
called the error term
 Multiple regression equation
E(y) = β0 + β1x1 + β2x2 + . . . + βpxp
 Estimated multiple regression equation
ŷ = b0 + b1x1 + b2x2 + . . . + bpxp
 Least squares method
min ∑(yi − ŷi)²
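As a sketch in matrix notation (an assumption of this note; the slides themselves do not use matrices), let X be the n × (p + 1) matrix whose first column is all ones and whose remaining columns hold x1, . . . , xp, and let b be the vector of estimated coefficients. Provided XᵀX is invertible, the least squares criterion and its solution can be written as:

```latex
\min_{\mathbf{b}} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  = \min_{\mathbf{b}} \, (\mathbf{y} - X\mathbf{b})^{\top} (\mathbf{y} - X\mathbf{b}),
\qquad
\mathbf{b} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```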

 Assumptions about the error term
 The error ε is a random variable with mean zero
 The variance of ε, denoted by σ², is the same for all values of
the independent variables
 The values of ε are independent
 The error ε is a normally distributed random variable reflecting
the deviation between the observed y value and the expected
value of y given by β0 + β1x1 + β2x2 + . . . + βpxp

 Testing for significance
 The F test is used to determine whether a significant
relationship exists between the dependent variable and the set
of all the independent variables
 The F test is referred to as the test for overall significance
 If the F test shows an overall significance, the t test is used to
determine whether each of the individual independent variables
is significant
 A separate t test is conducted for each of the independent
variables in the model. Each t test is referred to as a test for
individual significance.

 Testing for Significance: F Test
 Hypotheses:
H0: β1 = β2 = . . . = βp = 0
Ha: One or more of the parameters is not equal to zero
 Test statistic:
F = MSR/MSE, where MSR = SSR/p and MSE = SSE/(n − p − 1)
 Rejection rule:
Reject H0 if p-value ≤ α or if F ≥ Fα, where Fα is based on an F
distribution with p degrees of freedom in the numerator and
n − p − 1 degrees of freedom in the denominator

 Testing for Significance: t Test
 Hypotheses: H0: βi = 0 Ha: βi ≠ 0
 Test statistic: t = bi/sbi, where sbi is the estimated standard deviation of bi
 Rejection rule:
Reject H0 if p-value ≤ α or if t ≤ −tα/2 or t ≥ tα/2,
where tα/2 is based on a t distribution with n − p − 1 degrees
of freedom.

 Testing for Significance: Multicollinearity
 Multicollinearity refers to the correlation among the independent
variables
 When the independent variables are highly correlated (e.g. |r|>.7), it is
not possible to determine the separate effect of any particular
independent variable on the dependent variable
 If the estimated regression equation is to be used only for predictive
purposes, multicollinearity is usually not a serious problem
 Every attempt should be made to avoid including independent variables
that are highly correlated
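A quick way to screen for multicollinearity is to compute the sample correlation between each pair of independent variables and compare it with the |r| > .7 rule of thumb. The sketch below does this for the two independent variables of the Butler Trucking example that follows. The data values are an assumption: they are not printed on the slides and are taken from the textbook's version of the example. The function name `highly_correlated` is hypothetical.

```python
import numpy as np

# Assumed data (not shown on the slides): the 10 driving assignments of the
# Butler Trucking example as given in the textbook.
miles = np.array([100, 50, 100, 100, 50, 80, 75, 65, 90, 90], dtype=float)
deliveries = np.array([4, 3, 4, 2, 2, 2, 3, 4, 3, 2], dtype=float)


def highly_correlated(x1, x2, threshold=0.7):
    """Return (r, flag); flag is True when |r| exceeds the threshold."""
    r = np.corrcoef(x1, x2)[0, 1]  # sample correlation coefficient
    return r, abs(r) > threshold


r, flag = highly_correlated(miles, deliveries)
print(f"r = {r:.2f}, multicollinearity concern: {flag}")
```

With more than two independent variables, the same check would be applied to every pair; a pairwise screen of this kind does not detect every form of multicollinearity.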

 Example: Butler Trucking Company
The business of Butler Trucking Company, an independent trucking
company in southern California, involves deliveries throughout its local
area. The managers want to predict the total daily travel time for their
drivers. They believed that the total daily travel time would be closely
related to the number of miles traveled and the number of deliveries
made daily. A simple random sample of 10 driving assignments provided
the data for the analysis.
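Solution:
y = β0 + β1x1 + β2x2 + ε
where y is the total daily travel time (hours), x1 is the number of miles traveled, and x2 is the number of deliveries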

 ŷ = −0.869 + 0.061x1 + 0.923x2
 Interpretation of the coefficients
 bi represents an estimate of the change in y corresponding to a one-unit change in xi when
all other independent variables are held constant.
 b1 = 0.061 is an estimate of the expected increase in travel time corresponding to an
increase of one mile in the distance traveled when the number of deliveries is held constant.
 b2 = 0.923 is an estimate of the expected increase in travel time corresponding to one more
delivery made when the number of miles traveled is held constant.
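The estimated equation can be reproduced with a least squares fit. The following is a minimal sketch using NumPy rather than Minitab, so it is not the course's own computation. The data values are an assumption: they do not appear on the slides and are taken from the textbook's version of the Butler Trucking example.

```python
import numpy as np

# Assumed data (not shown on the slides): 10 driving assignments.
miles = np.array([100, 50, 100, 100, 50, 80, 75, 65, 90, 90], dtype=float)
deliveries = np.array([4, 3, 4, 2, 2, 2, 3, 4, 3, 2], dtype=float)
hours = np.array([9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1])

# Design matrix: a column of ones for the intercept, then x1 and x2.
X = np.column_stack([np.ones_like(miles), miles, deliveries])

# Least squares solution: minimizes the sum of squared residuals.
b, *_ = np.linalg.lstsq(X, hours, rcond=None)
b0, b1, b2 = b

print(f"y-hat = {b0:.3f} + {b1:.4f} x1 + {b2:.3f} x2")
```

Under these assumed data the fitted coefficients agree with the slide's equation after rounding.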
 Multiple coefficient of determination
 SST=SSR+SSE
where SST = total sum of squares = ∑(yi − ȳ)²
SSR = sum of squares due to regression = ∑(ŷi − ȳ)²
SSE = sum of squares due to error = ∑(yi − ŷi)²
 R² = SSR/SST = 21.6006/23.900 = .9038
Therefore, 90.38% of the variability in travel time y is explained by the
estimated multiple regression equation.
 R² never decreases, and typically increases, as independent variables are added to the
model
 Many analysts prefer adjusting R² for the number of independent variables to
avoid overestimating the impact of adding an independent variable on the
amount of variability explained by the estimated regression equation:
Ra² = 1 − (1 − R²)(n − 1)/(n − p − 1) = 1 − (1 − .9038)(9/7) = .8763
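The two measures can be checked directly from the sums of squares. This sketch uses only numbers stated on the slides (SSR = 21.6006, SST = 23.900, n = 10 driving assignments, p = 2 independent variables); the function names are hypothetical.

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination: share of variability explained."""
    return ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjust R-squared for the number of independent variables p."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


ssr, sst = 21.6006, 23.900
n, p = 10, 2

r2 = r_squared(ssr, sst)
r2_adj = adjusted_r_squared(r2, n, p)
print(f"R-sq = {r2:.4f}, R-sq(adj) = {r2_adj:.4f}")
```

Because the factor (n − 1)/(n − p − 1) exceeds 1 whenever p ≥ 1, the adjusted value is always below R² when R² is less than 1.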

 F Test for overall significance
With F = 32.88 and p-value < .01, we reject H0: β1 = β2 = 0 and
conclude that a significant relationship is present between travel
time y and the two independent variables.
 t Test for individual significance
b1 = .06113, sb1 = .00989, t = .06113/.00989 = 6.18, p < .01
b2 = .923, sb2 = .221, t = .923/.221 = 4.18, p < .01
We reject both H0: β1 = 0 and H0: β2 = 0, so miles traveled and number of deliveries are each individually significant.
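The test statistics above follow from quantities already reported for the Butler Trucking example. The sketch below recomputes them from the slide's numbers (SSR = 21.6006, SST = 23.900, n = 10, p = 2, and the coefficients with their standard errors). The critical values F.01 = 9.55 (2 and 7 degrees of freedom) and t.025 = 2.365 (7 degrees of freedom) are read from standard tables and are not given on the slides; the function names are hypothetical.

```python
def f_statistic(ssr, sse, n, p):
    """F = MSR / MSE for the test of overall significance."""
    msr = ssr / p              # mean square due to regression
    mse = sse / (n - p - 1)    # mean square due to error
    return msr / mse


def t_statistic(b, s_b):
    """t = b_i / s_b_i for the test of individual significance."""
    return b / s_b


ssr, sst, n, p = 21.6006, 23.900, 10, 2
f = f_statistic(ssr, sst - ssr, n, p)   # SSE = SST - SSR
t1 = t_statistic(0.06113, 0.00989)      # miles traveled
t2 = t_statistic(0.923, 0.221)          # number of deliveries

F_CRIT = 9.55    # table value, alpha = .01, df = (2, 7)
T_CRIT = 2.365   # table value, alpha/2 = .025, df = 7

print(f"F  = {f:.2f}, reject H0: {f >= F_CRIT}")
print(f"t1 = {t1:.2f}, reject H0: {abs(t1) >= T_CRIT}")
print(f"t2 = {t2:.2f}, reject H0: {abs(t2) >= T_CRIT}")
```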

Computer solution with Minitab
 Estimation and Prediction based on Estimated Regression
Equation
 Obtain the point estimate ŷ by substituting the given values x1,
x2, . . . , xp into the estimated regression equation
 The formulas required to develop interval estimates for the
mean value of y and for an individual value of y are beyond the
scope of the textbook. Software packages often provide these
estimates.
 Example: Butler Trucking Company
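A point estimate is obtained by substituting values into the estimated equation ŷ = −0.869 + 0.061x1 + 0.923x2. In the sketch below, the assignment of 100 miles and 2 deliveries is a hypothetical input chosen for illustration, and the function name is likewise hypothetical; the interval estimates mentioned above are not computed here.

```python
def predict_travel_time(miles, deliveries):
    """Point estimate of travel time (hours) from the estimated equation."""
    return -0.869 + 0.061 * miles + 0.923 * deliveries


# Hypothetical driving assignment: 100 miles, 2 deliveries.
y_hat = predict_travel_time(100, 2)
print(f"Estimated travel time: {y_hat:.2f} hours")
```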

 Residual Analysis
 For simple linear regression, the residual plot against x and the
residual plot against ŷ provide the same information
 In multiple regression analysis, it is preferable to use the
residual plot against ŷ to determine if the model assumptions
are satisfied.
 Standardized residual plot against ŷ
 Can identify outliers (typically, standardized residuals < −2 or > +2)
 Can provide insight about the assumption that the error term ε has a
normal distribution

 Example: Butler Trucking Company
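The standardized residuals behind such a plot can be computed as each residual divided by its estimated standard deviation, s·√(1 − hi), where hi is the leverage of observation i. The sketch below uses NumPy instead of Minitab and prints the (ŷ, standardized residual) pairs rather than drawing the plot. The data values are an assumption: they are not on the slides and come from the textbook's version of the Butler Trucking example.

```python
import numpy as np

# Assumed data (not shown on the slides): 10 driving assignments.
miles = np.array([100, 50, 100, 100, 50, 80, 75, 65, 90, 90], dtype=float)
deliveries = np.array([4, 3, 4, 2, 2, 2, 3, 4, 3, 2], dtype=float)
hours = np.array([9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.6, 6.1])

X = np.column_stack([np.ones_like(miles), miles, deliveries])
b, *_ = np.linalg.lstsq(X, hours, rcond=None)

fitted = X @ b
resid = hours - fitted

n, k = X.shape                        # k = p + 1 estimated coefficients
s = np.sqrt(resid @ resid / (n - k))  # standard error of the estimate

# Leverage h_i: diagonal of the hat matrix X (X'X)^-1 X'.
leverage = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

std_resid = resid / (s * np.sqrt(1 - leverage))

for y_hat, z in zip(fitted, std_resid):
    note = "  <-- possible outlier" if abs(z) > 2 else ""
    print(f"y-hat = {y_hat:5.2f}  standardized residual = {z:6.2f}{note}")
```

Plotting `std_resid` against `fitted` gives the standardized residual plot described above.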
 Categorical independent variables
 Categorical independent variables such as gender (male, female), method of
payment (cash, check, credit card), etc. are commonly used in regression
analysis
 When including categorical independent variables in the regression model,
dummy or indicator variables should be used.
 If a categorical variable has k levels, k - 1 dummy variables are required, with
each dummy variable being coded as 0 or 1.
 For example, x2 might represent gender where x2=0 indicates male and x2=1
indicates female.
 As another example, method of payment (cash, check, credit card) has three
levels and requires two dummy variables in the regression model: x1 = 1 and
x2 = 0 indicates cash; x1 = 0 and x2 = 1 indicates check; x1 = 0 and x2 = 0
indicates credit card.
 When a categorical variable has more than two levels, the dummy-variable
coefficients are less straightforward to interpret: each one compares its level
with the baseline level, which is coded with all dummy variables equal to 0.
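The payment-method coding above can be written as a small lookup. This is only an illustrative sketch; the function name `payment_dummies` is hypothetical.

```python
def payment_dummies(method):
    """Code a 3-level payment method as k - 1 = 2 dummy variables (x1, x2)."""
    coding = {
        "cash": (1, 0),
        "check": (0, 1),
        "credit card": (0, 0),  # baseline level: both dummies are 0
    }
    return coding[method]


for method in ("cash", "check", "credit card"):
    x1, x2 = payment_dummies(method)
    print(f"{method:12s} x1 = {x1}, x2 = {x2}")
```

A third dummy variable for credit card is unnecessary: it would always equal 1 − x1 − x2 and so carries no extra information.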

 Example: Johnson Filtration, Inc.
Johnson Filtration, Inc., provides maintenance service for water-
filtration systems throughout southern Florida. To estimate the service
time and the service cost, Johnson managers want to predict the repair
time necessary for each maintenance request. Repair time is believed to
be related to the number of months since the last maintenance service
and the type of repair problem (mechanical or electrical). Develop and
estimate a regression model showing how repair time is related to
number of months since the last maintenance service and the type of
repair problem.
Solution:
y = β0 + β1x1 + β2x2 + ε
where y denotes repair time (hours), x1 denotes the number of months since the
last maintenance service, and x2 denotes the type of repair (1 = electrical, 0 = mechanical)

 ŷ = .93 + .3876x1 + 1.263x2
 At the .05 level of significance, the p-value of .001 associated with the F test (F=21.36)
indicates that the regression relationship is significant.
 The t test results indicate that both months since last service (p-value<.001) and type of repair
(p-value=.005) are statistically significant.
 R-sq=85.92% and R-sq(adj)=81.9% indicate that the estimated regression equation does a
good job of explaining the variability in repair time.
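 b2 = 1.263 estimates the difference between the mean repair time for an electrical
repair and the mean repair time for a mechanical repair: with months since the last
service held constant, an electrical repair takes about 1.263 hours longer on average.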
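Substituting x2 = 0 and x2 = 1 into ŷ = .93 + .3876x1 + 1.263x2 gives two parallel lines: ŷ = .93 + .3876x1 for mechanical repairs and ŷ = 2.193 + .3876x1 for electrical repairs. The sketch below evaluates both for a hypothetical request made 4 months after the last service; the function name is also hypothetical.

```python
def predict_repair_time(months, electrical):
    """Estimated repair time (hours) from the estimated regression equation."""
    x2 = 1 if electrical else 0  # dummy variable: 1 = electrical, 0 = mechanical
    return 0.93 + 0.3876 * months + 1.263 * x2


months = 4  # hypothetical value for illustration
mechanical = predict_repair_time(months, electrical=False)
electrical = predict_repair_time(months, electrical=True)

print(f"mechanical: {mechanical:.2f} hours")
print(f"electrical: {electrical:.2f} hours")
print(f"difference: {electrical - mechanical:.3f} hours")
```

The difference equals b2 for any value of months, which is what "holding x1 constant" means in this model.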
 Model curvilinear relationship
 Example: Sales of Laboratory Scales
 A manufacturer of laboratory scales wants to investigate the relationship between
the length of employment of their salespeople and the number of scales sold.
 We first develop a simple linear regression model and a scatter diagram to see
how the model fits the data


 The scatter diagram suggests a possible curvilinear relationship between
the length of time employed and the number of scales sold.
 We then develop a multiple regression model with two independent
variables, x and x²:
y = β0 + β1x + β2x² + ε
 This model is often referred to as a second-order polynomial or a
quadratic model.
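A quadratic model is still linear in its parameters, so it can be fitted by least squares after adding x² as a second column. The sketch below shows this on hypothetical, noise-free data generated from a known curve; these are not the laboratory scales data, which the slides do not list.

```python
import numpy as np

# Hypothetical data generated from y = 50 + 6x - 0.03x^2 (no noise),
# so the fit should recover these coefficients.
x = np.array([5, 10, 20, 40, 60, 80, 100, 110], dtype=float)
y = 50 + 6 * x - 0.03 * x**2

# Treat x and x^2 as two independent variables in a multiple regression.
X = np.column_stack([np.ones_like(x), x, x**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = b

print(f"y-hat = {b0:.2f} {b1:+.2f} x {b2:+.3f} x^2")
```

A negative b2 corresponds to a curve that rises and then levels off or turns down, the kind of pattern a scatter diagram would show when a straight line fits poorly.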