Multiple Regression With Two Independent Variables: 1. Data Collection
A collector of antique grandfather clocks believes that the price received for the clocks at an
antique auction increases with the age of the clocks and with the number of bidders. Thus, the
model hypothesized is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
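A minimal sketch of how a model of this form could be fit by least squares, assuming the normal equations are solved directly in plain Python. The function names and the five observations are made up for illustration (the values are generated exactly from known coefficients so the result can be checked); they are not the auction data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_two_predictor_ols(x1, x2, y):
    """Return [b0, b1, b2] minimizing the sum of squared residuals."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]  # leading 1.0 is the intercept column
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up illustration data: y is exactly 10 + 2*x1 + 3*x2 (no error term).
x1 = [100.0, 120.0, 140.0, 160.0, 180.0]
x2 = [5.0, 9.0, 7.0, 13.0, 11.0]
y = [10.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

coefficients = fit_two_predictor_ols(x1, x2, y)
```

Because the illustration data contain no error term, the fitted coefficients should match the generating values up to rounding. With real data a statistics package would normally be used instead of hand-rolled elimination.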
1. Data Collection
Hsieh, P-H 1
BA 275 Modeling Relationships
Winter 2007 Multiple Regression Analysis
2. Initial Analysis
[Figure: scatterplots of Price against Age and of Price against Bidder]
[Figure: four residual diagnostic panels: a normal probability plot of the studentized residuals (percentage scale); studentized residuals against an unlabeled axis running from 0 to 40; a histogram (frequency) of the studentized residuals; and studentized residuals against an unlabeled axis running from 700 to 2200]
[Figure: studentized residuals against Bidder and against Age]
Are any of the required conditions (normality, independence, constant variance, and zero mean)
violated?
When the number of bidders is around 10, does the current model tend to overestimate or
underestimate Price? What about when the number of bidders is around 5 or 15?
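One way to reason about over- and underestimation is through the sign of the residuals (observed minus predicted) in a given range of a predictor. The sketch below assumes that convention; the helper names and all numbers are hypothetical and are not taken from the clock data or its plots.

```python
def mean_residual_in_range(x_values, observed, predicted, low, high):
    """Average of (observed - predicted) over cases with low <= x <= high."""
    residuals = [
        obs - pred
        for xv, obs, pred in zip(x_values, observed, predicted)
        if low <= xv <= high
    ]
    if not residuals:
        raise ValueError("no observations fall in the requested range")
    return sum(residuals) / len(residuals)


def describe_bias(mean_residual):
    """Translate the sign of an average residual into plain words."""
    if mean_residual > 0:
        return "underestimates"  # observed values sit above the predictions
    if mean_residual < 0:
        return "overestimates"   # observed values sit below the predictions
    return "no systematic bias"


# Hypothetical values chosen only to exercise the functions.
bidders = [5, 6, 10, 11, 15]
observed = [1000.0, 1100.0, 1500.0, 1600.0, 1900.0]
predicted = [950.0, 1060.0, 1540.0, 1650.0, 1850.0]

near_10 = mean_residual_in_range(bidders, observed, predicted, 9, 11)
near_5 = mean_residual_in_range(bidders, observed, predicted, 4, 6)
```

On a residual plot the same reading applies: a cluster of points below zero in some range means the model tends to overestimate there, and a cluster above zero means it tends to underestimate.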
6. Model Selection
After trying out several models, only a few remain that passed all the tests and satisfied the
required conditions. The following table summarizes the STATGRAPHICS PLUS output for each
model. Which of the competing models should be chosen as our final model, and why? (Assume
that our current model passed all the tests and satisfied the required conditions.)
What is the total variation of auction prices? How much has been explained by the model?
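For reference, the standard decomposition behind this question, written for a least-squares fit with an intercept. The symbols $\bar{y}$ (sample mean of Price) and $\hat{y}_i$ (fitted value) are notation introduced here, not taken from the printout.

```latex
\begin{align*}
\text{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2 && \text{(total variation)} \\
\text{SSR} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 && \text{(explained by the model)} \\
\text{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 && \text{(unexplained)} \\
\text{SST} &= \text{SSR} + \text{SSE}, \qquad R^2 = \frac{\text{SSR}}{\text{SST}} = 1 - \frac{\text{SSE}}{\text{SST}}
\end{align*}
```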
If there are 10 bidders and the clock is 100 years old, what is the expected auction
price?
If Age is held fixed and the number of bidders increases from 10 to 11, how much does Price
increase?
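The last two questions can be sketched as follows, assuming the fitted equation has the form of the hypothesized model. The coefficient values are placeholders chosen to be obviously artificial; the real estimates must be read from the regression output.

```python
def predict_price(b0, b_age, b_bidders, age, bidders):
    """Expected price from a fitted two-predictor linear model."""
    return b0 + b_age * age + b_bidders * bidders


# Placeholder estimates, NOT the values from the course printout.
B0, B_AGE, B_BIDDERS = 50.0, 2.0, 30.0

price_10 = predict_price(B0, B_AGE, B_BIDDERS, age=100, bidders=10)
price_11 = predict_price(B0, B_AGE, B_BIDDERS, age=100, bidders=11)

# With Age held fixed, one more bidder changes the prediction by exactly
# the coefficient on the number of bidders.
increase = price_11 - price_10
```

The same holds at any fixed age: the intercept and the age term cancel in the difference, leaving only the bidder coefficient.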
A bank would like to develop a model to predict the total sum of money that customers withdraw
(Y) from Automatic Teller Machines (ATMs) on a weekend based on the median value of homes
(X1) in the neighborhood in which the ATM is located and the location of the ATM (X2) (no =
not a shopping center and yes = shopping center). A random sample of 15 ATM locations is
selected. The multiple linear regression model:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
with normal error terms is expected to be appropriate. Perform a multiple linear regression
analysis.
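Before the regression can be run, the yes/no location has to be turned into a number. A minimal sketch, assuming the common 0/1 indicator coding with yes = 1 (the handout does not say which level is coded 1); the function name and sample labels are illustrative.

```python
def encode_location(label):
    """Map a yes/no shopping-center label to a 0/1 indicator variable."""
    mapping = {"no": 0, "yes": 1}  # assumed coding: yes = shopping center = 1
    key = label.strip().lower()
    if key not in mapping:
        raise ValueError(f"unexpected location label: {label!r}")
    return mapping[key]


# Illustrative labels only, not the sampled ATM locations.
locations = ["yes", "no", "No", " YES "]
x2 = [encode_location(value) for value in locations]
```

Under this coding the coefficient on X2 measures the shift in expected withdrawals for a shopping-center ATM relative to one outside a shopping center, at the same median home value.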
Regression Printout
Questions
3. What is the value of $R^2$? Of the adjusted $R^2$? When selecting a model, why do we prefer the adjusted $R^2$ to $R^2$?
4. Predict the amount of money withdrawn for a neighborhood in which the median value of
homes is $200,000 for an ATM that is located in a shopping center.
5. If the median value of homes increases by $2,000, then the amount of money withdrawn from
an ATM located in a shopping center is expected to increase by ________.
6. If the median value of homes is $200,000, then the amount of money withdrawn from an
ATM located in a shopping center is ________; and the amount of money withdrawn
from an ATM located outside a shopping center is ________. What is the difference?
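The arithmetic behind questions 3 to 6 can be sketched as below. The adjusted R-squared formula is the standard one for n observations and k predictors (here n = 15 and k = 2). Everything else is an assumption made for illustration: the coefficient values are placeholders, the indicator is coded 1 for a shopping center, and X1 is taken to be in the same units as the `home_value` argument. The actual estimates and units must come from the regression printout.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def predict_withdrawal(b0, b1, b2, home_value, in_shopping_center):
    """Expected withdrawal from the fitted model with a 0/1 location indicator."""
    x2 = 1 if in_shopping_center else 0
    return b0 + b1 * home_value + b2 * x2


# Placeholder estimates, NOT the values from the course printout.
B0, B1, B2 = 10.0, 0.5, 4.0

# Question 3 (with an arbitrary R^2 of 0.9 to exercise the formula).
adj = adjusted_r_squared(0.9, n_obs=15, n_predictors=2)

# Question 5: a 2,000 increase in X1 changes the prediction by 2,000 * b1,
# regardless of location.
change = (predict_withdrawal(B0, B1, B2, 202000, True)
          - predict_withdrawal(B0, B1, B2, 200000, True))

# Question 6: at the same home value, the two locations differ by b2.
inside = predict_withdrawal(B0, B1, B2, 200000, True)
outside = predict_withdrawal(B0, B1, B2, 200000, False)
difference = inside - outside
```

Adjusted R-squared charges a penalty for each added predictor, so unlike R-squared it does not automatically rise when a variable is added to the model.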