Multiple Linear Regression Example: Problem Statement
Example
Problem Statement
Mileage is often thought of as a good predictor of the sale price of a used car. Does this
same conjecture hold for so-called “luxury cars”: Porsches, Jaguars, and BMWs? More precisely, do
the slopes and intercepts differ when comparing mileage and price for these three brands of cars?
To answer this question, data were randomly selected from an Internet car sale site. (Tweaked a bit
from Cannon et al. 2013 [Chapter 1 and Chapter 4])
Competing Hypotheses
There are many hypothesis tests to run here. It’s important to first think about the model that we will
fit to address these questions. We want to predict Price (in thousands of dollars) based
on Mileage (in thousands of miles). A simple linear regression equation for this would
be $\widehat{\text{Price}} = b_0 + b_1 \cdot \text{Mileage}$.
This example is more complicated, though. We also need to include
CarType in our model. Since CarType has three levels (BMW, Porsche, and Jaguar), we encode it
as two dummy variables with BMW as the baseline (since it occurs first alphabetically in the list of
three car types). This model would help us determine if there is a statistical difference in the
intercepts of predicting Price based on Mileage for the three car types, assuming that the slope is
the same for all three lines:
$$\widehat{\text{Price}} = b_0 + b_1 \cdot \text{Mileage} + b_2 \cdot \text{Porsche} + b_3 \cdot \text{Jaguar}.$$
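To make the dummy coding concrete, here is a minimal Python sketch of the common-slope model above. The function names (`dummy_encode`, `predict_parallel`) and the coefficient values are hypothetical, chosen only to illustrate the arithmetic; they are not estimates from the car data.

```python
# Dummy (indicator) encoding for CarType with BMW as the baseline level.
LEVELS = ["BMW", "Jaguar", "Porsche"]  # alphabetical order; the first level is the baseline


def dummy_encode(car_type):
    """Return the (porsche, jaguar) indicators; BMW maps to (0, 0)."""
    if car_type not in LEVELS:
        raise ValueError(f"unknown car type: {car_type!r}")
    return (int(car_type == "Porsche"), int(car_type == "Jaguar"))


def predict_parallel(coefs, mileage, car_type):
    """Fitted price under the common-slope (parallel lines) model.

    coefs = (b0, b1, b2, b3), in the order used in the equation above.
    """
    b0, b1, b2, b3 = coefs
    porsche, jaguar = dummy_encode(car_type)
    return b0 + b1 * mileage + b2 * porsche + b3 * jaguar


# Hypothetical coefficients (NOT fitted values), used only for illustration.
example_coefs = (50.0, -0.5, 10.0, -5.0)
for brand in LEVELS:
    print(brand, dummy_encode(brand), predict_parallel(example_coefs, 20.0, brand))
```

Because both dummies are 0 for a BMW, b2 and b3 act purely as shifts of the intercept relative to the BMW line, while every brand shares the slope b1.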
This is not exactly what the problem asks, though: we also want to see whether the slopes of
the three fitted lines differ across the three car types. To do so, we need to
incorporate interaction terms between Mileage and the dummy variables Porsche and Jaguar. BMW
remains the baseline: there is no separate BMW:Mileage term in the model, but setting Jaguar
and Porsche equal to 0 leaves b1 as the Mileage slope for BMWs:
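$$\widehat{\text{Price}} = b_0 + b_1 \cdot \text{Mileage} + b_2 \cdot \text{Porsche} + b_3 \cdot \text{Jaguar} + b_4 \cdot \text{Mileage} \cdot \text{Jaguar} + b_5 \cdot \text{Mileage} \cdot \text{Porsche}.$$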
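Substituting the dummy values into the interaction model gives each brand its own intercept and slope. The sketch below spells out that substitution; `brand_lines` is a hypothetical helper and the coefficients are made-up numbers used only for illustration, not estimates from the car data.

```python
def brand_lines(coefs):
    """Map each car type to its (intercept, slope) under the interaction model.

    coefs = (b0, b1, b2, b3, b4, b5), in the order used in the equation above:
    b2 and b3 multiply the Porsche and Jaguar dummies,
    b4 multiplies Mileage*Jaguar and b5 multiplies Mileage*Porsche.
    """
    b0, b1, b2, b3, b4, b5 = coefs
    return {
        "BMW": (b0, b1),                # both dummies are 0: the baseline line
        "Porsche": (b0 + b2, b1 + b5),  # Porsche = 1, Jaguar = 0
        "Jaguar": (b0 + b3, b1 + b4),   # Jaguar = 1, Porsche = 0
    }


# Hypothetical coefficients (NOT fitted values), used only for illustration.
example = (50.0, -0.5, 10.0, -5.0, -0.25, 0.125)
for brand, (intercept, slope) in brand_lines(example).items():
    print(f"{brand}: intercept={intercept}, slope={slope}")
```

Read this way, b2 and b3 measure how the Porsche and Jaguar intercepts differ from the BMW intercept, and b5 and b4 measure how their slopes differ from the BMW slope, which is what the hypothesis tests in this example address.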