Critical Thinking Exercise-Real Estate
Important: Ensure that the Dependent Variable (Price) is plotted on the Y-axis,
and the Independent Variable is on the X-axis, as the column order determines
the axis assignment.
Price and Baths (R² = 0.0264): Slightly related, but still very weak. Baths explains only about 2.6% of the variation in Price.
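As a cross-check on the R² that Excel reports for a scatter plot trendline, the same quantity can be computed by hand. The sketch below is only an illustration: the function name r_squared and the sample numbers are hypothetical and are not taken from the Real Estate workbook. For a single independent variable, R² equals the squared correlation between X and Y.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values for illustration only (not from the assignment data)
baths = [1, 2, 3, 4]
price = [2, 4, 6, 8]
print(r_squared(baths, price))
```

A value near 1 means the independent variable explains most of the variation in Price; a value near 0, such as 0.0264, means it explains very little.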
Use Excel’s regression tools to evaluate how well the prescribed independent
variables explain variations in the dependent variable.
Before running the regression, follow these steps to properly encode categorical
variables:
Identify Categorical Variables – Review the dataset and determine which
columns contain categorical data (e.g., "Region," "Category," or "Yes/No"
values).
=IF(A2="Male",1,0)
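The Excel formula above turns a two-level text column into a 0/1 dummy variable. For readers who want to see the same logic outside Excel, here is a minimal sketch; the helper name encode_binary and the sample values are assumptions made for illustration only.

```python
def encode_binary(values, positive_label):
    """Return 1 where the value equals positive_label, otherwise 0.

    Mirrors the Excel pattern =IF(A2="Male",1,0) applied down a column.
    """
    return [1 if value == positive_label else 0 for value in values]


# Hypothetical column contents for illustration
column = ["Male", "Female", "Male"]
print(encode_binary(column, "Male"))  # [1, 0, 1]
```

A column with more than two categories would need one such 0/1 column per category, leaving one category out as the baseline.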
In the Excel spreadsheet provided, use the Data Analysis Add-in to run a regression
analysis with Price as the Dependent Variable and all Independent Variables.
Next, run a separate regression with Price as the Dependent Variable for each of the
following Independent Variables: Lot, Garage, and BRs. Place each set of regression
results in a separate worksheet with an appropriate name, i.e., select New Worksheet
for the output of each regression and name it “Regression Model with Lot,”
“Regression Model with Garage,” etc.
a. Provide the following from the “Full Regression Model” (Model with all
features):
a. Adjusted R² ________ 0.0118
_________ 596764.38
c. Coefficient of X1 (Lot)
_________ -16847.73
d. Coefficient of X2 (Garage)
__________ -18183.71
e. Coefficient of X3 (BRs)
________ -2450.48
Explain: I believe not all independent variables are necessary for predicting changes in Price. An
adjusted R² near zero suggests that the variables together explain very little of the variation in
Price, so some of them do not meaningfully contribute to the model. Additionally, Lot Size, Garage,
and Age have very small coefficients, indicating a weak relationship with Price. Eliminating these
less significant variables can enhance the model’s effectiveness and clarity.
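The adjusted R² discussed above penalizes R² for the number of independent variables, which is why it can fall to zero or below when weak variables are added. A minimal sketch of the standard formula, 1 - (1 - R²)(n - 1)/(n - k - 1), follows. The function name and the example inputs are hypothetical and are not the values from this assignment.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / dof


# Hypothetical example: R2 = 0.10 with 21 observations and 5 predictors
# yields a negative adjusted R2 (roughly -0.2).
print(adjusted_r_squared(0.10, 21, 5))
```

Dropping a variable that adds little explanatory power lowers k while barely changing R², so the adjusted value can rise even though plain R² does not.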
e. Run a Regression Model on the Real Estate – Base database using Price as the
Dependent Variable (Y) and include the original Independent Variables
(minus any you removed in step 6) and adding the variable you chose in step
7. Print your model output and turn it in with the assignment. (NOTE: You
may have to repeat this exercise until you find a combination of variables that
gives you a higher R²).
i. Var____Sq. Ft.___________________
_________ 10.47_______________
ii. Var______Baths_________________
______________ -45614.7__________
iii. Var___________BRs____________
_________ -3792.258 _______________
Critical Thinking Question:
g. A large real estate company is trying to use similar data plus their own sales
data to forecast total sales for the coming year for each of their agents and
they have pulled data from their Finance records. They are trying to assemble
the best data to build a Regression model.
a. Would it make sense to use the same data as we used above in the
model? Why or why not?
Answer: No, it would not be ideal to use the same data. The current model
focuses on predicting house prices based on property features, whereas
forecasting total sales for agents requires different factors. Sales performance is
influenced by variables such as the number of transactions completed, marketing
efforts, client network, and market trends. Using property characteristics alone
would not provide an accurate prediction of an agent’s sales.
b. Recommend two data elements you think they probably have available
to help them predict sales for each of their sales people.
1. Number of Transactions Closed per Agent – This reflects an agent’s past sales
activity and is a strong indicator of their future performance.
2. Total Commission Earned per Agent – Higher commissions suggest more
successful sales, making it a useful metric for predicting overall sales
performance.
GRADING RUBRIC
Overall Score Possible = 100