Practical 6 Multiple Linear Regression Using SPSS
• 2. Root Mean Squared Error (RMSE):
• Definition: Square root of the average of squared differences between predicted and actual values.
• Range: 0 to ∞.
• Lower is better: Smaller RMSE indicates better model performance.
• Strength:
• Penalizes large errors more than small ones (due to squaring).
• Useful when large deviations are more concerning.
• Limitations: Sensitive to outliers due to squaring.
Metrics to evaluate regression…
• 3. Mean Absolute Error (MAE):
• Definition: Average of absolute differences between predicted and actual
values.
• Range: 0 to ∞.
• Lower is better: Smaller MAE indicates better model performance.
• Strength:
• Provides a straightforward measure of average error magnitude.
• Less sensitive to outliers compared to RMSE.
• Limitations: Does not account for variance in error magnitude.
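The two metrics above can be sketched in Python with NumPy (SPSS reports these in its output, but computing them by hand makes the squaring-versus-absolute-value distinction concrete):

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root Mean Squared Error: squaring penalizes large errors more heavily
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    # Mean Absolute Error: average error magnitude, less sensitive to outliers
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))
```

For the same predictions, a single large error raises RMSE more than MAE, which is why RMSE is preferred when large deviations are more concerning.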
Dataset
Our model’s R square is 0.984, which indicates that 98.4% of the variation in monthly revenue is explained by the independent variables. This suggests a very good regression model fit.
Durbin-Watson Test for
Independence of Residuals
From the previous slide, the Durbin-Watson value is 1.448. Values near 2 indicate independent residuals, so 1.448 suggests mild positive autocorrelation.
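The Durbin-Watson statistic that SPSS reports is the ratio of the sum of squared successive residual differences to the sum of squared residuals; a minimal NumPy sketch:

```python
import numpy as np

def durbin_watson(residuals):
    # DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)
    # ~2 means no autocorrelation, <2 positive, >2 negative
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))
```

Residuals that stay on the same side of zero for long runs (positive autocorrelation) push the statistic below 2, as with our value of 1.448.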
Anova for Regression
The ANOVA (Analysis of Variance) table in regression analysis evaluates whether the regression model
explains a significant portion of the variation in the dependent variable. It essentially tests the null
hypothesis that all regression coefficients (except the intercept) are zero.
Since the null hypothesis is rejected, we can say that at least one regression coefficient is nonzero.
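The F-test behind the ANOVA table can be reproduced outside SPSS. A sketch on simulated data (the variable names and true coefficients here are illustrative, not from our hotel dataset):

```python
import numpy as np
from scipy import stats

# Simulated data: n observations, k predictors, known coefficients
rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))
y = 2 + X @ np.array([1.5, 0.0, -0.8]) + rng.normal(scale=0.5, size=n)

Xd = np.column_stack([np.ones(n), X])          # design matrix with intercept
beta, _, _, _ = np.linalg.lstsq(Xd, y, rcond=None)
y_hat = Xd @ beta
ssr = np.sum((y_hat - y.mean()) ** 2)          # regression sum of squares
sse = np.sum((y - y_hat) ** 2)                 # residual sum of squares
F = (ssr / k) / (sse / (n - k - 1))            # F = MS_regression / MS_residual
p = stats.f.sf(F, k, n - k - 1)                # p-value from the F distribution
```

A small p-value rejects the null that all slope coefficients are zero, which is exactly the conclusion drawn from the SPSS ANOVA table.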
T-test for the regression coefficients
In regression analysis, the t-test for regression coefficients evaluates whether each independent
variable significantly contributes to predicting the dependent variable. Specifically, it tests the null
hypothesis (H0) that a coefficient (β) is equal to zero, indicating no effect.
Only average room price contributes significantly to predicting the dependent variable. We can consider removing the guest satisfaction and marketing expenses variables.
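The per-coefficient t-tests in the SPSS coefficients table follow from the estimated covariance of the coefficients. A self-contained sketch on simulated data (names and values illustrative only):

```python
import numpy as np
from scipy import stats

# Simulated data with one truly zero coefficient (index 2)
rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))
y = 2 + X @ np.array([1.5, 0.0, -0.8]) + rng.normal(scale=0.5, size=n)

Xd = np.column_stack([np.ones(n), X])
beta, _, _, _ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta
sigma2 = np.sum(resid ** 2) / (n - k - 1)      # residual variance estimate
cov = sigma2 * np.linalg.inv(Xd.T @ Xd)        # covariance of coefficients
se = np.sqrt(np.diag(cov))                     # standard errors
t_stats = beta / se                            # t = beta / SE(beta)
p_vals = 2 * stats.t.sf(np.abs(t_stats), n - k - 1)
```

Coefficients whose p-value exceeds 0.05 fail the t-test, which is the basis for dropping a predictor from the model.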
VIF (Variance Inflation Factor) is high (>10) for avg_room_price and marketing expense.
Hence, multicollinearity exists here.
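VIF for predictor j is 1/(1 − R²_j), where R²_j comes from regressing that predictor on all the others; a NumPy sketch:

```python
import numpy as np

def vif(X):
    # Variance Inflation Factor for each column of X:
    # regress column j on the remaining columns, then VIF_j = 1 / (1 - R^2_j)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, _, _, _ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2 = 1.0 - resid.var() / target.var()
        out.append(1.0 / (1.0 - r2))
    return out
```

A common rule of thumb (used on the slide) flags VIF values above 10 as evidence of multicollinearity.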
Correlation between independent
variables
Since the correlation is high, we can say that multicollinearity exists in the data.
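The pairwise correlation matrix SPSS produces can be replicated with NumPy; the sketch below uses simulated predictors where two are nearly collinear (illustrative, not the hotel data):

```python
import numpy as np

# Two nearly collinear predictors (x1, x2) and one independent one (x3)
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)
x3 = rng.normal(size=100)

# Correlation matrix of the predictors (columns = variables)
corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
```

An off-diagonal entry close to ±1, like the x1–x2 pair here, signals the same multicollinearity problem the VIF flagged.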
Normality Testing
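Normality of the residuals can be checked with the Shapiro-Wilk test (available in SPSS via Analyze → Descriptive Statistics → Explore); a minimal SciPy sketch on hypothetical residuals:

```python
import numpy as np
from scipy import stats

# Hypothetical residuals drawn from a normal distribution for illustration
rng = np.random.default_rng(2)
resid = rng.normal(size=200)

# Shapiro-Wilk: H0 is that the data are normally distributed
stat, p = stats.shapiro(resid)
# p > 0.05 -> fail to reject normality; W statistic near 1 supports normality
```

If p falls below 0.05, the normality assumption of linear regression is in doubt and remedies such as transforming the dependent variable may be needed.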