Group 5 - Assignment No.3
1. Data on amount of money spent (Y) by customers at an e-commerce portal, monthly income (X1) and
family size (X2) is collected for 200 customers (File attached).
Q-1 Comment on the suitability of developing a regression model based on scatter plot and correlation
matrix.
Scatter Plot:
Interpretation: The scatter plots show that:
1. There is a positive linear relationship between amount spent and income.
2. There is no clear linear relationship between amount spent and family size.
Correlation Matrix:
Correlations
                                    Amount Spent   Income   Family Size
Pearson Correlation   Amount Spent      1.000        .748       -.138
                      Income             .748       1.000        .023
                      Family Size       -.138        .023       1.000
Sig. (1-tailed)       Amount Spent        .          .000        .025
                      Income             .000         .          .375
                      Family Size        .025        .375         .
N                     Amount Spent       200         200         200
                      Income             200         200         200
                      Family Size        200         200         200
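1. The dependent variable (amount spent) is strongly correlated with income (r = 0.748) and only weakly,
negatively correlated with family size (r = -0.138). Both correlations are significant at the 0.05 level
(one-tailed p < 0.001 and p = 0.025 respectively).
2. The two independent variables, income and family size, are almost uncorrelated (r = 0.023, p = 0.375),
so multicollinearity is not a concern. A regression model is therefore suitable, with income expected to
be the main predictor.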
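The one-tailed significance values in the correlation table can be roughly checked from r and N alone. The sketch below is not SPSS output: it computes the usual t statistic for a Pearson correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2), and, as a simplifying assumption, uses the standard normal distribution in place of the t distribution with 198 degrees of freedom. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def correlation_t_and_p(r: float, n: int) -> tuple[float, float]:
    """Return the t statistic for a Pearson r and an approximate one-tailed p-value."""
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    # Assumption: with n = 200 the t distribution is close to the standard
    # normal, so the normal tail area is used as an approximation.
    p_one_tailed = 1 - NormalDist().cdf(abs(t))
    return t, p_one_tailed


# Correlations as reported (rounded to three decimals) in the table above.
for label, r in [
    ("Amount Spent vs Income", 0.748),
    ("Amount Spent vs Family Size", -0.138),
    ("Income vs Family Size", 0.023),
]:
    t, p = correlation_t_and_p(r, 200)
    print(f"{label}: t = {t:.2f}, approx. one-tailed p = {p:.3f}")
```

Because the inputs are rounded and a normal approximation is used, the results should agree with the SPSS values only approximately.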
Q.2 Build two regression models at one go with SPSS a) only with monthly income
b) both monthly income and family size together as independent variables. The dependent variable in
Model 1
The significance value is 0.000, which is less than 0.05, so the model is significant: the amount spent
is explained by income.
Model 2
The significance value is 0.000, which is less than 0.05, so the model is significant: the amount spent
is explained by income and family size together. The two predictors do not have the same impact; this
is discussed further in the solution.
Null hypothesis: There is no significant relationship between the dependent variable (amount spent)
and the independent variables (income and family size).
The significance value (p-value) from the ANOVA table is 0.000, which is less than 0.05. Hence the null
hypothesis is rejected and the overall regression model is significant.
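For reference, the overall F statistic tested in the ANOVA table is related to R Square by F = (R^2 / k) / ((1 - R^2) / (N - k - 1)), where k is the number of predictors. The sketch below applies this to the rounded R Square values reported in the Model Summary (0.560 and 0.584), so its results are approximate and are not a substitute for the SPSS ANOVA output. The function name is illustrative.

```python
def overall_f(r_square: float, n: int, k: int) -> float:
    """F statistic for H0: all k slope coefficients are zero."""
    return (r_square / k) / ((1 - r_square) / (n - k - 1))


# N = 200 customers; R Square values are the rounded ones from the Model Summary.
print(f"Model 1: F(1, 198) is roughly {overall_f(0.560, 200, 1):.1f}")
print(f"Model 2: F(2, 197) is roughly {overall_f(0.584, 200, 2):.1f}")
```

Both values are far above any conventional 5% critical value, which is consistent with the reported significance of 0.000.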
b) whether the independent variables are significant, and which one has more impact
Yes, both independent variables are significant. Income has by far the larger impact on the amount
spent (correlation 0.748), while the impact of family size is small and negative (correlation -0.138).
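Relative impact is more commonly judged from standardized coefficients than from simple correlations. With two predictors, these can be derived directly from the three correlations in the correlation matrix. The sketch below does so using the rounded values 0.748, -0.138 and 0.023; it is an illustration with hypothetical function and variable names, not the SPSS coefficient output.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized coefficients of a two-predictor regression, from correlations."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# r_y1: Amount Spent vs Income, r_y2: Amount Spent vs Family Size,
# r_12: Income vs Family Size (all rounded values from the correlation table).
beta_income, beta_family = standardized_betas(r_y1=0.748, r_y2=-0.138, r_12=0.023)

# R Square implied by the standardized coefficients: sum of beta_j * r_yj.
r_square = beta_income * 0.748 + beta_family * (-0.138)

print(f"beta (Income)      = {beta_income:.3f}")
print(f"beta (Family Size) = {beta_family:.3f}")
print(f"implied R Square   = {r_square:.3f}")
```

The standardized coefficient for income comes out several times larger in magnitude than that for family size, and the implied R Square can be compared with the Model 2 value in the Model Summary as a consistency check.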
The overall predictive capability of the model is given by the adjusted R², which is 0.580. This means
that about 58% of the variation in the amount spent is explained by income and family size.
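Adjusted R Square corrects R Square for the number of predictors: adjusted R^2 = 1 - (1 - R^2)(N - 1) / (N - k - 1). A minimal sketch, using the rounded R Square values from the Model Summary and an illustrative function name:

```python
def adjusted_r_square(r_square: float, n: int, k: int) -> float:
    """Adjusted R Square for a model with k predictors fitted on n cases."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


# N = 200 customers; Model 1 has one predictor, Model 2 has two.
print(f"Model 1: {adjusted_r_square(0.560, 200, 1):.3f}")
print(f"Model 2: {adjusted_r_square(0.584, 200, 2):.3f}")
```

The penalty is small here because N is large relative to the number of predictors, which is why R Square and adjusted R Square differ only in the third decimal.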
Model Summary
                                                                       Change Statistics
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .748a  .560      .558               615.724                     .560             251.935   1    198  .000
2      .764b  .584      .580               600.114                     .024             11.434    1    197  .001
a. Predictors: (Constant), Income
b. Predictors: (Constant), Income, Family Size
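The F Change column tests whether adding Family Size to Model 1 improves the fit. It can be approximately recomputed as F = (change in R^2 / df1) / ((1 - R^2 of the larger model) / df2). The sketch below uses the rounded table values, so it should land close to, but not exactly on, the reported 11.434; the function name is illustrative.

```python
def f_change(delta_r_square: float, r_square_full: float, df1: int, df2: int) -> float:
    """F statistic for the increase in R Square when predictors are added."""
    return (delta_r_square / df1) / ((1 - r_square_full) / df2)


# Model 2 row: R Square Change = .024, R Square = .584, df1 = 1, df2 = 197.
print(f"F change (Model 1 -> Model 2): {f_change(0.024, 0.584, 1, 197):.2f}")
```

The small gap from the SPSS figure is expected, since SPSS works with unrounded R Square values.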
Coefficients
Column groups: Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B
Regression Model:
Y = b0 + b1X1 + b2X2
b0 – constant = 395.571
X2 = Family Size