EDU6950 Advance Statistics in Education Assignment 2-Multiple Regression Analysis
In this second assignment, you are required to submit a report on the relationship
between HATCO's seven independent variables (delivery speed, price level, price
flexibility, manufacturer image, overall service, sales force image and product quality)
and product usage. Submit your report before 26th October 2018.
2. To identify the factors that lead to increased product usage for application in
marketing campaign (Explanation).
To apply the regression procedure, Usage Level (X9) was selected as the dependent variable
(Y), to be predicted by independent variables representing perceptions of HATCO’s
performance. The following seven variables were included as independent variables
(hereafter referred to as predictor variables):
X1 Delivery Speed
X2 Price Level
X3 Price Flexibility
X4 Manufacturer Image
X5 Service
X6 Salesforce Image
X7 Product Quality
The relationship between the seven predictor variables and usage level was assumed to be
statistical, not functional, because it involved perceptions of performance and may
therefore contain some level of measurement error.
(Predicted usage level) Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + b6X6 + b7X7
where
b0 = constant number
b1 = change in usage level associated with unit change of Delivery Speed
b2 = change in usage level associated with unit change of Price Level
b3 = change in usage level associated with unit change of Price Flexibility
b4 = change in usage level associated with unit change of Manufacturer Image
b5 = change in usage level associated with unit change of Overall Service
b6 = change in usage level associated with unit change of Salesforce Image
b7 = change in usage level associated with unit change of Product Quality
X1 = Delivery Speed
X2 = Price Level
X3 = Price Flexibility
X4 = Manufacturer Image
X5 = Overall Service
X6 = Salesforce Image
X7 = Product Quality
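The equation above can be read as a simple weighted sum. The following is a minimal sketch of that reading; the function name `predict_usage` and all coefficient and rating values are hypothetical and chosen only for illustration, they are not the HATCO estimates.

```python
def predict_usage(intercept, slopes, perceptions):
    """Return b0 + b1*X1 + ... + bk*Xk for one respondent."""
    if len(slopes) != len(perceptions):
        raise ValueError("need exactly one slope per predictor")
    return intercept + sum(b * x for b, x in zip(slopes, perceptions))


# Hypothetical values for illustration only (NOT estimated from the HATCO data).
b0 = 10.0
b = [1.0, 0.5, 2.0, 0.0, 3.0, 1.5, -0.5]   # b1..b7
x = [4.0, 2.0, 8.0, 5.0, 3.0, 2.0, 6.0]    # X1..X7 ratings of one respondent

y_hat = predict_usage(b0, b, x)

# Raising X5 (Overall Service) by one unit, holding the rest fixed,
# changes the prediction by exactly b5.
x_plus = x.copy()
x_plus[4] += 1.0
effect_of_x5 = predict_usage(b0, b, x_plus) - y_hat

print(y_hat, effect_of_x5)
```

The second calculation illustrates what "change in usage level associated with unit change" means for a single coefficient: all other predictors are held constant.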
Table 1
Model Summary
There are no missing data because all respondents gave complete responses. Table 2
below shows the descriptive statistics, which confirm that there are no missing values
in this analysis.
Table 2
Descriptive Statistics
Minimum R2 That Can Be Found Statistically Significant with a Power of .80 for Varying
Numbers of Independent Variables (1), Sample Sizes (100), and Significance Level (α)
= .01. The ratio for these data is 13 observations to 1 variable, approaching the desired
level of 15 observations per variable. The ratio of observations to variables is
above the minimum requirement of 5:1.
The proposed regression analysis was deemed sufficient to identify not only statistically
significant relationships but also relationships that have managerial significance, because
the sample is adequate and there are no missing data.
Table 3
b. Homoscedasticity describes data for which the variance of the error
terms (e) appears constant over the range of values of a predictor variable.
Based on the scatterplots, all predictor variables show homoscedasticity.
c. Normality
1. The simplest diagnostic for the set of independent variables in the equation is
a histogram of residuals, with a visual check for a distribution approximating
the normal distribution. (See Histogram below)
2. A better method is the normal probability plot.
3. The Shapiro-Wilk test was used to test the HATCO data for normality, as it is
especially suited to small samples. Table 4 shows the Shapiro-Wilk results.
Table 4
Shapiro-Wilk
Variable              Statistic   df    Sig.
Delivery Speed        .985        100   .341
Price Level           .969        100   .018
Price Flexibility     .950        100   .001
Manufacturer Image    .982        100   .183
Overall Service       .986        100   .366
Salesforce Image      .963        100   .007
Product Quality       .971        100   .028
Usage Level           .985        100   .320
Satisfaction Level    .977        100   .074
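A variable is treated as normally distributed if the p value is not significant (Sig.
> .05). From Table 4, we can see that Delivery Speed, Manufacturer Image, Overall Service,
Usage Level and Satisfaction Level are normally distributed, whereas Price Level (.018),
Price Flexibility (.001), Salesforce Image (.007) and Product Quality (.028) depart
significantly from normality.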
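The decision rule can be applied mechanically to the Sig. column of Table 4. The sketch below only re-reads the p values already reported in the table; the variable names `ALPHA`, `shapiro_sig` and `non_normal` are illustrative choices, and no test statistic is recomputed.

```python
ALPHA = 0.05  # significance level used in the decision rule

# Sig. values copied from Table 4 (Shapiro-Wilk, df = 100).
shapiro_sig = {
    "Delivery Speed": 0.341,
    "Price Level": 0.018,
    "Price Flexibility": 0.001,
    "Manufacturer Image": 0.183,
    "Overall Service": 0.366,
    "Salesforce Image": 0.007,
    "Product Quality": 0.028,
    "Usage Level": 0.320,
    "Satisfaction Level": 0.074,
}

# A significant result (p < ALPHA) rejects the hypothesis of normality.
non_normal = [name for name, p in shapiro_sig.items() if p < ALPHA]
normal = [name for name, p in shapiro_sig.items() if p >= ALPHA]

print("Departs from normality:", non_normal)
print("Consistent with normality:", normal)
```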
Stage 4: Assessing the Regression Model and Assessing Overall Model Fit
This stage involves three basic tasks:
1. Select a method for specifying the regression model to be estimated.
2. Assess the statistical significance of the overall model in predicting the
dependent variable.
3. Determine whether any of the observations exert an undue influence on the
results.
For this analysis, I will use Stepwise Estimation (one of the sequential search
methods). Stepwise estimation is a method of selecting variables for inclusion in the
regression model that starts by selecting the best predictor of the dependent variable.
Additional predictor variables are then selected in terms of the incremental explanatory
power they can add to the regression model.
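The selection logic described above can be sketched in a few lines. This is a simplified illustration, not the SPSS procedure: it uses a fixed F-to-enter cutoff of 4.0 (an assumption) instead of the probability-of-F criteria, it only adds variables and never re-tests them for removal, and it runs on synthetic data whose predictor names ("strong", "moderate", "noise") are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * y[i] for i, row in enumerate(design)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(b * v for b, v in zip(beta, design[i]))) ** 2 for i in range(n))


def forward_selection(predictors, y, f_to_enter=4.0):
    """Repeatedly add the predictor with the largest partial F above the cutoff."""
    selected = []
    current_rss = rss([], y)  # intercept-only model
    while len(selected) < len(predictors):
        best = None
        for name in predictors:
            if name in selected:
                continue
            cols = [predictors[s] for s in selected + [name]]
            new_rss = rss(cols, y)
            df_resid = len(y) - len(cols) - 1
            partial_f = (current_rss - new_rss) / (new_rss / df_resid)
            if best is None or partial_f > best[1]:
                best = (name, partial_f, new_rss)
        if best[1] < f_to_enter:
            break  # no remaining predictor adds enough explanatory power
        selected.append(best[0])
        current_rss = best[2]
    return selected


# Synthetic data: y depends strongly on one predictor, moderately on another.
random.seed(0)
n = 100
predictors = {
    "strong": [random.gauss(0, 1) for _ in range(n)],
    "moderate": [random.gauss(0, 1) for _ in range(n)],
    "noise": [random.gauss(0, 1) for _ in range(n)],
}
y = [3 + 2 * predictors["strong"][i] + predictors["moderate"][i] + random.gauss(0, 0.5)
     for i in range(n)]

selected = forward_selection(predictors, y)
print(selected)
```

The order of entry mirrors the idea in the text: the best single predictor enters first, and each later variable is judged by the explanatory power it adds to the variables already in the model.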
Table 5
Coefficients
Table 6
Variables Entered/Removed
Model   Variables Entered    Variables Removed   Method
1       Service              .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
2       Price Flexibility    .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
3       Salesforce Image     .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
Based on Table 7 below, the correlation (R) between usage level (dependent variable) and
service (predictor variable) is .701; with the addition of price flexibility, the multiple
correlation becomes .869. Combining all three predictor variables (Service, Price
Flexibility and Salesforce Image), the multiple correlation increases to .877.
The R square values indicate that 49.1% of the variance in usage level is explained by
service, a further 26.4% by price flexibility and a further 1.3% by Salesforce Image.
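The contribution of each step is the increase in R square over the previous model. The sketch below works this out from the cumulative R square values given in this report (.491, .755 and .768); the variable names are illustrative, and because the inputs are rounded to three decimals the increments are approximate.

```python
# Cumulative R square after each stepwise model, as reported in this report.
cumulative_r_square = [
    ("Service", 0.491),
    ("Price Flexibility", 0.755),
    ("Salesforce Image", 0.768),
]

increments = {}
previous = 0.0
for variable, r_square in cumulative_r_square:
    # Variance explained over and above the variables already in the model.
    increments[variable] = r_square - previous
    previous = r_square

for variable, gain in increments.items():
    print(f"{variable}: adds {gain * 100:.1f}% of variance explained")
```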
Table 7
Model Summary
ANOVA results in Table 8 show a significant relationship between the three predictor
variables (Overall Service, Price Flexibility and Salesforce Image) and Usage Level
(the dependent variable), with p < .05 at every step.
With Service alone, the model is significant [F (1, 98) = 94.525, p < .05]; after adding
Price Flexibility, the result is [F (2, 97) = 149.184, p < .05]; and after adding
Salesforce Image, the result is [F (3, 96) = 106.115, p < .05].
Table 8
ANOVA
Model   Source       Sum of Squares   df   Mean Square   F         Sig.
1       Total        7999.000         99
2       Regression   6036.513         2    3018.256      149.184   .000
        Residual     1962.487         97   20.232
        Total        7999.000         99
3       Regression   6145.700         3    2048.567      106.115   .000
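The ANOVA rows can be cross-checked by hand, since F is the regression mean square divided by the residual mean square and R square is the regression sum of squares divided by the total. The following sketch rebuilds these quantities from the Table 8 figures; the residual row is derived as total minus regression (an assumption for Model 3, whose residual row is not shown), and the function name `anova_summary` is illustrative.

```python
import math

SS_TOTAL = 7999.000   # total sum of squares, from Table 8
DF_TOTAL = 99         # n - 1 with n = 100 observations


def anova_summary(ss_regression, df_regression):
    """Rebuild F, R square and R for one model from its regression row."""
    ss_residual = SS_TOTAL - ss_regression   # derived, not read from the table
    df_residual = DF_TOTAL - df_regression
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_square = ss_regression / SS_TOTAL
    return {
        "F": ms_regression / ms_residual,
        "R2": r_square,
        "R": math.sqrt(r_square),
    }


model_2 = anova_summary(6036.513, 2)   # Service + Price Flexibility
model_3 = anova_summary(6145.700, 3)   # ... + Salesforce Image

for label, m in (("Model 2", model_2), ("Model 3", model_3)):
    print(f"{label}: F = {m['F']:.3f}, R2 = {m['R2']:.3f}, R = {m['R']:.3f}")
```

Agreement between these recomputed values and the reported F statistics is a quick way to confirm that the table was transcribed correctly.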
Table 9 gives the constant and the unstandardized regression coefficients (B) of the
three predictor variables in the linear equation. The t tests show significant results
(p < .05).
Table 9
Coefficients
The t test results in Table 10 show the predictor variables whose effect on the dependent
variable, in linear combination, is not significant, so they are not included in the
regression model. Some of these variables have small Beta In values, which led to their
elimination from the regression model. The collinearity statistics (all below 2.0) show
that the data do not have a collinearity problem, meaning that no predictor variables are
highly correlated with one another.
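For reference, tolerance and the variance inflation factor (VIF) are two views of the same quantity: tolerance is 1 minus the R square obtained when one predictor is regressed on the other predictors, and VIF is its reciprocal. The sketch below shows the conversion; the R square inputs are hypothetical and are not taken from the HATCO output.

```python
def tolerance(r_square_j):
    """Tolerance of predictor j: share of its variance NOT explained by the others."""
    if not 0.0 <= r_square_j < 1.0:
        raise ValueError("r_square_j must be in [0, 1)")
    return 1.0 - r_square_j


def vif(r_square_j):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1.0 / tolerance(r_square_j)


# Hypothetical R square values of one predictor regressed on the others.
for r2 in (0.0, 0.5, 0.9):
    print(f"R2_j = {r2:.1f}: tolerance = {tolerance(r2):.2f}, VIF = {vif(r2):.2f}")
```

Because tolerance lies between 0 and 1, low tolerance (equivalently, high VIF) is what signals collinearity.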
Table 10
Residuals Statistics
(Predicted usage level) Y = b0 + b3X3 + b5X5 + b6X6
where
b0 = constant number
b3 = change in usage level associated with unit change of Price Flexibility
b5 = change in usage level associated with unit change of Overall Service
b6 = change in usage level associated with unit change of Salesforce Image
X3 = Price Flexibility
X5 = Overall Service
X6 = Salesforce Image
Service on its own is significant [F (1, 98) = 94.525, p < .05] and accounts for 49.1% of
the variance (R2 = .491) in customers' product usage level, which makes Service the
primary predictor of usage level. The combination of Service and Price Flexibility
[F (2, 97) = 149.184, p < .05] raises the explained variance to 75.5%, and the combination
of Overall Service, Price Flexibility and Salesforce Image [F (3, 96) = 106.115, p < .05]
explains 76.8% of the variance in product usage level.
Based on the analysis above, HATCO should emphasise the quality of its service in its
next campaign, as service is the main factor in customer usage level. In addition,
price flexibility and salesforce image can be considered as factors to highlight in
the next marketing campaign.