Market Segmentation - Product Service Management
06/07/2019
Project Introduction:
The data file Factor-Hair.csv contains 12 variables used for Market Segmentation in the
context of Product Service Management.
Variable      Expansion
ProdQual      Product Quality
Ecom          E-Commerce
TechSup       Technical Support
CompRes       Complaint Resolution
Advertising   Advertising
ProdLine      Product Line
SalesFImage   Salesforce Image
ComPricing    Competitive Pricing
WartyClaim    Warranty & Claims
OrdBilling    Order & Billing
DelSpeed      Delivery Speed
Satisfaction  Customer Satisfaction
Exploratory Analysis – Managerial Report:
Loading the file in R:
hair = read.csv("hair.csv", header = TRUE)
From the summary and structure of the data we can see that the variables are already on a common scale, so there is no need to scale them.
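The checks referenced above are not shown in the source; a minimal sketch:
summary(hair)   # five-number summary of each variable
str(hair)       # variable names, types and a preview of the values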
Assigning new column names in a vector to use in the dataset:
variables = c("Product Quality", "E-Commerce", "Technical Support", "Complaint Resolution", "Advertising", "Product Line", "Salesforce Image", "Competitive Pricing", "Warranty & Claims", "Order & Billing", "Delivery Speed", "Customer Satisfaction")
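The report works with a dataset called hairnew from here on, presumably a copy of hair with these names applied; that step is not shown in the source, so this is an assumed sketch:
hairnew = hair                 # copy of the original data (assumed step)
colnames(hairnew) = variables  # apply the descriptive column names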
Attaching the data so that columns can be referenced directly by name:
attach(hairnew)
names(hairnew)
Finding the correlation between the independent variables:
cormtx = cor(hairnew[,-12])   # correlation matrix, excluding Customer Satisfaction
cormtx
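The correlation matrix is easier to scan visually; a minimal sketch, assuming the corrplot package is available:
library(corrplot)
corrplot(cormtx, method = "circle")   # circle size and colour encode correlation strength and sign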
From the correlation matrix we can clearly see that the independent variables are correlated among themselves. This indicates multicollinearity, so factor analysis will be applicable.
To assess multicollinearity, we compute VIF scores. If the VIF score is greater than 4 for two or more variables, it clearly indicates that multicollinearity exists among the independent variables.
library(car)
hairvif = vif(lm(`Customer Satisfaction` ~., data = hairnew))
hairvif
library(psych)
cortest.bartlett(cormtx, 100)   # Bartlett's test of sphericity with sample size n = 100
Since the p-value (1.79337e-96) is far below 0.05, the correlation matrix differs significantly from an identity matrix, making this an ideal case for dimension reduction.
The Kaiser-Meyer-Olkin (KMO) test is a measure of how suited the data is for factor analysis. It measures sampling adequacy for each variable in the model and for the complete model.
KMO(cormtx)
Since the overall MSA value is 0.65, which is greater than 0.5, we have an adequate sample to proceed further with factor analysis.
In order to perform factor analysis, we first calculate the eigenvalues of the correlation matrix and draw a scree plot to determine the number of factors.
Calculating the eigenvalues to check how the variance is spread across the components:
evector = eigen(cormtx)
EV = evector$values   # keep the eigenvalues only
EV
A few of the eigenvalues are clearly larger than the others, so this is a clear case for dimension reduction.
Plotting the eigenvalues as a scree plot to check how many factors should be retained:
plot(EV, main = "Scree Plot", xlab = "Factors", ylab = "Eigen Values", pch = 20, col = "black")
lines(EV, col = "red")
abline(h = 1, col = "black", lty = 2)   # Kaiser criterion cut-off at eigenvalue = 1
Following the Kaiser criterion we retain only eigenvalues above 1, which gives exactly 4 factors (a programmatic check of this count follows the list below):
Factor 1 as After Sales Service, which has Complaint Resolution, Order & Billing and Delivery Speed
Factor 2 as Marketing, which has E-Commerce, Advertising and Salesforce Image
Factor 3 as Customer Support, which has Technical Support and Warranty & Claims
Factor 4 as Product, which has Product Quality, Product Line and Competitive Pricing
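A minimal programmatic check of the factor count:
sum(EV > 1)   # number of eigenvalues above 1; 4 for this data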
Extracting PCA/FA
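The extraction output itself did not survive in the source; a minimal sketch of an unrotated extraction using the psych package (hair_unrotate is an illustrative name):
hair_unrotate = fa(hairnew[,-12], nfactors = 4, rotate = "none")   # unrotated factor solution
print(hair_unrotate)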
FA with varimax rotation:
hair_rotate = fa(hairnew[,-12], nfactors = 4, rotate = "varimax")
print(hair_rotate)
fa.diagram(hair_rotate)
Competitive Pricing loads negatively here, which indicates that it moves inversely to the other variables in factor "MR4".
After rotation there is a significant change in the loadings and the factors become easier to interpret, so we will use the rotated solution.
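To see which variables drive each factor, the loadings can be printed with a cut-off (a sketch; the 0.4 threshold is illustrative):
print(hair_rotate$loadings, cutoff = 0.4)   # hide small loadings for readability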
Create a data structure from the score values of the four factors, binding the dependent variable to them so that we can proceed with supervised learning.
hairnew2 = cbind(hairnew[,12], hair_rotate$scores)
Checking and renaming the new dataset's column names:
head(hairnew2)
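The renaming step itself is not shown in the source; given the column name used in the later lm() call, a plausible sketch:
hairnew2 = as.data.frame(hairnew2)                # cbind() returns a matrix; subset() and lm() below expect a data frame
colnames(hairnew2)[1] = "Customer.Satisfaction"   # name assumed from the later model formula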
3. Name the factors
Factor 1 as After Sales Service, which has Complaint Resolution, Order & Billing and Delivery Speed
Factor 2 as Marketing, which has E-Commerce, Advertising and Salesforce Image
Factor 3 as Customer Support, which has Technical Support and Warranty & Claims
Factor 4 as Product, which has Product Quality, Product Line and Competitive Pricing
4. Perform Multiple Linear Regression with Customer Satisfaction as the dependent variable
and the four factors as the independent variables. Comment on Model Validity.
install.packages("caTools")
library(caTools)
Creating two datasets, one to train the model and another to test the model, based on a standard split ratio of 70:30.
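Because sample.split() draws randomly, a seed is usually set first so the split is reproducible; the seed value below is illustrative, not from the source:
set.seed(100)   # illustrative seed for reproducibility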
spl = sample.split(hairnew2$Customer.Satisfaction, SplitRatio = 0.70)
Train = subset(hairnew2, spl == TRUE)
Test = subset(hairnew2, spl == FALSE)
cat("Test Dimension : ", dim(Test))
Test Dimension :  29 5
Creating a new linear model using the train data:
newlm = lm(Customer.Satisfaction ~ ., data = Train)
summary(newlm)
vif(newlm)
If the VIF score were greater than 4 for two or more variables, it would indicate multicollinearity among the independent variables. Here every VIF is below 4, so there is no multicollinearity among the independent variables; this is expected, since varimax-rotated factor scores are nearly orthogonal.
Compute R-squared for the test data:
pred = predict(newlm, newdata = Test)   # test-set predictions (this step was missing from the source)
SST = sum((Test$Customer.Satisfaction - mean(Train$Customer.Satisfaction))^2)
SSE = sum((pred - Test$Customer.Satisfaction)^2)
SSR = sum((pred - mean(Train$Customer.Satisfaction))^2)
Rsqtest = SSR/SST   # note: 1 - SSE/SST is the more common definition of out-of-sample R-squared
Rsqtest
For back-tracking, fit a model on the 11 original variables for comparison:
model1 = lm(hairnew$`Customer Satisfaction` ~ ProdQual + Ecom + TechSup + CompRes + Advertising + ProdLine + SalesFImage + ComPricing + WartyClaim + OrdBilling + DelSpeed, data = hair)
summary(model1)
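The source never shows where model2 is defined; presumably it is the factor-score regression refit on the full dataset, so the following definition is an assumption:
model2 = lm(Customer.Satisfaction ~ ., data = hairnew2)   # assumed definition, not shown in the source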
Predict_Model = predict(model2)    # predictions from the factor-score model
Predict_Model1 = predict(model1)   # predictions from the full-variable model
plot(Predict_Model, col = "red")
lines(Predict_Model, col = "red")
plot(Predict_Model1, col = "blue")
lines(Predict_Model1, col = "blue")
lines(Predict_Model, col = "red")   # overlay the factor-score predictions for comparison
From this back-tracking we can see that the new model (built from the 4 factor scores) and the old model (built from the 11 individual variables) follow the same path, which supports the validity of the factor-based model.
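The visual claim can also be checked numerically; a minimal sketch:
cor(Predict_Model, Predict_Model1)   # close to 1 if the two models track each other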