CLUSTER ANALYSIS GRP 1
PROBLEM STATEMENT
Stimulus:
Marketers use demographic variables such as age, gender and education to identify distinctive segments whose customers they
can attract, retain and turn loyal. But these variables often fail to provide insights that are useful for sound
decision making. Bencare wants to move away from simplistic segmentation strategies to those that use the variables consumers
actually use in making choice decisions for insurance products.
The KCPs of interest are defined as follows:
DESCRIPTION OF SEGMENTS
Satisfaction: SAT1 to SAT3
Pricing: PRICE1 to PRICE3
Reputation: REPU1 to REPU4
Short-term Value: VAL1 to VAL3
Long-term Value: VAL4 to VAL6
We'll be considering four dependent variables for external validation:
Trust Company: rep17 to rep20
Trust Agent: prac17 to prac20
Behavioural Loyalty: loy1 to loy4
Cognitive Loyalty: loy5 to loy8
The segmentation is based on customer perceptions of Bencare on
Satisfaction, Pricing, Reputation, and Short-term and Long-term Value. The
segments will display Trust or Loyalty depending upon the loadings on the
corresponding attributes.
The aim of Bencare is to go beyond the conventional measures and derive insights from the segments in terms of the loyalty factor
and to identify the most appropriate segment to target. It also aims to study the impact of cultural differences on loyalty.
OBJECTIVES
To identify different customer segments using Insurance Products
To analyse groups of similar data instead of individual observations.
To analyse the loyalty of the different segments.
To study the impact of cultural differences on each segment.
ANALYTICAL PLAN
Prepping the data by removing the outliers
Identifying the cluster seeds using Hierarchical Clustering
Identify the cluster centroids by comparing the means using ANOVA
Split the dataset into two halves, a test sample and the internal validation sample, and apply K-Means cluster to obtain new centroids
Run External Validation using ANOVA to compute differences between the means within each sample
Run Cohen's Kappa to determine symmetry or agreement within the cluster seeds.
Run Update and No Update to obtain Cluster Membership
ASSUMPTIONS
We'll be using the compositional method for assessing similarity, in which a
defined set of attributes is considered in developing the similarity between
objects.
We're using Squared Euclidean distance as it increases the importance of large
distances, while weakening the importance of small distances.
The dataset provided is free from any multicollinearity bias, as the factor
scores have been provided in the dataset.
We are assuming that all dependent variables are correlated with each
other and thus behave similarly, so checking outliers for any one would suffice.
Step 1: Identifying Outliers
We will be using Mahalanobis distance, computed using Regression, to assess
multivariate normality.
We're considering a 99% confidence level (0.01 significance) at five degrees of freedom, for which the Chi-Square value is 15.086, so any value of Mahalanobis distance
exceeding this cut-off will be treated as an outlier.
There are a total of 29 outliers, which represent around 3.6% of the total
responses.
On eliminating these outliers, the variance captured
increases from 38.1% to 56.5%, which we take as a clear sign of the
presence of outliers; hence we
will eliminate all 29 outliers.
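The cut-off rule above can be checked with a short sketch. It assumes the saved distances are compared directly with the chi-square value at five degrees of freedom; the function names and the example distances are hypothetical, and the closed form used for the tail probability holds only for five degrees of freedom.

```python
import math

CUTOFF = 15.086  # chi-square value quoted in the text (0.01 significance, 5 df)


def chi2_sf_df5(x):
    """Upper-tail probability P(X > x) of a chi-square variable with 5 df.

    Closed form valid for df = 5 only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * (1 + x/3)
    """
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0))


def flag_outliers(distances, cutoff=CUTOFF):
    """Return the positions of cases whose distance exceeds the cut-off."""
    return [i for i, d in enumerate(distances) if d > cutoff]


# Hypothetical distances for illustration only (not taken from the Bencare data).
example_distances = [3.2, 16.4, 7.9, 15.086, 22.1]

tail = chi2_sf_df5(CUTOFF)          # should be close to 0.01
flagged = flag_outliers(example_distances)
print(round(tail, 4), flagged)
```

A case sitting exactly on the cut-off is not flagged here, because the rule in the text speaks of values exceeding 15.086.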
Step 2 : Cluster Seeds using Hierarchical Clustering
There's a significant increase in the percentage
change as we move from 4 clusters to 3. Cluster
Seed = 3 would be the most appropriate choice.
The graph is a visual representation of the percentage change used for
deciding upon the number of cluster seeds. The first elbow
appears at 4, thus the number of cluster seeds deduced is 3. We
would be running hierarchical clustering for clusters 3 to 7.
[Figure: % Change (vertical axis labelled Stress) plotted against Number of Clusters]
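The percentage-change reading can be sketched as follows. This assumes the percentage change is taken between the clustering coefficients of successive solutions (for example from an agglomeration schedule); the coefficient values, the 10% threshold and the function names are hypothetical and are not the Bencare figures.

```python
def percent_changes(coefficients_by_k):
    """Percentage increase in the coefficient when moving from k to k-1 clusters.

    coefficients_by_k maps a number of clusters to the coefficient of that solution.
    """
    ks = sorted(coefficients_by_k, reverse=True)
    changes = {}
    for k, k_next in zip(ks, ks[1:]):
        before = coefficients_by_k[k]
        after = coefficients_by_k[k_next]
        changes[(k, k_next)] = 100.0 * (after - before) / before
    return changes


def first_jump(changes, threshold):
    """First step, moving from many clusters to few, whose change exceeds the threshold."""
    for step in sorted(changes, reverse=True):
        if changes[step] > threshold:
            return step
    return None


# Hypothetical coefficients for 7 down to 2 clusters (illustration only).
coeffs = {7: 400.0, 6: 420.0, 5: 445.0, 4: 470.0, 3: 560.0, 2: 700.0}

changes = percent_changes(coeffs)
jump = first_jump(changes, threshold=10.0)
for step in sorted(changes, reverse=True):
    print(step, round(changes[step], 2))
print("first large jump:", jump)
```

With these made-up numbers the first large jump is the move from 4 clusters to 3, which is the kind of elbow described above; how that jump translates into the number of seeds remains the analyst's judgement.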
Step 3 : Estimating Cluster Centroids using ANOVA (For Cluster Seed 3)
Descriptives
                                     Cluster   N     Mean         Std. Deviation
REGR factor score 1 for analysis 1   1         286   -.5244758    .87593233
                                     2         247    .2247202    .55471251
                                     3         144    .8042042    .61893394
                                     Total     677    .0314789    .88774481
REGR factor score 2 for analysis 1   1         286   -.1453884    .65155437
                                     2         247    .0531066   1.06215779
                                     3         144    .3734897    .76418553
                                     Total     677    .0373985    .86680447
REGR factor score 3 for analysis 1   1         286   -.1023924    .97457065
                                     2         247    .1551692    .67344599
                                     3         144   -.1218794   1.05996505
                                     Total     677   -.0125673    .90519522
REGR factor score 4 for analysis 1   1         286   -.3958749    .77777667
                                     2         247    .4470345    .61640115
                                     3         144    .1610676    .82835820
                                     Total     677    .0301197    .82607392
REGR factor score 5 for analysis 1   1         286   -.1905843    .76591607
                                     2         247    .7239336    .69727932
                                     3         144   -.8020325    .85476796
                                     Total     677    .0130160    .96047909
The mean values of the standardized scores are to be used as the centroids for cluster analysis.
ANOVA
                                                      Sum of Squares   df    Mean Square   F         Sig.
REGR factor score 1 for analysis 1   Between Groups   183.605          2     91.803        177.219   .000
                                     Within Groups    349.144          674   .518
                                     Total            532.749          676
REGR factor score 2 for analysis 1   Between Groups   25.882           2     12.941        18.095    .000
                                     Within Groups    482.030          674   .715
                                     Total            507.913          676
REGR factor score 3 for analysis 1   Between Groups   10.978           2     5.489         6.814     .001
                                     Within Groups    542.922          674   .806
                                     Total            553.900          676
REGR factor score 4 for analysis 1   Between Groups   97.303           2     48.652        90.086    .000
                                     Within Groups    363.998          674   .540
                                     Total            461.301          676
REGR factor score 5 for analysis 1   Between Groups   232.350          2     116.175       200.121   .000
                                     Within Groups    391.273          674   .581
                                     Total            623.624          676
Looking at the F values, we can say that factor 1 and factor 5 contribute the most to the formation of these clusters.
Sig. is .001 or lower for all five factors, which shows that the means are significantly different.
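As a worked check of the 3-cluster ANOVA figures, the F value for each factor is the between-groups mean square divided by the within-groups mean square, with each mean square being a sum of squares divided by its degrees of freedom. The sketch below recomputes F from the sums of squares reported for cluster seed 3 (df 2 and 674); the function and variable names are our own.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


DF_BETWEEN, DF_WITHIN = 2, 674  # 3 clusters, 677 cases

# (between-groups SS, within-groups SS) as reported for cluster seed 3
sums_of_squares = {
    "Factor 1": (183.605, 349.144),
    "Factor 2": (25.882, 482.030),
    "Factor 3": (10.978, 542.922),
    "Factor 4": (97.303, 363.998),
    "Factor 5": (232.350, 391.273),
}

f_values = {}
for name, (ss_b, ss_w) in sums_of_squares.items():
    ms_b, ms_w, f = f_ratio(ss_b, DF_BETWEEN, ss_w, DF_WITHIN)
    f_values[name] = f
    print(name, round(ms_b, 3), round(ms_w, 3), round(f, 3))

# Factors ordered by their contribution to cluster formation (largest F first)
ranking = sorted(f_values, key=f_values.get, reverse=True)
print(ranking)
```

The ranking puts Factor 5 first and Factor 1 second, in line with the reading of the F values given for this solution.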
Step 3 : Estimating Cluster Centroids using ANOVA (Cluster Seeds 4 and 5)
ANOVA (Cluster size 4)
                                                      Sum of Squares   df    F        Sig.
REGR factor score 1 for analysis 1   Between Groups   36.093           3     14.494   0.000
                                     Within Groups    162.691          196
                                     Total            198.784          199
REGR factor score 2 for analysis 1   Between Groups   79.696           3     43.721   0.000
                                     Within Groups    119.091          196
                                     Total            198.787          199
REGR factor score 3 for analysis 1   Between Groups   51.365           3     23.730   0.000
                                     Within Groups    141.417          196
                                     Total            192.782          199
REGR factor score 4 for analysis 1   Between Groups   71.877           3     38.881   0.000
                                     Within Groups    120.777          196
                                     Total            192.654          199
REGR factor score 5 for analysis 1   Between Groups   102.854          3     99.878   0.000
                                     Within Groups    67.280           196
                                     Total            170.134          199
Cluster size 4: the means are significantly different. Factor 2
and factor 5 contribute most towards the
formation of these clusters, as they have the
highest F values.
ANOVA (Cluster size 5)
                                                      Sum of Squares   df    F        Sig.
REGR factor score 1 for analysis 1   Between Groups   45.366           4     14.415   0.000
                                     Within Groups    153.419          195
                                     Total            198.784          199
REGR factor score 2 for analysis 1   Between Groups   87.322           4     38.191   0.000
                                     Within Groups    111.465          195
                                     Total            198.787          199
REGR factor score 3 for analysis 1   Between Groups   66.102           4     25.438   0.000
                                     Within Groups    126.680          195
                                     Total            192.782          199
REGR factor score 4 for analysis 1   Between Groups   78.983           4     33.873   0.000
                                     Within Groups    113.671          195
                                     Total            192.654          199
REGR factor score 5 for analysis 1   Between Groups   107.661          4     84.012   0.000
                                     Within Groups    62.473           195
                                     Total            170.134          199
Cluster size 5: the means are significantly different. Factor 2
and factor 5 contribute most towards the
formation of these clusters, as they have the
highest F values.
Step 3 : Estimating Cluster Centroids using ANOVA (Cluster Seeds 6 & 7)
ANOVA (Cluster size 6)
                                      Sum of Squares   df    F        Sig.
REGR factor score 1 for analysis 1
  Between Groups                      49.813           5     12.974   0.000
  Within Groups                       148.972          194
  Total                               198.784          199
REGR factor score 2 for analysis 1
  Between Groups                      88.940           5     31.415   0.000
  Within Groups                       109.847          194
  Total                               198.787          199
REGR factor score 3 for analysis 1
  Between Groups                      82.595           5     29.084   0.000
  Within Groups                       110.188          194
  Total                               192.782          199
REGR factor score 4 for analysis 1
  Between Groups                      91.248           5     34.913   0.000
  Within Groups                       101.406          194
  Total                               192.654          199
REGR factor score 5 for analysis 1
  Between Groups                      112.896          5     76.529   0.000
  Within Groups                       57.238           194
  Total                               170.134          199
Cluster size 6: the means are significantly different. Factor 4 and
factor 5 contribute most towards the formation of
these clusters, as they have the highest F values.
ANOVA (Cluster size 7)
                                      Sum of Squares   df    F        Sig.
REGR factor score 1 for analysis 1
  Between Groups                      51.595           6     11.276   0.000
  Within Groups                       147.189          193
  Total                               198.784          199
REGR factor score 2 for analysis 1
  Between Groups                      96.449           6     30.316   0.000
  Within Groups                       102.338          193
  Total                               198.787          199
REGR factor score 3 for analysis 1
  Between Groups                      105.472          6     38.858   0.000
  Within Groups                       87.310           193
  Total                               192.782          199
REGR factor score 4 for analysis 1
  Between Groups                      95.319           6     31.501   0.000
  Within Groups                       97.335           193
  Total                               192.654          199
REGR factor score 5 for analysis 1
  Between Groups                      113.983          6     65.297   0.000
  Within Groups                       56.151           193
  Total                               170.134          199
Cluster size 7: the means are significantly different. Factor 3
and factor 5 contribute most towards the
formation of these clusters, as they have the
highest F values.
Step 4: K-means Clustering
We are using K-means clustering to fine-tune the result obtained from hierarchical
clustering.
We divide the total sample into two equal parts using random numbers and then perform
K-means cluster analysis on the first half.
Using the result of the first half, we calculate the centroid values for the update and no update
clustering methods.
We then use these centroids to cluster the remaining half of the randomly divided data.
Finally, we use cross-tabs to calculate Kappa for each cluster solution and choose the number of clusters
with the highest kappa value.
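The Kappa step can be illustrated with a minimal sketch of Cohen's Kappa computed from a cross-tab of two cluster memberships. It assumes the cluster labels of the two solutions correspond to each other (row i and column i refer to the same cluster); the table values and the function name are hypothetical.

```python
def cohens_kappa(table):
    """Cohen's Kappa for a square cross-tab (rows: solution A, columns: solution B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of cases on the diagonal
    observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical cross-tab of two 3-cluster memberships (illustration only).
crosstab = [
    [45, 5, 0],
    [5, 40, 5],
    [0, 5, 45],
]

kappa = cohens_kappa(crosstab)
print(round(kappa, 3))
```

Repeating this for each candidate number of clusters and comparing the resulting values is the selection rule described in the step above.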
Cluster Centre for First Half
The initial cluster centres derived from hierarchical clustering were fine-tuned using K-means
for further use in the second half of the data.
From the F values for the different cluster solutions it can be seen that the factor scores
differ sufficiently between clusters, and these differences in factor scores
lead to the formation of the clusters.
The F values also show which factor scores have the most impact on the formation of the
clusters; for example, differences in factor 5 are most responsible for the formation of the 3
clusters.
Differences in the factor scores for the 5th factor also have the most impact on the formation of 4, 5, 6 and
7 clusters.
Cluster Centre for Second Half
The process is run twice: once in a
constrained manner (with the no update
option) and then in an unconstrained
manner (with the update option).
From the F values for the different cluster solutions it
can be seen that the factor scores differ
sufficiently between clusters, and these
differences in factor scores lead
to the formation of the clusters.
From the kappa values of the cross-tab results it
can be seen that the 3-cluster solution gives
the highest agreement between the update
and no update solutions. Further, there is
another spike at 6 clusters.
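The two runs can be sketched as below, following the description in the text rather than the exact behaviour of any software option: in the constrained run the first-half centroids are held fixed and cases are only assigned to the nearest one, while in the unconstrained run the centroids are recomputed until memberships stop changing. The two-dimensional points, the seeds and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def recompute(points, labels, centroids):
    """Mean of the members of each cluster; an empty cluster keeps its old centroid."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(m[d] for m in members) / len(members)
                                   for d in range(len(old))))
    return new_centroids


def cluster_with_update(points, centroids, max_iter=10):
    """Unconstrained run: centroids move until memberships are stable."""
    centroids = list(centroids)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = recompute(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical seeds (standing in for first-half centroids) and second-half cases.
seeds = [(-1.0, 0.0), (1.0, 0.0)]
cases = [(-2.0, 0.0), (-1.8, 0.0), (-0.2, 0.0), (0.4, 0.0), (0.6, 0.0)]

no_update_labels = assign(cases, seeds)                      # constrained run
update_labels, final_centroids = cluster_with_update(cases, seeds)
print(no_update_labels, update_labels, final_centroids)
```

In this toy example one case changes cluster once the centroids are allowed to move; the cross-tab of the two membership lists is what the kappa comparison is computed on.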
Final Solution with 3 clusters
From the factor analysis it can be seen that:
Factor 1: Satisfaction
Factor 2: Prices
Factor 3: Reputation
Factor 4: Short term value
Factor 5: Long term value
The final centroid values have been calculated using the
centroids from the first half as initial values.
From the ANOVA table it can be seen that the F values for
Long Term Value and Short Term Value are the
highest, so the difference between the clusters is highly
dependent upon Long Term Value, and then on Short
Term Value and Prices.
Segments
Cluster 1 :- Customers in this segment have rated
Bencare negatively on all the parameters, and
Bencare might not be able to convert them into
customers.
Cluster 2 :- This segment has a high value for Long
Term Value, but Reputation and Satisfaction with
Bencare are on the borderline. Programs targeting
improvement in reputation and satisfaction should be
undertaken.
Cluster 3 :- Customers in this segment have rated
Bencare positively on Reputation and Short Term
Value; however, the rating drops sharply for
Long Term Value, and Satisfaction and Prices are
on the borderline.
[Figure: centroids of the 3 clusters on Satisfaction, Prices, Reputation, Short Term Value and Long Term Value]
STEP 5: EXTERNAL VALIDATION
We conducted Factor Analysis for the remaining
variables and calculated factor scores for those
dependent variables. As seen in the table, the F values for all
the factors are high and significant; hence the factors have
different means for different clusters.
[Figure: centroids of Trust_Agent, BEH_Loy, TRUST_COMPANY and COG_LOY across Clusters]
Trust Agent, Behavioural Loyalty and Cognitive Loyalty
display similar behaviour across the clusters, whereas
Trust Company displays an entirely opposite behaviour.
For Cluster 1, all attributes display negative values
except Trust Company.
Impact of Culture: Location
We split the data file by location
into three parts and then ran our cluster analysis on each,
keeping the number of clusters equal to 3.
From the F values it can be seen that
different factors show the maximum difference in
different regions and so affect the
formation of clusters.
For the USA, reputation is the main difference
creator and prices have the least impact.
For Germany, long term value is the major
difference creator. Reputation and
satisfaction have the least impact.
For Holland, long term value and reputation
are the main difference creators and
satisfaction has the least impact.
Final Cluster Centers
LOCATION   Factor             Cluster 1   Cluster 2   Cluster 3
USA        Satisfaction         .29420     -.15361      .69851
           Prices              -.26471      .30614      .26260
           Reputation          1.21516      .53936      .45450
           Short Term Value    -.75065      .08614      .38040
           Long Term Value      .05212      .37496    -1.35467
Germany    Satisfaction        -.60235      .24623      .07035
           Prices              -.42414      .37137     -.25292
           Reputation           .59298     -.09780     -.22754
           Short Term Value    -.56817      .39318      .53533
           Long Term Value      .36304      .84704     -.84574
Holland    Satisfaction        -.11402     -.13998      .02959
           Prices              -.30575      .13363      .02322
           Reputation          -.77035     -.06331      .76256
ANOVA
LOCATION   Factor             F         Sig.
USA        Satisfaction       12.054    .000
           Prices             8.833     .000
           Reputation         106.508   .000
           Short Term Value   32.763    .000
           Long Term Value    86.276    .000
Germany    Satisfaction       20.103    .000
           Prices             24.409    .000
           Reputation         20.352    .000
           Short Term Value   53.310    .000
           Long Term Value    219.500   .000
Holland    Satisfaction       1.038     .356
           Prices             5.213     .006
           Reputation         78.191    .000
           Short Term Value   39.038    .000
           Long Term Value    107.587   .000
Impact of Culture: Location
USA and Holland have similar perceptions
about prices, as compared to Germany, in the
various segments.
USA and Germany show similar satisfaction
with Bencare products across the segments.
In the USA and Germany, customers see the long term value
of Bencare products in the same light across
the clusters, except cluster 2.
In Germany, the reputation of Bencare is near the
borderline for cluster 2 and can be
improved.
For Holland, the reputation of Bencare for cluster
2 is at the borderline and should be improved.
In Germany, Prices of Bencare for cluster 3 are
at the borderline and should be improved
upon.
[Figures: cluster centroids by location (USA, Germany, Holland) for Prices, LTV, Reputation, STV and Satisfaction]
Impact of Culture : Gender
We split the data file by
gender into two parts and then ran
our cluster analysis on each, keeping the number of
clusters equal to 3.
From the F values it can be seen that
different factors show the maximum difference for each gender
and so affect the formation of clusters.
For Males, Long Term Value and Reputation
are the main difference
creators, and Satisfaction and Prices
have the least impact.
For Females, long term value is the
major difference creator.
Satisfaction has the least impact.
Impact of Gender
Males and Females are similarly
receptive towards price, short term value
and long term value as attributes.
Females in Cluster 1 and Cluster 3
give more importance to
Satisfaction, whereas for Cluster 2
it's near zero, which can be
improved upon. Males have a
lukewarm response in Clusters 1
and 2.
In the case of Reputation, Males and
Females in Cluster 1 and Cluster 2
show opposite preferences.
The Reputation of Bencare moves in
opposite directions for Males and
Females in different segments.
[Figures: cluster centroids by gender (1 Male, 2 Female) for Reputation, Prices, Short Term Value, Satisfaction and Long Term Value]
Impact of Culture : Age
We split the data file by Age into five
parts and then ran our cluster analysis on each, keeping the number of
clusters equal to 3.
From the F values it can be seen that different
factors show the maximum difference in different Age groups
and so affect the formation of clusters.
For 18-24 yrs, no attribute except Satisfaction and Reputation
shows a significant impact.
For 25-34 yrs, Long term value shows the maximum
difference and maximum impact.
For 35-44 yrs, Long term value and Satisfaction
show the maximum difference and maximum impact.
For 45-54 yrs, Long term value shows the maximum
difference and maximum impact.
For 55+ yrs, Long term value shows the maximum
difference and maximum impact.
Impact of Culture : Age
For age group 35-44 yrs, all
attributes are rated negatively or only
slightly positively.
For age group 18-24 yrs, all
segments see long term value in
Bencare products but are highly
dissatisfied. They even believe that
prices are on the higher side.
Satisfaction level for age group
55+ yrs is negative or on the
borderline.
For long term value, all clusters
have rated similarly, except in age group
18-24 yrs.
[Figures: cluster centroids by age group (18-24, 25-34, 35-44, 45-54 and 55+ yrs) for Satisfaction, Pricing, Short term Value, Long term Value and Reputation]
Solution with 6 Clusters
As there was a spike in the Kappa value at 6 clusters, K-means
cluster analysis was done with 6 clusters, using the initial seed
values from the K-means run on the second half.
From the F values, it can be seen that satisfaction, pricing
and long term value play a similar role in the determination
of segments, which differs from the 3-cluster solution, where
satisfaction has the least role to play.
As in the 3-cluster solution, there is one cluster which has all
the values negative (cluster 1); there is also one
cluster which has all values positive (cluster 3). Apart
from them, the other clusters have mixed perceptions.
Cluster 6 has only long term value negative, while
cluster 4 has only satisfaction positive.
[Figure: centroids of the 6-cluster solution on Satisfaction, Pricing, Reputation, Short term Value and Long term Value]
Managerial Implication
Products with long term value should be provided to Cluster 3 customers, as they hold Bencare in high
reputation and are also loyal, but feel that prices are on the higher side.
For Cluster 2, the reputation of Bencare is on the borderline, so programs targeting
improvement in reputation should be undertaken.
The USA market should be targeted using methods which build the reputation of the company, as
reputation is the main difference creator.
Similarly, for Germany long term value is the major difference creator, so this attribute
should be emphasized.
For customers in Holland, long term value and reputation are both major difference
creators.
For Cluster 1, trust in the company is an important attribute which should be emphasized.
The reputation of Bencare in Holland and Germany is on the borderline; measures such as
improvement of services and customer care should be undertaken to improve the reputation of
the company.
Managerial Implication
USA and Germany show similar satisfaction with Bencare products across the segments.
Males and Females have similar perceptions of price, short term value and long term value as attributes of Bencare products.
Reputation of Bencare is totally opposite for Males and Females in different segments.
Females in two of the segments are satisfied with Bencare products, while the second cluster for Females is on the borderline, which
can be improved upon.
For age group 18-24 yrs, all segments see long term value in Bencare products but are highly dissatisfied. They even believe that prices are on
the higher side.
Reputation of Bencare products moves in a similar fashion across the age groups, except for the 18-24 yrs age group, which shows
increased reputation in cluster 3.
Cluster 6 has only long term value negative, while cluster 4 has only satisfaction positive; so cluster 6 should be provided with products with
long term value, and cluster 4 customers are loyal, as they have high satisfaction in spite of all other parameters being negative.
RECOMMENDATIONS
Cluster 2 and Cluster 3 display high loadings on Loyalty; thus they should be the favoured customers.
Cluster 1 customers fall in the bad customer category, those who can't be gained back by any means, whereas Cluster 2 customers
have rated Bencare consistently except on Reputation, which can be worked upon.
Cluster 3 seems to be the most attractive segment, as its customers reflect negatively on the Satisfaction and Price parameters and are also looking
for products which have long term value.
A similar promotion and product strategy could be used for the USA and Germany, as customers there show similar preferences on the various attributes of
Bencare products.
The age group of 18-24 yrs should be targeted, as they perceive high long term value in Bencare but are not satisfied. Prices seem to be
the problem that leads to their dissatisfaction, so products emphasizing long term value with appropriate prices should be introduced.
A satisfaction improvement plan, such as customer care and support upgradation, should be undertaken.
Pricing is perceived as high by two age groups, so a new product mix should be adopted to cater to the needs of the various age groups; for example, the
18-24 age group requires low-cost products.
Appendix:- Syntax
CLUSTER FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1
/METHOD WARD
/MEASURE=SEUCLID
/PRINT SCHEDULE CLUSTER(3,7)
/PLOT NONE
/SAVE CLUSTER(3,7).
ONEWAY FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1 BY CLU3_1
/STATISTICS DESCRIPTIVES
/MISSING ANALYSIS.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT behloy1
/METHOD=ENTER FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1
/SAVE MAHAL.
DATASET ACTIVATE DataSet2.
* Kappa agreement between the two saved 3-cluster memberships.
CROSSTABS
/TABLES=QCL3_1 BY QCL3_2
/FORMAT=AVALUE TABLES
/STATISTICS=KAPPA
/CELLS=COUNT EXPECTED
/COUNT ROUND CELL.
DATASET ACTIVATE DataSet1.
DATASET ACTIVATE DataSet2.
ONEWAY FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1 BY CLU4_1
/STATISTICS DESCRIPTIVES
/MISSING ANALYSIS.
AGGREGATE
/OUTFILE='C:\Users\pardita\Desktop\MA\cluster analysis\new\CLUSTER3.sav'
/BREAK=CLU3_1
/Factor1=MEAN(FAC1_1)
/Factor2=MEAN(FAC2_1)
/Factor3=MEAN(FAC3_1)
/Factor4=MEAN(FAC4_1)
/Factor5=MEAN(FAC5_1).
Appendix:- Syntax
COMPUTE randnum = RV.UNIFORM(0,1) .
EXECUTE .
SORT CASES BY randnum (A) .
RECODE randnum
(Lowest thru .5=1) (.5 thru Highest=2) INTO half .
EXECUTE .
SORT CASES BY half .
SPLIT FILE
SEPARATE BY half.
* K-means on one half, seeded with the centroids from hierarchical clustering.
TEMPORARY.
SELECT IF (Half=2).
QUICK CLUSTER FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1
/MISSING=LISTWISE
/CRITERIA=CLUSTER(3) MXITER(10) CONVERGE(0)
/METHOD=KMEANS(UPDATE)
/PRINT INITIAL ANOVA
/INITIAL = (-.52448 -.14539 -.10239 -.39587 -.19058
.22472 .05311 .15517 .44703 .72393
.80420 .37349 -.12188 .16107 -.80203)
/OUTFILE='C:\Users\pardita\Desktop\MA\cluster analysis\new\Split2Cluster3.sav'.
* K-means on the other half.
TEMPORARY.
SELECT IF (Half=1).
QUICK CLUSTER FAC1_1 FAC2_1 FAC3_1 FAC4_1 FAC5_1
/MISSING=LISTWISE
/CRITERIA=CLUSTER(3) MXITER(10) CONVERGE(0)
/METHOD=KMEANS(UPDATE)
/SAVE CLUSTER
/PRINT INITIAL ANOVA
/INITIAL= (-.47056 -.41682 -.53086 -.53407 -.37517
.02075 .01822 .27613 .35044 .87636
.56811 .54855 .11225 .30771 -.83024)
/OUTFILE='C:\Users\pardita\Desktop\MA\cluster analysis\new\SPLIT1CLUSTER3A.sav'.
FACTOR
/VARIABLES rep17 rep18 rep19 rep20 prac17 prac18 prac19 prac20 loy1 loy2 loy3 loy4 loy5 loy6 loy7
loy8
/MISSING LISTWISE
/ANALYSIS rep17 rep18 rep19 rep20 prac17 prac18 prac19 prac20 loy1 loy2 loy3 loy4 loy5 loy6 loy7
loy8
/PRINT INITIAL EXTRACTION ROTATION
/FORMAT BLANK(.20)
/CRITERIA FACTORS(4) ITERATE(25)
/EXTRACTION ML
/CRITERIA ITERATE(25) DELTA(0)
/ROTATION OBLIMIN
/SAVE REG(ALL).
Thank You!!!
ANALYTICAL STEP
Problem and Analytical Plan
Segmentation Challenge & Strategy
Factors for Differentiating Segments
Factors for Describing Segments
Factors to Validate Segments
Outline Analytical Plan
Detail Steps
Evaluation Criteria to be Used
Expected Results
MAX POINTS
YOUR POINTS REMARKS/FEEDBACK
1.5
Excellent presentation, organization and layout. Emphasis on cultural differences done well.
Correct factors chosen for differentiation and validation. Segments are not described in terms of relevant factors, but are executed and discussed in detail (and correctly) when doing cultural differences.
Detailed step-wise analytical plan given. No indication of evaluation criterion or expected results at each stage of the plan, but executed correctly.
Distinctiveness
Distinctive feature of problem and analytical plan distinctive
Analytics Execution and Reporting
Scales, Outliers, Multicollinearity, & Factor Scores
0.5
Correctness & Rigor of Procedures
Cluster Model Selection and Validation
Quality of Tabulated & Plotted Evidence
Interpretation Quality and Precision
Cultural differences attempted in detail and with thoughtful development.
Assumptions on slide 4 are not appropriate as stated; for instance, the first assumption is better written as "Compositional Approach to Identify Clusters Based on Predetermined Attributes/Factors". See notes in last draft re inaccurate interpretation of R-square as indication of outliers (it is not); removal of so many outliers may not be warranted.
The sequencing of steps is correct. The rigour in the process is not reflected by the PPT, as vital steps and outputs of the process are shortened. E.g. the entire K-means procedure examining 5 cluster options gets covered in only 5 slides.
Selecting and testing a range of 3-7 clusters is a good choice, and the procedure is executed correctly.
Incomplete tabulated evidence in many cases, e.g. K-means procedure. K-means clustering with second split half not presented well in the PPT.
The 3 clusters are not labelled. They are not described in terms of either segmentation or demographic variables. Cultural differences are discussed well.