SPSS Tutorial Cluster Analysis PDF

This document provides an overview of cluster analysis techniques. Cluster analysis is used to group cases into relatively homogeneous clusters. It has various applications in marketing research, such as market segmentation, understanding consumer behavior, and identifying new product opportunities. The key steps to conducting cluster analysis are selecting a distance measure and clustering algorithm, determining the number of clusters, and validating the analysis. Hierarchical and k-means clustering are common algorithms. Determining the optimal number of clusters can involve examining the agglomeration schedule for large jumps in distance coefficients or creating a scree diagram. SPSS can be used to perform cluster analysis on a dataset of supermarket attributes.

Uploaded by

cajimenezb8872

SPSS Tutorial

AEB 37 / AE 802
Marketing Research Methods
Week 7
Cluster analysis
Lecture / Tutorial outline
Cluster analysis
Example of cluster analysis
Work on the assignment
Cluster Analysis
It is a class of techniques used to
classify cases into groups that are
relatively homogeneous within
themselves and heterogeneous
between each other, on the basis of
a defined set of variables. These
groups are called clusters.
Cluster Analysis and
marketing research
Market segmentation. E.g. clustering of
consumers according to their attribute
preferences
Understanding buyers' behaviours.
Consumers with similar
behaviours/characteristics are clustered
Identifying new product opportunities.
Clusters of similar brands/products can help
identify competitors / market opportunities
Reducing data. E.g. in preference mapping
Steps to conduct a
Cluster Analysis
1. Select a distance measure
2. Select a clustering algorithm
3. Determine the number of clusters
4. Validate the analysis
[Figure: scatter plot of REGR factor score 1 (vertical axis) against REGR factor score 2 (horizontal axis) for analysis 1]

Defining distance: the
Euclidean distance
$$D_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ki} - x_{kj})^2}$$
$D_{ij}$: distance between cases $i$ and $j$
$x_{ki}$: value of variable $X_k$ for case $i$ (and $x_{kj}$ for case $j$)
Problems:
Different measures = different weights
Correlation between variables (double
counting)
Solution: Principal component analysis
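The distance definition and the first problem listed above can be illustrated with a minimal Python sketch. The function name `euclidean_distance` and the two example cases (income and household size) are illustrative assumptions, not taken from the tutorial dataset.

```python
import math

def euclidean_distance(case_i, case_j):
    """Euclidean distance between two cases measured on the same n variables."""
    if len(case_i) != len(case_j):
        raise ValueError("both cases must have the same number of variables")
    return math.sqrt(sum((x_i - x_j) ** 2 for x_i, x_j in zip(case_i, case_j)))

# Hypothetical cases: (income in thousands, household size)
case_a = (30.0, 2.0)
case_b = (33.0, 6.0)
print(euclidean_distance(case_a, case_b))  # 5.0

# Same two cases with income expressed in units rather than thousands:
# the income variable now dominates the distance almost entirely.
case_a_units = (30000.0, 2.0)
case_b_units = (33000.0, 6.0)
print(euclidean_distance(case_a_units, case_b_units))
```

Changing only the unit of measurement changes the implicit weight of a variable, which is one reason for clustering on standardised scores such as principal components.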
Clustering procedures
Hierarchical procedures
Agglomerative (start from n clusters,
to get to 1 cluster)
Divisive (start from 1 cluster, to get to
n clusters)
Non hierarchical procedures
K-means clustering
Agglomerative clustering
Linkage methods
Single linkage (minimum distance)
Complete linkage (maximum distance)
Average linkage
Ward's method
1. Compute sum of squared distances within clusters
2. Aggregate clusters with the minimum increase in the
overall sum of squares
Centroid method
The distance between two clusters is defined as the
difference between the centroids (cluster averages)
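As a rough illustration of how the linkage rules above differ, the following Python sketch computes each between-cluster measure for two small clusters. All function names and the example points are assumptions made for this sketch; SPSS computes these quantities internally.

```python
import math
from itertools import product

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dist(p, q):
    return math.sqrt(sq_dist(p, q))

def centroid(cluster):
    """Cluster average on each variable."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def sum_of_squares(cluster):
    """Sum of squared distances of the members from the cluster centroid."""
    c = centroid(cluster)
    return sum(sq_dist(p, c) for p in cluster)

def single_linkage(a, b):
    return min(dist(p, q) for p, q in product(a, b))

def complete_linkage(a, b):
    return max(dist(p, q) for p, q in product(a, b))

def average_linkage(a, b):
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_distance(a, b):
    return dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Increase in the overall within-cluster sum of squares if a and b merge."""
    return sum_of_squares(a + b) - sum_of_squares(a) - sum_of_squares(b)

# Hypothetical clusters of two cases each
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

print(single_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
print(centroid_distance(cluster_a, cluster_b))
print(ward_increase(cluster_a, cluster_b))
```

At each agglomeration stage, the pair of clusters with the smallest value of the chosen measure is merged.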
K-means clustering
1. The number k of clusters is fixed
2. An initial set of k seeds (aggregation centres) is
provided
First k elements
Other seeds
3. Given a certain threshold, all units are assigned to
the nearest cluster seed
4. New seeds are computed
5. Go back to step 3 until no reclassification is
necessary
Units can be reassigned in successive steps
(optimising partitioning)
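The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the SPSS implementation: it uses the first k cases as seeds, ignores the distance threshold mentioned in step 3, and the function name `k_means`, the `max_iter` safeguard and the example cases are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(cases, k, max_iter=100):
    # Step 2: take the first k elements as initial seeds
    seeds = [tuple(case) for case in cases[:k]]
    labels = [None] * len(cases)
    for _ in range(max_iter):
        # Step 3: assign every unit to the nearest cluster seed
        new_labels = [
            min(range(k), key=lambda c: sq_dist(case, seeds[c]))
            for case in cases
        ]
        # Step 5: stop when no unit is reclassified
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute the seeds as cluster means
        for c in range(k):
            members = [case for case, lab in zip(cases, labels) if lab == c]
            if members:  # keep the old seed if a cluster is empty
                seeds[c] = mean_point(members)
    return labels, seeds

# Hypothetical cases forming two visibly separate groups
cases = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
labels, seeds = k_means(cases, k=2)
print(labels)
print(seeds)
```

Because the seeds are simply the first k cases, reordering the cases can change the final partition, so it is worth rerunning the procedure with a different case order.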
Hierarchical vs non-hierarchical methods
Hierarchical clustering
No decision about the number of clusters
Problems when data contain a high level of error
Can be very slow
Initial decisions are more influential (one-step only)
Non-hierarchical clustering
Faster, more reliable
Need to specify the number of clusters (arbitrary)
Need to set the initial seeds (arbitrary)
Suggested approach
1. First perform a hierarchical
method to define the number of
clusters
2. Then use the k-means procedure
to actually form the clusters
Defining the number of
clusters: elbow rule (1)
Agglomeration Schedule
                            Cluster Combined                   Stage Cluster First Appears
Stage  Number of clusters  Cluster 1  Cluster 2  Coefficients  Cluster 1  Cluster 2  Next Stage
    0                  12
    1                  11          4          7          .015          0          0           4
    2                  10          6         10          .708          0          0           5
    3                   9          8          9          .974          0          0           4
    4                   8          4          8         1.042          1          3           6
    5                   7          1          6         1.100          0          2           7
    6                   6          4          5         3.680          4          0           7
    7                   5          1          4         3.492          5          6           8
    8                   4          1         11         6.744          7          0           9
    9                   3          1          2         8.276          8          0          10
   10                   2          1         12         8.787          9          0          11
   11                   1          1          3        11.403         10          0           0
Elbow rule (2): the
scree diagram
[Figure: scree diagram plotting distance (0 to 12) against the number of clusters (11 down to 1)]
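The data behind a scree diagram can be derived directly from the agglomeration schedule: after stage s, the number of clusters left is the number of cases minus s. The sketch below, using the coefficients of the 12-case schedule above, prints a rough text version of the diagram. The function name `scree_points` and the bar scaling are assumptions for illustration.

```python
# Coefficients for stages 1..11 of the 12-case agglomeration schedule
coefficients = [0.015, 0.708, 0.974, 1.042, 1.100, 3.680,
                3.492, 6.744, 8.276, 8.787, 11.403]
n_cases = 12

def scree_points(coefficients, n_cases):
    """Pair each stage's coefficient with the number of clusters left after it."""
    return [(n_cases - stage, coef)
            for stage, coef in enumerate(coefficients, start=1)]

points = scree_points(coefficients, n_cases)
for n_clusters, coef in points:
    bar = "#" * round(coef * 4)  # crude text bar, 4 characters per distance unit
    print(f"{n_clusters:>2} clusters  {coef:7.3f}  {bar}")
```

The same pairs can be pasted into a spreadsheet to draw the diagram as a line chart.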
Validating the
analysis
Impact of initial seeds / order of
cases
Impact of the selected method
Consider the relevance of the
chosen set of variables
SPSS Example
[Figure: scatter plot of ten cases (MATTHEW, JULIA, LUCY, JENNIFER, NICOLE, JOHN, PAMELA, THOMAS, ARTHUR, FRED) on Component1 (horizontal axis) and Component2 (vertical axis)]
Agglomeration Schedule
        Cluster Combined                   Stage Cluster First Appears
Stage  Cluster 1  Cluster 2  Coefficients  Cluster 1  Cluster 2  Next Stage
    1          3          6          .026          0          0           8
    2          2          5          .078          0          0           7
    3          4          9          .224          0          0           5
    4          1          7          .409          0          0           6
    5          4         10          .849          3          0           8
    6          1          8         1.456          4          0           7
    7          1          2         4.503          6          2           9
    8          3          4         9.878          1          5           9
    9          1          3        18.000          7          8           0

Number of clusters: 10 - 6 = 4 (number of cases minus the stage of the elbow)
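The arithmetic behind this rule can be sketched in Python using the coefficients of the ten-case schedule. The helper names and, in particular, the `min_jump` threshold of 2.0 are assumptions chosen for illustration; in practice the elbow is judged by eye from the schedule or the scree diagram.

```python
def jumps(coefficients):
    """Increase in the distance coefficient from each stage to the next."""
    return [later - earlier
            for earlier, later in zip(coefficients, coefficients[1:])]

def elbow_stage(coefficients, min_jump):
    """Last stage before the first jump of at least min_jump (None if none)."""
    for stage, jump in enumerate(jumps(coefficients), start=1):
        if jump >= min_jump:
            return stage
    return None

def clusters_at_elbow(n_cases, stage):
    """Number of clusters remaining after the given stage."""
    return n_cases - stage

# Coefficients for stages 1..9 of the ten-case agglomeration schedule
coefficients = [0.026, 0.078, 0.224, 0.409, 0.849, 1.456, 4.503, 9.878, 18.000]
n_cases = 10

stage = elbow_stage(coefficients, min_jump=2.0)  # 2.0 is an arbitrary threshold
print(stage, clusters_at_elbow(n_cases, stage))  # 6 4
```

The later jumps in this schedule are larger in absolute terms, so a rule based on the single biggest jump would not pick stage 6; the first-large-jump rule used here is only one possible reading of the elbow.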
[Figure: the same Component1 / Component2 scatter plot, with each case marked by its cluster number (1 to 4)]
Open the dataset
supermarkets.sav
From your N: directory (if you saved it
there last time)
Or download it from:
http://www.rdg.ac.uk/~aes02mm/supermarket.sav
Open it in SPSS
The supermarkets.sav
dataset
Run Principal
Components Analysis
and save scores
Select the variables to perform the
analysis
Set the rule to extract principal
components
Give instruction to save the
principal components as new
variables
Cluster analysis:
basic steps
Apply Ward's method to the
principal component scores
Check the agglomeration schedule
Decide the number of clusters
Apply the k-means method
Analyse / Classify
Select the component scores
[Screenshot callouts: "Select from here", "Untick this"]
Select Ward's algorithm
[Screenshot callouts: "Select method here", "Click here first"]
Output: Agglomeration
schedule
Number of clusters
Identify the step where the distance coefficient makes the biggest
jump
The scree diagram
(Excel needed)
[Figure: scree diagram plotting distance (0 to 800) against agglomeration step (118 to 148)]
Number of clusters
Number of cases: 150
Step of elbow: 144
Number of clusters: 150 - 144 = 6
Now repeat the
analysis
Choose the k-means technique
Set 6 as the number of clusters
Save cluster number for each case
Run the analysis
K-means
K-means dialog box

Specify the number of clusters
Save cluster membership
[Screenshot callouts: "Click here first", "Tick here"]
Final output
Cluster membership
Component meaning
(tutorial week 5)
Component labels:
1. Old Rich Big Spender
2. Family shopper
3. Vegetarian TV lover
4. Organic radio listener
5. Vegetarian TV and web hater
Component Matrix(a)
                                  Component
                                          1          2          3          4          5
Monthly amount spent                   .810      -.294  -4.26E-02       .183       .173
Meat expenditure                       .480      -.152       .347       .334  -5.95E-02
Fish expenditure                       .525      -.206      -.475  -4.35E-02       .140
Vegetables expenditure                 .192      -.345      -.127       .383       .199
% spent in own-brand product           .646      -.281      -.134      -.239      -.207
Own a car                              .536       .619      -.102      -.172  6.008E-02
% spent in organic food                .492      -.186       .190       .460       .342
Vegetarian                        1.784E-02  -9.24E-02       .647      -.287       .507
Household Size                         .649       .612       .135  -6.12E-02  -3.29E-03
Number of kids                         .369       .663       .247       .184  1.694E-02
Weekly TV watching (hours)             .124  -9.53E-02       .462       .232      -.529
Weekly Radio listening (hours)    2.989E-02       .406      -.349       .559  -8.14E-02
Surf the web                           .443      -.271       .182  -5.61E-02      -.465
Yearly household income                .908  -4.75E-02  -7.46E-02      -.197  -3.26E-02
Age of respondent                      .891  -5.64E-02  -6.73E-02      -.228  6.942E-04
Extraction Method: Principal Component Analysis.
a. 5 components extracted.
Final Cluster Centers
                                     Cluster
                                            1         2         3         4         5         6
REGR factor score 1 for analysis 1   -1.34392    .21758    .13646    .77126    .40776    .72711
REGR factor score 2 for analysis 1     .38724   -.57755  -1.12759    .84536    .57109   -.58943
REGR factor score 3 for analysis 1    -.22215   -.09743   1.41343    .17812   1.05295  -1.39335
REGR factor score 4 for analysis 1     .15052   -.28837   -.30786   1.09055  -1.34106    .04972
REGR factor score 5 for analysis 1     .04886   -.93375   1.23631   -.11108    .31902    .87815
Cluster interpretation
through mean component values
Cluster 1 is very far from profile 1 (-1.34) and
more similar to profile 2 (0.39)
Cluster 2 is very far from profile 5 (-0.93) and
not particularly similar to any profile
Cluster 3 is extremely similar to profiles 3 and 5
and very far from profile 2
Cluster 4 is similar to profiles 2 and 4
Cluster 5 is very similar to profile 3 and very far
from profile 4
Cluster 6 is very similar to profile 5 and very far
from profile 3
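The reading above can be reproduced mechanically from the final cluster centers: for each cluster, find the component with the highest mean score (closest profile) and the lowest (farthest profile). The sketch below does this in Python; the dictionary layout and the function name `extremes` are assumptions, while the values are those of the Final Cluster Centers table.

```python
# Final cluster centers: cluster number -> mean scores on components 1..5
centers = {
    1: [-1.34392, 0.38724, -0.22215, 0.15052, 0.04886],
    2: [0.21758, -0.57755, -0.09743, -0.28837, -0.93375],
    3: [0.13646, -1.12759, 1.41343, -0.30786, 1.23631],
    4: [0.77126, 0.84536, 0.17812, 1.09055, -0.11108],
    5: [0.40776, 0.57109, 1.05295, -1.34106, 0.31902],
    6: [0.72711, -0.58943, -1.39335, 0.04972, 0.87815],
}

def extremes(center):
    """Return (closest, farthest) component numbers for one cluster center."""
    components = range(1, len(center) + 1)
    closest = max(components, key=lambda c: center[c - 1])
    farthest = min(components, key=lambda c: center[c - 1])
    return closest, farthest

for cluster, center in centers.items():
    closest, farthest = extremes(center)
    print(f"Cluster {cluster}: closest to profile {closest} "
          f"({center[closest - 1]:.2f}), farthest from profile {farthest} "
          f"({center[farthest - 1]:.2f})")
```

A single maximum and minimum per cluster is a simplification: cluster 3, for example, scores high on both profiles 3 and 5, which only shows when the whole row is inspected.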
Which cluster to
target?
Objective: target the organic
consumer
Which cluster looks the most
organic?
Compute the descriptive statistics
on the original variables for that
cluster
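One possible way to approach these two steps is sketched below in Python. It assumes that "organic" is best captured by component 4 (labelled "Organic radio listener" in the component matrix), which is a judgment call rather than something the tutorial states. The component 4 means come from the Final Cluster Centers table; the `rows` used for the descriptive statistics are purely hypothetical placeholders, not values from supermarkets.sav.

```python
import statistics

# Mean score on component 4 for each cluster (from the final cluster centers)
component_4_centers = {
    1: 0.15052, 2: -0.28837, 3: -0.30786,
    4: 1.09055, 5: -1.34106, 6: 0.04972,
}

# Assumption: the cluster with the highest component 4 mean is the target
target = max(component_4_centers, key=component_4_centers.get)
print(target)  # 4

def describe(values):
    """Basic descriptive statistics for one original variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }

# Hypothetical rows: (cluster membership, % spent in organic food)
rows = [(4, 30.0), (4, 20.0), (1, 5.0), (4, 25.0), (2, 10.0)]
organic_share = [value for cluster, value in rows if cluster == target]
summary = describe(organic_share)
print(summary)
```

In SPSS the equivalent step would be to select the cases in the target cluster, using the saved cluster membership variable, and run descriptive statistics on the original variables.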
Representation of factors 1
and 4
(and cluster membership)
[Figure: scatter plot of REGR factor score 1 (horizontal axis) against REGR factor score 4 (vertical axis) for analysis 1, with each case marked by its cluster number (1 to 6)]
