Cluster Analysis: Abu Bashar

This document discusses cluster analysis techniques. It begins by defining cluster analysis as a method used to classify cases into relatively homogeneous groups based on a set of variables. It notes that cluster analysis is especially useful for market segmentation. The document then discusses other uses of cluster analysis including product analysis and data reduction. It outlines the main steps to conduct a cluster analysis including selecting measures, algorithms, determining cluster numbers, and validating results. Finally, it covers topics like distance measures, hierarchical and non-hierarchical clustering methods, and determining the optimal number of clusters.

Cluster Analysis

Abu Bashar

Cluster analysis
It is a class of techniques used to classify cases into groups that are relatively homogeneous within themselves and heterogeneous between each other.
Homogeneity (similarity) and heterogeneity (dissimilarity) are measured on the basis of a defined set of variables.

These groups are called clusters

Market segmentation
Cluster analysis is especially useful for market segmentation.
Segmenting a market means dividing its potential consumers into separate sub-sets where:
Consumers in the same group are similar with respect to a given set of characteristics
Consumers belonging to different groups are dissimilar with respect to the same set of characteristics

This allows one to calibrate the marketing mix differently according to the target consumer group

Other uses of cluster analysis

Product characteristics and the identification of new product opportunities
Clustering similar brands or products according to their characteristics allows one to identify competitors, potential market opportunities and available niches.
Data reduction
Factor analysis and principal component analysis reduce the number of variables.
Cluster analysis reduces the number of observations by grouping them into homogeneous clusters.

Steps to conduct a cluster analysis

1. Select a distance measure
2. Select a clustering algorithm
3. Define the distance between two clusters
4. Determine the number of clusters
5. Validate the analysis

Distance measures for individual observations

To measure similarity between two observations, a distance measure is needed.
With a single variable, similarity is straightforward.
Example: income. Two individuals are similar if their income levels are similar, and the level of dissimilarity increases as the income gap increases.

Multiple variables require an aggregate distance measure

With many characteristics (e.g. income, age, consumption habits, family composition, owning a car, education level, job), it becomes more difficult to define similarity with a single value.

The best-known measure of distance is the Euclidean distance, which is the concept we use in everyday life for spatial coordinates.
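As a minimal sketch of how several variables collapse into one distance value, the Euclidean distance can be computed as below. The function name and the two example consumers are hypothetical and are not taken from the slides.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two observations.

    Each observation is a sequence with one value per variable.
    """
    if len(a) != len(b):
        raise ValueError("observations must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical consumers described by (income in thousands, age in years).
consumer_a = (30.0, 40.0)
consumer_b = (33.0, 44.0)

print(euclidean_distance(consumer_a, consumer_b))  # 5.0
```

Variables measured on very different scales contribute unequally to this sum, so in practice they are often standardized before distances are computed.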

Other distance measures

Other distance measures: Chebyshev, Minkowski, Mahalanobis.
An alternative approach: use correlation measures, where correlations are computed not between variables but between observations.
Each observation is characterized by a set of measurements (one for each variable), and bivariate correlations can be computed between two observations.
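The sketch below illustrates two of the measures named above (Minkowski and Chebyshev) and the correlation between two observations. Function names and example values are hypothetical. Mahalanobis distance is left out because it also needs the covariance matrix of the variables.

```python
import math


def minkowski_distance(a, b, p):
    """Minkowski distance of order p (p=1 is city-block, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev_distance(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))


def observation_correlation(a, b):
    """Pearson correlation between the profiles of two observations.

    The correlation is taken across variables, not across cases.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Two hypothetical observations measured on three variables.
obs_1 = (1.0, 2.0, 3.0)
obs_2 = (2.0, 4.0, 6.0)

print(minkowski_distance(obs_1, obs_2, 1))    # 6.0
print(chebyshev_distance(obs_1, obs_2))       # 3.0
print(observation_correlation(obs_1, obs_2))  # 1.0
```

The two observations are perfectly correlated even though their distance is not zero: a correlation measure compares the shape of the profiles, while a distance measure also reflects their level.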

Clustering procedures
Hierarchical procedures
Agglomerative (start from n clusters to get to 1 cluster)
Divisive (start from 1 cluster to get to n clusters)

Non-hierarchical procedures

K-means clustering

Hierarchical clustering
Agglomerative:
Each of the n observations constitutes a separate cluster.
The two clusters that are most similar according to some distance rule are merged, so that after step 1 there are n-1 clusters.
In the second step another cluster is formed (n-2 clusters) by merging the two clusters that are most similar, and so on.
There is a merge in each step until all observations end up in a single cluster in the final step.

Divisive:
All observations are initially assumed to belong to a single cluster.
The most dissimilar observation is extracted to form a separate cluster.
After step 1 there are 2 clusters, after the second step 3 clusters, and so on, until the final step produces as many clusters as there are observations.

The number of clusters determines the stopping rule for the algorithms
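The agglomerative procedure can be sketched as follows. This is an illustration under assumptions, not the method of any particular package: the distance between two clusters is taken as the distance between their closest members (single linkage), which is only one possible rule, and the data points are made up.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(cluster_a, cluster_b, points):
    """Cluster distance = distance between the two closest members (an assumption)."""
    return min(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)


def agglomerate(points):
    """Merge the two closest clusters until one cluster remains.

    Returns one (merged_cluster, merging_distance, clusters_left) tuple per step.
    """
    clusters = [[i] for i in range(len(points))]  # start with n singleton clusters
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]
        del clusters[j]  # delete the higher index first so i stays valid
        del clusters[i]
        clusters.append(merged)
        history.append((merged, d, len(clusters)))
    return history


# Hypothetical observations on two variables.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0), (20.0, 0.0)]
history = agglomerate(points)
for merged, distance, clusters_left in history:
    print(clusters_left, "clusters left after merging at distance", distance)
```

With these points the merging distances are 1.0, 2.0, 5.0 and 15.0, leaving 4, 3, 2 and 1 clusters. Stopping the loop early at a chosen number of clusters is what the stopping rule amounts to.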

Non-hierarchical clustering
These algorithms do not follow a hierarchy and produce a single partition.
Knowledge of the number of clusters (c) is required.
In the first step, initial cluster centres (the seeds) are determined for each of the c clusters, either by the researcher or by the software (usually the first c observations, or c observations chosen at random).
Each iteration allocates observations to each of the c clusters, based on their distance from the cluster centres.
Cluster centres are then computed again, and observations may be reallocated to the nearest cluster in the next iteration.
When no observations can be reallocated or a stopping rule is met, the process stops.

Non-hierarchical clustering: K-means method

1. The number k of clusters is fixed
2. An initial set of k seeds (aggregation centres) is provided
   First k elements
   Other seeds (randomly selected or explicitly defined)
3. Given a certain fixed threshold, all units are assigned to the nearest cluster seed
4. New seeds are computed
5. Go back to step 3 until no reclassification is necessary

Units can be reassigned in successive steps (optimising partitioning)
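The steps above can be sketched in a few lines. This is a simplified illustration with assumptions: the first k elements are used as seeds, the fixed threshold mentioned in step 3 is not modelled, `max_iterations` is an added safety limit, and the data are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iterations=100):
    """Return (assignment, seeds) for a basic k-means run."""
    seeds = [tuple(p) for p in points[:k]]  # step 2: first k elements as seeds
    assignment = None
    for _ in range(max_iterations):
        # Step 3: assign every unit to the nearest seed.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, seeds[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # step 5: no reclassification was necessary
        assignment = new_assignment
        # Step 4: recompute each seed as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old seed if a cluster is empty
                seeds[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, seeds


# Hypothetical observations on two variables.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
assignment, seeds = k_means(points, k=2)
print(assignment)  # [0, 0, 1, 1, 0, 1]
```

In this run the second observation starts in cluster 1 because it is itself a seed, then moves to cluster 0 once the seeds are recomputed, which is the reassignment in successive steps that the slide refers to.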

Hierarchical vs. non-hierarchical methods

Hierarchical Methods
No decision about the number of clusters
Problems when data contain a high level of error
Can be very slow, preferable with small data-sets
Initial decisions are more influential (one-step only)
At each step they require computation of the full proximity matrix

Non-hierarchical methods
Faster, more reliable, work with large data sets
Need to specify the number of clusters
Need to set the initial seeds
Only distances to the cluster seeds need to be computed in each iteration

The number of clusters c

Two alternatives
Determined by the analysis
Fixed by the researchers

In segmentation studies, c represents the number of potential separate segments.
Preferable approach: let the data speak
Hierarchical approach, with the optimal partition identified through statistical tests (stopping rule for the algorithm)
However, the detection of the optimal number of clusters is subject to a high degree of uncertainty

If the research objectives allow a choice rather than estimating the number of clusters, non-hierarchical methods are the way to go.

Example: fixed number of clusters

A retailer wants to identify several shopping profiles in order to open new, targeted retail outlets.
The budget only allows him to open three types of outlets.
A partition into three clusters follows naturally, although it is not necessarily the optimal one.
Fixed number of clusters and (k-means) non-hierarchical approach

Dendrogram

[Figure: dendrogram. Horizontal axis: Rescaled Distance Cluster Combine (0 to 25). Rows: the individual cases, labelled by case number (231, 275, 145, 181, 333, 117, 336, 337, 209, 431, 178).]

Case 231 and case 275 are merged, and the merging distance is relatively small.
The dotted line represents the distance between clusters.
As the algorithm proceeds, the merging distances become larger.

Scree diagram

[Figure: scree diagram. y-axis: merging distance (0 to 12). x-axis: number of clusters (11 down to 1).]

When one moves from 7 to 6 clusters, the merging distance increases noticeably.
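One simple way to read a scree diagram programmatically is to look for the largest increase in merging distance between successive merges and keep the number of clusters that existed just before it. The sketch below assumes that heuristic. It is not one of the statistical tests mentioned earlier, and the distances are made-up values, not those of the figure.

```python
def suggest_cluster_count(scree):
    """Pick the number of clusters just before the largest jump in merging distance.

    `scree` is a list of (clusters_left, merging_distance) pairs in merge order.
    """
    best_jump = None
    best_k = None
    for (k_prev, d_prev), (k_next, d_next) in zip(scree, scree[1:]):
        jump = d_next - d_prev
        if best_jump is None or jump > best_jump:
            best_jump = jump
            best_k = k_prev  # stop before the costly merge
    return best_k


# Hypothetical merging distances for a small data set.
scree = [(4, 1.0), (3, 2.0), (2, 5.0), (1, 15.0)]
print(suggest_cluster_count(scree))  # 2
```

Here the jump from 5.0 to 15.0 is the largest, so the sketch suggests stopping at 2 clusters. Applied to the slide's case, a noticeable increase when moving from 7 to 6 clusters would point to 7 clusters.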

Thank You Very Much
