ML Review PPT 2

Uploaded by

yerukalabharathi0

SRM INSTITUTE OF SCIENCE AND TECHNOLOGY

SCHOOL OF COMPUTING
DEPARTMENT OF COMPUTING TECHNOLOGIES
21CSC305P- MINOR PROJECT

Customer Segmentation
Batch ID: 16

Guide name: Dr. Poornima S
Designation: Associate Professor
Department: C.Tech

Reg. No: RA2211003011093, Name: M. Durga Prasad
Reg. No: RA221100301104, Name: K. Yaswanth
Reg. No: RA2211003011107, Name: S. Tejaswi
Reg. No: RA2211003011123, Name: V. Yaswanth

Introduction
In today's competitive market, businesses must understand their customers at a granular
level to effectively meet their needs and enhance customer loyalty. Customer
segmentation is a powerful data-driven approach that divides a company's customer
base into distinct groups based on similar characteristics, such as purchasing behavior,
demographic information, and preferences. By identifying these segments, companies
can develop personalized marketing strategies, optimize product offerings, and
improve customer engagement. The objective of this project is to leverage data analysis
to create actionable customer segments, enabling more targeted and efficient business
decisions.

31/08/2024
Problem Statement
The traditional approach to customer segmentation is often manual, subjective, and
inefficient, leading to missed opportunities for personalized marketing and customer
engagement.

As businesses grow, understanding large and diverse customer bases becomes
increasingly difficult, resulting in generalized marketing strategies that fail to address
specific customer needs and preferences.

Without a clear understanding of different customer segments, companies risk losing
valuable customers by not providing targeted offers or services that resonate with their
unique behaviors and characteristics.

The challenge is to develop an automated system that can efficiently analyze customer
data, identify distinct segments, and provide actionable insights, allowing businesses to
tailor their marketing efforts, improve customer satisfaction, and drive growth.

Objectives
• Automate Customer Segmentation: Develop a machine learning system to automatically analyze
and segment customers based on behavioral, demographic, and transactional data, reducing manual
effort in identifying key segments.

• Enhance Personalization: Improve the precision of marketing strategies by identifying distinct
customer groups, enabling the delivery of personalized offers, products, and communications that
resonate with each segment.

• Optimize Resource Allocation: Help businesses allocate marketing and sales resources more
efficiently by focusing efforts on the most valuable customer segments, maximizing ROI.

• Scalability: Ensure the system can process and analyze large volumes of customer data, making it
adaptable to businesses of varying sizes and industries.

• Real-time Insights: Provide real-time or near-real-time customer segmentation, allowing businesses
to quickly adapt to changing customer behaviors and market trends.

Literature Review

1. Title: "Customer Segmentation Using Hierarchical Clustering" (2024)
   Authors: Areeba Afzal, Laiba Khan, Muhammad Zunnurain Hussain, Muzzamil Mustafa, Aqsa Khalid, Nawaz Khan
   Inference: The dataset encompasses a diverse range of mall customers, spanning demographics and
   behavioral attributes. Hierarchical clustering systematically groups customers into clusters,
   revealing distinct segments within the mall's customer base. A comprehensive analysis of these
   clusters yields insights into customer tendencies, preferences, and purchasing habits. By
   leveraging hierarchical clustering for mall customer segmentation, businesses can enhance
   customer satisfaction, drive sales, and foster lasting customer relationships.
   Link: https://ieeexplore.ieee.org/document/10543349

2. Title: "Customer Segmentation using K-means Clustering" (2018)
   Authors: Tushar Kansal, Suraj Bahuguna, Vishal Singh, Tanupriya Choudhury
   Inference: Customer segmentation is the process of placing customers with similar behaviours
   into the same segment and customers with different patterns into different segments. In this
   paper, three clustering algorithms (k-Means, Agglomerative, and Meanshift) are implemented to
   segment the customers, and the clusters obtained from the algorithms are then compared.
   Link: https://ieeexplore.ieee.org/document/8769171

3. Title: "An efficiency analysis on the TPA clustering methods for intelligent customer segmentation" (2017)
   Authors: Ananthi Sheshasaayee, Santhosh S, L. Logeshwari
   Inference: Nowadays, commercial marketing growth is improved by customer segmentation models.
   The literature uses data mining technology to review customer segmentation and its
   effectiveness. Stages of CRM have been used in most cases. The paper applies data mining tools
   to RFM, demographic, and LTV data to build a new customer segmentation.
   Link: https://ieeexplore.ieee.org/document/7975573

4. Title: "Market segmentation using ML" (2023)
   Authors: Juhi Singh, Kritika Jaiswal, Minal Singh, Muskan Sama, Swasti Singhal
   Inference: Market segmentation is an approach that aims to identify and outline the market
   segments an organization can target in its marketing plans. It is used not only for selling
   commodities and services but also plays a crucial role in meeting customers' needs, because
   without customers there is no business. The general objective of this research is to analyze
   the factors that influence the student admission process in various private institutions.
   Link: https://ieeexplore.ieee.org/document/10150639
Proposed System / Methodology
Approach
The objective is to perform customer segmentation using unsupervised machine learning techniques. The two
primary clustering methods—K-Means Clustering and Hierarchical Clustering—will be applied to identify
distinct customer groups based on their behaviors and characteristics. The approach will include the following
steps:

• Data Collection and Understanding:
    • Load the customer segmentation dataset.
    • Understand the data structure and data types, and identify key features relevant to segmentation.
• Data Preprocessing:
    • Handle missing values by removing or imputing them.
    • Standardize the data using techniques like StandardScaler to ensure that all features contribute equally to the
      clustering process.
    • Select only the numeric columns that are meaningful for clustering (e.g., customer age, income, spending score).
• Clustering Analysis:
    • K-Means Clustering:
        • Apply the Elbow Method to determine the optimal number of clusters.
        • Fit the K-Means model with the chosen number of clusters.
        • Assign cluster labels to each customer and analyze the results.
    • Hierarchical Clustering:
        • Generate a dendrogram to visualize hierarchical relationships and determine the optimal number of clusters.
        • Fit the Hierarchical Clustering model based on the determined number of clusters.
        • Assign cluster labels to each customer and analyze the results.
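
The preprocessing and Elbow Method steps can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the data is synthetic, the column names "Age" and "Income" are hypothetical, and median imputation is just one of the imputing options mentioned above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer table (hypothetical columns and values).
rng = np.random.default_rng(42)
centers = np.array([[25, 30_000], [45, 60_000], [65, 90_000]])
rows = np.vstack([c + rng.normal(0, [3, 4_000], size=(50, 2)) for c in centers])
df = pd.DataFrame(rows, columns=["Age", "Income"])
df.loc[0, "Income"] = np.nan  # simulate one missing value

# Preprocessing: keep numeric columns, impute with the column median, standardize.
numeric = df.select_dtypes(include="number")
numeric = numeric.fillna(numeric.median())
scaled = StandardScaler().fit_transform(numeric)

# Elbow Method: record the inertia (within-cluster sum of squares) for each k.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(scaled)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

Plotting inertia against k and looking for the point where the curve flattens gives the "elbow"; on this synthetic data the drop should level off after k=3, since three groups were generated.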

• Model Evaluation and Comparison:
• Evaluate the clustering performance using the Silhouette Score to measure the
quality of clusters.
• Compare the results of K-Means and Hierarchical Clustering based on
visualization, interpretability, and silhouette score.
• Select the best-performing model to understand distinct customer segments.
• Insights and Interpretation:
• Interpret the clusters formed by the chosen model to derive actionable insights
for marketing strategies, customer targeting, and personalized promotions.
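
The comparison step can be sketched as below, assuming synthetic blob data in place of the scaled customer features and scikit-learn's AgglomerativeClustering as the hierarchical model. The number of clusters (4) is an assumption chosen to match the synthetic data, not a value from the project.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the scaled customer features (an assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=42)
X = StandardScaler().fit_transform(X)

models = {
    "K-Means": KMeans(n_clusters=4, n_init=10, random_state=42),
    "Hierarchical (ward)": AgglomerativeClustering(n_clusters=4, linkage="ward"),
}

# Fit each model on the same data and score its labels with the same metric.
scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    scores[name] = silhouette_score(X, labels)

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: silhouette = {score:.4f}")
print(f"Best by silhouette score: {best}")
```

Scoring every model on the same rows and the same features keeps the silhouette values directly comparable.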

Architectural Diagram
[Figure: architectural diagram, not reproduced in this text]

Technologies/Tools Used:
• Python Programming Language:
• Python will be used as the primary programming language due to its wide range of libraries and tools for
data analysis and machine learning.
• Jupyter Lab/Notebook:
• Jupyter Lab or Notebook will be used as the development environment for running Python code,
visualizing results, and documenting the analysis process interactively.
• Data Manipulation and Analysis Libraries:
• Pandas: For loading and preprocessing the dataset (handling missing values, data manipulation).
• NumPy: For numerical operations and mathematical computations.
• Data Visualization Libraries:
• Matplotlib: For creating basic visualizations (Elbow method plot, scatter plots).
• Seaborn: For enhanced visualization (plotting clusters, dendrograms).
• Machine Learning and Clustering Libraries:
• Scikit-Learn (sklearn): For implementing K-Means Clustering, Hierarchical Clustering, data scaling
(StandardScaler), and evaluation metrics (Silhouette Score).
• SciPy: For hierarchical clustering and dendrogram plotting.
• Data Standardization:
• StandardScaler from sklearn.preprocessing to standardize features (zero mean, unit variance) before applying
clustering algorithms.
Conclusion

• Enhanced Customer Targeting: Our customer segmentation model identifies distinct
customer groups based on behavioral, demographic, and transactional data, enabling
businesses to craft personalized marketing strategies tailored to each segment.
• Optimized Resource Allocation: By grouping customers into meaningful segments,
businesses can allocate their marketing, sales, and service resources more efficiently, focusing
on high-value groups to maximize ROI.
• Real-Time Insights: Our solution provides up-to-date customer segmentation that adapts to
changing behaviors, helping businesses stay agile, anticipate customer needs, and improve
overall customer satisfaction and retention.

References:-
https://ieeexplore.ieee.org/document/9777194
https://ieeexplore.ieee.org/document/8769171
https://ieeexplore.ieee.org/document/7975573
https://ieeexplore.ieee.org/document/10150639

REVIEW - 2

K-Means Clustering

Algorithm

1. Initialize centroids: Randomly select k initial centroids.

2. Assign points to the nearest cluster: Assign each data point to the closest centroid using
Euclidean distance.

3. Update centroids: Recalculate centroids by averaging the data points in each cluster.

4. Repeat: Repeat assignment and updating steps until convergence or max iterations.

5. Evaluate the clustering: Use metrics like the silhouette score to evaluate clustering
performance.
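
Steps 1 to 4 can be sketched in plain NumPy as follows. This is an illustrative implementation under simple assumptions (random data points as initial centroids, a hypothetical six-point dataset), not the scikit-learn KMeans used in the project code.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two tight groups of three points each.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The resulting labels can then be passed to a metric such as the silhouette score (step 5).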

Silhouette Score Calculation

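1. Cohesion a(i): The mean distance from point i to the other points in its own cluster.

2. Separation b(i): The mean distance from point i to the points in the nearest neighboring cluster.

3. Score: s(i) = (b(i) - a(i)) / max(a(i), b(i)). It ranges from -1 to 1; values close to 1 indicate
well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest a point
may be in the wrong cluster. The overall silhouette score is the mean of s(i) over all points.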
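
As a worked illustration, the per-point score can be computed by hand. The sketch below assumes the standard definition: a is the mean distance to the other points in the same cluster, b is the mean distance to the points of the nearest other cluster, and the score is (b - a) / max(a, b). The helper name and the four-point dataset are hypothetical.

```python
import numpy as np

def silhouette_samples_manual(X, labels):
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # a singleton cluster gets a score of 0
        # Cohesion: mean distance to the other members of the same cluster
        # (the zero self-distance is excluded by dividing by n_same - 1).
        a = dist[i, same].sum() / (n_same - 1)
        # Separation: smallest mean distance to any other cluster.
        b = min(dist[i, labels == other].mean()
                for other in set(labels.tolist()) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

# Two one-dimensional clusters: {0, 1} and {10, 11}.
X = [[0.0], [1.0], [10.0], [11.0]]
labels = [0, 0, 1, 1]
scores = silhouette_samples_manual(X, labels)
print(scores)
print(scores.mean())
```

For the first point, a = 1 and b = (10 + 11) / 2 = 10.5, so its score is 9.5 / 10.5; the mean over all four points is the overall silhouette score.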

Code:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

# Load dataset (update the file path if necessary)
data = pd.read_csv('customer_segmentation_data.csv')

# Selecting only numeric columns for clustering
numeric_features = data.select_dtypes(include=['float64', 'int64'])

# Scale the numeric data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(numeric_features)

# Store silhouette scores for each clustering method
silhouette_scores = {}

# K-Means Clustering
kmeans = KMeans(n_clusters=5, random_state=42)
kmeans_labels = kmeans.fit_predict(scaled_data)
data['KMeans_Cluster'] = kmeans_labels

# Calculate Silhouette Score for K-Means
kmeans_silhouette = silhouette_score(scaled_data, kmeans_labels)
silhouette_scores['K-Means'] = kmeans_silhouette
print(f"K-Means Silhouette Score: {kmeans_silhouette}")
Hierarchical Clustering

Algorithm
1. Compute linkage matrix: Calculate distances between clusters using methods like
ward, single, or complete.

2. Plot dendrogram: Visualize the hierarchical clustering structure with a dendrogram.

3. Cut the dendrogram: Select the number of clusters by cutting the dendrogram at a
specific height.

4. Assign cluster labels: Use the fcluster function to assign data points to clusters.

5. Evaluate the clustering: Calculate the silhouette score to measure cluster cohesion
and separation.
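
The cut in step 3 can be expressed in two ways with SciPy's fcluster: by height (criterion="distance") or by a target number of clusters (criterion="maxclust"). The sketch below uses a hypothetical five-point dataset; the threshold of 5.0 is an assumption chosen to fall between the small and large merge distances of this toy data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical toy data: two pairs of nearby points and one isolated point.
points = np.array([[0.0, 0.0], [0.0, 1.0],
                   [10.0, 0.0], [10.0, 1.0],
                   [20.0, 0.0]])

# Step 1: compute the linkage matrix with the ward method.
linkage_matrix = linkage(points, method="ward")

# Step 3, option A: cut the dendrogram at a fixed height.
labels_by_height = fcluster(linkage_matrix, t=5.0, criterion="distance")

# Step 3, option B: ask for at most three clusters.
labels_by_count = fcluster(linkage_matrix, t=3, criterion="maxclust")

print(linkage_matrix[:, 2])  # merge distances, in the order the merges happen
print(labels_by_height)
print(labels_by_count)
```

Cutting by height follows the dendrogram directly, while maxclust is convenient when the number of segments is fixed in advance; here both cuts should give the same three groups.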

Code:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score
from scipy.cluster.hierarchy import dendrogram, linkage, fcluster
import matplotlib.pyplot as plt

file_path = 'customer_segmentation_data.csv'
data = pd.read_csv(file_path)

# Select relevant numerical features for clustering
numerical_features = ['Age', 'Income Level', 'Coverage Amount', 'Premium Amount']
data_numerical = data[numerical_features]
numeric_features = data.select_dtypes(include=['float64', 'int64'])

# Normalize the data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(numeric_features)
data_normalized = scaler.fit_transform(data_numerical)

# Sample a smaller subset of the data for clustering (to avoid memory issues)
data_sample = data_numerical.sample(n=1000, random_state=42)
data_sample_normalized = scaler.fit_transform(data_sample)

# Perform hierarchical clustering using the 'ward' method
linkage_matrix = linkage(data_sample_normalized, method='ward')

# Plot the dendrogram to visualize the hierarchical clustering
plt.figure(figsize=(10, 7))
dendrogram(linkage_matrix)
plt.title('Hierarchical Clustering Dendrogram (Sample)')
plt.xlabel('Sample Index')
plt.ylabel('Distance')
plt.show()

# Extract clusters by specifying the number of clusters (e.g., 5 clusters)
num_clusters = 5
cluster_labels = fcluster(linkage_matrix, num_clusters, criterion='maxclust')

# Add the cluster labels to the original sampled data
data_sample_with_clusters = data_sample.copy()
data_sample_with_clusters['Cluster'] = cluster_labels

# Calculate the silhouette score for the clustering
silhouette_avg = silhouette_score(data_sample_normalized, cluster_labels)
print(f'Silhouette Score for {num_clusters} clusters: {silhouette_avg}')

# Display the first few rows of the data with the cluster labels
print(data_sample_with_clusters.head())

# Optionally, save the results to a new CSV file
data_sample_with_clusters.to_csv('customer_segmentation_with_clusters.csv', index=False)
DBSCAN Clustering

Algorithm
1. Initialize DBSCAN: Set parameters eps (maximum distance between two points for them to count as
neighbors) and min_samples (minimum number of neighbors a point needs to be a core point of a cluster).

2. Fit and predict: Apply DBSCAN to scaled_data to assign cluster labels, where -1 represents
outliers.

3. Assign cluster labels: Store the cluster labels in the original dataset.

4. Check cluster count: Ensure multiple clusters are formed (i.e., more than one unique label).

5. Calculate silhouette score: Compute silhouette score to evaluate clustering quality (if there
are multiple clusters).

6. Handle outliers: If only one cluster is found or all points are outliers, the silhouette score is
reported as "N/A".
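
Steps 4 to 6 can be sketched as below. Note one deliberate difference from a plain call to silhouette_score on all labels: this sketch excludes the points labeled -1 before scoring, so that noise is not treated as a cluster of its own. That choice is an assumption of this example, and the data is synthetic.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two dense synthetic blobs plus three far-away points that should become noise.
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.2, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.2, size=(50, 2))
outliers = np.array([[10.0, -10.0], [-10.0, 10.0], [15.0, 15.0]])
X = np.vstack([blob_a, blob_b, outliers])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# Separate noise (label -1) from clustered points and count the real clusters.
is_noise = labels == -1
n_clusters = len(set(labels[~is_noise]))

# Score only when at least two clusters exist; otherwise there is nothing to compare.
if n_clusters > 1:
    score = silhouette_score(X[~is_noise], labels[~is_noise])
else:
    score = None

print(f"clusters: {n_clusters}, noise points: {int(is_noise.sum())}, silhouette: {score}")
```

Whether to include or exclude noise when scoring is a judgment call; reporting the noise count alongside the score makes the choice visible.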
Code:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

# Load dataset (update the file path if necessary)
data = pd.read_csv('customer_segmentation_data.csv')

# Selecting only numeric columns for clustering
numeric_features = data.select_dtypes(include=['float64', 'int64'])

# Scale the numeric data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(numeric_features)

# Store silhouette scores for each clustering method
silhouette_scores = {}

# DBSCAN Clustering
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan_labels = dbscan.fit_predict(scaled_data)
data['DBSCAN_Cluster'] = dbscan_labels

# Calculate Silhouette Score for DBSCAN (if not all labels are outliers)
if len(set(dbscan_labels)) > 1:  # Ensure there are multiple clusters
    dbscan_silhouette = silhouette_score(scaled_data, dbscan_labels)
    silhouette_scores['DBSCAN'] = dbscan_silhouette
else:
    dbscan_silhouette = "N/A (Only one cluster or all points are considered outliers)"

print(f"DBSCAN Silhouette Score: {dbscan_silhouette}")
Gaussian Mixture Model Clustering

Algorithm
1. Initialize GMM: Set the number of components (clusters) and a random state for
reproducibility.

2. Fit and predict: Apply the GMM to the scaled_data to assign cluster labels.

3. Assign cluster labels: Store the cluster labels in the original dataset.

4. Calculate silhouette score: Compute the silhouette score to evaluate the quality of clustering.

5. Store silhouette score: Save the silhouette score for comparison with other models.

6. Print the result: Output the silhouette score for GMM clustering.
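
Unlike the other models, a GMM also provides a membership probability for each component, and the hard label from step 2 is simply the most probable component. A minimal sketch on synthetic two-blob data (the data and the choice of two components are assumptions for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic data: two Gaussian blobs of 100 points each.
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[4.0, 4.0], scale=0.5, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=42).fit(X)

hard_labels = gmm.predict(X)   # one component index per point
soft = gmm.predict_proba(X)    # one probability per point and component

print(hard_labels[:5])
print(soft[:5].round(3))
```

Points whose probabilities are split fairly evenly across components sit in the overlap between clusters, which is one way to inspect why a GMM's silhouette score is low.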

Code:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score

# Load dataset (update the file path if necessary)
data = pd.read_csv('customer_segmentation_data.csv')

# Selecting only numeric columns for clustering
numeric_features = data.select_dtypes(include=['float64', 'int64'])

# Scale the numeric data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(numeric_features)

# Store silhouette scores for each clustering method
silhouette_scores = {}

# Gaussian Mixture Model (GMM)
gmm = GaussianMixture(n_components=5, random_state=42)
gmm_labels = gmm.fit_predict(scaled_data)
data['GMM_Cluster'] = gmm_labels

# Calculate Silhouette Score for GMM
gmm_silhouette = silhouette_score(scaled_data, gmm_labels)
silhouette_scores['GMM'] = gmm_silhouette
print(f"GMM Silhouette Score: {gmm_silhouette}")
Best Model Based on Silhouette Scores
K-Means has the highest silhouette score of 0.1512, which suggests it performs the best in terms
of cluster cohesion and separation compared to the other models. Higher silhouette scores
indicate better-defined clusters with more distinct boundaries between them.

Why Other Models Are Not as Good:

Hierarchical Clustering (Silhouette Score: 0.1493):

Explanation: Though close to K-Means in performance, the slightly lower silhouette score
suggests that the clusters are marginally less well-defined. This could be due to the greedy nature
of the algorithm: once two clusters are merged, the merge is never revisited, which may leave
suboptimal boundaries. The hierarchical score was also computed on a 1,000-row sample using four
selected features, whereas the other models used all numeric columns, so the comparison is only
approximate.

Limitation: The chosen linkage criterion (ward) might not suit the given dataset, leading
to clusters that overlap or don't separate as cleanly.

DBSCAN (Silhouette Score: 0.0708):

Explanation: DBSCAN has the lowest silhouette score, indicating that it struggles to form well-separated
clusters. This could be due to the algorithm treating many points as outliers or forming dense clusters that do
not align well with the data's natural distribution.

Limitation: DBSCAN is sensitive to the choice of eps and min_samples. In this case, these parameters might
not have been optimal, leading to a poor clustering structure.

Gaussian Mixture Model (GMM) (Silhouette Score: 0.1435):

Explanation: GMM assumes that the data is generated from a mixture of Gaussian distributions, which might
not match the structure of the data well enough, leading to a slightly lower score than K-Means. It performs
well but falls short of creating clusters with strong separation.

Limitation: The probabilistic nature of GMM can sometimes lead to overlapping clusters if the data doesn't
fit the Gaussian assumption well.
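
The ranking can be reproduced from the four silhouette scores reported in this section. The snippet below only re-uses those reported numbers; it does not rerun any model.

```python
# Silhouette scores as reported in this section.
silhouette_scores = {
    "K-Means": 0.1512,
    "Hierarchical": 0.1493,
    "DBSCAN": 0.0708,
    "GMM": 0.1435,
}

# Sort from best to worst and show each model's gap to the best score.
ranking = sorted(silhouette_scores.items(), key=lambda item: item[1], reverse=True)
best_model, best_score = ranking[0]

for name, score in ranking:
    print(f"{name}: {score:.4f} (gap to best: {best_score - score:.4f})")
```

The gaps between K-Means, Hierarchical Clustering, and GMM are all below 0.01, so the ordering among these three should be read as a weak preference rather than a decisive result.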

1. Why does K-Means perform better in this case?

K-Means tends to work well when clusters are spherical and of similar size, which might align
with the structure of the scaled data. The simplicity and direct minimization of intra-cluster
variance may have led to more distinct, well-separated clusters compared to other models.

2. Why is DBSCAN's score lower despite its strength in detecting outliers?

DBSCAN is highly sensitive to parameter settings (eps and min_samples). If these are not chosen
carefully, it may either label too many points as outliers or fail to detect clusters properly. In this
case, the lower score indicates that DBSCAN likely failed to capture the true structure of the
data.
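
The sensitivity to eps can be seen by sweeping it while holding min_samples fixed. The sketch below uses synthetic blob data and an arbitrary set of eps values; both are assumptions for illustration and not the project's dataset or tuning procedure.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the scaled customer features (an assumption).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

results = []
for eps in (0.05, 0.2, 0.5, 2.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    # Count clusters without the noise label and measure the share of noise points.
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    noise_fraction = float(np.mean(labels == -1))
    results.append((eps, n_clusters, noise_fraction))
    print(f"eps={eps}: clusters={n_clusters}, noise={noise_fraction:.2%}")
```

A very small eps tends to label most points as noise, while a very large eps tends to merge everything into one cluster; useful values lie in between and depend on the data's scale.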

3. What does the difference in silhouette scores tell us about the data?

The relatively low scores across all models suggest that the data may not have clear, well-
separated clusters, or the scaling/preprocessing may not be optimal. The slight differences
between the models highlight how sensitive each algorithm is to the underlying data structure.

4. What is a good silhouette score threshold?

In general, a silhouette score close to 1 indicates excellent clustering, a score around 0
indicates overlapping clusters or poorly defined boundaries, and a negative score suggests that
many points sit closer to a neighboring cluster than to their own. Scores near 0.15 suggest that
clusters are somewhat defined, but there may be significant overlap or suboptimal separation.
