
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-1E

This document discusses using the K-Means algorithm in SAP HANA to perform customer segmentation on telco data. It describes generating a K-Means procedure using the AFL wrapper generator, then executing the procedure while varying the number of clusters. To determine the optimal number of clusters, it measures the total intra-cluster distance for different values and looks for the elbow point where adding more clusters does not significantly reduce the distance. This technique is called the elbow criterion and helps identify the right number of clusters for segmentation.

Uploaded by

jefferyleclerc

3/14/24, 10:16 AM SAP HANA PAL – K-Means Algorithm or How to do Cust...

- SAP Community

call SYSTEM.afl_wrapper_generator('PAL_KMEANS_TELCO', 'AFLPAL', 'KMEANS', PDATA_TELCO);

After executing this code, we should see a new procedure called PAL_KMEANS_TELCO in the _SYS_AFL schema.

Run the K-Means Procedure

I generated the K-Means procedure so now I need to write the code that will execute it:

/* This table will contain the parameters that will be used
during the execution of the K-Means procedure.
For example, the number of clusters you would like to use. */

DROP TABLE PAL_CONTROL_TAB_TELCO;

CREATE COLUMN TABLE PAL_CONTROL_TAB_TELCO(

https://community.sap.com/t5/technology-blogs-by-members/sap-hana-pal-k-means-algorithm-or-how -to-do-customer-segmentation-for-the/ba-p/12976696/page/2 11/39



CALL PAL_KMEANS_TELCO(TELCO, PAL_CONTROL_TAB_TELCO, PAL_KMEANS_RESASSIGN_TAB_TELCO,
PAL_KMEANS_CENTERS_TAB_TELCO) with overview;

Pretty easy, huh?

Identify the Right Number of Clusters

OK, I have my code ready, but I’m missing a very important part: I still don’t know what value of K to specify as the
input parameter (well, I do know because I created the sample data, but let’s pretend I don’t). There are multiple
techniques for finding out how many groups will produce the best clustering; in this case I will use the elbow criterion. The
elbow criterion is a common rule of thumb that says one should choose a number of clusters such that adding another
cluster does not add much information. I will run the code above with different numbers of clusters, and for each run
I will measure the total intra-cluster distance. When the distance stops decreasing much from one run to the next, I will
know the number of groups I need to use. I built the chart below with the results:
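As a rough illustration of the elbow criterion described above, here is a minimal sketch in plain Python, outside of HANA. It assumes the total intra-cluster distance is the sum of each point's Euclidean distance to its assigned cluster center, and it treats "does not decrease much" as a relative drop below 10%. The function names, the threshold, and the numbers in `example_totals` are all made up for illustration; they are not part of PAL and are not the results from this post.

```python
import math


def total_intra_cluster_distance(points, centers, assignments):
    """Sum of Euclidean distances from each point to its assigned center.

    points:      list of coordinate tuples
    centers:     list of coordinate tuples, one per cluster
    assignments: list of cluster indexes, one per point
    """
    return sum(math.dist(p, centers[c]) for p, c in zip(points, assignments))


def pick_elbow(distance_by_k, min_relative_drop=0.10):
    """Return the smallest k after which adding one more cluster reduces
    the total distance by less than min_relative_drop (a fraction).

    The 10% default is an arbitrary assumption; in practice the elbow is
    usually judged by eye from a chart.
    """
    ks = sorted(distance_by_k)
    for k, next_k in zip(ks, ks[1:]):
        current = distance_by_k[k]
        if current == 0:
            # Already a perfect fit; more clusters cannot help.
            return k
        drop = (current - distance_by_k[next_k]) / current
        if drop < min_relative_drop:
            return k
    # No flattening found in the tested range; return the largest k tried.
    return ks[-1]


# Hypothetical totals, one per run with a different number of clusters.
example_totals = {1: 1000.0, 2: 600.0, 3: 250.0, 4: 235.0, 5: 228.0}
chosen_k = pick_elbow(example_totals)
print(chosen_k)
```

With these made-up totals the distance drops sharply up to three clusters and then flattens, so the sketch picks 3. A fixed threshold is only a convenience; plotting the totals and looking for the bend is the more common way to apply the rule.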
