
Chapter 11: Unsupervised Data Mining

Business Analytics, 1e
By Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, and Leida Chen

9/25/21
Chapter 11 Learning Objectives (LOs)

LO 11.1 Conduct hierarchical cluster analysis.
LO 11.2 Conduct k-means cluster analysis.
LO 11.3 Conduct association rule analysis.
Introductory Case: Nutritional Facts of Candy Bars
• Aliyah is an honors student at a prestigious business school in Southern California. She is also a fledgling entrepreneur and owns a vending machine business. Aliyah is aware that California consumers are becoming increasingly health conscious when it comes to food purchases. She wants to come up with a better selection of candy bars and strategically group and display them in her vending machines.

• Aliyah wants to use the information to accomplish the following tasks.

1. Analyze the nutritional fact data and group candy products according to their nutritional content.
2. Select a variety of candy bars from each group to better meet the tastes of today's consumers.
3. Display the candy bars in her vending machines according to the grouping.
11.1: Hierarchical Cluster Analysis (1/14)
• Unsupervised data mining requires no knowledge of the target variable.
• The algorithms allow the computer to identify complex processes and patterns without any specific guidance from the analyst.
• It is an important part of exploratory data analysis because it makes no distinction between the target variable $y$ and the predictor variables $x_1, x_2, \ldots, x_k$.
• Uses similarity measures: Euclidean distance, Manhattan distance, Jaccard's coefficient.
• We explore two core unsupervised data mining techniques: cluster analysis and association rule analysis.
11.1: Hierarchical Cluster Analysis (2/14)
• Cluster analysis is an unsupervised data mining technique
that groups data into categories that share some similar
characteristic or trait.
– Similar within a cluster, dissimilar across clusters
– Uses similarity measures
• Allows useful exploratory analysis by summarizing a large
number of observations in a data set into a small number of
clusters.
• The cluster characteristics or profiles help us understand
and describe the different groups.
• A popular application of cluster analysis is called customer
or market segmentation.
• Two common clustering techniques: hierarchical clustering
and k-means clustering.
11.1: Hierarchical Cluster Analysis (3/14)
• Hierarchical clustering is a technique that uses an
iterative process to group data into a hierarchy of
clusters.
– Agglomerative clustering (AGNES): bottom-up, starts with each observation being its own cluster, iteratively merges clusters that are similar moving up the hierarchy
– Divisive clustering (DIANA): top-down, starts with a single cluster, iteratively separates the most dissimilar observations moving down the hierarchy (the two approaches are contrasted in the R sketch below)
• We focus on agglomerative clustering, which is the
most commonly used approach.
• The methods can be adapted to implement divisive
clustering.
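To make the distinction concrete, here is a minimal R sketch contrasting the two approaches with the agnes() and diana() functions from the cluster package; the built-in USArrests data set is an illustrative stand-in, not the textbook's example.

# Contrast agglomerative (AGNES) and divisive (DIANA) clustering.
library(cluster)

x <- scale(USArrests)              # standardize the numeric variables

ag <- agnes(x, method = "ward")    # bottom-up: start with n clusters, merge
dv <- diana(x)                     # top-down: start with 1 cluster, split

# Both results can be displayed as dendrograms.
plot(ag, which.plots = 2, main = "AGNES (agglomerative)")
plot(dv, which.plots = 2, main = "DIANA (divisive)")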
11.1: Hierarchical Cluster Analysis (4/14)
• With AGNES, each observation in the data initially forms its own cluster.
• The algorithm then successively merges these clusters into larger clusters based on their similarity until all observations are merged into one final cluster, referred to as the root.
• Uses (dis)similarity measures.
– Numeric: Euclidean distance or Manhattan distance
– Categorical: matching coefficient, Jaccard's coefficient
• Uses z-score standardization so that variables measured on different scales contribute equally to the distances.
• Linkage methods evaluate the (dis)similarity between clusters (see the R sketch below).
– Single: nearest distance between a pair of observations not in the same cluster
– Complete: farthest distance between a pair of observations not in the same cluster
– Centroid: distance between the centers/centroids (mean values) of the clusters
– Average: average distance between all pairs of observations not in the same cluster
– Ward's: uses the error sum of squares (ESS), the sum of squared differences between individual observations and their cluster mean; measures the loss of information that occurs when observations are clustered
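A minimal sketch of these ingredients in base R, again using USArrests as a stand-in for a generic numeric data frame; the linkage names follow hclust()'s conventions (base R spells Ward's method "ward.D2").

x <- scale(USArrests)                  # z-score standardization
d <- dist(x, method = "euclidean")     # or method = "manhattan"

# Fit AGNES with several linkage methods and compare the trees.
# (Centroid linkage is conventionally run on squared Euclidean distances.)
for (link in c("single", "complete", "centroid", "average", "ward.D2")) {
  hc <- hclust(d, method = link)
  plot(hc, main = paste("Linkage:", link), xlab = "", sub = "")
}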
11.1: Hierarchical Cluster Analysis (5/14)
11.1: Hierarchical Cluster Analysis (6/14)
• Once AGNES completes the clustering process, the data are usually represented in a treelike structure.
– Called a dendrogram
– Branches are clusters
– An observation is a "leaf"
– Visually inspect the clustering result and determine the appropriate number of clusters (see the R sketch below)
• The height of each branch (cluster) or sub-branch (sub-cluster) indicates how dissimilar it is from the other branches or sub-branches with which it is merged.
• The greater the height, the more distinctive the cluster is from the other clusters.
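A short R sketch of reading and cutting a dendrogram, continuing the illustrative USArrests example; the choice of four clusters is arbitrary.

hc <- hclust(dist(scale(USArrests)), method = "ward.D2")
plot(hc)                                 # branch heights show dissimilarity
rect.hclust(hc, k = 4, border = "red")   # outline a 4-cluster solution
clusters <- cutree(hc, k = 4)            # assign observations to clusters
table(clusters)                          # cluster sizes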
11.1: Hierarchical Cluster Analysis (7/14)
11.1: Hierarchical Cluster Analysis (8/14)
• Relying solely on the height of a dendrogram tree branch may
lead to statistically distinctive clusters that have little or no
practical meaning.
• We often take into account both quantitative measures (such as
a dendrogram) and practical considerations to determine the
number of clusters.
• We should also review the profile of each cluster using descriptive statistics (see the R sketch below).
• Another common approach to profiling clusters is to incorporate variables that were not used in the clustering but are of interest to the decision maker.
• The ability of a clustering method to discover useful hidden patterns in the data depends on how it is implemented: data transformations, distance measures, algorithm, and linkage method.
• Try several techniques and use the one that makes the most sense.
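As referenced above, a minimal sketch of cluster profiling with descriptive statistics, continuing the USArrests illustration and assuming the cluster assignments from cutree():

# Mean of each variable within each cluster (a basic cluster profile).
aggregate(USArrests, by = list(cluster = clusters), FUN = mean)

# Variables not used in the clustering can be profiled the same way by
# merging them into the data frame before calling aggregate().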
11.1: Hierarchical Cluster Analysis (9/14)
• Example: Consider the crime rate, median income, and poverty rate for 41 cities.
11.1: Hierarchical Cluster Analysis (10/14)
• With Excel
11.1: Hierarchical Cluster Analysis (11/14)
• With Excel
11.1: Hierarchical Cluster Analysis (12/14)
• With R
11.1: Hierarchical Cluster Analysis (13/14)
• With R
11.1: Hierarchical Cluster Analysis (14/14)
• With R
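The Excel and R screenshots are not reproduced here. The R steps would look roughly like the following sketch; the file name and column names (City, CrimeRate, MedianIncome, PovertyRate) are assumptions for illustration, not the textbook's exact data set.

cities <- read.csv("CrimeData.csv")        # hypothetical file name
vars <- c("CrimeRate", "MedianIncome", "PovertyRate")
x <- scale(cities[, vars])                 # standardize the three variables

hc <- hclust(dist(x), method = "ward.D2")  # AGNES with Ward's linkage
plot(hc, labels = cities$City)             # dendrogram labeled by city

cities$cluster <- cutree(hc, k = 3)        # e.g., a three-cluster solution
aggregate(cities[, vars], by = list(cluster = cities$cluster), FUN = mean)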
11.3: Association Rule Analysis (1/9)
• Association rule analysis is essentially a “what goes with what” study.
– Designed to identify events that tend to occur together
– Also known as affinity analysis or market basket analysis
• Classic application of market basket analysis: retail companies seek to
identify products that consumers tend to purchase together.
– Display products next to each other on a shelf
– Develop promotional campaigns to cross-sell or up-sell
• Other examples
– Improve sales and customer service
– Help diagnose illnesses based on different symptoms that occur together
• Association rules are If-Then logical statements that represent
relationships among different items or item sets.
– Designed to identify hidden patterns and co-occurring events in data
– The If part is the antecedent; the Then part is the consequent
– Antecedents and consequents can comprise a single product or a combination of products
– A product or a combination of products is referred to as an item or an item set
11.3: Association Rule Analysis (2/9)
• One inherent problem with searching for hidden relationships between
items or item sets is dealing with the extremely large number of
possible combinations.
• Let $n$ be the number of items. The number of possible rules increases exponentially: $3^n - 2^{n+1} + 1$.
– Example: 100 items gives about $5.15 \times 10^{47}$ possible combinations
– The search problem becomes extremely computationally intensive and time-consuming.
• There are several algorithms that can be used to perform association
rule analysis in a more efficient manner. They all focus on the
frequency of item sets.
• One of the most widely used algorithms is called the Apriori method.
– Designed to recursively generate item sets that exceed a predetermined frequency
threshold: the support of the item or item set.
– Set a minimum support value, below which an item or item set is excluded, thus
making the analysis more computationally feasible.
– Eliminating infrequent items that fall below the support value makes it easier to analyze the relevant information in a large data set.
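A minimal sketch of the Apriori method in R using the arules package; the file name, separator, and threshold values are illustrative assumptions.

library(arules)

# The combinatorial explosion for n = 100 items:
n <- 100
3^n - 2^(n + 1) + 1                        # about 5.15e47 possible rules

# Read basket-format data and mine rules above minimum support/confidence.
trans <- read.transactions("transactions.csv", format = "basket", sep = ",")
rules <- apriori(trans, parameter = list(support = 0.05, confidence = 0.50))
inspect(head(sort(rules, by = "lift")))    # strongest associations first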
11.3: Association Rule Analysis (3/9)
• With enough data, we can propose many of these If-Then association rules.
– We need a way to evaluate the effectiveness of these rules
– Only the strong associations that occur frequently have the potential to reappear consistently in
the future
• Support: the probability that the If-Then statement occurs
$$\text{Support} = \frac{\text{Number of transactions including both antecedent and consequent}}{\text{Total number of transactions}}$$
• Confidence of the association rule: the probability that both the antecedent and the consequent occur, given that the antecedent occurs
$$\text{Confidence} = \frac{\text{Number of transactions including both antecedent and consequent}}{\text{Number of transactions including antecedent}}$$
• Both of these can be misleading if the antecedent and consequent are common yet unrelated.
• The lift ratio evaluates the strength of the association:
$$\text{Lift ratio} = \frac{\text{Confidence}}{\text{Expected confidence}}, \qquad \text{Expected confidence} = \frac{\text{Number of transactions including consequent}}{\text{Total number of transactions}}$$
– Compares the confidence of the association rule with the overall, unconditional probability of the consequent
– Lift = 1: the level of association is the same as with no rule at all (random guessing)
– Lift > 1: strong (positive) association
– Lift between 0 and 1: negative association
11.3: Association Rule Analysis (4/9)
• Example: Consider a table of 10 cosmetics transactions, in which 7 include mascara, 6 include eye liner, and 5 include both.
• For the association rule {mascara} => {eye liner}, compute the support, confidence, and lift ratio.
11.3: Association Rule Analysis (5/9)
• $\text{Support} = \frac{5}{10} = 0.50$
• $\text{Confidence} = \frac{5}{7} \approx 0.71$
• $\text{Expected confidence} = \frac{6}{10} = 0.60$
• $\text{Lift ratio} = \frac{0.71}{0.60} \approx 1.19$
• The lift ratio is greater than 1, indicating a strong association between the purchase of mascara and eye liner.
• The association is 19% stronger than guessing at random.
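These figures can be verified by hand in R, assuming the counts reconstructed above: 10 transactions, 7 containing mascara, 6 containing eye liner, and 5 containing both.

n_total <- 10; n_mascara <- 7; n_eyeliner <- 6; n_both <- 5

support    <- n_both / n_total             # 0.50
confidence <- n_both / n_mascara           # ~0.71
expected   <- n_eyeliner / n_total         # 0.60
lift       <- confidence / expected        # ~1.19

round(c(support = support, confidence = confidence, lift = lift), 2)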
11.3: Association Rule Analysis (6/9)
• Example: The store manager at an electronics store collects data on the last 100 transactions. Five possible products were purchased: a keyboard, an SD card, a mouse, a USB drive, and/or headphones.
11.3: Association Rule Analysis (7/9)
• With Excel
11.3: Association Rule Analysis (8/9)
• With R
11.3: Association Rule Analysis (9/9)
• With R
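The Excel and R screenshots are not reproduced. A hedged sketch of the R analysis, assuming the 100 transactions are stored in a basket-format file named Electronics.csv:

library(arules)

trans <- read.transactions("Electronics.csv", format = "basket", sep = ",")
summary(trans)                             # expect 100 transactions, 5 items
itemFrequencyPlot(trans)                   # support of each individual item

rules <- apriori(trans, parameter = list(support = 0.10, confidence = 0.50))
inspect(sort(rules, by = "lift"))          # rules ranked by lift ratio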
