Unit2 AssociationAnalysis V2
Association Analysis
Basic concept, use of association analysis, Apriori algorithm, pruning
● Frequent Itemset
– An itemset whose support is greater than or equal to a minimum support threshold.
● Pros
– Simple computation
– Easy implementation
– Works well for a small number of itemsets
● Cons
– Computationally expensive for a large number of itemsets
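The definition of a frequent itemset can be illustrated with a minimal Python sketch. It assumes that the pros and cons above describe exhaustive enumeration of every candidate itemset; the slides do not name the approach. The toy transactions and the names support and brute_force_frequent_itemsets are invented for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def brute_force_frequent_itemsets(transactions, min_support):
    """Test all 2^d - 1 non-empty itemsets over d items; keep the frequent ones."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:  # the frequent-itemset condition
                frequent[frozenset(candidate)] = s
    return frequent


frequent = brute_force_frequent_itemsets(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With 6 distinct items the loop already tests 63 candidates, and every candidate is compared against every transaction. This is why such an approach is easy to write but becomes expensive as the number of items grows.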
MDS 602 (Advanced Data Mining)
2024, rughimire Master’s in Data Science Unit 2: Association Analysis
Frequent Itemset Generation
● Formulate ways to
– Reduce the number of candidates
– Reduce the number of transactions
– Reduce the number of comparisons
● Pros
– Faster computation
– Faster convergence toward a solution
● Cons
– Still slow (run time depends mostly on the min_support threshold)
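Reducing the number of candidates is the idea behind Apriori pruning: a k-itemset can only be frequent if all of its (k-1)-subsets are frequent. The following is a minimal sketch of that level-wise search, not a tuned implementation. The toy transactions and the function name apriori are assumptions made for illustration, and min_count is an absolute count standing in for the min_support threshold.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with subset-based candidate pruning."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions for the survivors.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


result = apriori(transactions, min_count=3)
for itemset, c in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), c)
```

On this data only one 3-itemset survives the prune step and has to be counted, instead of all 20 possible 3-itemsets. A lower min_count leaves more frequent itemsets at each level, so fewer candidates are pruned, which is why run time depends so strongly on the threshold.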
● The algorithm represents the data in a tree structure known as the FP-tree, which maintains the association information between the frequent items.
● It then splits the database into a set of conditional databases (a special kind of projected database), each of which is associated with one frequent item.
– Similarly, FP-growth builds the conditional pattern base table for every item in the FP-tree.
– Only item c appears three times and so satisfies the minimum support requirement.
– The algorithm therefore removes every item except c.
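The FP-tree and conditional pattern base steps can be sketched in Python as follows. This is a minimal illustration, not the implementation from the referenced tutorial. The class FPNode, the helper names, and the toy transactions are assumptions, and it is also an assumption that the data resembles the example behind the statement about item c.

```python
from collections import Counter, defaultdict

# Illustrative toy transactions; not taken from the slides.
transactions = [
    ["f", "a", "c", "d", "g", "i", "m", "p"],
    ["a", "b", "c", "f", "l", "m", "o"],
    ["b", "f", "h", "j", "o"],
    ["b", "c", "k", "s", "p"],
    ["a", "f", "c", "e", "l", "p", "m", "n"],
]


class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Return (root, header); header maps each frequent item to its tree nodes."""
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_count}
    root = FPNode(None, None)
    header = defaultdict(list)
    for t in transactions:
        # Keep frequent items only, ordered by descending count (ties by name).
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
            node.count += 1  # shared prefixes accumulate counts
    return root, header


def conditional_pattern_base(item, header):
    """Prefix paths (root side first) leading to `item`, with their counts."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base


def conditional_frequent_items(base, min_count):
    """Items in a conditional pattern base that still meet the minimum count."""
    counts = Counter()
    for path, count in base:
        for item in path:
            counts[item] += count
    return {i: c for i, c in counts.items() if c >= min_count}


root, header = build_fp_tree(transactions, min_count=3)
base_p = conditional_pattern_base("p", header)
kept_p = conditional_frequent_items(base_p, min_count=3)
print(base_p)
print(kept_p)
```

For item p in this data, the conditional pattern base holds the prefix paths c, f, a, m (count 2) and c, b (count 1). Only c reaches a count of three, so every other item is dropped from the conditional tree for p.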
● Calculate the support and confidence for the generated frequent patterns, as in the Apriori algorithm.
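As a minimal sketch of this last step, the code below derives rules from one frequent itemset using the standard definitions: the support of a rule X -> Y is the fraction of transactions containing both X and Y, and its confidence is count(X and Y) / count(X). The toy transactions, the function names, and the min_confidence parameter are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= frozenset(t) for t in transactions)


def association_rules(itemset, transactions, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules from one itemset."""
    itemset = frozenset(itemset)
    n = len(transactions)
    joint = count(itemset, transactions)
    if joint == 0:
        return []  # the itemset never occurs, so no rule can be formed
    rules = []
    for k in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), k):
            antecedent = frozenset(antecedent)
            confidence = joint / count(antecedent, transactions)
            if confidence >= min_confidence:
                rules.append((set(antecedent), set(itemset - antecedent),
                              joint / n, confidence))
    return rules


rules = association_rules({"beer", "diaper"}, transactions, min_confidence=0.8)
for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

Here beer -> diaper has confidence 3/3, while diaper -> beer has confidence 3/4 and is filtered out by the 0.8 threshold, even though both rules have the same support.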
● Reference: https://hands-on.cloud/implementation-of-fp-growth-algorithm-using-python/