MBA in Python - 2
MARKET BASKET ANALYSIS IN PYTHON
Isaiah Hull
Economist
When support is misleading
TID Transaction
... ...
$$\text{Confidence}(X \rightarrow Y) = \frac{\text{Support}(X \& Y)}{\text{Support}(X)}$$
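$$\frac{\text{Support}(\text{Milk} \& \text{Coffee})}{\text{Support}(\text{Milk})} = \frac{0.20}{0.80} = 0.25$$
$$\frac{\text{Support}(\text{Milk} \& \text{Coffee})}{\text{Support}(\text{Milk})} = \frac{0.20}{0.20} = 1.00$$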
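As a minimal sketch of the ratio above, the snippet below computes support and confidence on a small made-up list of transactions. The data and the helper names `support` and `confidence` are illustrative assumptions, not taken from the slides; the toy baskets are chosen so that Support(Milk&Coffee) = 0.20 and Support(Milk) = 0.80.

```python
# Hypothetical toy transactions (not the course data).
transactions = [
    {"milk", "coffee"},
    {"milk", "bread"},
    {"milk"},
    {"milk", "tea"},
    {"tea"},
]

def support(items, transactions):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(X&Y) / Support(X) for the rule X -> Y."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

# Rule milk -> coffee: 0.20 / 0.80
print(confidence({"milk"}, {"coffee"}, transactions))
# Reverse rule coffee -> milk: same numerator, different denominator
print(confidence({"coffee"}, {"milk"}, transactions))
```

The two calls share the same joint support but divide by different antecedent supports, which is why confidence depends on the direction of the rule.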
Lift provides another metric for evaluating the relationship between items.
Numerator: Proportion of transactions that contain X and Y.
$$\frac{\text{Support}(X \& Y)}{\text{Support}(X)\,\text{Support}(Y)}$$
   Hunger  Gatsby
0   False    True
1   False    True
2   False   False
3   False    True
4   False    True
Dataset: GoodBooks-10K.
# Print results.
print(supportG, confidence, lift)
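The variables printed above are not defined on the slide. One way they could be computed is sketched below, assuming each book is a column of purchase indicators. The columns here are plain Python lists with made-up values (the slide shows only five rows of the real data), and `supportH` and `supportHG` are hypothetical names introduced for the sketch.

```python
# Hypothetical one-hot columns; values are invented for illustration.
hunger = [False, True, True, False, True, False, True, False, False, True]
gatsby = [True, True, False, True, True, False, True, False, False, False]

n = len(hunger)

# Support of each book and of the pair.
supportH = sum(hunger) / n
supportG = sum(gatsby) / n
supportHG = sum(h and g for h, g in zip(hunger, gatsby)) / n

# Confidence and lift for the rule Hunger -> Gatsby.
confidence = supportHG / supportH
lift = supportHG / (supportH * supportG)

# Print results.
print(supportG, confidence, lift)
```

Comparing `confidence` with `supportG` shows whether buying the first book raises the chance of buying the second; a lift above 1 says the same thing in ratio form.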
Building on simpler metrics
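$$\text{Support}(X) = \frac{\text{Frequency}(X)}{N}$$
$$\text{Support}(X \rightarrow Y) = \frac{\text{Frequency}(X \& Y)}{N}$$
$$\text{Confidence}(X \rightarrow Y) = \frac{\text{Support}(X \rightarrow Y)}{\text{Support}(X)}$$
$$\text{Lift}(X \rightarrow Y) = \frac{\text{Support}(X \rightarrow Y)}{\text{Support}(X)\,\text{Support}(Y)}$$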
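To illustrate how each metric builds on the previous one, the sketch below starts from raw frequency counts and works up to lift. All counts are hypothetical and the variable names are assumptions made for this example.

```python
# Hypothetical counts, chosen only for illustration.
N = 200             # total number of transactions
frequency_x = 80    # transactions containing X
frequency_y = 50    # transactions containing Y
frequency_xy = 30   # transactions containing both X and Y

# Support: frequency divided by the number of transactions.
support_x = frequency_x / N
support_y = frequency_y / N
support_xy = frequency_xy / N

# Confidence and lift for the rule X -> Y.
confidence_xy = support_xy / support_x
lift_xy = support_xy / (support_x * support_y)

# Lift can equivalently be written as confidence divided by Support(Y).
assert abs(lift_xy - confidence_xy / support_y) < 1e-12

print(support_x, support_xy, confidence_xy, lift_xy)
```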
$$\text{Leverage}(X \rightarrow Y) = \text{Support}(X \& Y) - \text{Support}(X)\,\text{Support}(Y)$$
0.018
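A minimal sketch of the leverage formula follows, assuming two equal-length columns of purchase indicators. The function name `leverage` and the data are hypothetical and do not reproduce the value shown above.

```python
def leverage(x, y):
    """Support(X&Y) - Support(X) * Support(Y) for two boolean columns."""
    n = len(x)
    support_x = sum(x) / n
    support_y = sum(y) / n
    support_xy = sum(a and b for a, b in zip(x, y)) / n
    return support_xy - support_x * support_y

# Hypothetical purchase indicators for two items.
x = [True, True, True, False, False, True, False, False, True, False]
y = [True, True, False, False, False, True, True, False, False, False]

print(leverage(x, y))
```

Leverage uses the same three supports as lift but takes a difference instead of a ratio, so a positive leverage corresponds to a lift above 1 and zero corresponds to a lift of exactly 1.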
$$\text{Conviction}(X \rightarrow Y) = \frac{\text{Support}(X)\,\text{Support}(\bar{Y})}{\text{Support}(X \& \bar{Y})}$$
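# Compute conviction: Support(X) * Support(not Y) / Support(X and not Y),
# where supportT, supportnP, and supportTnP hold those three supports.
conviction = (supportT * supportnP) / supportTnP
print(conviction)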
1.16
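The inputs to the conviction calculation are not shown on the slide. The sketch below assumes they are derived from the ordinary supports through two identities: Support(Ȳ) = 1 − Support(Y) and Support(X&Ȳ) = Support(X) − Support(X&Y). The function name and the example numbers are hypothetical.

```python
def conviction(support_x, support_y, support_xy):
    """Conviction of X -> Y from the supports of X, Y, and X&Y."""
    # Support of NOT Y.
    support_not_y = 1.0 - support_y
    # Support of X AND NOT Y: X occurs, but Y does not.
    support_x_not_y = support_x - support_xy
    if support_x_not_y == 0:
        # X never occurs without Y, so the ratio is unbounded.
        return float("inf")
    return support_x * support_not_y / support_x_not_y

# Hypothetical supports, chosen only for illustration.
print(conviction(support_x=0.5, support_y=0.5, support_xy=0.3))
```

A value above 1 means X appears without Y less often than independence would predict, which supports the rule X → Y.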
Using dissociation to pair ebooks
1 Zhang, T. (2000). Association Rules. Proceedings of the 4th Pacific-Asia Conference, PAKDD, pp. 245-256. Kyoto, Japan.
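$$\text{Zhang}(A \rightarrow B) = \frac{\text{Confidence}(A \rightarrow B) - \text{Confidence}(\bar{A} \rightarrow B)}{\max\left(\text{Confidence}(A \rightarrow B),\ \text{Confidence}(\bar{A} \rightarrow B)\right)}$$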
$$\text{Confidence} = \frac{\text{Support}(A \& B)}{\text{Support}(A)}$$
0.08903
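The ratio of confidences above can be sketched directly from three supports. The code assumes 0 < Support(A) < 1 and that at least one of the two confidences is positive, and it uses the identity Support(Ā&B) = Support(B) − Support(A&B). The function name `zhang` and the inputs are hypothetical, and they do not reproduce the value shown above.

```python
def zhang(support_a, support_b, support_ab):
    """Zhang's metric for A -> B, computed from confidences."""
    # Confidence(A -> B) = Support(A&B) / Support(A)
    conf_a_b = support_ab / support_a
    # Confidence(not A -> B) = Support(not A & B) / Support(not A)
    conf_not_a_b = (support_b - support_ab) / (1.0 - support_a)
    return (conf_a_b - conf_not_a_b) / max(conf_a_b, conf_not_a_b)

# Hypothetical supports: the first pair is associated, the second dissociated.
print(zhang(0.5, 0.5, 0.3))
print(zhang(0.5, 0.5, 0.1))
```

Positive values indicate association and negative values indicate dissociation, since the numerator compares how often B follows A with how often B follows the absence of A.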
Overview of market basket analysis
Standard procedure for market basket analysis.
1. Generate large set of rules.