Statistical Computing With R: Masters in Data Sciences 503 (S28) Third Batch, SMS, TU, 2024
• You get a client who runs a retail store and gives you data for all transactions, consisting of items bought in the store by several customers over a period of time.
• Your client then asks you to use that data to help boost their business.
• Your client will use your findings not only to change/update/add items in inventory but also to change the layout of the physical store or of an online store.
• To find results that will help your client, you will use Market Basket Analysis (MBA), which uses Association Rule Mining on the given transaction data.
Uses of association rule mining results:
https://www.datacamp.com/community/tutorials/market-basket-analysis-r
• etc.
Association rule mining: If => Then analysis
https://www.datacamp.com/community/tutorials/market-basket-analysis-r
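The "If => Then" idea can be made concrete by computing the three usual rule measures by hand. The sketch below uses base R only. The five baskets are an assumed toy data set (the raw transactions are not listed on these slides), and `support()` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Assumed toy data: five baskets (hypothetical, not taken from the slides)
baskets <- list(
  c("bread", "milk"),
  c("bread", "diapers", "beer", "eggs"),
  c("milk", "diapers", "beer", "cola"),
  c("bread", "milk", "diapers", "beer"),
  c("bread", "milk", "diapers", "cola")
)

# Hypothetical helper: share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Rule under study: IF {beer} THEN {diapers}
lhs <- "beer"
rhs <- "diapers"

supp_rule <- support(c(lhs, rhs), baskets)      # share of baskets with lhs and rhs
conf_rule <- supp_rule / support(lhs, baskets)  # share of lhs baskets that also hold rhs
lift_rule <- conf_rule / support(rhs, baskets)  # confidence relative to rhs support

print(c(support = supp_rule, confidence = conf_rule, lift = lift_rule))
```

A lift above 1 suggests that the two item sets appear together more often than they would if they were independent; a lift below 1 suggests the opposite.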
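beer_rules_rhs <- apriori(trans,
parameter = list(supp=0.3, conf=0.5, maxlen=10, minlen=2),
appearance = list(default="lhs", rhs="beer"))
#Inspect
inspect(beer_rules_rhs)
Output (only the rule shown on the slide):
lhs rhs support confidence coverage lift count
[5] {diapers, milk} => {beer} 0.4 0.6666667 0.6 1.1111111 2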
Let’s set LHS rule for “trans” data:
#For example, to analyze what items customers also buy when they buy {beer},
#we set lhs=beer and default=rhs:
beer_rules_lhs <- apriori(trans,
parameter = list(supp=0.3, conf=0.5, maxlen=10, minlen=2),
appearance = list(lhs="beer", default="rhs"))
#Inspect the result:
inspect(beer_rules_lhs)
Output:
lhs rhs support confidence coverage lift count
[1] {beer} => {bread} 0.4 0.6666667 0.6 0.8333333 2
[2] {beer} => {milk} 0.4 0.6666667 0.6 0.8333333 2
[3] {beer} => {diapers} 0.6 1.0000000 0.6 1.2500000 3
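The `trans` object is used above without being constructed, so here is a minimal sketch of one way to build it and rerun the LHS call. It assumes the `arules` package is installed. The five baskets are an assumed toy data set, not the course data, and `rule_table` is a name introduced only for this illustration.

```r
library(arules)

# Assumed toy baskets (hypothetical; the slides do not list the raw data)
baskets <- list(
  c("bread", "milk"),
  c("bread", "diapers", "beer", "eggs"),
  c("milk", "diapers", "beer", "cola"),
  c("bread", "milk", "diapers", "beer"),
  c("bread", "milk", "diapers", "cola")
)

# Coerce the list into the sparse "transactions" class that apriori() expects
trans <- as(baskets, "transactions")

# Same thresholds as on the slide; only {beer} may appear on the left-hand side
beer_rules_lhs <- apriori(
  trans,
  parameter  = list(supp = 0.3, conf = 0.5, minlen = 2, maxlen = 10),
  appearance = list(lhs = "beer", default = "rhs"),
  control    = list(verbose = FALSE)
)

# Collect the rule labels and quality measures in a plain data frame
rule_table <- data.frame(rule = labels(beer_rules_lhs), quality(beer_rules_lhs))
print(rule_table)
```

With these assumed baskets, the rules whose lift is below 1 are the ones where buying beer makes the right-hand item less likely than its overall frequency would suggest.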
Product recommendation rule:
#Product recommendation rule
rules_conf <- sort(rules, by="confidence", decreasing=TRUE)
#inspect the rule
# show the support, lift and confidence for all rules
inspect(head(rules_conf))
Output:
lhs rhs support confidence coverage lift count
[1] {cola} => {milk} 0.4 1 0.4 1.25 2
[2] {cola} => {diapers} 0.4 1 0.4 1.25 2
[3] {beer} => {diapers} 0.6 1 0.6 1.25 3
[4] {cola, milk} => {diapers} 0.4 1 0.4 1.25 2
[5] {cola, diapers} => {milk} 0.4 1 0.4 1.25 2
[6] {beer, milk} => {diapers} 0.4 1 0.4 1.25 2
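To show how such rules could drive a recommendation, the sketch below retypes the six rules from the output above into a plain data frame and looks up which of them apply to a customer's basket. It uses base R only. `rules_df` and `recommend()` are hypothetical names introduced for this illustration; they are not part of `arules`.

```r
# Rules retyped from the inspect() output on the slide
rules_df <- data.frame(
  lhs = c("cola", "cola", "beer", "cola,milk", "cola,diapers", "beer,milk"),
  rhs = c("milk", "diapers", "diapers", "diapers", "milk", "diapers"),
  confidence = 1,
  lift = 1.25,
  stringsAsFactors = FALSE
)

# Hypothetical helper: suggest the rhs of every rule whose lhs is fully in the basket
recommend <- function(basket, rules_df) {
  lhs_items <- strsplit(rules_df$lhs, ",", fixed = TRUE)
  fires <- vapply(lhs_items, function(items) all(items %in% basket), logical(1))
  # Drop duplicates and items the customer already has
  setdiff(unique(rules_df$rhs[fires]), basket)
}

print(recommend(c("beer", "bread"), rules_df))
print(recommend("cola", rules_df))
```

In practice one would probably rank the suggestions by confidence or lift; here all six rules share the same values, so the order carries no meaning.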
Plotting rules with “arulesViz” package:
• library(arulesViz)
• plot(rules)
Plotting rules with “arulesViz” package:
• plot(rules, measure = "confidence")
Plotting rules with “arulesViz” package:
• plot(rules, method = "two-key plot")
Interactive plot with “plotly” engine:
• #Interactive plot
• plot(rules, engine = "plotly")
Graph based visualization:
#Graph based visualization
subrules <- head(rules, n = 10, by = "confidence")
plot(subrules, method = "graph", engine = "htmlwidget")
Parallel coordinate plot for 10 rules:
#Parallel coordinate plot
• plot(subrules, method = "paracoord")
More here:
• Like the one we did before:
• https://www.kirenz.com/post/2020-05-14-r-association-rule-mining/
• https://www.youtube.com/watch?v=91CmrpD-4Fw
Question/queries?
Next class:
• Monte Carlo Simulations
• Class imbalance problem
• Statistical approach
• Data sciences approach
Thank you!
@shitalbhandary