Week 7 Assignment 1

This document contains 10 multiple choice questions about frequent itemset mining and association rule mining. Key concepts covered include: 1. Choosing appropriate evaluation metrics like accuracy, precision, recall for classification problems with imbalanced classes. 2. Properties of frequent itemsets like the Apriori property and how it can be used to determine candidate itemsets. 3. Calculating support and confidence of association rules derived from frequent itemsets. 4. Relationships between different types of frequent itemsets like maximal and closed itemsets.

Assignment 7 (Sol.)
Introduction to Data Analytics
Prof. Nandan Sudarsanam & Prof. B. Ravindran

1. Imagine you are working with the NPTEL course management team and want to develop a
machine learning algorithm that predicts the number of views on the courses. Your analysis is
based on features like the name of the instructor, the number of courses taught by the same
instructor on NPTEL in the past, and a few other features. Which of the following evaluation
metrics would you choose in this case?
(a) mean square error
(b) classification accuracy
(c) F1 score
(d) precision
(e) recall
Sol. (a)
The number of views on a course is a continuous target variable, so this falls under the
regression problem. Hence, mean squared error should be used as the evaluation metric.
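As a quick illustration, mean squared error averages the squared differences between actual and predicted values; the view counts below are made-up numbers, not NPTEL data:

```python
# Illustrative sketch: mean squared error for a regression target
# such as predicted view counts (the values are made up).

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

actual_views = [1200, 3400, 560, 8900]
predicted_views = [1000, 3600, 700, 8500]

print(mean_squared_error(actual_views, predicted_views))  # 64900.0
```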
2. Imagine you are solving a multiclass classification problem with highly imbalanced classes. The
class distribution is such that you observe the majority class 99% of the time in the
training data. Your model has 99% accuracy on the test data. Which
of the following is true in such a case?
1) Accuracy is not a good metric for imbalanced class problems.
2) Accuracy is a good metric for imbalanced class problems.
3) Precision and Recall are good metrics for imbalanced class problems.
4) Precision and Recall are not good metrics for imbalanced class problems.
(a) 1 and 2
(b) 2 and 3
(c) 1 and 3
(d) 2 and 4
(e) 3 and 4
(f) 1 and 4
Sol. (c)
In an imbalanced data set, accuracy should not be used as a measure of performance, because
99% accuracy (as given) might be obtained by predicting only the majority class correctly, while
our class of interest is the minority class (1%). Hence, in order to evaluate model performance,
we should use precision, recall, and the F measure to determine the class-wise performance of
the classifier.
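The point can be made concrete with a small sketch: on data with a 99:1 class split, a model that always predicts the majority class scores 99% accuracy yet has zero recall on the minority class (the numbers below are illustrative):

```python
# Illustrative sketch: why accuracy misleads on imbalanced classes.
# A classifier that always predicts the majority class "0" reaches 99%
# accuracy but has zero recall on the minority class "1".

y_true = [0] * 99 + [1]   # 99% majority class, 1% minority class
y_pred = [0] * 100        # model always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
recall = tp / (tp + fn)   # recall for the minority class

print(accuracy)  # 0.99
print(recall)    # 0.0
```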

3. Imagine you are working on a project which is a binary classification problem. You trained a
model on the training dataset and obtained the confusion matrix below on the validation dataset
(the counts are reconstructed from the figures used in the solution):

Predicted No Predicted Yes
Actual No 50 10
Actual Yes 5 100
Based on the above confusion matrix, choose which option(s) is true among the following?
1) Accuracy is 0.91
2) Misclassification rate is 0.91
3) False positive rate is 0.95
4) True positive rate is 0.95

(a) 1 and 2
(b) 2 and 3
(c) 1 and 3
(d) 2 and 4
(e) 3 and 4
(f) 1 and 4

Sol. (f)
The accuracy (rate of correct classification) is (50 + 100)/165, which is nearly equal to 0.91;
the misclassification rate is therefore 15/165 ≈ 0.09, not 0.91.
The true positive rate measures how often the positive class is predicted correctly, so it is
100/105 = 0.95; this quantity is also known as "sensitivity" or "recall". The false positive
rate is 10/60 ≈ 0.17, not 0.95.
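The calculations can be checked with a short sketch; the confusion-matrix counts are reconstructed from the figures in the solution (TN = 50, FP = 10, FN = 5, TP = 100), since the original matrix image is not reproduced here:

```python
# Sketch of the metric calculations. The confusion-matrix counts below are
# reconstructed from the figures in the solution (an assumption):
# TN = 50, FP = 10, FN = 5, TP = 100, total = 165.

tn, fp, fn, tp = 50, 10, 5, 100
total = tn + fp + fn + tp

accuracy = (tp + tn) / total                 # (100 + 50) / 165 ≈ 0.91
misclassification_rate = (fp + fn) / total   # 15 / 165 ≈ 0.09
true_positive_rate = tp / (tp + fn)          # 100 / 105 ≈ 0.95 (recall)
false_positive_rate = fp / (fp + tn)         # 10 / 60 ≈ 0.17

print(round(accuracy, 2), round(true_positive_rate, 2))  # 0.91 0.95
```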

4. In identifying frequent itemsets in a transactional database, we find the following to be the
frequent 3-itemsets: {B, D, E}, {C, E, F}, {B, C, D}, {A, B, E}, {D, E, F}, {A, C, F}, {A,
C, E}, {A, B, C}, {A, C, D}, {C, D, E}, {C, D, F}, {A, D, E}. Which among the following
4-itemsets can possibly be frequent?

(a) {A, B, C, D}
(b) {A, B, D, E}
(c) {A, C, E, F}
(d) {C, D, E, F}

Sol. (d)
By the Apriori property, only the itemset {C, D, E, F} can possibly be frequent, since all of its
subsets of size 3 are listed as frequent. The other 4-itemsets cannot be frequent since not all
of their size-3 subsets are frequent. For example, for option (a), the subset {A, B, D}
is not frequent.
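The pruning argument above can be sketched in a few lines; the check simply tests whether every 3-subset of a candidate 4-itemset appears in the given frequent list:

```python
# Sketch of the Apriori check used above: a k-itemset can be frequent only
# if every one of its (k-1)-subsets is frequent.

from itertools import combinations

frequent_3 = {frozenset(s) for s in [
    "BDE", "CEF", "BCD", "ABE", "DEF", "ACF",
    "ACE", "ABC", "ACD", "CDE", "CDF", "ADE",
]}

def can_be_frequent(candidate):
    """True iff all 3-subsets of the 4-itemset are known to be frequent."""
    return all(frozenset(sub) in frequent_3
               for sub in combinations(candidate, 3))

for itemset in ["ABCD", "ABDE", "ACEF", "CDEF"]:
    print(itemset, can_be_frequent(set(itemset)))  # only CDEF is True
```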
5. Consider the following transactional database of 10 transactions.

Transaction ID Item set


T1 AB
T2 BCD
T3 ACDE
T4 ADE
T5 ABC
T6 ABCD
T7 BA
T8 ABC
T9 ABD
T10 BCE

Making use of the Apriori property, find the number of frequent itemsets for a minimum
support of 4 (an itemset with support greater than or equal to 4 is frequent).

(a) 7
(b) 8
(c) 10
(d) 6

Sol. (b)
The frequent 1-itemsets (support ≥ 4) are {A} (8), {B} (8), {C} (6), and {D} (5); {E} has
support 3 and is pruned. The frequent 2-itemsets are {A, B} (6), {A, C} (4), {A, D} (4), and
{B, C} (5); {B, D} and {C, D} each have support 3. The only candidate 3-itemset surviving
the pruning step, {A, B, C}, has support 3 and is not frequent. Hence there are 4 + 4 = 8
frequent itemsets.
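The count can be verified by brute force over the ten transactions; this sketch enumerates every possible itemset rather than following the level-wise Apriori search, but it yields the same frequent itemsets:

```python
# Sketch: brute-force support counting for the transactions in question 5,
# verifying that 8 itemsets reach the minimum support of 4.

from itertools import combinations

transactions = ["AB", "BCD", "ACDE", "ADE", "ABC",
                "ABCD", "BA", "ABC", "ABD", "BCE"]
db = [set(t) for t in transactions]
items = sorted(set().union(*db))

def support(itemset):
    """Number of transactions containing the itemset."""
    return sum(itemset <= t for t in db)

frequent = [set(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if support(set(c)) >= 4]

print(len(frequent))  # 8
```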
6. Consider the following transactional data.

Transaction ID Items
1 A, B, E
2 B, D
3 B, C
4 A, B, D
5 A, C
6 B, C
7 A, C
8 A, B, C, E
9 A, B, C

Assuming that the minimum support is 2, what is the number of frequent 2-itemsets (i.e.,
frequent items sets of size 2)?

(a) 2
(b) 4

(c) 6
(d) 8

Sol. (c)
Candidate 1-itemsets:

itemset support
{A} 6
{B} 7
{C} 6
{D} 2
{E} 2

Frequent 1-itemsets:

itemset support
{A} 6
{B} 7
{C} 6
{D} 2
{E} 2

Candidate 2-itemsets:

itemset support
{A, B} 4
{A, C} 4
{A, D} 1
{A, E} 2
{B, C} 4
{B, D} 2
{B, E} 2
{C, D} 0
{C, E} 1
{D, E} 0

Frequent 2-itemsets:

itemset support
{A, B} 4
{A, C} 4
{A, E} 2
{B, C} 4
{B, D} 2
{B, E} 2

7. For the same data as above, what are the numbers of candidate 3-itemsets and frequent
3-itemsets, respectively?

(a) 1, 1
(b) 2, 2
(c) 2, 1
(d) 3, 2

Sol. (b)
Joining the frequent 2-itemsets gives {A, B, C}, {A, B, E}, {A, C, E}, {B, C, D}, {B, C, E},
and {B, D, E}; Apriori pruning removes every candidate containing an infrequent 2-subset
({C, D}, {C, E}, and {D, E} are not frequent), leaving two candidate 3-itemsets:

itemset support
{A, B, C} 2
{A, B, E} 2

Frequent 3-itemsets:

itemset support
{A, B, C} 2
{A, B, E} 2

8. Continuing with the same data, how many association rules can be derived from the frequent
itemset {A, B, E}? (Note: for a frequent itemset X, consider only rules of the form S → (X-S),
where S is a non-empty subset of X.)

(a) 3
(b) 6
(c) 7
(d) 8

Sol. (b)
A 3-itemset X has 2³ − 2 = 6 non-empty proper subsets S, each yielding one rule S → (X − S):
{A} → {B, E}
{B} → {A, E}
{E} → {A, B}
{A, B} → {E}
{A, E} → {B}
{B, E} → {A}
9. For the same frequent itemset as mentioned above, which among the following rules have a
minimum confidence of 60%?

(a) A ∧ B =⇒ E
(b) A ∧ E =⇒ B
(c) E =⇒ A ∧ B
(d) A =⇒ B ∧ E

Sol. (b), (c)
The confidence values for the above four rules are, respectively, 2/4, 2/2, 2/2, and 2/6. Hence,
only the rules in (b) and (c) have the minimum required confidence of 60%.
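These confidence values can be checked with a short sketch over the transactional data from question 6, using conf(S → X − S) = support(X) / support(S):

```python
# Sketch: confidence of the four rules above, computed from the
# transactional data in question 6.

transactions = [{"A", "B", "E"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"},
                {"A", "C"}, {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"},
                {"A", "B", "C"}]

def support(itemset):
    """Number of transactions containing the itemset."""
    return sum(itemset <= t for t in transactions)

def confidence(antecedent, consequent):
    """conf(S -> X-S) = support(X) / support(S)."""
    return support(antecedent | consequent) / support(antecedent)

print(confidence({"A", "B"}, {"E"}))  # 2/4 = 0.5
print(confidence({"A", "E"}, {"B"}))  # 2/2 = 1.0
print(confidence({"E"}, {"A", "B"}))  # 2/2 = 1.0
print(confidence({"A"}, {"B", "E"}))  # 2/6, below 60%
```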

10. Which of the following statements are true about frequent itemsets in the context of
transactional databases? (Note that more than one statement may be correct.)
(a) Every maximal frequent itemset is a closed frequent itemset.
(b) Every closed frequent itemset is a maximal frequent itemset.
(c) We can recover all frequent itemsets given all maximal frequent itemsets.
(d) We can recover the frequencies of all frequent itemsets, given the frequencies of all maximal
frequent itemsets.
Sol. (a), (c)
A maximal frequent itemset has no frequent superset at all, so in particular no superset with
the same support; hence it is closed, and (a) holds. The converse (b) fails: a closed frequent
itemset may have frequent supersets (of lower support). Since every frequent itemset is a
subset of some maximal frequent itemset, the maximal frequent itemsets are enough to
enumerate all frequent itemsets, so (c) holds; however, they do not determine the supports of
those subsets, so (d) fails.
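Statement (c) can be illustrated with a sketch: for the data in question 6, the maximal frequent itemsets are {A, B, C}, {A, B, E}, and {B, D} (derived from the frequent itemsets listed earlier), and enumerating their subsets recovers all 13 frequent itemsets:

```python
# Sketch: every frequent itemset is a subset of some maximal frequent
# itemset, so the maximal itemsets suffice to enumerate them (statement (c)).
# The maximal itemsets below come from the data in question 6.

from itertools import combinations

maximal = [frozenset("ABC"), frozenset("ABE"), frozenset("BD")]

recovered = set()
for m in maximal:
    for k in range(1, len(m) + 1):
        for sub in combinations(sorted(m), k):
            recovered.add(frozenset(sub))

print(len(recovered))  # 13 frequent itemsets (5 + 6 + 2)
print(sorted("".join(sorted(s)) for s in recovered))
```

Note that the supports of these recovered subsets are not determined by the maximal itemsets alone, which is why statement (d) is false.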
