
Assignment

The document provides statistical calculations including mean, median, mode, midrange, quartiles, variance, and standard deviation based on a dataset. It also discusses probabilities related to a classification problem and analyzes the results for a test sample. Additionally, it outlines the candidate generation and pruning steps of the Apriori algorithm for itemset mining.

Uploaded by

dohoangtruonghuy


Nguyễn Hoàng Đức _ITITIU20190

Q1:

(a) Mean: The mean is calculated by summing all the values and dividing by the total number
of values.
Mean = (53 + 55 + 70 + 58 + 64 + 57 + 53 + 69 + 57 + 68 + 53) / 11 = 59.73

Median: To find the median, we first sort the data:

53, 53, 53, 55, 57, 57, 58, 64, 68, 69, 70. The middle value is the 6th value:
Median = 57

(b) Mode: The mode is the value that occurs most frequently. From the sorted data, 53
appears 3 times, which is more than any other value.
Mode = 53

(d) First Quartile (Q1): Q1 is the median of the lower half of the sorted data, with the overall
median included in that half (53, 53, 53, 55, 57, 57). The median of this group is:
Q1 = (53 + 55) / 2 = 54

Third Quartile (Q3): Q3 is the median of the upper half of the sorted data, again with the overall
median included (57, 58, 64, 68, 69, 70). The median of this group is:
Q3 = (64 + 68) / 2 = 66

(e) Five-Number Summary:

Min = 53, Q1 = 54, Median = 57, Q3 = 66, Max = 70
(g) Variance, Standard Deviation:
Deviations from the mean: [−6.73, −4.73, 10.27, −1.73, 4.27, −2.73, −6.73, 9.27, −2.73, 8.27, −6.73]
Variance = [1/(n−1)] × (sum of squared deviations) = (1/10) × 454.18 = 45.42
Standard Deviation = sqrt(45.42) = 6.74
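The Q1 figures can be cross-checked with Python's standard `statistics` module. This is a minimal sketch and not part of the original answer. It assumes the "inclusive" quantile method, which interpolates between data points; other quartile conventions can give different Q1 and Q3 values for the same data.

```python
import statistics

# Data set from Q1
data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Assumption: "inclusive" quartiles (linear interpolation over the sorted data)
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Sample variance and standard deviation (divide by n - 1)
variance = statistics.variance(data)
stdev = statistics.stdev(data)

five_number_summary = (min(data), q1, median, q3, max(data))

print("mean:", round(mean, 2))
print("median:", median, "mode:", mode)
print("five-number summary:", five_number_summary)
print("variance:", round(variance, 2), "std dev:", round(stdev, 2))
```

`statistics.variance` and `statistics.stdev` use the n−1 denominator, matching the formula in part (g).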

Q2:
Question 5 :

a) From the table:

● Total + : 5 (Records 1, 5, 6, 9, 10)
● Total − : 5 (Records 2, 3, 4, 7, 8)

P(A=1|+) = (Count of A=1 with +) / total + = 3/5 = 0.6 => P(A=0|+) = 1 - 0.6 = 0.4

P(B=1|+) = (Count of B=1 with +) / total + = 1/5 = 0.2 => P(B=0|+) = 1 - 0.2 = 0.8

P(C=1|+) = (Count of C=1 with +) / total + = 4/5 = 0.8 => P(C=0|+) = 1 - 0.8 = 0.2

P(A=1|-) = (Count of A=1 with -) / total - = 2/5 = 0.4 => P(A=0|-) = 1 - 0.4 = 0.6

P(B=1|-) = (Count of B=1 with -) / total - = 2/5 = 0.4 => P(B=0|-) = 1 - 0.4 = 0.6

P(C=1|-) = (Count of C=1 with -) / total - = 5/5 = 1 => P(C=0|-) = 1 - 1 = 0

From the table:

● There are 5 records with class "+" (records 1, 5, 6, 9, 10) and 5 records with class "-"
(records 2, 3, 4, 7, 8).

For Class "+" (Positive):

● The probability of A being 1 given class "+" (P(A=1|+)) is 3 out of 5, which is 0.6, so
P(A=0|+) = 1 - 0.6 = 0.4.
● The probability of B being 1 given class "+" (P(B=1|+)) is 1 out of 5, which is 0.2, so
P(B=0|+) = 1 - 0.2 = 0.8.
● The probability of C being 1 given class "+" (P(C=1|+)) is 4 out of 5, which is 0.8, so
P(C=0|+) = 1 - 0.8 = 0.2.

For Class "-" (Negative):

● The probability of A being 1 given class "-" (P(A=1|-)) is 2 out of 5, which is 0.4, so
P(A=0|-) = 1 - 0.4 = 0.6.
● The probability of B being 1 given class "-" (P(B=1|-)) is 2 out of 5, which is 0.4, so
P(B=0|-) = 1 - 0.4 = 0.6.
● The probability of C being 1 given class "-" (P(C=1|-)) is 5 out of 5, which is 1, so
P(C=0|-) = 1 - 1 = 0.
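Analysis of the probabilities for the test sample (A=0, B=1, C=0):

● For A=0, P(A=0|+) = 0.4, which is less than P(A=0|-) = 0.6.
● For B=1, P(B=1|+) = 0.2, which is less than P(B=1|-) = 0.4.
● For C=0, P(C=0|+) = 0.2, which is greater than P(C=0|-) = 0.

P(A=0|+) < P(A=0|-)

P(B=1|+) < P(B=1|-)

P(C=0|+) > P(C=0|-)

Taken one attribute at a time, A and B favor "-" and C favors "+". Under the naive Bayes
assumption (attributes conditionally independent given the class), the conditionals are
multiplied together with the class priors P(+) = P(-) = 5/10 = 0.5:

P(+) × P(A=0|+) × P(B=1|+) × P(C=0|+) = 0.5 × 0.4 × 0.2 × 0.2 = 0.008
P(-) × P(A=0|-) × P(B=1|-) × P(C=0|-) = 0.5 × 0.6 × 0.4 × 0 = 0

So the class label for the test sample (A=0, B=1, C=0) is "+".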
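A naive Bayes classifier scores each class by multiplying the class prior by the conditional probability of every attribute value in the sample. The sketch below is an illustration, not the author's method. It assumes the conditional probabilities listed above, priors of 5/10 for each class, and conditional independence of A, B and C. The names `priors`, `cond` and `class_score` are made up for this example.

```python
from math import prod

# Class priors: 5 "+" records and 5 "-" records out of 10
priors = {"+": 5 / 10, "-": 5 / 10}

# P(attribute = 1 | class), taken from the probabilities listed above
cond = {
    "+": {"A": 0.6, "B": 0.2, "C": 0.8},
    "-": {"A": 0.4, "B": 0.4, "C": 1.0},
}

def class_score(label, sample):
    """Return P(class) * product of P(attribute value | class)."""
    likelihood = prod(
        cond[label][attr] if value == 1 else 1 - cond[label][attr]
        for attr, value in sample.items()
    )
    return priors[label] * likelihood

sample = {"A": 0, "B": 1, "C": 0}
scores = {label: class_score(label, sample) for label in priors}
predicted = max(scores, key=scores.get)

print(scores)
print("predicted class:", predicted)
```

Because P(C=0|-) is exactly 0, the "-" score is 0 regardless of the other attributes. In practice a smoothed estimate (for example Laplace smoothing) is often used to avoid this.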

Q7)
Support for Itemset {b, d, e}

Count the number of customers who have b, d, and e in their basket:

● Customers with {b, d, e}: 2, 5

● Number of customers with {b, d, e}: 2
● Total customers: 5

Support for {b, d, e} = 2 / 5 = 0.4

Q8:

(a) List all candidate 4-itemsets obtained by the candidate generation

procedure in Apriori.

The candidate generation procedure in the Apriori algorithm creates candidate itemsets of
size k by combining itemsets of size k−1. To generate the candidate 4-itemsets from
the given frequent 3-itemsets, we first combine the frequent 3-itemsets that share 2 items in
common.

Frequent 3-itemsets:

● {1, 2, 3}
● {1, 2, 4}
● {1, 2, 5}
● {1, 3, 4}
● {1, 3, 5}
● {2, 3, 4}
● {2, 3, 5}
● {3, 4, 5}

Candidate 4-itemsets:

To generate 4-itemsets, Apriori merges pairs of frequent 3-itemsets whose first two items (in
sorted order) are identical, so each candidate is produced exactly once. Here are the combinations:

1. Combine {1, 2, 3} and {1, 2, 4}: Resulting 4-itemset = {1, 2, 3, 4}
2. Combine {1, 2, 3} and {1, 2, 5}: Resulting 4-itemset = {1, 2, 3, 5}
3. Combine {1, 2, 4} and {1, 2, 5}: Resulting 4-itemset = {1, 2, 4, 5}
4. Combine {1, 3, 4} and {1, 3, 5}: Resulting 4-itemset = {1, 3, 4, 5}
5. Combine {2, 3, 4} and {2, 3, 5}: Resulting 4-itemset = {2, 3, 4, 5}

Pairs that share two items but not their first two, such as {1, 2, 3} and {1, 3, 4}, are not
merged; they would only reproduce candidates already listed.

Candidate 4-itemsets:

● {1, 2, 3, 4}
● {1, 2, 3, 5}
● {1, 2, 4, 5}
● {1, 3, 4, 5}
● {2, 3, 4, 5}

(b) List all candidate 4-itemsets that survive the candidate pruning step of
the Apriori algorithm.

In the candidate pruning step, any candidate itemset that contains a subset that is not frequent
is discarded. To determine which candidate 4-itemsets survive, we need to check whether
each 4-itemset has all its subsets of size 3 among the given frequent 3-itemsets.

Check each candidate 4-itemset:

1. {1, 2, 3, 4}:
○ Subsets: {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}
○ All subsets are frequent.
○ This itemset survives.
2. {1, 2, 3, 5}:
○ Subsets: {1, 2, 3}, {1, 2, 5}, {1, 3, 5}, {2, 3, 5}
○ All subsets are frequent.
○ This itemset survives.
3. {1, 2, 4, 5}:
○ Subsets: {1, 2, 4}, {1, 2, 5}, {1, 4, 5}, {2, 4, 5}
○ {1, 4, 5} and {2, 4, 5} are not frequent.
○ This itemset is pruned.
4. {1, 3, 4, 5}:
○ Subsets: {1, 3, 4}, {1, 3, 5}, {1, 4, 5}, {3, 4, 5}
○ {1, 4, 5} is not frequent.
○ This itemset is pruned.
5. {2, 3, 4, 5}:
○ Subsets: {2, 3, 4}, {2, 3, 5}, {2, 4, 5}, {3, 4, 5}
○ {2, 4, 5} is not frequent.
○ This itemset is pruned.

Surviving 4-itemsets:

● {1, 2, 3, 4}
● {1, 2, 3, 5}

Final Answer:

● Candidate 4-itemsets (from part a):
○ {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5}, {1, 3, 4, 5}, {2, 3, 4, 5}
● 4-itemsets after pruning (from part b):
○ {1, 2, 3, 4}, {1, 2, 3, 5}
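Both Apriori steps are mechanical enough to check in code. The sketch below is illustrative only and assumes the usual formulation: sorted (k−1)-itemsets are merged when their first k−2 items match, and a candidate is pruned if any of its (k−1)-subsets is not frequent. The function names `generate_candidates` and `prune` are made up for this example.

```python
from itertools import combinations

# Frequent 3-itemsets from the question, stored as sorted tuples
frequent_3 = [
    (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4),
    (1, 3, 5), (2, 3, 4), (2, 3, 5), (3, 4, 5),
]

def generate_candidates(frequent):
    """Merge pairs of sorted (k-1)-itemsets that share their first k-2 items."""
    candidates = []
    for a, b in combinations(sorted(frequent), 2):
        if a[:-1] == b[:-1]:
            # a < b with equal prefixes, so appending b's last item keeps order
            candidates.append(a + (b[-1],))
    return candidates

def prune(candidates, frequent):
    """Keep candidates whose (k-1)-subsets are all frequent."""
    frequent_set = set(frequent)
    return [
        c for c in candidates
        if all(s in frequent_set for s in combinations(c, len(c) - 1))
    ]

candidates_4 = generate_candidates(frequent_3)
survivors_4 = prune(candidates_4, frequent_3)

print("candidates:", candidates_4)
print("after pruning:", survivors_4)
```

Run on the frequent 3-itemsets from the question, it prints the candidate 4-itemsets and the ones that survive pruning, which can be compared against the hand-worked lists.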
