Assignment
Q1:
(a) Mean: The mean is calculated by summing all the values and dividing by the total number
of values.
Mean = (53 + 55 + 70 + 58 + 64 + 57 + 53 + 69 + 57 + 68 + 53) / 11 = 59.73
(b) Mode: The mode is the value that occurs most frequently. In the sorted data (53, 53, 53, 55,
57, 57, 58, 64, 68, 69, 70), 53 appears 3 times, which is more than any other value.
Mode = 53
(c) Midrange: The midrange is the average of the smallest and largest values.
Midrange = (53 + 70) / 2 = 61.5
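(d) First Quartile (Q1): Q1 is the median of the lower half of the sorted data, excluding the
overall median 57: (53, 53, 53, 55, 57). This group has five values, so its median is the
middle (third) value:
Q1 = 53
Under the linear-interpolation convention (position 0.25 × (n − 1) in the sorted data), Q1 falls
midway between the 3rd and 4th sorted values instead: (53 + 55) / 2 = 54.
Third Quartile (Q3): Q3 is the median of the upper half of the sorted data (58, 64, 68, 69,
70). The median of this group is again its middle value:
Q3 = 68
Under linear interpolation, Q3 falls midway between the 8th and 9th sorted values instead:
(64 + 68) / 2 = 66.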
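The calculations in (a) to (d) can be checked with a short script. This is a minimal sketch using Python's standard statistics module; the variable names are illustrative, and it computes the quartiles under two common conventions (median of each half, and linear interpolation) because the two do not agree on this data set.

```python
import statistics

data = [53, 55, 70, 58, 64, 57, 53, 69, 57, 68, 53]
ordered = sorted(data)

mean = statistics.mean(data)               # sum of values / number of values
mode = statistics.mode(data)               # most frequent value
midrange = (min(data) + max(data)) / 2     # average of smallest and largest

# Convention 1: median of each half, dropping the overall median when n is odd.
half = len(ordered) // 2
lower = ordered[:half]
upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
q1_halves = statistics.median(lower)
q3_halves = statistics.median(upper)

# Convention 2: linear interpolation between neighbouring sorted values.
q1_interp, _, q3_interp = statistics.quantiles(data, n=4, method="inclusive")

print("mean:", round(mean, 2), "mode:", mode, "midrange:", midrange)
print("median-of-halves quartiles:", q1_halves, q3_halves)
print("interpolated quartiles:", q1_interp, q3_interp)
```

Which quartile convention applies depends on the course or textbook, so it is worth stating the one used next to the answer.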
Q2:
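Question 5:
P(A=1|+) = (count of A=1 with +) / (total +) = 3/5 = 0.6 => P(A=0|+) = 1 - 0.6 = 0.4
P(B=1|+) = (count of B=1 with +) / (total +) = 1/5 = 0.2 => P(B=0|+) = 1 - 0.2 = 0.8
P(C=1|+) = (count of C=1 with +) / (total +) = 4/5 = 0.8 => P(C=0|+) = 1 - 0.8 = 0.2
P(A=1|-) = (count of A=1 with -) / (total -) = 2/5 = 0.4 => P(A=0|-) = 1 - 0.4 = 0.6
P(B=1|-) = (count of B=1 with -) / (total -) = 2/5 = 0.4 => P(B=0|-) = 1 - 0.4 = 0.6
P(C=1|-) = (count of C=1 with -) / (total -) = 5/5 = 1 => P(C=0|-) = 1 - 1 = 0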
● There are 5 records with class "+" (records 1, 5, 6, 9, 10) and 5 records with class "-"
(records 2, 3, 4, 7, 8).
● The probability of A being 1 given class "+" (P(A=1|+)) is 3 out of 5, which is 0.6, so
P(A=0|+) = 1 - 0.6 = 0.4.
● The probability of B being 1 given class "+" (P(B=1|+)) is 1 out of 5, which is 0.2, so
P(B=0|+) = 1 - 0.2 = 0.8.
● The probability of C being 1 given class "+" (P(C=1|+)) is 4 out of 5, which is 0.8, so
P(C=0|+) = 1 - 0.8 = 0.2.
● The probability of A being 1 given class "-" (P(A=1|-)) is 2 out of 5, which is 0.4, so
P(A=0|-) = 1 - 0.4 = 0.6.
● The probability of B being 1 given class "-" (P(B=1|-)) is 2 out of 5, which is 0.4, so
P(B=0|-) = 1 - 0.4 = 0.6.
● The probability of C being 1 given class "-" (P(C=1|-)) is 5 out of 5, which is 1, so
P(C=0|-) = 1 - 1 = 0.
Analysis of the probabilities for the test sample (A=0, B=1, C=0):
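One way to carry out this analysis is sketched below. It assumes a naive Bayes comparison is intended, with class priors taken from the record counts (5 of 10 for each class) and the conditional probabilities listed above. The function names and the dictionary layout are illustrative, not part of the original question.

```python
from fractions import Fraction

# Record counts per class and, per class, how many records have each attribute = 1.
n_records = {"+": 5, "-": 5}
ones = {
    "+": {"A": 3, "B": 1, "C": 4},
    "-": {"A": 2, "B": 2, "C": 5},
}

def conditional(attr, value, label):
    """P(attr = value | label), estimated from the counts above."""
    p_one = Fraction(ones[label][attr], n_records[label])
    return p_one if value == 1 else 1 - p_one

def score(sample, label):
    """Naive Bayes score: prior times the product of the conditionals."""
    result = Fraction(n_records[label], sum(n_records.values()))
    for attr, value in sample.items():
        result *= conditional(attr, value, label)
    return result

sample = {"A": 0, "B": 1, "C": 0}
scores = {label: score(sample, label) for label in n_records}
prediction = max(scores, key=scores.get)

for label, value in scores.items():
    print(label, value, float(value))
print("predicted class:", prediction)
```

Because P(C=0|-) = 0, the score for class "-" is zero no matter what the other factors are, so under these assumptions the sample is assigned to class "+".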
Q7)
Support for Itemset {b, d, e}
The candidate generation procedure in the Apriori algorithm creates candidate itemsets of
size k by combining itemsets of size k−1. To generate the candidate 4-itemsets from
the given frequent 3-itemsets, we first combine the frequent 3-itemsets that share 2 items in
common.
Frequent 3-itemsets:
● {1, 2, 3}
● {1, 2, 4}
● {1, 2, 5}
● {1, 3, 4}
● {1, 3, 5}
● {2, 3, 4}
● {2, 3, 5}
● {3, 4, 5}
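Candidate 4-itemsets:
To generate 4-itemsets, we take pairs of frequent 3-itemsets that share 2 items and combine
them to form a 4-itemset. Merging the pairs that agree on their first two items already yields
every distinct candidate:
● {1, 2, 3} and {1, 2, 4} give {1, 2, 3, 4}
● {1, 2, 3} and {1, 2, 5} give {1, 2, 3, 5}
● {1, 2, 4} and {1, 2, 5} give {1, 2, 4, 5}
● {1, 3, 4} and {1, 3, 5} give {1, 3, 4, 5}
● {2, 3, 4} and {2, 3, 5} give {2, 3, 4, 5}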
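The merge step can be reproduced with a few lines of code. The sketch below assumes the prefix-based join (two sorted (k−1)-itemsets are merged when they agree on everything except their last item); the function name merge_candidates is illustrative.

```python
from itertools import combinations

frequent_3 = [
    (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4),
    (1, 3, 5), (2, 3, 4), (2, 3, 5), (3, 4, 5),
]

def merge_candidates(frequent):
    """Join sorted (k-1)-itemsets that agree on all but their last item."""
    ordered = sorted(frequent)
    candidates = []
    for a, b in combinations(ordered, 2):
        if a[:-1] == b[:-1]:
            # a precedes b in sorted order, so appending b's last item keeps the tuple sorted
            candidates.append(a + (b[-1],))
    return candidates

candidates_4 = merge_candidates(frequent_3)
for candidate in candidates_4:
    print(candidate)
```

Requiring a shared prefix, rather than any two shared items, means each candidate is produced exactly once, so no duplicate removal is needed.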
(b) List all candidate 4-itemsets that survive the candidate pruning step of
the Apriori algorithm.
In the candidate pruning step, any candidate itemset that contains a subset that is not frequent
is discarded. To determine which candidate 4-itemsets survive, we need to check whether
each 4-itemset has all its subsets of size 3 among the given frequent 3-itemsets.
Frequent 3-itemsets:
● {1, 2, 3}
● {1, 2, 4}
● {1, 2, 5}
● {1, 3, 4}
● {1, 3, 5}
● {2, 3, 4}
● {2, 3, 5}
● {3, 4, 5}
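Surviving 4-itemsets:
● {1, 2, 3, 4}: its subsets {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4} are all frequent
● {1, 2, 3, 5}: its subsets {1, 2, 3}, {1, 2, 5}, {1, 3, 5}, {2, 3, 5} are all frequent
Every other candidate is pruned because it contains {1, 4, 5} or {2, 4, 5}, neither of which is
among the frequent 3-itemsets. For example, {1, 2, 4, 5} contains both, and {2, 3, 4, 5}
contains {2, 4, 5}.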
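The subset check behind the pruning step can be sketched as follows. The candidate list is assumed to come from a prefix-based merge of the frequent 3-itemsets, and the function name survives_pruning is illustrative.

```python
from itertools import combinations

frequent_3 = {
    frozenset(s) for s in [
        (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4),
        (1, 3, 5), (2, 3, 4), (2, 3, 5), (3, 4, 5),
    ]
}

# Assumed output of the merge step (prefix-based join of the 3-itemsets above).
candidates_4 = [(1, 2, 3, 4), (1, 2, 3, 5), (1, 2, 4, 5), (1, 3, 4, 5), (2, 3, 4, 5)]

def survives_pruning(candidate, frequent):
    """Keep a k-itemset only if every (k-1)-subset is frequent."""
    k = len(candidate)
    return all(frozenset(sub) in frequent for sub in combinations(candidate, k - 1))

survivors = [c for c in candidates_4 if survives_pruning(c, frequent_3)]

for c in candidates_4:
    missing = [sub for sub in combinations(c, 3) if frozenset(sub) not in frequent_3]
    status = "kept" if not missing else f"pruned, infrequent subsets: {missing}"
    print(c, status)
```

Printing the infrequent subsets for each pruned candidate makes it easy to justify each decision in a written answer.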
Final Answer: