ML 15 09 2022
Summary of Unit 2
Construct a box plot for the following data: 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25
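The five numbers a box plot is drawn from can be checked with a short sketch. The quartile convention is an assumption here: Python's statistics.quantiles with its default "exclusive" method is used, and other conventions can give slightly different Q1 and Q3 values.

```python
import statistics

data = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]
values = sorted(data)

# Quartiles under the default "exclusive" method (an assumed convention).
q1, median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

summary = {"min": values[0], "q1": q1, "median": median, "q3": q3, "max": values[-1]}
print(summary)  # min 5, q1 12.0, median 22.0, q3 36.0, max 53
print(iqr)      # 24.0
```

The box spans Q1 to Q3 with a line at the median, and the whiskers here extend to the minimum and maximum.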
Outlier detection using Box Plot
An outlier is an observation that appears to deviate markedly from the other members
of the data set in which it occurs.
24, 58, 61, 67, 71, 73, 76, 79, 82, 83, 85, 87, 88, 88, 92, 93, 94, 97
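A minimal sketch of box plot outlier detection on the values above. It assumes the common 1.5 x IQR fence rule and the default "exclusive" quartile method of statistics.quantiles; neither is stated in the slides.

```python
import statistics

values = [24, 58, 61, 67, 71, 73, 76, 79, 82, 83, 85, 87, 88, 88, 92, 93, 94, 97]

# Quartiles under the default "exclusive" method (an assumed convention).
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Assumed rule: points beyond 1.5 * IQR from the box are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, q3, iqr)               # 70.0 89.0 19.0
print(lower_fence, upper_fence)  # 41.5 117.5
print(outliers)                  # [24]
```

Under these assumptions only the value 24 falls outside the fences.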
x.y = 2*2 + 4*1 + 0*0 + 0*0 + 2*3 + 1*2 + 3*1 + 0*0 + 0*1 = 19.
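The dot product above can be reproduced in a few lines. The two vectors are read off from the factors of each product; which factor belongs to x and which to y is an assumption, and it does not affect the result.

```python
# Vectors reconstructed from the products in the worked sum (assignment assumed).
x = [2, 4, 0, 0, 2, 1, 3, 0, 0]
y = [2, 1, 0, 0, 3, 2, 1, 0, 1]

# Dot product: multiply matching components and add them up.
dot = sum(a * b for a, b in zip(x, y))
print(dot)  # 19
```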
Summary of Unit 5
The probability of the joint event A and B is given by the product rule:
P(A, B) = P(A | B) * P(B)
The sum rule gives the marginal probability of an event for the first variable: it is
the sum of the joint probabilities over all events of the second variable, with the
event for the first variable held fixed.
P(X=A) = sum P(X=A, Y=yi) for all yi
Conditional probability
The conditional probability of event A given event B is calculated as follows:
P(A | B) = P(A, B) / P(B)
Similarly,
P(B | A) = P(A, B) / P(A)
Bayes rule
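The Bayes rule, also known as Bayes' theorem, can be derived by combining the
definition of conditional probability with the product and sum rules, as below:
P(A | B) = P(B | A) * P(A) / P(B)
The denominator P(B) is expanded with the sum rule when it is not given directly.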
Example
An antibiotic resistance test (random variable T) has 1% false positives (i.e. 1% of
those not resistant to an antibiotic show a positive result in the test) and 5% false
negatives (i.e. 5% of those actually resistant to an antibiotic test negative). Let us
assume that 2% of those tested are resistant to antibiotics.
Determine the probability that somebody who tests positive is actually resistant
(random variable D).
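A minimal sketch of this calculation, applying the Bayes rule with the positive-test probability expanded over the resistant and non-resistant groups. The variable names are illustrative; the figures are the ones given in the example.

```python
# Given figures from the example.
p_d = 0.02                # P(D): tested person is resistant
p_pos_given_not_d = 0.01  # false positive rate, P(T=+ | not D)
p_neg_given_d = 0.05      # false negative rate, P(T=- | D)

# True positive rate is the complement of the false negative rate.
p_pos_given_d = 1 - p_neg_given_d

# Sum rule: overall probability of a positive test.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes rule: probability of resistance given a positive test.
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(round(p_pos, 4))          # 0.0288
print(round(p_d_given_pos, 4))  # 0.6597
```

So roughly 66% of those who test positive are actually resistant, even though the test has low error rates, because resistance itself is rare among those tested.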
1. What is the probability that the student will solve the problem of the exam?
2. Given that the student solved the problem, what is the probability that it was of type
A?
Example
1. What is the probability that the student will solve the problem of the exam?
P(A)=30%
P(B)=20%
P(C)=50%
P(Solved | A)=9/10
P(Solved | B)=2/10
P(Solved | C)=6/10
P(Solved) = P(Solved | A)*P(A) + P(Solved | B)*P(B) + P(Solved | C)*P(C)
          = 0.27 + 0.04 + 0.30 = 0.61
Example
2. Given that the student solved the problem, what is the probability that it was of type
A?
Solution
P(A | Solved) = P(Solved | A) * P(A) / P(Solved) = (0.9 * 0.3) / 0.61 = 0.27 / 0.61 ≈ 0.44
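Both exam questions can be checked numerically with the sum rule and the Bayes rule. This is a small sketch using the probabilities given in the example; the dictionary names are illustrative.

```python
# Prior probability of each problem type and chance of solving each type.
p_type = {"A": 0.30, "B": 0.20, "C": 0.50}
p_solved_given_type = {"A": 0.9, "B": 0.2, "C": 0.6}

# Question 1: total probability of solving the problem (sum rule).
p_solved = sum(p_type[t] * p_solved_given_type[t] for t in p_type)

# Question 2: probability the problem was of type A given it was solved (Bayes rule).
p_a_given_solved = p_type["A"] * p_solved_given_type["A"] / p_solved

print(round(p_solved, 2))          # 0.61
print(round(p_a_given_solved, 4))  # 0.4426
```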
Summary of Unit 8
Apply the Apriori algorithm to the following data set, with minimum support and
minimum confidence set to 50% and 75% respectively, to generate the large itemsets
and association rules.
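The procedure can be sketched in a few dozen lines. The transactions below are hypothetical and are NOT the exercise data set; they only show the join, prune, and rule generation steps at 50% support and 75% confidence. The function names are also illustrative.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every large (frequent) itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Large 1-itemsets.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]): support(frozenset([i])) for i in items}
    current = {s: v for s, v in current.items() if v >= min_support}
    large = dict(current)

    k = 2
    while current:
        # Join step: merge large (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be large.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c: support(c) for c in candidates}
        current = {s: v for s, v in current.items() if v >= min_support}
        large.update(current)
        k += 1
    return large

def association_rules(large, min_confidence):
    """Return (lhs, rhs, confidence) for rules meeting the confidence threshold."""
    rules = []
    for itemset, sup in large.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = sup / large[lhs]  # support(lhs and rhs) / support(lhs)
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules

# Hypothetical transactions, for illustration only.
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
large = apriori(transactions, min_support=0.5)
rules = association_rules(large, min_confidence=0.75)
print(large)  # large itemsets: {A}, {B}, {C}, {A, C}
print(rules)  # one rule: C -> A with confidence 1.0
```

On this toy data, A -> C has confidence 2/3 and is rejected, while C -> A has confidence 1.0 and is kept.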