Classification Algorithm
Data Mining
• Naïve Bayes
• Support Vector Machine
• K-Nearest Neighbours
• Decision Tree
DATASET
• A collection of related sets of information
that is composed of separate elements but
can be manipulated as a unit by a computer.
• Random Forest
• Naïve Bayes
Random Forest Algorithm
• Random Forest is a classifier that builds a number of
decision trees on various subsets of the given dataset and
combines their predictions (a majority vote for classification,
an average for regression) to improve predictive accuracy.
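The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function names (`build_tree`, `fit_forest`, `predict_forest`), the Gini split criterion, the toy data, and the default parameters are all assumptions made for the example. Each tree is trained on a bootstrap sample of the rows and considers a random subset of features at each split; the forest then takes a majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, n_features, rng, depth=0, max_depth=3):
    """Grow one tree; each split looks at a random subset of features."""
    if depth == max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = None
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in set(r[f] for r in rows):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Size-weighted impurity of the two children (lower is better).
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split on the sampled features
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       n_features, rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       n_features, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the dataset."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over the individual tree predictions."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy dataset (made up for illustration): two well-separated clusters.
rows = [[1.0, 1.2], [1.5, 0.8], [0.9, 1.1], [1.2, 1.4],
        [5.0, 5.2], [5.5, 4.8], [4.9, 5.1], [5.2, 5.4]]
labels = ["A"] * 4 + ["B"] * 4
forest = fit_forest(rows, labels)
print(predict_forest(forest, [0.5, 0.5]), predict_forest(forest, [6.0, 6.0]))
```

Because leaves are told apart from internal nodes by type, this sketch assumes class labels are not tuples. In practice a library implementation would normally be used instead of hand-written trees.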
Naïve Bayes Algorithm
• P(Yes|Sunny) = P(Sunny|Yes)*P(Yes)/P(Sunny)
• P(Sunny|Yes)= 3/10= 0.3
• P(Sunny)= 0.35
• P(Yes)=0.71
• So P(Yes|Sunny) = 0.3*0.71/0.35= 0.60
• P(No|Sunny)= P(Sunny|No)*P(No)/P(Sunny)
• P(Sunny|No)= 2/4= 0.5
• P(No)= 0.29
• P(Sunny)= 0.35
• So P(No|Sunny)= 0.5*0.29/0.35 = 0.41
• So, as the calculation above shows, P(Yes|Sunny) > P(No|Sunny), and the classifier predicts Yes for a Sunny day.
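The calculation above can be reproduced with exact fractions. The counts in this sketch are an assumption inferred from the figures on the slide: 14 records in total, 10 labelled Yes (3 of them Sunny) and 4 labelled No (2 of them Sunny), which gives P(Yes) = 10/14 ≈ 0.71, P(No) = 4/14 ≈ 0.29 and P(Sunny) = 5/14 ≈ 0.35. The names `counts` and `posterior` are made up for the example.

```python
from fractions import Fraction

# Assumed counts, inferred from the probabilities quoted above.
counts = {
    "Yes": {"total": 10, "Sunny": 3},
    "No": {"total": 4, "Sunny": 2},
}
n_records = sum(c["total"] for c in counts.values())  # 14 records overall

def posterior(label, feature="Sunny"):
    """Return P(label | feature) using Bayes' theorem, as an exact fraction."""
    likelihood = Fraction(counts[label][feature], counts[label]["total"])  # P(feature | label)
    prior = Fraction(counts[label]["total"], n_records)                    # P(label)
    evidence = Fraction(sum(c[feature] for c in counts.values()), n_records)  # P(feature)
    return likelihood * prior / evidence

p_yes = posterior("Yes")
p_no = posterior("No")
prediction = "Yes" if p_yes > p_no else "No"
print(float(p_yes), float(p_no), prediction)  # 0.6 0.4 Yes
```

Under these assumed counts the exact posteriors are 3/5 and 2/5, which sum to 1. The 0.41 in the hand calculation comes from rounding the intermediate probabilities to two decimal places before dividing.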