Bayesian-Classification Ok
CSED
Computer Science and Engineering Department
Thapar Institute of Engineering and Technology
Patiala, Punjab
Datasets in Learning
Nominal feature: It has no numerical values and no ordering among its categories. For example, gender is a
nominal feature.
Ordinal feature: It has a clear ordering among its categories. Education level is an ordinal variable as it may have values-
Discrete feature: It can take only a finite number of values. For example, the number of patients admitted daily to a hospital.
Continuous feature: It can be measured on a continuum or a scale. For example, price of a house, weight of
Probabilistic Classifier
Naïve in nature
Classification problem
• Training data: examples of the form (d,h(d))
– where d are the data objects to classify (inputs)
– and h(d) gives the class info for d, h(d) ∈ {1,…,K}
• Goal: given dnew, provide h(dnew)
Assumption
The fundamental Naive Bayes assumption is that each feature makes an
independent and equal contribution to the outcome.
We assume that no pair of features is dependent. For example, the temperature being
‘Hot’ has nothing to do with the humidity or the outlook being ‘Rainy’ has no effect on
Secondly, each feature is given the same weight (or importance). For example, knowing
only temperature and humidity can’t predict the outcome accurately. None of the
Outlook      P     N        Humidity   P     N
sunny        2/9   3/5      high       3/9   4/5
overcast     4/9   0        normal     6/9   1/5
rain         3/9   2/5

Temperature  P     N        Windy      P     N
hot          2/9   2/5      true       3/9   3/5
mild         4/9   2/5      false      6/9   2/5
cool         3/9   1/5
Play-tennis example: estimating P(xi|C)
Outlook    Temperature  Humidity  Windy  Class
sunny      hot          high      false  N
sunny      hot          high      true   N
overcast   hot          high      false  P
rain       mild         high      false  P
rain       cool         normal    false  P
rain       cool         normal    true   N
overcast   cool         normal    true   P
sunny      mild         high      false  N
sunny      cool         normal    false  P
rain       mild         normal    false  P
sunny      mild         normal    true   P
overcast   mild         high      true   P
overcast   hot          normal    false  P
rain       mild         high      true   N

P(p) = 9/14              P(n) = 5/14

outlook
P(sunny|p) = 2/9         P(sunny|n) = 3/5
P(overcast|p) = 4/9      P(overcast|n) = 0
P(rain|p) = 3/9          P(rain|n) = 2/5

temperature
P(hot|p) = 2/9           P(hot|n) = 2/5
P(mild|p) = 4/9          P(mild|n) = 2/5
P(cool|p) = 3/9          P(cool|n) = 1/5

humidity
P(high|p) = 3/9          P(high|n) = 4/5
P(normal|p) = 6/9        P(normal|n) = 1/5

windy
P(true|p) = 3/9          P(true|n) = 3/5
P(false|p) = 6/9         P(false|n) = 2/5
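The estimates above are relative frequencies: count how often a feature value occurs within a class and divide by the size of that class. A minimal Python sketch of that counting follows. The names `FEATURES`, `ROWS` and `estimate` are illustrative choices, not part of the slides, and the class labels are kept as the uppercase `P` and `N` used in the data table.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical names for the four feature columns, in table order.
FEATURES = ("outlook", "temperature", "humidity", "windy")

# The 14 play-tennis rows: four feature values followed by the class label.
ROWS = [
    ("sunny", "hot", "high", "false", "N"),
    ("sunny", "hot", "high", "true", "N"),
    ("overcast", "hot", "high", "false", "P"),
    ("rain", "mild", "high", "false", "P"),
    ("rain", "cool", "normal", "false", "P"),
    ("rain", "cool", "normal", "true", "N"),
    ("overcast", "cool", "normal", "true", "P"),
    ("sunny", "mild", "high", "false", "N"),
    ("sunny", "cool", "normal", "false", "P"),
    ("rain", "mild", "normal", "false", "P"),
    ("sunny", "mild", "normal", "true", "P"),
    ("overcast", "mild", "high", "true", "P"),
    ("overcast", "hot", "normal", "false", "P"),
    ("rain", "mild", "high", "true", "N"),
]


def estimate(rows):
    """Return (priors, likelihoods) as exact fractions of observed counts."""
    class_counts = Counter(row[-1] for row in rows)
    priors = {c: Fraction(n, len(rows)) for c, n in class_counts.items()}
    # Count each (feature, value, class) combination that occurs in the data.
    pair_counts = Counter(
        (feature, value, row[-1])
        for row in rows
        for feature, value in zip(FEATURES, row[:-1])
    )
    likelihoods = {
        key: Fraction(n, class_counts[key[2]]) for key, n in pair_counts.items()
    }
    return priors, likelihoods


priors, likelihoods = estimate(ROWS)
print(priors["P"], priors["N"])                  # 9/14 5/14
print(likelihoods[("outlook", "sunny", "P")])    # 2/9
# Combinations never seen in the data (overcast with class N) have no entry,
# which corresponds to the estimate P(overcast|n) = 0.
print(likelihoods.get(("outlook", "overcast", "N"), Fraction(0)))  # 0
```

Exact fractions are used here so the results can be compared directly with the values on the slide; `Fraction` reduces automatically, so 3/9 is shown as 1/3.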
Predict the class of the day with the condition <sunny, cool, high, strong>, taking strong wind to mean windy = true in the table.
Let X = <sunny, cool, high, strong>
We need to calculate P(Yes|X) and P(No|X), i.e. P(p|X) and P(n|X).
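Under the naive independence assumption, each class score is the prior times the product of the per-feature conditional probabilities, P(C) · P(sunny|C) · P(cool|C) · P(high|C) · P(windy|C), and the class with the larger score is predicted. The sketch below works this out with the values estimated earlier. It assumes that "strong" wind corresponds to windy = true and that Yes/No correspond to the classes p/n; the names `PRIOR`, `LIKELIHOOD` and `score` are illustrative only.

```python
from fractions import Fraction as F

# Priors and the conditional probabilities needed for X, copied from the
# play-tennis estimates (p = Yes, n = No).
PRIOR = {"p": F(9, 14), "n": F(5, 14)}
LIKELIHOOD = {
    "p": {"sunny": F(2, 9), "cool": F(3, 9), "high": F(3, 9), "true": F(3, 9)},
    "n": {"sunny": F(3, 5), "cool": F(1, 5), "high": F(4, 5), "true": F(3, 5)},
}

# Assumption: "strong" wind is encoded as windy = "true" in the table.
X = ("sunny", "cool", "high", "true")


def score(c):
    """Unnormalised posterior: P(c) times the product of P(x_i | c)."""
    s = PRIOR[c]
    for value in X:
        s *= LIKELIHOOD[c][value]
    return s


scores = {c: score(c) for c in PRIOR}
total = sum(scores.values())
# Dividing by the total turns the scores into P(p|X) and P(n|X).
posteriors = {c: s / total for c, s in scores.items()}
prediction = max(posteriors, key=posteriors.get)

print(scores["p"], scores["n"])              # 1/189 18/875
print(posteriors["p"], posteriors["n"])      # 125/611 486/611
print(prediction)                            # n
```

With these numbers P(n|X) is about 0.80 and P(p|X) about 0.20, so the day is classified as n (No).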