
Bayesian Classifier

CSED
Computer Science and Engineering Department
Thapar Institute of Engineering and Technology
Patiala, Punjab
Datasets in Learning

Area of House (in Sq Yards)   No. of Bed Rooms   Location     Price (in Lakhs)
200                           3                  Chandigarh   62.0
300                           4                  Chandigarh   75.0
400                           5                  Delhi        120.50
300                           4                  Delhi        95.25
200                           3                  Patiala      45.50

Total Features = 4    Total Instances = 5

Labeled Dataset

Area of House (in Sq Yards)   No. of Bed Rooms   Location     Price (in Lakhs)
200                           3                  Chandigarh   65.0
300                           4                  Chandigarh   75.0
400                           5                  Delhi        120.50
300                           4                  Delhi        95.25
200                           3                  Patiala      45.50

Relationship between the input features and the output feature: y = f(X)
Unlabeled Dataset

Emp Id   Experience (in Years)   Age   Address
101      12                      34    Patiala
102      24                      47    Delhi
103      16                      42    Delhi
104      14                      36    Chandigarh
105      16                      39    Chandigarh
106      21                      42    Chandigarh
Types of Features

Qualitative Feature vs. Quantitative Feature

Qualitative (categorical) features have non-numerical values, e.g. gender (male or
female). They cannot be measured, but they can be grouped.

Quantitative features are numerical in nature, e.g. the salary of a person or the
marks obtained by a student. They are measurable and have an inherent order.

Qualitative features

A nominal feature has neither numerical values nor an ordering among its categories.
For example, gender is nominal.

An ordinal feature has a clear ordering. Education level is an ordinal variable, as
it may take the values Elementary, Secondary, College/University.

Quantitative features

Discrete feature: can take only a finite number of values, e.g. the number of
patients admitted daily to a hospital.

Continuous feature: can be measured on a continuum or scale, e.g. the price of a
house, the weight of a person, or the blood pressure of a patient.


Bayesian Classifier

 A probabilistic classifier

 A supervised machine learning model

 Fast and easy to implement

 Naïve in nature
Classification problem
• Training data: examples of the form (d, h(d))
  – where d is a data object to classify (the input)
  – and h(d) gives the class label for d, h(d) ∈ {1, …, K}
• Goal: given a new object d_new, provide h(d_new)
Assumption
The fundamental Naïve Bayes assumption is that each feature makes an:

 independent

 equal

contribution to the output.


Dataset

Outlook    Temperature   Humidity   Windy   Class
sunny      hot           high       false   N
sunny      hot           high       true    N
overcast   hot           high       false   P
rain       mild          high       false   P
rain       cool          normal     false   P
rain       cool          normal     true    N
overcast   cool          normal     true    P
sunny      mild          high       false   N
sunny      cool          normal     false   P
rain       mild          normal     false   P
sunny      mild          normal     true    P
overcast   mild          high       true    P
overcast   hot           normal     false   P
rain       mild          high       true    N
Assumption
With relation to our dataset, this concept can be understood as follows:

 We assume that no pair of features is dependent. For example, the temperature
being 'Hot' has nothing to do with the humidity, and the outlook being 'Rainy' has
no effect on the wind. Hence, the features are assumed to be independent.

 Secondly, each feature is given the same weight (or importance). For example, the
temperature or the humidity alone cannot predict the outcome accurately. None of the
attributes is irrelevant, and each is assumed to contribute equally to the outcome.


Naïve Bayes Classifier
According to Bayes' rule:

P(A|B) = P(A) P(B|A) / P(B)

Posterior Probability = (Prior Probability × Likelihood) / Evidence

i.e. P(label|input) = P(label) P(input|label) / P(input)

Under the naïve independence assumption, the likelihood factorizes over the
individual features:

P(input|label) = P(x1|label) × P(x2|label) × … × P(xn|label)
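As a minimal sketch, this rule can be written directly in Python; the names prior, likelihoods, and evidence below are illustrative, not part of any library:

```python
# A minimal sketch of the Naïve Bayes posterior for one candidate label.
# prior       = P(label)
# likelihoods = [P(x1|label), P(x2|label), ...]  (naïve independence)
# evidence    = P(input), identical for every label
from math import prod

def posterior(prior: float, likelihoods: list[float], evidence: float) -> float:
    """Posterior = Prior × Likelihood / Evidence."""
    return prior * prod(likelihoods) / evidence
```

Because the evidence is the same for every label, comparing prior * prod(likelihoods) across labels is already enough to pick the most probable class.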
Example: Naïve Bayes
Predict whether tennis is played on a day with the conditions <sunny, cool, high,
strong>, i.e. P(v | o=sunny, t=cool, h=high, w=strong), using the following
training data (here 'strong' corresponds to Windy = true):

Outlook    Temperature   Humidity   Windy   Class
sunny      hot           high       false   N
sunny      hot           high       true    N
overcast   hot           high       false   P
rain       mild          high       false   P
rain       cool          normal     false   P
rain       cool          normal     true    N
overcast   cool          normal     true    P
sunny      mild          high       false   N
sunny      cool          normal     false   P
rain       mild          normal     false   P
sunny      mild          normal     true    P
overcast   mild          high       true    P
overcast   hot           normal     false   P
rain       mild          high       true    N
Naïve Bayesian Classifier
Given a training set, we can compute the probabilities:

Outlook    P     N          Humidity   P     N
sunny      2/9   3/5        high       3/9   4/5
overcast   4/9   0          normal     6/9   1/5
rain       3/9   2/5

Temperature P    N          Windy      P     N
hot        2/9   2/5        true       3/9   3/5
mild       4/9   2/5        false      6/9   2/5
cool       3/9   1/5
Play-tennis example: estimating P(xi|C)

Outlook:
P(sunny|p) = 2/9       P(sunny|n) = 3/5
P(overcast|p) = 4/9    P(overcast|n) = 0
P(rain|p) = 3/9        P(rain|n) = 2/5

Temperature:
P(hot|p) = 2/9         P(hot|n) = 2/5
P(mild|p) = 4/9        P(mild|n) = 2/5
P(cool|p) = 3/9        P(cool|n) = 1/5

Humidity:
P(high|p) = 3/9        P(high|n) = 4/5
P(normal|p) = 6/9      P(normal|n) = 1/5

Windy:
P(true|p) = 3/9        P(true|n) = 3/5
P(false|p) = 6/9       P(false|n) = 2/5

Class priors:
P(p) = 9/14            P(n) = 5/14
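These tables can be derived mechanically from the 14 training rows. A sketch in Python, assuming the dataset above; the names FEATURES, rows, priors, and cond are illustrative:

```python
# Estimate class priors and per-feature conditional probabilities P(xi|C)
# from the 14-row play-tennis dataset.
from collections import Counter
from fractions import Fraction

FEATURES = ["outlook", "temperature", "humidity", "windy"]
rows = [  # (outlook, temperature, humidity, windy, class)
    ("sunny", "hot", "high", "false", "N"), ("sunny", "hot", "high", "true", "N"),
    ("overcast", "hot", "high", "false", "P"), ("rain", "mild", "high", "false", "P"),
    ("rain", "cool", "normal", "false", "P"), ("rain", "cool", "normal", "true", "N"),
    ("overcast", "cool", "normal", "true", "P"), ("sunny", "mild", "high", "false", "N"),
    ("sunny", "cool", "normal", "false", "P"), ("rain", "mild", "normal", "false", "P"),
    ("sunny", "mild", "normal", "true", "P"), ("overcast", "mild", "high", "true", "P"),
    ("overcast", "hot", "normal", "false", "P"), ("rain", "mild", "high", "true", "N"),
]

class_counts = Counter(r[-1] for r in rows)                   # {'P': 9, 'N': 5}
priors = {c: Fraction(n, len(rows)) for c, n in class_counts.items()}

# cond[(feature, value, class)] = P(feature = value | class)
pair_counts = Counter()
for r in rows:
    for f, v in zip(FEATURES, r[:-1]):
        pair_counts[(f, v, r[-1])] += 1
cond = {k: Fraction(n, class_counts[k[2]]) for k, n in pair_counts.items()}

print(priors["P"], cond[("outlook", "sunny", "P")])           # 9/14, 2/9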
Predict the class of the day with the conditions <sunny, cool, high, strong>.
Let X = <sunny, cool, high, strong>.
We need to calculate P(Yes|X) and P(No|X).

For these, we use Bayes' theorem:

P(Yes|X) = P(Yes) * P(X|Yes) / P(X)
P(No|X)  = P(No)  * P(X|No)  / P(X)

Calculate P(Yes | outlook=sunny, temp=cool, humidity=high, wind=strong) using the
given training data:

P(Yes) = 9/14                P(outlook = sunny|Yes) = 2/9
P(temp = cool|Yes) = 3/9     P(humidity = high|Yes) = 3/9
P(wind = strong|Yes) = 3/9

P(Yes) * P(X|Yes) = 9/14 * 2/9 * 3/9 * 3/9 * 3/9 = 0.0053

Calculate P(No | outlook=sunny, temp=cool, humidity=high, wind=strong) using the
given training data:

P(No) = 5/14                 P(outlook = sunny|No) = 3/5
P(temp = cool|No) = 1/5      P(humidity = high|No) = 4/5
P(wind = strong|No) = 3/5

P(No) * P(X|No) = 5/14 * 3/5 * 1/5 * 4/5 * 3/5 = 0.0206

The evidence is approximated as a product of the feature marginals:

P(X) = P(outlook = sunny) * P(temp = cool) * P(humidity = high) * P(wind = strong)
     = 5/14 * 4/14 * 7/14 * 6/14 = 0.02186

P(Play = Y|X) = 0.0053 / 0.02186 = 0.2424
P(Play = N|X) = 0.0206 / 0.02186 = 0.9421

Since P(Play = N|X) > P(Play = Y|X), the classifier predicts class N: tennis is not
played on this day. (P(X) is identical for both classes, so it could simply be
dropped for classification; approximating it as a product of marginals is also why
the two posteriors do not sum exactly to 1.)
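A short sketch that reproduces this prediction end to end; the dictionaries below hard-code the fractions read off the probability tables above, and all names are illustrative:

```python
# Reproduces the worked prediction for X = <sunny, cool, high, strong>,
# where 'strong' corresponds to Windy = true in the table.
from fractions import Fraction as F
from math import prod

priors = {"P": F(9, 14), "N": F(5, 14)}
# P(feature value | class) for sunny, cool, high, true, in that order
likelihoods = {
    "P": [F(2, 9), F(3, 9), F(3, 9), F(3, 9)],
    "N": [F(3, 5), F(1, 5), F(4, 5), F(3, 5)],
}

# Unnormalized scores: prior * product of per-feature likelihoods
scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}
print({c: float(s) for c, s in scores.items()})   # ~0.0053 (P), ~0.0206 (N)
print(max(scores, key=scores.get))                # 'N' -> do not play
```

Note that the evidence P(X) is never needed here: comparing the unnormalized scores is sufficient to choose the class.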
Thanks
