Bayesian Classification: Dr. Navneet Goyal BITS, Pilani

Bayesian classifiers use Bayes' theorem to predict class membership probabilities based on training data. A naïve Bayesian classifier assumes conditional independence between attributes given the class. This simplifies computations but may not always hold. Bayesian belief networks relax this assumption by allowing dependencies between attributes and representing them in a directed acyclic graph with conditional probability tables. This more flexible approach models class conditional probabilities better for problems where attributes are somewhat correlated.

Bayesian Classification

Dr. Navneet Goyal BITS, Pilani

Bayesian Classification

What are Bayesian classifiers?
Statistical classifiers
Predict class membership probabilities
Based on Bayes' theorem
Naïve Bayesian classifier
Computationally simple
Comparable performance with decision tree (DT) and neural network (NN) classifiers

Bayesian Classification

Probabilistic learning: calculates explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems.
Incremental: each training example can incrementally increase or decrease the probability that a hypothesis is correct.
Prior knowledge can be combined with observed data.

Bayes Theorem

Let X be a data sample whose class label is unknown.
Let H be some hypothesis, such as that X belongs to a class C.
For classification, determine P(H|X).
P(H|X) is the probability that H holds given the observed data sample X.
P(H|X) is the posterior probability of H conditioned on X.

Bayes Theorem
Example: the sample space is all fruits.
X is round and red.
H = hypothesis that X is an apple.
P(H|X) is our confidence that X is an apple given that X is round and red.
P(H) is the prior probability of H, i.e., the probability that any given data sample is an apple regardless of how it looks.
P(H|X) is based on more information than P(H).
Note that P(H) is independent of X.

Bayes Theorem
Example: the sample space is all fruits.
P(X|H)? It is the probability that X is round and red given that we know X is an apple.
P(X) is the prior probability of X, i.e., P(a data sample from our set of fruits is red and round).

Estimating Probabilities

P(X), P(H), and P(X|H) may be estimated from the given data.
Bayes Theorem:

$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$$
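As a small numeric illustration of the theorem, the sketch below plugs in made-up values for the fruit example. The three probabilities are assumptions chosen only for illustration; they do not come from the slides.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence <= 0.0:
        raise ValueError("P(X) must be positive")
    return likelihood * prior / evidence


# Hypothetical values for the fruit example (assumed, not from the slides).
p_apple = 0.30                  # P(H): a randomly chosen fruit is an apple
p_round_red_given_apple = 0.80  # P(X|H): an apple is round and red
p_round_red = 0.40              # P(X): a randomly chosen fruit is round and red

p_apple_given_round_red = posterior(p_round_red_given_apple, p_apple, p_round_red)
print(round(p_apple_given_round_red, 3))  # 0.6
```

With these numbers, seeing that the fruit is round and red raises the belief that it is an apple from the prior 0.30 to a posterior of 0.60.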

Use of Bayes Theorem in Naïve Bayesian Classifier!!

Naïve Bayesian Classification

Also called Simple BC
Why Naïve/Simple??
Class Conditional Independence
Effect of an attribute value on a given class is independent of the values of other attributes
This assumption simplifies computations

Naïve Bayesian Classification

Steps Involved
1. Each data sample is of the type X = (x1, x2, …, xn), where xi is the value of X for attribute Ai, i = 1, …, n.

2. Suppose there are m classes Ci, i = 1, …, m. X ∈ Ci iff P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m, j ≠ i, i.e., BC assigns X to the class Ci having the highest posterior probability conditioned on X.

Naïve Bayesian Classification

The class for which P(Ci|X) is maximized is called the maximum posterior hypothesis. From Bayes Theorem:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

P(X) is constant for all classes, so only P(X|Ci)P(Ci) need be maximized.

If the class prior probabilities are not known, assume all classes to be equally likely and maximize P(X|Ci). Otherwise maximize P(X|Ci)P(Ci), estimating the priors as P(Ci) = Si/S, where Si is the number of training samples of class Ci and S is the total number of training samples.

Problem: computing P(X|Ci) directly is infeasible! (Find out how you would compute it and why it is infeasible.)

Naïve Bayesian Classification

Naïve assumption: attribute independence

$$P(X \mid C_i) = P(x_1, \ldots, x_n \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

In order to classify an unknown sample X, evaluate P(X|Ci)P(Ci) for each class Ci. Sample X is assigned to the class Ci iff P(X|Ci)P(Ci) > P(X|Cj)P(Cj) for 1 ≤ j ≤ m, j ≠ i.

Naïve Bayesian Classification

EXAMPLE
Age      Income   Student   Credit_rating   Class: Buys_comp
<=30     HIGH     N         FAIR            N
<=30     HIGH     N         EXCELLENT       N
31..40   HIGH     N         FAIR            Y
>40      MEDIUM   N         FAIR            Y
>40      LOW      Y         FAIR            Y
>40      LOW      Y         EXCELLENT       N
31..40   LOW      Y         EXCELLENT       Y
<=30     MEDIUM   N         FAIR            N
<=30     LOW      Y         FAIR            Y
>40      MEDIUM   Y         FAIR            Y
<=30     MEDIUM   Y         EXCELLENT       Y
31..40   MEDIUM   N         EXCELLENT       Y
31..40   HIGH     Y         FAIR            Y
>40      MEDIUM   N         EXCELLENT       N

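Naïve Bayesian Classification

EXAMPLE
X = (Age <=30, Income = MEDIUM, Student = Y, Credit_rating = FAIR, Buys_comp = ???)
We need to maximize P(X|Ci)P(Ci) for Ci in {buys_comp=Y, buys_comp=N}.

Class priors from the training table:
P(buys_comp=Y) = 9/14 = 0.643
P(buys_comp=N) = 5/14 = 0.357

Conditional probabilities from the training table:
P(age<=30 | buys_comp=Y) = 2/9 = 0.222
P(age<=30 | buys_comp=N) = 3/5 = 0.600
P(income=MEDIUM | buys_comp=Y) = 4/9 = 0.444
P(income=MEDIUM | buys_comp=N) = 2/5 = 0.400
P(student=Y | buys_comp=Y) = 6/9 = 0.667
P(student=Y | buys_comp=N) = 1/5 = 0.200
P(credit_rating=FAIR | buys_comp=Y) = 6/9 = 0.667
P(credit_rating=FAIR | buys_comp=N) = 2/5 = 0.400

P(X | buys_comp=Y) = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
P(X | buys_comp=N) = 0.600 × 0.400 × 0.200 × 0.400 = 0.019
P(X | buys_comp=Y) P(buys_comp=Y) = 0.044 × 0.643 = 0.028
P(X | buys_comp=N) P(buys_comp=N) = 0.019 × 0.357 = 0.007
CONCLUSION: X buys computer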
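The worked example can be reproduced with a short script. This is a minimal sketch that uses plain relative-frequency estimates over the 14 training tuples; the function name `naive_bayes_scores` and the tuple layout are illustrative choices, not something the slides prescribe.

```python
from collections import Counter

# Training tuples from the example table:
# (age, income, student, credit_rating, buys_comp)
DATA = [
    ("<=30", "HIGH", "N", "FAIR", "N"),
    ("<=30", "HIGH", "N", "EXCELLENT", "N"),
    ("31..40", "HIGH", "N", "FAIR", "Y"),
    (">40", "MEDIUM", "N", "FAIR", "Y"),
    (">40", "LOW", "Y", "FAIR", "Y"),
    (">40", "LOW", "Y", "EXCELLENT", "N"),
    ("31..40", "LOW", "Y", "EXCELLENT", "Y"),
    ("<=30", "MEDIUM", "N", "FAIR", "N"),
    ("<=30", "LOW", "Y", "FAIR", "Y"),
    (">40", "MEDIUM", "Y", "FAIR", "Y"),
    ("<=30", "MEDIUM", "Y", "EXCELLENT", "Y"),
    ("31..40", "MEDIUM", "N", "EXCELLENT", "Y"),
    ("31..40", "HIGH", "Y", "FAIR", "Y"),
    (">40", "MEDIUM", "N", "EXCELLENT", "N"),
]


def naive_bayes_scores(data, x):
    """Return {class: (P(X|C), P(X|C) * P(C))} from relative frequencies."""
    class_counts = Counter(row[-1] for row in data)
    total = len(data)
    scores = {}
    for c, n_c in class_counts.items():
        likelihood = 1.0
        for k, value in enumerate(x):
            # Count tuples of class c whose k-th attribute equals the query value.
            n_match = sum(1 for row in data if row[-1] == c and row[k] == value)
            likelihood *= n_match / n_c
        scores[c] = (likelihood, likelihood * n_c / total)
    return scores


X = ("<=30", "MEDIUM", "Y", "FAIR")
scores = naive_bayes_scores(DATA, X)
for c, (lik, joint) in sorted(scores.items()):
    print(c, round(lik, 3), round(joint, 3))
# N 0.019 0.007
# Y 0.044 0.028

prediction = max(scores, key=lambda c: scores[c][1])
print("prediction:", prediction)  # prediction: Y
```

No smoothing is applied here, so an attribute value that never occurs with a class would drive that class's score to zero.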

Naïve Bayes Classifier: Issues

Probability values ZERO! Recall what you observed in WEKA!
Ak is continuous valued! Recall what you observed in WEKA!

If there are no tuples in the training set corresponding to students for the class buys_comp=N, then P(student=Y | buys_comp=N) = 0.
Implications? Solution?

Naïve Bayes Classifier: Issues

Laplacian Correction or Laplace Estimator
Philosophy: we assume that the training data set is so large that adding one to each count that we need would make only a negligible difference in the estimated probability value.
Example: D contains 1000 tuples of class buys_comp=Y
income=low: 0 tuples
income=medium: 990 tuples
income=high: 10 tuples
Without Laplacian correction the probabilities are 0, 0.990, and 0.010.
With Laplacian correction they are 1/1003 = 0.001, 991/1003 = 0.988, and 11/1003 = 0.011, respectively.
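The correction is easy to check numerically. The sketch below adds a constant to every count before normalizing; the parameter `k` (default 1) is a generalization introduced here for illustration, and with `k=1` it matches the add-one rule described above.

```python
def laplace_probs(counts, k=1):
    """Add k to every count, then normalize so the estimates sum to 1."""
    total = sum(counts.values()) + k * len(counts)
    return {value: (n + k) / total for value, n in counts.items()}


# Counts for class buys_comp=Y in the 1000-tuple example.
income_counts = {"low": 0, "medium": 990, "high": 10}

n_total = sum(income_counts.values())
raw = {value: n / n_total for value, n in income_counts.items()}
smoothed = laplace_probs(income_counts)

print({v: round(p, 3) for v, p in raw.items()})
# {'low': 0.0, 'medium': 0.99, 'high': 0.01}
print({v: round(p, 3) for v, p in smoothed.items()})
# {'low': 0.001, 'medium': 0.988, 'high': 0.011}
```

The zero estimate for income=low becomes small but non-zero, so it no longer wipes out the whole product P(X|Ci), while the other two estimates barely move.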

Naïve Bayes Classifier: Issues

Continuous variable: need to do more work than for categorical attributes!
It is typically assumed to have a Gaussian distribution with a mean μ and a std. dev. σ.
Do it yourself! And cross-check with WEKA!
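One way to "do it yourself" is sketched below: estimate μ and σ of the attribute within each class, then use the Gaussian density in place of P(xk|Ci). The age values are hypothetical, and using the sample standard deviation (dividing by n − 1) is an assumption; a tool such as WEKA may estimate the spread slightly differently.

```python
import math


def fit_gaussian(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x, used in place of P(x_k | C_i)."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * std)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))


# Hypothetical ages of training tuples with buys_comp=Y (not from the slides).
ages_yes = [25.0, 32.0, 38.0, 41.0, 45.0]
mu, sigma = fit_gaussian(ages_yes)

# Value that would be multiplied into P(X | buys_comp=Y) for age = 35.
likelihood_age_35 = gaussian_pdf(35.0, mu, sigma)
print(mu, sigma, likelihood_age_35)
```

The returned value is a density rather than a probability, which is fine for classification because only the comparison between classes matters.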

Naïve Bayes (Summary)

Robust to isolated noise points
Handles missing values by ignoring the instance during probability estimate calculations
Robust to irrelevant attributes
Independence assumption may not hold for some attributes
Use other techniques such as Bayesian Belief Networks (BBN)

Probability Calculations

[Table: the 14-tuple buys_comp training data with attributes Age, Income, Student, Credit_rating, and Class: Buys_comp]

No. of attributes = 4
Distinct values per attribute = 3, 3, 3, 3
No. of classes = 2
Total no. of probability calculations in NBC = 4 × 3 × 2 = 24
What if conditional independence was not assumed? O(k^p) for p k-valued attributes, multiplied by m classes.
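The two counts can be compared directly. This sketch takes the slide's figures (4 attributes with 3 values each, 2 classes) and counts one probability per attribute value per class for NBC, versus one per full combination of attribute values per class without the independence assumption. The function names are illustrative.

```python
import math


def nbc_probability_count(values_per_attribute, n_classes):
    """One P(x_k | C_i) for every attribute value and every class."""
    return sum(values_per_attribute) * n_classes


def full_joint_probability_count(values_per_attribute, n_classes):
    """One P(x_1, ..., x_p | C_i) for every value combination and every class."""
    return math.prod(values_per_attribute) * n_classes


values = [3, 3, 3, 3]  # distinct values per attribute, as on the slide
print(nbc_probability_count(values, 2))         # 24
print(full_joint_probability_count(values, 2))  # 162
```

The first count grows linearly with the number of attributes, the second exponentially (k^p per class), which is why estimating P(X|Ci) directly quickly becomes infeasible.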

Bayesian Belief Networks

Naïve BC assumes class conditional independence
This assumption simplifies computations
When this assumption holds true, Naïve BC is the most accurate compared to all other classifiers
In real problems, dependencies do exist between variables
2 methods to overcome this limitation of NBC:
Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
Decision trees, which reason on one attribute at a time, considering the most important attributes first

Conditional Independence

Let X, Y, and Z denote three sets of random variables. The variables in X are said to be conditionally independent of Y, given Z, if
P(X|Y,Z) = P(X|Z)

Relationship between a person's arm length and his/her reading skills!!
One might observe that people with longer arms tend to have higher levels of reading skills.
How do you explain this relationship?

Conditional Independence

Can be explained through a confounding factor, AGE
A young child tends to have short arms and lacks the reading skills of an adult
If the age of a person is fixed, then the observed relationship between arm length and reading skills disappears
We can thus conclude that arm length and reading skills are conditionally independent when the age variable is fixed
P(reading skills | long arms, age) = P(reading skills | age)
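The arm-length example can be made concrete with a toy model. All probabilities below are invented for illustration, and the model is built so that arm length and reading skill each depend only on age. The sketch then shows that the two look related when age is ignored but not once age is fixed.

```python
# Hypothetical model (assumed numbers): age influences both arm length
# and reading skill; there is no direct link between the two.
P_AGE = {"child": 0.3, "adult": 0.7}
P_LONG_ARMS = {"child": 0.1, "adult": 0.9}    # P(arms = long | age)
P_GOOD_READER = {"child": 0.2, "adult": 0.8}  # P(reading = good | age)


def joint(age, long_arms, good_reader):
    """P(age, arms, reading) = P(age) * P(arms | age) * P(reading | age)."""
    p_arm = P_LONG_ARMS[age] if long_arms else 1.0 - P_LONG_ARMS[age]
    p_read = P_GOOD_READER[age] if good_reader else 1.0 - P_GOOD_READER[age]
    return P_AGE[age] * p_arm * p_read


def p_good_reader(long_arms, age=None):
    """P(reading = good | arms) or, if age is given, P(reading = good | arms, age)."""
    ages = [age] if age is not None else list(P_AGE)
    num = sum(joint(a, long_arms, True) for a in ages)
    den = sum(joint(a, long_arms, r) for a in ages for r in (True, False))
    return num / den


# Ignoring age, arm length appears to predict reading skill.
print(round(p_good_reader(True), 3), round(p_good_reader(False), 3))  # 0.773 0.324

# With age fixed, arm length carries no further information.
print(round(p_good_reader(True, "child"), 3), round(p_good_reader(False, "child"), 3))  # 0.2 0.2
```

Because the independence is built into the model, this is a demonstration of the definition rather than evidence about real people.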

$$P(X \mid C_i) = P(x_1, x_2, x_3, \ldots, x_n \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

Bayesian Belief Networks

Also known as:
Belief Networks
Bayesian Networks
Probabilistic Networks

Bayesian Belief Networks

The conditional independence (CI) assumption made by NBC may be too rigid, especially for classification problems in which attributes are somewhat correlated.

We need a more flexible approach for modeling the class conditional probabilities

$$P(X \mid C_i) = P(x_1, x_2, x_3, \ldots, x_n \mid C_i)$$

Instead of requiring that all the attributes be CI given the class, BBN allows us to specify which pairs of attributes are CI.

Bayesian Belief Networks

A belief network has 2 components:
Directed Acyclic Graph (DAG)
Conditional Probability Table (CPT)

Bayesian Belief Networks

A node in BBN is CI of its non-descendants, if its parents are known

Bayesian Belief Networks

[Figure: belief network (DAG) with nodes FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay, and Dyspnea]

The conditional probability table for the variable LungCancer:
       (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC     0.8       0.5        0.7        0.1
~LC    0.2       0.5        0.3        0.9

Bayesian Belief Networks

6 Boolean variables

Arcs allow representation of causal knowledge

Having lung cancer is influenced by family history and smoking
PositiveXRay is independent of whether the patient has a family history (FH) or is a smoker, given that we know that the patient has lung cancer

Once we know the outcome of LungCancer, FH and Smoker do not provide any additional information about PositiveXRay

Bayesian Belief Networks

LungCancer is CI of Emphysema, given its parents, FH and Smoker
A BBN has a Conditional Probability Table (CPT) for each variable in the DAG

The CPT for a variable Y specifies the conditional distribution P(Y|Parents(Y))

CPT for LungCancer:
       (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC     0.8       0.5        0.7        0.1
~LC    0.2       0.5        0.3        0.9

P(LC=Y | FH=Y, S=Y) = 0.8
P(LC=N | FH=N, S=N) = 0.9
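A CPT for a Boolean variable can be stored compactly by keeping only the "true" row, since the other row is its complement. The sketch below encodes the LungCancer table this way; the dictionary layout and the function name are illustrative choices.

```python
# P(LungCancer = True | FamilyHistory, Smoker), keyed by (FH, S).
CPT_LC = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}


def p_lung_cancer(lc, fh, s):
    """Look up P(LC = lc | FH = fh, S = s) from the CPT."""
    p_true = CPT_LC[(fh, s)]
    return p_true if lc else 1.0 - p_true


print(p_lung_cancer(True, True, True))               # 0.8
print(round(p_lung_cancer(False, False, False), 3))  # 0.9
```

Each column of the table sums to 1, which is why storing the LC row alone loses no information.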

Bayesian Belief Networks

Let X = (x1, x2, …, xn) be a tuple described by the variables or attributes Y1, Y2, …, Yn, respectively
Each variable is CI of its non-descendants given its parents

This allows the DAG to provide a complete representation of the existing joint probability distribution by:

$$P(x_1, x_2, x_3, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Parents}(Y_i))$$

where P(x1, x2, x3, …, xn) is the probability of a particular combination of values of X, and the values for P(xi|Parents(Yi)) correspond to the entries in the CPT for Yi
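The factorization can be tried on the three-variable fragment FamilyHistory, Smoker, LungCancer. In this sketch the LungCancer entries are taken from the CPT in the slides, while the priors P(FH) = 0.2 and P(S) = 0.3, and the treatment of FH and S as parentless nodes, are assumptions made only so the example runs.

```python
from itertools import product

# Parents of each variable in the fragment (FH and S assumed to be root nodes).
PARENTS = {"FH": (), "S": (), "LC": ("FH", "S")}

# CPTS[var][parent_values] = P(var = True | parents = parent_values)
CPTS = {
    "FH": {(): 0.2},  # hypothetical prior, not from the slides
    "S": {(): 0.3},   # hypothetical prior, not from the slides
    "LC": {
        (True, True): 0.8,
        (True, False): 0.5,
        (False, True): 0.7,
        (False, False): 0.1,
    },
}


def joint_probability(assignment):
    """P(x_1, ..., x_n) as the product of P(x_i | Parents(Y_i))."""
    p = 1.0
    for var, parents in PARENTS.items():
        parent_values = tuple(assignment[q] for q in parents)
        p_true = CPTS[var][parent_values]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


# Joint probability of one particular combination of values.
print(round(joint_probability({"FH": True, "S": True, "LC": True}), 3))  # 0.048

# Marginal P(LC = True), obtained by summing the joint over FH and S.
p_lc = sum(
    joint_probability({"FH": fh, "S": s, "LC": True})
    for fh, s in product([True, False], repeat=2)
)
print(round(p_lc, 3))  # 0.342
```

Summing the joint over the unobserved variables is the simplest form of inference; it is exact but scales poorly as the number of variables grows.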

Bayesian Belief Networks

A node within the network can be selected as an output node, representing a class label attribute
There may be more than one output node
Rather than returning a single class label, the classification process can return a probability distribution that gives the probability of each class

Training BBN!!

Training BBN
A number of scenarios are possible

The network topology may be given in advance or inferred from data

Variables may be observable or hidden (missing or incomplete data) in all or some of the training tuples

Many algorithms exist for learning the network topology from the training data given observable attributes

If the network topology is known and the variables are observable, training is straightforward (just compute the CPT entries)
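For the simplest scenario, known topology and fully observed variables, computing the CPT entries amounts to counting. The sketch below estimates P(child = True | parents) by relative frequency; the records are hypothetical and the function name `estimate_cpt` is an illustrative choice.

```python
from collections import Counter


def estimate_cpt(records, child, parents):
    """Estimate P(child = True | parents) for each observed parent configuration."""
    parent_counts = Counter()
    child_true_counts = Counter()
    for rec in records:
        key = tuple(rec[p] for p in parents)
        parent_counts[key] += 1
        if rec[child]:
            child_true_counts[key] += 1
    return {key: child_true_counts[key] / n for key, n in parent_counts.items()}


# Hypothetical fully observed training tuples (not from the slides).
records = [
    {"FH": True, "S": True, "LC": True},
    {"FH": True, "S": True, "LC": True},
    {"FH": True, "S": True, "LC": False},
    {"FH": False, "S": False, "LC": False},
    {"FH": False, "S": False, "LC": False},
    {"FH": False, "S": False, "LC": True},
    {"FH": False, "S": False, "LC": False},
]

cpt_lc = estimate_cpt(records, child="LC", parents=("FH", "S"))
print(cpt_lc[(False, False)])          # 0.25
print(round(cpt_lc[(True, True)], 3))  # 0.667
```

Parent configurations that never occur in the data get no entry here; in practice a correction such as the Laplace estimator could be applied to the counts.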

Training BBNs
Topology given, but some variables are hidden

Gradient descent (self study)

Falls under the class of algorithms called Adaptive Probabilistic Networks

BBNs are computationally expensive
BBNs provide an explicit representation of causal structure

Domain experts can provide prior knowledge to the training process in the form of topology and/or conditional probability values
This leads to significant improvement in the learning process
