Bayes Theorem
Bayes' theorem is named after Thomas Bayes, an 18th-century English statistician and
philosopher.
It is used to find the probability of an event based on prior knowledge of conditions that
might be related to that event.
Bayes' theorem is also known as Bayes' rule or Bayes' law. It determines the conditional
probability of event A given that event B has already happened.
The general statement of Bayes’ theorem is “The conditional probability of an event A, given the
occurrence of another event B, is equal to the product of the probability of B given A and the
probability of A, divided by the probability of event B.” i.e.
P(A|B) = P(B|A)P(A) / P(B)
where,
P(A) and P(B) are the probabilities of events A and B, with P(B) > 0
P(A|B) is the probability of event A given that event B has occurred
P(B|A) is the probability of event B given that event A has occurred
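As a quick numeric check of the formula, here is a minimal Python sketch. The function name bayes_posterior and all probabilities below are made up for illustration; they do not come from any dataset. P(B) is obtained from the law of total probability.

```python
def bayes_posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only for illustration
p_a = 0.2              # P(A)
p_b_given_a = 0.9      # P(B|A)
p_b_given_not_a = 0.3  # P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes_posterior(p_b_given_a, p_a, p_b)
print(round(posterior, 4))  # 0.4286
```

Observing B raises the probability of A from 0.2 to roughly 0.43 in this made-up case, because B is three times more likely when A holds.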
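Applied to a classification dataset, Bayes' theorem can be written as:
P(y | X) = P(X | y) P(y) / P(X)
where y is the class variable and X is a feature vector (of size n):
X = (x1, x2, ..., xn)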
For a concrete example of a feature vector and its corresponding class variable, refer to the
1st row of the dataset.
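Naive Bayes makes the "naive" assumption that the features are conditionally independent given
the class, so that:
P(X | y) = P(x1 | y) P(x2 | y) ... P(xn | y)
Now, as the denominator P(X) is the same for every class for a given input, we can remove that
term and work with a proportionality:
P(y | x1, ..., xn) ∝ P(y) P(x1 | y) P(x2 | y) ... P(xn | y)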
Now, we need to create a classifier model. For this, we find the probability of the given set of
inputs for every possible value of the class variable y and pick the output with the maximum
probability. This can be expressed mathematically as:
y = argmax_y P(y) P(x1 | y) P(x2 | y) ... P(xn | y)
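A minimal Python sketch of this decision rule, choosing the class y that maximizes P(y) multiplied by every P(xi | y). The function name predict, the table layout, and all probabilities are hypothetical; in particular, they are not taken from the weather dataset used in this document.

```python
from math import prod


def predict(x, priors, likelihoods):
    """Return the class maximizing P(y) * prod_i P(x_i | y), plus all scores."""
    scores = {
        y: priors[y] * prod(likelihoods[y][i][value] for i, value in enumerate(x))
        for y in priors
    }
    return max(scores, key=scores.get), scores


# Hypothetical probabilities for two classes and two categorical features
priors = {"yes": 0.6, "no": 0.4}
likelihoods = {
    "yes": [{"sunny": 0.3, "rainy": 0.7}, {"hot": 0.4, "cool": 0.6}],
    "no": [{"sunny": 0.8, "rainy": 0.2}, {"hot": 0.7, "cool": 0.3}],
}

label, scores = predict(("sunny", "hot"), priors, likelihoods)

# The scores are unnormalized; dividing by their sum recovers P(y | X)
total = sum(scores.values())
posteriors = {y: s / total for y, s in scores.items()}

print(label)       # no
print(posteriors)
```

Normalizing is optional for classification: dividing every score by the same total does not change which class has the largest value.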
So, finally, we are left with the task of calculating P(y) and P(xi | y).
Note that P(y) is also called the class probability (or prior) and P(xi | y) is called the
conditional probability.
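For categorical features, both quantities can be estimated by counting. The sketch below assumes a tiny invented table of rows; the rows, the variable names, and the helper conditional are hypothetical and are not the document's weather dataset. No smoothing is applied, so a value never seen with a class gets probability 0.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows of (feature tuple, class label)
rows = [
    (("sunny", "hot"), "no"),
    (("sunny", "cool"), "yes"),
    (("rainy", "cool"), "yes"),
    (("rainy", "hot"), "no"),
    (("sunny", "cool"), "yes"),
]

class_counts = Counter(y for _, y in rows)

# P(y): fraction of rows belonging to each class
priors = {y: c / len(rows) for y, c in class_counts.items()}

# counts[(i, y)][v] = number of class-y rows whose i-th feature equals v
counts = defaultdict(Counter)
for x, y in rows:
    for i, v in enumerate(x):
        counts[(i, y)][v] += 1


def conditional(i, v, y):
    """Estimate P(x_i = v | y) as a relative frequency (no smoothing)."""
    return counts[(i, y)][v] / class_counts[y]


print(priors)
print(conditional(0, "sunny", "yes"))
```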
The different naive Bayes classifiers differ mainly by the assumptions they make regarding the
distribution of P(xi | y).
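As one illustration of such an assumption: for a continuous feature, Gaussian naive Bayes models P(xi | y) as a normal density with a per-class mean and variance, whereas categorical features typically use relative frequencies. A minimal sketch follows; the function name and the numbers are hypothetical.

```python
import math


def gaussian_likelihood(x: float, mean: float, var: float) -> float:
    """Density of N(mean, var) at x, used as P(x_i | y) for a numeric feature."""
    if var <= 0:
        raise ValueError("variance must be positive")
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical per-class mean and variance of a single numeric feature
at_mean = gaussian_likelihood(1.0, mean=1.0, var=1.0)
off_mean = gaussian_likelihood(3.0, mean=1.0, var=1.0)
print(at_mean, off_mean)
```

Values closer to the class mean receive a higher density, so they contribute more to that class's score.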
Let us now apply the naive Bayes classification rule manually to our weather dataset.