Naive Bayes (Numerical Example)

The Naive Bayes algorithm is a supervised machine learning algorithm based on Bayes' Theorem, used mainly for classification problems.

The Naive Bayes classifier is one of the simplest and most effective classification algorithms; it helps in building fast machine learning models that can make quick predictions.

It is a probabilistic classifier, which means it predicts on the basis of the probability that an object belongs to a class. Some popular applications of the Naive Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.

Bayes Theorem:

Bayes' theorem, also known as Bayes' Rule or Bayes' Law, is used to determine the probability of a hypothesis given prior knowledge. It depends on conditional probability.

P(A|B) = (P(B|A) * P(A)) / P(B)

where:
P(A|B) is the Posterior probability: the probability of hypothesis A given the observed event B.

P(B|A) is the Likelihood: the probability of the evidence B given that hypothesis A is true.

P(A) is the Prior probability: the probability of the hypothesis before observing the evidence.

P(B) is the Marginal probability: the probability of the evidence.

P(A|B) = (P(B|A) * P(A)) / P(B) = P(A ∩ B) / P(B)

Similarly, P(B|A) = (P(A|B) * P(B)) / P(A) = P(B ∩ A) / P(A)

Since P(A ∩ B) = P(B ∩ A), it follows that P(B|A) * P(A) = P(A|B) * P(B).
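As a quick sanity check of the identity above, the following minimal sketch (plain Python, with illustrative probabilities chosen arbitrarily) computes a posterior from an assumed likelihood, prior, and evidence probability:

# Minimal sketch of Bayes' theorem, with made-up probabilities for illustration
def bayes_posterior(p_b_given_a, p_a, p_b):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

p_b_given_a = 0.8   # assumed likelihood P(B|A)
p_a = 0.3           # assumed prior P(A)
p_b = 0.5           # assumed evidence P(B)
p_a_given_b = bayes_posterior(p_b_given_a, p_a, p_b)
print(p_a_given_b)  # 0.48
# Symmetry check: P(B|A) * P(A) == P(A|B) * P(B)
print(abs(p_b_given_a * p_a - p_a_given_b * p_b) < 1e-12)  # True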

Types Of Naive Bayes:


There are three types of Naive Bayes models under the scikit-learn library:

• Gaussian: It is used for classification and assumes that the features follow a normal distribution.

• Multinomial: It is used for discrete counts. For example, in a text classification problem, it goes one step further than Bernoulli trials: instead of "word occurs in the document", we count how often the word occurs in the document. You can think of it as "the number of times outcome x_i is observed over the n trials".

• Bernoulli: The Bernoulli model is useful if your feature vectors are binary (i.e. zeros and ones). One application is text classification with a 'bag of words' model, where the 1s and 0s mean "word occurs in the document" and "word does not occur in the document" respectively. A small sketch comparing the three variants follows this list.
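As a rough illustration (not part of the original article), the sketch below fits each scikit-learn variant on a tiny made-up dataset; the feature matrices are invented purely to show which input type suits each model.

# Illustrative sketch: the three scikit-learn Naive Bayes variants on tiny, made-up data
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous features -> GaussianNB
X_cont = np.array([[1.2, 3.4], [0.9, 2.8], [4.5, 7.1], [5.0, 6.6]])
print(GaussianNB().fit(X_cont, y).predict([[4.8, 7.0]]))

# Word-count features -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 1, 0], [0, 4, 2], [0, 3, 3]])
print(MultinomialNB().fit(X_counts, y).predict([[0, 2, 2]]))

# Binary (word present / absent) features -> BernoulliNB
X_bin = (X_counts > 0).astype(int)
print(BernoulliNB().fit(X_bin, y).predict([[0, 1, 1]]))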

Numerical Example:

Consider the following training data of 1200 fruits, described by three features (Yellow, Sweet, Long). Given a new fruit X that is {Yellow, Sweet, Long}, which class (Mango, Banana, or Others) does Naive Bayes assign?

Fruit     Yellow   Sweet   Long   Total
Mango       350     450      0     650
Banana      400     300    350     400
Others       50     100     50     150
Total       800     850    400    1200

Solution:

Each conditional probability is obtained with Bayes' theorem: P(A|B) = (P(B|A) * P(A)) / P(B)

1. Mango:

P(X | Mango) = P(Yellow | Mango) * P(Sweet | Mango) * P(Long | Mango)

1.a) P(Yellow | Mango) = (P(Mango | Yellow) * P(Yellow)) / P(Mango)

= ((350/800) * (800/1200)) / (650/1200)

P(Yellow | Mango) = 0.53 → (1)

1.b) P(Sweet | Mango) = (P(Mango | Sweet) * P(Sweet)) / P(Mango)

= ((450/850) * (850/1200)) / (650/1200)

P(Sweet | Mango) = 0.69 → (2)

1.c) P(Long | Mango) = (P(Mango | Long) * P(Long)) / P(Mango)

= ((0/400) * (400/1200)) / (650/1200)

P(Long | Mango) = 0 → (3)

Multiplying (1), (2) and (3): P(X | Mango) = 0.53 * 0.69 * 0

P(X | Mango) = 0
2. Banana:

P(X | Banana) = P(Yellow | Banana) * P(Sweet | Banana) * P(Long | Banana)

2.a) P(Yellow | Banana) = (P(Banana | Yellow) * P(Yellow)) / P(Banana)

= ((400/800) * (800/1200)) / (400/1200)

P(Yellow | Banana) = 1 → (4)

2.b) P(Sweet | Banana) = (P(Banana | Sweet) * P(Sweet)) / P(Banana)

= ((300/850) * (850/1200)) / (400/1200)

P(Sweet | Banana) = 0.75 → (5)

2.c) P(Long | Banana) = (P(Banana | Long) * P(Long)) / P(Banana)

= ((350/400) * (400/1200)) / (400/1200)

P(Long | Banana) = 0.875 → (6)

Multiplying (4), (5) and (6): P(X | Banana) = 1 * 0.75 * 0.875

P(X | Banana) = 0.65


3. Others:

P(X | Others) = P(Yellow | Others) * P(Sweet | Others) * P(Long | Others)

3.a) P(Yellow | Others) = (P(Others | Yellow) * P(Yellow)) / P(Others)

= ((50/800) * (800/1200)) / (150/1200)

P(Yellow | Others) = 0.33 → (7)

3.b) P(Sweet | Others) = (P(Others | Sweet) * P(Sweet)) / P(Others)

= ((100/850) * (850/1200)) / (150/1200)

P(Sweet | Others) = 0.67 → (8)

3.c) P(Long | Others) = (P(Others | Long) * P(Long)) / P(Others)

= ((50/400) * (400/1200)) / (150/1200)

P(Long | Others) = 0.33 → (9)

Multiplying (7), (8) and (9): P(X | Others) = 0.33 * 0.67 * 0.33

P(X | Others) ≈ 0.072


Finally, since P(X | Mango) = 0, P(X | Banana) = 0.65 and P(X | Others) = 0.072, the largest value is P(X | Banana).

We can conclude that the fruit {Yellow, Sweet, Long} is a Banana.
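The short sketch below (not part of the original article) reproduces the table-based calculation in plain Python, estimating each P(feature | class) directly from the counts in the table above:

# Reproduce the fruit example using the counts from the table above
counts = {
    #           Yellow  Sweet  Long  Total
    'Mango':   (350,    450,   0,    650),
    'Banana':  (400,    300,   350,  400),
    'Others':  (50,     100,   50,   150),
}

scores = {}
for fruit, (yellow, sweet, long_, total) in counts.items():
    # P(feature | class) = count(feature, class) / count(class)
    scores[fruit] = (yellow / total) * (sweet / total) * (long_ / total)

print(scores)                       # {'Mango': 0.0, 'Banana': 0.65625, 'Others': 0.0740...}
print(max(scores, key=scores.get))  # Banana

With exact fractions the Others score comes out to about 0.074; the 0.072 in the worked solution comes from rounding each factor to two decimals before multiplying.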

Code:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Training the Naive Bayes model on the Training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# Predicting a new result (Age = 30, Estimated Salary = 87000)
print(classifier.predict(sc.transform([[30, 87000]])))

# Predicting the Test set results
y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1))

# Making the Confusion Matrix ('as' is a Python keyword, so use a different variable name)
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print(acc)

# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = sc.inverse_transform(X_train), y_train
# A coarser salary step keeps the prediction grid small enough to evaluate quickly
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 10, stop = X_set[:, 0].max() + 10, step = 0.25),
                     np.arange(start = X_set[:, 1].min() - 1000, stop = X_set[:, 1].max() + 1000, step = 250))
plt.contourf(X1, X2,
             classifier.predict(sc.transform(np.array([X1.ravel(), X2.ravel()]).T)).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Naive Bayes (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

# Visualising the Test set results
X_set, y_set = sc.inverse_transform(X_test), y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 10, stop = X_set[:, 0].max() + 10, step = 0.25),
                     np.arange(start = X_set[:, 1].min() - 1000, stop = X_set[:, 1].max() + 1000, step = 250))
plt.contourf(X1, X2,
             classifier.predict(sc.transform(np.array([X1.ravel(), X2.ravel()]).T)).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Naive Bayes (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

Pros:

• It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.

• When the assumption of independence holds, a Naive Bayes classifier performs better compared to other models like logistic regression, and it needs less training data.

• It performs well with categorical input variables compared to numerical variables. For numerical variables, a normal distribution is assumed (a bell curve, which is a strong assumption).

Cons:

• If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a zero probability and will be unable to make a prediction. This is often known as the "zero frequency" problem. To solve it, we can use a smoothing technique; one of the simplest is Laplace smoothing (see the sketch after this list).

• On the other side, Naive Bayes is also known to be a bad estimator, so the probability outputs from predict_proba are not to be taken too seriously.

• Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
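As a rough sketch of how smoothing avoids the zero-frequency problem (the counts below are made up), scikit-learn's MultinomialNB exposes Laplace/Lidstone smoothing through its alpha parameter:

# Illustrative sketch: Laplace smoothing via the alpha parameter of MultinomialNB
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Made-up word-count data; the third word never appears in class 0
X_train = np.array([[2, 1, 0], [3, 0, 0], [0, 2, 4], [1, 1, 3]])
y_train = np.array([0, 0, 1, 1])

# alpha=1.0 is Laplace (add-one) smoothing, so unseen words do not zero out a class
clf = MultinomialNB(alpha=1.0)
clf.fit(X_train, y_train)
print(clf.predict_proba([[2, 1, 1]]))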
