P 7

The document outlines the implementation of a Naïve Bayesian Classifier for classifying documents using Python, including data preprocessing, model training, and evaluation metrics such as accuracy, precision, and recall. It also describes the construction of a Bayesian network for diagnosing heart disease using a medical dataset, demonstrating user input for various health factors and predicting heart disease outcomes. The results indicate the model's performance metrics and the inference results based on user-provided data.

Uploaded by

dkrao.lic

Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.

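import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Load the labelled documents; the CSV has no header row.
msg = pd.read_csv('document.csv', names=['message', 'label'])
print("Total Instances of Dataset: ", msg.shape[0])

# Encode the class labels as integers: pos -> 1, neg -> 0.
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum

# Random train/test split; no seed is fixed, so the scores vary from run to run.
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)

# Learn the vocabulary on the training set only, then reuse it for the test set.
count_v = CountVectorizer()
Xtrain_dm = count_v.fit_transform(Xtrain)
Xtest_dm = count_v.transform(Xtest)

# Show the first five rows of the training document-term matrix.
# get_feature_names_out() replaces get_feature_names(), removed in scikit-learn 1.2.
df = pd.DataFrame(Xtrain_dm.toarray(), columns=count_v.get_feature_names_out())
print(df[0:5])

# Train multinomial naive Bayes and predict the held-out documents.
clf = MultinomialNB()
clf.fit(Xtrain_dm, ytrain)
pred = clf.predict(Xtest_dm)

# pred lines up with Xtest (not Xtrain), so pair each test document with its prediction.
for doc, p in zip(Xtest, pred):
    p = 'pos' if p == 1 else 'neg'
    print("%s -> %s" % (doc, p))

print('Accuracy Metrics: \n')
print('Accuracy: ', accuracy_score(ytest, pred))
print('Recall: ', recall_score(ytest, pred))
print('Precision: ', precision_score(ytest, pred))
print('Confusion Matrix: \n', confusion_matrix(ytest, pred))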

output:

Accuracy Metrics:

Accuracy: 0.6
Recall: 0.5
Precision: 1.0
Confusion Matrix:
[[1 0]
 [2 2]]
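
As a check on the figures above, the three scores can be recomputed by hand from the confusion matrix. scikit-learn puts true labels on the rows and predicted labels on the columns, so with labels 0 (neg) and 1 (pos) the entries read [[TN, FP], [FN, TP]]. The sketch below is only an illustration; the helper name scores_from_confusion is an assumption and not part of the original program.

```python
def scores_from_confusion(cm):
    """Return (accuracy, precision, recall) for a 2x2 matrix [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Confusion matrix taken from the output above.
accuracy, precision, recall = scores_from_confusion([[1, 0], [2, 2]])
print(accuracy, precision, recall)  # 0.6 1.0 0.5
```

Three of the five test documents are classified correctly (accuracy 0.6), both documents predicted as pos really are pos (precision 1.0), and two of the four actual pos documents are found (recall 0.5).
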
Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use
Java/Python ML library classes/API.

import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination
# Newer pgmpy releases rename this class to BayesianNetwork.
from pgmpy.models import BayesianModel

# read_csv already returns a DataFrame, so no extra conversion is needed.
heart_disease = pd.read_csv("heartdisease.csv")
print(heart_disease)

# Network structure: each tuple is a directed edge (parent, child).
model = BayesianModel([
    ('age', 'Lifestyle'),
    ('Gender', 'Lifestyle'),
    ('Family', 'heartdisease'),
    ('diet', 'cholestrol'),
    ('Lifestyle', 'diet'),
    ('cholestrol', 'heartdisease'),
])

# Estimate every conditional probability table from the data by maximum likelihood.
model.fit(heart_disease, estimator=MaximumLikelihoodEstimator)

HeartDisease_infer = VariableElimination(model)

print('For age Enter { SuperSeniorCitizen:0, SeniorCitizen:1, MiddleAged:2, Youth:3, Teen:4 }')
print('For Gender Enter { Male:0, Female:1 }')
print('For Family History Enter { yes:1, No:0 }')
print('For diet Enter { High:0, Medium:1 }')
print('For lifeStyle Enter { Athlete:0, Active:1, Moderate:2, Sedentary:3 }')
print('For cholesterol Enter { High:0, BorderLine:1, Normal:2 }')

# Query P(heartdisease | evidence) with every other variable supplied by the user.
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={
    'age': int(input('Enter age :')),
    'Gender': int(input('Enter Gender :')),
    'Family': int(input('Enter Family history :')),
    'diet': int(input('Enter diet :')),
    'Lifestyle': int(input('Enter Lifestyle :')),
    'cholestrol': int(input('Enter cholestrol :'))
})

# Older pgmpy returns a dict keyed by variable name; newer releases return
# the factor itself, in which case print(q) is the equivalent call.
print(q['heartdisease'])

output:

For age Enter { SuperSeniorCitizen:0, SeniorCitizen:1, MiddleAged:2, Youth:3, Teen:4 }
For Gender Enter { Male:0, Female:1 }
For Family History Enter { yes:1, No:0 }
For diet Enter { High:0, Medium:1 }
For lifeStyle Enter { Athlete:0, Active:1, Moderate:2, Sedentary:3 }
For cholesterol Enter { High:0, BorderLine:1, Normal:2 }
Enter age :1
Enter Gender :1
Enter Family history :0
Enter diet :1
Enter Lifestyle :0
Enter cholestrol :1
+----------------+---------------------+
| heartdisease   |   phi(heartdisease) |
+================+=====================+
| heartdisease_0 |              0.0000 |
+----------------+---------------------+
| heartdisease_1 |              1.0000 |
+----------------+---------------------+
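
In the network defined above, heartdisease has two parents (Family and cholestrol) and no children. Once both parents are observed, the remaining evidence (age, Gender, Lifestyle, diet) no longer changes the answer, so the query reduces to the conditional table P(heartdisease | Family, cholestrol), which maximum likelihood estimates as a relative frequency over the matching training rows. The sketch below shows that count-based estimate in plain Python. The rows are made-up values for illustration, not taken from heartdisease.csv, and conditional_distribution is a hypothetical helper, not a pgmpy function.

```python
from collections import Counter


def conditional_distribution(rows, target, given):
    """Relative-frequency estimate of P(target | given) from a list of dict rows."""
    matching = [r for r in rows if all(r[k] == v for k, v in given.items())]
    if not matching:
        return {}  # no training rows for this parent configuration
    counts = Counter(r[target] for r in matching)
    return {state: n / len(matching) for state, n in sorted(counts.items())}


# Made-up rows for illustration only (not from heartdisease.csv).
rows = [
    {'Family': 0, 'cholestrol': 1, 'heartdisease': 1},
    {'Family': 0, 'cholestrol': 1, 'heartdisease': 1},
    {'Family': 1, 'cholestrol': 0, 'heartdisease': 1},
    {'Family': 1, 'cholestrol': 0, 'heartdisease': 0},
    {'Family': 0, 'cholestrol': 2, 'heartdisease': 0},
]

posterior = conditional_distribution(rows, 'heartdisease', {'Family': 0, 'cholestrol': 1})
print(posterior)  # {1: 1.0}
```

A posterior of exactly 1.0, as in the output above, is consistent with every training row that matches the entered Family and cholestrol values having heartdisease = 1. With a small data set such hard 0/1 estimates are common and should be read with caution.
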
