
Program 7

This laboratory manual outlines an experiment to construct a Bayesian network model for diagnosing heart disease using medical data. The objectives are to implement machine learning concepts in Python and use data sets. The procedure demonstrates building a Bayesian network with 5 nodes using lung cancer data, defining conditional probability tables, and performing inference to calculate disease probabilities based on evidence. The results show applying the approach to a heart disease data set, learning conditional probability distributions with maximum likelihood estimation, and outputting probability queries.

Uploaded by

Pavithra

COURSE LABORATORY MANUAL

1. EXPERIMENT NO: 7
2. TITLE: BAYESIAN NETWORK
3. LEARNING OBJECTIVES:
• Make use of Data sets in implementing the machine learning algorithms.
• Implement ML concepts and algorithms in Python

4. AIM:
• Write a program to construct a Bayesian network from medical data. Use this model
to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
You can use Java/Python ML library classes/APIs.

5. THEORY:
• Bayesian networks are a convenient way to represent probabilistic relationships
between multiple events.
• Bayesian networks as graphs - A Bayesian network is usually drawn as a
directed acyclic graph in which each node is a hypothesis or a random
variable, that is, something that takes at least 2 possible values to
which you can assign probabilities. For example, there can be a node
that represents the state of the dog (barking or not barking at the
window), the weather (raining or not raining), etc.
• The arrows between nodes represent the conditional dependencies
between them: how information about the state of one node changes
the probability distribution of another node it is connected to.
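
As a small illustration of the last point, the sketch below uses a two-node network, Rain -> Barking, built from the dog and weather examples above. All probability values are made up for illustration and do not come from any data set; the function name is likewise hypothetical. It shows how observing one node (the dog barks) changes the probability of the node it is connected to (rain), using Bayes' rule.

```python
# Two-node network: Rain -> Barking.
# All numbers below are assumed illustration values, not measured data.
P_RAIN = 0.2                 # prior P(rain)
P_BARK_GIVEN_RAIN = 0.7      # P(barking | raining)
P_BARK_GIVEN_DRY = 0.3       # P(barking | not raining)


def posterior_rain_given_bark(p_rain, p_bark_given_rain, p_bark_given_dry):
    """Return P(rain | barking) using Bayes' rule."""
    joint_rain = p_rain * p_bark_given_rain          # P(rain, barking)
    joint_dry = (1.0 - p_rain) * p_bark_given_dry    # P(no rain, barking)
    return joint_rain / (joint_rain + joint_dry)


posterior = posterior_rain_given_bark(P_RAIN, P_BARK_GIVEN_RAIN, P_BARK_GIVEN_DRY)
print('Prior P(rain)          =', P_RAIN)
print('P(rain | dog barking)  =', round(posterior, 3))
```

With these assumed numbers, hearing the dog bark raises the probability of rain above its prior, which is the kind of update an arrow in the graph encodes.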

6. PROCEDURE / PROGRAMME :

Program illustrating Bayesian belief networks with 5 nodes, using lung cancer data. (The
conditional probabilities are given.)

# Note: newer pgmpy releases rename BayesianModel (e.g. to BayesianNetwork).
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Define the structure with nodes and edges
cancer_model = BayesianModel([('Pollution', 'Cancer'),
                              ('Smoker', 'Cancer'),
                              ('Cancer', 'Xray'),
                              ('Cancer', 'Dyspnoea')])

print('Bayesian network nodes are:')
print('\t', cancer_model.nodes())
print('Bayesian network edges are:')
print('\t', cancer_model.edges())

# Create the conditional probability tables (CPDs)
cpd_poll = TabularCPD(variable='Pollution', variable_card=2,
                      values=[[0.9], [0.1]])
cpd_smoke = TabularCPD(variable='Smoker', variable_card=2,
                       values=[[0.3], [0.7]])
cpd_cancer = TabularCPD(variable='Cancer', variable_card=2,
                        values=[[0.03, 0.05, 0.001, 0.02],
                                [0.97, 0.95, 0.999, 0.98]],
                        evidence=['Smoker', 'Pollution'],
                        evidence_card=[2, 2])
cpd_xray = TabularCPD(variable='Xray', variable_card=2,
                      values=[[0.9, 0.2], [0.1, 0.8]],
                      evidence=['Cancer'], evidence_card=[2])
cpd_dysp = TabularCPD(variable='Dyspnoea', variable_card=2,
                      values=[[0.65, 0.3], [0.35, 0.7]],
                      evidence=['Cancer'], evidence_card=[2])

# Associate the parameters with the model structure
cancer_model.add_cpds(cpd_poll, cpd_smoke, cpd_cancer, cpd_xray, cpd_dysp)
print('Model generated by adding conditional probability distributions (CPDs)')

# Check that the CPDs are valid for the model
print('Checking for correctness of model: ', end='')
print(cancer_model.check_model())

# Optional: list the independencies implied by the structure
# print('All local independencies are as follows')
# print(cancer_model.get_independencies())

print('Displaying CPDs')
print(cancer_model.get_cpds('Pollution'))
print(cancer_model.get_cpds('Smoker'))
print(cancer_model.get_cpds('Cancer'))
print(cancer_model.get_cpds('Xray'))
print(cancer_model.get_cpds('Dyspnoea'))

# Inference with the Bayesian network
cancer_infer = VariableElimination(cancer_model)
print('\nInferencing with Bayesian Network')

# Probability of Cancer given Smoker.
# On pgmpy releases where query() returns the factor directly, use print(q).
print('\nProbability of Cancer given Smoker')
q = cancer_infer.query(variables=['Cancer'], evidence={'Smoker': 1})
print(q['Cancer'])

# Probability of Cancer given Smoker and Pollution
print('\nProbability of Cancer given Smoker,Pollution')
q = cancer_infer.query(variables=['Cancer'], evidence={'Smoker': 1, 'Pollution': 1})
print(q['Cancer'])
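
The two queries above can be cross-checked by hand. The following is a minimal sketch in plain Python (no pgmpy) that enumerates the joint distribution using the CPD values given in the program. It assumes the columns of the Cancer table are ordered with the first evidence variable (Smoker) varying slowest, and it leaves out Xray and Dyspnoea because unobserved child nodes sum out to 1. The helper names `p_cancer` and `cancer_given` are hypothetical.

```python
from itertools import product

# CPD values copied from the program above (index = state number).
p_pollution = [0.9, 0.1]
p_smoker = [0.3, 0.7]
cancer_rows = [[0.03, 0.05, 0.001, 0.02],    # P(Cancer=0 | Smoker, Pollution)
               [0.97, 0.95, 0.999, 0.98]]    # P(Cancer=1 | Smoker, Pollution)


def p_cancer(c, s, p):
    """P(Cancer=c | Smoker=s, Pollution=p).

    Assumption: columns are ordered (S0,P0), (S0,P1), (S1,P0), (S1,P1).
    """
    return cancer_rows[c][2 * s + p]


def cancer_given(smoker=None, pollution=None):
    """Return [P(Cancer=0 | evidence), P(Cancer=1 | evidence)] by enumeration."""
    scores = [0.0, 0.0]
    for s, p, c in product(range(2), repeat=3):
        # Skip assignments that contradict the evidence.
        if smoker is not None and s != smoker:
            continue
        if pollution is not None and p != pollution:
            continue
        scores[c] += p_smoker[s] * p_pollution[p] * p_cancer(c, s, p)
    total = sum(scores)          # probability of the evidence
    return [x / total for x in scores]


given_smoker = cancer_given(smoker=1)
given_both = cancer_given(smoker=1, pollution=1)
print('P(Cancer | Smoker=1)              =', given_smoker)
print('P(Cancer | Smoker=1, Pollution=1) =', given_both)
```

Variable elimination reaches the same numbers without building the full joint table, which matters once the network has more than a handful of nodes.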

Program as per the Syllabus

import csv

import numpy as np
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
# Note: newer pgmpy releases rename BayesianModel (e.g. to BayesianNetwork).
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination

# Read the attribute names
with open('data7_names.csv', 'r') as names_file:
    lines = list(csv.reader(names_file))
attributes = lines[0]

# Read the Cleveland Heart Disease data; '?' marks a missing value
heartDisease = pd.read_csv('data7_heart.csv', names=attributes)
heartDisease = heartDisease.replace('?', np.nan)

# Display the data
#print('Few examples from the dataset are given below')
#print(heartDisease.head())
#print('\nAttributes and datatypes')
#print(heartDisease.dtypes)

# Model the Bayesian network (the duplicated ('sex', 'trestbps') edge is listed once)
model = BayesianModel([('age', 'trestbps'), ('age', 'fbs'), ('sex', 'trestbps'),
                       ('exang', 'trestbps'), ('trestbps', 'heartdisease'), ('fbs', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'thalach'), ('heartdisease', 'chol')])

# Learn the CPDs using maximum likelihood estimation
print('\nLearning CPDs using Maximum Likelihood Estimators...')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Inference with the Bayesian network
print('\nInferencing with Bayesian Network:')
HeartDisease_infer = VariableElimination(model)

# Probability of heart disease given age.
# Note: the printed label says Age=20 while the evidence passed is 28 (kept as in the source).
# On pgmpy releases where query() returns the factor directly, use print(q).
print('\n1.Probability of HeartDisease given Age=20')
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'age': 28})
print(q['heartdisease'])

# Probability of heart disease given cholesterol
print('\n2. Probability of HeartDisease given chol (Cholestoral) =100')
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'chol': 100})
print(q['heartdisease'])
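
For discrete data, maximum likelihood estimation of a CPD amounts to counting how often each state occurs (per parent state) and normalizing. The sketch below illustrates that idea in plain Python; it is not pgmpy's implementation. It uses only the `sex`, `restecg` and `heartdisease` values of the five sample rows listed in the Results section, so the estimates are illustrative and will differ from those learned on all 303 instances. The function names are hypothetical.

```python
from collections import Counter, defaultdict

# sex, restecg and heartdisease taken from the five sample rows of data7_heart.csv.
rows = [
    {'sex': 1, 'restecg': 2, 'heartdisease': 0},
    {'sex': 1, 'restecg': 2, 'heartdisease': 2},
    {'sex': 1, 'restecg': 2, 'heartdisease': 1},
    {'sex': 1, 'restecg': 0, 'heartdisease': 0},
    {'sex': 0, 'restecg': 2, 'heartdisease': 0},
]


def mle_marginal(rows, variable):
    """Estimate P(variable) for a node without parents as relative frequencies."""
    counts = Counter(row[variable] for row in rows)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def mle_conditional(rows, variable, parent):
    """Estimate P(variable | parent) by normalizing counts within each parent state."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[parent]][row[variable]] += 1
    return {
        parent_state: {state: n / sum(c.values()) for state, n in c.items()}
        for parent_state, c in counts.items()
    }


p_sex = mle_marginal(rows, 'sex')                                   # root node
p_restecg = mle_conditional(rows, 'restecg', 'heartdisease')        # edge heartdisease -> restecg
print('P(sex)                    =', p_sex)
print('P(restecg | heartdisease) =', p_restecg)
```

Parent states that never appear in the data get no estimate at all under this scheme, which is one reason tiny samples such as these five rows are only suitable for illustration.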

7. RESULTS & CONCLUSIONS:

Dataset
data7_names.csv (14 attributes)
age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heartdisease

data7_heart.csv (5 instances out of 303)
63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0
67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2
67.0,1.0,4.0,120.0,229.0,0.0,2.0,129.0,1.0,2.6,2.0,2.0,7.0,1
37.0,1.0,3.0,130.0,250.0,0.0,0.0,187.0,0.0,3.5,3.0,0.0,3.0,0
41.0,0.0,2.0,130.0,204.0,0.0,2.0,172.0,0.0,1.4,1.0,0.0,3.0,0

Output
Learning CPDs using Maximum Likelihood Estimators...
Inferencing with Bayesian Network:
1.Probability of HeartDisease given Age=20
╒════════════════╤═════════════════════╕
│ heartdisease   │   phi(heartdisease) │
╞════════════════╪═════════════════════╡
│ heartdisease_0 │              0.6791 │
├────────────────┼─────────────────────┤
│ heartdisease_1 │              0.1212 │
├────────────────┼─────────────────────┤
│ heartdisease_2 │              0.0810 │
├────────────────┼─────────────────────┤
│ heartdisease_3 │              0.0939 │
├────────────────┼─────────────────────┤
│ heartdisease_4 │              0.0247 │
╘════════════════╧═════════════════════╛

2. Probability of HeartDisease given chol (Cholestoral) =100


╒════════════════╤═════════════════════╕
│ heartdisease   │   phi(heartdisease) │
╞════════════════╪═════════════════════╡
│ heartdisease_0 │              0.5400 │
├────────────────┼─────────────────────┤
│ heartdisease_1 │              0.1533 │
├────────────────┼─────────────────────┤
│ heartdisease_2 │              0.1303 │
├────────────────┼─────────────────────┤
│ heartdisease_3 │              0.1259 │
├────────────────┼─────────────────────┤
│ heartdisease_4 │              0.0506 │
╘════════════════╧═════════════════════╛
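
To turn such a table into a diagnosis, one option is to pick the most probable state and to add up the probabilities of the disease states. The sketch below does this for the two output tables above. It assumes the usual coding of the Cleveland data, in which `heartdisease` 0 means no disease and 1 to 4 indicate presence; the manual itself does not state this coding, and the `summarize` helper is hypothetical.

```python
# Probabilities copied from the two output tables above (index = heartdisease state).
query_age = [0.6791, 0.1212, 0.0810, 0.0939, 0.0247]
query_chol = [0.5400, 0.1533, 0.1303, 0.1259, 0.0506]


def summarize(probabilities):
    """Return (most probable state, total probability of states 1..4).

    Assumption: state 0 means 'no heart disease', states 1-4 mean 'disease present'.
    """
    most_probable = max(range(len(probabilities)), key=lambda s: probabilities[s])
    p_disease = sum(probabilities[1:])
    return most_probable, p_disease


for label, probs in [('age query', query_age), ('chol query', query_chol)]:
    state, p_disease = summarize(probs)
    print(f'{label}: most probable state = {state}, P(disease present) = {p_disease:.4f}')
```

Because the printed values are rounded to four decimals, each table sums to 1 only approximately.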

8. LEARNING OUTCOMES :
• The student will be able to apply a Bayesian network to medical data and demonstrate the
diagnosis of heart patients using the standard Heart Disease Data Set.

9. APPLICATION AREAS:
• Applicable in prediction and classification
• Gene Regulatory Networks
• Medicine
• Biomonitoring
• Document Classification
• Information Retrieval
• Semantic Search

10. REMARKS:
