Aiml PBL
PBL REPORT
on
“Jaundice Prediction Model Using Naive Bayes Algorithm”
SUBMITTED BY:
CERTIFICATE
INDEX
1. Introduction
2. Literature Review
3. The Problem Definition
4. Objectives
5. Dataset
6. Calculations
7. Program
8. Output
9. Conclusion
INTRODUCTION
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine the probability of a
hypothesis given prior knowledge. It is based on conditional probability. The formula for Bayes'
theorem is given as:

P(A|B) = P(B|A) × P(A) / P(B)

Where:
P(A|B) is the posterior probability of hypothesis A given evidence B,
P(B|A) is the likelihood of observing evidence B when A is true,
P(A) is the prior probability of A, and
P(B) is the probability of the evidence B.

The Naive Bayes classifier algorithm applies this rule for classification, with the simplifying
assumption that all features are conditionally independent of each other given the class.
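As a quick illustration, the theorem can be applied directly in a few lines of Python. The numbers
below are purely hypothetical and are not taken from the project dataset:

# Hypothetical figures: A = patient has jaundice, B = yellow skin is observed
p_a = 0.10              # prior P(A): assumed prevalence of jaundice
p_b_given_a = 0.90      # likelihood P(B|A): yellow skin when jaundice is present
p_b = 0.15              # evidence P(B): overall rate of yellow skin
posterior = p_b_given_a * p_a / p_b
print(posterior)        # P(A|B) ≈ 0.6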
Jaundice, characterized by yellowing of the skin and dark urine due to elevated
bilirubin levels, is an important indicator of various underlying conditions like liver disease
and hemolytic anemia. Early diagnosis is critical, and machine learning (ML) models are
increasingly used in healthcare to predict jaundice based on patient data.
Challenges
However, challenges remain in the integration of such models into clinical practice. Data
quality is a significant concern, as incomplete or inconsistent patient records can affect
model accuracy. Additionally, the interpretability of more complex models like deep
learning remains limited, which may hinder their adoption by clinicians.
Problem Definition
Problem Statement:
Develop a healthcare system that predicts the likelihood of a patient having jaundice based
on symptoms such as yellowing of the skin and dark urine.
Problem Description:
In healthcare settings, there is a need for a system that leverages patient data, such
as symptoms (yellowing of the skin, fatigue) and test results (bilirubin levels), to predict
the likelihood of jaundice. Such a system can assist medical professionals in making
quicker and more accurate decisions, improving patient outcomes.
OBJECTIVES
1. Data Collection and Preprocessing:
Collect and preprocess patient data, including symptoms (e.g., yellowing of skin, fatigue) and lab
results (e.g., bilirubin levels). Handle missing or inconsistent data to ensure clean and reliable
inputs for the model.
2. Feature Selection:
Identify key attributes (features) most relevant to jaundice diagnosis, such as bilirubin levels, liver
function test results, and patient-reported symptoms. Reduce dimensionality if necessary to
improve model performance.
3. Model Training:
Implement the Naive Bayes algorithm to classify patients as "Jaundice" or "No Jaundice." Train
the model on labeled datasets of patient records, ensuring balanced classes for fair learning.
4. Probabilistic Analysis:
Calculate conditional probabilities for symptoms given the presence or absence of jaundice.
Leverage the Naive Bayes assumption of feature independence for simplicity and efficiency.
5. Model Evaluation:
Use a portion of the dataset for validation and testing to measure the accuracy, precision, recall,
and F1 score of the model, as sketched below. Tune hyperparameters to optimize performance.
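A minimal sketch of the evaluation step follows. It assumes the true labels and model predictions
are already available as 0/1 lists and uses scikit-learn's metric functions; both the library
choice and the example values are assumptions, not details from the original report:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# y_true: actual diagnoses, y_pred: model predictions (1 = Jaundice, 0 = No Jaundice)
y_true = [1, 0, 1, 1, 0, 1, 0]   # hypothetical held-out labels
y_pred = [1, 0, 1, 0, 0, 1, 1]   # hypothetical predictions
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))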
Dataset
Calculations
Likelihood table: conditional probability of each symptom given the class.

Symptom        Yes    No
Jaundice       4/7    2/3
Yellow Skin    3/7    2/3
Because the evidence term P(Yellow Skin) × P(Jaundice) is the same for both classes, it is enough
to compare the numerators of Bayes' theorem for each class:

P(Yes | Yellow Skin, Jaundice) ∝ P(Yellow Skin | Yes) × P(Jaundice | Yes) × P(Yes) = 0.17
P(No | Yellow Skin, Jaundice) ∝ P(Yellow Skin | No) × P(Jaundice | No) × P(No) = 0.13

Since 0.17 > 0.13, the classifier predicts "Yes" (jaundice present) for this record.
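These figures can be reproduced in a few lines of Python. The class priors of 7/10 and 3/10 are an
assumption inferred from the likelihood table (7 "Yes" and 3 "No" records), since the dataset
itself is not reproduced above:

# Unnormalized posteriors for a record with Yellow Skin = Yes and Jaundice = Yes
p_yes = (3/7) * (4/7) * (7/10)   # P(Yellow Skin|Yes) * P(Jaundice|Yes) * P(Yes)
p_no  = (2/3) * (2/3) * (3/10)   # P(Yellow Skin|No)  * P(Jaundice|No)  * P(No)
print(round(p_yes, 2), round(p_no, 2))   # 0.17 0.13 -> "Yes" has the higher score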
Program
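The full program listing from the report is not reproduced here. Below is a minimal sketch of how
the classifier described above could be implemented in Python with scikit-learn; the file name
jaundice.csv, the column names, and the choice of CategoricalNB are assumptions for illustration,
not details taken from the original program:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB
from sklearn.metrics import accuracy_score

# Load the symptom dataset and encode Yes/No values as 1/0 (hypothetical file and columns)
data = pd.read_csv("jaundice.csv").replace({"Yes": 1, "No": 0})
X = data.drop(columns=["Class"])      # assumed symptom features: Yellow_Skin, Dark_Urine, Fatigue
y = data["Class"]                     # target: 1 = Jaundice, 0 = No Jaundice

# Hold out part of the records for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the Naive Bayes classifier on the categorical symptom features
model = CategoricalNB()
model.fit(X_train, y_train)

# Evaluate on the held-out records
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

# Predict for a new patient: Yellow_Skin = Yes, Dark_Urine = Yes, Fatigue = No
new_patient = pd.DataFrame([[1, 1, 0]], columns=X.columns)
print(model.predict(new_patient))     # 1 = Jaundice, 0 = No Jaundice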
Output
Conclusion
The Naive Bayes classifier proved to be a simple yet effective tool for binary
classification tasks, like predicting the presence of jaundice from observed symptoms.
However, its performance depends on the quality and size of the training data. While the
model worked well with the small dataset used here, larger, more diverse datasets would
improve its accuracy and generalizability.