Report Final Year Project Completed
HEART DISEASE PREDICTION USING MACHINE LEARNING
PID-BS IT-F18M10
MULTAN CAMPUS
UNIVERSITY OF EDUCATION
LAHORE
2018-22
@Copyright Muhammad Zain ul Abideen 2022
Signature: ___________________________________________
Date: ___________________________________________
Signature: __________________________________________
Date: ________________________________
DECLARATION
Signature: ___________________________________________
Signature: ___________________________________________
Signature: ___________________________________________
Date: ___________________________________________
PLAGIARISM UNDERTAKING
We solemnly declare that the research work presented in the research project entitled
“Heart Disease Prediction System Using Machine Learning” is solely our research
work with no significant contribution from any other person. Any small contribution or
help taken has been duly acknowledged, and the complete thesis has been written by us.
We undertake that if we are found guilty of any formal plagiarism in the above titled
thesis, even after the award of the BS IT degree, the University reserves the right to
withdraw/revoke our degree, and that HEC and the University have the right to publish
our names on the HEC/University website on which the names of students who submitted
plagiarized theses are placed.
Name: ___________________________________________
Signature: ___________________________________________
Name: ___________________________________________
Signature: ___________________________________________
Name: ___________________________________________
Signature: ___________________________________________
Date: ___________________________________________
CERTIFICATE OF APPROVAL
This is to certify that the research project presented in this thesis, entitled
“Heart Disease Prediction System Using Machine Learning”, was conducted by
Muhammad Zain-ul-Abideen, Waqas Razzaq and Muhammad Hunzala-bib-Javaid
under the supervision of Sir Shahid Touqeer.
No part of this research project has been submitted anywhere else for any
other degree. This thesis is submitted to the Division of Science and Technology,
University of Education, Lahore, in partial fulfillment of the requirements for the
degree of Bachelors in Information Technology.
Examination Committee:
1. External Examiner
Name: ____________________ Signature: _________________
ABSTRACT
Cardiovascular disease is one of the most life-threatening diseases. It contributes to
nearly 17 million deaths all over the world. Early diagnosis helps to treat the disease
in a timely manner and prevent mortality. Several machine learning and deep learning
techniques are available to classify the presence or absence of the disease.
In this research, the Logistic Regression (LR) technique is applied to a dataset to
classify cardiac disease. To improve the performance of the model, the data were
pre-processed by cleaning the dataset and checking for missing values, and feature
selection was performed by correlating every feature with the target value. The
features with high positive correlation were selected. Classification was then performed
by dividing the dataset into training and testing sets in the ratios 80:20, 70:30, 60:40
and 50:50. The 50:50 split gave the best accuracy: the LR model obtained 86.9%
accuracy on the training data.
Table of Contents
1. INTRODUCTION
2. LITERATURE REVIEW
2.1 Introduction
3. RESEARCH METHODOLOGY
3.6 Classification
3.7 Summary
4.2 Technologies
5. IMPLICATIONS/RECOMMENDATIONS
6. CONCLUSION
REFERENCES
LIST OF FIGURES
FIGURE NO.   TITLE
1.1   Gantt Chart
2.1   Flow Chart of naïve Bayes Decision
2.2   Flow Chart of Entire System
2.3   Data Flow Diagram of Model Working
2.4   Kappa, ROC, MAE for different algorithms
2.5   Relative Absolute Error of different algorithms
3.1   Work Break down Structure
3.2   Iterative Process Model
3.3   Logistic Regression Cardiac Disease Classification flow diagram
3.4   Distribution of Heart Disease in accordance to Gender
3.5   Heat Map of Subset Attributes
4.1   Accuracy Result of Logistic Regression Classifier on Training Data
4.2   Accuracy Result of Logistic Regression Classifier on Testing Data
4.3   ROC curve
LIST OF TABLES
TABLE NO.   TITLE
CHAPTER 1
INTRODUCTION
It is a system that gives ideas and tips for taking care of the user's health and shows
how to detect disease using prediction. By entering the symptoms and all other useful
information, the user can learn which disease he/she is affected by. The health
industry can also benefit from this technique: by asking the user for the values and
entering them into the system, in only a few seconds they can tell whether the heart is
in good condition or not.
These kinds of ML systems have been implemented by many other organizations, but
we intend to make ours unique and more useful to its users. This Heart Disease
Prediction Using Machine Learning system is built entirely with Machine Learning
algorithms and the Python programming language, and it predicts the disease using
datasets previously made available by hospitals.
Nowadays doctors use many technologies and methodologies to identify and diagnose
not only common diseases but also many deadly ones.
Successful treatment is normally attributed to exact and accurate analysis.
When doctors fail to make accurate decisions while examining a patient's disease,
disease forecasting systems that use ML algorithms can help.
reported with common diseases that have typical symptoms. In this fast-moving world,
people want to live a luxurious life, so they work like machines to earn a lot of money
and live comfortably. In this race they forget to take care of themselves: their food
habits and their entire lifestyle change. In this type of lifestyle they are more tense,
they develop blood pressure and sugar problems at a very young age, they do not give
themselves enough rest, they eat whatever they get, and they do not care about the
quality of the food. If they find themselves sick, they resort to self-medication. All
these small acts of negligence lead to a major threat: heart disease.
Data mining has been used in a variety of applications such as marketing, customer
relationship management, engineering, medical analysis, expert prediction, web
mining and mobile computing. Of late, data mining has been applied successfully to
detect healthcare fraud and abuse cases.
Data analysis proves to be crucial in the medical field. It provides a meaningful base
for critical decisions and helps to create a complete study proposal. One of the most
important uses of data analysis is that it helps keep human bias away from medical
conclusions through proper statistical treatment. Data mining is used for exploratory
analysis because large volumes of data contain nontrivial information.
The health care industry collects huge amounts of data that contain hidden
information useful for making effective decisions. Data mining techniques are applied
to this data to provide appropriate results and to improve the conclusions drawn from
it. Vast medical records are available to researchers, but the medical industry faces
enormous challenges in using this huge volume of data. Machines can transform the
data to obtain valuable and accurate information quickly, which makes machine
learning an important area. Machine learning models are highly useful for discovering
hidden patterns and correlations among features in the dataset
(Kausar, S. Palaniappan, B.B. Samir, A. Abdullah, N. Dey, T. Turner & R. Stocker,
2016, 2013). The heart disease detection system will use data mining knowledge to
give a user-oriented approach to new and hidden patterns in the data. The knowledge
obtained can be used by healthcare experts to provide better quality of service and to
reduce the extent of adverse medicine effects.
The main objective of this study is to predict whether a patient is affected by heart
disease or not, using a machine learning algorithm (Logistic Regression) on a qualified
dataset and finding the correlations between different attributes (Carney, R. M. &
Freedland, K. E., 2010). The system can discover and extract hidden knowledge
associated with diseases from a historical heart disease data set. The heart disease
prediction system aims to use data mining techniques on a medical data set to assist in
the prediction of heart diseases.
1.3.2 Specific Objectives
Provide a new approach to concealed patterns in the data
Help avoid human bias
Reduce the medical cost
The scope of the project is that integrating clinical decision support with
computer-based patient records could reduce medical errors, enhance patient safety,
decrease unwanted practice variation, and improve patient health more effectively.
This suggestion is promising, as data modeling and analysis tools, e.g., data mining,
have the potential to generate a knowledge-rich environment which can help to
significantly improve the quality of clinical decisions.
Clinical decisions are often made based on the doctor's insight and experience rather
than on the knowledge hidden in the dataset. This practice leads to barriers, errors
and excessive medical costs, which affect the quality of service provided to patients.
The proposed system will integrate clinical decision support with computer-based
patient records (data sets). This will reduce medical errors, enhance patient safety,
decrease unwanted practice variation, and improve patient outcomes. This suggestion
is promising, as data modeling and analysis tools, e.g., data mining, have the potential
to generate a knowledge-rich environment which can help to significantly improve the
quality of clinical decisions. There are voluminous records in the medical data domain,
and because of this it has become necessary to use data mining techniques to help in
decision support and prediction in the field of healthcare. Therefore, medical data
mining is useful for diagnosing disease.
CHAPTER 2
THE LITERATURE REVIEW
2.1 Introduction
Data mining is the process of discovering previously unknown patterns and trends in
databases and using that information to build predictive models. Data mining
combines statistical analysis, machine learning and database technology to extract
hidden patterns and relationships from large databases. The World Health Statistics
2012 report highlights the fact that one in three adults worldwide has elevated blood
pressure, a condition that causes around half of all deaths from stroke and heart
disease.
Heart disease, also known as cardiovascular disease (CVD), involves many
conditions that affect the heart, not just heart attacks. Heart disease is the leading
cause of death in various countries, including India. Heart disease kills one person
every 34 seconds in the United States. Coronary heart disease, cardiomyopathy and
cardiovascular disease are some categories of heart disease. The term “cardiovascular
disease” covers a variety of conditions that affect the heart and blood vessels as well
as the way blood is pumped and distributed throughout the body.
Diagnosis is a difficult and important task that needs to be done accurately and
effectively. Diagnosis is usually made based on the doctor's experience and
knowledge. This can lead to unwanted results and excessive costs for the treatment
provided to patients. Therefore, an automated medical diagnostic program can be very
helpful.
(Polaraju, Durga Prasad, & Tech Scholar, 2017) proposed heart disease prediction
using a multiple regression model and showed that multiple linear regression is
appropriate for predicting the risk of heart disease. The work was done using a training
data set of 3000 instances with the 13 different attributes mentioned earlier. The data
set was divided into two parts: 70% of the data was used for training and 30% for
testing (Polaraju, K., Durga Prasad, D. & Tech Scholar, M., 2017).
(Beyene & Kamat, 2018) recommend different algorithms such as Naive Bayes, Decision
Tree, KNN, Logistic Regression, SVM and ANN. Logistic Regression provides
better accuracy compared to the other algorithms. (Beyene & Kamat, 2018) developed a
heart disease prediction system using data mining techniques. WEKA software was
used for automatic diagnosis and provision of services at health facilities. The paper
used various algorithms such as SVM, Naïve Bayes, Association Rule, KNN, ANN and
Decision Tree, and reported that SVM is more efficient and offers more accuracy
compared to the other data mining algorithms. Chala Beyene recommended predicting
and analyzing the occurrence of heart disease using data mining techniques. The
primary goal is to predict the occurrence of heart disease so that the disease can be
diagnosed automatically and quickly. The proposed approach is also important for
health care organizations whose professionals lack the necessary knowledge and
skills. It uses a variety of medical attributes such as blood sugar, heart rate, age and
sex to determine whether a person has heart disease or not. Data set analysis is
performed using the WEKA software (Beyene, C., & Kamat, P., 2018).
(Soni, Ansari, & Sharma, 2011) proposed using a nonlinear classification algorithm to
predict heart disease. They recommend big data tools such as the Hadoop
Distributed File System (HDFS) and MapReduce, together with SVM, for heart disease
prediction with a set of attributes. This work investigated the use of various data
mining methods for predicting heart disease. It suggests using HDFS to store large
data on different nodes and running the SVM-based prediction algorithm on more than
one node at a time. Running SVM in parallel in this way gave a better computation
time than sequential SVM (Soni, J., Ansari, U., & Sharma, D., 2011).
(Science & Faculty, 2009) suggested heart disease prediction using data mining
and machine learning algorithms. The aim of this study was to uncover hidden patterns
through data mining techniques. On UCI data, the J48 algorithm has a much higher
accuracy rate compared to LMT (Science, C., & Faculty, G. M., 2009).
(Purushottam, Saxena, & Sharma, 2016) proposed a system for predicting heart
disease using data mining. This system helps the doctor make effective decisions
based on specific parameters. It provides 86.3% accuracy in the testing phase and
87.3% in the training phase (Purushottam, Saxena, K., & Sharma, R.).
(Sai & Reddy, 2017) proposed predicting heart disease using the ANN algorithm in
data mining. Due to the increasing cost of heart disease diagnosis, there was a
need to develop a new system that can predict heart disease. The predictive model is
used to predict the patient's condition after evaluation based on various parameters
such as heart rate, blood pressure, cholesterol, etc. The accuracy of the system is
demonstrated in Java (Sai, P. P., & Reddy, C.).
(A & Naik, 2016) recommended developing a prediction system that diagnoses heart
disease from a patient's medical data set. Thirteen risk factors were considered as
input attributes in building the system. After analysis of the data from the dataset,
data cleaning and data integration were performed. They used k-means and
naïve Bayes to predict heart disease. The paper builds the system using historical
heart disease data to provide a diagnosis. To extract information from a database, data
mining techniques such as clustering and classification methods can be used. Thirteen
attributes with a total of 300 records from the Cleveland Heart Database were used.
The model predicts whether a patient has heart disease or not based on the values of
the 13 attributes (A, A. S., & Naik, C.).
(Sultana, Haider, & Uddin, 2017) addressed the diagnosis of heart disease. The paper
proposes data mining techniques to predict the disease. It is intended to provide a
survey of current techniques for extracting information from databases, which will be
useful to health professionals. Performance is measured by the time taken to build
the decision tree for the system. The main goal is to predict the disease with a small
number of attributes (Sultana, M., Haider, A., & Uddin, M. S.).
Firda Anindita Latifah et al. proposed a comparative study of two machine learning
models, namely logistic regression and random forest, for the classification of heart
disease. The research was done on the Framingham dataset with 3656 records and a
training-to-testing ratio of 70:30. The model achieved an accuracy of 85.04%
(F.A. Latifah, & I. Slamet).
dataset and the logistic regression achieved accuracy of 82.56% and logistic
regression support vector machine achieved accuracy of 84.85%
(S. Bashir, Z.S. Khan, F. Hassan Khan, A. Anjum, & K. Bashir).
Using Bayesian classifiers, the system will discover hidden knowledge related to
diseases in the historical records of patients with heart disease. Bayesian
classifiers predict class membership probabilities, such as the probability that a
given sample belongs to a particular class. The Bayes classifier is based on
Bayes' theorem. We can use Bayes' theorem to determine the probability that the
proposed diagnosis is true, given the observations. A simple probabilistic classifier,
the naïve Bayes classifier, is used for classification based on Bayes' theorem.
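As a worked illustration of the Bayes step described above, the following minimal Python sketch computes the probability of disease given one observed symptom. The function name and the prior, likelihood and false-positive figures are hypothetical values chosen only for illustration; they are not taken from the project's dataset.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(disease | symptom).

    prior               -- P(disease)
    likelihood          -- P(symptom | disease)
    false_positive_rate -- P(symptom | no disease)
    """
    # Total probability of observing the symptom at all.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers, for illustration only.
p = posterior(prior=0.10, likelihood=0.80, false_positive_rate=0.20)
print(f"P(disease | symptom) = {p:.3f}")
```

Even with a symptom that is present in 80% of diseased patients, the posterior stays well below 80% here because the assumed prior is low, which is why the prior has to be part of the calculation.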
The decision tree literally creates a tree with branches, nodes, and leaves. It lets us
take an unknown data point and move down the tree, applying the attributes of the data
point at each node, until a leaf is reached and the unknown output of the data point can
be determined. To create a good decision tree model, we need an existing data set
with known outputs on which we can build our model. We also divided our data into
two parts: a training set, used for model creation, and a test set, used to ensure that the
model was accurate and not overfitted.
This is the proposed flow chart of the system.
[Figure: flow chart with the boxes Start; Collect Heart Disease Dataset; Extract Significant Variable; Data Processing; Test Performance; Deploy Model; Classifier; Pattern Matching; Prediction; Rule Generation; Accuracy Calculation; Results]
Gradient descent is an algorithm that minimizes many loss functions, such as those of
Support Vector Machine (SVM) and Logistic Regression models, and is often used to
fit linear models. The term “stochastic” refers to a process that involves random
probability. In Stochastic Gradient Descent, for each iteration, a few samples are
selected at random (the term “batch” denotes the number of samples) instead of the
whole set of data, and these batches are used to calculate each iteration.
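The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the project: the learning rate, number of epochs, batch size, function names and the toy data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_logistic(X, y, lr=0.1, epochs=200, batch_size=2, seed=0):
    """Fit logistic regression weights and bias with mini-batch SGD."""
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    indices = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(indices)  # random selection of samples for each batch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * n_features
            grad_b = 0.0
            for i in batch:
                p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
                err = p - y[i]  # gradient of the log loss w.r.t. the logit
                for j in range(n_features):
                    grad_w[j] += err * X[i][j]
                grad_b += err
            # Update from the batch only, not from the whole data set.
            for j in range(n_features):
                w[j] -= lr * grad_w[j] / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Hypothetical toy data: one centered feature, label 1 when it is positive.
X = [[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]]
y = [0, 0, 0, 1, 1, 1]
w, b = sgd_logistic(X, y)
predictions = [1 if sigmoid(w[0] * x[0] + b) > 0.5 else 0 for x in X]
print(w, b, predictions)
```

Setting `batch_size` to the number of samples turns the same loop into ordinary (full-batch) gradient descent, which makes the difference between the two variants easy to compare.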
CHAPTER 3
RESEARCH METHODOLOGY
The Heart Disease Detection System will be implemented and executed using the
plan-driven Iterative Process Model.
The reason for selecting this model is that, instead of beginning with fully
known requirements, we can start by implementing a set of software requirements,
then test, evaluate and plug in further requirements after each iteration. Each
iteration produces a new version of the software. This cycle repeats until the
complete project is ready. It provides the flexibility to modify the requirements
and software design if needed. So, the process model we adopt for developing this
project is the Iterative model. Because this is the only SDLC model we
The UCI cardiovascular disease dataset is first loaded, and then data cleaning and a
check for missing values are performed on all records. The dataset contains complete
information. The dataset is multivariate in its characteristics, with a binary
classification target.
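The missing-value check can be illustrated with a short Python sketch. The records below are hypothetical samples that only borrow UCI-style column names (`cp`, `thalach`, and an assumed `target` column); a missing value is represented by `None`, and the helper names are our own.

```python
# Hypothetical sample records; None marks a missing value.
records = [
    {"age": 63, "sex": 1, "cp": 3, "thalach": 150, "target": 1},
    {"age": 37, "sex": 1, "cp": 2, "thalach": None, "target": 1},
    {"age": 57, "sex": 0, "cp": 0, "thalach": 163, "target": 0},
]


def count_missing(rows):
    """Return {column: number of missing values} over all rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def drop_incomplete(rows):
    """Keep only rows in which every value is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


missing = count_missing(records)
clean = drop_incomplete(records)
print(missing)
print(len(clean), "complete records")
```

On a dataset that is already complete, as reported above, every count is zero and no record is dropped.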
Each patient record is described by 13 features of the dataset. Some, such as sex and
age, identify the patient; the rest consist of medical information. The medical
information comprises the vital attributes for predicting heart disease. Correlation
with the target value was computed for all 13 attributes in order to select the features
with high positive correlation, as shown in table 3.1.
Features Correlation
Exang 0.436757
Cp 0.433798
Oldpeak 0.430696
Thalach 0.421741
Ca 0.391724
Slope 0.344029
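The correlation-based selection described above can be sketched as follows. This is a minimal illustration: the Pearson coefficient is assumed as the correlation measure, the 0.3 threshold and the toy columns are hypothetical, and the function names are our own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold):
    """Return {feature: correlation} for features correlated above threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return {name: r for name, r in scores.items() if r > threshold}


# Hypothetical toy columns; the study itself used the 13 UCI attributes.
columns = {
    "cp": [3, 2, 1, 0, 0, 2],
    "chol": [230, 250, 240, 235, 245, 238],
}
target = [1, 1, 1, 0, 0, 1]
selected = select_features(columns, target, threshold=0.3)
print(selected)
```

In this toy data only `cp` passes the threshold, because `chol` is weakly and negatively correlated with the target.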
No.   Training data   Testing data
1     50%             50%
2     60%             40%
3     70%             30%
4     80%             20%
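The split ratios listed above can be produced with a small helper such as the one below. It is a sketch only: the shuffling seed, the function name and the stand-in list of 100 records are assumptions made for illustration.

```python
import random


def split_dataset(rows, train_fraction, seed=42):
    """Shuffle rows and split them into (train, test) by train_fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 patient records
for train_fraction in (0.5, 0.6, 0.7, 0.8):
    train, test = split_dataset(rows, train_fraction)
    print(f"{train_fraction:.0%} training: {len(train)} train, {len(test)} test")
```

Using a fixed seed keeps the split reproducible, so accuracy differences between ratios are not caused by a different shuffle on each run.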
Features such as gender, chest pain category, and fasting blood sugar level were
visualized. According to this Cleveland dataset, males are more likely than females to
get heart disease. The majority of individuals with cardiovascular disease experience
asymptomatic chest discomfort.
The distribution of heart disease by gender is shown in figure 3.4, which indicates
that males are more likely to get heart disease than females.
Figure 3.4 shows 68% males and 32% females getting heart disease.
3.6 Classification
One of the simplest and best ML classification algorithms is Logistic Regression.
LR is a supervised ML binary classification algorithm widely used in most
applications. It works on a categorical dependent variable; the result can be a discrete
or binary categorical variable, 0 or 1. The sigmoid function is used as the activation
function. The sigmoid function maps a predicted real value to a probabilistic value
between ‘0’ and ‘1’.
$$P(x) = \frac{1}{1 + e^{-x}} \qquad (1)$$
In equation 1, e is Euler’s number, whose value is approximately equal to 2.71828.
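Equation 1 translates directly into code. The sketch below also applies the 0.5 cut-off used in the class-assignment rule later in this section; the function names are our own and the input values are arbitrary examples.

```python
import math


def sigmoid(x):
    """Equation 1: map a real value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_label(x, threshold=0.5):
    """Return 1 when the probability exceeds the threshold, otherwise 0."""
    return 1 if sigmoid(x) > threshold else 0


for value in (-2.0, 0.0, 2.0):
    print(value, round(sigmoid(value), 4), predict_label(value))
```

Negative inputs map to probabilities below 0.5 and positive inputs to probabilities above 0.5, so the sign of the model's raw output decides the predicted class.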
To predict cardiac disease, a logistic regression ML model is used. First, the LR
model is trained under four splitting conditions and tested with the test data to find
the best accuracy and to observe the model's behavior. The algorithm outputs a
category of 1 or 0 for the presence or absence of cardiac disease.
The Logistic Regression model described in Pseudo code 1 is used in both training
and testing on the data instances.
Compute $z_j \leftarrow \frac{y_j - P(1 \mid d_j)}{P(1 \mid d_j)\,(1 - P(1 \mid d_j))}$
Initially set the weight of instance $d_j$ to $w_j = P(1 \mid d_j)\,(1 - P(1 \mid d_j))$
Fit a function f(j) to the data with class values ($z_j$) and weights ($w_j$)
Assign (class label: 1) if $P(1 \mid d_j) > 0.5$, otherwise (class label: 2)
3.7 Summary
Planning the project beforehand aids in its timely completion. The project plan
gives details about the deliverables. The methodology is the important
aspect that describes how we are going to achieve our goals and the manner of
doing so. The iterative model is used in this project due to predefined requirements.
The iterative model is a fast development process and a suitable choice for this
project, since it allows prototypes to be produced so that it becomes easy to identify
faults and refine the final product. The condition for using an iterative model is that
the requirements must be clear in advance. The algorithm that we used is the Logistic
Regression model. The whole software is built on the Logistic Regression model
because its accuracy increases as the training data matures with time.
CHAPTER 4
RESULTS AND DISCUSSION
Logistic regression is tested on the UCI dataset with four different split ratios, and
the resulting accuracies are shown in the table below. Logistic regression obtained an
accuracy of 86.91% for a 50:50 split of training and testing data. The accuracy of the
model on the training data is shown in table 4.1 and figure 4.1.
On the training data, Logistic Regression gives its best accuracy of 86.91% when the
data are split 50:50; it gives 86.17% on 60:40, 86.47% on 70:30 (70% training data and
30% testing data) and 85.24% on 80:20.
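[Figure 4.1: Accuracy Result of Logistic Regression Classifier on Training Data. x-axis: training/testing split (80/20, 70/30, 60/40, 50/50); y-axis: accuracy (%), 84 to 87.5]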
On the testing data, this model gives its highest accuracy of 83.43% when the data are
split 50:50; it gives 81.70% on 60:40, 79.22% on 70:30 (70% training data and 30%
testing data) and 80.48% on 80:20.
So the best accuracy in this model comes when the data are split 50:50 (50% training
data and 50% testing data).
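[Figure 4.2: Accuracy Result of Logistic Regression Classifier on Testing Data. x-axis: training/testing split (80/20, 70/30, 60/40, 50/50); y-axis: accuracy (%), 77 to 84]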
The ROC (Receiver Operating Characteristic) curve shown in figure 4.3 is used
to investigate the model further. The ROC curve visualizes the performance of the
model as the tradeoff between TPR (True Positive Rate) and FPR (False Positive
Rate). Both rates range from 0 to 1, and the area under the curve indicates how well
the ML model distinguishes between the classes. The closer the area is to one, the
more capable the model is of classifying.
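To make the TPR/FPR tradeoff concrete, the sketch below builds ROC points by sweeping the decision threshold and computes the area under the curve with the trapezoidal rule. The labels and predicted probabilities are hypothetical and the function names are our own; they are not results from the project.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print("AUC =", round(auc(curve), 3))
```

A model that ranks every positive case above every negative case reaches an area of 1, while random ranking gives an area of about 0.5.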
4.2 Technologies
For the development, the following are the Software Requirements:
Operating System: Windows or any Linux
Language: Python
Tools: Anaconda and Google Colaboratory; Draw.io and Visio to create and
design the Data Flow and Context Diagrams
CHAPTER 5
RECOMMENDATIONS
The developed application can be used in hospitals for a quick assessment of a
patient's underlying conditions, giving an overview of the patient’s health. This
process can save time and expensive medical tests. The developed application can
also be used at home. People are often careless about their health and feel too lazy
to go for a test. This study will help them get a prediction about their heart health.
Previous researchers worked on different algorithms, which gave results according
to their efficiency.
Our research is based on Logistic Regression, which the literature reviewed in
Chapter 2 reports as giving better accuracy than the other algorithms compared. The
model can be retrained as new data are entered on a daily basis, which should further
increase its efficiency and accuracy.
All available information can be transferred to mobile devices, meaning that when a
person enters these symptoms into a cell phone, the trained model will already be
available and will be able to analyze the symptoms and provide an appropriate
recommendation. Different doctors can be consulted and a complete standalone
system developed. We can also include doctors' contact numbers so that, if the model
shows a high risk, users can consult a doctor. If users show minor symptoms,
medication already prescribed by doctors for such cases will be indicated. This
program will prove beneficial, and the workload for doctors will also be reduced. In the
current era of the corona virus, we need standalone programs that can help people and
ultimately ensure authenticity. So we can build other apps with the help of doctors
and put them to work.
CHAPTER 6
CONCLUSION
Prediction of heart disease is a challenging and very necessary task in the medical
field. The recognition of heart diseases through the processing of raw health care
information will help save human lives in the long term.
This project predicts people with cardiovascular disease by extracting, from a dataset
of patients’ medical histories (chest pain, sugar level, blood pressure, etc.), the medical
history that leads to fatal heart disease. This Heart Disease detection system assists a
patient based on his/her clinical information, including any previous diagnosis of
heart disease (Piller L B, Davis B R, Cutler J A, Cushman W C, Wright J T,
Williamson J D & Haywood L J, 2002).
The mortality rate can be controlled if the disorder is detected at an early stage and
preventative measures are adopted as soon as possible. The system is helpful in the
early detection of abnormalities in the heart.
Out of the 13 features we examined, the top 4 significant features that helped us
classify between a positive and a negative diagnosis were chest pain type (cp),
maximum heart rate achieved (thalach), number of major vessels (ca), and ST
depression induced by exercise relative to rest (oldpeak).
Our machine learning algorithm can now classify patients with heart disease. This
supports the proper diagnosis of patients and helps them get the help they need to
recover. By diagnosing and detecting these features early, we may prevent worse
symptoms from arising later.
Our system yields its highest accuracy of 86.91% on training data and 83.43% on test
data. An accuracy above 70% is generally considered good, but an extremely high
accuracy may be too good to be true and can indicate overfitting. An accuracy of
around 80% is therefore a reasonable target.
Using more training data increases the chances that the model accurately predicts
whether a given person has heart disease or not (Dangare Chaitrali S and
Sulabha S Apte., 2012).
References
World Health Organization and J. Dostupno, cardiovascular diseases: key facts, vol.
room/fact-sheets/detail/cardiovascular-diseases-(cvds).
593.
Springer (2016), pp. 217-231.
Eng., 20 (1) (2013), pp. 1-10.
Polaraju, K., Durga Prasad, D., & Tech Scholar, M. (2017). Prediction of Heart
www.ijedr.org
Beyene, C., & Kamat, P. (2018). Survey on prediction and analysis the occurrence of
heart disease using data mining techniques. International Journal of Pure and
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85041895038&partnerID=40&md5=2f0b0c5191a82bc0c3f0daf67d73bc81.
Soni, J., Ansari, U., & Sharma, D. (2011). Intelligent and Effective Heart Disease
3(6), 2385–2392.
Science, C., & Faculty, G. M. (2009). Heart Disease Prediction Using Machine
Purushottam, Saxena, K., & Sharma, R. (2016). Efficient Heart Disease Prediction
https://doi.org/10.1016/j.procs.2016.05.288
Sai, P. P., & Reddy, C. (2017). International Journal of Computer Science and Mobile
A, A. S., & Naik, C. (2016). Different Data Mining Approaches for Predicting Heart
Sultana, M., Haider, A., & Uddin, M. S. (2017). Analysis of data mining techniques
Dangare Chaitrali S and Sulabha S Apte. "Improved study of heart disease prediction