Heart Disease Prediction Using Machine Learning Algorithm
Volume 5 Issue 2, January-February 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
KEYWORDS: Machine Learning, k-Nearest Neighbors classifier, Decision Tree classifier, Random Forest Classifier, Jupyter

Copyright © 2021 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
INTRODUCTION
Heart disease prediction is one of the most notable topics in the machine learning field. The heart pumps blood to all parts of the body. If blood is not pumped to every part of the body, the brain and other organs stop working and the person may die. Heart disease is hard to identify because of several contributing factors, for example diabetes, hypertension, high cholesterol, heart rate and various other factors. According to the World Health Organization, heart-related diseases are responsible for 17.7 million deaths every year, 31% of all deaths worldwide. In India, heart disease has become the leading cause of mortality. Heart disease killed 1.7 million Indians in 2016, according to the 2016 Global Burden of Disease report.

In medical science, heart disease is one of the major challenges, because many parameters and technicalities are involved in predicting it. Machine learning could be a better choice for achieving high accuracy in predicting heart disease as well as other diseases, across diverse data types and conditions. Algorithms such as Naive Bayes, Decision Tree, KNN and Neural Network are used to predict the risk of heart disease, and each has its speciality: Naive Bayes is used for predicting heart disease, Decision Tree is used to give a classified report for the heart disease, and the Neural Network offers opportunities to minimize the error in the prediction of heart disease. All these techniques use old patient records to obtain predictions about new patients. Heart disease prediction helps doctors detect heart disease at an early stage, which can save millions of lives.

This survey paper is dedicated to a review of machine learning techniques in heart disease. Later sections of the paper discuss different machine learning algorithms for heart disease and their comparison on different parameters. It also shows the future outlook of machine learning algorithms in heart disease. This paper gives an in-depth analysis in the field of predicting heart disease.

RELATED WORKS
The heart is one of the main organs of the human body. It performs the vital function of pumping blood, which is as essential as oxygen to the human body, so it always needs protection; this is one of the main reasons researchers work on it, and a number of researchers are doing so. There is always a need for analysis of heart-related matters, whether diagnosis or prediction, or what may be called protection against heart disease. Different fields such as artificial intelligence, machine learning and data mining have contributed to this work. Here, we discuss some of them.

Some researchers have worked on data about the prediction of heart disease. Kaur et al. have worked on this and describe how interesting patterns and knowledge are derived from a large dataset. They perform an accuracy comparison of different machine learning and data mining approaches to find which one is best, and obtain a result in favor of SVM.
@ IJTSRD | Unique Paper ID – IJTSRD38358 | Volume – 5 | Issue – 2 | January-February 2021 Page 183
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
Zhao et al. (2017) developed a framework for heart disease classification using two datasets, one from Shanghai Shuguang Hospital and another from the UCI heart disease dataset. The model uses the Support Vector Machine algorithm along with PCA, CCA and DMPCCA, which are used for feature extraction and fusion. The overall analysis showed that DMPCCA gave the best result.

Ganesan et al. (2019) used IoT technology for prediction and diagnosis of heart disease, taking the UCI dataset and applying the J48 classifier, Logistic Regression (LR), Multilayer Perceptron (MLP) and SVM using Java on the Amazon cloud. In this study J48 gave 91.48%, SVM gave 84.07%, LR gave 83.07% and MLP gave 78.14% accuracy, and the authors concluded that J48 outperforms every other algorithm.

PROJECT SCOPE AND OBJECTIVES
The primary goal of this study is to develop a heart disease prediction system. The system can discover information related to heart disease from a historical heart data set in order to implement a classifier that classifies the disease according to the input of the user and reduces the cost of medical tests. The scope of the project is to apply machine learning algorithms to a larger dataset, which helps to improve the accuracy of results. Using machine learning techniques can give more accurate results than even an experienced doctor. Clinical decisions supported by computer-based patient records could thereby reduce medical errors and improve patient outcomes.
Literature survey
Sl. no | Authors | Year | Description
1 | Palaniappan and Awang | 2008 | The authors proposed a model, the Intelligent Heart Disease Prediction System (IHDPS), using data mining techniques, namely Naive Bayes, Decision Tree, and Neural Network.
2 | Bhatla and Jyoti | 2012 | The authors surveyed data mining techniques for predicting heart disease and found the neural network to be the best.
3 | Chaurasia and Pal | 2013 | The authors proposed three popular data mining algorithms, CART (Classification and Regression Tree), ID3 (Iterative Dichotomiser 3) and Decision Table (DT), extracted from a decision tree, to predict heart disease.
4 | Boshra Brahmi et al. | 2015 | The authors proposed using different classification techniques in heart disease diagnosis, such as J48 Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes (NB) and SMO, to classify the dataset.
5 | K. Vembandasamy et al. | 2015 | The authors proposed the Naive Bayes algorithm as a data mining technique that serves the diagnosis of heart disease patients.
6 | S. Seema et al. | 2016 | The authors proposed an efficient mechanism to predict heart disease by mining data from health records.
7 | K. Gomathi et al. | 2016 | The authors analyzed data mining techniques to predict various kinds of diseases such as heart disease, diabetes and breast cancer.
8 | Ayon Dey et al. | 2016 | The authors analyzed supervised machine learning algorithms to predict heart disease.
Material and Methods
Dataset Used for Research
The dataset consists of records for 303 individuals. There are 14 columns in the dataset, which are described below.

1. Age: displays the age of the individual.
2. Sex: displays the gender of the individual using the following format:
1 = male
0 = female

3. Chest Pain type: shows the type of chest pain experienced by the individual using the following format:
1 = typical angina
2 = atypical angina
3 = non-anginal pain
4 = asymptomatic

4. Resting Blood Pressure: shows the resting blood pressure value of an individual in mmHg (unit)

5. Serum Cholesterol: shows the serum cholesterol in mg/dl (unit)

6. Fasting Blood Sugar: compares the fasting blood sugar value of an individual with 120 mg/dl. If fasting blood sugar > 120 mg/dl then: 1 (true), otherwise: 0 (false)
8. Max heart rate achieved: displays the max heart rate achieved by an individual.
9. Exercise induced angina:
1 = yes
0 = no

10. ST depression induced by exercise relative to rest: displays the value, which is an integer or float.

11. Peak exercise ST segment:
1 = upsloping
2 = flat
3 = downsloping

12. Number of major vessels (0-3) colored by fluoroscopy: displays the value as integer or float.

13. Thal: displays the thalassemia:
3 = normal
6 = fixed defect
7 = reversible defect

14. Diagnosis of heart disease: displays whether the individual is suffering from heart disease or not:
0 = absence
1 = present.

Classification Techniques
In machine learning and statistics, classification is a supervised learning approach in which the computer program learns from the data and then uses this learning to classify new observations. In other words, the training dataset is used to obtain better boundary conditions that can be used to determine each target class; once such boundary conditions are determined, the next task is to predict the target class.

Machine learning is a field of study concerned with algorithms that learn from examples. There are many different types of classification tasks that you may encounter in machine learning, and specialized approaches to modelling that may be used for each.

K-Nearest Neighbor Algorithm (KNN)
K-nearest neighbors is one of the simplest machine learning algorithms and is based on the supervised learning technique. The K-NN algorithm assumes similarity between the new case and the available cases, and puts the new case into the category that is most similar to the available categories. The K-NN algorithm can be used for regression as well as for classification problems. K-NN is a non-parametric algorithm, which means it does not make any assumption about the underlying data.

Pros:
Simple algorithm, and hence easy to interpret the prediction.
Quick calculation time.
Used for both classification and regression.

Random Forest Classifier
Random Forest is one of the most popular and most powerful machine learning algorithms. It is a type of ensemble machine learning algorithm called Bagging, or Bootstrap Aggregation. The bootstrap is a very powerful statistical approach for estimating a quantity, such as a mean, from a data sample. Many samples of the data are taken, the mean of each sample is calculated, and then all the mean values are averaged to give a better estimate of the true mean value. In bagging, the same approach is used, but instead of estimating the mean of each data sample, decision trees are generally used.

Advantage of Random Forest:
Random Forest is an accurate ensemble learning algorithm.
Random Forest runs efficiently on large-scale data sets.
It can handle hundreds of input variables.

Disadvantage of Random Forest:
Features need to have some predictive power, otherwise they won't work.
Predictions of the trees should be uncorrelated.
Appears as a black box.

Decision Tree Classifier
Decision Tree Classifier is a simple and widely used classification technique. It applies a straightforward idea to solve the classification problem. A decision tree classifier poses a series of carefully crafted questions about the attributes of the test record. Decision
Trees (DTs) are a non-parametric supervised learning method used for classification and regression. It is a supervised machine learning method in which the data is continuously split according to a certain parameter.

Decision Tree consists of:
Nodes: Test for the value of a certain attribute.
Edges/Branch: Correspond to the outcome of a test and connect to the next node or leaf.
Leaf nodes: Terminal nodes that predict the outcome (represent class labels or class distribution).

Experiment
The Proposed Method
Heart disease is the leading cause of death among all diseases, ahead of even cancer. The number of people facing heart disease is on the rise every year. This calls for its early diagnosis and treatment. Because of a lack of resources in the medical field, the prediction of heart disease can be a problem. Use of suitable technology can be useful to the medical community and to patients. The problem can be addressed by adopting machine learning techniques. In my project, I work with a basic machine learning classification model. I train the model using data that comprises different attributes such as age, sex, cp, blood pressure, skin thickness and so on, and based on these attributes I predict whether a patient has heart disease or not. This paper uses the Random Forest classifier, KNN (K-Nearest Neighbour classifier) and Decision Tree classifier, three techniques for the effective prediction of heart disease. It analyses the efficiency and accuracy of the three techniques to choose the best of them.

The figure below shows the number of the heart disease cases.
[Figure: number of heart disease cases; 0 = absence, 1 = present]
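As a rough illustration of the bagging (Bootstrap Aggregation) idea behind the Random Forest classifier described earlier, the following minimal Python sketch trains several one-split decision trees ("stumps") on bootstrap samples and combines them by majority vote. It is not the paper's implementation: the rows, the two feature names (age, max heart rate), the label rule and all function names are invented for illustration, and a real Random Forest additionally samples features at random and grows deeper trees.

```python
import random
from collections import Counter

# Synthetic toy rows (age, max heart rate); NOT taken from the paper's dataset.
# The label rule (1 when age > 55) is invented purely for illustration.
ROWS = [(34, 172), (41, 160), (45, 158), (48, 165), (52, 150), (54, 149),
        (57, 171), (59, 140), (62, 155), (64, 132), (67, 160), (70, 125)]
LABELS = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Fit a one-split tree: the (feature, threshold) with the fewest training errors."""
    default = majority(labels)
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            left_cls = majority(left)  # never empty: threshold is an observed value
            right_cls = majority(right) if right else default
            errors = (sum(y != left_cls for y in left)
                      + sum(y != right_cls for y in right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_cls, right_cls)
    return best[1:]


def predict_stump(stump, row):
    """Node test: compare one attribute with the threshold and follow the branch."""
    feature, threshold, left_cls, right_cls = stump
    return left_cls if row[feature] <= threshold else right_cls


def fit_bagging(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_bagging(forest, row):
    """Aggregate the individual tree predictions by majority vote."""
    return majority([predict_stump(stump, row) for stump in forest])


def accuracy(forest, rows, labels):
    """Fraction of rows whose predicted class matches the true class."""
    hits = sum(predict_bagging(forest, row) == y for row, y in zip(rows, labels))
    return hits / len(rows)


forest = fit_bagging(ROWS, LABELS)
print(accuracy(forest, [(38, 168), (66, 150)], [0, 1]))
```

The same `accuracy` helper could be applied to each of the three classifiers on a held-out part of the data to compare them, which is the kind of comparison this paper describes.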
K-Nearest Neighbors Classifier:
K-Nearest Neighbors is a non-parametric method used for classification. It is a lazy learning algorithm in which all computation is deferred until classification. It is also known as an instance-based learning algorithm, where the function is approximated locally. This algorithm is used when the amount of data is large and there are non-linear decision boundaries between classes. KNN predicts a categorical value using the majority vote of the nearest neighbors. Besides classification, KNN can also be used for function approximation problems.
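The majority-vote rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's code: the function name `knn_predict`, the Euclidean distance metric, the choice k = 3 and the two-feature toy points are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical two-feature toy points; not taken from the paper's dataset.
TRAIN_ROWS = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
              (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
TRAIN_LABELS = [0, 0, 0, 1, 1, 1]


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Lazy learning: there is no training step; every distance is
    # computed only when a prediction is requested.
    by_distance = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


print(knn_predict(TRAIN_ROWS, TRAIN_LABELS, (1.1, 1.0)))  # prints 0
print(knn_predict(TRAIN_ROWS, TRAIN_LABELS, (5.0, 4.9)))  # prints 1
```

Because the vote depends on raw distances, attributes measured on very different scales (for example age in years and cholesterol in mg/dl) would normally be rescaled before applying a sketch like this.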