Predicting Heart Failure Using ML Algorithms

How to build machine learning models for heart failure prediction using R Studio and compare the classifiers used for modelling to find the best one in terms of accuracy.

Uploaded by

Joyalin Mary

End Term - Machine Learning Algorithms – II

A Report submitted in partial fulfilment of the requirements for the degree of 
Master of Business Administration

Submitted by,

JOYALIN MARY MATHEW

REGISTER NUMBER
2028151

Under the Guidance of

Dr Manohar Kapse

School of Business and Management

CHRIST (DEEMED TO BE UNIVERSITY), Bengaluru

JANUARY 2022

Predicting Heart Failure using ML Algorithms and Comparing the
Classifiers in terms of Accuracy

Problem Statement

Heart failure due to cardiovascular diseases (CVDs) is the leading cause of death
globally. Conventional techniques for predicting heart failure are often inaccurate and
demand considerable time, cost and technical expertise. Data mining and ML techniques can
help overcome these limitations of conventional diagnosis. However, it is important to
determine which attributes and which classifier to use when applying ML techniques to
prediction. The problem addressed here is to build machine learning models for heart failure
prediction using R Studio and to compare the classifiers used for modelling to find the best
one in terms of accuracy and reliability.

Objective

The goal of predicting heart failure is to prevent severe episodes of heart disease through
preventive therapy. Predicting heart failure from a minimal number of attributes would
help the healthcare industry save lives. Researchers have already used machine learning
techniques to detect early signs of heart failure, but the classifiers used must be highly
accurate and reliable. This work compares the different classifiers currently available for
heart failure prediction to find the one with the highest accuracy. This will help the
healthcare industry select the best of the existing machine learning algorithms for
cardiovascular disease prediction.

Model Building and Comparative Analysis of Various Classifiers Available for Heart Failure Prediction

This work followed the sequence of steps in the Cross-Industry Standard Process for Data Mining (CRISP-DM):

1. Business understanding

2. Data understanding

3. Data preparation

4. Modelling

5. Evaluation

6. Deployment

Business Understanding

The project aims to predict the likelihood of heart failure for a person by building
machine learning models. This is a binary classification problem, since the output takes
only two values, Y and N, so the model's output is categorical. Based on the model's
prediction, the person can take appropriate action for their health. From a business
perspective, this would enable the healthcare industry to identify the best algorithm for
heart failure prediction among various classifiers in terms of performance parameters.
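
To make the comparison criterion concrete, the following is a minimal sketch of how accuracy could be computed for Y/N predictions and used to pick the best classifier. The report built its models in R Studio; this illustration is written in Python, and the function names, model names and label values are hypothetical, not results from the report.

```python
def accuracy(actual, predicted):
    """Fraction of cases where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def best_classifier(actual, predictions_by_model):
    """Return (name, accuracy) of the model with the highest accuracy."""
    scores = {
        name: accuracy(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up labels for illustration only; these are not results from the report.
actual = ["Y", "N", "N", "Y", "N"]
predictions_by_model = {
    "model_a": ["Y", "N", "Y", "Y", "N"],  # 4 of 5 correct
    "model_b": ["N", "N", "Y", "Y", "N"],  # 3 of 5 correct
}
name, score = best_classifier(actual, predictions_by_model)
print(name, score)
```

Accuracy alone can be misleading when one class is much rarer than the other, so a comparison of this kind is usually read alongside other performance parameters.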

Data understanding

The dataset is taken from an IBM repository for heart failure prediction. It has ten columns and
10800 rows, with both categorical and numerical variables, including the target variable HEARTFAILURE
(listed as Heart_failure below). The columns are considered suitable for predicting heart disease.
Various features associated with heart failure, including modifiable and non-modifiable risk factors,
are used as the independent variables, and the dependent variable is whether the person has had heart
failure. The dataset contains the following columns:

Variable                  Operational definition

Avg_heartbeats_per_min    Average heartbeats per minute of the user
Palpitations_per_day      Number of palpitations the user experienced in one day
Cholesterol               Cholesterol level of the user in mg/dL
BMI                       Body Mass Index of the user
Age                       Age of the user
Sex                       Sex of the user
Family_history            Whether the user has any family history of heart disease
Smoker_last_5years        Whether the user has been a smoker in the last five years
Exercise_mins_per_week    Total time spent on exercise in one week, in minutes
Heart_failure             Whether the person has experienced heart failure or not

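
As an illustration of how records with these ten columns might be read, here is a small Python sketch (the report itself used R Studio). The column names follow the table above; which columns are treated as numeric, the Y/N and F/M codings, and the two sample records are assumptions made for the example and are not taken from the actual dataset.

```python
import csv
import io

# Column names follow the table above; which columns are numeric is an assumption.
NUMERIC_COLUMNS = [
    "Avg_heartbeats_per_min", "Palpitations_per_day", "Cholesterol",
    "BMI", "Age", "Exercise_mins_per_week",
]
CATEGORICAL_COLUMNS = [
    "Sex", "Family_history", "Smoker_last_5years", "Heart_failure",
]


def parse_rows(csv_text):
    """Parse CSV text into dicts, converting the numeric columns to float."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {col: float(raw[col]) for col in NUMERIC_COLUMNS}
        row.update({col: raw[col].strip() for col in CATEGORICAL_COLUMNS})
        rows.append(row)
    return rows


# Two made-up records for illustration; not taken from the actual dataset.
sample = """Avg_heartbeats_per_min,Palpitations_per_day,Cholesterol,BMI,Age,Sex,Family_history,Smoker_last_5years,Exercise_mins_per_week,Heart_failure
93,22,163,25,49,F,N,N,110,N
108,22,181,24,32,M,N,N,192,N
"""
rows = parse_rows(sample)
print(len(rows), rows[0]["Age"], rows[0]["Heart_failure"])
```
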
Data Preparation

For the machine learning models built in R Studio, the data was prepared in the following steps:

1. Checked for missing values
2. Checked for outliers
3. Standardised the data
4. Converted the categorical variables
5. Split the data into training and testing sets in a 7:3 ratio
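
The five preparation steps above can be sketched as follows. This is a hedged Python illustration using only the standard library, not the R code used in the report. The report does not state which outlier rule, standardisation method or categorical encoding was applied, so the 1.5 x IQR rule, z-score scaling and 0/1 encoding below are assumptions, and all function names are hypothetical.

```python
import random
import statistics


def count_missing(rows, columns):
    """Count missing (None or empty-string) values per column."""
    return {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def standardise(values):
    """Rescale values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # a constant column (sd == 0) would need special handling
    return [(v - mean) / sd for v in values]


def encode_binary(values, positive="Y"):
    """Map a two-level categorical column to 1 (positive) and 0 (otherwise)."""
    return [1 if v == positive else 0 for v in values]


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle rows reproducibly and split them into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up values for illustration only.
ages = [30, 40, 50, 60, 70]
scaled_ages = standardise(ages)
train, test = train_test_split(list(range(10)))
print(len(train), len(test))
```

Fixing the random seed keeps the 7:3 split reproducible, which matters when several classifiers are compared on the same training and testing sets.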
