Final Year Project Report
on
“Heart Disease Prediction Using Machine Learning”
A project work submitted to JNTU, Kakinada in partial fulfillment of the requirements for the award of the
Degree of BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE AND ENGINEERING
Submitted By:
Guided By:
Associate Professor
Department of CSE
Submitted to:
Heart-related diseases, or cardiovascular diseases (CVDs), have been the main cause of a large
number of deaths worldwide over the last few decades and have emerged as the most
life-threatening diseases, not only in India but across the whole world. There is therefore a need
for a reliable, accurate, and feasible system to diagnose such diseases in time for proper treatment.
Machine learning algorithms and techniques have been applied to various medical datasets to
automate the analysis of large and complex data. In recent times, many researchers have been
using machine learning techniques to help the healthcare industry and its professionals
diagnose heart-related diseases. The heart is the most important organ in the human body
after the brain. It pumps blood and supplies it to all organs of the body. Predicting the
occurrence of heart disease is therefore significant work in the medical field. Data analytics
is useful for making predictions from large amounts of information, and it helps medical
centers predict various diseases. A huge amount of patient-related data is maintained on a
monthly basis, and this stored data can serve as a source for predicting the occurrence of
future diseases. Data mining and machine learning techniques such as Artificial Neural
Network (ANN), Random Forest, and Support Vector Machine (SVM) are used to predict heart
disease. Prediction and diagnosis of heart disease have become a challenge faced by doctors
and hospitals both in India and abroad. To reduce the large number of deaths from heart
disease, a quick and efficient detection technique is needed. Data mining techniques and
machine learning algorithms play a very important role in this area. Researchers are
accelerating their work to develop software, based on machine learning algorithms, that can
help doctors with both the prediction and the diagnosis of heart disease. The main objective
of this research project is to predict heart disease in a patient using machine learning
algorithms.
Needs and Objectives
Main Objectives:
The system can discover and extract hidden knowledge associated with diseases from a
historical heart disease data set.
The heart disease prediction system aims to apply data mining techniques to a medical data set
to assist in the prediction of heart disease.
Specific Objectives:
• To implement a Naïve Bayes classifier that classifies the disease based on the input
provided by the user.
Hardware requirements:
• Processor – i3
• 2 GB RAM
• Memory – 5 GB
Software requirements:
• Python 3.7.2
• Jupyter Notebook
• NumPy
• Librosa
• Matplotlib
• Seaborn
• SciPy
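As a rough illustration of the specific objective stated above, the following is a minimal sketch of a Gaussian Naïve Bayes classifier written with only the Python standard library. The class name, the three example attributes, and the toy records are assumptions made for illustration; they are not the project's actual data set or implementation, and the Gaussian variant is only one possible choice of Naïve Bayes model.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for numeric features (illustrative only)."""

    def fit(self, rows, labels):
        # Group the training rows by class label.
        by_class = defaultdict(list)
        for row, label in zip(rows, labels):
            by_class[label].append(row)

        self.priors = {}  # label -> prior probability of the class
        self.stats = {}   # label -> list of (mean, variance), one per feature
        total = len(rows)
        for label, members in by_class.items():
            self.priors[label] = len(members) / total
            feature_stats = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                var = sum((x - mean) ** 2 for x in column) / len(column)
                # A small constant keeps the variance strictly positive.
                feature_stats.append((mean, var + 1e-9))
            self.stats[label] = feature_stats
        return self

    def predict(self, row):
        # Choose the class with the highest log-posterior score.
        best_label, best_score = None, -math.inf
        for label, prior in self.priors.items():
            score = math.log(prior)
            for x, (mean, var) in zip(row, self.stats[label]):
                score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy records: [age, resting blood pressure, maximum heart rate].
# These values are invented for illustration and are not patient data.
train_rows = [
    [63, 145, 150], [67, 160, 108], [62, 140, 120], [58, 150, 112],
    [37, 130, 187], [41, 120, 172], [44, 118, 178], [35, 122, 182],
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = disease, 0 = no disease

model = GaussianNaiveBayes().fit(train_rows, train_labels)
prediction_older = model.predict([65, 155, 115])
prediction_younger = model.predict([39, 121, 180])
print(prediction_older, prediction_younger)
```

In this sketch the user's input is simply a list of numeric attribute values passed to `predict`; a real system would also need input validation and a much larger, properly sourced training set.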
Literature Survey
Numerous studies have focused on the diagnosis of heart disease. They have applied different
data mining techniques for diagnosis and achieved different probabilities for different
methods. (Polaraju, Durga Prasad, & Tech Scholar, 2017) proposed Prediction of Heart Disease
using Multiple Regression Model, showing that multiple linear regression is appropriate for
predicting the chance of heart disease. The work was performed using a training data set
consisting of 3000 instances with 13 different attributes. The data set was divided into two
parts: 70% of the data was used for training and 30% for testing. (Deepika & Seema, 2017)
focuses on techniques that can predict chronic disease by mining the data contained in
historical health records using Naïve Bayes, Decision Tree, Support Vector Machine (SVM), and
Artificial Neural Network (ANN). A comparative study was performed on the classifiers to
determine which performs better in terms of accuracy. In this experiment, SVM gave the
highest accuracy rate, whereas for diabetes Naïve Bayes gave the highest accuracy.
(Beyene & Kamat, 2018) recommended different algorithms such as Naïve Bayes, Classification
Tree, KNN, Logistic Regression, SVM, and ANN. Logistic Regression gave better accuracy than
the other algorithms. (Beyene & Kamat, 2018) suggested Heart Disease Prediction System using
Data Mining Techniques. WEKA software was used for automatic diagnosis of disease and to
improve the quality of service in healthcare centers. The paper used various algorithms such
as SVM, Naïve Bayes, Association Rule, KNN, ANN, and Decision Tree. The paper concluded that
SVM is effective and provides higher accuracy than the other data mining algorithms. Chala
Beyene proposed Prediction and Analysis of the Occurrence of Heart Disease Using Data Mining
Techniques. The main objective is to predict the occurrence of heart disease for early
automatic diagnosis, with results available in a short time. The proposed methodology is also
valuable in healthcare organizations whose experts have limited knowledge and skill. It uses
medical attributes such as blood sugar, heart rate, age, and sex to identify whether a person
has heart disease. Analyses of the data set are computed using WEKA software. It is also
proposed to use big data tools such as Hadoop Distributed File System (HDFS) and MapReduce
along with SVM for prediction of heart disease with an optimized attribute set.
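The 70%/30% split and the accuracy comparison mentioned in the survey above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited authors' procedure: the function names `train_test_split` and `accuracy`, the fixed random seed, and the placeholder records are all hypothetical.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Placeholder records standing in for 3000 instances; a real row would
# hold the 13 attributes plus the class label.
records = list(range(3000))
train, test = train_test_split(records)
print(len(train), len(test))

# Toy labels (invented) showing how two classifiers could be compared.
example_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(example_accuracy)
```

Fixing the seed makes the split reproducible, so different classifiers can be compared on the same test records.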
Time Chart
Month 0-Month 1
• Data Collection
• Data Preprocessing
Month 1-Month 2
Month 2-Month 3
• Model Selection
Month 3-Month 4
• Model Training
Month 4-Month 5
Month 5-Month 6
• Model Deployment
Conclusion