
Project Report On Data Science of Heart Disease Prediction

The document outlines a project titled 'Heart Disease Data Science Project' conducted by Ilias Ahmed and Mohammad Mosharaf Hossain under the supervision of Md. Tofazzal Hosen at The People’s University of Bangladesh. The project aims to predict heart disease using machine learning techniques on a dataset that includes various patient attributes, with a focus on developing accurate predictive models for early detection and treatment. Key features include data preprocessing, exploratory data analysis, model development and evaluation, with the Random Forest model achieving perfect accuracy in predictions.

Uploaded by

ilias ahmed

Report on Heart Disease Prediction of Data Science

Python Project
Supervision of Project

Md. Tofazzal Hosen

Lecturer
Department of Computer Science and Engineering
Faculty of Applied Science
[email protected]
+88 01877984002

Submitted by

ILIAS AHMED
ID: 0122320005103288
Batch: 40th (Evening)
Department of Computer Science and Engineering
Email: [email protected]

Mohammad Mosharaf Hossain
ID: 0122320005103290
Batch: 40th (Evening)
Department of Computer Science and Engineering
Email: mosharraf [email protected]

THE PEOPLE’S UNIVERSITY OF BANGLADESH
3/2, Block-A, Asad Avenue, Mohammadpur, Dhaka-1207.

Project reference link:

https://surl.li/rjogwr
Project dataset source:
Kaggle / UCI Machine Learning Repository
Approval
This project, titled “Heart Disease Data Science Project” and submitted by Ilias
Ahmed and Mohammad Mosharaf Hossain to the Department of Computer Science and
Engineering, The People’s University of Bangladesh, has been accepted as satisfactory
in partial fulfillment of the requirements for the degree of “MSc in Computer
Science and Engineering” and has been approved.

Supervisor

Name Md. Tofazzal Hosen

Designation Lecturer

Department Department of Computer Science and Engineering

Faculty Faculty of Applied Science

E-mail [email protected]

Cell-Phone +88 01877984002

Chairman:

Name Fahmida Islam

Designation Assistant Professor & Chairman

Department Department of Computer Science and Engineering

Faculty Faculty of Applied Science

E-mail [email protected]

Cell-Phone +880 1815311311


DECLARATION
I hereby declare that I have done this project under the supervision of Md.
Tofazzal Hosen, Lecturer, Department of CSE, The People’s University of
Bangladesh. I also declare that neither this project nor any part of it has been
submitted elsewhere for the award of any degree or qualification.
Supervision of Project:
--------------------------------------------------------------------------------------------------
Md. Tofazzal Hosen

Lecturer
Department of Computer Science and Engineering

Faculty of Applied Science


[email protected]
+88 01877984002

Submitted by:

ILIAS AHMED
ID: 0122320005103288
Batch: 40th (Evening)
Department of Computer Science and Engineering

Email:[email protected]

Mohammad Mosharaf Hossain


ID: 0122320005103290
Batch: 40th (Evening)
Department of Computer Science and Engineering

Email: mosharraf [email protected]


Certification
This project, “Heart Disease Data Science Project,” is an original work developed for my M.Sc.
final project. It is a unique academic project submitted in partial fulfillment of the
requirements for the degree of “Master of Science in Computer Science and Engineering”
from The People’s University of Bangladesh. The resources used in this project are not copied
from any other source. This project was completed in September 2024 under the supervision
of Md. Tofazzal Hosen, Lecturer, Department of Computer Science and Engineering, The
People’s University of Bangladesh.
Abstract
The “Heart Disease Data Science Project” is an in-depth analytical endeavor
focused on predicting heart disease using advanced data science methodologies.
The project utilizes a dataset comprising various patient attributes such as age,
gender, chest pain type, and several medical indicators like cholesterol levels
and blood pressure. By analyzing these features, the project aims to develop
predictive models that can accurately forecast the presence of heart disease,
thereby aiding clinicians in early detection and treatment.
The main goal is to harness machine learning techniques to enhance the
accuracy and reliability of predictions, ultimately contributing to improved
patient care and resource allocation in healthcare settings.
Key Features
Data Preprocessing:
• Handling missing data and outliers to ensure data integrity.
• Normalizing and scaling features to improve model performance.
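The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the tiny data frame is made up, the column names (`age`, `chol`, `trestbps`) are assumed from the common UCI naming, and median imputation plus IQR clipping are only one reasonable choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up sample rows; column names are assumptions, not taken from the report.
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 57],
    "chol": [233.0, 250.0, np.nan, 236.0, 354.0, 900.0],
    "trestbps": [145, 130, 130, 120, 120, 140],
})

# 1) Missing data: fill the gap in "chol" with the column median.
df["chol"] = df["chol"].fillna(df["chol"].median())

# 2) Outliers: clip "chol" to the usual 1.5 * IQR fences.
q1, q3 = df["chol"].quantile([0.25, 0.75])
iqr = q3 - q1
df["chol"] = df["chol"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

# 3) Scaling: standardize every column to zero mean and unit variance.
#    In a real pipeline the scaler would be fitted on the training split only.
scaled = pd.DataFrame(StandardScaler().fit_transform(df), columns=df.columns)
print(scaled.round(2))
```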
Exploratory Data Analysis (EDA):
• Using visualizations such as histograms and scatter plots to uncover patterns and relationships among variables.
• Identifying correlations to understand feature interactions and their impact on heart disease.
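As a rough sketch of the correlation step, the snippet below builds a synthetic data frame (the values and the relationship between columns are invented for illustration) and ranks features by the absolute strength of their correlation with an assumed `target` column.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; the real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(30, 78, size=n)
chol = 150 + 1.5 * age + rng.normal(0, 30, size=n)
target = (age + rng.normal(0, 10, size=n) > 55).astype(int)
df = pd.DataFrame({"age": age, "chol": chol, "target": target})

# Pearson correlation matrix of all numeric columns.
corr = df.corr()

# Features ranked by absolute correlation with the target.
target_corr = corr["target"].drop("target").sort_values(key=abs, ascending=False)
print(target_corr)
```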
Model Development:
• Implementing various machine learning algorithms including Logistic Regression, Decision Trees, Random Forest, and Support Vector Machines.
• Fine-tuning models through hyperparameter optimization to enhance prediction accuracy.
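One way to set up the four model families with a small hyperparameter search is sketched below. The data come from scikit-learn's `make_classification`, and the parameter grids are illustrative assumptions; the report does not state which grids or search method were actually used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the patient table (13 features is an assumption).
X, y = make_classification(n_samples=300, n_features=13, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Each entry: (estimator, illustrative parameter grid).
candidates = {
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, None]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [50, 100], "max_depth": [5, None]},
    ),
    "svm": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold grid search on the training split, then score on the held-out split.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = (search.best_params_, search.score(X_test, y_test))
    print(name, results[name])
```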
Model Evaluation:
• Utilizing metrics such as accuracy, precision, recall, F1-score, and ROC-AUC for comprehensive model assessment.
• Performing cross-validation to ensure model robustness and generalizability.
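The listed metrics and the cross-validation step map directly onto scikit-learn functions. The sketch below uses synthetic data and a Logistic Regression model purely as an example; the split ratio and fold count are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data for illustration only.
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)               # hard class labels
y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1, for ROC-AUC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
print(metrics)

# 5-fold cross-validated accuracy on the full synthetic set.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                            cv=5, scoring="accuracy")
print(cv_scores.mean(), cv_scores.std())
```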

Feature Importance:
• Analyzing which features have the most significant impact on heart disease predictions.
• Using techniques like feature importance scores and SHAP values to interpret model decisions.
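A minimal sketch of feature importance scoring follows. It shows the Random Forest's built-in impurity-based scores and, in place of SHAP values (which need the separate `shap` package), scikit-learn's permutation importance. The data and the `feature_i` names are synthetic placeholders.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data; feature names are placeholders, not the report's columns.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           n_redundant=0, random_state=1)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X_train, y_train)

# Impurity-based importances (sum to 1 across features).
impurity = pd.Series(model.feature_importances_, index=feature_names)

# Permutation importance: mean drop in test accuracy when one feature is shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=1)
permutation = pd.Series(perm.importances_mean, index=feature_names)

print(impurity.sort_values(ascending=False))
print(permutation.sort_values(ascending=False))
```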
Deployment and Integration:
• Developing a user-friendly interface for healthcare professionals to input patient data and receive instant predictions.
• Ensuring the model is scalable and can be integrated into existing healthcare systems for real-time analysis.
This project leverages Python and data science libraries such as Pandas, NumPy,
Scikit-learn, and Matplotlib, providing a comprehensive framework for
predictive analytics in healthcare. By offering data-driven insights, the project
aims to support healthcare professionals in making informed decisions,
ultimately improving patient outcomes and reducing the prevalence of heart
disease.

Key points list of the project:


Importing libraries 1.0
Loading the dataset 1.1
View the first five rows of the dataset 1.2
Information about the dataset 1.3
Describe the dataset 1.4
Display basic info about the dataset 2.0
Check for missing values 2.1
Display final data after cleaning 2.3
Drop duplicate rows 3.0
Display final data after cleaning 3.1
Commenting 4.0
Check for missing values and summary statistics 4.1
## 5.0
importing libraries 5.1
Set up the figure for visualizations 5.2
Plot 1: Age distribution 5.3
Plot 2: Cholesterol distribution 5.4
Plot 2: Cholesterol distribution 5.5
# Plot 4: Heart disease target count (0 or 1) 6.0
# Plot 4: Heart disease target count (0 or 1) 6.1
## 6.2
Compute the correlation matrix 7.0
import plotly.express as px 7.1
import plotly.express as px 7.2
import plotly.express as px 8.0
import plotly.express as px 8.1
import plotly.figure_factory as ff 9.0
Bar Chart: Count of heart disease vs. no heart disease 10.1
import plotly.express as px 11.0
full code design by ilias ahmed 11.1
Project scenarios:
Importing libraries: 1.0
See first five rows:
Info of the dataset:
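The first steps listed above (importing libraries, viewing the first rows, inspecting the dataset, checking missing values and dropping duplicates) might look like the sketch below. To stay self-contained it reads a few made-up rows from an in-memory string; in the project the data would come from the Kaggle/UCI CSV file, whose file name and exact columns are not given in the report, so the column names here are assumptions.

```python
import io

import pandas as pd

# Made-up rows standing in for the real CSV; column names are assumed.
csv_text = """age,sex,cp,trestbps,chol,target
63,1,3,145,233,1
37,1,2,130,250,1
41,0,1,130,204,1
56,1,1,120,236,0
56,1,1,120,236,0
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())       # first five rows
df.info()              # column types and non-null counts
print(df.describe())   # summary statistics

missing = df.isnull().sum()            # missing values per column
n_dupes = int(df.duplicated().sum())   # number of fully duplicated rows
clean = df.drop_duplicates().reset_index(drop=True)
print(missing, n_dupes, clean.shape)
```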
Model Summary and Performance Comparison
Model Performance
1. Logistic Regression
• Accuracy: 0.81
2. Random Forest
• Accuracy: 1.00
Best Model: Random Forest
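The Random Forest model outperformed the Logistic Regression model, reaching an
accuracy of 1.00 against 0.81. The classification report below covers 205 samples
(100 in class 0 and 105 in class 1).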
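A perfect score on held-out data is worth a sanity check. The key points list a duplicate-dropping step, but the report does not show whether it ran before the train/test split, so one check that could be run is the share of test rows that also appear verbatim in the training split. The sketch below uses a synthetic frame in which every row is repeated four times; `leaked_fraction` is a hypothetical helper, not part of the project code.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def leaked_fraction(train: pd.DataFrame, test: pd.DataFrame) -> float:
    """Share of test rows that also occur, value for value, in train."""
    merged = test.merge(train.drop_duplicates(), how="left", indicator=True)
    return float((merged["_merge"] == "both").mean())


# Synthetic frame: 50 distinct rows, each repeated four times.
rng = np.random.default_rng(0)
base = pd.DataFrame(rng.integers(0, 100, size=(50, 4)),
                    columns=["a", "b", "c", "target"])
df = pd.concat([base] * 4, ignore_index=True)

# Splitting before de-duplication lets copies land on both sides of the split.
train, test = train_test_split(df, test_size=0.2, random_state=0)
leak_raw = leaked_fraction(train, test)

# Dropping duplicates first removes that overlap.
clean = df.drop_duplicates()
train_c, test_c = train_test_split(clean, test_size=0.2, random_state=0)
leak_clean = leaked_fraction(train_c, test_c)

print(leak_raw, leak_clean)
```

When the leaked share is high, a tree ensemble can effectively memorize the test rows, so the accuracy is best re-measured on a de-duplicated split.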
Metrics
• Accuracy Score: 1.0
Classification Report
Class    Precision    Recall    F1-Score    Support
0        1.00         1.00      1.00        100
1        1.00         1.00      1.00        105

• Overall Accuracy: 1.00
• Macro Average: 1.00
• Weighted Average: 1.00
Confusion Matrix
[[100   0]
 [  0 105]]
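The figures in the classification report follow directly from this confusion matrix. The short sketch below recomputes them with NumPy, assuming the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix as reported; rows = true class, columns = predicted class
# (scikit-learn convention, assumed here).
cm = np.array([[100, 0],
               [0, 105]])

support = cm.sum(axis=1)                   # samples per true class
accuracy = np.trace(cm) / cm.sum()         # correct predictions / all predictions
precision = np.diag(cm) / cm.sum(axis=0)   # per class: TP / (TP + FP)
recall = np.diag(cm) / cm.sum(axis=1)      # per class: TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(support)     # [100 105]
print(accuracy)    # 1.0
print(precision, recall, f1)
```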
Prediction for New Data
Based on the Random Forest model, the prediction for the new data is No Heart
Disease.
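A prediction for a single new record could be produced along the following lines. Everything in this sketch is illustrative: the training data are synthetic, the labelling rule is arbitrary and not medical, the five feature names are assumed, `predict_label` is a hypothetical helper, and the "Heart Disease" label for class 1 is an assumption (the report only shows the "No Heart Disease" output).

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

LABELS = {0: "No Heart Disease", 1: "Heart Disease"}  # class-1 wording assumed
FEATURES = ["age", "sex", "cp", "trestbps", "chol"]    # assumed column names

# Synthetic training data with plausible ranges.
rng = np.random.default_rng(7)
n = 200
X = pd.DataFrame({
    "age": rng.integers(29, 78, n),
    "sex": rng.integers(0, 2, n),
    "cp": rng.integers(0, 4, n),
    "trestbps": rng.integers(94, 201, n),
    "chol": rng.integers(126, 565, n),
})
# Arbitrary labelling rule, used only so the model has something to learn.
y = ((X["age"] > 55) & (X["chol"] > 240)).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=7).fit(X, y)


def predict_label(model, patient: dict) -> str:
    """Turn one patient record (a dict keyed by feature name) into a text label."""
    row = pd.DataFrame([patient], columns=FEATURES)
    return LABELS[int(model.predict(row)[0])]


new_patient = {"age": 45, "sex": 1, "cp": 2, "trestbps": 130, "chol": 200}
print(predict_label(model, new_patient))
```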
