ALY6015 Final Project Report
ARVIND PAWAR
AZAR NAJAFLI
DHRUVIN KOTHARI
NORTHEASTERN UNIVERSITY
DATE: 03/28/2019
INTRODUCTION
Cardiology is one of the most important and yet most difficult fields of health care. Heart
diseases can prove lethal if not detected in early stages. Every year around 610,000 people die from
heart disease in the United States, which is 1 in every 4 deaths, according to data provided by the
CDC (CDC, 2017). The same report states that heart disease is the leading cause of death for
both men and women, although more than half of the heart-disease deaths observed in 2009 were in men.
Every year more than 700,000 Americans have a heart attack; for more than 500,000 of them it is the
first attack, and more than 200,000 have already had an attack before. Because of these
statistics, research is carried out every year around the world by many people, including doctors
and data analysts/scientists. The role of data science in this research is to help doctors detect
the disease early by building predictive models from patient data.
Given this background, we chose this topic for our final project. Our main objective is to
construct a predictive model that helps determine the disease before serious anomalies appear in a
patient's health.
Our team used the dataset heart.csv obtained from Kaggle.com, an open website hosting
various real-life datasets. Our dataset contains 14 variables (columns) and 303
observations (rows). We defined the variable called "target" as the dependent variable, which
means our model will predict it using the other (independent) variables. Because "target" is
binary (1 if the patient has heart disease, 0 if the patient does not), logistic regression is an
appropriate model.
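For reference (general background, not specific to this dataset), logistic regression models the probability of the positive class through the log-odds of the response:
log(p / (1 - p)) = b0 + b1*x1 + ... + bk*xk, which gives p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk))),
where p is the probability that target = 1 and x1, ..., xk are the independent variables.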
RStudio is used for all our analysis and visualizations.
We installed all the required packages.
install.packages("DataExplorer") library(ggplot2
install.packages("Hmisc")
install.packages("data.table")
install.packages("caret")
install.packages("extrafont")
install.packages("ggthemes")
install.packages("caret",
repos = "http://cran.r-project.org",
dependencies = c("Depends", "Imports", "Suggests"))
> library(caret)
> library(data.table)
> library(Hmisc)
> library(DataExplorer)
> library(ggplot2)
> library(carData)
> library(car)
> library(dplyr)
> library(lattice)
> library(tidyr)
> library(caret)
> library(MASS)
> library(broom)
> library(ROCR)
> library(corrplot)
> setwd("C:/Users/Arvind/Desktop/Projects/Heart disease/")
> heart_data= read.csv("heart.csv")
> head(heart_data)
ï..age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
1 63 1 3 145 233 1 0 150 0 2.3 0 0 1 1
2 37 1 2 130 250 0 1 187 0 3.5 0 0 2 1
3 41 0 1 130 204 0 0 172 0 1.4 2 0 2 1
4 56 1 1 120 236 0 1 178 0 0.8 2 0 2 1
5 57 0 0 120 354 0 1 163 1 0.6 2 0 2 1
6 57 1 0 140 192 0 1 148 0 0.4 1 0 1 1
> names(heart_data)
[1] "ï..age" "sex" "cp" "trestbps" "chol" "fbs" "restecg" "thalach" "exang"
[10] "oldpeak" "slope" "ca" "thal" "target"
> colnames(heart_data)[colnames(heart_data)=="ï..age"]<- "age"
This renames the age column, whose original header ("ï..age") was read in with an encoding artifact.
Some of the variables in our data are categorical codes; for example, chest pain has types
0, 1, 2, 3, the resting ECG has levels 0, 1, 2, and so on. To measure the effect of each category and obtain
better results, we decided to convert these variables to factors.
> factor_heart_data<- copy(heart_data)
> colnames(factor_heart_data)[colnames(factor_heart_data)=="ï..age"]<- "age"
> head(factor_heart_data)
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
1 63 1 3 145 233 1 0 150 0 2.3 0 0 1 1
2 37 1 2 130 250 0 1 187 0 3.5 0 0 2 1
3 41 0 1 130 204 0 0 172 0 1.4 2 0 2 1
4 56 1 1 120 236 0 1 178 0 0.8 2 0 2 1
5 57 0 0 120 354 0 1 163 1 0.6 2 0 2 1
6 57 1 0 140 192 0 1 148 0 0.4 1 0 1 1
> col_names <-
c("age","sex","chest_pain","rest_bp","chol","fasting_bloodsugar","rest_ecg","max_heartrate",
+ "exercise_angina","ST_depression","slope","n_major_vasel","thal","target")
> names(factor_heart_data)<- col_names
> names(factor_heart_data)
[1] "age" "sex" "chest_pain" "rest_bp" "chol"
[6] "fasting_bloodsugar" "rest_ecg" "max_heartrate" "exercise_angina" "ST_depression"
[11] "slope" "n_major_vasel" "thal" "target"
> factor_heart_data$sex <- as.character(heart_data$sex)
> factor_heart_data$sex <- ifelse(heart_data$sex=="0", 'female', 'male')
> factor_heart_data$chest_pain<-as.factor(heart_data$cp)
> factor_heart_data$fasting_bloodsugar[heart_data$fbs == 1] = "Diabetic"
> factor_heart_data$fasting_bloodsugar[heart_data$fbs == 0] = "Normal"
> factor_heart_data$rest_ecg[heart_data$restecg == 0] = "Normal"
> factor_heart_data$rest_ecg[heart_data$restecg == 1] = "Abnormality"
> factor_heart_data$rest_ecg[heart_data$restecg == 2] = "Probable or definite"
> factor_heart_data$exercise_angina[heart_data$exang == 1] = "yes"
> factor_heart_data$exercise_angina[heart_data$exang == 0] = "no"
> factor_heart_data$slope=as.factor(heart_data$slope)
> factor_heart_data$thal=as.factor(heart_data$thal)
> factor_heart_data$target=as.factor(heart_data$target)
> factor_heart_data$sex=as.factor(heart_data$sex)
> factor_heart_data$fasting_bloodsugar=as.factor(heart_data$fbs)
> factor_heart_data$exercise_angina=as.factor(heart_data$exang)
> #View(factor_heart_data)
> head(factor_heart_data)
age sex chest_pain rest_bp chol fasting_bloodsugar rest_ecg max_heartrate exercise_angina ST_depression slope
1 63 1 3 145 233 1 0 150 0 2.3 0
2 37 1 2 130 250 0 1 187 0 3.5 0
3 41 0 1 130 204 0 0 172 0 1.4 2
4 56 1 1 120 236 0 1 178 0 0.8 2
5 57 0 0 120 354 0 1 163 1 0.6 2
6 57 1 0 140 192 0 1 148 0 0.4 1
n_major_vasel thal target
1 0 1 1
2 0 2 1
3 0 2 1
4 0 2 1
5 0 2 1
6 0 1 1
Our response variable is "target", which is binary: 0 means the person does not have heart
disease and 1 means the person has heart disease.
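As a quick check of the class balance (a minimal sketch added for illustration, not part of the original output), the counts of each class can be tabulated:
# Sketch: class balance of the response variable (0 = no disease, 1 = disease)
table(factor_heart_data$target)
prop.table(table(factor_heart_data$target))   # proportion of each class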
> plot_histogram(factor_heart_data)
> plot_density(factor_heart_data)
> plot_correlation(heart_data)
From the above plots we can say that age, cholesterol, maximum heart rate and resting blood pressure
are approximately normally distributed.
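These visual impressions could optionally be backed by a quick numeric check (a sketch we add here for illustration, not part of the original analysis), for example with the Shapiro-Wilk test:
# Sketch: Shapiro-Wilk normality test for the roughly bell-shaped variables
# (small p-values would indicate departures from normality)
sapply(heart_data[, c("age", "trestbps", "chol", "thalach")],
       function(x) shapiro.test(x)$p.value)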
We then plotted the correlation matrix to see the relationship between "target" and all
other variables visually. As we had a relatively large number of independent variables, we decided
to focus on four of them (cholesterol, heart rate, chest pain, and the resting ECG test, which is the test
doctors advise when a patient shows those three symptoms), because doctors pay the most attention to
these when they try to diagnose the illness. Below you can see the correlation matrix:
Surprisingly, the matrix shows that, although doctors focus on those four variables,
cholesterol, heart rate, and the resting ECG test do not have a strong correlation with the
dependent variable.
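The same pattern can be checked numerically; a minimal sketch using the original column names in heart_data (cp = chest pain, thalach = maximum heart rate, restecg = resting ECG):
# Sketch: correlation of the four doctor-focused variables with the target
round(cor(heart_data[, c("chol", "thalach", "cp", "restecg", "target")])["target", ], 2)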
VISUALIZATION
Age Distribution:
> ggplot(factor_heart_data,aes(age, fill=target)) +
+ geom_histogram(aes(y=..density..),breaks=seq(0, 80, by=1), color="grey17") +
+ geom_density(alpha=.1, fill="black")+
+ facet_wrap(~target, ncol=1,scale="fixed") +
+ scale_fill_manual(values=c("blue","red"))+
+ xlab("Age") +
+ ylab("Density / Count") +
+ ggtitle("Age Distribution")
Output:
Conclusion:
From the above graph we can conclude that heart disease is spread fairly uniformly across age.
Gender count for heart disease:
colnames(heart_data)[colnames(heart_data)=="ï..age"]<-"Age"
head(heart_data)
ggplot(factor_heart_data,aes(target, fill=target)) +
geom_bar(stat = "count") + facet_wrap(sex~.) + scale_fill_manual(values=c("Blue","red")) +
ggtitle("Gender count for Heart Disease")
Output:
Conclusion:
From the above graph we can conclude that a higher proportion of the female patients in the dataset
have heart disease compared with the male patients.
To construct a predictive model, we focused our analysis on the following parameters.
Chest Pain (Diabetic Patient): as the chest pain type increases, there is a higher probability
that the patient has a heart problem.
> ggplot(factor_heart_data,aes(target,fill=target)) +
+ geom_bar(stat = "count") + facet_wrap(chest_pain~.) + ggtitle("Count of Heart Patients having
different types of chest Pains") + theme_bw() +
+ scale_fill_manual(values=c("Blue","red"))+
+ xlab("Target")
Output:
Conclusion: More heart-disease patients seem to have cholesterol levels between 200 and 250 mg/dl.
Heart Rate Monitoring (Blood Pressure): patients with high blood pressure are often considered
to have a higher probability of heart problems.
> ggplot(factor_heart_data,aes(max_heartrate, fill=target)) +
+ geom_histogram(aes(y=..density..),breaks=seq(70, 205, by=10), color="grey17") +
+ geom_density(alpha=.1, fill="black")+
+ facet_wrap(~target, ncol=1,scale="fixed") +
+ theme_economist() +
+ scale_fill_manual(values=c("blue","red"))+
+ xlab("Maximum Heart Rate Achieved") +
+ ylab("Density / Count") +
+ ggtitle("Max Heart Rate Histogram")
Output:
Conclusion: Heart-disease patients have a higher maximum heart rate than healthy patients.
Rest ECG Tests: ECG tests are carried out to monitor heart conditions.
> ggplot(factor_heart_data,aes(target,fill=target)) +
+ geom_bar(stat = "count") + facet_wrap(rest_ecg~.)+
+ ggtitle("Patients with different types of ECG test")+
+ scale_fill_manual(values=c("Blue","red"))
Output:
Conclusion: From the above graph we can observe that patients with rest ECG type 1 have a higher
probability of having heart disease.
PREDICTION
We are examining the relationship between one binary response variable and a mix of
categorical and continuous independent variables, so we chose logistic regression for prediction.
We divided the dataset into training and testing sets of 75% and 25% respectively.
> set.seed(12345)
> train <- floor(0.75*nrow(factor_heart_data))
> train_ind <-sample(seq_len(nrow(factor_heart_data)),size = train)
> trainset <- factor_heart_data[train_ind, ]
> testset <- factor_heart_data[-train_ind, ]
> dim(trainset)
[1] 227 14
> dim(testset)
[1] 76 14
Above are the dimensions of our training and testing dataset.
First, we applied the logistic regression model on all the independent variables using glm()
function.
> logit<-glm(target~., data=trainset, family = binomial)
> summary(logit)
Call:
glm(formula = target ~ ., family = binomial, data = trainset)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.6911 -0.3425 0.1598 0.6103 2.6969
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.258e+01 1.455e+03 0.009 0.993106
age 6.877e-03 2.703e-02 0.254 0.799172
sex1 -1.677e+00 5.846e-01 -2.868 0.004133 **
chest_pain1 8.921e-01 5.946e-01 1.500 0.133532
chest_pain2 2.034e+00 5.486e-01 3.708 0.000209 ***
chest_pain3 1.947e+00 7.560e-01 2.575 0.010016 *
rest_bp -1.469e-02 1.315e-02 -1.117 0.263945
chol -5.670e-03 4.236e-03 -1.339 0.180657
fasting_bloodsugar1 -2.108e-01 6.912e-01 -0.305 0.760370
rest_ecg1 4.284e-01 4.259e-01 1.006 0.314494
rest_ecg2 -4.029e-01 2.361e+00 -0.171 0.864495
max_heartrate 2.541e-02 1.220e-02 2.083 0.037226 *
exercise_angina1 -6.633e-01 4.795e-01 -1.383 0.166528
ST_depression -4.378e-01 2.672e-01 -1.638 0.101338
slope1 -1.322e-01 8.605e-01 -0.154 0.877894
slope2 7.741e-01 9.387e-01 0.825 0.409615
n_major_vasel -8.687e-01 2.348e-01 -3.700 0.000216 ***
thal1 -1.124e+01 1.455e+03 -0.008 0.993840
thal2 -1.176e+01 1.455e+03 -0.008 0.993551
thal3 -1.302e+01 1.455e+03 -0.009 0.992861
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 312.74 on 226 degrees of freedom
Residual deviance: 160.10 on 207 degrees of freedom
AIC: 200.1
Number of Fisher Scoring iterations: 14
From the above output, we observed that sex, chest pain, the number of major vessels observed and
the maximum heart rate have an effect on heart disease.
Then, we decided to create another dataset that has significant variables only.
> cor_data<-trainset[,c(2,3,9,10,12,14)]
> summary(cor_data)
sex chest_pain exercise_angina ST_depression n_major_vasel target
0: 96 0:143 0:204 Min. :0.00 Min. :0.0000 0:138
1:207 1: 50 1: 99 1st Qu.:0.00 1st Qu.:0.0000 1:165
2: 87 Median :0.80 Median :0.0000
3: 23 Mean :1.04 Mean :0.7294
3rd Qu.:1.60 3rd Qu.:1.0000
Max. :6.20 Max. :4.0000
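For reference, the numeric indices c(2, 3, 9, 10, 12, 14) correspond to sex, chest_pain, exercise_angina, ST_depression, n_major_vasel and target; an equivalent, more readable selection by name would be (a sketch):
# Sketch: select the retained variables by name instead of by column index
keep_cols <- c("sex", "chest_pain", "exercise_angina",
               "ST_depression", "n_major_vasel", "target")
cor_data  <- trainset[, keep_cols]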
However, before applying the model, let's check the variance inflation factor (VIF) to see whether
there is multicollinearity.
> #variance inflation factor
> vif(glm(target ~ ., data=cor_data, family="binomial"))
GVIF Df GVIF^(1/(2*Df))
sex 1.056310 1 1.027770
chest_pain 1.311732 3 1.046263
exercise_angina 1.093148 1 1.045537
ST_depression 1.152195 1 1.073403
n_major_vasel 1.045553 1 1.022523
The VIF values are low, so there is no multicollinearity in the data. We can therefore move forward
and apply the logit model to the selected variables.
> logit1<-glm(target~., data=cor_data, family=binomial)
> summary(logit1)
Call:
glm(formula = target ~ ., family = binomial, data = cor_data)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.3277 -0.5202 0.2011 0.5714 2.5038
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.9614 0.4348 4.511 6.44e-06 ***
sex1 -1.4117 0.3894 -3.625 0.000289 ***
chest_pain1 1.3498 0.4868 2.773 0.005560 **
chest_pain2 2.0905 0.4192 4.987 6.12e-07 ***
chest_pain3 2.0161 0.6086 3.313 0.000924 ***
exercise_angina1 -1.2217 0.3721 -3.283 0.001028 **
ST_depression -0.8060 0.1810 -4.454 8.42e-06 ***
n_major_vasel -0.7635 0.1662 -4.595 4.34e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 417.64 on 302 degrees of freedom
Residual deviance: 238.32 on 295 degrees of freedom
AIC: 254.32
Number of Fisher Scoring iterations: 5
From the above results we can see that all the variables are significant, which is good for
prediction. Let's visualize the coefficient estimates.
> cor_data$chest_pain<-as.factor(cor_data$chest_pain)
> factor_heart_data$chest_pain<-as.factor(heart_data$cp)
> logit1.df<-tidy(logit1)
> library(ggthemes)
> library(extrafont)
> logit1.df %>%
+ mutate(term=reorder(term,estimate)) %>%
+ ggplot(aes(term,estimate, fill=estimate))+
+ geom_bar(stat="identity")+
+ ggtitle("Effect of variables resulting Heart Disease")+
+ scale_fill_gradient(low="blue", high="red")+
+ theme_economist()+
+ geom_hline(yintercept=0)+
+ coord_flip()
Output:
From the above output, we observed that chest pain type 2 has the largest impact on heart
disease. A person having any type of chest pain (types 1-3) is more likely to have heart disease.
An increase in the number of major vessels, and the presence of exercise-induced angina, reduce the
estimated chances of heart disease.
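To make these effects easier to interpret, the log-odds coefficients of logit1 can be converted to odds ratios (a minimal sketch added for illustration):
# Sketch: odds ratios from the logit1 coefficients
# values above 1 increase the odds of heart disease (e.g. the chest pain types),
# values below 1 decrease them (e.g. n_major_vasel, exercise_angina1)
round(exp(coef(logit1)), 2)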
Our variance inflation factor results were good, but to further guard against overfitting
we performed cross-validation using the trainControl() function.
> fitControl <- trainControl(method = "repeatedcv",
+ number = 10,
+ repeats = 10,
+ classProbs = TRUE,
+ summaryFunction = twoClassSummary)
> trainset$target<-make.names(trainset$target)
> set.seed(142)
> trainset$target<-as.factor(trainset$target)
> generalized_model <- caret::train(target ~ .,
+ data = trainset ,
+ method = "glm",
+ trControl = fitControl,
+ metric="ROC")
There were 20 warnings (use warnings() to see them)
> generalized_model
Generalized Linear Model
227 samples
13 predictor
2 classes: 'X0', 'X1'
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 205, 204, 205, 205, 205, 203, ...
Resampling results:
ROC Sens Spec
0.8685239 0.7367273 0.8478205
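The ROC column above is the cross-validated area under the ROC curve. As an optional follow-up (a sketch, assuming the generalized_model and testset objects above; the ROCR package is already loaded), the corresponding curve on the held-out test set could be drawn as:
# Sketch: ROC curve and AUC on the held-out test set
prob      <- predict(generalized_model, testset, type = "prob")[, "X1"]  # predicted P(disease)
roc_pred  <- ROCR::prediction(prob, testset$target)
roc_perf  <- ROCR::performance(roc_pred, measure = "tpr", x.measure = "fpr")
plot(roc_perf, main = "Test set ROC curve")
abline(a = 0, b = 1, lty = 2)                                            # reference (chance) line
ROCR::performance(roc_pred, measure = "auc")@y.values[[1]]               # test set AUC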
> pred <- predict(generalized_model, testset,type='raw')
> summary(pred)
X0 X1
33 43
Here we passed the test dataset to the predict() function to obtain predictions.
> pred
[1] X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1 X0 X0 X1 X1 X1 X1 X1 X1 X1 X0 X1
[37] X1 X1 X1 X1 X1 X0 X0 X0 X0 X0 X0 X1 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X1 X0 X0 X0 X0 X0 X0 X0 X1 X1 X0
[73] X0 X1 X0 X0
Levels: X0 X1
> testset$target
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
[55] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Levels: 0 1
We can now compare the predicted values with the observed values.
Here, we created a confusion matrix:
> Confusion_Matrix<-table(testset$target, pred)
> Confusion_Matrix
pred
X0 X1
0 30 5
1 3 38
In the end we determined the accuracy of our model.
> accuracy<-sum(diag(Confusion_Matrix))/sum(Confusion_Matrix)
> accuracy
[1] 0.8947368
We got 89.47% accuracy.
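Accuracy alone can be complemented by sensitivity and specificity; a minimal sketch using the counts from the confusion matrix above:
# Sketch: additional test set metrics from the confusion matrix above
TN <- 30; FP <- 5; FN <- 3; TP <- 38
sensitivity <- TP / (TP + FN)   # 38 / 41, about 0.93, true positive rate
specificity <- TN / (TN + FP)   # 30 / 35, about 0.86, true negative rate
precision   <- TP / (TP + FP)   # 38 / 43, about 0.88
c(sensitivity = sensitivity, specificity = specificity, precision = precision)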
CONCLUSION
We analyzed various parameters that contribute to the risk of heart disease, and we created a
predictive model with 89.47% accuracy that can help determine beforehand whether preventive measures
need to be taken to avoid heart disease and stroke. We examined the relationship between one binary
response variable and a mix of categorical and continuous independent variables, so we chose
logistic regression for prediction. To get better results, we could include additional parameters
and increase the size of the data.
TEAM CONTRIBUTION:
Everyone contributed equally, and we helped each other whenever any one of us was stuck on a certain
point, which was a key highlight of our project. Our teamwork led to the successful completion
of the project.
REFERENCES
Heart Disease UCI. (2019). Kaggle.com. Retrieved 29 March 2019, from
https://www.kaggle.com/ronitf/heart-disease-uci/kernels
ggplot2 histogram: Easy histogram graph with ggplot2 R package - Easy Guides - Wiki -
STHDA. (2019). Sthda.com. Retrieved 29 March 2019, from
http://www.sthda.com/english/wiki/ggplot2-histogram-easy-histogram-graph-with-ggplot2-r-package
A Short Introduction to the caret Package. (2019). Cran.r-project.org. Retrieved 29 March
2019, from https://cran.r-project.org/web/packages/caret/vignettes/caret.html
Cui, B. (2019). Introduction to DataExplorer. Cran.r-project.org. Retrieved 29 March 2019,
from https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html
Machine Learning Part 3: Logistic Regression. (2017). Towards Data Science. Retrieved
29 March 2019, from https://towardsdatascience.com/machine-learning-part-3-logistics-regression-9d890928680f
Evaluating Logistic Regression Models. (2015). R-bloggers. Retrieved 29 March 2019,
from https://www.r-bloggers.com/evaluating-logistic-regression-models/
theme_economist: ggplot color theme based on the Economist. (2019). In ggthemes: Extra Themes,
Scales and Geoms for 'ggplot2'. Rdrr.io. Retrieved 29 March 2019, from
https://rdrr.io/cran/ggthemes/man/theme_economist.html