
Prediction of Stroke Using Machine Learning: June 2020

The document discusses predicting stroke using machine learning. Stroke is a major health issue and predictive models using patient data may help identify those at high risk of stroke and enable preventative measures. The authors aim to build models to predict stroke risk based on modifiable factors and provide personalized risk assessments and lifestyle recommendations.



Prediction of Stroke Using Machine Learning

Conference Paper · June 2020


All content following this page was uploaded by Srikanth .S on 25 June 2020.



Prediction of Stroke Using Machine Learning

KUNDER AKASH MAHESH, SHASHANK H N, SRIKANTH S, THEJAS A M
Dept. of Computer Science & Engineering, CMRIT, Bangalore, Karnataka, India

Abstract - A stroke is a blood clot or bleed in the brain that can cause permanent damage affecting mobility, cognition, sight or communication. Stroke is considered a medical emergency and can cause long-term neurological damage, complications and often death. The majority of strokes are classified as ischemic embolic or hemorrhagic. An ischemic embolic stroke happens when a blood clot forms away from the patient's brain, usually in the heart, and travels through the bloodstream to lodge in narrower brain arteries. A hemorrhagic stroke happens when an artery in the brain leaks blood or ruptures. Stroke is the second leading cause of death worldwide and one of the most life-threatening diseases for persons above 65 years. It injures the brain much as a heart attack injures the heart. Once a stroke occurs, it not only incurs huge medical costs and permanent disability but can eventually lead to death. Every 4 minutes someone dies of stroke, but up to 80% of strokes can be prevented if we can identify or predict the occurrence of stroke at an early stage.

INTRODUCTION

Burden of Stroke in the World - Stroke is the second leading cause of death and the leading cause of adult disability worldwide, with 400-800 strokes per 100,000, 15 million new acute strokes every year, 28,500,000 disability-adjusted life-years and 28-30-day case fatality ranging from 17% to 35%. The burden of stroke will likely worsen, with stroke and heart disease related deaths projected to increase to five million in 2020, compared to three million in 1998. This will be a result of continuing health and demographic transition resulting in an increase in vascular disease risk factors and
population of the elderly. Developing countries account for 85% of the global deaths from stroke. The social and economic consequences of stroke are substantial. The cost of stroke for the year 2002 was estimated to be as high as $49.4 billion in the United States of America (USA), while costs after discharge were estimated to amount to 2.9 billion Euros in France.

Causes of mortality from stroke - Death from stroke results from co-morbidities and/or complications. Complications of stroke may arise at different time periods. The period from the beginning of stroke symptoms through the first month following onset is the most critical for survival, with the highest number of fatalities in the first week. Complications of stroke include hyperglycemia, hypoglycemia, hypertension, hypotension, fever, infarct extension or rebleeding, cerebral edema, herniation, coning, aspiration, aspiration pneumonia, urinary tract infection, cardiac dysrhythmia, deep venous thrombosis and pulmonary embolism, among others. During the first week from stroke onset, death is usually due to transtentorial herniation and hemorrhage, with death due to hemorrhage happening within the first three days and death due to cerebral infarction usually occurring between the third and sixth day. One week after the onset of stroke, death is usually due to complications resulting from relative immobility, such as pneumonia, sepsis and pulmonary embolism.

Different studies have found varied factors associated with stroke mortality in their setting. For example, the most common predictors of death from stroke for those aged more than 65 years reported by Mackay included previous stroke, atrial fibrillation and hypertension. Nigeria reported a 12.6% 30-day case fatality of all strokes. Among patients with hemorrhagic stroke, fixed dilated pupil(s), a Glasgow coma score of less than 10 on admission, swallowing difficulties at admission, fever, lung infection and no aspirin treatment were independent risk factors for a lethal outcome. Yikona J et al also observed that stroke severity, neurological deterioration during hospitalization, non-use of antithrombotics during hospital admission and lack of assessment by a stroke team were the most consistent predictors of case fatality at seven days, 30 days and one year after stroke. In Pretoria, South Africa, case fatality at 30 days was much higher, 22% for ischemic stroke and 58% for cerebral hemorrhagic stroke, and hypertension was significantly associated with stroke. At Mulago hospital, a 30-day case fatality of 43.8% was reported among 133 patients (mean age 65.8 ± 15.8 years), with fever > 37.5°C (OR 2.81; 95% CI 1.2-6.6) and impaired level of consciousness with a GCS < 9 (OR 0.13; 95% CI 0.005-0.35) significantly associated with increased mortality.

Traditional risk factors associated with stroke - Stroke can occur in anyone regardless of race, gender or age; however, the chances of having a stroke increase if an individual has certain risk factors. The best way to protect oneself and others is to understand personal risk and how to manage it.
Studies have shown that 80% of strokes can be prevented in this way. Stroke risk factors are divided into modifiable and non-modifiable. The modifiable risk factors are further subdivided into lifestyle risk factors and medical risk factors. Lifestyle risk factors, which include smoking, alcohol use, physical inactivity and obesity, can often be changed, while medical risk factors such as high blood pressure, atrial fibrillation, diabetes mellitus and high cholesterol can usually be treated. A large multicenter case-control study (INTERSTROKE) showed that ten factors are associated with 90% of stroke risk and that half of these are modifiable. Non-modifiable risk factors, on the other hand, cannot be controlled, but they help to identify individuals at risk for stroke.

Prevention of stroke - More than 70% of strokes are first events, which makes primary stroke prevention particularly important. Interventions should be targeted at behavior modification, which however requires information about the baseline perceptions, knowledge and prevalence of risk factors in defined populations.

PROBLEM STATEMENT

Stroke is the second leading cause of death worldwide and remains an important health burden both for individuals and for national healthcare systems. Potentially modifiable risk factors for stroke include hypertension, cardiac disease, diabetes and dysregulation of glucose metabolism, atrial fibrillation and lifestyle factors. Therefore, the goal of our project is to apply principles of machine learning over large existing data sets to effectively predict stroke based on potentially modifiable risk factors. We then intend to develop an application that provides a personalized warning on the basis of each user's level of stroke risk, together with a lifestyle correction message about the stroke risk factors.

LITERATURE SURVEY

In order to gain the required knowledge about the concepts related to the present analysis, the existing literature was studied. Some of the important conclusions drawn from it are listed below.

“Computer Methods and Programs in Biomedicine” - Jae-woo Lee, Hyun-sun Lim, Dong-wook Kim, Soon-ae Shin, Jinkwon Kim, Bora Yoo, Kyung-hee Cho - The purpose of this paper was to calculate the 10-year stroke prediction probability and to classify the user's individual probability of stroke into five categories.

“Probability of Stroke: A Risk Profile from the Framingham Study” - Philip A. Wolf, MD; Ralph B. D'Agostino, PhD; Albert J. Belanger, MA; and William B. Kannel, MD - In this paper, a health risk appraisal function was developed for the prediction of stroke using the Framingham Study cohort.

“Development of an Algorithm for Stroke Prediction: A National Health Insurance Database Study” - Min SN, Park SJ, Kim DJ,
Subramaniyam M, Lee KS - This research aimed to derive a model equation for developing a stroke pre-diagnosis algorithm using the potentially modifiable risk factors.

“Stroke prediction using artificial intelligence” - M. Sheetal Singh, Prakash Choudhary - In this paper, a decision tree algorithm is used for feature selection, a principal component analysis algorithm is used for reducing the dimension, and a back-propagation neural network classification algorithm is adopted to construct a classification model.

“Medical software user interfaces, stroke MD application design (IEEE)” - Elena Zamsa - The article presents the design of an application interface for associated medical data visualization and management for neurologists in a stroke clustering and prediction system called Stroke MD.

“Focus on stroke: Predicting and preventing stroke” - Michael Regnier - This paper focuses on cutting-edge prevention of stroke.

“Effective Analysis and Predictive Model of Stroke Disease using Classification Methods” - A. Sudha, P. Gayathri, N. Jaisankar - In this paper, a principal component analysis algorithm is used for reducing the dimensions; it determines the attributes contributing most to the prediction of stroke disease and predicts whether the patient is suffering from stroke disease or not.

“Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study” - Rohit Ghosh, Swetha Tanamala, Mustafa Biviji, Norbert G Campeau, Vasantha Kumar Venugopal - Non-contrast head CT scan is the current standard for initial imaging of patients with head trauma or stroke symptoms. This article aimed to develop and validate a set of deep learning algorithms for automated detection.

PROPOSED SYSTEM

Algorithms Involved-

The methodologies used in our project are:
1. Decision Tree
2. Naïve Bayes
3. Artificial Neural Network

Decision Tree - A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that contains only conditional control statements. The decision tree is one of the important methods for handling high-dimensional data. Tree-based learning algorithms are considered to be among the best and most widely used supervised learning methods. Tree-based methods give predictive models high accuracy, stability and ease of interpretation. Unlike linear models, they map non-linear relationships quite well, and they can be applied to both classification and regression problems. Fig 1 represents part of the decision tree model for prediction of stroke diseases.
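To make the idea of a tree built from conditional statements concrete, the following minimal sketch finds the single best threshold split on one numeric feature, which is what one node of a decision tree does. The paper does not state which splitting criterion or library was used, so Gini impurity, the function names `gini` and `best_split`, and the toy glucose values are all illustrative assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best 'value <= threshold' split."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        # Candidate threshold: midpoint between two neighbouring values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: average glucose level and stroke label (1 = stroke).
glucose = [85.0, 92.0, 101.0, 150.0, 180.0, 210.0]
stroke = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(glucose, stroke)
print(f"if glucose <= {threshold}: predict 0, else predict 1 (gini={impurity})")
```

A full tree would apply `best_split` recursively to each child and across all features; this sketch stops at one node to keep the conditional structure visible.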
Fig 1: - Decision tree

Naive Bayes - A Naïve Bayes classifier is a probabilistic machine-learning model that is used for classification tasks. The crux of the classifier is Bayes' theorem:

P(A|B) = P(B|A) P(A) / P(B)

Using Bayes' theorem, we can find the probability of A happening, given that B has occurred. Here, B is the evidence and A is the hypothesis. The assumption made is that the predictors/features are independent; that is, the presence of one particular feature does not affect the others. Hence it is called naïve.

Fig 2: - Bayesian classifier

Naive Bayes algorithms are mostly used in sentiment analysis, spam filtering, recommendation systems, etc. They are fast and easy to implement, but their biggest disadvantage is the requirement that the predictors be independent. In most real-life cases the predictors are dependent, which hinders the performance of the classifier.

Artificial Neural Network - Neural networks are a set of algorithms, modelled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.
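As a small illustration of translating real-world data into a numeric vector and passing it through a network, the sketch below encodes a patient record and runs one forward pass through a network with a single hidden layer. Everything specific here is an assumption made for illustration: the field names, the scaling constants, the layer sizes and the hand-picked weights, which are untrained and carry no clinical meaning. The paper does not describe its network architecture.

```python
import math


def sigmoid(x):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def encode(record):
    """Translate a patient record (dict) into a numeric feature vector.

    Field names and scaling constants are illustrative assumptions.
    """
    return [
        record["age"] / 100.0,
        1.0 if record["hypertension"] else 0.0,
        1.0 if record["heart_disease"] else 0.0,
        record["avg_glucose"] / 300.0,
    ]


def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(w . x + b) for each neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(vector, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input vector -> hidden layer -> single output score."""
    hidden = dense(vector, hidden_w, hidden_b)
    return dense(hidden, [out_w], [out_b])[0]


# Hand-picked, untrained weights (assumption): 4 inputs -> 2 hidden -> 1 output.
HIDDEN_W = [[1.0, 2.0, 2.0, 1.5], [-1.0, 0.5, 0.5, -0.5]]
HIDDEN_B = [-2.0, 0.0]
OUT_W = [3.0, -1.0]
OUT_B = -1.0

patient = {"age": 70, "hypertension": True, "heart_disease": False, "avg_glucose": 210.0}
x = encode(patient)
score = forward(x, HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print("feature vector:", x)
print("network output:", round(score, 3))
```

In a trained network the weights would be learned from labeled records, typically by back-propagation; only the forward computation is shown here.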
Fig 3: Artificial Neural Network

Neural networks help us cluster and classify. They help to group unlabeled data according to similarities among the example inputs, and they classify data when they have a labeled dataset to train on.

DATASET USED-

Fig 4: - Dataset

Architectural Design

System architecture is the conceptual model that defines the structure, behavior, and more views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviors of the system. The overall logical structure of the project is divided into processing modules, and a conceptual data structure is defined as the Architectural Design.

Figure 5: System Architecture

Figure 5 shows the overall logical structure of the project with the following modules:
1. Input data: Risk factors like age, gender, hypertension, heart disease, BMI, smoking status, glucose level.
2. Machine Learning Techniques: Artificial Neural Networks, Decision Tree, Naïve Bayes classifier.
3. Analysis: Prediction and analysis of stroke, whose performance is based on machine learning techniques.
4. Management: Suggestions and improvement for stroke victims.

WORK FLOW
Fig 6: - Work Flow

IMPLEMENTATION STEPS-

1. Clean the missing values in both the training and testing data
2. Apply a label encoder to convert object (categorical) columns into integers
3. Balance the dataset
4. Split the data into training and testing sets
5. Build the Decision Tree model
6. Build the Naïve Bayes model
7. Build the Artificial Neural Network model
8. Create a GUI and integrate the trained models into the GUI module
9. Enter the new data for which stroke has to be predicted
10. Result: predicted data with respect to each model

RESULTS AND PERFORMANCE EVALUATION-

Performance Analysis
In this section, snapshots showing the performance of the three algorithms used in this project, i.e. Decision Tree, Naïve Bayes and Artificial Neural Network, are compared. The AUC-ROC (Area Under the Curve - Receiver Operating Characteristic) curve is a performance measurement for classification problems at various threshold settings. ROC is a probability curve, and AUC represents the degree or measure of separability. It tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s. By analogy, the higher the AUC, the better the model is at distinguishing between patients with the disease and those without it.

The ROC curve is plotted with the true positive rate (TPR) against the false positive rate (FPR), where TPR is on the y-axis and FPR is on the x-axis.
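The AUC-ROC measure used in the performance analysis can be sketched in a few lines: sweep the decision threshold over the predicted scores, record one (FPR, TPR) point per threshold, and integrate the resulting curve with the trapezoidal rule. The function names and the toy labels and scores are illustrative assumptions, not the authors' evaluation code or data, and the sketch assumes both classes are present.

```python
def roc_curve(labels, scores):
    """Return ROC points (FPR, TPR), one per distinct score threshold.

    labels: 0/1 ground truth; scores: higher means more likely positive.
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, labels), reverse=True)
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score,
        # so tied scores produce a single diagonal segment.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical labels (1 = stroke) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_curve(labels, scores)
print("ROC points (FPR, TPR):", points)
print("AUC:", auc(points))
```

An AUC of 1.0 means every positive is scored above every negative, while 0.5 corresponds to scores that carry no information about the class.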
Graphs and Analysis

Frequency of Stroke before Balanced Dataset
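The class frequencies before and after balancing can be checked with a simple count. The paper does not say which balancing method was applied, so the sketch below assumes random oversampling of the minority class; the `oversample` helper, the `stroke` label key and the toy records are hypothetical.

```python
import random
from collections import Counter


def oversample(rows, label_key="stroke", seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw copies (with replacement) to fill the gap to the largest class.
        balanced.extend(dict(rng.choice(group)) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy records: 8 without stroke, 2 with stroke.
rows = [{"age": 40 + i, "stroke": 0} for i in range(8)] + [
    {"age": 70, "stroke": 1},
    {"age": 75, "stroke": 1},
]

before = Counter(row["stroke"] for row in rows)
after = Counter(row["stroke"] for row in oversample(rows))
print("before balancing:", dict(before))
print("after balancing:", dict(after))
```

When duplicated rows are created before the train/test split, copies of the same record can end up on both sides of the split, so a common precaution is to oversample the training portion only.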
GUI

Conclusion
The prediction models assessed, Decision Tree, Naive Bayes and Neural Network, showed acceptable accuracy in identifying stroke-prone patients. This project hence helps to predict stroke risk using a prediction model and provides a personalized warning and lifestyle correction message through a web application. By doing so, it urges medical users to strengthen their motivation for health management and induces changes in their health behaviors.

Future Scope
This project helps to predict stroke risk using a prediction model in older people and in people who are addicted to the risk factors mentioned in the project. In future, the same project can be extended to give the stroke percentage using the output of the current project. This project can also be used to find stroke probabilities in young people and underage people by collecting the respective risk factor information and consulting doctors.

REFERENCES

[1]. “Computer Methods and Programs in Biomedicine” - Jae-woo Lee, Hyun-sun Lim, Dong-wook Kim, Soon-ae Shin, Jinkwon Kim, Bora Yoo, Kyung-hee Cho
[2]. “Probability of Stroke: A Risk Profile from the Framingham Study” - Philip A. Wolf, MD; Ralph B. D'Agostino, PhD; Albert J. Belanger, MA; and William B. Kannel, MD
[3]. “Development of an Algorithm for Stroke Prediction: A National Health Insurance Database Study” - Min SN, Park SJ, Kim DJ, Subramaniyam M, Lee KS
[4]. “Stroke prediction using artificial intelligence” - M. Sheetal Singh, Prakash Choudhary
[5]. “Medical software user interfaces, stroke MD application design (IEEE)” - Elena Zamsa
[6]. “Focus on stroke: Predicting and preventing stroke” - Michael Regnier
[7]. “Effective Analysis and Predictive Model of Stroke Disease using Classification Methods” - A. Sudha, P. Gayathri, N. Jaisankar
[8]. “Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study” - Rohit Ghosh, Swetha Tanamala, Mustafa Biviji, Norbert G Campeau, Vasantha Kumar Venugopal
