
Brain Stroke Prediction Using Machine Learning Techniques

This paper discusses the use of machine learning algorithms to predict brain strokes, which are critical medical emergencies caused by insufficient blood flow to the brain. The authors emphasize the importance of timely prediction for rehabilitation and recovery, utilizing various classification methods such as Naive Bayes, Decision Trees, and Random Forests to analyze patient data. The study aims to enhance predictive accuracy and improve outcomes for stroke patients through data-driven decision-making.


Brain Stroke Prediction Using Machine Learning Techniques

Nagaraju Devarakonda(a)*, Bethu Lokendra Sri Sai(b), Upadrashta Pravalika(c) and Satuluri Naganjaneyulu(d)
2023 Fifth International Conference on Electrical, Computer and Communication Technologies (ICECCT) | 978-1-6654-9360-4/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICECCT56650.2023.10179623

(a) Computer Science, VIT-AP University, Amaravathi, India
(b) Computer Science, VIT-AP University, Amaravathi, India
(c) Computer Science, VIT-AP University, Amaravathi, India
(d) Computer Science, Lakireddy Bali Reddy College of Engineering, Mylavaram, India

[email protected]

Abstract—Human life is exposed to many life-threatening conditions, and brain stroke is among the most dangerous. A stroke occurs when part of the brain does not receive sufficient blood flow to function, most often because of a blockage in an artery or bleeding in the brain. When brain cells are deprived of oxygen for an extended period of time, they die. If enough brain cells in an area die, the damage becomes irreversible and the patient may lose the abilities that area once controlled. However, restoring blood flow can prevent or at least limit the severity of this type of damage, so stroke treatment is time-critical. Different areas of the brain control different abilities; hence the symptoms of a brain stroke are not the same for everyone. A stroke does to the brain what a heart attack does to the heart. When a stroke occurs, part of the brain loses its blood supply, and the affected brain cells, starved of oxygen, stop functioning normally. Strokes are very common: stroke is the second leading cause of death worldwide. This paper addresses the prediction of brain stroke using machine learning algorithms, which supports rehabilitation so that patients can return to a normal life.

Keywords—Rehabilitation, Brain computer interface, Data driven decision making, Prediction.

I. INTRODUCTION

A stroke is a life-threatening condition that occurs when part of the brain does not receive enough blood flow. This is most frequently due to a blockage in an artery or bleeding in the brain. Without a constant blood supply, brain cells cannot survive the lack of oxygen, and they die if left without oxygen for too long. If enough brain cells in an area die, the damage becomes irreversible and the patient may lose the abilities that area once controlled.

Stroke is a highly unpredictable disease[9]. It can happen to anyone regardless of age group, but its effects are worst on the elderly. Medical professionals are doing their best to restore the lives of stroke patients through stroke rehabilitation. Scientists are also working to find ways to predict stroke beforehand for the brain-computer interface (BCI). With advancing technology, there are many ways to predict a person's stroke from their basic health history. The purpose of this paper is to apply machine learning algorithms and find the one that gives the most accurate results, so that stroke can be predicted in advance and lives can be saved. This prediction can also be helpful for the BCI, a method that helps stroke-affected patients return to their normal life, known as stroke rehabilitation, in which we can read the health history of the patient, predict the stroke and warn them. The data we use consist of patients and their daily habits, including both those who were affected by stroke and those who were not. Because stroke cannot be predicted by inspecting daily routines and habits directly, we turn to machine learning, which can help predict stroke using different classification methods.

II. LITERATURE REVIEW

Many people across the world experience brain stroke directly or indirectly. The main problem is that once a stroke hits, the patient can no longer do anything on their own. They need constant supervision and are dependent on others for the rest of their lives. One might be lucky enough to survive the stroke but will forever rely on others for anything and everything. Nothing will be under their control; a stroke to the brain is comparable to a stroke to the heart, and in fact it is worse. Taking care of the human body ahead of time is therefore crucial, and predicting a brain stroke before it actually hits a person and ruins their life is essential. The prediction of a brain stroke involves a number of methods and algorithms, which are discussed in the methodology section.

III. PROPOSED METHODOLOGY

A. Data preprocessing methods
The data should be available in a form that the BCI can read and use to make the brain stroke prediction, so the data must be prepared by passing through the methods shown in Fig. 1. The dataset, which we have taken from Kaggle, therefore goes through the data preprocessing methods[3]. In the first preprocessing method shown in Fig. 1, the dataset may contain noise and disturbances or duplicate data,

Authorized licensed use limited to: Odisha University of Technology and Research. Downloaded on January 19,2024 at 17:48:07 UTC from IEEE Xplore. Restrictions apply.
so we used the data cleaning method to handle the values which are not present. There are several ways to handle missing data. If a value is missing, the record can be deleted, but this must be done with care, because deletion causes loss of information. Missing values can also be imputed from other observations, but data integrity may then suffer, because the analysis rests on assumptions rather than actual observations.

Fig. 1. Implemented methods

In the dataset that we have used, we found that the BMI attribute has a lot of noisy values. To remove the noisy data we have used a missing-value method[6]. The data is now clean and ready for data selection. For the prediction we may not need all the attributes; a few attributes are enough, so we use the classification method[3] to classify our data. The first method used for classification is the pattern recognition method, which detects regularities and patterns in the data and makes the data easier to read. It classifies data based on statistics or insights gained from patterns and their representation. This technique uses labeled training data to train a pattern recognition system: labels are attached to specific input values, which are used to create pattern-based output. In pattern recognition, patients affected by brain stroke are found throughout the dataset (see Fig. 2), and the data has been classified by the gender attribute, which gives the count of brain stroke patients according to gender (see Fig. 2). Furthermore, the classification is done for the gender attribute together with the type of smoker (see Fig. 2).

Fig. 2. The attributes with missing values in pattern recognition methods.

Now that the data has been classified using suitable methods, stroke prediction becomes much easier than before. To simplify the classification further we have used another method, an encoding method[7]. Better encoding makes for better models, and most algorithms cannot handle categorical variables unless they are converted to numeric values.

Correlation matrix between attributes. We computed the correlation matrix for the patient dataset. The attributes are gender, age, hypertension (1 if the patient suffers from hypertension, 0 if not), heart disease (1 if the patient suffers from heart disease, 0 if not), marital status, work type, residence type, average glucose level, body mass index (BMI) and the patient's smoking status.
● The correlation between the stroke and patient is positive.
● There is a weak positive correlation between stroke of the patient and hypertension.
● There is a weak positive correlation between stroke of the patient and heart disease.
● The correlation between the stroke and children is negative.
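The cleaning and correlation steps described in this subsection can be illustrated with a minimal sketch. It assumes median imputation for the missing BMI values and the Pearson coefficient for the correlation; the paper does not state which imputation rule or coefficient was used, and the six records and the helper names `impute_median` and `pearson` are hypothetical.

```python
from math import sqrt
from statistics import mean, median

# Hypothetical records: (age, hypertension, bmi, stroke); None marks a missing BMI.
records = [
    (67, 0, 36.6, 1),
    (61, 0, None, 1),
    (80, 1, 32.5, 1),
    (49, 0, 34.4, 0),
    (30, 0, None, 0),
    (25, 0, 22.0, 0),
]

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

ages = [r[0] for r in records]
hypertension = [r[1] for r in records]
bmi = impute_median([r[2] for r in records])  # assumed imputation rule
stroke = [r[3] for r in records]

# One row of a correlation matrix: each attribute against the stroke label.
corr_with_stroke = {
    "age": pearson(ages, stroke),
    "hypertension": pearson(hypertension, stroke),
    "bmi": pearson(bmi, stroke),
}
```

Deleting the incomplete rows instead would shrink this toy dataset from six records to four, which is the loss of information the text warns about.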

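The encoding step mentioned above can be sketched in a few lines. The example below assumes two common schemes: label encoding, which maps each category to an integer, and a 0/1 indicator encoding with one column per category, often called one-hot encoding. The column values and the function names `label_encode` and `binary_encode` are hypothetical and are not taken from the paper.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    mapping = {cat: code for code, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def binary_encode(values):
    """Build one 0/1 indicator column per category."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical smoking-status column.
smoking = ["never smoked", "smokes", "formerly smoked", "smokes"]

codes, mapping = label_encode(smoking)
indicators = binary_encode(smoking)
```

Label encoding keeps a single column but imposes an arbitrary order on the categories, while the indicator form avoids that order at the cost of one column per category.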
Fig. 3. Correlation matrix between attributes

Label encoding[7] refers to the transformation of labels into numeric form in order to make them machine-readable. This is an important step in the preprocessing of structured datasets in supervised learning. The two main steps in binary encoding are:
● Different columns will be formed by splitting the categories.
● Setting '0' for others and '1' as an indicator for the required columns.

TABLE I. DATA AFTER USING THE LABEL ENCODER METHOD

Gender  Married  Work type  Resident  Smoking  Hypertension  Stroke
1       1        2          1         1        0             1
1       1        2          0         2        0             1
0       1        2          1         3        0             0
0       1        3          0         2        1             0
1       1        2          1         1        0             0

This is probably the simplest way to encode features[7]. In this method, the categorical data is converted into numerical data as shown in Table I: each category is assigned a numerical value. The data has been converted into binary terms. The data has now been cleaned, modified and classified, and must next be modeled for the prediction.

Modeling and predicting the data[8]. Modeling means training on a dataset to predict the labels from the features, tuning the model for the business need and validating it on hold-out data. Using the binary classification modeling method, we have classified the data into binary terms. The data has been modeled successfully.

After the modeling is done, we have the inputs required to predict the brain stroke of a patient from their historical health data. There is also another way to predict brain stroke: predictive analytics. Predictive analytics[8] and machine learning are closely related, as predictive models typically involve machine learning algorithms. These models can be trained over time to accommodate new data and values and deliver updated results. This helps us understand the potential future outcomes of patients and predict brain stroke beforehand. Each classifier processes data differently, so the classifiers and models must be selected appropriately to get the desired results.

B. Machine Learning approach using different algorithms for Brain stroke prediction
Brain stroke prediction can be done by taking all attributes into consideration, such as the smoking index, BMI, stress levels and so on. The BCI also needs to predict the brain stroke through the brain with the help of nerves. For this prediction[9] it is very uncertain which attributes lead to a brain stroke and which do not, and many attributes lead to stroke. To make this process easy for the BCI, we need help from machine learning classifiers: the predictions are made so that the data can be fed into the BCI, and we must determine which classification technique achieves the highest accuracy. The BCI needs data that it can take in as information and turn into output that makes stroke rehabilitation easier. For this we have trained, reduced and classified our dataset; we now need classification techniques that classify the data and give the highest accuracy in predicting brain stroke from the patients' health history. We have used four main machine learning algorithms, namely Naive Bayes, Decision tree classifier, Random forest classifier and Multi level perceptron[1].

IV. NAIVE BAYES
Naive Bayes methods are a set of supervised learning algorithms that assume conditional independence between the features given the value of the class variable. The relationship between the class variable and the dependent feature vector x1 through xn is as follows:

(1)

(2)

(3)

(4)

(5)

This algorithm follows the principle that each feature or attribute used for classification is independent of the others.
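For reference, the naive Bayes relationship between a class variable and a feature vector is usually derived in five steps. The block below is a sketch of that textbook derivation, assuming the standard notation with class variable $y$ and features $x_1, \dots, x_n$; it is the common formulation rather than a transcription of the paper's own equations.

```latex
\begin{align}
P(y \mid x_1, \dots, x_n) &= \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} \\
P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) &= P(x_i \mid y) \\
P(y \mid x_1, \dots, x_n) &= \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)} \\
P(y \mid x_1, \dots, x_n) &\propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \\
\hat{y} &= \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
\end{align}
```

The first line is Bayes' theorem, the second is the conditional independence assumption, and the remaining lines follow by substitution and by dropping the denominator, which is constant for a given input.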

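The independence principle of naive Bayes can be made concrete with a small from-scratch sketch. It assumes binary (0/1) features and add-one (Laplace) smoothing, neither of which is specified in the paper; the training rows and the function names `train_naive_bayes` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts

def predict(row, class_counts, value_counts, n_values=2):
    """Return the class maximising log P(y) + sum_i log P(x_i | y).

    Uses add-one smoothing; n_values is the number of distinct values a
    feature can take (2 for the 0/1 features assumed here).
    """
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = log(count / total)
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            score += log((seen + 1) / (count + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical encoded rows: (hypertension, heart_disease), label = stroke.
X = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0)]
y = [1, 1, 1, 0, 0, 0]
model = train_naive_bayes(X, y)

high_risk = predict((1, 1), *model)
low_risk = predict((0, 0), *model)
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together.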
Fig. 4. Naive Bayes classifier

TABLE II. RESULTS AFTER APPLYING NAIVE BAYES CLASSIFIER.

              Precision  recall  f1-score  support  Stroke
yes           1          2       1         1        1
no            1          2       0         2        1
accuracy      1          2       1         3        0
macro avg     1          3       0         2        0
weighted avg  1          2       1         1        0

V. RANDOM FOREST CLASSIFIER
The random forest algorithm[5] is a supervised machine learning algorithm that is well suited to classification and regression problems.

Fig. 5. Random forest classifier

The logic behind the random forest model is that multiple uncorrelated models (separate decision trees) give results of much better quality. The computing time required to establish the RF classification model is

(6)

TABLE III. RESULTS AFTER APPLYING NAIVE BAYES CLASSIFIER.

              Precision  recall  f1-score  support
yes           0.95       1.00    0.98      1172
no            0.00       0.00    0.00      56
accuracy      0.00       0.00    0.95      1228
macro avg     0.48       0.50    0.49      1228
weighted avg  0.91       0.95    0.93      1228

Machine learning algorithms, especially random forests, can be effectively used for long-term prediction of mortality and morbidity in stroke patients. Future research may include the use of images and genetic information. Moreover, the developed robust models can be used in other applications and other fields with similar data.

VI. DECISION TREE CLASSIFIER
The decision tree algorithm[4] does not follow the same rules as other supervised learning algorithms; with its help one can solve regression as well as classification problems.
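Decision trees are commonly built by choosing, at each node, the split that most reduces entropy, which is the idea behind the entropy and information gain discussion later in this section. The sketch below assumes Shannon entropy in bits and a split on a single categorical attribute; the toy data and the function names `entropy` and `information_gain` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting `labels` on `feature` values."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data with a balanced stroke label.
stroke = [1, 1, 0, 0]
hypertension = [1, 1, 0, 0]  # perfectly aligned with the label
gender = [0, 1, 0, 1]        # carries no information about the label

gain_hypertension = information_gain(hypertension, stroke)
gain_gender = information_gain(gender, stroke)
```

On this toy data the attribute aligned with the label recovers the full entropy of the label, while the uninformative attribute yields no gain, so a tree builder would split on the former first.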

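The precision, recall and f1-score columns in the results tables follow the standard per-class definitions. As a sketch, the function below computes them from confusion counts. The scenario in which every one of the 1228 test rows is predicted "yes" is an assumption chosen for illustration; under it, the rounded values coincide with the "yes" and "no" rows of Table III. The function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Assumed scenario: all rows are predicted "yes".
# Supports are taken from Table III (1172 "yes" rows, 56 "no" rows).
yes_scores = precision_recall_f1(tp=1172, fp=56, fn=0)
no_scores = precision_recall_f1(tp=0, fp=0, fn=56)
accuracy = 1172 / (1172 + 56)
```

This illustrates why accuracy alone can be misleading on an imbalanced dataset: a model that never predicts the minority class still scores about 0.95 here.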
Fig. 6. Decision tree classifier

The above diagram depicts the general process flow of a decision tree classifier. Here, the decision on brain stroke involves different attributes such as age, smoking, marital status, sex, glucose level, BMI, etc. With each attribute taking its own values, a brain stroke can arise in many ways. So, in order to rehabilitate from a brain stroke, one must know which attributes are causing a person's stroke. The pattern analysis helps to rehabilitate a person and bring them back to their normal behavior.

One must not choose data with a lot of randomness, as it creates more entropy, and more entropy is likely to decrease the information gain. So one must carefully choose an algorithm for which the data is less random, minimizing the entropy and maximizing the information gain[11]. Once the entropy has been minimized and the information gain maximized, the result can be used to construct a decision tree.

TABLE IV. RESULTS AFTER APPLYING NAIVE BAYES CLASSIFIER.

              Precision  recall  f1-score  support
yes           0.96       0.98    0.97      1172
no            0.10       0.05    0.07      56
accuracy      0.00       0.00    0.94      1228
macro avg     0.53       0.52    0.52      1228
weighted avg  0.92       0.94    0.93      1228

VII. MULTI LEVEL PERCEPTRON
A multi-layer artificial neural network[4] is an integral part of deep learning. A multilayer perceptron has an input layer, an output layer and multiple hidden layers, with many neurons kept together[12].

Fig. 7. Multi level perceptron

An MLP is a feedforward neural network, which means that data is transferred from the input layer to the output layer in the forward direction[12]. With a multilayer perceptron, the network can have multiple layers of neurons and is able to learn more complex patterns. For MLP training, supervised learning is performed by presenting input data from the training dataset so that the connection weights are iteratively adjusted to produce the desired input/output mapping function. After appropriate training, we evaluated the generalization performance of the model using an external test dataset[12].

VIII. EXPERIMENT RESULTS
Having applied the machine learning algorithms to the healthcare dataset to predict stroke in patients, we obtained the results tabulated below. The random forest classifier gives more accurate results than the other three algorithms. With the knowledge of stroke patients gained by applying these algorithms, it becomes much easier to carry out stroke rehabilitation and return patients affected by stroke to a normal life.

TABLE V. EVALUATION OF CLASSIFIERS.

Method                    Accuracy
Naive Bayes               0.9244
Random forest classifier  0.9570
Decision tree classifier  0.9162
Multi level perceptron    0.9439

CONCLUSION
The BCI (brain-computer interface) needs input so that it can predict how the stroke is happening inside the brain with the help of many such algorithms. In machine learning, there

are many algorithms that help to predict the required output and provide the best classification methods for doing so. Machine learning applications are becoming more and more common in the healthcare industry because of their accuracy. Brain stroke prediction using machine learning algorithms has been extensively studied. Machine learning algorithms build models based on sample data to make predictions, recommendations or classifications, and these models can be used to automate the decision-making process. The algorithms used here are supervised learning algorithms, which means that they learn from labeled data by building models. The algorithms learn from patterns in the data and improve the model over time. Machine learning is a powerful tool for making predictions from data, but to get accurate results it is important to use high-quality data. To the dataset, which we have taken from Kaggle, we have applied four main machine learning algorithms: Naive Bayes, decision tree classifier, random forest classifier and multi level perceptron. Comparing the accuracy of these four algorithms, we observed that the random forest classifier gives the most accurate results, with an accuracy of 0.9570. With the results we have obtained, we will be able to identify and predict a patient's stroke beforehand, which also helps considerably in BCI stroke rehabilitation.

REFERENCES
[1] Omar Almomani, Mohammed Amin Almaiah, Adeeb Alsaaidah, Sami Smadi, Adel Hamdan Mohammad, Ahmad Althunibat: Machine Learning Classifiers for Network Intrusion Detection System: Comparative Study, Volume 10, Issue 19, 2021.
[2] Duy-Hien Vu: Privacy-preserving Naive Bayes classification in semi-fully distributed data model, Volume 115, Issue 17, 2022.
[3] I. Chourib, G. Guillard, I. R. Farah, B. Solaiman: Stroke Treatment Prediction Using Features Selection Methods and Machine Learning Classifiers, Volume 43, Issue 6, 2022.
[4] Abeer Ali Alnuaim, Mohammed Zakariah, Prashant Kumar Shukla, Aseel Alhadlaq, Wesam Atef Hatamleh, Hussam Tarazi, R. Sureshbabu and Rajnish Ratna: Human-Computer Interaction for Recognizing Speech Emotions Using Multilayer Perceptron Classifier, Volume 22, Issue 3, 2022.
[5] Sami Briouza, Hassène Gritli, Nahla Khraief, Safya Belghith, Dilbag Singh: Classification of sEMG Biomedical Signals for Upper-Limb Rehabilitation Using the Random Forest Method, Volume 10, Issue 3, 2022.
[6] Lin Sun, Tianxiang Wang, Weiping Ding, Jiucheng Xu, Anhui Tan: Two-stage-neighborhood-based multilabel classification for incomplete data with missing labels, Volume 37, Issue 10, 2022.
[7] Sunaina Tabassum, Ulluri Nissi Priya, Sumanath Beku, G. Ahalya Rani: ML classification techniques to improve accuracy of heart disease prediction, Volume 4, Issue 3, 2022.
[8] Huaiyu Wang, Changwei Ji, Teng Su, Cheng Shi, Yunshan Ge, Jinxin Yang, Shuofeng Wang: Comparison and implementation of machine learning models for predicting the combustion phases of hydrogen-enriched Wankel rotary engines, Volume 10, Issue B, 2022.
[9] Elias Dritsas and Maria Trigka: Stroke Risk Prediction with Machine Learning Techniques, Volume 22, Issue 13, 2022.
[10] Laércio Ives Santos, Murilo Osorio Camargo, Marcos Flávio Silveira Vasconcelos D'Angelo, João Batista Mendes, Egydio Emiliano Camargos de Medeiros, André Luiz Sena Guimarães, Reinaldo Martínez Palhares: Decision tree and artificial immune systems for stroke prediction in imbalanced data, Volume 191, Issue 10, 2022.
[11] Ching-Lung Fan: Evaluation of Classification for Project Features with Machine Learning Algorithms, Volume 14, 2022.
[12] Zheyu Zhang, Dengfeng Zhou, Jungen Zhang, Yuyun Xu, Gaoping Lin, Bo Jin, Yingchuan Liang, Yu Geng and Sheng Zhang: Multilayer perceptron-based prediction of stroke mimics in prehospital triage, Volume 36, Issue 4, 2022.
