Brain Stroke Prediction Using Machine Learning Techniques
Nagaraju Devarakonda(a,*), Bethu Lokendra Sri Sai(b), Upadrashta Pravalika(c) and Satuluri Naganjaneyulu(d)
2023 Fifth International Conference on Electrical, Computer and Communication Technologies (ICECCT) | 978-1-6654-9360-4/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICECCT56650.2023.10179623
(a) Computer Science, VIT-AP University, Amaravathi, India
(b) Computer Science, VIT-AP University, Amaravathi, India
(c) Computer Science, VIT-AP University, Amaravathi, India
(d) Computer Science, Lakireddy Bali Reddy College of Engineering, Mylavaram, India
Abstract—Human life involves many life-threatening conditions, and brain stroke is among the most dangerous. A stroke occurs when part of the brain does not receive sufficient blood flow to function, most often because of a blocked artery or bleeding in the brain. When brain cells are deprived of oxygen for an extended period of time, they die. If enough brain cells in the affected area die, the damage becomes irreversible and the patient may lose the abilities that area once controlled. However, restoring blood flow can prevent or at least limit the severity of this type of damage, so stroke treatment is time-critical. Different areas of the brain control different abilities, hence the symptoms of a brain stroke are not the same for everyone. A stroke does to the brain what a heart attack does to the heart: part of the organ loses its blood supply, and the affected cells are starved of oxygen and stop functioning normally. Strokes are very common; stroke is the second leading cause of death worldwide. This paper addresses the prediction of brain stroke using machine learning algorithms, which helps to rehabilitate patients so that they can return to normal life.

Keywords—Rehabilitation, Brain computer interface, Data driven decision making, Prediction.

I. INTRODUCTION

A stroke is a life-threatening condition that occurs when part of the brain does not receive enough blood flow. This is most frequently due to a blockage in an artery or bleeding in the brain. Without a constant blood supply, brain cells cannot survive the lack of oxygen, and they die if left without oxygen for too long. If enough brain cells in an area die, the damage becomes irreversible and the patient may lose the abilities that area once controlled.

Stroke is a highly unpredictable disease [9]. It can happen to anyone regardless of age group, but it has its worst effects on the elderly. Medical professionals try their best to restore the lives of stroke patients through stroke rehabilitation, and scientists are working on ways to predict stroke beforehand for brain-computer interface (BCI) applications. With advancing technology, there are many ways to predict a person's stroke from basic health history. The purpose of this paper is to apply machine learning algorithms and find the one that gives the most accurate results, so that stroke can be predicted in advance and lives saved. This prediction can also be helpful for BCI, a method that helps stroke-affected patients return to normal life, known as stroke rehabilitation: the health history of the patient is read, the stroke is predicted, and the patient is warned. Our data consist of the daily habits of patients who were affected by stroke and of those who were not. Because stroke cannot be predicted directly from daily routine and habits, we use machine learning, which can predict stroke using different classification methods.

II. LITERATURE REVIEW

Many people across the world experience brain stroke, directly or indirectly. The main problem is that once a stroke hits the brain, the person can no longer do anything on their own. They need constant supervision and remain dependent on others for the rest of their lives. One might be lucky enough to survive the stroke but will forever rely on others for anything and everything. Nothing will be under their control; a stroke to the brain is like a stroke to the heart, and in fact it is worse. Taking care of the human body ahead of time is therefore crucial, and predicting a brain stroke before it actually hits a person and ruins their life is essential. The prediction of a brain stroke involves a number of methods and algorithms, which are discussed in the methodology section.

III. PROPOSED METHODOLOGY

A. Data preprocessing methods
The data should be available in a form that the BCI can read and use for brain stroke prediction, so the data should be used for training and passed through the methods shown in Fig. 1. The dataset, which we have taken from Kaggle, is first modified using data preprocessing methods [3]. Starting with the first preprocessing method shown in Fig. 1: the dataset may contain noise and disturbances or duplicate data,
Authorized licensed use limited to: Odisha University of Technology and Research. Downloaded on January 19,2024 at 17:48:07 UTC from IEEE Xplore. Restrictions apply.
so we used data cleaning to remove the values that are missing. There are several ways to handle missing data. A missing value can be deleted, but this should be done with care because deletion loses information. Missing values can also be imputed from other observations, but data integrity may then suffer because the analysis rests on assumptions rather than actual observations.
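The two options for missing data described above, together with duplicate removal, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the column names (`age`, `bmi`, `stroke`) and the choice of mean imputation are hypothetical and are not taken from the dataset used in this paper.

```python
from statistics import mean

# Hypothetical records (made-up values); None marks a missing BMI entry.
records = [
    {"age": 67, "bmi": 36.5, "stroke": 1},
    {"age": 61, "bmi": None, "stroke": 1},
    {"age": 49, "bmi": 34.5, "stroke": 0},
    {"age": 52, "bmi": None, "stroke": 0},
    {"age": 33, "bmi": 25.0, "stroke": 0},
]


def drop_missing(rows, column):
    """Option 1: delete every row whose value in `column` is missing."""
    return [row for row in rows if row[column] is not None]


def impute_mean(rows, column):
    """Option 2: fill missing values with the mean of the observed values."""
    fill = mean(row[column] for row in rows if row[column] is not None)
    return [
        {**row, column: fill} if row[column] is None else dict(row)
        for row in rows
    ]


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


print(len(drop_missing(records, "bmi")))    # rows kept after deletion
print(impute_mean(records, "bmi")[1]["bmi"])  # imputed value for row 2
```

Deletion shrinks the dataset, while imputation keeps every row but replaces an observation with an assumption, which is the trade-off noted in the text.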
Fig. 3. Correlation matrix between attributes

Label encoding [7] refers to the transformation of labels into numeric form in order to make them machine-readable. This is an important step in the preprocessing of structured datasets in supervised learning. The two main steps in binary encoding are:
● Different columns will be formed by splitting the categories.
● Setting ‘0’ for others and ‘1’ as an indicator for the required columns.

TABLE I. DATA AFTER USING THE LABEL ENCODER METHOD
Gender  Married  Work type  Resident  Smoking  Hypertension  Stroke
1       1        2          1         1        0             1
1       1        2          0         2        0             1
0       1        2          1         3        0             0
0       1        3          0         2        1             0
1       1        2          1         1        0             0

Predictive analytics [8] and machine learning are closely related, as predictive models typically involve machine learning algorithms. These models can be trained over time to accommodate new data and values and deliver results. This helps us understand potential future outcomes for patients so that brain stroke can be predicted beforehand. Each classifier processes data differently, so classifiers and models should be selected appropriately to get the desired results.

B. Machine Learning approach using different algorithms for Brain stroke prediction
Brain stroke prediction can be done by taking all attributes into consideration, such as the smoking index, BMI, stress levels and so on. The BCI also needs to predict the brain stroke through the brain with the help of nerves. For this prediction [9], it is very uncertain which attributes lead to brain stroke and which do not, and many attributes contribute to stroke. To make this process easy for BCI, we rely on machine learning classifiers: the predictions are made so that the data can be taken into the BCI, and we need to see which classification techniques achieve higher accuracy. The BCI needs data that it can take in as input and turn into output that makes stroke rehabilitation easier. Having trained, reduced and classified our dataset, we now need classification techniques that classify the data and give the highest accuracy in predicting brain stroke from the patients' health history. We have used four main machine learning algorithms, namely Naive Bayes, decision tree classifier, random forest classifier and multilayer perceptron [1].

IV. NAIVE BAYES
Naive Bayes methods are a set of supervised learning algorithms that apply Bayes' theorem while assuming conditional independence between the features given the value of the class variable. The relationship between the class variable and the dependent feature vector x1 through xn is as follows:
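For the Naive Bayes classifier, the relationship between the class variable and the feature vector can be written in the standard form of Bayes' theorem. The sketch below assumes the symbol y for the class variable, which the text does not name, and adds the usual simplification that follows from the naive conditional-independence assumption, together with the resulting decision rule:

```latex
\begin{align*}
P(y \mid x_1, \dots, x_n) &= \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} \\
P(y \mid x_1, \dots, x_n) &\propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \\
\hat{y} &= \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
\end{align*}
```

The denominator is the same for every class, so it can be dropped when only the most probable class is needed.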
V. RANDOM FOREST CLASSIFIER
The random forest algorithm [5] is a supervised machine learning algorithm that is well suited to both classification and regression problems.
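The idea behind a random forest classifier can be illustrated with a deliberately small sketch: each tree is trained on a bootstrap sample with a random subset of features, and the forest predicts by majority vote. To keep the example short, every tree here is limited to a single split (depth 1), which is an assumption made for illustration and not the configuration used in this paper. The toy data, the feature order (age, hypertension, BMI) and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y, feature_ids):
    """Depth-1 tree: best threshold split among the given features."""
    majority = Counter(y).most_common(1)[0][0]
    best = None  # (impurity, feature, threshold, left label, right label)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no usable split: always predict the majority class
        return (0, float("inf"), majority, majority)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps on bootstrap samples and random feature subsets."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(d), k)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Made-up toy data: [age, hypertension, bmi] -> stroke (1) or no stroke (0).
X_toy = [[30, 0, 21.0], [35, 0, 22.0], [40, 0, 23.0], [45, 0, 24.0],
         [60, 1, 30.0], [65, 1, 31.0], [70, 1, 32.0], [75, 1, 33.0]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X_toy, y_toy)
print(predict(forest, [25, 0, 20.0]))
print(predict(forest, [80, 1, 40.0]))
```

In practice a library implementation with deeper trees would normally be used; the sketch only shows how bootstrapping, random feature selection and voting fit together.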
Fig.6. Decision tree classifier
There are many algorithms that help predict the required output and provide good classification methods for doing so. Machine learning applications are becoming more and more common in the healthcare industry because of their accuracy. Brain stroke prediction using machine learning algorithms has been extensively studied. Machine learning algorithms build models based on sample data to make predictions, recommendations or classifications, and these models can be used to automate the decision-making process. The algorithms used here are supervised learning algorithms, which means they learn from data by building models. The algorithms learn from patterns in the data and improve the model over time. Machine learning is a powerful tool for making predictions from data, and accurate results require high-quality data. To the dataset, which we have taken from Kaggle, we applied four main machine learning algorithms: Naive Bayes, decision tree classifier, random forest classifier and multilayer perceptron. Comparing the accuracy of these four algorithms, we observed that the random forest classifier gives the most accurate results, with an accuracy of 0.9570. With these results, we will be able to identify and predict a patient's stroke beforehand, which also helps considerably in BCI stroke rehabilitation.

REFERENCES
[1] Omar Almomani, Mohammed Amin Almaiah, Adeeb Alsaaidah, Sami Smadi, Adel Hamdan Mohammad, Ahmad Althunibat: Machine Learning Classifiers for Network Intrusion Detection System: Comparative Study, Volume 10, Issue 19, 2021.
[2] Duy-Hien Vu: Privacy-preserving Naive Bayes classification in semi-fully distributed data model, Volume 115, Issue 17, 2022.
[3] I. Chourib, G. Guillard, I.R. Farah, B. Solaiman: Stroke Treatment Prediction Using Features Selection Methods and Machine Learning Classifiers, Volume 43, Issue 6, 2022.
[4] Abeer Ali Alnuaim, Mohammed Zakariah, Prashant Kumar Shukla, Aseel Alhadlaq, Wesam Atef Hatamleh, Hussam Tarazi, R. Sureshbabu and Rajnish Ratna: Human-Computer Interaction for Recognizing Speech Emotions Using Multilayer Perceptron Classifier, Volume 22, Issue 3, 2022.
[5] Sami Briouza, Hassène Gritli, Nahla Khraief, Safya Belghith, Dilbag Singh: Classification of sEMG Biomedical Signals for Upper-Limb Rehabilitation Using the Random Forest Method, Volume 10, Issue 3, 2022.
[6] Lin Sun, Tianxiang Wang, Weiping Ding, Jiucheng Xu, Anhui Tan: Two-stage-neighborhood-based multilabel classification for incomplete data with missing labels, Volume 37, Issue 10, 2022.
[7] Sunaina Tabassum, Ulluri Nissi Priya, Sumanath Beku, Mrs. G. Ahalya Rani: ML classification techniques to improve accuracy of heart disease prediction, Volume 4, Issue 3, 2022.
[8] Huaiyu Wang, Changwei Ji, Teng Su, Cheng Shi, Yunshan Ge, Jinxin Yang, Shuofeng Wang: Comparison and implementation of machine learning models for predicting the combustion phases of hydrogen-enriched Wankel rotary engines, Volume 10, Issue B, 2022.
[9] Elias Dritsas and Maria Trigka: Stroke Risk Prediction with Machine Learning Techniques, Volume 22, Issue 13, 2022.
[10] Laércio Ives Santos, Murilo Osorio Camargo, Marcos Flávio Silveira Vasconcelos D'Angelo, João Batista Mendes, Egydio Emiliano Camargos de Medeiros, André Luiz Sena Guimarães, Reinaldo Martínez Palhares: Decision tree and artificial immune systems for stroke prediction in imbalanced data, Volume 191, Issue 10, 2022.
[11] Ching-Lung Fan: Evaluation of Classification for Project Features with Machine Learning Algorithms, Volume 14, 2022.
[12] Zheyu Zhang, Dengfeng Zhou, Jungen Zhang, Yuyun Xu, Gaoping Lin, Bo Jin, Yingchuan Liang, Yu Geng & Sheng Zhang: Multilayer perceptron-based prediction of stroke mimics in prehospital triage, Volume 36, Issue 4, 2022.