
EDA Presentation

This document summarizes an exploratory data analysis of factors related to heart stroke. It finds that age, pre-existing conditions like hypertension, smoking, self-employment, and marriage increase the risk of stroke. Specifically, the risk is highest for those aged 60-80, with hypertension or heart disease, who smoke, are self-employed, or married. Higher glucose levels and BMIs between 20-37 also increase stroke risk.

Uploaded by

rutvikpatel62001

Exploratory Data Analysis on Heart Stroke

Summary of numeric variables

Summary of categorical variables


Distribution Of Numerical Data

• The skewness score of -0.078 for Age is very close to zero, so the age distribution is approximately symmetric, with only a very slight left skew (a marginally longer tail toward lower ages).

• The skewness score of 1.5722 for average glucose level indicates that the distribution of glucose levels is strongly right-skewed. Most individuals have lower glucose levels, and a long tail extends toward high values.

• The skewness score of 1.076 for BMI indicates that the distribution of BMIs is clearly right-skewed. Most individuals have lower BMIs, and a tail extends toward high values.
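As a rough illustration of how a skewness score is obtained, the sketch below computes the moment coefficient of skewness (third central moment divided by the 1.5 power of the second) using only the standard library. The function name and the sample lists are made up for illustration and are not taken from the stroke dataset. Library routines may apply a sample-size correction, so their values can differ slightly from this formula.

```python
def skewness(values):
    """Moment coefficient of skewness: g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments (population form, divide by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up samples, for illustration only.
symmetric = [1, 2, 3, 4, 5]          # no tail on either side
right_tailed = [1, 1, 2, 2, 3, 10]   # one large value stretches the right tail

print(skewness(symmetric))     # zero for a symmetric sample
print(skewness(right_tailed))  # positive for a right-skewed sample
```

A common rule of thumb treats an absolute skewness below about 0.5 as roughly symmetric and above about 1 as substantial skew.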
Univariate Analysis of Numerical Variables

● Age: The mean age of individuals in the dataset is 45.68 years with a standard deviation of 20.83 years. The age range is from 5 to 82 years. The middle 50% of individuals are between 29 and 62 years old.
● Avg_glucose_level: The mean average glucose level is 106.15 mg/dL with a standard deviation of 45.28 mg/dL. The glucose levels range from 55.12 mg/dL to 271.74 mg/dL. The distribution is skewed to the right.
● BMI: The mean BMI of individuals in the dataset is 28.89 kg/m² with a standard deviation of 7.85 kg/m². The BMI values range from 10.30 kg/m² to 97.60 kg/m². The middle 50% of individuals have a BMI between 23.5 and 33.1 kg/m².
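The figures above (mean, standard deviation, range, and the bounds of the middle 50%) are standard summary statistics. A minimal sketch of how they can be computed with Python's `statistics` module follows. The `summarize` helper and the `ages` list are hypothetical and chosen only for illustration; they are not the dataset's values.

```python
import statistics


def summarize(values):
    """Return mean, sample standard deviation, range and quartiles."""
    # method="inclusive" interpolates between observed data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up ages, for illustration only.
ages = [5, 18, 29, 35, 45, 52, 62, 70, 82]
summary = summarize(ages)

# The middle 50% of the sample lies between q1 and q3.
print(summary["q1"], summary["q3"])
```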
Analyze Relationship Between Numerical Variables

All three pairwise correlations between the numerical variables are weak and positive. The correlation between age and average glucose level (0.23) is the strongest, followed by age and BMI (0.21), while the correlation between BMI and average glucose level (0.17) is the weakest.
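Assuming the values quoted are Pearson correlation coefficients (the slides do not say which measure was used), the following sketch shows how such a coefficient is computed from two columns. The function name and the sample lists are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up columns, for illustration only.
x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))    # perfectly linear, increasing
print(pearson_r(x, [1, -1, -1, 1]))  # no linear relationship
```

The coefficient ranges from -1 to 1, so values around 0.2 indicate only a weak linear relationship.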
Univariate Analysis For Categorical Variables

● There are 2969 (about 59%) females and 2097 (about 41%) males in the dataset.
● The majority of the individuals (90.30%) in the dataset do not have hypertension, while 9.70% have hypertension.
● Similarly, the majority of the individuals (94.6%) in the dataset do not have heart disease, while 5.4% have heart disease.
● The majority of the individuals (65.6%) in the dataset are married, while 34.4% are not married.
● The most common work type is private (57.20%), followed by self-employed (24.91%), children (6.96%), government jobs (6.57%), and never worked (4.36%).
● A slight majority of the individuals (51.4%) in the dataset reside in urban areas, while 48.6% reside in rural areas.
● The most common smoking status is non-smoker (37.0%), followed by unknown (30.2%), formerly smoked (17.3%), and current smoker (15.4%).
● The large majority of the individuals (95.1%) in the dataset have not had a stroke, while only 4.9% have had a stroke.
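Percentages like these come from counting each category and dividing by the total. The sketch below does this for the gender counts quoted above (2969 and 2097). The `category_shares` helper is a hypothetical name, and the list is rebuilt from those two counts rather than read from the original file.

```python
from collections import Counter


def category_shares(labels):
    """Map each category to (count, percentage of total rounded to 0.1)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (count, round(100 * count / total, 1))
        for label, count in counts.most_common()
    }


# Rebuilt from the counts quoted in the text, not loaded from the dataset.
gender = ["Female"] * 2969 + ["Male"] * 2097
shares = category_shares(gender)
print(shares)  # {'Female': (2969, 58.6), 'Male': (2097, 41.4)}
```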
Bivariate Analysis for Categorical Variables and Target

Does the type of work or residence (urban vs. rural) have any effect on the incidence of stroke?

Self-employed individuals have a higher incidence of stroke in this dataset than those in private or government jobs.

There is no notable difference in the incidence of stroke between individuals residing in urban and rural areas.
Does having hypertension or heart disease increase the likelihood of having a stroke?

Individuals with hypertension have a higher incidence of stroke than those without it.

Individuals with heart disease have a higher incidence of stroke than those without it.
Does smoking or marital status have any association with stroke?

Current smokers have a higher incidence of stroke than former smokers or non-smokers.

Married individuals have a higher incidence of stroke than those who are not married.
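Comparisons of this kind rest on the stroke rate within each category, that is, the number of stroke cases in a group divided by the size of that group. A minimal sketch follows, assuming each record is a dictionary with a 0/1 `stroke` field. The helper name, field names, and records are made up for illustration.

```python
from collections import defaultdict


def stroke_rate_by_group(records, key):
    """Return the fraction of stroke cases within each value of `key`."""
    totals = defaultdict(int)
    strokes = defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        strokes[row[key]] += row["stroke"]  # stroke is coded 0 or 1
    return {group: strokes[group] / totals[group] for group in totals}


# Made-up records, for illustration only.
records = [
    {"hypertension": 1, "stroke": 1},
    {"hypertension": 1, "stroke": 0},
    {"hypertension": 0, "stroke": 1},
    {"hypertension": 0, "stroke": 0},
    {"hypertension": 0, "stroke": 0},
    {"hypertension": 0, "stroke": 0},
]
rates = stroke_rate_by_group(records, "hypertension")
print(rates)  # {1: 0.5, 0: 0.25}
```

A difference in rates between groups describes an association in the sample. Other variables such as age may differ between the groups as well, so the rates alone do not establish a cause.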
Analyze Relationship Between Numerical and Target Variable
How does age relate to stroke? Are older people more likely to have a stroke?

About 75% of the strokes in the dataset occur among individuals aged between 60 and 80 years.

Does high glucose level increase the likelihood of stroke?

Individuals with an avg_glucose_level between 100 and 200 mg/dL are more likely to have had a stroke.
Does a higher BMI increase the risk of stroke?

Strokes are more common among individuals whose BMI is between 20 and 37.
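When a numerical variable is cut into a range, two different percentages can be reported: the share of all stroke cases that fall inside the range, and the stroke rate among people inside the range. A figure such as the 75% quoted for ages 60 to 80 depends on which of the two is meant. The sketch below computes both on made-up records. The `bin_summary` helper and the data are hypothetical and assume at least one stroke case and at least one record in the range.

```python
def bin_summary(records, column, low, high):
    """Return (share of all stroke cases in the range, stroke rate in the range)."""
    in_bin = [r for r in records if low <= r[column] <= high]
    cases = [r for r in records if r["stroke"] == 1]
    cases_in_bin = [r for r in in_bin if r["stroke"] == 1]
    share_of_cases = len(cases_in_bin) / len(cases)
    rate_in_bin = len(cases_in_bin) / len(in_bin)
    return share_of_cases, rate_in_bin


# Made-up records: 8 people aged 60-80 (3 strokes), 12 younger (1 stroke).
people = (
    [{"age": 65, "stroke": 1}] * 3
    + [{"age": 70, "stroke": 0}] * 5
    + [{"age": 40, "stroke": 1}] * 1
    + [{"age": 30, "stroke": 0}] * 11
)
share, rate = bin_summary(people, "age", 60, 80)
print(share, rate)  # 0.75 0.375
```

In this toy sample, three quarters of the stroke cases fall in the 60 to 80 range, yet the stroke rate within that range is 37.5%.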
❖ Conclusion

➢ Based on our analysis of the heart stroke dataset, we have identified the following five factors that are associated with the incidence of stroke:

➢ Age: The risk of experiencing a stroke increases with age.

➢ Pre-existing conditions: Individuals with pre-existing conditions such as hypertension and heart
disease have a higher risk of experiencing a stroke.

➢ Smoking: Smoking increases the risk of experiencing a stroke compared to being a former smoker or
a non-smoker.

➢ Employment: Individuals who are self-employed have a higher risk of experiencing a stroke
compared to those in private or government jobs.

➢ Marital status: Married individuals have a higher risk of experiencing a stroke compared to those
who are not married.
