Mod 4
import numpy as np
import pandas as pd  # needed for the pd.get_dummies call below
from sklearn.impute import SimpleImputer
import matplotlib.pyplot as plt
Out[12]: age anaemia creatinine_phosphokinase diabetes ejection_fraction high_blood_pressure platelets serum_creatinine serum_sodium sex smoking time DEATH_EVENT age_category
In [13]: dummy=pd.get_dummies(df['age_category'])
print(dummy)
df.head()
Older Younger
0 True False
1 False True
2 True False
3 False True
4 True False
.. ... ...
294 True False
295 False True
296 False True
297 False True
298 False True
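The indicator columns printed above live in a separate frame until they are joined back onto the data. A minimal sketch of that step, assuming a small hand-made frame (the `demo` name and its values are illustrative and not taken from the dataset):

```python
import pandas as pd

# Hypothetical toy frame standing in for df; the values are made up.
demo = pd.DataFrame({
    "age": [75, 55, 65, 50],
    "age_category": ["Older", "Younger", "Older", "Younger"],
})

# One indicator column per category; dtype=int gives 0/1 instead of True/False.
dummy = pd.get_dummies(demo["age_category"], dtype=int)

# Attach the indicator columns to the original rows (aligned on the index).
encoded = pd.concat([demo, dummy], axis=1)
print(encoded)
```

Whether to keep both indicator columns or drop one of them depends on the model that consumes the data; the sketch keeps both, as the cell above does.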
In [14]: Q1 = df['ejection_fraction'].quantile(0.25)
Q3 = df['ejection_fraction'].quantile(0.75)
IQR = Q3 - Q1
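The quartiles and IQR computed above are typically used to flag outliers. The rest of that cell is not shown here, so the following is only a sketch of one common approach, assuming the conventional 1.5 × IQR fences and a made-up `demo` frame in place of `df`:

```python
import pandas as pd

# Hypothetical values standing in for df['ejection_fraction'].
demo = pd.DataFrame({"ejection_fraction": [20, 25, 30, 35, 38, 40, 45, 60, 80]})

Q1 = demo["ejection_fraction"].quantile(0.25)
Q3 = demo["ejection_fraction"].quantile(0.75)
IQR = Q3 - Q1

# Conventional fences: 1.5 * IQR beyond the quartiles (an assumed choice).
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Rows outside the fences are treated as outliers.
is_outlier = (demo["ejection_fraction"] < lower) | (demo["ejection_fraction"] > upper)
outliers = demo[is_outlier]
print(f"bounds: [{lower}, {upper}]")
print(outliers)
```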
4. Compare the distribution of creatinine_phosphokinase with and without missing values using box plots
plt.figure(figsize=(8, 6))
plt.boxplot([df_original['creatinine_phosphokinase'], df_with_nulls['creatinine_phosphokinase'].dropna()],
            patch_artist=True, labels=['Without Missing', 'With Missing'])
plt.title("Comparison of 'creatinine_phosphokinase' Distribution With and Without Missing Values")
plt.ylabel("Creatinine Phosphokinase")
plt.show()
C:\Users\MYPC\AppData\Local\Temp\ipykernel_9240\2092470060.py:11: MatplotlibDeprecationWarning: The 'labels' parameter of boxplot() has been renamed 'tick_labels' since Matplotlib 3.9; support for the old name will be dropped in 3.11.
plt.boxplot([df_original['creatinine_phosphokinase'], df_with_nulls['creatinine_phosphokinase'].dropna()],
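The warning above says that `labels` was renamed to `tick_labels` in Matplotlib 3.9. One possible way to keep the call working on both older and newer versions is to choose the keyword at runtime. This is a sketch with made-up samples; `label_kw`, `complete`, and `observed` are illustrative names, not part of the notebook:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Pick the keyword the installed Matplotlib expects (renamed in 3.9).
major, minor = (int(part) for part in matplotlib.__version__.split(".")[:2])
label_kw = "tick_labels" if (major, minor) >= (3, 9) else "labels"

# Made-up samples standing in for the two creatinine_phosphokinase series.
rng = np.random.default_rng(0)
complete = rng.lognormal(mean=5.5, sigma=0.8, size=200)
observed = complete[20:]  # pretend the first 20 values were missing and dropped

plt.figure(figsize=(8, 6))
result = plt.boxplot([complete, observed], patch_artist=True,
                     **{label_kw: ["Without Missing", "With Missing"]})
plt.ylabel("Creatinine Phosphokinase")
plt.close()  # in a notebook, call plt.show() here instead
```

If only Matplotlib 3.9 or later needs to be supported, passing `tick_labels=[...]` directly is simpler.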
4. Visualize the relationship between the variable with missing values, "creatinine_phosphokinase", and another variable, "age", using a scatter plot
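The code for this step is not included here. As a rough sketch of what such a plot could look like, the example below uses a made-up `demo` frame in place of `df_with_nulls` and marks the rows with a missing value along the bottom of the plot; that placement is an assumed design choice, not the notebook's method:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data standing in for df_with_nulls.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "age": rng.integers(40, 95, size=100),
    "creatinine_phosphokinase": rng.lognormal(mean=5.5, sigma=0.8, size=100),
})
demo.loc[::10, "creatinine_phosphokinase"] = np.nan  # blank out every 10th value

missing = demo["creatinine_phosphokinase"].isna()

fig, ax = plt.subplots(figsize=(8, 6))
# Rows with an observed value are plotted at their real coordinates.
ax.scatter(demo.loc[~missing, "age"], demo.loc[~missing, "creatinine_phosphokinase"],
           label="observed")
# Rows with a missing value have no y coordinate, so mark their ages along y = 0.
ax.scatter(demo.loc[missing, "age"], np.zeros(missing.sum()),
           marker="x", color="red", label="missing (age only)")
ax.set_xlabel("Age")
ax.set_ylabel("Creatinine Phosphokinase")
ax.legend()
plt.close(fig)  # in a notebook, call plt.show() here instead
```

Plotting the missing rows by age makes it easier to see whether the gaps cluster in a particular age range.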
1. Apply various normalization techniques such as Min-Max scaling, standardization, robust scaling, and encoding
df.head()
Out[20]: age anaemia creatinine_phosphokinase diabetes ejection_fraction high_blood_pressure platelets serum_creatinine serum_sodium sex smoking time DEATH_EVENT age_category normalized_age standardized_eje
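The scaling techniques named in this section can be written out directly with pandas. The sketch below assumes a made-up `demo` frame, and its column names (`age_minmax`, `age_standard`, `age_robust`) are illustrative rather than the ones used in the notebook:

```python
import pandas as pd

# Made-up ages standing in for df['age'].
demo = pd.DataFrame({"age": [40.0, 50.0, 60.0, 70.0, 95.0]})
x = demo["age"]

# Min-Max scaling: map the observed range onto [0, 1].
demo["age_minmax"] = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): zero mean, unit variance.
# ddof=0 uses the population standard deviation, as scikit-learn's StandardScaler does.
demo["age_standard"] = (x - x.mean()) / x.std(ddof=0)

# Robust scaling: center on the median and divide by the IQR,
# so extreme values influence the result less.
iqr = x.quantile(0.75) - x.quantile(0.25)
demo["age_robust"] = (x - x.median()) / iqr

print(demo.round(3))
```

scikit-learn's `MinMaxScaler`, `StandardScaler`, and `RobustScaler` implement the same three transformations and are convenient when the fitted parameters must be reused on new data.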
3. Box plots to compare the median, quartiles, and outliers in your data
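# Assumption: the truncated part of this cell creates the axes, for example:
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[1].boxplot(df['normalized_age'], patch_artist=True)
axes[1].set_title("Normalized Age Data (Min-Max)")
plt.show()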