Dovdush KN-305 Lab2

Practical Assignment No. 2

for the course "Information Technologies of Smart Systems"

on the topic "Cardiology Clinic"

Completed by:

student of group KN-305

Pavlo Dovbush (Довбуш Павло)
In [1]: !pip install numpy
!pip install matplotlib
!pip install pandas
!pip install seaborn

Defaulting to user installation because normal site-packages is not writeable

Requirement already satisfied: numpy in c:\users\олеся\appdata\roaming\python\python310\site-packages (1.24.2)
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: matplotlib in c:\users\олеся\appdata\roaming\python\python310\site-packages (3.7.1)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (1.0.7)
Requirement already satisfied: cycler>=0.10 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (4.39.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (1.4.4)
Requirement already satisfied: numpy>=1.20 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (1.24.2)
Requirement already satisfied: packaging>=20.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (23.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib) (2.8.2)
Requirement already satisfied: six>=1.5 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0)
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pandas in c:\users\олеся\appdata\roaming\python\python310\site-packages (1.5.3)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from pandas) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from pandas) (2022.7.1)
Requirement already satisfied: numpy>=1.21.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from pandas) (1.24.2)
Requirement already satisfied: six>=1.5 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: seaborn in c:\users\олеся\appdata\roaming\python\python310\site-packages (0.12.2)
Requirement already satisfied: numpy!=1.24.0,>=1.17 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from seaborn) (1.24.2)
Requirement already satisfied: pandas>=0.25 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from seaborn) (1.5.3)
Requirement already satisfied: matplotlib!=3.6.1,>=3.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from seaborn) (3.7.1)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (1.0.7)
Requirement already satisfied: cycler>=0.10 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (4.39.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (1.4.4)
Requirement already satisfied: packaging>=20.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (23.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from matplotlib!=3.6.1,>=3.1->seaborn) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from pandas>=0.25->seaborn) (2022.7.1)
Requirement already satisfied: six>=1.5 in c:\users\олеся\appdata\roaming\python\python310\site-packages (from python-dateutil>=2.7->matplotlib!=3.6.1,>=3.1->seaborn) (1.16.0)

In [2]: import os.path

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import scipy.stats as stats

In [3]: import warnings

warnings.simplefilter('ignore')

In [4]: pd.set_option('display.max_columns', 500)

pd.set_option('display.max_rows', 500)

Read the dataset

In [5]: print(os.path.exists("dataset_3.csv"))

True

In [6]: ds = pd.read_csv("dataset_3.csv")
ds.head()

Out[6]:
   Unnamed: 0   Age  Sex  ChestPainType  RestingBP  Cholesterol  FastingBS  RestingECG  MaxHR  ExerciseAngina  Oldpeak  ST_Slope  HeartDisease
0           0  40.0    M            ATA      140.0        289.0        0.0      Normal  172.0               N      0.0        Up           0.0
1           1  49.0    F            NAP        NaN        180.0        NaN      Normal  156.0               N      1.0      Flat           1.0
2           2  37.0    M            ATA      130.0        283.0        0.0          ST    NaN               N      0.0        Up           0.0
3           3  48.0    F            ASY      138.0        214.0        0.0      Normal  108.0               Y      1.5      Flat           1.0
4           4  54.0    M            NAP      150.0        195.0        0.0      Normal  122.0               N      0.0        Up           0.0

In [7]: print('columns count - ',len(ds.columns), '\n')

print('columns: ',list(ds.columns))

columns count - 13

columns: ['Unnamed: 0', 'Age', 'Sex', 'ChestPainType', 'RestingBP', 'Cholesterol', 'FastingBS', 'RestingECG', 'MaxHR', 'ExerciseAngina', 'Oldpeak', 'ST_Slope', 'HeartDisease']
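
The `Unnamed: 0` column in the listing above is typically what pandas produces when a CSV was saved together with its row index. The following is a minimal sketch, using a made-up two-row CSV rather than `dataset_3.csv`, of how `index_col=0` could be used to read such a column as the index instead of as a feature. Whether that is desirable for this lab is an assumption, since the notebook keeps the column.

```python
import io
import pandas as pd

# Hypothetical two-row CSV that mimics a file written with the index included:
# the first header cell is empty.
csv_text = ",Age,Sex\n0,40.0,M\n1,49.0,F\n"

# Default read: the unnamed first column appears as 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use that first column as the row index.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

raw_columns = list(raw.columns)
clean_columns = list(clean.columns)
print(raw_columns)
print(clean_columns)
```

With the index read this way, later steps such as `nunique()` or scaling would not have to skip a column that carries no information about the patient.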

Missing data imputation

In [8]: ds.shape

Out[8]: (918, 13)

In [9]: ds.dtypes

Out[9]:
Unnamed: 0          int64
Age               float64
Sex                object
ChestPainType      object
RestingBP         float64
Cholesterol       float64
FastingBS         float64
RestingECG         object
MaxHR             float64
ExerciseAngina     object
Oldpeak           float64
ST_Slope           object
HeartDisease      float64
dtype: object

In [10]: for col in ds.columns:
    if ds[col].isnull().values.any():
        print("Missing data in ", col, ds[col].isnull().sum())

Missing data in Age 45

Missing data in Sex 18
Missing data in ChestPainType 18
Missing data in RestingBP 36
Missing data in Cholesterol 82
Missing data in FastingBS 45
Missing data in RestingECG 27
Missing data in MaxHR 91
Missing data in ExerciseAngina 9
Missing data in Oldpeak 73
Missing data in ST_Slope 91
Missing data in HeartDisease 64
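
The same per-column report can be obtained without an explicit loop. The sketch below uses a small toy frame, not the lab dataset, and also shows the share of rows affected, which can help when deciding whether imputation is reasonable for a column. The names `toy`, `missing`, and `missing_share` are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy frame (not the lab dataset) with a few gaps.
toy = pd.DataFrame({
    "Age": [40.0, np.nan, 37.0, np.nan],
    "Sex": ["M", "F", None, "M"],
    "MaxHR": [172.0, 156.0, 130.0, 108.0],
})

missing = toy.isnull().sum()       # number of missing values per column
missing = missing[missing > 0]     # keep only the columns that have gaps
missing_share = (missing / len(toy)).round(2)  # fraction of rows missing

print(missing)
print(missing_share)
```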

In [11]: def impute_na(df, variable, value):
    return df[variable].fillna(value)

In [12]: Age_median = ds['Age'].median()
RestingBP_median = ds['RestingBP'].median()
Cholesterol_median = ds['Cholesterol'].median()
FastingBS_median = ds['FastingBS'].median()
MaxHR_median = ds['MaxHR'].median()
Oldpeak_median = ds['Oldpeak'].median()
HeartDisease_median = ds['HeartDisease'].median()
# Series.mode() returns a Series, so take its first element to get a scalar;
# fillna() with a Series would only fill rows whose index matches.
Sex_mode = ds['Sex'].mode()[0]
ChestPainType_mode = ds['ChestPainType'].mode()[0]
RestingECG_mode = ds['RestingECG'].mode()[0]
ExerciseAngina_mode = ds['ExerciseAngina'].mode()[0]
ST_Slope_mode = ds['ST_Slope'].mode()[0]

In [13]: # Numeric columns: replace missing values with the median
ds['Age'] = impute_na(ds, 'Age', Age_median)
ds['RestingBP'] = impute_na(ds, 'RestingBP', RestingBP_median)
ds['Cholesterol'] = impute_na(ds, 'Cholesterol', Cholesterol_median)
ds['FastingBS'] = impute_na(ds, 'FastingBS', FastingBS_median)
ds['MaxHR'] = impute_na(ds, 'MaxHR', MaxHR_median)
ds['Oldpeak'] = impute_na(ds, 'Oldpeak', Oldpeak_median)
ds['HeartDisease'] = impute_na(ds, 'HeartDisease', HeartDisease_median)

# Categorical columns: replace missing values with the most frequent category.
# With scalar modes every gap is filled, so no forward fill is needed afterwards.
ds['Sex'] = impute_na(ds, 'Sex', Sex_mode)
ds['ChestPainType'] = impute_na(ds, 'ChestPainType', ChestPainType_mode)
ds['RestingECG'] = impute_na(ds, 'RestingECG', RestingECG_mode)
ds['ExerciseAngina'] = impute_na(ds, 'ExerciseAngina', ExerciseAngina_mode)
ds['ST_Slope'] = impute_na(ds, 'ST_Slope', ST_Slope_mode)
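
One detail of mode-based imputation is easy to miss: `Series.mode()` returns a Series rather than a single value, and `fillna()` with a Series aligns on the index. The following sketch, on a toy column rather than the lab data, illustrates the difference between filling with the Series and filling with its first element. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

toy = pd.Series(["M", np.nan, "F", np.nan, "M"])  # toy column, not the lab data

mode_series = toy.mode()      # a Series with one entry at index 0
mode_value = mode_series[0]   # the scalar most frequent value

# Filling with the Series aligns on the index, so only a gap at index 0
# could be filled; the gaps at positions 1 and 3 remain.
filled_with_series = toy.fillna(mode_series)

# Filling with the scalar replaces every gap.
filled_with_scalar = toy.fillna(mode_value)

print(filled_with_series.isnull().sum())
print(filled_with_scalar.isnull().sum())
```

Taking the first element of the mode is a simple choice. If a column has several equally frequent categories, `mode()` returns them all in sorted order and the first one is picked.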

In [14]: for col in ds.columns:
    if ds[col].isnull().values.any():
        print("Missing data in ", col, ds[col].isnull().sum())

Categorical encoding
In [15]: ds.nunique()

Out[15]:
Unnamed: 0      918
Age              50
Sex               2
ChestPainType     4
RestingBP        66
Cholesterol     217
FastingBS         2
RestingECG        3
MaxHR           118
ExerciseAngina    2
Oldpeak          51
ST_Slope          3
HeartDisease      2
dtype: int64

In [16]: ds['Sex'].unique()

Out[16]: array(['M', 'F'], dtype=object)

In [17]: ds['ChestPainType'].unique()

Out[17]: array(['ATA', 'NAP', 'ASY', 'TA'], dtype=object)

In [18]: ds['RestingECG'].unique()

Out[18]: array(['Normal', 'ST', 'LVH'], dtype=object)

In [19]: ds['ExerciseAngina'].unique()

Out[19]: array(['N', 'Y'], dtype=object)

In [20]: ds['ST_Slope'].unique()

Out[20]: array(['Up', 'Flat', 'Down'], dtype=object)

In [21]: from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()

In [22]: ds['Sex'] = le.fit_transform(ds['Sex'])

ds['ChestPainType'] = le.fit_transform(ds['ChestPainType'])
ds['RestingECG'] = le.fit_transform(ds['RestingECG'])
ds['ExerciseAngina'] = le.fit_transform(ds['ExerciseAngina'])
ds['ST_Slope'] = le.fit_transform(ds['ST_Slope'])
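
Because a single `LabelEncoder` instance is refitted for each column, the label-to-code mappings are not kept. As far as the scikit-learn documentation describes it, the encoder numbers the distinct labels in sorted order, so the mapping can be reconstructed. The sketch below mimics that rule with `np.unique` (a stand-in chosen for illustration, not the encoder itself) on the `ChestPainType` labels shown in Out[17].

```python
import numpy as np

values = np.array(["ATA", "NAP", "ATA", "ASY", "TA"])  # labels taken from Out[17]

# np.unique returns the sorted distinct labels and, for each element,
# the position of its label in that sorted list.
classes, codes = np.unique(values, return_inverse=True)
mapping = {str(label): int(code) for code, label in enumerate(classes)}

print(mapping)
print(codes.tolist())
```

Under this rule `ATA` maps to 1, `NAP` to 2 and `ASY` to 0. Note that the resulting integers impose an order on the categories that has no clinical meaning.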

In [23]: ds.head(10)

Out[23]:
   Unnamed: 0   Age  Sex  ChestPainType  RestingBP  Cholesterol  FastingBS  RestingECG  MaxHR  ExerciseAngina  Oldpeak  ST_Slope  HeartDisease
0           0  40.0    1              1      140.0        289.0        0.0           1  172.0               0      0.0         2           0.0
1           1  49.0    0              2      130.0        180.0        0.0           1  156.0               0      1.0         1           1.0
2           2  37.0    1              1      130.0        283.0        0.0           2  138.0               0      0.0         2           0.0
3           3  48.0    0              0      138.0        214.0        0.0           1  108.0               1      1.5         1           1.0
4           4  54.0    1              2      150.0        195.0        0.0           1  122.0               0      0.0         2           0.0
5           5  39.0    1              2      120.0        339.0        0.0           1  138.0               0      0.0         2           0.0
6           6  45.0    0              1      130.0        237.0        0.0           1  170.0               0      0.0         2           0.0
7           7  54.0    1              1      110.0        208.0        0.0           1  142.0               0      0.0         2           0.0
8           8  37.0    1              0      140.0        207.0        0.0           1  130.0               1      1.5         1           1.0
9           9  48.0    0              1      120.0        284.0        0.0           1  120.0               1      0.0         2           1.0

In [24]: def diagnostic_plots(df, variable):
    # function takes a dataframe (df) and
    # the variable of interest as arguments

    # define figure size
    plt.figure(figsize=(16, 4))

    # histogram
    plt.subplot(1, 3, 1)
    sns.histplot(df[variable], bins=30)
    plt.title('Histogram')

    # Q-Q plot
    plt.subplot(1, 3, 2)
    stats.probplot(df[variable], dist="norm", plot=plt)
    plt.ylabel('Variable quantiles')

    # boxplot
    plt.subplot(1, 3, 3)
    sns.boxplot(y=df[variable])
    plt.title('Boxplot')

    plt.show()

In [25]: diagnostic_plots(ds, 'Age')

In [26]: diagnostic_plots(ds, 'RestingBP')

In [27]: diagnostic_plots(ds, 'Cholesterol')

In [28]: diagnostic_plots(ds, 'MaxHR')

In [29]: diagnostic_plots(ds, 'FastingBS')

In [30]: diagnostic_plots(ds, 'Oldpeak')
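
The histogram and Q-Q plot give a visual impression of how close a feature is to a normal distribution. A numeric complement could be the sample skewness. The sketch below uses synthetic data rather than the lab dataset, and the seed and distribution parameters are arbitrary assumptions. A value near 0 suggests a roughly symmetric distribution, while a clearly positive value suggests a right-skewed one.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-ins, not the lab data.
symmetric = pd.Series(rng.normal(loc=130, scale=15, size=1000))
right_skewed = pd.Series(rng.exponential(scale=1.0, size=1000))

# Series.skew() returns the sample skewness.
skew_symmetric = symmetric.skew()
skew_right = right_skewed.skew()

print(round(skew_symmetric, 2), round(skew_right, 2))
```

On the lab data the same call would be `ds['Oldpeak'].skew()` and so on for the other numeric columns.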

Data Scaling
In [31]: from sklearn.preprocessing import MinMaxScaler,StandardScaler
mms = MinMaxScaler() # Normalization
ss = StandardScaler() # Standardization

ds['Oldpeak'] = mms.fit_transform(ds[['Oldpeak']])
ds['Age'] = ss.fit_transform(ds[['Age']])
ds['RestingBP'] = ss.fit_transform(ds[['RestingBP']])
ds['Cholesterol'] = ss.fit_transform(ds[['Cholesterol']])
ds['MaxHR'] = ss.fit_transform(ds[['MaxHR']])
ds.head()

Out[31]:
   Unnamed: 0        Age  Sex  ChestPainType  RestingBP  Cholesterol  FastingBS  RestingECG      MaxHR  ExerciseAngina   Oldpeak  ST_Slope  HeartDisease
0           0  -1.473387    1              1   0.427330     0.846142        0.0           1   1.443735               0  0.295455         2           0.0
1           1  -0.496724    0              2  -0.127534    -0.202998        0.0           1   0.780688               0  0.409091         1           1.0
2           2  -1.798941    1              1  -0.127534     0.788391        0.0           2   0.034759               0  0.295455         2           0.0
3           3  -0.605242    0              0   0.316357     0.124257        0.0           1  -1.208455               1  0.465909         1           1.0
4           4   0.045866    1              2   0.982193    -0.058621        0.0           1  -0.628288               0  0.295455         2           0.0

A machine learning model does not understand the units of feature values. It treats each input as a plain number without knowing what that number actually represents, so features measured on very different scales can influence the model unequally. This is why the data needs to be scaled.

There are two options for scaling the data: 1) normalization and 2) standardization. Since many algorithms assume that the data follows a normal (Gaussian) distribution, normalization is applied to features whose data does not follow a normal distribution, while standardization is applied to features that are normally distributed but whose values are very large or very small compared with the other features.

Normalization: the Oldpeak feature was normalized because it showed a right-skewed distribution. Standardization: Age, RestingBP, Cholesterol, and MaxHR were standardized because these features are approximately normally distributed.
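
The two scaling options can be written out by hand. The sketch below applies the min-max and z-score formulas with NumPy to a few toy values. The extremes -2.6 and 6.2 are an assumption: they were chosen because they reproduce the scaled Oldpeak values shown in Out[31], but the notebook itself does not print the column's minimum and maximum.

```python
import numpy as np

# Toy values; 0.0, 1.0 and 1.5 appear in Out[23], -2.6 and 6.2 are assumed extremes.
x = np.array([0.0, 1.0, 1.5, -2.6, 6.2])

# Normalization (min-max): rescales the values to the [0, 1] range.
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): zero mean and unit variance.
# np.std uses the population standard deviation (ddof=0) by default.
x_standard = (x - x.mean()) / x.std()

print(x_minmax.round(6))
print(x_standard.round(6))
```

Normalization keeps the shape of the distribution and only changes its range, so a skewed feature stays skewed. Standardization centers the values and expresses them in units of standard deviation.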
