Class 14 - Basic Coding in Python - 5
Making loops in Python
• while loops: a while loop executes a set of statements as long as a condition is true
x += value means x = x + value
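A minimal sketch of both ideas together, assuming we simply want to add up the numbers 1 to 5 (the variable names total and count are chosen for illustration only):

```python
# Sum the integers 1..5 with a while loop.
total = 0
count = 1
while count <= 5:      # the loop body repeats as long as this is true
    total += count     # same as: total = total + count
    count += 1         # without this update the loop would never end
print(total)  # prints 15
```

If the condition is already false at the start, the loop body is skipped entirely.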
Step 2. Click on Upload File and upload the file “data.csv” from your local computer
(the file is also available on the Course page)
Step 3. Load the data file into Replit by executing the following 3 lines of code
import pandas as pd
df = pd.read_csv('data.csv')
print(df.to_string())
Step 4. Import libraries with the following 4 lines of code
import numpy
from scipy import stats
import math
import matplotlib.pyplot as plt
Step 5: Statistics for the Age column - Compute the median, mean and standard deviation (std)
by executing the following 6 lines of code
median = numpy.median(df['age'])
print(median)
mean = numpy.mean(df['age'])
print(mean)
std = numpy.std(df['age'])
print(std)
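One detail worth knowing: numpy.std divides by n by default (the population standard deviation), while passing ddof=1 divides by n - 1 (the sample standard deviation). The sketch below illustrates the difference using Python's built-in statistics module on a small made-up list of ages, which is an assumption for illustration and not taken from data.csv:

```python
import statistics

ages = [50, 52, 54, 56, 58, 60]  # made-up ages, not from data.csv

mean = statistics.mean(ages)
median = statistics.median(ages)
pop_std = statistics.pstdev(ages)     # divides by n, like numpy.std(ages)
sample_std = statistics.stdev(ages)   # divides by n - 1, like numpy.std(ages, ddof=1)

print(mean, median)
print(pop_std, sample_std)  # the sample value is slightly larger
```

For a column with several hundred rows the two values are very close, so either is usually fine for a quick summary.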
Step 5 (output): Statistics for the Age column
Mean = 55 years
Median = 54.3 years
Standard deviation = 9 years
Step 6: Let’s organize the columns into categorical and continuous variables
# A column with 10 or fewer unique values is treated as categorical
categorical_val = []
continous_val = []
for column in df.columns:
    print('==============================')
    print(f"{column} : {df[column].unique()}")
    if len(df[column].unique()) <= 10:
        categorical_val.append(column)
    else:
        continous_val.append(column)
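The same rule can be wrapped in a small function and tried on toy data. This is only a sketch: the function name split_columns, the parameter max_unique and the toy DataFrame are hypothetical and not part of the class material.

```python
import pandas as pd

def split_columns(df, max_unique=10):
    """Return (categorical, continuous) column-name lists."""
    categorical, continuous = [], []
    for column in df.columns:
        # Few distinct values: treat the column as categorical.
        if len(df[column].unique()) <= max_unique:
            categorical.append(column)
        else:
            continuous.append(column)
    return categorical, continuous

# Toy data (hypothetical, not data.csv): 'sex' has 2 unique values, 'age' has 12.
toy = pd.DataFrame({
    "sex": [0, 1] * 6,
    "age": list(range(40, 52)),
})
categorical, continuous = split_columns(toy)
print(categorical)  # ['sex']
print(continuous)   # ['age']
```

The threshold of 10 is a rule of thumb; a numeric column with only a few distinct values would also land in the categorical list.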
Cp - Chest pain
exercise-induced angina
Usually, the data is split into 70% for the training set and 30% for the testing set
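To make the 70/30 idea concrete, here is a rough sketch in plain Python. The helper split_70_30 and the list of 100 stand-in records are hypothetical; in practice scikit-learn's train_test_split with test_size=0.3 does this job.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle a copy of rows, then cut it 70% / 30%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # fixed seed: same split every run
    cut = len(shuffled) * 7 // 10           # index where the test part begins
    return shuffled[:cut], shuffled[cut:]

rows = list(range(100))          # stand-in for 100 patient records
train, test = split_70_30(rows)
print(len(train), len(test))     # prints 70 30
```

Shuffling before cutting matters: if the file is sorted (for example by target), an unshuffled split would put very different rows in the two sets.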
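from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = df.drop('target', axis=1)   # features: every column except the target
y = df.target                   # label to predict
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)  # 70% train, 30% test
lr_clf = LogisticRegression(solver='liblinear')
lr_clf.fit(X_train, y_train)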
The model performs well: accuracy on the test set is almost the same as on the training set, so the model is not simply memorizing the training data
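Accuracy here means the fraction of predictions that match the true labels. The sketch below computes it by hand for a training and a testing set; the label lists are invented for illustration and are not results from data.csv.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels and predictions, for illustration only.
train_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
train_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]   # 9 of 10 correct
test_true = [1, 0, 0, 1, 1]
test_pred = [1, 0, 1, 1, 1]                   # 4 of 5 correct

train_acc = accuracy(train_true, train_pred)
test_acc = accuracy(test_true, test_pred)
print(train_acc, test_acc)  # prints 0.9 0.8
```

With the fitted scikit-learn model, lr_clf.score(X_train, y_train) and lr_clf.score(X_test, y_test) return this same mean accuracy. A training accuracy far above the test accuracy would be a sign of overfitting.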