045 Assignment PDF

- The document discusses building an earthquake damage classification model to predict building damage for the district of Kavrepalanchok in Nepal.
- It involves preparing data from a nepal.sqlite database, which includes joining tables, creating a target variable, and dropping unnecessary columns.
- An exploration of the target variable shows the classes in the dataset are imbalanced, with a bar chart showing the relative frequencies of the two classes in the "severe_damage" column.

Uploaded by

Tman Letswalo


045-assignment

May 18, 2022

4.5. Earthquake Damage in Kavrepalanchok

In this assignment, you’ll build a classification model to predict building damage for the district of
Kavrepalanchok.
[2]: import warnings

import wqet_grader

warnings.simplefilter(action="ignore", category=FutureWarning)
wqet_grader.init("Project 4 Assessment")

<IPython.core.display.HTML object>

[3]: # Import libraries here

import sqlite3
import warnings

import matplotlib.pyplot as plt

import numpy as np
import pandas as pd
import seaborn as sns
from category_encoders import OneHotEncoder
from category_encoders import OrdinalEncoder
from IPython.display import VimeoVideo
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.utils.validation import check_is_fitted

warnings.simplefilter(action="ignore", category=FutureWarning)

1 Prepare Data
1.1 Connect
Run the cell below to connect to the nepal.sqlite database.

[4]: %load_ext sql
%sql sqlite:////home/jovyan/nepal.sqlite

[4]: 'Connected: @/home/jovyan/nepal.sqlite'

Task 4.5.1: What districts are represented in the id_map table? Determine the unique values in
the district_id column.
[5]: %%sql
SELECT distinct(district_id)
FROM id_map

* sqlite:////home/jovyan/nepal.sqlite
Done.

[5]: [(1,), (2,), (3,), (4,)]

[6]: result = _.DataFrame().squeeze() # noqa F821

wqet_grader.grade("Project 4 Assessment", "Task 4.5.1", result)

<IPython.core.display.HTML object>
What’s the district ID for Kavrepalanchok? From the lessons, you already know that Gorkha is 4;
from the textbook, you know that Ramechhap is 2. Of the remaining districts, Kavrepalanchok is
the one with the largest number of observations in the id_map table.
Task 4.5.2: Calculate the number of observations in the id_map table associated with district 1.
[7]: %%sql
SELECT count(*)
FROM id_map
WHERE district_id = 1

* sqlite:////home/jovyan/nepal.sqlite
Done.

[7]: [(36112,)]

[8]: result = [_.DataFrame().astype(float).squeeze()] # noqa F821

wqet_grader.grade("Project 4 Assessment", "Task 4.5.2", result)

<IPython.core.display.HTML object>
Task 4.5.3: Calculate the number of observations in the id_map table associated with district 3.
[9]: %%sql
SELECT count(*)
FROM id_map
WHERE district_id = 3

* sqlite:////home/jovyan/nepal.sqlite
Done.

[9]: [(82684,)]

[10]: result = [_.DataFrame().astype(float).squeeze()] # noqa F821

wqet_grader.grade("Project 4 Assessment", "Task 4.5.3", result)

<IPython.core.display.HTML object>
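Tasks 4.5.2 and 4.5.3 count one district at a time. As a hedged sketch, the same comparison can be made in a single query with GROUP BY. The example below runs against an in-memory SQLite database with a made-up `id_map` table; the rows are invented for illustration and are not the Nepal data.

```python
import sqlite3

# Toy stand-in for the id_map table; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_map (building_id INTEGER, district_id INTEGER)")
conn.executemany(
    "INSERT INTO id_map VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 3), (4, 3), (5, 3), (6, 4)],
)

# Count observations per district, largest first.
query = """
    SELECT district_id, count(*) AS n_obs
    FROM id_map
    GROUP BY district_id
    ORDER BY n_obs DESC
"""
counts = conn.execute(query).fetchall()
conn.close()

for district_id, n_obs in counts:
    print(district_id, n_obs)
```

With the real database, the district that tops this list among the unassigned IDs would be the one to use for Kavrepalanchok.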
Task 4.5.4: Join the unique building IDs from Kavrepalanchok in id_map, all the columns from
building_structure, and the damage_grade column from building_damage. Make sure
you rename the building_id column in id_map as b_id and limit your results to the first five rows
of the new table.
[11]: %%sql
SELECT distinct(i.building_id) AS b_id,
s.*,
d.damage_grade
FROM id_map AS i
JOIN building_structure AS s ON i.building_id = s.building_id
JOIN building_damage AS d ON i.building_id = d.building_id
WHERE district_id = 3
LIMIT 5

* sqlite:////home/jovyan/nepal.sqlite
Done.

[11]: [(87473, 87473, 2, 1, 15, 382, 18, 7, 'Flat', 'Mud mortar-Stone/Brick',

'Bamboo/Timber-Light roof', 'Mud', 'TImber/Bamboo-Mud', 'Not attached',
'Rectangular', 'Damaged-Used in risk', 'Stone, mud mortar', 'Grade 4'),
(87479, 87479, 1, 0, 12, 328, 7, 0, 'Flat', 'Mud mortar-Stone/Brick',
'Bamboo/Timber-Light roof', 'Mud', 'Not applicable', 'Not attached',
'Rectangular', 'Damaged-Rubble clear', 'Stone, mud mortar', 'Grade 5'),
(87482, 87482, 2, 1, 23, 427, 20, 7, 'Flat', 'Mud mortar-Stone/Brick',
'Bamboo/Timber-Light roof', 'Mud', 'TImber/Bamboo-Mud', 'Not attached',
'Rectangular', 'Damaged-Not used', 'Stone, mud mortar', 'Grade 4'),
(87491, 87491, 2, 1, 12, 427, 14, 7, 'Flat', 'Mud mortar-Stone/Brick',
'Bamboo/Timber-Light roof', 'Mud', 'TImber/Bamboo-Mud', 'Not attached',
'Rectangular', 'Damaged-Not used', 'Stone, mud mortar', 'Grade 4'),
(87496, 87496, 2, 0, 32, 360, 18, 0, 'Flat', 'Mud mortar-Stone/Brick',
'Bamboo/Timber-Light roof', 'Mud', 'TImber/Bamboo-Mud', 'Not attached',
'Rectangular', 'Damaged-Rubble clear', 'Stone, mud mortar', 'Grade 5')]

[12]: result = _.DataFrame().set_index("b_id") # noqa F821

wqet_grader.grade("Project 4 Assessment", "Task 4.5.4", result)

<IPython.core.display.HTML object>
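One detail of the query above is worth illustrating. In SQLite, `distinct(i.building_id)` is parsed as the DISTINCT keyword followed by a parenthesized column, so deduplication applies to the whole selected row rather than to `building_id` alone. The sketch below shows this on an in-memory database; the `household_id` column and every row are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE id_map (household_id INTEGER, building_id INTEGER, district_id INTEGER);
    CREATE TABLE building_damage (building_id INTEGER, damage_grade TEXT);
    -- Two households share building 10, so it appears twice in id_map.
    INSERT INTO id_map VALUES (100, 10, 3), (101, 10, 3), (102, 11, 3);
    INSERT INTO building_damage VALUES (10, 'Grade 4'), (11, 'Grade 2');
    """
)

base = """
    SELECT {distinct} i.building_id AS b_id, d.damage_grade
    FROM id_map AS i
    JOIN building_damage AS d ON i.building_id = d.building_id
    WHERE i.district_id = 3
    ORDER BY b_id
"""
# Without DISTINCT the shared building is returned once per household.
rows_all = conn.execute(base.format(distinct="")).fetchall()
# With DISTINCT, rows that are identical in every selected column collapse to one.
rows_distinct = conn.execute(base.format(distinct="DISTINCT")).fetchall()
conn.close()

print("without DISTINCT:", rows_all)
print("with DISTINCT:   ", rows_distinct)
```

Because the duplicated rows here match in every selected column, DISTINCT leaves one row per building, which is the behaviour the task relies on.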

1.2 Import
Task 4.5.5: Write a wrangle function that will use the query you created in the previous task to
create a DataFrame. In addition, your function should:
1. Create a "severe_damage" column, where all buildings with a damage grade greater than 3
should be encoded as 1. All other buildings should be encoded as 0.
2. Drop any columns that could cause issues with leakage or multicollinearity in your model.
[13]: # Build your `wrangle` function here
def wrangle(db_path):
    # Connect to database
    conn = sqlite3.connect(db_path)

    # Construct query
    query = """
        SELECT distinct(i.building_id) AS b_id,
            s.*,
            d.damage_grade
        FROM id_map AS i
        JOIN building_structure AS s ON i.building_id = s.building_id
        JOIN building_damage AS d ON i.building_id = d.building_id
        WHERE district_id = 3
    """

    # Read query results into DataFrame
    df = pd.read_sql(query, conn, index_col="b_id")

    # Identify leaky columns
    drop_cols = [col for col in df.columns if "post_eq" in col]

    # Add high-cardinality / redundant column
    drop_cols.append("building_id")

    # Create binary target column
    df["damage_grade"] = df["damage_grade"].str[-1].astype(int)
    df["severe_damage"] = (df["damage_grade"] > 3).astype(int)

    # Drop old target
    drop_cols.append("damage_grade")

    # Drop multicollinearity column
    drop_cols.append("count_floors_pre_eq")

    # Drop columns
    df.drop(columns=drop_cols, inplace=True)

    return df

Use your wrangle function to query the database at "/home/jovyan/nepal.sqlite" and return
your cleaned results.
[14]: df = wrangle("/home/jovyan/nepal.sqlite")
df.head()

[14]: age_building plinth_area_sq_ft height_ft_pre_eq \

b_id
87473 15 382 18
87479 12 328 7
87482 23 427 20
87491 12 427 14
87496 32 360 18

land_surface_condition foundation_type \
b_id
87473 Flat Mud mortar-Stone/Brick
87479 Flat Mud mortar-Stone/Brick
87482 Flat Mud mortar-Stone/Brick
87491 Flat Mud mortar-Stone/Brick
87496 Flat Mud mortar-Stone/Brick

roof_type ground_floor_type other_floor_type \

b_id
87473 Bamboo/Timber-Light roof Mud TImber/Bamboo-Mud
87479 Bamboo/Timber-Light roof Mud Not applicable
87482 Bamboo/Timber-Light roof Mud TImber/Bamboo-Mud
87491 Bamboo/Timber-Light roof Mud TImber/Bamboo-Mud
87496 Bamboo/Timber-Light roof Mud TImber/Bamboo-Mud

position plan_configuration superstructure severe_damage

b_id
87473 Not attached Rectangular Stone, mud mortar 1
87479 Not attached Rectangular Stone, mud mortar 1
87482 Not attached Rectangular Stone, mud mortar 1
87491 Not attached Rectangular Stone, mud mortar 1
87496 Not attached Rectangular Stone, mud mortar 1

[15]: wqet_grader.grade(
"Project 4 Assessment", "Task 4.5.5", wrangle("/home/jovyan/nepal.sqlite")
)

<IPython.core.display.HTML object>
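The cleaning steps inside `wrangle` can be tried in isolation. The following is a minimal sketch on an invented DataFrame (`df_toy` and its values are hypothetical, chosen only to mimic the shape of the query result). It flags leaky "post_eq" columns, derives the binary target from the "Grade N" strings, and checks the floors/height correlation that motivates dropping one of the two columns.

```python
import pandas as pd

# Invented rows that mimic the joined query result.
df_toy = pd.DataFrame(
    {
        "building_id": [1, 2, 3, 4],
        "count_floors_pre_eq": [1, 2, 2, 3],
        "count_floors_post_eq": [1, 0, 2, 0],
        "height_ft_pre_eq": [8, 16, 18, 25],
        "damage_grade": ["Grade 1", "Grade 5", "Grade 3", "Grade 4"],
    },
    index=pd.Index([1, 2, 3, 4], name="b_id"),
)

# Leaky columns describe the building after the earthquake.
leaky = [col for col in df_toy.columns if "post_eq" in col]

# "Grade 4" -> 4, then flag grades above 3 as severe.
grade = df_toy["damage_grade"].str[-1].astype(int)
df_toy["severe_damage"] = (grade > 3).astype(int)

# Floors and height move together, which is why one of them is dropped.
corr = df_toy["count_floors_pre_eq"].corr(df_toy["height_ft_pre_eq"])

df_clean = df_toy.drop(
    columns=leaky + ["building_id", "damage_grade", "count_floors_pre_eq"]
)
print(df_clean)
print("floors/height correlation:", round(corr, 2))
```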

1.3 Explore
Task 4.5.6: Are the classes in this dataset balanced? Create a bar chart with the normalized value
counts from the "severe_damage" column. Be sure to label the x-axis "Severe Damage" and the
y-axis "Relative Frequency". Use the title "Kavrepalanchok, Class Balance".
[16]: # Plot value counts of `"severe_damage"`
df["severe_damage"].value_counts(normalize=True).plot(
    kind="bar",
    xlabel="Severe Damage",
    ylabel="Relative Frequency",
    title="Kavrepalanchok, Class Balance",
)
# Don't delete the code below
plt.savefig("images/4-5-6.png", dpi=150)

[17]: with open("images/4-5-6.png", "rb") as file:
    wqet_grader.grade("Project 4 Assessment", "Task 4.5.6", file)

<IPython.core.display.HTML object>
Task 4.5.7: Is there a relationship between the footprint size of a building and the damage it
sustained in the earthquake? Use seaborn to create a boxplot that shows the distributions of the
"plinth_area_sq_ft" column for both groups in the "severe_damage" column. Label your
x-axis "Severe Damage" and y-axis "Plinth Area [sq. ft.]". Use the title "Kavrepalanchok,
Plinth Area vs Building Damage".

[18]: sns.boxplot(x="severe_damage", y="plinth_area_sq_ft", data=df)
# Label axes
plt.xlabel("Severe Damage")
plt.ylabel("Plinth Area [sq. ft.]")
plt.title("Kavrepalanchok, Plinth Area vs Building Damage")
# Don't delete the code below
plt.savefig("images/4-5-7.png", dpi=150)

[19]: with open("images/4-5-7.png", "rb") as file:
    wqet_grader.grade("Project 4 Assessment", "Task 4.5.7", file)

<IPython.core.display.HTML object>
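A boxplot draws the quartiles of each group, so the same comparison can be read as numbers. This is a small sketch on invented values (`df_toy` is hypothetical). It uses `groupby` with `describe` to list the quartiles that the boxes in such a plot would show.

```python
import pandas as pd

# Invented plinth areas for the two damage groups.
df_toy = pd.DataFrame(
    {
        "severe_damage": [0, 0, 0, 1, 1, 1],
        "plinth_area_sq_ft": [300, 420, 510, 280, 350, 400],
    }
)

# One row of summary statistics per group; "50%" is the median line of each box.
summary = df_toy.groupby("severe_damage")["plinth_area_sq_ft"].describe()
print(summary[["25%", "50%", "75%"]])
```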
Task 4.5.8: Are buildings with certain roof types more likely to suffer severe damage? Create a
pivot table of df where the index is "roof_type" and the values come from the "severe_damage"
column, aggregated by the mean.
[20]: roof_pivot = pd.pivot_table(
    df, index="roof_type", values="severe_damage", aggfunc=np.mean
).sort_values(by="severe_damage")
roof_pivot

[20]: severe_damage
roof_type
RCC/RB/RBC 0.040715
Bamboo/Timber-Heavy roof 0.569477
Bamboo/Timber-Light roof 0.604842

[21]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.8", roof_pivot)

<IPython.core.display.HTML object>
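Because "severe_damage" is coded 0/1, its mean within a roof type is the share of buildings of that type that were severely damaged. The sketch below uses invented rows (`df_toy` is hypothetical) to show that the pivot table is equivalent to a `groupby` mean.

```python
import pandas as pd

# Invented rows; only the roof type labels are borrowed from the output above.
df_toy = pd.DataFrame(
    {
        "roof_type": [
            "RCC/RB/RBC",
            "RCC/RB/RBC",
            "Bamboo/Timber-Light roof",
            "Bamboo/Timber-Light roof",
            "Bamboo/Timber-Light roof",
        ],
        "severe_damage": [0, 0, 1, 1, 0],
    }
)

pivot = pd.pivot_table(
    df_toy, index="roof_type", values="severe_damage", aggfunc="mean"
)
grouped = df_toy.groupby("roof_type")["severe_damage"].mean()

print(pivot)
print(grouped)
```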

1.4 Split
Task 4.5.9: Create your feature matrix X and target vector y. Your target is "severe_damage".

[22]: target = "severe_damage"

X = df.drop(columns=target)
y = df[target]
print("X shape:", X.shape)
print("y shape:", y.shape)

X shape: (76533, 11)

y shape: (76533,)

[23]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.9a", X)

<IPython.core.display.HTML object>

[24]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.9b", y)

<IPython.core.display.HTML object>
Task 4.5.10: Divide your dataset into training and validation sets using a randomized split. Your
validation set should be 20% of your data.
[25]: X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)
print("X_val shape:", X_val.shape)
print("y_val shape:", y_val.shape)

X_train shape: (61226, 11)

y_train shape: (61226,)
X_val shape: (15307, 11)
y_val shape: (15307,)

[26]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.10", [X_train.shape == (61226, 11)])

<IPython.core.display.HTML object>
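The split sizes printed above can be checked by hand. The arithmetic below is a sketch that assumes the validation count is rounded up and the remainder goes to training, which is consistent with the shapes shown in the output.

```python
import math

n_samples = 76533  # rows in X, as printed in Task 4.5.9
test_size = 0.2

n_val = math.ceil(test_size * n_samples)  # validation rows, rounded up
n_train = n_samples - n_val               # remaining rows go to training
print(n_train, n_val)
```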

2 Build Model
2.1 Baseline
Task 4.5.11: Calculate the baseline accuracy score for your model.
[27]: acc_baseline = y_train.value_counts(normalize=True).max()
print("Baseline Accuracy:", round(acc_baseline, 2))

Baseline Accuracy: 0.55

[28]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.11", [acc_baseline])

<IPython.core.display.HTML object>
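The baseline is the accuracy of a model that always predicts the majority class. As a short check on invented labels (`y_toy` is hypothetical), the largest relative frequency matches the accuracy of such constant predictions.

```python
import pandas as pd
from sklearn.metrics import accuracy_score

y_toy = pd.Series([1, 1, 1, 0, 0, 1, 0, 1])  # invented labels

# Share of the most common class, as computed in the cell above.
acc_baseline_toy = y_toy.value_counts(normalize=True).max()

# Accuracy of predicting the most common class for every row.
majority_class = y_toy.mode()[0]
y_pred_majority = [majority_class] * len(y_toy)
acc_majority = accuracy_score(y_toy, y_pred_majority)

print(acc_baseline_toy, acc_majority)
```

Any trained model should beat this number on the validation set to be considered useful.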

2.2 Iterate
Task 4.5.12: Create a model model_lr that uses logistic regression to predict building damage.
Be sure to include an appropriate encoder for categorical features.
[29]: model_lr = make_pipeline(
    OneHotEncoder(use_cat_names=True),
    LogisticRegression(max_iter=3000)
)
# Fit model to training data
model_lr.fit(X_train, y_train)

[29]: Pipeline(steps=[('onehotencoder',
OneHotEncoder(cols=['land_surface_condition',
'foundation_type', 'roof_type',
'ground_floor_type', 'other_floor_type',
'position', 'plan_configuration',
'superstructure'],
use_cat_names=True)),
('logisticregression', LogisticRegression(max_iter=3000))])

[30]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.12", model_lr)

<IPython.core.display.HTML object>
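To see what one-hot encoding does to a categorical column before it reaches the logistic regression, here is a hedged sketch on invented data. It uses `pd.get_dummies` as a stand-in for the encoder in the pipeline above (a different tool, used only so the example has no extra dependency), and `X_toy` and `y_toy` are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Invented features and labels.
X_toy = pd.DataFrame(
    {
        "age_building": [5, 40, 12, 60, 8, 35],
        "roof_type": [
            "RCC/RB/RBC",
            "Bamboo/Timber-Light roof",
            "RCC/RB/RBC",
            "Bamboo/Timber-Light roof",
            "RCC/RB/RBC",
            "Bamboo/Timber-Heavy roof",
        ],
    }
)
y_toy = pd.Series([0, 1, 0, 1, 0, 1])

# pd.get_dummies is a stand-in here, not the encoder used in the notebook.
# Each category becomes its own 0/1 column; numeric columns pass through.
X_encoded = pd.get_dummies(X_toy, columns=["roof_type"], dtype=int)
print(X_encoded.columns.tolist())

model = LogisticRegression(max_iter=3000).fit(X_encoded, y_toy)
print(model.predict(X_encoded))
```

In the notebook the encoder sits inside the pipeline, so categories learned from the training data are applied consistently to the validation data.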
Task 4.5.13: Calculate training and validation accuracy score for model_lr.
[31]: lr_train_acc = accuracy_score(y_train, model_lr.predict(X_train))
lr_val_acc = model_lr.score(X_val, y_val)

print("Logistic Regression, Training Accuracy Score:", lr_train_acc)

print("Logistic Regression, Validation Accuracy Score:", lr_val_acc)

Logistic Regression, Training Accuracy Score: 0.6515042628948486

Logistic Regression, Validation Accuracy Score: 0.6536878552296335

[32]: submission = [lr_train_acc, lr_val_acc]
wqet_grader.grade("Project 4 Assessment", "Task 4.5.13", submission)

<IPython.core.display.HTML object>
Task 4.5.14: Perhaps a decision tree model will perform better than logistic regression, but what’s
the best hyperparameter value for max_depth? Create a for loop to train and evaluate the model
model_dt at all depths from 1 to 15. Be sure to use an appropriate encoder for your model, and
to record its training and validation accuracy scores at every depth. The grader will evaluate your
validation accuracy scores only.
[33]: depth_hyperparams = range(1, 16)
training_acc = []
validation_acc = []
for d in depth_hyperparams:
    model_dt = make_pipeline(
        OrdinalEncoder(),
        DecisionTreeClassifier(max_depth=d, random_state=42)
    )
    model_dt.fit(X_train, y_train)
    # Calculate training accuracy score and append to `training_acc`
    training_acc.append(model_dt.score(X_train, y_train))
    # Calculate validation accuracy score and append to `validation_acc`
    validation_acc.append(model_dt.score(X_val, y_val))

print("Training Accuracy Scores:", training_acc[:3])

print("Validation Accuracy Scores:", validation_acc[:3])

Training Accuracy Scores: [0.6303041191650606, 0.6303041191650606,

0.642292490118577]
Validation Accuracy Scores: [0.6350035931273273, 0.6350035931273273,
0.6453909975828053]

[34]: submission = pd.Series(validation_acc, index=depth_hyperparams)

wqet_grader.grade("Project 4 Assessment", "Task 4.5.14", submission)

<IPython.core.display.HTML object>
Task 4.5.15: Using the values in training_acc and validation_acc, plot the validation curve
for model_dt. Label your x-axis "Max Depth" and your y-axis "Accuracy Score". Use the title
"Validation Curve, Decision Tree Model", and include a legend.
[35]: plt.plot(depth_hyperparams, training_acc, label="training")
plt.plot(depth_hyperparams, validation_acc, label="validation")
plt.xlabel("Max Depth")
plt.ylabel("Accuracy Score")
plt.title("Validation Curve, Decision Tree Model")
plt.legend()
# Don't delete the code below
plt.savefig("images/4-5-15.png", dpi=150)

[36]: with open("images/4-5-15.png", "rb") as file:
    wqet_grader.grade("Project 4 Assessment", "Task 4.5.15", file)

<IPython.core.display.HTML object>
Task 4.5.16: Build and train a new decision tree model final_model_dt, using the value for
max_depth that yielded the best validation accuracy score in your plot above.
[37]: final_model_dt = make_pipeline(
    OrdinalEncoder(),
    DecisionTreeClassifier(max_depth=10, random_state=42)
)
# Fit model to training data
final_model_dt.fit(X_train, y_train)

[37]: Pipeline(steps=[('ordinalencoder',
OrdinalEncoder(cols=['land_surface_condition',
'foundation_type', 'roof_type',
'ground_floor_type', 'other_floor_type',
'position', 'plan_configuration',
'superstructure'],
mapping=[{'col': 'land_surface_condition',

'data_type': dtype('O'),
'mapping': Flat 1
Moderate slope 2
Steep slope 3
NaN -2
dtype: int64},
{'col': 'foundation_type',
'dat…
Building with Central Courtyard 9
H-shape 10
NaN -2
dtype: int64},
{'col': 'superstructure',
'data_type': dtype('O'),
'mapping': Stone, mud mortar 1
Adobe/mud 2
Brick, cement mortar 3
RC, engineered 4
Brick, mud mortar 5
Stone, cement mortar 6
RC, non-engineered 7
Timber 8
Other 9
Bamboo 10
Stone 11
NaN -2
dtype: int64}])),
('decisiontreeclassifier',
DecisionTreeClassifier(max_depth=10, random_state=42))])

[38]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.16", final_model_dt)

<IPython.core.display.HTML object>
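The depth of 10 above was read off the validation curve. As an alternative sketch, the best depth can be picked programmatically from the recorded scores. The example runs on synthetic data from `make_classification`, so the names `X_syn`, `val_series`, and `best_depth` and the resulting depth are illustrative only.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the earthquake data.
X_syn, y_syn = make_classification(n_samples=500, n_features=8, random_state=42)
X_tr, X_va, y_tr, y_va = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)

depths = range(1, 16)
val_scores = []
for d in depths:
    tree = DecisionTreeClassifier(max_depth=d, random_state=42).fit(X_tr, y_tr)
    val_scores.append(tree.score(X_va, y_va))

# Index the scores by depth so idxmax returns the depth itself.
val_series = pd.Series(val_scores, index=depths)
best_depth = val_series.idxmax()  # first depth with the highest validation score
print("Best max_depth:", best_depth)
```

When several depths tie, `idxmax` returns the shallowest one, which also favours the simpler tree.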

2.3 Evaluate
Task 4.5.17: How does your model perform on the test set? First, read the CSV
file "data/kavrepalanchok-test-features.csv" into the DataFrame X_test. Next, use
final_model_dt to generate a list of test predictions y_test_pred. Finally, submit your test
predictions to the grader to see how your model performs.
Tip: Make sure the order of the columns in X_test is the same as in your X_train. Otherwise, it
could hurt your model’s performance.
[39]: X_test = pd.read_csv("data/kavrepalanchok-test-features.csv", index_col="b_id")
y_test_pred = final_model_dt.predict(X_test)
y_test_pred[:5]

[39]: array([1, 1, 1, 1, 0])
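The tip about column order can be enforced in code. This is a minimal sketch with invented frames (`X_train_toy` and `X_test_toy` are hypothetical). It checks that no training column is missing and then reorders the test columns to match.

```python
import pandas as pd

# Invented frames with the same columns in a different order.
X_train_toy = pd.DataFrame({"age_building": [10, 20], "roof_type": ["a", "b"]})
X_test_toy = pd.DataFrame({"roof_type": ["b"], "age_building": [15]})

# Fail early if the test set is missing a training column.
missing = set(X_train_toy.columns) - set(X_test_toy.columns)
assert not missing, f"Missing columns: {missing}"

# Reorder test columns to match the training order.
X_test_aligned = X_test_toy[X_train_toy.columns]
print(X_test_aligned.columns.tolist())
```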

[40]: submission = pd.Series(y_test_pred)

wqet_grader.grade("Project 4 Assessment", "Task 4.5.17", submission)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Input In [40], in <cell line: 2>()
      1 submission = pd.Series(y_test_pred)
----> 2 wqet_grader.grade("Project 4 Assessment", "Task 4.5.17", submission)

File /opt/conda/lib/python3.9/site-packages/wqet_grader/__init__.py:180, in grade(assessment_id, question_id, submission)
    175 def grade(assessment_id, question_id, submission):
    176     submission_object = {
    177         'type': 'simple',
    178         'argument': [submission]
    179     }
--> 180     return show_score(grade_submission(assessment_id, question_id, submission_object))

File /opt/conda/lib/python3.9/site-packages/wqet_grader/transport.py:145, in grade_submission(assessment_id, question_id, submission_object)
    143 raise Exception('Grader raised error: {}'.format(error['message']))
    144 else:
--> 145 raise Exception('Could not grade submission: {}'.format(error['message']))
    146 result = envelope['data']['result']
    148 # Used only in testing

Exception: Could not grade submission: Could not verify access to this assessment: Received error from WQET submission API: You have already passed this course!

3 Communicate Results
Task 4.5.18: What are the most important features for final_model_dt? Create a Series of Gini
importances named feat_imp, where the index labels are the feature names for your dataset and
the values are the feature importances for your model. Be sure that the Series is sorted from
smallest to largest feature importance.
[41]: features = X_train.columns
importances = final_model_dt.named_steps["decisiontreeclassifier"].feature_importances_
feat_imp = pd.Series(importances, index=features).sort_values()
feat_imp.head()

[41]: plan_configuration 0.004189
land_surface_condition 0.008599
foundation_type 0.009967
position 0.011795
ground_floor_type 0.013521
dtype: float64
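Pairing the importances with `X_train.columns` works here because ordinal encoding keeps one column per original feature, so the order matches what the tree saw. The sketch below repeats the pattern on synthetic data (the `feature_i` names and `feat_imp_toy` are hypothetical) and shows that the importances of a fitted tree sum to 1.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with made-up feature names.
X_arr, y_arr = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=42
)
feature_names = [f"feature_{i}" for i in range(5)]
X_syn = pd.DataFrame(X_arr, columns=feature_names)

tree = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X_syn, y_arr)

# Label each importance with its column name, smallest first.
feat_imp_toy = pd.Series(
    tree.feature_importances_, index=X_syn.columns
).sort_values()
print(feat_imp_toy)
print("Sum of importances:", feat_imp_toy.sum())
```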

[42]: wqet_grader.grade("Project 4 Assessment", "Task 4.5.18", feat_imp)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Input In [42], in <cell line: 1>()
----> 1 wqet_grader.grade("Project 4 Assessment", "Task 4.5.18", feat_imp)

File /opt/conda/lib/python3.9/site-packages/wqet_grader/__init__.py:180, in grade(assessment_id, question_id, submission)
    175 def grade(assessment_id, question_id, submission):
    176     submission_object = {
    177         'type': 'simple',
    178         'argument': [submission]
    179     }
--> 180     return show_score(grade_submission(assessment_id, question_id, submission_object))

File /opt/conda/lib/python3.9/site-packages/wqet_grader/transport.py:145, in grade_submission(assessment_id, question_id, submission_object)
    143 raise Exception('Grader raised error: {}'.format(error['message']))
    144 else:
--> 145 raise Exception('Could not grade submission: {}'.format(error['message']))
    146 result = envelope['data']['result']
    148 # Used only in testing

Exception: Could not grade submission: Could not verify access to this assessment: Received error from WQET submission API: You have already passed this course!

Task 4.5.19: Create a horizontal bar chart of feat_imp. Label your x-axis "Gini
Importance" and your y-axis "Label". Use the title "Kavrepalanchok Decision Tree, Feature
Importance".
Do you see any relationship between this plot and the exploratory data analysis you did regarding
roof type?
[43]: # Create horizontal bar chart of feature importances
feat_imp.plot(kind="barh")
plt.xlabel("Gini Importance")
plt.ylabel("Label")
plt.title("Kavrepalanchok Decision Tree, Feature Importance")
# Don't delete the code below
plt.tight_layout()
plt.savefig("images/4-5-19.png", dpi=150)

[44]: with open("images/4-5-19.png", "rb") as file:
    wqet_grader.grade("Project 4 Assessment", "Task 4.5.19", file)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Input In [44], in <cell line: 1>()
      1 with open("images/4-5-19.png", "rb") as file:
----> 2     wqet_grader.grade("Project 4 Assessment", "Task 4.5.19", file)

File /opt/conda/lib/python3.9/site-packages/wqet_grader/__init__.py:180, in grade(assessment_id, question_id, submission)
    175 def grade(assessment_id, question_id, submission):
    176     submission_object = {
    177         'type': 'simple',
    178         'argument': [submission]
    179     }
--> 180     return show_score(grade_submission(assessment_id, question_id, submission_object))

File /opt/conda/lib/python3.9/site-packages/wqet_grader/transport.py:145, in grade_submission(assessment_id, question_id, submission_object)
    143 raise Exception('Grader raised error: {}'.format(error['message']))
    144 else:
--> 145 raise Exception('Could not grade submission: {}'.format(error['message']))
    146 result = envelope['data']['result']
    148 # Used only in testing

Exception: Could not grade submission: Could not verify access to this assessment: Received error from WQET submission API: You have already passed this course!

Congratulations! You made it to the end of Project 4.
