Carreon WS06

This document contains a student's answers to multiple choice and coding questions about supervised machine learning techniques in Python. It includes examples of classification using k-nearest neighbors on House voting data and linear regression using life expectancy and fertility data from Gapminder. The student performs tasks like data preprocessing, training and testing models, evaluating accuracy at different values of k, and 5-fold cross-validation.

Uploaded by

Keneth Carreon


Name

CARREON, KENETH C. DS100-3 / B9 APPLIED DATA SCIENCE

WORKSHEET #6: SUPERVISED LEARNING WITH PYTHON

1 Date: January 28, 2020

Which of the following example applications of machine learning is a supervised classification problem?
A. Using labeled financial data to predict whether the value of a stock will go up or go down next week
B. Using labeled housing price data to predict the price of a new house based on various features
C. Using unlabeled data to cluster the students of an online education company into different categories based on their learning styles
D. Using labeled financial data to predict what the value of a stock will be next week
Answer: A

2 Date: January 28, 2020

Import house-votes-84 (edited).csv. Write the code necessary to import and examine this dataset. Which of the following statements is not true?
A. The DataFrame has a total of 232 rows and 17 columns.
B. Except for ‘party’, all of the columns are of type int64.
C. The first row of the DataFrame consists of votes by a Democrat and the second row consists of votes by a Republican.
D. There are 17 predictor variables, or features, in this DataFrame.
E. The target variable in this DataFrame is ‘party’.
Code Answer

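import pandas as pd
df = pd.read_csv('house-votes-84 (edited).csv')
# info() prints its summary directly and returns None, so it is not wrapped in print()
df.info()
# Show the first rows to check the party labels and vote encoding
print(df.head())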
Answer: D (the 17 columns include the target 'party', so only 16 are predictor variables)

3 Date: January 28, 2020

Perform visual exploratory data analysis on the house votes dataset. Use Seaborn's countplot to visualize the votes on the satellite testing bill, grouped by party. Include the following line before the show function:
plt.xticks([0,1], ['No', 'Yes'])
Do the same for the missile bill. Write the code here and answer the question:
Of the two bills, which one(s) do Democrats vote resoundingly in favor of, compared to Republicans?
A. Missile Bill
B. Satellite Bill
C. Both Missile and Satellite Bills
D. Neither Missile nor Satellite Bill
Code

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('house-votes-84 (edited).csv')
# Count of No/Yes votes on the satellite testing bill, split by party
plt.figure()
sns.countplot(x='sat_test', hue='party', data=df, palette='RdBu')
plt.xticks([0,1], ['No', 'Yes'])
plt.show()
# For the missile bill, repeat the four plotting lines with that bill's column name in place of 'sat_test'

Answer: C
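
A count plot grouped by party shows the same numbers as a cross-tabulation, so a table can be used to double-check what the bars suggest. The following is a minimal sketch on a made-up six-row DataFrame; the column name `bill` and the values are hypothetical and are not taken from the House votes file.

```python
import pandas as pd

# Hypothetical toy data: 1 = yes vote, 0 = no vote
votes_demo = pd.DataFrame({
    'party': ['democrat', 'democrat', 'democrat',
              'republican', 'republican', 'republican'],
    'bill': [1, 1, 0, 0, 0, 1],
})

# Rows are parties, columns are vote values, cells are counts
counts = pd.crosstab(votes_demo['party'], votes_demo['bill'])
print(counts)
```

Each cell of the table corresponds to the height of one bar in the count plot.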


4 Date: January 28, 2020

Predict the party affiliation of the House member whose votes have been recorded in the file named x_new.csv. Write the code here to achieve the following output:
Party Prediction: ['democrat'/'republican']
Code

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_csv('house-votes-84 (edited).csv')
x_new = pd.read_csv('x_new.csv')
# Target is the party label; features are all remaining vote columns
y = df['party'].values
X = df.drop('party', axis=1).values
# No reshaping is needed: X is already 2-D (one row per member, one column per vote)
knn = KNeighborsClassifier(n_neighbors=6)
knn.fit(X, y)
# Pass the new record's values, not the string 'x_new'
new_prediction = knn.predict(x_new.values)
print("Party Prediction: {}".format(new_prediction))

Output

Party Prediction: ['democrat'/'republican']
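
To illustrate the fit-then-predict pattern without the worksheet files, here is a minimal sketch on invented data. The six rows, the four vote columns, and all variable names are hypothetical; only `KNeighborsClassifier`, `fit`, and `predict` come from scikit-learn.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: 6 members, 4 yes/no votes each (1 = yes, 0 = no)
X_toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])
y_toy = np.array(['democrat', 'democrat', 'democrat',
                  'republican', 'republican', 'republican'])

knn_toy = KNeighborsClassifier(n_neighbors=3)
knn_toy.fit(X_toy, y_toy)

# predict expects a 2-D array: one row per sample, one column per feature
x_toy_new = np.array([[1, 0, 1, 0]])
toy_prediction = knn_toy.predict(x_toy_new)
print("Party Prediction: {}".format(toy_prediction))
```

The two closest training rows to the new record are both labeled 'democrat', so the majority vote among three neighbors is 'democrat' whichever third neighbor is picked.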

5 Date: January 28, 2020

Use train_test_split from sklearn on your House votes data. Use 70% of the data for training and the rest for testing. Add the following arguments to train_test_split: random_state = 21, stratify = y. Print out the predictions for the test set and the model score. Write the code here and submit a copy of the output through Cardinal Edge Worksheet Submission.
Code Output

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Use the House votes data, as the item asks
df = pd.read_csv('house-votes-84 (edited).csv')
y = df['party'].values
X = df.drop('party', axis=1).values
# test_size=0.3 keeps 70% of the rows for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21, stratify=y)
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)
print("Test set predictions: {}".format(knn.predict(X_test)))
print(knn.score(X_test, y_test))
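
The `stratify=y` argument asks `train_test_split` to keep the class proportions the same in the training and test sets. A minimal sketch of that effect, assuming an invented 60/40 label split (the labels and the placeholder feature column are hypothetical):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 60 of one class, 40 of the other
y_demo = np.array(['democrat'] * 60 + ['republican'] * 40)
X_demo = np.arange(100).reshape(-1, 1)  # placeholder feature column

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=21, stratify=y_demo)

# Share of 'democrat' labels in each part of the split
train_share = np.mean(y_tr == 'democrat')
test_share = np.mean(y_te == 'democrat')
print(len(y_tr), len(y_te))
print(train_share, test_share)
```

Both shares come out at the original 60%, which keeps the test score from being skewed by an unrepresentative split.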


6 Date: January 28, 2020

Use a for loop to determine the training accuracy and testing accuracy for the House votes data at k-values from 1 to 9. Plot the
results. Write the code here and submit a copy of the output through Cardinal Edge Worksheet Submission.
Code Output

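import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# X_train, X_test, y_train, y_test come from the split in item 5
# arange excludes the stop value, so 10 is needed to cover k = 1 to 9
neighbors = np.arange(1, 10)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Accuracy on the data the model was fit to, then on held-out data
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)
plt.title('k-NN: Varying Number of Neighbors')
plt.plot(neighbors, test_accuracy, label = 'Testing Accuracy')
plt.plot(neighbors, train_accuracy, label = 'Training Accuracy')
plt.legend()
plt.xlabel('Number of Neighbors')
plt.ylabel('Accuracy')
plt.show()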
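
Once the two accuracy arrays are filled, the k with the highest testing accuracy can be read off numerically as well as from the plot. The sketch below runs the same loop on synthetic data from `make_classification`, since the worksheet files are not available here; the dataset and all variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the votes data (an assumption, not the real file)
X_syn, y_syn = make_classification(n_samples=200, n_features=8, random_state=0)
X_a, X_b, y_a, y_b = train_test_split(
    X_syn, y_syn, test_size=0.3, random_state=21, stratify=y_syn)

k_values = np.arange(1, 10)  # k = 1, 2, ..., 9
train_acc = np.empty(len(k_values))
test_acc = np.empty(len(k_values))
for i, k in enumerate(k_values):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_a, y_a)
    train_acc[i] = model.score(X_a, y_a)
    test_acc[i] = model.score(X_b, y_b)

# argmax returns the first index of the highest testing accuracy
best_k = k_values[np.argmax(test_acc)]
print("Best k by testing accuracy: {}".format(best_k))
```

At k = 1 every training point is its own nearest neighbor, so training accuracy is perfect on data without duplicate rows; that is why testing accuracy, not training accuracy, is the one to compare across k.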

7 Date: January 25, 2020

Which of the following example applications of machine learning is best framed as a regression problem?
A. An e-commerce company using labeled customer data to predict whether or not a customer will purchase a particular item
B. A healthcare company using data about cancer tumors (such as their geometric measurements) to predict whether a new tumor is benign or malignant
C. A restaurant using review data to ascribe positive or negative sentiment to a given review
D. A bike share company using time and weather data to predict the number of bikes being rented at any given hour
Answer: D (the number of bikes rented is a numeric quantity; A, B, and C each predict one of two categories)

8 Date: January 28, 2020

Import the gapminder file. Pre-process the data by examining its features and converting the DataFrame into arrays and
reshaping them for regression. We want to see how life expectancy varies with fertility. Write the necessary codes here.
Code
import numpy as np
import pandas as pd
df = pd.read_csv('gapminder_P06.csv')
y = df['life_exp'].values
X = df['fertility'].values
print("Dimensions of y before reshaping: {}".format(y.shape))
print("Dimensions of X before reshaping: {}".format(X.shape))
y = y.reshape(-1,1)
X = X.reshape(-1,1)
print("Dimensions of y after reshaping: {}".format(y.shape))
print("Dimensions of X after reshaping: {}".format(X.shape))
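
The reshape step matters because scikit-learn estimators expect the feature array to be 2-D, with one row per sample and one column per feature. A minimal sketch with four invented fertility values (the numbers are hypothetical):

```python
import numpy as np

fertility_demo = np.array([2.1, 3.4, 1.8, 5.0])  # hypothetical values
print(fertility_demo.shape)   # (4,)

# -1 lets NumPy infer the number of rows; 1 fixes a single column
fertility_column = fertility_demo.reshape(-1, 1)
print(fertility_column.shape)  # (4, 1)
```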

9 Date: January 28, 2020

Perform regression on the data (life expectancy as a function of fertility). Prepare a plot showing the data points (in blue) and the
linear model (in red). Print out the regression score. Write the code here and submit a copy of the output through Cardinal Edge
Worksheet Submission.
Code Output

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

df = pd.read_csv('gapminder_P06.csv')
y = df['life_exp'].values
X = df['fertility'].values
# Reshape to column vectors: scikit-learn expects a 2-D feature array
y_life = y.reshape(-1,1)
X_fertility = X.reshape(-1,1)
reg = LinearRegression()
# Evenly spaced fertility values over the observed range, used to draw the fitted line
prediction_space = np.linspace(X_fertility.min(), X_fertility.max()).reshape(-1,1)
reg.fit(X_fertility, y_life)
y_pred = reg.predict(prediction_space)
# score() returns R^2 for a regressor
print(reg.score(X_fertility, y_life))
plt.scatter(X_fertility, y_life, color='blue')
plt.plot(prediction_space, y_pred, color='red', linewidth=3)
plt.show()
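
The regression score printed by `reg.score` is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. The sketch below computes it by hand and compares it with `score` on synthetic data; the generated values and variable names are assumptions and do not come from the Gapminder file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in: a noisy decreasing line (not the real Gapminder data)
rng = np.random.default_rng(0)
x_syn = rng.uniform(1, 7, size=50).reshape(-1, 1)
y_syn = 80 - 4 * x_syn + rng.normal(0, 2, size=(50, 1))

reg_demo = LinearRegression().fit(x_syn, y_syn)
y_hat = reg_demo.predict(x_syn)

ss_res = np.sum((y_syn - y_hat) ** 2)          # residual sum of squares
ss_tot = np.sum((y_syn - y_syn.mean()) ** 2)   # total sum of squares
r2_manual = 1 - ss_res / ss_tot
r2_sklearn = reg_demo.score(x_syn, y_syn)
print(r2_manual, r2_sklearn)
```

The two values agree up to floating-point error, which confirms what the single number reported in this item measures.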

10 Date: January 28, 2020

Perform a 5-fold cross validation on the data on the previous numbers. Write lines of code to print out the individual validation
scores and the average validation score.
Code

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

reg = LinearRegression()
# Use the reshaped (n, 1) arrays from item 9; the 1-D X would be rejected by scikit-learn
cv_scores = cross_val_score(reg, X_fertility, y_life, cv=5)
print(cv_scores)
print("Average 5-Fold CV Score: {}".format(np.mean(cv_scores)))

Output
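
As a supplementary sketch of what 5-fold cross-validation does, the loop below splits the data into five folds, fits on four, scores on the fifth, and compares the result with `cross_val_score`. It uses synthetic data, so the numbers are not the worksheet's output; the data and variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the fertility / life expectancy columns
rng = np.random.default_rng(1)
X_cv = rng.uniform(1, 7, size=(40, 1))
y_cv = 80 - 4 * X_cv[:, 0] + rng.normal(0, 2, size=40)

manual_scores = []
for train_idx, test_idx in KFold(n_splits=5).split(X_cv):
    # Fit on four folds, then compute R^2 on the held-out fold
    model = LinearRegression().fit(X_cv[train_idx], y_cv[train_idx])
    manual_scores.append(model.score(X_cv[test_idx], y_cv[test_idx]))

# For a regressor, an integer cv uses unshuffled KFold, matching the loop above
auto_scores = cross_val_score(LinearRegression(), X_cv, y_cv, cv=5)
print(manual_scores)
print(auto_scores)
```

Each of the five scores is an R^2 on data the model did not see during fitting, and their mean is the average validation score.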
