2020 - 21 - Extra Exercise 1
Question 1
The amount of rain that falls in a day is usually measured in either millimeters (mm) or inches.
Would you treat each of the following problems as a classification or a regression problem? Why?
1. Suppose you use a learning algorithm to predict how much rain will fall tomorrow.
2. Suppose you use a learning algorithm to predict whether it will rain tomorrow.
Question 2
Would you treat each of the following problems as a supervised or an unsupervised learning
problem?
1. Given email labeled as spam/not spam, learn a spam filter.
2. Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new
patients as having diabetes or not.
3. Given a set of news articles found on the web, group them into sets of articles about the
same stories.
4. Given a database of customer data, automatically discover market segments and group
customers into different market segments.
Question 3
Consider the problem of predicting how well a student does in their second year of
college/university, given how well they did in their first year.
Specifically, let x be the number of "A" grades (including A-, A and A+ grades) that a
student receives in their first year of college (freshman year). We would like to predict the
value of y, which we define as the number of "A" grades they get in their second year
(sophomore year).
Refer to the following training set of a small sample of different students' performances. Here
each row is one training example. Recall that in linear regression, our hypothesis is ℎ𝛽 (𝑥) =
𝛽0 + 𝛽1 𝑥, and we use N to denote the number of training examples.
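As a reminder of the notation, the hypothesis and N can be sketched in a few lines of Python. The training pairs and parameter values below are hypothetical placeholders, not the data from the sheet's table, and the names `training_set` and `h` are chosen only for illustration.

```python
# Hypothetical training set: NOT the table from this exercise sheet.
# Each pair is (x, y) = (first-year "A" grades, second-year "A" grades).
training_set = [(3, 2), (1, 2), (0, 1), (4, 3)]

# N denotes the number of training examples (rows).
N = len(training_set)


def h(beta0, beta1, x):
    """Linear regression hypothesis h_beta(x) = beta0 + beta1 * x."""
    return beta0 + beta1 * x


# Illustrative parameter values only; they are not fitted to any data.
beta0, beta1 = 0.0, 1.0
predictions = [h(beta0, beta1, x) for x, _ in training_set]
print(N, predictions)
```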
Question 4
Please draw the decision boundary for each of the following hypotheses and mark the decision
regions for y=0 and y=1.
1. 𝛽0 = −1, 𝛽1 = −1, 𝛽2 = 1, 𝛽3 = 0, and 𝛽4 = 0
2. 𝛽0 = −1, 𝛽1 = 0, 𝛽2 = 0, 𝛽3 = 1, and 𝛽4 = 1
Question 5
Suppose that 𝛽0 = −1, 𝛽1 = 0, 𝛽2 = 0, 𝛽3 = 1, and 𝛽4 = 1. Find the values of
ℎ𝛽 (0,0) and ℎ𝛽 (1,1) and the corresponding predictions of coronavirus infection.
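The sheet as reproduced here does not show the hypothesis that these five coefficients belong to. The sketch below therefore rests on assumptions: a logistic regression hypothesis with features 1, x1, x2, x1^2 and x2^2, a 0.5 decision threshold, and y=1 standing for "infected". If the intended hypothesis uses different features, only the line computing `z` would need to change.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def h(beta, x1, x2):
    """Hypothesis value under the ASSUMED feature set (1, x1, x2, x1^2, x2^2)."""
    b0, b1, b2, b3, b4 = beta
    z = b0 + b1 * x1 + b2 * x2 + b3 * x1 ** 2 + b4 * x2 ** 2
    return sigmoid(z)


def predict(beta, x1, x2, threshold=0.5):
    """Predict y=1 when h >= threshold (ASSUMED threshold), else y=0."""
    return 1 if h(beta, x1, x2) >= threshold else 0


# Coefficients from Question 5, evaluated at the two points it asks about.
beta = (-1, 0, 0, 1, 1)
for point in [(0, 0), (1, 1)]:
    print(point, h(beta, *point), predict(beta, *point))
```

Under the same assumptions, the decision boundary is the set of points where `z` equals 0, which is where `h` equals exactly 0.5.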
Question 6
Consider the problem of predicting the price of a house as a function of its size. Your model is:
ℎ𝛽 (𝑥) = 𝛽0 + 𝛽1 (𝑠𝑖𝑧𝑒) + 𝛽2 √𝑠𝑖𝑧𝑒
Suppose that in a training sample, the size of an apartment ranges from 1 to 10000 square feet.
The averages of size and √𝑠𝑖𝑧𝑒 are 1000 and 30, respectively. You will implement this by fitting a model:
ℎ𝛽 (𝑥) = 𝛽0 + 𝛽1 𝑥1 + 𝛽2 𝑥2
Suppose you want to use feature scaling with mean normalization to preprocess the data.
What values of 𝑥1 and 𝑥2 would you use?
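A minimal sketch of mean normalization follows, assuming the common convention of subtracting the mean and dividing by the range (max minus min); dividing by the standard deviation is another convention, and the sheet does not say which is intended. It also assumes 𝑥1 is derived from size and 𝑥2 from √𝑠𝑖𝑧𝑒. The function names are illustrative.

```python
import math


def mean_normalize(value, mean, min_value, max_value):
    """Mean normalization: (value - mean) / (max - min)."""
    return (value - mean) / (max_value - min_value)


# Statistics stated in the question: size ranges over [1, 10000] with mean 1000.
SIZE_MEAN, SIZE_MIN, SIZE_MAX = 1000, 1, 10000
# sqrt(size) therefore ranges over [1, 100]; its mean is given as 30.
SQRT_MEAN, SQRT_MIN, SQRT_MAX = 30, math.sqrt(SIZE_MIN), math.sqrt(SIZE_MAX)


def scaled_features(size):
    """Return (x1, x2), ASSUMING x1 comes from size and x2 from sqrt(size)."""
    x1 = mean_normalize(size, SIZE_MEAN, SIZE_MIN, SIZE_MAX)
    x2 = mean_normalize(math.sqrt(size), SQRT_MEAN, SQRT_MIN, SQRT_MAX)
    return x1, x2


print(scaled_features(2500))
```

After scaling, both features are centered near zero and span a range of width 1, which tends to help gradient descent converge.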