2020 - 21 - Extra Exercise 1

1. The document presents 6 questions regarding classification, regression, supervised and unsupervised learning problems. 2. Question 1 asks to classify problems as classification or regression. Question 2 asks about supervised vs unsupervised problems. 3. The remaining questions involve specific regression or classification problems, including linear regression to predict student grades, logistic regression to predict virus infection, and feature scaling for housing price prediction.

Uploaded by

Lin Feng

Extra Exercise 1 – For the Mid-term Exam

Question 1
The amount of rain that falls in a day is usually measured in either millimeters (mm) or inches.
Would you treat each of the following problems as a classification or a regression problem? Why?
1. Suppose you use a learning algorithm to predict how much rain will fall tomorrow.
2. Suppose you use a learning algorithm to predict whether it will rain tomorrow.
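One way to compare the two sub-questions is to look at the kind of target value each one asks the algorithm to produce. The sketch below uses made-up rainfall readings, and the 0 mm threshold for "rain" is an assumption for illustration only; none of these names or numbers come from the exercise.

```python
# Hypothetical daily rainfall readings in millimeters (not data from the exercise).
rain_mm = [0.0, 2.5, 0.0, 12.3, 0.4]

# Target for sub-question 1: the amount itself, a real-valued quantity.
amount_targets = list(rain_mm)

# Target for sub-question 2: a yes/no label derived from the amount.
# Counting any amount above 0 mm as "rain" is an assumed threshold.
rain_targets = [mm > 0.0 for mm in rain_mm]

print(amount_targets)
print(rain_targets)
```

The first list can take any non-negative value, while the second can only take one of two values.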

Question 2
Would you treat each of the following problems as a supervised or an unsupervised learning
problem?
1. Given emails labeled as spam/not spam, learn a spam filter.
2. Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new
patients as having diabetes or not.
3. Given a set of news articles found on the web, group them into sets of articles about the
same stories.
4. Given a database of customer data, automatically discover market segments and group
customers into different market segments.

Question 3
Consider the problem of predicting how well a student does in her second year of
college/university, given how well she did in her first year.
Specifically, let x be equal to the number of "A" grades (including A-, A, and A+ grades) that a
student receives in their first year of college (freshman year). We would like to predict the
value of y, which we define as the number of "A" grades they get in their second year
(sophomore year).
Refer to the following training set of a small sample of different students' performances. Here
each row is one training example. Recall that in linear regression, our hypothesis is
$h_\beta(x) = \beta_0 + \beta_1 x$, and we use N to denote the number of training examples.

1. What is the value of N in this training sample?


2. Given the hypothesis $h_\beta(x) = \beta_0 + \beta_1 x$, write the cost function $J(\beta)$, and find the
value of $J(0, 1)$. (Simplify fractions to decimals when entering your answer, e.g., 1.5)
3. Suppose that we set $\beta_0 = -2$ and $\beta_1 = 0.5$. What is the value of $h_\beta(6)$?
4. Suppose that we use the normal equation to find the OLS solution $\beta = (X^T X)^{-1} X^T Y$.
What are X and Y? What are the dimensions of X and Y?
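
Answers to the sub-questions above can be checked numerically with a small script. The following is a minimal sketch in plain Python, not the course's reference solution. It assumes the common 1/(2N) scaling for the squared-error cost (some courses use 1/N instead), and the `xs`/`ys` values are hypothetical placeholders rather than the exercise's training set.

```python
def hypothesis(beta0, beta1, x):
    """Linear hypothesis h_beta(x) = beta0 + beta1 * x."""
    return beta0 + beta1 * x


def cost(beta0, beta1, xs, ys):
    """Squared-error cost J(beta), assuming the 1/(2N) convention."""
    n = len(xs)
    total = sum((hypothesis(beta0, beta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * n)


def normal_equation(xs, ys):
    """Solve beta = (X^T X)^{-1} X^T Y for one feature plus an intercept.

    Each row of X is [1, x_i]; Y stacks the y_i values.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T Y = [sy, sxy];
    # the 2x2 inverse is written out explicitly below.
    det = n * sxx - sx * sx
    beta0 = (sxx * sy - sx * sxy) / det
    beta1 = (n * sxy - sx * sy) / det
    return beta0, beta1


# Hypothetical (x, y) pairs; replace with the rows of the exercise's table.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

print(cost(0, 1, xs, ys))
print(hypothesis(-2, 0.5, 6))
print(normal_equation(xs, ys))
```

Writing the 2x2 inverse by hand keeps the sketch free of external libraries; with more features a linear-algebra library would be the more practical choice.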

For Questions 4 and 5, please refer to the following:
Consider the problem of a multivariate logistic regression to predict coronavirus infection.
Suppose that you have the following hypothesis:
$h_\beta(x) = g(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2)$,
where g(z) is the logistic function.

Question 4
Please draw the decision boundary for the following hypotheses and mark the decision area
for y=0 and y=1.
1. $\beta_0 = -1$, $\beta_1 = -1$, $\beta_2 = 1$, $\beta_3 = 0$, and $\beta_4 = 0$
2. $\beta_0 = -1$, $\beta_1 = 0$, $\beta_2 = 0$, $\beta_3 = 1$, and $\beta_4 = 1$
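
A rough text rendering of the two decision regions can help when checking a hand-drawn sketch. The code below is an illustrative aid only. It assumes the usual rule of predicting y=1 when g(z) >= 0.5, which is equivalent to z >= 0, and the helper names, grid range, and grid resolution are arbitrary choices made here.

```python
def z_value(beta, x1, x2):
    """Argument of g: b0 + b1*x1 + b2*x2 + b3*x1^2 + b4*x2^2."""
    b0, b1, b2, b3, b4 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 ** 2 + b4 * x2 ** 2


def predict(beta, x1, x2):
    """Predict 1 when z >= 0 (assumed threshold g(z) >= 0.5), else 0."""
    return 1 if z_value(beta, x1, x2) >= 0 else 0


def region_map(beta, lo=-2.0, hi=2.0, steps=9):
    """Rows of '0'/'1' over a square grid; x1 grows rightwards, x2 grows upwards."""
    step = (hi - lo) / (steps - 1)
    rows = []
    for i in range(steps):
        x2 = hi - i * step  # first row is the top of the plot
        row = "".join(str(predict(beta, lo + j * step, x2)) for j in range(steps))
        rows.append(row)
    return rows


# The two parameter settings listed in this question.
for beta in [(-1, -1, 1, 0, 0), (-1, 0, 0, 1, 1)]:
    print(beta)
    print("\n".join(region_map(beta)))
```

The line or curve where the printed characters switch between 0 and 1 approximates the decision boundary at the grid's resolution.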

Question 5
Suppose that $\beta_0 = -1$, $\beta_1 = 0$, $\beta_2 = 0$, $\beta_3 = 1$, and $\beta_4 = 1$. Compute
$h_\beta(0,0)$ and $h_\beta(1,1)$ and state the corresponding predictions of coronavirus infection.
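
Hand calculations for this question can be verified with a few lines of Python. This is a sketch under stated assumptions: the 0.5 cutoff for predicting infection is the conventional choice and is not stated in the exercise, and the function names are chosen here for illustration.

```python
import math


def logistic(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(beta, x1, x2):
    """h_beta(x) = g(b0 + b1*x1 + b2*x2 + b3*x1^2 + b4*x2^2)."""
    b0, b1, b2, b3, b4 = beta
    return logistic(b0 + b1 * x1 + b2 * x2 + b3 * x1 ** 2 + b4 * x2 ** 2)


def predicted_label(probability, threshold=0.5):
    """Map a probability to 0 or 1 using an assumed 0.5 threshold."""
    return 1 if probability >= threshold else 0


beta = (-1, 0, 0, 1, 1)  # parameter values given in this question

for point in [(0, 0), (1, 1)]:
    p = hypothesis(beta, *point)
    print(point, p, predicted_label(p))
```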

Question 6
Consider the problem of predicting the price of a house as a function of its size. Your model is:

$h_\beta(x) = \beta_0 + \beta_1 \,\text{size} + \beta_2 \sqrt{\text{size}}$

Suppose that in a training sample, the size of an apartment ranges from 1 to 10000 square feet.
The average values of size and $\sqrt{\text{size}}$ are 1000 and 30, respectively. You will implement this by fitting a model:
$h_\beta(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Suppose you want to use feature scaling with mean normalization to preprocess the data.
What values of $x_1$ and $x_2$ would you use?
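
The scaling step can be sketched in Python as follows. This assumes the variant of mean normalization that subtracts the mean and divides by the range (max minus min); other conventions divide by the maximum or by the standard deviation, so the course's expected answer may differ. The constant and function names are hypothetical, and the house size passed in at the end is an arbitrary example.

```python
import math


def mean_normalize(value, mean, lo, hi):
    """Scale a feature as (value - mean) / (hi - lo)."""
    return (value - mean) / (hi - lo)


# Range and mean of size, taken from the question text.
SIZE_MIN, SIZE_MAX, SIZE_MEAN = 1.0, 10000.0, 1000.0
# sqrt(size) is monotonic, so its range follows from the range of size.
SQRT_MIN, SQRT_MAX, SQRT_MEAN = math.sqrt(SIZE_MIN), math.sqrt(SIZE_MAX), 30.0


def features(size):
    """Return the scaled features (x1, x2) for a given size in square feet."""
    x1 = mean_normalize(size, SIZE_MEAN, SIZE_MIN, SIZE_MAX)
    x2 = mean_normalize(math.sqrt(size), SQRT_MEAN, SQRT_MIN, SQRT_MAX)
    return x1, x2


print(features(2500.0))  # arbitrary example size
```

After this scaling both features fall in a range of width 1 around zero, which is the usual motivation for normalizing before gradient descent.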
