ECE 528 / DA 528 Machine Learning, Midterm Exam: Instructions
Best of Luck!
Instructions:
Circle the correct answer and fill in the blank spaces
Write your name, number, and signature in the honour code form attached
to the midterm exam
Total is 100 points
3. [10 points] Suppose you are working on weather prediction and use a
learning algorithm to predict tomorrow’s temperature (in degrees
Centigrade/Fahrenheit). Would you treat this as a classification or a
regression problem?
(a) Regression
(b) Classification
4. [10 points] Some of the problems below are best addressed using a
supervised learning algorithm, and the others with an unsupervised
learning algorithm. Which of the following would you apply
unsupervised learning to? (Select all that apply.) In each case, assume
some appropriate dataset is available for your algorithm to learn from.
(a) Given data on how 1000 medical patients respond to an
experimental drug (such as effectiveness of the treatment, side
effects, etc.), discover whether there are different categories or
“types” of patients in terms of how they respond to the drug, and if so
what these categories are.
(b) Given a large dataset of medical records from patients suffering
from heart disease, try to learn whether there might be different
clusters of such patients for which we might tailor separate
treatments.
(c) Have a computer examine an audio clip of a piece of music, and
classify whether or not there are vocals (i.e., a human voice singing) in
that audio clip, or if it is a clip of only musical instruments (and no
vocals).
(d) Given genetic (DNA) data from a person, predict the odds of
him/her developing diabetes over the next 10 years.
5. [10 points] Some of the problems below are best addressed using a
supervised learning algorithm, and the others with an unsupervised
learning algorithm. Which of the following would you apply
unsupervised learning to? (Select all that apply.) In each case, assume
some appropriate dataset is available for your algorithm to learn from.
(a) Given historical data of children’s ages and heights, predict
children’s height as a function of their age.
(b) Given 50 articles written by male authors, and 50 articles written
by female authors, learn to predict the gender of a new
manuscript’s author (when the identity of this author is unknown).
(c) Take a collection of 1000 essays written on the US Economy, and
find a way to automatically group these essays into a small number
of groups of essays that are somehow “similar” or “related”.
(d) Examine a large collection of emails that are known to be spam
email, to discover if there are sub-types of spam mail.
6. [10 points] Here each row is one training example. Recall that in linear
regression, our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x$, and we
use $m$ to denote the number of training examples.
For the training set given above (note that this training set may also be
referenced in other questions), what is the value of m? In the box
below, please enter your answer (which should be a number between
0 and 10).
…………………………………………………..
7. [10 points] For this question, assume that we are using the training set
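from Q6. Recall our definition of the cost function was
$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$.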
What is ?
…………………………………………………………
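As a rough illustration of the kind of calculation Q7 asks for, the sketch below evaluates a squared-error cost for a one-variable linear hypothesis. It assumes the usual form of the cost, the mean of squared errors divided by 2. The training set `xs`, `ys` and the function names are hypothetical; they are not the Q6 data, which is not reproduced here.

```python
# Hypothetical training set for illustration only (NOT the Q6 data).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def hypothesis(theta0, theta1, x):
    """Linear hypothesis: theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost: sum of squared errors divided by 2m (assumed form)."""
    m = len(xs)  # number of training examples
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# The line y = x passes through every hypothetical point, so the cost is zero.
print(cost(0.0, 1.0, xs, ys))
# Predicting 0 everywhere gives (1 + 4 + 9) / (2 * 3).
print(cost(0.0, 0.0, xs, ys))
```

The same function could be applied to any training set and any pair of parameter values by changing the arguments.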
10. [10 points] Draw a flowchart to show how supervised learning works.