
ECE 528 / DA 528 Machine Learning, Midterm Exam

Instructor: Asst. Prof. Dr. Abdullahi Abdu Ibrahim


Date: Thursday, 15/04/2021, 5pm to 8pm

Best of Luck!

Instructions:
Circle the correct answer and fill in the blank spaces.
Write your name, student number, and signature on the honour code form attached
to the midterm exam.
The exam is worth a total of 100 points.

1. [10 points] A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E.
Suppose we feed a learning algorithm a lot of historical weather data,
and have it learn to predict weather. What would be a reasonable
choice for P?
(a) The probability of it correctly predicting a future date’s weather.
(b) The weather prediction task.
(c) The process of the algorithm examining a large amount of
historical weather data.
(d) All of the above.

2. [10 points] A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E.
Suppose we feed a learning algorithm a lot of historical weather data,
and have it learn to predict weather. In this setting, what is T?
(a) The weather prediction task.
(b) None of these.
(c) The probability of it correctly predicting a future date’s weather.
(d) The process of the algorithm examining a large amount of
historical weather data.

3. [10 points] Suppose you are working on weather prediction and use a
learning algorithm to predict tomorrow’s temperature (in degrees
Centigrade/Fahrenheit). Would you treat this as a classification or a
regression problem?
(a) Regression
(b) Classification

4. [10 points] Some of the problems below are best addressed using a
supervised learning algorithm, and the others with an unsupervised
learning algorithm. Which of the following would you apply
unsupervised learning to? (Select all that apply.) In each case, assume
some appropriate dataset is available for your algorithm to learn from.
(a) Given data on how 1000 medical patients respond to an
experimental drug (such as effectiveness of the treatment, side
effects, etc.), discover whether there are different categories or
“types” of patients in terms of how they respond to the drug, and if so
what these categories are.
(b) Given a large dataset of medical records from patients suffering
from heart disease, try to learn whether there might be different
clusters of such patients for which we might tailor separate
treatments.
(c) Have a computer examine an audio clip of a piece of music, and
classify whether or not there are vocals (i.e., a human voice singing) in
that audio clip, or if it is a clip of only musical instruments (and no
vocals).
(d) Given genetic (DNA) data from a person, predict the odds of
him/her developing diabetes over the next 10 years.

5. [10 points] Some of the problems below are best addressed using a
supervised learning algorithm, and the others with an unsupervised
learning algorithm. Which of the following would you apply
unsupervised learning to? (Select all that apply.) In each case, assume
some appropriate dataset is available for your algorithm to learn from.
(a) Given historical data of children’s ages and heights, predict
children’s height as a function of their age.
(b) Given 50 articles written by male authors, and 50 articles written
by female authors, learn to predict the gender of a new
manuscript’s author (when the identity of this author is unknown).
(c) Take a collection of 1000 essays written on the US Economy, and
find a way to automatically group these essays into a small number
of groups of essays that are somehow “similar” or “related”.
(d) Examine a large collection of emails that are known to be spam
email, to discover if there are sub-types of spam mail.

6. [10 points] Here each row is one training example. Recall that in linear
regression, our hypothesis is h_θ(x) = θ_0 + θ_1 x, and we use m to denote
the number of training examples.

For the training set given above (note that this training set may also be
referenced in other questions), what is the value of m? In the box
below, please enter your answer (which should be a number between
0 and 10).
…………………………………………………..
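
For reference only, here is a minimal Python sketch of the quantities this
question uses, with a made-up training set since the exam's table is not
reproduced here: m is simply the number of (x, y) rows, and the single-variable
linear regression hypothesis is h_θ(x) = θ_0 + θ_1 x.

    # Hypothetical training set (NOT the one from the exam): a list of (x, y) rows.
    training_set = [(1.0, 0.5), (2.0, 1.0), (4.0, 2.0), (0.0, 0.0)]

    # m denotes the number of training examples, i.e. the number of rows.
    m = len(training_set)                  # m == 4 for this made-up data

    # Linear regression hypothesis h_theta(x) = theta_0 + theta_1 * x
    def h(theta_0, theta_1, x):
        return theta_0 + theta_1 * x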

7. [10 points] For this question, assume that we are using the training set
from Q6. Recall our definition of the cost function was
J(θ_0, θ_1) = (1 / (2m)) · Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) )².
What is J(………, ………)?
…………………………………………………………
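
As an illustrative sketch only, the squared-error cost above can be computed
directly; the training set and θ values below are arbitrary placeholders, not
the exam's data.

    # Hypothetical data and hypothesis (same assumptions as the Q6 sketch).
    training_set = [(1.0, 0.5), (2.0, 1.0), (4.0, 2.0), (0.0, 0.0)]

    def h(theta_0, theta_1, x):
        return theta_0 + theta_1 * x

    # J(theta_0, theta_1) = (1 / (2m)) * sum over i of (h(x_i) - y_i)^2
    def cost_J(theta_0, theta_1, data):
        m = len(data)
        return sum((h(theta_0, theta_1, x) - y) ** 2 for x, y in data) / (2 * m)

    print(cost_J(0.0, 0.5, training_set))  # 0.0: this placeholder data lies exactly on y = 0.5x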

8. [10 points] Suppose we set θ_0 = ……… and θ_1 = ……… in the linear regression
hypothesis from Q6. What is h_θ(………)?
………………………………………………………
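
As a worked example with made-up parameter values (not the ones from the
original question): if θ_0 = 1 and θ_1 = 2, then h_θ(3) = 1 + 2 · 3 = 7.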

9. [10 points] Let f be some function so that f(θ_0, θ_1) outputs a number.
For this problem, f is some arbitrary/unknown smooth function (not
necessarily the cost function of linear regression, so f may have local
optima).
Suppose we use gradient descent to try to minimize f(θ_0, θ_1) as a function
of θ_0 and θ_1. Which of the following statements are not true? (Check all
that apply.)
(a) If θ_0 and θ_1 are initialized at the global minimum, then one
iteration will not change their values.
(b) Setting the learning rate α to be very small is not harmful, and can
only speed up the convergence of gradient descent.
(c) No matter how θ_0 and θ_1 are initialized, so long as α is sufficiently
small, we can safely expect gradient descent to converge to the
same solution.
(d) If the first few iterations of gradient descent cause f(θ_0, θ_1) to
increase rather than decrease, then the most likely cause is that we
have set the learning rate α to too large a value.
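
For context only, here is a minimal, hedged Python sketch of gradient descent
on a two-parameter function. The example function f and the numerical-gradient
step size are arbitrary choices; the update rule θ_j := θ_j − α · ∂f/∂θ_j is
the standard one the question refers to.

    # Minimal gradient descent sketch for a function f(theta_0, theta_1).
    # The example f below is an arbitrary smooth function, not the exam's f.
    def f(theta_0, theta_1):
        return (theta_0 - 3.0) ** 2 + (theta_1 + 1.0) ** 2

    def numerical_grad(f, t0, t1, eps=1e-6):
        # Central-difference approximation of the two partial derivatives.
        d0 = (f(t0 + eps, t1) - f(t0 - eps, t1)) / (2 * eps)
        d1 = (f(t0, t1 + eps) - f(t0, t1 - eps)) / (2 * eps)
        return d0, d1

    def gradient_descent(f, t0, t1, alpha=0.1, iterations=100):
        for _ in range(iterations):
            d0, d1 = numerical_grad(f, t0, t1)
            # Simultaneous update: theta_j := theta_j - alpha * d f / d theta_j
            t0, t1 = t0 - alpha * d0, t1 - alpha * d1
        return t0, t1

    print(gradient_descent(f, 0.0, 0.0))   # approaches (3.0, -1.0) for this example f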

10. [10 points] Draw a flowchart to show how supervised learning works.
