Midterm 2021 - Model Answer1
Midterm Exam
Department: CS
Course Name: Machine Learning Date: 1/12/2021
Course Code: CS467 Duration: 1 hour
Instructor(s): Dr. Hanaa Bayomi Total Marks: 20
Name:……………………………………… ID:…………………………..
Important instructions
• Having a switched-on mobile phone inside the examination hall is treated as cheating and will be penalized. If a phone must be brought in, it must be switched off and kept inside a bag.
• Earphones and Bluetooth devices are not allowed.
• No books, handouts, or papers are allowed inside the examination hall; any violation is treated as cheating.
Question 1 [5 marks]
- Answer the following question:
Construct a parametric classifier using Naïve Bayes to predict whether the new instance
X = (Give Birth = "Yes", Can Fly = "No", Live in Water = "Yes", Have Legs = "No")
will be classified as a mammal or a non-mammal.
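The Naïve Bayes rule scores each class by its prior multiplied by the per-attribute conditional probabilities, and picks the class with the larger score. The following is a minimal sketch of that rule for categorical attributes. The training table is not reproduced here, so the function takes it as input; the function name, the argument layout, and the choice of plain relative frequencies without smoothing are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def naive_bayes_predict(rows, labels, x):
    """Return (best_class, scores) for instance x.

    rows   : list of dicts, attribute name -> value (training data)
    labels : list of class labels, one per row
    x      : dict, attribute name -> value (instance to classify)
    """
    class_counts = Counter(labels)
    n = len(labels)

    # value_counts[c][(attr, value)] = number of class-c rows with that value
    value_counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for attr, value in row.items():
            value_counts[c][(attr, value)] += 1

    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / n  # prior P(c)
        for attr, value in x.items():
            # conditional P(attr = value | c), estimated by relative frequency
            score *= value_counts[c][(attr, value)] / n_c
        scores[c] = score

    return max(scores, key=scores.get), scores
```

The scores are unnormalized: comparing them is enough to choose a class, and dividing each by their sum gives the posterior probabilities.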
Question 2 [5 marks]
Mark each statement with T or F on the right side:
2) When a decision tree is grown to full depth, it is more likely to fit the noise in the data. ( T )
9) When the trained system matches the training set perfectly, overfitting may occur. ( T )
10) Algorithms for supervised learning are not directly applicable to unsupervised learning. ( T )
Question 3 [5 marks]
Kim is building a spam filter. She has the hypothesis that counting the occurrences of the
letter ‘x’ in the e-mails will be a good indicator of spam or no-spam. She collects 7 spam
messages and 7 no spam messages and counts the number of x-s in each. Here is what she
finds.
• Number of ‘x’-s in each spam: [0, 3, 4, 8, 9, 13, 21]
• Number of ‘x’-s in each no-spam: [0, 0, 1, 2, 2, 5, 6]
She trains a logistic regression classifier on the data and plots the classifier against the data.
b) How is a logistic regression model normally turned into a binary classifier? If you turn
the model into a classifier in this way, what is the accuracy of the classifier on the
training data?
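The usual conversion is to threshold the predicted probability at 0.5, which for a single feature means predicting spam when w*x + b >= 0. The sketch below computes the training accuracy of such a classifier on Kim's counts. The fitted parameters are not given here, so `w` and `b` are hypothetical arguments, and any value passed in is an assumption rather than the trained model.

```python
import math

spam = [0, 3, 4, 8, 9, 13, 21]
no_spam = [0, 0, 1, 2, 2, 5, 6]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def training_accuracy(w, b, threshold=0.5):
    """Fraction of the 14 training e-mails classified correctly."""
    # spam is correct when the predicted probability reaches the threshold
    correct = sum(sigmoid(w * x + b) >= threshold for x in spam)
    # no-spam is correct when the predicted probability stays below it
    correct += sum(sigmoid(w * x + b) < threshold for x in no_spam)
    return correct / (len(spam) + len(no_spam))
```

For example, the hypothetical parameters w = 1.0 and b = -2.5 place the decision boundary between 2 and 3 occurrences of 'x' and classify 11 of the 14 messages correctly.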
c) Can an SVM be used to solve this problem? Explain. If you use one, what is the training error
rate?
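The two classes overlap (a count of 0 occurs in both lists), so no single cut-off on the count separates them and a hard-margin linear SVM has no feasible solution. With one feature, a soft-margin linear SVM still reduces to a cut-off on x. The sketch below enumerates the cut-offs for classifiers that predict spam above the cut-off and reports the smallest number of training mistakes. This is a lower bound for that family of classifiers, not necessarily the error a particular SVM returns, since that depends on the penalty parameter C.

```python
spam = [0, 3, 4, 8, 9, 13, 21]
no_spam = [0, 0, 1, 2, 2, 5, 6]

def errors(t):
    """Training mistakes when predicting spam for counts strictly above t."""
    missed_spam = sum(x <= t for x in spam)
    false_alarms = sum(x > t for x in no_spam)
    return missed_spam + false_alarms

# Candidate cut-offs: one below all data, then each observed count.
candidates = [-1] + sorted(set(spam + no_spam))
best_t = min(candidates, key=errors)
min_error_rate = errors(best_t) / (len(spam) + len(no_spam))
```

On this data the best cut-offs leave 3 of the 14 messages misclassified.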
Question 4 [5 marks]
a) While minimizing a convex objective function using gradient descent, the algorithm does
not converge even after 10,000 iterations. Give two possible reasons and a possible
solution for each.
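A step size that is too large is one commonly cited cause of non-convergence on a convex objective. The sketch below illustrates it on f(x) = x^2, an objective chosen here purely for illustration; the function name and default values are assumptions.

```python
def gradient_descent(step_size, x0=1.0, iterations=50):
    """Minimise f(x) = x**2 (gradient 2*x), starting from x0."""
    x = x0
    for _ in range(iterations):
        x -= step_size * 2 * x  # each update multiplies x by (1 - 2*step_size)
    return x
```

Each update scales x by (1 - 2*step_size), so a step size of 0.1 shrinks x towards the minimum at 0, while a step size of 1.1 makes the magnitude of x grow at every iteration. Reducing the step size restores convergence in this example.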
True: Each point is its own neighbor, so 1-NN classifier achieves perfect classification
on training data.
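The claim can be checked with a minimal 1-NN sketch, assuming Euclidean distance and no duplicate training points with conflicting labels; the function name is hypothetical.

```python
def one_nn_predict(train_x, train_y, query):
    """Return the label of the training point closest to query."""
    best = min(
        range(len(train_x)),
        # squared Euclidean distance gives the same ordering as the distance
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]
```

When the query is itself a training point, its distance to itself is 0, which is the minimum possible, so the classifier returns that point's own label.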
c) We consider the following models of logistic regression for a binary classification with
a sigmoid function
$$ g(z) = \frac{1}{1 + e^{-z}} $$
Does it matter how the third example is labeled in Model 1? i.e., would the learned value
of w = (w1, w2) be different if we change the label of the third example to -1? Does it
matter in Model 2? Briefly explain your answer. (Hint: think of the decision boundary on
2D plane.)
It does not matter in Model 1 because x^{(3)} = (0, 0) makes w1x1 + w2x2 always zero, so
the likelihood of the third example does not depend on the value of w, whatever its label.
But it does matter in Model 2.
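The argument can be checked numerically. The model definitions are not reproduced here, so the sketch assumes that Model 1 is g(w1x1 + w2x2) and that Model 2 adds a bias term w0; both function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_model1(w1, w2, x1, x2):
    # assumed Model 1: no bias term
    return sigmoid(w1 * x1 + w2 * x2)

def p_model2(w0, w1, w2, x1, x2):
    # assumed Model 2: same as Model 1 plus a bias term w0
    return sigmoid(w0 + w1 * x1 + w2 * x2)
```

At x = (0, 0), `p_model1` returns 0.5 for every choice of w1 and w2, so the third example contributes the same factor to the likelihood under either label. Under the assumed Model 2 the prediction at the origin is g(w0), which does change with the parameters, so the label of the third example affects the fit.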
Good Luck
Dr. Hanaa Bayomi