Sheet ML

Course : Machine Learning

Faculty of Computers and Information


Minia University
------------------------------------------------------------------------------------------------------------------------------------------------------------

Answer the following questions:

1. List the major tasks in data preprocessing.

2. About 2/3 of your email is spam, so you downloaded an open-source spam filter
based on word occurrences that uses the Naive Bayes classifier. Assume you
collected the following regular and spam emails to train the classifier, and that
only three words are informative for this classification, i.e., each email is
represented as a 3-dimensional binary vector whose components indicate
whether the respective word is contained in the email.

   (a) You find that the spam filter uses a prior p(spam) = 0.1. Explain (in one sentence) why
       this might be sensible.
   (b) Based on the prior and conditional probabilities above, give the model probability
       P(spam|s) that the sentence s = "money for psychology study" is spam.
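
The posterior asked for in question 2 follows from Bayes' rule under the Naive Bayes independence assumption. The block below is a minimal sketch, assuming a Bernoulli model in which both present and absent words contribute to the likelihood. The function name and all conditional probabilities are hypothetical placeholders, since the training table from the sheet is not reproduced here.

```python
from math import prod

def bernoulli_nb_posterior(x, prior_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | x) for a binary word-presence vector x."""
    def likelihood(p_word):
        # Bernoulli likelihood: p if the word is present, (1 - p) if it is absent.
        return prod(p if xi else 1.0 - p for xi, p in zip(x, p_word))

    joint_spam = prior_spam * likelihood(p_word_given_spam)
    joint_ham = (1.0 - prior_spam) * likelihood(p_word_given_ham)
    # Normalize over the two classes to obtain the posterior.
    return joint_spam / (joint_spam + joint_ham)

# HYPOTHETICAL conditional probabilities, NOT the values from the sheet's table.
p_spam_words = [0.5, 0.5, 0.5]
p_ham_words = [0.5, 0.5, 0.5]
posterior = bernoulli_nb_posterior([1, 0, 1], 0.1, p_spam_words, p_ham_words)
```

With identical likelihoods for both classes, as in these placeholder values, the words carry no information and the posterior equals the prior. Substituting the probabilities estimated from the actual training emails gives the answer to the question.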

3. What is the biggest advantage of decision trees when compared to logistic regression
classifiers?
4. What is the biggest weakness of decision trees compared to logistic regression
classifiers?
5. We are given a set of two-dimensional inputs and their corresponding outputs,
{(x_{i,1}, x_{i,2}, y_i)}. We would like to use the following regression model to predict y:

6. Construct the Decision Tree for the following dataset:

7. Discuss the clustering methods.


8. Divide the following data into two clusters using the k-means
clustering algorithm.
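
The k-means procedure in question 8 alternates between assigning points to the nearest centroid and recomputing centroids. The block below is a minimal sketch of that loop (Lloyd's algorithm) with k = 2. The points and the choice of initial centroids are hypothetical, since the dataset from the sheet is not reproduced here.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm; `centroids` holds the initial centers."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# HYPOTHETICAL data, NOT the points from the sheet.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
```

The result depends on the initial centroids; here the first point of each visually separate group is used as a starting center, which is an assumption for illustration only.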

9. Draw the full decision tree that would be learned for this data.
