
Weekly Quiz 2 (ML)

Type: Graded Quiz Attempts: 2/2 Questions: 9


Time: 20m Scoring Policy: Highest Score

Question No: 1
Which of the following is/are true about weak learners used in an ensemble model?
1. They have low variance and they don’t usually over-fit
2. They have high bias, so they cannot solve hard learning problems
3. They have high variance and they don’t usually over-fit

1 and 2

2 and 3

None of these

1 and 3

Note: A weak learner is only reliable on a particular part of the problem, so it usually does not overfit. In other words, weak learners have low variance and high bias.
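For intuition, a depth-one decision tree (a decision stump) is a common weak learner: a single split gives high bias and low variance, so it rarely overfits. A minimal R sketch, where the data frame df and its Class column are hypothetical names used only for illustration:

library(rpart)

# A decision stump: a CART tree limited to a single split (high bias, low variance)
stump <- rpart(Class ~ ., data = df, method = "class",
               control = rpart.control(maxdepth = 1))

# Its predictions are crude but stable, so it is unlikely to overfit
head(predict(stump, df, type = "class"))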

Question No: 2
In an election, N candidates are competing against each other and people are voting for one of the candidates. Voters don't communicate with each other while casting their votes.
Which of the following ensemble methods works in a similar way to the above election procedure?

Hint: The voters are like the base models of an ensemble method

A Or B

None of these

Bagging

Boosting

Note: In bagging, you take multiple random samples (with replacement) from the population, build a CART model on each sample, and then count the responses (a majority vote) to arrive at the predictive score. A small R sketch of this idea follows.
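A minimal sketch of bagging by hand, assuming a data frame df with a factor outcome Class (both hypothetical names); rpart supplies the CART base learner:

library(rpart)

set.seed(1)
n_trees <- 25
preds <- matrix(NA_character_, nrow = nrow(df), ncol = n_trees)

for (b in 1:n_trees) {
  boot_idx <- sample(nrow(df), replace = TRUE)                        # bootstrap sample of the rows
  tree <- rpart(Class ~ ., data = df[boot_idx, ], method = "class")   # CART on the resample
  preds[, b] <- as.character(predict(tree, df, type = "class"))       # each tree casts a "vote"
}

# Majority vote across the trees, like counting independent votes in an election
bagged_vote <- apply(preds, 1, function(row) names(which.max(table(row))))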

Question No: 3
Models with High Bias and Low Variance are inconsistent but accurate on average

TRUE

FALSE

Note: High bias, low variance algorithms train models that are consistent, but inaccurate on
average. High variance, low bias algorithms train models that are accurate on average, but
inconsistent.

Question No: 4
If the model has a very high bias, the model is more likely to

Underfit

Overfit

Note: In machine learning terminology, underfitting means that a model is too general, leading to
high bias.

Question No: 5
What type of boosting involves the following three elements?

1. A loss function to be optimized.


2. A weak learner to make predictions.
3. An additive model to add weak learners to minimize the loss function.

Gradient Boosting

Adaptive Boosting

Extreme Gradient Boosting

Note: Gradient boosting for regression builds an additive model in a forward, stage-wise fashion; it allows the optimization of arbitrary differentiable loss functions. The objective is to minimize the loss of the model by adding weak learners using a gradient-descent-like procedure.
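A minimal R sketch of the three elements using the gbm package; the data frame df and the response y are hypothetical names used only for illustration:

library(gbm)

set.seed(1)
fit <- gbm(y ~ ., data = df,
           distribution = "gaussian",   # 1. the loss function to be optimized (squared error)
           n.trees = 500,               # 3. number of weak learners added to the additive model
           interaction.depth = 2,       # 2. shallow trees act as the weak learners
           shrinkage = 0.05)            # learning rate applied to each additive step

pred <- predict(fit, newdata = df, n.trees = 500)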

Question No: 6
Let us look at the following case and answer the question below.
Consider a data set with 100,000 observations. This data set consists of candidates who applied for an internship at Harvard. Harvard is well known for its extremely low acceptance rate. The dependent variable represents whether a candidate has been shortlisted (1) or not shortlisted (0). After analysing the data, it was found that ~98% did not get shortlisted and only ~2% got shortlisted.
Which of the following methods can be used before starting to work on the data given in the above case?

No changes required

SMOTE

Under Sampling
Note: Rather than replicating the minority observations (e.g., defaulters, fraudsters, churners), the Synthetic Minority Oversampling Technique (SMOTE) works by creating synthetic observations based upon the existing minority observations.
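A conceptual R sketch of how one synthetic observation is formed (not the library implementation): a new point is interpolated between a minority observation and one of its k nearest minority neighbours. The toy numbers below are made up for illustration:

# Two-feature toy minority class (made-up values)
minority <- matrix(c(1.0, 2.0,
                     1.2, 1.8,
                     0.9, 2.3), ncol = 2, byrow = TRUE)

x         <- minority[1, ]               # a real minority observation
neighbour <- minority[2, ]               # one of its nearest minority neighbours
gap       <- runif(1)                    # random position along the segment between them
synthetic <- x + gap * (neighbour - x)   # the new synthetic minority observation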

Question No: 7
In terms of the bias-variance trade-off, which of the following is/are substantially more harmful to
the test error than the training error?

Risk

Bias

Variance

Loss

Note: Variance is error arising from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs; this is known as overfitting. When you then run the model on test data to check its performance, performance drops because of that overfitting.
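A small R illustration of that train/test gap, assuming a data frame df with a factor outcome Class (hypothetical names): an unpruned, fully grown tree chases the training noise, so its test accuracy typically falls well below its training accuracy:

library(rpart)

set.seed(2)
idx   <- sample(nrow(df), size = floor(0.7 * nrow(df)))
train <- df[idx, ]
test  <- df[-idx, ]

# Fully grown tree: no complexity penalty, splits allowed down to 2 observations
deep_tree <- rpart(Class ~ ., data = train, method = "class",
                   control = rpart.control(cp = 0, minsplit = 2))

mean(predict(deep_tree, train, type = "class") == train$Class)  # training accuracy (near 1)
mean(predict(deep_tree, test,  type = "class") == test$Class)   # test accuracy (usually much lower)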

Question No: 8
How should the following code snippet be interpreted?

balanced.gd <- SMOTE(Class ~., smote.train, perc.over = 4800, k = 5, perc.under = 1000)

We are adding 10 for every 100 of the minority class sample.

We are subtracting 48 for every 100 of the minority class sample

We are adding 48 for every 100 of the minority class sample

We are subtracting 10 for every 100 of the minority class sample.

Note: SMOTE is an R algorithm for unbalanced classification problems. Alternatively, it can also run a classification algorithm on the new data set and return the resulting model. perc.over: a number that drives how many extra cases from the minority class are generated (oversampling). perc.under: a number that drives how many extra cases from the majority class are selected for each case generated from the minority class (undersampling).
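A hedged sketch of running that snippet, assuming the SMOTE implementation from the DMwR package and a hypothetical training frame smote.train with a factor column Class. Reading the parameters with those semantics: perc.over = 4800 means 4800/100 = 48 synthetic cases are added for every minority case, and perc.under = 1000 means 1000/100 = 10 majority cases are kept for each synthetic case generated:

library(DMwR)  # assumed source of SMOTE(); the package has since been archived on CRAN

table(smote.train$Class)                 # class counts before resampling

balanced.gd <- SMOTE(Class ~ ., smote.train,
                     perc.over  = 4800,  # 48 synthetic minority cases per original minority case
                     k          = 5,     # nearest neighbours used for interpolation
                     perc.under = 1000)  # 10 majority cases kept per synthetic case created

table(balanced.gd$Class)                 # class counts after resampling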

Question No: 9
What kind of boosting is shown in the following diagram? (The diagram is not reproduced in this text copy.)

Adaptive Boosting

Extreme Gradient Boosting

Gradient Boosting

Note: In the diagram, a base model is created, a weak classifier is obtained, and the population distribution is updated for the next step; the new population distribution is then used to find the next learner. This is the process followed by the gradient boosting methodology.

