ML Intw - Points To Remember

Random forest and gradient boosting are both tree-based algorithms. Random forest uses bagging to make predictions by dividing data into samples and building models on each, then combining results through voting or averaging. Gradient boosting uses boosting, weighting misclassified predictions more so they can be corrected sequentially until a stopping point. Random forest reduces variance while gradient boosting reduces both bias and variance. Type I error rejects a true null hypothesis as false, while Type II error accepts a false null hypothesis as true. Stratified sampling maintains class proportions across samples for classification problems, unlike random sampling.


1 - In stratified cross-validation, the split preserves the ratio of the categories in both the training and validation datasets.
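For instance, scikit-learn's StratifiedKFold implements this kind of split. The snippet below is a minimal sketch; the array sizes and the 70/30 class ratio are made up purely for illustration:

    import numpy as np
    from sklearn.model_selection import StratifiedKFold

    X = np.arange(40).reshape(20, 2)      # 20 samples, 2 features (toy data)
    y = np.array([0] * 14 + [1] * 6)      # imbalanced target: 70% / 30%

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, val_idx in skf.split(X, y):
        # every fold keeps roughly the same 70/30 class ratio
        print(np.bincount(y[train_idx]), np.bincount(y[val_idx]))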

2 - Q21. Both being tree-based algorithms, how is random forest different from the gradient boosting algorithm (GBM)?
Answer: The fundamental difference is that random forest uses the bagging technique to make predictions, while GBM uses the boosting technique.

In bagging, the data set is divided into n samples using randomized (bootstrap) sampling. Then, using a single learning algorithm, a model is built on each sample. Finally, the resulting predictions are combined using voting or averaging. Bagging is done in parallel. In boosting, after the first round of predictions, the algorithm weighs misclassified predictions higher, so that they can be corrected in the succeeding round. This sequential process of giving higher weights to misclassified predictions continues until a stopping criterion is reached.

Random forest improves model accuracy mainly by reducing variance. The trees are grown to be uncorrelated in order to maximize the decrease in variance. GBM, on the other hand, improves accuracy by reducing both bias and variance in the model.
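As a rough illustration (not part of the original notes), both ensembles are available in scikit-learn. The sketch below trains them on a synthetic dataset just to show the two APIs side by side:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Bagging: trees are built independently (parallelizable) on bootstrap
    # samples, and their votes are averaged.
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    # Boosting: trees are built sequentially, each one correcting the
    # errors of the ensemble built so far.
    gbm = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    print("random forest    :", rf.score(X_te, y_te))
    print("gradient boosting:", gbm.score(X_te, y_te))

Note that the bagged trees can be fit in parallel (RandomForestClassifier accepts an n_jobs argument), whereas the boosted trees must be fit one after another.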

3 - Q30. What do you understand by Type I vs Type II error?
Answer: A Type I error is committed when the null hypothesis is true and we reject it; it is also known as a 'False Positive'. A Type II error is committed when the null hypothesis is false and we accept it; it is also known as a 'False Negative'.
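If a "positive" prediction is taken to mean rejecting the null hypothesis, the two error types map onto the false-positive and false-negative cells of a confusion matrix. A small sketch with invented labels, just to make the mapping concrete:

    from sklearn.metrics import confusion_matrix

    y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]   # 1 = null hypothesis is actually false
    y_pred = [0, 1, 0, 0, 1, 0, 1, 0, 1, 0]   # 1 = we rejected the null hypothesis

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print("Type I errors (false positives):", fp)   # rejected a true null
    print("Type II errors (false negatives):", fn)  # accepted a false null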

4 - In the case of a classification problem, we should always use stratified sampling instead of random sampling. Random sampling doesn't take the proportion of the target classes into consideration. In contrast, stratified sampling maintains the distribution of the target variable in the resulting samples as well.
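A minimal sketch of the difference, assuming scikit-learn's train_test_split and a made-up 90/10 imbalanced label vector:

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.arange(200).reshape(100, 2)
    y = np.array([0] * 90 + [1] * 10)          # 90/10 imbalance (toy data)

    # Plain random split: the minority class share can drift in either partition.
    _, _, _, y_rand = train_test_split(X, y, test_size=0.3, random_state=0)

    # Stratified split: the 90/10 ratio is preserved in both partitions.
    _, _, _, y_strat = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

    print("random    :", np.bincount(y_rand))
    print("stratified:", np.bincount(y_strat))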

5-
