Supervised ML
Machine Learning
Source: https://towardsdatascience.com/understanding-random-forest-58381e0602d2
Random Forest
● The most informative attributes are placed closest to the
root (an informativeness measure decides the split order)
○ entropy
○ gini (an alternative to entropy that is
computationally simpler, since it avoids the logarithm)
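The two measures can be sketched in a few lines; the class-probability vectors below are illustrative assumptions, not data from the slides.

```python
import math

def entropy(probs):
    """Shannon entropy: -sum(p * log2(p)); higher means more impure."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity: 1 - sum(p^2); no logarithm, so cheaper to compute."""
    return 1 - sum(p * p for p in probs)

pure = [1.0, 0.0]    # all samples in one class
mixed = [0.5, 0.5]   # perfectly mixed classes

print(entropy(pure), gini(pure))    # 0.0 0.0  (a pure node)
print(entropy(mixed), gini(mixed))  # 1.0 0.5  (maximal impurity)
```

Both measures are zero on pure nodes and maximal on evenly mixed ones, which is why they usually pick similar splits.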
Random Forest
● Every ML algorithm has a set of hyperparameters used to
tune the model
○ criterion ['gini', 'entropy']
○ n_estimators n – number of trees (default 100)
○ max_features ['auto', 'sqrt', 'log2'] – maximal number of
attributes considered when splitting a node
('auto' is deprecated in recent scikit-learn)
○ max_depth n – maximum depth of a tree (default None)
○ bootstrap [True, False] – whether bootstrap samples
(drawn with replacement) are used to build the trees
○ Several others are worth studying
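A minimal sketch of setting these hyperparameters in scikit-learn; the toy dataset from `make_classification` is an assumption for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data: 200 samples, 8 features (illustrative only)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

clf = RandomForestClassifier(
    criterion="gini",     # or "entropy"
    n_estimators=100,     # number of trees (the default)
    max_features="sqrt",  # attributes considered at each split
    max_depth=None,       # grow each tree until leaves are pure
    bootstrap=True,       # sample with replacement for each tree
    random_state=0,
)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy
```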
Random Forest
● How to choose the best hyperparameters
○ sklearn.model_selection.GridSearchCV
○ Runs a given estimator over combinations of
hyperparameters and returns the best combination
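The search can be sketched as follows; the grid values and toy data are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Every combination in this grid is trained and cross-validated
param_grid = {
    "criterion": ["gini", "entropy"],
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation per combination
)
search.fit(X, y)
print(search.best_params_)  # the winning combination
print(search.best_score_)   # its mean cross-validated accuracy
```

Note that the grid grows multiplicatively (here 2 × 2 × 2 = 8 combinations, each fit 3 times), so large grids get expensive quickly.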