XGBoost Reference
Previously, you learned about gradient boosting machine models and studied how to build and
tune them with XGBoost’s scikit-learn API. This reading is a quick-reference guide to help you
when you’re building XGBoost models of your own. It includes information on the following
components:
● Import statements
● Hyperparameters
Import statements
The following are some of the most commonly used import statements for gradient boosting
models using the XGBoost library together with scikit-learn.
Models
For classification tasks:
from xgboost import XGBClassifier
Evaluation metrics
For evaluating classification models:
from sklearn.metrics import log_loss
log_loss(y_true, y_pred) computes log loss, also known as logistic loss or cross-entropy loss.
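As a minimal sketch of how these imports fit together (the dataset here is synthetic and the hyperparameter values are illustrative assumptions, not recommendations), you might build and evaluate a classifier like this:

from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-in data; replace with your own feature matrix X and labels y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Instantiate the classifier with a few of the hyperparameters covered below.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1, random_state=0)
model.fit(X_train, y_train)

# log_loss expects predicted probabilities, not hard class labels.
print(log_loss(y_test, model.predict_proba(X_test)))

Later sketches in this reading reuse the X_train, X_test, y_train, and y_test names from this example.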
Hyperparameters
The following are some of the most important hyperparameters for gradient boosting
machine classification models built with the XGBoost library. These are the hyperparameters
that data professionals typically reach for first, because they are among the most intuitive and
they control the model at different levels through a variety of mechanisms.
n_estimators
What it does: Specifies the number of boosted trees (base learners) to grow in the ensemble
Input type: int
Default value: 100
Considerations:
A typical range is 50–500. Consider how much data you have, how deep the trees are allowed
to grow, and how many samples are drawn from the overall data to grow each tree (you
generally need more trees if they’re shallow, and more trees if the sample used for each tree
represents just a small fraction of your overall data). For an extreme but illustrative example, if
you have a dataset of 10,000 samples and each tree is grown on only 20 of them, you'll need
more trees than if you gave each tree 5,000 samples. Also keep in mind that, unlike random
forest, which can grow base learners in parallel, gradient boosting grows base learners
sequentially, so training time increases with the number of trees.
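As a rough sketch of weighing these factors (reusing the X_train and y_train arrays from the earlier example, with an illustrative max_depth), you could compare several ensemble sizes with cross-validation before committing to one:

from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Larger ensembles take longer to train because boosting grows trees sequentially.
for n in [50, 100, 300, 500]:
    model = XGBClassifier(n_estimators=n, max_depth=3, random_state=0)
    scores = cross_val_score(model, X_train, y_train, scoring='neg_log_loss', cv=5)
    print(n, -scores.mean())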
max_depth
What it does: Sets the maximum depth to which each base learner tree is allowed to grow
Input type: int
Default value: 6
Considerations: Controls the complexity of the model. Gradient boosting typically uses weak
learners: shallow trees, in the extreme case single-split “decision stumps.” Restricting tree
depth can reduce training time and serving latency as well as help prevent overfitting.
Consider values in the range 2–6.
min_child_weight
What it does: Sets the minimum sum of instance weights a node must have in order to split further
Input type: float
Default value: 1
Considerations: Higher values prevent trees from splitting further, while lower values allow
them to keep splitting. If your model is underfitting, lower this value to allow for more
complexity. Conversely, if your model is overfitting, raise it to keep your trees from becoming
too finely divided.
learning_rate
What it does: Shrinks the contribution of each consecutive base learner (also known as eta or shrinkage)
Input type: float
Default value: 0.3
Considerations: Valid values fall in the interval (0, 1]; typical values range from 0.01 to 0.3.
Lower values give less weight to each consecutive base learner, so consider how many trees
are in your ensemble: lower learning rates typically benefit from more trees.
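As an illustration of that trade-off (again reusing the assumed X_train/X_test split; the values themselves are arbitrary), a small learning rate is usually paired with more trees, while a larger learning rate can get away with fewer:

from xgboost import XGBClassifier

# Two roughly comparable configurations: many small steps vs. fewer large steps.
slow_and_steady = XGBClassifier(learning_rate=0.05, n_estimators=500, random_state=0)
fast_and_coarse = XGBClassifier(learning_rate=0.3, n_estimators=100, random_state=0)

for model in (slow_and_steady, fast_and_coarse):
    model.fit(X_train, y_train)
    print(model.get_params()['learning_rate'], model.score(X_test, y_test))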
colsample_bytree*
What it does: Specifies the fraction of features (columns) randomly sampled to grow each tree
Input type: float
Default value: 1.0 (uses all features)
Considerations: Adds randomness to the model to make it robust to noise. Consider how
many features the dataset has and how many trees will be grown. The fewer features sampled
per tree, the more base learners may be needed; small colsample_bytree values on datasets
with many features mean more trees in the ensemble are built on uninformative features.
subsample*
What it does: Specifies the fraction of training observations (rows) randomly sampled to grow each tree
Input type: float
Default value: 1.0 (uses all observations)
*Note that colsample_bytree and subsample were not used in the Tune a GBM model video and
its accompanying notebook; they are included here so you can use these hyperparameters in
your own work. Remember that training each base learner on a fraction of the data can
sometimes improve model predictions and will almost always reduce training time.
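To put the hyperparameters in this guide together, the sketch below runs a cross-validated grid search over all of them (the candidate values and the X_train/y_train split are illustrative assumptions, not recommendations):

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

param_grid = {
    'n_estimators': [100, 300],
    'max_depth': [3, 5],
    'min_child_weight': [1, 5],
    'learning_rate': [0.05, 0.1, 0.3],
    'colsample_bytree': [0.7, 1.0],
    'subsample': [0.7, 1.0],
}

# Score candidates with log loss (scikit-learn maximizes, so the metric is negated).
search = GridSearchCV(
    estimator=XGBClassifier(random_state=0),
    param_grid=param_grid,
    scoring='neg_log_loss',
    cv=5,
)
search.fit(X_train, y_train)
print(search.best_params_)
print(-search.best_score_)

Grids multiply quickly: this one already trains 2 × 2 × 2 × 3 × 2 × 2 × 5 = 480 models, so keep candidate lists short or consider a randomized search for larger spaces.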
Key takeaways
When building machine learning models, it’s essential to have the right tools and understand
how to use them. Although there are numerous other hyperparameters to explore, the ones in
this reference guide are among the most important. Be inquisitive and try different
approaches. Discovering ways to improve your model is a lot of fun!