XGBoost Algorithm

The document provides an overview of machine learning techniques, focusing on tree algorithms, bagging, random forests, and boosting, particularly the XGBoost algorithm. It details the parameters of XGBoost and includes a practical example using the Pima Indians onset of diabetes dataset. The document emphasizes the mathematical foundations and applications of XGBoost in supervised learning tasks.


Presented By:

Mohanad ALMUZIAN
- Machine Learning Types
- Evolution of Tree Algorithms
- Bagging
- Random Forest
- Boosting
- Types of Boosting
- XGBoost Algorithm
- XGBoost Parameters
- Example on XGBoost
- Mathematics of XGBoost Algorithm
- Practical Part
A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an
attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.
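The structure described above can be sketched in plain Python as nested conditionals. This is only an illustration: the attributes, thresholds, and labels below are hypothetical, not taken from a fitted model.

```python
# A minimal hand-written decision tree: each internal node tests an
# attribute, each branch is an outcome of that test, and each leaf
# holds a class label. Thresholds and labels are made up for illustration.
def classify(glucose, bmi):
    if glucose < 120:      # internal node: test on an attribute
        return 0           # leaf node: class label
    else:                  # branch: the other outcome of the test
        if bmi < 30:       # second internal node
            return 0       # leaf node
        else:
            return 1       # leaf node

print(classify(100, 35))
print(classify(150, 35))
```

A learned tree has the same shape; training simply chooses which attribute and threshold to test at each node.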
Bagging, also known as Bootstrap Aggregation, creates random samples (drawn with replacement) from
the main data set and then trains parallel models on each sample.
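The sampling step can be sketched with the standard-library `random` module; the toy data and number of samples here are arbitrary.

```python
import random

# Bootstrap aggregation sketch: draw random samples *with replacement*
# from the main data set; each sample would then train one model in parallel.
random.seed(42)

data = [1, 2, 3, 4, 5, 6, 7, 8]

def bootstrap_sample(data):
    # Same size as the original, drawn with replacement, so some points
    # appear more than once and some are left out entirely.
    return [random.choice(data) for _ in data]

samples = [bootstrap_sample(data) for _ in range(3)]
for s in samples:
    print(s)
```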
Random forest uses a technique called bagging to build full
decision trees in parallel from random bootstrap samples of
the data set. The final prediction is an average of all of the
decision tree predictions.
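For a regression forest, the averaging step described above is just the mean of the individual tree outputs. The tree predictions below are hypothetical numbers used only to show the calculation.

```python
# Random-forest-style final prediction: average the individual tree outputs.
# One hypothetical prediction per decision tree in the forest.
tree_predictions = [0.82, 0.75, 0.90, 0.61]

final_prediction = sum(tree_predictions) / len(tree_predictions)
print(final_prediction)  # mean of the four tree predictions
```

(For classification, the trees would instead vote on a class label.)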
Boosting is an ensemble modelling technique that attempts to build a strong classifier from a
number of weak classifiers.
XGBoost, or eXtreme Gradient Boosting, is a gradient boosting algorithm under ensemble learning.
It is popular for supervised learning tasks, such as regression and classification.
XGBoost Parameters
- General parameters

- Booster parameters:

1. eta
2. max_depth
3. max_leaf_nodes
4. subsample
5. colsample_bytree

- Learning task parameters
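The booster parameters above are typically passed to the xgboost library as a key/value dictionary. The sketch below uses a plain dict so it runs without the library installed; the values are hypothetical defaults for illustration, not tuning recommendations.

```python
# Sketch of an XGBoost parameter configuration. Values are illustrative only.
params = {
    # General parameter: which booster to use
    "booster": "gbtree",
    # Booster parameters
    "eta": 0.1,               # learning rate: shrinks each tree's contribution
    "max_depth": 6,           # maximum depth of each tree
    "max_leaf_nodes": 64,     # cap on terminal nodes per tree
    "subsample": 0.8,         # fraction of rows sampled per tree
    "colsample_bytree": 0.8,  # fraction of columns sampled per tree
    # Learning task parameter
    "objective": "binary:logistic",  # e.g. binary classification
}
print(params["eta"])
```

In a typical workflow, a dict like this would be handed to the library's training call along with the data.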


Step 3: Fit a Tree to the Residuals

XGBoost fits a simple decision tree (weak learner) to the residuals. Let’s assume the decision tree looks like this:

        x < 2.5
        /     \
      Yes      No
     -1.2    1.35

This tree attempts to predict the residuals. For instance:

• If x < 2.5, the tree predicts a correction of -1.2.

• If x ≥ 2.5, the tree predicts a correction of +1.35.
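Applying those corrections is a simple additive update: the new prediction is the old prediction plus the correction scaled by the learning rate (eta). The corrections -1.2 and +1.35 come from the tree above; the base prediction and eta below are hypothetical values for illustration.

```python
# Gradient-boosting update sketch: F1(x) = F0(x) + eta * h1(x),
# where h1 is the tree fitted to the residuals above.
ETA = 0.3   # hypothetical learning rate (eta)
F0 = 0.5    # hypothetical initial prediction for every point

def tree_correction(x):
    # The residual tree from the example: -1.2 if x < 2.5, else +1.35.
    return -1.2 if x < 2.5 else 1.35

def updated_prediction(x):
    return F0 + ETA * tree_correction(x)

print(updated_prediction(1.0))  # 0.5 + 0.3 * (-1.2)
print(updated_prediction(4.0))  # 0.5 + 0.3 * (+1.35)
```

XGBoost repeats this step, fitting each new tree to the residuals of the current predictions.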


Problem Description: Predict Onset of Diabetes

In this example we are going to use the Pima Indians onset of diabetes dataset. This dataset
comprises 8 input variables that describe medical details of patients and one output variable
indicating whether the patient will have an onset of diabetes within 5 years.

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
6 148 72 35 0 33.6 0.627 50 1
1 85 66 29 0 26.6 0.351 31 0
8 183 64 0 0 23.3 0.672 32 1
1 89 66 23 94 28.1 0.167 21 0
0 137 40 35 168 43.1 2.288 33 1
5 116 74 0 0 25.6 0.201 30 0
3 78 50 32 88 31 0.248 26 1
10 115 0 0 0 35.3 0.134 29 0
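A first practical step is loading the data and splitting it into the 8 input variables (X) and the output variable (y). The sketch below uses the standard-library `csv` module on the first few rows from the table above; in practice the full CSV file would be read from disk instead.

```python
import csv
import io

# First rows of the Pima Indians dataset, copied from the table above.
raw = """6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1"""

rows = [[float(v) for v in row] for row in csv.reader(io.StringIO(raw))]
X = [row[:8] for row in rows]      # 8 medical input variables per patient
y = [int(row[8]) for row in rows]  # onset of diabetes within 5 years (0/1)

print(len(X), len(X[0]))  # number of rows, number of features
print(y)
```

From here, X and y would be split into train and test sets and passed to an XGBoost classifier.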