Hyper Parameter Optimization

The document discusses hyperparameter optimization in decision trees, highlighting the importance of tuning hyperparameters to improve model performance, reduce overfitting, and enhance generalization. It outlines key hyperparameters such as criteria, max_depth, min_samples_split, and min_samples_leaf, along with their roles in controlling the learning process. Additionally, it provides a practical example using Python code to implement hyperparameter tuning with GridSearchCV on the Iris dataset.

Uploaded by

chandanaramesh2711


Hyperparameter optimization
Decision Trees
Hyperparameter
• A hyperparameter is a parameter that is set before the learning
process begins and that controls aspects of that process.
• Examples of hyperparameters include the learning rate,
regularization strength, and the choice of optimization algorithm.
• Because hyperparameters govern how the model learns, the values
chosen for them can affect the model's performance and behavior.
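
To make the distinction concrete, here is a minimal sketch, assuming scikit-learn and its built-in Iris data. The specific values (max_depth=3, min_samples_leaf=2) are illustrative choices, not recommendations. It contrasts what is set before training with what the tree learns from the data.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us before fit() is called.
clf = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2, random_state=0)
hyperparams = clf.get_params()
print("max_depth set before training:", hyperparams["max_depth"])

# Learned structure: only exists after fit() has seen the data.
clf.fit(X, y)
print("nodes learned from data:", clf.tree_.node_count)
print("actual depth reached:", clf.get_depth())
```

The split thresholds and the number of nodes are learned, while max_depth only caps how far that learning may go.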
Hyperparameter tuning
• When training machine learning models, different datasets and
models call for different sets of hyperparameters.
• One way to determine the hyperparameters is to run multiple
experiments and choose the set of hyperparameters that best suits
our model. This process of selecting the optimal hyperparameters
is called hyperparameter tuning.
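
The "multiple experiments" idea can be sketched as a plain loop, assuming scikit-learn, the Iris data, and 5-fold cross-validation as the way each experiment is scored. The three candidate settings are hypothetical and were picked only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical candidate settings, chosen only for illustration.
candidates = [
    {"max_depth": 2, "min_samples_leaf": 1},
    {"max_depth": 4, "min_samples_leaf": 1},
    {"max_depth": None, "min_samples_leaf": 5},
]

results = []
for params in candidates:
    model = DecisionTreeClassifier(random_state=0, **params)
    # One "experiment": mean accuracy over 5 cross-validation folds.
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results.append((scores.mean(), params))
    print(f"{params} -> mean CV accuracy {scores.mean():.3f}")

# Keep the setting with the highest mean score.
best_score, best_params = max(results, key=lambda r: r[0])
print("Selected:", best_params)
```

Tools such as GridSearchCV automate this same loop over every combination in a grid.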
Hyperparameter tuning

Tuning hyperparameters is crucial for decision trees for the following reasons:

• Improved Performance: Untuned hyperparameters can lead to
suboptimal decision trees. Tuning allows you to find the settings that best
suit your data, resulting in a model that captures the underlying
patterns more effectively and delivers better predictions.
• Reduced Overfitting: Decision trees are prone to overfitting, where
the model memorizes the training data's noise instead of learning
generalizable patterns. Hyperparameter tuning helps prevent this by
controlling the tree's complexity (e.g., with max_depth) and
preventing excessive granularity (e.g., with min_samples_split).
• Enhanced Generalization: The goal is for the decision tree to perform well
on unseen data. Tuning hyperparameters helps achieve this by striking a
balance between a tree that is too simple and one that is too complex. A
well-tuned tree can capture the important trends in the data without
overfitting to the specifics of the training set, leading to better
performance on new data.

• Addressing Class Imbalance: Class imbalance occurs when one class has
significantly fewer samples than others. In scikit-learn, weighting the
classes (for example with class_weight) and tuning hyperparameters like
min_weight_fraction_leaf, which sets the minimum weighted fraction of
samples required at a leaf, helps keep the tree from becoming biased
towards the majority class, leading to more accurate predictions for the
minority class.
• Tailoring the Model to Specific Tasks: Different tasks might require
different decision tree behaviors. Hyperparameter tuning allows you
to customize the tree's structure and learning process to fit the
specific needs of your prediction problem. For example, you might
increase max_depth so that the tree can capture complex relationships
in a difficult classification task.
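
The overfitting point in the list above can be illustrated with a small sketch. It assumes scikit-learn and a synthetic dataset from make_classification in which a fraction of the labels is assigned at random (flip_y=0.2) to simulate noise; the dataset sizes and max_depth=3 are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; flip_y=0.2 assigns about 20% of the labels at random (noise).
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unrestricted tree versus one whose complexity is capped.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

for name, model in [("unrestricted", full), ("max_depth=3", shallow)]:
    print(f"{name:>12}: depth={model.get_depth()}, "
          f"train={model.score(X_train, y_train):.3f}, "
          f"test={model.score(X_test, y_test):.3f}")
```

The unrestricted tree fits the training set perfectly, noise included. Comparing its gap between training and test accuracy with that of the shallow tree is a quick way to see how much of that fit is memorization.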
Types of Hyperparameters in Decision Tree

• criterion: The function used to measure the quality of a split in the
decision tree. Two commonly used options are gini (Gini impurity) and
entropy (information gain).
• Gini impurity - Also called the Gini index, it measures how mixed the
target classes are within a node: it is 0 when every sample in the node belongs
to one class and grows as the classes become more evenly mixed. The node is
split in the way that yields the least impurity.
• Information gain - It uses entropy as the impurity measure and splits a node
in the way that yields the largest information gain, that is, the largest
reduction in entropy.
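
As a worked example of the two measures, the sketch below uses the standard definitions: Gini impurity is 1 minus the sum of squared class proportions, and entropy is the negative sum of p * log2(p) over the classes. The function names and the class counts are hypothetical and chosen only for illustration.

```python
import math

def gini_impurity(class_counts):
    """Gini impurity of a node, given the number of samples per class."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def entropy(class_counts):
    """Entropy (in bits) of a node; empty classes contribute nothing."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# A node with 5 samples of each class, split into two pure children.
parent = [5, 5]
children = [[5, 0], [0, 5]]
print("Gini of parent:", gini_impurity(parent))
print("Entropy of parent:", entropy(parent))
print("Information gain of the split:", information_gain(parent, children))
```

A perfectly mixed two-class node has the highest impurity under both measures, and a split into pure children removes all of it.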
• max_depth: As the name suggests, the max_depth
hyperparameter controls the maximum depth to which the decision
tree is allowed to grow. A larger max_depth allows the tree to
capture more complex patterns in the training data, potentially
reducing the training error. However, setting max_depth too high can
lead to overfitting, where the model memorizes the noise in the
training data. It is important to tune max_depth carefully to find
the right balance between model complexity and generalization
performance.
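
A minimal sketch of how max_depth bounds the size of the tree, assuming scikit-learn and its built-in breast cancer dataset (any tabular classification data would do). The depth values tried are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

sizes = {}
for max_depth in [1, 2, 4, 8, None]:
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0).fit(X, y)
    # Record the depth actually reached and the number of leaves.
    sizes[max_depth] = (tree.get_depth(), tree.get_n_leaves())
    print(f"max_depth={max_depth}: depth={tree.get_depth()}, "
          f"leaves={tree.get_n_leaves()}")
```

A binary tree of depth d has at most 2**d leaves, so each extra level can double the number of regions the tree carves out of the feature space. With max_depth=None the tree grows until no further split is possible or allowed by the other settings.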
• min_samples_split: The min_samples_split hyperparameter defines
the minimum number of samples needed to split a node. It works as a
threshold: if the number of samples in a node is less than
min_samples_split, the node is not split and becomes a leaf node.
• min_samples_leaf: The min_samples_leaf hyperparameter defines
the minimum number of samples that must be present at a leaf node.
A split is only considered if it leaves at least this many samples
in each resulting branch.
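
The two thresholds can be checked on a fitted tree. This is a sketch assuming scikit-learn and the Iris data, with min_samples_split=20 and min_samples_leaf=5 as arbitrary illustrative values. It reads the per-node sample counts from the fitted tree's tree_ attribute.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(min_samples_split=20, min_samples_leaf=5,
                              random_state=0).fit(X, y)

t = tree.tree_
is_leaf = t.children_left == -1          # scikit-learn marks leaves with child index -1
leaf_sizes = t.n_node_samples[is_leaf]   # samples that ended up in each leaf
split_sizes = t.n_node_samples[~is_leaf] # samples in each node that was split

print("smallest leaf:", leaf_sizes.min())
print("smallest node that was split:", split_sizes.min())
```

No leaf holds fewer than min_samples_leaf samples, and no node with fewer than min_samples_split samples was split.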
• max_features: The max_features hyperparameter allows us to control
the number of features considered when looking for the best
split in the decision tree. It can be given as an exact number of
features to consider at each split (an integer), as a fraction of the
total number of features (a float), or as one of the named options
below. In scikit-learn it functions as follows:
• None - The default; the algorithm considers all the features for
each split.
• sqrt - It allows the algorithm to consider only the square root of the total
number of features for each split.
• log2 - It allows the algorithm to consider only the logarithm base 2 of the
total number of features for each split.
• auto - Accepted only by older scikit-learn releases and removed in recent
ones; for DecisionTreeClassifier it behaved like sqrt, so use None to
consider all the features.
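
To see how each max_features setting is resolved into an actual feature count, here is a minimal sketch assuming scikit-learn and a synthetic dataset with 16 features (a number chosen so that the square root and the base-2 logarithm are whole numbers). After fitting, the resolved count is available in the max_features_ attribute.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 16 features, for illustration only.
X, y = make_classification(n_samples=200, n_features=16, random_state=0)

resolved = {}
for setting in [None, "sqrt", "log2", 3, 0.5]:
    tree = DecisionTreeClassifier(max_features=setting, random_state=0).fit(X, y)
    # max_features_ is the number of features actually considered per split.
    resolved[setting] = tree.max_features_
    print(f"max_features={setting!r} -> {tree.max_features_} of {X.shape[1]} features")
```

With 16 features, sqrt and log2 both resolve to 4, the float 0.5 resolves to 8, and None keeps all 16.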
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Define the model
dt = DecisionTreeClassifier(random_state=42)

# Define hyperparameter grid
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 5]
}

# Perform GridSearchCV: 5-fold cross-validation over every combination
grid_search = GridSearchCV(estimator=dt, param_grid=param_grid, cv=5,
                           scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train, y_train)

# Best parameters
print("Best Parameters:", grid_search.best_params_)

# Use the best model, which GridSearchCV has already refit on the full training set
best_dt = grid_search.best_estimator_
y_pred = best_dt.predict(X_test)

# Evaluate model on the held-out test set
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy:.4f}")
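
Beyond the single best setting, it can help to look at how the other combinations scored. The sketch below is one possible follow-up, assuming scikit-learn and pandas. It repeats the same grid and data split so that it runs on its own, then ranks the results stored in cv_results_. The variable names search, results, and top are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}
search = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid,
                      cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# cv_results_ holds one row per parameter combination.
results = pd.DataFrame(search.cv_results_)
top = results.sort_values("rank_test_score")[
    ["params", "mean_test_score", "std_test_score"]].head(5)

print(f"{len(results)} combinations evaluated")
print(top.to_string(index=False))
```

The grid has 2 x 4 x 3 x 3 = 72 combinations, each scored on 5 folds. When several combinations tie at the top, the simpler tree is often a reasonable choice.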
