
Hyper Parameter Optimization

The document discusses hyperparameter optimization in decision trees, highlighting the importance of tuning hyperparameters to improve model performance, reduce overfitting, and enhance generalization. It outlines key hyperparameters such as criteria, max_depth, min_samples_split, and min_samples_leaf, along with their roles in controlling the learning process. Additionally, it provides a practical example using Python code to implement hyperparameter tuning with GridSearchCV on the Iris dataset.

Hyperparameter optimization
Decision Trees
Hyperparameter
• A hyperparameter is a setting that is fixed before the learning
process begins and controls aspects of that process.
• Examples of hyperparameters include the learning rate,
regularization strength, and the choice of optimization algorithm.
• Because hyperparameters shape how the model learns, the values
chosen for them can affect its performance and behavior.
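The distinction between a hyperparameter and what the model learns can be sketched with scikit-learn's DecisionTreeClassifier. This is a minimal illustration, assuming scikit-learn is installed; the values max_depth=3 and min_samples_leaf=2 are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us before training starts (values are illustrative).
clf = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2, random_state=0)
hyperparams = clf.get_params()
print("max_depth set before fit:", hyperparams["max_depth"])

# Learned structure: the splits (features and thresholds) are found during fit.
clf.fit(X, y)
print("nodes learned during fit:", clf.tree_.node_count)
print("depth actually reached:", clf.get_depth())
```

The fitted tree never grows deeper than the max_depth that was set beforehand, while the number of nodes and the split thresholds depend on the data.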
Hyperparameter tuning
• Different datasets and models call for different hyperparameter
values, so no single setting works well everywhere.
• One way to choose them is to run multiple experiments, each with a
different set of hyperparameters, and keep the set that performs best
for our model. This process of selecting the optimal hyperparameters
is called hyperparameter tuning.
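The idea of running multiple experiments can be sketched as a plain loop: fit the same model with several candidate values and compare cross-validated scores. The candidate list and the use of 5-fold cross-validation below are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

candidate_depths = [1, 2, 3, 5, None]  # illustrative values, None means no limit
mean_scores = {}
for depth in candidate_depths:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # One "experiment": 5-fold cross-validated accuracy for this setting.
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    mean_scores[depth] = scores.mean()
    print(f"max_depth={depth}: mean CV accuracy = {scores.mean():.3f}")

# Keep the setting with the highest mean score.
best_depth = max(mean_scores, key=mean_scores.get)
print("Selected max_depth:", best_depth)
```

Tools such as GridSearchCV automate this kind of loop over several hyperparameters at once.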
Tuning hyperparameters is crucial for decision trees for the following reasons:

• Improved Performance: Untuned hyperparameters can lead to sub-
optimal decision trees. Tuning allows you to find the settings that best
suit your data, resulting in a model that captures the underlying
patterns more effectively and delivers better predictions.
• Reduced Overfitting: Decision trees are prone to overfitting, where
the model memorizes the training data's noise instead of learning
generalizable patterns. Hyperparameter tuning helps prevent this by
controlling the tree's complexity (e.g., with max_depth) and
preventing excessive granularity (e.g., with min_samples_split).
• Enhanced Generalization: The goal is for the decision tree to perform well
on unseen data. Tuning hyperparameters helps achieve this by striking a
balance between fitting the training data and keeping the model simple.
A well-tuned tree can capture the important trends in the data without
overfitting to the specifics of the training set, leading to better
performance on new data.

• Addressing Class Imbalance: Class imbalance occurs when one class has
significantly fewer samples than others. When sample weights are supplied,
hyperparameters such as min_weight_fraction_leaf make the leaf-size
constraint weight-aware, which helps keep the tree from becoming biased
towards the majority class and can improve predictions for the minority
class.
• Tailoring the Model to Specific Tasks: Different tasks might require
different decision tree behaviors. Hyperparameter tuning allows you
to customize the tree's structure and learning process to fit the
specific needs of your prediction problem. For example, you might
prioritize capturing complex relationships by increasing max_depth for
a complex classification task.
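The overfitting and generalization points above can be illustrated by comparing an unconstrained tree with a constrained one on noisy data. This is a sketch on synthetic data; the dataset settings (including flip_y=0.2, which reassigns roughly 20% of labels at random) and the constraint values are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary problem with deliberate label noise (illustrative settings).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

settings = {
    "unconstrained": {},
    "constrained": {"max_depth": 3, "min_samples_leaf": 10},
}
results = {}
for name, params in settings.items():
    clf = DecisionTreeClassifier(random_state=0, **params).fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for each setting.
    results[name] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(f"{name}: train={results[name][0]:.3f}, test={results[name][1]:.3f}")
```

The unconstrained tree fits the training set perfectly, noise included, so its training accuracy says little about unseen data. The constrained tree gives up some training accuracy, and the gap between its training and test scores is typically smaller.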
Types of Hyperparameters in Decision Tree

• criterion: The function that measures the quality of a split in the
decision tree is called the criterion. Two commonly used options are
gini (Gini impurity) and entropy (information gain).
• Gini index - Gini impurity, or the Gini index, measures how mixed the
target classes are within a node. The node is split in the way that
yields the least impurity.
• Information gain - This measure uses entropy to split a node in the
way that yields the largest information gain.
• max_depth: As the name suggests, the max_depth
hyperparameter controls the maximum depth to which the decision
tree is allowed to grow. A larger max_depth allows the tree to
capture more complex patterns in the training data, potentially
reducing the training error. However, setting max_depth too high can
lead to overfitting, where the model memorizes the noise in the
training data. It is important to tune max_depth carefully to find
the right balance between model complexity and generalization
performance.
• min_samples_split: The min_samples_split hyperparameter defines
the minimum number of samples needed to split a node. It acts as a
threshold: if the number of samples in a node is less than
min_samples_split, the node is not split and becomes a leaf node.
• min_samples_leaf: The min_samples_leaf hyperparameter defines
the minimum number of samples that must be present at a leaf node.
• max_features: The max_features hyperparameter allows us to control
the number of features considered when looking for the best
split in the decision tree. It can be given as an exact number of
features to consider at each split or as a fraction of the total
number of features. The accepted inputs are an integer, a float,
None, sqrt, and log2; older scikit-learn releases also accepted auto.
They function as follows:
• None (the default) - The algorithm considers all the features for
each split.
• sqrt - The algorithm considers only the square root of the total
number of features for each split.
• log2 - The algorithm considers the logarithm base 2 of the total
number of features for each split.
• auto - In older scikit-learn releases this was an alias (equivalent
to sqrt for DecisionTreeClassifier). It has since been deprecated and
removed, so prefer None, sqrt, or log2.
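The hyperparameters listed above can be checked on a small example. The sketch below computes Gini impurity and entropy by hand for the Iris labels, compares the Gini value with the impurity scikit-learn reports at the root node, and then verifies that max_depth and min_samples_leaf are respected by the fitted tree. The helper functions gini and entropy are written for this illustration and are not part of scikit-learn; the constraint values are arbitrary.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def entropy(labels):
    """Entropy in bits: -sum_k p_k * log2(p_k) (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))


X, y = load_iris(return_X_y=True)

# Impurity of the root node, which holds every sample before any split.
root_gini = gini(y)
root_entropy = entropy(y)
print(f"Gini at root: {root_gini:.4f}, entropy at root: {root_entropy:.4f}")

# Fit a tree with structural limits (illustrative values).
clf = DecisionTreeClassifier(
    criterion="gini", max_depth=2, min_samples_leaf=5, random_state=0
).fit(X, y)

# Leaves are the nodes without children; check how small the smallest one is.
is_leaf = clf.tree_.children_left == -1
smallest_leaf = clf.tree_.n_node_samples[is_leaf].min()
print("root impurity reported by the tree:", clf.tree_.impurity[0])
print("depth:", clf.get_depth(), "smallest leaf size:", smallest_leaf)
```

Because the three Iris classes are equally represented, the root node is as impure as a three-class node can be; each split is chosen to reduce that impurity as much as possible within the limits set by the hyperparameters.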
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Define the model
dt = DecisionTreeClassifier(random_state=42)

# Define hyperparameter grid
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 5]
}

# Perform GridSearchCV (5-fold cross-validation over every combination)
grid_search = GridSearchCV(
    estimator=dt, param_grid=param_grid, cv=5,
    scoring='accuracy', n_jobs=-1
)
grid_search.fit(X_train, y_train)

# Best parameters
print("Best Parameters:", grid_search.best_params_)

# Use the best estimator (GridSearchCV has already refit it on the training set)
best_dt = grid_search.best_estimator_
y_pred = best_dt.predict(X_test)

# Evaluate model
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy:.4f}")
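Beyond the single best setting, it can be useful to look at how the other combinations scored. The sketch below reruns a reduced version of the grid search and ranks the candidates using the cv_results_ attribute of GridSearchCV. The smaller grid is an assumption made to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Reduced grid for illustration: 3 x 3 = 9 candidate settings.
param_grid = {"max_depth": [None, 5, 10], "min_samples_leaf": [1, 2, 5]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42), param_grid, cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

# cv_results_ holds one entry per candidate; sort by rank (1 is best).
results = search.cv_results_
order = np.argsort(results["rank_test_score"])
for i in order[:3]:
    print(
        results["params"][i],
        f"mean={results['mean_test_score'][i]:.4f}",
        f"std={results['std_test_score'][i]:.4f}",
    )

# The refit best estimator can be inspected like any fitted tree.
best_tree = search.best_estimator_
print("depth:", best_tree.get_depth(), "leaves:", best_tree.get_n_leaves())
```

When several candidates have nearly identical mean scores, the standard deviation across folds and the size of the resulting tree are reasonable tie-breakers, since a smaller tree is easier to interpret.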
