Hyperparameter Tuning with R
In R, several techniques and packages can be used to optimize hyperparameters, leading to better, more reliable models. In this article, we discuss the main techniques and packages for hyperparameter tuning with R.
What are Hyperparameters?
Hyperparameters are the settings that control how a machine-learning model learns from data. Examples include the learning rate in neural networks, the number of trees in a random forest, or the number of neighbors in a k-nearest neighbors (k-NN) algorithm. Choosing the correct hyperparameters can make the difference between a model that generalizes well to new data and one that overfits or underfits the training data. Unlike model parameters, which are adjusted during training to fit the data, hyperparameters must be set before training begins.
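To make the distinction concrete, here is a minimal sketch with the randomForest package: ntree and mtry are hyperparameters fixed before training, while the fitted trees themselves are what the model learns from the data.
R
library(randomForest)

# Self-contained sketch on a copy of the built-in mtcars data
df <- mtcars
df$am <- factor(df$am)

set.seed(1)
fit <- randomForest(am ~ mpg + hp + wt, data = df,
                    ntree = 300,  # hyperparameter: number of trees, chosen before training
                    mtry = 2)     # hyperparameter: variables tried at each split
print(fit)  # the fitted trees (what the model learned) live inside this object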
Why is Hyperparameter Tuning Important?
Correctly tuning hyperparameters can improve a model's performance. Poor settings may cause:
- Underfitting: The model is too simple to capture the underlying patterns.
- Overfitting: The model is too complex and captures noise rather than useful patterns, leading to poor generalization.
- Slow Convergence: Poorly tuned models, such as those with too small a learning rate, can take excessively long to converge or may not converge at all.
Techniques for Hyperparameter Tuning
Here are some of the main techniques for hyperparameter tuning.
- Grid Search: Grid search is an exhaustive method that systematically evaluates every combination of a predefined set of hyperparameter values. For example, when tuning a random forest model, we may create a grid of values for parameters such as mtry (the number of variables randomly sampled at each split) and ntree (the number of trees in the forest).
- Random Search: Random search samples random combinations of hyperparameters from predefined distributions. Unlike grid search, it does not evaluate every possible combination, making it more efficient, especially when the search space is large (see the sketch after this list).
- Bayesian Optimization: Bayesian optimization is an advanced technique that models the relationship between hyperparameters and model performance. It uses this model to predict which hyperparameters will lead to better results, refining its predictions as it gathers more data.
- Cross-Validation: Cross-validation is commonly used alongside the above methods to evaluate model performance. By splitting the data into multiple subsets, training the model on some subsets and testing on others, we ensure that the selected hyperparameters lead to a model that generalizes well to unseen data.
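As a quick illustration of random search, caret can sample candidate hyperparameter values when search = "random" is set in trainControl(). The sketch below is a minimal, self-contained example on the built-in iris data, separate from the tutorial's main example; tuneLength sets how many random candidates are tried.
R
library(caret)

# Random search: let caret sample candidate 'mtry' values
ctrl <- trainControl(method = "cv", number = 5, search = "random")

set.seed(42)
rf_random <- train(Species ~ ., data = iris,
                   method = "rf",
                   metric = "Accuracy",
                   trControl = ctrl,
                   tuneLength = 4)  # number of random candidates to evaluate
print(rf_random)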
Now we implement hyperparameter tuning step by step in the R programming language.
Step 1: Load the Required Libraries and Dataset
Load the required libraries and the dataset.
R
library(randomForest)
library(caret)
# Load the built-in mtcars dataset
data(mtcars)
Step 2: Data Preparation
Convert the am column (which represents the transmission type) to a factor with two levels: "Automatic" and "Manual".
R
# Convert 'am' (Transmission) to a factor for classification
mtcars$am <- factor(mtcars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
Step 3: Feature Selection
A subset of features (mpg, cyl, hp, wt) is selected for modeling.
R
# Subset of features for modeling
features <- mtcars[, c("mpg", "cyl", "hp", "wt")]
Step 4: Define Hyperparameter Grid
Define a grid of hyperparameter values for mtry, which controls the number of features randomly selected at each split in the random forest algorithm.
R
# Define a grid for the 'mtry' parameter in Random Forest
tuneGrid <- expand.grid(mtry = c(1, 2))
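Since only four predictors were selected in Step 3, mtry can meaningfully range from 1 to 4. Widening the grid is a one-line change; the variant below is a hypothetical alternative, not the setting used in this tutorial.
R
# Hypothetical wider grid: every feasible 'mtry' for the four selected predictors
tuneGrid_full <- expand.grid(mtry = 1:4)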
Step 5: Cross-Validation Setup
The data is split into 5 folds; in each of the 5 rounds, 4 folds are used for training and 1 for validation, so every observation is used for validation exactly once.
R
# Cross-Validation Setup (with SMOTE sampling to handle class imbalance)
control <- trainControl(method = "cv",
                        number = 5,
                        summaryFunction = defaultSummary,
                        savePredictions = TRUE,
                        classProbs = FALSE,
                        sampling = "smote")  # SMOTE resampling; caret relies on the 'DMwR' package for this
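On a dataset as small as mtcars, repeating the cross-validation can give more stable performance estimates. The line below is a hedged alternative setup, not the one used for the output shown later.
R
# Repeated 5-fold CV: the 5-fold split is repeated 3 times and results are averaged
control_rep <- trainControl(method = "repeatedcv", number = 5, repeats = 3)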
Step 6: Model Training
Now train the model.
R
# Model Training with Hyperparameter Tuning
# Train the Random Forest model using the 'caret' package and grid search
model <- train(am ~ mpg + cyl + hp + wt,
               data = mtcars,
               method = "rf",
               metric = "Accuracy",  # set the metric explicitly for classification
               trControl = control,
               tuneGrid = tuneGrid)
# Note: 'allowParallel' is a trainControl() option (TRUE by default), not a train() argument
Step 7: Print the Results
Now print the best-tuned model.
R
# Print the Best Tuned Model
print(model$bestTune) # Output the best 'mtry' value
print(model)
Output:
mtry
2 2
Random Forest
32 samples
4 predictor
2 classes: 'Automatic', 'Manual'
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 25, 26, 26, 26, 25
Addtional sampling using SMOTE
Resampling results across tuning parameters:
mtry Accuracy Kappa
1 0.7714286 0.5357971
2 0.9047619 0.8057971
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
Step 8: Visualize the Tuning Results
Now we will visualize the tuning results.
R
# Visualize the Tuning Results
# Plot the performance of different hyperparameter values
plot(model)
Output:
The plot shows accuracy across the tested values of mtry. As mtry increases, accuracy improves, peaking at mtry = 2. Visualizing performance across different hyperparameter values helps identify the optimal settings.
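To use the tuned model, predict() works directly on the caret train object. The sketch below scores the training data purely for illustration; training-set accuracy is optimistic, and in practice you would evaluate on a held-out test set.
R
# Predict with the tuned model (illustration only: same data it was trained on)
preds <- predict(model, newdata = mtcars)
confusionMatrix(preds, mtcars$am)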
Conclusion
Hyperparameter tuning is a crucial step in refining machine learning models to achieve better performance. By carefully selecting and adjusting hyperparameters, such as those in neural networks or random forests, the model's ability to generalize to new data improves, reducing the risk of overfitting or underfitting. Techniques like grid search, random search, and Bayesian optimization, especially when combined with cross-validation, provide powerful ways to identify the optimal settings. Implementing these methods in R can significantly enhance model reliability and accuracy.