
Model Fine-Tuning: Hyperparameter Optimization

An essential component of a machine learning model's performance is the set of
parameters that govern its learning process. You can compare these parameters
to the strategic decisions in a chess game: they determine the course and
dynamics of play. Machine learning has two types of such decision-making
elements: model parameters and hyperparameters.

Model parameters are attributes learned from the training data that determine
the model's predictions, such as the weights in a linear regression model.
They are estimated during training from the data itself.

Hyperparameters, on the other hand, are preset before training and guide the
learning process. For instance, the learning rate in gradient descent is a
hyperparameter.

In short, hyperparameters shape the overall strategy of the learning process.
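To make the distinction concrete, here is a minimal sketch using scikit-learn (which the rest of this document relies on). The regularization strength C is a hyperparameter set before training; the coefficients coef_ and intercept_ are model parameters learned during fit. The dataset here is synthetic and purely illustrative:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data, for illustration only
X, y = make_classification(n_samples=200, n_features=4, random_state=42)

# C is a hyperparameter: chosen before training to control regularization strength
model = LogisticRegression(C=1.0)

# coef_ and intercept_ are model parameters: learned from the data during fit
model.fit(X, y)
print(model.coef_, model.intercept_)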

Advanced Problem to Be Solved

Now that we have a deeper understanding of model improvement and enhancement
mechanisms - including overfitting and underfitting, hyperparameter
optimization through grid and random search, and advanced techniques like
regularization - let's apply them to a practical classification task:
predicting red wine quality with Logistic Regression.

# Import the necessary libraries
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import numpy as np
import pandas as pd
import datasets

# Load the dataset
red_wine = datasets.load_dataset('codesignal/wine-quality', split='red')
red_wine = pd.DataFrame(red_wine)

# Preparing the data
X = red_wine.drop(columns='quality')
y = (red_wine['quality'] > 6).astype('int')  # good wine (1) if quality > 6, else not good (0)

# TODO: Standardize the features X using sklearn's StandardScaler.
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Split the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize Logistic Regression Model
model = LogisticRegression(solver='saga', max_iter=1000, class_weight='balanced')
model.fit(X_train, y_train)

# Define the grid for hyperparameter tuning
param_grid = {
    'penalty': ['l1', 'l2'],
    'C': np.logspace(-2, 2, 10),
    'fit_intercept': [True, False]
}

# TODO: Use GridSearchCV for hyperparameter tuning and fit it to X_train and y_train.
grid_search = GridSearchCV(model, param_grid)
grid_search.fit(X_train, y_train)

# Print the best parameters
print(f'Best parameters: {grid_search.best_params_}\n')

# TODO: Predict the test set results using the fitted model.
y_pred = grid_search.best_estimator_.predict(X_test)

# Print Classification Report
print("Classification Report: \n", classification_report(y_test, y_pred))
Is it a fixed rule that we should always use fit_transform on the training
data and only transform on the test data, or did you apply it here just for
illustration?

ChatGPT said:

Yes, the rule of using fit_transform() on the training data and only
transform() on the test data is important and not just for illustration.
Here's why:

Key Reasons for the Rule:

1. Training Data (fit and transform):
○ When you use fit_transform() on the training data, the StandardScaler calculates the mean and standard deviation based solely on the training data.
○ This is essential because the model should only have access to the training data during the training phase, to avoid data leakage.
○ After computing the mean and standard deviation, the training data is then transformed based on those computed values.

2. Test Data (only transform):
○ The test data should never be used to compute the mean or standard deviation. The test data represents unseen data that the model will encounter in real-world scenarios.
○ You should only apply the transformation (scaling) to the test data using the mean and standard deviation computed from the training data.
○ If you were to use fit_transform() on the test data, you would compute new statistics, and your test set would be treated differently from the training set, leading to biased evaluation and data leakage.

Rule Summary:

● Training set: Use fit_transform() to compute the scaling factors (mean and standard deviation) and apply them to the training data.
● Test set: Use transform() to apply the same scaling factors (calculated from the training set) to the test data.

Example (Correct):

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the StandardScaler
scaler = StandardScaler()

# Fit the scaler to the training data and transform it
X_train_scaled = scaler.fit_transform(X_train)

# Transform the test data using the same scaler (no fitting)
X_test_scaled = scaler.transform(X_test)

What Would Happen If You Used fit_transform() on Both:

If you used fit_transform() on both the training and test sets, the test set
would be scaled differently than the training set. This is a form of data
leakage, and it would result in an unrealistically optimistic performance
because the model would have access to information it shouldn’t have (the
test data distribution).

Example (Incorrect):

# Wrong way - fitting the scaler on both training and test sets
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.fit_transform(X_test)  # This is incorrect!

Conclusion:

It is a best practice to always follow this rule:

● fit_transform() on the training data.
● transform() on the test data.

This ensures that the model only learns the statistics (mean, standard
deviation) from the training data, maintaining proper separation between the
training and testing phases.
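One caveat worth noting: when GridSearchCV runs cross-validation, the same leakage concern applies within each fold. A common way to handle this, not used in the code in this document but sketched here assuming scikit-learn's Pipeline, is to bundle the scaler and the model so that each fold fits the scaler only on its own training portion:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
import numpy as np

# The scaler is refit on each fold's training split, so no fold sees test statistics
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(solver='saga', max_iter=1000))
])

# Pipeline step hyperparameters are addressed as <step_name>__<parameter>
param_grid = {
    'clf__penalty': ['l1', 'l2'],
    'clf__C': np.logspace(-2, 2, 10)
}

grid_search = GridSearchCV(pipe, param_grid, cv=5)
# grid_search.fit(X_train, y_train)  # pass the unscaled training split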

# Import the necessary libraries
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import numpy as np
import pandas as pd
import datasets

# Load the dataset
red_wine = datasets.load_dataset('codesignal/wine-quality', split='red')
red_wine = pd.DataFrame(red_wine)

# Preparing the data
X = red_wine.drop(columns='quality')
y = (red_wine['quality'] > 6).astype('int')  # good wine (1) if quality > 6, else not good (0)

# Split the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features: fit on the training set, transform both sets
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize Logistic Regression Model
model = LogisticRegression(solver='saga', max_iter=500, class_weight='balanced')

# Define the grid for hyperparameter tuning
param_grid = {
    'penalty': ['l1', 'l2'],
    'C': np.logspace(-0.5, 0.5, 5),
    'fit_intercept': [True, False]
}

# Use GridSearchCV for hyperparameter tuning and fit it to X_train and y_train
grid_search = GridSearchCV(model, param_grid, cv=5, verbose=1)
grid_search.fit(X_train, y_train)

# Print the best parameters
print(f'Best parameters: {grid_search.best_params_}\n')

# Predict the test set results using the fitted grid search (best estimator)
y_pred = grid_search.predict(X_test)

# Print Classification Report
print("Classification Report: \n", classification_report(y_test, y_pred))

Experiment 1: No Scaler, No Hyperparameter Tuning

● Description: Logistic Regression without feature scaling or hyperparameter tuning (a minimal sketch of this baseline follows the list).
● Model: Logistic Regression with default parameters.
● Results:
○ Precision: Class 0: 0.88, Class 1: 0.62
○ Recall: Class 0: 0.98, Class 1: 0.21
○ F1-Score: Class 0: 0.93, Class 1: 0.32
○ Accuracy: 0.87
○ Macro Avg F1-Score: 0.62
○ Weighted Avg F1-Score: 0.84
● Tags: no_scaler, no_hyperparameter_tuning
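For reference, a minimal sketch of what this baseline configuration likely looked like. The exact script is not shown in the document, so this is hypothetical: it reuses the X_train, X_test, y_train, y_test split from the earlier code, but without any scaling step.

# Baseline: raw (unscaled) features, default Logistic Regression
model = LogisticRegression()  # defaults: penalty='l2', C=1.0, solver='lbfgs', max_iter=100
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))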

Experiment 2: No Scaler, No Hyperparameter Tuning, Increased max_iter

● Description: Logistic Regression without feature scaling or hyperparameter tuning, but with increased max_iter.
● Model: Logistic Regression with max_iter=1000.
● Results:
○ Precision: Class 0: 0.88, Class 1: 0.55
○ Recall: Class 0: 0.97, Class 1: 0.23
○ F1-Score: Class 0: 0.92, Class 1: 0.33
○ Accuracy: 0.86
○ Macro Avg F1-Score: 0.62
○ Weighted Avg F1-Score: 0.83
● Tags: no_scaler, no_hyperparameter_tuning, max_iter_1000

Experiment 3: Standardization, Direct Model Fit, Increased max_iter, No Hyperparameter Tuning

● Description: Logistic Regression with feature scaling using StandardScaler, direct model fitting without hyperparameter tuning, and increased max_iter.
● Model: Logistic Regression with max_iter=500.
● Results:
○ Precision: Class 0: 0.89, Class 1: 0.59
○ Recall: Class 0: 0.97, Class 1: 0.28
○ F1-Score: Class 0: 0.92, Class 1: 0.38
○ Accuracy: 0.87
○ Macro Avg F1-Score: 0.65
○ Weighted Avg F1-Score: 0.84
● Tags: standardization, direct_fit, max_iter_500, no_hyperparameter_tuning

Experiment 4: Standardization, Grid Search for Hyperparameter Tuning, Increased max_iter

● Description: Logistic Regression with feature scaling using StandardScaler, hyperparameter tuning using GridSearchCV, and increased max_iter (see the scoring note after this list).
● Model: Logistic Regression with solver='saga' and max_iter=500.
● Results:
○ Precision: Class 0: 0.89, Class 1: 0.57
○ Recall: Class 0: 0.96, Class 1: 0.28
○ F1-Score: Class 0: 0.92, Class 1: 0.37
○ Accuracy: 0.86
○ Macro Avg F1-Score: 0.65
○ Weighted Avg F1-Score: 0.84
● Tags: standardization, grid_search, max_iter_500
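Note on scoring: GridSearchCV in the code above optimizes its default metric (accuracy), which can be misleading on an imbalanced dataset like this one. A variation not used in the document's experiments, sketched here assuming the model and param_grid defined earlier, is to select hyperparameters by macro-averaged F1 instead:

# Optimize macro-averaged F1, which weights both classes equally despite the imbalance
grid_search_f1 = GridSearchCV(model, param_grid, cv=5, scoring='f1_macro')
grid_search_f1.fit(X_train, y_train)
print(grid_search_f1.best_params_, grid_search_f1.best_score_)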

Experiment 5: Standardization, Grid Search for Hyperparameter Tuning, Increased max_iter, Balanced Class Weights

● Description: Logistic Regression with feature scaling using StandardScaler, hyperparameter tuning using GridSearchCV, increased max_iter, and balanced class weights.
● Model: Logistic Regression with solver='saga', max_iter=500, and class_weight='balanced'.
● Results:
○ Precision: Class 0: 0.98, Class 1: 0.42
○ Recall: Class 0: 0.79, Class 1: 0.89
○ F1-Score: Class 0: 0.87, Class 1: 0.58
○ Accuracy: 0.81
○ Macro Avg F1-Score: 0.72
○ Weighted Avg F1-Score: 0.83
● Tags: standardization, grid_search, max_iter_500, class_weight_balanced

Special Comments:

● The model shows improved recall for class 1, indicating better handling of class
imbalance.
● The warning in STDERR is related to dataset loading and doesn't affect the model's
performance. Consider using trust_remote_code=True in future dataset loads to
avoid this warning.
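As a side note on class_weight='balanced' (used in Experiment 5): scikit-learn computes these weights as n_samples / (n_classes * np.bincount(y)), so the minority class is weighted up during training. A minimal sketch, assuming the y defined in the earlier code:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# 'balanced' weights follow n_samples / (n_classes * np.bincount(y))
weights = compute_class_weight(class_weight='balanced', classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))  # the minority class (good wine, 1) gets the larger weight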

Feel free to copy and keep these records for future reference! If you have any more questions
or need further assistance, let me know! 😊
