Classification using PyTorch linear function
Last Updated: 24 Apr, 2025
Prediction is the process of applying a trained model to new data. PyTorch is an open-source machine learning library for building and training neural networks, and one common use case is linear classification. In this article, we will go through the steps to build a linear classifier in PyTorch, train it, and use it to make predictions on data it has not seen.
Linear Classifier:
A linear classifier is a type of machine learning model that uses a linear function to classify data into two or more classes. It works by computing a weighted sum of the input features and adding a bias term, which gives one score per class. For multi-class problems, the scores are then passed through the softmax function, which maps them to a probability distribution over the classes.
In PyTorch, we can define a linear classifier using the nn.Linear module. This module takes two required arguments: in_features, the number of input features, and out_features, the number of outputs, which for a classifier is the number of classes. It automatically initializes the weight and bias parameters with random values.
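To make the weighted sum concrete, here is a minimal sketch that compares the output of nn.Linear with the same computation written by hand. The batch of inputs is made up for illustration, and the seed is only there to make the run repeatable.

```python
import torch

torch.manual_seed(0)  # fixed seed so the random initialization is repeatable

# Four input features, three output scores (one per class).
layer = torch.nn.Linear(in_features=4, out_features=3)

x = torch.randn(5, 4)  # a batch of 5 made-up samples

# nn.Linear stores the weight as (out_features, in_features),
# so the layer computes x @ W^T + b.
manual_scores = x @ layer.weight.T + layer.bias
layer_scores = layer(x)

# Softmax turns each row of scores into probabilities that sum to 1.
probs = torch.softmax(layer_scores, dim=1)

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(manual_scores, layer_scores, atol=1e-6))
print(probs.sum(dim=1))
```

The weight has shape (3, 4): one row of four coefficients per class.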
Let's go through an example of building a linear classifier in PyTorch.
Example:
We will use the famous Iris dataset for our example. The Iris dataset contains measurements of the sepal length, sepal width, petal length, and petal width for three species of iris flowers. Our goal is to build a linear classifier that can predict the species of an iris flower based on its measurements.
Step 1: Import the Required Libraries
We will start by importing the necessary libraries. We need torch for building the linear classifier and sklearn for loading the Iris dataset.
Python3
# Import the required libraries
import torch
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
Step 2: Load the Data
Next, we will load the Iris dataset and split it into training and testing sets.
Python3
# Load the Iris dataset
iris = load_iris()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(iris.data,
                                                    iris.target,
                                                    test_size=0.2,
                                                    random_state=42)
Step 3: Prepare the Data
We need to convert the data into PyTorch tensors and normalize the features to have a mean of zero and a standard deviation of one.
Python3
# Convert the data to PyTorch tensors
X_train = torch.tensor(X_train).float()
X_test = torch.tensor(X_test).float()
# CrossEntropyLoss expects class labels as 64-bit integers
y_train = torch.tensor(y_train).long()
y_test = torch.tensor(y_test).long()
# Normalize the features
mean = X_train.mean(dim=0)
std = X_train.std(dim=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std
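As a quick check of what this normalization does, the sketch below applies the same steps to made-up tensors (stand-ins for the Iris features, not the real data). The statistics come from the training set only and are reused for the test set, so the test features end up close to, but not exactly at, zero mean.

```python
import torch

torch.manual_seed(0)

# Made-up stand-ins for the training and test feature matrices.
X_train = torch.randn(120, 4) * 3.0 + 5.0
X_test = torch.randn(30, 4) * 3.0 + 5.0

# Statistics come from the training set only, so no test information leaks in.
mean = X_train.mean(dim=0)
std = X_train.std(dim=0)  # unbiased (n - 1) estimate by default

X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std

print(X_train_norm.mean(dim=0))  # close to 0 for every feature
print(X_train_norm.std(dim=0))   # close to 1 for every feature
print(X_test_norm.mean(dim=0))   # near 0, but not exactly
```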
Step 4: Define the Model
We can define our linear classifier using the nn.Linear module. We will set the number of input features to 4 (since we have four measurements) and the number of output classes to 3 (since we have three species of iris flowers).
Python3
# Define the model
model = torch.nn.Sequential(
    torch.nn.Linear(in_features=4, out_features=3),
    torch.nn.Softmax(dim=1)
)
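We also add a Softmax activation function to the end of the model, which maps the output to a probability distribution over the classes. Note that CrossEntropyLoss, used in the next step, already applies log-softmax internally and expects raw scores (logits). With the Softmax layer in place, softmax is effectively applied twice. The model still trains, and taking the largest output still gives the predicted class, but the loss for three classes cannot fall below ln(1 + 2/e), about 0.55, and convergence is slower. A common alternative is to leave Softmax out of the model and apply it only when probabilities are needed.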
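The interaction between Softmax and CrossEntropyLoss can be sketched as follows. This is an illustration on made-up data, not the article's model: logit_model is a hypothetical name for a classifier without a Softmax layer. It shows that the loss already contains the log-softmax step, and that feeding probabilities into the loss leaves a floor of ln(1 + 2/e) for three classes.

```python
import math
import torch

torch.manual_seed(0)

# Model without a Softmax layer: its outputs are raw scores (logits).
logit_model = torch.nn.Linear(4, 3)
criterion = torch.nn.CrossEntropyLoss()

x = torch.randn(8, 4)            # made-up batch of 8 samples
y = torch.randint(0, 3, (8,))    # made-up class labels

logits = logit_model(x)
loss = criterion(logits, y)      # log-softmax is applied inside the loss

# CrossEntropyLoss on logits equals NLL loss on log-softmax of the logits.
manual_loss = torch.nn.functional.nll_loss(torch.log_softmax(logits, dim=1), y)

# Probabilities are only needed when reading predictions.
with torch.no_grad():
    probs = torch.softmax(logit_model(x), dim=1)

# If probabilities are fed to the loss instead, even a perfect one-hot
# prediction cannot reach zero loss: the floor for 3 classes is ln(1 + 2/e).
perfect_probs = torch.nn.functional.one_hot(y, num_classes=3).float()
floor_loss = criterion(perfect_probs, y)

print(loss.item(), manual_loss.item())
print(floor_loss.item(), math.log(1 + 2 / math.e))
```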
Step 5: Train the Model
We can train the model using the CrossEntropyLoss loss function and the SGD optimizer.
Python3
# Train the model
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    y_pred = model(X_train)
    loss = criterion(y_pred, y_train)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
Output (exact values vary between runs because the weights are initialized randomly):
Epoch [100/1000], Loss: 0.8240
Epoch [200/1000], Loss: 0.7616
Epoch [300/1000], Loss: 0.7324
Epoch [400/1000], Loss: 0.7152
Epoch [500/1000], Loss: 0.7021
Epoch [600/1000], Loss: 0.6913
Epoch [700/1000], Loss: 0.6819
Epoch [800/1000], Loss: 0.6737
Epoch [900/1000], Loss: 0.6665
Epoch [1000/1000], Loss: 0.6602
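To see what a single iteration of the training loop does to the parameters, the sketch below compares optimizer.step() with the plain SGD rule written by hand: each parameter minus the learning rate times its gradient. The data is made up, and the comparison assumes SGD with its default settings (no momentum, no weight decay).

```python
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 3)
criterion = torch.nn.CrossEntropyLoss()
lr = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

x = torch.randn(16, 4)            # made-up features
y = torch.randint(0, 3, (16,))    # made-up labels

# One training iteration: forward pass, clear old gradients, backward pass.
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()

# What plain SGD is expected to produce: parameter - lr * gradient.
expected = [p.detach() - lr * p.grad for p in model.parameters()]

optimizer.step()

for p, e in zip(model.parameters(), expected):
    print(torch.allclose(p.detach(), e, atol=1e-6))
```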
Step 6: Evaluate the Model
The final step is to evaluate the performance of the linear classifier using the test set. We'll use the test set to compute the accuracy of the model. Here's how we can do that:
Python3
# Evaluate the model
with torch.no_grad():
    y_pred = model(X_test)
    _, predicted = torch.max(y_pred, dim=1)
    accuracy = (predicted == y_test).float().mean()
print(f'Test Accuracy: {accuracy.item():.4f}')
Output:
Test Accuracy: 0.9667
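Accuracy is a single number, so it does not show which classes get confused with each other. A confusion matrix can be built with plain tensor operations, as in this sketch. The labels and predictions here are made up for illustration and are not the Iris results.

```python
import torch

# Made-up labels and predictions for 3 classes (not the Iris results).
y_true = torch.tensor([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])
y_hat = torch.tensor([0, 0, 1, 2, 1, 2, 2, 1, 2, 0])
num_classes = 3

# Encode each (true, predicted) pair as one index, count, and reshape.
# Rows are true classes, columns are predicted classes.
flat_index = y_true * num_classes + y_hat
confusion = torch.bincount(flat_index, minlength=num_classes ** 2)
confusion = confusion.reshape(num_classes, num_classes)

accuracy = (y_hat == y_true).float().mean()
per_class_recall = confusion.diag().float() / confusion.sum(dim=1).float()

print(confusion)
print(accuracy.item())
print(per_class_recall)
```

The diagonal holds the correct predictions; off-diagonal entries show which pairs of classes are mixed up.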
Here's the complete example. The model outputs raw scores with no Softmax layer, since CrossEntropyLoss applies log-softmax internally.
Python3
# Import the required libraries
import torch
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = load_iris()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(iris.data,
                                                    iris.target,
                                                    test_size=0.2,
                                                    random_state=42)

# Convert the data to PyTorch tensors
X_train = torch.tensor(X_train).float()
X_test = torch.tensor(X_test).float()
# CrossEntropyLoss expects class labels as 64-bit integers
y_train = torch.tensor(y_train).long()
y_test = torch.tensor(y_test).long()

# Normalize the features using training-set statistics
mean = X_train.mean(dim=0)
std = X_train.std(dim=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

# Define the model (outputs one raw score per class)
model = torch.nn.Linear(4, 3)

# Train the model
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    y_pred = model(X_train)
    loss = criterion(y_pred, y_train)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Evaluate the model
with torch.no_grad():
    y_pred = model(X_test)
    # The largest score marks the predicted class
    _, predicted = torch.max(y_pred, dim=1)
    accuracy = (predicted == y_test).float().mean()
print(f'Test Accuracy: {accuracy.item():.4f}')
Output (exact values vary between runs because the weights are initialized randomly):
Epoch [100/1000], Loss: 0.7564
Epoch [200/1000], Loss: 0.6042
Epoch [300/1000], Loss: 0.5304
Epoch [400/1000], Loss: 0.4833
Epoch [500/1000], Loss: 0.4513
Epoch [600/1000], Loss: 0.4286
Epoch [700/1000], Loss: 0.4121
Epoch [800/1000], Loss: 0.3995
Epoch [900/1000], Loss: 0.3897
Epoch [1000/1000], Loss: 0.3817
Test Accuracy: 0.9667
We start by importing the necessary libraries, PyTorch and scikit-learn. We then load the Iris dataset and split it into training and testing sets using the train_test_split function.
Next, we convert the data to PyTorch tensors and normalize the features to have a mean of zero and a standard deviation of one, using statistics computed on the training set. We define our linear classifier using the nn.Linear module, which produces one score per class; softmax converts these scores into class probabilities.
We train the model using the CrossEntropyLoss loss function and the SGD optimizer. We loop over the dataset for a specified number of epochs and perform a forward pass, backward pass, and optimization at each iteration. We print the loss every 100 epochs.
Finally, we evaluate the model on the testing set by computing the accuracy.
Other subtopics related to this concept include:
1. Custom datasets: In the example above, we used scikit-learn to load and split the Iris dataset. However, PyTorch provides a Dataset class that can be used to create custom datasets. This is particularly useful when working with large datasets that cannot fit in memory. You can define a custom Dataset class that loads and preprocesses the data on the fly, making it easier to work with.
2. Transfer learning: Transfer learning is a technique that involves using a pre-trained model and fine-tuning it on a new task. PyTorch provides several pre-trained models through torchvision that can be used for transfer learning. By freezing some of the layers of the pre-trained model and training only the last few layers on the new task, we can achieve good results with fewer training examples.
3. Hyperparameter tuning: The performance of a machine learning model depends on several hyperparameters, such as the learning rate, the number of hidden layers, and the number of epochs. Third-party libraries such as Optuna and Ray Tune work with PyTorch and can be used for hyperparameter tuning. These libraries automate the process of trying out different hyperparameter configurations and selecting the best one based on a specified metric.
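As an illustration of the first subtopic, here is a minimal sketch of a custom Dataset used with a DataLoader. TabularDataset is a hypothetical class name, and the data is random with the same shape as Iris. A dataset backed by files would do its loading and preprocessing inside __getitem__ instead of holding everything in memory.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TabularDataset(Dataset):
    """Hypothetical dataset wrapping feature and label tensors."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # A dataset backed by files could load and preprocess sample idx here.
        return self.features[idx], self.labels[idx]


torch.manual_seed(0)
features = torch.randn(150, 4)            # made-up data with the Iris shape
labels = torch.randint(0, 3, (150,))

dataset = TabularDataset(features, labels)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Iterating over the loader yields shuffled mini-batches.
batch_sizes = []
for batch_features, batch_labels in loader:
    batch_sizes.append(batch_features.shape[0])

print(len(dataset), batch_sizes)
```

With 150 samples and a batch size of 32, the loader yields four full batches and a final batch of 22.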
Conclusion
In conclusion, PyTorch provides a powerful platform for building and training machine learning models. The nn module provides a flexible and easy-to-use interface for defining neural networks, and the optim module provides a range of optimization algorithms. By using PyTorch, you can focus on designing and testing new machine learning models, rather than spending time on low-level implementation details.