Sequential Feature Selection
Last Updated: 28 Apr, 2025
Feature selection is a process of identifying and selecting the most relevant features from a dataset for a particular predictive modeling task. This can be done for a variety of reasons, such as to improve the predictive accuracy of a model, to reduce the computational complexity of a model, or to make a model more interpretable. This article focuses on a sequential feature selector, which is one such feature selection technique.
Sequential feature selection (SFS) is a greedy algorithm that iteratively adds or removes features from a dataset in order to improve the performance of a predictive model. SFS can be either forward selection or backward selection.
Sequential Feature Selector
The SequentialFeatureSelector class in scikit-learn supports both forward and backward selection, chosen with the direction parameter. It builds the feature subset greedily, adding or removing one feature at a time according to the cross-validated score of a predictive model. For forward selection (the default), the process is as follows:
- The selector is initialized with a predictive model, the number of features to select (or a tolerance for improvement), the scoring metric, and the cross-validation strategy.
- The selector starts with an empty set of selected features.
- For each feature not yet selected, the model is trained on the selected features plus that candidate and evaluated with cross-validation using the scoring metric.
- The candidate that gives the highest cross-validation score is added to the selected features set.
- The selector repeats steps 3-4 until the desired number of features has been selected.
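The forward loop can be sketched by hand with cross_val_score. This is a minimal illustration of the greedy idea, not scikit-learn's actual implementation: the helper name forward_select is hypothetical, and max_iter=1000 and cv=5 are assumptions chosen only to keep the example quiet and quick.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def forward_select(estimator, X, y, n_features, cv=5, scoring="accuracy"):
    """Hypothetical helper: greedy forward selection.

    Returns the chosen column indices in the order they were added.
    """
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_features:
        # Cross-validated score of the current subset plus each candidate.
        scores = {
            j: cross_val_score(
                estimator, X[:, selected + [j]], y, cv=cv, scoring=scoring
            ).mean()
            for j in remaining
        }
        # Keep the candidate with the highest mean score.
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


X, y = load_iris(return_X_y=True)
# max_iter=1000 is an assumption to avoid convergence warnings.
chosen = forward_select(LogisticRegression(max_iter=1000), X, y, n_features=2)
print("Chosen column indices:", chosen)
```

Each pass scores every remaining feature once, which is why the search is greedy: a feature that has been added is never reconsidered.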
Backward selection reverses the process. The selector starts with the entire set of features and, at each step, removes the feature whose removal gives the highest cross-validation score, that is, the feature whose absence hurts the model least. This is repeated until the required number of features remains or, when a tolerance is set, until removing another feature would lower the score by more than the tolerance allows.
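As a rough sketch, backward selection on the iris data only needs direction="backward". The values max_iter=1000 and cv=5 are assumptions made for this example, and the features that are kept may differ with other estimators or settings.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Start from all four features and drop one at a time until two remain.
backward = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),  # max_iter is an assumption
    n_features_to_select=2,
    direction="backward",
    scoring="accuracy",
    cv=5,
)
backward.fit(X, y)

kept = list(X.columns[backward.get_support()])
print("Features kept by backward selection:", kept)
```

Forward and backward selection do not always agree on the final subset, so it can be worth running both and comparing the results.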
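The stopping point is set by the n_features_to_select argument, together with the optional tol parameter. In recent scikit-learn versions, n_features_to_select can be an integer (the number of features to keep), a float between 0 and 1 (the fraction of features to keep), or "auto". The tol parameter only takes effect when n_features_to_select is "auto": the selector then stops as soon as adding or removing a feature fails to improve the scoring metric by at least tol. With "auto" and no tol, half of the features are selected. For backward selection, tol can be negative, which accepts a small drop in score in exchange for fewer features.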
Code implementation
Python3
# Code demonstrating the use of SFS on the iris data. Written by Tapendra Kumar
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target
# Create a logistic regression model
logreg = LogisticRegression()
# Create a sequential feature selector
selector = SequentialFeatureSelector(
    logreg, n_features_to_select=2, scoring='accuracy')
# Fit the selector to the data
selector.fit(X, y)
# Get the selected features
selected_features = selector.get_support()
print('The selected features are:', list(X.columns[selected_features]))
Output:
The selected features are: ['petal length (cm)', 'petal width (cm)']
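Once fitted, the selector can also reduce the data itself, since it behaves like any other scikit-learn transformer. The sketch below assumes the same iris setup and uses max_iter=1000 purely to avoid convergence warnings.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True, as_frame=True)

selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),  # max_iter is an assumption
    n_features_to_select=2,
    scoring="accuracy",
)

# fit_transform runs the search and returns only the selected columns.
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)  # (150, 4) -> (150, 2)
```

The reduced matrix can then be passed to any downstream model in place of the full feature set.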
Advantages and Disadvantages
The advantages of sequential feature selection include:
- It is simple to understand and far cheaper than an exhaustive search over all feature subsets.
- It can be used with any type of predictive model.
- It can be used to select features for both classification and regression tasks.
The disadvantages of sequential feature selection include:
- It can be sensitive to the choice of the scoring metric.
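- Because it is greedy, it can favor features that are individually highly correlated with the target and miss features that are only useful in combination.
- It can be computationally expensive for large datasets, since every step retrains the model under cross-validation once per candidate feature.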
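To get a feel for the computational cost, the number of model fits can be estimated with a little arithmetic. The function below is a hypothetical helper, not part of scikit-learn, and it assumes that every candidate at every step is scored with cv separate fits.

```python
def count_model_fits(n_features, n_to_select, cv=5, direction="forward"):
    """Hypothetical helper: estimate the fits done by a greedy sequential search."""
    # Forward adds n_to_select features; backward removes the remainder.
    if direction == "forward":
        n_steps = n_to_select
    else:
        n_steps = n_features - n_to_select
    # At step i there are (n_features - i) candidates, each scored with cv fits.
    return sum((n_features - i) * cv for i in range(n_steps))


print(count_model_fits(4, 2))     # iris example: (4 + 3) * 5 = 35
print(count_model_fits(100, 10))  # 5 * (100 + 99 + ... + 91) = 4775
```

The count grows with both the number of features and the number of steps, so on wide datasets a cheaper estimator or fewer cross-validation folds may be needed.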
Conclusion
Sequential feature selection is a simple, model-agnostic way to shrink a feature set, and it can improve the performance of predictive models. However, it is greedy and can be costly to run, so it is important to be aware of these limitations and to use it appropriately.