
Feature Selection

References:
- https://machinelearningmastery.com/feature-selection-machine-learning-python/
- https://www.datacamp.com/tutorial/feature-selection-python
- https://towardsdatascience.com/feature-selection-methods-and-how-to-choose-them-1e7469100e7e
- https://www.youtube.com/watch?v=za1aA9U4kbI

Feature selection is the process of automatically selecting the features that are most important to your problem. It is also known as variable selection or attribute selection.

Importance of feature selection


Proper feature selection improves the performance of a model, while poor feature selection degrades it. The benefits are:
- Reduces overfitting
- Improves accuracy
- Reduces running time
- Improves explainability
- Improves data-model compatibility

Difference between feature selection, feature extraction, feature engineering and dimensionality reduction?
Feature engineering and feature extraction refer to the creation of new features from existing ones. They are performed before feature selection.
[[Dimensionality reduction Techniques]] reduce the number of features by creating new combinations of attributes (also known as feature transformation). Dimensionality reduction is performed after feature selection if required. Some examples are:
- Principal Component Analysis
- Singular Value Decomposition
- Linear Discriminant Analysis, etc.
Feature selection, in contrast, involves only the inclusion and exclusion of existing features; it never transforms them. The sketch below shows the distinction.
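A minimal sketch of the distinction (the dataset and the column indices are illustrative assumptions, not taken from the references above):

```python
# Dimensionality reduction vs. feature selection: PCA builds new
# columns as linear combinations of all inputs, while feature
# selection merely keeps a subset of the original columns.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA

X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 features

# Dimensionality reduction: 5 new features, each mixing all 30 inputs.
X_pca = PCA(n_components=5).fit_transform(X)

# Feature selection: keep 5 of the original columns unchanged (chosen
# arbitrarily here; a real selector would pick them by some criterion).
X_sel = X[:, [0, 3, 7, 21, 27]]

print(X_pca.shape, X_sel.shape)  # (569, 5) (569, 5)
```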

What are the different methods of feature selection?


![[Feature Selection-2.png]]

Unsupervised methods
Unsupervised feature selection methods don't require any labels; they don't need access to the target variable. They work by, for example (a minimal sketch follows the list):
- Discarding almost-constant (low-variance) variables.
- Dropping incomplete features (those with many missing values).
- Dropping highly multicollinear variables.
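A minimal sketch of the three checks above in pandas; the thresholds (variance below 1e-3, more than 50% missing, |correlation| above 0.9) and the toy data are illustrative assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "constant": np.ones(100),                                  # almost constant
    "incomplete": [np.nan] * 80 + list(rng.normal(size=20)),   # 80% missing
    "a": rng.normal(size=100),
})
df["a_copy"] = 2 * df["a"] + rng.normal(scale=0.01, size=100)  # collinear with "a"

# 1. Discard almost-constant variables (variance below a threshold).
variances = df.var(numeric_only=True)
df = df.drop(columns=variances[variances < 1e-3].index)

# 2. Drop incomplete features (here: more than 50% missing values).
df = df.loc[:, df.isna().mean() <= 0.5]

# 3. Drop highly multicollinear variables (|correlation| > 0.9 with an
#    earlier column, checked on the upper triangle of the corr matrix).
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
df = df.drop(columns=[c for c in upper.columns if (upper[c] > 0.9).any()])

print(df.columns.tolist())  # only "a" survives all three checks
```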

Supervised methods
Wrapper methods
![[Feature Selection-4.png]]
Wrapper methods use a model to evaluate the performance of different feature subsets, and the best subset is selected. However, there is a risk of overfitting to the evaluating model, so it is recommended to validate the selected subset with another model. Another disadvantage is the large computational cost. Popular wrapper methods are:

Backward Selection

In backward selection, a full model with all features is fitted first. In each iteration, the feature that contributes least to the performance is removed. The process is repeated until the desired number of features remains. A sketch follows below.
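A minimal sketch with scikit-learn's SequentialFeatureSelector; the estimator, dataset and number of features to keep are illustrative assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Start from all 30 features and greedily drop the one whose removal
# hurts cross-validated accuracy the least, until 10 remain.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=10,
    direction="backward",
    cv=5,
).fit(X, y)

print(sfs.get_support())  # boolean mask over the 30 original features
```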

Forward Selection

Forward selection starts from a null (empty) model, and features are added one by one, each time choosing the feature that most improves the performance of the model.
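The same SequentialFeatureSelector sketch works for forward selection; only the direction changes (again an illustrative setup, not a prescribed one):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Start from an empty set and greedily add the feature that most
# improves cross-validated accuracy, until 10 are selected.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=10,
    direction="forward",
    cv=5,
).fit(X, y)

print(sfs.get_support())
```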

Recursive Feature Elimination (RFE)

Recursive Feature Elimination is similar to backward selection; the difference is in how the features to discard are chosen. RFE uses the model's own feature importances to decide which feature to drop: the coefficient weights in linear models, the impurity decrease in tree-based models, etc.
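A minimal sketch with scikit-learn's RFE; the tree-based estimator (so importances come from impurity decrease) and the number of features to keep are illustrative assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = load_breast_cancer(return_X_y=True)

# Fit, drop the least important feature, refit, and repeat until
# only 10 features remain.
rfe = RFE(RandomForestClassifier(random_state=0), n_features_to_select=10)
rfe.fit(X, y)

print(rfe.ranking_)  # 1 = selected; larger ranks were eliminated earlier
```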

Filter Methods
![[Feature Selection-3.png]]
In filter methods, the statistical relationship of each feature with the target variable is analysed using measures like correlation or mutual information, and the top-scoring features are kept. This is simpler, faster and more model-agnostic than wrapper methods, and less prone to overfitting. The major drawback is that each feature is scored in isolation, so the method discards features that are weak predictors of the target on their own but would be useful in combination with other features. A sketch follows below.
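A minimal sketch of a filter method using scikit-learn's SelectKBest; the mutual-information score and k=10 are illustrative assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Score every feature against the target independently (no model is
# trained), then keep the 10 highest-scoring ones.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_new = selector.fit_transform(X, y)

print(selector.scores_.round(3))  # one score per original feature
print(X_new.shape)                # (569, 10)
```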

Embedded Methods
The idea of embedded methods is to combine the benefits of filter methods and wrapper methods: the selection is built into the model's own training, so it is fast like a filter method while still finding a strong subset like a wrapper method. There aren't many dedicated embedded methods available. One example is LASSO regression, where the L1 penalty gradually shrinks the weights of the features towards zero during training; features whose weights reach zero are removed, while the remaining non-zero-weight features are kept. LASSO regression is applied, for example, in computer vision.
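A minimal sketch of LASSO as an embedded selector, using SelectFromModel to keep the non-zero-weight features; the dataset and the alpha value are illustrative assumptions:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)  # L1 penalties are scale-sensitive

# The L1 penalty drives some weights exactly to zero during training;
# how many depends on alpha.
lasso = Lasso(alpha=1.0).fit(X, y)
print(lasso.coef_.round(2))

# Keep only the features whose weights stayed non-zero.
selector = SelectFromModel(lasso, prefit=True)
X_new = selector.transform(X)
print(X_new.shape)
```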

Implementation

Unfold Data Science


https://www.youtube.com/watch?v=LTE7YbRexl8

![[Feature Selection-6.png]]
