Feature Selection
A feature is an attribute that has an impact on a problem or is useful for solving it, and
choosing the important features for the model is known as feature selection.
Every machine learning workflow depends on feature engineering, which mainly consists of two
processes: Feature Selection and Feature Extraction.
Although feature selection and feature extraction may share the same objective, the two are
quite different from each other.
The main difference between them is that feature selection selects a subset
of the original feature set, whereas feature extraction creates new features.
Feature selection is a way of reducing the number of input variables for the model by using only
relevant data, in order to reduce overfitting in the model.
"It is a process of automatically or manually selecting the subset of the most appropriate and
relevant features to be used in model building." Feature selection is performed by either
including the important features or excluding the irrelevant features from the dataset without
changing them.
Below are some benefits of using feature selection in machine learning:
o It helps in avoiding the curse of dimensionality.
o It helps in the simplification of the model so that it can be easily interpreted by
the researchers.
o It reduces the training time.
o It reduces overfitting and hence enhances generalization.
Feature Selection Techniques
There are mainly two types of Feature Selection techniques, which are:
Supervised Feature Selection technique
Supervised feature selection techniques consider the target variable and can be used with
labelled datasets.
Unsupervised Feature Selection technique
Unsupervised feature selection techniques ignore the target variable and can be used with
unlabelled datasets.
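As an illustrative sketch (not code from the article), the snippet below contrasts a supervised selector, which needs the target y, with an unsupervised one, which looks only at X. SelectKBest and VarianceThreshold are scikit-learn classes; the dataset and parameter values are arbitrary choices for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, VarianceThreshold, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Supervised: scores each feature against the target variable y
supervised = SelectKBest(score_func=mutual_info_classif, k=10)
X_supervised = supervised.fit_transform(X, y)

# Unsupervised: ignores y and simply drops low-variance features
unsupervised = VarianceThreshold(threshold=0.05)
X_unsupervised = unsupervised.fit_transform(X)

print(X.shape, X_supervised.shape, X_unsupervised.shape)
```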
Under supervised feature selection, there are mainly three techniques: Wrapper, Filter, and
Embedded methods.
1. Wrapper Methods
In the wrapper methodology, the selection of features is treated as a search problem,
in which different combinations of features are made, evaluated, and compared with other combinations.
The algorithm is trained iteratively using a subset of features.
On the basis of the model's output, features are added or removed, and the model is trained
again with the new feature set.
Some techniques of wrapper methods are:
o Forward selection - Forward selection is an iterative process which begins with an
empty set of features. In each iteration, it adds one more feature and evaluates
the performance to check whether the addition improves it.
o The process continues until adding a new variable/feature no longer improves
the performance of the model (see the forward/backward selection sketch after this list).
o Backward elimination - Backward elimination is also an iterative approach, but it is
the opposite of forward selection. This technique begins with all
the features and removes the least significant feature in each iteration.
o This elimination process continues until removing further features no longer improves the
performance of the model.
o Exhaustive Feature Selection - Exhaustive feature selection is one of the most thorough feature
selection methods; it evaluates every candidate feature subset by brute force.
o This means the method tries every possible combination of features and returns
the best-performing feature set (a brute-force sketch follows this list).
o Recursive Feature Elimination -
Recursive feature elimination is a recursive greedy optimization approach, in which
features are selected by recursively considering smaller and smaller subsets of features.
o An estimator is trained on each set of features, and the importance of each
feature is determined using the coef_ attribute or the
feature_importances_ attribute (see the RFE sketch after this list).
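A minimal sketch of forward selection and backward elimination, assuming a recent version of scikit-learn: SequentialFeatureSelector switches between the two strategies via its direction parameter. The estimator, dataset, and number of features kept are arbitrary choices for illustration, not part of the original article.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
estimator = LogisticRegression(max_iter=5000)

# Forward selection: start empty, add the most helpful feature each round
forward = SequentialFeatureSelector(estimator, n_features_to_select=5,
                                    direction="forward", cv=5)
forward.fit(X, y)

# Backward elimination: start with all features, drop the least useful each round
backward = SequentialFeatureSelector(estimator, n_features_to_select=5,
                                     direction="backward", cv=5)
backward.fit(X, y)

print("Forward keeps :", forward.get_support(indices=True))
print("Backward keeps:", backward.get_support(indices=True))
```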
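Exhaustive selection can be sketched directly with itertools and cross-validation; dedicated helpers exist elsewhere (for example in the mlxtend library), but the brute-force loop below makes the idea explicit. The synthetic dataset and the cap of three features per subset are assumptions to keep the run time small.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=6, random_state=0)

best_score, best_subset = -1.0, None
# Brute force: score every feature subset of size 1..3 with cross-validation
for k in range(1, 4):
    for subset in combinations(range(X.shape[1]), k):
        score = cross_val_score(LogisticRegression(max_iter=1000),
                                X[:, list(subset)], y, cv=5).mean()
        if score > best_score:
            best_score, best_subset = score, subset

print("Best subset:", best_subset, "score:", round(best_score, 3))
```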
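For recursive feature elimination, scikit-learn ships an RFE class that repeatedly fits an estimator, ranks features by coef_ or feature_importances_, and discards the weakest; the estimator and the target number of features below are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Recursively drop the weakest feature (by |coef_|) until 5 remain
rfe = RFE(estimator=LogisticRegression(max_iter=5000), n_features_to_select=5, step=1)
rfe.fit(X, y)

print("Selected features:", rfe.get_support(indices=True))
print("Feature ranking  :", rfe.ranking_)
```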
2. Filter Methods
In filter methods, features are selected on the basis of statistical measures. This method does
not depend on the learning algorithm and chooses the features as a pre-processing step.
The filter method removes irrelevant features and redundant columns from the model by
ranking the features with different metrics.
The advantage of filter methods is that they need little computational time and do not
overfit the data.
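As a small hedged example (not from the article), one common filter metric is the Pearson correlation between each feature and the target; the dataset and the choice to keep the ten strongest correlations are arbitrary.

```python
from sklearn.datasets import load_breast_cancer

# Load the features and target together as one DataFrame
data = load_breast_cancer(as_frame=True)
df = data.frame  # feature columns plus a 'target' column

# Rank features by absolute Pearson correlation with the target
correlations = df.corr()["target"].drop("target").abs().sort_values(ascending=False)
top_features = correlations.head(10).index.tolist()

print(top_features)
```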
3. Embedded Methods
Embedded methods combine the advantages of both filter and wrapper methods by
considering the interaction of features along with low computational cost. They are fast
like filter methods, but more accurate.
These methods are also iterative: each iteration of model training is evaluated, and the most
important features, those contributing the most in that iteration, are identified. Some
techniques of embedded methods are:
o Regularization - Regularization adds a penalty term to the parameters of the
machine learning model to avoid overfitting. When this penalty term is
applied to the coefficients, it can shrink some coefficients to exactly zero.
o The features with zero coefficients can then be removed from the dataset. Regularization
techniques of this kind include L1 regularization (Lasso) and Elastic Net
(a combination of L1 and L2 regularization); see the Lasso sketch after this list.
o Random Forest Importance - Tree-based methods provide feature
importances, which give a natural way of selecting features. Here, feature
importance specifies which features have more importance in model building or a
greater impact on the target variable.
o Random Forest is such a tree-based method: a bagging algorithm
that aggregates many decision trees. It automatically ranks features
by how much they decrease the impurity (e.g. Gini impurity) over all the trees.
Nodes are ranked by these impurity values, which makes it possible to prune the tree
below a chosen node; the remaining nodes correspond to a subset of the most important
features (see the feature-importance sketch after this list).
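A hedged sketch of regularization-based selection, assuming scikit-learn: a Lasso model combined with SelectFromModel keeps only the features whose coefficients remain non-zero. The diabetes dataset and the alpha value are arbitrary choices.

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)

# The L1 penalty drives some coefficients to exactly zero; keep the rest
lasso = Lasso(alpha=0.1)
selector = SelectFromModel(lasso)
X_selected = selector.fit_transform(X, y)

print("Original features:", X.shape[1])
print("Kept features    :", selector.get_support(indices=True))
```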
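And a minimal sketch of random-forest importance, again with an assumed dataset and hyperparameters: the fitted forest exposes feature_importances_, and SelectFromModel can keep, for example, the features whose importance is above the median.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)

# Fit a forest and inspect the impurity-based importance of each feature
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)
print("Importances:", forest.feature_importances_.round(3))

# Keep only the features whose importance is above the median importance
selector = SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=0),
                           threshold="median")
X_selected = selector.fit_transform(X, y)
print("Shape after selection:", X_selected.shape)
```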
Feature selection is a very broad and complicated field of machine learning, and many studies
have already been carried out to discover the best methods. There is no fixed rule for the best
feature selection method; the choice depends on the machine learning engineer,
who can combine and adapt these approaches to find the best method for a specific problem.
One should try a variety of model fits on different subsets of features selected through
different statistical measures.