3.1 Feature Selection

The document discusses feature selection, a process aimed at selecting a subset of relevant features from a larger set to minimize classification error and improve model performance. It differentiates feature selection from dimensionality reduction techniques and outlines various methods including supervised, semi-supervised, and unsupervised approaches, as well as filter, wrapper, and hybrid methods. Additionally, it highlights the importance of feature ranking techniques and the impact of training set size on the effectiveness of feature selection.


Data Preparation

• Given a set of D features, the role of feature selection is to select a subset of d features (d < D) in order to minimize the classification error.


• Fundamentally different from dimensionality-reduction techniques (e.g., PCA or LDA), which are based on feature combinations (i.e., feature extraction).
• Feature selection is defined as a process of selecting the
features that best describe a dataset out of a larger set
of candidate features
1. to improve performance (in terms of speed, predictive power, and simplicity of the model);
2. to visualize the data for model selection;
3. to reduce dimensionality and remove noise;
4. to remove irrelevant data;
5. to increase the predictive accuracy of learned models;
6. to reduce the cost of the data;
7. to improve learning efficiency, e.g., by reducing storage requirements and computational cost;
8. to reduce the complexity of the resulting model description and improve the understanding of the data and the model.
Data Reduction
 Typically, there are two types of features: relevant and irrelevant
features
 For a classification problem, relevant features are those that contain discriminative information about the classes (supervised context) or clusters (unsupervised context)
 In the literature, the term “feature selection” is used interchangeably with several synonyms: “variable selection”, “attribute selection”, “feature ranking” and “feature weighting”.
 A generation step, based on a search method, generates the candidate feature subsets to be evaluated; the subset search strategy explores the space of subsets in order to find the optimal one (a minimal forward-search sketch follows this list) and can be:
 Random
 Sequential
 Complete
 An evaluation function
 A stopping criterion
 A validation step
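A minimal sketch of how these components fit together, assuming a sequential (forward) search strategy, a wrapper-style evaluation function (cross-validated accuracy of a 3-NN classifier on scikit-learn's wine data, both illustrative choices) and a simple stopping criterion (stop when no candidate subset improves the score):

```python
# Sequential forward search: generation step proposes subsets, the evaluation
# function scores them, the stopping criterion ends the loop.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)
D = X.shape[1]

selected, best_score = [], 0.0
while len(selected) < D:
    # Generation step: candidate subsets = current subset plus one new feature.
    candidates = [selected + [j] for j in range(D) if j not in selected]
    # Evaluation function: mean 5-fold cross-validated accuracy of a 3-NN classifier.
    scores = [cross_val_score(KNeighborsClassifier(3), X[:, c], y, cv=5).mean()
              for c in candidates]
    best_idx = int(np.argmax(scores))
    # Stopping criterion: stop as soon as no candidate improves the score.
    if scores[best_idx] <= best_score:
        break
    selected, best_score = candidates[best_idx], scores[best_idx]

print("selected features:", selected, "cv accuracy: %.3f" % best_score)
```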
 The context of learning or the evaluation
strategy
 Supervised methods: feature relevance is evaluated using the class labels.
 Semi-supervised methods: a small amount of labeled data is used together with unlabeled data.
 Unsupervised methods: relevance is evaluated from the data alone, without labels.
 Filter Methods
 Evaluation is independent of
the classification algorithm.
 The objective function evaluates
feature subsets by their
information content, typically
interclass distance, statistical
dependence or information-
theoretic measures.
 Wrapper Methods
 Evaluation uses criteria
related to the classification
algorithm.
 The objective function is a
pattern classifier, which
evaluates feature subsets by
their predictive accuracy
(recognition rate on test data)
by statistical resampling or
cross-validation.
 Hybrid methods combine both filter and wrapper
methods into a single framework, in order to
provide a more efficient solution to the feature
selection problem
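A small sketch contrasting the two evaluation styles; the breast-cancer dataset, mutual information as the filter criterion and a logistic-regression wrapper are illustrative assumptions, not the slides' specific setup:

```python
# Filter vs. wrapper evaluation of feature subsets.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter: score each feature by its information content, independently of any classifier.
mi = mutual_info_classif(X, y, random_state=0)
filter_top5 = np.argsort(mi)[::-1][:5]

# Wrapper: score a candidate subset by the predictive accuracy of an actual classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
wrapper_score = cross_val_score(clf, X[:, filter_top5], y, cv=5).mean()

print("filter top-5 features:", filter_top5)
print("wrapper score (5-fold CV accuracy) of that subset: %.3f" % wrapper_score)
```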
 Feature Ranking Techniques:
 we expect as the output a ranked list of features which
are ordered according to evaluation measures.
 they return the relevance of the features.
 for performing the actual feature selection, the simplest way is to
keep the first m features of the ranking for the task at hand,
provided an appropriate value of m is known.
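A minimal sketch of this ranking-then-truncation step; the score values and the value of m are placeholders:

```python
# Rank features by a relevance score and keep the first m.
import numpy as np

scores = np.array([0.10, 0.62, 0.05, 0.33, 0.48])  # one relevance value per feature (hypothetical)
m = 3
ranked = np.argsort(scores)[::-1]                   # feature indices, most relevant first
selected = ranked[:m]
print("ranking:", ranked, "-> selected:", selected)  # e.g. keep columns X[:, selected]
```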
 Unsupervised feature selection
 Variance score
 Laplacian score
 Unsupervised sparsity score
 Supervised feature selection
 Fisher score
 Supervised Laplacian score
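The slide formulas for these scores did not survive conversion; below is a minimal sketch of two of them, the variance score and the Fisher score, written from their standard definitions (the small constant in the denominator is only there to avoid division by zero):

```python
# Variance score (unsupervised) and Fisher score (supervised), per feature.
import numpy as np

def variance_score(X):
    # Higher variance = the feature spreads the samples more.
    return X.var(axis=0)

def fisher_score(X, y):
    # Ratio of between-class scatter to within-class scatter, per feature.
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        nc = Xc.shape[0]
        num += nc * (Xc.mean(axis=0) - overall_mean) ** 2
        den += nc * Xc.var(axis=0)
    return num / (den + 1e-12)

X = np.random.RandomState(0).randn(100, 4)
y = (X[:, 2] > 0).astype(int)        # class depends mostly on feature 2
print(variance_score(X))
print(fisher_score(X, y))            # feature 2 should get the largest score
```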
 Semi-Supervised feature selection
 Class labels are usually limited or expensive to obtain
 Combines aspects of unsupervised and supervised learning
 Constraints are generated from the available class labels
 For a set S with N samples and d features
 Initialize: w = (1, 1, …, 1)
For t = 1 : T (number of iterations) do
 Pick a random sample x from S
 Find nearhit(x) and nearmiss(x) with Euclidean distance
 Compute, for i = 1, …, d:
   Δᵢ = ½ [ (xᵢ − nearmiss(x)ᵢ)² / ‖x − nearmiss(x)‖_w − (xᵢ − nearhit(x)ᵢ)² / ‖x − nearhit(x)‖_w ]
 Compute w ← w + Δ
End
 Normalize: w ← w² / ‖w²‖∞ , where w² denotes the elementwise square of w
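A minimal numpy sketch of the loop above (Relief/Simba-style hypothesis-margin weighting). The neighbour search, the weighted norm ‖·‖_w and the final normalization follow the formulas as reconstructed on this slide; the toy data and the number of iterations are illustrative assumptions, not the original implementation:

```python
# Iterative margin-based feature weighting with a single nearhit/nearmiss per sample.
import numpy as np

def margin_feature_weights(X, y, T=100, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.ones(d)
    for _ in range(T):
        i = rng.integers(N)
        x, label = X[i], y[i]
        others = np.delete(np.arange(N), i)
        dists = np.linalg.norm(X[others] - x, axis=1)          # Euclidean neighbour search
        same = y[others] == label
        nearhit = X[others[same]][np.argmin(dists[same])]
        nearmiss = X[others[~same]][np.argmin(dists[~same])]
        # weighted norm ||z||_w = sqrt(sum_i w_i^2 z_i^2)
        nm_w = np.sqrt(np.sum((w * (x - nearmiss)) ** 2)) + 1e-12
        nh_w = np.sqrt(np.sum((w * (x - nearhit)) ** 2)) + 1e-12
        delta = 0.5 * ((x - nearmiss) ** 2 / nm_w - (x - nearhit) ** 2 / nh_w)
        w = w + delta
    w2 = w ** 2
    return w2 / np.abs(w2).max()          # w <- w^2 / ||w^2||_inf

X = np.random.RandomState(1).randn(200, 5)
y = (X[:, 0] + 0.1 * np.random.RandomState(2).randn(200) > 0).astype(int)
print(margin_feature_weights(X, y))       # feature 0 should receive the largest weight
```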
 For a set S with N samples and d features
 Initialize: w = (1, 1, …, 1)
For t = 1 : T (number of iterations) do
 Pick a random sample x from S
 Find K  nearmisses( x)  y1, y 2 ,... y K 
 
K  nearhit( x)  z1, z 2 ,...z K
( xi  yi )2
with  distance based on w,  w ( x, y )   wi
2 2

i ( xi  yi )
1 K 2 1 K 2
 Compute : Dmiss  
K j 1
 ( x, yi ) ; Dhit    ( x, zij )
j

K j 1
 Compute :  
1
Dmiss  Dhit 
2
End
w2
w 2
w

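A short sketch of the per-iteration quantities in this K-neighbour variant (the weighted distance Δ_w and the averaged D_miss / D_hit terms); the helper names and K = 3 are illustrative assumptions:

```python
# Weighted distance and K-averaged hit/miss terms for one sample x.
import numpy as np

def weighted_dist(x, y, w):
    # Delta_w(x, y) = sqrt( sum_i w_i^2 (x_i - y_i)^2 )
    return np.sqrt(np.sum(w ** 2 * (x - y) ** 2))

def margin_term(x, label, X, y, w, K=3):
    same = (y == label) & ~np.all(X == x, axis=1)            # hits: same class, excluding x itself
    d_hit = np.sort([weighted_dist(x, z, w) for z in X[same]])[:K]
    d_miss = np.sort([weighted_dist(x, z, w) for z in X[y != label]])[:K]
    D_hit = np.mean(d_hit ** 2)                               # (1/K) * sum_j Delta_w^2(x, z_j)
    D_miss = np.mean(d_miss ** 2)                             # (1/K) * sum_j Delta_w^2(x, y_j)
    return 0.5 * (D_miss - D_hit)

X = np.random.RandomState(0).randn(50, 4)
y = (X[:, 1] > 0).astype(int)
print(margin_term(X[0], y[0], X, y, w=np.ones(4)))
```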
 Simba in the semi-supervised context: the labeled and unlabeled samples are fed to a label propagation method [1], which produces SOFT labels that are then used by Simba.

[1] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf, “Learning with local and
global consistency,” in Advances in Neural Information Processing Systems 16, 2004, pp.
321–328.
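A minimal sketch of this soft-label step, assuming scikit-learn's LabelSpreading (which follows the local-and-global-consistency propagation of [1]); the toy data and the convention of marking unknown labels with −1 are illustrative:

```python
# Obtain SOFT labels for the unlabeled samples before the margin-based weighting.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelSpreading

X, y_true = make_blobs(n_samples=200, centers=2, random_state=0)
y = np.full(200, -1)                 # -1 marks an unknown label
y[:10] = y_true[:10]                 # only a few samples keep their true label

prop = LabelSpreading(kernel='knn', n_neighbors=7).fit(X, y)
soft_labels = prop.label_distributions_   # one class-probability vector per sample
print(soft_labels[:3])                    # these SOFT labels then feed the weighting loop
```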
 Compute a neighborhood size using the method of [2]
 The samples outside this neighborhood are treated as nearmisses

[2] F. Dornaika, Y. El Traboulsi, and A. Assoum, “Adaptive Two Phase Sparse Representation
Classifier for Face Recognition,” in Advanced Concepts for Intelligent Vision Systems: 15th
International Conference, ACIVS 2013, Poznań, Poland, October 28-31, 2013. Proceedings, J.
Blanc-Talon, A. Kasinski, W. Philips, D. Popescu, and P. Scheunders, Eds. Cham: Springer
International Publishing, 2013, pp. 182–191.
 Must-link (x+y)
 Cannot-link (x-y)
 Compute nearhit and nearmiss
   m_CANNOT = d(x, NM(y)) − d(x, NH(x))
   m_MUSTLINK = d(x, NH(x)) − d(x, NH(y))
   m_NEW = d(x, NM(x)) − d(x, NH(x))

Future Work (2)
   M_fusion = M_CANNOT + M_MUSTLINK + M_UNSUPERVISED
   M_fusion = α₁ M_CANNOT + α₂ M_MUSTLINK + α₃ M_UNSUPERVISED (weighted combination)


 Using an existing score (Fisher, Laplacian, …) as a constant guide for Simba in the supervised context

w  w    Fisher

 The resulting feature subsets of many FS models depend strongly on the size of the training set.
 A high-dimensional input cannot always be reduced to a small subset of features: when the target is genuinely related to many input features, removing any of them can seriously affect the learning performance.
[1] F. Dornaika, Y. El Traboulsi, and A. Assoum, “Adaptive Two Phase Sparse Representation Classifier for
Face Recognition,” in Advanced Concepts for Intelligent Vision Systems: 15th International Conference,
ACIVS 2013, Poznań, Poland, October 28-31, 2013. Proceedings, J. Blanc-Talon, A. Kasinski, W. Philips, D.
Popescu, and P. Scheunders, Eds. Cham: Springer International Publishing, 2013, pp. 182–191.
[2] M. Yang, F. Wang, and P. Yang, “A novel feature selection algorithm based on hypothesis-margin,”
Journal of Computers, vol. 3, no. 12, pp. 27–34, 2008.
[3] K. Q. Weinberger and L. K. Saul, “Distance metric learning for large margin nearest neighbor
classification,” The Journal of Machine Learning Research, vol. 10, pp. 207–244, 2009.
[4] A. Moujahid, A. Abanda, and F. Dornaika, “Feature Extraction Using Block-based Local Binary Pattern
for Face Recognition,” Electronic Imaging, vol. 2016, no. 10, pp. 1–6, 2016.
[5] Y. Li and B.-L. Lu, “Feature selection based on loss-margin of nearest neighbor classification,” Pattern
Recognition, vol. 42, no. 9, pp. 1914–1921, Sep. 2009.
[6] W. Pan, P. Ma, and X. Su, “Feature Weighting Algorithm Based on Margin and Linear Programming,” in
Rough Sets and Current Trends in Computing, 2012, pp. 388–396.
[7] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf, “Learning with local and global
consistency,” in Advances in Neural Information Processing Systems 16, 2004, pp. 321–328.
[8] K. Crammer, R. Gilad-Bachrach, A. Navot, and N. Tishby, “Margin analysis of the LVQ algorithm,” in
Advances in neural information processing systems, 2002, pp. 462–469.
[9] R. Gilad-Bachrach, A. Navot, and N. Tishby, “Margin based feature selection-theory and algorithms,” in
Proceedings of the twenty-first international conference on Machine learning, 2004, p. 43.