Feature Selection Method in Decision Tree Induction
71762351023
NITHISH KUMAR R
WHAT IS FEATURE SELECTION?
Feature selection is the process of choosing a subset of
relevant features (variables, attributes) from the original set
of features in a dataset. The goal of feature selection is to
improve model performance by reducing dimensionality,
decreasing computational cost, enhancing interpretability,
and mitigating the risk of overfitting.
INFORMATION GAIN (IG):
Information gain measures the reduction in entropy or
impurity achieved by splitting the data on a particular
feature.
Features with higher information gain are considered more
relevant for splitting.
Decision trees typically use information gain (or related
metrics like gain ratio) as the criterion for selecting the best
split at each node.
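As a concrete illustration, the sketch below computes information gain for a categorical split in plain Python. The data (outlook, play) and function names are illustrative, not from any particular library; the definition used is the standard one, Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) · Entropy(S_v).

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy reduction from splitting `labels` by `feature_values`."""
    total = len(labels)
    split_entropy = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        split_entropy += (len(subset) / total) * entropy(subset)
    return entropy(labels) - split_entropy

# Toy example: an "outlook" attribute vs. play / don't-play labels.
outlook = ["sunny", "sunny", "overcast", "rain", "rain"]
play    = ["no",    "no",    "yes",      "yes",  "no"]
print(information_gain(outlook, play))  # ~0.571 bits
```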
GAIN RATIO:
Gain ratio is a modification of information gain that penalizes attributes
with a large number of distinct values. It helps to avoid biases towards
attributes with many categories.
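A minimal self-contained sketch of the same idea, using the gain ratio criterion as in C4.5: information gain divided by the split information (the entropy of the partition sizes, which grows with the number of distinct values). The toy data is illustrative.

```python
from collections import Counter
from math import log2

def _entropy(counts):
    """Entropy (in bits) of a collection of counts."""
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n)

def gain_ratio(feature_values, labels):
    """Information gain divided by split information."""
    total = len(labels)
    gain = _entropy(Counter(labels).values())
    for value, count in Counter(feature_values).items():
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        gain -= (count / total) * _entropy(Counter(subset).values())
    # Split information: entropy of the partition sizes themselves.
    # It grows with the number of distinct values, penalizing them.
    split_info = _entropy(Counter(feature_values).values())
    return gain / split_info if split_info > 0 else 0.0

outlook = ["sunny", "sunny", "overcast", "rain", "rain"]
play    = ["no",    "no",    "yes",      "yes",  "no"]
print(gain_ratio(outlook, play))  # ~0.375
```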
CHI-SQUARE TEST:
The chi-square test measures the statistical association between a categorical feature and the target variable. In the context of decision trees, it assesses whether the distribution of the target variable differs significantly across the subsets of the data defined by each attribute. Features with higher chi-square statistics (indicating a stronger association with the target variable) are selected.
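As a hedged illustration, the sketch below scores one categorical feature with SciPy's chi2_contingency; the contingency counts are made up for the example, and a larger statistic (smaller p-value) suggests a stronger association with the target.

```python
from scipy.stats import chi2_contingency

# Contingency table: rows = feature values, columns = class counts.
# (Illustrative counts, not from a real dataset.)
table = [[30, 10],   # value A: 30 positives, 10 negatives
         [12, 28]]   # value B: 12 positives, 28 negatives

stat, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```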
WRAPPER METHODS:
Wrapper methods use a specific learning algorithm (e.g., a decision tree) as a black box to evaluate the usefulness of feature subsets: candidate subsets are scored by training and validating the model, and the best-performing subset is retained.
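A minimal sketch of a wrapper approach, assuming scikit-learn is available: SequentialFeatureSelector greedily grows a feature subset, scoring each candidate by the cross-validated accuracy of the decision tree itself. The Iris data is used only as a stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)

# Forward selection: add one feature at a time, keeping the
# addition that most improves the tree's cross-validated score.
selector = SequentialFeatureSelector(tree, n_features_to_select=2, cv=5)
selector.fit(X, y)
print(selector.get_support())  # boolean mask of the selected features
```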
EMBEDDED METHODS:
Embedded methods incorporate feature selection
within the model construction process. They select
features during the training phase of the learning
algorithm.
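A minimal sketch of the embedded idea, again assuming scikit-learn: a decision tree's impurity-based feature_importances_ are a by-product of training, and SelectFromModel turns them into a feature filter. The dataset and the "mean" threshold are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectFromModel
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Training the tree already ranks the features via its splits.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.feature_importances_)  # importances learned during training

# Keep only features whose importance exceeds the mean importance.
selector = SelectFromModel(tree, prefit=True, threshold="mean")
X_reduced = selector.transform(X)
print(X_reduced.shape)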