Comp3314 5. Data Preprocessing

The document discusses the importance of data preprocessing in machine learning, highlighting techniques for handling missing values, encoding categorical data, and feature selection. Key methods include removing or imputing missing data, using one-hot encoding for nominal features, and applying L1 regularization for feature selection. The document also covers feature scaling and the use of algorithms like Sequential Backward Selection and Random Forests to assess feature importance.


Data Preprocessing

COMP3314
Machine Learning
COMP 3314 2

Introduction
● Preprocessing a dataset is a crucial step
○ Garbage in, garbage out
○ The quality of the data and the amount of useful information it contains are key factors
● Data-gathering methods are often loosely controlled, resulting in
out-of-range values (e.g., Income: −100), impossible data
combinations (e.g., Sex: Male, Pregnant: Yes), missing values, etc.
● Preprocessing is often the most important phase of a machine
learning project
COMP 3314 3

Outline
● In this chapter you will learn how to …
○ Remove and impute missing values from the dataset
○ Get categorical data into shape
○ Select relevant features
● Specifically, we will be looking at the following topics
○ Dealing with missing data
○ Nominal and ordinal features
○ Partitioning a dataset into training and testing sets
○ Bringing features onto the same scale
○ Selecting meaningful features
○ Sequential feature selection algorithms
○ Random forests
COMP 3314 4

Dealing with Missing Data


● Missing data is common in real-world applications
○ Samples might be missing one or more values
● Most ML models are unable to handle missing values
● Two ways to handle this
○ Remove entries
○ Impute missing values from other samples and features (repair)
COMP 3314 5

Identifying Missing Values


● Consider the following simple example generated from CSV
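A minimal sketch of such an example (the original slide's code screenshot is not reproduced here; the CSV values below are illustrative):

```python
import pandas as pd
from io import StringIO

# Illustrative CSV data with two missing cells
csv_data = """A,B,C,D
1.0,2.0,3.0,4.0
5.0,6.0,,8.0
10.0,11.0,12.0,"""

df = pd.read_csv(StringIO(csv_data))
print(df)   # the empty cells show up as NaN
```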
COMP 3314 6

Identifying Missing Values


● For larger data, it can be tedious to look for missing values
○ Use the isnull method to return a DataFrame with Boolean
values that indicate whether a cell
■ contains a numeric value (False), or if
■ data is missing (True)
● Use sum() to count the number of missing values per column
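A sketch of the isnull/sum pattern, assuming the df DataFrame from the previous sketch:

```python
# Boolean DataFrame: True marks a missing cell, False a cell with a value
print(df.isnull())

# Number of missing values per column
print(df.isnull().sum())
```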
COMP 3314 7

Remove Missing Data


● One option is to simply remove the corresponding features (columns) or
samples (rows)
● Rows with missing values can be dropped via the dropna method with
argument axis=0

● Columns with missing values can be dropped via the dropna method with
argument axis=1
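A sketch of both variants, again assuming the df from the earlier sketch:

```python
# Drop rows (samples) that contain at least one NaN
print(df.dropna(axis=0))

# Drop columns (features) that contain at least one NaN
print(df.dropna(axis=1))
```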
COMP 3314 8

Dropna
● The dropna method supports several additional parameters that can come in handy, for example (see the sketch below):
○ only drop rows where all columns are NaN
○ drop rows that have fewer than 4 real values
○ only drop rows where NaN appears in specific columns (here: 'C')
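A sketch of these three dropna calls (how, thresh, and subset are standard pandas arguments; the data is the df from the earlier sketch):

```python
# Only drop rows where all columns are NaN
print(df.dropna(how='all'))

# Drop rows that have fewer than 4 real (non-NaN) values
print(df.dropna(thresh=4))

# Only drop rows where NaN appears in specific columns (here: 'C')
print(df.dropna(subset=['C']))
```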
COMP 3314 9

Remove Missing Data


● Convenient approach
● Disadvantage
○ May remove too many samples
■ Risk losing valuable information
■ Our classifier may need them to discriminate between
classes
● Could make a reliable analysis impossible
● Alternative approach: Interpolation
COMP 3314 10

Interpolation
● Estimate missing values from the other training samples in our dataset
● Example: Mean imputation
○ Replace the missing value with the mean value of the entire feature column (see the sketch below)
○ Other strategies to try: median, most_frequent, or constant (with fill_value, e.g., 42)
■ mean and median work for numerical data only; most_frequent and constant can be used for numerical data or strings
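A sketch of mean imputation with scikit-learn's SimpleImputer, assuming the df with missing values from the earlier sketches:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Mean imputation: replace each NaN with the mean of its feature column
imr = SimpleImputer(missing_values=np.nan, strategy='mean')
imputed_data = imr.fit_transform(df.values)
print(imputed_data)

# Other strategies to try:
#   strategy='median'
#   strategy='most_frequent'
#   strategy='constant', fill_value=42
```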
COMP 3314 11

Scikit-Learn Estimator API


● SimpleImputer is a Transformer class
○ Used for data transformation
○ Two essential methods
■ fit
■ transform
● Estimator class
○ Very similar to the Transformer class
○ Two essential methods
■ fit
■ predict
■ transform (optional)
COMP 3314 12

Transformer - Fit and Transform


● fit method
○ Used to learn the parameters from the training data
● transform method
○ Uses those parameters to transform the data
● Note: the number of features needs to be identical for the data used in fit and the data passed to transform
COMP 3314 13

Estimator - Fit and Predict


● Use fit method to learn parameters
○ Additionally provide class labels
● Use predict method to make predictions
about unlabeled data
COMP 3314 14

Handling Categorical Data
● So far we have been working exclusively with numerical data (values ranging from −infinity to infinity)
● How do we handle categorical data?
○ A categorical feature can take on one of a limited, and usually fixed, number of possible values; there is a fixed number of distinct values
● Example of categorical data: t-shirt size (XL, L, M)
COMP 3314 15

Categorical Data
● It is common that real-world datasets contain categorical features
○ How to deal with this type of data?
● Nominal features vs ordinal features
○ Ordinal features can be sorted / ordered
■ E.g., t-shirt size, because we can define an order XL>L>M
○ Nominal features don't imply any order
■ E.g., t-shirt color
COMP 3314 16

Example Dataset
● A toy dataset with a nominal feature (color), an ordinal feature (size), and a numerical feature (price), plus a class label (see the sketch below)
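A minimal sketch of such a dataset (the values are illustrative, not the original slide's screenshot):

```python
import pandas as pd

# Toy dataset: nominal color, ordinal size, numerical price, plus a class label
df = pd.DataFrame([
    ['green', 'M',  10.1, 'class2'],
    ['red',   'L',  13.5, 'class1'],
    ['blue',  'XL', 15.3, 'class2']],
    columns=['color', 'size', 'price', 'classlabel'])
print(df)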


COMP 3314 17

Mapping Ordinal Features


● To ensure correct interpretation of ordinal features, convert string values
to integers

● A reverse mapping can be used to convert the integers back to the original string values (see the sketch below)
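A sketch of the mapping and reverse mapping, assuming the df defined above (size_mapping is an illustrative name):

```python
# Explicit ordering for the ordinal feature
size_mapping = {'XL': 3, 'L': 2, 'M': 1}
df['size'] = df['size'].map(size_mapping)
print(df)

# Reverse mapping to recover the original string values
inv_size_mapping = {v: k for k, v in size_mapping.items()}
print(df['size'].map(inv_size_mapping))
```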
COMP 3314 18

Encoding Class Labels


● Most models require integer encoding for class labels
○ Note: class labels are not ordinal, and it doesn't matter which integer number
we assign to a particular string label
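A sketch of one way to build such an integer mapping, assuming the df defined above:

```python
import numpy as np

# Enumerate the distinct class labels; which integer goes with which label
# does not matter
class_mapping = {label: idx
                 for idx, label in enumerate(np.unique(df['classlabel']))}
print(class_mapping)
print(df['classlabel'].map(class_mapping))
```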
COMP 3314 19

LabelEncoder
● Alternatively, there is a convenient LabelEncoder class directly
implemented in scikit-learn to achieve this

● The fit_transform method is a shortcut for calling fit and transform separately (see the sketch below)
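A sketch using LabelEncoder, assuming the df defined above:

```python
from sklearn.preprocessing import LabelEncoder

class_le = LabelEncoder()
# fit_transform is a shortcut for calling fit and transform separately
y = class_le.fit_transform(df['classlabel'].values)
print(y)

# Map the integer labels back to the original strings
print(class_le.inverse_transform(y))
```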
COMP 3314 20

One-Hot Encoding
● We could use a similar approach to transform the nominal color column
of our dataset, as follows

○ Problem:
■ Model may assume that green > blue, and red > green
■ This could result in a suboptimal model
● Workaround: Use one-hot encoding
○ Create a dummy feature for each unique value of nominal features
■ E.g., a blue sample is encoded as blue = 1 , green = 0 , red = 0
COMP 3314 21

One-Hot Encoding
● Use the OneHotEncoder available in scikit-learn’s preprocessing module (see the sketch below)
○ Apply it to only a single column (here, the color column)
○ reshape(-1, 1) turns the column into the 2-D array the encoder expects; -1 means an unknown dimension that NumPy figures out
○ Each distinct value (blue, green, red) becomes its own dummy column
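A sketch, assuming the df defined above (with size already mapped to integers):

```python
from sklearn.preprocessing import OneHotEncoder

X = df[['color', 'size', 'price']].values
color_ohe = OneHotEncoder()
# Encode only the color column; reshape(-1, 1) turns the 1-D column into the
# 2-D array the encoder expects (-1 lets NumPy infer that dimension)
print(color_ohe.fit_transform(X[:, 0].reshape(-1, 1)).toarray())
```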
COMP 3314 22

One-Hot Encoding via ColumnTransformer


● To selectively transform columns in a multi-feature array, use ColumnTransformer
○ Accepts a list of (name, transformer, column(s)) tuples
○ In the sketch below, only the first column (color) is turned into dummy features; the remaining columns are passed through unchanged
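A sketch, assuming the same feature array X as above:

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

X = df[['color', 'size', 'price']].values
c_transf = ColumnTransformer([
    ('onehot', OneHotEncoder(), [0]),     # one-hot encode the color column
    ('nothing', 'passthrough', [1, 2]),   # leave size and price untouched
])
print(c_transf.fit_transform(X).astype(float))
```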


COMP 3314 23

One-Hot Encoding - Via Pandas


● An even more convenient way to create those dummy features via
one-hot encoding is to use the get_dummies method implemented
in pandas
○ get_dummies will only convert string columns
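A sketch, assuming the df defined above:

```python
import pandas as pd

# get_dummies converts only the string columns and leaves numeric ones as-is
print(pd.get_dummies(df[['price', 'color', 'size']]))
```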
COMP 3314 24

One-Hot Encoding - Dropping First Feature


● Note that we do not lose any information by removing one dummy column
○ E.g., if we remove the column color_blue, the feature information is still
preserved since if we observe color_green=0 and color_red=0, it implies that
the observation must be blue
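A sketch of dropping the first dummy column via get_dummies' drop_first argument:

```python
import pandas as pd

# Dropping the first dummy column per feature removes the redundancy
# without losing information
print(pd.get_dummies(df[['price', 'color', 'size']], drop_first=True))
```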
COMP 3314 25

UCI Wine Dataset


● The UCI wine dataset consists of 178 wine samples with 13 features describing
their different chemical properties
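The original slides presumably load the CSV from the UCI repository; as a sketch, scikit-learn's bundled copy of the same data can be used instead:

```python
import pandas as pd
from sklearn.datasets import load_wine

# scikit-learn ships a copy of the UCI Wine data: 178 samples, 13 features
wine = load_wine()
df_wine = pd.DataFrame(wine.data, columns=wine.feature_names)
df_wine['class'] = wine.target
print(df_wine.shape)                  # (178, 14)
print(df_wine['class'].value_counts())
```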
COMP 3314 26

UCI Wine Dataset: Training-Testing


● Let’s first divide the dataset into separate training and testing sets, using 30% of the samples for testing (see the sketch below)
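A sketch of the split with scikit-learn's train_test_split, assuming X and y from the Wine data above:

```python
from sklearn.model_selection import train_test_split

X, y = wine.data, wine.target
# 70% training, 30% testing; stratify keeps the class proportions equal
# in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
```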


COMP 3314 27

UCI Wine Dataset: Training-Testing


● It is important to balance the trade-off between inaccurate estimation of
generalization error and withholding too much information from the
learning algorithm
● In practice, the most commonly used splits are 60:40, 70:30, or 80:20,
depending on the size of the initial dataset
○ For large datasets, 90:10 or 99:1 splits are also common and appropriate
■ Intuition: if we need 50 test samples, a dataset of 100 requires a 50:50 split, whereas a dataset of 500 only needs 90:10; the bigger the dataset, the smaller the testing ratio can be (the testing set should not exceed 50%)
● Instead of discarding the allocated test data after model training and evaluation, we can retrain the classifier on the entire dataset, as this could improve the predictive performance of the model
○ While this approach is generally recommended, it could lead to worse generalization performance
COMP 3314 28

Feature Scaling
● The majority of ML algorithms require feature scaling
○ Decision trees and random forests are two of the few ML algorithms that don’t require feature scaling
● Importance
○ Consider the squared error function in Adaline for two dimensional features
where one feature is measured on a scale from 1 to 10 and the second feature is
measured on a scale from 1 to 100,000
■ The second feature would contribute to the error with a much higher
significance
● Two common approaches to bring different features onto the same scale
○ Normalization
■ E.g., rescaling features to a range of [0, 1]
○ Standardization
■ E.g., center features at mean 0 with standard deviation 1
COMP 3314 29

Feature Scaling - Normalization
● Most often, normalization refers to rescaling features to the range [0, 1]
● To normalize our data, we can simply apply min-max scaling to each feature column: find the minimum and maximum value of the column, then compute a new value x(i)_norm for each sample x(i) as follows
○ x(i)_norm = (x(i) − x_min) / (x_max − x_min)
○ Here x_min is the smallest value in the feature column and x_max the largest
● scikit-learn implements this as the MinMaxScaler class (sketched below)
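A sketch using MinMaxScaler, assuming the X_train/X_test split from above:

```python
from sklearn.preprocessing import MinMaxScaler

mms = MinMaxScaler()
# Fit (i.e., learn min and max per column) on the training data only,
# then apply the same scaling to the test data
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)
```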
COMP 3314 30

Feature Scaling - Standardization


● Standardization is more practical for various reasons, including retaining useful information about outliers
● A new value x(i)_std of a sample x(i) is calculated as follows
○ x(i)_std = (x(i) − μ_x) / σ_x
● Here μ_x is the sample mean of the feature column and σ_x the corresponding standard deviation
● Similar to the MinMaxScaler class, scikit-learn also implements a class for standardization (StandardScaler, sketched below)
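A sketch using StandardScaler, assuming the same split:

```python
from sklearn.preprocessing import StandardScaler

stdsc = StandardScaler()
# Mean and standard deviation are estimated from the training data only
X_train_std = stdsc.fit_transform(X_train)
X_test_std = stdsc.transform(X_test)
```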
COMP 3314 31

Normalization vs. Standardization


● The following example illustrates the difference between
standardization and normalization
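A sketch of the comparison on a small illustrative array:

```python
import numpy as np

ex = np.array([0, 1, 2, 3, 4, 5], dtype=float)

# Standardization: zero mean, unit standard deviation
print('standardized:', (ex - ex.mean()) / ex.std())

# Min-max normalization: rescale to the range [0, 1]
print('normalized:', (ex - ex.min()) / (ex.max() - ex.min()))
```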
COMP 3314 32

Robust Scaler
● More advanced methods for feature scaling are available in sklearn
● The RobustScaler is especially helpful and recommended if
working with small datasets that contain many outliers
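A sketch, assuming the same training/testing split as above:

```python
from sklearn.preprocessing import RobustScaler

# RobustScaler centers on the median and scales by the interquartile range,
# so a few extreme outliers have little influence on the result
rbs = RobustScaler()
X_train_robust = rbs.fit_transform(X_train)
X_test_robust = rbs.transform(X_test)
```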
COMP 3314 33

Feature Selection
● Selects a subset of relevant features
○ Simplify model for easier interpretation
○ Shorten training time
○ Avoid curse of dimensionality
○ Reduce overfitting
● Feature selection ≠ feature extraction (covered in next chapter)
○ Selecting subset of the features ≠ creating new features
● We are going to look at two techniques for feature selection
○ L1 Regularization
○ Sequential Backward Selection (SBS)
COMP 3314 34

L1 vs. L2 Regularization
● L2 regularization (penalty), used in chapter 3:
○ ||w||_2^2 = sum_j (w_j)^2
● Another approach: L1 regularization (penalty):
○ ||w||_1 = sum_j |w_j|
● This will usually yield sparse feature weights
○ Most feature weights will be zero; a zero weight means the feature is not selected (discarded)
● Sparsity can be useful in practice if we have a high-dimensional dataset with many features that are irrelevant
● L1 regularization can therefore be used as a technique for feature selection
COMP 3314 35

Geometric Interpretation
● To better understand how L1 regularization encourages sparsity, let’s take a look
at a geometric interpretation of regularization
● Consider the sum of squared errors cost function used for Adaline
● Plot of the contours of a convex cost function for two weight coefficients w1 and w2
○ The cost increases as we move outward from the minimum at the center
COMP 3314 36

Geometric Interpretation: L2 Regularization


● Regularization adds a penalty to the cost function to encourage smaller weights
○ By increasing the regularization strength λ we shrink the weights towards zero and decrease the dependency of our model on the training data
○ We cannot simply minimize the cost alone, because the penalty would then be huge; we need to balance the cost against the penalty
○ For the L2 penalty, all weight vectors at the same distance from the origin (i.e., on the same circle) have the same penalty value
COMP 3314 37

Geometric Interpretation: L1 Regularization


● Since the L1 penalty is the sum of the absolute weight coefficients, we can represent it as a diamond shape
● The point of the diamond closest to the minimum of the unpenalized cost is very likely to be one of the sharp corners, which lie on the axes; this is why L1 regularization encourages sparsity

Mathematical details can be found in


Section 3.4 of
The Elements of Statistical Learning
COMP 3314 38

Sparse Solution
● We can simply set the penalty parameter to ‘l1’ for models in scikit-learn that
support L1 regularization

● In scikit-learn, w0 corresponds to intercept_ and wj (for j > 0) corresponds to the


values in coef_
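A sketch, assuming the standardized Wine split from above (liblinear is one of the solvers that supports the L1 penalty):

```python
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(penalty='l1', C=1.0, solver='liblinear')
lr.fit(X_train_std, y_train)
print('Training accuracy:', lr.score(X_train_std, y_train))
print('Test accuracy:', lr.score(X_test_std, y_test))

print(lr.intercept_)   # w0 (one per class)
print(lr.coef_)        # wj for j > 0; many entries are exactly zero
```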
COMP 3314 39

Sparse Solution - Regularization Strength
● Plotting the weight coefficients for different regularization strengths shows how the weights converge to zero (see the sketch below)
○ If C (the inverse of the regularization strength) is too small, i.e., regularization is too strong, all weights are driven to zero; larger C leaves more weights non-zero
○ We should find an appropriate C, neither too large nor too small
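A sketch of computing such a regularization path, assuming the standardized Wine split from above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fit models over a range of C values (C is the inverse regularization
# strength) and record the weights of one of the classes
weights, params = [], []
for c in np.arange(-4.0, 6.0):
    lr = LogisticRegression(penalty='l1', C=10.0**c, solver='liblinear')
    lr.fit(X_train_std, y_train)
    weights.append(lr.coef_[0])
    params.append(10.0**c)

# Small C (strong regularization): all weights driven to zero.
# Large C (weak regularization): most weights become non-zero.
print(np.round(np.array(weights), 3))
```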
COMP 3314 40

Sequential Backward Selection (SBS)


● Reduces an initial d-dimensional space to a k-dimensional subspace (k < d)
by automatically selecting features that are most relevant
● Idea:
○ Sequentially remove features until the desired number of features is reached
○ Define a criterion function J to be maximized
■ E.g., performance of the classifier after removal
■ Use a validation subset of the training set for performance
evaluation
○ Eliminate the feature that causes the least performance loss

■ In each round: remove each feature in turn, evaluate the classifier without it, and put it back; keep the removal whose remaining subset gives the maximum performance
COMP 3314 41

SBS
Steps:
1. Initialize the algorithm with k = d, where d is the dimensionality of the full feature space X_d
2. Determine the feature x⁻ = argmax_x J(X_k − x) that maximizes the criterion function J when removed
3. Remove the feature x⁻ from the feature set:
   X_(k−1) = X_k − x⁻,  k = k − 1
4. Terminate if k equals the number of desired features; otherwise, go to step 2

● In the following we will implement SBS in Python from scratch


COMP 3314 42
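A from-scratch sketch of SBS along the lines described above (the class and attribute names SBS, subsets_, and scores_ are illustrative choices, not a scikit-learn API):

```python
from itertools import combinations
import numpy as np
from sklearn.base import clone
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


class SBS:
    """Sequential Backward Selection (from-scratch sketch)."""

    def __init__(self, estimator, k_features, scoring=accuracy_score,
                 test_size=0.25, random_state=1):
        self.estimator = clone(estimator)
        self.k_features = k_features   # desired number of features
        self.scoring = scoring         # criterion function J
        self.test_size = test_size
        self.random_state = random_state

    def fit(self, X, y):
        # Internal validation split used to evaluate the criterion J
        X_train, X_valid, y_train, y_valid = train_test_split(
            X, y, test_size=self.test_size, random_state=self.random_state)

        dim = X_train.shape[1]
        self.indices_ = tuple(range(dim))
        self.subsets_ = [self.indices_]
        self.scores_ = [self._calc_score(X_train, y_train,
                                         X_valid, y_valid, self.indices_)]

        while dim > self.k_features:
            scores, subsets = [], []
            # Try removing each feature in turn
            for p in combinations(self.indices_, r=dim - 1):
                scores.append(self._calc_score(X_train, y_train,
                                               X_valid, y_valid, p))
                subsets.append(p)

            # Keep the subset (i.e., the removal) with the best score
            best = int(np.argmax(scores))
            self.indices_ = subsets[best]
            self.subsets_.append(self.indices_)
            self.scores_.append(scores[best])
            dim -= 1
        return self

    def transform(self, X):
        return X[:, self.indices_]

    def _calc_score(self, X_train, y_train, X_valid, y_valid, indices):
        self.estimator.fit(X_train[:, indices], y_train)
        y_pred = self.estimator.predict(X_valid[:, indices])
        return self.scoring(y_valid, y_pred)


# Example usage on the standardized Wine training data from earlier sketches
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=5)
sbs = SBS(knn, k_features=1)
sbs.fit(X_train_std, y_train)
# sbs.scores_ holds the validation accuracy for 13, 12, ..., 1 features
```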
COMP 3314 43

● Plotting the validation accuracy against the number of features (removing one feature at a time, from 13 down to 1) shows that the smallest subset that still performs well contains 3 features
COMP 3314 44

SBS - Analyzing the Result


● The smallest feature subset (k = 3) that yielded such a good performance on the validation dataset can be read off from the column indices recorded by SBS
● We can then compare the accuracy of the KNN classifier on the original test set using all 13 features with the accuracy obtained using only the three-feature subset (see the sketch below)
○ In this example, the three-feature subset gives a slightly lower testing accuracy
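A sketch of this comparison, assuming the sbs, knn, and standardized Wine split from the sketches above (the subset index below assumes SBS started from 13 features):

```python
# Pick the recorded subset that has 3 features left
# (index 10, since subsets_ runs from 13 features down to 1)
k3 = list(sbs.subsets_[10])

knn.fit(X_train_std, y_train)
print('Test accuracy, all 13 features:', knn.score(X_test_std, y_test))

knn.fit(X_train_std[:, k3], y_train)
print('Test accuracy, 3 features:', knn.score(X_test_std[:, k3], y_test))
```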


COMP 3314 45

Feature Selection Algorithms in scikit-learn


● There are many more feature selection algorithms available via
scikit-learn
● A comprehensive discussion of the different feature selection
methods is beyond the scope of this lecture
○ A good summary with illustrative examples can be found here
COMP 3314 46

Assessing Feature Importance


● We can determine relevant features using random forest
○ Measure the feature importance as the averaged information gain
● The random forest implementation in scikit-learn already collects the
feature importance values for us
○ Access them via the feature_importances_ attribute after fitting a
RandomForestClassifier
● In the following we will train a forest of 500 trees on the Wine dataset
and rank the 13 features by their respective importance measures
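A sketch, assuming the Wine data and split from the earlier sketches:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feat_labels = wine.feature_names
forest = RandomForestClassifier(n_estimators=500, random_state=1)
forest.fit(X_train, y_train)

importances = forest.feature_importances_
# Rank the 13 features from most to least important
for rank, idx in enumerate(np.argsort(importances)[::-1], start=1):
    print(f'{rank:2d}) {feat_labels[idx]:30s} {importances[idx]:.4f}')
```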
COMP 3314 47

● Note: different ML algorithms may choose different features and therefore give different results
COMP 3314 48

Conclusion
● Handle missing data correctly
● Encode categorical variables correctly
● Map ordinal and nominal feature values to integer representations
● L1 regularization can help us to avoid overfitting by reducing the
complexity of a model
● Use sequential feature selection algorithms to select meaningful features from a dataset
COMP 3314 49

References
● Most materials in this chapter are
based on
○ Book
○ Code
COMP 3314 50

References
● Some materials in this chapter
are based on
○ Book
○ Code
COMP 3314 51

References
● The Elements of Statistical Learning: Data Mining, Inference, and
Prediction, Second Edition
○ Trevor Hastie, Robert Tibshirani, Jerome Friedman
● https://web.stanford.edu/~hastie/ElemStatLearn/
● Pandas User Guide: Working with missing data
