0% found this document useful (0 votes)
2 views21 pages

Feature Engineering

The document outlines the process of problem-solving in machine learning, emphasizing the importance of problem statements, domain studies, and feature engineering. It details various aspects of feature engineering, including feature selection, transformation, creation, encoding, binning, and extraction, highlighting their roles in enhancing the predictive power of machine learning models. The document serves as a guide for effectively preparing data to improve machine learning outcomes.

Uploaded by

qubefexe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views21 pages

Feature Engineering

The document outlines the process of problem-solving in machine learning, emphasizing the importance of problem statements, domain studies, and feature engineering. It details various aspects of feature engineering, including feature selection, transformation, creation, encoding, binning, and extraction, highlighting their roles in enhancing the predictive power of machine learning models. The document serves as a guide for effectively preparing data to improve machine learning outcomes.

Uploaded by

qubefexe
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 21

1

AGENDA
Problem Statement

Domain Study

Problem defining

Feature Engineering

Data Frame Creation

Data set preprocessing

2
What is a
Problem
Statement
?

3
4
5
DOMAIN
STUDY
DOMAIN STUDY

It aims to identify features common to


a domain of applications, selecting and
abstracting the objects and operations
that characterize those features.
7
F E AT U R E E N G I N E E R I N G

 Feature engineering is the process of using domain


knowledge of the data to create features that make
machine learning algorithms work.

 If feature engineering is done correctly, it increases


the predictive power of machine learning algorithms by
creating features from raw data that help facilitate the
machine learning process.
Feature engineering is the most important art in machine
learning which creates a huge difference between a good
model and a bad model.

S T E P S T H AT A R E I N V O LV E D W H I L E S O LV I N G A N Y
PROBLEM IN MACHINE LEARNING ARE AS FOLLOWS:
F E AT U R E S E L E C T I O N

 Feature Selection is the method of reducing the


input variable to your model by using only
relevant data and getting rid of noise in data.

 It is the process of automatically choosing


relevant features for your machine learning
model based on the type of problem you are trying
to solve.
B E N E F I T S O F F E AT U R E S E L E C T I O N

Click icon to add picture


F E AT U R E T R A N S F O R M AT I O N

 It refers to the algorithm family that creates new


features using the existing features.

 These new features may not have the same


interpretation as the original features, but they may
have more explanatory power in a different space rather
than in the original space. This can also be used
for Feature Reduction.
Icon
F E AT U R E C R E AT I O N

 Creating features involves creating new


variables which will be most helpful for our
model.

 These artificial features are then used by that


algorithm in order to improve its performance,
or in other words reap better results.
F E AT U R E E N C O D I N G

Transforming the categorical


values of the relevant features into
numerical ones. This process is
called feature encoding.
F E AT U R E B I N N I N G / D I S C R E T I Z AT I O N

 Feature binning aggregates large amounts of point features


into dynamic polygon bins that vary through scaled levels of
detail.

 Feature binning can improve both drawing performance


and data comprehension.
F E AT U R E E X T R A C T I O N

 Feature extraction helps to reduce the amount of redundant


data from the data set.

 It yields better results than applying machine learning directly to


the raw data.

You might also like