
What is Feature Engineering?

Feature engineering is the process of transforming raw data into features that are
suitable for machine learning models. In other words, it is the process of
selecting, extracting, and transforming the most relevant features from the
available data to build more accurate and efficient machine learning models.

What is a Feature?
In the context of machine learning, a feature (also known as a variable or
attribute) is an individual measurable property or characteristic of a data point
that is used as input for a machine learning algorithm. Features can be numerical,
categorical, or text-based, and they represent different aspects of the data that
are relevant to the problem at hand.

Why is Feature Engineering Needed in Machine Learning?


Improve User Experience: Better features produce more accurate models, which
translate into more useful and reliable predictions for end users.

Competitive Advantage: Models built on well-engineered features can outperform
those built from the same raw data without such preparation.

Meet Customer Needs: Informative features help models capture what actually drives
customer behavior, so products can respond to real needs.

Increase Revenue: More accurate predictions support better business decisions,
such as targeting and pricing, which can directly increase revenue.

Future-Proofing: A well-designed feature pipeline is easier to maintain and adapt
as data sources and requirements evolve.

1. Feature Creation
Feature Creation is the process of generating new features based on domain
knowledge or by observing patterns in the data. It is a form of feature engineering
that can significantly improve the performance of a machine learning model.

Types of Feature Creation:


Domain-Specific: Creating new features based on domain knowledge, such as creating
features based on business rules or industry standards.

Data-Driven: Creating new features by observing patterns in the data, such as
calculating aggregations or creating interaction features.

Synthetic: Generating new features by combining existing features or synthesizing
new data points. Each of these is illustrated in the sketch below.
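A minimal pandas sketch of all three types, assuming a made-up transactions table
(every column name here is invented purely for illustration):

    import pandas as pd

    # Toy transactions table; the columns are hypothetical, for illustration only.
    df = pd.DataFrame({
        "customer_id": [1, 1, 2, 2, 2],
        "amount": [20.0, 35.0, 10.0, 5.0, 50.0],
        "quantity": [2, 5, 1, 1, 4],
    })

    # Domain-specific: a business rule flagging "large" orders.
    df["is_large_order"] = df["amount"] > 30

    # Data-driven: a per-customer aggregation joined back onto each row.
    df["customer_avg_amount"] = df.groupby("customer_id")["amount"].transform("mean")

    # Synthetic: combine two existing features into a new one.
    df["price_per_unit"] = df["amount"] / df["quantity"]

    print(df)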

2. Feature Transformation
Feature Transformation is the process of transforming the features into a more
suitable representation for the machine learning model. This is done to ensure that
the model can effectively learn from the data.

Types of Feature Transformation:


Normalization: Rescaling the features to have a similar range, such as between 0
and 1, to prevent some features from dominating others.

Scaling: Rescaling numerical features to a similar scale, such as unit standard
deviation, so that features measured in different units can be compared and the
model weighs them all comparably.

Encoding: Transforming categorical features into a numerical representation.
Examples are one-hot encoding and label encoding.

Transformation: Transforming the features using mathematical operations to change
the distribution or scale of the features. Examples are logarithmic, square root,
and reciprocal transformations. The encoding and transformation cases are shown in
the sketch below.
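As a rough sketch of the encoding and transformation types, the snippet below
one-hot encodes and label encodes a categorical column and log-transforms a skewed
numeric one. The tiny table is made up for the example, and OneHotEncoder's
sparse_output argument assumes scikit-learn 1.2 or later (older versions use
sparse instead):

    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import LabelEncoder, OneHotEncoder

    df = pd.DataFrame({
        "city": ["Paris", "Tokyo", "Paris", "Lima"],   # categorical feature
        "income": [30_000, 120_000, 45_000, 8_000],    # right-skewed numeric feature
    })

    # One-hot encoding: one binary column per category.
    city_onehot = OneHotEncoder(sparse_output=False).fit_transform(df[["city"]])

    # Label encoding: each category mapped to an integer code.
    city_labels = LabelEncoder().fit_transform(df["city"])

    # Logarithmic transformation: compresses the long right tail.
    df["log_income"] = np.log1p(df["income"])

    print(city_onehot)
    print(city_labels)
    print(df)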

3. Feature Extraction
Feature Extraction is the process of creating new features from existing ones to
provide more relevant information to the machine learning model. This is done by
transforming, combining, or aggregating existing features.

Types of Feature Extraction:


Dimensionality Reduction: Reducing the number of features by transforming the data
into a lower-dimensional space while retaining important information. Examples are
PCA and t-SNE.

Feature Combination: Combining two or more existing features to create a new one.
For example, the interaction between two features.

Feature Aggregation: Aggregating features to create a new one. For example,
calculating the mean, sum, or count of a set of features.

Feature Transformation: Transforming existing features into a new representation.
For example, log transformation of a feature with a skewed distribution. Several of
these are sketched below.
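A minimal scikit-learn sketch of dimensionality reduction, combination, and
aggregation, using random data purely as a stand-in:

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))  # 100 samples, 5 features (random stand-in data)

    # Dimensionality reduction: project the 5 features onto 2 principal components.
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)
    print(X_reduced.shape)                 # (100, 2)
    print(pca.explained_variance_ratio_)   # variance retained by each component

    # Feature combination: the interaction (product) of the first two features.
    interaction = X[:, 0] * X[:, 1]

    # Feature aggregation: the row-wise mean of a group of related features.
    row_mean = X[:, 2:5].mean(axis=1)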

4. Feature Selection
Feature Selection is the process of selecting a subset of relevant features from
the dataset to be used in a machine learning model. It is an important step in the
feature engineering process, as it can have a significant impact on the model’s
performance.

Types of Feature Selection:


Filter Method: Based on a statistical measure of the relationship between each
feature and the target variable. Features that score highly, for example those most
correlated with the target, are selected.

Wrapper Method: Based on the evaluation of the feature subset using a specific
machine learning algorithm. The feature subset that results in the best performance
is selected.

Embedded Method: Based on feature selection performed as part of the training
process of the machine learning algorithm itself, for example through
regularization or tree-based feature importances. All three approaches are sketched
below.
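As a rough sketch, the snippet below applies one method of each kind with
scikit-learn on its built-in breast cancer dataset; the choice of models and of
k = 10 features is arbitrary, for illustration only:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import RFE, SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression

    X, y = load_breast_cancer(return_X_y=True)

    # Filter method: score each feature against the target, keep the top 10.
    X_filtered = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)

    # Wrapper method: recursive feature elimination around a chosen model.
    rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10)
    X_wrapped = rfe.fit_transform(X, y)

    # Embedded method: importances learned as a side effect of training.
    forest = RandomForestClassifier(random_state=0).fit(X, y)
    print(forest.feature_importances_[:5])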

5. Feature Scaling
Feature Scaling is the process of transforming the features so that they have a
similar scale. This is important in machine learning because the scale of the
features can affect the performance of the model.

Types of Feature Scaling:


Min-Max Scaling: Rescaling the features to a specific range, such as between 0 and
1, by subtracting the minimum value and dividing by the range.

Standard Scaling: Rescaling the features to have a mean of 0 and a standard
deviation of 1 by subtracting the mean and dividing by the standard deviation.

Robust Scaling: Rescaling the features to be robust to outliers by subtracting the
median and dividing by the interquartile range. The sketch below compares all three
scalers.
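A minimal side-by-side comparison with scikit-learn, using a single made-up feature
that contains one outlier:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

    # One feature with an outlier (100.0), to show how the scalers differ.
    X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

    # Min-max scaling: (x - min) / (max - min), values end up in [0, 1].
    print(MinMaxScaler().fit_transform(X).ravel())

    # Standard scaling: (x - mean) / std, mean 0 and standard deviation 1.
    print(StandardScaler().fit_transform(X).ravel())

    # Robust scaling: (x - median) / IQR, much less sensitive to the outlier.
    print(RobustScaler().fit_transform(X).ravel())

Note how the outlier squashes the min-max output of the other values toward zero,
while the robust output keeps them well spread.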
