
MODULE 4



Feature Engineering
• Feature engineering is the pre-processing step of machine learning that transforms raw data into features suitable for building a predictive model.
• All machine learning algorithms take input data to generate an output.
• The input data is typically in tabular form, consisting of rows (instances or observations) and columns (variables or attributes); these attributes are often known as features.
• Example: an image is an instance in computer vision, but a line in the image could be the feature. In NLP, a document can be an observation, and the word count could be the feature.
• A feature is an attribute: an individual measurable property or characteristic of a phenomenon.



• The feature engineering process selects the most useful predictor variables for the model.

• Feature engineering in ML mainly comprises four processes: Feature Creation, Transformations, Feature Extraction, and Feature Selection.
1. Feature Creation:
• Feature creation is finding the most useful variables to be used in a predictive model.
• The process is subjective and requires human creativity and intervention.
• New features are created by combining existing features through operations such as addition, subtraction, and ratios, and these new features offer great flexibility.

2. Transformation:
• Transformation involves adjusting predictor variables to improve the accuracy and performance of the model.
• It ensures that all the variables are on the same scale, making the model easier to understand.
• It ensures that all the features are within an acceptable range, avoiding computational errors.
3. Feature Extraction:
• Feature extraction is an automated feature engineering process that
generates new variables by extracting them from the raw data.
• The aim is to reduce the volume of data so that it can be easily used and
managed for data modelling.
• Feature extraction methods include cluster analysis, text analytics, edge detection algorithms, and principal component analysis (PCA).

4. Feature Selection:
• Feature selection is a way of selecting the subset of the most relevant features from the original feature set by removing redundant, irrelevant, or noisy features.
• This is done in order to reduce overfitting in the model and improve its performance.
Feature Engineering Techniques
1. Imputation:
• Imputation deals with handling missing values in data.
• Deleting records with missing values is one way of dealing with the missing-data issue, but it can mean losing out on a chunk of valuable data. This is where imputation helps.
• Data imputation can be classified into two types:
 Categorical Imputation: Missing categorical values are generally replaced by the most commonly occurring value (mode) of the feature.
 Numerical Imputation: Missing numerical values are generally replaced by the mean or median of the corresponding feature.



• Example: Categorical Imputation



• Example: Numerical Imputation
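
As an illustration, a minimal pandas sketch of both imputation types (the column names and values here are hypothetical):

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "color": ["red", "blue", None, "red", "red"],  # categorical, one missing value
    "age": [25, np.nan, 31, 22, np.nan],           # numerical, two missing values
})

# Categorical imputation: fill with the mode (most frequent category)
df["color"] = df["color"].fillna(df["color"].mode()[0])

# Numerical imputation: fill with the median (the mean works similarly)
df["age"] = df["age"].fillna(df["age"].median())

print(df)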



2. Discretization:
• Discretization involves taking a set of data values and grouping them in some logical fashion into bins (or buckets).
• Binning can apply to numerical values as well as to categorical values.



• The grouping of data can be done as follows:
 Grouping of equal intervals (equal width)
 Grouping based on equal frequencies (of observations in the bin)
• Example:
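A minimal pandas sketch of both grouping strategies (the ages below are made up for illustration):

import pandas as pd

ages = pd.Series([5, 17, 24, 31, 45, 52, 68, 79])

# Equal-width binning: each bin spans the same range of values
equal_width = pd.cut(ages, bins=4)

# Equal-frequency binning: each bin holds (roughly) the same number of observations
equal_freq = pd.qcut(ages, q=4)

print(pd.DataFrame({"age": ages, "equal_width": equal_width, "equal_freq": equal_freq}))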



3. Categorical encoding:
• Categorical encoding is a technique used to encode categorical features into numerical values, which are usually simpler for an algorithm to understand.
• This can be done by:
(i) Integer Encoding
(ii) One-Hot Encoding



(i) Integer Encoding:
• Integer encoding consists of replacing the categories with digits from 1 to n (or 0 to n-1), where n is the number of distinct categories of the variable.
• Each unique category is assigned an integer value.
• This method is also called label encoding.
• It is used when an ordinal relationship exists among the categories of the variable.



(ii) One-Hot Encoding:
• For categorical variables where no ordinal relationship exists, one-hot encoding (OHE) can be applied.
• Here a new binary variable is added for each unique category.
• In the “color” variable example, there are 3 categories: red, green and blue.
• Therefore 3 binary variables are needed: ‘color_red’, ‘color_green’ and ‘color_blue’.
• A “1” value is placed in the binary variable for the observation’s color and “0” values in the other binary variables.
• The binary variables are often called “dummy variables” or “indicator variables”. Both encodings are sketched below.
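
A minimal pandas sketch of both encodings on the “color” example:

import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green", "red"]})

# Integer (label) encoding: map each category to an integer code
df["color_int"] = df["color"].astype("category").cat.codes

# One-hot encoding: one binary indicator column per category
one_hot = pd.get_dummies(df["color"], prefix="color")

print(pd.concat([df, one_hot], axis=1))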
4. Feature Splitting:
• Feature splitting is the process of separating a feature into two or more parts to make new features.
• This technique helps algorithms better understand and learn the patterns in the dataset.
• Example 1: Sale Date is split into year, month and day.

• Example 2: Time stamp is split into 6 different attributes.
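
A plausible pandas sketch of splitting a timestamp into six attributes (the specific six chosen here are an assumption):

import pandas as pd

df = pd.DataFrame({"timestamp": pd.to_datetime(["2023-07-14 09:30:00", "2024-01-02 18:05:00"])})

# Split one timestamp feature into several simpler features
df["year"] = df["timestamp"].dt.year
df["month"] = df["timestamp"].dt.month
df["day"] = df["timestamp"].dt.day
df["hour"] = df["timestamp"].dt.hour
df["minute"] = df["timestamp"].dt.minute
df["weekday"] = df["timestamp"].dt.dayofweek

print(df)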



5. Handling outliers:
• Outliers are unusually high or low values in the dataset that are unlikely to occur in normal scenarios.
• Since outliers can adversely affect model predictions, they must be handled appropriately.
• Methods of handling outliers include (see the sketch after this list):
 Removal: The records containing outliers are removed from the dataset. However, the presence of outliers across multiple variables could result in losing out on a large portion of the data.
 Replacing values: The outliers can alternatively be treated as missing values and replaced using appropriate imputation.
 Capping: The maximum and minimum values are capped and replaced with a chosen boundary value.
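
A minimal pandas sketch of the three methods, flagging outliers with the common IQR rule (the rule choice is an assumption; the slides do not specify one):

import pandas as pd

s = pd.Series([12, 15, 14, 13, 110, 16, 15, -40])

# Flag outliers as points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

removed = s[(s >= lower) & (s <= upper)]                         # removal
replaced = s.mask((s < lower) | (s > upper)).fillna(s.median())  # replace via imputation
capped = s.clip(lower=lower, upper=upper)                        # capping

print(removed, replaced, capped, sep="\n\n")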
6. Variable transformations:
• Variable transformation techniques can help normalize skewed data.
• Skewness is a measure of the asymmetry of a distribution.
• A distribution is asymmetrical when its left and right sides are not mirror images.
• Common variable transformations are the logarithmic transformation, square root transformation and Box-Cox transformation, which, when applied to heavy-tailed distributions, produce less skewed values.
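
A minimal sketch comparing the three transformations on synthetic right-skewed data (the log-normal sample is an arbitrary choice for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # heavily right-skewed sample

log_x = np.log(x)                # logarithmic transformation
sqrt_x = np.sqrt(x)              # square root transformation
boxcox_x, lam = stats.boxcox(x)  # Box-Cox estimates the best power parameter

for name, arr in [("raw", x), ("log", log_x), ("sqrt", sqrt_x), ("box-cox", boxcox_x)]:
    print(f"{name:8s} skewness = {stats.skew(arr):+.2f}")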
7. Scaling:
• Feature scaling is a method used to normalize the range of independent
variables or features of data.
• The commonly used processes of scaling include:
 Min-Max Scaling/Normalization: This process involves the rescaling of
all values in a feature in the range 0 to 1. In other words, the minimum
value in the original range will take the value 0, the maximum value will
take 1 and the rest of the values in between the two extremes will be
appropriately scaled.

 Standardization/Variance scaling: The mean is subtracted from every data point and the result is divided by the standard deviation, giving a distribution with a mean of 0 and a variance of 1.
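
A minimal scikit-learn sketch of both scaling processes on a toy column:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Min-max scaling: (x - min) / (max - min), rescaling values into [0, 1]
minmax = MinMaxScaler().fit_transform(X)

# Standardization: (x - mean) / std, giving mean 0 and variance 1
standard = StandardScaler().fit_transform(X)

print(np.hstack([X, minmax, standard]))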
8. Creating features:
• Feature creation involves deriving new features from existing ones.
• This can be done by simple mathematical operations such as aggregations to obtain the mean, median, mode, sum, or difference, and even the product of two values.
• Although derived directly from the given data, these features, when carefully chosen to relate to the target, can have an impact on model performance.
• Example:
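A minimal pandas sketch with a hypothetical table; the derived columns are made up to show the idea:

import pandas as pd

df = pd.DataFrame({
    "length_m": [2.0, 3.5, 4.0],
    "width_m": [1.0, 1.5, 2.0],
    "price": [100.0, 260.0, 400.0],
})

# Derive new features from existing ones
df["area_m2"] = df["length_m"] * df["width_m"]    # product of two values
df["price_per_m2"] = df["price"] / df["area_m2"]  # ratio that may relate to the target

print(df)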

Introduction to ML



a) Evolution of Machine Learning
• The term Machine Learning (ML) was first used by Arthur Samuel, one of
the pioneers of Artificial Intelligence at IBM, in 1959.
• Machine learning (ML) is an important tool for leveraging technologies built around artificial intelligence.
• Because of its learning and decision-making abilities, machine learning is often
referred to as AI, though, in reality, it is a subdivision of AI.
• Until the late 1970s, it was a part of AI’s evolution. Then, it branched off to
evolve on its own.
• Machine learning is now responsible for some of the most significant
advancements in technology.



b) What is Machine Learning (ML)?

• Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.

• Machine learning is an application of AI that provides systems the ability to learn on their own and improve from experience without being programmed externally.

• Machine learning was defined by Stanford University as “the science of getting computers to act without being explicitly programmed.”
• Traditional programming is a manual process: the programmer creates the program. Programming aims to answer a problem using a predefined set of rules or logic.

• In machine learning, the algorithm automatically formulates the rules from the data. Machine learning seeks to construct a model or logic for the problem by analyzing its input data and answers.
c) Types of ML
• Based on the methods and way of learning, machine learning is divided into four types.

1. Supervised Machine Learning Algorithms:
• The primary purpose of supervised learning is to learn from labeled sample data in order to make predictions about unavailable, future or unseen data.
• In supervised learning there are input variables (x) and an output variable (Y), and an algorithm is used to learn the mapping function from the input to the output: Y = f(x).
• The goal is to approximate the mapping function so well that when new input data (x) arrives, the machine can predict the output variable (Y) for that data.
• Supervised machine learning includes two major processes: classification and regression.
 Classification is the process of categorizing a set of data into classes (yes/no, true/false, 0/1, yes/no/maybe). There are various types of classification problems, such as binary classification, multi-class classification and multi-label classification. Examples of classification problems are: spam filtering, image classification, sentiment analysis, classifying cancerous and non-cancerous tumors, customer churn prediction etc.

 Regression is the process of identifying patterns and calculating predictions of continuous outcomes. The different regression analysis techniques are used when the target and independent variables show a linear or non-linear relationship with each other and the target variable contains continuous values. Examples of regression problems are: predicting house prices, predicting a month’s sales, predicting a person’s age, predicting rainfall, determining market trends etc. A minimal sketch of both processes follows.
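
A minimal scikit-learn sketch of the two supervised processes on synthetic data (the datasets and models here are arbitrary illustrative choices):

from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: learn a mapping from features to discrete classes (0/1)
Xc, yc = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(Xc, yc)
print("predicted class:", clf.predict(Xc[:1]))

# Regression: learn a mapping from features to a continuous target
Xr, yr = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
reg = LinearRegression().fit(Xr, yr)
print("predicted value:", reg.predict(Xr[:1]))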
• The most widely used supervised algorithms are:
 Linear Regression
 Logistic Regression
 Random Forest
 Boosting algorithms
 Support Vector Machines
 Decision Trees
 Naive Bayes
 Nearest Neighbor.



2. Unsupervised Machine Learning Algorithms:
• Unsupervised learning feeds on unlabeled data.
• In unsupervised machine learning algorithms, the desired results are unknown and yet to be defined.
• Unsupervised learning algorithms apply the following techniques to describe the data (a sketch follows the algorithm list below):
 Clustering: An exploration of the data used to segment it into meaningful groups (i.e., clusters) based on their internal patterns, without any prior knowledge of group membership. Groups are defined by the similarity of individual data objects to one another and by their dissimilarity from the rest. Examples: identifying fraudulent or criminal activity, classifying network traffic, identifying fake news etc.

 Dimensionality reduction: Most of the time, there is a lot of noise in the incoming data. Machine learning algorithms use dimensionality reduction to remove this noise while distilling the relevant information. Examples: image compression, classifying a database full of emails into “spam” and “not spam”.
• The most widely used unsupervised algorithms are:
 K-means clustering
 PCA (Principal Component Analysis)
 Association rule.
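
A minimal scikit-learn sketch of clustering and dimensionality reduction on synthetic data (the data and parameters are arbitrary illustrative choices):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # 100 unlabeled points with 5 features

# Clustering: group the unlabeled points into 3 clusters
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project 5 features down to 2 principal components
X2 = PCA(n_components=2).fit_transform(X)

print(labels[:10], X2.shape)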

3. Semi-supervised Machine Learning Algorithms:
• Semi-supervised learning algorithms represent a middle ground between supervised and unsupervised algorithms.
• In this type of learning, the algorithm is trained on a combination of labeled and unlabeled data.
• This combination will contain a very small amount of labeled data and a very large amount of unlabeled data.
• The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labeled data to label the rest of the unlabeled data.
• Examples: text document classification, speech analysis etc.
• One popular semi-supervised ML algorithm is the Label Propagation algorithm, sketched below.
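
A minimal scikit-learn sketch of Label Propagation, masking most labels to mimic the semi-supervised setting (the data and masking ratio are illustrative assumptions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import LabelPropagation

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hide ~90% of the labels: unlabeled points are marked with -1
y_partial = y.copy()
rng = np.random.default_rng(0)
y_partial[rng.random(len(y)) < 0.9] = -1

# Label Propagation spreads the few known labels to the unlabeled points
model = LabelPropagation().fit(X, y_partial)
print("accuracy on true labels:", model.score(X, y))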
4. Reinforcement Machine Learning Algorithms:
• Reinforcement ML employs a technique called exploration/exploitation.
• It is an iterative process: an action takes place, the consequences are observed, and the next action considers the results of the first action.
• Using this approach, the machine is trained to make specific decisions.
• It works this way: the machine is exposed to an environment where it trains itself continually using trial and error. The machine learns from past experience and tries to capture the best possible knowledge to make accurate business decisions.
• Examples: video games, self-driving cars etc. (A minimal Q-learning sketch follows the algorithm list below.)
• Most common reinforcement learning algorithms include:
 Q-Learning
 Temporal Difference (TD)
 Monte-Carlo Tree Search (MCTS)
 Asynchronous Actor-Critic Agents (A3C).
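
A minimal Q-learning sketch on a toy, hypothetical environment: states 0 to 4 on a line, where reaching state 4 yields a reward of 1 (the environment and hyperparameters are illustrative assumptions):

import numpy as np

n_states, n_actions = 5, 2  # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.3  # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != 4:
        # Exploration/exploitation: random action with probability epsilon
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else 0.0
        # Q-learning update: move Q(s, a) toward reward + discounted best future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q)  # learned action values; "move right" should dominate in every state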
