Lesson 7 Feature Engineering

The document outlines a 4-hour course on feature engineering, covering topics such as the definition of features, the importance of feature selection and extraction, and various techniques like one-hot encoding, normalization, and dealing with missing data. It also discusses visual pattern recognition features, including shape-based descriptors and the Histogram of Oriented Gradients (HOG) algorithm. The course emphasizes the need for informative and discriminative features to improve machine learning model performance.


Feature engineering

 Duration: 4 hrs
 Outline:
1. Introduction
2. Feature engineering
3. Features in visual pattern recognition
4. Shape-based feature descriptors
Introduction to feature & feature engineering
 Feature:
 an individual measurable property or characteristic of a data example
 describes the example
 Features are usually numeric.

 Feature engineering: transforming raw data into a feature vector

Data → feature vector → ML model


The general framework for Machine Learning
Curse of dimensionality
 Dimensionality: the number of features in the feature vector.

 Curse of dimensionality:
 The number of features is very large relative to the number of observations (examples) in the dataset
 Hard to train an effective model

 Dimensionality reduction
 Feature selection
 Feature extraction
Feature extraction vs. feature selection

 Feature selection:
 Filtering irrelevant or redundant features from the dataset
 Choosing a subset of the original features

 Feature extraction:
 Creating a new, smaller set of features
 Deriving useful features from existing data

 Features need to be informative, discriminating and independent (see the sketch after the comparison below)
Feature extraction vs. feature selection
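To make the contrast concrete, here is a minimal sketch using scikit-learn on its built-in Iris dataset; the choice of SelectKBest (selection) and PCA (extraction) is illustrative, not something the slides prescribe:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)  # 150 examples, 4 original features

# Feature selection: keep a subset of the ORIGINAL features
# (here: the 2 features with the highest ANOVA F-score).
X_selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Feature extraction: create NEW features from the existing ones
# (here: 2 principal components, linear combinations of all 4 features).
X_extracted = PCA(n_components=2).fit_transform(X)

print(X.shape, X_selected.shape, X_extracted.shape)  # (150, 4) (150, 2) (150, 2)
```

Both reduce the dimensionality from 4 to 2, but selection discards original columns while extraction derives new combined features.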
Feature engineering

 Duration: 4 hrs
 Outline:
1. Introduction
2. Feature engineering
3. Features in visual pattern recognition
4. Shape-based feature descriptors
Feature engineering
 One-hot encoding

 Binning

 Normalization

 Standardization

 Dealing with missing features

 Data imputation techniques


One-hot encoding
 Transform a categorical feature into several binary features

 Example: feature “color” has 3 values: “red”, “yellow”, “green”

 A naive encoding “red” = 1, “yellow” = 2, “green” = 3 would impose an artificial order on the values

 One-hot encoding instead: “red” = [1, 0, 0], “yellow” = [0, 1, 0], “green” = [0, 0, 1] (see the sketch below)
Binning (bucketing)
 Transform a numerical feature into a categorical feature

 Example: feature “age” (sketched in code below)
 Put all ages between 0 and 5 years old into one bin
 Put ages from 6 to 10 years old in the second bin
 Put ages from 11 to 15 years old in the third bin, and so on.
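A minimal pandas sketch of the age bins above (bin edges follow the slide's example; the sample ages are made up):

```python
import pandas as pd

ages = pd.Series([2, 7, 13, 4, 11])

# Bins [0, 5], [6, 10], [11, 15] from the example above.
bins = [0, 5, 10, 15]
labels = ["0-5", "6-10", "11-15"]

age_binned = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)
print(age_binned.tolist())  # ['0-5', '6-10', '11-15', '0-5', '11-15']
```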


Normalization

 Converting an actual range of values of a numerical feature into a standard range of values, typically the interval [-1, 1] or [0, 1]

 Formula: x̄ = (x − min) / (max − min)

 Example: natural range = [350, 1450]
 Subtracting 350 from every value of the feature
 Dividing the result by 1100 → normalized range = [0, 1]
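The example above as a minimal NumPy sketch (the sample values are made up to span the natural range [350, 1450]):

```python
import numpy as np

x = np.array([350.0, 700.0, 1450.0])  # natural range [350, 1450]

# Min-max normalization: subtract the minimum, divide by the range.
x_norm = (x - x.min()) / (x.max() - x.min())
print(x_norm)  # [0.         0.31818182 1.        ]
```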


Standardization

 Rescaling the feature values so that they have the properties of a standard normal distribution with μ = 0 and σ = 1

 Formula: x̂ = (x − μ) / σ, where μ is the mean and σ the standard deviation of the feature
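A minimal NumPy sketch of the formula (the sample values are illustrative):

```python
import numpy as np

x = np.array([350.0, 700.0, 1450.0])

# z-score: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()
print(z.mean().round(10), z.std().round(10))  # ~0.0 and 1.0
```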
Standardization or normalization?
 Try both if you have time :)

 Rules of thumb:
Dealing with missing features

 Removing the examples with missing features

 Using a data imputation technique

Data imputation techniques

 Technique 1: Replacing the missing value of a feature with the average value of this feature in the dataset

 Technique 2: Replacing the missing value with a value outside the normal range of values (e.g., −1 if the normal range is [0, 1])

 Technique 3: Replacing the missing value with a value in the middle of the range

 …etc… (the first three are sketched below)
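A minimal pandas sketch of the three techniques (the toy series and the out-of-range sentinel −1 are illustrative):

```python
import numpy as np
import pandas as pd

x = pd.Series([1.0, np.nan, 3.0, 4.0])

# Technique 1: replace the missing value with the feature's mean.
filled_mean = x.fillna(x.mean())                    # NaN -> 2.666...

# Technique 2: replace it with a value outside the normal range.
filled_outside = x.fillna(-1.0)                     # NaN -> -1.0

# Technique 3: replace it with the middle of the range.
filled_middle = x.fillna((x.min() + x.max()) / 2)   # NaN -> 2.5
```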
Feature engineering

 Duration: 4 hrs
 Outline:
1. Introduction
2. Feature engineering
3. Features in visual pattern recognition
4. Shape-based feature descriptors
Image feature extraction
 Purpose:
 To reduce the dimensionality of the input image
 To transform each input image into a corresponding multi-dimensional feature vector
 To perform the predefined classification tasks with sufficient accuracy without using the entire input image

 Requirements:
 Features should extract the most suitable characteristics from the input image
An example of feature extraction

Visual features

 Color-based features
Visual features
 Shape-based features
Visual features
 Texture-based features
Which feature is the best?

 Example: plant recognition

 Plant features: leaf, fruit, flower, root, branch,…

 Leaf features: shape, vein, margin, texture


 No single best feature for a given leaf identity → combine different features

 No single best representation for a given feature → use multiple descriptors to characterize the feature from different perspectives

 Hand-crafting features is challenging → deep learning, which learns features from data, is an innovative alternative
Feature engineering

 Duration: 4 hrs
 Outline:
1. Introduction
2. Feature engineering
3. Features in visual pattern recognition
4. Shape-based feature descriptors
Shape-based feature descriptor

 Shape is an important visual cue

 A good shape descriptor is invariant to geometrical transformations (rotation, reflection, scaling, translation)

 Types of shape descriptors: simple and morphological shape descriptors (SMSD), contour-based, region-based
Simple and morphological shape descriptor

 Refers to basic geometric properties of the shape

 Basic descriptors: diameter, major axis length, minor axis length, area, perimeter, centroid,…

 Morphological descriptors: aspect ratio, perimeter-to-area ratio, rectangularity measures, circularity measures,…
Contour-based feature descriptor

 Considers the boundary of a shape and neglects the information contained in the shape interior

 Ex: CCD (centroid contour distance), Fourier descriptors computed on the CCD (a sketch of CCD follows below)
Contour-based feature descriptor
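A minimal NumPy sketch of CCD, assuming the shape boundary is already available as an (N, 2) array of points; the uniform resampling and max-normalization are common choices, not mandated by the slides:

```python
import numpy as np

def centroid_contour_distance(contour, n_samples=64):
    """CCD: distances from the shape centroid to sampled boundary points."""
    centroid = contour.mean(axis=0)
    # Resample the boundary uniformly so every shape yields a fixed-length vector.
    idx = np.linspace(0, len(contour) - 1, n_samples).astype(int)
    dist = np.linalg.norm(contour[idx] - centroid, axis=1)
    return dist / dist.max()  # divide by the max distance for scale invariance

# Toy contour: points on a circle -> CCD is (almost) constant.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
print(centroid_contour_distance(circle)[:4])  # ~[1. 1. 1. 1.]
```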
Region-based feature descriptor

 Takes all the pixels within a shape region into account to obtain the shape representation

 Image moments: statistical descriptors of a shape. Ex: Hu moments

 Local features: select key points in the image. Ex: HOG (histogram of oriented gradients), SIFT (scale-invariant feature transform)
Histogram of Oriented Gradients (HOG) ALGORITHM

• HOG stands for histogram of oriented gradients.
• The HOG descriptor focuses on the structure or shape of the object.
• It uses both the magnitude and the direction of the gradient to compute the features.
• It builds histograms from the magnitude and direction of the gradient.
HOG ALGORITHM

• Consider a pixel whose left/right neighbors have intensities 40 and 70, and whose upper/lower neighbors have intensities 20 and 70.
• Gradient in the X direction = |40 − 70| = 30
• Gradient in the Y direction = |20 − 70| = 50
• From these values we calculate the magnitude and direction of the gradient
• Using magnitude and direction over all pixels, we compute the feature vectors (see the sketch below)
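The slide's numbers as a minimal sketch (neighbor intensities 40/70 horizontally and 20/70 vertically, as read off the figure):

```python
import math

# Neighboring pixel intensities from the slide's example.
gx = abs(40 - 70)  # gradient along x = 30
gy = abs(20 - 70)  # gradient along y = 50

magnitude = math.hypot(gx, gy)                 # sqrt(30^2 + 50^2) ~ 58.31
direction = math.degrees(math.atan2(gy, gx))   # ~ 59.04 degrees
print(magnitude, direction)
```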
HOG ALGORITHM
HOG ALGORITHM

• Before obtaining the final HOG feature, after concatenating the feature vectors, we are supposed to normalize them.
• Suppose we take a 150×300-pixel image and multiply it by 2 to increase the brightness, or divide it by 2 to decrease the brightness; then you can't compare the two images without normalization, because the pixel intensities have changed.
• But if you normalize the feature vectors, they are easy to compare.
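A minimal end-to-end sketch using scikit-image's hog function on a built-in test image; the cell and block sizes are the library's conventional values, and "L2-Hys" is the block normalization discussed above:

```python
from skimage import data
from skimage.feature import hog

image = data.camera()  # built-in grayscale test image

features = hog(
    image,
    orientations=9,          # 9 direction bins per histogram
    pixels_per_cell=(8, 8),  # gradients pooled over 8x8 cells
    cells_per_block=(2, 2),  # blocks of cells normalized together
    block_norm="L2-Hys",     # robust to brightness/contrast changes
)
print(features.shape)  # one long, normalized feature vector
```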
HOG ALGORITHM
HOG ALGORITHM

• For detection, HOG features are computed for a human template, and the input image is scanned (convolved) with this human model.
• The detector then predicts whether each region contains a human or not.
Image moments
Hu moments feature descriptor

Centroid of the image, where s(x, y) is the pixel intensity at (x, y):
x̄ = (Σx Σy x·s(x, y)) / (Σx Σy s(x, y)),  ȳ = (Σx Σy y·s(x, y)) / (Σx Σy s(x, y))

Central moments:
μpq = Σx Σy (x − x̄)^p (y − ȳ)^q s(x, y)

Central normalized moments:
ηpq = μpq / μ00^(1 + (p+q)/2)

The seven Hu moments are algebraic combinations of the ηpq that are invariant to translation, scale and rotation.
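A minimal OpenCV sketch computing the seven Hu moments of a toy binary shape (the filled rectangle is illustrative):

```python
import cv2
import numpy as np

# Toy binary shape: a filled rectangle on a black background.
img = np.zeros((100, 100), dtype=np.uint8)
cv2.rectangle(img, (20, 30), (80, 70), 255, thickness=-1)

moments = cv2.moments(img)             # raw, central and normalized moments
hu = cv2.HuMoments(moments).flatten()  # the 7 Hu invariants

# Log-scale the values, since they span many orders of magnitude.
hu_log = -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
print(hu_log)
```

Translating, scaling or rotating the rectangle leaves the Hu values (nearly) unchanged.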
Hu moments feature (cont)
[Table: the seven Hu moment values S1–S7 computed for several sample images]
HOG

https://www.youtube.com/watch?v=XmO0CSsKg88&t=41s
