04_data-normalization-in-python.en
Data normalization is an important technique to
understand in data pre-processing. When we take a look at the used car data set, we
notice that the feature length ranges from 150-250, while the features width
and height range from 50-100. We may want to normalize these variables so that the
range of the values is consistent. This normalization can make some statistical
analyses easier down the road. By making the ranges consistent between variables,
normalization enables a fair comparison between the different features, making sure
they have the same impact. It is also important for computational reasons. Here is
another example that will help you understand why normalization is important.
Consider a data set containing two features, age and income, where age ranges from
0-100, while income ranges from about 20,000-500,000. Income values are roughly
1,000 times larger than age values, so these two features are in very
different ranges. When we do further analysis, like linear regression for example,
the attribute income will intrinsically influence the result more due to its larger
value. But this doesn't necessarily mean it is more important as a predictor. So,
the nature of the data biases the linear regression model to weigh income more
heavily than age. To avoid this, we can normalize these two variables into values
that range from zero to one. Compare the two tables at the right. After
normalization, both variables now have a similar influence on the models we will
build later. There are several ways to normalize data; I will just outline three
techniques. The first method, called simple feature scaling, divides each value
by the maximum value of that feature. This makes the new values range between zero
and one. The second method, called min-max, takes each value X_old, subtracts the
minimum value of that feature from it, then divides by the range of that feature.
Again, the resulting new values range between zero and one. The third method is
called z-score or standard score. In this formula, for each value, you subtract
mu, which is the average of the feature, and then divide by the standard deviation,
sigma. The resulting values hover around zero and typically range between negative
three and positive three, but can be higher or lower. Following our earlier example,
we can apply the normalization methods to the length feature. First, we use the
simple feature scaling method, where we divide each value by the maximum value of
the feature. Using the pandas method max, this can be done in just one line of code.
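As a sketch, this is what that one line looks like (the sample values here are hypothetical; in the course, df would be the used car DataFrame with its real "length" column):

```python
import pandas as pd

# Hypothetical sample of the "length" column from the used car data set
df = pd.DataFrame({"length": [168.8, 171.2, 176.6, 192.7, 197.0]})

# Simple feature scaling: divide each value by the column maximum,
# so the new values range between 0 and 1
df["length"] = df["length"] / df["length"].max()
```

After this line runs, the largest length becomes exactly 1 and every other value is a fraction of it.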
Here's the min-max method on the length feature. We subtract the minimum of that
column from each value, then divide by the range of the column: the max minus
the min. Finally, we apply the z-score method to the length feature to normalize
the values. Here we apply the mean and std methods to the length feature. The mean
method returns the average value of the feature in the data set, and the std
method returns the standard deviation of the feature in the data set.
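Both transformations can be sketched in pandas as follows (again using a hypothetical sample of the "length" column rather than the actual used car data):

```python
import pandas as pd

# Hypothetical sample of the "length" column
df = pd.DataFrame({"length": [168.8, 171.2, 176.6, 192.7, 197.0]})

# Min-max normalization: (x - min) / (max - min), results range from 0 to 1
df["length_minmax"] = (df["length"] - df["length"].min()) / (
    df["length"].max() - df["length"].min()
)

# Z-score (standard score): (x - mean) / std, results hover around 0
df["length_z"] = (df["length"] - df["length"].mean()) / df["length"].std()
```

Note that pandas' std method uses the sample standard deviation by default; either convention centers the z-scores around zero.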