Data Normalization

The document discusses data normalization techniques in data mining. It explains that normalization transforms data values into a common scale to improve machine learning algorithm performance by making the data less sensitive to differences in feature scales. Common normalization methods include min-max normalization, z-score normalization, decimal scaling, and logarithmic/root transformations. Normalization can improve accuracy but may also result in information loss or make outliers harder to detect. Overall, normalization is useful for data preprocessing but its costs and benefits must be considered for each application.

Data Normalization in Data Mining


INTRODUCTION:

Data normalization is a technique used in data mining to transform the values of a dataset into a common scale. This is important because many machine learning algorithms are sensitive to the scale of the input features and can produce better results when the data is normalized.

There are several different normalization techniques that can be used in data mining, including:

1. Min-Max normalization: This technique scales the values of a feature to a range between 0 and 1. This is done by subtracting the minimum value of the feature from each value and then dividing by the range of the feature.
2. Z-score normalization: This technique scales the values of a feature to have a mean of 0 and a standard deviation of 1. This is done by subtracting the mean of the feature from each value and then dividing by the standard deviation.
3. Decimal scaling: This technique scales the values of a feature by dividing them by a power of 10.
4. Logarithmic transformation: This technique applies a logarithmic transformation to the values of a feature. This can be useful for data with a wide range of values, as it can help to reduce the impact of outliers.
5. Root transformation: This technique applies a square root transformation to the values of a feature. Like the logarithmic transformation, it can be useful for data with a wide range of values, as it can help to reduce the impact of outliers (see the sketch after this list).

It is important to note that normalization should be applied only to the input features, not the target variable, and that different normalization techniques may work better for different types of data and models.
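As a quick illustration of items 4 and 5, here is a minimal Python sketch of the logarithmic and square-root transformations. The sample array, and the use of log1p (log(1 + v)) to keep zero values finite, are assumptions made for this example rather than part of the original article.

```python
# Minimal sketch of logarithmic and root transformations (items 4 and 5).
# Assumes non-negative feature values; log1p is used so zeros stay finite.
import numpy as np

values = np.array([0.0, 1.0, 10.0, 100.0, 1000.0, 10000.0])  # wide-range feature

log_transformed = np.log1p(values)    # log(1 + v) compresses large values
root_transformed = np.sqrt(values)    # square root gives a milder compression

print(log_transformed)
print(root_transformed)
```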

In summary, normalization is an important step in data mining, as it can help to improve the performance of machine learning algorithms by scaling the input features to a common scale. This can help to reduce the impact of outliers and improve the accuracy of the model.

Normalization is used to scale the data of an attribute so that it falls in a smaller range, such as -1.0 to 1.0 or 0.0 to 1.0. It is generally useful for classification algorithms.

Need for Normalization –

Normalization is generally required when we are dealing with attributes on different scales; otherwise, an equally important attribute (on a lower scale) may be diluted in effectiveness because another attribute has values on a larger scale. In simple words, when multiple attributes have values on different scales, data mining operations may produce poor models. The attributes are therefore normalized to bring them all onto the same scale.

Methods of Data Normalization –

Decimal Scaling
Min-Max Normalization
z-Score Normalization (zero-mean Normalization)

Decimal Scaling Method For Normalization –

It normalizes the data by moving the decimal point of its values. To normalize the data by this technique, we divide each value by 10^j, where j is chosen from the maximum absolute value of the data. A data value vi is normalized to vi' using the formula below:

vi' = vi / 10^j

where j is the smallest integer such that max(|vi'|) < 1.

Example –

Let the input data be: -10, 201, 301, -401, 501, 601, 701. To normalize the above data:

Step 1: The maximum absolute value in the given data is 701.
Step 2: Divide the given data by 1000 (i.e., j = 3).

Result: The normalized data is: -0.01, 0.201, 0.301, -0.401, 0.501, 0.601, 0.701
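A minimal Python sketch of decimal scaling that reproduces this example; the helper name decimal_scale and the way j is computed from the maximum absolute value are illustrative choices, not part of the original article.

```python
# Minimal sketch of decimal scaling: divide every value by 10^j, where j is
# the smallest integer that makes max(|vi'|) < 1.
import math

def decimal_scale(values):
    max_abs = max(abs(v) for v in values)
    j = math.floor(math.log10(max_abs)) + 1 if max_abs > 0 else 0
    return [v / (10 ** j) for v in values], j

data = [-10, 201, 301, -401, 501, 601, 701]
normalized, j = decimal_scale(data)
print(j)           # 3, so the data is divided by 1000
print(normalized)  # [-0.01, 0.201, 0.301, -0.401, 0.501, 0.601, 0.701]
```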

Min-Max Normalization –

In this technique of data normalization, a linear transformation is performed on the original data. The minimum and maximum values of the data are fetched and each value is replaced according to the following formula:

v' = ((v - Min(A)) / (Max(A) - Min(A))) * (new_max(A) - new_min(A)) + new_min(A)

where A is the attribute data, Min(A) and Max(A) are the minimum and maximum values of A respectively, v is the old value of an entry, v' is its new value, and new_min(A) and new_max(A) are the boundary values of the required target range.
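A minimal Python sketch of this formula; the function and parameter names (min_max_normalize, new_min, new_max) and the sample list are illustrative assumptions, with new_min and new_max defaulting to the common [0, 1] range.

```python
# Minimal sketch of min-max normalization following the formula above.
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    old_min, old_max = min(values), max(values)
    span = old_max - old_min
    if span == 0:                      # constant attribute: map everything to new_min
        return [new_min for _ in values]
    return [new_min + (v - old_min) * (new_max - new_min) / span for v in values]

marks = [8.0, 10.0, 15.0, 20.0]
print(min_max_normalize(marks))              # scaled to [0, 1]
print(min_max_normalize(marks, 0.0, 100.0))  # scaled to [0, 100]
```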

Z-score normalization –

In this technique, values are normalized based on the mean and standard deviation of the data A. The formula used is:

v' = (v - Ā) / σA

where v and v' are the old and new values of each entry in the data respectively, and Ā and σA are the mean and standard deviation of A respectively.
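A minimal Python sketch of z-score normalization as defined above; the use of the population standard deviation (statistics.pstdev) and the sample data are assumptions made for this example.

```python
# Minimal sketch of z-score (zero-mean) normalization: subtract the mean of
# the attribute and divide by its standard deviation.
import statistics

def z_score_normalize(values):
    mean = statistics.mean(values)     # Ā, the mean of A
    std = statistics.pstdev(values)    # σA, the (population) standard deviation of A
    return [(v - mean) / std for v in values]

data = [10.0, 20.0, 30.0, 40.0, 50.0]
print(z_score_normalize(data))  # resulting values have mean 0 and standard deviation 1
```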

ADVANTAGES AND DISADVANTAGES:

Data normalization in data mining can have a number of advantages and disadvantages.

Advantages:

1. Improved performance of machine learning algorithms: Normalization can help to improve the performance of machine learning algorithms by scaling the input features to a common scale. This can help to reduce the impact of outliers and improve the accuracy of the model.
2. Better handling of outliers: Normalization can help to reduce the impact of outliers by scaling the data to a common scale, which can make the outliers less influential.
3. Improved interpretability of results: Normalization can make it easier to interpret the results of a machine learning model, as the inputs will be on a common scale.
4. Better generalization: Normalization can help to improve the generalization of a model by reducing the impact of outliers and by making the model less sensitive to the scale of the inputs.

Disadvantages:

1. Loss of information: Normalization can result in a loss of information if the original scale of the input features is important.
2. Impact on outliers: Normalization can make it harder to detect outliers, as they will be scaled along with the rest of the data.
3. Impact on interpretability: Normalization can make it harder to interpret
the results of a machine learning model, as the inputs will be on a
common scale, which may not align with the original scale of the data.
4. Additional computational costs: Normalization can add additional
computational costs to the data mining process, as it requires additional
processing time to scale the data.
In conclusion, data normalization has both advantages and disadvantages. It can improve the performance of machine learning algorithms and make it easier to interpret the results. However, it can also result in a loss of information and make it harder to detect outliers. It is important to weigh the pros and cons of data normalization and carefully assess the risks and benefits before implementing it.
