
Preprocessing (Review)

Copyright
© All Rights Reserved

Data Preprocessing

(Review)

By Gunawansyah
• Garbage in, garbage out
• Data preprocessing is a very important step in the mining process
• Different methods can produce different results
• Preprocessing phases:
  • Data Cleaning
  • Data Integration
  • Data Transformation
  • Data Reduction
Data Preprocessing
Without quality data, mining cannot produce good-quality results, because quality
decisions must be based on quality data. Raw data should therefore pass through the
preprocessing stage before entering the next stage.

Data preparation has several stages:

1. Data Integration
Merge data from different source databases into a cube or a store such as a data
warehouse. Data from various locations and formats are merged into a new database format.
2. Data Transformation
Change/transform the data into the form most suitable for the data mining process. Data
from different locations with different formats are unified into the same format.
3. Data Reduction
Reduce the volume of data while maintaining the same analytical results.
4. Data Cleaning
Fill in missing values, smooth noisy data, and resolve inconsistencies.
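As one illustration of the data cleaning stage, the sketch below fills missing values with the mean of the observed values. Mean imputation is only one possible strategy and is an assumption here, not something the slides prescribe; the function name `fill_missing_with_mean` and the sample readings are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed (non-None) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical sensor readings with two missing entries.
readings = [10.0, None, 14.0, 12.0, None]
print(fill_missing_with_mean(readings))  # [10.0, 12.0, 14.0, 12.0, 12.0]
```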
Normalize Data:
Data normalization is a crucial process in data mining: it scales the values
of an attribute so that they fall within a certain range. This step is
very important when dealing with parameters of different units and scales,
since all parameters should have the same scale for a fair comparison between
them. One goal of data normalization is to make prediction easier by keeping
the values used from becoming too large. Here, normalized data use a scale
of 0 to 1.
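A minimal sketch of scaling an attribute to the range 0 to 1, assuming min-max normalization (the slides give the target range but do not name the formula, so this choice is an assumption). The function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```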

Denormalize data
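Denormalization maps scaled values (for example, predictions made on the 0 to 1 scale) back to the original units. The sketch below assumes the data were scaled with min-max normalization and that the original minimum and maximum were kept; the function name `denormalize` and its parameters are hypothetical.

```python
def denormalize(scaled, original_min, original_max):
    """Invert min-max scaling: map values in [0, 1] back to the original range."""
    span = original_max - original_min
    return [s * span + original_min for s in scaled]


print(denormalize([0.0, 0.5, 1.0], 10, 30))  # [10.0, 20.0, 30.0]
```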
Smoothing Data
One of the data preprocessing phases in data mining is data
smoothing: creating an approximating function that attempts
to capture important patterns in the data while leaving out
noise or other fine-scale structures and rapid phenomena.
When properly applied, this technique reveals the underlying
trend, seasonal, and cyclic components more clearly.

Averaging and exponential smoothing are two distinct groups
of smoothing methods.
Examples of data smoothing methods:
1. Moving average
• Single moving average
• Weighted moving average
2. Exponential smoothing (ES)
• Single exponential smoothing (SES)
• Double exponential smoothing (DES)
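Three of the methods listed above can be sketched in a few lines each. The function names, the window and weight conventions (weights are applied from the oldest to the newest value in each window), and the choice to seed SES with the first observation are all assumptions made for illustration.

```python
def single_moving_average(values, window):
    """Unweighted mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def weighted_moving_average(values, weights):
    """Weighted mean of each window; weights run from oldest to newest value."""
    window = len(weights)
    if window < 1 or window > len(values):
        raise ValueError("len(weights) must be between 1 and len(values)")
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i - window + 1 : i + 1])) / total
        for i in range(window - 1, len(values))
    ]


def single_exponential_smoothing(values, alpha):
    """SES: s_t = alpha * y_t + (1 - alpha) * s_(t-1), seeded with s_0 = y_0."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


print(single_moving_average([1, 2, 3, 4, 5], 3))        # [2.0, 3.0, 4.0]
print(single_exponential_smoothing([10, 20, 30], 0.5))  # [10, 15.0, 22.5]
```

A larger `alpha` makes SES follow recent observations more closely, while a smaller one smooths more heavily.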
Accuracy
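The Mean Absolute Percentage Error (MAPE) expresses the average absolute
error as a percentage of the actual values; lower is better:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{Y_t - \hat{Y}_t}{Y_t} \right|$$

Y_t : actual value for period t
Ŷ_t : predicted value for period t
n : number of data points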
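A minimal sketch of MAPE, using the standard definition: the mean of |Y_t − Ŷ_t| / |Y_t| over all n periods, multiplied by 100. The function name `mape` and the sample numbers are hypothetical. Because the formula divides by the actual values, the sketch rejects any actual value of zero.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned as a percentage."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    n = len(actual)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))


# Errors of 10%, 10%, and 0% average to roughly 6.67%.
print(round(mape([100, 200, 400], [110, 180, 400]), 2))
```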
