Chapter Four
4.1 Introduction
This chapter describes the process of collecting data from KFC Setia City Mall and
using it to generate the quantity of products for the years 2012 to 2015. The data
mining method used in this study is the Prediction of Loan Defaulters technique
using a Bayesian Network.
Before the data can be processed in SPSS Modeler, it should be prepared in Excel
format. In this study, we used the KFC Data and saved it in .csv format. Data mining
is also easy to run on other specific formats such as xls and txt. The KFC Data is
organized into three fields: years, months and quantity. The KFC Data has 49 rows,
covering every month from 2012 to 2015, as shown in the figure above.
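
As a rough illustration of the three-field layout described above, the sketch below parses a small CSV and checks that each row has the expected fields. The column names Year, Month and Quantity and the sample values are assumptions made for illustration only; they are not the real KFC figures, and this is not how SPSS Modeler reads the file internally.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT the real KFC figures.
# The header names are assumed, not taken from the actual file.
SAMPLE_CSV = """Year,Month,Quantity
2012,January,100
2012,February,120
2013,January,110
"""

EXPECTED_FIELDS = ("Year", "Month", "Quantity")


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def check_columns(rows, expected=EXPECTED_FIELDS):
    """Return True when every row has exactly the expected fields, in order."""
    return all(tuple(row.keys()) == tuple(expected) for row in rows)


rows = load_rows(SAMPLE_CSV)
print(len(rows), check_columns(rows))
```

With the real file, the same check could be run on the contents of the saved .csv to confirm the layout before loading it into SPSS Modeler.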
Figure: KFC Data Preview
In the first step, the data needs to be input through a Var. File node. The figure
shows the data after it has been input into SPSS Modeler for processing. On the right
is a preview of the data, generated in table form. If this preview appears, the data
is suitable for the data mining method and we can proceed to the next step.
Figure: Type Node
The figure shows the second step of the data mining process. In this part, the Type
node is connected to the source node. There are three fields, representing years,
months and quantity of the product. For measurement, we selected Nominal as the data
type for years, months and quantity in the KFC data. Quantity is selected as the
target for the analysis outcome, and the remaining fields are left as inputs.
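
The field settings described above can be written down as a small lookup table. This is only a sketch of the configuration, assuming the field names Year, Month and Quantity; the dictionary layout and the helper split_roles are hypothetical and are not part of SPSS Modeler.

```python
# Field settings as described in the text: all three fields are Nominal,
# Quantity is the target and the other two fields are inputs.
# The field names and this dictionary layout are assumptions for illustration.
FIELD_TYPES = {
    "Year": {"measurement": "Nominal", "role": "Input"},
    "Month": {"measurement": "Nominal", "role": "Input"},
    "Quantity": {"measurement": "Nominal", "role": "Target"},
}


def split_roles(field_types):
    """Return (input field names, target field names) from the settings."""
    inputs = [name for name, s in field_types.items() if s["role"] == "Input"]
    targets = [name for name, s in field_types.items() if s["role"] == "Target"]
    return inputs, targets


inputs, targets = split_roles(FIELD_TYPES)
print("Inputs:", inputs)
print("Targets:", targets)
```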
Figure: Select Node
Next, in the Select Node, we key in the purpose or outcome that we want at the end
of the data mining process. In this step, we choose “Quantity” because we want to
know how many of KFC’s products were produced over the past four years, from 2012
to 2015. Therefore, SPSS Modeler will detect and count only the quantity details for
each month of each year.
Figure: Structure Type
In the prediction technique, this part must be included in the data mining process.
The data needs to be filled in for each Structure Type according to the guideline,
namely TAN, Markov and Markov-FS. These three structure types also need to be
connected to the Select Node so that the data links easily into each structure node.
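
Whatever structure type is chosen, a Bayesian network built on nominal fields relies on conditional probability tables. The sketch below shows one simple way such a table could be estimated by counting, here P(Month | Quantity) with the target as the parent of an input field. It is a minimal illustration under assumed field names and made-up category labels ("high", "low"), not the algorithm SPSS Modeler uses for TAN, Markov or Markov-FS.

```python
from collections import Counter, defaultdict


def conditional_table(records, child, parent):
    """Estimate P(child | parent) by relative frequency from nominal records."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[parent]][rec[child]] += 1
    table = {}
    for parent_value, child_counts in counts.items():
        total = sum(child_counts.values())
        table[parent_value] = {
            value: count / total for value, count in child_counts.items()
        }
    return table


# Placeholder records for illustration only; not the real KFC data.
records = [
    {"Month": "January", "Quantity": "high"},
    {"Month": "January", "Quantity": "high"},
    {"Month": "January", "Quantity": "low"},
    {"Month": "February", "Quantity": "low"},
]

table = conditional_table(records, child="Month", parent="Quantity")
for quantity_value, distribution in table.items():
    print(quantity_value, distribution)
```

Each row of the resulting table sums to 1, which is the property a conditional probability table must satisfy.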
Figure: Browsing the Model (Nugget)
The three Bayesian Network nodes need to be run to generate the models, which
produces the nuggets. A model nugget appears separately for each structure node
(TAN, Markov and Markov-FS), and each nugget should be linked to the Select Node so
that the data can be processed.
Figure: Filter Node
After everything has been connected up to Markov-FS, we click the Filter Node icon
at the bottom so that it appears in the process box. This step makes it easier for
SPSS Modeler to identify the contents of the TAN, Markov and Markov-FS data, so that
the outcome is shown correctly at the end of the data mining process. Once the node
has been identified correctly, the process must be executed by clicking the green
play button at the top before proceeding to the next step.
Figure: Outcome (Graph & Statistic)
This is the last step of the data mining process, as shown in the figure. The outcome
is generated by two processes, Analysis and Evaluation. To confirm that the data has
been input correctly from start to end, the process needs to be executed from the
point where the data was first entered. SPSS Modeler then automatically generates
the statistics of the data and the result.
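
The kind of statistic an analysis step reports can be sketched as a comparison of predicted values against actual values. The function below counts correct and wrong predictions and the percentage correct. The function name and the sample labels are hypothetical, and this is only an approximation of the idea, not the output of SPSS Modeler.

```python
def analysis_summary(actual, predicted):
    """Count correct and wrong predictions and the percentage correct."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = len(actual)
    correct = sum(a == p for a, p in zip(actual, predicted))
    wrong = total - correct
    percent_correct = 100.0 * correct / total if total else 0.0
    return {
        "correct": correct,
        "wrong": wrong,
        "total": total,
        "percent_correct": percent_correct,
    }


# Placeholder labels for illustration only; not results from the KFC data.
actual = ["a", "b", "a", "c"]
predicted = ["a", "b", "c", "c"]

summary = analysis_summary(actual, predicted)
print(summary)
```

Running the same comparison once per model (TAN, Markov and Markov-FS) would give one summary per structure type, which can then be compared side by side.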
Figure: Result of KFC Data (Quantity)
The figure above shows the graph of quantity. Each line represents one structure
type: TAN, Markov or Markov-FS. On the right is the analysis result of the data for
the output field, quantity.