Department of Computer Engineering: Experiment No.3
Theory
WEKA is open-source software that provides tools for data
preprocessing, implementations of several machine learning
algorithms, and visualization tools, so that you can develop machine
learning techniques and apply them to real-world data mining
problems. The following diagram summarizes what WEKA offers:
Following the flow of the image from its beginning, you can see
that there are several stages in preparing Big Data so that it is
suitable for machine learning:
First, you start with the raw data collected from the field. This
data may contain null values and irrelevant fields. You use the data
preprocessing tools provided in WEKA to cleanse the data. You then
save the preprocessed data to local storage so that ML algorithms
can be applied to it.
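The cleansing step described above can be sketched outside WEKA in a few lines of plain Python. This is only an illustration under stated assumptions, not WEKA's own code: the sample data, the column names, and the helper names `cleanse` and `to_arff` are hypothetical. Missing values are marked with `?` (the ARFF convention) and are filled with the column mean for numeric fields or the most frequent value for nominal fields, which mirrors the kind of replacement WEKA's ReplaceMissingValues filter performs.

```python
import csv
import io
from statistics import mean, mode

# Hypothetical raw data: "?" marks a missing value, and "record_id"
# is a field that carries no predictive information.
RAW_CSV = """record_id,age,income,outlook
1,25,50000,sunny
2,?,62000,rainy
3,40,?,sunny
4,35,58000,?
"""

IRRELEVANT = {"record_id"}   # assumed irrelevant for this illustration
NUMERIC = {"age", "income"}  # every other column is treated as nominal


def cleanse(raw_text):
    """Drop irrelevant columns and fill each gap with the column mean or mode."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    columns = [c for c in rows[0] if c not in IRRELEVANT]
    fill = {}
    for col in columns:
        present = [r[col] for r in rows if r[col] != "?"]
        if col in NUMERIC:
            fill[col] = mean(float(v) for v in present)
        else:
            fill[col] = mode(present)
    cleaned = []
    for r in rows:
        out = {}
        for col in columns:
            value = fill[col] if r[col] == "?" else r[col]
            out[col] = float(value) if col in NUMERIC else value
        cleaned.append(out)
    return columns, cleaned


def to_arff(relation, columns, rows):
    """Render the cleaned rows as ARFF text, the file format WEKA reads."""
    lines = [f"@relation {relation}", ""]
    for col in columns:
        if col in NUMERIC:
            lines.append(f"@attribute {col} numeric")
        else:
            labels = ",".join(sorted({r[col] for r in rows}))
            lines.append(f"@attribute {col} {{{labels}}}")
    lines += ["", "@data"]
    for r in rows:
        lines.append(",".join(str(r[col]) for col in columns))
    return "\n".join(lines) + "\n"


columns, rows = cleanse(RAW_CSV)
print(to_arff("cleaned_sample", columns, rows))
```

Writing the returned text to a `.arff` file on local storage would give a file that can be opened from the Preprocess tab of the WEKA Explorer.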
Once a model has been run on the preprocessed data, WEKA gives you
the statistical output of the model processing. It also provides a
visualization tool for inspecting the data. Several models can be
applied to the same dataset. You can then compare their outputs and
select the one that best meets your purpose.
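The idea of applying several models to the same dataset and keeping the best one can be sketched as follows. This is a minimal plain-Python illustration, not WEKA's implementation: the toy data and all function names are hypothetical, and accuracy on a held-out test set is assumed as the comparison criterion. The two models are a majority-class baseline (comparable in spirit to WEKA's ZeroR) and a 1-nearest-neighbour classifier (comparable in spirit to WEKA's IBk with k = 1).

```python
from collections import Counter

# Hypothetical toy data: (feature vector, class label).
TRAIN = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((3.0, 3.2), "b"), ((3.1, 2.9), "b")]
TEST = [((1.1, 0.9), "a"), ((2.9, 3.0), "b"),
        ((3.2, 3.1), "b"), ((0.8, 1.0), "a")]


def fit_majority(train):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_neighbour(train):
    """1-nearest-neighbour using squared Euclidean distance."""
    def predict(x):
        nearest = min(
            train,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )
        return nearest[1]
    return predict


def accuracy(model, test):
    """Fraction of test instances whose label the model predicts correctly."""
    hits = sum(1 for x, y in test if model(x) == y)
    return hits / len(test)


def compare(models, train, test):
    """Train every model on the same data; return all scores and the best name."""
    scores = {name: accuracy(fit(train), test) for name, fit in models.items()}
    best = max(scores, key=scores.get)
    return scores, best


MODELS = {"majority": fit_majority, "nearest_neighbour": fit_nearest_neighbour}
scores, best = compare(MODELS, TRAIN, TEST)
print(scores)
print("best model:", best)
```

On this toy data the baseline gets half of the test instances right and the nearest-neighbour model gets all of them, so the latter is selected. In practice the criterion would be chosen to suit the purpose, for example error rate, precision, or recall.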
Conclusion
Explored the preprocessing features of the WEKA tool using the
default datasets available in WEKA.