Learning to Use Weka

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be
applied directly to a dataset (using the GUI) or called from your own Java code (using the Weka Java library).
Tools (or functions) in Weka include:

Data preprocessing (e.g., Data Filters),
Classification (e.g., BayesNet, KNN, C4.5 Decision Tree, Neural Networks, SVM),
Regression (e.g., Linear Regression, Isotonic Regression, SVM for Regression),
Clustering (e.g., Simple K-means, Expectation Maximization (EM)),
Association rules (e.g., Apriori Algorithm, Predictive Accuracy, Confirmation Guided),
Feature Selection (e.g., Cfs Subset Evaluation, Information Gain, Chi-squared Statistic), and
Visualization (e.g., View different two-dimensional plots of the data).

The Weka GUI Chooser (class weka.gui.GUIChooser) provides a starting point for launching Weka's
main GUI applications and supporting tools. If you prefer an MDI (multiple document interface)
appearance, this is provided by an alternative launcher called Main (class weka.gui.Main).
The GUI Chooser consists of four buttons, one for each of the four major Weka applications, and four
menus. The buttons can be used to start the following applications:

Explorer: An environment for exploring data with WEKA (the rest of this documentation deals
with this application in more detail).
Experimenter: An environment for performing experiments and conducting statistical tests
between learning schemes.
Knowledge Flow: This environment supports essentially the same functions as the Explorer but
with a drag-and-drop interface. One advantage is that it supports incremental learning.
Simple CLI: Provides a simple command-line interface that allows direct execution of WEKA
commands for operating systems that do not provide their own command line interface.

Preparing the data


The data is often presented in a spreadsheet or database. However, Weka's native data storage
method is the ARFF format. You can easily convert from a spreadsheet to ARFF. The bulk of an ARFF file
consists of a list of the instances, and the attribute values for each instance are separated by commas.
Most spreadsheet and database programs allow you to export data into a file in comma-separated
value (CSV) format as a list of records with commas between items. Having done this, you need only
load the file into a text editor or word processor; add the dataset's name using the @relation tag, the
attribute information using @attribute, and a @data line; and save the file as raw text. For example,
the figure below shows an Excel spreadsheet containing the weather data, the data in CSV form loaded
into Microsoft Word, and the result of converting it manually into an ARFF file. However, you don't
actually have to go through these steps to create the ARFF file yourself, because the Explorer can
read CSV spreadsheet files directly.
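
The manual conversion described above can also be sketched in a few lines of code. The following is a minimal illustration in Python, not part of Weka itself: the function names `csv_to_arff` and `is_number`, the relation name, and the four sample rows are assumptions made for this example. It treats a column as numeric when every value parses as a number and as nominal otherwise, and it does not handle missing values or attribute names that need quoting.

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text (first row = column names) into ARFF text."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, records = rows[0], rows[1:]

    # Header section: dataset name, then one @attribute line per column.
    lines = ["@relation " + relation, ""]
    for i, name in enumerate(header):
        column = [rec[i] for rec in records]
        if all(is_number(v) for v in column):
            lines.append("@attribute " + name + " numeric")
        else:
            # Nominal attribute: distinct values in order of first appearance.
            values = list(dict.fromkeys(column))
            lines.append("@attribute " + name + " {" + ", ".join(values) + "}")

    # Data section: the CSV records are copied unchanged.
    lines += ["", "@data"]
    lines += [",".join(rec) for rec in records]
    return "\n".join(lines) + "\n"


# A few rows in the style of the weather data (illustrative only).
sample_csv = """outlook,temperature,humidity,windy,play
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
"""

print(csv_to_arff(sample_csv, "weather"))
```

The printed text has the three parts named in the paragraph above: an @relation line, one @attribute line per column, and a @data line followed by the comma-separated instances.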

HTS_LearningWeka_KnowYourData

9/17/2015 11:29:10 AM

Page 1 of 5

Weather data: (a) spreadsheet, (b) CSV format, (c) ARFF numeric, and (d) ARFF nominal.

Getting to Know your Data

WEKA provides 23 sample datasets (.arff files) in the directory C:\Program Files (x86)\Weka-3-6\data\. Please
explore these datasets using the Arff Viewer in Weka's Tools menu.
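
As a complement to the Arff Viewer, the key facts about a dataset (its description, relation name, attributes, and number of instances) can be read directly from the ARFF text. The sketch below is a hedged illustration in Python rather than anything provided by Weka: `summarize_arff` and `sample_arff` are hypothetical names, the sample is a short excerpt in the style of the weather data, and the parser assumes unquoted attribute names and a non-sparse @data section. It relies on the fact that ARFF comment lines begin with %.

```python
def summarize_arff(arff_text):
    """Collect description comments, relation name, attributes, and instance count."""
    summary = {"description": [], "relation": None, "attributes": [], "instances": 0}
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("%"):
            # Comments before @relation usually hold the dataset description.
            text = line.lstrip("% ")
            if summary["relation"] is None and text:
                summary["description"].append(text)
            continue
        if in_data:
            # Every remaining non-comment line is one instance.
            summary["instances"] += 1
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            summary["relation"] = line.split(None, 1)[1]
        elif keyword == "@attribute":
            # Assumes an unquoted attribute name followed by its type.
            _, name, kind = line.split(None, 2)
            summary["attributes"].append((name, kind))
        elif keyword == "@data":
            in_data = True
    return summary


# Short illustrative excerpt, not a complete Weka sample file.
sample_arff = """% Small excerpt in the style of the weather data
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
"""

info = summarize_arff(sample_arff)
print(info["relation"], "-", info["instances"], "instances")
for name, kind in info["attributes"]:
    print(" ", name, ":", kind)
```

To try it on one of the sample files, read the file's contents into a string and pass that string to `summarize_arff` in place of `sample_arff`.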

The Arff Viewer shows the whole credit-g dataset in tabular form, with the type of each attribute displayed in
its column header. You can also view the sample data in a text editor such as Notepad or WordPad. Some datasets
include a description before the relation name. Weka can also visualize data from the Visualization
menu, where you can choose a visualization format to show the dispersion of the data. For example, the
figure below visualizes the credit-g sample data. You can change which data attributes are plotted on the
horizontal and vertical axes of the graph.
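
The dispersion that a plot shows visually can also be summarized numerically. The following Python sketch is an illustration only and does not use Weka: the helper name `dispersion` and the six (temperature, play) pairs, chosen in the style of the weather data, are assumptions for this example. It groups a numeric attribute by class label and reports the minimum, maximum, mean, and sample standard deviation for each group.

```python
import statistics


def dispersion(values):
    """Return simple spread statistics for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# (temperature, play) pairs; illustrative values only.
rows = [(85, "no"), (80, "no"), (83, "yes"), (70, "yes"), (68, "yes"), (65, "no")]

# Group the numeric attribute by class label.
by_class = {}
for temperature, play in rows:
    by_class.setdefault(play, []).append(temperature)

for label, temps in by_class.items():
    print(label, dispersion(temps))
```

Comparing the per-class ranges and standard deviations gives a rough numeric counterpart to what a two-dimensional plot colored by class shows.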

Task 1: Explore the 23 sample datasets (.arff files) using the Arff Viewer in Weka's Tools menu. Read the
description of each dataset (attributes, number of tuples, etc.).
Task 2: Open a sample dataset in Weka and visualize it, then examine the dispersion of the data. Try
different visualization parameters according to the data attributes.
Task 3: Explore the Preprocess, Classify, Cluster, Associate, Select Attributes, and Visualize tabs of the Weka
Explorer. Remember that you must choose the data you want to mine, and a filter method, before starting the
data mining process.
Task 4: Read WekaManual.pdf to explore and understand Weka. This file can be found in the
directory C:\Program Files (x86)\Weka-3-6\
