
Weka Tool

This document provides step-by-step instructions for using the WEKA tool to perform machine learning tasks on a sample Iris flower dataset. It describes loading the Iris dataset into WEKA, which contains 150 records of 3 types of iris flowers described by 4 attributes. It then explains how to use the Explorer interface in WEKA to preprocess, classify, cluster, select attributes and visualize the data to predict flower types using various machine learning algorithms.

Name :- Purushottam Kumar

Roll :- 2020178043
MCA 2nd Sem (Regular)
Submission Date : 13-June-2021
 Weka is open-source software that provides a collection of machine
learning algorithms for data mining tasks. The algorithms can either be applied directly
to a dataset or called from our own Java code. Weka contains tools for data
pre-processing, classification, regression, clustering, association rules, and visualization,
so that we can develop machine learning techniques and apply them to real-world data
mining problems.
Step 1 : After installing and starting the WEKA tool, the Weka GUI Chooser window is visible.

Step 2 : As beginners, we will initially use the Explorer option because it
provides all the important data mining & machine learning algorithms.
Step 3 : Selecting a dataset :
In order to solve a real-world problem we need existing data. Here I am using a predefined
dataset named Fisher’s Iris dataset. This dataset contains information about 3
species of iris flowers.

 The 3 species of flowers are : Setosa, Versicolor, Virginica.


 This dataset has 50 samples of each species. It means this dataset has a
total of 3 x 50 = 150 records.
 Every flower species can be identified from two of its parts, the Sepal & the Petal.
 Sepal & Petal each have a length & a width attribute, on the basis of which we will predict
the species.

 So the experience available for learning is this dataset of 150 records.
 Here performance is measured by how correctly I can predict the output
(species name) after providing the input data (length & width of Sepal and Petal).
 It is a classification problem, not a regression problem, because here we are predicting
one species out of 3.
 It is also supervised, because the dataset already has 150 labelled records on the basis of
which I can predict the flower species name.
 This file is in Excel format (.xlsx).
 A default Weka installation does not read .xlsx files. It does read other formats, including :-
o .arff => [ Attribute-Relation File Format ]
o .csv => [ Comma-separated values ]
 We can easily convert our Excel (.xlsx) file to a csv file.
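To see how the two formats relate, here is a minimal sketch that turns CSV text into ARFF text using only the Python standard library. It is not Weka's own converter. The column names, the relation name and the helper `csv_to_arff` are my assumptions for illustration, and the three data rows are just a small sample of Iris records.

```python
import csv
import io

# Assumed column names; three sample rows only, not the full 150 records.
CSV_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Setosa
7.0,3.2,4.7,1.4,Versicolor
6.3,3.3,6.0,2.5,Virginica
"""


def csv_to_arff(csv_text, relation="iris", class_column="species"):
    """Convert CSV text to ARFF text.

    Every column except class_column is assumed to be numeric.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    class_index = header.index(class_column)

    # A nominal attribute lists its allowed values inside braces.
    labels = sorted({row[class_index] for row in data})
    nominal_spec = "{" + ",".join(labels) + "}"

    lines = ["@relation " + relation, ""]
    for name in header:
        if name == class_column:
            lines.append("@attribute " + name + " " + nominal_spec)
        else:
            lines.append("@attribute " + name + " numeric")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


arff_text = csv_to_arff(CSV_TEXT)
print(arff_text)
```

The header part (`@relation`, `@attribute`) describes the columns, and everything after `@data` is the same comma separated rows as in the csv file.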

Step 4 : Explorer tabs

On the top, we have several tabs as listed here −

1. Preprocess
2. Classify
3. Cluster
4. Associate
5. Select Attributes
6. Visualize
 Preprocess is the first stage, where we actually load the dataset file.
 The Classify tab provides several machine learning algorithms for classification
and regression on our data. We can apply algorithms such as Linear Regression, Logistic
Regression, Support Vector Machines, Decision Trees, RandomTree,
RandomForest, NaiveBayes, and so on.
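To make the idea of a classifier concrete, here is a minimal sketch of a 1-nearest-neighbour classifier in plain Python. This is my own illustration and not how any Weka algorithm is implemented. The names `TRAIN`, `TEST`, `predict` and `accuracy` are hypothetical, and the rows are only a small sample of Iris records (sepal length, sepal width, petal length, petal width).

```python
import math

# Small labelled sample used as "training" data (two rows per species).
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "Setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Virginica"),
]

# Rows held back for testing, with their true species.
TEST = [
    ((4.7, 3.2, 1.3, 0.2), "Setosa"),
    ((6.9, 3.1, 4.9, 1.5), "Versicolor"),
    ((7.1, 3.0, 5.9, 2.1), "Virginica"),
]


def predict(features, train=TRAIN):
    """Return the label of the training row closest to features."""
    nearest = min(train, key=lambda item: math.dist(item[0], features))
    return nearest[1]


def accuracy(test, train=TRAIN):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(1 for features, label in test if predict(features, train) == label)
    return correct / len(test)


predictions = [predict(features) for features, _ in TEST]
print(predictions)
print(accuracy(TEST))
```

The input is the length & width of Sepal and Petal, the output is the species name, and the accuracy is the performance measure described in Step 3.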

Step 5 : Loading dataset


 First we will click on the Open file... button to locate where the dataset file is.

 After selecting the file, click on OPEN. Now the dataset file will be loaded.
 The other tabs will also be enabled once the dataset has loaded correctly.
 After loading the dataset, the following will be displayed :-
 Relation (table / file) name
 Attributes (columns)
 Instances (total records)
Here I am removing the attribute “Instance” because it is not required during
analysis, so my screen looks like this..
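The same loading and attribute removal can be sketched outside Weka in a few lines of Python. This is only an illustration under my own assumptions: the column names (apart from “Instance”) and the helpers `load_rows`, `summarize` and `remove_attribute` are hypothetical, and only three sample rows are used.

```python
import csv
import io

# Assumed layout: an "Instance" row-number column followed by the 4 measurements and the species.
CSV_TEXT = """Instance,sepal_length,sepal_width,petal_length,petal_width,species
1,5.1,3.5,1.4,0.2,Setosa
2,7.0,3.2,4.7,1.4,Versicolor
3,6.3,3.3,6.0,2.5,Virginica
"""


def load_rows(csv_text):
    """Read CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def summarize(rows):
    """Return the attribute names and the number of instances."""
    attributes = list(rows[0].keys()) if rows else []
    return {"attributes": attributes, "instances": len(rows)}


def remove_attribute(rows, name):
    """Return new rows without the given attribute (column)."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]


rows = load_rows(CSV_TEXT)
before = summarize(rows)
after = summarize(remove_attribute(rows, "Instance"))
print(before)
print(after)
```

Removing “Instance” changes only the number of attributes. The number of instances stays the same.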

[Screenshot: Preprocess tab after loading the dataset, showing the Statistics section and the Histogram visualization]
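For a numeric attribute, the statistics section of the Preprocess tab reports values such as minimum, maximum, mean and standard deviation, and the histogram counts how many values fall in each range. Here is a minimal sketch of both calculations in Python. The helpers `describe` and `histogram` are hypothetical, the six petal lengths are only a small sample of the Iris records, and I am using the sample standard deviation as an assumption.

```python
import statistics

# Petal lengths (cm) of six sample rows, two per species.
PETAL_LENGTHS = [1.4, 1.4, 4.7, 4.5, 6.0, 5.1]


def describe(values):
    """Return min, max, mean and sample standard deviation of values."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


def histogram(values, bins):
    """Count how many values fall into each of `bins` equal-width ranges."""
    low, high = min(values), max(values)
    counts = [0] * bins
    width = (high - low) / bins
    for value in values:
        if width == 0:
            index = 0  # all values identical: everything goes in the first bin
        else:
            # The maximum value would land one past the end, so clamp it.
            index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


stats = describe(PETAL_LENGTHS)
counts = histogram(PETAL_LENGTHS, bins=2)
print(stats)
print(counts)
```

With 2 bins, the two short Setosa petals fall in the first bin and the four longer petals fall in the second, which is the kind of separation the histogram makes visible.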

------------*--------------*--------------------------------------------------*-------------------*----------

: Thanks
