
MINING


PRACTICAL LIST

S.NO. PRACTICAL NAME SIGNATURE


1. Installation of WEKA.
2. Create an Employee Table with the help of Data Mining Tool WEKA.
3. Convert excel file to arff format in weka.
4. Apply Pre-Processing techniques to the training data set of Employee Table.
5. Finding Association Rules for Employee data.
6. To Construct Decision Tree for Customer data and classify it.
7. Normalize Weather Table data using Knowledge Flow.
8. Write a procedure for cross-validation using J48 Algorithm for weather table.
9. Write a procedure for Employee data using Make Density Based Cluster Algorithm.
10. Write a procedure for Clustering Customer data using Simple KMeans Algorithm.

PRACTICAL NO.1
PRACTICAL : Installation of WEKA.
WEKA is open-source software that provides tools for data preprocessing, implementations of several machine learning
algorithms, and visualization tools, so that you can develop machine learning techniques and apply them to real-world
data mining problems.
To use it, you select an algorithm of your choice, set the desired parameters, and run it on the dataset.
WEKA then reports the statistical output of the model and provides a visualization tool to inspect the data.
To install WEKA on your machine, visit WEKA’s official website and download the installation file. WEKA supports
installation on Windows, Mac OS X and Linux; follow the instructions on the download page for your OS.

Follow the steps below to install WEKA on Windows:

Step 1: Search for the WEKA installer in any browser, or use the link below.
https://sourceforge.net/projects/weka/
Step 2: Click on Download; the download starts after a few seconds. The installer is up to 127 MB.
Step 3: After downloading, run the installer and agree to all the conditions of installation.
Step 4: WEKA is installed on the system and an icon is created on the desktop.
Step 5: Run the software and see the interface.

Note: Java is mandatory for installing the WEKA tools.

PRACTICAL NO.2
PRACTICAL : Create an Employee Table with the help of Data Mining Tool WEKA.
We need to create an Employee Table as a training data set with the attributes name, id, salary, experience,
gender, and phone number.
Steps:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set in Notepad for the Employee Table.
@relation employee
@attribute name {x, y, z, a, b}
@attribute id numeric
@attribute salary {low, medium, high}
@attribute exp numeric
@attribute gender {male, female}
@attribute phone numeric

@data
x, 101, low, 2, male, 250311
y, 102, high, 3, female, 251665
z, 103, medium, 1, male, 240238
a, 104, low, 5, female, 200200
b, 105, high, 2, male, 240240

3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the employee table in Weka.
Training Data Set → Employee Table

Result: This program has been successfully executed.
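
Before loading a hand-typed ARFF file into Weka, it can help to check that every data row has one value per attribute and that nominal values match their declared labels. The following is a minimal sketch in Python, not part of Weka; the function names parse_arff and validate are hypothetical, and the parser only handles the simple subset of ARFF used in this practical (no quoted values, no missing values, no sparse rows).

```python
EMPLOYEE_ARFF = """\
@relation employee
@attribute name {x, y, z, a, b}
@attribute id numeric
@attribute salary {low, medium, high}
@attribute exp numeric
@attribute gender {male, female}
@attribute phone numeric

@data
x, 101, low, 2, male, 250311
y, 102, high, 3, female, 251665
z, 103, medium, 1, male, 240238
a, 104, low, 5, female, 200200
b, 105, high, 2, male, 240240
"""


def parse_arff(text):
    """Parse the simple ARFF subset used here (hypothetical helper)."""
    relation = None
    attributes = []  # (name, "numeric") or (name, [nominal labels])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([v.strip() for v in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                labels = [v.strip() for v in spec.strip("{}").split(",")]
                attributes.append((name, labels))
            else:
                attributes.append((name, spec.lower()))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


def validate(attributes, rows):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row) != len(attributes):
            problems.append(f"row {number}: expected {len(attributes)} values, got {len(row)}")
            continue
        for (name, spec), value in zip(attributes, row):
            if isinstance(spec, list):
                if value not in spec:
                    problems.append(f"row {number}: '{value}' is not a label of {name}")
            elif spec == "numeric":
                try:
                    float(value)
                except ValueError:
                    problems.append(f"row {number}: '{value}' is not numeric for {name}")
    return problems


relation, attributes, rows = parse_arff(EMPLOYEE_ARFF)
print(relation, len(attributes), len(rows), validate(attributes, rows))
```

An empty problem list means the file is internally consistent for this subset; Weka itself remains the authority on whether a file loads.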

PRACTICAL NO.3
PRACTICAL : Convert excel file to arff format in weka.
Store the dataset in an Excel sheet and save the sheet as a CSV (MS-DOS) file; the easiest route is
to convert the data to CSV first and load the CSV file into Weka. ARFF is the format Weka works with
natively, so the Excel data must end up in ARFF form to be used in the tool. The conversion can be done
online, but the steps below do it offline inside Weka.

Steps:

1. Save your .xls file in .csv format.
2. Open the Weka GUI Chooser and click on the Tools button in the top menu bar.
3. Click on ArffViewer.
4. In the file chooser, select the file types to be loaded, such as *.csv or *.data.
5. Open the *.csv file to view the data and values.
6. Name the file with the .arff extension.
7. Save the file.
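
To make clear what the conversion produces, the sketch below builds ARFF text from CSV text in plain Python. It is an illustration under stated assumptions, not Weka's own converter: the function name csv_to_arff is hypothetical, a column is declared numeric only if every value parses as a number, and all other columns become nominal with labels in first-seen order. Quoting, dates, and missing values are not handled.

```python
import csv
import io


def is_number(value):
    """True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [h.strip() for h in next(reader)]
    rows = [[cell.strip() for cell in row] for row in reader if row]

    lines = ["@relation " + relation, ""]
    for index, name in enumerate(header):
        column = [row[index] for row in rows]
        if all(is_number(v) for v in column):
            lines.append("@attribute %s numeric" % name)
        else:
            labels = list(dict.fromkeys(column))  # distinct values, first-seen order
            lines.append("@attribute %s {%s}" % (name, ", ".join(labels)))
    lines += ["", "@data"]
    lines += [", ".join(row) for row in rows]
    return "\n".join(lines) + "\n"


sample = "name,id,salary\nx,101,low\ny,102,high\n"
print(csv_to_arff(sample, "employee"))
```

Comparing this output with the file saved from ArffViewer is a quick way to confirm that the attribute types came out as intended.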

PRACTICAL NO.4
PRACTICAL : Apply Pre-Processing techniques to the training data set of Employee Table.
Real-world databases are highly susceptible to noisy, missing, and inconsistent data because of their huge size, so the
data is pre-processed to improve its quality, which in turn improves the mining results and the efficiency of mining.
Three pre-processing techniques are used here:
1) Add
2) Remove
3) Normalization
Add → Pre-Processing Technique:
Procedure:
1) Start → Programs → Weka-3-4 → Weka-3-4
2) Click on explorer.
3) Click on open file.
4) Select Employee.arff file and click on open.
5) Click on the Choose button and select the filters option.
6) Under filters there are supervised and unsupervised filters.
7) Click on unsupervised.
8) Under attribute, select the Add filter.
9) A new window is opened.
10) In it, enter the attribute index, type, date format, and nominal label values for Address.
11) Click on OK.
12) Press the Apply button, then a new attribute is added to the Employee Table.
13) Save the file.
14) Click on the Edit button, it shows a new Employee Table on Weka.
Remove → Pre-Processing Technique:
Procedure:
1) Start → Programs → Weka-3-4 → Weka-3-4
2) Click on explorer.
3) Click on open file.
4) Select Employee.arff file and click on open.
5) Click on the Choose button and select the filters option.
6) Under filters there are supervised and unsupervised filters.
7) Click on unsupervised.
8) Under attribute, select the Remove filter.
9) Select the attributes salary and gender to remove.
10) Click the Remove button and then Save.
11) Click on the Edit button, it shows a new Employee Table on Weka.

Normalize → Pre-Processing Technique:
Procedure:
1) Start → Programs → Weka-3-4 → Weka-3-4
2) Click on explorer.
3) Click on open file.
4) Select Employee.arff file and click on open.
5) Click on the Choose button and select the filters option.
6) Under filters there are supervised and unsupervised filters.
7) Click on unsupervised.
8) Under attribute, select the Normalize filter.
9) Select the attributes id, exp, and phone to normalize.
10) Click on the Apply button and then Save.
11) Click on the Edit button; it shows a new Employee Table with normalized values in Weka.

Result: This program has been successfully executed.
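
With its default settings, Weka's unsupervised Normalize filter rescales each numeric attribute to the range [0, 1] using min-max scaling. The sketch below shows that arithmetic on the exp column of the Employee Table from Practical No. 2. The function name min_max_normalize is hypothetical, and returning 0.0 for a constant column is an assumption of this sketch rather than documented Weka behaviour.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption of this sketch: a constant column maps to all zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


exp = [2, 3, 1, 5, 2]  # exp column of the Employee Table
print(min_max_normalize(exp))  # [0.25, 0.5, 0.0, 1.0, 0.25]
```

The same formula applies to id and phone, although normalizing identifier-like attributes rarely carries meaning for later mining steps.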

PRACTICAL NO.5
PRACTICAL : Finding Association Rules for Employee data.
In data mining, association rule learning is a popular and well researched method for discovering interesting relations
between variables in large databases. It can be described as analyzing and presenting strong rules discovered in
databases using different measures of interestingness. In market basket analysis association rules are used and they are
also employed in many application areas including Web usage mining, intrusion detection and bioinformatics.
Creation of Employee Table:
Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set in Notepad for the Employee Table.
@relation employee-1
@attribute age {youth, middle, senior}
@attribute income {high, medium, low}
@attribute class {A, B, C}

@data
youth, high, A
youth, medium, B
youth, low, C
middle, low, C
middle, medium, C
middle, high, A
senior, low, C
senior, medium, B
senior, high, B
middle, high, B

3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the employee table in Weka.
Training Data Set → Employee Table

Procedure for Association Rules:

1) Open Start → Programs → Weka-3-4 → Weka-3-4
2) Open Explorer.
3) Click on open file and select employee-1.arff
4) Select the Associate option at the top of the window.
5) Click the Choose button and then click on the Apriori algorithm.
6) Click on the Start button; the output is displayed on the right side of the window.
OUTPUT:

Result: This program has been successfully executed.
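
Apriori reports rules together with measures such as support and confidence. To show what those numbers mean, the sketch below computes them by hand for two candidate rules over the employee-1 data. It is plain Python rather than Weka's Apriori implementation, it does no candidate generation or pruning, and the helper names count, support, and confidence are hypothetical.

```python
ATTRS = ("age", "income", "class")
ROWS = [
    ("youth", "high", "A"),
    ("youth", "medium", "B"),
    ("youth", "low", "C"),
    ("middle", "low", "C"),
    ("middle", "medium", "C"),
    ("middle", "high", "A"),
    ("senior", "low", "C"),
    ("senior", "medium", "B"),
    ("senior", "high", "B"),
    ("middle", "high", "B"),
]


def count(items):
    """Number of rows that contain every attribute=value pair in items."""
    return sum(
        all(row[ATTRS.index(attr)] == value for attr, value in items.items())
        for row in ROWS
    )


def support(items):
    """Fraction of all rows that contain the item set."""
    return count(items) / len(ROWS)


def confidence(antecedent, consequent):
    """Of the rows matching the antecedent, the fraction also matching the consequent."""
    return count({**antecedent, **consequent}) / count(antecedent)


# income=low ==> class=C holds in every row where income is low.
print(support({"income": "low"}), confidence({"income": "low"}, {"class": "C"}))
# income=high ==> class=A holds in only half of the high-income rows.
print(support({"income": "high"}), confidence({"income": "high"}, {"class": "A"}))
```

A rule such as income=low ==> class=C, with confidence 1.0, is the kind of strong rule to look for in the Associator output.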

PRACTICAL NO.6
PRACTICAL : To Construct Decision Tree for Customer data and classify it.
Creation of Customer Table:

Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set in Notepad for the Customer Table.
@relation customer
@attribute name {x, y, z, u, v, l, w, q, r, n}
@attribute age {youth, middle, senior}
@attribute income {high, medium, low}
@attribute class {A, B}

@data
x, youth, high, A
y, youth, low, B
z, middle, high, A
u, middle, low, B
v, senior, high, A
l, senior, low, B
w, youth, high, A
q, youth, low, B
r, middle, high, A
n, senior, high, A
3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the customer table in Weka.

Training Data Set → Customer Table

Procedure for Decision Trees:

1) Open Start → Programs → Weka-3-4 → Weka-3-4
2) Open Explorer.
3) Click on open file and select customer.arff
4) Select the Classify option at the top of the window.
5) Click the Choose button and click on the trees option.
6) Click on J48.
7) Click on the Start button; the output is displayed on the right side of the window.
8) Right click on the entry in the result list and select the Visualize Tree option.
9) The decision tree is displayed in a new window.

Output:

Decision Tree:

Result: This program has been successfully executed.
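
A decision tree learner chooses its split attributes by how well they separate the classes. J48 implements C4.5, which ranks attributes by gain ratio; the sketch below uses plain information gain as a simpler stand-in to show why income is the natural root for the Customer Table. The name attribute is left out because it is unique per row, and the function names are hypothetical.

```python
import math
from collections import Counter

# (age, income, class) for the ten customers; the name column is omitted.
CUSTOMERS = [
    ("youth", "high", "A"),
    ("youth", "low", "B"),
    ("middle", "high", "A"),
    ("middle", "low", "B"),
    ("senior", "high", "A"),
    ("senior", "low", "B"),
    ("youth", "high", "A"),
    ("youth", "low", "B"),
    ("middle", "high", "A"),
    ("senior", "high", "A"),
]
COLUMNS = {"age": 0, "income": 1}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Reduction in class entropy obtained by splitting rows on one column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


for name, column in COLUMNS.items():
    print(name, round(information_gain(CUSTOMERS, column), 3))
```

Income separates class A from class B completely in this data, so its gain equals the full class entropy, while age contributes almost nothing.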

PRACTICAL NO.7
PRACTICAL : Normalize Weather Table data using Knowledge Flow.
The Knowledge Flow provides an alternative to the Explorer as a graphical front end to WEKA’s algorithms.
Knowledge Flow is a work in progress, so some of the functionality from the Explorer is not yet available; on the
other hand, there are things that can be done in Knowledge Flow but not in the Explorer. Knowledge Flow presents a
dataflow interface to WEKA: the user selects WEKA components from a toolbar, places them on a layout canvas,
and connects them together to form a knowledge flow for processing and analyzing the data.
Creation of Weather Table:
Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set with the help of Notepad for Weather Table.
@relation weather
@attribute outlook {sunny,rainy,overcast}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {true,false}
@attribute play {yes,no}

@data
sunny,85.0,85.0,false,no
overcast,80.0,90.0,true,no
sunny,83.0,86.0,false,yes
rainy,70.0,86.0,false,yes
rainy,68.0,80.0,false,yes
rainy,65.0,70.0,true,no
overcast,64.0,65.0,false,yes
sunny,72.0,95.0,true,no
sunny,69.0,70.0,false,yes
rainy,75.0,80.0,false,yes
3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the Weather table in Weka.
Training Data Set → Weather Table

Procedure for Knowledge Flow:
1) Open Start → Programs → Weka-3-4 → Weka-3-4
2) Open the Knowledge Flow.
3) Select the Data Source component and add Arff Loader into the knowledge layout canvas.
4) Select the Filters component and add Attribute Selection and Normalize into the knowledge layout canvas.
5) Select the Data Sinks component and add Arff Saver into the knowledge layout canvas.
6) Right click on Arff Loader and select Configure option then the new window will be opened and select Weather.arff
7) Right click on Arff Loader and select Dataset option then establish a link between Arff Loader and Attribute
Selection.
8) Right click on Attribute Selection and select Dataset option then establish a link between Attribute
Selection and Normalize.
9) Right click on Attribute Selection and select Configure option and choose the best attribute for Weather data.
10) Right click on Normalize and select Dataset option then establish a link between Normalize and Arff Saver.
11) Right click on Arff Saver and select the Configure option; a new window opens. Set the path and enter .arff
in the Look In dialog box to save the normalized data.
12) Right click on Arff Loader and click on the Start Loading option; everything is then executed one by one.
13) Check whether the output has been created by browsing to the chosen path.
14) Rename the output file as a.arff
15) Double click on a.arff; the output opens automatically in MS-Excel.
Output:

Result: This program has been successfully executed

PRACTICAL NO.8
PRACTICAL : Write a procedure for cross-validation using J48 Algorithm for weather table.
Cross-validation, sometimes called rotation estimation, is a technique for assessing how the results of a statistical
analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one
wants to estimate how accurately a predictive model will perform in practice. One round of cross-validation involves
partitioning a sample of data into complementary subsets, performing the analysis on one subset (called the training
set), and validating the analysis on the other subset (called the validation set or testing set).
Creation of Weather Table:
Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set with the help of Notepad for Weather Table.
@relation weather
@attribute outlook {sunny, rainy, overcast}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the weather table in Weka.
Training Data Set → Weather Table

Procedure:
1) Start → Programs → Weka 3.4
2) Open Knowledge Flow.
3) Select Data Source tab & choose Arff Loader.
4) Place Arff Loader component on the layout area by clicking on that component.
5) Specify an Arff file to load by right clicking on Arff Loader icon, and then a pop-up menu will appear.
In that select Configure & browse to the location of weather.arff
6) Click on the Evaluation tab & choose Class Assigner & place it on the layout.
7) Now connect the Arff Loader to the Class Assigner by right clicking on Arff Loader, and then select
Data Set option, now a link will be established.
8) Right click on Class Assigner & choose Configure option, and then a new window will appear & specify
a class to our data.
9) Select Evaluation tab & select Cross-Validation Fold Maker & place it on the layout.
10) Now connect the Class Assigner to the Cross-Validation Fold Maker.
11) Select Classifiers tab & select J48 component & place it on the layout.
12) Now connect Cross-Validation Fold Maker to J48 twice; first choose Training Data Set option and
then Test Data Set option.
13) Select Evaluation Tab & select Classifier Performance Evaluator component & place it on the layout.
14) Connect J48 to Classifier Performance Evaluator component by right clicking on J48 & selecting
Batch Classifier.
15) Select Visualization tab & select Text Viewer component & place it on the layout.
16) Connect Classifier Performance Evaluator to Text Viewer by right clicking on Classifier Performance
Evaluator & selecting Text option.
17) Start the flow of execution by selecting Start Loading from Arff Loader.
18) For viewing result, right click on Text Viewer & select the Show Results, and then the result will be
displayed on the new window.

Output:

Result: The program has been successfully executed.
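
The Cross-Validation Fold Maker splits the data into k folds and feeds each fold to the classifier once as the test set, with the remaining folds as the training set. The sketch below shows that bookkeeping in plain Python on the play column of the weather table. It is a simplification under stated assumptions: folds are formed by taking every k-th row rather than by Weka's randomized, stratified splitting, and a majority-class predictor stands in for J48. The function names are hypothetical.

```python
from collections import Counter

# play column of the 14-row weather table, in file order.
PLAY = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) pairs; each row is tested exactly once."""
    folds = [list(range(start, n, k)) for start in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


def cross_validate(labels, k):
    """Accuracy of a majority-class predictor (stand-in for J48) under k-fold CV."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        majority = Counter(labels[i] for i in train).most_common(1)[0][0]
        correct += sum(labels[i] == majority for i in test)
    return correct / len(labels)


for train, test in k_fold_indices(len(PLAY), 7):
    print("test rows:", test, "train size:", len(train))
print("accuracy:", cross_validate(PLAY, 7))
```

Because every row is held out exactly once, the reported accuracy is computed over all 14 instances, which is also how the Classifier Performance Evaluator summarizes the folds.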

PRACTICAL NO.9
PRACTICAL : Write a procedure for Employee data using Make Density Based Cluster Algorithm
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so that the objects in
the same cluster are more similar (in some sense or another) to each other than to those in other clusters. Clustering is a
main task of explorative data mining, and a common technique for statistical data analysis used in many fields,
including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.
Creation of Employee Table:
Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set with the help of Notepad for Employee Table.
@relation employee
@attribute eid numeric
@attribute ename {raj,ramu,anil,sunil,rajiv,sunitha,kavitha,suresh,ravi,ramana,ram,kavya,navya}
@attribute salary numeric
@attribute exp numeric
@attribute address {pdtr,kdp,nlr,gtr}

@data
101,raj,10000,4,pdtr
102,ramu,15000,5,pdtr
103,anil,12000,3,kdp
104,sunil,13000,3,kdp
105,rajiv,16000,6,kdp
106,sunitha,15000,5,nlr
107,kavitha,12000,3,nlr
108,suresh,11000,5,gtr
109,ravi,12000,3,gtr
110,ramana,11000,5,gtr
111,ram,12000,3,kdp
112,kavya,13000,4,kdp
113,navya,14000,5,kdp
3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the employee table in Weka.
Training Data Set → Employee Table

Procedure:
1) Click Start → Programs → Weka 3.4
2) Click on Explorer.
3) Click on open file & then select Employee.arff file.
4) Click on the Cluster menu, which lists the different clustering algorithms.
5) Click on Choose button and then select MakeDensityBasedClusterer algorithm.
6) Click on Start button and then output will be displayed on the screen.

OUTPUT:

Result: The program has been successfully executed.
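
MakeDensityBasedClusterer wraps another clusterer and fits a distribution to each cluster it produces, so that an instance can be given a probability of belonging to each cluster rather than a single hard label. The sketch below illustrates that idea on the salary column alone. It is a rough illustration, not Weka's implementation: a fixed salary threshold stands in for the wrapped clusterer, only one numeric attribute is modelled, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev

# salary column of the 13-row Employee Table, in file order.
SALARIES = [10000, 15000, 12000, 13000, 16000, 15000, 12000,
            11000, 12000, 11000, 12000, 13000, 14000]

# Stand-in for the wrapped clusterer (an assumption): split at a fixed threshold.
LABELS = [0 if s < 13500 else 1 for s in SALARIES]


def fit_gaussians(values, labels):
    """Per cluster: (mean, standard deviation, prior = share of instances)."""
    models = {}
    for cluster in sorted(set(labels)):
        members = [v for v, l in zip(values, labels) if l == cluster]
        models[cluster] = (mean(members), pstdev(members), len(members) / len(values))
    return models


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def membership(x, models):
    """Probability of each cluster for value x (prior times density, normalized)."""
    weighted = {c: prior * normal_pdf(x, mu, sigma)
                for c, (mu, sigma, prior) in models.items()}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


models = fit_gaussians(SALARIES, LABELS)
for salary in (11000, 13500, 15000):
    print(salary, membership(salary, models))
```

Values near a cluster mean get a membership close to 1 for that cluster, while values between the two groups receive a more even split.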

PRACTICAL NO.10
PRACTICAL : Write a procedure for Clustering Customer data using Simple KMeans Algorithm.
Creation of Customer Table:
Procedure:
1) Open Start → Programs → Accessories → Notepad
2) Type the following training data set in Notepad for the Customer Table.
@relation customer
@attribute name {x,y,z,u,v,l,w,q,r,n}
@attribute age {youth,middle,senior}
@attribute income {high,medium,low}
@attribute class {A,B}

@data
x,youth,high,A
y,youth,low,B
z,middle,high,A
u,middle,low,B
v,senior,high,A
l,senior,low,B
w,youth,high,A
q,youth,low,B
r,middle,high,A
n,senior,high,A
3) Save the file with the .arff extension.
4) Minimize the ARFF file and open Start → Programs → weka-3-4.
5) Click on weka-3-4; the Weka dialog box is displayed on the screen.
6) The dialog box offers four modes; click on Explorer.
7) In Explorer, click on ‘open file’ and select the ARFF file.
8) Click on the Edit button, which shows the customer table in Weka.

Training Data Set → Customer Table

Procedure:
1) Click Start → Programs → Weka 3.4
2) Click on Explorer.
3) Click on open file & then select Customer.arff file.
4) Click on the Cluster menu, which lists the different clustering algorithms.
5) Click on Choose button and then select SimpleKMeans algorithm.
6) Click on Start button and then output will be displayed on the screen.

OUTPUT:

Result: The program has been successfully executed
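
The Customer Table contains only nominal attributes, so k-means here compares rows by how many attribute values differ and uses the most frequent value of each attribute as the cluster centre. The sketch below applies that idea with k = 2. It is a simplified illustration and not Weka's SimpleKMeans: the name column is left out, the initial centres are simply the first k rows instead of a seeded random choice, and the function names are hypothetical.

```python
from collections import Counter

# (age, income, class) for the ten customers; the name column is omitted.
CUSTOMERS = [
    ("youth", "high", "A"),
    ("youth", "low", "B"),
    ("middle", "high", "A"),
    ("middle", "low", "B"),
    ("senior", "high", "A"),
    ("senior", "low", "B"),
    ("youth", "high", "A"),
    ("youth", "low", "B"),
    ("middle", "high", "A"),
    ("senior", "high", "A"),
]


def mismatch_distance(row, centroid):
    """Number of attributes on which the row and the centroid disagree."""
    return sum(a != b for a, b in zip(row, centroid))


def mode_centroid(rows):
    """Most frequent value per attribute (ties go to the value seen first)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*rows))


def k_means_nominal(rows, k, max_iter=20):
    """Assign rows to k clusters; returns (labels, centroids)."""
    centroids = [rows[i] for i in range(k)]  # assumption: first k rows as seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: mismatch_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                centroids[c] = mode_centroid(members)
    return labels, centroids


labels, centroids = k_means_nominal(CUSTOMERS, 2)
print(labels)
print(centroids)
```

On this data the two clusters coincide with the high-income class A customers and the low-income class B customers, which is the grouping to compare against the cluster assignments in Weka's output.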
