DM Lab 1

Uploaded by

Routhu hemalatha

1. Demonstration of preprocessing on dataset student.arff

Aim: This experiment illustrates some of the basic data preprocessing operations that can be
performed using WEKA-Explorer. The sample dataset used for this example is the student
data available in arff format.
EXCEL sheet

• Create a new Excel sheet, Student.xls.

• Enter the table data and save it.

CSV (Comma Separated Values)
• Open the saved Student.xls and save it as CSV.
• This generates the file Student.csv.

Step1: Take the existing Student data set and save it as CSV (Macintosh). Open the WEKA tool and
click on Tools - ArffViewer. Open the Student file there and save it in arff format to obtain
the converted arff file.
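What the CSV to arff conversion produces can be sketched in plain Python. This is only an illustration under stated assumptions, not how ArffViewer works internally: the function name csv_to_arff, the relation name and the sample rows are hypothetical, numeric columns are guessed by trying to parse every value, and attribute names are assumed to contain no spaces or special characters.

```python
import csv
import io


def csv_to_arff(csv_text, relation="student"):
    """Convert CSV text (header row first) into ARFF text.

    Columns whose non-missing values all parse as numbers become numeric;
    every other column becomes a nominal attribute.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    lines = ["@relation " + relation, ""]
    for col, name in enumerate(header):
        # Empty cells and "?" are treated as missing values.
        values = [row[col] for row in data if row[col] not in ("", "?")]
        if values and all(is_number(v) for v in values):
            lines.append("@attribute " + name + " numeric")
        else:
            labels = sorted(set(values))
            lines.append("@attribute " + name + " {" + ",".join(labels) + "}")
    lines += ["", "@data"]
    for row in data:
        # ARFF marks a missing value with a question mark.
        lines.append(",".join(v if v != "" else "?" for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical sample rows, not the actual Student data.
sample_csv = "age,grade\n18,A\n21,B\n,A\n"
arff_text = csv_to_arff(sample_csv)
print(arff_text)
```

The output has the three parts of every arff file: the @relation line, one @attribute line per column, and the @data section holding the rows.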

Step2: Loading the data. We can load the dataset into WEKA by clicking the Open file button in
the Preprocess tab and selecting the appropriate file.
Step3: Once the data is loaded, WEKA recognizes the attributes, and during the scan of the
data it computes some basic statistics on each attribute. The left panel shows the list of
recognized attributes, while the top panel indicates the names of the base relation or table
and the current working relation (which are the same initially).
Step4: Clicking on an attribute in the left panel shows the basic statistics for that attribute.
For categorical attributes the frequency of each attribute value is shown, while for
continuous attributes we can obtain the min, max, mean, standard deviation, etc.
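The kind of summary described in Step 4 can be reproduced with a short Python sketch. The function name summarize and the sample values are hypothetical, None stands for a missing value, and the sample standard deviation is used here as an assumption; WEKA's exact formula may differ.

```python
import statistics
from collections import Counter


def summarize(values):
    """Summarize one attribute: numeric statistics or nominal value counts."""
    present = [v for v in values if v is not None]  # None marks a missing value
    if all(isinstance(v, (int, float)) for v in present):
        return {
            "min": min(present),
            "max": max(present),
            "mean": statistics.mean(present),
            "stdev": statistics.stdev(present),  # sample standard deviation
        }
    # Categorical attribute: frequency of each distinct value.
    return dict(Counter(present))


ages = [18, 20, 22, None]      # hypothetical values
grades = ["A", "B", "A", "A"]  # hypothetical values
print(summarize(ages))
print(summarize(grades))
```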

Step5: The visualization in the bottom-right panel is shown in the form of a cross-tabulation
across two attributes.
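A cross-tabulation simply counts how often each pair of values occurs together. A minimal sketch, assuming records stored as dictionaries; the attribute names grade and gender and the rows are hypothetical and not taken from the Student data:

```python
from collections import Counter


def cross_tabulate(rows, attr_a, attr_b):
    """Count how often each pair of values of two attributes occurs together."""
    return Counter((row[attr_a], row[attr_b]) for row in rows)


# Hypothetical records, not the actual Student data.
records = [
    {"grade": "A", "gender": "F"},
    {"grade": "A", "gender": "M"},
    {"grade": "B", "gender": "F"},
    {"grade": "A", "gender": "F"},
]
table = cross_tabulate(records, "grade", "gender")
for (grade, gender), count in sorted(table.items()):
    print(grade, gender, count)
```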
Dataset Student.arff file opened with Notepad:

Dataset Student.arff file opened with ArffViewer:

Step 6: Following are the operations in pre-processing the data:

1. Discretization

Sometimes association rule mining can only be performed on categorical data. This requires
performing discretization on numeric or continuous attributes.
In the following example let us discretize the age attribute.

• Let us divide the values of the age attribute into three bins (intervals).

• First load the dataset (student.arff) into WEKA.

• Select the age attribute.

• Activate the filter dialog box and select “weka.filters.unsupervised.attribute.Discretize”
from the list.

• To change the defaults for the filter, click on the box immediately to the right of the
Choose button.

• Enter the index of the attribute to be discretized. In this case the attribute is age, so
we must enter ‘1’, corresponding to the age attribute.

• Enter ‘3’ as the number of bins. Leave the remaining field values as they are.

• Click the OK button.

• Click Apply in the filter panel. This results in a new working relation with the
selected attribute partitioned into 3 bins.

• Save the new working relation in a file called student-data-discretized.arff.

The following screenshot shows the effect of discretization:
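The effect of splitting a numeric attribute into 3 bins can also be sketched in plain Python. This assumes equal-width binning with intervals closed on the right, which is how the filter is understood to behave with its default settings; the function name equal_width_bins and the ages are hypothetical.

```python
def equal_width_bins(values, bins=3):
    """Assign each numeric value to one of `bins` equal-width intervals.

    Returns (cut_points, bin_indices); each interval is closed on the right.
    """
    low, high = min(values), max(values)
    width = (high - low) / bins
    # Interior boundaries between consecutive bins.
    cut_points = [low + width * i for i in range(1, bins)]
    # A value's bin index is the number of cut points strictly below it.
    indices = [sum(1 for cut in cut_points if v > cut) for v in values]
    return cut_points, indices


ages = [18, 19, 21, 24, 27, 30]  # hypothetical ages
cuts, binned = equal_width_bins(ages, bins=3)
print(cuts)    # → [22.0, 26.0]
print(binned)  # → [0, 0, 0, 1, 2, 2]
```

With a range of 18 to 30 and 3 bins, each bin is 4 wide, so the numeric age is replaced by one of three interval labels.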

2. ReplaceWithMissingValue:
• Select the path as follows: “Choose - filters - unsupervised - attribute - ReplaceWithMissingValue”.
• On applying this filter, values in the current data are replaced at random with missing values,
according to the given probability.
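The idea behind this filter can be sketched as follows. It is only an illustration and not the filter's actual implementation: the function name, the seed parameter and the ages are hypothetical, and None stands for a missing value.

```python
import random


def replace_with_missing(values, probability, seed=1):
    """Replace each value with None (missing) with the given probability."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    return [None if rng.random() < probability else v for v in values]


ages = [18, 19, 21, 24, 27, 30]  # hypothetical ages
print(replace_with_missing(ages, probability=0.3))
```

A probability of 0 leaves the data unchanged, and a probability of 1 turns every value into a missing value.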
3. ReplaceMissingValuesWithUserConstant:
• Select the path as follows:
“Choose - filters - unsupervised - attribute - ReplaceMissingValuesWithUserConstant”.
• On applying this filter, the missing values in the current data are replaced with the
constants given by the user.
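In plain Python the same replacement is a one-line list comprehension. The function name, the constant "unknown" and the sample values are hypothetical, and None stands for a missing value.

```python
def replace_missing_with_constant(values, constant):
    """Return a copy of values with every None (missing) replaced by constant."""
    return [constant if v is None else v for v in values]


grades = ["A", None, "B", None]  # hypothetical values
filled = replace_missing_with_constant(grades, "unknown")
print(filled)  # → ['A', 'unknown', 'B', 'unknown']
```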

4. ReplaceMissingValues with Mean and Mode:

• Select the path as follows:
“Choose - filters - unsupervised - attribute - ReplaceMissingValues”.
• On applying this filter, the missing values in the current data are replaced with the
attribute’s mean (for numeric attributes) or mode (for nominal attributes).
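A minimal sketch of mean and mode imputation for a single attribute is shown below. The function name replace_missing and the sample values are hypothetical, and None stands for a missing value.

```python
import statistics


def replace_missing(values):
    """Fill None entries with the mean (numeric) or mode (nominal) of the rest."""
    present = [v for v in values if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = statistics.mean(present)  # numeric attribute: use the mean
    else:
        fill = statistics.mode(present)  # nominal attribute: most frequent value
    return [fill if v is None else v for v in values]


ages = [18, None, 22]           # hypothetical values
grades = ["A", "B", None, "A"]  # hypothetical values
print(replace_missing(ages))    # → [18, 20, 22]
print(replace_missing(grades))  # → ['A', 'B', 'A', 'A']
```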
5. Remove:
• Select the path as follows:
“Choose - filters - unsupervised - attribute - Remove”.
• In the filter options we can enter an attribute index, so that the attribute with that index
is removed from the current data.
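Removing an attribute by index amounts to dropping one column from the header and from every row. The sketch below uses a 1-based index, matching the way age was addressed as ‘1’ in the discretization example; the function name and the sample table are hypothetical.

```python
def remove_attribute(header, rows, index):
    """Drop the attribute at the given 1-based index from header and rows."""
    position = index - 1  # convert to Python's 0-based indexing
    new_header = header[:position] + header[position + 1:]
    new_rows = [row[:position] + row[position + 1:] for row in rows]
    return new_header, new_rows


header = ["age", "grade"]        # hypothetical attributes
rows = [[18, "A"], [21, "B"]]    # hypothetical rows
new_header, new_rows = remove_attribute(header, rows, index=1)
print(new_header)  # → ['grade']
print(new_rows)    # → [['A'], ['B']]
```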
