DHW Lab (Ex1 To 3)

INTRODUCTION TO WEKA TOOL

Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization.

Found only on the islands of New Zealand, the weka is a flightless bird with an inquisitive nature. Weka is open source software issued under the GNU General Public License.

Downloading and/or installation of WEKA data mining toolkit

1. Go to the Weka website, http://www.cs.waikato.ac.nz/ml/weka/, and download the software. On the left-hand side, click on the link that says Download.
2. Select the appropriate link corresponding to the version of the software for your operating system.
3. Download the software from one of the listed sites. Save the self-extracting executable to disk and then double-click on it to install Weka. Answer Yes or Next to the questions during the installation.
4. Click Yes to accept the Java agreement if necessary. After you install the program, Weka should appear on your Start menu under Programs.
5. To run Weka, from the Start menu select Programs, then Weka. You will see the Weka GUI Chooser. Select Explorer. The Weka Explorer will then launch.

Understand the features of WEKA toolkit such as Explorer, Knowledge Flow interface, Experimenter, command-line interface.

The Weka GUI Chooser provides a starting point for launching Weka's main GUI applications and supporting tools. The GUI Chooser consists of four buttons, one for each of the four major Weka applications, and four menus.

The buttons can be used to start the following applications:


Explorer: An environment for exploring data with WEKA.
a) Click on the Explorer button to bring up the Explorer window.
b) Make sure the Preprocess tab is highlighted.
c) Open a file by clicking on Open file... and choosing a file with the .arff extension from the data directory.
d) The attributes appear in the window below.
e) Click on an attribute to see its visualization on the right.
f) Click Visualize All to see all of them.

Experimenter: An environment for performing experiments and conducting statistical tests between learning schemes.
a) The Experimenter is for comparing results.
b) Under the Setup tab click New.
c) Click Add new under the Datasets panel. Choose a couple of ARFF files from the data directory, one at a time.
d) Click Add new under the Algorithms panel. Choose several algorithms, one at a time, by clicking OK in the window and then Add new.
e) Under the Run tab click Start.
f) Wait for WEKA to finish.
g) Under the Analyse tab click Experiment to see the results.

Knowledge Flow: This environment supports essentially the same functions as the Explorer but with a drag-and-drop interface. One advantage is that it supports incremental learning.

Simple CLI: Provides a simple command-line interface that allows direct execution of WEKA commands for operating systems that do not provide their own command-line interface.

Navigate the options available in WEKA (e.g. the Preprocess panel, Classify panel, Cluster panel, Associate panel, Select attributes panel and Visualize panel)

When the Explorer is first started only the first tab is active. This is because it is
necessary to open a data set before starting to explore the data.
The tabs are as follows:

 Preprocess: Choose and modify the data being acted on.
 Classify: Train and test learning schemes that classify or perform regression.
 Cluster: Learn clusters for the data.
 Associate: Learn association rules for the data.
 Select attributes: Select the most relevant attributes in the data.
 Visualize: View an interactive 2D plot of the data.
1. Preprocessing

Loading Data: The first four buttons at the top of the preprocess section enable you
to load data into WEKA:
 Open file... Brings up a dialog box allowing you to browse for the data file on the local file system.
 Open URL... Asks for a Uniform Resource Locator address where the data is stored.
 Open DB... Reads data from a database.
 Generate... Enables you to generate artificial data from a variety of Data Generators.
Using the Open file ... button you can read files in a variety of formats:

WEKA's ARFF format, CSV format, C4.5 format, or serialized Instances format. ARFF files typically have a .arff extension, CSV files a .csv extension, C4.5 files a .data and .names extension, and serialized Instances objects a .bsi extension.
2. Classification:

Selecting a Classifier: At the top of the Classify section is the Classifier box. This box has a text field that gives the name of the currently selected classifier and its options. Clicking on the text box with the left mouse button brings up a Generic Object Editor dialog box, just the same as for filters, that you can use to configure the options of the current classifier. With a right click (or Alt+Shift+left click) you can copy the setup string to the clipboard or display the properties in a Generic Object Editor dialog box. The Choose button allows you to choose one of the classifiers that are available in WEKA.

Test Options: The result of applying the chosen classifier will be tested according
to the options that are set by clicking in the Test options box.

There are four test modes:


1. Use training set: The classifier is evaluated on how well it predicts the class of
the instances it was trained on.
2. Supplied test set: The classifier is evaluated on how well it predicts the class of a set of instances loaded from a file. Clicking the Set... button brings up a dialog allowing you to choose the file to test on.
3. Cross-validation: The classifier is evaluated by cross-validation, using the
number of folds that are entered in the Folds text field.
4. Percentage split: The classifier is evaluated on how well it predicts a certain
percentage of the data which is held out for testing. The amount of data held out
depends on the value entered in the % field.
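The percentage-split mode can be sketched in plain Python. The function name, the shuffle, and the default 66% training share below are illustrative assumptions, not Weka's implementation:

```python
import random

def percentage_split(instances, train_pct=66, seed=1):
    """Hold out (100 - train_pct)% of the data for testing,
    as Weka's 'Percentage split' test option does."""
    data = instances[:]
    random.Random(seed).shuffle(data)   # randomize before splitting
    cut = int(len(data) * train_pct / 100)
    return data[:cut], data[cut:]

train, test = percentage_split(list(range(10)), train_pct=66)
```

With ten instances and a 66% split, six instances are used for training and four are held out for testing.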

3. Clustering:

Cluster Modes: The Cluster mode box is used to choose what to cluster and how to
evaluate the results. The first three options are the same as for classification: Use
training set, Supplied test set and Percentage split.
4. Associating:

Setting Up: This panel contains schemes for learning association rules, and the
learners are chosen and configured in the same way as the clusterers, filters, and
classifiers in the other panels.

5. Selecting Attributes:
Searching and Evaluating: Attribute selection involves searching through all
possible combinations of attributes in the data to find which subset of attributes
works best for prediction. To do this, two objects must be set up: an attribute
evaluator and a search method. The evaluator determines what method is used to
assign a worth to each subset of attributes. The search method determines what style
of search is performed.
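The evaluator/search pairing can be illustrated with a toy exhaustive search in Python. Both `best_subset` and the scoring rule are hypothetical stand-ins: the evaluator plays the role of a Weka attribute evaluator, and the loop over all combinations corresponds to an exhaustive search method:

```python
from itertools import combinations

def best_subset(attributes, evaluate):
    """Score every non-empty subset of attributes with the supplied
    evaluator and return the highest-scoring subset."""
    best, best_score = None, float("-inf")
    for r in range(1, len(attributes) + 1):
        for subset in combinations(attributes, r):
            score = evaluate(subset)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score

# Toy evaluator: rewards subsets containing 'petalwidth', penalizes size.
score = lambda s: ("petalwidth" in s) - 0.1 * len(s)
subset, _ = best_subset(["sepallength", "petalwidth", "sepalwidth"], score)
```

Here the search settles on the single attribute `petalwidth`, since adding more attributes only lowers the toy score.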

6. Visualizing:

WEKA's visualization section allows you to visualize 2D plots of the current relation.
Study the arff file format
An ARFF (Attribute-Relation File Format) file is an ASCII text file that describes a
list of instances sharing a set of attributes. ARFF files were developed by the
Machine Learning Project at the Department of Computer Science of The University
of Waikato for use with the Weka machine learning software.
Overview: ARFF files have two distinct sections. The first section is the Header information, which is followed by the Data information.
The Header of the ARFF file contains the name of the relation, a list of the attributes
(the columns in the data), and their types. An example header on the standard IRIS
dataset looks like this:
% 1. Title: Iris Plants Database
%
% 2. Sources:
% (a) Creator: R.A. Fisher
% (b) Donor: Michael Marshall (MARSHALL%[email protected])
% (c) Date: July, 1988
%
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
The Data of the ARFF file looks like the following:
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
Lines that begin with a % are comments.
The @RELATION, @ATTRIBUTE and @DATA declarations are case insensitive.
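The format is simple enough to parse by hand. A minimal sketch, handling only the subset of ARFF shown above (comments, case-insensitive @relation/@attribute/@data, comma-separated rows); `parse_arff` is an illustrative helper, not part of Weka:

```python
def parse_arff(text):
    """Minimal ARFF reader: returns (relation name, attribute names, data rows)."""
    relation, attributes, data = None, [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):      # skip blanks and comments
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])    # second token is the name
        elif lower.startswith("@data"):
            in_data = True
        elif in_data:
            data.append(line.split(","))
    return relation, attributes, data

sample = """@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor}
@DATA
5.1,Iris-setosa"""
rel, attrs, rows = parse_arff(sample)
```

For the sample above, this yields relation `iris`, attributes `sepallength` and `class`, and one data row.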

Explore the available data sets in WEKA


There are 23 different datasets available in Weka (C:\Program Files\Weka-3-6\) by default for testing purposes. All the datasets are in .arff format and are listed in that directory.
EX:No:1 DATA EXPLORATION AND INTEGRATION USING WEKA
AIM:

To explore and integrate Data using weka tool

Description:

Step 1: Load Your Data

 Open WEKA Tool.


 Click on the "Explorer" tab.
 In the "Preprocess" panel, click on the "Open file" button to load your dataset. WEKA
supports various file formats like CSV, ARFF, etc.

Step 2: Explore Your Data

 Once the dataset is loaded, explore it in the "Preprocess" panel.
 View summary statistics and information about your dataset by clicking on the "Summary" button.
 This gives a quick overview of the data's distribution, missing values, and other statistics.
 You can also visualize your data using the "Visualize" button. This allows you to generate various plots and charts to understand the data's patterns and relationships.

Step 3: Preprocess Data

Data preprocessing is a critical step in data integration and exploration. You may need to clean, transform, and preprocess your data to make it suitable for machine learning. Here are some common preprocessing steps.

Data Preprocessing Steps:


Handling Missing Values: Use the "Filter" option in the "Preprocess" panel to apply filters like
"ReplaceMissingValues" to handle missing data.

Feature Selection: WEKA provides various feature selection methods to choose the most
relevant features for your machine learning model.

Data Transformation: You can use filters like "Normalize" or "Standardize" to scale your
features. This ensures that all features are on the same scale, which can be important for many
machine learning algorithms.

Data Discretization: If you have continuous variables, you may want to discretize them into
bins using filters like "Discretize."
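As a rough sketch of what the Normalize and Discretize filters compute, assuming min-max scaling and equal-width binning (the function names here are illustrative, not Weka API):

```python
def normalize(values):
    """Rescale numeric values to [0, 1] (min-max scaling),
    like Weka's Normalize filter."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def discretize(values, bins=3):
    """Equal-width binning: map each value to a bin index 0..bins-1,
    a simple form of what Weka's Discretize filter does by default."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # clamp the maximum value into the last bin
    return [min(int((v - lo) / width), bins - 1) for v in values]

petal_widths = [0.2, 0.4, 1.3, 1.8, 2.5]
bins = discretize(petal_widths, bins=3)
```

Normalizing `[0, 5, 10]` gives `[0.0, 0.5, 1.0]`; the petal widths above fall into three equal-width bins of width (2.5 - 0.2) / 3.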
Step 4: Integration

Integration often involves combining data from multiple sources. In WEKA, you can load multiple datasets and merge or append them using the "Merge Two Files" filter, or use external tools to combine data before loading it into WEKA.

Example Scenario:

Let's consider two datasets: one containing student details (StudentDetails.csv) and another containing weather details (WeatherDetails.csv). Now integrate these datasets. Preprocess each dataset separately to handle missing values, feature selection, and transformation. Use the "Merge Two Files" and "Append Two Files" filters in WEKA. Once the datasets are integrated, proceed with clustering or classification tasks to segment the data.

Load datasets in WEKA.

PROCEDURE 1:
1. Open the Weka tool.

2. Click the Explorer button.

3. Click the Open file button under the Preprocess tab.

4. Choose the file weather.nominal.arff and click Open.

5. Select the outlook attribute and observe the attributes and the charts.

6. Uncheck all the attributes.

7. Select the play attribute.

8. Click the Visualize All button to view the different charts.

9. Stop the process.

Data Integration after Loading

PROCEDURE 2:
STEP 1: Create a new dataset Sample1.arff

@relation sample1
@attribute col1 numeric
@attribute col2 numeric
@attribute result {Yes, No}
@data
10, 20, Yes
20, 30, No
STEP 2: Create the new dataset Sample2.arff

@relation sample2
@attribute col1 numeric
@attribute col2 numeric
@attribute result {Yes, No}
@data
30, 40, Yes
40, 50, No

STEP 3: Open the weka tool

STEP 4: Click simple CLI button

STEP 5: In the Simple CLI, run: java weka.core.Instances append z:/sample1.arff z:/sample2.arff > z:/sample3.arff
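What the append command does can be sketched in Python. `append_arff` below is an illustrative helper that simply concatenates the @data rows of the second document under the first document's header; Weka's `weka.core.Instances append` additionally checks that the two headers are compatible:

```python
def append_arff(text1, text2):
    """Concatenate the @data rows of two ARFF documents that share
    the same attribute structure, keeping the first header."""
    def split(text):
        lines = text.strip().splitlines()
        # find the @data line; everything after it is data rows
        i = next(n for n, l in enumerate(lines)
                 if l.strip().lower().startswith("@data"))
        return lines[:i + 1], lines[i + 1:]
    header, rows1 = split(text1)
    _, rows2 = split(text2)
    return "\n".join(header + rows1 + rows2)

sample1 = "@relation sample1\n@attribute col1 numeric\n@data\n10,20"
sample2 = "@relation sample2\n@attribute col1 numeric\n@data\n30,40"
merged = append_arff(sample1, sample2)
```

The merged document keeps the first relation's header and ends with the rows of both inputs, mirroring the sample3.arff produced by the CLI step.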

RESULT:

Thus, the data exploration and integration using Weka has been completed successfully.
Ex:No:2 Apply Weka tool for data validation

Aim:

To apply weka tool for data validation.

Description

Cross Validation (Using 10 folds)

 Suppose Weka is given 100 labelled instances.
 It divides them into 10 equal-sized folds. In each round, 90 instances are used for training and the remaining 10 for testing.
 It builds a classifier with an algorithm from the 90 training instances and applies it to the 10 test instances of fold 1.
 It does the same for folds 2 to 10, producing 9 more classifiers, and averages the performance of the 10 classifiers produced from the 10 (90 training / 10 testing) splits.
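The fold construction described above can be sketched as follows; `cross_validation_folds` is an illustrative helper, not Weka code (Weka also stratifies the folds by class, which this sketch omits):

```python
def cross_validation_folds(instances, k=10):
    """Split data into k folds; each iteration yields a (train, test)
    pair where one fold is held out and the rest are used for training."""
    folds = [instances[i::k] for i in range(k)]   # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test

data = list(range(100))
splits = list(cross_validation_folds(data, k=10))
```

With 100 instances and k=10, every round trains on 90 instances and tests on the remaining 10, and each instance is tested exactly once.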

Procedure 1:

1. Open Weka and click on Explorer.
2. Under the Preprocess tab click "Open file" and load the "credit-g.arff" dataset.
3. Under the Classify tab select the J48 classifier under trees, and set the test option to Cross-validation with 10 folds.
4. Click Start and note the result.
5. Repeat steps 3 and 4, changing the test option to Cross-validation with 3 and then 5 folds.
6. Compare the generated results.
7. Visualize the results by right-clicking the entry in the result list and clicking Visualize tree.

Training Set and Test Set

Training data is the (often large) dataset used to teach a machine learning model. It is used to teach prediction models that use machine learning algorithms how to extract features that are relevant to specific business goals. For supervised ML models the training data is labelled; the data used to train unsupervised ML models is not. Training data is also known as a training set, training dataset or learning set.
The test set is a separate set of data used to test the model after training is complete.

Procedure 2:

1. Open Weka and click on Explorer.

2. Under the Preprocess tab click "Open file" and load the "segment-challenge" dataset.

3. Under the Classify tab select the J48 classifier under trees.

4. Select "Use training set" under test options.

5. Click the Start button and observe the generated results.

6. Select "Supplied test set" under test options.

7. Click the Set... button, then Open file, and choose the "segment-test.arff" file.

8. Click the Start button and compare the training and test results.

9. Stop the process.

Result:

Thus Data validation using Weka tool has been completed successfully.
Ex:No:3 Plan the architecture for a Real time application

AIM:

To plan the Web Services based Real time Data Warehouse Architecture

Procedure:

A web services-based real-time data warehouse architecture enables the integration of data from various sources in near real-time, using web services as the communication mechanism. Here's an overview of such an architecture:

Data Sources: These are the systems or applications where the raw data originates. They could include operational databases, external APIs, logs, etc.

Web Service Clients (WS Client): These components are responsible for extracting
data changes from the data sources using techniques such as Change Data Capture
(CDC) and sending them to the web service provider. They make use of web service
calls to transmit data.

Web Service Provider: The web service provider receives data from the clients and processes it for further integration into the real-time data warehouse. It decomposes the received data, performs the necessary transformations, generates SQL statements, and interacts with the data warehouse for insertion.

This is a web service that receives data from the WS Client and adds it to the Real-
Time Partition. It decomposes the received Data Transfer Object into data and
metadata. It then uses metadata to generate SQL via an SQL-Generator to insert the
data into RTDW log tables and executes the generated SQL on the RTDW database.
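The SQL-Generator step can be sketched roughly as follows. The table name, column names, and the `generate_insert` helper are all hypothetical illustrations; the architecture only says that metadata drives the SQL generation:

```python
def generate_insert(metadata, row):
    """Build an INSERT statement for an RTDW log table from metadata
    describing the target table and its columns."""
    cols = ", ".join(metadata["columns"])
    vals = ", ".join(repr(v) for v in row)   # naive literal rendering
    return f"INSERT INTO {metadata['table']} ({cols}) VALUES ({vals});"

# Hypothetical metadata for one log table and one captured row.
meta = {"table": "sales_log", "columns": ["order_id", "amount"]}
sql = generate_insert(meta, (42, 19.99))
```

In a real system the generator would use parameterized statements rather than string interpolation; the sketch only shows how metadata (table and column names) separates cleanly from the transported data.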

Metadata: Metadata describes the structure and characteristics of the data. In this context, it is used by the Web Service Provider to generate SQL for inserting data into RTDW log tables. In a web services-based architecture, metadata plays a crucial role in understanding data formats, schemas, and transformations. It is often managed centrally to ensure consistency across the system.

ETL (Extract, Transform, Load): ETL processes are employed to collect data
from various sources, transform it into a consistent format, and load it into the data
warehouse. In a real-time context, this process may involve continuous or near real-
time transformations to ensure that data is available for analysis without significant
delays.

Real-Time Partition: This is a section of the data warehouse dedicated to storing real-time or near real-time data. It may utilize techniques such as in-memory databases or specialized storage structures optimized for high-speed data ingestion and query processing. There are three stages:

 Putting the CDC data into the log table.
 Cleaning the CDC log data on demand.
 Aggregating the cleaned CDC data on demand.
Data Warehouse: The data warehouse stores both historical and real-time data. It
provides a unified repository for storing and querying data for analytical purposes.
In a web services-based architecture, the data warehouse may be accessed through
APIs exposed as web services.

Real-Time Data Integration: This component facilitates the integration of real-time data into the data warehouse. It ensures that data from various sources is combined seamlessly and made available for analysis in real-time or near real-time.

Query Interface: Users interact with the system through a query interface, which
could be a web-based dashboard, API endpoints, or other client applications. The
query interface allows users to retrieve and analyze data stored in the data
warehouse, including both historical and real-time data.

Web Services based Real time Data Warehouse Architecture


Overall, a web services-based real-time data warehouse architecture provides a
scalable and flexible framework for integrating and analyzing data from diverse
sources in real-time, enabling organizations to make data-driven decisions more
effectively.

Result:

Thus the web services-based real-time data warehouse architecture has been studied successfully.
