DM Lab Record PDF

The document outlines a series of experiments and tasks related to credit risk assessment using a dataset of 1000 historical credit cases from Germany. It includes tasks such as creating decision trees, assessing attribute importance, and evaluating model accuracy through cross-validation. Additionally, it provides resources and instructions for using the WEKA data mining toolkit to analyze the dataset and develop credit assessment models.

Uploaded by

srinumarlapati4732

INDEX

S.no.  Date      Title of the experiment
1.     14/08/23  List all the categorical and the real-valued attributes
2.     21/08/23  What attributes are crucial in making the credit assessment?
3.     28/08/23  Create a Decision Tree – train using the complete dataset.
4.     04/09/23  Classify credit good/bad for each of the examples in the dataset.
5.     11/09/23  Is testing on the training set, as you did above, a good idea?
6.     25/09/23  Train a Decision Tree again using cross-validation and report your results.
7.     30/10/23  Use the Preprocess tab and check whether removing these attributes has any significant effect.
8.     06/11/23  Check cross-validation results after making changes. Are they significantly different from results obtained in problem 6?
9.     20/11/23  How does the complexity of a Decision Tree relate to the bias of the model?
10.    27/11/23  Report your accuracy using the pruned model. Does your accuracy increase?

Syllabus:
Task 1: Credit Risk Assessment
Description:
The business of banks is making loans. Assessing the credit worthiness of an applicant is of crucial
importance. You have to develop a system to help a loan officer decide whether the credit of a
customer is good or bad. A bank’s business rules regarding loans must consider two opposing factors.
On the one hand, a bank wants to make as many loans as possible. Interest on these loans is the bank’s
profit source. On the other hand, a bank cannot afford to make too many bad loans. Too many bad
loans could lead to the collapse of the bank. The bank’s loan policy must involve a compromise: not
too strict, and not too lenient.
To do the assignment, you first and foremost need some knowledge about the world of credit.
You can acquire such knowledge in a number of ways.
1. Knowledge engineering. Find a loan officer who is willing to talk. Interview her and
try to represent her knowledge in the form of production rules.
2. Books. Find some training manuals for loan officers or perhaps a suitable textbook on
finance. Translate this knowledge from text form to production rule form.
3. Common sense. Imagine yourself as a loan officer and make up reasonable rules which can
be used to judge the credit worthiness of a loan applicant.
4. Case histories. Find records of actual cases where competent loan officers correctly judged
when, and when not, to approve a loan application.

The German credit data:

Actual historical credit data is not always easy to come by because of confidentiality rules. Here is
one such dataset, consisting of 1000 actual cases collected in Germany. It is available as the original
credit dataset and as an Excel spreadsheet version of the German credit data.
In spite of the fact that the data is German, you should probably make use of it for this assignment
(unless you really can consult a real loan officer!).
A few notes on the German dataset:
• DM stands for Deutsche Mark, the unit of currency, worth about 90 cents Canadian (but
looks and acts like a quarter).
• Owns telephone. German phone rates are much higher than in Canada, so fewer people
own telephones.
• Foreign worker. There are millions of these in Germany (many from Turkey). It is very hard
to get German citizenship if you were not born of German parents.
• There are 20 attributes used in judging a loan applicant. The goal is to classify the applicant
into one of two categories, good or bad.
Subtasks: (Turn in your answers to the following tasks)
1. List all the categorical (or nominal) attributes and the real-valued attributes separately. (5 marks)
2. What attributes do you think might be crucial in making the credit assessment? Come up
with some simple rules in plain English using your selected attributes. (5 marks)
3. One type of model that you can create is a decision tree. Train a decision tree using the
complete dataset as the training data. Report the model obtained after training. (10 marks)
4. Suppose you use your above model trained on the complete dataset, and classify credit
good/bad for each of the examples in the dataset. What % of examples can you classify
correctly? (This is also called testing on the training set.) Why do you think you cannot get
100% training accuracy? (10 marks)
5. Is testing on the training set as you did above a good idea? Why or why not? (10 marks)
6. One approach for solving the problem encountered in the previous question is using
cross-validation. Describe what cross-validation is and report your results. Does your accuracy
increase/decrease? Why? (10 marks)
7. Check to see if the data shows a bias against “foreign workers” (attribute 20) or
“personal-status” (attribute 9). One way to do this (perhaps rather simple-minded) is to remove these
attributes from the dataset and see if the decision tree created in those cases is significantly
different from the full dataset case which you have already done. To remove an attribute you
can use the Preprocess tab in Weka’s GUI Explorer. Did removing these attributes have any
significant effect? Discuss. (10 marks)
8. Another question might be, do you really need to input so many attributes to get good results?
Maybe only a few would do. For example, you could try just having attributes
2, 3, 5, 7, 10, 17 (and 21, the class attribute, naturally). Try out some combinations. (You had
removed two attributes in problem 7. Remember to reload the ARFF data file to get all the
attributes initially before you start selecting the ones you want.) (10 marks)
9. Sometimes, the cost of rejecting an applicant who actually has good credit (case 1) might be
higher than accepting an applicant who has bad credit (case 2). Instead of counting the
misclassifications equally in both cases, give a higher cost to the first case (say cost 5) and
a lower cost to the second case. You can do this by using a cost matrix in Weka. Train your
Decision Tree again and report the Decision Tree and cross-validation results. Are they
significantly different from results obtained in problem 6 (using equal cost)? (10 marks)
10. Do you think it is a good idea to prefer simple decision trees instead of having long complex
decision trees? How does the complexity of a Decision Tree relate to the bias of the model?
(10 marks)
11. You can make your Decision Trees simpler by pruning the nodes. One approach is to use
Reduced Error Pruning. Explain this idea briefly. Try reduced error pruning for training your
Decision Trees using cross-validation (you can do this in Weka) and report the Decision Tree
you obtain. Also, report your accuracy using the pruned model. Does your accuracy increase?
(10 marks)
12. (Extra credit): How can you convert a Decision Tree into “if-then-else rules”? Make up your
own small Decision Tree consisting of 2-3 levels and convert it into a set of rules. There also
exist different classifiers that output the model in the form of rules. One such classifier in
Weka is rules.PART; train this model and report the set of rules obtained. Sometimes just
one attribute can be good enough in making the decision, yes, just one! Can you predict what
attribute that might be in this dataset? The OneR classifier uses a single attribute to make
decisions (it chooses the attribute based on minimum error). Report the rule obtained by
training a OneR classifier. Rank the performance of J48, PART and OneR. (10 marks)

Task Resources:
• Mentor lecture on Decision Trees
• Andrew Moore’s Data Mining Tutorials (see tutorials on Decision Trees and cross-validation)
• Decision Trees (Source: Tan, MSU)
• Tom Mitchell’s book slides (see slides on concept learning and Decision Trees)
• Weka resources:
   • Introduction to Weka (html version) (download ppt version)
   • Download Weka
   • Weka Tutorial
   • ARFF format
   • Using Weka from command line

Introduction
Explore the WEKA Data Mining/Machine Learning Toolkit

I. Downloading and/or installation of the WEKA data mining toolkit.

II. Understanding the features of the WEKA toolkit such as the Explorer, Knowledge Flow
interface, Experimenter, and command-line interface.
III. Navigate the options available in WEKA (e.g. Select attributes panel, Preprocess
panel, Classify panel, Cluster panel, Associate panel and Visualize panel).
IV. Study the ARFF file format.
V. Explore the available datasets in WEKA.
VI. Load a dataset (e.g. weather dataset, iris dataset, etc.).
VII. Load each dataset and observe the following:

i. List the attribute names and their types
ii. Number of records in each dataset
iii. Identify the class attribute (if any)
iv. Plot a histogram
v. Determine the number of records for each class
vi. Visualize the data in various dimensions
PROCEDURE:
I. Downloading and/or installation of the WEKA data mining toolkit.
Download the WEKA tool from the following link and install it:
http://www.cs.waikato.ac.nz/ml/weka/downloading.html
II. Understanding the features of the WEKA toolkit such as the Explorer, Knowledge Flow
interface, Experimenter, and command-line interface.
WEKA: Waikato Environment for Knowledge Analysis
The WEKA GUI Chooser provides a starting point for launching WEKA’s main GUI applications
and supporting tools. The GUI Chooser consists of four buttons, one for each of the four major
WEKA applications, and four menus.
The buttons can be used to start the following applications:
1. Explorer: An environment for exploring data with WEKA.
2. Experimenter: An environment for performing experiments and conducting statistical
tests between learning schemes.
3. Knowledge Flow: It supports essentially the same functions as the Explorer but with
a drag-and-drop interface. One advantage is that it supports incremental learning.
4. Simple CLI: Provides a simple command-line interface that allows direct
execution of WEKA commands for operating systems that do not provide their
own command-line interface.

EXPLORER:
It is a user interface which contains a group of tabs just below the title bar. The tabs
are as follows:
1. Preprocess
2. Classify
3. Cluster
4. Associate
5. Select Attributes
6. Visualize
The bottom of the window contains the status box, the Log button and the WEKA bird.
Experimenter:
The Weka Experiment Environment enables the user to create, run, modify, and analyze
experiments in a more convenient manner than is possible when processing the schemes
individually. For example, the user can create an experiment that runs several schemes against a series
of datasets and then analyse the results to determine if one of the schemes is (statistically) better than
the other schemes.
The Experiment Environment can also be run from the command line using the Simple
CLI.

The Experimenter offers two configuration modes, which you select with the Experiment Configuration Mode radio buttons:
• Simple
• Advanced
Both modes allow you to set up standard experiments, which are run locally on a single machine, or
remote experiments, which are distributed between several hosts.
Knowledge Flow
The Knowledge Flow provides an alternative to the Explorer as a graphical front end to
WEKA’s core algorithms. The Knowledge Flow presents a data-flow inspired interface to
WEKA. The user can select WEKA components from a palette, place them on a layout canvas
and connect them together in order to form a knowledge flow for processing and analyzing
data. At present, all of WEKA’s classifiers, filters, clusterers, associators, loaders and savers
are available in the Knowledge Flow along with some extra tools.
Simple CLI
The Simple CLI provides full access to all Weka classes, i.e., classifiers, filters, clusterers, etc.,
but without the hassle of the CLASSPATH (it uses the one with which WEKA was
started). It offers a simple Weka shell with separated command line and output.

III. Navigate the options available in WEKA (e.g. Select attributes panel, Preprocess
panel, Classify panel, Cluster panel, Associate panel and Visualize panel). Explore
the available datasets in WEKA. Load each dataset and observe the
following:

i. List the attribute names and their types
ii. Number of records in each dataset
iii. Identify the class attribute (if any)
iv. Plot a histogram
v. Determine the number of records for each class
vi. Visualize the data in various dimensions

PREPROCESSING:
Preprocessing is the process of identifying unwanted data (data cleaning) before the data is
loaded from the database.
• Open the WEKA application, as shown in Figure-1.
• Click on Explorer, as shown in Figure-2.
• Open a file by choosing the “Open file” button, as shown in Figure-3.
• Choose the data folder in the Open dialog box, as in Figure-4.
• Choose the “house.arff” file, as in Figure-5.
Figure-1

Figure-2

Figure-3
In the Preprocess panel, Relation gives the name of the relation (dataset) in use, Instances gives
the number of objects (records) it contains, and Attributes gives the number of attributes in the relation.
Figure-4

Figure-5
Figure-6

Now click Visualize All.

Figure-7
IV. Study the ARFF file format

An ARFF (Attribute-Relation File Format) file is an ASCII text file that describes a list of
instances sharing a set of attributes. ARFF is not the only format one can load: any file that
can be converted with Weka’s “core converters”, such as CSV, can be loaded as well.

Now create an ARFF file: open Notepad, type the following text and save the file with the
.arff extension.

@RELATION Student
@ATTRIBUTE customerid NUMERIC
@ATTRIBUTE age {youth,middle,senior}
@ATTRIBUTE income {low,medium,high}
@ATTRIBUTE student {yes,no}
@ATTRIBUTE credit_rating {fair,excellent}
@ATTRIBUTE buy_computer {yes,no}
@data
%
1,youth,high,no,fair,no
2,youth,high,no,excellent,no
3,middle,high,no,fair,yes
4,senior,medium,no,fair,yes
5,senior,low,yes,fair,yes
6,senior,low,yes,excellent,no
7,middle,low,yes,excellent,yes
8,youth,medium,no,fair,no
9,youth,low,yes,fair,yes
10,senior,medium,yes,fair,yes
11,youth,medium,yes,excellent,yes
12,middle,medium,no,excellent,yes
13,middle,high,yes,fair,yes
14,senior,medium,no,excellent,no
%

• Now save the file with the .arff extension.

TASK-1
Credit Risk Assessment

Description: The business of banks is making loans. Assessing the credit worthiness of an
applicant is of crucial importance. You have to develop a system to help a loan officer decide
whether the credit of a customer is good or bad. A bank’s business rules regarding loans must
consider two opposing factors. On the one hand, a bank wants to make as many loans as possible.
Interest on these loans is the bank’s profit source. On the other hand, a bank cannot afford to make
too many bad loans. Too many bad loans could lead to the collapse of the bank. The bank’s loan
policy must involve a compromise: not too strict and not too lenient.

The German Credit Data

Actual historical credit data is not always easy to come by because of confidentiality rules.

• Now select the credit-g.arff file from the Weka data folder. The loaded file is shown in Figure-1.
Tasks:

1. List all the categorical (or nominal) attributes and the real valued attributes
separately.

Ans) The following are the Categorical (or Nominal) attributes:

1. Checking_Status
2. Credit_history
3. Purpose
4. Savings_status
5. Employment
6. Personal_status
7. Other_parties
8. Property_Magnitude
9. Other_payment_plans
10. Housing
11. Job
12. Own_telephone
13. Foreign_worker

The following are the Numerical attributes:

1. Duration
2. Credit_amount
3. Installment_Commitment
4. Residence_since
5. Age
6. Existing_credits
7. Num_dependents
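
The same nominal/numeric split can be read off an ARFF header instead of the GUI. The following is a minimal sketch, not Weka's own parser: it assumes the simple header style of the Student example shown earlier (unquoted attribute names, one declaration per line), and `split_attributes` is a hypothetical helper name. For credit-g.arff the function would be applied to the text of that file.

```python
# Header in the style of the Student ARFF example; only the declared types matter.
ARFF_HEADER = """\
@RELATION Student
@ATTRIBUTE customerid NUMERIC
@ATTRIBUTE age {youth,middle,senior}
@ATTRIBUTE income {low,medium,high}
@ATTRIBUTE student {yes,no}
@ATTRIBUTE credit_rating {fair,excellent}
@ATTRIBUTE buy_computer {yes,no}
@data
"""

NUMERIC_TYPES = {"numeric", "real", "integer"}


def split_attributes(arff_text):
    """Return (nominal_names, numeric_names) found in an ARFF header."""
    nominal, numeric = [], []
    for raw in arff_text.splitlines():
        line = raw.strip()
        lowered = line.lower()
        if lowered.startswith("@data"):
            break  # attribute declarations end where the data section begins
        if not lowered.startswith("@attribute"):
            continue  # skips @relation, % comments and blank lines
        declaration = line[len("@attribute"):].strip()
        if "{" in declaration:
            # Nominal attribute: the name is everything before the value list.
            nominal.append(declaration.split("{", 1)[0].strip())
        else:
            name, type_name = declaration.split(None, 1)
            if type_name.strip().lower() in NUMERIC_TYPES:
                numeric.append(name)
    return nominal, numeric


nominal, numeric = split_attributes(ARFF_HEADER)
print("Nominal:", nominal)
print("Numeric:", numeric)
```

Attributes of other ARFF types (string, date) are ignored by this sketch, since the task only asks for the nominal and real-valued ones.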
2. What attributes do you think might be crucial in making the credit assessment? Come up
with some simple rules in plain English using your selected attributes.
Ans) The following attributes may be crucial in making the credit assessment:
1. Credit_amount
2. Age
3. Job
4. Savings_status
5. Existing_credits
6. Installment_commitment
7. Property_magnitude
3. One type of model that you can create is a Decision tree. Train a Decision tree using the
complete data set as the training data. Report the model obtained after training.

4. Suppose you use your above model trained on the complete dataset, and classify credit
good/bad for each of the examples in the dataset. What % of examples can you classify
correctly? (This is also called testing on the training set.) Why do you think you cannot get 100%
training accuracy?
Ans) When the model trained on the complete dataset is used to classify each example in that
same dataset as good or bad, only 85.5% of the examples are classified correctly; we do not get
100% training accuracy. One likely reason is that J48 prunes the tree by default (the J48 -C 0.25 -M 2
setting), so some leaves cover examples of both classes.

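5. Is testing on the training set as you did above a good idea? Why or why not?
Ans) No, it is not a good idea. The model has already seen every example it is tested on, so the
accuracy measured on the training set is an optimistic estimate and says little about how the model
will perform on new applicants.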
6. One approach for solving the problem encountered in the previous question is using
cross-validation. Describe briefly what cross-validation is. Train a decision tree again using
cross-validation and report your results. Does accuracy increase/decrease? Why?

Ans) Cross-validation: the dataset is split into k parts (folds) of roughly equal size. Each fold is
used once as the test set while the classifier is trained on the remaining k-1 folds, and the k results
are combined into one estimate. In Weka, the classifier is evaluated by cross-validation using the
number of folds entered in the Folds text field.
In the Classify tab, select the Cross-validation option, set Folds to 2 and press the Start button.
Repeat with Folds set to 5, and then with Folds set to 10.
i) Fold Size-10

ii) Fold Size-5

iii) Fold Size-2

Note: In this observation, the accuracy increased with 5 folds and decreased with 10 folds.
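
How the folds are formed can be sketched in a few lines. This is only an illustration of the splitting step for the 1000-example dataset with the fold sizes tried above: `k_fold_indices` is a hypothetical helper, the seed is arbitrary, and unlike Weka's cross-validation this sketch does not stratify the folds by class.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    folds = [indices[i::k] for i in range(k)]  # k interleaved, near-equal parts
    for i in range(k):
        test = folds[i]
        # Training data is every fold except the one held out for testing.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


for k in (2, 5, 10):
    train, test = next(k_fold_indices(1000, k))
    print(f"{k} folds: train on {len(train)} examples, test on {len(test)}")
```

With fewer folds each model is trained on less data (500 examples with 2 folds, 900 with 10), which is one reason the reported accuracy can differ between fold sizes.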
7. Check to see if the data shows a bias against “foreign workers” or “personal-status”. One
way to do this is to remove these attributes from the dataset and see if the decision tree
created in those cases is significantly different from the full dataset case which you have
already done. Did removing these attributes have any significant effect? Discuss.
Ans) We use the Preprocess tab in the Weka GUI Explorer to remove the attributes
“Foreign_worker” and “Personal_status” one by one. In the Classify tab, select the Use training set
option and press the Start button. With either attribute removed, the accuracy changes compared
with the full dataset.
i) If Foreign_worker is removed

ii) If Personal_status is removed

Analysis:
In this observation, when the “Foreign_worker” attribute is removed from the
dataset, the accuracy decreases. So this attribute is important for classification.
8. Another question might be, do you really need to input so many attributes to get good
results? Maybe only a few would do. For example, you could try just having attributes
2, 3, 5, 7, 10, 17 and 21. Try out some combinations. (You had removed two attributes in problem
7. Remember to reload the ARFF data file to get all the attributes initially before you start
selecting the ones you want.)

Procedure:

1. Remove the 2nd attribute (Duration):

Use the Preprocess tab in the Weka GUI Explorer to remove the 2nd attribute (Duration). In the
Classify tab, select the Use training set option and press the Start button. With this attribute
removed, the accuracy changes compared with the full dataset. The output is shown in Figure-1.

Figure-1
2. Remove the 3rd attribute (Credit_history):
First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 3rd attribute (Credit_history). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-2.
Figure-2
3. Remove the 5th attribute (Credit_amount):

First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 5th attribute (Credit_amount). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-3.

Figure-3
4. Remove the 7th attribute (Employment):

First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 7th attribute (Employment). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-4.
Figure-4
5. Remove the 10th attribute (Other_parties):

First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 10th attribute (Other_parties). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-5.

Figure-5
6. Remove the 17th attribute (Job):
First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 17th attribute (Job). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-6.

Figure-6

7. Remove the 21st attribute (Class):

First restore the previously removed attribute by pressing Undo in the Preprocess tab. Then use
the Preprocess tab to remove the 21st attribute (Class). In the Classify tab, select the
Use training set option and press the Start button. With this attribute removed, the accuracy
changes compared with the full dataset. The output is shown in Figure-7.
Figure-7

ANALYSIS:

In this observation, when the 3rd attribute is removed from the dataset, the accuracy decreases
(83%), so this attribute is important for classification. When the 2nd or the 10th attribute is
removed, the accuracy is the same (84%), so either one of them can be removed. When the 7th or
the 17th attribute is removed, the accuracy is the same (85%), so either one of them can be
removed. When the 5th or the 21st attribute is removed, the accuracy increases, so these
attributes may not be needed for the classification. Note, however, that the 21st attribute is the
class label itself, so a run without it predicts a different target and its accuracy is not directly
comparable with the others.
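
The task statement also suggests keeping only attributes 2, 3, 5, 7, 10, 17 and 21 at once. Selecting columns by their 1-based position can be sketched as below. This is an illustrative helper outside Weka, not the Preprocess tab's implementation: `keep_attributes` is a hypothetical name, the attribute order is assumed to be that of credit-g.arff, and the sample row is a placeholder rather than real data.

```python
# Attribute order assumed to match credit-g.arff (21 columns, class last).
ATTRIBUTES = [
    "checking_status", "duration", "credit_history", "purpose", "credit_amount",
    "savings_status", "employment", "installment_commitment", "personal_status",
    "other_parties", "residence_since", "property_magnitude", "age",
    "other_payment_plans", "housing", "existing_credits", "job",
    "num_dependents", "own_telephone", "foreign_worker", "class",
]


def keep_attributes(names, rows, keep_1_based):
    """Return (names, rows) restricted to the given 1-based attribute positions."""
    positions = [i - 1 for i in sorted(set(keep_1_based))]
    kept_names = [names[p] for p in positions]
    kept_rows = [[row[p] for p in positions] for row in rows]
    return kept_names, kept_rows


# Placeholder row: each cell just holds its own 1-based column number.
rows = [list(range(1, 22))]
kept_names, kept_rows = keep_attributes(ATTRIBUTES, rows, [2, 3, 5, 7, 10, 17, 21])
print(kept_names)
print(kept_rows[0])
```

Keeping position 21 in the selection preserves the class column, so the reduced dataset can still be used to train a classifier for good/bad credit.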
9. Sometimes, the cost of rejecting an applicant who actually has good credit might be
higher than accepting an applicant who has bad credit. Instead of counting the
misclassifications equally in both cases, give a higher cost to the first case (say cost 5)
and a lower cost to the second case. You can do this by using a cost matrix in Weka. Train your
decision tree and report the Decision Tree and cross-validation results. Are they significantly
different from the results obtained in problem 6?

Procedure:
• Open the WEKA GUI Explorer, select the Classify tab and select the Use training
set option.
• In the Classify tab, press the Choose button and select J48 as the decision tree technique.
• In the Classify tab, press the More options button to open the Classifier evaluation options
window.
• Select Cost-sensitive evaluation and press the Set button to open the Cost Matrix
Editor.
• Change Classes to 2 and press the Resize button to get a 2x2 cost matrix.
• In the cost matrix, change the value at location (0,1) to 5. The modified cost matrix is
shown in Figure-8.

Figure-8
• Close the Cost Matrix Editor, press the OK button and then press the Start button. The result is
shown in Figure-9.
Figure-9

Analysis:
In this observation, of the 700 good customers, 669 are classified as good
customers and 31 are misclassified as bad customers. Of the 300 bad customers, 186 are classified
as bad customers and 114 are misclassified as good customers.
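
The accuracy and the total misclassification cost follow directly from these four counts. The sketch below works them out, assuming rows are actual classes and columns are predicted classes in the order (good, bad), and assuming the second kind of error keeps a cost of 1 (that value is not stated in the record).

```python
# Confusion matrix from the analysis: rows = actual, columns = predicted, order (good, bad).
confusion = [
    [669, 31],   # actual good: 669 predicted good, 31 predicted bad
    [114, 186],  # actual bad: 114 predicted good, 186 predicted bad
]

# Cost matrix: 5 at location (0,1), i.e. a good applicant rejected as bad.
# The cost of 1 at location (1,0) is an assumption.
cost = [
    [0, 5],
    [1, 0],
]


def accuracy(confusion):
    """Fraction of examples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def total_cost(confusion, cost):
    """Sum of count * cost over every (actual, predicted) cell."""
    return sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )


print("Accuracy:", accuracy(confusion))            # (669 + 186) / 1000
print("Total cost:", total_cost(confusion, cost))  # 31 * 5 + 114 * 1
```

The 855 correct predictions out of 1000 give the same 85.5% reported for testing on the training set, while the weighted cost counts each rejected good applicant five times as heavily as an accepted bad one.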
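10. Do you think it is a good idea to prefer simple decision trees instead of having long complex
decision trees? How does the complexity of a Decision Tree relate to the bias of the model?

Analysis:
It is a good idea to prefer simple decision trees over complex ones. A very complex tree has low
bias but tends to fit noise in the training data (overfitting), while a simpler tree has somewhat
higher bias but usually generalizes better to unseen examples.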
11. You can make your Decision Trees simpler by pruning the nodes. One approach is to use
Reduced Error Pruning. Explain this idea briefly. Try reduced error pruning for training
your Decision Trees using cross-validation and report the Decision Trees you obtain. Also
report your accuracy using the pruned model. Does your accuracy increase?

• We can make our decision tree simpler by pruning the nodes.
• In the Weka GUI, select the Classify tab and select the Use training set option.
• In the Classify tab, press the Choose button and select J48 as the decision tree technique.
• Beside the Choose button, click on the text J48 -C 0.25 -M 2 to open the Generic Object Editor.
• Set the reducedErrorPruning property to True and press OK.
• Press the Start button.

Figure-10

Analysis:
With the pruned model, the accuracy decreased. Pruning the nodes does, however, make the
decision tree simpler.
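
Reduced error pruning itself can be described briefly: part of the training data is held out as a pruning set, and, working from the bottom of the tree upwards, a subtree is replaced by a single leaf whenever the leaf makes no more errors on the pruning set than the subtree does. The sketch below illustrates that rule only. The tree layout (nested dicts with a stored majority class), the function names and the five pruning rows are all hypothetical, and this is not J48's implementation.

```python
def classify(tree, row):
    """Follow branches until a leaf (a plain class label) is reached."""
    while isinstance(tree, dict):
        # Attribute values without a branch fall back to the node's majority class.
        tree = tree["branches"].get(row[tree["attr"]], tree["majority"])
    return tree


def count_errors(tree, rows):
    return sum(classify(tree, row) != row["class"] for row in rows)


def reduced_error_prune(tree, pruning_rows):
    """Prune bottom-up; ties are resolved in favour of the simpler leaf."""
    if not isinstance(tree, dict):
        return tree
    attr = tree["attr"]
    pruned = {
        "attr": attr,
        "majority": tree["majority"],
        "branches": {
            # Each child is pruned using only the pruning rows that reach it.
            value: reduced_error_prune(
                subtree, [r for r in pruning_rows if r[attr] == value])
            for value, subtree in tree["branches"].items()
        },
    }
    leaf = tree["majority"]
    if count_errors(leaf, pruning_rows) <= count_errors(pruned, pruning_rows):
        return leaf
    return pruned


# Hypothetical tree; "majority" is the majority training class at each node.
tree = {
    "attr": "age", "majority": "yes",
    "branches": {
        "youth": {"attr": "student", "majority": "no",
                  "branches": {"yes": "yes", "no": "no"}},
        "middle": "yes",
        "senior": {"attr": "credit_rating", "majority": "yes",
                   "branches": {"fair": "yes", "excellent": "no"}},
    },
}

# Hypothetical held-out pruning set.
pruning_rows = [
    {"age": "youth", "student": "yes", "credit_rating": "fair", "class": "no"},
    {"age": "youth", "student": "no", "credit_rating": "fair", "class": "no"},
    {"age": "senior", "student": "no", "credit_rating": "fair", "class": "yes"},
    {"age": "senior", "student": "yes", "credit_rating": "excellent", "class": "no"},
    {"age": "middle", "student": "no", "credit_rating": "fair", "class": "yes"},
]

pruned_tree = reduced_error_prune(tree, pruning_rows)
print(pruned_tree)
```

In this made-up example the student test under age = youth is collapsed into a leaf, because the leaf makes fewer errors on the pruning rows, while the credit_rating test under age = senior is kept.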
12. How can you convert a Decision Tree into “if-then-else rules”? Make up your own small
Decision Tree consisting of 2-3 levels and convert it into a set of rules. There also exist different
classifiers that output the model in the form of rules. One such classifier in Weka is
rules.PART; train this model and report the set of rules obtained. Sometimes just one attribute
can be good enough in making the decision, yes, just one! Can you predict what attribute that
might be in this dataset? The OneR classifier uses a single
attribute to make decisions (it chooses the attribute based on minimum error). Report the rule
obtained by training a OneR classifier. Rank the performance of J48, PART and OneR.

Procedure:
• A sample Decision Tree with 2-3 levels, for the weather dataset, is shown in Figure-11.

Figure-11

• Converting the above Decision Tree into a set of rules gives the following:

Rule1: If age = youth AND student=yes THEN buys_computer=yes

Rule2: If age = youth AND student=no THEN buys_computer=no
Rule3: If age = middle_aged THEN buys_computer=yes
Rule4: If age = senior AND credit_rating=excellent THEN buys_computer=yes
Rule5: If age = senior AND credit_rating=fair THEN buys_computer=no
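
The conversion is mechanical: every path from the root to a leaf becomes one rule, with the tests along the path joined by AND. A minimal sketch follows, assuming the tree is written as nested dicts that mirror the five rules listed above; the representation and the name `tree_to_rules` are illustrative, not a Weka API.

```python
def tree_to_rules(tree, target, conditions=()):
    """Return one IF ... THEN rule per root-to-leaf path."""
    if not isinstance(tree, dict):
        # Leaf reached: the collected tests form the rule's condition.
        lhs = " AND ".join(f"{attr} = {value}" for attr, value in conditions) or "TRUE"
        return [f"IF {lhs} THEN {target} = {tree}"]
    rules = []
    for value, subtree in tree["branches"].items():
        rules += tree_to_rules(subtree, target, conditions + ((tree["attr"], value),))
    return rules


# Tree written to match the rules listed above.
tree = {
    "attr": "age",
    "branches": {
        "youth": {"attr": "student", "branches": {"yes": "yes", "no": "no"}},
        "middle_aged": "yes",
        "senior": {"attr": "credit_rating",
                   "branches": {"excellent": "yes", "fair": "no"}},
    },
}

rules = tree_to_rules(tree, "buys_computer")
for number, rule in enumerate(rules, start=1):
    print(f"Rule{number}: {rule}")
```

Because each leaf yields exactly one rule, the number of rules equals the number of leaves, which is five for this tree.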

• Now open the Weka GUI Explorer and select the Classify tab.

• Select the Use training set option. There also exist different classifiers that output the
model in the form of rules. Such classifiers in Weka are “PART” and “OneR”.
• Press Choose, select Rules, then select PART and press the Start button. The
result is shown in Figure-12.

Figure-12

• Press Choose, select Rules, then select OneR and press the Start button. The
result is shown in Figure-13.

Figure-13
• Press Choose, select Trees, then select J48 and press the Start button. The result
is shown in Figure-14.

Figure-14

Analysis:
From this observation, the classifiers rank by performance as follows:
1. PART
2. J48
3. OneR
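
The idea behind OneR can be sketched on the small Student data from the ARFF example earlier: for each attribute, predict the majority class for each of its values, count the training errors, and keep the attribute with the fewest errors. This is a simplified illustration rather than Weka's implementation: `one_r` is a hypothetical name, the numeric customerid column is left out, and ties are resolved by keeping the first attribute tried.

```python
from collections import Counter, defaultdict

COLUMNS = ["age", "income", "student", "credit_rating", "buy_computer"]
DATA = """\
youth,high,no,fair,no
youth,high,no,excellent,no
middle,high,no,fair,yes
senior,medium,no,fair,yes
senior,low,yes,fair,yes
senior,low,yes,excellent,no
middle,low,yes,excellent,yes
youth,medium,no,fair,no
youth,low,yes,fair,yes
senior,medium,yes,fair,yes
youth,medium,yes,excellent,yes
middle,medium,no,excellent,yes
middle,high,yes,fair,yes
senior,medium,no,excellent,no
"""
rows = [dict(zip(COLUMNS, line.split(","))) for line in DATA.splitlines()]


def one_r(rows, attributes, target):
    """Return (attribute, rule, errors) for the single best attribute."""
    best = None
    for attr in attributes:
        # Class counts for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each value predicts its majority class; the rest are errors.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - c.most_common(1)[0][1] for c in counts.values())
        if best is None or errors < best[2]:  # strict "<" keeps the first on ties
            best = (attr, rule, errors)
    return best


attribute, rule, errors = one_r(rows, COLUMNS[:-1], "buy_computer")
print(attribute, rule, f"{errors}/{len(rows)} training errors")
```

On this data two attributes tie for the fewest training errors, so the reported attribute depends on the tie-breaking order, which is an assumption of the sketch.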

No ratings yet
Zcorp Series Zscanner 800 User Guide Manual en
51 pages
TIB BC 6.2 Installation
No ratings yet
TIB BC 6.2 Installation
50 pages
RAIDXpert2 UserGuide Enu PDF
No ratings yet
RAIDXpert2 UserGuide Enu PDF
140 pages
Microsoft Exchange Load Generator 2013
No ratings yet
Microsoft Exchange Load Generator 2013
82 pages
Visual Workshop 3B User Guide
No ratings yet
Visual Workshop 3B User Guide
129 pages
Class XII Computer Science Project 1
No ratings yet
Class XII Computer Science Project 1
19 pages
UI/UX Presentation10
No ratings yet
UI/UX Presentation10
36 pages
ICT Elective D2
No ratings yet
ICT Elective D2
18 pages
CPE432 - Lecture Notes 5
No ratings yet
CPE432 - Lecture Notes 5
25 pages
MSC Simufac
No ratings yet
MSC Simufac
10 pages
Exercise 3
No ratings yet
Exercise 3
7 pages
Croquet - A Collaboration System Architecture: David A. Smith Alan Kay Andreas Raab David P. Reed
No ratings yet
Croquet - A Collaboration System Architecture: David A. Smith Alan Kay Andreas Raab David P. Reed
8 pages
Pharmacy Management System Complete Repo
No ratings yet
Pharmacy Management System Complete Repo
5 pages
VHDL Configurations Tutorial
No ratings yet
VHDL Configurations Tutorial
23 pages
Statistical Software - Overview
No ratings yet
Statistical Software - Overview
8 pages
Khazama AVR Programmer: Avrdude
No ratings yet
Khazama AVR Programmer: Avrdude
3 pages
Data Mining Models: Techniques and Applications
From Everand
Data Mining Models: Techniques and Applications
Ravi Deshpande
No ratings yet
Knight's Microsoft Business Intelligence 24-Hour Trainer: Leveraging Microsoft SQL Server Integration, Analysis, and Reporting Services with Excel and SharePoint
From Everand
Knight's Microsoft Business Intelligence 24-Hour Trainer: Leveraging Microsoft SQL Server Integration, Analysis, and Reporting Services with Excel and SharePoint
Brian Knight
3/5 (1)


The German credit data:

I. Downloading and/or installation of WEKA data mining toolkit,

i. List the attribute names and their types

Now click Visualize All.

Now save the file with the .arff extension.

The German Credit Data

Ans) The following are the Categorical (or Nominal) attributes:

The following are the Numerical attributes:

ii) Fold Size-5

ii) If Personal_status is removed

1. Remove the 2nd Attribute:

Remove the 21st attribute (Class):

Converting the above decision tree into a set of rules gives the following:

Rule1: If age = youth AND student=yes THEN buys_computer=yes
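As a rough illustration, Rule1 can be written as a small function. This is only a sketch: the function name `rule1_buys_computer` is hypothetical, the attribute values are taken from the rule text as plain strings, and returning `None` when the rule does not fire is an assumption, since the remaining rules of the tree are not listed here.

```python
from typing import Optional


def rule1_buys_computer(age: str, student: str) -> Optional[str]:
    """Apply Rule1: IF age = youth AND student = yes THEN buys_computer = yes.

    Returns "yes" when the rule fires. Returns None when it does not,
    meaning some other rule of the tree would have to decide
    (an assumption made for this sketch).
    """
    if age == "youth" and student == "yes":
        return "yes"
    return None


if __name__ == "__main__":
    # A young student matches the rule; a young non-student does not.
    print(rule1_buys_computer("youth", "yes"))
    print(rule1_buys_computer("youth", "no"))
```

Each root-to-leaf path of a decision tree becomes one such IF-THEN rule, with the tests along the path joined by AND.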

Now open the Weka GUI Explorer and select the Classify tab.