Programming Assign Unit 5

The document describes using a decision tree to classify data from the Ionosphere dataset. It: 1) Loads required packages, sets the working directory, and loads the Ionosphere data. 2) Creates a decision tree model using the rpart() function and plots the tree. 3) Splits the data into train and test sets, trains a decision tree on the train set, and uses the model to make predictions on the test set. 4) Computes the accuracy of the predictions on the test set by comparing them to the true labels, finding an accuracy of 87.4%.

Uploaded by

Mahmoud Heretani

Part 1: Print decision tree

a. We begin by setting the working directory, loading the required packages (rpart and mlbench), and then loading the Ionosphere dataset.

#set working directory if needed (modify path as needed)

setwd("working directory")

#load required libraries – rpart for classification and regression trees

library(rpart)

#mlbench for Ionosphere dataset

library(mlbench)

#load Ionosphere

data(Ionosphere)

> setwd('C:\\Users\\Admin\\Downloads')

> library(rpart)

> library(mlbench)

> data(Ionosphere)

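b. Use the rpart() function to create a classification tree for the data. Because Class is a factor (bad/good), rpart fits a classification tree rather than a regression tree.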

rpart(Class~.,Ionosphere)

> rpart.ionosphere=rpart(Class~.,Ionosphere)

> rpart.ionosphere

n= 351

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 351 126 good (0.35897436 0.64102564)
   2) V5< 0.23154 77 4 bad (0.94805195 0.05194805) *
   3) V5>=0.23154 274 53 good (0.19343066 0.80656934)
     6) V27>=0.999945 52 13 bad (0.75000000 0.25000000)
      12) V1=0 19 0 bad (1.00000000 0.00000000) *
      13) V1=1 33 13 bad (0.60606061 0.39393939)
        26) V3< 0.73004 8 0 bad (1.00000000 0.00000000) *
        27) V3>=0.73004 25 12 good (0.48000000 0.52000000)
          54) V22>=0.47714 9 1 bad (0.88888889 0.11111111) *
          55) V22< 0.47714 16 4 good (0.25000000 0.75000000) *
     7) V27< 0.999945 222 14 good (0.06306306 0.93693694) *
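
Each line of the printout gives the node number, the split, the number of cases n, the loss (cases outside the predicted class), the predicted class yval, and the class proportions. As a quick sanity check, the root node's figures can be reproduced by hand. The variable names below are illustrative and do not appear in the assignment.

```r
# Values copied from the root node of the printed tree.
n_root    <- 351   # cases reaching the node
loss_root <- 126   # cases not in the predicted class ("good")

# The first proportion is the share of "bad" cases, the second of "good".
p_bad  <- loss_root / n_root
p_good <- (n_root - loss_root) / n_root
round(c(bad = p_bad, good = p_good), 8)

# Child nodes partition their parent: nodes 2 and 3 split the root.
stopifnot(77 + 274 == n_root)
```

The two rounded proportions should agree with the (yprob) pair printed for node 1.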

c. Use the plot() and text() methods to plot the decision tree.

> plot(rpart.ionosphere)

> text(rpart.ionosphere,pretty=0)
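
With default settings the labels can overlap or be clipped at the plot edges. A minimal sketch of one way to make the plot easier to read, assuming rpart and mlbench are installed; the object name fit and the particular values for margin and cex are illustrative choices, not part of the assignment.

```r
library(rpart)
library(mlbench)
data(Ionosphere)

fit <- rpart(Class ~ ., data = Ionosphere)

# uniform = TRUE spaces the tree levels evenly;
# margin leaves white space around the tree so labels are not clipped.
plot(fit, uniform = TRUE, margin = 0.1)

# use.n = TRUE adds the per-class case counts under each node label;
# cex shrinks the text slightly.
text(fit, pretty = 0, use.n = TRUE, cex = 0.8)
```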

Part 2: Estimate accuracy

a. Split the data into test and train subsets using the sample() method. Because the split is random, the tree and counts shown below depend on the sample drawn.

> set.seed(42)
> train=sample(1:nrow(Ionosphere),200)
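
The split can be checked without loading any data, since sample() only draws row indices. The sketch below assumes the 351 rows reported for the full dataset; the names n and test are illustrative.

```r
set.seed(42)                    # fix the random number stream
n <- 351                        # number of rows in Ionosphere
train <- sample(1:n, 200)       # 200 row indices, drawn without replacement
test  <- setdiff(1:n, train)    # the remaining indices form the test set

length(test)                    # size of the test set
anyDuplicated(train)            # 0 means no index was drawn twice
length(intersect(train, test))  # 0 means the two sets do not overlap
```

Indexing with -train, as in Ionosphere[-train, ], selects the same rows as test.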

b. Use the rpart method to create a decision tree using the training data.

rpart(Class~.,Ionosphere,subset=train)
> rpart.ionosphere=rpart(Class~.,Ionosphere,subset=train)
> rpart.ionosphere
n= 200

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 200 73 good (0.36500000 0.63500000)
   2) V5< 0.02313 40 0 bad (1.00000000 0.00000000) *
   3) V5>=0.02313 160 33 good (0.20625000 0.79375000)
     6) V27>=0.99921 31 9 bad (0.70967742 0.29032258)
      12) V22>=-0.009455 20 2 bad (0.90000000 0.10000000) *
      13) V22< -0.009455 11 4 good (0.36363636 0.63636364) *
     7) V27< 0.99921 129 11 good (0.08527132 0.91472868) *

c. Use the predict method to find the predicted class labels for the testing data.

> Ionosphere.test=Ionosphere[-train,]
> rpart.pred=predict(rpart.ionosphere,Ionosphere.test,type="class")
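
The type argument controls what predict() returns for a classification tree. A minimal sketch, assuming rpart and mlbench are installed, that contrasts class labels with class probabilities; the names fit, pred_class and pred_prob are illustrative, and the split drawn here need not match the one behind the output in this document.

```r
library(rpart)
library(mlbench)
data(Ionosphere)

set.seed(42)
train <- sample(1:nrow(Ionosphere), 200)
Ionosphere.test <- Ionosphere[-train, ]   # held-out rows

fit <- rpart(Class ~ ., data = Ionosphere, subset = train)

# type = "class" returns a factor with one predicted label per test row.
pred_class <- predict(fit, Ionosphere.test, type = "class")

# type = "prob" returns a matrix with one column per class level.
pred_prob <- predict(fit, Ionosphere.test, type = "prob")

head(pred_class)
head(pred_prob)
```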

d. Use the table method to create a table of the predictions versus true labels and then
compute the accuracy. The accuracy is the number of correctly assigned good cases
(true positives) plus the number of correctly assigned bad cases (true negatives) divided
by the total number of testing cases.

> table(rpart.pred,Ionosphere$Class[-train])

rpart.pred bad good
      bad   37    3
      good  16   95

> (37+95)/(37+3+16+95)
[1] 0.8741722
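
The same figure can be computed from the table itself instead of typing the counts by hand. The sketch below rebuilds the confusion matrix from the counts reported above; the names conf and accuracy are illustrative.

```r
# Confusion matrix copied from the table above:
# rows are predictions, columns are true labels.
conf <- matrix(c(37, 16, 3, 95), nrow = 2,
               dimnames = list(predicted = c("bad", "good"),
                               actual    = c("bad", "good")))

# Correct predictions sit on the diagonal.
accuracy <- sum(diag(conf)) / sum(conf)
accuracy

# Share of each true class that was labelled correctly.
diag(conf) / colSums(conf)
```

With the objects from the steps above, mean(rpart.pred == Ionosphere$Class[-train]) gives the same accuracy in one line.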
