Ex 10 - Decision Tree With Rpart and Fancy Plot and Cardio Data

This document walks through building and visualizing decision trees in R. It loads packages for data handling, tree fitting, and plotting, then reads and inspects a cardiovascular data set. A full decision tree is grown on the data using all predictors (cp = 0) and drawn with fancyRpartPlot. A second, smaller tree is built with the complexity parameter cp set to 0.05 and is also drawn.

Uploaded by

Nope

rpart-decission-tree.R
Admin

2019-12-12
library(dplyr)

##
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
##
##     filter, lag

## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union

library(ggplot2)
library(rpart)
library(rattle)

## Rattle: A free graphical interface for data science with R.
## Version 5.2.0 Copyright (c) 2006-2018 Togaware Pty Ltd.
## Type 'rattle()' to shake, rattle, and roll your data.

library(RColorBrewer)
library(readr)
mydata <- read_csv("D:/Academics/AY 2019-20/ODD SEM 2019/Predictive Analytics/Datasets/Exercises/Ex 7 - Decision Tree/cardio data.csv")

## Warning: Missing column names filled in: 'X1' [1]

## Parsed with column specification:
## cols(
##   X1 = col_double(),
##   age = col_double(),
##   sex = col_character(),
##   cp = col_double(),
##   trestbps = col_double(),
##   chol = col_double(),
##   fbs = col_double(),
##   restecg = col_double(),
##   thalach = col_double(),
##   exang = col_double(),
##   oldpeak = col_double(),
##   slope = col_double(),
##   ca = col_character(),
##   thal = col_character(),
##   status = col_character()
## )

View(mydata)

# Counting the missing values in the data frame
sum(is.na(mydata))

## [1] 0
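A single total says nothing about where gaps sit, and is.na() only counts true NA cells. The sketch below uses a hypothetical toy data frame (not the cardio file) to show per-column counts; the "?" placeholder is an assumption for illustration only, not a claim about what the cardio data contains.

```r
# Hypothetical toy data frame standing in for mydata; values are illustrative only.
toy <- data.frame(
  age = c(63, 67, NA, 37),
  ca  = c("0.0", "3.0", "?", "0.0"),
  stringsAsFactors = FALSE
)

# Total count of NA cells, as in the script above.
total_na <- sum(is.na(toy))

# Per-column NA counts show which columns have gaps.
na_by_column <- colSums(is.na(toy))

# A placeholder string such as "?" is not NA, so is.na() does not count it.
# The "?" marker is an assumption made for this example.
placeholder_count <- sum(toy$ca == "?")

print(total_na)
print(na_by_column)
print(placeholder_count)
```

Character columns that look numeric (such as ca and thal above) are worth a quick look with table() or unique() for this reason.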

str(mydata)

## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 303 obs. of 15 variables:
## $ X1 : num 1 2 3 4 5 6 7 8 9 10 ...
## $ age : num 63 67 67 37 41 56 62 57 63 53 ...
## $ sex : chr "male" "male" "male" "male" ...
## $ cp : num 1 4 4 3 2 2 4 4 4 4 ...
## $ trestbps: num 145 160 120 130 130 120 140 120 130 140 ...
## $ chol : num 233 286 229 250 204 236 268 354 254 203 ...
## $ fbs : num 1 0 0 0 0 0 0 0 0 1 ...
## $ restecg : num 2 2 2 0 2 0 2 0 2 2 ...
## $ thalach : num 150 108 129 187 172 178 160 163 147 155 ...
## $ exang : num 0 1 1 0 0 0 0 1 0 1 ...
## $ oldpeak : num 2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
## $ slope : num 3 2 2 3 1 1 3 1 2 3 ...
## $ ca : chr "0.0" "3.0" "2.0" "0.0" ...
## $ thal : chr "6.0" "3.0" "7.0" "3.0" ...
## $ status : chr "normal" "abnormal" "abnormal" "normal" ...
## - attr(*, "spec")=
## .. cols(
## .. X1 = col_double(),
## .. age = col_double(),
## .. sex = col_character(),
## .. cp = col_double(),
## .. trestbps = col_double(),
## .. chol = col_double(),
## .. fbs = col_double(),
## .. restecg = col_double(),
## .. thalach = col_double(),
## .. exang = col_double(),
## .. oldpeak = col_double(),
## .. slope = col_double(),
## .. ca = col_character(),
## .. thal = col_character(),
## .. status = col_character()
## .. )
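The str() output shows an auto-named column X1 that appears to be a row index, plus several character columns. A minimal sketch of one possible clean-up before modelling, using only the first four rows of a few columns as printed above (the name sample_rows is made up for this example):

```r
# First four rows of a few columns, copied from the str() output above.
sample_rows <- data.frame(
  X1     = c(1, 2, 3, 4),
  age    = c(63, 67, 67, 37),
  sex    = c("male", "male", "male", "male"),
  ca     = c("0.0", "3.0", "2.0", "0.0"),
  status = c("normal", "abnormal", "abnormal", "normal"),
  stringsAsFactors = FALSE
)

# Drop the index column so a formula such as `status ~ .` cannot use it.
sample_rows$X1 <- NULL

# Convert every character column to a factor.
is_chr <- vapply(sample_rows, is.character, logical(1))
sample_rows[is_chr] <- lapply(sample_rows[is_chr], factor)

str(sample_rows)
```

Whether to drop X1 is a modelling choice; the script in this document keeps it.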

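## Removing rows with missing values (no rows are dropped here, since the NA count above is 0)
mydata <- na.omit(mydata)

## Using all other columns as predictors (this includes the index column X1);
## cp = 0 grows the full tree, all other arguments are left at their defaults
fullmodel <- rpart(status ~ .,
                   data = mydata,
                   method = "class",
                   cp = 0)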
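Setting cp = 0 overrides rpart's default of cp = 0.01, so no split is rejected for improving the fit too little. The sketch below illustrates the effect on tree size. It uses the kyphosis data that ships with rpart as a stand-in for the cardio file, and count_leaves is a helper made up for this example.

```r
library(rpart)

# kyphosis ships with rpart; it stands in for the cardio data here.
full_tree  <- rpart(Kyphosis ~ ., data = kyphosis, method = "class", cp = 0)
small_tree <- rpart(Kyphosis ~ ., data = kyphosis, method = "class", cp = 0.05)

# Leaves are the rows of $frame whose `var` entry is "<leaf>".
count_leaves <- function(tree) sum(tree$frame$var == "<leaf>")

print(count_leaves(full_tree))
print(count_leaves(small_tree))
```

A larger cp can only remove splits, so the second count is never larger than the first.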

## Using fancyRpartPlot() from "rattle" package
fancyRpartPlot(fullmodel,
               palettes = c("Blues", "Reds"))

# refer to ?fancyRpartPlot to check available colours
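If rattle is not available, rpart's own plot() and text() methods draw a plainer version of the same tree. This is a different technique from fancyRpartPlot(), shown here as a fallback sketch on the kyphosis data that ships with rpart; the output file is a temporary PDF chosen for the example.

```r
library(rpart)

# kyphosis ships with rpart; it stands in for the cardio data here.
fit <- rpart(Kyphosis ~ ., data = kyphosis, method = "class")

# Write the plot to a temporary PDF so the example also runs without a display.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 7, height = 5)
plot(fit, margin = 0.1)   # draw the branches, leaving room for labels
text(fit, use.n = TRUE)   # label the splits and show class counts at the leaves
dev.off()

print(out_file)
```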


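# Model with the complexity parameter cp = 0.05, which keeps only splits
# that improve the relative fit by at least 0.05 and so gives a smaller tree
cpmodel <- rpart(status ~ .,
                 data = mydata,
                 method = "class",
                 cp = 0.05)
fancyRpartPlot(cpmodel,
               palettes = c("Purples", "Reds"),
               sub = "")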
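The value cp = 0.05 above is set by hand. One common alternative, sketched here as an assumption rather than as the author's method, is to grow the full tree and read a cp value off rpart's cross-validation table. The example uses the kyphosis data that ships with rpart, and the seed is arbitrary.

```r
library(rpart)

set.seed(42)  # cross-validation folds are random; the seed value is arbitrary

# Grow the full tree on the stand-in kyphosis data.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis,
             method = "class",
             cp = 0)

# cptable holds one row per candidate tree size, with columns
# CP, nsplit, rel error, xerror and xstd.
cp_table <- fit$cptable
print(cp_table)

# Pick the cp with the lowest cross-validated error and prune back to it.
best_cp <- cp_table[which.min(cp_table[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)

print(best_cp)
```

printcp(fit) and plotcp(fit) show the same table in printed and graphical form.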
