Conditional Inference Trees in R Programming
Last Updated: 10 Jul, 2020
Conditional inference trees are a non-parametric class of decision trees, also known as unbiased recursive partitioning. They apply recursive partitioning within a conditional inference framework and can handle continuous, categorical, and multivariate response variables. In R, they are fitted with the ctree() function from the partykit package. This article covers what conditional inference trees are, the syntax of ctree(), and its use through examples.
Conditional Inference Trees
A conditional inference tree is a decision tree that recursively partitions the data according to the strength of the statistical association between each input variable (covariate) and the response. Unlike many other tree algorithms for classification and regression, it avoids a selection bias toward input variables that offer many possible split points, which also makes it less prone to overfitting. To choose the covariate to split on, the algorithm uses a significance test, specifically a permutation test, and computes a p-value for each covariate. The test is repeated at every node, that is, at each step of the recursion. Missing values need care: Example 1 below drops the rows with a missing response before fitting.
Algorithm:
- Test the global null hypothesis of independence between the input variables and the response variable. If it cannot be rejected, stop. Otherwise, select the input variable with the strongest association with the response, that is, the one with the lowest p-value.
- Perform a binary split on the selected input variable.
- Recursively repeat steps 1 and 2.
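The variable-selection step can be sketched in base R. This is a simplified illustration, not what ctree() does internally: it assumes all inputs are numeric, uses the absolute Pearson correlation as the test statistic, and drops rows with a missing Solar.R for simplicity. The helper perm_test(), the 999 permutations, and the 0.05 significance level are choices made for this sketch.

```r
# Sketch only: a simplified stand-in for the variable-selection step.
set.seed(42)  # make the permutations reproducible

air <- subset(airquality, !is.na(Ozone) & !is.na(Solar.R))
inputs <- c("Solar.R", "Wind", "Temp", "Month", "Day")

# Hypothetical helper: permutation test for the association
# between one numeric input x and the response y.
perm_test <- function(x, y, n_perm = 999) {
  observed <- abs(cor(x, y))
  # Shuffling y breaks any real link between x and y
  permuted <- replicate(n_perm, abs(cor(x, sample(y))))
  c(statistic = observed,
    p_value = (sum(permuted >= observed) + 1) / (n_perm + 1))
}

results <- t(sapply(air[inputs], perm_test, y = air$Ozone))

# Bonferroni adjustment, since several inputs are tested at once
results <- cbind(results,
                 p_adjusted = p.adjust(results[, "p_value"],
                                       method = "bonferroni"))
print(round(results, 3))

# Step 1: pick the input with the smallest adjusted p-value
# (ties broken by the larger test statistic), if it is significant
alpha <- 0.05
best <- order(results[, "p_adjusted"], -results[, "statistic"])[1]
if (results[best, "p_adjusted"] < alpha) {
  split_var <- rownames(results)[best]
  cat("Selected split variable:", split_var, "\n")
} else {
  split_var <- NA_character_
  cat("No significant association; stop splitting\n")
}
```

With a limited number of permutations, several strongly associated inputs can share the smallest possible p-value, which is why the sketch falls back on the test statistic to break ties.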
How Do Conditional Inference Trees Differ from Decision Trees?
Conditional inference trees are a tree-based algorithm for both classification and regression. Like traditional decision trees, ctree() partitions the data recursively. The main difference lies in how the split variable is chosen: conditional inference trees use a significance test to select the input variable, whereas traditional decision trees select the variable that maximizes an information measure. For example, traditional decision trees often use the Gini index (Gini impurity) and pick the split that reduces it the most.
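To make the contrast concrete, here is a minimal base R sketch of impurity-based split selection on the iris dataset. It is not the code of any particular tree package; the helper names gini_impurity(), split_impurity(), and best_split() are made up for this illustration, and ties between inputs are broken by column order.

```r
# Gini impurity of a set of class labels: 1 - sum(p_k^2)
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Weighted impurity of the two groups produced by x <= cutpoint
split_impurity <- function(x, labels, cutpoint) {
  left <- labels[x <= cutpoint]
  right <- labels[x > cutpoint]
  (length(left) * gini_impurity(left) +
     length(right) * gini_impurity(right)) / length(labels)
}

# Hypothetical helper: best cutpoint for one numeric input
best_split <- function(x, labels) {
  cuts <- head(sort(unique(x)), -1)  # every value except the largest
  impurities <- sapply(cuts, function(cp) split_impurity(x, labels, cp))
  c(cutpoint = cuts[which.min(impurities)], impurity = min(impurities))
}

inputs <- names(iris)[1:4]
result <- t(sapply(iris[inputs], best_split, labels = iris$Species))
print(round(result, 3))

# An impurity-based tree splits on the input with the lowest impurity
chosen <- rownames(result)[which.min(result[, "impurity"])]
cat("Input chosen by Gini impurity:", chosen, "\n")
```

Because this search tries every cutpoint of every input, inputs with many distinct values get more chances to look good. A conditional inference tree instead picks the variable with a significance test first and only then searches for the cutpoint.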
Implementation in R
Syntax:
ctree(formula, data)
Parameters:
formula: the model formula, with the response variable on the left of ~ and the input variables on the right (a . stands for all other columns in data)
data: the data frame containing the variables in the model
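As a brief illustration of the two parameters, the sketch below fits a tree with an explicit formula instead of the . shorthand. It also passes the optional control argument with partykit's ctree_control(), which the syntax above does not cover; the values alpha = 0.01 and maxdepth = 2 are arbitrary choices for illustration, not recommendations.

```r
library(partykit)

# Keep only the rows where the response was recorded
air <- subset(airquality, !is.na(Ozone))

# Restrict the formula to two input variables instead of using `.`
small_tree <- ctree(Ozone ~ Wind + Temp, data = air)

# Optional tuning: a stricter significance level and a depth limit
# (illustrative values only)
strict_tree <- ctree(Ozone ~ Wind + Temp, data = air,
                     control = ctree_control(alpha = 0.01, maxdepth = 2))

print(small_tree)
print(strict_tree)
```

A smaller alpha makes the significance test harder to pass, so the tree tends to stop splitting earlier.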
Example 1:
In this example, let's use the regression approach of conditional inference trees on the airquality dataset, which ships with R in the datasets package. The fitted tree estimates the ozone level under different environmental conditions, which shows how ozone behaves as those conditions change.
Step 1: Installing the required packages.
install.packages("partykit")
Step 2: Loading the required package.
library(partykit)
Step 3: Creating the regression model of the conditional inference tree.
# Keep only the rows where Ozone was recorded
air <- subset(airquality, !is.na(Ozone))
# Model Ozone as a function of all other columns
airConInfTree <- ctree(Ozone ~ ., data = air)
Step 4: Printing the regression model.
print(airConInfTree)
Output:
Model formula:
Ozone ~ Solar.R + Wind + Temp + Month + Day
Fitted party:
[1] root
|   [2] Temp <= 82
|   |   [3] Wind <= 6.9: ...
|   |   [4] Wind > 6.9
|   |   |   [5] Temp <= 77: ...
|   |   |   [6] Temp > 77: 31.143 (n = 21, err = 4620.6)
|   [7] Temp > 82
|   |   [8] Wind <= 10.3: ...
|   |   [9] Wind > 10.3: 48.714 (n = 7, err = 1183.4)
Number of inner nodes: 4
Number of terminal nodes: 5
Step 5: Plotting the graph.
# Write the plot to a PNG file in the working directory
png(file = "conditionalRegression.png")
plot(airConInfTree)
dev.off()
Output:

Explanation:
Running the code above produces a plot of the conditional inference tree, with a box plot of the ozone values in each terminal node. In the output image, Node 5 has the lowest ozone values: ozone is lowest when Temp <= 77 and Wind > 6.9.
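Once fitted, the tree can also be used for prediction. The sketch below refits the model from Example 1 and calls predict() on a small data frame of new observations; the data frame new_days and its values are invented purely for illustration.

```r
library(partykit)

air <- subset(airquality, !is.na(Ozone))
airConInfTree <- ctree(Ozone ~ ., data = air)

# Hypothetical new observations; the values are made up
new_days <- data.frame(Solar.R = c(190, 250),
                       Wind    = c(12.0, 5.5),
                       Temp    = c(70, 88),
                       Month   = c(5L, 8L),
                       Day     = c(10L, 20L))

# Predicted ozone: the mean of the terminal node each row falls into
predicted_ozone <- predict(airConInfTree, newdata = new_days)
print(predicted_ozone)

# type = "node" returns the id of that terminal node instead
predicted_node <- predict(airConInfTree, newdata = new_days, type = "node")
print(predicted_node)
```

The node ids can be matched against the printed tree to see which sequence of splits led to each prediction.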
Example 2:
In this example, let's use the classification approach of conditional inference trees on the iris dataset, which ships with R in the datasets package. After executing the code, the species of iris plants are determined mainly on the basis of petal length and petal width.
Step 1: Installing the required packages.
install.packages("partykit")
Step 2: Loading the required package.
library(partykit)
Step 3: Creating the classification model of the conditional inference tree.
# Model Species as a function of all four measurements
irisConInfTree <- ctree(Species ~ ., data = iris)
Step 4: Printing the classification model.
print(irisConInfTree)
Output:
Model formula:
Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
Fitted party:
[1] root
|   [2] Petal.Length <= 1.9: ...
|   [3] Petal.Length > 1.9
|   |   [4] Petal.Width <= 1.7
|   |   |   [5] Petal.Length <= 4.8: ...
|   |   |   [6] Petal.Length > 4.8: versicolor (n = 8, err = 50.0%)
|   |   [7] Petal.Width > 1.7: virginica (n = 46, err = 2.2%)
Number of inner nodes: 3
Number of terminal nodes: 4
Step 5: Plotting the graph.
# Use a wide canvas so the bar plots in the terminal nodes stay readable
png(file = "conditionalClassification.png",
    width = 1200, height = 400)
plot(irisConInfTree)
dev.off()
Output:

Explanation:
After executing the above code, the iris species are classified based on petal length and petal width. As the graph above shows, the setosa species has Petal.Length <= 1.9.
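One way to check how well the tree separates the species is a confusion matrix. The sketch below refits the model from Example 2 and compares its predictions with the true labels on the same data; the variable names predicted, confusion, and accuracy are choices made for this illustration.

```r
library(partykit)

irisConInfTree <- ctree(Species ~ ., data = iris)

# Predicted class for every row of the training data
predicted <- predict(irisConInfTree)

# Cross-tabulate the predictions against the true species
confusion <- table(Predicted = predicted, Actual = iris$Species)
print(confusion)

# Share of rows classified correctly. This is training accuracy,
# which is optimistic compared with accuracy on unseen data.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

For an honest estimate, the same comparison would be made on rows held out from fitting.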