
CHAID

Technique to construct a Decision Tree


Decision Tree
• A decision tree is a supervised machine learning algorithm with a
flowchart-like structure. It is used for both classification and regression
tasks. The tree starts with a root node, branches out into decision nodes
(internal nodes), and ends in leaf nodes (terminal nodes). Each internal
node represents a test on an attribute, and each branch represents an
outcome of the test. Leaf nodes represent the final decision or prediction.
How Decision Trees Work
• Data Partitioning: The algorithm starts by selecting the best attribute
to split the data. This is often determined using metrics like
information gain, Gini impurity, or chi-square.
• Node Creation: The data is split into subsets based on the attribute's
values, and new nodes are created for each subset.
• Recursion: This process is repeated recursively for each new node
until a stopping criterion is met (e.g., maximum depth, minimum
number of samples, or purity of the node); a minimal sketch of this loop follows the list.
• Pruning: To prevent overfitting, the tree might be pruned by removing
unnecessary branches.
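
As a concrete illustration of these four steps, here is a minimal sketch in Python. The names (score_split, build_tree) and the toy purity score are assumptions made for this example; a real implementation would plug in information gain, Gini impurity, or chi-square as the split criterion and add proper pruning.

from collections import Counter

def score_split(rows, attribute, target):
    # Toy split score: weighted share of majority-class rows in each branch.
    # Stand-in for information gain, Gini impurity, or chi-square.
    score = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        counts = Counter(row[target] for row in subset)
        score += counts.most_common(1)[0][1] / len(rows)
    return score

def build_tree(rows, attributes, target, max_depth=5, min_samples=2, depth=0):
    # Recursively build a decision tree as nested dicts (minimal sketch).
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]

    # Stopping criteria: pure node, no attributes left, or depth/size limits reached.
    if len(set(labels)) == 1 or not attributes or depth >= max_depth or len(rows) < min_samples:
        return majority                      # leaf node: predict the majority class

    # Data partitioning: choose the attribute with the best split score.
    best = max(attributes, key=lambda a: score_split(rows, a, target))

    # Node creation + recursion: one child node per value of the chosen attribute.
    node = {"attribute": best, "children": {}}
    remaining = [a for a in attributes if a != best]
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        node["children"][value] = build_tree(subset, remaining, target,
                                             max_depth, min_samples, depth + 1)
    return node

# Tiny illustrative dataset (hypothetical rows, not the full example table below).
data = [{"Outlook": "Sunny",    "Wind": "Weak",   "Play": "No"},
        {"Outlook": "Overcast", "Wind": "Weak",   "Play": "Yes"},
        {"Outlook": "Rain",     "Wind": "Strong", "Play": "No"},
        {"Outlook": "Rain",     "Wind": "Weak",   "Play": "Yes"}]
print(build_tree(data, ["Outlook", "Wind"], target="Play"))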
Importance of Decision Trees
• Interpretability: Decision trees are easily understandable by humans,
making them transparent and explainable.
• Versatility: They can handle both categorical and numerical data.
• Non-parametric: They don't make assumptions about the underlying
data distribution.
• Fast to build and use: Decision trees are computationally efficient.
• Handles missing values: Most decision tree algorithms can handle
missing data.
Limitations of Decision Trees
• Overfitting: Decision trees can be prone to overfitting, especially with
noisy data.
• Instability: Small changes in the data can lead to significantly different
trees.
• Biased towards features with many levels: Algorithms like ID3 tend
to favor attributes with more values.
• Suboptimal for some problems: For some datasets, other algorithms
might perform better.
Techniques
• Decision trees are constructed using algorithms that iteratively split
the data based on specific criteria. Here are some of the most
common techniques:
• ID3 (Iterative Dichotomiser 3)
• CART (Classification and Regression Trees)
• CHAID (Chi-square Automatic Interaction Detection)
• C4.5
CHAID
• CHAID is one of the oldest decision tree algorithms. It was introduced in 1980 by
Gordon V. Kass; CART followed in 1984, ID3 was proposed in 1986,
and C4.5 was announced in 1993. The name is an acronym for Chi-square
Automatic Interaction Detection. Here, chi-square is the metric used to measure the
significance of a feature: the higher the value, the higher the statistical significance.
Like the other algorithms, CHAID builds decision trees for classification problems,
which means it expects a data set with a categorical target variable.
• CHAID uses chi-square tests to find the most dominant feature, whereas ID3 uses
information gain, C4.5 uses gain ratio, and CART uses the Gini index. Chi-square
testing was introduced by Karl Pearson. In CHAID, the chi-square value of each cell in a feature's pivot table is
√((y – y’)² / y’)
where y is the actual (observed) count and y’ is the expected count
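
As a quick check of this formula (the helper name chi_value is my own, not part of CHAID itself), the sunny-outlook cell worked out on a later slide has 2 observed "yes" decisions against an expected 2.5:

import math

def chi_value(actual, expected):
    # Per-cell CHAID contribution: sqrt((y - y')^2 / y')
    return math.sqrt((actual - expected) ** 2 / expected)

print(round(chi_value(2, 2.5), 3))   # 0.316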
Example:

Day  Outlook   Temp.  Humidity  Wind    Decision
 1   Sunny     Hot    High      Weak    No
 2   Sunny     Hot    High      Strong  No
 3   Overcast  Hot    High      Weak    Yes
 4   Rain      Mild   High      Weak    Yes
 5   Rain      Cool   Normal    Weak    Yes
 6   Rain      Cool   Normal    Strong  No
 7   Overcast  Cool   Normal    Strong  Yes
 8   Sunny     Mild   High      Weak    No
 9   Sunny     Cool   Normal    Weak    Yes
10   Rain      Mild   Normal    Weak    Yes
11   Sunny     Mild   Normal    Strong  Yes
12   Overcast  Mild   High      Strong  Yes
13   Overcast  Hot    Normal    Weak    Yes
14   Rain      Mild   High      Strong  No

Outlook feature
• The outlook feature has 3 classes: sunny, overcast and rain. There are 2 decisions: yes and no. We first
find the number of yes and no decisions for each class:

Outlook   Yes  No  Total  Expected  Chi-sq Yes  Chi-sq No
Sunny      2    3    5      2.5       0.316       0.316
Overcast   4    0    4      2         1.414       1.414
Rain       3    2    5      2.5       0.316       0.316

• The Total column is the sum of yes and no decisions for each row. The expected value is half of the total,
because the decision has 2 classes. It is then easy to calculate the chi-square values from this table.
• For example, the chi-square yes value for sunny outlook is √((2 – 2.5)² / 2.5) = 0.316, where 2 is the actual
count and 2.5 is the expected count.
• The chi-square value of the outlook feature is the sum of the chi-square yes and no columns:
0.316 + 0.316 + 1.414 + 1.414 + 0.316 + 0.316 = 4.092
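
The pivot table above can be rebuilt directly from the example data. The short Python sketch below (the outlook and decision lists are simply the two columns copied from the 14-day table; the variable names are my own) counts yes/no decisions per outlook class and derives the total and expected columns:

from collections import Counter

# Outlook and Decision columns copied from the 14-day example table.
outlook  = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
            "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]
decision = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
            "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

for value in ("Sunny", "Overcast", "Rain"):
    counts = Counter(d for o, d in zip(outlook, decision) if o == value)
    total = counts["Yes"] + counts["No"]
    expected = total / 2                  # half of the total, as explained above
    print(value, counts["Yes"], counts["No"], total, expected)
# Sunny 2 3 5 2.5 | Overcast 4 0 4 2.0 | Rain 3 2 5 2.5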
Temperature feature
• This feature has 3 classes: hot, mild and cool. The following table summarizes the chi-
square values for these classes:

Temp.  Yes  No  Total  Expected  Chi-sq Yes  Chi-sq No
Hot     2    2    4      2         0           0
Mild    4    2    6      3         0.577       0.577
Cool    3    1    4      2         0.707       0.707

• The chi-square value of the temperature feature is
0 + 0 + 0.577 + 0.577 + 0.707 + 0.707 = 2.569
• This is less than the chi-square value of outlook, which means that the outlook feature
is more important than the temperature feature according to the chi-square test.
Humidity feature
• Humidity has 2 classes: high and normal. The chi-square values are summarized below:

Humidity  Yes  No  Total  Expected  Chi-sq Yes  Chi-sq No
High       3    4    7      3.5       0.267       0.267
Normal     6    1    7      3.5       1.336       1.336

• So, the chi-square value of the humidity feature is
0.267 + 0.267 + 1.336 + 1.336 = 3.207
Wind feature
• The wind feature has 2 classes: weak and strong. Its pivot table is shown below:

Wind    Yes  No  Total  Expected  Chi-sq Yes  Chi-sq No
Weak     6    2    8      4         1           1
Strong   3    3    6      3         0           0

• Based on this table, the chi-square value of the wind feature is
1 + 1 + 0 + 0 = 2
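
Putting the per-cell formula and the pivot tables together, a short script can score every feature in one pass. This is a sketch that follows the convention used on these slides (expected value = half of the class total, per-cell values summed); names such as chaid_score are my own. The script sums exact terms, so it prints 4.093 for outlook where the slides, rounding each term first, report 4.092.

from collections import Counter
import math

# (Outlook, Temp, Humidity, Wind, Decision) rows from the example table.
rows = [
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]
FEATURES = {"Outlook": 0, "Temp": 1, "Humidity": 2, "Wind": 3}
DECISIONS = ("Yes", "No")

def chaid_score(rows, col):
    # Sum of sqrt((observed - expected)^2 / expected) over the feature's pivot table.
    score = 0.0
    for value in {r[col] for r in rows}:
        subset = [r for r in rows if r[col] == value]
        expected = len(subset) / len(DECISIONS)     # half of the class total
        counts = Counter(r[-1] for r in subset)
        for decision in DECISIONS:
            score += math.sqrt((counts[decision] - expected) ** 2 / expected)
    return score

for name, col in sorted(FEATURES.items(), key=lambda kv: -chaid_score(rows, kv[1])):
    print(name, round(chaid_score(rows, col), 3))
# Outlook 4.093, Humidity 3.207, Temp 2.569, Wind 2.0 -> outlook is the most significant feature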
Summary table
• Comparing the chi-square values of all four features: outlook = 4.092, humidity = 3.207, temperature = 2.569, wind = 2.
• Outlook has the highest chi-square value, so it is the most significant feature and is placed at the root of the CHAID tree; the data is then partitioned by outlook and the procedure is repeated on each branch.
