C4.5 Decision Tree Algorithm

The C4.5 algorithm, developed by Ross Quinlan, is an advanced decision tree algorithm that improves upon ID3 by handling continuous data, missing values, and overfitting. Key features include the use of Information Gain Ratio, post-pruning, and the ability to generate decision rules. C4.5 is effective in building accurate and interpretable decision trees for complex datasets.

Uploaded by

diljeetpc

C4.5 Algorithm in Decision Trees

Explanation with Numerical Example


C4.5 Algorithm in Decision Trees
• The C4.5 algorithm, developed by Ross Quinlan, is an enhanced version of the ID3 algorithm for building decision trees.
• It adds support for continuous attributes and missing values, and uses pruning to reduce overfitting.
Key Features of C4.5
• Supports both categorical and numerical attributes
• Uses Information Gain Ratio to reduce the bias of plain information gain toward attributes with many distinct values
• Employs post-pruning for better generalization
• Handles missing values
• Can generate decision rules from the tree
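The gain ratio criterion listed above divides an attribute's information gain by its split information (the entropy of the partition the attribute induces). The following is a minimal Python sketch for a categorical attribute; the function names `entropy` and `gain_ratio` are illustrative choices, not part of any C4.5 release.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of labels or values."""
    total = len(items)
    return -sum(n / total * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute: information gain / split information."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute values themselves.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0
```

With four instances labeled `["y", "y", "n", "n"]`, an ID-like attribute with four unique values and a two-valued attribute that separates the classes both achieve the maximum information gain of 1 bit. The gain ratio penalizes the ID-like attribute because its split information is 2 bits rather than 1, so the two-valued attribute is preferred.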
C4.5 Algorithm Steps
1. Select the attribute with the highest gain ratio
2. Create a decision node for the chosen attribute
3. Split the dataset accordingly
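4. Recurse on each subset until one of these holds:
   • All instances belong to the same class
   • No attributes remain (the node becomes a leaf labeled with the majority class)
   • No instances remain (the leaf takes the majority class of the parent node)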
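The steps above can be sketched as a short recursive function. This is a simplified illustration, not Quinlan's implementation: it assumes categorical attributes only, represents each instance as a dict, and omits pruning, thresholds for numerical attributes, and missing-value handling. The names `build_tree` and `gain_ratio` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of labels or values."""
    total = len(items)
    return -sum(n / total * log2(n / total) for n in Counter(items).values())


def gain_ratio(rows, attr, target):
    """Information gain of `attr` divided by its split information."""
    values = [r[attr] for r in rows]
    remainder = 0.0
    for v in set(values):
        subset = [r[target] for r in rows if r[attr] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    gain = entropy([r[target] for r in rows]) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


def build_tree(rows, attrs, target, default=None):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    if not rows:                                # no instances remain
        return default
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:      # pure node, or no attributes remain
        return majority
    # Step 1: pick the attribute with the highest gain ratio.
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    remaining = [a for a in attrs if a != best]
    # Steps 2-4: create a decision node, split, and recurse on each subset.
    branches = {}
    for v in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == v]
        branches[v] = build_tree(subset, remaining, target, majority)
    return {best: branches}
```

Because this sketch splits only on attribute values that actually occur in `rows`, the "no instances remain" case is kept as a guard rather than something the recursion itself triggers.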
Outlook    Temperature  Humidity  Windy  Play Tennis
Sunny      85           85        No     No
Sunny      80           90        Yes    No
Overcast   83           78        No     Yes
Rainy      70           96        No     Yes
Rainy      68           80        No     Yes
Rainy      65           70        Yes    No
Overcast   64           65        Yes    Yes
Sunny      72           95        No     No
Sunny      69           70        No     Yes
Rainy      75           80        No     Yes
Sunny      75           70        Yes    Yes
Overcast   72           90        Yes    Yes
Overcast   81           75        No     Yes
Rainy      71           80        Yes    No
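To make the numbers concrete, the sketch below computes the gain ratio of the categorical attributes and searches for a binary threshold on a numerical attribute, using the 14 instances in the table above. Two assumptions are worth flagging: candidate thresholds are taken as midpoints between adjacent distinct values (C4.5 itself reports an actual value from the data), and thresholds are ranked by plain information gain. All function names are illustrative.

```python
from collections import Counter
from math import log2

COLUMNS = ("Outlook", "Temperature", "Humidity", "Windy", "Play Tennis")
DATA = [
    ("Sunny", 85, 85, "No", "No"), ("Sunny", 80, 90, "Yes", "No"),
    ("Overcast", 83, 78, "No", "Yes"), ("Rainy", 70, 96, "No", "Yes"),
    ("Rainy", 68, 80, "No", "Yes"), ("Rainy", 65, 70, "Yes", "No"),
    ("Overcast", 64, 65, "Yes", "Yes"), ("Sunny", 72, 95, "No", "No"),
    ("Sunny", 69, 70, "No", "Yes"), ("Rainy", 75, 80, "No", "Yes"),
    ("Sunny", 75, 70, "Yes", "Yes"), ("Overcast", 72, 90, "Yes", "Yes"),
    ("Overcast", 81, 75, "No", "Yes"), ("Rainy", 71, 80, "Yes", "No"),
]


def entropy(items):
    """Shannon entropy (in bits) of a list of labels or values."""
    total = len(items)
    return -sum(n / total * log2(n / total) for n in Counter(items).values())


def gain_of_partition(parts):
    """Information gain of splitting the class labels into the given parts."""
    all_labels = [y for part in parts for y in part]
    remainder = sum(len(p) / len(all_labels) * entropy(p) for p in parts)
    return entropy(all_labels) - remainder


def categorical_gain_ratio(name):
    """Gain ratio of a categorical column, with one branch per distinct value."""
    i = COLUMNS.index(name)
    parts = {}
    for row in DATA:
        parts.setdefault(row[i], []).append(row[-1])
    split_info = entropy([row[i] for row in DATA])
    return gain_of_partition(list(parts.values())) / split_info


def best_threshold(name):
    """Return (threshold, gain) for the best binary split `value <= threshold`."""
    i = COLUMNS.index(name)
    values = sorted({row[i] for row in DATA})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # assumed candidate: midpoint between adjacent values
        left = [row[-1] for row in DATA if row[i] <= t]
        right = [row[-1] for row in DATA if row[i] > t]
        gain = gain_of_partition([left, right])
        if best is None or gain > best[1]:
            best = (t, gain)
    return best


print(f"Outlook gain ratio: {categorical_gain_ratio('Outlook'):.3f}")
print(f"Windy gain ratio:   {categorical_gain_ratio('Windy'):.3f}")
threshold, gain = best_threshold("Humidity")
print(f"Humidity: split at <= {threshold}, gain {gain:.3f}")
```

On this data the class entropy is about 0.940 bits (9 Yes, 5 No). Outlook has an information gain of about 0.247 and a split information of about 1.577, giving a gain ratio of about 0.156, well above Windy's roughly 0.049. For Humidity, the best binary split under these assumptions falls between 80 and 85, with a gain of about 0.102.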
C4.5 vs ID3
Feature                 ID3                C4.5
Splitting Criterion     Information Gain   Gain Ratio
Attribute Types         Categorical only   Categorical + Numerical
Pruning                 Not available      Post-pruning
Missing Data Handling   Not supported      Supported
Rule Extraction         Not supported      Supported
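Rule extraction, the last row of the comparison, starts from reading each root-to-leaf path of the tree as one if-then rule. The sketch below shows only that first step, under an assumed tree representation (nested dicts keyed by test strings); it does not perform any later simplification or generalization of the rules. The example tree is hand-written for illustration and is consistent with the 14-instance dataset, not the output of a C4.5 run.

```python
def tree_to_rules(tree, conditions=()):
    """Yield one 'IF ... THEN ...' string per root-to-leaf path."""
    if not isinstance(tree, dict):
        # Leaf: the accumulated tests form the rule's antecedent.
        antecedent = " AND ".join(conditions) if conditions else "TRUE"
        yield f"IF {antecedent} THEN {tree}"
        return
    for test, subtree in tree.items():
        yield from tree_to_rules(subtree, conditions + (test,))


# Hand-written illustrative tree (assumed representation: test string -> subtree).
example_tree = {
    "Outlook = Sunny": {"Humidity <= 75": "Yes", "Humidity > 75": "No"},
    "Outlook = Overcast": "Yes",
    "Outlook = Rainy": {"Windy = Yes": "No", "Windy = No": "Yes"},
}

for rule in tree_to_rules(example_tree):
    print(rule)
```

Each printed line is a standalone rule such as `IF Outlook = Sunny AND Humidity <= 75 THEN Yes`, which can be inspected or edited independently of the tree it came from.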


Summary
• C4.5 is a robust and practical decision tree algorithm
• Handles both categorical and numerical attributes, and limits overfitting through post-pruning
• Builds accurate and interpretable decision trees
