Decision Tree

Uploaded by

lovishh03.ssll
Copyright
© All Rights Reserved
Decision Tree Classifier

Decision Tree
• It is a supervised learning method used for
both classification and regression tasks.
• A decision tree is a tree in which each branch
node represents a choice between a number of
alternatives and each leaf node represents a
decision.
• The decision tree does a great job of distilling
data into knowledge.
• Works with: numeric values, nominal values
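To make the branch node and leaf node structure concrete, here is a minimal sketch in Python, assuming the tree is stored as nested dictionaries. The feature names (outlook, windy), the labels, and the classify helper are hypothetical and chosen only for illustration.

```python
# Hypothetical toy tree: each branch node is a dict keyed by a feature name,
# whose value maps each feature value to a subtree or to a leaf (a plain label).
tree = {
    "outlook": {
        "sunny": {"windy": {"yes": "stay in", "no": "play"}},
        "rainy": "stay in",
        "overcast": "play",
    }
}


def classify(node, record):
    """Walk from the root to a leaf, following the record's feature values."""
    while isinstance(node, dict):
        feature = next(iter(node))   # the feature tested at this branch node
        value = record[feature]      # the alternative this record takes
        node = node[feature][value]  # descend to the matching subtree or leaf
    return node                      # a leaf: the decision


print(classify(tree, {"outlook": "sunny", "windy": "no"}))  # play
```

A record whose feature value has no matching branch raises a KeyError in this sketch; a fuller version would need a fallback decision for unseen values.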
General approach to decision trees
1. Collect: Any method.
2. Prepare: This tree-building algorithm works only on
nominal values, so any continuous values will need to
be quantized.
3. Analyze: Any method. You should visually inspect the
tree after it is built.
4. Train: Construct a tree data structure.
5. Test: Calculate the error rate with the learned tree.
6. Use: This can be used in any supervised learning task.
Often, trees are used to better understand the data.
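Two of the steps above, Prepare and Test, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quantize and error_rate helpers, the temperature thresholds, and the bin labels are all hypothetical and not part of the original slides.

```python
def quantize(value, thresholds, labels):
    """Map a continuous value to a nominal label using ascending thresholds.

    Expects one more label than there are thresholds.
    """
    for threshold, label in zip(thresholds, labels):
        if value < threshold:
            return label
    return labels[-1]  # value is at or above the last threshold


def error_rate(predicted, actual):
    """Fraction of records whose predicted label differs from the true label."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Hypothetical bins: below 15 is "cold", below 25 is "mild", otherwise "hot".
temps = [8.0, 19.5, 31.2]
print([quantize(t, [15, 25], ["cold", "mild", "hot"]) for t in temps])

# One wrong prediction out of four gives an error rate of 0.25.
print(error_rate(["yes", "no", "yes", "yes"], ["yes", "no", "no", "yes"]))
```

Fixed thresholds are the simplest way to quantize; how the bin edges are chosen can change which splits the tree finds.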
Information gain

• By using information theory, you can measure the information
before and after the split.
• The change in information before and after the split is known
as the information gain.
• When you know how to calculate the information gain, you
can split your data across every feature to see which split
gives you the highest information gain.
• The split with the highest information gain is your best option.
• Entropy is defined as the expected value of the information.
• The higher the entropy, the more mixed up the data is.
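The bullets above can be sketched in Python as follows. The sketch assumes Shannon entropy with base-2 logarithms, H = -sum(p * log2(p)) over the class proportions p, and a dataset stored as a list of dictionaries. The function names, feature names, and sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1/p), i.e. -sum of p * log2(p)."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy before the split minus the weighted entropy after it."""
    before = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


def best_feature(rows, features, target):
    """Try a split on every feature and keep the one with the highest gain."""
    return max(features, key=lambda f: information_gain(rows, f, target))


# Hypothetical data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
print(information_gain(rows, "outlook", "play"))  # 1.0
print(information_gain(rows, "windy", "play"))    # 0.0
print(best_feature(rows, ["outlook", "windy"], "play"))  # outlook
```

Here the labels start evenly mixed (entropy of 1 bit). Splitting on outlook leaves two pure groups, so the gain is the full 1 bit, while splitting on windy leaves both groups just as mixed as before.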
Uses
• Facial recognition
• Medical field: diagnosis of some diseases
• Recommender systems
Advantages
• Simple to understand, interpret, visualize.
• Can handle both numerical and categorical data.
• Can generate understandable rules
• Performs classification without much
computation
• Inexpensive to construct
• Can handle large data
• Extremely fast at classifying unknown records
Disadvantages
• Performs poorly when there are many classes
and little data.
• Computationally expensive to train.
• Can create overly complex models.
• Overfitting is quite common with decision
trees.
• Decision trees are also vulnerable to becoming
biased toward the classes that form a majority in
the dataset.
