Decision Tree With Example
Decision trees are effective tools for classification and prediction, utilizing a set of training cases to determine target attribute values for new examples. They require a fixed collection of attributes, predefined classes, and sufficient data for model learning. While they generate understandable rules and handle various variable types, decision trees can struggle with continuous attributes, small datasets, and computational efficiency during training.
Decision Tree
The problem
• Given a set of training cases/objects and their attribute values, try to determine the target attribute value of new examples.
• Classification
• Prediction
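As a concrete illustration of the problem statement, the sketch below assumes Python with scikit-learn and pandas and a small made-up weather-style training table (the attribute names, values, and target column are illustrative assumptions, not data from these slides). It fits a decision tree on the training cases and then predicts the target attribute value of a new example.

# Minimal sketch, assuming scikit-learn and pandas; the tiny weather-style
# table below is an illustrative assumption, not data from these slides.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

train = pd.DataFrame({
    "outlook": ["sunny", "sunny", "overcast", "rainy", "rainy"],
    "temp":    ["hot",   "mild",  "hot",      "mild",  "cold"],
    "windy":   ["false", "true",  "false",    "false", "true"],
    "play":    ["no",    "no",    "yes",      "yes",   "no"],
})

X, y = train[["outlook", "temp", "windy"]], train["play"]

# scikit-learn trees need numeric inputs, so encode the categorical
# attribute values as integers first.
encoder = OrdinalEncoder()
X_encoded = encoder.fit_transform(X)

tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X_encoded, y)

# Determine the target attribute value of a new, unseen example.
new_case = pd.DataFrame([["sunny", "mild", "false"]],
                        columns=["outlook", "temp", "windy"])
print(tree.predict(encoder.transform(new_case)))  # e.g. ['no'] or ['yes']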
Why decision tree?
• Decision trees are powerful and popular tools for classification and prediction.
• Decision trees represent rules, which can be understood by humans and used in knowledge systems such as databases.

Key requirements
• Attribute-value description: the object or case must be expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
• Predefined classes (target values): the target function has discrete output values (boolean or multiclass).
• Sufficient data: enough training cases should be provided to learn the model.

[Example: training data table with attributes such as Windy]

Evaluation
• Training accuracy
  • How many training instances can be correctly classified based on the available data?
  • It is high when the tree is deep/large, or when there is little conflict among the training instances.
  • However, higher training accuracy does not mean good generalization.
• Testing accuracy
  • Given a number of new instances, how many of them can we correctly classify?
• Cross validation (see the evaluation sketch after the Weaknesses slide)

Strengths
• Can generate understandable rules.
• Perform classification without much computation.
• Can handle continuous and categorical variables.
• Provide a clear indication of which fields are most important for prediction or classification.

Weaknesses
• Not suitable for prediction of continuous attributes.
• Perform poorly with many classes and small data.
• Computationally expensive to train.
  • At each node, each candidate splitting field must be sorted before its best split can be found.
  • In some algorithms, combinations of fields are used and a search must be made for optimal combining weights.
  • Pruning algorithms can also be expensive since many candidate sub-trees must be formed and compared.
• Do not handle non-rectangular regions well.
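The evaluation ideas above (training accuracy versus testing accuracy, and cross validation) can be illustrated with the following sketch, assuming Python with scikit-learn and a synthetic dataset from make_classification; the dataset and any printed numbers are illustrative only.

# Minimal evaluation sketch, assuming scikit-learn and a synthetic
# dataset (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Training accuracy: a deep, unpruned tree can fit the training set
# almost perfectly...
print("training accuracy:", tree.score(X_train, y_train))

# ...but testing accuracy on held-out instances is what indicates
# generalization.
print("testing accuracy:", tree.score(X_test, y_test))

# Cross validation: average accuracy over several train/test splits.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("5-fold CV accuracy:", scores.mean())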