Decision Tree in Data Mining

Uploaded by

rolexxx3636
Copyright
© © All Rights Reserved
Decision Tree Induction in Data Mining

• Decision tree induction is a common data mining technique for building a predictive
model from a dataset. It constructs a tree-like structure in which each internal node
represents a test on an attribute, each branch represents an outcome of that test, and
each leaf node represents a prediction. The goal is a model that accurately predicts the
outcome for a new record from the values of its attributes.
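
The node, branch, and leaf roles described above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the class names `Leaf` and `Node`, the `predict` helper, and the toy weather attributes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    prediction: str  # value returned when a record reaches this leaf


@dataclass
class Node:
    attribute: str               # attribute tested at this internal node
    branches: Dict[str, "Tree"]  # one subtree per outcome of the test


Tree = Union[Leaf, Node]


def predict(tree: Tree, record: Dict[str, str]) -> str:
    # Follow the branch that matches the record's value until a leaf is reached.
    while isinstance(tree, Node):
        tree = tree.branches[record[tree.attribute]]
    return tree.prediction


# Hypothetical toy tree: test "outlook" first, then "windy" on the rainy branch.
toy_tree = Node("outlook", {
    "sunny": Leaf("no"),
    "overcast": Leaf("yes"),
    "rainy": Node("windy", {"true": Leaf("no"), "false": Leaf("yes")}),
})

print(predict(toy_tree, {"outlook": "rainy", "windy": "false"}))  # yes
```

Prediction is a walk from the root to a leaf, so its cost depends on the depth of the tree rather than on the size of the training data.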
• To build a decision tree, the algorithm first selects the attribute that best separates
the data into distinct classes. The choice is typically made with an impurity measure,
such as entropy or the Gini index, which quantifies how mixed the classes are in a subset.
The algorithm then repeats this process on each branch, splitting the data into smaller
and smaller subsets until every subset contains a single class or a stopping condition
(for example, no attributes left to test) is reached.
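
As a rough sketch of the selection step, the code below computes entropy and the Gini index for a list of class labels and picks the attribute whose split leaves the lowest size-weighted impurity. The function names and the five-record toy dataset are assumptions for illustration only. Continuous attributes, which need threshold tests, are not handled.

```python
from collections import Counter
from math import log2


def entropy(labels):
    # H = -sum(p * log2(p)) over the class proportions p.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    # G = 1 - sum(p^2) over the class proportions p.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(rows, labels, attribute, impurity):
    # Group the labels by attribute value, then average impurity by subset size.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    return sum(len(g) / len(labels) * impurity(g) for g in groups.values())


def best_attribute(rows, labels, attributes, impurity=entropy):
    # The best split is the one that leaves the least impurity behind.
    return min(attributes,
               key=lambda a: weighted_impurity(rows, labels, a, impurity))


# Hypothetical toy data: categorical attributes only.
rows = [
    {"outlook": "sunny", "windy": "false"},
    {"outlook": "sunny", "windy": "true"},
    {"outlook": "overcast", "windy": "false"},
    {"outlook": "rainy", "windy": "false"},
    {"outlook": "rainy", "windy": "true"},
]
labels = ["no", "no", "yes", "yes", "no"]

print(best_attribute(rows, labels, ["outlook", "windy"]))        # outlook
print(best_attribute(rows, labels, ["outlook", "windy"], gini))  # outlook
```

A full induction algorithm would call `best_attribute` recursively on each resulting subset.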
• Decision tree induction is popular in data mining because the resulting models are easy
to understand and interpret, and because it handles both numerical and categorical data.
Decision trees also scale to large datasets, and they can be rebuilt or, with incremental
variants, updated as new data becomes available. However, decision trees are prone to
overfitting: the tree grows so complex that it fits noise in the training data and does
not generalize well to new data. For this reason, data scientists often use techniques
such as pruning to simplify the tree and improve its performance on unseen data.
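
One way to picture pruning is reduced-error pruning, shown here as a minimal sketch. The text does not say which pruning method is meant, so this choice is an assumption. The nested-dict tree layout, the `majority` field, and the held-out validation records are also hypothetical. Working bottom-up, a subtree is replaced by a leaf predicting its majority class whenever that does not increase the error on the validation records that reach it.

```python
def predict(tree, record):
    # A leaf is a plain label; an internal node is a dict.
    while isinstance(tree, dict):
        child = tree["branches"].get(record[tree["attribute"]])
        if child is None:
            return tree["majority"]  # unseen value: fall back to majority class
        tree = child
    return tree


def errors(tree, rows, labels):
    return sum(predict(tree, r) != y for r, y in zip(rows, labels))


def prune(tree, rows, labels):
    # Leaves cannot be simplified further.
    if not isinstance(tree, dict):
        return tree
    attr = tree["attribute"]
    new_branches = {}
    for value, child in tree["branches"].items():
        # Only the validation records that reach a child are used to judge it.
        subset = [(r, y) for r, y in zip(rows, labels) if r[attr] == value]
        new_branches[value] = prune(child,
                                    [r for r, _ in subset],
                                    [y for _, y in subset])
    pruned = {**tree, "branches": new_branches}
    leaf = tree["majority"]
    # Collapse to a leaf when that is at least as accurate as the subtree.
    if errors(leaf, rows, labels) <= errors(pruned, rows, labels):
        return leaf
    return pruned


# Hypothetical overfit tree: the "windy" test under "rainy" only fits noise.
full_tree = {"attribute": "outlook", "majority": "yes", "branches": {
    "sunny": "no",
    "overcast": "yes",
    "rainy": {"attribute": "windy", "majority": "yes",
              "branches": {"true": "no", "false": "yes"}},
}}
val_rows = [
    {"outlook": "sunny", "windy": "false"},
    {"outlook": "overcast", "windy": "true"},
    {"outlook": "rainy", "windy": "true"},
    {"outlook": "rainy", "windy": "false"},
]
val_labels = ["no", "yes", "yes", "yes"]

pruned_tree = prune(full_tree, val_rows, val_labels)
print(pruned_tree["branches"]["rainy"])  # yes
```

In this sketch a subtree that no validation record reaches is also collapsed, since there is no evidence for keeping it. Other strategies, such as limiting depth while the tree is grown, trade off differently.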

Advantages of Decision Tree Induction


1. Easy to understand and interpret: A decision tree is a visual, intuitive model that both
experts and non-experts can follow.
2. Handle both numerical and categorical data: Decision trees can work with a mix of
numerical and categorical attributes, which makes them suitable for many types of datasets.
3. Can handle large amounts of data: Decision trees scale to large datasets and can be
rebuilt or, with incremental variants, updated as new data becomes available.
4. Can be used for both classification and regression tasks: In classification the goal is
to predict a discrete outcome; in regression the goal is to predict a continuous outcome.
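
The difference between the two tasks shows up mainly at the leaves and in the split criterion. The sketch below assumes the common convention that a classification leaf predicts the majority class and a regression leaf predicts the mean target, with variance reduction standing in for impurity reduction. The function names and numbers are illustrative assumptions.

```python
from collections import Counter
from statistics import mean, pvariance


def classification_leaf(labels):
    # Discrete outcome: predict the most frequent class in the leaf.
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(targets):
    # Continuous outcome: predict the mean target value in the leaf.
    return mean(targets)


def variance_reduction(parent, left, right):
    # Regression analogue of impurity reduction for a binary split.
    n = len(parent)
    after = (len(left) / n) * pvariance(left) + (len(right) / n) * pvariance(right)
    return pvariance(parent) - after


print(classification_leaf(["yes", "yes", "no"]))  # yes
print(regression_leaf([10.0, 12.0, 14.0]))        # 12.0

# A split that separates small targets from large ones removes most variance.
print(variance_reduction([1.0, 2.0, 9.0, 10.0], [1.0, 2.0], [9.0, 10.0]))
```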

Disadvantages of Decision Tree Induction

1. Prone to overfitting: Decision trees can become too complex and may not generalize well
to new data. This can lead to poor performance on unseen data.
2. Sensitive to small changes in the data: Decision trees can be sensitive to small changes
in the data, and a small change in the data can result in a significantly different tree.
3. Biased towards attributes with many levels: Split criteria such as information gain tend
to favor attributes with many distinct values, even when those values carry little
predictive meaning, so attributes with only a few levels may be passed over.
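
The bias in item 3 can be seen on a small synthetic dataset. In this sketch, `id` is a hypothetical attribute with a unique value per record and `windy` is a two-level attribute that predicts the class in 14 of 16 records. Information gain prefers the meaningless `id`. Gain ratio, the normalization used by C4.5, is included as one commonly cited correction; it is an addition for illustration and is not something the text above prescribes.

```python
from collections import Counter
from math import log2


def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(rows, labels, attribute):
    # Entropy before the split minus size-weighted entropy after it.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, attribute):
    # Divide by the entropy of the attribute's own value distribution,
    # which grows with the number of levels.
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0
    return information_gain(rows, labels, attribute) / split_info


# Synthetic data: 16 records, "windy" is wrong for one record on each side.
windy = ["true"] * 8 + ["false"] * 8
labels = ["no"] * 7 + ["yes"] + ["yes"] * 7 + ["no"]
rows = [{"id": str(i), "windy": w} for i, w in enumerate(windy)]

for attribute in ("id", "windy"):
    print(attribute,
          round(information_gain(rows, labels, attribute), 3),
          round(gain_ratio(rows, labels, attribute), 3))
```

Here `id` has the higher information gain, while `windy` has the higher gain ratio. Gain ratio reduces the bias but does not remove it in every case.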
