Data Classification-Decision Trees: Business Intelligence

This document discusses decision trees for data classification. It begins with an outline of topics including math for business intelligence, probability, and decision trees. It then discusses why mathematical modeling is useful for explaining systems and making predictions. The document outlines the steps for modeling a system, including observing variables, building relationships, analyzing the model, and testing the model. It also covers probability and expected value. The main content discusses decision trees for classification, how they work, entropy, interpreting results, and creating rules from decision trees. It ends with justifying the use of decision trees over linear models for problems with non-linear relationships.

Uploaded by

Areeba
Copyright
© All Rights Reserved

Data Classification-Decision Trees
Business Intelligence
Lecture 4
Farah Mehboob


Outline

• Math For BI
• Probability
• Decision Trees
• KNIME Demo

Reference

□ Mathematical Modelling for Business Analytics by William P. Fox
Why Maths?

□ A mathematical model may be used to help explain a system, to study the effects of different components, and to make predictions about behavior.
□ Formulate the model, outline the model, ask if it is useful, and test the model.

Real World is not in the books

□ Consider modeling an investment. Our first inclination is to use the compound interest equations from high school or college algebra. The compound interest formula calculates the value of a compound interest investment after n interest periods:

$A = P(1 + i)^n$

□ where A is the amount after n interest periods, P is the principal (the amount invested at the start), i is the interest rate applying to each period, and n is the number of interest periods.

Real World is not in the books (cont.)

This is a continuous formula. Have you seen any banking institutions that pay continuous interest? In our research, we have not. As a matter of fact, our local credit union has a sign that says money deposited after 10 a.m. is not credited until the night after the deposit. This makes discrete compounding of interest on the balance a more compelling assumption. A powerful paradigm that we use to model with discrete dynamical systems is as follows:

Future value= present value + change
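
The paradigm above can be sketched in a few lines of Python. This is a minimal illustration, assuming the change in each period is simply the interest earned on the current balance; the function name `future_values` and the numbers (a principal of 1000 at 5% per period) are hypothetical and not taken from the lecture.

```python
def future_values(principal, rate, periods):
    """Iterate a(n+1) = a(n) + rate * a(n), i.e. future = present + change."""
    balance = principal
    history = [balance]
    for _ in range(periods):
        change = rate * balance       # interest earned during this period
        balance = balance + change    # future value = present value + change
        history.append(balance)
    return history


# Hypothetical example: 1000 invested at 5% per period for 3 periods.
values = future_values(1000.0, 0.05, 3)

# The closed form P * (1 + i) ** n gives the same final amount.
closed_form = 1000.0 * (1 + 0.05) ** 3
```

Stepping the balance one period at a time makes it easy to change the rule later (for example, adding a deposit each period), which the closed-form formula does not allow directly.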

Closed System For Modeling

Modeling Outline

□ 1. Observe the system and identify the factors and variables involved in the real-world behavior, possibly making simplifying assumptions as necessary.
□ 2. Build initial relationships among the factors and variables.
□ 3. Build the model and analyze the model’s results.
□ 4. Interpret the mathematical results both mathematically and in terms of the real-world system.
□ 5. Test the model results and conclusions against real-world observations. Do the results and use of the model pass the common sense test? If not, go back and remodel the system.

Probability and Expected Value
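
As a small worked illustration, assuming the usual definition of expected value for a discrete random variable, E[X] = Σ x · P(x), the calculation can be sketched as follows. The function name `expected_value` and the investment figures are hypothetical.

```python
def expected_value(outcomes):
    """outcomes: list of (value, probability) pairs; probabilities must sum to 1."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Weight each outcome by its probability and add the results.
    return sum(v * p for v, p in outcomes)


# Hypothetical investment: 60% chance of gaining 500, 40% chance of losing 300.
ev = expected_value([(500.0, 0.6), (-300.0, 0.4)])
```

Here the expected value is 0.6 × 500 + 0.4 × (−300) = 180, so on average the investment gains 180 per trial even though any single trial either gains 500 or loses 300.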

Decision Trees

Decision trees are particularly effective for predicting nominal and binary data.

Classification through Decision Trees

Decision Trees

Entropy
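
A minimal sketch of the entropy calculation, assuming the standard Shannon entropy H(S) = −Σ pᵢ log₂ pᵢ over the class proportions pᵢ and the information gain criterion used by ID3-style trees. The function names `entropy` and `information_gain` and the labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical labels: a 50/50 split is maximally impure for two classes.
parent = ["yes", "yes", "no", "no"]
parent_entropy = entropy(parent)

# A split that separates the classes perfectly removes all of that impurity.
gain = information_gain(parent, [["yes", "yes"], ["no", "no"]])
```

A tree builder following this criterion would evaluate each candidate attribute and split on the one with the highest information gain.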

Data Mining – Decision Tree

Interpreting Results

Decision Trees Discussion

1. Working with continuous attributes (binning)
2. Avoiding overfitting
3. Super Attributes (attributes with many unique values)
4. Working with missing values
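
Binning, the first item in the list above, can be illustrated with a short sketch. It assumes equal-width binning, which is only one of several possible strategies; the function name `equal_width_bins` and the ages are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in the range 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into one bin.
        return [0 for _ in values]
    # Clamp so that the maximum value lands in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical continuous attribute: customer ages.
ages = [18, 22, 25, 31, 40, 47, 58, 65]
bins = equal_width_bins(ages, 3)
```

After binning, the attribute takes only three nominal values, so a tree can branch on it in the same way as on any other categorical attribute.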

Creating Rules from Decision Tree
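
One way to read rules off a tree is to turn every root-to-leaf path into an IF-THEN rule. The sketch below assumes a tree stored as nested dictionaries; that representation, the function name `tree_to_rules`, and the weather-style attributes are hypothetical and chosen only for illustration.

```python
def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path of the tree."""
    if not isinstance(node, dict):
        # A leaf holds the predicted class label.
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical tree: internal nodes test an attribute, leaves are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rainy": {"attribute": "windy",
                  "branches": {"true": "no", "false": "yes"}},
    },
}

rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Each leaf produces exactly one rule, so this tree with five leaves yields five rules.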

Justifying Decision Trees

Are tree-based models better than linear models?

“If I can use logistic regression for classification problems and linear regression for regression problems, why is there a need to use trees?” Many of us have this question, and it is a valid one. In fact, you can use either kind of algorithm; the right choice depends on the type of problem you are solving. A few key factors help decide which algorithm to use:
□ If the relationship between the dependent and independent variables is well approximated by a linear model, linear regression will outperform a tree-based model.
□ If the relationship between the dependent and independent variables is highly non-linear and complex, a tree model will outperform a classical regression method.
□ If you need to build a model that is easy to explain to people, a decision tree will usually do better than a linear model. Small decision trees can be even simpler to interpret than linear regression.
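
The first two points can be checked on toy data. The sketch below is a rough illustration, not a benchmark: it assumes a least-squares line as the linear model and a one-split regression stump as a stand-in for a tree, and it measures error on the training points only. All function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_stump(xs, ys):
    """One-split regression tree: pick the threshold with the least squared error."""
    best = None
    points = sorted(set(xs))
    for a, b in zip(points, points[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [float(i) for i in range(10)]
linear_ys = [2.0 * x + 1.0 for x in xs]            # linear relationship
step_ys = [0.0 if x < 5 else 10.0 for x in xs]     # abrupt, non-linear jump

line_on_linear = mse(fit_line(xs, linear_ys), xs, linear_ys)
stump_on_linear = mse(fit_stump(xs, linear_ys), xs, linear_ys)
line_on_step = mse(fit_line(xs, step_ys), xs, step_ys)
stump_on_step = mse(fit_stump(xs, step_ys), xs, step_ys)
```

On the linear data the line fits almost exactly while the stump cannot; on the step data the situation is reversed, because a single threshold captures the jump that no straight line can follow.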

Demo

Reference

https://www.coursera.org/lecture/big-data-machine-learning/classification-using-decision-tree-in-knime-0nzzY
https://www.knime.com/knime-introductory-course/chapter6/section3/decision-tree
https://www.saedsayad.com/decision_tree.htm
