Lecture 8

Uploaded by

Mrawan Taha

Lec. 8
Computational Tools (4170201)
Classification
• Classification (supervised learning) is a form of data analysis that extracts models describing important data classes.
• Such analysis can help provide us with a better understanding of large data sets.
• Recent data science research has built on such work, developing scalable classification and prediction techniques capable of handling large amounts of disk-resident data.
General Approach to Classification
• Data classification is a two-step process, consisting of:
 a learning step (where a classification model is constructed), and
 a classification step (where the model is used to predict class labels for given data).
Classification—A Two-Step Process:
• Model construction (learning step): describing a set of predetermined classes.
 Each sample is assumed to belong to a predefined class, as determined by the class label attribute.
 The set of samples used for model construction is the training set.
 The model is represented as classification rules, decision trees, or mathematical formulae.
Classification—A Two-Step Process
• Model usage: for classifying future or unknown objects.
 Estimate the accuracy of the model:
 The known label of each test sample is compared with the classification result from the model.
 The accuracy rate is the percentage of test-set samples that are correctly classified by the model.
 The test set is independent of the training set.
 If the accuracy is acceptable, use the model to classify new data.
• Note: if the test set is used to select models, it is called a validation (test) set.
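The accuracy rate described above is simple to compute; a minimal sketch in Python (the labels shown are made up for illustration):

```python
def accuracy(true_labels, predicted_labels):
    """Accuracy rate: percentage of test-set samples whose predicted
    label matches the known label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Hypothetical known labels of a test set vs. the model's predictions
known = ["yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "no"]
print(accuracy(known, predicted))  # 75.0
```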
Process (1): Model Construction

[Figure: the training data is fed to a classification algorithm, which produces the classifier (model).]

Training data:

NAME | RANK           | YEARS | TENURED
Mike | Assistant Prof | 3     | no
Mary | Assistant Prof | 7     | yes
Bill | Professor      | 2     | yes
Jim  | Associate Prof | 7     | yes
Dave | Assistant Prof | 6     | no
Anne | Associate Prof | 3     | no

Learned classifier (model):
IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
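The learned rule in the figure can be sketched as a function and checked against the training set (the rank strings are lower-cased here for illustration):

```python
def tenured_rule(rank, years):
    """The classifier learned in the figure:
    IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank == "professor" or years > 6 else "no"

# Training set from the figure
training_set = [
    ("Mike", "assistant prof", 3, "no"),
    ("Mary", "assistant prof", 7, "yes"),
    ("Bill", "professor", 2, "yes"),
    ("Jim", "associate prof", 7, "yes"),
    ("Dave", "assistant prof", 6, "no"),
    ("Anne", "associate prof", 3, "no"),
]
# The rule reproduces every training label
print(all(tenured_rule(rank, years) == label
          for _, rank, years, label in training_set))  # True
```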
Process (2): Using the Model in Prediction

[Figure: the classifier is applied first to the testing data, then to unseen data.]

Testing data:

NAME    | RANK           | YEARS | TENURED
Tom     | Assistant Prof | 2     | no
Merlisa | Associate Prof | 7     | no
George  | Professor      | 5     | yes
Joseph  | Assistant Prof | 7     | yes

Unseen data: (Jeff, Professor, 4) → Tenured?
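Applying the same rule from the model-construction step to this test set shows one misclassification (Merlisa), giving 75% test accuracy, and then predicts a label for the unseen sample (Jeff, Professor, 4); a sketch:

```python
def tenured_rule(rank, years):
    # rule produced in the model-construction step
    return "yes" if rank == "professor" or years > 6 else "no"

# Test set from the figure (rank strings lower-cased for illustration)
test_set = [
    ("Tom", "assistant prof", 2, "no"),
    ("Merlisa", "associate prof", 7, "no"),
    ("George", "professor", 5, "yes"),
    ("Joseph", "assistant prof", 7, "yes"),
]
correct = sum(tenured_rule(rank, years) == label
              for _, rank, years, label in test_set)
print(100.0 * correct / len(test_set))  # 75.0 (Merlisa is misclassified)

# Unseen data: (Jeff, Professor, 4)
print(tenured_rule("professor", 4))  # yes
```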
Classification
- The primary task performed by classifiers is to assign labels to objects.
- Labels in classification are pre-determined, unlike in clustering, where we discover the structure and then assign labels.
- Classification problems are supervised learning methods.

Example classification techniques (methods):
Decision Trees
Classification Basic Concepts:
Decision Trees
Decision Trees
• Decision trees are a flexible method very commonly deployed in classification applications.
• There are two types of trees: classification trees and regression (or prediction) trees.
• Classification trees (the kind we will use in these slides) are used to segment observations into more homogeneous groups (assign class labels). They usually apply to outcomes that are binary or categorical in nature.
• Regression trees are variations of regression: what is returned in each node is the average value at that node (a type of step function with which the average value can be computed). Regression trees can be applied to outcomes that are continuous (like account spend or personal income).
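The step-function behaviour of a regression tree comes from its leaf computation alone; a minimal sketch (the outcome values are made up):

```python
def leaf_prediction(outcomes_in_leaf):
    """A regression-tree leaf returns the average of the continuous
    outcomes of the training observations that reached it, so the whole
    tree behaves like a step function over the input space."""
    return sum(outcomes_in_leaf) / len(outcomes_in_leaf)

# e.g. personal incomes of the observations that fell into one leaf
print(leaf_prediction([48000.0, 52000.0, 50000.0]))  # 50000.0
```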
Decision Tree Classifier - What is it?
• Used for classification:
 Input variables can be continuous or discrete.
• Output:
 A tree that describes the decision flow.
 Leaf nodes return either a probability score or simply a classification.
 Trees can be converted to a set of "decision rules", e.g.:
"IF income < $50,000 AND mortgage_amt > $100K THEN default = T with 75% probability"
Classification
• Classification: assign labels to objects.
• Usually supervised: a training set of pre-classified examples.
• Example classification techniques (methods):
 Decision Trees (and Regression)
[Figure: a tree with the root (a parent node) at the top, child nodes below it, and branches connecting them.]
Trees
• A tree is a hierarchical data structure consisting of:
 Nodes – store information
 Branches – connect the nodes
• The top node is the root, occupying the highest level of the hierarchy.
• The leaves are at the bottom, occupying the lowest level of the hierarchy.
• Every node, except the root, has exactly one parent.
• Every node may have zero or more child nodes.
• A binary tree restricts the number of children per node to a maximum of two.
• A degenerate tree has only a single pathway from the root to its one leaf.
• In a binary tree, each node may have a left child and a right child.
• If you start from any node and move upward, you will eventually reach the root.
• Depth: the path length from the root of the tree to a given node.
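The definitions above (parent links, moving upward to the root, depth) can be sketched as a small binary-tree class; the names are illustrative:

```python
class Node:
    """A binary-tree node: stores a value and has at most two children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
        self.parent = None  # every node except the root has exactly one parent
        for child in (left, right):
            if child is not None:
                child.parent = self

def depth(node):
    """Path length from the root of the tree down to this node."""
    d = 0
    while node.parent is not None:  # moving upward always reaches the root
        node = node.parent
        d += 1
    return d

leaf = Node("leaf")
root = Node("root", Node("inner", leaf), Node("other leaf"))
print(depth(root), depth(leaf))  # 0 2
```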
Creating a Decision Tree
Consider a scenario where a new planet is discovered by a group of astronomers. The question is whether it could be 'the next Earth'.

The decision factors can be: what the temperature is, whether water is present on the planet, whether the surface is prone to continuous storms, whether flora and fauna survive the climate, and so on.
Creating a Decision Tree Example
Decision Tree – Example of Visual Structure

[Figure: the root node tests Gender. The Female branch tests Income (<= 45,000 → Yes, > 45,000 → No); the Male branch tests Age (<= 40 → Yes, > 40 → No).]

Branch – outcome of a test
Internal node – decision on a variable
Leaf node – class label
• Branches refer to the outcome of a decision. When the decision is numerical, the "greater than" branch is usually shown on the right and the "less than" branch on the left.

• Internal nodes are the decision or test points. Each refers to a single variable or attribute. In the example here the outcomes are binary, although there could be more than two branches stemming from an internal node. For example, if the variable were categorical and had three choices, you might need a branch for each choice.
• The leaf nodes are at the end of the last branch on the tree. They represent the outcome of all the prior decisions. The leaf nodes are the class labels, or the segment in which all observations that follow the path to the leaf would be placed.
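The example tree above (Gender at the root, then Income or Age) can be written as nested decisions; the Yes/No class labels are taken from the figure:

```python
def classify(gender, income, age):
    """Walk the example tree: Gender at the root; the Female branch
    tests Income, the Male branch tests Age; each leaf returns its label."""
    if gender == "Female":
        return "Yes" if income <= 45000 else "No"
    else:  # Male
        return "Yes" if age <= 40 else "No"

print(classify("Female", 30000, 50))  # Yes (income branch; age is ignored)
print(classify("Male", 30000, 50))    # No  (age branch; income is ignored)
```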
Advantages of Decision Trees
• Easy to understand.
• Map nicely to a set of production rules.
• Applicable to real problems.
• Able to process both numerical and categorical data.

Disadvantages of Decision Trees
• The output attribute must be categorical.
• Limited to one output attribute.
• Decision tree algorithms are unstable (slight variations in the training set can result in different attribute selections).
• Trees created from numeric datasets can be complex, as attribute splits for numeric data are typically binary.
From Trees to rules
Decision trees can be nicely mapped to a set of production rules (one advantage of DTs): one rule for each leaf.
From Trees to rules
Is a person fit or unfit?

[Figure: the root tests Age < 30. On the Yes branch the next test is "Eats pizza?" (Yes → Unfit, No → Fit); on the No branch the next test is "Exercises?" (Yes → Fit, No → Unfit).]

IF age < 30 AND eats pizza THEN unfit
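A sketch of the fit/unfit tree as code, one path per leaf (the argument names are assumptions made for illustration; the branch labels are read off the slide's diagram):

```python
def fit_or_unfit(age, eats_pizza, exercises):
    """Each root-to-leaf path in the tree corresponds to one production rule,
    e.g. IF age < 30 AND eats pizza THEN Unfit."""
    if age < 30:
        return "Unfit" if eats_pizza else "Fit"
    return "Fit" if exercises else "Unfit"

print(fit_or_unfit(25, eats_pizza=True, exercises=False))  # Unfit
print(fit_or_unfit(45, eats_pizza=False, exercises=True))  # Fit
```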
From Trees to rules

IF temperature is not between -10 and 60 THEN survival difficult

The remaining decision factors follow the same pattern:
 Is water present or not?
 Do flora and fauna flourish?
 Does the planet have a stormy surface?

Thus, we have a decision tree.
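The planet rules above can be sketched as a decision function; the question order follows the slide, while the "Survival probable" leaf label is an assumption added for illustration:

```python
def next_earth(temperature, has_water, flora_fauna_flourish, stormy_surface):
    """Apply the slide's checks in order; any failed check
    makes survival difficult."""
    if not (-10 <= temperature <= 60):
        return "Survival difficult"
    if not has_water:
        return "Survival difficult"
    if not flora_fauna_flourish:
        return "Survival difficult"
    if stormy_surface:
        return "Survival difficult"
    return "Survival probable"  # assumed label for the remaining leaf

print(next_earth(80, True, True, False))  # Survival difficult
print(next_earth(22, True, True, False))  # Survival probable
```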


Decision Tree Classifier - Reasons to Choose (+) & Cautions (-)

Reasons to Choose (+):
• Takes any input type (numeric, categorical); in principle, it can handle categorical variables with many distinct values (e.g., ZIP code).
• Robust with redundant and correlated variables.
• Naturally handles variable interaction.
• Handles variables that have a non-linear effect on the outcome.
• Computationally efficient to build.
• Easy to score data.
• Many algorithms can return a measure of variable importance.
• In principle, decision rules are easy to understand.

Cautions (-):
• Decision surfaces can only be axis-aligned.
• The tree structure is sensitive to small changes in the training data.
• A "deep" tree is probably over-fit, because each split reduces the training data available for subsequent splits.
• Not good for outcomes that depend on many variables (related to the over-fit problem above).
• Doesn't naturally handle missing values (although most implementations include a method for dealing with this).
• In practice, decision rules can be fairly complex.
Check Your Knowledge
Your Thoughts?

1. How do you define information gain?
2. List three use cases of decision trees.
3. What are weak learners, and how are they used in ensemble methods?
4. Why do we end up with an over-fitted model with deep trees, and in data sets where the outcomes depend on many variables?
