DMDW - Unit 3 - Classification

This document covers the fundamentals of classification in data mining, including the definition, general approach, and decision tree induction. It explains how to build decision trees, evaluate classifier performance using confusion matrices, and discusses various algorithms for tree induction. Key concepts such as model overfitting and methods for expressing attribute test conditions are also addressed.

UNIT – 3

Classification

Basic Concepts
General Approach to solving a classification problem
Decision Tree Induction
Working of Decision Tree- building a decision tree
Methods for expressing attribute test conditions
Measures for selecting the best split
Algorithm for decision tree induction
Model Overfitting
Evaluating the performance of a classifier

© Tan, Steinbach, Kumar — Introduction to Data Mining, 4/18/2004


Classification: Definition
 Given a collection of records (the training set):
– Each record contains a set of attributes; one of the attributes is the class.
 Find a model for the class attribute as a function of the values of the other attributes.
 Goal: previously unseen records should be assigned a class as accurately as possible.
 A test set is used to determine the accuracy of the model.
 Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
 A classification model is useful for the following purposes:
– Descriptive modeling: a classification model can serve as an explanatory tool to distinguish between objects of different classes; for example, it can explain what features characterize a borrower as a defaulter or not.
– Predictive modeling: a classification model can also be used to predict the class label of unknown records, such as:

Home Owner   Marital Status   Annual Income   Defaulted Borrower
No           Married          80K             ?


General Approach to Solving a Classification Problem

Illustrating the classification task: a learning algorithm is applied to the training set to learn a model (induction); the model is then applied to the test set, whose class labels are unknown, to make predictions (deduction).

Training Set
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?
Performance metrics

 Evaluation of the performance of a classification model is based on the counts of test records correctly and incorrectly predicted by the model.
 These counts are tabulated in a table known as a confusion matrix.
 fij denotes the number of records from class i predicted as class j; for example, f01 is the number of records from class 0 incorrectly predicted as class 1.
 From these counts, accuracy = (f11 + f00) / (f11 + f10 + f01 + f00), and error rate = 1 − accuracy.
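As a small illustration (not from the original slides), the sketch below computes accuracy and error rate from the four counts of a 2-class confusion matrix; the specific counts are made up for the example.

```python
# Sketch: computing accuracy and error rate from a 2-class confusion matrix.
# f[(i, j)] = number of test records of actual class i predicted as class j.
# The counts below are illustrative only.

f = {
    (0, 0): 40,  # class 0 correctly predicted as class 0
    (0, 1): 10,  # class 0 incorrectly predicted as class 1 (f01)
    (1, 0): 5,   # class 1 incorrectly predicted as class 0 (f10)
    (1, 1): 45,  # class 1 correctly predicted as class 1
}

total = sum(f.values())
accuracy = (f[(0, 0)] + f[(1, 1)]) / total   # fraction of test records predicted correctly
error_rate = 1.0 - accuracy                  # fraction predicted incorrectly

print(f"accuracy = {accuracy:.3f}, error rate = {error_rate:.3f}")
```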



Decision Tree Induction:

 How does a decision tree work?
The tree has three types of nodes:
– Root node that has no incoming edges and zero or more outgoing edges.
– Internal nodes, each of which has exactly one incoming edge and two or
more outgoing edges.
– Leaf or terminal nodes, each of which has exactly one incoming edge
and no outgoing edges.
 In a decision tree, each leaf node is assigned a class label.
 The non-terminal nodes, which include the root and other internal nodes,
contain attribute test conditions to separate records that have different
characteristics.

 How to Build a Decision Tree?



Example of a Decision Tree

Training Data
Tid  Home Owner  Marital Status  Annual Income  Defaulted Borrower
1    Yes         Single          125K           No
2    No          Married         100K           No
3    No          Single          70K            No
4    Yes         Married         120K           No
5    No          Divorced        95K            Yes
6    No          Married         60K            No
7    Yes         Divorced        220K           No
8    No          Single          85K            Yes
9    No          Married         75K            No
10   No          Single          90K            Yes

Model: Decision Tree (splitting attributes: Home Owner, Marital Status, Annual Income)
– Home Owner = Yes → NO
– Home Owner = No:
    – Marital Status = Married → NO
    – Marital Status = Single or Divorced:
        – Annual Income < 80K → NO
        – Annual Income >= 80K → YES
There could be more than one tree that fits the same data!

An alternative tree for the same training data splits on Marital Status first:
– Marital Status = Married → NO
– Marital Status = Single or Divorced:
    – Home Owner = Yes → NO
    – Home Owner = No:
        – Annual Income < 80K → NO
        – Annual Income >= 80K → YES


Decision Tree Classification Task

The general framework above is instantiated with a tree induction algorithm: the algorithm is applied to the training set (Tid 1–10) to learn a decision tree model (induction), and the learned tree is then applied to the test set (Tid 11–15) to predict the unknown class labels (deduction).


Apply Model to Test Data

Test record: Home Owner = No, Marital Status = Married, Annual Income = 80K, Defaulted Borrower = ?

Start from the root of the tree and follow the branches that match the test record:
1. The root node tests Home Owner. The record has Home Owner = No, so follow the "No" branch to the Marital Status node.
2. The Marital Status node's "Married" branch leads directly to a leaf labeled NO (the "Single, Divorced" branch would instead lead to the Annual Income test).
3. The record therefore reaches the leaf NO: assign Defaulted Borrower = "No".
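To make the traversal concrete, here is a small sketch (my own illustration, not part of the original slides) that hard-codes the decision tree from the example and classifies the test record; attribute names follow the slides, with income in thousands.

```python
# Sketch: applying the example decision tree to a test record.
# The tree structure mirrors the slides: Home Owner -> Marital Status -> Annual Income.

def classify(record):
    """Return the predicted 'Defaulted Borrower' label for one record."""
    if record["Home Owner"] == "Yes":
        return "No"
    # Home Owner == "No": test Marital Status next
    if record["Marital Status"] == "Married":
        return "No"
    # Single or Divorced: test Annual Income (in thousands)
    if record["Annual Income"] < 80:
        return "No"
    return "Yes"

test_record = {"Home Owner": "No", "Marital Status": "Married", "Annual Income": 80}
print(classify(test_record))  # -> "No", matching the walkthrough above
```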


Decision Tree Induction

 Many Algorithms:
– Hunt’s Algorithm (one of the earliest)
– CART
– ID3
– C4.5



Hunt’s Algorithm
 In Hunt's algorithm, a decision tree is grown in a recursive fashion by partitioning the training records into successively purer subsets.
 Let Dt be the set of training records associated with node t, and let y = {y1, y2, ..., yc} be the class labels.

The following is a recursive definition of Hunt's algorithm:
 Step 1: If all the records in Dt belong to the same class yt, then t is a leaf node labeled as yt.
 Step 2: If Dt contains records that belong to more than one class, an attribute test condition is selected to partition the records into smaller subsets. A child node is created for each outcome of the test condition, and the records in Dt are distributed to the children based on the outcomes. The algorithm is then recursively applied to each child node (a sketch is given below).

(The loan training set shown earlier, with attributes Home Owner, Marital Status, and Annual Income and class Defaulted Borrower, is used to illustrate the algorithm on the next slide.)
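The following is a minimal sketch of Hunt's algorithm (my own illustration, not the book's code). It represents records as (attribute-dict, label) pairs and assumes an externally supplied `choose_test` function that picks the attribute test condition; a real implementation would plug an impurity-based splitting measure into that function.

```python
# Minimal sketch of Hunt's algorithm. `choose_test(records)` is assumed to return a
# function mapping a record to a branch outcome, or None if no useful split exists.
from collections import Counter

def hunt(records, choose_test):
    labels = [y for _, y in records]
    # Step 1: if all records belong to the same class, create a leaf with that label.
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    test = choose_test(records)
    if test is None:  # no attribute left to split on: label with the majority class
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    # Step 2: partition the records by the outcome of the test condition
    # and recursively apply the algorithm to each child.
    partitions = {}
    for x, y in records:
        partitions.setdefault(test(x), []).append((x, y))
    if len(partitions) == 1:  # split did not separate the records: fall back to a leaf
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    return {"test": test,
            "children": {outcome: hunt(subset, choose_test)
                         for outcome, subset in partitions.items()}}
```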



Hunt's Algorithm: Example (loan training data shown earlier)

The tree is grown in stages:
(a) Initially, all records are at a single node. Since most borrowers did not default, the node is a leaf labeled Defaulted = No.
(b) The records are split on Home Owner. The Home Owner = Yes child contains only non-defaulters, so it becomes a leaf Defaulted = No; the Home Owner = No child still contains records from both classes.
(c) The Home Owner = No branch is split on Marital Status. The Married child contains only non-defaulters (leaf Defaulted = No), while the Single/Divorced child (labeled Defaulted = Yes at this stage) still contains records from both classes.
(d) The Single/Divorced child is therefore split on Annual Income: records with Annual Income < 80K form a leaf Defaulted = No, and records with Annual Income >= 80K form a leaf Defaulted = Yes.


Classification Problem-2

(A second worked classification example was shown on these slides as figures; the figures are not reproduced here.)


Tree Induction

 Design Issues of Decision Tree Induction
– Determine how to split the records:
   How to specify the attribute test condition?
   How to determine the best split?
– Determine when to stop splitting



1. Methods for Expressing Attribute Test Conditions

 Depends on attribute types


– Binary
– Nominal
– Ordinal
– Continuous

 Depends on number of ways to split


– 2-way split
– Multi-way split



Splitting Based on Binary Attributes

 Binary attributes: the test condition for a binary attribute generates two potential outcomes (e.g., Home Owner = Yes or No).



Splitting Based on Nominal Attributes

 Since a nominal attribute can have many values, its test condition can
be expressed in two ways.
 1. Multi-way split: the number of outcomes depends on the number of distinct values of the attribute. For Car Type, a multi-way split has one branch per value: Family, Sports, Luxury.
 2. Binary split: divides the values into two subsets; we need to find the optimal partitioning. For Car Type: {Sports, Luxury} vs. {Family}, or {Family, Luxury} vs. {Sports}.



Splitting Based on Ordinal Attributes

 Ordinal attribute values can be grouped as long as the grouping does not violate the order property of the attribute values.
 Multi-way split: use as many partitions as distinct values. For Size: Small, Medium, Large.
 Binary split: divides the values into two subsets; we need to find the optimal partitioning. For Size: {Small, Medium} vs. {Large}, or {Small} vs. {Medium, Large}.
– What about the split {Small, Large} vs. {Medium}? It violates the order property of the attribute values, so it is not a valid grouping for an ordinal attribute.


Splitting Based on Continuous Attributes

 Different ways of handling:
– Discretization to form an ordinal categorical attribute (see the binning sketch after this list)
   Static – discretize once at the beginning
   Dynamic – ranges can be found by equal-interval bucketing, equal-frequency bucketing (percentiles), or clustering
– Binary decision: (A < v) or (A ≥ v)
   consider all possible split points and find the best cut
   can be more compute-intensive
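As a small illustration (not from the slides), the sketch below discretizes a continuous attribute with equal-interval and equal-frequency bucketing using only the standard library; in practice one would typically use library helpers such as pandas.cut / pandas.qcut.

```python
# Sketch: static discretization of a continuous attribute into k buckets.
values = [60, 70, 75, 85, 90, 95, 100, 120, 125, 220]  # annual income (K), from the slides

def equal_interval_bins(vals, k):
    """Assign each value to one of k equal-width intervals."""
    lo, hi = min(vals), max(vals)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in vals]

def equal_frequency_bins(vals, k):
    """Assign each value to one of k bins holding (roughly) the same number of records."""
    order = sorted(range(len(vals)), key=lambda i: vals[i])
    bins = [0] * len(vals)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(vals), k - 1)
    return bins

print(equal_interval_bins(values, 3))   # width-based buckets
print(equal_frequency_bins(values, 3))  # percentile-style buckets
```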



Splitting Based on Continuous Attributes

(i) Binary split: Taxable Income > 80K? → Yes / No
(ii) Multi-way split: Taxable Income? → < 10K, [10K, 25K), [25K, 50K), [50K, 80K), ≥ 80K



2. How to determine the Best Split

Measures for Selecting the Best Split

Before splitting: 10 records of class C0 and 10 records of class C1.

Three candidate test conditions:
– Own Car?    Yes: C0 = 6, C1 = 4;   No: C0 = 4, C1 = 6
– Car Type?   Family: C0 = 1, C1 = 3;   Sports: C0 = 8, C1 = 0;   Luxury: C0 = 1, C1 = 7
– Student ID? c1 ... c10: C0 = 1, C1 = 0 each;   c11 ... c20: C0 = 0, C1 = 1 each

Which test condition is the best?



How to determine the Best Split

 Greedy approach:
– Nodes with homogeneous class distribution are
preferred

 Need a measure of node impurity:

Example: a node with C0 = 5, C1 = 5 is non-homogeneous (high degree of impurity), while a node with C0 = 9, C1 = 1 is nearly homogeneous (low degree of impurity).



Measures of Node Impurity

 Measures are defined in terms of the class distribution of the records before and after splitting:
– Gini index
– Entropy
– Misclassification error

Let p(i|t) denote the fraction of records belonging to class i at a given node t. For a node t:

Gini(t) = 1 − Σ_i [p(i|t)]²
Entropy(t) = − Σ_i p(i|t) log2 p(i|t)
Classification error(t) = 1 − max_i p(i|t)



Examples of computing the different impurity measures

The original slide tabulates three example nodes; with class counts (C0, C1) of N1 = (0, 6), N2 = (1, 5), and N3 = (3, 3), as in the textbook's example, N1 has the lowest impurity value, followed by N2 and N3, under all three measures.
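A small sketch (assuming the node counts above) that computes the three impurity measures:

```python
# Sketch: Gini, entropy, and classification error for a node, given its class counts.
from math import log2

def impurities(counts):
    n = sum(counts)
    p = [c / n for c in counts]
    gini = 1 - sum(pi ** 2 for pi in p)
    entropy = -sum(pi * log2(pi) for pi in p if pi > 0)
    error = 1 - max(p)
    return gini, entropy, error

for name, counts in [("N1", (0, 6)), ("N2", (1, 5)), ("N3", (3, 3))]:
    g, e, err = impurities(counts)
    print(f"{name}: Gini={g:.3f}, Entropy={e:.3f}, Error={err:.3f}")
# N1: Gini=0.000, Entropy=0.000, Error=0.000
# N2: Gini=0.278, Entropy=0.650, Error=0.167
# N3: Gini=0.500, Entropy=1.000, Error=0.500
```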



Comparison among the impurity measures for binary classification problem

For a 2-class problem:
• p refers to the fraction of records that belong to one of the two classes.
• All three measures attain their maximum value when the class distribution is uniform (i.e., when p = 0.5).
• The minimum values are attained when all the records belong to the same class (i.e., when p equals 0 or 1).



To determine how well a test condition performs:
• We need to compare the degree of impurity of the parent node (before splitting) with the degree of impurity of the child nodes (after splitting).
• The larger their difference, the better the test condition.
• The gain, Δ, is a criterion that can be used to determine the goodness of a split:

Δ = I(parent) − Σ_{j=1}^{k} [ N(v_j) / N ] · I(v_j)

where I(·) is the impurity measure of a given node,
N is the total number of records at the parent node,
k is the number of attribute values, and
N(v_j) is the number of records associated with the child node v_j.

Decision tree induction algorithms often choose a test condition that maximizes the gain Δ. Since I(parent) is the same for all test conditions, maximizing the gain is equivalent to minimizing the weighted average impurity of the child nodes. When entropy is used as the impurity measure, the difference in entropy is known as the information gain, Δ_info.
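A minimal sketch of the gain computation (my illustration, not the book's code), using entropy as the impurity measure; the example split reuses the "Own Car?" class counts from the earlier slide.

```python
# Sketch: gain (delta) of a split = I(parent) - weighted average impurity of the children.
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def gain(parent_counts, children_counts, impurity=entropy):
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * impurity(child) for child in children_counts)
    return impurity(parent_counts) - weighted

# Illustrative split: parent (10, 10) split into children (6, 4) and (4, 6),
# as in the "Own Car?" example above.
print(round(gain((10, 10), [(6, 4), (4, 6)]), 4))  # information gain of the split
```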



Splitting of Binary Attributes

Suppose there are two ways (attribute A or attribute B) to split the data into smaller subsets. Before splitting, the Gini index of the parent node is 0.5, since there are an equal number of records from both classes.

If attribute A is chosen to split the data, the Gini index for node N1 is 0.4898 and for node N2 it is 0.480. The weighted average of the Gini index for the descendant nodes is (7/12) × 0.4898 + (5/12) × 0.480 = 0.486.

Similarly, the weighted average Gini index for attribute B is 0.375. Since the subsets for attribute B have a smaller Gini index, attribute B is preferred over attribute A.
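The quoted values for attribute A can be reproduced with the short sketch below; the per-node class counts N1 = (4, 3) and N2 = (2, 3) are my assumption (the original figure is not reproduced here), chosen because they yield exactly the Gini values stated above.

```python
# Sketch: weighted Gini index of a binary split, with assumed child class counts.
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def weighted_gini(children_counts):
    total = sum(sum(child) for child in children_counts)
    return sum(sum(child) / total * gini(child) for child in children_counts)

n1, n2 = (4, 3), (2, 3)                    # assumed class counts for attribute A's children
print(round(gini(n1), 4))                  # 0.4898
print(round(gini(n2), 4))                  # 0.48
print(round(weighted_gini([n1, n2]), 3))   # 0.486
```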



Splitting of Nominal Attributes
A nominal attribute can produce either binary or multiway splits.



Splitting of Continuous Attributes

A brute-force method for finding the best split value v is to consider every value of the attribute in the N records as a candidate split position. For efficient computation, first sort the records on the attribute values. For each candidate v, the data set is scanned once to count the number of records with annual income less than or greater than v. We then compute the Gini index for each candidate and choose the one that gives the lowest value.

Class:                  No    No    No    Yes   Yes   Yes   No    No    No    No
Sorted Annual Income:   60    70    75    85    90    95    100   120   125   220

Candidate split v:      55    65    72    80    87    92    97    110   122   172   230
Yes (<=v / >v):         0/3   0/3   0/3   0/3   1/2   2/1   3/0   3/0   3/0   3/0   3/0
No  (<=v / >v):         0/7   1/6   2/5   3/4   3/4   3/4   3/4   4/3   5/2   6/1   7/0
Gini:                   0.420 0.400 0.375 0.343 0.417 0.400 0.300 0.343 0.375 0.400 0.420

The best split is at v = 97, which gives the lowest Gini index, 0.300.
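A sketch (my own illustration) of this sorted-scan procedure for the Annual Income attribute; it enumerates midpoints between adjacent sorted values and reports the split with the lowest weighted Gini, matching the table above.

```python
# Sketch: finding the best binary split on a continuous attribute via a sorted scan.
incomes = [125, 100, 70, 120, 95, 60, 220, 85, 75, 90]   # annual income (K), from the training set
labels  = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts) if n else 0.0

def class_counts(ys):
    return (ys.count("Yes"), ys.count("No"))

def best_split(values, labels):
    data = sorted(zip(values, labels))
    n = len(data)
    best = None
    # Candidate split positions: midpoints between adjacent sorted values.
    for i in range(n - 1):
        v = (data[i][0] + data[i + 1][0]) / 2
        left = [y for x, y in data if x <= v]
        right = [y for x, y in data if x > v]
        w = (len(left) * gini(class_counts(left)) + len(right) * gini(class_counts(right))) / n
        if best is None or w < best[1]:
            best = (v, w)
    return best

print(best_split(incomes, labels))   # (97.5, 0.3): the v = 97 split in the table above
```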



This procedure can be further optimized by considering only candidate split positions located between two adjacent records with different class labels. Therefore, the candidate split positions at v = $55K, $65K, $72K, $87K, $92K, $110K, $122K, $172K, and $230K are ignored, because each is located between two adjacent records with the same class label. This reduces the number of candidate split positions from 11 to 2 (v = $80K and v = $97K).



Gain Ratio

Impurity measures such as entropy and the Gini index tend to favor attributes that have a large number of distinct values. For example, if we compare Gender and Car Type with Customer ID, Customer ID produces the purest partitions, yet it is not a predictive attribute.

A test condition that results in a large number of outcomes may not be desirable, because the number of records associated with each partition is too small to enable us to make any reliable predictions.

There are two strategies for overcoming this problem:
– Restrict the test conditions to binary splits only. This strategy is employed by decision tree algorithms such as CART.
– Modify the splitting criterion to take into account the number of outcomes produced by the attribute test condition. For example, in the C4.5 decision tree algorithm, a splitting criterion known as gain ratio is used to determine the goodness of a split:

Gain ratio = Δ_info / Split Info,   where Split Info = − Σ_{i=1}^{k} P(v_i) log2 P(v_i)

Here P(v_i) is the fraction of records assigned to child node v_i, and k is the total number of splits. If an attribute produces a large number of splits, its split information will also be large, which in turn reduces its gain ratio.
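A minimal sketch of the gain-ratio correction (assuming entropy as the impurity measure); the class counts reuse the Car Type and Student/Customer ID distributions from the earlier "best split" illustration.

```python
# Sketch: gain ratio = information gain / split info, penalizing many-way splits.
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def gain_ratio(parent_counts, children_counts):
    n = sum(parent_counts)
    weights = [sum(child) / n for child in children_counts]
    info_gain = entropy(parent_counts) - sum(w * entropy(child)
                                             for w, child in zip(weights, children_counts))
    split_info = -sum(w * log2(w) for w in weights if w > 0)
    return info_gain / split_info

parent = (10, 10)
car_type    = [(1, 3), (8, 0), (1, 7)]        # Family, Sports, Luxury
customer_id = [(1, 0)] * 10 + [(0, 1)] * 10   # one record per ID: pure but uninformative

print(round(gain_ratio(parent, car_type), 3))     # higher gain ratio
print(round(gain_ratio(parent, customer_id), 3))  # heavily penalized by its split info
```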



Example
Consider the training examples shown in Table 4.1 for a binary classification problem.

(a) Compute the Gini index for the overall collection of training examples.
(b) Compute the Gini index for the Customer ID attribute.
(c) Compute the Gini index for the Gender attribute.
(d) Compute the Gini index for the Car Type attribute using a multiway split.
(e) Compute the Gini index for the Shirt Size attribute using a multiway split.
(f) Which attribute is better: Gender, Car Type, or Shirt Size?
(g) Explain why Customer ID should not be used as the attribute test condition even though it has the lowest Gini.



Example
Consider the training examples shown in Table 4.1 for a binary classification problem.

(a) Compute the Gini index for the overall collection of training examples.
Answer:
Gini = 1 − 2 × 0.5² = 0.5.
(b) Compute the Gini index for the Customer ID attribute.
Answer:
The Gini for each Customer ID value is 0. Therefore, the overall Gini for Customer ID is 0.
(c) Compute the Gini index for the Gender attribute.
Answer:
The Gini for Male is 1 − 2 × 0.5² = 0.5. The Gini for Female is also 0.5. Therefore, the overall Gini for Gender is 0.5 × 0.5 + 0.5 × 0.5 = 0.5.



(d) Compute the Gini index for the Car Type attribute using a multiway split.
Answer:
The Gini for Family cars is 0.375, for Sports cars 0, and for Luxury cars 0.2188. The overall Gini is 0.1625.
(e) Compute the Gini index for the Shirt Size attribute using a multiway split.
Answer:
The Gini for Small shirt size is 0.48, for Medium 0.4898, for Large 0.5, and for Extra Large 0.5. The overall Gini for the Shirt Size attribute is 0.4914.
(f) Which attribute is better: Gender, Car Type, or Shirt Size?
Answer:
Car Type, because it has the lowest Gini among the three attributes.
(g) Explain why Customer ID should not be used as the attribute test condition even though it has the lowest Gini.
Answer:
The attribute has no predictive power, since new customers are assigned new Customer IDs. (A verification of the multiway-split computations is sketched below.)
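The answers above can be checked with the sketch below; the per-value (C0, C1) class counts are assumptions consistent with the quoted Gini values (they correspond to the counts in the textbook's Table 4.1, which is not reproduced here).

```python
# Sketch: verifying the multiway-split Gini computations for Car Type and Shirt Size.
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def multiway_gini(children_counts):
    total = sum(sum(child) for child in children_counts)
    return sum(sum(child) / total * gini(child) for child in children_counts)

# Assumed (C0, C1) counts per attribute value, consistent with the quoted answers.
car_type   = {"Family": (1, 3), "Sports": (8, 0), "Luxury": (1, 7)}
shirt_size = {"Small": (3, 2), "Medium": (3, 4), "Large": (2, 2), "Extra Large": (2, 2)}

print(round(multiway_gini(list(car_type.values())), 4))    # 0.1625
print(round(multiway_gini(list(shirt_size.values())), 4))  # 0.4914
```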

