Module 3

The document discusses classification and prediction tasks in machine learning. It provides examples of classification, where models predict categorical labels, and prediction, where models predict continuous values. The document outlines the two-step process of classification - the learning step to build a classifier from training data, and the classification step to apply the model to new data. It also discusses decision tree algorithms for classification, including how decision trees are constructed in a top-down recursive manner and used to classify data.

Classification

Eg1:- A bank loan officer needs to analyze data to learn which
loan applicants are “safe” and which are “risky” for the bank.
Eg2:- A marketing manager needs to analyze data to determine whether a
customer with a given profile will buy a new computer.
Eg3:- A medical researcher wants to analyze breast cancer data
in order to predict which one of three specific treatments a
patient should receive.
• In each of these examples, the data analysis task is
classification, where a model or classifier is constructed to
predict categorical labels.
– Safe or risky
– Yes or No
– Treatment A, Treatment B or Treatment C
• These categories can be represented by discrete values,
where the ordering among values has no meaning
Prediction
• Eg:- A marketing manager would like to predict how much a
given customer will spend during a sale at AllElectronics.
• This data analysis task is an example of numeric prediction.
• the model constructed predicts a continuous-valued function,
or ordered value, as opposed to a categorical label.
• This model is a predictor.
• Regression analysis is a statistical methodology used for
numeric prediction
Classification—A Two-Step Process
1. Learning Step – a classification model is constructed
2. Classification step – the model is used to predict class labels
for given data
• Learning: A classifier is built describing a set of predetermined
classes
– The classification algorithm builds a classifier by analyzing or learning
from a training set made up of database tuples and their associated
class labels
– Each tuple/sample is assumed to belong to a predefined class, as
determined by the class label attribute
– The set of tuples used for model construction: training set
– The model is represented as classification rules, decision trees, or
mathematical formulae
Classification—A Two-Step Process

• Classification step (model usage): for classifying future or
unknown objects
– Estimate accuracy of the model
• The known label of test sample is compared with the classified
result from the model
• Accuracy rate is the percentage of test set samples that are
correctly classified by the model
• Test set is independent of training set, otherwise over-fitting will
occur
• A tuple, X, is represented by an n-dimensional attribute
vector, X = (x1, x2, …….., xn), depicting n measurements made
on the tuple from n database attributes, respectively, A1,
A2,… , An.
• Each tuple, X, is assumed to belong to a predefined class as
determined by another database attribute called the class
label attribute.
• The class label attribute is discrete-valued and unordered.
• It is categorical in that each value serves as a category or
class.
• individual tuples making up the training set are referred to as
training tuples
• Supervised Learning – the class label of each training tuple is
given
• Unsupervised learning (clustering) – the class label of each
training tuple is not known, and the number or set of classes
to be learned may not be known in advance.
• Classification step
– Test data are used to estimate the accuracy of the
classification rules.
– The accuracy of a classifier on a given test set is the
percentage of test set tuples that are correctly classified by
the classifier.
– If the accuracy of the classifier is considered acceptable,
the classifier can be used to classify future data tuples for
which the class label is not known.
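As an illustration of the two-step process, here is a minimal sketch in Python, assuming scikit-learn and its bundled iris data purely for demonstration (any labeled dataset and classifier would do): the learning step builds a decision tree from the training set, and the classification step estimates the accuracy rate on an independent test set.

# Minimal sketch of the two-step classification process.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Keep the test set independent of the training set to avoid over-fitting bias.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Step 1: learning step - build the classifier from the training tuples.
clf = DecisionTreeClassifier().fit(X_train, y_train)

# Step 2: classification step - predict labels for unseen tuples
# and estimate the accuracy rate on the test set.
y_pred = clf.predict(X_test)
print("Accuracy rate:", accuracy_score(y_test, y_pred))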
Issues regarding classification and prediction:
Preparing the Data for Classification and Prediction
• Data cleaning
– Preprocess data in order to reduce noise and handle missing
values
• Relevance analysis (feature selection)
– Remove the irrelevant or redundant attributes
– Correlation analysis
– Attribute subset selection
• Data transformation
– Normalization and generalization
– Concept hierarchy
Comparing Classification and Prediction Methods

• Accuracy
– The accuracy of a classifier refers to the ability of a given
classifier to correctly predict the class label of new or
previously unseen data.
– the accuracy of a predictor refers to how well a given
predictor can guess the value of the predicted attribute for
new or previously unseen data.
• Speed
– refers to the computational costs involved in generating
and using the given classifier or predictor.
(time to construct the model, time to use the model)
• Robustness
– Refers to the ability to make correct predictions given noisy
data and missing values
• Scalability
– Ability to construct classifier or predictor efficiently given
large amounts of data
• Interpretability:
– Level of understanding and insight provided by the
model(classifier or predictor)
Classification by Decision Tree Induction
• Decision Tree Induction - learning of decision trees from
class-labeled training tuples
• Decision tree
– A flow-chart-like tree structure
– Internal node denotes a test on an attribute
– Branch represents an outcome of the test
– Leaf nodes represent class labels or class distribution
• Decision tree generation consists of two phases
– Tree construction
• At start, all the training examples are at the root
• Partition examples recursively based on selected
attributes
– Tree pruning
• Identify and remove branches that reflect noise or
outliers
• Use of decision tree: Classifying an unknown sample
– Test the attribute values of the sample against the decision
tree
• How are decision trees used for classification?
• Given a tuple, X, for which the associated class label is
unknown, the attribute values of the tuple are tested against
the decision tree.
• A path is traced from the root to a leaf node, which holds the
class prediction for that tuple.
• Decision trees can easily be converted to classification rules.
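As a small sketch of this tracing process, consider a hypothetical tree (attribute names and outcomes are illustrative, loosely modeled on the buys_computer example) stored as nested Python dictionaries, with classification as a walk from the root to a leaf.

# Hypothetical decision tree stored as nested dicts.
# Internal nodes test an attribute; leaves hold a class label.
tree = {
    "attribute": "age",
    "branches": {
        "youth":       {"attribute": "student",
                        "branches": {"yes": "buys_computer=yes", "no": "buys_computer=no"}},
        "middle_aged": "buys_computer=yes",
        "senior":      {"attribute": "credit_rating",
                        "branches": {"fair": "buys_computer=yes", "excellent": "buys_computer=no"}},
    },
}

def classify(node, tuple_x):
    """Trace a path from the root to a leaf and return the class prediction."""
    while isinstance(node, dict):                 # internal node: test an attribute
        value = tuple_x[node["attribute"]]        # outcome of the test on this tuple
        node = node["branches"][value]            # follow the matching branch
    return node                                   # leaf: class label

print(classify(tree, {"age": "senior", "student": "no", "credit_rating": "fair"}))
# -> buys_computer=yes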
Why are decision tree classifiers so popular?

• Decision Tree
– construction does not require any domain knowledge or
parameter setting
– It is appropriate for exploratory knowledge discovery.
– It can handle high dimensional data.
– The learning and classification steps of decision tree
induction are simple and fast.
– They generally have good accuracy.
– Application areas: medicine, manufacturing and
production, financial analysis, astronomy, molecular
biology, etc.
• In the early 1980s, J. Ross Quinlan, a researcher in machine
learning, developed a decision tree algorithm known as ID3
(Iterative Dichotomiser).
• Quinlan later presented C4.5 (a successor of ID3).
• In 1984, a group of statisticians (L. Breiman, J. Friedman, R.
Olshen, and C. Stone) published the book Classification and
Regression Trees (CART), which described the generation of
binary decision trees.
• ID3 and CART were invented independently of one another at
around the same time, yet follow a similar approach for learning
decision trees from training tuples.
• ID3, C4.5, and CART adopt a greedy (nonbacktracking) approach
in which decision trees are constructed in a top-down recursive
divide-and-conquer manner.
Algorithm for Decision Tree Induction
• Basic algorithm (a greedy algorithm)
– Tree is constructed in a top-down recursive divide-and-conquer manner
– At start, all the training examples are at the root
– Attributes are categorical (if continuous-valued, they are discretized in
advance)
– Examples are partitioned recursively based on selected attributes
– Test attributes are selected on the basis of a heuristic or statistical
measure (e.g., information gain)
• Conditions for stopping partitioning
– All samples for a given node belong to the same class
– There are no remaining attributes for further partitioning – majority
voting is employed for classifying the leaf
– There are no samples left
Basic algorithm for inducing a decision tree from training tuples
Algorithm: Generate decision tree. Generate a decision tree from the training tuples
of data partition D.
• Input: Data partition, D, which is a set of training tuples and their associated class
labels;
• attribute list, the set of candidate attributes;
• Attribute selection method, a procedure to determine the splitting criterion that
“best” partitions the data tuples into individual classes. This criterion consists of a
splitting attribute and, possibly, either a split point or splitting subset.
• Output: A decision tree.
• Method:
• (1) create a node N;
• (2) if tuples in D are all of the same class, C, then
• (3) return N as a leaf node labeled with the class C;
• (4) if attribute list is empty then
• (5) return N as a leaf node labeled with the majority class in D; // majority
voting
Basic algorithm for inducing a decision tree from training tuples(contd.)
• (6) apply Attribute selection method(D, attribute list) to find the “best” splitting
criterion;
• (7) label node N with splitting criterion;
• (8) if splitting attribute is discrete-valued and
multiway splits allowed then // not restricted to binary trees
• (9) attribute list = attribute list - splitting attribute; // remove splitting attribute
• (10) for each outcome j of splitting criterion
• // partition the tuples and grow subtrees for each partition
• (11) let Dj be the set of data tuples in D satisfying outcome j; // a partition
• (12) if Dj is empty then
• (13) attach a leaf labeled with the majority class in D to node N;
• (14) else attach the node returned by Generate decision tree(Dj, attribute list)
to node N;
• endfor
• (15) return N;
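The listing above can be sketched in Python roughly as follows. This is a simplified illustration only, assuming categorical attributes and information gain as the Attribute selection method, with no pruning; step (13) is approximated by not growing branches for outcomes that do not occur in D.

import math
from collections import Counter

def info(tuples):
    """Expected information (entropy) needed to classify the tuples."""
    counts = Counter(label for _, label in tuples)
    total = len(tuples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def generate_decision_tree(D, attribute_list):
    """D is a list of (attribute_dict, class_label) pairs."""
    labels = [label for _, label in D]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:            # steps (2)-(3): all tuples in the same class
        return labels[0]
    if not attribute_list:               # steps (4)-(5): majority voting
        return majority

    # Step (6): attribute selection by highest information gain.
    def gain(A):
        partitions = {}
        for x, label in D:
            partitions.setdefault(x[A], []).append((x, label))
        info_A = sum(len(Dj) / len(D) * info(Dj) for Dj in partitions.values())
        return info(D) - info_A

    best = max(attribute_list, key=gain)
    node = {"attribute": best, "branches": {}, "majority": majority}
    remaining = [a for a in attribute_list if a != best]   # step (9)

    # Steps (10)-(14): partition the tuples and grow a subtree for each outcome.
    partitions = {}
    for x, label in D:
        partitions.setdefault(x[best], []).append((x, label))
    for outcome, Dj in partitions.items():
        node["branches"][outcome] = generate_decision_tree(Dj, remaining)
    # Outcomes not present in D get no branch here; the listing's step (13)
    # would instead attach a leaf labeled with the majority class (kept in
    # node["majority"]).
    return node            # step (15)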
Attribute Selection Measure
• A heuristic for selecting the splitting criterion that “best”
separates a given data partition, D, of class-labeled training
tuples into individual classes.
• Also known as splitting rules because they determine how the
tuples at a given node are to be split.
• Provides a ranking for each attribute describing the given training
tuples.
• The attribute having the best score for the measure is chosen as
the splitting attribute for the given tuples.
• If the splitting attribute is continuous-valued or if we are
restricted to binary trees then either a split point or a splitting
subset must also be determined as part of the splitting criterion.
• three popular attribute selection measures—
– Information gain, gain ratio, and gini index.
Information Gain
• ID3 uses Information Gain as attribute selection method.
• Node N holds (represents) the tuples of partition D.
• The attribute with the highest information gain is chosen as
the splitting attribute for node N. This attribute minimizes
the information needed to classify the tuples in the resulting
partitions
• To partition tuples in D on some attribute A having v distinct
values {a1,a2,……..av},
• If A is discrete valued, these values correspond to v outcomes
of a test on A.
• Attribute A can be used to split D into v partitions
{D1,D2,…….,Dv}, where Dj contains tuples in D having
outcome aj of A.
• These partitions correspond to the branches grown from
node N.
• If attribute A is continuous-valued, we must determine the
best split-point for A.
These partitions may be impure (i.e., a partition may contain
tuples from different classes rather than from a single class).
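For reference, the standard ID3 definitions behind these statements are:

Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i), where p_i is the proportion of tuples in D belonging to class C_i (estimated as |C_{i,D}| / |D|);

Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \, Info(D_j), the expected information still needed to classify the tuples after partitioning D on A;

Gain(A) = Info(D) - Info_A(D), the information gained by branching on A.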
Information Gain-Continuous valued attribute
• determine the “best” split-point for A
• sort the values of A in increasing order.
• the midpoint between each pair of adjacent values is a
possible split-point.
• Given v values of A, there are v-1 possible split points.
• The midpoint between each pair of adjacent values ai and ai+1 of A is
(ai + ai+1) / 2.
• The point with the minimum expected information
requirement for A is selected as the split point for A.
• D1 is the set of tuples in D satisfying A ≤ split point, and
• D2 is the set of tuples in D satisfying A > split point.
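A short Python sketch of this split-point search (the helper names are my own; the score is the expected information requirement over the two resulting partitions, following Info_A(D) above):

import math
from collections import Counter

def info(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def best_split_point(values, labels):
    """Return the split point of a continuous attribute with the minimum
    expected information requirement (midpoints of adjacent sorted values)."""
    pairs = sorted(zip(values, labels))
    best, best_info = None, float("inf")
    for i in range(len(pairs) - 1):
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # identical adjacent values give no new midpoint
        split = (pairs[i][0] + pairs[i + 1][0]) / 2.0
        d1 = [lab for v, lab in pairs if v <= split]
        d2 = [lab for v, lab in pairs if v > split]
        info_a = (len(d1) * info(d1) + len(d2) * info(d2)) / len(pairs)
        if info_a < best_info:
            best, best_info = split, info_a
    return best

# Example with hypothetical ages and class labels.
print(best_split_point([23, 35, 41, 52], ["no", "yes", "yes", "no"]))   # -> 29.0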
Gain Ratio
• The information gain measure prefers attributes with a large
number of values.
• For example, consider an attribute that acts as a unique identifier,
such as product_ID. A split on product_ID would result in a large
number of partitions (as many as there are values), each one containing
just one tuple.
• Because each partition is pure, the information required to
classify data set D based on this partitioning would be
Info_product_ID(D) = 0.
• Therefore, the information gained by partitioning on this
attribute is maximal.
• such a partitioning is useless for classification.
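For reference, the gain ratio used by C4.5 normalizes the information gain as follows:

SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2\!\left(\frac{|D_j|}{|D|}\right)

GainRatio(A) = \frac{Gain(A)}{SplitInfo_A(D)}

The attribute with the maximum gain ratio is selected as the splitting attribute; the SplitInfo term penalizes attributes such as product_ID that split the data into very many small partitions.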
Gini Index
The Gini index, used in CART, measures the impurity of a data partition D
and considers a binary split for each attribute.
To determine the best binary split on A, we examine all of the possible
subsets that can be formed using known values of A.
Each subset, SA, can be considered as a binary
test for attribute A of the form “A ∈ SA?”.
Given a tuple, this test is satisfied if the value of A for the tuple is among
the values listed in SA.
For a continuous-valued attribute, the midpoint between each pair of (sorted)
adjacent values is taken as a possible split-point.
The point giving the minimum Gini index for a given (continuous-valued) attribute is
taken as the split-point of that attribute.
For a possible split-point of A, D1 is the set of tuples in D satisfying A ≤ split point, and
D2 is the set of tuples in D satisfying A > split point.
• buys_computer = yes – 9 tuples
• buys_computer = no – 5 tuples
• Gini index to compute the impurity of D (Gini(D) = 1 − Σ pi², summed over the m classes):
Gini(D) = 1 − (9/14)² − (5/14)² = 0.459
• To find the splitting criterion for the tuples in D, we compute the
Gini index for each attribute.
• We start with the attribute income and consider each of the
possible splitting subsets.
• Consider the subset {low, medium}.
• This would result in 10 tuples in partition D1 satisfying the
condition “income ∈ {low, medium}”; the remaining 4 tuples form D2.
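These numbers can be checked with a few lines of Python (the class counts come from the example above; the function names are my own):

def gini(counts):
    """Gini index of a partition given its per-class tuple counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Impurity of D: 9 tuples with buys_computer = yes, 5 with buys_computer = no.
print(round(gini([9, 5]), 3))                     # -> 0.459

def gini_split(counts1, counts2):
    """Weighted Gini index of a binary split of D into D1 and D2."""
    n1, n2 = sum(counts1), sum(counts2)
    n = n1 + n2
    return (n1 / n) * gini(counts1) + (n2 / n) * gini(counts2)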
Tree Pruning
• When a decision tree is built, many of the branches will reflect
anomalies in the training data due to noise or outliers.
• Tree pruning methods address this problem of overfitting the
data.
• Pruned trees tend to be smaller and less complex and, thus,
easier to comprehend.
• They are usually faster and better at correctly classifying
independent test data than unpruned trees.
• There are two common approaches to tree pruning:
• prepruning and postpruning.
- Prepruning: Halt tree construction early—do not split a
node if this would result in the goodness measure falling
below a threshold
• Upon halting, the node becomes a leaf. The leaf may hold the
most frequent class among the subset tuples
• measures such as statistical significance, information gain,
Gini index, and so on can be used to assess the goodness of a
split.
• Difficult to choose an appropriate threshold
– Postpruning: Remove branches(sub trees) from a “fully
grown” tree—get a sequence of progressively pruned
trees.
• A subtree at a given node is pruned by removing its branches
and replacing it with a leaf. The leaf is labeled with the most
frequent class among the subtree being replaced.
• A set of data different from the training data, called a pruning
set, is used to decide which is the “best pruned tree.”
C4.5 uses a method called pessimistic pruning, which is
similar to the cost complexity method in that it also uses
error rate estimates to make decisions regarding subtree
pruning. Pessimistic pruning, however, does not require the
use of a prune set. Instead, it uses the training set to estimate
error rates.
Postpruning requires more computation than prepruning, yet
generally leads to a more reliable tree. No single pruning
method has been found to be superior to all others.
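As a rough sketch of postpruning with a separate pruning set (a reduced-error style simplification, not C4.5's pessimistic pruning), assuming the nested-dict tree format used in the induction sketch earlier, where internal nodes carry "attribute", "branches", and the "majority" class of their training tuples:

def classify(node, x):
    """Trace a tuple from the root of a (sub)tree to a leaf."""
    while isinstance(node, dict):
        node = node["branches"][x[node["attribute"]]]
    return node

def postprune(node, prune_data):
    """Bottom-up: replace a subtree with a leaf holding its majority class
    whenever that does not reduce accuracy on the pruning tuples reaching it.
    Assumes every attribute value in prune_data has a branch in the tree."""
    if not isinstance(node, dict) or not prune_data:
        return node
    # Prune the children first, each on the pruning tuples routed to it.
    for outcome in list(node["branches"]):
        reaching = [(x, y) for x, y in prune_data
                    if x[node["attribute"]] == outcome]
        node["branches"][outcome] = postprune(node["branches"][outcome], reaching)
    # Compare the subtree against a single leaf labeled with the majority class.
    leaf_acc = sum(y == node["majority"] for _, y in prune_data) / len(prune_data)
    tree_acc = sum(classify(node, x) == y for x, y in prune_data) / len(prune_data)
    return node["majority"] if leaf_acc >= tree_acc else node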
Bayesian Classification: Why?

• Probabilistic learning: can predict class membership probabilities,
such as the probability that a given tuple belongs to a particular
class.
• Incremental: Each training example can incrementally
increase/decrease the probability that a hypothesis is correct.
Prior knowledge can be combined with observed data.
• Probabilistic prediction: Predict multiple hypotheses, weighted
by their probabilities
• Standard: Even when Bayesian methods are computationally
intractable, they can provide a standard of optimal decision
making against which other methods can be measured
Bayesian Classification
• A statistical classifier
• Predict class membership probabilities
• Naïve Bayesian classifier is comparable in
performance with decision tree and selected
neural network classifiers.
• Based on Bayes’ Theorem
Bayes’ Theorem
• Let X be a data tuple.
• In Bayesian terms, X is considered “evidence.”
• Let H be some hypothesis, such as that the data tuple X belongs
to a specified class C.
• For classification problems, we want to determine P(H/X), the
probability that the hypothesis H holds given the “evidence”
or tuple X.
• P(H/X) is the posterior probability.
• we are looking for the probability that tuple X belongs to class
C, given that we know the attribute description of X.
• P(H/X) is the posterior probability of H conditioned on X.
• Eg:- X is a 35-year-old customer with an income of $40,000.
• H is the hypothesis that our customer will buy a computer.
• P(H/X) reflects the probability that customer X will buy a computer
given that we know the customer’s age and income.
• In contrast, P(H) is the prior probability, or a priori probability, of H.
• This is the probability that any given customer will buy a computer,
regardless of age, income, or any other information, for that
matter.
• The posterior probability, P(H/X), is based on more information
(e.g., customer information) than the prior probability, P(H), which
is independent of X.
• P(X/H) is the posterior probability of X conditioned on H. It is the
probability that a customer, X, is 35 years old and earns $40,000,
given that we know the customer will buy a computer.
• P(X) is the prior probability of X. It is the probability that a person
from our set of customers is 35 years old and earns $40,000.
• P(H), P(X/H), and P(X) may be estimated from the given data.
• Bayes’ theorem provides a way of calculating the posterior
probability, P(H/X), from P(H), P(X/H), and P(X).
• Bayes’ theorem is
P(H/X) = P(X/H) P(H) / P(X)
Naïve Bayesian Classifier
• Let D be a training set of tuples and their associated class
labels.
• A tuple X is an n-dimensional attribute vector X = (x1, x2, …, xn)
with n attributes, A1, A2, …, An.
• There are ‘m’ classes, C1,C2,….,Cm.
• Given a tuple, X, the classifier will predict that X belongs to
the class having the highest posterior probability, conditioned
on X.
• The naïve Bayesian classifier predicts that tuple X belongs to
the class Ci if and only if
P(Ci/X) > P(Cj/X)   for 1 ≤ j ≤ m, j ≠ i.
• As P(X) is constant for all classes, only P(X/Ci)P(Ci) need be
maximized.
• If the class prior probabilities are not known, it is commonly
assumed that the classes are equally likely, that is, P(C1) =
P(C2) = …… = P(Cm), and we would therefore maximize
P(X/Ci). Otherwise, we maximize P(X/Ci)P(Ci).
• The class prior probabilities may be estimated by
P(Ci) = |Ci,D| / |D|,
where |Ci,D| is the number of training tuples of class Ci in D.
• Given data sets with many attributes, it would be extremely
computationally expensive to compute P(X/Ci).
• In order to reduce computational complexity, the naive
assumption of class conditional independence is made- the
values of the attributes are conditionally independent of one
another, given the class label of the tuple.

• Thus, P(X/Ci) = P(x1/Ci) × P(x2/Ci) × …… × P(xn/Ci).
• The probabilities P(x1/Ci), P(x2/Ci), ……, P(xn/Ci) can be
estimated from the training tuples.
• xk refers to the value of attribute Ak for tuple X.
• For each attribute, we look at whether the attribute is
categorical or continuous-valued.
• If Ak is categorical, then P(xk/Ci) is the number of tuples of class Ci
in D having the value xk for Ak, divided by |Ci,D|.
• If Ak is continuous-valued, it is typically assumed to have a Gaussian
distribution, and we need to compute μCi and σCi, which are the mean
(i.e., average) and standard deviation, respectively, of the values of
attribute Ak for training tuples of class Ci.
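A minimal sketch of these estimates for categorical attributes (the function and variable names are my own, and no smoothing is applied, mirroring the description above):

from collections import Counter, defaultdict

def train_naive_bayes(D):
    """D is a list of (attribute_dict, class_label) pairs.
    Returns class priors P(Ci) and conditional estimates P(xk/Ci)."""
    class_counts = Counter(label for _, label in D)
    priors = {c: n / len(D) for c, n in class_counts.items()}
    cond = defaultdict(Counter)                 # (class, attribute) -> value counts
    for x, label in D:
        for attr, value in x.items():
            cond[(label, attr)][value] += 1
    return priors, cond, class_counts

def predict(x, priors, cond, class_counts):
    """Choose the class Ci maximizing P(X/Ci) * P(Ci) under the
    class-conditional independence assumption."""
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        score = prior
        for attr, value in x.items():
            score *= cond[(c, attr)][value] / class_counts[c]   # P(xk/Ci)
        if score > best_score:
            best_class, best_score = c, score
    return best_class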
Bayesian Belief Networks

A belief network is defined by two components—a directed
acyclic graph and a set of conditional probability tables. Each
node in the directed acyclic graph represents a random variable.
Each arc represents a probabilistic dependence.
If an arc is drawn from a node Y to a node Z, then Y is a parent or
immediate predecessor of Z, and Z is a descendant of Y.
Each variable is conditionally independent of its nondescendants in
the graph, given its parents.
• Having lung cancer is influenced by a person’s family history
of lung cancer, as well as whether or not the person is a
smoker.
• PositiveXRay is independent of whether the patient has a
family history of lung cancer or is a smoker, given that we
know the patient has lung cancer.
• In other words, once we know the outcome of the variable
LungCancer, then the variables FamilyHistory and Smoker do
not provide any additional information regarding
PositiveXRay.
• The arcs also show that the variable LungCancer is
conditionally independent of Emphysema, given its parents,
FamilyHistory and Smoker.
• A belief network has one conditional probability table (CPT)
for each variable.
• The CPT for a variable Y specifies the conditional distribution
P(Y | Parents(Y)), where Parents(Y) are the parents of Y.
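For illustration, the CPT for the LungCancer node given its parents FamilyHistory and Smoker could be stored as a simple lookup table; the probability values below are hypothetical placeholders, not taken from the slides.

# Hypothetical CPT: P(LungCancer = yes | FamilyHistory, Smoker).
# Keys are (family_history, smoker); the values are illustrative only.
cpt_lung_cancer = {
    (True,  True):  0.80,
    (True,  False): 0.50,
    (False, True):  0.70,
    (False, False): 0.10,
}

def p_lung_cancer(value, family_history, smoker):
    """Return P(LungCancer = value | parents) from the CPT."""
    p_yes = cpt_lung_cancer[(family_history, smoker)]
    return p_yes if value else 1.0 - p_yes

print(p_lung_cancer(True, family_history=False, smoker=True))   # -> 0.7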

A node within the network can be selected as an “output”
node, representing a class label attribute.
