Unit 4-1

The document discusses various estimation methods in machine learning, focusing on parametric and nonparametric approaches, including decision trees and discriminant functions. It explains the structure and functioning of decision trees, their dynamic nature, and the concepts of entropy and information gain. Additionally, it covers linear and quadratic discriminant functions, their advantages and disadvantages, and the geometric interpretation of decision boundaries for classification tasks.


Decision Tree

Parametric estimation
• A single model is defined for the entire input space (the domain of all
possible inputs).

• The parameters of this model are learned using the entire training dataset.

• Once the model is trained, the same set of parameters is applied to any
new test input, regardless of its specific location in the input space.
Nonparametric Estimation
• The input space is divided into local regions based on a distance measure
(e.g., the Euclidean norm, which measures how far points are from each
other in space).

• For each test input, a local model is created using the training data within
its corresponding region.
Instance-Based Models
• Nonparametric models that rely on training data instances directly to make
predictions (e.g., k-nearest neighbors).

• Prediction requires calculating the distance from the test input to every training
instance, which is computationally expensive, with a time complexity of O(N),
where N is the number of training instances (see the sketch below).
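A minimal sketch of this O(N) computation, assuming NumPy and a tiny illustrative dataset; the knn_predict helper is hypothetical, not part of the notes:

import numpy as np

def knn_predict(x_test, X_train, y_train, k=3):
    # Euclidean distance from the test input to every training instance: O(N)
    distances = np.linalg.norm(X_train - x_test, axis=1)
    # Indices of the k closest training instances
    nearest = np.argsort(distances)[:k]
    # Majority vote over their labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny illustrative dataset
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(np.array([0.95, 0.9]), X_train, y_train, k=3))  # -> 1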
Decision Tree
• Applies a divide-and-conquer strategy to solve problems.

• A decision tree is a structured, hierarchical representation of data. It
branches out like a tree, where each internal node represents a decision
based on an attribute, and each leaf node represents an outcome or a
class.

• Can be used for classification (assigning labels to data) or regression
(predicting numerical values).
• Internal Decision Nodes: Represent points in the tree where decisions
are made based on a specific condition or test.

• Terminal Leaves: Represent the endpoints of the tree. These contain
the output value or class label for the given input.

• Decision Nodes: Each node applies a test function to the input data.

• Branches: Paths connecting decision nodes, corresponding to the test
outcomes.

• Leaf Node: The tree ends here, and the value in the leaf node is the
final output for the input.
• A learning algorithm takes a labeled training dataset (data where the
inputs and their corresponding outputs are known) to construct the
decision tree.

• The tree is built by splitting the data repeatedly based on attributes to minimize
errors and improve predictive accuracy (a minimal example follows this list).

• Alternatively, a rule base can be learned directly from the data, bypassing the tree structure.

• Unlike parametric models (e.g., linear regression), decision trees do not
assume any predefined distribution (e.g., Gaussian) for the class
densities (how data points are distributed within each class).
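A minimal sketch of learning a tree from labeled data, assuming scikit-learn is available; the toy dataset, criterion, and depth are illustrative choices, not prescribed by the notes:

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy labeled training data: inputs and their known class labels
X = [[2.0, 3.0], [1.0, 1.0], [3.5, 2.0], [4.0, 4.5], [5.0, 1.5], [4.5, 4.0]]
y = [0, 0, 0, 1, 1, 1]

# Build the tree by repeatedly splitting on the attribute/threshold that most
# reduces impurity (no assumption about class densities is made)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3)
tree.fit(X, y)

# Internal nodes hold the tests, leaves hold the predicted class
print(export_text(tree, feature_names=["x1", "x2"]))
print(tree.predict([[4.2, 3.8]]))  # class label for a new input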
Dynamic Tree Structure
• The tree is built incrementally during the training process, adding branches
and leaves based on the complexity and characteristics of the data.

• The data is split recursively at decision nodes, based on conditions that
reduce uncertainty (e.g., using measures such as Gini impurity or information
gain).

• If the data shows a complex relationship between input features and output
classes, the tree will add more splits, branches, and leaves to accurately
model those relationships.
• Entropy quantifies the amount of disorder or uncertainty in a dataset.
A lower entropy means the dataset is more homogeneous (e.g., all
data points belong to the same class), while higher entropy indicates
more heterogeneity (e.g., data points are mixed across multiple
classes).
• Information Gain (IG) is the difference between the entropy of the
dataset before the split and the weighted entropy of the subsets after
the split (see the sketch below).
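A small sketch of these two measures, assuming NumPy; the example split is illustrative:

import numpy as np

def entropy(labels):
    # Entropy of a set of class labels, in bits
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent_labels, subsets):
    # Entropy before the split minus the weighted entropy of the subsets
    n = len(parent_labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])          # mixed classes -> entropy 1.0
split  = [np.array([0, 0, 0, 1]), np.array([0, 1, 1, 1])]
print(entropy(parent))                  # 1.0
print(information_gain(parent, split))  # ~0.19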
Linear Discrimination
• The fundamental assumption is that instances of different classes can
be separated by a linear decision boundary.

• A hyperplane in the feature space can effectively distinguish between
instances belonging to different categories.

• Set of discriminant functions for classification: gj(x), j = 1, ..., K; choose Ci if gi(x) = max_j gj(x).

• Prior probabilities: P(Cj)

• Class likelihoods: p(x | Cj)

• Posterior densities: P(Cj | x) = p(x | Cj) P(Cj) / p(x)

• Discriminant-based classification: assume a model directly for the
discriminant, bypassing the estimation of likelihoods or posteriors.

• Model for discriminant: gj(x | Φj), where Φj denotes the parameters of the model for class Cj.

• The discriminant function directly assigns class labels to input
samples without explicitly estimating likelihoods or posterior probabilities.
Inductive Bias in Discriminant-Based Classification:
• Instead of making an assumption on the form of the class densities, make an
assumption on the form of the boundaries separating classes.
Examples: Perceptron, Support Vector Machines (SVM), Logistic Regression
Linear discriminant:
• Used mainly because of its low space and time complexity, O(d), where d is the number of input dimensions.

• Before using complex models like neural networks or kernel-based SVMs, first try
a linear discriminant model to check if it performs well.
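A minimal sketch of classifying with a set of linear discriminants gj(x) = wj^T x + wj0, assuming NumPy; the weights below are illustrative placeholders, whereas in practice they would be learned (e.g., by gradient descent, discussed later):

import numpy as np

# Illustrative (not learned) parameters for K = 3 classes, d = 2 features:
# one weight vector w_j and bias w_j0 per class
W  = np.array([[ 1.0, -0.5],
               [-0.2,  0.8],
               [ 0.3,  0.3]])
w0 = np.array([0.1, -0.4, 0.0])

def classify(x):
    # Evaluate g_j(x) = w_j^T x + w_j0 for every class, O(K*d)
    g = W @ x + w0
    # Choose the class C_i with the largest g_i(x)
    return int(np.argmax(g))

print(classify(np.array([2.0, 1.0])))  # index of the predicted class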
Quadratic Discriminant Function:
• Allows for more complex decision boundaries than a linear model: gj(x) = x^T Wj x + wj^T x + wj0.

Advantages:
• Captures complex decision boundaries

Disadvantages:
• High computational cost
• Risk of overfitting
• An alternative is to map the data to a higher-dimensional space and apply a linear model.
• Instead of directly using a quadratic discriminant, preprocess the input by
adding higher-order features (see the sketch after this list).

• Original input: x = [x1, x2]^T

• New transformed variables: z1 = x1, z2 = x2, z3 = x1^2, z4 = x2^2, z5 = x1·x2

• New feature vector: z = [z1, z2, z3, z4, z5]^T, so a linear discriminant in z corresponds to a quadratic discriminant in x.

Advantages:
• Computational complexity is low
• More interpretable

Disadvantages:
• Risk of overfitting
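A minimal sketch of this preprocessing idea, assuming NumPy and scikit-learn; the quad_features helper and the toy data are illustrative, and logistic regression stands in for any linear discriminant applied in the transformed space (the same pattern works for the generalized basis functions below):

import numpy as np
from sklearn.linear_model import LogisticRegression

def quad_features(X):
    # Map x = (x1, x2) to z = (x1, x2, x1^2, x2^2, x1*x2)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])

# Toy data whose classes are not linearly separable in the original space
X = np.array([[0.0, 0.0], [0.2, -0.1], [2.0, 2.0], [-2.0, 2.1],
              [1.9, -2.0], [-2.1, -1.9]])
y = np.array([0, 0, 1, 1, 1, 1])

# A linear discriminant in z gives a quadratic boundary in x
model = LogisticRegression().fit(quad_features(X), y)
print(model.predict(quad_features(np.array([[0.1, 0.1], [2.2, -2.2]]))))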
Generalized approach:
• Basis functions: gj(x) = Σk wjk φjk(x), where each φjk(x) is a basis function of the input.

Examples:
• Polynomial basis: φjk(x) are products of powers of the inputs (e.g., x1, x2, x1^2, x1·x2)

• Trigonometric basis: φjk(x) are sines and cosines of the inputs (e.g., sin(x1), cos(x1))
Geometry of the Linear Discriminant:
• Two classes: g(x) = g1(x) − g2(x) = w^T x + w0; choose C1 if g(x) > 0, otherwise choose C2.

• Decision Hyperplane: g(x) = 0

• The weight vector w defines the orientation of the hyperplane.

• The bias/threshold w0 defines the position of the hyperplane.
• Geometric Interpretation of the Decision Hyperplane
• Any point x in the input space can be decomposed into two components:
x = xp + r·(w/||w||), where xp is the projection of x onto the hyperplane and
r = g(x)/||w|| is the signed distance of x from the hyperplane.
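A small numeric sketch of this decomposition, assuming NumPy and illustrative values for w, w0, and x:

import numpy as np

# Illustrative hyperplane parameters and input point
w, w0 = np.array([2.0, 1.0]), -3.0
x = np.array([3.0, 2.0])

g = w @ x + w0                        # g(x) > 0 -> choose C1, g(x) < 0 -> C2
r = g / np.linalg.norm(w)             # signed distance of x from the hyperplane
x_p = x - r * w / np.linalg.norm(w)   # projection of x onto g(x) = 0

print(g, r)           # discriminant value and signed distance
print(w @ x_p + w0)   # ~0: x_p lies on the decision hyperplane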
Multiple Classes:
• Each class Cj has its own linear discriminant gj(x) = wj^T x + wj0; the input is assigned to the class Ci with the maximum gi(x).
Pairwise separation:
It uses K(K − 1)/2 linear discriminants, gij(x), one for every pair of
distinct classes (a voting sketch follows).
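A minimal sketch of pairwise (one-vs-one) classification by voting, assuming NumPy; the per-pair weights are random placeholders rather than trained discriminants:

import numpy as np
from itertools import combinations

K, d = 3, 2
rng = np.random.default_rng(0)

# One illustrative (not learned) linear discriminant g_ij per class pair:
# g_ij(x) > 0 votes for class i, otherwise for class j
pairwise = {(i, j): (rng.standard_normal(d), rng.standard_normal())
            for i, j in combinations(range(K), 2)}   # K(K-1)/2 = 3 pairs

def classify(x):
    votes = np.zeros(K, dtype=int)
    for (i, j), (w, w0) in pairwise.items():
        votes[i if w @ x + w0 > 0 else j] += 1
    return int(np.argmax(votes))   # class with the most pairwise wins

print(classify(np.array([0.5, -1.0])))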

Parametric Discrimination Revisited:
• When the class densities p(x | Cj) are Gaussian with a shared covariance matrix, the log odds log[P(C1 | x)/P(C2 | x)] is linear in x, which motivates modelling the discriminant (or the log likelihood ratio) directly as a linear function.
Gradient Descent:
Optimize the parameters of the discriminant function to minimize
classification error on the training set, by iteratively updating the weights in the
direction of the negative gradient: w ← w − η ∂E/∂w, where η is the learning rate.
Logistic discrimination
Two classes: the log likelihood ratio is assumed to be linear: log[p(x | C1)/p(x | C2)] = w^T x + w0.

• x may be composed of discrete attributes or may be a mixture of
continuous and discrete attributes.
• Using Bayes' rule, the posterior is P(C1 | x) = sigmoid(w^T x + w0).
• Sigmoid function: sigmoid(a) = 1 / (1 + exp(−a)) (see the sketch below).
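A small sketch tying the sigmoid posterior to the gradient-descent update described above, assuming NumPy; the toy data, learning rate, and iteration count are illustrative:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy two-class data; a column of ones is appended so the bias w0 is learned too
X = np.array([[0.5, 1.0], [1.0, 0.5], [3.0, 3.5], [3.5, 3.0]])
r = np.array([0.0, 0.0, 1.0, 1.0])
X_aug = np.column_stack([X, np.ones(len(X))])

w = np.zeros(X_aug.shape[1])   # [w1, w2, w0]
eta = 0.1                      # learning rate (illustrative)

for _ in range(1000):
    y = sigmoid(X_aug @ w)     # P(C1 | x) for every training input
    grad = X_aug.T @ (y - r)   # gradient of the cross-entropy error
    w -= eta * grad            # gradient-descent update: w <- w - eta * dE/dw

print(w)                             # learned [w1, w2, w0]
print(sigmoid(X_aug @ w).round(2))   # posteriors move toward the 0/1 targets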
Multiple classes: the posteriors are modeled with the softmax function, P(Ci | x) = exp(wi^T x + wi0) / Σj exp(wj^T x + wj0).
Discrimination by regression
• Probabilistic model: r = y + ε, where ε ~ N(0, σ²) and y = sigmoid(w^T x + w0); maximizing the likelihood is then equivalent to minimizing the sum of squared errors.