MLT Unit-3 Important Questions

The document discusses basic terminology used in decision trees such as root node, splitting, decision node, leaf/terminal node, pruning, branch/sub-tree, and parent and child node. It also explains various decision tree learning algorithms like ID3, C4.5, and CART. Issues related to decision tree applications and advantages and disadvantages of K-nearest neighbor algorithm are also covered.


Q1. Describe the basic terminology used in decision trees.

Ans. The basic terms used in decision trees are:

1. Root node: Represents the entire population or sample, which is then divided into two or more homogeneous sets.
2. Splitting: The process of dividing a node into two or more sub-nodes.
3. Decision node: A sub-node that splits into further sub-nodes is called a decision node.
4. Leaf/terminal node: Nodes that do not split are called leaf or terminal nodes.
5. Pruning: The process of removing sub-nodes from a decision node; it is the opposite of splitting.
6. Branch/sub-tree: A sub-section of the entire tree is called a branch or sub-tree.
7. Parent and child node: A node that is divided into sub-nodes is the parent node of those sub-nodes, and the sub-nodes are its children.
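These relationships can be made concrete with a small sketch. The following is a minimal, hypothetical Python class (not tied to any library), using the familiar weather/play example purely for illustration:

```python
class Node:
    """A decision tree node. A node with children is a decision node
    (or the root); a node without children is a leaf/terminal node."""

    def __init__(self, label, children=None):
        self.label = label              # splitting attribute, or class label at a leaf
        self.children = children or {}  # branch value -> child Node

    def is_leaf(self):
        return not self.children

# The root node covers the whole sample and splits on "Outlook";
# each child is either a further decision node or a leaf.
root = Node("Outlook", {
    "Sunny": Node("Humidity", {   # decision node (parent of two leaves)
        "High": Node("No"),       # leaf/terminal node
        "Normal": Node("Yes"),    # leaf/terminal node
    }),
    "Overcast": Node("Yes"),      # leaf
    "Rain": Node("No"),           # leaf
})
```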
Q2. Explain various decision tree learning algorithms.

Ans. Various decision tree learning algorithms are:

1. ID3 (Iterative Dichotomiser 3):

i. The ID3 algorithm builds a decision tree from a given dataset.
ii. To build the tree, ID3 performs a top-down, greedy search through the given sets, testing each attribute at every tree node to determine which attribute is most effective for classifying a specific example.
iii. The test attribute for the current node is therefore chosen as the attribute with the largest information gain.
iv. This algorithm favours smaller decision trees over larger ones. Since it does not always build the smallest tree, it is a heuristic algorithm.
v. ID3 accepts only categorical attributes when creating a decision tree model, and it does not produce accurate results when the data are noisy or when it is used serially.
vi. Consequently, the data are preprocessed before the decision tree is built.
vii. Information gain is computed for every attribute considered in the construction of the decision tree; the attribute with the highest information gain becomes the root node, and arcs labelled with its possible values lead to the remaining sub-trees.
viii. The examples along each branch are then examined to see whether they all belong to the same class. If they do, the branch ends in a leaf labelled with that single class name; otherwise, the instances are partitioned again according to the splitting attribute.
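As a concrete illustration of ID3's selection criterion, here is a minimal sketch in plain Python (the function and variable names are my own, chosen for illustration) that computes entropy and the information gain of a candidate attribute:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a collection of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Reduction in entropy achieved by splitting on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Toy example: attribute 1 ("windy") vs. the play/don't-play label.
rows = [("sunny", True), ("sunny", False), ("rain", True), ("rain", False)]
labels = ["no", "no", "no", "yes"]
print(information_gain(rows, labels, 1))  # ID3 compares this across attributes
```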
2. C4.5:

i. The C4.5 algorithm is used to create a decision tree and is an improvement on the ID3 algorithm.
ii. C4.5 is referred to as a statistical classifier since it creates decision trees that can be used for classification.
iii. It handles both continuous and discrete attributes, as well as missing values, and prunes trees after construction, making it superior to the ID3 method.
iv. C5.0 is the commercial successor to C4.5; it is quicker, uses less memory, and creates smaller decision trees.
v. C4.5 performs a tree pruning operation by default. This results in smaller trees, simpler rules, and more intuitive interpretations.
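A well-known refinement in C4.5 is that it selects splits by gain ratio rather than raw information gain, which penalizes attributes with many distinct values. A minimal sketch, reusing the entropy and information_gain helpers from the ID3 example above:

```python
def split_info(rows, attr_index):
    """Intrinsic information of a split: the entropy of the partition
    sizes, ignoring class labels."""
    values = [row[attr_index] for row in rows]
    return entropy(values)

def gain_ratio(rows, labels, attr_index):
    """Information gain normalized by split information."""
    si = split_info(rows, attr_index)
    if si == 0:        # attribute takes a single value; splitting is useless
        return 0.0
    return information_gain(rows, labels, attr_index) / si
```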
3. CART (Classification And Regression Trees):

i. The CART algorithm builds both classification and regression trees.
ii. CART constructs the classification tree through binary splitting of the attributes.
iii. The Gini index is used for selecting the splitting attribute.
iv. CART is also used for regression analysis with the help of the regression tree.
v. The regression feature of CART can be used to forecast a dependent variable from a set of predictor variables over a given period of time.
vi. CART has average processing speed and supports both continuous and nominal attribute data.
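For comparison with ID3's entropy criterion, the following minimal sketch (plain Python, illustrative names) computes the Gini index of a label set and the size-weighted Gini of a binary split, which CART seeks to minimize:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: the chance of mislabelling a randomly drawn item
    if it were labelled according to the class distribution."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_of_split(left_labels, right_labels):
    """Size-weighted Gini impurity of a binary split."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n * gini(left_labels)
            + len(right_labels) / n * gini(right_labels))

print(gini(["yes", "yes", "no", "no"]))             # 0.5: maximally impure
print(gini_of_split(["yes", "yes"], ["no", "no"]))  # 0.0: a pure split
```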

Q3. Explain inductive bias with an inductive system.

Ans. Inductive bias:

1. Inductive bias refers to the limitations imposed by the assumptions of the learning method.
2. Assume, for instance, that a solution to the problem of road safety can be described as a conjunction of a set of eight concepts.
3. This does not permit more intricate expressions that cannot be stated as conjunctions.
4. Because of this inductive bias, we are unable to consider some potential solutions, since they fall outside the version space we have examined.
5. For a learner to be unbiased, the version space would have to contain every hypothesis that could possibly be expressed.
6. The solution such a learner arrives at could then never be more general than the complete collection of training data.
7. In other words, it would be able to classify data it had already seen (much as a rote learner does), but it would be unable to generalize in order to classify data it had never seen before.
8. The candidate elimination algorithm's inductive bias prevents it from classifying fresh data unless all of the hypotheses in its version space assign the same classification to it.
9. As a result, the learning approach is constrained by its inductive bias.
Inductive system:

1. An inductive system takes the training examples and a new instance to be classified as its inputs.
2. Using the candidate elimination algorithm, it searches the version space and outputs a classification of the new instance, or no classification if the hypotheses in the version space disagree on it.
3. The inductive bias of such a system is implicit: it is determined by the hypothesis space that the algorithm searches.

Q4. Discuss the issues related to the applications of decision trees.

Ans. Issues related to the applications of decision trees are:

1. Missing data:

 a. Values may have gone unrecorded, or they may be too expensive to obtain.
 b. Two problems then arise:
  i. How to classify an object for which one of the test attributes is missing.
  ii. How to modify the information gain formula when examples have unknown values for an attribute.
2. Multi-valued attributes:

 a. The information gain measure gives an inaccurate indication of an attribute's usefulness when the attribute has a wide range of possible values.
 b. In the worst case, we might use an attribute that has a unique value for each sample.
 c. The information gain measure would then take its highest value for this attribute, even though the attribute might be unimportant or meaningless, since each group of samples would be a singleton with a distinct classification.
 d. Using the gain ratio (as sketched under C4.5 above) is one option.
3. Continuous and integer valued input attributes:

 a. Attributes such as height and weight have an endless number of possible values.
 b. Rather than creating an endless number of branches, decision tree learning algorithms find the split point that provides the best information gain.
 c. Finding suitable split points can be done efficiently using dynamic programming techniques, but this is still the most expensive step in practical decision tree learning applications; a sketch of this search appears after this list.
4. Continuous-valued output attributes:

 a. A regression tree is required when attempting to predict a numerical value, such as the cost of an artwork, as opposed to a discrete classification.
 b. Rather than having a single value at each leaf, such a tree has a linear function of a subset of the numerical attributes.
 c. The learning algorithm must determine when to stop splitting and start applying linear regression over the remaining attributes.
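The split-point search described in issue 3 can be sketched as follows; this is a minimal, self-contained Python illustration (the names are my own), checking candidate thresholds midway between consecutive sorted values:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_split_point(values, labels):
    """Return the threshold on a continuous attribute that maximizes
    information gain, along with that gain."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_threshold = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                  # no boundary here
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(labels)
        if base - remainder > best_gain:
            best_gain, best_threshold = base - remainder, threshold
    return best_threshold, best_gain

# e.g. heights in cm with a yes/no label
print(best_split_point([150, 160, 170, 180], ["no", "no", "yes", "yes"]))
```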
Q5. What are the advantages and disadvantages of the K-nearest neighbour algorithm?

Ans. Advantages of the KNN algorithm:

1. No training period:
 a. KNN is referred to as a lazy learner (instance-based learning).
 b. It does not learn anything during the training phase; no discriminative function is derived from the training data.
 c. In other words, it requires no training. It draws on the stored training dataset only when making real-time predictions.
 d. As a result, KNN is significantly quicker than algorithms that require a training phase, such as SVM, linear regression, etc.
2. As the KNN algorithm does not need to be trained before producing predictions, new data can be supplied without affecting the algorithm's accuracy.
3. KNN is quite simple to implement. It needs only two parameters: the value of K and the distance function (for example, Euclidean distance); a minimal sketch follows.
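To make point 3 concrete, here is a minimal from-scratch sketch (plain Python; all names are illustrative) in which the value of K and a distance function really are the only parameters:

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest neighbours."""
    neighbours = sorted(zip(train_points, train_labels),
                        key=lambda pair: distance(pair[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Toy 2-D example with two clusters; no training step is needed.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, (2, 2), k=3))  # -> "A"
```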
Disadvantages of KNN:

1. Does not work well with large datasets: On large datasets, the algorithm's speed suffers because of the high cost of computing the distance between each new point and every existing point.
2. Does not work well with high dimensions: The KNN technique does not perform well on high-dimensional data, because with many dimensions it becomes difficult for the algorithm to compute meaningful distances.
3. Needs feature scaling: Before applying the KNN method to any dataset, we must perform feature scaling (standardization or normalization). Without this, KNN may produce inaccurate predictions; a scaling sketch follows this list.
4. Sensitive to noisy data, missing values and outliers: KNN is sensitive to noise in the dataset. Outliers must be removed, and missing values must be filled in manually.
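A minimal standardization sketch for point 3 (plain Python, assuming purely numeric feature columns), showing the scaling step applied when features live on very different ranges:

```python
def standardize(column):
    """Scale one feature column to zero mean and unit variance so that
    no single feature dominates the distance computation."""
    mean = sum(column) / len(column)
    std = (sum((x - mean) ** 2 for x in column) / len(column)) ** 0.5 or 1.0
    return [(x - mean) / std for x in column]

# Height in cm and income in dollars are on wildly different scales;
# without scaling, income would dominate any Euclidean distance.
heights = standardize([150, 160, 170, 180])
incomes = standardize([30000, 45000, 60000, 90000])
```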
Q6. What are the benefits of CBL as a lazy problem-solving method?

Ans. The benefits of CBL as a lazy problem-solving method are:

1. Ease of knowledge elicitation:
a. Instead of rules that are hard to extract, lazy methods can use readily available cases or problem instances.
b. As a result, case gathering and structuring take the place of traditional knowledge engineering.
2. Absence of problem-solving bias:
a. Cases can be utilized for many different types of problem solving, since they are recorded in a raw format.
b. This contrasts with eager methods, which are limited to serving the original purpose for which they were created.
3. Incremental learning:
a. A CBL system can be implemented with a small number of solved cases serving as the case base.
b. More cases are added to the case base over time, enhancing the system's capacity for problem solving.
c. In addition to expanding the case base, it is also possible to build new indexes and cluster types, or to modify those that already exist.
d. In contrast, whenever information extraction (knowledge generalization) is carried out, a specific training period is needed.
e. As a result, a CBL system can adapt dynamically, online, to a changing environment.
4. Suitability for complex and not fully formalized solution spaces:
a. CBL systems can be applied to a problem domain with an incomplete model; implementation requires only the identification of pertinent case features and the provision of appropriate cases, possibly from a partial case base.
b. Lazy techniques are more suitable for complex solution spaces than eager approaches, which replace the supplied data with abstractions acquired through generalization.
5. Suitability for sequential problem solving:
a. Preserving history in the form of a sequence of states or operations is beneficial for sequential tasks, such as reinforcement learning problems.
b. Such storage is facilitated by lazy approaches.
6. Ease of explanation:
a. The results of a CBL system can be justified by the similarity of the current problem to the retrieved case.
b. CBL systems are easy to relate to earlier cases, making system failures easier to analyse.
7. Ease of maintenance: CBL systems can easily adapt to numerous changes in the problem domain and the relevant environment simply by acquiring new cases.
