MLT Unit-3 Important Questions
1. Missing data:
a. A continuous attribute such as height or weight can take an endless
number of possible values.
b. Rather than creating an endless number of branches, decision tree
learning algorithms search for the split point that gives the best
information gain (see the sketch after this list).
c. Suitable split points can be found efficiently using dynamic
programming techniques, but this remains the most expensive step in
practical decision tree learning applications.
4. Continuous-valued output attributes:
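The following is a minimal, illustrative sketch (not taken from these notes) of the split-point search described above for a continuous attribute: candidate thresholds are placed between consecutive sorted values and the one giving the highest information gain is kept. The names best_split and entropy, and the toy height data, are hypothetical.

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_split(values, labels):
    # Return the (threshold, information gain) that best splits a continuous
    # feature into a left (<= threshold) and a right (> threshold) branch.
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_thr, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        # Only midpoints between distinct consecutive values are candidates.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        gain = (base
                - (len(left) / len(pairs)) * entropy(left)
                - (len(right) / len(pairs)) * entropy(right))
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain

heights = [150, 160, 165, 170, 180, 185]                      # toy continuous attribute
classes = ['short', 'short', 'short', 'tall', 'tall', 'tall']
print(best_split(heights, classes))                           # (167.5, 1.0)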
1. No training period:
a. KNN is referred to as a lazy learner (Instance-based learning).
b. It learns nothing during the training phase: no discriminative
function is derived from the training data.
c. In other words, it requires no training. It only consults the
stored training dataset when making real-time predictions.
d. As a result, KNN's training phase is significantly faster than
that of algorithms such as SVM or Linear Regression, which require
explicit training.
2. Because the KNN algorithm needs no training before producing predictions,
new data can be added at any time without affecting the algorithm's accuracy.
3. KNN is quite simple to implement: it needs only two parameters, the
value of K and a distance function (for example, Euclidean distance). A
minimal implementation is sketched after this list.
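Below is a minimal KNN sketch (illustrative only, not the notes' own code) showing the lazy behaviour described above: fit merely stores the data, and all distance computation happens at prediction time. The class name KNNClassifier and the toy data are hypothetical.

import math
from collections import Counter

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" is just memorizing the dataset -- nothing is learned here.
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, point):
        # Euclidean distance from the query point to every stored point.
        dists = sorted(
            (math.dist(point, x), label) for x, label in zip(self.X, self.y)
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in dists[:self.k])
        return votes.most_common(1)[0][0]

knn = KNNClassifier(k=3).fit([(1, 1), (2, 1), (8, 9), (9, 8)], ['A', 'A', 'B', 'B'])
print(knn.predict((1.5, 1.2)))   # 'A'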
Disadvantages of KNN:
1. Does not work well with large datasets: on large datasets the algorithm
becomes slow, because the distance between each new point and every stored
training point must be computed at prediction time.
2. Does not work well with high dimensions: KNN performs poorly on
high-dimensional data because, as the number of dimensions grows, distances
between points become less meaningful (the curse of dimensionality) and
genuinely near neighbours are hard to identify.
3. Needs feature scaling: before applying KNN to any dataset, we must
perform feature scaling (standardization or normalization). Without it,
KNN may produce inaccurate predictions (see the scaling sketch after
this list).
4. Sensitive to noisy data, missing values and outliers: KNN is sensitive
to noise in the dataset. Outliers must be removed and missing values
imputed manually.
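The sketch below (a toy example, not from the notes) illustrates why feature scaling matters for KNN: without it, the large-range income feature would dominate the Euclidean distance and the age feature would be effectively ignored. The standardize helper and the data are hypothetical.

def standardize(column):
    # Z-score standardization: (x - mean) / standard deviation.
    mean = sum(column) / len(column)
    std = (sum((x - mean) ** 2 for x in column) / len(column)) ** 0.5
    return [(x - mean) / std for x in column]

ages = [25, 32, 47, 51]                    # values in the tens
incomes = [30000, 42000, 80000, 95000]     # values in the tens of thousands

# After standardization both features have mean 0 and unit variance,
# so they contribute comparably to the Euclidean distance used by KNN.
scaled = list(zip(standardize(ages), standardize(incomes)))
print(scaled)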
Q6. What are the benefits of CBL as a lazy problem solving method?