Long ML
QUESTION BANK
B.Tech - Computer Science and Engineering (5th sem)
Subject: Machine Learning
KModes clustering is one of the unsupervised Machine Learning algorithms that is used to
cluster categorical variables.
You might be wondering: why KModes clustering when we already have KMeans?
KMeans uses mathematical measures (distance) to cluster continuous data. The lesser the distance, the
more similar our data points are. Centroids are updated by means.
But for categorical data points, we cannot calculate the distance. So we go for the KModes algorithm, which uses
the dissimilarities (total mismatches) between the data points. The lesser the dissimilarities, the more
similar our data points are. It uses modes instead of means.
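A minimal sketch of this idea in plain Python (the helper names matching_dissimilarity and cluster_mode and the toy records are illustrative, not taken from any particular library): the dissimilarity between two categorical records is the number of attributes on which they disagree, and a cluster centre is updated by taking the per-attribute mode.

from collections import Counter

def matching_dissimilarity(a, b):
    # Number of attributes on which two categorical records disagree.
    return sum(x != y for x, y in zip(a, b))

def cluster_mode(records):
    # Update rule: the new "centroid" is the per-attribute mode of the cluster.
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))

# Toy categorical data: (colour, size, shape)
cluster = [("red", "small", "round"),
           ("red", "large", "round"),
           ("blue", "small", "round")]

centre = cluster_mode(cluster)                                       # ('red', 'small', 'round')
print(centre)
print(matching_dissimilarity(("blue", "small", "square"), centre))   # 2 mismatches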
11 Explain DBSCAN clustering algorithm.
Density-Based Clustering Algorithms
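As a brief illustration, here is a minimal sketch assuming scikit-learn's DBSCAN implementation; eps (the neighbourhood radius) and min_samples (the number of points needed to form a dense region) are its two key parameters, and points that belong to no dense region are labelled -1 (noise).

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaving half-moons: a shape K-Means handles poorly but density-based clustering handles well.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)

labels = db.labels_                      # cluster index per point, -1 means noise
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters found:", n_clusters)
print("noise points  :", np.sum(labels == -1))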
A dataset contains a huge number of input features in various cases, which makes the predictive model more
complicated. Because it is very difficult to visualize or make predictions for a training dataset with a high number of
features, dimensionality reduction techniques are required for such cases.
A dimensionality reduction technique can be defined as "a way of converting a higher-dimensional dataset into a
lower-dimensional dataset while ensuring that it provides similar information." These techniques are widely used
in machine learning for obtaining a better-fitting predictive model while solving classification and regression problems.
It is commonly used in fields that deal with high-dimensional data, such as speech recognition, signal processing,
bioinformatics, etc. It can also be used for data visualization, noise reduction, cluster analysis, etc.
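As a hedged illustration, the sketch below uses PCA from scikit-learn (chosen here as one common dimensionality reduction technique, not one named above) to project a 64-dimensional dataset down to 2 dimensions:

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 64-dimensional digit images reduced to 2 dimensions, e.g. for visualization.
X, y = load_digits(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("original shape:", X.shape)                       # (1797, 64)
print("reduced shape :", X_2d.shape)                    # (1797, 2)
print("variance kept :", pca.explained_variance_ratio_.sum())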
There are mainly three ways to implement reinforcement learning in ML, which are:
1. Value-based:
The value-based approach is about finding the optimal value function, which is the maximum value at a state
under any policy. Therefore, the agent expects the long-term return at any state(s) under policy π.
2. Policy-based:
The policy-based approach is to find the optimal policy for the maximum future rewards without using the value
function. In this approach, the agent tries to apply such a policy that the action performed in each step helps to
maximize the future reward.
The policy-based approach has mainly two types of policy (see the sketch after this list):
o Deterministic: The same action is produced by the policy (π) at any state.
o Stochastic: In this policy, probability determines the produced action.
3. Model-based: In the model-based approach, a virtual model is created for the environment, and the agent
explores that environment to learn it. There is no particular solution or algorithm for this approach because the
model representation is different for each environment.
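A minimal sketch of the deterministic vs. stochastic policy distinction from the list above (the state names, actions and probabilities are made up for illustration):

import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"s0": "RIGHT", "s1": "UP", "s2": "RIGHT"}

# Stochastic policy: each state maps to a probability distribution over actions.
stochastic_policy = {
    "s0": {"RIGHT": 0.8, "UP": 0.2},
    "s1": {"UP": 0.6, "RIGHT": 0.4},
    "s2": {"RIGHT": 1.0},
}

def act_deterministic(state):
    return deterministic_policy[state]

def act_stochastic(state):
    actions, probs = zip(*stochastic_policy[state].items())
    return random.choices(actions, weights=probs, k=1)[0]

print(act_deterministic("s0"))   # always RIGHT
print(act_stochastic("s0"))      # RIGHT with probability 0.8, UP with probability 0.2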
16 Explain the concept of the Bellman equation in reinforcement learning with an example.
According to the Bellman Equation, the long-term reward for a given action is equal to the reward from
the current action combined with the expected reward from the future actions taken at the following
time steps. Let's try to understand it first.
Let's take an example:
Here we have a maze which is our environment, and the sole goal of our agent is to reach the goal
state (R = 1), i.e., to get a good reward, and to avoid the fire state, because entering it is a failure and will
get a bad reward.
V(s) = max_a ( R(s, a) + γ V(s') )
where s' is the next state, R(s, a) is the reward for taking action a in state s, and γ is the discount factor.
The max denotes the most optimum action among all the actions that the agent can take in a given
state, which can lead to the reward after repeating this process at every consecutive step.
For example:
• The state to the left of the fire state (V = 0.9) can go UP, DOWN or RIGHT but NOT LEFT because of the
wall (not accessible). Among all the available actions, the maximum value for that state comes from
the UP action.
• The current starting state of our agent can choose any random action, UP or RIGHT, since both
lead towards the reward with the same number of steps.
By using the Bellman equation, our agent will calculate the value of every state except for the target state and
the fire state (V = 0); they cannot have values since they are the end of the maze.
So, after making such a plan, our agent can easily accomplish its goal by just following the increasing
values.
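A minimal sketch of how the update V(s) = max_a ( R(s, a) + γ V(s') ) can be applied repeatedly, here on a tiny made-up corridor of states rather than the exact maze above: the terminal goal and fire states keep a fixed value, and every other state's value grows towards the goal.

# States 0..5 form a corridor; state 5 is the goal (terminal), state 0 is fire (terminal).
# Actions: LEFT (-1) or RIGHT (+1). Reward is 1 for entering the goal, 0 otherwise.
GAMMA = 0.9
N_STATES = 6
TERMINAL = {0, 5}

def reward(next_state):
    return 1.0 if next_state == 5 else 0.0

V = [0.0] * N_STATES                       # terminal states keep V = 0 (end of the maze)

for _ in range(50):                        # repeat the Bellman update until values settle
    for s in range(N_STATES):
        if s in TERMINAL:
            continue
        candidates = []
        for step in (-1, +1):              # LEFT or RIGHT
            s_next = min(max(s + step, 0), N_STATES - 1)
            candidates.append(reward(s_next) + GAMMA * V[s_next])
        V[s] = max(candidates)             # V(s) = max_a ( R(s, a) + γ V(s') )

print([round(v, 2) for v in V])            # values increase towards the goal state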
An activation function is just a function that you use to get the output of a node. It is also known
as a Transfer Function.
It is used to determine the output of a neural network, like yes or no. It maps the resulting
values into a range such as 0 to 1 or -1 to 1, etc. (depending upon the function).
As you can see, the linear activation function is a line, i.e., linear. Therefore, the output of the function will not be
confined to any range.
Fig: Linear Activation Function
Equation : f(x) = x
It doesn't help with the complexity or various parameters of the usual data that is fed to the neural
networks.
The Nonlinear Activation Functions are the most used activation functions. Nonlinearity helps
to make the graph look something like this:
Fig: Non-linear Activation Function
It makes it easy for the model to generalize or adapt to a variety of data and to differentiate
between the outputs.
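A minimal sketch contrasting the linear activation f(x) = x with two common nonlinear activations (sigmoid and ReLU), using NumPy:

import numpy as np

def linear(x):
    # Linear activation: f(x) = x, output is not confined to any range.
    return x

def sigmoid(x):
    # Nonlinear activation: squashes values into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Nonlinear activation: passes positive values, zeroes out negatives.
    return np.maximum(0.0, x)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print("linear :", linear(x))      # [-5. -1.  0.  1.  5.]
print("sigmoid:", sigmoid(x))     # values between 0 and 1
print("relu   :", relu(x))        # [0. 0. 0. 1. 5.]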
Applications:
o Machine Translation
o Robot Control
o Time Series Prediction
o Speech Recognition
o Speech Synthesis
o Time Series Anomaly Detection
o Rhythm Learning
o Music Composition
20 Explain different methods of ensemble learning.
There are 3 most common ensemble learning methods in machine learning. These are as follows:
o Bagging
o Boosting
o Stacking
However, we will mainly discuss stacking in this topic.
1. Bagging
Bagging is a method of ensemble modeling which is primarily used to solve supervised machine learning problems. It is
generally completed in two steps, as follows:
o Bootstrapping: a random sampling method that is used to derive samples from the data using the replacement
procedure. In this method, random data samples are first fed to the primary model, and then a base learning algorithm
is run on the samples to complete the learning process.
o Aggregation: a step that involves the process of combining the output of all base models and, based on their output,
predicting an aggregate result with greater accuracy and reduced variance.
Example: In the Random Forest method, predictions from multiple decision trees are ensembled in parallel. Further, in regression
problems, we use an average of these predictions to get the final output, whereas, in classification problems, the model selects the
predicted class by majority vote.
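A minimal sketch of bagging, assuming scikit-learn's RandomForestClassifier (which bootstraps the training samples and aggregates the trees' predictions) on a synthetic dataset:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree is trained on a bootstrap sample; predictions are aggregated by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("bagging (random forest) accuracy:", forest.score(X_test, y_test))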
2. Boosting
Boosting is an ensemble method that enables each member to learn from the preceding member's mistakes and make better
predictions for the future. Unlike the bagging method, in boosting all base learners (weak) are arranged in a sequential manner so that
they can learn from the mistakes of their preceding learner. Hence, in this way, all weak learners get turned into strong learners and
make a better predictive model with significantly improved performance.
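A minimal sketch of boosting, assuming scikit-learn's AdaBoostClassifier, where weak learners are fitted sequentially and each one puts more weight on the examples its predecessors got wrong:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Weak learners (shallow decision trees by default) are trained one after another,
# each focusing on the mistakes of the previous ones.
booster = AdaBoostClassifier(n_estimators=100, random_state=0)
booster.fit(X_train, y_train)
print("boosting (AdaBoost) accuracy:", booster.score(X_test, y_test))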
3. Stacking
Stacking is one of the popular ensemble modeling techniques in machine learning. Various weak learners are ensembled in a parallel
manner in such a way that, by combining them with meta-learners, we can make better predictions for the future.
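A minimal sketch of stacking, assuming scikit-learn's StackingClassifier: the base learners' predictions become the inputs of a meta-learner (here a logistic regression) that produces the final prediction.

from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# The meta-learner is trained on the base learners' out-of-fold predictions.
stack = StackingClassifier(estimators=base_learners, final_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print("stacking accuracy:", stack.score(X_test, y_test))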