AIML Final Copy
2. Explain decision tree terminology. Ans:- A decision tree comprises a root node, leaf nodes, branch nodes, parent/child nodes, etc. The terminology is explained below.
Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is then divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split further once a leaf node is reached.
Splitting: Splitting is the process of dividing a decision node (or the root node) into sub-nodes according to the given conditions.
Branch/Sub-Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing unwanted branches from the tree.
Parent/Child Node: The root node of the tree is called the parent node, and the other nodes are called child nodes.
3. How does the Decision Tree algorithm work for classification? Ans:- To predict the class of a given dataset, the algorithm starts from the root node of the tree. It compares the value of the root attribute with the corresponding attribute of the record (the real dataset) and, based on the comparison, follows a branch and jumps to the next node. At the next node, the algorithm again compares the attribute value with those of the sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree.
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM), i.e. information gain or the Gini index.
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; such a final node is called a leaf node. A code sketch follows.
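A minimal sketch of the above, assuming scikit-learn (the notes do not name a library); the iris dataset, split ratio, and random_state values are illustrative choices, not from the text:

```python
# Minimal sketch: growing and querying a decision tree classifier.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # illustrative dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# criterion selects the Attribute Selection Measure: "entropy" uses
# information gain, "gini" uses the Gini index.
tree = DecisionTreeClassifier(criterion="entropy", random_state=42)
tree.fit(X_train, y_train)          # Steps 1-5: grow the tree recursively

# Prediction traverses the tree from root to a leaf for each sample.
print("Accuracy:", tree.score(X_test, y_test))
```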
5. Explain entropy reduction, information gain and Gini index in decision tree. Ans:- While implementing a decision tree, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes. To solve such problems there is a technique called the Attribute Selection Measure (ASM). Using this measure, we can easily select the best attribute for the nodes of the tree. Two popular ASM techniques are:
Information Gain: Information gain is the measurement of the change in entropy after the segmentation of a dataset based on an attribute. It calculates how much information a feature provides us about a class. According to the value of information gain, we split the node and build the decision tree. A decision tree algorithm always tries to maximize information gain, and the node/attribute having the highest information gain is split first. It can be calculated using the formula:
Information Gain = Entropy(S) − [(weighted average) × Entropy(each feature)]
Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies the randomness in the data. For a two-class problem it can be calculated as:
Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
where S is the set of samples, P(yes) is the probability of "yes", and P(no) is the probability of "no".
Gini Index: The Gini index is a measure of impurity or purity used while creating a decision tree in the CART (Classification and Regression Tree) algorithm. An attribute with a low Gini index should be preferred over one with a high Gini index. The Gini index can be calculated using the formula:
Gini Index = 1 − Σ_j (P_j)²
CART creates only binary splits, and it uses the Gini index to create them. These measures are computed in the sketch below.
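A small sketch of these three formulas in NumPy; the 9-yes/5-no split used in the worked example is a hypothetical dataset chosen for illustration:

```python
import numpy as np

def entropy(labels):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the classes i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    """Gini(S) = 1 - sum(p_i^2) over the classes i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(parent, subsets):
    """IG = Entropy(parent) - weighted average entropy of the subsets."""
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

# Worked example: 9 "yes" and 5 "no" samples split on a binary feature.
parent = ["yes"] * 9 + ["no"] * 5
left, right = ["yes"] * 6 + ["no"] * 2, ["yes"] * 3 + ["no"] * 3
print(round(entropy(parent), 3), round(gini(parent), 3))   # 0.94 0.459
print(round(information_gain(parent, [left, right]), 3))   # 0.048
```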
6. What are the advantages and limitations of decision trees?
Advantages of the Decision Tree: It is simple to understand, as it follows the same process a human follows while making a decision in real life. It can be very useful for solving decision-related problems. It helps in thinking about all the possible outcomes of a problem. It requires less data cleaning compared to other algorithms.
Disadvantages of the Decision Tree: A decision tree may contain many layers, which makes it complex. It may have an overfitting issue, which can be addressed using the Random Forest algorithm. For more class labels, the computational complexity of the decision tree may increase.
11. Explain random forest terminology. Bagging: Given a training set of N examples, we repeatedly sample subsets of the training data of size n, where n is less than N. Sampling is done at random but with replacement. This subsampling of a training set is called bootstrap aggregating, or bagging for short. Random subspace method: If each training example has M features, we take a subset of them of size m < M to train each estimator. So no estimator sees the full training set; each estimator sees only m features of n training examples. Training estimators: We create Ntree decision trees, or estimators, and train each one on a different set of m features and n training examples. The trees are not pruned, as they would be when training a simple decision tree classifier. Inference by aggregating the estimators' predictions: To make a prediction for a new incoming example, we pass the relevant features of this example to each of the Ntree estimators. We obtain Ntree predictions, which we need to combine to produce the overall prediction of the random forest. In the case of classification, we use majority voting to decide on the predicted class, and in the case of regression, we take the mean value of the predictions of all the estimators, as in the sketch below.
13. What is the difference between a simple decision tree and a random forest?
18. What are the pros and cons of using Naive Bayes? Ans:- Advantages of the Naïve Bayes Classifier: Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset. It can be used for binary as well as multi-class classification. It performs well in multi-class predictions compared to other algorithms. It is the most popular choice for text classification problems. Disadvantages of the Naïve Bayes Classifier: Naive Bayes assumes that all features are independent or unrelated, so it cannot learn the relationships between features. The predictors are required to be independent, but in most real-life cases the predictors are dependent, and this hinders the performance of the classifier.
23. What is Support Vector Machine? Ans:- Support Vector Machine, or SVM, is one of the most popular supervised learning algorithms, used for classification as well as regression problems. However, it is primarily used for classification problems in machine learning. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that we can easily put new data points into the correct category in the future. This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support Vector Machine. Consider a diagram in which there are two different categories classified using a decision boundary, or hyperplane.
27. Explain Support Vector Machine terminology. Ans:- Support Vector Machines are part of the supervised learning family, with an associated learning algorithm. SVM is a powerful and flexible algorithm used for classification, regression, and outlier detection. It is used in high-dimensional spaces, where each data item is plotted as a point in n-dimensional space such that each feature value corresponds to the value of a specific coordinate. The classification is made on the basis of a hyperplane/line with as wide a margin as possible, which distinguishes between the two categories as clearly as possible. Basically, support vectors are the observational points of the individual classes, whereas the support vector machine defines the boundary that differentiates one class from another. Some significant SVM terminology is given below:
Support Vectors: These are the data points or feature vectors lying nearest to the hyperplane. They help in defining the separating line.
Hyperplane: A subspace whose dimension is one less than that of the feature space. It is used to separate different objects into their distinct categories. The best hyperplane is the one with the maximum separation distance between the two classes.
Margin: The (perpendicular) distance from a data point to the decision boundary. There are two types of margins: good margins and bad margins. A good margin is a large margin, and a bad margin is a small one.
The main goal of SVM is to find the maximum-margin hyperplane, so as to segregate the dataset into distinct classes. It proceeds as follows: first, SVM produces hyperplanes repeatedly, each separating the classes in the best possible way; then we look for the hyperplane that gives the best segregation, as in the sketch below.
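A minimal sketch, assuming scikit-learn's SVC; the synthetic two-blob dataset and the query point are illustrative:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two linearly separable categories in 2-D (synthetic data).
X, y = make_blobs(n_samples=100, centers=2, random_state=6)

clf = SVC(kernel="linear", C=1.0)  # fits the maximum-margin hyperplane
clf.fit(X, y)

# The extreme points that define the hyperplane:
print("Support vectors:\n", clf.support_vectors_)
print("Prediction for a new point:", clf.predict([[0.0, 0.0]]))
```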
Hyperparameters in detail
35. Explain the principle of Logistic Regression. Logistic Function (Sigmoid Function): The sigmoid function is a mathematical function used to map predicted values to probabilities. It maps any real value into another value within the range 0 to 1. The value of the logistic regression must be between 0 and 1 and cannot go beyond this limit, so it forms a curve like the letter "S". This S-shaped curve is called the sigmoid function or the logistic function. In logistic regression we use the concept of a threshold value, which defines the probability of either 0 or 1: values above the threshold tend to 1, and values below the threshold tend to 0.
Assumptions for Logistic Regression: The dependent variable must be categorical in nature. The independent variables should not have multi-collinearity.
Logistic Regression Equation: The logistic regression equation can be obtained from the linear regression equation. The mathematical steps are given below. We know the equation of a straight line can be written as:
y = b0 + b1x1 + b2x2 + ... + bnxn
In logistic regression y can be between 0 and 1 only, so let's divide the above equation by (1 − y), giving the odds y/(1 − y), which range from 0 to +∞. But we need a range between −∞ and +∞, so we take the logarithm of the ratio:
log[y / (1 − y)] = b0 + b1x1 + b2x2 + ... + bnxn
The sigmoid and its inverse are sketched below.
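A small NumPy sketch of the sigmoid, its inverse (the log-odds above), and the threshold rule; the sample z values are arbitrary:

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(y):
    """Inverse of the sigmoid: log(y / (1 - y)), the log-odds."""
    return np.log(y / (1.0 - y))

z = np.array([-5.0, 0.0, 2.0])   # arbitrary linear-model outputs
p = sigmoid(z)
print(p)            # ~[0.0067 0.5 0.8808] -> the S-shaped curve
print(logit(p))     # recovers [-5. 0. 2.]: log-odds are linear in x
print(p >= 0.5)     # thresholding at 0.5 yields class 0 or 1
```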
38. Differentiate between logistic regression and linear regression?
S.N. | Linear Regression | Logistic Regression
1 | It is used to predict the continuous dependent variable using a given set of independent variables. | It is used to predict the categorical dependent variable using a given set of independent variables.
2 | It is used for solving regression problems. | It is used for solving classification problems.
3 | In this, we predict the value of continuous variables. | In this, we predict the values of categorical variables.
4 | In this, we find the best-fit line, by which we can easily predict the output. | In this, we find the S-curve, by which we can classify the samples.
5 | The least squares estimation method is used to estimate the coefficients. | The maximum likelihood estimation method is used to estimate the coefficients.
6 | The output of linear regression must be a continuous value, such as price, age, etc. | The output of logistic regression must be a categorical value such as 0 or 1, Yes or No, etc.
7 | In this, the relationship between the dependent variable and the independent variables must be linear. | In this, a linear relationship between the dependent and independent variables is not required.
8 | In this, there may be collinearity between the independent variables. | In this, there should be no collinearity between the independent variables.
9 | Linear regression is a supervised regression model. | Logistic regression is a supervised classification model.
10 | In this, we predict the value as a continuous number. | In this, we predict the value as 1 or 0.
40. What is meant by the K-Nearest Neighbour algorithm? K-Nearest Neighbour is one of the simplest machine learning algorithms, based on the supervised learning technique. The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category most similar to the available categories. The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm. K-NN can be used for regression as well as classification, but it is mostly used for classification problems. K-NN is a non-parametric algorithm, which means it does not make any assumption about the underlying data. It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and, at the time of classification, performs an action on it. At the training phase the KNN algorithm just stores the dataset, and when it gets new data it classifies that data into the category most similar to the new data. Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, but we want to know whether it is a cat or a dog. For this identification we can use the KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of the new data set that are similar to the cat and dog images, and based on the most similar features it will put the image in either the cat or the dog category. A code sketch follows.
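A minimal sketch, assuming scikit-learn; the dataset, K = 5, and the scaling step (K-NN is distance-based, so features should share a scale) are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Put all features on a common scale before measuring distances.
scaler = StandardScaler().fit(X_train)

knn = KNeighborsClassifier(n_neighbors=5)    # K = 5 nearest neighbours vote
knn.fit(scaler.transform(X_train), y_train)  # "lazy": fit just stores the data
print("Accuracy:", knn.score(scaler.transform(X_test), y_test))
```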
43. What is the difference between KNN and K-means? K-NN is supervised while K-means is unsupervised learning. K-NN is a classification (or regression) machine learning algorithm, while K-means is a clustering machine learning algorithm. K-NN is a lazy learner while K-means is an eager learner: an eager learner has a model-fitting, i.e. training, step, while a lazy learner does not have a training phase. K-NN performs much better if all of the data have the same scale, but this is not true for K-means. K-means is a clustering algorithm that tries to partition a set of points into K sets (clusters) such that the points in each cluster tend to be near each other. It is unsupervised because the points have no external classification. K-nearest neighbours is a classification (or regression) algorithm that, in order to determine the classification of a point, combines the classifications of the K nearest points. It is supervised because you are trying to classify a point based on the known classification of other points. The contrast is sketched below.
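For contrast with the supervised K-NN sketch above, a K-means clustering sketch (scikit-learn assumed; the synthetic blobs are illustrative) — note that no labels are passed in:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels unused

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)          # partitions the points into K = 3 clusters

# No ground truth was supplied: cluster IDs come purely from geometry.
print("Cluster centres:\n", km.cluster_centers_)
print("First 10 assignments:", labels[:10])
```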
47. What are the advantages and limitations of KNN and K-means? The k-nearest neighbours (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems. It's easy to implement and understand, but has the major drawback of becoming significantly slower as the size of the data in use grows. KNN works by finding the distances between a query and all the examples in the data, selecting the specified number of examples (K) closest to the query, then voting for the most frequent label (in the case of classification) or averaging the labels (in the case of regression). In both classification and regression, choosing the right K for our data is done by trying several values of K and picking the one that works best. Advantages: The algorithm is simple and easy to implement. There's no need to build a model, tune several parameters, or make additional assumptions. The algorithm is versatile: it can be used for classification, regression, and search. Disadvantages: The algorithm gets significantly slower as the number of examples and/or predictors/independent variables increases.
Unit-4
7. Enlist the steps involved in the development of a classification model. The following steps are to be considered in the development of a classification model.
1 - Data Collection: The quantity and quality of the data dictate how accurate the model is. The outcome of this step is generally a representation of the data which we will use for training. Using experimental data, data generated by simulations, or pre-collected data, e.g. datasets from Kaggle, UCI, etc., still fits into this step.
2 - Data Preparation: Wrangle the data and prepare it for training. Clean what may require it (remove duplicates, correct errors, deal with missing values, normalization, data type conversions, etc.). Randomize the data, which erases the effects of the particular order in which we collected and/or otherwise prepared it. Visualize the data to help detect relevant relationships between variables or class imbalances (bias alert!), or perform other exploratory analysis. Split the data into training and evaluation sets.
3 - Choose a Model: Different algorithms suit different tasks; choose the right one.
4 - Train the Model: The goal of training is to answer a question or make a prediction correctly as often as possible. Linear regression example: the algorithm would need to learn values for m (or W) and b (x is input, y is output). Each iteration of the process is a training step.
5 - Evaluate the Model: Use some metric or combination of metrics to measure the objective performance of the model. Test the model against previously unseen data. This unseen data is meant to be somewhat representative of model performance in the real world, but still helps tune the model (as opposed to test data, which does not). A good train/evaluation split is 80/20, 70/30, or similar, depending on the domain, data availability, dataset particulars, etc.
6 - Hyperparameter Tuning: Hyperparameter tuning is an "artform" as opposed to a science. Tune the model hyperparameters for improved performance. Simple model hyperparameters may include the number of training steps, the learning rate, initialization values and distributions, etc.
7 - Make Predictions: Further (test set) data, which have until this point been withheld from the model (and for which class labels are known), are used to test the model, giving a better approximation of how the model will perform in the real world. A condensed code sketch of this workflow follows.
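A condensed sketch of steps 2-5, assuming scikit-learn; the wine dataset, 80/20 split, and logistic regression model are illustrative choices:

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Step 2: randomize and split 80/20; fit the scaler on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=7)
scaler = StandardScaler().fit(X_train)

# Steps 3-4: choose and train a model.
model = LogisticRegression(max_iter=1000)
model.fit(scaler.transform(X_train), y_train)

# Step 5: evaluate on held-out data with a chosen metric.
y_pred = model.predict(scaler.transform(X_test))
print("Held-out accuracy:", accuracy_score(y_test, y_pred))
```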
20. What is test data? A test set in machine learning is a secondary (or tertiary) dataset used to test a machine learning program after it has been trained on an initial training dataset. The idea is that predictive models always have some sort of unknown capacity that needs to be tested empirically, as opposed to analysed from a programming perspective. A test set is also known as a test dataset or test data.
21. What is validation data? In machine learning, a validation set is used to "tune the parameters" of a classifier. The validation test evaluates the program's capability according to the variation of parameters, to see how it might function in successive testing. The validation set is also known as a validation dataset, development set or dev set.
27. Explain hyperparameter tuning for SVM. The most important hyperparameter is the choice of kernel, which controls the manner in which the input variables are projected. There are many to choose from, but linear, polynomial, and RBF are the most common, perhaps just linear and RBF in practice: kernel in ['linear', 'poly', 'rbf', 'sigmoid']. If the polynomial kernel works out, then it is a good idea to dive into the degree hyperparameter. Another critical parameter is the penalty (C), which can take on a range of values and has a dramatic effect on the shape of the resulting regions for each class. A log scale might be a good starting point: C in [100, 10, 1.0, 0.1, 0.001]. A grid-search sketch follows.
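A sketch of searching exactly this grid with scikit-learn's GridSearchCV (the library and dataset are assumptions; the grid values come from the text):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # illustrative dataset

param_grid = {
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "C": [100, 10, 1.0, 0.1, 0.001],   # log-scale penalty values from the text
}
search = GridSearchCV(SVC(), param_grid, cv=5)  # 5-fold cross-validation
search.fit(X, y)
print("Best hyperparameters:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))
```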
28. Explain hyperparameter tuning for ANN. Number of neurons, weights and biases: a weight is the amplification of an input signal to a neuron, and a bias is an additive term for a neuron. Activation function: defines how a neuron or group of neurons activates ("spikes") based on the input connections and bias term(s). Learning rate: the step length for each gradient descent update. Batch size: the number of training examples in each gradient descent (GD) update. Epochs: the number of times all training examples have been passed through the network during training. Loss function: specifies how to calculate the error between the prediction and the label for a given training example; this error is backpropagated during training in order to update the learnable parameters. Number of layers: the layers between the input and output layers, which are called hidden layers. These hyperparameters map onto code as in the sketch below.
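A sketch mapping the listed hyperparameters onto a small network, assuming tensorflow.keras (the notes name no framework); the toy data and all values are illustrative:

```python
import numpy as np
from tensorflow import keras

X = np.random.rand(500, 8)              # 500 examples, 8 features (toy data)
y = (X.sum(axis=1) > 4.0).astype(int)   # toy binary labels

model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),   # neurons + activation function
    keras.layers.Dense(16, activation="relu"),   # number of (hidden) layers
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),  # learning rate
    loss="binary_crossentropy",                            # loss function
    metrics=["accuracy"],
)
model.fit(X, y, batch_size=32, epochs=10, verbose=0)  # batch size, epochs
print(model.evaluate(X, y, verbose=0))
```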
Unit-5
1. What is reinforcement learning? State one practical example. Reinforcement Learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and seeing the results of those actions. For each good action the agent gets positive feedback, and for each bad action the agent gets negative feedback or a penalty. In reinforcement learning the agent learns automatically from feedback, without any labeled data, unlike supervised learning. Since there is no labeled data, the agent is bound to learn from its experience only. RL solves a specific type of problem where decision making is sequential and the goal is long-term, such as game playing, robotics, etc. The agent interacts with the environment and explores it by itself. The primary goal of an agent in reinforcement learning is to improve its performance by getting the maximum positive reward. The agent learns by trial and error, and based on that experience it learns to perform the task in a better way. Hence, we can say that "reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within it." How a robotic dog learns the movement of its limbs is an example of reinforcement learning. It is a core part of artificial intelligence, and all AI agents work on the concept of reinforcement learning. Here we do not need to pre-program the agent, as it learns from its own experience without any human intervention. Example: Suppose there is an AI agent present within a maze environment, and its goal is to find the diamond. The agent interacts with the environment by performing some actions; based on those actions the state of the agent changes, and it also receives a reward or penalty as feedback. The agent continues doing these three things (take an action, change state or remain in the same state, and get feedback), and by doing so it learns and explores the environment.
Explain value-based, policy-based, and model-based reinforcement learning. There are mainly three ways to implement reinforcement learning in ML: Value-based: The value-based approach aims to find the optimal value function, which gives the maximum value at a state under any policy. The agent thus expects the long-term return at any state s under policy π. Policy-based: The policy-based approach aims to find the optimal policy for the maximum future reward without using a value function. In this approach, the agent tries to apply a policy such that the action performed at each step helps to maximize the future reward. The policy-based approach has mainly two types of policy: Deterministic: the same action is produced by the policy (π) in a given state. Stochastic: the produced action is determined by a probability distribution. Model-based: In the model-based approach, a virtual model is created for the environment, and the agent explores that environment to learn it. There is no particular solution or algorithm for this approach because the model representation differs for each environment.
Explain the elements of reinforcement learning. There are four main elements of reinforcement learning: policy, reward signal, value function, and model of the environment. Policy: A policy can be defined as the way an agent behaves at a given time. It maps the perceived states of the environment to the actions taken in those states. A policy is the core element of RL, as it alone can define the behavior of the agent. In some cases it may be a simple function or a lookup table, whereas in other cases it may involve general computation such as a search process. It can be deterministic or stochastic: for a deterministic policy, a = π(s); for a stochastic policy, π(a|s) = P[A_t = a | S_t = s]. Reward Signal: The goal of reinforcement learning is defined by the reward signal. At each state, the environment sends an immediate signal to the learning agent, and this signal is known as the reward signal. These rewards are given according to the good and bad actions taken by the agent. The agent's main objective is to maximize the total reward received for good actions. The reward signal can change the policy: if an action selected by the agent leads to a low reward, the policy may change to select other actions in the future. Value Function: The value function gives information about how good a situation and action are, and how much reward an agent can expect. A reward indicates the immediate signal for each good or bad action, whereas a value function specifies how good a state and action are for the future. The value function depends on the reward: without reward there can be no value. The goal of estimating values is to achieve more rewards. Model: The last element of reinforcement learning is the model, which mimics the behaviour of the environment. With the help of the model, one can make inferences about how the environment will behave. For example, if a state and an action are given, the model can predict the next state and reward. The model is used for planning, which means it provides a way to take a course of action by considering all future situations before actually experiencing those situations. Approaches that solve RL problems with the help of a model are termed model-based approaches; an approach without a model is called model-free.
7. What is the Bellman Equation? How is it helpful in RL? The Bellman equation was introduced by the mathematician Richard Ernest Bellman in 1953, and hence is called the Bellman equation. It is associated with dynamic programming and is used to calculate the value of a decision problem at a certain point by including the values of the states that follow it. It is a way of calculating the value functions in dynamic programming, and it leads to modern reinforcement learning. The key elements used in the Bellman equation are: the action performed by the agent, "a"; the state reached by performing the action, "s"; the reward/feedback obtained for each good or bad action, "R"; and a discount factor, gamma "γ". The Bellman equation can be written as:
V(s) = max_a [R(s, a) + γ V(s')]
where V(s) is the value calculated at a particular state, R(s, a) is the reward obtained at state s by performing action a, γ is the discount factor, and V(s') is the value of the next state. In the above equation we take the maximum over the actions because the agent always tries to find the optimal solution. Now, using the Bellman equation, we will find the value at each state of the given environment, starting from the block next to the target block.
For the 1st block: V(s3) = max [R(s,a) + γV(s')]; here V(s') = 0 because there is no further state to move to. V(s3) = max[R(s,a)] = max[1] = 1.
For the 2nd block: V(s2) = max [R(s,a) + γV(s')]; here γ = 0.9 (say), V(s') = 1, and R(s,a) = 0, because there is no reward at this state. V(s2) = max[0.9 × 1] = 0.9.
For the 3rd block: V(s1) = max [R(s,a) + γV(s')]; here V(s') = 0.9 and R(s,a) = 0, because there is no reward at this state either. V(s1) = max[0.9 × 0.9] = 0.81.
For the 4th block: V(s5) = max [R(s,a) + γV(s')]; here V(s') = 0.81 and R(s,a) = 0. V(s5) = max[0.9 × 0.81] = max[0.729] ≈ 0.73.
For the 5th block: V(s9) = max [R(s,a) + γV(s')]; here V(s') = 0.73 and R(s,a) = 0. V(s9) = max[0.9 × 0.73] = max[0.657] ≈ 0.66.
A short sketch reproducing these backups follows.
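A minimal Python sketch reproducing the worked backups: a corridor of states leading to a single reward of 1, with γ = 0.9 as in the example (the single-action corridor is the example's simplification, so the max is over one candidate):

```python
gamma = 0.9
reward_at_goal = 1.0
v_next = 0.0            # value beyond the terminal block is 0

values = []
v = max([reward_at_goal + gamma * v_next])   # block next to the target
values.append(v)
for _ in range(4):                           # remaining blocks, reward 0
    v = max([0.0 + gamma * v])               # V(s) = max[R + gamma * V(s')]
    values.append(v)

print([round(x, 2) for x in values])         # [1.0, 0.9, 0.81, 0.73, 0.66]
```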
10. Explain the Markov Decision Process. The Markov Decision Process, or MDP, is used to formalize reinforcement learning problems. If the environment is completely observable, its dynamics can be modeled as a Markov process. In an MDP, the agent constantly interacts with the environment and performs actions; at each action, the environment responds and generates a new state. The MDP is used to describe the environment for RL, and almost all RL problems can be formalized using an MDP. An MDP contains a tuple of four elements (S, A, Pa, Ra): a set of finite states S; a set of finite actions A; the probability Pa of transitioning from state S to state S' due to action a; and the reward Ra received after that transition. The MDP uses the Markov property, and to better understand the MDP we need to learn about it. Markov Property: It says that "if the agent is present in the current state s1, performs an action a1 and moves to the state s2, then the state transition from s1 to s2 depends only on the current state; future actions and states do not depend on past actions, rewards, or states." In other words, as per the Markov property, the current state transition does not depend on any past action or state. Hence, an MDP is an RL problem that satisfies the Markov property. For example, in a chess game, the players only focus on the current state and do not need to remember past actions or states. Finite MDP: A finite MDP has finite states, finite rewards, and finite actions. In RL we consider only finite MDPs. Markov Process: A Markov process is a memoryless process with a sequence of random states S1, S2, ..., St that satisfies the Markov property. A Markov process is also known as a Markov chain, which is a tuple (S, P) of a state set S and a transition function P. These two components (S and P) can define the dynamics of the system.
11. Explain Q-Learning. Q-learning is an off-policy RL algorithm used for temporal difference learning. Temporal difference learning methods are a way of comparing temporally successive predictions. Q-learning learns the value function Q(s, a), which measures how good it is to take action "a" at a particular state "s". State Action Reward State Action (SARSA): SARSA stands for State Action Reward State Action, and it is an on-policy temporal difference learning method. The on-policy control method selects the action for each state while learning, using a specific policy. The goal of SARSA is to calculate Q_π(s, a) for the selected current policy π and all pairs of (s, a). The main difference between the Q-learning and SARSA algorithms is that, unlike Q-learning, the maximum value for the next state is not required for updating the Q-value in the table. In SARSA, the new action and reward are selected using the same policy that determined the original action. SARSA is so named because it uses the quintuple (s, a, r, s', a'). The two update rules are contrasted in the sketch below.
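A sketch of the two temporal-difference updates; the table layout, α = 0.1, γ = 0.9, and the toy states/actions are all illustrative choices:

```python
# Q is a nested dict Q[state][action]; alpha = learning rate, gamma = discount.

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the *best* next action, max_a' Q(s', a')."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action a' actually chosen by the policy."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])

# Toy table with two states and two actions:
Q = {"s1": {"left": 0.0, "right": 0.0}, "s2": {"left": 0.5, "right": 1.0}}
q_learning_update(Q, "s1", "right", r=0.0, s_next="s2")
sarsa_update(Q, "s1", "left", r=0.0, s_next="s2", a_next="left")
print(Q["s1"])   # approx {'left': 0.045, 'right': 0.09}
```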
17. Explain Artificial Neural Network (ANN). An artificial neural network with multiple hidden layers is also called a deep neural network, the basis of deep learning. What does a neural network essentially mean? We take logistic regression and repeat it multiple times. In a normal logistic regression we have an input layer and an output layer, but in a neural network there is at least one hidden layer of regression between these input and output layers. How many layers are needed to call it a "deep" neural network? Of course, there is no specific number of layers required to classify a neural network as deep; the term "deep" is, quite frankly, relative to every problem. The better question to ask is "how deep?". For example, the answer to "How deep is your swimming pool?" can be given in multiple ways: it could be 2 meters deep or 10 meters deep, but either way it has depth. The same holds for a neural network: it can have 2 hidden layers or thousands of hidden layers (yes, you heard that correctly). So let's stick with the question "how deep?" for the time being.
18. Explain the elements of Deep Learning. Researchers tried to mimic the working of the human brain and replicated it in machines, making machines capable of thinking and solving complex problems. Deep Learning (DL) is a subset of Machine Learning (ML) that allows us to train a model using a set of inputs and then predict outputs. Like the human brain, the model consists of a set of neurons that can be grouped into three layers: a) Input Layer: receives the input and passes it to the hidden layers. b) Hidden Layers: there can be one or more hidden layers in a Deep Neural Network (DNN); "deep" in DL refers to having more than one hidden layer. All computations are done by the hidden layers. c) Output Layer: this layer receives input from the last hidden layer and gives the output. A NumPy sketch of one pass through these layers follows.
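A sketch of one forward pass through the three layers with NumPy; the random weights and the layer sizes (4 inputs, 8 hidden neurons, 3 outputs) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                            # input layer: 4 feature values

W1, b1 = rng.random((8, 4)), rng.random(8)   # hidden layer: 8 neurons
W2, b2 = rng.random((3, 8)), rng.random(3)   # output layer: 3 classes

relu = lambda z: np.maximum(0.0, z)

h = relu(W1 @ x + b1)                 # hidden layer does the computation
logits = W2 @ h + b2                  # output layer takes the last hidden layer
probs = np.exp(logits) / np.exp(logits).sum()   # softmax output
print(probs, probs.sum())             # class probabilities summing to 1
```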
26. What are GD optimization methods and which optimizer should be used? To overcome the challenges of plain gradient descent, several optimization methods are used by the AI community; they also reduce the effort required in hyperparameter tuning. a) Adagrad: the algorithm of choice in the case of sparse data. It eliminates the need to manually tune the learning rate, unlike GD; a default value of 0.01 is preferred. b) Adadelta: reduces Adagrad's monotonically decreasing learning rate and does not require a default learning rate. c) RMSprop: RMSprop and Adadelta were developed for the same purpose at around the same time; a learning rate of 0.001 is preferred. d) Adam: works well for most problems and is the usual algorithm of choice. It can be seen as a combination of RMSprop and momentum. AdaMax and Nadam are variants of Adam. The sketch below shows how these optimizers are selected in code.
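A short sketch of selecting these optimizers, assuming tensorflow.keras (the notes name no framework); the learning rates follow the values mentioned above:

```python
from tensorflow import keras

optimizers = {
    "adagrad": keras.optimizers.Adagrad(learning_rate=0.01),   # sparse data
    "adadelta": keras.optimizers.Adadelta(),  # no hand-tuned learning rate
    "rmsprop": keras.optimizers.RMSprop(learning_rate=0.001),
    "adam": keras.optimizers.Adam(),          # the usual default choice
}

# Toy one-layer model just to show where the optimizer plugs in.
model = keras.Sequential([keras.layers.Input(shape=(4,)),
                          keras.layers.Dense(1)])
model.compile(optimizer=optimizers["adam"], loss="mse")
print(model.optimizer.__class__.__name__)
```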
27. What is a Convolutional Neural Network (CNN)? "Convolutional neural network" indicates that these are simply neural networks with some mathematical operation (generally matrix multiplication) between their layers, called convolution. It was proposed by Yann LeCun in 1998, and it is one of the most popular architectures for image classification. A convolutional neural network can broadly be divided into these stages: 1. Input layer 2. Convolutional layer(s) 3. Output layer. A minimal sketch follows.
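A minimal sketch of that layer sequence, assuming tensorflow.keras; the 28×28 grayscale input, filter count, and 10-class output are illustrative (e.g. digit classification):

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),                      # 1. input layer
    keras.layers.Conv2D(16, kernel_size=3, activation="relu"),  # 2. convolution
    keras.layers.MaxPooling2D(pool_size=2),   # downsample the feature maps
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),               # 3. output layer
])
model.summary()   # prints the layer shapes and parameter counts
```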
31. What is a Neural Network Activation Function? An activation function decides whether a neuron should be activated or not. This means that it decides whether the neuron's input to the network is important or not in the process of prediction, using simpler mathematical operations. The role of the activation function is to derive the output from a set of input values fed to a node (or a layer). But let's take a step back and clarify: what exactly is a node? If we compare the neural network to our brain, a node is a replica of a neuron that receives a set of input signals, i.e. external stimuli. Depending on the nature and intensity of these input signals, the brain processes them and decides whether the neuron should be activated ("fired") or not. In deep learning, this is also the role of the activation function, which is why it is often referred to as a transfer function in artificial neural networks. The primary role of the activation function is to transform the summed weighted input of the node into an output value to be fed to the next hidden layer or given as output. Common choices are sketched below.
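A NumPy sketch of three common activation (transfer) functions applied to a node's summed weighted input; the sample input values are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes to (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # passes positives, zeroes negatives

# The activation decides how strongly the summed weighted input "fires":
weighted_sum = np.array([-2.0, 0.0, 3.0])   # node input: sum(w_i * x_i) + b
for f in (sigmoid, tanh, relu):
    print(f.__name__, f(weighted_sum))
```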
36. State examples of Deep Learning. Deep learning applications are used in industries from automated driving to
medical devices. Automated Driving: Automotive researchers are using deep learning to automatically detect
objects such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, which helps
decrease accidents. Aerospace and Defense: Deep learning is used to identify objects from satellites that locate
areas of interest, and identify safe or unsafe zones for troops. Medical Research: Cancer researchers are using deep
learning to automatically detect cancer cells. Teams at UCLA built an advanced microscope that yields a high-
dimensional data set used to train a deep learning application to accurately identify cancer cells. Industrial
Automation: Deep learning is helping to improve worker safety around heavy machinery by automatically detecting
when people or objects are within an unsafe distance of machines. Electronics: Deep learning is being used in
automated hearing and speech translation. For example, home assistance devices that respond to your voice and
know your preferences are powered by deep learning applications.
37. What's the Difference Between Machine Learning and Deep Learning? Deep learning is a specialized form of machine learning. A machine learning workflow starts with relevant features being manually extracted from images. The features are then used to create a model that categorizes the objects in the image. With a deep learning workflow, relevant features are automatically extracted from images. In addition, deep learning performs "end-to-end learning", where a network is given raw data and a task to perform, such as classification, and it learns how to do this automatically. Another key difference is that deep learning algorithms scale with data, whereas shallow learning converges. Shallow learning refers to machine learning methods that plateau at a certain level of performance as you add more examples and training data to the network. A key advantage of deep learning networks is that they often continue to improve as the size of your data increases. In machine learning, you manually choose features and a classifier to sort images. With deep learning, the feature extraction and modeling steps are automatic.
Unit-6
1. What is human-machine interaction? HMI is all about how people and automated systems interact and communicate with each other. It has long ceased to be confined to traditional machines in industry and now also relates to computers, digital systems and devices for the Internet of Things (IoT). More and more devices are connected and automatically carry out tasks. Operating all of these machines, systems and devices needs to be intuitive and must not place excessive demands on users. HMI now plays a major role in industry and everyday life, so a user interface that is as intuitive as possible is needed to enable smooth operation of these machines. That can take very different forms.
2. How does human-machine interaction work? Smooth communication between people and machines requires
interfaces: The place where or action by which a user engages with the machine. Simple examples are light
switches or the pedals and steering wheel in a car: An action is triggered when you flick a switch, turn the steering
wheel or step on a pedal. However, a system can also be controlled by text being keyed in, a mouse, touch
screens, voice or gestures. The devices are either controlled directly: Users touch the smartphone’s screen or
issue a verbal command. Or the systems automatically identify what people want: Traffic lights change color on
their own when a vehicle drives over the inductive loop in the road’s surface. Other technologies are not so much
there to control devices, but rather to complement our sensory organs. One example is virtual reality glasses.
There are also digital assistants: Chatbots, for instance, reply automatically to requests from customers and keep
on learning. User interfaces in HMI are the places where or actions by which the user engages with the machine.
A system can be operated by means of buttons, a mouse, touch screens, voice or gesture, for instance.
3. What human-machine systems are there? For a long time, machines were mainly controlled by switches,
levers, steering wheels or buttons; these were joined later by the keyboard and mouse. Now we are in the age of
the touch screen. Body sensors in wearables that automatically collect data are also modern interfaces. Voice
control is also making rapid advances: Users can already control digital assistants, such as Amazon Alexa or Google
Assistant, by voice. That entails far less effort. Chatbots are also used in such systems and their ability to
communicate with people is improving more and more thanks to artificial intelligence.
4. What trends are there in human-machine interaction? Gesture control is at least as intuitive as voice control.
That means robovacs, for example, could be stopped by a simple hand signal in the future. Google and Infineon
have already developed a new type of gesture control by the name of "Soli": Devices can also be operated in the
dark or remotely with the aid of radar technology. Technologies that augment reality now also act as an interface.
Virtual reality glasses immerse users in an artificially created 3D world, while augmented reality glasses
superimpose virtual elements in the real environment. Mixed reality glasses combine both technologies, thus
enabling scenarios to be presented realistically thanks to their high resolution.
5. What opportunities and challenges arise from human-machine interaction? Modern HMI helps people to use
even very complex systems with ease. Machines also keep on getting better at interpreting signals – and that is
important in particular in autonomous driving. Human needs are identified even more accurately, which means
robots can be used in caring for people, for instance. One potential risk is the fact that hackers might obtain
information on users via the machines’ sensors. Last but not least, security is vital in human-machine interaction.
Some critics also fear that self-learning machines may become a risk by taking actions autonomously. It is also
necessary to clarify the question of who is liable for accidents caused by HMI.
6. Where is human-machine interaction headed? Whether voice and gesture control or virtual, augmented and
mixed reality, HMI interaction is far from reaching the end of the line. In future, data from different sensors will
also increasingly be combined to capture and control complex processes optimally. The human senses will be
replicated better and better with the aid of, for example, gas sensors, 3D cameras and pressure sensors, thus
expanding the devices’ capabilities. In contrast, there will be fewer of the input devices that are customary at
present, such as remote controllers.
8. Make a list of maintenance types and explain them in brief. Discuss the scope of AIML. Predictive maintenance: Predictive maintenance is used to identify anomalies in the process, which helps in preventive maintenance; to estimate the demand for products, raw materials, etc., based on historical data and the current scenario; and to forecast possible outcomes based on data obtained from the process. Prescriptive maintenance: Prescriptive maintenance is used to identify ways in which an industrial process can be improved. While predictive maintenance tells you when a component/asset could fail, prescriptive analytics tells you what action to take to avoid the failure. So you can use the results obtained from prescriptive analysis to plan the maintenance schedule, review your supplier, etc. Prescriptive maintenance also helps you manage complex problems in the production process using relevant information. Descriptive maintenance: The core purpose of descriptive maintenance is to describe the problem by diagnosing the symptoms. This method also helps discover trends and patterns in historical data. The results of descriptive maintenance are usually shown in the form of charts and graphs. These data visualization tools make it easy for all the stakeholders, even non-technical ones, to understand the problems in the manufacturing process. Diagnostic maintenance: Diagnostic maintenance is also referred to as root cause analysis. While descriptive maintenance can tell what happened based on historical data, diagnostic maintenance tells you why it happened. Data mining, data discovery, correlation, and drill-down and drill-through methods are used in diagnostic analytics. Diagnostic maintenance can be used to identify the cause of an equipment malfunction or the reason for a drop in product quality.
Q5 Explain different applications in health care where AIML can be used. Artificial Intelligence (AI) and Machine
Learning (ML) have numerous applications in healthcare that can revolutionize the industry. Here are some areas
where AI and ML can be used in healthcare: 1. Medical Image Analysis: AI and ML techniques can assist in the
analysis and interpretation of medical images such as X-rays, MRIs, and CT scans. Algorithms can be trained to
detect abnormalities, tumors, or other diseases, helping radiologists in diagnosis and decision-making. 2. Disease
Diagnosis and Risk Prediction: ML models can be developed to analyze patient data, including symptoms, medical
history, and laboratory results, to aid in disease diagnosis and risk prediction. For example, ML algorithms can assist
in early detection of diseases like cancer, diabetes, or heart diseases, based on patient data patterns. 3. Drug
Discovery and Development: AI can accelerate the drug discovery and development process by analyzing vast
amounts of biological data, identifying potential drug targets, predicting the efficacy of drug candidates, and
designing new molecules. This can help in developing new treatments and personalized medicine. 4. Health
Monitoring and Wearable Devices: ML algorithms can analyze data collected from wearable devices and health
monitoring sensors to track vital signs, activity levels, sleep patterns, and identify patterns or anomalies that may
indicate health issues. This enables proactive health management and early intervention. 5. Electronic Health
Records (EHR) Analysis: ML can be used to analyze electronic health records to identify patterns, predict disease
progression, and optimize treatment plans. This can help in personalized medicine, resource allocation, and
improving healthcare delivery. 6. Virtual Assistants and Chatbots: AI-powered virtual assistants and chatbots can
provide personalized healthcare recommendations, answer medical queries, and assist with appointment
scheduling. They can triage patients, provide preliminary diagnoses, and offer guidance on self-care. These are just
a few examples of the applications of AI and ML in healthcare.
Q6 Write a short note on use of AIML in traffic control? 1. Traffic Flow Prediction: ML algorithms can analyze
historical traffic data, weather conditions, and other relevant factors to predict traffic flow patterns. These
predictions can be used to optimize traffic signal timings, adjust lane configurations, and improve overall traffic
management. 2. Intelligent Traffic Signal Control: AI-powered traffic signal control systems can adapt signal timings
in real-time based on traffic conditions. ML models analyze live traffic data from cameras, sensors, and other
sources to optimize signal timings, reduce congestion, and prioritize traffic flow. 3. Smart Traffic Management: AI
and ML technologies enable the development of intelligent traffic management systems that can detect and
respond to changing traffic conditions. These systems can identify incidents, accidents, or road hazards and
automatically alert authorities or adjust traffic patterns to reroute vehicles and minimize disruptions. 4. Traffic
Incident Detection: ML algorithms can analyze live video feeds from surveillance cameras and detect traffic
incidents such as accidents, breakdowns, or road obstructions. This enables quick detection and prompt response
by authorities, improving incident management and reducing response times. 5. Route Optimization: AI-powered
route planning systems can analyze real-time traffic data, historical patterns, and other variables to suggest the
most efficient routes for drivers. These systems can help alleviate congestion by distributing traffic across different
routes and reducing travel times. 6. Smart Parking Management: ML algorithms can be used to optimize parking
management by analyzing data from sensors, cameras, and parking availability information. AI systems can guide
drivers to available parking spaces, reduce search times, and improve overall parking efficiency.
Q7 Explain the different steps in dynamic system reduction. 1. System Analysis: The first step in dynamic system reduction is to thoroughly analyze the dynamic system. This involves understanding the system's structure, components, inputs, outputs, and the relationships between them. The analysis helps identify the important variables, subsystems, and dynamics that need to be preserved during the reduction process. 2. Variable Selection: In this step, the most relevant variables of the dynamic system are selected for inclusion in the reduced model. These variables are chosen based on their significance in capturing the system's behavior and their impact on the desired outputs. Variables that have minimal influence or can be approximated by other variables may be excluded. 3. Order Reduction: The order of the model is reduced by removing or combining the less significant variables identified in the previous step. 4. Model Approximation: Once the order is reduced, the next step is to approximate the dynamics of the remaining variables. This can be done through different mathematical techniques such as linearization, curve fitting, interpolation, or regression analysis. The goal is to capture the essential dynamics while simplifying the mathematical representation. 5. Validation and Error Analysis: After the reduction process, it is essential to validate the reduced model against the original dynamic system. The reduced model's behavior is compared with the behavior of the original system under various inputs and operating conditions. Error analysis is conducted to assess the accuracy of the reduced model and identify any discrepancies or limitations. 6. Model Integration: Once a satisfactory reduced model is obtained, it can be integrated into practical applications. The reduced model can be used for simulation, control design, optimization, or other purposes where a simplified representation of the original dynamic system is sufficient.
Q8 Explain the use of Machine learning in process Optimization. 1. Predictive Analytics: Machine learning models
can be trained on historical process data to predict future outcomes and behavior. These models can forecast
process variables, equipment failures, maintenance needs, and product quality. By leveraging predictive analytics,
organizations can optimize scheduling, resource allocation, and decision-making, leading to improved process
efficiency and reduced downtime. 2. Anomaly Detection: Machine learning algorithms can detect anomalies in
process data that deviate from expected patterns. By identifying abnormal conditions, such as equipment
malfunctions or process deviations, timely interventions can be made to prevent failures, reduce waste, and
optimize resource utilization. 3. Optimization Algorithms: Machine learning algorithms, such as genetic algorithms,
reinforcement learning, or particle swarm optimization, can be employed to find optimal process parameters and
configurations. These algorithms can explore a large solution space and identify the best settings for variables and
parameters, maximizing efficiency, productivity, and yield. 4. Quality Control and Defect Detection: Machine
learning models can analyze sensor data, images, or other measurements to detect defects or variations in product
quality. By identifying patterns associated with quality issues, organizations can make real-time adjustments to the
process, minimizing waste and maintaining high-quality standards. 5. Energy Efficiency: Machine learning can
optimize energy consumption in industrial processes. By analyzing historical energy usage data, machine learning
algorithms can identify energy-saving opportunities, optimize energy usage patterns, and suggest energy-efficient
operating strategies. This leads to cost savings and reduced environmental impact. 6. Resource Allocation: Machine
learning models can analyze process data and operational variables to optimize resource allocation, such as raw
material usage, inventory management, and workforce scheduling. By dynamically adjusting resource allocation
based on real-time data, organizations can minimize waste, reduce costs, and improve overall process efficiency. 7.
Process Control and Automation: Machine learning algorithms can be integrated into control systems to optimize
process parameters in real-time. By continuously analyzing process data, these algorithms can adjust control
variables, such as temperature, pressure, or flow rates, to maintain optimal operating conditions, reduce variations,
and achieve desired process outcomes.
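As a concrete sketch of point 2 (anomaly detection on process data), the snippet below uses scikit-learn's IsolationForest; the algorithm choice, the hypothetical temperature/pressure sensors, and all numbers are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Normal operation: temperature ~ 80, pressure ~ 5 (hypothetical sensors).
normal = rng.normal(loc=[80.0, 5.0], scale=[2.0, 0.2], size=(200, 2))
faults = np.array([[95.0, 5.1], [79.0, 8.0]])   # two deviating readings

# Fit on normal operating data, then flag readings that deviate from it.
detector = IsolationForest(contamination=0.01, random_state=42).fit(normal)
print(detector.predict(faults))   # -1 flags an anomaly, +1 normal operation
```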