(Regulation – 2021)
CS3491 – ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
QUESTION BANK
TWO MARKS WITH ANSWERS
UNIT I PROBLEM SOLVING
Introduction to AI - AI Applications - Problem solving agents – search algorithms – uninformed
search strategies – Heuristic search strategies – Local search and optimization problems –
adversarial search – constraint satisfaction problems (CSP)
Part A
2. What is an agent?
An agent is anything that can be viewed as perceiving its environment through sensors
and acting upon that environment through actuators.
7. Are reflex actions (such as flinching from a hot stove) rational? Are they intelligent?
Reflex actions can be considered rational. Because of evolutionary adaptation, the body performs the action to keep itself out of danger: flinching from a hot stove is a normal reaction, since getting away from something hot is an effective way to stay safe.
Reflex actions can also be regarded as intelligent, in the sense that a kind of reasoning and logic is embodied in the action itself.
AI is both science and engineering. Observing and experimenting, which are at the core
of any science, allow us to study artificial intelligence. From what we learn by observation and
experimentation, we are able to engineer new systems that encompass what we learn and that
may even be capable of learning themselves.
13. What are the components of well-defined problems? (or) What are the four components to
define a problem? Define them?
The four components to define a problem are,
1) Initial state – the state in which the agent starts.
2) A description of possible actions – the set of actions available to the agent in each state.
3) The goal test – the test that determines whether a given state is a goal (final) state.
4) A path cost function – a function that assigns a numeric cost (value) to each path.
The problem-solving agent is expected to choose a cost function that reflects its own performance
measure.
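As an illustration (this code and the two-cell vacuum world it uses are assumptions for the sketch, not part of the syllabus answer), the four components can be written down directly as methods of a problem class:

```python
# A minimal sketch of the four components of a well-defined problem, using a
# two-cell vacuum world. A state is (agent_location, dirt_in_A, dirt_in_B).

class VacuumProblem:
    def initial_state(self):
        # 1) Initial state: agent in cell 'A', both cells dirty.
        return ('A', True, True)

    def actions(self, state):
        # 2) Description of possible actions available to the agent.
        return ['Left', 'Right', 'Suck']

    def result(self, state, action):
        loc, dirt_a, dirt_b = state
        if action == 'Left':
            return ('A', dirt_a, dirt_b)
        if action == 'Right':
            return ('B', dirt_a, dirt_b)
        # 'Suck' cleans whichever cell the agent is currently in.
        return (loc,
                False if loc == 'A' else dirt_a,
                False if loc == 'B' else dirt_b)

    def goal_test(self, state):
        # 3) Goal test: no dirt left in either cell.
        return not state[1] and not state[2]

    def path_cost(self, path):
        # 4) Path cost function: every action along the path costs 1.
        return len(path)
```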
15. Give examples of real-world and toy problems.
Real-world problem examples:
Airline travel problem.
Touring problem.
Traveling salesman problem
VLSI Layout problem
Robot navigation
Automatic Assembly
Internet searching
Toy problem Examples:
8 – Queen problem
8 – Puzzle problem
Vacuum world problem
21. What is the power of heuristic search? (or) Why does one go for heuristic search?
Heuristic search uses problem-specific knowledge while searching the state space. This
helps to improve average search performance. It uses an evaluation function that denotes the
relative desirability (goodness) of expanding a node, which makes the search more efficient and
faster. One goes for heuristic search because it has the power to solve large, hard problems in
affordable time.
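A minimal sketch of one heuristic strategy, greedy best-first search; the successor function and heuristic h(n) below (a small grid with Manhattan distance) are assumptions chosen only to make the example runnable:

```python
# Greedy best-first search: the frontier is ordered by the heuristic h(n), so
# problem-specific knowledge decides which node to expand next.
import heapq

def greedy_best_first(start, goal, neighbours, h):
    """neighbours(state) -> successor states; h(state) -> estimated cost to goal."""
    frontier = [(h(start), start, [start])]
    visited = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt in neighbours(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (h(nxt), nxt, path + [nxt]))
    return None

# Example: 4x4 grid, Manhattan distance to the goal as the heuristic.
goal = (3, 3)
h = lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])
moves = lambda s: [(s[0] + dx, s[1] + dy)
                   for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
                   if 0 <= s[0] + dx <= 3 and 0 <= s[1] + dy <= 3]
print(greedy_best_first((0, 0), goal, moves, h))
```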
23. State the reason why hill climbing often gets stuck.
Local maxima are the states where the hill climbing algorithm is sure to get stuck. A local
maximum is a peak that is higher than each of its neighbouring states but lower than the global
maximum, so the better state is missed. All the search effort turns out to be wasted here; it is
like a dead end.
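A toy sketch (the 1-D objective with two peaks is an assumption) showing hill climbing halting at the lower peak, i.e. a local maximum:

```python
# Hill climbing only moves to a strictly better neighbour, so starting near the
# small peak at x = 2 it stops there and never reaches the global maximum at x = 8.

def objective(x):
    # Local maximum at x = 2 (value 4); global maximum at x = 8 (value 9).
    return -(x - 2) ** 2 + 4 if x < 5 else -(x - 8) ** 2 + 9

def hill_climb(x):
    while True:
        best = max([x - 1, x + 1], key=objective)
        if objective(best) <= objective(x):
            return x              # no better neighbour: stuck at a (local) maximum
        x = best

print(hill_climb(0))              # prints 2, the local maximum, not 8
```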
24. When is a heuristic function h said to be admissible? Give an admissible heuristic function
for TSP.
An admissible heuristic function is one that never overestimates the cost to reach the goal,
i.e. h(n) ≤ h*(n), where h*(n) is the true cost of reaching the goal from node n. Admissible
heuristics for TSP are
a. The cost of the minimum spanning tree.
b. The minimum assignment problem.
25. What do you mean by local maxima with respect to search technique?
A local maximum is a peak that is higher than each of its neighbouring states but lower
than the global maximum, i.e. a local maximum is a small hill on the surface whose peak is not
as high as the main peak (which is the optimal solution). Hill climbing fails to find the optimal
solution when it encounters a local maximum: any small move from there makes things worse
(at least temporarily). At a local maximum all the search effort turns out to be wasted; it is like
a dead end.
P(A|B) = [P(A) × P(B|A)] / P(B)
0.4 = (0.3 × P(B|A)) / 0.5
P(B|A) = (0.4 × 0.5) / 0.3 ≈ 0.67
UNIT III SUPERVISED LEARNING
Introduction to machine learning – Linear Regression Models: Least squares, single & multiple
variables, Bayesian linear regression, gradient descent, Linear Classification Models:
Discriminant function – Probabilistic discriminative model - Logistic regression, Probabilistic
generative model – Naive Bayes, Maximum margin classifier – Support vector machine,
Decision Tree, Random forests.
PART - A
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
Transduction
Learning to Learn
8. What are the three stages to build the hypotheses or model in machine learning?
Model building
Model testing
Applying the model
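These three stages can be seen in a short sketch; the dataset, library, and model chosen below are assumptions made only for illustration:

```python
# Stage 1: build the model on training data; stage 2: test it on held-out data;
# stage 3: apply the trained model to new samples.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # 1) model building
print("test accuracy:", model.score(X_test, y_test))             # 2) model testing
print("prediction:", model.predict(X_test[:1]))                  # 3) applying the model
```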
11. What is the difference between artificial intelligence and machine learning?
Designing and developing algorithms that learn behaviours from empirical data is known as
machine learning. Artificial intelligence, in addition to machine learning, also covers other
aspects such as knowledge representation, natural language processing, planning, robotics, etc.
13. What is the main key difference between supervised and unsupervised machine learning?
Supervised learning: the supervised learning technique needs labelled data to train the model.
For example, to solve a classification problem (a supervised learning task), you need labelled
data to train the model and to classify the data into your labelled groups.
Unsupervised learning: unsupervised learning does not need any labelled dataset. This is the
main key difference between supervised learning and unsupervised learning.
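A quick contrast in code (the dataset and the two algorithms are assumptions for the sketch): the classifier must be given the labels y, while the clustering algorithm works on X alone:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

clf = KNeighborsClassifier().fit(X, y)                      # supervised: needs labels y
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)   # unsupervised: no labels used

print(clf.predict(X[:3]))
print(clusters[:3])
```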
14. What is a Linear Regression?
In simple terms, linear regression is adopting a linear approach to modeling the relationship
between a dependent variable (scalar response) and one or more independent variables
(explanatory variables). If you have one explanatory variable, it is called simple linear
regression. If you have more than one independent variable, the process is referred to as
multiple linear regression.
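A minimal least-squares sketch for the single-variable case; the data points below are made up for illustration:

```python
# Closed-form least-squares estimates of slope and intercept for y = a*x + b.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # roughly y = 2x

slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
print(slope, intercept)                    # close to 2 and 0
```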
17. What is the difference between stochastic gradient descent (SGD) and gradient descent
(GD)?
Both algorithms are methods for finding a set of parameters that minimize a loss function by
evaluating parameters against data and then making adjustments. In standard gradient descent,
you'll evaluate all training samples for each set of parameters. This is akin to taking big, slow
steps toward the solution. In stochastic gradient descent, you'll evaluate only 1 training sample
for the set of parameters before updating them. This is akin to taking small, quick steps toward
the solution.
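A toy sketch of the two update rules on a one-parameter least-squares problem (the data, learning rate, and step counts are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 3.0 * x + rng.normal(0, 0.1, 100)         # true weight is about 3

def gradient_descent(w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        grad = np.mean(2 * (w * x - y) * x)   # gradient over ALL training samples
        w -= lr * grad                        # one big, slow step
    return w

def sgd(w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        i = rng.integers(len(x))              # ONE randomly chosen sample
        grad = 2 * (w * x[i] - y[i]) * x[i]
        w -= lr * grad                        # one small, quick step
    return w

print(gradient_descent(), sgd())              # both end up near 3, SGD is noisier
```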
19. What is the difference between least squares regression and multiple regression?
The goal of multiple linear regression is to model the linear relationship between the
explanatory (independent) variables and response (dependent) variables. In essence, multiple
regression is the extension of ordinary least-squares (OLS) regression because it involves more
than one explanatory variable.
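A short sketch of that extension (the two synthetic explanatory variables are an assumption), solving the same least-squares criterion with more than one predictor:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))                              # two explanatory variables
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.5 + rng.normal(0, 0.1, 50)

A = np.column_stack([np.ones(len(X)), X])                 # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)              # ordinary least squares
print(coef)                                               # approximately [0.5, 1.5, -2.0]
```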
36. Do you think 50 small decision trees are better than a large one? Why?
Yes, because a random forest is an ensemble method that combines many weak decision trees
into a strong learner. Random forests are more accurate, more robust, and less prone to
overfitting.
37. You’ve built a random forest model with 10000 trees. You got delighted after getting
training error as 0.00. But, the validation error is 34.23. What is going on? Haven’t you
trained your model perfectly?
The model has overfitted. A training error of 0.00 means the classifier has memorized the
training data patterns to such an extent that they are not present in the unseen data. Hence,
when this classifier was run on an unseen sample, it could not find those patterns and returned
predictions with a higher error. In a random forest, this happens when we use a larger number
of trees than necessary. Hence, to avoid this situation, we should tune the number of trees using
cross-validation.
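A sketch of that remedy (the dataset and the grid of tree counts are assumptions): score each forest size with 5-fold cross-validation instead of trusting the training error:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for n_trees in [10, 50, 100, 300]:
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    score = cross_val_score(rf, X, y, cv=5).mean()   # validation score, not training error
    print(n_trees, round(score, 4))
```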
38. When would you use random forests vs SVM and why?
There are a couple of reasons why a random forest is a better choice of the model than a
support vector machine:
● Random forests allow you to determine feature importance; SVMs can't do this.
● Random forests are much quicker and simpler to build than an SVM.
● For multi-class classification problems, SVMs require a one-vs-rest method, which is less
scalable and more memory intensive.
Part – B
1. Assume a disease so rare that it is seen in only one person out of every million. Assume also
that we have a test that is effective in that if a person has the disease, there is a 99 percent
chance that the test result will be positive; however, the test is not perfect, and there is a one
in a thousand chance that the test result will be positive on a healthy person. Assume that a
new patient arrives and the test result is positive. What is the probability that the patient
has the disease?
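One way to set up this calculation is Bayes' rule, writing D for "has the disease" and + for "tests positive"; the numbers below come straight from the question:

```latex
% Prior P(D) = 10^{-6}, sensitivity P(+|D) = 0.99, false-positive rate P(+|\neg D) = 0.001.
\[
P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)}
            = \frac{0.99 \times 10^{-6}}{0.99 \times 10^{-6} + 0.001 \times (1 - 10^{-6})}
            \approx 0.00099
\]
```

So even with a positive test the probability of disease is only about 0.1 percent, because the disease is so rare that false positives greatly outnumber true positives.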
2. Explain Naïve Bayes Classifier with an Example.
3. Explain SVM Algorithm in Detail.
4. Explain Decision Tree Classification.
5. Explain the principle of the gradient descent algorithm. Accompany your explanation with
a diagram. Explain the use of all the terms and constants that you introduce and comment
on the range of values that they can take.
6. Explain the following
a) Linear regression
b) Logistic Regression
UNIT IV ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING
Combining multiple learners: Model combination schemes, Voting, Ensemble Learning -
bagging, boosting, stacking, Unsupervised learning: K-means, Instance Based Learning: KNN,
Gaussian mixture models and Expectation maximization
PART – A
8. What are Gaussian mixture models? How is expectation maximization used in them?
Expectation maximization provides an iterative solution to maximum likelihood
estimation with latent variables. Gaussian mixture models are an approach to density
estimation in which the parameters of the component distributions are fit using the
expectation-maximization algorithm.
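A minimal sketch, assuming synthetic one-dimensional data and scikit-learn's GaussianMixture, which runs the EM algorithm internally (alternating the expectation and maximization steps until the parameters converge):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two latent components: one centred near -3, one near 4.
X = np.concatenate([rng.normal(-3, 1, 200), rng.normal(4, 1.5, 200)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)   # fit by EM
print(gmm.means_.ravel())    # roughly -3 and 4
print(gmm.weights_)          # roughly 0.5 and 0.5
```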
17. What are the three main types of gradient descent algorithms?
There are three types of gradient descent learning algorithms: batch gradient descent,
stochastic gradient descent and mini-batch gradient descent.
19. How do you solve the vanishing gradient problem within a deep neural network?
The vanishing gradient problem is caused by the derivative of the activation function used to
create the neural network. The simplest solution to the problem is to replace the activation
function of the network. Instead of sigmoid, use an activation function such as ReLU.
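A tiny numeric sketch (the sample inputs are arbitrary) of why the swap helps: the sigmoid derivative is at most 0.25 and shrinks rapidly for large inputs, so products of many such derivatives vanish, while the ReLU derivative stays 1 for positive inputs:

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)          # peaks at 0.25, decays towards 0 for large |x|

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for all positive inputs

x = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])
print(sigmoid_grad(x))   # about [0.0025, 0.105, 0.25, 0.105, 0.0025]
print(relu_grad(x))      # [0., 0., 0., 1., 1.]
```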
PART – B
1. Draw the architecture of a single layer perceptron (SLP) and explain its operation.
Mention its advantages and disadvantages.
2. Draw the architecture of a Multilayer perceptron (MLP) and explain its
operation. Mention its advantages and disadvantages.
3. Explain the stochastic optimization methods for weight determination.
4. Describe back propagation and features of back propagation.
5. Write the flowchart of error back-propagation training algorithm.
6. Develop a Back propagation algorithm for Multilayer Feed forward neural network
consisting of one input layer, one hidden layer and output layer from first principles.
7. List the factors that affect the performance of multilayer feed-forward neural network.
8. Difference between a Shallow Net & Deep Learning Net.
9. How do you tune hyperparameters for better neural network performance? Explain in
detail.