B.Sc. Computer Science, Semester V (CBCS), November 2022
Elective I: Artificial Intelligence
Mumbai University Examination Paper Solution: Nov-22
Q.P. Code:13494
VI. The __________ for an agent specifies the action taken by the agent in response
to any percept sequence.
A. Agent function
B. Agent program
C. Agent Structure
D. None of the above
Ans: A. Agent function
VII. To pass the total Turing test the computer will need ___________
A. Robotics
B. Computer Vision
C. Both A and B
D. None of the above
Ans: C. Both A and B
VIII. ____________ expands shallowest unexpanded node first.
A. Breadth First Search
B. Depth First Search
C. IDA
D. A*
Ans: A. Breadth First Search
IX. The ________, which determines whether a given state is a goal state.
A. Goal test
B. Path test
C. Agent test
D. None of the above
Ans: A. Goal Test
X. How does the decision tree reach its decision?
A. Single test
B. Two test
C. Sequence of tests
D. No test
Ans: C. Sequence of tests
(b) Fill in the blanks (lm, Multiple Linear Model, Problem, Solution, Activation, Evidence) [5]
1. A search algorithm takes a Problem as input and returns a Solution as output.
3. When there is more than one independent variable in the model, the linear model is termed a Multiple Linear Model.
2. Attempt any three of the following [15]
Q2(a) What are Uninformed Strategies? Explain any one in detail. (5)
Ans. 1. Uninformed search, also known as blind search, is a search algorithm that
explores a problem space without any specific knowledge or information about the
problem other than the initial state and the possible actions to take.
2. These algorithms are typically less efficient than informed search algorithms but can be
useful in certain scenarios or as a basis for more advanced search techniques.
4. Breadth-first Search:
a) Breadth-first search is the most common search strategy for traversing a tree or
graph. This algorithm searches breadthwise in a tree or graph, so it is called
breadth-first search.
b) The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of the next level.
c) Breadth-first search is implemented using a FIFO queue data structure.
d) Advantages:
BFS will provide a solution if any solution exists.
If there is more than one solution for a given problem, then BFS will provide the minimal solution, i.e., the one that requires the least number of steps.
e) Disadvantages:
It requires a lot of memory, since each level of the tree must be saved in memory in order to expand the next level.
BFS needs a lot of time if the solution is far away from the root node.
f) Space Complexity: The space complexity of BFS is determined by the memory size of the frontier, which is O(b^d), where b is the branching factor and d is the depth of the shallowest solution.
g) Completeness: BFS is complete, which means if the shallowest goal node is at
some finite depth, then BFS will find a solution.
h) Optimality: BFS is optimal if path cost is a non-decreasing function of the depth
of the node.
2. Agents can be classified into different types based on their characteristics, such as
whether they are reactive or proactive, whether they have a fixed or dynamic environment,
and whether they are single or multi-agent systems.
3. Artificial intelligence is defined as the study of rational agents. A rational agent could be
anything that makes decisions, such as a person, firm, machine, or software. It carries out an
action with the best outcome after considering past and current percepts (agent’s perceptual
inputs at a given instance).
4. Structure of an AI Agent
5. There are many examples of agents in artificial intelligence. Here are a few:
Intelligent personal assistants: These are agents that are designed to help users with
various tasks, such as scheduling appointments, sending messages, and setting
reminders. Examples of intelligent personal assistants include Siri, Alexa, and
Google Assistant.
Autonomous robots: These are agents that are designed to operate autonomously in
the physical world. They can perform tasks such as cleaning, sorting, and delivering
goods. Examples of autonomous robots include the Roomba vacuum cleaner and the
Amazon delivery robot.
6. Types of Agents:
Q2(c) What is well defined problem and solution explain for Romanian map? (5)
Ans. 1. A well-defined problem is a problem that has a clear and unambiguous statement, a
defined goal, a specified set of constraints, and a well-understood problem domain. It is
characterized by having all the necessary information and criteria required to find a
solution.
2. In the context of a Romanian map, let's consider the following well-defined problem and
its solution:
Problem: Find the shortest route between two cities on the Romanian map.
Solution: The solution to this problem involves using a pathfinding algorithm, such
as Dijkstra's algorithm or A* search algorithm, to determine the shortest route
between the given pair of cities.
a) Graph Representation: The Romanian map is modeled as a graph in which the cities are represented as nodes, and the connections between cities (roads) are represented as edges. Each edge has a weight or cost associated with it, representing the distance or travel time between the connected cities.
b) Input: Identify the starting city and the destination city for which you want to find
the shortest route.
c) Initialization: Initialize the algorithm by assigning a distance value of 0 to the
starting city and infinity to all other cities. Set the starting city as the current city.
d) Exploration: Iterate through the cities based on the selected pathfinding algorithm,
considering the neighboring cities connected by edges. Update the distance values of
the neighboring cities if a shorter path is found. Keep track of the path taken to reach
each city.
e) Goal Test: Once the destination city is reached, the algorithm terminates, as the
shortest route has been found.
f) Traceback: From the destination city, trace back the path taken by following the
cities' predecessors recorded during the exploration phase. This will give you the
sequence of cities representing the shortest route.
g) Output: The output is the shortest route, which includes the sequence of cities to be
visited from the starting city to the destination city, along with the total distance or
cost associated with the route.
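A minimal Dijkstra sketch for this problem is given below; the road distances used are the commonly quoted values from the standard textbook (AIMA) map of Romania and are included only for illustration.

import heapq

# A few roads from the textbook Romania map; distances in km (illustrative values).
romania = {
    "Arad":           {"Zerind": 75, "Sibiu": 140, "Timisoara": 118},
    "Zerind":         {"Arad": 75, "Oradea": 71},
    "Oradea":         {"Zerind": 71, "Sibiu": 151},
    "Timisoara":      {"Arad": 118, "Lugoj": 111},
    "Lugoj":          {"Timisoara": 111, "Mehadia": 70},
    "Mehadia":        {"Lugoj": 70, "Drobeta": 75},
    "Drobeta":        {"Mehadia": 75, "Craiova": 120},
    "Craiova":        {"Drobeta": 120, "Rimnicu Vilcea": 146, "Pitesti": 138},
    "Sibiu":          {"Arad": 140, "Oradea": 151, "Fagaras": 99, "Rimnicu Vilcea": 80},
    "Rimnicu Vilcea": {"Sibiu": 80, "Craiova": 146, "Pitesti": 97},
    "Fagaras":        {"Sibiu": 99, "Bucharest": 211},
    "Pitesti":        {"Rimnicu Vilcea": 97, "Craiova": 138, "Bucharest": 101},
    "Bucharest":      {"Fagaras": 211, "Pitesti": 101},
}

def dijkstra(graph, start, goal):
    """Return (total_distance, route) for the shortest path from start to goal."""
    pq = [(0, start, [start])]           # (distance so far, city, path taken)
    visited = set()
    while pq:
        dist, city, path = heapq.heappop(pq)
        if city == goal:                 # goal test: shortest route found
            return dist, path
        if city in visited:
            continue
        visited.add(city)
        for neighbour, cost in graph[city].items():
            if neighbour not in visited:
                heapq.heappush(pq, (dist + cost, neighbour, path + [neighbour]))
    return float("inf"), []

print(dijkstra(romania, "Arad", "Bucharest"))
# (418, ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'])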
Ans. 1. PEAS stands for Performance measure, Environment, Actuators, and Sensors. It is a framework used in artificial intelligence to analyze and understand the behavior and characteristics of intelligent agents.
2. Performance measure − The performance measure defines the criteria by which the success of the agent's behavior is judged.
3. Environment − The environment refers to the agent's immediate surroundings at the time the agent is working in that environment. Depending on the mobility of the agent, it might be static or dynamic.
4. Actuators − Actuators are the components through which the agent acts on its environment; object-picking arms, track-changing devices, etc. are examples of actuators.
5. Sensors − Sensors collect data about the environment; in-car driving tools like cameras and sonar systems are used to collect environmental data.
Q2(e) List and explain the categories of definitions of AI. (5)
Ans. 1. John McCarthy, in 1955, defined AI as the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence.
2. Artificial Intelligence can be divided into various types; there are mainly two categorizations, one based on capabilities and one based on functionality. The types of AI are as follows.
Artificial Intelligence type-1: Based on capabilities
1. Weak AI (Narrow AI)
2. General AI
3. Super AI
Artificial Intelligence type-2: Based on functionality
1. Reactive Machines
Purely reactive machines are the most basic types of Artificial Intelligence.
Such AI systems do not store memories or past experiences for future actions.
IBM's Deep Blue system and Google's AlphaGo are examples of reactive machines.
2. Limited Memory
Limited memory machines can store past experiences or some data for a short period
of time.
Self-driving cars are one of the best examples of Limited Memory systems. These
cars can store recent speed of nearby cars, the distance of other cars, speed limit, and
other information to navigate the road.
3. Theory of Mind
Theory of Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
This type of AI machine has not yet been developed, but researchers are making lots of effort and progress toward developing such machines.
4. Self-Awareness
Self-aware AI is the future of Artificial Intelligence; such machines would be super intelligent and would have their own consciousness and self-awareness.
Ans. 1. Greedy Best-First Search is an AI search algorithm that attempts to find the most
promising path from a given starting point to a goal. It prioritizes paths that appear to be the
most promising, regardless of whether or not they are actually the shortest path.
2. The algorithm works by evaluating each candidate node with a heuristic estimate of how close it appears to be to the goal and then expanding the node with the lowest estimate. This process is repeated until the goal is reached.
Disadvantages of Greedy Best-First Search:
Inaccurate Results: Greedy Best-First Search is not always guaranteed to find the
optimal solution, as it is only concerned with finding the most promising path.
Local Optima: Greedy Best-First Search can get stuck in local optima, meaning that
the path chosen may not be the best possible path.
Lack of Completeness: Greedy Best-First Search is not a complete algorithm, meaning it may not always find a solution even if one exists. This can happen if the algorithm gets stuck in a cycle or if the search space is too complex.
Applications of Greedy Best-First Search:
Pathfinding: Greedy Best-First Search is used for pathfinding between two points in a graph (it does not guarantee the shortest path). It is used in many applications such as video games, robotics, and navigation systems.
Machine Learning: Greedy Best-First Search can be used in machine learning algorithms to find the most promising path through a search space.
Optimization: Greedy Best-First Search can be used to optimize the parameters of a
system in order to achieve the desired result.
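A small sketch of the idea is given below, assuming a hypothetical graph and hand-made heuristic values h(n); the frontier is ordered by the heuristic alone, which is why optimality is not guaranteed.

import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand the node that *looks* closest to the goal according to the
    heuristic h(n); the path cost so far is ignored, so the result may not be optimal."""
    frontier = [(h[start], start, [start])]      # ordered by heuristic value only
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(frontier, (h[neighbour], neighbour, path + [neighbour]))
    return None

# Hypothetical graph and heuristic values (straight-line estimates to the goal G)
graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["G"]}
h = {"S": 7, "A": 6, "B": 2, "C": 1, "G": 0}
print(greedy_best_first(graph, h, "S", "G"))     # ['S', 'B', 'G']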
Ans. 1. Overfitting is a phenomenon that can occur when constructing decision trees, and it
refers to a situation where the tree model becomes excessively complex and specific to the
training data, resulting in poor performance on new, unseen data.
2. Pruning the tree, constraining its growth (e.g., limiting the maximum depth), or using regularization terms in the tree building process (e.g., cost complexity) can also help combat overfitting.
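A short scikit-learn sketch of this pruning idea follows, using synthetic data and illustrative parameter values (max_depth, min_samples_leaf, ccp_alpha are arbitrary choices).

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data; all parameter values below are example choices.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)          # unconstrained tree
pruned = DecisionTreeClassifier(max_depth=4,                           # limit tree depth
                                min_samples_leaf=5,                    # pre-pruning
                                ccp_alpha=0.01,                        # cost-complexity pruning
                                random_state=0).fit(X_tr, y_tr)

# The unconstrained tree usually scores higher on training data but lower on test data.
print("unconstrained:", deep.score(X_tr, y_tr), deep.score(X_te, y_te))
print("regularised:  ", pruned.score(X_tr, y_tr), pruned.score(X_te, y_te))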
Ans. 1. In the context of Artificial Intelligence (AI) and machine learning, entropy is a
measure of impurity or uncertainty in a dataset. It is commonly used in decision tree
algorithms, such as ID3 or C4.5, to assess the quality of a split and make informed decisions
during the learning process.
2. For a dataset S containing examples from c classes, the entropy is calculated as:
Entropy(S) = − Σ pᵢ log₂(pᵢ), summed over the classes i = 1, …, c
Where: pᵢ is the proportion of examples in S that belong to class i.
4. The resulting entropy value provides a measure of the disorder or uncertainty within the
dataset. A lower entropy indicates a more pure or homogeneous distribution of class
labels, while a higher entropy suggests greater diversity or randomness.
5. The calculation of entropy helps in decision tree algorithms to determine the most
informative feature or attribute for splitting the data. The goal is to minimize entropy by
selecting the feature that maximally reduces uncertainty and improves the separation of
different classes.
6. By comparing the entropy before and after a split, the information gain can be
calculated, which quantifies the reduction in entropy achieved by the split. High
information gain implies a more significant reduction in uncertainty and a more
effective split for decision making.
7. Entropy, along with information gain, forms the basis for feature selection and
decision-making processes in various machine learning algorithms. It enables the
identification of the most valuable features that contribute to the predictive power and
accuracy of AI models.
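A small Python sketch of the entropy and information-gain calculations described above; the class labels and the split are hypothetical.

import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent_labels, subsets):
    """Reduction in entropy achieved by splitting parent_labels into subsets."""
    total = len(parent_labels)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

# Hypothetical split of 10 examples (labels 'yes'/'no') by some attribute
parent = ["yes"] * 5 + ["no"] * 5
left, right = ["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4
print(round(entropy(parent), 3))                          # 1.0 (maximum impurity)
print(round(information_gain(parent, [left, right]), 3))  # ~0.278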
Neurons (Nodes):
An artificial neural network consists of interconnected computational units called
neurons or nodes. These nodes simulate the behavior of biological neurons and
perform computations on input data.
Layers:
Neurons are organized into layers within an ANN. The three main types of layers
are:
a. Input Layer: This layer receives and represents the input data to the network.
Each node in the input layer corresponds to a specific feature or attribute of the input
data.
b. Hidden Layers: These intermediate layers perform computations and
transformations on the input data. They extract relevant features and patterns from
the data, allowing the network to learn complex relationships.
c. Output Layer: The final layer produces the output or prediction of the network.
The number of nodes in the output layer depends on the type of problem being
solved (e.g., binary classification, multi-class classification, regression).
Feedforward and Backpropagation:
In the feedforward phase, data flows through the network from the input layer to the output layer, with computations and activations occurring at each neuron. The output is then compared to the desired output, and an error or loss value is computed. In the backpropagation phase, this error is propagated backward through the network and the weights and biases are adjusted (typically via gradient descent) to reduce it.
Training and Learning:
ANNs are trained using labeled training data. The network iteratively adjusts its
weights and biases to minimize the error between its predictions and the true labels.
This learning process is known as supervised learning, where the network learns
from examples and gradually improves its performance.
3. Artificial neural networks are widely used in various domains, including image
recognition, natural language processing, speech recognition, recommendation systems, and
many more. Their ability to model complex non-linear relationships and learn from data
makes them a powerful tool in the field of artificial intelligence and machine learning.
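A minimal NumPy sketch of the feedforward and backpropagation steps described above, assuming an arbitrary architecture (one hidden layer of 8 sigmoid units), learning rate, and the XOR toy task.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

# One hidden layer with 8 neurons, sigmoid activations (sizes are arbitrary choices)
W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # Feedforward: input layer -> hidden layer -> output layer
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: propagate the error backward and adjust weights and biases
    d_out = (out - y) * out * (1 - out)           # error signal at the output (squared error)
    d_h = (d_out @ W2.T) * h * (1 - h)            # error signal at the hidden layer
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

# Approaches [0, 1, 1, 0] as training proceeds (convergence depends on initialization)
print(out.round(2).ravel())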
Q3(d) What is Logistic Regression explain with the help of an example. (5)
Ans. 1. Logistic regression is a statistical learning algorithm used for binary classification
problems, where the goal is to predict the probability of an input belonging to one of two
classes. Despite its name, logistic regression is a classification algorithm rather than a
regression algorithm.
Suppose you work for a bank and want to predict whether a loan applicant will be approved
or denied based on certain factors such as income, credit score, and debt-to-income ratio.
You have a dataset containing historical loan applications along with their approval
outcomes.
Data Preparation:
Each loan application in the dataset is represented by a set of features, such as
income, credit score, and debt-to-income ratio. Let's assume we have two features:
income (numeric) and credit score (numeric).The target variable is the loan approval
outcome, which can take two values: approved (1) or denied (0).
Hypothesis and Model:
In logistic regression, we assume a linear relationship between the features and the
log-odds of the loan approval. The log-odds are transformed using the logistic
function (sigmoid function) to obtain the predicted probability. The logistic
regression model can be represented as:
P(approved = 1 | income, credit score) = σ(W₀ + W₁·income + W₂·credit score), where σ(z) = 1 / (1 + e^(−z)) is the sigmoid (logistic) function and W₀, W₁, and W₂ are the parameters (weights) to be learned from the data.
Learning and Training:
The goal is to learn the optimal values of the parameters that best fit the data. This is
achieved through an optimization algorithm, such as maximum likelihood estimation
or gradient descent. The algorithm iteratively adjusts the weights to minimize the
difference between the predicted probabilities and the actual loan approval outcomes
in the training data.
Decision Boundary and Prediction:
Once the model is trained, it can be used to predict the loan approval outcome for
new loan applications.
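A short scikit-learn sketch of the loan example, using a small hypothetical dataset of (income, credit score) pairs made up purely for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical loan data: [income (thousands), credit score]; 1 = approved, 0 = denied
X = np.array([[25, 580], [40, 620], [55, 700], [60, 710],
              [32, 600], [75, 760], [28, 590], [80, 780]])
y = np.array([0, 0, 1, 1, 0, 1, 0, 1])

model = LogisticRegression(max_iter=1000).fit(X, y)

# Predicted probability of approval for a new applicant (income 50k, score 680)
new_applicant = np.array([[50, 680]])
print(model.predict_proba(new_applicant)[0, 1])   # probability of class 1 (approved)
print(model.predict(new_applicant))               # class label using a 0.5 threshold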
Ans.
1. Support Vector Machine (SVM) is a powerful supervised machine learning algorithm used
for classification and regression tasks. It finds an optimal hyperplane or decision boundary
that maximally separates different classes while maximizing the margin between them.
2. Here's an explanation of SVM and its properties:
Linear Separability:
SVM works on the principle of finding a hyperplane that can linearly separate the
input data into different classes. Linear separability means that the classes can be
separated by a straight line or hyperplane in the feature space.
Margin Maximization:
SVM aims to find the hyperplane that maximizes the margin between the classes.
The margin is the distance between the hyperplane and the closest data points from
each class. By maximizing the margin, SVM seeks to achieve better generalization
and robustness to new data.
Support Vectors:
Support vectors are the data points that lie closest to the decision boundary or
hyperplane. These points have the most influence on determining the hyperplane's
position and are used to define the decision boundary. SVM focuses only on these
support vectors, making it memory-efficient and suitable for handling large datasets.
Kernel Functions:
SVM can handle non-linearly separable data by using kernel functions. Kernel
functions transform the input data into a higher-dimensional feature space, where it
becomes linearly separable. This transformation allows SVM to find a hyperplane in
the transformed feature space, which corresponds to a non-linear decision boundary
in the original space.
Interpretability:
SVM provides interpretability in terms of support vectors and their associated
weights. The support vectors represent the most critical instances for determining
the decision boundary. Additionally, the weight assigned to each feature provides
insights into the feature's importance in the classification process.
3. Support Vector Machines are widely used in various domains, including image
classification, text classification, bioinformatics, and finance. They offer good
generalization performance, especially when the data is well-separated or when kernel
functions are used to handle non-linear relationships.
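A brief scikit-learn sketch illustrating a non-linear (RBF-kernel) SVM on a toy dataset; the kernel and the value of C are illustrative choices.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Non-linearly separable toy data
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)   # kernel trick for non-linearity

print("test accuracy:", clf.score(X_te, y_te))
print("support vectors per class:", clf.n_support_)             # only these points define the boundary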
2. It leverages the wisdom of the crowd by aggregating the predictions of multiple models
to achieve better overall performance.
3. Bagging (Bootstrap Aggregating) is one popular ensemble learning algorithm that works
by creating multiple subsets of the original training data through random sampling with
replacement. Each subset is used to train a separate base learner, typically using the same
learning algorithm. The individual models are then combined or aggregated to make predictions.
Data Sampling:
Bagging starts by creating multiple subsets of the original training data through
random sampling with replacement. Each subset, known as a bootstrap sample, has
the same size as the original dataset.
Base Learners:
A base learning algorithm, such as decision trees, is applied to each bootstrap
sample independently. The same learning algorithm is typically used to ensure
consistency and comparability among the base learners.
Prediction Aggregation:
Bagging combines the predictions of the individual models to make final predictions, for example by majority voting for classification or by averaging for regression.
Robustness and Generalization:
Bagging aims to improve the overall performance of the ensemble by reducing
variance and increasing stability.
5. The bagging algorithm, through its aggregation of multiple models, reduces overfitting,
improves generalization, and increases prediction accuracy.
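A short scikit-learn sketch of bagging with decision trees (the library's default base learner); the dataset and the number of estimators are illustrative.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; the number of estimators is an arbitrary illustrative choice.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
bagged = BaggingClassifier(n_estimators=50,      # 50 bootstrap samples, one tree each
                           bootstrap=True,       # sampling with replacement
                           random_state=0).fit(X_tr, y_tr)   # default base learner: decision tree

print("single tree :", single_tree.score(X_te, y_te))
print("bagged trees:", bagged.score(X_te, y_te))   # usually more accurate and stable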
Ans. 1. Maximum Likelihood Estimation (MLE) is a widely used method for parameter
learning in continuous models. It involves finding the parameter values that maximize the
likelihood of observing the given data.
Model Assumption: Continuous models assume that the data follows a specific
probability distribution, such as the Gaussian distribution.
Likelihood Function: The likelihood function measures the probability of observing
the given data under the assumed distribution. For continuous models, the likelihood
function is the product of the probability density function (PDF) evaluated at each
data point.
Log-Likelihood Function: To simplify calculations and avoid numerical instability,
it is common to work with the log-likelihood function, which is the logarithm of the
likelihood function.
Parameter Estimation: The goal is to find the parameter values that maximize the
log-likelihood function or minimize the negative log-likelihood (NLL) function.
Optimization algorithms, such as gradient descent or numerical optimization
methods, are used to iteratively update the parameter values.
Convergence: The optimization process continues until a convergence criterion is
met, typically based on the change in log-likelihood or the gradient magnitude
falling below a threshold.
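A small sketch of Gaussian MLE on synthetic data, showing the closed-form estimates and the corresponding negative log-likelihood; SciPy's norm.fit returns the same maximum-likelihood estimates.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=1000)   # synthetic Gaussian data

# Closed-form maximum likelihood estimates for a Gaussian model:
# mu_hat = sample mean, sigma_hat = root mean squared deviation (divides by n, not n-1)
mu_hat = data.mean()
sigma_hat = np.sqrt(((data - mu_hat) ** 2).mean())

# Negative log-likelihood (NLL) of the data under the fitted parameters
nll = -norm.logpdf(data, loc=mu_hat, scale=sigma_hat).sum()

print(mu_hat, sigma_hat)   # close to the true values 5.0 and 2.0
print(norm.fit(data))      # SciPy's MLE fit gives the same estimates
print(nll)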
2. It involves an agent that learns to navigate and optimize its behavior by receiving rewards
or penalties based on its actions.
Agent and Environment: In RL, an agent interacts with an environment, which can
be a simulated or real-world setting. The agent takes actions, and the environment
responds with new states and rewards.
Learning through Feedback: The agent learns through a trial-and-error process. It
doesn't receive explicit instructions but instead learns from the consequences of its
actions.
Feedback is provided in the form of rewards or penalties, which indicate the
desirability or undesirability of the agent's actions.
Exploration and Exploitation: RL involves a trade-off between exploration and
exploitation. Initially, the agent explores different actions and their outcomes to
learn about the environment. Over time, it starts exploiting its knowledge to make
optimal decisions.
Markov Decision Process (MDP): RL problems are often formulated as Markov
Decision Processes, which formalize the sequential decision-making process. MDPs
consist of states, actions, transition probabilities, rewards, and a discount factor to
balance immediate and future rewards.
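A minimal tabular Q-learning sketch on a tiny hypothetical MDP (a row of 5 states with a reward at the right end), illustrating the reward feedback, the epsilon-greedy exploration/exploitation trade-off, and the discount factor.

import numpy as np

# Tiny illustrative MDP: 5 states in a row, actions 0 = left, 1 = right.
# Reaching the rightmost state gives reward +1 and ends the episode.
n_states, n_actions, goal = 5, 2, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2      # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

for _ in range(1000):                      # episodes of trial-and-error interaction
    s = 0
    while s != goal:
        # Exploration vs. exploitation (epsilon-greedy action selection)
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
        r = 1.0 if s_next == goal else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))   # learned greedy policy: action 1 (move right) in states 0-3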
Q4(c) What are beta distributions explain with the example. (5)
Ans. The Beta distribution is a continuous probability distribution defined on the interval [0, 1] and parameterized by two shape parameters, α and β; it is commonly used to model probabilities or proportions.
Example: Suppose you have an e-commerce website and want to optimize the conversion rate (the
proportion of visitors who make a purchase).
You have collected data on 1000 visitors, where 400 made a purchase and 600 did not.
To model the conversion rate, you can use a Beta distribution with parameters α = 401 (400
+ 1) and β = 601 (600 + 1). With this Beta distribution, you can compute statistics such as
the mean, variance, and percentiles to understand the distribution of conversion rates.
One of the strengths of the Beta distribution is its conjugacy in Bayesian inference. This
means that if the prior distribution for a parameter is Beta, and the likelihood function is
binomial, the posterior distribution will also be a Beta distribution. This property allows for
efficient updating of beliefs and incorporating new evidence as more data becomes
available.
Applications:
The Beta distribution finds applications in various AI tasks, including A/B testing,
click-through rate modeling, sentiment analysis, and Bayesian modeling.
It is particularly useful when dealing with proportions or probabilities constrained
within a range, providing a flexible and interpretable distribution for modeling
uncertainty.
The Beta distribution is a valuable tool in AI and machine learning, providing a
flexible framework for modeling random variables constrained within a specific
range. Its ability to capture uncertainty and update beliefs in a Bayesian setting
makes it suitable for a wide range of applications involving proportions and
probabilities.
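A short SciPy sketch of the conversion-rate example above, using the posterior Beta(401, 601) obtained from a uniform Beta(1, 1) prior.

from scipy import stats

# Posterior for the conversion rate after 400 purchases out of 1000 visitors,
# starting from a uniform Beta(1, 1) prior: Beta(alpha = 400 + 1, beta = 600 + 1).
posterior = stats.beta(a=401, b=601)

print(posterior.mean())           # ~0.400, the expected conversion rate
print(posterior.std())            # uncertainty around that estimate
print(posterior.interval(0.95))   # 95% credible interval, roughly (0.37, 0.43)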
Step 1: Initialization: Start by initializing the model's parameters. This could involve setting
initial values randomly or based on prior knowledge.
Step 2: E-step (Expectation): In the E-step, the algorithm estimates the expected value of
the hidden variables given the observed data and the current parameter estimates. It
computes the posterior distribution of the hidden variables, which represents the probability
distribution over the possible values of the hidden variables.
Step 3: M-step (Maximization): In the M-step, the algorithm updates the parameter
estimates to maximize the likelihood of the observed data given the expected values of the
hidden variables obtained from the E-step.This step involves finding the parameter values
that maximize the expected log-likelihood, also known as the Q-function.
Step 4: Iteration: The E-step and M-step are repeated iteratively until convergence. In each
iteration, the E-step computes the expected values of the hidden variables, and the M-step
updates the parameter estimates based on these expected values.
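A brief sketch using scikit-learn's GaussianMixture, which fits a mixture model with exactly this E-step/M-step loop; the synthetic data and component parameters are illustrative.

import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1-D data drawn from two hidden Gaussian components (the hidden variable
# is which component generated each point); the parameters below are illustrative.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(6, 1.5, 200)]).reshape(-1, 1)

# GaussianMixture estimates the component means, variances, and weights with EM,
# alternating E-steps (responsibilities) and M-steps (parameter updates) until convergence.
gmm = GaussianMixture(n_components=2, random_state=0).fit(data)

print(gmm.means_.ravel())            # close to the true means 0 and 6 (in some order)
print(gmm.weights_)                  # close to the true mixing proportions 0.6 and 0.4
print(gmm.predict_proba(data[:3]))   # E-step output: posterior over the hidden components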
Ans. 1. Naive Bayes is a simple yet effective probabilistic machine learning algorithm used
for classification tasks. It is based on the principle of Bayes' theorem and assumes
independence among the features, hence the term "naive."
Bayes' Theorem: Naive Bayes is based on Bayes' theorem, which calculates the posterior
probability of a class given the observed features.
Bayes' theorem states that P(C|X) = (P(X|C) * P(C)) / P(X), where C represents the class
label and X represents the set of observed features.
3. Independence Assumption: The Naive Bayes algorithm assumes that the features are
conditionally independent of each other given the class label. This assumption simplifies the
calculations and allows for efficient inference.
4. Parameter Estimation: Naive Bayes estimates the parameters required for classification.
For categorical features, it estimates the class prior probabilities P(C) and the conditional
probabilities P(X|C) for each feature given each class.
5. Classification: To classify a new instance with Naive Bayes, the algorithm calculates the
posterior probabilities for each class given the observed features using Bayes' theorem.
6. Naive Bayes is known for its simplicity, speed, and scalability. It works well on large
datasets and can handle high-dimensional feature spaces.
7. Although it assumes feature independence, Naive Bayes often performs surprisingly well
in practice, especially for text classification, spam filtering, sentiment analysis, and other
similar tasks.
8. It is a popular choice for baseline models and is widely used in various applications due to
its efficiency and reasonable performance.
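A short scikit-learn sketch using Gaussian Naive Bayes on the Iris dataset, showing the estimated priors P(C) and the posteriors P(C|X) used for classification.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# GaussianNB estimates P(C) and per-feature P(X|C), assuming conditional independence.
nb = GaussianNB().fit(X_tr, y_tr)

print("accuracy:", nb.score(X_te, y_te))
print("class priors P(C):", nb.class_prior_)
print("posterior P(C|X) for one sample:", nb.predict_proba(X_te[:1]))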
Ans. 1. A Hidden Markov Model (HMM) is a statistical model used to model sequential
data with hidden states. It is widely employed in various fields, including speech
recognition, natural language processing, bioinformatics, finance, and more.
States: HMM consists of a set of hidden states, which represent the underlying,
unobservable variables of the system. These states form a Markov chain, implying
that the probability of transitioning from one state to another only depends on the
current state and not on the history.
Observations: Each hidden state generates an observable output (observation)
according to a specific probability distribution. However, the exact state generating
an observation remains unknown (hence "hidden").
State Transition Probabilities: HMM incorporates state transition probabilities,
indicating the likelihood of moving from one state to another. These probabilities are
usually represented in a transition matrix.
Emission Probabilities: Each state is associated with an emission probability
distribution, which models the likelihood of generating a particular observation
given the state. These probabilities are often presented in an emission matrix.
HMMs are used to solve three basic problems:
Evaluation: computing the probability of an observation sequence given the model (solved with the forward algorithm).
Decoding: finding the most likely sequence of hidden states for a given observation sequence (solved with the Viterbi algorithm).
Learning: estimating the transition and emission probabilities from observed data (solved with the Baum-Welch / EM algorithm).
Applications:
Speech Recognition: HMMs are used to model phonemes and words in speech recognition
systems, aiding in speech-to-text conversion.
Natural Language Processing: HMMs are employed for part-of-speech tagging, named
entity recognition, and other sequential text processing tasks.
Bioinformatics: HMMs are applied to analyze DNA and protein sequences, identifying
genes, regulatory elements, and functional domains.
Finance: HMMs are used for predicting market trends and volatility.
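A small sketch of the decoding problem: a pure-NumPy Viterbi implementation on a classic toy weather HMM (the states, observations, and probabilities are illustrative).

import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Decoding: most likely sequence of hidden states for an observation sequence."""
    n_states, T = len(start_p), len(obs)
    prob = np.zeros((T, n_states))            # probability of the best partial path per state
    back = np.zeros((T, n_states), dtype=int)
    prob[0] = start_p * emit_p[:, obs[0]]
    for t in range(1, T):
        for s in range(n_states):
            scores = prob[t - 1] * trans_p[:, s] * emit_p[s, obs[t]]
            back[t, s] = scores.argmax()
            prob[t, s] = scores.max()
    # Trace the best path backwards from the most probable final state
    path = [int(prob[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical weather HMM: hidden states 0 = Rainy, 1 = Sunny;
# observations 0 = walk, 1 = shop, 2 = clean.
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3], [0.4, 0.6]])            # trans[i, j] = P(state j | state i)
emit = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])   # emit[i, k] = P(obs k | state i)
print(viterbi([0, 1, 2], start, trans, emit))         # [1, 0, 0] i.e. Sunny, Rainy, Rainy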
Ans: A goal-based agent selects its actions by considering how well they help it achieve explicit goals. Its main components and working are:
Goals: The agent's goals define the desired states or outcomes that it aims to achieve.
These goals can be simple or complex, short-term or long-term, and may be
predefined by the designer or learned from experience.
Perception: The agent perceives its environment through sensors, which provide
information about the current state of the environment and the agent's internal state.
Decision-making: The agent uses its internal knowledge, reasoning capabilities, and
possibly learning mechanisms to make decisions about which actions to take. The
goal-based agent evaluates potential actions based on their expected outcomes and
their alignment with achieving its goals.
Execution and Feedback Loop: The agent interacts with the environment, performs
actions, and then receives feedback or observations about the consequences of its
actions. This feedback is used to update the agent's internal state and knowledge,
which can affect future decisions and actions.
Chess-playing Agent: In a chess-playing agent, the goal is to win the game. The
agent observes the current state of the chessboard, analyzes potential moves, and
selects the move that maximizes the likelihood of winning.
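A minimal, self-contained sketch of the perceive-decide-act-feedback loop described above; the toy environment, its method names, and the action set are hypothetical illustrations, not a specific API.

# A minimal goal-based agent loop on a toy 1-D world.
class LineWorld:
    """Environment: the agent sits at an integer position and can step left or right."""
    def __init__(self, start, goal):
        self.position, self.goal = start, goal
    def perceive(self):                       # sensors: report the current state
        return self.position
    def execute(self, action):                # actuators: act on the environment
        self.position += action
        return self.position                  # feedback about the consequence

def goal_based_agent(env, goal_state, actions=(-1, +1)):
    state = env.perceive()
    while state != goal_state:                # goal test
        # Decision-making: pick the action whose predicted outcome is closest to the goal
        action = min(actions, key=lambda a: abs((state + a) - goal_state))
        state = env.execute(action)           # execute, then use feedback to update state
    return state

print(goal_based_agent(LineWorld(start=0, goal=5), goal_state=5))   # reaches state 5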
2. The evaluation metrics and methods can vary based on the nature of the problem, the
available resources, and the specific goals of the assessment. Here are some common
approaches to measure problem-solving performance:
Exploration vs. Exploitation: In tasks where exploration is necessary, evaluating how well an agent balances exploration (trying new approaches) and exploitation (using known strategies) can be important.
Learning Curves: For AI systems, learning curves can be used to measure problem-
solving performance over time. This involves plotting performance metrics against
the number of training examples or problem-solving attempts to understand the
system's learning progress.
Human Evaluation: In some cases, especially in creative or subjective problem-
solving tasks, human evaluation may be necessary. Experts or independent judges
can assess the quality of solutions generated by individuals or AI systems.
Ans. 1. Linear regression is a statistical method used to model the relationship between a
dependent variable and one or more independent variables by fitting a linear equation to the
observed data.
2. The goal of linear regression is to find the best-fitting line (or hyperplane in higher dimensions) that minimizes the difference between the predicted values and the actual values of the dependent variable.
3. It is one of the simplest and most widely used techniques in statistics and machine learning for predicting continuous numerical outcomes.
4. There are different types of linear regression based on the number of independent variables and the specific characteristics of the data. The two main types are Simple Linear Regression (a single independent variable) and Multiple Linear Regression (two or more independent variables). Regularized variants such as Ridge and Lasso add a penalty term to the linear regression cost function to prevent overfitting and handle multicollinearity issues in multiple linear regression.
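A short scikit-learn sketch contrasting ordinary multiple linear regression with a Ridge-regularized fit on synthetic, nearly collinear features; all data and parameter values are illustrative.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Illustrative data: y depends linearly on two correlated features plus noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 * 0.95 + rng.normal(scale=0.1, size=200)     # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)                   # multiple linear regression
ridge = Ridge(alpha=1.0).fit(X, y)                   # penalized (regularized) variant

print("OLS coefficients:  ", ols.coef_, ols.intercept_)
print("Ridge coefficients:", ridge.coef_, ridge.intercept_)   # shrunk, more stable estimates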
Q5(d) How supervised learning algorithm works explain with example? (5)
Ans. 1. Supervised learning is a type of machine learning where the algorithm learns from a
labeled dataset, meaning it is provided with input-output pairs during the training process.
The goal is for the algorithm to learn the mapping between the input and the corresponding
output so that it can make accurate predictions on new, unseen data.
2. Example: Suppose you want to predict house prices based on features such as size, number of bedrooms, and location.
Data Collection: First, you need to gather a dataset with labeled examples. It would contain rows of data, where each row represents a house, and the columns represent the features (size, bedrooms, location) and the target variable (price).
Data Preprocessing: Before training the model, you need to preprocess the data. This
includes tasks like handling missing values, converting categorical variables (like
"Location") into numerical representations (e.g., one-hot encoding), and scaling the
features to bring them to a similar range.
Training the Model: Now, you can use the labeled data to train a supervised learning
algorithm, such as a linear regression model. The algorithm will learn the
relationship between the input features (size, bedrooms, location) and the target
variable (price) by adjusting its internal parameters during the training process.
Model Evaluation: After training, you need to evaluate the model's performance to
see how well it can predict house prices on new, unseen data. You can split the
dataset into a training set and a test set. The training set is used to train the model,
and the test set is used to evaluate its performance. Common evaluation metrics for
regression tasks include Mean Squared Error (MSE) and R-squared (coefficient of
determination).
Making Predictions: Once the model is trained and evaluated, you can use it to make
predictions on new houses. For instance, if you have the features of a house that you
want to sell, such as its size, number of bedrooms, and location, you can input these
values into the trained model, and it will predict the corresponding price for that
house.
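A compact scikit-learn sketch of this whole workflow on a tiny hypothetical house-price dataset; the values are made up purely for illustration.

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Tiny hypothetical dataset (sizes in sq. ft., prices in lakhs; values are illustrative)
df = pd.DataFrame({
    "size":     [600, 850, 900, 1200, 1500, 1800, 2000, 2400],
    "bedrooms": [1, 2, 2, 3, 3, 4, 4, 5],
    "location": ["suburb", "city", "suburb", "city", "city", "suburb", "city", "city"],
    "price":    [30, 55, 48, 80, 95, 100, 130, 155],
})

X = pd.get_dummies(df[["size", "bedrooms", "location"]])   # one-hot encode "location"
y = df["price"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_tr, y_tr)                 # training on labeled pairs
pred = model.predict(X_te)                                 # predictions on unseen houses

print("MSE:", mean_squared_error(y_te, pred))              # evaluation on the test set
print("R^2:", r2_score(y_te, pred))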
Ans. 1. Statistical learning is a field of study that focuses on developing and applying
methods to analyze and model data to make predictions and gain insights.
2. It involves using statistical techniques and mathematical models to learn patterns and
relationships from data.
Q5(f)Explain any one application of Reinforcement learning? (5)
Ans. 1. Reinforcement learning is a type of machine learning in which an agent learns by interacting with an environment and receiving rewards or penalties for its actions.
2. The goal of the agent is to maximize the cumulative reward over time, leading to the
discovery of optimal strategies or policies.
3. One important application of reinforcement learning is robotics.
4. For example, a robot can learn to walk, pick up objects, or navigate through a cluttered environment using reinforcement learning.
5. The robot receives rewards or penalties based on the success or failure of its actions, allowing it to improve its movements and behaviors over time, ultimately becoming more efficient and adaptive in completing tasks.
Ans. 1. A heuristic function, usually written h(n), is the central component of an informed (heuristic) search strategy.
2. It is a function that provides an estimate or approximation of the cost or value of reaching a goal state from a given state in a search space.
3. Estimation of Cost: The heuristic function evaluates the potential cost or distance from the
current state to the goal state. It provides an informed estimate, which means it guides the
search algorithm towards promising paths that are likely to lead to the goal.
5. Influence on Search Algorithms: Heuristic functions play a crucial role in guiding search
algorithms like A* (A star) and Best-First Search. By using the heuristic function, these
algorithms prioritize exploring states that are expected to lead to the goal, making the search
more efficient and effective.
6. The effectiveness of a heuristic function lies in its ability to strike a balance between
providing a good estimate of the cost while avoiding excessive computational complexity.
A well-designed heuristic function can significantly speed up search algorithms and help
find solutions more quickly in various AI and optimization problems.
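A minimal A* sketch showing how the heuristic estimate h(n) is combined with the path cost g(n); the graph, step costs, and heuristic values are hypothetical.

import heapq

def a_star(graph, h, start, goal):
    """A* search: orders the frontier by f(n) = g(n) + h(n), i.e. the cost so far
    plus the heuristic estimate of the remaining cost to the goal."""
    frontier = [(h[start], 0, start, [start])]          # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for neighbour, cost in graph.get(node, {}).items():
            g_new = g + cost
            if g_new < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = g_new
                heapq.heappush(frontier,
                               (g_new + h[neighbour], g_new, neighbour, path + [neighbour]))
    return float("inf"), []

# Hypothetical graph with step costs and heuristic estimates h(n) to the goal G
graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 12}, "B": {"G": 5}}
h = {"S": 7, "A": 6, "B": 4, "G": 0}
print(a_star(graph, h, "S", "G"))    # (8, ['S', 'A', 'B', 'G'])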
Ans. 1. A non-parametric model is a type of statistical model that does not make explicit
assumptions about the underlying distribution of the data. Unlike parametric models, which
have a fixed number of parameters, non-parametric models can adapt to the complexity of
the data and often require more data to learn effectively.
Flexibility: Non-parametric models are highly flexible and can represent complex
relationships between variables without imposing specific constraints on the data
distribution. They can handle both linear and non-linear relationships, making them
suitable for a wide range of data types and patterns.
Adaptability: Non-parametric models can adjust to the size of the dataset, becoming
more accurate as more data becomes available. They do not assume fixed parameter
values, making them well-suited for situations where the underlying data distribution
may be unknown or change over time.
Example: One of the most popular non-parametric models is the k-nearest neighbors
(KNN) algorithm. In KNN, predictions are made based on the majority class of the
k-nearest data points to the target instance. It does not make assumptions about the
data distribution and can handle complex decision boundaries in classification tasks.
Non-parametric models are commonly used in various machine learning tasks, such
as classification, regression, density estimation, and clustering. However, they may
require more computational resources and larger datasets compared to parametric
models due to their increased flexibility and adaptability.
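A short scikit-learn sketch of the k-nearest neighbours example mentioned above, on the Iris dataset; k = 5 is an arbitrary choice.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# KNN keeps the training data itself (no fixed parametric form) and predicts the
# majority class among the k nearest stored examples.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

print("test accuracy:", knn.score(X_te, y_te))
print("prediction for one flower:", knn.predict(X_te[:1]))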