Artificial Intelligence PYQ Theory
● What are the types of NLP models?
NLP models can be classified into two main types: rule-based and statistical. Rule-
based models use predefined rules and dictionaries to analyze and generate natural
language data. Statistical models use probabilistic methods and data-driven
approaches to learn from language data and make predictions.
NLP models have many applications in various domains and industries, such as
search engines, chatbots, voice assistants, social media analysis, text mining,
information extraction, natural language generation, machine translation, speech
recognition, text summarization, question answering, sentiment analysis, and more.
Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon of a language means the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
Syntactic Analysis (Parsing) − It involves analyzing the words in a sentence for grammar and arranging them in a manner that shows the relationships among the words. A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
Semantic Analysis − It draws the exact meaning or the dictionary meaning from the text. The text is checked for meaningfulness by mapping syntactic structures to objects in the task domain. The semantic analyzer disregards sentences such as “hot ice-cream”.
Discourse Integration − The meaning of any sentence depends upon the meaning of the sentence just before it. It also influences the meaning of the immediately succeeding sentence.
Pragmatic Analysis − During this phase, what was said is re-interpreted in terms of what was actually meant. It involves deriving those aspects of language which require real-world knowledge.
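The first two phases can be illustrated with a small sketch, assuming the NLTK library (and its punkt and POS-tagger data packages) is available; the sample sentence is invented:

```python
# Lexical analysis (sentence/word tokenization) and the first step of
# syntactic analysis (part-of-speech tagging) with NLTK.
import nltk  # assumes nltk.download("punkt") and the tagger data are installed

text = "The boy goes to school. The school is nearby."

sentences = nltk.sent_tokenize(text)                  # split text into sentences
tokens = [nltk.word_tokenize(s) for s in sentences]   # split sentences into words
tagged = [nltk.pos_tag(t) for t in tokens]            # label each word's part of speech

print(tagged[0])   # e.g. [('The', 'DT'), ('boy', 'NN'), ('goes', 'VBZ'), ...]
```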
What Is Natural Language Understanding (NLU)?
● The most basic form of NLU is parsing, which takes text written in
natural language and converts it into a structured format that
computers can understand.
● For example, the words "hello world" would be converted into their
respective parts of speech (nouns and verbs), while "I am hungry"
would be split into two sentences: "I am" and "hungry."
● Parsing is only one part of NLU; other tasks include sentiment analysis,
entity recognition, and semantic role labeling.
Applications of AI
AI has been dominant in various fields such as −
Gaming, Expert Systems, Vision Systems, Speech Recognition, Handwriting Recognition, and Intelligent Robots.
For example, it analyzes what a human means. This can happen on a small scale, such as when a system reads a user's question on Twitter and replies with an answer, or on a large scale, like when Google parses millions of documents to figure out what they're about.
Human preference in encoding uncertainty refers to the tendencies or biases that humans
exhibit when parsing ambiguous language structures or sentences. Parsing involves analyzing
the syntactic structure of a sentence to determine its meaning, and ambiguity arises when there
are multiple possible interpretations for a given sentence.
The text outlines several principles that reflect human preferences in resolving ambiguity during
parsing:
1. Minimal Attachment: People prefer simpler sentence structures when faced with ambiguity.
They tend to choose interpretations that involve the fewest elements in the sentence's structure.
2. Right Association (Late Closure): When encountering new parts of a sentence, like
phrases or clauses, people tend to assume they're connected to the most recent part they've
read or heard. This helps maintain a smooth flow of understanding.
3. Lexical Preferences: The specific words used in a sentence can influence how people
interpret ambiguous parts. Some words, like verbs, have preferences for how they interact with
other parts of the sentence. This can override other parsing principles in certain situations.
In essence, when people encounter uncertainty in understanding a sentence, they typically opt for simpler structures, assume new parts relate to what came just before, and take into account the specific words used to help make sense of the ambiguity.
Agents can be grouped into five classes based on their degree of perceived intelligence and capability:
Simple reflex agent
Key Components:
Percept: Current input from the environment.
Condition-Action Rules: Predefined "if-then" rules.
Action: Output based on the current percept.
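A minimal sketch of such an agent, assuming an invented vacuum-world style percept set and rule table:

```python
# Simple reflex agent: the action depends only on the current percept,
# looked up in a table of condition-action rules (all values are illustrative).
def simple_reflex_agent(percept):
    rules = {
        "dirty": "suck",        # if the square is dirty then clean it
        "clean": "move_right",  # if the square is clean then move on
    }
    return rules.get(percept, "no_op")

print(simple_reflex_agent("dirty"))   # -> suck
print(simple_reflex_agent("clean"))   # -> move_right
```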
Model-based reflex agent
○ The Model-based agent can work in a partially observable environment, and track the situation.
○ These agents have the model, "which is knowledge of the world" and based on the model they
perform actions.
Goal-based agents
○ Knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
○ The agent needs to know its goal which describes desirable situations.
○ Goal-based agents expand the capabilities of the model-based agent by having the "goal"
information.
Utility-based agents
○ These agents are similar to the goal-based agent but provide an extra component of utility
measurement which makes them different by providing a measure of success at a given state.
○ A utility-based agent acts based not only on goals but also on the best way to achieve them.
○ The Utility-based agent is useful when there are multiple possible alternatives, and an agent has
to choose in order to perform the best action.
○ The utility function maps each state to a real number to check how efficiently each action
achieves the goals.
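A tiny sketch of that idea, with made-up states and utility values: the agent evaluates the state each action leads to and picks the action with the highest utility.

```python
# Utility-based choice: map each resulting state to a real number and
# choose the action whose outcome scores highest (all values are invented).
def utility(state):
    return {"goal_fast": 1.0, "goal_slow": 0.6, "stuck": 0.0}[state]

actions = {"highway": "goal_fast", "back_road": "goal_slow", "wait": "stuck"}
best_action = max(actions, key=lambda a: utility(actions[a]))
print(best_action)   # -> highway
```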
Learning Agents
○ A learning agent in AI is the type of agent which can learn from its past experiences, or which has learning capabilities.
○ It starts to act with basic knowledge and is then able to act and adapt automatically through learning.
○ A learning agent has mainly four conceptual components, which are:
a. Learning element: It is responsible for making improvements by learning from the environment.
b. Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
c. Performance element: It is responsible for selecting the external actions.
d. Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.
Knowledge Acquisition
● Knowledge Acquisition in AI Overview:
● Knowledge acquisition in artificial intelligence encompasses the sophisticated
process of gathering, filtering, and comprehending information and experiences within
specific domains.
● It serves as the cornerstone for both machine learning and knowledge-based
systems, providing the foundational understanding necessary for AI systems to make
informed decisions.
● Learning from Examples: AI systems learn patterns and generalizations from curated
training data, a common approach in machine learning.
● Natural Language Processing: Utilizing advanced techniques like text mining, this
method extracts knowledge from textual data, aiding in information comprehension.
● Semantic Web: Standards like RDF and OWL enable structured representation of
knowledge on the internet, facilitating computer processing.
● Model Selection: Selecting appropriate AI models tailored to specific tasks and domains.
● Transfer Learning: Enabling AI systems to adapt and learn from new environments and
data streams for real-world applicability.
Parsing
1. Parsing Definition: Parsing is an automated process that analyzes the structure
of a sentence based on predefined grammar rules and a lexicon. It involves
breaking down the sentence into its constituent parts and determining their
syntactic relationships.
2. Grouping and Labeling: During parsing, the parser groups and labels different
parts of the sentence to illustrate their relationships. This helps in understanding
how the components of the sentence interact.
Machine learning is the branch of Artificial Intelligence that focuses on developing models and
algorithms that let computers learn from data and improve from previous experience without
being explicitly programmed for every task. In simple words, ML teaches the systems to think
and understand like humans by learning from the data.
There are several types of machine learning, each with special characteristics and applications.
Some of the main types of machine learning algorithms are as follows:
Decision Trees: Decision trees are hierarchical models that recursively split data into
smaller subsets based on feature values, leading to a tree-like structure of decisions.
Each internal node represents a decision based on a feature, and each leaf node
represents a class label or prediction. Decision trees are interpretable and can handle both numerical and
categorical data.
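A minimal decision-tree sketch with scikit-learn on its bundled iris dataset (library and dataset choice are just for illustration):

```python
# Decision tree: internal nodes test feature values, leaves output class labels.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(tree.predict(X[:2]))    # predicted class labels for two samples
```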
Support Vector Machines (SVM): SVM is a supervised learning algorithm used for classification and
regression tasks. It finds the optimal hyperplane that separates data points of different classes in a high-
dimensional space. SVM aims to maximize the margin between classes while minimizing classification
errors.
k-Nearest Neighbors (k-NN): k-NN is a simple and intuitive algorithm for classification and regression.
Given a new data point, it predicts the class label or value based on the majority vote or average of its k
nearest neighbors in the training dataset. The choice of k influences the model's bias-variance tradeoff.
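A corresponding k-NN sketch; k = 5 is an arbitrary choice that trades off bias and variance as described above:

```python
# k-nearest neighbours: classify by majority vote of the k closest training points.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict(X[:2]))
```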
Clustering Algorithms: Clustering algorithms group similar data points together based on their proximity
or similarity in feature space. Common clustering methods include k-means clustering, hierarchical
clustering, and density-based clustering. Clustering is an unsupervised learning technique used for data
exploration, pattern recognition, and anomaly detection.
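A k-means sketch (unsupervised: the labels are never used); three clusters is an illustrative choice:

```python
# k-means clustering: group points by proximity in feature space.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])        # cluster assignment of the first ten points
```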
Ensemble Learning: Ensemble learning combines multiple individual models to improve predictive
performance and robustness. Examples of ensemble methods include Random Forests, Gradient
Boosting Machines (GBM), AdaBoost, and Stacking. Ensemble techniques leverage the diversity of
models to reduce overfitting and achieve better generalization on unseen data.
Deep Learning: Deep learning is a subset of neural network algorithms that involve architectures with
multiple layers of neurons, allowing for the representation of complex patterns and hierarchical
abstractions. Deep learning has revolutionized various fields, including computer vision, natural language
processing, and speech recognition, by achieving state-of-the-art performance on challenging tasks.
These are just a few examples of learning methods in AI, each with its strengths, weaknesses, and
applications across different domains and problem types. Choosing the appropriate learning method
depends on factors such as the nature of the data, the complexity of the problem, and the desired
outcomes.
What is Regression?
Regression seeks to find the best-fitting model, which can be utilized to make predictions or draw conclusions.
A problem is treated as a regression problem when the output variable is a real or continuous value, such as “salary” or “weight”. Many different models can be used; the simplest is linear regression, which tries to fit the data with the best hyperplane passing through the points.
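A minimal linear-regression sketch with scikit-learn; the data (experience vs. salary) is synthetic and purely illustrative:

```python
# Fit a line to a few points and predict a continuous output value.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # e.g. years of experience
y = np.array([30.0, 35.0, 42.0, 50.0])       # e.g. salary (continuous target)

model = LinearRegression().fit(X, y)
print(model.predict([[5.0]]))                # predicted salary for 5 years
```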
Data Gathering: You collect data on people who did or didn't buy the product, along with
info about their age, income, etc.
Model Building: Logistic regression looks at all these clues and figures out the
relationship between each clue (called a feature) and the probability of someone buying
the product. It tries to draw a line through the data that best separates the buyers from
the non-buyers.
Probability Prediction: Once the model is trained, you can give it new clues about a
person, like their age and income, and it'll give you a probability score of how likely they
are to buy the product.
Decision Making: Based on that probability score, you can make a decision. For
example, if the score is above a certain threshold, you might decide to target that
person with a marketing campaign.
So, logistic regression is like a smart detective that helps you predict the probability of
an event happening based on various clues, which is super handy in making decisions
in many real-world situations!
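A sketch of that detective analogy in code, assuming scikit-learn and an invented age/income dataset; the 0.5 threshold is just one possible decision rule:

```python
# Logistic regression: predict the probability of buying, then decide.
import numpy as np
from sklearn.linear_model import LogisticRegression

# features: [age, income in thousands]; label: 1 = bought, 0 = did not buy
X = np.array([[22, 25], [35, 60], [46, 80], [28, 40], [52, 95], [19, 18]])
y = np.array([0, 1, 1, 0, 1, 0])

clf = LogisticRegression().fit(X, y)
p_buy = clf.predict_proba([[40, 70]])[0, 1]          # probability prediction
print(p_buy, "-> run campaign" if p_buy > 0.5 else "-> skip")   # decision making
```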
Semantic Network
Semantic networks are a type of data representation incorporating linguistic information
that describes concepts or objects and the relationship or dependency between them.
There are mainly four ways of knowledge representation which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
A limitation of semantic networks is that they are not intelligent in themselves and depend on the creator of the system.
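A semantic network can be sketched simply as a set of (concept, relation, concept) triples; the facts below are illustrative:

```python
# Represent concepts and their relationships as labelled edges.
triples = [
    ("Canary", "is-a", "Bird"),
    ("Bird", "is-a", "Animal"),
    ("Bird", "can", "fly"),
    ("Canary", "has-color", "yellow"),
]

def related(concept):
    """Return every relationship in which a concept participates."""
    return [t for t in triples if concept in (t[0], t[2])]

print(related("Bird"))
```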
Fuzzy logic contains multiple logical values: the truth value of a variable or proposition can lie anywhere between 0 and 1. The concept was introduced by Lotfi Zadeh in 1965, based on Fuzzy Set Theory. It provides the range of possibilities that humans naturally reason with but that classical two-valued computation does not offer.
● It helps minimize the logic that has to be created by humans.
● It is a good method for solving problems that call for approximate or uncertain reasoning.
● It always offers two values, which denote the two possible solutions for a problem or statement.
● It allows users to build functions that are non-linear and of arbitrary complexity.
● In fuzzy logic, any logical system can be easily fuzzified.
● It is also used by quantitative analysts to improve the execution of their algorithms.
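A small sketch of a fuzzy membership function; the temperature breakpoints (15, 25, 35 °C) are arbitrary illustration values:

```python
# Triangular membership function for the fuzzy set "warm": it returns a
# degree of membership between 0 and 1 rather than a crisp yes/no.
def warm(temp_c):
    if temp_c <= 15 or temp_c >= 35:
        return 0.0
    if temp_c <= 25:
        return (temp_c - 15) / 10    # rising edge: 15 °C -> 0, 25 °C -> 1
    return (35 - temp_c) / 10        # falling edge: 25 °C -> 1, 35 °C -> 0

for t in (10, 20, 25, 30):
    print(t, warm(t))                # a crisp set would only allow 0 or 1 here
```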
Fuzzy Sets vs. Crisp Sets
● Fuzzy sets allow degrees of membership between 0 and 1, representing the extent to which an element belongs to the set; membership grades range from fully belonging (1) to not belonging at all (0), with intermediate values indicating partial membership. Crisp sets, also known as classical sets, follow traditional set theory: elements either belong to the set (1) or do not belong to the set (0), with no intermediate values or degrees of membership.
● Fuzzy sets are suitable for representing and reasoning with uncertainty, ambiguity, and vagueness in data or concepts; they allow flexible modeling of imprecise information and are particularly useful when dealing with subjective or qualitative data. Crisp sets are binary in nature and are not designed to handle uncertainty; they suit situations where elements can be clearly categorized as either belonging or not belonging to a set without ambiguity.
● Fuzzy sets find applications in fields such as artificial intelligence, control systems, decision-making, pattern recognition, and expert systems, where uncertainty and imprecision are inherent in the data or reasoning process. Crisp sets are commonly used in traditional mathematics, logic, and computer science applications where exact categorization or classification is required, such as in database queries, set theory, and formal logic.
● Fuzzy sets offer flexibility in modeling complex relationships and capturing nuances in data that may not fit neatly into crisp categories; they can represent gradual transitions and fuzzy boundaries between categories, allowing for more realistic and expressive modeling. Crisp sets are rigid in their representation, defining strict boundaries between elements that belong to the set and those that do not; they are suitable where clear-cut distinctions between categories are sufficient, but may struggle to capture the richness of real-world data with fuzzy or overlapping characteristics.
What is Forward and Backward Chaining? Write the difference between them.
Forward chaining and backward chaining are two fundamental inference methods used
in artificial intelligence and knowledge representation systems, particularly in the
context of rule-based systems and expert systems. These methods are employed to
deduce conclusions from a set of rules or facts, enabling automated reasoning and
decision-making.
Forward Chaining:
Forward chaining, also known as data-driven reasoning, starts with the available facts and rules and iteratively applies them to infer new conclusions until a specific goal or condition is satisfied. It proceeds in a forward direction, moving from the known facts to potential conclusions.
Backward Chaining:
Backward chaining, also known as goal-driven reasoning or backward reasoning, starts
with the desired goal or conclusion and works backward to determine the sequence of
steps needed to achieve that goal. It focuses on finding the conditions or facts that
must be true in order for the goal to be reached.
Goal Definition: It starts with a specific goal or conclusion that needs to be satisfied.
Rule Selection: Rules are selected based on their relevance to the goal or conclusion.
Hypothesis Testing: The system attempts to find evidence or facts that support the
conditions specified by the selected rules.
Recursive Process: If the evidence is not directly available, the process recursively
applies backward chaining to determine the conditions required to satisfy the selected
rules.
Termination: The process continues until the necessary conditions are met or until no
further backward chaining is possible.
Forward Chaining vs. Backward Chaining
● Forward chaining proceeds from known facts to potential conclusions; backward chaining proceeds from the desired goal or conclusion to the conditions or facts needed to achieve it.
● Forward chaining starts with the available facts or data; backward chaining starts with the desired goal or conclusion.
● Forward chaining applies rules iteratively to generate new conclusions until a goal is reached; backward chaining works backward from the goal, determining the conditions needed to satisfy it.
● Forward chaining can be more efficient in systems with a large amount of data or when the goal is to explore multiple potential outcomes; backward chaining can be more efficient when the focus is on reaching a specific goal or conclusion, as it avoids unnecessary rule applications.
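A minimal forward-chaining sketch: fire "if conditions then conclusion" rules against the known facts until nothing new can be derived. The rules and facts are toy examples:

```python
# Data-driven inference: start from the facts and keep applying rules.
rules = [
    ({"croaks", "eats_flies"}, "frog"),
    ({"frog"}, "green"),
]
facts = {"croaks", "eats_flies"}

changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)    # a new conclusion has been inferred
            changed = True

print(facts)   # {'croaks', 'eats_flies', 'frog', 'green'}
```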
Define prior probability and conditional probability. State Bayes’s theorem.
How is it useful for decision making under uncertainty?
Prior Probability:
Prior probability refers to the probability of an event occurring before any additional
information is taken into account. It represents the initial belief or expectation about the
likelihood of an event based on prior knowledge, historical data, or assumptions. Prior
probabilities are often denoted by P(A), where A is an event of interest.
Conditional Probability:
Conditional probability refers to the probability of an event occurring given that another
event has already occurred. It represents the updated probability of an event based on
new information or a specific condition. Conditional probabilities are denoted by P(A|B),
where A is the event of interest and B is the condition or event that has already
occurred.
Bayes' Theorem:
Bayes' theorem is a fundamental theorem in probability theory that describes how to
update the probability of an event based on new evidence or information.
Mathematically, it is expressed as follows:
P(A|B) = P(B|A) · P(A) / P(B)
Where:
● P(A|B) is the posterior probability of A given the evidence B,
● P(B|A) is the likelihood of observing B when A is true,
● P(A) is the prior probability of A, and
● P(B) is the overall probability of the evidence B.
By combining prior knowledge with new evidence in this way, Bayes' theorem lets a system revise its beliefs, which is what makes it useful for decision making under uncertainty.
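A worked example with invented numbers (disease prevalence and test accuracies) showing how the prior is updated into a posterior:

```python
# Bayes' theorem: P(disease | positive test).
p_disease = 0.01             # prior P(A)
p_pos_given_disease = 0.95   # likelihood P(B|A)
p_pos_given_healthy = 0.05   # false-positive rate P(B|not A)

# total probability of the evidence P(B)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

p_disease_given_pos = p_pos_given_disease * p_disease / p_pos   # posterior P(A|B)
print(round(p_disease_given_pos, 3))   # ~0.161: the evidence raises the prior of 0.01
```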
Expert systems are computer-based systems that mimic the problem-solving ability of a
human expert in a specific domain. They are important because they can capture and
utilize the knowledge and expertise of human experts to solve complex problems, make
decisions, and provide recommendations in various domains.
Capture and Preserve Expertise: Expert systems can capture and preserve the
knowledge and expertise of human experts, even when those experts are not available
or are difficult to access. This ensures that valuable expertise is retained within the
organization or domain.
Rapid Problem Solving: Expert systems can quickly analyze complex problems and
provide solutions or recommendations based on the accumulated knowledge and rules.
This enables organizations to solve problems more efficiently and effectively, leading to
improved productivity and performance.
Training and Education: Expert systems can serve as training tools for novices by
providing guidance, explanations, and recommendations based on expert knowledge.
They can help transfer expertise from experienced practitioners to new employees or
students, accelerating the learning process.
24/7 Availability: Expert systems can be available around the clock, providing
on-demand assistance and support to users regardless of time or location. This ensures
that expertise is accessible whenever needed, improving responsiveness and customer
satisfaction.
Knowledge Representation Techniques in Expert Systems:
Fuzzy Logic: Fuzzy logic allows for the representation of uncertainty and vagueness in
expert knowledge by using linguistic variables and fuzzy sets to describe relationships
between variables and conditions.
LISP is a programming language famous for its symbolic expressions, functional programming
style, and dynamic typing. It's known for its powerful macro tools that enable metaprogramming,
automatic memory management via garbage collection, and interactive development
environments. LISP has been crucial in AI research and symbolic computation, shaping the
development of many programming languages. Its flexibility and expressive capabilities make it
ideal for tasks like language processing, expert systems, and quick prototyping. Despite its age,
LISP remains relevant, driving innovation in programming languages and AI applications.
Uncertainty in artificial intelligence (AI) refers to the lack of complete information or the
presence of variability in data and models. Understanding and modeling uncertainty is
crucial for making informed decisions and improving the robustness of AI systems.
AI deals with uncertainty by using models and methods that assign probabilities to
different outcomes.
Managing Uncertainty:
AI uses various techniques to handle uncertainty and make decisions:
Labeled Data: In regression, the training dataset consists of input-output pairs, where
each input is associated with a known output value.
Learning from Examples: The regression algorithm learns from these examples to
establish a relationship between the input features and the output variable.
● The goal of Support Vector Machine (SVM) is to find the optimal hyperplane that
best separates the data points of different classes in a high-dimensional space.
This hyperplane is chosen in such a way that it maximizes the margin, which is
the distance between the hyperplane and the nearest data point (support vector)
of each class. The margin represents the margin of separation or the level of
confidence in the classification.
1. Find the Support Vectors: Identify the data points (support vectors) that lie closest to the hyperplane. These are the points that determine the position of the separating hyperplane.
2. Compute the Margin: The margin is the perpendicular distance from the
hyperplane to the closest support vector. Mathematically, the margin can be
calculated as the inverse of the Euclidean norm (magnitude) of the weight vector
(w) of the hyperplane.
Margin = 1 / ∥w∥
Where w is the weight vector of the hyperplane.
3. Normalize the Margin: In some cases, the margin may need to be normalized by
dividing it by the Euclidean norm of the weight vector. This normalization ensures
that the margin is invariant to the scale of the weight vector.
Normalized Margin = Margin / ∥w∥
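A sketch of this computation with scikit-learn on invented 2-D data: fit a linear SVM, read off the weight vector w, and evaluate 1/∥w∥:

```python
# Margin of a (near) hard-margin linear SVM: 1 / ||w||.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [2, 2], [8, 8], [9, 9], [10, 10]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # large C ~ hard margin
w = clf.coef_[0]                              # weight vector of the hyperplane
print(1.0 / np.linalg.norm(w))                # distance from hyperplane to support vector
```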
What is First-Order Predicate Logic?
○ First-order logic is another way of knowledge representation in artificial
intelligence. It is an extension to propositional logic.
Quantifiers: First-order logic includes quantifiers such as "forall" (∀) and "exists" (∃),
allowing for statements about all objects in the domain (universal quantification) or at
least one object (existential quantification).
Functions and Constants: First-order logic allows for the use of functions and constants
to denote operations and specific objects, respectively.
Equality: First-order logic includes the equality symbol (=) to express relationships
between objects being equal.
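For example, "every student studies some subject" can be written with quantifiers and predicates as ∀x (Student(x) → ∃y (Subject(y) ∧ Studies(x, y))), while a fact such as father(John) = Bob combines a function symbol with the equality symbol.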
Explain Hill Climbing Algorithm.
● A node in the hill climbing algorithm has two components: state and value.
● In this algorithm, we do not need to maintain a search tree or graph, as it only keeps a single current state.
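A minimal sketch of the idea, with an invented objective function and a step-of-one neighbourhood on the integers:

```python
# Hill climbing: keep only the current state and move to the best
# neighbour while it improves the value; stop otherwise.
def hill_climb(start, value, neighbours):
    current = start
    while True:
        best = max(neighbours(current), key=value)
        if value(best) <= value(current):    # no better neighbour
            return current                   # may be only a local maximum
        current = best

value = lambda x: -(x - 7) ** 2              # single peak at x = 7
neighbours = lambda x: [x - 1, x + 1]
print(hill_climb(0, value, neighbours))      # -> 7
```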
Explain plateau, ridge, and local maxima.
1. Local Maximum: A local maximum is a peak state in the landscape which is better
than each of its neighboring states, but there is another state also present which is
higher than the local maximum.
Solution: The backtracking technique can be a solution to the local maximum problem. Maintain a list of promising paths so that the algorithm can backtrack in the search space and explore other paths as well.
2. Plateau: A plateau is a flat area of the search space in which all the neighboring states of the current state have the same value, so the algorithm cannot find any best direction to move. A hill-climbing search might get lost in the plateau area.
Solution: The solution for a plateau is to take bigger or smaller steps while searching. For example, randomly select a state that is far away from the current state, so that the algorithm may find a non-plateau region.
3. Ridges: A ridge is a special form of the local maximum. It has an area which is higher
than its surrounding areas, but itself has a slope, and cannot be reached in a single
move.
● A* combines a cost function that measures actual path cost (g(n)) with a
heuristic function (h(n)) that estimates the remaining cost to the goal. This
combination guides the algorithm efficiently.
● The open set and closed set are essential data structures in A* for
managing the exploration of nodes. The open set contains nodes to be
evaluated, while the closed set holds nodes already evaluated.
● Heuristics play a vital role in A*, providing estimates of how close a node is
to the goal. Admissible and consistent heuristics help prioritize exploration.
1. Initialization:
● Begin by initializing two sets: the open set and the closed set.
● The open set initially contains only the starting node, while the closed set
is empty.
● Set the cost of reaching the starting node (g-score) to zero and calculate
the heuristic cost estimate to the goal (h-score) for the starting node.
2. Main Loop:
● At each iteration of the loop, select the node from the open set with the
lowest f-score (f = g + h).
● This node is the most promising candidate for evaluation, as it balances
the actual cost incurred (g) and the estimated remaining cost (h).
4. Evaluating Neighbors:
● For the selected node, consider its neighboring nodes (also known as
successors).
● Calculate the actual cost to reach each neighbor from the current node
(g-score).
● Calculate the heuristic cost estimate from each neighbor to the goal
(h-score).
5. Updating Costs:
● For each neighbor, calculate the total estimated cost (f-score) by summing
the actual cost (g-score) and the heuristic estimate (h-score).
● If a neighbor is not in the open set, add it to the open set.
● If a neighbor is already in the open set and its f-score is lower than the
previously recorded f-score, update the neighbor's f-score and set its
parent to the current node. This means a shorter path to the neighbor has
been discovered.
6. Moving to the Next Node:
● After evaluating the neighbors of the current node, move the current node
to the closed set, indicating that it has been fully evaluated.
● Return to the main loop and select the next node for evaluation based on
its f-score.
● If the goal node is reached, the algorithm terminates, and the optimal path
can be reconstructed by backtracking from the goal node to the starting
node using the parent pointers.
● If the open set becomes empty without reaching the goal, the algorithm
terminates with the conclusion that no path exists.
● Once the goal is reached, you can reconstruct the optimal path by
following the parent pointers from the goal node back to the starting node.
This path represents the shortest route.
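A compact sketch of these steps over a tiny hand-made graph, using a heap as the open set; the graph, edge costs, and heuristic values are all invented for illustration:

```python
# A* search: expand the open-set node with the lowest f = g + h.
import heapq

graph = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}, "D": {}}
h = {"A": 3, "B": 2, "C": 1, "D": 0}        # admissible heuristic estimates

def a_star(start, goal):
    open_set = [(h[start], start)]          # entries are (f-score, node)
    g = {start: 0}
    parent = {}
    closed = set()
    while open_set:
        _, node = heapq.heappop(open_set)   # node with the lowest f-score
        if node == goal:                    # reconstruct path via parent pointers
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return list(reversed(path))
        closed.add(node)                    # node fully evaluated
        for nbr, cost in graph[node].items():
            new_g = g[node] + cost
            if nbr not in closed and new_g < g.get(nbr, float("inf")):
                g[nbr] = new_g              # shorter path to this neighbour found
                parent[nbr] = node
                heapq.heappush(open_set, (new_g + h[nbr], nbr))
    return None                             # open set empty: no path exists

print(a_star("A", "D"))                     # -> ['A', 'B', 'C', 'D']
```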
● Termination: AO* does not have a fixed termination condition like A*.
Instead, it can be terminated at any point during the search process to
return the best solution found so far. This makes it suitable for real-time
applications where finding the optimal solution within a limited time frame
is more important than finding the globally optimal solution.
● Continuous Learning: As it explores, AO* doesn't just settle for its first
guess. It keeps learning and refining its plan, getting better and better as it
goes along.
● Smarter Searches: AO* is like having a guide who knows the shortcuts. It
uses clever tricks to look in the most promising places first, making its
search faster and more efficient.
● Best Solution on Demand: Need a solution ASAP? AO* is ready to give you the best solution it has found so far.
Backpropagation Algorithm
● In machine learning, backpropagation is an effective algorithm used to train
artificial neural networks, especially in feed-forward neural networks.
1. Forward pass
2. Backward pass
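A tiny NumPy sketch of both passes for a one-hidden-layer network on a single example; the sizes, data, and learning rate are illustrative:

```python
# Backpropagation: the forward pass computes the output, the backward pass
# propagates the error gradient and updates the weights.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))            # input example
t = np.array([[1.0]])                  # target output
W1 = rng.normal(size=(4, 3))           # input -> hidden weights
W2 = rng.normal(size=(1, 4))           # hidden -> output weights
lr = 0.1

for _ in range(100):
    # forward pass
    h = np.tanh(W1 @ x)                # hidden activations
    y = W2 @ h                         # network output
    loss = 0.5 * float((y - t) ** 2)

    # backward pass (chain rule, layer by layer)
    dy = y - t                         # dLoss/dy
    dW2 = dy @ h.T
    dh = (W2.T @ dy) * (1 - h ** 2)    # through the tanh nonlinearity
    dW1 = dh @ x.T

    W2 -= lr * dW2                     # gradient-descent updates
    W1 -= lr * dW1

print(round(loss, 6))                  # loss shrinks toward 0
```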