Made," and Intelligence Defines "Thinking Power", Hence AI Means "A Man-Made Thinking Power."
Made," and Intelligence Defines "Thinking Power", Hence AI Means "A Man-Made Thinking Power."
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators to achieve goals. An intelligent agent may learn
from the environment to achieve its goals. A thermostat is an example of an intelligent agent.
Following are the main four rules for an AI agent:
o Rule 1: An AI agent must have the ability to perceive the environment.
o Rule 2: The observation must be used to make decisions.
o Rule 3: Decision should result in an action.
o Rule 4: The action taken by an AI agent must be a rational action.
Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way to maximize its performance measure.
A rational agent is said to perform the right things. AI is about creating rational agents to use in game theory and decision theory for various real-world
scenarios.
For an AI agent, the rational action is most important because in the AI reinforcement learning algorithm, for each best possible action an agent gets a positive
reward, and for each wrong action an agent gets a negative reward.
Note: Rational agents in AI are very similar to intelligent agents.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on the basis of the following points:
o Performance measure which defines the success criterion.
o The agent's prior knowledge of its environment.
o Best possible actions that an agent can perform.
o The sequence of percepts.
Note: Rationality differs from omniscience because an omniscient agent knows the actual outcome of its action and acts accordingly, which is not possible in reality.
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure of an intelligent agent is a combination of architecture and
agent program. It can be viewed as:
1. Agent = Architecture + Agent program
Following are the main three terms involved in the structure of an AI agent:
Architecture: Architecture is the machinery that the AI agent executes on.
Agent Function: The agent function maps a percept sequence to an action:
1. f:P* → A
Agent program: An agent program is an implementation of the agent function. The agent program executes on the physical architecture to produce the function f.
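For illustration, the following is a minimal sketch in Python (all names are hypothetical) of an agent program for the thermostat agent mentioned earlier. Note that it is a simple reflex program that looks only at the current percept rather than the full percept sequence P*; the architecture would be the physical thermostat hardware running this function in a sense-act loop.

# A minimal, hypothetical agent program: it implements the agent function
# by mapping a percept (a temperature reading) to an action.
def thermostat_agent_program(percept):
    if percept < 18.0:
        return "turn_heater_on"
    if percept > 22.0:
        return "turn_heater_off"
    return "do_nothing"

# The architecture (the physical machinery) repeatedly senses the
# environment and executes the agent program:
for reading in [15.0, 20.0, 25.0]:
    print(reading, "->", thermostat_agent_program(reading))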
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an AI agent or rational agent, we can group its properties under the PEAS
representation model. It is made up of four words:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here the performance measure is the objective for the success of an agent's behavior.
PEAS for self-driving cars:
Let's suppose a self-driving car; then the PEAS representation will be:
Performance: Safety, time, legal drive, comfort
Environment: Roads, other vehicles, road signs, pedestrians
Actuators: Steering, accelerator, brake, signal, horn
Sensors: Camera, GPS, speedometer, odometer, accelerometer, sonar
Example of an agent with its PEAS representation:
Agent: Medical diagnose
Performance measure: Healthy patient, Minimized cost
Environment: Patient, Hospital, Staff
Actuators: Tests, Treatments
Sensors: Keyboard (entry of symptoms)
4. Utility-based agents
o These agents are similar to the goal-based agent but provide an extra component of utility measurement
which makes them different by providing a measure of success at a given state.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The utility-based agent is useful when there are multiple possible alternatives, and an agent has to choose among them
in order to perform the best action.
o The utility function maps each state to a real number to check how efficiently each action achieves the
goals.
5. Learning Agents
o A learning agent in AI is the type of agent which can learn from its past experiences, or it has learning
capabilities.
o It starts to act with basic knowledge and is then able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components, which are:
1. Learning element: It is responsible for making improvements by learning from the environment.
2. Critic: The learning element takes feedback from the critic, which describes how well the agent is
doing with respect to a fixed performance standard.
3. Performance element: It is responsible for selecting external actions.
4. Problem generator: This component is responsible for suggesting actions that will lead to new
and informative experiences.
o Hence, learning agents are able to learn, analyze performance, and look for new ways to improve the
performance.
What is an Expert System?
An expert system is a computer program that is designed to solve complex problems and to provide decision-
making ability like a human expert. It performs this by extracting knowledge from its knowledge base using the
reasoning and inference rules according to the user queries.
The expert system is a part of AI; the first expert systems were developed in the 1970s and were among the first successful
applications of artificial intelligence. Such a system solves the most complex issues as an expert by extracting the knowledge
stored in its knowledge base. The system helps in decision making for complex problems using both facts and
heuristics like a human expert. It is called so because it contains the expert knowledge of a specific domain
and can solve any complex problem of that particular domain. These systems are designed for a specific domain,
such as medicine, science, etc.
The performance of an expert system is based on the expert's knowledge stored in its knowledge base. The more
knowledge stored in the KB, the more that system improves its performance. One of the common examples of an
ES is a suggestion of spelling errors while typing in the Google search box.
Below is the block diagram that represents the working of an expert system:
Note: It is important to remember that an expert system is not used to replace the human experts; instead, it is used to
assist the human in making a complex decision. These systems do not have human capabilities of thinking and work on
the basis of the knowledge base of the particular domain.
Below are some popular examples of the Expert System:
o DENDRAL: It was an artificial intelligence project that was made as a chemical analysis expert system. It
was used in organic chemistry to detect unknown organic molecules with the help of their mass spectra
and knowledge base of chemistry.
o MYCIN: It was one of the earliest backward chaining expert systems that was designed to find the
bacteria causing infections like bacteraemia and meningitis. It was also used for the recommendation of
antibiotics and the diagnosis of blood clotting diseases.
o PXDES: It is an expert system that is used to determine the type and level of lung cancer. To determine
the disease, it takes a picture from the upper body, which looks like the shadow. This shadow identifies
the type and degree of harm.
o CaDeT: The CaDet expert system is a diagnostic support system that can detect cancer at early stages.
Characteristics of Expert System
o High Performance: The expert system provides high performance for solving any type of complex
problem of a specific domain with high efficiency and accuracy.
o Understandable: It responds in a way that can be easily understood by the user. It can take input in
human language and provides output in the same way.
o Reliable: It is highly reliable in generating an efficient and accurate output.
o Highly responsive: ES provides the result for any complex query within a very short period of time.
Components of Expert System
An expert system mainly consists of three components:
o User Interface
o Inference Engine
o Knowledge Base
1. User Interface
With the help of a user interface, the expert system interacts with the user, takes queries as an input in a readable
format, and passes it to the inference engine. After getting the response from the inference engine, it displays the
output to the user. In other words, it is an interface that helps a non-expert user to communicate with the
expert system to find a solution.
2. Inference Engine(Rules of Engine)
o The inference engine is known as the brain of the expert system as it is the main processing unit of the
system. It applies inference rules to the knowledge base to derive a conclusion or deduce new information.
It helps in deriving an error-free solution of queries asked by the user.
o With the help of an inference engine, the system extracts the knowledge from the knowledge base.
o There are two types of inference engine:
o Deterministic inference engine: The conclusions drawn from this type of inference engine are assumed
to be true. It is based on facts and rules.
o Probabilistic inference engine: This type of inference engine contains uncertainty in its conclusions, which
are based on probability.
Inference engine uses the below modes to derive the solutions:
o Forward Chaining: It starts from the known facts and rules, and applies the inference rules to add their
conclusion to the known facts.
o Backward Chaining: It is a backward reasoning method that starts from the goal and works backward to
prove the known facts.
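As a rough illustration of forward chaining (a minimal sketch, not how any particular expert-system shell implements it; the rules and facts are invented), the loop below repeatedly applies if-then rules to a set of known facts until no new conclusion can be derived:

# Each rule is (premises, conclusion): if all premises are known facts,
# the conclusion is added as a new fact.
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "high_risk_patient"}, "recommend_test"),
]
facts = {"fever", "cough", "high_risk_patient"}

changed = True
while changed:                        # repeat until a fixed point is reached
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)     # derive a new fact from the rule
            changed = True

print(facts)  # now also contains 'flu_suspected' and 'recommend_test'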
3. Knowledge Base
o The knowledge base is a type of storage that stores knowledge acquired from the different experts of the
particular domain. It is considered big storage of knowledge. The larger the knowledge base, the more
precise the expert system will be.
o It is similar to a database that contains information and rules of a particular domain or subject.
o One can also view the knowledge base as a collection of objects and their attributes. For example, a lion is an
object, and its attributes are that it is a mammal and that it is not a domestic animal.
Components of Knowledge Base
o Factual Knowledge: The knowledge which is based on facts and accepted by knowledge engineers comes
under factual knowledge.
o Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.
Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base using if-then rules.
Knowledge Acquisition: It is the process of extracting, organizing, and structuring the domain knowledge,
specifying the rules to acquire the knowledge from various experts, and storing that knowledge in the knowledge
base.
Development of Expert System
Here, we will explain the working of an expert system by taking the example of the MYCIN ES. Below are the steps to
build MYCIN:
o Firstly, ES should be fed with expert knowledge. In the case of MYCIN, human experts specialized in the
medical field of bacterial infection, provide information about the causes, symptoms, and other knowledge
in that domain.
o Once the KB of MYCIN is updated, the doctor provides a new problem to test it.
The problem is to identify the presence of the bacteria by inputting the details of a patient, including the
symptoms, current condition, and medical history.
o The ES will need a questionnaire to be filled by the patient to know the general information about the
patient, such as gender, age, etc.
o Now the system has collected all the information, so it will find the solution for the problem by applying if-
then rules using the inference engine and using the facts stored within the KB.
o In the end, it will provide a response to the patient by using the user interface.
Participants in the development of Expert System
There are three primary participants in the building of Expert System:
1. Expert: The success of an ES depends greatly on the knowledge provided by human experts. These experts
are those persons who are specialized in that specific domain.
2. Knowledge Engineer: Knowledge engineer is the person who gathers the knowledge from the domain
experts and then codifies that knowledge to the system according to the formalism.
3. End-User: This is a particular person or a group of people, not necessarily experts, who need the
expert system's solution or advice for their complex queries.
Why Expert System?
Before using any technology, we must have an idea of why to use it, and the same holds for the
ES. Although we have human experts in every field, what is the need to develop a computer-based system?
The points below describe the need for the ES:
1. No memory limitations: It can store as much data as required and can recall it at the time of its
application, whereas human experts have limits on what they can memorize at any time.
2. High Efficiency: If the knowledge base is updated with the correct knowledge, then it provides a highly
efficient output, which may not be possible for a human.
3. Expertise in a domain: There are lots of human experts in each domain, and they all have different
skills and different experiences, so it is not easy to get a final output for a query. But if
we put the knowledge gained from human experts into the expert system, then it provides an efficient
output by mixing all the facts and knowledge
4. Not affected by emotions: These systems are not affected by human emotions such as fatigue, anger,
depression, anxiety, etc.; hence the performance remains constant.
5. High security: These systems provide high security to resolve any query.
6. Considers all the facts: To respond to any query, it checks and considers all the available facts and
provides the result accordingly. But it is possible that a human expert may not consider some facts due to
any reason.
7. Regular updates improve the performance: If there is an issue in the result provided by the expert
systems, we can improve the performance of the system by updating the knowledge base.
Capabilities of the Expert System
Below are some capabilities of an Expert System:
o Advising: It is capable of advising the human being for the query of any domain from the particular ES.
o Provide decision-making capabilities: It provides the capability of decision making in any domain,
such as for making any financial decision, decisions in medical science, etc.
o Demonstrate a device: It is capable of demonstrating any new products such as its features,
specifications, how to use that product, etc.
o Problem-solving: It has problem-solving capabilities.
o Explaining a problem: It is also capable of providing a detailed description of an input problem.
o Interpreting the input: It is capable of interpreting the input given by the user.
o Predicting results: It can be used for the prediction of a result.
o Diagnosis: An ES designed for the medical field is capable of diagnosing a disease without using multiple
components as it already contains various inbuilt medical tools.
Advantages of Expert System
o These systems are highly reproducible.
o They can be used for risky places where the human presence is not safe.
o Error possibilities are less if the KB contains correct knowledge.
o The performance of these systems remains steady as it is not affected by emotions, tension, or fatigue.
o They provide a very high speed to respond to a particular query.
Limitations of Expert System
o The response of the expert system may be wrong if the knowledge base contains wrong information.
o Like a human being, it cannot produce a creative output for different scenarios.
o Its maintenance and development costs are very high.
o Knowledge acquisition for designing an ES is very difficult.
o For each domain, we require a specific ES, which is one of the big limitations.
o It cannot learn from itself and hence requires manual updates.
Applications of Expert System
o In designing and manufacturing domain
It can be broadly used for designing and manufacturing physical devices such as camera lenses and
automobiles.
o In the knowledge domain
These systems are primarily used for publishing relevant knowledge to users. Two popular expert systems
used in this domain are advisors and tax advisors.
o In the finance domain
In the finance industry, it is used to detect any type of possible fraud and suspicious activity, and to advise
bankers on whether they should provide loans to a business.
o In the diagnosis and troubleshooting of devices
Expert systems are used in medical diagnosis, which was the first area where these systems were applied.
o Planning and Scheduling
The expert systems can also be used for planning and scheduling some particular tasks for achieving the
goal of that task.
MODULE 2
Problem-solving agents:
1. Search Space: Search space represents a set of possible solutions, which a system may have.
2. Start State: It is the state from which the agent begins the search.
3. Goal test: It is a function which observes the current state and returns whether the goal state
is achieved or not.
o Search tree: A tree representation of a search problem is called a search tree. The root of the search tree
is the root node, which corresponds to the initial state.
o Actions: It gives the description of all the available actions to the agent.
o Transition model: A description of what each action does; it can be represented as a transition model.
o Path Cost: It is a function which assigns a numeric cost to each path.
o Solution: It is an action sequence which leads from the start node to the goal node.
o Optimal Solution: A solution that has the lowest cost among all solutions.
Following are the four essential properties of search algorithms to compare the efficiency of these algorithms:
Completeness: A search algorithm is said to be complete if it is guaranteed to return a solution whenever at least one
solution exists for any input.
Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest path cost) among
all other solutions, then such a solution is said to be an optimal solution.
Time Complexity: Time complexity is a measure of time for an algorithm to complete its task.
Space Complexity: It is the maximum storage space required at any point during the search, expressed in terms of the
complexity of the problem.
Types of search algorithms
Based on the search problems we can classify the search algorithms into uninformed (Blind search)
search and informed search (Heuristic search) algorithms.
Uninformed/Blind Search:
The uninformed search does not use any domain knowledge, such as the closeness or location of the goal. It
operates in a brute-force way, as it only includes information about how to traverse the tree and how to identify
leaf and goal nodes. Uninformed search searches the tree without any information about the search space,
like the initial state operators and the test for the goal, so it is also called blind search. It
examines each node of the tree until it achieves the goal node.
It can be divided into five main types:
o Breadth-first search
o Uniform cost search
o Depth-first search
o Iterative deepening depth-first search
o Bidirectional Search
Informed Search
Informed search algorithms use domain knowledge. In an informed search, problem information is available
which can guide the search. Informed search strategies can find a solution more efficiently than an uninformed
search strategy. Informed search is also called a Heuristic search.
A heuristic is a technique which might not always find the best solution but is guaranteed to find a good
solution in reasonable time.
Informed search can solve complex problems which could not be solved another way.
An example problem for informed search algorithms is the travelling salesman problem.
1. Greedy Search
2. A* Search
Uninformed Search Algorithms
Uninformed search is a class of general-purpose search algorithms which operates in brute force-way.
Uninformed search algorithms do not have additional information about state or search space other
than how to traverse the tree, so it is also called blind search.
Following are the various types of uninformed search algorithms:
1. Breadth-first Search
2. Depth-first Search
3. Depth-limited Search
4. Iterative deepening depth-first search
5. Uniform cost search
6. Bidirectional Search
1. Breadth-first Search:
o Breadth-first search is the most common search strategy for traversing a tree or graph. This algorithm
searches breadthwise in a tree or graph, so it is called breadth-first search.
o The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the
current level before moving to nodes of the next level.
o The breadth-first search algorithm is an example of a general-graph search algorithm.
o Breadth-first search is implemented using a FIFO queue data structure.
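A minimal BFS sketch in Python (the graph is illustrative, not the exact tree of the example below):

from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search using a FIFO queue of paths."""
    frontier = deque([[start]])       # FIFO queue: expand shallowest first
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for successor in graph.get(node, []):
            if successor not in visited:
                visited.add(successor)
                frontier.append(path + [successor])
    return None                       # no solution exists

graph = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E"], "E": ["K"]}
print(bfs(graph, "S", "K"))  # ['S', 'B', 'E', 'K']

Because the frontier is a FIFO queue, paths are expanded level by level, which is why the first solution found uses the fewest steps.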
Advantages:
o BFS will provide a solution if any solution exists.
o If there is more than one solution for a given problem, then BFS will provide the minimal solution, which
requires the least number of steps.
Disadvantages:
o It requires lots of memory since each level of the tree must be saved into memory to expand the next
level.
o BFS needs lots of time if the solution is far away from the root node.
Example:
In the below tree structure, we have shown the traversal of the tree using the BFS algorithm from root node S to
goal node K. The BFS algorithm traverses in layers, so it will follow the path shown by the dotted arrow,
and the traversed path will be:
1. S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K
Time Complexity: The time complexity of the BFS algorithm is given by the number of nodes traversed in BFS
until the shallowest node, where d = the depth of the shallowest solution and b = the branching factor (the number of successors at every node):
T(b) = 1 + b + b^2 + b^3 + ... + b^d = O(b^d)
Space Complexity: The space complexity of BFS is given by the memory size of the frontier, which is O(b^d).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then BFS will
find a solution.
Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the node.
2. Depth-first Search
o Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
o It is called the depth-first search because it starts from the root node and follows each path to its greatest
depth node before moving to the next path.
o DFS uses a stack data structure for its implementation.
o The process of the DFS algorithm is similar to the BFS algorithm.
Note: Backtracking is an algorithm technique for finding all possible solutions using recursion.
Advantage:
o DFS requires very little memory, as it only needs to store the stack of nodes on the path from the root node
to the current node.
o It takes less time to reach the goal node than the BFS algorithm (if it traverses the right path).
Disadvantage:
o There is the possibility that many states keep re-occurring, and there is no guarantee of finding the
solution.
o The DFS algorithm goes deep in its search, and it may sometimes enter an infinite loop.
Example:
In the below search tree, we have shown the flow of depth-first search, and it will follow the order as:
Root node--->Left node ----> right node.
It will start searching from root node S, and traverse A, then B, then D and E, after traversing E, it will backtrack
the tree as E has no other successor and still goal node is not found. After backtracking it will traverse node C and
then G, and here it will terminate as it found goal node.
Completeness: DFS search algorithm is complete within finite state space as it will expand every node within a
limited search tree.
Time Complexity: The time complexity of DFS is equivalent to the number of nodes traversed by the algorithm. It is given
by:
T(b) = 1 + b + b^2 + ... + b^m = O(b^m)
where m = the maximum depth of any node, which can be much larger than d (the depth of the shallowest solution).
Space Complexity: The DFS algorithm needs to store only a single path from the root node; hence the space complexity of
DFS is equivalent to the size of the fringe set, which is O(bm).
Optimal: The DFS algorithm is non-optimal, as it may generate a large number of steps or a high cost to reach
the goal node.
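For comparison with BFS, a minimal recursive DFS sketch on the same style of explicit graph (again illustrative):

def dfs(graph, node, goal, path=None, visited=None):
    """Depth-first search: follows each path to its greatest depth first."""
    path = (path or []) + [node]
    visited = visited if visited is not None else set()
    visited.add(node)                 # guard against re-occurring states
    if node == goal:
        return path
    for successor in graph.get(node, []):
        if successor not in visited:
            result = dfs(graph, successor, goal, path, visited)
            if result:
                return result
    return None                       # dead end: backtrack

graph = {"S": ["A", "C"], "A": ["B", "D"], "C": ["G"]}
print(dfs(graph, "S", "G"))  # ['S', 'C', 'G'], after backtracking from A's subtree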
Informed Search Algorithms
So far we have talked about uninformed search algorithms, which looked through the search space for all
possible solutions of the problem without having any additional knowledge about the search space. But an informed
search algorithm contains knowledge such as how far we are from the goal, the path cost, how to reach
the goal node, etc. This knowledge helps agents explore less of the search space and find the goal node
more efficiently.
Informed search is more useful for large search spaces. Informed search algorithms use the idea
of a heuristic, so they are also called heuristic search.
Heuristic function: A heuristic is a function used in informed search which finds the most promising
path. It takes the current state of the agent as its input and produces an estimate of how close the agent is to
the goal. The heuristic method might not always give the best solution, but it is guaranteed to find a
good solution in reasonable time. A heuristic function estimates how close a state is to the goal. It is represented
by h(n), and it estimates the cost of an optimal path between the current state and the goal state. The value of the heuristic
function is always positive.
Admissibility of the heuristic function is given as:
1. h(n) <= h*(n)
Here h(n) is the heuristic (estimated) cost, and h*(n) is the actual cost. Hence the heuristic cost should be less than
or equal to the actual cost.
Pure Heuristic Search:
Pure heuristic search is the simplest form of heuristic search algorithms. It expands nodes based on their
heuristic value h(n). It maintains two lists, an OPEN and a CLOSED list. In the CLOSED list, it places those nodes
which have already been expanded, and in the OPEN list, it places nodes which have not yet been expanded.
On each iteration, the node n with the lowest heuristic value is expanded, all of its successors are generated, and
n is placed in the closed list. The algorithm continues until a goal state is found.
In the informed search we will discuss two main algorithms which are given below:
o Best First Search Algorithm(Greedy search)
o A* Search Algorithm
1.) Best-first Search Algorithm (Greedy Search):
The greedy best-first search algorithm always selects the path which appears best at that moment. It is a
combination of depth-first search and breadth-first search algorithms. It uses the heuristic function for its search.
Best-first search allows us to take the advantages of both algorithms. With the help of best-first search, at
each step, we can choose the most promising node. In the best-first search algorithm, we expand the node
which is closest to the goal node, where closeness is estimated by the heuristic function, i.e.
1. f(n) = h(n)
Where, h(n) = estimated cost from node n to the goal.
The greedy best-first algorithm is implemented using a priority queue.
Best first search algorithm:
o Step 1: Place the starting node into the OPEN list.
o Step 2: If the OPEN list is empty, Stop and return failure.
o Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in the
CLOSED list.
o Step 4: Expand the node n, and generate the successors of node n.
o Step 5: Check each successor of node n, and find whether any node is a goal node or not. If any
successor node is goal node, then return success and terminate the search, else proceed to Step 6.
o Step 6: For each successor node, the algorithm checks its evaluation function f(n), and then checks whether the
node is already in the OPEN or CLOSED list. If the node is in neither list, then add it to the
OPEN list.
o Step 7: Return to Step 2.
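A minimal Python sketch of these steps, using heapq as the priority queue ordered by h(n); the graph and heuristic values are invented for illustration, and for brevity the goal test is applied when a node is removed from the OPEN list:

import heapq

def greedy_best_first(graph, h, start, goal):
    """Always expands the node with the lowest heuristic value h(n)."""
    open_list = [(h[start], start, [start])]   # priority queue keyed on h(n)
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        closed.add(node)
        for successor in graph.get(node, []):
            if successor not in closed:
                heapq.heappush(open_list,
                               (h[successor], successor, path + [successor]))
    return None                                # OPEN list empty: failure

graph = {"S": ["A", "B"], "A": ["G"], "B": ["G"]}
h = {"S": 5, "A": 2, "B": 4, "G": 0}
print(greedy_best_first(graph, h, "S", "G"))  # ['S', 'A', 'G']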
Advantages:
o Best-first search can switch between BFS and DFS, thus gaining the advantages of both algorithms.
o This algorithm is more efficient than BFS and DFS algorithms.
Disadvantages:
o It can behave as an unguided depth-first search in the worst-case scenario.
o It can get stuck in a loop, like DFS.
o This algorithm is not optimal.
Example:
Consider the below search problem, which we will traverse using greedy best-first search. At each iteration,
each node is expanded using the evaluation function f(n) = h(n), which is given in the below table.
In this search example, we are using two lists which are OPEN and CLOSED Lists. Following are the iteration
for traversing the above example.
At each point in the search space, only the node with the lowest value of f(n) is expanded, and the algorithm
terminates when the goal node is found.
Algorithm of A* search:
Step1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function (g + h). If node n
is the goal node, then return success and stop; otherwise go to Step 4.
Step 4: Expand node n, generate all of its successors, and put n into the closed list. For each successor n',
check whether n' is already in the OPEN or CLOSED list; if not, then compute the evaluation function for n' and
place it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, then it should be attached to the back pointer which
reflects the lowest g(n') value.
Step 6: Return to Step 2.
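A minimal A* sketch in Python. The edge costs and heuristic values below are inferred from the worked example later in this section (they reproduce its f(n) values), so treat them as a reconstruction rather than the original figure:

import heapq

def a_star(graph, h, start, goal):
    """A* search: expands the node with the smallest f(n) = g(n) + h(n)."""
    open_list = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        if node in closed:
            continue                              # skip stale duplicate entries
        closed.add(node)
        for successor, cost in graph.get(node, []):
            if successor not in closed:
                g2 = g + cost
                heapq.heappush(open_list,
                               (g2 + h[successor], g2, successor, path + [successor]))
    return None, float("inf")

graph = {"S": [("A", 1), ("G", 10)],
         "A": [("B", 2), ("C", 1)],
         "B": [("D", 5)],
         "C": [("D", 3), ("G", 4)]}
h = {"S": 5, "A": 3, "B": 4, "C": 2, "D": 6, "G": 0}
print(a_star(graph, h, "S", "G"))  # (['S', 'A', 'C', 'G'], 6)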
Advantages:
o The A* search algorithm is better than other search algorithms.
o A* search algorithm is optimal and complete.
o This algorithm can solve very complex problems.
Disadvantages:
o It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
o A* search algorithm has some complexity issues.
o The main drawback of A* is memory requirement as it keeps all generated nodes in the memory, so it
is not practical for various large-scale problems.
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is
given in the below table, so we will calculate f(n) for each state using the formula f(n) = g(n) + h(n), where
g(n) is the cost to reach a node from the start state.
Here we will use OPEN and CLOSED list.
Solution:
Initialization: {(S, 5)}
Iteration1: {(S--> A, 4), (S-->G, 10)}
Iteration2: {(S--> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}
Iteration3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}
Iteration 4 will give the final result, as S--->A--->C--->G it provides the optimal path with cost 6.
Points to remember:
o A* algorithm returns the path which occurred first, and it does not search for all remaining paths.
o The efficiency of A* algorithm depends on the quality of heuristic.
o The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.
Complete: A* algorithm is complete as long as:
o Branching factor is finite.
o The cost of every action is fixed and positive.
Optimal: A* search algorithm is optimal if it follows below two conditions:
o Admissible: The first condition required for optimality is that h(n) should be an admissible heuristic for
A* tree search. An admissible heuristic is optimistic in nature: it never overestimates the true cost.
o Consistency: The second required condition is consistency, which applies to A* graph search.
If the heuristic function is admissible, then A* tree search will always find the least cost path.
Time Complexity: The time complexity of A* search algorithm depends on heuristic function, and the number
of nodes expanded is exponential to the depth of solution d. So the time complexity is O(b^d), where b is the
branching factor.
Space Complexity: The space complexity of A* search algorithm is O(b^d)
Hill Climbing Algorithm in Artificial Intelligence
o The hill climbing algorithm is a local search algorithm which continuously moves in the direction of increasing
elevation/value to find the peak of the mountain or the best solution to the problem. It terminates when it
reaches a peak where no neighbor has a higher value.
o Hill climbing algorithm is a technique which is used for optimizing the mathematical problems. One of the
widely discussed examples of Hill climbing algorithm is Traveling-salesman Problem in which we need to
minimize the distance traveled by the salesman.
o It is also called greedy local search as it only looks to its good immediate neighbor state and not beyond
that.
o A node of hill climbing algorithm has two components which are state and value.
o Hill Climbing is mostly used when a good heuristic is available.
o In this algorithm, we don't need to maintain and handle the search tree or graph as it only keeps a single
current state.
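A minimal hill-climbing sketch in Python: it keeps only the current state and repeatedly moves to the best neighbor until no neighbor is better; the objective function and neighborhood are illustrative.

def hill_climbing(objective, neighbors, state):
    """Moves to the highest-valued neighbor; stops at a (possibly local) peak."""
    while True:
        best = max(neighbors(state), key=objective, default=state)
        if objective(best) <= objective(state):
            return state              # no neighbor is higher: a peak
        state = best                  # greedy move, no backtracking

# Illustrative 1-D problem: maximize f(x) = -(x - 3)^2 over integer states.
objective = lambda x: -(x - 3) ** 2
neighbors = lambda x: [x - 1, x + 1]
print(hill_climbing(objective, neighbors, 0))  # 3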
Features of Hill Climbing:
Following are some main features of Hill Climbing Algorithm:
o Generate and Test variant: Hill climbing is a variant of the Generate and Test method. The Generate and
Test method produces feedback which helps to decide which direction to move in the search space.
o Greedy approach: Hill-climbing algorithm search moves in the direction which optimizes the cost.
o No backtracking: It does not backtrack the search space, as it does not remember the previous states.
State-space Diagram for Hill Climbing:
The state-space landscape is a graphical representation of the hill-climbing algorithm, showing a graph
between the various states of the algorithm and the objective function/cost.
On the Y-axis we have taken the function, which can be an objective function or a cost function, and the state space on the
X-axis. If the function on the Y-axis is cost, then the goal of the search is to find the global minimum and local minimum.
If the function on the Y-axis is the objective function, then the goal of the search is to find the global maximum and local
maximum.
Problems in the Hill Climbing Algorithm:
1. Local Maximum: A local maximum is a peak state in the landscape which is better than each of its neighboring states, but there exists another state which is higher still.
Solution: Backtracking can be a solution for the local maximum: maintain a list of promising paths and backtrack to explore new paths when a local maximum is reached.
2. Plateau: A plateau is a flat area of the search space in which all the neighbor states of the current state
have the same value; because of this, the algorithm cannot find a best direction to move. A hill-climbing search
might get lost in the plateau area.
Solution: The solution for the plateau is to take big steps or very small steps while searching. Randomly select a
state which is far away from the current state, so it is possible that the algorithm will find a non-plateau
region.
3. Ridges: A ridge is a special form of local maximum. It is an area which is higher than its surrounding
areas but itself has a slope, and it cannot be reached in a single move.
Solution: With the use of bidirectional search, or by moving in different directions, we can improve this problem.
Simulated Annealing:
A hill-climbing algorithm which never makes a move towards a lower value guaranteed to be incomplete because it
can get stuck on a local maximum. And if algorithm applies a random walk, by moving a successor, then it may
complete but not efficient. Simulated Annealing is an algorithm which yields both efficiency and completeness.
In mechanical terms, annealing is the process of heating a metal or glass to a high temperature and then cooling it
gradually, which allows the material to reach a low-energy crystalline state. The same process is used in simulated
annealing, in which the algorithm picks a random move instead of the best move. If the random move
improves the state, then it follows the same path. Otherwise, the algorithm accepts the move with a
probability of less than 1, or it moves downhill and chooses another path.
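A minimal simulated-annealing sketch in Python; the cooling schedule, initial temperature, and objective below are illustrative choices, not fixed by the algorithm:

import math
import random

def simulated_annealing(objective, neighbor, state, t0=10.0, cooling=0.95, steps=1000):
    """Picks a random move; always accepts improvements, and accepts worse
    moves with probability exp(delta / T), which shrinks as T cools."""
    t = t0
    for _ in range(steps):
        candidate = neighbor(state)
        delta = objective(candidate) - objective(state)
        if delta > 0 or random.random() < math.exp(delta / t):
            state = candidate         # accept the move
        t *= cooling                  # gradually cool the temperature
    return state

# Illustrative problem: maximize f(x) = -(x - 3)^2.
objective = lambda x: -(x - 3) ** 2
neighbor = lambda x: x + random.choice([-1, 1])
print(simulated_annealing(objective, neighbor, 0))  # usually 3, or close to it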
MODULE 3
Adversarial Search
Adversarial search is a search in which we examine the problems that arise when we try to plan ahead
in a world where other agents are planning against us.
o In previous topics, we have studied search strategies which are only associated with a single agent
that aims to find a solution, often expressed in the form of a sequence of actions.
o But, there might be some situations where more than one agent is searching for the solution in the same
search space, and this situation usually occurs in game playing.
o The environment with more than one agent is termed a multi-agent environment, in which each agent
is an opponent of the other agents and plays against them. Each agent needs to consider the actions of
the other agents and the effect of those actions on its own performance.
o So, Searches in which two or more players with conflicting goals are trying to explore the same
search space for the solution, are called adversarial searches, often known as Games.
o Games are modeled as a search problem with a heuristic evaluation function, and these are the two main
factors which help to model and solve games in AI.
Types of Games in AI:
o Perfect information: A game with perfect information is one in which agents can see the
complete board. Agents have all the information about the game, and they can also see each other's moves.
Examples are Chess, Checkers, Go, etc.
o Imperfect information: If in a game agents do not have all information about the game and are not aware
of what's going on, such games are called games with imperfect information, such as Battleship, blind
tic-tac-toe, Bridge, etc.
o Deterministic games: Deterministic games are those games which follow a strict pattern and set of rules
for the games, and there is no randomness associated with them. Examples are chess, Checkers, Go, tic-
tac-toe, etc.
o Non-deterministic games: Non-deterministic games are those which have various unpredictable events
and a factor of chance or luck. This factor of chance or luck is introduced by either dice or cards.
These are random, and each action response is not fixed. Such games are also called stochastic games.
Example: Backgammon, Monopoly, Poker, etc.
Note: In this topic, we will discuss deterministic games, fully observable environment, zero-sum, and where each agent
acts alternatively.
Zero-Sum Game
o Zero-sum games are adversarial searches which involve pure competition.
o In Zero-sum game each agent's gain or loss of utility is exactly balanced by the losses or gains of utility of
another agent.
o One player of the game tries to maximize a single value, while the other player tries to minimize it.
o Each move by one player in the game is called a ply.
o Chess and tic-tac-toe are examples of a Zero-sum game.
Zero-sum game: Embedded thinking
The zero-sum game involves embedded thinking in which one agent or player is trying to figure out:
o What to do.
o How to decide the move
o That he needs to think about his opponent as well.
o That the opponent also thinks about what to do.
Each of the players is trying to figure out the response of his opponent to their actions. This requires embedded
thinking or backward reasoning to solve game problems in AI.
Formalization of the problem:
A game can be defined as a type of search in AI which can be formalized with the following elements:
o Initial state: It specifies how the game is set up at the start.
o Player(s): It specifies which player has the move in a state.
o Action(s): It returns the set of legal moves in state space.
o Result(s, a): It is the transition model, which specifies the result of a move in the state space.
o Terminal-Test(s): The terminal test is true if the game is over, otherwise it is false. States where
the game ends are called terminal states.
o Utility(s, p): A utility function gives the final numeric value for a game that ends in terminal state s for
player p. It is also called the payoff function. For chess, the outcomes are a win, loss, or draw, with payoff
values +1, 0, and ½. For tic-tac-toe, the utility values are +1, -1, and 0.
Game tree:
A game tree is a tree where the nodes are game states and the edges are the moves made by
players. A game tree involves the initial state, the actions function, and the result function.
Example: Tic-Tac-Toe game tree:
The following figure is showing part of the game-tree for tic-tac-toe game. Following are some key points of the
game:
o There are two players MAX and MIN.
o Players have an alternate turn and start with MAX.
o MAX maximizes the result of the game tree
o MIN minimizes the result.
Example Explanation:
o From the initial state, MAX has 9 possible moves, as he starts first. MAX places x and MIN places o, and both
players play alternately until we reach a leaf node where one player has three in a row or all squares are
filled.
o Both players will compute, for each node, the minimax value, which is the best achievable utility
against an optimal adversary.
o Suppose both the players are well aware of the tic-tac-toe and playing the best play. Each player is doing
his best to prevent another one from winning. MIN is acting against Max in the game.
o So in the game tree we have a layer of MAX and a layer of MIN, and each layer is called a ply. MAX places x,
then MIN places o to prevent MAX from winning, and this game continues until the terminal node.
o In the end, either MIN wins, MAX wins, or it is a draw. This game tree is the whole search space of possibilities
when MIN and MAX play tic-tac-toe and take turns alternately.
Hence adversarial Search for the minimax procedure works as follows:
o It aims to find the optimal strategy for MAX to win the game.
o It follows the approach of Depth-first search.
o In the game tree, the optimal leaf node could appear at any depth of the tree.
o It propagates the minimax values up the tree once the terminal nodes are discovered.
In a given game tree, the optimal strategy can be determined from the minimax value of each node, which can be
written as MINIMAX(n). MAX prefers to move to a state of maximum value and MIN prefers to move to a state of
minimum value, so:
MINIMAX(s) = UTILITY(s), if s is a terminal state
MINIMAX(s) = max over all actions a of MINIMAX(RESULT(s, a)), if the player to move in s is MAX
MINIMAX(s) = min over all actions a of MINIMAX(RESULT(s, a)), if the player to move in s is MIN
Step 2: Now we first find the utility values for the maximizer. Its initial value is -∞, so we compare each
terminal-state value with the maximizer's initial value and determine the higher node values. It finds the
maximum among them all.
o For node D max(-1, -∞) => max(-1, 4) = 4
o For Node E max(2, -∞) => max(2, 6)= 6
o For Node F max(-3, -∞) => max(-3,-5) = -3
o For node G max(0, -∞) = max(0, 7) = 7
Step 3: In the next step, it is the minimizer's turn, so it will compare all node values with +∞ and find the
third-layer node values.
o For node B= min(4,6) = 4
o For node C= min (-3, 7) = -3
Step 4: Now it is the maximizer's turn, and it will again choose the maximum of all node values and find the
maximum value for the root node. In this game tree there are only 4 layers, so we reach the root node
immediately, but in real games there will be more than 4 layers.
o For node A max(4, -3)= 4
That was the complete workflow of the minimax two player game.
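A minimal plain-minimax sketch in Python on an explicit game tree; the leaf values mirror the worked example above (D = max(-1, 4), E = max(2, 6), F = max(-3, -5), G = max(0, 7)):

def minimax(node, maximizing):
    """Returns the minimax value of a node in an explicit game tree,
    where lists are internal nodes and numbers are terminal utilities."""
    if isinstance(node, (int, float)):
        return node                   # terminal node: return its utility
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A = max(B, C); B = min(D, E); C = min(F, G); D..G are MAX nodes over leaves.
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
print(minimax(tree, True))  # 4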
Properties of Mini-Max algorithm:
o Complete- The Min-Max algorithm is complete. It will definitely find a solution (if one exists) in a finite search
tree.
o Optimal- Min-Max algorithm is optimal if both opponents are playing optimally.
o Time complexity- As it performs DFS for the game tree, the time complexity of the Min-Max algorithm
is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
o Space Complexity- The space complexity of the Mini-max algorithm is also similar to DFS, which is O(bm).
Limitation of the minimax Algorithm:
The main drawback of the minimax algorithm is that it gets really slow for complex games such as Chess, Go, etc.
These games have a huge branching factor, and the player has lots of choices to decide among. This limitation of the
minimax algorithm can be improved with alpha-beta pruning, which we discuss in the next topic.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the
minimax algorithm.
o As we have seen in the minimax search algorithm, the number of game states it has to examine is
exponential in the depth of the tree. We cannot eliminate the exponent, but we can cut it in half. Hence
there is a technique by which we can compute the correct minimax decision without checking each node of the
game tree, and this technique is called pruning. It involves two threshold parameters, alpha and
beta, for future expansion, so it is called alpha-beta pruning. It is also called the Alpha-Beta Algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only the tree leaves
but also entire sub-trees.
o The two-parameter can be defined as:
1. Alpha: The best (highest-value) choice we have found so far at any point along the path of
Maximizer. The initial value of alpha is -∞.
2. Beta: The best (lowest-value) choice we have found so far at any point along the path of
Minimizer. The initial value of beta is +∞.
o Alpha-beta pruning applied to a standard minimax tree returns the same move as the standard
algorithm does, but it removes all the nodes which do not really affect the final decision but make the
algorithm slow. Pruning these nodes makes the algorithm fast.
Note: To better understand this topic, kindly study the minimax algorithm.
Condition for Alpha-beta pruning:
The main condition which required for alpha-beta pruning is:
1. α>=β
Key points about alpha-beta pruning:
o The Max player will only update the value of alpha.
o The Min player will only update the value of beta.
o While backtracking the tree, the node values will be passed to upper nodes instead of values of alpha and
beta.
o We will only pass the alpha, beta values to the child nodes.
Pseudo-code for Alpha-beta Pruning:
function minimax(node, depth, alpha, beta, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then          // for Maximizer Player
        maxEva = -infinity
        for each child of node do
            eva = minimax(child, depth-1, alpha, beta, false)
            maxEva = max(maxEva, eva)
            alpha = max(alpha, maxEva)
            if beta <= alpha then
                break                 // beta cut-off
        return maxEva

    else                              // for Minimizer player
        minEva = +infinity
        for each child of node do
            eva = minimax(child, depth-1, alpha, beta, true)
            minEva = min(minEva, eva)
            beta = min(beta, minEva)
            if beta <= alpha then
                break                 // alpha cut-off
        return minEva
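The following is a runnable Python rendering of the pseudocode above, applied to the same tree as in the minimax example; with this left-to-right ordering, node G's subtree is pruned, yet the result matches plain minimax:

import math

def alphabeta(node, depth, alpha, beta, maximizing_player):
    """Minimax with alpha-beta pruning on an explicit game tree
    (lists are internal nodes, numbers are terminal utilities)."""
    if depth == 0 or isinstance(node, (int, float)):
        return node
    if maximizing_player:
        max_eva = -math.inf
        for child in node:
            eva = alphabeta(child, depth - 1, alpha, beta, False)
            max_eva = max(max_eva, eva)
            alpha = max(alpha, max_eva)
            if beta <= alpha:
                break                 # beta cut-off: MIN will avoid this branch
        return max_eva
    else:
        min_eva = math.inf
        for child in node:
            eva = alphabeta(child, depth - 1, alpha, beta, True)
            min_eva = min(min_eva, eva)
            beta = min(beta, min_eva)
            if beta <= alpha:
                break                 # alpha cut-off: MAX will avoid this branch
        return min_eva

tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]  # same tree as the minimax example
print(alphabeta(tree, 3, -math.inf, math.inf, True))  # 4, with G's subtree pruned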
Working of Alpha-Beta Pruning:
Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.
Step 1: At the first step, the Max player will start the first move from node A, where α = -∞ and β = +∞. These values of
alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its
child D.
Step 2: At node D, the value of α will be calculated, as it is Max's turn. The value of α is compared first with 2
and then with 3, and max(2, 3) = 3 will be the value of α at node D; the node value will also be 3.
Step 3: Now the algorithm backtracks to node B, where the value of β will change, as this is Min's turn. Now β = +∞
is compared with the available subsequent node values, i.e. min(∞, 3) = 3; hence at node B now α = -∞ and β =
3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and
β = 3 will also be passed.
Step 4: At node E, Max will take its turn, and the value of alpha will change. The current value of alpha will be
compared with 5, so max (-∞, 5) = 5, hence at node E α= 5 and β= 3, where α>=β, so the right successor of E
will be pruned, and algorithm will not traverse it, and the value at node E will be 5.
Step 5: At the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha
will be changed; the maximum available value is 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed on
to the right successor of A, which is node C.
At node C, α=3 and β= +∞, and the same values will be passed on to node F.
Step 6: At node F, the value of α is again compared, first with the left child, which is 0, giving max(3, 0) = 3, and then
with the right child, which is 1, giving max(3, 1) = 3; α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta will be changed:
it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, and again the condition α >= β is satisfied, so
the next child of C, which is G, will be pruned, and the algorithm will not compute the entire sub-tree G.
Step 8: C now returns the value 1 to A, and the best value for A is max(3, 1) = 3. Following is the final game
tree, showing the nodes that were computed and the nodes that were never computed. Hence the optimal
value for the maximizer is 3 for this example.
Move Ordering in Alpha-Beta pruning:
The effectiveness of alpha-beta pruning is highly dependent on the order in which each node is examined. Move
order is an important aspect of alpha-beta pruning.
o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of the leaves of the
tree and works exactly like the minimax algorithm. In this case, it also consumes more time because of the alpha-
beta factors; such an order of pruning is called worst ordering. Here the best move occurs on the
right side of the tree. The time complexity for such an order is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when lots of pruning happens in the
tree and the best moves occur on the left side of the tree. We apply DFS, so it first searches the left of the tree
and can go roughly twice as deep as the minimax algorithm in the same amount of time. The complexity for ideal ordering is
O(b^(m/2)).
Knowledge-Based Agents in Artificial Intelligence
o An intelligent agent needs knowledge about the real world to take decisions and reason in order to act
efficiently.
o Knowledge-based agents are those agents who have the capability of maintaining an internal state
of knowledge, reason over that knowledge, update their knowledge after observations and
take actions. These agents can represent the world with some formal representation and act
intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
The above diagram represents a generalized architecture for a knowledge-based agent. The knowledge-
based agent (KBA) takes input from the environment by perceiving it. The input is taken by the
inference engine of the agent, which also communicates with the KB to decide as per the knowledge stored in the KB.
The learning element of the KBA regularly updates the KB by learning new knowledge.
Knowledge-base is required for updating knowledge for an agent to learn with experiences and take action as
per the knowledge.
Inference system
Inference means deriving new sentences from old ones. The inference system allows us to add a new sentence to the
knowledge base. A sentence is a proposition about the world. The inference system applies logical rules to the KB
to deduce new information.
The inference system generates new facts so that an agent can update the KB. An inference system works mainly
with two rules, which are given as:
o Forward chaining
o Backward chaining
Operations Performed by KBA
Following are the three operations performed by a KBA in order to show intelligent
behavior:
1. TELL: This operation tells the knowledge base what it perceives from the environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
A generic knowledge-based agent:
Following is the structure outline of a generic knowledge-based agents program:
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action = ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t = t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output. The agent maintains the
knowledge base, KB, and it initially has some background knowledge of the real world. It also has a counter to
indicate the time for the whole process, and this counter is initialized with zero.
Each time when the function is called, it performs its three operations:
o Firstly, it TELLs the KB what it perceives.
o Secondly, it ASKs the KB what action it should take.
o Thirdly, the agent program TELLs the KB which action was chosen.
The MAKE-PERCEPT-SENTENCE generates a sentence asserting that the agent perceived the given percept at
the given time.
The MAKE-ACTION-QUERY generates a sentence to ask which action should be done at the current time.
MAKE-ACTION-SENTENCE generates a sentence which asserts that the chosen action was executed.
Various levels of knowledge-based agent:
A knowledge-based agent can be viewed at different levels which are given below:
1. Knowledge level
Knowledge level is the first level of a knowledge-based agent. At this level, we need to specify what the
agent knows and what the agent's goals are. With these specifications, we can fix its behavior. For example,
suppose an automated taxi agent needs to go from station A to station B, and it knows the way from A to B;
this comes at the knowledge level.
2. Logical level:
At this level, we understand how the knowledge is represented and stored. At this level,
sentences are encoded into different logics; that is, an encoding of knowledge into logical
sentences occurs. At the logical level, we can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the agent performs actions
as per the logical and knowledge levels. At this level, the automated taxi agent actually implements its knowledge
and logic so that it can reach the destination.
Approaches to designing a knowledge-based agent:
There are mainly two approaches to build a knowledge-based agent:
1. Declarative approach: We can create a knowledge-based agent by initializing it with an empty
knowledge base and telling the agent all the sentences with which we want it to start. This approach
is called the declarative approach.
2. Procedural approach: In the procedural approach, we directly encode the desired behavior as
program code, which means we just need to write a program that already encodes the desired
behavior of the agent.
However, in the real world, a successful agent can be built by combining both declarative and procedural
approaches, and declarative knowledge can often be compiled into more efficient procedural code.
Propositional logic in Artificial intelligence
Propositional logic (PL) is the simplest form of logic where all the statements are made by propositions. A
proposition is a declarative statement which is either true or false. It is a technique of knowledge representation in
logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (False proposition)
c) 3 + 3 = 7. (False proposition)
d) 5 is a prime number.
Following are some basic facts about propositional logic:
o Propositional logic is also called Boolean logic as it works on 0 and 1.
o In propositional logic, we use symbolic variables to represent the logic, and we can use any symbol to
represent a proposition, such as A, B, C, P, Q, R, etc.
o Propositions can be either true or false, but cannot be both.
o Propositional logic consists of propositions and logical connectives.
o These connectives are also called logical operators.
o The propositions and connectives are the basic elements of the propositional logic.
o Connectives can be said as a logical operator which connects two sentences.
o A proposition formula which is always true is called a tautology, and it is also called a valid sentence.
o A proposition formula which is always false is called a contradiction.
o A proposition formula which can take both true and false values is called a contingency.
o Statements which are questions, commands, or opinions are not propositions; for example, "Where is Rohini?",
"How are you?", and "What is your name?" are not propositions.
Syntax of propositional logic:
The syntax of propositional logic defines the allowable sentences for the knowledge representation. There are two
types of Propositions:
a. Atomic Propositions
b. Compound propositions
o Atomic Proposition: Atomic propositions are simple propositions consisting of a single proposition
symbol. These are the sentences which must be either true or false.
Example:
a) "2 + 2 is 4" is an atomic proposition, as it is a true fact.
b) "The Sun is cold" is also a proposition, as it is a false fact.
o Compound proposition: Compound propositions are constructed by combining simpler or atomic
propositions, using parenthesis and logical connectives.
Example:
1. a) "It is raining today, and street is wet."
2. b) "Ankit is a doctor, and his clinic is in Mumbai."
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can
create compound propositions with the help of logical connectives. There are mainly five connectives, which are
given as follows:
1. Negation: A sentence such as ¬ P is called negation of P. A literal can be either Positive literal or negative
literal.
2. Conjunction: A sentence which has ∧ connective such as, P ∧ Q is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as,
P= Rohan is intelligent,
Q= Rohan is hardworking. → P∧ Q.
3. Disjunction: A sentence which has a ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are
the propositions.
Example: "Ritika is a doctor or an engineer."
Here P = Ritika is a doctor, Q = Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q, is called an implication. Implications are also known as if-then
rules. It can be represented as
If it is raining, then the street is wet.
Let P= It is raining, and Q= Street is wet, so it is represented as P → Q
5. Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: "I am breathing if and
only if I am alive."
P = I am breathing, Q = I am alive; it can be represented as P ⇔ Q.
Following is the summarized table for Propositional Logic Connectives:
Connective symbol   Technical term   Word             Example
∧                   Conjunction      AND              P ∧ Q
∨                   Disjunction      OR               P ∨ Q
→                   Implication      If-then          P → Q
⇔                   Biconditional    If and only if   P ⇔ Q
¬                   Negation         NOT              ¬P
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine
all the possible combinations of truth values with logical connectives, and the representation of these combinations
in a tabular format is called a truth table. Following is the combined truth table for all logical connectives:
P   Q   ¬P   P ∧ Q   P ∨ Q   P → Q   P ⇔ Q
T   T   F    T       T       T       T
T   F   F    F       T       F       F
F   T   T    F       T       T       T
F   F   T    F       F       T       T
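As a quick illustration (a small Python sketch added here for clarity, not part of the original table), the truth table above can be generated by enumerating all combinations of truth values:

from itertools import product

# The five connectives as small Python functions.
def NOT(p): return not p
def AND(p, q): return p and q
def OR(p, q): return p or q
def IMPLIES(p, q): return (not p) or q   # P → Q is false only when P is true and Q is false
def IFF(p, q): return p == q             # P ⇔ Q is true when both sides have the same value

cols = ["P", "Q", "¬P", "P∧Q", "P∨Q", "P→Q", "P⇔Q"]
print("  ".join(cols))
for p, q in product([True, False], repeat=2):
    vals = (p, q, NOT(p), AND(p, q), OR(p, q), IMPLIES(p, q), IFF(p, q))
    print("  ".join(("T" if v else "F").ljust(len(c)) for v, c in zip(vals, cols)))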
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectives or logical operators. This
order should be followed while evaluating a propositional problem. Following is the list of the precedence order for
operators:
Precedence   Operators
First        Parenthesis
Second       Negation
Third        Conjunction (AND)
Fourth       Disjunction (OR)
Fifth        Implication
Sixth        Biconditional
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ False = P.
o Domination:
o P ∨ True = True,
o P ∧ False = False.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o De Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
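These identities can be verified mechanically by exhaustive enumeration, exactly as a truth table does. The following Python sketch (illustrative only) checks De Morgan's law and distributivity over every truth assignment:

from itertools import product

def equivalent(f, g, num_vars):
    # Two formulas are equivalent if they agree on every truth assignment.
    return all(f(*vals) == g(*vals)
               for vals in product([True, False], repeat=num_vars))

# De Morgan's law: ¬(P ∧ Q) = (¬P) ∨ (¬Q)
print(equivalent(lambda p, q: not (p and q),
                 lambda p, q: (not p) or (not q), 2))          # True

# Distributivity: P ∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R)
print(equivalent(lambda p, q, r: p and (q or r),
                 lambda p, q, r: (p and q) or (p and r), 3))   # True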
Limitations of Propositional logic:
o We cannot represent relations like all, some, or none with propositional logic. Example:
1. All the girls are intelligent.
2. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or logical relationships.
First-Order Logic in Artificial intelligence
In the topic of propositional logic, we have seen how to represent statements using propositional logic. But
unfortunately, in propositional logic, we can only represent facts which are either true or false. PL is not
sufficient to represent complex sentences or natural language statements; it has very limited expressive power.
Consider the following sentences, which we cannot represent using PL:
o "Some humans are intelligent", or
o "Sachin likes cricket."
To represent the above statements, PL is not sufficient, so we require a more powerful logic, such as
first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is an extension to
propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a concise way.
o First-order logic is also known as Predicate logic or First-order predicate logic. First-order logic is a
powerful language that expresses information about objects in an easier way and can also express
the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts like
propositional logic but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: These can be unary relations such as: red, round, is adjacent, or n-ary relations such
as: the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o Like natural language, first-order logic also has two main parts:
o Syntax
o Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collections of symbols form logical expressions in first-order logic. The basic
syntactic elements of first-order logic are symbols. We write statements in short-hand notation in FOL.
Basic Elements of First-order logic:
Following are the basic elements of FOL syntax:
Constant      1, 2, A, John, Mumbai, cat, ....
Variables     x, y, z, a, b, ....
Predicates    Brothers, cat, is adjacent, ....
Function      father of, best friend, ....
Connectives   ∧, ∨, ¬, ⇒, ⇔
Equality      =
Quantifier    ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a
predicate symbol followed by a parenthesis with a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).
Chinky is a cat: => Cat(Chinky).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts:
o Subject: Subject is the main part of the statement.
o Predicate: A predicate can be defined as a relation, which binds two atoms together in a statement.
Consider the statement "x is an integer."; it consists of two parts: the first part, x, is the subject of the
statement, and the second part, "is an integer," is known as the predicate.
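One simple way to make atomic sentences concrete (an illustrative Python sketch; the knowledge base contents are taken from the examples above) is to store each sentence as a predicate/term tuple:

# Atomic sentences stored as (predicate, term1, term2, ...) tuples.
kb = {
    ("Brothers", "Ravi", "Ajay"),  # Brothers(Ravi, Ajay)
    ("Cat", "Chinky"),             # Cat(Chinky)
    ("Integer", "x"),              # "x is an integer": subject x, predicate "is an integer"
}

def holds(*sentence):
    # In this toy model, an atomic sentence is true if it appears in the KB.
    return sentence in kb

print(holds("Cat", "Chinky"))             # True
print(holds("Brothers", "Ajay", "Ravi"))  # False: the order of terms matters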
Step-2:
At the second step, we will add those facts which can be inferred from the available facts through rules whose
premises are satisfied.
Rule-(1) does not have its premises satisfied yet, so it will not be added in the first iteration.
Rule-(2) and Rule-(3) are already added.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, which is inferred from the
conjunction of Rule-(2) and Rule-(3).
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which is inferred from Rule-(7).
Step-3:
At step-3, we can check that Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add
Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement by
forward chaining.
Step-2:
At the second step of backward chaining, we will infer other facts from the goal fact which satisfy the rules. As we
can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution {p/Robert}. So we will
add all the conjunctive facts below the first level and replace p with Robert.
Here we can see that American(Robert) is a fact, so it is proved here.
Step-3:
At step-3, we will extract the further fact Missile(q), which is inferred from Weapon(q), as it satisfies Rule-(5).
Weapon(q) is also true with the substitution of the constant T1 for q.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4),
with the substitution of A in place of r. So these two statements are proved here.
Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6). And hence all the
statements are proved true using backward chaining.
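The forward-chaining loop described in the steps above can be sketched in Python. This is a simplified propositional version (the rule names echo the example, but first-order substitution and unification are deliberately omitted):

# Each rule is (set of premises, conclusion), i.e., a definite clause.
rules = [
    ({"American", "Weapon", "Sells", "Hostile"}, "Criminal"),  # Rule-(1), simplified
    ({"Missile", "Owns"}, "Sells"),                            # Rule-(4), simplified
    ({"Missile"}, "Weapon"),                                   # Rule-(5)
    ({"Enemy"}, "Hostile"),                                    # Rule-(6)
]
facts = {"American", "Missile", "Owns", "Enemy"}               # the known facts

# Fire every rule whose premises are all satisfied, until nothing new is added.
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print("Criminal" in facts)  # True: the goal is derived, as in Step-3 above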
Difference between backward chaining and forward chaining
Following is the difference between the forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves forward by applying
inference rules to extract more data, and it continues until it reaches the goal, whereas backward
chaining starts from the goal and moves backward by using inference rules to determine the facts that
satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas backward chaining is called
a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward chaining is known as the top-
down approach.
o Forward chaining uses a breadth-first search strategy, whereas backward chaining uses a depth-first
search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process monitoring, diagnosis, and
classification, whereas backward chaining can be used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries to avoid the
unnecessary path of reasoning.
o In forward-chaining there can be various ASK questions from the knowledge base, whereas in backward
chaining there can be fewer ASK questions.
o Forward chaining is slow, as it checks all the rules, whereas backward chaining is fast, as it checks only
the few required rules.
S. No. | Forward Chaining | Backward Chaining
1. | Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal. | Backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.
5. | Forward chaining tests all the available rules. | Backward chaining tests only the few required rules.
6. | Forward chaining is suitable for planning, monitoring, control, and interpretation applications. | Backward chaining is suitable for diagnostic, prescription, and debugging applications.
9. | Forward chaining is aimed at any conclusion. | Backward chaining is aimed only at the required data.
Uncertainty:
Till now, we have learned knowledge representation using first-order logic and propositional logic with certainty,
which means we were sure about the predicates. With this kind of knowledge representation we might write A → B,
which means if A is true then B is true. But consider a situation where we are not sure whether A is true or not;
then we cannot express this statement. This situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or
probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty in the real world:
1. Information from unreliable sources
2. Experimental errors
3. Equipment faults
4. Temperature variation
5. Climate change
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability to indicate
the uncertainty in knowledge. In probabilistic reasoning, we combine probability theory with logic to handle the
uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that results
from someone's laziness or ignorance.
In the real world, there are lots of scenarios where the certainty of something is not confirmed, such as "It will
rain today," "the behavior of someone in some situation," or "a match between two teams or two players." These are
probable sentences for which we can assume that they will happen, but we are not sure about it, so here we use
probabilistic reasoning.
Need of probabilistic reasoning in AI:
o When there are unpredictable outcomes.
o When the specifications or possibilities of predicates become too large to handle.
o When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
o Bayes' rule
o Bayesian Statistics
Note: We will learn the above two rules in later chapters.
As probabilistic reasoning uses probability and related terms, let's first understand some common terms before
studying probabilistic reasoning itself:
Probability: Probability can be defined as the chance that an uncertain event will occur. It is the numerical measure
of the likelihood that an event will occur. The value of probability always remains between 0 and 1, which represent
the ideal cases of complete uncertainty and complete certainty:
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates total uncertainty in an event A.
P(A) = 1 indicates total certainty in an event A.
We can find the probability of an uncertain event by using the below formula:
P(A) = Number of desirable outcomes / Total number of outcomes
Conditional probability is the probability of an event A occurring when another event B has already happened:
P(A|B) = P(A⋀B) / P(B)
This can be explained with a Venn diagram: once B has occurred, the sample space is reduced to the set B, and we
can calculate the probability of event A given that event B has already occurred by dividing the probability
of P(A⋀B) by P(B).
Example:
In a class, 70% of the students like English and 40% of the students like both English and
mathematics. What percentage of the students who like English also like mathematics?
Solution:
Let A be the event that a student likes mathematics, and
B be the event that a student likes English. Then:
P(A|B) = P(A⋀B) / P(B) = 0.4 / 0.7 ≈ 0.57
Hence, 57% of the students who like English also like mathematics.
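The same calculation can be checked with a couple of lines of Python (an illustrative sketch):

# P(A|B) = P(A ⋀ B) / P(B)
p_english = 0.7   # P(B): the student likes English
p_both = 0.4      # P(A ⋀ B): the student likes English and mathematics

p_math_given_english = p_both / p_english
print(round(p_math_given_english, 2))  # 0.57, i.e. about 57%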
Bayes' theorem can be derived from the product rule of probability, which can be written in two ways:
P(A⋀B) = P(A|B) * P(B)
P(A⋀B) = P(B|A) * P(A)
Equating the right-hand sides and dividing by P(B), we get:
P(A|B) = P(B|A) * P(A) / P(B) ........(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI
systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate. It is read as the probability of hypothesis A given
that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, it is the probability of observing the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) * P(B|Ai); hence Bayes' rule can be written as:
P(Ai|B) = P(B|Ai) * P(Ai) / Σk P(Ak) * P(B|Ak)
where A1, A2, A3,........, An is a set of mutually exclusive and exhaustive events.
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in
cases where we have good probability estimates for these three terms and want to determine the fourth one. Suppose
we want to perceive the effect of some unknown cause and want to compute that cause; then Bayes' rule
becomes:
P(cause|effect) = P(effect|cause) * P(cause) / P(effect)
Example-1:
Question: What is the probability that a patient has the disease meningitis, given a stiff neck?
Given Data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the time. He
is also aware of some more facts, which are given as follows:
o The known probability that a patient has the meningitis disease is 1/30,000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has meningitis.
Then we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
Applying Bayes' rule:
P(b|a) = P(a|b) * P(b) / P(a) = (0.8 * 1/30000) / 0.02 ≈ 0.00133 = 1/750
Hence, we can assume that 1 patient out of 750 patients has the meningitis disease, given a stiff neck.
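The same computation, expressed as a small reusable function (an illustrative sketch):

def bayes(p_a_given_b, p_b, p_a):
    # Bayes' rule: P(b|a) = P(a|b) * P(b) / P(a)
    return p_a_given_b * p_b / p_a

# Meningitis example from above.
p = bayes(p_a_given_b=0.8, p_b=1/30000, p_a=0.02)
print(p)             # ≈ 0.001333
print(round(1 / p))  # 750, i.e. roughly 1 patient in 750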
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the card
is a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the probability that the drawn
card is a king, given that it is a face card.
Solution:
P(Face|King) = 1, since every king is a face card.
P(King) = 4/52.
P(Face) = 12/52, since there are 3 face cards (jack, queen, king) in each of the 4 suits.
By Bayes' rule:
P(King|Face) = P(Face|King) * P(King) / P(Face) = (1 * 4/52) / (12/52) = 1/3
Hence, the probability that the drawn face card is a king is 1/3.
Fuzzy Logic Systems
As an example of building a fuzzy logic system, consider a variable x described by five linguistic terms:
LP    x is Large Positive
MP    x is Medium Positive
S     x is Small
MN    x is Medium Negative
LN    x is Large Negative
Algorithm
1. Define linguistic variables and terms. (start)
2. Construct membership functions for them. (start)
3. Construct the knowledge base of rules. (start)
4. Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
5. Evaluate the rules in the rule base. (inference engine)
6. Combine the results from each rule. (inference engine)
7. Convert the output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room temperature, cold, warm, hot, etc., are
linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term, and each covers some portion of the overall range of temperature values.
Step 2 − Construct membership functions for them
Membership functions are then constructed for each linguistic term of the temperature variable, assigning every crisp temperature value a degree of membership between 0 and 1.
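A common choice of membership function is a triangular shape. The sketch below is illustrative only, with made-up breakpoints (in °C) standing in for the curves of the omitted figure:

def triangular(x, a, b, c):
    # Membership rises linearly from a to the peak at b,
    # then falls linearly from b to c; it is zero outside [a, c].
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical breakpoints for two of the five linguistic terms.
temperature = 22.0
print(triangular(temperature, 10, 20, 30))  # degree of membership in "warm"      → 0.8
print(triangular(temperature, 20, 30, 40))  # degree of membership in "very-warm" → 0.2

A crisp temperature of 22 °C is thus fuzzified into partial membership in two terms at once, which is exactly what the fuzzification step of the algorithm produces.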
Step 3 − Construct knowledge base of rules
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. Condition Action