My notebook for AI

Foundation Of Artificial Intelligence (Chandigarh University)



Downloaded by Milandeep Kour Bali ([email protected])

Unit 2: Constructing Search Algorithm Trees (Binary Search Tree)


30 October 2023 11:40

What is a tree?
A tree is a data structure used to represent data in hierarchical form. It can be defined as a collection of objects or entities, called nodes, that are linked together to simulate a hierarchy. A tree is a non-linear data structure, as the data in a tree is not stored linearly or sequentially.

A binary search tree follows a specific order to arrange its elements. In a binary search tree, the value of a left node must be smaller than its parent node, and the value of a right node must be greater than its parent node. This rule is applied recursively to the left and right subtrees of the root.
Let's understand the concept of Binary search tree with an example.

In the above figure, we can observe that the root node is 40, all the nodes of the left subtree are smaller than the root node, and all the nodes of the right subtree are greater than the root node.
Similarly, the left child of the root node is greater than its own left child and smaller than its own right child, so it also satisfies the property of a binary search tree. Therefore, we can say that the tree in the above image is a binary search tree.
Now suppose we change the value of node 35 to 55 in the above tree; let us check whether the tree is still a binary search tree.

In the above tree, the value of the root node is 40, which is greater than its left child 30 but smaller than 30's right child, i.e., 55. Since 55 lies in the left subtree of 40 yet is greater than 40, the above tree does not satisfy the property of a binary search tree.
Therefore, the above tree is not a binary search tree.
Insertion in Binary Search tree
A new key in a BST is always inserted at a leaf. To insert an element into a BST, we start searching from the root node; if the key to be inserted is less than the root node, we search for an empty location in the left subtree. Otherwise, we search for an empty location in the right subtree and insert the data there. Insertion in a BST is similar to searching, as we always maintain the rule that the left subtree is smaller than the root and the right subtree is larger than the root.
Now, let's see the process of inserting a node into a BST using an example.
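The insertion rule described above can be sketched in Python. This is a minimal illustrative implementation (the `Node` class and helper names are our own, not from the notes); duplicates are sent to the right subtree here, which is one common convention.

```python
class Node:
    """A single BST node holding a key and links to its two children."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(key)                      # empty spot found: new key becomes a leaf
    if key < root.key:
        root.left = insert(root.left, key)    # smaller keys go into the left subtree
    else:
        root.right = insert(root.right, key)  # larger (or equal) keys go right
    return root

def inorder(root):
    """In-order traversal of a BST yields its keys in sorted order."""
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []

# Build the tree from the figure's values (40 as root)
root = None
for key in [40, 30, 50, 25, 35, 45, 60]:
    root = insert(root, key)
print(inorder(root))  # [25, 30, 35, 40, 45, 50, 60]
```

Because every insertion preserves the ordering rule, an in-order traversal always prints the keys in sorted order, which is a quick way to check that a tree really is a BST.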

Deletion in Binary Search tree


In a binary search tree, we must delete a node while keeping in mind that the property of the BST is not violated. When deleting a node from a BST, three possible situations can occur:
• The node to be deleted is a leaf node,
• The node to be deleted has only one child, or
• The node to be deleted has two children.
We will understand the situations listed above in detail.
When the node to be deleted is the leaf node
This is the simplest case of deleting a node in a BST. Here, we replace the leaf node with NULL and simply free the allocated space.
We can see the process of deleting a leaf node from a BST in the below image. In the below image, suppose we have to delete node 90; as the node to be deleted is a leaf node, it is replaced with NULL, and the allocated space is freed.

When the node to be deleted has only one child


In this case, we have to replace the target node with its child and then delete the child node. After replacing the target node with its child node, the child node holds the value to be deleted, so we simply replace that child node with NULL and free the allocated space.
We can see the process of deleting a node with one child from a BST in the below image. In the below image, suppose we have to delete node 79; as the node to be deleted has only one child, it is replaced with its child 55.
So, the replaced node 79 will now be a leaf node that can be easily deleted.


When the node to be deleted has two children


This case of deleting a node in a BST is more complex than the other two cases. In such a case, the steps to be followed are listed as follows:
• First, find the inorder successor of the node to be deleted.
• After that, swap the node with its inorder successor repeatedly until the target node reaches a leaf of the tree.
• At last, replace the node with NULL and free the allocated space.
The inorder successor is required when the right child of the node is not empty. We can obtain the inorder successor by finding the minimum element in the right subtree of the node.
We can see the process of deleting a node with two children from a BST in the below image. In the below image, suppose we have to delete node 45, which is the root node. As the node to be deleted has two children, it is replaced with its inorder successor. Now node 45 is at a leaf of the tree, so it can be deleted easily.
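All three deletion cases can be sketched together in Python. This is an illustrative, self-contained implementation (the `Node` class and helper names are our own); the two-children case copies the inorder successor's key into the node and then deletes the successor, which has the same effect as the swap-to-leaf description above.

```python
class Node:
    """A single BST node."""
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Standard BST insertion, used here only to build a test tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    """Sorted key sequence of the tree."""
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []

def min_node(node):
    """Leftmost node of a subtree, i.e. its minimum key (the inorder successor
    of the subtree's parent)."""
    while node.left is not None:
        node = node.left
    return node

def delete(root, key):
    """Delete key from the BST rooted at root; return the new root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        # Cases 1 and 2: zero or one child -- splice the node out
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children -- copy the inorder successor's key
        # (minimum of the right subtree), then delete that successor
        succ = min_node(root.right)
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    return root

root = None
for key in [45, 30, 70, 25, 35, 60, 90]:
    root = insert(root, key)
root = delete(root, 45)          # root with two children: replaced by successor 60
print(root.key, inorder(root))   # 60 [25, 30, 35, 60, 70, 90]
```

Deleting the root 45 replaces it with 60, the minimum of its right subtree, exactly as in the worked example, and the BST ordering property is preserved.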

The complexity of the Binary Search tree


Operation   Average     Worst case
Search      O(log n)    O(n)
Insert      O(log n)    O(n)
Delete      O(log n)    O(n)


2: Stochastic Search (Simulated Annealing & Game Playing)


31 October 2023 16:01

Stochastic search is a class of optimization algorithms used to find the best solution to a problem in situations where the objective function is subject to random noise, and it may not always be possible to determine the exact solution due to the presence of uncertainty. Stochastic search algorithms are commonly employed in various fields, including machine learning, artificial intelligence, operations research, and optimization.
The main characteristic of stochastic search is that it incorporates randomness into the search process, which can help explore the solution space more effectively, especially when the solution space is complex, multimodal (has multiple local optima), or noisy. Stochastic search algorithms make use of probabilistic techniques and random sampling to find solutions that are either optimal or near-optimal.

Some well-known stochastic search algorithms include:

Simulated Annealing
Game Playing
Bayesian Theorem

Simulated Annealing
Simulated annealing is a probabilistic optimization algorithm that is often used in artificial intelligence, specifically in solving problems related to optimization and search. It is inspired by the annealing
process in metallurgy, where a material is heated and slowly cooled to remove defects and find the most stable crystalline structure. In the context of AI and optimization, simulated annealing is used to
find the optimal or near-optimal solution to a problem by exploring a complex and often multimodal solution space.

Here's how simulated annealing works in artificial intelligence:


1. Objective Function: Simulated annealing is used to optimize an objective function, which measures the quality of a solution. The goal is to find the solution that maximizes or minimizes
this function.
2. Initial Solution: A random or initial solution is chosen to start the search. This solution can be generated randomly or through some other method.
3. Temperature: Simulated annealing introduces the concept of "temperature." Initially, the temperature is set high, allowing the algorithm to accept solutions that may not be better than the
current one. The temperature gradually decreases over time according to a cooling schedule.
4. Neighbour Generation: At each iteration, the algorithm generates a neighbouring solution by making a small perturbation to the current solution. This can be achieved through various
means, depending on the problem at hand.
5. Evaluating Neighbours: The objective function is used to evaluate the quality of the current solution and the neighbouring solution. The difference in their objective function values is
calculated (often called the "energy" difference).
6. Acceptance Probability: Simulated annealing introduces a probability of accepting a worse solution. If the neighbouring solution is better, it is always accepted. If the neighbouring solution
is worse, it may still be accepted with a certain probability, which is determined by the energy difference and the current temperature.
7. Iteration: The process of generating neighbours, evaluating them, and accepting or rejecting them based on the acceptance probability is repeated for a certain number of iterations or until
a stopping criterion is met.
8. Cooling Schedule: The cooling schedule defines how the temperature decreases over time. Typically, the temperature decreases slowly at the beginning and more rapidly as the algorithm
progresses. The choice of cooling schedule can significantly impact the algorithm's performance.
9. Termination: The algorithm terminates when the temperature reaches a predefined minimum value or when a stopping criterion (e.g., a maximum number of iterations) is met.
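The nine steps above can be sketched as a short Python routine. This is a minimal illustration, not a production implementation: the objective function, the ±1 perturbation step, the starting temperature, and the geometric cooling schedule are all illustrative assumptions chosen for this example.

```python
import math
import random

def simulated_annealing(objective, x0, temp=10.0, cooling=0.95, steps=500, seed=42):
    """Minimize `objective` starting from x0, following the steps above:
    perturb, evaluate, accept better moves always and worse moves with
    probability exp(-delta / T), then lower the temperature."""
    rng = random.Random(seed)   # seeded so the run is reproducible
    current = x0                # step 2: initial solution
    best = x0
    t = temp                    # step 3: initial (high) temperature
    for _ in range(steps):
        # Step 4: generate a neighbour by a small random perturbation
        candidate = current + rng.uniform(-1.0, 1.0)
        # Step 5: energy difference between neighbour and current solution
        delta = objective(candidate) - objective(current)
        # Step 6: accept better moves always; worse ones with prob. e^(-delta/T)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            current = candidate
        if objective(current) < objective(best):
            best = current
        # Step 8: cooling schedule -- geometric decay, floored to stay positive
        t = max(t * cooling, 1e-9)
    # Step 9: termination after a fixed number of iterations
    return best

# Minimize f(x) = (x - 3)^2; the global minimum is at x = 3.
result = simulated_annealing(lambda x: (x - 3) ** 2, x0=-10.0)
```

Early on, the high temperature lets the search accept uphill moves and escape poor regions; as the temperature falls, the process behaves more and more like greedy hill climbing around the best region found.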

Link for video lecture


L32: Simulated Annealing in Artificial Intelligence | Difference Hill Climbing & Simulated Annealing

Game Playing
Game playing in stochastic search involves making decisions in a game with elements of chance or randomness. Stochastic games are characterized by uncertainty, and players must account for this
uncertainty when making their moves.
Game playing was one of the first tasks undertaken in Artificial Intelligence. Game theory has its history from 1950, almost from the days when computers became programmable. The very first game tackled in AI was chess. Pioneers in the field of game theory in AI were Konrad Zuse (the inventor of the first programmable computer and the first programming language), Claude Shannon (the inventor of information theory), Norbert Wiener (the creator of modern control theory), and Alan Turing. Since then, there has been steady progress in the standard of play, to the point that machines have defeated human champions (although not every time) in chess and backgammon, and are competitive in many other games.

TYPES OF GAMES

Perfect Information Game: In which a player knows all the possible moves of himself and of the opponent, and their results. E.g. Chess.
Imperfect Information Game: In which a player does not know all the possible moves of the opponent. E.g. Bridge, since not all the cards are visible to a player.

Game playing is a search problem defined by following components:


Initial state: This defines the initial configuration of the game and identifies the first player to move.
Successor function: This identifies the possible states that can be achieved from the current state. This function returns a list of (move, state) pairs, each indicating a legal move and the resulting state.
Goal test: This checks whether a given state is a goal state or not. States where the game ends are called terminal states.
Path cost / utility / payoff function: This gives a numeric value for the terminal states. In chess, the outcome is a win, loss, or draw, with values +1, -1, or 0. Some games have a wider range of possible outcomes.

Characteristics of game playing

Unpredictable Opponent: Generally, we cannot predict the behaviour of the opponent. Thus we need to find a solution which is a strategy specifying a move for every possible opponent move or every possible state.
Time Constraints: Every game has a time constraint. Thus it may be infeasible to find the best move in the available time.

How to play a game


Typical structure of a game in AI is:
• 2-person game
• Players alternate moves
• Zero-sum game: one player's loss is the other's gain
• Perfect information: both players have access to complete information about the state of the game. No information is hidden from either player.
• No chance (e.g. using dice) involved.
E.g. Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello.
For dealing with such types of games, consider all the legal moves you can make from the current position. Compute the new position resulting from each move. Evaluate each resulting position and determine which is best for you. Make that move. Wait for your opponent to move and repeat the procedure. The main problem in this procedure is how to evaluate a position. An evaluation function or static evaluator is used to evaluate the 'goodness' of a game position. The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a position with respect to both players. Let's consider that f(n) is the evaluation function of the position n. Then,

– f(n) >> 0: position n is good for me and bad for you
– f(n) << 0: position n is bad for me and good for you
– f(n) near 0: position n is a neutral position
e.g. evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]
where a 3-length is a complete row, column, or diagonal
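The tic-tac-toe evaluation function above can be coded directly. This is an illustrative sketch: we read "a 3-length open for a player" as a row, column, or diagonal that contains none of the opponent's marks, which is the standard interpretation of this textbook function.

```python
# The 8 possible 3-lengths of a 3x3 board: 3 rows, 3 columns, 2 diagonals
LINES = [[(r, c) for c in range(3)] for r in range(3)] \
      + [[(r, c) for r in range(3)] for c in range(3)] \
      + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]

def evaluate(board):
    """f(n) = [# 3-lengths open for X] - [# 3-lengths open for O], where a
    line is 'open' for a player if the opponent has no mark on it."""
    open_x = sum(1 for line in LINES if all(board[r][c] != 'O' for r, c in line))
    open_o = sum(1 for line in LINES if all(board[r][c] != 'X' for r, c in line))
    return open_x - open_o

empty = [[' '] * 3 for _ in range(3)]
print(evaluate(empty))    # 0: all 8 lines are open for both players

centre = [[' '] * 3 for _ in range(3)]
centre[1][1] = 'X'        # X takes the centre square
print(evaluate(centre))   # 4: the 4 lines through the centre are now closed to O
```

The centre opening scoring highest (f = 4) matches the usual intuition that the centre square participates in the most winning lines.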

Games are represented in the form of trees wherein nodes represent all the possible states of a game and edges represent moves between them. Initial state of the game is represented by root and
terminal states by leaves of the tree. In a normal search problem, the optimal solution would be a sequence of moves leading to a goal state that is a win. Even a simple game like tic-tac-toe is too complex
for us to draw the entire game tree. Fig 1 shows part of the game tree for tic-tac-toe. Game tree for Tic-Tac-Toe

Let us represent two players by ‘X’ and ‘O’. From the initial state, X has nine possible moves. Play alternates between X and O until we reach leaves. The number on each leaf node indicates the utility value
of the terminal states from the point of view of X. High values are assumed to be good for X and bad for O.

Link for Video Lecture


Introduction to Game Playing in Artificial Intelligence | Learn Game Playing Algorithms with Example


3: A* Search Implementation
31 October 2023 16:17

A* search is the most commonly known form of best-first search. It uses a heuristic function h(n) and the cost to reach node n from the start state, g(n). It combines features of UCS and greedy best-first search, by which it solves problems efficiently. The A* search algorithm finds the shortest path through the search space using the heuristic function. This search algorithm expands fewer nodes of the search tree and provides an optimal result faster. The A* algorithm is similar to UCS except that it uses g(n)+h(n) instead of g(n).
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine both costs as follows, and this sum is called the fitness number.

Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function (g+h). If node n is the goal node, then return success and stop; otherwise,
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For each successor n', check whether n' is already in the OPEN or CLOSED list; if not, then compute the evaluation function for n' and place it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, then attach it to the back pointer which reflects the lowest g(n') value.
Step 6: Return to Step 2.

Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is given in the below table, so we will calculate the f(n) of each state using the formula f(n) = g(n) + h(n), where g(n) is the cost to reach any node from the start state.
Here we will use OPEN and CLOSED list.

Solution:

Initialization: {(S, 5)}


Iteration1: {(S--> A, 4), (S-->G, 10)}
Iteration2: {(S--> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}
Iteration3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}
Iteration 4 will give the final result, as S--->A--->C--->G it provides the optimal path with cost 6.
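The iterations above can be reproduced in code. Since the original graph figure is not reproduced here, the edge costs and heuristic values below are assumptions read off the f(n) values in the worked iterations; the A* routine itself is a standard priority-queue sketch, not the notes' exact implementation.

```python
import heapq

# Edge costs and heuristic values chosen to be consistent with the
# f(n) values in the worked iterations above (an assumption, since
# the original figure is not shown).
GRAPH = {
    'S': [('A', 1), ('G', 10)],
    'A': [('B', 2), ('C', 1)],
    'B': [],
    'C': [('D', 3), ('G', 4)],
    'D': [],
    'G': [],
}
H = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}

def a_star(start, goal):
    """A* search: always expand the OPEN node with the smallest f = g + h."""
    open_list = [(H[start], 0, start, [start])]   # (f, g, node, path)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)                          # move expanded node to CLOSED
        for succ, cost in GRAPH[node]:
            if succ not in closed:
                g2 = g + cost
                heapq.heappush(open_list, (g2 + H[succ], g2, succ, path + [succ]))
    return None, float('inf')

path, cost = a_star('S', 'G')
print(path, cost)  # ['S', 'A', 'C', 'G'] 6
```

Running this expands S (f=5), then A (f=4), then C (f=4), and finally pops G with f=6, matching Iteration 4's optimal path S→A→C→G with cost 6.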
Points to remember:
• The A* algorithm returns the path which occurred first, and it does not search for all remaining paths.
• The efficiency of the A* algorithm depends on the quality of the heuristic.
• The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.
Complete: The A* algorithm is complete as long as:
• The branching factor is finite.
• The cost of every action is fixed.
Optimal: The A* search algorithm is optimal if it satisfies the following two conditions:
• Admissible: the first condition required for optimality is that h(n) should be an admissible heuristic for A* tree search. An admissible heuristic is optimistic in nature.
• Consistency: the second required condition, consistency, applies only to A* graph search.
If the heuristic function is admissible, then A* tree search will always find the least-cost path.
Time Complexity: The time complexity of the A* search algorithm depends on the heuristic function, and the number of nodes expanded is exponential in the depth of the solution d. So the time complexity is O(b^d), where b is the branching factor.
Space Complexity: The space complexity of the A* search algorithm is O(b^d).

Advantages:
• The A* search algorithm performs better than other search algorithms.
• A* search algorithm is optimal and complete.
• This algorithm can solve very complex problems.
Disadvantages:
• It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
• A* search algorithm has some complexity issues.


• The main drawback of A* is memory requirement as it keeps all generated nodes in the memory, so it is not practical for various large-scale problems.

Link for video lecture


L28: A Star(A*) Search Algorithm in Artificial Intelligence with Examples | Informed Search in AI


4: Mini-Max Search


01 November 2023 12:20

• The mini-max algorithm is a recursive or backtracking algorithm which is used in decision-making and game theory. It provides an optimal move for the player, assuming that the opponent is also playing optimally.
• The mini-max algorithm uses recursion to search through the game tree.
• The min-max algorithm is mostly used for game playing in AI, such as chess, checkers, tic-tac-toe, Go, and various other two-player games. This algorithm computes the minimax decision for the current state.
• In this algorithm two players play the game; one is called MAX and the other is called MIN.
• Both players fight it out, as the opponent player gets the minimum benefit while they get the maximum benefit.
• Both players of the game are opponents of each other, where MAX will select the maximized value and MIN will select the minimized value.
• The minimax algorithm performs a depth-first search for the exploration of the complete game tree.
• The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the tree as the recursion unwinds.

Pseudo-code for Minimax Algorithm


def minimax(node, depth, maximizing_player):
    # node.is_terminal(), node.static_evaluation(), and node.children are
    # assumed interfaces of the game-tree node
    if depth == 0 or node.is_terminal():
        return node.static_evaluation()

    if maximizing_player:                      # for the Maximizer player
        max_eva = float('-inf')
        for child in node.children:
            eva = minimax(child, depth - 1, False)
            max_eva = max(max_eva, eva)        # gives maximum of the values
        return max_eva
    else:                                      # for the Minimizer player
        min_eva = float('inf')
        for child in node.children:
            eva = minimax(child, depth - 1, True)
            min_eva = min(min_eva, eva)        # gives minimum of the values
        return min_eva

Initial call:
minimax(root, 3, True)

Working of Min-Max Algorithm:


• The working of the minimax algorithm can be easily described using an example. Below we have taken an example of a game tree representing a two-player game.
• In this example, there are two players; one is called Maximizer and the other is called Minimizer.
• Maximizer will try to get the maximum possible score, and Minimizer will try to get the minimum possible score.
• This algorithm applies DFS, so in this game tree we have to go all the way down to the leaves to reach the terminal nodes.
• At the terminal nodes, the terminal values are given, so we compare those values and backtrack up the tree until the initial state is reached. The following are the main steps involved in solving the two-player game tree:
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to get the utility values for the terminal states. In the below tree diagram, let's take A as the initial state of the tree. Suppose the maximizer takes the first turn, which has a worst-case initial value of -∞, and the minimizer takes the next turn, which has a worst-case initial value of +∞.

Step 2: Now, first we find the utility values for the Maximizer. Its initial value is -∞, so we compare each terminal value with the initial value of the Maximizer and determine the higher node values. It finds the maximum among them all.
• For node D: max(-1, -∞) => max(-1, 4) = 4
• For node E: max(2, -∞) => max(2, 6) = 6
• For node F: max(-3, -∞) => max(-3, -5) = -3
• For node G: max(0, -∞) => max(0, 7) = 7


Step 3: In the next step, it's the minimizer's turn, so it compares all node values with +∞ and finds the third-layer node values.
• For node B = min(4, 6) = 4
• For node C = min(-3, 7) = -3

Step 4: Now it's the Maximizer's turn, and it again chooses the maximum of all node values to find the value of the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be more than 4 layers.
• For node A = max(4, -3) = 4

That was the complete workflow of the minimax two player game.
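The four steps above can be checked with a few lines of self-contained code. Here the game tree is encoded as nested lists whose leaves are the terminal values from the worked example (D = -1, 4; E = 2, 6; F = -3, -5; G = 0, 7); this encoding is our own illustrative choice.

```python
def minimax(node, maximizing):
    """Return the minimax value of `node`: leaves are plain numbers,
    internal nodes are lists of children; players alternate each level."""
    if not isinstance(node, list):            # terminal node: static evaluation
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Leaf values from the worked example above
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
print(minimax(tree, True))  # 4: D=4, E=6 -> B=4; F=-3, G=7 -> C=-3; A=max(4,-3)=4
```

The printed value 4 reproduces the root value obtained in Step 4.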
Properties of Mini-Max algorithm:
• Complete- The min-max algorithm is complete. It will definitely find a solution (if one exists) in a finite search tree.
• Optimal- The min-max algorithm is optimal if both opponents are playing optimally.
• Time complexity- As it performs DFS over the game tree, the time complexity of the min-max algorithm is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
• Space Complexity- The space complexity of the mini-max algorithm, like that of DFS, is O(bm).

Limitation of the minimax Algorithm:


The main drawback of the minimax algorithm is that it gets really slow for complex games such as chess, Go, etc. These games have a huge branching factor, and the player has many choices to consider. This limitation of the minimax algorithm can be mitigated by alpha-beta pruning, which we discuss in the next topic.

Link for Video Lecture


L64: Minimax Algorithm in Game Playing with examples | Artificial Intelligence Lectures in Hindi


5: Alpha-Beta Pruning
01 November 2023 12:21

• Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the minimax algorithm.
• As we have seen, the number of game states the minimax search algorithm has to examine is exponential in the depth of the tree. We cannot eliminate the exponent, but we can cut it in half. There is a technique by which we can compute the correct minimax decision without checking each node of the game tree, and this technique is called pruning. It involves two threshold parameters, alpha and beta, for future expansion, so it is called alpha-beta pruning. It is also called the alpha-beta algorithm.
• Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only the tree leaves but also entire subtrees.
• The two parameters can be defined as:
1. Alpha: The best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
2. Beta: The best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.
• Alpha-beta pruning applied to a standard minimax algorithm returns the same move as the standard algorithm does, but it removes all the nodes which do not really affect the final decision but make the algorithm slow. Hence, by pruning these nodes, it makes the algorithm fast.

Condition for Alpha-beta pruning:


The main condition required for alpha-beta pruning is:
1. α >= β
• The Max player will only update the value of alpha.
• The Min player will only update the value of beta.
• While backtracking the tree, the node values are passed to the upper nodes instead of the values of alpha and beta.
• We only pass the alpha and beta values down to the child nodes.

Working of Alpha-Beta Pruning:


Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.
Step 1: At the first step, the Max player will make the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.

Step 2: At node D, the value of α will be calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3, and max(2, 3) = 3 will be the value of α at node D; the node value will also be 3.
Step 3: Now the algorithm backtracks to node B, where the value of β will change, as this is Min's turn. Now β = +∞ is compared with the available subsequent node value, i.e. min(∞, 3) = 3; hence at node B now α = -∞ and β = 3.

In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are also passed down.
Step 4: At node E, Max will take its turn, and the value of alpha will change. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3, where α >= β, so the right successor of E will be pruned, and the algorithm will not traverse it. The value at node E will be 5.


Step 5: At the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha will be changed; the maximum available value is 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, again the value of α is compared with the left child, which is 0, giving max(3, 0) = 3, and then with the right child, which is 1, giving max(3, 1) = 3; α remains 3, but the node value of F becomes 1.

Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta will be changed: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, and again it satisfies the condition α >= β, so the next child of C, which is G, will be pruned, and the algorithm will not compute the entire subtree of G.

Step 8: C now returns the value 1 to A. Here the best value for A is max(3, 1) = 3. The final game tree shows the nodes which were computed and the nodes which were never computed. Hence the optimal value for the maximizer is 3 for this example.
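The eight steps above can be sketched as a self-contained routine over nested lists, mirroring the minimax structure but carrying α and β down the tree. The tree below encodes the worked example; the values 9, 7, and 5 stand in for the pruned leaves (their actual values are unknown from the text and cannot change the result).

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta cut-offs: leaves are numbers, internal nodes
    are lists of children. A branch is pruned as soon as alpha >= beta."""
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)          # only Max updates alpha
            if alpha >= beta:                  # remaining siblings cannot matter
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)            # only Min updates beta
            if alpha >= beta:
                break
        return value

# Tree from the steps above: A -> B, C; B -> D(2,3), E(5, pruned);
# C -> F(0,1), G(pruned). Placeholder values fill the pruned leaves.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(alphabeta(tree, -math.inf, math.inf, True))  # 3
```

Tracing the calls reproduces the steps: E's second child and the whole subtree of G are cut off, and the root still receives the correct minimax value 3.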

Move Ordering in Alpha-Beta pruning:


The effectiveness of alpha-beta pruning is highly dependent on the order in which each node is examined. Move order is an important aspect of alpha-beta pruning.
It can be of two types:
• Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of the leaves of the tree, and works exactly like the minimax algorithm. In this case, it also consumes more time because of the alpha-beta factors; such an ordering is called worst ordering. In this case, the best move occurs on the right side of the tree. The time complexity for such an order is O(b^m).


• Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree, and the best moves occur on the left side of the tree. We apply DFS, hence it searches the left of the tree first and goes twice as deep as the minimax algorithm in the same amount of time. The complexity in ideal ordering is O(b^(m/2)).
Rules to find good ordering:
Following are some rules to find good ordering in alpha-beta pruning:
• Try the best move from the shallowest node first.
• Order the nodes in the tree such that the best nodes are checked first.
• Use domain knowledge while finding the best move. E.g. for chess, try this order: captures first, then threats, then forward moves, then backward moves.
• We can bookkeep the states, as there is a possibility that states may repeat.

Link for video


Alpha Beta Pruning in Hindi with Example | Artificial Intelligence


6: Knowledge Representation: Propositional Logic


01 November 2023 12:22

Propositional logic (PL) is the simplest form of logic where all the statements are made by propositions. A proposition is a declarative statement which is either true or false. It is a technique of
knowledge representation in logical and mathematical form.

Example:
1. a) It is Sunday.
2. b) The Sun rises from West (False proposition)
3. c) 3+3= 7(False proposition)
4. d) 5 is a prime number.

Following are some basic facts about propositional logic:

• Propositional logic is also called Boolean logic, as it works on 0 and 1.
• In propositional logic, we use symbolic variables to represent the logic, and we can use any symbol to represent a proposition, such as A, B, C, P, Q, R, etc.
• Propositions can be either true or false, but not both.
• Propositions and connectives are the basic elements of propositional logic.
• A connective can be described as a logical operator which connects two sentences.
• A proposition formula which takes both true and false values (depending on the valuation) is called a contingency.
• Statements which are questions, commands, or opinions, such as "Where is Rohini", "How are you", and "What is your name", are not propositions.

Syntax of propositional logic:


The syntax of propositional logic defines the allowable sentences for knowledge representation. There are two types of propositions:
1. Atomic Propositions
2. Compound Propositions

• Atomic Proposition: Atomic propositions are simple propositions. Each consists of a single proposition symbol. These are the sentences which must be either true or false.

Example:
a) "2 + 2 is 4" is an atomic proposition, as it is a true fact.
b) "The Sun is cold" is also a proposition, as it is a false fact.

• Compound proposition: Compound propositions are constructed by combining simpler or atomic propositions, using parenthesis and logical connectives.

Example:
a) "It is raining today, and the street is wet."
b) "Ankit is a doctor, and his clinic is in Mumbai."

Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical
connectives. There are mainly five connectives, which are given as follows:
1. Negation: A sentence such as ¬ P is called negation of P. A literal can be either Positive literal or negative literal.
2. Conjunction: A sentence which has ∧ connective such as, P ∧ Q is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as,
P= Rohan is intelligent,
Q= Rohan is hardworking. → P∧ Q.
3. Disjunction: A sentence which has a ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are the propositions.
Example: "Ritika is a doctor or an engineer."
Here P= Ritika is a doctor, Q= Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q, is called an implication. Implications are also known as if-then rules. It can be represented as
If it is raining, then the street is wet.
Let P= It is raining, and Q= Street is wet, so it is represented as P → Q
5. Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: "I am breathing if and only if I am alive."
P= I am breathing, Q= I am alive; it can be represented as P ⇔ Q.
Following is the summarized table for the propositional logic connectives:

Connective      Symbol    Example    Read as
Negation        ¬         ¬P         not P
Conjunction     ∧         P ∧ Q      P and Q
Disjunction     ∨         P ∨ Q      P or Q
Implication     →         P → Q      if P then Q
Biconditional   ⇔         P ⇔ Q      P if and only if Q
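The behaviour of these five connectives can be checked mechanically. The following is a small illustrative sketch (Python; the helper names are my own) that prints the full truth table:

```python
# Print the truth table for the five propositional connectives.
from itertools import product

def implies(p, q):        # P → Q is false only when P is true and Q is false
    return (not p) or q

def biconditional(p, q):  # P ⇔ Q is true exactly when P and Q have the same value
    return p == q

header = ["P", "Q", "¬P", "P∧Q", "P∨Q", "P→Q", "P⇔Q"]
print("  ".join(f"{h:6}" for h in header))
for p, q in product([True, False], repeat=2):
    row = [p, q, not p, p and q, p or q, implies(p, q), biconditional(p, q)]
    print("  ".join(f"{str(v):6}" for v in row))
```

Running it makes the defining cases easy to see: the implication row is false only for P = true, Q = false, and the biconditional rows are true exactly where P and Q match.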

Link for videos Lecture


Propositional Logic in Artificial Intelligence in Hindi | Knowledge Representation | All Imp Points


7: First- Order Logic


01 November 2023 12:22

In the topic of propositional logic, we have seen how to represent statements using propositional logic. But unfortunately, in propositional logic we can only represent facts, which are
either true or false. PL is not sufficient to represent complex sentences or natural language statements; propositional logic has very limited expressive power. Consider the following
sentences, which we cannot represent using PL:

• "Some humans are intelligent", or


• "Sachin likes cricket."

To represent the above statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
• First-order logic is another way of knowledge representation in artificial intelligence. It is an extension to propositional logic.
• FOL is sufficiently expressive to represent natural language statements in a concise way.
• First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about the
objects in a more natural way and can also express the relationships between those objects.
• First-order logic (like natural language) does not only assume that the world contains facts like propositional logic but also assumes the following things in the
world:
○ Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
○ Relations: These can be unary relations such as red, round, is adjacent, or n-ary relations such as the sister of, brother of, has color, comes between.
○ Functions: father of, best friend, third inning of, end of, ......
• As a natural language, first-order logic also has two main parts:
1. Syntax
2. Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collection of symbols is a logical expression in first-order logic. The basic syntactic elements of first-order logic are symbols.
We write statements in short-hand notation in FOL.
Basic Elements of First-order logic:
Following are the basic elements of FOL syntax:
Constants: 1, 2, A, John, Mumbai, cat, ...
Variables: x, y, z, a, b, ...
Predicates: Brother, Father, >, ...
Functions: sqrt, LeftLegOf, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: ==
Quantifiers: ∀, ∃

Atomic sentences:
• Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a predicate symbol followed by a parenthesis with a
sequence of terms.
• We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).
Chinky is a cat: => cat (Chinky).

Complex Sentences:
• Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts:
• Subject: Subject is the main part of the statement.
• Predicate: A predicate can be defined as a relation, which binds two atoms together in a statement.
Consider the statement "x is an integer." It consists of two parts: the first part, x, is the subject of the statement, and the second part, "is an integer", is known as the
predicate.
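As an illustrative sketch (the tuple representation is an assumption of mine, not standard notation), atomic sentences such as Brothers(Ravi, Ajay) can be stored as predicate/term tuples in a small knowledge base and queried, with quantifiers evaluated over a finite set of objects:

```python
# Ground atomic FOL sentences stored as (predicate, term1, ..., termN) tuples.
kb = {
    ("Brothers", "Ravi", "Ajay"),
    ("cat", "Chinky"),
    ("integer", "x1"),
    ("integer", "x2"),
}
domain = {"x1", "x2"}   # a finite universe for evaluating quantifiers

def holds(predicate, *terms):
    """True if the atomic sentence Predicate(term1, ..., termN) is in the KB."""
    return (predicate, *terms) in kb

print(holds("cat", "Chinky"))                       # → True
print(all(holds("integer", x) for x in domain))     # → True  (∀x integer(x))
print(any(holds("cat", x) for x in domain))         # → False (∃x cat(x) over domain)
```

Over a finite domain, ∀ and ∃ reduce to `all` and `any`, which is why this toy evaluation works; real FOL inference must handle infinite domains symbolically.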

Link for videos Lecture


L56: First Order Logic (FOL) | Predicate Logic Introduction | Quantifiers in Predicate Logic | AI


8:Forward and Backward Chaining


01 November 2023 12:23

In artificial intelligence, forward and backward chaining are important topics, but before understanding forward and backward chaining, let's first understand
where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence which applies logical rules to the knowledge base to infer new information from
known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in one of two modes, which are:
1. Forward chaining
2. Backward chaining
Horn Clause and Definite clause:
Horn clauses and definite clauses are forms of sentences that enable the knowledge base to use a more restricted and efficient inference algorithm. Logical inference
algorithms use forward and backward chaining approaches, which require the KB in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as a definite clause or strict Horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known as a Horn clause. Hence all definite clauses are Horn clauses.
Example: (¬p ∨ ¬q ∨ k). It has only one positive literal, k.
It is equivalent to p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method when using an inference engine. Forward chaining is a form of reasoning which starts
with the atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats
until the problem is solved.
Properties of Forward-Chaining:
• It is a bottom-up approach, as it moves from bottom to top.
• It is a process of making a conclusion based on known facts or data, starting from the initial state and reaching the goal state.
• The forward-chaining approach is also called data-driven, as we reach the goal using available data.
• The forward-chaining approach is commonly used in expert systems, such as CLIPS, business rule systems, and production rule systems.
Example:
"It is a crime for an American to sell weapons to hostile nations. Country A, an enemy of America, has some missiles, and all of its missiles were sold to it by Robert, who is an American citizen."
Goal: prove that "Robert is a criminal."
These facts can be converted into the following first-order definite clauses:
1. American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p)
2. Owns(A, T1)
3. Missile(T1)
4. Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A)
5. Missile(p) → Weapon(p)
6. Enemy(p, America) → Hostile(p)
7. American(Robert)
8. Enemy(A, America)
Forward chaining proof:
Step-1:
In the first step we will start with the known facts and will choose the sentences which do not have implications, such as: American(Robert), Enemy(A, America),
Owns(A, T1), and Missile(T1). All these facts will be represented as below.

Step-2:
At the second step, we will see which new facts can be inferred from the available facts whose premises are satisfied.
Rule-(1) does not yet have its premises satisfied, so it will not be added in the first iteration.
Rule-(2) and Rule-(3) are already added.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, which is inferred from the conjunction of Rule-(2) and Rule-(3).
Rule-(5) is satisfied with the substitution {p/T1}, so Weapon(T1) is added.
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which is inferred from Rule-(8), Enemy(A, America).

Step-3:
At step-3, we can check that Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add Criminal(Robert), which is inferred from all the available facts. Hence
we have reached our goal statement.

Hence it is proved that Robert is a criminal using the forward chaining approach.
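The repeat-until-nothing-changes loop described above can be sketched in a few lines. This is a propositional simplification (each ground atom is just a string and unification is omitted), so it illustrates the idea rather than a full first-order forward chainer:

```python
# Minimal forward-chaining sketch over ground Horn rules.
# Rules are (premises, conclusion) pairs; facts is a set of known atoms.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                      # keep firing rules until a fixed point
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)   # rule fires: add its conclusion
                changed = True
    return facts

facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}
rules = [
    ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Enemy(A,America)"}, "Hostile(A)"),
    ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
     "Criminal(Robert)"),
]
print("Criminal(Robert)" in forward_chain(facts, rules))  # → True
```

Each pass over the rules mirrors one "step" of the proof above: Sells, Weapon, and Hostile are derived first, after which the criminal rule's premises are all satisfied.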


B. Backward Chaining:
Backward chaining is also known as backward deduction or the backward reasoning method when using an inference engine. A backward chaining algorithm is a form of
reasoning which starts with the goal and works backward, chaining through rules to find known facts that support the goal.
Properties of backward chaining:
• It is known as a top-down approach.
• Backward chaining is based on the Modus Ponens inference rule.
• In backward chaining, the goal is broken into sub-goals to prove the facts true.
• It is called a goal-driven approach, as a list of goals decides which rules are selected and used.
• The backward-chaining algorithm is used in game theory, automated theorem-proving tools, inference engines, proof assistants, and various AI applications.
• The backward-chaining method mostly uses a depth-first search strategy for proofs.
Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which is Criminal(Robert), and then infer further rules.
Step-1:
At the first step, we will take the goal fact. From the goal fact, we will infer other facts, and at last, we will prove those facts true. Our goal fact is "Robert is a criminal,"
and following is its predicate form.


Step-2:
At the second step, we will infer other facts from the goal fact which satisfy the rules. As we can see in Rule-(1), the goal predicate Criminal(Robert) is present with the
substitution {p/Robert}. So we will add all the conjunctive facts below the first level and will replace p with Robert.
Here we can see that American(Robert) is a fact, so it is proved here.

Step-3:
At step-3, we will extract a further fact, Missile(q), which is inferred from Weapon(q), as it satisfies Rule-(5). Weapon(q) is also true with the substitution of the constant T1 at
q.

Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4) with the substitution of A in place of r. So these two
statements are proved here.

Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6). And hence all the statements are proved true using backward chaining.
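The goal-directed recursion can be sketched in the same simplified (propositional, unification-free) style as before; a goal is proved if it is a known fact, or if some rule concludes it and all of that rule's premises can be proved as sub-goals:

```python
# Minimal backward-chaining sketch over ground Horn rules.
def backward_chain(goal, facts, rules):
    if goal in facts:                       # base case: goal is a known fact
        return True
    for premises, conclusion in rules:
        # recursive case: a rule concludes the goal and every premise
        # (sub-goal) can itself be proved
        if conclusion == goal and all(
                backward_chain(p, facts, rules) for p in premises):
            return True
    return False

facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}
rules = [
    ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
     "Criminal(Robert)"),
    ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Enemy(A,America)"}, "Hostile(A)"),
]
print(backward_chain("Criminal(Robert)", facts, rules))  # → True
```

Note the recursion is a depth-first search over sub-goals, matching the "depth-first search strategy" property listed above; a production implementation would also need loop detection for rule sets with cycles.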

Link for videos Lecture


Inference in artificial intelligence | forward chaining & backward chaining artificial intelligence


DIFFERENCE BETWEEN FORWARD AND BACKWARD CHAINING


L60: Forward chaining, Backward chaining | Example | Comparison | Artificial Intelligence Lectures


9:Introduction to Probabilistic Reasoning


01 November 2023 12:23

Uncertainty:
Till now, we have learned knowledge representation using first-order logic and propositional logic with certainty, which means we were sure about the predicates. With this
knowledge representation we might write A → B, which means if A is true then B is true. But consider a situation where we are not sure whether A is true or not; then we cannot
express this statement. This situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty to occur in the real world.
1. Information occurred from unreliable sources.
2. Experimental Errors
3. Equipment fault
4. Temperature variation
5. Climate change.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability to indicate the uncertainty in knowledge. In probabilistic reasoning,
we combine probability theory with logic to handle the uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that is the result of someone's laziness and ignorance.
In the real world, there are lots of scenarios, where the certainty of something is not confirmed, such as "It will rain today," "behavior of someone for some situations," "A
match between two teams or two players." These are probable sentences for which we can assume that it will happen but not sure about it, so here we use probabilistic
reasoning.
Need of probabilistic reasoning in AI:
• When there are unpredictable outcomes.
• When specifications or possibilities of predicates become too large to handle.
• When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
• Bayes' rule
• Bayesian statistics

As probabilistic reasoning uses probability and related terms, so before understanding probabilistic reasoning, let's understand some common terms:
Probability: Probability can be defined as the chance that an uncertain event will occur. It is the numerical measure of the likelihood that an event will occur. The value of a
probability always remains between 0 and 1, representing ideal uncertainties.
• 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
• P(A) = 0 indicates total uncertainty in an event A.
• P(A) = 1 indicates total certainty in an event A.
We can find the probability of an uncertain event by using the below formula:
P(A) = (number of outcomes favourable to A) / (total number of outcomes)
• P(¬A) = probability of event A not happening.
• P(¬A) + P(A) = 1.
Event: Each possible outcome of a variable is called an event.
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and objects in the real world.
Prior probability: The prior probability of an event is probability computed before observing new information.
Posterior Probability: The probability that is calculated after all evidence or information has been taken into account. It is a combination of prior probability and new information.

Conditional probability:
Conditional probability is the probability of an event occurring when another event has already happened.
Let's suppose we want to calculate the probability of event A when event B has already occurred, "the probability of A under the condition of B". It can be written as:

P(A|B) = P(A ⋀ B) / P(B)

Where P(A ⋀ B) = joint probability of A and B, and
P(B) = marginal probability of B.
If the probability of A is given and we need to find the probability of B, then it will be given as:

P(B|A) = P(A ⋀ B) / P(A)
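Numerically, the definition P(A|B) = P(A ⋀ B) / P(B) can be checked on a small, fully enumerable sample space. The two-dice example below is my own illustration, not from the text:

```python
# Conditional probability on the sample space of two fair dice:
# P(sum == 8 | first die == 3) = P(sum == 8 AND first == 3) / P(first == 3).
from fractions import Fraction

space = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # 36 outcomes

def prob(event):
    """Probability of an event, as the fraction of outcomes satisfying it."""
    hits = sum(1 for outcome in space if event(outcome))
    return Fraction(hits, len(space))

p_joint = prob(lambda o: o[0] == 3 and o[0] + o[1] == 8)   # P(A ⋀ B) = 1/36
p_b = prob(lambda o: o[0] == 3)                            # P(B) = 1/6
p_a_given_b = p_joint / p_b
print(p_a_given_b)  # → 1/6
```

Given that the first die shows 3, only a second-die 5 gives a sum of 8, so the conditional probability is 1/6, which the division reproduces exactly thanks to `Fraction` arithmetic.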

Link for videos Lecture


Probabilistic Reasoning in artificial intelligence | Uncertainty | Lec-26


10:Bayes Theorem
01 November 2023 12:24

Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which determines the probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an application of Bayes' theorem, which is fundamental to
Bayesian statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Bayes' theorem allows updating the probability prediction of an event by observing new information of the real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine the probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using product rule and conditional probability of event A with known event B:
As from the product rule we can write:

P(A ⋀ B) = P(A|B) P(B)

Similarly, the probability of event B with known event A:

P(A ⋀ B) = P(B|A) P(A)

Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B)    ...(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate. It is read as the probability of hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, we calculate the probability of the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the pure probability of the evidence.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule can be written as:

P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)

where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in cases where we have good estimates of these three
terms and want to determine the fourth one. Suppose we want to perceive the effect of some unknown cause and want to compute that cause; then Bayes' rule
becomes:

P(cause|effect) = P(effect|cause) P(cause) / P(effect)

Example-1:
Question: What is the probability that a patient has the disease meningitis with a stiff neck?
Given data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the time. He is also aware of some more facts, which are given as
follows:
• The known probability that a patient has meningitis disease is 1/30,000.
• The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
Applying Bayes' rule:

P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.0013

Hence, we can assume that 1 patient out of 750 patients has meningitis disease with a stiff neck.
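The meningitis calculation can be reproduced directly from the rule. This is a small sketch; the helper name `bayes_posterior` is my own, and exact `Fraction` arithmetic is used so the 1/750 result comes out exactly:

```python
# Bayes' rule: P(b|a) = P(a|b) * P(b) / P(a), with exact rational arithmetic.
from fractions import Fraction

def bayes_posterior(p_a_given_b, p_b, p_a):
    """Posterior P(b|a) computed from likelihood, prior, and evidence."""
    return p_a_given_b * p_b / p_a

# Meningitis example from the text: P(a|b) = 0.8, P(b) = 1/30000, P(a) = 0.02
p = bayes_posterior(Fraction(4, 5), Fraction(1, 30000), Fraction(1, 50))
print(p)  # → 1/750
```

The same three-line function also solves Example-2 below: `bayes_posterior(Fraction(1), Fraction(1, 13), Fraction(3, 13))` gives 1/3.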
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the card is a king is 4/52. Calculate the posterior probability
P(King|Face), i.e. the probability that a drawn face card is a king.
Solution:

P(King|Face) = P(Face|King) P(King) / P(Face)    ...(i)

P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that a card is a face card = 12/52 = 3/13
P(Face|King): probability of a face card given that it is a king = 1
Putting all values in equation (i), we get:

P(King|Face) = (1 × 1/13) / (3/13) = 1/3

Application of Bayes' theorem in Artificial intelligence:


Following are some applications of Bayes' theorem:
• It is used to calculate the next step of the robot when the already executed step is given.
• Bayes' theorem is helpful in weather forecasting.
• It can solve the Monty Hall problem.

Link for videos Lecture


L73: Bayes Theorem in Artificial Intelligence with Applications & Example | AI Lectures in Hindi


1:Advanced Knowledge Representation and Reasoning: Knowledge


Representation Issues
01 November 2023 12:24

Humans are best at understanding, reasoning, and interpreting knowledge. Humans know things, which is knowledge, and as per their knowledge they perform various
actions in the real world. But how machines do all these things comes under knowledge representation and reasoning. Hence we can describe knowledge
representation as follows:
• Knowledge representation and reasoning (KR, KRR) is the part of artificial intelligence which is concerned with AI agents' thinking and how thinking contributes to
the intelligent behavior of agents.
• It is responsible for representing information about the real world so that a computer can understand it and can utilize this knowledge to solve complex real-world
problems, such as diagnosing a medical condition or communicating with humans in natural language.
• It is also a way which describes how we can represent knowledge in artificial intelligence. Knowledge representation is not just storing data in some database;
it also enables an intelligent machine to learn from that knowledge and experience so that it can behave intelligently like a human.
What to Represent:
Following are the kind of knowledge which needs to be represented in AI systems:
• Object: All the facts about objects in our world domain. E.g., guitars contain strings, trumpets are brass instruments.
• Events: Events are the actions which occur in our world.
• Performance: It describes behavior which involves knowledge about how to do things.
• Meta-knowledge: It is knowledge about what we know.
• Facts: Facts are the truths about the real world and what we represent.
• Knowledge-base: The central component of knowledge-based agents is the knowledge base, represented as KB. The knowledge base is a group of
sentences (here, "sentence" is used as a technical term and is not identical to a sentence in the English language).
Knowledge: Knowledge is awareness or familiarity gained through experience of facts, data, and situations. Following are the types of knowledge in artificial intelligence:

Issues in Knowledge Representation


• Important attributes: Are there any attributes of objects so basic that they occur in almost every problem domain?
• Relationships among attributes: Are there any important relationships that exist among object attributes?
• Choosing granularity: At what level of detail should the knowledge be represented?
• Set of objects: How should sets of objects be represented?
• Finding the right structure: Given a large amount of knowledge stored, how can the relevant parts be accessed when they are needed?

Link for video lecture


Issues in Knowledge Representation in Artificial Intelligence


2: Nonmonotonic Reasoning
01 November 2023 12:25

Reasoning is the mental process of deriving logical conclusions and making predictions from available knowledge, facts, and beliefs. Or we can say, "Reasoning is a way
to infer facts from existing data." It is a general process of thinking rationally to find valid conclusions.
In artificial intelligence, reasoning is essential so that the machine can also think rationally like a human brain and can perform like a human.
Types of Reasoning
In artificial intelligence, reasoning can be divided into the following categories:
• Deductive reasoning
• Inductive reasoning
• Abductive reasoning
• Common Sense Reasoning
• Monotonic Reasoning
• Non-monotonic Reasoning

Monotonic Reasoning:
In monotonic reasoning, once a conclusion is drawn, it will remain the same even if we add some other information to the existing information in our knowledge base. In
monotonic reasoning, adding knowledge does not decrease the set of propositions that can be derived.
To solve monotonic problems, we can derive a valid conclusion from the available facts only, and it will not be affected by new facts.
Monotonic reasoning is not useful for real-time systems, as in real time facts get changed, so we cannot use monotonic reasoning.
Monotonic reasoning is used in conventional reasoning systems, and a logic-based system is monotonic.
Any theorem proving is an example of monotonic reasoning.
Example:
• Earth revolves around the Sun.
It is a true fact, and it cannot be changed even if we add another sentence in knowledge base like, "The moon revolves around the earth" Or "Earth is not round," etc.
Advantages of Monotonic Reasoning:
• In monotonic reasoning, each old proof will always remain valid.
• If we deduce some facts from available facts, then it will remain valid for always.
Disadvantages of Monotonic Reasoning:
• We cannot represent real-world scenarios using monotonic reasoning.
• Hypothetical knowledge cannot be expressed with monotonic reasoning, which means facts should be true.
• Since we can only derive conclusions from the old proofs, new knowledge from the real world cannot be added.

Non-monotonic Reasoning
In Non-monotonic reasoning, some conclusions may be invalidated if we add some more information to our knowledge base.
Logic will be said as non-monotonic if some conclusions can be invalidated by adding more knowledge into our knowledge base.
Non-monotonic reasoning deals with incomplete and uncertain models.
"Human perceptions for various things in daily life, "is a general example of non-monotonic reasoning.
Example: Let us suppose the knowledge base contains the following knowledge:
• Birds can fly
• Penguins cannot fly
• Pitty is a bird
From the above sentences, we can conclude that Pitty can fly.
However, if we add another sentence to the knowledge base, "Pitty is a penguin", this concludes "Pitty cannot fly", so it invalidates the above conclusion.
Advantages of Non-monotonic reasoning:
• For real-world systems such as Robot navigation, we can use non-monotonic reasoning.
• In Non-monotonic reasoning, we can choose probabilistic facts or can make assumptions.
Disadvantages of Non-monotonic Reasoning:
• In non-monotonic reasoning, the old facts may be invalidated by adding new sentences.
• It cannot be used for theorem proving.
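The Pitty example can be sketched in a few lines of code; the exception-list encoding of the default rule "birds fly unless known otherwise" is an assumption of mine for illustration:

```python
# Default rule: birds fly, unless the KB also says the bird is a penguin.
# Adding a new fact can retract an earlier conclusion (non-monotonicity).
def can_fly(animal, kb):
    if ("penguin", animal) in kb:
        return False                  # the exception overrides the default
    return ("bird", animal) in kb

kb = {("bird", "Pitty")}
print(can_fly("Pitty", kb))   # → True:  default conclusion "Pitty can fly"
kb.add(("penguin", "Pitty"))
print(can_fly("Pitty", kb))   # → False: new knowledge invalidates it
```

Contrast this with a monotonic system, where once `can_fly("Pitty")` had been derived, no later addition to the knowledge base could retract it.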

Link for video lecture


Lec-20_Non-monotonic reasoning_Part-1 | Artificial Intelligence | Computer Engineering

Lec-21_Non-monotonic reasoning_Part-2 | Artificial Intelligence | Computer Engineering


3: Other Knowledge Representation Scheme Reasoning under


Uncertainty
01 November 2023 12:25

In artificial intelligence, reasoning under uncertainty is a critical aspect of knowledge representation and decision-making. One common knowledge representation scheme used for this
purpose is Probabilistic Graphical Models (PGMs), which include Bayesian Networks and Markov Networks. Here's a brief overview of reasoning under uncertainty using PGMs:

1. Probabilistic Graphical Models (PGMs): These models are designed to represent and manipulate uncertainty in a probabilistic manner. They provide a structured way to model complex,
uncertain relationships among variables.
2. Bayesian Networks (BNs): Bayesian Networks are graphical models that use directed acyclic graphs to represent conditional dependencies between variables. They are particularly
useful for capturing cause-and-effect relationships and can be used for both modeling and inference.
3. Markov Networks: Unlike Bayesian Networks, Markov Networks use an undirected graph to represent associations between variables. They are more suitable for modeling situations
where the causal relationships are not well defined, and dependencies are more about statistical associations.
4. Inference: Reasoning under uncertainty in PGMs involves performing inference to make decisions or predictions. Inference can be either exact or approximate, depending on the
complexity of the model. Common inference algorithms include variable elimination, belief propagation, and sampling methods like Markov Chain Monte Carlo (MCMC).
5. Uncertain Evidence: PGMs can handle uncertain evidence, which is valuable in real-world scenarios where data may be incomplete or noisy. Bayesian Networks, for example, can
update beliefs about variables given new evidence using Bayes' theorem.
6. Decision Making: PGMs can be used for decision-making under uncertainty through decision networks, a specialized form of Bayesian Networks. These models incorporate utility
functions to make optimal decisions in the presence of uncertainty.
7. Applications: PGMs are widely used in various AI applications, including medical diagnosis, natural language processing, recommendation systems, robotics, and more. They provide a
principled way to deal with uncertainty, which is inherent in many real-world problems.
8. Challenges: While PGMs are powerful tools for reasoning under uncertainty, they can face challenges in terms of scalability and the complexity of modeling real-world systems with many
variables. Researchers continue to develop more efficient and expressive models to address these challenges.

In summary, reasoning under uncertainty in artificial intelligence involves the use of Probabilistic Graphical Models, such as Bayesian Networks and Markov Networks, to represent,
manipulate, and make decisions in situations where information is uncertain or incomplete. These models have widespread applications and are a fundamental component of AI systems
dealing with real-world data and uncertainty.
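As a concrete sketch, a two-variable Bayesian Network (Rain → WetGrass) can be queried by exact enumeration. The conditional probability table numbers below are invented for illustration:

```python
# Tiny Bayesian network: Rain -> WetGrass, queried by exact enumeration.
P_rain = {True: 0.2, False: 0.8}                       # P(Rain)
P_wet_given_rain = {True:  {True: 0.9, False: 0.1},    # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    """Chain rule for this network: P(Rain, WetGrass) = P(Rain) P(Wet|Rain)."""
    return P_rain[rain] * P_wet_given_rain[rain][wet]

# Posterior P(Rain = true | WetGrass = true): enumerate and normalise.
num = joint(True, True)
den = joint(True, True) + joint(False, True)
print(round(num / den, 3))  # → 0.529
```

This is exactly the "update beliefs given new evidence" behaviour described in point 5: observing wet grass raises the belief in rain from the 0.2 prior to roughly 0.53.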

Link For Video Lecture


Reasoning under Uncertainty in Artificial Intelligence


4: Acting under Uncertainity


01 November 2023 12:26

Acting under uncertainty, in the context of artificial intelligence and decision-making, refers to making choices or taking actions in situations where there is incomplete or uncertain information about
the environment or the outcomes of those actions. It is a fundamental aspect of AI systems designed to interact with the real world. Here's an overview of how AI systems can act under uncertainty:

1. Decision Theory: Decision theory is a framework used to make rational decisions under uncertainty. It involves defining objectives, assessing probabilities of different outcomes, and assigning
utilities (values) to these outcomes. By calculating expected utilities, decision-makers can choose actions that maximize their expected utility.

2. Uncertain Environments: AI systems often operate in environments where outcomes are not completely predictable. For example, a self-driving car must make decisions based on uncertain
sensor data and the behavior of other road users.

3. Uncertain Information: In many cases, AI systems have to deal with incomplete or noisy information. This uncertainty can arise from sensor limitations, data imperfections, or ambiguities in
the environment.

4. Modeling Uncertainty: To act under uncertainty, AI systems use models that capture uncertain information. These models can include probabilistic models, Bayesian networks, Markov
decision processes (MDPs), and more. These models help the AI system reason about the likelihood of different outcomes and plan accordingly.

5. Exploration vs. Exploitation: In reinforcement learning, a common approach to acting under uncertainty is the exploration-exploitation trade-off. The AI agent needs to decide whether to
explore new actions (potentially yielding valuable information) or exploit known actions to maximize immediate rewards.

6. Risk Management: Decision-makers can also take into account their risk tolerance when acting under uncertainty. Some decisions may involve higher risk, while others prioritize safety or
conservative strategies.

7. Adaptive Strategies: AI systems can adapt their strategies based on the evolving uncertainty in the environment. They might use techniques such as online learning or adaptive control to
adjust their behavior in real time.

8. Sensor Fusion: In cases where the AI system relies on multiple sensors or information sources, sensor fusion techniques are used to combine and reconcile data from different sources while
accounting for uncertainty in each source.

9. Monte Carlo Methods: Monte Carlo methods, like Monte Carlo Tree Search (MCTS), are often used to estimate the value of different actions when the exact outcomes are uncertain. These
methods involve sampling possible scenarios and averaging the results.

10. Feedback Loops: Continuous feedback and learning from the outcomes of actions can help AI systems improve their decision-making over time, adapting to the changing level of uncertainty.

Acting under uncertainty is a critical aspect of AI systems in various domains, including autonomous robotics, financial trading, healthcare, and more. These systems are designed to make rational
decisions in complex, dynamic, and uncertain environments, taking into account the available information and the inherent uncertainties in the world.
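The Monte Carlo idea from point 9, sampling possible outcomes and averaging the results, can be shown in miniature. The simulated action and its reward distribution below are invented for the illustration; full MCTS adds tree search on top of this basic sampling step:

```python
import random

def estimate_action_value(simulate, n_samples=10000, seed=42):
    """Monte Carlo estimate: average the reward over many sampled rollouts."""
    rng = random.Random(seed)
    return sum(simulate(rng) for _ in range(n_samples)) / n_samples

# Hypothetical action: succeeds with probability 0.7 for reward 10, else 0.
def risky_action(rng):
    return 10.0 if rng.random() < 0.7 else 0.0

value = estimate_action_value(risky_action)
# value is close to the true expectation 0.7 * 10 = 7.0
```

The estimate converges to the true expected reward as the number of samples grows, which is exactly why sampling works when exact outcome probabilities are unknown or intractable.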

Link for video lecture


Acting under uncertainty

Unit 3, Page 26. Downloaded by Milandeep Kour Bali ([email protected])

5: Bayes Rule
01 November 2023 12:27

Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning; it determines the probability of an event from uncertain knowledge. In probability theory, it relates the conditional probabilities and marginal probabilities of two random events. Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is fundamental to Bayesian statistics. It is a way to calculate the value of P(B|A) with the knowledge of P(A|B), and it allows the probability estimate of an event to be updated as new real-world evidence is observed.

Example: If cancer is correlated with a person's age, then by using Bayes' theorem we can determine the probability of cancer more accurately with the help of the age. Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B:
From the product rule we can write:
1. P(A ⋀ B) = P(A|B) P(B)
Similarly, the probability of event B with known event A:
2. P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B)    ...(a)

The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference. It shows the simple relationship between joint and conditional probabilities. Here:
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of hypothesis A given that evidence B has been observed.
P(B|A) is called the likelihood: assuming the hypothesis is true, it is the probability of the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence on its own.
In equation (a), P(B) can in general be expanded as P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule can be written as:

P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)

where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.


Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in cases where we have good estimates of these three terms and want to determine the fourth one. Suppose we want to perceive the effect of some unknown cause and compute that cause; then Bayes' rule becomes:

P(cause|effect) = P(effect|cause) P(cause) / P(effect)

Example 1:
Question: What is the probability that a patient has the disease meningitis, given a stiff neck?
Given data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the time. He is also aware of some more facts, which are given as follows:
The known probability that a patient has meningitis is 1/30,000.
The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
Applying Bayes' rule:
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.00133

Hence, we can conclude that about 1 patient out of 750 patients with a stiff neck has meningitis.
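The calculation above can be verified with a few lines of Python, a direct transcription of Bayes' rule using only the numbers from the example:

```python
def bayes(p_evidence_given_h, p_h, p_evidence):
    """Posterior P(H|E) = P(E|H) * P(H) / P(E)."""
    return p_evidence_given_h * p_h / p_evidence

# Meningitis example: P(stiff neck | meningitis) = 0.8,
# P(meningitis) = 1/30000, P(stiff neck) = 0.02.
posterior = bayes(0.8, 1 / 30000, 0.02)
print(posterior)  # ≈ 0.001333, i.e. about 1 in 750
```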


6:Representing Knowledge in a Uncertain Domain- Bayesian Network


01 November 2023 12:28

Bayesian belief networks are a key computer technology for dealing with probabilistic events and for solving problems that involve uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic because they are built from a probability distribution and also use probability theory for prediction and anomaly detection.

Real-world applications are probabilistic in nature, and to represent the relationships between multiple events we need a Bayesian network. It can also be used in various tasks including prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision making under uncertainty.
A Bayesian network can be used for building models from data and experts' opinions, and it consists of two parts:
• Directed acyclic graph
• Table of conditional probabilities
The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:

• Each node corresponds to a random variable, which can be continuous or discrete.
• Arcs (directed arrows) represent the causal relationships or conditional dependencies between random variables. These directed links connect pairs of nodes in the graph.
A link represents that one node directly influences the other; if there is no directed link between two nodes, they are independent of each other.
○ In the above diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
○ If we consider node B, which is connected with node A by a directed arrow, then node A is called the parent of node B.
○ Node C is independent of node A.

The Bayesian network has mainly two components:


• Causal Component
• Actual numbers
Explanation of Bayesian network:
Let's understand the Bayesian network through an example by creating a directed acyclic graph:
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds to a burglary but also responds to minor earthquakes. Harry has two neighbors, David and Sophia, who have taken the responsibility to inform Harry at work when they hear the alarm. David always calls Harry when he hears the alarm, but sometimes he gets confused with the phone ringing and calls then too. On the other hand, Sophia likes to listen to loud music, so sometimes she misses hearing the alarm. Here we would like to compute the probability of the burglar alarm.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both David and Sophia called Harry.
Solution:
• The Bayesian network for the above problem is given below. The network structure shows that burglary and earthquake are the parent nodes of the alarm and directly affect the probability of the alarm going off, whereas David's and Sophia's calls depend on the alarm probability.
• The network represents that our assumptions do not directly perceive the burglary, do not notice the minor earthquake, and that David and Sophia do not confer with each other before calling.
• The conditional distribution for each node is given as a conditional probability table, or CPT.
• Each row in a CPT must sum to 1, because the entries in a row represent an exhaustive set of cases for the variable.
• In a CPT, a boolean variable with k boolean parents requires 2^k rows, one for each combination of parent values. Hence, if there are two parents, the CPT will contain 4 rows of probability values.
List of all events occurring in this network:
• Burglary (B)
• Earthquake(E)
• Alarm(A)
• David Calls(D)
• Sophia calls(S)
We can write the events of the problem statement as the joint probability P[D, S, A, B, E], which can be factored step by step using the chain rule and the conditional independences encoded in the network:
P[D, S, A, B, E] = P[D | S, A, B, E] · P[S, A, B, E]
= P[D | S, A, B, E] · P[S | A, B, E] · P[A, B, E]
= P[D | A] · P[S | A, B, E] · P[A, B, E]
= P[D | A] · P[S | A] · P[A | B, E] · P[B, E]
= P[D | A] · P[S | A] · P[A | B, E] · P[B | E] · P[E]


Let's take the observed probabilities for the burglary and earthquake components:
P(B = True) = 0.002, the probability of a burglary.
P(B = False) = 0.998, the probability of no burglary.
P(E = True) = 0.001, the probability of a minor earthquake.
P(E = False) = 0.999, the probability that no earthquake occurred.
We can provide the conditional probabilities as per the below tables:
Conditional probability table for Alarm A:
The Conditional probability of Alarm A depends on Burglar and earthquake:
B       E       P(A= True)   P(A= False)
True    True    0.94         0.06
True    False   0.95         0.05
False   True    0.31         0.69
False   False   0.001        0.999
Conditional probability table for David Calls:
The conditional probability that David calls depends on the state of the alarm.
A P(D= True) P(D= False)
True 0.91 0.09
False 0.05 0.95
Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on its parent node, "Alarm."
A P(S= True) P(S= False)
True 0.75 0.25
False 0.02 0.98
From the formula of the joint distribution, we can write the problem statement as:
P(S, D, A, ¬B, ¬E) = P(S|A) · P(D|A) · P(A|¬B ∧ ¬E) · P(¬B) · P(¬E)
= 0.75 × 0.91 × 0.001 × 0.998 × 0.999
= 0.00068045
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
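The alarm query above can be reproduced in a few lines of Python. This is a direct encoding of the network's priors and CPTs rather than a general inference engine; note it uses 0.05 for P(A=False | B=True, E=False) so that the row sums to 1, as the CPT rule requires:

```python
# Priors
P_B = {True: 0.002, False: 0.998}            # Burglary
P_E = {True: 0.001, False: 0.999}            # Earthquake
# CPTs: P(A=True | B, E), P(D=True | A), P(S=True | A)
P_A = {(True, True): 0.94, (True, False): 0.95,
       (False, True): 0.31, (False, False): 0.001}
P_D = {True: 0.91, False: 0.05}
P_S = {True: 0.75, False: 0.02}

def joint(d, s, a, b, e):
    """P(D=d, S=s, A=a, B=b, E=e) via the network factorization."""
    p_a = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p_d = P_D[a] if d else 1 - P_D[a]
    p_s = P_S[a] if s else 1 - P_S[a]
    return p_d * p_s * p_a * P_B[b] * P_E[e]

p = joint(d=True, s=True, a=True, b=False, e=False)
print(round(p, 8))  # 0.00068045
```

Summing `joint` over all 32 assignments of the five boolean variables would give 1, which is a quick sanity check that the factorization is a valid joint distribution.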
The semantics of Bayesian networks:
There are two ways to understand the semantics of a Bayesian network:
1. To understand the network as a representation of the joint probability distribution. This view is helpful for understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional independence statements.

Link for video lecture

Bayesian Belief Network ll Directed Acyclic Graph and Conditional Probability Table Explained Hindi

Bayesian Belief Network Explained with Solved Example in Hindi


Unit 4: Learning
14 November 2023 11:47

Learning in artificial intelligence (AI) refers to the process by which AI systems acquire knowledge and improve their performance over time. Machine learning is automated learning with little or no human intervention; it involves programming computers so that they learn from the available inputs. The main purpose of machine learning is to explore and construct algorithms that can learn from previous data and make predictions on new input data.
The input to a learning algorithm is training data, representing experience, and the output is some expertise, which usually takes the form of another algorithm that can perform a task. The input data to a machine learning system can be numerical, textual, audio, visual, or multimedia. The corresponding output of the system can be a floating-point number (for instance, the velocity of a rocket) or an integer representing a category or a class (for example, a pigeon or a sunflower in image recognition).
AI learning can be broadly categorized into three main types: supervised learning, unsupervised learning, and reinforcement learning.
1. Supervised Learning:
• In supervised learning, the AI model is trained on a labeled dataset, where the input data is paired with corresponding output labels.
• The algorithm learns to map the input data to the correct output by generalizing from the labeled examples provided during training.
• Common applications include image recognition, speech recognition, and classification problems.
2. Unsupervised Learning:
• Unsupervised learning involves training an AI model on unlabeled data, and the system must find patterns or relationships within the data without explicit guidance.
• Clustering and dimensionality reduction are common tasks in unsupervised learning.
• Applications include clustering similar documents, anomaly detection, and generating representative samples.
3. Reinforcement Learning:
• Reinforcement learning is a type of learning where an agent interacts with an environment and learns to make decisions by receiving feedback in the form of rewards or penalties.
• The agent explores different actions and learns to maximize cumulative rewards over time.
• Reinforcement learning is used in applications like game playing, robotics, and autonomous systems.
4. Semi-Supervised Learning:
• Semi-supervised learning combines elements of both supervised and unsupervised learning.
• It involves training a model on a dataset that contains both labeled and unlabeled examples.
• This approach is useful when obtaining a fully labeled dataset is expensive or time-consuming.
5. Self-Supervised Learning:
• Self-supervised learning is a subset of unsupervised learning where the model generates its own labels from the input data.
• It often involves creating surrogate tasks, such as predicting parts of the input from other parts, to enable learning without explicit labels.
6. Transfer Learning:
• Transfer learning involves training a model on one task and then transferring the knowledge gained to a different but related task.
• This can save computational resources and time, especially when labeled data for the target task is limited.
7. Neural Networks and Deep Learning:
• Deep learning, a subset of machine learning, focuses on neural networks with multiple layers (deep neural networks).
• Techniques like convolutional neural networks (CNNs) for image processing and recurrent neural networks (RNNs) for sequence data have been particularly successful.
Continuous learning and adaptation are crucial in AI, as models need to stay relevant in dynamic environments. Researchers and practitioners often fine-tune existing models, leverage
transfer learning, and explore novel algorithms to enhance AI systems' capabilities. Ongoing research and advancements in AI contribute to the evolution of learning techniques and
the development of more sophisticated models.
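To make the supervised-learning category concrete, here is a minimal 1-nearest-neighbour classifier written from scratch; the toy dataset and its labels are invented for the illustration:

```python
import math

def nearest_neighbor(train, query):
    """1-NN: predict the label of the training point closest to the query.

    train: list of (features, label) pairs; query: feature tuple.
    """
    def dist(a, b):
        return math.dist(a, b)  # Euclidean distance (Python 3.8+)
    features, label = min(train, key=lambda pair: dist(pair[0], query))
    return label

# Toy labeled dataset: points near (0, 0) are "A", points near (5, 5) are "B".
train = [((0, 0), "A"), ((1, 0), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(nearest_neighbor(train, (0.5, 0.2)))  # A
print(nearest_neighbor(train, (5.5, 4.8)))  # B
```

The classifier generalizes from the labeled examples by distance rather than memorizing exact inputs, which is the essential difference from the rote learning discussed in the next section.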

Video Link
Supervised, Unsupervised and Reinforcement Learning in Artificial Intelligence in Hindi


2: Rote Learning in AI
14 November 2023 12:00

"Rote learning" in the context of artificial intelligence refers to a type of learning where a model memorizes information without understanding the underlying concepts. In rote learning, the emphasis is on memorization through repetition rather than comprehension or problem-solving. This approach is more commonly associated with traditional, rule-based systems than with modern machine learning methods. However, it is worth noting that some simple machine learning algorithms can exhibit characteristics of rote learning, particularly when the training data is limited and the model relies on memorizing specific examples.
Here are some key points regarding rote learning in AI:
1. Memorization without Understanding:
• Rote learning involves memorizing specific facts, patterns, or sequences without necessarily grasping the underlying principles or logic.
• The model essentially stores information in its memory and reproduces it when faced with similar situations.
2. Limited Generalization:
• Rote learning tends to result in limited generalization to new or unseen examples. The model may perform well on the specific examples it has memorized but
struggle with variations or novel cases.
3. Not Common in Modern Machine Learning:
• Modern machine learning, especially techniques like deep learning, focuses on learning representations and features from data rather than relying solely on
memorization.
• Neural networks, for example, aim to learn hierarchical representations of data, allowing for more robust generalization.
4. Rule-Based Systems:
• Rote learning is more commonly associated with rule-based systems, where explicit rules are defined and followed.
• In these systems, the model applies predefined rules to input data without necessarily adapting or learning from the data.
5. Challenges in Complex Environments:
• Rote learning is generally not suitable for handling complex and dynamic environments where understanding and adaptation are crucial.
• In AI applications where reasoning, decision-making, and adaptability are essential, models that rely solely on rote learning may struggle to perform well.
While rote learning is not a preferred approach in modern AI, it can sometimes be observed in the behaviour of simpler models or in the early stages of training more complex models. However, the field of AI has largely moved toward more sophisticated learning paradigms, such as supervised learning, unsupervised learning, and reinforcement learning, which aim to capture underlying patterns and relationships in data for better generalization and adaptability.
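The distinction between memorization and generalization can be sketched with a lookup-table "learner"; this is a deliberately naive illustration, and the training pairs are invented:

```python
class RoteLearner:
    """Memorizes exact (input, output) pairs; it cannot generalize."""

    def __init__(self):
        self.memory = {}

    def train(self, examples):
        for x, y in examples:
            self.memory[x] = y  # pure memorization, no model of the data

    def predict(self, x):
        # Succeeds only on inputs seen verbatim during training.
        return self.memory.get(x, "unknown")

learner = RoteLearner()
learner.train([(2, 4), (3, 9), (4, 16)])  # squares, but the rule is never learned
print(learner.predict(3))   # 9  (memorized)
print(learner.predict(5))   # unknown  (no generalization to unseen input)
```

Even though the training data follows a simple rule (squaring), the rote learner stores only the exact pairs, which is exactly the "limited generalization" failure mode described above.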

Video link
https://youtu.be/ewb6pTpyUns?si=L1jS-LyAcVDDLM-K


3:Learning by Taking Advice


14 November 2023 11:57

Learning by taking advice is a concept that aligns with certain aspects of machine learning and artificial intelligence, particularly in the context of interactive and adaptive systems.
Here are some perspectives on learning by taking advice:
1. Interactive Learning:
• In interactive learning scenarios, a system may receive advice or feedback from a knowledgeable source to improve its performance.
• This advice can take the form of corrections, suggestions, or additional information provided during the learning process.
2. Human-in-the-Loop Systems:
• Some AI systems are designed to work in conjunction with human users who provide guidance or advice.
• This human-in-the-loop approach is common in applications like interactive machine translation, where the system generates translations and a human expert provides
corrections.
3. Reinforcement Learning with Human Feedback:
• In reinforcement learning, an agent learns by interacting with an environment and receiving feedback in the form of rewards or penalties.
• Learning by taking advice can be integrated into reinforcement learning, where the agent receives additional guidance or feedback from a human supervisor to expedite the
learning process.
4. Supervised Learning from Demonstrations:
• Another form of learning by taking advice is demonstrated in approaches like imitation learning or learning from demonstrations.
• In these cases, an AI system observes demonstrations or expert behaviours and learns to mimic or generalize from this advice.
5. Transfer Learning and Knowledge Transfer:
• Learning by taking advice is also related to the concept of transfer learning, where knowledge gained in one task is leveraged to improve performance in a related task.
• Advice or guidance obtained from one context can be applied to enhance learning in a different but related context.
6. Adaptive Systems:
• Adaptive systems may actively seek advice or input from users or external sources to update their models and improve their performance over time.
• This adaptability is crucial in dynamic environments where the system needs to stay relevant and effective.
7. Ethical Considerations:
• In certain applications, such as AI in healthcare or finance, where decisions have significant consequences, incorporating expert advice can be essential for ethical and
responsible AI deployment.
Learning by taking advice is a dynamic and evolving area within AI, reflecting the broader trend toward creating more interactive, user-friendly, and adaptive systems. This
approach acknowledges the importance of human expertise and domain knowledge in guiding AI systems to achieve better performance and more responsible decision-making

Video Link

https://youtu.be/s97Yh5UhdQM?si=oQ3Pe4hMYWO4YKl-

L78: Learning | Process, Components of Learner System | Artificial Intelligence Lectures in Hindi


4:Learning in Problem Solving


14 November 2023 12:05

Learning in problem-solving is a critical aspect of artificial intelligence and machine learning. Problem-solving in AI involves devising algorithms, models, or systems that can analyse information, reason about it, and generate solutions to complex challenges. Learning mechanisms enable AI systems to improve their problem-solving abilities over time. Here are several ways in which learning is integrated into problem-solving in AI:
1. Supervised Learning for Problem-Solving:
• In supervised learning, a model is trained on a labelled dataset, where inputs are associated with corresponding outputs.
• This approach is used in problem-solving tasks where there is a clear mapping between inputs and desired outputs, such as image classification or natural
language processing.
2. Reinforcement Learning for Adaptive Problem-Solving:
• Reinforcement learning involves an agent learning to make decisions by receiving feedback in the form of rewards or penalties.
• In problem-solving contexts, reinforcement learning can be applied to situations where an agent needs to learn a sequence of actions to achieve a goal, such
as game playing or robotic control.
3. Unsupervised Learning for Pattern Recognition:
• Unsupervised learning is employed when the data is not labelled, and the system needs to identify patterns or structures within the data.
• Clustering and dimensionality reduction techniques in unsupervised learning can aid in problem-solving by revealing underlying structures in the data.
4. Transfer Learning for Generalization:
• Transfer learning allows a model to leverage knowledge gained in one domain to improve performance in another related domain.
• This is valuable in problem-solving scenarios where training data might be scarce, and knowledge from a related task can be transferred to enhance the
model's capabilities.
5. Self-Supervised Learning for Feature Learning:
• Self-supervised learning involves training models to predict certain aspects of the input data from other parts of the same data.
• This approach can be beneficial for problem-solving by enabling the model to learn useful representations or features without explicit labels.
6. Ensemble Learning for Robust Solutions:
• Ensemble learning combines multiple models to improve overall performance and robustness.
• In problem-solving, ensembles can be used to address different aspects of a complex problem, providing a more comprehensive and accurate solution.
7. Explainable AI for Transparency in Problem-Solving:
• Explainable AI techniques aim to make the decision-making process of AI systems more transparent and understandable.
• This is crucial in problem-solving applications where stakeholders need to trust and comprehend the solutions provided by AI models.
8. Continuous Learning for Adaptability:
• Continuous learning ensures that AI systems can adapt to changes in the problem space over time.
• It involves updating models with new data and experiences to maintain relevance in dynamic environments.
The integration of learning into problem-solving in AI reflects the need for systems that can adapt, generalize, and improve their performance over time. Researchers and practitioners use a combination of these learning approaches to address diverse problem-solving challenges across various domains.
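The reinforcement-learning point above can be made concrete with a tabular Q-learning sketch on a tiny corridor world; the environment, learning rate, discount factor, and exploration rate are assumptions chosen for the illustration:

```python
import random

def q_learning_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                        epsilon=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor: start at state 0,
    reward 1 for reaching the rightmost state. Actions: 0=left, 1=right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if rng.random() < epsilon:
                a = rng.randrange(2)                 # explore
            else:
                a = 0 if q[s][0] > q[s][1] else 1    # exploit (ties go right)
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning_corridor()
# After training, "right" should dominate "left" in every non-terminal state.
```

The agent learns a sequence of actions toward the goal purely from reward feedback, which is the "learning a sequence of actions to achieve a goal" behaviour described in point 2.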


5:Winston's Learning Program


14 November 2023 12:06

Winston's learning program in artificial intelligence is a seminal piece of work in the field of machine learning. It was developed by Patrick Winston in the early 1970s, and it was one of the first programs to demonstrate the ability to learn from examples.

Winston's program is a supervised learning program, which means that it is trained on a set of labeled examples. In this case, the examples are line drawings of scenes containing children's toy blocks. The program's goal is to learn to identify the objects in the scenes and their relationships to each other.

The program works by first creating a descriptive network for each scene. The nodes in the network represent the objects in the scene, and the edges represent the relationships between them. The program then uses the descriptive network to infer the types of objects in the scene and their sizes and orientations.

Winston's program was able to learn to identify a variety of different types of blocks, including bricks, cubes, pyramids, and wedges. It was also able to learn to infer the relationships between blocks, such as which block was supporting another block.

Winston's learning program was an important milestone in the development of machine learning. It demonstrated that it is possible to create programs that can learn from examples, without the need for explicit programming. Winston's program has also been influential in the development of other machine learning algorithms, such as decision trees and support vector machines.

Here is a more detailed overview of the steps involved in Winston's learning program:

1. The program is presented with a line drawing of a scene containing children's toy blocks.
2. The program uses Guzman's algorithm to identify the bodies in the scene.
3. The program determines which edges belong to which object and fills in partially occluded edges.
4. The program infers the types of objects (brick, wedge, etc.) from the shapes and adjacency relationships of the viable faces.
5. The program infers the sizes and orientations of the objects.
6. The program creates a descriptive network for the scene, with nodes representing the objects and edges representing the relationships between them.

Winston's learning program has been applied to a variety of different problems, including scene recognition, natural language processing, and medical diagnosis. It is a
powerful tool for machine learning, and it continues to be used by researchers and practitioners today.

Video link

L44: Blocks World Problem in Artificial Intelligence with Solution | AI Lectures in Hindi


6:Decision Trees
14 November 2023 17:00

Decision Trees are a popular machine learning algorithm used in artificial intelligence for both classification and regression tasks. They are part of the supervised learning
paradigm, where the algorithm learns to map input features to output labels based on a training dataset. Decision Trees are particularly effective for tasks where the
decision-making process can be represented as a tree-like structure.
Here are the key concepts associated with Decision Trees in AI:
1. Tree Structure:
• A Decision Tree is a hierarchical tree-like structure where each node represents a decision or a test on an attribute, each branch represents the outcome of the test,
and each leaf node represents the final output (class label or regression value).
2. Decision Nodes:
• Decision nodes are points in the tree where a decision or test is made based on a specific feature or attribute.
3. Branches:
• Branches emanate from decision nodes and represent the possible outcomes of the decision or test.
4. Leaf Nodes:
• Leaf nodes are the terminal nodes of the tree and contain the final output, which can be a class label in classification problems or a regression value in regression
problems.
5. Entropy and Information Gain (for Classification):
• In classification tasks, Decision Trees aim to maximize information gain or reduce entropy at each decision node. This involves selecting features that best split the
data into homogeneous subsets with respect to the target variable.
6. Gini Impurity (for Classification):
• Another measure used for classification trees is Gini impurity, which quantifies the likelihood of misclassifying a randomly chosen element.
7. CART (Classification and Regression Trees):
• CART is a widely used algorithm for constructing Decision Trees. It can handle both classification and regression tasks.
8. Pruning:
• Pruning is a technique used to prevent overfitting in Decision Trees. It involves removing certain branches or nodes that do not significantly contribute to the model's
predictive power.
9. Feature Importance:
• Decision Trees can provide insight into feature importance. Features higher up in the tree structure are generally more important in making decisions.
10. Regression Trees:
• In regression tasks, Decision Trees predict a continuous value at each leaf node, making them suitable for predicting numeric outcomes.
11. Ensemble Methods:
• Decision Trees are often used in ensemble methods like Random Forests and Gradient Boosting, where multiple trees are combined to improve overall predictive
performance.
Decision Trees are interpretable, easy to understand, and provide a visual representation of the decision-making process. However, they can be sensitive to small
variations in the data, leading to overfitting. Ensemble methods, which combine multiple Decision Trees, are often used to address this limitation and improve overall model
robustness.
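The entropy and information-gain measures described above can be computed from scratch in a few lines; the toy labels and split below are invented for the illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Entropy reduction achieved by splitting `parent` into `children`."""
    total = len(parent)
    weighted = sum(len(ch) / total * entropy(ch) for ch in children)
    return entropy(parent) - weighted

# Toy example: a split that separates the two classes perfectly.
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]
print(entropy(parent))                     # 1.0 bit
print(information_gain(parent, children))  # 1.0 (a perfect split)
```

A tree-building algorithm such as CART repeats this computation for every candidate feature and threshold at each decision node, choosing the split with the highest gain.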

Video link
Decision Tree Classification in Machine Learning | Decision Tree in ML


1: Expert Systems: Representing and Using Domain Knowledge


14 November 2023 11:47

Expert systems are a type of artificial intelligence (AI) technology that is designed to emulate the decision-making abilities of a human expert in a specific domain. These systems leverage a knowledge base, an inference engine, and a user interface to represent and utilize domain knowledge. Here's an overview of how expert systems represent and use domain knowledge:

1. **Knowledge Base:**
- The knowledge base is a central component of an expert system. It contains the information and rules that represent the expertise of human experts in a particular field.
- Knowledge is typically organized into two main components: facts and rules.
- **Facts:** These are pieces of information about the specific problem domain. They represent the current state of knowledge or the data relevant to the problem.
- **Rules:** These are conditional statements that express relationships between various facts. Rules capture the decision-making process of human experts.

2. **Inference Engine:**
- The inference engine is responsible for reasoning and making decisions based on the knowledge stored in the knowledge base.
- It uses various inference mechanisms to derive new conclusions from the given facts and rules.
- Common inference mechanisms include forward chaining (data-driven reasoning) and backward chaining (goal-driven reasoning).

3. **User Interface:**
- The user interface facilitates communication between the expert system and the end-user or domain expert.
- It may include a natural language interface, graphical user interface (GUI), or other interactive means for users to input information, query the system, and interpret the results.

4. **Knowledge Acquisition:**
- Knowledge acquisition is the process of capturing and entering expertise into the knowledge base.
- Domain experts or knowledge engineers are involved in this process, where they extract information from human experts, documentation, and other sources to populate the knowledge base.

5. **Explanation Facility:**
- Expert systems often include an explanation facility to provide users with explanations of the system's reasoning process.
- This enhances transparency and helps users understand why a particular decision or recommendation was made.

6. **Certainty Factors:**
- Some expert systems use certainty factors or confidence levels to express the system's degree of confidence in a particular conclusion.
- Certainty factors help in dealing with uncertainty and can be useful in decision-making.

7. **Learning and Adaptation:**
- Some advanced expert systems incorporate learning mechanisms to improve their performance over time.
- Learning may involve updating the knowledge base based on feedback or adjusting the certainty factors.

8. **Examples of Expert Systems:**
- MYCIN, developed for medical diagnosis, is a classic example of an expert system.
- Dendral, designed for chemical analysis, is another early expert system.

Expert systems find applications in various domains such as medicine, finance, engineering, and troubleshooting. They are especially useful in situations where human expertise is valuable but not always readily available.

It's important to note that while expert systems were popular in the early days of AI, more recent AI approaches, such as machine learning and deep learning, have gained prominence. These newer approaches often excel in tasks with large amounts of data, complex patterns, and the ability to learn from examples.


2: Knowledge Acquisition
14 November 2023 17:21

Knowledge acquisition is a crucial step in the development of expert systems. It involves capturing, organizing, and formalizing the expertise of human domain experts and transferring it into a format
that an expert system can use. The goal is to build a knowledge base that reflects the knowledge, rules, and decision-making processes of human experts in a specific domain. Here are key aspects of
knowledge acquisition in expert systems:

1. **Domain Analysis:**
- Before knowledge acquisition begins, a thorough analysis of the target domain is conducted. This involves understanding the problem, identifying key concepts, defining the scope of the expert
system, and determining the goals and objectives.

2. **Identifying Experts:**
- Domain experts possess the knowledge that needs to be transferred to the expert system. Identifying and involving these experts in the knowledge acquisition process is crucial.
- Experts can be individuals with practical experience, deep knowledge, or a combination of both in the domain of interest.

3. **Knowledge Elicitation Techniques:**
- Knowledge elicitation involves extracting knowledge from human experts. Various techniques are used, including interviews, questionnaires, observations, and brainstorming sessions.
- Structured interviews and open-ended questions are common methods to encourage experts to articulate their knowledge.

4. **Prototyping:**
- Prototyping involves developing a preliminary version of the expert system to help experts visualize how their knowledge will be represented and used.
- This can facilitate discussions between knowledge engineers and domain experts to refine the system's design and capture more accurate knowledge.

5. **Documenting Knowledge:**
- Knowledge engineers document the acquired knowledge in a structured form that can be understood by the expert system. This documentation includes facts, rules, relationships, and any other
relevant information.

6. **Knowledge Representation:**
- Choosing an appropriate knowledge representation scheme is crucial. This involves deciding how to represent facts, rules, and relationships in a way that the inference engine can effectively use.
- Common representations include frames, semantic networks, and production rules.
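A frame-based representation can be sketched as nested dictionaries, with an `is_a` slot providing inheritance. This is a toy illustration of the idea (the frame names and slots are invented), not any particular frame language.

```python
# Frames are dicts of slots; the "is_a" slot links a frame to its parent.
FRAMES = {
    "bird":    {"is_a": None,   "can_fly": True, "has_feathers": True},
    "penguin": {"is_a": "bird", "can_fly": False},  # local slot overrides parent
}

def get_slot(frame_name, slot):
    """Look up a slot value, walking up the is_a chain when the frame
    does not define the slot itself."""
    frame = FRAMES[frame_name]
    if slot in frame:
        return frame[slot]
    if frame["is_a"] is not None:
        return get_slot(frame["is_a"], slot)
    raise KeyError(slot)

print(get_slot("penguin", "can_fly"))       # False (local override)
print(get_slot("penguin", "has_feathers"))  # True  (inherited from bird)
```

The same inheritance-with-exceptions behavior is what distinguishes frames from flat fact lists: defaults live in general frames, and specific frames override only what differs.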

7. **Validation and Verification:**
- The acquired knowledge should be validated and verified for accuracy and completeness. This involves feedback loops with domain experts to ensure that the knowledge base accurately reflects
their expertise.

8. **Incremental Knowledge Acquisition:**
- Knowledge acquisition is often an iterative process. As the expert system evolves, additional knowledge may be acquired to enhance its capabilities or address limitations identified during testing.

9. **Formalizing Heuristic Knowledge:**
- Experts often use heuristics or rules of thumb in decision-making. Translating these heuristics into a formalized, machine-readable format is a critical part of knowledge acquisition.
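Formalizing a heuristic means turning a rule of thumb into a machine-checkable condition-action pair. As a hedged sketch, the troubleshooting heuristic "if the engine cranks but won't start, check the fuel supply" (an invented example) could become:

```python
def fuel_rule(facts):
    """Production-rule form of an expert's rule of thumb:
    IF engine_cranks AND NOT engine_starts THEN recommend check_fuel_supply."""
    if "engine_cranks" in facts and "engine_starts" not in facts:
        return "check_fuel_supply"
    return None

print(fuel_rule({"engine_cranks"}))  # check_fuel_supply
```

The hard part of knowledge acquisition is precisely this translation: the expert's informal "usually" and "unless" must be pinned down into explicit conditions the inference engine can test.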

10. **Handling Uncertainty and Incompleteness:**
- Expert knowledge may involve uncertainties or incomplete information. Knowledge engineers need to represent and handle such aspects appropriately in the expert system.

11. **Training and Collaboration:**
- Training sessions with domain experts help knowledge engineers better understand the domain and the subtleties of expert decision-making. Continuous collaboration ensures that the system
remains aligned with expert knowledge.

Knowledge acquisition is a challenging and iterative process that requires effective communication between knowledge engineers and domain experts. The success of an expert system often depends
on how accurately and comprehensively the expertise is captured and represented in the knowledge base.


