My Notebook for AI
What is a tree?
A tree is a data structure used to represent data in hierarchical form. It can be defined as a collection of objects or entities, called nodes, that are linked together to simulate a hierarchy. A tree is a non-linear data structure, as the data in a tree is not stored linearly or sequentially.
A binary search tree follows a specific order when arranging its elements. In a binary search tree, the value of the left child must be smaller than its parent node, and the value of the right child must be greater than its parent node. This rule is applied recursively to the left and right subtrees of the root.
Let's understand the concept of Binary search tree with an example.
In the above figure, we can observe that the root node is 40: all the nodes of the left subtree are smaller than the root node, and all the nodes of the right subtree are greater than the root node. Similarly, the left child of the root node is greater than its own left child and smaller than its own right child, so it also satisfies the binary search tree property. Therefore, we can say that the tree in the above image is a binary search tree.
Suppose we change the value of node 35 to 55 in the above tree, and check whether the tree is still a binary search tree.
In the above tree, the value of the root node is 40, which is greater than its left child 30 but smaller than the right child of 30, i.e., 55. Since 55 lies in the left subtree of 40 yet is greater than 40, the tree does not satisfy the binary search tree property. Therefore, the above tree is not a binary search tree.
Insertion in Binary Search tree
A new key in a BST is always inserted at a leaf. To insert an element, we start searching from the root node; if the key to be inserted is less than the root node, we search for an empty location in the left subtree, otherwise we search for an empty location in the right subtree and insert the data there. Insertion in a BST is similar to searching, as we must always maintain the rule that the left subtree is smaller than the root and the right subtree is larger than the root.
Now, let's see the process of inserting a node into BST using an example.
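The insertion walk described above can be sketched as follows; the `Node` class and the sample keys are illustrative assumptions, not taken from the original figure:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(key)                      # empty spot found: key becomes a leaf
    if key < root.key:
        root.left = insert(root.left, key)    # smaller keys go to the left subtree
    else:
        root.right = insert(root.right, key)  # larger (or equal) keys go right
    return root

# Build a tree like the example above, rooted at 40
root = None
for k in [40, 30, 50, 25, 35, 45, 60]:
    root = insert(root, k)
```

An in-order traversal of the resulting tree visits the keys in sorted order, which is a quick way to verify the BST property holds after every insertion.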
Downloaded
Unit 2 Page 1 by Milandeep Kour Bali ([email protected])
lOMoARcPSD|35678125
So, the replaced node 79 will now be a leaf node that can be easily deleted.
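The step above is the two-children case of BST deletion: the deleted node's value is replaced by its in-order successor, and the successor's old position, now a leaf (79 in the example), is easily removed. A minimal sketch, where the `Node` class, helper names, and the sample tree are illustrative assumptions:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def min_node(n):
    while n.left:                      # in-order successor = leftmost node
        n = n.left
    return n

def delete(root, key):
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None:          # zero or one child: splice the node out
            return root.right
        if root.right is None:
            return root.left
        succ = min_node(root.right)    # two children: copy the successor up...
        root.key = succ.key
        root.right = delete(root.right, succ.key)  # ...then delete the leaf copy
    return root

# Illustrative tree: deleting 50 replaces it with its successor 79,
# whose old position is a leaf and is easily removed.
root = Node(50)
root.left, root.right = Node(30), Node(90)
root.right.left = Node(79)
root = delete(root, 50)
```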
Stochastic search is a class of optimization algorithms used to find the best solution to a problem in situations where the objective function is subject to random noise, and it may not always be possible to determine the exact solution due to the presence of uncertainty. Stochastic search algorithms are commonly employed in various fields, including machine learning, artificial intelligence, operations research, and optimization.
The main characteristic of stochastic search is that it incorporates randomness into the search process, which can help explore the solution space more effectively, especially in situations where the solution space is complex, multimodal (multiple local optima), or noisy. Stochastic search algorithms make use of probabilistic techniques and random sampling to find solutions that are either optimal or near-optimal.
Simulated Annealing
Game Playing
Bayes' Theorem
Simulated Annealing
Simulated annealing is a probabilistic optimization algorithm that is often used in artificial intelligence, specifically in solving problems related to optimization and search. It is inspired by the annealing
process in metallurgy, where a material is heated and slowly cooled to remove defects and find the most stable crystalline structure. In the context of AI and optimization, simulated annealing is used to
find the optimal or near-optimal solution to a problem by exploring a complex and often multimodal solution space.
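A minimal simulated-annealing sketch over a one-dimensional objective. The objective function, neighbour step, cooling rate, and seed below are all illustrative assumptions, not part of the notes:

```python
import math
import random

def simulated_annealing(f, x0, temp=1.0, cooling=0.95, steps=1000):
    random.seed(0)                 # fixed seed so the run is repeatable
    x = best = x0
    for _ in range(steps):
        candidate = x + random.uniform(-1.0, 1.0)   # random neighbour
        delta = f(candidate) - f(x)
        # always accept improvements; accept worse moves with prob e^(-delta/T)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x = candidate
        if f(x) < f(best):
            best = x
        temp *= cooling            # slowly "cool" the system, as in annealing
    return best

# Multimodal objective: a parabola with sinusoidal ripples (many local optima)
objective = lambda x: x * x + 3.0 * math.sin(5.0 * x)
best = simulated_annealing(objective, x0=8.0)
```

Early on, the high temperature lets the search accept uphill moves and escape the ripples' local optima; as the temperature falls, it settles into a good basin.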
Game Playing
Game playing in stochastic search involves making decisions in a game with elements of chance or randomness. Stochastic games are characterized by uncertainty, and players must account for this
uncertainty when making their moves.
Game playing was one of the first tasks undertaken in artificial intelligence. Game theory has a history dating from 1950, almost from the days when computers became programmable. The very first game tackled in AI was chess. Pioneers in the field of game theory in AI were Konrad Zuse (the inventor of the first programmable computer and the first programming language), Claude Shannon (the inventor of information theory), Norbert Wiener (the creator of modern control theory), and Alan Turing. Since then, there has been steady progress in the standard of play, to the point that machines have defeated human champions (although not every time) in chess and backgammon, and are competitive in many other games.
TYPES OF GAMES
Perfect Information Game: the player knows all the possible moves of himself and the opponent, and their results. E.g. chess.
Imperfect Information Game: the player does not know all the possible moves of the opponent. E.g. bridge, since not all the cards are visible to the player.
Unpredictable Opponent: generally we cannot predict the behaviour of the opponent, so we need a solution in the form of a strategy that specifies a move for every possible opponent move or every possible state.
Time Constraints: every game has a time constraint, so it may be infeasible to find the best move within that time.
Let f(n) be the evaluation function of position n. Then:
– f(n) >> 0: position n is good for me and bad for you
– f(n) << 0: position n is bad for me and good for you
– f(n) near 0: position n is a neutral position
e.g. evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]
where a 3-length is a complete row, column, or diagonal
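The Tic-Tac-Toe evaluation above can be sketched directly in code. A 3-length is "open" for a player if the opponent occupies none of its three squares; the board encoding below (a flat list of 'X', 'O', ' ') is an illustrative assumption:

```python
# All eight 3-lengths on a 3x3 board, as index triples into a flat list
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def open_lines(board, player):
    """Count 3-lengths (rows/columns/diagonals) not blocked by the opponent."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[i] != opponent for i in line))

def evaluate(board):
    """f(n) = [# 3-lengths open for X] - [# 3-lengths open for O]."""
    return open_lines(board, 'X') - open_lines(board, 'O')

board = [' '] * 9
board[4] = 'X'          # X takes the centre: blocks 4 of O's 8 lines
value = evaluate(board)
```

On an empty board all eight lines are open to both sides, so f = 0; taking the centre blocks the four lines through it for the opponent, giving f = 4.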
Games are represented in the form of trees, wherein nodes represent all the possible states of a game and edges represent moves between them. The initial state of the game is represented by the root, and terminal states by the leaves of the tree. In a normal search problem, the optimal solution would be a sequence of moves leading to a goal state that is a win. Even a simple game like tic-tac-toe is too complex for us to draw the entire game tree. Fig 1 shows part of the game tree for tic-tac-toe. Game tree for Tic-Tac-Toe
Let us represent the two players by 'X' and 'O'. From the initial state, X has nine possible moves. Play alternates between X and O until we reach the leaves. The number on each leaf node indicates the utility value of the terminal state from the point of view of X; high values are taken to be good for X and bad for O.
3: A* Search Implementation
31 October 2023 16:17
A* search is the most commonly known form of best-first search. It uses a heuristic function h(n) together with g(n), the cost to reach node n from the start state. It combines features of UCS and greedy best-first search, which lets it solve the problem efficiently. The A* search algorithm finds the shortest path through the search space using the heuristic function; it expands a smaller search tree and provides an optimal result faster. A* is similar to UCS except that it uses g(n) + h(n) instead of g(n) alone.
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node, so we can combine both costs as f(n) = g(n) + h(n); this sum is called the fitness number.
Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty; if it is, return failure and stop.
Step 3: Select the node n from the OPEN list that has the smallest value of the evaluation function (g + h). If node n is the goal node, return success and stop; otherwise continue.
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For each successor n', check whether n' is already in the OPEN or CLOSED list; if not, compute the evaluation function for n' and place it into the OPEN list.
Step 5: If node n' is already in OPEN or CLOSED, update its back pointer to the parent that gives the lowest g(n') value.
Step 6: Return to Step 2.
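The OPEN/CLOSED steps above can be sketched as follows; the small graph and heuristic values are illustrative assumptions (with h chosen to be consistent), not from the notes:

```python
import heapq

def a_star(graph, h, start, goal):
    """graph: {node: [(neighbour, edge_cost), ...]}; h: heuristic estimates."""
    # OPEN list as a priority queue ordered by f = g + h
    open_list = [(h[start], 0, start, [start])]
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)      # smallest f first
        if node == goal:
            return path, g                               # success
        if node in closed:
            continue
        closed.add(node)                                 # expand each node once
        for nbr, cost in graph.get(node, []):
            if nbr not in closed:
                g2 = g + cost
                heapq.heappush(open_list, (g2 + h[nbr], g2, nbr, path + [nbr]))
    return None, float('inf')                            # OPEN empty: failure

graph = {'S': [('A', 1), ('B', 4)],
         'A': [('B', 2), ('G', 12)],
         'B': [('G', 5)]}
h = {'S': 5, 'A': 4, 'B': 2, 'G': 0}
path, cost = a_star(graph, h, 'S', 'G')   # S -> A -> B -> G, total cost 8
```

Note the CLOSED set makes each node expand at most once, which is safe here because the heuristic is consistent; with an inconsistent heuristic, A* would need to re-open nodes when a cheaper g is found.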
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is given in the table below, so we will calculate f(n) for each state using the formula f(n) = g(n) + h(n), where g(n) is the cost to reach the node from the start state.
Here we will use OPEN and CLOSED list.
Solution:
Advantages:
• The A* search algorithm performs better than other search algorithms.
• A* search algorithm is optimal and complete.
• This algorithm can solve very complex problems.
Disadvantages:
• It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
• A* search algorithm has some complexity issues.
• The main drawback of A* is memory requirement as it keeps all generated nodes in the memory, so it is not practical for various large-scale problems.
• The mini-max algorithm is a recursive (backtracking) algorithm used in decision-making and game theory. It provides an optimal move for the player, assuming that the opponent is also playing optimally.
• The mini-max algorithm uses recursion to search through the game tree.
• The min-max algorithm is mostly used for game playing in AI, such as chess, checkers, tic-tac-toe, Go, and various other two-player games. The algorithm computes the minimax decision for the current state.
• In this algorithm two players play the game; one is called MAX and the other is called MIN.
• Each player tries to ensure that the opponent gets the minimum benefit while they themselves get the maximum benefit.
• Both players are opponents of each other: MAX selects the maximized value and MIN selects the minimized value.
• The minimax algorithm performs a depth-first search to explore the complete game tree.
• The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the tree as the recursion unwinds.
Initial call:
Minimax(node, 3, true)
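That call can be sketched as follows; the nested-list tree encodes the worked example (leaves -1, 4 under D; 2, 6 under E; -3, -5 under F; 0, 7 under G), with names and representation chosen for illustration:

```python
def minimax(node, depth, maximizing):
    """Return the minimax value of node; leaves are plain numbers."""
    if depth == 0 or not isinstance(node, list):    # terminal node reached
        return node
    values = [minimax(child, depth - 1, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

#        A (max)
#   B (min)      C (min)
# D    E       F    G     (max)
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
result = minimax(tree, 3, True)   # 4, matching the worked example
```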
Step 2: First we find the utility values for the maximizer. Its initial value is -∞, so we compare each terminal value with the maximizer's current value and keep the higher one, finding the maximum among them all.
• For node D: max(-1, -∞) = -1, then max(-1, 4) = 4
• For node E: max(2, -∞) = 2, then max(2, 6) = 6
• For node F: max(-3, -∞) = -3, then max(-3, -5) = -3
• For node G: max(0, -∞) = 0, then max(0, 7) = 7
Step 3: In the next step it is the minimizer's turn, so it compares all node values with +∞ and finds the third-layer node values.
• For node B: min(4, 6) = 4
• For node C: min(-3, 7) = -3
Step 4: Now it is the maximizer's turn again; it chooses the maximum of all node values and finds the maximum value for the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be more than 4 layers.
• For node A: max(4, -3) = 4
That was the complete workflow of the minimax two-player game.
Properties of Mini-Max algorithm:
• Complete: the min-max algorithm is complete. It will definitely find a solution (if one exists) in a finite search tree.
• Optimal: the min-max algorithm is optimal if both opponents play optimally.
• Time complexity: as it performs DFS on the game tree, the time complexity of the min-max algorithm is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
• Space complexity: the space complexity of the mini-max algorithm is also similar to DFS, which is O(bm).
5: Alpha-Beta Pruning
01 November 2023 12:21
• Alpha-beta pruning is a modified version of the minimax algorithm: an optimization technique for minimax.
• As we have seen, the number of game states the minimax algorithm has to examine is exponential in the depth of the tree. We cannot eliminate the exponent, but we can effectively cut it in half: there is a technique that computes the correct minimax decision without checking every node of the game tree, and this technique is called pruning. It involves two threshold parameters, alpha and beta, used during expansion, so it is called alpha-beta pruning. It is also called the alpha-beta algorithm.
• Alpha-beta pruning can be applied at any depth of the tree, and sometimes it prunes not only leaves but entire subtrees.
• The two parameters can be defined as:
1. Alpha: the best (highest-value) choice found so far at any point along the path of the maximizer. The initial value of alpha is -∞.
2. Beta: the best (lowest-value) choice found so far at any point along the path of the minimizer. The initial value of beta is +∞.
• Alpha-beta pruning returns the same move as the standard minimax algorithm, but it removes all the nodes that do not really affect the final decision and only slow the algorithm down. Pruning these nodes makes the algorithm fast.
Step 2: At node D it is Max's turn, so the value of α is calculated. α is compared first with 2 and then with 3, so max(2, 3) = 3 becomes the value of α at node D, and the node value will also be 3.
Step 3: Now the algorithm backtracks to node B, where the value of β changes, as it is Min's turn. β = +∞ is compared with the available successor node value: min(∞, 3) = 3, hence at node B now α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E α = 5 and β = 3. Since α >= β, the right successor of E is pruned and the algorithm does not traverse it; the value at node E will be 5.
Step 5: Next, the algorithm again backtracks the tree, from node B to node A. At node A the value of alpha changes to the maximum available value, 3, as max(-∞, 3) = 3, with β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is compared with the left child, 0: max(3, 0) = 3. It is then compared with the right child, 1: max(3, 1) = 3, so α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞, and now the value of beta changes: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G, is pruned and the algorithm does not compute that entire subtree.
Step 8: C now returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows which nodes were computed and which were never computed. Hence the optimal value for the maximizer is 3 in this example.
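The walkthrough can be sketched in code. The tree below mirrors the values the steps describe (leaves 2, 3 under D; 5 under E; 0, 1 under F); the pruned values (E's right leaf and G's leaves) are never examined, so the 9, 7, and 5 used for them here are arbitrary placeholders:

```python
def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax with alpha-beta cut-offs; records every leaf actually evaluated."""
    if not isinstance(node, list):        # leaf
        visited.append(node)
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:             # beta cut-off: prune remaining children
                break
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True, visited))
            beta = min(beta, value)
            if alpha >= beta:             # alpha cut-off
                break
        return value

# A (max) -> B, C (min) -> D, E, F, G (max) -> leaves
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
seen = []
best = alphabeta(tree, float('-inf'), float('inf'), True, seen)
# best == 3; the leaf under E's right branch and both leaves under G are pruned
```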
• Ideal ordering: the ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree and the best moves occur on the left side of the tree. We apply DFS, so it searches the left of the tree first and goes twice as deep as the minimax algorithm in the same amount of time. The complexity with ideal ordering is O(b^(m/2)).
Rules to find good ordering:
Following are some rules to find good ordering in alpha-beta pruning:
• Try the best move first, starting from the shallowest node.
• Order the nodes in the tree such that the best nodes are checked first.
• Use domain knowledge when finding the best move. E.g. for chess, try this order: captures first, then threats, then forward moves, then backward moves.
• We can bookkeep the states, as there is a possibility that states may repeat.
Propositional logic (PL) is the simplest form of logic, where all statements are made up of propositions. A proposition is a declarative statement that is either true or false. It is a technique for representing knowledge in logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (false proposition)
c) 3 + 3 = 7 (false proposition)
d) 5 is a prime number.
• Atomic Proposition: atomic propositions are the simplest propositions, each consisting of a single proposition symbol. These are sentences that must be either true or false.
Example:
a) "2 + 2 is 4" is an atomic proposition, as it is a true fact.
b) "The Sun is cold" is also a proposition, as it is a false fact.
• Compound proposition: compound propositions are constructed by combining simpler or atomic propositions, using parentheses and logical connectives.
Example:
a) "It is raining today, and the street is wet."
b) "Ankit is a doctor, and his clinic is in Mumbai."
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical connectives. There are mainly five connectives, which are given as follows:
1. Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative literal.
2. Conjunction: A sentence which has the ∧ connective, such as P ∧ Q, is called a conjunction.
Example: "Rohan is intelligent and hardworking." It can be written as:
P = Rohan is intelligent,
Q = Rohan is hardworking, so the sentence is P ∧ Q.
3. Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are propositions.
Example: "Ritika is a doctor or an engineer."
Here P = Ritika is a doctor, Q = Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q is called an implication. Implications are also known as if-then rules. For example: "If it is raining, then the street is wet."
Let P = it is raining and Q = the street is wet; then the sentence is represented as P → Q.
5. Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: "I am breathing if and only if I am alive."
P = I am breathing, Q = I am alive; it can be represented as P ⇔ Q.
Following is the summarized table for Propositional Logic Connectives:
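As a companion to that summary, the five connectives can be written as small truth functions (a hedged sketch; the function names are illustrative):

```python
def NOT(p):        return not p              # negation      ¬P
def AND(p, q):     return p and q            # conjunction   P ∧ Q
def OR(p, q):      return p or q             # disjunction   P ∨ Q
def IMPLIES(p, q): return (not p) or q       # implication   P → Q
def IFF(p, q):     return p == q             # biconditional P ⇔ Q

# "If it is raining (P), then the street is wet (Q)":
# the implication is false only when P is true and Q is false.
rainy_but_dry = IMPLIES(True, False)         # False
```

Enumerating all four (P, Q) pairs through these functions reproduces the standard truth table for each connective.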
In the topic of propositional logic, we have seen how to represent statements using propositional logic. Unfortunately, in propositional logic we can only represent facts that are either true or false. PL is not sufficient to represent complex sentences or natural language statements; propositional logic has very limited expressive power. Consider the following sentence, which we cannot represent using PL.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
• First-order logic is another way of knowledge representation in artificial intelligence. It is an extension to propositional logic.
• FOL is sufficiently expressive to represent the natural language statements in a concise way.
• First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about objects in a more natural way and can also express the relationships between those objects.
• First-order logic (like natural language) does not only assume that the world contains facts, as propositional logic does, but also assumes the following things in the world:
○ Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ...
○ Relations: these can be unary relations such as red, round, is adjacent, or n-ary relations such as the sister of, brother of, has color, comes between
○ Functions: father of, best friend, third inning of, end of, ...
• As a natural language, first-order logic also has two main parts:
1. Syntax
2. Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collections of symbols form logical expressions in first-order logic. The basic syntactic elements of first-order logic are symbols. We write statements in shorthand notation in FOL.
Basic Elements of First-order logic:
Following are the basic elements of FOL syntax:
Constants: 1, 2, A, John, Mumbai, cat, ...
Variables: x, y, z, a, b, ...
Predicates: Brother, Father, >, ...
Functions: sqrt, LeftLegOf, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: ==
Quantifiers: ∀, ∃
Atomic sentences:
• Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a predicate symbol followed by a parenthesized sequence of terms.
• We can represent atomic sentences as Predicate(term1, term2, ..., termN).
Example: Ravi and Ajay are brothers: Brothers(Ravi, Ajay).
Chinky is a cat: Cat(Chinky).
Complex Sentences:
• Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts:
• Subject: the subject is the main part of the statement.
• Predicate: a predicate can be defined as a relation which binds two atoms together in a statement.
Consider the statement "x is an integer." It consists of two parts: the first part, x, is the subject of the statement, and the second part, "is an integer," is known as the predicate.
In artificial intelligence, forward and backward chaining are important topics, but before understanding them, let's first understand where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence that applies logical rules to the knowledge base to infer new information from known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in two modes, which are:
1. Forward chaining
2. Backward chaining
Horn Clause and Definite clause:
Horn clauses and definite clauses are forms of sentences that enable the knowledge base to use a more restricted and efficient inference algorithm. Logical inference algorithms use forward and backward chaining approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: a clause which is a disjunction of literals with exactly one positive literal is known as a definite clause or strict Horn clause.
Horn clause: a clause which is a disjunction of literals with at most one positive literal is known as a Horn clause. Hence all definite clauses are Horn clauses.
Example: (¬p ∨ ¬q ∨ k). It has only one positive literal, k.
It is equivalent to p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method when using an inference engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of forward chaining:
• It is a bottom-up approach, as it moves from the bottom to the top.
• It is a process of reaching a conclusion based on known facts or data, starting from the initial state and reaching the goal state.
• The forward-chaining approach is also called data-driven, as we reach the goal using the available data.
• The forward-chaining approach is commonly used in expert systems, such as CLIPS, and in business and production rule systems.
Forward chaining proof:
Step-1:
In the first step we start with the known facts and choose the sentences which do not have implications, such as: American(Robert), Enemy(A, America), Owns(A, T1), and Missile(T1). All these facts will be represented as below.
Step-2:
At the second step, we consider the facts that can be inferred from the available facts whose premises are satisfied.
Rule (1) does not have its premises satisfied, so it will not be added in the first iteration.
Rules (2) and (3) are already added.
Rule (4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, inferred from the conjunction of rules (2) and (3).
Rule (6) is satisfied with the substitution {p/A}, so Hostile(A) is added.
Step 3:
At step 3, we can check that rule (1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement.
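The three steps above can be sketched with a propositional simplification; the full first-order rule set is not reproduced in the notes, so the clauses below are assumptions modeled on the standard textbook version of this example:

```python
# Definite clauses as (set-of-premises, conclusion) pairs, pre-grounded
rules = [
    ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
     "Criminal(Robert)"),                                   # rule (1)
    ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),  # rule (4)
    ({"Missile(T1)"}, "Weapon(T1)"),                        # rule (5)
    ({"Enemy(A,America)"}, "Hostile(A)"),                   # rule (6)
]

def forward_chain(facts, rules, goal):
    """Fire every rule whose premises hold; repeat until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)     # conclusion becomes a known fact
                changed = True
    return goal in facts

known = {"American(Robert)", "Enemy(A,America)", "Owns(A,T1)", "Missile(T1)"}
proved = forward_chain(known, rules, "Criminal(Robert)")   # True
```

The first pass adds Sells, Weapon, and Hostile; the second pass fires rule (1) and adds Criminal(Robert), exactly mirroring steps 2 and 3.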
Step 2:
At the second step, we infer other facts from the goal fact which satisfy the rules. As we can see in rule (1), the goal predicate Criminal(Robert) is present with the substitution {Robert/p}, so we add all the conjunctive facts below the first level and replace p with Robert.
Here we can see that American(Robert) is a known fact, so it is proved here.
Step 3: At step 3, we extract the further fact Missile(q), which is inferred from Weapon(q), as it satisfies rule (5). Weapon(q) is also true with the substitution of the constant T1 for q.
Step 4:
At step 4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies rule (4), with the substitution of A in place of r. So these two statements are proved here.
Step 5:
At step 5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies rule (6). Hence all the statements are proved true using backward chaining.
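The same (assumed, propositionally simplified) clauses can drive a backward-chaining sketch that mirrors these steps: start from the goal Criminal(Robert) and recursively prove each premise:

```python
rules = [
    ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
     "Criminal(Robert)"),
    ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Enemy(A,America)"}, "Hostile(A)"),
]
facts = {"American(Robert)", "Enemy(A,America)", "Owns(A,T1)", "Missile(T1)"}

def backward_chain(goal, rules, facts):
    """A goal holds if it is a known fact, or if some rule concluding it
    has all of its premises provable in turn."""
    if goal in facts:
        return True
    return any(conclusion == goal and
               all(backward_chain(p, rules, facts) for p in premises)
               for premises, conclusion in rules)

proved = backward_chain("Criminal(Robert)", rules, facts)   # True
```

Unlike forward chaining, this goal-driven search only touches the rules and facts relevant to the query.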
Uncertainty:
So far we have learned knowledge representation using first-order logic and propositional logic with certainty, which means we were sure about the predicates. With this kind of knowledge representation we might write A → B, which means if A is true then B is true. But consider a situation where we are not sure whether A is true or not: then we cannot express this statement. This situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty to occur in the real world.
1. Information obtained from unreliable sources
2. Experimental errors
3. Equipment faults
4. Temperature variation
5. Climate change
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability to indicate the uncertainty in knowledge. In probabilistic reasoning,
we combine probability theory with logic to handle the uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that is the result of someone's laziness and ignorance.
In the real world, there are lots of scenarios, where the certainty of something is not confirmed, such as "It will rain today," "behavior of someone for some situations," "A
match between two teams or two players." These are probable sentences for which we can assume that it will happen but not sure about it, so here we use probabilistic
reasoning.
Need of probabilistic reasoning in AI:
• When there are unpredictable outcomes.
• When the specifications or possible values of predicates become too large to handle.
• When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
• Bayes' rule
• Bayesian statistics
As probabilistic reasoning uses probability and related terms, before understanding probabilistic reasoning let's understand some common terms:
Probability: probability can be defined as the chance that an uncertain event will occur. It is the numerical measure of the likelihood that an event will occur. The value of a probability always lies between 0 and 1:
1. 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
2. P(A) = 0 indicates that event A is impossible (total uncertainty about it occurring).
3. P(A) = 1 indicates that event A is certain to occur.
We can find the probability of an uncertain event using the formula:
P(A) = (number of outcomes favourable to A) / (total number of outcomes)
Conditional probability:
Conditional probability is the probability of an event occurring when another event has already happened.
Let's suppose we want to calculate the probability of event A when event B has already occurred, "the probability of A under the condition B". It can be written as:
P(A|B) = P(A ∧ B) / P(B),
where P(A ∧ B) is the joint probability of A and B, and P(B) is the marginal probability of B.
10: Bayes' Theorem
01 November 2023 12:24
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which determines the probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an application of Bayes' theorem, which is fundamental to
Bayesian statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Bayes' theorem allows updating the probability prediction of an event by observing new information of the real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine the probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B. From the product rule we can write:
P(A ∧ B) = P(A|B) P(B), and equally
P(A ∧ B) = P(B|A) P(A).
Equating the two and dividing by P(B) gives:
P(A|B) = P(B|A) P(A) / P(B) ...... (a)
Equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, the probability of observing the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule can be written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak),
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in cases where we have good estimates of these three terms and want to determine the fourth. Suppose we want to perceive the effect of some unknown cause and compute that cause; then Bayes' rule becomes:
P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-1:
Question: What is the probability that a patient has the disease meningitis, given a stiff neck?
Given Data:
A doctor is aware that disease meningitis causes a patient to have a stiff neck, and it occurs 80% of the time. He is also aware of some more facts, which are given as
follows:
• The known probability that a patient has meningitis is 1/30,000.
• The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b the proposition that the patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 ≈ 0.00133 = 1/750
Hence, we can assume that about 1 patient in 750 with a stiff neck has meningitis.
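The calculation above can be sketched in a few lines of Python (a minimal illustration; the function name `bayes_posterior` is ours, not from the notes):

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' rule: P(b|a) = P(a|b) * P(b) / P(a)."""
    return likelihood * prior / evidence

# Meningitis example: P(a|b) = 0.8, P(b) = 1/30000, P(a) = 0.02
posterior = bayes_posterior(0.8, 1 / 30000, 0.02)
print(posterior)  # ≈ 0.00133, i.e. about 1 in 750
```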
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the card is a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the probability that the drawn card is a king, given that it is a face card.
Solution:
P(King|Face) = P(Face|King) P(King) / P(Face) = 1 × (4/52) / (12/52) = 1/3
since every king is a face card, P(Face|King) = 1, and there are 12 face cards in a standard deck.
Unit 2 Downloaded
Page 21 by Milandeep Kour Bali ([email protected])
lOMoARcPSD|35678125
Humans are best at understanding, reasoning, and interpreting knowledge. Humans know things, which is knowledge, and according to their knowledge they perform various actions in the real world. How machines do all these things comes under knowledge representation and reasoning. Hence we can describe knowledge representation as follows:
• Knowledge representation and reasoning (KR, KRR) is the part of artificial intelligence concerned with how AI agents think and how thinking contributes to their intelligent behavior.
• It is responsible for representing information about the real world so that a computer can understand it and utilize this knowledge to solve complex real-world problems, such as diagnosing a medical condition or communicating with humans in natural language.
• It also describes how we can represent knowledge in artificial intelligence. Knowledge representation is not just storing data in a database; it also enables an intelligent machine to learn from that knowledge and those experiences so that it can behave intelligently like a human.
What to Represent:
Following are the kind of knowledge which needs to be represented in AI systems:
• Objects: All the facts about objects in our world domain. E.g., guitars contain strings; trumpets are brass instruments.
• Events: Events are the actions which occur in our world.
• Performance: It describes behavior that involves knowledge about how to do things.
• Meta-knowledge: It is knowledge about what we know.
• Facts: Facts are the truths about the real world and what we represent.
• Knowledge-Base: The central component of knowledge-based agents is the knowledge base, represented as KB. The knowledge base is a group of sentences (here, "sentence" is used as a technical term and is not identical to a sentence of the English language).
Knowledge: Knowledge is awareness or familiarity gained by experience of facts, data, and situations. Following are the types of knowledge in artificial intelligence:
2: Nonmonotonic Reasoning
01 November 2023 12:25
Reasoning is the mental process of deriving logical conclusions and making predictions from available knowledge, facts, and beliefs. Or we can say: "Reasoning is a way to infer facts from existing data." It is a general process of thinking rationally to find valid conclusions.
In artificial intelligence, reasoning is essential so that the machine can also think rationally like a human brain and perform like a human.
Types of Reasoning
In artificial intelligence, reasoning can be divided into the following categories:
• Deductive reasoning
• Inductive reasoning
• Abductive reasoning
• Common Sense Reasoning
• Monotonic Reasoning
• Non-monotonic Reasoning
Monotonic Reasoning:
In monotonic reasoning, once a conclusion is drawn, it remains valid even if we add further information to the existing knowledge base. In monotonic reasoning, adding knowledge does not decrease the set of propositions that can be derived.
In monotonic reasoning, valid conclusions are derived from the available facts only, and they are not affected by new facts.
Monotonic reasoning is not useful for real-time systems, because in real time facts change, so we cannot use monotonic reasoning there.
Monotonic reasoning is used in conventional reasoning systems, and a logic-based system is monotonic.
Theorem proving is an example of monotonic reasoning.
Example:
• Earth revolves around the Sun.
It is a true fact, and it does not change even if we add other sentences to the knowledge base, such as "The moon revolves around the earth" or "The earth is not round."
Advantages of Monotonic Reasoning:
• In monotonic reasoning, each old proof will always remain valid.
• If we deduce some facts from the available facts, they will always remain valid.
Disadvantages of Monotonic Reasoning:
• We cannot represent real-world scenarios using monotonic reasoning.
• Hypothetical knowledge cannot be expressed with monotonic reasoning, which means facts must be true.
• Since we can only derive conclusions from the old proofs, new knowledge from the real world cannot be added.
Non-monotonic Reasoning
In Non-monotonic reasoning, some conclusions may be invalidated if we add some more information to our knowledge base.
Logic will be said as non-monotonic if some conclusions can be invalidated by adding more knowledge into our knowledge base.
Non-monotonic reasoning deals with incomplete and uncertain models.
"Human perceptions of various things in daily life" is a general example of non-monotonic reasoning.
Example: Let suppose the knowledge base contains the following knowledge:
• Birds can fly
• Penguins cannot fly
• Pitty is a bird
So from the above sentences, we can conclude that Pitty can fly.
However, if we add one more sentence to the knowledge base, "Pitty is a penguin", we conclude that "Pitty cannot fly", which invalidates the above conclusion.
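The Pitty example can be mimicked with a simple default rule in Python (a toy sketch; the function `can_fly` and the fact set are our own illustration, not from the notes):

```python
def can_fly(facts):
    """Default rule: birds fly, unless the bird is known to be a penguin."""
    if "bird" not in facts:
        return False
    return "penguin" not in facts  # new knowledge can retract the conclusion

facts = {"bird"}           # Pitty is a bird
print(can_fly(facts))      # True  -> Pitty can fly

facts.add("penguin")       # adding knowledge: Pitty is a penguin
print(can_fly(facts))      # False -> the earlier conclusion is invalidated
```

This is exactly the non-monotonic behaviour described above: enlarging the knowledge base shrinks the set of derivable conclusions.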
Advantages of Non-monotonic reasoning:
• For real-world systems such as Robot navigation, we can use non-monotonic reasoning.
• In Non-monotonic reasoning, we can choose probabilistic facts or can make assumptions.
Disadvantages of Non-monotonic Reasoning:
• In non-monotonic reasoning, the old facts may be invalidated by adding new sentences.
• It cannot be used for theorem proving.
In artificial intelligence, reasoning under uncertainty is a critical aspect of knowledge representation and decision-making. One common knowledge representation scheme used for this
purpose is Probabilistic Graphical Models (PGMs), which include Bayesian Networks and Markov Networks. Here's a brief overview of reasoning under uncertainty using PGMs:
1. Probabilistic Graphical Models (PGMs): These models are designed to represent and manipulate uncertainty in a probabilistic manner. They provide a structured way to model complex, uncertain relationships among variables.
2. Bayesian Networks (BNs): Bayesian Networks are graphical models that use directed acyclic graphs to represent conditional dependencies between variables. They are particularly
useful for capturing cause-and-effect relationships and can be used for both modeling and inference.
3. Markov Networks: Unlike Bayesian Networks, Markov Networks use an undirected graph to represent associations between variables. They are more suitable for modeling situations where the causal relationships are not well defined and dependencies are more about statistical associations.
4. Inference: Reasoning under uncertainty in PGMs involves performing inference to make decisions or predictions. Inference can be either exact or approximate, depending on the complexity of the model. Common inference algorithms include variable elimination, belief propagation, and sampling methods like Markov Chain Monte Carlo (MCMC).
5. Uncertain Evidence: PGMs can handle uncertain evidence, which is valuable in real-world scenarios where data may be incomplete or noisy. Bayesian Networks, for example, can
update beliefs about variables given new evidence using Bayes' theorem.
6. Decision Making: PGMs can be used for decision-making under uncertainty through decision networks, a specialized form of Bayesian Networks. These models incorporate utility
functions to make optimal decisions in the presence of uncertainty.
7. Applications: PGMs are widely used in various AI applications, including medical diagnosis, natural language processing, recommendation systems, robotics, and more. They provide a
principled way to deal with uncertainty, which is inherent in many real-world problems.
8. Challenges: While PGMs are powerful tools for reasoning under uncertainty, they can face challenges in terms of scalability and the complexity of modeling real-world systems with many
variables. Researchers continue to develop more efficient and expressive models to address these challenges.
In summary, reasoning under uncertainty in artificial intelligence involves the use of Probabilistic Graphical Models, such as Bayesian Networks and Markov Networks, to represent,
manipulate, and make decisions in situations where information is uncertain or incomplete. These models have widespread applications and are a fundamental component of AI systems
dealing with real-world data and uncertainty.
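As a small illustration of point 5 (updating beliefs given evidence), here is inference by enumeration on a hypothetical two-node network Rain → WetGrass; all the numbers below are made up for the sketch:

```python
# Two-node Bayesian network Rain -> WetGrass (hypothetical CPT values)
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: 0.9, False: 0.1}  # P(WetGrass=True | Rain)

# Posterior P(Rain=True | WetGrass=True) by enumeration (Bayes' theorem)
numerator = P_rain[True] * P_wet_given_rain[True]
evidence = sum(P_rain[r] * P_wet_given_rain[r] for r in (True, False))
posterior = numerator / evidence
print(round(posterior, 3))  # 0.692
```

Observing wet grass raises the belief in rain from the prior 0.2 to about 0.69, which is the belief-updating behaviour the text describes.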
Acting under uncertainty, in the context of artificial intelligence and decision-making, refers to making choices or taking actions in situations where there is incomplete or uncertain information about
the environment or the outcomes of those actions. It is a fundamental aspect of AI systems designed to interact with the real world. Here's an overview of how AI systems can act under uncertainty:
1. **Decision Theory:** Decision theory is a framework used to make rational decisions under uncertainty. It involves defining objectives, assessing probabilities of different outcomes, and assigning
utilities (values) to these outcomes. By calculating expected utilities, decision-makers can choose actions that maximize their expected utility.
2. **Uncertain Environments:** AI systems often operate in environments where outcomes are not completely predictable. For example, a self-driving car must make decisions based on uncertain
sensor data and the behavior of other road users.
3. **Uncertain Information:** In many cases, AI systems have to deal with incomplete or noisy information. This uncertainty can arise from sensor limitations, data imperfections, or ambiguities in
the environment.
4. **Modeling Uncertainty:** To act under uncertainty, AI systems use models that capture uncertain information. These models can include probabilistic models, Bayesian networks, Markov
decision processes (MDPs), and more. These models help the AI system reason about the likelihood of different outcomes and plan accordingly.
5. **Exploration vs. Exploitation:** In reinforcement learning, a common approach to acting under uncertainty is the exploration-exploitation trade-off. The AI agent needs to decide whether to
explore new actions (potentially yielding valuable information) or exploit known actions to maximize immediate rewards.
6. **Risk Management:** Decision-makers can also take into account their risk tolerance when acting under uncertainty. Some decisions may involve higher risk, while others prioritize safety or conservative strategies.
7. **Adaptive Strategies:** AI systems can adapt their strategies based on the evolving uncertainty in the environment. They might use techniques such as online learning or adaptive control to
adjust their behavior in real time.
8. **Sensor Fusion:** In cases where the AI system relies on multiple sensors or information sources, sensor fusion techniques are used to combine and reconcile data from different sources while
accounting for uncertainty in each source.
9. **Monte Carlo Methods:** Monte Carlo methods, like Monte Carlo Tree Search (MCTS), are often used to estimate the value of different actions when the exact outcomes are uncertain. These
methods involve sampling possible scenarios and averaging the results.
10. **Feedback Loops:** Continuous feedback and learning from the outcomes of actions can help AI systems improve their decision-making over time, adapting to the changing level of uncertainty.
Acting under uncertainty is a critical aspect of AI systems in various domains, including autonomous robotics, financial trading, healthcare, and more. These systems are designed to make rational
decisions in complex, dynamic, and uncertain environments, taking into account the available information and the inherent uncertainties in the world.
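Decision theory (point 1 above) can be illustrated with a tiny expected-utility computation; the actions and (probability, utility) pairs below are purely hypothetical:

```python
def expected_utility(outcomes):
    """Sum of probability * utility over the possible outcomes of an action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical actions with (probability, utility) pairs for their outcomes
actions = {
    "take_highway":  [(0.8, 10), (0.2, -30)],  # fast, but a small risk of a jam
    "take_backroad": [(1.0, 4)],               # slow but certain
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # "take_backroad" (expected utility 4 vs 2)
```

The rational choice under this framework is the action with the highest expected utility, even though the highway has the best single outcome.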
5: Bayes Rule
01 November 2023 12:27
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning; it determines the probability of an event with uncertain knowledge. In probability theory, it relates the conditional probability and marginal probabilities of two random events. Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem, which is fundamental to Bayesian statistics. It is a way to calculate the value of P(B|A) with the knowledge of P(A|B). Bayes' theorem allows us to update the probability of an event as new information about the real world is observed.
Example: If the risk of cancer depends on a person's age, then by using Bayes' theorem we can determine the probability of cancer more accurately with the help of the person's age. Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B:
From the product rule we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B)    ...(a)
Equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference, and it shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate: the probability of hypothesis A given that evidence B has been observed.
P(B|A) is called the likelihood: assuming the hypothesis is true, the probability of observing the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability: the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule can be written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
Example-1:
Question: What is the probability that a patient has the disease meningitis, given a stiff neck?
Given Data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck, and it occurs
80% of the time. He is also aware of some more facts, which are given as follows:
The known probability that a patient has meningitis is 1/30,000.
The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b the proposition that the patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 ≈ 0.00133 = 1/750
Hence, we can assume that about 1 patient in 750 with a stiff neck has meningitis.
A Bayesian belief network is a key computer technology for dealing with probabilistic events and for solving problems that involve uncertainty. We can define a Bayesian
network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic because they are built from a probability distribution and also use probability theory for prediction and anomaly
detection.
Real-world applications are probabilistic in nature, and to represent the relationships between multiple events, we need a Bayesian network. It can also be used in various
tasks including prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision making under uncertainty.
A Bayesian network can be used for building models from data and expert opinions, and it consists of two parts:
• Directed Acyclic Graph
• Table of conditional probabilities.
The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:
• Each node corresponds to a random variable, which can be continuous or discrete.
• Arcs (directed arrows) represent causal relationships or conditional probabilities between random variables. These directed links connect pairs of nodes in the graph.
A link indicates that one node directly influences the other; if there is no directed link, the nodes are independent of each other.
○ In the above diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
○ If we are considering node B, which is connected with node A by a directed arrow, then node A is called the parent of Node B.
○ Node C is independent of node A.
Let's take the observed probabilities for the Burglary and Earthquake components:
P(B= True) = 0.002, which is the probability of a burglary.
P(B= False) = 0.998, which is the probability of no burglary.
P(E= True) = 0.001, which is the probability of a minor earthquake.
P(E= False) = 0.999, which is the probability that an earthquake has not occurred.
We can provide the conditional probabilities as per the below tables:
Conditional probability table for Alarm A:
The conditional probability of alarm A depends on Burglary and Earthquake:
B E P(A= True) P(A= False)
True True 0.94 0.06
True False 0.95 0.05
False True 0.31 0.69
False False 0.001 0.999
Conditional probability table for David Calls:
The conditional probability that David will call depends on the probability of the alarm.
A P(D= True) P(D= False)
True 0.91 0.09
False 0.05 0.95
Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on her parent node, "Alarm."
A P(S= True) P(S= False)
True 0.75 0.25
False 0.02 0.98
From the formula of joint distribution, we can write the problem statement in the form of probability distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
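The joint-probability calculation above can be checked directly in Python using the numbers from the tables (a minimal sketch, not a general inference engine):

```python
# Priors and CPT entries taken from the tables above
P_B = {True: 0.002, False: 0.998}          # Burglary
P_E = {True: 0.001, False: 0.999}          # Earthquake
P_A = {(True, True): 0.94, (True, False): 0.95,
       (False, True): 0.31, (False, False): 0.001}  # P(A=True | B, E)
P_D = {True: 0.91, False: 0.05}            # P(D=True | A)
P_S = {True: 0.75, False: 0.02}            # P(S=True | A)

# P(S, D, A, ¬B, ¬E) = P(S|A) P(D|A) P(A|¬B,¬E) P(¬B) P(¬E)
p = P_S[True] * P_D[True] * P_A[(False, False)] * P_B[False] * P_E[False]
print(round(p, 8))  # 0.00068045
```

Each factor is read straight off a conditional probability table; multiplying them gives the joint probability of that one complete assignment of the five variables.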
The semantics of Bayesian Network:
There are two ways to understand the semantics of the Bayesian network, which is given below:
1. To understand the network as a representation of the joint probability distribution.
This view is helpful for understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional independence statements.
Bayesian Belief Network ll Directed Acyclic Graph and Conditional Probability Table Explained Hindi
Unit 4: Learning
14 November 2023 11:47
Learning in artificial intelligence (AI) refers to the process by which AI systems acquire knowledge and improve their performance over time. Learning here is automated, with little or no human intervention: it involves programming computers so that they learn from the available inputs. The main purpose of machine learning is to explore and construct algorithms that can learn from previous data and make predictions on new input data.
The input to a learning algorithm is training data, representing experience, and the output is expertise, which usually takes the form of another algorithm that can perform a task. The input data to a machine learning system can be numerical, textual, audio, visual, or multimedia. The corresponding output of the system can be a floating-point number (for instance, the velocity of a rocket) or an integer representing a category or a class (for example, a pigeon or a sunflower from image recognition).
AI learning can be broadly categorized into three main types: supervised learning, unsupervised learning, and reinforcement learning.
1. Supervised Learning:
• In supervised learning, the AI model is trained on a labeled dataset, where the input data is paired with corresponding output labels.
• The algorithm learns to map the input data to the correct output by generalizing from the labeled examples provided during training.
• Common applications include image recognition, speech recognition, and classification problems.
2. Unsupervised Learning:
• Unsupervised learning involves training an AI model on unlabeled data, and the system must find patterns or relationships within the data without explicit guidance.
• Clustering and dimensionality reduction are common tasks in unsupervised learning.
• Applications include clustering similar documents, anomaly detection, and generating representative samples.
3. Reinforcement Learning:
• Reinforcement learning is a type of learning where an agent interacts with an environment and learns to make decisions by receiving feedback in the form of rewards or penalties.
• The agent explores different actions and learns to maximize cumulative rewards over time.
• Reinforcement learning is used in applications like game playing, robotics, and autonomous systems.
4. Semi-Supervised Learning:
• Semi-supervised learning combines elements of both supervised and unsupervised learning.
• It involves training a model on a dataset that contains both labeled and unlabeled examples.
• This approach is useful when obtaining a fully labeled dataset is expensive or time-consuming.
5. Self-Supervised Learning:
• Self-supervised learning is a subset of unsupervised learning where the model generates its own labels from the input data.
• It often involves creating surrogate tasks, such as predicting parts of the input from other parts, to enable learning without explicit labels.
6. Transfer Learning:
• Transfer learning involves training a model on one task and then transferring the knowledge gained to a different but related task.
• This can save computational resources and time, especially when labeled data for the target task is limited.
7. Neural Networks and Deep Learning:
• Deep learning, a subset of machine learning, focuses on neural networks with multiple layers (deep neural networks).
• Techniques like convolutional neural networks (CNNs) for image processing and recurrent neural networks (RNNs) for sequence data have been particularly successful.
Continuous learning and adaptation are crucial in AI, as models need to stay relevant in dynamic environments. Researchers and practitioners often fine-tune existing models, leverage
transfer learning, and explore novel algorithms to enhance AI systems' capabilities. Ongoing research and advancements in AI contribute to the evolution of learning techniques and
the development of more sophisticated models.
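As a minimal, dependency-free illustration of supervised learning (type 1 above), here is a one-nearest-neighbour classifier on hypothetical labeled data; the function and data are our own sketch, not from the notes:

```python
def nearest_neighbour(train, x):
    """1-nearest-neighbour: predict the label of the closest training point."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label

# Labeled examples (feature, label): the "experience" the learner trains on
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

print(nearest_neighbour(train, 1.5))  # "small"
print(nearest_neighbour(train, 8.5))  # "large"
```

Even this toy model shows the supervised-learning pattern: labeled examples in, a mapping from new inputs to predicted labels out.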
Video Link
Supervised, Unsupervised and Reinforcement Learning in Artificial Intelligence in Hindi
2: Rote Learning in AI
14 November 2023 12:00
"Rote learning" in the context of artificial intelligence refers to a type of learning where a model memorizes information without understanding the underlying concepts. In
rote learning, the emphasis is on memorization through repetition rather than comprehension or problem-solving. This approach is more commonly associated with
traditional, rule-based systems than with modern machine learning methods. However, it is worth noting that some simple machine learning algorithms can exhibit
characteristics of rote learning, particularly when the training data is limited and the model relies on memorizing specific examples.
Here are some key points regarding rote learning in AI:
1. Memorization without Understanding:
• Rote learning involves memorizing specific facts, patterns, or sequences without necessarily grasping the underlying principles or logic.
• The model essentially stores information in its memory and reproduces it when faced with similar situations.
2. Limited Generalization:
• Rote learning tends to result in limited generalization to new or unseen examples. The model may perform well on the specific examples it has memorized but
struggle with variations or novel cases.
3. Not Common in Modern Machine Learning:
• Modern machine learning, especially techniques like deep learning, focuses on learning representations and features from data rather than relying solely on memorization.
• Neural networks, for example, aim to learn hierarchical representations of data, allowing for more robust generalization.
4. Rule-Based Systems:
• Rote learning is more commonly associated with rule-based systems, where explicit rules are defined and followed.
• In these systems, the model applies predefined rules to input data without necessarily adapting or learning from the data.
5. Challenges in Complex Environments:
• Rote learning is generally not suitable for handling complex and dynamic environments where understanding and adaptation are crucial.
• In AI applications where reasoning, decision-making, and adaptability are essential, models that rely solely on rote learning may struggle to perform well.
While rote learning is not a preferred approach in modern AI, it can sometimes be observed in the behaviour of simpler models or in the early stages of training more
complex models. However, the field of AI has largely moved toward more sophisticated learning paradigms, such as supervised learning, unsupervised learning, and
reinforcement learning, which aim to capture underlying patterns and relationships in data for better generalization and adaptability.
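The memorization-without-generalization behaviour described above can be sketched in a few lines of Python (the `RoteLearner` class is our own toy illustration):

```python
class RoteLearner:
    """Memorizes (input, output) pairs; no generalization to unseen inputs."""

    def __init__(self):
        self.memory = {}

    def learn(self, x, y):
        self.memory[x] = y           # pure memorization through repetition

    def predict(self, x):
        return self.memory.get(x)    # unseen input -> None (no understanding)

learner = RoteLearner()
learner.learn("2+2", "4")
print(learner.predict("2+2"))  # "4"  (memorized example)
print(learner.predict("3+1"))  # None (fails to generalize)
```

The learner reproduces exactly what it has stored but cannot answer even a trivially related question, which is the "limited generalization" weakness noted in point 2.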
Video link
https://youtu.be/ewb6pTpyUns?si=L1jS-LyAcVDDLM-K
Learning by taking advice is a concept that aligns with certain aspects of machine learning and artificial intelligence, particularly in the context of interactive and adaptive systems.
Here are some perspectives on learning by taking advice:
1. Interactive Learning:
• In interactive learning scenarios, a system may receive advice or feedback from a knowledgeable source to improve its performance.
• This advice can take the form of corrections, suggestions, or additional information provided during the learning process.
2. Human-in-the-Loop Systems:
• Some AI systems are designed to work in conjunction with human users who provide guidance or advice.
• This human-in-the-loop approach is common in applications like interactive machine translation, where the system generates translations and a human expert provides
corrections.
3. Reinforcement Learning with Human Feedback:
• In reinforcement learning, an agent learns by interacting with an environment and receiving feedback in the form of rewards or penalties.
• Learning by taking advice can be integrated into reinforcement learning, where the agent receives additional guidance or feedback from a human supervisor to expedite the
learning process.
4. Supervised Learning from Demonstrations:
• Another form of learning by taking advice is demonstrated in approaches like imitation learning or learning from demonstrations.
• In these cases, an AI system observes demonstrations or expert behaviours and learns to mimic or generalize from this advice.
5. Transfer Learning and Knowledge Transfer:
• Learning by taking advice is also related to the concept of transfer learning, where knowledge gained in one task is leveraged to improve performance in a related task.
• Advice or guidance obtained from one context can be applied to enhance learning in a different but related context.
6. Adaptive Systems:
• Adaptive systems may actively seek advice or input from users or external sources to update their models and improve their performance over time.
• This adaptability is crucial in dynamic environments where the system needs to stay relevant and effective.
7. Ethical Considerations:
• In certain applications, such as AI in healthcare or finance, where decisions have significant consequences, incorporating expert advice can be essential for ethical and
responsible AI deployment.
Learning by taking advice is a dynamic and evolving area within AI, reflecting the broader trend toward creating more interactive, user-friendly, and adaptive systems. This
approach acknowledges the importance of human expertise and domain knowledge in guiding AI systems to achieve better performance and more responsible decision-making.
Video Link
https://youtu.be/s97Yh5UhdQM?si=oQ3Pe4hMYWO4YKl-
L78: Learning | Process, Components of Learner System | Artificial Intelligence Lectures in Hindi
Learning in problem-solving is a critical aspect of artificial intelligence and machine learning. Problem-solving in AI involves devising algorithms, models, or
systems that can analyse information, reason about it, and generate solutions to complex challenges. Learning mechanisms enable AI systems to improve their
problem-solving abilities over time. Here are several ways in which learning is integrated into problem-solving in AI:
1. Supervised Learning for Problem-Solving:
• In supervised learning, a model is trained on a labelled dataset, where inputs are associated with corresponding outputs.
• This approach is used in problem-solving tasks where there is a clear mapping between inputs and desired outputs, such as image classification or natural
language processing.
2. Reinforcement Learning for Adaptive Problem-Solving:
• Reinforcement learning involves an agent learning to make decisions by receiving feedback in the form of rewards or penalties.
• In problem-solving contexts, reinforcement learning can be applied to situations where an agent needs to learn a sequence of actions to achieve a goal, such
as game playing or robotic control.
3. Unsupervised Learning for Pattern Recognition:
• Unsupervised learning is employed when the data is not labelled and the system needs to identify patterns or structures within the data.
• Clustering and dimensionality reduction techniques in unsupervised learning can aid in problem-solving by revealing underlying structures in the data.
4. Transfer Learning for Generalization:
• Transfer learning allows a model to leverage knowledge gained in one domain to improve performance in another related domain.
• This is valuable in problem-solving scenarios where training data might be scarce, and knowledge from a related task can be transferred to enhance the
model's capabilities.
5. Self-Supervised Learning for Feature Learning:
• Self-supervised learning involves training models to predict certain aspects of the input data from other parts of the same data.
• This approach can be beneficial for problem-solving by enabling the model to learn useful representations or features without explicit labels.
6. Ensemble Learning for Robust Solutions:
• Ensemble learning combines multiple models to improve overall performance and robustness.
• In problem-solving, ensembles can be used to address different aspects of a complex problem, providing a more comprehensive and accurate solution.
7. Explainable AI for Transparency in Problem-Solving:
• Explainable AI techniques aim to make the decision-making process of AI systems more transparent and understandable.
• This is crucial in problem-solving applications where stakeholders need to trust and comprehend the solutions provided by AI models.
8. Continuous Learning for Adaptability:
• Continuous learning ensures that AI systems can adapt to changes in the problem space over time.
• It involves updating models with new data and experiences to maintain relevance in dynamic environments.
The integration of learning into problem-solving in AI reflects the need for systems that can adapt, generalize, and improve their performance over time.
Researchers and practitioners use a combination of these learning approaches to address diverse problem-solving challenges across various domains.
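As a concrete illustration of the supervised-learning idea in point 1 above, here is a minimal Python sketch: a one-nearest-neighbour classifier that learns a mapping from labelled training examples to class labels. The toy dataset, the labels, and the function name are invented for illustration; this is a sketch of the concept, not a production implementation.

```python
import math

def nearest_neighbour(train, query):
    """Return the label of the training example closest to `query`."""
    # `train` is a list of (feature_vector, label) pairs.
    features, label = min(train, key=lambda ex: math.dist(ex[0], query))
    return label

# Labelled training data: (feature vector, class label) -- illustrative only.
train = [((1.0, 1.0), "small"), ((1.2, 0.8), "small"),
         ((8.0, 9.0), "large"), ((9.1, 8.5), "large")]

print(nearest_neighbour(train, (1.1, 0.9)))  # -> small
print(nearest_neighbour(train, (8.5, 9.2)))  # -> large
```

The "training" here is simply storing the labelled examples; prediction finds the nearest stored input and reuses its output, which is the clear input-to-output mapping that supervised learning exploits.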
Unit 4, Page 33 · Downloaded by Milandeep Kour Bali ([email protected])
Winston's learning program in artificial intelligence is a seminal piece of work in the field of machine learning. It was developed by Patrick Winston in the early 1970s, and
it was one of the first programs to demonstrate the ability to learn from examples.
Winston's program is a supervised learning program, which means that it is trained on a set of labeled examples. In this case, the examples are line drawings of scenes
containing children's toy blocks. The program's goal is to learn to identify the objects in the scenes and their relationships to each other.
The program works by first creating a descriptive network for each scene. The nodes in the network represent the objects in the scene, and the edges represent the
relationships between them. The program then uses the descriptive network to infer the types of objects in the scene and their sizes and orientations.
Winston's program was able to learn to identify a variety of different types of blocks, including bricks, cubes, pyramids, and wedges. It was also able to learn to infer the
relationships between blocks, such as which block was supporting another block.
Winston's learning program was an important milestone in the development of machine learning. It demonstrated that it is possible to create programs that can learn from
examples, without the need for explicit programming. Winston's program was also influential on later work in machine learning, particularly research on learning concepts
from examples.
Here is a more detailed overview of the steps involved in Winston's learning program:
1. The program is presented with a line drawing of a scene containing children's toy blocks.
2. The program uses Guzman's algorithm to identify the bodies in the scene.
3. The program determines which edges belong to which object and fills in partially occluded edges.
4. The program infers the types of objects (brick, wedge, etc.) from the shapes and adjacency relationships of the viable faces.
5. The program infers the sizes and orientations of the objects.
6. The program creates a descriptive network for the scene, with nodes representing the objects and edges representing the relationships between them.
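The descriptive network built in step 6 can be sketched in Python. This is a simplified model, not Winston's actual notation: nodes carry object types and sizes, and edges carry relationship labels. The object names and relation labels are invented for illustration.

```python
# A toy descriptive network for a blocks-world scene: nodes are objects with
# attributes, edges are labelled relationships between objects.
scene = {
    "nodes": {
        "A": {"type": "brick", "size": "large"},
        "B": {"type": "wedge", "size": "small"},
    },
    "edges": [
        ("B", "is-supported-by", "A"),  # wedge B rests on brick A
    ],
}

def relations_of(network, obj):
    """All (relation, other-object) pairs in which `obj` is the source node."""
    return [(rel, dst) for src, rel, dst in network["edges"] if src == obj]

print(relations_of(scene, "B"))  # -> [('is-supported-by', 'A')]
```

Representing a scene this way is what lets the program compare two scenes structurally, which is the basis for learning concepts from examples and near-misses.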
Winston's ideas have been applied to a variety of different problems, including scene recognition, natural language processing, and medical diagnosis. The program was a
powerful early demonstration of machine learning, and its ideas continue to influence researchers and practitioners today.
Video link
L44: Blocks World Problem in Artificial Intelligence with Solution | AI Lectures in Hindi
6: Decision Trees
14 November 2023 17:00
Decision Trees are a popular machine learning algorithm used in artificial intelligence for both classification and regression tasks. They are part of the supervised learning
paradigm, where the algorithm learns to map input features to output labels based on a training dataset. Decision Trees are particularly effective for tasks where the
decision-making process can be represented as a tree-like structure.
Here are the key concepts associated with Decision Trees in AI:
1. Tree Structure:
• A Decision Tree is a hierarchical tree-like structure where each node represents a decision or a test on an attribute, each branch represents the outcome of the test,
and each leaf node represents the final output (class label or regression value).
2. Decision Nodes:
• Decision nodes are points in the tree where a decision or test is made based on a specific feature or attribute.
3. Branches:
• Branches emanate from decision nodes and represent the possible outcomes of the decision or test.
4. Leaf Nodes:
• Leaf nodes are the terminal nodes of the tree and contain the final output, which can be a class label in classification problems or a regression value in regression
problems.
5. Entropy and Information Gain (for Classification):
• In classification tasks, Decision Trees aim to maximize information gain or reduce entropy at each decision node. This involves selecting features that best split the
data into homogeneous subsets with respect to the target variable.
6. Gini Impurity (for Classification):
• Another measure used for classification trees is Gini impurity, which quantifies the likelihood of misclassifying a randomly chosen element.
7. CART (Classification and Regression Trees):
• CART is a widely used algorithm for constructing Decision Trees. It can handle both classification and regression tasks.
8. Pruning:
• Pruning is a technique used to prevent overfitting in Decision Trees. It involves removing certain branches or nodes that do not significantly contribute to the model's
predictive power.
9. Feature Importance:
• Decision Trees can provide insight into feature importance. Features higher up in the tree structure are generally more important in making decisions.
10. Regression Trees:
• In regression tasks, Decision Trees predict a continuous value at each leaf node, making them suitable for predicting numeric outcomes.
11. Ensemble Methods:
• Decision Trees are often used in ensemble methods like Random Forests and Gradient Boosting, where multiple trees are combined to improve overall predictive
performance.
Decision Trees are interpretable, easy to understand, and provide a visual representation of the decision-making process. However, they can be sensitive to small
variations in the data, leading to overfitting. Ensemble methods, which combine multiple Decision Trees, are often used to address this limitation and improve overall model
robustness.
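The impurity measures from points 5 and 6 can be computed directly. Below is a small Python sketch of entropy, Gini impurity, and information gain over lists of class labels at a node; the "yes"/"no" labels are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of the class distribution at a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: probability of misclassifying a random element."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction achieved by splitting `parent` into `children`."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

mixed = ["yes", "yes", "no", "no"]      # perfectly mixed node
print(entropy(mixed))                    # -> 1.0 (maximum for two classes)
print(gini(mixed))                       # -> 0.5

# A split that separates the classes perfectly yields maximal gain.
print(information_gain(mixed, [["yes", "yes"], ["no", "no"]]))  # -> 1.0
```

A decision-tree learner evaluates candidate splits with one of these measures and chooses the feature whose split gives the largest impurity reduction, exactly as described in points 5 and 6.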
Video link
Decision Tree Classification in Machine Learning | Decision Tree in ML
1: Expert Systems
1. **Knowledge Base:**
- The knowledge base is a central component of an expert system. It contains the information and rules that represent the expertise of human experts in a particular field.
- Knowledge is typically organized into two main components: facts and rules.
- **Facts:** These are pieces of information about the specific problem domain. They represent the current state of knowledge or the data relevant to the problem.
- **Rules:** These are conditional statements that express relationships between various facts. Rules capture the decision-making process of human experts.
2. **Inference Engine:**
- The inference engine is responsible for reasoning and making decisions based on the knowledge stored in the knowledge base.
- It uses various inference mechanisms to derive new conclusions from the given facts and rules.
- Common inference mechanisms include forward chaining (data-driven reasoning) and backward chaining (goal-driven reasoning).
3. **User Interface:**
- The user interface facilitates communication between the expert system and the end-user or domain expert.
- It may include a natural language interface, graphical user interface (GUI), or other interactive means for users to input information, query the system, and interpret the results.
4. **Knowledge Acquisition:**
- Knowledge acquisition is the process of capturing and entering expertise into the knowledge base.
- Domain experts or knowledge engineers are involved in this process, where they extract information from human experts, documentation, and other sources to populate the knowledge base.
5. **Explanation Facility:**
- Expert systems often include an explanation facility to provide users with explanations of the system's reasoning process.
- This enhances transparency and helps users understand why a particular decision or recommendation was made.
6. **Certainty Factors:**
- Some expert systems use certainty factors or confidence levels to express the system's degree of confidence in a particular conclusion.
- Certainty factors help in dealing with uncertainty and can be useful in decision-making.
Expert systems find applications in various domains such as medicine, finance, engineering, and troubleshooting. They are especially useful in situations where human expertise is valuable but not
always readily available.
It's important to note that while expert systems were popular in the early days of AI, more recent AI approaches, such as machine learning and deep learning, have gained prominence. These newer
approaches often excel in tasks with large amounts of data, complex patterns, and the ability to learn from examples.
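To make components 1 and 2 concrete, here is a minimal Python sketch of a knowledge base (facts and rules) driven by a forward-chaining inference engine, the data-driven mechanism mentioned above. The medical-style facts and rule names are invented for illustration.

```python
# Knowledge base: facts are strings; each rule maps a set of premise facts
# to a single conclusion fact.
facts = {"fever", "cough"}
rules = [
    ({"fever", "cough"}, "flu-suspected"),
    ({"flu-suspected"}, "recommend-rest"),
]

def forward_chain(facts, rules):
    """Fire every rule whose premises all hold, repeating until no new
    fact can be derived (data-driven reasoning)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(sorted(forward_chain(facts, rules)))
# -> ['cough', 'fever', 'flu-suspected', 'recommend-rest']
```

Note how the second rule fires only after the first one has added `flu-suspected` to the fact set; chaining conclusions into new premises is what distinguishes an inference engine from a simple lookup. Backward chaining would instead start from a goal such as `recommend-rest` and work back to the facts that support it.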
Unit 5, Page 36 · Downloaded by Milandeep Kour Bali ([email protected])
2: Knowledge Acquisition
14 November 2023 17:21
Knowledge acquisition is a crucial step in the development of expert systems. It involves capturing, organizing, and formalizing the expertise of human domain experts and transferring it into a format
that an expert system can use. The goal is to build a knowledge base that reflects the knowledge, rules, and decision-making processes of human experts in a specific domain. Here are key aspects of
knowledge acquisition in expert systems:
1. **Domain Analysis:**
- Before knowledge acquisition begins, a thorough analysis of the target domain is conducted. This involves understanding the problem, identifying key concepts, defining the scope of the expert
system, and determining the goals and objectives.
2. **Identifying Experts:**
- Domain experts possess the knowledge that needs to be transferred to the expert system. Identifying and involving these experts in the knowledge acquisition process is crucial.
- Experts can be individuals with practical experience, deep knowledge, or a combination of both in the domain of interest.
3. **Prototyping:**
- Prototyping involves developing a preliminary version of the expert system to help experts visualize how their knowledge will be represented and used.
- This can facilitate discussions between knowledge engineers and domain experts to refine the system's design and capture more accurate knowledge.
4. **Documenting Knowledge:**
- Knowledge engineers document the acquired knowledge in a structured form that can be understood by the expert system. This documentation includes facts, rules, relationships, and any other
relevant information.
5. **Knowledge Representation:**
- Choosing an appropriate knowledge representation scheme is crucial. This involves deciding how to represent facts, rules, and relationships in a way that the inference engine can effectively use.
- Common representations include frames, semantic networks, and production rules.
Knowledge acquisition is a challenging and iterative process that requires effective communication between knowledge engineers and domain experts. The success of an expert system often depends
on how accurately and comprehensively the expertise is captured and represented in the knowledge base.
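As a small illustration of one representation scheme mentioned above, frames can be modelled as dictionaries whose slots inherit through an `is-a` link. The frame names and slot values below are invented for illustration; real frame systems add defaults, procedures, and multiple inheritance.

```python
# Each frame is a dict of slots; "is-a" links a frame to its parent frame.
frames = {
    "vehicle": {"is-a": None, "wheels": 4, "powered": True},
    "car":     {"is-a": "vehicle", "doors": 4},
}

def get_slot(frames, name, slot):
    """Look up a slot, inheriting from the `is-a` parent if it is absent."""
    while name is not None:
        frame = frames[name]
        if slot in frame:
            return frame[slot]
        name = frame.get("is-a")  # walk up the inheritance chain
    return None

print(get_slot(frames, "car", "doors"))   # -> 4
print(get_slot(frames, "car", "wheels"))  # -> 4 (inherited from vehicle)
```

Inheritance is what makes frames economical for knowledge engineers: a fact entered once on a general frame is available to every more specific frame beneath it.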