4th Unit Imp Topics
What is Utility?
• Utility means satisfaction, happiness, or usefulness a person or an agent gets from a particular choice
or outcome.
• It is expressed as a numerical value.
• The higher the utility, the more preferred the outcome.
Example:
Item       Utility
Samosa     9
Vada Pav   7
Banana     5
Here, Samosa gives you the highest utility (9), so according to Utility Theory, you would choose Samosa.
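A quick way to see this idea in code: the minimal Python sketch below (snack names and utility values taken from the table above) simply picks the option with the highest utility.

# Utility values for each choice (from the table above)
utilities = {"Samosa": 9, "Vada Pav": 7, "Banana": 5}

# The rational choice is simply the option with the highest utility
best_choice = max(utilities, key=utilities.get)
print(best_choice)  # Samosa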
Expected Utility
Utility Theory helps in making decisions under uncertainty using Expected Utility.
For example, suppose an agent assigns these utilities to rooms:
Room       Utility
Kitchen    5
Bedroom    10
Bathroom   3
Summary
Concept            Meaning
Utility            A numerical value of the satisfaction an outcome gives
Expected Utility   The probability-weighted average utility of an uncertain choice
Higher utility     The more preferred the outcome
Final Thoughts
Utility Theory is a powerful tool that helps in:
• Making rational decisions
• Choosing the most beneficial option
• Understanding preferences under risk
It is like a mathematical way to model “what we want” and helps both humans and machines make better
choices.
Case 3: Pirates C, D, E
• Pirate C proposes. Needs 2 out of 3 votes.
• If C is thrown out, D gets all (from above case).
• So, C bribes E (who gets nothing otherwise) with 1 coin.
• Final split: C = 99, D = 0, E = 1
Utility(C) = 99 (keeps most and survives)
Case 4: Pirates B, C, D, E
• B needs 2 out of 4 votes (50%).
• If rejected, C’s plan gives D = 0, E = 1.
• So, B can bribe D with 1 coin (better than 0).
• Final split: B = 99, C = 0, D = 1, E = 0
Case 5: Pirates A, B, C, D, E
• A needs 3 out of 5 votes.
• If A is thrown out, B’s plan gives C = 0, E = 0.
• A bribes C and E with 1 coin each (better than 0).
• Final split: A = 98, B = 0, C = 1, D = 0, E = 1
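For the curious, here is a minimal Python sketch of the same backward-induction reasoning. The pirate names and the 100-coin pot come from the problem; the voting rule assumed here is "a proposal passes with at least 50% of the votes, and the proposer votes for himself".

def split(pirates, coins=100):
    """Optimal split proposed by the senior-most pirate (pirates are
    listed senior to junior), found by backward induction."""
    if len(pirates) == 1:
        return {pirates[0]: coins}  # the last pirate keeps everything
    # What each pirate gets if the current proposer is thrown overboard
    fallback = split(pirates[1:], coins)
    # Proposer needs ceil(n/2) votes in total, including his own vote,
    # so he must buy (ceil(n/2) - 1) extra votes
    extra_votes = (len(pirates) + 1) // 2 - 1
    # Cheapest votes to buy: pirates who get the least in the fallback
    cheapest = sorted(pirates[1:], key=lambda p: fallback[p])[:extra_votes]
    alloc = {p: 0 for p in pirates}
    for p in cheapest:
        alloc[p] = fallback[p] + 1  # one coin more than the fallback offer
    alloc[pirates[0]] = coins - sum(alloc.values())
    return alloc

print(split(["A", "B", "C", "D", "E"]))
# {'A': 98, 'B': 0, 'C': 1, 'D': 0, 'E': 1}, same as Case 5 above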
Summary:
Case   Pirates          Final Split
3      C, D, E          C = 99, D = 0, E = 1
4      B, C, D, E       B = 99, C = 0, D = 1, E = 0
5      A, B, C, D, E    A = 98, B = 0, C = 1, D = 0, E = 1
Conclusion:
The Pirate Ship Problem is a classic way to understand how Utility Theory works in decision-making,
especially when people (or agents) are:
• Selfish
• Strategic
• Eager to maximize their outcomes under rules and risks
This is very similar to how AI agents make decisions in multi-agent systems, auctions, negotiations, etc.
Example:
• Online shopping sites recommend products by learning your preferences over time based on your
clicks and purchases.
Formula:
Expected Utility (EU) = Σ [ P(outcome) × U(outcome) ]
That is, multiply each outcome's probability by its utility, then add them all up.
Example:
If there’s a 50% chance of getting ₹100 and a 50% chance of getting ₹0:
EU = (0.5 × 100) + (0.5 × 0) = ₹50
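The same calculation as a minimal Python sketch (numbers from the example above):

# Each outcome is a (probability, utility) pair
outcomes = [(0.5, 100), (0.5, 0)]

# Expected utility = sum of probability × utility over all outcomes
expected_utility = sum(p * u for p, u in outcomes)
print(expected_utility)  # 50.0, i.e., ₹50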
Example:
Choosing a laptop based on:
• Price (low is better)
• Battery life (high is better)
• Weight (low is better)
Each attribute gets a score, and the final decision is made using a combined utility function.
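Here is a minimal Python sketch of such a combined (weighted-sum) utility function. The laptop names, the 0-10 attribute scores (already converted so that higher is always better, even for price and weight), and the weights are all made-up illustrative values:

# Hypothetical 0-10 scores per laptop; low price and low weight have
# already been converted into HIGH scores, so higher is always better
laptops = {
    "Laptop X": {"price": 8, "battery": 6, "weight": 7},
    "Laptop Y": {"price": 5, "battery": 9, "weight": 6},
}

# How much the buyer cares about each attribute (weights sum to 1)
weights = {"price": 0.5, "battery": 0.3, "weight": 0.2}

def utility(scores):
    # Combined utility = weighted sum of the attribute scores
    return sum(weights[attr] * value for attr, value in scores.items())

best = max(laptops, key=lambda name: utility(laptops[name]))
print(best)  # Laptop X (utility 7.2 vs 6.4)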
Significance of POMDP:
• In many situations, an agent cannot see the full state of the environment. It has to make decisions
based on partial information.
• POMDP helps in making decisions under uncertainty when the agent doesn’t have complete
knowledge.
• It uses:
o A belief state (probabilistic idea of what the true state might be),
o Rewards,
o Actions,
o Observations, and
o Transition probabilities.
Example:
• A robot in a smoky room where visibility is poor—it must act based on sensor readings (which are not
always accurate).
Policy Iteration: Key Concepts
• Policy: A rule that tells the agent which action to take in each state.
• Value Function (V): Tells how good a state is, under a given policy.
Step 1: Initialization
• Start with any random policy.
Step 2: Policy Evaluation
• For each state, calculate its value under the current policy.
• Repeat this step until the value function becomes stable (i.e., doesn’t change much).
Step 3: Policy Improvement
• For each state, check if there is a better action than the one in the current policy.
• Choose the action that gives the highest expected value.
• If the policy doesn’t change after this step, the current policy is optimal, and we stop.
• Otherwise, update the policy and go back to Step 2.
Final Result:
• You will get the optimal policy – the best action to take from every state.
• Also, you’ll have the optimal value function for all states.
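To make these steps concrete, here is a minimal Python sketch of policy iteration on a tiny made-up "line world" (4 cells in a row, goal in the last cell). The world, the rewards, and the 0.9 discount factor are illustrative assumptions, not from the notes.

# Tiny line world: cells 0..3, cell 3 is the (terminal) goal
N_STATES, GOAL = 4, 3
ACTIONS = [-1, +1]            # move left, move right
GAMMA = 0.9                   # discount factor

def step(s, a):
    """Deterministic transition: returns (next state, reward)."""
    if s == GOAL:
        return s, 0.0                        # goal is terminal
    nxt = min(max(s + a, 0), N_STATES - 1)   # walls clip the move
    return nxt, (10.0 if nxt == GOAL else -1.0)

policy = [-1] * N_STATES      # start with an arbitrary policy: always left
V = [0.0] * N_STATES

while True:
    # Step 2: Policy Evaluation - sweep until V stops changing
    while True:
        delta = 0.0
        for s in range(N_STATES):
            nxt, r = step(s, policy[s])
            v_new = r + GAMMA * V[nxt]
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < 1e-6:
            break
    # Step 3: Policy Improvement - pick the best action in every state
    stable = True
    for s in range(N_STATES):
        best = max(ACTIONS, key=lambda a: step(s, a)[1] + GAMMA * V[step(s, a)[0]])
        if best != policy[s]:
            policy[s], stable = best, False
    if stable:                # policy unchanged => it is optimal
        break

print(policy)  # [1, 1, 1, -1]: move right in every non-goal cell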
Summary Table
Step                    What Happens
1. Initialization       Start with a random policy
2. Policy Evaluation    Compute the value function for the current policy
3. Policy Improvement   Pick the best action in every state; if the policy doesn’t change, it is optimal
Why Is It Important?
• It is a fundamental algorithm in dynamic programming.
• Helps AI agents learn the best behaviour.
• Works in problems where the model (transition and reward) is known.
What is a POMDP?
POMDP stands for Partially Observable Markov Decision Process.
It is used when an agent:
• Cannot fully observe the current state of the environment,
• Has to take decisions based on partial and noisy observations.
In simple words, the agent doesn't "see" the full picture but must still make smart decisions using probabilities.
Element                Description
States (S)             Possible situations in the environment (which the agent cannot fully see)
Actions (A)            Choices available to the agent in each state
Observations (O)       What the agent sees or senses (not always accurate)
Transition Model (T)   Probability of moving from one state to another when an action is taken
Rewards (R)            Feedback the agent receives for its actions
Belief State (b)       Probability distribution over what the true state might be
Story Setup:
An agent is standing in front of two closed doors:
• Behind one door is a tiger (danger).
• Behind the other door is a treasure (reward).
The agent cannot see which is which; it can only listen for a growl.
Action               Effect
Listen               Pay a small cost, and get a clue (e.g., hear growl from left or right)
Open correct door    Get the treasure (big reward)
Open tiger door      Get attacked by the tiger (big penalty)
Agent Strategy:
1. Start with a belief (e.g., 50% tiger behind left, 50% behind right).
2. Listen once or more → update belief based on what it hears.
3. When the belief becomes strong enough (e.g., 90% sure the tiger is behind the left door), open the other (right-hand) door.
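Here is a minimal Python sketch of step 2, the belief update, using Bayes' rule. The 0.85 listening accuracy is an assumed value (a commonly used figure for this problem):

P_HEAR_CORRECT = 0.85   # assumed probability that the growl is heard correctly

def update_belief(b_left, heard_left):
    """b_left = current P(tiger behind LEFT door);
    heard_left = True if the growl seemed to come from the left."""
    if heard_left:
        like_left, like_right = P_HEAR_CORRECT, 1 - P_HEAR_CORRECT
    else:
        like_left, like_right = 1 - P_HEAR_CORRECT, P_HEAR_CORRECT
    # Bayes' rule: posterior = likelihood × prior, then normalize
    numer = like_left * b_left
    return numer / (numer + like_right * (1 - b_left))

belief = 0.5                                       # start: 50-50
belief = update_belief(belief, heard_left=True)    # hear growl from the left
print(round(belief, 2))                            # 0.85
belief = update_belief(belief, heard_left=True)    # hear it again
print(round(belief, 2))  # 0.97: confident enough to open the right-hand door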
Summary
Concept        Description
POMDP          Decision-making when the true state is only partially observable
Belief State   Probability distribution over the possible states
Strategy       Listen to update the belief; act only once confident enough
What is the Wumpus World Problem?
The Wumpus World Problem is a classic example from Artificial Intelligence (especially in logical agents and knowledge-based systems). Here it is broken down in a simple, beginner-friendly way.
Percept   Meaning
Breeze    A pit is in a neighbouring cell
Stench    The Wumpus is in a neighbouring cell
Glitter   Gold is in the current cell
Bump      The agent has walked into a wall
Scream    The Wumpus has been killed by the arrow
Agent's Actions:
Action            Description
Move Forward      Move one cell in the direction the agent is facing
Turn Left/Right   Change the direction the agent is facing
Grab              Pick up the gold
Shoot             Fire the single arrow straight ahead (to kill the Wumpus)
Climb             Exit the cave from the starting cell
Situation                                  Reward/Penalty
Grabbing the gold                          +1000
Falling into a pit / eaten by the Wumpus   -1000
Shooting the arrow                         -10
Every move                                 -1
How Does the Agent Decide?
The agent:
• Uses logic (like propositional logic) to infer safe and dangerous cells.
• Maintains a knowledge base (KB).
• Updates KB using percepts from the environment.
• Chooses actions that maximize expected reward and avoid risk.
Example Inference:
If cell (1,2) has a breeze, the agent can infer:
“One of the neighboring cells might contain a pit”
It will then mark those cells as ‘possibly dangerous’ and avoid them until more information is collected.
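A tiny Python sketch of this inference step (the grid size, the coordinates, and the helper names are made up for illustration):

def neighbours(cell, size=4):
    """Orthogonal neighbours of a cell inside a size × size grid."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in candidates if 1 <= i <= size and 1 <= j <= size]

visited_safe = {(1, 1), (1, 2)}   # cells already explored without harm
breeze_at = (1, 2)                # percept: a breeze is felt in cell (1,2)

# Any unvisited neighbour of a breezy cell might contain a pit
possibly_dangerous = set(neighbours(breeze_at)) - visited_safe
print(possibly_dangerous)         # {(2, 2), (1, 3)}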
Summary Table:
Concept          Description
Wumpus World     A grid-based cave containing pits, gold, and the Wumpus
Percepts         Clues (breeze, stench, glitter, etc.) the agent senses
Knowledge Base   Facts the agent infers from percepts using logic
Goal             Grab the gold and exit safely while maximizing reward