Artificial and Computational Intelligence


Session 1
Chapter 1: Introduction of Artificial and Computational Intelligence
Contents
(1) What is Artificial Intelligence: Acting Humanly, Thinking humanly, Thinking rationally,
Acting Rationally
(2) Foundations of AI
(3) Brief Overview of Modern AI & Application Domains.

Introduction
 Homo sapiens – man the wise.
 We have tried to understand how we think.
 A.I. is the study of how to make computers do things at which, at the moment, people are
better.
 The term was coined in 1956 by John McCarthy at the Dartmouth Conference (Dartmouth College).
Definition:
Artificial Intelligence is concerned with the design of intelligence in an artificial device.
There are two ideas in the definition.
– Intelligence
– artificial device

Features of Intelligence
Dr. Howard Gardner of Harvard University has, for example, identified 9 intelligences and
speculates that there may be more. He posits that we all possess these intelligences to some
degree but are stronger in some than in others. Recognizing and tapping into our strong
intelligences should therefore enhance our learning.

Verbal/Linguistic - those who learn best through words: reading, writing and discussion.
Logical/Mathematical - those who learn best through reasoning, numbers and problem solving.
Visual/Spatial - those who learn best through images, diagrams and visualisation.
Bodily/kinesthetic - those who learn best through movement, games, hands-on tasks.
Musical/Rhythmic - those who learn best through music, song, rhythmic chants etc.
Intra personal - those who are in touch with their own feelings, ideas and values.
Inter personal - those who are outgoing and learn better cooperatively in pairs and groups
Naturalist - those who are at one with nature, the outdoors, animals.
Existentialist - those who seek the ‘big picture’ - why we are here and our role in the world.

Artificial Devices
AI operates within artificial devices, which can range from simple programs running on a
computer to complex robotic systems. These devices are designed to process data, analyze
information, and perform tasks autonomously or semi-autonomously.
Examples
– Smartphones
– Chatbots
– Self-Driving Cars
– Game Playing AI
– Robots
– Language Translation Apps
– Recommendation Systems and many more

What is Intelligence?
Intelligence: “ability to learn, understand and think” (Oxford dictionary)
Is it that which characterizes humans?
or
Is there an absolute standard of judgement? Accordingly, there are two possibilities:
1. A system with intelligence is expected to behave as intelligently as a human.
2. A system with intelligence is expected to behave in the best possible manner.
Secondly, what type of behavior are we talking about?
1. Are we looking at the thought process or reasoning ability of the system?
2. Or are we only interested in the final manifestations of the system in terms of its
actions?
Different types of Intelligence

Thinking Humanly
"The automation of activities that we associate with human thinking, activities such as decision-making, problem solving, learning..." (Bellman, 1978)
This view is that artificial intelligence is about designing systems that are as intelligent as
humans.
This view involves trying to understand human thought and an effort to build machines that
emulate the human thought process. This view is the cognitive science approach to AI.
The interdisciplinary field of cognitive science brings together computer models from AI and
experimental techniques from psychology to try to construct precise and testable theories
of the workings of the human mind.
Suppose we ask a person to explain how his or her brain connects different things during the thinking process: he or she will probably close both eyes and try to observe how the thinking happens, but will not be able to explain or interpret the process.
For example, if we want to model the thinking of Roger Federer and make the modelled system compete against him in a tennis game, it may not be possible to replicate Roger Federer's exact thinking; however, a well-built intelligent system (robot) could still play and win the game against him.

We can interpret how the human mind thinks in theory, in three ways as follows:

Introspection method: In the field of psychology and cognitive science, introspection is a research method where individuals observe and report on their own mental processes, thoughts, and experiences.
Psychological inspection method: Observational methods in psychology involve systematically watching and recording behavior, often in natural settings, to gain insights into psychological processes, interactions, and responses to stimuli. It is a qualitative research approach that allows researchers to study behavior in its real-world context.
Brain imaging method (MRI - magnetic resonance imaging, or fMRI - functional magnetic resonance imaging): Observe a person's brain in action.
Thinking Rationally
Logic and laws of thought deals with studies of ideal or rational thought process and
inference.
The emphasis in this case is on the inferencing mechanism, and its properties.
That is, how the system arrives at a conclusion, and the reasoning behind its selection of actions, are very important in this point of view.
The soundness and completeness of the inference mechanisms are important here.
In simple words, if your thoughts are based on facts and not emotions, it is called rational
thinking.

Thinking Rationally: Laws of Thought


Aristotle was one of the first to attempt to codify "right thinking", i.e., irrefutable reasoning processes.

His famous syllogisms provided patterns for argument structures that always gave correct
conclusions given correct premises. For example, “Socrates is a man; all men are mortal;
therefore Socrates is mortal." These laws of thought were supposed to govern the operation of
the mind, and initiated the field of logic.

Formal logic provides a precise notation and rules for representing and
reasoning with all kinds of things in the world.
Obstacles:
 Informal knowledge representation.
 Computational complexity and resources

Acting Humanly: The Turing Test


 Alan Turing (1912-1954)
 “Computing Machinery and Intelligence” (1950)

Consider the following setting. There are two rooms, A and B. One of the rooms contains a
computer. The other contains a human.
The interrogator is outside and does not know which one is a computer. He can ask questions
through a teletype and receives answers from both A and B.
The interrogator needs to identify which of A and B is the human.
To pass the Turing test, the machine has to fool the interrogator into believing that it is human.

Has any machine passed the Turing test?
Computer programs like ELIZA, MGONZ, NATACHATA, and CYBERLOVER have fooled
many users in the past, and the users never knew that they were talking to a computer program.
A computer chatbot called Eugene Goostman, which had the persona of a 13-year-old boy,
technically passed the Turing Test in 2014.
In 2018, Google Duplex was introduced at the annual Google I/O developer conference. The
machine scheduled a hair salon appointment and interacted with a hair salon
assistant via the phone as part of the conversation. Though some critics view the outcome
differently, some believe Google Duplex passed the Turing test.
Acting Rationally

"The branch of computer science that is concerned with the automation of intelligent behavior"
(Luger and Stubblefield, 1993)

The fourth view of AI is that it is the study of rational agents.

Acting rationally means acting so as to achieve one's goals, given one's beliefs.

This view deals with building machines that act rationally.

The focus is on how the system acts and performs, and not so much on the reasoning process.

A rational agent is one that acts rationally, that is in the best possible manner.
Does not necessarily involve thinking.
An agent is just something that perceives and acts.
For example, pulling one's hand off a hot stove is a reflex action that is more successful than a
slower action taken after careful deliberation.
Advantages:
 More general than the “laws of thought” approach.
 More amenable to scientific development than human-based
approaches.
Foundations of AI
The foundation of Artificial Intelligence (AI) is built upon several key concepts, principles, and
techniques. Here are some fundamental aspects that form the foundation of AI:
1. Machine Learning (ML): Machine learning is a subset of AI that focuses on the
development of algorithms and models that enable computers to learn from data. This
learning process allows machines to improve their performance on a task over time
without being explicitly programmed.

• Supervised Learning: Involves training a model on labeled data where the
algorithm learns patterns and makes predictions.
• Unsupervised Learning: Algorithms analyze unlabeled data to discover patterns
and structures on their own.
• Reinforcement Learning: Agents learn by interacting with an environment and
receiving feedback in the form of rewards or penalties.
2. Data: Data is a crucial component for training machine learning models. AI systems rely
on vast amounts of data to learn patterns, make predictions, and perform tasks. Quality
and quantity of data significantly impact the effectiveness of AI applications.

3. Algorithms: AI algorithms are the instructions or rules that govern the behavior of AI
systems. These can include rule-based systems, decision trees, neural networks, and
various other mathematical models that enable machines to process information and
make decisions.
4. Neural Networks: Neural networks, inspired by the human brain, are a key component
of deep learning, a subset of machine learning. Deep neural networks are particularly
effective in handling complex tasks such as image recognition, natural language
processing, and speech recognition.
5. Natural Language Processing (NLP): NLP is a field of AI that focuses on enabling
machines to understand, interpret, and generate human language. It is essential for
applications such as language translation, sentiment analysis, and chatbots. Text
analysis and speech recognition are two important tasks in NLP.
6. Computer Vision: Computer vision involves enabling machines to interpret and
understand visual information from the world, including images and videos. Applications
include image recognition, object detection, and facial recognition.
7. Expert Systems: Expert systems are AI programs that mimic the decision-making
abilities of a human expert in a specific domain. These systems use rule-based reasoning
to solve complex problems.
AI Problems
While studying the typical range of tasks that we might expect an “intelligent entity” to perform,
we need to consider both "common-place tasks" as well as "expert tasks".
Common-place tasks
– Recognizing people, objects.
– Communicating (through natural language).
– Navigating around obstacles on the streets
These tasks are done routinely by people and some other animals.
Expert tasks

– Medical diagnosis.
– Mathematical problem solving
– Playing games like chess
These tasks cannot be done by all people and can only be performed by skilled specialists.
Which of these tasks are easy and which ones are hard?
• Clearly tasks of the first type are easy for humans to perform, and almost all are able to
master them. The second range of tasks requires skill development and/or intelligence
and only some specialists can perform them well.
• However, when we look at what computer systems have been able to achieve to date, we
see that their achievements include performing sophisticated tasks like medical
diagnosis, performing symbolic integration, proving theorems and playing chess.
• On the other hand, it has proved to be very hard to make computer systems perform
many routine tasks that all humans and a lot of animals can do. Examples of such tasks
include navigating our way without running into things, catching prey and avoiding
predators.
• Humans and animals are also capable of interpreting complex sensory information. We
are able to recognize objects and people from the visual image that we receive. We are
also able to perform complex social functions.
A few famous AI systems
1. ALVINN: (Autonomous Land Vehicle in a Neural Network)
– In 1989, Dean Pomerleau at CMU created ALVINN. This is a system which
learns to control vehicles by watching a person drive. It contains a neural network
whose input is a 30x32 unit two-dimensional camera image. The output layer is a
representation of the direction the vehicle should travel.
– The system drove a car from the East Coast of the USA to the West Coast, a total of about 2850 miles. Of this, about 50 miles were driven by a human, and the rest solely by the system.

2. Deep Blue – In 1997, the Deep Blue chess program, created by IBM, beat the reigning
world chess champion, Garry Kasparov.

3. Machine translation – A system capable of translations between people speaking
different languages will be a remarkable achievement of enormous economic and cultural
benefit. Machine translation is one of the important fields of endeavour in AI. While
some translating systems have been developed, there is a lot of scope for
improvement in translation quality.

4. Internet agents – The explosive growth of the internet has also led to growing interest
in internet agents to monitor users' tasks, seek needed information, and to learn
which information is most useful.

5. Autonomous agents –
– In space exploration, robotic space probes autonomously monitor their
surroundings, make decisions and act to achieve their goals.

– NASA's Mars rovers successfully completed their primary three-month missions in April 2004. The Spirit rover had been exploring a range of Martian hills that took two months to reach, finding curiously eroded rocks that may be new pieces to the puzzle of the region's past. Spirit's twin, Opportunity, had been examining exposed rock layers inside a crater.

– Chandrayaan-3's soft landing by India: the time it takes for a signal to travel from the Moon to Earth is about 1.28 seconds, so a round trip from the Moon to Earth and back to the Moon takes about 2.56 seconds. The landing therefore had to be performed autonomously by the spacecraft.

6. Sophia – Hanson Robotics' Sophia is an incredibly advanced social-learning robot.
Through AI, Sophia can efficiently communicate with natural language and use facial
expressions to convey human-like emotions.

7. Covera Health – It is utilizing collaborative data sharing and applied clinical analysis to
reduce the number of misdiagnosed patients throughout the world. The company’s
proprietary technology utilizes a framework that combines advanced data science and AI
to sort through existing diagnostics to provide practitioners with more accurate symptom
data when making a decision that will have a major impact on a patient’s life.

8. Meta – The company's AI team trained an image recognition model to 85 percent accuracy using billions of public Instagram photos tagged with hashtags. The method is a breakthrough in computer vision modeling. Facebook is already using a combination of AI and human moderation to combat spam and abuse. With breakthroughs in image recognition and a doubling-down on AI research, Meta is counting on AI to help it police the world's largest media platform.

9. You can thank AI for the tweets you see on Twitter. The social media giant’s algorithms
suggest people to follow, tweets and news based on a user’s individual preferences.
Additionally, Twitter uses AI to monitor and categorize video feeds based on subject
matter. The company’s image cropping tool also uses AI to determine how to crop
images to focus on the most interesting part.

Twitter’s AI has also been put to work identifying hate speech and terroristic language
in tweets. In the first half of 2017, the company discovered and banned 300,000 terrorist-
linked accounts, 95 percent of which were found by non-human, artificially intelligent
machines.

10. Amazon – Amazon is the king of e-commerce AI. Whether it's the company’s
recommendations on which products to buy, the warehouse robots that grab, sort and
ship products or the web services that power the website itself, Amazon employs AI in
almost every step of its process. Simply put, if you've done anything at all on Amazon in
the last five years, an algorithm has helped you do it.
In 2014 the company introduced its AI-powered voice assistant, Alexa. Inspired by the
computers on Star Trek, Alexa ushered in a wave of powerful, conversation-driven
virtual assistants.

11. ChatGPT – https://chat.openai.com/
ChatGPT is definitely the hottest AI website that broke the internet in 2023. Developed by OpenAI, it is a web-based conversational AI chatbot. ChatGPT utilizes state-of-the-art language processing AI models and was trained using vast amounts of information - articles, books, web texts, Wikipedia, and other pieces of writing on the internet. It is capable of understanding and processing natural language. You can ask questions and ChatGPT will generate human-like responses to your queries.

ChatGPT is built on the GPT (Generative Pre-trained Transformer) architecture developed by OpenAI.

Mind Blowing AI websites


12. Fotor Background Remover – https://www.fotor.com/features/background-remover
Say goodbye to the hassle of manually removing backgrounds from photos! You can now
get it done automatically with Fotor background remover.
It uses a combination of image processing and machine learning techniques, such as image segmentation, deep learning and edge detection, to remove backgrounds from images.

13. Soundraw – https://soundraw.io/


It is an online AI music composition tool that allows you to create original and
customizable music easily and quickly.
General technologies commonly employed by AI music generation platforms are
Generative Models, Deep Learning, Music Theory Algorithms, Data Preprocessing,
Interactive Interfaces etc.

14. Midjourney - https://www.fotor.com/ai-art-generator/


Midjourney is one of the most popular and powerful artificial intelligence websites for image creation. It lets you generate stunning digital art and images from plain text. Just enter text to describe what you want, and Midjourney will produce a series of images based on your text prompts. You can use it to generate backgrounds, realistic photographs, paintings, 3D illustrations, logos, and a whole lot more. The possibilities are endless, and it is a lot of fun to play with.

Typically, such platforms leverage deep learning models, generative algorithms, or a
combination of techniques to generate or manipulate images.
Limits of AI today
1. AI can fail too: Machines can fail. For example, Microsoft was forced to disable its Tay
Chatbot, which was able to learn through its conversations with users, after Internet users
managed to make it racist and sexist in less than one day. Similarly, Facebook was forced
to admit failure when its bots reached a failure rate of 70% and started talking to each
other in a strange language only they understood.
2. AI needs big data: Machines are not suitable for all tasks. AI is very effective in rules-
based environments with lots of data to analyze. Its use is therefore relevant for things
such as autonomous cars, which drive in dense traffic governed by specific laws, or
finding the best price at which to resell a batch of shares.
On the other hand, to choose what to invest in, or to recommend products to new
customers without data to exploit, AI is less effective than humans. Lack of rules or data
prevents AI from performing well. The existing AI models require large amounts of task-
specific training data such as ImageNet and CIFAR-10 image databases, composed of 1.2
million and 60 thousand data points (labeled images), respectively. Labeling these data is
often tedious, slow, and expensive, undermining the central purpose of AI.
3. AI needs a dedicated computational infrastructure: All AI system’s successes use a
specific hardware infrastructure dedicated to the AI task to be solved. For instance,
Google DeepMind's AlphaGo system, which crushed 18-time world champion Lee Sedol
and the reigning world number one Go player, Ke Jie, was trained using 64 GPUs
(graphics processing units) and 19 CPUs (central processing units).
While the human brain, the smartest system we know in the universe, is remarkably low
in power consumption, computers are still far from matching the energy efficiency of the
brain. A typical adult human brain only consumes around the equivalent of 0.3 kilowatt-hours.
4. AI does not understand causal reasoning: AI algorithms, as currently designed, do not
take into account the relationship between cause and effect. They replace causal
reasoning with reasoning by association.
5. AI is vulnerable to adversarial attacks: Adversarial attacks are like optical illusions for
machines. They are intentionally designed to cause the model to make a mistake. These
attacks add noise of small amplitude in the data submitted as input to the AI algorithm in
order to mislead these algorithms, forcing them to predict a wrong answer.
Conclusion: Though AI is seen by many as the next big thing, it still has severe limitations
that may have unforeseen and potentially disastrous consequences if it is not implemented in
the correct fashion. With the intensity of research and development currently being
undertaken in this sector, we will likely see advancements to counteract many of these
factors, expanding the potential applications of AI significantly.
References
1. https://www.prescouter.com/2018/06/the-limitations-of-ai-today/

2. https://builtin.com/artificial-intelligence/examples-ai-in-industry
3. DAY 9 - Thinking Humanly: The cognitive modeling approach - Artificial Intelligence (gopichandrakesan.com)

Session 2
Chapter 1: Introduction of Artificial and Computational Intelligence
Contents
As per syllabus
(1) Introduction to Intelligent Agents: Notion of agents and environments, Task
Environments: Elements, examples, properties with examples, Structure of Agents
As per session plan
(2) Intelligent Agents: Notion of agents and environments, rational agents, Omniscience vs.
Rationality – Task Environments, Structure of Agents
What is an Agent?
In the context of the AI field, an “agent” is an independent program or entity that interacts with
its environment by perceiving its surroundings via sensors, then acting through actuators or
effectors.
Introduction to Agent
An agent perceives its environment through sensors.
The complete set of inputs at a given time is called a percept.
The current percept, or a sequence of percepts can influence the actions of an agent.
The agent can change the environment through actuators or effectors.
An operation involving an effector is called an action. Actions can be grouped into action
sequences.
The agent can have goals which it tries to achieve. Thus, an agent can be looked upon as a
system that implements a mapping from percept sequences to actions.
A performance measure has to be used in order to evaluate an agent.
An autonomous agent decides autonomously which action to take in the current situation to
maximize progress towards its goals. To the extent that an agent relies on the prior knowledge
of its designer rather than on its own percepts, we say that the agent lacks autonomy.

We use the term percept to refer to the agent's perceptual inputs at any given instant.
An agent's percept sequence is the complete history of everything the agent has ever perceived.
An agent's choice of action at any given instant can depend on the entire percept sequence observed to date.
An agent's behavior is described by the agent function that maps any given percept sequence to an action.

Performance of an Agent
An agent function implements a mapping from perception history to action. The behavior and
performance of intelligent agents have to be evaluated in terms of the agent function.
The ideal mapping specifies which actions an agent ought to take at any point in time.
The performance measure is a subjective measure to characterize how successful an agent is.
The success can be measured in various ways. It can be measured in terms of speed or efficiency
of the agent. It can be measured by the accuracy, or the quality of the solutions achieved by the
agent. It can also be measured by power usage, money, etc.
Examples of Agent
Humans can be looked upon as agents. They have eyes, ears, skin, taste buds, etc. for sensors;
and hands, fingers, legs, mouth for effectors.
Robots (hardware) are agents. Robots may have camera, sonar, infrared, bumper, etc. for
sensors. They can have grippers, wheels, lights, speakers, etc. for actuators. Examples:

the Xavier robot (CMU), COG at the MIT Museum, Aibo from Sony, Sophia (2018), and self-driving cars.
We also have software agents or softbots that have some functions as sensors and some
functions as actuators. Askjeeves.com is an example of a softbot.
Expert systems, such as a cardiologist expert system, are agents.
Autonomous spacecrafts are agents.

Intelligent Agent
An Intelligent Agent must sense, must act, must be autonomous (to some extent). It also must be
rational.
AI is about building rational agents.
A rational agent always does the right thing. For each possible percept sequence, a rational
agent should select an action that is expected to maximize its performance measure, given the
evidence provided by the percept sequence and whatever built-in knowledge the agent has.
Rational Action is the action that maximizes the expected value of the performance measure
given the percept sequence to date.
Rationality maximizes expected performance, while perfection maximizes actual
performance.
Following are the main four rules for an AI agent:
Rule 1: An AI agent must have the ability to perceive the environment.
Rule 2: The Observation must be used to make decisions.
Rule 3: Decision should result in an action.
Rule 4: The action taken by an AI agent must be a rational action.

Rationality
Perfect Rationality assumes that the rational agent knows all and will take the action that maximizes its utility. Human beings do not satisfy this definition of rationality.
However, a rational agent is not omniscient (knowing everything). It does not know the actual outcome of its actions, and it may not know certain aspects of its environment. Therefore rationality must take into account the limitations of the agent. The agent has to select the best action to the best of its knowledge, depending on its percept sequence, its background knowledge and its feasible actions. An agent also has to deal with the expected outcome of actions whose effects are not deterministic.
Bounded Rationality
“Because of the limitations of the human mind, humans must use approximate methods to handle
many tasks.” Herbert Simon, 1972
Evolution did not give rise to optimal agents, but to agents which are in some senses locally optimal at best. In 1957, Simon proposed the notion of Bounded Rationality: that property of an agent that behaves in a manner that is as nearly optimal with respect to its goals as its resources will allow.
Under these premises, an intelligent agent will be expected to act optimally to the best of its abilities and its resource constraints.
Agent Environment or Task Environment
A task environment in artificial intelligence is the surrounding of the agent. The agent takes input from the environment through sensors and delivers the output to the environment through actuators. There are several types of environments:
 Fully Observable vs Partially Observable
 Deterministic vs Stochastic
 Competitive vs Collaborative
 Single-agent vs Multi-agent
 Static vs Dynamic
 Discrete vs Continuous
 Episodic vs Sequential
 Known vs Unknown


Environments in which agents operate can be defined in different ways.
Observability
 In terms of observability, an environment can be characterized as fully observable or
partially observable.
 In a fully observable environment, all of the environment relevant to the action being
considered is observable. In such environments, the agent does not need to keep track of
the changes in the environment. A chess playing system is an example of a system that
operates in a fully observable environment.
 In a partially observable environment, the relevant features of the environment are only
partially observable. Driving – the environment is partially observable because what’s
around the corner is not known.

Determinism
When the agent's current state and chosen action completely determine the next state of the environment, the environment is said to be deterministic.
A stochastic environment is random in nature; the next state is not unique and cannot be completely determined by the agent.
Examples:

 Chess: there are only a limited number of possible moves for a piece in the current state, and these moves can be determined.
 Self-Driving Cars: the actions of a self-driving car are not unique; they vary from time to time.

Competitive vs Collaborative
An agent is said to be in a competitive environment when it competes against another agent to
optimize the output. The game of chess is competitive as the agents compete with each other to
win the game which is the output.
An agent is said to be in a collaborative environment when multiple agents cooperate to produce the desired output. When multiple self-driving cars are found on the roads, they cooperate with each other to avoid collisions and reach their destination, which is the desired output.
Single-agent vs Multi-agent
An environment consisting of only one agent is said to be a single-agent environment. A person left alone in a maze is an example of a single-agent system.
An environment involving more than one agent is a multi-agent environment. The game of
football is multi-agent as it involves 11 players in each team.
Dynamic vs Static
Static Environment: does not change from one state to the next while the agent is considering
its course of action. Example: An empty house is static as there’s no change in the surroundings
when an agent enters.
Dynamic Environment: A Dynamic Environment changes over time independent of the actions
of the agent -- and thus if an agent does not respond in a timely manner, this counts as a choice to
do nothing. Example: A roller coaster ride is dynamic as it is set in motion and the environment
keeps changing every instant.
If the environment itself does not change with the passage of time but the agent's performance score does, then we say the environment is semi-dynamic.
Example: Chess, when played with a clock.
Episodic vs Sequential
 An episodic environment means that subsequent episodes do not depend on what actions
occurred in previous episodes.
 In a sequential environment, the agent engages in a series of connected episodes.
Continuity
 If the number of distinct percepts and actions is limited, the environment is discrete
(Chess playing), otherwise it is continuous (Taxi driving).

Known vs Unknown
In a known environment, the outcomes of all probable actions are given. Obviously, in the case of an unknown environment, for an agent to make a decision, it has to gain knowledge about how the environment works.
Agent Architecture
According to the architecture agents can be classified as follows
 Table based agent

 Percept based agent or reflex agent
 Subsumption Architecture
 State-based Agent or model-based reflex agent
 Goal-based Agent
 Utility-based Agent
 Learning Agent

Table based Agent


 In a table-based agent, the action is looked up from a table based on information about the agent's percepts.
 If the environment has n variables, each with t possible states, then the table size is t^n.
 A table is a simple way to specify a mapping from percepts to actions.
 The mapping is implicitly defined by a program. The mapping may be implemented by a rule-based system, by a neural network or by a procedure.
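
As a minimal sketch (the percepts, actions and table entries below are made up for illustration), a table-driven agent simply looks up its percept sequence in the table:

# Table-driven agent sketch: the percept names, actions and table
# entries are hypothetical. A real table needs one entry per possible
# percept sequence, which is why such tables grow as t^n.
table = {
    ("dirty",): "suck",
    ("clean",): "move",
    ("clean", "dirty"): "suck",
}

percepts = []  # the percept sequence observed so far

def table_driven_agent(percept):
    # Append the new percept and look up the whole sequence.
    percepts.append(percept)
    return table.get(tuple(percepts), "no-op")

print(table_driven_agent("dirty"))  # -> suck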

Disadvantages
 The tables may become very large.
 Learning a table may take a very long time, especially if the table is large.
 Such systems usually have little autonomy, as all actions are pre-determined.

Percept Based Agent or Reflex Agent


• In percept-based agents,
o information comes from sensors - percepts;
o a percept changes the agent's current state of the world;
o this triggers actions through the effectors.
Such agents are called reactive agents or stimulus-response agents. Reactive agents have no notion of history. The current state is as the sensors see it right now. The action is based on the current percepts only.
Characteristics of percept-based agents.
 Efficient

 No internal representation for reasoning, inference.
 No strategic planning, learning.
 Percept-based agents are not good for multiple, opposing, goals.

Problems with simple reflex agents:


 Very limited intelligence.
 No knowledge of non-perceptual parts of the state.
 Usually too big to generate and store.
 If there occurs any change in the environment, then the collection of rules need to be
updated.
Example: A thermostat that turns on the air conditioner when the current temperature exceeds a
certain threshold is a simple reflex agent.
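
A minimal sketch of this thermostat example (the 25-degree threshold is an arbitrary choice for illustration); the action depends only on the current percept, with no memory of earlier percepts:

THRESHOLD = 25.0  # degrees Celsius; arbitrary cutoff for illustration

def simple_reflex_agent(current_temperature):
    # Condition-action rule applied to the current percept only.
    if current_temperature > THRESHOLD:
        return "turn_on_air_conditioner"
    return "do_nothing"

print(simple_reflex_agent(28.0))  # -> turn_on_air_conditioner
print(simple_reflex_agent(22.0))  # -> do_nothing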


Subsumption Architecture
• Proposed by Rodney Brooks, 1986.
• Based on reactive systems.
• Brooks notes that in lower animals there is no deliberation, and the actions are based on
sensory inputs. But even lower animals are capable of many complex tasks.
• Subsumption has been widely influential in autonomous robotics and elsewhere in real-
time AI.
• The main features of Brooks’ architecture are:
 There is no explicit knowledge representation.
 Behavior is distributed, not centralized.
 Response to stimuli is reflexive.
 The design is bottom up, and complex behaviors are fashioned from the
combination of simpler underlying ones.
 Individual agents are simple.

State Based Agent or Model Based Reflex Agent


 State based agents differ from percept-based agents in that such agents maintain some
sort of state based on the percept sequence received so far. The state is updated
regularly based on what the agent senses, and the agent’s actions. Keeping track of the
state requires that the agent has knowledge about how the world evolves, and how the
agent’s actions affect the world.
 The agent has knowledge about "how the world works"; in other words, it maintains an internal model of the environment.
 This model may be implemented in simple Boolean circuits or in complete scientific theories; it is called the model of the world.
 Thus, a state-based agent works as follows:
o Information comes from sensors - percepts.
o based on this, the agent changes the current state of the world
o based on state of the world and knowledge (memory), it triggers actions
through the effectors
 Example: A chess-playing AI that considers the history of moves and the current board
state to decide the next move is a model-based agent.
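
A minimal sketch of this scheme (the percepts and the state-update rule are stand-ins for real knowledge of "how the world works"): the agent keeps an internal state, updates it from each percept, and chooses an action from the state rather than from the raw percept:

class ModelBasedReflexAgent:
    def __init__(self):
        self.state = {}  # internal model of the world (memory)

    def update_state(self, percept):
        # Stand-in world model: remember the latest reading per sensor.
        self.state.update(percept)

    def choose_action(self):
        # Stand-in condition-action rules over the internal state.
        if self.state.get("obstacle_ahead"):
            return "turn"
        return "move_forward"

    def step(self, percept):
        self.update_state(percept)
        return self.choose_action()

agent = ModelBasedReflexAgent()
print(agent.step({"obstacle_ahead": False}))  # -> move_forward
print(agent.step({"obstacle_ahead": True}))   # -> turn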


Goal Based Agent


 Knowledge about the current state of the environment is not always enough to decide
what to do.
 The goal-based agent has some goal which forms a basis of its actions.
 Such agents work as follows:
 Information comes from sensors - percepts.
 Changes the agents current state of the world.
 Based on state of the world and knowledge (memory) and goals/intentions, it
chooses actions and does them through the effectors.
 Goal formulation based on the current situation is a way of solving many problems and
search is a universal problem-solving mechanism in AI. The sequence of steps required to
solve a problem is not known a priori and must be determined by a systematic
exploration of the alternatives (planning).


Breakout is a classic arcade game in which the player controls a paddle at the bottom of the screen. The goal of the game is to use the paddle to hit a ball and prevent it from falling off the screen by bouncing it back into play. The ball bounces off the walls as well as off the paddle, and the player can move the paddle left and right to try and hit the ball.
The game is over if the ball falls off the screen. The player can earn points by hitting bricks at the top of the screen with the ball. The game gets progressively more difficult as the player progresses, and the ball moves faster.

In this game, the goal-based agent aims to destroy all the bricks in order to
gain maximum reward. To achieve this goal, the agent must use its paddle to hit the ball and
destroy all the bricks. Additionally, the agent needs to continuously evaluate its environment
and take actions likely to lead it closer to its goal. Here, the agent learns by exploring the
environment and setting rules for maximizing the reward.
Utility Based Agent
 Goals alone are not really enough to generate high-quality behavior in most
environments.
 Utility based agents provide a more general agent framework. In case that the agent has
multiple goals, this framework can accommodate different preferences for the different
goals.
 Such systems are characterized by a utility function that maps a state or a sequence of
states to a real valued utility. The agent acts so as to maximize the expected utility.
 A self-driving car, for instance, has many goals to consider when heading toward its
destination: choosing the quickest route, ensuring the safety of its passengers, avoiding
road closures or traffic jams, among others.
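
A minimal sketch of the self-driving-car example (the routes, attributes and weights below are made up): the agent maps each candidate state to a real-valued utility and picks the maximum:

# Hypothetical candidate routes; the utility weights are illustrative.
routes = [
    {"name": "highway", "time_min": 30, "risk": 0.20},
    {"name": "city",    "time_min": 45, "risk": 0.05},
]

def utility(route):
    # Prefer fast routes, but penalize risk heavily.
    return -route["time_min"] - 200.0 * route["risk"]

best = max(routes, key=utility)
print(best["name"])  # -> city (safety outweighs speed at these weights)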

Learning Agent
Learning allows an agent to operate in initially unknown environments. The learning element
modifies the performance element. Learning is required for true autonomy.
It has four main parts as follows:
1. Learning Element
2. Performance Element
3. Critic
4. Problem Generator
Example: A spam filter that learns from user feedback.

Conclusions
Looking ahead, the horizon for agents is boundless. As technology advances, we can anticipate
even more sophisticated agents reshaping the landscape of industries, making processes more
efficient, and augmenting human capabilities.

References
Agents in Artificial Intelligence - GeeksforGeeks: https://www.geeksforgeeks.org/agents-artificial-intelligence/

Session 3
Chapter 1: Introduction of Artificial and Computational Intelligence
Contents
As per syllabus
(1) Uninformed Search: Problem Formulation, Algorithms: BFS, Uniform cost Search,
DFS, Depth Limited Search, Iterative Deepening Search, Bidirectional Search,
Comparisons
As per session plan:
Problem Solving Agent using Uninformed Search:
(2) Problem Formulation – Examples
(3) Algorithms: BFS, Uniform cost Search, DFS, Depth Limited Search, Iterative Deepening
Search, Bidirectional Search, Comparisons.
State Space Search
A state space is the set of all possible configurations of a system.
Intelligent agents can solve problems by searching for a state-space.
State-space Model
 The agent’s model of the world
 Usually, a set of discrete states
 In driving, the states in the model could be towns/cities.

State Space Search


Formulate a problem as a state space search by showing the legal problem states, the legal
operators, and the initial and goal states.
A state is defined by the specification of the values of all attributes of interest in the world.
An operator changes one state into the other; it has a precondition which is the value of certain
attributes prior to the application of the operator, and a set of effects, which are the attributes
altered by the operator.
The initial state is where you start.
The goal state is the partial description of the solution.

A plan is a sequence of actions. The cost of a plan is referred to as the path cost. The path cost is
a positive number, and a common path cost may be the sum of the costs of the steps in the path.
Problem formulation means choosing a relevant set of states to consider, and a feasible set of
operators for moving from one state to another.
Search is the process of considering various possible sequences of operators applied to the initial
state, and finding out a sequence which culminates in a goal state.
S: the full set of states
s0: the initial state
A: S → S is a set of operators
G is the set of final states. Note that G ⊆ S.
The Search Problem is to find a sequence of actions which transforms the agent from the initial state to a goal state g ∈ G. A search problem is represented by a 4-tuple {S, s0, A, G}.

This sequence of actions is called a solution plan. It is a path from the initial state to a goal state.
A plan P is a sequence of actions.
P = {a0, a1, …, aN}, which leads to traversing a sequence of states {s0, s1, …, sN+1}, where sN+1 ∈ G.
A sequence of states is called a path. The cost of a path is a positive number. In many cases the path cost is computed by taking the sum of the costs of each action.
Representation of Search Problem
A search problem is represented using a directed graph.
 The states are represented as nodes.
 The allowed actions are represented as arcs.
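
As a minimal sketch (state names and step costs made up), such a directed graph can be written as an adjacency dictionary, with states as nodes and each allowed action as an arc carrying its step cost:

# States are nodes; each allowed action is an arc with a step cost.
GRAPH = {
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 12},
    "B": {"G": 3},
    "G": {},          # goal state: no outgoing arcs needed here
}
START, GOAL = "S", "G"
# The 4-tuple {S, s0, A, G} is then: set of states = GRAPH.keys(),
# s0 = START, operators = the arcs, and G = {GOAL}.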

Searching Process
Do until a solution is found or the state space is exhausted.
(1) Check the current state
(2) Execute allowable actions to find the successor states.
(3) Pick one of the new states.
(4) Check if the new state is a solution state
If it is not, the new state becomes the current state and the process is repeated

Illustration of a Search Process

Pegs and Disks problem


The initial state

The goal state

8 puzzle

Tic-Tac-Toe Game


Traveling in Romania

A problem is defined by four items:


1. initial state e.g., "at Arad"

2. actions or successor function


a. S(x) = set of action-state pairs
b. e.g., S(Arad) = {<Arad → Zerind, Zerind>, …}

3. goal test (or set of goal states)


e.g., x = "at Bucharest", Checkmate(x)

4. path cost (additive)
a. e.g., sum of distances, number of actions executed, etc.
b. c(x,a,y) is the step cost, assumed to be ≥ 0

A solution is a sequence of actions leading from the initial state to a goal state

Tree based Search
Basic idea:
 Exploration of state space by generating successors of already explored states (expanding
states).
 Every state is evaluated: is it a goal state?


Search Tree for the 8 puzzle problem

Search Strategies
A search strategy is defined by picking the order of node expansion.
As per the syllabus, I will discuss the following uninformed search strategies in this chapter:
 Breadth First Search (BFS)
 Uniform cost Search
 Depth First Search (DFS)
 Depth Limited Search
 Iterative Deepening Search
 Bidirectional Search
Uninformed search is a class of general-purpose search algorithms which operate in a brute-force way. Uninformed search algorithms have no additional information about states or the search space other than how to traverse the tree, so this is also called blind search.

Breadth First Search (BFS)
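BFS expands the shallowest unexpanded node first, using a FIFO queue; it is complete, and it is optimal when all step costs are equal, since it always reaches a shallowest goal first. A minimal sketch (graph made up):

from collections import deque

def bfs(graph, start, goal):
    # Expand shallowest nodes first: FIFO queue of partial paths.
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for successor in graph[node]:
            if successor not in visited:
                visited.add(successor)
                frontier.append(path + [successor])
    return None  # state space exhausted: no solution

graph = {"S": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["G"], "G": []}
print(bfs(graph, "S", "G"))  # -> ['S', 'A', 'C', 'G']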

Uniform Cost Search (UCS)
This algorithm comes into play when a different cost is available for each edge.
The primary goal of the uniform-cost search is to find a path to the goal node which has the
lowest cumulative cost.
Uniform-cost search expands nodes according to their path costs form the root node.
It can be used to solve any graph/tree where the optimal cost is in demand.
A uniform-cost search algorithm is implemented by the priority queue. It gives maximum
priority to the lowest cumulative cost.
Uniform cost search is equivalent to BFS algorithm if the path cost of all edges is the same.

Completeness: Uniform-cost search is complete: if there is a solution, UCS will find it.
Time Complexity: Let C* be the cost of the optimal solution and ε the smallest step cost toward the goal. Then the number of steps is at most C*/ε + 1 (the +1 because we start from state 0 and end at C*/ε). Hence, the worst-case time complexity of uniform-cost search is O(b^(1+⌊C*/ε⌋)).
Space Complexity: By the same logic, the worst-case space complexity of uniform-cost search is O(b^(1+⌊C*/ε⌋)).
Optimal: Uniform-cost search is always optimal as it only selects a path with the lowest path
cost.

Advantages: It is optimal.
Disadvantages: It does not care about the number of steps involved in searching and is only concerned with path cost, due to which this algorithm may get stuck in an infinite loop.
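
A minimal sketch of UCS (graph and costs made up), using Python's heapq as the priority queue keyed on cumulative path cost g(n):

import heapq

def uniform_cost_search(graph, start, goal):
    # Priority queue ordered by cumulative path cost.
    frontier = [(0, start, [start])]   # (path_cost, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for successor, step_cost in graph[node].items():
            if successor not in explored:
                heapq.heappush(frontier,
                               (cost + step_cost, successor, path + [successor]))
    return None

graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 12},
         "B": {"G": 3}, "G": {}}
print(uniform_cost_search(graph, "S", "G"))  # -> (6, ['S', 'A', 'B', 'G'])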

Depth First Search (DFS)

Completeness: The DFS algorithm is complete within a finite state space, as it will expand every node within a bounded search tree.
Time Complexity: The time complexity of DFS is equivalent to the number of nodes traversed by the algorithm. It is given by:
T(b) = 1 + b + b^2 + b^3 + … + b^m = O(b^m)
where m = maximum depth of any node (this can be much larger than d, the shallowest solution depth) and b = branching factor.
Space Complexity: The DFS algorithm needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(b·m).
Optimal: The DFS algorithm is non-optimal, as it may take a large number of steps or incur a high cost to reach the goal node.
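
A minimal recursive sketch (graph made up): DFS follows one branch as deep as possible before backtracking, so the path it returns is not necessarily the cheapest:

def dfs(graph, start, goal, visited=None):
    # Depth-first: recurse into the first unvisited successor.
    if visited is None:
        visited = set()
    visited.add(start)
    if start == goal:
        return [start]
    for successor in graph[start]:
        if successor not in visited:
            subpath = dfs(graph, successor, goal, visited)
            if subpath:
                return [start] + subpath
    return None

graph = {"S": ["A", "B"], "A": ["C"], "B": [], "C": ["G"], "G": []}
print(dfs(graph, "S", "G"))  # -> ['S', 'A', 'C', 'G']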

Depth Limited Search (DLS)
Like Depth first search, but the search is limited to a predefined depth.

The depth of each state is recorded as it is generated. When picking the next state to expand, only those with depth less than or equal to the current depth limit are expanded.
It is a combination of DFS and BFS.
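
A minimal sketch (graph made up; it assumes an acyclic graph, since no visited set is kept): DLS is DFS that refuses to expand nodes beyond the depth limit:

def depth_limited_search(graph, node, goal, limit):
    # DFS with a cutoff: stop recursing when the limit reaches 0.
    if node == goal:
        return [node]
    if limit == 0:
        return None   # cutoff: the goal may lie deeper
    for successor in graph[node]:
        subpath = depth_limited_search(graph, successor, goal, limit - 1)
        if subpath:
            return [node] + subpath
    return None

graph = {"S": ["A"], "A": ["B"], "B": ["G"], "G": []}
print(depth_limited_search(graph, "S", "G", 2))  # -> None (goal below limit)
print(depth_limited_search(graph, "S", "G", 3))  # -> ['S', 'A', 'B', 'G']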

Iterative Deepening Search (IDS)
The iterative deepening algorithm is a combination of DFS and BFS algorithms. This search
algorithm finds out the best depth limit and does it by gradually increasing the limit until a goal
is found.
This algorithm performs depth-first search up to a certain "depth limit", and it keeps increasing
the depth limit after each iteration until the goal node is found.
This Search algorithm combines the benefits of Breadth-first search's fast search and depth-
first search's memory efficiency.
The iterative search algorithm is useful for uninformed search when search space is large, and
depth of goal node is unknown.
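
A minimal sketch (graph made up): IDS reruns a depth-limited DFS with limits 0, 1, 2, … until the goal is found, so it finds a shallowest goal while storing only one path at a time:

def iterative_deepening_search(graph, start, goal, max_depth=50):
    def dls(node, limit):
        # Inner depth-limited DFS, as in the previous sketch.
        if node == goal:
            return [node]
        if limit == 0:
            return None
        for successor in graph[node]:
            subpath = dls(successor, limit - 1)
            if subpath:
                return [node] + subpath
        return None

    for limit in range(max_depth + 1):   # gradually increase the limit
        path = dls(start, limit)
        if path:
            return path
    return None

graph = {"S": ["A", "B"], "A": ["G"], "B": [], "G": []}
print(iterative_deepening_search(graph, "S", "G"))  # -> ['S', 'A', 'G']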

Bi-directional Search

Expand nodes from the start and goal state simultaneously. Check at each stage if the nodes of
one have been generated by the other. If so, the path concatenation is the solution.

Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
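
A minimal sketch using BFS from both ends (undirected graph made up): the two frontiers are expanded alternately, and the search stops when a node generated by one side has already been generated by the other:

from collections import deque

def bidirectional_search(graph, start, goal):
    if start == goal:
        return [start]
    # Each side records, for every node it has generated, its predecessor.
    fwd_parents, bwd_parents = {start: None}, {goal: None}
    fwd_frontier, bwd_frontier = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        node = frontier.popleft()
        for successor in graph[node]:
            if successor not in parents:
                parents[successor] = node
                if successor in other_parents:
                    return successor        # the frontiers meet here
                frontier.append(successor)
        return None

    while fwd_frontier and bwd_frontier:
        meet = expand(fwd_frontier, fwd_parents, bwd_parents)
        if meet is None:
            meet = expand(bwd_frontier, bwd_parents, fwd_parents)
        if meet is not None:
            # Concatenate the two half-paths at the meeting node.
            path, n = [], meet
            while n is not None:
                path.append(n)
                n = fwd_parents[n]
            path.reverse()
            n = bwd_parents[meet]
            while n is not None:
                path.append(n)
                n = bwd_parents[n]
            return path
    return None

graph = {"S": ["A"], "A": ["S", "B"], "B": ["A", "G"], "G": ["B"]}
print(bidirectional_search(graph, "S", "G"))  # -> ['S', 'A', 'B', 'G']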

Comparison of Uninformed Search Strategies
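A standard summary of these strategies (b = branching factor, d = depth of the shallowest solution, m = maximum depth, l = depth limit, C* = optimal cost, ε = minimum step cost):

Criterion   BFS       Uniform-Cost        DFS      Depth-Limited   Iterative Deepening   Bidirectional
Complete?   Yes       Yes                 No       No              Yes                   Yes
Optimal?    Yes*      Yes                 No       No              Yes*                  Yes*
Time        O(b^d)    O(b^(1+⌊C*/ε⌋))     O(b^m)   O(b^l)          O(b^d)                O(b^(d/2))
Space       O(b^d)    O(b^(1+⌊C*/ε⌋))     O(b·m)   O(b·l)          O(b·d)                O(b^(d/2))

* Optimal when all step costs are equal.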

Session 4
Contents
As per syllabus
(1) Problem Solving Agent using Search: Informed Search : Notion of Heuristics, Greedy
best first search, A* search , Optimality of A* , Memory Bounded Heuristic Search
As per session plan:
Problem Solving Agent using Informed Search:
(2) Notion of Heuristics
(3) Algorithms: Greedy best first search, A* search, Optimality of A*, Memory Bounded Heuristic Search.
Informed Search
We have seen uninformed search methods that systematically explore the state space and find the goal. They are inefficient in most cases. Informed search methods use problem-specific knowledge and may be more efficient.
Informed search refers to search algorithms that use additional knowledge or information to
guide the search process more effectively.
At the heart of such algorithms there is the concept of a heuristic function.
Heuristic means “rule of thumb”.
In heuristic search or informed search, heuristics are used to identify the most promising search
path.

Heuristic Function
Heuristic is a word from the Greek heuriskein meaning "to discover".
Heuristic Function is a function that estimates the cost of getting from one place to another
(from the current state to the goal state). Also called as simply a heuristic.
Used in a decision process to try to make the best choice of a list of possibilities (to choose the
move more likely to lead to the goal state). The best move is the one with the least cost.
It can also be defined thus as a function that ranks alternatives in search algorithms at each
branching step based on available information to decide which branch to follow.
For example: The problem might be finding the shortest driving distance to a point. A heuristic
cost would be the Straight-Line Distance to the point. The true distance would likely be higher.
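
A minimal sketch of that straight-line-distance heuristic (the city coordinates below are made up): h(n) is the Euclidean distance from a city to the goal, an optimistic estimate of the true road distance:

import math

# Hypothetical (x, y) map coordinates of cities.
coords = {"Arad": (91, 492), "Sibiu": (207, 457), "Bucharest": (400, 327)}

def straight_line_distance(city, goal="Bucharest"):
    # h(n): Euclidean distance from city to the goal city.
    (x1, y1), (x2, y2) = coords[city], coords[goal]
    return math.hypot(x2 - x1, y2 - y1)

print(round(straight_line_distance("Arad")))  # an estimate; the true road distance is higher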

Types of Heuristics
There are different types of heuristics that people use as a way to solve
a problem or to learn something.
1. Affect heuristic.
2. Anchoring and adjustment heuristic
3. Availability heuristic
4. Common sense heuristic
5. Familiarity heuristic
6. Representativeness heuristic
Reference: https://examples.yourdictionary.com/examples-of-heuristics.html

Affect Heuristic
When you apply affect heuristic, you view a situation quickly and decide without further
research whether a thing is good or bad. This can also be described as an impulsive or emotional
decision.
A person stuck in traffic makes an impulsive decision to take another route even though they don't know the way.
Anchoring and adjustment Heuristic
When you use an anchoring and adjustment heuristic, you use a starting point to anchor your
point or judgment, but then you adjust your information based on new evidence.
A salesman initially offering a high price and eventually arriving at a fair value with the
customer.
Availability Heuristic
When you use an availability heuristic, you use the information available to you to make the best
guess or decision possible.
Guessing the population of the city you live in even though you have never looked up the exact
number of people.
Common sense Heuristic
The common sense heuristic is a practical and prudent approach that is applied to a decision.
If it is raining outside, you should bring an umbrella.
Familiarity Heuristic
The familiarity heuristic is when something, someone or somewhere familiar is favored over the
unknown.
A group is deciding between a new restaurant and a restaurant they have been to many times and
ultimately goes to the restaurant they usually go to.
Representativeness Heuristic
Making a judgment about the likelihood of an event or fact based on preconceived notions or
memories of a prototype, stereotype or average.
Assuming someone is arrogant and self-absorbed because they are reserved, quiet and rarely
interact with people.

Best first search (BFS)
Best-first search is a search algorithm which explores a graph by expanding the most
promising node chosen according to a specified rule.
The algorithm maintains two lists, one containing a list of candidates yet to explore (OPEN),
and one containing a list of visited nodes (CLOSED).
Since all unvisited successor nodes of every visited node are included in the OPEN list, the
algorithm is not restricted to only exploring successor nodes of the most recently visited node. In
other words, the algorithm always chooses the best of all unvisited nodes that have been graphed,
rather than being restricted to only a small subset, such as immediate neighbours.
Other search strategies, such as depth-first and breadth-first, have this restriction. The advantage
of this strategy is that if the algorithm reaches a dead-end node, it will continue to try other
nodes.
1. Define a list, OPEN, consisting solely of a single node, the start node, s0.
2. IF the list is empty, return failure.
3. Remove from the list the node n with the best score (the node where the evaluation function
f is minimal), and move it to a list, CLOSED.
4. Expand node n.
5. IF any successor to n is the goal node, return success and the solution (by tracing the
path from the goal node back to s0).
6. FOR each successor node:
1. apply the evaluation function, f, to the node.
2. IF the node has not been in either list, add it to OPEN.
7. Loop: return to step 2.
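
These steps translate almost directly into code. Below is a minimal Python sketch of the procedure (the function names and the priority-queue representation of OPEN are illustrative assumptions, not from the original slides; here the goal test is applied when a node is removed from OPEN):

import heapq
from itertools import count

def best_first_search(start, goal_test, successors, f):
    """Generic best-first search following the OPEN/CLOSED scheme above.

    successors(state) yields neighbor states; f(state) is the evaluation
    function (lower is better). Returns a path to a goal, or None on failure.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    open_list = [(f(start), next(tie), start, [start])]  # OPEN as a min-heap
    closed = set()                                       # CLOSED
    while open_list:
        _, _, node, path = heapq.heappop(open_list)      # best f-score first
        if node in closed:
            continue
        closed.add(node)
        if goal_test(node):
            return path
        for succ in successors(node):
            if succ not in closed:
                heapq.heappush(open_list, (f(succ), next(tie), succ, path + [succ]))
    return None  # OPEN exhausted: failure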

Properties of Best first search (BFS)
Complete?
 Not unless it keeps track of all states visited.
• Otherwise, can get stuck in loops (just like DFS)
Optimal?
 No – we just saw a counter-example.
Time?
 O(b^m) – can generate all nodes at depth m before finding a solution.
 m = maximum depth of the search space
Space?
 O(b^m) – again, in the worst case it can generate all nodes at depth m before finding a solution.

Best first search or Greedy Best first search


"Best-First Search" is a more general term referring to any search algorithm that selects nodes
based on a heuristic evaluation.
"Greedy Best-First Search" is a specific instance of Best-First Search that always chooses the
node that seems most promising based solely on the heuristic estimate, without considering the
cost of reaching that node.
Greedy Best first search
In greedy search, the idea is to expand the node with the smallest estimated cost to reach the
goal.
We use a heuristic function f(n) = h(n)
h(n) estimates the distance remaining to a goal.
Greedy algorithms often perform very well. They tend to find good solutions quickly, although
not always optimal ones.
The algorithm is also incomplete, and it may fail to find a solution even if one exists.


The path obtained is A-B-E-G-H and its cost is 99


Clearly this is not the optimal path. The path A-B-C-F-H has a cost of 39.

A* Search
This algorithm was invented by Hart, Nilsson & Raphael in 1968.
Expand node based on estimate of total path cost through node
Evaluation function f(n) = g(n) + h(n)
g(n) = cost so far to reach n
h(n) = estimated cost from n to goal
f(n) = estimated total cost of path through n to goal
f(n) = actual distance so far + estimated distance remaining
The efficiency of the search will depend on the quality of the heuristic h(n).
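
As a rough illustration, the evaluation f(n) = g(n) + h(n) can be plugged into the same best-first skeleton. A minimal Python sketch follows (assumptions: successors(s) yields (neighbor, step_cost) pairs, h is the heuristic, and all names are illustrative):

import heapq
from itertools import count

def a_star(start, goal_test, successors, h):
    """Minimal A* sketch. Returns (path, cost) or None."""
    tie = count()                       # tie-breaker for the heap
    frontier = [(h(start), next(tie), 0, start, [start])]  # (f, tie, g, state, path)
    best_g = {start: 0}                 # cheapest g found so far per state
    while frontier:
        f, _, g, s, path = heapq.heappop(frontier)
        if goal_test(s):
            return path, g
        if g > best_g.get(s, float("inf")):
            continue                    # stale queue entry, skip
        for nxt, cost in successors(s):
            g2 = g + cost               # g(n): cost so far to reach nxt
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier,
                               (g2 + h(nxt), next(tie), g2, nxt, path + [nxt]))
    return None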


Complete?
– Yes (unless there are infinitely many nodes with f ≤ f(G) )
Optimal?
– Yes
– Also optimally efficient:
• No other optimal algorithm will expand fewer nodes, for a given heuristic
Time?
– Exponential in worst case
Space?
– Exponential in worst case


Optimality of A* Search

Memory Bounded Heuristic Search
Idea: Try something like depth first search, but let’s not forget everything about the branches we
have partially explored.
Recursive Best First Search
Uses f-value (g + h) as the cutoff.
Similar to DFS but keeps track of the f-value of the best alternative path available from any
ancestor of the current node.
If current node exceeds f-limit -> backtrack to alternative path
As it backtracks, replace f-value of each node along the path with the best f(n) value of its
children.
– This allows it to return to this subtree, if it turns out to look better than alternatives


(Simplified) Memory Bounded A* (SMA*)


This is like A*, but when memory is full we delete the worst node (largest f-value).
Like RBFS, we remember the best descendant in the branch we delete.
If there is a tie (equal f-values) we delete the oldest nodes first.
Simplified-MA* finds the optimal reachable solution given the memory constraint.
Time can still be exponential.

Session 5
Contents
As per syllabus
(1) Problem Solving Agent using Search: Heuristic Functions: Heuristic Accuracy &
Algorithm performance Admissible heuristics from relaxed problems, pattern databases
& Experience
As per session plan:
Heuristic Functions:
(2) Heuristic Accuracy & Algorithm performance
(3) Admissible heuristics from relaxed problems, pattern databases. & Experience

Heuristic Function
In contrast to the uninformed search, informed search strategies use additional knowledge
beyond what we provide in the problem definition. The additional knowledge is available
through a function called a Heuristic.
Heuristic functions are the heart of any informed search algorithm.

Desirable properties of Heuristic Function


1. Efficient to compute (ℎ(𝑠)=0 as extreme case)
Having h(s)=0 for all states s effectively reduces the algorithm to its simplest form. When
h(s)=0, the algorithm becomes essentially equivalent to performing an uninformed search (i.e., a
search without any heuristic information).
2. Informative (ℎ(𝑠)=ℎ*(𝑠) as extreme case, where ℎ*(𝑠) is the true cost)
3. ℎ(𝑠)=0 if 𝑠 is the goal state, otherwise ℎ(𝑠)>0
4. ℎ is admissible
5. ℎ(𝑠_d)=∞ for dead-end states 𝑠_d
6. ℎ is consistent
GOOD heuristics should satisfy a balanced compromise of these properties:
(1) to (4) at least; better, all 6.
Property (5) ensures effective dead-end recognition, and (6) is a prerequisite for algorithms to
guarantee minimal-cost (optimal) solutions.

Admissibility of h(s)
Let π be a problem with state space θ and let ℎ be a heuristic function for θ.
We say that ℎ is admissible if, for all 𝑠∈𝑆, we have
ℎ (𝑠) ≤ ℎ*(s)
The function ℎ*(𝑠) corresponds to the real cost of the optimal path from state 𝑠 to a goal state.
The function ℎ is an optimistic estimation of the costs that occur. It underestimates the real costs
and provides the search algorithm with a lower bound on the goal distance.

Consistency (Monotonicity) of h(s)
Let π be a problem with state space θ, and let ℎ be a heuristic function for θ.
We say that ℎ is consistent if, for all transitions s → 𝑠′ via an action 𝑎 in θ,
we have ℎ(𝑠)−ℎ(𝑠′) ≤ 𝑐(𝑠,𝑎).
The value c(𝑠,𝑎) is the action cost of getting from 𝑠 to 𝑠′ with action 𝑎.
We reformulate the inequality from above to: ℎ(𝑠) ≤ 𝑐(𝑠,𝑎)+ ℎ(𝑠′)
Applying an action 𝑎 to the state 𝑠, the heuristic value cannot decrease by more than the cost
𝑐(𝑠,𝑎) of 𝑎.

Triangle inequality: the sum of the lengths of any two sides of a triangle must be greater than or
equal to the length of the remaining side.
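
Both conditions can be checked mechanically when the heuristic and the state space are small enough to tabulate. A hedged Python sketch (h, h_star, and edges are hypothetical inputs, not from the slides):

def is_admissible(h, h_star):
    """h[s] is the heuristic estimate, h_star[s] the true optimal cost."""
    return all(h[s] <= h_star[s] for s in h_star)

def is_consistent(h, edges):
    """edges is an iterable of (s, s_prime, cost) transitions.
    Checks h(s) <= c(s, a) + h(s') for every transition."""
    return all(h[s] <= cost + h[sp] for s, sp, cost in edges)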

Heuristic Accuracy and Algorithm Performance


A heuristic function receives a state as its input and estimates how close it is to the goal.
Using a heuristic, a search strategy can differentiate among non-goal states and focus on
those that look more promising.
That is why informed search techniques can find the goal faster than an uninformed algorithm,
provided that the heuristic function is well-defined.
To be more precise, we’ll say that a heuristic is a function that estimates the cost of the shortest
path between a state at the given node and the goal state (or the closest goal state, if there’s more
than one).

The A* algorithm is a classical and probably the most famous example of an informed search
strategy. Given a proper heuristic, A* is guaranteed to find the optimal path between the start
and goal nodes (if such a path exists), and its implementations are usually very efficient in
practice. Other examples of informed algorithms are Best-First Search (BFS), Recursive
Best-First Search (RBFS), and Simplified Memory-bounded A* (SMA*).

Since informed algorithms rely so much on heuristics, it’s crucial to define them well.
But how can we characterize and compare heuristics to decide which one to use?

An obvious criterion for evaluating heuristics is their accuracy. The more heuristic estimates
reflect the actual costs, the more useful the heuristic is. It may seem that the actual cost of the
path from a state to the goal is the ideal heuristic.
However, there's a price to pay for such high accuracy. The only way to obtain such a heuristic
is to find the shortest path between the state and the goal, which is an instance of the original problem.
Highly accurate heuristics usually require more computation, which slows down the search.
Therefore, a good heuristic strikes a balance between accuracy and computational
complexity, sacrificing some of the former to keep the latter manageable.

Effective Branching factor
Let’s say that the informed algorithm using the heuristic h had generated a search tree with N
nodes (other than the start node) before it found the goal state at depth d.
The Effective Branching Factor (EBF) b induced by h is the branching factor that a uniform
tree of depth d would have to have in order to contain N + 1 nodes. So:

N + 1 = 1 + b + b^2 + … + b^d

Since we know N and d, we can compute b. However, we can't determine the EBF from a single
run of the algorithm: problem instances vary, so the values of N and d vary, and the algorithm
itself may be randomized.

To overcome those issues, we should calculate the EBFs on a random representative sample of
instances.
We can then compute the mean or median score, quantifying the statistical uncertainty with
confidence or credible intervals.
If A* finds a solution at depth 5 using 52 nodes, then the effective branching factor is 1.92.
Experimental measurements of b* on a small set of problems can provide a good guide to the
heuristic’s overall usefulness.
The efficient heuristics will have an EBF close to 1.
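
Given N and d, b can be found numerically, since the uniform-tree node count grows monotonically in b. A small Python sketch using bisection (an illustrative implementation, reproducing the 52-node, depth-5 example above):

def effective_branching_factor(n_generated, depth, tol=1e-6):
    """Solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection."""
    def total(b):
        return sum(b ** i for i in range(depth + 1))
    target = n_generated + 1
    lo, hi = 1.0, float(n_generated)     # b lies between 1 and N
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(effective_branching_factor(52, 5), 2))   # -> 1.92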

A* on Romania Travel Example

Computed f values in the Example

f based contours in the search space


A* fans out from the start node, adding nodes in concentric bands of
increasing f-costs
1. A* fans out from the start node: The algorithm begins its search at the start node and
explores adjacent nodes, gradually expanding its search outward.
2. Adding nodes in concentric bands: As A* explores the search space, it organizes nodes
into layers or bands based on their distance from the start node. Nodes closer to the start
node are explored first, followed by nodes farther away.
3. Of increasing f-costs: The f-cost of a node in A* is the sum of two components: the cost
of reaching that node from the start node (g-cost) and an estimate of the cost from that
node to the goal node (h-cost). A* prioritizes nodes with lower f-costs, so as it expands
outward from the start node, it adds nodes in each concentric band based on increasing
f-costs. This means that nodes with lower total costs (g-cost + h-cost) are explored before
nodes with higher total costs.
With good heuristics, the bands stretch towards the goal state and are more narrowly focused
around the optimal path.

Dominance
The EBF isn’t the only way to characterize and compare heuristics.
If for two heuristics h1 and h2 it holds that h2(s) ≥ h1(s) for every state s, then we say that h2
dominates h1.
Dominance has a direct effect on performance, at least when it comes to the A* algorithm. A*
with h2 will never expand more nodes than A* with h1 if h2 dominates h1 (and both heuristics are
consistent or admissible).
Given any admissible heuristics ha and hb,
h(n) = max(ha(n), hb(n)) is also admissible, and h(n) dominates
both ha and hb.

h1 corresponds to the number of tiles in the wrong position (Misplaced Tiles)


h2 corresponds to the sum of the distances of the tiles from their goal Positions (Manhattan
Distance)
h1 = 8 (all tiles are misplaced)
h2 = 3+1+2+2+2+3+3+2=18

Manhattan distance: the distance between two points measured along axes at right angles.

We see that h2(s) ≥ h1(s), since each misplaced tile is at a Manhattan distance of at least 1 from
its goal position.
Experimentation with A* on random instances indicates that the average EBF of h2 is smaller
than that of h1.
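
Both heuristics are straightforward to implement. In the sketch below, a state is a tuple of 9 numbers with 0 for the blank; the demo instance is the standard textbook example, which is assumed to be the one shown in the (omitted) figure above:

def h1_misplaced(state, goal):
    """Number of tiles out of place (the blank, 0, is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2_manhattan(state, goal, width=3):
    """Sum of the Manhattan distances of each tile from its goal square."""
    goal_pos = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        r, c = divmod(i, width)
        gr, gc = goal_pos[tile]
        total += abs(r - gr) + abs(c - gc)
    return total

start = (7, 2, 4, 5, 0, 6, 8, 3, 1)   # 0 marks the blank
goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print(h1_misplaced(start, goal), h2_manhattan(start, goal))   # -> 8 18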

Formulating Heuristics
An efficient heuristic may not be easy to devise.
Luckily, there are several approaches we can follow, and we’ll mention three of them.

Formulating Heuristics from Relaxations


Firstly, we can relax the original problem.
We do so by removing certain restrictions from its definition to create additional edges
between the states that weren’t adjacent originally.
For example, we can drop the condition that only the blank cell can swap places with another
tile in the n -puzzle problem.
That immediately makes more actions legal in every state and places an edge between many
states that weren’t neighbors in the original formulation.
For that reason, the cost of the optimal path between the start and the goal node in the
relaxation never exceeds the cost of the optimal path in the original version.

Since the relaxed problem has more edges in the state space, it’s easier to solve.
The relaxed costs can serve as heuristics for the original problem.

Formulating Heuristics from Sub-problems
Secondly, we can focus on a sub-problem that’s easy to solve and use the cost of its solution as
a heuristic function.
For instance, we can focus only on numbers 1-4 in the 8-puzzle game.
Let’s use as a heuristic the length of the shortest sequence of moves that put only those four
numbers in their goal positions.
Although the heuristic underestimates the cost of the optimal solution to the whole problem, it
does indicate how far a state is from the goal.

Divide the 15-puzzle into a fringe + an 8-puzzle:
1. Map the current locations of the fringe tiles to an index into the database.
2. The database tells us the minimal number of moves needed to achieve the fringe.
3. Achieve the fringe, then solve the remaining 8-puzzle.

Apply the Divide and Conquer principle.


1. Decompose the problem into sub-problems (sub-goals)
2. Store solutions to the sub-problems with associated costs (patterns)
3. Reuse these solutions.

Learning Heuristics from Data
Let's imagine a list of pairs (s, c), where s is a state, and c is the cost of the optimal path from s to
the goal.
We can collect such a dataset by setting s to be the start state and running an uninformed search
strategy.
To account for statistical variability, we randomly select a few problem instances and sample
several states from their search spaces.
Once we create the dataset, we can treat it as a regression problem and apply a machine-learning
algorithm to find a model that approximates the costs. Then, we use that model as our heuristic.
Sometimes, the state representation won’t be suitable for machine learning. That can happen if
the states are structured objects that are hard to handle by traditional algorithms that expect
vector data. To overcome this issue, we can represent the states by hand-selected or
automatically engineered features.

[Figure: trade-off between heuristic accuracy and computation; axes: Cost vs. Size of Problem Space]

IDA* with Manhattan distance
 15-puzzle instances were solved optimally using IDA* (Iterative Deepening A*) with the
Manhattan distance heuristic (Korf, 1985).
 The average length of an optimal solution is 53 moves.
 400 million nodes are generated on average.
 The average solution time is about 50 seconds on current machines.

 To solve a 24-puzzle instance, IDA* (Iterative Deepening A*) with Manhattan distance
would take about 65,000 years on average.
 Manhattan distance assumes that each tile moves independently.
 In fact, tiles interfere with each other.
 Accounting for these interactions is the key to more accurate heuristic functions.

Linear Conflict

The Manhattan distance is 2 + 2 = 4, but this cannot be achieved due to a linear conflict:
the actual requirement is 6 moves (minimum).

Linear Conflict Heuristic


• Hansson, Mayer and Yung, 1991
1. Calculate Manhattan Distance: Begin by calculating the Manhattan distance for each
tile, as you would with the regular Manhattan distance heuristic. This involves summing
the distances (in rows and columns) between each tile's current position and its goal
position.
2. Identify Linear Conflicts: A linear conflict occurs when two tiles that belong to the
same row or column are in conflict with each other, meaning they need to move in
opposite directions to reach their goal positions. Specifically, this happens when both tiles
must move horizontally or vertically to reach their goal positions, and one tile blocks the
other's path.
3. Count Linear Conflicts: For each row and column, count the number of linear conflicts
that occur. This involves examining pairs of tiles within the same row or column and
determining if they are in conflict with each other.
4. Add Conflicts to the Heuristic Value: Double the number of linear conflicts found
(since each conflict requires two additional moves to resolve) and add this value to the
Manhattan distance. This adjusted sum becomes the heuristic value used to guide the
search algorithm (such as A*) in finding the optimal solution.
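
A simplified sketch of steps 1-4 in Python (it reuses h2_manhattan from the earlier snippet; note that counting every inverted pair, as done here, can overestimate in the rare case of three mutually conflicting tiles in one line, where the strict definition removes the minimum number of conflicting tiles instead):

def linear_conflicts(state, goal, width=3):
    """Count pairs of tiles that sit in their goal row (or column) but in
    inverted order, so each pair forces at least two extra moves."""
    gpos = {t: divmod(i, width) for i, t in enumerate(goal) if t != 0}
    conflicts = 0
    for line in range(width):
        # tiles currently in this row whose goal square is also in this row
        row = [state[line * width + c] for c in range(width)]
        tiles = [t for t in row if t != 0 and gpos[t][0] == line]
        conflicts += sum(1 for i in range(len(tiles))
                         for j in range(i + 1, len(tiles))
                         if gpos[tiles[i]][1] > gpos[tiles[j]][1])
        # tiles currently in this column whose goal square is also in it
        col = [state[r * width + line] for r in range(width)]
        tiles = [t for t in col if t != 0 and gpos[t][1] == line]
        conflicts += sum(1 for i in range(len(tiles))
                         for j in range(i + 1, len(tiles))
                         if gpos[tiles[i]][0] > gpos[tiles[j]][0])
    return conflicts

def h_linear_conflict(state, goal, width=3):
    # Manhattan distance plus two extra moves per detected conflict (step 4)
    return h2_manhattan(state, goal, width) + 2 * linear_conflicts(state, goal, width)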

More complex tile interactions
Manhattan distance is 19 moves, but 31 moves are needed.
Manhattan distance is 20 moves, but 28 moves are needed.
Manhattan distance is 17 moves, but 27 moves are needed.

Pattern Database Heuristics


• Culberson and Schaeffer, 1996
1. Partitioning the Puzzle: The first step is to partition the puzzle into different subsets or
patterns. These patterns are typically chosen based on their ability to capture important
aspects of the puzzle's structure or complexity. For example, in the case of the Rubik's
Cube, patterns might consist of certain combinations of colors or arrangements of
specific groups of pieces.
2. Generating Optimal Solutions: For each pattern, optimal solutions are generated using
specialized algorithms or exhaustive search techniques. These solutions represent the
minimum number of moves required to solve the pattern from any starting configuration
within the pattern's subset.
3. Storing Solutions in a Database: The optimal solutions for each pattern are stored in a
database, often referred to as a pattern database or pattern collection. Each entry in the
database corresponds to a specific pattern configuration, and the value associated with
each entry is the optimal number of moves required to solve that configuration.
4. Heuristic Estimation: During the search for the optimal solution to the entire puzzle, the
pattern databases are used to provide heuristic estimates of the remaining distance to the
goal state. At each step of the search, the heuristic function looks up the current state of
the puzzle in the pattern databases and combines the stored optimal solution lengths for
each pattern subset (taking their maximum in general, or their sum when the patterns are
disjoint, i.e., additive). This combined value serves as an estimate of the total number of
moves required to solve the entire puzzle from the current state.
5. Search Algorithm: The heuristic estimates obtained from the pattern databases are
used by search algorithms such as A* to guide the search towards the goal state. By
incorporating the precomputed information from the pattern databases, the search
algorithm can focus its efforts on promising regions of the search space, leading to more
efficient and effective search.


Precomputing Pattern Database

Combining Multiple Databases

Additive Pattern Databases

Example Additive Databases

The 7-tile database contains 5.8 crore (about 58 million) entries.
The 8-tile database contains 51.9 crore (about 519 million) entries.

Computing the Heuristics

Performances

Conclusion
Here, we talked about uninformed and informed search strategies. Uninformed
algorithms use only the problem definition, whereas the informed strategies can also use
additional knowledge available through a heuristic that estimates the cost of the optimal path to
the goal state.
If the heuristic estimates are easy to compute, the informed search algorithms will be
faster than the uninformed. That’s because the heuristic allows them to focus on the promising
parts of the search tree. However, an efficient heuristic isn’t easy to formulate.

References
NPTEL: Computer Science and Engineering – NOC: An Introduction to Artificial Intelligence,
by Prof. Mausam. https://archive.nptel.ac.in/courses/106/102/106102220/

Session 6
Contents
As per syllabus
(1) Problem Solving Agent using Search:
Local Search Algorithms & Optimization Problems: Hill Climbing Search, Simulated
Annealing, Local Beam Search, Genetic Algorithm
As per session plan:
Local Search Algorithms & Optimization Problems:
(2) Hill Climbing Search, Simulated Annealing, Local Beam Search, Genetic Algorithm

Local Search Algorithm and Optimization Problem


Informed and uninformed search expand the nodes systematically in two ways:
1. keeping different paths in memory, and
2. selecting the best suitable path.

When a goal is found, the path to the goal constitutes the solution. But, depending on the
application, the path may or may not matter.
Beyond these "classical search algorithms," we have "local search algorithms," where the path
cost does not matter; the focus is only on the solution state needed to reach the goal.
A local search algorithm operates on a single current node rather than multiple paths, and
generally moves only to the neighbors of that node. It is an iterative improvement algorithm.
Although local search algorithms are not systematic, still they have the following
advantages:
 No need to maintain any search tree.
 Local search algorithms use very little (often a constant amount of) memory, as they keep
only the current state.
 Most often, they find a reasonable solution in large or infinite state spaces where the
classical or systematic algorithms do not work.
Local Search
• Keep track of the current state
• Move only to the neighboring states
• Ignore paths
“Pure optimization” problems
– All states have an objective function, which we have to optimize.
– Goal is to find state with max (or min) objective value.
– Does not quite fit into path-cost/goal-state formulation.
– Local search can do quite well on these problems.

Use an initial (possibly conflicting) assignment and a heuristic. Each variable is changed one at
a time to work toward the solution.
The obvious heuristic is to change a variable so as to decrease the number of currently occurring
conflicts.
This technique works particularly well with a good initial assignment and when the solutions are
densely distributed in the state space.
A comprehensive example, the 8-queens problem, is presented next.
N-queens problem
Put n queens on an n × n board with no two queens in the same row, column, or diagonal.
Neighbor: move one queen to another row.
Search: go from one neighbor to the next…

To solve the 8-queen problem using local search, you can follow these
steps:
1. Initialization: Start with a random or predefined initial state where each queen is placed
on a different column.
2. Evaluation Function: Define an evaluation function that calculates the number of pairs
of queens that are attacking each other. In the case of the 8-queen problem, this means
counting how many pairs of queens are in the same row, column, or diagonal.
3. Generate Neighbors: Generate neighboring states by moving one queen at a time to a
different row within its column. This creates a set of possible successor states.
4. Local Search Loop:
 Evaluate the current state using the evaluation function.
 If no pairs of queens are attacking each other (i.e., evaluation function returns 0),
the problem is solved, and the current state represents a solution.
 Otherwise, select one of the neighboring states with a better evaluation score (fewer
pairs of attacking queens).
 Repeat the evaluation and selection process until a solution is found or a
termination condition is met (e.g., maximum number of iterations, time limit
reached).
5. Termination Condition: Terminate the local search when one of the following
conditions is met:
 A solution state is found (no attacking queens).
 The maximum number of iterations is reached.
 The time limit is exceeded.
 No further improvement is possible (local optimum reached).
6. Output Solution: If a solution state is found, output the positions of the queens on the
chessboard, indicating a valid solution to the 8- queen problem.
7. Optimization: You can improve the efficiency of the local search algorithm by
incorporating techniques such as random restarts, simulated annealing, or genetic
algorithms to escape local optima and explore a larger portion of the search space.
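
A compact hill-climbing sketch of these steps with random restarts (the board representation, with board[c] giving the row of the queen in column c, and all names are illustrative assumptions):

import random

def conflicts(board):
    """Number of attacking pairs (same row or same diagonal)."""
    n = len(board)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if board[a] == board[b] or abs(board[a] - board[b]) == b - a)

def hill_climb_queens(n=8, max_steps=10000):
    board = [random.randrange(n) for _ in range(n)]          # step 1: random start
    for _ in range(max_steps):
        current = conflicts(board)                           # step 2: evaluate
        if current == 0:
            return board                                     # step 6: solution
        # steps 3-4: examine every single-queen move, keep the best one
        col, row = min(((c, r) for c in range(n) for r in range(n)
                        if r != board[c]),
                       key=lambda m: conflicts(board[:m[0]] + [m[1]]
                                               + board[m[0] + 1:]))
        if conflicts(board[:col] + [row] + board[col + 1:]) >= current:
            board = [random.randrange(n) for _ in range(n)]  # step 7: restart
        else:
            board[col] = row
    return None

print(hill_climb_queens())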

Different types of local search
1. Hill-climbing Search
2. Simulated Annealing
3. Local Beam Search

1. Hill-climbing Search

It is also known as greedy local search. It may end at the global maximum, but often it ends at a
local maximum or becomes stuck on a plateau with no further progress possible.

Hill-Climbing on 8 queens

Hill-Climbing Search

Hill-Climbing search with sideways move

Hill-Climbing search with random restart


Hill-Climbing search

Local Beam Search


In hill climbing search we keep only one node in memory.
This might seem to be an extreme reaction to the problem of memory limitations.
The local beam search algorithm keeps track of k states rather than just one.
It begins with k randomly generated states.
At each step, all the successors of all k states are generated. If any one of them is a goal, the
algorithm halts.
Otherwise, it selects the k best successors from the complete list and repeats.

At first sight, a local beam search with k states might seem to be nothing more than running k
random restarts in parallel instead of in sequence.
In fact, the two algorithms are quite different.
In a random-restart search, each search process runs independently of the others.
In a local beam search, useful information is passed among the parallel search threads.
In effect, the states that generate the best successors say to the others,
“Come over here, the grass is greener!”
The algorithm quickly abandons unfruitful searches and moves its resources to where the most
progress is being made.

Local beam search can suffer from a lack of diversity among the k states – they can become
clustered in a small region of the state space, making the search little more than a
k-times-slower version of hill climbing.
A variant called stochastic beam search, analogous to stochastic hill climbing, helps alleviate this
problem.
Instead of choosing the top k successors, stochastic beam search chooses successors with
probability proportional to the successor’s value, thus increasing diversity.


Simulated Annealing
The Simulated Annealing algorithm is based upon Physical Annealing in real life.
Physical annealing is the process of heating a material until it reaches an annealing
temperature and then cooling it down slowly in order to change the material into a desired
structure.
When the material is hot, the molecular structure is weaker and is more susceptible to change.
When the material cools down, the molecular structure is harder and is less susceptible to
change.

Another important part of this analogy is the following equation from thermodynamics (the
Boltzmann factor):

P = e^(−ΔE / (k·t))

This equation calculates the probability that the energy magnitude will increase. We can
calculate this value given some energy magnitude ΔE and some temperature t, along with the
Boltzmann constant k.

Simulated Annealing (SA) mimics the Physical Annealing process but is used for optimizing
parameters in a model.
This process is very useful for situations where there are many local minima/maxima, at which
algorithms like Gradient Descent would get stuck.


In problems like the one above, if Gradient Descent started at the indicated starting point, it
would be stuck at the local minimum and not be able to reach the global minimum.

Simulated Annealing Algorithm:


Step 1: We first start with an initial solution s = S₀. This can be any solution that fits the criteria
for an acceptable solution. We also start with an initial temperature t = t₀.
Step 2: Set up a temperature reduction function alpha. There are usually 3 main types of
temperature reduction rules:
1. Linear: t = t − α
2. Geometric: t = t × α
3. Slow decrease: t = t / (1 + β·t)
Each reduction rule reduces the temperature at a different rate, and each method is better at
optimizing a different type of model. For the 3rd rule, beta is an arbitrary constant.
Step 3: Starting at the initial temperature, loop through n iterations of Step 4 and then decrease
the temperature according to alpha. Repeat this loop until the termination conditions are reached.
The termination conditions could be reaching some end temperature, reaching some acceptable
threshold of performance for a given set of parameters, etc. The mapping of time to temperature
and how fast the temperature decreases is called the Annealing Schedule.
Step 4: Given the neighbourhood of solutions N(s), pick one of the solutions and calculate the
difference in cost between the old solution and the new neighbour solution. The neighbourhood
of a solution is the set of all solutions that are close to it. For example, the neighbourhood of a
set of 5 parameters might be all sets obtained by changing one of the five parameters while
keeping the remaining four the same.
Step 5: If the difference in cost between the old and new solution is greater than 0 (the new
solution is better), then accept the new solution. If the difference in cost is less than 0 (the old
solution is better), then generate a random number between 0 and 1 and accept it if it’s under the
value calculated from the Energy Magnitude equation from before.
In the Simulated Annealing case, the equation has been altered to the following:


P = e^(−Δc / t)

Where the delta c is the change in cost and the t is the current temperature.
The P calculated in this case is the probability that we should accept the new solution.
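
Putting steps 1-5 together, a minimal SA sketch in Python (minimization convention, so a worse move with cost increase delta is accepted with probability exp(−delta / t); the geometric schedule and all parameter values are illustrative defaults):

import math
import random

def simulated_annealing(initial, neighbor, cost,
                        t0=100.0, alpha=0.95, iters_per_temp=100, t_end=1e-3):
    """neighbor(s) returns a random nearby solution; cost(s) is minimized."""
    s, t = initial, t0
    best = s
    while t > t_end:                      # annealing schedule (step 3)
        for _ in range(iters_per_temp):
            s_new = neighbor(s)           # step 4: pick a neighbour
            delta = cost(s_new) - cost(s)
            # step 5: accept improvements, and worse moves with prob e^(-delta/t)
            if delta < 0 or random.random() < math.exp(-delta / t):
                s = s_new
                if cost(s) < cost(best):
                    best = s
        t *= alpha                        # geometric temperature reduction
    return best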


High vs. Low Temperature


Due to the way the probability is calculated, when the temperature is higher, the algorithm is
more likely to accept a worse solution. This promotes Exploration of the search space and
allows the algorithm to travel down a sub-optimal path to potentially find a global maximum.

When the temperature is lower, the algorithm is less likely (or entirely unwilling) to accept a
worse solution. This promotes Exploitation: once the algorithm is in the right region of the
search space, there is no need to search other sections; it should instead try to converge and
find the global maximum.

Advantages
 Easy to implement and use.
 Provides optimal solutions to a wide range of problems.
Disadvantages
 Can take a long time to run if the annealing schedule is very long.

 There are a lot of tuneable parameters in this algorithm.

Genetic Algorithm

Charles Darwin: On the Origin of Species by Means of Natural Selection, or the Preservation of
Favored Races in the Struggle for Life; and The Descent of Man.

Survival of the fittest by natural selection.


Genetic Algorithms
Genetic algorithms, which work based on Darwin's principle of selection (that is, survival of
the fittest), have been used as optimization tools.
An optimization problem can be a single-objective problem or a multi-objective problem.
A single-objective optimization problem can be a maximization problem or a minimization
problem.
Multi-objective optimization problems are more complex in nature.

Genetic Algorithm (GA)

t := 0;
compute initial population P(t);
WHILE stopping condition not fulfilled DO
BEGIN
    select individuals for reproduction;
    create offspring by crossing individuals;
    possibly mutate some individuals;
    compute the new generation; t := t + 1;
END

Genetic Algorithm (GA)


Chromosome encodes a solution in the search space
1. Usually as strings of 0's and 1's
2. If l is the string length, the number of different chromosomes (or strings) is 2^l
Population
1. A set of chromosomes in a generation
2. Population size is usually constant
3. Common practice is to choose the initial population randomly.
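
A minimal binary-coded, single-objective GA sketch (maximization; tournament selection, one-point crossover, and bit-flip mutation are common but illustrative choices, and the one-max fitness in the demo line is just for testing):

import random

def genetic_algorithm(fitness, l=20, pop_size=30, generations=100,
                      p_cross=0.9, p_mut=0.01):
    """Chromosomes are lists of 0/1 of length l; returns the best individual."""
    pop = [[random.randint(0, 1) for _ in range(l)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():                          # 2-way tournament selection
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            if random.random() < p_cross:    # one-point crossover
                cut = random.randrange(1, l)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            nxt += [[1 - g if random.random() < p_mut else g for g in child]
                    for child in (p1, p2)]   # bit-flip mutation
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

print(genetic_algorithm(sum))   # one-max: fitness = number of 1s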


• I have discussed Binary coded Single Objective GA


• Real coded GA
• Multi Objective GA
• Many Objective GA
• I have invented Bi-Phased MOGA very recently.

BPMOGA

https://www.elsevier.com/locate/eswa

Session 7
As per syllabus
Problem Solving Agent using Search:
1. Ant Colony Optimization
2. Particle Swarm Optimization

As per session plan


1. Local Search Algorithms & Optimization Problems
2. Ant Colony Optimization
3. Particle Swarm Optimization

Background
In last class we have discussed about Genetic Algorithm (GA). It is an optimization algorithm.

Swarm Intelligence
Swarm intelligence is a collective behavior observed in decentralized, self-organized systems,
particularly in biological entities like ants, bees, birds, and fish. These systems exhibit
complex, coordinated behaviors emerging from interactions between individuals without
centralized control.
The concept is often applied in computer science and engineering to develop algorithms and
systems inspired by natural swarm behaviors. These algorithms often involve simulating the
interactions and behaviors of individuals within a swarm to solve complex problems, such as
optimization, routing, and task allocation.
Key characteristics of swarm intelligence include decentralized control, robustness,
adaptability, and the ability to solve complex problems through simple rules governing
individual behavior. Examples include ant colony optimization, particle swarm
optimization, and artificial bee colony algorithms.
Swarm Intelligence by ant – A living bridge

Swarm Intelligence by Bees

Swarm Intelligence by Birds

Swarm intelligence by Fishes

Swarm Intelligence
1. Ant Colony Optimization
2. Particle Swarm Optimization

1. Ant Colony Optimization

Initially proposed by Marco Dorigo in 1992 in his PhD thesis, the first algorithm aimed to
search for an optimal path in a graph, based on the behavior of ants seeking a path between their
colony and a source of food.
Ants are eusocial insects that favor community survival and sustenance over living as
individuals.
They communicate with each other using sound, touch, and pheromones.
Pheromones are organic chemical compounds secreted by ants that trigger a social response
in members of the same species.
Since most ants live on the ground, they use the soil surface to leave pheromone trails that may
be followed (smelled) by other ants. Ants live in community nests, and the underlying principle
of ACO is to observe the movement of the ants from their nests as they search for food along the
shortest possible path.
Initially, ants start to move randomly in search of food around their nests. This randomized
search opens up multiple routes from the nest to the food source.
Now, based on the quality and quantity of the food, ants carry a portion of the food back,
depositing the necessary pheromone concentration on their return path.
Depending on these pheromone trails, the probability that following ants select a specific path
guides them to the food source.
Evidently, this probability is based on the concentration of pheromone as well as its rate of
evaporation. It can also be observed that, since the evaporation rate is a deciding factor, the
length of each path is automatically accounted for.

In the above figure, for simplicity, only two possible paths have been considered between the
food source and the ant nest. The stages can be analyzed as follows:
Stage 1: All ants are in their nest. There is no pheromone content in the environment. (For
algorithmic design, residual pheromone amount can be considered without interfering with the
probability)
Stage 2: Ants begin their search with equal (0.5 each) probability along each path. Clearly, the
curved path is longer, and hence the time taken by ants to reach the food source along it is
greater.
Stage 3: The ants on the shorter path reach the food source earlier. They now face a similar
selection dilemma; but this time, a pheromone trail along the shorter path is already available,
so the probability of selecting it is higher.
Stage 4: More ants return via the shorter path and subsequently the pheromone concentrations
also increase. Moreover, due to evaporation, the pheromone concentration in the longer path
reduces, decreasing the probability of selection of this path in further stages. Therefore, the
whole colony gradually uses the shorter path in higher probabilities. So, path optimization is
attained.
Algorithm Design
Now the above behavior of the ants can be used to design the algorithm to find the shortest path.
We can consider the ant colony and the food source as nodes (vertices) of a graph and the paths
between them as edges. The pheromone concentration can then be treated as the weight
associated with each path.
Let's suppose there are only two paths, P1 and P2, with C1 and C2 the weights (pheromone
concentrations) along them, respectively.
So we can represent it as a graph G(V, E), where V represents the vertices and E the edges of
the graph.
Initially, for the i-th path, the probability of choosing it is:

P_i = C_i / (C_1 + C_2)
If C1 > C2, then the probability of choosing path 1 is more than path 2. If C1 < C2, then Path 2
will be more favorable.
For the return path, the length of the path and the rate of evaporation of the pheromone are the
two factors.
1. Concentration of pheromone added according to the length of the path:

ΔC_i = K / L_i

where L_i is the length of the path and K is a constant. The shorter the path, the more
pheromone is added to the existing concentration.

2. Change in concentration according to the rate of evaporation:

C_i ← (1 − v) · C_i

Here the parameter v varies from 0 to 1. The higher v is, the less pheromone remains.
In the initialization step of ACO, several key aspects are set up to prepare the algorithm for
solving the optimization problem.
1. Problem Representation: Define how the problem solution will be represented. This
could involve encoding the problem into a suitable format, such as a graph for the
Traveling Salesman Problem (TSP) or a network for the Vehicle Routing Problem
(VRP).
2. Number of Ants: Determine the number of artificial ants that will participate in the
search. Typically, this value is predefined based on problem characteristics and
computational resources. Having more ants can enhance exploration but may increase
computational complexity.
3. Pheromone Initialization: Initialize the pheromone levels on solution components or
positions. Pheromones represent the amount of chemical substance deposited by ants on
solution components and play a crucial role in guiding the search process. Common
initialization strategies include setting all pheromone levels to a constant value or
randomizing initial pheromone levels.
4. Heuristic Information: Calculate or assign heuristic information to guide ant movement.
Heuristic information provides additional guidance to ants based on problem-specific
knowledge. For example, in the TSP, heuristic information could be the inverse of the
distance between cities.
5. Ant Placement: Randomly place ants on solution components or positions to start the
construction of solutions. Ants are typically distributed evenly across the problem space
to ensure a fair exploration of the solution space.
6. Parameter Initialization: Set other algorithm parameters such as the pheromone
evaporation rate, exploration-exploitation trade-off parameters (e.g., alpha and
beta), and convergence criteria. These parameters influence the behavior of the
algorithm and may need to be tuned through experimentation.
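
Combining the initialization steps with the construct/evaporate/deposit cycle gives a bare-bones ACO for the TSP. The sketch below is illustrative: the parameter values and the simple rule of letting every ant deposit q/L on its tour are assumptions, not the only possible design:

import random

def aco_tsp(dist, n_ants=20, iters=100, alpha=1.0, beta=3.0, rho=0.5, q=1.0):
    """Bare-bones ACO for a symmetric TSP; dist[i][j] = distance from i to j."""
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]                        # pheromone init
    eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)]
           for i in range(n)]                                  # heuristic info
    best_tour, best_len = None, float("inf")
    for _ in range(iters):
        tours = []
        for _ in range(n_ants):
            tour = [random.randrange(n)]                       # ant placement
            while len(tour) < n:
                i = tour[-1]
                rest = [j for j in range(n) if j not in tour]
                w = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in rest]
                tour.append(random.choices(rest, weights=w)[0])
            length = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length
        for i in range(n):                                     # evaporation
            for j in range(n):
                tau[i][j] *= (1.0 - rho)
        for tour, length in tours:                             # deposit q / L
            for k in range(n):
                i, j = tour[k], tour[(k + 1) % n]
                tau[i][j] += q / length
                tau[j][i] += q / length
    return best_tour, best_len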


• After a certain number of iterations or upon reaching a termination condition, select the
best solution found by the ants.
• This solution represents the output of the Ant Colony Optimization algorithm and
serves as the optimized solution to the given problem.
Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO) is a computational optimization technique inspired by the
social behavior of bird flocking or fish schooling. In PSO, a population of potential solutions,
called particles, moves around in the search space to find the optimal solution.
Particle Swarm Optimization (PSO) was introduced in 1995 by Kennedy and Eberhart.
James Kennedy and Russell Eberhart, 1995, Particle swarm optimization, in Proceedings of
IEEE International Conference on Neural Networks, pages 1942-1948.
https://www.cs.tufts.edu/comp/150GA/homeworks/hw3/_reading6%201995%20particle%20swarming.pdf

Particle Swarm Optimization (PSO)

The chances of finding food are much higher when the search is done collectively rather than individually.

In Particle Swarm Optimization (PSO), the next position of each particle is determined by
updating its current position and velocity according to certain rules. The new position of a
particle is typically calculated using the following formula:
xi(t+1)=xi(t)+vi(t+1)
Where:
 xi(t) is the current position of particle i at time t.
 vi(t+1) is the velocity of particle i at time t+1.
 xi(t+1) is the new position of particle i at time t+1.
The velocity update equation is commonly defined as:
vi(t+1)=w⋅vi(t)+c1⋅r1⋅(pbesti−xi(t))+c2⋅r2⋅(gbest−xi(t))
Where:
 w is the inertia weight, which controls the impact of the particle's current velocity on its
next velocity.
 c1 and c2 are acceleration coefficients (also called cognitive and social coefficients,
respectively) that control the influence of the particle's best-known position (pbest) and
the global best-known position (gbest) on its movement.
 r1 and r2 are random values sampled from a uniform distribution between 0 and 1.
 pbesti is the best-known position of particle i so far.
 gbest is the best-known position found by any particle in the swarm.
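
The two update equations map directly to code. A minimal PSO sketch for minimization (the search bounds, coefficient values, and the sphere objective in the demo line are illustrative assumptions):

import random

def pso(f, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5,
        lo=-5.0, hi=5.0):
    """Returns the best position found; f is the objective to minimize."""
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]                  # personal bests
    gbest = min(pbest, key=f)                    # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))   # velocity update
                x[i][d] += v[i][d]                             # position update
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
        gbest = min(pbest, key=f)
    return gbest

print(pso(lambda p: sum(t * t for t in p), dim=3))   # sphere function demo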


Exploration Vs Exploitation
High c2 – the population converges too fast to the best solution found so far, i.e., high Exploitation.
High c1 – each particle sticks to its own personal best, i.e., high Exploration of personal
knowledge.
High w – each particle keeps exploring along its current direction, i.e., high Exploration.
Required: a balance.
Decrease w iteratively: initially the swarm performs high Exploration, then high Exploitation.


References
• HK Lum, King's College London, University of London
• GeeksforGeeks

Session 8
As per syllabus
Game Playing, Searching to Play Games:
1. Minimax Algorithm
2. Alpha-Beta Pruning
3. Making imperfect real time decisions

As per session plan – Searching to Play Games:


1. Minimax Algorithm
2. Alpha-Beta Pruning
3. Making imperfect real time decisions

One famous AI system


Deep Blue: In 1997, the Deep Blue chess program, created by IBM, beat the reigning world chess
champion, Garry Kasparov.

Types of Games

Here we consider Games with Perfect Information.


Competitive environments: those in which the agents' goals are in conflict, giving rise to
adversarial search problems.

Why study games?
Clear criteria for success
Offer an opportunity to study problems involving {hostile, adversarial, competing} agents.
Historical reasons
Fun
Interesting, hard problems which require minimal “initial structure”
Games often define very large search spaces
– Chess: ~35^100 nodes in the search tree, ~10^40 legal states

Game
S0: The initial state, which specifies how the game is set up at the start.
PLAYER(s): Defines which player has the move in a state.
ACTIONS(s): Returns the set of legal moves in a state.
RESULT(s, a): The transition model, which defines the result of a move.
TERMINAL-TEST(s): A terminal test, which is true when the game is over and false otherwise.
States where the game has ended are called terminal states.
UTILITY(s, p): A utility function (also called an objective function or payoff function), applied
to terminal states, defines the final numeric value for a game that ends in terminal state s for a
player p. In chess, the outcome is a win, loss or draw, with value +1, 0, or 1/2. Some games have
a wider variety of possible outcomes, e.g. In backgammon the payoffs range from 0 to +192.
A zero-sum game is one where the total payoff to all players is the same for every instance of
the game, e.g. chess: 0+1 = 1+0 = 1/2+1/2.

Typical Case
2-person game
Players alternate moves
Zero-sum: one player’s loss is the other’s gain
Perfect Information: Both players have access to complete information about the state of the
game. No information is hidden from either player.

No chance (e.g., using dice) involved


Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
Not: Bridge, Solitaire, Backgammon, ...

How to play a game
A way to play such a game is to:
 Consider all the legal moves you can make
 Compute the new position resulting from each move
 Evaluate each resulting position and determine which is best
 Make that move
 Wait for your opponent to move and repeat
Key problems are:
 Representing the “board”
 Generating all legal next boards
 Evaluating a position
Evaluation Function
Evaluation function or static evaluator is used to evaluate the “goodness” of a game position.
 Contrast with heuristic search where the evaluation function was a non-negative estimate
of the cost from the start node to a goal and passing through the given node.

The zero-sum assumption allows us to use a single evaluation function to describe the goodness
of a board with respect to both players.
 f(n) >> 0: position n good for me and bad for you
 f(n) << 0: position n bad for me and good for you
 f(n) near 0: position n is a neutral position
 f(n) = + infinity: win for me
 f(n) = - infinity: win for you
Evaluation function examples
Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]
where a 3-length is a complete row, column, or diagonal.
Alan Turing's function for chess:
 f(n) = w(n)/b(n), where w(n) = sum of the point value of white's pieces and b(n) = sum of
black's
Most evaluation functions are specified as a weighted sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
Example features for chess are piece count, piece placement, squares controlled, etc.
Deep Blue has about 6000 features in its evaluation function

Game Trees

Minimax Procedure
Create start node as a MAX node with current board configuration.
Expand nodes down to some depth of lookahead in the game.
Apply the evaluation function at each of the leaf nodes.
“Back up” values for each of the non-leaf nodes until a value is computed for the root node.
 At MIN nodes, the backed-up value is the minimum of the values associated with its
children.
 At MAX nodes, the backed-up value is the maximum of the values associated with its
children.
Pick the operator associated with the child node whose backed-up value determined the value at
the root.
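
The backup procedure can be written as a depth-limited recursion. A minimal Python sketch (successors, evaluate, and terminal are placeholders for a concrete game, not part of the original slides):

def minimax(state, depth, maximizing, successors, evaluate, terminal):
    """Back up values from the lookahead frontier, as described above."""
    if depth == 0 or terminal(state):
        return evaluate(state)                  # evaluation at the leaves
    values = [minimax(s, depth - 1, not maximizing, successors, evaluate, terminal)
              for s in successors(state)]
    return max(values) if maximizing else min(values)

def best_move(state, depth, successors, evaluate, terminal):
    """Pick the child whose backed-up value determines the root (MAX) value."""
    return max(successors(state),
               key=lambda s: minimax(s, depth - 1, False,
                                     successors, evaluate, terminal))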

Minimax Algorithm

Partial Game Tree for Tic Tac Toe

Minimax Tree

Properties of Minimax
Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)

Alpha – Beta Pruning


We can improve on the performance of the minimax algorithm through alpha-beta pruning
Basic idea: “If you have an idea that is surely bad, don't take the time to see how truly awful it
is.” -- Pat Winston

Alpha – Beta Pruning
Traverse the search tree in depth-first order
At each MAX node n, alpha(n) = maximum value found so far
At each MIN node n, beta(n) = minimum value found so far
 Note: The alpha values start at -infinity and only increase, while beta values start at
+infinity and only decrease.
Beta cutoff: Given a MAX node n, cut off the search below n (i.e., don’t generate or examine
any more of n’s children) if alpha(n) >= beta(i) for some MIN node ancestor i of n.
Alpha cutoff: stop searching below MIN node n if beta(n) <= alpha(i) for some MAX node
ancestor i of n.
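
The same recursion with both cutoff rules added gives alpha-beta (a sketch under the same placeholder interface as the minimax code above; it returns the same root value while skipping provably irrelevant subtrees):

def alphabeta(state, depth, alpha, beta, maximizing,
              successors, evaluate, terminal):
    """Minimax with alpha-beta cutoffs. Call with alpha=-inf, beta=+inf."""
    if depth == 0 or terminal(state):
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for s in successors(state):
            value = max(value, alphabeta(s, depth - 1, alpha, beta, False,
                                         successors, evaluate, terminal))
            alpha = max(alpha, value)
            if alpha >= beta:
                break          # beta cutoff: a MIN ancestor bounds this branch
        return value
    else:
        value = float("inf")
        for s in successors(state):
            value = min(value, alphabeta(s, depth - 1, alpha, beta, True,
                                         successors, evaluate, terminal))
            beta = min(beta, value)
            if beta <= alpha:
                break          # alpha cutoff: a MAX ancestor bounds this branch
        return value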

Effectiveness of Alpha – Beta Pruning


Alpha-beta is guaranteed to compute the same value for the root node as computed by minimax,
with less or equal computation.
Worst case: no pruning; examine b^d leaf nodes, where each node has b children and a d-ply
search is performed.
Best case: examine only about 2·b^(d/2) leaf nodes.
 Result is you can search twice as deep as minimax!
Best case is when each player’s best move is the first alternative generated.
In Deep Blue, they found empirically that alpha-beta pruning meant that the average branching
factor at each node was about 6 instead of about 35!
The Alpha-beta Procedure
There are two rules for terminating search:
 Search can be stopped below any MIN node having a beta value less than or equal to the
alpha value of any of its MAX ancestors.

 Search can be stopped below any MAX node having an alpha value greater than or equal
to the beta value of any of its MIN ancestors.

Limitations of Minimax and Alpha – Beta Pruning
The main drawback of the minimax algorithm is that it gets really slow for complex games such
as Chess. These games have a huge branching factor, and the player has many choices to
consider. This limitation of the minimax algorithm can be improved by alpha-beta pruning.
The main disadvantage of alpha-beta pruning is that it still requires setting a depth limit, as in
most cases it is not feasible to search the entire game tree.
Imperfect Real-Time Decisions
In game-playing algorithms like minimax or alpha-beta pruning, the traditional approach
involves exploring the entire game tree to determine the best move. However, this exhaustive
search is often impractical because the game tree grows exponentially with depth.
To address this issue, two modifications are commonly made (a short sketch follows below):
1. Using a heuristic evaluation function (EVAL):
– Instead of evaluating the utility of terminal nodes directly (as a utility function does),
a heuristic evaluation function (EVAL) is employed to estimate the utility of non-terminal nodes.
– This heuristic provides an approximate evaluation of the board position based on features
such as piece placement, control of the board, mobility, etc.
– By using this heuristic, the algorithm can evaluate non-terminal nodes without having to
traverse the entire subtree, significantly reducing the computational cost.
2. Introducing a cutoff test:
– Instead of searching the game tree all the way to terminal states, a cutoff test is
introduced to decide when to stop the search and apply the heuristic evaluation function (EVAL).
– The cutoff test sets a maximum depth or a predefined time limit for the search, beyond
which the algorithm stops exploring further and evaluates nodes using the heuristic function.
– This allows the algorithm to focus on the most promising branches of the game tree,
avoiding the exploration of less relevant or less likely paths.
– When to apply the cutoff test depends on factors such as available computational resources,
time constraints, and the complexity of the game.
These modifications allow game-playing algorithms to make effective decisions within a
reasonable amount of time, even in games with large branching factors and deep game trees. By
combining heuristic evaluation functions with cutoff tests, these algorithms strike a balance
between exploration and exploitation, enabling efficient decision-making in complex games.
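As a rough illustration of these two modifications, the fragment below replaces the terminal
test with a cutoff test and the utility function with EVAL; cutoff_test, EVAL, and successors
are assumed placeholders (not a specific engine's API), and a real program would combine this
with alpha-beta pruning as sketched earlier.

def h_minimax(state, depth, maximizing):
    # Cutoff test: stop at terminal states, at a depth limit, or when time is up,
    # and fall back on the heuristic estimate instead of the true utility.
    if cutoff_test(state, depth):
        return EVAL(state)
    values = [h_minimax(s, depth + 1, not maximizing) for s in successors(state)]
    return max(values) if maximizing else min(values)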
Evaluation functions
An evaluation function returns an estimate of the expected utility of the game from a given position.
How do we design good evaluation functions?
1. The evaluation function should order the non-terminal states in the same way as the true
utility function.
2. The computation must not take too long.
3. For non-terminal states, the evaluation function should be strongly correlated with the
actual chances of winning.
Hence, there is a trade-off between the accuracy of the evaluation function and its time cost.
Features of the state: most evaluation functions work by calculating various features of the state,
e.g. in chess, the numbers of white pawns, black pawns, white queens, black queens, and so on.
Example: An approximate material value for each piece can be assigned: each pawn is worth 1, a
knight or bishop is worth 3, a rook 5, and the queen 9. Other features such as "good pawn structure"
and "king safety" might be worth half a pawn, say. All other things being equal, a side that has a
secure material advantage of a pawn or more will probably win the game, and a 3-point advantage is
sufficient for near-certain victory.
It should be clear that the performance of a game-playing program is extremely dependent on
the quality of its evaluation function.
Two ways to design an evaluation function:
1. Expected value: in the context of game-playing algorithms, this refers to the weighted average
of the possible outcomes associated with a particular game state. It is a way to estimate the utility
or value of a state based on the likelihood of the different outcomes occurring from that state.
(A drawback: it requires too many categories, and hence too much experience, to estimate. The
sketch after this list checks the arithmetic below.)
Ex: 72% of the states encountered in the two-pawns vs. one-pawn category lead to a win (utility
+1), 20% to a loss (utility 0), and 8% to a draw (utility 1/2).
Then a reasonable evaluation for states in the category is the expected value:
(0.72 × 1) + (0.20 × 0) + (0.08 × 1/2) = 0.76.
So, the expected value for states in the two-pawns vs. one-pawn category is 0.76.
2. Weighted linear function (most evaluation functions use this; see the sketch below). We can
compute separate numerical contributions from each feature and then combine them to find the
total value:
Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)
Each wi is a weight and each fi is a feature of the position. For chess, the fi could be the numbers of
each kind of piece on the board (i.e. feature), and wi could be the values of the pieces (1 for pawn, 3
for bishop, etc.).
Adding up the values of features in fact involves a strong assumption (that the contribution of each
feature is independent of the values of the other features), thus current programs for games also use
nonlinear combinations of features.
Deep Blue has about 6000 features in its evaluation function.
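Both ideas can be made concrete with a short, purely illustrative Python sketch. The piece
weights are the material values quoted above, while count(state, piece, side) is a hypothetical
helper (not any real engine's interface) returning how many pieces of a given kind a side has.

# Expected value for the two-pawns vs. one-pawn category,
# given as (probability, utility) pairs:
outcomes = [(0.72, 1.0), (0.20, 0.0), (0.08, 0.5)]
expected_value = sum(p * u for p, u in outcomes)
print(expected_value)  # 0.76

# Weighted linear evaluation Eval(s) = sum_i wi * fi(s), with material weights
# wi and features fi = (white count - black count) for each kind of piece:
PIECE_WEIGHTS = {'pawn': 1, 'knight': 3, 'bishop': 3, 'rook': 5, 'queen': 9}

def material_eval(state):
    total = 0
    for piece, w in PIECE_WEIGHTS.items():
        f = count(state, piece, 'white') - count(state, piece, 'black')
        total += w * f
    return total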
Cutting off search: The most straightforward approach to controlling the amount of search is
to set a fixed depth limit, so that the cutoff test succeeds for all nodes at or below depth d. The
depth is chosen so that the amount of time used will not exceed what the rules of the game allow.
A slightly more robust approach is to apply iterative deepening, as defined earlier. When time
runs out, the program returns the move selected by the deepest completed search.
These approaches can have some disastrous consequences because of the approximate
nature of the evaluation function.
Obviously, a more sophisticated cutoff test is needed. The evaluation function should only be
applied to positions that are quiescent, that is, unlikely to exhibit wild swings in value in the
near future. In chess, for example, positions in which favorable captures can be made are not
quiescent for an evaluation function that just counts material. Non-quiescent positions can be
expanded further until quiescent positions are reached. This extra search is called a
quiescence search; sometimes it is restricted to consider only certain types of moves, such as
capture moves, that will quickly resolve the uncertainties in the position.
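A minimal sketch of such a quiescence search, restricted to capture moves and written in the
common negamax form, might look as follows; EVAL, capture_moves, and apply are assumed
placeholders rather than a particular engine's API, and EVAL is taken from the perspective of
the side to move.

def quiescence(state, alpha, beta):
    stand_pat = EVAL(state)              # static evaluation of the current position
    if stand_pat >= beta:                # already good enough here: cut off
        return stand_pat
    alpha = max(alpha, stand_pat)
    for move in capture_moves(state):    # expand only the moves that resolve uncertainty
        score = -quiescence(apply(state, move), -beta, -alpha)
        if score >= beta:
            return score
        alpha = max(alpha, score)
    return alpha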
The horizon problem is more difficult to eliminate. It arises when the program is facing a move
by the opponent that causes serious damage and is ultimately unavoidable, but can be
temporarily avoided by delaying tactics. Consider, for example, a chess position in which Black
is slightly ahead in material, but if White can advance a pawn from the seventh row to the
eighth, it will become a queen and give White an easy win. Black can forestall this for a dozen
or so ply by checking White with a rook, but inevitably the pawn will become a queen. At
present, no general solution to the horizon problem has been found.
Imperfect Real-Time Decisions (continued)

Let us assume we have implemented a minimax search with a reasonable evaluation function for
chess, and a reasonable cutoff test with a quiescence search.
With a well-written program on an ordinary computer, one can probably search about 1000
positions a second. How well will our program play?
In tournament chess, one gets about 150 seconds per move, so we can look at 150,000 positions.
In chess, the branching factor is about 35, so our program will be able to look ahead only three
or four ply, and will play at the level of a complete novice! Even average human players can
make plans six or eight ply ahead, so our program will be easily fooled.
Fortunately, it is possible to compute the correct minimax decision without looking at every node
in the search tree. The process of eliminating a branch of the search tree from consideration
without examining it is called pruning the search tree. The particular technique we will examine
is called alpha-beta pruning. When applied to a standard minimax tree, it returns the same
move as minimax would, but prunes away branches that cannot possibly influence the final
decision.
Effectiveness: If we assume that this can be done, then it turns out that alpha-beta needs to
examine only O(b^(d/2)) nodes to pick the best move, instead of O(b^d) with minimax. This means
that the effective branching factor is √b instead of b: for chess, about 6 instead of 35.
Put another way, this means that alpha-beta can look ahead twice as far as minimax for the same
cost. Thus, by generating 150,000 nodes in the time allotment, a program can look ahead eight
ply instead of four. By thinking carefully about which computations actually affect the decision,
we are able to transform a program from a novice into an expert.
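A back-of-the-envelope check of these numbers, purely illustrative: with a budget of 150,000
positions per move, the reachable depth is roughly log(budget)/log(b) ply for effective
branching factor b.

import math
budget = 150_000                  # positions examined per move
for b in (35, 6):                 # effective branching factor without / with alpha-beta
    print(b, round(math.log(budget) / math.log(b), 1))
# prints roughly 3.4 ply for b = 35 and 6.7 ply for b = 6, consistent with
# "three or four ply" for plain minimax versus roughly twice as deep with alpha-beta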
Forward Pruning: Some moves at a given node are pruned immediately without further
consideration.
The PROBCUT ("probabilistic cut") algorithm: a forward-pruning version of alpha-beta search that
uses statistics gained from prior experience to lessen the chance that the best move will be pruned.

Search versus Lookup
Many game programs precompute tables of best moves in the opening and endgame so that they
can look up a move rather than search.
For the opening (and early moves), the program uses table lookup, relying on human expertise
and statistics from a database of past games.
After about ten moves, play usually reaches a rarely seen position, and the program switches
from table lookup to search.
Near the end of the game there are again fewer possible positions, and thus more chance to use
lookup.
For the endgame, a computer can solve the position completely by producing a policy, which is a
mapping from every possible state to the best move in that state. Then we can simply look up
the best move rather than recompute it anew.
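A minimal sketch of this search-versus-lookup arrangement, with hypothetical table contents and
helper names (opening_book, endgame_policy, and search_best_move are placeholders):

opening_book = {'start-position': 'e2e4'}    # position key -> book move (from past games)
endgame_policy = {'KQ-vs-K-17': 'd7d8'}      # solved endgame: state key -> optimal move

def choose_move(key, state):
    if key in opening_book:                  # early game: table lookup
        return opening_book[key]
    if key in endgame_policy:                # late endgame: policy lookup
        return endgame_policy[key]
    return search_best_move(state)           # otherwise: fall back to (alpha-beta) search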

Ratings of human and computer chess champions
Chess has received by far the largest share of attention in game playing. In speed chess,
computers have defeated the world champion, Garry Kasparov. [Figure: ratings of human and
computer chess champions over the years]
Progress beyond a mediocre level was initially very slow: some programs in the early 1970s
became extremely complicated, with various kinds of tricks for eliminating some branches of
search, generating plausible moves, and so on, but the programs that won the ACM North
American Computer Chess Championships (initiated in 1970) tended to use straightforward
alpha-beta search, augmented with book openings and infallible endgame algorithms.
The first real jump in performance came not from better algorithms or evaluation functions, but
from hardware. Belle, the first special-purpose chess computer (Condon and Thompson, 1982),
used custom integrated circuits to implement move generation and position evaluation, enabling
it to search several million positions to make a single move. Belle's rating was around 2250, on
a scale where beginning humans are 1000 and the world champion around 2750; it became the
first master-level program.
The HITECH system, also a special-purpose computer, was designed by former world
correspondence champion Hans Berliner and his student Carl Ebeling to allow rapid calculation
of very sophisticated evaluation functions. Generating about 10 million positions per move and
using probably the most accurate evaluation of positions yet developed, HITECH became
computer world champion in 1985, and was the first program to defeat a human grandmaster,
Arnold Denker, in 1987. At the time it ranked among the top 800 human players in the world.

By the early 1990s, the best system was Deep Thought 2, sponsored by IBM, which had hired part
of the team that built the Deep Thought system at Carnegie Mellon University. Although Deep
Thought 2 used a simple evaluation function, it examined about half a billion positions per
move, allowing it to reach depth 10 or 11, with a special provision to follow lines of forced
moves still further (it once found a 37-move checkmate). In February 1993, Deep Thought 2
competed against the Danish Olympic team and won 3-1, beating one grandmaster and drawing
against another. Its FIDE rating was around 2600, placing it among the top 100 human players.
The next version of the system, Deep Blue, was designed to use a parallel array of 1024 custom
VLSI chips, intended to let it search the equivalent of one billion positions per second
(100-200 billion per move) and reach depth 14. A 10-processor version was scheduled to play the
Israeli national team (one of the strongest in the world) in May 1995, with the full-scale
system to challenge the world champion shortly thereafter.
Subsequent to its predecessor Deep Thought's 1989 loss to Garry Kasparov, Deep Blue played
Kasparov twice more. In the first game of the first match, which took place from 10 to 17
February 1996, Deep Blue became the first machine to win a chess game against a reigning
world champion under regular time controls. However, Kasparov won three and drew two of
the following five games, beating Deep Blue by 4–2 at the close of the match.
Deep Blue's hardware was subsequently upgraded, doubling its speed before it faced Kasparov
again in May 1997, when it won the six-game rematch 3½-2½. Deep Blue won the deciding
game after Kasparov failed to secure his position in the opening, thereby becoming the first
computer system to defeat a reigning world champion in a match under standard chess
tournament time controls.
On the 44th move of the first game of their second match, unknown to Kasparov, a bug in Deep
Blue's code led it to enter an unintentional loop, which it exited by playing a randomly
selected valid move. Kasparov did not take this possibility into account and misattributed the
seemingly pointless move to "superior intelligence". Kasparov's performance then declined in
the following game, though he denied this was due to anxiety in the wake of Deep Blue's
inscrutable move.
After his loss, Kasparov said that he sometimes saw unusual creativity in the machine's moves,
suggesting that during the second game, human chess players had intervened on behalf of the
machine. IBM denied this, saying the only human intervention occurred between games.
References
AI-Game & Optimal decisions in games & Imperfect real-time decisions & Partially observable
games, R. Daneel Olivaw (cnblogs): https://www.cnblogs.com/RDaneelOlivaw/p/7919696.html
