
Made," and Intelligence Defines "Thinking Power", Hence AI Means "A Man-Made Thinking Power."

The document provides an introduction to artificial intelligence, discussing its history, applications, and key concepts like machine learning and deep learning. It defines AI, describes its goals and components, and outlines some advantages like high accuracy and disadvantages like lack of original creativity.

Uploaded by

saRIKA
Copyright
© © All Rights Reserved

MODULE 1

Artificial Intelligence Tutorial


This Artificial Intelligence tutorial introduces AI and will help you understand the concepts behind it. It also
covers popular topics such as the history of AI, applications of AI, deep learning, machine learning, natural
language processing, reinforcement learning, Q-learning, intelligent agents, and various search algorithms.
What is Artificial Intelligence?
In today's world, technology is growing very fast, and we encounter new technologies every day.
One of the booming technologies of computer science is Artificial Intelligence, which is poised to create a new
revolution in the world by making intelligent machines. Artificial Intelligence is now all around us. It spans a
variety of subfields, ranging from general to specific, such as self-driving cars, playing chess, proving theorems,
playing music, and painting.
AI is one of the most fascinating and universal fields of computer science, with great scope for the future. AI aims
to make a machine work like a human.
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made"
and Intelligence means "thinking power"; hence AI means "a man-made thinking power."
So, we can define AI as:
 "It is a branch of computer science by which we can create intelligent machines which can behave like a human, think like humans, and
make decisions."
Artificial Intelligence exists when a machine has human-based skills such as learning, reasoning, and solving
problems.
With Artificial Intelligence, you do not need to preprogram a machine for every task; instead, you can create a
machine with programmed algorithms that can work with its own intelligence, and that is the strength of AI.
AI is not an entirely new idea: according to Greek myth, there were mechanical men in early days that could work
and behave like humans.
Why Artificial Intelligence?
Before learning about Artificial Intelligence, we should know why AI is important and why we should learn it.
Following are some main reasons to learn about AI:
o With the help of AI, you can create software or devices which can solve real-world problems easily and accurately, such as health issues, marketing, traffic issues, etc.
o With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google Assistant, Siri, etc.
o With the help of AI, you can build robots which can work in environments where human survival would be at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.
Goals of Artificial Intelligence
Following are the main goals of Artificial Intelligence:
1. Replicate human intelligence
2. Solve Knowledge-intensive tasks
3. An intelligent connection of perception and action
4. Building a machine which can perform tasks that require human intelligence, such as:
o Proving a theorem
o Playing chess
o Planning a surgical operation
o Driving a car in traffic
5. Creating a system which can exhibit intelligent behavior, learn new things by itself, demonstrate,
explain, and advise its user.
What Comprises Artificial Intelligence?
Artificial Intelligence is not just a part of computer science; it is a vast field that draws on many other
disciplines. To create AI, we should first know how intelligence is composed. Intelligence is an intangible part of
our brain which is a combination of reasoning, learning, problem-solving, perception, language understanding, etc.
To achieve the above factors for a machine or software, Artificial Intelligence requires the following disciplines:
o Mathematics
o Biology
o Psychology
o Sociology
o Computer Science
o Neuroscience (the study of neurons)
o Statistics
Advantages of Artificial Intelligence
Following are some main advantages of Artificial Intelligence:
o High accuracy with fewer errors: AI machines or systems make fewer errors and achieve high accuracy because they take decisions based on prior experience or information.
o High speed: AI systems can be very fast at decision-making, which is why AI systems can beat a chess champion at chess.
o High reliability: AI machines are highly reliable and can perform the same action multiple times with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human can be risky.
o Digital assistant: AI can be very useful for providing digital assistants to users. For example, AI technology is currently used by various e-commerce websites to show products according to customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars which can make our journeys safer and hassle-free, facial recognition for security purposes, natural language processing to communicate with humans in human language, etc.
Disadvantages of Artificial Intelligence
Every technology has some disadvantages, and the same goes for Artificial Intelligence. Although it is a very
advantageous technology, it still has some disadvantages which we need to keep in mind while creating an AI system.
Following are the disadvantages of AI:
o High cost: The hardware and software requirements of AI are very costly, as it requires a lot of maintenance to meet current-world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot work out of the box, as a robot will only do the work for which it is trained or programmed.
o No feelings and emotions: AI machines can be outstanding performers, but they do not have feelings, so they cannot form any kind of emotional attachment with humans, and may sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: As technology advances, people are becoming more dependent on devices and hence are losing their mental capabilities.
o No original creativity: Humans are creative and can imagine new ideas, but AI machines cannot match this power of human intelligence and cannot be creative and imaginative.
Applications of AI
Artificial Intelligence has various applications in today's society. It is becoming essential because it can solve
complex problems efficiently in multiple industries, such as healthcare, entertainment, finance, education, etc. AI
is making our daily life more comfortable and faster.
Following are some sectors which have applications of Artificial Intelligence:
1. AI in Astronomy
o Artificial Intelligence can be very useful for solving complex problems about the universe. AI technology can help us understand the universe, such as how it works, its origin, etc.
2. AI in Healthcare
o In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on this industry.
o Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can inform them when patients are worsening so that medical help can reach the patient before hospitalization.
3. AI in Gaming
o AI can be used for gaming purposes. AI machines can play strategic games like chess, where the machine needs to think about a large number of possible positions.
4. AI in Finance
o AI and the finance industry are a good match for each other. The finance industry is implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine learning in financial processes.
5. AI in Data Security
o The security of data is crucial for every company, and cyber-attacks are growing very rapidly in the digital world. AI can be used to make your data safer and more secure. Some examples, such as AEG bot and AI2 Platform, are used to detect software bugs and cyber-attacks more effectively.
6. AI in Social Media
o Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to be stored and managed in a very efficient way. AI can organize and manage massive amounts of data. AI can analyze lots of data to identify the latest trends, hashtags, and requirements of different users.
7. AI in Travel & Transport
o AI is in high demand in the travel industry. AI is capable of doing various travel-related tasks, from making travel arrangements to suggesting hotels, flights, and the best routes to customers. Travel industries are using AI-powered chatbots which can interact with customers in a human-like way for better and faster response.
8. AI in Automotive Industry
o Some automotive industries are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
o Various industries are currently working on developing self-driving cars which can make your journey safer and more secure.
9. AI in Robotics
o Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed so that they can perform some repetitive task, but with the help of AI, we can create intelligent robots which can perform tasks based on their own experience without being pre-programmed.
o Humanoid robots are the best examples of AI in robotics. Recently, the intelligent humanoid robots named Erica and Sophia have been developed, which can talk and behave like humans.
10. AI in Entertainment
o We are currently using some AI-based applications in our daily life through entertainment services such as Netflix or Amazon. With the help of ML/AI algorithms, these services show recommendations for programs or shows.
11. AI in Agriculture
o Agriculture is an area which requires various resources, labor, money, and time for the best results. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI in the form of agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.
12. AI in E-commerce
o AI is providing a competitive edge to the e-commerce industry, and it is increasingly in demand in the e-commerce business. AI is helping shoppers discover associated products with the recommended size, color, or even brand.
13. AI in Education
o AI can automate grading so that the tutor can have more time to teach. An AI chatbot can communicate with students as a teaching assistant.
o In the future, AI can work as a personal virtual tutor for students, which will be easily accessible at any time and any place.
History of Artificial Intelligence
Artificial Intelligence is not a new word and not a new technology for researchers. This technology is much older
than you would imagine. There are even myths of mechanical men in ancient Greek and Egyptian mythology.
Following are some milestones in the history of AI which trace the journey from the beginnings of AI to
present-day developments.
Maturation of Artificial Intelligence (1943-1952)
o Year 1943: The first work which is now recognized as AI was done by Warren McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950. He published "Computing Machinery and Intelligence", in which he proposed a test. The test checks a machine's ability to exhibit intelligent behavior equivalent to human intelligence, and is called the Turing test.
The birth of Artificial Intelligence (1952-1956)
o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program", which was named "Logic Theorist". This program proved 38 of 52 mathematics theorems and found new and more elegant proofs for some theorems.
o Year 1956: The term "Artificial Intelligence" was first adopted by American computer scientist John McCarthy at the Dartmouth Conference. For the first time, AI was established as an academic field.
Around that time, high-level computer languages such as FORTRAN, LISP, and COBOL were invented, and the
enthusiasm for AI was very high.
The golden years-Early enthusiasm (1956-1974)
o Year 1966: Researchers emphasized developing algorithms which can solve mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, which was named ELIZA.
o Year 1972: The first intelligent humanoid robot was built in Japan, which was named WABOT-1.
The first AI winter (1974-1980)
o The period between 1974 and 1980 was the first AI winter. An AI winter refers to a time period in which computer scientists dealt with a severe shortage of government funding for AI research.
o During AI winters, public interest in artificial intelligence decreased.
A boom of AI (1980-1987)
o Year 1980: After the AI winter, AI came back with "Expert Systems". Expert systems were programs that emulate the decision-making ability of a human expert.
o In the year 1980, the first national conference of the American Association for Artificial Intelligence was held at Stanford University.
The second AI winter (1987-1993)
o The period between 1987 and 1993 was the second AI winter.
o Again, investors and governments stopped funding AI research because of high costs and inefficient results. Expert systems such as XCON proved very expensive to maintain.
The emergence of intelligent agents (1993-2011)
o Year 1997: IBM's Deep Blue beat world chess champion Garry Kasparov and became the first computer to beat a world chess champion.
o Year 2002: For the first time, AI entered the home in the form of Roomba, a vacuum cleaner.
o Year 2006: By the year 2006, AI had entered the business world. Companies like Facebook, Twitter, and Netflix also started using AI.
Deep learning, big data and artificial general intelligence (2011-present)
o Year 2011: IBM's Watson won Jeopardy!, a quiz show in which it had to solve complex questions as well as riddles. Watson proved that it could understand natural language and solve tricky questions quickly.
o Year 2012: Google launched an Android app feature, "Google Now", which was able to provide information to the user as a prediction.
o Year 2014: The chatbot "Eugene Goostman" won a competition based on the famous "Turing test."
o Year 2018: "Project Debater" from IBM debated complex topics with two master debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex", a virtual assistant which booked a hairdresser appointment over a phone call, and the lady on the other side did not notice that she was talking with a machine.
AI has now developed to a remarkable level. Deep learning, big data, and data science are currently booming.
Companies like Google, Facebook, IBM, and Amazon are working with AI and creating amazing devices. The
future of Artificial Intelligence is inspiring and promises even higher levels of intelligence.
Types of Artificial Intelligence:
Artificial Intelligence can be divided into various types. There are two main categorizations: one based on
capabilities and one based on the functionality of AI. The following flow diagram explains the types of AI.

AI type-1: Based on Capabilities


1. Weak AI or Narrow AI:
o Narrow AI is a type of AI which is able to perform a dedicated task with intelligence. Narrow AI is the most common and currently available form of AI.
o Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific task. Hence it is also termed weak AI. Narrow AI can fail in unpredictable ways if it goes beyond its limits.
o Apple Siri is a good example of Narrow AI: it operates with a limited, pre-defined range of functions.
o IBM's Watson supercomputer also comes under Narrow AI, as it uses an expert system approach combined with machine learning and natural language processing.
o Some examples of Narrow AI are playing chess, purchase suggestions on e-commerce sites, self-driving cars, speech recognition, and image recognition.
2. General AI:
o General AI is a type of intelligence which could perform any intellectual task as efficiently as a human.
o The idea behind general AI is to make a system which could be smarter and think like a human on its own.
o Currently, no system exists which could come under general AI and perform any task as well as a human.
o Researchers worldwide are now focused on developing machines with general AI.
o Systems with general AI are still under research, and it will take a lot of effort and time to develop such systems.
3. Super AI:
o Super AI is a level of intelligence at which machines could surpass human intelligence and perform any task better than a human, with cognitive properties. It is an outcome of general AI.
o Some key characteristics of super AI include the ability to think, reason, solve puzzles, make judgments, plan, learn, and communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. Developing such systems in reality would still be a world-changing task.

AI type-2: Based on Functionality


1. Reactive Machines
o Purely reactive machines are the most basic types of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines only focus on the current scenario and react to it with the best possible action.
o IBM's Deep Blue system is an example of reactive machines.
o Google's AlphaGo is also an example of reactive machines.
2. Limited Memory
o Limited memory machines can store past experiences or some data for a short period of time.
o These machines can use stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These cars can store recent
speed of nearby cars, the distance of other cars, speed limit, and other information to navigate the road.
3. Theory of Mind
o Theory of Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
o This type of AI machine has not been developed yet, but researchers are making a lot of effort and progress toward developing such AI machines.
4. Self-Awareness
o Self-awareness AI is the future of Artificial Intelligence. These machines will be super intelligent, and will have their own consciousness, sentiments, and self-awareness.
o These machines will be smarter than the human mind.
o Self-awareness AI does not exist in reality yet; it is a hypothetical concept.
Agents in Artificial Intelligence
An AI system can be defined as the study of the rational agent and its environment. The agents sense the environment through sensors and act on their
environment through actuators. An AI agent can have mental properties such as knowledge, belief, intention, etc.
What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that environment through actuators. An agent runs in a cycle
of perceiving, thinking, and acting. An agent can be:
o Human-Agent: A human agent has eyes, ears, and other organs which work as sensors, and hands, legs, and vocal tract which work as actuators.
o Robotic Agent: A robotic agent can have cameras, an infrared range finder, and NLP as sensors, and various motors as actuators.
o Software Agent: A software agent can have keystrokes and file contents as sensory input, and it acts on those inputs.
Hence the world around us is full of agents such as thermostats, cellphones, and cameras, and even we ourselves are agents.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: A sensor is a device which detects changes in the environment and sends the information to other electronic devices. An agent observes its
environment through sensors.
Actuators: Actuators are the components of machines that convert energy into motion. The actuators are only responsible for moving and controlling a
system. An actuator can be an electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels, arms, fingers, wings, etc.

Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators for achieving goals. An intelligent agent may learn
from the environment to achieve its goals. A thermostat is an example of an intelligent agent.
Following are the main four rules for an AI agent:
o Rule 1: An AI agent must have the ability to perceive the environment.
o Rule 2: The observation must be used to make decisions.
o Rule 3: Decision should result in an action.
o Rule 4: The action taken by an AI agent must be a rational action.
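The four rules above can be sketched with the thermostat mentioned earlier. This is only a minimal illustration: the class name, the target temperature, and the tolerance are assumptions made for the example, not part of any real thermostat API.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A toy agent; target and tolerance are illustrative assumptions."""
    target: float = 21.0     # desired temperature in degrees Celsius
    tolerance: float = 0.5   # dead band around the target

    def perceive(self, room_temperature: float) -> float:
        # Rule 1: read the environment through a sensor.
        return room_temperature

    def decide(self, percept: float) -> str:
        # Rule 2: the observation is used to make the decision.
        if percept < self.target - self.tolerance:
            return "heat_on"
        if percept > self.target + self.tolerance:
            return "heat_off"
        return "no_change"

    def act(self, decision: str, heater_on: bool) -> bool:
        # Rule 3: the decision results in an action on the actuator (heater switch).
        if decision == "heat_on":
            return True
        if decision == "heat_off":
            return False
        return heater_on


def step(agent: Thermostat, room_temperature: float, heater_on: bool) -> bool:
    """One perceive-decide-act cycle; returns the new heater state."""
    percept = agent.perceive(room_temperature)
    decision = agent.decide(percept)
    return agent.act(decision, heater_on)
```

Rule 4 is reflected in the decision logic: every action moves the room toward the target temperature, which is the rational choice for this agent's goal.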
Rational Agent:
A rational agent is an agent which has clear preferences, models uncertainty, and acts in a way that maximizes its performance measure.
A rational agent is said to do the right thing. AI is about creating rational agents to use for game theory and decision theory in various real-world
scenarios.
For an AI agent, rational action is most important because in a reinforcement learning algorithm, for each best possible action the agent gets a positive
reward, and for each wrong action the agent gets a negative reward.
Note: Rational agents in AI are very similar to intelligent agents.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged on the basis of the following points:
o The performance measure, which defines the success criterion.
o The agent's prior knowledge of its environment.
o The best possible actions that an agent can perform.
o The sequence of percepts.
Note: Rationality differs from omniscience because an omniscient agent knows the actual outcome of its actions and acts accordingly, which is not possible in reality.
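One common way to make this concrete is to score each action by its expected performance, given what the agent knows, and choose the highest. The sketch below assumes that reading; the function names, actions, probabilities, and scores are all invented for illustration.

```python
def expected_performance(outcomes):
    """outcomes: list of (probability, performance score) pairs for one action."""
    return sum(p * score for p, score in outcomes)


def rational_action(action_models):
    """Pick the action whose expected performance is highest."""
    return max(action_models, key=lambda a: expected_performance(action_models[a]))


# Illustrative numbers only: crossing a road under uncertainty.
ACTION_MODELS = {
    "cross_now": [(0.9, 10.0), (0.1, -100.0)],   # usually fine, sometimes very bad
    "wait_for_green": [(1.0, 8.0)],              # slower but safe
}
```

Here `rational_action(ACTION_MODELS)` selects `"wait_for_green"`. An omniscient agent would know in advance whether crossing now turns out fine; a rational agent can only weigh the possible outcomes it is aware of.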
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The structure of an intelligent agent is a combination of architecture and
agent program. It can be viewed as:
Agent = Architecture + Agent program
Following are the main three terms involved in the structure of an AI agent:
Architecture: Architecture is the machinery that an AI agent executes on.
Agent Function: The agent function maps a percept sequence to an action.
f : P* → A
Agent program: The agent program is an implementation of the agent function. An agent program executes on the physical architecture.
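As a rough illustration of the agent function f : P* → A, the sketch below implements an agent program as a lookup from the percept sequence seen so far to an action. The two-square vacuum world, the table entries, and the names `TABLE` and `make_table_driven_agent` are assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

Percept = Tuple[str, str]   # (location, status), e.g. ("A", "Dirty")
Action = str

# Hypothetical lookup table: percept sequence -> action.
TABLE: Dict[Tuple[Percept, ...], Action] = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}


def make_table_driven_agent(table: Dict[Tuple[Percept, ...], Action]) -> Callable[[Percept], Action]:
    percepts: List[Percept] = []   # P*: the percept history seen so far

    def agent_program(percept: Percept) -> Action:
        percepts.append(percept)
        # f maps the whole percept sequence to an action; unknown sequences do nothing.
        return table.get(tuple(percepts), "NoOp")

    return agent_program
```

A table like this grows very quickly with the length of the percept history, which is one reason practical agent programs compute the action instead of storing every case.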
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an AI agent or rational agent, we can group its properties under the PEAS
representation model. It is made up of four words:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here the performance measure is the objective for the success of an agent's behavior.
PEAS for self-driving cars:
For a self-driving car, the PEAS representation will be:
Performance: Safety, time, legal drive, etc.
Environment: Roads, other vehicles, etc.
Actuators: Steering, accelerator, brake, etc.
Sensors: Camera, GPS, speedometer, odometer, etc.
Example of Agents with their PEAS representation
Agent | Performance measure | Environment | Actuators | Sensors
1. Medical Diagnose | Healthy patient, Minimized cost | Patient, Hospital, Staff | Tests, Treatments | Keyboard (Entry of symptoms)
2. Vacuum Cleaner | Cleanness, Efficiency, Battery life, Security | Room, Table, Wood floor, Carpet, Various obstacles | Wheels, Brushes, Vacuum Extractor | Camera, Dirt detection sensor, Cliff sensor, Bump sensor, Infrared wall sensor
3. Part-picking Robot | Percentage of parts in correct bins | Conveyor belt with parts, Bins | Jointed arms, Hand | Camera, Joint angle sensors

Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these
agents can improve their performance and generate better actions over time. These are given below:
o Simple Reflex Agent
o Model-based reflex agent
o Goal-based agents
o Utility-based agent
o Learning agent
1. Simple Reflex agent:
o The Simple reflex agents are the simplest agents. These agents take decisions on the basis of the current percepts and ignore the rest of the percept history.
o These agents only succeed in a fully observable environment.
o The Simple reflex agent does not consider any part of the percept history during its decision and action process.
o The Simple reflex agent works on the condition-action rule, which means it maps the current state to an action. For example, a Room Cleaner agent works only if there is dirt in the room.
o Problems for the simple reflex agent design approach:
o They have very limited intelligence
o They do not have knowledge of non-perceptual parts of the current state
o Mostly too big to generate and to store.
o Not adaptive to changes in the environment.

2. Model-based reflex agent
o The Model-based agent can work in a partially observable environment, and track the situation.
o A model-based agent has two important factors:
o Model: It is knowledge about "how things happen in the world," so it is called a Model-based agent.
o Internal State: It is a representation of the current state based on percept history.
o These agents have the model, "which is knowledge of the world" and based on the model they perform
actions.
o Updating the agent state requires information about:
o How the world evolves
o How the agent's action affects the world.
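The difference between the two reflex designs above can be sketched with a two-square room-cleaner world. The squares "A" and "B", the action names, and the class name are assumptions for illustration only.

```python
def simple_reflex_vacuum(percept):
    """Condition-action rules over the current percept only; no history is kept."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


class ModelBasedVacuum:
    """Keeps an internal state that is built up from the percept history."""

    def __init__(self):
        # Internal state: what the agent currently believes about each square.
        self.believed = {"A": "Unknown", "B": "Unknown"}

    def __call__(self, percept):
        location, status = percept
        self.believed[location] = status          # update the state from the percept
        if status == "Dirty":
            self.believed[location] = "Clean"     # model: sucking cleans the square
            return "Suck"
        if all(s == "Clean" for s in self.believed.values()):
            return "NoOp"                         # every square is known to be clean
        return "Right" if location == "A" else "Left"
```

The simple reflex agent keeps moving between clean squares forever because it cannot remember what it has already seen, while the model-based agent stops once its internal state says both squares are clean.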
3. Goal-based agents
o Knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
o The agent needs to know its goal, which describes desirable situations.
o Goal-based agents expand the capabilities of the model-based agent by having the "goal" information.
o They choose an action so that they can achieve the goal.
o These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved or not. Such consideration of different scenarios is called searching and planning, which makes an agent proactive.
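Searching for a sequence of actions that reaches the goal can be sketched with breadth-first search. The small grid world and the function names below are hypothetical, chosen only to keep the example self-contained.

```python
from collections import deque


def plan(start, goal, neighbors):
    """Breadth-first search for a shortest action sequence from start to goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, next_state in neighbors(state):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None  # the goal cannot be reached


def grid_neighbors(state, size=3):
    """Hypothetical environment: a size x size grid with four moves."""
    x, y = state
    moves = {"Right": (x + 1, y), "Left": (x - 1, y), "Up": (x, y + 1), "Down": (x, y - 1)}
    return [(a, (nx, ny)) for a, (nx, ny) in moves.items()
            if 0 <= nx < size and 0 <= ny < size]
```

Calling `plan((0, 0), (2, 1), grid_neighbors)` returns a three-step action sequence. The agent considers whole sequences of future actions before acting, which is what makes it proactive.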

4. Utility-based agents
o These agents are similar to the goal-based agent but add an extra component of utility measurement, which sets them apart by providing a measure of success at a given state.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The utility-based agent is useful when there are multiple possible alternatives and an agent has to choose the best action among them.
o The utility function maps each state to a real number to check how efficiently each action achieves the goals.
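A minimal sketch of this idea follows: every route reaches the destination (the goal), and a utility function ranks them. The weights inside `utility` and the route data are invented for illustration.

```python
def utility(state):
    """Hypothetical utility: a faster, safer, cheaper trip scores higher."""
    return -1.0 * state["minutes"] - 50.0 * state["risk"] - 0.5 * state["cost"]


def best_action(outcomes):
    """outcomes maps each action to the state it leads to; pick the highest utility."""
    return max(outcomes, key=lambda action: utility(outcomes[action]))


# All three routes achieve the goal; they differ only in how well they achieve it.
ROUTES = {
    "highway":  {"minutes": 30, "risk": 0.2, "cost": 6.0},
    "city":     {"minutes": 45, "risk": 0.1, "cost": 2.0},
    "shortcut": {"minutes": 25, "risk": 0.6, "cost": 1.0},
}
```

With these numbers `best_action(ROUTES)` picks `"highway"`. A goal-based agent would treat all three routes as equally acceptable; the utility function is what separates them.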

5. Learning Agents
o A learning agent in AI is a type of agent which can learn from its past experiences, that is, it has learning capabilities.
o It starts to act with basic knowledge and is then able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components, which are:
1. Learning element: It is responsible for making improvements by learning from the environment.
2. Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
3. Performance element: It is responsible for selecting the external action.
4. Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.
o Hence, learning agents are able to learn, analyze performance, and look for new ways to improve the performance.
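The four components can be sketched as methods of one small class. This is a simplified illustration, not a standard algorithm: the fixed exploration schedule, the running-average update, and all names are assumptions made for the example.

```python
class LearningAgent:
    def __init__(self, actions, explore_every=5):
        self.values = {a: 0.0 for a in actions}   # learned estimate per action
        self.counts = {a: 0 for a in actions}     # how often each action was tried
        self.explore_every = explore_every
        self.t = 0

    def performance_element(self):
        # Selects the external action that currently looks best.
        return max(self.values, key=self.values.get)

    def problem_generator(self):
        # Suggests the least-tried action to gain new, informative experience.
        return min(self.counts, key=self.counts.get)

    def critic(self, reward, standard=0.0):
        # Feedback: how well the agent did relative to a fixed performance standard.
        return reward - standard

    def learning_element(self, action, feedback):
        # Improves the estimate using a running average of the feedback.
        self.counts[action] += 1
        self.values[action] += (feedback - self.values[action]) / self.counts[action]

    def step(self, environment):
        self.t += 1
        explore = self.t % self.explore_every == 0
        action = self.problem_generator() if explore else self.performance_element()
        self.learning_element(action, self.critic(environment(action)))
        return action
```

For instance, with a hypothetical environment that pays 1, 5, and 2 for actions "a", "b", and "c", the agent first settles on "a", then discovers "b" through the problem generator and prefers it from then on.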
What is an Expert System?
An expert system is a computer program that is designed to solve complex problems and to provide decision-making
ability like a human expert. It does this by extracting knowledge from its knowledge base, using reasoning and
inference rules according to the user's queries.
The expert system is a part of AI, and the first ES was developed in the year 1970, which was the first successful
approach of artificial intelligence. It solves complex issues as an expert would, by extracting the knowledge
stored in its knowledge base. The system helps in decision making for complex problems using both facts and
heuristics, like a human expert. It is called an expert system because it contains the expert knowledge of a specific
domain and can solve complex problems of that particular domain. These systems are designed for a specific domain,
such as medicine, science, etc.
The performance of an expert system is based on the expert's knowledge stored in its knowledge base. The more
knowledge stored in the KB, the better the system performs. One common example of an ES is the suggestion of
spelling corrections while typing in the Google search box.
Below is the block diagram that represents the working of an expert system:

Note: It is important to remember that an expert system is not used to replace human experts; instead, it is used to
assist humans in making complex decisions. These systems do not have the human capability of thinking and work on
the basis of the knowledge base of the particular domain.
Below are some popular examples of expert systems:
o DENDRAL: It was an artificial intelligence project that was built as a chemical analysis expert system. It was used in organic chemistry to identify unknown organic molecules with the help of their mass spectra and a knowledge base of chemistry.
o MYCIN: It was one of the earliest backward chaining expert systems, designed to find the bacteria causing infections like bacteraemia and meningitis. It was also used for the recommendation of antibiotics and the diagnosis of blood clotting diseases.
o PXDES: It is an expert system that is used to determine the type and level of lung cancer. To determine the disease, it takes a picture of the upper body, which looks like a shadow. This shadow identifies the type and degree of harm.
o CaDeT: The CaDeT expert system is a diagnostic support system that can detect cancer at early stages.
Characteristics of Expert System
o High Performance: The expert system provides high performance for solving complex problems of a specific domain with high efficiency and accuracy.
o Understandable: It responds in a way that is easily understandable by the user. It can take input in human language and provide the output in the same way.
o Reliable: It is highly reliable in generating efficient and accurate output.
o Highly responsive: An ES provides the result for a complex query within a very short period of time.
Components of Expert System
An expert system mainly consists of three components:
o User Interface
o Inference Engine
o Knowledge Base
1. User Interface
With the help of a user interface, the expert system interacts with the user, takes queries as input in a readable
format, and passes them to the inference engine. After getting the response from the inference engine, it displays the
output to the user. In other words, it is an interface that helps a non-expert user communicate with the
expert system to find a solution.
2. Inference Engine (Rules Engine)
o The inference engine is known as the brain of the expert system, as it is the main processing unit of the
system. It applies inference rules to the knowledge base to derive a conclusion or deduce new information.
It helps in deriving a solution to the queries asked by the user; that solution is only as correct as the
knowledge it is derived from.
o With the help of an inference engine, the system extracts the knowledge from the knowledge base.
o There are two types of inference engine:
o Deterministic Inference engine: The conclusions drawn from this type of inference engine are assumed
to be true. It is based on facts and rules.
o Probabilistic Inference engine: The conclusions of this type of inference engine carry uncertainty and
are based on probability.
The inference engine uses the following modes to derive solutions:
o Forward Chaining: It starts from the known facts and rules, and applies the inference rules to add their
conclusions to the known facts.
o Backward Chaining: It is a backward reasoning method that starts from the goal and works backward to
find the known facts that support it.
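The two chaining modes can be sketched in a few lines of Python. This is a minimal illustration, not how any particular expert system is implemented: the rule format (a set of premises plus one conclusion), the function names, and the toy animal rules are all assumptions made for the example.

```python
def forward_chain(facts, rules):
    """Data-driven: keep firing rules whose premises are all known until nothing new is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, seen=frozenset()):
    """Goal-driven: a goal holds if it is a known fact or if all premises of some rule for it hold."""
    if goal in facts:
        return True
    if goal in seen:  # avoid looping on circular rules
        return False
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen | {goal}) for p in premises
        ):
            return True
    return False


# Hypothetical toy rules: (premises, conclusion)
RULES = [
    ({"has_fur", "gives_milk"}, "mammal"),
    ({"mammal", "eats_meat"}, "carnivore"),
]
FACTS = {"has_fur", "gives_milk", "eats_meat"}

derived = forward_chain(FACTS, RULES)
proved = backward_chain("carnivore", FACTS, RULES)
```

Forward chaining returns every fact that can be derived, while backward chaining only answers whether one particular goal can be proved.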
3. Knowledge Base
o The knowledge base is a type of storage that stores knowledge acquired from different experts of the
particular domain. It is considered a big store of knowledge. The larger and more accurate the knowledge base, the more
precise the Expert System will be.
o It is similar to a database that contains information and rules of a particular domain or subject.
o One can also view the knowledge base as a collection of objects and their attributes. For example, a Lion is an
object, and its attributes are that it is a mammal, it is not a domestic animal, etc.
Components of Knowledge Base
o Factual Knowledge: The knowledge which is based on facts and accepted by knowledge engineers comes
under factual knowledge.
o Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experience.
Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base using
If-then rules.
Knowledge Acquisition: It is the process of extracting, organizing, and structuring the domain knowledge,
specifying the rules to acquire the knowledge from various experts, and storing that knowledge in the knowledge
base.
Development of Expert System
Here, we will explain the working of an expert system by taking the example of the MYCIN ES. Below are the steps to
build a system like MYCIN:
o Firstly, the ES should be fed with expert knowledge. In the case of MYCIN, human experts specialized in the
medical field of bacterial infection provide information about the causes, symptoms, and other knowledge
in that domain.
o Once the KB of MYCIN has been updated, it is tested: the doctor provides a new problem to it.
The problem is to identify the presence of the bacteria by inputting the details of a patient, including the
symptoms, current condition, and medical history.
o The ES will need a questionnaire to be filled in by the patient to collect general information about the
patient, such as gender, age, etc.
o Once the system has collected all the information, it finds the solution to the problem by applying
if-then rules using the inference engine and the facts stored within the KB.
o In the end, it provides a response to the patient through the user interface.
Participants in the development of Expert System
There are three primary participants in the building of Expert System:
1. Expert: The success of an ES depends heavily on the knowledge provided by human experts. These experts
are persons who are specialized in that specific domain.
2. Knowledge Engineer: The knowledge engineer is the person who gathers the knowledge from the domain
experts and then codifies that knowledge into the system according to the formalism.
3. End-User: This is a person or a group of people who may not be experts and who use the
expert system to get a solution or advice for their complex queries.
Why Expert System?

Before using any technology, we should understand why to use it, and the same holds for the
ES. If we have human experts in every field, why do we need to develop a computer-based system?
The points below describe the need for the ES:
1. No memory limitations: It can store as much data as required and can recall it at the time of its
application. Human experts, however, are limited in how much they can memorize and recall at any time.
2. High Efficiency: If the knowledge base is updated with the correct knowledge, then it provides a highly
efficient output, which may not be possible for a human.
3. Expertise in a domain: There are lots of human experts in each domain, and they all have different
skills and different experiences, so it is not easy to get a final output for the query. But if
we put the knowledge gained from human experts into the expert system, then it provides an efficient
output by combining all the facts and knowledge.
4. Not affected by emotions: These systems are not affected by human states such as fatigue, anger,
depression, anxiety, etc. Hence the performance remains constant.
5. High security: These systems provide high security to resolve any query.
6. Considers all the facts: To respond to any query, it checks and considers all the available facts and
provides the result accordingly. A human expert, on the other hand, may overlook some facts for
any reason.
7. Regular updates improve the performance: If there is an issue in the result provided by the expert
system, we can improve the performance of the system by updating the knowledge base.
Capabilities of the Expert System
Below are some capabilities of an Expert System:
o Advising: It is capable of advising a human being on queries within the domain of the particular ES.
o Provide decision-making capabilities: It provides the capability of decision making in a domain,
such as making financial decisions, decisions in medical science, etc.
o Demonstrate a device: It is capable of demonstrating a new product, including its features,
specifications, how to use that product, etc.
o Problem-solving: It has problem-solving capabilities.
o Explaining a problem: It is also capable of providing a detailed description of an input problem.
o Interpreting the input: It is capable of interpreting the input given by the user.
o Predicting results: It can be used for the prediction of a result.
o Diagnosis: An ES designed for the medical field is capable of diagnosing a disease without using multiple
components as it already contains various inbuilt medical tools.
Advantages of Expert System
o These systems are highly reproducible.
o They can be used in risky places where human presence is not safe.
o Errors are less likely if the KB contains correct knowledge.
o The performance of these systems remains steady, as it is not affected by emotions, tension, or fatigue.
o They respond to a particular query at very high speed.
Limitations of Expert System
o The response of the expert system may be wrong if the knowledge base contains wrong information.
o Unlike a human being, it cannot produce a creative output for different scenarios.
o Its maintenance and development costs are very high.
o Knowledge acquisition for designing is very difficult.
o For each domain, we require a specific ES, which is one of the big limitations.
o It cannot learn by itself and hence requires manual updates.
Applications of Expert System
o In designing and manufacturing domain
It can be broadly used for designing and manufacturing physical devices such as camera lenses and
automobiles.
o In the knowledge domain
These systems are primarily used for publishing the relevant knowledge to the users. Two popular ES
used in this domain are an advisor and a tax advisor.
o In the finance domain
In the finance industry, it is used to detect any type of possible fraud or suspicious activity, and to advise
bankers on whether they should provide loans to a business or not.
o In the diagnosis and troubleshooting of devices
Expert systems are used to diagnose and troubleshoot faults. Medical diagnosis was the first area where these systems were used.
o Planning and Scheduling
The expert systems can also be used for planning and scheduling some particular tasks for achieving the
goal of that task.

MODULE 2

Search Algorithms in Artificial Intelligence


Search algorithms are one of the most important areas of Artificial Intelligence. This topic will explain all about
the search algorithms in AI.

Problem-solving agents:

In Artificial Intelligence, search techniques are universal problem-solving methods. Rational
agents or problem-solving agents in AI mostly use these search strategies or algorithms to solve a specific
problem and provide the best result. Problem-solving agents are goal-based agents and use atomic
representation. In this topic, we will learn various problem-solving search algorithms.

Search Algorithm Terminologies:

o Search: Searching is a step-by-step procedure to solve a search problem in a given search space. A
search problem can have three main factors:

1. Search Space: Search space represents a set of possible solutions which a system may have.
2. Start State: It is the state from where the agent begins the search.
3. Goal test: It is a function which observes the current state and returns whether the goal state
is achieved or not.
o Search tree: A tree representation of a search problem is called a search tree. The root of the search tree
is the root node, which corresponds to the initial state.
o Actions: It gives the description of all the actions available to the agent.
o Transition model: A description of what each action does is represented as a transition model.
o Path Cost: It is a function which assigns a numeric cost to each path.
o Solution: It is an action sequence which leads from the start node to the goal node.
o Optimal Solution: A solution is optimal if it has the lowest cost among all solutions.

Properties of Search Algorithms:

Following are the four essential properties of search algorithms to compare the efficiency of these algorithms:

Completeness: A search algorithm is said to be complete if it guarantees to return a solution whenever at least one
solution exists for the given input.

Optimality: If a solution found by an algorithm is guaranteed to be the best solution (lowest path cost) among
all other solutions, then it is said to be an optimal solution.

Time Complexity: Time complexity is a measure of the time an algorithm takes to complete its task.

Space Complexity: It is the maximum storage space required at any point during the search, expressed as a
function of the complexity of the problem.
Types of search algorithms
Based on the search problems, we can classify the search algorithms into uninformed search (blind search)
and informed search (heuristic search) algorithms.

Uninformed/Blind Search:
The uninformed search does not use any domain knowledge, such as the closeness or the location of the goal. It
operates in a brute-force way, as it only includes information about how to traverse the tree and how to identify
leaf and goal nodes. Uninformed search explores the search tree without any information
about the search space beyond the initial state, the operators, and the test for the goal, so it is also called blind search. It
examines each node of the tree until it reaches the goal node.
It can be divided into five main types:
o Breadth-first search
o Uniform cost search
o Depth-first search
o Iterative deepening depth-first search
o Bidirectional Search
Informed Search
Informed search algorithms use domain knowledge. In an informed search, problem information is available
which can guide the search. Informed search strategies can find a solution more efficiently than an uninformed
search strategy. Informed search is also called heuristic search.
A heuristic is a technique that is not guaranteed to find the best solution but usually finds a good
solution in reasonable time.
Informed search can solve complex problems that would be impractical to solve otherwise.
The traveling salesman problem is an example of a problem to which informed search is applied.
The two main informed search algorithms are:
1. Greedy Search
2. A* Search
Uninformed Search Algorithms
Uninformed search is a class of general-purpose search algorithms which operate in a brute-force way.
Uninformed search algorithms do not have additional information about the state or search space other
than how to traverse the tree, so they are also called blind search.
Following are the various types of uninformed search algorithms:
1. Breadth-first Search
2. Depth-first Search
3. Depth-limited Search
4. Iterative deepening depth-first search
5. Uniform cost search
6. Bidirectional Search
1. Breadth-first Search:
o Breadth-first search is one of the most common search strategies for traversing a tree or graph. This algorithm
searches breadthwise in a tree or graph, so it is called breadth-first search.
o The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the
current level before moving to nodes of the next level.
o The breadth-first search algorithm is an example of a general-graph search algorithm.
o Breadth-first search is implemented using a FIFO queue data structure.
Advantages:
o BFS will provide a solution if any solution exists.
o If there is more than one solution for a given problem, then BFS will provide the minimal solution, which
requires the least number of steps.
Disadvantages:
o It requires lots of memory since each level of the tree must be saved into memory to expand the next
level.
o BFS needs lots of time if the solution is far away from the root node.
Example:
In the below tree structure, we have shown the traversal of the tree using the BFS algorithm from the root node S to
goal node K. The BFS algorithm traverses in layers, so it will follow the path which is shown by the dotted arrow,
and the traversed path will be:
S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K

Time Complexity: The time complexity of the BFS algorithm is given by the number of nodes traversed by BFS
until the shallowest goal node, where d is the depth of the shallowest solution and b is the branching factor
(the number of successors of each node).
T(b) = 1 + b + b^2 + b^3 + ... + b^d = O(b^d)
Space Complexity: The space complexity of the BFS algorithm is given by the memory size of the frontier, which is O(b^d).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then BFS will
find a solution.
Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the node.
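A minimal Python sketch of BFS with a FIFO queue is given here. The tree used is an assumption: it is a hypothetical layout chosen only so that the visit order matches the traversal S, A, B, C, D, G, H, E, F, I, K listed in the example, since the original figure is not available. The function and variable names are illustrative.

```python
from collections import deque


def bfs(graph, start, goal):
    """Breadth-first search; returns (visit order, path to goal or None)."""
    frontier = deque([start])   # FIFO queue of nodes waiting to be expanded
    parent = {start: None}      # also serves as the set of discovered nodes
    order = []
    while frontier:
        node = frontier.popleft()
        order.append(node)
        if node == goal:
            path = []
            while node is not None:     # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return order, path[::-1]
        for child in graph.get(node, []):
            if child not in parent:
                parent[child] = node
                frontier.append(child)
    return order, None


# Hypothetical tree (assumed, not taken from the original figure)
TREE = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["G", "H"],
    "C": ["E", "F"],
    "G": ["I"],
    "H": ["K"],
}

order, path = bfs(TREE, "S", "K")
print(" -> ".join(order))
```

Because the queue is first-in first-out, every node at depth k is expanded before any node at depth k+1, which is why the first path found to the goal uses the fewest steps.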
2. Depth-first Search
o Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
o It is called depth-first search because it starts from the root node and follows each path to its greatest
depth before moving to the next path.
o DFS uses a stack data structure for its implementation.
o The process of the DFS algorithm is similar to that of the BFS algorithm, except for the order in which nodes are expanded.
Note: Backtracking is an algorithmic technique for finding all possible solutions using recursion.
Advantage:
o DFS requires much less memory, as it only needs to store a stack of the nodes on the path from the root node
to the current node.
o It takes less time to reach the goal node than the BFS algorithm (if it traverses the right path).
Disadvantage:
o There is the possibility that many states keep re-occurring, and there is no guarantee of finding the
solution.
o The DFS algorithm searches deep down, and it may sometimes enter an infinite loop.
Example:
In the below search tree, we have shown the flow of depth-first search, and it will follow the order:
Root node ---> left node ---> right node.
It will start searching from root node S and traverse A, then B, then D and E. After traversing E, it will backtrack
up the tree, as E has no other successor and the goal node has still not been found. After backtracking, it will traverse node C and
then G, where it will terminate, as it has found the goal node.

Completeness: The DFS algorithm is complete within a finite state space, as it will expand every node within a
limited search tree.
Time Complexity: The time complexity of DFS is proportional to the number of nodes traversed by the algorithm. It is given
by:
T(n) = 1 + n + n^2 + n^3 + ... + n^m = O(n^m)
where n is the branching factor (written b elsewhere in this section) and m is the maximum depth of any node,
which can be much larger than d (the shallowest solution depth).
Space Complexity: The DFS algorithm needs to store only a single path from the root node, hence the space complexity of
DFS is equivalent to the size of the fringe set, which is O(bm), i.e. linear in the product of b and m.
Optimal: The DFS algorithm is non-optimal, as it may take a large number of steps or incur a high cost to reach
the goal node.
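A minimal sketch of DFS with an explicit stack follows. The tree is an assumption: the S, A, B, C, D, E, G part follows the order described in the example (S, A, B, D, E, then C and G), while the H branch is hypothetical and only shows that unexplored siblings stay on the stack. All names are illustrative.

```python
def dfs(graph, start, goal):
    """Depth-first search with an explicit LIFO stack; returns (visit order, path or None)."""
    stack = [(start, [start])]  # each entry: (node, path from start to node)
    visited = set()
    order = []
    while stack:
        node, path = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        if node == goal:
            return order, path
        # Push children in reverse so the leftmost child is popped first.
        for child in reversed(graph.get(node, [])):
            if child not in visited:
                stack.append((child, path + [child]))
    return order, None


# Assumed tree; the H branch is hypothetical
TREE = {
    "S": ["A", "H"],
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["G"],
    "H": ["I", "K"],
}

order, path = dfs(TREE, "S", "G")
print(" -> ".join(order))
```

The stack only ever holds the siblings left behind along the current path, which is where the O(bm) space bound comes from.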
Informed Search Algorithms
So far we have talked about the uninformed search algorithms, which look through the search space for all
possible solutions of the problem without having any additional knowledge about the search space. An informed
search algorithm, however, uses knowledge such as how far we are from the goal, the path cost, how to reach
the goal node, etc. This knowledge helps agents to explore less of the search space and find the
goal node more efficiently.
The informed search algorithm is more useful for large search spaces. Informed search algorithms use the idea
of a heuristic, so they are also called heuristic search.
Heuristic function: A heuristic is a function which is used in informed search to find the most promising
path. It takes the current state of the agent as its input and produces an estimate of how close the agent is to
the goal. The heuristic method might not always give the best solution, but it usually finds a
good solution in reasonable time. The heuristic function is represented
by h(n), and it estimates the cost of an optimal path from state n to the goal. The value of the heuristic
function is never negative.
Admissibility of the heuristic function is given as:
h(n) <= h*(n)
Here h(n) is the heuristic (estimated) cost, and h*(n) is the actual cost of an optimal path from n to the goal.
Hence the heuristic cost should be less than or equal to the actual cost.
Pure Heuristic Search:
Pure heuristic search is the simplest form of heuristic search algorithm. It expands nodes based on their
heuristic value h(n). It maintains two lists, OPEN and CLOSED. In the CLOSED list, it places those nodes
which have already been expanded, and in the OPEN list, it places nodes which have not yet been expanded.
On each iteration, the node n with the lowest heuristic value is expanded, all its successors are generated, and
n is placed in the CLOSED list. The algorithm continues until a goal state is found.
In the informed search we will discuss two main algorithms which are given below:
o Best First Search Algorithm(Greedy search)
o A* Search Algorithm
1.) Best-first Search Algorithm (Greedy Search):
The greedy best-first search algorithm always selects the path which appears best at that moment. It combines
aspects of the depth-first search and breadth-first search algorithms, guided by a heuristic function.
Best-first search allows us to take advantage of both algorithms. With the help of best-first search, at
each step, we can choose the most promising node. In the best-first search algorithm, we expand the node
which appears closest to the goal node, where the closeness is estimated by the heuristic function, i.e.
f(n) = h(n)
where h(n) = estimated cost from node n to the goal.
The greedy best-first algorithm is implemented using a priority queue.
Best first search algorithm:
o Step 1: Place the starting node into the OPEN list.
o Step 2: If the OPEN list is empty, Stop and return failure.
o Step 3: Remove the node n which has the lowest value of h(n) from the OPEN list, and place it in the
CLOSED list.
o Step 4: Expand the node n, and generate the successors of node n.
o Step 5: Check each successor of node n, and find whether any of them is a goal node. If any
successor node is a goal node, then return success and terminate the search, else proceed to Step 6.
o Step 6: For each successor node, the algorithm computes the evaluation function f(n), and then checks whether the
node is already in either the OPEN or CLOSED list. If the node is in neither list, then add it to the
OPEN list.
o Step 7: Return to Step 2.
Advantages:
o Best-first search can switch between BFS-like and DFS-like behavior, gaining the advantages of both algorithms.
o With a good heuristic, this algorithm can be more efficient than the BFS and DFS algorithms.
Disadvantages:
o It can behave as an unguided depth-first search in the worst-case scenario.
o It can get stuck in a loop, as DFS can.
o This algorithm is not optimal.
Example:
Consider the below search problem, which we will traverse using greedy best-first search. At each iteration,
a node is expanded using the evaluation function f(n) = h(n), whose values are given in the below table.

In this search example, we are using two lists, the OPEN and CLOSED lists. Following are the iterations
for traversing the above example.

Expand node S and put it in the CLOSED list.

Initialization: Open [A, B], Closed [S]
Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
           : Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
           : Open [I, E, A], Closed [S, B, F, G]
Hence the final solution path will be: S ---> B ---> F ---> G
Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m), where m is the
maximum depth of the search space.
Complete: The tree-search version of greedy best-first search is incomplete, even if the given state space is finite.
Optimal: The greedy best-first search algorithm is not optimal.
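A minimal sketch of greedy best-first search with a priority queue follows. The graph and the h-values are assumptions chosen to be consistent with the OPEN/CLOSED trace in the example (the original figure and table are not available), and the sketch tests for the goal when a node is removed from OPEN rather than when successors are generated, which is a common simplification of the steps listed earlier.

```python
import heapq


def greedy_best_first(graph, h, start, goal):
    """Always expand the OPEN node with the smallest heuristic value h(n)."""
    open_heap = [(h[start], start)]   # priority queue ordered by h
    parent = {start: None}            # every node ever placed on OPEN
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:   # follow parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in graph.get(node, []):
            if child not in parent:   # skip nodes already on OPEN or CLOSED
                parent[child] = node
                heapq.heappush(open_heap, (h[child], child))
    return None


# Assumed graph and heuristic values (hypothetical, chosen to match the trace)
GRAPH = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E", "F"], "E": ["H"], "F": ["I", "G"]}
H = {"S": 13, "A": 12, "B": 4, "C": 7, "D": 3, "E": 8, "F": 2, "H": 4, "I": 9, "G": 0}

path = greedy_best_first(GRAPH, H, "S", "G")
print(" -> ".join(path))
```

Note that the path cost g(n) never appears in the code, which is why the result is not guaranteed to be the cheapest path.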
2.) A* Search Algorithm:
A* search is the most commonly known form of best-first search. It uses the heuristic function h(n) and the cost to
reach the node n from the start state, g(n). It combines features of UCS and greedy best-first search, by
which it solves the problem efficiently. The A* search algorithm finds the shortest path through the search space
using the heuristic function. This search algorithm expands a smaller search tree and provides an optimal result faster.
The A* algorithm is similar to UCS except that it uses g(n)+h(n) instead of g(n).
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine
both costs as follows, and this sum is called the fitness number:
f(n) = g(n) + h(n)
At each point in the search space, only the node which has the lowest value of f(n) is expanded, and the algorithm
terminates when the goal node is found.
Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check whether the OPEN list is empty; if the list is empty, then return failure and stop.
Step 3: Select the node n from the OPEN list which has the smallest value of the evaluation function (g+h). If node n
is the goal node, then return success and stop; otherwise continue to Step 4.
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For each successor n',
check whether n' is already in the OPEN or CLOSED list; if not, then compute the evaluation function for n' and
place it into the OPEN list.
Step 5: Else, if node n' is already in the OPEN or CLOSED list, then its back pointer should be updated to the parent
which gives the lowest g(n') value.
Step 6: Return to Step 2.
Advantages:
o A* search often performs better than other search algorithms when a good heuristic is available.
o A* search algorithm is optimal and complete, provided the heuristic and the problem satisfy certain conditions.
o This algorithm can solve very complex problems.
Disadvantages:
o It does not always produce the shortest path when the heuristic is not admissible, as it relies on heuristics and approximation.
o A* search algorithm has some complexity issues.
o The main drawback of A* is its memory requirement: it keeps all generated nodes in memory, so it
is not practical for various large-scale problems.
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is
given in the below table so we will calculate the f(n) of each state using the formula f(n)= g(n) + h(n), where
g(n) is the cost to reach any node from start state.
Here we will use OPEN and CLOSED list.

Solution:

Initialization: {(S, 5)}
Iteration 1: {(S --> A, 4), (S --> G, 10)}
Iteration 2: {(S --> A --> C, 4), (S --> A --> B, 7), (S --> G, 10)}
Iteration 3: {(S --> A --> C --> G, 6), (S --> A --> C --> D, 11), (S --> A --> B, 7), (S --> G, 10)}
Iteration 4 will give the final result: S --> A --> C --> G, which is the optimal path with cost 6.
Points to remember:
o The A* algorithm returns the first path it finds to the goal, and it does not search all remaining paths.
o The efficiency of the A* algorithm depends on the quality of the heuristic.
o The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.
Complete: The A* algorithm is complete as long as:
o The branching factor is finite.
o The cost of every action is fixed (more precisely, bounded below by some positive constant).
Optimal: The A* search algorithm is optimal if it satisfies the following two conditions:
o Admissible: The first condition required for optimality is that h(n) should be an admissible heuristic for
A* tree search. An admissible heuristic is optimistic in nature.
o Consistency: The second required condition, for A* graph search only, is consistency.
If the heuristic function is admissible, then A* tree search will always find the least-cost path.
Time Complexity: The time complexity of the A* search algorithm depends on the heuristic function, and the number
of nodes expanded is exponential in the depth of the solution d. So the time complexity is O(b^d), where b is the
branching factor.
Space Complexity: The space complexity of the A* search algorithm is O(b^d).
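A minimal sketch of A* with a priority queue ordered by f(n) = g(n) + h(n) follows. The edge costs and h-values are assumptions chosen so that the f-values match the iteration trace in the example (for instance f(S --> A) = 1 + 3 = 4); the original figure and table are not available. The names are illustrative, and stale queue entries are skipped instead of updating back pointers in place, which is a common implementation shortcut.

```python
import heapq


def a_star(graph, h, start, goal):
    """A* search; returns (path, cost), or (None, inf) if the goal is unreachable."""
    # Heap entries: (f = g + h, g, node, path from start to node)
    open_heap = [(h[start], 0, start, [start])]
    best_g = {start: 0}   # cheapest known cost to reach each node
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue      # stale entry: a cheaper route to this node was found later
        for child, cost in graph.get(node, {}).items():
            new_g = g + cost
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                heapq.heappush(open_heap, (new_g + h[child], new_g, child, path + [child]))
    return None, float("inf")


# Assumed edge costs and heuristic values (hypothetical, chosen to match the trace)
GRAPH = {
    "S": {"A": 1, "G": 10},
    "A": {"B": 2, "C": 1},
    "C": {"D": 3, "G": 4},
}
H = {"S": 5, "A": 3, "B": 4, "C": 2, "D": 6, "G": 0}

path, cost = a_star(GRAPH, H, "S", "G")
print(" -> ".join(path), "cost", cost)
```

The direct edge S --> G (cost 10) is generated first but is never returned, because the entry for S --> A --> C --> G has a lower f-value and is popped earlier.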
Hill Climbing Algorithm in Artificial Intelligence
o The hill climbing algorithm is a local search algorithm which continuously moves in the direction of increasing
elevation/value to find the peak of the mountain, or the best solution to the problem. It terminates when it
reaches a peak value where no neighbor has a higher value.
o The hill climbing algorithm is a technique used for optimizing mathematical problems. One of the
widely discussed examples is the Traveling-salesman Problem, in which we need to
minimize the distance traveled by the salesman.
o It is also called greedy local search, as it only looks at its immediate neighbor states and not beyond
them.
o A node of the hill climbing algorithm has two components, state and value.
o Hill Climbing is mostly used when a good heuristic is available.
o In this algorithm, we don't need to maintain and handle the search tree or graph, as it only keeps a single
current state.
Features of Hill Climbing:
Following are some main features of Hill Climbing Algorithm:
o Generate and Test variant: Hill Climbing is a variant of the Generate and Test method. The Generate and
Test method produces feedback which helps to decide which direction to move in the search space.
o Greedy approach: The hill-climbing search moves in the direction which optimizes the cost.
o No backtracking: It does not backtrack in the search space, as it does not remember the previous states.
State-space Diagram for Hill Climbing:
The state-space landscape is a graphical representation of the hill-climbing algorithm, showing a graph
of the various states of the algorithm against the objective function/cost.
On the Y-axis we have the function, which can be an objective function or a cost function, and the state space is on the
X-axis. If the function on the Y-axis is cost, then the goal of the search is to find the global minimum, although the search may stop at a local minimum.
If the function on the Y-axis is an objective function, then the goal of the search is to find the global maximum, although the search may stop at a local
maximum.

Different regions in the state space landscape:


Local Maximum: Local maximum is a state which is better than its neighbor states, but there is also another
state which is higher than it.
Global Maximum: Global maximum is the best possible state of state space landscape. It has the highest value
of objective function.
Current state: It is a state in a landscape diagram where an agent is currently present.
Flat local maximum: It is a flat region of the landscape where all the neighbor states of the current state have the
same value.
Shoulder: It is a plateau region which has an uphill edge.
Types of Hill Climbing Algorithm:
o Simple hill Climbing:
o Steepest-Ascent hill-climbing:
o Stochastic hill Climbing:
1. Simple Hill Climbing:
Simple hill climbing is the simplest way to implement a hill climbing algorithm. It evaluates one neighbor
node state at a time and selects the first one which improves on the current cost, setting it as the current state.
It checks only one successor state at a time; if that state is better than the current state, it moves to it, else it stays in the same
state. This algorithm has the following features:
o Less time consuming
o Less optimal solution, and the solution is not guaranteed
Algorithm for Simple Hill Climbing:
o Step 1: Evaluate the initial state; if it is the goal state, then return success and stop.
o Step 2: Loop until a solution is found or there is no new operator left to apply.
o Step 3: Select and apply an operator to the current state.
o Step 4: Check the new state:
1. If it is the goal state, then return success and quit.
2. Else, if it is better than the current state, then assign the new state as the current state.
3. Else, if it is not better than the current state, then return to Step 2.
o Step 5: Exit.
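A minimal sketch of simple (first-improvement) hill climbing follows. The one-dimensional landscape, the neighbor function, and all names are assumptions made for illustration; states are list indices and the value of a state is the height stored at that index. There is no explicit goal test here: the loop simply stops when no neighbor improves on the current state.

```python
# Hypothetical landscape: index 1 is a local maximum, index 4 is the global maximum
HEIGHTS = [1, 3, 2, 5, 9, 4]


def value(state):
    return HEIGHTS[state]


def neighbors(state):
    """Left and right neighbors that are inside the landscape, left first."""
    return [s for s in (state - 1, state + 1) if 0 <= s < len(HEIGHTS)]


def simple_hill_climb(start):
    """Move to the FIRST neighbor that is better; stop when none is."""
    current = start
    while True:
        for candidate in neighbors(current):
            if value(candidate) > value(current):
                current = candidate   # accept the first improvement found
                break
        else:
            return current            # no neighbor is better: a (possibly local) peak


result = simple_hill_climb(2)
print(result)  # -> 1
```

Starting from index 2, the left neighbor (height 3) is checked first and accepted, so the search ends on the local maximum at index 1 and never sees the higher peak on the right.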
2. Steepest-Ascent hill climbing:
The steepest-ascent algorithm is a variation of the simple hill climbing algorithm. This algorithm examines all the
neighboring nodes of the current state and selects the one neighbor node which is closest to the goal state. This
algorithm consumes more time, as it evaluates multiple neighbors.
Algorithm for Steepest-Ascent hill climbing:
o Step 1: Evaluate the initial state; if it is the goal state, then return success and stop, else make the initial state
the current state.
o Step 2: Loop until a solution is found or the current state does not change.
1. Let SUCC be a state such that any successor of the current state will be better than it.
2. For each operator that applies to the current state:
I. Apply the new operator and generate a new state.
II. Evaluate the new state.
III. If it is the goal state, then return it and quit, else compare it to SUCC.
IV. If it is better than SUCC, then set the new state as SUCC.
3. If SUCC is better than the current state, then set the current state to SUCC.
o Step 3: Exit.
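A minimal sketch of steepest-ascent hill climbing follows. The one-dimensional landscape, the neighbor function, and all names are assumptions made for illustration; states are list indices and the value of a state is the height stored at that index. As a simplification, the loop ends when the best neighbor is no better than the current state instead of using an explicit goal test.

```python
# Hypothetical landscape: index 1 is a local maximum, index 4 is the global maximum
HEIGHTS = [1, 3, 2, 5, 9, 4]


def value(state):
    return HEIGHTS[state]


def neighbors(state):
    """Left and right neighbors that are inside the landscape."""
    return [s for s in (state - 1, state + 1) if 0 <= s < len(HEIGHTS)]


def steepest_ascent(start):
    """Evaluate ALL neighbors and move to the best one while it improves on the current state."""
    current = start
    while True:
        succ = max(neighbors(current), key=value)   # best successor (SUCC)
        if value(succ) <= value(current):
            return current                          # current state did not change
        current = succ


result = steepest_ascent(2)
print(result)  # -> 4
```

From index 2 both neighbors are compared and the higher one (index 3) is chosen, so this run reaches the global maximum at index 4. Started from index 0 it would still stop on the local maximum at index 1.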
3. Stochastic hill climbing:
Stochastic hill climbing does not examine all its neighbors before moving. Rather, this search algorithm selects
one neighbor node at random and decides whether to choose it as the current state or examine another state.
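A minimal sketch of stochastic hill climbing follows. The objective function, the neighbor function, the step limit, and all names are assumptions made for illustration; a fixed number of random proposals is used as the stopping rule because the text does not specify one.

```python
import random


def peak_value(x):
    """Hypothetical objective with a single peak at x = 3."""
    return -(x - 3) ** 2


def step_neighbors(x):
    return [x - 1, x + 1]


def stochastic_hill_climb(start, neighbors, value, max_steps=1000, rng=None):
    """Pick one neighbor at random per step; move only if it is better."""
    rng = rng or random.Random()
    current = start
    for _ in range(max_steps):
        candidate = rng.choice(neighbors(current))
        if value(candidate) > value(current):
            current = candidate
    return current


result = stochastic_hill_climb(0, step_neighbors, peak_value, rng=random.Random(0))
print(result)
```

Each step costs a single evaluation instead of one per neighbor, at the price of sometimes proposing a worse neighbor and staying put.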
Problems in Hill Climbing Algorithm:
1. Local Maximum: A local maximum is a peak state in the landscape which is better than each of its neighboring
states, but there is another state also present which is higher than the local maximum.
Solution: Backtracking technique can be a solution of the local maximum in state space landscape. Create a list of
the promising path so that the algorithm can backtrack the search space and explore other paths as well.

2. Plateau: A plateau is a flat area of the search space in which all the neighbor states of the current state
have the same value, so the algorithm cannot find a best direction in which to move. A hill-climbing search
may get lost in the plateau area.
Solution: Take bigger or much smaller steps while searching. For example, randomly select a state
that is far away from the current state, so that the algorithm may
find a non-plateau region.

3. Ridges: A ridge is a special form of local maximum. It is an area that is higher than its surrounding
areas but itself has a slope, and its top cannot be reached in a single move.
Solution: Using bidirectional search, or moving in several directions at once, can mitigate this problem.
Simulated Annealing:
A hill-climbing algorithm that never makes a move towards a lower value is guaranteed to be incomplete, because it
can get stuck on a local maximum. If the algorithm instead performs a random walk, moving to a randomly chosen
successor, it may be complete but is not efficient. Simulated annealing is an algorithm that yields both efficiency and completeness.
In metallurgy, annealing is the process of heating a metal or glass to a high temperature and then cooling it
gradually, which allows the material to reach a low-energy crystalline state. The same idea is used in simulated
annealing, in which the algorithm picks a random move instead of picking the best move. If the random move
improves the state, it is always accepted. Otherwise, the algorithm accepts the downhill move only with a
probability of less than 1, or rejects it and chooses another move.
MODULE 3
Adversarial Search
Adversarial search is search in which we examine the problems that arise when we try to plan ahead
in a world where other agents are planning against us.
o In previous topics, we studied search strategies that involve only a single agent,
which aims to find a solution, often expressed as a sequence of actions.
o However, there are situations where more than one agent is searching for a solution in the same
search space, and this usually occurs in game playing.
o An environment with more than one agent is termed a multi-agent environment, in which each agent
is an opponent of the others and plays against them. Each agent needs to consider the actions of
the other agents and the effect of those actions on its own performance.
o Searches in which two or more players with conflicting goals are trying to explore the same
search space for a solution are called adversarial searches, often known as games.
o Games are modeled as a search problem together with a heuristic evaluation function; these are the two main
factors which help to model and solve games in AI.
Types of Games in AI:
o Perfect information: A game with perfect information is one in which agents can see the
complete board. Agents have all the information about the game, and they can also see each other's moves.
Examples are Chess, Checkers, Go, etc.
o Imperfect information: If the agents do not have all the information about the game and are not aware
of everything that is going on, the game is called a game with imperfect information. Examples are
Battleship, blind tic-tac-toe, Bridge, etc.
o Deterministic games: Deterministic games are those games which follow a strict pattern and set of rules
for the games, and there is no randomness associated with them. Examples are chess, Checkers, Go, tic-
tac-toe, etc.
o Non-deterministic games: Non-deterministic games are those which have various unpredictable events
and a factor of chance or luck. This factor of chance or luck is introduced by either dice or cards.
The outcome of each action is therefore not fixed. Such games are also called stochastic games.
Examples: Backgammon, Monopoly, Poker, etc.
Note: In this topic, we will discuss deterministic, fully observable, zero-sum games in which the agents
act alternately.
Zero-Sum Game
o Zero-sum games are adversarial searches that involve pure competition.
o In a zero-sum game, each agent's gain or loss of utility is exactly balanced by the losses or gains of utility of
the other agent.
o One player tries to maximize a single value, while the other player tries to minimize it.
o Each move by one player in the game is called a ply.
o Chess and tic-tac-toe are examples of zero-sum games.
Zero-sum game: Embedded thinking
A zero-sum game involves embedded thinking, in which one agent or player is trying to figure out:
o What to do
o How to decide on the move
o What the opponent might do
o That the opponent is also thinking about what to do
Each player is trying to find out the opponent's response to their actions. This requires embedded
thinking, or backward reasoning, to solve game problems in AI.
Formalization of the problem:
A game can be defined as a type of search in AI which can be formalized with the following elements:
o Initial state: It specifies how the game is set up at the start.
o Player(s): It specifies which player has the move in state s.
o Actions(s): It returns the set of legal moves in state s.
o Result(s, a): It is the transition model, which specifies the state that results from making move a in state s.
o Terminal-Test(s): The terminal test is true if the game is over and false otherwise. The states where
the game ends are called terminal states.
o Utility(s, p): A utility function gives the final numeric value for a game that ends in terminal state s for
player p. It is also called the payoff function. For Chess, the outcomes are a win, loss, or draw, and the payoff
values are +1, 0, and ½ respectively. For tic-tac-toe, the utility values are +1, -1, and 0.
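To make these elements concrete, here is a rough Python sketch of them for tic-tac-toe. The board encoding (a tuple of nine cells, `None` for empty) and every function name are assumptions chosen for this illustration, not part of the formal definition.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

INITIAL_STATE = (None,) * 9  # empty 3x3 board, stored row by row


def player(s):
    """X (MAX) moves first, so X is to move when both have played equally often."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s):
    """Legal moves are the indices of the empty squares."""
    return [i for i, cell in enumerate(s) if cell is None]


def result(s, a):
    """Transition model: the state reached by playing square a in state s."""
    board = list(s)
    board[a] = player(s)
    return tuple(board)


def winner(s):
    for i, j, k in LINES:
        if s[i] is not None and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s):
    return winner(s) is not None or not actions(s)


def utility(s, p):
    """+1 if p won, -1 if p lost, 0 for a draw (call only on terminal states)."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

For example, `result(INITIAL_STATE, 4)` places an X in the centre square, after which `player` reports that O is to move.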
Game tree:
A game tree is a tree where nodes of the tree are the game states and Edges of the tree are the moves by
players. Game tree involves initial state, actions function, and result Function.
Example: Tic-Tac-Toe game tree:
The following figure is showing part of the game-tree for tic-tac-toe game. Following are some key points of the
game:
o There are two players MAX and MIN.
o Players have an alternate turn and start with MAX.
o MAX maximizes the result of the game tree
o MIN minimizes the result.

Example Explanation:
o From the initial state, MAX has 9 possible moves, as he starts first. MAX places x and MIN places o, and the
players move alternately until we reach a leaf node where one player has three in a row or all squares are
filled.
o Both players compute, for each node, the minimax value, which is the best achievable utility
against an optimal adversary.
o Suppose both players know tic-tac-toe well and play their best. Each player is doing
his best to prevent the other one from winning. MIN is acting against MAX in the game.
o So in the game tree, we have a layer of MAX and a layer of MIN, and each layer is called a ply. MAX places x,
then MIN places o to prevent MAX from winning, and the game continues until a terminal node is reached.
o At a terminal node either MIN wins, MAX wins, or it is a draw. This game tree is the whole search space of
possibilities when MIN and MAX play tic-tac-toe, taking turns alternately.
Hence adversarial search with the minimax procedure works as follows:
o It aims to find the optimal strategy for MAX to win the game.
o It follows the approach of depth-first search.
o In the game tree, the optimal leaf node could appear at any depth of the tree.
o Once the terminal nodes are reached, the minimax values are propagated back up the tree.
In a given game tree, the optimal strategy can be determined from the minimax value of each node, which can be
written as MINIMAX(s). MAX prefers to move to a state of maximum value and MIN prefers to move to a state of
minimum value, so:
MINIMAX(s) = UTILITY(s)                                          if TERMINAL-TEST(s)
MINIMAX(s) = max over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MAX
MINIMAX(s) = min over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MIN

Mini-Max Algorithm in Artificial Intelligence


o The mini-max algorithm is a recursive (backtracking) algorithm used in decision-making and game
theory. It provides an optimal move for the player, assuming that the opponent is also playing optimally.
o The mini-max algorithm uses recursion to search through the game tree.
o The mini-max algorithm is mostly used for game playing in AI, such as Chess, Checkers, tic-tac-toe, Go, and
various other two-player games. The algorithm computes the minimax decision for the current state.
o In this algorithm two players play the game; one is called MAX and the other is called MIN.
o Each player plays so that the opponent gets the minimum benefit while they themselves get the maximum
benefit.
o The two players are opponents of each other: MAX selects the maximized value and MIN
selects the minimized value.
o The minimax algorithm performs a depth-first search to explore the complete game
tree.
o The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the
tree as the recursion unwinds.
Pseudo-code for MinMax Algorithm:
function minimax(node, depth, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then          // for Maximizer player
        maxEva = -infinity
        for each child of node do
            eva = minimax(child, depth-1, false)
            maxEva = max(maxEva, eva)     // keeps the maximum of the values
        return maxEva

    else                              // for Minimizer player
        minEva = +infinity
        for each child of node do
            eva = minimax(child, depth-1, true)
            minEva = min(minEva, eva)     // keeps the minimum of the values
        return minEva
Initial call:
minimax(node, 3, true)
Working of Min-Max Algorithm:
o The working of the minimax algorithm can be easily described using an example. Below we take an
example game tree representing a two-player game.
o In this example, there are two players: one is called Maximizer and the other is called Minimizer.
o Maximizer tries to get the maximum possible score, and Minimizer tries to get the minimum possible
score.
o The algorithm applies DFS, so in this game tree we have to go all the way down to the leaves to reach
the terminal nodes.
o At the terminal nodes the terminal values are given, so we compare those values and backtrack up the tree
until we reach the initial state. Following are the main steps involved in solving the two-player game tree:
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to get the
utility values for the terminal states. In the tree diagram below, let A be the initial state of the tree. Suppose
the maximizer takes the first turn, with worst-case initial value -∞, and the minimizer takes the next turn,
with worst-case initial value +∞.

Step 2: Now we find the utility values for the Maximizer. Its initial value is -∞, so we compare each
value in the terminal states with the initial value of the Maximizer and determine the values of the higher nodes,
taking the maximum among them all.
o For node D: max(-1, -∞) = -1, then max(-1, 4) = 4
o For node E: max(2, -∞) = 2, then max(2, 6) = 6
o For node F: max(-3, -∞) = -3, then max(-3, -5) = -3
o For node G: max(0, -∞) = 0, then max(0, 7) = 7
Step 3: In the next step, it is the minimizer's turn, so it compares the node values with +∞ and finds the
values of nodes B and C.
o For node B: min(4, 6) = 4
o For node C: min(-3, 7) = -3

Step 4: Now it is the Maximizer's turn again, and it chooses the maximum of the node values to find the
value of the root node. In this game tree there are only 4 layers, so we reach the
root node immediately, but in real games there will be more than 4 layers.
o For node A: max(4, -3) = 4

That was the complete workflow of the minimax two player game.
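The worked example can be reproduced with a few lines of Python. This is a sketch under simplifying assumptions: the tree is encoded as nested lists (a hypothetical representation in which a plain number is a terminal value), and the depth cut-off of the pseudo-code is left out because this small tree is searched all the way to its leaves.

```python
import math


def minimax(node, maximizing_player):
    # A leaf is a plain number; an internal node is a list of children.
    if not isinstance(node, list):
        return node
    if maximizing_player:
        best = -math.inf
        for child in node:
            best = max(best, minimax(child, False))
        return best
    best = math.inf
    for child in node:
        best = min(best, minimax(child, True))
    return best


# A -> B, C; B -> D, E; C -> F, G; terminal values as in the steps above.
D, E, F, G = [-1, 4], [2, 6], [-3, -5], [0, 7]
tree = [[D, E], [F, G]]
print(minimax(tree, True))  # 4
```

The root A is a maximizer node, B and C are minimizer nodes, and D to G are maximizer nodes again, which is why the `maximizing_player` flag flips at every level of the recursion.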
Properties of Mini-Max algorithm:
o Complete: The min-max algorithm is complete. It will definitely find a solution (if one exists) in a finite search
tree.
o Optimal: The min-max algorithm is optimal if both opponents are playing optimally.
o Time complexity: As it performs DFS on the game tree, the time complexity of the min-max algorithm
is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
o Space complexity: The space complexity of the mini-max algorithm is similar to that of DFS, which is O(b·m).
Limitation of the minimax Algorithm:
The main drawback of the minimax algorithm is that it gets really slow for complex games such as Chess, Go, etc.
These games have a huge branching factor, and the player has many choices to decide between. This limitation of the
minimax algorithm can be mitigated with alpha-beta pruning, which is discussed in the next topic.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the
minimax algorithm.
o As we have seen, the number of game states that the minimax search algorithm has to examine is
exponential in the depth of the tree. We cannot eliminate the exponent, but we can effectively cut it in half.
There is a technique by which we can compute the correct minimax decision without checking each node of the
game tree, and this technique is called pruning. It involves two threshold parameters, alpha and
beta, for future expansion, so it is called alpha-beta pruning. It is also called the Alpha-Beta algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only the tree leaves
but also entire sub-trees.
o The two parameters can be defined as:
1. Alpha: The best (highest-value) choice we have found so far at any point along the path of
Maximizer. The initial value of alpha is -∞.
2. Beta: The best (lowest-value) choice we have found so far at any point along the path of
Minimizer. The initial value of beta is +∞.
o Applying alpha-beta pruning to a standard minimax algorithm returns the same move as the standard
algorithm does, but it removes all the nodes which do not affect the final decision and only make the
algorithm slow. Hence pruning these nodes makes the algorithm fast.
Note: To better understand this topic, kindly study the minimax algorithm.
Condition for Alpha-beta pruning:
The main condition required for alpha-beta pruning is:
α >= β
Key points about alpha-beta pruning:
o The Max player will only update the value of alpha.
o The Min player will only update the value of beta.
o While backtracking the tree, the node values will be passed to upper nodes instead of values of alpha and
beta.
o We will only pass the alpha, beta values to the child nodes.
Pseudo-code for Alpha-beta Pruning:
function minimax(node, depth, alpha, beta, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then          // for Maximizer player
        maxEva = -infinity
        for each child of node do
            eva = minimax(child, depth-1, alpha, beta, false)
            maxEva = max(maxEva, eva)
            alpha = max(alpha, maxEva)
            if beta <= alpha then
                break                 // prune the remaining children
        return maxEva

    else                              // for Minimizer player
        minEva = +infinity
        for each child of node do
            eva = minimax(child, depth-1, alpha, beta, true)
            minEva = min(minEva, eva)
            beta = min(beta, minEva)
            if beta <= alpha then
                break                 // prune the remaining children
        return minEva
Working of Alpha-Beta Pruning:
Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.
Step 1: In the first step, the Max player starts with the first move from node A, where α = -∞ and β = +∞. These values of
alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its
child D.

Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2
and then with 3, so max(2, 3) = 3 becomes the value of α at node D, and the node value is also 3.
Step 3: Now the algorithm backtracks to node B, where the value of β changes, as this is Min's turn. β = +∞ is
compared with the value of the available successor node, i.e. min(∞, 3) = 3, hence at node B now α = -∞ and β =
3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and
β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is
compared with 5, so max(-∞, 5) = 5, hence at node E α = 5 and β = 3. Since α >= β, the right successor of E
is pruned and the algorithm does not traverse it; the value at node E is 5.

Step 5: In the next step, the algorithm again backtracks up the tree, from node B to node A. At node A, the value of alpha
changes to the maximum available value, 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed
to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared with the left child, which is 0, giving max(3, 0) = 3, and then
with the right child, which is 1, giving max(3, 1) = 3. So α remains 3, but the node value of F becomes 1.

Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes:
it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so
the next child of C, which is G, is pruned, and the algorithm does not compute the sub-tree of G at all.

Step 8: C now returns the value 1 to A, where the best value for A is max(3, 1) = 3. Following is the final game
tree, showing the nodes that were computed and the nodes that were never computed. Hence the optimal
value for the maximizer is 3 in this example.
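The eight steps can be checked with a short Python sketch. The nested-list tree encoding and the `visited` list are assumptions made for this illustration, and the depth cut-off is omitted. The walkthrough never states the values of the pruned leaves, so the 9 under E and the 7 and 5 under G are hypothetical placeholders; because those leaves are pruned, their values cannot influence the result.

```python
import math

visited = []  # leaves that are actually evaluated, in order


def alphabeta(node, alpha, beta, maximizing_player):
    # A leaf is a plain number; an internal node is a list of children.
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing_player:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # prune the remaining children
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)
        if beta <= alpha:
            break  # prune the remaining children
    return best


# A -> B, C; B -> D, E; C -> F, G. The leaves 9, 7 and 5 are placeholders.
D, E, F, G = [2, 3], [5, 9], [0, 1], [7, 5]
tree = [[D, E], [F, G]]
value = alphabeta(tree, -math.inf, math.inf, True)
print(value, visited)  # 3 [2, 3, 5, 0, 1]
```

Only five of the eight leaves are evaluated: the right child of E and the whole sub-tree G are skipped, exactly as in Steps 4 and 7.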
Move Ordering in Alpha-Beta pruning:

The effectiveness of alpha-beta pruning is highly dependent on the order in which each node is examined. Move
order is an important aspect of alpha-beta pruning.

It can be of two types:

o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of the leaves of the
tree and works exactly like the minimax algorithm. It then also consumes more time because of the alpha-beta
bookkeeping. Such an ordering is called worst ordering. In this case, the best move occurs on the
right side of the tree. The time complexity for such an order is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the
tree and the best moves occur on the left side of the tree. Since we apply DFS, the left of the tree is searched first,
and the algorithm can go twice as deep as minimax in the same amount of time. The complexity with ideal ordering is
O(b^(m/2)).
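The effect of ordering can be seen by counting how many leaves are evaluated for two arrangements of the same tree. The sketch below is only an illustration: the two-level tree, its leaf values, and the helper names `alphabeta_count` and `search` are all hypothetical.

```python
import math


def alphabeta_count(node, alpha, beta, maximizing_player, counter):
    # A leaf is a plain number; counter[0] tracks how many leaves are evaluated.
    if not isinstance(node, list):
        counter[0] += 1
        return node
    best = -math.inf if maximizing_player else math.inf
    for child in node:
        score = alphabeta_count(child, alpha, beta, not maximizing_player, counter)
        if maximizing_player:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:
            break  # prune the remaining children
    return best


def search(tree):
    counter = [0]
    value = alphabeta_count(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


# Same three MIN nodes and the same leaves, arranged in two different orders.
good_order = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]   # best moves on the left
bad_order = [[2, 5, 14], [6, 4, 2], [8, 12, 3]]    # best moves on the right
print(search(good_order))  # (3, 5)
print(search(bad_order))   # (3, 9)
```

Both orderings return the same minimax value, but the favourable ordering evaluates 5 leaves while the unfavourable one evaluates all 9.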

Rules to find good ordering:

Following are some rules to find good ordering in alpha-beta pruning:

o Try the best move from the shallowest node first.


o Order the nodes in the tree such that the best nodes are checked first.
o Use domain knowledge while finding the best move. Ex: for Chess, try this order: captures first, then threats,
then forward moves, then backward moves.
o We can keep a record of the states already visited, as there is a possibility that states may repeat.

Knowledge-Based Agent in Artificial intelligence

o An intelligent agent needs knowledge about the real world in order to make decisions and to reason, so that it can act
efficiently.
o Knowledge-based agents are agents that are capable of maintaining an internal state
of knowledge, reasoning over that knowledge, updating their knowledge after observations, and
taking actions. These agents can represent the world with some formal representation and act
intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.

A knowledge-based agent must be able to do the following:

o An agent should be able to represent states, actions, etc.


o An agent should be able to incorporate new percepts.
o An agent should be able to update its internal representation of the world.
o An agent should be able to deduce the internal representation of the world.
o An agent should be able to deduce appropriate actions.

The architecture of knowledge-based agent:

The above diagram represents a generalized architecture for a knowledge-based agent. The knowledge-based
agent (KBA) takes input from the environment by perceiving it. The input is passed to the
inference engine of the agent, which communicates with the KB to decide what to do according to the knowledge stored in the KB.
The learning element of the KBA regularly updates the KB by learning new knowledge.

Knowledge base: The knowledge base is a central component of a knowledge-based agent; it is also known as the
KB. It is a collection of sentences (here 'sentence' is a technical term and is not identical to a sentence in
English). These sentences are expressed in a language which is called a knowledge representation language.
The knowledge base of a KBA stores facts about the world.

Why use a knowledge base?

A knowledge base is required so that an agent can update its knowledge, learn from experience, and take action
according to that knowledge.

Inference system

Inference means deriving new sentences from old ones. The inference system allows us to add new sentences to the
knowledge base. A sentence is a proposition about the world. The inference system applies logical rules to the KB
to deduce new information.
The inference system generates new facts so that the agent can update the KB. An inference system works mainly
with two rules, which are:
o Forward chaining
o Backward chaining
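As a rough illustration of how an inference system derives new facts, the following Python sketch implements a very simple form of forward chaining. The rule format (a tuple of premises and one conclusion) and the example facts are assumptions made for this sketch only.

```python
def forward_chain(facts, rules):
    """Apply rules of the form (premises, conclusion) until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)  # a new fact is added to the knowledge base
                changed = True
    return known


# Hypothetical rules: raining -> street_wet; street_wet and cold -> street_icy.
rules = [
    (("raining",), "street_wet"),
    (("street_wet", "cold"), "street_icy"),
]
derived = forward_chain({"raining", "cold"}, rules)
print(sorted(derived))  # ['cold', 'raining', 'street_icy', 'street_wet']
```

Forward chaining starts from the known facts and works towards conclusions. Backward chaining works in the opposite direction, starting from a query and looking for rules that could establish it.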
Operations Performed by KBA
Following are three operations which are performed by KBA in order to show the intelligent
behavior:
1. TELL: This operation tells the knowledge base what it perceives from the environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
A generic knowledge-based agent:
Following is the structure outline of a generic knowledge-based agents program:
function KB-AGENT(percept):
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action = ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t = t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output. The agent maintains the
knowledge base, KB, and it initially has some background knowledge of the real world. It also has a counter to
indicate the time for the whole process, and this counter is initialized with zero.
Each time when the function is called, it performs its three operations:
o First, it TELLs the KB what it perceives.
o Second, it ASKs the KB what action it should take.
o Third, it TELLs the KB which action was chosen.
MAKE-PERCEPT-SENTENCE generates a sentence asserting that the agent perceived the given percept at
the given time.
MAKE-ACTION-QUERY generates a sentence that asks which action should be done at the current time.
MAKE-ACTION-SENTENCE generates a sentence which asserts that the chosen action was executed.
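The agent program above can be sketched in Python as follows. The `SimpleKB` class, its tuple-based sentences, and its single hard-wired rule (turn when an obstacle is perceived, otherwise move forward) are hypothetical stand-ins for a real knowledge base and inference system.

```python
class SimpleKB:
    """Hypothetical knowledge base: a list of sentences plus one hard-wired rule."""

    def __init__(self):
        self.sentences = []

    def tell(self, sentence):
        self.sentences.append(sentence)

    def ask(self, query):
        # The query is ("action?", t); answer from the percept stored for time t.
        _, t = query
        if ("percept", "obstacle", t) in self.sentences:
            return "turn"
        return "forward"


class KBAgent:
    def __init__(self, kb):
        self.kb = kb
        self.t = 0  # time counter, initially 0

    def __call__(self, percept):
        self.kb.tell(("percept", percept, self.t))   # MAKE-PERCEPT-SENTENCE
        action = self.kb.ask(("action?", self.t))    # MAKE-ACTION-QUERY
        self.kb.tell(("action", action, self.t))     # MAKE-ACTION-SENTENCE
        self.t += 1
        return action


agent = KBAgent(SimpleKB())
print(agent("clear"), agent("obstacle"))  # forward turn
```

Each call performs the same three operations, TELL, ASK, TELL, and then advances the time counter.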
Various levels of knowledge-based agent:
A knowledge-based agent can be viewed at different levels which are given below:
1. Knowledge level
Knowledge level is the first level of a knowledge-based agent. At this level, we need to specify what the
agent knows and what the agent's goals are. With these specifications, we can fix its behavior. For example,
suppose an automated taxi agent needs to go from station A to station B, and it knows the way from A to B;
this comes at the knowledge level.
2. Logical level:
At this level, we describe how the knowledge is represented and stored. At this level,
sentences are encoded into different logics, that is, knowledge is encoded into logical
sentences. At the logical level, we can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the agent performs actions
according to the logical and knowledge levels. At this level, the automated taxi agent actually implements its knowledge
and logic so that it can reach the destination.
Approaches to designing a knowledge-based agent:
There are mainly two approaches to build a knowledge-based agent:
1. Declarative approach: We can create a knowledge-based agent by initializing it with an empty
knowledge base and telling the agent all the sentences with which we want it to start. This approach
is called the declarative approach.
2. Procedural approach: In the procedural approach, we directly encode the desired behavior as
program code. That is, we just need to write a program that already encodes the desired
behavior of the agent.
However, in the real world, a successful agent can be built by combining both declarative and procedural
approaches, and declarative knowledge can often be compiled into more efficient procedural code.
Propositional logic in Artificial intelligence
Propositional logic (PL) is the simplest form of logic, in which all statements are made of propositions. A
proposition is a declarative statement which is either true or false. It is a technique of knowledge representation in
logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (False proposition)
c) 3 + 3 = 7 (False proposition)
d) 5 is a prime number.
Following are some basic facts about propositional logic:
o Propositional logic is also called Boolean logic as it works on 0 and 1.
o In propositional logic, we use symbolic variables to represent the logic, and we can use any symbol to
represent a proposition, such as A, B, C, P, Q, R, etc.
o A proposition can be either true or false, but it cannot be both.
o Propositional logic consists of proposition symbols and logical connectives.
o These connectives are also called logical operators.
o The propositions and connectives are the basic elements of propositional logic.
o A connective can be described as a logical operator which connects two sentences.
o A propositional formula which is always true is called a tautology; it is also called a valid sentence.
o A propositional formula which is always false is called a contradiction.
o A propositional formula which can take both true and false values is called a contingency.
o Statements which are questions, commands, or opinions, such as "Where is Rohini",
"How are you", and "What is your name", are not propositions.
Syntax of propositional logic:
The syntax of propositional logic defines the allowable sentences for the knowledge representation. There are two
types of Propositions:
a. Atomic Propositions
b. Compound propositions
o Atomic Proposition: Atomic propositions are the simple propositions. Each consists of a single proposition
symbol. These are sentences which must be either true or false.
Example:
a) "2 + 2 is 4" is an atomic proposition; it is a true fact.
b) "The Sun is cold" is also an atomic proposition; it is a false fact.
o Compound proposition: Compound propositions are constructed by combining simpler or atomic
propositions, using parentheses and logical connectives.
Example:
a) "It is raining today, and the street is wet."
b) "Ankit is a doctor, and his clinic is in Mumbai."
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can
create compound propositions with the help of logical connectives. There are mainly five connectives, which are
given as follows:
1. Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative
literal.
2. Conjunction: A sentence which has the ∧ connective, such as P ∧ Q, is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as,
P = Rohan is intelligent,
Q = Rohan is hardworking, so we can write it as P ∧ Q.
3. Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are
the propositions.
Example: "Ritika is a doctor or an engineer",
Here P = Ritika is a doctor, Q = Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q is called an implication. Implications are also known as if-then
rules. For example:
            If it is raining, then the street is wet.
        Let P = It is raining, and Q = Street is wet, so it is represented as P → Q.
5. Biconditional: A sentence such as P ⇔ Q is a biconditional sentence. Example: I am breathing if and only if
I am alive.
            P = I am breathing, Q = I am alive, so it can be represented as P ⇔ Q.
Following is the summarized table for Propositional Logic Connectives:

Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine
all the possible truth-value combinations with logical connectives, and the representation of these combinations in a tabular
format is called a truth table. Following are the truth tables for all logical connectives:
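Such a table can also be generated programmatically. The Python sketch below prints one row per combination of truth values for P and Q, with a column for each of the five connectives. The helper `implies` is an assumption of this sketch, since Python has no built-in implication operator, and the biconditional is written as equality of truth values.

```python
from itertools import product


def implies(p, q):
    """P → Q is false only when P is true and Q is false."""
    return (not p) or q


print("P      Q      ¬P     P∧Q    P∨Q    P→Q    P⇔Q")
rows = []
for p, q in product([True, False], repeat=2):
    row = (p, q, not p, p and q, p or q, implies(p, q), p == q)
    rows.append(row)
    print("  ".join(f"{str(v):5}" for v in row))
```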

Truth table with three propositions:


We can build a proposition composed of three propositions P, Q, and R. Its truth table is made up of 2^3 = 8 rows, as
we have taken three proposition symbols.

Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors or logical operators. This
order should be followed while evaluating a propositional problem. Following is the list of the precedence order for
operators:

Precedence            Operators
First Precedence      Parenthesis
Second Precedence     Negation
Third Precedence      Conjunction (AND)
Fourth Precedence     Disjunction (OR)
Fifth Precedence      Implication
Sixth Precedence      Biconditional


Note: For better understanding, use parentheses to make sure of the correct interpretation. For example, ¬R ∨ Q is
interpreted as (¬R) ∨ Q.
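Python's Boolean operators follow the same relative order for negation, conjunction, and disjunction (`not`, then `and`, then `or`), so the effect of precedence can be tried out directly. The variable names and truth values below are arbitrary choices for illustration; Python has no implication or biconditional operator, so those two levels are not shown.

```python
R, Q = False, True

# "not" binds tighter than "or", so this parses as (not R) or Q.
without_parens = not R or Q
with_parens = (not R) or Q
other_grouping = not (R or Q)

print(without_parens, with_parens, other_grouping)  # True True False

# "and" binds tighter than "or": P or Q and S means P or (Q and S).
P, S = True, False
mixed = P or Q and S
forced = (P or Q) and S

print(mixed, forced)  # True False
```

In both cases the parenthesized variant that changes the grouping gives a different truth value, which is why explicit parentheses are recommended whenever the intended reading might be unclear.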
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said to be logically equivalent
if and only if their columns in the truth table are identical.
Let's take two propositions A and B; their logical equivalence is written as A ⇔ B. In the truth table below we can
see that the columns for ¬A ∨ B and A → B are identical, hence ¬A ∨ B is equivalent to A → B.
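The column comparison can be automated. In the Python sketch below, the helper `equivalent` (a hypothetical name) checks whether two formulas agree on every assignment of truth values, and `implication` spells out the truth table of A → B row by row so that it can be compared with ¬A ∨ B.

```python
from itertools import product


def equivalent(f, g, n):
    """True if f and g agree on every assignment of n Boolean variables."""
    return all(f(*vals) == g(*vals) for vals in product([True, False], repeat=n))


def implication(a, b):
    # Truth table of A → B written out row by row.
    table = {(True, True): True, (True, False): False,
             (False, True): True, (False, False): True}
    return table[(a, b)]


print(equivalent(lambda a, b: (not a) or b, implication, 2))  # True
print(equivalent(lambda a, b: a or b, implication, 2))        # False
```

The second call shows a non-equivalent pair: A ∨ B and A → B differ, for example, when A is true and B is false.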

Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ False = P (note that P ∨ True = True).
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o De Morgan's Laws:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
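Each of these properties can be confirmed by checking every combination of truth values. The Python sketch below does this for a selection of the laws; the dictionary `laws` and its labels are just a convenient way to organize the checks in this illustration.

```python
from itertools import product

# Each entry compares the two sides of one law for given truth values p, q, r.
laws = {
    "commutativity of ∧": lambda p, q, r: (p and q) == (q and p),
    "associativity of ∨": lambda p, q, r: ((p or q) or r) == (p or (q or r)),
    "distributivity of ∧ over ∨": lambda p, q, r: (p and (q or r)) == ((p and q) or (p and r)),
    "distributivity of ∨ over ∧": lambda p, q, r: (p or (q and r)) == ((p or q) and (p or r)),
    "De Morgan (∧)": lambda p, q, r: (not (p and q)) == ((not p) or (not q)),
    "De Morgan (∨)": lambda p, q, r: (not (p or q)) == ((not p) and (not q)),
    "double negation": lambda p, q, r: (not (not p)) == p,
}

# A law holds if its two sides agree on all 2^3 = 8 assignments.
results = {name: all(law(*v) for v in product([True, False], repeat=3))
           for name, law in laws.items()}

for name, holds in results.items():
    print(f"{name}: {holds}")
```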
Limitations of Propositional logic:
o We cannot represent quantified statements involving all, some, or none with propositional logic. Example:
1. All the girls are intelligent.
2. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or logical relationships.
First-Order Logic in Artificial intelligence
In the topic of propositional logic, we saw how to represent statements using propositional logic. But
unfortunately, in propositional logic, we can only represent facts, which are either true or false. PL is not
sufficient to represent complex sentences or natural language statements; it has very
limited expressive power. Consider the following sentences, which we cannot represent using PL:
o "Some humans are intelligent", or
o "Sachin likes cricket."
To represent the above statements, PL is not sufficient, so we require a more powerful logic, such as
first-order logic.
First-Order logic:
o First-order logic is another way of representing knowledge in artificial intelligence. It is an extension of
propositional logic.
o FOL is sufficiently expressive to represent natural language statements in a concise way.
o First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a
powerful language that expresses information about objects more easily and can also express
the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts, as
propositional logic does, but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: These can be unary relations such as red or round, or n-ary relations such
as is adjacent, the sister of, brother of, has color, comes between
o Functions: Father of, best friend, third inning of, end of, ......
o Like a natural language, first-order logic also has two main parts:
o Syntax
o Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collection of symbols is a logical expression in first-order logic. The basic
syntactic elements of first-order logic are symbols. We write statements in short-hand notation in FOL.
Basic Elements of First-order logic:
Following are the basic elements of FOL syntax:
Constant | 1, 2, A, John, Mumbai, cat, ....
Variables | x, y, z, a, b, ....
Predicates | Brother, Father, >, ....
Function | sqrt, LeftLegOf, ....
Connectives | ∧, ∨, ¬, ⇒, ⇔
Equality | ==
Quantifier | ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a
predicate symbol followed by a parenthesized sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).
                Chinky is a cat: => cat (Chinky).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts:
o Subject: Subject is the main part of the statement.
o Predicate: A predicate can be defined as a relation, which binds two atoms together in a statement.
Consider the statement: "x is an integer.", it consists of two parts, the first part x is the subject of the
statement and second part "is an integer," is known as a predicate.

Quantifiers in First-order logic:


o A quantifier is a language element which generates quantification, and quantification specifies the quantity
of specimens in the universe of discourse.
o Quantifiers are the symbols that determine the range and scope of a variable in a
logical expression. There are two types of quantifier:
1. Universal Quantifier, (for all, everyone, everything)
2. Existential quantifier, (for some, at least one).
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the statement within its range is true
for everything or every instance of a particular thing.
The Universal quantifier is represented by a symbol ∀, which resembles an inverted A.
Note: In universal quantifier we use implication "→".
If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let x be a variable that refers to a man; then the statement can be represented in the universe of discourse (UOD) as below:
∀x man(x) → drink (x, coffee).
It will be read as: For all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within its scope is true for at
least one instance of something.
It is denoted by the logical operator ∃, which resembles a reversed E. When it is used with a predicate variable,
it is called an existential quantifier.
Note: In Existential quantifier we always use AND or Conjunction symbol ( ∧).
If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:
o There exists a 'x.'
o For some 'x.'
o For at least one 'x.'
Example:
Some boys are intelligent.
∃x: boys(x) ∧ intelligent(x)
It will be read as: There exists some x such that x is a boy and x is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
o For the universal quantifier, ∀x∀y is equivalent to ∀y∀x.
o For the existential quantifier, ∃x∃y is equivalent to ∃y∃x.
o ∃x∀y is not equivalent to ∀y∃x.
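Over a finite universe of discourse, the two quantifiers behave like Python's all() and any(). The sketch below is an illustration under assumed data (the domain, the sets boy and intelligent, and the relation x == y are all hypothetical); it shows why ∃ is paired with ∧ rather than →, and why the order of mixed quantifiers matters.

```python
def implies(a, b):
    """Material implication a → b."""
    return (not a) or b

# Hypothetical finite universe of discourse.
domain = ["Ravi", "Ajay", "Chinky"]
boy = {"Ravi", "Ajay"}          # Chinky is not a boy
intelligent = {"Chinky"}        # no boy is intelligent in this toy world

# ∀x boy(x) → intelligent(x): false, because Ravi is a boy who is not intelligent.
forall_implies = all(implies(x in boy, x in intelligent) for x in domain)

# ∃x boy(x) ∧ intelligent(x): false, as intended for "some boys are intelligent".
exists_and = any((x in boy) and (x in intelligent) for x in domain)

# ∃x boy(x) → intelligent(x): true, but only because Chinky is not a boy.
# This is why → is the wrong main connective for ∃.
exists_implies = any(implies(x in boy, x in intelligent) for x in domain)

# Quantifier order: with R(x, y) meaning x == y over {0, 1, 2},
# ∀y ∃x R(x, y) holds but ∃x ∀y R(x, y) does not.
nums = [0, 1, 2]
forall_exists = all(any(x == y for x in nums) for y in nums)
exists_forall = any(all(x == y for y in nums) for x in nums)

print(forall_implies, exists_and, exists_implies, forall_exists, exists_forall)
```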
Some Examples of FOL using quantifier:
1. All birds fly.
In this question the predicate is "fly(x)," where x = bird.
Since the statement covers all birds, it will be represented as follows.
              ∀x bird(x) → fly(x).
2. Every man respects his parent.
In this question, the predicate is "respects(x, y)," where x = man, and y = parent.
Since the statement covers every man, we will use ∀, and it will be represented as follows:
              ∀x man(x) → respects (x, parent).
3. Some boys play cricket.
In this question, the predicate is "play(x, y)," where x = boys, and y = game. Since the statement covers only some boys, we
will use ∃ with the connective ∧, and it will be represented as:
              ∃x boys(x) ∧ play(x, cricket).
4. Not all students like both Mathematics and Science.
In this question, the predicate is "like(x, y)," where x = student, and y = subject.
Since the statement does not cover all students, we will use ∀ with negation, which gives the following representation:
              ¬∀ (x) [ student(x) → like(x, Mathematics) ∧ like(x, Science)].
5. Only one student failed in Mathematics.
In this question, the predicate is "failed(x, y)," where x = student, and y = subject.
Since there is only one student who failed in Mathematics, we will use the following representation:
              ∃(x) [ student(x) ∧ failed (x, Mathematics) ∧ ∀ (y) [¬(x==y) ∧ student(y) → ¬failed (y,
Mathematics)]].
Free and Bound Variables:
Quantifiers interact with the variables that appear within their scope. There are two types of variables in first-
order logic, which are given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope of every quantifier over that variable.
          Example: ∀x ∃(y)[P (x, y, z)], where z is a free variable.
Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the scope of a
quantifier over that variable.
          Example: ∀x ∃y [A (x) ∧ B( y)], here x and y are the bound variables.
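The definition of free and bound variables can be sketched as a recursive walk over a formula. The tuple encoding of formulas and the names VARIABLES and free_vars below are assumptions made for this illustration, not notation from the tutorial.

```python
# Assumed encoding (hypothetical):
#   ("pred", name, term, ...)      atomic sentence
#   ("not", f)                     negation
#   ("and" | "or" | "implies", f, g)
#   ("forall" | "exists", var, f)  quantified formula
VARIABLES = {"x", "y", "z"}  # symbols treated as variables; everything else is a constant

def free_vars(formula, bound=frozenset()):
    """Return the set of variables that occur free in the formula."""
    kind = formula[0]
    if kind == "pred":
        return {t for t in formula[2:] if t in VARIABLES and t not in bound}
    if kind in ("forall", "exists"):
        # The quantified variable is bound inside the body.
        return free_vars(formula[2], bound | {formula[1]})
    if kind == "not":
        return free_vars(formula[1], bound)
    # Binary connective: union of the free variables of both sides.
    return free_vars(formula[1], bound) | free_vars(formula[2], bound)

# ∀x ∃y P(x, y, z): z is free.
f1 = ("forall", "x", ("exists", "y", ("pred", "P", "x", "y", "z")))
# ∀x ∃y [A(x) ∧ B(y)]: no free variables.
f2 = ("forall", "x", ("exists", "y",
      ("and", ("pred", "A", "x"), ("pred", "B", "y"))))

print(free_vars(f1), free_vars(f2))
```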

Forward Chaining and backward chaining in AI


In artificial intelligence, forward and backward chaining are important topics, but before studying
them, let's first understand where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence that applies logical rules
to the knowledge base to infer new information from known facts. The first inference engines were part of expert
systems. An inference engine commonly proceeds in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn Clause and Definite clause:
Horn clauses and definite clauses are forms of sentences that enable the knowledge base to use a more
restricted and efficient inference algorithm. Logical inference algorithms use forward and backward chaining
approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as a
definite clause or strict Horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known as a Horn
clause. Hence all definite clauses are Horn clauses.
Example: (¬ p ∨ ¬ q ∨ k). It has only one positive literal k.
It is equivalent to p ∧ q → k.
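The two definitions differ only in how many positive literals are allowed, which can be sketched in a few lines. In this illustration a clause is assumed to be a list of literal strings, with a leading "¬" marking a negative literal; the function names are hypothetical.

```python
def count_positive(clause):
    """Number of literals in the clause that are not negated."""
    return sum(1 for lit in clause if not lit.startswith("¬"))

def is_horn(clause):
    """Horn clause: at most one positive literal."""
    return count_positive(clause) <= 1

def is_definite(clause):
    """Definite clause: exactly one positive literal."""
    return count_positive(clause) == 1

def as_implication(clause):
    """Rewrite a definite clause as 'body → head'."""
    if not is_definite(clause):
        raise ValueError("only definite clauses have a single head")
    body = [lit[1:] for lit in clause if lit.startswith("¬")]
    head = next(lit for lit in clause if not lit.startswith("¬"))
    return " ∧ ".join(body) + " → " + head if body else head

print(as_implication(["¬p", "¬q", "k"]))  # p ∧ q → k
```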
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when using an inference
engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and
applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds
their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from bottom to top.
o It is a process of reaching a conclusion based on known facts or data, starting from the initial state and
reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using the available data.
o The forward-chaining approach is commonly used in expert systems (such as CLIPS), business rule systems, and
production rule systems.
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations. Country A, an enemy
of America, has some missiles, and all the missiles were sold to it by Robert, who is an American
citizen."
Prove that "Robert is criminal."
To solve the above problem, we will first convert all the above facts into first-order definite clauses, and then we
will use a forward-chaining algorithm to reach the goal.
Facts Conversion into FOL:
o It is a crime for an American to sell weapons to hostile nations. (Let's say p, q, and r are variables)
American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p)       ...(1)
o Country A has some missiles. ∃p Owns(A, p) ∧ Missile(p). It can be written as two definite clauses by
using Existential Instantiation, introducing a new constant T1.
Owns(A, T1)             ......(2)
Missile(T1)             .......(3)
o All of the missiles were sold to country A by Robert.
∀p Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A)       ......(4)
o Missiles are weapons.
Missile(p) → Weapon(p)             .......(5)
o An enemy of America is known as hostile.
Enemy(p, America) → Hostile(p)             ........(6)
o Country A is an enemy of America.
Enemy(A, America)             .........(7)
o Robert is American.
American(Robert).             ..........(8)
Forward chaining proof:
Step-1:
In the first step we will start with the known facts and will choose the sentences which do not have implications,
such as: American(Robert), Enemy(A, America), Owns(A, T1), and Missile(T1). All these facts will be
represented as below.

Step-2:
At the second step, we will look for the facts that can be inferred from the available facts, that is, from rules whose premises are satisfied.
Rule-(1) does not have its premises satisfied yet, so nothing is added from it in the first iteration.
Facts (2) and (3) are already added.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, which is inferred from the
conjunction of facts (2) and (3).
Rule-(5) is satisfied with the substitution {p/T1}, so Weapon(T1) is added, which is inferred from fact (3).
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which is inferred from fact (7).
Step-3:
At step-3, we can check that Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add
Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement.

Hence it is proved that Robert is a criminal using the forward chaining approach.
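The three steps above can be sketched as a small forward chainer. This is a minimal illustration rather than a production inference engine: facts are assumed to be tuples such as ("Missile", "T1"), variables are assumed to be strings starting with "?", and the function names match, match_all, substitute, and forward_chain are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, theta):
    """Extend substitution theta so that pattern equals the ground fact, else None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:   # variable already bound to something else
                return None
        elif p != f:
            return None
    return theta

def match_all(premises, facts, theta):
    """Yield every substitution that satisfies all premises against the known facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        extended = match(premises[0], fact, theta)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)

def substitute(atom, theta):
    return tuple(theta.get(term, term) for term in atom)

def forward_chain(facts, rules, goal):
    """Apply rules to known facts until the goal appears or nothing new is derived."""
    facts = set(facts)
    while goal not in facts:
        new = {substitute(conclusion, theta)
               for premises, conclusion in rules
               for theta in match_all(premises, facts, {})} - facts
        if not new:
            return False, facts
        facts |= new
    return True, facts

rules = [
    ([("American", "?p"), ("Weapon", "?q"), ("Sells", "?p", "?q", "?r"), ("Hostile", "?r")],
     ("Criminal", "?p")),                                                        # rule (1)
    ([("Missile", "?p"), ("Owns", "A", "?p")], ("Sells", "Robert", "?p", "A")),  # rule (4)
    ([("Missile", "?p")], ("Weapon", "?p")),                                     # rule (5)
    ([("Enemy", "?p", "America")], ("Hostile", "?p")),                           # rule (6)
]
facts = [("Owns", "A", "T1"), ("Missile", "T1"),
         ("Enemy", "A", "America"), ("American", "Robert")]

proved, known = forward_chain(facts, rules, ("Criminal", "Robert"))
print(proved)  # True
```

The first pass of the loop adds Sells(Robert, T1, A), Weapon(T1), and Hostile(A); the second pass adds Criminal(Robert), mirroring Step-2 and Step-3.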


B. Backward Chaining:
Backward chaining is also known as backward deduction or backward reasoning when using an inference
engine. A backward chaining algorithm is a form of reasoning which starts with the goal and works backward,
chaining through rules to find known facts that support the goal.
Properties of backward chaining:
o It is known as a top-down approach.
o Backward chaining is based on the Modus Ponens inference rule.
o In backward chaining, the goal is broken into one or more sub-goals, which are then proved true.
o It is called a goal-driven approach, as a list of goals decides which rules are selected and used.
o The backward-chaining algorithm is used in game theory, automated theorem proving tools, inference
engines, proof assistants, and various AI applications.
o The backward-chaining method mostly uses a depth-first search strategy for proof.
Example:
In backward chaining, we will use the same example as above and rewrite all the rules.
o American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p)       ...(1)
o Owns(A, T1)                 ........(2)
o Missile(T1)                 ........(3)
o ∀p Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A)           ......(4)
o Missile(p) → Weapon(p)                 .......(5)
o Enemy(p, America) → Hostile(p)                 ........(6)
o Enemy(A, America)                 .........(7)
o American(Robert).                 ..........(8)
Backward-Chaining proof:
In backward chaining, we will start with our goal predicate, which is Criminal(Robert), and then work backward
through the rules.
Step-1:
At the first step, we will take the goal fact. From the goal fact, we will infer other facts, and at last, we will
prove those facts true. Our goal fact is "Robert is Criminal," so its predicate is Criminal(Robert).

Step-2:
At the second step, we will infer other facts from the goal fact which satisfy the rules. As we can see in Rule-(1),
the goal predicate Criminal(Robert) is present with the substitution {p/Robert}. So we will add all the conjunctive
facts below the first level and will replace p with Robert.
Here we can see American(Robert) is a fact, so it is proved here.

Step-3:
At step-3, we will extract the further fact Missile(q), which is inferred from Weapon(q), as it satisfies Rule-(5).
Weapon(q) is also true with the substitution of the constant T1 for q.

Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4)
with the substitution of A in place of r. So these two statements are proved here.

Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6). Hence all the
statements are proved true using backward chaining.
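The goal-driven proof above can be sketched as a depth-first search over sub-goals. The following is a minimal illustration under stated assumptions: atoms are tuples, variables are strings starting with "?", facts are stored as rules with no premises, and the names walk, unify, rename, and prove are hypothetical. No occurs check is needed here because the example has no function terms.

```python
import itertools

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def walk(term, theta):
    """Follow variable bindings in theta until a constant or unbound variable is reached."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term

def unify(a, b, theta):
    """Unify two atoms under theta; return the extended substitution or None."""
    if len(a) != len(b):
        return None
    theta = dict(theta)
    for x, y in zip(a, b):
        x, y = walk(x, theta), walk(y, theta)
        if x == y:
            continue
        if is_var(x):
            theta[x] = y
        elif is_var(y):
            theta[y] = x
        else:
            return None
    return theta

_counter = itertools.count()

def rename(rule):
    """Give the rule fresh variable names so they cannot clash with the goal's variables."""
    n = next(_counter)
    premises, conclusion = rule
    fresh = lambda atom: tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    return [fresh(p) for p in premises], fresh(conclusion)

def prove(goals, rules, theta):
    """Depth-first backward chaining: yield each substitution that proves all goals."""
    if not goals:
        yield theta
        return
    first, rest = goals[0], goals[1:]
    for rule in rules:
        premises, conclusion = rename(rule)
        extended = unify(first, conclusion, theta)
        if extended is not None:
            # Replace the goal by the rule's premises (its sub-goals).
            yield from prove(premises + rest, rules, extended)

rules = [
    ([("American", "?p"), ("Weapon", "?q"), ("Sells", "?p", "?q", "?r"), ("Hostile", "?r")],
     ("Criminal", "?p")),
    ([], ("Owns", "A", "T1")),
    ([], ("Missile", "T1")),
    ([("Missile", "?p"), ("Owns", "A", "?p")], ("Sells", "Robert", "?p", "A")),
    ([("Missile", "?p")], ("Weapon", "?p")),
    ([("Enemy", "?p", "America")], ("Hostile", "?p")),
    ([], ("Enemy", "A", "America")),
    ([], ("American", "Robert")),
]

proof = next(prove([("Criminal", "Robert")], rules, {}), None)
print(proof is not None)  # True

answer = next(prove([("Criminal", "?who")], rules, {}))
print(walk("?who", answer))  # Robert
```

A knowledge base with recursive rules could make this simple depth-first search loop forever; the example rules are not recursive, so the search terminates.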
Difference between backward chaining and forward chaining
Following are the differences between forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves forward by applying
inference rules to extract more data, and it continues until it reaches the goal, whereas backward
chaining starts from the goal and moves backward by using inference rules to determine the facts that satisfy
the goal.
o Forward chaining is called a data-driven inference technique, whereas backward chaining is called
a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward chaining is known as a top-
down approach.
o Forward chaining uses a breadth-first search strategy, whereas backward chaining uses a depth-first
search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process monitoring, diagnosis, and
classification, whereas backward chaining can be used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries to avoid
unnecessary paths of reasoning.
o In forward chaining there can be various ASK questions from the knowledge base, whereas in backward
chaining there can be fewer ASK questions.
o Forward chaining is slow as it checks all the rules, whereas backward chaining is fast as it checks only the few
required rules.

S. No. | Forward Chaining | Backward Chaining
1. | Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal. | Backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.
2. | It is a bottom-up approach. | It is a top-down approach.
3. | Forward chaining is known as a data-driven inference technique, as we reach the goal using the available data. | Backward chaining is known as a goal-driven technique, as we start from the goal and divide it into sub-goals to extract the facts.
4. | Forward chaining reasoning applies a breadth-first search strategy. | Backward chaining reasoning applies a depth-first search strategy.
5. | Forward chaining tests all the available rules. | Backward chaining tests only the few required rules.
6. | Forward chaining is suitable for planning, monitoring, control, and interpretation applications. | Backward chaining is suitable for diagnostic, prescription, and debugging applications.
7. | Forward chaining can generate an infinite number of possible conclusions. | Backward chaining generates a finite number of possible conclusions.
8. | It operates in the forward direction. | It operates in the backward direction.
9. | Forward chaining is aimed at any conclusion. | Backward chaining is aimed only at the required data.

Probabilistic reasoning in Artificial intelligence

Uncertainty:

Till now, we have learned knowledge representation using first-order logic and propositional logic with certainty,
which means we were sure about the predicates. With this knowledge representation, we might write A→B, which
means if A is true then B is true. But consider a situation where we are not sure whether A is true or not;
then we cannot express this statement. This situation is called uncertainty.

So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or
probabilistic reasoning.
Causes of uncertainty:

Following are some leading causes of uncertainty to occur in the real world.

1. Information obtained from unreliable sources.

2. Experimental Errors

3. Equipment fault

4. Temperature variation

5. Climate change.

Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability to indicate
the uncertainty in knowledge. In probabilistic reasoning, we combine probability theory with logic to handle the
uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that results
from laziness (it is too much work to list every rule and exception) and ignorance (we lack complete knowledge of the domain).
In the real world, there are lots of scenarios where the certainty of something is not confirmed, such as "It will
rain today," "behavior of someone for some situations," "A match between two teams or two players." These are
probable sentences: we can assume that they will happen, but we are not sure about it, so here we use probabilistic
reasoning.
Need of probabilistic reasoning in AI:
o When there are unpredictable outcomes.
o When the specifications or possibilities of predicates become too large to handle.
o When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
o Bayes' rule
o Bayesian Statistics
Note: We will learn the above two rules in later chapters.
As probabilistic reasoning uses probability and related terms, before understanding probabilistic reasoning, let's
understand some common terms:
Probability: Probability can be defined as the chance that an uncertain event will occur. It is the numerical measure
of the likelihood that an event will occur. The value of a probability always lies between 0 and 1, and these two endpoints represent
the ideal certainties.
1. 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
2. P(A) = 0 indicates that event A is certain not to occur.
3. P(A) = 1 indicates total certainty that event A will occur.
We can find the probability of an uncertain event by using the below formula.
Probability of occurrence = Number of desired outcomes / Total number of outcomes
o P(¬A) = probability of event A not happening.
o P(¬A) + P(A) = 1.
Event: Each possible outcome of a variable is called an event.
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and objects in the real world.
Prior probability: The prior probability of an event is probability computed before observing new information.
Posterior Probability: The probability that is calculated after all evidence or information has been taken into account.
It is a combination of prior probability and new information.
Conditional probability:
Conditional probability is the probability of an event occurring when another event has already happened.
Suppose we want to calculate the probability of event A when event B has already occurred, "the probability of A under
the conditions of B". It can be written as:
P(A|B) = P(A⋀B) / P(B)
where P(A⋀B) = joint probability of A and B
P(B) = marginal probability of B.
If the probability of A is given and we need to find the probability of B, then it will be given as:
P(B|A) = P(A⋀B) / P(A)
It can be explained with a Venn diagram: since B is the event that has occurred, the sample space is reduced
to set B, and we can calculate the probability of A given that B has already occurred by dividing the probability
P(A⋀B) by P(B).
Example:
In a class, 70% of the students like English and 40% of the students like both English and
Mathematics. What percentage of the students who like English also like Mathematics?
Solution:
Let A be the event that a student likes Mathematics, and
B be the event that a student likes English.
P(A|B) = P(A⋀B) / P(B) = 0.4 / 0.7 ≈ 0.57
Hence, 57% of the students who like English also like Mathematics.
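The same calculation can be written as a one-line function. This is a small sketch; the function name conditional is an assumption made for illustration.

```python
def conditional(p_joint, p_condition):
    """P(A|B) = P(A and B) / P(B); P(B) must be non-zero."""
    if p_condition == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return p_joint / p_condition

# 40% like both English and Mathematics, 70% like English.
p_math_given_english = conditional(0.40, 0.70)
print(round(p_math_given_english * 100))  # 57
```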

Bayes' theorem in Artificial intelligence


Bayes' theorem:
Bayes' theorem is also known as Bayes' rule or Bayes' law, and it is the basis of Bayesian reasoning, which determines the
probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian inference is an
application of Bayes' theorem, which is fundamental to Bayesian statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Bayes' theorem allows updating the probability prediction of an event by observing new information from the real
world.
Example: If cancer is related to one's age, then by using Bayes' theorem, we can determine the probability of
cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B.
From the product rule we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B)       ...(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI
systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate. It is read as the probability of hypothesis A given
that evidence B has occurred.
P(B|A) is called the likelihood: assuming that the hypothesis is true, we calculate the probability of the
evidence.
P(A) is called the prior probability, the probability of the hypothesis before considering the evidence.
P(B) is called the marginal probability, the probability of the evidence alone.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule can be written as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ........, An is a set of mutually exclusive and exhaustive events.
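The general form of Bayes' rule over mutually exclusive and exhaustive events can be sketched as follows. The priors and likelihoods in this example are hypothetical numbers chosen only for illustration, and the function name posterior is an assumption.

```python
def posterior(priors, likelihoods):
    """Return P(Ai|B) for every i, given P(Ai) and P(B|Ai).

    The marginal P(B) is the sum over i of P(Ai) * P(B|Ai).
    """
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    if evidence == 0:
        raise ValueError("the evidence has probability 0 under every hypothesis")
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Hypothetical example: three exclusive causes A1, A2, A3 of an observation B.
priors = [0.5, 0.3, 0.2]          # P(Ai), must sum to 1
likelihoods = [0.01, 0.02, 0.03]  # P(B|Ai)

post = posterior(priors, likelihoods)
print([round(p, 4) for p in post])
```

Although A1 has the largest prior, its small likelihood leaves it with the smallest posterior, which shows how the evidence reweights the hypotheses.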
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in
cases where we have good estimates of these three terms and want to determine the fourth one. Suppose we
perceive the effect of some unknown cause and want to compute the probability of that cause; then Bayes' rule
becomes:
P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-1:
Question: What is the probability that a patient with a stiff neck has the disease meningitis?
Given Data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the time. He
is also aware of some more facts, which are given as follows:
o The known probability that a patient has meningitis is 1/30,000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has meningitis, so we
can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 0.001333 ≈ 1/750
Hence, we can assume that 1 out of 750 patients with a stiff neck has meningitis.
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that the card
is a king is 4/52. Calculate the posterior probability P(King|Face), which is the probability that a drawn face card is
a king.
Solution:
P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that a card is a face card = 12/52 = 3/13
P(Face|King): probability that the card is a face card given that it is a king = 1
Putting all values into Bayes' rule, we get:
P(King|Face) = P(Face|King) P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3
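Both examples apply the same formula, so they can be checked with one small function. The sketch below uses Python's fractions module to keep the arithmetic exact; the function name bayes is an assumption made for illustration.

```python
from fractions import Fraction

def bayes(likelihood, prior, evidence):
    """Posterior P(H|E) = P(E|H) * P(H) / P(E)."""
    if evidence == 0:
        raise ValueError("P(evidence) must be non-zero")
    return likelihood * prior / evidence

# Example-1: P(meningitis | stiff neck)
p_meningitis = bayes(Fraction(8, 10), Fraction(1, 30000), Fraction(2, 100))

# Example-2: P(King | Face)
p_king = bayes(Fraction(1), Fraction(4, 52), Fraction(12, 52))

print(p_meningitis, p_king)  # 1/750 1/3
```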

Application of Bayes' theorem in Artificial intelligence:


Following are some applications of Bayes' theorem:
o It is used to calculate the next step of a robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.

Bayesian Belief Network in artificial intelligence


A Bayesian belief network is a key computer technology for dealing with probabilistic events and for solving problems
that involve uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their conditional
dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic because they are built from a probability distribution and also
use probability theory for prediction and anomaly detection.
Real-world applications are probabilistic in nature, and to represent the relationship between multiple events, we
need a Bayesian network. It can also be used in various tasks including prediction, anomaly detection,
diagnostics, automated insight, reasoning, time series prediction, and decision making under
uncertainty.
A Bayesian network can be used for building models from data and experts' opinions, and it consists of two parts:
o Directed Acyclic Graph
o Table of conditional probabilities.
The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge
is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:
o Each node corresponds to a random variable, and a variable can be continuous or discrete.
o Arcs or directed arrows represent the causal relationships or conditional probabilities between random
variables. These directed links or arrows connect pairs of nodes in the graph.
These links represent that one node directly influences the other node; if there is no directed link between
two nodes, neither node directly influences the other.
o In the above diagram, A, B, C, and D are random variables represented by the nodes of
the network graph.
o If we are considering node B, which is connected with node A by a directed arrow, then
node A is called the parent of node B.
o Node C is independent of node A.
Note: The Bayesian network graph does not contain any cycles. Hence, it is known as a directed acyclic graph or
DAG.
The Bayesian network has mainly two components:
o Causal Component
o Actual numbers
Each node in the Bayesian network has a conditional probability distribution P(Xi |Parent(Xi) ), which determines the
effect of the parents on that node.
A Bayesian network is based on the joint probability distribution and conditional probability. So let's first understand the
joint probability distribution:
Joint probability distribution:
If we have variables x1, x2, x3, ....., xn, then the probabilities of the different combinations of x1, x2, x3, ....., xn are
known as the joint probability distribution.
P[x1, x2, x3, ....., xn] can be expanded with the chain rule in the following way:
= P[x1| x2, x3, ....., xn] P[x2, x3, ....., xn]
= P[x1| x2, x3, ....., xn] P[x2| x3, ....., xn] .... P[xn-1| xn] P[xn].
In general, for each variable Xi, we can write the equation as:
P(Xi|Xi-1, ........., X1) = P(Xi |Parents(Xi ))
Explanation of Bayesian network:
Let's understand the Bayesian network through an example by creating a directed acyclic graph:
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds to
a burglary but also responds to minor earthquakes. Harry has two neighbors, David and Sophia, who
have taken responsibility for informing Harry at work when they hear the alarm. David always calls Harry when he
hears the alarm, but sometimes he confuses the phone ringing with the alarm and calls then too. On the other
hand, Sophia likes to listen to loud music, so sometimes she fails to hear the alarm. Here we would like to
compute the probability of the burglar alarm sounding.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has
occurred, and both David and Sophia called Harry.
Solution:
o The Bayesian network for the above problem is given below. The network structure shows that
burglary and earthquake are the parent nodes of the alarm and directly affect the probability of the alarm
going off, whereas David's and Sophia's calls depend only on the alarm.
o The network represents our assumptions that the neighbors do not directly perceive the burglary, do not
notice the minor earthquake, and do not confer before calling.
o The conditional distribution for each node is given as a conditional probability table, or CPT.
o Each row in the CPT must sum to 1 because all the entries in the row represent an exhaustive set of
cases for the variable.
o In a CPT, a boolean variable with k boolean parents contains 2^k probabilities. Hence, if there are two
parents, then the CPT will contain 4 probability values.
List of all events occurring in this network:
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of the problem statement in the form of a probability, P[D, S, A, B, E], and rewrite this
probability statement using the joint probability distribution:
P[D, S, A, B, E] = P[D | S, A, B, E]. P[S, A, B, E]
= P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
= P[D | A]. P[S | A, B, E]. P[A, B, E]
= P[D | A]. P[S | A]. P[A | B, E]. P[B, E]
= P[D | A]. P[S | A]. P[A | B, E]. P[B | E]. P[E]
Since burglary and earthquake have no parents in the network and are independent of each other, P[B | E] = P[B].
Let's take the observed probabilities for the burglary and earthquake components:
P(B= True) = 0.002, which is the probability of burglary.
P(B= False)= 0.998, which is the probability of no burglary.
P(E= True)= 0.001, which is the probability of a minor earthquake.
P(E= False)= 0.999, which is the probability that an earthquake has not occurred.
We can provide the conditional probabilities as per the below tables:

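Conditional probability table for Alarm A:
The conditional probability of Alarm A depends on Burglary and Earthquake. The P(A= False) column is the complement of P(A= True), since each row must sum to 1:
B | E | P(A= True) | P(A= False)
True | True | 0.94 | 0.06
True | False | 0.95 | 0.05
False | True | 0.31 | 0.69
False | False | 0.001 | 0.999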


Conditional probability table for David Calls:
The conditional probability that David calls depends on its parent node, Alarm.
A | P(D= True) | P(D= False)
True | 0.91 | 0.09
False | 0.05 | 0.95

Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on its parent node, Alarm.
A | P(S= True) | P(S= False)
True | 0.75 | 0.25
False | 0.02 | 0.98


From the formula of the joint distribution, we can write the problem statement in the form of a probability distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) * P (D|A) * P (A|¬B ^ ¬E) * P (¬B) * P (¬E)
= 0.75 * 0.91 * 0.001 * 0.998 * 0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using the joint distribution.
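The factorised joint distribution can be sketched directly from the conditional probability tables. In this illustration the dictionary names and the helper functions bernoulli and joint are assumptions; the probability values are the ones given in the tables above.

```python
from itertools import product

P_B = 0.002                      # P(Burglary = True)
P_E = 0.001                      # P(Earthquake = True)
P_A = {                          # P(Alarm = True | Burglary, Earthquake)
    (True, True): 0.94, (True, False): 0.95,
    (False, True): 0.31, (False, False): 0.001,
}
P_D = {True: 0.91, False: 0.05}  # P(David calls = True | Alarm)
P_S = {True: 0.75, False: 0.02}  # P(Sophia calls = True | Alarm)

def bernoulli(p_true, value):
    """Probability that a boolean variable takes `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(d, s, a, b, e):
    """P(D=d, S=s, A=a, B=b, E=e) = P(d|a) P(s|a) P(a|b,e) P(b) P(e)."""
    return (bernoulli(P_D[a], d) * bernoulli(P_S[a], s)
            * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_B, b) * bernoulli(P_E, e))

# Alarm sounded, both neighbors called, no burglary, no earthquake.
p = joint(d=True, s=True, a=True, b=False, e=False)
print(round(p, 8))  # 0.00068045

# Sanity check: the joint distribution sums to 1 over all 32 assignments.
total = sum(joint(*values) for values in product([True, False], repeat=5))
```

Any other query about the domain can be answered by summing joint over the assignments that are consistent with the query, although this enumeration grows exponentially with the number of variables.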
The semantics of Bayesian Network:
There are two ways to understand the semantics of a Bayesian network, which are given below:
1. To understand the network as a representation of the joint probability distribution.
This view is helpful for understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional independence statements.
This view is helpful in designing inference procedures.
Fuzzy Logic Systems (FLS) produce acceptable but definite output in response to incomplete, ambiguous, distorted, or inaccurate (fuzzy) input.
What is Fuzzy Logic?
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The approach of FL imitates the way of decision making in humans
that involves all intermediate possibilities between digital values YES and NO.
The conventional logic block that a computer can understand takes precise input and produces a definite output as TRUE or FALSE, which is equivalent to a human's YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that unlike computers, human decision making includes a range of possibilities between YES and NO, such as −
CERTAINLY
YES
POSSIBLY YES
CANNOT SAY
POSSIBLY NO
CERTAINLY NO
Fuzzy logic works on the levels of possibility of the input to achieve a definite output.
Implementation
 It can be implemented in systems with various sizes and capabilities ranging from small micro-controllers to large, networked,
workstation-based control systems.
 It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?
Fuzzy logic is useful for commercial and practical purposes.
 It can control machines and consumer products.
 It may not give exact reasoning, but it gives acceptable reasoning.
 Fuzzy logic helps to deal with the uncertainty in engineering.
Fuzzy Logic Systems Architecture
It has four main parts as shown −
 Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It splits the input signal into five levels such as −

LP    x is Large Positive
MP    x is Medium Positive
S     x is Small
MN    x is Medium Negative
LN    x is Large Negative

 Knowledge Base − It stores IF-THEN rules provided by experts.


 Inference Engine − It simulates the human reasoning process by making fuzzy inference on the inputs and IF-THEN rules.
 Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp value.

The membership functions work on fuzzy sets of variables.


Membership Function
Membership functions allow you to quantify a linguistic term and represent a fuzzy set graphically. A membership function for a fuzzy set A on the universe of discourse X is defined as μA:X → [0,1].
Here, each element of X is mapped to a value between 0 and 1. This value is called the membership value or degree of membership. It quantifies the degree to which the element of X belongs to the fuzzy set A.
 x axis represents the universe of discourse.
 y axis represents the degrees of membership in the [0, 1] interval.
More than one membership function can be used to fuzzify a numerical value. Simple membership functions are preferred, because complex functions do not add precision to the output.
All membership functions for LP, MP, S, MN, and LN are shown as below −
The triangular membership function shapes are most common among various other membership function shapes such as trapezoidal, singleton,
and Gaussian.
Here, the input to the 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also changes.
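As a minimal sketch of a 5-level fuzzifier with triangular membership functions, the Python code below maps a crisp voltage to a degree of membership in each of LN, MN, S, MP, and LP. The breakpoints (peaks at -10, -5, 0, +5, and +10 volts, each triangle 10 volts wide) are assumptions chosen for illustration; the text does not specify them.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set.

    The value rises linearly from 0 at `left` to 1 at `peak`,
    then falls linearly back to 0 at `right`.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical (left, peak, right) breakpoints in volts for each level.
FUZZY_SETS = {
    "LN": (-15.0, -10.0, -5.0),
    "MN": (-10.0, -5.0, 0.0),
    "S": (-5.0, 0.0, 5.0),
    "MP": (0.0, 5.0, 10.0),
    "LP": (5.0, 10.0, 15.0),
}


def fuzzify(x):
    """Map a crisp input to its membership degree in every fuzzy set."""
    return {label: triangular(x, *abc) for label, abc in FUZZY_SETS.items()}


print(fuzzify(2.5))
```

With these breakpoints an input of 2.5 volts belongs partly to S and partly to MP, which shows how one crisp value can fall into two neighbouring fuzzy sets at once.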
Example of a Fuzzy Logic System
Let us consider an air conditioning system with a 5-level fuzzy logic system. This system adjusts the temperature of the air conditioner by comparing the room temperature with the target temperature value.

Algorithm
 Define linguistic Variables and terms (start)
 Construct membership functions for them. (start)
 Construct knowledge base of rules (start)
 Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
 Evaluate rules in the rule base. (Inference Engine)
 Combine results from each rule. (Inference Engine)
 Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room temperature, cold, warm, hot, etc., are
linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term and it can cover some portion of overall temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −

Step 3 − Construct knowledge base rules


Create a matrix of room temperature values versus target temperature values that an air conditioning system is expected to provide. Rows give the room temperature and columns give the target temperature.

RoomTemp. / Target    Very_Cold    Cold         Warm         Hot          Very_Hot
Very_Cold             No_Change    Heat         Heat         Heat         Heat
Cold                  Cool         No_Change    Heat         Heat         Heat
Warm                  Cool         Cool         No_Change    Heat         Heat
Hot                   Cool         Cool         Cool         No_Change    Heat
Very_Hot              Cool         Cool         Cool         Cool         No_Change

Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.

Sr. No.    Condition                                                  Action
1          IF temperature=(Cold OR Very_Cold) AND target=Warm THEN    Heat
2          IF temperature=(Hot OR Very_Hot) AND target=Warm THEN      Cool
3          IF (temperature=Warm) AND (target=Warm) THEN               No_Change


Step 4 − Obtain fuzzy value
Fuzzy set operations perform evaluation of rules. The operations used for OR and AND are Max and Min respectively. Combine all results of
evaluation to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to the membership function for the output variable.
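Steps 4 and 5 can be sketched in Python as follows. The three rules are the ones from the knowledge base above, with OR evaluated as max and AND as min. The membership degrees of the sample input, the crisp value attached to each action, and the weighted-average defuzzifier are all assumptions made for illustration; the text does not fix a particular defuzzification method.

```python
def evaluate_rules(temp, target):
    """Fire the three rules; OR is max and AND is min.

    `temp` and `target` map each linguistic term to a membership degree.
    """
    heat = min(max(temp["Cold"], temp["Very_Cold"]), target["Warm"])
    cool = min(max(temp["Hot"], temp["Very_Hot"]), target["Warm"])
    no_change = min(temp["Warm"], target["Warm"])
    return {"Heat": heat, "Cool": cool, "No_Change": no_change}


# Hypothetical crisp output for each action (temperature change per step).
ACTION_VALUES = {"Heat": 2.0, "No_Change": 0.0, "Cool": -2.0}


def defuzzify(strengths):
    """Weighted average of the action values (an assumed, simplified method)."""
    total = sum(strengths.values())
    if total == 0:
        return 0.0  # no rule fired, so request no change
    return sum(ACTION_VALUES[a] * s for a, s in strengths.items()) / total


# Hypothetical fuzzified inputs: the room is mostly Cold, the target is Warm.
temp = {"Very_Cold": 0.0, "Cold": 0.7, "Warm": 0.3, "Hot": 0.0, "Very_Hot": 0.0}
target = {"Very_Cold": 0.0, "Cold": 0.0, "Warm": 1.0, "Hot": 0.0, "Very_Hot": 0.0}

strengths = evaluate_rules(temp, target)
crisp = defuzzify(strengths)
print(strengths, crisp)
```

Here the Heat rule fires with strength 0.7 and the No_Change rule with strength 0.3, so the crisp output is a moderate heating command rather than full heating.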

Application Areas of Fuzzy Logic


The key application areas of fuzzy logic are as given −
Automotive Systems
 Automatic Gearboxes
 Four-Wheel Steering
 Vehicle environment control
Consumer Electronic Goods
 Hi-Fi Systems
 Photocopiers
 Still and Video Cameras
 Television
Domestic Goods
 Microwave Ovens
 Refrigerators
 Toasters
 Vacuum Cleaners
 Washing Machines
Environment Control
 Air Conditioners/Dryers/Heaters
 Humidifiers
Advantages of FLSs
 Mathematical concepts within fuzzy reasoning are very simple.
 You can modify an FLS by just adding or deleting rules, due to the flexibility of fuzzy logic.
 Fuzzy logic Systems can take imprecise, distorted, noisy input information.
 FLSs are easy to construct and understand.
 Fuzzy logic can be applied to complex problems in many fields, including medicine, as it resembles human reasoning and decision making.
Disadvantages of FLSs
 There is no systematic approach to fuzzy system designing.
 They are understandable only when simple.
 They are suitable only for problems that do not need high accuracy.

Constraint Satisfaction Problems in Artificial Intelligence


We have seen many techniques, such as local search and adversarial search, to solve different problems. The objective of every problem-solving technique is the same, i.e., to find a solution that reaches the goal. However, in adversarial search and local search, there were no constraints on the agents while solving the problems and reaching their solutions.
In this section, we will discuss another type of problem-solving technique known as the constraint satisfaction technique. As the name suggests, constraint satisfaction means solving a problem under certain constraints or rules.
Constraint satisfaction is a technique where a problem is solved when its values satisfy certain constraints or rules of the problem. This type of technique leads to a deeper understanding of the problem structure as well as its complexity.
Constraint satisfaction depends on three components, namely:
 X: It is a set of variables.
 D: It is a set of domains where the variables reside. There is a specific domain for each variable.
 C: It is a set of constraints which are followed by the set of variables.
In constraint satisfaction, domains are the sets of values that the variables may take, subject to the problem-specific constraints. These are the three main elements of a constraint satisfaction technique. Each constraint consists of a pair {scope, rel}. The scope is a tuple of variables that participate in the constraint, and rel is a relation that lists the combinations of values those variables can take to satisfy the constraint.
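A minimal Python sketch of the three components follows. The variable names, the domains, and the example constraint X1 < X2 are illustrative assumptions; the point is only to show a constraint stored as a {scope, rel} pair, where rel lists the allowed value combinations explicitly.

```python
from itertools import product

# X: the set of variables.
variables = ["X1", "X2"]

# D: one domain per variable.
domains = {"X1": {1, 2, 3}, "X2": {1, 2, 3}}

# C: each constraint is a (scope, rel) pair. Here rel is the set of
# value tuples allowed for the scope (X1, X2) under the constraint X1 < X2.
constraints = [
    (("X1", "X2"),
     {(a, b) for a, b in product([1, 2, 3], repeat=2) if a < b}),
]


def satisfies(assignment, constraint):
    """True if the values assigned to the scope form a tuple listed in rel."""
    scope, rel = constraint
    return tuple(assignment[v] for v in scope) in rel


print(satisfies({"X1": 1, "X2": 3}, constraints[0]))
print(satisfies({"X1": 2, "X2": 2}, constraints[0]))
```

Listing rel explicitly is only practical for small finite domains; for larger domains the relation is usually given as a predicate instead.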
Solving Constraint Satisfaction Problems
The requirements to solve a constraint satisfaction problem (CSP) are:
 A state-space
 The notion of the solution.
A state in state-space is defined by assigning values to some or all variables such as
{X1=v1, X2=v2, and so on…}.
An assignment of values to variables can be of three kinds:
 Consistent or Legal Assignment: An assignment which does not violate any constraint or rule is called a consistent or legal assignment.
 Complete Assignment: An assignment where every variable is assigned a value. A complete assignment that is also consistent is a solution to the CSP.
 Partial Assignment: An assignment which assigns values to only some of the variables. Such assignments are called partial assignments.
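The three kinds of assignment can be told apart with a few small checks. In this sketch, which is an illustration rather than a standard API, each constraint is stored as a scope plus a predicate function, and a partial assignment is checked only against the constraints whose variables have all been assigned.

```python
def is_consistent(assignment, constraints):
    """True if no constraint with a fully assigned scope is violated."""
    for scope, predicate in constraints:
        if all(v in assignment for v in scope):
            if not predicate(*(assignment[v] for v in scope)):
                return False
    return True


def is_complete(assignment, variables):
    """True if every variable has been given a value."""
    return all(v in assignment for v in variables)


def is_solution(assignment, variables, constraints):
    """A solution is an assignment that is both complete and consistent."""
    return (is_complete(assignment, variables)
            and is_consistent(assignment, constraints))


# Hypothetical example: three variables with A != B and B != C.
variables = ["A", "B", "C"]
constraints = [
    (("A", "B"), lambda a, b: a != b),
    (("B", "C"), lambda b, c: b != c),
]

partial = {"A": 1, "B": 2}        # partial and consistent
bad = {"A": 1, "B": 1, "C": 2}    # complete but inconsistent
good = {"A": 1, "B": 2, "C": 1}   # complete and consistent

print(is_solution(partial, variables, constraints))
print(is_solution(bad, variables, constraints))
print(is_solution(good, variables, constraints))
```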
Types of Domains in CSP
The following two types of domains are used by the variables:
 Discrete Domain: A domain whose values are countable. It may be infinite, for example the set of all integers, so a variable can take any one of infinitely many values.
 Finite Domain: A discrete domain with a limited number of values, for example the set of colors {red, green, blue}. A domain of real-valued quantities, by contrast, is called a continuous domain.
Constraint Types in CSP
With respect to the variables, there are the following types of constraints:
 Unary Constraints: It is the simplest type of constraint, which restricts the value of a single variable.
 Binary Constraints: It is the constraint type which relates two variables, for example X1 ≠ X2. A constraint such as "X2 lies between X1 and X3" involves three variables and is therefore not binary.
 Global Constraints: It is the constraint type which involves an arbitrary number of variables.
Some special types of solution algorithms are used to solve the following types of constraints:
 Linear Constraints: These types of constraints are commonly used in linear programming, where each variable (an integer value) appears in linear form only.
 Non-linear Constraints: These types of constraints are used in non-linear programming, where each variable (an integer value) appears in a non-linear form.
Note: A special constraint which occurs in real-world problems is known as a preference constraint. It indicates which solutions are preferred rather than which are allowed.
Constraint Propagation
In ordinary state-space search, there is only one choice, i.e., to search for a solution. But in CSP, we have two choices:
 We can search for a solution or
 We can perform a special type of inference called constraint propagation.
Constraint propagation is a special type of inference which helps in reducing the number of legal values for the variables. The idea behind constraint propagation is local consistency.
In local consistency, variables are treated as nodes, and each binary constraint is treated as an arc in the given problem. The following local consistencies are discussed below:
 Node Consistency: A single variable is said to be node consistent if all the values in the variable's domain satisfy the unary constraints on that variable.
 Arc Consistency: A variable is arc consistent if every value in its domain satisfies the variable's binary constraints. That is, Xi is arc consistent with respect to Xj if for every value in the domain of Xi there is some value in the domain of Xj that satisfies the constraint on the arc (Xi, Xj).
 Path Consistency: A set of two variables {Xi, Xj} is path consistent with respect to a third variable Xm if every consistent assignment to Xi and Xj can be extended with a value for Xm that satisfies the binary constraints on {Xi, Xm} and {Xm, Xj}. It is similar to arc consistency, but it considers pairs of variables instead of single variables.
 k-consistency: This type of consistency is used to define stronger forms of propagation. A CSP is k-consistent if, for any consistent assignment to k − 1 variables, a consistent value can always be assigned to any k-th variable.
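One common way to enforce arc consistency is the AC-3 algorithm. The text above does not name it, so the Python sketch below is offered as an illustration of constraint propagation rather than as the tutorial's own method. The function signature, the `allowed` callback, and the example constraint X < Y are assumptions.

```python
from collections import deque


def ac3(domains, neighbors, allowed):
    """Prune `domains` in place until every arc is consistent.

    domains:   variable -> set of remaining values
    neighbors: variable -> variables that share a binary constraint with it
    allowed:   allowed(x, vx, y, vy) is True if x=vx and y=vy are compatible
    Returns False if some domain becomes empty, otherwise True.
    """
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # Values of x that have no supporting value in the domain of y.
        removed = {vx for vx in domains[x]
                   if not any(allowed(x, vx, y, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return False
            # The domain of x shrank, so arcs pointing at x must be rechecked.
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))
    return True


# Hypothetical example: X < Y with both domains {1, 2, 3}.
def less_than(x, vx, y, vy):
    return vx < vy if x == "X" else vy < vx


domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
neighbors = {"X": ["Y"], "Y": ["X"]}
ok = ac3(domains, neighbors, less_than)
print(ok, domains)
```

Propagation removes 3 from the domain of X and 1 from the domain of Y without any search, because neither value can take part in a pair that satisfies X < Y.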
CSP Problems
Constraint satisfaction covers problems in which certain constraints must be respected while solving the problem. CSP includes the following problems:
 Graph Coloring: The problem where the constraint is that no two adjacent regions (nodes) can have the same color.
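As a minimal sketch, graph coloring can be solved by backtracking search: assign a color to one node at a time and undo the choice when no color fits. The graph, the color names, and the function name below are hypothetical examples, and the search uses no heuristics.

```python
def color_graph(neighbors, colors, assignment=None):
    """Return a node -> color mapping with no two adjacent nodes sharing
    a color, or None if no such coloring exists."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(neighbors):
        return assignment  # complete and consistent: a solution
    # Pick the first node that has no color yet.
    node = next(n for n in neighbors if n not in assignment)
    for color in colors:
        # The color is legal only if no already-colored neighbor uses it.
        if all(assignment.get(nb) != color for nb in neighbors[node]):
            assignment[node] = color
            result = color_graph(neighbors, colors, assignment)
            if result is not None:
                return result
            del assignment[node]  # backtrack and try the next color
    return None


# Hypothetical graph: a triangle A-B-C with an extra node D attached to C.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

three = color_graph(graph, ["red", "green", "blue"])
two = color_graph(graph, ["red", "green"])
print(three)
print(two)
```

Three colors are enough for this graph, while two are not, because the triangle A-B-C forces three mutually different colors.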
