INTRODUCTION TO ARTIFICIAL INTELLIGENCE
AI is one of the newest disciplines, formally initiated in 1956 when the name was coined. However, the study of
intelligence is one of the oldest disciplines, going back roughly 2,000 years. The advent of computers made it
possible for the first time for people to test models they proposed for learning, reasoning, perceiving, etc.
Artificial Intelligence is composed of two words Artificial and Intelligence, where Artificial defines "man-
made," and intelligence defines "thinking power", hence AI means "a man-made thinking power."
THE TURING TEST: A human interrogator converses with the program and with another human via a terminal
simultaneously. If, after a reasonable period, the interrogator cannot tell which is which, the program passes.
To pass such a test, a computer would need capabilities such as:
Machine learning
Computer vision
Robotics
Another way to do this is to observe a human solving problems and argue that one's programs go about
solving problems in the same way.
EXAMPLE:
GPS (General Problem Solver) was an early computer program that attempted to model human thinking.
The developers were not so much interested in whether or not GPS solved problems correctly.
They were more interested in showing that it solved problems like people, going through the same steps
and taking around the same amount of time to perform those steps.
Aristotle was one of the first to attempt to codify "thinking". His syllogisms provided patterns of
argument structure that always give correct conclusions, given correct premises.
EXAMPLE: All computers use energy. Using energy always generates heat. Therefore, all computers generate heat.
This initiated the field of logic. Formal logic was developed in the late nineteenth century. This was the first step toward automating reasoning.
By 1965, programs existed that could, given enough time and memory, take a description of the problem in logical notation
and find the solution, if one existed. The logicist tradition in AI hopes to build on such programs to create intelligence.
There are two main obstacles to this approach: First, it is difficult to make informal knowledge precise enough to state in formal logical notation.
Second, there is a big difference between being able to solve a problem in principle and doing so in practice.
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is just something
that perceives and acts.
In the logical approach to AI, the emphasis is on correct inferences. This is often part of being a rational agent because
one way to act rationally is to reason logically and then act on one's conclusions. But this is not all of rationality because
agents often find themselves in situations where there is no provably correct thing to do, yet they must do something.
There are also ways to act rationally that do not seem to involve inference, e.g., reflex actions.
The study of AI as rational agent design has two advantages:
1. It is more general than the logical approach because correct inference is only a useful mechanism for achieving
rationality, not a necessary one.
2. It is more amenable to scientific development than approaches based on human behaviour or human thought
because a standard of rationality can be defined independent of humans.
Achieving perfect rationality in complex environments is not possible because the computational demands are too high.
However, we will study perfect rationality as a starting place.
1. PHILOSOPHY
Can formal rules be used to draw valid conclusions?
How does the mind arise from a physical brain?
Where does knowledge come from?
How does knowledge lead to action?
Aristotle (384–322 B.C.), was the first to formulate a precise set of laws governing the rational part of the mind. He
developed an informal system of syllogisms for proper reasoning, which in principle allowed one to generate conclusions
mechanically, given initial premises.
Much later, Ramon Lull (d. 1315) had the idea that useful reasoning could actually be carried out by a mechanical artifact.
Thomas Hobbes (1588–1679) proposed that reasoning was like numerical computation that “we add and subtract in our
silent thoughts.”
2. MATHEMATICS
Philosophers staked out most of the important ideas of AI, but the move to a formal science required a level of
mathematical formalization in three fundamental areas: logic, computation, and probability.
Mathematicians have proved that there exists an algorithm to prove any true statement in first-order logic.
However, if one adds the principle of induction required to capture the semantics of the natural numbers, then
this is no longer the case. Specifically, the incompleteness theorem showed that in any language expressive
enough to describe the properties of the natural numbers, there are true statements that are undecidable: their
truth cannot be established by any algorithm.
4. NEUROSCIENCE
How do brains process information?
Neuroscience is the study of the nervous system, particularly the brain. Although the exact way in which the
brain enables thought is one of the great mysteries of science, the fact that it does enable thought has been
appreciated for thousands of years because of the evidence that strong blows to the head can lead to mental
incapacitation.
It has also long been known that human brains are somehow different; in about 335 B.C. Aristotle wrote, “Of
all the animals, man has the largest brain in proportion to his size." Still, it was not until the middle of the
18th century that the brain was widely recognized as the seat of consciousness. Before then, candidate
locations included the heart and the spleen.
The parts of a nerve cell or neuron. Each neuron consists of a cell body, or soma, that contains a cell nucleus.
Branching out from the cell body are a number of fibers called dendrites and a single long fiber called the axon.
The axon stretches out for a long distance, much longer than the scale in this diagram indicates.
6. COMPUTER ENGINEERING
How can we build an efficient computer?
For artificial intelligence to succeed, we need two things: intelligence and an artifact. The computer has
been the artifact of choice. The modern digital electronic computer was invented independently and
almost simultaneously by scientists in three countries embattled in World War II.
7. CONTROL THEORY AND CYBERNETICS
8. LINGUISTICS
Having a theory of how humans successfully process natural language is an AI-complete problem - if we
could solve this problem then we would have created a model of intelligence.
Much of the early work in knowledge representation was done in support of programs that attempted natural
language understanding.
Year 1943: The first work which is now recognized as AI was done by Warren McCulloch and
Walter Pitts, who proposed a model of artificial neurons.
Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection strength
between neurons, now known as Hebbian learning.
Year 1950: Alan Turing, an English mathematician who pioneered machine learning, published
"Computing Machinery and Intelligence," in which he proposed a test. The test can check a
machine's ability to exhibit intelligent behaviour equivalent to that of a human.
Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program,"
named the Logic Theorist. It proved 38 of 52 mathematics theorems, and found new and more
elegant proofs for some theorems.
Year 1956: The term "Artificial Intelligence" was first adopted by American computer scientist
John McCarthy at the Dartmouth Conference. For the first time, AI was coined as an academic field.
Around that time, high-level computer languages such as FORTRAN, LISP, and COBOL were
invented, and enthusiasm for AI was very high.
Year 1966: Researchers emphasized developing algorithms which could solve mathematical
problems. Joseph Weizenbaum created the first chatbot in 1966, named ELIZA.
Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in Japan.
The duration between 1974 and 1980 was the first AI winter. An AI winter refers to a period in
which computer scientists dealt with a severe shortage of government funding for AI research.
A boom of AI (1980-1987)
Year 1980: After the AI winter, AI came back with "expert systems", programs designed to emulate
the decision-making ability of a human expert.
In the year 1980, the first national conference of the American Association of Artificial
Intelligence (AAAI) was held at Stanford University.
The duration between the years 1987 to 1993 was the second AI Winter duration.
Investors and governments again stopped funding AI research due to high costs and inefficient
results, even though some expert systems, such as XCON, had been very cost effective.
Year 1997: In the year 1997, IBM's Deep Blue beat world chess champion Garry Kasparov,
becoming the first computer to defeat a reigning world chess champion.
Year 2002: For the first time, AI entered the home in the form of Roomba, a robotic vacuum cleaner.
Year 2006: AI came into the business world by the year 2006. Companies like Facebook, Twitter,
and Netflix also started using AI.
1. COMPETITIVE ADVANTAGE
Organizations that want a serious edge over their competitors are banking on AI technologies to
acquire it.
Take the case of the Autopilot feature offered by Tesla in its vehicles. Tesla uses deep
learning algorithms to achieve autonomous driving. Autopilot was once just one feature
among many, yet now it defines the brand.
2. ACCESSIBILITY
Hardware speed, availability, and sheer scale have enabled bolder computations to tackle
progressively harder problems. Not only is the hardware faster, augmented by special varieties of
processors (e.g., GPUs), it is also available in the form of cloud services.
What used to run in specialized labs with access to supercomputers can now run in the cloud at a
lower cost. This has democratized access to the hardware platforms needed to run AI, enabling
many new services to be built.
3. FEAR OF MISSING OUT (FOMO)
No typo, you read that right! It is not just individuals; organizations also feel the fear of missing
out. To stay competitive and not get pushed out of the market, they need to adapt
appropriately. They do this by investing in technologies that could disrupt their industries.
Take the case of the financial sector, where practically all the banks have invested heavily in chatbots so
that they won't miss the next wave of disruption.
4. COST-EFFECTIVENESS
As with all other technologies, AI is becoming more and more affordable over time. This has
made it feasible for many organizations that could not afford these technologies in the past to
adopt them.
Cost is no longer a barrier to implementing AI.
5. FUTURE PROOF
One thing that we all need to understand is that a career built on AI skills is relatively future-proof.
Before learning about Artificial Intelligence, we should know the importance of AI
and why we should learn it.
With the help of AI, you can create software or devices which can solve real-world
problems easily and with accuracy, in areas such as health, marketing, and traffic.
With the help of AI, you can create a personal virtual assistant, such as Cortana, Google
Assistant, or Siri.
With the help of AI, you can build robots which can work in environments where
human survival is at risk.
AI opens a path for other new technologies, new devices, and new Opportunities.
Building a machine which can perform tasks that require human intelligence, such as:
Proving a theorem
Playing chess
2. AI in Healthcare
In the last five to ten years, AI has become increasingly advantageous for the healthcare industry
and is going to have a significant impact on it.
Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help
doctors with diagnoses and can warn them when patients are worsening, so that medical help can reach
the patient before hospitalization.
3. AI in Gaming
AI can be used for gaming purpose. The AI machines can play strategic games like chess, where the
machine needs to think of a large number of possible places.
4. AI in Finance
AI and finance industries are the best matches for each other. The finance industry is implementing
automation, chatbot, adaptive intelligence, algorithm trading, and machine learning into financial
processes.
5. AI in Data Security
The security of data is crucial for every company, and cyber-attacks are growing rapidly in the
digital world. AI can be used to make your data more safe and secure. Examples such as the AEG
bot and the AI2 platform are used to detect software bugs and cyber-attacks more effectively.
6. AI in Social Media
Social Media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which
need to be stored and managed in a very efficient way. AI can organize and manage massive amounts
of data. AI can analyze lots of data to identify the latest trends, hashtags, and requirements of different
users.
8. AI in Automotive Industry
Some automotive companies are using AI to provide virtual assistants to their users for better
performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
Various companies are currently working on developing self-driving cars which can make your journey
safer and more secure.
10. AI in Entertainment
We currently use some AI-based applications in our daily life through entertainment
services such as Netflix or Amazon. With the help of ML/AI algorithms, these services show
recommendations for programs or shows.
12. AI in E-commerce
AI is providing a competitive edge to the e-commerce industry, and it is increasingly in demand in
the e-commerce business. AI helps shoppers discover associated products with recommended
size, color, or even brand.
13. AI in Education
AI can automate grading so that tutors have more time to teach. An AI chatbot can communicate
with students as a teaching assistant.
The main aim of Artificial Intelligence is to enable machines to perform human-like functions.
Artificial Intelligence can be divided into various types; there are mainly two categorizations, one
based on capabilities and one based on the functionality of AI.
BASED ON FUNCTIONALITY
1. REACTIVE MACHINES
Purely reactive machines are the most basic type of AI. They do not store memories or past
experiences for future actions; they only focus on the current scenario and react with the best
possible action. IBM's Deep Blue system is an example of a reactive machine.
2. LIMITED MEMORY
Limited memory machines can store past experiences or some data for a short period of time.
These machines can use stored data for a limited time period only.
Self-driving cars are one of the best examples of Limited Memory systems. These cars can
store recent speed of nearby cars, the distance of other cars, speed limit, and other information to
navigate the road.
3. THEORY OF MIND
Theory of Mind AI should understand human emotions, people, and beliefs, and be able to
interact socially like humans.
This type of AI machine has still not been developed, but researchers are making lots of efforts and
improvements toward building such machines.
4. SELF-AWARENESS
Self-awareness AI is the future of Artificial Intelligence. These machines will be super
intelligent, and will have their own consciousness, sentiments, and self-awareness.
BASED ON CAPABILITIES
1. WEAK AI OR NARROW AI:
Narrow AI is a type of AI which is able to perform a dedicated task with intelligence. The most
common and currently available AI is Narrow AI in the world of Artificial Intelligence.
Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific
task. Hence it is also termed as weak AI. Narrow AI can fail in unpredictable ways if it goes
beyond its limits.
Apple Siri is a good example of Narrow AI; it operates with a limited pre-defined range of
functions.
IBM's Watson supercomputer also comes under Narrow AI, as it uses an Expert system
approach combined with Machine learning and natural language processing.
Some Examples of Narrow AI are playing chess, purchasing suggestions on e-commerce site,
self-driving cars, speech recognition, and image recognition.
2. GENERAL AI:
General AI is a type of intelligence which could perform any intellectual task with efficiency
like a human.
The idea behind general AI is to make a system which could be smarter and think like a
human on its own.
Currently, no system exists which comes under general AI and can perform any task as
perfectly as a human.
Researchers worldwide are now focused on developing machines with General AI.
As systems with general AI are still under research, it will take lots of effort and time to
develop such systems.
4. AVAILABILITY
Machines, in contrast to people, don’t have to take rests. They can work nonstop.
Now, we can depend on machines to keep required manufacturing units running, which leads to
24×7 production units and complete automation.
ROBOTIC VEHICLES: A driverless robotic car named STANLEY sped through the rough
terrain of the Mojave Desert at 22 mph, finishing the 132-mile course first to win the 2005 DARPA
Grand Challenge.
STANLEY is a Volkswagen Touareg outfitted with cameras, radar, and laser rangefinders to sense
the environment and onboard software to command the steering, braking, and acceleration (Thrun,
2006).
The following year CMU’s BOSS won the Urban Challenge, safely driving in traffic through the
streets of a closed Air Force base, obeying traffic rules and avoiding pedestrians and other vehicles.
SPEECH RECOGNITION: A traveller calling United Airlines to book a flight can have the entire
conversation guided by an automated speech recognition and dialog management system.
AUTONOMOUS PLANNING AND SCHEDULING: NASA's REMOTE AGENT program, the first
on-board autonomous planning program for spacecraft, generated plans from high-level goals
specified from the ground and monitored
the execution of those plans—detecting, diagnosing, and recovering from problems as they
occurred.
Successor program MAPGEN (Al-Chang et al., 2004) plans the daily operations for NASA’s
Mars Exploration Rovers, and MEXAR2 (Cesta et al., 2007) did mission planning—both logistics
and science planning—for the European Space Agency’s Mars Express mission in 2008.
SPAM FIGHTING: Each day, learning algorithms classify over a billion messages as spam,
saving the recipient from having to waste time deleting what, for many users, could comprise 80%
or 90% of all messages, if not classified away by algorithms. Because the spammers are continually
updating their tactics, it is difficult for a static programmed approach to keep up, and learning
algorithms work best (Sahami et al., 1998; Goodman and Heckerman, 2004).
LOGISTICS PLANNING: During the Persian Gulf crisis of 1991, U.S. forces deployed a Dynamic
Analysis and Replanning Tool, DART (Cross and Walker, 1994), to do automated logistics planning
and scheduling for transportation. This involved up to 50,000 vehicles, cargo, and people at a time,
and had to account for starting points, destinations, routes, and conflict resolution among all
parameters. The AI planning techniques generated in hours a plan that would have taken weeks with
older methods. The Defense Advanced Research Projects Agency (DARPA) stated that this single
application more than paid back DARPA’s 30-year investment in AI.
ROBOTICS: The iRobot Corporation has sold over two million Roomba robotic vacuum cleaners
for home use. The company also deploys the more rugged PackBot to Iraq and Afghanistan, where
it is used to handle hazardous materials, clear explosives, and identify the location of snipers.
MACHINE TRANSLATION: A computer program automatically translates from Arabic to
English, allowing an English speaker to see the headline "Ardogan Confirms That Turkey Would
Not Accept Any Pressure, Urging Them to Recognize Cyprus." The program uses a statistical
model built from examples of Arabic-to-English translations and from examples of English text
totalling two trillion words (Brants et al., 2007). None of the computer scientists on the team speak
Arabic, but they do understand statistics and machine learning algorithms.
AGENTS
AND
ENVIRONMENTS
Agents in Artificial Intelligence are the associated concepts that the AI technologies work
upon.
The AI software or AI-enabled devices with sensors generally captures the information from
the environment setup and process the data for further actions.
There are mainly two ways agents interact with the environment: perception and
action.
Perception is passive interaction, in which the agent captures information without changing the
actual environment, whereas action is the active form of interaction, which changes the actual
environment.
AI technologies such as virtual-assistant chatbots and AI-enabled devices work by processing and
learning from previous perception data to choose their actions.
HUMAN-AGENT: A human agent has eyes, ears, and other organs which work as sensors, and
hands, legs, and a mouth, which work as actuators.
ROBOTIC AGENT: A robotic agent can have cameras, infrared range finders, and NLP for sensors,
and various motors for actuators.
SOFTWARE AGENT: A software agent can have keystrokes and file contents as sensory input, and
can act on those inputs by displaying output on the screen.
2. ACTION
Action is an active interaction where the environment is changed. When the robot moves
an obstacle using its arm, it is called an action as the environment is changed. The arm of the robot
is called an “Effector” as it performs the action.
SENSOR: Sensor is a device which detects the change in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
ACTUATORS: Actuators are the component of machines that converts energy into motion. The
actuators are only responsible for moving and controlling a system. An actuator can be an electric
motor, gears, rails, etc.
EFFECTORS: Effectors are the devices which affect the environment. Effectors can be legs,
wheels, arms, fingers, wings, fins, and display screen.
CONSIDER A VACUUM CLEANER WORLD
Let's suppose that the world has just two rooms. The robot can be in either room and there can
be dirt in zero, one, or two rooms.
Goal formulation: intuitively, we want all the dirt cleaned up. Formally, the goal is {State 7, state 8}.
Problem formulation (Actions): Left, Right, Suck, NoOp
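To make this formulation concrete, here is a minimal Python sketch of the two-room world; the state encoding and function names are illustrative assumptions, not part of the original notes.

    from itertools import product

    ACTIONS = ["Left", "Right", "Suck", "NoOp"]

    def result(state, action):
        # A state is (robot location, dirt in room A, dirt in room B).
        loc, dirt_a, dirt_b = state
        if action == "Left":
            return ("A", dirt_a, dirt_b)
        if action == "Right":
            return ("B", dirt_a, dirt_b)
        if action == "Suck":
            return (loc, False, dirt_b) if loc == "A" else (loc, dirt_a, False)
        return state  # NoOp leaves the world unchanged

    def is_goal(state):
        # Goal: no dirt in either room (states 7 and 8 above).
        return not state[1] and not state[2]

    # The full state space: 2 locations x 2 x 2 dirt combinations = 8 states.
    assert len(list(product("AB", [True, False], [True, False]))) == 8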
MEASURING PERFORMANCE
With any intelligent agent, we want it to find a (good) solution and not spend forever doing it.
The interesting quantities are, therefore,
THE SEARCH COST--how long the agent takes to come up with the solution to the problem,
and
THE PATH COST--how expensive the actions in the solution are.
The total cost of the solution is the sum of the above two quantities.
An omniscient agent knows what impact the action will have and can act accordingly, but it is not
possible in reality.
The percept sequence is the entire sequence of perceptions by the agent up to the present
moment.
PEAS example (car driver):
Sensors: speedometer, GPS, microphone, cameras.
Actuators: steering control, accelerator, brake, talk to passenger.
Performance measure: safe, legal, comfortable journey.
Environment: roads, traffic, pedestrians, etc.
GOOD BEHAVIOUR: THE CONCEPT OF RATIONALITY
INTELLIGENT AGENTS:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and
actuators to achieve goals. An intelligent agent may learn from the environment to achieve its goals.
NOTE: Rationality differs from omniscience because an omniscient agent knows the actual
outcome of its actions and acts accordingly, which is not possible in reality.
AUTONOMY
The behaviour of an agent depends on its own experience as well as the built-in knowledge of the
agent instilled by the agent designer. A system is autonomous if it takes actions according to its
experience. So for the initial phase, as it does not have any experience, it is good to provide built-in
knowledge. The agent learns then through evolution. A truly autonomous intelligent agent should be
able to operate successfully in a wide variety of environments if given sufficient time to adapt.
TASK ENVIRONMENTS
To design a rational agent we need to specify a task environment:
a problem specification for which the agent is the solution.
PEAS Representation
PEAS is a type of model on which an AI agent works upon. When we define an AI agent or
rational agent, then we can group its properties under PEAS representation model. It is made up
of four words:
P: Performance measure
E: Environment
A: Actuators
S: Sensors
Here performance measure is the objective for the success of an agent's behaviour.
EXAMPLE: Consider the PEAS description for an automated taxi driver:
Performance measure:
safe, fast, legal, comfortable, maximize profits
Environment:
roads, other traffic, pedestrians, customers
Actuators:
steering, accelerator, brake, signal, horn
Sensors:
cameras, sonar, speedometer, GPS
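As a rough illustration, the same description can be recorded in code; the PEAS class below is a hypothetical helper for organizing the four lists, not a standard library structure.

    from dataclasses import dataclass

    @dataclass
    class PEAS:
        performance: list   # objectives for the success of the agent's behaviour
        environment: list
        actuators: list
        sensors: list

    taxi_driver = PEAS(
        performance=["safe", "fast", "legal", "comfortable", "maximize profits"],
        environment=["roads", "other traffic", "pedestrians", "customers"],
        actuators=["steering", "accelerator", "brake", "signal", "horn"],
        sensors=["cameras", "sonar", "speedometer", "GPS"],
    )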
Another example is a medical diagnosis system. Performance measure: healthy patient, minimized
costs. Environment: patient, hospital, staff. Actuators: screen display (questions, tests, diagnoses,
treatments, referrals). Sensors: keyboard entry of symptoms, findings, patient's answers.
THE NATURE OF ENVIRONMENTS
An environment in artificial intelligence is the surrounding of the agent. The agent takes
input from the environment through sensors and delivers the output to the environment
through actuators.
An environment is everything in the world which surrounds the agent, but it is not a part of
the agent itself. The environment is where the agent lives and operates, and it provides the agent
with something to sense and act upon.
1. Fully Observable vs Partially Observable
2. Static vs Dynamic
3. Discrete vs Continuous
4. Deterministic vs Stochastic
5. Single-agent vs Multi-agent
6. Episodic vs sequential
7. Known vs Unknown
8. Accessible vs Inaccessible
1. FULLY OBSERVABLE VS PARTIALLY OBSERVABLE:
If an agent sensor can sense or access the complete state of an environment at each point of time
then it is a fully observable environment, else it is partially observable.
A fully observable environment is easy, as there is no need to maintain an internal state to keep
track of the history of the world.
If an agent has no sensors at all, then the environment is called unobservable.
2. DETERMINISTIC VS STOCHASTIC:
If an agent's current state and selected action can completely determine the next state of the
environment, then such an environment is called a deterministic environment; otherwise, it is stochastic.
In a deterministic, fully observable environment, agent does not need to worry about uncertainty.
3. EPISODIC VS SEQUENTIAL:
In an episodic environment, there is a series of one-shot actions, and only the current percept is
required for the action.
However, in Sequential environment, an agent requires memory of past actions to determine the
next best actions.
4. SINGLE-AGENT VS MULTI-AGENT
If only one agent is involved in an environment, and operating by itself then such an
environment is called single agent environment.
However, if multiple agents are operating in an environment, then such an environment is called
a multi-agent environment.
The agent design problems in the multi-agent environment are different from single agent
environment.
5. STATIC VS DYNAMIC:
If the environment can change itself while an agent is deliberating then such environment is
called a dynamic environment else it is called a static environment.
Static environments are easy to deal with because an agent does not need to keep looking at the
world while deciding on an action.
However, in a dynamic environment, agents need to keep looking at the world before each action.
6. DISCRETE VS CONTINUOUS:
If an environment has a finite number of percepts and actions that can be performed within it, then
it is called a discrete environment; otherwise, it is continuous.
A chess game comes under the discrete environment, as there is a finite number of moves that can be
performed.
7. ACCESSIBLE VS INACCESSIBLE
If an agent can obtain complete and accurate information about the state's environment, then such
an environment is called an Accessible environment else it is called inaccessible.
An empty room whose state can be defined by its temperature is an example of an accessible
environment.
8. KNOWN VS UNKNOWN
Known and unknown are not actually features of an environment, but of an agent's state of knowledge.
In a known environment, the results for all actions are known to the agent, while in an unknown
environment the agent needs to learn how it works in order to perform an action.
THE STRUCTURE OF AGENTS
TYPES OF AI AGENTS
Agents can be grouped into five classes based on their degree of perceived intelligence and
capability. All these agents can improve their performance and generate better action over time.
Simple reflex agent
Model-based reflex agent
Goal-based agents
Utility-based agent
Learning agent
The Model-based agent can work in a partially observable environment, and track the
situation.
A model-based agent has two important factors:
Model: It is knowledge about "how things happen in the world," so it is called a
Model-based agent.
Internal State: It is a representation of the current state based on percept history.
These agents have the model, "which is knowledge of the world" and based on the model
they perform actions.
Updating the agent state requires information about:
How the world evolves
How the agent's action affects the world.
The knowledge of the current state of the environment is not always sufficient for an agent to
decide what to do. The agent needs to know its goal, which describes desirable situations.
Goal-based agents expand the capabilities of the model-based agent by having the "goal"
information.
These agents may have to consider a long sequence of possible actions before deciding
whether the goal is achieved or not. Such consideration of different scenarios is called searching
and planning, which makes an agent proactive.
These agents are similar to the goal-based agent but provide an extra component of utility
measurement, which makes them different by giving a measure of success at a given state.
A utility-based agent acts based not only on goals but also on the best way to achieve the goal.
The utility-based agent is useful when there are multiple possible alternatives and an agent has to
choose the best action.
The utility function maps each state to a real number, which describes how efficiently each action
achieves the goals.
A learning agent in AI is the type of agent which can learn from its past experiences; it has
learning capabilities.
It starts to act with basic knowledge and is then able to act and adapt automatically through learning.
A learning agent has mainly four conceptual components, which are:
Learning element: It is responsible for making improvements by learning from the environment.
Critic: The learning element takes feedback from the critic, which describes how well the agent is
doing with respect to a fixed performance standard.
Performance element: It is responsible for selecting external action
Problem generator: This component is responsible for suggesting actions that will lead to new
and informative experiences.
Hence, learning agents are able to learn, analyze performance, and look for new ways to improve
the performance.
PROBLEM-SOLVING AGENT
The problem-solving agent performs precisely by defining problems and their several solutions.
According to psychology, "problem solving refers to a state where we wish to reach a
definite goal from a present state or condition."
According to computer science, problem solving is a part of artificial intelligence which
encompasses a number of techniques, such as algorithms and heuristics, to solve a problem.
Problem Formulation: It is the most important step of problem-solving which decides what
actions should be taken to achieve the formulated goal.
Initial State: It is the starting state or initial step of the agent towards its goal.
Path cost: It assigns a numeric cost to each path that leads toward the goal. The problem-solving
agent selects a cost function, which reflects its performance measure. Remember, an optimal
solution has the lowest path cost among all the solutions.
Search: It identifies all the best possible sequences of actions to reach the goal state from the current state.
Solution: It chooses the best solution found by the search algorithms, which may be proven to be
the optimal solution.
Execution: It executes the best optimal solution from the searching algorithms to reach the goal state.
NOTE: Initial state, actions, and transition model together define the state-space of the problem
implicitly. State-space of a problem is a set of all states which can be reached from the initial state
followed by any sequence of actions. The state-space forms a directed map or graph where nodes are
the states, links between the nodes are actions, and the path is a sequence of states connected by the
sequence of actions.
8 Puzzle Problem: Here, we have a 3×3 matrix with movable tiles numbered from 1 to 8 and
a blank space. A tile adjacent to the blank space can slide into that space. The objective is to
reach a specified goal configuration, such as the one shown in the figure below.
The eight-puzzle problem is also known as the N-puzzle problem or the sliding puzzle
problem.
An N-puzzle consists of N tiles (N+1 positions, counting the empty one), where N can be 8, 15, 24,
and so on.
In our example, N = 8 (that is, square root of (8+1) = 3 rows and 3 columns).
In the same way, if we have N = 15 or 24, the rows and columns follow as square root of (N+1)
rows and square root of (N+1) columns.
That is, if N = 15 the number of rows and columns is 4, and if N = 24 the number of rows and
columns is 5.
So, basically, in these types of problems we are given an initial state or initial configuration
(start state) and a goal state or goal configuration.
The puzzle can be solved by moving the tiles one by one into the single empty space, thus
achieving the goal state.
Rules of solving puzzle
Instead of moving the tiles in the empty space we can visualize moving the empty space in place
of the tile.
The empty space can only move in four directions: Up, Down, Left, or Right.
The empty space cannot move diagonally and can take only one step at a time.
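The movement rules above translate directly into a successor function. The sketch below is a minimal Python version under the assumption that a state is a 9-tuple in row-major order, with 0 standing for the empty space.

    def successors(state):
        # Yield (action, next state) pairs by sliding the blank one step.
        blank = state.index(0)
        row, col = divmod(blank, 3)
        moves = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}
        for action, (dr, dc) in moves.items():
            r, c = row + dr, col + dc
            if 0 <= r < 3 and 0 <= c < 3:  # the blank must stay on the board
                swap = r * 3 + c
                tiles = list(state)
                tiles[blank], tiles[swap] = tiles[swap], tiles[blank]
                yield action, tuple(tiles)

    start = (1, 2, 3, 4, 0, 5, 6, 7, 8)    # 0 marks the empty space
    for action, nxt in successors(start):
        print(action, nxt)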
Cell layout: Here, the primitive components of the circuit are grouped into cells, each
performing its specific function. Each cell has a fixed shape and size. The task is to place the
cells on the chip without overlapping each other.
Channel routing: It finds a specific route for each wire through the gaps between the cells.
Protein Design: The objective is to find a sequence of amino acids which will fold into a 3D
protein having the property to cure some disease.
Search algorithms require a data structure to keep track of the search tree that is being constructed.
For each node n of the tree, we have a structure that contains four components:
n. STATE: the state in the state space to which the node corresponds;
n. PARENT: the node in the search tree that generated this node;
n. ACTION: the action that was applied to the parent to generate the node;
n. PATH-COST: the cost, traditionally denoted by g(n), of the path from the initial state to the
node, as indicated by the parent pointers.
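A direct Python rendering of this four-component structure might look as follows; the problem.result and problem.step_cost calls are assumed names for the underlying problem interface.

    class Node:
        def __init__(self, state, parent=None, action=None, path_cost=0.0):
            self.state = state          # state in the state space
            self.parent = parent        # node that generated this node
            self.action = action        # action applied to the parent
            self.path_cost = path_cost  # g(n): cost from the initial state

        def child(self, problem, action):
            # Build the child node reached by applying an action.
            next_state = problem.result(self.state, action)
            step = problem.step_cost(self.state, action, next_state)
            return Node(next_state, self, action, self.path_cost + step)

        def solution(self):
            # Follow parent pointers back to recover the action sequence.
            node, actions = self, []
            while node.parent is not None:
                actions.append(node.action)
                node = node.parent
            return list(reversed(actions))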
Completeness: It measures if the algorithm guarantees to find a solution (if any solution exists).
Breadth-first search is the most common search strategy for traversing a tree or graph.
BFS starts searching from the root node of the tree and expands all successor nodes at
the current level before moving to nodes of the next level.
ADVANTAGES:
BFS will provide a solution if any solution exists.
If there are more than one solution for a given problem, then BFS will provide the minimal solution
which requires the least number of steps.
DISADVANTAGES:
It requires lots of memory since each level of the tree must be saved into memory to expand the
next level.
BFS needs lots of time if the solution is far away from the root node.
Time Complexity: The time complexity of BFS can be obtained from the number of nodes
traversed until the shallowest goal node: 1 + b + b^2 + ... + b^d = O(b^d), where d is the depth of
the shallowest solution and b is the branching factor.
Space Complexity: The space complexity of BFS is given by the memory size of the frontier,
which is O(b^d).
Completeness: BFS is complete: if the shallowest goal node is at some finite depth, BFS will find a solution.
Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the node.
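A minimal BFS sketch is shown below; the adjacency-dict graph format is an assumption used for brevity in place of a full problem definition.

    from collections import deque

    def bfs(graph, start, goal):
        # Return the shallowest path from start to goal, or None.
        frontier = deque([[start]])
        explored = {start}
        while frontier:
            path = frontier.popleft()   # FIFO queue: expand shallowest first
            node = path[-1]
            if node == goal:
                return path
            for neighbor in graph.get(node, []):
                if neighbor not in explored:
                    explored.add(neighbor)
                    frontier.append(path + [neighbor])
        return None

    g = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "E": ["G"]}
    print(bfs(g, "A", "G"))  # ['A', 'B', 'E', 'G']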
Completeness: DFS search algorithm is complete within finite state space as it will expand every node
within a limited search tree.
Time Complexity: Time complexity of DFS will be equivalent to the node traversed by the algorithm.
It is given by:
T(n) = 1 + n^2 + n^3 + ... + n^m = O(n^m)
Where, m= maximum depth of any node and this can be much larger than d (Shallowest solution
depth)
Space Complexity: DFS algorithm needs to store only single path from the root node, hence space
complexity of DFS is equivalent to the size of the fringe set, which is O(bm).
Optimal: DFS search algorithm is non-optimal, as it may generate a large number of steps or high cost
to reach to the goal node.
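For comparison, here is a minimal recursive DFS sketch under the same adjacency-dict assumption; it returns the first path found, which is not necessarily the shortest or cheapest.

    def dfs(graph, node, goal, path=None, visited=None):
        path = (path or []) + [node]
        visited = visited if visited is not None else set()
        visited.add(node)
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                found = dfs(graph, neighbor, goal, path, visited)
                if found:
                    return found
        return None

    g = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "E": ["G"]}
    print(dfs(g, "A", "G"))  # ['A', 'B', 'E', 'G']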
Standard failure value: It indicates that problem does not have any solution.
Cut off failure value: It defines no solution for the problem within a given depth limit.
Advantages:
Depth-limited search is Memory efficient.
Disadvantages:
Depth-limited search also has a disadvantage of incompleteness.
It may not be optimal if the problem has more than one solution.
Advantages:
Uniform cost search is optimal because at every state the path with the least cost is chosen.
Completeness: Uniform-cost search is complete; if there is a solution, UCS will find it.
Time Complexity: Let C* be the cost of the optimal solution and ε the minimum cost of a step toward the goal
node. Then the number of steps is 1 + C*/ε (the +1 because we start from state 0 and end at C*/ε),
and the worst-case time complexity of uniform-cost search is O(b^(1 + C*/ε)).
Space Complexity: By the same logic, the worst-case space complexity of uniform-cost search is
O(b^(1 + C*/ε)).
Optimal: Uniform-cost search is always optimal, as it only selects a path with the lowest path cost.
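A minimal uniform-cost search sketch follows; edges carry step costs, given here as (neighbor, cost) pairs, which is an assumed input format.

    import heapq

    def uniform_cost_search(graph, start, goal):
        # Always expand the cheapest frontier node; optimal for positive costs.
        frontier = [(0, start, [start])]   # (path cost g, node, path)
        best = {start: 0}
        while frontier:
            g, node, path = heapq.heappop(frontier)
            if node == goal:
                return g, path
            for neighbor, step in graph.get(node, []):
                new_g = g + step
                if new_g < best.get(neighbor, float("inf")):
                    best[neighbor] = new_g
                    heapq.heappush(frontier, (new_g, neighbor, path + [neighbor]))
        return None

    g = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)]}
    print(uniform_cost_search(g, "S", "G"))  # (4, ['S', 'A', 'B', 'G'])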
Advantages:
It combines the benefits of BFS and DFS search algorithm in terms of fast search and memory
efficiency.
Disadvantages:
The main drawback of IDDFS is that it repeats all the work of the previous phase.
EXAMPLE
Following tree structure is showing the iterative deepening depth-first search.
The IDDFS algorithm performs several iterations until it finds the goal node. The iterations
performed by the algorithm are given as:
1'st Iteration-----> A
2'nd Iteration----> A, B, C
3'rd Iteration------>A, B, D, E, C, F, G
4'th Iteration------>A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the
goal node
Time Complexity: Suppose b is the branching factor and d is the depth of the shallowest goal;
then the worst-case time complexity of IDDFS is O(b^d).
Optimal: The IDDFS algorithm is optimal if path cost is a non-decreasing function of the depth of the
node.
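The iteration trace above corresponds to repeated depth-limited searches with an increasing limit. A minimal sketch, using the same assumed adjacency-dict format as the earlier examples:

    def depth_limited(graph, node, goal, limit, path=None):
        path = (path or []) + [node]
        if node == goal:
            return path
        if limit == 0:
            return None
        for neighbor in graph.get(node, []):
            found = depth_limited(graph, neighbor, goal, limit - 1, path)
            if found:
                return found
        return None

    def iddfs(graph, start, goal, max_depth=20):
        for limit in range(max_depth + 1):  # limits 0, 1, 2, ... as in the trace
            found = depth_limited(graph, start, goal, limit)
            if found:
                return found
        return None

    g = {"A": ["B", "C"], "B": ["D", "E"], "D": ["H", "I"], "C": ["F", "G"], "F": ["K"]}
    print(iddfs(g, "A", "K"))  # ['A', 'C', 'F', 'K'], found in the fourth iteration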
Advantages:
Bidirectional search is fast.
Bidirectional search requires less memory
Disadvantages:
Implementation of the bidirectional search tree is difficult.
In bidirectional search, one should know the goal state in advance.
EXAMPLE
In the below search tree, bidirectional search algorithm is applied.
This algorithm divides one graph/tree into two sub-graphs.
It starts traversing from node 1 in the forward direction and starts from goal node 16 in the
backward direction.
The algorithm terminates at node 9 where two searches meet.
Here h(n) is the heuristic (estimated) cost, and h*(n) is the actual cheapest cost from n to the goal. A
heuristic is admissible when h(n) <= h*(n), i.e., the
heuristic cost should be less than or equal to the actual cost.
GREEDY BEST-FIRST SEARCH:
Greedy best-first search algorithm always selects the path which appears best at that moment.
It is the combination of depth-first search and breadth-first search algorithms.
It uses the heuristic function and search. Best-first search allows us to take the advantages of
both algorithms.
With the help of best-first search, at each step, we can choose the most promising node.
In the best-first search algorithm, we expand the node which is closest to the goal node, where the
closest cost is estimated by a heuristic function, i.e.
f(n) = h(n).
Where h(n) = estimated cost from node n to the goal.
The greedy best first algorithm is implemented by the priority queue.
Step 1: Place the starting node into the OPEN list.
Step 2: If the OPEN list is empty, stop and return failure.
Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in
the CLOSED list.
Step 4: Expand node n and generate its successors.
Step 5: Check each successor of node n to find whether any node is a goal node. If any
successor node is a goal node, then return success and terminate the search; else proceed to Step 6.
Step 6: For each successor node, the algorithm checks the evaluation function f(n) and then checks
whether the node is in either the OPEN or CLOSED list. If the node is in neither list, then add it
to the OPEN list.
Step 7: Return to Step 2.
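A minimal sketch of these steps in Python, using a priority queue for the OPEN list; the example graph and heuristic values are illustrative assumptions.

    import heapq

    def greedy_best_first(graph, h, start, goal):
        # Always expand the open node with the lowest heuristic value h(n).
        open_list = [(h[start], start, [start])]
        closed = set()
        while open_list:
            _, node, path = heapq.heappop(open_list)  # lowest h(n) first
            if node == goal:
                return path
            if node in closed:
                continue
            closed.add(node)
            for neighbor in graph.get(node, []):
                if neighbor not in closed:
                    heapq.heappush(open_list, (h[neighbor], neighbor, path + [neighbor]))
        return None

    g = {"S": ["A", "B"], "A": ["G"], "B": ["G"]}
    h = {"S": 5, "A": 2, "B": 4, "G": 0}
    print(greedy_best_first(g, h, "S", "G"))  # ['S', 'A', 'G']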
ADVANTAGES:
Best first search can switch between BFS and DFS by gaining the advantages of both
the algorithms.
DISADVANTAGES:
It can behave like an unguided depth-first search in the worst case.
It can get stuck in a loop, as DFS can.
This algorithm is not optimal.
EXAMPLE: At each iteration, each node is expanded using the evaluation function f(n) = h(n), which is
given in the below table.
Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m), where m
is the maximum depth of the search space.
Complete: Greedy best-first search is also incomplete, even if the given state space is
finite.
Optimal: Greedy best-first search is not optimal.
NOTE: At each point in the search space, only those nodes are expanded which have the lowest value of f(n), and the algorithm
terminates when the goal node is found.
ALGORITHM OF A* SEARCH:
Step1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not, if the list is empty then return failure and
stops.
Step 3: Select the node from the OPEN list which has the smallest value of evaluation
function (g+h), if node n is goal node then return success and stop, otherwise
Step 4: Expand node n and generate all of its successors, and put n into the closed list. For
each successor n', check whether n' is already in the OPEN or CLOSED list, if not then
compute evaluation function for n' and place into Open list.
Step 5: Else if node n' is already in OPEN or CLOSED, then it should be attached to the
back pointer which reflects the lowest g(n') value.
Step 6: Return to Step 2.
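The same steps in a compact Python sketch with f(n) = g(n) + h(n); the graph and the (admissible) heuristic values are illustrative assumptions.

    import heapq

    def a_star(graph, h, start, goal):
        open_list = [(h[start], 0, start, [start])]  # (f, g, node, path)
        best_g = {start: 0}
        while open_list:
            f, g, node, path = heapq.heappop(open_list)
            if node == goal:
                return g, path
            for neighbor, step in graph.get(node, []):
                new_g = g + step
                if new_g < best_g.get(neighbor, float("inf")):
                    best_g[neighbor] = new_g     # keep the lowest g(n')
                    heapq.heappush(open_list,
                                   (new_g + h[neighbor], new_g, neighbor, path + [neighbor]))
        return None

    g = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 3)]}
    h = {"S": 6, "A": 4, "B": 2, "G": 0}
    print(a_star(g, h, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])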
ADVANTAGES:
The A* search algorithm performs better than other search algorithms.
A* search algorithm is optimal and complete.
This algorithm can solve very complex problems.
DISADVANTAGES:
It does not always produce the shortest path, as it is mostly based on heuristics and
approximation.
A* search algorithm has some complexity issues.
The main drawback of A* is memory requirement as it keeps all generated nodes in the
memory, so it is not practical for various large-scale problems.
HEURISTIC FUNCTIONS
So, there is a total of three tiles out of position, i.e., 6, 5 and 4 (do not count the empty tile present
in the goal state), i.e., h(n) = 3. Now, we want to minimize the value of h(n) to 0.
It is seen from the above state-space tree that the goal state is reached when h(n) is minimized from
h(n) = 3 to h(n) = 0.
However, we can create and use several heuristic functions as per the requirement. It is also clear
from the above example that a heuristic function h(n) can be defined as the information required to
solve a given problem more efficiently. The information can be related to the nature of the state, the
cost of transforming from one state to another, goal node characteristics, etc., and is expressed as a
heuristic function.
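The misplaced-tiles idea above is a one-line computation. A minimal sketch, with an illustrative start state rather than the one in the figure:

    def misplaced_tiles(state, goal):
        # Count tiles out of position, ignoring the blank (0).
        return sum(1 for s, t in zip(state, goal) if s != 0 and s != t)

    goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
    state = (1, 2, 3, 4, 0, 6, 7, 5, 8)   # hypothetical start state
    print(misplaced_tiles(state, goal))    # 2: tiles 5 and 8 are out of place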
The informed and uninformed search strategies expand the nodes systematically in two ways:
keeping different paths in memory and selecting the best suitable path,
which leads to a solution state required to reach the goal node. But beyond these "classical
search algorithms," we have some "local search algorithms" where the path cost does not
matter; the focus is only on the solution state needed to reach the goal node.
A local search algorithm completes its task by working on a single current node rather than
multiple paths, and it generally moves only to the neighbours of that node.
Although local search algorithms are not systematic, still they have the following two
advantages:
Local search algorithms use a very little or constant amount of memory as they operate only
on a single path.
Most often, they find a reasonable solution in large or infinite state spaces where the classical
or systematic algorithms do not work.
Does the local search algorithm work for a pure optimized problem?
Yes, the local search algorithm works for pure optimization problems. A pure optimization
problem is one where every node can give a solution, and the target is to find the best state of
all according to the objective function. Unfortunately, a pure optimization formulation may fail to
find high-quality solutions to reach the goal state from the current state.
The local search algorithm explores the above landscape by finding the following two points:
Global Minimum: If the elevation corresponds to the cost, then the task is to find the
lowest valley, which is known as Global Minimum.
Global Maxima: If the elevation corresponds to an objective function, then it finds the
highest peak, which is called the Global Maxima; it is the highest point in the landscape.
We will understand the working of these points better in Hill-climbing search.
Hill-climbing Search
Simulated Annealing
Hill climbing algorithm is a technique which is used for optimizing the mathematical problems. One of
the widely discussed examples of Hill climbing algorithm is Traveling-salesman Problem in which we
need to minimize the distance travelled by the salesman.
It is also called greedy local search as it only looks to its good immediate neighbour state and not beyond
that.
A node of hill climbing algorithm has two components which are state and value.
In this algorithm, we don't need to maintain and handle the search tree or graph as it only keeps a single
current state.
STATE-SPACE LANDSCAPE OF HILL CLIMBING ALGORITHM
To understand the concept of hill climbing algorithm, consider the below landscape representing
the goal state/peak and the current state of the climber. The topographical regions shown in the
figure can be defined as:
Global Maximum: It is the highest point on the hill, which is the goal state.
Local Maximum: It is a peak that is higher than each of its neighbouring states but lower than the
global maximum.
Flat local maximum: It is a flat region of the landscape where all the neighbouring states have
the same value.
TYPES OF HILL CLIMBING SEARCH ALGORITHM
2. If the CURRENT node=GOAL node, return GOAL and terminate the search.
NOTE: Both simple and steepest-ascent hill climbing search fail when there is no neighbouring
node with a better value.
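A minimal steepest-ascent sketch on a toy one-dimensional objective; the objective and neighbour functions are illustrative assumptions.

    def hill_climb(value, neighbors, start):
        current = start
        while True:
            best = max(neighbors(current), key=value, default=current)
            if value(best) <= value(current):
                return current           # no uphill neighbour: local maximum
            current = best

    value = lambda x: -(x - 7) ** 2 + 50     # single peak at x = 7
    neighbors = lambda x: [x - 1, x + 1]     # move one step left or right
    print(hill_climb(value, neighbors, 0))   # 7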
Stochastic hill climbing does not examine all neighbours. It selects one node at random and
decides whether to move to it or to examine another.
The random-restart algorithm is based on a try-and-try strategy. It iteratively searches the nodes
and selects the best one at each step, until the goal is found. Success depends most
commonly on the shape of the hill: if there are few plateaus, local maxima, and ridges, it
becomes easy to reach the goal.
Hill climbing algorithm is a fast and furious approach. It finds the solution state rapidly because it
is quite easy to improve a bad state. But, there are following limitations of this search:
Local Maxima: It is a peak of the landscape which is higher than all its neighbouring states
but lower than the global maximum. It is not the goal peak because there is another peak higher
than it.
Generate and Test variant: Hill Climbing is the variant of Generate and Test method.
The Generate and Test method produces feedback which helps to decide which direction to move in the search space.
No backtracking: It does not backtrack the search space, as it does not remember the
previous states.
SIMULATED ANNEALING
Simulated annealing is a variant of hill climbing in which the algorithm sometimes accepts downhill
moves, with a probability that decreases over time, so the search can escape local maxima.
GRADIENT DESCENT
Gradient descent is an iterative optimization algorithm for finding the local minimum of a
function.
To find the local minimum of a function using gradient descent, we must take steps proportional
to the negative of the gradient (move away from the gradient) of the function at the current point. If
we take steps proportional to the positive of the gradient (moving towards the gradient), we will
approach a local maximum of the function, and the procedure is called Gradient Ascent.
Gradient descent was originally proposed by Cauchy in 1847. It is also known as steepest
descent.
The goal of the gradient descent algorithm is to minimize the given function (say cost function). To
achieve this goal, it performs two steps iteratively:
1. Compute the gradient (slope), the first order derivative of the function at that point
2. Make a step (move) in the direction opposite to the gradient: move from the current point by
alpha times the gradient at that point.
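Those two steps fit in a few lines. A minimal sketch for f(x) = (x - 3)^2, whose gradient is 2(x - 3); alpha is the step size (learning rate).

    def gradient_descent(grad, x0, alpha=0.1, steps=100):
        x = x0
        for _ in range(steps):
            g = grad(x)         # step 1: compute the gradient at the current point
            x = x - alpha * g   # step 2: move opposite to the gradient
        return x

    grad = lambda x: 2 * (x - 3)
    print(gradient_descent(grad, x0=0.0))  # converges toward the minimum at x = 3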
SEARCHING WITH NONDETERMINISTIC ACTIONS
In the simplest case, the environment is fully observable and deterministic, and the agent knows what the
effects of each action are.
Therefore, the agent can calculate exactly which state results from any sequence of actions and
always knows which state it is in.
Its percepts provide no new information after each action, although of course they tell the agent
the initial state.
When the environment is either partially observable or nondeterministic (or both), percepts
become useful.
In a partially observable environment, every percept helps narrow down the set of possible
states the agent might be in, thus making it easier for the agent to achieve its goals.
When the environment is nondeterministic, percepts tell the agent which of the possible
outcomes of its actions has actually occurred.
In both cases, the future percepts cannot be determined in advance and the agent's future actions
will depend on those future precepts.
So, the solution to a problem is not a sequence but a contingency plan (also known as a strategy)
that specifies what to do depending on what percepts are received.
As an example, we use the vacuum world, recall that the state space has eight states, as shown in
Figure. There are three actions — Left, Right, and Suck — and the goal is to clean up all the
dirt (states 7 and 8).
If the environment is observable, deterministic, and completely known, then the problem is
trivially solvable by any of the algorithms above, and the solution is an action sequence.
A solution for an AND–OR search problem is a subtree that (1) has a goal node at every leaf, (2)
specifies one action at each of its OR nodes, and (3) includes every outcome branch at each of its
AND nodes.
TRANSITION MODEL
The new belief state is the union of all states that RESULT_P(s, a) returns for all states s in the
current belief state:
b' = RESULT(b, a) = { s' : s' = RESULT_P(s, a) and s ∈ b }
This is the prediction step, PREDICT_P(b, a).
Goal test: a belief state satisfies the goal if all physical states in it satisfy GOAL-TEST_P.
Path cost: tricky in general; consider what happens if actions in different physical states have different
costs. For now, assume the cost of an action is the same in all states.
One solution is to represent the belief state by some more compact description. In English, we could
say the agent knows “Nothing” in the initial state; after moving Left, we could say, “Not in the rightmost
column,” and so on. Chapter 7 explains how to do this in a formal representation scheme. Another
approach is to avoid the standard search algorithms, which treat belief states as black boxes just like any
other problem state. Instead, we can look inside the belief states and develop incremental
belief-state search algorithms that build up the solution one physical state at a time.
SEARCHING WITH OBSERVATIONS
The observation prediction stage determines the set of precepts o that could be observed in the
predicted belief state:
POSSIBLE-PERCEPTS(ˆ b) = {o : o=PERCEPT(s) and s ∈ ˆ b}
The update stage determines, for each possible percept, the belief state that would result from the
percept. The new belief state bo is just the set of states in ˆb that could have produced the percept:
Notice that each updated belief state b_o can be no larger than the predicted belief state b̂; observations can only help reduce uncertainty compared to the sensorless case. Moreover, for deterministic sensing, the belief states for the different possible percepts will be disjoint, forming a partition of the original predicted belief state.
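A compact sketch of the prediction-observation-update cycle (Python; percept is a placeholder for the PERCEPT function, and predict is the function defined above):

def possible_percepts(b_hat):
    return {percept(s) for s in b_hat}

def update(b_hat, o):
    # keep only the states that could have produced percept o;
    # the result can be no larger than b_hat
    return frozenset(s for s in b_hat if percept(s) == o)

def results(b, a):
    b_hat = predict(b, a)                # prediction step
    return {o: update(b_hat, o) for o in possible_percepts(b_hat)}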
The preceding section showed how to derive the RESULTS function for a nondeterministic
belief-state problem from an underlying physical problem and the PERCEPT function.
Given such a formulation, the AND-OR search algorithm described earlier can be applied directly to derive a solution.
The figure shows part of the search tree for the local-sensing vacuum world, assuming an initial percept [A, Dirty].
ONLINE SEARCH
Online search is well suited to dynamic and nondeterministic domains. An online agent initially does not know about obstacles, where the goal is, or that Up from (1,1) goes to (1,2).
Competitive ratio = cost of the actual agent path / cost of the shortest path (the path the agent could take if it did not need to explore). Irreversible actions can lead to dead ends, in which case the competitive ratio can become infinite.
Hill climbing is already an online search algorithm, but it stops at a local optimum. What about randomization? A random restart cannot be used (you can't teleport a robot). What about a random walk instead of hill climbing? A random walk can be very bad (there may be two ways back for every way forward). Instead, we augment hill climbing with memory.
Learning real-time A* (LRTA*):
Updates the cost-to-go estimate, H(s), for the state it leaves.
Favours unexplored states: f(s) = h(s), not g(s) + h(s), for unexplored states.
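A hedged sketch of the LRTA* update (Python; h is the heuristic, c the step cost, and actions(s) the available moves, all placeholders for the problem at hand):

H, result = {}, {}        # learned cost-to-go estimates and remembered transitions
s = a = None              # previous state and action

def lrta_cost(s1, b, s2, h, c):
    if s2 is None:
        return h(s1)                      # unexplored: optimistic f(s) = h(s)
    return c(s1, b, s2) + H[s2]

def lrta_star_agent(s_now, goal_test, actions, h, c):
    global s, a
    if goal_test(s_now):
        return None
    H.setdefault(s_now, h(s_now))
    if s is not None:
        result[(s, a)] = s_now
        # update the estimate for the state we just left
        H[s] = min(lrta_cost(s, b, result.get((s, b)), h, c) for b in actions(s))
    # move toward the apparently best neighbour
    a = min(actions(s_now), key=lambda b: lrta_cost(s_now, b, result.get((s_now, b)), h, c))
    s = s_now
    return a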
REINFORCEMENT LEARNING
Agent: An entity that can perceive/explore the environment and act upon it.
Environment: The situation in which the agent is present or by which it is surrounded. In RL, we assume a stochastic environment, which means it is random in nature.
Action: The moves taken by the agent within the environment.
State: The situation returned by the environment after each action taken by the agent.
Reward: Feedback returned to the agent from the environment to evaluate the agent's action.
Policy: The strategy applied by the agent to decide the next action based on the current state.
Value: The expected long-term return with discounting, as opposed to the short-term reward.
Q-value: Much like the value, but it takes one additional parameter, the current action a.
There are mainly three approaches to implementing reinforcement learning:
Value-based
Policy-based
Model-based
1. VALUE-BASED:
The value-based approach aims to find the optimal value function, which is the maximum value achievable at a state under any policy. The agent, therefore, expects the long-term return at any state s under policy π.
2. POLICY-BASED:
The policy-based approach finds the optimal policy for maximum future reward without using a value function. The agent applies a policy such that the action performed at each step helps to maximize the future reward.
The policy-based approach has mainly two types of policy:
Deterministic: The same action is produced by the policy (π) at any state.
Stochastic: In this policy, probability determines the produced action.
There are four main elements of Reinforcement Learning, which are given below:
1. Policy
2. Reward Signal
3. Value Function
4. Model of the environment
3) VALUE FUNCTION:
The value function gives information about how good a situation or action is, and how much reward the agent can expect. A reward indicates an immediate signal for each good or bad action, whereas a value function specifies which states and actions are good in the long run. The value function depends on the reward: without reward, there could be no value. The goal of estimating values is to obtain more reward.
4) MODEL:
The last element of reinforcement learning is the model, which mimics the behaviour of the
environment. With the help of the model, one can make inferences about how the environment will behave: given a state and an action, the model can predict the next state and reward. The model is used for planning; it provides a way to decide on a course of action by considering possible future situations before actually experiencing them. Approaches that solve RL problems with the help of a model are termed model-based approaches, whereas an approach without a model is called a model-free approach.
HOW DOES REINFORCEMENT LEARNING WORK?
To understand the working process of the RL, we need to consider two main things:
Environment: It can be anything such as a room, maze, football ground, etc.
Agent: An intelligent agent such as AI robot.
Let's take the example of a maze environment that the agent needs to explore.
It will be difficult for the agent to decide whether to go up or down when each block has the same value, so a plain value table is not enough for the agent to reach the destination. To solve the problem we use the Bellman equation, which is the main concept behind reinforcement learning.
It is a way of calculating the value function in dynamic programming, and it leads to modern reinforcement learning.
The key-elements used in Bellman equations are:
The action performed by the agent is referred to as "a."
The state reached by performing the action is "s."
The reward/feedback obtained for each good and bad action is "R."
The discount factor is gamma, "γ."
The Bellman equation can be written as V(s) = max_a [R(s, a) + γ V(s')]. We take the max over all actions because the agent always tries to find the optimal solution.
So now, using the Bellman equation, we will find value at each state of the given environment. We
will start from the block, which is next to the target block.
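As a rough sketch, this value computation can be coded as repeated sweeps of the Bellman update (Python; states, actions, T, and R are placeholders for the maze's model, with deterministic moves T(s, a)):

def value_iteration(states, actions, T, R, gamma=0.9, sweeps=100):
    # V(s) = max_a [ R(s, a) + gamma * V(T(s, a)) ]
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        V = {s: max(R(s, a) + gamma * V[T(s, a)] for a in actions(s))
             for s in states}
    return V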
MARKOV DECISION PROCESS
We can represent the agent's state using a Markov state that contains all the required information from the history. A state St is a Markov state if it satisfies the condition P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t].
The Markov state follows the Markov property, which says that the future is independent of the past and can be defined from the present alone. RL here works on fully observable environments, where the agent can observe the environment and act in the new state. The complete process is a Markov Decision Process (MDP), which is used to formalize reinforcement learning problems. If the environment is completely observable, then its dynamics can be modelled as a Markov process.
In an MDP, the agent constantly interacts with the environment and performs actions; at each action, the environment responds and generates a new state.
MARKOV PROPERTY
It says that if the agent is present in the current state s1, performs an action a1, and moves to the state s2, then the state transition from s1 to s2 depends only on the current state; future actions and states do not depend on past actions, rewards, or states.
In other words, as per the Markov property, the current state transition does not depend on any past action or state. Hence, an MDP is an RL problem that satisfies the Markov property. For example, in a chess game, the players focus only on the current state and do not need to remember past actions or states.
FINITE MDP
A finite MDP is when there are finite states, finite rewards, and finite actions. In RL, we consider
only the finite MDP.
MARKOV PROCESS
A Markov process is a memoryless process with a sequence of random states S1, S2, ..., St that satisfies the Markov property. A Markov process is also known as a Markov chain, which is a tuple (S, P) of a state set S and a transition function P. These two components (S and P) define the dynamics of the system.
REINFORCEMENT LEARNING ALGORITHMS
Reinforcement learning algorithms are mainly used in AI applications and gaming applications.
The main used algorithms are:
Q-Learning:
Q-learning is an off-policy RL algorithm used for temporal-difference learning. Temporal-difference methods compare temporally successive predictions.
Q-learning learns the value function Q(s, a), which says how good it is to take action "a" in a particular state "s."
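A minimal sketch of the Q-learning update rule (Python; actions(s) is a placeholder for the set of actions available in state s):

from collections import defaultdict

Q = defaultdict(float)                 # Q-table, default value 0

def q_update(s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    # off-policy: uses the best next action, whatever policy is being followed
    best_next = max(Q[(s2, a2)] for a2 in actions(s2))
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])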
SARSA stands for State-Action-Reward-State-Action; it is an on-policy temporal-difference learning method. An on-policy control method selects the action for each state while learning, using a specific policy.
The goal of SARSA is to calculate Q^π(s, a) for the current policy π and all state-action pairs (s, a).
The main difference between Q-learning and SARSA is that, unlike Q-learning, SARSA does not use the maximum reward of the next state to update the Q-value in the table. In SARSA, the new action and reward are selected using the same policy that determined the original action.
SARSA is so named because it uses the quintuple Q(s, a, r, s', a'), where:
s: original state
a: original action
r: reward observed while following the current policy
s': new state
a': the next action, chosen by the same policy
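For contrast, a sketch of the SARSA update over that quintuple (Python; a2 must be chosen by the same policy that is being learned):

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    # on-policy: uses the action a2 actually selected, not the maximum
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])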
In the Bellman equation we have various components, including the reward, the discount factor (γ), the transition probability, and the end state s', but no Q-value is given explicitly.
Suppose an agent has three value options, V(s1), V(s2), and V(s3). Since this is an MDP, the agent cares only about the current state and future states. The agent can move in any of the available directions (up, left, or right), so it needs to decide where to go to follow the optimal path. The agent changes state on a probability basis; if we want exact moves, we need to reformulate the problem in terms of Q-values.
Policy evaluation uses the equation U^π(s) = R(s) + γ Σ_{s'} P(s'|s, π(s)) U^π(s'), where R(s) is the reward for being in state s, P(s'|s, π(s)) is the transition model, γ is the discount factor, and U^π(s) is the utility of being in state s.
It can be solved using a value-iteration algorithm. The algorithm converges quickly but can become quite costly to compute for large state spaces. ADP (adaptive dynamic programming) is a model-based approach and requires the transition model of the environment. A model-free alternative is temporal-difference learning.
Here f(u, n) is the exploration function; it increases with the expected value u and decreases with the number of tries n.
R+ is an optimistic reward, and Ne is the number of times we want the agent to be forced to pick an action in every state. The exploration function converts a passive agent into an active one.
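A one-line sketch of such an exploration function (Python; the values of R_plus and Ne are illustrative):

def f(u, n, R_plus=2.0, Ne=5):
    # optimistic while (s, a) has been tried fewer than Ne times, greedy afterwards
    return R_plus if n < Ne else u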
GENERALIZATION IN REINFORCEMENT LEARNING
The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms
whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting
to their training environments.
Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios,
where the environment will be diverse, dynamic and unpredictable.
This survey is an overview of this nascent field. We provide a unifying formalism and terminology
for discussing different generalisation problems, building upon previous works.
We go on to categorise existing benchmarks for generalisation, as well as current methods for
tackling the generalisation problem. Finally, we provide a critical discussion of the current state of
the field, including recommendations for future work.
Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in generalisation, and we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for generalisation.
The underlying idea, states Russell, is that intelligence is an emergent property of the interaction between an agent and its environment.
This property guides the agent’s actions by orienting its choices in the conduct of some tasks.
We can say, analogously, that intelligence is the capacity of the agent to select the appropriate
strategy in relation to its goals. Strategy, a teleologically-oriented subset of all possible
behaviours, is here connected to the idea of “policy”.
A policy is, therefore, a strategy that an agent uses in pursuit of goals. The policy dictates the
actions that the agent takes as a function of the agent’s state and the environment.
We can now formally define the policy, which we indicate with π(s). A policy π(s) comprises the suggested actions that the agent should take for every possible state s ∈ S.
The internal state of the agent corresponds to its location on the board; in this case, st = (x, y) and s0 = (1, 1).
The action space in this example consists of four possible behaviours: A = {up, down, left, right}. The probability matrix P contains all pairwise combinations of states (s, s') for all actions in A, and it is Bernoulli-distributed.
The reward function is defined as follows: if the agent is in an empty cell, it receives a negative reward of -1, simulating the effect of hunger. If instead the agent is in a cell with fruit, (3,2) for the pear or (4,4) for the apple, it receives a reward of +5 or +10, respectively.
The evaluation of the candidate policies suggests that utility is maximized by π2, which the agent then chooses as its policy for this task.
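As a small illustrative sketch of a policy as a lookup table for this grid (Python; the concrete actions below are illustrative, not the π2 referred to in the text):

def reward(s):
    # pear at (3,2): +5, apple at (4,4): +10, empty cell: -1 (hunger)
    return {(3, 2): 5, (4, 4): 10}.get(s, -1)

# a policy maps every state (x, y) to an action
pi = {(x, y): "right" if x < 4 else "up" for x in range(1, 5) for y in range(1, 5)}
print(pi[(1, 1)], reward((4, 4)))      # right 10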
NATURAL LANGUAGE PROCESSING
ADVANTAGES OF NLP
NLP helps users ask questions about any subject and get a direct response within seconds.
NLP offers exact answers to a question, meaning it does not return unnecessary or unwanted information.
Most companies use NLP to improve the efficiency of documentation processes and the accuracy of documentation.
DISADVANTAGES OF NLP
NLP can be unpredictable.
NLP systems struggle to adapt to new domains and have limited functionality, which is why an NLP system is typically built for a single, specific task.
COMPONENTS OF NLP
There are the following two components of NLP -
1. Natural Language Understanding (NLU)
Natural Language Understanding (NLU) helps the machine to understand and analyse human language by extracting metadata from content, such as concepts, entities, keywords, emotion, relations, and semantic roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
NLU involves the following tasks -
It is used to map the given input into a useful representation.
It is used to analyse different aspects of the language.
NLU is the process of reading and interpreting language. NLG is the process of writing or generating language.
A language can be defined as a set of strings; “print(2 + 2)” is a legal program in the language
Python, whereas “2)+(2 print” is not.
Since there are an infinite number of legal programs, they cannot be enumerated; instead they are specified by a set of rules called a grammar.
Formal languages also have rules that define the meaning or semantics of a program; for example, the
rules say that the “meaning” of “2 + 2” is 4, and the meaning of “1/0” is that an error is signaled.
Everyone agrees that “Not to be invited is sad” is a sentence of English, but people disagree on the
grammaticality of “To be not invited is sad.”
Therefore, it is more fruitful to define a natural language model as a probability distribution over
sentences rather than a definitive set.
P(S = words)
Natural languages are also ambiguous: we cannot speak of a single meaning for a sentence, but rather of a probability distribution over possible meanings.
Finally, natural languages are difficult to deal with because they are very large, and constantly
changing.
Thus, one of the simplest language models is a probability distribution over sequences of
characters.
A sequence of written symbols of length n is called an n-gram (from the Greek root for writing or
letters), with special case “unigram” for 1-gram, “bigram” for 2-gram, and “trigram” for 3-gram.
A model of the probability distribution of n-letter sequences is thus called an n-gram model. (But be
careful: we can have n-gram models over sequences of words, syllables, or other units; not just over
characters.)
In a Markov chain of order n − 1, the probability of character c_i depends only on the immediately preceding n − 1 characters, not on any other characters.
We call a body of text a corpus (plural corpora), from the Latin word for body.
What can we do with n-gram character models? One task for which they are well suited is
language identification .
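A toy sketch of language identification with character bigram models (Python; the training strings are tiny placeholders, where a real system would train on substantial corpora for each candidate language):

from collections import Counter
import math

def bigram_model(text):
    counts = Counter(zip(text, text[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def log_prob(text, model, floor=1e-7):
    # unseen bigrams get a small floor probability instead of zero
    return sum(math.log(model.get(bg, floor)) for bg in zip(text, text[1:]))

models = {"en": bigram_model("the quick brown fox jumps over the lazy dog"),
          "de": bigram_model("der schnelle braune fuchs springt ueber den zaun")}
print(max(models, key=lambda lang: log_prob("the dog", models[lang])))   # en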
Linear interpolation smoothing combines the three estimates:
P̂(c_i | c_{i-2} c_{i-1}) = λ3 P(c_i | c_{i-2} c_{i-1}) + λ2 P(c_i | c_{i-1}) + λ1 P(c_i),
where λ3 + λ2 + λ1 = 1. The parameter values λi can be fixed, or they can be trained with an expectation–maximization algorithm.
It is also possible to have the values of λi depend on the counts: if we have a high count of trigrams, then we weigh them relatively more; if only a low count, then we put more weight on the bigram and unigram models.
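A minimal sketch of the interpolated estimate (Python; the weight values are illustrative):

def interpolated(p_tri, p_bi, p_uni, l3=0.6, l2=0.3, l1=0.1):
    # the weights must sum to 1
    assert abs(l3 + l2 + l1 - 1.0) < 1e-9
    return l3 * p_tri + l2 * p_bi + l1 * p_uni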
Split the corpus into a training corpus and a validation corpus. Determine the parameters of the model
from the training data. Then evaluate the model on the validation corpus.
The evaluation can be a task-specific metric, such as measuring accuracy on language identification.
Alternatively we can have a task-independent model of language quality: calculate the probability
assigned to the validation corpus by the model; the higher the probability the better.
This metric is inconvenient because the probability of a large corpus will be a very small number, and
floating-point underflow becomes an issue.
A different way of describing the probability of a sequence is with a measure called perplexity, defined as Perplexity(c_{1:N}) = P(c_{1:N})^(-1/N).
It can also be thought of as the weighted average branching factor of a model. Suppose there are 100 characters in our language, and our model says they are all equally likely. Then, for a sequence of any length, the perplexity will be 100.
If some characters are more likely than others, and the model reflects that, then the model will have a perplexity less than 100.
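A short sketch of the perplexity computation in log space, which avoids the underflow problem mentioned above (Python):

import math

def perplexity(log_probs):
    # log_probs: per-symbol log probabilities of the sequence
    return math.exp(-sum(log_probs) / len(log_probs))

# 100 equally likely characters give perplexity 100:
print(perplexity([math.log(1 / 100)] * 50))   # ~100.0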
All the same mechanism applies equally to word and character models. The main difference is that the vocabulary—the
set of symbols that make up the corpus and the model—is larger.
There are only about 100 characters in most languages, and sometimes we build character models that are even more
restrictive, for example by treating “A” and “a” as the same symbol or by treating all punctuation as the same symbol.
But with word models we have at least tens of thousands of symbols, and sometimes millions.
In English a sequence of letters surrounded by spaces is a word, but in some languages, like Chinese, words are not separated by spaces, and even in English many decisions must be made to get a clear policy on word boundaries.
With character models, we didn’t have to worry about someone inventing a new letter of the alphabet.
But with word models there is always the chance of a new word that was not seen in the training corpus, so we need to
model that explicitly in our language model.
TEXT CLASSIFICATION
We now consider in depth the task of text classification, also known as categorization: given a text
of some kind, decide which of a predefined set of classes it belongs to. Language identification and
genre classification are examples of text classification.
A training set is readily available: the positive (spam) examples are in my spam folder, the negative
(ham) examples are in my inbox.
In the language-modeling approach, we define one n-gram language model for P(Message | spam)
by training on the spam folder, and one model for P(Message | ham) by training on the inbox.
We choose the class argmax_c P(c) P(Message | c), where P(c) is estimated just by counting the total number of spam and ham messages. This approach works well for spam detection, just as it did for language identification.
If there are 100,000 words in the language model, then the feature vector has length 100,000, but for
a short email message almost all the features will have count zero.
This unigram representation has been called the bag of words model.
You can think of the model as putting the words of the training corpus in a bag and then selecting
words one at a time.
The notion of order of the words is lost; a unigram model gives the same probability to any
permutation of a text.
Higher-order n-gram models maintain some local notion of word order.
It can be expensive to run algorithms on a very large feature vector, so often a process of feature
selection is used to keep only the features that best discriminate between spam and ham.
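A bare-bones sketch of the bag-of-words spam/ham classifier (Python; the add-one smoothing here is deliberately crude and purely illustrative):

from collections import Counter
import math

def train(messages):                       # messages: list of word lists
    counts = Counter(w for m in messages for w in m)
    total = sum(counts.values())
    return lambda w: (counts[w] + 1) / (total + 1)   # crude add-one smoothing

def classify(msg, spam_model, ham_model, p_spam=0.5):
    # log-odds of spam vs. ham under the two unigram models
    score = math.log(p_spam) - math.log(1 - p_spam)
    score += sum(math.log(spam_model(w)) - math.log(ham_model(w)) for w in msg)
    return "spam" if score > 0 else "ham"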
CLASSIFICATION BY DATA COMPRESSION
Another way to think about classification is as a problem in data compression.
A lossless compression algorithm takes a sequence of symbols, detects repeated patterns in it, and
writes a description of the sequence that is more compact than the original.
To do classification by compression, we first lump together all the spam training messages and compress them as a unit. We do the same for the ham. Then, when given a new message to classify, we append it to the spam messages and compress the result. We also append it to the ham and compress that. Whichever class compresses better (adds the fewer number of additional bytes for the new message) is the predicted class.
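A compact sketch of classification by compression using zlib (Python):

import zlib

def compressed_size(text):
    return len(zlib.compress(text.encode()))

def classify(message, spam_text, ham_text):
    # cost of a class = extra bytes needed to compress the new message with it
    spam_cost = compressed_size(spam_text + message) - compressed_size(spam_text)
    ham_cost = compressed_size(ham_text + message) - compressed_size(ham_text)
    return "spam" if spam_cost < ham_cost else "ham"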
INFORMATION RETRIEVAL
Information retrieval is the task of finding documents that are relevant to a user’s need for
information.
The best-known examples of information retrieval systems are search engines on the World Wide
Web.
A Web user can type a query such as “AI book” into a search engine and see a list of relevant pages.
A corpus of documents. Each system must decide what it wants to treat as a document: a
paragraph, a page, or a multipage text.
Queries posed in a query language. A query specifies what the user wants to know. The query
language can be just a list of words, such as [AI book]; or it can specify a phrase of words that
must be adjacent, as in [“AI book”]; it can contain Boolean operators as in [AI AND book]; it can
include non-Boolean operators such as [AI NEAR book].
A result set. This is the subset of documents that the IR system judges to be relevant to the query.
A presentation of the result set. This can be as simple as a ranked list of document titles or as complex as a rotating color map of the result set projected onto a three-dimensional space.
The earliest IR systems worked on a Boolean keyword model, which has three drawbacks. First, the degree of relevance of a document is a single bit, so there is no guidance as to how to order the relevant documents for presentation. Second, Boolean expressions are unfamiliar to users who are not programmers or logicians. Third, it can be hard to formulate an appropriate query, even for a skilled user.
IR system evaluation: an IR system is scored on two measures, precision and recall.
Precision measures the proportion of documents in the result set that are actually relevant.
In our example, the precision is 30/(30 + 10) = .75. The false positive rate is 1 - .75 = .25.
Recall measures the proportion of all the relevant documents in the collection that are in the result set.
In our example, recall is 30/(30 + 20) = .60. The false negative rate is 1 - .60 = .40.
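The arithmetic above, spelled out (Python):

tp, fp, fn = 30, 10, 20        # relevant retrieved, irrelevant retrieved, relevant missed
precision = tp / (tp + fp)     # 0.75
recall = tp / (tp + fn)        # 0.60
print(precision, recall)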
In a very large document collection, such as the World Wide Web, recall is difficult to compute,
because there is no easy way to examine every page on the Web for relevance.
All we can do is either estimate recall by sampling or ignore recall completely and just judge
precision.
Web search engines are continually updating their algorithms as they discover new approaches and as the Web grows and changes.
One common refinement is a better model of the effect of document length on relevance.
Singhal et al. (1996) observed that simple document length normalization schemes tend to favor short documents too much and long documents not enough.
They propose a pivoted document length normalization scheme: the pivot is the document length at which the old-style normalization is correct; documents shorter than that get a boost, and longer ones get a penalty.
The BM25 scoring function uses a word model that treats all words as completely independent, even though some words are clearly correlated. The next step is to recognize synonyms, such as "sofa" for "couch." As with stemming, this has the potential for small gains in recall, but can hurt precision.
On the Web, hypertext links between documents are a crucial source of information.
PageRank was one of the two original ideas that set Google’s search apart from other Web search
engines when it was introduced in 1997. (The other innovation was the use of anchor text—the
underlined text in a hyperlink).
PageRank was invented to solve the problem of the tyranny of TF scores: if the query is [IBM], how
do we make sure that IBM’s home page, ibm.com, is the first result, even if another page mentions
the term “IBM” more frequently?
The idea is that ibm.com has many in-links (links to the page), so it should be ranked higher:
each in-link is a vote for the quality of the linked-to page.
But if we only counted in-links, then it would be possible for a Web spammer to create a network of
pages and have them all point to a page of his choosing, increasing the score of that page.
What is a high quality site? One that is linked to by other high-quality sites.
The definition is recursive, but we will see that the recursion bottoms out properly. The PageRank for a page p is defined as
PR(p) = (1 − d)/N + d Σ_i PR(in_i)/C(in_i)
where PR(p) is the PageRank of page p, N is the total number of pages in the corpus, the in_i are the pages that link in to p, and C(in_i) is the count of the total number of out-links on page in_i.
The constant d is a damping factor. It can be understood through the random surfer model :
imagine a Web surfer who starts at some random page and begins exploring.
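A small sketch of iterative PageRank on a toy link graph (Python; the graph and iteration count are illustrative):

def pagerank(links, d=0.85, iters=50):
    # links: {page: [pages it links to]}
    pages, N = list(links), len(links)
    pr = {p: 1 / N for p in pages}
    for _ in range(iters):
        pr = {p: (1 - d) / N + d * sum(pr[q] / len(links[q])
                                       for q in pages if p in links[q])
              for p in pages}
    return pr

print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))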
The Hyperlink-Induced Topic Search algorithm, also known as “Hubs and Authorities” or
HITS, is another influential link-analysis algorithm .
Given a query, HITS first finds a set of pages that are relevant to the query. It does that by
intersecting hit lists of query words, and then adding pages in the link neighborhood of these
pages
Both PageRank and HITS played important roles in developing our understanding of Web
information retrieval.
These algorithms and their extensions are used in ranking billions of queries daily as search
engines steadily develop better ways of extracting yet finer signals of search relevance.
Question answering
Information retrieval is the task of finding documents that are relevant to a query, where the query may be a question, or just a topic area or concept.
Question answering is a somewhat different task, in which the query really is a question, and the
answer is not a ranked list of documents but rather a short response—a sentence, or even just a
phrase.
There have been question-answering NLP (natural language processing) systems since the 1960s,
but only since 2001 have such systems used Web information retrieval to radically increase their
breadth of coverage.
INFORMATION EXTRACTION
Information extraction is the process of acquiring knowledge by skimming a text and looking for occurrences of a particular class of object and for relationships among objects.
A typical task is to extract instances of addresses from Web pages, with database fields for street, city, state, and zip code; or instances of storms from weather reports, with fields for temperature, wind speed, and location.
In a limited domain, this can be done with high accuracy. As the domain gets more general, more
complex linguistic models and more complex learning techniques are necessary.
For example, consider the problem of extracting from the text "IBM ThinkBook 970. Our price: $399.00" the set of attributes {Manufacturer=IBM, Model=ThinkBook970, Price=$399.00}.
We can address this problem by defining a template (also known as a pattern) for each
attribute we would like to extract. The template is defined by a finite state automaton, the
simplest example of which is the regular expression, or regex.
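As a tiny sketch, a price-attribute template as a regular expression (Python):

import re

price_re = re.compile(r"\$\d+(?:\.\d{2})?")    # e.g., $399.00 or $399
print(price_re.findall("IBM ThinkBook 970. Our price: $399.00"))   # ['$399.00']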
Thus, when these systems see the text "$249.99," they need to determine not just that it is a price, but also which object takes that price.
A typical relational-based extraction system is FASTUS, which handles news stories about corporate mergers and acquisitions. FASTUS is a cascaded finite-state transducer: the system consists of a series of small, efficient finite-state automata (FSAs), where each automaton receives text as input, transduces the text into a different format, and passes it along to the next automaton.
1. The first stage is tokenization, which segments the stream of characters into tokens.
2. The second stage handles complex words, such as collocations and proper names.
3. The third stage handles basic groups, meaning noun groups and verb groups, chunking them into units that will be managed by the later stages.
4. The fourth stage combines the basic groups into complex phrases.
5. The final stage merges structures that were built up in the previous steps.
When information extraction must be attempted from noisy or varied input, simple finite-state
approaches fare poorly.
It is too hard to get all the rules and their priorities right; it is better to use a probabilistic model
rather than a rule-based model.
The simplest probabilistic model for sequences with hidden state is the hidden Markov model, or
HMM.
HMM models a progression through a sequence of hidden states, xt, with an observation et at each
step.
To apply HMMs to information extraction, we can either build one big HMM for all the attributes
or build a separate HMM for each attribute. We’ll do the second.
Modeling this directly gives us some freedom. We don’t need the independence assumptions of
the Markov model—we can have an xt that is dependent on x1.
A framework for this type of model is the conditional random field, or CRF, which models a
conditional probability distribution of a set of target variables given a set of observed variables.
Like Bayesian networks, CRFs can represent many different structures of dependencies among
the variables.
One common structure is the linear-chain conditional random field for representing Markov
dependencies among variables in a temporal sequence.
Thus, HMMs are the temporal version of naive Bayes models, and linear-chain CRFs are the
temporal version of logistic regression.
www.android.universityupdates.in | www.universityupdates.in | https://telegram.me/jntua
Ontology extraction from large corpora
Here the aim is to build a large knowledge base, or ontology, of facts from a corpus; this differs from earlier extraction tasks in three ways. First, it is open-ended: we want to acquire facts about all types of domains, not just one specific domain.
Second, with a large corpus, this task is dominated by precision, not recall—just as with question
answering on the Web .
Third, the results can be statistical aggregates gathered from multiple sources, rather than being
extracted from one specific text.
Automated template construction
Fortunately, it is possible to learn templates from a few examples, then use the templates to learn
more examples, from which more templates can be learned, and so on.
In one of the first experiments of this kind, Brin (1999) started with a data set of just five examples
(“Isaac Asimov”, “The Robots of Dawn”)
(“David Brin”, “Startide Rising”)
(“James Gleick”, “Chaos—Making a New Science”)
(“Charles Dickens”, “Great Expectations”)
(“William Shakespeare”, “The Comedy of Errors”)
Clearly these are examples of the author–title relation, but the learning system had no knowledge
of authors or titles.
The words in these examples were used in a search over a Web corpus, resulting in 199 matches. Each match is defined as a tuple of seven strings, (Author, Title, Order, Prefix, Middle, Suffix, URL),
where Order is true if the author came first and false if the title came first, Middle is the characters between the
author and title, Prefix is the 10 characters before the match, Suffix is the 10 characters after the match, and
URL is the Web address where the match was made.
Machine reading
Automated template construction is a big step up from handcrafted template construction, but it still requires a
handful of labeled examples of each relation to get started.
To build a large ontology with many thousands of relations, even that amount of work would be onerous; we
would like to have an extraction system with no human input of any kind—a system that could read on its own
and build up its own database.
Such a system would be relation-independent: it would work for any relation. In practice, these systems work on all relations in parallel, because of the I/O demands of large corpora.
They behave less like a traditional information extraction system that is targeted at a few relations and more
like a human reader who learns from the text itself; because of this the field has been called machine reading.
INTRODUCTION
Communication is the intentional exchange of information brought about by the production and perception of signs drawn from a shared system of conventional signs. Most animals use signs to represent important messages: food here, predator nearby, approach, withdraw, let's mate.
PHRASE STRUCTURE GRAMMARS
The n-gram language models were based on sequences of words.
The big issue for these models is data sparsity: with a vocabulary of, say, 10^5 words, there are 10^15 trigram probabilities to estimate, so a corpus of even a trillion words will not be able to supply reliable estimates for all of them.
Despite the exceptions, the notion of a lexical category (also known as a part of speech) such as noun or
adjective is a useful generalization—useful in its own right, but more so when we string together lexical
categories to form syntactic categories such as noun phrase or verb phrase, and combine these syntactic
categories into trees representing the phrase structure of sentences: nested phrases, each marked with a
category.
GENERATIVE CAPACITY
Grammatical formalisms can be classified by their generative capacity: the set of languages they
can represent.
Chomsky (1957) describes four classes of grammatical formalisms that differ only in the form of
the rewrite rules.
The classes can be arranged in a hierarchy, where each class can be used to describe all the
languages that can be described by a less powerful class, as well as some additional languages.
1. Recursively enumerable grammars use unrestricted rules: both sides of the rewrite rules can have
any number of terminal and nonterminal symbols, as in the rule A B C → D E.
2. Context-sensitive grammars are restricted only in that the right-hand side must contain at least as many symbols as the left-hand side. The name "context sensitive" comes from the fact that a rule such as A X B → A Y B says that an X can be rewritten as a Y in the context of a preceding A and a following B.
3. In context-free grammars (or CFGs), the left-hand side consists of a single nonterminal
symbol. Thus, each rule licenses rewriting the nonterminal as the right-hand side in any context.
4. Regular grammars are the most restricted class, with every rule having a single nonterminal on the left-hand side and a terminal symbol optionally followed by a nonterminal on the right-hand side. Regular grammars are equivalent in power to finite-state machines. They are poorly suited for programming languages, because they cannot represent constructs such as balanced opening and closing parentheses.
The closest they can come is representing a*b*, a sequence of any number of a's followed by any number of b's.
There have been many competing language models based on the idea of phrase structure; we will
describe a popular model called the probabilistic context-free grammar, or PCFG.
A grammar is a collection of rules that defines a language as a set of allowable strings of words.
Probabilistic means that the grammar assigns a probability to every string.
VP → Verb [0.70]
   | VP NP [0.30]
Here VP (verb phrase) and NP (noun phrase) are non-terminal symbols. The grammar also refers to
actual words, which are called terminal symbols.
This rule is saying that with probability 0.70 a verb phrase consists solely of a verb, and with
probability 0.30 it is a VP followed by an NP.
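A minimal sketch of those two rules as data, with a function that scores a derivation by multiplying rule probabilities (Python):

rules = {
    ("VP", ("Verb",)): 0.70,
    ("VP", ("VP", "NP")): 0.30,
}

def derivation_prob(derivation):
    # derivation: list of (lhs, rhs) rules used
    p = 1.0
    for rule in derivation:
        p *= rules[rule]
    return p

print(derivation_prob([("VP", ("VP", "NP")), ("VP", ("Verb",))]))   # ~0.21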
The lexicon of E0
First we define the lexicon, or list of allowable words. The words are grouped into the lexical
categories familiar to dictionary users: nouns, pronouns, and names to denote things; verbs to denote
events; adjectives to modify nouns; adverbs to modify verbs; and function words: articles (such as
the), prepositions (in), and conjunctions (and).
Each of the categories ends in . . . to indicate that there are other words in the category.
www.android.previousquestionpapers.com | www.previousquestionpapers.com | https://telegram.me/jntua
www.android.universityupdates.in | www.universityupdates.in | https://telegram.me/jntua
The grammar of E0
The next step is to combine the words into phrases.
A grammar for E0 has rules for each of the six syntactic categories and an example for each rewrite rule.
2. Have the students in section 2 of Computer Science 101 taken the exam?
If the algorithm guesses wrong, it will have to backtrack all the way to the first word and reanalyze the whole
sentence under the other interpretation.
To avoid this source of inefficiency we can use dynamic programming: every time we analyze a substring,
store the results so we won’t have to reanalyze it later.
For example, once we discover that “the students in section 2 of Computer Science 101” is an NP, we can
record that result in a data structure known as a chart.
There are many types of chart parsers; we describe a bottom-up version called the CYK algorithm, after its inventors, John Cocke, Daniel Younger, and Tadao Kasami.
CYK algorithm
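Since the figure with the algorithm is not reproduced here, the following is a hedged sketch of a probabilistic CYK chart parser for a grammar in Chomsky normal form (Python; grammar maps a right-hand-side pair (B, C) to a list of (A, prob) rules, lexicon maps a word to a list of (category, prob) entries, and the start symbol is assumed to be S):

from collections import defaultdict

def cyk(words, grammar, lexicon):
    n = len(words)
    P = defaultdict(float)                 # P[X, i, k]: best probability of X over words i..k-1
    for i, w in enumerate(words):
        for X, p in lexicon[w]:
            P[X, i, i + 1] = max(P[X, i, i + 1], p)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            k = i + length
            for j in range(i + 1, k):      # split point
                for (B, C), productions in grammar.items():
                    if P[B, i, j] and P[C, j, k]:
                        for A, p in productions:
                            P[A, i, k] = max(P[A, i, k], p * P[B, i, j] * P[C, j, k])
    return P["S", 0, n]                    # probability of the best parse of the sentence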
This suggests that learning the grammar from data might be better than a knowledge engineering
approach.
Learning is easiest if we are given a corpus of correctly parsed sentences, commonly called a
treebank.
The Penn Treebank is the best known; it consists of 3 million words which have been annotated
with part of speech and parse-tree structure, using human labor assisted by some automated tools.
That means that the difference between P (“eat a banana”) and P (“eat a bandanna”) depends only
on P (Noun → “banana”) versus
P (Noun → “bandanna”) and not on the relation between “eat” and the respective objects.
A Markov model of order two or more, given a sufficiently large corpus, will know that “eat a
banana” is more probable.
We can combine a PCFG and Markov model to get the best of both. The simplest approach is to
estimate the probability of a sentence with the geometric mean of the probabilities computed by
both models.
Another problem with PCFGs is that they tend to have too strong a preference for shorter
sentences.
AUGMENTED GRAMMARS AND SEMANTIC INTERPRETATION
Lexicalized PCFGs
To get at the relationship between the verb “eat” and the nouns “banana” versus “bandanna,”
we can use a lexicalized PCFG, in which the probabilities for a rule depend on the
relationship between words in the parse tree, not just on the adjacency of words in a sentence.
Of course, we can’t have the probability depend on every word in the tree, because we won’t have
enough training data to estimate all those probabilities.
It is useful to introduce the notion of the head of a phrase—the most important word. Thus, “eat”
is the head of the VP “eat a banana” and “banana” is the head of the NP “a banana.”
We use the notation VP(v) to denote a phrase with category VP whose head word is v. We say
that the category VP is augmented with the head variable v.
Augmented rules are complicated, so we will give them a formal definition by showing how an
augmented rule can be translated into a logical sentence.
The sentence will have the form of a definite clause, so the result is called a definite clause
grammar, or DCG.
We would also need to split the category Pronoun into the two categories PronounS (which includes “I”) and
PronounO (which includes “me”).
Rough translation, as provided by free online services, gives the "gist" of a foreign sentence or document.
Pre-edited translation is used by companies to publish their documentation and sales materials in
multiple languages.
The original source text is written in a constrained language that is easier to translate
automatically, and the results are usually edited by a human to correct any errors.
Restricted-source translation works fully automatically, but only on highly stereotypical language, such as a weather report.
Such systems keep a database of translation rules (or examples), and whenever a rule (or example) matches, they translate directly.
Statistical machine translation, by contrast, needs only data: sample translations from which a translation model can be learned. To translate a sentence in, say, English (e) into French (f), we find the string of words f* that maximizes
f* = argmax_f P(f | e) = argmax_f P(f) P(e | f)
Here the factor P(f) is the target language model for French; it says how probable a given sentence is in French. P(e | f) is the translation model.
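A one-function sketch of that decision rule (Python; real decoders search rather than enumerate, and lm and tm stand in for learned models):

def best_translation(e, candidates, lm, tm):
    # f* = argmax_f P(f) * P(e | f)
    return max(candidates, key=lambda f: lm(f) * tm(e, f))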
All that remains is to learn the phrasal and distortion probabilities; we only sketch the procedure here.
Speech recognition has become one of the mainstream applications of AI: millions of people interact with speech recognition systems every day to navigate voice mail systems, search the Web from mobile phones, and use other applications.
Speech recognition is difficult because the sounds made by a speaker are ambiguous and,
well, noisy.
First, segmentation: written words in English have spaces between them, but in fast speech
there are no pauses in “wreck a nice” that would distinguish it as a multiword phrase as
opposed to the single word “recognize.”
Second, coarticulation: when speaking quickly the “s” sound at the end of “nice” merges
with the “b” sound at the beginning of “beach,” yielding something that is close to a “sp.”
Another problem that does not show up in this example is homophones—words like “to,”
“too,” and “two” that sound the same but differ in meaning.
Most speech recognition systems use a language model that makes the Markov assumption—that
the current state Word t depends only on a fixed number n of previous states—and represent
Word t as a single random variable taking on a finite set of values, which makes it a Hidden
Markov Model (HMM).
Acoustic model
The precision of each measurement is determined by the quantization factor; speech recognizers
typically keep 8 to 12 bits.
A phoneme is the smallest unit of sound that has a distinct meaning to speakers of a particular
language.
For example, the “t” in “stick” sounds similar enough to the “t” in “tick” that speakers of English
consider them the same phoneme.
First, we observe that although the sound frequencies in speech may be several kHz, the changes
in the content of the signal occur much less often, perhaps at no more than 100 Hz.
For general-purpose speech recognition, the language model can be an n-gram model of text learned
from a corpus of written sentences.
However, spoken language has different characteristics than written language, so it is better to get a
corpus of transcripts of spoken language.
For task-specific speech recognition, the corpus should be task-specific: to build your airline
reservation system, get transcripts of prior calls.
It also helps to have task-specific vocabulary, such as a list of all the airports and cities served,
and all the flight numbers.
In cameras, the image is formed on an image plane, which can be a piece of film coated with
silver halides or a rectangular grid of a few million photosensitive pixels, each a complementary
metal-oxide semiconductor (CMOS) or charge-coupled device (CCD).
Lens systems
We will study three useful image-processing operations: edge detection, texture analysis,
and computation of optical flow. These are called “early” or “low-level” operations because
they are the first in a pipeline of operations. Early vision operations are characterized by
their local nature (they can be carried out in one part of the image without regard for
anything more than a few pixels away) and by their lack of knowledge: we can perform
these operations without consideration of the objects that might be present in the scene.
This makes the low-level operations good candidates for implementation in parallel
hardware—either in a graphics processor unit (GPU) or an eye. We will then look at one
mid-level operation: segmenting the image into regions.
Edges are straight lines or curves in the image plane across which there is a "significant" change in image brightness. The goal of edge detection is to abstract away from the messy, multimegabyte image toward a more compact, abstract representation; the motivation is that edge contours in the image correspond to important scene contours.
Edge detection is concerned only with the image, and thus does not distinguish between the different kinds of scene discontinuity that can produce an edge.
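A hedged sketch of a simple gradient-based edge detector using Sobel kernels (Python, pure lists, no external libraries; the threshold value is arbitrary):

def sobel_edges(img, thresh=1.0):          # img: 2-D list of brightness values
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            # mark a pixel as an edge where the gradient magnitude is large
            edges[y][x] = 1 if (gx * gx + gy * gy) ** 0.5 > thresh else 0
    return edges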
Robots are physical agents that perform tasks by manipulating the physical world. To do so they are
equipped with effectors such as legs, wheels, joints, and grippers. Effectors have a single purpose:
to assert physical forces on the environment. Robots are also equipped with sensors, which allow
them to perceive their environment.
Present day robotics employs a diverse set of sensors, including cameras and lasers to measure the
environment, and gyroscopes and accelerometers to measure the robot’s own motion.
Most of today's robots fall into one of three primary categories. The first is the manipulator, or robot arm, which is physically anchored to its workplace.
The second category is the mobile robot. Mobile robots move about their environment using
wheels, legs, or similar mechanisms.
They have been put to use delivering food in hospitals, moving containers at loading docks, and
similar tasks. Unmanned ground vehicles, or UGVs, drive autonomously on streets, highways, and
off-road.
The planetary rover shown in Figure 25.2(b) explored Mars for a period of 3 months in 1997.
Other types of mobile robots include unmanned air vehicles (UAVs), commonly used for
surveillance, crop-spraying
The third type of robot combines mobility with manipulation and is often called a mobile manipulator. Humanoid robots mimic the human torso; two early humanoid robots were manufactured by Honda Corp. in Japan. Mobile manipulators can apply their effectors further afield than anchored manipulators can, but their task is made harder because they lack the rigidity that the anchor provides.
Other mobile robots include the UAVs commonly used by the U.S. military and the autonomous underwater vehicles (AUVs) used in deep-sea exploration. Mobile robots also deliver packages in the workplace and vacuum the floors at home.
Stereo vision relies on multiple cameras to image the environment from slightly different viewpoints.
A time-of-flight camera acquires range images at up to 60 frames per second.
Range sensors that sweep a laser beam across the scene are called scanning lidars (short for light detection and ranging).
At the other extreme of range sensing are tactile sensors such as whiskers.
A second important class of sensors is location sensors.
Outdoors, the Global Positioning System is the most common solution to the localization problem.
Differential GPS involves a second ground receiver with known location, providing millimetre
accuracy under ideal conditions.
The third important class is proprioceptive sensors, which inform the robot of its own motion.
Inertial sensors, such as gyroscopes, rely on the resistance of mass to the change of velocity. They
can help reduce uncertainty.
EFFECTORS
To understand the design of effectors, it helps to talk about motion and shape in the abstract, using the concept of a degree of freedom (DOF). We count one degree of freedom for each independent direction in which a robot, or one of its effectors, can move.
These six degrees define the kinematic state, or pose, of the robot. The dynamic state of a robot includes these six plus an additional six dimensions for the rate of change of each kinematic dimension, that is, their velocities.
For example, a manipulator arm can have exactly six degrees of freedom, created by five revolute joints that generate rotational motion and one prismatic joint that generates sliding motion. You can verify that the human arm as a whole has more than six degrees of freedom by a simple experiment: put your hand on the table and notice that you still have the freedom to rotate your elbow without changing the configuration of your hand. Manipulators that have extra degrees of freedom are easier to control than robots with only the minimum number of DOFs. Many industrial manipulators therefore have seven DOFs, not six.
Legged robots have been made to walk, run, and even hop. A hopping robot is dynamically stable, meaning that it can remain upright while hopping around. A robot that can remain upright without moving its legs is called statically stable.
We saw earlier that Kalman filters, HMMs, and dynamic Bayes nets can represent the transition and sensor models of a partially observable environment, and we described both exact and approximate algorithms for updating the belief state.
We would like to compute the new belief state, P(X_{t+1} | z_{1:t+1}, a_{1:t}), from the current belief state P(X_t | z_{1:t}, a_{1:t-1}) and the new observation z_{t+1}. We did this in Section 15.2, but here there are two differences: we condition explicitly on the actions as well as the observations, and we deal with continuous rather than discrete variables.
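For the discrete case, the update can be sketched as a Bayes filter (Python; P_trans(x2, x, a) = P(x2 | x, a) and P_sense(z, x) = P(z | x) are placeholders for the robot's transition and sensor models):

def update_belief(belief, a, z, states, P_trans, P_sense):
    # prediction step: push the belief through the transition model
    predicted = {x2: sum(P_trans(x2, x, a) * belief[x] for x in states)
                 for x2 in states}
    # correction step: weight by the sensor model and normalize
    unnorm = {x2: P_sense(z, x2) * predicted[x2] for x2 in states}
    total = sum(unnorm.values())           # assumed nonzero for a possible z
    return {x2: p / total for x2, p in unnorm.items()}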