What Are AI Agents?
Published: 3 July 2024
Contributor: Anna Gutowska
AI agents can encompass a wide range of functionalities beyond natural language processing, including decision-making, problem-solving, interacting with external environments and executing actions.
These agents can be deployed in various applications to solve complex tasks in enterprise contexts ranging from software design and IT automation to code-generation tools and conversational assistants. They use the advanced natural language processing techniques of large language models (LLMs) to comprehend and respond to user inputs step-by-step and to determine when to call on external tools.
At the core of AI agents are large language models (LLMs). For this reason, AI agents are often referred to as
LLM agents. Traditional LLMs, such as IBM® Granite™ models, produce their responses based on the data
used to train them and are bounded by knowledge and reasoning limitations. In contrast, agentic technology
uses tool calling on the backend to obtain up-to-date information, optimize workflow and create subtasks
autonomously to achieve complex goals.
In this process, the autonomous agent learns to adapt to user expectations over time. The agent's ability to store past interactions in memory and plan future actions encourages a personalized experience and comprehensive responses.1 This tool calling can be achieved without human intervention and broadens the possibilities for real-world applications of these AI systems. The approach that AI agents take in achieving goals set by users comprises these three stages:
1. Goal initialization and planning
Although AI agents are autonomous in their decision-making processes, they require goals and environments
defined by humans.2 There are three main influences on autonomous agent behavior:
– The team of developers that design and train the agentic AI system.
– The team that deploys the agent and provides the user with access to it.
– The user who provides the AI agent with specific goals to accomplish and establishes available tools to use.
Given the user's goals and the agent’s available tools, the AI agent then performs task decomposition to
improve performance.3 Essentially, the agent creates a plan of specific tasks and subtasks to accomplish the
complex goal.
For simple tasks, planning is not a necessary step. Instead, an agent can iteratively reflect on its responses and improve them without planning its next steps.
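To make task decomposition concrete, here is a minimal Python sketch of a planning step. The prompt wording, the canned call_llm stand-in and the JSON contract are illustrative assumptions, not any particular product's API:

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a canned plan for demo purposes.
    return json.dumps([
        "Gather historical weather data for Greece",
        "Determine ideal surf conditions",
        "Score each week of next year and pick the best",
    ])

def decompose(goal: str) -> list[str]:
    # Ask the model to plan: an ordered JSON array of subtasks for the complex goal.
    prompt = (
        "Break the following goal into a short, ordered list of subtasks. "
        f"Reply with a JSON array of strings only.\nGoal: {goal}"
    )
    return json.loads(call_llm(prompt))

for step in decompose("Find the best week for a surfing trip in Greece next year"):
    print("-", step)
```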
2. Reasoning using available tools
AI agents base their actions on the information they perceive. Often, AI agents do not have the
full knowledge base needed for tackling all subtasks within a complex goal. To remedy this, AI agents use
their available tools. These tools can include external data sets, web searches, APIs and even other agents.
After the missing information is retrieved from these tools, the agent can update its knowledge base. This
means that each step of the way, the agent reassesses its plan of action and self-corrects.
To help illustrate this process, imagine a user planning their vacation. The user tasks an AI agent with
predicting which week in the next year would likely have the best weather for their surfing trip in Greece.
Since the LLM at the core of the agent does not specialize in weather patterns, the agent gathers information from an external database comprising daily weather reports for Greece over the past several years.
Despite acquiring this new information, the agent still cannot determine the optimal weather conditions for surfing, so the next subtask is created. For this subtask, the agent communicates with an external agent
that specializes in surfing. Let’s say that in doing so, the agent learns that high tides and sunny weather with
little to no rain provide the best surfing conditions.
The agent can now combine the information it has learned from its tools to identify patterns. It can predict
which week next year in Greece will likely have high tides, sunny weather and a low chance of rain. These
findings are then presented to the user. This sharing of information between tools is what allows AI agents to
be more general-purpose than traditional AI models.3
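As a rough sketch of this tool-calling pattern, the following Python mirrors the surfing example: the agent queries a hypothetical weather database, consults a stand-in surf-expert agent and folds the results back into its knowledge base. The tool names and data are invented for illustration:

```python
def weather_history(location: str) -> dict:
    # Hypothetical external database of past weather, keyed by ISO week.
    return {"2024-W23": {"sun_hours": 11, "rain_mm": 0, "tide": "high"},
            "2024-W02": {"sun_hours": 4, "rain_mm": 12, "tide": "low"}}

def surf_expert(conditions: dict) -> bool:
    # Stand-in for the external surfing agent: sunny, dry, high tide is best.
    return (conditions["rain_mm"] < 1 and conditions["sun_hours"] > 8
            and conditions["tide"] == "high")

knowledge_base: dict = {}
knowledge_base["weather"] = weather_history("Greece")   # subtask 1: fetch missing data
knowledge_base["good_weeks"] = [                        # subtask 2: consult the expert
    week for week, cond in knowledge_base["weather"].items() if surf_expert(cond)
]
print(knowledge_base["good_weeks"])  # finding presented to the user: ['2024-W23']
```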
3. Learning and reflection
AI agents use feedback mechanisms, such as other AI agents and human-in-the-loop (HITL), to improve the
accuracy of their responses. Let’s return to our previous surfing example to highlight this. After the agent
forms its response to the user, the agent stores the learned information along with the user’s feedback to
improve performance and adjust to user preferences for future goals.
If other agents were used to reach the goal, their feedback may also be used. Multi-agent feedback can be
especially useful in minimizing the time that human users spend providing direction. However, users can also
provide feedback throughout the agent's actions and internal reasoning to better align the results with the
intended goal.2
Feedback mechanisms improve the AI agent's reasoning and accuracy, which is commonly referred to as
iterative refinement.3 To avoid repeating the same mistakes, AI agents can also store data about solutions to
previous obstacles in a knowledge base.
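A minimal sketch of such a memory store, with an invented structure, might log each response alongside its feedback so earlier mistakes can be surfaced before the next attempt:

```python
memory: list[dict] = []  # stands in for a persistent knowledge base

def record(response: str, feedback: str) -> None:
    # Store the response together with user or agent feedback for later reuse.
    memory.append({"response": response, "feedback": feedback})

def past_mistakes() -> list[str]:
    # Retrieve earlier negative feedback to avoid repeating the same errors.
    return [m["feedback"] for m in memory if m["feedback"].lower().startswith("wrong")]

record("Week 23 is best for surfing", "Wrong: the user also wants to avoid crowds")
print(past_mistakes())
```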
Agentic vs non-agentic AI chatbots
Non-agentic AI chatbots are chatbots without available tools, memory and reasoning. They can only reach short-term goals and cannot plan ahead. Non-agentic chatbots require continuous user input to respond. They can produce responses to common prompts that most likely align with user expectations but perform poorly on questions unique to the user and their data. Since these chatbots do not retain memory, they cannot learn from their mistakes if their responses are unsatisfactory.
In contrast, agentic AI chatbots learn to adapt to user expectations over time, providing a more personalized
experience and comprehensive responses. They can complete complex tasks by creating subtasks without human
intervention and considering different plans. These plans can also be self-corrected and updated as needed. Agentic
AI chatbots, unlike non-agentic ones, assess their tools and use their available resources to fill in information gaps.
Reasoning paradigms
There is no single standard architecture for building AI agents. Several paradigms exist for solving multi-step problems.
ReAct (Reasoning and Action)
With this paradigm, we can instruct agents to "think" and plan after each action taken and with each tool response to
decide which tool to use next. These Think-Act-Observe loops are used to solve problems step by step and iteratively
improve upon responses.
Through the prompt structure, agents can be instructed to reason slowly and to display each "thought".4 The agent's
verbal reasoning gives insight into how responses are formulated. In this framework, agents continuously update
their context with new reasoning. This can be interpreted as a form of Chain-of-Thought prompting.
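A minimal ReAct-style loop can be sketched as follows; the toy policy and single tool are hypothetical, but the Think-Act-Observe structure matches the paradigm described above:

```python
def toy_policy(context: str) -> tuple[str, str]:
    # Stand-in for the LLM: look something up once, then finish.
    if "Observation:" not in context:
        return ("I should check the weather.", "lookup_weather")
    return ("I now have enough information to answer.", "finish")

tools = {"lookup_weather": lambda: "Sunny, high tide, no rain."}

context = "Question: Is next week good for surfing?"
for _ in range(5):  # hard cap so a faulty agent cannot loop forever
    thought, action = toy_policy(context)
    context += f"\nThought: {thought}\nAction: {action}"  # verbal reasoning is displayed
    if action == "finish":
        break
    context += f"\nObservation: {tools[action]()}"  # tool output feeds the next thought
print(context)
```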
ReWOO (Reasoning WithOut Observation)
The ReWOO method, unlike ReAct, eliminates the dependence on tool outputs for action planning. Instead, agents
plan upfront. Redundant tool usage is avoided by anticipating which tools to use upon receiving the initial prompt
from the user. This is desirable from a human-centered perspective since the user can confirm the plan before it is
executed.
The ReWOO workflow is made up of three modules. In the planning module, the agent anticipates its next steps
given a user's prompt. The next stage entails collecting the outputs produced by calling these tools. Lastly, the agent
pairs the initial plan with the tool outputs to formulate a response. This planning ahead can greatly reduce token
usage and computational complexity as well as the repercussions of intermediate tool failure.5
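The three modules can be sketched like this; the planner's canned output and the tool names are invented for illustration:

```python
def planner(question: str) -> list[tuple[str, str]]:
    # Module 1: anticipate every tool call from the initial prompt alone.
    return [("weather_db", "Greece weekly weather"), ("surf_expert", "ideal conditions")]

def worker(plan: list[tuple[str, str]]) -> list[str]:
    # Module 2: execute the planned calls and collect their outputs.
    tools = {"weather_db": lambda q: "Week 23: sunny, high tide, no rain",
             "surf_expert": lambda q: "Best conditions: sun, high tide, little rain"}
    return [tools[name](arg) for name, arg in plan]

def solver(question: str, plan: list[tuple[str, str]], evidence: list[str]) -> str:
    # Module 3: pair the initial plan with the tool outputs to form a response.
    steps = ", ".join(name for name, _ in plan)
    return f"{question} Plan used: {steps}. Evidence: {'; '.join(evidence)}"

q = "Which week is best for surfing in Greece?"
plan = planner(q)                    # the user could confirm this plan before execution
print(solver(q, plan, worker(plan)))
```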
Types of AI agents
AI agents can be developed with varying levels of capability. A simple agent may be preferred for straightforward goals to limit unnecessary computational complexity. In order of simplest to most advanced, there are five main agent types:
1. Simple reflex agents
Simple reflex agents are the simplest agent form, grounding actions on current perception alone. This agent does not hold any memory, nor does it interact with other agents if it is missing information. These agents function on a set of so-called reflexes or rules. This means that the agent is preprogrammed to perform actions that correspond to certain conditions being met.
If the agent encounters a situation that it is not prepared for, it cannot respond appropriately. These agents are only effective in environments that are fully observable, granting access to all necessary information.6
Example: A thermostat that turns on the heating system at a set time every night. The condition-action rule here is,
for instance, if it is 8 PM, then the heating is activated.
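In code, such a condition-action rule is a single branch with no memory or model, as in this small sketch of the thermostat example:

```python
def thermostat(current_hour: int) -> str:
    # The preprogrammed reflex: if it is 8 PM (hour 20), activate the heating.
    return "heating on" if current_hour == 20 else "no action"

print(thermostat(20))  # heating on
print(thermostat(9))   # no action (any unanticipated situation is simply ignored)
```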
Simple reflex agent diagram
2. Model-based reflex agents
Model-based reflex agents use both their current perception and memory to maintain an internal model of the world. As the agent continues to receive new information, the model is updated. The agent's actions depend on its model, reflexes, previous percepts and current state.
These agents, unlike simple reflex agents, can store information in memory and can operate in environments that are
partially observable and changing. However, they are still limited by their set of rules.6
Example: A robot vacuum cleaner. As it cleans a dirty room, it senses obstacles such as furniture and adjusts around
them. The robot also stores a model of the areas it has already cleaned to not get stuck in a loop of repeated
cleaning.
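A sketch of the vacuum's internal model, with invented grid coordinates, shows how stored state prevents repeated cleaning:

```python
cleaned: set[tuple[int, int]] = set()  # internal model: cells already cleaned

def step(position: tuple[int, int], obstacle_ahead: bool) -> str:
    if obstacle_ahead:
        return "turn"        # reflex: adjust around furniture
    if position in cleaned:
        return "move on"     # the model prevents a loop of repeated cleaning
    cleaned.add(position)    # update the model with the new percept
    return "clean"

print(step((0, 0), obstacle_ahead=False))  # clean
print(step((0, 0), obstacle_ahead=False))  # move on
```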
3. Goal-based agents
Goal-based agents have an internal model of the world and also a goal or set of goals. These agents search for action
sequences that reach their goal and plan these actions before acting on them. This search and planning improve their
effectiveness when compared to simple and model-based reflex agents.7
Example: A navigation system that recommends the fastest route to your destination. The model considers various
routes that reach your destination, or in other words, your goal. In this example, the agent’s condition-action rule
states that if a quicker route is found, the agent recommends that one instead.
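Searching for an action sequence that reaches the goal can be sketched as a shortest-path search over a toy road graph; the map data is invented:

```python
from collections import deque

roads = {"home": ["a", "b"], "a": ["goal"], "b": ["a", "goal"], "goal": []}

def fastest_route(start: str, goal: str) -> list[str]:
    # Breadth-first search: the first path that reaches the goal is the shortest.
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in roads[path[-1]]:
            queue.append(path + [nxt])
    return []

print(fastest_route("home", "goal"))  # ['home', 'a', 'goal']
```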
4. Utility-based agents
Utility-based agents select the sequence of actions that reaches the goal and also maximizes utility or reward. Utility is calculated using a utility function. This function assigns a utility value, a metric measuring the usefulness of an action or how "happy" it will make the agent, to each scenario based on a set of fixed criteria.
The criteria can include factors such as progression toward the goal, time requirements, or computational
complexity. The agent then selects the actions that maximize the expected utility. Hence, these agents are useful in
cases where multiple scenarios achieve a desired goal and an optimal one must be selected.7
Example: A navigation system that recommends the route to your destination that optimizes fuel efficiency and
minimizes the time spent in traffic and the cost of tolls. This agent measures utility through this set of criteria to
select the most favorable route.
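As a sketch, a utility function over the navigation example might weight time, fuel and tolls; the routes and weights below are illustrative:

```python
routes = [
    {"name": "highway", "minutes": 30, "fuel_l": 5.0, "tolls": 4.0},
    {"name": "coastal", "minutes": 45, "fuel_l": 3.5, "tolls": 0.0},
]

def utility(route: dict) -> float:
    # Higher is better: penalize travel time, fuel use and toll cost.
    return -(0.5 * route["minutes"] + 5.0 * route["fuel_l"] + 1.5 * route["tolls"])

best = max(routes, key=utility)   # both routes reach the goal; pick the maximizer
print(best["name"])               # coastal, under these particular weights
```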
5. Learning agents
Learning agents hold the same capabilities as the other agent types but are unique in their ability to learn. New experiences are added to their initial knowledge base, which occurs autonomously. This learning enhances the agent's ability to operate in unfamiliar environments. Learning agents may be utility or goal-based in their reasoning and comprise four main elements:7
– Learning: This improves the agent's knowledge by learning from the environment through its percepts and sensors.
– Critic: This provides feedback to the agent on whether the quality of its responses meets the performance
standard.
– Performance: This element is responsible for selecting actions upon learning.
– Problem generator: This creates various proposals for actions to be taken.
Example: Personalized recommendations on e-commerce sites. These agents track user activity and preferences in
their memory. This information is used to recommend certain products and services to the user. The cycle repeats
each time new recommendations are made. The user’s activity is continuously stored for learning purposes. In doing
so, the agent improves its accuracy over time.
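The four elements can be wired together in a small sketch; the recommendation actions and the random stand-in for user feedback are invented:

```python
import random

knowledge: dict[str, float] = {}  # learned value of each action

def problem_generator() -> list[str]:
    return ["recommend shoes", "recommend books"]  # proposals for actions to take

def performance(actions: list[str]) -> str:
    # Select the action with the best learned value so far.
    return max(actions, key=lambda a: knowledge.get(a, 0.0))

def critic(action: str) -> float:
    # Feedback against the performance standard (here: a coin-flip "click").
    return 1.0 if random.random() < 0.5 else -1.0

def learn(action: str, score: float) -> None:
    knowledge[action] = knowledge.get(action, 0.0) + score  # update the knowledge base

for _ in range(10):  # the cycle repeats with each new recommendation
    action = performance(problem_generator())
    learn(action, critic(action))
print(knowledge)
```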
Use cases of AI agents
Customer experience
AI agents can be integrated into websites and apps to enhance the customer experience by serving as virtual assistants, providing mental health support, simulating interviews and handling other related tasks.8 There are many no-code templates for user implementation, making the process of creating these AI agents even easier.
Healthcare
AI agents can be used for various real-world healthcare applications. Multi-agent systems can be particularly useful
for problem-solving in such settings. From treatment planning for patients in the emergency department to managing
drug processes, these systems save the time and effort of medical professionals for more urgent tasks.9
Emergency response
In cases of natural disasters, AI agents can use deep learning algorithms to retrieve the information of social media users who need rescue. The locations of these users can be mapped to assist rescue services in saving more people in less time. AI agents can therefore greatly benefit human life in both mundane tasks and life-saving situations.10
Benefits of AI agents
Task automation
With the ongoing advancements in generative AI, there is a growing interest in workflow optimization using AI, also referred to as intelligent automation. AI agents can automate complex tasks that would otherwise require human resources. This translates to goals being reached inexpensively, rapidly and at scale. In turn, human agents do not need to provide direction to the AI assistant in creating and navigating its tasks.
Greater performance
Multi-agent frameworks tend to outperform singular agents.11 This is because the more plans of action are available
to an agent, the more learning and reflection occur. An AI agent incorporating knowledge and feedback from other AI
agents specializing in related areas can be useful for information synthesis. This backend collaboration of AI agents
and the ability to fill information gaps are unique to agentic frameworks, making them a powerful tool and a
meaningful advancement in artificial intelligence.
Quality of responses
AI agents provide responses that are more comprehensive, accurate and personalized to the user than traditional AI models. This matters to users because higher-quality responses typically yield a better customer experience. As previously described, this is made possible through exchanging information with other agents, using external tools and updating their memory stream. These behaviors emerge on their own and are not preprogrammed.12
Risks and limitations
Multi-agent dependencies
Certain complex tasks require the knowledge of multiple AI agents. When implementing these multi-agent frameworks, there is a risk of malfunction. Multi-agent systems built on the same foundation models may experience shared pitfalls. Such weaknesses could cause a system-wide failure of all involved agents or expose vulnerability to adversarial attacks.13 This highlights the importance of data governance in building foundation models and of thorough training and testing processes.
Infinite feedback loops
The convenience of hands-off reasoning for human users also comes with risks. Agents that are unable to create a comprehensive plan or reflect on their findings may find themselves repeatedly calling the same tools, invoking infinite feedback loops. To avoid these redundancies, some level of real-time human monitoring may be used.13
Computational complexity
Building AI agents from scratch is both time-consuming and computationally expensive. The resources required for training a high-performance agent can be extensive. Additionally, depending on the complexity of the task, agents can take several days to complete their assigned tasks.12
Best practices
Activity logs
To address the concerns of multi-agent dependencies, developers can provide users with access to a log of agent
actions.14 The actions can include the use of external tools and describe the external agents utilized to reach the goal.
This transparency grants users insight into the iterative decision-making process, provides the opportunity
to discover errors and builds trust.
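A minimal sketch of such a log, with an invented schema, records each tool call and external-agent handoff with a timestamp for later review:

```python
import json, time

log: list[dict] = []

def log_action(kind: str, target: str, detail: str) -> None:
    # Append one auditable entry per agent action.
    log.append({"ts": time.time(), "kind": kind, "target": target, "detail": detail})

log_action("tool_call", "weather_db", "fetched Greece weather history")
log_action("agent_call", "surf_expert", "asked for ideal surf conditions")
print(json.dumps(log, indent=2))  # expose this log to the end user
```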
Interruption
Preventing AI agents from running for overly long periods of time is recommended. Particularly, in cases of
unintended infinite feedback loops, changes in access to certain tools, or malfunctioning due to design flaws. One
way to accomplish this is by implementing interruptibility.
Maintaining this control involves allowing human users the option to gracefully interrupt a sequence of actions or the entire operation. Choosing if and when to interrupt an AI agent requires some thoughtfulness, as some
terminations can cause more harm than good. For instance, it may be safer to allow a faulty agent to continue
assisting in a life-threatening emergency than to completely shut it down.5
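One simple way to sketch interruptibility is a step budget plus a human stop signal, so a looping agent halts gracefully instead of running forever; the budget value and signal hook are illustrative:

```python
import itertools

def run_agent(steps, max_steps: int = 50, stop_requested=lambda: False) -> str:
    for i, step in enumerate(steps):
        if i >= max_steps:
            return "halted: step budget exhausted (possible infinite feedback loop)"
        if stop_requested():
            return f"halted gracefully by a human after step {i}"
        step()  # execute one action in the sequence
    return "completed"

# An agent that would otherwise run forever is cut off at the budget:
print(run_agent(itertools.repeat(lambda: None)))
```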
Unique agent identifiers
To mitigate the risk of agentic systems being used maliciously, unique identifiers can be required.14 If these identifiers were required for agents to access external systems, it would be easier to trace the origin of the agent's developers, deployers and users. This would be particularly helpful in case of any malicious use or unintended harm done by the agent. This level of accountability would provide a safer environment for these AI agents to operate in.
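A sketch of this idea: attach an identity block, with invented fields, to every outbound request so external systems can trace the responsible parties:

```python
import uuid

agent_identity = {
    "agent_id": str(uuid.uuid4()),   # unique identifier for this agent instance
    "developer": "ExampleLabs",      # illustrative parties, not real organizations
    "deployer": "ExampleCorp",
    "user": "user-123",
}

def call_external_system(request: dict) -> dict:
    # An external system could refuse requests lacking this identity block.
    return {"request": request, "identity": agent_identity}

print(call_external_system({"action": "read_calendar"})["identity"]["agent_id"])
```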
Human supervision
To assist in the learning process for AI agents, especially in their early stages in a new environment, it can be helpful
to provide occasional human feedback. This allows the AI agent to compare its performance to the expected
standard and adjust accordingly. This form of feedback is helpful in improving the agent’s adaptability to user
preferences.5
Apart from this, it is best practice to require human approval before an AI agent takes highly impactful actions. For
instance, actions ranging from sending mass emails to financial trading should require human confirmation.7 Some
level of human monitoring is recommended for such high-risk domains.
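An approval gate can be sketched as a check against an allowlist of high-impact actions; the action names and confirmation prompt are illustrative:

```python
HIGH_IMPACT = {"send_mass_email", "execute_trade"}  # actions needing confirmation

def maybe_execute(action: str, execute, approve=input) -> str:
    if action in HIGH_IMPACT:
        # Hold the action until a human explicitly confirms it.
        if approve(f"Approve '{action}'? [y/N] ").strip().lower() != "y":
            return "blocked: human approval withheld"
    execute()
    return "executed"

print(maybe_execute("summarize_inbox", lambda: None))                       # executed
print(maybe_execute("execute_trade", lambda: None, approve=lambda _: "n"))  # blocked
```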
Footnotes
1. Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Yong-Jin Liu and Gao Huang, "ExpeL: LLM agents are experiential learners," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38, No. 17, pp. 19632-19642, 2024, https://ojs.aaai.org/index.php/AAAI/article/view/29936 (link resides outside of ibm.com).
2. Yonadov Shavit, Sandhini Agarwal, Miles Brundage, Steven Adler, Cullen O'Keefe, Rosie Campbell, Teddy Lee, Pamela Mishkin, Tyna Eloundou, Alan Hickey, Katarina Slama, Lama Ahmad, Paul McMillan, Alex Beutel, Alexandre Passos and David G. Robinson, "Practices for Governing Agentic AI Systems," OpenAI, 2023, https://arxiv.org/pdf/2401.13138v3 (link resides outside of ibm.com).
3. Tula Masterman, Sandi Besen, Mason Sawtell and Alex Chao, "The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey," arXiv preprint, 2024, https://arxiv.org/abs/2404.11584 (link resides outside of ibm.com).
4. Gautier Dagan, Frank Keller and Alex Lascarides, "Dynamic Planning with a LLM," arXiv preprint, 2023, https://arxiv.org/abs/2308.06391 (link resides outside of ibm.com).
5. Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu and Dongkuan Xu, "ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models," arXiv preprint, 2023, https://arxiv.org/abs/2305.18323 (link resides outside of ibm.com).
6. Sebastian Schmid, Daniel Schraudner and Andreas Harth, "Performance comparison of simple reflex agents using stigmergy with model-based agents in self-organizing transportation," IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion, pp. 93-98, 2021, https://ieeexplore.ieee.org/document/9599196 (link resides outside of ibm.com).
7. Veselka Sasheva Petrova-Dimitrova, "Classifications of intelligence agents and their applications," Fundamental Sciences and Applications, Vol. 28, No. 1, 2022.
8. Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei and Jirong Wen, "A survey on large language model based autonomous agents," Frontiers of Computer Science, Vol. 18, No. 6, 2024, https://link.springer.com/article/10.1007/s11704-024-40231-1 (link resides outside of ibm.com).
9. Jaya R. Haleema, Haleema and N. C. S. N. Narayana, "Enhancing a Traditional Health Care System of an Organization for Better Service with Agent Technology by Ensuring Confidentiality of Patients' Medical Information," Cybernetics and Information Technologies, Vol. 12, No. 3, pp. 140-156, 2013, https://sciendo.com/article/10.2478/cait-2013-0031 (link resides outside of ibm.com).
10. Jingwei Huang, Wael Khallouli, Ghaith Rabadi and Mamadou Seck, "Intelligent Agent for Hurricane Emergency Identification and Text Information Extraction from Streaming Social Media Big Data," International Journal of Critical Infrastructures, Vol. 19, No. 2, pp. 124-139, 2023, https://arxiv.org/abs/2106.07114 (link resides outside of ibm.com).
11. Junyou Li, Qin Zhang, Yangbin Yu, Qiang Fu and Deheng Ye, "More agents is all you need," arXiv preprint, 2024, https://arxiv.org/abs/2402.05120 (link resides outside of ibm.com).
12. Joon Sung Park, Joseph O'Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang and Michael S. Bernstein, "Generative agents: Interactive simulacra of human behavior," Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, pp. 1-22, 2023, https://dl.acm.org/doi/10.1145/3586183.3606763 (link resides outside of ibm.com).
13. Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim and Markus Anderljung, "Visibility into AI Agents," The 2024 ACM Conference on Fairness, Accountability, and Transparency, pp. 958-973, 2024, https://arxiv.org/abs/2401.13138 (link resides outside of ibm.com).
14. Devjeet Roy, Xuchao Zhang, Rashi Bhave, Chetan Bansal, Pedro Las-Casas, Rodrigo Fonseca and Saravan Rajmohan, "Exploring LLM-based Agents for Root Cause Analysis," arXiv preprint, 2024, https://arxiv.org/abs/2403.04123 (link resides outside of ibm.com).