
Advanced Natural Language Processing

Lecture 17: LLM-based Agent

陈冠华 CHEN Guanhua


Department of Statistics and Data Science
Autonomous Agents

LLM Powered Autonomous Agents | Lil'Log (lilianweng.github.io)

Autonomous Agents

• Interact with humans


• Interact with tools
• Interact with environment
• Interact with other agents

LLM with Tools

• External tools to enhance LLMs

LLM with Tools

• Code Interpreter

https://openai.com/blog/chatgpt-plugins#code-interpreter

LLM with Tools

• With Wolfram tool

https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
LLM with Tools

XAgent: An Autonomous Agent for Complex Task Solving | XAgent (x-agent.net)


Toolformer

• LMs teach themselves to use external tools via simple APIs
• Trained to decide
• Which APIs to call
• When to call them
• What arguments to pass
• How to best incorporate the results
into future token prediction

[2302.04761] Toolformer: Language Models Can Teach Themselves to Use Tools (arxiv.org)

Toolformer

• Gain the ability to use different tools by means of API calls


• Represent each API call as a tuple c = (a_c, i_c), where a_c is the name of the API and i_c is the corresponding input
• An API call is linearized into text as e(c) = <API> a_c(i_c) </API> without its result, or e(c, r) = <API> a_c(i_c) → r </API> with result r
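A minimal sketch of this linearization into plain text with the <API>, </API> and "→" markers; the helper name below is illustrative, not the paper's released code.

```python
from typing import Optional

def linearize_call(name: str, api_input: str, result: Optional[str] = None) -> str:
    """Return '<API> name(input) </API>' or, with a result, '<API> name(input) → result </API>'."""
    if result is None:
        return f"<API> {name}({api_input}) </API>"
    return f"<API> {name}({api_input}) → {result} </API>"

# Example in the spirit of the lecture's QA tool:
print(linearize_call("QA", '"Where was Joe Biden born?"'))
print(linearize_call("QA", '"Where was Joe Biden born?"', "Scranton"))
```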

Toolformer

Step 1: Sampling API Calls


• Use in-context learning to annotate data

Toolformer

Step 1: Sampling API Calls


• Sample candidate positions where an API call may be inserted: for each position i, compute p_i = p_M(<API> | x_{1:i-1})
• Keep the top-k positions whose p_i exceeds a sampling threshold τ_s
• For each kept position i, sample up to m API calls, using [x_{1:i-1}, <API>] as a prefix and </API> as the end-of-sequence token
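A rough sketch of this position-scoring step, using GPT-2 as a stand-in model (Toolformer itself uses GPT-J); the special tokens are freshly added and untrained here, and the threshold and k values are assumed for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.add_special_tokens({"additional_special_tokens": ["<API>", "</API>"]})
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.resize_token_embeddings(len(tok))  # embeddings for the new tokens are untrained in this sketch

api_id = tok.convert_tokens_to_ids("<API>")
text = "Joe Biden was born in Scranton, Pennsylvania."
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits                     # shape: (1, seq_len, vocab_size)
probs = logits.softmax(dim=-1)

# p_i = p_M(<API> | x_{1:i-1}): probability of opening an API call right before token i
p_api = probs[0, :-1, api_id]

tau_s, k = 0.05, 3                                 # sampling threshold and top-k (assumed values)
top = torch.topk(p_api, k)
positions = [int(i) + 1 for i, p in zip(top.indices, top.values) if p > tau_s]
print("candidate insertion positions:", positions)
```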

Toolformer

Step 2: Executing API Calls


• Execute all API calls to obtain the corresponding results.
• This can involve calling another neural network, executing a Python script, or using a retrieval system to search over a large corpus

KillianLucas/open-interpreter: OpenAI's Code Interpreter in your terminal, running locally (github.com)

Toolformer

Step 3: Filtering API Calls


• To decide whether an API call is helpful, compare weighted losses over the tokens that follow it
• Let i be the position of the API call in the sequence x = x_1, …, x_n, and let r be the response from the API
• Define L_i(z) = -Σ_{j=i}^{n} w_{j-i} · log p_M(x_j | z, x_{1:j-1}), where (w_t) is a weight sequence and z is a prefix prepended to x

Toolformer

Step 3: Filtering API Calls


• Three cases
• With the API call and its result: L_i(e(c_i, r_i))
• With the API call only: L_i(e(c_i, ε))
• Without the API call: L_i(ε)
• Define L_i^+ = L_i(e(c_i, r_i)) and L_i^- = min(L_i(ε), L_i(e(c_i, ε))), where ε is an empty sequence
• An API call is helpful if providing the model with both the input and the output of this call makes it easier to predict future tokens

Toolformer

Step 3: Filtering API Calls


• An API call at position i is kept only if it is helpful by at least a filtering threshold τ_f: L_i^- − L_i^+ ≥ τ_f
• i.e., when conditioning on the call and its result lowers the weighted loss compared to not receiving the API call at all, or receiving only its input
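A minimal sketch of the filtering rule, assuming a hypothetical helper has already produced per-token log-probabilities of the continuation under each of the three prefixes; the weights, threshold, and numbers are illustrative only.

```python
from typing import List

def weighted_loss(logprobs: List[float], weights: List[float]) -> float:
    """L_i(z) = -sum_t w_t * log p(x_{i+t} | z, ...) over the tokens after the call position."""
    return -sum(w * lp for w, lp in zip(weights, logprobs))

def keep_api_call(lp_with_result, lp_with_call_only, lp_without, weights, tau_f=1.0):
    """Keep the call iff min(L(no call), L(call without result)) - L(call + result) >= tau_f."""
    l_plus = weighted_loss(lp_with_result, weights)
    l_minus = min(weighted_loss(lp_without, weights),
                  weighted_loss(lp_with_call_only, weights))
    return (l_minus - l_plus) >= tau_f

# Toy numbers only: log-probs of the same continuation under the three prefixes.
weights = [1.0, 0.8, 0.6]
print(keep_api_call(lp_with_result=[-0.2, -0.3, -0.4],
                    lp_with_call_only=[-1.5, -1.2, -1.0],
                    lp_without=[-1.4, -1.3, -1.1],
                    weights=weights, tau_f=1.0))   # True: the result clearly helps
```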

Example

Input x: Joe Biden was born in Scranton, Pennsylvania.


• Step 1 Sampling
• Joe Biden was born in <API> QA("Where was Joe Biden born?")</API> Scranton, <API> QA("In which state
is Scranton?") </API> Pennsylvania.
• Step 2 Executing API calls
• Joe Biden was born in <API> QA("Where was Joe Biden born?") →Scranton</API> Scranton, <API> QA("In
which state is Scranton?") →Pennsylvania </API> Pennsylvania.
• Step 3 Filtering
• Joe Biden was born in <API> QA("Where was Joe Biden born?") →Scranton</API> Scranton, Pennsylvania.

Toolformer

• Inference mode
• The New England Journal of Medicine is a registered trademark of ...
• “The New England Journal of Medicine is a registered trademark of <API> QA(“Who
is the publisher of The New England Journal of Medicine?”) →
• Switch to tool mode when the model emits the token “→”
• The New England Journal of Medicine is a registered trademark of <API> QA(“Who
is the publisher of The New England Journal of Medicine?”) → Massachusetts
Medical Society </API>
• Inference mode
• The New England Journal of Medicine is a registered trademark of <API> QA(“Who
is the publisher of The New England Journal of Medicine?”) → Massachusetts
Medical Society </API> Massachusetts Medical Society.
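A hedged sketch of this inference loop: decode until the model emits “→” inside an open API call, execute the call, splice in the result plus </API>, and resume decoding. Both generate_until and execute_api are hypothetical stand-ins for the LM decoder and the tool backend.

```python
def toolformer_decode(prompt, generate_until, execute_api, max_calls=5):
    """Toolformer-style inference sketch: decode, pause at '→', call the tool, resume.

    generate_until(text, stops) -> (continuation_without_stop, stop_hit_or_None)  # hypothetical LM helper
    execute_api(call_text)      -> result string                                  # hypothetical tool backend
    """
    text = prompt
    for _ in range(max_calls):
        continuation, hit = generate_until(text, ["→"])
        text += continuation
        if hit is None:                       # model finished without requesting any (more) tools
            return text
        # The span after the last "<API>" is the call itself, e.g. QA("Who is the publisher ...?")
        call = text.rsplit("<API>", 1)[1].strip()
        result = execute_api(call)
        text += " → " + result + " </API>"    # splice in the result, then keep decoding
    return text
```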

ToolLLM

• A general tool-use framework including data construction, model training, and evaluation
• API Collection
• Gather 16,464 representational state transfer (REST) APIs from RapidAPI, a
platform that hosts massive real-world APIs provided by developers
• By comprehending these documents to learn to execute APIs, LLMs can generalize
to new APIs unseen during training
• API retriever
• Given an instruction, the API retriever recommends a set of relevant APIs

[2307.16789] ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (arxiv.org)
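ToolLLM trains its own dense API retriever; a rough sketch of the underlying idea using an off-the-shelf sentence-transformers encoder (the model name, example APIs, and scoring below are stand-ins, not the released retriever).

```python
from sentence_transformers import SentenceTransformer, util

# Stand-in encoder; ToolLLM trains a retriever on (instruction, relevant API) pairs.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

api_docs = {  # hypothetical API documentation snippets
    "weather.current": "Get the current weather for a given city.",
    "geo.locate": "Resolve a place name to latitude and longitude.",
    "stocks.quote": "Fetch the latest stock quote for a ticker symbol.",
}

instruction = "What's the temperature in Shenzhen right now?"
doc_names = list(api_docs)
doc_emb = encoder.encode(list(api_docs.values()), convert_to_tensor=True)
query_emb = encoder.encode(instruction, convert_to_tensor=True)

scores = util.cos_sim(query_emb, doc_emb)[0]          # cosine similarity to each API doc
top = scores.argsort(descending=True)[:2]
print([doc_names[int(i)] for i in top])                # APIs recommended for the instruction
```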

ToolLLM

Fine-tuning LLaMA on ToolBench yields ToolLLaMA

ToolLLM

• Instruction Generation
• First sample APIs and then prompt ChatGPT to generate diverse instructions
• Both single-tool and multi-tool scenarios
• The prompt is composed of
1. a general description of the instruction generation task
2. comprehensive documentation of each sampled API
3. three in-context seed examples {seed1, seed2, seed3}. Each seed example is an
ideal instruction generation written by human experts

[2307.16789] ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (arxiv.org)

ToolLLM

• Solution Path Annotation


• Each solution path may contain multiple rounds of reasoning and API calls
• Given an instruction Inst*, prompt ChatGPT to search for a valid action sequence {a_1, …, a_N}
• For each action a_t, ChatGPT should specify its “thought”, which API to use, and the specific parameters for this API
• format: “Thought: · · · , API Name: · · · , Parameters: · · · ”
• Define two additional functions, i.e., “Finish with Final Answer” and “Finish by
Giving Up”

ToolLLM

• Prompts for instruction generation

ToolLLM

• Depth-first search-based decision tree (DFSDT)


• Construct a decision tree
• Prompt ChatGPT with the information of the previously generated nodes and
explicitly encourage the model to generate a distinct node
• Why prefer depth-first search (DFS) over breadth-first search (BFS)?
• The annotation can be finished as long as one valid path is found
• Using BFS will cost excessive OpenAI API calls
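A rough sketch of the depth-first search over tool-call nodes: expand a node into candidate next steps, recurse depth-first, and stop at the first path that reaches "Finish with Final Answer". propose_next_steps and is_final_answer are hypothetical stand-ins for the ChatGPT prompting described above.

```python
def dfs_decision_tree(path, propose_next_steps, is_final_answer, max_depth=8):
    """Depth-first search for one valid solution path (DFSDT sketch).

    path: list of steps taken so far, each e.g. {"thought": ..., "api": ..., "params": ...}
    propose_next_steps(path) -> list of candidate next steps, encouraged to be distinct
    is_final_answer(step)    -> True if the step is "Finish with Final Answer"
    """
    if len(path) >= max_depth:
        return None                       # give up on this branch
    for step in propose_next_steps(path):
        if is_final_answer(step):
            return path + [step]          # first valid path found: annotation is done
        result = dfs_decision_tree(path + [step], propose_next_steps,
                                   is_final_answer, max_depth)
        if result is not None:
            return result                 # propagate the successful path upward
    return None                           # all children failed; backtrack
```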

HuggingGPT

• A framework to use ChatGPT as the task planner


• Select models available in HuggingFace platform according to the model
descriptions
• Summarize the response based on the execution results
• The system comprises four stages:
• Task planning
• Parses the user requests into multiple tasks.
• Each task has task type, ID, dependencies, and arguments
• In-context learning

[2303.17580] HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (arxiv.org)
HuggingGPT

• Model selection
• Distributes the tasks to expert models
• LLM is presented with a list of models to choose from
• Task execution
• Expert models execute on the specific tasks and log results
• Response generation
• LLM receives the execution results and provides summarized results to users
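A small sketch of how a planned task list could be executed in dependency order; the four attributes follow the task-planning fields listed earlier (type, id, dependencies, arguments), while the "<resource-k>" placeholder syntax and the executor are assumptions for illustration.

```python
# Tasks as parsed by the planner; each has a type, an id, dependencies, and arguments.
tasks = [
    {"task": "image-to-text", "id": 0, "dep": [-1], "args": {"image": "example.jpg"}},
    {"task": "text-to-speech", "id": 1, "dep": [0], "args": {"text": "<resource-0>"}},
]

def resolve(value, results):
    """Replace '<resource-k>' placeholders with the output of already-executed task k (assumed syntax)."""
    if isinstance(value, str) and value.startswith("<resource-"):
        return results[int(value[len("<resource-"):-1])]
    return value

def run_task(task, results):
    """Stand-in for selecting and running an expert model for this task type."""
    args = {k: resolve(v, results) for k, v in task["args"].items()}
    return f"[{task['task']} output for {args}]"

results = {}
for t in sorted(tasks, key=lambda t: t["id"]):            # execute in dependency order
    assert all(d == -1 or d in results for d in t["dep"]), "unmet dependency"
    results[t["id"]] = run_task(t, results)
print(results)
```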

Autonomous Agents

• Task decomposition and planning

[2308.11432] A Survey on Large Language Model based Autonomous Agents (arxiv.org)

ReAct

• Combines reasoning and acting for solving diverse reasoning and decision-making tasks
• Different stages
• Thought
• Action
• Observation

[2210.03629] ReAct: Synergizing Reasoning and Acting in Language Models (arxiv.org)

ReAct

• Thought
• Compose useful information by reasoning over the current context, and update the
context to support future reasoning or acting
• Examples
• Decomposing task goals and creating action plans
• Injecting commonsense knowledge relevant to task solving
• Extracting important parts from observations
• Tracking progress and transitioning between action plans
• Handling exceptions and adjusting action plans

ReAct

• Action (Example action space)


• Search[entity]
• Lookup[string]
• Finish[answer]
• Observation
• Feedback from the environment
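A minimal sketch of the Thought/Action/Observation loop with the example action space above; llm and the tools dictionary are hypothetical stand-ins for the language model and the Wikipedia-style environment.

```python
import re

def react_episode(question, llm, tools, max_steps=6):
    """ReAct-style loop: the LLM emits a Thought and an Action, the environment
    returns an Observation, until Finish[answer] is produced.

    llm(prompt) -> text ending with 'Action k: Search[...] / Lookup[...] / Finish[...]'
    tools: dict mapping action name -> callable, e.g. {"Search": wiki_search, "Lookup": wiki_lookup}
    """
    prompt = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        output = llm(prompt + f"Thought {step}:")
        prompt += f"Thought {step}:" + output + "\n"
        match = re.search(r"Action \d+: (\w+)\[(.*?)\]", output)
        if match is None:
            continue                                   # no action emitted; keep reasoning
        action, arg = match.groups()
        if action == "Finish":
            return arg                                 # final answer
        observation = tools[action](arg)               # e.g. Search[Milhouse]
        prompt += f"Observation {step}: {observation}\n"
    return None
```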

ReAct

Example:
Question: Musician and satirist Allie Goertz wrote a song about the
"The Simpsons" character Milhouse, who Matt Groening named after
who?
Thought 1: The question simplifies to "The Simpsons" character
Milhouse is named after who. I only need to search Milhouse and
find who it is named after.
Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character
in the Fox animated television series The Simpsons voiced by Pamela
Hayden and created by Matt Groening.

Thought 3: Milhouse was named after U.S. president Richard Nixon, so
the answer is Richard Nixon.
Action 3: Finish[Richard Nixon]

ysymyth/ReAct: [ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models (github.com)
Self-Refine

• Improving initial outputs from LLMs through iterative feedback and refinement
• The same LLM provides feedback for its output and uses it to refine itself,
iteratively
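A minimal sketch of that generate, feedback, refine loop with a single LLM playing all roles through different prompts; the prompt wording and the stop phrase are assumptions for illustration.

```python
def self_refine(task, llm, max_iters=3):
    """Iteratively refine the model's own output using its own feedback (Self-Refine sketch)."""
    output = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_iters):
        feedback = llm(f"Task: {task}\nAnswer: {output}\n"
                       "Give concrete feedback on how to improve this answer, "
                       "or say 'LOOKS GOOD' if no further changes are needed.")
        if "LOOKS GOOD" in feedback:          # stop condition is an assumed convention
            break
        output = llm(f"Task: {task}\nPrevious answer: {output}\n"
                     f"Feedback: {feedback}\nRewrite the answer applying the feedback:")
    return output
```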

[2303.17651] Self-Refine: Iterative Refinement with Self-Feedback (arxiv.org)


Self-Refine

[2303.17651] Self-Refine: Iterative Refinement with Self-Feedback (arxiv.org)
madaan/self-refine: LLMs can generate feedback on their work, use it to improve the output (github.com)
Reflexion

• Reinforce agents not by updating weights, but instead through linguistic feedback
• Verbally reflect on task feedback signals
• Maintain their own reflective text in an
episodic memory buffer
• Induce better decision-making in subsequent
trials

[2303.11366] Reflexion: Language Agents with Verbal Reinforcement Learning (arxiv.org)

Reflexion

• Three distinct models


• Actor, generates text and actions
• Evaluator, scores the outputs
• Self-Reflection model, generates verbal
reinforcement cues to assist the Actor in self-
improvement
• Self-reflection
• Given a sparse reward signal, such as a binary
success status (success/fail), the current
trajectory, and its persistent memory
• Generates nuanced and specific feedback.
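A rough sketch of how the Actor, Evaluator, and Self-Reflection model could interact through an episodic memory buffer; all three components are hypothetical callables here.

```python
def reflexion(task, actor, evaluator, self_reflect, max_trials=3):
    """Verbal reinforcement loop: failed trials produce reflections that are stored
    in episodic memory and shown to the Actor on the next attempt (Reflexion sketch).

    actor(task, memory)              -> trajectory (text of thoughts/actions)
    evaluator(trajectory)            -> True on success, False on failure (sparse signal)
    self_reflect(trajectory, memory) -> short verbal reflection on what went wrong
    """
    memory = []                              # episodic memory buffer of reflections
    for _ in range(max_trials):
        trajectory = actor(task, memory)
        if evaluator(trajectory):
            return trajectory                # task solved
        memory.append(self_reflect(trajectory, memory))
    return None                              # unsolved after max_trials
```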

[2303.11366] Reflexion: Language Agents with Verbal Reinforcement Learning (arxiv.org)

Reflexion

[2303.11366] Reflexion: Language Agents with Verbal Reinforcement Learning (arxiv.org)
noahshinn/reflexion: [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning (github.com)
ReWOO

• Three modules: Planner, Worker, and Solver, handling step-wise reasoning, tool calls, and summarization respectively
• Planner breaks down a task and formulates a blueprint of interdependent plans, each of which is allocated to a Worker
• Worker retrieves external knowledge from tools to provide evidence
• Solver synthesizes all the plans and evidence to generate the final answer
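A sketch of the decoupling: the Planner emits all plans with evidence placeholders (#E1, #E2, …) in a single pass, Workers fill the placeholders by calling tools, and the Solver sees plans plus evidence only once. The callables are stand-ins; only the #E placeholder convention follows the paper.

```python
import re

def rewoo(question, planner, tools, solver):
    """Decoupled reasoning sketch: one planning pass, tool calls, one solving pass.

    planner(question) -> list of (plan_text, tool_name, tool_input), where tool_input
                         may reference earlier evidence as '#E1', '#E2', ...
    tools: dict of tool_name -> callable
    solver(question, plans, evidence) -> final answer
    """
    plans = planner(question)
    evidence = {}
    for idx, (plan, tool, tool_input) in enumerate(plans, start=1):
        # Substitute previously gathered evidence into this worker's input.
        filled = re.sub(r"#E(\d+)", lambda m: evidence[f"#E{m.group(1)}"], tool_input)
        evidence[f"#E{idx}"] = tools[tool](filled)
    return solver(question, plans, evidence)
```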

[2305.18323] ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models (arxiv.org)

ReWOO

billxbf/ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models (github.com)

Auto-GPT
• It receives only high-level goals and instructions at the beginning of complex
multi-step tasks, without requiring step-by-step guidance from humans
• It engages in self-monologue by generating ’Thoughts,’ ’Reasoning,’ ’Plan,’
and ’Criticism’ for each individual step of action
• It possesses the capability to integrate various tools through simple tool
instructions and a few examples
• It incorporates long-term self-memory and memory retrieval mechanisms
• Task-specific adaptation should require only minimal effort, such as providing goal definitions and tool descriptions

[2306.02224] Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions (arxiv.org)

Auto-GPT

Significant-Gravitas/AutoGPT (github.com)

Autonomous Agent

• Task decomposition and planning

Stanford Smallville

(Neural) Turing Machine

[2304.03442] Generative Agents: Interactive Simulacra of Human Behavior (arxiv.org)


Generative Agents

• Generative agents
• Computational software agents that simulate believable human behavior
• Take their current environment and past experiences as input and generate
behavior as output.
• Behaviors like
• Wake up, cook breakfast, and head to work
• artists paint, while authors write
• They form opinions, notice each other, and initiate conversations;
• They remember and reflect on days past as they plan the next day

Generative Agents

• Challenge for general agents: ensure long-term coherence


• Manage constantly-growing memories as new interactions, conflicts, and events
arise and fade over time
• Handling cascading social dynamics that unfold between multiple agents
• Requirements
• Retrieve relevant events and interactions over a long period
• Reflect on those memories to generalize and draw higher-level inferences
• Planning and reaction that make sense in the moment and in the longer-term arc
of the agent’s behavior

Generative Agents

• Authored one paragraph of natural language description to depict each agent's identity
• Including their occupation and relationship with other agents, as seed memories

Generative Agents

Architecture
• Memory stream
• A database that maintains a comprehensive record of an agent’s experience
• Records are retrieved as relevant to plan the agent’s actions and react
appropriately to the environment
• Records are recursively synthesized into higher- and higher-level reflections that
guide behaviors
• Each record contains a natural language description, a creation timestamp, and a
most recent access timestamp
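A small sketch of the retrieval scoring the paper describes, combining recency, importance, and relevance for each record; the embedding function is a stand-in and the normalization details are simplified.

```python
import math
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    description: str        # natural language description of the event
    created_at: float       # creation timestamp (game hours)
    last_accessed: float    # most recent access timestamp (game hours)
    importance: float       # 1-10 poignancy score, e.g. assigned by the LLM

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def retrieval_score(record, query_emb, embed, now, decay=0.995):
    """Sum of roughly normalized recency, importance, and relevance for one record."""
    recency = decay ** (now - record.last_accessed)      # exponential decay per hour
    importance = record.importance / 10.0                # scale 1-10 score to [0, 1]
    relevance = cosine(embed(record.description), query_emb)
    return recency + importance + relevance

def retrieve(memory_stream, query, embed, now, top_n=5):
    """Rank the whole memory stream and keep the top records for the agent's prompt."""
    query_emb = embed(query)
    ranked = sorted(memory_stream,
                    key=lambda r: retrieval_score(r, query_emb, embed, now),
                    reverse=True)
    return ranked[:top_n]
```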

Generative Agents

Architecture
• Reflection
• Higher-level, more abstract thoughts generated by the agent
• Generate reflections when the sum of the importance scores for the latest events
perceived by the agents exceeds a threshold (roughly two or three times a day)
• First, determine what to reflect on, by identifying questions that can be asked
given the agent’s recent experiences.
• E.g., prompt the language model: “Given only the information above, what are 3 most salient high-level questions we can answer about the subjects in the statements?”
• Then, prompt LLM to extract insights and cite the particular records that
served as evidence for the insights

Generative Agents

Architecture
• Reflection
• Example

Generative Agents

Architecture
• Planning
• A plan includes a location, a starting time, and a duration
• Consider observations, reflections, and plans all together when deciding how to
behave
• Starts top-down and then recursively generates more detail
• First step is to create a plan that outlines the day’s agenda in broad strokes
• Prompt the language model with the agent’s summary description (e.g., name,
traits, and a summary of their recent experiences) and a summary of their
previous day
• Recursively decomposes the plan to create finer-grained actions, first into hour-long chunks, then again into 5–15 minute chunks

Generative Agents

Architecture
• Reacting and Updating Plans
• Prompt the language model with
these observations to decide
whether the agent should continue
with their existing plan, or react

Generative Agents

Emergent Social Behaviors


• Information diffusion
• If there is important information, the agents should spread it among themselves
• Relationship formation
• Agents form new relationships over time and remember their interactions with
other agents
• Agent coordination
• Generative agents coordinate with each other

Thank you
