LLM Agents and Tool Use
University of Mannheim | IE686 LLMs and Agents | LLM Agents and Tool Use| Version 04.10.2024 1
Credits Data and Web Science Group
Outline
Recap: Prompting
Recap: Fine-tuning Methods
Ding, N., et al., 2022. Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models. arXiv
preprint arXiv:2203.06904.
Recap: Evaluating LLMs
• Benchmark-based evaluation
– Format the problem into a prompt and generate a result
– Parse the result and compute standard metrics such as accuracy
– Good for closed-ended evaluation
• Model-based evaluation
– Use an LLM such as GPT-4 as a surrogate for human evaluation
– Shown to achieve high agreement with human evaluators
• Human-based evaluation
– Human evaluators judge the answers of LLMs
    • Pair-wise comparison of two answers from different models
    • Single-answer grading: score a single answer from an LLM
– Good for open-ended evaluation
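The benchmark-based steps above (format, generate, parse, score) can be sketched in a few lines of Python. This is a minimal illustration, not a real benchmark harness: `dummy_model`, `DATASET`, and the multiple-choice prompt layout are assumptions made for the example, and a real setup would call an actual LLM instead.

```python
import re
from typing import Callable, Dict, List, Optional


def format_prompt(question: str, choices: List[str]) -> str:
    """Format a four-way multiple-choice problem as a prompt."""
    letters = "ABCD"
    lines = [f"Question: {question}"]
    lines += [f"{letters[i]}. {choice}" for i, choice in enumerate(choices)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def parse_answer(output: str) -> Optional[str]:
    """Extract the first standalone letter A-D from the model output."""
    match = re.search(r"\b([A-D])\b", output)
    return match.group(1) if match else None


def accuracy(model: Callable[[str], str], dataset: List[Dict]) -> float:
    """Fraction of items for which the parsed answer equals the gold label."""
    correct = 0
    for item in dataset:
        prompt = format_prompt(item["question"], item["choices"])
        if parse_answer(model(prompt)) == item["answer"]:
            correct += 1
    return correct / len(dataset)


def dummy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: always answers B."""
    return "The answer is B."


# Hypothetical two-item benchmark used only for illustration.
DATASET = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5", "6"], "answer": "B"},
    {"question": "Capital of France?",
     "choices": ["Paris", "Rome", "Bonn", "Oslo"], "answer": "A"},
]

score = accuracy(dummy_model, DATASET)
```

The parsing step is deliberately simple; free-form outputs usually need a more robust parser, which is one reason this style of evaluation suits closed-ended tasks.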
What is an Agent?
A brief history of LLM agents
Wang, L., et al., 2024. A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6), p.186345.
Example Task: Question Answering
Supporting LLMs with Tools
Mialon, G., et al., 2023. Augmented Language Models: a Survey. Transactions on Machine Learning Research.
Example: Code Generation for Computational Problems
Knowledge Problems
• Answer knowledge-intensive questions with
– Extra corpora
– A retriever (e.g., BM25 or Dense Passage Retrieval)
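To make the retriever component concrete, here is a minimal sketch of Okapi BM25 scoring over a tiny in-memory corpus. The class name `BM25`, the whitespace tokenizer, the parameter defaults (`k1=1.5`, `b=0.75`), and the three example documents are assumptions for illustration; production systems typically use a search library with proper tokenization.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer (illustrative sketch, not optimized)."""

    def __init__(self, corpus: List[str], k1: float = 1.5, b: float = 0.75):
        self.docs = [tokenize(doc) for doc in corpus]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: in how many documents each term occurs.
        df = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query: str, idx: int) -> float:
        """BM25 score of document idx for the query."""
        tf, dl = self.tf[idx], len(self.docs[idx])
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            f = tf[term]
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query: str, k: int = 1) -> List[int]:
        """Indices of the k highest-scoring documents."""
        ranked = sorted(range(len(self.docs)),
                        key=lambda i: self.score(query, i), reverse=True)
        return ranked[:k]


# Hypothetical extra corpus; the query avoids punctuation because the
# tokenizer above does not strip it.
corpus = [
    "Mannheim is a city in Germany",
    "BM25 is a ranking function used by search engines",
    "Paris is the capital of France",
]
retriever = BM25(corpus)
best = retriever.top_k("capital of France", k=1)
```

The retrieved passage would then be placed into the LLM prompt as additional context for answering the question.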
Teaching LLMs to use Tools
Parisi, A., et al., 2022. TALM: Tool Augmented Language Models. arXiv preprint arXiv:2205.12255.
Schick, T., et al., 2024. Toolformer: Language models can teach themselves to use tools. Advances in Neural
Information Processing Systems, 36.
Tool Usage: General Process
Tool Learning: Tutorial
• Tutorial Learning
– Have a model tuned for tool use read the tool manuals (tutorials) so that it understands the tools' functions and how to invoke them
– Works well with powerful LLMs
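One way to picture tutorial learning is a prompt that contains a generated "manual" for each available tool. The sketch below builds such a manual from Python signatures and docstrings using the standard `inspect` module. The tools `add` and `word_count`, the helper names, and the prompt wording are hypothetical and only serve as an example.

```python
import inspect
from typing import Callable, List


def add(a: float, b: float) -> float:
    """Return the sum of a and b."""
    return a + b


def word_count(text: str) -> int:
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def build_tool_manual(tools: List[Callable]) -> str:
    """Render one manual line per tool: name, signature, and docstring."""
    entries = []
    for tool in tools:
        signature = inspect.signature(tool)
        doc = inspect.getdoc(tool) or "No description."
        entries.append(f"- {tool.__name__}{signature}: {doc}")
    return "\n".join(entries)


def build_prompt(task: str, tools: List[Callable]) -> str:
    """Combine the tool manual and the task into a single prompt string."""
    return (
        "You can use the following tools.\n"
        f"{build_tool_manual(tools)}\n\n"
        f"Task: {task}\n"
        "Respond with a single tool call, for example add(a=1, b=2)."
    )


prompt = build_prompt("How many words are in 'tools help agents'?",
                      [add, word_count])
```

The resulting string would be sent to the LLM, which is expected to reply with a tool call that the surrounding program parses and executes.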
Tool Learning Prompt
Tool Learning: RL
• Reinforcement Learning
– Autonomous exploration and correction of errors based on environmental feedback
– The action space is defined by the available tools
– The agent learns to select the appropriate tool
– Correct actions maximize the reward signal
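The idea of an action space made of tools and a reward for picking the right one can be illustrated with a tiny epsilon-greedy contextual bandit. This is a strongly simplified stand-in for reinforcement learning with an LLM policy: the tool names, the task types, the reward function `environment_reward`, and all hyperparameters are assumptions made up for this sketch.

```python
import random
from typing import Dict, Tuple

TOOLS = ["calculator", "search", "translator"]  # the action space
TASKS = ["math", "fact"]                        # observed task types


def environment_reward(task: str, tool: str) -> float:
    """Hypothetical environment feedback: 1 for the suitable tool, else 0."""
    best = {"math": "calculator", "fact": "search"}
    return 1.0 if best[task] == tool else 0.0


def train(episodes: int = 2000, epsilon: float = 0.1,
          lr: float = 0.1, seed: int = 0) -> Dict[Tuple[str, str], float]:
    """Learn a value estimate for every (task, tool) pair."""
    rng = random.Random(seed)
    q = {(task, tool): 0.0 for task in TASKS for tool in TOOLS}
    for _ in range(episodes):
        task = rng.choice(TASKS)
        if rng.random() < epsilon:
            tool = rng.choice(TOOLS)  # explore a random tool
        else:
            tool = max(TOOLS, key=lambda a: q[(task, a)])  # exploit
        reward = environment_reward(task, tool)
        # Move the estimate toward the observed reward.
        q[(task, tool)] += lr * (reward - q[(task, tool)])
    return q


def greedy_tool(q: Dict[Tuple[str, str], float], task: str) -> str:
    """Tool with the highest learned value for the given task type."""
    return max(TOOLS, key=lambda a: q[(task, a)])


q_values = train()
```

After training, `greedy_tool(q_values, "math")` and `greedy_tool(q_values, "fact")` give the tools the agent has learned to prefer for each task type.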
Tool Learning: Self-supervised
Nakano, R., et al., 2021. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332.
Early Example: WebGPT
What if Both External Knowledge and
Press, O., et al., 2023, December. Measuring and Narrowing the Compositionality Gap in Language Models. In Findings of the Association
for Computational Linguistics: EMNLP 2023 (pp. 5687-5711).
Reasoning OR Acting
The ReAct Paradigm
Yao, S., et al., 2023. ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh
International Conference on Learning Representations.
ReAct is Simple and Intuitive to Use
Zero-shot ReAct Prompt
• Synergy
– Acting supports reasoning
– Reasoning guides acting
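The interplay of reasoning and acting can be sketched as a loop in which the model emits a thought and an action, the program executes the action, and the resulting observation is appended to the context for the next step. The code below is an illustrative sketch only: `scripted_llm` replays fixed outputs instead of calling a model, and the `lookup` tool, its one-entry knowledge base, and the `Action: name[argument]` syntax are assumptions for the example.

```python
import re
from typing import Callable, Optional, Tuple


def lookup(entity: str) -> str:
    """Hypothetical tool: look up an entity in a tiny knowledge base."""
    kb = {"Mannheim": "Mannheim is a city in Germany."}
    return kb.get(entity, "No entry found.")


TOOLS = {"lookup": lookup}

# Scripted model outputs standing in for real LLM generations.
SCRIPT = [
    "Thought: I need to find out where Mannheim is.\n"
    "Action: lookup[Mannheim]",
    "Thought: The observation answers the question.\n"
    "Action: finish[Germany]",
]


def scripted_llm(prompt: str, step: int) -> str:
    """Return the pre-written output for the given step."""
    return SCRIPT[step]


ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")


def react(question: str, llm: Callable[[str, int], str],
          max_steps: int = 5) -> Tuple[Optional[str], str]:
    """Run a thought/action/observation loop until 'finish' is called."""
    transcript = f"Question: {question}\n"
    for step in range(max_steps):
        output = llm(transcript, step)
        transcript += output + "\n"
        match = ACTION_RE.search(output)
        if match is None:
            break  # the model produced no parsable action
        name, argument = match.groups()
        if name == "finish":
            return argument, transcript
        if name in TOOLS:
            observation = TOOLS[name](argument)
        else:
            observation = f"Unknown tool: {name}"
        # Acting supports reasoning: the observation enters the context.
        transcript += f"Observation: {observation}\n"
    return None, transcript


answer, transcript = react("In which country is Mannheim?", scripted_llm)
```

In a real agent, `scripted_llm` would be replaced by a model call that receives the growing transcript, so each new thought can build on earlier observations.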
Converting Tasks to Text
• Many tasks can be turned into natural language for LLM agents
• “LLM grounding”: supplementing the LLM with use-case-specific information, e.g., a data store that is part of a RAG system
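As a small illustration of turning a task into text, the sketch below renders a structured environment state as a natural-language observation and maps a textual command back to a structured action. The state fields (`location`, `visible_objects`, `holding`) and the verb-object command format are hypothetical choices for this example.

```python
from typing import Dict


def describe(state: Dict) -> str:
    """Render a structured environment state as a text observation."""
    items = ", ".join(state["visible_objects"]) or "nothing"
    holding = state["holding"] or "nothing"
    return (f"You are in the {state['location']}. "
            f"You see: {items}. You are holding {holding}.")


def parse_action(text: str) -> Dict[str, str]:
    """Map a textual command such as 'pick apple' to a structured action."""
    verb, _, obj = text.strip().lower().partition(" ")
    return {"verb": verb, "object": obj}


state = {"location": "kitchen",
         "visible_objects": ["apple", "knife"],
         "holding": None}
observation = describe(state)
action = parse_action("Pick apple")
```

With these two functions the LLM only ever reads and writes text, while the environment keeps working with structured data.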
Brohan, A., et al., 2023, March. Do As I Can, Not As I Say: Grounding Language in Robotic Affordances. In Conference on Robot Learning (pp. 287-318). PMLR.
Huang, W., et al., 2023, March. Inner Monologue: Embodied Reasoning through Planning with Language Models. In Conference on Robot Learning (pp. 1769-1782). PMLR.
Acting without Reasoning
ReAct Enables Systematic Exploration
ReAct is general and effective
Yao, S., et al., 2023. ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh
International Conference on Learning Representations.
ReAct vs. Traditional Agents
Unified Framework for LLM-powered Agents
• LLMs pave the way for the use of AI agents to simulate
users and other entities, as well as their interactions
Observation and Action
The “Brain”
Collaboration
Example: Agent creation with OpenAI
Example: Long-term Memory
Long-term Memory for Reflexion
Example: Voyager - Procedural Memory of Skills
Wang, G., et al., 2024. Voyager: An Open-Ended Embodied Agent with Large Language Models. Transactions
on Machine Learning Research.
Multi-Agent Orchestration
• Usually a “Manager” or “Commander” for orchestrating many agents
• Context may be shared or isolated
• Cooperative vs. competitive environments
• Centralized vs. decentralized communication
• Human intervention vs. full automation
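A manager that routes a task through several agents, with either shared or isolated context, can be sketched as follows. The agents `researcher` and `writer` are plain functions standing in for LLM-backed agents, and the `Manager` class with its `share_context` flag is a hypothetical design for illustration, not the API of any particular framework.

```python
from typing import Callable, Dict, List

# An agent takes the task and the context it may see and returns a message.
Agent = Callable[[str, List[str]], str]


def researcher(task: str, context: List[str]) -> str:
    """Stand-in for an LLM agent that gathers notes on the task."""
    return f"notes on {task}"


def writer(task: str, context: List[str]) -> str:
    """Stand-in for an LLM agent that writes from the latest context entry."""
    if context:
        return f"report based on {context[-1]}"
    return "report without notes"


class Manager:
    """Centralized orchestrator that calls agents in a fixed order."""

    def __init__(self, agents: Dict[str, Agent], share_context: bool = True):
        self.agents = agents
        self.share_context = share_context

    def run(self, task: str, plan: List[str]) -> List[str]:
        """Execute the plan and return the messages produced by each agent."""
        messages: List[str] = []
        for name in plan:
            # Shared context: each agent sees all earlier messages.
            # Isolated context: each agent sees only the task.
            visible = list(messages) if self.share_context else []
            messages.append(self.agents[name](task, visible))
        return messages


agents = {"researcher": researcher, "writer": writer}
plan = ["researcher", "writer"]
shared = Manager(agents, share_context=True).run("LLM agents", plan)
isolated = Manager(agents, share_context=False).run("LLM agents", plan)
```

Comparing `shared` and `isolated` shows how the context policy changes what the second agent can build on.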
Example: Retrieval-Augmented QA
• Two agents
• The User Proxy processes documents into a vector store
• The user question and the relevant context are passed to an assistant that generates the answer
• The conversation continues until a satisfactory answer is reached
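The two-agent flow above can be sketched without any framework. Note that this is not the AutoGen API: the `UserProxy` class, the bag-of-words "vector store", the rule-based `assistant` function, and the `UPDATE CONTEXT` sentinel are all assumptions chosen to keep the example self-contained and runnable.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts over lowercase word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class UserProxy:
    """Processes documents into a vector store and retrieves context."""

    def __init__(self, documents: List[str]):
        self.store = [(doc, embed(doc)) for doc in documents]

    def ranked_context(self, question: str) -> List[str]:
        """Documents sorted from most to least similar to the question."""
        q = embed(question)
        ranked = sorted(self.store, key=lambda pair: cosine(q, pair[1]),
                        reverse=True)
        return [doc for doc, _ in ranked]


def assistant(question: str, context: str) -> str:
    """Stand-in for an LLM: answers only if the context mentions the
    last word of the question, otherwise asks for new context."""
    keyword = question.rstrip("?").split()[-1].lower()
    if keyword in context.lower():
        return f"Based on the context: {context}"
    return "UPDATE CONTEXT"


def chat(question: str, proxy: UserProxy, max_turns: int = 3) -> str:
    """Pass context to the assistant until it gives a usable answer."""
    for context in proxy.ranked_context(question)[:max_turns]:
        reply = assistant(question, context)
        if reply != "UPDATE CONTEXT":
            return reply
    return "No satisfactory answer found."


# Hypothetical document collection for illustration.
proxy = UserProxy([
    "Agents can call external tools",
    "BM25 is a sparse retriever",
    "Mannheim is a city in Germany",
])
reply = chat("Where is Mannheim?", proxy)
```

A real system would replace `embed` with a learned embedding model and `assistant` with an LLM call, but the control flow between the two agents stays the same.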
Wu, Q., et al., 2024. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent
Conversation. In ICLR 2024 Workshop on Large Language Model (LLM) Agents.
Example: Decision Making
Summary: LLM Agents
Evaluating (Multi-)Agent Systems
Example: AgentBench
Example: ToolEmu
Ruan, Y., et al., 2024. Identifying the Risks of LM Agents with an LM-Emulated Sandbox. In The
Twelfth International Conference on Learning Representations.
Example: WebShop
Example: WebArena