0% found this document useful (0 votes)

43 views8 pages

Introducing Transformers Agents 20

medium.com-Introducing Transformers Agents 20

Uploaded by

Uc Ngô

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

43 views8 pages

Introducing Transformers Agents 20

medium.com-Introducing Transformers Agents 20

Uploaded by

Uc Ngô

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 8

Introducing Transformers Agents 2.

0
medium.com/@amanatulla1606/introducing-transformers-agents-2-0-14a5601ade0b

Amanatullah 25 tháng 5, 2024

Amanatullah

What is an agent?
Large Language Models (LLMs) can tackle a wide range of tasks, but they often struggle
with specific tasks like logic, calculation, and search. When prompted in these domains in
which they do not perform well, they frequently fail to generate a correct answer.

One approach to overcome this weakness is to create an agent, which is just a program
driven by an LLM. The agent is empowered by tools to help it perform actions. When the
agent needs a specific skill to solve a particular problem, it relies on an appropriate tool
from its toolbox.

Thus when during problem-solving the agent needs a specific skill, it can just rely on an
appropriate tool from its toolbox.

1/8
Experimentally, agent frameworks generally work very well, achieving state-of-the-art
performance on several benchmarks.

The Transformers Agents approach

Building agent workflows is complex, and we feel these systems need a lot of clarity and
modularity. HF launched Transformers Agents one year ago, and they doubling down on
core design goals.

Framework strives for:

Clarity through simplicity: reduce abstractions to the minimum. Simple error logs
and accessible attributes let you easily inspect what’s happening and give you more
clarity.
Modularity: prefer to propose building blocks rather than full, complex feature sets.
You are free to choose whatever building blocks are best for your project.
For instance, since any agent system is just a vehicle powered by an LLM engine,
they decided to conceptually separate the two, which lets you create any agent type
from any underlying LLM.

On top of that, they have sharing features that let you build on the shoulders of giants!

Main elements
Tool: this is the class that lets you use a tool or implement a new one. It is
composed mainly of a callable forward method that executes the tool action, and a
set of a few essential attributes: name, descriptions, inputs and output_type.
These attributes are used to dynamically generate a usage manual for the tool and
insert it into the LLM’s prompt.
Toolbox: It's a set of tools that are provided to an agent as resources to solve a
particular task. For performance reasons, tools in a toolbox are already instantiated
and ready to go. This is because some tools take time to initialize, so it’s usually
better to re-use an existing toolbox and just swap one tool, rather than re-building a
set of tools from scratch at each agent initialization.
CodeAgent: a very simple agent that generates its actions as one single blob of
Python code. It will not be able to iterate on previous observations.
ReactAgent: ReAct agents follow a cycle of Thought ⇒ Action ⇒ Observation until
they’ve solve the task. We propose two classes of ReactAgent:
ReactCodeAgent generates its actions as python blobs.
ReactJsonAgent generates its actions as JSON blobs.

Check out the documentation to learn how to use each component!

How do agents work under the hood?

In essence, what an agent does is “allowing an LLM to use tools”. Agents have a key
agent.run() method that:

2/8
Provides information about tool usage to your LLM in a specific prompt. This way,
the LLM can select tools to run to solve the task.
Parses the tool calls from the LLM output (can be via code, JSON format, or any
other format).
Executes the calls.
If the agent is designed to iterate on previous outputs, it keeps a memory with
previous tool calls and observations. This memory can be more or less fine-grained
depending on how long-term you want it to be.

Example use cases

In order to get access to the early access of this feature, please first install transformers
from its main branch:

pip install

Self-correcting Retrieval-Augmented-Generation
Quick definition: Retrieval-Augmented-Generation (RAG) is “using an LLM to answer a
user query, but basing the answer on information retrieved from a knowledge base”. It has
many advantages over using a vanilla or fine-tuned LLM: to name a few, it allows to
ground the answer on true facts and reduce confabulations, it allows to provide the LLM
with domain-specific knowledge, and it allows fine-grained control of access to
information from the knowledge base.

3/8
Let’s say we want to perform RAG, and some parameters must be dynamically
generated. For example, depending on the user query we could want to restrict the
search to specific subsets of the knowledge base, or we could want to adjust the number
of documents retrieved. The difficulty is: how to dynamically adjust these parameters
based on the user query?

Well, we can do this by giving our agent an access to these parameters!

Let’s setup this system.

Tun the line below to install required dependancies:

pip install langchain sentence-transformers faiss-cpu langchain

langchain_community datasets

We first load a knowledge base on which we want to perform RAG: this dataset is a
compilation of the documentation pages for many huggingface packages, stored as
markdown.

datasetsknowledge_base = datasets.load_dataset(, split=)

Now we prepare the knowledge base by processing the dataset and storing it into a
vector database to be used by the retriever. We are going to use LangChain, since it
features excellent utilities for vector databases:

from langchain.docstore.document import Document

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

source_docs = [
Document(
page_content=doc["text"], metadata={"source": doc["source"].split("/")[1]}
) for doc in knowledge_base
]

docs_processed =
RecursiveCharacterTextSplitter(chunk_size=500).split_documents(source_docs)[:1000]

embedding_model = HuggingFaceEmbeddings(model_name=)vectordb =
FAISS.from_documents( documents=docs_processed, embedding=embedding_model)

Now that we have the database ready, let’s build a RAG system that answers user
queries based on it!

We want our system to select only from the most relevant sources of information,
depending on the query.

4/8
Our documentation pages come from the following sources:

all_sources = (([doc.metadata[] doc docs_processed]))(all_sources)

👉 Let us build our RAG system as an agent that will be free to choose its sources!
We create a retriever tool that the agent can call with the parameters of its choice:

5/8
import json
from transformers.agents import Tool
from langchain_core.vectorstores import VectorStore

classRetrieverTool(Tool):
name = "retriever"
description = "Retrieves some documents from the knowledge base that have the
closest embeddings to the input query."
inputs = {
"query": {
"type": "text",
"description": "The query to perform. This should be semantically close to your
target documents. Use the affirmative form rather than a question.",
},
"source": {
"type": "text",
"description": ""
},
}
output_type = "text"

definit(self, vectordb: VectorStore, all_sources: , **kwargs):

super().__init__(**kwargs)
self.vectordb = vectordb
self.inputs["source"]["description"] = (
f"The source of the documents to search, as a str representation of a list.
Possible values in the list are: . If this argument is not provided, all sources
will be searched."
)

defforward(self, query: , source: = ) -> str:

assertisinstance(query, str), "Your search query must be a string"

if source:
ifisinstance(source, str) and"["notinstr(source): # if the source is not
representing a list
source = [source]
source = json.loads(str(source).replace("'", '"'))

docs = self.vectordb.similarity_search(query, filter=({"source": source}

if source elseNone), k=3)

(docs) == : + .join( [doc.page_content

doc docs] )

Now it’s straightforward to create an agent that leverages this tool!

The agent will need these arguments upon initialization:

6/8
tools: a list of tools that the agent will be able to call.
llm_engine: the LLM that powers the agent.

Our llm_engine must be a callable that takes as input a list of messages and returns text.
It also needs to accept a stop_sequences argument that indicates when to stop its
generation. For convenience, we directly use the HfEngine class provided in the package
to get a LLM engine that calls our Inference API.

from transformers.agents import HfEngine, ReactJsonAgent

llm_engine = HfEngine("meta-llama/Meta-Llama-3-70B-Instruct")

agent = ReactJsonAgent(
tools=[RetrieverTool(vectordb, all_sources)],
llm_engine=llm_engine
)

agent_output = agent.run("Please show me a LORA finetuning script")

()(agent_output)

Since we initialized the agent as a ReactJsonAgent, it has been automatically given a

default system prompt that tells the LLM engine to process step-by-step and generate
tool calls as JSON blobs (you could replace this prompt template with your own as
needed).

Then when its .run() method is launched, the agent takes care of calling the LLM
engine, parsing the tool call JSON blobs and executing these tool calls, all in a loop that
ends only when the final answer is provided.

Using a simple multi-agent setup 🤝 for efficient web browsing

In this example, we want to build an agent and test it on the GAIA benchmark (Mialon et
al. 2023). GAIA is an extremely difficult benchmark, with most questions requiring several
steps of reasoning using different tools. A specifically difficult requirement is to have a
powerful web browser, able to navigate to pages with specific constraints: discovering
pages using the website’s inner navigation, selecting specific articles in time…

Web browsing requires diving deeper into subpages and scrolling through lots of text
tokens that will not be necessary for the higher-level task-solving. We assign the web-
browsing sub-tasks to a specialized web surfer agent. We provide it with some tools to
browse the web and a specific prompt (check the repo to find specific implementations).

Defining these tools is outside the scope of this post: but you can check the repository to
find specific implementations.

7/8
from transformers.agents import ReactJsonAgent, HfEngine

WEB_TOOLS = [
SearchInformationTool(),
NavigationalSearchTool(),
VisitTool(),
DownloadTool(),
PageUpTool(),
PageDownTool(),
FinderTool(),
FindNextTool(),
]

websurfer_llm_engine = HfEngine(
model="CohereForAI/c4ai-command-r-plus"
) # We choose Command-R+ for its high context length

websurfer_agent = ReactJsonAgent( tools=WEB_TOOLS,

llm_engine=websurfer_llm_engine,)

To allow this agent to be called by a higher-level task solving agent, we can simply
encapsulate it in another tool:

classSearchTool(Tool):
name = "ask_search_agent"
description = "A search agent that will browse the internet to answer a
question. Use it to gather informations, not for problem-solving."

inputs = {
"question": {
"description": "Your question, as a natural language sentence. You are talking to
an agent, so provide them with as much context as possible.",
"type": "text",
}
}
output_type = "text"

() -> : websurfer_agent.run(question)

from transformers.agents import ReactCodeAgent

llm_engine = HfEngine(model=)react_agent_hf = ReactCodeAgent( tools=

[SearchTool()], llm_engine=llm_engine,)

8/8

Curious Moon
No ratings yet
Curious Moon
386 pages
LangChain - Chat With Your Data
No ratings yet
LangChain - Chat With Your Data
32 pages
Firehose DG
No ratings yet
Firehose DG
146 pages
Online Pet Shop Management System
No ratings yet
Online Pet Shop Management System
49 pages
Building Intelligent Agents with Google ADK
From Everand
Building Intelligent Agents with Google ADK
Amulya Rattan Bhatia
No ratings yet
Group 22
No ratings yet
Group 22
24 pages
Developing, Managing and Using Customer-Related Database
No ratings yet
Developing, Managing and Using Customer-Related Database
18 pages
1st Round Exit Exam Tutorial 2016
No ratings yet
1st Round Exit Exam Tutorial 2016
2 pages
Notes Big Data
No ratings yet
Notes Big Data
106 pages
Univeftsm Kebangsaan Malaysia: Peperlksaan Akhir Semester I Sesi Akademik 2019 - 2020 Ijazah Sarjana Muda Dengan Kepujian
No ratings yet
Univeftsm Kebangsaan Malaysia: Peperlksaan Akhir Semester I Sesi Akademik 2019 - 2020 Ijazah Sarjana Muda Dengan Kepujian
13 pages
Database System: Nasreen Akhtar Fast-Nu Chiniot-Faisalabad Campus
No ratings yet
Database System: Nasreen Akhtar Fast-Nu Chiniot-Faisalabad Campus
39 pages
Pallavi Model School, Alwal: A Project REPORT On
No ratings yet
Pallavi Model School, Alwal: A Project REPORT On
22 pages
The Complete Servicenow System Administrator Course: Section 5 - Tables & Fields
No ratings yet
The Complete Servicenow System Administrator Course: Section 5 - Tables & Fields
23 pages
VB Advanced Exercises
No ratings yet
VB Advanced Exercises
66 pages
Administration of Veritas Backup Exec™ 21 Sample Exam
No ratings yet
Administration of Veritas Backup Exec™ 21 Sample Exam
6 pages
API Security Checklist
No ratings yet
API Security Checklist
3 pages
M6
No ratings yet
M6
7 pages
IMS-DEDB Alter
No ratings yet
IMS-DEDB Alter
25 pages
Cloud Computing MCQ All Unit
No ratings yet
Cloud Computing MCQ All Unit
25 pages
Shopping Cart System Report
No ratings yet
Shopping Cart System Report
42 pages
2022 ICT Essay (Part B) - English
No ratings yet
2022 ICT Essay (Part B) - English
6 pages
Vi Sem Bca Practical List
No ratings yet
Vi Sem Bca Practical List
14 pages
Sri Harsha Java - Dev 2
No ratings yet
Sri Harsha Java - Dev 2
4 pages
Multi Language Page Creation
No ratings yet
Multi Language Page Creation
4 pages
Generative AI Lifecycle Patterns. Part 2 - Maturing GenAI - Patterns - by Ali Arsanjani - Sep, 2023 - Medium
No ratings yet
Generative AI Lifecycle Patterns. Part 2 - Maturing GenAI - Patterns - by Ali Arsanjani - Sep, 2023 - Medium
24 pages
Rishikesh Benke CV PDF
No ratings yet
Rishikesh Benke CV PDF
2 pages
LangChain From 0 To 1 Public 1 PpuSgEN
No ratings yet
LangChain From 0 To 1 Public 1 PpuSgEN
39 pages
Practical RAG
No ratings yet
Practical RAG
127 pages
LangChain Talk
No ratings yet
LangChain Talk
35 pages
1681096122
No ratings yet
1681096122
35 pages
What Is Relational Database?
No ratings yet
What Is Relational Database?
3 pages
LlamaIndex Talk (W&B Fully Connected 2024)
No ratings yet
LlamaIndex Talk (W&B Fully Connected 2024)
38 pages
TAW10 Test
No ratings yet
TAW10 Test
11 pages
LLM Prcess
No ratings yet
LLM Prcess
7 pages
LlamaIndex Talk (Data + AI Summit 2024)
No ratings yet
LlamaIndex Talk (Data + AI Summit 2024)
58 pages
ReactAgent LangChain Documentation
No ratings yet
ReactAgent LangChain Documentation
4 pages
Unit IV - Multiway Trees (B-Tree) at CSJMU - 6 Slides Handouts
No ratings yet
Unit IV - Multiway Trees (B-Tree) at CSJMU - 6 Slides Handouts
4 pages
Tactiq Free Transcript AC3h KzLARo
No ratings yet
Tactiq Free Transcript AC3h KzLARo
33 pages
Langchain Onepager
No ratings yet
Langchain Onepager
1 page
Academic Research Assistance 1716570959
No ratings yet
Academic Research Assistance 1716570959
13 pages
Llama3, LangGraph and Elasticsearch - Build A Local Agent For Vector Search - Search Labs
100% (2)
Llama3, LangGraph and Elasticsearch - Build A Local Agent For Vector Search - Search Labs
48 pages
Module#3 - L17 - Information Retrieval Using Agents & Tools
No ratings yet
Module#3 - L17 - Information Retrieval Using Agents & Tools
33 pages
Function Calling at Edge
No ratings yet
Function Calling at Edge
9 pages
Building RAG Apps
No ratings yet
Building RAG Apps
32 pages
Lan Graph
No ratings yet
Lan Graph
7 pages
Building A Hybrid Rag System With Pydanticai and Mongodb: Creating An Ai Agent For Tech News Search and Retrieval
No ratings yet
Building A Hybrid Rag System With Pydanticai and Mongodb: Creating An Ai Agent For Tech News Search and Retrieval
37 pages
RAI AI Engineer Intern Assignments
No ratings yet
RAI AI Engineer Intern Assignments
3 pages
Fine-Tuned Vs RAG Short Notes ?
No ratings yet
Fine-Tuned Vs RAG Short Notes ?
25 pages
Brolly AI - Generative AI - Online Training
No ratings yet
Brolly AI - Generative AI - Online Training
13 pages
Agents in LangChain
100% (2)
Agents in LangChain
11 pages
Agentic RAG - Removed
No ratings yet
Agentic RAG - Removed
9 pages
14 Key Skills To Master Large Language Models 1729745509
No ratings yet
14 Key Skills To Master Large Language Models 1729745509
17 pages
Langchain App Design
No ratings yet
Langchain App Design
7 pages
First Course in Statistics 11th Edition McClave Solutions Manual Download
100% (2)
First Course in Statistics 11th Edition McClave Solutions Manual Download
59 pages
Trending Repositories On GitHub This Month
No ratings yet
Trending Repositories On GitHub This Month
3 pages
Agentic AI Pioneer Program - Curriculum
No ratings yet
Agentic AI Pioneer Program - Curriculum
9 pages
Agentic Ai s1
No ratings yet
Agentic Ai s1
14 pages
Generative Adversarial Networks
No ratings yet
Generative Adversarial Networks
43 pages
Hadoop in Action Chuck Lam Download
No ratings yet
Hadoop in Action Chuck Lam Download
56 pages
Tools For Agents
No ratings yet
Tools For Agents
4 pages
Gen Project
No ratings yet
Gen Project
7 pages
DLI RAG Slides
No ratings yet
DLI RAG Slides
183 pages
Internship Report Hamas Khan
No ratings yet
Internship Report Hamas Khan
24 pages
GenAI PDF
No ratings yet
GenAI PDF
34 pages
Ai Agents
No ratings yet
Ai Agents
1 page
Building AI Agents With Autogen - Workshop
No ratings yet
Building AI Agents With Autogen - Workshop
49 pages
Agents in Langchain
No ratings yet
Agents in Langchain
6 pages
Step 2 Ai Agents
No ratings yet
Step 2 Ai Agents
1 page
Code Agents
No ratings yet
Code Agents
24 pages
Ai Agents Cheat Sheet
No ratings yet
Ai Agents Cheat Sheet
1 page
Swift Programming Simplified: A Practical Guide with Examples
From Everand
Swift Programming Simplified: A Practical Guide with Examples
William E. Clark
No ratings yet
Lexicon of Programming Terminology: Lexicon of Tech and Business, #17
From Everand
Lexicon of Programming Terminology: Lexicon of Tech and Business, #17
Mustafa Al-Dori
5/5 (1)
Getting Started with Model Context Protocol (MCP): A Beginner’s Guide to Building Structured AI Agent Systems
From Everand
Getting Started with Model Context Protocol (MCP): A Beginner’s Guide to Building Structured AI Agent Systems
Eron Valdric
No ratings yet
Mastering Data Structures and Algorithms in Python & Java
From Everand
Mastering Data Structures and Algorithms in Python & Java
Sachin Naha
No ratings yet
Mastering Node.js Web Development: Go on a comprehensive journey from the fundamentals to advanced web development with Node.js
From Everand
Mastering Node.js Web Development: Go on a comprehensive journey from the fundamentals to advanced web development with Node.js
Adam Freeman
No ratings yet
Prompt to Profit: AI Patterns That Give Solo Builders an Unfair Advantage
From Everand
Prompt to Profit: AI Patterns That Give Solo Builders an Unfair Advantage
Lucas Merritt
No ratings yet
Inspiring Powershell Articles
From Everand
Inspiring Powershell Articles
Murat Yildirimoglu
No ratings yet
Spring Boot Intermediate Microservices: Resilient Microservices with Spring Boot 2 and Spring Cloud
From Everand
Spring Boot Intermediate Microservices: Resilient Microservices with Spring Boot 2 and Spring Cloud
Jens Boje
No ratings yet
Java: Tips and Tricks to Programming Code with Java
From Everand
Java: Tips and Tricks to Programming Code with Java
Charlie Masterson
No ratings yet
The Definitive Guide to PowerShell
From Everand
The Definitive Guide to PowerShell
Wesley Dunne
No ratings yet
Salesforce Developer Interview Questions: 1.0, #1
From Everand
Salesforce Developer Interview Questions: 1.0, #1
SFDC TELUGU
No ratings yet
Java: Tips and Tricks to Programming Code with Java: Java Computer Programming, #2
From Everand
Java: Tips and Tricks to Programming Code with Java: Java Computer Programming, #2
Charlie Masterson
No ratings yet
Python Advanced Programming: The Guide to Learn Python Programming. Reference with Exercises and Samples About Dynamical Programming, Multithreading, Multiprocessing, Debugging, Testing and More
From Everand
Python Advanced Programming: The Guide to Learn Python Programming. Reference with Exercises and Samples About Dynamical Programming, Multithreading, Multiprocessing, Debugging, Testing and More
Marcus Richards
No ratings yet
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
From Everand
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
César Pérez López
No ratings yet
Basics with Windows Powershell
From Everand
Basics with Windows Powershell
Prometheus MMS
No ratings yet
JavaScript Introduction
From Everand
JavaScript Introduction
Lisa Saldivar
No ratings yet
Getting Started With Quick Test Professional (QTP) And Descriptive Programming
From Everand
Getting Started With Quick Test Professional (QTP) And Descriptive Programming
Gaurav Garg
4.5/5 (2)
Java™ Programming: A Complete Project Lifecycle Guide
From Everand
Java™ Programming: A Complete Project Lifecycle Guide
Nitin Shreyakar
No ratings yet
Creating add-ons for Blender
From Everand
Creating add-ons for Blender
Michel Anders
5/5 (1)
Dataflow and Reactive Programming Systems
From Everand
Dataflow and Reactive Programming Systems
Matt Carkci
No ratings yet

Introducing Transformers Agents 20

Uploaded by

Introducing Transformers Agents 20

Uploaded by

Introducing Transformers Agents 2.

Amanatullah 25 tháng 5, 2024

The Transformers Agents approach

Framework strives for:

Check out the documentation to learn how to use each component!

How do agents work under the hood?

Example use cases

Well, we can do this by giving our agent an access to these parameters!

Let’s setup this system.

Tun the line below to install required dependancies:

pip install langchain sentence-transformers faiss-cpu langchain

datasetsknowledge_base = datasets.load_dataset(, split=)

from langchain.docstore.document import Document

all_sources = (([doc.metadata[] doc docs_processed]))(all_sources)

def__init__(self, vectordb: VectorStore, all_sources: , **kwargs):

defforward(self, query: , source: = ) -> str:

docs = self.vectordb.similarity_search(query, filter=({"source": source}

(docs) == : + .join( [doc.page_content

Now it’s straightforward to create an agent that leverages this tool!

The agent will need these arguments upon initialization:

from transformers.agents import HfEngine, ReactJsonAgent

agent_output = agent.run("Please show me a LORA finetuning script")

Since we initialized the agent as a ReactJsonAgent, it has been automatically given a

Using a simple multi-agent setup 🤝 for efficient web browsing

websurfer_agent = ReactJsonAgent( tools=WEB_TOOLS,

from transformers.agents import ReactCodeAgent

llm_engine = HfEngine(model=)react_agent_hf = ReactCodeAgent( tools=

You might also like

definit(self, vectordb: VectorStore, all_sources: , **kwargs):