Autogen Components Guide

AutoGen provides built-in model clients for the ChatCompletion API, including OpenAIChatCompletionClient, AzureOpenAIChatCompletionClient, and AzureAIChatCompletionClient. Each client requires specific setup and authentication methods, such as API keys or Azure Active Directory tokens. Additionally, features like streaming responses, structured output, and caching of model responses are supported to enhance functionality.


Model Clients

AutoGen provides a suite of built-in model clients for using the ChatCompletion API. All model clients implement the ChatCompletionClient protocol class.

Currently there are three built-in model clients:

OpenAIChatCompletionClient

AzureOpenAIChatCompletionClient

AzureAIChatCompletionClient

OpenAI
To use the OpenAIChatCompletionClient, you need to install the openai extra.

# pip install "autogen-ext[openai]"


You also need to provide the API key either through the environment variable
OPENAI_API_KEY or through the api_key argument.

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Create an OpenAI model client.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    # api_key="sk-...", # Optional if you have an API key set in the environment.
)
You can call the create() method to create a chat completion request and await a CreateResult object in return.

# Send a message list to the model and await the response.
messages = [
    UserMessage(content="What is the capital of France?", source="user"),
]
response = await model_client.create(messages=messages)

# Print the response
print(response.content)
The capital of France is Paris.
# Print the response token usage
print(response.usage)
RequestUsage(prompt_tokens=15, completion_tokens=7)
Azure OpenAI
To use the AzureOpenAIChatCompletionClient, you need to provide the deployment ID, the Azure Cognitive Services endpoint, the API version, and the model capabilities. For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.

# pip install "autogen-ext[openai,azure]"


The following code snippet shows how to use AAD authentication. The identity used must be
assigned the Cognitive Services OpenAI User role.

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Create the token provider
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

az_model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="{your-azure-deployment}",
    model="{model-name, such as gpt-4o}",
    api_version="2024-06-01",
    azure_endpoint="https://{your-custom-endpoint}.openai.azure.com/",
    azure_ad_token_provider=token_provider,  # Optional if you choose key-based authentication.
    # api_key="sk-...", # For key-based authentication.
)
Note

See the Azure OpenAI client documentation for how to use the Azure client directly and for more information.

Azure AI Foundry
Azure AI Foundry (previously known as Azure AI Studio) offers models hosted on Azure. To
use those models, you use the AzureAIChatCompletionClient.

You need to install the azure extra to use this client.

# pip install "autogen-ext[openai,azure]"


Below is an example of using this client with the Phi-4 model from GitHub Marketplace.

import os

from autogen_core.models import UserMessage
from autogen_ext.models.azure import AzureAIChatCompletionClient
from azure.core.credentials import AzureKeyCredential

client = AzureAIChatCompletionClient(
    model="Phi-4",
    endpoint="https://models.inference.ai.azure.com",
    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.
    # Create your PAT token by following the instructions here:
    # https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
    model_info={
        "json_output": False,
        "function_calling": False,
        "vision": False,
        "family": "unknown",
    },
)

result = await client.create([UserMessage(content="What is the capital of France?", source="user")])
print(result)
finish_reason='stop' content='The capital of France is Paris.'
usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False
logprobs=None
Ollama
You can use the OpenAIChatCompletionClient to interact with OpenAI-compatible APIs such as Ollama and Gemini (beta). The example below shows how to use a local model running on an Ollama server.

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="llama3.2:latest",
    base_url="http://localhost:11434/v1",
    api_key="placeholder",
    model_info={
        "vision": False,
        "function_calling": True,
        "json_output": False,
        "family": "unknown",
    },
)

response = await model_client.create([UserMessage(content="What is the capital of France?", source="user")])
print(response)
finish_reason='unknown' content='The capital of France is Paris.'
usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False
logprobs=None
Gemini (beta)
The example below shows how to use the Gemini model via the OpenAIChatCompletionClient.

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gemini-1.5-flash-8b",
    # api_key="GEMINI_API_KEY",
)

response = await model_client.create([UserMessage(content="What is the capital of France?", source="user")])
print(response)
finish_reason='stop' content='Paris\n' usage=RequestUsage(prompt_tokens=7,
completion_tokens=2) cached=False logprobs=None thought=None
Semantic Kernel Adapter
The SKChatCompletionAdapter allows you to use Semantic Kernel model clients as a ChatCompletionClient by adapting them to the required interface.

You need to install the relevant provider extras to use this adapter.

The list of extras that can be installed:

semantic-kernel-anthropic: Install this extra to use Anthropic models.

semantic-kernel-google: Install this extra to use Google Gemini models.

semantic-kernel-ollama: Install this extra to use Ollama models.

semantic-kernel-mistralai: Install this extra to use MistralAI models.

semantic-kernel-aws: Install this extra to use AWS models.

semantic-kernel-hugging-face: Install this extra to use Hugging Face models.

For example, to use Anthropic models, you need to install semantic-kernel-anthropic.

# pip install "autogen-ext[semantic-kernel-anthropic]"


To use this adapter, you need to create a Semantic Kernel model client and pass it to the adapter.

For example, to use the Anthropic model:

import os

from autogen_core.models import UserMessage
from autogen_ext.models.semantic_kernel import SKChatCompletionAdapter
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.anthropic import AnthropicChatCompletion, AnthropicChatPromptExecutionSettings
from semantic_kernel.memory.null_memory import NullMemory

sk_client = AnthropicChatCompletion(
    ai_model_id="claude-3-5-sonnet-20241022",
    api_key=os.environ["ANTHROPIC_API_KEY"],
    service_id="my-service-id",  # Optional; for targeting specific services within Semantic Kernel
)
settings = AnthropicChatPromptExecutionSettings(
    temperature=0.2,
)

anthropic_model_client = SKChatCompletionAdapter(
    sk_client, kernel=Kernel(memory=NullMemory()), prompt_settings=settings
)

# Call the model directly.
model_result = await anthropic_model_client.create(
    messages=[UserMessage(content="What is the capital of France?", source="User")]
)
print(model_result)
finish_reason='stop' content='The capital of France is Paris. It is also the largest city in
France and one of the most populous metropolitan areas in Europe.'
usage=RequestUsage(prompt_tokens=0, completion_tokens=0) cached=False
logprobs=None
Read more about the Semantic Kernel Adapter.

Streaming Response
You can use the create_stream() method to create a chat completion request with a streaming response.

messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages)

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)
Streamed responses:
In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon
named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her
scales shimmered with a deep emerald hue, each scale engraved with symbols of lost
wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the
night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the
innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found
warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return,
Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air
seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance,
watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond
marked the beginning of a timeless friendship that spun tales of hope whispered through the
leaves of the ever-verdant forest.

------------

The complete response:


In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon
named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her
scales shimmered with a deep emerald hue, each scale engraved with symbols of lost
wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the
night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the
innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found
warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return,
Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air
seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance,
watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond
marked the beginning of a timeless friendship that spun tales of hope whispered through the
leaves of the ever-verdant forest.

------------

The token usage was:


RequestUsage(prompt_tokens=0, completion_tokens=0)
Note

The last response in the streaming response is always the final response, of type CreateResult.

Note

By default, the usage reported in the streamed response contains zero values.

Comparing the usage returned by the non-streaming model_client.create(messages=messages) with the streaming model_client.create_stream(messages=messages), we see a difference: the non-streaming response returns valid prompt and completion token counts by default, while the streamed response returns zero values by default.

As documented in the OpenAI API reference, an additional parameter, stream_options, can be specified to return valid usage counts. Only set this when using streaming, i.e., when calling create_stream. To enable it, pass extra_create_args={"stream_options": {"include_usage": True}} to create_stream.

Note

While other OpenAI-compatible APIs, such as LiteLLM, also support this option, it is not always guaranteed to be fully supported or correct.

See the example below for how to use the stream_options parameter to return usage.

messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream with usage reporting enabled.
stream = model_client.create_stream(
    messages=messages, extra_create_args={"stream_options": {"include_usage": True}}
)

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)
Streamed responses:
In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember.
Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the
stream over the roar of flames. One misty dawn, a young shepherd stumbled into her
sanctuary, lost and frightened.

Instead of fury, he was met with kindness as Ember extended a wing, guiding him back to
safety. In gratitude, the shepherd visited yearly, bringing tales of his world beyond the
mountains. Over time, a friendship blossomed, binding man and dragon in shared stories
and laughter.

As the years passed, the legend of Ember the gentle-hearted spread far and wide, forever
changing the way dragons were seen in the hearts of many.

------------

The complete response:


In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember.
Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the
stream over the roar of flames. One misty dawn, a young shepherd stumbled into her
sanctuary, lost and frightened.

Instead of fury, he was met with kindness as Ember extended a wing, guiding him back to
safety. In gratitude, the shepherd visited yearly, bringing tales of his world beyond the
mountains. Over time, a friendship blossomed, binding man and dragon in shared stories
and laughter.

As the years passed, the legend of Ember the gentle-hearted spread far and wide, forever
changing the way dragons were seen in the hearts of many.

------------

The token usage was:


RequestUsage(prompt_tokens=17, completion_tokens=146)
Structured Output
Structured output can be enabled by setting the response_format field in OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient to a Pydantic BaseModel class.

Note

Structured output is only available for models that support it, and it requires the model client to support it as well. Currently, the OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient support structured output.

from typing import Literal

from pydantic import BaseModel


# The response format for the agent as a Pydantic base model.
class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


# Create an agent that uses the OpenAI GPT-4o model with the custom response format.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    response_format=AgentResponse,  # type: ignore
)

# Send a message list to the model and await the response.
messages = [
    UserMessage(content="I am happy.", source="user"),
]
response = await model_client.create(messages=messages)
assert isinstance(response.content, str)
parsed_response = AgentResponse.model_validate_json(response.content)
print(parsed_response.thoughts)
print(parsed_response.response)
I'm glad to hear that you're feeling happy! It's such a great emotion that can brighten your
whole day. Is there anything in particular that's bringing you joy today? 😊
happy
You can also use the extra_create_args parameter in the create() method to set the response_format field, so that structured output can be configured for each request.
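
For example, here is a minimal sketch (not from the original example) of configuring structured output for a single request rather than on the client, reusing the AgentResponse model defined above:

# Configure structured output per request via extra_create_args.
model_client = OpenAIChatCompletionClient(model="gpt-4o")
response = await model_client.create(
    messages=[UserMessage(content="I am happy.", source="user")],
    extra_create_args={"response_format": AgentResponse},
)
assert isinstance(response.content, str)
print(AgentResponse.model_validate_json(response.content).response)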

Caching Model Responses

autogen_ext implements ChatCompletionCache, which can wrap any ChatCompletionClient. Using this wrapper avoids incurring token usage when querying the underlying client with the same prompt multiple times.

ChatCompletionCache uses a CacheStore protocol. We have implemented some useful variants of CacheStore, including DiskCacheStore and RedisStore.

Here’s an example of using diskcache for local caching:

# pip install -U "autogen-ext[openai, diskcache]"

import asyncio
import tempfile

from autogen_core.models import UserMessage
from autogen_ext.cache_store.diskcache import DiskCacheStore
from autogen_ext.models.cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache
from autogen_ext.models.openai import OpenAIChatCompletionClient
from diskcache import Cache


async def main() -> None:
    with tempfile.TemporaryDirectory() as tmpdirname:
        # Initialize the original client.
        openai_model_client = OpenAIChatCompletionClient(model="gpt-4o")

        # Then initialize the CacheStore, in this case with diskcache.Cache.
        # You can also use redis like:
        # from autogen_ext.cache_store.redis import RedisStore
        # import redis
        # redis_instance = redis.Redis()
        # cache_store = RedisStore[CHAT_CACHE_VALUE_TYPE](redis_instance)
        cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdirname))
        cache_client = ChatCompletionCache(openai_model_client, cache_store)

        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print response from OpenAI
        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print cached response


asyncio.run(main())
True
Inspecting cache_client.total_usage() (or openai_model_client.total_usage()) before and after a cached response should yield identical counts.

Note that the caching is sensitive to the exact arguments provided to cache_client.create or cache_client.create_stream, so changing the tools or json_output arguments might lead to a cache miss.
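
As a quick check, here is a minimal sketch, continuing the example above, of verifying that a cached response does not add to the token counts:

# Compare total usage before and after a repeated (cached) request.
usage_before = cache_client.total_usage()
response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
usage_after = cache_client.total_usage()

# A cache hit should not consume additional tokens.
assert usage_before.prompt_tokens == usage_after.prompt_tokens
assert usage_before.completion_tokens == usage_after.completion_tokens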

Build an Agent with a Model Client

Let’s create a simple AI agent that can respond to messages using the ChatCompletion API.

from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler
from autogen_core.models import ChatCompletionClient, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class SimpleAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        response = await self._model_client.create(
            self._system_messages + [user_message], cancellation_token=ctx.cancellation_token
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        return Message(content=response.content)
The SimpleAgent class is a subclass of the autogen_core.RoutedAgent class for the convenience of automatically routing messages to the appropriate handlers. It has a single handler, handle_user_message, which handles messages from the user. It uses the ChatCompletionClient to generate a response to the message and then returns the response to the user, following the direct communication model.

Note

The cancellation_token of the type autogen_core.CancellationToken is used to cancel asynchronous operations. It is linked to async calls inside the message handlers and can be used by the caller to cancel the handlers.
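
To illustrate, here is a minimal, hypothetical sketch (not from the original guide) of cancelling an in-flight model call with a CancellationToken; it assumes a model_client such as the OpenAIChatCompletionClient created earlier:

import asyncio

from autogen_core import CancellationToken

token = CancellationToken()
task = asyncio.create_task(
    model_client.create(
        [UserMessage(content="Write a long essay.", source="user")],
        cancellation_token=token,
    )
)
token.cancel()  # Request cancellation of the pending call.
try:
    await task
except asyncio.CancelledError:
    print("The model call was cancelled.")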

# Create the runtime and register the agent.
from autogen_core import AgentId

runtime = SingleThreadedAgentRuntime()
await SimpleAgent.register(
    runtime,
    "simple_agent",
    lambda: SimpleAgent(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
# Send a message to the agent and get the response.
message = Message("Hello, what are some fun things to do in Seattle?")
response = await runtime.send_message(message, AgentId("simple_agent", "default"))
print(response.content)
# Stop the runtime processing messages.
await runtime.stop()
Seattle is a vibrant city with a wide range of activities and attractions. Here are some fun
things to do in Seattle:

1. **Space Needle**: Visit this iconic observation tower for stunning views of the city and
surrounding mountains.

2. **Pike Place Market**: Explore this historic market where you can see the famous fish
toss, buy local produce, and find unique crafts and eateries.

3. **Museum of Pop Culture (MoPOP)**: Dive into the world of contemporary culture, music,
and science fiction at this interactive museum.

4. **Chihuly Garden and Glass**: Marvel at the beautiful glass art installations by artist Dale
Chihuly, located right next to the Space Needle.

5. **Seattle Aquarium**: Discover the diverse marine life of the Pacific Northwest at this
engaging aquarium.

6. **Seattle Art Museum**: Explore a vast collection of art from around the world, including
contemporary and indigenous art.

7. **Kerry Park**: For one of the best views of the Seattle skyline, head to this small park on
Queen Anne Hill.

8. **Ballard Locks**: Watch boats pass through the locks and observe the salmon ladder to
see salmon migrating.

9. **Ferry to Bainbridge Island**: Take a scenic ferry ride across Puget Sound to enjoy
charming shops, restaurants, and beautiful natural scenery.

10. **Olympic Sculpture Park**: Stroll through this outdoor park with large-scale sculptures
and stunning views of the waterfront and mountains.

11. **Underground Tour**: Discover Seattle's history on this quirky tour of the city's
underground passageways in Pioneer Square.

12. **Seattle Waterfront**: Enjoy the shops, restaurants, and attractions along the waterfront,
including the Seattle Great Wheel and the aquarium.

13. **Discovery Park**: Explore the largest green space in Seattle, featuring trails, beaches,
and views of Puget Sound.
14. **Food Tours**: Try out Seattle’s diverse culinary scene, including fresh seafood,
international cuisines, and coffee culture (don’t miss the original Starbucks!).

15. **Attend a Sports Game**: Catch a Seahawks (NFL), Mariners (MLB), or Sounders
(MLS) game for a lively local experience.

Whether you're interested in culture, nature, food, or history, Seattle has something for
everyone to enjoy!
The above SimpleAgent always responds with a fresh context that contains only the system
message and the latest user’s message. We can use model context classes from
autogen_core.model_context to make the agent “remember” previous conversations. See
the Model Context page for more details.

API Keys From Environment Variables

In the examples above, we show that you can provide the API key through the api_key argument. Importantly, the OpenAI and Azure OpenAI clients use the openai package, which will automatically read an API key from the environment variable if one is not provided.

For OpenAI, you can set the OPENAI_API_KEY environment variable.

For Azure OpenAI, you can set the AZURE_OPENAI_API_KEY environment variable.

In addition, for Gemini (Beta), you can set the GEMINI_API_KEY environment variable.

This is a good practice to explore, as it avoids including sensitive API keys in your code.
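
For instance, a minimal sketch (with a placeholder key, for illustration only) of relying on the environment variable rather than the api_key argument:

import os

from autogen_ext.models.openai import OpenAIChatCompletionClient

# Normally you would export this in your shell (e.g. export OPENAI_API_KEY=sk-...);
# the placeholder below only illustrates where the key comes from.
os.environ["OPENAI_API_KEY"] = "sk-..."

# No api_key argument: the underlying openai package reads OPENAI_API_KEY.
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
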
Model Context
A model context supports storage and retrieval of Chat Completion messages. It is always used together with a model client to generate LLM-based responses.

For example, BufferedChatCompletionContext is a most-recent-used (MRU) context that stores the most recent buffer_size number of messages. This is useful to avoid context overflow in many LLMs.

Let’s see an example that uses BufferedChatCompletionContext.

from dataclasses import dataclass

from autogen_core import AgentId, MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_core.models import AssistantMessage, ChatCompletionClient, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class SimpleAgentWithContext(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client
        self._model_context = BufferedChatCompletionContext(buffer_size=5)

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        # Add message to model context.
        await self._model_context.add_message(user_message)
        # Generate a response.
        response = await self._model_client.create(
            self._system_messages + (await self._model_context.get_messages()),
            cancellation_token=ctx.cancellation_token,
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        # Add message to model context.
        await self._model_context.add_message(AssistantMessage(content=response.content, source=self.metadata["type"]))
        return Message(content=response.content)
Now let’s try to ask follow-up questions after the first one.

runtime = SingleThreadedAgentRuntime()
await SimpleAgentWithContext.register(
    runtime,
    "simple_agent_context",
    lambda: SimpleAgentWithContext(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
agent_id = AgentId("simple_agent_context", "default")

# First question.
message = Message("Hello, what are some fun things to do in Seattle?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")
print("-----")

# Second question.
message = Message("What was the first thing you mentioned?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")

# Stop the runtime processing messages.
await runtime.stop()
Question: Hello, what are some fun things to do in Seattle?
Response: Seattle offers a variety of fun activities and attractions. Here are some highlights:

1. **Pike Place Market**: Visit this iconic market to explore local vendors, fresh produce,
artisanal products, and watch the famous fish throwing.

2. **Space Needle**: Take a trip to the observation deck for stunning panoramic views of the
city, Puget Sound, and the surrounding mountains.

3. **Chihuly Garden and Glass**: Marvel at the stunning glass art installations created by
artist Dale Chihuly, located right next to the Space Needle.

4. **Seattle Waterfront**: Enjoy a stroll along the waterfront, visit the Seattle Aquarium, and
take a ferry ride to nearby islands like Bainbridge Island.

5. **Museum of Pop Culture (MoPOP)**: Explore exhibits on music, science fiction, and pop
culture in this architecturally striking building.
6. **Seattle Art Museum (SAM)**: Discover an extensive collection of art from around the
world, including contemporary and Native American art.

7. **Gas Works Park**: Relax in this unique park that features remnants of an old
gasification plant, offering great views of the Seattle skyline and Lake Union.

8. **Discovery Park**: Enjoy nature trails, beaches, and beautiful views of the Puget Sound
and the Olympic Mountains in this large urban park.

9. **Ballard Locks**: Watch boats navigate the locks and see fish swimming upstream during
the salmon migration season.

10. **Fremont Troll**: Check out this quirky public art installation under a bridge in the
Fremont neighborhood.

11. **Underground Tour**: Take an entertaining guided tour through the underground
passages of Pioneer Square to learn about Seattle's history.

12. **Brewery Tours**: Seattle is known for its craft beer scene. Visit local breweries for
tastings and tours.

13. **Seattle Center**: Explore the cultural complex that includes the Space Needle,
MoPOP, and various festivals and events throughout the year.

These are just a few options, and Seattle has something for everyone, whether you're into
outdoor activities, culture, history, or food!
-----
Question: What was the first thing you mentioned?
Response: The first thing I mentioned was **Pike Place Market**. It's an iconic market in
Seattle known for its local vendors, fresh produce, artisanal products, and the famous fish
throwing by the fishmongers. It's a vibrant place full of sights, sounds, and delicious food.
From the second response, you can see that the agent can now recall its own previous responses.
Tools
Tools are code that can be executed by an agent to perform actions. A tool can be a simple
function such as a calculator, or an API call to a third-party service such as stock price
lookup or weather forecast. In the context of AI agents, tools are designed to be executed by
agents in response to model-generated function calls.

AutoGen provides the autogen_core.tools module with a suite of built-in tools and utilities for
creating and running custom tools.

Built-in Tools
One of the built-in tools is the PythonCodeExecutionTool, which allows agents to execute
Python code snippets.

Here is how you create the tool and use it.

from autogen_core import CancellationToken
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_ext.tools.code_execution import PythonCodeExecutionTool

# Create the tool.
code_executor = DockerCommandLineCodeExecutor()
await code_executor.start()
code_execution_tool = PythonCodeExecutionTool(code_executor)
cancellation_token = CancellationToken()

# Use the tool directly without an agent.
code = "print('Hello, world!')"
result = await code_execution_tool.run_json({"code": code}, cancellation_token)
print(code_execution_tool.return_value_as_string(result))
Hello, world!
The DockerCommandLineCodeExecutor class is a built-in code executor that runs Python code snippets in a subprocess in the command-line environment of a Docker container. The PythonCodeExecutionTool class wraps the code executor and provides a simple interface to execute Python code snippets.

Examples of other built-in tools

LocalSearchTool and GlobalSearchTool for using GraphRAG.

mcp_server_tools for using Model Context Protocol (MCP) servers as tools.

HttpTool for making HTTP requests to REST APIs.

LangChainToolAdapter for using LangChain tools.

Custom Function Tools


A tool can also be a simple Python function that performs a specific action. To create a
custom function tool, you just need to create a Python function and use the FunctionTool
class to wrap it.

The FunctionTool class uses descriptions and type annotations to inform the LLM when and
how to use a given function. The description provides context about the function’s purpose
and intended use cases, while type annotations inform the LLM about the expected
parameters and return type.

For example, a simple tool to obtain the stock price of a company might look like this:

import random

from autogen_core import CancellationToken
from autogen_core.tools import FunctionTool
from typing_extensions import Annotated


async def get_stock_price(ticker: str, date: Annotated[str, "Date in YYYY/MM/DD"]) -> float:
    # Returns a random stock price for demonstration purposes.
    return random.uniform(10, 200)


# Create a function tool.
stock_price_tool = FunctionTool(get_stock_price, description="Get the stock price.")

# Run the tool.
cancellation_token = CancellationToken()
result = await stock_price_tool.run_json({"ticker": "AAPL", "date": "2021/01/01"}, cancellation_token)

# Print the result.
print(stock_price_tool.return_value_as_string(result))
36.63801673457121
Calling Tools with Model Clients
Model clients can generate tool calls when they are provided with a list of tools.

Here is an example of how to use the FunctionTool class with an OpenAIChatCompletionClient. Other model client classes can be used in a similar way. See Model Clients for more details.

import json

from autogen_core.models import AssistantMessage, FunctionExecutionResult, FunctionExecutionResultMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Create the OpenAI chat completion client. Uses the OPENAI_API_KEY environment variable.
client = OpenAIChatCompletionClient(model="gpt-4o-mini")

# Create a user message.
user_message = UserMessage(content="What is the stock price of AAPL on 2021/01/01?", source="user")

# Run the chat completion with the stock_price_tool defined above.
cancellation_token = CancellationToken()
create_result = await client.create(
    messages=[user_message], tools=[stock_price_tool], cancellation_token=cancellation_token
)
create_result.content
[FunctionCall(id='call_tpJ5J1Xoxi84Sw4v0scH0qBM',
arguments='{"ticker":"AAPL","date":"2021/01/01"}', name='get_stock_price')]
The result is a list of FunctionCall objects, which can be used to run the corresponding tools.

arguments = json.loads(create_result.content[0].arguments)  # type: ignore
tool_result = await stock_price_tool.run_json(arguments, cancellation_token)
tool_result_str = stock_price_tool.return_value_as_string(tool_result)
tool_result_str
'32.381250753393104'
Now you can make another model client call to have the model generate a reflection on the
result of the tool execution.

The result of the tool call is wrapped in a FunctionExecutionResult object, which contains the result of the tool execution and the ID of the function call it corresponds to. The model client can use this information to generate a reflection on the result of the tool execution.

# Create a function execution result.
exec_result = FunctionExecutionResult(call_id=create_result.content[0].id, content=tool_result_str, is_error=False)  # type: ignore

# Make another chat completion with the history and function execution result message.
messages = [
    user_message,
    AssistantMessage(content=create_result.content, source="assistant"),  # assistant message with tool call
    FunctionExecutionResultMessage(content=[exec_result]),  # function execution result message
]
create_result = await client.create(messages=messages, cancellation_token=cancellation_token)  # type: ignore
print(create_result.content)
The stock price of AAPL (Apple Inc.) on January 1, 2021, was approximately $32.38.
Tool-Equipped Agent
Putting the model client and the tools together, you can create a tool-equipped agent that
can use tools to perform actions, and reflect on the results of those actions.

import asyncio
import json
from dataclasses import dataclass
from typing import List

from autogen_core import (
    AgentId,
    CancellationToken,
    FunctionCall,
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    message_handler,
)
from autogen_core.models import (
    AssistantMessage,
    ChatCompletionClient,
    FunctionExecutionResult,
    FunctionExecutionResultMessage,
    LLMMessage,
    SystemMessage,
    UserMessage,
)
from autogen_core.tools import FunctionTool, Tool
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class ToolUseAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient, tools: List[Tool]) -> None:
        super().__init__("An agent with tools")
        self._system_messages: List[LLMMessage] = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client
        self._tools = tools

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Create a session of messages.
        session: List[LLMMessage] = self._system_messages + [UserMessage(content=message.content, source="user")]

        # Run the chat completion with the tools.
        create_result = await self._model_client.create(
            messages=session,
            tools=self._tools,
            cancellation_token=ctx.cancellation_token,
        )

        # If there are no tool calls, return the result.
        if isinstance(create_result.content, str):
            return Message(content=create_result.content)
        assert isinstance(create_result.content, list) and all(
            isinstance(call, FunctionCall) for call in create_result.content
        )

        # Add the first model create result to the session.
        session.append(AssistantMessage(content=create_result.content, source="assistant"))

        # Execute the tool calls.
        results = await asyncio.gather(
            *[self._execute_tool_call(call, ctx.cancellation_token) for call in create_result.content]
        )

        # Add the function execution results to the session.
        session.append(FunctionExecutionResultMessage(content=results))

        # Run the chat completion again to reflect on the history and function execution results.
        create_result = await self._model_client.create(
            messages=session,
            cancellation_token=ctx.cancellation_token,
        )
        assert isinstance(create_result.content, str)

        # Return the result as a message.
        return Message(content=create_result.content)

    async def _execute_tool_call(
        self, call: FunctionCall, cancellation_token: CancellationToken
    ) -> FunctionExecutionResult:
        # Find the tool by name.
        tool = next((tool for tool in self._tools if tool.name == call.name), None)
        assert tool is not None

        # Run the tool and capture the result.
        try:
            arguments = json.loads(call.arguments)
            result = await tool.run_json(arguments, cancellation_token)
            return FunctionExecutionResult(call_id=call.id, content=tool.return_value_as_string(result), is_error=False)
        except Exception as e:
            return FunctionExecutionResult(call_id=call.id, content=str(e), is_error=True)
When handling a user message, the ToolUseAgent class first uses the model client to generate a list of function calls to the tools, then runs the tools and generates a reflection on the results of the tool execution. The reflection is then returned to the user as the agent’s response.

To run the agent, let’s create a runtime and register the agent with the runtime.

# Create a runtime.
runtime = SingleThreadedAgentRuntime()
# Create the tools.
tools: List[Tool] = [FunctionTool(get_stock_price, description="Get the stock price.")]
# Register the agents.
await ToolUseAgent.register(
    runtime,
    "tool_use_agent",
    lambda: ToolUseAgent(
        OpenAIChatCompletionClient(model="gpt-4o-mini"),
        tools,
    ),
)
AgentType(type='tool_use_agent')
This example uses the OpenAIChatCompletionClient; for Azure OpenAI and other clients, see Model Clients. Let’s test the agent with a question about stock price.

# Start processing messages.
runtime.start()
# Send a direct message to the tool agent.
tool_use_agent = AgentId("tool_use_agent", "default")
response = await runtime.send_message(Message("What is the stock price of NVDA on 2024/06/01?"), tool_use_agent)
print(response.content)
# Stop processing messages.
await runtime.stop()
The stock price of NVIDIA (NVDA) on June 1, 2024, was approximately $140.05.
Command Line Code Executors
Command line code execution is the simplest form of code execution. Generally speaking, it will save each code block to a file and then execute that file. This means that each code block is executed in a new process. There are two forms of this executor:

Docker (DockerCommandLineCodeExecutor) - this is where all commands are executed in a Docker container

Local (LocalCommandLineCodeExecutor) - this is where all commands are executed on the host machine

Docker
Note

To use DockerCommandLineCodeExecutor, ensure the autogen-ext[docker] package is installed. For more details, see the Packages Documentation.

The DockerCommandLineCodeExecutor will create a Docker container and run all commands within that container. The default image that is used is python:3-slim; this can be customized by passing the image parameter to the constructor. If the image is not found locally, the class will try to pull it. Therefore, having built the image locally is enough. The only thing required for this image to be compatible with the executor is to have sh and python installed. Consequently, creating a custom image is a simple and effective way to ensure required system dependencies are available.

You can use the executor as a context manager to ensure the container is cleaned up after use. Otherwise, the atexit module will be used to stop the container when the program exits.
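
For instance, here is a minimal sketch (with a hypothetical image name) of pointing the executor at a custom image that already contains your system dependencies:

from pathlib import Path

from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

work_dir = Path("coding")
work_dir.mkdir(exist_ok=True)

# "my-custom-python:latest" is a hypothetical locally built image; it only needs sh and python installed.
async with DockerCommandLineCodeExecutor(image="my-custom-python:latest", work_dir=work_dir) as executor:
    ...  # Execute code blocks as shown in the Example below.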

Inspecting the container


If you wish to keep the container around after AutoGen is finished using it for whatever
reason (e.g. to inspect the container), then you can set the auto_remove parameter to False
when creating the executor. stop_container can also be set to False to prevent the container
from being stopped at the end of the execution.

Example
from pathlib import Path

from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

work_dir = Path("coding")
work_dir.mkdir(exist_ok=True)

async with DockerCommandLineCodeExecutor(work_dir=work_dir) as executor:  # type: ignore
    print(
        await executor.execute_code_blocks(
            code_blocks=[
                CodeBlock(language="python", code="print('Hello, World!')"),
            ],
            cancellation_token=CancellationToken(),
        )
    )
CommandLineCodeResult(exit_code=0, output='Hello, World!\n',
code_file='coding/tmp_code_07da107bb575cc4e02b0e1d6d99cc204.python')
Combining an Application in Docker with a Docker based executor
It is desirable to bundle your application into a Docker image. But then, how do you allow
your containerised application to execute code in a different container?

The recommended approach to this is called “Docker out of Docker”, where the Docker
socket is mounted to the main AutoGen container, so that it can spawn and control “sibling”
containers on the host. This is better than what is called “Docker in Docker”, where the main
container runs a Docker daemon and spawns containers within itself. You can read more
about this here.

To do this you would need to mount the Docker socket into the container running your
application. This can be done by adding the following to the docker run command:

-v /var/run/docker.sock:/var/run/docker.sock
This will allow your application’s container to spawn and control sibling containers on the
host.

If you need to bind a working directory to the application’s container but the directory belongs to your host machine, use the bind_dir parameter. This will allow the application’s container to bind the host directory to the newly spawned containers and allow them to access the files within that directory. If bind_dir is not specified, it will fall back to work_dir. A sketch of this is shown below.
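
For illustration, a minimal sketch (with hypothetical paths) of configuring bind_dir when the application itself runs in a container with the Docker socket mounted:

from pathlib import Path

from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

# Hypothetical paths: work_dir is the path inside the application's container,
# bind_dir is the corresponding directory on the host that sibling containers can mount.
executor = DockerCommandLineCodeExecutor(
    work_dir=Path("/app/coding"),
    bind_dir=Path("/home/user/my-app/coding"),
)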

Local
Attention

The local version will run code on your local system. Use it with caution.

To execute code on the host machine, as in the machine running your application,
LocalCommandLineCodeExecutor can be used.

Example
from pathlib import Path

from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor

work_dir = Path("coding")
work_dir.mkdir(exist_ok=True)
local_executor = LocalCommandLineCodeExecutor(work_dir=work_dir)
print(
    await local_executor.execute_code_blocks(
        code_blocks=[
            CodeBlock(language="python", code="print('Hello, World!')"),
        ],
        cancellation_token=CancellationToken(),
    )
)
CommandLineCodeResult(exit_code=0, output='Hello, World!\n',
code_file='/home/ekzhu/agnext/python/packages/autogen-core/docs/src/guides/coding/tmp_code_07da107bb575cc4e02b0e1d6d99cc204.py')
Local within a Virtual Environment
If you want the code to run within a virtual environment created as part of the application’s
setup, you can specify a directory for the newly created environment and pass its context to
LocalCommandLineCodeExecutor. This setup allows the executor to use the specified
virtual environment consistently throughout the application’s lifetime, ensuring isolated
dependencies and a controlled runtime environment.

import venv
from pathlib import Path

from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor

work_dir = Path("coding")
work_dir.mkdir(exist_ok=True)

venv_dir = work_dir / ".venv"
venv_builder = venv.EnvBuilder(with_pip=True)
venv_builder.create(venv_dir)
venv_context = venv_builder.ensure_directories(venv_dir)

local_executor = LocalCommandLineCodeExecutor(work_dir=work_dir, virtual_env_context=venv_context)
await local_executor.execute_code_blocks(
    code_blocks=[
        CodeBlock(language="bash", code="pip install matplotlib"),
    ],
    cancellation_token=CancellationToken(),
)
CommandLineCodeResult(exit_code=0, output='', code_file='/Users/gziz/Dev/autogen/python/packages/autogen-core/docs/src/user-guide/core-user-guide/framework/coding/tmp_code_d2a7db48799db3cc785156a11a38822a45c19f3956f02ec69b92e4169ecbf2ca.bash')

As we can see, the code has executed successfully, and the installation has been isolated to the newly created virtual environment, without affecting our global environment.
