Execution Runtime

Purpose and Scope

The Execution Runtime is the orchestration layer that coordinates agent invocations, manages session state, provides service dependencies, and streams events back to clients. This page documents the core execution infrastructure: Runner, InvocationContext, session management, and the event system.

For detailed information about specific subsystems:

Runner orchestration and InvocationContext lifecycle: see 2.2.1
Session management, state propagation, and persistence: see 2.2.2
Event structure, EventActions, and event compaction: see 2.2.3
Artifact and memory service integration: see 2.2.4

Core Components

The execution runtime consists of four primary components that work together to execute agent invocations:

Component	Class	Responsibility
Orchestrator	`Runner`	Coordinates the entire execution lifecycle, from receiving user input to streaming events
Execution Context	`InvocationContext`	Provides access to services, session state, and execution configuration during agent runs
Service Layer	`BaseSessionService`, `BaseArtifactService`, `BaseMemoryService`	Manages persistence, file storage, and semantic search
Plugin System	`PluginManager`	Provides lifecycle hooks for observability, testing, and custom behavior

Sources: core/src/main/java/com/google/adk/runner/Runner.java:66-67, core/src/main/java/com/google/adk/agents/InvocationContext.java:41-42

Runner

The Runner class is the main entry point for executing agents. It is configured with:

A root BaseAgent that defines the agent hierarchy
An appName identifying the application
Service implementations for session, artifact, and memory management
A list of BasePlugin instances for lifecycle hooks
Configuration for resumability and event compaction

Diagram: Runner Component Dependencies

The Runner is typically constructed using the builder pattern (see Builder Pattern Usage).

Sources: core/src/main/java/com/google/adk/runner/Runner.java:77-159, core/src/main/java/com/google/adk/runner/Runner.java:274-296

InvocationContext

The InvocationContext encapsulates all state and services needed during a single agent invocation. It includes:

Service References: sessionService, artifactService, memoryService, pluginManager
Execution State: invocationId, session, agent, branch
Request Information: userContent, runConfig
Live Mode Support: liveRequestQueue, activeStreamingTools
Cost Management: Tracks LLM call count and enforces limits

The context is immutable during execution but can be copied with modifications using the builder pattern (toBuilder()).

The InvocationContext also manages cost control through an internal InvocationCostManager that tracks the number of LLM calls and enforces limits specified in RunConfig.
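The copy-with-modifications idea can be illustrated with a minimal sketch. `ContextSketch` and its three fields are hypothetical stand-ins chosen for illustration; the real InvocationContext carries many more fields and its builder method names may differ.

```java
// Hypothetical, simplified stand-in for InvocationContext: three fields, no services.
final class ContextSketch {
    private final String invocationId;
    private final String agentName;
    private final String branch; // may be null

    private ContextSketch(Builder b) {
        this.invocationId = b.invocationId;
        this.agentName = b.agentName;
        this.branch = b.branch;
    }

    String invocationId() { return invocationId; }
    String agentName() { return agentName; }
    String branch() { return branch; }

    static Builder builder() { return new Builder(); }

    // Copies every field into a fresh builder; the original instance is never mutated.
    Builder toBuilder() {
        return new Builder().invocationId(invocationId).agentName(agentName).branch(branch);
    }

    static final class Builder {
        private String invocationId;
        private String agentName;
        private String branch;

        Builder invocationId(String v) { this.invocationId = v; return this; }
        Builder agentName(String v) { this.agentName = v; return this; }
        Builder branch(String v) { this.branch = v; return this; }
        ContextSketch build() { return new ContextSketch(this); }
    }
}
```

A caller that hands work to a sub-agent would derive a new context, for example `root.toBuilder().agentName("billing").branch("root.billing").build()`, and the `root` instance stays unchanged.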

Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:42-86, core/src/main/java/com/google/adk/agents/InvocationContext.java:411-648, core/src/main/java/com/google/adk/agents/InvocationContext.java:379-409

Execution Modes

The runtime supports two execution modes:

Mode	Method	Use Case	Characteristics
Invocation-Based	`runAsync()`	Standard request-response interactions	User sends a message with optional state delta, agent processes it, returns events. Each call is a separate invocation against the persisted session.
Live/Streaming	`runLive()`	Real-time audio/video conversations	Continuous bidirectional streaming with `LiveRequestQueue`. Agent, tools, and client exchange content during a single long-lived connection.

Invocation-Based Execution

Invocation-based execution (runAsync()) follows this pattern:

Client calls Runner.runAsync(userId, sessionId, newMessage, runConfig, stateDelta)
Runner retrieves the session from BaseSessionService
Runner creates an initial InvocationContext with the user content
Runner invokes onUserMessageCallback to allow plugins to transform the message
Runner appends the user message to the session via appendNewMessageToSession(), which also merges any provided stateDelta into the session state
Runner retrieves the updated session (with state applied)
Runner determines which agent should handle the request using findAgentToRun()
Runner creates a new InvocationContext with the updated session and selected agent
Runner invokes beforeRunCallback; if a plugin returns early content, that content is streamed and execution ends
Agent execution begins via agent.runAsync(invocationContext)
Events are streamed back to the client as they are generated
Each event is persisted to the session via sessionService.appendEvent()
Plugin hooks fire for each event via onEventCallback
After completion, afterRunCallback is invoked
Optional event compaction occurs via compactEvents()
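The ordering of these steps can be sketched with string-based stand-ins. Everything in the block below (`PluginSketch`, `SessionSketch`, `RunFlowSketch`) is hypothetical and synchronous, whereas the real Runner streams events reactively. Whether early content from beforeRunCallback is persisted, and whether afterRunCallback still fires in that case, is not stated above; the sketch assumes neither happens.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical plugin hooks; an empty Optional means "no opinion".
interface PluginSketch {
    default Optional<String> onUserMessage(String message) { return Optional.empty(); }
    default Optional<String> beforeRun() { return Optional.empty(); }
    default Optional<String> onEvent(String event) { return Optional.empty(); }
    default void afterRun() {}
}

final class SessionSketch {
    final List<String> events = new ArrayList<>();
    final Map<String, Object> state = new HashMap<>();
}

final class RunFlowSketch {
    static List<String> run(SessionSketch session, String userMessage, Map<String, Object> stateDelta,
                            List<PluginSketch> plugins, Function<SessionSketch, List<String>> agent) {
        // Plugins may transform the user message before it enters the session.
        String message = userMessage;
        for (PluginSketch p : plugins) {
            message = p.onUserMessage(message).orElse(message);
        }
        // Append the message and merge the optional state delta.
        session.events.add("user: " + message);
        if (stateDelta != null) session.state.putAll(stateDelta);

        List<String> emitted = new ArrayList<>();
        // beforeRun: the first plugin that returns content short-circuits the agent.
        for (PluginSketch p : plugins) {
            Optional<String> early = p.beforeRun();
            if (early.isPresent()) {
                emitted.add(early.get());
                return emitted;
            }
        }
        // Persist each agent event, let plugins rewrite what the client sees, then emit it.
        for (String event : agent.apply(session)) {
            session.events.add(event);
            String out = event;
            for (PluginSketch p : plugins) out = p.onEvent(out).orElse(out);
            emitted.add(out);
        }
        for (PluginSketch p : plugins) p.afterRun();
        return emitted;
    }
}
```

Agent selection and compaction are left out here to keep the hook ordering visible.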

Sources: core/src/main/java/com/google/adk/runner/Runner.java:371-487, core/src/main/java/com/google/adk/runner/Runner.java:489-545, core/src/main/java/com/google/adk/runner/Runner.java:304-351, core/src/main/java/com/google/adk/agents/BaseAgent.java:238-284

Live/Streaming Execution

Live execution (runLive()) is designed for continuous bidirectional streaming interactions such as audio conversations:

Client establishes a connection and provides a LiveRequestQueue
Runner creates an InvocationContext configured for live mode via newInvocationContextForLive()
If response modalities are not specified, the runner defaults to AUDIO modality with transcription enabled
For LlmAgent instances, the runner identifies streaming-capable tools (those with LiveRequestQueue parameters) and registers them in the activeStreamingTools map
Agent execution begins via agent.runLive(invocationContext)
The agent can receive new content through liveRequestQueue during execution
Streaming tools can output results asynchronously through their own LiveRequestQueue instances
Events are appended to the session in real-time via sessionService.appendEvent()
The LiveRequestQueue allows bidirectional communication between the client, agent, and streaming tools

The runner uses addActiveStreamingTools() to scan tool lists for FunctionTool instances that accept LiveRequestQueue parameters, creating separate queue instances for each streaming tool.
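Detecting a streaming-capable tool by its parameter types can be sketched with plain Java reflection. `LiveQueueSketch`, `ToolsSketch`, and `StreamingToolScan` are hypothetical names; how the real addActiveStreamingTools() unwraps a FunctionTool to reach the underlying method is not shown on this page and is not modelled here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical placeholder for the real LiveRequestQueue type.
final class LiveQueueSketch {}

// Hypothetical tool holder: one streaming method and one ordinary method.
final class ToolsSketch {
    public static String monitorStock(String symbol, LiveQueueSketch input) { return symbol; }
    public static String lookupPrice(String symbol) { return symbol; }
}

final class StreamingToolScan {
    // Creates one fresh queue per method that declares a LiveQueueSketch parameter,
    // keyed by method name. TreeMap keeps the result order deterministic.
    static Map<String, LiveQueueSketch> scan(Class<?> toolClass) {
        Map<String, LiveQueueSketch> active = new TreeMap<>();
        for (Method m : toolClass.getDeclaredMethods()) {
            boolean streaming = Arrays.asList(m.getParameterTypes()).contains(LiveQueueSketch.class);
            if (streaming) active.put(m.getName(), new LiveQueueSketch());
        }
        return active;
    }
}
```

Calling `StreamingToolScan.scan(ToolsSketch.class)` yields a map with a single entry for `monitorStock`; `lookupPrice` is skipped because it has no queue parameter.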

Sources: core/src/main/java/com/google/adk/runner/Runner.java:611-659, core/src/main/java/com/google/adk/runner/Runner.java:667-680, core/src/main/java/com/google/adk/runner/Runner.java:564-591, core/src/main/java/com/google/adk/runner/Runner.java:748-763

Execution Flow

Diagram: Invocation-Based Execution Flow with State Delta

Sources: core/src/main/java/com/google/adk/runner/Runner.java:371-487, core/src/main/java/com/google/adk/runner/Runner.java:489-545, core/src/main/java/com/google/adk/runner/Runner.java:304-351

Service Architecture

The runtime provides a unified service architecture through InvocationContext. All services are accessible to agents, tools, and plugins during execution.

Diagram: Service Architecture and Implementations

Service Access Patterns

Services are accessed through InvocationContext methods:

Service	Access Method	Purpose
Session Service	`invocationContext.sessionService()`	Retrieve/update session state and history
Artifact Service	`invocationContext.artifactService()`	Save/retrieve files associated with sessions
Memory Service	`invocationContext.memoryService()`	Perform semantic search over stored content
Plugin Manager	`invocationContext.pluginManager()`	Access lifecycle hooks

The session object itself is also available via invocationContext.session(), providing direct access to:

session.state(): Mutable key-value map for session state
session.events(): Immutable list of historical events
session.userId(), session.id(): Session identifiers

The session's mutable state map (session.state()) can be updated through two mechanisms:

State Delta: Passed to runAsync() as a parameter, merged into the session before agent execution begins
EventActions.stateDelta(): Attached to events during execution to propagate state changes
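The two update paths can be contrasted in a small sketch. `EventSketch` and `StatefulSession` are hypothetical. Applying an event's delta at the moment the event is appended is an assumption made for illustration, not something this page confirms about the session service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event carrying only an author and a state delta.
record EventSketch(String author, Map<String, Object> stateDelta) {}

final class StatefulSession {
    private final Map<String, Object> state = new ConcurrentHashMap<>();
    private final List<EventSketch> events = new ArrayList<>();

    Map<String, Object> state() { return state; }
    List<EventSketch> events() { return List.copyOf(events); }

    // Mechanism 1: a delta supplied with the request, merged before the agent runs.
    void applyRequestDelta(Map<String, Object> delta) { state.putAll(delta); }

    // Mechanism 2: a delta attached to an event, merged when the event is appended (assumed).
    void appendEvent(EventSketch event) {
        events.add(event);
        state.putAll(event.stateDelta());
    }
}
```

In this sketch later writes win: an event delta that sets a key already provided by the request delta overwrites it.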

Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:227-307, core/src/test/java/com/google/adk/runner/RunnerTest.java:455-613

Agent Selection

The runtime uses findAgentToRun() to determine which agent in a hierarchy should handle an incoming request. This algorithm examines the session history to find the most recently active agent that can continue the conversation.

The selection logic:

Retrieves session.events() and walks them from most recent to oldest
For each non-user event, identifies the agent by name (event author)
If the author matches rootAgent.name(), returns the root agent
Otherwise, calls rootAgent.findSubAgent(author) to locate the agent in the tree
If found, checks if the agent is "transferable" using isTransferableAcrossAgentTree()
An agent is transferable if it and all its parents are LlmAgent instances and none have disallowTransferToParent() set
Returns the first transferable agent found, or the root agent if none qualify

Diagram: Agent Selection Algorithm

This algorithm enables agent hierarchies where sub-agents can be automatically resumed based on conversation context. For more details on agent transfer, see 5.3.
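The selection logic above can be sketched as follows. `AgentNode`, `HistoryEvent`, and `AgentSelectionSketch` are hypothetical types: the `llm` flag stands in for "is an LlmAgent instance", and the literal author name "user" is assumed to mark user events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical agent node in a tree with parent links.
final class AgentNode {
    final String name;
    final boolean llm;
    final boolean disallowTransferToParent;
    AgentNode parent;
    final List<AgentNode> children = new ArrayList<>();

    AgentNode(String name, boolean llm, boolean disallowTransferToParent) {
        this.name = name;
        this.llm = llm;
        this.disallowTransferToParent = disallowTransferToParent;
    }

    AgentNode add(AgentNode child) { child.parent = this; children.add(child); return this; }

    // Depth-first search below this node.
    Optional<AgentNode> findSubAgent(String target) {
        for (AgentNode c : children) {
            if (c.name.equals(target)) return Optional.of(c);
            Optional<AgentNode> hit = c.findSubAgent(target);
            if (hit.isPresent()) return hit;
        }
        return Optional.empty();
    }
}

record HistoryEvent(String author) {}

final class AgentSelectionSketch {
    // Transferable: the agent and every ancestor is an LLM agent without disallowTransferToParent.
    static boolean isTransferable(AgentNode agent) {
        for (AgentNode a = agent; a != null; a = a.parent) {
            if (!a.llm || a.disallowTransferToParent) return false;
        }
        return true;
    }

    static AgentNode findAgentToRun(AgentNode root, List<HistoryEvent> events) {
        for (int i = events.size() - 1; i >= 0; i--) { // newest first
            String author = events.get(i).author();
            if (author.equals("user")) continue;
            if (author.equals(root.name)) return root;
            Optional<AgentNode> agent = root.findSubAgent(author);
            if (agent.isPresent() && isTransferable(agent.get())) return agent.get();
        }
        return root; // nothing qualified
    }
}
```

When the most recent non-user author is not transferable, the loop keeps walking back through the history instead of falling straight back to the root.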

Sources: core/src/main/java/com/google/adk/runner/Runner.java:720-746, core/src/main/java/com/google/adk/runner/Runner.java:698-714

Plugin Integration

The runtime integrates PluginManager at multiple lifecycle stages. Plugins can:

Transform user messages before they enter the session
Provide early responses that bypass agent execution
Intercept and modify events before they reach clients
Trigger cleanup or post-processing after invocations complete

Diagram: Plugin Lifecycle Hook Points in Execution

For comprehensive plugin system documentation, see 2.6.

Sources: core/src/main/java/com/google/adk/runner/Runner.java:440-442, core/src/main/java/com/google/adk/runner/Runner.java:505-517, core/src/main/java/com/google/adk/runner/Runner.java:528-536, core/src/main/java/com/google/adk/runner/Runner.java:543

Resumability and Long-Running Operations

The runtime supports pausing and resuming invocations for long-running operations. When configured with ResumabilityConfig, the system can:

Detect long-running tool calls (marked by BaseTool.longRunning() returning true)
Populate longRunningToolIds in the event containing the function calls
Pause execution after such tools are invoked
Resume execution when tool results are available

The InvocationContext provides methods to check resumability:

isResumable(): Returns whether the current invocation can be paused/resumed (checks resumabilityConfig)
shouldPauseInvocation(event): Determines if execution should pause after this event by checking if any function call IDs match the longRunningToolIds set

Diagram: Resumability Decision Flow

Long-running tool IDs are populated in BaseLlmFlow.buildModelResponseEvent() by calling Functions.getLongRunningFunctionCalls(), which checks each tool's longRunning() flag.
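The pause check described above reduces to a set-membership test. The sketch below uses a hypothetical `ToolCallEvent` record, and it assumes that a non-resumable invocation never pauses; the real shouldPauseInvocation(event) may apply further conditions.

```java
import java.util.List;
import java.util.Set;

// Hypothetical event: IDs of the function calls it contains, plus the subset flagged long-running.
record ToolCallEvent(List<String> functionCallIds, Set<String> longRunningToolIds) {}

final class PauseCheckSketch {
    static boolean shouldPause(boolean resumable, ToolCallEvent event) {
        if (!resumable) return false; // assumption: only resumable invocations pause
        // Pause as soon as any function call in the event is marked long-running.
        return event.functionCallIds().stream().anyMatch(event.longRunningToolIds()::contains);
    }
}
```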

For details on resumability patterns and LongRunningFunctionTool, see 5.2.

Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:358-377, core/src/main/java/com/google/adk/flows/llmflows/BaseLlmFlow.java:632-641, core/src/main/java/com/google/adk/flows/llmflows/Functions.java:360-373

Event Compaction

The runtime can optionally compact session history to reduce storage and improve performance. When configured with EventsCompactionConfig, the system uses a sliding window approach implemented by SlidingWindowEventCompactor:

Configuration Parameters:

compactionInterval: Number of new user invocations that trigger compaction
overlapSize: Number of preceding invocations to include from the last compacted range (maintains context)
summarizer: A BaseEventSummarizer implementation (typically LlmEventSummarizer) to generate summaries

Compaction Process:

After afterRunCallback completes, Runner.compactEvents() is invoked
SlidingWindowEventCompactor.compact() scans events in reverse chronological order
It identifies uncompacted invocations by comparing event timestamps to the last EventCompaction.endTimestamp()
When compactionInterval invocations are detected, it includes overlapSize additional invocations from the previous range
The selected events are passed to BaseEventSummarizer.summarizeEvents()
LlmEventSummarizer formats events as conversation history and calls the LLM with a summarization prompt
A new Event with EventActions.compaction() is created containing the summary
The compaction event is appended to the session via sessionService.appendEvent()

The compaction event contains:

EventCompaction.startTimestamp(): First event timestamp in the compacted range
EventCompaction.endTimestamp(): Last event timestamp in the compacted range
EventCompaction.compactedContent(): The LLM-generated summary as Content
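How compactionInterval and overlapSize interact can be sketched as a window selection over invocations. This is a simplified illustration and not the SlidingWindowEventCompactor implementation: it scans forward instead of in reverse, assumes events arrive in chronological order, assumes compactionInterval is at least 1, and stops before the summarization step. `TimedEvent`, `CompactionRange`, and `SlidingWindowSketch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

record TimedEvent(String invocationId, long timestamp) {}
record CompactionRange(long startTimestamp, long endTimestamp, List<TimedEvent> events) {}

final class SlidingWindowSketch {
    static Optional<CompactionRange> select(List<TimedEvent> events, long lastCompactedEnd,
                                            int compactionInterval, int overlapSize) {
        // Latest timestamp per invocation, in first-seen (chronological) order.
        Map<String, Long> latest = new LinkedHashMap<>();
        for (TimedEvent e : events) latest.merge(e.invocationId(), e.timestamp(), Long::max);

        // Skip invocations that ended at or before the last compacted range.
        List<String> ids = new ArrayList<>(latest.keySet());
        int firstNew = 0;
        while (firstNew < ids.size() && latest.get(ids.get(firstNew)) <= lastCompactedEnd) firstNew++;
        if (ids.size() - firstNew < compactionInterval) return Optional.empty();

        // Reach back overlapSize invocations into the previously compacted range.
        int from = Math.max(0, firstNew - overlapSize);
        List<String> window = ids.subList(from, ids.size());
        List<TimedEvent> selected = events.stream()
                .filter(e -> window.contains(e.invocationId()))
                .toList();
        long start = selected.get(0).timestamp();
        long end = selected.get(selected.size() - 1).timestamp();
        return Optional.of(new CompactionRange(start, end, selected));
    }
}
```

With four invocations of which the first two are already compacted, `compactionInterval = 2` and `overlapSize = 1` select the last three invocations, so the summary of the new range re-reads one invocation of earlier context.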

For implementation details, see 5.4.

Sources: core/src/main/java/com/google/adk/runner/Runner.java:547-552, core/src/main/java/com/google/adk/runner/Runner.java:766-784, core/src/main/java/com/google/adk/summarizer/SlidingWindowEventCompactor.java:100-110, core/src/main/java/com/google/adk/summarizer/LlmEventSummarizer.java:60-103, core/src/main/java/com/google/adk/summarizer/EventsCompactionConfig.java:21-37

Cost Management

The runtime tracks and enforces limits on LLM calls per invocation through an internal InvocationCostManager embedded in InvocationContext. This prevents runaway costs from infinite loops or excessive tool usage.

Call Tracking Mechanism:

InvocationContext.incrementLlmCallsCount() is called before each LLM invocation in BaseLlmFlow.runOneStep()
The internal InvocationCostManager.incrementAndEnforceLlmCallsLimit() increments a counter
If RunConfig.maxLlmCalls() is set and the counter exceeds this value, an LlmCallsLimitExceededException is thrown
The exception propagates up, terminating the invocation and returning an error to the client

The InvocationCostManager is created per invocation context and its state is not shared across contexts, ensuring accurate per-invocation tracking.
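The increment-then-enforce pattern can be sketched in a few lines. `CostManagerSketch` and `LlmCallsLimitExceeded` are hypothetical stand-ins. Treating a limit of zero or less as "no limit", and making the exception unchecked, are assumptions made for this sketch.

```java
// Hypothetical unchecked exception modelled on the description above.
final class LlmCallsLimitExceeded extends RuntimeException {
    LlmCallsLimitExceeded(String message) { super(message); }
}

// One instance per invocation, so counts never leak between invocations.
final class CostManagerSketch {
    private final int maxLlmCalls; // assumption: a value <= 0 means "no limit"
    private int calls;

    CostManagerSketch(int maxLlmCalls) { this.maxLlmCalls = maxLlmCalls; }

    // Called before each LLM request: count first, then check the limit.
    void incrementAndEnforce() {
        calls++;
        if (maxLlmCalls > 0 && calls > maxLlmCalls) {
            throw new LlmCallsLimitExceeded("LLM call limit of " + maxLlmCalls + " exceeded");
        }
    }

    int calls() { return calls; }
}
```

With a limit of 2, the first two calls succeed and the third throws before any request would be sent.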

Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:353-355, core/src/main/java/com/google/adk/agents/InvocationContext.java:379-392, core/src/main/java/com/google/adk/flows/llmflows/BaseLlmFlow.java:293-298

Builder Pattern Usage

Both Runner and InvocationContext use builders to ensure proper construction with required parameters:

Runner Builder Example:

InvocationContext Builder Example:

The builders validate required parameters at build time, throwing IllegalStateException if essential components are missing.
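Build-time validation of this kind might look like the sketch below. `RunnerSketch` is a hypothetical class, not the real Runner builder: the setter names and the choice of `agent` and `appName` as the required parameters are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a runner-like object with two required fields.
final class RunnerSketch {
    final String rootAgentName;
    final String appName;
    final List<String> pluginNames;

    private RunnerSketch(String rootAgentName, String appName, List<String> pluginNames) {
        this.rootAgentName = rootAgentName;
        this.appName = appName;
        this.pluginNames = List.copyOf(pluginNames);
    }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String rootAgentName;
        private String appName;
        private final List<String> pluginNames = new ArrayList<>();

        Builder agent(String name) { this.rootAgentName = name; return this; }
        Builder appName(String name) { this.appName = name; return this; }
        Builder plugin(String name) { this.pluginNames.add(name); return this; }

        RunnerSketch build() {
            // Required parameters are checked once, at build time.
            if (rootAgentName == null) throw new IllegalStateException("agent must be set");
            if (appName == null) throw new IllegalStateException("appName must be set");
            return new RunnerSketch(rootAgentName, appName, pluginNames);
        }
    }
}
```

Calling `RunnerSketch.builder().appName("demo").build()` fails with an IllegalStateException because no agent was supplied, while optional parts such as plugins can be omitted.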

Sources: core/src/main/java/com/google/adk/runner/Runner.java:136-158, core/src/main/java/com/google/adk/agents/InvocationContext.java:632-639
