The Execution Runtime is the orchestration layer that coordinates agent invocations, manages session state, provides service dependencies, and streams events back to clients. This page documents the core execution infrastructure including Runner, InvocationContext, session management, and the event system.
The execution runtime consists of four primary components that work together to execute agent invocations:
| Component | Class | Responsibility |
|---|---|---|
| Orchestrator | Runner | Coordinates the entire execution lifecycle, from receiving user input to streaming events |
| Execution Context | InvocationContext | Provides access to services, session state, and execution configuration during agent runs |
| Service Layer | BaseSessionService, BaseArtifactService, BaseMemoryService | Manages persistence, file storage, and semantic search |
| Plugin System | PluginManager | Provides lifecycle hooks for observability, testing, and custom behavior |
Sources: core/src/main/java/com/google/adk/runner/Runner.java:66-67, core/src/main/java/com/google/adk/agents/InvocationContext.java:41-42
The Runner class is the main entry point for executing agents. It is configured with:
- A BaseAgent that defines the agent hierarchy
- An appName identifying the application
- BasePlugin instances for lifecycle hooks

Diagram: Runner Component Dependencies
The Runner is typically constructed using the builder pattern.
Sources: core/src/main/java/com/google/adk/runner/Runner.java:77-159, core/src/main/java/com/google/adk/runner/Runner.java:274-296
The InvocationContext encapsulates all state and services needed during a single agent invocation. It includes:
- Services: sessionService, artifactService, memoryService, pluginManager
- Invocation identity and state: invocationId, session, agent, branch
- Input and configuration: userContent, runConfig
- Live streaming: liveRequestQueue, activeStreamingTools

The context is immutable during execution but can be copied with modifications using the builder pattern (toBuilder()).
The InvocationContext also manages cost control through an internal InvocationCostManager that tracks the number of LLM calls and enforces limits specified in RunConfig.
Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:42-86, core/src/main/java/com/google/adk/agents/InvocationContext.java:411-648, core/src/main/java/com/google/adk/agents/InvocationContext.java:379-409
The runtime supports two execution modes:
| Mode | Method | Use Case | Characteristics |
|---|---|---|---|
| Invocation-Based | runAsync() | Standard request-response interactions | User sends a message with optional state delta, agent processes it, returns events. Each invocation is independent. |
| Live/Streaming | runLive() | Real-time audio/video conversations | Continuous bidirectional streaming with LiveRequestQueue. Agent, tools, and client exchange content during a single long-lived connection. |
Invocation-based execution (runAsync()) follows this pattern:
1. The client calls Runner.runAsync(userId, sessionId, newMessage, runConfig, stateDelta)
2. The runner retrieves the session from BaseSessionService
3. It creates an initial InvocationContext with the user content
4. It invokes onUserMessageCallback to allow plugins to transform the message
5. It appends the message through appendNewMessageToSession(), which also merges any provided stateDelta into the session state
6. It selects the agent to run with findAgentToRun()
7. It creates a new InvocationContext with the updated session and selected agent
8. It invokes beforeRunCallback; if a plugin returns early content, that content is streamed and execution ends
9. Otherwise it calls agent.runAsync(invocationContext)
10. Each event is persisted through sessionService.appendEvent()
11. Each event is passed to onEventCallback
12. Once the agent finishes, afterRunCallback is invoked
13. If compaction is configured, the runner calls compactEvents()

Sources: core/src/main/java/com/google/adk/runner/Runner.java:371-487, core/src/main/java/com/google/adk/runner/Runner.java:489-545, core/src/main/java/com/google/adk/runner/Runner.java:304-351, core/src/main/java/com/google/adk/agents/BaseAgent.java:238-284
Live execution (runLive()) is designed for continuous bidirectional streaming interactions such as audio conversations:
1. The client supplies a LiveRequestQueue
2. The runner creates an InvocationContext configured for live mode via newInvocationContextForLive()
3. The live run configuration uses the AUDIO modality with transcription enabled
4. For LlmAgent instances, the runner identifies streaming-capable tools (those with LiveRequestQueue parameters) and registers them in the activeStreamingTools map
5. The runner calls agent.runLive(invocationContext)
6. Content flows through the liveRequestQueue during execution
7. Streaming tools work with their own LiveRequestQueue instances
8. Events are persisted through sessionService.appendEvent()

The LiveRequestQueue allows bidirectional communication between the client, agent, and streaming tools.

The runner uses addActiveStreamingTools() to scan tool lists for FunctionTool instances that accept LiveRequestQueue parameters, creating a separate queue instance for each streaming tool.
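The scan for streaming-capable tools can be pictured with plain reflection. The following is a minimal sketch, not the runner's actual code: LiveRequestQueueStub, ExampleTools, and StreamingToolScanSketch are hypothetical stand-ins, and the assumption is that a tool counts as streaming-capable when one of its method parameters has the queue type.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the runtime's queue type; not the ADK class. */
class LiveRequestQueueStub {}

/** Example tool holder: one streaming-capable method and one ordinary method. */
class ExampleTools {
  public static String monitorStock(String symbol, LiveRequestQueueStub queue) {
    return "monitoring " + symbol;
  }

  public static int add(int a, int b) {
    return a + b;
  }
}

class StreamingToolScanSketch {
  /** Returns one fresh queue per method that declares a queue parameter, keyed by method name. */
  static Map<String, LiveRequestQueueStub> scan(Class<?> toolClass) {
    Map<String, LiveRequestQueueStub> active = new TreeMap<>();
    for (Method method : toolClass.getDeclaredMethods()) {
      boolean acceptsQueue =
          Arrays.asList(method.getParameterTypes()).contains(LiveRequestQueueStub.class);
      if (acceptsQueue) {
        // Each streaming tool gets its own queue instance rather than sharing one.
        active.put(method.getName(), new LiveRequestQueueStub());
      }
    }
    return active;
  }

  public static void main(String[] args) {
    System.out.println(scan(ExampleTools.class).keySet());
  }
}
```

Keying the map by tool name keeps each tool's queue separate from the client's main queue, which matches the one-queue-per-tool behavior described above.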
Sources: core/src/main/java/com/google/adk/runner/Runner.java:611-659, core/src/main/java/com/google/adk/runner/Runner.java:667-680, core/src/main/java/com/google/adk/runner/Runner.java:564-591, core/src/main/java/com/google/adk/runner/Runner.java:748-763
Diagram: Invocation-Based Execution Flow with State Delta
Sources: core/src/main/java/com/google/adk/runner/Runner.java:371-487, core/src/main/java/com/google/adk/runner/Runner.java:489-545, core/src/main/java/com/google/adk/runner/Runner.java:304-351
The runtime provides a unified service architecture through InvocationContext. All services are accessible to agents, tools, and plugins during execution.
Diagram: Service Architecture and Implementations
Services are accessed through InvocationContext methods:
| Service | Access Method | Purpose |
|---|---|---|
| Session Service | invocationContext.sessionService() | Retrieve/update session state and history |
| Artifact Service | invocationContext.artifactService() | Save/retrieve files associated with sessions |
| Memory Service | invocationContext.memoryService() | Perform semantic search over stored content |
| Plugin Manager | invocationContext.pluginManager() | Access lifecycle hooks |
The session object itself is also available via invocationContext.session(), providing direct access to:
- session.state(): Mutable key-value map for session state
- session.events(): Immutable list of historical events
- session.userId(), session.id(): Session identifiers

The session's mutable state map (session.state()) can be updated through two mechanisms. One is a state delta passed to runAsync() as a parameter, which is merged into the session before agent execution begins.

Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:227-307, core/src/test/java/com/google/adk/runner/RunnerTest.java:455-613
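To make the state delta merge concrete, here is a minimal sketch. StateDeltaSketch and mergeStateDelta are hypothetical names, and the overwrite-on-conflict behavior is an assumption about what "merged" means; the real logic lives in the runner and session service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StateDeltaSketch {
  /** Copies every entry of the delta into the session state, overwriting existing keys. */
  static void mergeStateDelta(Map<String, Object> sessionState, Map<String, Object> stateDelta) {
    if (stateDelta == null || stateDelta.isEmpty()) {
      return; // Nothing to merge; the session state is left untouched.
    }
    sessionState.putAll(stateDelta);
  }

  public static void main(String[] args) {
    Map<String, Object> state = new ConcurrentHashMap<>();
    state.put("locale", "en");
    state.put("turns", 3);

    // The delta updates one existing key and adds a new one before the agent runs.
    mergeStateDelta(state, Map.of("turns", 4, "plan", "pro"));

    System.out.println(state.get("turns") + " " + state.get("plan") + " " + state.get("locale"));
  }
}
```

Keys absent from the delta keep their previous values, so a caller only needs to send the entries that changed.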
The runtime uses findAgentToRun() to determine which agent in a hierarchy should handle an incoming request. This algorithm examines the session history to find the most recently active agent that can continue the conversation.
The selection logic:
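1. Retrieves session.events() and reverses the order
2. If an event's author matches rootAgent.name(), returns the root agent
3. Otherwise calls rootAgent.findSubAgent(author) to locate the agent in the tree
4. Checks the located agent with isTransferableAcrossAgentTree()
5. The check passes when the agents examined are all LlmAgent instances and none have disallowTransferToParent() set

Diagram: Agent Selection Algorithm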
This algorithm enables agent hierarchies where sub-agents can be automatically resumed based on conversation context. For more details on agent transfer, see 5.3.
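The selection logic can be sketched with a simplified agent tree. AgentNode and AgentSelectionSketch are hypothetical types, not the ADK classes. Two details are assumptions the text does not spell out: that the transferability check walks from the candidate up through its ancestors, and that the root agent is the fallback when no event yields a match.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified agent node; the real agent types carry much more. */
class AgentNode {
  final String name;
  final boolean isLlmAgent;
  final boolean disallowTransferToParent;
  AgentNode parent;
  final List<AgentNode> children = new ArrayList<>();

  AgentNode(String name, boolean isLlmAgent, boolean disallowTransferToParent) {
    this.name = name;
    this.isLlmAgent = isLlmAgent;
    this.disallowTransferToParent = disallowTransferToParent;
  }

  AgentNode add(AgentNode child) {
    child.parent = this;
    children.add(child);
    return this;
  }

  /** Depth-first search for a descendant with the given name; returns null if absent. */
  AgentNode findSubAgent(String target) {
    for (AgentNode child : children) {
      if (child.name.equals(target)) {
        return child;
      }
      AgentNode found = child.findSubAgent(target);
      if (found != null) {
        return found;
      }
    }
    return null;
  }
}

class AgentSelectionSketch {
  /** Assumed check: the candidate and all its ancestors are LLM agents that allow transfer. */
  static boolean isTransferableAcrossAgentTree(AgentNode agent) {
    for (AgentNode current = agent; current != null; current = current.parent) {
      if (!current.isLlmAgent || current.disallowTransferToParent) {
        return false;
      }
    }
    return true;
  }

  /** Walks event authors from newest to oldest and returns the first agent that can resume. */
  static AgentNode findAgentToRun(List<String> eventAuthors, AgentNode root) {
    for (int i = eventAuthors.size() - 1; i >= 0; i--) {
      String author = eventAuthors.get(i);
      if (author.equals(root.name)) {
        return root;
      }
      AgentNode agent = root.findSubAgent(author);
      if (agent != null && isTransferableAcrossAgentTree(agent)) {
        return agent;
      }
    }
    return root; // Assumed fallback when no event identifies a resumable agent.
  }
}
```

With a root that has sub-agents billing and support, a history whose latest agent-authored event came from billing resumes billing, while a support agent with disallowTransferToParent set is skipped and the search continues with older events.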
Sources: core/src/main/java/com/google/adk/runner/Runner.java:720-746, core/src/main/java/com/google/adk/runner/Runner.java:698-714
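The runtime integrates PluginManager at multiple lifecycle stages. Plugins can transform the incoming user message (onUserMessageCallback), return early content that ends the run before the agent executes (beforeRunCallback), receive each event as it is produced (onEventCallback), and run after the agent finishes (afterRunCallback).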
Diagram: Plugin Lifecycle Hook Points in Execution
For comprehensive plugin system documentation, see 2.6.
Sources: core/src/main/java/com/google/adk/runner/Runner.java:440-442, core/src/main/java/com/google/adk/runner/Runner.java:505-517, core/src/main/java/com/google/adk/runner/Runner.java:528-536, core/src/main/java/com/google/adk/runner/Runner.java:543
The runtime supports pausing and resuming invocations for long-running operations. When configured with ResumabilityConfig, the system can:
- Identify long-running tools (those with BaseTool.longRunning() returning true)
- Record their call IDs as longRunningToolIds in the event containing the function calls

The InvocationContext provides methods to check resumability:

- isResumable(): Returns whether the current invocation can be paused/resumed (checks resumabilityConfig)
- shouldPauseInvocation(event): Determines if execution should pause after this event by checking if any function call IDs match the longRunningToolIds set

Diagram: Resumability Decision Flow
Long-running tool IDs are populated in BaseLlmFlow.buildModelResponseEvent() by calling Functions.getLongRunningFunctionCalls(), which checks each tool's longRunning() flag.
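The pause decision reduces to a set-membership test. The sketch below uses a hypothetical EventView type holding only the two fields the check needs. Gating the check on a resumable flag is an assumption about how isResumable() and shouldPauseInvocation() combine.

```java
import java.util.List;
import java.util.Set;

class PauseDecisionSketch {
  /** Hypothetical, pared-down event: its function call IDs and the IDs flagged as long-running. */
  static final class EventView {
    final List<String> functionCallIds;
    final Set<String> longRunningToolIds;

    EventView(List<String> functionCallIds, Set<String> longRunningToolIds) {
      this.functionCallIds = functionCallIds;
      this.longRunningToolIds = longRunningToolIds;
    }
  }

  /** Pauses only when the invocation is resumable and some call ID is marked long-running. */
  static boolean shouldPauseInvocation(boolean resumable, EventView event) {
    if (!resumable) {
      return false; // Assumed: without resumability configured, execution never pauses.
    }
    return event.functionCallIds.stream().anyMatch(event.longRunningToolIds::contains);
  }

  public static void main(String[] args) {
    EventView event = new EventView(List.of("call-1", "call-2"), Set.of("call-2"));
    System.out.println(shouldPauseInvocation(true, event));
  }
}
```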
For details on resumability patterns and LongRunningFunctionTool, see 5.2.
Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:358-377, core/src/main/java/com/google/adk/flows/llmflows/BaseLlmFlow.java:632-641, core/src/main/java/com/google/adk/flows/llmflows/Functions.java:360-373
The runtime can optionally compact session history to reduce storage and improve performance. When configured with EventsCompactionConfig, the system uses a sliding window approach implemented by SlidingWindowEventCompactor:
Configuration Parameters:
- compactionInterval: Number of new user invocations that trigger compaction
- overlapSize: Number of preceding invocations to include from the last compacted range (maintains context)
- summarizer: A BaseEventSummarizer implementation (typically LlmEventSummarizer) to generate summaries

Compaction Process:

1. After afterRunCallback completes, Runner.compactEvents() is invoked
2. SlidingWindowEventCompactor.compact() scans events in reverse chronological order
3. It uses EventCompaction.endTimestamp() to find where the last compacted range ended
4. Once compactionInterval invocations are detected, it includes overlapSize additional invocations from the previous range
5. The selected events are passed to BaseEventSummarizer.summarizeEvents()
6. LlmEventSummarizer formats events as conversation history and calls the LLM with a summarization prompt
7. A new Event with EventActions.compaction() is created containing the summary
8. The compaction event is persisted through sessionService.appendEvent()

The compaction event contains:

- EventCompaction.startTimestamp(): First event timestamp in the compacted range
- EventCompaction.endTimestamp(): Last event timestamp in the compacted range
- EventCompaction.compactedContent(): The LLM-generated summary as Content

For implementation details, see 5.4.
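The interaction between compactionInterval and overlapSize is easier to see with indices than with timestamps. The following sketch is a deliberately simplified, index-based illustration, not how SlidingWindowEventCompactor is implemented: CompactionWindowSketch, selectWindow, and the alreadyCompacted counter are hypothetical, and the real compactor works from event timestamps while scanning in reverse.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionWindowSketch {
  /**
   * Picks the invocations to summarize next.
   *
   * @param invocationIds invocation IDs in chronological order
   * @param alreadyCompacted how many leading invocations a previous compaction covered
   * @param compactionInterval number of new invocations required to trigger compaction
   * @param overlapSize number of previously compacted invocations to include again
   * @return the window to summarize, or an empty list if compaction is not due yet
   */
  static List<String> selectWindow(
      List<String> invocationIds, int alreadyCompacted, int compactionInterval, int overlapSize) {
    int newInvocations = invocationIds.size() - alreadyCompacted;
    if (newInvocations < compactionInterval) {
      return new ArrayList<>(); // Not enough new invocations since the last compaction.
    }
    // Reach back into the previous range so the summary keeps some earlier context.
    int start = Math.max(0, alreadyCompacted - overlapSize);
    return new ArrayList<>(invocationIds.subList(start, invocationIds.size()));
  }

  public static void main(String[] args) {
    List<String> ids = List.of("inv1", "inv2", "inv3", "inv4", "inv5");
    System.out.println(selectWindow(ids, 3, 2, 1));
  }
}
```

In the usage above, three invocations were already compacted and two new ones arrived, so with an interval of 2 and an overlap of 1 the window spans inv3 through inv5.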
Sources: core/src/main/java/com/google/adk/runner/Runner.java:547-552, core/src/main/java/com/google/adk/runner/Runner.java:766-784, core/src/main/java/com/google/adk/summarizer/SlidingWindowEventCompactor.java:100-110, core/src/main/java/com/google/adk/summarizer/LlmEventSummarizer.java:60-103, core/src/main/java/com/google/adk/summarizer/EventsCompactionConfig.java:21-37
The runtime tracks and enforces limits on LLM calls per invocation through an internal InvocationCostManager embedded in InvocationContext. This prevents runaway costs from infinite loops or excessive tool usage.
Call Tracking Mechanism:
1. InvocationContext.incrementLlmCallsCount() is called before each LLM invocation in BaseLlmFlow.runOneStep()
2. InvocationCostManager.incrementAndEnforceLlmCallsLimit() increments a counter
3. If RunConfig.maxLlmCalls() is set and the counter exceeds this value, an LlmCallsLimitExceededException is thrown

The InvocationCostManager is created per invocation context and its state is not shared across contexts, ensuring accurate per-invocation tracking.
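The increment-then-check pattern described above can be sketched as follows. CostManagerSketch and CallLimitExceeded are hypothetical stand-ins for InvocationCostManager and LlmCallsLimitExceededException, and treating a non-positive limit as "not set" is an assumption.

```java
/** Hypothetical stand-in for the exception thrown when the call budget is exhausted. */
class CallLimitExceeded extends RuntimeException {
  CallLimitExceeded(String message) {
    super(message);
  }
}

class CostManagerSketch {
  private final int maxLlmCalls; // Assumed convention: values <= 0 mean no limit.
  private int llmCalls;

  CostManagerSketch(int maxLlmCalls) {
    this.maxLlmCalls = maxLlmCalls;
  }

  /** Counts one more LLM call and fails if the configured limit has been exceeded. */
  void incrementAndEnforceLlmCallsLimit() {
    llmCalls++;
    if (maxLlmCalls > 0 && llmCalls > maxLlmCalls) {
      throw new CallLimitExceeded(
          "LLM call limit of " + maxLlmCalls + " exceeded (attempted call " + llmCalls + ")");
    }
  }

  int llmCalls() {
    return llmCalls;
  }

  public static void main(String[] args) {
    CostManagerSketch manager = new CostManagerSketch(2);
    manager.incrementAndEnforceLlmCallsLimit();
    manager.incrementAndEnforceLlmCallsLimit();
    try {
      manager.incrementAndEnforceLlmCallsLimit(); // Third call exceeds the limit of 2.
    } catch (CallLimitExceeded e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because each instance owns its counter, creating one manager per invocation gives the per-invocation isolation the text describes.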
Sources: core/src/main/java/com/google/adk/agents/InvocationContext.java:353-355, core/src/main/java/com/google/adk/agents/InvocationContext.java:379-392, core/src/main/java/com/google/adk/flows/llmflows/BaseLlmFlow.java:293-298
Both Runner and InvocationContext use builders to ensure proper construction with required parameters:
Runner Builder Example:
InvocationContext Builder Example:
The builders validate required parameters at build time, throwing IllegalStateException if essential components are missing.
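As an illustration of build-time validation and of copying via toBuilder(), here is a generic sketch. It is not the Runner or InvocationContext builder: RunnerConfigSketch, its two fields, and the choice of which fields are required are all hypothetical.

```java
class RunnerConfigSketch {
  final String appName;
  final String agentName;

  private RunnerConfigSketch(Builder builder) {
    this.appName = builder.appName;
    this.agentName = builder.agentName;
  }

  static Builder builder() {
    return new Builder();
  }

  /** Returns a builder pre-filled with this object's values, for copy-with-modification. */
  Builder toBuilder() {
    return new Builder().appName(appName).agentName(agentName);
  }

  static final class Builder {
    private String appName;
    private String agentName;

    Builder appName(String appName) {
      this.appName = appName;
      return this;
    }

    Builder agentName(String agentName) {
      this.agentName = agentName;
      return this;
    }

    /** Validates required fields at build time instead of failing later during execution. */
    RunnerConfigSketch build() {
      if (agentName == null) {
        throw new IllegalStateException("agentName must be set");
      }
      if (appName == null) {
        throw new IllegalStateException("appName must be set");
      }
      return new RunnerConfigSketch(this);
    }
  }

  public static void main(String[] args) {
    RunnerConfigSketch original = builder().appName("demo").agentName("root").build();
    RunnerConfigSketch copy = original.toBuilder().agentName("helper").build();
    System.out.println(original.agentName + " " + copy.agentName + " " + copy.appName);
  }
}
```

The original object is never mutated; toBuilder() produces a new instance, which is how an immutable context can still be varied between steps.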
Sources: core/src/main/java/com/google/adk/runner/Runner.java:136-158, core/src/main/java/com/google/adk/agents/InvocationContext.java:632-639