LangGraph Persistence
Threads¶
A thread is a unique ID (thread identifier) assigned to each series of checkpoints
saved by a checkpointer. When invoking a graph with a checkpointer, you
must specify a thread_id as part of the configurable portion of the config:
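For example, a minimal config carrying a thread ID looks like this (the thread ID value "1" is arbitrary; any string works):

```python
# The thread ID lives under the "configurable" key of the config dict.
config = {"configurable": {"thread_id": "1"}}

# This config is then passed on invocation of a compiled graph,
# e.g. graph.invoke({"foo": ""}, config).
```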
Checkpoints¶
A checkpoint is a snapshot of the graph state saved at each super-step and is
represented by a StateSnapshot object with the following key properties:
config: The config associated with this checkpoint
metadata: The metadata associated with this checkpoint
values: The values of the state channels at this point in time
next: A tuple of the node names to execute next in the graph
tasks: A tuple of PregelTask objects that contain information about next tasks to be executed
Let's see what checkpoints are saved when a simple graph is invoked as
follows:
from operator import add
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    foo: str
    bar: Annotated[list[str], add]

def node_a(state: State):
    return {"foo": "a", "bar": ["a"]}

def node_b(state: State):
    return {"foo": "b", "bar": ["b"]}

workflow = StateGraph(State)
workflow.add_node(node_a)
workflow.add_node(node_b)
workflow.add_edge(START, "node_a")
workflow.add_edge("node_a", "node_b")
workflow.add_edge("node_b", END)

checkpointer = MemorySaver()
graph = workflow.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "1"}}
graph.invoke({"foo": ""}, config)
Note that the bar channel values contain outputs from both nodes, because we
defined a reducer for the bar channel.
Get state¶
When interacting with the saved graph state, you must specify a thread
identifier. You can view the latest state of the graph by calling
graph.get_state(config). This returns a StateSnapshot object that
corresponds to the latest checkpoint associated with the thread ID provided
in the config, or, if a checkpoint ID is also provided, the checkpoint
associated with that checkpoint ID for the thread.
# get the latest state snapshot
config = {"configurable": {"thread_id": "1"}}
graph.get_state(config)
StateSnapshot(
    values={'foo': 'b', 'bar': ['a', 'b']},
    next=(),
    config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1...'}},
    metadata={'source': 'loop', 'writes': {'node_b': {'foo': 'b', 'bar': ['b']}}, 'step': ...},
    created_at='2024-08-29T19:19:38.821749+00:00',
    parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '...'}}
)
You can get the full history of the graph execution for a given thread by
calling graph.get_state_history(config). This will return a list of
StateSnapshot objects associated with the thread ID provided in the config.
Importantly, the checkpoints will be ordered chronologically with the most
recent checkpoint / StateSnapshot being the first in the list.
To replay a prior execution up to a specific checkpoint, you must pass both
the thread_id and the checkpoint_id as part of the configurable portion of
the config when invoking the graph:
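A sketch of such a config (the checkpoint_id value is elided here; in practice you would use an actual ID taken from one of the snapshots returned by graph.get_state_history):

```python
# Replace "..." with a real checkpoint_id from the thread's history.
config = {"configurable": {"thread_id": "1", "checkpoint_id": "..."}}

# The graph can then be resumed from that checkpoint,
# e.g. graph.invoke(None, config=config).
```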
Update state¶
config¶
The config should contain a thread_id specifying which thread to update.
When only a thread_id is passed, the update is applied to (forks) the latest
checkpoint of that thread.
values¶
These are the values that will be used to update the state. Note that this
update is treated exactly like an update coming from a node: the values are
passed to the reducer functions, if they are defined for the corresponding
channels in the graph state. As a result, update_state does NOT
automatically overwrite the value of every channel, only the channels
without reducers. Let's walk through an example.
Let's assume you have defined the state of your graph with the following
schema (see full example above):
class State(TypedDict):
    foo: int
    bar: Annotated[list[str], add]
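Suppose the current state of the thread is {"foo": 1, "bar": ["a"]} and you call graph.update_state(config, {"foo": 2, "bar": ["b"]}). The following plain-Python sketch illustrates the reducer semantics (it is not the actual implementation):

```python
from operator import add

current = {"foo": 1, "bar": ["a"]}
update = {"foo": 2, "bar": ["b"]}

# foo has no reducer, so the update overwrites it.
current["foo"] = update["foo"]
# bar has the `add` reducer, so the update is appended to the existing list.
current["bar"] = add(current["bar"], update["bar"])

print(current)  # {'foo': 2, 'bar': ['a', 'b']}
```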
as_node¶
The final thing you can optionally specify when calling update_state is
as_node. If provided, the update will be applied as if it came from node
as_node. If as_node is not provided, it will be set to the last node that
updated the state, if unambiguous. This matters because the next steps to
execute depend on the last node to have given an update, so it can be
used to control which node executes next. See the how-to guide on time
travel to learn more about forking state.
Memory Store¶
A state schema specifies a set of keys that are populated as a graph is
executed. As discussed above, state can be written by a checkpointer to a
thread at each graph step, enabling state persistence.
But, what if we want to retain some information across threads? Consider the
case of a chatbot where we want to retain specific information about the
user across all chat conversations (e.g., threads) with that user!
Basic Usage¶
import uuid
from langgraph.store.memory import InMemoryStore

in_memory_store = InMemoryStore()

user_id = "1"
namespace_for_memory = (user_id, "memories")
memory_id = str(uuid.uuid4())
memory = {"food_preference": "I like pizza"}
in_memory_store.put(namespace_for_memory, memory_id, memory)
memories = in_memory_store.search(namespace_for_memory)
memories[-1].dict()
{'value': {'food_preference': 'I like pizza'},
'key': '07e0caf4-1631-47b7-b15f-65515d4c1843',
'namespace': ['1', 'memories'],
'created_at': '2024-10-02T17:22:31.590602+00:00',
'updated_at': '2024-10-02T17:22:31.590605+00:00'}
Each memory is an instance of a Python class (Item) with certain attributes. We can
access it as a dictionary by converting via .dict(), as above. The attributes it
has are:
value: The value (itself a dictionary) of this memory
key: A unique key for this memory in this namespace
namespace: A list of strings, the namespace of this memory type
created_at: Timestamp for when this memory was created
updated_at: Timestamp for when this memory was updated
Semantic Search¶
Beyond simple retrieval, the store also supports semantic search, allowing
you to find memories based on meaning rather than exact matches. To
enable this, configure the store with an embedding model:
from langchain.embeddings import init_embeddings
from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={
        "embed": init_embeddings("openai:text-embedding-3-small"),  # Embedding provider
        "dims": 1536,                        # Embedding dimensions
        "fields": ["food_preference", "$"]   # Fields to embed
    }
)
Now when searching, you can use natural language queries to find relevant
memories:
Using in LangGraph¶
We invoke the graph with a thread_id, as before, and also with a user_id,
which we'll use to namespace our memories to this particular user as we
showed above.
We can access the in_memory_store and the user_id in any node by passing
store: BaseStore and config: RunnableConfig as node arguments. Here's
how we might use semantic search in a node to find relevant memories:
As we showed above, we can also access the store in any node and use the
store.search method to get memories. Recall that the memories are returned
as a list of objects that can be converted to a dictionary.
memories[-1].dict()
{'value': {'food_preference': 'I like pizza'},
'key': '07e0caf4-1631-47b7-b15f-65515d4c1843',
'namespace': ['1', 'memories'],
'created_at': '2024-10-02T17:22:31.590602+00:00',
'updated_at': '2024-10-02T17:22:31.590605+00:00'}
We can access the memories and use them in our model call.
If we create a new thread, we can still access the same memories so long as
the user_id is the same.
When deploying, semantic search for the store can be configured in langgraph.json:
{
    ...
    "store": {
        "index": {
            "embed": "openai:text-embedding-3-small",
            "dims": 1536,
            "fields": ["$"]
        }
    }
}
See the deployment guide for more details and configuration options.
Checkpointer libraries¶
Under the hood, checkpointing is powered by checkpointer objects that
conform to the BaseCheckpointSaver interface. LangGraph provides several
checkpointer implementations, all implemented via standalone, installable
libraries:
langgraph-checkpoint: The base interface (BaseCheckpointSaver) and serialization protocol, along with an in-memory checkpointer implementation.
langgraph-checkpoint-sqlite: A checkpointer backed by SQLite (SqliteSaver / AsyncSqliteSaver), well suited to experimentation and local workflows.
langgraph-checkpoint-postgres: A checkpointer backed by Postgres (PostgresSaver / AsyncPostgresSaver), suited to production use.
Checkpointer interface¶
Each checkpointer conforms to the BaseCheckpointSaver interface and implements the following methods:
.put - Store a checkpoint with its config and metadata.
.put_writes - Store intermediate writes linked to a checkpoint (i.e., pending writes).
.get_tuple - Fetch a checkpoint tuple for a given config (thread_id and checkpoint_id). This is used to populate StateSnapshot in graph.get_state().
.list - List checkpoints that match a given config and filter criteria. This is used in graph.get_state_history().
Note
For running your graph asynchronously, you can use MemorySaver, or the async
versions of the Sqlite/Postgres checkpointers: AsyncSqliteSaver /
AsyncPostgresSaver.
Serializer¶
When checkpointers save the graph state, they need to serialize the channel
values in the state. This is done using serializer objects.
langgraph_checkpoint defines a protocol for implementing serializers and
provides a default implementation (JsonPlusSerializer) that handles a wide
variety of types, including LangChain and LangGraph primitives, datetimes,
enums, and more.
Capabilities¶
Human-in-the-loop¶
First, checkpointers facilitate human-in-the-loop workflows by allowing
humans to inspect, interrupt, and approve graph steps. Checkpointers are
needed for these workflows, as the human has to be able to view the state of
the graph at any point in time, and the graph has to be able to resume
execution after the human has made any updates to the state.
Memory¶
Second, checkpointers allow for "memory" between interactions. In the case
of repeated human interactions (like conversations), any follow-up messages
can be sent to the same thread, which will retain memory of the previous
ones.
Time Travel¶
Third, checkpointers allow for "time travel", allowing users to replay prior
graph executions to review and / or debug specific graph steps. In addition,
checkpointers make it possible to fork the graph state at arbitrary
checkpoints to explore alternative trajectories.
Fault-tolerance¶
Lastly, checkpointing also provides fault tolerance and error recovery: if
one or more nodes fail at a given superstep, you can restart your graph from
the last successful step.
Pending writes¶
Additionally, when a graph node fails mid-way through a given superstep,
LangGraph stores pending checkpoint writes from any other nodes that
completed successfully at that superstep, so that whenever graph execution
is resumed from that superstep, the successful nodes are not re-run.