
Streaming intermediate steps

Suppose we want to stream not only the final outputs of the chain, but also some
intermediate steps. As an example, let's take our Conversational RAG chain. Here we
reformulate the user question before passing it to the retriever. This reformulated
question is not returned as part of the final output. We could modify our chain to
return the new question, but for demonstration purposes we'll leave it as is.
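
The snippets below assume an llm and a retriever are already in scope from earlier
in the tutorial. As a minimal sketch (assuming an OpenAI chat model and an
in-memory vector store, neither of which the original prescribes), they could be
set up like this:

from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Hypothetical setup: any chat model and retriever will do.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

vectorstore = InMemoryVectorStore.from_texts(
    [
        "Task decomposition breaks a complex task into smaller, simpler steps.",
        "Common approaches include chain-of-thought prompting and subgoal search.",
    ],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()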

from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
### Contextualize question ###
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
contextualize_q_llm = llm.with_config(tags=["contextualize_q_llm"])
history_aware_retriever = create_history_aware_retriever(
    contextualize_q_llm, retriever, contextualize_q_prompt
)

### Answer question ###
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

API Reference: create_history_aware_retriever | create_retrieval_chain | create_stuff_documents_chain | ChatPromptTemplate | MessagesPlaceholder
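
Before streaming anything, it can help to sanity-check the assembled chain with a
plain invocation. Per the create_retrieval_chain contract, the output is a dict
containing (at least) "context" and "answer" keys:

response = rag_chain.invoke(
    {"input": "What is task decomposition?", "chat_history": []}
)
print(response["answer"])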


Note that above we use .with_config to assign a tag to the LLM used for the
question-rephrasing step. This is not necessary, but it makes it more convenient
to stream output from that specific step.
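
The reason this works: tags attached via .with_config propagate to the events
emitted by that runnable, so they show up in each event's "tags" list. A tiny
standalone sketch (not part of the chain above):

async for event in contextualize_q_llm.astream_events("Hello", version="v1"):
    # Every event from this tagged runnable carries the tag we attached.
    assert "contextualize_q_llm" in event["tags"]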

To demonstrate, we will pass in an artificial message history:


Human: What is task decomposition?

AI: Task decomposition involves breaking up a complex task into smaller and simpler
steps.

We then ask a follow-up question: "What are some common ways of doing it?" Leading
into the retrieval step, our history_aware_retriever will rephrase this question
using the conversation's context so that the retrieval is meaningful.

To stream intermediate output, we recommend use of the async .astream_events
method. This method will stream output from all "events" in the chain, and can be
quite verbose. We can filter using tags, event types, and other criteria, as we do
here.

Below we show a typical .astream_events loop, where we pass in the chain input and
emit desired results. See the API reference and streaming guide for more detail.

first_question = "What is task decomposition?"
first_answer = (
    "Task decomposition involves breaking up "
    "a complex task into smaller and simpler "
    "steps."
)
follow_up_question = "What are some common ways of doing it?"

chat_history = [
    ("human", first_question),
    ("ai", first_answer),
]

async for event in rag_chain.astream_events(
    {
        "input": follow_up_question,
        "chat_history": chat_history,
    },
    version="v1",
):
    if (
        event["event"] == "on_chat_model_stream"
        and "contextualize_q_llm" in event["tags"]
    ):
        ai_message_chunk = event["data"]["chunk"]
        print(f"{ai_message_chunk.content}|", end="")

|What| are| some| typical| methods| used| for| task| decomposition|?||

Here we recover, token by token, the query that is passed into the retriever,
given our question "What are some common ways of doing it?"
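
By the same logic, we could stream only the final answer's tokens by inverting the
tag filter. A sketch, assuming the answering LLM is the only untagged chat model
in the chain:

async for event in rag_chain.astream_events(
    {
        "input": follow_up_question,
        "chat_history": chat_history,
    },
    version="v1",
):
    # Keep chat-model tokens that do NOT come from the rephrasing step.
    if (
        event["event"] == "on_chat_model_stream"
        and "contextualize_q_llm" not in event["tags"]
    ):
        print(event["data"]["chunk"].content, end="")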

If we wanted to get our retrieved docs, we could filter on name "Retriever":

async for event in rag_chain.astream_events(
    {
        "input": follow_up_question,
        "chat_history": chat_history,
    },
    version="v1",
):
    if event["name"] == "Retriever":
        print(event)
        print()
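
This prints every retriever event, including its start. If we only wanted the
retrieved documents themselves, we could additionally match the
"on_retriever_end" event type; the payload layout here is an assumption based on
the v1 events schema:

async for event in rag_chain.astream_events(
    {
        "input": follow_up_question,
        "chat_history": chat_history,
    },
    version="v1",
):
    if event["name"] == "Retriever" and event["event"] == "on_retriever_end":
        # The end event fires once, carrying the full retrieval result.
        print(event["data"]["output"])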
