The Tools Landscape For LLM Pipelines Orchestration (Part 1)
Damien Benveniste ∙ Dec 2, 2024
Micro-Orchestration
Prompt Management
Macro-Orchestration
Stateful Application
Agentic Design
For a long time, I was in love with LangChain, mostly because the documentation was structured to educate users about LLM pipeline orchestration and to showcase how they approached building solutions for implementing those pipelines. To some extent, all the existing frameworks take their own opinionated approach to handling the complexities of LLM pipeline orchestration.
Micro-Orchestration
I refer to Micro-orchestration in LLM pipelines as the fine-grained
coordination and management of individual LLM interactions and related
processes. It is more about the granular details of how data flows into,
through, and out of language models within a single task or a small set of
closely related tasks. It can involve things like:
Prompt Management
Data connection
Prompt Management
All those frameworks, for better or worse, have a way to structure the prompt inputs to a model. For example, in LangChain, we can wrap a string with the PromptTemplate class:
from langchain_core.prompts import PromptTemplate
prompt_template = PromptTemplate.from_template(
"Tell me a joke about {topic}"
)
prompt_template.invoke({"topic": "cats"})
AdalFlow and Haystack, on the other hand, use the Jinja2 package as their templating engine:
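For instance, here is a minimal sketch with Haystack's PromptBuilder component, which renders Jinja2 templates:

from haystack.components.builders import PromptBuilder

# The template uses Jinja2 syntax for its variables
prompt_builder = PromptBuilder(template="Tell me a joke about {{ topic }}")
prompt_builder.run(topic="cats")
# {'prompt': 'Tell me a joke about cats'}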
This may seem unnecessary in some cases, as we can do pretty much the same thing with a plain Python string:
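For instance, a plain format string gives the same substitution:

# Plain Python string formatting achieves the same result
prompt = "Tell me a joke about {topic}".format(topic="cats")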
However, this can help with maintenance and safer handling of user inputs, since it enforces that all the required variables are provided. Take, for example, how we create messages in Haystack:
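A minimal sketch, assuming Haystack's ChatMessage dataclass:

from haystack.dataclasses import ChatMessage

# A user message built through the dataclass constructor
message = ChatMessage.from_user("Tell me a joke about {topic}")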
It is a Python dataclass that provides a more robust object for validating the text input than the simpler:
message = {
    "content": "Tell me a joke about {topic}",
    "role": "user"
}
In LangChain, the ChatPromptTemplate class plays a similar role:

from langchain_core.prompts import ChatPromptTemplate

messages = [
    ("system", "You are an AI assistant."),
    ("user", "Tell me a joke about {topic}"),
]
prompt = ChatPromptTemplate.from_messages(messages)
And it becomes easier to manipulate the underlying data. For example, I can
more easily access the input variables:
prompt.input_variables
> ['topic']
Also, the class will throw an error if a wrong role is injected:
messages = [
    ("system", "You are an AI assistant."),
    ("wrong_role", "Tell me a joke about {topic}"),
]
prompt = ChatPromptTemplate.from_messages(messages)
In most cases, this allows for better integration of the prompts with the rest of the software. In LangChain, for example, the prompt composes directly with other components, such as models:

from langchain_openai import ChatOpenAI

model = ChatOpenAI()
chain = prompt | model
chain.invoke({"topic": "cat"})
Having more control over the prompt object allows the implementation of prompt-specific operations. For example, here is how we can build a few-shot prompt in LangChain:
# Examples
examples = [
    {"question": "What's 2+2?", "answer": "2+2 = 4"},
    {"question": "What's 3+3?", "answer": "3+3 = 6"},
]
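A minimal sketch of the rest of the construction, assuming LangChain's FewShotPromptTemplate and a simple question/answer layout:

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

# How each individual example is rendered inside the final prompt
example_prompt = PromptTemplate.from_template("Q: {question}\nA: {answer}")

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Q: {question}",
    input_variables=["question"],
)
few_shot_prompt.format(question="What's 4+4?")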
Data Connection
All those frameworks also provide utilities to connect data sources to the pipeline, starting with document loaders:
# LlamaIndex
from llama_index.core import SimpleDirectoryReader
loader = SimpleDirectoryReader("./book")
documents = loader.load_data()
# LangChain
from langchain.document_loaders import DirectoryLoader
loader = DirectoryLoader("./book")
documents = loader.load()
# Haystack
from haystack.components.converters import TextFileToDocument
from pathlib import Path
text_converter = TextFileToDocument()
documents = text_converter.run(
    sources=[str(p) for p in Path("./book").glob("*.txt")]
)
Once loaded, the documents can be split into smaller chunks:
# LangChain
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = splitter.split_documents(documents)
# LlamaIndex
from llama_index.core.node_parser import SentenceSplitter
splitter = SentenceSplitter(chunk_size=1200, chunk_overlap=100)
nodes = splitter.get_nodes_from_documents(documents)
# Haystack
from haystack.components.preprocessors import DocumentSplitter
splitter = DocumentSplitter(split_by="sentence", split_length=3)
docs = splitter.run(documents['documents'])
# AdalFlow
from adalflow.components.data_process.text_splitter import TextSplitter
splitter = TextSplitter(split_by="word", chunk_size=50, chunk_overlap=20)
docs = splitter.call(documents=docs)
None of those methods are hard to implement, but they are often useful
utilities that are worth using.
Another useful utility is output parsing; for example, parsing the LLM output into a Pydantic object:

from typing import List
from pydantic import BaseModel, Field

class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of names of films they starred in")
# LangChain
from langchain_core.output_parsers import PydanticOutputParser

# Example of a JSON string an LLM could return (illustrative)
json_str = '{"name": "Tom Hanks", "film_names": ["Forrest Gump"]}'

parser = PydanticOutputParser(pydantic_object=Actor)
parser.parse(json_str)

# LlamaIndex
from llama_index.core.output_parsers import PydanticOutputParser
parser = PydanticOutputParser(output_cls=Actor)
parsed = parser.parse(json_str)
In LangChain, we can even use the help of another LLM to correct the format in case the previous one misformatted the output. For example, the following is not a correct JSON string:
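A minimal sketch of that correction step, assuming LangChain's OutputFixingParser and an illustrative malformed string:

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import ChatOpenAI

# Single quotes make this string invalid JSON
misformatted = "{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}"

# A second LLM is used to repair the output before parsing it again
fixing_parser = OutputFixingParser.from_llm(
    parser=PydanticOutputParser(pydantic_object=Actor),
    llm=ChatOpenAI(),
)
fixing_parser.parse(misformatted)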
Haystack also ships rerankers; for example, a diversity ranker built on sentence transformers:

from haystack.components.rankers import SentenceTransformersDiversityRanker

ranker = SentenceTransformersDiversityRanker(
    model="sentence-transformers/all-MiniLM-L6-v2",
    similarity="cosine"
)
ranker.warm_up()
All those frameworks also abstract away the underlying model providers. For example, in LangChain, we can instantiate an OpenAI model:

# OpenAI model
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
)
But we can do the same thing for a local model when using Llama.cpp:
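A minimal sketch, assuming LangChain's community Llama.cpp chat wrapper and a hypothetical local model file:

from langchain_community.chat_models import ChatLlamaCpp

llm = ChatLlamaCpp(
    model_path="./models/llama-3-8b-instruct.gguf",  # hypothetical local GGUF file
    temperature=0.7,
)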
As far as the rest of the code is concerned, we just use an LLM object that is independent of the underlying model, and we can run predictions without thinking about the specific API:
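For instance, a sketch of the provider-agnostic call with either of the llm objects above:

# The same call works regardless of the provider behind the llm object
response = llm.invoke("Tell me a joke about cats")
print(response.content)

The same pattern extends across the providers these frameworks integrate with, such as: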
Amazon Bedrock
Anthropic
Azure OpenAI
Cohere
Google Gemini
HuggingFace
Llama.cpp
Mistral
Nvidia
Ollama
OpenAI
Sagemaker
VertexAI