Whitepaper Guide To The Event-Driven, Event Streaming Stack
The idea for Quix originated when we built an event stream processing platform as engineers
working for the McLaren Formula 1 team. Since then Quix has evolved into a unified tool that
specializes in simplifying event stream processing for data-intensive applications. It offers a single
developer environment for building, releasing, and scaling services that derive insights from
real-time data streams. But Quix is just one part of the overall EDA puzzle. It specializes in event
streaming and stream processing, but EDA is much more than that. We wanted to help clarify the
rest of the EDA tech landscape too.
We’re also aware that there’s enough content out there explaining what EDA is, why you should use
it, and what the challenges are. However, much of that content attempts to position Apache Kafka as
the single source of truth for an event-sourced, event-driven microservices architecture. While this
ideal is fantastic if you can get it to work (and makes for clean, nice-looking architecture diagrams),
the reality is usually messier.
We want to acknowledge that while Kafka is an extremely powerful tool, it is often overkill for many
of the simpler event-driven architectures out there, especially those that don’t require any stream
processing. To that end, we wanted to present an overview of all the main technical components
of EDA and which tools might be better substitutes for Apache Kafka for certain parts of the
architecture. Later in the guide, we’ll explain what a hybrid architecture might look like, where
there is a clearer division of labor between an event streaming system such as Kafka and more
transaction-focused components such as event buses and message queues.
Nowadays, the landscape of tools supporting event-driven architectures has grown ever more
complex, with varied event types, processing methods, tools, and deployment options. To navigate
this multifaceted realm, we’ve created this guide to assist you in choosing the optimal combination
for your specific architecture.
This guide is not intended to convince you to move to an event-driven architecture; rather, we assume
that you have already made the decision but are unsure of which technologies to use.
However, we will briefly mention other architectures to highlight the differences between architecture
patterns. Few backends rely exclusively on one pattern; most are a mixture of several approaches.
It’s important to understand how these components can interact with one another.
We also assume that you understand the basic difference between message queues and streams.
While we will spend time looking at the advantages of streams over queues, this guide is not
intended as an exhaustive comparison. Instead it focuses on the technologies for event streaming
and stream processing as used in event-driven architectures.
If you need further information on the difference between message queues and streams, check out
the following articles by one of our integration partners, Confluent:
◦ Comparison: JMS Message Queue vs. Apache Kafka by Kai Waehner (Confluent)
◦ Apache Kafka vs. Enterprise Service Bus (ESB) | Confluent
We’ll focus on products and technologies offered by the major cloud computing providers as well
as a couple of other managed solutions. While there are plenty of open source products that you
can run yourself, we’re assuming that you’re at a growth stage that still requires you to invest more
time in developing your actual application rather than the underlying application infrastructure.
Note that we won’t be looking at cost. Different providers have different pricing models and we
would require a completely separate guide to compare them all. Additionally, it doesn’t make sense
to focus on cost until you have a clearer understanding of your technical requirements.
Finally, while there are many backend technologies involved in EDA, we’ll be focusing on the
components that handle event streaming and stream processing (including the runtime environments
for both components). Of course, we won’t ignore other tools such as event buses and message queues,
but the emphasis will be how to integrate them in an architecture that centers around event streaming.
Suppose that we’re consulting for a fledgling auto parts seller called “Eagle Auto Parts”. They’re a
growing e-commerce platform that offers products across various categories. The platform supports
user profiles, product listings, order management, reviews, promotions, and analytics.
For Eagle Auto Parts, implementing an EDA using event streaming would mean embracing a
paradigm where every significant action in the system – a user signing up, a product being listed,
an order being placed, a review being written – becomes an event. These events flow through the
system, triggering various other actions, driving real-time insights, and ensuring a seamless and
responsive user experience.
And why would they want to do this? Any e-commerce business selling physical products needs to deal
with a variety of disparate systems (e.g. inventory management systems and customer databases).
Without EDA, these systems would have to communicate with one another through point-to-point
connections or a highly centralized Enterprise Service Bus. Both alternatives are less than ideal.
The former can lead to a tangled spider’s web of tightly coupled services and the latter can lead to
an unwieldy monolithic architecture where a single point of failure can disrupt the entire system.
Event streaming is a key component of this architecture because it will allow Eagle Auto Parts to
personalize the user experience and create more engagement with the online store. This requires
some intensive data processing which cannot be managed by other EDA components.
The following diagram illustrates how this architecture might look with the components explained
further down.
1. Event bus
An event bus can be used to propagate messages across many services. It follows the pub/sub
model where producers create events and consumers react to them. Producers and consumers are
loosely coupled where the producer doesn’t need any awareness of who its consumers are.
However, while an event bus aims to promote loose coupling and decentralization, there is still a
level of centralization involved, particularly in the declaration and management of subscriptions and
filtering rules. Amazon EventBridge, for example, centralizes the routing rules that determine which
events get delivered to which targets (subscribers).
This level of centralization can be considered a tradeoff that provides a certain level of control and
manageability, especially in systems with a large number of event producers and consumers. It allows
for efficient event filtering and routing without overloading consumers with events they don’t need.
This is in contrast to Apache Kafka which is supposed to be “dumb” middleware. All of the decisions
about what events should go where are handled by “smart” producers and consumers instead.
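To make the “centralized routing” idea concrete, here is a minimal sketch of registering an EventBridge rule with the boto3 SDK. The bus name, event source, detail type, and target ARN are all hypothetical placeholders, not part of the original example.

```python
import json
import boto3

events = boto3.client("events")

# Hypothetical rule: forward "AddressChanged" events from the CRM
# to a queue consumed by the customer-profile service.
events.put_rule(
    Name="route-address-changes",            # hypothetical rule name
    EventBusName="eagle-auto-parts-bus",      # hypothetical custom event bus
    EventPattern=json.dumps({
        "source": ["crm.salesforce"],         # hypothetical event source
        "detail-type": ["AddressChanged"],
    }),
    State="ENABLED",
)

events.put_targets(
    Rule="route-address-changes",
    EventBusName="eagle-auto-parts-bus",
    Targets=[{
        "Id": "customer-profile-queue",
        # Hypothetical SQS queue ARN that receives the matched events
        "Arn": "arn:aws:sqs:us-east-1:123456789012:customer-profile-queue",
    }],
)
```

The point to notice is that the filtering logic lives in the bus (the event pattern), not in the producer or the consumer.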
Although they both fall within the pub/sub messaging paradigm, a pub/sub messaging service is
different from an event bus. An event bus usually includes centralized routing rules based on the
content within messages, whereas a pub/sub service does not.
Instead, a pub/sub service broadcasts the same message to all subscribers of a specific topic.
They do this by using a “push” model, where consumers process a message as soon as it arrives.
Message routing is managed through decentralized subscriptions to topics rather than centralized
rules, and there usually isn’t any in-built filtering based on message content. For instance, in Google
Pub/Sub, you cannot natively define a subscription based on the structure of a message.
However, technologies that make use of the Java Message Service (JMS) API, such as ActiveMQ,
do give you the ability to define a simple conditional statement as a string and associate it with a
MessageConsumer. Nevertheless, these filtering capabilities are not as comprehensive as those
provided by an event bus.
Event streaming platforms like Apache Kafka also belong to the pub/sub paradigm, but can support
more advanced use cases out of the box (these are covered further down). In contrast, pure pub/sub
messaging services are primarily focused on one simple task: delivering messages reliably.
Message queueing services are also focused on delivering messages reliably, but have a different
delivery mode — they “pull” messages. For example, systems such as RabbitMQ and ActiveMQ use a
pull model as their primary method for delivering messages. This means that consumers are required
to poll for new messages at regular intervals.
This has the advantage of allowing services to consume messages at their own pace and not get
overwhelmed. This is why message queues are often used for point-to-point connections (where
messages are queued for a single worker or service task).
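As a rough illustration of the pull model, here is a minimal polling loop against Amazon SQS using boto3. The queue URL is a hypothetical placeholder, and `process_order` stands in for whatever business logic handles each message; long polling lets the consumer fetch work at its own pace.

```python
import boto3

sqs = boto3.client("sqs")
# Hypothetical queue URL
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/order-processing-queue"

def process_order(body: str) -> None:
    print("processing order:", body)  # stand-in for real business logic

while True:
    # Long-poll: wait up to 20 seconds for up to 10 messages, then process at our own pace
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
    )
    for message in response.get("Messages", []):
        process_order(message["Body"])
        # Deleting the message acknowledges that it was processed successfully
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
```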
Event streaming platforms such as Kafka can handle very high volumes of events, allowing the system
to scale up as the event load increases. Discrete messages, on the other hand, typically come with
a lot of overhead (like establishing and tearing down connections or verifying identities every time a
message is sent). This overhead makes handling a large volume of messages a challenge.
Examples of sources that generate a huge volume of messages include trading platforms, user
analytics platforms, software telemetry, and IoT tracking systems. Event streaming platforms like
Kafka also make it easier to perform stateful stream processing and are designed to work in tandem
with stream processing frameworks such as Quix Streams, Kafka Streams, ksqlDB and Apache Flink.
Stream processing is closely related to event streaming, but we’ve kept them separate because you
don’t necessarily need dedicated stream processing tools to process event streams. However, more
often than not, they are used together to fulfill real-time analytics use cases.
Real-time stream analytics often involves identifying value changes over time in a stream of time
series data. In particular, window-based aggregation is often used to analyze and organize data in
an efficient manner. Common forms of window-based aggregation include: tumbling windows, which
focus on non-overlapping time intervals, and sliding windows, which look at a continuously adjusting
range of time.
Classic examples of windowed aggregation are the 20-period and 50-period simple or exponential
moving averages you might see on stock trading charts.
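To make the distinction concrete, here is a small, self-contained sketch in plain Python (the readings are made up) contrasting a tumbling-window average with a sliding-window average over a stream of timestamped prices.

```python
from collections import defaultdict, deque

# (timestamp in seconds, price) pairs arriving as a stream; values are illustrative
events = [(0, 100.0), (12, 101.5), (31, 99.8), (45, 102.2), (61, 103.0), (75, 101.1)]

# Tumbling window: non-overlapping 30-second buckets, one average per bucket
buckets = defaultdict(list)
for ts, price in events:
    buckets[ts // 30 * 30].append(price)
tumbling = {start: sum(p) / len(p) for start, p in sorted(buckets.items())}
print("tumbling averages:", tumbling)

# Sliding window: for each event, average over the trailing 30 seconds
window = deque()
for ts, price in events:
    window.append((ts, price))
    while window and window[0][0] < ts - 30:
        window.popleft()  # drop readings older than the window
    avg = sum(p for _, p in window) / len(window)
    print(f"sliding average at t={ts}: {avg:.2f}")
```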
An event store is a specialized storage system designed to persist events, which are immutable
records of state changes in a system. Events are stored in the order they occur, allowing systems
to reconstruct the current state of an entity by replaying its events. This approach contrasts with
traditional databases that store the current state of an entity directly. Many consider Apache Kafka
to be a sufficient event store on its own, whereas others prefer to use more specialized solutions
such as EventStoreDB or AxonIQ.
In any case, an event store is required to implement the event sourcing design pattern. Simply put,
event sourcing involves the reconstruction of state by replaying events (rather than maintaining
and updating state in a traditional database). It also involves maintaining multiple materialized
views of the current state for different systems to read efficiently. Setting up this pattern with a
“generalist” system such as Apache Kafka can involve more work than using a specialized tool such
as EventStoreDB.
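The core idea of “reconstructing state by replaying events” can be sketched in a few lines of plain Python. The event names and fields below are illustrative and not tied to any particular event store.

```python
# An append-only log of immutable events for one order (illustrative data)
events = [
    {"type": "OrderCreated", "order_id": "A1", "items": ["brake pads"]},
    {"type": "ItemAdded", "order_id": "A1", "item": "oil filter"},
    {"type": "OrderShipped", "order_id": "A1", "carrier": "DHL"},
]

def apply(state: dict, event: dict) -> dict:
    """Fold a single event into the current state of the order."""
    if event["type"] == "OrderCreated":
        return {"order_id": event["order_id"], "items": list(event["items"]), "status": "created"}
    if event["type"] == "ItemAdded":
        return {**state, "items": state["items"] + [event["item"]]}
    if event["type"] == "OrderShipped":
        return {**state, "status": "shipped", "carrier": event["carrier"]}
    return state

# Replaying the log from the beginning reconstructs the current state
state = {}
for event in events:
    state = apply(state, event)
print(state)  # {'order_id': 'A1', 'items': ['brake pads', 'oil filter'], 'status': 'shipped', ...}
```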
For each component, the other cloud providers offer roughly equivalent products, but there are also
open source vendors who specialize in a subset of these components, such as event streaming and
stream processing.
With that in mind, let’s take a look at the landscape of middleware EDA technologies with a focus on
the built-in offerings that the major cloud vendors provide.
The following summary organizes these technologies in a more structured manner. Aside from the
“big three” cloud providers, we’ve also included a hybrid option where event streaming is outsourced
to a more specialized vendor such as Quix.
• Event bus. AWS: Amazon EventBridge. Google Cloud: Eventarc, Google Pub/Sub. Azure: Azure Event Grid. Hybrid/open source: Red Hat Fuse (Apache Camel + Apache ActiveMQ), running in Red Hat OpenShift Dedicated on AWS.
• Pub/sub tool. AWS: Amazon SNS. Google Cloud: Google Pub/Sub. Azure: Azure Service Bus. Hybrid/open source: Apache Kafka, running in Redpanda.
• Message queues. AWS: Amazon SQS. Google Cloud: Google Tasks. Azure: Azure Queue Storage. Hybrid/open source: Apache ActiveMQ (in Red Hat Fuse), running in Red Hat OpenShift Dedicated on AWS.
• Event streaming. AWS: Amazon Kinesis, Amazon MSK. Google Cloud: Google Pub/Sub (StreamingPull API). Azure: Azure Event Hubs. Hybrid/open source: Apache Kafka, running in Redpanda.
• Serverless compute (for microservices). AWS: AWS Lambda, AWS Fargate. Google Cloud: Cloud Functions, Cloud Run. Azure: Azure Functions, AKS with Virtual Nodes. Hybrid/open source: Quix Serverless Containers (C#, Python), with AWS Lambda functions for lighter workloads.
We’ve also proposed alternative technologies for some of the parts that Quix does not cover
(such as running Red Hat Fuse as an event bus and message queue). Red Hat Fuse integrates well
with AWS and allows you to better integrate legacy applications with multiple protocols and ingest
those events into Kafka.
With that said, let’s walk through each of the components and consider options that fall into one of
two categories:
1. Cloud-native — this means using products offered by one of the major cloud providers such as
Amazon, Google and Azure. For this scenario, we’ll focus on AWS.
2. Open source — this means using products that include open source software such as Apache
Kafka and Apache ActiveMQ. To save Eagle Auto Parts the headache of managing these systems,
we’ll focus on managed offerings such as Confluent Cloud and Red Hat Fuse.
Event buses
Suppose that Eagle Auto Parts uses Salesforce as a CRM system and SAP as an ERP component.
They will need to propagate changes in those systems (such as customer address details) to other
parts of their architecture so that data is consistent.
In this scenario, Eagle Auto Parts uses products that are provided by their cloud vendor AWS.
They examine the tradeoff between cost and potential vendor lock-in versus short-term convenience
and ease of use.
• In this case, they might use Amazon EventBridge to route events from Salesforce (such as
address changes) to other subscriber systems that need to know the customer’s address.
• Amazon EventBridge has native integrations with Salesforce and SAP, so configuring
communication between these components does not require the setup of any extra
serverless functions.
Now let’s look at a scenario where Eagle Auto Parts looks for long-term flexibility.
Eagle Auto Parts decides to use multiple vendors so that they are not exclusively reliant on AWS.
They want to have more composable components that they can easily migrate to another system if
they choose to do so. They also believe that a hybrid setup will give them more leverage in controlling
their long-term costs.
In this case, they might use Red Hat Fuse which includes Apache Camel for integration (using
enterprise integration patterns) and ActiveMQ for messaging. This can be deployed to AWS as part
of the Red Hat OpenShift Dedicated platform.
• Again, custom integration work is unnecessary because they can use Red Hat Fuse’s pre-built
connectors for both Salesforce and SAP to bring in data.
• They can also use Fuse’s visual tools to design integration flows, defining how data is
transformed, mapped, and routed between Salesforce and SAP.
• To transmit events, they would use ActiveMQ which can be used both as a pub/sub system and
a message queue.
Since Eagle Auto Parts is already in the AWS ecosystem, it might be convenient to use Amazon SNS.
Nevertheless, we’ll also look at how this would work in a hybrid scenario.
• If Eagle Auto Parts is already using Amazon EventBridge to listen for changes from the fulfillment
systems, then they can route specific events to an SNS topic and have different AWS Lambda
functions respond to the message in different ways.
If Eagle Auto Parts plans to use Kafka for event streaming, they can also use it as a standard
pub/sub tool. In this case, they could use Confluent Cloud in combination with Quix Services.
Since Quix and Confluent have a native integration, they might find it easier to use Quix as the
serverless function provider (meaning there is less work involved in connecting to Kafka).
In this case:
• Confluent Connectors would be used to listen for events from the fulfillment center and write
them to Kafka topics.
• A Quix serverless function would filter events that need to be sent to customers, then use the
Quix Twilio connector to trigger each SMS. Similar triggers can then be set up for other providers
such as SendGrid.
This use case will involve stream processing, but for now, let’s just focus on the stream transport part.
• Eagle Auto Parts can opt to use Amazon Kinesis Data Firehose to ingest the clickstream data so
that it can be read by other systems (mainly Kinesis data applications).
• To get a detailed stream of events, Eagle Auto Parts would likely use an AWS web beacon server
since it integrates nicely with other AWS products.
• Each activity would need to be categorized and written into a different stream which would then
be divided into shards for horizontal scalability and efficiency.
Other cloud-native options: Amazon MSK, Google Pub/Sub’s StreamingPull API, Azure Event Hubs
Although convenient, Amazon Kinesis is a proprietary solution that isn’t quite as powerful as Kafka
(for more details on why, see our Kinesis vs Kafka comparison). Also, if Eagle Auto Parts wants to
change providers, any related application code will be harder to migrate since it’s highly specific to
AWS technologies.
For this reason, they might choose the hybrid option instead:
• They could use one of the various connectors offered by Quix or Confluent to produce the raw
clickstream data into a Kafka topic (again, Confluent Cloud and Quix would be good options here
for hosting and managing Kafka).
• Again, each clickstream activity would need to be categorized (page visits, button clicks, cart
updates), but this time they would be written into different Kafka topics, which can then be
partitioned for horizontal scalability and efficiency.
• However, if Eagle Auto Parts opted for Quix, they could also use the Quix Streams library to
partition the clickstream data into separate keyed streams for more efficient parallel processing.
• In fact, Quix has a project template that demonstrates real-time click stream analysis on a retail
dataset. You can find it on our project templates page.
• If Eagle Auto Parts were to use Amazon Kinesis Data Firehose for ingesting the data, then it would
make sense to pair it with Kinesis Data Analytics to extract insights from the data and write them
to a Kinesis stream for follow up actions.
• For example, Kinesis Data Analytics could leverage Flink’s CEP library to listen for hesitancy
signals (such as a user repeatedly placing a product in the shopping cart then removing it again)
and then trigger a function that would display an appropriate offer.
• Note that a solution where the ingestion, processing and event production are managed as separate
components can lead to a somewhat fragmented codebase. Often such a pipeline is built with
a mixture of languages (i.e. Java, SQL) with each component managed in a separate repository.
This can make the CI/CD aspect of the application fairly complex to manage (more on that later).
Other cloud-native options: Google Dataflow (with Apache Beam), Azure Stream Analytics
In the previous event streaming section, we covered the scenario where Eagle Auto Parts uses
Kafka to handle stream transport. In this case, it would be simpler for them to use a service that
integrates tightly with Kafka, such as Kafka Streams (Java) or Quix Streams (Python, C#). By pairing
Quix Streams with Quix serverless containers, they can also centralize the code for consuming,
processing and producing data as well as triggering external events (such as API calls to other
systems) which simplifies CI/CD.
• A Quix service consumes data from the Kafka topic that stores the raw clickstream data and
leverages the dedicated Quix Streams library along with external CEP libraries to process the data
and write the results back to a downstream Kafka topic (the underlying consume, process and
produce loop is sketched after this list).
• The front end listens to the events topic via a Websocket or the SignalR library and displays a
special offer when a new event is detected.
• In this scenario, all of the services are deployed and hosted within the Quix environment and
managed within the same repository. By managing stream processing and business logic in the
same place, Eagle Auto Parts can significantly simplify their CI/CD pipeline.
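Stripping away the platform specifics, the consume, process, produce loop behind the first bullet looks roughly like the sketch below. It uses the plain confluent-kafka Python client rather than the Quix Streams API, and the broker address, topic names, and the “hesitancy” rule are hypothetical placeholders.

```python
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # hypothetical broker address
    "group.id": "clickstream-offers",        # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["raw-clickstream"])      # hypothetical input topic

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    click = json.loads(msg.value())
    # Hypothetical "hesitancy" rule: the user removed an item from the cart
    if click.get("event") == "cart_remove":
        offer = {"user_id": click["user_id"], "offer": "10% off"}
        producer.produce("special-offers", key=str(click["user_id"]), value=json.dumps(offer))
        producer.poll(0)  # serve delivery callbacks
```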
Since Eagle Auto Parts is already in the AWS ecosystem, it might be convenient to use Amazon SQS.
Nevertheless, they could also save money by going for an open source solution, so we’ll look at that too.
• When a customer places an order in the shopfront (and payment has been confirmed) a serverless
function is triggered which posts a message to an SQS “OrderProcessingQueue”.
• The WMS (Warehouse Management System) at the fulfillment center has an integration that polls
the queue at regular intervals. When a new order message is detected, the system retrieves the
message, processes the order details, and prepares the items for shipment.
• As mentioned previously, the WMS notifies the rest of the system that orders have been
processed by posting a message to an SNS topic. If certain messages consistently fail, they can
be moved to a Dead Letter Queue (DLQ) in SQS for further analysis and manual intervention.
Event buses are often used to integrate legacy systems with heterogeneous message formats. A good
open source solution for this is Red Hat Fuse which runs in Red Hat OpenShift Dedicated (still hosted
on AWS). Fuse comes with ActiveMQ included, so Eagle Auto Parts can use ActiveMQ out of the box
with minimal setup required.
• Again, the WMS (Warehouse Management System) polls the queue at regular intervals, but the
integration would be tailored for ActiveMQ rather than SQS. This integration would be more
portable since ActiveMQ supports multiple messaging protocols including JMS, AMQP, and MQTT
(versus SQS which is proprietary).
• As before, when a new order message is detected, the WMS system retrieves the message,
processes the order details, and prepares the items for shipment.
• The WMS notifies other dependent systems that orders have been processed by posting a
message to a Kafka topic instead of an SNS topic. ActiveMQ also has highly customizable dead
letter queue functionality for failed messages.
Note that with the release of KIP-932, Apache Kafka will also provide native support for
message queues.
For example, Eagle Auto Parts might use CQRS to separate the handling of customer orders
(write-heavy) from product browsing (read-heavy). Event sourcing could be used to track all
changes to an order, allowing for detailed auditing and historical analysis.
In this case, the AWS ecosystem doesn’t have any products that are specifically tailored to event
sourcing as a use case. Of course, Eagle Auto Parts can use Amazon Kinesis as the backbone of
an event sourcing solution, but this would require extra development work to integrate other AWS
products (i.e., DynamoDB, S3, and Lambda functions) to get it to act as an event sourcing solution.
For example:
• When constituent systems process an order, they would emit a series of events that track the
progress of the order such as OrderCreated, OrderProcessed, OrderShipped, OrderDelivered.
• Using a combination of Kinesis and Lambda functions, Eagle Auto Parts could create read-optimized
views which offer a current snapshot of the latest state (a rough sketch of such a Lambda handler
follows below).
• When a system needs to know the current order state, it can query the relevant table in DynamoDB.
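As promised above, here is a rough sketch of that extra development work: a Lambda function wired to the Kinesis stream folds each order event into a DynamoDB item that always holds the latest state. The table name, event fields, and stream wiring are hypothetical.

```python
import base64
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("OrderCurrentState")  # hypothetical read-optimized table

def handler(event, context):
    """Triggered by a Kinesis event source mapping; one invocation per batch of records."""
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # e.g. {"order_id": "A1", "type": "OrderShipped", "occurred_at": "..."}  (hypothetical shape)
        table.update_item(
            Key={"order_id": payload["order_id"]},
            UpdateExpression="SET order_status = :s, updated_at = :t",
            ExpressionAttributeValues={
                ":s": payload["type"],      # latest event type becomes the current status
                ":t": payload["occurred_at"],
            },
        )
```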
Other cloud-native options: Google Dataflow + Google Datastore, Azure Stream Analytics + Azure
Cosmos DB
There are several open source products that are tailored for event sourcing. While Apache Kafka
is often cited as one of them, it does not specialize in event sourcing. Like Kinesis, it’s designed to
cover a wider range of use cases. More specialized tools such as EventStoreDB and AxonIQ offer
managed solutions that simplify the process of implementing event sourcing. For example, Eagle
Auto Parts could deploy a managed version of EventStoreDB to AWS so that they can focus their
efforts on the event processing logic.
• The relevant systems would still emit events such as OrderCreated, OrderProcessed,
OrderShipped, OrderDelivered. However, each system would use EventStoreDB’s client libraries
to write events to the appropriate streams which are optimized for specific aggregation tasks.
• The system would use EventStoreDB’s projections feature to create read-optimized views of the
data. Projections are a native feature in EventStoreDB, whereas creating read-optimized views in
DynamoDB would require manual design and indexing.
• EventStoreDB also includes the concept of “catch-up” and “persistent” subscriptions, which
allow other systems to receive notifications when new events are appended to a stream. This is
a powerful feature for event sourcing, as it enables other parts of the system to react to events
in real time. For example, when EventStoreDB receives a new delivery event, a downstream
application immediately reads that event and performs some kind of followup action such as
sending an email notification to a customer.
Bear in mind that there are still differing opinions about the necessity of having a separate event
store if you are already using Apache Kafka. If you would prefer not to use another third-party
system for event sourcing, Confluent has plenty of educational resources to help you implement
event sourcing directly in Kafka.
Serverless CaaS products, such as AWS Fargate or Quix Services, are better for longer-running,
computationally intensive processes or processes with variable or unpredictable workloads.
For example, they’re generally better for any microservice that requires stream processing.
The “cloud-native vs open source” dichotomy isn’t as easy to apply here since serverless functions
and containers are designed to run on cloud infrastructure and integrate seamlessly with other
products by the same cloud provider.
Of course, there are plenty of open source serverless frameworks (such as OpenFaaS, the
Serverless Framework, and Apache OpenWhisk), but these require extra integration work to set up.
Thus, in this case, it wouldn’t be advisable for Eagle Auto Parts to implement an open source
serverless framework instead of using an “out of the box” serverless product unless they were
extremely concerned about vendor lock-in.
However, in a semi-open source, “hybrid” scenario, Eagle Auto Parts could use AWS Lambda
functions for short-lived, simple processes and Quix Services for more computationally intensive
processes or processes that need to integrate tightly with Apache Kafka.
Other cloud-native options: Google Cloud Functions, Google Cloud Run, Azure Functions,
AKS with Virtual Nodes
The event stream of an auction system: item placement, item bidding and processing: Source: Confluent.io
At the time, the focus was on comparing FaaS with “native” stream processing, where the stream
processing logic ran in a virtual machine or Docker container. But since then, serverless “Containers
as a Service (CaaS)” have evolved to address some of the weaknesses of FaaS while still retaining
the benefits of the serverless model. Thus, when we use the term “serverless”, we are referring to
both functions and containers.
Execution roles as described in “Kinesis Streams — Using AWS Lambda to Process Kinesis Streams” — Medium.com
There are no guarantees about invocation or container reuse in FaaS, meaning that stream
correctness cannot be guaranteed beyond a single batch invocation.
The performance characteristics of FaaS, such as latency and throughput, can also be at odds with
the requirements of real-time stream processing. Batching records together can overcome some
latency issues, but this approach may not be suitable for all use cases. In fact, when working with
AWS Lambda and Kinesis, processing messages in small batches as they arrive can lead to more
Lambda invocations and increased costs. Additionally, the lack of stream monitoring means that
if the stream stops receiving new data, Lambda will have nothing to analyze, and this often goes
unnoticed as it doesn’t produce errors.
As mentioned previously, FaaS is more suitable where processing is atomic (stateless), and reliable
latency is not a concern. However, these conditions may not always align with the demands of event
streaming, even when using products from the same vendor.
Managing state in a serverless environment presents unique challenges, especially when dealing
with event streams. Even though AWS Lambda and Amazon Kinesis are both Amazon products,
the integration between them does not automatically resolve these fundamental challenges.
This is why Kinesis Data Analytics is often used alongside Lambda functions for more heavy-duty
stream processing. Even then, ensuring temporal guarantees of stream-table joins and event time
correctness requires careful design and consideration despite product compatibility.
As you might have noticed in the diagram from the previous section, Kinesis has granular permissions on
who can create or update a stream, or who can create or update a Kinesis Data Analytics application.
These permissions can be managed through different access control models such as tag-based and
conditional access control, or service-based roles. This is a good thing for security, but it can get
fairly complex when you are combining different AWS services.
Fragmented Codebases
Another example is event triggers with AWS Lambda functions. Kinesis Data Analytics is designed
to work easily with Flink’s high-level SQL API, so you define your stream processing logic with SQL
queries (much like Confluent’s ksqlDB). However, this means that the stream processing logic is
decoupled from the business logic that needs to respond to insights gleaned from stream processing
(such as sending a fraudulent activity alert). The stream processing done by Kinesis Data Analytics
needs to send its results to Kinesis Data Firehose, which can then buffer the streaming data for a
Lambda function that would then respond to specific “anomaly” events.
This can lead to a complex patchwork of resources that all need specific permissions and environment
variables for continuous deployment to work correctly. And given the mixture of programming
languages (for example SQL for stream processing and PHP for business logic), the code for the
microservice might live in different repositories which also makes CI/CD more challenging.
Cloud-native tools can be complex to work with because their product suite is so wide. It can be difficult
to provide an integrated developer experience for such a complex ecosystem. Niche, unified products
on the other hand, can afford to invest more time in ensuring their components integrate nicely.
For example, it would be more convenient to have your stream processing logic and business
logic from the same codebase, and have all the required resources (message broker, serverless
containers, environment variables) encapsulated in one single workspace, so that you only need one
set of permissions to access them. This is how Quix is designed, with maximum convenience for the
developer in mind.
Quix occupies a specific niche and has a deeper, more unified developer experience
All microservices that are responsible for processing or acting on event streams can be managed
and deployed from one repository which is linked to the workspace. The entire pipeline and flow of
data can also be managed as code in the form of a YAML-based workflow file.
For more details on how you can manage pipelines in Quix and other similar tools, see the article
Real-time infrastructure tooling for data scientists. Although the information is aimed at data teams,
it’s just as applicable to software engineers trying to manage complex stream processing pipelines
that power event-driven microservices. What’s particularly interesting is the trend towards unified
systems that foster better collaboration between data practitioners and software engineers by
allowing them to work in the same codebase.
ksqlDB is an event streaming database purpose-built for stream processing applications. Rather than
using separate systems for processing streams (such as Kafka Streams or Kinesis Data Analytics)
and for serving traditional queries, ksqlDB merges these layers into one, simplifying your architecture
and application code.
If we revisit our previous e-commerce scenario, here’s how Eagle Auto Parts might use Confluent
Cloud and ksqlDB.
Event streaming
Eagle Auto Parts can use Apache Kafka in Confluent Cloud as the main event streaming component
for their EDA. For example, they can ingest clickstream data into Kafka topics and have it partitioned
by activity type or User ID. In fact, Confluent has a tutorial for this use case using ksqlDB and
Grafana to create an analytics dashboard.
Confluent itself doesn’t offer any serverless containers or functions, but ksqlDB can take care of
some of the processing logic that serverless functions might normally handle:
• Producing and consuming: ksqlDB integrates natively with Kafka in Confluent Cloud, so it’s very
easy to consume data from Kafka and produce data back to Kafka.
• Data processing: Using ksqlDB, you can write SQL-like queries to process Kafka topics in real
time. For example, you can create a persistent query that reads from the PageViews topic,
aggregates the views for individual product pages, and writes the results into a new Kafka topic.
• Business logic: You’ll have to run this somewhere else. But your microservices can use ksqlDB
as their database. For example, an Inventory microservice could create a pull query to get the
current product popularity trends, cross-reference that data with current stock levels for the
product, and send internal notifications for any popular products that are low in stock.
Thus, ksqlDB can help to simplify the handling of stream processing and speed up the development
of Eagle Auto Parts’ EDA. It’s also fully-managed and serverless in Confluent Cloud, scaling up and
down automatically to match query demand, and removing the operational burden.
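As a loose illustration of the “persistent query” mentioned above, such a statement could be submitted to ksqlDB’s REST interface from Python. The endpoint, stream, and column names are hypothetical, and the exact ksqlDB syntax may differ between versions.

```python
import requests

KSQLDB_URL = "https://my-ksqldb.example.com:8088/ksql"  # hypothetical endpoint

# Persistent query: continuously aggregate page views per product into a new table/topic
statement = """
    CREATE TABLE product_page_views AS
      SELECT product_id, COUNT(*) AS views
      FROM pageviews
      WINDOW TUMBLING (SIZE 1 HOUR)
      GROUP BY product_id
      EMIT CHANGES;
"""

response = requests.post(KSQLDB_URL, json={"ksql": statement, "streamsProperties": {}})
response.raise_for_status()
print(response.json())
```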
Two things to keep in mind though: Eagle Auto Parts will have to provision their own infrastructure for
Confluent Cloud to run in, and they’ll still need to run their microservices outside of Confluent Cloud
(e.g., in Amazon ECS or AWS Lambda functions). This caveat unfortunately reinforces the separation
of codebases that we discussed earlier. The data-oriented logic (producer and consumer code,
data processing code) will need to be versioned and deployed separately from the business logic
and external connectors (code that triggers events derived from data processing, or code that pulls
external data to enrich event streams).
Again, here’s how Eagle Auto Parts might use Quix for both event streaming and event processing.
Event streaming
Using Quix’s managed and hosted version of Kafka as the event streaming system, Eagle Auto Parts
can ingest the clickstream data using a Quix Service running the Quix Streams library. This service
might also include connector code for a data collection platform such as Snowplow. The Quix Streams
library allows Eagle Auto Parts to partition different activities or user IDs into individual streams for
more efficient processing.
Quix offers serverless containers in a manner similar to AWS Fargate or Google Cloud Run, but more
suitable for long-running services:
• Producing and consuming: When you create a microservice, you can use the Quix Streams native
library (Python and C#) to interface with any Kafka instance (running out of the box in Quix, in
Confluent Cloud, or your own hosted server).
• Stream processing: Use the Quix Streams library in combination with any Python or C# library
you like to process streaming data in a stateful or stateless manner, then send the results back to
Kafka via the Quix Streams producer API.
• Business logic: You can use any Python or C# library to route events or trigger downstream
services, such as placing messages in queues or triggering webhooks.
Eagle Auto Parts can combine these operations together in one Quix Service — a serverless
container that you can deploy within the Quix platform, with a similar UI to what you might use to
edit and deploy a Lambda function or Google Cloud Function.
This makes Quix one of the only serverless solutions that can host all the operations for a
microservice (that leverages stream processing) in one place.
Another advantage is that Eagle Auto Parts won’t have to provision any infrastructure. The platform
is hosted and managed by Quix. However, if they wanted to run the Quix Platform in their own cloud
provider (such as AWS) they can choose the Quix BYOC (Bring Your Own Cloud) offering. One caveat
here is that Quix natively supports Python and C# only.
If Eagle Auto Parts had existing services running in languages such as Java or PHP, they can
still deploy them as Docker containers into Quix, they just won’t be able to leverage the built-in
integration between Quix and Kafka or use the online IDE in the Quix Portal.
Now that we’ve examined the strengths of these two systems, let’s see how they can be integrated
into a hybrid EDA that also uses standard cloud-native components.
This would avoid the temptation to use any particular solution as a “one-stop shop” and exploit the
various strengths of the different types of EDA components we have discussed.
Again, this architecture is intentionally overcomplicated because it’s designed to illustrate where
you could potentially use every EDA component. A cleaner architecture diagram might put Kafka at
the center with no transactional middleware or event store, rather pushing these roles to specific
microservices. Other architectures might put cloud native products at the center. For example,
a system with lower volume might use a standard database such as Amazon’s DynamoDB or Aurora
to persist events and then EventBridge to manage routing of events — no dedicated event streaming
or event store components are required.
As the saying goes, when it comes to choosing the right components and technologies “it all
depends on your requirements”. But it’s hard to know where to start, which is why we’ve created a
decision tree to help guide your decision making.
• If you answered yes to either of these criteria, an event bus might be useful. For example: if you
have messages with different structures you can easily route them to the various consumers
that need them rather than explicitly sending them to different queues or topics. For example,
some systems might have different representations of a customer entity. You can route messages
containing customer data based on the presence of a certain field.
• Event buses can be also useful for sending data to external systems such as Salesforce or
Zendesk. This is because they can adapt or transform events to be compatible with the protocols
or data formats required by external systems. This means that instead of individual services being
responsible for formatting or routing their output to multiple consumers, the event bus can handle
these responsibilities.
• For example, in Amazon EventBridge, you can communicate over HTTP(S) and take advantage
of EventBridge’s built-in retry and rate limiting features. This means you can trigger any external
system that supports webhooks without having to create an extra Lambda function for this
explicit purpose.
• Some pub/sub products (such as Amazon SNS, Firebase Cloud Messaging, and Azure
Notifications Hub) allow you to send notifications intended to be consumed by humans without
needing any extra integrations. Such notifications are typically sent as text messages or mobile
push notifications. However, not all pub/sub technologies have this capability and many are
intended for a wider range of use cases (e.g., Google Pub/Sub or IBM MQ).
• But regardless of whether they support application-to-person (A2P) messaging, pub/sub tools are
great for broadcasting a single message to multiple subscribers. Each of these subscribers can
be different types of consumers (SQS queues, Lambda functions, SMS services, email services,
and so on).
• Although event streaming technologies also fit the pub/sub model and can be used for the
same purpose, they are more suitable for events that can be derived from stream processing
(an example of this kind of “derived” event would be detecting a “death cross event” using
windowed moving averages on stock prices and then triggering a sell order on a trading platform).
• While it’s very difficult to quantify what would count as “high-throughput”, a rule of thumb might
be in the realm of several megabytes (or even gigabytes) per second, depending on the specific
use case, latency requirements, and other factors. A provider’s pricing model also plays a role in
determining whether event streaming is more worthwhile. Above a certain throughput threshold,
an event streaming product may be cheaper to use than a pub/sub product. For example,
according to Software Developer Maciej Radzikowski, once you get over 200 messages per minute,
Kinesis Data Streams becomes cheaper than Amazon SQS or SNS in terms of monthly spend.
• Another concrete criterion is whether you need to do any complex processing on the data.
While many stream processing tools are capable of processing data from a wide range of sources,
including files, databases, and traditional message queues, they are generally optimized to
integrate seamlessly with event streaming platforms. Thus, while you can do event streaming
without stream processing (such as routing log output to another destination), it’s more complicated
to do stream processing without streams (we cover why you might need stream processing in
more detail further down).
• Although a simple pub/sub system could fulfill this requirement too, a stream processor coupled
with Kafka is often the better choice where diverse consumers require varying subsets of messages.
At its core, Kafka is designed around topics and partitions, allowing for granular categorization
of data streams. Its consumer groups feature ensures different applications can interpret the
same topic differently, catering to their specific needs. This becomes particularly powerful when
integrated with stream processing tools like Quix Streams, which can perform real-time, intricate
operations on data such as filtering, transformations, and aggregations. Moreover, Kafka’s
log-based storage design ensures messages are stored sequentially and immutably, granting
consumers the flexibility to process data from different points in a stream. Coupled with fine-grained
offset control, consumers have precise control over which messages to process and when,
tailoring their consumption to their unique requirements.
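To illustrate the consumer group and offset behavior described in the point above, here is a brief sketch using the confluent-kafka client; the broker address, topic, and offset values are hypothetical.

```python
from confluent_kafka import Consumer, TopicPartition

# Two services read the same topic independently by using different group IDs
analytics = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "analytics"})
billing = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "billing"})
analytics.subscribe(["orders"])
billing.subscribe(["orders"])

# Fine-grained offset control: re-read partition 0 of "orders" starting from offset 1000
replayer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "replay-job"})
replayer.assign([TopicPartition("orders", 0, 1000)])
msg = replayer.poll(5.0)  # each consumer progresses at its own pace
```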
Message queues are optimal for situations with unstable or frequently disconnected consumers due
to several key attributes:
• They ensure persistence so messages aren’t lost if a consumer disconnects. Their retry
mechanisms automatically attempt to redeliver unprocessed messages.
• They decouple producers from consumers, allowing each to operate independently, and provide
back-pressure handling by buffering messages.
• Moreover, they can ensure ordered processing and support dead letter queues to manage
messages that repeatedly fail processing. Overall, these features make message queues resilient
and robust in handling consumer instability.
Note that for this criterion, the emphasis is on the current state rather than longer-term state.
Event streaming tools such as Apache Kafka are also suitable for systems that need to maintain
state, but they might be overkill if you don’t need to keep a historical record of state changes.
For systems emphasizing current state, it’s often crucial to consume, act upon, and discard or
acknowledge messages promptly. Traditional message queues are tailored for this workflow.
Once a message is consumed and acknowledged, it’s removed from the queue, which helps in
ensuring that only the latest state is worked upon. Message queues can also prioritize messages,
which is important when updates to the current state are of varying urgency. Lastly, in state-centric
systems, if a particular update cannot be processed due to an error, it’s vital to have mechanisms
to capture these anomalies for later investigation or recovery. Thus, many message queue systems
have built-in dead letter queues to handle this scenario.
This pattern is otherwise known as point-to-point communication where a single sender (producer)
sends a certain type of message to a single receiver (consumer). Message queues are perfect
for this use case due to their inherent attributes. For instance, message queues ensure exclusive
consumption, meaning a message, once processed, is reserved for only one receiver even with
multiple listeners. They offer flow control mechanisms, buffering messages if sent too swiftly for
the consumer, ensuring controlled communication. These queues also bolster reliability through
acknowledgment protocols, guaranteeing message delivery even after disruptions. They can assure
the order of delivery, which is vital for sequential messaging. Above all, their simplicity, compared
to complex systems like pub/sub, makes them ideal for direct communications. For example, an
e-commerce platform might have an old warehouse management system that doesn’t support
modern event-driven architectures but can interact with a simple message queue. When an order is
placed, a message might be sent directly to a queue that this legacy system polls to know when to
ship a product.
Stream processing tools like Flink and Kafka Streams are great for this use case since they have
native support for processing events based on their actual occurrence time (event time), or the
time they arrive at the system (processing time). For example, consider a stream of sensor readings
from a network of IoT devices. If one device’s readings are delayed, using event time allows the
processing system to account for that delay and process the readings in the actual order of
occurrence, rather than the order of arrival. Additionally, when coupled with Apache Kafka, these
systems can leverage Kafka’s ordering guarantees. In Kafka, if events are placed in the same
partition, their order is preserved. Consider a chat application that needs to process chat messages
from a single conversation thread in the correct order. By storing the messages in the same Kafka
partition, that application can ensure that chat messages are processed in the order they were sent.
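A minimal sketch of that ordering guarantee: keying messages by conversation ID routes every message from the same conversation to the same partition, so consumers see them in send order. The broker, topic, and message fields below are hypothetical.

```python
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # hypothetical broker

def send_chat_message(conversation_id: str, sender: str, text: str) -> None:
    payload = json.dumps({"conversation_id": conversation_id, "sender": sender, "text": text})
    # Messages with the same key hash to the same partition, preserving their relative order
    producer.produce("chat-messages", key=conversation_id, value=payload)

send_chat_message("conv-42", "alice", "Did the brake pads ship?")
send_chat_message("conv-42", "bob", "Yes, they went out this morning.")
producer.flush()
```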
The notion of “low latency” is a little hard to quantify, and you’ll find various definitions of it in the wild.
For instance, Confluent field CTO Kai Waehner has described low latency as processing low or high
volumes of data in ~5 to 50 milliseconds end-to-end. In the world of Formula 1 (where Quix originated)
people talk of latencies between 10 to 15 milliseconds. Outside of IoT, examples of systems that have
a low latency requirement include online games, financial trading systems, and ad serving platforms.
In all these cases, you need a specialized system to process events at this speed. Apache Kafka
itself is designed for high-throughput and low-latency message delivery, and pairing it with a
stream processing framework such as Apache Flink, Kafka Streams or Quix Streams allows for an
end-to-end low-latency solution.
A classic example of continuous stream processing with rolling time windows is the continuous
calculation of the exponential moving average on a stock’s price (i.e. average bids in the last “X” hours,
minutes or seconds). You might assume that all stream processing is “continuous” stream processing,
but there’s a subtle difference. Stream processing is a broader term that covers the act of taking
data in the form of streams (i.e. continuously produced data) and processing them. The processing
might be real-time, near real-time, or even with some delay (such as basic log processing pipelines).
When talking of “continuous” stream processing, we mean that there is minimal delay between the
input stream and the output stream. This could be vital in contexts like real-time analytics, monitoring,
fraud detection. This is very similar to the low-latency requirement though not quite as strict.
Here, the nuance is in the amount of “real-timeyness” that you need. For example, embedded.com
distinguishes between “soft real-time” and “hard real-time”, as illustrated in this diagram:
Event sourcing’s principle of separating writes (events) from reads (projections) aligns naturally
with the need to have different views or models based on the same data. For instance, an
e-commerce platform could use one view to render a customer profile (using events such as
“CustomerRegistered, CustomerUpdated, OrderPlaced, ProductReviewed”) and a different view to
render a product inventory dashboard (using events such as “ProductAdded”, “ProductRemoved”,
“OrderPlaced”, “OrderCancelled”), both of which rely on events that describe the flow of orders.
For businesses operating in regulatory environments like finance or healthcare, being able to audit
or trace back operations is paramount due to compliance. Event sourcing inherently provides an
“always-on” audit log since every change is stored as an immutable event. For example, in healthcare,
you can trace all actions taken on a patient’s record over time, making for strong auditability and
data consistency. There are also other uses for capturing the entire history of every entity. It serves
as a rich domain-specific log, allowing you to revert to any point in the past for troubleshooting,
analysis, or predictions. For example, a bank could trace back and understand the different positions
its customers’ bank accounts have been in over time.
While some event buses (such as Amazon EventBridge) allow you to replay messages too, event
sourcing tools such as EventStoreDB and the Axon Framework offer much more configurability
and granularity in terms of how you can archive and replay events. Additionally, event streaming
frameworks such as Apache Kafka can also be used for event sourcing (though they require more
work to be adapted for this use case).
Since that pilot was published, Quix has announced an official partnership with Confluent Cloud,
which further extends our reach into the EDA ecosystem. This means you can leverage Confluent’s
extensive ecosystem of connectors to exchange data with external systems while taking advantage
of Quix’s serverless processing capabilities to process high-volume data streams and trigger other
microservices that depend on streaming data.
Perhaps you have gone through our scenarios and decision tree and are excited about the potential
value that event streaming and stream processing can add to your application. But you need to
convince other parts of your organization. The best way is to run a low-risk pilot with managed
technologies that won’t take too much of your time to get a working prototype up and running at no
cost to your company. Quix and Confluent Cloud can both be trialed for free, integrate seamlessly,
and a single developer can get both systems up and running within minutes. Even if you eventually
end up using other technology providers, Quix and Confluent Cloud provide you with an easy way
to get your foot in the door and prove the business value of event streaming and stream processing
within a wider event-driven architecture.