Survey Streaming Data Future Tech Stack

Uploaded by

Ramses Vidor


Streaming Data and the Future Tech Stack

Insights from over 800 IT professionals
on their use of data stream processing
STREAMING DATA AND THE FUTURE TECH STACK

About This Report

The parallel growth trends for streaming data pipelines and container-based infrastructure combine
to address competitive pressure to deliver impactful results faster, more efficiently and with greater
agility. Streaming enables extraction of useful information from data more quickly than traditional
batch processes. It also enables timely integration of advanced analytics, such as recommendations
based on artificial intelligence and machine learning (AI/ML) models, all to achieve competitive
differentiation through higher customer satisfaction. Time pressure also affects the DevOps teams
building and deploying applications. Container-based infrastructure, like Kubernetes, eliminates
many of the inefficiencies and design problems faced by teams that must build and deploy
applications rapidly and repeatedly in response to change.

To better understand how and why data streaming is used in cloud applications and the underlying
software stack, Lightbend has partnered with The New Stack for its second survey on fast data
trends. Eight hundred and four IT professionals provided details about applications that use stream
processing at their organizations. Respondents were primarily from Western countries (41% in Europe
and 37% in North America) and worked at an approximately equal percentage of small, medium and
large organizations.

Which category most closely defines your job role?

Developer / software engineer 51.1%

Architect 25.9%
IT management, including CIO / CISO / CTO 7%

Data engineer 6.7%

Data scientist 5.8%

DevOps 1.7%

Other 1.7%

At your organization, what percent of systems or applications
currently use stream processing technology?

Don’t know or N/A 14%
0% 12%
1-10% 36%
11-25% 20%
26-50% 10%
> 50% 8%

Any use of stream processing (1-10% through > 50% combined) 74%

The report asked in-depth questions of the 74% of respondents who have applications that use
stream processing technology.


Table of Contents
Executive Summary

Artificial Intelligence and Machine Learning Overtaking Early Adopters’ Use Cases
  Unflashy app monitoring, log aggregation and ETL top current use cases
  AI/ML, IoT and ETL adoption rose the most over 2017
  AI/ML use cases expected to see the biggest increases

Early Adopters Concerned About Unknowns
  Knowledge and complexity top list of challenges
  Over 56% believe type of users prevents additional adoption

Concern About “State” Lessens as More Applications Use Stream Processing
  “State” is an obstacle
  Elasticsearch and Cassandra are central to current use cases
  Java programmers rely on JDBC and JPA for persistence
  Streaming container orchestration key to architects’ plans

Technologists Looking Beyond Kafka for Advanced Use Cases
  Kafka use is widespread
  Outside of Lightbend, few specialists have gained traction
  Private cloud data centers rarely used in conjunction with stream processing
  Early use cases affect ETL and messaging most

Lightbend’s Last Look

About Lightbend

About The New Stack


Executive Summary
Use cases for data streaming are growing in scope. Application monitoring; log aggregation; and
extract, transform, load (ETL) are still the most common use cases among IT professionals, but
the advent of containers and microservices application architecture allows these workloads to
be packaged and run in a way that provides additional business value beyond the enterprise IT
department. System architects now see container orchestrators, such as Kubernetes, as the real
pathway toward data streaming adoption within their organizations.

Scaled-out, distributed architectures are built by teams of developers whose experience dictates
which data streaming technologies they adopt in the services they build and manage. Choosing a data
streaming architecture built for microservices therefore becomes a salient decision. Akka, for example, adds
value by providing fine-grained control of the kinds of processing possible as well as maximizing the
application’s efficiency and reliability. Many stream processing use cases don’t actually require this,
but AI/ML and real-time recommendation engines often do. Further data streaming adoption will
follow as organizations’ services become more advanced and as machine learning and artificial
intelligence become more important to achieving higher business value.

However, barriers are still high for developing and managing application architectures on data streaming
infrastructure. It’s sophisticated technology that requires an understanding of how to keep a long-running
application resilient and able to scale up and down. Developer teams are adapting and trying new
workflows but it can be very risky when the impact on performance is unknown. The right knowledge for
the problem is still the biggest challenge in adoption of nascent data streaming technologies.

Containers are a mechanism to lower the barriers to entry and make data streaming valuable in
at-scale settings. The barriers will diminish further as developers become more comfortable using
data streaming in microservices applications. State management has already become less of an issue
in organizations where microservices adoption is high, because they can leverage existing, proven
design patterns. Today, handling state is complex but manageable if organizations are committed
to microservices architectures. We expect containers and microservices architecture to have a
continued impact on the evolution of data streaming — allowing organizations to take more risks and
realize the results faster than ever before.

Key Finding 01
Artificial Intelligence and Machine Learning Overtaking Early Adopters’ Use Cases
• The use of stream processing for AI/ML applications increased five-fold in two years. Those
already using streaming for AI/ML expect this trend to continue with even broader use in the
coming year.


Key Finding 02
Early Adopters Concerned About Unknowns
• Developer experience, familiarity with tools, and technical complexity are barriers to adoption.
Concern about scalability, latency and other technical challenges increases as the number of
workloads utilizing stream processing rises.

Key Finding 03
Concern About “State” Lessens as More Applications Use Stream Processing
• Persisting data in a microservices architecture becomes less of an issue as users gain more
experience with containers, microservices and modern databases. Architects see a future
where stream processing and microservices are deployed in the same container-based
infrastructure stack.

Key Finding 04
Technologists Looking Beyond Kafka for Advanced Use Cases
• While Kafka is sufficient for ETL and messaging, it faces robust vendor competition among
streaming platforms for advanced use cases such as IoT pipelines and recommendation engines.


Artificial Intelligence and Machine Learning Overtaking Early Adopters’ Use Cases
Unflashy app monitoring, log aggregation and ETL top current use cases
However, AI/ML and integration of multiple data streams are starting to rival these leaders.

In which of the following applications and use cases does your organization
utilize real-time or stream processing in production environments?

Application monitoring 48.6%

Log aggregation 40.9%

ETL 36.3%

Artificial intelligence / machine learning 33%

Systems monitoring and management 33.1%

Integration of different data streams 32.3%

Data warehouse 23.9%

Financial data 20.9%

Security and/or fraud detection 17.5%

Recommendation and decision engines 16.6%

Operational insights 16.1%

Traditional statistical analytics 16.1%

IoT pipelines 15.6%

Customer 360 / consolidated views 11.2%

Real-time personalization 10%

• Companies have embraced real-time processing of data as a way to handle machine generated
data and to more efficiently manage existing data environments.

• Application monitoring and log aggregation are the top use cases because it is essential to detect
problems quickly instead of waiting for offline analysis. Instead of storing all of the raw data
generated by modern applications and systems, the data is often aggregated and stored into time-
series databases that only store metrics that can be easily analyzed.

• Based on a question only asked of the developers, 45% of developers surveyed have experience
working with at least one streaming data application that has been deployed into production or
will be within the next six months.


• ETL, data warehousing, and recommendation and decision engines use cases are more than
twice as likely to be deployed at organizations where developers have hands-on experience
incorporating data streams into a production-ready application. ETL and data warehousing are old
problems for which streaming is now being applied. Although recommendation engines are used
less often, developers are more likely to be involved with these types of applications.
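The aggregation described above, where raw events are reduced to metrics before being stored, can be illustrated with a short sketch. It is not drawn from the survey: the function name `aggregate_by_window`, the 60-second window and the sample events are assumptions chosen for illustration, and a production stream processor would typically emit each window as it closes rather than collect all windows in memory.

```python
from collections import Counter, defaultdict


def aggregate_by_window(events, window_seconds=60):
    """Count events per (window start, log level) using fixed tumbling windows.

    `events` is an iterable of (timestamp_in_seconds, level) pairs.
    Only the per-window counts are kept; the raw events are discarded.
    """
    counts = defaultdict(Counter)
    for timestamp, level in events:
        # Align each event to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[window_start][level] += 1
    return {start: dict(levels) for start, levels in sorted(counts.items())}


# Hypothetical log events: (seconds since start, level).
events = [(0, "INFO"), (12, "ERROR"), (59, "INFO"), (61, "INFO"), (130, "ERROR")]
metrics = aggregate_by_window(events)

for window_start, levels in metrics.items():
    print(window_start, levels)
```

Each output row is the kind of compact record a time-series database would store in place of the five raw events.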

AI/ML, IoT and ETL adoption rose the most over 2017
Companies processing data in real time for AI/ML use cases jumped from 6% in 2017 to 33% in 2019 —
a more than five-fold increase.
2019 = use of stream processing in production
2017 = use of “fast data,” which was defined as processing data streams as
they arrive while still supporting batch processing.

Use case | 2017 | 2019
ETL | 14% | 36%
Artificial intelligence / machine learning | 6% | 33%
Integration of different data streams | 13% | 32%
Operational insights | 13% | 16%
Traditional statistical analytics | 17% | 16%
IoT pipelines | | 16%
Customer 360 / consolidated views | | 11%
Real-time personalization | | 10%

Only categories that are phrased exactly the same as in the 2017 survey were included in this time series chart.
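The "more than five-fold" figure is a ratio of the 2019 share to the 2017 share. As a minimal sketch of that arithmetic, using only the use cases for which both years are legible in this chart, the helper names `fold_increase` and `point_change` below are illustrative and not part of the survey:

```python
def fold_increase(before_pct, after_pct):
    """Ratio of the later share to the earlier share (2.0 means it doubled)."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return after_pct / before_pct


def point_change(before_pct, after_pct):
    """Difference between the two shares, in percentage points."""
    return after_pct - before_pct


# (2017 share, 2019 share) as reported in the chart, in percent.
shares = {
    "ETL": (14, 36),
    "Artificial intelligence / machine learning": (6, 33),
    "Integration of different data streams": (13, 32),
    "Operational insights": (13, 16),
    "Traditional statistical analytics": (17, 16),
}

for use_case, (y2017, y2019) in shares.items():
    ratio = fold_increase(y2017, y2019)
    delta = point_change(y2017, y2019)
    print(f"{use_case}: {ratio:.1f}x, {delta:+d} points")
```

For AI/ML the ratio is 33 / 6 = 5.5, which is the basis for the five-fold claim; the same calculation shows operational insights and traditional statistical analytics barely moving.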

• Production-level adoption widened dramatically, with several use cases seeing big jumps over
the last two years. The sharp rise in real-time processing for IoT pipelines, ETL and integration of
different data streams indicates that organizations need to extract insights from their data and
leverage advanced analytics (such as AI/ML) as quickly as possible.


• Adoption of stream processing for business operations was relatively stagnant, likely because
operational insights and consolidated views of customers can usually be successfully implemented
with the time lags associated with batch processing.

• Similar stagnation is seen for traditional statistical analysis, which had previously seen wide
consideration among companies that used Hadoop.

AI/ML use cases expected to see the biggest increases

Artificial intelligence is more than just speculative hype. Fifty-eight percent of those already using
stream processing in production AI/ML applications say it will see some of the greatest increases in
the next year.
Within the next 12 months, which two types of application or use cases in your
organization will see the biggest increase in their use of real-time data processing?

Artificial intelligence / machine learning 32.8%

Application monitoring 16.8%

Integration of different data streams 16.2%

IoT pipelines 11.5%

ETL 11.4%

Log aggregation 9.8%

Recommendation and decision engines 9%

Data warehouse 8.3%

Customer 360 / consolidated views 7.5%

Operational insights 7.4%

Systems monitoring and management 7%

Security and/or fraud detection 6.7%

Financial data 6.4%

Real-time personalization 5.6%

Traditional statistical analytics 2.6%

Other 1.1%

None 9.9%

• The consensus is that AI/ML use cases will see some of the largest increases in the next year.

• Not only will adoption widen to different use cases, it will also deepen for existing use cases, as
real-time data processing is utilized at a greater scale. Few organizations are outright rejecting
current use cases. Instead, they are significantly more likely to say that their current use cases will
expand the most in the next 12 months.

• In addition to AI/ML, enthusiasm among adopters of IoT pipelines is dramatic — 48% of those
already incorporating IoT data say this use case will see some of the biggest near-term growth.

Early Adopters Concerned About Unknowns

Knowledge and complexity top list of challenges
Developers need more experience picking tools and then writing code to be able to handle streaming data.

What are the top two challenges your organization faces in processing data immediately?

Developers attaining the knowledge needed to write robust and performant applications 31.3%
Complexity of integrating tools, techniques and technologies 29.9%
Choosing the right tools and techniques 26.2%
Difficulty making changes to existing solutions and infrastructure 25.5%
Integration with legacy infrastructure 24.2%
Cost of technology 12%
Finding and retaining staff with data engineering, operations or analytical skills 10.8%
Scaling to handle high data volumes 9.4%
Debugging 6.9%
We don’t face any technology-specific issues. 4.1%
Don’t know or N/A 3.7%
Other 2.3%

Respondents at organizations that don't use stream processing were not asked this question

• Effectively processing data immediately often requires developers to make broad changes to
their development and production environments, so it is unsurprising that additional knowledge
is needed.


• The second and third most commonly cited challenges are integrating tools, techniques and
technologies (29.9%) and choosing the right tools and techniques (26.2%).

• Some, but not all, of these challenges can be overcome with experience.
  - The more applications that utilize real-time data, the less often developer knowledge is cited as a
    concern. Indeed, only 24% of developers with hands-on experience incorporating streaming data
    into production applications say picking the right tools and techniques is a challenge.
  - Integration issues become a greater concern as more applications come online using different tools
    and data types.
  - Unsurprisingly, scaling to handle high data volumes was twice as likely to be cited as a challenge by
    those that had more than a quarter of their workloads comprised of stream processing.

Over 56% believe type of users prevents additional adoption

As organizations move past easy use cases, they are more likely to believe latency and scalability are
technical challenges.
To what degree are the following technical challenges preventing you
from adding more stream processing to existing workloads?

Greatly To some extent Not at all Don’t know or N/A

Type of users 25.5% 30.8% 27% 16.7%

Compute resources required 14.8% 37.4% 38.7% 9.1%

Throughput 11.9% 28.2% 47.4% 12.5%

Latency 11.7% 32.4% 43.5% 12.4%

Scalability 11.2% 30.4% 48.9% 9.6%

Only respondents at organizations that use stream processing were asked this question.
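The "over 56%" in the heading appears to be the sum of the "Greatly" and "To some extent" columns for type of users. A minimal sketch of that calculation across all five rows follows; the helper name `inhibited_share` is an illustrative assumption, and the figures are copied from the table above.

```python
# Columns: Greatly, To some extent, Not at all, Don't know or N/A (percent).
responses = {
    "Type of users": (25.5, 30.8, 27.0, 16.7),
    "Compute resources required": (14.8, 37.4, 38.7, 9.1),
    "Throughput": (11.9, 28.2, 47.4, 12.5),
    "Latency": (11.7, 32.4, 43.5, 12.4),
    "Scalability": (11.2, 30.4, 48.9, 9.6),
}


def inhibited_share(row):
    """Share saying the challenge prevents adoption greatly or to some extent."""
    greatly, to_some_extent, _not_at_all, _dont_know = row
    return round(greatly + to_some_extent, 1)


for challenge, row in responses.items():
    print(f"{challenge}: {inhibited_share(row)}%")
```

Type of users comes out at 56.3%, ahead of compute resources at 52.2%; throughput, latency and scalability each stay below 45%.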

• Many respondents were not knowledgeable enough to know to what degree technical challenges
are inhibiting adoption. However, architects believe they know about these issues as they
answered “don’t know” to these questions half as often. Architects were also twice as likely to be
greatly concerned about compute resource requirements.

• The application’s end user is a particular concern when stream processing is utilized in
applications that require non-technical teams to actively use an application. Thus, organizations
that utilize stream processing are more concerned about one type of user (DBAs) when they
have an active data warehouse use case, and another (business analyst) when streaming data is
integrated into dashboards for operational insights.


• Concern about scalability and latency increased as the number and types of workloads utilizing
stream processing rose. In particular, latency is twice as likely to be inhibiting stream processing
among organizations that are working on recommendation and decision engines or IoT pipelines.

Lack of a compelling use case is the biggest barrier to initial adoption.

What are the top two reasons your organization does not process data immediately?
Top three responses shown

No need 43%
Developers do not have the knowledge needed to write robust and performant applications. 28%
Difficulty making changes to existing solutions and infrastructure 24%
Only respondents at organizations with 0% of applications using stream processing were asked this question

Concern About “State” Lessens as More Applications Use Stream Processing
“State” is an obstacle
Overall, organizations are still concerned about how to handle state in microservices.
To what degree is handling state an obstacle to deploying
more applications within microservices architectures?

Greatly 18.6%
To some extent 48.9%
Not at all 18%
Don’t know or N/A 14.5%

• Two-thirds of respondents believe that handling state is at least partly inhibiting the deployment of
more applications within microservices architectures. This is not a permanent barrier, though.

• Only 18% of respondents believe state is not at all an obstacle to microservices adoption. However,
among those using streaming in more than half of their applications, 41% say it’s not at all an obstacle.

• Organizations that have adopted microservices are the farthest along with stream processing.
While 58% of respondents are using microservices in production, that figure jumps to 74% among
those with more than a quarter of their applications utilizing stream processing.

• The people who believe state is greatly inhibiting adoption are also those most likely to believe that
increased developer knowledge is a key challenge for processing data immediately. In fact, those that
have solved the “state” problem and say it is “not at all” an inhibitor to microservices deployment are
more than twice as likely to have more than half of their applications utilizing stream processing.


• Organizations that have yet to utilize stream processing are particularly concerned that their
developers do not have enough knowledge to write performant applications. Increased education
about methods to handle state may increase adoption.

• Based on additional survey questions, we found that organizations with a high percentage of
applications using stream processing are utilizing more persistent datasets and storage models.
This is consistent with the fact that they are less concerned about persisting state.

Elasticsearch and Cassandra are central to current use cases

Users of modern databases do not believe state is an obstacle.

Which data stores do you integrate with stream processing?

Data store | Overall | State does not inhibit more microservices adoption
Elasticsearch | 45% | 44%
Cassandra | 39% | 49%
PostgreSQL | 36% | 43%
MongoDB | 29% | 38%
Hadoop Distributed File System (HDFS) | 23% | 32%
MySQL | 22% | 21%
Redis | 22% | 30%
Oracle | 18% | 11%
Other NoSQL database from a cloud provider | 18% | 24%
SQL Server | 15% | 18%
Other | 12% | 14%
MariaDB | 9% | 14%
DB2 | 5% | 3%
SQLite | 4% | 6%


• Users of modern data stores like Cassandra, MongoDB and Redis are less likely to believe state is
inhibiting adoption of microservices.

• However, some of the most common technologies used with stream processing are deployed by
those who believe handling state is greatly inhibiting microservices adoption. On average, this
group’s production-level adoption of Apache Kafka, Apache Spark Streaming and Elasticsearch
was 36% higher than the sample as a whole.

Java programmers rely on JDBC and JPA for persistence

Java developers that use JDBC, Akka persistence or graph databases were the most likely to
understand how state impacts microservices adoption.

Which programming languages and frameworks do you regularly work with?

Java 75%
JavaScript 47%
Scala 45%
Python 32%
HTML 29%
Node.js 27%
TypeScript 23%
C# 14%
Go 14%
Kotlin 13%
Groovy 9%
C 8%
Other 6%
R 6%
PHP 6%
Ruby 6%
Swift 3%
Clojure 3%

If using Java, with which of the following ways do you handle persistence?

Java Database Connectivity (JDBC) 60%
Java Persistence API (JPA) 48%
Hibernate 41%
Spring Data 38%
File IO 30%
Serialization 20%
Other type of object-relational mapping (ORM) 19%
Object databases 16%
Akka persistence 15%
Graph databases 14%
Other 12%
Java EE Connector Architecture (JCA) 7%
XML databases 6%

• Respondents that utilize serialization often did not know if state is inhibiting microservices
adoption because an internal team often is handling the relevant infrastructure. In fact, 41% of
those utilizing serialization say they use a private, cloud-enabled datacenter with applications that
take advantage of stream processing.


Streaming container orchestration key to architects’ plans

Fifty-six percent of architects are “extremely likely” to deploy container orchestration within the next
12 months as compared to 42% of all respondents.

In the next 12 months, how likely are you to deploy stream processing
technology in the same “stack” as the following technologies?

Technology | Extremely likely | Somewhat likely | Neither likely nor unlikely | Somewhat unlikely | Extremely unlikely | Don’t know or N/A
Container orchestration (e.g., Kubernetes) | 42% | 26% | 8% | 5% | 7% | 12%
Function as a Service (e.g., AWS Lambda) | 20% | 25% | 14% | 10% | 14% | 17%

• Sixty-eight percent of respondents believe it is at least somewhat likely that stream processing will be
deployed in the same stack as a container orchestrator like Kubernetes. This does not necessarily
mean that data persistence will be addressed within a container, but rather that some component
of an application will be hosted in a container cluster.

• Utilization of Function as a Service (FaaS) is less likely to be part of the stream processing stack. This
signals that event processing, which is often associated with a serverless architecture, has not become
an essential stream processing use case. When event processing does become more prevalent, we
expect that FaaS will rise in importance as a way to handle compute resource requirements.


Technologists Looking Beyond Kafka for Advanced Use Cases
Kafka use is widespread
Apache Kafka is often used in conjunction with other stream processing technologies.
What is your experience with the following stream processing technologies?

Technology | Using in production | Evaluating or piloting | Plan to look into it | Evaluated, but no plans to use in production | No plans or N/A

Apache Kafka 48% 19% 11% 9% 13%

Apache Spark Streaming 25% 16% 16% 14% 29%

Akka Streams 23% 18% 17% 12% 30%

Other 18% 4 2 75%

Apache NiFi 9% 6% 9% 6% 70%

Apache Flink 6% 9% 17% 11% 57%

Apache Storm 6% 5% 13% 14% 62%

Apache Beam 4% 4% 13% 6% 73%

Apache Apex 8% 4 85%

Apache Samza 3 7% 6% 83%

Apache Pulsar 12% 2 83%

Apache Twitter Heron 8% 6% 84%

Only respondents at organizations that use stream processing were asked this question.

• The market has embraced Kafka because it is a robust, scalable way to capture streaming data and
serve it between applications.

• Users of Apache Kafka are 60% more likely to have Akka Streams in production. Many respondents
are likely only utilizing Kafka for messaging and not taking advantage of the Kafka Streams library.

• Akka is also being considered when the overhead of big data systems like Spark is high relative to
the amount of data processing being done.

• Recommendation and decision engines are an area where, despite initial adoption, Kafka is falling
short. In fact, while 70% of organizations with recommendation and decision engine use cases are
using Kafka, 35% of these organizations are evaluating or piloting Akka Streams. IoT pipelines are
also demanding technologies in addition to the stream storage enabled by Kafka. Apache Flink
gets more attention for this use case as compared to others, as 25% of organizations that use IoT
pipelines are evaluating or piloting Flink.

Outside of Lightbend, few specialists have gained traction

Given their huge user base, it is not surprising that cloud providers’ offerings are considered.
What is your experience with the following vendors’ stream processing offerings?

Vendor | Using in production | Evaluating or piloting | Plan to look into it | Evaluated, but no plans to use in production | No plans or N/A

AWS (e.g., Kinesis) 22% 11% 18% 12% 37%

Lightbend (e.g., Platform, Akka) 22% 11% 18% 8% 41%

Google Cloud (e.g., Dataflow, Dataproc, Pub/Sub) 11% 10% 12% 9% 58%

Cloudera / Hortonworks (e.g., DataFlow) 10% 5% 7% 9% 69%

Azure (e.g., Event Hubs, Stream Analytics) 9% 11% 9% 7% 64%

Other 7% 3 87%

Databricks 6% 10% 10% 7% 67%

MapR (e.g., Event Store) 5% 5% 5% 5% 80%

Pivotal (e.g., Spring Data Flow) 5% 5% 12% 10% 68%

WSO2 (e.g., Stream Processor) 3 5% 88%

StreamSets 6% 3 89%

Ably 94%

EsperTech 92%

Striim 93%

Only respondents at organizations that use stream processing were asked this question.


Private cloud data centers rarely used in conjunction with stream processing
Consistent with its overall market penetration, AWS is used to some extent by two-thirds of
organizations for production applications that have stream processing components.

Which of the following cloud platforms does your organization use at least in part
for production applications that have stream processing components in them?

Amazon AWS 68%

Microsoft Azure 27%

Google Cloud 25.2%

Red Hat OpenShift 13.2%

Private cloud-enabled data center 10.3%

Cloud Foundry (including Pivotal) 4.7%

Heroku 4.4%

Other 4.4%

IBM Cloud (including SoftLayer) 4.1%

Oracle Cloud 3.2%

Only respondents at organizations that use stream processing were asked this question.

• Cloud platforms may be used with applications that include stream processing, but the vendors’ own
stream processing offerings are not always adopted at the same rate by their customers. For example:
  - Google Cloud customers: Of those that have a stream processing application hosted in part with
    Google, 45% are using one of Google’s own stream processing offerings versus the 13% of the overall
    sample that use it. That’s a 246% increase in comparison.
  - Azure customers: Of those that have a stream processing application hosted in part with Azure, 18%
    are using one of Azure’s own stream processing offerings versus the 8% of the overall sample that use
    it. That’s a 125% increase in comparison.
  - AWS customers: Of those that have a stream processing application hosted in part with AWS, 37% are
    using one of AWS’ own stream processing offerings versus the 26% of the overall sample that use it.
    That’s a 42% increase in comparison.

• Azure production usage is particularly strong among those who have IoT use cases and those
expecting integration of multiple data streams to become front-and-center in the next year. Azure
customers are also likely to be giving consideration to Lightbend and to Pivotal, both of which
have strong positions among large enterprises.

• Red Hat OpenShift also has strong penetration among large enterprises, with 43% of its users
having more than 10,000 employees. Although they are likely running on separate clusters, Red
Hat’s OpenShift is often used as a platform for applications with stream processing at companies
that also utilize Hadoop in these applications. Twice as many of its respondents (48% versus 24%
for the overall sample) utilize Hadoop as a data store for stream processing.
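The "increase in comparison" figures quoted for Google Cloud, Azure and AWS customers are relative increases over the overall-sample rate. A minimal sketch of that arithmetic, with the helper name `relative_increase_pct` chosen here for illustration and the inputs taken from the percentages quoted above:

```python
def relative_increase_pct(baseline_pct, observed_pct):
    """Percentage by which observed_pct exceeds baseline_pct."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return (observed_pct - baseline_pct) / baseline_pct * 100


# (share of overall sample, share among customers hosted on that platform)
comparisons = {
    "Google Cloud": (13, 45),
    "Azure": (8, 18),
    "AWS": (26, 37),
}

for platform, (overall, hosted) in comparisons.items():
    increase = relative_increase_pct(overall, hosted)
    print(f"{platform}: {increase:.0f}% increase over the overall sample")
```

Rounded to whole numbers, the results are 246%, 125% and 42%, matching the figures in the text. Note that these are relative changes, not percentage-point differences (32, 10 and 11 points respectively).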

Early use cases affect ETL and messaging most

Stream processing users often start with ETL and messaging between microservices, then move into
more advanced use cases.
Are stream processing deployments replacing existing technology within your organization?

Yes 41.8%
No 41.8%
Don’t know 16.3%

If yes, what technologies are you replacing with stream processing technology?

ETL 57.7%
Messaging 57.3%
Pub/sub 41%
Database 39.4%
Replication 23.9%
Other 4.7%

• There is a tipping point for possible disruption: the share of respondents who say streaming is
replacing existing technologies rises from 42% to 68% among those who use stream processing in
more than a quarter of their applications.

• Organizations that believe stream processing is replacing databases are more likely to use
MySQL and Hadoop as data sources for stream processing. Neither of these technologies was
designed to quickly handle the volume of data involved with streaming data use cases. Since
these are open source data stores, people may believe it is easier to swap them out with another
open source offering.


Lightbend’s Last Look

• Competitive pressure is driving organizations to embrace streaming data to extract useful
information more quickly from incoming data, as well as to serve impactful results from AI/ML to
customers. The “always on” characteristics of streaming pipelines require the same scalability,
resiliency, and efficiency that microservices deliver, which is why mature tools, like Akka Streams,
that bridge the gap are popular.

• The parallel trend of migration to container-based infrastructure, e.g., Kubernetes, is also driving
streaming data pipelines to look more like conventional microservices.

About Lightbend

Lightbend (@Lightbend) is leading the enterprise transformation toward real-time,
cloud-native applications. Lightbend Platform provides scalable, high-performance
microservices frameworks and streaming engines for building data-centric systems
that are optimized to run on cloud-native infrastructure. The most admired brands
around the globe are transforming their businesses with Lightbend, engaging billions
of users every day through software that is changing the world. For more information,
visit www.lightbend.com.

About The New Stack

The New Stack publishes explanation and analysis of at scale, distributed technologies
for developers, DevOps and other IT professionals. The New Stack is a critical and
trusted resource for all people making complex technical decisions. Visit our website
for original articles, podcasts, ebooks and research at https://thenewstack.io.