DoK Report 2021


RESEARCH REPORT

Data on Kubernetes 2021


Insights from over 500 executives and technology
leaders on how Kubernetes is being used for data
and the factors driving further adoption

Data on Kubernetes Report 2021 1


Introduction

Regardless of which survey you read, the use of Kubernetes is on the rise in organizations of all
sizes. While Kubernetes was initially designed for stateless workloads, the community has made
major strides in supporting stateful workloads in the past few years resulting in more organizations
running them in production.

As organizations become more data-driven and increasingly turn to real-time data for competitive
advantage, their infrastructure needs to evolve to accommodate the collection, storage, and
processing of data across different environments (edge, public cloud, and on-premise). In providing
a standard way to run stateless and stateful workloads, Kubernetes is strategically positioned to be
the platform organizations can leverage to build state of the art data infrastructures.

The Data on Kubernetes Community (DoKC) is an openly governed group of practitioners sharing in the
emergence and development of techniques for the use of Kubernetes for data. In September 2021, we
engaged research firm Clearpath Strategies to survey over 500 Kubernetes users to understand the
types and volume of data-intensive workloads being deployed in Kubernetes, benefits and challenges,
and the factors driving further adoption.

Key Findings

• Kubernetes has become a core part of IT – half of the respondents are running 50% or more of their
production workloads on it, and they are very satisfied and more productive as a result. The most
advanced users report 2x or greater productivity gains.

• 90% believe it is ready for stateful workloads, and a large majority (70%) are running them in
production with databases topping the list. Companies report significant benefits to
standardization, consistency, and management as key drivers.

• Significant challenges remain. As they seek to expand their data on Kubernetes footprint,
enterprises find a lack of integration and interoperability with existing tools and stacks; skilled
staff; quality of Kubernetes operators; and trusted vendors.

• Business demands are creating pressures for further adoption. The increasing importance of
real-time data to competitive advantage will sharpen companies’ need to run data on Kubernetes. A
majority believe standards will improve data management and that data should become declarative.



Data on Kubernetes Community
Founding Sponsors
[Sponsor logos, grouped by tier: Platinum, Gold, Silver]



Who we talked to

For the purpose of this report, the research firm only surveyed individuals whose organizations are
currently using, evaluating, or planning to use Kubernetes. This included an international audience
of 502 respondents.

Because data on Kubernetes impacts a large part of an organization’s IT team – from CIOs and CTOs,
VPs and Directors of IT, Software and Site-reliability engineers, Database administrators, and Data
engineers – the targeted demographic ratio is practitioners (~35%), managers (~20%), and
executives (~45%).

The largest share of respondents (49%) came from Technology organizations (software, hardware,
services), followed by Financial Services (12%), Manufacturing and Heavy Industry (8%), and
Telecommunications (6%).

Respondents were from companies using or evaluating Kubernetes including a mix of enterprise (~60%),
mid-market (~30%), and small business (~10%) providing a wide view of the current data on Kubernetes
landscape across a diversity of company sizes and sectors.

Finally, we wanted to understand organizations’ IT infrastructure hosting strategy which affirmed
that overall, organizations operate in a hybrid, multi-cloud world.

ROLE (% of respondents)

CIO/CTO 25
VP / Director of IT or similar 20
Manager 21
Data engineer 8
Database administrator 6
Developer/Software engineer 6
Data scientist 5
DevOps 4
Business leadership 3
Operations / Administrator 2
Architect 1

ORGANIZATION SIZE (% of respondents)

100 - 499 11
500 - 999 29
1,000 - 4,999 27
5,000 - 9,999 13
10,000 - 24,999 10
25,000 - 49,999 6
50,000 or more 4

HYBRID ENVIRONMENTS (% of respondents)

Private cloud 63
Public cloud 62
On-premises, local servers 61



REGION

[Pie chart. Segments: Australia & New Zealand, Europe, Asia, US & Canada. Values shown: 1%, 19%, 31%, 48%.]

INDUSTRY

IT 49%
Financial Services 12%
Manufacturing and heavy industry 8%
Telecommunications/ISP/Web Hosting 6%
Retail and wholesale 5%
Health care 4%
Aerospace and Defense 3%
Transportation and logistics 2%
Education 1%
Government 1%
Infrastructure and Construction 1%
Media 1%
Non-financial Services 1%
Primary goods 1%
Other 1%



How Kubernetes is Used Today

Before diving into the specifics of running data on Kubernetes, we wanted to understand how
organizations are using Kubernetes across all workloads. We found that Kubernetes has become a core
part of their infrastructure – half of the respondents are running 50% or more of their production
workloads on it.

What percentage of your organization’s production workloads run on Kubernetes?

Under 10% 2
10% - 24% 13
25% - 49% 34
50% - 74% 39
75% - 99% 10
100% 1

Those using it in production are happy with the results, with 69% being very satisfied. This
satisfaction might be driven by the fact that over half of respondents report being 50% or more
productive after adopting Kubernetes. Unsurprisingly, 68% say they are very likely to increase their
Kubernetes footprint.

How much more productive is your organization/are your developers after adopting Kubernetes?

More than 2x more productive 5
About 2x more productive 14
About 50% more 38
About 10% more 36
We are no more or less productive 5
We are less productive 1

The more Kubernetes is in use, the more productive an organization is: when we look at Kubernetes
Leaders – the 11% of respondents running 75% or more of their production workloads on Kubernetes –
they report even higher levels of productivity, with a majority achieving an impressive 2x or more
productivity.

A majority of Kubernetes Leaders are 2x or more productive

                                     75%+ production workloads    <75% production
                                     (Kubernetes Leaders)         workloads
More than 2x more productive         38                           3
About 2x more productive             29                           19
About 50% more                       15                           41
About 10% more                       12                           34
We are no more or less productive    6                            1
We are less productive               -                            2



Kubernetes adoption is recent for many respondents: 86% adopted it in the past two years, many of
them in the past six to 12 months – a likely correlation with the digital transformation accelerated
by the pandemic.

In general, how satisfied are you with the use of Kubernetes for production workloads in your
organization?

Very satisfied (5) 69
Somewhat satisfied (4) 28
A little satisfied (3) 2

How likely are you to migrate additional production workloads to Kubernetes in your organization?

Very likely (5) 68
Somewhat likely (4) 26
Not very likely (3) 5

When did your organization start to use Kubernetes?

Within the past 6 months 25
In the past 6-12 months 29
1-2 years ago 32
2 or more years ago 15



The State of Data on Kubernetes

Data on Kubernetes is widely adopted


The data is clear: Enterprises are confident that Kubernetes is ready to run their organization’s
stateful workloads in production with 90% believing it and 70% currently doing so. (We’ll cover the
10% of “non-believers” in the next section).

For those currently running stateful workloads in production, we see broad usage that maps closely to
overall Kubernetes usage in the previous section. Over half of this cohort (51%) intends to increase the
volume of workloads by 30% or more in the next 12 months. More Kubernetes = more productivity = high
satisfaction. See a pattern here? Kubernetes is here to stay.

Do you believe Kubernetes is ready to run your organization’s stateful workloads in production?

[Pie chart; the one surviving label is 10]

What percentage of your organization’s stateful production workloads are running on Kubernetes?

Less than 10% 3
10% - 24% 16
25% - 49% 35
50% - 74% 39
75% - 100% 7

Does your organization run stateful workload(s) on Kubernetes?

[Pie chart; surviving labels: 70, 26]

By how much do you expect your organization’s percentage of stateful production workloads run on
Kubernetes to increase in the next 12 months?

[Chart; no values survived]



Respondents are running a wide range of stateful workloads on Kubernetes, with Databases in the top
spot followed by a three-way tie among Persistent Storage, Streaming/Messaging, and Backup/Archival
Storage.

Q. Which of the following stateful workloads does your organization run on Kubernetes?

Databases 50

Persistent storage 45

Streaming/messaging 45

Backup/Archival storage 45

Object storage 42

Analytics 39

AI/ML 38

When we look at Kubernetes Leaders, Databases remain in the top spot but become more important
– jumping 11 percentage points – while Persistent Storage becomes less important to this group, a
reduction of 14 percentage points compared to all respondents.

More databases and storage for Kubernetes leaders

                           75%+ production workloads    <75% production
                           (Kubernetes Leaders)         workloads
Databases                  61                           50
Backup/Archival storage    58                           43
Streaming/messaging        45                           44
Object storage             45                           43
AI/ML                      42                           39
Persistent storage         35                           49
Analytics                  32                           43



Standardization drives DoK adoption
In spite of the difficulty of running stateful workloads on Kubernetes (more on this in the next
section), organizations have embraced it. They do this not only because Kubernetes makes it easy to
scale – a Day 1 benefit – but also because of the ability to manage all workloads in a standard way,
generally considered a Day 2 benefit. The massive productivity gains and high satisfaction we see in
the first section may be linked to the ability of organizations to standardize across hybrid
environments with Kubernetes – a data point we will explore in our next survey.

From the following, which are the THREE most important factors in your organization’s decision to
run stateful workload(s) on Kubernetes?

Ensure Consistency 45
Standardizing on Kubernetes 40
Simplify Management 39
Enable Developers to self manage 39
Enable hybrid/multi-provider DBaaS 35
Reducing TCO 29
Avoid Vendor Lockin 25
Auto-healing 22

This becomes even more pronounced when we look at responses from Kubernetes Leaders. Not only does
standardization jump 10 points, we also see security jump ahead of scalability, deployment, and other Day 1
benefits of Kubernetes. The more workloads an organization runs on Kubernetes, the more it can capitalize
on the standardization advantage.



Standardization is the key driver for Kubernetes Leaders

                                      75%+ production workloads    <75% production
                                      (Kubernetes Leaders)         workloads
Standardizing on Kubernetes           50                           39
Ensure consistency                    41                           45
Enable Developers to self manage      41                           40
Enable hybrid/multi-provider DBaaS    41                           35
Auto-healing                          35                           21
Simplify management                   32                           40
Avoid Vendor Lockin                   24                           27
Reducing TCO                          21                           30

Looking ahead to a future where organizations can more seamlessly react to data in real-time, a
majority would like to see data become declarative, just like Kubernetes. Our next survey will delve
deeper into what will be required to create a language for declarative data on Kubernetes.

A majority believe that data should become declarative, just like Kubernetes.

Now you will see a series of pairs of statements. For each pair select the statement
that you agree with more, even if you agree with both a little.

                                                           Total    Labeled segment
In the future, the way data is managed should become
declarative, just like Kubernetes.                         62       30
Data management is imperative because it is too complex.   37       19

(Bars are split into "Much" and "Somewhat" segments.)

While many organizations have experienced success with Kubernetes and are running stateful
workloads, challenges remain. Next we’ll zoom in on how respondents are managing state in Kubernetes
(there’s an operator for that), and the key challenges they face.



The Stateful Challenge
If more Kubernetes = more productivity = higher satisfaction, then what challenges do our respondents
face when running data on Kubernetes? Here we see a mix of Day 1 and Day 2 problems emerge, with
the primary challenge being the lack of integration with existing tools (35%) followed by a lack of
interoperability with the rest of their stack (32%).

What are the primary challenges of running data on Kubernetes?

Lack of integration with our existing tools 35
Lack of interoperability with the rest of my stack 32
Vendor solutions solve niche needs 30
Lack of qualified talent 29
Little or no vendor solutions exist 27
Lack of examples showing other companies doing it 27
Kubernetes open source features are not mature enough 27
Too much time/effort to manage 25
Too complex to integrate into our environment 24

Kubernetes Leaders face a different set of challenges with a four-way tie for first place: vendor
solutions solve niche needs, little or no vendor solutions exist, too much time and effort to manage,
and lack of qualified talent. The talent gap was the most drastic difference when compared with all
respondents, jumping 11 percentage points.

Kubernetes Leaders face a lack of support and skills

                                                     75%+ production workloads    <75% production
                                                     (Kubernetes Leaders)         workloads
Vendor solutions solve niche needs                   35                           34
Little or no vendor solutions exist                  35                           28
Too much time/effort to manage                       35                           26
Lack of qualified talent                             35                           24
Lack of interoperability with my existing stack      32                           34
Lack of integration with our existing tools          32                           34
Lack of examples showing other companies doing it    29                           26
Too complex to integrate into our environment        15                           25



In addition to native Kubernetes features, like StatefulSets and the Container Storage Interface (CSI),
the operator pattern extended Kubernetes’ use to stateful workloads. It uses the Kubernetes API to
create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes
user. Today, respondents are using operators to manage a wide range of stateful workloads from
databases to streaming/messaging.
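
As a rough illustration of the reconcile loop at the heart of the operator pattern described above, the following Python sketch compares a declared desired state with the observed state and acts on the difference. It is not taken from the survey or from any real operator: DatabaseSpec, FakeCluster, and reconcile are hypothetical names, and the cluster is an in-memory stand-in rather than a Kubernetes API client.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseSpec:
    """Desired state, as a user might declare it (hypothetical fields)."""
    name: str
    replicas: int


@dataclass
class FakeCluster:
    """In-memory stand-in for the Kubernetes API; not a real client."""
    pods: dict = field(default_factory=dict)  # database name -> list of pod names


def reconcile(spec, cluster):
    """One reconcile pass: compare desired and observed state, act on the gap."""
    pods = cluster.pods.setdefault(spec.name, [])
    actions = []
    # Scale up: add pods with stable ordinal names (name-0, name-1, ...),
    # mirroring how StatefulSet pods are named.
    while len(pods) < spec.replicas:
        pod = f"{spec.name}-{len(pods)}"
        pods.append(pod)
        actions.append(f"create {pod}")
    # Scale down: remove the highest ordinal first.
    while len(pods) > spec.replicas:
        actions.append(f"delete {pods.pop()}")
    return actions


cluster = FakeCluster()
first_pass = reconcile(DatabaseSpec("orders-db", 3), cluster)
second_pass = reconcile(DatabaseSpec("orders-db", 3), cluster)
scale_down = reconcile(DatabaseSpec("orders-db", 1), cluster)
```

Calling reconcile again with an unchanged spec produces no further actions once the observed state matches the desired one, which is the property that lets an operator run continuously on a user's behalf.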

Please indicate whether you, professionally, currently use Kubernetes operators, plan to use
Kubernetes operators, would like to use Kubernetes

                           Currently use Kubernetes    Plan to use Kubernetes
                           operators for this          operators for this
Databases                  52                          32
Persistent Storage         43                          38
Analytics                  43                          36
Backup/Archival storage    43                          37
Object Storage             41                          36
AI/ML                      35                          38
Streaming/messaging        33                          37

The primary benefit of Kubernetes operators cited by respondents is that they simplify the
management of workloads in multi-cloud and hybrid cloud environments. Simplicity and scalability
also rank highly.

What would you say are the primary benefits of using Kubernetes operators? (Select all that apply)

Simplifies management in multi and hybrid cloud environments 50
Scalability 49
Improves application lifecycle management 49
Automates operations for stateful workloads 45
Ability to customize 40
More native integration with Kubernetes 37
Elasticity 33
Self-healing 25



Without industry standards for the development of operators, key challenges remain and may hinder
the broader adoption of Kubernetes for stateful workloads. Interoperability is cited as the primary
challenge (50%), followed by varying degrees of quality (42%) and lack of standardization (40%). A
consequence of this is that a majority of the respondents we surveyed are developing their own
operators professionally.

CHALLENGES WITH OPERATORS

Difficult to maintain interoperability with other operators 50
Varying degrees of quality 42
Lack of standardization 40
Seems like a workaround rather than a clean technical implementation 37
Too many to adopt 30
Does not fulfill our use case(s) 29
Other

DEVELOPING OWN OPERATORS

Yes, professionally 61
Yes, personally 37
No 24

Further underscoring this point, lack of quality operators is cited as the number one reason preventing
some from using Kubernetes for stateful workloads – the 10% “non-believers” who do not think
Kubernetes is production-ready for running data.

Why do you not think Kubernetes is ready to run your organization’s stateful workloads in
production? (Select all that apply)

[Bar chart; category labels did not survive. Values shown: 44, 31, 31, 29, 27, 19]



The Future of DoK

Despite the challenges, respondents believe that running stateful workloads on Kubernetes is the way
forward as evidenced by the productivity and satisfaction gains we see in the first section. When asked
to envision a future that simplifies management and automation on Kubernetes, a 2:1 majority agree
that the standardization of data management is important.

2:1 believe that standardization of data management is important

Now you will see a series of pairs of statements. For each pair select the
statement that you agree with more, even if you agree with both a little.

                                                           Total    Labeled segment
The creation of standards for data management on
Kubernetes would help simplify management and
automation.                                                65       35
The creation of standards for data would make
management more complex.                                   35       17

(Bars are split into "Much" and "Somewhat" segments.)

Standardization is a recurring theme for our respondents. When done well, it drives Kubernetes
adoption; when absent, it slows it. This is even more pronounced for Kubernetes Leaders running
75%+ workloads in production – standardization jumps 11 percentage points to the top spot (up from
number two for all respondents), followed by consistency and the ability for developers to
self-manage.

From the following, which are the THREE most important factors in your
organization’s decision to run stateful workload(s) on Kubernetes?

                                      75%+ production workloads    <75% production
                                      (Kubernetes Leaders)         workloads
Standardizing on Kubernetes           50                           39
Ensure Consistency                    41                           45
Enable Developers to self manage      41                           40
Enable hybrid/multi-provider DBaaS    41                           35
Auto-healing                          35                           21
Simplify Management                   32                           40
Avoid Vendor Lockin                   24                           27
Reducing TCO                          21                           30



A 2:1 majority also believe that how companies leverage their real-time data is key to competitive
advantage. The rise of real-time is fueled by organizations’ desire to quickly react to actionable insights
that drive customer satisfaction and revenue. The ability for companies to standardize all workloads on
Kubernetes puts it in a position to become the favored system for real-time workloads.

Real-time data will increase the demand for running data on Kubernetes

Now you will see a series of pairs of statements. For each pair select the
statement that you agree with more, even if you agree with both a little.

Statement A: Going forward, how companies leverage their real-time data is the key to
competitive advantage.
Statement B: Real-time data is useful but other factors will be more important to
competitive advantage.

[Bar chart with "Much" and "Somewhat" segments; no values survived]

Kubernetes Leaders are also more interested in running AI/ML workloads than all respondents,
jumping to the #2 position behind databases (up from #6) – an indicator of what the future of
DoK may look like for all.

Conclusion
Stateful workloads are pervasive, and the most advanced Kubernetes users benefit from massive
productivity gains thanks to the standardization of how they run stateless and stateful workloads. The
operator pattern is beneficial but not without challenges, which are forcing organizations to build
their own. Standards and/or best practices are needed to bring operators to a level of quality that
will allow organizations to realize the benefits of running stateful workloads in a consistent way.

Signs point to a future where organizations can standardize, or further standardize, on Kubernetes for
data-intensive workloads. This may be driven by industry standards and exemplified by declarative
data and similar concepts. It will undoubtedly encompass the world of data technologies (persistence,
streaming, analytics), data infrastructure (storage, security, networking), and data governance
(policies, protocols, access), and it will require contributions from everyone, achieved with open
communities, open standards, and open source. The future is ours to build.

About Data on Kubernetes Community


Kubernetes was initially designed to run stateless workloads. Today it is increasingly being used to run
databases and other stateful workloads. The Data on Kubernetes Community was founded in June 2020
to bring practitioners together to solve the challenges of working with data on Kubernetes. An openly
governed community, DoKC exists to assist in the emergence and development of techniques for the
use of Kubernetes for data. https://dok.community/
