
2024 Accelerate State of DevOps
A decade with DORA
Contents

Executive summary 3
Software delivery performance 9
Artificial intelligence: Adoption and attitudes 17
Exploring the downstream impact of AI 27
Platform engineering 47
Developer experience 57
Leading transformations 69
A decade with DORA 77
Final thoughts 83
Acknowledgements 85
Authors 87
Demographics and firmographics 91
Methodology 99
Models 113
Recommended reading 117

Executive summary

DORA has been investigating the capabilities, practices, and measures of high-performing technology-driven teams and organizations for over a decade. This is our tenth DORA report. We have heard from more than 39,000 professionals working at organizations of every size and across many different industries globally. Thank you for joining us along this journey and being an important part of the research!

DORA collects data through an annual, worldwide survey of professionals working in technical and adjacent roles. The survey includes questions related to ways of working and accomplishments that are relevant across an organization and to the people working in that organization.

DORA uses rigorous statistical evaluation methodology to understand the relationships between these factors and how they each contribute to the success of teams and organizations.

This year, we augmented our survey with in-depth interviews of professionals as a way to get deeper insights, triangulate, and provide additional context for our findings. See the Methodology chapter for more details.

The key accomplishments and outcomes
we investigated this year are:

Reducing burnout: Burnout is a state of emotional, physical, and mental exhaustion caused by prolonged or excessive stress, often characterized by feelings of cynicism, detachment, and a lack of accomplishment.

Flow: Flow measures how much focus a person tends to achieve during development tasks.

Job satisfaction: Job satisfaction measures someone’s overall feeling about their job.

Organizational performance: This measures an organization’s performance in areas including profitability, market share, total customers, operating efficiency, customer satisfaction, quality of products and services, and its ability to achieve goals.

Product performance: This measures the usability, functionality, value, availability, performance (for example, latency), and security of a product.

Productivity: Productivity measures the extent to which an individual feels effective and efficient in their work, creating value and achieving tasks.

Team performance: This measures a team’s ability to collaborate, innovate, work efficiently, rely on each other, and adapt.

Key findings

AI is having broad impact

AI is producing a paradigm shift in the field of software development. Early adoption is showing some promising results, tempered by caution.

AI adoption benefits:

• Flow
• Productivity
• Job satisfaction
• Code quality
• Internal documentation
• Review processes
• Team performance
• Organizational performance

However, AI adoption also brings some detrimental effects. We have observed reductions to software delivery performance, and the effect on product performance is uncertain. Additionally, individuals are reporting a decrease in the amount of time they spend doing valuable work as AI adoption increases, a curious finding that is explored more later in this report.

Teams should continue experimenting and learning more about the impact of increasing reliance on AI.

AI adoption increases as trust in AI increases

Using generative artificial intelligence (gen AI) makes developers feel more productive, and developers who trust gen AI use it more. There is room for improvement in this area: 39.2% of respondents reported having little or no trust in AI.

User-centricity drives performance

Organizations that prioritize the end user experience produce higher quality products, with developers who are more productive, satisfied, and less likely to experience burnout.

Transformational leadership matters

Transformational leadership improves employee productivity, job satisfaction, team performance, product performance, and organizational performance while also helping decrease employee burnout.

Stable priorities boost productivity and well-being

Unstable organizational priorities lead to meaningful decreases in productivity and substantial increases in burnout, even when organizations have strong leaders, good internal documents, and a user-centric approach to software development.

Platform engineering can boost productivity

Platform engineering has a positive impact on productivity and organizational performance, but there are some cautionary signals for software delivery performance.

Cloud enables infrastructure flexibility

Flexible infrastructure can increase organizational performance. However, moving to the cloud without adopting the flexibility that cloud has to offer may be more harmful than remaining in the data center. Transforming approaches, processes, and technologies is required for a successful migration.

High levels of software delivery performance are achievable

The highest performing teams excel across all four software delivery metrics (change lead time, deployment frequency, change fail percentage, and failed deployment recovery time) while the lowest performers perform poorly across all four. We see teams from every industry vertical in each of the performance clusters.

Applying insights from DORA

Driving team and organizational improvements with DORA requires that you assess how you're doing today, identify areas to invest in and make improvements, and have feedback loops to tell you how you're progressing. Teams that adopt a mindset and practice of continuous improvement are likely to see the most benefits. Invest in building the organizational muscles required to repeat this over time.

Findings from our research can help inform your own experiments and hypotheses. It's important to experiment and measure the impact of your changes to see what works best for your team and organization. Doing so will help you validate our findings. Expect your results to differ and please share your progress so that we all may learn from your experience.

We recommend taking an experimental approach to improvement:

1. Identify an area or outcome you would like to improve
2. Measure your baseline or current state
3. Develop a set of hypotheses about what might get you closer to your desired state
4. Agree and commit to a plan for improvement
5. Do the work
6. Measure the progress you’ve made
7. Repeat the process. Improvement work is achieved iteratively and incrementally.

You cannot improve alone!

We can learn from each other’s experience; an excellent forum for sharing and learning about improvement initiatives is the DORA Community: https://dora.community.

Software delivery performance

Technology-driven teams need ways to measure performance so that they can assess how they’re doing today, prioritize improvements, and validate their progress. DORA has repeatedly validated four software delivery metrics — the four keys — that provide an effective way of measuring the outcomes of the software delivery process.

The four keys

DORA’s four keys have been used to measure the throughput and stability of software changes. This includes changes of any kind, including changes to configuration and changes to code.

Change lead time: the time it takes for a code commit or change to be successfully deployed to production.

Deployment frequency: how often application changes are deployed to production.

Change fail rate: the percentage of deployments that cause failures in production,1 requiring hotfixes or rollbacks.

Failed deployment recovery time: the time it takes to recover from a failed deployment.

We've observed that these metrics typically move together: the best performers do well on all four while the lowest performers do poorly.
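To make these definitions concrete, here is a minimal sketch, not taken from the report, of how a team might compute the four keys from its own deployment log. The record layout and field names (committed, deployed, failed, recovered) are hypothetical; DORA does not prescribe a schema or a tool.

```python
# Hypothetical deployment log; field names are illustrative only.
from datetime import datetime, timedelta
from statistics import median

deployments = [
    {"committed": datetime(2024, 5, 1, 9, 0), "deployed": datetime(2024, 5, 1, 15, 0),
     "failed": False, "recovered": None},
    {"committed": datetime(2024, 5, 2, 10, 0), "deployed": datetime(2024, 5, 3, 11, 0),
     "failed": True, "recovered": datetime(2024, 5, 3, 13, 0)},
    {"committed": datetime(2024, 5, 6, 8, 0), "deployed": datetime(2024, 5, 6, 9, 30),
     "failed": False, "recovered": None},
]
window = timedelta(days=7)  # the period covered by the log above

# Change lead time: commit to production, summarized here as a median.
change_lead_time = median(d["deployed"] - d["committed"] for d in deployments)

# Deployment frequency: deployments per day over the observation window.
deployment_frequency = len(deployments) / window.days

# Change fail rate: share of deployments that caused a failure in production.
change_fail_rate = sum(d["failed"] for d in deployments) / len(deployments)

# Failed deployment recovery time: time from a failed deployment to recovery.
failures = [d for d in deployments if d["failed"]]
failed_deployment_recovery_time = median(d["recovered"] - d["deployed"] for d in failures)

print(change_lead_time, deployment_frequency, change_fail_rate, failed_deployment_recovery_time)
```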

Evolving the measures of software
delivery performance
The analysis of the four key metrics has long had an outlier: change failure rate.2 Change failure rate is strongly correlated with the other three metrics but statistical tests and methods prevent us from combining all four into one factor. A change to the way respondents answered the change failure rate question improved the connection but we felt there might be something else happening.

We have a longstanding hypothesis that the change failure rate metric works as a proxy for the amount of rework a team is asked to do. When a delivery fails, this requires the team to fix the change, likely by introducing another change.

To test this theory, we added another question this year about the rework rate for an application: "For the primary application or service you work on, approximately how many deployments in the last six months were not planned but were performed to address a user-facing bug in the application?"

Our data analysis confirmed our hypothesis that rework rate and change failure rate are related. Together, these two metrics create a reliable factor of software delivery stability.

This appears in the analysis of software performance levels, too. More than half of the teams in our study this year show differences in software throughput and software stability. These differences have led us to consider software delivery performance through two different factors:

Concept: Software delivery performance

Factors and metrics used:
• Software delivery throughput: change lead time, deployment frequency, failed deployment recovery time
• Software delivery stability: change failure rate, rework rate

Our analysis throughout this report utilizes the software delivery performance concept and both factors at various times. All five metrics are considered for describing software delivery performance.

Change lead time, deployment frequency, and failed deployment recovery time are used when we describe software delivery throughput. This factor measures the speed of making updates of any kind: normal changes and changes in response to a failure.

Change failure rate and rework rate are used when we describe software delivery stability. This factor measures the likelihood that deployments unintentionally lead to immediate, additional work.

Performance levels

Each year we ask survey respondents about the software delivery performance of the primary application or service they work on. We analyze their answers using cluster analysis, which is a statistical method that identifies responses that are similar to one another but distinct from other groups of responses.

We performed the cluster analysis on the original four software delivery metrics to remain consistent with previous years' cluster analyses.

In our analysis of software delivery performance, four clusters of responses emerged. We do not set these levels in advance; rather, we let them emerge from the survey responses. This gives us a way to see a snapshot of software delivery performance across all respondents each year.

Four distinct clusters emerged from the data this year, as shown below.

Performance levels (percentage of respondents shown with 89% uncertainty interval):

Elite: change lead time less than one day; deployment frequency on demand (multiple deploys per day); change fail rate 5%; failed deployment recovery time less than one hour; 19% of respondents (18-20%).

High: change lead time between one day and one week; deployment frequency between once per day and once per week; change fail rate 20%; failed deployment recovery time less than one day; 22% of respondents (21-23%).

Medium: change lead time between one week and one month; deployment frequency between once per week and once per month; change fail rate 10%; failed deployment recovery time less than one day; 35% of respondents (33-36%).

Low: change lead time between one month and six months; deployment frequency between once per month and once every six months; change fail rate 40%; failed deployment recovery time between one week and one month; 25% of respondents (23-26%).
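To illustrate the kind of cluster analysis described above, the sketch below groups made-up survey responses on the four keys. The report does not state the clustering algorithm here, so this example uses k-means as one common choice; the response coding, values, and cluster count are assumptions for demonstration.

```python
# Illustrative cluster analysis on the four software delivery metrics.
# Rows are respondents; columns are ordinal answers (1 = lowest, 6 = highest)
# for change lead time, deployment frequency, change fail rate, and
# failed deployment recovery time. The data below are invented.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

responses = np.array([
    [6, 6, 6, 6],
    [6, 5, 6, 6],
    [5, 5, 4, 5],
    [4, 5, 4, 4],
    [3, 3, 5, 4],
    [3, 2, 5, 5],
    [2, 1, 1, 2],
    [1, 1, 2, 2],
])

scaled = StandardScaler().fit_transform(responses)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # cluster assignment per respondent; levels are named afterwards
```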

Throughput or stability?

Within all four clusters, throughput and stability are correlated. This correlation persists even in the medium performance cluster (orange), where throughput is lower and stability is higher than in the high performance cluster (yellow). This suggests that factors besides throughput and stability influence cluster performance. The medium performance cluster, for example, may benefit from shipping changes more frequently.

Which is better, more frequent deployments or fewer failures when deploying?

There may not be a universal answer to this question. It depends on the application or service being considered, the goals of the team working on that application, and most importantly the expectations of the application’s users.

We made a decision to call the faster teams "high performers," and the slower but more stable teams "medium performers." This decision highlights one of the potential pitfalls of using these performance levels: Improving should be more important to a team than reaching a particular performance level. The best teams are those that achieve elite improvement, not necessarily elite performance.

Figure 1: Software delivery performance levels. (Chart showing the distributions of change lead time, deployment frequency, change failure rate, and failed deployment recovery time for the Elite, High, Medium, and Low clusters.)

When compared to low performers, elite performers realize:

• 127x faster lead time
• 182x more deployments per year
• 8x lower change failure rate
• 2293x faster failed deployment recovery times

How to use the performance levels

The performance clusters provide benchmark data that show the software delivery performance of this year's survey respondents. The clusters are intended to help inspire all that elite performance is achievable.

More important than reaching a particular performance level, we believe that teams should focus on improving performance overall. The best teams are those that achieve elite improvement, not necessarily elite performance.

Industry does not meaningfully affect performance clusters

Our research rarely3 finds that industry is a predictor of software delivery performance; we see high-performing teams in every industry vertical. This isn’t to suggest that there are no unique challenges across industries, but no one industry appears to be uniquely encumbered or uniquely capable when it comes to software delivery performance.

Using the software delivery performance metrics

Each application or service has its own unique context. This complexity makes it difficult to predict how any one change may affect the overall performance of the system. Beyond that, it is nearly impossible to change only one thing at a time in an organization. With this complexity in mind, how can we use the software delivery performance metrics to help guide our improvement efforts?

Start by identifying the primary application or service you would like to measure and improve. We then recommend gathering the cross-functional team responsible for this application to measure and agree on its current software delivery performance. The DORA Quick Check (https://dora.dev/quickcheck) can help guide a conversation and set this baseline measurement. Your team will need to understand what is preventing better performance.

One effective way to find these impediments is to complete a value stream mapping exercise4 with the team.

Next, identify and agree on a plan for improvement. This plan may focus on improving one of the many capabilities that DORA has researched5 or may be something else that is unique to your application or organization.

With this plan in hand, it's now time to do the work! Dedicate capacity to this improvement work and pay attention to the lessons learned along the way.

After the change has had a chance to be implemented and take hold, it's now time to re-evaluate the four keys. How have they changed after the team implemented the change? What lessons have you learned?

Repeating this process will help the team build a practice of continuous improvement.

Remember: change does not happen overnight. An iterative approach that enables a climate for learning, fast flow, and fast feedback6 is required.

1. We consider a deployment to be a change failure only if it causes an issue after landing in production, where it can be experienced by end users. In contrast, a change that is stopped on its way to production is a successful demonstration of the deployment process's ability to detect errors.
2. Forsgren, Nicole, Jez Humble, and Gene Kim. 2018. Accelerate: The Science Behind DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution Press. pp. 37-38.
3. The 2019 Accelerate State of DevOps report (p. 32) found that the retail industry saw significantly better software delivery performance. https://dora.dev/research/2019/dora-report/2019-dora-accelerate-state-of-devops-report.pdf#page=32
4. https://dora.dev/guides/value-stream-management/
5. https://dora.dev/capabilities
6. https://dora.dev/research

Artificial intelligence: Adoption and attitudes

Takeaways

The vast majority of organizations across all industries surveyed are shifting their priorities to more deeply incorporate AI into their applications and services. A corresponding majority of development professionals are relying on AI to help them perform their core role responsibilities — and reporting increases in productivity as a result. The perception among development professionals that using AI is necessary for remaining competitive in today’s market is pervasive and appears to be an important driver of AI adoption for both organizations and individual development professionals.

Introduction

It would be difficult to ignore the significant impact that AI has had on the landscape of development work this year, given the proliferation of popular news articles outlining its effects, from good1 to bad2 to ugly.3 So, while AI was only discussed as one of many technical capabilities affecting performance in our 2023 Accelerate State of DevOps Report,4 this year we explore this topic more fully.

As the use of AI in professional development work moves rapidly from the fringes to ubiquity, we believe our 2024 Accelerate State of DevOps Report represents an important opportunity to assess the adoption, use, and attitudes of development professionals at a critical inflection point for the industry.

Findings

Adopting Artificial Intelligence

Findings on the adoption of AI suggest a growing awareness that AI is no longer "on the horizon," but has fully arrived and is, quite likely, here to stay.

Organizational adoption of Artificial Intelligence

The vast majority of respondents (81%) reported that their organizations have shifted their priorities to increase their incorporation of AI into their applications and services. 49.2% of respondents even described the magnitude of this shift as being either "moderate" or "significant."

Notably, 3% of respondents reported that their organizations are decreasing focus on AI — within the margin of error of our survey. 78% of respondents reported that they trusted their organizations to be transparent about how they plan on using AI as a result of these priority shifts. This data is visualized in Figure 2.

Figure 2: Respondents’ perceptions of their organizations’ shifts in priorities toward or away from incorporation of AI into their applications and services. (Chart of the percentage of respondents reporting each level of change, from significant increase to significant decrease in AI prioritization; error bars represent 89% uncertainty intervals.)

Participants from all surveyed industries reported statistically identical levels of reliance on AI in their daily work, which suggests that this rapid adoption of AI is unfolding uniformly across all industry sectors. This was somewhat surprising to us. Individual industries can vary widely with respect to their levels of regulatory constraints and historical pace of innovation, each of which can impact rates of technology adoption.

However, we did find that respondents working in larger organizations report less reliance on AI in their daily work than respondents working in smaller organizations, which is consistent with prior literature indicating larger firms more slowly adapt to technological change because of their higher organizational complexities and coordination costs.5

Individual adoption of artificial intelligence

At the individual level, we found that 75.9% of respondents are relying, at least in part, on AI in one or more of their daily professional responsibilities. Among those whose job responsibilities include the following tasks, a majority of respondents relied on AI for:

1. Writing code
2. Summarizing information
3. Explaining unfamiliar code
4. Optimizing code
5. Documenting code
6. Writing tests
7. Debugging code
8. Data analysis

Of all tasks included in our survey responses, the most common use cases for AI in software development work were writing code and summarizing information, with 74.9% and 71.2% of respondents whose job responsibilities include these tasks relying on AI to perform them — at least in part. This data is visualized in Figure 3.

Task reliance on AI (percentage of respondents relying on AI, at least in part, for each task):

• Code writing: 74.9%
• Summarizing information: 71.2%
• Code explanation: 62.2%
• Code optimization: 61.3%
• Documentation: 60.8%
• Test writing: 59.6%
• Debugging: 56.1%
• Data analysis: 54.6%
• Code review: 48.9%
• Security analysis: 46.3%
• Language migration: 45%
• Codebase modernization: 44.7%

Figure 3: Percentage of respondents relying on AI, at least in part, to perform twelve common development tasks. Error bars represent 89% credibility intervals.

Chatbots were the most common interface through which respondents interacted with AI in their daily work (78.2%), followed by external web interfaces (73.9%), and AI tools embedded within their IDEs (72.9%). Respondents were less likely to use AI through internal web interfaces (58.1%) and as part of automated CI/CD pipelines (50.2%).

However, we acknowledge that respondents' awareness of AI used in their CI/CD pipelines and internal platforms likely depends on the frequency with which they interface with those technologies. So, these numbers might be artificially low.

We found that data scientists and machine learning specialists were more likely than respondents holding all other job roles to rely on AI. Conversely, hardware engineers were less likely than respondents holding all other job roles to rely on AI, which might be explained by the responsibilities of hardware engineers differing from the above tasks for which AI is commonly used.

Drivers of adoption of artificial intelligence

Interview participants frequently linked the decision to adopt AI to competitive pressures and a need to keep up with industry standards for both organizations and developers, which are increasingly recognized to include proficiency with AI.

For several participants’ organizations, using AI at all was seen as "a big marketing point" (P3)6 that could help differentiate their firm from competitors. Awareness that competitors are beginning to adopt AI in their own processes even prompted one firm to forgo the typical "huge bureaucracy" involved in adopting new technology because they felt an urgency to adopt AI, questioning "what if our competitor takes those actions before us?" (P11).

At the individual level, many participants linked their adoption of AI to the sentiment that proficiency with using AI in software development is "kind of, like, the new bar for entry as an engineer" (P9). Several participants suggested fellow developers should rapidly adopt AI in their development workflow, because "there’s so much happening in this space, you can barely keep up… I think, if you don’t use it, you will be left behind quite soon" (P4).

Perceptions of artificial intelligence

Performance improvements from artificial intelligence

For the large number of organizations and developers who are adopting it, the benefits of using AI in development work appear to be quite high. Seventy-five percent of respondents reported positive productivity gains from AI in the three months preceding our survey, which was fielded in early 2024.

Notably, more than one-third of respondents described their observed productivity increases as either moderate (25%) or extreme (10%) in magnitude. Fewer than 10% of respondents reported negative impacts of even a slight degree on their productivity because of AI. This data is visualized in Figure 4.

Figure 4: Respondents’ perceptions of AI’s impacts on their productivity. (Chart of the percentage of respondents reporting each level of change, from extremely increased to extremely decreased productivity; error bars represent 89% uncertainty intervals.)

Across roles, respondents who reported the largest productivity improvements from AI were security professionals, system administrators, and full-stack developers. Although they also reported positive productivity improvement, mobile developers, site reliability engineers, and project managers reported lower magnitudes of productivity benefits than all other named roles.

Although we suspected that the novelty of AI in development work, and corresponding learning curve, might inhibit developers’ ability to write code, our findings did not support that hypothesis. Only 5% of respondents reported that AI had inhibited their ability to write code to any degree. In fact, 67% of respondents reported at least some improvement to their ability to write code as a result of AI-assisted coding tools, and about 10% have observed "extreme" improvements to their ability to write code because of AI.

Trust in AI-generated code

Participants’ perceptions of the trustworthiness of AI-generated code used in development work were complex. While the vast majority of respondents (87.9%) reported some level of trust in the quality of AI-generated code, the degree to which respondents reported trusting the quality of AI-generated code was generally low, with 39.2% reporting little (27.3%) or no trust (11.9%) at all. This data is visualized in Figure 5.

Figure 5: Respondents’ reported trust in the quality of AI-generated code. (Chart of the percentage of respondents selecting each answer, from "a great deal" to "not at all"; error bars represent 89% uncertainty intervals.)

Given the evidence from the survey that developers are rapidly adopting AI, relying on it, and perceiving it as a positive performance contributor, we found the overall lack of trust in AI surprising. It’s worth noting that during our interviews, many of our participants indicated that they were willing to, or expected to, tweak the outputs of the AI-generated code they used in their professional work.

One participant even likened the need to evaluate and modify the outputs of AI-generated code to "the early days of StackOverflow, [when] you always thought people on StackOverflow are really experienced, you know, that they will know exactly what to do. And then, you just copy and paste the stuff, and things explode" (P2).

Perhaps because this is not a new problem, participants like P3 felt that their companies are not "worried about, like, someone just copy-and-pasting code from Copilot or ChatGPT [because of] having so many layers to check it" with their existing code-quality assurance processes.

We hypothesize that developers do not necessarily expect absolute trust in the accuracy of AI-generated code, nor does absolute trust appear to be required for developers to find AI-generated code useful. Rather, it seems that mostly-correct AI-generated code that can be perfected with some tweaks is acceptable, sufficiently valuable to motivate widespread adoption and use, and compatible with existing quality assurance processes.

Expectations for AI’s future

Overall, our findings indicate AI has already had a massive impact on development professionals’ work, a trend we expect to continue to grow. While it would be impossible to predict exactly how AI will impact development — and our world — in the future, we asked respondents to speculate and share their expectations about the impacts of AI in the next one, five, and 10 years.

Respondents reported quite positive impacts of AI on their development work in reflecting on their recent experiences, but their predictions for AI’s future impacts were not as hopeful.

Optimistically, and consistent with our findings that AI has positively impacted development professionals’ performance, respondents reported that they expect the quality of their products to continue to improve as a result of AI over the next one, five, and 10 years.

However, respondents also reported expectations that AI will have net-negative impacts on their careers, the environment, and society as a whole, and that these negative impacts will be fully realized in about five years’ time. This data is visualized in Figure 6.

Figure 6: Respondents’ expectations about AI’s future negative impacts in the next one, five, and 10 years. (Panels show the percentage of respondents with a negative outlook for product quality, delivery speed, organizational performance, career, society, and the environment; error bars represent 89% credibility intervals.)

Interview participants held similarly mixed feelings about the future impacts of AI as our survey respondents did. Some wondered about future legal actions in a yet-to-be-decided regulatory landscape, worrying they might "be on the wrong side of it, if things get decided" (P3).

Others echoed long-held anxieties and asked, "Is it going to replace people? Who knows? Maybe." (P2), while their peers dismissed their fears by drawing parallels to the past, when "people used to say ‘Oh, Y2K! Everything will be doomed!’ Blah, blah… because it was a new thing, at that time. [But,] nothing got replaced. In fact, there were more jobs created. I believe the same thing will happen with AI" (P1).

The future effects AI will have on our world remain unclear. But, this year, our survey strongly indicates that AI has produced an unignorable paradigm shift in the field of software development. So far, the changes have been well-received by development professionals.

1. https://www.sciencedaily.com/releases/2024/03/240306144729.htm
2. https://tech.co/news/list-ai-failures-mistakes-errors
3. https://klyker.com/absurd-yoga-poses-generated-by-ai/
4. https://dora.dev/dora-report-2023
5. Rogers, Everett M., Arvind Singhal, and Margaret M. Quinlan. "Diffusion of innovations." An integrated approach to communication theory and research. Routledge, 2014. 432-44; Tornatzky, L. G., & Fleischer, M. (1990). The processes of technological innovation. Lexington, MA: Lexington Books.
6. (P[N]), for example (P1), indicates the pseudonym of an interview participant.

Exploring the downstream impact of AI

Takeaways

This chapter investigates the impact of AI adoption across the spectrum, from individual developers to entire organizations. The findings reveal a complex picture with both clear benefits and unexpected drawbacks. While AI adoption boosts individual productivity, flow, and job satisfaction, it may also decrease time spent on valuable work.

Similarly, AI positively impacts code quality, documentation, and review processes, but surprisingly, these gains do not translate to improved software delivery performance. In fact, AI adoption appears detrimental in this area, while its effect on product performance remains negligible.

Despite these challenges, AI adoption is linked to improved team and organizational performance. This chapter concludes with a call to critically evaluate AI's role in software development and proactively adapt its application to maximize benefits and mitigate unforeseen consequences.

The AI moment & DORA

Estimates suggest that leading tech giants will invest approximately $1 trillion in the development of AI in the next five years.1 This aligns well with a statistic presented in the "Artificial intelligence: Adoption and attitudes" chapter that 81% of respondents say their company has shifted resources into developing AI.

The environmental impacts of AI further compound the costs. Some estimates suggest that by 2030, AI will drive an increase in data center power demand by 160%.2 The training of an AI model can add up to roughly "the yearly electricity consumption of over 1,000 U.S. households".3 It is no surprise that more than 30% of respondents think AI is going to be detrimental to the environment.

Beyond the development and environmental costs, we have the potential for adoption costs. This could come in many forms, from productivity decreases to the hiring of specialists. These adoption costs could also come at a societal level. Over a third of respondents believe AI will harm society in the coming decade. Given these costs, it seems natural for people to have a deep curiosity about the returns.

This curiosity has manifested itself in a wealth of media, articles, and research whose sentiment and data are both mixed, at least to some extent. Some believe that AI has dramatically enhanced the ability of humanity,4 others suggest that AI is little more than a benign tool for helping with homework,5 and some fear that AI will be the downfall of humanity.6

Evidence for proximal outcomes, such as the ability to successfully complete a particular task, is largely positive.7 When the outcome becomes more distant, such as a team’s codebase, the results start becoming a little less clear and a little less positive. For example, some research has suggested that code churn may double from the pre-2021 baseline.8

The challenge of understanding these downstream effects is unsurprising. The further away the effect is from the cause, the less pronounced and clear the connection.

Evaluating the downstream effects of AI mimics quantifying the effect of a rock thrown into a lake. You can most easily attribute the ripples closest to the impact point of the rock in the water, but the farther from the entry point you go, the less pronounced the effect of the rock is and the harder it is to ascribe waves to its impact.

AI is essentially a rock thrown into a stormy sea of other processes and dynamics. Understanding the extent of the waves caused by AI (or any technology or practice) is a challenge. This may be part of the reason the industry has struggled to adopt a principled set of measurement and analytic frameworks for understanding the impact of AI.9

Our approach is specifically designed to be useful for these types of challenges. DORA is designed to understand the utility or disutility of a practice. We’ve explored the downstream impacts of myriad practices over the last 10 years, including security practices, transformational leadership, generative cultures, documentation practices, continuous integration, continuous delivery, and user-centricity.10

We believe that DORA’s approach11 can help us learn about AI’s impact, especially as we explore the effects of AI across many outcomes.

Measuring AI adoption

The first challenge of capturing the impact of adopting AI is measuring the adoption of AI. We determined measuring usage frequency is likely not as meaningful as measuring reliance for understanding AI’s centrality to development workflows. You might only do code reviews or write documentation a few times a month or every couple of months, but you see these tasks as critically important to your work.

Conversely, just because you use AI frequently does not mean that you are using AI for work that you consider important or central to your role.

Given this, we asked respondents about their reliance on AI in general and for particular tasks. The previous chapter details the survey results and their interpretation.

Using factor analysis, we found our "general" AI reliance survey item had high overlap with reported AI reliance on the following tasks:

• Code writing
• Summarizing information
• Code explanation
• Code optimization
• Documentation
• Test writing

The strong commonality and covariance among these seven items suggests an underlying factor that we call AI adoption.

AI’s impact on individuals is a story of clear benefits (and some potential tradeoffs)

As we do every year, we measured a variety of constructs related to an individual’s success and well-being:

Job satisfaction: A single item designed to capture someone’s overall feeling about their job.

Burnout: A factor that encapsulates the multifaceted nature of burnout, encompassing its physical, emotional, and psychological dimensions, as well as its impact on personal life.

Flow: A single item designed to capture how much focus a person tends to achieve during development tasks.

Productivity: A factor score designed to measure the extent an individual feels effective and efficient in their work, creating value and achieving tasks.

Time doing toilsome work: A single item measuring the percentage of an individual’s time spent on repetitive, manual tasks that offer little long-term value.

Time doing valuable work: A single item measuring the percentage of an individual's time spent on tasks that they consider valuable.

We wanted to figure out if the way respondents answered these questions changes as a function of adopting AI. The results suggest that is often the case. Figure 7 is a visualization that shows our best estimates about the impact of adopting AI on an individual’s success and well-being.

If an individual increases AI adoption by 25%, the estimated change in each outcome is:

• Flow: +2.6%
• Job satisfaction: +2.2%
• Productivity: +2.1%
• Time doing toilsome work: +0.4%
• Burnout: -0.6%
• Time doing valuable work: -2.6%

Figure 7: Impacts of AI adoption on individual success and well-being. Points are estimated values; error bars represent 89% uncertainty intervals.
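The estimates in Figure 7 are expressed as the expected percentage change in an outcome when AI adoption increases by 25%, with an 89% uncertainty interval. The report’s models are Bayesian (see the Methodology and Models chapters); the sketch below shows a much simpler, frequentist way to produce a number of the same shape from simulated data, using a linear fit and a bootstrap interval.

```python
# Illustrative: estimate the % change in an outcome for a 0.25 increase in
# AI adoption (both rescaled to 0-1), with a bootstrapped 89% interval.
# The data are simulated; this is not the report's model.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
adoption = rng.uniform(0, 1, n)                           # AI adoption, 0-1
outcome = 0.5 + 0.08 * adoption + rng.normal(0, 0.1, n)   # e.g., a productivity score

def pct_change_for_25pct(adopt, out):
    slope, _ = np.polyfit(adopt, out, 1)
    return 100 * (slope * 0.25) / out.mean()  # change relative to the mean outcome

point = pct_change_for_25pct(adoption, outcome)
boot = [pct_change_for_25pct(adoption[idx], outcome[idx])
        for idx in (rng.integers(0, n, n) for _ in range(2000))]
lo, hi = np.percentile(boot, [5.5, 94.5])      # central 89% interval
print(f"estimated change: {point:.1f}% (89% interval {lo:.1f}% to {hi:.1f}%)")
```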

The clear benefits

The story about the benefit of adopting AI for individuals is largely favorable, but like any good story, has some wrinkles. What seems clear is that AI has a substantial and beneficial impact on flow, productivity, and job satisfaction (see Figure 7).

Productivity, for example, is likely to increase by approximately 2.1% when an individual’s AI adoption is increased by 25% (see Figure 7). This might seem small, but this is at the individual level. Imagine this pattern extended across tens of developers, or even tens of thousands of developers.

This pattern is what we expected. We believe it emerged in part thanks to AI’s ability to synthesize disparate sources of information and give a highly personalized response in a single location. Doing this on your own takes time, lots of context switching, and is less likely to foster flow.

Given the strong connection that productivity and flow have with job satisfaction, it shouldn’t be surprising that we see AI adoption leads to higher job satisfaction.

The potential tradeoffs

Here is where the story gets a little complicated. One value proposition for adopting AI is that it will help people spend more time doing valuable work. That is, by automating the manual, repetitive, toilsome tasks, we expect respondents will be free to use their time on "something better." However, our data suggest that increased AI adoption may have the opposite effect—reducing reported time spent doing valuable work—while time spent on toilsome work appears to be unaffected.

Markers of respondents’ well-being, like flow, job satisfaction, and productivity have historically been associated with time spent doing valuable work. So, observed increases in these measures independently of decreases in time spent on valuable work are surprising.

A good explanation of these patterns will need to wrestle with this seeming incongruity. A good explanation of a movie cannot ignore a scene that contradicts the explanation. A good explanation of a book cannot ignore a chapter that doesn’t fit neatly into the explanation. Similarly, a good explanation of these patterns cannot just focus on a subset of the patterns that allows us to tell a simple story.

There are innumerable hypotheses that could fit the data, but we came up with a hypothesis that seems parsimonious with flow, productivity, and job satisfaction benefitting from AI while time spent doing valuable work decreases and toil remains unchanged.

We call our hypothesis the vacuum hypothesis. By increasing productivity and flow, AI is helping people work more efficiently. This efficiency is helping people finish up work they consider valuable faster.

This is where the vacuum is created; there is extra time. AI does not steal value from respondents’ work, it expedites its realization.

Wait, what is valuable work?

To make sense of these counterintuitive findings we explored more deeply what types of work respondents judge to be valuable or toilsome.

Traditional wisdom, our past reports, and qualitative data from our interviews suggest that respondents find development-related tasks, like coding, to be valuable work, while less-valuable, even toilsome, work typically includes tasks associated with organizational coordination, like attending meetings. Within this categorization scheme, AI is better poised to assist with "valuable" work than "toilsome" work, as defined by respondents.

We turned to qualitative data from our interviews and found that, when responding to the moderator’s question of whether or not they would consider their work "meaningful," participants frequently measured the value of their work in relation to the impact of their work on others.

This is solidified by two years of past DORA evidence of the extremely beneficial impact of user-centricity on job satisfaction.

For example, when describing a recent role shift, P1012 indicated making the decision because "It helps me impact more people. It helps me impact more things." Similarly, P11 noted "if you build something from scratch and see it's delivered to a consumer or customer, you can feel that achievement, you can say to yourself, ‘Yeah! I delivered this and people use that!’"

Understanding that the "meaningfulness" of development work is derived from the impact of the solution created—not directly from the writing of the code—helps explain why we observed respondents spending less time on valuable work, while also feeling more satisfied with their jobs.

While AI is making the tasks people consider valuable easier and faster, it isn’t really helping with the tasks people don’t enjoy. That this is happening while toil and burnout remain unchanged, obstinate in the face of AI adoption, highlights that AI hasn’t cracked the code of helping us avoid the drudgery of meetings, bureaucracy, and many other toilsome tasks (Figure 8).

The good news is that AI hasn’t made it worse, nor has it negatively affected respondents’ well-being.

Figure 8: Not data, but a visualization of our hypothesis: AI is helping with our valuable work, but not helping us with our toil. (Diagram showing "what AI is helping with" overlapping valuable work but not toilsome work.)

The promising impact of AI on development workflows

The last section explored outcomes focused on the individual. The next set of outcomes shifts focus to explore processes, codebases, and team coordination. Here is a list of the outcomes we measured:

Code complexity: The degree to which code’s intricacy and sophistication hinders productivity.

Technical debt: The extent to which existing technical debt within the primary application or service has hindered productivity over the past six months.

Code review speed: The average time required to complete a code review for the primary application or service.

Approval speed: The typical duration from proposing a code change to receiving approval for production use in the primary application or service.

Cross-functional team (XFN) coordination: The level of agreement with the statement: "Over the last three months, I have been able to effectively collaborate with cross-functional team members."

Code quality: The level of satisfaction or dissatisfaction with the quality of code underlying the primary service or application in the last six months.

Documentation quality: The perception of internal documentation (manuals, readmes, code comments) in terms of its reliability, findability, updatedness, and ability to provide support.

As before, our goal here is to understand if these aspects seem to vary as a function of adopting AI. Figure 9 is a visualization that shows our best estimates of the change in these outcomes in relation to a 25% increase in AI adoption.

Overall, the patterns here suggest a very compelling story for AI. Here are the substantial results from this section. A 25% increase in AI adoption is associated with a…

• 7.5% increase in documentation quality
• 3.4% increase in code quality
• 3.1% increase in code review speed
• 1.3% increase in approval speed
• 1.8% decrease in code complexity

If AI adoption increases by 25%, the estimated change in each outcome is:

• Documentation quality: +7.5%
• Code quality: +3.4%
• Code review speed: +3.1%
• Approval speed: +1.3%
• XFN coordination: +0.1%
• Tech debt: -0.8%
• Code complexity: -1.8%

Figure 9: Impacts of AI adoption on organizations. Points are estimated values; error bars represent 89% uncertainty intervals.

The data presented in the "Artificial intelligence: Adoption and attitudes" chapter show the most common use of AI is for writing code. 67% of respondents report that AI is helping them improve their code. Here, we see further confirmation of that sentiment. AI seems to improve code quality and reduce code complexity (Figure 9). When combined with some potential refactoring of old code, the high-quality, AI-generated code could lead to an overall better codebase. This codebase might be additionally improved by having better access to quality documentation, which people are using AI to generate (see Artificial intelligence: Adoption and attitudes).

Better code is easier to review and approve. Combined with AI-assisted code reviews, we can get faster reviews and approvals, a pattern that has clearly emerged in the data (Figure 9).

Of course, faster code reviews and approvals do not equate to better and more thorough code review processes and approval processes. It is possible that we’re gaining speed through an over-reliance on AI for assisting in the process or trusting code generated by AI a bit too much. This finding is not at odds with the patterns in Figure 9, but it is also not the obvious conclusion.

Further, it isn’t obvious whether the quality of the code and the quality of the documentation are improving because AI is generating it or if AI has enhanced our ability to get value from what would have otherwise been considered low-quality code and documentation. What if the threshold for what we consider quality code and documentation simply moves down a little bit when we’re using AI because AI is powerful enough to help us make sense of it? These two ways of understanding these patterns are not mutually exclusive interpretations; both could be contributing to these patterns.

What seems clear in these patterns is that AI helps people get more from the documents they depend on and the codebases they work on. AI also helps reduce costly bottlenecks in the code review and approval process. What isn’t obvious is how exactly AI is doing this and if these benefits lead to further downstream benefits, such as software delivery improvements.

AI is hurting delivery performance

For the past few years, we have seen that software delivery
throughput and software delivery stability indicators were
starting to show some independence from one another.
While the traditional association between throughput and
stability has persisted, emerging evidence suggests these
factors operate with sufficient independence to warrant
separate consideration.

If AI adoption increases by 25%, the estimated change in each outcome is:

• Delivery throughput: -1.5%
• Delivery stability: -7.2%

Figure 10: Impacts of AI adoption on delivery throughput and stability. Points are estimated values; error bars represent 89% uncertainty intervals.

Contrary to our expectations, our findings indicate that AI adoption is negatively impacting software delivery performance. We see that the effect on delivery throughput is small, but likely negative (an estimated 1.5% reduction for every 25% increase in AI adoption). The negative impact on delivery stability is larger (an estimated 7.2% reduction for every 25% increase in AI adoption). This data is visualized in Figure 10.

Historically, our research has found that improvements to the software development process, including improved documentation quality, code quality, code review speed, approval speed, and reduced code complexity lead to improvements in software delivery. So, we were surprised to see AI improve these process measures, while seemingly hurting our performance measures of delivery throughput and stability.

Drawing from our prior years’ findings, we hypothesize that the fundamental paradigm shift that AI has produced in terms of respondent productivity and code generation speed may have caused the field to forget one of DORA’s most basic principles—the importance of small batch sizes. That is, since AI allows respondents to produce a much greater amount of code in the same amount of time, it is possible, even likely, that changelists are growing in size. DORA has consistently shown that larger changes are slower and more prone to creating instability.

Considered together, our data suggest that improving the development process does not automatically improve software delivery—at least not without proper adherence to the basics of successful software delivery, like small batch sizes and robust testing mechanisms.

The beneficial impact that AI has on many important individual and organizational factors that foster the conditions for high software delivery performance is reason for optimism. But, AI does not appear to be a panacea.

High-performing teams and organizations use AI, but products don’t seem to benefit.

Here we look at AI's relationship with our most downstream outcomes:

Organizational performance: This is a factor score that accounts for an organization's overall performance, profitability, market share, total customers, operating efficiency, customer satisfaction, quality of products/service, and ability to achieve goals.

Team performance: This is a factor score that accounts for a team’s ability to collaborate, innovate, work efficiently, rely on each other, and adapt.

Product performance: This is a factor score that accounts for the usability, functionality, value, availability, performance (for example, latency), and security of a product.

Drawing a connection from these outcomes to an individual adopting AI is difficult and noisy. Sometimes it feels like we’re trying to analyze the impact of what you had for lunch today on how well your organization performs this year.

There is a logic to making jumps from the micro-level (for example, an individual) to the macro-level (for example, an organization). We discuss that inferential leap in the Methodology chapter. For now, let’s just check out the associations:

If AI adoption increases by 25%, the estimated change in each outcome is:

• Organizational performance: +2.3%
• Team performance: +1.4%
• Product performance: +0.2%

Figure 11: Impacts of AI adoption on organizational, team, and product performance. Points are estimated values; error bars represent 89% uncertainty intervals.

Organization-level performance (an estimated 2.3% increase for every 25% increase in AI adoption) and team-level performance (an estimated 1.4% increase for every 25% increase in AI adoption) seem to benefit from AI adoption (Figure 11). Product performance, however, does not seem to have an obvious association with AI adoption. Now, we can shift to trying to understand what is underlying these effects.

We hypothesize that the factors contributing to strong team and organizational performance differ from those influencing product performance.

Teams and organizations rely heavily on communication, knowledge sharing, decision making, and healthy culture. AI could be alleviating some bottlenecks in those areas, beneficially impacting teams and organizations.

Product success, however, might involve additional factors. Although good products surely have similar underlying causes as high-performing teams and organizations, there is likely a closer and more direct connection to the development workflow and software delivery, both of which may still be stabilizing after the introduction of AI.

The unique importance of technical aspects underlying a good product might explain part of it, but there is also an art and empathy underlying a great product. This might be difficult to believe for people who think everything is a problem to be resolved through computation, but certain elements of product development, such as creativity or user experience design, may still (or forever) heavily rely on human intuition and expertise.

The fact remains that organization, team, and product performance are undeniably interconnected. When looking at bivariate correlations (Pearson), we find product performance has a medium positive correlation with both team performance (r = 0.56, 95% confidence interval = 0.51 to 0.60) and organizational performance (r = 0.47, 95% confidence interval = 0.41 to 0.53).

These outcomes influence each other reciprocally, creating clear interdependencies. High-performing teams tend to develop better products, but inheriting a subpar product can hinder their success. Similarly, high-performing organizations foster high-performing teams through resources and processes, but organizational struggles can stifle team performance. Therefore, if AI adoption significantly benefits teams and organizations, it's reasonable to expect benefits for products to emerge as well.

The adoption of AI is just starting. Some benefits and detriments may take time to materialize, either due to the inherent nature of AI's impact or the learning curve associated with its effective utilization.
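As a hedged illustration of the kind of summary quoted above, the sketch below computes a Pearson correlation and an approximate 95% confidence interval via the Fisher z-transformation on made-up data; it is not the report's analysis code, and the variable names are ours.

```python
# A hedged illustration (not the report's analysis code) of a bivariate Pearson
# correlation with an approximate 95% confidence interval via the Fisher z-transform.
# The data below is made up; only the method mirrors the summaries quoted above.
import numpy as np

rng = np.random.default_rng(0)
team = rng.normal(size=500)                             # e.g., team performance scores
product = 0.6 * team + rng.normal(scale=0.8, size=500)  # correlated product scores

r = np.corrcoef(team, product)[0, 1]
z = np.arctanh(r)                                       # Fisher z-transformation
se = 1 / np.sqrt(len(team) - 3)                         # standard error of z
lower, upper = np.tanh([z - 1.96 * se, z + 1.96 * se])

print(f"r = {r:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```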

43 Exploring the downstream impact of AI


v. 2024.3
Perhaps the story is simply that
we're figuring out how AI can help
organizations and teams before
we’ve fully realized its potential for
product innovation and development.
Figure 12 tries to visualize how this
might be unfolding.

Figure 12: Representations of different learning curves for teams and products over time using AI, with benefit and detriment on the vertical axis and a "we are here" marker early on the curves. This is an abstraction for demonstrative purposes and is not derived from real data.

44 Exploring the downstream impact of AI


v. 2024.3
So now what?

We wanted to understand the potential of AI as it currently stands to help individuals, teams, and organizations. The patterns that are emerging underscore that it isn't all hot air; there really is something happening.

There is clear evidence in favor of adopting AI. That said, it is also abundantly clear that there are plenty of potential roadblocks, growing pains, and ways AI might have deleterious effects.

Adopting AI at scale might not be as easy as pressing play. A measured, transparent, and adaptable strategy has the potential to lead to substantial benefits. This strategy is going to need to be co-developed by leaders, teams, organizations, researchers, and those developing AI.

Leaders and organizations need to find ways to prioritize adoption in the areas that will best support their employees.

Here are some thoughts about how to orient your AI adoption strategy:

Define a clear AI mission and policies to empower your organization and team.

Provide employees with transparent information about your AI mission, goals, and AI adoption plan. By articulating both the overarching vision and specific policies — addressing procedural concerns such as permitted code placement and available tools — you can alleviate apprehension and position AI as a means to help everyone focus on more valuable, fulfilling, and creative work.

45 Exploring the downstream impact of AI


v. 2024.3
Create a culture of continuous learning and experimentation with AI.

Foster an environment that encourages continuous exploration of AI tools by dedicating time for individuals and teams to discover beneficial use cases and granting them autonomy to chart their own course. Build trust with AI technologies through hands-on experience in sandbox or low-risk environments. Consider further mitigating risks by focusing on developing robust test automation. Implement a measurement framework that evaluates AI not by sheer adoption but by meaningful downstream impacts — how it helps employees thrive, benefits those who rely on your products, and unlocks team potential.

Recognize and leverage AI's trade-offs for competitive advantage.

By acknowledging potential drawbacks — such as reduced time spent on valuable work, over-reliance on AI, the potential for benefits gained in one area leading to challenges in another, and impacts on software delivery stability and throughput — you can identify opportunities to avoid pitfalls and positively shape AI's trajectory at your organization and on your team. Developing an understanding not only of how AI can be beneficial, but of how it can be detrimental, allows you to expedite learning curves, support exploration, and translate your learnings into action and a real competitive advantage.

It is obvious that there is a lot to be excited about and even more to learn. DORA will
stay tuned in and do our best to offer honest, accurate, and useful perspectives, just
as it has over the past decade.

1. https://www.goldmansachs.com/insights/top-of-mind/gen-ai-too-much-spend-too-little-benefit
2. https://www.goldmansachs.com/insights/articles/AI-poised-to-drive-160-increase-in-power-demand
3. https://www.washington.edu/news/2023/07/27/how-much-energy-does-chatgpt-use/
4. https://www.gatesnotes.com/The-Age-of-AI-Has-Begun
5. https://www.businessinsider.com/ai-chatgpt-homework-cheating-machine-sam-altman-openai-2024-8
6. https://www.safe.ai/work/statement-on-ai-risk
7. https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/
8. https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality
9. https://www.nytimes.com/2024/04/15/technology/ai-models-measurement.html
10. https://dora.dev/capabilities
11. we should be clear that this isn’t a unique approach, but it is a somewhat unique approach for this space
12. (P[N]), for example (P1), indicates pseudonym of interview participants.

46 Exploring the downstream impact of AI


v. 2024.3
Platform
engineering

Introduction
Platform engineering is an emerging
engineering discipline that has been
gaining interest and momentum across
the industry. Industry leaders such as
Spotify and Netflix, and books such as
Team Topologies1 have helped excite
audiences.

Platform engineering is a sociotechnical


discipline where engineers focus on
the intersection of social interactions
between different teams and the
technical aspects of automation, self-
service, and repeatability of processes.
The concepts behind platform
engineering have been studied for many
years, including by DORA.

Generally, our research is focused on


how we deliver software to external
users, whereas the output of platform
teams is typically an inwardly-focused
set of APIs, tools, and services designed
to support the software development
and operations lifecycle.

47 Platform engineering
v. 2024.3
In platform engineering, a lot of energy and focus is spent on improving the developer experience by building golden paths: highly-automated, self-service workflows that users of the platform follow when interacting with the resources required to deliver and operate applications. Their purpose is to abstract away the complexities of building and delivering software so that the developer only needs to worry about their code.

Some examples of the tasks automated through golden paths include new application provisioning, database provisioning, schema management, test execution, build and deployment infrastructure provisioning, and DNS management.

For example, if the platform has the capability to execute unit tests and report results directly back to development teams, without those teams needing to build and manage the test execution environment, then the continuous integration platform feature enables teams to focus on writing high-quality tests. In this example, the continuous integration feature can scale across the larger organization and make it easier for multiple teams to improve their capabilities with continuous testing3 and test automation.4
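To make the idea of a golden path more concrete, here is a purely illustrative sketch of a self-service provisioning request to a hypothetical internal developer platform API; every endpoint, field, and value is invented for demonstration and does not describe any specific platform.

```python
# A purely illustrative golden-path request to a hypothetical internal developer
# platform API. Every field and value here is invented; real platforms expose
# their own templates and interfaces.
import json

request = {
    "template": "standard-web-service",      # the golden path a team opts into
    "service_name": "checkout-api",
    "database": {"engine": "postgres", "size": "small"},
    "ci": {"run_unit_tests": True, "report_results_to_team": True},
    "dns": {"subdomain": "checkout"},
}

# In practice this payload would be submitted to the platform's self-service
# endpoint (a made-up URL such as https://platform.internal.example/v1/services).
print(json.dumps(request, indent=2))
```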

Concepts in platform engineering such


as moving a capability down (sometimes
called “shifting down”)2 into a shared
system can seem counter to approaches
like 'you build it, you run it.' However,
we think of platform engineering as a
method to scale the adoption of these
practices across an organization because
once a capability is in the platform,
teams essentially get it for free through
adoption of the platform.

48 Platform engineering
v. 2024.3
A key factor in success is to approach platform engineering with user-centeredness (users, in the context of an internal developer platform, are developers), developer independence, and a product mindset. This isn't too surprising given that user centricity was identified as a key factor in improved organizational performance this year and in previous years.5 Without a user-centered approach, the platform will be more of a hindrance than an aid.

In this year's report, we sought to test the relationship between platforms and software delivery and operational performance. We found some positive results. Internal developer platform users had 8% higher levels of individual productivity and 10% higher levels of team performance. Additionally, an organization's software delivery and operations performance increases 6% when using a platform. However, these gains do not come without some drawbacks. Throughput and change stability saw decreases of 8% and 14%, respectively, which was a surprising result.

In the next sections we'll dig deeper into the numbers, nuances, and some surprising data that this survey revealed. Whether your platform engineering initiative is just starting or has been underway for many years, applying the key findings can help your platform be more successful.

Figure 13: Productivity factor for individuals when using or not using an internal developer platform (estimated productivity factor score, 0-10). Each dot is one of 8,000 estimates of the most plausible mean productivity score.

49 Platform engineering
v. 2024.3
The promise of platform engineering

Internal developer platforms are garnering interest from large sections of the software development and IT industry, given the potential efficiency and productivity gains that could be achieved through the practice. For this year's survey, we left the definition of an internal developer platform quite broad6 and found that 89% of respondents are using an internal developer platform. The interaction models are very diverse across that population. These data points align with the broad level of industry interest in platform engineering and the emerging nature of the field.

Overall, the impact of a platform is positive: individuals were 8% more productive and teams performed 10% better when using an internal developer platform.

Beyond productivity, we also see gains in an organization's overall performance when a platform is used, with an increase of 6%. On the whole, the organization is able to quickly deliver software, meet user needs, and drive business value due to the platform.
Figure 14: Organizational performance change when using an internal developer platform versus the age of the platform (less than a year, 1-2 years, 2-5 years, more than 5 years).

50 Platform engineering
v. 2024.3
When taking into account the age of the platform alongside productivity, we see initial performance gains at the onset of a platform engineering initiative, followed by a decrease and recovery as the platform ages and matures. This pattern is typical of transformation initiatives that experience early gains but encounter challenges once those have been realized.

In the long run, productivity gains are maintained, showing the overall potential of an internal developer platform's role in the software delivery and operational processes.

Key finding - impact of developer independence

Developer independence had a significant impact on the level of productivity at both the individual and team levels when delivering software using an internal developer platform. Developer independence is defined as "developers' ability to perform their tasks for the entire application lifecycle, without relying on an enabling team."

At both the team and individual level we see a 5% improvement in productivity when users of the platform are able to complete their tasks without involving an enabling team. This finding points back to one of the key principles of platform engineering: focusing on enabling self-service workflows.

For platform teams, this is key because


it points to an important part of the
platform engineering process: collecting
feedback from users. Survey responses
did not indicate which forms of feedback
are most effective, but common
methods are informal conversations
and issue trackers, followed by ongoing
co-development, surveys, telemetry,
and interviews.

All of these methods can be effective


at understanding whether or not
users are able to complete their tasks
independently. The survey data also
showed that not collecting feedback on
the platform has a negative impact.

51 Platform engineering
v. 2024.3
Secondary finding - impact of a dedicated platform team

Interestingly, the impact on productivity of having a dedicated platform team was negligible for individuals. However, it resulted in a 6% gain in productivity at the team level. This finding is surprising because of its uneven impact: a dedicated platform team appears to offer little direct benefit to individuals, but is more impactful for teams overall.

Since teams have multiple developers with different responsibilities and skills, they naturally have a more diverse set of tasks when compared to an individual engineer. It is possible that having a dedicated platform engineering team allows the platform to be more supportive of the diversity in tasks represented by a team.

Overall, having an internal developer platform has a positive impact on productivity. The key factors are:

A user-centered approach that enables developer independence through self-service and workflows that can be completed autonomously. Recall that, in the context of the platform, users are internal engineering and development teams.

As with other transformations, the "j-curve" also applies to platform engineering, so productivity gains will stabilize through continuous improvement.

52 Platform engineering
v. 2024.3
The unexpected
downside
While platform engineering presents some definite upsides, in terms of teams and individuals feeling more productive and improvements in organizational performance, it also had an unexpected downside: we found that throughput and change stability decreased.

Unexpectedly, we also discovered a very interesting linkage between change instability and burnout.

Throughput

In the case of throughput, we saw approximately an 8% decrease when compared to those who don't use a platform. We have hypotheses about what might be the underlying cause.

First, the added machinery that changes need to pass through before getting deployed to production decreases the overall throughput of changes. In general, when an internal developer platform is being used to build and deliver software, there is usually an increase in the number of "handoffs" between systems and, implicitly, teams.

For example, when code is committed to source control, it is automatically picked up by different systems for testing, security checks, deployment, and monitoring. Each of these handoffs is an opportunity for time to be introduced into the overall process, resulting in a decrease in throughput but a net increase in the ability to get work done.

Second, for respondents who reported that they are required to "exclusively use the platform to perform tasks for the entire app lifecycle," there was a 6% decrease in throughput. While not a definitive connection, it could also be related to the first hypothesis. If the number of systems and tools involved in developing and releasing software increases with the presence of a platform, being required to use the platform when it might not be fit for purpose, or the naturally increasing latency in the process, could account for the relationship between exclusivity and the decrease in throughput.

To counter this, it is important to be user-centered and work toward user independence in your platform engineering initiatives.

53 Platform engineering
v. 2024.3
Change instability and burnout

When considering the stability of the changes to applications being developed and operated using an internal developer platform, we observed a surprising 14% decrease in change stability. This indicates that the change failure rate and rate of rework are significantly increased when a platform is being used.

Even more interesting, in the results we discovered that instability in combination with a platform is linked to higher levels of burnout. That isn't to say that platforms lead to burnout, but the combination of instability and platforms is particularly troublesome when it comes to burnout. Similar to the decrease in throughput, we aren't entirely sure why the change in burnout occurs, but we have some hypotheses.

First, the platform enables developers and teams to push changes with a higher degree of confidence that, if a change is bad, it can be quickly remediated. In this instance the higher level of instability isn't necessarily a bad thing, since the platform is empowering teams to experiment and deliver changes, which results in an increased level of change failure and rework.

A second idea is that the platform isn't effective at ensuring the quality of changes and/or deployments to production. It could also be that the platform provides an automated testing capability that exercises whatever tests are included in the application, yet application teams aren't fully using that capability, prioritizing throughput over quality and not improving their tests. In either scenario, bad changes are actually making it through the process, resulting in rework.

A third possibility is that teams with a high level of change instability and burnout tend to create platforms in an effort to improve stability and reduce burnout. This makes sense because platform engineering is often viewed as a practice which reduces burnout and increases the ability to consistently ship smaller changes. With this hypothesis, platform engineering is symptomatic of an organization with burnout and change instability.

In the first two scenarios, the rework allowed by the platform could be seen as burdensome, which could also be increasing burnout. In particular, the second scenario, where the platform is enabling bad changes, would contribute more to burnout, but in both scenarios the team or individual could still feel productive because of their ability to push changes and features. In the third scenario, change instability and burnout are predictive of a platform engineering initiative and the platform is seen as a solution to those challenges.

54 Platform engineering
v. 2024.3
Balancing the trade-offs

While platform engineering is no panacea, it has the potential to be a powerful discipline when it comes to the overall software development and operations process. As with any discipline, platform engineering has benefits and drawbacks.

Based on our research, there are a couple of actions you can take to balance the trade-offs when embarking on a platform engineering initiative. Doing so will help your organization achieve the benefits of platform engineering while being able to monitor and manage any potential downsides.

First, prioritize platform functionality that enables developer independence and self-service capabilities. When doing this, pay attention to the trade-off of exclusively requiring the platform to be used for all aspects of the application lifecycle, which could hinder developer independence.

As good practice, a platform should provide methods for its users to break out of the tools and automations provided in the platform, which contributes to independence; however, it comes at the cost of complexity. This trade-off can be mitigated with a dedicated platform team that actively collaborates with and collects feedback from users of the platform. Collaboration and feedback improve the user-centeredness of the platform initiative and will contribute to the long-term success of the platform. As we saw in the data, there are many different methods used to collect feedback, so employ more than one approach to maximize feedback collection.

Second, carefully monitor the instability of your application changes and try to understand whether the instability being experienced is intentional or not. Platforms have the potential to unlock experimentation, increase productivity, and improve performance at scale. However, that same experimentation can come at the cost of instability and burnout, so it needs to be carefully monitored and accounted for throughout the platform engineering journey. When doing so, it is important to understand your appetite for instability. Using service level objectives (SLOs) and error budgets from site reliability engineering (SRE) can help you gauge your risk tolerance and the effectiveness of the platform in safely enabling experimentation.

Internal developer platforms put a lot of emphasis on the developer experience; however, there are many other teams (including database administrators, security, and operations) who are required to effectively deliver and operate software.
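As a small, hedged illustration of the SLO and error-budget idea mentioned above, the snippet below gauges how much of an error budget a team has consumed; all numbers are made-up placeholders, not recommendations.

```python
# A small, hedged sketch of using an SLO and error budget to gauge appetite for
# instability. All numbers are illustrative placeholders, not recommendations.
slo_target = 0.995          # e.g., 99.5% of changes or requests succeed in the window
window_events = 20_000      # total events observed in the window
failed_events = 140         # failures observed in the same window

error_budget = (1 - slo_target) * window_events     # failures the SLO tolerates
budget_consumed = failed_events / error_budget      # > 1.0 means the budget is spent

print(f"error budget: {error_budget:.0f} events, consumed: {budget_consumed:.0%}")
if budget_consumed > 1.0:
    print("Budget exhausted: slow risky changes and invest in stability.")
else:
    print("Budget remaining: there is room to keep experimenting.")
```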

55 Platform engineering
v. 2024.3
In your platform engineering initiatives,
foster a culture of user-centeredness
and continuous improvement across
all teams and aligned with the
organization’s goals.

Doing so will align the platform’s


features, services, and APIs to best serve
individual and team needs as they work
to deliver software and business value.

1. Skelton, Matthew and Pais, Manuel. 2019. Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution Press. https://teamtopologies.com/
2. https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left
3. https://dora.dev/capabilities/continuous-integration/
4. https://dora.dev/capabilities/test-automation/
5. https://dora.dev/research/2023/, https://dora.dev/research/2016/
6. https://dora.dev/research/2024/questions/#platform-engineering

56 Platform engineering
v. 2024.3
Developer
experience

Takeaways
Software doesn't build itself. Even when assisted by AI, people build software, and their experiences at work are a foundational component of successful organizations.

In this year's report, we again found that alignment between what developers build and what users need allows employees and organizations to thrive. Developers are more productive, less prone to experiencing burnout, and more likely to build high-quality products when they build software with a user-centered mindset.

Ultimately, software is built for people, so it's the organization's responsibility to foster environments that help developers focus on building software that will improve the user experience. We also find that stable environments, where priorities are not constantly shifting, lead to small but meaningful increases in productivity and important, meaningful decreases in employee burnout.

Environmental factors have substantial consequences for the quality of the products developed and for the overall experience of the developers whose job is to build those products.

57 Developer experience
v. 2024.3
Put the user first, and (almost) everything else falls into place

We think that the job of a developer is pretty cool. Developers are at the forefront of technological advancements and help shape how we live, work, and interact with the world.

Their jobs are fundamentally tied to people–the users of the software and applications they create. Yet developers often work in environments that prioritize features and innovation. There's less emphasis on figuring out whether these features provide value to the people who use the products they make.

Here we provide compelling evidence showing that an approach to software development that prioritizes the end user positively impacts employees and organizations alike.

This year, we asked questions focused on understanding whether developers:

1. Incorporate user feedback to revisit and reprioritize features

2. Know what users want to accomplish with a specific application/service

3. Believe focusing on the user is key to the success of the business

4. Believe the user experience is a top business priority

58 Developer experience
v. 2024.3
Our findings and what they mean

Our data strongly suggests that organizations that see users' needs and challenges as a guiding light make better products.

We find that focusing on the user increases productivity and job satisfaction, while reducing the risk of burnout.

Importantly, these benefits extend beyond the individual employee to the organization. In previous years, we've highlighted that high-performing organizations deliver software quickly and reliably. The implication is that software-delivery performance is a requirement for success.

However, our data indicates there's another path that leads to success: developers and their employers, and organizations in general, can create a user-centered approach to software development.

We find that when organizations know and understand users' needs, stability and throughput of software delivery are not a requirement for product quality. Product quality will be high as long as the user experience is at the forefront.

When organizations don't focus on the user and don't incorporate user feedback into their development process, doubling down on stable and fast delivery is the only path to product quality (see Figure 15).

59 Developer experience
v. 2024.3
We understand the inclination that some organizations might have to focus on creating features and innovating on technologies. At face value, this approach makes sense. After all, developers most certainly know the ins and outs of the technology much better than their average user.

However, developing software based on assumptions about the user experience increases the likelihood of developers building features that are perhaps shiny but hardly used.1

When organizations and employees understand how their users experience the world, they increase the likelihood of building features that address the real needs of their users. Addressing real user needs increases the chances of those features actually being used.

Focus on building for your user and you will create delightful products.

Figure 15: Product performance and delivery throughput across 3 levels of user centricity (low, medium, high). The vertical axis shows predicted product performance; the horizontal axis shows delivery throughput (0.0-10.0).

60 Developer experience
v. 2024.3
Why is a user-centered approach to software development such a powerful philosophy and practice?

Academic research shows that deriving a sense of purpose from work benefits employees and organizations.2,3

For example, a recent survey showed that 93% of workers reported that it's important to have a job where they feel the work they do is meaningful.4 In a similar vein, another survey found that, on average, respondents were willing to relinquish 23% of their entire future earnings if it meant they could have a job that was always meaningful.5

That's an eye-popping trade-off employees are willing to make. It tells us something about what motivates people, and that people want to spend their time doing something that matters.

Provides a clear sense of direction:

A user-centered approach to software development can fundamentally alter how developers view their work. Instead of shipping arbitrary features and guessing whether users might use them, developers can rely on user feedback to help them prioritize what to build.

This approach gives developers confidence that the features they are working on have a reason for being. Suddenly, their work has meaning: to ensure people have a superb experience when using their products and services. There's no longer a disconnect between the software that's developed and the world in which it lives.

Developers can see the direct impact of their work through the software they create.

“It would be grand if everybody could work at a company that affects individuals outside of the company, or [in] your local community in a positive way. That’s not always the case. That’s not always possible. A lot of the grand vision of autonomous driving is that it is going to enable people that can drive [to] sleep while they’re on a motorway. That’s not why I’m here. I want to help people that can’t drive to be able to get about, wherever they want, have the freedom to do whatever they want to do.” (P2)6

“We are, as a company, under pressure to deliver. So, all of these, like, nice shiny things, or discussion points about how you want to improve, it’s kind of, like, with the recent change in how we’re structured, we’re focusing on delivery, not quality, and for me, personally, that’s kind of a big bugbear.” (P9)

61 Developer experience
v. 2024.3
Increases cross-functional collaborations:

Even the most talented developer doesn't build software on their own. Building high-quality products takes the collaboration of many people, often with different yet complementary talents.

A user-centered approach to development allows developers to engage in cross-functional collaborations across the organization. In doing so, their responsibilities extend beyond simply shipping software. They are now part of a team driven to create incredible experiences for the people who use them.

This approach to software development can help developers break out of silos, seek alignment, foster teamwork, and create opportunities to learn more from others. Problem solving takes a different shape. It's not just about how to solve technical problems, but how to do so in ways that serve the user best.

This approach can help increase employee engagement and create an even more intellectually-stimulating environment that can stave off the feelings of stagnation that are associated with burnout.

What can organizations do?

Based on our findings, we recommend organizations invest time and resources in getting to know their users. Focus on understanding who you are building for, and the challenges they experience. We strongly believe this is a worthy investment.

Resist the temptation to make assumptions about your users. Observe them in their environments, ask them questions, and be humble enough to pivot based on what they tell you. In doing so, developers will be more productive and less prone to burnout while delivering higher-quality products.

62 Developer experience
v. 2024.3
The combination of good docs and a user-centered approach to software development is a powerful one.

Teams that focus on the user see an increase in product performance. When this focus on the user is combined with an environment of quality internal documentation, this increase in product performance is amplified (see Figure 16). This finding is similar to the behavior we see where documentation amplifies a technical capability's impact on organizational performance.7

Documentation helps propagate user signals and feedback across the team and into the product itself.

We see that internal documentation doesn't meaningfully affect predicted product performance without user signals. However, if a team has high-quality internal documentation, then the user signals included in it will have a higher impact on product performance.

We started to look at documentation in 2021, and every year we continue to find extensive impact of quality documentation. This year's findings add internal documentation's impact on predicted product performance to the list.

Figure 16: Product performance and documentation quality across 3 levels of user centricity (low, medium, high). The vertical axis shows predicted product performance; the horizontal axis shows documentation quality. The graph is a composite of 12,000 lines from simulations trying to estimate the most plausible pattern.

63 Developer experience
v. 2024.3
Culture of documentation

The Agile manifesto advocates for "working software over comprehensive documentation".8 We continue to find, however, that quality documentation is a key component of working software.

"Comprehensive documentation" may be a phrase standing in for unhealthy practices, which might include documentation. Problematic documentation includes documentation that is created only for bureaucratic purposes, or to paper over mistrust between management and employees. An unhealthy documentation culture can also include writing documentation but not maintaining or consolidating it.

In these cases, our measure of quality documentation would likely score low. This type of content is written for the wrong audience, so it doesn't perform as well when you try to use it while doing your work. And too much documentation can be as problematic as not enough.

Our measure of quality documentation includes attributes like findability and reliability of the documentation. Remember, for internal documentation, the primary audience is your colleagues or even your future self trying to accomplish specific tasks.9 Teams with a healthy documentation culture have a focus on serving these readers. This is another way that focusing on your users matters.

You can create a healthy culture of documentation on your own teams by following the practices we've identified to create quality documentation, such as:

• Documenting critical use cases.

• Taking training in technical writing.

• Defining ownership and processes to update the documentation.

• Distributing documentation work within the team.

• Maintaining documentation as part of the software development lifecycle.

• Deleting out-of-date or redundant documentation.

• Recognizing documentation work in performance reviews and promotions.

64 Developer experience
v. 2024.3
The perils of
ever-shifting priorities

We all know the feeling. You've spent the last few months working on a new feature. You know it's the right thing to build for your users; you are focused and motivated. Suddenly, or seemingly so, the leadership team decides to change the organization's priorities. Now it's unclear whether your project will be paused, scrapped, Frankensteined, or mutated.

This common experience can have profound implications for employees and organizations. Here we examine what happens when organizations constantly shift their priorities.

Our findings and what they mean

Overall, our findings show small but meaningful decreases in productivity and substantial increases in burnout when organizations have unstable priorities.

Our data indicates it is challenging to mitigate this increase in burnout. We examined whether having strong leaders, good internal documents, and a user-centered approach to software development can help counteract the effect of shifting priorities on burnout.

The answer is: They can’t. An organization


can have all these positive traits and, if
priorities are unstable, employees will still
be at risk of experiencing burnout.

65 Developer experience
v. 2024.3
Why are unstable organizational priorities bad for employees' well-being?

We hypothesize that unstable organizational priorities increase employee burnout by creating unclear expectations, decreasing employees' sense of control, and increasing the size of their workloads.

To be clear, we believe that the problem is not with changing priorities themselves. Business goals and product direction shift all the time. It can be good for organizational priorities to be malleable.

We believe it is the frequency with which priorities change that has a negative impact on employees' well-being. The uncertainty that accompanies unstable priorities implies something chronic about the frequency with which priorities change.

Decades of academic research have shown the detrimental effects of chronic stress on health and well-being.10 We see parallels between research on chronic stress and our findings. Chronic instability increases uncertainty and decreases perceived control. This combination is an excellent recipe for burnout.

What happens when priorities stabilize?

Our findings here are a little puzzling. We find that when priorities are stabilized, software delivery performance declines. It becomes slower and less stable.

We hypothesize that this might be because organizations with stable priorities might have products and services that are generally in good shape, so changes are made less frequently. It is also possible that stability of priorities leads to shipping less, and in larger batches than recommended.

Nevertheless, we find this to be an unexpected finding. Why do you think stabilizing organizational priorities decreases the speed and stability of software delivery?

66 Developer experience
v. 2024.3
Building AI for end users creates stability in priorities, but not stability in delivery.

Incorporating AI-powered experiences for end users stabilizes organizational priorities. This sounds like a flashy endorsement for AI. However, we do not interpret this finding as telling us something meaningful about AI itself.

Instead, we believe that shifting efforts towards building AI provides clarity and a northstar for organizations to follow. This clarity, and not AI, is what leads to a stabilization of organizational priorities.

This is worth highlighting because it tells us something about what happens to organizations when new technologies emerge. New technologies bring change, and organizations need time to adapt. This period likely leads to a destabilization of priorities as leaders try to figure out the best move for the organization. As the dust settles, and organizations clarify their next steps, priorities begin to stabilize.

Priorities stabilizing, however, doesn't immediately translate into the software delivery process stabilizing. Our analyses show that a shift to adding AI-powered experiences into your service or application comes with challenges and growing pains.

We find that teams that have shifted have a significant 10% decrease in software delivery stability relative to teams that have not. Here is a visualization depicting the challenge.

Figure 17: Software delivery stability as a function of adding AI-powered experiences to a service or application. The vertical axis shows predicted delivery stability; the horizontal axis ranges from strongly disagree to strongly agree with adding AI-powered experiences. Each line is one of 4,000 simulations trying to estimate the most plausible pattern.

67 Developer experience
v. 2024.3
What can organizations do?

The answer, while simple, might not be so easy. Based on our findings, we recommend organizations focus on stabilizing their priorities. This is one sure way to counteract the negative effects of unstable priorities on employee burnout.

Our findings show the negative effects


of unstable priorities are resistant
to having good leaders, good
documentation, and a user-centered
approach to software development.
This leads us to believe that there's not much organizations can do to avoid burnout aside from finding ways to (1) stabilize priorities and (2) shield employees from having their day-to-day work be impacted by constant shifts in priorities.

1. https://www.nngroup.com/articles/bridging-the-designer-user-gap/
2. https://executiveeducation.wharton.upenn.edu/thought-leadership/wharton-at-work/2024/03/creating-meaning-at-work/
3. https://www.apa.org/pubs/reports/work-in-america/2023-workplace-health-well-being
4. https://bigthink.com/the-present/harvard-business-review-americans-meaningful-work/
5. https://hbr.org/2018/11/9-out-of-10-people-are-willing-to-earn-less-money-to-do-more-meaningful-work
6. (P[N]), for example (P1), indicates pseudonym of interview participants.
7. https://cloud.google.com/blog/products/devops-sre/deep-dive-into-2022-state-of-devops-report-on-documentation and Accelerate State of DevOps Report 2023 - https://dora.dev/research/2023/dora-report
8. https://agilemanifesto.org/
9. Other audiences exist, such as management, regulators, or auditors.
10. Cohen S, Janicki-Deverts D, Miller GE. Psychological Stress and Disease. JAMA. 2007;298(14):1685–1687. doi:10.1001/jama.298.14.1685

68 Developer experience
v. 2024.3
Leading
transformations

A lot needs to be in place for


transformation to work. This year,
we’ve found high-performing teams
are ones that prioritize stability, focus
on their users, have good leaders,
and craft quality documentation. Our
research points to some useful paths
in helping you plot a course towards
successful transformation.

We have found the key to success is


to approach transformation from a
mindset of continuous improvement.
High performers in our study understand
the variables holding them back, and
methodically and continuously improve
using the DORA metrics as a baseline.
While long-term success requires
excellence in all pillars, a decade of DORA
research has pointed us to four specific,
impactful ways to get started on driving
transformation in your own organization.

69 Leading transformations
v. 2024.3
Transformational
leadership
Transformational leadership is a
model in which leaders inspire and
motivate employees to achieve higher
performance by appealing to their
values and sense of purpose, facilitating
wide-scale organizational change.

These leaders encourage their teams


to work towards a common goal through
the following dimensions:1

Vision: They have a clear vision of where their team and the organization are going.

Inspirational communication: They say positive things about the team; make employees proud to be a part of their organization; encourage people to see changing conditions as situations full of opportunities.

Intellectual stimulation: They challenge team members to think about old problems in new ways and to rethink some of their basic assumptions about their work.

Supportive leadership: They consider others' personal feelings before acting; behave in a manner which is thoughtful of others' personal needs.

Personal recognition: They commend team members when they do a better-than-average job; acknowledge improvement in quality of team members' work.

70 Leading transformations
v. 2024.3
This year, we saw that transformational leadership leads to a boost in employee productivity. We see that increasing transformational leadership by 25% leads to a 9% increase in employee productivity.

Transformational leadership can help improve more than just productivity. Having good leaders can also lead to:

• A decrease in employee burnout

• An increase in job satisfaction

• An increase in team performance

• An improved product performance

• An improved organizational performance

Our research found a statistically significant relationship between the above qualities of leadership and IT performance in 2017. High-performing teams had leaders with strong scores across all five characteristics and low-performing teams had the lowest scores. Additionally, we saw that there's a strong correlation between transformative leadership and Employee Net Promoter Score (eNPS), the likelihood to recommend working at a company.

That said, transformative leadership by itself does not lead to high performance, but should be seen as an enabler. Transformative leadership plays a key role in enabling the adoption of technical and product-management capabilities and practices. This is enabled by (1) delegating authority and autonomy to teams; (2) providing them the metrics and business intelligence needed to solve problems; and (3) creating incentive structures around value delivery as opposed to feature delivery.

Transformation takes time and requires


tools. Resources must be allocated by
leadership specifically for the task of
improvement. Good leaders play a key
role in providing teams with the time and
funding necessary to improve. Engineers
should not be expected to learn new
things and automate on their off time,
this should be baked into their schedule.

71 Leading transformations
v. 2024.3
Our research has helped to flip the narrative of IT being a cost-center to IT being an investment that drives business success. In 2020, we wrote the ROI of DevOps whitepaper,2 which contains calculations you can use to help articulate potential value created by investing in IT improvement.

Monetary return is only one of the returns you can expect from this investment. Our research in 2015 showed that, "organizational investment in DevOps is strongly correlated with organizational culture; the ability of development, operations, and infosec teams to achieve win-win outcomes; lower levels of burnout; more effective leadership; and effective implementation of both continuous delivery and lean management practices."3 We recommend dedicating a certain amount of capacity specifically for improvement.
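As a deliberately simplified illustration, in the spirit of the ROI calculations referenced above (but not the whitepaper's exact formulas), the sketch below estimates an annual return from reclaimed engineering time and avoided downtime; every figure is a placeholder you would replace with your own data.

```python
# A deliberately simplified, illustrative calculation in the spirit of the ROI
# discussion above; it is not the whitepaper's exact formula, and every number
# below is a placeholder to be replaced with your organization's own data.
engineers = 120
avg_fully_loaded_cost = 150_000        # annual cost per engineer
time_reclaimed_pct = 0.05              # time freed up by reduced toil and rework
downtime_cost_per_hour = 10_000
downtime_hours_avoided = 40            # fewer or shorter incidents per year
investment = 500_000                   # tooling, platform work, and training

value_of_reclaimed_time = engineers * avg_fully_loaded_cost * time_reclaimed_pct
value_of_avoided_downtime = downtime_cost_per_hour * downtime_hours_avoided
roi = (value_of_reclaimed_time + value_of_avoided_downtime - investment) / investment

print(f"Estimated annual return on the improvement investment: {roi:.0%}")
```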

Figure 18: Impacts of transformational leadership on various outcomes. If transformational leadership increases by 25%, the estimated change in each outcome is: job satisfaction +9.1%, burnout -9.9%, productivity +8.7%, team performance +8.7%, product performance +4.5%, organizational performance +10.3%. Points are estimated values; error bars show 89% uncertainty intervals.

72 Leading transformations
v. 2024.3
Be relentlessly user-centric

This year's research shows that organizations with strong leaders and a focus on building software that addresses user needs develop better products: it's a powerful combination. When the user is at the center of software development, leaders have a clear vision to articulate.

The ultimate goal is for users to love the products we create. As we discuss in the Developer experience chapter, focusing on the user gives product capabilities a reason to exist. Developers can confidently build these features knowing they'll help improve the user experience.

We see that teams that have a deep desire to understand and align to their users' needs, and the mechanisms to collect, track, and respond to user feedback, have the highest levels of organizational performance. In fact, organizations can be successful even without high levels of software velocity and stability, as long as they are user-focused. In 2023 we saw user-centered teams have a 40% higher level of organizational performance compared to those that did not,4 and in 2016 we also saw that user-centered teams had better organizational performance.

This year's research echoes previous findings. Teams that focus on the user make better products. Not only do products improve, but employees are more satisfied with their jobs and less likely to experience burnout.

Fast, stable software delivery gives organizations more frequent opportunities to experiment and learn. Ideally, these experiments and iterations are based on user feedback. Fast and stable software delivery allows you to experiment, better understand user needs, and quickly respond if those needs are not being met. Having speed and stability baked into your delivery also allows you to more easily adjust to market changes or competition.

It is important to remember that your internal developers are also users. Internal Developer Platforms (IDPs) are a way your organization can deliver value to developers that in turn deliver value to external users or other internal users.

Our research shows that successful IDPs are developed as a product and focus on user centricity to deliver an experience that allows developers to work independently. An IDP deployed in this way leads to higher individual productivity, higher team productivity, and higher organizational performance.

73 Leading transformations
v. 2024.3
Become a data-informed organization

The ability to visualize your progress toward success is critical. Over the last 10 years we have made the case for becoming a data-informed organization. DORA's four key metrics5 have become a global standard for measuring software delivery performance, but this is only part of the story. We have identified more than 30 capabilities and processes6 that can be used to drive organizational improvement.

The value in the metrics lies in their ability to tell you if you are improving. The four key metrics should be used at the application and service levels, and not at the organization or line-of-business level. The metrics should be used to visualize your efforts in continuous improvement and not to compare teams — and certainly not to compare individuals.

The metrics should also not be used as a maturity model for your application or service teams. Being a low, medium, high, or elite performer is interesting, but we urge caution as these monikers have little value in the context of your transformation journey.

As our research progresses and evolves, we encourage you to think beyond the four keys. It has become clear that user feedback metrics are as important as the four key metrics. We believe this is because most teams have devised workable solutions for improving speed and stability. As a result, the benefits gained by speed and stability are diminished as higher performance becomes ubiquitous.

Thinking about transformation holistically, we recommend creating dashboards and visualizations that combine both technical metrics (such as our four keys and reliability metrics) and business metrics. This helps bridge the gap between the top-down and bottom-up transformation efforts. This also helps connect your northstar, OKRs, and employee goals with the investments made in IT. They can help quantify the ROI.

We believe metrics are a requirement for excellence. Metrics facilitate decision making. The more metrics you collect, quantitative and qualitative, the better and more informed decisions you can make. People will always have opinions on the value of the data or the meaning of the data, but using data as the basis by which to make a decision is often preferable to relying on opinion or intuition.

74 Leading transformations
v. 2024.3
Be all-in on cloud or stay in the data center

We have been investigating the relationship between the five NIST-defined characteristics of cloud computing7 (on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service, collectively known as flexible infrastructure) and organizational performance since 2018. We see that successful teams are more likely to take advantage of flexible infrastructure than less successful teams.

Last year, our research led us to the most striking bit of information on this topic to date: using the cloud without taking advantage of the five characteristics can be detrimental and predicts decreased organizational performance.

Organizations may be better off staying in the data center if they are not willing to radically transform their application or service. Of course, accomplishing this is not simply a matter of adopting tools or technologies; it often means an entirely new paradigm in designing, building, deploying, and running applications. Making large-scale changes is easier when starting with a small number of services, so we recommend an iterative approach that helps teams and organizations learn and improve as they move forward.

75 Leading transformations
v. 2024.3
Summary

What we've seen consistently over the last 10 years is that transformation is a requirement for success. What many organizations misunderstand is that transformation isn't a destination, but a journey of continuous improvement.8 Our research is clear: Companies that are not continuously improving are actually falling behind. Conversely, companies that adopt a mindset of continuous improvement see the highest levels of success.

On this journey, be aware that you will likely hit a little bit of pain and discomfort along the way. Our research has shown an initial drop in performance followed by big gains (also known as the "j-curve") with DevOps,9 SRE,10 and this year with Platform Engineering. This is normal, and if you are continuously improving, things will get better and you will come out the other end in much better shape than when you started.

The idea of a never-ending journey can seem daunting. It's easy to get stuck in planning or designing the perfect transformation. The key to success is rolling up your sleeves and just getting to work. The goal of the organization and your teams should be to simply be a little better than you were yesterday. The goal of our last 10 years of research and into the future is to help you get better at getting better.

1. Dimensions of transformational leadership: Conceptual and empirical extensions - Rafferty, A. E., & Griffin, M. A.
2. The ROI of DevOps Transformation - https://dora.dev/research/2020/
3. 2015 State of DevOps Report https://dora.dev/research/2015/2015-state-of-devops-report.pdf#page=25
4. 2023 Accelerate State of DevOps Report -
https://dora.dev/research/2023/dora-report/2023-dora-accelerate-state-of-devops-report.pdf#page=17
5. DORA's Four Key Metrics https://dora.dev/guides/dora-metrics-four-keys/
6. DORA's capabilities and processess https://dora.dev/capabilities/
7. NIST defined-5 characteristics of cloud computing https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf
8. Journey of continuous improvement
https://cloud.google.com/transform/moving-shields-into-position-organizing-security-for-digital-transformation
9. 2018 Accelerate State of DevOps Report https://dora.dev/research/2018/dora-report/
10. 2022 State of DevOps Report https://dora.dev/research/2022/dora-report/

76 Leading transformations
v. 2024.3
A decade
with DORA

77 A decade with DORA


v. 2024.3
History

The DevOps movement was born from two topically-related but otherwise uncoordinated events in 2009. John Allspaw and Paul Hammond gave a talk that June at the Velocity conference titled, "10 deploys per day: Dev & ops cooperation at Flickr".1 Patrick Debois followed a few months later when he led a team of volunteer organizers to host the first DevOpsDays event in Ghent, Belgium.2

It didn't take long for the DevOps community to want to learn more about how it was evolving. Alana Brown, who was working at Puppet Labs, ran a survey in 2011 to learn more about DevOps. This survey helped confirm that, "working in a 'DevOps' way is emerging as a new way to do business in IT."

As the movement continued to expand to new industries and organizations, Alana built on this success and partnered with IT Revolution Press to field another survey in 2012, publishing their findings in the 2013 State of DevOps Report.3

Dr. Nicole Forsgren joined the research team the following year, bringing more scientific rigor to the program. The 2014 State of DevOps Report4 made the connection between software delivery performance and organizational performance, finding that, "publicly traded companies that had high-performing IT teams had 50 percent higher market capitalization growth over three years than those with low-performing IT organizations."

The trend of annual reports was well-established by 2016, and Forsgren, Jez Humble, and Gene Kim founded DevOps Research and Assessment (DORA). That year, the State of DevOps Report included calculations to help measure the investments made by teams adopting DevOps practices. This work was extended in the ROI of DevOps Transformation5 whitepaper, published in 2020.

Accelerate: The Science Behind DevOps: Building and Scaling High Performing Technology Organizations,6 written by Forsgren, Humble, and Kim, was published by IT Revolution Press in 2018. This book summarized the early years of the research program and included a focus on the capabilities that drive improvement.

DORA, the company, published an independent report in 2018, the Accelerate State of DevOps: Strategies for a New Economy.7 The team at Puppet continued their own series of reports,8 separate from DORA, beginning that same year.

In late 2018, DORA was acquired by Google Cloud9 where the platform-agnostic, scientific research continues. This year marks the tenth DORA Report.10 We are happy to share our findings with you; thank you for reading!

78 A decade with DORA


v. 2024.3
Key insights from
DORA

Teams do not need to sacrifice speed for stability

Technology-driven teams need ways to measure performance so that they can assess how they're doing today, prioritize improvements, and validate their progress. DORA identified and has validated four software-delivery metrics—the four keys—that provide an effective way of measuring the outcomes of the software delivery process. These measures of software delivery performance have become an industry standard.

The research has demonstrated that the throughput and stability of changes tend to move together; we have seen teams achieving high levels of both in every industry vertical.

There are many ways that teams measure the four keys, including:

• Through conversations and reflection during team meetings

• The DORA Quick Check (https://dora.dev/quickcheck)

• Commercial and source-available11 tools in the Software Engineering Intelligence (SEI) category

• Bespoke integrations built for the specific tools in use by a team

Stability

Throughput

79 A decade with DORA


v. 2024.3
Software delivery and operational performance drive organizational performance

DORA uses the four keys to measure software delivery performance.

Operational performance was first studied by DORA in 2018. It measures the ability to make and keep promises and assertions about the software product or service.

Practitioners working in technology-driven teams recognize the importance of reducing friction in the delivery process while meeting the reliability expectations of an application's users.

The best results are seen when both software delivery and operational performance come together to drive organizational performance and employee well-being.

[Figure: performance, measured as software delivery (four keys metrics) and reliability (service level objectives), predicts the outcomes of organizational performance and well-being.]
Culture is paramount to success

One of the clearest predictors of performance is the culture of the organization. We've continually seen the power of a high-trust culture that encourages a climate for learning and collaboration. For example, culture was shown to be the biggest predictor of an organization's application-development security practices in our 2022 research.12

Culture impacts every aspect of our research, and it's multifaceted and always in flux. We've used many different measures over the years with inspiration from research such as Westrum's Typology of Organizational Culture.13 Our measures of well-being have included burnout, productivity, and job satisfaction.

Get better at getting better

We encourage teams to set a goal to get better at getting better. Driving improvement requires a mindset and a practice of continuous improvement. This requires a way to assess how you're doing today, prioritize improvement work, and feedback mechanisms that help you measure progress.

An experimental approach to improving will involve a mix of victories and failures, but in both scenarios teams can take meaningful actions as a result of lessons learned.
The decade ahead

Collectively, we've learned a lot from each other over the past decade. Thank you for engaging in our annual surveys, participating in the DORA Community of Practice,14 and putting DORA to work in your own context.

As the technology landscape continues to evolve, DORA will continue to research the capabilities and practices that help technology-driven teams and organizations succeed. We will continue to prioritize the human aspects of technology and are committed to publishing platform-agnostic research that you can use to guide your own journey.

Many of our past insights are durable enough to inform your approach to emerging technologies and practices, and we're excited to find new insights along with you!

We are committed to the fundamental principles that have always been a part of the DevOps movement: culture, collaboration, automation, learning, and using technology to achieve business goals. Our community and research benefit from the perspectives of diverse roles, including people who might not associate with the "DevOps" label. You should expect to see the term "DevOps" moving out of the spotlight.

This year's report has a heavy focus on the use and impacts of artificial intelligence (AI). As you've read, adoption is growing and there is a lot of room for experimentation in this space. We will continue to investigate this and other emerging technologies and practices into the future. Use our past research, together with our new findings, to drive adoption and help improve the experience of everyone on your team.

1. Slides - https://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr,
recording - https://www.youtube.com/watch?v=LdOe18KhtT4
2. https://legacy.devopsdays.org/events/2009-ghent/
3. https://www.puppet.com/resources/history-of-devops-reports#2013
4. 2014 State of DevOps Report - https://dora.dev/research/2014/
5. The ROI of DevOps Transformation - https://dora.dev/research/2020/
6. Forsgren, Nicole, Jez Humble, and Gene Kim. 2018. Accelerate: The Science Behind DevOps : Building and Scaling High Performing
Technology Organizations. IT Revolution Press.
7. Accelerate State of DevOps: Strategies for a New Economy - https://dora.dev/research/2018/dora-report/
8. https://www.puppet.com/resources/history-of-devops-reports#2018
9. https://dora.dev/news/dora-joins-google-cloud
10. We consider 2014, the year that Dr. Forsgren joined the program, to be the first DORA report, even though DORA was founded a
few years later. There was no report in 2020, making 2024 the tenth report.
11. https://dora.dev/resources/#source-available-tools
12. 2022 Accelerate State of DevOps Report - https://dora.dev/research/2022/dora-report/
13. Ron Westrum, “A typology of organisation culture”, BMJ Quality & Safety 13, no. 2(2004), doi:10.1136/qshc.2003.009522
14. https://dora.community

Final thoughts
DORA has established itself as a trusted source of research, insights, and information
over the past decade. As the industry continues to adopt new practices and
technologies like platform engineering and artificial intelligence, DORA will be here
with you, investigating the ways of working that help teams improve. Thank you for
having DORA along for the journey.

Replicate our research

The area of research and the findings in this year's report are complex and sometimes unclear or even contradictory. We encourage you to replicate our research. Focusing on a single team or organization opens many opportunities for deeper understanding.

Run experiments within your organization

DORA's findings can serve as hypotheses for your next experiments. Learn more about how your team operates and identify areas for improvement, which may be inspired by findings from the DORA research program.

Run surveys within your organization

Take inspiration from this report and the questions used in this year's survey1 to design your own internal survey. Your survey can incorporate more nuanced questions that are relevant to your audience.2 Read the Methodology chapter for more details on how our research is conducted. Be sure to focus on putting your findings into practice.

Share what you learn

As you learn from your experiments, spread that knowledge throughout your organization. Methods for sharing can range from formal reports for large audiences, through informal communities of practice, to casual chats among peers. Try a variety of approaches and learn what works best in your context and culture. This, too, is an experimental process.

How are you leveraging this research?

Share your experiences, learn from others, and get inspiration from other travelers on the continuous improvement journey by joining the DORA community at https://dora.community.

1. 2024 Survey https://dora.dev/research/2024/questions/


2. Experiences from Doing DORA Surveys Internally in Software Companies -
https://www.infoq.com/news/2024/08/dora-surveys-software-company/

Acknowledgements

This year marks a special milestone: the 10th DORA report. We are thankful
for all of the dedicated work of researchers, experts, practitioners, leaders,
and transformation agents who have joined in shaping this body of work and
evolved alongside us.

We've come a long way since the first State of DevOps Report
published by Puppet Labs and IT Revolution Press. A heartfelt
thank you to our DORA founders for paving the way. It's
remarkable to reflect on how much has changed since
then and how much we've learned throughout the years.

We're deeply grateful to everyone involved in this


year's publication. It's a tremendous responsibility to
guide and influence industry practices, and your contributions
are invaluable.

To everyone who has been part of this journey, from


the early days to this exciting era of AI, thank you. Your support
and insights have been instrumental. Here's to
the next decade of discovery and collaboration!

DORA Report Team

James Brookbank
Kim Castillo
Derek DeBellis
Benjamin Good
Nathen Harvey
Michelle Irvine
Amanda Lewis
Eric Maxwell
Steve McGhee
Allison Park
Dave Stanke
Kevin Storer
Daniella Villalba
Marie-Blanche Panthou
Miguel Reyes
Yoshi Yamaguchi
Jinhong Yu

Editor

Seth Rosenblatt

Localization volunteers

Andrew Anolasco
Mauricio Meléndez

DORA guides

Lisa Crispin
Steve Fenton
Denali Lumma
Betsalel (Saul) Williamson

Advisors/experts in the field

John Allspaw
Birgitta Böckeler
Sander Bogdan
Michele Chubirka
Thomas De Meo
Jessica DeVita
Rob Edwards
Dr. Nicole Forsgren
Gene Kim and IT Revolution
Laura Maguire, PhD
James Pashutinski
Ryan J. Salva
Majed Samad
Harini Sampath
Robin Savinar
Sean Sedlock
Dustin Smith
Finn Toner

Gold sponsors

Silver sponsors

Authors
Derek DeBellis

Derek is a quantitative user experience researcher at Google and the lead investigator for DORA. Derek focuses on survey research, logs analysis, and figuring out ways to measure concepts that demonstrate a product or feature is delivering capital-V value to people. Derek has published on human-AI interaction, the impact of COVID-19's onset on smoking cessation, designing for NLP errors, the role of UX in privacy discussions, team culture, and AI's relationship to employee well-being and productivity. His current extracurricular research is exploring ways to simulate the propagation of beliefs and power.

Kevin M. Storer

Dr. Kevin M. Storer is a developer experience researcher at Google, where he serves as qualitative research lead for the DORA team. Leveraging professional experience in software engineering and postgraduate transdisciplinary training in the social sciences and humanities, Kevin has been leading human-centered studies of software developers since 2015, spanning a diverse set of problem contexts, participant profiles, and research methods. Kevin's research has been published in top scientific venues on the topics of artificial intelligence, information retrieval, embedded systems, programming languages, ubiquitous computing, and interaction design.
Amanda Lewis

Amanda Lewis is the DORA.community development lead and a developer relations engineer at Google Cloud. She has spent her career building connections across developers, operators, product managers, project managers, and leadership. She has worked on teams that developed e-commerce platforms, content management systems, and observability tools, and supported developers. These connections and conversations lead to happy customers and better outcomes for the business. She brings her experience and empathy to the work that she does helping teams understand and implement software delivery and artificial intelligence practices.

Benjamin Good

Ben Good is a cloud solutions architect at Google. He is passionate about improving software delivery practices through cloud technologies and automation. As a solutions architect, he gets to help Google Cloud customers solve their problems by providing architectural guidance, publishing technical guides, and contributing to open source. Prior to joining Google, Ben ran cloud operations for a few different companies in the Denver/Boulder area, implementing DevOps practices along the way.
Daniella Villalba

Daniella Villalba is a user experience researcher at Google. She uses survey research to understand the factors that make developers happy and productive. Before Google, Daniella studied the benefits of meditation training and the psycho-social factors that affect the experiences of college students. She received her PhD in Experimental Psychology from Florida International University.

Eric Maxwell

Eric Maxwell leads Google's DevOps Transformation practice, where he advises the world's best companies on how to improve by delivering value faster. Eric spent the first half of his career as an engineer in the trenches, automating all the things and building empathy for other practitioners. Eric co-created Google's Cloud Application Modernization Program (CAMP), and is a member of the DORA team. Before Google, Eric spent time whipping up awesome with other punny folks at Chef Software.

Kim Castillo

Kim Castillo is a user experience program manager at Google. Kim leads the cross-functional effort behind DORA, overseeing its research operations and the publication of this report since 2022. Kim also works on UX research for Gemini in Google Cloud. Prior to Google, Kim enjoyed a career in tech working in technical program management and agile coaching. Kim's roots are in psycho-social research focused on topics of extrajudicial killings, urban poor development, and community resilience in her country of origin, the Philippines.
Michelle Irvine

Michelle Irvine is a technical writer at Google, and her research focuses on documentation and other technical communication. Before Google, she worked in educational publishing and as a technical writer for physics simulation software. Michelle has a BSc in Physics, as well as an MA in Rhetoric and Communication Design from the University of Waterloo.

Nathen Harvey

Nathen Harvey leads the DORA team at Google Cloud. Nathen has learned and shared lessons from some incredible organizations, teams, and open source communities. He is a co-author of multiple DORA reports and was a contributor and editor for 97 Things Every Cloud Engineer Should Know, published by O'Reilly in 2020.
Demographics and firmographics

Who took the survey

The DORA research program has been researching the capabilities, practices, and measures of high-performing, technology-driven organizations for over a decade. We've heard from roughly 39,000 professionals working in organizations of every size and across many different industries. Thank you for sharing your insights! This year, nearly 3,000 working professionals from a variety of industries around the world shared their experiences to help grow our understanding of the factors that drive high-performing, technology-driven organizations.

This year's demographic and firmographic questions leveraged research done by Stack Overflow. Over 90,000 respondents participated in the 2023 Stack Overflow Developer Survey.1 That survey didn't reach every technical practitioner, but is about as close as you can get to a census of the developer world.

With a sense of the population provided from that survey, we can locate response bias in our data and understand how far we might want to generalize our findings. Further, the demographic and firmographic questions asked in the Stack Overflow Developer Survey are well-crafted and worth borrowing.

In short, there are no major discrepancies between our sample and Stack Overflow's. This means we have every reason to believe that our sample is reflective of the population.
Industry

We asked survey respondents to identify the industry sector in which their organization primarily operates, across 12 categories. The most common sectors in which respondents worked were Technology (35.69%), Financial Services (15.66%), and Retail/Consumer/E-commerce (9.49%).

Industry: Percentage of respondents
Technology: 35.69%
Financial Services: 15.66%
Retail/Consumer/E-commerce: 9.49%
Other: 5.94%
Industrials & Manufacturing: 5.49%
Healthcare & Pharmaceuticals: 4.60%
Media/Entertainment: 4.26%
Government: 3.89%
Education: 3.66%
Energy: 3.03%
Insurance: 2.39%
Non-Profit: 1%

Number of employees

We asked survey respondents to identify the number of employees at their organization, using nine buckets. The organizations in which respondents worked most commonly had 10,000 or more employees (24.10%), 100 to 499 employees (18.50%), and 1,000 to 4,999 employees (15.60%).

Organization size: Percentage of respondents
Solo: 2.0%
2 to 9: 3.2%
10 to 19: 4.3%
20 to 99: 14.5%
100 to 499: 18.5%
500 to 999: 11.2%
1,000 to 4,999: 15.6%
5,000 to 9,999: 6.7%
10,000 or more: 24.1%
Disability

We identified disability along six dimensions that follow guidance from the Washington Group Short Set.2

This is the fifth year we have asked about disability. The percentage of respondents reporting disabilities has decreased from 11% in 2022 to 6% in 2023, and 4% in 2024.

Disability: Percentage of respondents
None of the disabilities applied: 92%
At least one of the disabilities applied: 4%
Preferred not to say: 4%

Gender

We asked survey respondents to report their gender. 83% of respondents identified as men, 12% as women, 1% chose to self-describe, and 4% declined to answer.

Gender: Percentage of respondents
Man: 83%
Woman: 12%
Used their own words: 1%
Preferred not to answer: 4%
Experience

We asked survey respondents to report their years of experience in their role and team. Respondents had a median of 16 years of working experience, five years of experience in their current role, and three years of experience on their current team.

[Box plot: years of experience for three questions: "How many years have you worked on the team in a role similar to your current role?" (median 5), "How many years have you worked on the team you're currently on?" (median 3), and "How many years of working experience do you have?" (median 16). Box width represents 25th and 75th percentiles. The line dissecting the box represents the median.]
Role

[Bar chart: percentage of respondents by job title. Job titles included Engineering manager; Developer, full-stack; DevOps specialist; Senior Executive (C-Suite, VP,…); Developer, back-end; Other (please specify); Project manager; Cloud infrastructure engineer; Developer, desktop or ente…; Developer, front-end; Developer, QA or test; Product manager; Data Engineer; Site Reliability Engineer; Business analyst; Data analyst; Database administrator; Data scientist or machine…; Developer Advocate; Developer, embedded app…; Developer Experience; Developer, mobile; Prefer not to answer; Research & Development role; Security professional; System administrator; Academic researcher; Blockchain Engineer; Designer; Developer, game or graphics; Educator; Hardware Engineer; Marketing professional; Sales professional; Scientist; Student.]

In analyses, some individual roles were grouped together, to help us meaningfully include roles which represented a small proportion of respondents in our analyses. Other categories were highly represented in our data, including:

• Developers, representing 29% of the respondents.

• Managers, representing 23% of the respondents.

• Senior executives, representing 9% of the respondents (+33% from 2023).

• Analytic roles, representing about 5% of the respondents.
Employment status

We asked survey respondents to report their current employment status. The vast majority (90%) of respondents were full-time employees of an organization.

Employment type: Percentage of respondents
Full-time contractor: 6%
Full-time employee: 90%
Part-time contractor: 1%
Part-time employee: 2%

Work location

Despite another year of return-to-office (RTO) pushes, the pattern from last year has largely been retained, especially toward the tails of the distribution. The 37.5% increase in the median does suggest that hybrid work, or at least some regular office visits, is becoming more common.

[Box plot: percentage of time in office, 2023 (median 24%) versus 2024 (median 33%). Box width represents 25th and 75th percentiles. The line dissecting the box represents the median.]
Country

We had respondents from 104 different countries. We are always thrilled to see people from all over the world participate in the survey. Thank you all!

USA, UK, Canada, Germany, Japan, India, France, Brazil, Spain, Australia, Netherlands, China, Sweden, Norway, New Zealand, Poland, South Africa, Denmark, Italy, Switzerland, Argentina, Mexico, Portugal, Austria, Romania, Finland, Turkey, Bulgaria, Ireland, Israel, Belgium, Chile, Colombia, Czech Republic, Malaysia, Nigeria, Singapore, Albania, Georgia, Greece, Philippines, Hungary, Serbia, Afghanistan, Algeria, Egypt, Indonesia, Russian Federation, Ukraine, Viet Nam, Bangladesh, Belarus, Costa Rica, Croatia, Iceland, Iran, Jordan, Kenya, Saudi Arabia, Slovakia, Slovenia, Thailand, Uzbekistan, Angola, Armenia, Bosnia and Herzegovina, Dominican Republic, Ecuador, Estonia, Kazakhstan, Latvia, Lithuania, Luxembourg, Nicaragua, Pakistan, Peru, South Korea, Sri Lanka, Tunisia, Andorra, Barbados, Belize, Benin, Bolivia, Burkina Faso, Comoros, Côte d'Ivoire, El Salvador, Ethiopia, Gambia, Guatemala, Hong Kong (S.A.R.), Malta, Mauritius, Morocco, Nepal, Paraguay, Swaziland, Syrian Arab Republic, Taiwan, The former Yugoslav Republic of Macedonia, Trinidad and Tobago, Uruguay, Venezuela (Bolivarian Republic of)
Race and ethnicity

We asked survey respondents to report their race and ethnicity. Our largest groups of respondents were White (32.4%) and/or European (22.7%).

Race or ethnicity: Percentage of respondents
White: 32.4%
European: 22.7%
Asian: 9.9%
North American: 4.6%
Indian: 4.1%
Prefer not to say: 4.1%
Hispanic or Latino/a: 3.5%
South American: 3.2%
East Asian: 2.5%
African: 1.8%
South Asian: 1.7%
Multiracial: 1.5%
Or, in your own words: 1.5%
Southeast Asian: 1.4%
Black: 1.3%
Middle Eastern: 1.3%
Biracial: 0.4%
Central American: 0.4%
I don't know: 0.4%
North African: 0.4%
Caribbean: 0.2%
Central Asian: 0.2%
Ethnoreligious group: 0.2%
Pacific Islander: 0.2%
Indigenous (such as Native American or Indigenous Australian): 0.1%

1. https://survey.stackoverflow.co/2023/
2. https://www.washingtongroup-disability.com/question-sets/wg-short-set-on-functioning-wg-ss/

Methodology

A methodology is supposed to be like a recipe that will help you replicate our work and determine if the way our data was generated and analyzed is likely to return valuable information. Although we don't have the space to go into every detail, hopefully this is a great starting point for those considerations.
Survey development

Question selection

We think about the following aspects when considering whether to include a question in a survey.

Is this question…

• Established, so we can connect our work to previous efforts?

• Capturing an outcome the industry wants to accomplish (for example, high team performance)?

• Capturing a capability the industry is considering investing resources into (for example, AI)?

• Capturing a capability we believe will help people accomplish their goals (for example, quality documentation)?

• Something that helps us evaluate the representativeness of our sample (for example, role or gender)?

• Something that helps us block biasing pathways (for example, coding language or role)?

• Something that is possible to answer with at least a decent degree of accuracy for the vast majority of respondents?

We address the literature, engage with the DORA community, conduct cognitive interviews, run parallel qualitative research, work with subject matter experts, and hold team workshops to inform our decision as to whether to include a question in our survey.

Survey experience

We take great care to improve the usability of the survey. We conduct cognitive interviews and usability tests to make sure that the survey hits certain specification points:

• Time needed to complete the survey should, on average, be low

• Comprehension of the questionnaire should be high

• Effortfulness should be reasonably low, which is a huge challenge given the technical nature of the concepts
Data collection

Localizations

People around the world have responded


to our survey every year. This year
we worked to make the survey more
accessible to a larger audience by
localizing the survey into English,
Español, Français, Português,日本語,
and 简体中文.

Collect survey responses

We use multiple channels to recruit. These channels fall into two categories: organic and panel.

The organic approach is to use all the social means at our disposal to let people know that there is a survey that we want them to take. We create blog posts. We use email campaigns. We post on social media, and we ask people in the community to do the same (that is, snowball sampling).

We use the panel approach to supplement the organic channel. Here we try to recruit people who are traditionally underrepresented in the broader technical community and try to get adequate responses from certain industries and organization types.

In short, this is where we get some control over our recruitment—control we don't have with the organic approach. The panel approach also allows us to simply make sure that we get enough respondents, because we never know if the organic approach is going to yield the responses necessary to do the types of analyses we do. This year we had sufficient organic responses to run our analysis, and the panel helped round out our group of participants.

Survey flow

This year we had a lot of questions we wanted to ask, but not enough time to ask them. Our options were…

• Make an extremely long survey

• Choose a subset of areas to focus on

• Randomly assign people to different topics

We didn't want to give up on any of our interests, so we chose to randomly assign participants to one of three separate flows. There was a lot of overlap among the three different flows, but each flow dove deeply into a different space.

Here are the three different pathways:

• AI

• Workplace

• Platform Engineering
Survey analysis

Measurement validation

There is a wide variety of concepts that we try to capture in the survey. There are a lot of different language games we could partake in, but one view is that this measure of a concept is called a variable. These variables are the ingredients of the models, which are the elements included in our research. There are two broad ways to analyze the validity of these measures: internally and externally.

To understand the internal validity of the measure, we look at what we think indicates the presence of a concept. For example, quality documentation might be indicated by people using their documentation to solve problems.

A majority of our variables consist of multiple indicators because the constructs we're interested in appear to be multifaceted.

To understand the multifaceted nature of a variable, we test how well the items we use to represent that concept gel. If they gel well (that is, they share a high level of communal variance), we assume that something underlies them—such as the concept of interest.

Think of happiness, for example; happiness is multifaceted. We expect someone to feel a certain way, act a certain way, and think a certain way when they're happy. We assume that happiness is underlying a certain pattern of feelings, thoughts, and action.

Therefore, we expect certain types of feelings, thoughts, and actions to emerge together when happiness is present. We would then ask questions about these feelings, thoughts, and actions. We would use confirmatory factor analysis to test whether they actually do show up together.

This year we used the lavaan1 R package to do this analysis. Lavaan returns a variety of fit statistics that help us understand whether constructs actually represent the way people answer the questions.

If the indicators of a concept don't gel, the concepts might need to be revised or dropped because it's clear that we haven't found a reliable way to measure the concept.

The external validity of a construct is all about looking at how the construct fits into the world. We might expect a construct to have certain relationships to other constructs. Sometimes we might expect two constructs to have a negative relationship, like happiness and sadness. If our happiness measure comes back positively correlated with sadness, we might question our measure or our theory.

Similarly, we might expect two constructs to have positive relationships, but not strong ones. Productivity and job satisfaction are likely to be positively correlated, but we don't think they're identical. If the correlation gets too high, we might say it looks like we're measuring the same thing. This then means that our measures are not calibrated enough to pick up on the differences between the two concepts, or the difference we hypothesized isn't actually there.

Model evaluation

Using a set of hypotheses as our guiding principle, we build hypothetical models, little toys that try to capture some aspect about how the world works. We examine how well those models fit the data we collected. For evaluating a model, we go for parsimony. This amounts to starting with a very simplistic model2 and adding complexity until the complexity is no longer justified.

For example, we predict that organizational performance is the product of the interaction between software delivery performance and operational performance. Our simplistic model doesn't include the interaction:

Organizational performance ~ Software delivery performance + Operational performance

Our second model adds the interaction:

Organizational performance ~ Software delivery performance + Operational performance + Software delivery performance ✕ Operational performance

Based on the recommendations in "Regression and other stories"3 and "Statistical Rethinking,"4 we use leave-one-out cross-validation (LOOCV)5 and the Watanabe–Akaike information criterion6 to determine whether the additional complexity is necessary.
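As an illustration of that comparison, here is a minimal sketch using the brms and loo R packages. The packages, the data frame, and the column names are assumptions for illustration, not the exact code used in our analysis.

# Sketch: compare a simple model with one that adds an interaction,
# using LOO cross-validation and WAIC. `df` and the column names
# (org_perf, sdp, op) are hypothetical stand-ins for the survey data.
library(brms)

m_simple      <- brm(org_perf ~ sdp + op, data = df)
m_interaction <- brm(org_perf ~ sdp * op, data = df)  # adds the sdp:op term

m_simple      <- add_criterion(m_simple, c("loo", "waic"))
m_interaction <- add_criterion(m_interaction, c("loo", "waic"))

# Is the extra complexity justified?
loo_compare(m_simple, m_interaction, criterion = "loo")
loo_compare(m_simple, m_interaction, criterion = "waic")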

Directed Acyclic Graphs for Causal Inference

A validated model tells us what we need to know to start thinking causally. We talk about the challenges of thinking causally below.

Here are some reasons why we're trying to talk causally:

We think your question is fundamentally a causal one. You want to know if doing something is going to create something. You are not going to invest in doing something if you just think there is a non-causal correlation.

The results of our analyses depend on our causal understanding of the world. The actual numbers we get from the regression change based on what we include in the regression. What we include in the regression should depend on how we think the data is generated, which is a causal claim. Hence, we should be clear.

Causal thinking is where our curiosity will take us and where we all spend a lot of time. We are often wondering about how the various aspects of the world are connected and why. We don't need to run experiments on every facet of our lives to think causally about them.

Causal thinking is central to action, which is what we're hoping this report helps you with: making decisions to act.

We are able to use the validated model to tell us what we need to account for to understand an effect. In short, it lets us try to get our data in the form of an A/B experiment, where one tries to create two identical worlds with only one difference between them. The logic suggests that in doing so any differences that emerge between those two worlds are attributable to that initial difference.

In observational data and survey data, things are not as clearly divided — many things are different between participants, which introduces confounds. Our method of causal inference tries to account for these differences in an attempt to mimic an experiment — that is, holding everything constant except for one thing (for example, AI adoption).

Let's take the classic example of ice cream "causing" shark attacks. There is a problem in that observation, namely that people tend to eat ice cream on hot days and also go to the beach on hot days. The situation where people tend to eat ice cream and go to the beach is not the same as the situation where people tend not to eat ice cream and not go to the beach. The data isn't following the logic of an experiment. We've got a confounding variable, temperature.
Directed Acyclic Graphs (DAGs) help you identify the ways in which the world is different and offer approaches to remedy the situation, to try to mimic an experiment by making everything in the world except one thing constant. Let's see how the DAG directs us in the ice cream and shark attack example, where we want to quantify the impact of ice cream consumption on shark attacks.

I draw my model, tell the tool what effect I want to understand, and the tool tells me what is going to bias my estimate of the effect. In this case, the tool says that I cannot estimate the effect of ice cream consumption on shark attacks without adjusting for temperature. Adjusting is a statistical approach of trying to make everything equal besides ice cream consumption and then seeing if shark attacks continue to fluctuate as a function of ice cream consumption.

We outline our models in the, you guessed it, Models chapter.

[Figure: a DAG relating ice cream consumption, temperature, and shark attacks. The image is from https://www.dagitty.net/dags.html.]
The directed acyclic graph tells us what to account for in our analyses of particular effects. For example, what do we need to account for in our analysis of AI adoption's impact on productivity?
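As a sketch of what that looks like in practice, the ice cream example can be written in DAG syntax with the dagitty R package (the same tool behind dagitty.net); the node names here are ours.

# Sketch: the ice cream / shark attack example as a DAG in R.
library(dagitty)

dag <- dagitty("dag {
  temperature -> ice_cream
  temperature -> shark_attacks
  ice_cream   -> shark_attacks
}")

# What must we adjust for to estimate the effect of ice cream
# consumption on shark attacks?
adjustmentSets(dag, exposure = "ice_cream", outcome = "shark_attacks")
# { temperature }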

Bayesian statistics

This analysis is done using Bayesian statistics. Bayesian statistics offer a lot of benefits:

• We move away from thinking in terms of significant or insignificant (ask 10 people to explain frequentist p-values and you'll get 10 different answers)

• We want to know the probability of a hypothesis given the data, not the probability of the data given our hypothesis

• We like to incorporate our prior knowledge into our models, or at least be explicit about how much we don't know7 (see the sketch after this list)

• We are forced to confront the underlying assumptions of the modeling process

• We can explore the posterior distributions to get a sense of the magnitude, uncertainty, and, overall, how and how well the model made sense of the data. Ultimately, it gives a great sense of what we do and do not know given our data

• A flexible framework that addresses many statistical problems in a very unified manner

What do you mean by "simulation"?

It isn't that we made up the data. We use Bayesian statistics to calculate a posterior, which tries to capture "the expected frequency that different parameter values will appear."8 The "simulation" part is drawing from this posterior more than 1,000 times to explore the values that are most credible for a parameter (mean, beta weight, sigma, intercept, etc.) given our data.

"Imagine the posterior is a bucket full of parameter values, numbers such as 0.1, 0.7, 0.5, 1, etc. Within the bucket, each value exists in proportion to its posterior probability, such that values near the peak are much more common than those in the tails."9

This all amounts to our using simulations to explore possible interpretations of the data and get a sense of how much uncertainty there is. You can think of each simulation as a little AI that knows nothing besides our data and a few rules, trying to fill in a blank (parameter) with an informed guess. You do this 4,000 times and you get the guesses of 4,000 little AIs for a given parameter.

You can learn a lot from these guesses. You can learn what the average guess is, between which values 89%10 of these guesses fall, how many guesses are above a certain level, how much variation there is in these guesses, etc. You can even do fun things like combine guesses (simulations) across many models.

When we show a graph with a bunch of lines or a distribution of potential values, we are trying to show you what is most plausible given our data and how much uncertainty there is.
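Exploring those draws might look like the following sketch, continuing the assumed brms example above; the fit object and the coefficient name are hypothetical placeholders.

# Sketch: summarize posterior draws for one coefficient.
library(posterior)

draws <- as_draws_df(fit)             # one row per posterior draw
b <- draws$b_ai_adoption              # draws for a single, hypothetical parameter

mean(b)                               # the average "guess"
quantile(b, probs = c(0.055, 0.945))  # an 89% interval
mean(b > 0)                           # share of guesses above zero
sd(b)                                 # variation in the guesses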

Synthesize findings with the community

Our findings offer valuable perspectives for technology-driven teams and organizations, but they are best understood through dialogue and shared learning. Engaging with the DORA community gives us diverse insights, challenges our assumptions, and helps us discover new ways to interpret and apply these findings.

We encourage you to join the DORA community (https://dora.community) to share your experiences, learn from others, and discover diverse approaches to implementing these recommendations. Together, we can explore the best ways to leverage these insights and drive meaningful change within your organization.
Interviews

This year, we supplemented our annual survey with in-depth, semi-structured interviews to triangulate, contextualize, and clarify our quantitative findings. The interview guide paralleled the topics included in our survey and was designed for sessions to last approximately 75 minutes each, conducted remotely via Google Meet.

In total, we interviewed 11 participants whose profiles matched the inclusion criteria of our survey. All interviews were video- and audio-recorded. Sessions lasted between 57 minutes and 85 minutes, totaling 14 hours and 15 minutes of data collected across all participants. Participants' data were pseudonymized using identifiers in the form of P(N), where N corresponds to the order in which they were interviewed.

All interviews were transcribed using automated software. Transcriptions were manually coded using our survey topics as a priori codes. Quotations appearing in the final publication of this report were revisited and transcribed manually prior to inclusion. Words added to participant quotations by the authors of this report are indicated by brackets ([]), words removed are indicated by ellipses (..), and edits were made only in cases where required for clarity.

Inferential leaps in results

Our goal is to create a pragmatic representation of the world, something that we can all leverage to help improve the way we work. We know there is complexity we're simplifying. That is kind of the point of the model. Jorge Luis Borges has a very short story, called "On Exactitude in Science", where he talks of an empire that makes maps of the empire on a 1:1 scale.11 The absurdity is that this renders the map absolutely useless (at least that's my interpretation). The simplifications we make are supposed to be helpful.

That said, there are some inferential leaps that we want to be clear about.
Causality

According to John Stuart Mill, you needed to check three boxes to say X causes Y:12

• Correlation: X needs to covary with Y

• Temporal precedence: X needs to happen before Y

• Biasing pathways are accounted for (as described in the DAG section above)

We feel confident that we can understand correlation — that's often a standard statistical procedure. Our survey is capturing a moment in time, so temporal precedence is theoretical, not part of our data.

As for biasing pathways, as we mention above when talking about structural equation models and directed acyclic graphs, we do the work to account for biasing pathways, but that is a highly theoretical exercise, one that, unlike temporal precedence, has implications that can be explored in the data.

This is all to say that we didn't do longitudinal studies or a proper experiment. Despite this, we think causal thinking is how we understand the world and we try our best to use emerging techniques in causal inference to provide you with good estimates. Correlation does not imply causation, but it does imply how you think about causation.
Micro-level phenomena -> Macro-level phenomena

Often we take capabilities at an individual level and see how those connect to higher levels. For example, we tied the individual adoption of AI to an application or service and to team performance. This isn't terribly intuitive at first glance. The story of a macro-level phenomenon causing an individual-level phenomenon is usually easier to tell. Inflation (macro) impacting whether I buy eggs (micro) seems like a more palatable story than me not buying eggs causing inflation.

The same is true for an organization's performance (macro) impacting an individual's well-being (micro). As a heuristic, it is likely the organization exerts more of an influence on the individual than the individual on the organization.

So, why do we even bother saying an individual action impacts something like team or organizational performance? We make an inferential leap that we think isn't completely illogical. Namely, we assume that at scale, the following statement tends to be true:

p(individual does X | organization does X) > p(individual does X | organization doesn't do X)

That is, we believe that the probability of an individual doing something (X) is higher when they are in an organization or a team that also does X. Hence, individuals who do something represent teams and organizations that also tend to do X. Of course the noise here is pretty loud, but the pattern should emerge and allow this assumption to give us some important abilities.

Let's back up for an example outside of DORA: imagine two different countries where the average height differs. In one country, people have an average height of 5'6". The other's average height is 6'2". The standard deviation is identical. If you picked a person at random from each country, which country do you think the taller person would be more likely to be drawn from? If you do this thousands of times, taller countries would be represented by taller people. The height of the individuals would loosely approximate the heights of the countries.
Not that it is necessary, but we ran a quick simulation to validate that this is true:

# R code
# set seed for reproducibility
set.seed(10)

# mean heights: 6'2" and 5'6", expressed in feet
height_means = c(6 + 1/6, 5.5)

# constant standard deviation at 1/4 of a foot
std_dev = 0.25

# number of random draws
draws = 1000

# random draws from country A
country_a <- rnorm(draws, mean = height_means[1], sd = std_dev)

# random draws from country B
country_b <- rnorm(draws, mean = height_means[2], sd = std_dev)

# how many of the draws represent the correct difference
represented_difference = sum(country_a > country_b) / 1000

# show results as a percentage
represented_difference * 100

The results are unsurprising. 97.2% of the 1,000 random draws are in the correct direction. Of course, it would be easy to get fooled with non-random draws, smaller differences between the countries, and small samples. Still, the point stands: differences at the macro-level tend to be represented in the micro-level.

1. Rosseel Y (2012). “lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software, 48(2), 1–36.
https://doi.org/10.18637/jss.v048.i02
2. This would also involve the examination of potential confounds.
3. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. N.p.: Cambridge University Press.
4. McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. N.p.: CRC Press/Taylor & Francis
Group.
5. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. N.p.: Cambridge University Press
6. McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. N.p.: CRC Press/Taylor & Francis
Group.
7. Our priors tend to be weak (skeptical, neutral, and low information) and we check that the results are not conditioned by our
priors.
8. McElreath, Richard. Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC, 2018, pg. 50
9. McElreath, Richard. Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC, 2018, pg. 52
10. Followed McElreath’s reasoning in Statistical Rethinking, pg. 56 for choosing 89%. “Why these values? No reason… And these
values avoid conventional 95%, since conventional 95% intervals encourage many readers to conduct unconscious hypothesis
tests.” The interval we’re providing is simply trying to show a plausible “range of parameter values compatible with the model and
data”.
11. Borges, J. L. (1999). Collected fictions. Penguin.
12. Duckworth, Angela Lee, Eli Tsukayama, and Henry May. “Establishing causality using longitudinal hierarchical linear modeling: An
illustration predicting achievement from self-control.” Social psychological and personality science 1, no. 4 (2010): 311-317.

Models

Traditionally, we have built one giant


model that we validated using various
structural equation modeling techniques
(partial least squares, covariance-based,
Bayesian). For the 2023 report, we
switched to focusing on many smaller
models aimed at helping us understand
specific processes.

For example, we made a nuanced model


to understand the physics of quality
documentation. There are important
benefits that come with creating smaller
models1 tailored to understanding
specific effects:
• Ease of identifying areas of poor
model fit

• Everything you add to a model


exerts a force, has a gravity. As your
model gets large, it is really difficult
to understand all the different ways
the variables are exerting force on
each other

• Prevents you from conditioning on


something that creates spurious
relationships2

How do we use the models?

We all have a lot of questions, but many vital questions have the following form:

if we do X, what happens to Y?

X is usually a practice, such as creating quality documentation, adopting AI, or investing in culture.

Y is usually something that we care about achieving or avoiding, which could happen at the individual level (for example, productivity) up to the organizational level (for example, market share).

We construct, evaluate, and use the models3 with the goal of addressing questions of this form. We work to provide an accurate estimate of what happens to important outcomes as a result of doing X.4 When we report effects, we convey two vital features:

1. How much certainty we have in the direction of the effect, that is, how clear is it that this practice will be beneficial or detrimental?

2. How much certainty we have in the magnitude of the effect. We will provide an estimate and a relative sense of how impactful certain practices are and the degree of uncertainty surrounding these estimates.

Here are some of this year's capabilities of interest:

• AI adoption

• platform use

• platform age

• transformational leadership

• priority stability

• user centricity

Here are some of this year's outcomes and outcome groups:

• individual performance and well-being (for example, burnout)

• team performance

• product performance

• development workflow (for example, codebase complexity and document quality)

• software delivery performance

• organizational performance
We focus on these outcomes because we believe that they are ends in themselves. Of course, that is more true for some of these outcomes than others. If you found out that organizational performance and team performance had nothing to do with software delivery performance, you would probably be okay having low software delivery performance.

We hope, however, that even if organizational performance did not depend on individual well-being, you would still want to prioritize the well-being of employees.

A repeated model

We developed and explored many nuanced hypotheses over the past three years, especially about moderation and mediation.

This year, we spent less time focusing on those types of hypotheses and more time trying to estimate a capability's effect on an outcome. This means that the model for each capability is largely the same.

Hence, the model for AI adoption's effects is very similar in design to the model for user-centricity's effects. We could copy the model and change the name of the capability, but that might not be terribly useful for you.

Instead, we are just going to show the AI model, but know it is the schematic or form behind each of our models. Should you be interested in running your own analysis, constructing this model in a tool like DAGitty should allow you to get close to replicating the regressions we used in our analysis (see the sketch below). That said, what is presented is slightly simplified for readability. Additionally, while the models are very similar across each capability, the effects are different. For example, you'll see below that AI adoption generally harms software delivery performance, but the opposite is true for things like internal documentation and user-centricity; see each chapter for additional details.
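As referenced above, here is a minimal, deliberately simplified sketch of how one slice of the AI adoption model could be written in dagitty syntax. The node names follow the figure below, but the edges shown are illustrative assumptions, not the exact model used in our analysis.

# Sketch: a simplified slice of the AI adoption model in dagitty syntax.
library(dagitty)

ai_model <- dagitty("dag {
  firmographic_traits -> ai_adoption
  firmographic_traits -> software_delivery_performance
  individual_traits   -> ai_adoption
  individual_traits   -> software_delivery_performance
  service_features    -> software_delivery_performance
  ai_adoption         -> software_delivery_performance
}")

adjustmentSets(ai_model,
               exposure = "ai_adoption",
               outcome  = "software_delivery_performance")
# For this simplified graph: { firmographic_traits, individual_traits }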

The key (legend for the model diagram):

Capability: The practice, state, or trait of interest as a potential cause.

Outcome: Either a single indicator or a latent factor.

Outcome group: A composite of outcomes that we combined for the sake of clarity and ease of visualization. They were understood separately in the analysis.

Covariate: Something that we use to account for alternative hypotheses and block biasing pathways.

Effect of interest: The effects that we wanted to quantify and report to you.

Auxiliary effect: We did not focus on quantifying this effect this year. It is part of our model, but not used.

[Figure: the AI adoption model. Covariates (firmographic traits, individual traits, service features) feed into the model, and AI adoption's effects of interest are labeled as follows: helps or mostly helps individual performance and well-being, team performance, and organizational performance; harms software delivery performance; helps development workflow; no effect on product performance.]

1. Gelman et al.'s "Regression and Other Stories" offers some important tips on pages 495 through 496 that seem illuminating: B.6 Fit many models and B.9 Do causal inference in a targeted way, not as a byproduct of a large regression
2. A great discussion about this can be found in chapter 6 of Statistical Rethinking. I am talking
particularly about collider bias.
3. See the conversation about how these models are tied with directed acyclic graphs in the methodology chapter
4. We talk about causality briefly in the methods chapter.

Recommended reading

Join the DORA Community to discuss, learn, and collaborate on improving software delivery and operations performance. https://dora.community

Take the DORA Quick Check. https://dora.dev/quickcheck

Explore the capabilities that enable a climate for learning, fast flow, and fast feedback. https://dora.dev/capabilities

Fostering developers' trust in generative artificial intelligence. https://dora.dev/research/2024/trust-in-ai/

Read the book: Accelerate: The Science Behind DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution. https://itrevolution.com/product/accelerate

Read the book: Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution Press. https://teamtopologies.com/

Publications from DORA's research program, including prior DORA Reports. https://dora.dev/publications

Frequently asked questions about the research and the reports. http://dora.dev/faq

Errata - Read and submit changes, corrections, and clarifications to this report. https://dora.dev/publications/errata

Check if this is the latest version of the 2024 DORA Report: https://dora.dev/vc/?v=2024.3

“Accelerate State of DevOps 2024”
by Google LLC is licensed under CC BY-NC-SA 4.0
