2024 Accelerate State of DevOps Report
A decade with DORA
v. 2024.3
Executive summary

DORA has been investigating the capabilities, practices, and measures of high-performing technology-driven teams and organizations for over a decade. This is our tenth DORA report. We have heard from more than 39,000 professionals working at organizations of every size and across many different industries globally. Thank you for joining us along this journey and being an important part of the research!

DORA collects data through an annual, worldwide survey of professionals working in technical and adjacent roles. The survey includes questions related to ways of working and accomplishments that are relevant across an organization and to the people working in that organization.

DORA uses rigorous statistical evaluation methodology to understand the relationships between these factors and how they each contribute to the success of teams and organizations.
The key accomplishments and outcomes we investigated this year are:

Flow: Flow measures how much focus a person tends to achieve during development tasks.
Key findings
• Internal documentation
• Review processes
• Team performance
• Organizational performance
AI adoption increases as trust in AI increases

Platform engineering can boost productivity
Applying insights from DORA
You cannot improve alone!
Software delivery performance

We performed the cluster analysis on the original four software delivery metrics to remain consistent with previous years' cluster analyses. Four distinct clusters emerged from the data this year, as shown below.
[Chart: software delivery performance levels, showing the percentage of respondents in each lead-time bucket (one hour to one day, less than one day, one day to one week, one week to one month, and one to six months) for each cluster.]
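As a hedged illustration of what a cluster analysis over the four metrics can look like, here is a minimal R sketch on simulated data. The report does not specify its algorithm here; k-means is used purely for demonstration.

```r
# Illustrative only: simulated stand-ins for the four software delivery
# metrics, clustered with k-means. This is not the report's actual method
# or data; it only demonstrates the idea of recovering four clusters.
set.seed(7)
four_keys <- data.frame(
  deploy_freq    = rlnorm(500),   # deployment frequency
  lead_time      = rlnorm(500),   # lead time for changes
  change_failure = runif(500),    # change failure rate
  recovery_time  = rlnorm(500)    # failed deployment recovery time
)

clusters <- kmeans(scale(four_keys), centers = 4, nstart = 25)
table(clusters$cluster)  # sizes of the four performance clusters
```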
1. We consider a deployment to be a change failure only if it causes an issue after landing in production, where it can be experienced
by end users. In contrast, a change that is stopped on its way to production is a successful demonstration of the deployment
process's ability to detect errors.
2. Forsgren, Nicole, Jez Humble, and Gene Kim. 2018. Accelerate: The Science Behind DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution Press. pp. 37-38.
3. The 2019 Accelerate State of DevOps report (p. 32) found that the retail industry saw significantly better software delivery performance. https://dora.dev/research/2019/dora-report/2019-dora-accelerate-state-of-devops-report.pdf#page=32
4. https://dora.dev/guides/value-stream-management/
5. https://dora.dev/capabilities
6. https://dora.dev/research
Takeaways

The vast majority of organizations across all industries surveyed are shifting their priorities to more deeply incorporate AI into their applications and services. A corresponding majority of development professionals are relying on AI to help them perform their core role responsibilities — and reporting increases in productivity as a result. The perception among development professionals that using AI is necessary for remaining competitive in today's market is pervasive and appears to be an important driver of AI adoption for both organizations and individual development professionals.

Introduction

It would be difficult to ignore the significant impact that AI has had on the landscape of development work this year, given the proliferation of popular news articles outlining its effects, from good1 to bad2 to ugly.3 So, while AI was only discussed as one of many technical capabilities affecting performance in our 2023 Accelerate State of DevOps Report,4 this year we explore this topic more fully.

As the use of AI in professional development work moves rapidly from the fringes to ubiquity, we believe our 2024 Accelerate State of DevOps Report represents an important opportunity to assess the adoption, use, and attitudes of development professionals at a critical inflection point for the industry.
[Chart: change in organizational AI prioritization, including the option "no change in AI prioritization."]

[Chart: percentage of respondents relying on AI by task, including documentation (60.8%) and debugging (56.1%).]
Chatbots were the most common interface through which respondents interacted with AI in their daily work (78.2%), followed by external web interfaces (73.9%) and AI tools embedded within their IDEs (72.9%). Respondents were less likely to use AI through internal web interfaces (58.1%) and as part of automated CI/CD pipelines (50.2%).

However, we acknowledge that respondents' awareness of AI used in their CI/CD pipelines and internal platforms likely depends on the frequency with which they interface with those technologies, so these numbers might be artificially low.

We found that data scientists and machine learning specialists were more likely than respondents holding all other job roles to rely on AI. Conversely, hardware engineers were less likely than respondents holding all other job roles to rely on AI, which might be explained by the responsibilities of hardware engineers differing from the above tasks for which AI is commonly used.
[Chart: reported impact of AI on individual productivity, with answers ranging from "not at all" through "a little," "somewhat," "a lot," and "a great deal," plus "no impact on my productivity."]
Given the evidence from the survey that developers are rapidly adopting AI, relying on it, and perceiving it as a positive performance contributor, we found the overall lack of trust in AI surprising. It's worth noting that during our interviews, many of our participants indicated that they were willing to, or expected to, tweak the outputs of the AI-generated code they used in their professional work.

One participant even likened the need to evaluate and modify the outputs of AI-generated code to "the early days of StackOverflow, [when] you always thought people on StackOverflow are really experienced, you know, that they will know exactly what to do. And then, you just copy and paste the stuff, and things explode" (P2).

Perhaps because this is not a new problem, participants like P3 felt that their companies are not "worried about, like, someone just copy-and-pasting code from Copilot or ChatGPT [because of] having so many layers to check it" with their existing code-quality assurance processes.

We hypothesize that developers do not necessarily expect absolute trust in the accuracy of AI-generated code, nor does absolute trust appear to be required for developers to find AI-generated code useful. Rather, it seems that mostly-correct AI-generated code that can be perfected with some tweaks is acceptable, sufficiently valuable to motivate widespread adoption and use, and compatible with existing quality assurance processes.
[Chart: percentage of respondents with a negative outlook, for responses about one, five, or 10 years into the future.]
1. https://www.sciencedaily.com/releases/2024/03/240306144729.htm
2. https://tech.co/news/list-ai-failures-mistakes-errors
3. https://klyker.com/absurd-yoga-poses-generated-by-ai/
4. https://dora.dev/dora-report-2023
5. Rogers, Everett M., Arvind Singhal, and Margaret M. Quinlan. "Diffusion of innovations." An Integrated Approach to Communication Theory and Research. Routledge, 2014. 432-44; Tornatzky, L. G., & Fleischer, M. (1990). The Processes of Technological Innovation. Lexington, MA: Lexington Books.
6. (P[N]), for example (P1), indicates the pseudonym of an interview participant.
Takeaways
This chapter investigates the impact of AI adoption across the spectrum, from individual developers to entire organizations. The findings reveal a complex picture with both clear benefits and unexpected drawbacks. While AI adoption boosts individual productivity, flow, and job satisfaction, it may also decrease time spent on valuable work.

Despite these challenges, AI adoption is linked to improved team and organizational performance. This chapter concludes with a call to critically evaluate AI's role in software development and proactively adapt its application to maximize benefits and mitigate unforeseen consequences.
The first challenge of capturing the impact of adopting AI is measuring the adoption of AI. We determined that measuring usage frequency is likely not as meaningful as measuring reliance for understanding AI's centrality to development workflows. You might only do code reviews or write documentation a few times a month or every couple of months, but you see these tasks as critically important to your work. Conversely, just because you use AI frequently does not mean that you are using AI for work that you consider important or central to your role.

Using factor analysis, we found our "general" AI reliance survey item had high overlap with reported AI reliance on the following tasks (a sketch of this kind of check follows the list):

• Code writing

• Summarizing information

• Code explanation

• Code optimization

• Documentation

• Test writing
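As a hedged sketch of that kind of factor-analysis check, the following R code simulates a general reliance item plus six task items and fits a one-factor model; the data and item names are made up for illustration, not taken from the survey.

```r
# Illustrative only: does a "general" AI-reliance item load on the same
# factor as task-specific reliance items? Data here is simulated.
set.seed(1)
n <- 300
general <- rnorm(n)
tasks <- sapply(1:6, function(i) 0.8 * general + rnorm(n, sd = 0.6))
colnames(tasks) <- c("code_writing", "summarizing", "code_explanation",
                     "code_optimization", "documentation", "test_writing")
items <- data.frame(reliance_general = general, tasks)

# Uniformly high loadings on one factor would indicate the overlap reported.
fa <- factanal(items, factors = 1)
print(fa$loadings, cutoff = 0)
```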
Job satisfaction: A single item designed to capture someone's overall feeling about their job.

Flow: A single item designed to capture how much focus a person tends to achieve during development tasks.
[Chart: estimated % change in outcome from AI adoption (point = estimated value; error bar = 89% uncertainty interval): flow +2.6%, productivity +2.1%, burnout -0.6%.]
Figure 7: Impacts of AI adoption on individual success and well-being
[Figure: what AI is helping with, spanning toilsome work and valuable work.]
Technical debt: The extent to which existing technical debt within the primary application or service has hindered productivity over the past six months.

Code review speed: The average time required to complete a code review for the primary application or service.

Approval speed: The typical duration from proposing a code change to receiving approval for production use in the primary application or service.

Cross-functional team (XFN) coordination: The level of agreement with the statement: "Over the last three months, I have been able to effectively collaborate with cross-functional team members."

Code quality: The level of satisfaction or dissatisfaction with the quality of code underlying the primary service or application in the last six months.
For the past few years, we have seen that software delivery
throughput and software delivery stability indicators were
starting to show some independence from one another.
While the traditional association between throughput and
stability has persisted, emerging evidence suggests these
factors operate with sufficient independence to warrant
separate consideration.
[Chart: benefit or detriment to teams as a function of time using AI.]
Figure 12: Representations of different learning curves. This is an abstraction for demonstrative purposes; it is not derived from real data.
There is clearly a lot to be excited about and even more to learn. DORA will stay tuned in and do its best to offer honest, accurate, and useful perspectives, just as it has over the past decade.
1. https://www.goldmansachs.com/insights/top-of-mind/gen-ai-too-much-spend-too-little-benefit
2. https://www.goldmansachs.com/insights/articles/AI-poised-to-drive-160-increase-in-power-demand
3. https://www.washington.edu/news/2023/07/27/how-much-energy-does-chatgpt-use/
4. https://www.gatesnotes.com/The-Age-of-AI-Has-Begun
5. https://www.businessinsider.com/ai-chatgpt-homework-cheating-machine-sam-altman-openai-2024-8
6. https://www.safe.ai/work/statement-on-ai-risk
7. https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/
8. https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality
9. https://www.nytimes.com/2024/04/15/technology/ai-models-measurement.html
10. https://dora.dev/capabilities
11. We should be clear that this isn't a unique approach, but it is a somewhat unique approach for this space.
12. (P[N]), for example (P1), indicates the pseudonym of an interview participant.
Introduction
Platform engineering is an emerging
engineering discipline that has been
gaining interest and momentum across
the industry. Industry leaders such as
Spotify and Netflix, and books such as
Team Topologies1 have helped excite
audiences.
In platform engineering, a lot of energy and focus is spent on improving the developer experience by building golden paths: highly automated, self-service workflows that users of the platform follow when interacting with the resources required to deliver and operate applications. Their purpose is to abstract away the complexities of building and delivering software such that the developer only needs to worry about their code.

Some examples of the tasks automated through golden paths include new application provisioning, database provisioning, schema management, test execution, build and deployment infrastructure provisioning, and DNS management.

For example, if the platform has the capability to execute unit tests and report results back directly to development teams, without the team needing to build and manage the test execution environment, then the continuous integration platform feature enables teams to focus on writing high-quality tests. In this example, the continuous integration feature can scale across the larger organization and make it easier for multiple teams to improve their capabilities with continuous testing3 and test automation.4
A key factor in success is to approach platform engineering with user-centeredness (users, in the context of an internal developer platform, are developers), developer independence, and a product mindset. This isn't too surprising given that user centricity was identified as a key factor in improved organizational performance this year and in previous years.5 Without a user-centered approach, the platform will be more of a hindrance than an aid.

In this year's report, we sought to test the relationship between platforms and software delivery and operational performance. We found some positive results. Internal developer platform users had 8% higher levels of individual productivity and 10% higher levels of team performance. Additionally, an organization's software delivery and operations performance increases 6% when using a platform. However, these gains do not come without some drawbacks. Throughput and change stability saw decreases of 8% and 14%, respectively, which was a surprising result.

In the next sections we'll dig deeper into the numbers, nuances, and some surprising data that this survey revealed. Whether your platform engineering initiative is just starting or has been underway for many years, applying the key findings can help your platform be more successful.
[Chart: estimated productivity factor score (0-10) for respondents with and without an internal developer platform; each dot is one of 8,000 estimates of the most plausible mean productivity score.]
Figure 13: Productivity factor for individuals when using or not using an internal developer platform.
The promise of platform engineering
[Chart: organizational performance change by platform age (less than a year, 1-2 years, 2-5 years, more than 5 years).]
Figure 14: Organization performance change when using an internal developer platform vs. the age of the platform.
When taking into account the age of the platform along with productivity, we see initial performance gains at the onset of a platform engineering initiative, followed by a decrease and recovery as the platform ages and matures. This pattern is typical of transformation initiatives that experience early gains but encounter challenges once those have been realized.

In the long run, productivity gains are maintained, showing the overall potential of an internal developer platform's role in the software delivery and operational processes.

Key finding: impact of developer independence

Developer independence had a significant impact on the level of productivity at both the individual and team levels when delivering software using an internal developer platform. Developer independence is defined as "developers' ability to perform their tasks for the entire application lifecycle, without relying on an enabling team."

At both the team and individual level we see a 5% improvement in productivity when users of the platform are able to complete their tasks without involving an enabling team. This finding points back to one of the key principles of platform engineering: focusing on enabling self-service workflows.
Secondary finding: impact of a dedicated platform team

Interestingly, the impact on productivity of having a dedicated platform team was negligible for individuals. However, it resulted in a 6% gain in productivity at the team level. This finding is surprising because of its uneven impact, suggesting that a dedicated platform team is of limited use to individuals but more impactful for teams overall.

Since teams have multiple developers with different responsibilities and skills, they naturally have a more diverse set of tasks when compared to an individual engineer. It is possible that having a dedicated platform engineering team allows the platform to be more supportive of the diversity of tasks represented by a team.

Overall, having an internal developer platform has a positive impact on productivity. The key factors are:

A user-centered approach that enables developer independence through self-service workflows that can be completed autonomously. Recall that in the context of the platform, users are internal engineering and development teams.

As with other transformations, the "J-curve" also applies to platform engineering, so productivity gains will stabilize through continuous improvement.
The unexpected downside

While platform engineering presents some definite upsides, in terms of teams and individuals feeling more productive and improvements in organizational performance, it also had an unexpected downside: we found that throughput and change stability decreased. Unexpectedly, we also discovered a very interesting linkage between change instability and burnout.

Throughput

In the case of throughput, we saw approximately an 8% decrease when compared to those who don't use a platform. We have hypotheses about what might be the underlying cause.

First, the added machinery that changes need to pass through before getting deployed to production decreases the overall throughput of changes. In general, when an internal developer platform is being used to build and deliver software, there is usually an increase in the number of "handoffs" between systems and, implicitly, teams. Each of these handoffs is an opportunity for time to be introduced into the overall process, resulting in a decrease in throughput but a net increase in the ability to get work done.

Second, for respondents who reported that they are required to "exclusively use the platform to perform tasks for the entire app lifecycle," there was a 6% decrease in throughput. While not a definitive connection, this could also be related to the first hypothesis. If the number of systems and tools involved in developing and releasing software increases with the presence of a platform, being required to use the platform when it might not be fit for purpose, or the naturally increasing latency in the process, could account for the relationship between exclusivity and decreased throughput.

To counter this, it is important to be user-centered and work toward user independence in your platform engineering initiatives.
Change instability and burnout

When considering the stability of changes to applications being developed and operated using an internal developer platform, we observed a surprising 14% decrease in change stability. This indicates that the change failure rate and rate of rework significantly increase when a platform is being used.

Even more interesting, we discovered in the results that instability in combination with a platform is linked to higher levels of burnout. That isn't to say that platforms lead to burnout, but the combination of instability and platforms is particularly troublesome when it comes to burnout. Similar to the decrease in throughput, we aren't entirely sure why the change in burnout occurs, but we have some hypotheses.

First, the platform enables developers and teams to push changes with a higher degree of confidence that if a change is bad, it can be quickly remediated. In this instance the higher level of instability isn't necessarily a bad thing, since the platform is empowering teams to experiment and deliver changes, which results in an increased level of change failure and rework.

A second idea is that the platform isn't effective at ensuring the quality of changes and/or deployments to production. It could also be that the platform provides an automated testing capability that exercises whatever tests are included in the application, yet application teams aren't fully using that capability, prioritizing throughput over quality and not improving their tests. In either scenario, bad changes are actually making it through the process, resulting in rework.

A third possibility is that teams with a high level of change instability and burnout tend to create platforms in an effort to improve stability and reduce burnout. This makes sense because platform engineering is often viewed as a practice which reduces burnout and increases the ability to consistently ship smaller changes. With this hypothesis, platform engineering is symptomatic of an organization with burnout and change instability.

In the first two scenarios, the rework allowed by the platform could be seen as burdensome, which could also be increasing burnout. In particular, the second scenario, where the platform is enabling bad changes, would contribute more to burnout, but in both scenarios the team or individual could still feel productive because of their ability to push changes and features. In the third scenario, change instability and burnout are predictive of a platform engineering initiative and the platform is seen as a solution to those challenges.
Balancing the trade-offs

While platform engineering is no panacea, it has the potential to be a powerful discipline when it comes to the overall software development and operations process. As with any discipline, platform engineering has benefits and drawbacks.

Based on our research, there are a couple of actions you can take to balance the trade-offs when embarking on a platform engineering initiative. Doing so will help your organization achieve the benefits of platform engineering while being able to monitor and manage any potential downsides.

First, prioritize platform functionality that enables developer independence and self-service capabilities. When doing this, pay attention to the trade-off of exclusively requiring the platform to be used for all aspects of the application lifecycle, which could hinder developer independence. As good practice, a platform should provide methods for its users to break out of the tools and automations provided in the platform, which contributes to independence; however, it comes at the cost of complexity. This trade-off can be mitigated with a dedicated platform team that actively collaborates with and collects feedback from users of the platform. Collaboration and feedback improve the user-centeredness of the platform initiative and will contribute to the long-term success of the platform. As we saw in the data, there are many different methods used to collect feedback, so employ more than one approach to maximize feedback collection.

Second, carefully monitor the instability of your application changes and try to understand whether the instability being experienced is intentional or not. Platforms have the potential to unlock experimentation (which can show up as instability), increase productivity, and improve performance at scale. However, that same instability also has the potential to come at the cost of stability and burnout, so it needs to be carefully monitored and accounted for throughout the platform engineering journey. When doing so, it is important to understand your appetite for instability. Using service level objectives (SLOs) and error budgets from site reliability engineering (SRE) can help you gauge your risk tolerance and the effectiveness of the platform in safely enabling experimentation; a simple error-budget sketch follows below.

Internal developer platforms put a lot of emphasis on the developer experience; however, there are many other teams (including database administrators, security, and operations) whose work is required to effectively deliver and operate software.
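Here is the promised error-budget sketch, in R with purely illustrative numbers; it shows the mechanics of the SRE idea, not anything prescribed by the report.

```r
# Illustrative error-budget arithmetic (numbers are made up).
slo_target    <- 0.999   # e.g., 99.9% of deployments succeed without user-facing failure
window_events <- 50000   # deployments or requests observed in the SLO window
failed_events <- 35      # failures observed in the same window

error_budget     <- (1 - slo_target) * window_events   # failures you can "afford": 50
budget_remaining <- error_budget - failed_events

cat(sprintf("Error budget: %.0f events; remaining: %.0f (%.0f%% left)\n",
            error_budget, budget_remaining, 100 * budget_remaining / error_budget))
# A healthy remainder leaves room for deliberate experimentation; an exhausted
# budget signals instability beyond your stated tolerance.
```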
In your platform engineering initiatives, foster a culture of user-centeredness and continuous improvement across all teams, aligned with the organization's goals.
1. Skelton, Matthew and Pais, Manuel. 2019. Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution
Press. https://teamtopologies.com/
2. https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left
3. https://dora.dev/capabilities/continuous-integration/
4. https://dora.dev/capabilities/test-automation/
5. https://dora.dev/research/2023/, https://dora.dev/research/2016/
6. https://dora.dev/research/2024/questions/#platform-engineering
Developer experience
Takeaways
Software doesn't build itself. Even when assisted by AI, people build software, and their experiences at work are a foundational component of successful organizations.

In this year's report, we again found that alignment between what developers build and what users need allows employees and organizations to thrive. Developers are more productive, less prone to experiencing burnout, and more likely to build high-quality products when they build software with a user-centered mindset.

Ultimately, software is built for people, so it's the organization's responsibility to foster environments that help developers focus on building software that will improve the user experience. We also find that stable environments, where priorities are not constantly shifting, lead to small but meaningful increases in productivity and important, meaningful decreases in employee burnout.

Environmental factors have substantial consequences for the quality of the products developed and the overall experience of the developers whose job is to build those products.
Put the user first, and (almost) everything else falls into place

Developers' jobs are fundamentally tied to people: the users of the software and applications they create. Yet developers often work in environments that prioritize features and innovation. There's less emphasis on figuring out whether these features provide value to the people who use the products they make.

This year, we asked questions focused on understanding whether developers:

3. Believe focusing on the user is key to the success of the business

4. Believe the user experience is a top business priority
Our findings and what they mean

Our data strongly suggests that organizations that see users' needs and challenges as a guiding light make better products.

We find that focusing on the user increases productivity and job satisfaction, while reducing the risk of burnout.

Importantly, these benefits extend beyond the individual employee to the organization. In previous years, we've highlighted that high-performing organizations deliver software quickly and reliably. The implication is that software-delivery performance is a requirement for success. However, our data indicates there's another path that leads to success:

Developers and their employers, and organizations in general, can create a user-centered approach to software development.

We find that when organizations know and understand users' needs, stability and throughput of software delivery are not a requirement for product quality. Product quality will be high as long as the user experience is at the forefront.

When organizations don't focus on the user and don't incorporate user feedback into their development process, doubling down on stable and fast delivery is the only path to product quality (see Figure 15).
We understand the inclination that some organizations might have to focus on creating features and innovating on technologies. At face value, this approach makes sense. After all, developers most certainly know the ins and outs of the technology much better than their average user.

However, developing software based on assumptions about the user experience increases the likelihood of developers building features that are perhaps shiny but hardly used.1

When organizations and employees understand how their users experience the world, they increase the likelihood of building features that address the real needs of their users. Addressing real user needs increases the chances of those features actually being used.

Focus on building for your user and you will create delightful products.
[Chart: predicted product performance as a function of delivery throughput.]
Figure 15: Product performance and delivery throughput across 3 levels of user centricity
Why is a user-centered approach to software development such a powerful philosophy and practice?

Academic research shows that deriving a sense of purpose from work benefits employees and organizations.2,3

For example, a recent survey showed that 93% of workers reported that it's important to have a job where they feel the work they do is meaningful.4 In a similar vein, another survey found that on average, respondents were willing to relinquish 23% of their entire future earnings if it meant they could have a job that was always meaningful.5

That's an eye-popping trade-off employees are willing to make. It tells us something about what motivates people, and that people want to spend their time doing something that matters.

Provides a clear sense of direction:

A user-centered approach to software development can fundamentally alter how developers view their work. Instead of shipping arbitrary features and guessing whether users might use them, developers can rely on user feedback to help them prioritize what to build.

This approach gives developers confidence that the features they are working on have a reason for being. Suddenly, their work has meaning: to ensure people have a superb experience when using their products and services. There's no longer a disconnect between the software that's developed and the world in which it lives.

Developers can see the direct impact of their work through the software they create.
Increases cross-functional collaborations:

Even the most talented developer doesn't build software on their own. Building high-quality products takes the collaboration of many people, often with different yet complementary talents.

This approach to software development can help developers break out of silos, seek alignment, foster teamwork, and create opportunities to learn more from others. Problem solving takes a different shape. It's not just about how to solve technical problems, but how to do so in ways that serve the user best.
The combination of good docs and a user-centered approach to software development is a powerful one.

Teams that focus on the user see an increase in product performance. When this focus on the user is combined with an environment of quality internal documentation, this increase in product performance is amplified (see Figure 16). This finding is similar to the behavior we see where documentation amplifies a technical capability's impact on organizational performance.7

Documentation helps propagate user signals and feedback across the team and into the product itself.

We see that internal documentation doesn't meaningfully affect predicted product performance without user signals. However, if a team has high-quality internal documentation, then the user signals included in it will have a higher impact on product performance.

We started to look at documentation in 2021, and every year we continue to find extensive impact of quality documentation. This year's findings add internal documentation's impact on predicted product performance to the list.
[Chart: predicted product performance as a function of documentation quality; the graph is a composite of 12,000 lines from simulations trying to estimate the most plausible pattern.]
Figure 16: Product performance and documentation quality across 3 levels of user centricity
Culture of documentation

The Agile manifesto advocates for "working software over comprehensive documentation".8 We continue to find, however, that quality documentation is a key component of working software.

In these cases, our measure of quality documentation would likely score low. This type of content is written for the wrong audience, so it doesn't perform as well when you try to use it while doing your work. And too much documentation can be as problematic as not enough.
The perils of ever-shifting priorities
We all know the feeling. You've spent the last few months working on a new feature. You know it's the right thing to build for your users, and you are focused and motivated. Suddenly, or seemingly so, the leadership team decides to change the organization's priorities. Now it's unclear whether your project will be paused, scrapped, Frankensteined, or mutated.

This common experience can have profound implications for employees and organizations. Here we examine what happens when organizations constantly shift their priorities.

Our findings and what they mean

Overall, our findings show small but meaningful decreases in productivity and substantial increases in burnout when organizations have unstable priorities.

Our data indicates it is challenging to mitigate this increase in burnout. We examined whether having strong leaders, good internal documents, and a user-centered approach to software development can help counteract the effect of shifting priorities on burnout.
Why are unstable organizational priorities bad for employees' well-being?

We hypothesize that unstable organizational priorities increase employee burnout by creating unclear expectations, decreasing employees' sense of control, and increasing the size of their workloads.

To be clear, we believe that the problem is not with changing priorities themselves. Business goals and product direction shift all the time. It can be good for organizational priorities to be malleable.

What happens when priorities stabilize?

Our findings here are a little puzzling. We find that when priorities are stabilized, software delivery performance declines. It becomes slow and less stable in its delivery.

We hypothesize that this might be because organizations with stable priorities might have products and services that are generally in good shape, so changes are made less frequently. It is also possible that stability of priorities leads to shipping less, and in larger batches than recommended.
Building AI for end users creates stability in priorities, but not stability in delivery

Incorporating AI-powered experiences for end users stabilizes organizational priorities. This sounds like a flashy endorsement for AI. However, we do not interpret this finding as telling us something meaningful about AI itself.

Instead, we believe that shifting efforts towards building AI provides clarity and a northstar for organizations to follow. This clarity, and not AI, is what leads to a stabilization of organizational priorities.

This is worth highlighting because it tells us something about what happens to organizations when new technologies emerge. New technologies bring change, and organizations need time to adapt. This period likely leads to a destabilization of priorities as leaders try to figure out the best move for the organization. As the dust settles, and organizations clarify their next steps, priorities begin to stabilize.

Priorities stabilizing, however, doesn't immediately translate into the software delivery process stabilizing. Our analyses show that a shift to adding AI-powered experiences into your service or application comes with challenges and growing pains.

We find that teams that have shifted see a significant 10% decrease in software delivery stability relative to teams that have not. Here is a visualization depicting the challenge.
[Chart: predicted delivery stability across levels of agreement, from strongly disagree to strongly agree; each line is one of 4,000 simulations trying to estimate the most plausible pattern.]
Figure 17: Software delivery stability as a function of adding AI-powered experiences to service or application
What can organizations do?
1. https://www.nngroup.com/articles/bridging-the-designer-user-gap/
2. https://executiveeducation.wharton.upenn.edu/thought-leadership/wharton-at-work/2024/03/creating-meaning-at-work/
3. https://www.apa.org/pubs/reports/work-in-america/2023-workplace-health-well-being
4. https://bigthink.com/the-present/harvard-business-review-americans-meaningful-work/
5. https://hbr.org/2018/11/9-out-of-10-people-are-willing-to-earn-less-money-to-do-more-meaningful-work
6. (P[N]), for example (P1), indicates the pseudonym of an interview participant.
7. https://cloud.google.com/blog/products/devops-sre/deep-dive-into-2022-state-of-devops-report-on-documentation and
Accelerate State of DevOps Report 2023 - https://dora.dev/research/2023/dora-report
8. https://agilemanifesto.org/
9. Other audiences exist, such as management, regulators, or auditors.
10. Cohen S, Janicki-Deverts D, Miller GE. Psychological Stress and Disease. JAMA. 2007;298(14):1685-1687. doi:10.1001/jama.298.14.1685
Leading transformations
Transformational leadership
Transformational leadership is a
model in which leaders inspire and
motivate employees to achieve higher
performance by appealing to their
values and sense of purpose, facilitating
wide-scale organizational change.
Vision: They have a clear vision of where their team and the organization are going.

Inspirational communication: They say positive things about the team; make employees proud to be a part of their organization; encourage people to see changing conditions as situations full of opportunities.
This year, we saw that transformational leadership leads to a boost in employee productivity: increasing transformational leadership by 25% leads to a 9% increase in employee productivity.

Transformational leadership can help improve more than just productivity. Having good leaders can also lead to:

• A decrease in employee burnout

• An increase in job satisfaction

• An increase in team performance

• Improved product performance

• Improved organizational performance

Our research found a statistically significant relationship between the above qualities of leadership and IT performance in 2017. High-performing teams had leaders with strong scores across all five characteristics, and low-performing teams had the lowest scores. Additionally, we saw that there's a strong correlation between transformative leadership and Employee Net Promoter Score (eNPS), the likelihood to recommend working at a company.

That said, transformative leadership by itself does not lead to high performance; it should be seen as an enabler. Transformative leadership plays a key role in enabling the adoption of technical and product-management capabilities and practices. This is enabled by (1) delegating authority and autonomy to teams; (2) providing them the metrics and business intelligence needed to solve problems; and (3) creating incentive structures around value delivery as opposed to feature delivery.
Our research has helped to flip the narrative of IT as a cost center to IT as an investment that drives business success. In 2020, we wrote the ROI of DevOps whitepaper,2 which contains calculations you can use to help articulate the potential value created by investing in IT improvement.

"…teams to achieve win-win outcomes; lower levels of burnout; more effective leadership; and effective implementation of both continuous delivery and lean management practices."3 We recommend dedicating a certain amount of capacity specifically for improvement.
[Chart: estimated % change in outcome from transformational leadership (point = estimated value; error bar = 89% uncertainty interval): burnout -9.9%, productivity +8.7%.]
Figure 18: Impacts of transformational leadership on various outcomes.
Be relentlessly user-centric
Become a data-informed organization
The ability to visualize your progress toward success is critical. Over the last 10 years we have made the case for becoming a data-informed organization. DORA's four key metrics5 have become a global standard for measuring software delivery performance, but this is only part of the story. We have identified more than 30 capabilities and processes6 that can be used to drive organizational improvement.

The value in the metrics lies in their ability to tell you if you are improving. The four key metrics should be used at the application and service levels, and not at the organization or line-of-business level. The metrics should be used to visualize your efforts in continuous improvement and not to compare teams — and certainly not to compare individuals.

The metrics should also not be used as a maturity model for your application or service teams. Being a low, medium, high, or elite performer is interesting, but we urge caution as these monikers have little value in the context of your transformation journey.

…speed and stability. As a result, the benefits gained by speed and stability are diminished as higher performance becomes ubiquitous.

Thinking about transformation holistically, we recommend creating dashboards and visualizations that combine both technical metrics (such as our four keys and reliability metrics) and business metrics. This helps bridge the gap between top-down and bottom-up transformation efforts. This also helps connect your northstar, OKRs, and employee goals with the investments made in IT. They can help quantify the ROI.

We believe metrics are a requirement for excellence. Metrics facilitate decision making. The more metrics you collect, quantitative and qualitative, the better and more informed decisions you can make. People will always have opinions on the value of the data or the meaning of the data, but using data as the basis by which to make a decision is often preferable to relying on opinion or intuition.
Be all-in on cloud or stay in the data center
Summary
What we've seen consistently over the last 10 years is that transformation is a requirement for success. What many organizations misunderstand is that transformation isn't a destination but a journey of continuous improvement.8 Our research is clear: companies that are not continuously improving are actually falling behind. Conversely, companies that adopt a mindset of continuous improvement see the highest levels of success.

The idea of a never-ending journey can seem daunting. It's easy to get stuck in planning or designing the perfect transformation. The key to success is rolling up your sleeves and just getting to work. The goal of the organization and your teams should be simply to be a little better than you were yesterday. The goal of our last 10 years of research, and of the years to come, is to help you get better at getting better.
1. Dimensions of transformational leadership: Conceptual and empirical extensions - Rafferty, A. E., & Griffin, M. A.
2. The ROI of DevOps Transformation - https://dora.dev/research/2020/
3. 2015 State of DevOps Report https://dora.dev/research/2015/2015-state-of-devops-report.pdf#page=25
4. 2023 Accelerate State of DevOps Report -
https://dora.dev/research/2023/dora-report/2023-dora-accelerate-state-of-devops-report.pdf#page=17
5. DORA's Four Key Metrics https://dora.dev/guides/dora-metrics-four-keys/
6. DORA's capabilities and processes https://dora.dev/capabilities/
7. NIST-defined 5 characteristics of cloud computing https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf
8. Journey of continuous improvement
https://cloud.google.com/transform/moving-shields-into-position-organizing-security-for-digital-transformation
9. 2018 Accelerate State of DevOps Report https://dora.dev/research/2018/dora-report/
10. 2022 State of DevOps Report https://dora.dev/research/2022/dora-report/
A decade with DORA

Teams do not need to sacrifice speed for stability

There are many ways that teams measure the four keys; a minimal sketch of one approach follows the diagram below.

[Diagram: software delivery performance, measured by the four keys metrics (throughput and stability), together with reliability, measured by service level objectives (SLOs), predicts outcomes such as organizational performance and well-being.]
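As a minimal, hypothetical sketch (not from the report), a team with a simple deployment log might compute the four keys like this in R; the data frame and its column names are invented for illustration.

```r
# Hypothetical deployment log; columns are illustrative, not a DORA standard.
deploys <- data.frame(
  deployed_at    = as.Date(c("2024-06-03", "2024-06-05", "2024-06-10", "2024-06-12")),
  change_started = as.Date(c("2024-05-30", "2024-06-02", "2024-06-06", "2024-06-11")),
  caused_failure = c(FALSE, TRUE, FALSE, FALSE),
  restore_hours  = c(NA, 4, NA, NA)    # time to restore after a failed change
)
window_days <- 30

deploy_frequency    <- nrow(deploys) / (window_days / 7)  # deploys per week
lead_time_days      <- median(as.numeric(deploys$deployed_at - deploys$change_started))
change_failure_rate <- mean(deploys$caused_failure)       # share of bad deploys
recovery_hours      <- median(deploys$restore_hours, na.rm = TRUE)

cat(sprintf("Deploys/week: %.1f | Lead time: %.1f days | CFR: %.0f%% | Recovery: %.1f h\n",
            deploy_frequency, lead_time_days, 100 * change_failure_rate, recovery_hours))
```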
1. Slides - https://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr,
recording - https://www.youtube.com/watch?v=LdOe18KhtT4
2. https://legacy.devopsdays.org/events/2009-ghent/
3. https://www.puppet.com/resources/history-of-devops-reports#2013
4. 2014 State of DevOps Report - https://dora.dev/research/2014/
5. The ROI of DevOps Transformation - https://dora.dev/research/2020/
6. Forsgren, Nicole, Jez Humble, and Gene Kim. 2018. Accelerate: The Science Behind DevOps: Building and Scaling High Performing Technology Organizations. IT Revolution Press.
7. Accelerate State of DevOps: Strategies for a New Economy - https://dora.dev/research/2018/dora-report/
8. https://www.puppet.com/resources/history-of-devops-reports#2018
9. https://dora.dev/news/dora-joins-google-cloud
10. We consider 2014, the year that Dr. Forsgren joined the program, to be the first DORA report, even though DORA was founded a
few years later. There was no report in 2020, making 2024 the tenth report.
11. https://dora.dev/resources/#source-available-tools
12. 2022 Accelerate State of DevOps Report - https://dora.dev/research/2022/dora-report/
13. Ron Westrum, "A typology of organisation culture", BMJ Quality & Safety 13, no. 2 (2004), doi:10.1136/qshc.2003.009522
14. https://dora.community
Final thoughts
How are you leveraging this research?
Acknowledgements
This year marks a special milestone: the 10th DORA report. We are thankful
for all of the dedicated work of researchers, experts, practitioners, leaders,
and transformation agents who have joined in shaping this body of work and
evolved alongside us.
We've come a long way since the first State of DevOps Report
published by Puppet Labs and IT Revolution Press. A heartfelt
thank you to our DORA founders for paving the way. It's
remarkable to reflect on how much has changed since
then and how much we've learned throughout the years.
DORA Report Team: James Brookbank, Kim Castillo, Derek DeBellis, Benjamin Good, Nathen Harvey, Michelle Irvine, Amanda Lewis, Eric Maxwell, Steve McGhee, Allison Park, Marie-Blanche Panthou, Miguel Reyes, Yoshi Yamaguchi, Jinhong Yu

DORA guides: Lisa Crispin, Steve Fenton, Denali Lumma, Betsalel (Saul) Williamson

Editor: Seth Rosenblatt

Gene Kim and IT Revolution, Laura Maguire, PhD, James Pashutinski, Ryan J. Salva, Majed Samad, Harini Sampath, Robin Savinar, Sean Sedlock, Dustin Smith, Finn Toner, Sander Bogdan, Michele Chubirka, Thomas De Meo

Silver sponsors
Authors
Derek DeBellis

Derek is a quantitative user experience researcher at Google and the lead investigator for DORA. Derek focuses on survey research, logs analysis, and figuring out ways to measure concepts that demonstrate a product or feature is delivering capital-V value to people. Derek has published on human-AI interaction, the impact of COVID-19's onset on smoking cessation, designing for NLP errors, the role of UX in privacy discussions, team culture, and AI's relationship to employee well-being and productivity. His current extracurricular research is exploring ways to simulate the propagation of beliefs and power.
Amanda Lewis

Amanda Lewis is the DORA.community development lead and a developer relations engineer at Google Cloud. She has spent her career building connections across developers, operators, product managers, project managers, and leadership. She has worked on teams that developed e-commerce platforms, content management systems, and observability tools, and she has supported developers. These connections and conversations lead to happy customers and better outcomes for the business. She brings her experience and empathy to the work that she does helping teams understand and implement software delivery and artificial intelligence practices.
Daniella Villalba

Daniella Villalba is a user experience researcher at Google. She uses survey research to understand the factors that make developers happy and productive. Before Google, Daniella studied the benefits of meditation training and the psycho-social factors that affect the experiences of college students. She received her PhD in Experimental Psychology from Florida International University.
Michelle Irvine

Michelle Irvine is a technical writer at Google, and her research focuses on documentation and other technical communication. Before Google, she worked in educational publishing and as a technical writer for physics simulation software. Michelle has a BSc in Physics, as well as an MA in Rhetoric and Communication Design from the University of Waterloo.
Demographics and firmographics

The DORA research program has been researching the capabilities, practices, and measures of high-performing, technology-driven organizations for over a decade. We've heard from roughly 39,000 professionals working in organizations of every size and across many different industries. Thank you for sharing your insights! This year, nearly 3,000 working professionals from a variety of industries around the world shared their experiences to help grow our understanding of the factors that drive high-performing, technology-driven organizations.

Over 90,000 respondents participated in the 2023 Stack Overflow Developer Survey.1 That survey didn't reach every technical practitioner, but it is about as close as you can get to a census of the developer world. With a sense of the population provided by that survey, we can locate response bias in our data and understand how far we might want to generalize our findings. Further, the demographic and firmographic questions asked in the Stack Overflow Developer Survey are well-crafted and worth borrowing.
[Chart: respondents by industry, including Media/Entertainment (4.26%), Government (3.89%), Education (3.66%), Energy (3.03%), Insurance (2.39%), and Non-Profit (1%).]
[Charts: number of employees; gender; years of experience. Box width represents the 25th and 75th percentiles; the line dissecting the box represents the median.]
Role

[Chart: percentage of respondents by job title, ranging from engineering manager, full-stack developer, and DevOps specialist through academic researcher, hardware engineer, and student.]

• Developers, representing 29% of the respondents.

• Managers, representing 23% of the respondents (+33% from 2023).

[Chart: employment status, including part-time employees at 2%.]
Work location
Despite another year of return-to-office (RTO) pushes, the pattern from last year has largely been retained, especially toward the tails of the distribution. The 37.5% increase in the median value (from 24% in 2023 to 33% in 2024) does suggest that hybrid work, or at least some regular office visits, is becoming more common.

[Chart: distribution of time spent working in the office by year. Box width represents the 25th and 75th percentiles; the line dissecting the box represents the median.]
We had respondents from 104 different countries. We are always thrilled to see
people from all over the world participate in the survey. Thank you all!
Country

[Table: respondent countries, including the USA, China, Italy, Israel, Singapore, the Russian Federation, Iceland, Bosnia and Herzegovina, Luxembourg, Bolivia, Guatemala, and Trinidad and Tobago, among others.]

[Chart: respondent demographics, including Black respondents at 1.3%.]
1. https://survey.stackoverflow.co/2023/
2. https://www.washingtongroup-disability.com/question-sets/wg-short-set-on-functioning-wg-ss/
Methodology
Survey development
Data collection
Localizations
Collect survey responses

Survey flow
Survey analysis
To understand the internal validity of the measure, we look at what we think indicates the presence of a concept. For example, quality documentation might be indicated by people using their documentation to solve problems.

This year we used the lavaan1 R package to do this analysis. Lavaan returns a variety of fit statistics that help us understand whether constructs actually represent the way people answer the questions.
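As a hedged sketch of that kind of check, here is a minimal lavaan confirmatory factor analysis on simulated data; the construct and item names are invented, not DORA's actual survey items.

```r
library(lavaan)

# Simulated items indicating a hypothetical "documentation quality" construct.
set.seed(1)
latent <- rnorm(200)
survey_df <- data.frame(
  doc1 = 0.9 * latent + rnorm(200, sd = 0.5),
  doc2 = 0.8 * latent + rnorm(200, sd = 0.6),
  doc3 = 0.7 * latent + rnorm(200, sd = 0.7),
  doc4 = 0.6 * latent + rnorm(200, sd = 0.8)
)

fit <- cfa("doc_quality =~ doc1 + doc2 + doc3 + doc4", data = survey_df)

# Fit statistics help us judge whether the construct represents the way
# people answered the questions.
fitMeasures(fit, c("cfi", "tli", "rmsea"))
```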
Similarly, we might expect two constructs to have positive relationships, but not strong ones. Productivity and job satisfaction are likely to be positively correlated, but we don't think they're identical. If the correlation gets too high, we might say it looks like we're measuring the same thing. This would mean that our measures are not calibrated enough to pick up on the differences between the two concepts, or that the difference we hypothesized isn't actually there.

Model evaluation

Using a set of hypotheses as our guiding principle, we build hypothetical models: little toys that try to capture some aspect of how the world works. We examine how well those models fit the data we collected. For evaluating a model, we go for parsimony. This amounts to starting with a very simplistic model2 and adding complexity until the complexity is no longer justified.

For example, we predict that organizational performance is the product of the interaction between software delivery performance and operational performance. Our simplistic model doesn't include the interaction:
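In R's formula notation, that comparison might look like the following sketch; the variables are simulated and a plain linear model stands in for the report's Bayesian models, purely to illustrate the parsimony principle.

```r
# Illustrative only: start simple, add the hypothesized interaction, and keep
# the extra complexity only if the data justify it.
set.seed(1)
n   <- 500
sdp <- rnorm(n)   # software delivery performance (simulated)
op  <- rnorm(n)   # operational performance (simulated)
org_performance <- sdp + op + 0.5 * sdp * op + rnorm(n)

m_simple      <- lm(org_performance ~ sdp + op)   # no interaction
m_interaction <- lm(org_performance ~ sdp * op)   # adds the sdp:op term

anova(m_simple, m_interaction)  # is the added complexity justified?
```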
Directed Acyclic Graphs for Causal Inference

A validated model tells us what we need to know to start thinking causally. We talk about the challenges of thinking causally below. Here are some reasons why we're trying to talk causally:

We think your question is fundamentally a causal one. You want to know if doing something is going to create something. You are not going to invest in doing something if you just think there is a non-causal correlation.

The results of our analyses depend on our causal understanding of the world. The actual numbers we get from the regression change based on what we include in the regression. What we include in the regression should depend on how we think the data is generated, which is a causal claim. Hence, we should be clear.

Causal thinking is where our curiosity will take us and where we all spend a lot of time. We are often wondering about how the various aspects of the world are connected and why. We don't need to run experiments on every facet of our lives to think causally about them.

We are able to use the validated model to tell us what we need to account for to understand an effect. In short, it lets us try to get our data into the form of an A/B experiment, where one tries to create two identical worlds with only one difference between them. The logic suggests that any differences that emerge between those two worlds are attributable to that initial difference.

In observational data and survey data, things are not as clearly divided — many things are different between participants, which introduces confounds. Our method of causal inference tries to account for these differences in an attempt to mimic an experiment — that is, holding everything constant except for one thing (for example, AI adoption).

Let's take the classic example of ice cream "causing" shark attacks. There is a problem in that observation, namely that people tend to eat ice cream on hot days and also go to the beach on hot days. The situation where people tend to eat ice cream and go to the beach is not the same as the situation where people tend not to eat ice cream and not go to the beach. The data isn't following the logic of an experiment. We've got a confounding variable: temperature.
Directed Acyclic Graphs (DAGs) help you identify the ways in which the world is different and offer approaches to remedy the situation, to try to mimic an experiment by making everything in the world except one thing constant. Let’s see how the DAG directs us in the ice cream and shark attack example, where we want to quantify the impact of ice cream consumption on shark attacks:

[Figure: a DAG in which temperature points to both ice cream consumption and shark attacks, and ice cream consumption points to shark attacks.]

I draw my model, tell the tool what effect I want to understand, and the tool tells me what is going to bias my estimate of the effect. In this case, the tool says that I cannot estimate the effect of ice cream consumption on shark attacks without adjusting for temperature, which is a statistical approach of trying to make everything equal besides ice cream consumption and then seeing if shark attacks continue to fluctuate as a function of ice cream consumption.
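As an illustration of that workflow, here is a minimal sketch using the dagitty R package, one common tool for this job (the node names are ours):

# Encode the DAG: temperature is a common cause of both variables.
library(dagitty)
dag <- dagitty("dag {
  temperature -> ice_cream
  temperature -> shark_attacks
  ice_cream -> shark_attacks
}")

# Ask what must be adjusted for to estimate ice cream's effect.
adjustmentSets(dag, exposure = "ice_cream", outcome = "shark_attacks")
# Returns: { temperature }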
The directed acyclic graph tells us
what to account for in our analyses
of particular effects.
Bayesian statistics
What do you mean by “simulation”?
It isn’t that we made up the data. We use Bayesian statistics to calculate a posterior, which tries to capture “the expected frequency that different parameter values will appear.”8 The “simulation” part is drawing from this posterior more than 1,000 times to explore the values that are most credible for a parameter (mean, beta weight, sigma, intercept, etc.) given our data.

“Imagine the posterior is a bucket full of parameter values, numbers such as 0.1, 0.7, 0.5, 1, etc. Within the bucket, each value exists in proportion to its posterior probability, such that values near the peak are much more common than those in the tails.”9

This all amounts to our using simulations to explore possible interpretations of the data and get a sense of how much uncertainty there is. You can think of each simulation as a little AI that knows nothing besides our data and a few rules, trying to fill in a blank (parameter) with an informed guess. You do this 4,000 times and you get the guesses of 4,000 little AIs for a given parameter.

You can learn a lot from these guesses. You can learn what the average guess is, between which values 89%10 of these guesses fall, how many guesses are above a certain level, how much variation there is in these guesses, etc. You can even do fun things like combine guesses (simulations) across many models.

When we show a graph with a bunch of lines or a distribution of potential values, we are trying to show you what is most plausible given our data and how much uncertainty there is.
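As a hedged, self-contained illustration (a toy grid approximation in base R, not our actual models), this is what drawing 4,000 guesses from a posterior can look like:

# Toy example: posterior for a proportion, approximated on a grid.
p_grid <- seq(0, 1, length.out = 1000)            # candidate parameter values
prior <- rep(1, 1000)                             # weak, flat prior
likelihood <- dbinom(6, size = 9, prob = p_grid)  # e.g., 6 successes in 9 trials
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)

# Draw 4,000 "informed guesses" from the posterior.
set.seed(10)
samples <- sample(p_grid, size = 4000, replace = TRUE, prob = posterior)

mean(samples)                       # the average guess
quantile(samples, c(0.055, 0.945))  # the middle 89% of guesses
mean(samples > 0.5)                 # how many guesses are above a certain level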
Interviews

This year, we supplemented our annual survey with in-depth, semi-structured interviews to triangulate, contextualize, and clarify our quantitative findings. The interview guide paralleled the topics included in our survey and was designed for sessions to last approximately 75 minutes each, conducted remotely via Google Meet.

In total, we interviewed 11 participants whose profiles matched the inclusion criteria of our survey. All interviews were video- and audio-recorded. Sessions lasted between 57 minutes and 85 minutes, totaling 14 hours and 15 minutes of data collected across all participants. Participants’ data were pseudonymized using identifiers in the form of P(N), where N corresponds to the order in which they were interviewed.

Inferential leaps in results

Our goal is to create a pragmatic representation of the world, something that we can all leverage to help improve the way we work. We know there is complexity we’re simplifying. That is kind of the point of the model. Jorge Luis Borges has a very short story, called “On Exactitude in Science”, where he talks of an empire that makes maps of the empire on a 1:1 scale.11 The absurdity is that this renders the map absolutely useless (at least that’s my interpretation). The simplifications we make are supposed to be helpful.

That said, there are some inferential leaps that we want to be clear about.
Causality
According to John Stuart Mill, you needed to check three boxes to say X causes Y:12

• Correlation: X needs to covary with Y.

• Temporal precedence: X needs to happen before Y.

• Biasing pathways are accounted for (as described in the DAG section above).

We feel confident that we can understand correlation — that’s often a standard statistical procedure. Our survey is capturing a moment in time, so temporal precedence is theoretical, not part of our data. With our directed acyclic graphs, we do the work to account for biasing pathways, but that is a highly theoretical exercise, one that, unlike temporal precedence, has implications that can be explored in the data.

This is all to say that we didn’t do longitudinal studies or a proper experiment. Despite this, we think causal thinking is how we understand the world, and we try our best to use emerging techniques in causal inference to provide you with good estimates. Correlation does not imply causation, but it does imply how you think about causation.
Micro-level phenomena -> Macro-level phenomena
Often we take capabilities at an individual level and see how those connect to higher levels. For example, we tied the individual adoption of AI to an application or service and to team performance. This isn’t terribly intuitive at first glance. The story of a macro-level phenomenon causing an individual-level phenomenon is usually easier to tell. Inflation (macro) impacting whether I buy eggs (micro) seems like a more palatable story than me not buying eggs causing inflation.

The same is true for an organization's performance (macro) impacting an individual’s well-being (micro). As a heuristic, it is likely the organization exerts more of an influence on the individual than the individual on the organization.

So, why do we even bother saying an individual action impacts something like team or organizational performance? We make an inferential leap that we think isn’t completely illogical. Namely, we assume that at scale, the following statement tends to be true:

p(individual does X | organization does X) > p(individual does X | organization doesn’t do X).

That is, we believe that the probability of an individual doing something (X) is higher when they are in an organization or a team that also does X. Hence, individuals who do something represent teams and organizations that also tend to do X. Of course the noise here is pretty loud, but the pattern should emerge and allow this assumption to give us some important abilities.

Let’s back up for an example outside of DORA: imagine two different countries where the average height differs. In one country, people have an average height of 5’6”. The other’s average height is 6’2”. The standard deviation is identical. If you picked a person at random from each country, which country do you think the taller person would be more likely to be drawn from? If you do this thousands of times, taller countries would be represented by taller people. The height of the individuals would loosely approximate the heights of the countries.
Not that it is necessary, but we ran a quick simulation to validate that this is true:

# R code
# set seed for reproducibility
set.seed(10)
# 6'2" and 5'6" expressed in feet
height_means = c(6 + 1/6, 5.5)
# constant standard deviation at 1/4 of a foot
std_dev = 0.25
# number of random draws
draws = 1000
# the published snippet stops here; the lines below are one way to complete it:
# draw one person at random from each country on every draw
tall_country = rnorm(draws, mean = height_means[1], sd = std_dev)
short_country = rnorm(draws, mean = height_means[2], sd = std_dev)
# proportion of draws where the taller country yields the taller person
mean(tall_country > short_country)

The results are unsurprising. 97.2% of the 1,000 random draws are in the correct direction. Of course, it would be easy to get fooled with non-random draws, smaller differences between the countries, and small samples. Still, the point stands: differences at the macro level tend to be represented in the micro level.
1. Rosseel Y (2012). “lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software, 48(2), 1–36.
https://doi.org/10.18637/jss.v048.i02
2. This would also involve the examination of potential confounds.
3. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. N.p.: Cambridge University Press.
4. McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. N.p.: CRC Press/Taylor & Francis
Group.
5. Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories. N.p.: Cambridge University Press.
6. McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. N.p.: CRC Press/Taylor & Francis
Group.
7. Our priors tend to be weak (skeptical, neutral, and low information) and we check that the results are not conditioned by our
priors.
8. McElreath, Richard. Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC, 2018, pg. 50
9. McElreath, Richard. Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC, 2018, pg. 52
10. Followed McElreath’s reasoning in Statistical Rethinking, pg. 56 for choosing 89%. “Why these values? No reason… And these
values avoid conventional 95%, since conventional 95% intervals encourage many readers to conduct unconscious hypothesis
tests.” The interval we’re providing is simply trying to show a plausible “range of parameter values compatible with the model and
data”.
11. Borges, J. L. (1999). Collected fictions. Penguin.
12. Duckworth, Angela Lee, Eli Tsukayama, and Henry May. “Establishing causality using longitudinal hierarchical linear modeling: An
illustration predicting achievement from self-control.” Social psychological and personality science 1, no. 4 (2010): 311-317.
Models
How do we use the models?
We all have a lot of questions, but many vital questions have the following form: if we do X, what happens to Y?

X is usually a practice, such as creating quality documentation, adopting AI, or investing in culture. Y is usually something that we care about achieving or avoiding, which could happen at the individual level (for example, productivity) up to the organizational level (for example, market share).

We construct, evaluate, and use the models3 with the goal of addressing questions of this form. We work to provide an accurate estimate of what happens to important outcomes as a result of doing X.4 When we report effects, we convey two vital features:

1. How much certainty we have in the direction of the effect, that is, how clear is it that this practice will be beneficial or detrimental?

2. How much certainty we have in the magnitude of the effect. We will provide an estimate and a relative sense of how impactful certain practices are and the degree of uncertainty surrounding these estimates (see the sketch after the lists below).

Here are some of this year's capabilities of interest:

• AI adoption
• platform use
• platform age
• transformational leadership
• priority stability
• user centricity

Here are some of this year’s outcomes and outcome groups:

• individual performance and well-being (for example, burnout)
• team performance
• product performance
• development workflow (for example, codebase complexity and document quality)
• software delivery performance
• organizational performance
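As a hedged illustration of those two features (with simulated stand-in numbers, not our actual posterior draws), here is how they can be read off a set of posterior draws for an effect in R:

# Illustrative sketch only: beta_draws stands in for 4,000 posterior
# draws of a capability's effect on an outcome.
set.seed(10)
beta_draws <- rnorm(4000, mean = 0.3, sd = 0.15)

# 1. Certainty in the direction: the share of draws above zero.
mean(beta_draws > 0)

# 2. Magnitude and its uncertainty: the middle 89% of credible values.
quantile(beta_draws, probs = c(0.055, 0.945))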
We focus on these outcomes because we believe that they are ends in themselves. Of course, that is more true for some of these outcomes than others. If you found out that organizational performance and team performance had nothing to do with software delivery performance, you would probably be okay having low software delivery performance.

We hope, however, that even if organizational performance did not depend on individual well-being, you would still want to prioritize the well-being of employees.

A repeated model

We developed and explored many nuanced hypotheses over the past three years, especially about moderation and mediation.

This year, we spent less time focusing on those types of hypotheses and more time trying to estimate a capability’s effect on an outcome. This means that the model for each capability is largely the same.

Hence, the model for AI adoption’s effects is very similar in design to the model for user centricity’s effects. We could copy the model and change the name of the capability, but that might not be terribly useful for you.
[Figure: the shape of the repeated model, using AI adoption as the example capability. Legend: Capability, the practice, state, or trait of interest as a potential cause; Effect of interest, the effects that we wanted to quantify and report to you; Auxiliary effect, an effect we did not focus on quantifying this year, part of our model but not used. The diagram connects AI adoption to individual performance and well-being, team performance, software delivery performance, organizational performance, development workflow, and product performance, with each connection labeled mostly helps, helps, harms, or no effect.]
1. Gelman et al.’s Regression and Other Stories offers some important tips on pages 495 through 496 that seem illuminating: B.6 Fit many models, and B.9 Do causal inference in a targeted way, not as a byproduct of a large regression.
2. A great discussion about this can be found in chapter 6 of Statistical Rethinking. I am talking particularly about collider bias.
3. See the conversation about how these models are tied with directed acyclic graphs in the methodology chapter
4. We talk about causality briefly in the methods chapter.
Recommended reading
Join the DORA Community to discuss, learn, and collaborate on improving software delivery and operations performance. https://dora.community

Read the book: Team Topologies: Organizing Business and Technology Teams for Fast Flow. IT Revolution Press. https://teamtopologies.com/