1.2 R - Lean Software Development - Fragment
When it replaced the sequential development practices typical at the time, agile software development usually improved the development process – in IT departments as well as in product development organizations. However, the expected
organizational benefits of agile often failed to materialize because agile focused on
optimizing software development, which frequently was not the system constraint. Lean
software development differed from agile in that it worked to optimize flow efficiency
across the entire value stream “from concept to cash.” (Note the subtitle of the book
Implementing Lean Software Development: From Concept to Cash (Poppendieck, 2006)).
This end-to-end view was consistent with the work of Taiichi Ohno, who said:
“All we are doing is looking at the time line, from the moment the customer gives us an
order to the point when we collect the cash. And we are reducing that time line by
removing the non-value-added wastes.” (Ohno, 1988, p. ix)
Lean software development came to focus on these areas:
1. Build the right thing: Understand and deliver real value to real customers.
2. Build it fast: Dramatically reduce the lead time from customer need to delivered
solution.
3. Build the thing right: Guarantee quality and speed with automated testing,
integration and deployment.
4. Learn through feedback: Evolve the product design based on early and frequent
end-to-end feedback.
1. Understand and deliver real value to real customers.
A software development team working with a single customer proxy has one view of the
customer interest, and often that view is not informed by technical experience or feedback
from downstream processes (such as operations). A product team focused on solving real
customer problems will continually integrate the knowledge of diverse team members, both
upstream and downstream, to make sure the customer perspective is truly understood and
effectively addressed. Clark and Fujimoto call this “integrated problem solving” and
consider it an essential element of lean product development.
2. Dramatically reduce the lead time from customer need to delivered solution.
A focus on flow efficiency is the secret ingredient of lean software development. How long
does it take for a team to deploy into production a single small change that solves a
customer problem? Typically it can take weeks or months – even when the actual work
involved consumes only an hour. Why? Because subtle dependencies among various areas
of the code make it probable that a small change will break other areas of the code;
therefore it is necessary to deploy large batches of code as a package after extensive
(usually manual) testing. In many ways the decade of 2000-2010 was dedicated to finding
ways to break dependencies, automate the provisioning and testing processes, and thus
allow rapid independent deployment of small batches of code.
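As a rough illustration of how batching inflates lead time, consider the sketch below. The release interval, test cycle, and working hours are invented assumptions for a back-of-envelope model, not figures from any particular team:

    # Back-of-envelope model of why batching inflates lead time: a one-hour
    # change waits for the next release train and its batch test cycle.
    # All numbers are illustrative assumptions.

    HOURS_PER_WEEK = 40.0

    def lead_time_hours(release_interval_weeks, test_cycle_weeks, work_hours=1.0):
        """Average wait for the next release, plus batch testing, plus the work."""
        avg_wait = (release_interval_weeks / 2) * HOURS_PER_WEEK
        testing = test_cycle_weeks * HOURS_PER_WEEK
        return work_hours + avg_wait + testing

    # Releases every 6 weeks with a 2-week manual test cycle:
    print(lead_time_hours(6, 2))   # 201 hours of lead time for 1 hour of work
    # Continuous deployment of small, independent changes:
    print(lead_time_hours(0, 0))   # 1 hour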
3. Guarantee quality and speed with automated testing, integration and deployment.
Next, the operations people got involved and automated the provisioning of environments
for development, testing, and deployment. Finally teams (which now included operations)
could automate the entire specification, development, test, and deployment processes –
creating an automated deployment pipeline. There was initial fear that more rapid
deployment would cause more frequent failure, but exactly the opposite happened.
Automated testing and frequent deployment of small changes meant that risk was limited.
When errors did occur, detection and recovery were much faster and easier, and the team became a lot better at it. Far from increasing risk, it is now known that deploying code frequently in small batches is the best way to reduce risk and increase the stability of large
complex code bases.
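To make the pipeline idea concrete, here is a minimal sketch; the stage names and checks are hypothetical placeholders for real build, test, and deployment tooling:

    # Minimal deployment-pipeline sketch: every small change passes through the
    # same automated stages, and the first failure stops the change from shipping.

    def run_pipeline(change_id, stages):
        """Run a change through each named stage; stop at the first failure."""
        for name, stage in stages:
            if not stage(change_id):
                print(change_id, "failed at", name, "- change rejected")
                return False
        print(change_id, "deployed")
        return True

    # Illustrative stages; real ones call compilers, test runners, deploy scripts.
    stages = [
        ("build", lambda c: True),
        ("unit tests", lambda c: True),
        ("integration tests", lambda c: True),
        ("deploy", lambda c: True),
    ]

    run_pipeline("change-42", stages)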
4. Evolve the product design based on early and frequent end-to-end feedback.
To cap these remarkable advancements, once product teams could deploy multiple times
per day they began to close the loop with customers. Through canary releases, A/B testing,
and other techniques, product teams learned from real customers which product ideas
worked and how to fine tune their offerings for better business results.
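As an illustration of the canary-release mechanics, the following sketch routes a small share of traffic to a new version and promotes it only if its error rate stays close to the stable version's. The traffic split, counts, and tolerance are assumptions chosen for the example:

    import random

    # Canary-release sketch: route a small share of traffic to the new version
    # and promote it only if its error rate stays close to the stable version's.

    def route(canary_share):
        """Assign one incoming request to the canary or the stable version."""
        return "canary" if random.random() < canary_share else "stable"

    def should_promote(canary_errors, canary_requests,
                       stable_errors, stable_requests, tolerance=0.001):
        """Promote when the canary error rate is within tolerance of stable."""
        canary_rate = canary_errors / canary_requests
        stable_rate = stable_errors / stable_requests
        return canary_rate <= stable_rate + tolerance

    # 5% canary traffic, then a promotion decision from observed counts:
    assignments = [route(0.05) for _ in range(10_000)]
    print(assignments.count("canary"), "requests went to the canary")
    print("promote:", should_promote(3, 500, 60, 9_500))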
Over the next few years, the ideas in these books became mainstream, and software development practice gradually expanded beyond the limitations of agile (a software-only perspective and iteration-based delivery) to address a wider part of the value stream and a more rapid flow. A
grassroots movement called DevOps worked to make automated provision-code-build-test-
deployment pipelines practical. Cloud computing arrived, providing easy and automated
provisioning of environments. Cloud elements (virtual machines, containers), services
(storage, analysis, etc.) and architectures (microservices) made it possible for small services
and applications to be easily and rapidly deployed. Improved testing techniques
(simulations, contract assertions) have made error-free deployments the norm.
Today’s successful internet companies have learned how to optimize software development
over the entire value stream. They create full stack teams that are expected to understand
the consumer problem, deal effectively with tough engineering issues, try multiple
solutions until the data shows which one works best, and maintain responsibility for
improving the solution over time. Large companies with legacy systems have begun to take
notice, but they struggle with moving from where they are to the world of thriving internet
companies.
Lean principles are a big help for organizations that want to move from old development
techniques to modern software approaches. For example, (Calçado, 2015) shows how
classic lean tools – Value Stream Mapping and problem solving with Five Whys – were
used to increase flow efficiency at SoundCloud, leading over time to a microservices
architecture. In fact, focusing on flow efficiency is an excellent way for an organization to
discover the most effective path to a modern technology stack and development approach.
For traditional software development, flow efficiency is typically lower than 10%; agile
practices usually bring it up to 30 or 40%. But in thriving internet companies, flow
efficiency approaches 70% and is often quite a bit higher. Low flow efficiencies are caused
by friction – in the form of batching, queueing, handovers, delayed discovery of defects, as
well as misunderstanding of consumer problems and changes in those problems during long
resolution times. Improving flow efficiency involves identifying and removing the biggest
sources of friction from the development process.
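Flow efficiency itself is a simple ratio: value-adding time divided by total elapsed lead time. The sketch below uses invented hours chosen to land in the ranges quoted above:

    # Flow efficiency = value-adding time / total elapsed lead time.
    # The hours below are invented to illustrate the ranges quoted in the text.

    def flow_efficiency(value_added_hours, lead_time_hours):
        return value_added_hours / lead_time_hours

    print(f"traditional: {flow_efficiency(8, 120):.0%}")   # ~7%, mostly waiting
    print(f"agile:       {flow_efficiency(8, 24):.0%}")    # ~33%
    print(f"lean/flow:   {flow_efficiency(8, 11):.0%}")    # ~73%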
Modern software development practices – the ones used by successful internet companies –
address the friction in software development in a very particular way. The companies start
by looking for the root causes of friction, which usually turn out to be 1) misunderstanding
of the customer problem, 2) dependencies in the code base and 3) information and time lost
during handovers and multitasking. Therefore they focus on three areas: 1) understanding
the consumer journey, 2) architecture and automation to expose and reduce dependencies,
and 3) team structures and responsibilities. Today (2015), lean development in software
usually focuses on these three areas as the primary way to increase efficiency, assure
quality, and improve responsiveness in software-intensive systems.
Other internet companies, including Google and Facebook, have maintained existing
architectures but developed sophisticated deployment pipelines that automatically send
each small code change through a series of automated tests with automatic error handling.
The deployment pipeline culminates in safe deployments which occur at very frequent
intervals; the more frequent the deployment, the easier it is to isolate problems and
determine their cause. In addition, these automation tools often contain dependency maps
so that feedback on failures can be sent directly to the responsible engineers and offending
code can be automatically reverted (taken out of the pipeline in a safe manner).
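That failure-routing idea can be sketched as follows; the ownership map, file paths, and engineer names are hypothetical stand-ins for what real pipeline tooling derives from a build's dependency graph:

    # Dependency-map sketch: route a test failure to the engineers responsible
    # for the changed area, then revert the offending change from the pipeline.

    OWNERS = {
        "billing/": ["dana"],
        "render/": ["lee", "sam"],
    }

    def handle_result(changed_paths, failed):
        if not failed:
            print("change kept in the pipeline")
            return
        for path in changed_paths:
            for area, engineers in OWNERS.items():
                if path.startswith(area):
                    print("failure in", path, "- notifying", ", ".join(engineers))
        print("reverting change (removed from the pipeline safely)")

    handle_result({"billing/invoice.py"}, failed=True)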
These architectural structures and automation tools are a key element in a development
approach that uses Big Data combined with extremely rapid feedback to improve the
consumer journey and solve consumer problems. They are most commonly found in
internet companies, but are being used in many others, including organizations that develop
embedded software. (See case study, below.)
When consumer empathy, data analytics and very rapid feedback are combined, there is
one more point of friction that can easily reduce flow efficiency. If an organization has not
delegated responsibility for product decisions to the team involved in the rapid feedback
loop, the benefits of this approach are lost. In order for such feedback loops to work, teams
with a full stack of capabilities must be given responsibility to make decisions and
implement immediate changes based on the data they collect. Typically such teams include
people with product, design, data, technology, quality, and operations backgrounds. They
are responsible for improving a set of business metrics rather than delivering a set of
features. An example of this would be the UK Government Digital Service (GDS), where
teams are responsible for delivering improvements in four key areas: cost per transaction,
user satisfaction, transaction completion rate, and digital take-up.
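As a small illustration, these four measures reduce to simple ratios over a service's raw figures; the numbers below are invented, not actual GDS data:

    # The four GDS-style metrics as simple ratios; all inputs are invented.

    def service_metrics(total_cost, completed, started, digital, all_channels,
                        satisfaction_scores):
        return {
            "cost per transaction": total_cost / completed,
            "user satisfaction": sum(satisfaction_scores) / len(satisfaction_scores),
            "transaction completion rate": completed / started,
            "digital take-up": digital / all_channels,
        }

    print(service_metrics(total_cost=50_000.0, completed=9_000, started=10_000,
                          digital=9_000, all_channels=12_000,
                          satisfaction_scores=[4.2, 4.5, 3.9]))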
It is interesting to note that UK law makes it difficult to base contracts on such metrics, so
GDS staffs internal teams with designers and software engineers and makes them
responsible for the metrics. Following this logic to its conclusion, the typical approach of
IT departments – contracting with their business colleagues to deliver a pre-specified set of
features – is incompatible with full stack teams responsible for business metrics. In fact, it
is rare to find separate IT departments in companies founded after the mid-1990s (which
includes virtually all internet companies). Instead, these newer companies place their
software engineers in line organizations, reducing the friction of handovers between
organizations.
— Case Study —
Hewlett Packard LaserJet Firmware
The HP LaserJet firmware department had been the bottleneck of the LaserJet product line
for a couple of decades, but by 2008 the situation had turned desperate. Software was
increasingly important for differentiating the printer line, but the firmware department
simply could not keep up with the demand for more features. Department leaders tried to
spend their way out of the problem, but more than doubling the number of engineers did
little to help. So they decided to engineer a solution by reengineering the development process.
The starting point was to quantify exactly where all the engineers’ time was going. Fully
half of the time went to updating existing LaserJet printers or porting code between
different branches that supported different versions of the product. A quarter of the time went
to manual builds and manual testing, yet despite this investment, developers had to wait for
days or weeks after they made a change to find out if it worked. Another twenty percent of
the time went to planning how to use the five percent of time that was left to do any new
work. The reengineered process would have to radically reduce the effort needed to
maintain existing firmware, while seriously streamlining the build and test process. The
planning process could also use some rethinking.
It’s not unusual to see a technical group use the fact that they inherited a messy legacy code
base as an excuse to avoid change. Not in this case. As impossible as it seemed, a new
architecture was proposed and implemented that allowed all printers – past, present and
even future – to operate off the same code branch, determining printer-specific
capabilities dynamically instead of having them embedded in the firmware. Of course this
required a massive change, but the department tackled one monthly goal after another and
gradually implemented the new architecture. But changing the architecture would not solve
the problem if the build and test process remained slow and cumbersome, so the engineers
methodically implemented techniques to streamline that process. In the end, a full
regression test – which used to take six weeks – was routinely run overnight. Yes, this
involved a large amount of hardware, simulation and emulation, and yes it was expensive.
But it paid for itself many times over.
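The core of that architectural change can be sketched as a single code path that looks up device capabilities at runtime instead of compiling them into per-printer branches. The models and capability fields below are hypothetical, not HP's actual design:

    # Single-branch sketch: capabilities are looked up dynamically at runtime
    # rather than being embedded in per-printer firmware branches.

    DEVICE_PROFILES = {
        # model: (duplex, color, pages per minute) - invented examples
        "model-a": (True, False, 40),
        "model-b": (False, True, 25),
    }

    def print_job(model, pages, want_duplex):
        duplex, color, ppm = DEVICE_PROFILES[model]  # runtime, not compile time
        use_duplex = want_duplex and duplex
        print(f"{model}: {pages} pages, duplex={use_duplex}, "
              f"eta={pages / ppm:.1f} min")

    print_job("model-a", pages=80, want_duplex=True)
    print_job("model-b", pages=80, want_duplex=True)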
During the recession of 2008 the firmware department was required to return to its previous
staffing levels. Despite a 50% headcount reduction, there was a 70% reduction in cost per
printer program once the new architecture and automated provisioning system were in place
in 2011. At that point there was a single code branch and twenty percent of engineering
time was spent maintaining the branch and supporting existing products. Thirty percent of
engineering time was spent on the continuous delivery infrastructure, including build and
test automation. Wasted planning time was reclaimed by delaying speculative decisions and
making choices based on short feedback loops. And there was something to plan for,
because over forty percent of the engineering time was available for innovation.
This multi-year transition was neither easy nor cheap, but it absolutely was worth the effort.
If you would like more detail, see (Gruver et al., 2013).
A more recent case study of how the software company Paddy Power moved to continuous
delivery can be found in (Chen, 2015). In this case study the benefits of continuous delivery
are listed: improved customer satisfaction, accelerated time to market, building the right
product, improved product quality, reliable releases, and improved productivity and
efficiency. There is really no downside to continuous delivery. Of course it is a challenging
engineering problem that can require significant architectural modifications to existing code
bases as well as sophisticated pipeline automation. But technically, continuous delivery is
no more difficult than other problems software engineers struggle with every day. The real
stumbling block is the change in organizational structure and mindset required to achieve
serious improvements in flow efficiency.
References
Anderson, David. Kanban, Blue Hole Press, 2010
Beck, Kent. Extreme Programming Explained, Addison-Wesley, 2000
Beck, Kent et al. Manifesto for Agile Software Development, http://agilemanifesto.org/,
2001
Calçado, Phil. How we ended up with microservices,
http://philcalcado.com/2015/09/08/how_we_ended_up_with_microservices.html, 2015
Chen, Lianping. "Continuous Delivery: Huge Benefits, but Challenges Too," IEEE
Software 32 (2), 50-54, 2015
Clark, Kim B. and Takahiro Fujimoto. Product Development Performance, Harvard
Business School Press, 1991
Gruver, Gary, Mike Young, and Pat Fulghum. A Practical Approach to Large-Scale Agile
Development, Pearson Education, 2013
Humble, Jez and David Farley. Continuous Delivery, Addison-Wesley Professional, 2010
Modig, Niklas, and Par Ahlstrom. This is Lean, Stockholm: Rheologica Publishing, 2012
Morgan, James M. and Jeffrey K. Liker. The Toyota Product Development System,
Productivity Press, 2006
Ohno, Taiichi. Toyota Production System, English translation, Productivity, Inc., 1988
(published in Japanese in 1978)
Poppendieck, Mary and Tom. Lean Software Development, Addison-Wesley, 2003
Poppendieck, Mary and Tom. Implementing Lean Software Development, Addison-Wesley,
2006
Porter, Michael E. and James E. Heppelmann. How Smart, Connected Products are
Transforming Companies, Harvard Business Review 93 (10), 97-112, 2015
Reinertsen, Donald G. Managing the Design Factory, The Free Press, 1997
Ries, Eric. The Lean Startup, Crown Business, 2011
Smith, Preston G. and Donald G. Reinertsen. Developing Products in Half the Time, Van
Nostrand Reinhold, 1991
Ward, Allen. Lean Product and Process Development, Lean Enterprise Institute, 2007
Womack, James P., Daniel T. Jones, and Daniel Roos. The Machine That Changed the
World: The Story of Lean Production, Rawson Associates, 1990