s192 - The Open Group Agile Architecture Framework
The Open Group Agile Architecture Framework™
Draft Standard
The Open Group Snapshot
Table of Contents
The Open Group Agile Architecture Framework Draft Standard
Preface
This Document
Trademarks
Acknowledgments
Referenced Documents
Normative References
Informative References
1. Introduction
1.2. Overview
1.3. Conformance
1.5. Terminology
2. Definitions
3. A Dual Transformation
3.4. Product-Centricity
4. What is Architecture?
5.1. Introduction
5.1.1. Refactoring
5.1.2. Architectural
5.1.3. Continuous
5.3.3. Guardrails
5.4.2. Componentization
6.8. Accountability
Part 2: Playbooks
11.6. Benefits
Appendices
Index
NOTICE
Snapshot documents are draft standards, which provide a mechanism for The Open
Group to disseminate information on its current direction and thinking to an interested
audience, in advance of formal publication, with a view to soliciting feedback and
comment.
This Snapshot document is intended to make public the direction and thinking about
the path we are taking in the development of The Open Group Agile Architecture
Framework Standard. We invite your feedback and guidance. To provide feedback on
this Snapshot document, please send comments by email to ogspecs-snapshot-
[email protected] no later than January 15, 2020.
The Open Group hereby authorizes you to use this document for any purpose, PROVIDED THAT any
copy of this document, or any part thereof, which you make shall retain all copyright and other
proprietary notices contained herein.
This document may contain other proprietary notices and copyright information.
Nothing contained herein shall be construed as conferring by implication, estoppel, or otherwise any
license or right under any patent or trademark of The Open Group or any third party. Except as
expressly provided above, nothing contained herein shall be construed as conferring any license or
right under any copyright of The Open Group.
Note that any product, process, or technology in this document may be the subject of other intellectual
property rights reserved by The Open Group, and may not be licensed hereunder.
This document is provided “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. Some jurisdictions do not allow the
exclusion of implied warranties, so the above exclusion may not apply to you.
Any publication of The Open Group may include technical inaccuracies or typographical errors.
Changes may be periodically made to these publications; these changes will be incorporated in new
editions of these publications. The Open Group may make improvements and/or changes in the
products and/or the programs described in these publications at any time without notice.
Should any viewer of this document respond with information including feedback data, such as
questions, comments, suggestions, or the like regarding the content of this document, such information
shall be deemed to be non-confidential and The Open Group shall have no obligation of any kind with
respect to such information and shall be free to reproduce, use, disclose, and distribute the
information to others without limitation. Further, The Open Group shall be free to use any ideas,
concepts, know-how, or techniques contained in such information for any purpose whatsoever
including but not limited to developing, manufacturing, and marketing products incorporating such
information.
If you did not obtain this copy through The Open Group, it may not be the latest version. For your
convenience, the latest version of this publication may be downloaded at www.opengroup.org/library.
DRAFT: Built with asciidoctor, version 2.0.9. Backend: pdf Build date: 2019-07-22 17:42:03 +0100
Preface
The Open Group
The Open Group is a global consortium that enables the achievement of business objectives through
technology standards. Our diverse membership of more than 700 organizations includes customers,
systems and solutions suppliers, tools vendors, integrators, academics, and consultants across multiple
industries.
The mission of The Open Group is to drive the creation of Boundaryless Information Flow™ achieved
by:
• Working with customers to capture, understand, and address current and emerging requirements,
establish policies, and share best practices
• Working with suppliers, consortia, and standards bodies to develop consensus and facilitate
interoperability, to evolve and integrate specifications and open source technologies
• Developing and operating the industry’s premier certification service and encouraging
procurement of certified products
The Open Group publishes a wide range of technical documentation, most of which is focused on
development of Standards and Guides, but which also includes white papers, technical studies,
certification and testing documentation, and business titles. Full details and a catalog are available at
www.opengroup.org/library.
This Document
This document is a Snapshot of what is intended to become The Open Group Agile Architecture
Framework™ Standard, also known as the O-AAF™ Standard. It is being developed by The Open Group.
• Part 1: Agile Architecture Fundamentals gives an overview of this document and introduces the key
concepts
This first Snapshot document has fully developed an initial release of the architecture fundamentals
section. The other sections are still incomplete and will be completed in the next version.
• Agilists who need to understand the importance of architecture when shifting toward an Agile at
scale model and who want to learn architecture skills
• Enterprise Architects who want to stay relevant in an Agile at scale world and who need to learn
new architecture skills for the digital age
• Business managers and executives who need to learn the importance of the architecture discipline
and who need to influence architecture decisions
Trademarks
ArchiMate®, DirecNet®, Making Standards Work®, Open O® logo, Open O and Check® Certification
logo, OpenPegasus®, Platform 3.0®, The Open Group®, TOGAF®, UNIX®, UNIXWARE®, and the Open
Brand X® logo are registered trademarks and Boundaryless Information Flow™, Build with Integrity
Buy with Confidence™, Dependability Through Assuredness™, Digital Practitioner Body of
Knowledge™, DPBoK™, EMMM™, FACE™, the FACE™ logo, IT4IT™, the IT4IT™ logo, O-DEF™, O-HERA™,
O-PAS™, Open FAIR™, Open Platform 3.0™, Open Process Automation™, Open Subsurface Data
Universe™, Open Trusted Technology Provider™, O-SDU™, Sensor Integration Simplified™, SOSA™, and
the SOSA™ logo are trademarks of The Open Group.
CMMI® and PCMM® are registered trademarks of CMMI Institute LLC, USA.
ISACA® is a registered trademark of the Information Systems Audit and Control Association.
All other brands, company, and product names are used for identification purposes only and may be
trademarks that are the sole property of their respective owners.
Acknowledgments
The Open Group gratefully acknowledges the contribution of the following people in the development
of this document:
• Miguel de Andrade
• Paddy Fagan
• Jérémie Grodziski
• Peter Haviland
• Frédéric Le
• Jean-Pierre Le Cam
• Antoine Lonjon
• Eamonn Moriarty
Referenced Documents
The following documents are referenced in this Snapshot.
(Please note that the links below are good at the time of writing but cannot be guaranteed for the
future.)
Normative References
This document does not contain any normative references at the time of publication. These may be
added in a future release.
Informative References
• [Agile Manifesto] Manifesto for Agile Software Development, 2001: https://agilemanifesto.org/
• [Bain 2014] Winning Operating Models that Convert Strategy to Results, Marcia Blenko, Eric Garton,
Ludovica Mottura, Bain & Company, December 2014:
https://www.bain.com/insights/winning-operating-models-that-convert-strategy-to-results/, retrieved June 6, 2019
• [Baiyere 2017] Designing for Digital – Lessons from Spotify™, Abayomi Baiyere, Jeanne W. Ross, Ina
M. Sebastian, Research Briefing, MIT Sloan CISR, December 2017
• [Ballé 2019] Lean is a Product-Driven Strategy, Michael Ballé, April 2019, retrieved July 4, 2019:
https://www.lean.org/LeanPost/Posting.cfm?LeanPostId=1024
• [Chheda 2017] Putting Customer Experience at the Heart of Next-generation Operating Models,
Shital Chheda, Ewan Duncan, Stefan Roggenhofer, Digital McKinsey, 2017
• [Christensen 2016] Know Your Customers’ “Jobs-to-be-done”, Clayton M. Christensen, Taddy Hall,
Karen Dillon, David S. Duncan, Harvard Business Review, September 2016 Issue
• [Crawley 2016] Systems Architecture, Edward Crawley, Bruce Cameron, Daniel Selva, Global
Edition, Pearson Education Limited, 2016
• [Evans 2003] Domain-Driven Design: Tackling Complexity in the Heart of Software, Eric Evans,
Addison-Wesley Professional, 2003
• [Evans 2013] Getting Started with DDD when Surrounded by Legacy Systems, Eric Evans, 2013,
retrieved April 24, 2019:
http://domainlanguage.com/wp-content/uploads/2016/04/GettingStartedWithDDDWhenSurroundedByLegacySystemsV1.pdf
• [Ford 2017] Building Evolutionary Architectures, Neal Ford, Rebecca Parsons, Patrick Kua, O’Reilly,
2017
• [Forsgren 2018] Accelerate: The Science of Lean Software and DevOps: Building and Scaling High
Performing Technology Organizations, Forsgren, Humble, Kim, Trade Select, 2018
• [Fowler 2015] Making Architecture Matter – Martin Fowler Keynote, youtube.com, posted by
O’Reilly Media, July 23, 2015: https://www.youtube.com/watch?v=DngAZyWMGR0
• [Fowler 2019] Refactoring: Improving the Design of Existing Code, Martin Fowler, Addison-Wesley,
2019
• [George 2004] Conquering Complexity in your Business: How Walmart®, Toyota®, and Other Top
Companies are Breaking through the Ceiling on Profits and Growth, Michael L. George, Stephen A.
Wilson, McGraw Hill, 2004
• [Gof 1994] Design Patterns: Elements of Reusable Object-Oriented Software, Gamma, Helm,
Johnson, Vlissides, Addison-Wesley, 1994
• [Hammer 1990] Reengineering Work: Don’t Automate, Obliterate, Michael Hammer, Harvard
Business Review, July-August 1990 Issue
• [Hammer 1993] Re-engineering the Corporation: A Manifesto for Business Revolution, Michael
Hammer, James A. Champy, 1993
• [Hayler 2006] Six Sigma for Financial Services, Rowland Hayler, Michael D. Nichols, McGraw-Hill,
2006
• [HBR 2013] IT Governance is Killing Innovation, Andrew Horne, Brian Foster, Harvard Business
Review, 2013: https://hbr.org/2013/08/it-governance-is-killing-innov
• [HBR 2017] How Spotify™ Balances Employee Autonomy and Accountability, Michael Mankins, Eric
Garton, Harvard Business Review, 2017:
https://hbr.org/2017/02/how-spotify-balances-employee-autonomy-and-accountability
• [Hellman 2018] Delivering Customer Outcomes versus Selling Products: The GE Digital Case, Karl
Hellman, Frank M. Grillo, The Marketing Journal, June 2018:
http://www.marketingjournal.org/delivering-customer-outcomes-versus-selling-products-the-ge-digital-case-study-by-frank-m-grillo-and-karl-hellman/,
retrieved July 2, 2018
• [Hodgson 2017] Feature Toggles (aka Feature Flags), Hodgson, martinfowler.com, posted by Pete
Hodgson, October 2017
• [Holland 2014] Complexity: A Very Short Introduction, John H. Holland, Oxford University Press,
2014
• [Holland 2012] Signals and Boundaries: Building Blocks for Complex Adaptive Systems, John H.
Holland, The MIT Press, 2012
• [Humble 2010] Continuous Delivery: Reliable Software Releases through Build, Test, and
Deployment Automation, Humble, Farley, Addison-Wesley, 2010
• [Johnson 2008] Reinventing your Business Model, Mark W. Johnson, Clayton M. Christensen,
Henning Kagermann, Harvard Business Review, December 2008
• [Kane 2019] How Digital Leadership Is(n’t) Different, MIT Sloan Management Review, Gerald C.
Kane, Anh Nguyen Phillips, Jonathan Copulsky, Garth Andrus, Spring 2019 Issue, March 12, 2019
• [Kersten 2018] Project to Product: How to Survive and Thrive in the Age of Digital Disruption with
the Flow Framework, Mik Kersten, IT Revolution, 2018
• [Kesler 2008] How Coke’s CEO Aligned Strategy and People to Recharge Growth: An Interview with
Neville Isdell, G. Kesler, People & Strategy, 31(2), 18-21, 2008
• [Kim 2013] The Phoenix Project: A Novel about IT, DevOps, and Helping your Business Win, Kim,
Behr, IT Revolution Press, 2013
• [Kim 2016] The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in
Technology Organizations, Kim, Debois, Willis, Trade Select, 2016
• [Lancelott 2017] Operating Model Canvas: Aligning Operations and Organization with Strategy,
Mark Lancelott, Mikel Gutierrez, Andrew Campbell, Van Haren Publishing, 2017
• [Leffingwell 2011] Agile Software Requirements, Dean Leffingwell, Pearson Education, 2011
• [Luna 2014] State of the Art of Agile Governance: A Systematic Review, Alexandre J.H. de O. Luna,
Philippe Kruchten, Marcello L.G. do E. Pedrosa, Humberto R. de Almeida Neto, Hermano P. de
Moura, International Journal of Computer Science & Information Technology (IJCSIT) Vol 6, No 5,
October 2014: https://arxiv.org/ftp/arxiv/papers/1411/1411.1922.pdf
• [McKinsey 2013] Mastering the Building Blocks of Strategy, Chris Bradley, Angus Dawson, Antoine
Montard, October 2013:
https://www.mckinsey.com/business-functions/strategy-and-corporate-finance/our-insights/mastering-the-building-blocks-of-strategy
• [McKinsey 2016] An Operating Model for Company-wide Agile Development, Santiago
Comella-Dorda, Swati Lohiya, Gerard Speksnijder, May 2016:
https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/an-operating-model-for-company-wide-agile-development
• [Merriam-Webster] https://www.merriam-webster.com/
• [Morgan 2019] Designing the Future: How Ford™, Toyota®, and Other World-Class Organizations
Use Lean Product Development to Drive Innovation and Transform their Business, James M.
Morgan, Jeffrey K. Liker, the Lean Enterprise Institute, McGraw-Hill Education, 2019
• [Oosterwal 2010] The Lean Machine: How Harley-Davidson Drove Top-Line Growth and
Profitability with Revolutionary Lean Product Development, Dantar P. Oosterwal, AMACOM, 2010
• [Osterwalder 2010] Business Model Generation: A Handbook for Visionaries, Game Changers, and
Challengers, Alexander Osterwalder, Yves Pigneur, Wiley, 2010
• [Parker 2016] Platform Revolution: How Networked Markets are Transforming the Economy – and
How to Make them Work for You, Geoffrey G. Parker, Marshall W. Van Alstyne, Sangeet Paul
Choudary, W.W. Norton & Company, 2016
• [Parnas 1972] On the Criteria to be Used in Decomposing Systems into Modules, D.L. Parnas,
Carnegie-Mellon University, 1972
• [Patton 2014] User Story Mapping, Jeff Patton, O’Reilly Media, Inc., 2014
• [Porter 2004] Competitive Advantage: Creating and Sustaining Superior Performance, Michael E.
Porter, Free Press, 2004
• [PWC 2013] The Financial Conduct Authority (FCA) and the Focus on Product Governance, PWC,
Autumn 2013:
https://pwc.blogs.com/files/the-financial-conduct-authority-fca-and-the-focus-on-product-governance.pdf
• [Ries 2011] The Lean Startup: How Constant Innovation Creates Radically Successful Businesses,
Eric Ries, Portfolio Penguin, 2011
• [Rigby 2018] Agile at Scale, Darrell K. Rigby et al., Harvard Business Review (http://hbr.org),
May-June 2018 Issue
• [Ross 2018] Goodbye Structure; Hello Accountability, Jeanne W. Ross, MIT Sloan Management
Review, June 27, 2018
• [Ross 2018] Tech Republic interview retrieved June 21, 2018:
https://www.techrepublic.com/article/how-to-create-a-vision-for-digital-transformation-at-your-company/
• [Ross 2018] Let Your Digital Strategy Emerge, Jeanne Ross, MIT Sloan Management Review, October
2018
• [Ross 2019] Designed for Digital: How to Architect your Business for Sustained Success, Jeanne W.
Ross, Cynthia M. Beath, Martin Mocker, MIT Press, 2019
• [Rossman 2019] Think Like Amazon™: 50 1/2 Ideas to Become a Digital Leader, John Rossman,
McGraw-Hill, 2019
• [Rozanski 2005] Software Systems Architecture: Working with Stakeholders using Viewpoints and
Perspectives, Rozanski, Woods, Addison-Wesley, 2005
• [SEI 1993] Capability Maturity Model for Software, Version 1.1, Mark C. Paulk, Bill Curtis, Mary Beth
Chrissis, Charles V. Weber, SEI Technical Report, CMU/SEI-93-TR-024, ESC-TR-93-177, February 1993
• [SEI 1995] People Capability Maturity Model, Mark C. Paulk, Bill Curtis, Mary Beth Chrissis, Charles
V. Weber, CMU SEI-95-MM-02, September 1995
• [Senge 1994] The Fifth Discipline Fieldbook: Strategies and Tools for Building a Learning
Organization, Peter M. Senge, Crown Business, 1994
• [Shoup 2014] From the Monolith to Micro-services, slideshare.net, posted by Randy Shoup, October
2014: https://www.slideshare.net/RandyShoup/monoliths-migrations-and-microservices
• [Simon 2018] Liquid Software: How to Achieve Trusted Continuous Updates in the DevOps World,
Simon, Landman, Sadogursky, JFrog, 2018
• [Sobek 1999] Toyota®'s Principles of Set-Based Concurrent Engineering, Durward K. Sobek II, Allen
C. Ward, Jeffrey K. Liker, MIT Sloan Management Review, January 15, 1999
• [Stanford 2010] An Introduction to Design Thinking – Process Guide, Institute of Design, Stanford
• [Stevenson 2004] An Agile Approach to a Legacy System, Chris Stevenson, Andy Pols, 2004,
retrieved April 23, 2019: http://cdn.pols.co.uk/papers/agile-approach-to-legacy-systems.pdf
• [TOGAF 2018] The TOGAF® Standard, Version 9.2, a standard of The Open Group (C192), published
by The Open Group, April 2018; refer to: http://www.opengroup.org/togaf
• [Ton 2014] The Good Jobs Strategy: How the Smartest Companies Invest in Employees to Lower
Costs and Boost Profits, Zeynep Ton, Amazon, 2014
• [Ward 2014] Lean Product and Process Development, Second Edition, Allen C. Ward, Durward K.
Sobek II, Lean Enterprise Institute, Inc., 2014
• [Watson 2005] Design and Execution of a Collaborative Business Strategy, Journal For Quality &
Participation, 2005
• [Wind 2016] Beyond Advertising: Creating Value through All Customer Touchpoints, Yoram Jerry
Wind, Catharine Findiesen Hays, John Wiley & Sons, 2016
• [Womack 2007] The Machine that Changed the World, James P. Womack, Daniel T. Jones, Daniel
Roos, Simon & Schuster, 2007
Chapter 1. Introduction
1.1. Objective
This Snapshot document is a draft of The Open Group Agile Architecture Framework™ Standard. The
objective of this document is to cover both the Digital Transformation and the Agile Transformation
of the enterprise.
This Snapshot document is intended to make public the direction and thinking about the path we are
taking in the development of The Open Group Agile Architecture Framework Standard. We invite your
feedback and guidance. To provide feedback on this document, please send comments by email to
[email protected] no later than January 15, 2020.
1.2. Overview
This Snapshot documents a proposal for a standard that covers both the Digital Transformation and
the Agile Transformation of the enterprise. The scope of the Snapshot covers key concepts and
includes the topics below:
• Leveraging event-driven architecture to design modular systems and modernize legacy systems
1.3. Conformance
This is a Snapshot, not an approved standard. Do not specify or claim conformance to it.
1.5. Terminology
For the purposes of this document, the following terminology definitions apply:
Can
Describes a possible feature or behavior available to the user or application.
May
Describes a feature or behavior that is optional. To avoid ambiguity, the opposite of “may” is
expressed as “need not”, instead of “may not”.
Shall
Describes a feature or behavior that is a requirement. To avoid ambiguity, do not use “must” as an
alternative to “shall”.
Shall not
Describes a feature or behavior that is an absolute prohibition.
Should
Describes a feature or behavior that is recommended but not required.
Will
Same meaning as “shall”; “shall” is the preferred term.
Chapter 2. Definitions
For the purposes of this document, the following terms and definitions apply. Merriam-Webster’s
Collegiate Dictionary should be referenced for terms not defined in this section.
Architectural Runway
• Ability to implement new features without excessive refactoring (Source: Leffingwell 2011)
• Consists of the existing code, components, and technical infrastructure needed to implement
near-term features without excessive redesign and delay (Source: Scaled Agile, Inc.
https://www.scaledagile.com/)
Catchball
A dialog between senior managers and project teams about the resources and time both available
and needed to achieve the targets.
NOTE: Once the major goals are set, planning should become a top-down and bottom-up
process involving a dialog. This dialog is often called catchball (or nemawashi) as ideas
are tossed back and forth like a ball. (Source: https://www.lean.org/lexicon/strategy-deployment)
Continuous Architecture
An architecture with no end state and that is designed to evolve to support the evolving needs of the
digital enterprise.
Customer Journey
Series of interactions between a customer and a company that occur as the customer pursues a
specific goal. The journey may not conform to the company’s intentions. (Source:
https://www.forrester.com/Customer-Journey)
Design Thinking
A methodology for creative problem solving that begins with understanding unmet customer needs.
(Source: https://dschool.stanford.edu/resources/getting-started-with-design-thinking and
https://executive-ed.mit.edu/mastering-design-thinking)
Digital Platform
Software system composed of application and infrastructure components that can be rapidly
reconfigured using DevOps and Cloud Native Computing.
Evolutionary Architecture
An architecture that supports guided, incremental change across multiple dimensions. (Source:
Ford 2017)
Evolvability
A meta-non-functional requirement that aims to prevent other architecture requirements, in
particular the non-functional ones, from degrading over time.
Job-to-be-done
What the customer hopes to accomplish. “Job” is shorthand for what an individual really seeks to
accomplish in a given circumstance. (Source: Christensen 2016)
Lead Time
Time between the initiation and completion of a process.
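As a purely illustrative sketch of this definition, lead time can be computed as the difference between two timestamps. The function name `lead_time` and the sample dates are assumptions made for this example; they are not part of this document.

```python
from datetime import datetime, timedelta


def lead_time(initiated_at: datetime, completed_at: datetime) -> timedelta:
    """Return the time between the initiation and completion of a process."""
    if completed_at < initiated_at:
        # A process cannot complete before it was initiated.
        raise ValueError("completed_at must not be earlier than initiated_at")
    return completed_at - initiated_at


# Hypothetical process: initiated on a Monday morning, completed on Thursday afternoon.
started = datetime(2019, 7, 1, 9, 0)
finished = datetime(2019, 7, 4, 15, 30)
elapsed = lead_time(started, finished)
print(elapsed)
```

The sketch measures elapsed calendar time; whether lead time should exclude waiting periods or non-working hours is a choice this definition leaves open.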
Modularization
Design decisions which must be made before the work on independent modules can begin. Every
module is characterized by its knowledge of a design decision which it hides from all others. Its
interface or definition is chosen to reveal as little as possible about its inner workings. (Source:
Parnas 1972)
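A minimal sketch of this idea, under stated assumptions: the two classes below are hypothetical and do not appear in [Parnas 1972] or in this document. Each hides one design decision (how customers are stored) behind the same small interface, so client code is unaffected when the decision changes.

```python
class ListCustomerDirectory:
    """Hidden design decision: customers are kept in an unsorted list."""

    def __init__(self):
        self._entries = []  # private detail, never exposed to callers

    def add(self, customer_id, name):
        # Drop any existing entry for this id so that ids stay unique.
        self._entries = [(i, n) for i, n in self._entries if i != customer_id]
        self._entries.append((customer_id, name))

    def find(self, customer_id):
        for known_id, name in self._entries:
            if known_id == customer_id:
                return name
        return None


class DictCustomerDirectory:
    """Same interface; the hidden decision is now a dict keyed by id."""

    def __init__(self):
        self._by_id = {}  # private detail, never exposed to callers

    def add(self, customer_id, name):
        self._by_id[customer_id] = name

    def find(self, customer_id):
        return self._by_id.get(customer_id)


def greet(directory, customer_id):
    """Client code: depends only on add/find, not on the storage choice."""
    name = directory.find(customer_id)
    return f"Hello, {name}" if name is not None else "Unknown customer"
```

Because `greet` relies only on the interface, either directory can be substituted for the other without changing the client, which is the property the definition describes.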
Persona
Fictional character which is created based upon research in order to represent the different user
types that might use your service, product, site, or brand in a similar way. (Source:
https://www.interaction-design.org/literature/article/personas-why-and-how-you-should-use-them)
Process
Any activity or group of activities that takes an input, adds value to it, and provides an output to an
internal or external customer. There is no product and/or service without a process. Likewise, there
is no process without a product or a service. (Source: Harrington 1991)
Product
Something a value stream produces. A product has a lifecycle which is comprised of a product and
process development value stream and a production value stream. Broadly speaking, a product can
refer to a product or a service. A service will be referred to as a product if its delivery is
industrialized or repeatable.
Product-centricity
Shift from temporary organizational structures – projects – to permanent ones. A product-centric
organization is composed of cross-functional Agile teams which are responsible for developing
products or services, and also operating or running them. The DevOps principle "you build it, you
run it" is core to product-centricity.
Refactoring
The process of changing a software system in a way that does not alter the external behavior of the
code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the
chances of introducing bugs. (Source: Fowler 2019)
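The following minimal sketch illustrates the definition. The function names, the discount rule, and the threshold are hypothetical choices made for this example; they are not taken from [Fowler 2019] or from this document. The second version restructures the first without changing its results, assuming the two input lists have the same length.

```python
# Before: one function mixes summation and discount logic.
def order_total_before(prices, quantities):
    total = 0
    for i in range(len(prices)):
        total = total + prices[i] * quantities[i]
    if total > 100:
        total = total - total * 0.1
    return total


# After: the same behavior, split into named steps with named constants.
BULK_THRESHOLD = 100  # hypothetical threshold above which a discount applies
BULK_DISCOUNT = 0.1   # hypothetical discount rate


def subtotal(prices, quantities):
    """Sum of price times quantity for each order line."""
    return sum(p * q for p, q in zip(prices, quantities))


def apply_bulk_discount(amount):
    """Reduce the amount by the discount rate when it exceeds the threshold."""
    if amount > BULK_THRESHOLD:
        return amount - amount * BULK_DISCOUNT
    return amount


def order_total_after(prices, quantities):
    return apply_bulk_discount(subtotal(prices, quantities))
```

A check that both versions return the same value for the same inputs is what gives confidence that the external behavior has not changed.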
Story
Conversations about working together to arrive at a best solution to a problem understood by both
parties. A simple story template follows:
• As a [type of user]
Story Map
Used for breaking down big stories as they are told. (Source: Patton 2014)
System
Perceived whole whose elements "hang together" because they continually affect each other over
time and operate toward a common purpose. (Source: Senge 1994)
The more Agile the enterprise, the faster the learning cycles. Fast learning cycles translate into shorter
time-to-market and higher quality. Traditional "command and control" organizations get in the way by
slowing down learning cycles.
The Agile Transformation of the enterprise and its culture is becoming a prerequisite of effective
Digital Transformation. Digital leaders and their teams need to steer the Digital/Agile transformation.
Figure 1, “Architecting the Dual Digital/Agile Transformation”, inspired by an Escher painting, shows
the recursive nature of this dual transformation.
This document covers both the Digital Transformation of the enterprise and its Agile Transformation.
Digital is about defining a business strategy that is inspired by the capabilities of digital technologies
[Ross 2018]. Digital technologies are game-changing in helping to solve customer problems in ways that
were not possible before.
Cross-functional product teams break down organizational silos by bringing together marketing,
software engineering, sales, business operations, compliance, risk, and IT operations. Each of the
disciplines comes with its established body of knowledge and specialized language. The same word
used in different contexts is likely to have different meanings.
The O-AAF approach takes the view that it is counter-productive and reductionist to impose a unified
"architecturally-driven" language on all stakeholders. The challenge is to bring together each domain of
the enterprise with its own body of knowledge:
• Marketing is bringing customer-centricity into the field along with new disciplines such as design
thinking
• IT is bringing flexible and adaptive new software technologies and has popularized new Agile ways
of working
• Executives are looking for innovative business models that generate profitable revenue growth
• Compliance departments need to ensure that privacy and security regulations are applied
throughout the organization
The role of architecture is to provide an integrative view of these different disciplines. Integrative
doesn’t mean unified. There will not be one method to rule them all. This document recognizes the
value of concepts and tools brought by each discipline; for example, design thinking, job-to-be-done,
event-driven or domain-driven design, etc. It provides a modular framework that architects and
practitioners can use to help shape Digital Transformation endeavors. It includes:
• A systems thinking view of architecture formulation that combines intentional and emergent
design
• Modularity and loose-coupling to bring agility to the organization and its software systems
• An outside-in framework that starts from clients' pains and expected gains to develop digital
offerings that generate profitable growth
We expect the materials in this framework to evolve at different frequencies but continuously as the
state-of-the-art in each of these disciplines evolves and matures.
The value of an open standard in this area is that the practical experience and expertise of many
organizations and practitioners, across industries and domains, can be distilled into a continuously
updated, open standard published by The Open Group.
This means organizations do not have to carry a singular burden of integration of these disciplines, but
The enterprise’s purpose states the strategic intent. The Merriam-Webster dictionary [Merriam-
Webster] defines purpose as: "something set up as an object or end to be attained : intention".
Strategy evolves from big up-front design toward a learning process that involves all organizational
levels. The strategic process itself must be adaptive and capable of responding to changes in the
business system. The people who are charged with executing the strategic plan should participate in
the planning process itself [Watson 2005].
The Japanese have coined the term catchball which refers to a strategic planning approach that
incorporates structured group dialog at all levels of the organization. The key benefits of the catchball
process are to drive alignment among all silos and hierarchical levels, and to help shape the business
as a coherent whole aligned on its core objectives.
This view of strategy formulation and deployment goes hand-in-hand with a flatter and cross-
functional organization. It also requires the enterprise’s governance model to evolve. Agile teams are
granted more autonomy but not at the expense of effective alignment. The catchball process provides
an alignment mechanism that is compatible with autonomy because it works best in a culture that is not
based on command and control.
The "acid" test of a successful business strategy is the increased capability of the enterprise to develop
offerings that customers are willing to pay for and that deliver the right experience.
Team autonomy is a prerequisite to speed and agility. Why? If your teams have to perform too much
coordination work, it slows them down and adds rigidity.
John Rossman claims that speed is: "about moving in one direction very efficiently, very precisely.
Operational excellence at scale is the business equivalent to speed".
Operational excellence implies effective coordination between autonomous teams. Freedom needs to
be combined with responsibility. Accountability relations that link teams bring predictability. For
example, at Amazon you are expected to identify and tenaciously manage every potential business-
derailing dependency you have. Peer pressure is more powerful than pressure from managers who do
not have the bandwidth to micro-manage inter-team dependencies.
This has forced large retailers such as Walmart® to raise their investment level to develop new digital
capabilities on par with those of Amazon. Recently a startup has developed a service to provide
state-of-the-art logistic services to smaller retailers which have neither the skills nor the scale to
compete with Amazon’s logistic capabilities.
Amazon, with the purchase of Whole Foods, adds stores to its distribution strategy. The future of retail
may be "click and mortar" executed in the context of an omnichannel business model.
Customer experience starts with the discovery of customer needs. Just asking clients what they need is
not going to help create a superior customer experience. Discovering the hidden or untold needs of
clients is key to success. The analysis of the customer’s job-to-be-done or the mapping of the customer
journey are two examples of practices on which digital enterprises are relying. The rapid adoption of
design thinking is helping enterprises better understand the spoken and unspoken needs of their
clients.
Design thinking is a human-centered approach that positions empathy as the centerpiece of the design
discipline. It provides a process that helps design better products, services, or processes. The design
thinking process has five steps: Empathize, Define, Ideate, Prototype, and Test, which are not intended
to be applied in a linear manner.
Once the problem space is well understood, the enterprise can architect a product and service system
that will satisfy its clients. Though the product concept is central to Agile, the Agile literature gives few
if any product definitions.
3.4. Product-Centricity
The terms "product" and "service" have different meanings depending on their context of use. The
common meanings of product are:
• (2) Something (such as a service) that is marketed or sold as a commodity (Source: [Merriam-Webster])
This document defines a product as something a value stream produces. A product has a lifecycle
which is comprised of a product and process development value stream and a production value
stream. Broadly speaking a product can refer to a product or a service. A service will be referred to as
a product if its delivery is industrialized or repeatable.
In an Agile context, product-centricity refers to the shift from temporary organizational structures –
projects – to permanent ones. A product-centric organization is composed of cross-functional Agile
teams which are responsible for developing products or services and operating or running them. The
DevOps principle "you build it, you run it" is core to product-centricity.
When digging into the subtleties of the various product and service meanings a few key ideas emerge:
• Some marketing experts claim customers hire services and products to get specific jobs done
• The service-dominant logic viewpoint defines service as the application of competencies for the
benefit of another entity: SD logic claims that value is always co-created through interactions and
• The servitization of products refers to industries that use their products to sell "outcome as a
service" rather than a one-off sale
• Highly personalized services are leveraging technology to productize a few stages of the serving
value chain to improve operational efficiency
Using the word "offering" helps talk about product, service, or a combination in a generic way. This
avoids some of the semantic problems linked to the variety of meanings carried by the product and
service terminology.
The following definition borrowed from the Lean body of knowledge summarizes how the ideas above
can be combined to guide the design of digital offerings: "A product is an object, material or digital, that
allows its owner to solve a problem autonomously. There are always limits to that of course, as a product
will need consumables, services, and maybe even training to use it more efficiently. But the ideal product
is one that I use effortlessly, grasping intuitively how it works without needing to talk to anyone else to
make it work or maintain it. I’m autonomous." [Ballé 2019]
Agile shifts from project-oriented management to product-oriented management. A true Agile team
does not deliver a project but a product! Product-centricity drives the Agile organization: "Moving from
project-oriented management to product-oriented management is critical for connecting software
delivery to the business" [Kersten 2018]. The product-based focus helps create stable Agile teams that
develop an end-to-end perspective and are staffed with stable resources.
The Agile enterprise architects its product and service system in a modular way to minimize inter-
team dependencies. The lower the dependencies, the faster products are delivered to the market.
Capability modeling and Domain Driven Design (DDD) help guide the modular decomposition of the
enterprise and its software systems.
The Inverse Conway Manoeuvre suggests structuring Agile teams to mirror the structure of the intended
system architecture. When the teams' architecture mirrors the software system’s
architecture, it reinforces the development of an end-to-end perspective that improves effectiveness
and efficiency.
The clear distinction between problem space and solution space facilitates innovation because it opens
the space of possible solutions. More than one solution can solve a problem that has been well
formulated.
• Systems thinking can be applied to any field; for example, biology, engineering, software, or human
organizations
• It provides a comprehensive body of knowledge that helps understand and model complexity
The digital enterprise requires bridging fields such as marketing, finance, software engineering, or
operations into a coherent whole. Systems thinking provides the glue that helps integrate such diverse
fields into an actionable framework.
Digital enterprises' adoption of Agile at scale results in the creation of a great number of autonomous
teams. When several hundred Agile teams run in parallel at a fast pace, it may be hard to cope with the
resulting complexity.
System theory provides key insights and methods to understand the emergent properties of complex
systems. It has developed models that can help the practitioner steer their evolution.
John Holland [Holland 2014] characterizes the behavior of complex systems by:
• Self-organization
• Chaotic behavior, when small changes in initial conditions produce large changes later
• Adaptive behavior, when interacting agents modify their strategies in diverse ways as experience
accumulates
When interaction patterns emerge, they influence the behavior of interacting agents and therefore
impact the system itself. A complex system has emergent properties that cannot be understood or
predicted by just breaking it down into its elements. This lack of predictability makes steering the
evolution of complex systems difficult. A complex system can evolve toward self-organization or chaos
should it fail to maintain some kind of invariant state in the face of perturbations.
An Agile at scale enterprise that runs hundreds of Agile teams in parallel is a complex system. A fine
line must be walked between self-organization and chaos. A research paper from MIT Sloan describes
how Spotify™ balances autonomy with alignment [Baiyere 2017]. Spotify avoids chaos without
slowing down innovation. The definition of shared missions, which are key strategic objectives, helps align
Agile teams.
John Holland suggests steering complex systems by modifying their signal/boundary hierarchies. The
components of complex systems are bounded sub-systems or agents that adapt or learn when they
interact. Defining the boundary of sub-systems and their rules of interaction has a major influence on
the evolution of a system. For example, establishing a taxonomy of Agile teams and clarifying their
dependencies impact coordination and cooperation. Similarly, Domain-Driven Design (DDD) uses the
bounded context concept to decompose complex software systems in a modular way [Evans 2003].
It consists of:
• Function
• Related by concept
• To form
Form is what a system is; it is the physical or informational embodiment that exists or has the potential
to exist. Form has shape, configuration, arrangement, or layout. Over some period of time, form keeps
its identity though it can change state. Form is not function, but form is necessary to deliver function.
For example, in a software system, form defines the sub-systems and entities that compose it. In an
enterprise, form defines the organization and the products that the organization produces.
Function is what a system does; it is the activities, operations, and transformations that cause, create, or
contribute to performance. For example, the function of an Agile team (e.g., a tribe) could be to develop
a payment product and the sharding function is to distribute data over a network. Emergence occurs in
A concept is a product or system vision, idea, or mental image which maps function to form. In the
sharding example, the distributed computing concept of master-slave replication defines how to
distribute data while keeping it consistent.
Upstream Influences
The design of an architecture is influenced by many factors such as the enterprise’s business strategy,
compliance requirements, the needs of clients and users, or the behavior of competitors. When these
upstream influences are not dealt with, ambiguity is likely to jeopardize the design of the new product
or system. One of the key architecture missions is to help reduce this ambiguity.
A well-architected system delivers value to its stakeholders within the limits imposed by regulators,
competition, and technology.
As defined by L.D. Miles from General Electric®, value is the lowest price you must pay to provide a
reliable function or service. Value is the ratio of function to cost.
The value of a product can be increased by maintaining its functions and reducing its cost or keeping
the cost constant and increasing the functionality of the product.
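The two levers above can be illustrated with a small worked example. This is only a sketch: the function name `value` and the idea of expressing function as a single numeric score are assumptions made for illustration, not part of the definition by L.D. Miles.

```python
def value(function_score: float, cost: float) -> float:
    """Return value as the ratio of function to cost.

    `function_score` is a hypothetical numeric rating of delivered function;
    the text does not prescribe how function is quantified.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return function_score / cost


# Baseline product (illustrative numbers only).
baseline = value(function_score=80, cost=40)

# Lever 1: maintain the functions and reduce the cost.
same_function_lower_cost = value(function_score=80, cost=32)

# Lever 2: keep the cost constant and increase the functionality.
same_cost_more_function = value(function_score=100, cost=40)

print(baseline, same_function_lower_cost, same_cost_more_function)
```

With these illustrative numbers, either lever raises the ratio from 2.0 to 2.5.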
• Utility is defined by the client or user and is driven by the function delivered
Downstream Influences
Downstream influences extend the architecture’s scope to the full product or system lifecycle. For
example, product architecture decisions ought to factor in manufacturing costs or ease-of-maintenance
of products or systems. When architecting a product, a product line strategy can influence architecture
decisions.
The emphasis is on form elements and relationships, and the definition does not give much guidance on
function and concept.
The TOGAF standard embraces the ISO definition, but does not strictly adhere to it, adding the
definition below:
"The structure of components, their inter-relationships, and the principles and guidelines governing their
design and evolution over time."
The emphasis on form is expressed through a different vocabulary: the "structure of components and
relationships" versus "elements and relationships".
The system engineering definition of architecture used in this document is fully compatible with the
ISO and TOGAF definitions.
• How to design architectures that are resilient to product and system evolution?
Chapter 5, Continuous Architectural Refactoring will describe how to design architectures that can
evolve gracefully. It will explain the role played by refactoring in achieving evolvability as an
architecture quality.
A point to note when answering the first question is: what is the right balance between anticipating
foreseeable change and over-engineering? Not all change can or should be anticipated.
There are cases when it is wise to make a product or system resilient to change. For example, using a
strategy pattern that enables selecting a pricing algorithm at run time instead of implementing it at
design time is useful when designing a trading system.
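As a minimal sketch of the strategy pattern mentioned above, the following shows a pricing algorithm being selected at run time. All names (`PricingEngine`, `mid_price`, `volume_discount`, the `STRATEGIES` registry) and the pricing rules themselves are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A pricing strategy maps (reference price, quantity) to a quoted total.
PricingStrategy = Callable[[float, int], float]


def mid_price(reference: float, quantity: int) -> float:
    """Quote the reference price unchanged (hypothetical rule)."""
    return reference * quantity


def volume_discount(reference: float, quantity: int) -> float:
    """Apply a 2% discount to orders of 1,000 units or more (hypothetical rule)."""
    rate = 0.98 if quantity >= 1000 else 1.0
    return reference * quantity * rate


# Registry keyed by name, so the algorithm can be chosen from configuration
# or user input at run time rather than being fixed at design time.
STRATEGIES: Dict[str, PricingStrategy] = {
    "mid": mid_price,
    "volume_discount": volume_discount,
}


@dataclass
class PricingEngine:
    """Context object: delegates pricing to whichever strategy is plugged in."""

    strategy: PricingStrategy

    def quote(self, reference: float, quantity: int) -> float:
        return self.strategy(reference, quantity)


engine = PricingEngine(STRATEGIES["mid"])
quote_before = engine.quote(10.0, 1000)

# Swap the algorithm without touching PricingEngine or its callers.
engine.strategy = STRATEGIES["volume_discount"]
quote_after = engine.quote(10.0, 1000)
```

The cost of this flexibility is an extra level of indirection, which is worth paying only where change of this kind is actually foreseeable.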
The second question raises the issue of the reversibility of architecture decisions. Hard-to-reverse
architecture decisions make incremental architecture change more difficult. Chapter 9, Minimum
Viable Architecture will cover this topic in more depth.
However, to properly discuss the concept, we must first adequately define it, and every word in the
term is important. We will briefly discuss each in turn (but not in order).
5.1.1. Refactoring
Our discussion will center around the concept of taking a system architecture and changing its
structure over time. We may have many reasons for doing so: design debt or "cruft" which has
inevitably accumulated; changes to our understanding of the important non-functional requirements;
remedying suboptimal architectural decisions; changes to the environment; project pivots, etc.
Whatever the reason, sometimes we need to change fundamental aspects of how our system is put
together.
Before we continue, a note on the choice of the word "refactoring". Martin Fowler [Fowler 2019] would
likely describe this topic as "Architectural Restructuring"; he uses the term "refactoring" to describe
small, almost impact-free changes to the codebase of a system to improve its design. We decorate the
term with the word "architectural" to make it obvious that we are describing larger-scale, structural
system changes.
All of which leads us to the next question – what do we actually mean by "architecture" in this context?
5.1.2. Architectural
There may be as many definitions of "architecture" as there are software architects to define it. Ralph
Johnson of the Gang of Four [GoF 1994] defined software architecture as: "the important stuff (whatever
that is)". This deceptively obvious statement calls out the need for an architect to identify, analyze, and
prioritize the non-functional requirements of a system. In this definition, the architecture could be
viewed as a plan to implement these non-functional requirements. Ford [Ford 2017] gives a
comprehensive list of such requirement types, or "-ilities". The TOGAF standard [TOGAF 2018] provides
a more concrete description of architecture, namely: "the structure of components, their
interrelationships, and the principles and guidelines governing their design and evolution over time".
This "evolvability" – the ability for architecture to be changed or evolved over time – is becoming
critical. There are many reasons for this: the increasingly fast pace of the industry; adoption of Agile
approaches at scale; the cloud-first nature of much new development; the failure of expensive, high-
profile long-running projects, etc. System evolution has always been an important concept in
architectural frameworks. Rozanski [Rozanski 2005] had an "evolution" perspective; the TOGAF
standard has the concept of "Change Management". There is an increasing reluctance to worry
up-front about five-year architecture plans or massive up-front architectural efforts, which is requiring
5.1.3. Continuous
The industry has over the past few years revisited the "hard to change later" problem in a new light.
Instead of looking at individual requirements from the perspective of how they will evolve in a system,
what if "evolvability" was baked into the architecture as a first-class concept? Evolutionary
architectures, as described by Ford [Ford 2017], have no end state. They are designed to evolve with an
ever-changing software development ecosystem, and include built-in protections around important
architectural characteristics – adding time and change as first-class architectural elements.
Indeed, Ford [Ford 2017] describes such an architecture as one that "supports guided, incremental
change across multiple dimensions". And it is this incremental nature of change that enables us
to make changes to our software architecture in a continuous manner, to plan for such change from
the outset, and to include in our backlog items which reflect our desired architectural evolution.
Each sub-section covers a different aspect of the necessary prerequisites for continuous architectural
refactoring. Taken together they offer a complete view of the enablers for successful continuous
architectural refactoring.
5.3.1. Constraints
Every organization operates under a range of constraints; they constrain the valid choices that can be
made by a business in achieving its aims. They come in many flavors, including financial, cultural,
technical, resource-related, regulatory, political, and time-based. The very nature of the word
"constraint" implies a limiting, constricting force which will choke us of productivity and creativity,
and it is human nature to try to dismiss them or rail against them. However, constraints need not be
negative forces; they force us to describe our current reality, and provide guidance as to how that
reality should shape our efforts. Individual constraints may be long-lived; others may be eliminated
through effort; but to ignore any of them is folly.
Inevitably, some of these constraints will manifest in software as architectural constraints. Technical
constraints may mandate an infrastructural topology (e.g., "organization A only deploys on
Infrastructure as a Service (IaaS) vendor B’s offerings"), an architectural style of development (e.g.,
"organization C is a model-driven development software house"), or an integration requirement (e.g.,
"financial transactions are always handled by system D"). Financial and resource constraints can shape
software development team members and their skill sets, in addition to imposing hardware and
software limitations. Time-based constraints may manifest as software release cadences, which will
influence development architectural choices. Regulatory constraints can have big impacts on
development practices, deployment topology, and even whether development teams are allowed to
continuously deploy into production.
As an antidote to such problems, Ford [Ford 2017] introduces us to the deceptively simple concept of
fitness functions. Fitness functions objectively assess whether the system is actually meeting its
identified non-functional requirements; each fitness function tests a specific system characteristic.
For example, we could have a fitness function that measures the performance of a specific API call.
Does the API complete in under one second at the 90th percentile? This question is far from abstract; it
is an embodiment of a non-functional requirement that is testable. If evaluation of the fitness function
fails, then this aspect of our system is failing a key non-functional requirement. This is not open to
opinion or subjectivity; the results speak for themselves. To take the example further, imagine that one
of our proposed architectural refactorings was to implement database replication to meet availability
requirements. If we implemented this, and the "API performance" fitness function subsequently failed,
then we know early in the development cycle that our architecture is no longer fit-for-purpose in this
respect, and we can address the problem or pivot.
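The "API performance" example above could be expressed as an automated check along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the latency samples are invented, and the nearest-rank method is just one common way of computing a percentile; Ford [Ford 2017] does not prescribe an implementation.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def api_latency_fitness(
    latencies_s: Sequence[float],
    threshold_s: float = 1.0,
    pct: float = 90.0,
) -> bool:
    """Fitness function: does the API complete in under `threshold_s`
    seconds at the given percentile?"""
    return percentile(latencies_s, pct) < threshold_s


# Invented measurements (seconds) before and after a hypothetical
# database replication refactoring.
before = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 3.0]
after = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 1.3, 3.0]

passes_before = api_latency_fitness(before)
passes_after = api_latency_fitness(after)
```

Run as part of the build pipeline, a check like this turns the non-functional requirement into a pass/fail signal that is evaluated on every change.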
It follows, therefore, that fitness functions are key enablers of our goal to continuously restructure our
architecture. They allow us to ensure that those system characteristics which need to remain constant
over time actually do so. They both reduce our fear of breaking something inadvertently and give us
the ability to show our stakeholders that we haven’t done so. They represent a physical, tangible
manifestation of our constraints and architectural goals.
5.3.3. Guardrails
Another mechanism that organizations use to bake evolvability into their system architectures is the
concept of architectural guardrails. As with their real-world roadside equivalents, software guardrails
are designed to keep people from straying onto dangerous territory.
In real terms, guardrails represent a lightweight governance structure. They document how an
organization typically "does" things – and how, by implication, development teams are expected to "do"
similar things. For example, a guardrail may document not just the specific availability requirements
for a new service, but also how the organization goes about meeting such requirements. Typically,
guardrails are used in combination with an external oversight team – be this an architecture board,
guild, or program office. The message from such oversight teams is usually simple: if you stick to the
guardrails, you don’t need to justify your architectural choices – we will just approve them. However,
in those situations where you cannot abide by a guardrail, we need to discuss it. If your
reasoning is sound, then we may well agree with you and modify our guardrails, but we reserve the
right to tell you to change your approach if there is no good reason not to abide by the guardrails.
The key to their power is that they are not mandates. They do not impose absolute bans on teams
taking different approaches; rather they encourage creativity and collaboration, and encourage the
evolution of the governance structure itself.
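One way to picture this lightweight governance is as a check that approves choices within the guardrails and flags deviations for discussion rather than rejecting them. The sketch below is purely illustrative: the `GUARDRAILS` contents, the concern names, and the `review` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical guardrails: how this organization typically "does" things.
GUARDRAILS: Dict[str, Set[str]] = {
    "database": {"postgresql"},
    "messaging": {"kafka"},
}


@dataclass
class Review:
    approved: bool          # True when every choice stays within the guardrails
    to_discuss: List[str]   # deviations that need a conversation, not a ban


def review(choices: Dict[str, str]) -> Review:
    """Approve choices that stay within the guardrails; list the rest for discussion."""
    deviations = [
        f"{concern}={choice}"
        for concern, choice in sorted(choices.items())
        if concern in GUARDRAILS and choice not in GUARDRAILS[concern]
    ]
    return Review(approved=not deviations, to_discuss=deviations)


within = review({"database": "postgresql", "messaging": "kafka"})
deviating = review({"database": "mongodb", "messaging": "kafka"})
```

Note that a deviation produces an item to discuss, not an error; if the team's reasoning is sound, the outcome may be a change to `GUARDRAILS` itself.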
• Continuous delivery
• Componentization
In addition, Agile development practices are a key enabler for continuous architectural refactoring. As
described in Chapter 6, Architecting the Digital Enterprise, there are a number of practices which are
promoted by Agile working. These practices allow continuous architectural refactoring to be
successfully implemented; in particular the rapid iteration and experimentation, which allows
architectural evolution to be readily incorporated into ongoing development activities.
A seminal work on the topic by Humble [Humble 2010] converted many software teams to the advantages
of an Agile manifestation of configuration management, automated build, and continuous deployment.
More recently, Forsgren [Forsgren 2018] has statistically illustrated the advantages of continuous
delivery – there is now no question that its adoption will help teams deploy on-demand, get
continuous actionable feedback, and achieve one of the main principles of the Agile Manifesto [Agile
Manifesto]: to "promote sustainable development". It is moreover difficult to achieve scalable
continuous architectural refactoring without it.
Continuous integration and continuous delivery are important elements to support continuous
architectural refactoring. Continuous integration and continuous delivery are often considered as a
single concept, and in many cases are linked by a single implementation. However, this is not a
requirement and for flexibility they will be discussed separately here.
Continuous integration is about developers' work being merged into a single branch frequently. Some
source control tooling makes this the default, but irrespective of the technology choice it is possible to
implement continuous integration with a combination of development practices and build process.
One of the most important elements of continuous integration is the integration of automated testing
into the build process, so that there is confidence in the quality of the code on the main branch at all
times. The key benefit in terms of architectural refactoring is the removal of "long-running" branches,
which militate against architectural change because they extend the window of potential impact of a
change until all branches have merged. In practice this can make it so cumbersome for developers to
manage the impact of architectural change that it prevents the change from happening.
Continuous delivery is about being able to release at any time, which can be realized as releasing on
every commit. It is important to note that in organizations with compliance, regulatory, or other
mandatory checkpoints continuous delivery may not be about a release to production being fully
automated. Rather, the aim of continuous delivery should be that as each change is integrated it should
be possible to release that version, and in particular that the entire team is confident that it is
releasable. The key benefit in terms of architectural refactoring is in empowering the developers to
make architectural changes, knowing that the combination of continuous integration and continuous
delivery will guarantee that the change is non-breaking in terms of functionality and deployment.
It is possible, and in many cases desirable, to evolve gradually toward a continuous integration/delivery
pipeline, rather than trying to reach a fully automated process in one step. The key to this is to understand the
required steps in the process, and work to automate them one at a time. It is also important to look at
the larger environment and make the decision to find the right solution for your organization, even if
that means that some manual checkpoints remain.
Finally, it is key here to take the advice of Humble [Humble 2010]: "in software, when something is
painful, the way to reduce the pain is to do it more frequently, not less". Building towards a
continuous integration/delivery pipeline is hard, which makes it all the more important to do; if you
don’t, the effort of delivering manually will increasingly limit your evolution.
Feature toggles (or feature flags) are an important mechanism in creating an environment to allow
continuous architectural refactoring. They allow features to be developed and included on the main
stream (see Section 5.4.1, “Continuous Delivery”), but without exposing them to end users. This gives
the development team options to factor their work solely based on their needs.
In addition, as described by Kim [Kim 2016] the key enablers arising from the use of feature toggles are
the ability to:
Hodgson [Hodgson 2017] details the different types of feature toggle that exist. Some toggles enable A/B
testing (where several possible solutions are trialed simultaneously, but to different users), some
enable gradual rollouts of new functionality (such as Canary testing), but of particular note to our
discussion on continuous architectural refactoring is the "release toggle". Such toggles allow untested
or incomplete refactorings and restructurings to be released into a production environment, safe in
the knowledge that such code paths will never be accessed.
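A release toggle can be sketched as a simple guard around the restructured code path. The example below is an illustration only: `FeatureToggles`, the toggle name `restructured_search`, and the two search functions are hypothetical, and a real system would typically load toggle state from configuration rather than from an in-memory dictionary.

```python
from typing import Dict, List


class FeatureToggles:
    """Minimal in-memory toggle store (hypothetical)."""

    def __init__(self, flags: Dict[str, bool]) -> None:
        self._flags = dict(flags)

    def is_enabled(self, name: str) -> bool:
        # Unknown toggles default to off, so unfinished code paths stay dark.
        return self._flags.get(name, False)


def legacy_search(items: List[str], term: str) -> List[str]:
    """Existing behavior: case-sensitive substring match."""
    return [item for item in items if term in item]


def restructured_search(items: List[str], term: str) -> List[str]:
    """Refactored path still under development: case-insensitive match."""
    needle = term.lower()
    return [item for item in items if needle in item.lower()]


def search(items: List[str], term: str, toggles: FeatureToggles) -> List[str]:
    """Release toggle: the new path ships to production but is reached
    only when the toggle is switched on."""
    if toggles.is_enabled("restructured_search"):
        return restructured_search(items, term)
    return legacy_search(items, term)


catalog = ["Alpha", "beta"]
in_production = search(catalog, "alp", FeatureToggles({}))
in_trial = search(catalog, "alp", FeatureToggles({"restructured_search": True}))
```

Because the toggle defaults to off, merging the incomplete refactoring onto the main branch does not change what end users see.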
5.4.2. Componentization
The structure of your architecture can play a key role in militating against continuous architectural
refactoring. A monolithic architecture, while not inherently bad, can become a key constraint as an
organization expands or as the need for flexibility increases. As Kim [Kim 2016] observes: "… most
DevOps organizations were hobbled by tightly-coupled, monolithic architectures that – while extremely
successful at helping them achieve product/market fit – put them at risk of organizational failure once
they had to operate at scale …".
The key therefore is to evolve your architecture to have sufficient componentization to support your
organizational evolution on an ongoing basis. The strangler pattern, described in Chapter 15, Strangler
Pattern, can be key in this kind of evolution by creating the space for the implementation to evolve
behind an unchanging API.
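The idea of an implementation evolving behind an unchanging API can be sketched as a facade that routes each operation either to the monolith or to a newly extracted component. This is an assumption-laden illustration, not the treatment given in Chapter 15: the classes `LegacyMonolith`, `OrderComponent`, and `OrdersApi`, and the set of migrated operations, are all hypothetical.

```python
from typing import Dict, Iterable


class LegacyMonolith:
    """Stands in for the existing monolithic implementation."""

    def get_order(self, order_id: str) -> Dict[str, str]:
        return {"id": order_id, "served_by": "monolith"}

    def cancel_order(self, order_id: str) -> Dict[str, str]:
        return {"id": order_id, "status": "cancelled", "served_by": "monolith"}


class OrderComponent:
    """Newly extracted component; so far it only implements get_order."""

    def get_order(self, order_id: str) -> Dict[str, str]:
        return {"id": order_id, "served_by": "order-component"}


class OrdersApi:
    """Stable facade: callers see the same methods throughout the migration."""

    def __init__(
        self,
        legacy: LegacyMonolith,
        replacement: OrderComponent,
        migrated: Iterable[str] = (),
    ) -> None:
        self._legacy = legacy
        self._replacement = replacement
        self._migrated = set(migrated)  # operations already moved over

    def _backend(self, operation: str):
        # Route migrated operations to the new component, the rest to the monolith.
        return self._replacement if operation in self._migrated else self._legacy

    def get_order(self, order_id: str) -> Dict[str, str]:
        return self._backend("get_order").get_order(order_id)

    def cancel_order(self, order_id: str) -> Dict[str, str]:
        return self._backend("cancel_order").cancel_order(order_id)


api = OrdersApi(LegacyMonolith(), OrderComponent(), migrated={"get_order"})
fetched = api.get_order("42")
cancelled = api.cancel_order("42")
```

As more operations are added to the migrated set, the monolith handles less and less traffic, while callers of `OrdersApi` remain unaffected.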
This can be achieved as a staged process moving from a monolithic architecture to a layered
architecture, and on to micro-services, as described by Shoup [Shoup 2014].
Before we continue, it is worth noting that development team structure is also a key enabler for
continuous architectural refactoring, in particular the Inverse Conway Manoeuvre. This technique has
been described separately in Chapter 3, A Dual Transformation.
Management teams have businesses to run, and they have a point. Customers do not typically hand
over money for architectural refactorings, no matter how elegant they are, and without shiny new
things to sell, there may be no money to continue to employ the development teams who want to do
the refactoring.
As such, this issue has two aspects: firstly, development teams need to learn how to justify such
investment; secondly, such non-functional investment will always have to be balanced with functional
requirements.
It is worth at this point returning to the Fowler [Fowler 2019] distinction between code refactoring and
architectural restructuring. Fowler, like the present authors, would be strongly of the opinion that code
refactoring requires no justification; rather it is part of a developer’s "day job". This does not mean that
we have to take on a massive code restructuring exercise for a legacy codebase; on the contrary, there
may be no reason whatsoever to restructure the code for a stable legacy project. However, that said,
developers should refactor their code when the opportunity arises. Such activity constitutes a "Type 2"
decision as documented in Chapter 9, Minimum Viable Architecture.
Architectural refactoring (restructuring), however, often requires explicit investment because the
required effort is significant. In such cases, it is incumbent on development teams and architects to
"sell" the refactoring in monetary, time, or customer success terms. For example, "if we perform
refactoring A, the build for Product B will be reduced by seven minutes, resulting in us being able to
deploy C times more frequently per day"; or, "implementing refactoring D will directly address key
customer E’s escalated pain-point; their subscription and support fee is $12 million per
annum". Note, however, that claims that "refactoring F will make us G% more productive" should be
avoided as software productivity is notoriously difficult to measure.
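As an illustration of expressing such a justification in time terms, the following minimal Python sketch estimates how a seven-minute build reduction changes the number of sequential deployments that fit in a day. The eight-hour deployment window, the 27-minute starting build time, and the function name are assumptions made for this example only; real figures would come from the team's own pipeline measurements.

```python
# All figures below are hypothetical and serve only to illustrate the arithmetic.
WORKING_MINUTES_PER_DAY = 8 * 60  # assumed window in which deployments can run


def max_deployments_per_day(build_minutes: float,
                            window_minutes: float = WORKING_MINUTES_PER_DAY) -> int:
    """Upper bound on sequential build-and-deploy cycles that fit in the window."""
    if build_minutes <= 0:
        raise ValueError("build_minutes must be positive")
    return int(window_minutes // build_minutes)


before = max_deployments_per_day(build_minutes=27)  # assumed current build time
after = max_deployments_per_day(build_minutes=20)   # seven minutes shorter

print(f"before: {before} deployments/day, after: {after} deployments/day")
print(f"improvement factor: {after / before:.2f}x")
```

A statement of this kind is easier to defend than a productivity percentage because every input can be measured directly.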
• Vision: a target end state is key to assessing individual changes as moving towards the target state
• Step-wise: a number of intermediate states need to be described between the "as is" and "to be"
architectures with the benefits and challenges of each state documented
• Flexible: the target and intermediate states may evolve as the understanding of the architecture
and the constraints themselves evolve
• Open: a successful architecture is rarely defined by a committee, but the process and
documentation of the architectural roadmap needs to be available to the whole team, and everyone
must feel empowered to comment/question
In order to create the space for the Agile implementation, it is also important that the roadmap
remains high-level. There is a tension here between the need to keep the project within its constraints,
while giving the team the space and support to make Agile decisions as they are implementing the
architectural roadmap. Beyond the roadmap and in particular the vision of a target architecture,
guardrails (see above) are key to supporting and enabling emergent architecture, while allowing the
overall architecture to remain effective and meet all of its identified requirements.
In particular, our suggested aim is to create an environment where the risk of architectural change can
be removed by the supporting conditions, allowing the team the freedom to make architectural
changes, knowing that the process and culture will support them. To quote from Kim [Kim 2013]:
"Furthermore, there is hypothesis-driven culture, requiring everyone to be a scientist, taking no
assumption for granted, and doing nothing without measuring." The measuring of the impact of
architectural change was discussed in Section 5.3.2, “Fitness Functions”.
In the authors' experience this can also have a varying focus over time; sometimes the business needs
to "win" and the focus shifts to business features at the expense of architectural evolution. But it is
critical that the environment for architectural evolution persists, so that if and when the focus shifts
back to architectural concerns, the option to continue evolving the architecture remains open.
Value delivered to customers as well as operational efficiency are core concerns of any
enterprise. While this is not new, Digital Transformation reinforces the importance of these
concerns. It also drives change in key areas:
• Clients' experience expectations are shaped by Internet giants; for example, Amazon Prime
where you can follow your delivery in real time using your smartphone - the digital
enterprise is experience-driven
• Delivering a superior client experience impacts the enterprise end-to-end; for example,
operating model weaknesses are likely to result in client dissatisfaction and pain
• Digital is about creating and delivering innovative products or services that meet clients’
explicit and implicit needs
• Methods such as design thinking or the analysis of the “job-to-be-done” help invent
innovative products and services
• Fast learning cycles to quickly experiment with new products and services thanks to Lean
Startup MVPs combined with DevOps’s rapid continuous deployment
The authors of an article on Digital Transformation have surveyed more than 20,000 business
executives, managers, and analysts [Kane 2019]. They conclude that leaders facing the challenges of
digital disruption need to possess three distinctive skills:
• Adaptability
• Digital literacy
The first level of innovation materializes into digital offerings that can meet underserved customer
needs. The next innovation level is about inventing disruptive business models that provide
sustainable competitive advantages and can sometimes disrupt industries.
New digital offerings and business models may require the enterprise to develop new capabilities.
Before committing too many investment dollars, digital leaders validate their market strategy by
experimenting with Minimum Viable Products (MVPs) with customers. A few learning cycles may be
required before committing significant investments in new offerings or business models. This requires
adaptability.
Enterprises that pre-date the age of software see a growing portion of their spend shifting to
technology as their market success is increasingly determined by software. However, the productivity
of software delivery of the clear majority of enterprises falls woefully behind that of tech giants
[Kersten 2018]. Therefore, digital literacy of enterprise leaders is critically important to help them hire
the right technology people and recognize when they receive bad technology advice.
Figure 4, “Architecting the Digital Enterprise” shows a digital architecture that is developed
concurrently and follows two key principles: outcome-driven and modular.
Outcome-Driven
Looking at products with an outside-in perspective requires a shift from outputs to outcomes. An output is
what is created at the end of a process. Outputs tell the story of what you produced or your
organization’s activities. A very large company in the car industry recently said it was looking at
product as "how it is used by a customer rather than how it is created/delivered". Does this mean that
the firm is no longer a car producer? Absolutely not. This illustrates that the customer job
perspective helps discover different and sometimes innovative ways to satisfy customer needs.
Outcomes are the effects produced by using an enterprise’s products and services. As stated by Karl
Hellman: "outcomes are the benefits your customers receive from using your stuff <…> This requires a
true understanding of customers’ needs — their challenges, issues, constraints, priorities — by walking in
their shoes and in their neighborhoods" [Hellman 2018].
• Describing the outcomes you want to achieve: why is your customer using, or why would they want
to use, your product?
• Associating quantitative measures with these outcomes (e.g., % of clients demonstrating new behavior,
% of clients coming back into treatment)
Because there is more than one way to deliver desired outcomes, more than one product can deliver them.
This opens the range of possible solutions. A product owner should assess the qualities of each
candidate product. She should ensure desired outcomes are linked to the product’s outputs or
activities. In other words, she needs to be confident that the operating model supporting her product
can reasonably deliver the customer’s desired outcomes.
Modular
Modularity is about decomposing a system into parts that are loosely-coupled. In this section the term
system refers to any type of system from human or social to technical. Chapter 4, What is Architecture?
provides a comprehensive description of systems thinking.
• Enable parallel work to dramatically shorten capability or product development lead times
• Changes in one part of the system have limited impact on other parts of the system which makes it
more resilient to change
• Failures of one part of the system are less likely to propagate to other parts of the system
For example, cloud-native computing promotes an architecture style that decomposes software
systems into services that have well-defined boundaries. Changes in one service have a limited impact
on other services and failures are easier to isolate which makes the system more resilient.
• Discovering what your customers want and how competition (if any) provides it
"A positioning statement defines the value proposition of products to the target: … the point of difference
(reason to buy) and the point of parity (point of reference)" [Wharton 2018].
Strategic marketing provides context to help drive digital strategy. The enterprise can choose to
develop new capabilities to serve targeted client segments. Traditional strategic marketing is mostly
developed in a top-down manner.
The difficult part is changing an existing business model or creating a brand new one that works.
Following a generic template is no guarantee of success.
Successful business models start with an innovative value proposition that has been field tested. The
Amazon Flywheel illustrates how systems thinking can help design successful business models. "Simply
put, a flywheel is a self-reinforcing loop or systems diagram driven by key objectives or initiatives"
[Rossman 2019].
John Rossman’s Idea 25 states: "Study and analyze either your industry or the situation you are trying to
improve using systems thinking. Once you have an idea or hypothesis on how to achieve your goal, create
a simple version of your system, often called a “flywheel”, to assist in testing your strategy and then in
communicating your logic and plan to others".
Digital enterprises can pursue both differentiation and low cost. This creates a leap in value for both
the enterprise and customers. The Amazon Prime Now™ service epitomizes this. Customers can
purchase a large variety of products at a competitive price and get a free two-hour delivery at their
home (https://primenow.amazon.com).
Product variety is increased through a platform that allows third-party vendors to sell and deliver
their products leveraging Amazon’s portal and logistics capabilities. The Amazon business model is
characterized by:
• A two-sided market platform business model with customers on one side of the market and third-
party vendors on the other
• A logistics capability which translates into a superior experience allowing the customer to track
delivery in real time using any device
The Amazon example illustrates how firms can compete by leveraging difficult-to-replicate capabilities
which are powered by a digital platform that supports an adaptive operating model.
Design thinking postulates that to create meaningful innovations, the enterprise needs to know its
customers and care about their lives [Stanford 2010].
The concept of job-to-be-done is a key tool, developed by Clayton Christensen. It helps better
understand customer needs by wrapping offered products and services in a usage context.
"After decades of watching great companies fail, we’ve come to the conclusion that the focus on
correlation – and on knowing more and more about customers – is taking firms in the wrong direction.
What they really need to home in on is the progress that the customer is trying to make in a given
circumstance – what the customer hopes to accomplish. This is what we’ve come to call the job-to-be-
done."
New market segmentation methods emerge from analyzing jobs-to-be-done. Customers can now be
classified based on the outcomes they expect. Unlike abstract client segments, personas help to better
understand who the people targeted by the enterprise are, and what their pains and expected gains are.
The four remaining steps of design thinking help the enterprise formulate a better problem statement
(define), generate a broad range of ideas (ideate), and prototype and test candidate solutions.
Validated customer insights provide key inputs to help define innovative value propositions.
Customer journey maps come in many shapes and forms. For example, customer maps can follow a
timeline positioning activities or phases chronologically.
In addition to describing the customer’s functional job, a customer journey map captures the feelings
of customers during moments of truth. It can also capture how a customer believes she is perceived
socially.
Platform-based business models are based on the two-sided markets theory developed by the
Economics Nobel prizewinner Jean Tirole. When platform-based businesses enter markets dominated
by "pipelines", they enjoy a competitive advantage. Why? Because pipelines rely on inefficient
gatekeepers to manage the flow of value, whereas platforms promote self-service and direct interactions
between participants.
A platform can scale and grow more rapidly and efficiently because the traditional gatekeeper is
replaced by signals provided by market participants through a platform that acts as a mediator.
Platforms stimulate growth because they expose new supply and unlock new demand. They also use
big/fast data and analytics capabilities to create community feedback loops [Parker 2016].
Technology is a key enabler of platform-based business models because high levels of automation and
self-service capabilities are required to succeed. For example, when Amazon decided to develop a
third-party sellers' market, one of the key requirements was: "A third-party seller, in the middle of the
night without talking to anyone, would be able to register, list an item, fulfill an order, and delight a
customer as though Amazon the retailer had received the order." [Rossman 2019]
This document makes a clear distinction between the platform business model which is based on the
two-sided market theory and the digital platform which is a technology enabler that allows enterprises
to achieve economic gains by reusing or redeploying assets across families of products.
A digital platform is a software system composed of application and infrastructure components that
can be rapidly reconfigured using:
• DevOps to dramatically reduce the "requirement to deploy" lead time which is key to reducing
digital offerings' time-to-market
• Cloud Native Computing to bring agility, scalability, and resilience to the operating model
The definition in this document is compatible with Jeanne Ross' definition from the MIT CISR which
defines a Digital Platform as: "a repository of business, data, and infrastructure components used to
rapidly configure digital offerings" [Ross 2019].
Most established enterprises have monolithic legacy systems that get in the way of creating effective
and efficient digital platforms. To compete with Internet giants or more nimble competitors, these
enterprises must develop refactoring and modernization strategies.
offerings inspired by customer insights and powered by a digital platform to deliver differentiated
outcomes to customers. "A digital offering is the confluence of a customer solution and a great
experience" says Jeanne Ross [Ross 2018].
The most advanced enterprises use a Lean product development approach to create digital offerings
[Oosterval 2010] [Morgan 2019].
The Lean Startup best practices are helping to market test digital offerings before enterprises invest
too much in them [Ries 2011].
Figure 5, “Revised Service Blueprint”, inspired by service blueprinting, helps bridge customer journeys
with required capabilities. The example describes a simplified loan origination journey. Variants of this
journey can be created to account for a different channel mix.
The top part of the diagram borrows from a journey map. The bottom part below the line-of-visibility
describes the capabilities that are required to support the story map. Capabilities are implemented by
services that are described in functional terms and/or specified using APIs and/or business events.
Architecting a set of modular and reusable services will allow for rapid reconfiguration of customer
journeys. System architecting techniques combined with Domain-Driven Design (DDD) can help design
loosely-coupled services that will be easier to assemble into unanticipated composite services. An
adaptive operating model takes advantage of modularity and composability to gracefully adapt to
changing customer experience requirements.
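The idea of reconfiguring customer journeys from modular services can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the service names, the business event names, and the two channel variants are hypothetical and are not taken from Figure 5. Each service exposes the same narrow interface, which is what allows variants of the loan origination journey to be assembled without changing the services themselves.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoanApplication:
    """State carried along a simplified loan origination journey."""
    applicant: str
    amount: float
    events: List[str] = field(default_factory=list)  # business events emitted so far


# A service is modeled as a function with one narrow, explicit interface.
Service = Callable[[LoanApplication], LoanApplication]


def verify_identity(app: LoanApplication) -> LoanApplication:
    app.events.append("IdentityVerified")
    return app


def score_credit(app: LoanApplication) -> LoanApplication:
    app.events.append("CreditScored")
    return app


def sign_in_branch(app: LoanApplication) -> LoanApplication:
    app.events.append("ContractSignedInBranch")
    return app


def sign_electronically(app: LoanApplication) -> LoanApplication:
    app.events.append("ContractSignedElectronically")
    return app


def compose_journey(*services: Service) -> Service:
    """Assemble independent services into a composite journey."""
    def journey(app: LoanApplication) -> LoanApplication:
        for service in services:
            app = service(app)
        return app
    return journey


# Two variants of the same journey for a different channel mix.
branch_journey = compose_journey(verify_identity, score_credit, sign_in_branch)
mobile_journey = compose_journey(verify_identity, score_credit, sign_electronically)

print(mobile_journey(LoanApplication("A. Customer", 10_000)).events)
```

Because the services share no hidden state, a new channel variant is a new composition rather than a change to existing code.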
6.8. Accountability
In a paper published in the Sloan Management Review, Jeanne Ross suggests that enterprises "initiate change by
assigning accountabilities for specific business outcomes to small teams or individual problem owners"
[Ross 2018].
In many enterprises this change impacts the formal organizational structure. For example, several
banks are adopting the Spotify model to re-organize the IT function and some such as ING or ANZ are
also re-organizing the business the same way.
Changing the organizational structure is not enough. It is key to change the culture, the ways of
working, and the management system. Accountability is a powerful alignment mechanism when teams
trust each other. Trust requires visibility and predictability which are not the hallmark of "command
and control" organizations.
• Look at one solution at a time and change it only when problems arise
• Disregard new customer or technical knowledge that would challenge early design decisions
The resulting solution is likely to require significant rework and will be suboptimal because good
concepts are eliminated too early.
• Aggressively attacks those solutions with rapid, low-cost analysis and tests, progressively
eliminating weak solutions
• Uses the analysis and test results to define the limits of the possible
"Taking time up-front to explore and document feasible solutions from design and manufacturing
perspectives leads to tremendous gains in efficiency and product integration later in the process and for
subsequent development cycles" [Sobek 1999].
In our context concurrent engineering brings agility and facilitates innovation, for example:
• The enterprise can evolve from a product distribution model toward the development of a two-
sided market
• Operating models evolve alongside the enterprise Agile Transformation as organizations become
flatter and cross-functional
• The accountability framework strengthens as the enterprise’s culture evolves toward increased
teams' autonomy
• Fact-based decision-making
The management systems evolve to promote a mix of freedom balanced by clear accountable roles.
Freedom is required to empower teams to rapidly make decisions closer to the field. Accountability in
an Agile organization is not about controlling people; it is about a two-way exchange where you agree
to deliver something to another person.
In an Agile organization an employee is accountable to her peers, her manager, and her clients.
Managers are accountable to their teams, the board of directors, and society. The management system
cascades goals at all levels of the organization and promotes a constructive dialog to help set up
accountability relationships between employees and managers. The reward system recognizes
individual performance while promoting collaboration.
The organizational structure is flattened. Autonomous cross-functional teams often named "feature
teams" or "squads" are formed. Cross-functional roles emerge to help construct robust communities of
practice often named "chapters" or "guilds". Resource allocation is flexible and driven by the evolution
of demand or activity level.
The left part of Figure 6, “Agile Transformation” represents the three transformation dimensions we
have introduced, plus one which is the enterprise’s culture. Culture evolution results from changes in
the three dimensions. For culture change to take hold, people have to experience success operating in
the new Agile organization.
The middle part of the figure lists a few important questions that the enterprise needs to address. For
example, the waterfall scenario is likely to create intermediary stages that are suboptimal. New ways
of working may conflict with the existing management system. The enterprise which deploys a new
management system on top of an existing organizational structure will have to redeploy it at a later
time.
When those conditions are not met, it is better to change the organizational structure before deploying
new ways of working and the new management system. The new organizational structure reflects an
architectural intent which is either implicit or explicit.
Pools of software development resources were organized by technologies; for example, server-side
Java® development or mobile front-end development. Software development projects would on-board
required resources for the duration of the project and release them when done. Last but not least,
IT operations and support was separate from application development.
Though the specialization of the legacy organization was supposed to help the IT organization capture
The level of inter-team dependency is high because of the multiplication of organizational entities that
have to intervene on a project. Time-to-market is increasing due to the high level of inter-silo
coordination required; for example, between development and IT operations teams. Alignment
between business and IT is complicated because of the lack of a simple mapping between
organizational structures.
The IT re-organization was inspired by the Spotify model. Small cross-functional teams named
"squads" replaced projects. The size of a squad does not exceed 10 to 15 people. Unlike a project, a
squad is a stable team composed of dedicated resources that develops an end-to-end perspective.
Squads are product-centric, meaning they develop, deploy, and operate a service. Squads adopt a
DevOps culture which translates into the mantra "you build it, you run it".
Squads are grouped into tribes which are not bigger, on average, than 150 people. In order to maintain
and develop functional or technical competencies, chapters and guilds are created. For example,
chapters may bring together mobile developers or NoSQL DBMS experts.
As the number of Agile teams grows with a few hundred squads running in parallel, it is important to
define a taxonomy of Agile teams that clearly defines the scope of each one and minimizes inter-team
dependencies. True Enterprise Architecture thinking is required to discover an effective way to
decompose the organization and to draw boundaries that minimize inter-team dependencies.
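One simple way to compare candidate team boundaries is to count the dependencies that cross them. The following Python sketch is only an illustration: the component names, the dependency list, and the two candidate taxonomies (one organized by technology, one by product) are hypothetical, and a real assessment would weigh dependencies rather than just count them.

```python
from typing import Dict, Iterable, Tuple

Dependency = Tuple[str, str]  # (component, component it depends on)

# Hypothetical component dependencies.
DEPENDENCIES = [
    ("payments_mobile_ui", "payments_api"),
    ("loans_mobile_ui", "loans_api"),
    ("loans_api", "payments_api"),
]

# Candidate taxonomy 1: teams organized by technology.
BY_TECHNOLOGY = {
    "payments_mobile_ui": "mobile team",
    "loans_mobile_ui": "mobile team",
    "payments_api": "server-side team",
    "loans_api": "server-side team",
}

# Candidate taxonomy 2: product-centric squads.
BY_PRODUCT = {
    "payments_mobile_ui": "payments squad",
    "payments_api": "payments squad",
    "loans_mobile_ui": "loans squad",
    "loans_api": "loans squad",
}


def count_inter_team_dependencies(assignment: Dict[str, str],
                                  dependencies: Iterable[Dependency]) -> int:
    """Count dependencies whose two ends are owned by different teams."""
    return sum(1 for source, target in dependencies
               if assignment[source] != assignment[target])


print("by technology:", count_inter_team_dependencies(BY_TECHNOLOGY, DEPENDENCIES))
print("by product:", count_inter_team_dependencies(BY_PRODUCT, DEPENDENCIES))
```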
The primary goal of an Agile teams' taxonomy is to minimize redundancy and duplication [Rigby
2018]. Because an Agile teams' taxonomy may be different from the formal P&L structure of the
enterprise, it is necessary to map it to the division, business unit, and P&L structure of the enterprise.
The scope of this initial model was defined by two words: process and software. As the Capability
Maturity Model gained in influence, the underlying approach was applied to other domains. For
example, in 1995 the SEI developed a People Capability Maturity Model (PCMM®) [SEI 1995]. More
recently, the SEI made public another maturity model, the Smart Grid Maturity Model (SGMM) [SEI
2018].
Other parties inspired by these maturity models created their own to be applied to different domains.
For example, the US Department of Commerce developed a model for Enterprise Architecture [DoC
2007].
The original SEI CMM for software became CMMI®. The latest version, CMMI V2.0, is managed by the
CMMI Institute which is an ISACA® Enterprise.
We propose reviewing maturity levels as defined by CMMI V2.0 to inspire the definition of the maturity
levels in this document. Not all maturity dimensions are process-related, therefore we borrow the
CMMI V2.0 "Practice Area" terminology which is more general than "Process Area". In the next sections
we will define the maturity levels and corresponding practice areas.
Because Agile at scale shifts from project to product and organization-wide practices are not limited to
standards, we need to define specific maturity levels.
Figure 10, “O-AAF Maturity Levels” introduces the maturity levels we propose.
For each maturity level, the enterprise should develop specific architecture practices.
We will now identify the practice areas that need to be analyzed for each maturity level.
The practice areas table above can be used to assess the maturity level of an Agile enterprise. In the
next version of this document, we will develop a maturity assessment method which will be described
in a playbook.
The O-AAF playbooks provide guidelines to solve a particular Agile Architecture problem. For example,
how to adapt governance or how to handle legacy systems when developing a digital platform.
This part of the document is composed of a set of playbooks that architects can activate to meet the
specific objectives and context of an enterprise. Each playbook is self-contained though it describes
prerequisites that can constrain the order in which playbooks are activated.
Context
In his book The Lean Startup [Ries 2011], Eric Ries coined the term Minimum Viable Product (MVP)
defined as: "that version of the product that enables a full turn of the Build-Measure-Learn loop with a
minimum amount of effort and the least amount of development time".
An MVP needs to be placed in front of customers to discover and analyze their reactions. Unlike a
prototype whose quality is assessed by engineers and designers, the purpose of an MVP is to assess
whether or not a product meets customers' expectations and if they would pay for it. When the
experiment fails, meaning that customers are unlikely to buy, the product owner can pivot to a revised
product concept or give up and stop development. The MVP helps save money because it minimizes
the time and investment required to experimentally verify the product concept.
The term MVP is becoming very popular and is often used to mean something different; for example,
justifying the development of a poor-quality prototype. Jumping on the MVP bandwagon, some agilists
coined the term Minimum Viable Architecture (MVA).
The MVA concept means different things to different people; for example:
• The "architecture that enables the delivery of the core product features to be deployed in a given
phase of a project and satisfied known requirements" [Erder 2016]
• Just enough or good enough architecture as opposed to big up-front design or heavy investment
in "plumbing"
To say the least, the MVA concept lacks clarity and we need to reframe the way the problem is defined.
Wording such as building architecture or architecture runway may give the impression that
architecture means infrastructure or platform. Infrastructures and platforms implement architecture
models but should not be confused with architecture which is about the fundamental concepts and
properties of a system. Therefore, we suggest distinguishing two types of decisions: true
architecture decisions, and infrastructure sourcing and provisioning decisions.
Taking the example of an MVP, we suggest distinguishing architecture decisions that pre-condition its
development, from investment decisions to fund its development and production environments.
An enterprise can be modeled as a complex system. A large enterprise may be composed of many
divisions and departments; it can operate in many regions of the world and it can market a large
number of products and services. When Agile at scale is deployed, the minimum architecture is the one
required to define an Agile team’s taxonomy as described in Chapter 7, Architecting the Agile
Transformation.
In an Agile culture where team autonomy is valued, architecture is the result of a problem-solving
process that starts from an intentional architecture vision which is challenged, amended, and completed
by Agile teams.
What is the minimum definition of this intentional architecture vision? How should the dialog
between the owner of the architecture vision and Agile teams be conducted? It depends on the context. The
heuristics described in the next section can help architects answer these questions.
Jeff Bezos, CEO of Amazon, distinguishes two types of decision: type 1 and type 2. "Type 2 decisions are
changeable and reversible; they are two-way doors. If you’ve made a suboptimal type 2 decision, you don’t
have to live with the consequences for that long. Type 2 decisions can and should be made quickly by high
judgment individuals or small groups." Type 2 decisions can be made by autonomous Agile teams, while
type 1 decisions require architecture thinking. Reversing type 1 decisions has the potential of creating
a lot of rework which is wasteful. This is why the right timing of type 1 decisions is critical.
Impacting decisions, mostly type 1, should be delayed until the probability that they would be
questioned down the line is low enough. The right timing of type 1 decisions is a double-edged sword.
If they are made too late, the resulting ambiguity is likely to significantly slow down design activities.
If they are made too early, reversing them is likely to add significant delay and wasteful rework.
Lean product and process development promotes Set-Based Concurrent Engineering (SBCE) which
helps to optimize when architecture decisions are to be made. Unlike traditional methods which lock
design decisions too early or Agile methods which leave design decisions open too long, SBCE allows
the final architecture design to emerge from teams' learning [Ward 2014].
• The product leadership team often picks a fundamental concept, sometimes even before the project
begins
• The product team details the concept through the identification of sub-systems
Architecture and design decision incompatibilities are often identified too late in the process, which is
a major cause of rework. The solution space is constrained too early in the process which can result in
suboptimal architecture decisions.
• The product team breaks the system recursively into sub-systems until the right level of granularity
is reached
• Multiple concepts for the system and each sub-system are created
• The team filters these concepts by thorough evaluation, eliminating concepts that don’t fit with
each other
• Failure information goes into a trade-off curve knowledge base that guides architecture design
• As they filter, there is rapid convergence toward a solution that is often more elegant and
innovative than the one conventional point-based development would have produced
"The set of design alternatives shrinks because they are eliminated, and as trade-off curves are developed,
the remaining alternatives are developed to increasing levels of fidelity. Simultaneously, target ranges
narrow in corresponding stages, converging on values that provide the best set of trade-offs."
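The narrowing of a design set can be sketched in Python as follows. This is a simplified illustration, not a full SBCE method: the sub-systems, candidate concepts, and incompatibility findings are hypothetical, and real trade-off curves carry far more information than a pairwise incompatibility. The sketch only shows the core mechanic of keeping many alternatives open and eliminating combinations as analysis and test results arrive.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set, Tuple

Combo = Tuple[str, ...]  # one concept per sub-system, ordered by sub-system name

# Hypothetical candidate concepts for each sub-system.
candidates: Dict[str, Set[str]] = {
    "api": {"rest", "grpc"},
    "deployment": {"containers", "serverless"},
    "storage": {"rdbms", "document_store"},
}

# Hypothetical findings from low-cost analysis and tests:
# each finding names two concepts that do not fit with each other.
findings: List[FrozenSet[str]] = [
    frozenset({"rdbms", "serverless"}),
    frozenset({"grpc", "serverless"}),
    frozenset({"document_store", "grpc"}),
]


def all_combinations(options: Dict[str, Set[str]]) -> List[Combo]:
    """Start with every combination of concepts still on the table."""
    names = sorted(options)
    return list(product(*(sorted(options[name]) for name in names)))


def eliminate(design_set: List[Combo], finding: FrozenSet[str]) -> List[Combo]:
    """Drop every combination that contains both incompatible concepts."""
    return [combo for combo in design_set if not finding <= set(combo)]


design_set = all_combinations(candidates)
sizes = [len(design_set)]
for finding in findings:
    design_set = eliminate(design_set, finding)
    sizes.append(len(design_set))

print("design set size after each finding:", sizes)
print("remaining alternatives:", design_set)
```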
The authors of Building Evolutionary Architectures [Ford 2017] describe an evolutionary architecture
as supporting "guided, incremental change across multiple dimensions". The key idea is to enable the
incremental development of the product or system while preserving non-functional requirements such as
scalability, elasticity, or resilience. As the product or system evolves during Agile iterations, its
architecture qualities should not degrade over time.
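One way to check that architecture qualities do not degrade between iterations is to express them as automated checks, often called fitness functions. The Python sketch below is a minimal illustration under assumptions: the metric names, the thresholds, and the sample measurements are hypothetical, and in practice the measurements would come from monitoring or build tooling.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]

# Hypothetical fitness functions: each maps measured metrics to pass/fail.
FITNESS_FUNCTIONS: Dict[str, Callable[[Metrics], bool]] = {
    "p95 latency below 300 ms": lambda m: m["p95_latency_ms"] < 300,
    "error rate below 1%": lambda m: m["error_rate"] < 0.01,
    "no dependency cycles": lambda m: m["dependency_cycles"] == 0,
}


def degraded_qualities(metrics: Metrics) -> List[str]:
    """Return the names of the fitness functions that the metrics fail."""
    return [name for name, check in FITNESS_FUNCTIONS.items() if not check(metrics)]


# Hypothetical measurements taken at the end of two iterations.
iteration_1 = {"p95_latency_ms": 220.0, "error_rate": 0.002, "dependency_cycles": 0}
iteration_2 = {"p95_latency_ms": 340.0, "error_rate": 0.004, "dependency_cycles": 0}

print("iteration 1:", degraded_qualities(iteration_1))
print("iteration 2:", degraded_qualities(iteration_2))
```

Running such checks on every iteration turns "should not degrade" into a condition that can fail a build.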
The methods used to improve evolvability depend on the type of architecture. For example, loose
coupling and separation of concerns help software systems evolve more gracefully. Let us
illustrate this with clean architectures. Clean architectures such as the hexagonal one isolate the
domain logic (the core) from non-core domain concerns such as inputs and outputs or persistence
mechanisms. The code that implements the domain logic is protected from changes that could impact,
for example, persistence mechanisms. The initial version of a piece of software could use an RDBMS
Transforming type 1 decisions into type 2 decisions by making them easier to reverse also contributes
to evolvability.
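The isolation of domain logic from persistence described above can be sketched in Python. This is a minimal illustration, not a reference implementation of hexagonal architecture: the repository port, the two adapters, and the customer example are all hypothetical names introduced here. The domain function depends only on the port, so either adapter can be swapped in without touching it.

```python
import sqlite3
from typing import Dict, Optional, Protocol


class CustomerRepository(Protocol):
    """Port: the only persistence interface the domain logic knows about."""

    def save(self, customer_id: str, name: str) -> None: ...

    def find_name(self, customer_id: str) -> Optional[str]: ...


class InMemoryCustomerRepository:
    """Adapter backed by a dictionary (hypothetical, e.g. for tests)."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, customer_id: str, name: str) -> None:
        self._rows[customer_id] = name

    def find_name(self, customer_id: str) -> Optional[str]:
        return self._rows.get(customer_id)


class SqliteCustomerRepository:
    """Adapter backed by a relational database (SQLite from the standard library)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers (id TEXT PRIMARY KEY, name TEXT)"
        )

    def save(self, customer_id: str, name: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)",
            (customer_id, name),
        )

    def find_name(self, customer_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def register_customer(repo: CustomerRepository, customer_id: str, name: str) -> str:
    """Domain logic (the core): unaware of which adapter stores the data."""
    cleaned = name.strip().title()
    if not cleaned:
        raise ValueError("customer name must not be empty")
    repo.save(customer_id, cleaned)
    return cleaned


# The same domain function works unchanged with either adapter.
for repository in (InMemoryCustomerRepository(),
                   SqliteCustomerRepository(sqlite3.connect(":memory:"))):
    register_customer(repository, "c1", "  ada lovelace ")
    print(type(repository).__name__, repository.find_name("c1"))
```

Choosing the persistence mechanism becomes easier to reverse, because the change is confined to one adapter.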
Sacrificial Architecture
Martin Fowler coined the term "sacrificial architecture" [Fowler 2014] to designate situations where a
team deliberately chooses to throw away a codebase. Martin Fowler lists a few examples of companies
who have done it, such as eBay® or Google®.
When the goal is to get rapid market feedback experimenting with an MVP, a sacrificial architecture is
an option to consider as it would not be worth spending too much time designing an architecture that
would have to change should the product owner decide to pivot.
Figure 13, “MVA – Forces and Heuristics” shows the relationships that link heuristics as well as the
forces that influence architecture decisions and their timing.
Type 1 decisions can be delayed until the last possible moment, transformed into type 2 decisions by
making the architecture more evolvable, or avoided with the creation of a sacrificial architecture.
Many forces influence the structure and timing of the architecture decision space:
◦ The product development process is about creating verified and validated knowledge on
customers, technology, and integration issues. The more unknowns, the higher risk that type 1
decisions would be reversed resulting in rework, higher costs, and delays.
◦ It helps to determine the last responsible moment to make an architecture decision. The Lean
SBCE strategy optimizes the speed/risk equation.
• Other contextual forces must also be factored in, in particular the organizational culture, the
volatility of customer needs, and the stability of purpose.
When architecture decisions are made, it is important to document the motivations behind them. A
minimum architecture documentation should be composed of a collection of Architecture Decision
Records (ADRs) [Nygard 2011]. Each ADR describes a set of forces and a single decision in response to
those forces. A simple command line tool such as https://github.com/npryce/adr-tools provides a
lightweight way of documenting an architecture.
In their book "Six Sigma for Financial Services" Rowland Hayler and Michael D. Nichols describe
process architecture as "understand our organization’s end-to-end processes and how they fit together to
maximize value" [Hayler 2006].
Most organizations describe their process architecture using some hierarchical modeling scheme. For
example, the enterprise is decomposed into major process areas which are decomposed into process
groups that are composed of processes.
This playbook will define the process concept and compare it to Lean value stream. It will also show
how the rise in complexity of the digital enterprise requires going beyond process architecture toward
rethinking operating models. It will introduce the concept of the adaptive operating model which can
gracefully evolve while it preserves and improves both effectiveness and efficiency.
The BPR movement enjoyed rapid growth for a few years before it fell out of favor following a number
of large-scale BPR failures.
Hammer and Champy define process as: "a collection of activities that takes one or more kinds of input
and creates an output that is of value to the customer" [Hammer 1993].
At about the same time, James Harrington [Harrington 1991] created a Business Process Improvement
(BPI) method that defined the foundations of modern business process management practices.
James Harrington defines a process as: "any activity or group of activities that takes an input, adds value
to it, and provides an output to an internal or external customer … There is no product and/or service
without a process. Likewise, there is no process without a product or a service."
• Making processes adaptable – being able to adapt to changing customer or business needs
A new role is defined – "process owner" – who is accountable for how well the process performs.
The method defines key process improvement concepts that are very similar to equivalent Lean
concepts. Because the text was written at about the same time as The Machine that Changed the World
[Womack 2007], we formulate the assumption that the authors (Harrington and Womack) either
shared the same sources or invented the same concepts at the same time.
• Real Value-Added (RVA) activities that are required from a customer perspective to provide the output the
customer is expecting
• Non-Value Activities (NVA) that do not contribute to meeting customer requirements, and could be
eliminated without degrading the product or service functionality
NVA are often activities that exist because the process is inadequately designed, or the process is not
functioning as designed.
BPI aims at reducing the process cycle time. It proposes a set of heuristics to compress cycle time. The
equivalent Lean concept is the reduction of lead time.
BPI recognizes the need for big picture improvement when making incremental process improvement
does not bring the desired result. The big picture technique requires stepping out of today’s processes
and defining what perfect processes would be without the constraints of the present organization,
processes, and/or technologies.
Process architecture is especially relevant to help understand when and how big picture improvement
is required because it helps connect the operating model level to the individual process level. That is
why it is useful in the context of Digital Transformation because operating models are likely to change.
Last but not least, the author explains why feedback systems are very important. He recommends
relating feedback to the individuals performing the tasks, so that they quickly understand their impact
on quality, and giving them the responsibility to take immediate action. This last point is likely to
have been influenced by the Toyota® practice of the andon cord.
The Lean value stream definition is similar to James Harrington’s process definition. So why has Lean
settled for a different terminology? James Womack, a co-founder of LEI, explains that the word
"process" has different meanings (i.e., confusing process and procedure), therefore using value stream
instead of process helps to remove ambiguity.
James Harrington defines the process owner role, while Lean defines the value stream manager or
leader role.
The Lean definition brings an important distinction: development value streams to develop products
and processes versus operational value streams.
Zeynep Ton observes: "Higher product variety and more promotions, in particular, increase costs all
throughout the supply chain … More product variety and promotions also increase the likelihood of
errors and operational problems in the stores." [Ton 2014].
Michael George recommends eliminating complexity the customer will not pay for and exploiting the
complexity customers will pay for. He also recommends minimizing the costs of complexity offered
[George 2004]. His approach includes analyzing core processes, identifying product families, and
creating complexity value stream maps.
Multiplication of Touchpoints
In their book Beyond Advertising, the authors recommend thinking about brand ecosystems defined as
"the brand’s multiple touchpoints and how they interact with each other, from a digital out-of-home
experience to a tablet, from mobile to the store" [Wind 2016].
The authors observe that all interactions with a brand, from the first time you become aware that it
exists to every touchpoint that you encounter along the way in your daily life, have an impact: "From
the customer perspective, touchpoints with a brand or product are not differentiated: it is the seamless
experience that matters."
Figure 14, “Touchpoints” illustrates the variety of touchpoints that a brand has to orchestrate to deliver
The authors predict that in the future: "Touchpoints will continue to multiply as we enter an era where
every object has the potential to become connected and interactive."
This evolution impacts the enterprise as a whole: "New structures and processes will need to allow for
agility and reaping the benefits both from decentralization and, when needed, the power that leveraging
through centralization facilitates."
New operating models are emerging to support the ability to create real-time, personalized
experiences.
Bain positions an enterprise’s operating model as a bridge connecting strategy and execution [Bain
2014]. Operating model design starts from a clear formulation of the enterprise’s value proposition.
According to the Operating Model Canvas [Lancelott 2017], an operating model has six components:
• Value delivery chain(s): the work that needs to be done to deliver the value proposition
• Organization: the people who do the work and how they are organized
• Location: where the people will be located and the assets they need to help them
The accountability framework balances the autonomy that agility requires with effective alignment
mechanisms.
The modular nature of the adaptive operating model allows for a "plug-and-play" reconfiguration in
response to evolving customer feedback.
The starting point is the discovery of what creates value across a given journey from the customer’s
point of view. By analyzing customer journeys, enterprises can pinpoint the operational improvements
that will have the biggest effect on customer experience [Chheda 2017]. Once the desired operational
improvements have been identified, the enterprise can implement them by activating four levers:
• Lean: to streamline processes, eliminate waste, and foster a culture of continuous improvement
• Digitization: the process of using technology to automate and improve journeys directly
• Advanced analytics: leveraging the power of machine learning to discover insights and make
recommendations
This document includes a revised service blueprint modeling technique to analyze customer journeys
end-to-end. Figure 15, “Revised Service Blueprint Template” introduces the template used for this
analysis.
Figure 15. Revised Service Blueprint Template
For each stage of the journey, the diagram specifies the channel that supports the interaction with the
user. A user story describes the set of interactions that occurs at that touchpoint.
For each of the user stories an analysis of required capabilities is performed. It starts with a functional
description and can go as far as specifying the APIs that encapsulate corresponding business services.
An analysis of existing applications helps identify which required capabilities are missing or
incomplete. This gap analysis feeds an Agile requirement backlog. Backlog prioritization may
result in customer journey changes.
The candidate services that would implement the required capabilities should be architected in a
modular manner. This would facilitate the creation of new composite services. Reusable services can
be aggregated into digital platforms that facilitate reuse and accelerate the customer journey’s
reconfiguration.
This section presents the event storming workshop style, what it is, and its benefits.
To do so, people will be placing domain events on sticky notes on a wall along a timeline with an
unlimited modeling surface.
The workshop brings together three kinds of people: people with questions, people with answers, and a
facilitator.
The orange color sticky note is the convention for the event.
The event is named with a past participle because of its simple semantics and notation. For example,
order paid, product sent, etc.
Events are used because they are easy to grasp and relevant for domain experts.
The goal of the event storming workshop is to maximize the learning of all the participants. At the
end of the workshop, the participants should have a shared knowledge of the domain subject of the
workshop.
The physical set up is important: the surface of the wall represents an unlimited modeling surface
and everyone should stand up to increase people’s engagement in the workshop. There must be a
continuous effort of every participant to maximize the engagement. The collaborative approach
involves people with questions, people with answers, and a facilitator.
Markers and sticky notes should be available in different colors and in sufficient quantity.
There should be no limitation of the scope under investigation and the model should be
continuously refined through low-fidelity incremental notation.
◦ Maximize learning
◦ One of its objects is the root, ensuring the integrity of the whole aggregate by enforcing its
invariant policy and transactional boundaries
• Policy or rule
• Persona
• Read model/query
11.6. Benefits
• Opportunities to learn and facilitate a structured conversation about the domain; this workshop
is the best and most efficient way for all participants to have shared knowledge of a domain
• Uncover assumptions about how people think the system works; allows you to identify
misunderstandings and missing concepts
• Highly visual, tactile representation of business concepts and how they relate
• Allow participants to quickly try out multiple domain models so they can see where those concepts
work and where they break down
• Ask questions
◦ What are the targets? How will we know we have reached them?
• Visualize alternatives
• Reverse narrative
◦ Start from the end – what needs to happen before so that this event can happen too?
◦ Visualize every opinion and ask every party if they feel their opinion is accurately represented
• Timebox
◦ Use the pomodoro technique (25 minutes); after each pomodoro, ask what is going well and what
isn’t – move on even if the model is incomplete
◦ You can throw the model away and start again with different people
As its name implies, event-driven architecture is centered around the concept of "event"; that is,
whenever something changes an event is issued to notify the interested consumers of such a change.
Event is a powerful concept to build architecture around because of the immutable and decoupling
nature of events, as well as being a great way to design and build domain logic. The following sections
detail the concepts and benefits of event-driven architecture, and then dive into the practical details of
implementing such an architecture.
• A command represents the intention of a system’s user regarding what the system will do that will
change its state
• A query asks a system for its current state as data in a specific model
• An event represents a fact about the domain from the past; in particular, on every state change it
performs, the system publishes an event denoting that state mutation
Figure 17, “Concepts of Command/Query/Event and their Relation to Time” illustrates these concepts
related to time.
Command
A command represents the intention of a system’s user regarding what the system will do to change its
state.
• The result of a command can be either success or failure; in both cases, the result is an event
• In case of success, state change(s) must have occurred somewhere (otherwise nothing happened)
• Commands should be named with a verb, in the present tense or infinitive and a nominal group
coming from the domain (entity of aggregate type)
Query
"A query is a request asking to retrieve some data about the current state of a
system."
A query asks a system for its current state as data with a specific model.
• The query contains fields with some value to match for or an identifier
• Query can result in success or failure (not found) and long results can be paginated
• Queries can be named with “get” something (with identifier as arguments) or “find” something
(with values to match as arguments) describing the characteristics of data we want to retrieve
Event
An event represents a fact about the domain from the past; particularly, on every state change
performed by the system, it will publish an event denoting that state mutation.
Event characteristics:
• Events are primarily raised on every state transition that acknowledged the new fact as data in our
system
◦ Events can also represent every interaction with the system, even when there is no state
transition as the interaction or the failure can itself be valuable (for instance, the failure of a
hotel booking command because of no-vacancy can be an opportunity to propose something
else to the customer)
• Events can be ignored, but can’t be retracted or deleted; only a new event can invalidate a previous
one
◦ Internal events: the ones raised and controlled in our bounded context (see [bounded-context])
◦ External events: the ones from other upstream bounded contexts to which we subscribed
Events are published whenever there is an interaction with the system through a command (triggering
or not a state transition; if not, the failure is also an event) or a query (no state transition, but the
interaction is interesting in itself, such as for analytics purposes).
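As a minimal sketch of these three concepts, the Clojure snippet below represents a command, a query,
and an event as plain maps. The hotel-booking names (such as ":hotel/book-room" and "handle-book-room")
are hypothetical and serve only to illustrate the naming conventions above; they are not taken from
the implementation described in this document.

```clojure
;; Hypothetical hotel-booking domain; every name below is an illustrative assumption.

;; Command: a verb plus a domain noun, expressing the intention to change state.
(def book-room-command
  {:type    :hotel/book-room
   :payload {:hotel-id "h-42" :nights 2}})

;; Query: "find" plus the values to match; it never mutates state.
(def find-available-rooms-query
  {:type    :hotel/find-available-rooms
   :payload {:city "Paris" :nights 2}})

(defn handle-book-room
  "Returns the event describing the outcome of the command. A failure is
  also a fact worth publishing (here, a rejected booking)."
  [vacancies command]
  (let [hotel-id (get-in command [:payload :hotel-id])]
    (if (pos? (get vacancies hotel-id 0))
      ;; Events are named with a past participle.
      {:type :hotel/room-booked :payload (:payload command)}
      {:type    :hotel/room-booking-rejected
       :payload (assoc (:payload command) :reason :no-vacancy)})))
```

Calling `(handle-book-room {"h-42" 1} book-room-command)` yields a `:hotel/room-booked` event, while an
empty vacancy map yields a `:hotel/room-booking-rejected` event that a subscriber could use to propose
an alternative.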
Basically, commands and queries represent the intentions of the end users regarding the system:
• A command represents the user asking the system to do something – it is not safe as it will
mutate state
• A query asks for the current state of the system – it is safe as it will not mutate any data
This distinction relates to state and time management as well as to expressing what the user wants the
system to do. Once the state has mutated, the system publishes an event notifying the outside
world that something has happened.
The world is event-driven: the present is very difficult to grasp, and we can only clearly separate past
and future. The past is the only thing we can – almost – be sure of, and the simplest way of describing
a result that occurred in the past, or that should occur with the future system, is to use an event. It
is as simple as "this happened". The event storming workshop format is one of the most efficient and
popular ways
of grasping a domain for people involved in software design. (See Chapter 11, Event Storming
Workshop for a description of the technique.)
Event mechanisms loosen the coupling between the event publisher and the subscribers: the publisher
does not know who its subscribers are or how many there are. This focus on events also encourages a
better division of concerns and responsibilities, as too many events or coarse-grained events can
reveal a design issue. Commands and events force software designers to think about the system’s
behavior instead of focusing too much on its structure.
Software designers and developers should focus on separating their domain logic between:
• Decision logic: whether this command is actually able to operate given the current contextual state
(that includes any data coming from external systems, such as market data as well as time)
• State mutation logic: once the context data is retrieved and decisions are made, the domain logic
can issue what the state mutations are – whether internal or external
• State mutation execution: this is where transactional mechanisms come into play, being automated
for a single data source or distributed using a Saga pattern (see Section 12.7, “Ensuring Global
Consistency with Saga Patterns”) and compensating transactions
• Command execution result: the command execution result, be it a success or a failure, is expressed
as an event published for private or public consumption
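The four responsibilities above can be sketched as separate functions. This is only an illustration
under stated assumptions: the account domain, the function names ("decide", "mutations", "execute!",
"handle-withdraw!"), and the use of a single atom in place of a transactional data source are all
hypothetical.

```clojure
;; Hypothetical account domain; an atom stands in for a transactional data source.

(defn decide
  "Decision logic: a pure function of the current state and the command."
  [state {:keys [amount]}]
  (if (>= (:balance state) amount) :accepted :rejected))

(defn mutations
  "State mutation logic: describes the changes as data without applying them."
  [decision {:keys [amount]}]
  (if (= decision :accepted)
    [[:update-balance (- amount)]]
    []))

(defn execute!
  "State mutation execution: the only impure step of the flow."
  [state-atom muts]
  (doseq [[op delta] muts]
    (case op
      :update-balance (swap! state-atom update :balance + delta))))

(defn handle-withdraw!
  "Runs the three steps and returns the command execution result as an event."
  [state-atom command]
  (let [decision (decide @state-atom command)
        muts     (mutations decision command)]
    (execute! state-atom muts)
    {:type    (if (= decision :accepted)
                :account/withdrawn
                :account/withdrawal-rejected)
     :payload (select-keys command [:amount])}))
```

Keeping "decide" and "mutations" pure makes them easy to test in isolation; only "execute!" would need
to change if the atom were replaced by a database transaction or a Saga.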
Commands can be used to better represent the user’s intention and fit well with deterministic domain
logic that benefits from consensus algorithms such as the Raft Consensus Algorithm (see
https://raft.github.io/) to distribute the execution of business logic on several machines for better
resiliency and scalability along with some sharding logic (see
http://www.startuplessonslearned.com/2009/01/sharding-for-startups.html).
Good operability requires good observability of the running systems, which is achieved through strong
logging practices. Good logs are events in disguise; replaying a system’s behavior through logs amounts
to following the flow of technical and domain events that show what the system has done.
The main idea of event sourcing is to store all the "events" that represent stimuli asking for a state
change of the system, and then to be able to reconstruct the system’s end state by applying the
domain logic to all of these events in order.
The event store becomes the source of truth and the system’s end state is the end result of applying all
these events. Note that event-driven architecture does not imply event sourcing.
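The reconstruction of state from stored events can be sketched as a fold over the event store. The
event types and the "apply-event" and "replay" names below are assumptions made for illustration, and
a plain vector stands in for a real event store.

```clojure
;; Hypothetical account events; a vector stands in for the event store.

(defn apply-event
  "Domain logic for a single event: returns the next state."
  [state {:keys [type amount]}]
  (case type
    :account/deposited (update state :balance + amount)
    :account/withdrawn (update state :balance - amount)
    state)) ; unknown event types leave the state unchanged

(defn replay
  "Rebuilds the end state by applying every stored event in order."
  [events]
  (reduce apply-event {:balance 0} events))

(def event-store
  [{:type :account/deposited :amount 100}
   {:type :account/withdrawn :amount 30}])

(replay event-store) ; => {:balance 70}
```

Because the events are never modified, replaying a prefix of the store gives the state as it was at
an earlier point in time.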
CQRS is an architecture style that advises us to use a different data model and storage for
commands (asking for a state change, aka a "write") and queries (asking for the current state, aka a
"read").
The main motivation for using these dedicated models is to simplify, and gain better performance
for, interactions with the system that are unbalanced (read-intensive/write-scarce or
write-intensive/read-scarce interactions). Although CQRS simplifies each model in itself, keeping all
the models synchronized and up-to-date also brings some complexity. CQRS is sometimes implemented
without proven requirements, which can lead to over-engineering.
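A minimal sketch of the idea, assuming a hypothetical order domain: the write model is an append-only
list of facts, while the read model is a separate structure shaped for one query and kept up-to-date by
a projection. The two atoms and the function names are illustrative only; in a real system the two
stores would typically be separate storage technologies synchronized asynchronously.

```clojure
;; Hypothetical order domain; two atoms stand in for two separate stores.
(def write-store (atom []))  ; write model: append-only facts
(def read-store  (atom {}))  ; read model: number of orders per customer

(defn project!
  "Synchronizes the read model with a newly written event."
  [{:keys [type customer]}]
  (when (= type :order/placed)
    (swap! read-store update customer (fnil inc 0))))

(defn place-order!
  "Command side: records the fact, then lets the projection update the read side."
  [customer item]
  (let [event {:type :order/placed :customer customer :item item}]
    (swap! write-store conj event)
    (project! event)
    event))

(defn order-count
  "Query side: answers from the read model only."
  [customer]
  (get @read-store customer 0))
```

The synchronization step ("project!") is where the extra complexity mentioned above lives: if it runs
asynchronously, queries may briefly return stale results.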
• Type: an identifier of the type of this artifact; it should be properly namespaced to be unique
among several systems
• Emitted-at: a timestamp in UTC timezone of when the command/query/event was emitted by the
source
• Source: the source system that emitted the artifact (in case of a distributed system, it can be the
particular machine/virtual machine that emitted that artifact)
• Various keys: for partitioning the artifact among one or several values (typically, whether the
command is issued for a particular user, organization of the system, etc.)
• Reference: for an event, the command or query that triggered that particular event
• Payload: a payload as a data structure containing everything relevant to the purpose of the artifact
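The attributes listed above can be gathered in a single map. The sketch below is an assumption about
one possible shape, not the document's actual data structure: the "envelope" function, its argument
order, and the key names are hypothetical.

```clojure
(defn envelope
  "Builds a command/query/event map carrying the common attributes.
  `reference` is optional and only meaningful for events."
  [artifact-type source partition-key payload & [reference]]
  (cond-> {:id         (str (java.util.UUID/randomUUID))
           :type       artifact-type                 ; namespaced to stay unique across systems
           :emitted-at (str (java.time.Instant/now)) ; ISO-8601 timestamp in UTC
           :source     source
           :key        partition-key
           :payload    payload}
    reference (assoc :reference reference)))

;; A hypothetical command and the event it triggered.
(def pay-order
  (envelope :billing/pay-order "billing-service" {:org-ref "acme"} {:order-id 1}))

(def order-paid
  (envelope :billing/order-paid "billing-service" {:org-ref "acme"} {:order-id 1}
            (:id pay-order)))
```

Here the event's ":reference" holds the identifier of the command that triggered it, which makes it
possible to trace an event back to its cause.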
The Cloud Native Computing Foundation issued a specification describing event data in a common
way (see https://cloudevents.io/).
We also advocate translating each event that comes from another bounded context into a command of
the context that consumes it to denote explicitly the intention behind the event’s consumption.
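A sketch of such a translation, assuming a hypothetical shipping context that consumes events from a
payment context (the event and command names are illustrative):

```clojure
(defn ->command
  "Translates an upstream event into a command of the consuming context.
  Returns nil for events this context has no intention about."
  [event]
  (case (:type event)
    :payment/order-paid
    {:type      :shipping/prepare-shipment
     :payload   (select-keys (:payload event) [:order-id])
     :reference (:id event)}
    nil))
```

The command name states why the consuming context cares about the event, and only the fields that
context needs are copied from the upstream payload.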
As an example, think about the way your bank "cancels" a contentious movement on your account; the
bank doesn’t remove the contentious movement, instead it issues a new movement compensating the
effect of the bad one. For instance, given a contentious debit movement of $100, the bank issues a
credit movement of $100 to get back to a consistent balance even if the movements list of the account
now exhibits two movements cancelling each other.
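The bank example can be sketched as follows. The movement structure and the "compensation" and
"balance" functions are assumptions made for illustration.

```clojure
(defn balance
  "The balance is derived from the full list of movements."
  [movements]
  (reduce + (map :amount movements)))

(defn compensation
  "Builds a new movement that cancels `movement`; the original is left untouched."
  [movement new-id]
  {:id          new-id
   :amount      (- (:amount movement))
   :compensates (:id movement)})

(def movements
  [{:id 1 :amount 500}     ; credit
   {:id 2 :amount -100}])  ; contentious debit of $100

;; The contentious debit is not removed; a compensating credit is appended.
(def corrected
  (conj movements (compensation (second movements) 3)))
```

After compensation the list holds three movements and the balance is back to 500, the value it had
before the contentious debit.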
As a first step, we need to identify the inverse of each command that will cancel the effect or
"compensate" a former one. The Saga patterns describe the structure and behavior to attain such a
consistency goal: in case of failure of one command, the other services issue new commands that
compensate the former ones, thereby "rolling back" the whole distributed transaction. Two types
of the Saga pattern exist: orchestration and choreography.
In the choreography Saga pattern, each service produces and listens to other services' events and
decides whether an action should be taken.
• Benefits:
◦ All services participating are loosely-coupled as they don’t have direct knowledge of each other;
a good fit if the transaction has four or five steps
• Drawbacks:
◦ Can quickly become confusing if extra steps are added to the transaction as it is difficult to track
which services listen to which events
◦ Risk of adding cyclic dependency between services as they have to subscribe to one another’s
events
In the orchestration Saga pattern, a "coordinator" service is responsible for centralizing the Saga
pattern’s decision-making and sequencing business logic.
• Benefits:
• Drawbacks:
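A minimal, in-process sketch of the orchestration style is given below. It assumes that each step is
a map with an ":action" and a ":compensate" function and that a step signals failure by throwing; the
"run-saga" coordinator and the step names are hypothetical, and a real coordinator would issue
commands to remote services and persist its progress.

```clojure
(defn run-saga
  "Runs the steps in order. If a step throws, the compensations of the steps
  already completed are applied in reverse order."
  [steps ctx]
  (loop [remaining steps
         done      '()   ; completed steps, most recent first
         ctx       ctx]
    (if-let [step (first remaining)]
      (let [result (try
                     ((:action step) ctx)
                     (catch Exception _ ::failed))]
        (if (= result ::failed)
          {:status :rolled-back
           :ctx    (reduce (fn [c s] ((:compensate s) c)) ctx done)}
          (recur (rest remaining) (conj done step) result)))
      {:status :completed :ctx ctx})))

;; Hypothetical order saga: the second step always fails.
(def order-steps
  [{:name       :reserve-stock
    :action     #(update % :log conj :stock-reserved)
    :compensate #(update % :log conj :stock-released)}
   {:name       :charge-card
    :action     (fn [_] (throw (ex-info "card declined" {})))
    :compensate identity}])
```

Running `(run-saga order-steps {:log []})` reserves the stock, fails on the card charge, and then
releases the stock, leaving a log of both the original action and its compensation.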
Command/Query/Event Declaration
Command, Query, and Events are represented using a data record structure; they should not have any
operations associated with them, contrary to a value type, for instance. Each one should have an
identifier, even if they are immutable, to reference them easily.
Command/Query/Events are respectively declared with the "defcommand", "defquery", and "defevent"
macros. The first argument is a keyword for naming the concept, and then a keyword of a spec
describing the payload of that command, query, or event. For example:
Hexagonal Architecture
Hexagonal architecture decouples the application, domain, and infrastructure logic of the considered
system. It does so by using interfaces (ports) located in the domain part and implementations
(adapters) located in the application and infrastructure parts that are wired into the domain. The event
bus abstraction belongs to the infrastructure part of the architecture; several implementations can
then fulfill the infrastructure requirements (distributed or not), forwarding events to the browser with
Server-Sent Events (SSE).
Bus Abstraction
The event bus has two sides: publishing events and subscribing to events. These two operations are
exposed on an interface that can be implemented with various means, hence being a Service Provider
Interface in a hexagonal architecture.
(defprotocol EventBus
  "A simple pub/sub interface for publishing events indexed with key k (the key value
  must be extracted from the event and is up to the implementation; this ensures
  homogeneity of the key extraction from the event)"
  (subscribe [_ f]
             [_ k f]
    "Subscribe to events, optionally filtered with the key value k, then
    call back the function f")
  (publish! [_ event] "Publish the event to that bus"))
Each event needs to be classified depending on a value it holds; the exact value extracted from the
event needs to be parameterized depending on the bus, but each bus should do it in a homogeneous
way. So, the bus creation needs to have a key extraction function as an argument; usually it will be
either the organization that is concerned by this event – the ":org-ref" – or the event type
":event-type" to which every event belongs.
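As an illustration, a non-distributed adapter for this interface could look like the sketch below. The
protocol is repeated without docstrings so that the snippet is self-contained; the "InMemoryBus" record
and the "in-memory-bus" constructor are assumptions and are not part of the implementations described
in this document.

```clojure
(defprotocol EventBus
  (subscribe [_ f] [_ k f])
  (publish! [_ event]))

;; key-fn extracts the classification key from an event; subscribers is an atom
;; holding a vector of {:f callback} or {:k key :f callback} maps.
(defrecord InMemoryBus [key-fn subscribers]
  EventBus
  (subscribe [_ f]
    (swap! subscribers conj {:f f})
    nil)
  (subscribe [_ k f]
    (swap! subscribers conj {:k k :f f})
    nil)
  (publish! [_ event]
    (let [k (key-fn event)]
      (doseq [{sub-k :k f :f :as sub} @subscribers
              ;; unfiltered subscribers receive everything
              :when (or (not (contains? sub :k)) (= sub-k k))]
        (f event)))
    event))

(defn in-memory-bus
  "Creates a bus whose events are classified with key-fn, e.g. :org-ref or :event-type."
  [key-fn]
  (->InMemoryBus key-fn (atom [])))
```

With `(in-memory-bus :org-ref)`, a subscriber registered with `(subscribe bus "acme" f)` only receives
events whose ":org-ref" is "acme", while `(subscribe bus f)` receives every published event.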
Some events can trigger other domain logic (e.g., a search query triggers adding this query to the user
search history). Events here allow the decoupling of the business logic between the publishing context
and the reacting context.
Event Publication
Domain logic that sits in back ends publishes events on every API interaction from the clients. These
events are published through the EventBus interface with various implementations detailed in the
following sections.
Kafka is used as the distributed messaging system that collects and distributes events between sub-
systems.
Kafka Subscription
Metrics are a numeric representation of data measured over intervals of time. Metrics can be used to
derive knowledge of the behavior of a system over intervals of time in the present and future. Counting
events through Prometheus counters (see https://prometheus.io/) is a great way to observe the
system’s behavior over time. Metrics have:
• Name
• Timestamp
• Value (can be as simple as +1 for a counter that denotes each published event)
Back ends expose technical and domain metrics through Prometheus endpoints for monitoring and
alerting concerns. Business events are a strong proxy for technical incidents; domain events should be
the main values monitored over time.
The biggest advantage of metrics-based monitoring over logs is that, unlike log generation and storage,
metrics transfer and storage have a constant overhead. Metrics are also better suited to trigger alerts
over aggregation of events.
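The constant overhead can be seen in a small sketch: however many events are published, only one
number per event type is kept and exposed. The code below does not use a Prometheus client library; it
keeps counters in an atom and renders them in the Prometheus text exposition format. The metric name
"events_published_total" and the "type" label are assumptions.

```clojure
(require '[clojure.string :as str])

;; One counter per event type, whatever the number of published events.
(def counters (atom {}))

(defn count-event!
  "Adds +1 to the counter of the event's type."
  [event]
  (swap! counters update (:type event) (fnil inc 0)))

(defn exposition
  "Renders the counters as Prometheus text exposition lines, sorted by type."
  []
  (str/join "\n"
            (for [[event-type n] (sort-by key @counters)]
              (format "events_published_total{type=\"%s\"} %d"
                      (subs (str event-type) 1) ; drop the leading colon of the keyword
                      n))))
```

A back end would call "count-event!" from its event publication path and serve the result of
"exposition" on its metrics endpoint.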
Once a user connects to the application, the browser app needs to collect all the events related to the
organization to which this connected user belongs.
"Our highest priority is to satisfy the customer through early and continuous delivery … Welcome
changing requirements, even late in development. Agile processes harness change for the customer’s
competitive advantage." [Agile Manifesto]
The TOGAF standard defines governance as: “the ability to engage the involvement and support of all
parties with an interest in or responsibility to the endeavor with the objective of ensuring that the
corporate interests are served, and the objectives achieved” [TOGAF 2018].
[COBIT 5], in Principle 5, makes an important distinction between the management
activities/processes and governance itself.
Governance ensures that enterprise objectives are achieved by evaluating stakeholder needs,
conditions, and options; setting direction through prioritization and decision-making; and monitoring
performance, compliance, and progress against agreed direction and objectives.
Management plans, builds, runs, and monitors activities in alignment with the direction set by the
governance body to achieve the enterprise objectives.
Another standard, [ISO/IEC 38500], applies to the governance of management processes (and decisions)
relating to the information and communication services used by an organization. The standard
provides a structure for governance of IT to assist those at the highest level of organizations to
understand and fulfill their legal, regulatory, and ethical obligations regarding their organization’s use
of IT.
Figure 20. ISO/IEC 38500 Model for Corporate Governance of IT (Copyright © 2008 ISO)
ISO/IEC 38500 is driven from the top down; IT departments need to make sure that they are ready for
the new demands the board will pose (e.g., performance measurements, clear governance
mechanisms). One criticism of both COBIT and ISO/IEC 38500 is that these frameworks are reactive and
don’t adapt to quick change well; an example is change outside budgetary cycles, and the time it takes
to enact these changes.
Agile does not contradict either the COBIT or the ISO/IEC 38500 definitions and guidance; in fact, it
emphasizes the interaction of individuals and collaboration, and ensures that the stakeholder
objectives/goals are achieved. Often in non-Agile organizations, ownership is fluid between business
and IT, whereas in Agile organizations the business stakeholder (product owner) "owns" the value
chain. The risk of unclear ownership is reduced in Agile IT governance frameworks as they focus on
recognizing the right person as the suitable owner of the portfolio/program/project at the right level
from inception to delivery.
There is a need for procedures and policies to be in place to reconcile what can sometimes be
described as conflicts of interest, agreements, responsibilities, and rewards, and update stakeholders
on "contracts" that have been made.
Critics of Agile say that Agile methods allow teams to work in an unstructured way, which prevents
clear lines of accountability and discourages documentation. On the other hand, proponents of Agile
delivery argue that the methods rely on information on the current status of the project being visible
to the whole business, instead of central processes for command and control. They also believe that,
because the methods are designed to be self-assuring, there is proper governance and accountability
built into Agile practices. Control processes, in Agile, are more collaborative and are run continuously
by the business owner of the product or service. Critics of the status quo, for their part, argue
that “IT governance is killing innovation” [HBR 2013]; governance does not allow us to respond quickly
enough to new demand.
Both camps agree that communicating on the status, progress, and budget and satisfying regulatory
and auditing requirements are required.
Governance in Agile is a balancing act; on one hand autonomy, and on the other accountability to a
“contract”, freedom to innovate versus following proven routines and policies [HBR 2017].
On this same point, with Isdell’s “freedom within a framework” [Kesler 2008], governance became the
means of corralling an accepted level of chaos – a way to engage the natural tension between many
new global initiatives and the need for geographic General Managers to get more aggressive about
finding local solutions to brand, product, and revenue gaps.
What has this to do with Agile governance? Agile teams, and in consequence Enterprise Architecture,
co-create and deliver customer-centric products (not projects). The implication is the need for cross-
functional teams (business and technology) that are product-led. This poses a huge challenge to
"legacy" data in siloed organizations that need to reinvent their own organizations and operating
models. Other aspects to consider are metrics, performance, and incentives; how to recognize success
or failure?
An Agile governance structure needs to recognize these new ways of working and to measure value.
The following sections assume you are on a journey to a customer-centric product-led organization
where "Agile" is the way of working [McKinsey 2016].
"At PepsiCo®, we are leveraging design to create meaningful and relevant brand experiences for our
customers any time they interact with our portfolio of products."
(https://hbr.org/2015/08/pepsicos-chief-design-officer-on-creating-an-organization-where-design-can-thrive)
A product or service is delivered to a customer via a series of activities; the chain of activities that
delivers a valuable product or service to a customer is usually referred to as a "value chain". The concept
was first described by Michael Porter [Porter 2004].
An Agile Architecture governance structure could look like the following diagram.
Key Areas
Within an enterprise, governance structures exist that need to adhere to internal policies and external
regulatory frameworks. Corporate and IT governance are such "bodies"; with the support of
frameworks (like COBIT and ISO/IEC 38500), they help enterprises to manage such adherence. In a
product-led organization, customer focus is essential and has attracted increased regulatory attention
[PWC 2013].
Architecture is represented by the organization’s Chief Architect or delegates on each of the
governance boards as required.
The Enterprise Technology Board is a decision and policy-making group charged by the CIO/CTO with
driving and influencing the Enterprise Architecture of the organization aligned with its strategic goals
and objectives. Architecture should not impose a model nor define it; it should be an outcome as a
result of collaboration within the enterprise – IT and business. As part of the Centers of Enablement and
in conjunction with the business and product owners, architecture is defined and enabled via
Innovation Hackathons. These innovation hackathons play a major part in enabling technology
innovation and collaboration within the enterprise but also in partnership with external stakeholders
such as vendors. Co-creation is an important feature of these enablement centers.
Within the product value chain, the product owner, with the support of the product architect,
prioritizes features considering customer needs and existing technology roadmaps ("near-term
roadmapping"). The impact of changes as a result of constant refinement is governed within the
Product Architecture Board.
• Business Agility Institute: Agile Governance: Not an Oxymoron by Bala Bulusu, February 2019:
https://www.youtube.com/watch?v=bgtOP5ArwIU&feature=youtu.be
[1] COBIT® provides an implementable "set of controls over IT and organizes them around a logical
framework of IT-related processes and enablers."
Effective integration mechanisms are required because pure greenfield or "big bang" approaches are
not realistic. However, such integration mechanisms are not easy to architect, implement, and operate.
The main challenges are:
• To architect production-ready systems that are safe, secure, compliant, and that can scale
• To cope with the variety of data models that are inherited from a variety of legacy systems
developed in different countries using different technologies
• To bridge the old software architecture practices with an entirely different way of thinking and
reasoning about software architecture
The ambition of this playbook is to provide a roadmap to help progressively architect the overall
system into a loosely-coupled one. Figure 23, “Monolithic to Modular Journey” illustrates this journey.
1. Create RESTful API domain extensions to legacy systems. Use mediation to translate legacy data
types into API data types (Anti-Corruption Layer).
2. Decouple front-end from back-end development by jointly defining APIs that cater to the needs of
front-end developers while being “implementable” by back-end developers (e.g., Reactive REST
architectures, GraphQL technology developed by Facebook®, etc.).
3. Start modularizing the monolith by formalizing boundaries between sub-domains. Mix request-
response style APIs with asynchronous message passing (events) to connect them.
4. Further modularize by creating microservices aligned with context boundaries (see Chapter 16,
Domain-Driven Design Strategic Patterns). Move further toward event-orientation by implementing
the Event Sourcing pattern.
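As an illustration of step 1, the following is a minimal sketch of an Anti-Corruption Layer that
translates a legacy record into an API data type. The legacy field names (CUST_NO, CUST_NM, OPEN_DT,
STAT_CD), the status codes, and the CustomerResource type are all hypothetical; a real mediation
layer would be driven by the actual legacy schema.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Hypothetical legacy status codes; a real system would define its own.
_STATUS_BY_CODE = {"A": "active", "C": "closed", "S": "suspended"}


@dataclass(frozen=True)
class CustomerResource:
    """API data type exposed to consumers (illustrative only)."""
    customer_id: str
    name: str
    opened_on: date
    status: str


def to_customer_resource(legacy_row: dict) -> CustomerResource:
    """Translate a legacy record into the API data type.

    Legacy conventions (padding, date format, status codes) are handled
    here so that they never reach API consumers.
    """
    return CustomerResource(
        customer_id=legacy_row["CUST_NO"].strip().lstrip("0"),
        name=legacy_row["CUST_NM"].strip().title(),
        opened_on=datetime.strptime(legacy_row["OPEN_DT"], "%Y%m%d").date(),
        status=_STATUS_BY_CODE[legacy_row["STAT_CD"]],
    )


# Example legacy record with fixed-width padding and a compact date.
legacy_row = {"CUST_NO": "0004711 ", "CUST_NM": "ACME TRADING  ",
              "OPEN_DT": "20190214", "STAT_CD": "A"}
resource = to_customer_resource(legacy_row)
```

Keeping the translation in one function means that a change in the legacy layout touches only the
mediation code, not the API consumers.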
In order to facilitate steps 1 and 2, instead of relying on a single API with a flat set of endpoints,
Etsy® (see https://www.etsy.com/developers/documentation/getting_started/api_basics) created a
two-layer API using meta-endpoints. Similar to a pattern used by Netflix® and eBay’s ql.io, each of Etsy’s
meta-endpoints aggregates several other endpoints. This enables server-side composition of low-level,
general-purpose resources into device or view-specific resources.
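The idea of a meta-endpoint can be sketched as follows. This is not Etsy's actual API: the functions
get_listing, get_shop, and get_reviews, their payloads, and the mobile_listing_view meta-endpoint are
hypothetical stand-ins, stubbed with in-memory data so that the example is self-contained.

```python
# Low-level, general-purpose endpoints, stubbed with in-memory data.
# All names and payloads are hypothetical.
def get_listing(listing_id: int) -> dict:
    return {"id": listing_id, "title": "Ceramic mug", "shop_id": 7, "price": "18.00"}


def get_shop(shop_id: int) -> dict:
    return {"id": shop_id, "name": "Clay & Co", "country": "FR"}


def get_reviews(listing_id: int) -> list:
    return [{"rating": 5}, {"rating": 4}]


def mobile_listing_view(listing_id: int) -> dict:
    """Meta-endpoint: composes several endpoints into one view-specific resource."""
    listing = get_listing(listing_id)
    shop = get_shop(listing["shop_id"])
    reviews = get_reviews(listing_id)
    average = sum(r["rating"] for r in reviews) / len(reviews) if reviews else None
    # Only the fields needed by the (hypothetical) mobile view are returned.
    return {
        "title": listing["title"],
        "price": listing["price"],
        "shop_name": shop["name"],
        "average_rating": average,
    }


view = mobile_listing_view(42)
```

The client makes one call and receives a resource shaped for its view, while the low-level endpoints
stay general-purpose.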
Since the heyday of SOA, the idea of creating an API layer above legacy systems has been seen by many
as the magic bullet that can solve a majority of legacy integration problems.
An API is an application software intermediary that enables a software program to interact with
other software. In the context of trading, an API can, for example, enable your software to
connect with a broker to obtain real-time pricing data or place trades.
The quality of an API is constrained by the quality of the software that publishes it. A good API shields
the internal complexity of the software that provides it and presents data in a usable way. Because
legacy software is often complex and monolithic, it is a challenge to design APIs that are not prone to
abstraction leak. Abstraction leak refers to a situation where an API consumer has to understand
unnecessary implementation details to use it. When this happens, it creates unintended coupling that
puts the system’s agility in jeopardy.
Designing APIs requires business domain knowledge. Technologies such as API managers provide little
if any help solving the hard problems which are:
Designing APIs that external developers will love is key to creating the kind of digital ecosystems that
characterize many digital business models. Designing the information architecture that will guide API
definition does not happen in a vacuum. Best practices show that it is driven by business goals,
persona modeling, and task analysis.
When integrating a new application with a legacy one that manages primary data, it is important to
distinguish when:
• It processes a business event that will result in the creation or modification of that primary data
In the latter case, it is safer to let the legacy application modify the primary data because it implements
the business rules that ensure the modification will be performed in the right way. The new
application could become the new system of record if it implemented equivalent business rules.
This is often impractical because legacy applications tend to be poorly documented.
When a legacy application exposes a "write" API, it may increase the transactional load to a level that is
not sustainable. For example, the third-party securities lending system of a global custodian could not
include custody system calls within the boundary of a distributed transaction. In addition, developing
APIs on top of some old technologies such as IMS/DC may present specific technical challenges.
Using the synchronous API style, in particular when write-APIs are in scope, decreases the ability of
the system to stay responsive in the face of a failure (lower resilience). It also decreases its ability to
respond in a timely manner because it has to wait for the response of a legacy application’s API (less
responsive).
• Controlled data replication that ensures consistency across redundant data sources
• Event-driven logic where noteworthy state changes are broadcast to interested software
components that can respond to them
• Eventual consistency that only guarantees that all replicas will eventually become consistent
Asynchronous message-passing still supports business transactions, but does so using Sagas. The Saga
pattern describes how to implement business transactions without two-phase commit, which does not
scale well, in particular in cloud-native systems.
The business transaction is divided into multiple steps or activities. The Saga pattern has the
responsibility to either get the overall business transaction completed or to leave the system in a
known termination state. In case of errors, a business rollback procedure is applied by calling
compensation steps or activities in reverse order. This pattern is not new, though it was not
named Saga. For example, the idea of compensating transactions has been used in the past to process
payments or securities handling business transactions.
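The compensation logic described above can be sketched as follows. This is a minimal, in-process
illustration under simplifying assumptions: the run_saga function and the step names (reserve_cash,
reserve_securities, settle) are hypothetical, and a production Saga would also persist its progress
and handle failures of the compensation steps themselves.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation); all names are illustrative.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse order."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            completed.append(step)
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False  # known termination state: completed steps undone
    return True  # business transaction completed


def fail() -> None:
    raise RuntimeError("settlement rejected")


def noop() -> None:
    pass


log: List[str] = []
ok = run_saga(
    [
        ("reserve_cash", noop, noop),
        ("reserve_securities", noop, noop),
        ("settle", fail, noop),
    ],
    log,
)
```

In this run the third step fails, so the two completed steps are compensated in reverse order and
the function reports that the business transaction did not complete.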
Unlike two-phase commit transactions that are handled automatically by the database or the
middleware, the Saga pattern may require dedicated business logic to handle the
consequences of business events. For example, if the securities that have been loaned (third-party
lending) are sold, the securities lending system will replace them with other available securities. If the
securities are no longer available, the system has to inform the trader and assist her in resolving the
issue.
The example of When Issued (WI) transactions illustrates that identity management is more than
specifying tables’ primary keys. For example, a treasury bond can be purchased or sold when it has
been authorized but not yet issued. A dummy CUSIP number is created to identify the security before
CUSIP Global Services (CGS) creates the official one. The WI trade is conditional and settlement can
occur only when the security has been listed and has been attributed with an official CUSIP number.
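One way to read the When Issued example is that the identity of the security must survive a change
of its external identifier. The sketch below assumes an internal UUID as the stable identity; the
Security class, its attribute names, and the placeholder identifier strings are hypothetical and are
not real CUSIP numbers. Listing status is left out for brevity, so settlement eligibility is reduced
here to the presence of an official identifier.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Security:
    """The internal identity never changes; external identifiers may."""
    name: str
    provisional_cusip: Optional[str] = None
    official_cusip: Optional[str] = None
    internal_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def assign_official_cusip(self, cusip: str) -> None:
        self.official_cusip = cusip

    @property
    def can_settle(self) -> bool:
        # Simplification: settlement requires the official identifier.
        return self.official_cusip is not None


bond = Security("Treasury bond", provisional_cusip="DUMMY-PLACEHOLDER")
id_before = bond.internal_id
settle_before = bond.can_settle

bond.assign_official_cusip("OFFICIAL-PLACEHOLDER")
settle_after = bond.can_settle
```

Trades recorded against the internal identifier remain attached to the same security before and
after the official identifier is assigned.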
Some changes of an entity object state can have business consequences. For example, in the case that
the WI security is not listed or admitted to trading, all transactions effected during that period are
declared void by the exchange. This state change is a business event that triggers downstream actions
to undo the WI trades. That is why asynchronous communication using IBM MQSeries® is not
sufficient. Communicating events (state changes) is better suited to express dynamic business rules.
Just sending messages that represent the state of an entity at a point in time does not inform the
listener that a business event happened.
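The difference between a state message and a business event can be sketched as follows. The type
names, the listing_status values, and the event names are hypothetical; the point is only that the
event names what happened, whereas a snapshot leaves the listener to work it out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecurityState:
    """Point-in-time state of a security (hypothetical attributes)."""
    security_id: str
    listing_status: str  # e.g. "when_issued", "listed", "not_admitted"


@dataclass(frozen=True)
class BusinessEvent:
    """Names what happened, so listeners need not compare snapshots."""
    name: str
    security_id: str


def detect_event(before: SecurityState, after: SecurityState) -> Optional[BusinessEvent]:
    """Derive a business event from a state change, if the change is noteworthy."""
    if before.listing_status == "when_issued" and after.listing_status == "not_admitted":
        return BusinessEvent("WhenIssuedTradingVoided", after.security_id)
    if before.listing_status == "when_issued" and after.listing_status == "listed":
        return BusinessEvent("SecurityListed", after.security_id)
    return None


def on_event(event: BusinessEvent) -> str:
    """Downstream reaction chosen from the event name alone."""
    if event.name == "WhenIssuedTradingVoided":
        return f"undo WI trades for {event.security_id}"
    return "no action"


before = SecurityState("WI-0001", "when_issued")
after = SecurityState("WI-0001", "not_admitted")
event = detect_event(before, after)
action = on_event(event) if event else "no action"
```

A listener that received only the second snapshot would see a security that is "not admitted", but
would not know that WI trades now need to be undone.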
An entity object has attributes that describe its state. A state change of an entity object modifies the
value of some of its attributes. Depending on the context, some attributes may not be relevant. For
example, the front office does not need some of the data that will be required at settlement time.
Master data is about managing in a consistent manner the identifiers and key attributes of core entity
objects of the bank such as Party, Security, or Account. Referential data can be fully managed by new
digital applications when they implement all the business rules required to modify primary data.
When this is not the case, derived master data can be replicated in new digital systems and accessed in
read-only mode, with updates still handled by "legacy" systems of record.
Too many legacy systems are polluted by inflated master data structures that aggregate all the data
that describe entities regardless of their context and usage. This tends to increase the overall system’s
complexity and promote high coupling. When integrating new digital applications with legacy ones, it
is preferable to protect new software code from unnecessary complexity that can pollute it.
When designing the resource model of a RESTful API or an event data structure, business domain
concepts should prevail over constraints imposed by legacy data structures. Finding the right balance
between clean and unclean design cannot be achieved by technology alone; it requires business
domain expertise.
An entity can be replicated in more than one system as long as its identity is preserved and data
lineage properly managed. REST resources map to business entities. Derived data is formatted to meet
the needs of the data consumer and cached to minimize network traffic.
Legacy applications publish events that model entity state changes that are of interest to digital cloud-
native applications. Sagas whose scope spans digital and legacy help maintain the overall system’s
(eventual) consistency.
The diagram also shows that the legacy application manages Entity D, which could be Instrument. If
the Instrument’s price changes, the legacy application can publish a business event that represents
that change. A new digital application that consumes the price change event can, for example, trigger a
state change of limits to the order it manages. The digital application emits a new event that ascertains
that the order (Entity E3) has been released for execution. This type of causality chain can be expressed
in a graph that can drive Saga logic.
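A causality chain of this kind can be represented as a small directed graph. The sketch below is
only an illustration: the event names are hypothetical labels for the price change and order
release described above, and the intermediate OrderLimitReached event is an assumption added to
make the chain explicit.

```python
from collections import deque
from typing import Dict, List

# Hypothetical causality graph: event -> events it can cause.
CAUSES: Dict[str, List[str]] = {
    "InstrumentPriceChanged": ["OrderLimitReached"],
    "OrderLimitReached": ["OrderReleasedForExecution"],
    "OrderReleasedForExecution": [],
}


def downstream_events(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Breadth-first walk listing every event reachable from start."""
    seen = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


chain = downstream_events(CAUSES, "InstrumentPriceChanged")
```

Saga logic could walk such a graph to decide which steps follow an event, or which steps need
compensation when an upstream event is voided.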
We believe that architecture models that only represent a system statically are incomplete. They need
to be completed by models that represent the system’s behavior. The scalability, resilience, and
responsiveness of a system cannot be verified in the absence of dynamic modeling.
The process repeats itself to develop other bounded contexts (bc2 to bci) at the edge of the legacy
systems. When the last bounded context (bcn) is developed, the legacy system can be decommissioned.
It has been strangled!
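The routing side of this progression can be sketched with a small facade. The StranglerFacade class,
the capability names, and the handlers are hypothetical; in practice the routing would usually sit
in an API gateway or proxy rather than in application code.

```python
from typing import Callable, Dict, Set

Handler = Callable[[str], str]


class StranglerFacade:
    """Routes each capability to a new bounded context once it exists,
    and to the legacy system otherwise."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, capability: str, handler: Handler) -> None:
        self._migrated[capability] = handler

    def handle(self, capability: str, payload: str) -> str:
        handler = self._migrated.get(capability, self._legacy)
        return handler(payload)

    def legacy_can_be_decommissioned(self, capabilities: Set[str]) -> bool:
        # True once every capability is served by a new bounded context.
        return capabilities.issubset(self._migrated)


facade = StranglerFacade(legacy=lambda p: f"legacy:{p}")
all_capabilities = {"pricing", "orders"}

facade.migrate("pricing", lambda p: f"bc1:{p}")
first = facade.handle("pricing", "quote")
second = facade.handle("orders", "new")
ready_before = facade.legacy_can_be_decommissioned(all_capabilities)

facade.migrate("orders", lambda p: f"bc2:{p}")
ready_after = facade.legacy_can_be_decommissioned(all_capabilities)
```

Callers keep talking to the facade throughout, so each capability can move to its new bounded
context without a coordinated cut-over.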
Some recommend a more aggressive approach; they claim that you should starve your monolith to
death (see https://read.acloud.guru/if-you-cant-strangle-the-monolith-starve-it-to-death-fcc824d3c82).
Because decommissioning legacy systems requires significant investment, attention should be given to
the economic side of the equation. It is good to starve your monolith rapidly if the business case proves
positive.
Conclusion
The approach we have described is key to better integrating digital applications with legacy ones and to
ultimately decommissioning the legacy applications. However, the reader may think that the learning
curve is too steep and that the enterprise is not up to the challenge.
The obvious alternative would be to develop new digital capabilities with the same old architecture
models mastered by the IT organization. This is not such a good idea because:
• Classical distributed computing models scale vertically up to the limit of "big iron" computers’
power
• The enterprise would lose the elasticity, scalability, and cost benefits of cloud-native computing
• The enterprise would be at a competitive disadvantage vis-à-vis market players who master this
class of technology
MITRE® defines an architecture pattern as: "a method of arranging blocks of functionality to address a
need. Patterns can be used at the software, system, or enterprise levels. Good pattern expressions tell you
how to use them, and when, why, and what trade-offs to make in doing so. Patterns can be characterized
according to the type of solution they are addressing (e.g., structural or behavioral)." [MITRE]
The TOGAF framework has not yet integrated architecture patterns but has published a template to
describe them [TOGAF 2018, Chapter 28].
Martin Fowler described how to create a new system around the edges of the old, letting it grow slowly
over several years until the old system is "strangled" [Fowler 2004].
Chris Stevenson published a paper that describes how his team rewrote a legacy application by
creating new features using the pattern described by Martin Fowler [Stevenson 2004].
Eric Evans wrote a document that describes how to get started with Domain-Driven Design (DDD)
when surrounded by legacy systems [Evans 2013]. It describes four strategies to progressively
modularize a monolithic legacy system by applying DDD.
The term "Domain-Driven Design" (DDD) was coined by Eric Evans in his book Domain-Driven Design:
Tackling Complexity in the Heart of Software [Evans 2003].
Domain-Driven Design (DDD) offers strategic building blocks for analyzing and structuring the
problem space and the solution space.
• The words used by the people and their meanings – the domain language is the language used by
people as it is, so it can be messy and organic
The problem space holds the domain within which the enterprise operates and represents the world
as we perceive it; it describes the Business Architecture.
The domain is the set of concepts that, through use-cases, allows people in the enterprise to
solve problems.
Sub-domains
A domain can be decomposed into sub-domains which typically reflect some organizational structure.
Sub-domain boundaries are determined in part by communication structures within an organization.
The sub-domains are stable; they change only for strategic reasons and are independent of software.
An e-commerce system consists of a product catalog, an inventory system, a purchasing system, an
accounting system, etc. They are sub-systems in that the system as a whole is partitioned into them.
The system is partitioned in this specific way because the resulting sub-systems form cohesive units of
functionality.
Domain knowledge is key to decomposing a domain into sub-domains that have a high level of internal
cohesion and minimum dependencies with other sub-domains. Conducting event storming workshops
is a great way to accelerate the acquisition of domain knowledge and explore domain decomposition
scenarios. The event storming workshop technique is introduced in Chapter 11, Event Storming
Workshop.
Distillation
The enterprise operates with several sub-domains. Depending on its business, some are generic (such
as accounting or HR), some are supporting, and some are core, meaning that the current strategy
directly relies on them to attain its goals. Not all parts of a large system will be well designed.
The core domain is the domain that directly contributes to the current enterprise’s strategy.
Strategy is defined as: "Strategy describes the organization’s objectives according to the environment and
the available resources, then the resources allocation in order to create value for the clients along with
profits for the organization and its employees."
Bounded context is the solution as we design it. It describes the software architecture and is used to
manage the complexity, and is therefore linked to the business.
Bounded context means different models of the same thing (e.g., book, customer). Bounded context
is represented by models and software that implement those models. This is where we find patterns
and heuristics.
A language structured around the domain model and used by all team
members to connect all the activities of the team with the software.
The ubiquitous language is a deliberate language designed to be unambiguous and on which all
stakeholders agree. This language is found in every artifact manipulated by the stakeholders (UI,
database, source code, documents, etc.). The concepts conveyed by the domain model are the primary
means of communication; these words should be used in speech and every written artifact. If an idea
cannot be expressed using this set of concepts, the designers should iterate once again and extend the
model, and they should look for and remove ambiguities and inconsistencies. The domain model is the
backbone of the ubiquitous language.
Bounded Context
A bounded context delimits the applicability of a particular model so that team members have a clear
and shared understanding of what has to be consistent and how it relates to other contexts. Bounded
contexts are not modules.
Bounded contexts separate concerns and decrease complexity. A bounded context is the boundary for
the meaning of a model. A bounded context creates autonomy, hence allowing a dedicated team for
each. Bounded contexts simplify the architecture by separating concerns.
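The idea that each bounded context holds its own model of the same thing can be sketched as
follows, using a book as the example. The two context names (catalog and shipping), the class names,
and the weight value are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


# Catalog context: a book is something customers browse.
@dataclass(frozen=True)
class CatalogBook:
    isbn: str
    title: str
    author: str


# Shipping context: the same book is a parcel to be moved.
@dataclass(frozen=True)
class ShippingBook:
    isbn: str
    weight_grams: int


def to_shipping(book: CatalogBook, weights: Dict[str, int]) -> ShippingBook:
    """Translate at the context boundary; only the identity (ISBN) is shared."""
    return ShippingBook(isbn=book.isbn, weight_grams=weights[book.isbn])


catalog_book = CatalogBook("978-0321125217", "Domain-Driven Design", "Eric Evans")
parcel = to_shipping(catalog_book, {"978-0321125217": 1200})  # weight is illustrative
```

Neither model is a subset of the other: each carries only what its own context needs, which is what
lets a dedicated team evolve it independently.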
Context Map
A context map describes the flow of models between contexts and provides an overview of the systems
landscape. A context map helps to identify governance issues between applications and teams. It helps
us to see how teams communicate and their "power" relationships. With a context map we get a clear
view of where and how bad models propagate through IS landscapes.
You can use the metaphor of a river flowing to describe the relations between two bounded contexts: if
you are upstream and pollute the river, the downstream people will be impacted - not the opposite.
A relationship between two bounded contexts in which the upstream group’s actions affect the
downstream group, but the actions of the downstream do not affect the upstream. It is not about the
data flow’s direction, but about the models' flow.
Upstream Patterns
Define a protocol that gives access to your sub-system as a set of services. Open
the protocol so that all who need to integrate with you can use it. Enhance and
expand the protocol to handle new integration requirements, except when a
single team has idiosyncratic needs. Then, use a one-off translator to augment
the protocol for that special case so that the shared protocol can stay simple
and coherent.
Event Publisher
Domain events are something that happens in the domain and that is important to domain experts. An
upstream context publishes all its domain events through a messaging system (preferably an
asynchronous one) and downstream contexts can subscribe to the events that are relevant for them
and conform or transform those events in their models (following an ACL) and react accordingly.
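The Event Publisher relationship can be sketched with an in-memory bus. The EventBus class, the
event types, and the billing handler are hypothetical, and delivery here is synchronous for brevity,
whereas the text recommends an asynchronous messaging system.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventBus:
    """In-memory stand-in for a messaging system (synchronous for brevity)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: Event) -> None:
        # Events without subscribers are simply dropped in this sketch.
        for handler in self._subscribers.get(event["type"], []):
            handler(event)


# Downstream (billing) context: subscribes only to the events it cares about.
received: List[Dict[str, str]] = []


def billing_handler(event: Event) -> None:
    # Translation into the downstream model (the anti-corruption step).
    received.append({"invoice_for": event["order_id"]})


bus = EventBus()
bus.subscribe("OrderPlaced", billing_handler)

# Upstream context publishes all its domain events.
bus.publish({"type": "OrderPlaced", "order_id": "o-1"})
bus.publish({"type": "OrderCancelled", "order_id": "o-2"})
```

The upstream context does not know who listens; the downstream context decides which events are
relevant and how to express them in its own model.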
Midway Patterns
Shared Kernel
Designate some subset of the domain model that the two teams agree to share.
Of course this includes, along with this subset of the model, the subset of code
or of the database design associated with that part of the model. This explicitly
shared stuff has special status, and shouldn’t be changed without consultation
with the other team.
Published Language
Separate Ways
Partnership
Downstream Patterns
Customer/Supplier
Conformist
Anti-corruption Layer
We can organize the context map patterns along two axes: Control and Communication.
The relevance and applicability of methods contained in this section precedes the listing of references
and resources of interest to the reader.
When needed, specific method knowledge can be incorporated into the O-AAF playbooks or pattern
sections. For example, Domain-Driven Design (DDD) is described in Part 4: Methods and the DDD
strategic patterns are incorporated into Part 3: Architecture Patterns because they are key to the O-AAF
Standard.
This Snapshot document does not include method references. However, relevant method knowledge
has been incorporated in the chapters that are included in this document.
Appendix A: Abbreviations
ACL     Access Control List
ADM     Architecture Development Method
ADR     Architecture Decision Record
API     Application Program Interface
BASE    Basically Available, Soft State, Eventual
BDUF    Big Design Up-front
BPI     Business Process Improvement
BPR     Business Process Re-engineering
CGS     CUSIP Global Services
CMM     Capability Maturity Model
CMMI    Capability Maturity Model Integration
CQRS    Command Query Responsibility Segregation
CUSIP   Committee on Uniform Security Identification Procedures
DBMS    Database Management System
DDD     Domain-Driven Design
DoD     Definition of Done
ERP     Enterprise Resource Planning
FCA     Financial Conduct Authority
IaaS    Infrastructure as a Service
IMS/DC  Information Management System/Data Communications
ISACA   IS Audit and Control Association
LEI     Lean Enterprise Institute
MVA     Minimum Viable Architecture
MVP     Minimum Viable Product
NoSQL   Not only SQL
NVA     Non-Value Activities
O-AAF   The Open Group Agile Architecture Framework
PCMM    People Capability Maturity Model
P&L     Profit and Loss
RDBMS   Relational Database Management System
REST    Representational State Transfer
RVA     Real Value-Added
SBCE    Set-Based Concurrent Engineering
SEI     Software Engineering Institute
SOA     Service-Oriented Architecture
SGMM    Smart Grid Maturity Model
SSE     Server-Side-Event
UUID    Universally Unique Identifier
URN     Uniform Resource Name
WI      When Issued