Visualising Software Architecture Sample
Simon Brown
This book is for sale at http://leanpub.com/visualising-software-architecture
This is a Leanpub book. Leanpub empowers authors and publishers with the Lean
Publishing process. Lean Publishing is the act of publishing an in-progress ebook using
lightweight tools and many iterations to get reader feedback, pivot until you have the right
book and build traction once you do.
4. A shared vocabulary
4.1 Common abstractions over a common notation
4.2 Static structure
4.3 Components vs code?
4.4 Modules and subsystems?
4.5 Microservices?
4.6 Serverless?
4.7 Platforms, frameworks and libraries?
4.8 Create your own shared vocabulary
5. The C4 model
5.1 Hierarchical maps of your code
1. About the book
I graduated from university in 1996, a time when CASE and modeling tools were popular
and in common use. I remember attending a training course about the Unified Modeling
Language and the SELECT tooling soon after I started my professional career. A number
of the projects I worked on made extensive use of tools like SELECT and Rational Rose for
diagramming and documenting the design of software systems. With buggy user interfaces
and ugly diagrams, the tooling may not have been brilliant back then, but it was still very
useful if used in a pragmatic way.
I’m very much a visual person. I like being able to visualise a problem before trying to find
a solution. Describe a business process to me and I’ll sketch up a summary of it. Talk to
me about a business problem and I’m likely to draw a high-level domain model. Visualising
the problem is a way for me to ask questions and figure out whether I’ve understood what
you’re saying. I also like sketching out solutions to problems, again because it’s a great way
to get everything out into the open in a way that other people can understand quickly.
But then something happened somewhere during the early 2000s, probably as a result of the
Manifesto for Agile Software Development that had been published a few years beforehand.
The way teams built software started to change, with things like diagramming and docu-
mentation being thrown away alongside big design up front. I remember seeing a number
of software development teams reducing the quantity of diagrams and documentation they
were creating. In fact, I was often the only person on the team who really understood UML
well enough to create diagrams with it.
Fast forward to the present day, and creating software architecture diagrams seems to be
something of a lost art. I’ve been running software architecture training courses for a number
of years, part of which is a simple architecture kata where groups of people are asked to
design a software solution and draw some architecture diagrams to describe it. Over 10,000
people and 30+ countries later, I have gigabytes of anecdotal photo evidence - the majority
of the diagrams use ad hoc “boxes and lines” with no clear notation or semantics.
Designing software is where the complexity should be, not communicating it.
This book focusses on the visual communication and documentation of software architecture.
I’ve seen a number of debates over the years about whether software development is an art,
a craft or an engineering discipline. Although I think it should be an engineering discipline,
I believe we’re a number of years away from this being a reality. So while this book doesn’t
Systems Analysis And Design Method (SSADM). Although the software development
industry has moved on in many ways, we seem to have forgotten some of the good things that
these prior approaches gave us, specifically related to some of the artifacts these approaches
encouraged us to create.
Of course, the Unified Modeling Language (UML), a standardised notation for communicating
the design of software systems, still lives on. However, while you can argue about
whether UML offers an effective way to communicate software architecture or not, that’s
often irrelevant because many teams have already thrown out UML or simply don’t know it.
Such teams typically favour informal boxes and lines style sketches instead, but often these
diagrams don’t make much sense unless they are accompanied by a detailed narrative.
Abandoning UML is all very well but, in the race for agility, many software development
teams have lost the ability to communicate visually. The example software architecture
sketches (pictured) illustrate the typical software architecture diagrams that I see on my
travels and, as we’ll see in the next chapter, they suffer from a number of problems.
way. Just to be clear, I’m not talking about detailed modelling, comprehensive UML models,
or model-driven development. This is about effectively and efficiently communicating the
software architecture of the software that you’re building, with a view to:
• Help everybody understand the “big picture” of what is being built, and how this fits
into the “bigger picture” of the organisation/environment in which it exists.
• Create a shared vision for the development team.
• Provide a “map” that can be used by software developers to navigate the source code.
• Provide a point of focus for technical conversations about new features, technical debt,
risk reviews, etc.
• Fast-track the on-boarding of new software developers into the team.
• Provide a way to explain what’s being built to people outside of the development team,
whether they are technical or non-technical.
Why is this important? In today’s world of agile delivery and lean startups, many software
teams have lost the ability to communicate what it is they are building, so it’s no surprise
that these same teams often seem to lack technical leadership, direction, and consistency. If
you want to ensure that everybody is contributing to the same end-goal, you need to be able
to effectively communicate the vision of what it is you’re building. And if you want agility
and the ability to move fast, you need to be able to communicate that vision efficiently too.
Moving fast requires good communication.
I run this as a group-based exercise, typically with between two and five people per group.
Rather than making the exercise easier, having a group of people with different backgrounds
and experience tends to complicate matters, as time is wasted debating how best to complete
the task. This is because, unlike the building industry, the software development industry
lacks a standard, consistent way to think about, describe and visually communicate software
architecture. I believe there are a number of factors that contribute to this:
1. In their haste to adopt agile approaches in recent years, many software teams have
“thrown out the baby with the bath water”. Modeling and documentation have been
thrown out alongside traditional plan-driven processes and methodologies. That may
sound a little extreme, but many of the software teams I work with only have a very
limited amount of documentation for their software systems.
2. Teams that still do see the value in documents and diagrams have typically aban-
doned the Unified Modeling Language (UML) in favour of an approach that is more
lightweight and pragmatic. I’ll discuss UML later in the book, but my anecdotal
evidence, based upon meeting and speaking to thousands of software developers,
suggests that, optimistically, UML is used by only a small percentage of software
developers.
3. There are very few people out there who teach software teams how to effectively
model, visualise and communicate software architecture. Based upon running a small
number of workshops for computer science undergraduates, this includes lecturers at
universities too.
We have a failure to communicate 8
Regardless of whether this is the only software architecture diagram or one of a collection
of software architecture diagrams, this diagram doesn’t tell you much about the solution.
Essentially it’s just a shopping list of technologies.
There’s a Unix box and a Windows box, with some additional product selections that include
JBoss (a Java EE application server) and Microsoft SQL Server. The problem is, I don’t know
what those products are doing, plus there seems to be a connection missing between the Unix
box and the Windows box. It’s essentially a bulleted list that’s been presented as a diagram.
When people talk about software architecture, they often refer to “boxes and lines”. This next
diagram has boxes, but no lines.
This is a three-tier solution (I think) that uses the Microsoft technology stack. There’s an
ASP.NET web application at the top, which I assume is being used for some sort of user
interaction, although that’s not shown on the diagram. The bottom section is labelled “SQL
Server” and there are lots of separate cylinders. To be honest though, I’m left wondering
whether these are separate database servers, schemas or tables.
Finally, in the middle, is a collection of boxes, which I assume are things like components,
services, modules, etc. From one perspective, it’s great to see how the middle-tier of the
overall solution has been decomposed into smaller chunks, and these are certainly the types
of components/services/modules that I would expect to see for such a solution. But again,
there are no responsibilities and no interactions. Software architecture is about structure,
which is about things (boxes) and how they interact (lines). This diagram has one, but not
the other. It’s telling a story, but not the whole story.
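One way to make the "things and how they interact" point concrete is to treat a diagram as data. The sketch below is illustrative only: the `Box`, `Line` and `missing_information` names are hypothetical and do not come from any tool discussed in this book. It simply flags boxes that have no stated responsibility or no interactions, assuming a diagram can be reduced to a list of boxes and a list of lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    responsibility: str = ""  # what the thing does; empty if the diagram doesn't say


@dataclass(frozen=True)
class Line:
    source: str
    destination: str
    description: str = ""  # how or why the two things interact


def missing_information(boxes, lines):
    """Return human-readable notes about what the diagram fails to say."""
    connected = {line.source for line in lines} | {line.destination for line in lines}
    notes = []
    for box in boxes:
        if not box.responsibility:
            notes.append(f"{box.name}: no responsibility stated")
        if box.name not in connected:
            notes.append(f"{box.name}: no interactions shown")
    return notes


# Hypothetical diagram with boxes but no lines, like the one described above.
boxes = [Box("ASP.NET web application"), Box("SQL Server")]
notes = missing_information(boxes, [])
for note in notes:
    print(note)
```

A diagram that tells the whole story would produce an empty list of notes.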
Essentially the group that produced this diagram has simply documented their functional
decomposition of the solution into a number of smaller things. Imagine a building architect
drawing you a diagram of your new house that simply had a collection of boxes labelled
“Cooking”, “Eating”, “Sleeping”, “Relaxing”, etc or “Kitchen”, “Dining Room”, “Bedroom”,
“Lounge”, etc.
This diagram suffers from the same problem as the previous diagram (no responsibilities
and no interactions) plus we additionally have a colour coding to decipher. Can you work
out what the colour coding means? Is it related to input vs output functions? Or perhaps it’s
business vs infrastructure? Existing vs new? Buy vs build? Or maybe different people simply
had different colour pens! Who knows. I often get asked why the central “Risk Assessment
Processor” box has a noticeably thicker border than the other boxes. I honestly don’t know,
but I suspect it’s simply because the marker pen was held at a different angle.
The central spine of this diagram is great because it shows how data comes in from the source
data systems (TDS and RDS) and then flows through a series of steps to import the data,
perform some calculations, generate reports and finally distribute them. It’s a super-simple
activity diagram that provides a nice high-level overview of what the system is doing. But
then it all goes wrong.
I think the green circle on the right of the diagram is important because everything is pointing
to it, but I’m not sure why. And there’s also a clock, which I assume means that something
is scheduled to happen at a specific time.
The left of the diagram is equally confusing, with various lines of differing colours and styles
zipping across one another. If you look carefully you’ll see the letters “UI” (User Interface)
upside-down. The reason? People were writing from wherever they sat around the table.
Generically true
This is another very common style of diagram. Next time somebody asks you to produce a
software architecture diagram of a system, present them this photo and you’re done!
It’s a very “Software Architecture 101” style of diagram where most of the content is generic.
Ignoring the source data systems at the top of the diagram (TDS and RDS), we have boxes
generically labelled “transport”, “archive”, “audit”, “report generation”, “error handling” and
arrows labelled “error” and “action”. And look at the box in the centre - it’s labelled “business
logic”, which is not hugely descriptive!
This diagram is also relatively common. It shows the logical (or conceptual, or functional)
building blocks that the software system is comprised of, but offers very little information
other than that.
This diagram is also relatively common. It shows the overall shape of the software archi-
tecture (including responsibilities, which I really like) but the technology choices are left to
your imagination.
And similarly, this next diagram tells us that the solution is an n-tier Java EE system but,
like the previous diagram, it omits some important technology details.
The lines between the web server and the application server have no information about
how this communication occurs. Is it SOAP? A JSON web API? XML over HTTPS? Remote
If you look at the Unix box in the centre of the diagram, you’ll see two smaller boxes labelled
“Risk Analysis System” and “Data Import Service”. If you look closely, you’ll see that both
boxes are annotated “JAR”, which is the deployment mechanism for Java code (Java ARchive).
Basically this is a ZIP file containing compiled Java bytecode. The equivalent in the .NET
world is a DLL.
And herein lies the ambiguity. What happens if you put a JAR file on a Unix box? Well,
the answer is not very much other than it takes up some disk space. And cron (the Unix
scheduler) doesn’t execute JAR files unless they are really standalone console applications,
the sort that have a “public static void main” method as a program entry point. By deduction
then, I think both of those JAR files are actually standalone applications and that’s what I’d
like to see on the diagram. Rather than the deployment mechanism, I want to understand
the execution context.
If you’ve heard of “Plain Old C# Objects” (POCOs) or “Plain Old Java Objects” (POJOs), this
is the homeless edition. This diagram mixes up a number of different levels of detail.
In the bottom left of the diagram is a SQL Server database, and at the top left of the diagram
is a box labelled “Application”. Notice how that same box is annotated (in green) “Console-
C#”. Basically, this system seems to be made up of a C# console application and a database.
But what about the other boxes?
Well, most of them seem to be C# components, services, modules or objects and they’re
much like what we’ve seen on some of the other diagrams. There’s also a “data access” box
and a “logger” box, which could be frameworks or architectural layers. Do all of these boxes
represent the same level of granularity as the console application and the database? Or are
they actually part of the application? I suspect the latter, but the lack of boundaries makes
this diagram confusing. I’d like to draw a big box around most of the boxes to say “all of
these things live inside the console application”. I want to give those boxes a home. Again, I
do want to understand how the system has been decomposed into smaller components, but
I also want to know about the execution context too.
It’s a little like those “choose your own adventure” books that I used to read as a kid.
You would start reading at page 1 and eventually arrive at a fork in the story where you
decide what should happen next. If you want to attack the big scary creature you’ve just
encountered, you turn to page 47. If you want to run away like a coward, it’s page 205 for
you. You keep making similar choices and eventually, and annoyingly, your character ends
up dying and you have to start over again.
This diagram is the same. You start at the top and weave your way downwards through what
is a complex asynchronous and event-driven style of architecture. You often get to make a
choice - should you follow the “fail event” or the “complete event”? As with the books, all
paths eventually lead to the (SNMP) trap on the left of the diagram.
The diagram is complex, it’s trying to show everything and the single colour being used
doesn’t help. Removing some information and/or using colour coding to highlight the
different paths through the architecture would help tremendously.
Stormtroopers
To pick up on something you may have noticed from previous diagrams, I regularly see
diagrams that include unlabelled users/actors. Essentially they are faceless clones. I don’t
know who they are and why they are using the software.
The final diagram is a great example of why whiteboards are such useful bits of equipment!
In addition, the problems associated with a single diagram are often exacerbated when a
collection of diagrams is created:
• The notation (colour coding, line styles, etc) is not consistent between diagrams.
• The naming of elements is not consistent between diagrams.
• The logical order in which to read the diagrams isn’t clear.
The example diagrams typify what I see during my workshops and these types of problems are
incredibly common. A quick Google image search will uncover a plethora of similar block
diagrams that suffer from many of the same problems we’ve seen already. I’m sure you will
have seen diagrams like this within your own organisations too.
The assumption that a diagram will be accompanied by a narrative creates a gap between the
information captured on the paper and what remains in people’s heads. Diagrams that need
explaining have limited value, especially when used for the purpose of creating long-lived
documentation.
4. A shared vocabulary
The diagrams we’ve seen so far have been an ad hoc collection of “boxes and lines”. Although
notation is important, one of the fundamental problems I believe we have in the software
development industry is that we lack a common, shared vocabulary with which to think
about and describe the software systems we build.
Next time you’re sitting in a conversation about software design, listen out for how
people use terms like “component”, “module”, “sub-system”, etc. These terms are typically
ambiguous. For example, the dictionary definition for the word “component” is “a part of a
larger whole”. Imagine that you’re building a web application, which itself uses a database.
Given the dictionary definition, both of the following uses of the word “component” are
valid.
In essence, the word “component” is being used to describe two very different levels of
abstraction.
probably use different notation in terms of colour-coding, line styles, iconography, etc. In
other words, the maps are showing the same things (the same abstractions), but the notation
varies. The key to understanding them is exactly that; a key or legend tucked away in a
corner somewhere. We can do the same with our software architecture diagrams.
Diagrams are the maps that help software developers navigate a complex
codebase.
I’ve witnessed groups of people having design discussions using terms like “component”
where they are clearly not talking about the same thing. Yet everybody in the group
is oblivious to this. Each group needs to agree upon the vocabulary, terminology and
abstractions they are going to use. The notation can then evolve.
So, notation aside (we’ll cover that later in the book), my approach to tackling this problem
is to introduce a shared vocabulary that we can use to describe our software. The primary
aspect I’m interested in is the static structure. And I’m interested in the static structure from
different levels of abstraction. Once this static structure is understood and in use, it’s easy
to supplement it with other information to illustrate runtime/behavioural characteristics,
infrastructure, deployment models, etc.
A simple model of architectural constructs used to define the static structure of a software system
Level 1: Software systems
A software system is the highest level of abstraction, representing something that delivers
value to its users, whether they are human or not.
Level 2: Containers
Put simply, a container represents something that hosts code or data, like an application or a
data store. A container is something that needs to be running in order for the overall software
system to work. In real terms, a container is something like:
A container is essentially a context or boundary inside which some code is executed or some
data is stored. The name “container” was chosen because I wanted a name that didn’t imply
anything about the physical nature of how that container is executed¹. For example, some
web servers run multiple threads inside a single process, whereas others run single threads
across multiple processes. When I’m thinking about the static structure of a software system,
I don’t want to concern myself with the details of whether a web application is using one
operating system process or many when it’s servicing requests. It’s an important detail, but
we can get into that later.
¹I do appreciate that the term “container” is now in widespread use because of containerisation and technologies like Docker.
Feel free to use something like “runtime context”, “execution environment” or “deployable unit” instead if you’d prefer to avoid the
term “container” when discussing software architecture.
It’s also worth noting that each container should be a separately deployable thing. The
physical deployment is another important detail that we will look at later, but, in theory
anyway, every container can be deployed onto or run on a separate piece of infrastructure;
whether that infrastructure is physical, virtual or containerised. The implication here is
that communication between containers is likely to require an out-of-process or remote
procedure call across the process and/or network boundary.
To give an example, let’s imagine you’re building a website that is comprised of two different
web applications (e.g. a desktop version and a mobile version, or an end-user version serving
HTML and an API endpoint serving JSON). There are a number of scenarios to consider:
1. Each web application is packaged up into separately deployable units (e.g. two Java
WAR files, two ASP.NET web applications, etc). This is two containers, regardless of
whether both deployable units are actually deployed into the same physical web server
(a deployment optimisation).
2. Although you think about the two web applications as being logically separate, they
are actually inseparable because they are packaged as a single deployment unit (e.g. a
single Java WAR file or ASP.NET web application). This is a single container.
The same is true with relational database schemas. I would treat two separate schemas as
two separate containers, irrespective of whether they are deployed into the same database
server or not.
As a final note, put simply, a container refers to an execution context and it’s really a runtime
construct. This means that libraries or modules (e.g. JAR files, DLL files, .NET assemblies,
etc) should not be considered as containers unless they are runnable on their own, like a Java
or Spring Boot application that is packaged into an executable JAR file, for example.
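The rule of thumb in the scenarios above is that containers are counted by deployable unit, not by the server they happen to be deployed onto, and that a library only counts if it is runnable on its own. A minimal sketch of that rule follows. The `DeployableUnit` type, its fields and all of the file, schema and server names are hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployableUnit:
    name: str
    hosted_on: str    # where it happens to be deployed (a deployment detail)
    standalone: bool  # True if it runs or stores data on its own; False for a plain library


def containers(units):
    """Return the names of the units that count as containers.

    The host is deliberately ignored: two deployable units on the same
    server are still two containers.
    """
    return [unit.name for unit in units if unit.standalone]


# Scenario 1: two separately deployable web applications on one web server.
scenario_1 = [
    DeployableUnit("desktop-site.war", "web-server-1", standalone=True),
    DeployableUnit("mobile-site.war", "web-server-1", standalone=True),
]

# Scenario 2: both web applications packaged as a single deployment unit.
scenario_2 = [DeployableUnit("website.war", "web-server-1", standalone=True)]

# Two schemas on one database server, plus a library that is not runnable on its own.
others = [
    DeployableUnit("orders schema", "db-server-1", standalone=True),
    DeployableUnit("reporting schema", "db-server-1", standalone=True),
    DeployableUnit("utils.jar", "web-server-1", standalone=False),
]

print(len(containers(scenario_1)), len(containers(scenario_2)), len(containers(others)))
```

Note that `hosted_on` never appears in the `containers` function. That is the point: where something is deployed is a separate question from whether it is a container.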
Level 3: Components
The word “component” is a hugely overloaded term in the software development industry,
but I like to think of a component as being a grouping of related functionality encapsulated
behind a well-defined interface. With the C4 model, components are not separately deploy-
able units. Instead, it’s the container that’s the deployable unit. In other words, all of the
components inside a container typically execute in the same process space.
How those components are packaged (e.g. one component vs many components per JAR
file, DLL, shared library, etc) is an orthogonal concern and, from my
perspective, doesn’t affect how we think about components.
Level 4: Code
Finally, components are made up of one or more code elements constructed with the basic
building blocks of the programming language that you’re using; classes, interfaces, enums,
functions, objects, etc.
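The levels described above nest inside one another, which can be sketched as a small data structure. This is a minimal illustration rather than a definitive metamodel: the class names, fields and the example system are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodeElement:
    name: str  # e.g. a class, interface, enum or function


@dataclass
class Component:
    name: str
    code_elements: List[CodeElement] = field(default_factory=list)


@dataclass
class Container:
    name: str
    technology: str
    components: List[Component] = field(default_factory=list)


@dataclass
class SoftwareSystem:
    name: str
    containers: List[Container] = field(default_factory=list)


# Hypothetical example: a web application that uses a database.
system = SoftwareSystem(
    name="Example System",
    containers=[
        Container(
            name="Web Application",
            technology="Java",
            components=[
                Component(
                    name="Customer Component",
                    code_elements=[
                        CodeElement("CustomerService"),
                        CodeElement("CustomerServiceImpl"),
                    ],
                )
            ],
        ),
        Container(name="Database", technology="Relational database schema"),
    ],
)

for container in system.containers:
    print(container.name, [component.name for component in container.components])
```

In this sketch the word "component" can only ever mean one thing, a part of a container, which avoids the ambiguity that the dictionary definition allows.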
download a copy of the GitHub repository, open it in your IDE of choice and visualise it by
reverse-engineering a UML class diagram from the code, you’ll get something like this.
As you would expect, this diagram is showing you all of the Java classes and interfaces that
make up the Spring PetClinic web application, plus all of the relationships between them.
The properties and methods are hidden on the diagram because they add too much noise to
the picture. This isn’t a complex codebase by any stretch of the imagination but, by showing
classes and interfaces, the diagram is showing too much detail.
Let’s remove those classes that aren’t useful to having an “architecture” discussion about the
system. In other words, let’s only show those classes/interfaces that have some significance
from a static structure perspective. In concrete terms, for this specific codebase, it means
excluding the model (domain) and util classes.
After a little rearranging, we now have a simpler diagram with which to reason about the
software architecture. We can also see the architectural layers again (controllers, services and
repositories). But this diagram is still showing code-level elements (i.e. classes and interfaces).
In order to zoom up one level, we need to identify which of these code-level elements can be
grouped together to form “components”. The strategy for grouping code-level elements into
components will vary from codebase to codebase (and we’ll discuss this later in the book)
but, for this codebase, the strategy might look like this.
Each of the blue boxes represents what I would consider to be a “component” in this codebase.
In summary, each of the web controllers is a separate component, along with the result of
combining the remaining interfaces and their implementation classes. If we remove the code
level noise, we get a picture like this.
In essence, we’re grouping the classes and interfaces into components, each of which is a unit of
related functionality. You will likely have shared code (e.g. abstract base classes, supporting
classes, helper classes, utility classes, etc) that is used across many components, such as
the JdbcPetVisitExtractor in this example. Some of this shared code can be refactored and moved
“inside” a particular component, but some of it is inevitable.
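The grouping strategy described above (each web controller is its own component, while the remaining interfaces are combined with their implementation classes) can be sketched in a few lines. The class names below are illustrative stand-ins rather than an exact listing of the Spring PetClinic codebase, and the suffix-based matching rules are an assumption made for this sketch. A real codebase would need its own strategy.

```python
from collections import defaultdict

# Illustrative class names only; not an exact listing of any real codebase.
CODE_ELEMENTS = [
    "OwnerController",
    "VetController",
    "ClinicService",
    "ClinicServiceImpl",
    "OwnerRepository",
    "JdbcOwnerRepositoryImpl",
    "VetRepository",
    "JdbcVetRepositoryImpl",
    "JdbcPetVisitExtractor",
]

INTERFACE_SUFFIXES = ("Service", "Repository")


def group_into_components(names):
    """Group code-level element names into components; return (components, shared)."""
    components = defaultdict(list)
    # Pass 1: each controller and each interface defines a component.
    for name in names:
        if name.endswith("Controller") or name.endswith(INTERFACE_SUFFIXES):
            components[name].append(name)
    # Pass 2: attach implementation classes to the interface they implement.
    shared = []
    for name in names:
        if name in components:
            continue
        owner = next(
            (c for c in components
             if name.endswith("Impl") and name[: -len("Impl")].endswith(c)),
            None,
        )
        if owner is not None:
            components[owner].append(name)
        else:
            shared.append(name)  # supporting code used across components
    return dict(components), shared


components, shared = group_into_components(CODE_ELEMENTS)
for component, members in components.items():
    print(component, "->", members)
print("shared:", shared)
```

Anything that matches neither rule ends up in `shared`, which mirrors the observation that some supporting code does not belong to any single component.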
Although this example illustrates a traditional layered architecture, the same principles are
applicable regardless of how you package your code (e.g. by layer, feature or component)
or the architectural style in use (e.g. layered, hexagonal, ports and adapters, etc). My aim
in all of this is to minimise, and in fact remove, the gap between how software developers
think about components from a logical and physical perspective. Components should be
real things, evident in the code, rather than logical constructs that are used in architecture
discussions only.
at runtime. While this model makes sense, I find it adds an additional level of detail that is
usually unnecessary when thinking about a software system from a “big picture” perspective.
For this reason, I’ve deliberately avoided using the term “module” and instead focus on the
identification of coarser-grained components within the static structure.
I’ve also avoided using the term “subsystem”, which some people use to refer to a collection
of related components or a functional slice of a software system. The problem I have with the
term “subsystem” is that it’s often difficult to map this concept onto a real-world codebase.
If the concept of components and modules, or systems and subsystems, is useful, then feel
free to build that into the shared vocabulary that you create.
4.5 Microservices?
Given the degree of hype and discussion around microservices at the moment, it’s worth
being explicit about how to describe microservices using the vocabulary we’ve defined so
far. Broadly speaking, there are two options.
2. Microservices as containers
On the other hand, if the microservices are a part of a software system that you are building
(i.e. you own them), I would treat them as containers, along with any data stores that those
microservices use (these are separate containers). In the same way that a modular monolithic
application is a container with a number of components running inside it, a microservice is
a container with a (smaller) number of components running inside it. The actual number
of components will depend upon the implementation strategy. It could range from the very
simple (i.e. one, where a microservice is a container with a single component running inside)
through to something like a mini-layered or hexagonal architecture.
4.6 Serverless?
I tend to treat the serverless concepts (e.g. AWS Lambda, Azure Functions) in the
same way as microservices. If you’re building a software system comprised of a number of
serverless functions, think of them as containers because they are all separately deployable.
1. System Context: A System Context diagram provides a starting point, showing how
the software system in scope fits into the world around it.
2. Containers: A Container diagram zooms into the software system in scope, showing
the high-level technical building blocks (containers) and how they interact.
3. Components: A Component diagram zooms into an individual container, showing the
components inside it.
4. Code: A code (e.g. UML class) diagram can be used to zoom into an individual
component, showing how that component is implemented.
The C4 model 34
Shneiderman’s mantra
My starting point for understanding any software system is to draw a system context
diagram. This helps me to understand the scope of the system, who is using it and what
the key system dependencies are. It’s usually quick to draw and quick to understand.
Next I’ll open up the system and draw a diagram showing the containers (web applications,
mobile apps, standalone applications, databases, file systems, message buses, etc) that make
up the system. This shows the overall shape of the software system, how responsibilities
have been distributed and the key technology choices that have been made.
As developers, we often need more detail, so I’ll then zoom into each (interesting) container
in turn and show the “components” inside it. This is where I show how each application has
been decomposed into components, along with a brief note about key responsibilities and
technology choices of those components. Hand-drawing the diagrams can become tedious,
which is why you should ideally look at tooling to help automate it instead.
I might optionally progress deeper into the hierarchy to show the code-level elements (e.g.
classes, interfaces, objects, functions, etc) that make up a particular component. Ultimately
though, this detail resides in the code and, as software developers, we can get that on demand
via our IDEs.
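The idea of starting at the top and zooming in one level at a time can be sketched as a walk down a nested structure. This is a simplified illustration under stated assumptions: the `MODEL` contents and the `zoom` function are hypothetical, and the sketch covers only the static hierarchy. A real System Context diagram would also show the users and the other systems around the one in scope.

```python
# Hypothetical model: system -> containers -> components -> code elements.
MODEL = {
    "Example System": {
        "Web Application": {
            "Customer Component": ["CustomerService", "CustomerServiceImpl"],
            "Order Component": ["OrderService", "OrderServiceImpl"],
        },
        "Database": {},
    }
}


def zoom(model, *path):
    """Return the names visible after zooming in along `path`.

    zoom(model)                               -> software systems
    zoom(model, system)                       -> containers in that system
    zoom(model, system, container)            -> components in that container
    zoom(model, system, container, component) -> code elements in that component
    """
    node = model
    for name in path:
        node = node[name]
    return list(node)


print(zoom(MODEL))
print(zoom(MODEL, "Example System"))
print(zoom(MODEL, "Example System", "Web Application"))
print(zoom(MODEL, "Example System", "Web Application", "Customer Component"))
```

Each call shows only one level of detail, so every diagram stays small even when the underlying model is large.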
Next time you’re asked to create some software architecture diagrams (whether that’s to
understand an existing system, present a system overview, or do some software archaeology),
my advice is to keep Shneiderman’s mantra in mind. Start at the top and work into the
detail, creating a story that gets deeper into the detail as it progresses. The different levels
of diagrams allow you to tell different stories to different audiences; some of whom will be
technical, some not.
As a quick note, the C4 model is not a description of a design process, it’s just a collection
of diagrams that you can use to describe the static structure of a software system. That
said, and we’ll cover this later, while the C4 model describes diagrams covering four levels
of abstraction, you don’t necessarily need to create every diagram at every level. My
recommendation is that all teams create System Context and Container diagrams, and really
think about whether Component and Code diagrams provide enough benefit considering the
cost of creating and keeping them up to date.