Arc42 by Example - Software Architecture Documentation in Practice - 1st Ed (2023)
This is a Leanpub book. Leanpub empowers authors and publishers with the Lean
Publishing process. Lean Publishing is the act of publishing an in-progress ebook
using lightweight tools and many iterations to get reader feedback, pivot until you
have the right book and build traction once you do.
© 2016 - 2023 Gernot Starke, Michael Simons, Stefan Zörner, Ralf D. Müller and
Hendrik Lösch
Contents
Preface
  Acknowledgements
  Conventions
  Disclaimer
I - Introduction
  I.1 What is arc42?
  I.2 Why this Book?
  I.3 What this Book is Not
  I.4 Overview of the Examples
  I.5 Table of arc42 Topics
IV - biking2
  IV.1 Introduction and Goals
  IV.2 Constraints
  IV.3 System Scope and Context
  IV.4 Solution Strategy
  IV.5 Building Block View
  IV.6 Runtime View
  IV.7 Deployment View
  IV.8 Cross-cutting Concepts
  IV.9 Architecture Decisions
  IV.10 Quality Requirements
  IV.11 Risks and Technical Debt
  IV.12 Glossary
V - DokChess
  V.1 Introduction and Goals
  V.2 Constraints
  V.3 System Scope and Context
  V.4 Solution Strategy
  V.5 Building Block View
  V.6 Runtime View
  V.7 Deployment View
  V.8 Cross-cutting Concepts
VI - docToolchain
  VI.1 Introduction and Goals
  VI.2 Constraints
  VI.3 System Scope and Context
  VI.4 Solution Strategy
  VI.5 Building Block View
  VI.6 Runtime View
  VI.7 Deployment View
  VI.8 Cross-cutting Concepts
  VI.9 Architecture Decisions
  VI.10 Quality Requirements
  VI.11 Risks and Technical Debt
  VI.12 Glossary
Gernot
Long ago, on a winter’s day in 2004, I sat together with Peter Hruschka, a long-time
friend of mine, and discussed one¹ of our mutual favorite subjects - structure and
concepts of software systems.
We reflected on an issue we both encountered often within our work, independent
of client, domain and technology: Developers know their way around implementation
technologies, managers theirs around budgets and risk management. But when
forced to communicate (or even document) the architecture of systems, they often
started inventing their own specific ways of articulating structures, designs, concepts
and decisions.
Peter talked about his experience in requirements engineering: He introduced me to
a template for requirements, a pre-structured cabinet (or document) called VOLERE²
which contains placeholders for everything that might be important for a specific
requirements document. When working on requirements, engineers therefore needn’t
think long before they dump their results in the right place - and others are
able to retrieve them later on… (as long as they know about VOLERE’s structure). This
requirements template had been in industry use for several years; there was even
a book available³ on its usage and underlying methodology.
“If only we had a similar template for software architecture”, Peter complained, and
continued “countless IT projects could save big time and money”… My developer
soul added a silent wish: “If this was great, it could even take the ugliness out of
documentation”.
¹Peter and Gernot both share a passion for cooking too, but you probably wouldn’t share our sometimes exotic taste
²http://volere.co.uk
³http://www.volere.co.uk/book/mastering-the-requirements-process-getting-requirements-right
We both looked at each other, and within that second decided to create exactly this:
a template for software architecture documentation (and communication) that is
highly practical, allows for simple and efficient documentation and is usable for all
kinds of stakeholders. And of course, it had to be open source, completely free for
organizations to use.
That’s how the arc42 journey started.
Since then, Peter and I have used arc42 in dozens of different IT systems within
various domains. It has found significant acceptance within small, medium and large
organizations throughout the world. We wrote many articles about it, taught
it to more than 1000 (!) IT professionals and included it in several of our software
architecture-related books.
Thanx Peter for starting this wild ride with me. And, of course, for your lectures on
cooking.
Thanx to my customers and clients - I learned an incredible lot by working together
with you on your complex, huge, hard, interesting and sometimes stressful problems.
Due to all these nondisclosure agreements I signed all my life, I’m not officially
allowed to mention you all by name.
Thanx to my wife Cheffe Uli and my kids Lynn and Per for allowing dad to (once
more) sit on the big red chair and ponder about another book project… You’re the
best, and I call myself incredibly lucky to have you!
Thanx to my parents, who, back in 1985, when computer stuff was regarded
as something between crime and witchcraft, encouraged me to buy one (an
Apple-2, by the way) and didn’t even object when I wanted to study computer science
instead of something (at that time) more serious. You’re great!
Michael
I met Gernot and Peter (Hruschka) as instructors at a training at the end of 2015.
The training was called “Mastering Software Architectures” and I learned an awful
lot from both of them: not only the knowledge they shared, but also how they shared
it. By the end of the training I could call myself “Certified Professional for Software
Architecture”, but what I really took home was the wish to structure, document and
communicate my own projects the way Peter and Gernot proposed. That’s why the
current documentation of my pet project biking2 was created.
Since then I have used arc42-based documentation several times: for in-house
products as well as in projects and consultancy gigs. The best feedback was something
along the lines of: “Wow, now we’ve got an actual idea about what is going on in this
module.” What helped that particular project a lot was the fact that we relied fully on
Asciidoctor and the “Code as documentation and documentation as code”⁴ approach
I described in depth in my blog.
So here’s to Gernot and Peter: Thanks for your inspiration and the idea for arc42.
StefanZ
Ralf D. Müller
Quite a while ago, I discovered the arc42 template as an MS Word document. It didn’t
take long to see how useful it is and I started to use it in my projects. Soon, I
discovered that MS Word wasn’t the best format for me as a developing architect.
I started to experiment with various text-based formats like Markdown and AsciiDoc,
but also with wikis. The JAX conference was the chance to exchange my ideas with
Gernot. He told me that Jürgen Krey had already created an AsciiDoc version of the
arc42 template. We started to consider this template as the golden master and tried
to generate all other formats needed (at this time, mainly MS Word and Confluence)
from this template. The arc42-generator⁶ was born, and a wonderful journey was
about to start. The current peak of this journey is docToolchain⁷ - the evolution of
the arc42-generator. Read more about its architecture in this book.
⁴https://info.michael-simons.eu/2018/12/05/documentation-as-code-code-as-documentation/
⁵https://www.swadok.de
⁶https://github.com/arc42/arc42-generator
⁷https://doctoolchain.github.io/docToolchain/
On this journey, I have met many people who helped me along this way - impossible
to name all of them.
However, my biggest “Thanx!” goes out to Gernot, who always encouraged me to take
my next step and helped me along the way.
Thank you, Peter and Gernot, for pushing my architectural skills to the next level
through your superb training workshops.
Thanx to Jakub Jabłoński and Peter for their review of the architecture - You gave
great feedback!
Last but not least, I have to thank my family for their patience while I spent too much
time with my notebook!
Hendrik Lösch
Like some of my fellow authors, I met Gernot at conferences and then got to know
him better in a workshop. At that time I had already specialized in restructuring
legacy software and since I come from the field of cyber-physical systems, I noticed
some peculiarities. Bold as I was, I suggested to Gernot that we could write a book
together.
Fast-forward some years: I had created a video training about architecture
documentation for LinkedIn Learning. Gernot took up the idea and asked me whether I would
want to contribute to this book with the example I used in the video training. Lo and
behold, this is what happened.
My biggest thanks go to both Gernot and Stefan. The work of both has helped me
very often in the past years. It is a great honor for me to be able to publish something
together with them that helps other people in their work.
I also want to say a big thank you to Attila Bertok who reviewed my part of the book
and was an invaluable help in translating it from German into English.
All of us
Thanx to our thorough reviewers who helped us improve the examples, especially
Jerry Preissler, Roland Schimmack and Markus Schmitz.
Conventions
Chapter and Section Numbering:
We use Roman chapter numbers (I, II, III etc.), so we can have the Arabic numbers
within chapters in alignment with the arc42 sections…
In the sections within chapters, we add the chapter-prefix only for the top-level
sections. That leads to the following structure:
Chapter II: HtmlSC
II.1 Introduction and Goals
II.2 Constraints
II.3 Context
…
Chapter III: Mass Market CRM
III.1 Introduction and Goals
III.2 Constraints
III.3 Context
…
Explanations:
The first example (HTML Sanity Checking) contains short explanations on the
arc42 sections, formatted like this one.
In this book, we keep these explanations to a bare minimum, as there are other books
extensively covering arc42 background and foundations.
Disclaimer
We would like to add a few words of caution before we dive into the arc42 examples:
The content of this book has been created with care and to the best of our knowledge.
However, we cannot assume any liability for the up-to-dateness, completeness,
accuracy or suitability to specific situations of any of the pages.
I - Introduction
This chapter explains the following topics
Figure I.1 gives you the big picture: It shows a (slightly simplified) overview of the
structure of arc42.
Figure I.1
Compare arc42 to a cabinet with drawers: the drawers are clearly marked with labels
indicating the content of each drawer. arc42 contains 12 such drawers (a few more
than you see in the picture above). The meaning of these arc42 drawers is easy to
understand.
Why 42?
You’re kidding, aren’t you? Ever heard of Douglas Adams, the (very British) and
already deceased sci-fi writer? His novel “The Hitchhiker’s Guide to the Galaxy” calls 42
the “Answer to the Ultimate Question of Life, the Universe, and Everything”⁸.
arc42 aims at providing the answer to everything around your software architecture.
(Yes, I know it’s a little pretentious, but we couldn’t think of a better name back in
2005.)
⁸https://en.wikipedia.org/wiki/Phrases_from_The_Hitchhiker’s_Guide_to_the_Galaxy#Answer_to_the_Ultimate_Question_of_Life.2C_the_Universe.2C_and_Everything_.2842.29
⁹https://docs.arc42.org
¹⁰https://faq.arc42.org
Examples are often better suited to show how things can work than lengthy
explanations.
arc42 users have often asked for examples to complement the (quite extensive)
conceptual documentation of the template that was, unfortunately, only available
in German for several years.
There were a few approaches to illustrate how arc42 can be used in real-world
applications, but those were (and still are) scattered around numerous sources, and
not carefully curated.
After an incredibly successful (again, German only) experiment to publish one single
example as a (very skinny) 40-page booklet we decided to publish a collection of
examples on a modern publishing platform - so we can quickly react to user feedback
and add further samples without any hassle.
• Software architecture
• Architecture and Design Patterns
• Modeling, especially UML
• Chinese cooking (but you probably didn’t expect that here…)
independent mid-sized data center to support the launch of the German (government-
enforced) e-Health-Card - and later on used to support campaigns like telephone
billing, electrical-power metering and similar stuff.
The architecture of MaMa-CRM is covered in chapter III.
biking2
biking2 or “Michis milage” is a web-based application for tracking biking-related
activities. It allows the user to track the covered mileage, collect GPS tracks of routes,
convert them to different formats, track the location of the user and publish pictures
of bike tours.
The architecture of biking2 is covered in chapter IV.
docToolchain
docToolchain¹³ is a heavily used and highly practical implementation of the docs-as-
code¹⁴ approach: The basic idea is to facilitate creation and maintenance of technical
documentation.
The architecture of docToolchain is covered in chapter VI.
Foto Max
Foto Max is a made-up company that offers solutions for ordering photos. The
software architecture in this book describes how the software is implemented to be
used on the company’s devices.
Foto Max can be found in chapter VII.
¹³https://doctoolchain.github.io/docToolchain
¹⁴https://docs-as-co.de
MiniMenu
Surely one of the leanest architecture documentations you will ever encounter.
Captures some design and implementation decisions for a tiny macOS menu bar
application.
The main goal is to show that arc42 even works for very small systems.
Skip to chapter VIII to see for yourself.
3. System Scope and Context: HtmlSC (3), MaMa-CRM (4), biking2 (3), DokChess (2), docToolchain (4)
4. Solution Strategy: HtmlSC (1), MaMa-CRM (1), biking2 (1), DokChess (3), docToolchain (3)
6. Runtime View: HtmlSC (2), MaMa-CRM (3), biking2 (2), DokChess (2), docToolchain (0)
7. Deployment View: HtmlSC (3), MaMa-CRM (2), biking2 (1), DokChess (2), docToolchain (4)
8. Cross-cutting Concepts: HtmlSC (6), MaMa-CRM (5), biking2 (9), DokChess (7), docToolchain (1)
9. Architecture Decisions: HtmlSC (1), MaMa-CRM (1), biking2 (2), DokChess (5), docToolchain (3)
10. Quality Requirements: HtmlSC (1), MaMa-CRM (2), biking2 (1), DokChess (2), docToolchain (2)
11. Risks and Technical Debt: HtmlSC (1), MaMa-CRM (1), biking2 (1), DokChess (2), docToolchain (2)
12. Glossary: HtmlSC (1), MaMa-CRM (1), biking2 (4), DokChess (3), docToolchain (1)
II. HTML Sanity Checking
By Gernot Starke.
¹⁵https://github.com/aim42/htmlSanityCheck
The overall goal of HtmlSC is to create neat and clear reports, showing errors within
HTML files. Below you find a sample report.
Sample Report
HtmlSanityCheck (HtmlSC) checks HTML for semantic errors, like broken links and
missing images. It has been created to support authors who create HTML as output
format.
Basic Usage
1. A user configures the location (directory and filename) of one or several HTML
file(s), and the corresponding images directory.
2. HtmlSC performs various checks on the HTML and
3. reports its results either on the console or as HTML report.
Basic Requirements
| ID | Requirement | Explanation |
| --- | --- | --- |
| G-1 | Check HTML for semantic errors | HtmlSC checks HTML files for semantic errors, like broken links. |
| G-2 | Gradle and Maven Plugin | HtmlSC can be run/used as Gradle and Maven plugin. |
Required Checks
| Check | Explanation |
| --- | --- |
| Missing images | Check all image tags if the referenced image files exist. |
| Broken internal links | Check all internal links from anchor-tags (href="#XYZ") if the link targets "XYZ" are defined. |
| Missing local resources | Check if referenced files (e.g. css, js, pdf) are missing. |
| Duplicate link targets | Check all link targets (… id="XYZ") if the ids ("XYZ") are unique. |
| Illegal link targets | Check for malformed or illegal anchors (link targets). |
| Broken external links | Check external links for both syntax and availability. |
| Broken ImageMaps | Though ImageMaps are a rarely used HTML construct, HtmlSC shall identify the most common errors in their usage. |
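Two of the required checks, broken internal links and duplicate link targets, can be sketched in a few lines. The following is only an illustration under simplifying assumptions: it scans the raw HTML with regular expressions, whereas a real checker would work on a parsed document. The class and method names (LinkCheckSketch, brokenInternalLinks, duplicateIds) are made up for this sketch and are not part of HtmlSC.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; all names here are hypothetical, not HtmlSC's API.
class LinkCheckSketch {

    // Simplified patterns: a real checker would use an HTML parser instead of regexes.
    private static final Pattern ID = Pattern.compile("\\bid=\"([^\"]*)\"");
    private static final Pattern INTERNAL_HREF = Pattern.compile("\\bhref=\"#([^\"]*)\"");

    /** Collects the first capture group of every match, in document order. */
    static List<String> findAll(Pattern pattern, String html) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(html);
        while (matcher.find()) {
            hits.add(matcher.group(1));
        }
        return hits;
    }

    /** Broken internal links: href="#XYZ" where no element defines id="XYZ". */
    static List<String> brokenInternalLinks(String html) {
        Set<String> targets = new HashSet<>(findAll(ID, html));
        List<String> broken = new ArrayList<>();
        for (String href : findAll(INTERNAL_HREF, html)) {
            if (!targets.contains(href)) {
                broken.add(href);
            }
        }
        return broken;
    }

    /** Duplicate link targets: ids that are defined more than once. */
    static Set<String> duplicateIds(String html) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String id : findAll(ID, html)) {
            if (!seen.add(id)) {
                duplicates.add(id);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String html = "<h1 id=\"intro\">Intro</h1><h2 id=\"intro\">Again</h2>"
                + "<a href=\"#intro\">ok</a><a href=\"#missing\">broken</a>";
        System.out.println("broken internal links: " + brokenInternalLinks(html));
        System.out.println("duplicate ids: " + duplicateIds(html));
    }
}
```

Both checks only read the input string, which matches the requirement that the files under test are never modified.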
¹⁸Especially when checking external links, the correctness of links depends on external factors, like network
availability, latency or server configuration, where HtmlSC cannot always identify the root cause of potential problems.
1.3 Stakeholders
Remark: For our simple HtmlSC example we have an extremely limited number of
stakeholders; in real life you will most likely have many more stakeholders!
arc42 user: uses arc42 for architecture documentation; wants a small but practical
example of how to apply arc42.
II.2 Constraints
You want to identify all neighboring systems and the different kinds of (business)
data or events that are exchanged between your system and its neighbors.
Business context
| Neighbor | Description |
| --- | --- |
| user | documents software with toolchain that generates HTML. Wants to ensure that links within this HTML are valid. |
| local HTML files | HtmlSC reads and parses local HTML files and performs sanity checks within those. |
| local image files | HtmlSC checks if linked images exist as (local) files. |
| external web resources | HtmlSC can be configured to optionally check for the existence of external web resources. Risk: Due to the nature of web systems and the involved remote network operations, this check might need significant time and might yield invalid results due to network and latency issues. |
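The risk noted for external web resources becomes clearer with a small sketch that separates the syntax check from the availability check. This is not HtmlSC's implementation; the class and method names are assumptions for illustration, and only standard java.net calls are used. The timeout parameter plays the role of a configurable HTTP connection timeout.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch only; names are hypothetical, not HtmlSC's API.
class ExternalLinkSketch {

    /** Syntax check only: the link must parse as an http(s) URI that has a host. */
    static boolean hasValidSyntax(String link) {
        try {
            URI uri = new URI(link);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Availability check via an HTTP HEAD request.
     * A result of false may mean "broken link" or merely "network problem":
     * the caller cannot tell the root cause from this result alone.
     */
    static boolean isReachable(String link, int timeoutMillis) {
        if (!hasValidSyntax(link)) {
            return false;
        }
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(link).toURL().openConnection();
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            connection.setRequestMethod("HEAD");
            int status = connection.getResponseCode();
            connection.disconnect();
            return status >= 200 && status < 400;
        } catch (IOException e) {
            return false; // timeout, DNS failure, refused connection, ...
        }
    }

    public static void main(String[] args) {
        // Only the offline syntax check is exercised here; isReachable needs network access.
        System.out.println(hasValidSyntax("https://example.com/docs"));
        System.out.println(hasValidSyntax("mailto:someone@example.com"));
    }
}
```

Keeping the two steps apart lets a report distinguish "malformed link" from "could not be reached within the timeout", which is the distinction the risk statement above is concerned with.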
¹⁹https://gradle.org
You would like to know about the technical or physical infrastructure of your system,
together with physical channels or protocols.
The following diagram shows the participating computers (nodes) with their technical
connections plus the major artifact of HtmlSC, the hsc-plugin-binary.
Deployment context
artifact repository: A global public cloud repository for binary artifacts, similar to
MavenCentral²⁰, the Gradle Plugin Portal²¹ or similar. HtmlSC binaries are uploaded
to this server.
hsc user computer: Computer where arbitrary documentation takes place, with HTML as
output format.
build.gradle: Gradle build script configuring (among other things) the HtmlSC
plugin to perform the HTML checking.
²²https://docs.gradle.org/current/userguide/userguide.html
²³https://sourcemaking.com/design_patterns/template_method/
Whitebox (HtmlSC)
Contained Blackboxes:
| Building block | Description |
| --- | --- |
| HSC Core | HTML parsing and sanity checking |
| HSC Gradle Plugin | Exposes HtmlSC via a standard Gradle plugin, as described in the Gradle user guide²⁴. Source: Package org.aim42.htmlsanitycheck, classes: HtmlSanityCheckPlugin and HtmlSanityCheckTask |
²⁴https://docs.gradle.org/current/userguide/userguide.html
HSC-Core (Whitebox)
• configuration,
• parsing and handling HTML input,
• checking,
• creating suggestions and
• collecting checking results
Contained Blackboxes:
| Building block | Description |
| --- | --- |
| Checker | Contains the pure checking functionality. See its blackbox description below. |
The abstract class Checker provides the uniform interface (public void check()) to
different checking algorithms.
Based upon polymorphism, the actual checking is handled by subclasses of the
abstract Checker class, following the template-method pattern. This keeps the set of
checking algorithms extensible.
For a given input (target), Suggester searches within a set of possible values (options)
to find the n most similar values. For example:
• Target = “McDown”
• Options = {“McUp”, “McDon”, “Mickey”}
• The resulting suggestion would be “McDon”, because it has the greatest
similarity to the target “McDown”.
The implementation is based upon the Jaro-Winkler distance²⁵, one of the algorithms
to calculate similarity between strings.
Suggester is used at least in the following cases:
• Broken image links: Compares the name of the missing image with all available
image file names to find the closest match.
• Missing cross references (broken internal links): Compares the broken link with
all available link targets (anchors).
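The Suggester idea can be sketched with a compact, self-written Jaro-Winkler implementation. This is an assumption-laden illustration, not the code used by HtmlSC: the class name SuggesterSketch and its methods are hypothetical, and the prefix weight of 0.1 with a maximum prefix length of 4 are the commonly used Winkler defaults.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not HtmlSC's API.
class SuggesterSketch {

    /** Jaro similarity in [0, 1]; 1.0 means identical strings. */
    static double jaro(String a, String b) {
        if (a.equals(b)) return 1.0;
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        // Characters only count as matching within this distance of each other.
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(b.length() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order in both strings.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[k]) k++;
            if (a.charAt(i) != b.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double t = outOfOrder / 2.0;
        return (m / a.length() + m / b.length() + (m - t) / m) / 3.0;
    }

    /** Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters. */
    static double jaroWinkler(String a, String b) {
        double jaro = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    /** Returns the n options most similar to the target, best match first. */
    static List<String> suggest(String target, List<String> options, int n) {
        List<String> sorted = new ArrayList<>(options);
        sorted.sort(Comparator.comparingDouble((String o) -> jaroWinkler(target, o)).reversed());
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    public static void main(String[] args) {
        // The example from the text: the best suggestion for "McDown" is "McDon".
        System.out.println(suggest("McDown", List.of("McUp", "McDon", "Mickey"), 1));
    }
}
```

For the example above, “McDon” scores roughly 0.97, while “McUp” and “Mickey” stay below 0.7, so the ranking matches the suggestion given in the text.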
Rationale: This structure follows the hierarchy of checks, managing results for:
Contained Blackboxes:
| Building block | Description |
| --- | --- |
| Per-Run Results | Aggregated results for potentially many HTML pages/documents. |
Explanation:
Reporting is done in the natural hierarchy of results (see the corresponding concept
in section 8.2.1 for an example report).
1. per “run” (PerRunResults): date/time of this run, files checked, some
configuration info, summary of results
2. per “page” (SinglePageResults):
   • create page result header with summary of page name and results
   • for each check performed on this page create a section with SingleCheckResults
3. per “single check on this page” report the results for this particular check
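The result hierarchy described above maps naturally onto three nested types. The sketch below reuses the names PerRunResults, SinglePageResults and SingleCheckResults from the text; all fields and methods (checkName, findings, findingCount and so on) are assumptions made for illustration, not the actual HtmlSC classes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; fields and methods are assumptions, only the three type names come from the text.
class ResultsHierarchySketch {

    /** Results of one check (e.g. "Missing images") on one page. */
    static class SingleCheckResults {
        final String checkName;
        final List<String> findings = new ArrayList<>();

        SingleCheckResults(String checkName) {
            this.checkName = checkName;
        }
    }

    /** All single-check results for one HTML page. */
    static class SinglePageResults {
        final String pageName;
        final List<SingleCheckResults> checks = new ArrayList<>();

        SinglePageResults(String pageName) {
            this.pageName = pageName;
        }

        int findingCount() {
            int count = 0;
            for (SingleCheckResults check : checks) {
                count += check.findings.size();
            }
            return count;
        }
    }

    /** Aggregated results of one run over potentially many pages. */
    static class PerRunResults {
        final List<SinglePageResults> pages = new ArrayList<>();

        int findingCount() {
            int count = 0;
            for (SinglePageResults page : pages) {
                count += page.findingCount();
            }
            return count;
        }
    }

    public static void main(String[] args) {
        SingleCheckResults images = new SingleCheckResults("Missing images");
        images.findings.add("image \"logo.png\" not found");
        SinglePageResults page = new SinglePageResults("index.html");
        page.checks.add(images);
        PerRunResults run = new PerRunResults();
        run.pages.add(page);

        // Report along the hierarchy: run, then page, then single check.
        System.out.println("run: " + run.findingCount() + " finding(s)");
        for (SinglePageResults p : run.pages) {
            System.out.println("  page " + p.pageName + ": " + p.findingCount() + " finding(s)");
            for (SingleCheckResults c : p.checks) {
                System.out.println("    " + c.checkName + ": " + c.findings);
            }
        }
    }
}
```

Summaries at the run and page level are derived from the lower levels, so a reporter can walk the same structure top-down.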
artifact repository: Global public cloud repository for binary artifacts, similar to
mavenCentral²⁶. HtmlSC binaries are uploaded to this server.
build.gradle: Gradle build script configuring (among other things) the HtmlSC
plugin.
The three nodes (computers) shown in the diagram above are connected via Internet.
Prerequisites:
• HtmlSC developers need a Java development kit, Groovy, Gradle plus the JSoup
HTML parser.
• HtmlSC users need a Java runtime (> 1.6) plus a build file named build.gradle.
See below for a complete example.
    buildscript {
        repositories {
            mavenLocal()
            maven {
                url "https://plugins.gradle.org/m2/"
            }
        }
        dependencies {
            // in case of mavenLocal(), the following line is valid:
            classpath(group: 'org.aim42',

            // in case of using the official Gradle plugin repository:
    …
            sectlink : true,
            sectanchors: true ]

        resources {
            from(srcImagesPath) { include '**' }
            into "./images" }
    }

    // ========================================================
    apply plugin: 'org.aim42.htmlSanityCheck'

    htmlSanityCheck {
        // ensure asciidoctor->html runs first
        // and images are copied to build directory

        dependsOn asciidoctor

        sourceDir = new File("${buildDir}/html5")

        // files to check, in Set-notation
        sourceDocuments = ["many-errors.html", "no-errors.html"]

        // fail the build if any error is encountered
        failOnErrors = false

        // set the http connection timeout to 2 secs
        httpConnectionTimeout = 2000

        ignoreLocalHost = false
        ignoreIPAddresses = false
    }

    defaultTasks 'htmlSanityCheck'
²⁶https://search.maven.org/
| Term | Description |
| --- | --- |
| Anchor | Html element to create ->Links. Contains link-target in the form <a href="link-target"> |
| Cross Reference | Link from one part of the document to another part within the same document. Special form of ->Internal Link, with a ->Link Target in the same document. |
| Finding | Description of a problem found by one ->Checker within the ->Html Page. |
| Html Element | HTML pages (documents) are made up of HTML elements, e.g. <a href="link target">, <img src="image.png"> and others. See the definition from the W3-Consortium²⁷ |
| Html Page | A single chunk of HTML, mostly regarded as a single file. Shall comply with standard HTML syntax. Minimal requirement: Our HTML parser can successfully parse this page. Contains ->Html Elements. Synonym: Html Document. |
| Internal Link | Link to another section of the same page or to another page of the same domain. Also called ->Cross Reference or Local Link. |
| Link | Any reference in the ->Html Page that lets you display or activate another part of this document (->Internal Link) or another document, image or resource (can be either ->Internal (local) or ->External Link). Every link leads from the Link Source to the Link Target. |
| Link Target | Target of any ->Link, e.g. heading or any other part of ->Html Documents, any internal or external resource (identified by URI). Expressed by ->id. |
| Local Resource | Local file, either other Html files or other types (e.g. pdf, docx) |
| Run Result | The overall results of checking a number of pages (at least one page). |
    @Test
    public void testGenericURISyntax() {
        // based upon an example from the Oracle(tm) Java tutorial:
        // https://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
        def aURL = new URL(
            "https://example.com:42/docs/tutorial/index.html?name=aim42#INTRO");
        aURL.with {
            // the protocol has to match the scheme of the URL above
            assert getProtocol() == "https"
            assert getAuthority() == "example.com:42"
            assert getHost() == "example.com"
            assert getPort() == 42
            assert getPath() == "/docs/tutorial/index.html"
            assert getQuery() == "name=aim42"
            assert getRef() == "INTRO"
        }
    }
²⁹https://www.ietf.org/rfc/rfc2396.txt
We achieve that by defining the skeleton of the checking algorithm in one operation
(performCheck), deferring the specific checking algorithm steps to subclasses. The
invariant steps are implemented in the abstract base class, while the variant checking
algorithms have to be provided by the subclasses.
Template method for performing a single type of checks
    /**
     * Prerequisite: pageToCheck has been successfully parsed,
     * prior to constructing this Checker instance.
     **/
    public CheckingResultsCollector performCheck() {
        // assert prerequisite
        assert pageToCheck != null
        initResults()
        return check() // subclass executes the actual checking algorithm
    }
³⁰https://sourcemaking.com/design_patterns/template_method/
| Component | Description |
| --- | --- |
| Checker | abstract base class, containing the public template method performCheck() plus the abstract method check(), which subclasses implement |
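How a concrete checker plugs into this template method can be sketched as follows. The sketch is deliberately simplified and rests on assumptions: the parsed page is reduced to a list of ids, results are a plain list of strings instead of a CheckingResultsCollector, and DuplicateIdChecker is a hypothetical subclass invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; simplified stand-ins for the types named in the text.
class CheckerSketch {

    abstract static class Checker {
        // Simplification: the "parsed page" is reduced to the list of ids found on it.
        protected final List<String> pageToCheck;
        protected List<String> findings;

        Checker(List<String> pageToCheck) {
            this.pageToCheck = pageToCheck;
        }

        /** Template method: fixed skeleton, with check() as the only variant step. */
        public final List<String> performCheck() {
            if (pageToCheck == null) {
                throw new IllegalStateException("page must be parsed before checking");
            }
            findings = new ArrayList<>(); // plays the role of initResults()
            return check();
        }

        /** Variant step, provided by each concrete checker. */
        protected abstract List<String> check();
    }

    /** Hypothetical concrete checker: reports ids that occur more than once. */
    static class DuplicateIdChecker extends Checker {
        DuplicateIdChecker(List<String> ids) {
            super(ids);
        }

        @Override
        protected List<String> check() {
            Set<String> seen = new HashSet<>();
            for (String id : pageToCheck) {
                if (!seen.add(id)) {
                    findings.add("duplicate id: " + id);
                }
            }
            return findings;
        }
    }

    public static void main(String[] args) {
        Checker checker = new DuplicateIdChecker(List.of("intro", "usage", "intro"));
        System.out.println(checker.performCheck());
    }
}
```

Declaring performCheck() as final keeps the invariant steps in the base class, so adding a new check means writing one subclass with one method.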
8.4 Reporting
HtmlSC supports the following output (== reporting) formats and destinations:
The reporting subsystem uses the template method pattern to allow different output
formats (e.g. Console and HTML). The overall structure of reports is always the same.
The (generic and abstract) reporting is implemented in the abstract Reporter class as
follows:
Report findings using TemplateMethod pattern
    /**
     * main entry point for reporting - to be called when a report is requested
     * Uses template-method to delegate concrete implementations to subclasses
     */
    public void reportFindings() {
        initReport()            // (1)
        reportOverallSummary()  // (2)
        reportAllPages()        // (3)
        closeReport()           // (4)
    }

    private void reportAllPages() {
        pageResults.each { pageResult ->
            reportPageSummary( pageResult ) // (5)
            pageResult.singleCheckResults.each { resultForOneCheck ->
                reportSingleCheckSummary( resultForOneCheck ) // (6)
                reportSingleCheckDetails( resultForOneCheck ) // (7)
            }
            reportPageFooter() // (8) once per page, after all of its checks
        }
    }
1. initialize the report, e.g. create and open the file, copy css-, javascript and image
files.
2. create the overall summary, with the overall success percentage and a list of all
checked pages with their success rate.
3. iterate over all pages
4. write report footer - in HTML report also create back-to-top-link
5. for a single page, report the number of checks and problems plus the success
rate
6. for every singleCheck on that page, report a summary and
7. all detailed findings for a singleCheck.
8. for every checked page, create a footer, page break or similar to graphically
distinguish pages between each other.
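The eight steps above can be traced with a small sketch of a concrete reporter. It assumes a heavily simplified result model (a map from page name to the names of the checks performed) and a hypothetical TextReporter subclass that writes into a string buffer; neither is taken from the HtmlSC sources.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; simplified types, not HtmlSC's actual Reporter class.
class ReporterSketch {

    abstract static class Reporter {
        // page name -> names of the checks performed on that page (simplified result model)
        protected final Map<String, List<String>> pageResults;

        Reporter(Map<String, List<String>> pageResults) {
            this.pageResults = pageResults;
        }

        /** Template method: the report structure is fixed, the output format is not. */
        public final void reportFindings() {
            initReport();                               // (1)
            reportOverallSummary();                     // (2)
            for (Map.Entry<String, List<String>> page : pageResults.entrySet()) { // (3)
                reportPageSummary(page.getKey());       // (5)
                for (String check : page.getValue()) {
                    reportSingleCheck(check);           // (6) and (7), merged for brevity
                }
                reportPageFooter();                     // (8)
            }
            closeReport();                              // (4)
        }

        protected abstract void initReport();
        protected abstract void reportOverallSummary();
        protected abstract void reportPageSummary(String pageName);
        protected abstract void reportSingleCheck(String checkName);
        protected abstract void reportPageFooter();
        protected abstract void closeReport();
    }

    /** Hypothetical plain-text output format. */
    static class TextReporter extends Reporter {
        final StringBuilder out = new StringBuilder();

        TextReporter(Map<String, List<String>> pageResults) {
            super(pageResults);
        }

        @Override protected void initReport() { out.append("REPORT\n"); }
        @Override protected void reportOverallSummary() {
            out.append(pageResults.size()).append(" page(s) checked\n");
        }
        @Override protected void reportPageSummary(String pageName) {
            out.append("page ").append(pageName).append("\n");
        }
        @Override protected void reportSingleCheck(String checkName) {
            out.append("  check ").append(checkName).append("\n");
        }
        @Override protected void reportPageFooter() { out.append("----\n"); }
        @Override protected void closeReport() { out.append("END\n"); }
    }

    public static void main(String[] args) {
        Map<String, List<String>> results = new LinkedHashMap<>();
        results.put("index.html", List.of("Missing images", "Broken internal links"));
        results.put("about.html", List.of("Missing images"));
        TextReporter reporter = new TextReporter(results);
        reporter.reportFindings();
        System.out.print(reporter.out);
    }
}
```

An HTML reporter would override the same six hooks, which is why the overall structure of reports stays identical across output formats.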
• Highly flexible: Can parse files and strings (and other) input.
Alternatives:
• jsoup: a plain HTML parser without any dependencies (!) and a rich API to
access all HTML elements in DOM-like syntax. Clear winner!
• HTTPUnit: a testing framework for web applications and -sites. Its main focus
is web testing and it suffers from a large number of dependencies.
• HtmlCleaner³²
³²http://htmlcleaner.sourceforge.net/
Remark: For our small example, such a quality tree is overly extensive… whereas in
real-life systems we’ve seen quality trees with more than 100 scenarios. Therefore,
we stick to (repeating) a few scenarios here.
Quality Scenarios
| Attribute | Description |
| --- | --- |
| Correctness | Every broken internal link will be found. |
| Safety | HtmlSC leaves its source files completely intact: Content of files to be checked will never be modified. |
Remark: In our small example we don’t see any real risks for architecture and
implementation. Therefore the risks shown below are a bit artificial…
| Risk | Description |
| --- | --- |
| Bottleneck with access rights on public repositories | Currently only one single developer has access rights to deploy new versions of HtmlSC on public servers like Bintray or Gradle plugin portal. |
| High effort required for new versions of Gradle | Upgrading Gradle from v-3.x to v-4.x required configuration changes in HtmlSC. Such effort might be needed again for future upgrades of the Gradle API. |
| System might become obsolete | In case AsciiDoc or Markdown processors implement HTML checking natively, HtmlSC might become obsolete. |
II.12 Glossary
In the case of our small example, the terms given here should be familiar to most
developers. You will find a more interesting version of the glossary in section II-8.1.
Term Definition
Link A reference within an →HTMLPage. Points to →LinkTarget
Cross Reference Link from one part of a document to another part within the
same document.
• Management of binary images for credit card and insurance companies, which
print cardholders’ images onto the cards to prevent misuse. The German e-
Health card belongs to this type.
• Meter reading for either energy or water providing companies or large-scale
real-estate enterprises.
As MaMa is the foundation for a whole family of CRM systems, it shall be adaptable
to a variety of different input and output interfaces and channels without any source
code modification! This interface flexibility is detailed in section 1.1.2.
An example shall clarify the complex business and technical requirements for MaMa.
MoPho is a (hypothetical) mobile phone provider with a large customer base and
a variety of different tariff options (flat fee, time-based, volume-based etc.). Some
of these tariffs or tariff combinations are not marketable any longer (a shorthand
for “MoPho does not earn its desired profit from them”, but that’s another story).
Others are technically outdated.
In their ongoing effort to optimize their business, MoPho management decided
to streamline their tariff landscape. Within a large campaign they contact every
customer with outdated tariffs and offer upgrades to new and mostly more beneficial
tariff options. Some customers get to pay less, but have to accept a longer contract
duration. Others will have to pay a little more, but receive additional services,
increased capacity, bandwidth or other benefits. In all cases where customers accept
the new tariff offer, the formal contract between MoPho and the customer has to be
updated, which requires a valid signature to be legally binding³⁴.
MoPho intends to inform certain customers via printed letters of the new tariff
offerings. Customers can react via letter, fax, phone or in person at one of MoPho’s
many sales outlets. As MoPho’s core competency is phone services, it doesn’t want
to bother with printing letters, scanning and processing replies, answering phone
inquiries and dealing with other out-of-scope activities. Therefore they employ
MaMa-CRM as an all-around carefree solution. Look at MaMa’s approach to this
scenario:
³⁴In some countries or for some contract types there might be simpler solutions than a written signature. Please ignore
that for the moment.
1. MoPho selects the affected customers from its internal IT systems. We’re talking
about a 30+ million customer base and approximately 10 million of those
customers will be part of this campaign. MoPho exports their address, contract
and tariff data and transmits it to MaMa-CRM. MaMa imports this customer
data.
2. MaMa forwards only the relevant parts of customer data to the print service
provider (a «Partner»). This company creates personalized letters from address
and tariff data, prints those letters and finally delivers them to the postal service
(which ensures the letters finally end up in customers’ mailboxes).
3. MaMa now informs the call center (also a «Partner») participating in this
campaign, so they can prepare for the customers’ reactions.
4. In the meantime the print service has delivered the letters to the postal service,
which delivers letters to the customers. This is more difficult than it sounds:
1-5% of addresses tend to change within 12 months, depending on the customer
base. Some customers refuse to accept marketing letters. The postal service
informs MaMa about all the problematic cases (address unknown, forwarding
request etc.), so MaMa can decide on further activities.
5. The customers decide what to do with the letter. There’s a multitude of options
here:
• Customer fills out the enclosed reply form and signs it.
• Customer fills out the reply form, but forgets to sign it.
• Customer does not understand this offer and sends written inquiry by
letter.
• Customer inquires by phone call.
• Customer ignores letter and does not react at all.
• There are additional special cases (customer is not contractually capable,
has a custodian or legal guardian, is under-age, deceased or temporarily
unavailable…).
6. Let’s assume the customer accepts and sends the letter back via postal service.
These letters will be forwarded to the scan service provider (again, a «Partner»),
which scans them and performs optical character recognition.
7. MaMa imports the scan-results from the scan service provider. Again, several
cases are possible.
8. MaMa distinguishes between all possible cases and takes appropriate actions:
• The form is completely filled in and signed.
• The form is completely filled in, but the customer forgot to sign.
• Customer signed, but forgot to fill in other important fields in the form…
• If the form is not returned within an appropriate period of time, MaMa
might decide to resend the letter, maybe even with a different wording,
layout or even a different contractual offer.
• Surely you can imagine several other possible options…
9. Finally, for every customer who accepted the changed tariff agreement, MaMa
sends back the appropriate information, so the mandator MoPho can internally
change the corresponding master data, namely contracts and tariffs and take all
other required actions (activities in MaMa).
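The case distinction in steps 7 and 8 can be illustrated with a small decision function. This is a minimal sketch only: the enum names, the `ReplyClassifier` class and the reply deadline parameter are assumptions made for illustration and are not taken from the MaMa code base, where such decisions are driven by configurable business rules.

```java
/** Hypothetical outcome categories for a scanned reply form (names are illustrative). */
enum ReplyStatus { COMPLETE_AND_SIGNED, SIGNATURE_MISSING, FIELDS_MISSING, NO_REPLY }

/** Hypothetical follow-up actions a campaign could trigger. */
enum FollowUp { REPORT_ACCEPTANCE_TO_MANDATOR, REQUEST_SIGNATURE, REQUEST_MISSING_FIELDS, RESEND_LETTER, WAIT }

final class ReplyClassifier {
    private ReplyClassifier() {}

    /** Classifies a reply; 'received' is false as long as no form has come back. */
    static ReplyStatus classify(boolean received, boolean allFieldsFilled, boolean signed) {
        if (!received) return ReplyStatus.NO_REPLY;
        if (!allFieldsFilled) return ReplyStatus.FIELDS_MISSING;
        if (!signed) return ReplyStatus.SIGNATURE_MISSING;
        return ReplyStatus.COMPLETE_AND_SIGNED;
    }

    /** Maps a status to a follow-up; a missing reply is only resent after the deadline. */
    static FollowUp decide(ReplyStatus status, int daysSinceLetterSent, int replyDeadlineDays) {
        switch (status) {
            case COMPLETE_AND_SIGNED: return FollowUp.REPORT_ACCEPTANCE_TO_MANDATOR;
            case SIGNATURE_MISSING:   return FollowUp.REQUEST_SIGNATURE;
            case FIELDS_MISSING:      return FollowUp.REQUEST_MISSING_FIELDS;
            default:
                return daysSinceLetterSent > replyDeadlineDays ? FollowUp.RESEND_LETTER : FollowUp.WAIT;
        }
    }
}
```

The special cases from step 5 (custodian, under-age, deceased customers) are deliberately left out here; they would add further status values.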
For the phone service provider MoPho, the primary advantage is the centralized
interface to all customer interaction: All processes required to conduct this crucial
campaign are handled by MaMa-CRM, without intervention by MoPho. MoPho also
benefits from the nearly unlimited flexibility of MaMa-CRM to import and
export a variety of different data formats. This flexibility enables MaMa-CRM to
easily incorporate new campaign partners (like a second scan service provider, an
additional email-service provider or web hosting partner).
MaMa provides the software foundation for campaign instances that have to be
extensively configured to operate for a specific mandator with a number of specific
partners.
The following section gives an overview of such a configuration for the MoPho
campaign described in the previous section.
1. Configure client master data
Define and configure the required data attributes and additional classes. Usually this
information is completely provided by the mandator.
This data (e.g. contract-, tariff-, offer- and other business-specific entities or
attributes) is modeled as extensions of the MaMa base model in UML.
In the MoPho telecommunications example, this data includes:
• Existing tariff
• List of potential new tariffs
• Contract number
• Validity period of existing tariff
• Validity period of new tariff
– ProductionOrderToCardProvider
• Input-activities import data provided by the mandator or partners, e.g.:
– MasterDataFromMandatorImport
– ScanDataImport
– CallCenterImport
– SalesOutletImport
• Internal activities define data maintenance operations, e.g.:
– Delete customer data 60 days after completion.
– Delete all campaign data 120 days after campaign termination.
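An internal maintenance activity such as "delete customer data 60 days after completion" boils down to a date comparison. The following is a minimal sketch; the class name `RetentionPolicy` and its method are hypothetical and only illustrate the rule, not MaMa's actual scheduling.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Hypothetical retention rule: data becomes deletable a fixed number of days after completion. */
final class RetentionPolicy {
    private final int retentionDays;

    RetentionPolicy(int retentionDays) {
        this.retentionDays = retentionDays;
    }

    /** Returns true once at least retentionDays full days have passed since completion. */
    boolean isDueForDeletion(LocalDate completedOn, LocalDate today) {
        return ChronoUnit.DAYS.between(completedOn, today) >= retentionDays;
    }
}
```

With `new RetentionPolicy(60)`, a record completed on 1 January 2023 becomes due on 2 March 2023; a second instance with 120 days would cover the campaign-level rule.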
³⁷Effectively this requirement forbade the use of the well-known version control system “subversion” (svn), as at the
time MaMa-CRM was developed, svn could not reliably purge files from version history! Therefore a campaign-spanning
svn repository was not allowed.
Simply stated, MaMa is a data hub with flexibility in several aspects: It imports data
(aspect 1) coming over various transmission channels (aspect 2) and executes a set of
specific business rules (aspect 3) to determine further activities to be performed with
this data. Such activities consist of data exports (aspect 4) to certain business partners
(like print-, scan- or fax service providers, call centers and the like).
MaMa has to be flexible concerning all these aspects.
MaMa always deals with variants of client data. The simplest version could look
like an instance of the following class definition:
Generic Definition of Client Record
MaMa needs to import such data from various campaign partners. These partners
operate their own IT systems with their own data formats. They will export the
affected data records in some format (see below) of their choice. MaMa needs to
handle many such formats:
• Standard transmission protocols (ftp, sftp, http, https) as both client and server
• Compressed
• Encrypted
• Creation of checksums, at least with counters, MD5 and SHA-1
• Mandator- or campaign-specific credentials for secure transmission
• Transmission metadata, i.e.: Count the number of successful transmissions and
send the number of the last successful transmission as a prefix to the next
transmission.
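The checksum and counter requirements can be sketched with the JDK's `MessageDigest`. This is an illustration under assumptions: the class `TransmissionChecksums`, the line-based record counter and the newline-separated prefix format are hypothetical; the source does not describe MaMa's actual wire format.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for transmission checksums and metadata. */
final class TransmissionChecksums {
    private TransmissionChecksums() {}

    /** Returns the lowercase hex digest of the payload, e.g. for "MD5" or "SHA-1". */
    static String hexDigest(String algorithm, byte[] payload) {
        try {
            byte[] hash = MessageDigest.getInstance(algorithm).digest(payload);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Digest algorithm not available: " + algorithm, e);
        }
    }

    /** Simple counter checksum: number of non-empty lines (records) in the payload. */
    static int countRecords(String payload) {
        int count = 0;
        for (String line : payload.split("\n")) {
            if (!line.trim().isEmpty()) count++;
        }
        return count;
    }

    /** Assumed prefix format: number of the last successful transmission on its own first line. */
    static String withSequencePrefix(long lastSuccessfulTransmission, String payload) {
        return lastSuccessfulTransmission + "\n" + payload;
    }
}
```

MD5 and SHA-1 appear here because the requirement names them; both are considered weak against deliberate collisions and serve only as transfer integrity checks in this sketch.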
Now that MaMa has received and processed data (by import, transmission and rule
processing), it needs to send data to the campaign partners. Analogous to data import,
these receiving partners have very specific requirements concerning data formats.
It is pretty common for MaMa to receive data from a partner company or mandator
in CSV format, import it into its own relational database, process it record-by-record,
export some records in XML to one partner and some other records in a fixed format
to another partner.
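To make the format conversion concrete, here is a minimal sketch that turns one semicolon-separated record into a fixed-width line. The class `FixedWidthExporter`, the separator and the column widths are assumptions for illustration; the splitting is naive and does not handle quoted fields.

```java
/** Hypothetical exporter: converts one CSV record into a fixed-width line. */
final class FixedWidthExporter {
    private FixedWidthExporter() {}

    /**
     * Pads each field with spaces to its column width and truncates values that are too long.
     * Assumes ';' as separator and no quoting or escaping inside fields.
     */
    static String toFixedWidth(String csvLine, int[] widths) {
        String[] fields = csvLine.split(";", -1);
        if (fields.length != widths.length) {
            throw new IllegalArgumentException(
                "Expected " + widths.length + " fields but got " + fields.length);
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String value = fields[i].trim();
            if (value.length() > widths[i]) {
                value = value.substring(0, widths[i]);
            }
            line.append(value);
            for (int pad = value.length(); pad < widths[i]; pad++) {
                line.append(' ');
            }
        }
        return line.toString();
    }
}
```

For example, the record `4711;Doe;Flat M` with widths 6, 5 and 4 yields a 15-character line in which the tariff name is truncated to `Flat`.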
Prio 1: Flexibility
MaMa-CRM can:
For all these topics, InDAC administrators can fully configure all campaign-specific
aspects within one workday.
Prio 2: Security
MaMa will keep all its client and campaign related data safe. It shall never be possible
for campaign-external parties to access data or metadata.
InDAC Administrators need to sign nondisclosure contracts to be allowed to admin-
ister campaign data. This organizational issue is out-of-scope for MaMa.
MaMa can fully process 250,000 scanned return letters within 24 hours.
1.3 Stakeholder
III.2 Constraints
General Constraints
Operational constraints
Final results (outbound, to     MaMa transfers final campaign results back to the
mandator)                       mandator. This is the ultimate goal of the campaign.
Preliminary results (inbound,   Partners send the results of their respective work back
from partner)                   to MaMa. This data is called “preliminary results”, as it
                                requires processing and evaluation by MaMa before it can
                                be marked as final. Process logs and partner status
                                reports are also transmitted to MaMa via this interface.
Client data (outbound)
Client data is sent to partners on a “need-to-know” basis to achieve data minimality:
Every partner organization gets only the data they absolutely require to fulfill their
campaign tasks.
For example, MaMa will not disclose clients’ street address to call centers (they
usually get to know name, phone contact and sometimes one or two additional
attributes for verification purposes.)
On the other hand, print service providers usually don’t get to know the phone
numbers of clients, as these are not required to deliver printed letters via postal
services.
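The need-to-know principle can be sketched as a per-partner whitelist of attributes that is applied before any export. Everything named here (`PartnerType`, `NeedToKnowFilter`, the attribute keys) is hypothetical; in MaMa the actual attribute sets are campaign-specific configuration.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical partner categories used only in this sketch. */
enum PartnerType { CALL_CENTER, PRINT_SERVICE }

/** Projects a client record onto the attributes a partner type is allowed to see. */
final class NeedToKnowFilter {
    private static final Map<PartnerType, Set<String>> ALLOWED = new EnumMap<>(PartnerType.class);

    static {
        // Assumed whitelists: call centers see no street address, printers see no phone number.
        ALLOWED.put(PartnerType.CALL_CENTER, new HashSet<>(Arrays.asList("name", "phone")));
        ALLOWED.put(PartnerType.PRINT_SERVICE, new HashSet<>(Arrays.asList("name", "street", "zip", "city")));
    }

    private NeedToKnowFilter() {}

    /** Returns a copy containing only whitelisted attributes; the input map is not modified. */
    static Map<String, String> project(Map<String, String> client, PartnerType partner) {
        Map<String, String> visible = new LinkedHashMap<>(client);
        visible.keySet().retainAll(ALLOWED.get(partner));
        return visible;
    }
}
```

A whitelist (rather than a blacklist) means that a newly added attribute is withheld from every partner until it is explicitly released.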
The diagram below contains a more formal version of the context diagram. It includes
an admin interface, which was left out in the informal version above.
The admin interface enables MaMa and campaign administrators to perform all
required administrative tasks needed to init, configure and operate campaigns.
The following diagram details the example already shown in section 1.1.1.
The data flows are detailed (in excerpts!) in the following table:
Mandator (outbound)   Final results: ID, tariff and contract    Zip-compressed CSV over
                      details for every client who accepted     sftp, MaMa uploads
                      the contract modification proposal
…                     …                                         …
³⁸Some mandators with extremely high security requirements negotiated their own distinct physical hardware for
their MaMa instance(s).
Element Description
«Instance» MaMa A distinct instance of MaMa, running a specific campaign
(connected to a single mandator and a number of
campaign-specific partner organizations)
«Category» Mandator For every MaMa instance there is one distinct mandator.
«Category» Partner For every MaMa instance there might be several different partner
organizations, each one having a distinct communication
channel.
«Instance» Database Every MaMa instance has its own database instance, usually
within the same virtual machine.
Rationale: The structure of building blocks within MaMa is based upon functional
decomposition and the concept of generated persistence (see section 8.1).
Contained Blackboxes:
Element Description
Import Handler Imports data from Partners or Mandator via external
interfaces
Campaign Data Management Completely generated. Stores all client- and campaign
data.
Operations Monitoring Monitors (and reports) all import and export processes
plus database and application state.
Configuration (Blackbox)
Interfaces:
• For all configuration methods, the campaignID and mandatorID must always
be passed as input parameters.
• Configuration information is always a subclass of the (abstract) superclass Configuration.
getCampaignConfig
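The two interface rules above can be sketched as follows. Only the names `Configuration` and `getCampaignConfig` come from the text; the signature, the `CampaignConfig` subclass, its `importFormat` attribute and the in-memory `ConfigurationStore` are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Abstract superclass of all configuration information, as named in the text. */
abstract class Configuration {
    private final String campaignId;
    private final String mandatorId;

    protected Configuration(String campaignId, String mandatorId) {
        this.campaignId = campaignId;
        this.mandatorId = mandatorId;
    }

    String campaignId() { return campaignId; }
    String mandatorId() { return mandatorId; }
}

/** Hypothetical concrete configuration; the importFormat attribute is purely illustrative. */
final class CampaignConfig extends Configuration {
    private final String importFormat;

    CampaignConfig(String campaignId, String mandatorId, String importFormat) {
        super(campaignId, mandatorId);
        this.importFormat = importFormat;
    }

    String importFormat() { return importFormat; }
}

/** Hypothetical in-memory store; every lookup takes both IDs as input parameters. */
final class ConfigurationStore {
    private final Map<String, CampaignConfig> configs = new HashMap<>();

    void register(CampaignConfig config) {
        configs.put(key(config.campaignId(), config.mandatorId()), config);
    }

    CampaignConfig getCampaignConfig(String campaignId, String mandatorId) {
        CampaignConfig config = configs.get(key(campaignId, mandatorId));
        if (config == null) {
            throw new IllegalArgumentException(
                "No configuration for campaign " + campaignId + " of mandator " + mandatorId);
        }
        return config;
    }

    private static String key(String campaignId, String mandatorId) {
        return mandatorId + "/" + campaignId;
    }
}
```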
MaMa Level 2
Rationale: This is (again) based upon functional decomposition of the generic import
process. Section 6.1 describes the runtime behavior of this component.
Contained Blackboxes:
Element Description
Receiver Receives data from partners or mandators via the ImportData port.
ImportErrorHandler Handles the various possible errors during import. With severe
errors, import is stopped. Many (especially record or object level)
errors are recoverable - these will be logged and the
administrator may be notified.
ImportData (Port) Connection to the outside world - via ftp and http, usually
transmitted via VPN.
FileArchiver Non-erasable archive where all imported files are kept for
auditability.
FileFilter Various filter operations, like decrypt, unzip etc. Explained in the
filter concept in section 8
Validator Checks files, records (collections of strings) and client objects for
validity.
Important Interfaces:
Not documented.
MaMa Level 3
Receiver (Whitebox)
Rationale: We have to admit that this structure just evolved out of a number of
prototypes. A more functionally oriented design would most likely improve
understandability, but we never refactored the code in that direction due to different
priorities.
Contained Blackboxes:
Element                         Description
Directory-, WebService- or      Components that listen for input of specific kinds, e.g.
Message-Listener                the DirectoryListener watches for new files to appear in
                                certain (configurable) directories, either in a local or
                                remote file system.
FileProcessor                   Completely handles input files, calls all required
                                operations to be performed on the file (archive, unzip,
                                decrypt etc.). A big mess of spaghetti code - you don’t
                                want to look at it…
First we explain the generic import, where no campaign-specific activities are
executed. This concerns configureReceiveChannel and especially the instantiateFilterChain()
activities.
Steps 5 and 6 form a dynamically configured pipes-and-filters dataflow subsystem. You
will find more information in the filter concept.
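A dynamically configured pipes-and-filters chain can be sketched as a list of byte-array transformations that is assembled from configured filter names. This is an illustration only: the `FilterChain` class, the filter names and the registry are assumptions, and decryption is left out entirely because the original system's security details are not disclosed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical filter chain: each filter maps a byte array to a byte array. */
final class FilterChain {
    /** Registry of known filters by name (names are illustrative). */
    private static final Map<String, UnaryOperator<byte[]>> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("gzip", FilterChain::gzip);
        REGISTRY.put("gunzip", FilterChain::gunzip);
    }

    private final List<UnaryOperator<byte[]>> filters = new ArrayList<>();

    /** Builds a chain from configured filter names, in the given order. */
    static FilterChain fromConfig(List<String> filterNames) {
        FilterChain chain = new FilterChain();
        for (String name : filterNames) {
            UnaryOperator<byte[]> filter = REGISTRY.get(name);
            if (filter == null) throw new IllegalArgumentException("Unknown filter: " + name);
            chain.filters.add(filter);
        }
        return chain;
    }

    /** Pipes the input through all filters, feeding each output into the next filter. */
    byte[] apply(byte[] input) {
        byte[] current = input;
        for (UnaryOperator<byte[]> filter : filters) {
            current = filter.apply(current);
        }
        return current;
    }

    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the chain is built from names, a campaign can change the order or set of filters through configuration alone, for example `FilterChain.fromConfig(Arrays.asList("gunzip"))` for a partner that delivers gzip-compressed files.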
Prerequisite: Data has been imported from external source, has been successfully
filtered (i.e. decrypted and decompressed). See previous section (Import Raw).
The diagram below contains error handling. In good cases there will be no errors.
Calls to ImportErrorHandler are only executed if errors occur!
Due to the sensitive nature of data handled by the original MaMa system
the owner required strict nondisclosure in that aspect. Therefore we are not
allowed to go into any detail of security.
Element Description
Campaign-i Virtual machine for one single campaign.
• Unique ID of the requesting entity (usually the tax ID number of the
organization/company issuing the request). MaMa needed to use the tax ID of the
InDAC data center.
• request purpose (for MaMa, a constant)
• request sequence number (RSN)
³⁹https://de.wikipedia.org/wiki/Informationstechnische_Servicestelle_der_gesetzlichen_Krankenversicherung
Prerequisites:
• Every MaMa instance will handle data related to individual people - called
clients in MaMa domain terminology.
• All clients will have a small number of common attributes.
• For all productive campaigns MaMa needs to handle an arbitrary number of
additional attributes.
• Every mandator will add several campaign-specific attributes to the client,
and/or will add campaign specific types (like insurance-contract or mobile-
phone-contract)
• Once configured prior to campaign start, these campaign-specific data
structures will rarely change⁴⁰
⁴⁰In several years of MaMa operation, data structures within an active campaign always remained fixed, therefore
MaMa never needed any data migration utilities…
Element Description
Client Abstract class, representing a person plus corresponding contact
information.
Contact Contact information that will be used for contacting the client instances
during campaign execution.
Next Action Generic class describing campaign activities. Central to the concept of
campaign process control and business rule execution
Specific campaign models always contain a (physical) copy of the complete core
domain. The abstract Client class always needs to be subclassed, and might be 1:n
associated with additional classes.
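The relationship between the core domain and a campaign-specific model can be sketched in plain Java. The classes `Client`, `Contact` and `NextAction` are named in the table above, but all attributes shown here are assumptions, and `MoPhoClient` is a hypothetical subclass built from the MoPho example attributes (contract number, existing tariff, potential new tariffs). The real MaMa model is generated from UML and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

/** Contact information used to reach a client (attributes are assumed). */
final class Contact {
    final String name;
    final String postalAddress;

    Contact(String name, String postalAddress) {
        this.name = name;
        this.postalAddress = postalAddress;
    }
}

/** Generic description of the next campaign activity for a client (attribute is assumed). */
final class NextAction {
    final String activityName;

    NextAction(String activityName) {
        this.activityName = activityName;
    }
}

/** Abstract core-domain client; every campaign has to provide a concrete subclass. */
abstract class Client {
    private final String id;
    private final Contact contact;
    private NextAction nextAction;

    protected Client(String id, Contact contact) {
        this.id = id;
        this.contact = contact;
    }

    String id() { return id; }
    Contact contact() { return contact; }
    NextAction nextAction() { return nextAction; }
    void setNextAction(NextAction nextAction) { this.nextAction = nextAction; }
}

/** Hypothetical campaign-specific subclass for the MoPho tariff campaign. */
final class MoPhoClient extends Client {
    final String contractNumber;
    final String existingTariff;
    final List<String> potentialNewTariffs;   // 1:n association to campaign-specific data

    MoPhoClient(String id, Contact contact, String contractNumber,
                String existingTariff, List<String> potentialNewTariffs) {
        super(id, contact);
        this.contractNumber = contractNumber;
        this.existingTariff = existingTariff;
        this.potentialNewTariffs = new ArrayList<>(potentialNewTariffs);
    }
}
```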
Due to nondisclosure agreements with InDAC we cannot show example source code
for the persistence concept.
Alternatives
• Initially MaMa had started with AndroMDA code generation framework, but
that open source project lost popularity, could not deliver the required support
and ceased working with newer Maven releases - so MaMa switched to OAW.
• MaMa uses the commercial MagicDraw UML (in version 9.0) modeling tool,
which can in principle generate code based upon models, but proved to be too
inflexible for the desired Hibernate integration. The contracting entity (InDAC)
refused to upgrade to newer versions or alternative tools.
Encryption filters
Due to the sensitive nature of data handled by the original MaMa system
the owner required strict nondisclosure in that aspect. Therefore we are not
allowed to go into any detail of security.
⁴¹https://docs.oracle.com/javase/8/docs/technotes/guides/security/crypto/CryptoSpec.html
⁴²https://www.bouncycastle.org/java.html
Compression filter
MaMa uses the open source rule engine Drools⁴³ for definition, implementation and
execution of business rules. Rules are defined as text files, which are interpreted at
runtime by the rule engine. This enables modification and maintenance of rules
without recompilation and redeployment of the whole system.
On the other hand, faulty rules can seriously hamper an active campaign – therefore
modification of business rules shall always be thoroughly tested!
Rules always have a simple “when <A> then <B>” format, where <A> and <B> are
Java expressions.
You find a complete reference and many examples of the rule language at Drools
documentation home⁴⁴.
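To illustrate the “when <A> then <B>” idea without reproducing Drools syntax, the following plain-Java sketch models a rule as a condition plus an action. It is explicitly not the Drools API: `ReplyFact`, `SimpleRule` and `SimpleRuleRunner` are hypothetical names, and the runner makes a single pass over the rules instead of the re-evaluation a real rule engine performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical fact the rules operate on. */
final class ReplyFact {
    final boolean signed;
    String nextAction = "NONE";

    ReplyFact(boolean signed) {
        this.signed = signed;
    }
}

/** A rule in "when <condition> then <action>" form. */
final class SimpleRule {
    final String name;
    final Predicate<ReplyFact> when;
    final Consumer<ReplyFact> then;

    SimpleRule(String name, Predicate<ReplyFact> when, Consumer<ReplyFact> then) {
        this.name = name;
        this.when = when;
        this.then = then;
    }
}

/** Evaluates each rule once, in order, and returns the names of the rules that fired. */
final class SimpleRuleRunner {
    private SimpleRuleRunner() {}

    static List<SimpleRule> exampleRules() {
        return Arrays.asList(
            new SimpleRule("unsigned reply", f -> !f.signed, f -> f.nextAction = "REQUEST_SIGNATURE"),
            new SimpleRule("signed reply", f -> f.signed, f -> f.nextAction = "REPORT_TO_MANDATOR"));
    }

    static List<String> fireAll(List<SimpleRule> rules, ReplyFact fact) {
        List<String> fired = new ArrayList<>();
        for (SimpleRule rule : rules) {
            if (rule.when.test(fact)) {
                rule.then.accept(fact);
                fired.add(rule.name);
            }
        }
        return fired;
    }
}
```

In MaMa the rules live in text files interpreted at runtime; the lambdas above are compiled code and therefore lack exactly the redeployment-free flexibility that motivated the choice of a rule engine.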
⁴³www.drools.org
⁴⁴http://www.drools.org/learn/documentation.html
Flexibility Scenarios
ID Scenario
F1 New CSV import format shall be configurable at CCT within 2 hours.
F3 New XML based import format shall be configurable at CCT within 2 hours.
F6 New XML based export format shall be configurable at CCT within 2 hours.
ID Scenario
P1 Import and fully process 250,000 scanned documents (including images) within
24 hours. That’s an average processing rate of approximately 3 complete documents per
second. Import format will be a combination of CSV file plus images as single files.
P2 Import and fully process 100,000 records of a CSV file within 30 minutes.
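The average rates implied by the two performance scenarios follow from simple division (a back-of-the-envelope check that assumes an even load over the whole time window):

```latex
\[
\text{P1:}\quad \frac{250\,000\ \text{documents}}{24 \cdot 3600\ \text{s}}
  = \frac{250\,000}{86\,400\ \text{s}} \approx 2.9\ \text{documents/s}
\]
\[
\text{P2:}\quad \frac{100\,000\ \text{records}}{30 \cdot 60\ \text{s}}
  = \frac{100\,000}{1\,800\ \text{s}} \approx 55.6\ \text{records/s}
\]
```

Peak rates will be higher whenever deliveries arrive in batches rather than evenly spread.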
Security Scenarios
ID Scenario
S1 Client and campaign data from one mandator shall never be accessible for another
mandator.
S2 MaMa is required to preserve all incoming data from mandators and partners for the
appropriate timeframe (usually 90-180 days after the end of a campaign). Such
archived data (e.g. files or messages) needs to be made completely accessible for an
auditor or inspection within 90 minutes at most.
S3 In case campaigns involve financial data of clients (e.g. credit card, bank account or
similar information), these have to be processed and managed compliant to
PCIDSS⁴⁵ regulations.
⁴⁵https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
III.11 Risks
• The Receiver component suffers from overly complicated source code, created
by a number of developers without consensus. Since early days, most production
bugs resulted from this part of MaMa-CRM.
• The runtime flexibility of import/export configurations and campaign processes
might lead to incorrect and undetected behavior at runtime, as there are no
configuration checks. Mischievous administrators can misconfigure any MaMa-
CRM instance at any time.
• Configuration settings are not archived and therefore might get lost (so there
might be no fallback to the last working configuration in case of trouble).
• The ‘Common-Metadata-Store’ is an overly trivial and resource-wasting
synchronization mechanism and should be replaced with a decent async / event-based
system asap.
III.12 Glossary
Term Definition
Activity Process step of campaign. For MaMa-CRM: either inbound,
outbound or internal.
Activity, internal Scheduled data maintenance activities, e.g. removing some data 90
days after its last usage. In Germany, data security law requires
some kinds of data to be deleted after certain intervals.
Activity, inbound Read data (i.e. files) delivered by either a ->partner or ->mandator.
Term Definition
Instance MaMa-CRM is a family of systems, where a single instance is
configured and operated for exactly one mandator and one or
more campaigns.
⁴⁶https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
IV - biking2
By Michael Simons.
biking2 or “Michis milage” is primarily a web-based application for tracking
biking-related activities. It allows the user to track the covered milage, collect GPS tracks of
routes, convert them to different formats, track the location of the user and publish
pictures of bike tours.
The secondary goal of this application is to have a topic to experiment with various
technologies, for example Spring Boot on the server side and AngularJS, JavaFX and
others on the client side.
As such biking2 has been around since 2009 and in its current Java-based form since
2014. Defining production as fulfilling the primary goal, it’s been in production ever
since.
The project is published under Apache License on GitHub⁴⁷, use it however you like.
Though I’ve been blogging regularly about this pet project, the documentation in its
current form was created after I met Gernot and Peter at an awesome workshop in
Munich.
⁴⁷https://github.com/michael-simons/biking2
What is biking2?
The main purpose of biking2 is keeping track of bicycles and their milages as well
as converting Garmin Training Center XML (tcx)⁴⁸ files to standard GPS Exchange
Format (GPX)⁴⁹ files and storing them in an accessible way.
In addition biking2 is used to study and evaluate technology, patterns and frame-
works. The functional requirements are simple enough to leave enough room for
concentration on quality goals.
Main features
The application must only handle exactly one user with write permissions.
Most bike aficionados have problems understanding the question “why more than
one bike?”, so the system should be able to keep track of between 2 and 10
bikes for one user, storing 1 total milage per bike and month. All milages per month,
year and other metrics should be derived from this running total, so that the user
only needs to look at his odometer and enter the value.
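Deriving periodic milages from the running total amounts to taking differences between consecutive odometer readings. The following is a minimal sketch; the class `MilageCalculator` and its methods are hypothetical and not taken from the biking2 sources.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: derives per-month milages from consecutive odometer readings. */
final class MilageCalculator {
    private MilageCalculator() {}

    /**
     * Takes one odometer reading per month (in chronological order) and returns the
     * distance covered between each pair of consecutive readings.
     */
    static List<Integer> monthlyMilages(List<Integer> odometerReadings) {
        List<Integer> milages = new ArrayList<>();
        for (int i = 1; i < odometerReadings.size(); i++) {
            int delta = odometerReadings.get(i) - odometerReadings.get(i - 1);
            if (delta < 0) {
                throw new IllegalArgumentException("Odometer readings must not decrease");
            }
            milages.add(delta);
        }
        return milages;
    }

    /** Sums the derived milages, e.g. to obtain a yearly total. */
    static int total(List<Integer> milages) {
        int sum = 0;
        for (int milage : milages) {
            sum += milage;
        }
        return sum;
    }
}
```

Readings of 1000, 1250, 1250 and 1600 km yield monthly milages of 250, 0 and 350 km, which add up to 600 km.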
The application should store an “unlimited” number of tracks.
The images should be collected from Daily Fratze⁵⁰; the source is all images that are
tagged with “Radtour”. In addition the user should be able to provide an “unlimited”
number of “gallery pictures” together with a date and a short description.
⁴⁸https://en.wikipedia.org/wiki/Training_Center_XML
⁴⁹https://en.wikipedia.org/wiki/GPS_Exchange_Format
⁵⁰https://dailyfratze.de
# Quality Motivation
1 Understandability The functional requirements are simple enough to allow a simple,
understandable solution that allows focus on learning about new
Java 8 features, Spring Boot and AngularJS.
3 Interoperability The application should provide a simple API that allows access to
new clients.
5 Testability The architecture should allow easy testing of all main building
blocks.
1.3. Stakeholders
The following list contains the most important personas for this application:
Role/Name Goal/Boundaries
Developers Developers who want to learn about developing modern
applications with Spring Boot and various frontends, preferably
using Java 8 in the backend.
Software Architects Looking for an arc42 example; want to get ideas for their daily
work.
Michael Simons Improving his skills; wants to blog about Spring Boot; looking for a
topic he can eventually hold a talk about; needed a project to try
out new Java 8 features.
IV.2 Constraints
The few constraints on this project are reflected in the final solution. This section
shows them and if applicable, their motivation.
Constraint                               Background
TC1  Implementation in Java              The application should be part of a Java 8 and
                                         Spring Boot showcase. The interface (i.e. the api)
                                         should be language and framework agnostic,
                                         however. It should be possible that clients can be
                                         implemented using various frameworks and
                                         languages.
TC2  Third party software must be        The interested developer or architect should be
     available under a compatible open   able to check out the sources, compile and run the
     source license and installable      application without problems compiling or
     via a package manager               installing dependencies. All external dependencies
                                         should be available via the package manager of the
                                         operating system or at least through an installer.
Constraint Background
TC3 OS independent development The application should be compilable on all 3
major operating systems (Mac OS X, Linux and
Windows)
Constraint Background
TC5 Memory friendly Memory can be limited (due to availability on a shared host or
deployment to cloud based host). If deployed to a cloud based
solution, every megabyte of memory costs.
Constraint                               Background
OC1  Team                                Michael Simons
OC3  IDE independent project setup       No need to continue the editor and IDE wars. The
                                         project must be compilable on the command line via
                                         standard build tools. Due to OC2 there is only one
                                         IDE supporting Java 8 features out of the box:
                                         NetBeans 8 beta and release candidates.
OC4  Configuration and version           Private git repository with a complete commit
     control / management                history and a public master branch pushed to
                                         GitHub and linked to a project blog.
OC6  Published under an Open Source      The source, including documentation, should be
     license                             published as Open Source under the Apache 2
                                         License.
2.3 Conventions
Conventions Background
C1 Architecture documentation Structure based on the English arc42-Template in
version 6.5
C2 Coding conventions The project uses the Code Conventions for the Java
TM Programming Language⁵¹. The conventions are
enforced through Checkstyle.
⁵¹https://www.oracle.com/technetwork/java/codeconvtoc-136057.html
Business Context
Biker
A passionate biker uses biking2 to manage his bikes, milages, tracks and also visual
memories (aka images) taken on tours etc. He also wants to embed his tracks as
interactive maps on other websites.
Daily Fratze
Daily Fratze⁵² provides a list of images tagged with certain topics. biking2 should
collect all images for a given user tagged with “Theme/Radtour”.
GPSBabel
GPSBabel is a command line utility for manipulating GPS related files in various
ways. biking2 uses it to convert TCX into GPX files. The heavy lifting is done by
GPSBabel and the resulting file will be managed by biking2.
Arbitrary websites
The user may want to embed (or brag with) tracks on arbitrary websites. He only
wants to paste a link to a track on a website that supports embedded content to
embed a map with the given track.
⁵²https://dailyfratze.de
Technical Context
Backend (biking2::api)
The api runs on a supported application server, using either an embedded container
or an external container. It communicates via operating system processes with
GPSBabel on the same server.
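Communication "via operating system processes" can be sketched with the JDK's `ProcessBuilder`. The class `GpsBabelCommand` is hypothetical, and the command line is an assumption based on GPSBabel's usual `-i <format> -f <input> -o <format> -F <output>` invocation with `gtrnctr` as the Garmin Training Center format; the exact call used by biking2 is not shown in this chapter.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper around a GPSBabel call that converts a TCX file into a GPX file. */
final class GpsBabelCommand {
    private GpsBabelCommand() {}

    /** Builds the command line; the executable path is passed in because it is system-specific. */
    static List<String> tcxToGpx(String gpsBabelExecutable, Path tcxFile, Path gpxFile) {
        return Arrays.asList(
            gpsBabelExecutable,
            "-i", "gtrnctr", "-f", tcxFile.toString(),   // input: Garmin Training Center XML
            "-o", "gpx", "-F", gpxFile.toString());      // output: GPS Exchange Format
    }

    /** Runs the command as an operating system process and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list (instead of one concatenated string) avoids shell quoting problems with file names that contain spaces.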
The connection to Daily Fratze is an http-based RSS feed. The feed is paginated and
provides all images with a given tag, but older images may not be available any more
if the owner decided to add a digital expiry.
Furthermore biking2 provides an oEmbed⁵³ interface for all tracks stored in the
system. Arbitrary websites supporting that protocol can request embeddable content
over http knowing only a link to the track without working on the track or map apis
themselves.
⁵³https://oembed.com
The frontend is implemented with two different components, the biking2::spa (Single
Page Application) is part of this package. The spa runs in any modern web browser
and communicates via http with the api.
⁵⁴https://www.highcharts.com
From those two we have a closer look at the api only. For details regarding the
structure of an AngularJS 1.2.x application, have a look at their developers guide⁵⁵.
NOTE: To comply with the Java coding style guidelines, the modules “bikingPictures”
and “galleryPictures” reside in the Java packages “bikingpictures” and
“gallerypictures”.
⁵⁵https://code.angularjs.org/1.2.28/docs/guide
Contained blackboxes
Blackbox Purpose
bikes (Blackbox) Managing bikes, adding monthly milages, computing
statistics and generating charts.
locations (Blackbox) MQTT and STOMP interface for creating new locations and
providing them in real time on websockets via stomp.
bikingPictures (Blackbox) Reading biking pictures from an RSS feed provided by Daily
Fratze and providing an API to them.
Interfaces
Interface Description
bikes Api REST api containing methods for reading, adding and
decommissioning bikes and for adding milages to single
bikes.
tracks Api REST api for uploading and reading TCX files.
Real time locations WebSocket / STOMP based interface on which new locations
are published.
Real time location updates MQTT interface to which MQTT compatible systems like
OwnTracks⁵⁶ can offer location updates.
⁵⁶https://owntracks.org
RSS feed reader Needs a Daily Fratze OAuth token for accessing an RSS feed
containing biking pictures, which are then grabbed from
Daily Fratze.
galleryPictures Api REST api for uploading and reading arbitrary image files
(pictures related to biking).
Intent/Responsibility
bikes provides the external API for reading, creating and manipulating bikes and
their milages as well as computing statistics and generating charts.
Interfaces
Interface Description
REST interface /api/bikes/* Contains all methods for manipulating bikes and their
milages.
Files
The bikes module and all of its dependencies are contained inside the Java package
ac.simons.biking2.bikes.
Intent/Responsibility
tracks manages file uploads (TCX files), converts them to GPX files and computes
their surrounding rectangle (envelope) using GPSBabel. It also provides the oEmbed
interface that resolves URLs to embeddable tracks.
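The conversion step can be sketched as an operating system process call. The
following example is an illustration under assumptions, not the code used in
biking2: it relies on the GPSBabel command line options -i/-f (input format and
file) and -o/-F (output format and file) and on gtrnctr, GPSBabel's name for the
Garmin Training Center (TCX) format. The class name and the file names are
hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Sketch: converting a TCX file to GPX by calling GPSBabel as a process. */
final class GpsBabelCommandSketch {

    /** Builds the argument list for a TCX to GPX conversion. */
    static List<String> tcxToGpx(Path tcxFile, Path gpxFile) {
        return Arrays.asList(
                "gpsbabel",
                "-i", "gtrnctr", "-f", tcxFile.toString(),
                "-o", "gpx", "-F", gpxFile.toString());
    }

    /** Runs the conversion and returns the exit code of the process. */
    static int convert(Path tcxFile, Path gpxFile) throws Exception {
        Process process = new ProcessBuilder(tcxToGpx(tcxFile, gpxFile))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; calling convert(...) requires gpsbabel on the path.
        System.out.println(String.join(" ",
                tcxToGpx(Paths.get("ride.tcx"), Paths.get("ride.gpx"))));
    }
}
```

Keeping the argument list separate from the process call makes the command
testable on machines where GPSBabel is not installed.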
Interfaces
Interface Description
REST interface /api/tracks/* Contains all methods for manipulating tracks.
Files
The tracks module and all of its dependencies are contained inside the Java package
ac.simons.biking2.tracks.
Intent/Responsibility
trips manages distances that have been covered on single days without relationships
to bikes.
Interfaces
Interface Description
REST interface /api/trips/* Contains all methods for manipulating trips.
Files
The trips module and all of its dependencies are contained inside the Java package
ac.simons.biking2.trips.
Intent/Responsibility
locations stores locations with timestamps in near realtime and provides access to
locations for the last 30 minutes.
Interfaces
Interface Description
REST interface /api/locations/* For retrieving all locations in the last 30
minutes.
Files
The locations module and all of its dependencies are contained inside the Java
package ac.simons.biking2.tracker. The module is configured through the class
ac.simons.biking2.config.TrackerConfig.
Intent/Responsibility
bikingPictures regularly checks an RSS feed from Daily Fratze, collecting new
images and storing them locally. It also provides an API for getting all
collected images.
Interfaces
Interface Description
RSS Feed reader Provides access to the Daily Fratze RSS Feed.
Files
The bikingPictures module and all of its dependencies are contained inside the Java
package
ac.simons.biking2.bikingpictures.
Intent/Responsibility
galleryPictures manages file uploads (images). It stores them locally and provides
an RSS interface for getting metadata and image data.
Interfaces
Interface Description
REST interface /api/galleryPictures/* Contains all methods for adding and reading
arbitrary pictures.
Files
The galleryPictures module and all of its dependencies are contained inside the
Java package ac.simons.biking2.gallerypictures.
The BikeRepository is a Spring Data JPA based repository for BikeEntities. The
BikeController and the ChartsController access it to retrieve and store instances
of BikeEntity and provide external interfaces.
Contained blackboxes
Blackbox Purpose
highcharts Contains logic for generating configurations and definitions for Highcharts⁵⁸
on the server side.
⁵⁸https://www.highcharts.com
The TrackRepository is a Spring Data JPA based repository for TrackEntities. The
TracksController and the OembedController access it to retrieve and store instances
of TrackEntity and provide external interfaces.
Contained blackboxes
Blackbox Purpose
gpx Generated JAXB classes for parsing GPX files. Used by the TracksController
to retrieve the surrounding rectangle (envelope) for new tracks.
Locations are stored and read via a Spring Data JPA based repository named
LocationRepository. This repository is only accessed through the LocationService.
The LocationService provides real time updates for connected clients through a
SimpMessagingTemplate and the LocationController uses the service to provide
access to all locations created within the last 30 minutes.
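The "last 30 minutes" rule boils down to a timestamp comparison. The sketch
below shows the idea with plain Java collections; the Location class and the
createdWithin method are hypothetical stand-ins, and in biking2 itself the
filtering is presumably done by the repository rather than in memory.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: selecting all locations created within a time window. */
final class RecentLocationsSketch {

    /** Hypothetical, simplified stand-in for a stored location. */
    static final class Location {
        final double latitude;
        final double longitude;
        final Instant createdAt;

        Location(double latitude, double longitude, Instant createdAt) {
            this.latitude = latitude;
            this.longitude = longitude;
            this.createdAt = createdAt;
        }
    }

    /** Returns the locations whose timestamp lies after (now - window). */
    static List<Location> createdWithin(List<Location> all, Duration window, Instant now) {
        Instant threshold = now.minus(window);
        List<Location> recent = new ArrayList<>();
        for (Location location : all) {
            if (location.createdAt.isAfter(threshold)) {
                recent.add(location);
            }
        }
        return recent;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<Location> all = Arrays.asList(
                new Location(52.0, 7.0, now.minus(Duration.ofMinutes(40))),
                new Location(52.1, 7.1, now.minus(Duration.ofMinutes(15))));
        System.out.println(createdWithin(all, Duration.ofMinutes(30), now).size());
    }
}
```

Passing the current time as a parameter keeps the rule deterministic in tests.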
New locations are created by the service either through a REST interface in the form
of the LocationController or via a MessageListener on an MQTT channel.
Contained blackboxes
Blackbox Purpose
rss Generated JAXB classes for parsing RSS feeds. Used by the
FetchBikingPicturesJob to read the contents of an RSS feed.
Deployment View
uberspace host A host on Uberspace⁵⁹ where biking2.jar runs inside a Server JRE⁶⁰
with restricted memory usage.
biking2.jar A “fat jar” containing all Java dependencies and a loader so that
the Jar is runnable either as jar file or as a service script (on Linux
hosts).
⁵⁹https://uberspace.de
⁶⁰https://www.oracle.com/technetwork/java/javase/downloads/server-jre8-downloads-2133154.html
Domain Models
biking2 is a data-centric application; therefore everything is based around those
entities that manifest as tables.
Entities in biking2
Tables
Name Description
bikes Stores the bikes. Contains dates when the bike was bought and
decommissioned, an optional link, color for the charts and also an
auditing column when a row was created.
tracks Stores GPS tracks recorded and uploaded with an optional description.
For each day the track names must be unique. The columns minlat,
minlon, maxlat and maxlon store the encapsulating rectangle for the
track. The type column is constrained to “biking” and “running”.
assorted_trips Stores a date and a distance on that day. Multiple distances per day are
allowed.
locations Stores arbitrary locations (latitude and longitude based) for a given
timestamp with an optional description.
biking_pictures Stores pictures collected from Daily Fratze together with their original
date of publication, their unique external id and a link to the page on
which the picture originally appeared.
gallery_pictures Stores all pictures uploaded by the user with a description and the date
the picture was taken. The filename column contains a single,
computed filename without path information.
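To make the meaning of the minlat, minlon, maxlat and maxlon columns of the
tracks table concrete, the following sketch computes such an encapsulating
rectangle from a list of track points. It is an illustration only: the class
and method names are hypothetical, and biking2 itself derives the envelope from
the GPX data produced by GPSBabel rather than with this code.

```java
import java.util.Arrays;

/** Sketch: the encapsulating rectangle (envelope) of a track. */
final class EnvelopeSketch {

    /**
     * Expects points as {latitude, longitude} pairs and
     * returns {minlat, minlon, maxlat, maxlon}.
     */
    static double[] envelopeOf(double[][] points) {
        if (points.length == 0) {
            throw new IllegalArgumentException("A track needs at least one point");
        }
        double minLat = Double.POSITIVE_INFINITY;
        double minLon = Double.POSITIVE_INFINITY;
        double maxLat = Double.NEGATIVE_INFINITY;
        double maxLon = Double.NEGATIVE_INFINITY;
        for (double[] point : points) {
            minLat = Math.min(minLat, point[0]);
            minLon = Math.min(minLon, point[1]);
            maxLat = Math.max(maxLat, point[0]);
            maxLon = Math.max(maxLon, point[1]);
        }
        return new double[] {minLat, minLon, maxLat, maxLon};
    }

    public static void main(String[] args) {
        double[][] track = {{50.0, 6.0}, {51.5, 5.5}, {50.7, 7.2}};
        System.out.println(Arrays.toString(envelopeOf(track)));
    }
}
```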
Domain model
Name Description
BikeEntity A bike was bought on a given date and can be decommissioned. It
has a color and an optional link to an arbitrary website. It may or
may not have milages recorded. It has some important functions.
MilageEntity A milage is part of a bike. For each bike one milage per month can
be recorded. The milage is the combination of its recording date,
the amount and the bike.
GalleryPictureEntity A bean for handling the pictures uploaded by the user. prePersist
fills the createdAt attribute prior to inserting into the database.
LocationEntity Used in the tracker module for working with real time locations.
Name Description
decommission Decommissions a bike on a given date.
addMilage Adds a new milage for a given date and returns it. The milage will
only be added if the date is after the date the last milage was added
and if the amount is greater than the last milage.
getPeriods Gets all monthly periods in which milages have been recorded.
getLastMilage Gets the last milage recorded. In most cases the same as getMilage.
Persistency
biking2 uses an H2⁶¹ database for storing relational data and the file system for binary
image files and large ASCII files (especially all GPS files).
During development and production the H2 database is persisted to a file and is not
in-memory based. The location of this file is configured through the
biking2.database-file property and the default value during development is
./var/dev/db/biking-dev relative to the working directory of the VM.
All access to the database goes through JPA using Hibernate as provider. See the
Domain Models for all entities used in the application.
The JPA Entity Manager isn’t accessed directly but only through the facilities offered
by Spring Data JPA, that is through repositories only.
All data stored as files is stored relative to biking2.datastore-base-directory
which defaults to ./var/dev. Inside are 3 directories:
⁶¹https://www.h2database.com/html/main.html
User Interface
The default user interface for biking2, which is packaged within the final artifact, is
a Single Page Application written in JavaScript using AngularJS together with a
largely default Bootstrap template.
For using the realtime location update interface, choose one of the many MQTT
clients out there.
There is a second user interface written in Java called bikingFX⁶².
⁶²https://info.michael-simons.eu/2014/10/22/getting-started-with-javafx-8-developing-a-rest-client-application-from-scratch/
⁶³https://www.webjars.org
⁶⁴https://alexo.github.io/wro4j/
⁶⁵https://github.com/michael-simons/wro4j-spring-boot-starter
Wro4j configuration
<groups xmlns="http://www.isdc.ro/wro">
    <!-- Dependencies for the full site -->
    <group name="biking2">
        <group-ref>osm</group-ref>

        <css minimize="false">/webjars/bootstrap/@bootstrap.version@/css/bootstrap.min.css</css>
        <css>/css/stylesheet.css</css>

        <js minimize="false">/webjars/jquery/@jquery.version@/jquery.min.js</js>
        <js minimize="false">/webjars/bootstrap/@bootstrap.version@/js/bootstrap.min.js</js>
        <js minimize="false">/webjars/momentjs/@momentjs.version@/min/moment-with-locales.min.js</js>
        <js minimize="false">/webjars/angular-file-upload/@angular-file-upload.version@/angular-file-upload-html5-shim.min.js</js>
        <js minimize="false">/webjars/angularjs/@angularjs.version@/angular.min.js</js>
        <js minimize="false">/webjars/angularjs/@angularjs.version@/angular-route.min.js</js>
        <js minimize="false">/webjars/angular-file-upload/@angular-file-upload.version@/angular-file-upload.min.js</js>
        <js minimize="false">/webjars/angular-ui-bootstrap/@angular-ui-bootstrap.version@/ui-bootstrap.min.js</js>
        <js minimize="false">/webjars/angular-ui-bootstrap/@angular-ui-bootstrap.version@/ui-bootstrap-tpls.min.js</js>
        <js minimize="false">/webjars/highcharts/@highcharts.version@/highcharts.js</js>
        <js minimize="false">/webjars/highcharts/@highcharts.version@/highcharts-more.js</js>
        <js minimize="false">/webjars/sockjs-client/@sockjs-client.version@/sockjs.min.js</js>
        <js minimize="false">/webjars/stomp-websocket/@stomp-websocket.version@/stomp.min.js</js>

        <js>/js/app.js</js>
        <js>/js/controllers.js</js>
        <js>/js/directives.js</js>
    </group>
</groups>
This model file is filtered by the Maven build: version placeholders are replaced,
and all resources, from webjars as well as from the filesystem, are made available
as biking.css and biking.js.
Transaction Processing
biking2 relies on Spring Boot to create all necessary beans for handling local
transactions within the JPA EntityManager. biking2 does not support distributed
transactions.
Session Handling
biking2 only provides a stateless public API; there is no session handling.
Security
biking2 offers security for its API endpoints only via HTTP basic access
authentication⁶⁶ and, in the case of the MQTT module, with MQTT’s default security
model. Security can be increased by running the application behind an SSL proxy or
by configuring SSL support in the embedded Tomcat container.
For the kind of data managed here it’s an agreed tradeoff to keep the application
simple. See also Safety.
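Why an SSL proxy increases security becomes obvious when looking at what a
client actually sends with HTTP basic access authentication. The following
sketch builds the Authorization header with the JDK only; the class name is
hypothetical and the credentials are the well-known example values from the
HTTP Basic authentication specification, not credentials used by biking2.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch: the header a client sends for HTTP basic access authentication. */
final class BasicAuthSketch {

    static String authorizationHeader(String user, String password) {
        String credentials = user + ":" + password;
        // Base64 is an encoding, not encryption: anyone who can read the
        // request can decode the credentials again.
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(authorizationHeader("Aladdin", "open sesame"));
        // prints "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
    }
}
```

Without transport encryption these credentials travel in a trivially reversible
form, which is the tradeoff accepted here.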
⁶⁶https://en.wikipedia.org/wiki/Basic_access_authentication
Safety
No part of the system has a life-endangering aspect.
1. Bikes which have been decommissioned cannot be modified (i.e. they can have
no new milages): Checked in BikesController.
2. For each unique month only one milage can be added to a bike. Checked in the
BikeEntity.
3. A new milage must be greater than the last one. Also checked inside BikeEntity.
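The three rules above can be sketched in a few lines of plain Java. This is a
simplified illustration and not the actual BikeEntity: all names are
hypothetical, persistence is left out, and the decommission check, which
biking2 performs in the controller, is folded into the same class for brevity.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

/** Sketch: consistency rules for adding milages to a bike. */
final class BikeRulesSketch {

    static final class Milage {
        final YearMonth month;
        final int amount;

        Milage(YearMonth month, int amount) {
            this.month = month;
            this.amount = amount;
        }
    }

    private final List<Milage> milages = new ArrayList<>();
    private LocalDate decommissionedOn; // null while the bike is in use

    void decommission(LocalDate date) {
        this.decommissionedOn = date;
    }

    Milage addMilage(YearMonth month, int amount) {
        // Rule 1: decommissioned bikes cannot get new milages.
        if (decommissionedOn != null) {
            throw new IllegalStateException("Bike has been decommissioned");
        }
        if (!milages.isEmpty()) {
            Milage last = milages.get(milages.size() - 1);
            // Rule 2: one milage per month, recorded in chronological order.
            if (!month.isAfter(last.month)) {
                throw new IllegalArgumentException("Month must be after " + last.month);
            }
            // Rule 3: the new milage must be greater than the last one.
            if (amount <= last.amount) {
                throw new IllegalArgumentException("Amount must exceed " + last.amount);
            }
        }
        Milage milage = new Milage(month, amount);
        milages.add(milage);
        return milage;
    }

    public static void main(String[] args) {
        BikeRulesSketch bike = new BikeRulesSketch();
        bike.addMilage(YearMonth.of(2023, 1), 100);
        bike.addMilage(YearMonth.of(2023, 2), 150);
        System.out.println("Two milages recorded");
    }
}
```

Enforcing the month and amount rules inside the entity means no caller can
bypass them, regardless of which interface the request arrives on.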
Exception/Error Handling
Errors due to inconsistent data (with regard to the data model’s constraints) as well
as validation failures are mapped to HTTP errors. Those errors are handled by the
frontend’s controller code. Technical errors (hardware, database etc.) are not handled
and may lead to application failure or lost data.
⁶⁷https://beanvalidation.org/1.0/spec/
Logging, Tracing
Spring Boot configures logging per default to standard out. The default configuration
isn’t changed in that regard, so all framework logging (especially Spring and
Hibernate) goes to standard out in standard format and can be grabbed or ignored via
OS specific means.
All business components use the Simple Logging Facade for Java (SLF4J). Logging
itself is configured by means of Spring Boot. No special implementation is included
manually; instead biking2 depends transitively on spring-boot-starter-logging.
The names of the loggers correspond to the package names of the classes which
instantiate loggers, so the modules are immediately recognizable in the logs.
Configurability
Spring Boot offers a plethora of configuration options. The main options for
configuring Spring Boot and the available starters are listed under Common
application properties⁶⁸.
The default configuration is available in src/main/resources/application.properties.
During development those properties are merged with src/main/resources/application-dev.pr
Additional properties can be added through the system environment or through an
application-*.properties in the current JVM directory.
⁶⁸https://docs.spring.io/spring-boot/docs/current/reference/html/common-application-properties.html
Internationalization
The only supported language is English. There is no hook for doing internationalization
in the frontend and there are no plans for creating one.
Migration
biking2 replaced a Ruby application based on the Sinatra framework. Data was
stored in a SQLite database which has been migrated by hand to the H2 database.
Testability
The project contains JUnit tests in the standard location of a Maven project. At the
time of writing those tests cover >95% of the code written. Tests must be executed
during build and should not be skipped.
Build-Management
The application can be built with Maven without external dependencies outside
Maven. gpsbabel must be on the path to run all tests, though.
Problem
Constraints
• Conversion should handle TCX files with single tracks, laps and additional
points without problem
• Focus for this project has been on developing a modern application backend for
an AngularJS SPA, not parsing GPX data
Assumptions
• Using an external, non Java based tool makes it harder for people who just want
to try out this application
• Although well documented, both file types can contain various kinds of
information (routes, tracks, waypoints), which makes them hard to parse
Considered Alternatives
Decision
biking2 uses GPSBabel for the heavy lifting of GPS related data. The project contains
a README stating that GPSBabel must be installed. GPSBabel can be installed on
Windows with an installer and on most Linux systems through the official package
manager. Under OS X it is available via MacPorts or Homebrew.
Problem
biking2 needs to store “large” objects: Image data (biking and gallery pictures) as
well as track data.
Considered Alternatives
Decision
I opted for the local file system because I didn’t want to put much effort into
evaluating cloud services. If biking2 should be runnable in a cloud-based setup, one
has to create an abstraction over the local filesystem currently used.
Quality tree
Testability / Coverage
Using JaCoCo during development and in the build process⁷⁰ ensures a code coverage
of at least 95%.
⁷⁰https://info.michael-simons.eu/2014/05/22/jacoco-maven-and-netbeans-8-integration/
IV.12 Glossary
Term Description
AngularJS AngularJS⁷¹ is an open-source web application framework
mainly maintained by Google and by a community of individual
developers and corporations to address many of the challenges
encountered in developing single-page applications.
Daily Fratze (DF) An online community where users can upload a daily picture of
themselves (a selfie, but the site did them before they were
called selfies).
Fat Jar A way of packaging Java applications into one single Jar file
containing all dependencies, either repackaged or inside their
original jars together with a special class loader.
Gallery picture Pictures from tours provided manually by the user in addition
to the pictures collected automatically from Daily Fratze.
⁷¹https://en.wikipedia.org/wiki/AngularJS
⁷²https://en.wikipedia.org/wiki/Apache_ActiveMQ
⁷³https://www.apache.org/licenses/LICENSE-2.0
⁷⁴https://getbootstrap.com
⁷⁵https://checkstyle.sourceforge.net
Garmin Garmin⁷⁶ develops consumer, aviation, outdoor, fitness, and
marine technologies for the Global Positioning System.
GPS Exchange Format GPX, or GPS Exchange Format, is an XML schema designed as a
common GPS data format for software applications. It can be
used to describe waypoints, tracks, and routes.
Java 8 or JDK 8 The eighth installment of the Java programming language⁷⁹ and
the first one to support functional paradigms in the form of
lambda expressions.
NetBeans NetBeans⁸² is a free and open Source IDE that fits the pieces of
modern development together.
⁷⁶https://en.wikipedia.org/wiki/Garmin
⁷⁷https://www.gpsbabel.org
⁷⁸https://eclemma.org/jacoco/
⁷⁹https://en.wikipedia.org/wiki/Java_(programming_language)
⁸⁰https://junit.org/junit4/
⁸¹https://en.wikipedia.org/wiki/MQTT
⁸²https://netbeans.org
⁸³https://en.wikipedia.org/wiki/OAuth
oEmbed The oEmbed⁸⁴ protocol is a simple and lightweight format for
allowing an embedded representation of a URL on third party
sites.
SLF4J The Simple Logging Facade for Java (SLF4J)⁸⁶ serves as a simple
facade or abstraction for various logging frameworks (e.g.
java.util.logging, logback, log4j) allowing the end user to plug in
the desired logging framework at deployment time.
Spring Data JPA Spring Data JPA⁸⁸, part of the larger Spring Data family, makes it
easy to implement JPA based repositories.
⁸⁴https://oembed.com
⁸⁵https://en.wikipedia.org/wiki/RSS
⁸⁶https://www.slf4j.org
⁸⁷https://projects.spring.io/spring-boot/
⁸⁸https://projects.spring.io/spring-data-jpa/
⁸⁹https://stomp.github.io
V - DokChess
By Stefan Zörner.
This chapter describes the architecture of the chess program DokChess. I originally
created it as an example for presentations and training on software architecture
and design. Later on, I implemented it in Java and refined it for a German book
on documenting software architectures (see www.swadok.de⁹⁰). The source code is
available on GitHub and more information, including this architectural overview in
both English and German, can be found on www.dokchess.de⁹¹.
⁹⁰https://www.swadok.de
⁹¹https://www.dokchess.de
What is DokChess?
Essential Features
The quality scenarios in section V.10 detail these goals and serve to evaluate their
achievement.
1.3 Stakeholders
The following table illustrates the stakeholders of DokChess and their respective
intentions.
Who? Matters and concern
Software Architects Software architects get an impression of what architecture
documentation for a specific system may look like. They
reproduce things (e.g. format, notation) in their daily work.
They gain confidence for their own documentation tasks.
Usually they have no deep knowledge about chess.
Stefan Zörner Stefan needs attractive examples for his book. He uses
DokChess as a case study in workshops and presentations
on software design and architecture.
V.2 Constraints
At the beginning of the project various constraints had to be respected within the
design of DokChess. They still affect the solution. This section presents these
restrictions and explains – where necessary – their motivations.
Test tools and test processes JUnit 4 with annotation style both for
correctness and integration testing and for
compliance with efficiency targets.
2.3 Conventions
Coding guidelines for Java Java coding conventions of Sun / Oracle, checked using
CheckStyle
⁹²https://github.com/DokChess/
Actor Description
Human opponent (user) Chess is played between two opponents, who
move their pieces in turn. DokChess takes the
role of one of the opponents, and competes
against a human opponent. For this purpose, the
two need to communicate, e.g. about their moves,
or draw offers.
Actor Description
Openings (external system) About the opening, which is the early stage of a
game, extensive knowledge exists in chess
literature. This knowledge is partly free and
partly also commercially available in the form of
libraries and databases. Within DokChess no
such library is created. Optionally an external
system is connected instead in order to permit a
knowledge based play in the early stages, as
expected by human players.
Endgames (external system) If just a very few pieces are left on the board (e.g.
only the two kings and a queen), endgame
libraries can be used analogously to opening
libraries. For any position with this piece
constellation these libraries include the statement
whether a position is won, drawn or lost, and if
possible the necessary winning move. Within
DokChess no such library is created. Optionally
an external system is connected instead in order
to bring clearly won games home safely, or to use
the knowledge from the libraries for analysis and
position evaluation.
Actor Description
XBoard client (external system) A human player is connected to DokChess
with a graphical frontend. The development
of such a frontend is not part of DokChess.
Instead, any graphical frontend can be used
if it supports the so-called XBoard protocol.
These include XBoard (or WinBoard on Windows),
Arena and Aquarium.
Polyglot Opening Book (external system) Polyglot Opening Book is a binary file format
for opening libraries. DokChess allows the
optional connection of such books. Only read
access is used.
On endgames
⁹³https://en.wikipedia.org/wiki/Endgame_tablebase
Small letters in brackets, e.g. (x), link individual approaches from the right hand of
the table to the following architectural overview diagram.
The remaining section V.4 introduces significant architectural aspects and refers to
further information in chapter V.
This decomposition allows you to replace things such as the communication pro-
tocol or the opening book format if necessary. All parts are abstracted through
interfaces. Their implementations are assembled via dependency injection (→ V.5
“Building Block View”, → Concept V.8.1 “Dependencies Between Modules”). The
decomposition further allows the software, especially the chess algorithms, to be
tested automatically. (→ Concept V.8.7 “Testability”).
The interaction between algorithms takes place using the exchange of data structures
motivated by the domain implemented as Java classes (piece, move and so on →
Concept V.8.2 “Chess Domain Model”). Here, better understandability is preferred at
the cost of efficiency. Nevertheless, DokChess reached an acceptable playing strength,
as a run through the corresponding scenarios shows (→ V.10 “Quality Scenarios”).
The key element of the data structure design is the game situation. This includes
the placement of the chess pieces and other aspects that belong to the position
(such as which side moves next). Again readability is preferred to efficiency in the
implementation of the class motivated by the domain. An important aspect is that,
like all other domain classes this class is immutable (→ decision V.9.2 “Are position
objects changeable or not?”).
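What immutability means for a position class can be sketched as follows. This
is a deliberately reduced illustration and not the DokChess domain model: the
class name, the map-based board and the square names as strings are assumptions
made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Sketch: an immutable position; performing a move yields a new object. */
final class ImmutablePositionSketch {

    private final Map<String, Character> pieces; // square name -> piece letter
    private final boolean whiteToMove;

    ImmutablePositionSketch(Map<String, Character> pieces, boolean whiteToMove) {
        // Defensive copy, so later changes to the argument cannot leak in.
        this.pieces = Collections.unmodifiableMap(new HashMap<>(pieces));
        this.whiteToMove = whiteToMove;
    }

    /** Returns the resulting position; this instance stays untouched. */
    ImmutablePositionSketch performMove(String from, String to) {
        Map<String, Character> next = new HashMap<>(pieces);
        Character moving = next.remove(from);
        if (moving == null) {
            throw new IllegalArgumentException("No piece on " + from);
        }
        next.put(to, moving);
        return new ImmutablePositionSketch(next, !whiteToMove);
    }

    Character pieceOn(String square) {
        return pieces.get(square);
    }

    boolean isWhiteToMove() {
        return whiteToMove;
    }

    public static void main(String[] args) {
        Map<String, Character> start = new HashMap<>();
        start.put("e2", 'P');
        ImmutablePositionSketch before = new ImmutablePositionSketch(start, true);
        ImmutablePositionSketch after = before.performMove("e2", "e4");
        System.out.println(before.pieceOn("e2") + " " + after.pieceOn("e4"));
    }
}
```

Because no position ever changes after construction, a search can hand the same
object to several threads without locking or copying.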
within the same computation time. The immutable data structures of DokChess
also facilitate implementing concurrent algorithms; a parallel minimax algorithm
is included as an example.
the Java Virtual Machine (JVM) with the class with the main method as a parameter
(→ V.7 “Deployment View”).
Intent/Responsibility
This subsystem implements the communication with a client (for example, a graphi-
cal user interface) using the text-based XBoard protocol (→ Decision V.9.1). It reads
commands from standard input, checks them against the rules of the game and
converts them for the Engine. Responses from the Engine (especially the moves) will
be accepted by the subsystem as events, formatted according to the protocol and
returned via standard output. Thus the subsystem is driving the whole game.
Interfaces
setOutput Set the protocol output. Typically this is the standard output (stdout);
automated tests may use a different target.
Files
Open Issues
• Time control
• Permanent brain (thinking while the opponent thinks)
• Draw-offers and giving up of the opponent
• Chess variants (alternative rules, such as Chess960)
Intent/Responsibility
This subsystem accounts for the rules of chess according to the International Chess
Federation (FIDE). It determines all valid moves for a position and decides whether
Interfaces
Interface ChessRules
getLegalMoves Returns the set of all legal moves for a given position. The current
player is determined from the position parameter. In case of a mate
or stalemate an empty collection is the result. Thus the method
never returns null.
isCheck Checks whether the king of the given colour is attacked by the
opponent.
isCheckmate Checks whether the given position is a mate. I.e. the king of the
current player is under attack, and no legal move changes this. The
player to move has lost the game.
isStalemate Checks whether the given position is a stalemate. I.e. the current
player has no valid move, but the king is not under attack. The
game is considered a draw.
Concept V.8.2 “Chess Domain Model” describes the types used in the interface
as call and return parameters (Move, Position, Colour). Refer to the source code
documentation (javadoc) for more details.
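The relationship between the methods of the interface can be summarized in a
short sketch: mate and stalemate both mean that no legal move exists and differ
only in whether the king is attacked. The class below is a hypothetical helper
for illustration; it uses plain strings in place of the Move type and is not
part of DokChess.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

/** Sketch: deriving the game state from legal moves and the check status. */
final class GameStateSketch {

    enum State { ONGOING, CHECKMATE, STALEMATE }

    static State classify(Collection<String> legalMoves, boolean kingInCheck) {
        if (!legalMoves.isEmpty()) {
            return State.ONGOING;
        }
        // No legal move left: attacked king means mate, otherwise stalemate.
        return kingInCheck ? State.CHECKMATE : State.STALEMATE;
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList("e2e4", "d2d4"), false));
        System.out.println(classify(Collections.<String>emptyList(), true));
        System.out.println(classify(Collections.<String>emptyList(), false));
    }
}
```

Since getLegalMoves never returns null, callers can test for an empty
collection without a null check.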
Files
Open Issues
Apart from the stalemate, the subsystem can not recognize any draw. In particular,
the following rules are not implemented (→ V.11.2 “Risk: Implementation effort too
high”):
• 50 moves rule
• Threefold repetition
Intent/Responsibility
This subsystem contains the determination of a next move starting from a game
position. The position is given from outside. The engine itself is stateful and
plays exactly one game at a time. The default implementation needs an
implementation of the game rules to work. An opening library, however, is optional.
Interfaces
The Engine subsystem provides its functionality via the Java interface
org.dokchess.engine.Engine.
setupPieces Sets the state of the engine to the specified position. If currently a
move calculation is running, this will be cancelled.
determineYourMove Starts the determination of a move for the current game situation.
Returns move candidates asynchronously via an Observable (→
Runtime View V.6.1 “Move Determination Walkthrough”). The
engine does not perform the moves.
performMove Performs the move given, which changes the state of the engine. If
currently a move calculation is running, this will be canceled.
close Closes the engine. The method makes it possible to free resources.
No move calculations are allowed afterwards.
Methods of the DefaultEngine class (in addition to the Engine interface):
Method Short description
Concept V.8.2 (“Chess domain model”) describes the types used in the interface as
call and return parameters (Move, Position). Refer to the source code documentation
(javadoc) for more information. You find details of the Engine subsystem implemen-
tation in the white box view in section V.5.2 of this overview.
Files
The implementation of the Engine subsystem and corresponding unit tests are located
below the packages
org.dokchess.engine...
Intent/Responsibility
This subsystem provides opening libraries and implements the Polyglot opening book
format. This format is currently the only available one that is not proprietary.
Corresponding book files and associated tools are freely available on the Internet.
Interfaces
The Opening subsystem provides its functionality via the Java interface
org.dokchess.opening.OpeningLibrary.
setSelectionMode Sets the mode to select a move, if there is more than one
candidate in the library for the given position.
Concept V.8.2 “Chess Domain Model” describes the types used in the interface as
call and return parameters (Move, Position). Refer to the source code documentation
(javadoc) for more information.
Files
The implementation, unit tests and test data for the Polyglot file format are located
below the packages
org.dokchess.opening...
Open Issues
• The implemented options for a move selection from the Polyglot opening book
in case of several candidates are limited (the first, the most often played, by
chance).
• The implementation can not handle multiple library files at the same time. It
can therefore not mix them to combine the knowledge.
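The three selection options named above can be sketched as follows. The enum
constants, the Candidate class and its timesPlayed field are hypothetical names
chosen for this example; they do not claim to match the identifiers or the
exact weighting used by DokChess or by the Polyglot format.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Sketch: choosing one move when the opening book offers several. */
final class MoveSelectionSketch {

    enum SelectionMode { FIRST, MOST_OFTEN_PLAYED, BY_CHANCE }

    static final class Candidate {
        final String move;
        final int timesPlayed;

        Candidate(String move, int timesPlayed) {
            this.move = move;
            this.timesPlayed = timesPlayed;
        }
    }

    /** Returns the selected move, or null if the book has nothing to offer. */
    static String select(List<Candidate> candidates, SelectionMode mode, Random random) {
        if (candidates.isEmpty()) {
            return null;
        }
        switch (mode) {
            case FIRST:
                return candidates.get(0).move;
            case MOST_OFTEN_PLAYED: {
                Candidate best = candidates.get(0);
                for (Candidate candidate : candidates) {
                    if (candidate.timesPlayed > best.timesPlayed) {
                        best = candidate;
                    }
                }
                return best.move;
            }
            case BY_CHANCE:
                return candidates.get(random.nextInt(candidates.size())).move;
            default:
                throw new IllegalArgumentException("Unknown mode " + mode);
        }
    }

    public static void main(String[] args) {
        List<Candidate> book = Arrays.asList(
                new Candidate("e2e4", 10), new Candidate("d2d4", 25), new Candidate("c2c4", 5));
        System.out.println(select(book, SelectionMode.MOST_OFTEN_PLAYED, new Random()));
    }
}
```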
Intent/Responsibility
The module determines the optimal move for a position under certain conditions. In
the game of chess an optimal move always exists, theoretically. The high number
of possible moves and the resulting incredible mass of game situations to consider
makes it impossible to determine it in practice. Common algorithms like Minimax
therefore explore the “game tree” only up to a certain depth.
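A depth-limited Minimax can be sketched on a toy game tree. The example is
generic textbook Minimax and not the DokChess implementation: the Node class
and its static values are hypothetical and stand in for chess positions and
their evaluation.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: Minimax that explores the game tree only up to a given depth. */
final class MinimaxSketch {

    /** Hypothetical toy node; in a chess engine this would be a position. */
    static final class Node {
        final int staticValue;     // evaluation from the maximizing player's view
        final List<Node> children; // positions reachable with one half move

        Node(int staticValue, Node... children) {
            this.staticValue = staticValue;
            this.children = Arrays.asList(children);
        }
    }

    static int minimax(Node node, int depth, boolean maximizing) {
        // At the depth limit (or at a leaf) fall back to the static evaluation.
        if (depth == 0 || node.children.isEmpty()) {
            return node.staticValue;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node child : node.children) {
            int value = minimax(child, depth - 1, !maximizing);
            best = maximizing ? Math.max(best, value) : Math.min(best, value);
        }
        return best;
    }

    public static void main(String[] args) {
        Node a = new Node(3, new Node(5), new Node(-2));
        Node b = new Node(1, new Node(4), new Node(2));
        Node root = new Node(0, a, b);
        System.out.println(minimax(root, 1, true)); // looks one half move ahead
        System.out.println(minimax(root, 2, true)); // also considers the reply
    }
}
```

In this tree a search depth of one half move prefers the first branch (value 3),
while a depth of two reveals the opponent's best reply and yields 2 via the
second branch. A deeper search can therefore change the preferred move.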
Interfaces
it finds a better move the caller receives a message onNext via the observer pattern.
The search indicates the completion of its work with the message onComplete.
close Closes the search completely. No moves may be determined after calling
this method.
setDepth Set the maximum search depth in half moves. That means at 4 each
player moves twice.
Files
Intent/Responsibility
Interfaces
The Evaluation module provides its functionality via the Java interface
org.dokchess.engine.eval.Evaluation.
Files
Open Issues
In the pure material evaluation it does not matter where the pieces stand. A pawn in
starting position is worth as much as one short before promotion. And a knight on the
edge corresponds to a knight in the center. There is plenty of room for improvement,
which has not been exploited because DokChess should invite others to experiment.
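A pure material evaluation of this kind fits into a few lines. The sketch below
is an assumption-laden illustration, not the DokChess Evaluation module: it
uses the conventional piece values (pawn 1, knight and bishop 3, rook 5,
queen 9) and reads the piece placement field of a FEN string, where uppercase
letters denote white pieces and lowercase letters black pieces.

```java
/** Sketch: material-only evaluation, ignoring where the pieces stand. */
final class MaterialEvaluationSketch {

    /** Conventional piece values in pawn units (an assumption for this sketch). */
    static int valueOf(char piece) {
        switch (Character.toLowerCase(piece)) {
            case 'p': return 1;
            case 'n': return 3;
            case 'b': return 3;
            case 'r': return 5;
            case 'q': return 9;
            default:  return 0; // kings, digits and separators do not count
        }
    }

    /** Sums up white material minus black material, seen from the given side. */
    static int evaluate(String fenPiecePlacement, boolean fromWhitesView) {
        int balance = 0;
        for (char c : fenPiecePlacement.toCharArray()) {
            int value = valueOf(c);
            balance += Character.isUpperCase(c) ? value : -value;
        }
        return fromWhitesView ? balance : -balance;
    }

    public static void main(String[] args) {
        // Black is missing its queen, so white is ahead by nine pawn units.
        System.out.println(evaluate("rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR", true));
    }
}
```

A white pawn on its starting square and one on the seventh rank both contribute
exactly one unit, which is the weakness described above.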
First, the Text UI subsystem validates the input with the aid of the Rules subsystem
(→ Concept V.8.4 “Plausibility Checks and Validation”). The move in the example
is recognized as legal and performed on the (stateful) Engine (the performMove
message) afterward. Then, the Text UI subsystem asks the engine to determine its
move. Since move computation can take a long time, but DokChess should still
continue to react to inputs, this call is asynchronous. The engine comes back with
possible moves.
The Engine examines at first whether the opening book has something to offer. In
the example, this is not the case. The engine has to calculate the move on its own.
It then accesses the Rules and determines all valid moves as candidates. Afterward,
it investigates and rates them, and gradually reports better moves (better from the
perspective of the engine) back to the caller (the Text UI subsystem). Here, the
observer pattern is used (implementation with reactive extensions⁹⁷).
The example diagram shows that two moves have been found (pawn e7-e5, knight
b8-c6) and finally the message, that the search is complete, so the engine does not
provide better moves. The Text UI subsystem takes the last move of the Engine and
prints it as a string to standard output according to the XBoard protocol: “move b8c6”.
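The interaction between the engine and its caller can be sketched as follows. This is an illustration only: DokChess uses reactive extensions and an asynchronous call, whereas the hypothetical types below (MoveObserver, ScriptedSearch, LastMoveCollector) are plain Java, run synchronously, and replace the real search with a fixed list of moves.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback interface mirroring the onNext and onComplete messages. */
interface MoveObserver {
    void onNext(String move);
    void onComplete();
}

/** Stand-in for the engine: reports a fixed sequence of successively better moves. */
final class ScriptedSearch {
    private final List<String> improvingMoves;

    ScriptedSearch(List<String> improvingMoves) {
        this.improvingMoves = new ArrayList<>(improvingMoves);
    }

    void determineMove(MoveObserver observer) {
        for (String move : improvingMoves) {
            observer.onNext(move);   // each reported move is better than the previous one
        }
        observer.onComplete();       // no better move will follow
    }
}

/** Caller side: remembers the latest move and formats it once the search is complete. */
final class LastMoveCollector implements MoveObserver {
    private String best;
    private String output;

    @Override
    public void onNext(String move) {
        best = move;
    }

    @Override
    public void onComplete() {
        if (best != null) {
            output = "move " + best; // XBoard-style answer, as in the scenario above
        }
    }

    String output() {
        return output;
    }
}
```

Feeding the two moves of the scenario (e7e5, then b8c6) into ScriptedSearch leaves the collector with the answer for the last reported move.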
⁹⁷https://reactivex.io/intro.html
• Arena⁹⁸
DokChess.jar contains the compiled Java source code of all the modules and all
the necessary dependencies (“Uber-jar”). The script file dokchess.bat starts the Java
Virtual Machine with DokChess. Both are located in a common directory on the
computer, because dokchess.bat addresses the jar file with a relative path.
Within Arena, the script file is declared via the menu “Engine | Install a new
Engine …”. A file selection dialog appears, in which the file type can be limited to
*.bat files. Then, set the engine type to “Winboard”. In other chess frontends,
declaring an engine is very similar. See the corresponding documentation for details.
⁹⁸http://www.playwitharena.de
The different modules of DokChess exchange chess-specific data. This includes the
game situation on the chessboard (position) for instance, as well as opponent’s and
own moves. All interfaces use the same domain objects as call and return parameters.
This section contains a brief overview of these data structures and their relationships.
All the classes and enumeration types (enums) are located in the org.dokchess.domain
package. See the source documentation (javadoc) for details.
A chess piece is characterized by colour (black or white) and type (king, queen, and
so on). In the DokChess domain model, a piece does not know its location on the
board. The Piece class is immutable, and so are all other domain classes.
The Position class describes the current situation on the board. In particular, these
are the piece locations on the board, which are internally represented as a two-
dimensional array (8 x 8). If a square is not occupied, null is stored in the array.
To complete the game situation, the Position class includes information about which
side moves next, which castlings are still possible (if any), and whether capturing en
passant is allowed.
The Position class is immutable as well. Therefore, the performMove() method returns
a new position with the modified game situation (→ decision V.9.2 “Are position
objects changeable or not?”).
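The idea of an immutable position can be sketched in a few lines. The class BoardSnapshot below is hypothetical and much simpler than the Position class: it only stores piece letters in an 8 x 8 array (null for an empty square) and ignores the side to move, castling rights and en passant.

```java
final class BoardSnapshot {
    private final Character[][] squares; // [rank 0..7][file 0..7]; null marks an empty square

    private BoardSnapshot(Character[][] squares) {
        this.squares = squares;
    }

    static BoardSnapshot empty() {
        return new BoardSnapshot(new Character[8][8]);
    }

    /** Returns a new snapshot with the piece placed on the given square; this one stays untouched. */
    BoardSnapshot withPiece(int rank, int file, char piece) {
        Character[][] copy = copySquares();
        copy[rank][file] = piece;
        return new BoardSnapshot(copy);
    }

    /** Returns a new snapshot in which the piece has moved from the source to the target square. */
    BoardSnapshot performMove(int fromRank, int fromFile, int toRank, int toFile) {
        Character[][] copy = copySquares();
        copy[toRank][toFile] = copy[fromRank][fromFile];
        copy[fromRank][fromFile] = null;
        return new BoardSnapshot(copy);
    }

    Character pieceAt(int rank, int file) {
        return squares[rank][file];
    }

    /** Copies every rank so that the new snapshot shares no mutable state with the old one. */
    private Character[][] copySquares() {
        Character[][] copy = new Character[8][];
        for (int r = 0; r < 8; r++) {
            copy[r] = squares[r].clone();
        }
        return copy;
    }
}
```

Because performMove never modifies the receiver, a search algorithm can keep the old snapshot and simply drop the new one when it discards a move. The price is one array copy per move.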
below). It accepts the moves of the opponent in a comfortable interface, passes them
to DokChess in the form of XBoard commands like in the table above (column
“Client → DokChess”) and translates the answers (column “DokChess → Client”)
graphically.
• the XBoard protocol for interactive user input from the opponent
• opening libraries in the form of files
Input that comes through the XBoard protocol is parsed by the corresponding
subsystem. In case of unknown or unimplemented commands DokChess reports the
XBoard command “Error” back to the client.
In case of move commands DokChess checks using the rules subsystem whether
the move is allowed or not. In case of illegal moves DokChess reports the XBoard
command “Illegal move” back to the client. When using a graphical front end, this
case should not occur, since these typically accept only valid moves. The case is likely
relevant when interacting via command line (→ Concept V.8.3 “User Interface”).
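The two kinds of checks described above can be sketched as follows. The class XBoardInputCheck is hypothetical and not taken from DokChess. The set of known commands is an illustrative subset, the rules subsystem is replaced by a simple predicate, and the reply texts follow the wording of the XBoard protocol documentation (“Illegal move: …” and “Error (unknown command): …”).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class XBoardInputCheck {
    // Coordinate notation such as e2e4, or e7e8q for a promotion
    private static final Pattern MOVE = Pattern.compile("[a-h][1-8][a-h][1-8][qrbn]?");

    // Illustrative subset only; the commands DokChess actually implements are not listed here
    private static final Set<String> KNOWN =
            new HashSet<>(Arrays.asList("xboard", "new", "quit", "go", "force"));

    /**
     * Returns the reply to send to the client, or an empty string if the input is accepted.
     * The predicate stands in for the legality check of the rules subsystem.
     */
    static String reply(String line, Predicate<String> isLegalMove) {
        String input = line.trim();
        if (MOVE.matcher(input).matches()) {
            return isLegalMove.test(input) ? "" : "Illegal move: " + input;
        }
        String command = input.split("\\s+")[0];
        return KNOWN.contains(command) ? "" : "Error (unknown command): " + input;
    }
}
```

With a predicate that only accepts e2e4, the input e2e5 is answered with “Illegal move: e2e5”, while an unknown command is answered with an “Error” line.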
Inputs that come through the XBoard protocol are parsed by the corresponding
subsystem. For unknown or unimplemented commands DokChess reports the
XBoard command “Error” back to the client. When setting up a position DokChess
checks the compliance with the protocol, but not whether the position is permitted.
In extreme cases (e.g. if no kings are present on the board), the engine subsystem
may raise an error during the game.
For opening libraries DokChess only checks whether it can open and read the file.
In case of a failure (e.g. file not found) it raises an exception (→ Concept V.8.5
“Exception and Error Handling”). While reading the file the opening subsystem
responds with a runtime error to recognized problems (for example, invalid file
format). However, the content of the library itself is not checked. For example, if
invalid moves are stored for a position it is not recognized. The user is responsible
for the quality of the library (see → V.3.1 “Business Context”). In extreme cases, the
engine may respond with an illegal move.
visualizes them in an error dialog or alert box. The image below depicts that for the
chess frontend Arena.
If the engine blocks and it is unclear what happened on the XBoard protocol, such
tools are simply invaluable. Because this feature is available, communication
protocol tracing was not implemented within DokChess at all.
8.7 Testability
Nothing is more embarrassing for an engine than an illegal move.
The functionality of the individual modules of DokChess is ensured by extensive
unit tests. You find a folder src/test next to src/main, where the Java source code
of the modules is stored. It mirrors the package structure, and the corresponding
packages contain the unit tests for the classes, realized with JUnit 4¹⁰³.
Standard unit tests, which examine individual classes, are named after the class
itself with the suffix Test. In addition, there are tests that examine the interaction of
modules, and in extreme cases the whole system. With the help of such tests, the
correct playing of DokChess is checked. More complex, long-running integration
tests are below src/integTest. This includes playing entire games.
In many tests positions must be provided as call parameters. Here the Forsyth-
Edwards Notation (FEN in short) is used. This notation allows the specification of
a complete game situation as a compact string without a line break and is therefore
perfect for use in automated tests.
The starting position in FEN for example is denoted:
¹⁰³https://junit.org/junit4/
Lowercase letters stand for black, uppercase for white pieces. For the piece types the
English names (r for rook, p for pawn …) are used.
Sample position
The game situation in the picture above, with White to move before the 79th move
and 30 half-moves played since the last capture or pawn move, looks like this in
FEN
"6r1/6pp/7r/1B5K/1P3k2/N7/3R4/8 w - - 30 79"
and reads “6 squares free, black rook, square free, new rank …”.
Details about the notation can be found for example at Wikipedia¹⁰⁴. The Position
class has a constructor that accepts a string in FEN. The toString method of this class
also provides FEN.
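How the first field of a FEN string maps to the board can be shown with a small sketch. The class FenPlacement is hypothetical and independent of the Position class: it only handles the piece-placement field and ignores the side to move, castling rights, en passant and the move counters.

```java
final class FenPlacement {

    /** Expands the first FEN field into an 8 x 8 array; index 0 is rank 8, null marks an empty square. */
    static Character[][] parse(String fen) {
        String[] ranks = fen.split(" ")[0].split("/");
        if (ranks.length != 8) {
            throw new IllegalArgumentException("expected 8 ranks: " + fen);
        }
        Character[][] board = new Character[8][8];
        for (int r = 0; r < 8; r++) {
            int file = 0;
            for (char c : ranks[r].toCharArray()) {
                if (Character.isDigit(c)) {
                    file += c - '0';              // a digit skips that many empty squares
                } else if (file < 8) {
                    board[r][file++] = c;         // a letter places a piece
                } else {
                    throw new IllegalArgumentException("rank too long: " + ranks[r]);
                }
            }
            if (file != 8) {
                throw new IllegalArgumentException("rank does not cover 8 squares: " + ranks[r]);
            }
        }
        return board;
    }

    /** Compresses the array back into the FEN piece-placement field. */
    static String format(Character[][] board) {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < 8; r++) {
            int empty = 0;
            for (int f = 0; f < 8; f++) {
                if (board[r][f] == null) {
                    empty++;
                } else {
                    if (empty > 0) {
                        sb.append(empty);
                        empty = 0;
                    }
                    sb.append(board[r][f]);
                }
            }
            if (empty > 0) {
                sb.append(empty);
            }
            if (r < 7) {
                sb.append('/');
            }
        }
        return sb.toString();
    }
}
```

For the standard starting position, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1", parse puts a black rook in the first array cell and the white king on the fifth square of the last rank. Applying format to the parsed sample position reproduces its placement field.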
¹⁰⁴https://en.wikipedia.org/wiki/Forsyth–Edwards_Notation
Problem Background
As a central requirement DokChess must work together with existing chess frontends.
How do we connect them?
A whole series of graphical user interfaces are available for playing against chess
programs. Moreover, there are software solutions with a larger scope for chess enthu-
siasts. In addition to the game “Human vs. Machine”, they offer more functionality,
such as analyzing games. Over time, new chess programs will be released – and
others will possibly disappear from the market.
Depending on how the connection with such programs is realized, DokChess can or
cannot communicate with specific frontend clients. Thus, the issue affects DokChess’
interoperability with existing chess software and its adaptability to future chess
software.
Constraints
Affected Risks
Assumptions
Considered Alternatives
Neither of the two protocols is formally specified, but both are publicly
documented.
Both protocols are text-based, and communication between the frontend and the
engine is via stdin/stdout. In both cases the front end starts the engine in a separate
process.
The following table compares the three investigated frontends with the protocols
they implement:
Decision
Under the given constraints, the quality goals can generally be achieved with both
options. Depending on which protocol is implemented, different front ends are
supported.
The decision in favor of the XBoard protocol was made in early 2011. The structure
of DokChess allows us to add alternative communication protocols (UCI or other)
without having to change the engine itself for this, see dependencies in the building
block view (→ V.5.1 “Building Block View, Level 1”).
The preferred front end for Windows is Arena. It is freely available and tops the
functionality of WinBoard. It has good debugging facilities. For example, it can
present the communication between the frontend and the engine live in a window
(→ screenshot in V.8). Arena supports both protocols.
By opting for the XBoard protocol, other operating systems (in addition to Windows,
especially Mac OS X and Linux) are also supported with freely available frontends
(see the preceding table). As such, a larger circle of interested developers may use
the engine, which was ultimately the deciding factor.
Problem Background
For various DokChess modules, game situations on the chess board (so-called posi-
tions) must be provided and exchanged between them. Do we make the associated
data structure changeable or unchangeable (immutable)?
During a game, the position changes when the opposing players move their pieces.
In addition, the engine performs possible moves in its analysis, tries to counter the
moves of the opponent, evaluates the resulting positions, and then discards moves.
The result is a tree that contains many thousands of different positions, depending
on the search depth.
Depending on whether the position is immutable as a data structure or not,
algorithms are easier or more difficult to implement, and their execution is more
or less efficient.
All modules depend on the position interface. A subsequent change would affect all
of DokChess.
Constraints
Affected Risks
Assumptions
Considered Alternatives
The starting point is domain-driven classes for square, piece and move (→ Concept
V.8.2 “Chess Domain Model”). These classes are realized as immutable value objects
(the e4 field always remains e4 after its construction).
For the position, two alternatives are considered:
The following table summarizes the strengths and weaknesses of the two options;
they are explained below.
and CPU time, and it treats the garbage collector with care. For analysis algorithms,
however, it is necessary to implement functionality that takes back the effect of
moves (“undo”). This Undo also takes time, hence the neutral rating (o) for the time
behavior.
(-) Negative Arguments
The implementation of an Undo-functionality is complex. Not only does it have to
set up captured pieces back on the board, the castling rule and en passant require
special treatment as well. The Command pattern [Gamma+94] suggests itself as an
option. The use of that pattern within algorithms is also more complex because the
latter must explicitly call methods to take back the moves.
Finally, changeable state has drawbacks related to concurrency.
Decision
The decision for the unchangeable position (option 2) was made in early 2011 due
to the advantages in terms of ease of implementation and the prospect of exploiting
concurrency. All the disadvantages of option 2 are related to efficiency.
Due to the risk of failing to achieve the objectives with respect to the playing
strength in an acceptable computing time (attractiveness, efficiency), prototypes of
both versions have been implemented and compared in a mate search (Checkmate in
three moves) with the minimax algorithm. With option 2, the search took 30% longer,
assuming you implement copying efficiently. But this option still met the
constraints well.
Further optimization options could reduce the efficiency disadvantage compared
with option 1, if necessary. But they have not been implemented in the first run
in order to keep the implementation simple.
These options included the use of multiple processors/cores with concurrent algo-
rithms. In the meantime (with DokChess 2.0), this was illustrated with a parallel
minimax example.
No. Scenario
W01 A person with basic knowledge of UML and chess looks for an introduction to the
DokChess architecture. He or she gets the idea of the solution strategy and the
essential design within 15 minutes.
W02 An architect who wishes to apply arc42 searches for a concrete example for an
arbitrary section of the template. He or she finds the relevant content immediately
in the documentation.
W04 A developer implements a new position evaluation. He or she integrates it into the
existing strategies without modification and without compilation of existing
source code.
K01 A user plans to use DokChess with a frontend that supports a communication
protocol that’s already implemented by the solution. The integration does not
require any programming effort, the configuration with the front end is carried
out and tested within 10 minutes.
F01 In a game situation, the engine has one or more rule-compliant moves to choose
from. It plays one of them.
F02 A weak player is in a game against the engine. The player moves a piece to an
unguarded position which is being attacked by the engine. The engine
consequently takes the dropped piece.
F03 Within a chess match, a knight fork to gain the queen or a rook arises for the
engine. The engine gains the queen (or the rook) in exchange for the knight.
F04 Within a chess match, a mate in two arises for the engine. The engine moves
safely to win the game.
E01 During the game, the engine responds to the move of the opponent with its own
move within 5 seconds.
E02 An engine integrated in a graphical frontend plays as black, and the human player
begins. The engine responds within 10 seconds with its first move, and the user
gets a message that the engine “thinks” within 5 seconds.
Z01 During the game, the engine receives an invalid move. The engine rejects this
move and allows the input of another move thereafter and plays on in an
error-free way.
Z02 The engine receives an illegal position to start the game. The behavior of the
engine is undefined, cancelling of the game is allowed, as well as invalid moves.
P01 A Java programmer plans to use a chess frontend that allows the integration of
engines, but does not support any of the protocols implemented by DokChess. The
new protocol can be implemented without changing the existing code, and the
engine can then be integrated as usual.
Contingency Planning
A simple textual user interface could be implemented in order to interact with the
engine. The implementation of a DokChess specific graphical front end would be
costly (→ Risk V.11.2 “Effort of Implementation”).
Risk Mitigation
as stalemate and promotion. In the case of castling and en passant, the move history,
and not only the current situation on the board, is relevant.
The programming of the algorithms is also non-trivial. For the connection of opening
libraries and endgame databases, extensive research is required.
The implementation of DokChess is a hobby project, carried out in spare time.
It is unclear whether this is sufficient to present ambitious results within schedule
(→ constraints V.2.2).
Contingency Planning
If there is no runnable version for the conference sessions in March and September
2011, a live demonstration could be omitted. The free evening talk at oose in March
could even be canceled completely (which could negatively affect the reputation).
Risk Mitigation
In order to reduce effort, the following rules are not implemented at first:
• 50 moves rule
• threefold repetition
Their absence has little consequence with respect to the playing strength, and
no consequence with respect to the correctness of the engine.
Connecting opening libraries and endgame databases has low priority and takes
a back seat at first.
Contingency Planning
Risk Mitigation
V.12 Glossary
“The game of chess is played between two opponents who move their pieces
on a square board called a ‘chessboard’.”
From the FIDE Laws of Chess
The following glossary contains English chess terms and terms from the world of
computer chess. Some of them go beyond the vocabulary of infrequent or casual
chess players.
See FIDE Laws of Chess¹⁰⁵ or the Wikipedia glossary of chess¹⁰⁶ for more information.
¹⁰⁵https://www.fide.com/component/handbook/?id=208&view=article
¹⁰⁶https://en.wikipedia.org/wiki/Glossary_of_chess
Chessboard Geometry
“The chessboard is composed of an 8 x 8 grid of 64 equal squares alternately
light (the ‘white’ squares) and dark (the ‘black’ squares).”
From the FIDE Laws of Chess
Chessboard
Chess Terms
Term Definition
50 move rule A rule in chess, which states that a player can claim a draw
after 50 moves whilst in the meantime no pawn has been
moved and no piece has been taken.
Castling A special move in chess where both the player’s king and one of
the rooks are moved. For castling, different conditions must be
met.
Chess 960 A chess variant developed by Bobby Fischer. The initial position
is drawn from 960 possibilities. Also known as Fischer Random
Chess.
Draw A chess game which ends with no winner. There are various
ways to end it with a draw, one of them is stalemate.
Half-move Single action (move) of an individual player, unlike the
sequence of a white and a black move which is counted as a
move e.g. when numbering.
FEN Forsyth-Edwards Notation. Compact representation of a chess
board position as a character string. Supported by many chess
tools. Used in DokChess by unit and integration tests.
Explanation see e.g. Wikipedia
(https://en.wikipedia.org/wiki/Forsyth–Edwards_Notation)
Fork A tactic in chess in which a piece attacks two (or more) of the
opponent’s pieces simultaneously.
Mate End of a chess game in which the king of the player to move is
attacked and the player has no valid move left (i.e. cannot
escape the attack). The player has lost. Also known as
‘checkmate’.
Minimax algorithm Algorithm to determine the best move under the consideration
of all options of both players.
Opening First stage of a chess game. Knowledge and best practices of the
first moves in chess fill many books and large databases.
Polyglot Opening Book Binary file format for opening libraries. Unlike many other
formats for this purpose the documentation of it is freely
accessible.
Promotion A rule in chess which states that a pawn that reaches the
opponent’s base line is immediately converted into a queen,
rook, bishop or knight.
Stalemate End of a chess game in which the player to move does not have
a valid move, but his or her king is not under attack. The game
is considered a draw.
Threefold repetition A rule in chess, which states that a player can claim a draw if
the same position occurs at least three times.
docToolchain¹⁰⁷ started out as the arc42-generator: a Gradle build which converts the
arc42-template from AsciiDoc to various other formats like Word, HTML, ePub, and
Confluence.
While working with AsciiDoc, I came up with several other little scripted tasks, which
helped me in my workflow. These were added to the original arc42-generator and
turned it into what I now call docToolchain.
The goal of docToolchain is to support you as a technical writer with some repeating
conversion tasks and to enable you to implement the - sometimes quite complex -
Docs-as-Code¹⁰⁸ approach more easily.
I derived the name docToolchain from the fact that it implements a documentation
toolchain. Several tools like Asciidoctor, Pandoc, MS Visio (only to name a few) are
glued together via Gradle to form a chain or process for all your documentation
needs.
The following chapters describe the architecture of docToolchain. The whole project,
together with its remaining documentation, is hosted on GitHub:
https://doctoolchain.github.io/docToolchain/
¹⁰⁷https://doctoolchain.github.io/docToolchain/
¹⁰⁸https://docs-as-code
VI - docToolchain 211
Soon after fulfilling these needs, I noticed that it is quite hard to update changing
UML diagrams. “Couldn’t this be automated?” I thought to myself and created the
exportEA-task as a simple Visual Basic script.
I didn’t plan docToolchain on the drawing board. It evolved in an agile way - through
the requirements of the users:
Together with Gernot Starke, I gave a talk about this approach, and I got feedback
from the audience like “… but I don’t use Enterprise Architect - can I also use MS
Visio?”.
My answer was something like
“no, not at the moment. But docToolchain is not only a tool. I want it to be an idea
- a change in how you think. If EA can be easily automated, I am sure Visio can be
automated too!”
What I didn’t expect was that, a few days after this statement, we got a pull request
with the exportVisio-Task!
This way, docToolchain evolved through the requirements and imagination of its
users.
When a software development team creates a software system, a build system is used
to create a compiled artifact which is then deployed to the runtime system (see figure
1).
You can apply the same approach to your documentation. This is called the
Docs-as-Code approach, i.e., you treat your documentation the same way as you treat
your code. To do so, you need several helper scripts to automate repetitive tasks for
building your documentation, and it is a waste of time to set up these tasks from
scratch for each and every project.
docToolchain - the system described in this documentation - solves this problem by
providing a set of build scripts which you can easily install. They provide you with
a set of tasks to solve your standard Docs-as-Code needs.
The functional requirements are mainly split up into two different kinds of function-
ality:
The system should be easy to integrate into a standard software build to treat the
docs the same way as you treat your code.
The project should run in different enterprise environments. It has to run with
different version control systems and without direct access to the internet (only via
proxies, e.g., for dependency resolution)
Documentation is not only crucial for open source projects but also essential for
enterprise projects. (Note: I also listed this as constraint C3)
RQ4 - OS independent
The toolchain should be available for the two major build systems, Maven and
Gradle.
This ensures that most JVM based projects can use it.
¹⁰⁹dropped for now (only Gradle is implemented) - see Design Decision 3. Workaround through a mixed build is
available.
The project should be set up in such a way that it runs right after you’ve cloned the
repository.
This is to avoid the need for a special setup to be able to use it or change project code.
Confidentiality
This quality goal is not specific to docToolchain, but it is very important with regard to security.
Because the system is not only targeted for open source environments but also for
enterprise ones, it has to be ensured that no data is sent to 3rd-party systems without
the user knowing it. For example, there are server-based text analysis tools and
libraries which send the text to be analyzed to a 3rd party server. Tools and libraries
like these must not be used.
Ease of Use
The docToolchain project is still a young open-source project and hence needs a
stable and growing user base. The easiest way to achieve this is to make docToolchain
extremely easy to use.
Maintainability
There is no big company behind docToolchain - only users who evolve it in their
spare time. So, a developer should be able to make most changes needed to evolve
docToolchain in a time-box of one to three hours.
1.3 Stakeholders
OSS Contributors:¹¹⁰ need to be aware of the architecture, to follow the decisions,
and to evolve the system in the right direction.
¹¹⁰https://github.com/docToolchain/docToolchain/graphs/contributors
Contributors are mostly users of the system who need to fit a docToolchain task to
their needs or need an additional task.
General docToolchain users: These are mainly software architects and technical
writers with a development background. It is very likely that a docToolchain user
also uses arc42 and is familiar with it.
The current focus is on users who are software architects in a corporate environment
and work on the JVM with Git and Gradle.
These users need to include architectural diagrams in their documentation and work
in a restricted environment (proxies, local maven repositories, restricted internet
access, etc.).
They also need to be able to create different output formats for different stakeholders.
Users not working with the JVM: docToolchain also caught the attention of users
who do not primarily work on the JVM. These users need to use docToolchain, just
like any other command-line tool.
Readers: the main purpose of documentation is to be read by someone who needs
specific information.
VI.2 Constraints
C1 - Run everywhere
docToolchain has to run on all modern operating systems which users need for
development, e.g., Windows, Linux, macOS.
docToolchain is built for the JVM. While some of the features provided might also
work together with other environments, this can’t be guaranteed for all tasks.
C3 - enterprise ready
C4 - headless
docToolchain has to run headless, i.e., without a display, to run on build servers.
User interaction and windows features have to be avoided.
Business Context
The following diagram shows the context in which docToolchain operates:
Business Context
The above diagram is correct regarding the connections from docToolchain, but ab-
stracted regarding the relationships from the two actors “Contributor” and “Reader”:
Those actors represent users who contribute to the documentation or access it. The
connections shown between those actors and the neighbor systems are abstract
because these users do not directly access the files. And they might use different
applications to read and modify the content.
docToolchain itself uses exactly the applications shown or reads and modifies the
files directly.
It creates the output documents directly as file content. Only for Confluence, the
Confluence API and thus Confluence as a System is used.
To read the input documents, docToolchain often needs an Application which can be
remote-controlled to convert and read the documents. But it directly reads xlsx and
Markdown files.
The Actors represent roles. Each user can access the system through all of those roles.
Each user will likely have a primary role.
The Reader and Contributor roles don’t have a direct connection to docToolchain.
They may read and modify content through different applications than docToolchain
uses to access the content.
Actor / Neighbor Description
Technical Writer (user) These are the users who use docToolchain
to manage their documentation. They
create the document structure and
configure the connection between different
documentation items mainly via include¹¹¹-
and image¹¹²-directives in AsciiDoc.
¹¹¹https://asciidoctor.org/docs/user-manual/#include-directive
¹¹²https://asciidoctor.org/docs/user-manual/#images
your solution architecture documentation (AsciiDoc-document) The main document
which references all other parts of your documentation through
include-statements. For a solution architecture, this document is derived from
the arc42-template
arc42 template (as AsciiDoc-document) The template that helps you to structure
the documentation of your solution
architecture
HTML Error Report (output file) To check the quality of the documentation,
docToolchain can create an error report
which contains the results of some syntax
check on the generated HTML documents
Technical Context
This section describes the technical interfaces to the components listed above.
System Interface
Jira (external system) REST-API
Scope
This section explains the scope of docToolchain, i.e., what types of tasks
docToolchain is responsible for and also what it is not responsible for.
docToolchain:
task newTask (
    description: 'just a demo',
    group: 'docToolchain'
) {
    doLast {
        // here you can put any Groovy script
    }
}
The original build.gradle soon grew into a huge single file. To make it more modular,
we split it up into so-called script-plugins¹¹³. You extract tasks which belong together
in an extra .gradle file (for docToolchain you will find those in the /scripts folder)
and reference them from the main build.gradle:
Export Tasks
These are tasks which export diagrams and plain text from other tools. All tools are
different, so most of them need individual solution strategies.
These tools often save their data in a proprietary binary data format, which is not
easy to put under version control.
The solution strategy for export tasks is first to try and read the file format directly
to export the data. An excellent example of this is the exportExcel task, which uses
the Apache POI library¹¹⁴ to access the file directly. This solution makes the task
independent of MS Excel itself.
However, there are tools where the file format can’t be read directly, or the effort
is too high. In these cases, the solution strategy is to automate the tool itself. An
excellent example of this case is the exportEA task where the task starts Enterprise
Architect in an invisible, headless mode and exports all diagrams and notes through
the COM interface. The drawbacks are that this task only runs on Windows systems.
It can also be slow, and while Enterprise Architect is invisible, it still claims the input
focus so that you can’t work with the machine during export. However, besides those
drawbacks, the automation is still valuable.
A third kind of tools from which docToolchain exports data are web-based tools
like Atlassian Jira. The obvious solution is to use the available REST-API to export
it. There is also a Java-API which wraps the REST-API, but the direct REST-API is
preferred.
Most export tasks are quite slow. At least so slow that you don’t want to run them
every time you generate your documentation. In addition, some of the exports (like
exportEA) don’t run on a Linux-based build server. Because of this, the general
strategy is to put the exported artifacts under version control. The tasks create special
folders into which they place - alongside the exported data - a README.adoc which warns
the user that the files in the folder will be overwritten with the next export.
At first, it sounds strange to put exported or generated artifacts under version control, but
you even get an additional benefit. The exported files - in contrast to their proprietary
source files - can be easily compared and reviewed!
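The convention of a warning file inside each export folder can be sketched as follows. docToolchain implements its tasks as Gradle scripts in Groovy, so the Java class ExportFolder below is only a hypothetical illustration, and the wording of the warning is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExportFolder {
    // AsciiDoc admonition; the exact wording used by docToolchain is not given here
    static final String WARNING =
            "WARNING: The files in this folder are generated and will be overwritten by the next export.\n";

    /** Creates the export folder if needed and (re)writes the warning file into it. */
    static Path prepare(Path folder) {
        try {
            Files.createDirectories(folder);
            Path readme = folder.resolve("README.adoc");
            Files.write(readme, WARNING.getBytes(StandardCharsets.UTF_8));
            return readme;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An export task would call prepare before writing its artifacts, so that anyone browsing the version-controlled folder sees the warning next to the exported files.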
¹¹⁴https://poi.apache.org/
When a software system has a longer lifetime, it might also occur that some of the
input systems get outdated. This may result in unreadable source files. In these cases,
the version-controlled artifacts might also be quite useful since they are in
easy-to-read text-only or image formats.
Generate Tasks
These are tasks which generate different output formats from the sources.
For these tasks, the general solution strategy should be to implement them as an
additional Asciidoctor backend (as reveal.js does, for example) or at least as a plugin
(as the PDF plugin does, for example). However, because of missing documentation,
the effort to create an Asciidoctor backend is quite high.
So, we used two other solution approaches:
5.1 Level 1
The following picture shows level 1 building blocks.
Component Description
Gradle This is the Gradle build system which is used for
docToolchain to orchestrate the documentation tasks.
Details can be found at https://gradle.org.
Documentation Tools These are various tools which are used to create parts of
your system documentation. Popular examples are MS
Word and MS PowerPoint. docToolchain interfaces these
tools directly instead of reading the source files because there
is no known direct file API for these tools. See level 2:
Documentation Tools for details.
Documentation Source Files Some parts of your documentation might be created in a
format for which a direct file API exists. This package
contains those input files.
Export Tasks This package contains several tasks which export data from
Documentation Tools by either interfacing the
Documentation Tools or accessing the Documentation
Source Files directly. See level 2: Export Tasks for details.
Generate Tasks All generate tasks have in common that they use the
documentation master to generate the assembled
documentation in a specific output format. This package
contains all of these tasks. See level 2: Generate Tasks for
details.
5.2 Level 2
These are the tools from which docToolchain is able to export documentation
fragments. Each tool is interfaced by at least one export task through a specific
interface.
Component Description
Jira & exportJiraIssues Jira is the issue tracking system from
Atlassian. The exportJiraIssues task exports
a list of issues for a given JQL-Search.
These issues are exported through the Jira
Server REST API¹¹⁵. The result is stored as
an AsciiDoc table to be included from the
Documentation Master. More
documentation for this task can be found in
the manual¹¹⁶.
¹¹⁵https://docs.atlassian.com/software/jira/docs/api/REST/
¹¹⁶https://doctoolchain.github.io/docToolchain/#_exportjiraissues
¹¹⁷https://doctoolchain.github.io/docToolchain/#_exportchangelog
¹¹⁸https://doctoolchain.github.io/docToolchain/#_exportcontributors
Sparx EA & exportEA Sparx Enterprise Architect (EA) is the UML
modeling tool from Sparx Systems. The
exportEA task exports all diagrams and
element notes from a model through the
COM API provided by EA. The API is
interfaced through a Visual Basic Script
started from the task. The COM API is
more an application remoting interface
than a data access interface. As a result,
docToolchain starts EA headless and
iterates through the model in order to
export the data. The resulting images and
text files can be included from the
Documentation Master. More
documentation for this task can be found in
the manual¹¹⁹.
¹¹⁹https://doctoolchain.github.io/docToolchain/#_exportea
¹²⁰https://doctoolchain.github.io/docToolchain/#_exportppt
MS-Visio & exportVisio MS-Visio is the diagramming software
from Microsoft. The exportVisio task
exports all diagrams as image files together
with their corresponding diagram notes as
text files. As the interface, the COM API
provided by Visio is used through a Visual
Basic Script which is started from the task.
The resulting images and text files can be
included from the Documentation Master.
More documentation for this task can be
found in the manual¹²¹.
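The exportJiraIssues entry above states that the issues are stored as an AsciiDoc table. The following sketch shows only that rendering step, without any network access. The input mirrors the general shape of a Jira REST search result (`key`, `fields.summary`, `fields.status.name`); the function name, the column selection and the sample issues are assumptions for illustration.

```python
def issues_to_asciidoc(issues):
    """Render a list of Jira-style issue dictionaries as an AsciiDoc table."""
    lines = ['[options="header",cols="1,3,1"]', "|===", "|Key |Summary |Status"]
    for issue in issues:
        fields = issue["fields"]
        # Escape the cell separator so that a "|" in a summary does not start a new cell.
        summary = fields["summary"].replace("|", "\\|")
        lines.append(f"|{issue['key']} |{summary} |{fields['status']['name']}")
    lines.append("|===")
    return "\n".join(lines) + "\n"


# Hypothetical sample data in the shape of a search result's "issues" array.
sample_issues = [
    {"key": "DOC-1", "fields": {"summary": "Fix broken link", "status": {"name": "Open"}}},
    {"key": "DOC-2", "fields": {"summary": "Add A|B diagram", "status": {"name": "Done"}}},
]
table = issues_to_asciidoc(sample_issues)
```

The resulting text could be written to a file in the export folder and pulled into the main document with an include directive.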
Documentation Parts & Export Tasks These are files which can be directly
interfaced by docToolchain without the
need of an additional tool. Each file is
interfaced by one export task through a
certain library.
¹²¹https://doctoolchain.github.io/docToolchain/#_exportvisio
¹²²https://doctoolchain.github.io/docToolchain/#_exportexcel
Component Description
Markdown & exportMarkdown Markdown is the well-known and quite
popular markup language. The
exportMarkdown task converts all .md files
found in the documentation folder to
AsciiDoc so that they can be included from
the Documentation Master. The interface is
direct file access; the
nl.jworks.markdown_to_asciidoc library is
used as the converter. More documentation
for this task can be found in the manual¹²³.
Utility Tasks
These are tasks which might be helpful in certain situations but do not directly belong
to the scope of docToolchain.
¹²³https://doctoolchain.github.io/docToolchain/#_exportmarkdown
Component Description
fixEncoding Asciidoctor is very strict regarding the encoding. It needs UTF-8 as
file encoding. But sometimes a text editor or some operation on your
documentation file might change the encoding. In this case, this task
will help. It reads all documentation files with the given or guessed
encoding and rewrites them in UTF-8. More documentation for this
task can be found in the manual¹²⁴.
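The behavior described for fixEncoding (read with a given or guessed encoding, rewrite as UTF-8) can be sketched like this. It is a simplified illustration, not the task's actual code: the candidate list, the `*.adoc` pattern and the function names are assumptions, and the guessing here is a plain trial-and-error decode.

```python
import tempfile
from pathlib import Path

# Encodings tried in order when none is given; this list is an assumption.
CANDIDATES = ("utf-8", "cp1252")


def decode_with_guess(raw, encoding=None):
    """Decode bytes with the given encoding, or with the first candidate that fits."""
    if encoding:
        return raw.decode(encoding)
    for candidate in CANDIDATES:
        try:
            return raw.decode(candidate)
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte, so decoding cannot fail.
    return raw.decode("latin-1")


def fix_encoding(folder, encoding=None, pattern="*.adoc"):
    """Rewrite all matching files as UTF-8 and return the names of changed files."""
    changed = []
    for path in sorted(Path(folder).rglob(pattern)):
        raw = path.read_bytes()
        fixed = decode_with_guess(raw, encoding).encode("utf-8")
        if fixed != raw:
            path.write_bytes(fixed)
            changed.append(path.name)
    return changed


# Usage: one legacy file in Windows-1252, one file that is already UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "legacy.adoc").write_bytes("Decided by Ralf D. Müller".encode("cp1252"))
    (docs / "ok.adoc").write_bytes("Stefan Zörner".encode("utf-8"))
    changed = fix_encoding(docs)
    legacy_text = (docs / "legacy.adoc").read_bytes().decode("utf-8")
```

Files that are already valid UTF-8 are left untouched, so running the sketch twice changes nothing on the second run.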
This package contains all tasks which assemble your documentation to various
output files in different formats.
Component Description
Asciidoctor Asciidoctor as rendering engine is at the heart of this
package. docToolchain uses Asciidoctor as Gradle
plugin which wraps the Ruby implementation with
JRuby. This is mostly transparent for the user, but it is
important in the enterprise context: all Asciidoctor
plugins also refer to their wrapped versions and not
the Ruby gems. This way, we avoid referencing the
Ruby gem repository as an external dependency for
docToolchain. Instead, all wrapped plugins are
referenced from the standard Java repositories. More
details can be found at
https://asciidoctor.org/docs/asciidoctor-gradle-plugin/.
Component Description
generateHTML This is the most straightforward task which generates
the standard HTML output with the help of the
Asciidoctor Gradle Plugin. It takes as input the Master
Documentation which includes the Documentation
Fragments. More documentation for this task can be
found in the manual¹²⁶.
convertToDocx & convertToEpub These two tasks use Pandoc as command-line tool to
convert the generated HTML documentation into a
DOCX or ePub file. More documentation for these tasks
can be found in the manual¹³⁰.
¹²⁶https://doctoolchain.github.io/docToolchain/#_generatehtml
¹²⁷https://doctoolchain.github.io/docToolchain/#_generatepdf
¹²⁸https://doctoolchain.github.io/docToolchain/#_generatedocbook
¹²⁹https://doctoolchain.github.io/docToolchain/#_generatedeck
¹³⁰https://doctoolchain.github.io/docToolchain/#_generatedocx
htmlSanityCheck This task runs a check on the generated HTML
documentation. It uses the Gradle plugin of the same
name to perform this task and generates a unit-test-like
HTML error report. See
https://github.com/aim42/htmlSanityCheck for more
details.
publishToConfluence This task takes the generated HTML output and splits
it by the headline levels into separate pages. These
pages and their images are then published to a
Confluence instance. It uses the Confluence REST
API¹³¹ to interface Confluence. More documentation
for this task can be found in the manual¹³².
¹³¹https://developer.atlassian.com/server/confluence/confluence-server-rest-api/
¹³²https://doctoolchain.github.io/docToolchain/#_publishtoconfluence
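The publishToConfluence entry above says that the generated HTML is split by headline levels into separate pages. A minimal sketch of such a split follows. It uses regular expressions on a small, well-formed input, which is a deliberate simplification; the function name, the `max_level` parameter and the sample HTML are assumptions, and nothing is sent to Confluence.

```python
import re


def split_into_pages(html, max_level=2):
    """Split HTML into pages at every heading from h1 down to h<max_level>."""
    levels = "".join(str(n) for n in range(1, max_level + 1))
    boundary = re.compile(rf"(?=<h[{levels}][\s>])")
    heading = re.compile(rf"<h[{levels}][^>]*>(.*?)</h[{levels}]>", re.DOTALL)

    pages = []
    for chunk in boundary.split(html):
        match = heading.match(chunk)
        if not match:
            continue  # content before the first heading is skipped in this sketch
        # Strip inline markup from the heading to get a plain page title.
        title = re.sub(r"<[^>]+>", "", match.group(1)).strip()
        pages.append({"title": title, "body": chunk[match.end():].strip()})
    return pages


# Hypothetical input: deeper headings (h3) stay inside their parent page.
html = (
    "<h1>Introduction</h1><p>Goals</p>"
    '<h2 id="ctx">Context</h2><p>Scope</p><h3>Detail</h3><p>More</p>'
    "<h2>Strategy</h2><p>Plan</p>"
)
pages = split_into_pages(html)
titles = [page["title"] for page in pages]
```

Each resulting dictionary corresponds to one page; headings below the chosen level remain part of the page body.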
1. docToolchain exports data (diagram images and plain text) from documentation
tools.
2. the main documentation (on the basis of the arc42-template) includes the
exported artifacts
3. the final output is generated via Asciidoctor and plugins…
4. …and converted through additional scripts and tools
Besides, docToolchain contains some additional utility tasks which do not directly
fit into the schema above, like the htmlSanityCheck-task.
The next diagram shows the details of all currently implemented tasks. It focuses on
the various input artifacts which are transformed through the Gradle tasks which
make up docToolchain. Each connection is either a Gradle task which transforms
the artifact or an AsciiDoc include-directive which references the input artifacts
from the main documentation and thus instructs Asciidoctor to assemble the whole
documentation.
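How include directives assemble a document from exported artifacts can be illustrated with a small resolver. This is a sketch of the idea only: Asciidoctor's real include processing supports attributes such as `lines`, `tags` and `leveloffset`, which are ignored here, and the file names are assumptions.

```python
import re
import tempfile
from pathlib import Path

# Matches a line such as: include::export/issues.adoc[]
INCLUDE = re.compile(r"^include::(?P<target>[^\[\s]+)\[[^\]]*\]\s*$")


def assemble(path, seen=()):
    """Return the text of `path` with include directives replaced by file contents."""
    path = Path(path).resolve()
    if path in seen:
        raise ValueError(f"circular include: {path}")
    out = []
    for line in path.read_text(encoding="utf-8").splitlines():
        match = INCLUDE.match(line)
        if match:
            # Targets are resolved relative to the including file.
            out.append(assemble(path.parent / match["target"], seen + (path,)))
        else:
            out.append(line)
    return "\n".join(out)


# Usage: a main document that includes one exported table.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "export").mkdir()
    (root / "export" / "issues.adoc").write_text(
        "|===\n|DOC-1 |Open\n|===\n", encoding="utf-8"
    )
    (root / "arc42.adoc").write_text(
        "= Architecture\n\ninclude::export/issues.adoc[]\n\nEnd.\n", encoding="utf-8"
    )
    assembled = assemble(root / "arc42.adoc")
```

The main document stays small and readable, while the exported fragments can be regenerated independently.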
Runtime View
This approach works quite well and independently of your project. However, it violates
RQ6 - clone, build, test, run for your project. Someone who clones your project will
also have to install docToolchain before they can work on the documentation. In addition,
breaking changes in docToolchain are likely to break your build if you do not
update docToolchain on all systems.
This line adds one additional folder to your project, containing the whole
docToolchain repository:
yourProject/
├─ docToolchain/ <-- docToolchain as submodule
├─ src/
│ ├─ docs <-- docs for yourProject
│ ├─ test <-- tests for yourProject
│ └─ java <-- sources of yourProject
├─ ...
├─ build.gradle <-- build of yourProject
├─ ...
As you can see, this gives you a clean separation of your project and docToolchain.
The git submodule is just a pointer to the docToolchain repository - not a copy. It
points to a specific release. This approach ensures that whoever clones your project,
also gets the correct version of docToolchain.
To use docToolchain with this approach, you configure it as a sub-project of your
main Gradle project (see details in this tutorial¹³³).
If your project uses another build system like Maven, you can still use this approach.
You simply use Maven to build your code and Gradle to build your documentation.
See also TR3: Git Submodules
¹³³https://docs-as-co.de/getstarted/tutorial2
Automated Testing
In an ideal world, we should have covered all code with unit tests. However, currently,
the smallest unit of docToolchain is a Gradle task. Because of the dependencies on
Gradle, we can’t easily test these tasks in isolation.
The current solution for automated tests is to use the gradleTestKit¹³⁴ and Spock¹³⁵.
Through the gradleTestKit, Gradle will start another Gradle instance with its own
test configurations. The gradleTestKit ensures that all tasks can be integration-tested
in an automated way.
Spock helps to define the tests in an easy-to-maintain, behavior-driven development¹³⁶
(BDD) way.
¹³⁴https://docs.gradle.org/4.9/userguide/userguide.html
¹³⁵http://spockframework.org/
¹³⁶https://en.wikipedia.org/wiki/Behavior-driven_development
• explain to users and contributors who are new to the project why things are the
way they are
• help to revise decisions whose basis has changed
Decision: The first approach isn’t easy to use, and the second even violates a
constraint. So it was decided to use Visual Basic as an automation tool. Since this
is the native interface for a COM object (which Enterprise Architect provides), the
number of bugs in the interface is expected to be minimal. For a first POC, I used
Visual Basic together with Visual Studio. This setup provides IntelliSense and code
completion features and thus makes it easy to explore the API. However, the final
task uses Visual Basic Script, which is a little bit different and doesn’t provide
code completion features.
Decided by Ralf D. Müller, ca. August 2016
However, script plugins also have advantages. They make the code more compact
and easier to maintain because they consist of single files and are not distributed
over several classes. The tasks are often so small that we included their source in
the manual of docToolchain. When users read through the manual, they see at once
how the tasks are implemented. The chances are good that such a user turns into a
contributor with this knowledge.
Decision: the pros of using script plugins currently outweigh the cons.
docToolchain has to be easy to maintain and extend (see RQ2 - easy to modify and
extend).
There is one exception to this rule: Some tasks - like the Pandoc tasks - are just
wrappers to other executables. A binary plugin could even help in this situation by
providing an automated download of the executable.
(In fact, a first version of a binary Pandoc plugin is already available¹³⁸.)
Decided by Ralf D. Müller, mid-2018
Decision: the most common options to use (deploy) docToolchain are the three
described in chapter 7 - Deployment View.
Considered alternatives: the following deployment options have been considered
but not implemented.
DD6.1 Embedded
This was the first approach used, and we do not recommend it anymore. In this
approach, you use docToolchain itself as a base project, and add all code and
documentation to that base project.
This approach makes it hard to update docToolchain itself since docToolchain and
your project are entangled:
yourProject/
├─ scripts/ <-- docToolchain task definitions
│ ├─ AsciiDocBasics.gradle
│ │ ...
│ └─ publishToConfluence.gradle
├─ docs/ <-- docToolchain rendered manual
├─ resources/ <-- docToolchain external submodules
├─ src/
│ ├─ docs <-- docs for docToolchain AND yourProject
│ ├─ test <-- tests for docToolchain AND yourProject
│ └─ java <-- sources of yourProject
├─ ...
├─ build.gradle <-- merged build file of docToolchain
├─ ...              AND yourProject
As you can see, this was only a good idea in the beginning, when docToolchain
mainly consisted of an extended Asciidoctor build template.
This approach is currently a theoretical option. It has not been used yet, but is
mentioned here for the sake of completeness.
Gradle has a feature called script plugins. In the same way that docToolchain references
the modular, local Gradle files for different tasks, you can reference these scripts as
remote files in your project’s build.gradle file:
This line references an exact version of the file - so as long as you trust GitHub and
your connection to GitHub, this approach is safe and works for all self-contained
tasks.
See DD4: Binary Gradle Plugins for why this was not an option in the past.
For this option, the existing script plugins have to be converted into binary plugins for
Gradle.
As a result, they can be referenced directly from your build.gradle file. HTML Sanity
Check is an excellent example of this approach.
plugins {
    id "org.aim42.htmlSanityCheck" version "1.0.0-RC-2"
}
We could turn docToolchain into a binary plugin which references all other plugins as
its dependencies. Alternatively, we could turn docToolchain into a collection of binary
plugins, and users of docToolchain would reference only the plugins they need.
QS1: Confidentiality
In order to keep control over the content of your documentation, docToolchain shall
never use a web-based, remote service for any task without the explicit knowledge
of the user.
QS3: Repeatability
Every run of a docToolchain task creates the same predictable output.
This rule ensures that it doesn’t matter if you move the source or the generated output
from stage to stage.
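One way to check the repeatability scenario is to run a task several times on the same source and compare digests of the output. The sketch below assumes that a task can be modeled as a function from source text to output bytes; both render functions are hypothetical stand-ins and not docToolchain tasks.

```python
import hashlib
import itertools


def digest(output):
    """Return the SHA-256 hex digest of a task's output bytes."""
    return hashlib.sha256(output).hexdigest()


def is_repeatable(task, source, runs=3):
    """Run `task` several times on the same source and compare the digests."""
    digests = {digest(task(source)) for _ in range(runs)}
    return len(digests) == 1


def render_stable(source):
    # Output depends only on the input.
    return f"<h1>{source}</h1>".encode("utf-8")


_run_counter = itertools.count()


def render_with_counter(source):
    # A run-dependent value in the output (here a counter) breaks repeatability.
    return f"<h1>{source}</h1><!-- run {next(_run_counter)} -->".encode("utf-8")


stable = is_repeatable(render_stable, "arc42")
unstable = is_repeatable(render_with_counter, "arc42")
```

In practice, values such as build timestamps embedded in the output would have the same effect as the counter in this example.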
QS5: Performance
Functionality first. docToolchain is about the automation of repetitive, manual tasks.
So even when an automated task is slow, it will have a value.
However, we should strive for better performance of the tasks whenever we see a
chance to optimize.
QS7: Maintainability
Many users with different needs use docToolchain. Those different needs are the
result of different environments and different tools in use.
Therefore docToolchain tasks should be easy to modify to fit everyone’s needs. At
least those tasks which are not stable yet.
There should also be blueprints for new tasks.
The “easy to modify” quality goal is currently achieved through the scripting
character of the Gradle script plugins. The source is referenced from the documentation
and easy to understand and thus easy to modify.
However, this is in contrast to the “Ease of Use” quality goal: script plugin vs. binary
plugin.
• if one of the tools gets outdated, you still have the exported data and hence can
still work with it.
• in addition, the exported .png-images and plain text files are easier to compare
than the often binary source formats. This enables a better review of the changes
in your documentation.
• if added via ssh protocol, users without ssh configured for their git account
can’t clone them
• submodules easily get in a “detached head” state
• at least Git on Windows has authentication problems with submodules (use of
Pageant helps)
VI.12 Glossary
Term Definition
COM, Component Object Model COM is an interface technology defined and
implemented as a standard by Microsoft. See
also COM on Wikipedia¹⁴⁰
¹⁴⁰https://en.wikipedia.org/wiki/Component_Object_Model
¹⁴¹https://en.wikipedia.org/wiki/Dynamic-link_library
¹⁴²https://en.wikipedia.org/wiki/Enterprise_Architect_(software)
VII - Foto Max
By Hendrik Lösch.
FotoMaX is a purely theoretical product for ordering photos at a kiosk system,
such as those found in drugstores or supermarkets all over Europe. The photos can
additionally be ordered via a website that is also used by partnering businesses to
provide additional services. This website is only mentioned where necessary but is
not described in detail because otherwise the example would become too complex.
The documentation was created based on various real projects in the context of
cyber-physical systems but has no real implementation. It was created for and used during
many workshops about architecture documentation. The goal is to show how to
document systems that have grown over many years, were maintained by
multiple teams, and have a certain degree of interaction with special hardware
components.
This is also the reason why the structure of this document differs slightly from the
one described in arc42. Especially the first chapter might seem redundant in the context
of this book because it describes the overall structure. The idea behind this chapter
is to give the readers all the information they need to understand which parts of the
documentation are important for them. The last chapter is also one that is not part of
arc42 because it contains typical information teams use to organize their daily work.
It was added to show how to incorporate such things in your overall documentation
and link between architecture documentation and team organization. Moreover,
it contains actual links between different pages because the whole document was
written with the use of a wiki system in mind.
It is safe to assume that new readers of the document will start with the first
page of the document. Thus we can use this page to provide the most important
information and guide the reader through the document. This approach is
especially important when using wiki systems, which can otherwise become very
confusing.
This documentation describes all aspects that should and must be considered in the
development and maintenance of the FotoMaX device software. It uses the arc42
template as a basic structure. According to this, the documentation is divided into
the following subsections, which can be read from top to bottom, leading from an
abstract and external problem view to a concrete internal solution view.
Quality Requirements
This chapter deviates from the official arc42 structure in order to emphasize the
importance of the quality requirements. Normally, these would only appear much
later in the document, which makes sense if they are described in great detail. In
this case, there are only a comparatively small number of scenarios. All of these
are very important and are therefore brought into focus by appearing very early in the
document.
This chapter describes the non-functional requirements of the software system and
prioritizes its quality attributes. These are fundamental decisions that influence many
different aspects of the software and software architecture.
Constraints
This chapter describes the constraints and requirements that limit software design,
implementation or the development process.
Solution Strategy
The actual implementation strategy is described in this chapter.
Runtime View
All dynamic relationships that are to be highlighted are described in the runtime
view chapter.
Deployment View
The deployment view clarifies in which runtime environments the individual com-
ponents of the system are installed and executed.
Crosscutting Concepts
Fundamental decisions should not be made based on gut feeling, but as empirically
as possible. In this chapter, all the information that is decisive for design decisions is
collected.
Architectural Decisions
Fundamental decisions should not be made on the basis of a gut feeling, but as
empirically as possible. In this chapter, all the information that is decisive for design
decisions is collected.
Glossary
The glossary summarizes all terms that may hinder the understanding of the
documentation.
Organizational Topics
This chapter is not part of arc42 but was added to this documentation to show
how teams can organize their daily work next to the architecture documentation
so that both information sources are kept close together. This allows linking between
the two and ensures that both are always present. This in
turn ensures that the architecture documentation is not perceived as something
that exists somewhere somehow, but as an important part of one’s own work.
This chapter contains information that does not relate directly to the software
architecture, but to the day-to-day collaboration in implementing the software.
This includes the Definition of Ready and Definition of Done, but also guidelines,
checklists and things like vacation planning.
About FotoMaX
FotoMaX is a white label solution for ordering photo products. It enables retailers
of various types to offer their own customers additional services, related to the
processing and ordering of photographic content. This solution consists of two parts,
the device hardware and software, which in their combination are set up at various
partners as “Sales Units” or just “Units”. This document primarily describes the
software and only discusses the hardware when it influences the software design
significantly.
FotoMaX does not take care of the actual printing of the orders. It provides the link
between the companies that set up a sales unit in their stores and those that do the
actual printing of the photo products. Depending on the chosen version of FotoMaX, these
could even be the same companies. The products are paid for directly in the respective
store or via mobile payment solutions.
In the simplest case, printing takes place directly in the store (“FotoMaX Standalone”).
Alternatively, orders can be forwarded to a print shop (“FotoMaX Connect”) to allow a
larger variety of products such as clothing, coffee cups or posters.
Typical points of sale:
• Supermarkets
• Drugstores
• Hotels
• …
Stakeholders
The following people represent a particular view of the system and should therefore
be consulted in fundamental decisions.
Please note that these persons and their contact details are purely fictional.
Similarities to existing companies or persons are therefore purely coincidental.
Prioritization
The prioritization of the quality attributes of the system was carried out with the
involvement of the stakeholders (→VII.2.2). It was initially carried out for the sales
unit and later modified to incorporate the portal:
2 Security The sales units are located in a public space and offer
various possibilities for data transmission.
¹⁴³https://iso25000.com/index.php/en/iso-25000-standards/iso-25010
Legend:
2 Important Compromises are only allowed if features with higher priority are
strengthened.
3 Significant Compromises are possible, as long as the core requirements are not
disturbed.
Quality Scenarios
The following scenarios specify how exactly the software system should behave in
certain situations and what compromises may be possible.
# Attribute Description
S1 Functional Correctness When a customer places an order, the system must
round the final amount in a commercially correct
manner.
S3 Learnability A user who has not seen the software before must be
able to order four different photo products within
five minutes.
S9 Capacity & Time Behavior If a customer orders 1000 different images with a
total size of one gigabyte, they must receive an order
confirmation within five minutes.
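Scenario S1 requires commercially correct rounding. A minimal sketch follows, assuming that this means rounding to two decimal places with halves rounded away from zero (as opposed to banker's rounding). The function names and amounts are hypothetical, and Python is used only for illustration; the Wizard itself is written in C#.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_commercially(amount):
    """Round to two decimal places, with halves rounded away from zero."""
    # Going through str() avoids inheriting binary floating-point representation errors.
    return Decimal(str(amount)).quantize(CENT, rounding=ROUND_HALF_UP)


def order_total(unit_price, quantity):
    """Compute an order total from a decimal unit price and an integer quantity."""
    return round_commercially(Decimal(unit_price) * quantity)


# Hypothetical amounts that expose the difference to built-in float rounding.
half_cent = round_commercially("0.125")
float_rounded = round(0.125, 2)  # built-in rounding picks the even neighbour
total = order_total("0.125", 3)
refund = round_commercially("-0.125")
```

Keeping monetary amounts in a decimal type from input to output is what makes a scenario like S1 testable with exact expected values.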
VII.4 Constraints
This chapter describes the constraints and requirements that limit software design,
implementation or the development process.
Organizational Constraints
The company basically has two main products that are customized depending on the
business partner in terms of features, workflows and graphical assets (→VII.5.1).
Team Portal Eight members, responsible for the portal solution where
customers can view their orders and partners can view their
commissions.
Team Data Two members, responsible for the analysis and preparation of
the accruing data for partners and for product maintenance.
Technical Constraints
• Technologies Used:
– Wizard: .NET 4.8, C#, WPF
– Portal: .NET 6, C#, Angular 11, NServiceBus for internal communication
between services
• Runtime Environment:
– Wizard: Windows 10, SQLite
– Portal: Windows Server 2019, SQL Server 2019
• Release Cycles:
– Wizard: About once every two to six months, depending on the partner,
over the air.
– Portal: Approx. every two weeks
Technical particularities:
Note: 40 Wizard instances from before 2015 do not have remote update capabilities
and must be updated by a service engineer until 2025. After that, the maintenance
contracts expire, forcing the customer to upgrade to a remote-enabled version.
Conventions
• Code
– The coding guidelines are located in VII.14.2.1.
• Test
– The testing guidelines are located in VII.14.2.2.
• Review
– The review guidelines are located in VII.14.2.3.
Business Context
The business context describes the external dependencies of the software system from
a domain-oriented perspective.
FotoMaX maps two different workflows: In standalone mode, all jobs are processed
on the client’s premises; in connected mode, jobs are forwarded to external print
service providers.
Designation Description
End Customer A person who uses photo printing services via a sales unit
and can track them via the portal.
Print Provider An employee of the print provider who can view order
data and modify orders.
Order System Print The external system of the print provider to which print
orders are forwarded. Each print provider can be expected
to have its own system.
Billing System / Printing Site The billing system of the printing site provider, via which
print jobs are billed.
Standalone Setup
In standalone mode the sales unit is only connected to a local printer and has no
internet connection. All orders are processed and paid for within the premises of the
pitch provider. The supplier rents both the sales unit and the printer from FotoMaX
at a fixed monthly rate. In addition, consumables such as ink and paper are billed
according to usage. In addition to printing the ordered items, the sales unit also prints
a receipt, which has to be paid by the end customer at the checkout of the pitch
provider. FotoMaX is not involved in the actual billing process.
Connected Setup
The connected setup allows end customers to place more extensive orders and track
them online. In addition, the orders are not executed exclusively on the
provider’s premises, but also at a corresponding print provider for additional products
such as mugs, t-shirts or similar articles. In a partially connected setup, the billing
is done by the pitch provider.
In a fully connected setup, the orders are either paid for by EC or credit card
at the device, or the customer receives an invoice via e-mail. In this case, the pitch
provider is not involved in the billing of an order, but receives a fixed monthly stand
fee and a commission on all sales. They can view their earnings at any time via the
portal.
Technical Context
The technical context highlights the technical environment in which the software
system exists and the respective interfaces to external systems.
Element Description
Service Technician An employee of FotoMaX GmbH, who performs maintenance work
on sales units on site.
Local Technician An employee of the respective sales pitch provider, who performs
minor maintenance work on sales units. This includes replenishing
printing paper and cartridges of the receipt printer or retrieving
orders in standalone mode. The wizard’s UI is used to confirm all
maintenance tasks. Local technicians have no access to the sales
unit’s operating system.
End Customer A person who uses photo printing services through a sales unit, or
views order data on the customer management website.
Wizard The software that takes orders and initiates prints either locally or
remotely.
Hardware Hub The central control unit via which various hardware components
can be addressed. This is used to read images from cameras, USB
sticks or memory cards. The connection is made via USB and is
standardized for all sales units.
Receipt Printer A thermal printer used to print out receipts for orders that then have
to be paid at a cash register.
Order Printer FotoMaX can forward print orders directly to printers of the
location provider and thus trigger immediate printing. Printers can
be connected via USB or as network printers.
Pitch provider An employee of the pitch provider who can view orders and check
invoices.
Print provider An employee of the print provider who can view and, if necessary,
change job status and orders.
Job system The external system of the respective print provider, to which jobs
are forwarded. The connection varies greatly depending on the
system and is therefore assumed in simplified form only as HTTPS.
The hardware is connected to the Wizard via the API of the operating system during
a standardized configuration process after the manufacturing of the units. Thus, no
special protocols have to be considered. The only exceptions are order printers: when
integrated as network printers, those have to be configured separately.
The Modulith
Basic Design
As a basis for the modulith, a framework application was created based on the Prism
Framework (→VII.11.4), which provides general features for central services such as
user administration, settings management, etc.
Module Design
Module Integration
The kernel integrates the modules into the application during a bootstrapping process.
The modules can access central interfaces, and they have a root element that is
called by the bootstrapper to control their initialization. Modules are strictly prohibited
from accessing the concrete implementation of either other modules or any central
services. Rather, they have to request all services via dependency injection, using
constructor injection, meaning that the service interfaces are injected as parameters
into the module constructors.
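The integration rule above (modules see only service interfaces, receive them through their constructors, and are initialized by the bootstrapper) can be sketched as follows. All class names are hypothetical and do not correspond to the Prism API; the Wizard is a .NET application, and Python is used here only to illustrate the pattern.

```python
import inspect
from abc import ABC, abstractmethod


class SettingsService(ABC):
    """Central service interface; modules depend on this, not on an implementation."""

    @abstractmethod
    def get(self, key): ...


class InMemorySettings(SettingsService):
    def __init__(self, values):
        self._values = dict(values)

    def get(self, key):
        return self._values[key]


class Module(ABC):
    """Root element that the bootstrapper calls to initialize a module."""

    @abstractmethod
    def initialize(self): ...


class OrderModule(Module):
    # The constructor names only the interface; the kernel chooses the implementation.
    def __init__(self, settings: SettingsService):
        self._settings = settings
        self.currency = None

    def initialize(self):
        self.currency = self._settings.get("currency")


class Kernel:
    def __init__(self):
        self._services = {}

    def register(self, interface, implementation):
        self._services[interface] = implementation

    def bootstrap(self, module_types):
        """Create each module with its declared services injected, then initialize it."""
        modules = []
        for module_type in module_types:
            params = inspect.signature(module_type.__init__).parameters
            deps = {
                name: self._services[param.annotation]
                for name, param in params.items()
                if name != "self"
            }
            module = module_type(**deps)
            module.initialize()
            modules.append(module)
        return modules


kernel = Kernel()
kernel.register(SettingsService, InMemorySettings({"currency": "EUR"}))
modules = kernel.bootstrap([OrderModule])
```

Because `OrderModule` never references `InMemorySettings`, the implementation can be swapped in the kernel without touching the module.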
Migration Strategy
In 2019, the architecture of the Wizard had a highly coupled structure, with neither
functional nor technical layering. The following diagram depicts this situation by
using the same colors for components with the same responsibility, while the
different sizes and the distribution represent the unstructured nature of the system.
To decompose the monolith, the various responsibilities within the codebase are
assessed first. Then a strategic decomposition is prepared, and a frame application
is created that can incorporate the modules.
This procedure is continued until either the entire Wizard is disassembled, or only
such components are left for which restructuring is not worthwhile.
• infrastructure (gray),
• logic (orange),
• and hardware connection (purple).
All versions of the sales units use the same infrastructure components. The logic
components are provided with their actual functionality at runtime through a
configuration that is specific to the sales unit. The hardware connection in turn
decouples all other components from the possible printers and connection options
of the sales unit.
Application Frame
Interfaces No external interfaces; C# interfaces are provided for the various
components.
Special Features Much of the functionality is provided by the Prism framework and
partially encapsulated by custom interfaces to reduce dependencies on
the framework, so that WPF and Prism can be replaced in the
future. Since the Application Frame is primarily a connection to the
Prism Framework and a collection of loose components, there is no
white-box representation of it in this documentation.
Workflow Management
Special features Workflow Management cannot work without logic components and
workflow configuration as it would only display an empty frame.
Maintenance Management
Purpose Contains all features that are needed to read and change the machine
state, get information about consumables, and help with calibration.
Open points Older versions of the Maintenance GUI could be accessed via the
general UI. The functionality required for this can still be found in the
source code for compatibility reasons; however, it is no longer linked in
the UI.
Image Handling
Purpose Image Handling contains all functions that can be used to read, process
and modify images. In addition, it also contains the corresponding
work steps, which are triggered via workflow management at runtime
and displayed to the user.
Interfaces No external interfaces, since all work steps are called via the workflow
engine.
Special Features A large part of the functionality is provided by the Prism framework
and partially encapsulated by its own interfaces to reduce
dependencies on the framework, since in the future a move away from
WPF and thus from Prism is possible. Since Image Handling is a
collection of loose components, there is no white-box representation of
this in this documentation.
Risks The code in Image Handling is very disorganized and needs a thorough
restructuring. In particular, image data is exchanged via a
global context, which causes confusion. See also technical risks.
Order Management
Purpose Order Management handles the actual ordering. This is done via the
portal in the connected case or via the local printers in the standalone
case.
Special Features Order Management must read from the configuration which work
mode the sales unit is in and then decide independently how it
processes orders.
Open points There is only one dialog in which the entire order is summarized. This
is used equally on all Sales Units. However, the pitch providers have
already stated needs for adjustments here.
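The mode-dependent behavior described for Order Management can be sketched as a simple selection between two processing strategies. This is only an illustration under assumptions: the enum WorkMode, the interface IOrderProcessor and the returned strings are hypothetical, and only the names LocalProcessing and RemoteProcessing follow the components named in this documentation.

```csharp
using System;

// Hypothetical representation of the work mode read from the configuration.
public enum WorkMode { Connected, Standalone }

// Hypothetical common interface of both processing variants.
public interface IOrderProcessor { string Process(string orderId); }

// Connected case: the order is forwarded via the portal.
public sealed class RemoteProcessing : IOrderProcessor
{
    public string Process(string orderId) => $"order {orderId} forwarded to portal";
}

// Standalone case: the order is printed on the local printers.
public sealed class LocalProcessing : IOrderProcessor
{
    public string Process(string orderId) => $"order {orderId} printed locally";
}

public static class OrderManagement
{
    // Decides how orders are processed, based only on the configured work mode.
    public static IOrderProcessor SelectProcessor(WorkMode mode)
    {
        switch (mode)
        {
            case WorkMode.Connected:
                return new RemoteProcessing();
            case WorkMode.Standalone:
                return new LocalProcessing();
            default:
                // A new work mode without handling is a programming error.
                throw new ArgumentOutOfRangeException(nameof(mode), mode, "Unknown work mode");
        }
    }
}
```

Callers only work against IOrderProcessor, so the rest of the sales unit does not need to know which mode is active.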
Hardware Management
Purpose The hardware management abstracts the hardware accesses for all
other subsystems.
Interfaces Primarily the operating system’s own interfaces are used to access the
hardware.
Special Features Hardware Management does not have its own UI. Since Hardware
Management is primarily a collection of loose components, there is no
white-box representation in this documentation.
Name Responsibility
Workflow UI Contains all views that are dynamically integrated into the main UI.
Order Workflow Aggregates all components of the order management and attaches
them to the frame application and workflow based on a
configuration.
Local Processing Takes over all processes that are necessary to execute a local print
job. This also includes the issuing of a receipt.
Remote Processing Handles all processes that are necessary to have a print job executed
by a print partner. This also includes the issuing of a receipt.
This part of the documentation has not been further elaborated as it does not add
any value to the actual purpose of the documentation.
Not every type has to be initialized by the module itself. This is only necessary for
module components that actually require special initialization. Where possible,
component initialization should happen asynchronously so that it does not delay
startup.
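One way to keep initialization from blocking the start is to start all component initializations concurrently and await them together. The following is a sketch under assumptions: IInitializableComponent, PrinterConnection and ModuleStartup are hypothetical names, and the short delay merely stands in for slow work such as hardware discovery.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical contract for components that need special initialization.
public interface IInitializableComponent { Task InitializeAsync(); }

public sealed class PrinterConnection : IInitializableComponent
{
    public bool Ready { get; private set; }

    public async Task InitializeAsync()
    {
        // Stands in for slow work such as hardware discovery (assumption).
        await Task.Delay(10);
        Ready = true;
    }
}

public static class ModuleStartup
{
    // Starts all initializations and returns a single task
    // that completes when every component is ready.
    public static Task InitializeAllAsync(IEnumerable<IInitializableComponent> components)
    {
        return Task.WhenAll(components.Select(component => component.InitializeAsync()));
    }
}
```

The caller decides when to await the returned task, for example only before the first workflow step that actually needs the components.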
Note: The same document printer is always installed in the sales unit, therefore it can
be decided via the selection of a template whether a remote order or a local order is
to be issued.
For the sake of clarity, the above illustration shows the production environment
only in simplified form. It omits virtualization environments, development
environments, staging environments, and fallback servers. A detailed overview can
be found in the infrastructure documentation of the IT department, which is
responsible for operations.
The test environment mirrors the structure of the production environment. It
has two databases, Test and Dev.
Artifact Description
wizard.exe Executable file of the wizard, which is installed by a service
technician on a sales unit before delivery. Updates are done
by the service technician on site.
Partner Assets Logos, fonts, theme, etc. in the corporate identity of the
pitch provider.
Test DB Special database with anonymized test data. This is used for
acceptance tests by the departments and for quality
assurance.
Types of “errors”
Different types of errors can occur within software. Not everything that is
commonly referred to as an error is also an error in the sense of software
development.
We can distinguish between the following scenarios:
Exceptions are the usual means of indicating errors in .NET. They have
the advantage that they can be checked and logged using appropriate analysis tools.
In addition, they automatically contain various environmental information such as a
stack trace, the time of occurrence and others. The disadvantage is that they are not
necessarily known to the caller. Thus, they may be overlooked, remain unhandled and
ultimately terminate the entire application. In addition, the information that
exceptions contain is not necessarily helpful for the user since it is very technical.
Note: Exceptions should be reserved for exceptional situations or pure programming
errors. If an action fails due to a foreseeable event, then that event should be handled
appropriately, and no exception should be thrown.
A good example of an unusual situation can be found in switch case statements,
where the list of possible values might change over time. As a result, it is possible that
case statements no longer cover all possible combinations. To detect such a problem
at an early stage, a corresponding exception can be thrown in the default branch.
using System.ComponentModel; // provides InvalidEnumArgumentException

decimal calculatedPrice;

switch (customer.Status)
{
    case CustomerStatus.Normal:
        calculatedPrice = CalculatePrice(order);
        break;
    case CustomerStatus.VIP:
        calculatedPrice = CalculateSpecialPrice(order);
        break;
    default:
        // Fails fast if a new status is added without extending this switch.
        throw new InvalidEnumArgumentException("Unknown customer status!");
}
Not Exceptional
Exceptions are intended for exceptional situations, but how do you deal with
non-exceptional situations? A result object is provided for this purpose. It contains
both technical error descriptions and error descriptions that a user can follow. The
result object can therefore also be used to inform users about a problem situation
and, if necessary, to prompt them to take countermeasures.
In the following example, this is illustrated with a division: the code first checks
whether valid data has been passed. If not, this is communicated to
the user so that they can correct their input if necessary.
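A minimal sketch of such a result object and its use in a division might look like this. The type Result<T>, its members and the class Calculator are hypothetical names for illustration and are not taken from the FotoMaX code base.

```csharp
using System;

// Hypothetical result object: carries either a value or two error descriptions,
// one technical (for logs) and one that a user can follow.
public sealed class Result<T>
{
    public bool Success { get; }
    public T Value { get; }
    public string TechnicalMessage { get; }
    public string UserMessage { get; }

    private Result(bool success, T value, string technicalMessage, string userMessage)
    {
        Success = success;
        Value = value;
        TechnicalMessage = technicalMessage;
        UserMessage = userMessage;
    }

    public static Result<T> Ok(T value) => new Result<T>(true, value, "", "");

    public static Result<T> Fail(string technicalMessage, string userMessage) =>
        new Result<T>(false, default(T), technicalMessage, userMessage);
}

public static class Calculator
{
    // A zero divisor is a foreseeable input error, so no exception is thrown.
    public static Result<decimal> Divide(decimal dividend, decimal divisor)
    {
        if (divisor == 0m)
        {
            return Result<decimal>.Fail(
                "Calculator.Divide was called with divisor 0.",
                "Division by zero is not possible. Please enter a different divisor.");
        }

        return Result<decimal>.Ok(dividend / divisor);
    }
}
```

A caller checks Success first: on failure it can log TechnicalMessage and show UserMessage, on success it continues with Value.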
The example is deliberately simplistic and may lead to discussions. Whether a result
object should be used or not is in fact very much a contextual decision. Within
components it is quite possible to return concrete data types, provided that either
no errors can occur at all or any error is so severe that it has to result in an
exception.
Handling Errors
There are several ways to handle errors. In the following section, we will briefly
discuss these to give you a better sense of what to think about in each case.
Ignore
If possible, error conditions should not occur in the first place. If an error is so unlikely
that concrete handling seems too costly, then it should at least be logged. However,
this should be discussed with the product owner and quality assurance.
Retry
Based on the context, it can be quite helpful to simply try an action again if it does
not succeed instead of throwing an exception immediately. However, this should be
discussed with the product owner and quality assurance.
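A retry can be sketched as a small helper that repeats an action a limited number of times before letting the last exception propagate. The helper name Retry.Execute and its parameters are assumptions made for this illustration.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs the action up to maxAttempts times and waits for the given delay
    // between attempts. The exception of the last attempt is not swallowed.
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Not the last attempt yet: wait and try again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

Which exceptions are worth retrying, how many attempts are acceptable and how long to wait are exactly the points to agree on with the product owner and quality assurance.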
Logging
Logging is a very extensive topic and has therefore been moved to a separate chapter
(→VII.10.3).
User Interactions
• Notification - Information for the user without direct influence on the
processing. Example: “System configuration faulty, Sales Unit is shutting down!”
• Confirmation - Information for the user that requires input from them,
allowing them to influence the processing. Example: “File could not be read.
Should it be tried again? Yes/No”
It may happen that exceptions are not caught. In that case they end up in the final
exception handler. This logs the error and shuts down the application in a structured
way. Such errors should not occur in a production environment!
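A final exception handler of this kind can be sketched with the AppDomain.CurrentDomain.UnhandledException event of .NET. The class name FinalExceptionHandler and the two delegates for logging and shutdown are assumptions; a WPF application would typically also subscribe to the dispatcher's unhandled exception event, which is left out here.

```csharp
using System;

public sealed class FinalExceptionHandler
{
    private readonly Action<string> _logFatal;
    private readonly Action _shutdown;

    // Both delegates are hypothetical hooks into logging and structured shutdown.
    public FinalExceptionHandler(Action<string> logFatal, Action shutdown)
    {
        _logFatal = logFatal ?? throw new ArgumentNullException(nameof(logFatal));
        _shutdown = shutdown ?? throw new ArgumentNullException(nameof(shutdown));
    }

    // Subscribes to exceptions that no other code has caught.
    public void Register()
    {
        AppDomain.CurrentDomain.UnhandledException +=
            (sender, args) => Handle(args.ExceptionObject as Exception);
    }

    // Logs the error first, then shuts the application down in a structured way.
    public void Handle(Exception exception)
    {
        _logFatal($"Unhandled exception: {exception?.GetType().Name}: {exception?.Message}");
        _shutdown();
    }
}
```

Keeping Handle separate from Register makes the logging and shutdown order testable without actually crashing a process.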
Logging
When searching for the causes of errors, it is very helpful to know how the software
behaves in its real environment. This is particularly important in the context of the
wizard, since the sales units represent closed systems that are difficult to access from
the outside. Remote debugging and similar techniques are currently not possible. The
production logs are therefore the only way to obtain information about the use and
status of the sales unit.
The ILogger interface is used for logging in the portal and in the wizard. This is
provided centrally and can be requested by any class via its constructor.
Log Level
Fatal A Fatal log entry is written whenever the stability of the entire software is
negatively affected. For example, the Final Exception Handler should always
write a Fatal log entry when unhandled exceptions occur.
Error Error log entries indicate that something has happened that should not
have happened, without automatically affecting the stability of the
software. The most common case for Error log entries is a caught exception
that could be handled.
Warning Warnings are not necessarily errors, but indicate that something is wrong.
They can occur, for example, if the connection to the printer is lost. In such
a case, the code writes a warning and tries to reconnect after a few seconds.
If it cannot establish a connection after several further attempts, it logs an
error instead of a warning. If the error persists for a long time, it
results in a Fatal log entry.
Debug Debug logs are used by developers when debugging. This log level is not
available in the production environment and is only used in the test
environments.
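The escalation described for the lost printer connection can be sketched as a function that maps the number of failed reconnection attempts to a log level. This is an illustration only: the thresholds are assumptions, and counting attempts stands in for "persists for a long time", which a real implementation might measure as elapsed time instead.

```csharp
using System;

// Log levels as listed in the table above.
public enum LogLevel { Debug, Warning, Error, Fatal }

public static class ReconnectEscalation
{
    // errorAfter and fatalAfter are hypothetical thresholds.
    public static LogLevel LevelFor(int failedAttempts, int errorAfter = 3, int fatalAfter = 10)
    {
        if (failedAttempts >= fatalAfter)
        {
            return LogLevel.Fatal;   // problem has persisted for a long time
        }

        if (failedAttempts >= errorAfter)
        {
            return LogLevel.Error;   // several attempts failed
        }

        return LogLevel.Warning;     // first failures: warn and retry
    }
}
```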
In addition to the log level and possible exceptions, the context in which the
information occurred and when the log was written must also be logged. The more
meaningful the logs, the easier it is to analyze them later.
Logged data must not allow any conclusions to be drawn about individuals, including
contact details. Furthermore, it is not permitted to write account details or similar
sensitive information in the log. In order to allow better traceability through the
different layers or services, a unique session ID is automatically assigned to each
order transaction, and this is automatically part of each log message.
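How a session ID can become part of every log message without each caller having to add it can be sketched with a small decorator around the logger. The ILogger interface below is a deliberately reduced, hypothetical stand-in and not the interface actually used in the portal and the wizard; SessionLogger, InMemoryLogger and OrderService are likewise illustrative names.

```csharp
using System;
using System.Collections.Generic;

// Reduced, hypothetical logger interface for this sketch.
public interface ILogger { void Log(string level, string message); }

// Collects log lines in memory so that the effect is easy to inspect.
public sealed class InMemoryLogger : ILogger
{
    public List<string> Lines { get; } = new List<string>();

    public void Log(string level, string message) => Lines.Add($"{level} {message}");
}

// Decorator that prefixes every message with the session ID of the order transaction.
public sealed class SessionLogger : ILogger
{
    private readonly ILogger _inner;
    private readonly Guid _sessionId;

    public SessionLogger(ILogger inner, Guid sessionId)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _sessionId = sessionId;
    }

    public void Log(string level, string message) =>
        _inner.Log(level, $"[session={_sessionId}] {message}");
}

// Any class requests the logger via its constructor.
public sealed class OrderService
{
    private readonly ILogger _logger;

    public OrderService(ILogger logger)
    {
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    // Logs only non-personal data such as the number of images.
    public void PlaceOrder(int imageCount) =>
        _logger.Log("Debug", $"Order placed with {imageCount} images");
}
```

OrderService does not know about session IDs at all, which keeps the rule "every log message carries the session ID" in one place.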
Domain Model
The domain model is a technical data model that helps both in the design of interfaces
and in the communication between the various project participants. The following
domain model describes the considerations in the context of the business processes
of the FotoMaX company.
This domain model is highly simplified and serves the purpose of a practical
training. Therefore, it does not claim to be technically correct in its entirety.
For example, large parts of Customer Management and Partner Management are
missing. The domain Operations will only be found in this chapter and in no other,
as it would have greatly complicated all other examples.
Entity Description
Customer Customer who initiates an order and provides images. Both
activities can be done via a website, app or a sales unit.
Invoice End customer invoice that settles the amount due resulting from an
order. In the standalone case there is no connection to the customer.
Print Job Contains images that are forwarded together with various
parameters to a print partner for the actual printing. An order on a
sales unit automatically leads to the creation of a print job.
Partner Prints print jobs and invoices the related services as Incoming
Invoice.
Print Partner Prints print jobs and invoices the related services as Incoming Invoice.
Sales Partner Establishes sales units. In the connected case, receives a commission
for each order placed, which is paid as an Incoming Invoice. In
standalone mode, a monthly fee is issued as an outgoing invoice.
Sales Unit A sales unit that is set up at a sales partner and triggers orders.
Maintenance Job A maintenance order to a technician for one or more sales units.
Service Technician Service technician who accepts and processes maintenance orders
for sales units.
Context
The basic technology of a software determines not only which APIs are available
for development, but also which ecosystem will have an influence on the software
in the long term. This ecosystem includes both free and proprietary libraries and
frameworks. The selected technology also determines future viability, licensing costs
and maintenance efforts.
In order to prevent technological fragmentation as much as possible, FotoMaX must
have a base technology that can be used in the long term on the sales units, the portal,
and possibly also on the end users’ (mobile) devices. In addition, it must be stable
and cost-effective to avoid unexpectedly high costs of maintenance, operation, and
licensing.
In the course of setting up a new customer portal, a decision must be made about
which basic technology should be used.
State
Accepted.
Decision
Consequences
Higher licensing costs are generally to be expected when using .NET and Microsoft
technologies as opposed to other technologies. These include the costs for develop-
ment environments, database servers and operating systems. However, economies of
scale can be exploited here by standardizing corporate IT.
Context
The architecture review of May 12, 2019, determined that the architecture of the
Wizard is too difficult to maintain. Therefore, an alternative architecture is to be
developed. The following architecture patterns have been compared against the quality
requirements to enable a selection. Please note: The evaluation was made based
on an existing and complex software structure, which rules out a completely new
development.
Monolith is an architecture pattern where the software system can only function
as a whole during development and runtime. Separations between the
(technical) concerns must be enforced organizationally, because the
structure itself separates them only to a limited extent (all code is
developed in the same repository). This means less organizational effort
for changes but more organizational effort for maintenance.
SOA According to Wikipedia, “The term SOA has slightly different meanings.
The first meaning is quite broad, describing it as a software design that
divides functions into distinct units that developers make accessible over
a network so that users can combine and reuse them when building
applications. The other meaning describes it in more detail, following
specific protocols and responsibilities.” This comparison is based on the
broader definition.
For monoliths, the entire structure needs to be understood, as its parts may not be
decoupled. SOA and Microservices become complicated when the whole picture needs to
be understood. This can be done with special tools, but is more difficult than with
moduliths, where the software exists as a whole.
Appropriateness
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern.
Authenticity
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern.
Completeness
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern.
Confidentiality
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern. However, services need to be secured more at
runtime because they are deployed independently and separated from each other.
This means a larger attack surface.
Fault Tolerance
Monolith Modulith SOA Microservices
Poor Medium Good Good
Moduliths work as “deployment monoliths” and thus have the same advantage.
The infrastructure for installing microservices is more difficult to set up than the
infrastructure for just a few services.
Integrity
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern. However, services simply need to be secured
more at runtime because they are provided separately. This means a larger attack
surface.
Interoperability
Services use open interfaces that facilitate connecting external services. The Mod-
ulith is better prepared for such connections because it can be equipped with the
appropriate adapters more quickly than is the case with the Monolith.
Maturity
When used correctly, all architectural patterns can ensure the proper functioning of
the software system.
Modifiability
Monoliths are difficult to modify because their internal structure is less likely to be
decoupled and therefore changes to one part can affect other parts. Microservices, on
the other hand, operate as independently as possible.
Modularity
Microservices are the most modular systems possible. For SOA, modularity depends on
the number and size of the services. Moduliths are built with modularity in mind.
Recoverability
Monoliths only work as a whole, so errors are more likely to affect the entire system.
With moduliths, individual components can be examined separately more readily than
with monoliths. Service-oriented architectures are associated with a whole range of
tools that can be used to restore the status of individual services as quickly as possible.
Testability
Moduliths are easy to test because they can run as a whole but also split into their
modules. Service-based architectures are sometimes hard to test as a whole, and
monoliths cannot be decomposed.
Time behavior
Monolith Modulith SOA Microservices
Good Medium Poor Poor
Performance is lost with each layer of abstraction. Service-based systems have many
more abstraction layers than software systems running in the same process.
This quality attribute must be addressed by specific implementation details that are
not part of the architecture pattern.
Evaluation is based on ISO 25010.
Decision
Accepted - 20.05.2019
The future Wizard will be implemented as a Modulith. This allows the best
backwards compatibility and a smooth migration of the existing system. The actual
advantages of service-oriented architectures and microservices cannot be exploited
in the case of the deployment scenario.
Consequences
This chapter is meant to show a more complex ADR as an example of how these
can be used to justify design decisions and strategies.
Context
Part of the further development of the Wizard is its decomposition into workflow
modules and service modules. This document describes which grouping and priority
should be taken into account in the decomposition. This is being coordinated with
various stakeholders. It should also be noted that this decomposition is currently
still very rough and is primarily intended as a preparation for creating a minimally
modular application. In a further restructuring step, a functional decomposition can
be performed on the basis of the target architecture.
Workflow Engine
Description This group describes all the logic that is used for the general
implementation of dynamic workflows and navigation. It is not (!) about the
actual workflows, but only about their infrastructure.
Effort The code has to be rewritten as much as possible because such a procedure
was not foreseen when the existing application was designed.
Risk The biggest risks come from the fact that the workflow engine has to be
freely configurable and must not have any dependencies to the concrete
workflows at individual partners. Since it is a central control element,
errors here may have an impact all the way to the end customer.

Special Workflows
Description Dialogs, workflows, product descriptions, etc., the use of which can
be configured via the partner assets.
Effort The same applies here as for the partner assets, only in increased form.
Risk The same applies here as for the partner assets, only in increased form.
Decision
Accepted 27.11.2019
Due to the low effort and low risk of the conversion measures, it was decided to
first move the codecs and the image algorithms into a separate service module.
Although this does not serve the goal of a functional modularization of the wizard, it
does allow the migration to be run through once with a low investment before more
complex and riskier conversion measures are undertaken.
After the migration of the codecs and the image algorithms, the implementation of
the general workflow logic as an independent module will be pushed forward. Again,
this does not serve the long-term goal of functional modularization, but it does allow
such a central and important component to be decoupled and easily testable from
the outset.
Based on the resulting structures, the areas of partner assets and specific workflows
can then be implemented.
Consequences
While the initial modules will most likely not yet conform to the long-term architec-
ture, they will be designed to be easily adaptable. Once the workflow logic has been
implemented, further analysis of the legacy code and more detailed planning of the
migration will need to take place in consultation with the various stakeholders.
This chapter shows how decision matrices can be used to formalize the
comparison of different solution alternatives.
Context
Candidates (WPF): PRISM, Catel
Score: PRISM 78, Catel 55

Documentation (rank 3)
How much information is available? How well is it written? Is it up-to-date?
PRISM 3 excellently documented
Catel 2 partial (Stack Overflow); many base classes; property bag (Undo/Redo
available); one-man show but fast reaction; huge community

View Handling (rank 3)
Is it possible to switch between views?
PRISM 3 wire view/viewModel declaratively in XAML (DI is possible)
Catel 1 no special info found

Messaging (rank 2)
Does it have a message bus?
PRISM 3 EventAggregator; messaging the UI from background threads is possible
Catel 3 via static Message Mediator

.NET 5 compatibility (rank 3)
PRISM 3 yes
Catel 2 .NET Core 3.1

Sustainability (rank 3)
How long does it exist? Are there signs that it will be closed down in the future?
PRISM 3 yes
Catel 3 yes (2008)
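How a score can be derived from such a matrix is sketched below, assuming that each rating is weighted with the rank of its criterion and the products are summed. This weighting rule is an assumption, as are the type names Criterion and DecisionMatrix. Only the five criteria reproduced in this chapter are included, so the resulting subtotals are not the overall scores of 78 and 55, which presumably cover further criteria.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One row of the matrix: the weight (rank) and the rating per candidate.
public sealed record Criterion(string Name, int Rank, int Prism, int Catel);

public static class DecisionMatrix
{
    // Ranks and ratings of the criteria shown in this chapter.
    public static readonly IReadOnlyList<Criterion> VisibleRows = new[]
    {
        new Criterion("Documentation", 3, 3, 2),
        new Criterion("View Handling", 3, 3, 1),
        new Criterion("Messaging", 2, 3, 3),
        new Criterion(".NET 5 compatibility", 3, 3, 2),
        new Criterion("Sustainability", 3, 3, 3),
    };

    // Assumed rule: score = sum of rank * rating over all criteria.
    public static int Score(IEnumerable<Criterion> rows, Func<Criterion, int> rating)
    {
        return rows.Sum(row => row.Rank * rating(row));
    }
}
```

With these rows, DecisionMatrix.Score(DecisionMatrix.VisibleRows, row => row.Prism) yields 42 and the same call with row => row.Catel yields 30.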
Decision
Accepted - 13.10.2020
The Prism framework was chosen as the application framework for the Wizard
because it meets all the requirements. Static code analysis must be used to ensure
that there are no dependencies on the framework outside of bootstrapping, module
definition and the user interface. In these parts, dependencies can be tolerated, since
there is a dependency on WPF here anyway. However, if a platform-independent
technology is chosen in the future, then a new development of the entire wizard
should not be necessary because of Prism!
Consequences
operating system. At the same time, the migration of the existing application is easier
because it has already been implemented with WPF.
R3 WPF does not allow platform independence
WPF is used as the front-end technology of the Wizard. The entire architecture of
the Wizard is geared towards the use of WPF and thus replacement is not easily
possible.
WPF mandates the use of Windows. The licensing costs for Windows are high compared
to Linux distributions. The WPF add-on libraries used in the software hinder the
actual compatibility to newer .NET versions.
The incompatible libraries are replaced. Then the application is migrated to a
newer .NET version. Then the need for platform independence is re-evaluated.

R5 Global context for image exchange
All loaded images are kept in a static context class at runtime.
The logic to fill the context and empty it again after the end of a workflow is
very complex. In the past, there were cases where customers saw other customers’
images. This has been fixed but it must not happen again under any circumstances!
A restructuring of the image algorithms has top priority in the restructuring of
the wizard.
VII.13 Glossary
Term Description
Assets Image files, color definitions, fonts, workflows, etc., through which the
Sales Unit is adapted to the corporate identity of a pitch provider.
Pitch provider A company that offers photo printing services via FotoMaX and provides
a space for this purpose on their own premises.
Sales Unit Sales units consist of hardware components (screen, computer, etc.) and
the Wizard software. If applicable, an external printer is also offered but
does not count as part of the unit.
Onboarding
Note that this chapter is a direct result of the quality requirements and scenarios
regarding maintainability, which state that new team members should be able to
be productive within a certain amount of time.
This chapter describes resources to help new team members get the best possible start
on the project.
The following checklist is intended to help new team members find their way around
the team as quickly as possible and be able to work productively.
Installation Guide
Tutorial
Guidelines
Coding Guidelines
This chapter would contain descriptions of how the code should be styled, if this
was a real documentation. In general, the coding guidelines should mainly deal
with special cases instead of explaining the code style in detail. The latter can be
better checked by static code analysis tools and/or linters. Things to consider are,
for example:
• Mention of the code style used, with a link to additional sources and an
explanation of how it is checked.
• Dealing with exceptions (if supported by the language)
• Dealing with enumerations (if supported by the language)
• Common (Anti-)Patterns to avoid
Testing Guidelines
This chapter should contain descriptions on how automated or manual tests are to
be performed. It is worthwhile at this point to formulate guidelines individually for
each type of test in order to be able to design them in a way that is appropriate for
the target group.
• Unit Test Guidelines
• Integration Test Guidelines
• Exploratory Test Guidelines
• …
Review Guidelines
Code reviews are a very important means of quality assurance. This document is
intended to provide reviewers with a guideline to decide whether or not a pull request
may be included in the main branch. This guideline only includes things that are not
already checked by an automated check through build tools.
Precondition of a review
Check Description
☐ The review is performed by a person who was not involved in the programming
themselves.
☐ The code was checked out locally by the reviewer.
☐ The code was built locally.
☐ Automated tests were executed locally.
Criteria
Check Description
☐ The software was built successfully.
☐ The software can be started locally and simple operations in the area of the
change are possible without complications.
☐ No exceptions were detected during the execution of the code (see Exception
Information of the debugger).
☐ All automated tests can be executed successfully.
☐ There are no ignored tests.
☐ If tests were deleted, this was done justifiably or they were replaced by others.
☐ There are no warnings in the local development environment.
☐ There is no code that suppresses warnings in the code analysis tools.
☐ The code satisfies the rules from the Coding Guidelines.
☐ The tests satisfy the rules from the Testing Guidelines.
Daily Work
This part of the documentation is reserved for all topics regarding the organization
of the daily work of the team.
This area would be handled separately by each team. Only one example is
given below to keep the overall documentation short.
Team Charter
The team charter is a set of rules created by each team itself to define how it wants to
work together. It should only contain things that all members can agree on. Examples:
• We are punctual.
• Problems are resolved promptly and factually.
• Documentation and testing are as important as source code.
• We maintain a positive feedback culture.
• Cell phones may not be used during an appointment.
Definition of Ready
With the Definition of Ready, the team determines which prerequisites must be met
before the actual development can start. The DoR was created by the team and each
team member agreed on it.
• The requirements are described as user stories and placed in the backlog.
• The user story has been assigned a business value.
• The user story is understood by all team members.
• Testable acceptance criteria are available.
• External dependencies have been named.
• The user story has been estimated in story points.
Definition of Done
With the Definition of Done, the team determines when it is finally finished with the
work on a feature. The DoD was created by the team and each team member agreed
on it.
• A code review was performed by a developer who was not involved in the
implementation.
• All acceptance criteria have been implemented.
• Test coverage is at least 75%.
• All tests are successful.
• The code has been transferred to the Main Branch.
• There was a formal review by the product owner.
Sprint Reviews
Results or resources from and for the sprint reviews can be stored in this area. In this
way, the information is not lost and can contribute to continuous improvement.
Vacation Planning
If there is no better solution, this area of the documentation can be used to track the
availability of team members and plan vacations. This allows more effective sprint
planning.
VIII - Mac-OS Menubar Application
By Gernot Starke.
This is surely one of the shortest and most compact architecture documentations.
Its goal is to show that arc42 can be tailored heavily and reduced to a
minimal amount of documentation.
I implemented this example as a brief excursion into the Swift programming
language. I did not want to invest in documentation, but I did want to record the
most important facts about this implementation.
The result: A few keywords on requirements, names of some important OS classes,
a few links to resources I found useful during implementation.
I used a mindmap instead of text, tables and diagrams.
AG and led the first European Java project (the Janatol project for Hypobank in
Munich).
Since then he has consulted and coached numerous clients from various domains,
mainly finance, insurance, telecommunication, logistics, automotive and industry on
topics around software engineering, software development and development process
organization.
Gernot was an early adopter of the agile movement and has successfully worked as
Scrum master in agile projects.
He lives in Cologne with his wife (Cheffe Uli) and his two (nearly grown-up) kids,
two cats and a few Macs.
Email Gernot Starke¹⁵¹ or contact him via Twitter @gernotstarke¹⁵².
¹⁵¹mailto:[email protected]?subject=arc42%20By%20Example
¹⁵²https://twitter.com/gernotstarke
The Authors
Hendrik Lösch
¹⁵³http://www.zeiss.com
¹⁵⁴http://www.hendrik-loesch.de
Michael Simons
¹⁵⁵https://www.fz-juelich.de/portal/DE/Home/home_node.html
¹⁵⁶https://neo4j.com
¹⁵⁷http://springbootbuch.de
¹⁵⁸http://michael-simons.eu
¹⁵⁹mailto:[email protected]
¹⁶⁰https://twitter.com/rotnroll666
Stefan Zörner
About embarc
Do you want to build a strong foundation for architecture documentation across your
organization? Do you want to use a standardized template such as arc42?
Architecture documentation isn’t what you might think of first: baroque diagrams,
large concepts, etc. Start your architectural overview with only a few basic ‘ingredients’.
Our cheat sheet¹⁶³ gives you a quick and helpful introduction.
¹⁶¹mailto:[email protected]
¹⁶²https://www.embarc.de
¹⁶³https://www.embarc.de/architektur-spicker/
¹⁶⁴https://www.embarc.de/themen/dokumentation/
Ralf D. Müller
¹⁶⁵mailto:[email protected]
¹⁶⁶https://github.com/docToolchain/docToolchain
¹⁶⁷https://docs-as-co.de
Gernot Starke
Just email Gernot Starke¹⁶⁸ or contact him via Twitter as @gernotstarke¹⁶⁹.
Hendrik Lösch
Just email Hendrik Lösch¹⁷⁰ or contact him via his website hendrik-loesch.de¹⁷¹.
Michael Simons
Just email Michael Simons¹⁷² or contact him via Twitter as @rotnroll666¹⁷³.
Ralf D. Müller
Just email Ralf D. Müller¹⁷⁴ or contact him via Twitter: @RalfDMueller¹⁷⁵.
Stefan Zörner
Just email Stefan Zörner¹⁷⁶ or contact him via Twitter: @StefanZoerner¹⁷⁷.
¹⁶⁸mailto:[email protected]?subject=arc42%20By%20Example
¹⁶⁹https://twitter.com/gernotstarke
¹⁷⁰mailto:[email protected]?subject=arc42%20By%20Example
¹⁷¹https://hendrik-loesch.de
¹⁷²mailto:[email protected]?subject=arc42%20By%20Example
¹⁷³https://twitter.com/rotnroll666
¹⁷⁴mailto:[email protected]
¹⁷⁵https://twitter.com/RalfDMueller
¹⁷⁶mailto:[email protected]
¹⁷⁷https://twitter.com/StefanZoerner
Further Reading
If you liked the way we explained, communicated and documented software architectures,
you might want to know more about arc42.
It contains more than 200 practical tips on how to improve your architecture communication
and documentation:
¹⁷⁸https://leanpub.com/arc42inpractice
In addition to definitions this eBook also contains translation tables, currently for
German and English.
You can find it at https://leanpub.com/isaqbglossary¹⁸¹.
¹⁸¹https://leanpub.com/isaqbglossary
¹⁸²https://www.amazon.de/arc42-Aktion-Praktische-Tipps-Architekturdokumentation/dp/3446448012