
Microservices

Design Patterns with Java


70+ patterns for designing,
building, and deploying
microservices

Sergey Seroukhov

www.bpbonline.com
First Edition 2024

Copyright © BPB Publications, India

ISBN: 978-93-55517-005

All Rights Reserved. No part of this publication may be reproduced, distributed,
or transmitted in any form or by any means, or stored in a database or retrieval
system, without the prior written permission of the publisher, with the exception
of the program listings, which may be entered, stored, and executed in a
computer system, but cannot be reproduced by means of publication,
photocopy, recording, or by any electronic or mechanical means.

LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY


The information contained in this book is true and correct to the best of the author’s
and publisher’s knowledge. The author has made every effort to ensure the
accuracy of these publications, but the publisher cannot be held responsible for any
loss or damage arising from any information in this book.

All trademarks referred to in the book are acknowledged as properties of their
respective owners, but BPB Publications cannot guarantee the accuracy of this
information.

www.bpbonline.com
Dedicated to

My parents Anatoly and Ludmila,
my wife Natalya, and kids Michael and Alexandra
About the Author

Sergey Seroukhov, a passionate Technology Evangelist,


resides in Tucson, AZ, with his wife Natalya and two
children, Michael and Alexandra. He is the visionary founder
of Enterprise Innovation Consulting, a boutique consulting
firm that empowers development teams to embrace modern
development methods, enhance productivity, reduce costs,
accelerate time to market, and foster innovation.
Enterprise Innovation Consulting (https://www.entinco.com/)
was founded in 2016, driven by the dream of helping
software teams build more and better software faster. With
his group of talented engineers, Sergey assisted multiple
organizations in developing complex enterprise systems
utilizing microservices, microfrontends, and DevOps. Moving
the business to the next level, Enterprise Innovation
Consulting, under Sergey’s leadership, created several
programs called “Better Microservices
(https://www.entinco.com/programs/better-microservices),”
“Better Microfrontends
(https://www.entinco.com/programs/better-microfrontends),”
“Better Delivery (https://www.entinco.com/programs/better-delivery),”
and “Better Testing (https://www.entinco.com/programs/better-testing)” that
aimed at drastically improving development productivity
through standardization of architecture and implementation
of development patterns and practices. Moreover, all those
programs represented the 1st step toward the “Software
Factory”, a new development model that brings a step-
change in productivity and cost reduction through
standardization, deeper specialization of labor, and
conveyor-like development processes. Emerging Generative
AI combined with the Software Factory
(https://www.entinco.com/programs/software-factory) model
represents a perfect fit, allowing the systematic and
incremental increase of automation in software
development until it finally reaches the “Light-off Factory”
state when most of the software is generated automatically.
The world is not there yet, but the work of visionaries like
Sergey Seroukhov and companies like Enterprise Innovation
Consulting is making that future come sooner.
Sergey's journey in coding began at the age of 14, and he
implemented his first commercial software product using
dBase around 1991, even before graduating from high
school. After completing his master's degree at Donetsk
State Technical University in 2001, he embarked on a new
chapter in the United States with his wife, Natalya. Over the
next two decades, Sergey honed his skills, working as a
Software Developer, Team Lead, Solution Architect, and
eventually as a CTO in several startups. His foray into
microservices started around 2005, leading the creation of a
distributed system architecture composed of loosely
coupled services and composable frontends. Since 2012,
when microservices gained recognition, he has been
instrumental in the development of numerous microservices
systems, using a wide range of programming languages like
.NET, Java, Node.js, Go, Python, and Dart.
About the Reviewers

❖ Praharsh Jain is a passionate programmer and an


information security enthusiast with hands-on experience
developing web, mobile and desktop applications using
multiple technology stacks.
As a seasoned engineering leader, he loves tackling
complex problems and mentoring other team members.
He has extensive professional experience in creating
highly scalable backends leveraging technologies like
Java, Golang and Node.js.
With a background in Computer Science, Praharsh
possesses strong knowledge of CS fundamentals.
He is currently working with Grab as a part of the Risk
Engineering team creating and maintaining systems that
are pivotal in evaluating millions of real-time transactions
to prevent fraud.
❖ Venkata Karthik Penikalapati is a distinguished
software developer with a decade's expertise in
distributed systems and AI/ML pipelines, holding a
Master's in Computer Science from the University at
Buffalo. At Salesforce's Search Cloud, Venkata drives
Data, AI and ML innovation, showcasing his influence in
tech. An accomplished speaker and published author, his
insights resonate at conferences and in scholarly papers,
highlighting his thought leadership. Venkata's role in
evolving AI-driven solutions marks him as a pivotal figure
in technology's future.
Acknowledgement

The journey of writing this book has been a profoundly


rewarding experience, made possible by the unwavering
support and contributions of several key individuals and
groups.
My deepest gratitude goes to my family, whose
encouragement and belief in my work have been the
bedrock of my motivation. Their support has been
invaluable throughout this process.
The team at Enterprise Innovation Consulting has played a
crucial role, sharing their expertise and experiences in
microservices systems, which have greatly enriched the
content of this book. Their dedication has been instrumental
to our collective success.
I extend my thanks to Venkata Karthik and Praharsh Jain for
their meticulous technical review, which has significantly
enhanced the book's quality. Eugenio Andrieu's editing and
Danil Prisyazhniy's preparation of code samples have been
vital in ensuring the clarity and applicability of the material
presented.
The entire team at BPB Publications has been exceptional in
guiding me through the book writing and publishing process,
helping to refine and polish the content to fit within the
confines of the book without compromising its richness.
This book is a testament to the collaboration, expertise, and
support of each individual mentioned and more. To everyone
involved, thank you for helping turn this vision into reality.
Preface

In the evolving landscape of software architecture,


microservices have emerged as a cornerstone for building
scalable, resilient systems. Microservices Design
Patterns with Java is crafted for professionals navigating
this complex domain, offering over 70 design patterns and
practices essential for developing robust microservices.
Tailored for architects, team leads, developers, and DevOps
engineers with a solid grounding in microservices and Java,
this book serves as a comprehensive guide to mastering the
intricacies of microservices architecture.
With a practical approach, we present patterns ranging from
architectural design to deployment, each accompanied by
Java code examples. This format allows readers to apply the
concepts directly to their projects, facilitating a deeper
understanding and immediate implementation. The book is
structured as a flexible reference, enabling professionals to
explore topics in any order and apply patterns to various
challenges.
Distinguishing itself in a crowded field, this publication
targets experienced practitioners, offering a concise
compilation of established and emerging patterns. It aims to
equip readers with the knowledge and tools to tackle the
challenges of microservices development, ensuring the
delivery of efficient, scalable, and reliable systems.
Chapter 1: Defining Product Vision and Organization
Structure - Explores the importance of aligning
microservices with organizational structure and product
vision for successful implementation.
Chapter 2: Architecting Microservices Systems -
Introduces architectural patterns for decomposing systems
into microservices, covering communication styles, security
models, and deployment strategies.
Chapter 3: Organizing and Documenting Code -
Discusses best practices for structuring and documenting
microservices code to ensure maintainability and scalability.
Chapter 4: Configuring Microservices - Covers various
configuration strategies for microservices at different
lifecycle stages, emphasizing dynamic configuration for
flexibility.
Chapter 5: Implementing Communication - Details
synchronous and asynchronous communication patterns,
including HTTP/REST, gRPC, and message-driven
approaches, ensuring efficient service interaction.
Chapter 6: Working with Data - Presents data
management patterns for microservices, including CRUD,
CQRS, event sourcing, and strategies for database
architecture.
Chapter 7: Handling Complex Business Transactions -
Explores patterns for managing business transactions in
microservices, including state management, distributed
transactions, and reliability strategies.
Chapter 8: Exposing External APIs - Discusses designing
and securing external APIs, highlighting the importance of
API gateways, authentication, and versioning for external
integration.
Chapter 9: Monitoring Microservices - Introduces
monitoring strategies for microservices, covering logging,
metrics collection, distributed tracing, and health checks.
Chapter 10: Packaging Microservices - Explores
packaging strategies for deploying microservices across
various platforms, including Docker, serverless, and
traditional JEE servers.
Chapter 11: Testing Microservices - Details patterns for
automating microservices testing, covering both functional
and non-functional aspects to ensure robustness and
performance.
Chapter 12: Scripting Environments - Discusses the use
of scripted environments for efficient microservices delivery,
emphasizing automation in infrastructure management.
Chapter 13: Automating CI/CD Pipelines - Introduces
continuous integration and continuous delivery pipelines
tailored for microservices, focusing on incremental delivery
and secure deployment strategies.
Chapter 14: Assembling and Deploying Products -
Provides a guide to assembling and deploying
microservices-based products, covering product packaging,
version management, and deployment strategies.
Code Bundle and Coloured
Images
Please follow the link to download the
Code Bundle and the Coloured Images of the book:

https://rebrand.ly/b4tfcj2
The code bundle for the book is also hosted on GitHub at
https://github.com/bpbpublications/Microservices-Design-Patterns-with-Java.
In case there’s an update to
the code, it will be updated on the existing GitHub
repository.
We have code bundles from our rich catalogue of books and
videos available at https://github.com/bpbpublications. Check them out!

Errata
We take immense pride in our work at BPB Publications and
follow best practices to ensure the accuracy of our content
to provide our subscribers with an engaging reading
experience. Our readers are our mirrors, and we use their
inputs to reflect and improve upon human errors, if any, that
may have occurred during the publishing processes
involved. To let us maintain the quality and help us reach
out to any readers who might be having difficulties due to
any unforeseen errors, please write to us at:
[email protected]
Your support, suggestions, and feedback are highly
appreciated by the BPB Publications’ family.

Did you know that BPB offers eBook versions of every book published, with
PDF and ePub files available? You can upgrade to the eBook version at
www.bpbonline.com and as a print book customer, you are entitled to a
discount on the eBook copy. Get in touch with us at:
[email protected] for more details.
At www.bpbonline.com, you can also read a collection of free technical
articles, sign up for a range of free newsletters, and receive exclusive
discounts and offers on BPB books and eBooks.

Piracy
If you come across any illegal copies of our works in any form on the internet,
we would be grateful if you would provide us with the location address or
website name. Please contact us at [email protected] with a link to
the material.

If you are interested in becoming an author


If there is a topic that you have expertise in, and you are interested in either
writing or contributing to a book, please visit www.bpbonline.com. We have
worked with thousands of developers and tech professionals, just like you, to
help them share their insights with the global tech community. You can make
a general application, apply for a specific hot topic that we are recruiting an
author for, or submit your own idea.

Reviews
Please leave a review. Once you have read and used this book, why not leave
a review on the site that you purchased it from? Potential readers can then
see and use your unbiased opinion to make purchase decisions. We at BPB
can understand what you think about our products, and our authors can see
your feedback on their book. Thank you!
For more information about BPB, please visit www.bpbonline.com.

Join our book’s Discord space


Join the book’s Discord workspace for the latest updates,
offers, tech happenings around the world, new releases, and
sessions with the authors:
https://discord.bpbonline.com
Table of Contents

1. Defining Product Vision and Organization Structure
Introduction
Structure
Objectives
Microservices adoption goals
Problem
Scalability
Productivity
Time to Market
Innovation
Incremental delivery
Problem
Solution
Development Model
Problem
Agile Workshop
Software factory
Organization structure
Problem
Feature delivery teams
Platform teams
Integration teams
Microservices adoption process
Problem
Solution
Antipatterns
Conclusion
References
Further reading

2. Architecting Microservices Systems


Introduction
Structure
Objectives
Microservice definition
Problem
Solution
Architectural decomposition
Problem
Functional decomposition
Data decomposition
Domain-driven design
Layered architecture
Microservice sizing
Problem
Solution
Communication style
Problem
Synchronous microservices
Message-driven microservices
Event-driven microservices
Business logic coordination and control flow
Problem
Orchestration
Choreography
Security model
Problem
Zero trust model
Secure perimeter
Cross-platform deployments
Problem
Symmetric deployments
Asymmetric deployments
Tenancy
Problem
Single-tenancy
Multi-tenancy
Development stacks
Problem
Platform-specific frameworks
Cross-platform frameworks
Polyglot and cross-platform frameworks
Conclusion
References
Further reading

3. Organizing and Documenting Code


Introduction
Structure
Objectives
Code repositories
Problem
Mono-repo
Multi-repo
Workspace
Problem
Solution
Code structure
Problem
Functional / domain-driven code structure
Type / Technology-based code structure
Code sharing
Problem
No code sharing
Shared libraries / versioned dependencies
Sidecar
Code compatibility
Problem
Full backward compatibility
Namespace versioning
Minimalist documentation
Problem
Handwritten documentation
Readme
Changelog
Todo
Commit messages
Auto code documentation
Problem
JavaDoc generation
Auto-generated comments
Code reviews
Problem
Pull request reviews
Periodic reviews
Code review checklist
Auto code checks
Microservice Chassis / Microservice template
Problem
Solution
Antipatterns
Conclusion
Further reading

4. Configuring Microservices
Introduction
Structure
Objectives
Configuration types
Problem
Solution
Day 0 Configuration
Day 1 Configuration
Day 2 Configuration
Hardcoded configuration
Problem
Solution
Static configuration
Problem
Environment variables
Config file
Configuration template / consul
Dynamic configuration
Problem
Generic configuration service
Specialized data microservice
Environment configuration
Problem
Solution
Connection configuration
Problem
DNS registrations
Discovery services
Client-side registrations
Deployment-time composition
Problem
Solution
Feature flag
Problem
Solution
Antipatterns
Conclusion
Further reading

5. Implementing Communication
Introduction
Structure
Objectives
Synchronous calls
Problem
HTTP/REST
gRPC
Asynchronous messaging
Problem
Point-to-point
Publish/subscribe
API versioning
Problem
Versioned channels
Versioned routing
API Documentation
Problem
OpenAPI
ProtoBuf
AsyncAPI
Blob streaming
Problem
Continuous streaming
Transferring blob IDs
Chunking
Commandable API
Problem
Solution
Reliability
Problem
Timeout
Retries
Rate limiter
Circuit breaker
Client library
Problem
Solution
Conclusion
Further reading

6. Working with Data


Introduction
Structure
Objectives
Data objects
Problem
Static data
Dynamic data
Object ID
Problem
Natural key
Generated key
GUID
Data management
Problem
CRUD
CQRS
Event Sourcing
Materialized View
Dynamic query
Problem
Filtering
Pagination
Sorting
Projection
Database architecture
Problem
Database per service
Database sharding
Data migration
Problem
Disruptive migration
Versioned tables
Schemaless
Antipatterns
Static queries
Shared database
Conclusion
Further reading

7. Handling Complex Business Transactions


Introduction
Structure
Objectives
Concurrency and coordination
Problem
Distributed cache
Partial updates
Optimistic lock
Distributed lock
State management
Process flow
Problem
Aggregator
Chain of Responsibility
Branch
Transaction management
Problem
Orchestrated Saga
Choreographic Saga
Compensating transaction
Workflow
Reliability
Problem
Backpressure
Bulkhead
Outbox
Delayed execution
Problem
Job Queue
Background worker
Antipatterns
Conclusion
Further reading

8. Exposing External APIs


Introduction
Structure
Objectives
External interface
Problem
API gateway
Facade
Backend for Frontend
API management
Synchronous request/response
Problem
HTTP/REST
GraphQL
Push notifications and callbacks
Problem
Webhooks
WebSockets
Authentication
Problem
Basic authentication
API key
OpenID Connect
Multi-factor authentication
Authorization
Problem
Session tracking
JWT token
OAuth 2
SSL/TLS encryption
Problem
Solution
Conclusion
References

9. Monitoring Microservices
Introduction
Structure
Objectives
Trace ID
Problem
Solution
Error propagation
Problem
Solution
Logging
Problem
Triple-layered logging
Log aggregation
Application metrics
Problem
Solution
Distributed tracing
Problem
Solution
Health checks
Problem
Solution
Conclusion
Further reading

10. Packaging Microservices


Introduction
Structure
Objectives
Microservice packaging
Problem
System process
Docker container
JEE bean
Serverless function
Cross-platform deployment
Problem
Symmetric deployments
Platform abstraction
Repackaging
Micromonolith
Problem
Solution
External activation
Problem
Cron jobs
Cron service
JEE Timer
Conclusion
Further reading

11. Testing Microservices


Introduction
Structure
Objectives
Test planning
Problem
Solution
Functional testing
Problem
Unit test
Integration test
End-to-end test
Contract test
Acceptance test
Initial state
Non-functional testing
Problem
Benchmark
Simulators
Data generator
Mock
Problem
Solution
Chaos Monkey
Problem
Solution
Conclusion
Further reading

12. Scripting Environments


Introduction
Structure
Objectives
Scripted environment
Problem
Production environment
Test environment
Development environment
Cross-platform deployments
Problem
Asymmetric environment
Symmetric environment
Dockerized environment
Deployment security
Problem
IP access lists
Traffic control rules
Management station
Environment verification
Problem
Environment testing
Infrastructure certification
Conclusion
Further readings

13. Automating CI/CD Pipelines


Introduction
Structure
Objectives
CI/CD pipeline
Problem
Incremental delivery
Multiple deployments
Application platform
Product integration
Development/DevOps delineation
Problem
Solution
Virtualized build process
Problem
Solution
Quality gate
Problem
Automated gate
Manual gate
Secure delivery
Problem
Solution
Environment provisioning
Problem
Static Environment
Spin-off environment
Branching strategy
Problem
No-branching or continuous deployment
Feature branching
Trunk-based development
Release branching
Gitflow
Delivery metrics
Problem
Detail metrics
DORA metrics
Delivery dashboard
Conclusion
Further reading

14. Assembling and Deploying Products


Introduction
Structure
Objectives
Product packaging
Problem
Kubernetes YAML manifests
Helm chart
EAR archive
Resource template
Custom script
Baseline management
Problem
Development branch
System deployment
Updates from CI/CD pipeline
Deployment strategy
Problem
Blue/green deployment
Rolling deployment
Canary deployment
Conclusion
Further reading

Index
CHAPTER 1
Defining Product Vision
and Organization Structure

Introduction
In recent years, there has been a significant rise in the
adoption of microservices. An increasing number of
software teams are currently either engaged in developing
microservices or considering them. Numerous publications
on the subject offer various patterns, and multiple
technologies pledge to simplify microservice development.
Nonetheless, eight out of ten organizations that have
adopted microservices are encountering significant issues
that prevent them from meeting their initial expectations.
The sources of these problems are seldom purely technical.
The key to success in the development of microservice
systems is understanding that microservices are not just an
architectural style. A microservice is a software component
with an independent lifecycle. It can be built by different
teams, at different times, using different technologies, and
delivered independently into production. Achieving that kind
of independence requires not only technical decisions. It
touches all areas of software development, including
organization structure, product, and project management.
This chapter introduces you to patterns at the organization
and product level that help solve this problem by setting the
right structure and direction to ensure success in
microservices development.

Structure
In this chapter, we will cover the following topics:
Microservices Adoption Goals
Scalability
Productivity
Time to Market
Innovation
Incremental Delivery
Development Model
Agile Workshop
Software Factory
Organization Structure
Feature Delivery Teams
Platform Teams
Integration Team
Microservices Adoption Process
Antipatterns

Objectives
After studying this chapter, you should be able to set clear
goals for microservice adoption, define an appropriate
organizational structure, adopt an incremental delivery
model, and assign clear roles and responsibilities to your
team. Furthermore, this chapter explores different
development models and introduces the Software Factory
development model, which facilitates substantial increases
in development productivity and cost reduction.

Microservices adoption goals


The concept of microservices extends beyond the technical
realm, encompassing a divide and conquer strategy that
empowers users to overcome the mounting complexity of
software. However, embarking on a microservices journey is
not a simple task. Success requires the involvement of both
technical and non-technical teams, and necessitates a strong
rationale linked to the organization's overall business vision.

Problem
Microservices have been a prominent topic in recent years,
with numerous success stories shared by industry giants
such as Amazon, Google, and Facebook (as depicted in
Figure 1.1). Such achievements have inspired others to
follow suit and adopt this approach to development.
However, this decision is often taken solely by the technical
team, lacking clear justification or support from management
or other stakeholders.

Figure 1.1: Large companies that shared their success stories about
microservice adoption
Often, when teams opt to develop microservices, they tend
to handle the technical aspects correctly. Their code appears
to be structured as microservices and may function as such.
However, when it is time to work on a subsequent release,
the team is overwhelmed with the extensive amount of work
they must complete. Then, after several months of hard
labor, they eventually manage to produce the long-awaited
release.
This problem happens because teams continue to operate
with a monolithic mindset and follow monolithic processes,
even though microservices provide new possibilities. To fully
leverage the advantages of microservices, it is critical to
start with well-defined objectives that are aligned with
broader business goals, approved by management, and
reinforced by other teams. Collaboration is essential to
achieving success.

Scalability
Microservices were initially championed by major internet
corporations such as Facebook, Netflix, and Amazon as a
solution to the challenges of scaling monolithic systems that
were inefficient at serving millions of users simultaneously.
Large components consumed excessive resources, with
many of them underutilized. By dividing their monolithic
systems into smaller, independent chunks, they were able to
improve resource utilization and scale each component
separately.
Although most organizations do not require such extensive
scalability, it remains the primary factor for selecting
microservices, as per the responses of people when asked
why they opted for this approach.
Nonetheless, it is important to recognize a few key points:
Microservices are not the only way to achieve
scalability. Besides, many organizations just do not
need that kind of scale. It's crucial to understand that
while microservices offer one approach to scaling,
there is a common misconception that they are the only
option available.
Regarding scalability, computational issues are not
always the primary bottleneck in the system.
Bottlenecks can manifest in various areas, not
exclusively in microservices. In many cases, it has been
observed that the most significant bottlenecks arise at
the database level.
Although scalability is often considered the primary
objective, it may only be relevant to a small group of
software organizations, specifically those that anticipate
exponential growth. An example is SaaS companies, which
aim to safeguard their code investments by ensuring that
they can handle heavy loads should success arrive.
It is also important to understand that when scalability is
identified as a target, it is critical to establish a specific
metric for it. Additionally, it is necessary to investigate all
areas of the architecture that may contain bottlenecks, not
only the microservices' backend.

Productivity
The second most popular reason for adopting microservices
is probably the desire for higher development productivity.
This belief has been reinforced by success stories from larger
companies, leading people to view microservices as a
guaranteed way to improve productivity.
Unfortunately, many who have adopted microservices have
been surprised by the discovery that their productivity has
significantly decreased. In addition to writing regular code,
they now spend a great deal of time coding communication,
troubleshooting difficult issues, and building and maintaining
multiple CI/CD pipelines. What used to be a single software
release has now become many, making their lives much
more difficult. Almost without exception, the stories behind
such cases involve a distributed monolith that stems from an
old mindset, inefficient organizational structures, and
monolithic development practices.
To enhance development productivity, microservices can be
a valuable tool. However, to fully capitalize on their
potential, the organization and development model must
undergo a transformation, and the ability to
compartmentalize the work across microservices should be
used to its fullest extent. In addition:
Product releases should be incremental, delivering a
few features that require changes in a small number of
microservices.
Those microservices that have been modified should be
the only ones eligible for development, testing, and
release.
Developers should focus only on assigned
microservices and not spend their time and mental
energy thinking about the entire system.
DevOps engineers should assemble the system from
the microservices, treating them as black boxes.
Microservice implementations should be standardized
and templated, so every new microservice gets to be a
close copy of others. In this manner, developers do not
need to think about how they should write the code.
Productivity can be set as a goal to deliver more features
with limited resources, in order to achieve business growth
and market domination. However, it is crucial for the
organization to have clarity on this matter. This includes an
understanding of past and current productivity levels,
identifying bottlenecks that cause productivity to decline,
and defining the precise development process required to
achieve higher productivity.

Time to Market
In today's saturated market, vendors that deliver a new idea
can quickly capture a big portion of the market, and those
who come after them usually face a hard uphill battle.
That’s why Time to Market can be extremely
important for software companies that experience high
competition.
Although microservices can potentially reduce Time to
Market significantly, their implementation often fails to
achieve this goal due to a lack of alignment among team
members and uncertainty about how to achieve it. While
microservices are intended to provide a solution, the reality
can be different, resulting in a larger and more complex
system with numerous components to be released and
integrated, ultimately slowing down the release cycle. The
root cause of this issue is typically the result of a monolithic
mindset, development practices, and organizational
structure that leads to a distributed monolith.
To release faster, the team should adopt the Incremental
Delivery model. This involves selecting a small set of
features that can be released independently and only affect
a limited number of components. The team can then
develop, test, and release these features while leaving the
rest of the system unchanged. Importantly, product
management should also be involved in defining a small set
of features with high business value that will incentivize
customers to purchase or upgrade (refer to the Incremental
Delivery Pattern).

Innovation
Innovation is another popular goal for microservices adoption
that we discuss in this chapter. It can be related to functional
innovation: delivering new features quickly. Or it could be a
technological innovation: using the latest technologies,
integrating scanners, AI, etc.
Functional innovation shares similarities with the Time to
Market concept previously discussed: it requires incremental
delivery, and everything associated with it.
Technological innovation is quite different. It requires the
ability to quickly adopt new infrastructure services, new
libraries, new frameworks, or even new programming
languages. In a monolithic architecture, that almost inevitably
leads to extensive rework of the entire codebase,
requiring lots of time and resources. Microservices, on the
other hand, can be developed using different languages and
frameworks, compiled with different libraries, and use different
databases, messaging systems, and so on.
In practice it is not always that easy. Changing a shared
library may trigger changes in all dependent microservices,
changing databases may require changes in the deployment
and operational scripts that in turn, trigger a wave of other
changes. And adding a new language may just not be
possible, because older microservices may not be able to
interoperate with the new ones. All those
things should be carefully designed and implemented if
innovation is one of the main reasons for microservice
adoption.

Incremental delivery
Many companies want to speed up their Time to Market
and/or improve productivity and lower development costs.
Achieving those goals within microservices systems involves
utilizing Incremental Delivery.
Problem
Development teams can put lots of effort into building
microservices systems and automating delivery processes.
However, if Product Management fails to embrace the
concept of incremental delivery, they may continue to
operate under an outdated monolithic mindset and present
development teams with an extensive list of features to
include in the next version.
Some teams try to break down the requested functionality
into smaller pieces and deliver them in a few iterations.
However, without a business decision, they do not know how,
and/or do not have the mandate, to release a simpler but
functional version. Instead, they deliver parts of the
requested release, which has no business value, as depicted
in Figure 1.2:

Figure 1.2: Incorrect approach to incremental delivery that misses business value

When a development team works on the entire codebase to
develop a new version, it takes significantly longer than
traditional monolithic development because of the added
complexity. If they make intermediate releases, every
release cycle consumes even more time. And since an
intermediate release carries no business value, it does not
reach customer hands and does not count.

Solution
Incremental delivery should always start with a product
vision and roadmap. Product management should be
completely on board and work together with the development
team to define releases that are:
Small in scope
Deliver clear business value
Require changes in a small number of components
The last point is important because, in a correctly
implemented microservices system, when only a small
number of components need changes, the rest are not
touched. If they are not touched, they do not need
development and testing work, and that shortens the release
cycle considerably.
As you may notice in Figure 1.3, every new version of a
system may require some rework to go from a basic to more
advanced level, where each step adds new extra value and
functional complexity.

Figure 1.3: Correct incremental delivery that brings real value to customers on
every step

That is usually quite acceptable for two reasons:


Companies generate flows of money from selling
product versions that pay off new developments.
Microservices, because of their smaller size, are easier
to rework than large monolithic systems.

Development Model
The development model adopted by an organization can
have a tremendous impact on software development. This
model defines a comprehensive process or methodology that
guides development teams on how to design, build, test, and
deliver software products.

Problem
Prior to the publication of the Agile manifesto in 2001 [1],
software development adhered to the Waterfall model [2],
which originated from traditional industries. However, unlike
those industries that typically produce duplicates, software
teams must create one-of-a-kind products every time. With
constantly advancing software technologies and escalating
complexity, risks and uncertainties were amplified even
further. As a result, most software projects during that period
failed to meet their delivery timelines or budget
requirements.
The Agile model recognized those inherent risks and
uncertainties and offered a different approach that
decreased the level of formalities and accepted
experimentation as a necessary part of the development
process.
However, the idea of a Post-Agile model is now emerging, but not
many people have a clear vision of what it can be. This
search has been mainly prompted by three factors. First,
after using the model for over 20 years, teams have gained
more experience and understanding of the Agile method.
Second, while monolithic systems were unique,
microservices offer higher consistency and repeatability in
development. And finally, automated delivery processes
require a greater emphasis on standardization.

Agile Workshop
The Agile model's adoption led to the removal of many
formalities, with informal communication and functional code
deemed the ultimate goals. Any element that doesn't
directly contribute to these goals is eliminated, including
formal documentation, architectures, and processes, which
are substituted with informal, lightweight alternatives. As a
result, the boundaries between different roles have become
less distinct. After a decade of increasing specialization, the
broad skill sets of Fullstack developers and all-in-one DevOps
engineers have once again become widespread.
Teams that adopted the Agile model have been better suited
for rapid deployment and efficient experimentation, akin to a
car workshop shown in Figure 1.4. In comparison to the older
Waterfall model, Agile more effectively tackles the inherent
risks and uncertainties of software projects, leading to
improved project success rates.
The key characteristics of an Agile Workshop are:
Components: bespoke, non-standardized
Roles and responsibilities: fuzzy and broad
Processes: ad hoc, informal
Figure 1.4: Car workshop where a small group of highly skilled mechanics
build a car

Besides all these positives, the Agile model has a number of


limitations. One of them is a heavy reliance on informal
communication, which limits its scale. A practical size for
agile teams is 5 to 9 people. Larger teams have to be broken
into a few smaller teams that establish formal interfaces
between them.
Another constraint arises from poor code standardization and
incomplete documentation, which can result in the
deterioration of informally preserved technical knowledge
over time or when there is high turnover within the
development team. This can result in losing control over the
codebase and a substantial decline in productivity and
quality.
Following are the pros and cons:

Pros:
Smallest amount of overhead.
Quick start.
Efficient experimentation.
Good reaction to changing requirements and
unforeseen risks.

Cons:
Requires highly qualified team members.
Difficult to scale.
Difficult to sustain over long periods of time.
Hard to preserve knowledge when attrition is high.
Long onboarding of new team members.
Software factory
Although the Agile model has proven to be effective for small
and brief software projects, it faces difficulties when
implemented in larger and more intricate products or
product lines with extended lifetimes. To address these
challenges, there has been discussion about Post-agile
models that may deviate from the original Agile principles.
One of these emerging models is known as the Software
Factory model, which should not be confused with the term
Software Factories that refers to automated development.
The Software Factory model is a translation of manufacturing
principles used in traditional factories into the software
world.
The first factories in known human history appeared at the
end of the Renaissance era and represented a significant
step forward in terms of productivity improvements and cost
reduction, when compared to medieval workshops.
They used three key principles:
Interchangeable parts (standard components)
Division of labor
Assembly lines
As we can see, the principles bring consistency and
repeatability in work products (components), job
responsibilities (skill sets), and production processes
(management and automation), see a car factory in Figure
1.5 as an analogy.
The most important characteristics of a software factory are:
Components: standardized
Roles and responsibilities: clearly defined and highly
specialized
Processes: formal and preplanned
Figure 1.5: A factory mass producing cars using an automated assembly line

Similar to the Agile model and unlike Waterfall, the Software


Factory uses iterative development and incremental
delivery. Just like flexible manufacturing systems, software
factories use pre-built component templates that are
modified by developers (or eventually AI) according to a new
set of requirements, repackaged into product releases, and
shipped to customers in very short time intervals. Thus,
microservices, with their independent, loosely coupled
nature, are a perfect fit for the Software Factory model.
Just like the Agile model, Software Factories deal with the
inherent risks and uncertainties of software development,
but in a different way. Risks related to unclear product
requirements are addressed, as usual, via quick delivery of
experimental and/or feature-flagged versions released into
customer hands. But technology-related risks
are mitigated by standardizing components and templating
their implementations, which are then employed by
development teams to build new functionality.
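As a small illustration of the feature-flag idea mentioned above (it is covered as its own pattern in Chapter 4, Configuring Microservices), the following is a minimal Java sketch. The flag name FEATURE_NEW_CHECKOUT and the use of an environment variable as the flag source are assumptions made purely for this example; a real system would typically read flags from a dynamic configuration service.

// Minimal, illustrative feature-flag check.
public class CheckoutFeature {
    // The flag source is an environment variable here only for simplicity;
    // a production system would query a configuration service instead.
    private static boolean isEnabled(String flagName) {
        return Boolean.parseBoolean(System.getenv().getOrDefault(flagName, "false"));
    }

    public static void main(String[] args) {
        if (isEnabled("FEATURE_NEW_CHECKOUT")) {
            System.out.println("Using the new, experimental checkout flow");
        } else {
            System.out.println("Using the stable checkout flow");
        }
    }
}

Shipping the experimental flow behind such a flag lets the team put it into customer hands quickly and switch it off without a redeployment if the requirements turn out to be wrong.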
Creation of Software Factories is a big topic that deserves a
dedicated book. But in a nutshell, it requires the following
steps:
1. Standardize components at the architecture level. For
instance: data microservices, business-process
microservices, connectors, and facades.
2. For standard components, build well-defined templates (a sketch follows this list).
Define a sequence of elementary operations to develop
each component type. For instance: create a
microservice structure, implement persistence,
implement business logic, implement communication,
and package the microservice.
3. Define standard skill sets (job profiles). Create training
and certifications. Train team members.
4. Establish a production process as a sequence of
connected workstations. Each workstation shall have
clear inputs, outputs and performed operations. Create
a Standard Operating Procedure (SOP) and define
process metrics. Assign team members to their
workstations.
5. When a new set of requirements or defects arrive,
execute a process defined by the SOP and use process
metrics to set time and quality expectations.
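To make step 2 above more concrete, the following is a minimal, hypothetical sketch of a standardized data-microservice template in plain Java. The names (DataService, InMemoryOrderService, Order) and the in-memory persistence are assumptions for illustration only; a real template would also standardize the persistence, communication, packaging, and testing layers.

import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Standardized contract that every data microservice built from the template follows.
interface DataService<T> {
    Optional<T> getById(String id);   // read one entity
    T save(String id, T entity);      // create or update an entity
    void deleteById(String id);       // remove an entity
}

// Illustrative domain object owned by this particular microservice (Java 16+ record).
record Order(String id, String customerId, double total) {}

// Template-based implementation: because every microservice exposes the same
// shape, developers only fill in the domain-specific parts.
class InMemoryOrderService implements DataService<Order> {
    private final Map<String, Order> store = new ConcurrentHashMap<>();

    @Override
    public Optional<Order> getById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public Order save(String id, Order order) {
        store.put(id, order);
        return order;
    }

    @Override
    public void deleteById(String id) {
        store.remove(id);
    }
}

public class TemplateDemo {
    public static void main(String[] args) {
        DataService<Order> orders = new InMemoryOrderService();
        orders.save("1", new Order("1", "customer-42", 99.90));
        System.out.println(orders.getById("1").orElseThrow());
    }
}

In a Software Factory, a technologist would maintain such a template, and each new data microservice would start as a close copy of it, with only the domain types and business rules changed.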
In addition to traditional development roles (Managers,
Architects, Leads, Developers) that have to work within their
more formal and standardized structure, teams may
introduce a few new roles:
Technologist: a person responsible for finding
technical solutions for standard architectural
components, creating templates and defining
operations
Process Engineer: a person responsible for the
definition of a manufacturing process, SOP and
automation of delivery pipelines
Product Assembler: a person responsible for
packaging components into a deployable product and
automating deployment procedures (actions)
Platform Engineer: a person responsible for
automating (scripting) deployment environments to
provide consistent hardware and software platforms
for one or a few products
Trainer: a person responsible for training specialized
workforce to perform operations following organization
SOPs, ensuring team members are proficient in their
designated roles and the technologies used within the
software factory
Although setting up Software Factories requires more effort
and may be less efficient than the Agile model in the initial
stages of development, once established, they can produce
software at a significantly lower cost and time compared to
Agile Workshops. A comparison between system size and
time/cost for both models is presented in Figure 1.6 below:
Figure 1.6: Comparison between System Size and Time/Cost for Agile
Workshops and Software Factories

Following are the pros and cons:

Pros:
High productivity and low development costs.
Lower qualification requirements for team members.
High scalability and sustainability.
Quick onboarding of new team members.
Good reaction to changes in product requirements.

Cons:
High initial investments.
Slow start.
Slow reaction to technological changes.
Some additional insights on Software Factory can be found in
the “Software Factory”
(https://www.entinco.com/programs/software-factory)
article.

Organization structure
As Melvin E. Conway [3] stated, organizational structures
have deep and intricate relationships with the architecture of
the products they create. Microservices Systems are not an
exception to that rule. A suitable structure will make the
product development easy and fast. However, a bad
structure can escalate microservices challenges to a
breaking point and can become one of the root causes for
project failures.
A book called Team Topologies, written by Matthew Skelton
and Manuel Pais, discusses organizational structure in great
detail. Those who want to dive deeper into this topic can go to
http://teamtopologies.com.

Problem
In the 1980s and 1990s, when software development
reached a certain level of complexity and maturity,
specialization and formal processes were adopted by the
industry. Since that time most organizations have structured
their teams by functional areas. A typical structure looks like
the one in Figure 1.7 below:

Figure 1.7: Traditional function-oriented organization structure

However, although this functional structure is easy to


understand and set up, it is highly ineffective. The reasons
for this are:
It is very slow. Even an elementary hello world
product may take weeks or even months to move from
inception to release, as each team needs to plan the
work, allocate resources, do the work, and iterate a few
times until the work is accepted by another team.
It represents a broken phone. Extra channels add
inevitable miscommunications that affect the result.
It is prone to power struggles. Each team has its own
leadership, with its own background and vision. When
it comes to collaboration, different views often lead to
conflicts that take time and effort to resolve.
Microservices tend to escalate the problems inherent in the
functional model. They require changes in the approach that
are very hard to achieve in a competitive environment. Plus,
added complexity makes the hard work even harder. As a
result, often this structure eventually collapses under its own
weight.

Feature delivery teams


Although microservices systems are more complex when
compared to traditional monolithic software, they allow the
use of a divide and conquer approach to
compartmentalize the complexity. Furthermore, the focus on
fast incremental delivery demands breaking organizational
barriers and combining forces under a single leadership to
deliver new features to the market.
Organizations can adopt a new cross-functional
organizational structure to effectively leverage the strengths
of microservices and work around their weaknesses. This structure involves
dividing the development organization into multiple cross-
functional teams that focus on delivering specific features
(refer to Figure 1.8):
Figure 1.8: Organization structure with feature-delivery teams recommended
for microservice development

Every team is accountable for software components that are


grouped according to a specific functional domain. The team
comprises all the necessary roles required to complete the
development cycle, under a unified leadership. When it is
necessary to create a new set of features, all team members
collaborate to ensure a quick and efficient delivery.
In the event of disagreements, the team lead is responsible
for resolving them promptly, as prolonged political battles
are not an option. Each team's responsibility to operate their
components in production compels them to maintain a high-
quality standard. Overall, this structure enables parallel
development streams and facilitates faster delivery of
multiple features to the market (see Figure 1.9):
Figure 1.9: Feature output in traditional organization structures vs multiple
feature delivery teams

Inter-team dependencies are common, but it is crucial to


manage them using formally defined and versioned
interfaces. Failure to follow this guideline can disrupt the
work of dependent teams, resulting in constant
synchronization and operating similarly to a traditional
functional structure with inefficient sequential processes.
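To illustrate what such a formally defined and versioned interface can look like, here is a minimal Java sketch using namespace versioning (discussed further in Chapter 3, Organizing and Documenting Code). The package and type names (com.example.orders.client, OrderClientV1, OrderClientV2) are hypothetical and chosen only for this example.

// File: com/example/orders/client/v1/OrderClientV1.java
package com.example.orders.client.v1;

// Version 1 of the contract published by the team that owns the orders microservice.
public interface OrderClientV1 {
    OrderV1 getOrder(String orderId);

    record OrderV1(String id, double total) {}
}

// File: com/example/orders/client/v2/OrderClientV2.java
package com.example.orders.client.v2;

// A breaking change is published under a new namespace; v1 remains untouched,
// so dependent teams migrate on their own schedule instead of being forced
// to synchronize their releases.
public interface OrderClientV2 {
    OrderV2 getOrder(String orderId);

    record OrderV2(String id, String currency, long totalCents) {}
}

As long as both versions stay published, a consuming team can keep depending on v1 until it is ready to move to v2.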
Organizations strive to achieve shared code ownership
where teams can modify components while preserving
already released functionality for other teams. This approach
requires mature processes, high standardization, formal
versioned interfaces, and comprehensive automated testing,
as described in the Software Factory pattern. Without these
elements, attempting to implement shared code ownership
can lead to disastrous outcomes.
Following are the pros and cons:

Pros:
High productivity.
Fast time-to-market.
Low overhead.

Cons:
Dependencies between teams require extra care.
Potential duplication of effort.
Possible inconsistencies in code and processes.
Hard to maintain organization-wide standards.

Platform teams
The two most significant drawbacks of feature-delivery
teams are duplication of effort and lack of standardization. To
deliver their features, teams need to build certain
infrastructure capabilities. In some situations, this may take
up to 20-30% of the total development effort. Plus, it may
bring challenges in production when infrastructures built by
different teams start competing and conflicting with each
other.
In order to tackle these challenges, numerous organizations
opt to remove duplicated infrastructure, develop a new shared one, and
then distribute it among all feature-delivery teams. This
includes deployment environments (deployment platforms)
or shared application services (application platforms). Teams
responsible for implementing and delivering these platforms
are commonly known as Platform Teams (refer to Figure
1.10).

Figure 1.10: Feature Delivery Teams share a platform created by a Platform Team
Having a formal versioned interface for its consumers and an
independent development and release cycle are crucial for
the Platform approach. Without these, it could suffer from
the same adverse effects seen in dependencies between
Feature Delivery Teams, as explained in the preceding
pattern.
Following are the pros and cons:

Pros:
Minimizes duplication in development efforts.
Increases standardization.
Optimizes production deployments, reduces overhead,
and resolves conflicts.

Cons:
Introduces additional dependencies that must be
carefully managed.

Integration teams
There are instances when the software delivery process is
very labor intensive. It may include exhaustive testing,
verification for multiple deployment scenarios, obtaining
certifications from external entities, writing comprehensive
user documentation and so on. Alternatively, despite
multiple development streams, the product needs to be
released as a single deployable artifact. In such cases,
organizations may opt to form an Integration Team that
assumes the responsibility of packaging the product,
conducting verifications and other necessary steps, and
ultimately delivering it to customers (refer to Figure 1.11):
Figure 1.11: Feature Delivery Teams build components that are packaged and
delivered by a separate Integration Team

To make it work well, Feature Delivery teams should formally


release their components with proper versioned interfaces
and defined compatibility requirements. This enables the
Integration team to select a compatible set of components at
any given time and integrate and deliver them to customers
without having to wait for a particular team to complete their
development.
Following are the pros and cons:

Pros:
Minimizes duplication in testing and delivery efforts.
Allows delivering a product as a single deployable
artifact.

Cons:
Delays in releases of one Feature Delivery Team may
affect product releases.
Weaker and slower feedback loop for issues found
during integration and in production.

Microservices adoption process


The development of microservice systems is a relatively new
and complex field that allows for various interpretations. To
ensure a swift and seamless adoption of microservices, it is
necessary to exercise caution and engage in meticulous
planning and a gradual start.

Problem
Some organizations jump into microservice development
with full force without the necessary preparation steps. With
no standards, no infrastructure, and no selected
technology stack, they hire a group of
developers and ask them to build a product and deliver it in
a few months. Most of those developers come with their own
experiences and understanding of how microservices
systems must be built.
When a diverse group of individuals joins a project with so
many gaps, developers invest their efforts to address the
problems based on their knowledge and understanding. This
can result in conflicts among individuals who hold alternative
perspectives but share an equally strong desire to make a
positive impact. Such conflicts often create chaos, and the
team spends months in the "forming" and "storming" phases
before eventually reaching a state of high performance.

Solution
To adopt microservices successfully, it is advisable to start
with a deliberate and cautious approach, following a plan
with a small number of steps as shown in Figure 1.12:
Figure 1.12: Recommended microservices adoption process

Following are the recommended steps for microservices


adoption process:
1. Start with a small leadership team that includes
managers and architects. Clarify microservice adoption
goals and metrics to measure the process. Define
product architecture, organization structure, and key
development processes. Select the development stack
and related core technologies.
2. Hire experienced technologists and DevOps automation
engineers. Let technologists develop templates for
components defined in the solution architecture. Assign
DevOps engineers to set up build and release
infrastructure (code version control, binary repositories,
build servers and runners). Then, script test and
production environments and automate CI/CD pipelines.
3. Continue with a small team. Select a vertical slice of
the product that includes a few representative
components. Implement them using the templates,
follow the processes and use the delivery platform. Fix
any found issues.
4. Expand the team; you do not need experts at this point,
as average developers will do. Assemble them into
teams. Provide them with training on how to use the
technologies and patterns and follow the processes.
5. Develop a Minimum Viable Product (MVP) that
includes a minimally usable set of functionality and
hand it over to users. Solve any found issues.
6. Continue the development following the incremental
delivery process. Build and release the product in small
portions that deliver business value to customers.

Antipatterns
Inadequate approach to microservice adoption, ineffective
organization structure, old monolithic mindset and similar old
processes are among most common reasons why
microservices fail. These factors are frequently disregarded
as people tend to focus on architectural and implementation
aspects.
Consequently, there are a few critical antipatterns that are
worth mentioning here:
Starting microservice adoption without clear
reasons: Technical teams may follow the industry hype
and see microservices as a fun experience and the next
step in their professional development. They may sell
that idea to management or just decide to execute it on
their own. However, without understanding the impact
of their actions, and without sufficient time and
resources, those initiatives often lead to a situation
where development gets stuck in the initial phases
when everything is more difficult and slower, making it
impossible to overcome the hurdle and enjoy the
advantages.
Not aligning with other groups: When a
development team has full autonomy and all necessary
roles, it can choose to adopt microservices and execute
them in isolation. Nonetheless, this is often not the
case, and usually the team must work with other
departments within the organization. Without their
buy-in, the team can only execute the technical part of the adoption and remain stuck in other areas. For instance, without Product Management, incremental delivery is impossible; without reorganization into cross-functional teams, releases will still take a long time; and without formalizing and versioning interdependencies with other development teams, the developed microservices will not have an independent lifecycle.
Distributed monolith: This happens when development teams intend to implement a microservices system but, due to mistakes in architecture or implementation, introduce coupling that makes independent development and delivery of individual microservices impossible. Other causes can be issues in organizational structure, management, or delivery processes that force teams to use monolithic development practices. A distributed monolith can be a nightmare for organizations as it increases complexity and raises development costs but does not bring the desired benefits of microservices.
Lack of understanding: Many people and teams do
not clearly understand whether they really need
microservices and blindly follow the industry hype.

Conclusion
Throughout this chapter, we have gained insights into how
the effective establishment of clear goals, appropriate
organizational structure, and progressive product delivery
can maximize the advantages that microservices offer.
Additionally, we explored various development models,
including Software Factories, a Post-Agile approach that can
significantly enhance development efficiency and decrease
costs.
The next chapter will explain different microservice
architecture patterns.

References
1. K. Beck et al. Manifesto for Agile Software Development. 2001. Available at https://agilemanifesto.org/ Accessed on Mar 3, 2023.
2. W.W. Royce. Managing the Development of Large Software Systems. 1970. Available at http://www-scf.usc.edu/~csci201/lectures/Lecture11/royce1970.pdf Accessed on Mar 3, 2023.
3. M.E. Conway. How do committees invent? Datamation, April 1968. Available at http://www.melconway.com/Home/pdf/committees.pdf Accessed on Mar 3, 2023.

Further reading
1. S.J. Fowler. 2017. Production-Ready Microservices: Building Standardized Systems Across an Engineering Organization. O'Reilly. Sebastopol, CA.
2. M.A. Cusumano. 1991. Factory Concepts and Practices in Software Development. Annals of the History of Computing. 13 (1): 3–32. doi:10.1109/mahc.1991.10004. S2CID 7733552.
3. A. Craske. Only 1% Need Microservices. Medium. Jan 17, 2023. Available at https://medium.com/qe-unit/only-1-need-microservices-1f8649ecdd6d
4. A. Paredes. Bring back the monolith. Medium. Nov 21, 2022. Available at https://medium.com/glovo-engineering/bring-back-the-monolith-92de928ae322
5. G. Nicassio. When you should use modular monolith instead microservices. Medium. Feb 7, 2022. Available at https://medium.com/@nicas-snaptech/when-you-should-use-modular-monolith-instead-microservices-99f460f4b0ef
6. A. Craske. Microservices — Do You Need Them? Are You Ready? Medium. Aug 23, 2022. Available at https://medium.com/qe-unit/the-microservices-adoption-roadmap-e37f3f32877
7. A. Mucci. Microservices E-commerce Boutique Tutorial — Part 1. Medium. Feb 5, 2022. Available at https://blog.minos.run/microservices-online-boutique-tutorial-part-1-ea9287d34c83
8. I. Stoev. The Problem With Microservices. Feb 2, 2022. Available at https://levelup.gitconnected.com/the-problem-with-microservices-2068f64c52e2
9. VMWare Tanzu. Software Factory: Modern software development. Available at https://tanzu.vmware.com/software-factory

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 2
Architecting Microservices
Systems

Introduction
This chapter introduces us to microservices architecture
patterns which are used to decompose systems into distinct
microservices and select suitable architectural styles for
communication, coordination, deployments, tenancy, or
security. Towards the end, the chapter will delve into the
responsibilities of the architect in microservice development,
as well as the methods for creating long-living evolutionary
architectures. Some additional insights on microservice architecture can be found in the "Better Microservices" program (https://www.entinco.com/programs/better-microservices).

Structure
This chapter covers the following topics:
Microservice definition
Architectural decomposition
Functional decomposition
Data decomposition
Domain-driven design
Layered architecture
Microservice sizing
Communication style
Synchronous microservices
Message-driven microservices
Event-driven microservices
Business logic coordination and control flow
Orchestration
Choreography
Security model
Zero trust model
Secure perimeter
Cross-platform deployments
Symmetric deployments
Asymmetric deployments
Tenancy models
Single-tenancy
Multi-tenancy
Development stacks
Platform-specific frameworks
Cross-platform frameworks
Polyglot and cross-platform frameworks

Objectives
After studying this chapter, you will be able to decompose
software systems into microservices, choose an architectural
style, and make intelligent decisions to address system-wide
capabilities.

Microservice definition
Before diving into a microservices architecture, it is
important to understand what microservices are and clear
some misconceptions related to this concept.

Problem
Software systems become complex over time, affecting productivity, cost, and time. Developing and understanding code becomes difficult as a system grows due to the human brain's limited capacity. Additionally, the number of interconnections between parts affects productivity, as changes to one part require updates to interconnected parts.
To improve software development productivity, developers structured the code to reduce the number of visible parts and interconnections. Initially, this was done through changes in programming languages and development paradigms that affected the code structure, but systems remained monolithic. As monoliths grew, testing and delivery became significant bottlenecks, which led to breaking monoliths into smaller deliverable units called microservices that can be coded, tested, and delivered independently. Figure 2.1 summarizes this evolution of software architectures:
Figure 2.1: Evolution of software architecture

Moreover, the reduction of granularity in architectural components positively impacted productivity and, consequently, development cost and time (Figure 2.2):

Figure 2.2: Dependency between system size and development productivity


When software architecture changes, complexity does not
disappear; it shifts. Individual component development
becomes easier, but the number of system parts grows,
making testing and delivery more difficult. To address this
issue, incremental delivery was introduced by building
fine-grained systems in small increments.
However, teams often focus solely on development in
microservices systems and continue testing and delivering
their systems using old monolithic practices. This results in a
distributed monolith with high overall complexity and low
productivity, usually worse than traditional monolithic
systems.

Solution
Microservices architecture, which is a variation of the
service-oriented architecture structural style, organizes an
application into a set of finely grained, loosely connected
services that communicate through lightweight protocols.
Each of these services:
Is organized around business capabilities (a reference to the SOA architecture)
Is small in size
Communicates using lightweight protocols
Can be implemented using different technologies
Can be developed by different teams at different times
Can be delivered and deployed independently
These points describe a microservice as a software
component with an independent lifecycle. This characteristic
is central because it allows for incremental software delivery,
reduces complexity, and improves productivity for large
software systems. However, achieving an independent
microservice lifecycle requires changes across all areas of
software development, including organization structure,
product and project management, testing, and delivery.

Architectural decomposition
Software Architecture is a system structure represented by
components, relationships between components, and their
externally visible properties. The decomposition of a system
into interconnected components is the first step toward
defining an architecture.

Problem
From the definition above, it is possible to see that a
microservice represents a software component that:
Is a service organized around business capabilities
Is small in size, autonomously developed, and
independently deployed
Communicates over lightweight protocols like HTTP, gRPC, or simple messaging
Can be implemented using different programming languages, databases, hardware, and software environments.
Additionally, a well-done decomposition must find the right
balance between 3 principles: responsibilities, cohesion, and
coupling (Figure 2.3):
Figure 2.3: Principles of well-architected microservices

Consequently, as microservices systems may consist of tens or hundreds of microservices, it is necessary to use a systematic approach to architectural decomposition to satisfy these requirements.

Functional decomposition
Functional decomposition decomposes a system into
microservices based on their specific functionality. The
technique identifies distinct business capabilities provided by
the functionality, which can be implemented as individual
microservices. For instance, an e-commerce application may
have microservices for product search, shopping cart
management, and order processing. Common functionality
or data is extracted and placed into separate microservices,
as shown in Figure 2.4:
Figure 2.4: Functional decomposition of microservices

Following are the pros and cons of functional decomposition:

Pros:
The simplest and most straightforward approach, which can use a high-level set of requirements as input without the need for deep analysis.
Works well when product features are well-understood
and relatively independent.

Cons:
The method does not specify how to expose external interfaces, connect to external systems, or deal with complex data analytics.
May lead to inconsistencies when microservices
become too large or too simple.
May cause significant rework when system
functionality changes over time.

Data decomposition
Data decomposition is a technique for breaking down a
system into microservices based on the data they manage.
The process requires creating a data model with distinct data
entities and relationships. Each data entity can then be
implemented as an individual microservice, with each
microservice responsible for managing a specific data type.
Business functions closely related to data entities are usually
in the same microservices. Business transactions that
require access to multiple data entities are generally
organized into logical groups and placed into separate
microservices (Figure 2.5):

Figure 2.5: Data decomposition of microservices

Following are the pros and cons of data decomposition:

Pros:
A simple method that requires only high-level
requirements and a data model.
Works well for data-driven products with relatively
simple functionality.

Cons:
Does not specify how to expose external interfaces, connect to external systems, or deal with complex data analytics.
May cause performance issues for complex
transactions involving multiple data entities.
Requires additional logic to handle references between
entities and ensure data integrity.

Domain-driven design
Domain-driven design (DDD) was introduced by Eric
Evans in his book Domain-Driven Design: Tackling
Complexity in the Heart of Software[1]. He suggested an
approach to system architecture based on the premise that a
software system should be designed around its business
domain.
In DDD-based decomposition, microservices are defined around the bounded contexts of domains or subdomains (Figure 2.6). Shared data can be translated and replicated between domains (microservices). For example, consider a healthcare application that manages patient information, appointment scheduling, and medical records. Each of these bounded contexts can be implemented as a separate microservice in a domain-driven decomposition approach.

Figure 2.6: Domain-driven decomposition of microservices

The key principles of domain-driven design include:


Ubiquitous language: DDD promotes the use of a
standard, domain-specific language between
developers, stakeholders, and users. This language
should be used consistently across all aspects of the
system, from requirements gathering to
implementation.
Bounded contexts: These are specific system areas
with a clear and distinct purpose. Each bounded
context should have its model and language, and
should be implemented as a separate component or
microservice.
Entities and value objects: DDD recognizes two main types of objects within the domain model: entities and value objects. Entities represent objects with a unique identity and lifecycle, while value objects have no inherent identity or lifecycle and are defined solely by their attributes (see the sketch after this list).
Aggregates: DDD defines aggregates as a cluster of
related objects treated as a single unit for data
changes. Aggregates ensure consistency and integrity
within the domain model by enforcing transactional
boundaries.
Domain events: DDD promotes the use of domain
events, which are notifications of significant changes
within the domain model. Domain events can be used
to trigger other processes or update other components
within the system.
Continuous refinement: DDD recognizes that the
business domain is constantly evolving and changing,
and the software system should be designed to
accommodate those changes. Continuous refinement
involves an iterative process of modeling,
implementation, and feedback, with the goal of
improving the system's alignment with the business
domain.
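To make these concepts more tangible, the following minimal Java sketch shows an entity, a value object, and a domain event from the healthcare example above. All class and member names are illustrative assumptions for this book, not prescribed by DDD or by any framework.

// Illustrative DDD building blocks; all names are hypothetical.
import java.time.Instant;
import java.util.Objects;
import java.util.UUID;

// Value object: no identity or lifecycle, defined solely by its attributes.
final class Address {
    private final String street;
    private final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return street.equals(other.street) && city.equals(other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(street, city);
    }
}

// Entity: has a unique identity and a lifecycle independent of its attribute values.
class Patient {
    private final UUID id = UUID.randomUUID();
    private Address address;

    UUID getId() { return id; }

    void relocate(Address newAddress) { this.address = newAddress; }
}

// Domain event: a past-tense notification that something significant has happened.
final class AppointmentScheduled {
    final UUID patientId;
    final Instant scheduledAt;

    AppointmentScheduled(UUID patientId, Instant scheduledAt) {
        this.patientId = patientId;
        this.scheduledAt = scheduledAt;
    }
}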
Following are the pros and cons of domain-driven design:

Pros:
Microservices defined around bounded contexts are less affected by performance or data integrity problems.
Matches well with the asynchronous event-driven
communication style.

Cons:
A complex method that requires a good understanding
of DDD and extensive modeling.
In practice, microservices tend to be quite large and
complex.
The method does not specify how to expose external interfaces, connect to external systems, or deal with complex data analytics.

Layered architecture
In recent years, amidst the rising popularity of architectural
models such as Hexagonal, Onion, and Clean architectures—
each designed to enhance modularity, testability, and
maintainability—the concept of Layered Architecture has
emerged as a versatile framework. This approach can be
seen as a generalization where microservices are
systematically divided into distinct layers based on their
functionality.
In microservices, this approach typically divides systems into
three layers:
The interface layer: This layer contains the facades
responsible for translating external requests into
domain-specific commands and queries. They are also
responsible for coordinating the communication
between microservices and handling user input/output.
The domain layer: This layer contains business-
process microservices with the core logic of the
system, including the domain models and business
rules. The domain layer is responsible for processing
incoming requests and producing the appropriate
responses.
The infrastructure layer: This layer contains the
implementation details for connecting the application
to external systems, such as databases and APIs. The
infrastructure layer contains the adapters that
translate the domain-specific commands and queries
into the appropriate data formats for communication
with external systems. It contains data and connector
microservices.
When this approach is applied to microservices, it leads to
well-structured, loosely coupled code (Figure 2.7):
Figure 2.7: Design of a microservice according to principles of layered
architecture
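As an illustration of the three layers described above, the following minimal Java sketch wires a facade, a domain service, and an infrastructure adapter together. The OrderFacade, OrderService, and OrderRepository names are assumptions made for this example only, not part of any specific framework.

// Infrastructure layer: adapter contract for an external system such as a database.
interface OrderRepository {
    void save(String orderId, double amount);
}

// Domain layer: business-process logic with the core rules.
class OrderService {
    private final OrderRepository repository;

    OrderService(OrderRepository repository) {
        this.repository = repository;
    }

    void placeOrder(String orderId, double amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("Order amount must be positive");
        }
        repository.save(orderId, amount);
    }
}

// Interface layer: facade that translates external requests into domain calls.
class OrderFacade {
    private final OrderService service;

    OrderFacade(OrderService service) {
        this.service = service;
    }

    // In a real microservice this method would be exposed over HTTP, gRPC, or messaging.
    String handlePlaceOrderRequest(String orderId, String amount) {
        service.placeOrder(orderId, Double.parseDouble(amount));
        return "ACCEPTED";
    }
}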

Following are the pros and cons of layered architecture:

Pros:
Simple to understand and define.
Explicitly defines all external touchpoints: facades,
connectors, and data microservices.

Cons:
The method does not specify how to decompose
business-process microservices in the domain layer.

Microservice sizing
Although the prefix “micro” in the microservice term refers
to size, no strict rules exist to determine the right size for
microservices. In real-world systems, the size varies broadly
from extremely small nano-services to large mini-monoliths.

Problem
To reap the benefits of microservices, it is crucial to maintain
an appropriate size for them. If microservices are too large,
development complexity increases, and issues similar to
monolithic development arise. Conversely, if microservices
are too small, it increases the complexity of the system and
organizational overhead. Finding the right size for
microservices depends on the team's technology, processes,
and skill level.

Solution
Although microservices vary in size, there are several
distinct buckets:
Mini-Monoliths or Macro-services: Larger
microservices combine multiple domain models or
business capabilities in a single service. They can be
easier to develop and manage but may have scaling,
maintainability, and agility challenges due to their
broader scope and interdependence.
Coarse-grained Microservices: These group related functionalities or domain models based on bounded contexts in Domain-Driven Design, striking a balance between modularity and the overhead of managing many microservices. While they still uphold the separation of concerns, they are more convenient to develop and oversee than numerous fine-grained microservices.
Fine-grained Microservices: Fine-grained
microservices follow the Single Responsibility
Principle (SRP) and have a narrow scope, providing
high modularity and ease of maintenance and scaling
for individual components. However, managing many
fine-grained microservices can be challenging due to
communication, deployment, and monitoring
complexities.
Nano-services: Nano-services are extremely small microservices that offer a minimal and focused set of functions, usually accessible through a single business method. They are commonly used in serverless systems, where the simplicity of serverless functions encourages straightforward and uncomplicated designs.
The choice of architecture and technology stack affects the size of microservices. Macro-services are common in the early stages of migration from monolithic or SOA systems. DDD or functional decomposition leads to larger, coarser-grained microservices, while hexagonal (layered) or data decomposition results in smaller, finer-grained microservices. Serverless systems commonly use nano-services.
Whatever architecture or technology is used, there are a
couple of rules to apply when deciding on microservice size.
If a microservice is difficult or costly to replace when
technology changes or code quality decreases, it is
likely too large. Ideally, a development team should
take one to two weeks to replace a microservice.
If it is too difficult to remember all the microservices that compose a system, and/or the organizational overhead to package and release a new version is more than 10% of the total effort, then their size is probably too small.
Typically the number of microservices in a system is
around 50-60, with 20-30 on the lower end and about
100 on the higher end.
Communication style
Microservices are components that communicate with each
other using lightweight protocols. Therefore, the second key
decision in the definition of a microservices architecture is
choosing the right communication style.

Problem
Microservices systems consist of multiple fine-grained
services that communicate with each other and external
services or consumers using inter-process communication.
This makes inter-process communication the most crucial
aspect of microservices architecture, impacting performance,
scalability, reliability, and other critical capabilities.

Synchronous microservices
Synchronous microservices use synchronous communication
with blocking request-response calls (Figure 2.8). It is the
simplest communication style and is often based on HTTP,
gRPC, or SOAP communication protocols.

Figure 2.8: Synchronous communication in microservices


Typically, microservices that use synchronous communication create a separate thread for every blocked request until that request is completed. This limits the number of parallel requests that can be processed by a single microservice instance and makes this communication style the least suitable for massively scaled systems. Besides, synchronous microservices are more prone to cascading failures, where even minor issues can make a system unresponsive for a while.
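As a minimal illustration, the following Java sketch performs a blocking request-response call with the standard java.net.http client. The inventory-service URL and endpoint path are hypothetical assumptions for this example.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class InventoryClient {
    private final HttpClient client = HttpClient.newHttpClient();

    // Blocking request-response call: the calling thread waits until the downstream
    // microservice answers or the call fails.
    public String getStockLevel(String productId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://inventory-service/stock/" + productId))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}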
Following are the pros and cons of synchronous
microservices:

Pros:
The simplest implementation option.
Well-supported in all development stacks.
Typically lowest latency and best performance.

Cons:
Poor throughput and scalability.
Cannot handle spikes in consumer requests, which requires reserving extra capacity.
Poor resilience. Error recovery is limited to a few
retries in a short time.

Message-driven microservices
Message-driven microservices use asynchronous messaging
via message brokers to communicate with each other (Figure
2.9). Many asynchronous messaging technologies are available on the market, and they differ in features. Among the most common patterns are transient and persistent messaging and the distribution of messages to a single subscriber (queue) or multiple subscribers (topic).
Compared to synchronous communication, asynchronous messaging is more complex to implement, and it has higher latency since messages are routed via a broker, which doubles the number of inter-process calls. However, this method has significantly higher throughput and resilience.

Figure 2.9: Asynchronous messaging in microservices

While processing transaction requests, asynchronous microservices may need to store and then retrieve the state of the transactions. This allows transaction execution to be shared between multiple microservice instances and even continued after a microservice crash. Persistent messaging adds extra resilience and allows for the recovery of unprocessed transactions after a partial or complete system crash.
To architect message-driven microservices, it is important to
know the types of messages:
Commands are messages that request a specific
action to be performed by a component or service.
They are commonly used in a request-response
communication pattern, where the sender expects the
receiver to process the command and return a
response. Commands are named using a verb that describes the action they represent, such as CreateOrder or UpdateCustomer (see the publishing sketch after this list).
Events are messages that signify a change in state or
the occurrence of a specific situation within the
system. They are used in event-driven communication
and follow a publish-subscribe pattern. Events are
typically named using a past-tense verb or a noun, such
as OrderCreated or CustomerUpdated, to indicate that
something has already happened. Unlike commands,
events do not expect a response from the receiver; they
merely notify interested parties of the change.
Notifications inform components or services about
relevant updates or occurrences within the system.
They are similar to events in that they do not expect a
direct response but tend to convey less critical or time-
sensitive information. Notifications often deliver status
updates, alerts, or non-urgent messages to users,
services, or monitoring systems.
Callbacks are messages or function references that
represent a way for a component or service to provide
a response to an earlier request or command
asynchronously. In this pattern, the sender provides a
callback function, which the receiver invokes upon
completing the requested action or encountering an
error. Callbacks are commonly used when synchronous
communication is not ideal, allowing for a more
flexible, asynchronous interaction between
components.
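For illustration, the following sketch publishes a CreateOrder command to a message broker using the Apache Kafka producer API. The broker address, topic name, and message payload are assumptions made for this example; other brokers would follow a similar pattern.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderCommandSender {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish a CreateOrder command; the sender does not block waiting for
            // the order service to process it.
            ProducerRecord<String, String> command = new ProducerRecord<>(
                    "order-commands", "order-123",
                    "{\"type\":\"CreateOrder\",\"orderId\":\"order-123\"}");
            producer.send(command);
        }
    }
}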
Following are the pros and cons of message-driven
microservices:

Pros:
Well-supported in all development stacks.
High throughput and scalability.
Better elasticity to handle spikes in consumer requests, leading to more optimal resource utilization.
High resilience; capable of processing incomplete
requests even after a system crash.

Cons:
Complex to implement.
Higher latency and lower performance.

Event-driven microservices
Event-driven communication is a specific form of message-
driven communication that focuses on exchanging events
between components or services. Instead of sending
commands that direct microservices to take a certain course of action, event-driven microservices exchange events that
describe changes in state or occurrences within the system,
letting receivers decide how those events must be processed
and what actions should be taken (Figure 2.10):
Figure 2.10: Event-driven communication in microservices

While event-driven communication is a pattern in its own right, it is closely related to domain-driven design and choreography, both of which are described in this chapter. Domain-driven design presents an elaborate architectural process that incorporates events as a key design element, and choreographic logic is based on sending and processing events.
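As a minimal illustration, the following sketch shows a shipping microservice consuming OrderCreated events from a topic using the Apache Kafka consumer API. The topic name, consumer group, and handling logic are assumptions made for this example.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ShippingEventListener {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "shipping-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("order-events"));
            while (true) {
                // The publisher of OrderCreated does not know or care that the shipping
                // service reacts to it; the receiver decides what actions to take.
                ConsumerRecords<String, String> events = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> event : events) {
                    System.out.println("Handling event: " + event.value());
                }
            }
        }
    }
}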
Following are the pros and cons of event-driven
microservices:

Pros:
Same as message-driven communication.
Higher decoupling and easier extensibility and
maintainability.

Cons:
Same as message-driven communication.
Additional learning curve.
Requires detailed domain design.

Business logic coordination and control flow


There are two distinct approaches to managing coordination
and control flow in microservices architectures: orchestration
and choreography. By understanding the nuances between
the two styles, you will be better equipped to create
scalable, maintainable, and adaptable applications that
efficiently handle the challenges posed by distributed
computing environments.

Problem
Microservices systems involve complex business processes
that require coordination of distributed logic across multiple
microservices. The choice between orchestration and
choreography approaches is critical for system performance,
scalability, and maintainability. The challenge lies in
identifying the most appropriate approach for a given system
while considering trade-offs and constraints to ensure
efficient communication among services.

Orchestration
Orchestration is a centralized approach to managing the
interactions and data flow between different microservices
within a distributed system. In an orchestrated system, a
central component, often called the orchestrator or process
manager, is responsible for coordinating the overall process
(Figure 2.11).

Figure 2.11: Orchestrated business logic in microservices systems

The main components of this approach are:


Orchestrator: The orchestrator is the central
authority that determines the order in which services
should be called and manages data exchange, error
handling, retries, and fallbacks. Depending on the
system's requirements and complexity, it can be a
dedicated business process microservice, a workflow
engine, or an API gateway.
Microservices: Each microservice in an orchestrated
system is responsible for handling its specific domain
and providing the functionality required by the
orchestrator. Microservices are typically designed to
be stateless, single-purpose, and loosely coupled.
The synchronous communication in the Synchronous Microservices pattern is well-suited for orchestrated business logic. For increased reliability, it is advisable to save transaction states and implement compensating transactions to manage transaction errors or recover from microservice crashes. Some teams prefer to use workflow engines to build their orchestrator implementation. To ensure even greater reliability, additional measures like retries, circuit breakers, logging, and performance metrics should be included to monitor distributed transactions and promptly identify any implementation issues. Further information can be found in the Orchestrated Saga pattern in chapter seven.
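A minimal orchestration sketch is shown below. The PaymentClient and InventoryClient interfaces are hypothetical clients for downstream microservices; the orchestrator owns the order of calls and runs a compensating action when a step fails.

// Hypothetical synchronous clients for downstream microservices.
interface PaymentClient {
    void charge(String orderId);
    void refund(String orderId);        // compensating action
}

interface InventoryClient {
    void reserveItems(String orderId);
}

// The orchestrator owns the control flow: call order, error handling, compensation.
class OrderOrchestrator {
    private final PaymentClient payments;
    private final InventoryClient inventory;

    OrderOrchestrator(PaymentClient payments, InventoryClient inventory) {
        this.payments = payments;
        this.inventory = inventory;
    }

    void placeOrder(String orderId) {
        payments.charge(orderId);
        try {
            inventory.reserveItems(orderId);
        } catch (RuntimeException e) {
            payments.refund(orderId);   // compensate the step that already completed
            throw e;
        }
    }
}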
Following are the pros and cons of orchestration:

Pros:
A clear, centralized point of control for managing the
flow of data and interactions between microservices.
Better visibility of the system's overall process.
Easier to implement consistent error handling, retries,
and fallback strategies.

Cons:
A single point of failure, as the entire system relies on
the orchestrator for coordination and control.
A bottleneck, impacting the system's scalability and
performance.

Choreography
Choreography refers to a decentralized approach to
managing interactions and data flow between different
microservices within a distributed system. In a
choreographed system, there is no central authority, and the
responsibility for coordination and control is distributed
among the microservices themselves (Figure 2.12):

Figure 2.12: Choreographic business logic in microservices systems

Choreography relies on event-driven communication, where microservices publish and consume events to interact with one another (see Event-driven communication). Persistent messaging adds extra resiliency. If a microservice crashes, a message bounces back into the persistent queue and gets processed subsequently.
In complex choreographed transactions, a shared and persisted transaction state may be necessary, as well as a locking mechanism to prevent concurrent step activation. Despite the lack of centralized control logic, well-designed choreographic logic should be able to handle common real-world scenarios such as duplicated or out-of-order messages. See the Orchestrated Saga pattern in chapter seven for further information.
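The following minimal sketch illustrates choreographed logic. The EventBus interface and event names are hypothetical abstractions over a message broker; each microservice subscribes to the events it cares about and publishes new events, with no central coordinator.

// Hypothetical abstraction over a message broker.
interface EventBus {
    void publish(String topic, String payload);
    void subscribe(String topic, java.util.function.Consumer<String> handler);
}

// No central coordinator: the payment service decides on its own how to react
// to OrderCreated events and emits the next fact for other services to consume.
class PaymentService {
    PaymentService(EventBus bus) {
        bus.subscribe("OrderCreated", orderJson -> {
            String receipt = chargeCustomer(orderJson);
            bus.publish("PaymentCompleted", receipt);
        });
    }

    private String chargeCustomer(String orderJson) {
        // Payment logic would go here.
        return orderJson;
    }
}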
Following are the pros and cons of choreography:

Pros:
Provides better decoupling between microservices and
promotes modularity, scalability, and adaptability.
Increased performance and responsiveness as
processing can be done asynchronously.
Better horizontal scaling capabilities.
Higher robustness since the system can buffer events
and handle retries in case of failures.

Cons:
Additional complexity to handle asynchronous
processing, event ordering, and event sourcing.
Harder debugging and monitoring due to their non-
linear and decentralized nature.
Eventual consistency since state changes might not be
immediately reflected across all microservices.

Security model
Microservices architecture requires robust security measures
to ensure system protection, reliability, and resilience.
Implementing security involves various layers, including
authentication, authorization, secure communication, data
protection, and monitoring.
Problem
To build a well-protected system, it is important to understand the vectors of attack. Despite common belief, hacking public APIs is only one possible vector, responsible for a relatively small percentage of attacks. Typical attack vectors on microservices systems are presented in Figure 2.13:

Figure 2.13: Typical attack vectors in microservices systems

Typical attack vectors include:


Attacks against public APIs: Unauthorized access,
interception of communication, DoS attacks, SQL
injection, and so on.
Unauthorized access via maintenance windows:
That is where hackers can get root access into the
system.
Attacks against external integration points:
Similar to attacks against public APIs and maintenance
windows.
Code injections: These can include malicious code
injected into the codebase, Trojans injected via binary
dependencies, and injections of code and binaries
during the build process.
To address security threats and protect the system, the
architect has to decide on several mechanisms:
Authentication (identity management and propagation)
Authorization (access permissions)
Privacy (secure communication, data protection, and
encryption)
Security monitoring and auditing (security logs)
Security is a serious topic that requires special attention. The solutions presented below are intended only to help navigate between architectural options and must not be taken as comprehensive advice. When dealing with security aspects, readers should use their own judgment and seek professional advice when needed.

Zero trust model


The zero trust security model is an approach to system
security that assumes no user, device, or component can be
trusted by default. This model aims to minimize the risk of
unauthorized access and data breaches by requiring
constant verification and validation for any entity attempting
to access resources within a system (Figure 2.14):
Figure 2.14: Zero-Trust security model

The key principles of the zero trust security model include:


Never trust, always verify: Trust is not granted
automatically based on the location or network
segment. Instead, every user, device, and application
must be authenticated and authorized before being
granted access to resources, regardless of whether the
access originates from inside or outside the network
perimeter.
Least privilege access: Users, devices, and
applications should only be granted the minimum level
of access necessary to perform their tasks. This limits
the potential attack surface and reduces the risk of
unauthorized access to sensitive data and resources.
Micro-segmentation: This involves dividing the
system into smaller, isolated parts based on specific
criteria, such as a microservice or group of
microservices. This approach reduces the potential
blast radius in case of a security breach and restricts
lateral movement for attackers within the network.
Context-aware security: Relies on real-time, context-
aware security policies that consider factors such as
user identity, device posture, location, and time of day.
These dynamic policies enable adaptive security
controls that can respond to changing conditions and
risks.
Continuous monitoring and validation: Security is
an ongoing process that requires continuous
monitoring and validation of user and device behavior.
This includes logging and analysing security events,
detecting anomalies, and implementing automated
responses to potential threats.
Data-Centric Security: Focuses on protecting data at
its core, ensuring that sensitive information is
encrypted and secured both in transit and at rest. By
prioritizing data protection, the zero trust model
reduces the risk of data breaches and ensures the
confidentiality, integrity, and availability of critical
information.
The zero trust security model is implemented by using
modern security technologies and practices such as Multi-
Factor Authentication (MFA), Role-Based Access
Control (RBAC), Attribute-Based Access Control
(ABAC), and advanced threat detection and response
solutions. This model provides organizations with a resilient
and adaptive security posture that protects their valuable
data and resources from an ever-evolving threat landscape.
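As a minimal illustration of the "never trust, always verify" and least-privilege principles, the following sketch authenticates and authorizes every incoming call, even those originating inside the system. The TokenVerifier and Caller types are hypothetical; in practice this role is usually played by an identity provider, a security library, or a service mesh.

// Hypothetical caller identity resolved from a verified token.
class Caller {
    final String subject;
    final java.util.Set<String> roles;

    Caller(String subject, java.util.Set<String> roles) {
        this.subject = subject;
        this.roles = roles;
    }
}

// Hypothetical verifier; throws SecurityException when a token is missing or invalid.
interface TokenVerifier {
    Caller verify(String token);
}

class AccountHandler {
    private final TokenVerifier verifier;

    AccountHandler(TokenVerifier verifier) {
        this.verifier = verifier;
    }

    String getAccount(String token, String accountId) {
        // Never trust, always verify: every call is authenticated, even when it
        // originates from another microservice inside the cluster.
        Caller caller = verifier.verify(token);
        // Least privilege: only callers with an explicit permission may read accounts.
        if (!caller.roles.contains("account:read")) {
            throw new SecurityException("Access denied for " + caller.subject);
        }
        return loadAccount(accountId);
    }

    private String loadAccount(String accountId) {
        return "{\"id\":\"" + accountId + "\"}";
    }
}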
Following are the pros and cons of zero trust model:

Pros:
Reduces the attack surface and minimizes the risk of unauthorized access to sensitive data and resources.
Limits the ability of attackers to move laterally within the system, containing potential breaches and minimizing the blast radius.
Adapts to changing conditions, risks, and evolving attack vectors.
Protects sensitive information regardless of where it is
stored or accessed, reducing the risk of data breaches
and ensuring confidentiality, integrity, and availability.
Provides continuous monitoring, which can assist
organizations in achieving regulatory compliance
requirements and exhibiting a robust security posture.

Cons:
Complexity caused by significant changes to existing
infrastructure, processes, and technologies.
Increased latency due to additional layers of authentication, authorization, and encryption. It is crucial to optimize and balance security controls to minimize this impact.
Management overhead caused by continuous monitoring, validation, and updating of security policies.
Suboptimal user experience, as users may be required to authenticate more frequently or face additional access restrictions.
High cost driven by investments in new security technologies, staff training, and changes to existing infrastructure.

Secure perimeter
The perimeter-based security model, also known as the
castle and moat approach, is a traditional security paradigm
that focuses on protecting a system by creating a strong
boundary around it. This model relies on trust, where users,
devices, and components within the perimeter are
considered secure, while external entities are treated as
potential threats.
The correct implementation of the perimeter approach
requires a deep understanding of attack vectors and system
surface area. A common mistake many teams make is that
all their efforts focus on securing public APIs while leaving
big holes in other areas (Figure 2.15):

Figure 2.15: Perimeter-based security model

The surface area must include the perimeter of a deployed system as well as the code, binary repositories, and delivery pipeline to ensure that all system components developed internally or adopted from third parties can be trusted and do not contain malicious code. Its main elements are:
Source code: Only authorized developers can access
the code, and the source code must be continuously
checked manually or/and automatically for malicious
code injections.
Binary repositories: All external dependencies must
be checked for safety and authorization before using
them in development. To enhance security, authorized
third-party components can be copied into private
repositories, and access to public repositories can be
restricted for build servers. This ensures that external
dependencies cannot be utilized without appropriate
authorization.
Delivery pipeline: Only a few trusted DevOps
engineers shall be granted access to the build
infrastructure to prevent hackers from injecting
malicious code or binaries during the build process.
Maintenance windows: Maintenance access to
production systems must be tightly controlled. The
number of entry points must be decreased to a
minimum by allowing access to a single maintenance
station (jump server) within a secure perimeter. Only a
few authorized support engineers should be granted
access permissions. Access keys must be frequently
rotated and connections shall be allowed only from
trusted IP addresses.
Public APIs: All public APIs must have authentication, authorization, encryption, and logging. Rate limits must minimize the chances of DoS attacks. Blacklists can help support engineers quickly block suspicious users until a security breach is addressed.
Integration points: Integration points must be
controlled using the same practices that are used for
maintenance windows and public APIs.
Following are the pros and cons of secure perimeter:

Pros:
Easy to understand, manage, and maintain in most
organizations.
Centralized management in the development and build
infrastructure, as well as deployed systems reduces
complexity.
More cost-effective than implementing complex and
granular security models.
Lower latency and higher performance as security is
done at the system perimeter and all secondary calls
between microservices go unprotected without extra
overhead.

Cons:
Limited protection against lateral movement and
insider attacks.
Inadequate for complex manually created
environments with a large surface area that is non-
standard and difficult to control.
False sense of security, as organizations may overlook
the need for additional layers of defence or
underestimate the potential for threats originating
within the network.

Cross-platform deployments
Organizations are moving towards cloud environments,
which offer scalability, cost-efficiency, and flexibility. This
allows for quick deployment and scaling of applications while
reducing the need for physical data centres. However,
software vendors must support multiple deployment scenarios to meet the infrastructure needs of customers.

Problem
Cross-platform deployments require consistent user
experience and functionality across different platforms while
dealing with platform-specific limitations and requirements.
This involves adapting the user interface, optimizing
performance, and integrating with platform-specific APIs or
services. However, this can increase development and
maintenance complexity as developers need to ensure the
application runs smoothly on each target platform while
adhering to unique constraints and guidelines.

Symmetric deployments
In symmetric deployments, infrastructure services must offer
consistent APIs and capabilities so that microservices can be
tested and implemented against them. This ensures that
when microservices are deployed in a new environment, all
infrastructure services adhere to the same set of APIs (Figure
2.16):

Figure 2.16: Symmetric deployment architecture

There are two primary approaches to provide consistent deployment environments across all supported on-premises and cloud platforms.
Lowest common denominator approach, which involves selecting infrastructure services that are available on all platforms. Some of these services are natively supported
and managed by the infrastructure provider, which provides
elasticity and reduces the total cost of ownership. If the
provider does not offer managed services, the team can
provide them in-house. Such services include databases like
MySQL, Postgres, and MongoDB, message brokers like Kafka
and MQTT, caching services like Memcached and Redis, and
container orchestration software like Kubernetes.
Containerized infrastructure approach, which does not rely on the capabilities of the deployment platform and deploys the required containerized infrastructure services next to the system microservices. As container orchestrators like Kubernetes became more capable of reserving capacity and guaranteeing the stability and performance of stateful containers, this approach has become increasingly popular.
Following are the pros and cons of symmetric deployments:

Pros:
Lower complexity as developers only need to code
against a single set of APIs.
Testing is shorter because systems are only tested once
against a single deployment configuration.
Simple deployment configurations that only require
one set of configuration parameters.

Cons:
Potential inconsistency in implementations from different providers.
Lower scalability, higher cost, and more expensive
maintenance for self-managed services when the
deployment platform does not provide adequate
support.
Asymmetric deployments
The asymmetric approach in architecture embraces
differences in infrastructure services and maximizes the
native capabilities of each platform. To minimize differences
the architecture includes an abstraction layer that has
adapters to different infrastructure services and provides a
consistent set of APIs that microservices are coded against
(Figure 2.17):

Figure 2.17: Asymmetric deployment architecture

Depending on the designer's choice, the abstraction layer can be implemented as subcomponents inside microservices, as connector microservices, or both.
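The following minimal sketch illustrates such an abstraction layer. The BlobStorage interface and both adapters are assumptions made for this example; real adapters would call the SDK of the target platform, which is only indicated in comments here.

// Abstraction that microservices are coded against.
interface BlobStorage {
    void put(String key, byte[] data);
    byte[] get(String key);
}

// Adapter for a managed cloud object store; the SDK calls are only sketched in comments.
class CloudObjectStorage implements BlobStorage {
    public void put(String key, byte[] data) {
        // call the cloud provider's object-storage SDK here
    }

    public byte[] get(String key) {
        // call the cloud provider's object-storage SDK here
        return new byte[0];
    }
}

// Adapter for on-premises deployments backed by the local file system.
class FileSystemStorage implements BlobStorage {
    private final java.nio.file.Path root;

    FileSystemStorage(java.nio.file.Path root) {
        this.root = root;
    }

    public void put(String key, byte[] data) {
        try {
            java.nio.file.Files.write(root.resolve(key), data);
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }

    public byte[] get(String key) {
        try {
            return java.nio.file.Files.readAllBytes(root.resolve(key));
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }
}

The deployment configuration then decides which adapter is wired into a microservice on each platform.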
Following are the pros and cons of asymmetric deployments:

Pros:
Higher scalability, lower cost, and easier maintenance
due to platform-managed services, simplified
deployment, and elastic scalability.
Makes implementation future-proof as the abstraction
layer simplifies support of new infrastructure services
and new deployment configurations.
Cons:
Longer testing since systems shall be tested in all
supported deployment environments.
Complex deployment configurations as they require
multiple configuration sets for different types of
infrastructure service.

Tenancy
Software vendors that offer complex business systems using
the SaaS model get to choose between simpler systems built
for a single client and deployed multiple times or a single
system designed to handle multiple clients simultaneously.

Problem
Business systems built for organizations with many users
require significant computing power and ample data storage,
and historically were deployed for each client. However, as
many software vendors migrate to the cloud and offer SaaS
solutions, traditional single-client deployments require too
much effort and may raise infrastructure costs, reducing
vendor profits, increasing prices for customers, and thus,
making products less competitive.

Single-tenancy
Traditional single-tenant architecture, also called isolated
tenancy, provides each customer or tenant with a dedicated
instance of the software system, including its infrastructure,
data storage, and processing resources. Each tenant's data
and deployment environments are separate, ensuring high
data privacy, security, and customization (Figure 2.18):

Figure 2.18: Single tenant architecture

To make the single-tenant architecture more suitable for cloud SaaS deployments, an architect may add a few capabilities into the system:
Common entry: Implement a single entry point with a common sign-in. After users are identified, their requests can be automatically routed to the system deployed for their organization.
Automated upgrades and maintenance: To
decrease maintenance efforts, system deployments and
maintenance procedures must be automated as much
as possible.
Elastic scalability: Elastic scalability minimizes
infrastructure costs by automatically adding resources
during increased load and removing excessive capacity
when not needed, avoiding the expense of reserving
extra resources for client growth.
Following are the pros and cons of single-tenancy:

Pros:
Simple design and development.
Isolation that prevents unauthorized access between
clients and ensures data privacy and security.
Ability to customize deployments to meet unique
customer needs.
Better performance as spikes of usage caused by one
client will not affect others.

Cons:
Higher infrastructure cost as each system deployment
may need the reserved capacity that adds up.
Higher maintenance as each system has to be
deployed, upgraded and maintained independently.

Multi-tenancy
Multi-tenant architecture refers to sharing a single instance
of a software system among multiple tenants, where system
components, infrastructure, and data storage resources are
shared, but each tenant's data is logically separated (Figure
2.19). Multi-tenancy allows for efficient resource utilization,
reduced operational costs, and simplified maintenance by
consolidating resources and infrastructure for multiple
tenants.
Figure 2.19: Multi-tenant architecture

Multi-tenant architecture may not always be a good choice. Most infrastructure services have a practical scalability limit. For example, Postgres can store up to 128 terabytes in a single database, a Kubernetes cluster may have up to 5,000 nodes, and so forth. Although those limits are quite high, there are many situations when the amount of data or the number of transactions required for each client is so high that the system requires partitioning to handle that volume. In those cases, it may be more practical to build multiple simpler single-tenant systems rather than one multi-tenant system with complex partitioning.
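A minimal sketch of logical tenant separation is shown below. The Invoice record and repository are hypothetical; the key point is that the tenant identifier, resolved from the incoming request, is a mandatory filter on every data access.

import java.util.ArrayList;
import java.util.List;

// Hypothetical invoice record stored in a shared data store.
record Invoice(String tenantId, String invoiceId, double amount) {}

class InvoiceRepository {
    private final List<Invoice> table = new ArrayList<>(); // stands in for a shared database

    void save(Invoice invoice) {
        table.add(invoice);
    }

    // The tenant id is a mandatory filter, so one client can never see another client's data.
    List<Invoice> findByTenant(String tenantId) {
        return table.stream()
                .filter(invoice -> invoice.tenantId().equals(tenantId))
                .toList();
    }
}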
Following are the pros and cons of multi-tenancy:

Pros:
Lower infrastructure cost as shared resources are
utilized more efficiently.
Lower maintenance as only one system needs to be
deployed, upgraded, and maintained.

Cons:
More complex design and development as the system
needs to differentiate and isolate clients from each
other.
Potential security breaches when one client may get
access to information from other clients.
Limited customization as a single implementation must
accommodate all clients.
Spikes in usage caused by one client may affect other
users.

Development stacks
A development stack is a collection of programming
languages, frameworks, libraries, and tools that developers
use to create, deploy, and manage software applications.
Choosing the right development stack is crucial for project
efficiency, maintainability, and success in the constantly
changing world of software development.

Problem
Choosing the right stack for microservice development is a
critical and complex decision that impacts system success,
scalability, and maintainability. Organizations evaluate
several programming languages, frameworks, libraries, and
tools while balancing factors like development speed,
performance, and expertise. The chosen stack should meet
current and future project requirements with flexibility and
adaptability for system growth. Failure to select the
appropriate stack may lead to increased costs, reduced
productivity, maintenance difficulties, and limited scalability,
ultimately affecting project success and viability.

Platform-specific frameworks
When the deployment platform is set and not expected to
change in the future, developers may use platform-specific
stacks that embrace the native platform capabilities and
programming models. Here are a few current primary
options:
AWS Lambda: AWS Lambda is a popular serverless
compute service that allows you to run Java code in
response to events, such as changes to data in an
Amazon S3 bucket or an Amazon DynamoDB table. It
automatically manages the underlying compute
resources, scaling the application in response to
incoming requests.
Azure Functions: Azure Functions is a serverless
compute service from Microsoft that lets you run Java
code in response to events or triggers, such as HTTP
requests, messages in Azure Service Bus queues, or
changes in Azure Blob Storage. It automatically scales
based on demand and requires no infrastructure
management.
Google Cloud Functions: Google Cloud Functions is a
serverless compute service that allows you to execute
Java code in response to events, such as changes in
Google Cloud Storage or messages in Google Pub/Sub.
It takes care of the underlying infrastructure,
automatically scaling your application based on
demand.
Vert.x: A toolkit for building reactive applications on
the Java Virtual Machine (JVM). Vert.x is designed
for high concurrency, low latency, and scalability,
which are essential qualities for microservices. It also
supports a range of programming languages in
addition to Java, including Kotlin, Groovy, and Scala.
Jakarta EE: Jakarta EE (previously Java Enterprise
Edition) is a framework for building scalable and
reliable Java applications. It offers a standardized set
of APIs, components, and specifications for developing
distributed, multi-tiered, and service-oriented
applications. Java Servlets enable developers to
package microservices within stateful or stateless
servlets and deploy them in JEE servers.

Cross-platform frameworks
When a system needs to be deployed across multiple
deployment platforms, it is a good idea to find a
development stack that allows developers to build
microservices once and deploy them across multiple
platforms. Some options support bare-metal, containerized
and serverless deployments, while others are designed just
for serverless functions:
Quarkus: A modern, Kubernetes-native framework
designed for building lightweight high-performance
microservices. Quarkus offers a fast startup time and
low memory footprint, making it ideal for containerized
and serverless environments. It also provides a wide
range of extensions and support for common Java
libraries.
Micronaut: A full-stack framework focused on
performance, minimal memory footprint, and ease of
development. Micronaut offers built-in support for
microservices patterns, cloud-native features, and
reactive programming, making it a strong contender
for building modern Java-based microservices.
OpenFaaS: OpenFaaS (Functions as a Service) is an
open-source serverless framework that supports Java
and other languages. It allows you to build, package,
and deploy functions on any container orchestration
platform, such as Kubernetes, Docker Swarm, or the
OpenFaaS managed platform. OpenFaaS focuses on
providing a simple, developer-friendly experience for
creating serverless applications.
Fn Project: Fn Project is an open-source, container-
native serverless platform that supports Java and other
languages. It allows developers to build, deploy, and
manage serverless applications using Docker
containers, with built-in support for scaling, load
balancing, and monitoring. Fn Project can run on any
infrastructure, including public clouds, private data
centers, or on your local machine.

Polyglot and cross-platform frameworks


The last group of microservices frameworks offers cross-platform deployment capabilities and multilingual implementations, which are beneficial in polyglot environments and allow developers to transition to new languages with ease. Here are a few options:
Spring Boot: Spring Boot is a Java framework that
simplifies microservice development, deployment, and
management. It includes pre-configured templates,
starter projects, and embedded servers. It supports
Java, Kotlin, and Groovy and has been ported to other
languages. Spring Boot can be deployed on various
platforms including containers (Docker, Kubernetes,
and OpenShift), virtual machines, bare metal servers,
and cloud platforms like AWS, Azure, and Google Cloud (a minimal example follows this list).
Distributed Application Runtime (DAPR): Dapr is a
portable runtime for building microservices and
distributed applications across different platforms and
programming languages. It provides building blocks for
common tasks like state management, messaging, and
service invocation. It supports multiple languages,
including Java, JavaScript, Python, and .NET. Dapr
applications can be deployed on Kubernetes, Docker,
self-hosted environments, and cloud platforms like
AWS, Azure, and Google Cloud.
Pip.Services toolkit: Pip.Services toolkit facilitates
the development of cloud-native, containerized apps
using microservices architecture. It is language-
agnostic, with components for Java, Node.js, Python,
Go, and .NET. Microservices built with this toolkit can
be deployed on various platforms including Docker,
Kubernetes, OpenShift, VMs, bare metal servers, and
cloud platforms like AWS, Azure, and GCP.
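For illustration, the following is a minimal Spring Boot microservice exposing a single HTTP endpoint. The class name and endpoint path are assumptions made for this example; the same artifact can be packaged as a fat JAR or a container image and deployed on any of the platforms listed above.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class GreetingServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(GreetingServiceApplication.class, args);
    }

    // A lightweight HTTP endpoint; the packaged artifact runs unchanged on VMs,
    // Kubernetes, or cloud platforms.
    @GetMapping("/greetings/{name}")
    public String greet(@PathVariable String name) {
        return "Hello, " + name + "!";
    }
}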

Conclusion
This chapter discussed key architectural patterns used to
design complex microservices systems. It also clarified what
microservices are, reviewed decomposition methods and
went through architectural styles for communication,
business logic, security and system deployments. The next
chapter will explore patterns used to organize and document
microservices’ code effectively.

References
[1] E. Evans. 2004. Domain-Driven Design: Tackling
Complexity in the Heart of Software. Addison-Wesley,
Massachusetts, USA
Further reading
T. Welemariam. Microservice Architecture and Design Patterns for Microservices. Medium. Dec 16, 2022. Available at https://medium.com/@tewelle.welemariam/microservice-architecture-and-design-patterns-for-microservices-6fa1a0d0876c
S. Ravi. Designing Event-Driven Architecture. Medium. Jan 6, 2023. Available at https://awstip.com/designing-event-driven-architecture-1681b9ad6e65
Love. Principles of Domain-Driven Design for Microservices. Medium. Aug 26, 2022. Available at https://medium.com/@isaiahlove085/principles-of-domain-driven-design-for-microservices-43706f2bbd32
Global Technology. Behind the scenes: McDonald's event-driven architecture. Medium. Aug 24, 2022. Available at https://medium.com/mcdonalds-technical-blog/behind-the-scenes-mcdonalds-event-driven-architecture-51a6542c0d86

Join our book’s Discord space


Join the book's Discord Workspace for latest updates, offers,
tech happenings around the world, new releases, and
sessions with the authors:
https://discord.bpbonline.com
CHAPTER 3
Organizing and
Documenting Code

Introduction
This chapter covers patterns used to organize and document
microservices code effectively. Microservices architecture
allows individual developers and teams to work in parallel
with minimal coordination. Consequently, without proper
structures and rules, developers may cut corners, skip
documentation, and violate microservices boundaries,
leading to a disorganized codebase. To help prevent this, this
chapter teaches you how to structure, document, and
manage microservice code generation.

Structure
In this chapter, we will cover the following topics:
Code repositories
Mono-repo
Multi-repo
Workspace
Code structure
Functional / Domain-driven code structure
Type / Technology-based code structure
Code sharing
No code sharing
Shared libraries / versioned dependencies
Sidecar
Code compatibility
Full backward compatibility
Namespace versioning
Minimalistic documentation
Handwritten documentation
Commit messages
Auto code documentation
JavaDoc generation
Auto-generated comments
Code reviews
Pull request reviews
Periodic reviews
Review checklist
Automated code checks
Microservice chassis
Antipatterns

Objectives
After studying this chapter, you should be able to choose and
set rules to structure your microservices codebase, share
code without constantly breaking microservices, provide
clear documentation with minimal effort, maintain code
quality via an effective review process, minimize onboarding
time and, ultimately, guarantee long-term sustainable and
productive development.

Code repositories
Structuring the microservices codebase starts at the
repository level. A proper structure must allow for clearly
identifying every component in the codebase, its type,
purpose, and ownership. The high-level breakdown must be
clear not only to the developers who spent months working
with that code but also to any newcomer or manager without
development experience.

Problem
Source code is valuable and costly to produce. It is a crucial
part of a company’s intellectual property (IP).
Unfortunately, many companies store their code
haphazardly, making it difficult to locate and manage. In
some cases, code may have been created without a clear
purpose or left unattended. As a result, only a few
developers may be familiar with the codebase, and if they
leave, the company may lose control over it, making it
impossible to continue development.

Mono-repo
A mono-repo stores microservices belonging to a group or
company in a single repository (See Figure 3.1). It is a
common choice for large companies like Google but can lead
to disorganization and boundary violations without clear
rules and oversight.
Following are the pros and cons of a mono-repo:

Pros:
All code required for a project can be pulled in one
step.
Branches and tags can span across multiple
microservices, making it easier to define baselines and
use them in delivery processes.
Feature pull requests or feature branches can include
multiple microservices, allowing isolated development
of a particular feature.
Developers can push changes in a single commit
(although this practice violates the independence of a
microservice lifecycle).

Cons:
Easy to violate microservice boundaries. As a result,
sharing code without formal dependencies may turn
code into a monolith.
Easy to create scripts that span across multiple
components, couple their delivery processes, disrupt
their lifecycle and turn code into a monolith.
Easy to create unplanned components or store code no
one knows about, manages and maintains.
Some build servers have limited support for multiple
pipelines. That complicates the creation of CICD
pipelines for microservices.
Not possible to manage access rights at the component
level. Once a person gets access to a mono-repo, they
get access to all components stored in it.
These are some recommendations for organizing a
monorepo and maximizing its advantages while minimizing
its drawbacks:
Choose the mono-repo's granularity, either for the
whole organization or per team, product, or functional
group, and keep it consistent across all teams and
products.
Use Version Control System’s (VCS) mechanisms to
group related mono-repos. GitHub uses
"organizations," Gitlab uses "groups", and Bitbucket
uses "projects."
Define conventions to name groups of mono-repos like
[<organization>-]<product|team>-<group>

Use a flat structure within the repository, placing all
components in a root folder. Each folder in the repo
should be a formal component.
Define conventions to name components in mono-repo,
like <type>-<name>-<language | dev stack>
The component type could be defined as:
service: a backend (micro)service
client: a client library
process: a backend process
façade: an external facade
sdk: a client SDK
app: a frontend (micro)application
mfe: a frontend microfrontend
package: a product or component deployment package
image: a docker image
script: an automation script
test: system-level functional and non-functional tests
Typical languages or development stacks:
java: Java components
dotnet: .NET components
python: Python components
ps: PowerShell scripts
docker: Docker images
js: JavaScript components
react: React.js libraries or frontends
ng: Angular libraries or frontends
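Combining these conventions, hypothetical component folder names in a mono-repo might look like this (illustrative examples only):
service-orders-java
client-orders-java
facade-storefront-java
app-admin-react
image-orders-docker
test-storefront-java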
Include a README.md file in the root folder that outlines the
monorepo's contents, rules, and responsible parties:

Figure 3.1: Mono-repo on Bitbucket

Multi-repo
A multi-repo stores each microservice in a separate code
repository to ensure their independence and isolation. It
needs minimal maintenance but has development overheads
and cannot support common practices like pull requests
across multiple components.
Following are the pros and cons of multi-repo:

Pros:
All components are highly visible to everyone.
Additional barriers prevent bad code-sharing practices,
violating microservice boundaries and their lifecycle.
Little or no oversight is needed to maintain a proper
structure of the codebase.
Possible to control access at the repository
(component) level.
Easier to manage since it is clear who is responsible for
each repo.
No issues automating CICD pipelines per microservice
for incremental delivery.

Cons:
Extra overhead to approve, create or delete
repositories (which could be a good thing).
Extra steps to pull components to work on a project.
Not possible to use some traditional code management
techniques like feature pull requests, branches, and
tags across multiple microservices.
Not possible to change multiple microservices in a
single commit (which is also a good thing).
These recommended practices optimize multi-repo
structures and alleviate their limitations:
Define conventions to name groups of multi-repos like
[<organization>-]<product|team>-<group>

Every repository must be a formal component.


Define conventions to name component repositories
like [<subgroup>-]<type>-<name>-<language | dev stack>, where
the subgroup name is optional.
For complex microservices systems, use VCS
mechanisms to group component repositories. GitHub
uses "organizations" (one per logical group), Gitlab
uses "groups", and Bitbucket uses "projects" (as shown
in Figure 3.2):

Figure 3.2: Multi-repo on Gitlab


Note: For conventions on component type and typical languages, see
the Mono-Repo pattern.

Workspace
In the context of code repositories, a workspace is a designated
area set up to hold work-related code and artifacts that do not
belong to any formal component.
Problem
From time to time, teams need to store artifacts they use in
their work that do not fit the definition of a formal,
long-lived component. Or they may need a script that operates
on multiple components but does not belong to any of them.

Solution
A Workspace is a component for project teams to store code
and artifacts separate from the repository rules. It tells
everyone that the content stored in it is temporary, does not
represent production code, and does not need maintenance.
A Workspace also enables a mono-repo experience with a multi-repo
by using clone and push scripts. Developers can clone the
workspace repository to check out components related to a
project and execute the clone script to clone other
repositories. They can use the push script to push changes
across multiple components. The following figure shows an
example of a workspace:
Figure 3.3: Workspace in GitHub multi-repo

Code structure
Code structure refers to the way in which the code is
organized and arranged into logical components and
modules. It is important for the structure to be clear and
intuitive, as this can enhance the code's readability,
maintainability, and scalability.

Problem
There are many ways to organize code in a microservice (or
a software component in general). However, without clear
rules, code organization in microservices can be inconsistent.
The two most common ways to organize code are
functional/domain-driven and technology-based.

Functional / domain-driven code structure


Functional/domain-based code organization is suitable for
monolithic systems. For microservices, this may not be the
case, as they have a single responsibility and are relatively
small, making the code belong to a single or a few functional
areas.
Facades are an exception to this, as they contain multiple
controllers and export complex APIs. Thus, it is better to
separate the controllers based on their domains instead of
putting them all in a single package. A typical file
organization may look like this:

Figure 3.4: Functional/domain-based organization
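For instance, a facade's controllers might be grouped into domain packages along these lines (a hypothetical layout; the package and class names are examples only):
com.orgname.facade
  customers/   CustomersController.java, CustomersRoutes.java
  orders/      OrdersController.java, OrdersRoutes.java
  products/    ProductsController.java, ProductsRoutes.java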

Following are the pros and cons of the functional/domain-driven code structure:

Pros:
Separates code that belongs to different functional
areas/domains and adds logical meaning.
Good fit for facades with complex APIs used to
organize controllers.

Cons:
Not very useful for regular microservices since their
code often belongs to a single domain.

Type / Technology-based code structure


In the type-based structure, microservice code is organized
according to class types. Since microservices have similar
designs, their code will follow a similar structure. It makes it
easy to navigate through the code and relieves developers
from the need to figure out the code structure on their own.
A typical type-based organization may look like this:

Figure 3.5: Type/technology-based organization
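As an illustration, a type-based layout for a single microservice might look roughly like this (a hypothetical structure; the package names are examples only):
com.orgname.service.orders
  controllers/   OrdersController.java
  services/      OrdersService.java
  persistence/   OrdersRepository.java
  clients/       InventoryClient.java
  data/          OrderV1.java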

Following are the pros and cons of the type/technology-based
code structure:

Pros:
Separates code by its responsibilities (type).
A good fit for microservices with standardized designs.

Cons:
Not very useful for fine-grained microservices, as they
implement a few features that often belong to a single
domain.

Code sharing
Code sharing is a crucial aspect of modern software
development, allowing developers to collaborate, save time,
and improve efficiency. In this section, we will examine the
different ways in which code can be shared and the pros and
cons of each approach.

Problem
It is a good thing to standardize microservices
implementations and introduce common patterns, as this
avoids code duplication. Besides, by following the DRY
(Don't Repeat Yourself) principle, developers tend to extract
and share generic functions and abstract classes across
microservices. However, when not handled with care, changes
in the shared code can break all microservices that use it
(See Figure 3.6). This is probably the main cause of coupling
in microservice implementations, and it can also lead to a
distributed monolith architecture.
Figure 3.6: Potential impact of breaking changes in a shared library on
dependent microservices

No code sharing
Shared libraries causing waves of breaking changes are
common and painful, leading some teams to ban them
altogether (See Figure 3.7). Code duplication is often
considered a lesser evil than days lost fixing broken
microservices when shared libraries change. Some teams go
even further and limit the use of third-party libraries unless
they are confident in their stability and backward
compatibility, preferring to spend more time creating a
stable codebase.

Figure 3.7: Isolating impact of breaking changes with code duplication

Often, teams with a "no code sharing" policy create
microservice templates to enhance consistency and speed
up development. These serve as starting points for new
microservices, with each microservice carrying a full copy of
the code. After that, updates only introduce new functionality
and never changes in dependencies.
Following are the pros and cons of no code sharing:

Pros:
Allows the development of extremely stable code that is
rarely broken by unplanned changes in external
dependencies.
Cons:
Causes significant code duplication, which slows down
development and complicates maintenance.

Shared libraries / versioned dependencies


To prevent problems with breaking changes from shared
code, teams can add shared libraries as formal dependencies
managed by a package manager (refer to Figure 3.8).
Semantic versioning ensures microservice code only uses
bug-fix and minor updates for a specific version. Developers
must adhere to semantic versioning rules and apply major
version increments for breaking changes.

Figure 3.8: Isolating impact of breaking changes using versioned dependencies

Unfortunately, this method has a significant issue in
programming languages such as Java and .NET, where it is
not compatible with transitive dependencies (as shown in
Figure 3.9). In these cases, when a microservice uses two
client libraries built on different shared library versions, the
code executes against the newest version, breaking one of the
client libraries. To overcome this problem, developers responsible
for shared libraries must follow one of the code compatibility
patterns described below.
Figure 3.9: Breaking microservice caused by a version conflict in transitive
dependencies

Following are the pros and cons of shared libraries /
versioned dependencies:

Pros:
Makes microservice code more compact, speeds up
development, and simplifies maintenance.

Cons:
Can break dependent microservices when developers
do not follow versioning rules.
Have issues with transitive dependencies.

Sidecar
The Sidecar pattern separates shared functionality from
microservices, interacting via a backward-compatible
interface. It is popular among platform builders to enhance
microservices without code changes. Frameworks like DAPR
use it to share functionality across different languages with a
simple SDK (as shown in Figure 3.10):
Figure 3.10: Extending or augmenting microservice using a Sidecar

Following are the pros and cons of sidecar:

Pros:
Limited touchpoints between microservices and
Sidecars imply low risks of breaking the microservice.
Able to augment existing microservices by intercepting
their communication.
It can be written in a language different from the
connected microservice.
It can be updated in real-time without having to touch
or change the microservice code or container.

Cons:
An additional overhead caused by Sidecar processes.
Latency caused by communication between a
microservice and its Sidecar.

Code compatibility
Code compatibility refers to the ability of code to work
seamlessly across different platforms, operating systems,
programming languages, and versions. In this section, we
will explore some key patterns for achieving this in software
development.

Problem
Sharing code between microservices with a library carries
the risk of breaking dependent microservices when this
library is modified. This can disrupt development and require
significant time to fix, retest, and rerelease affected
microservices.

Full backward compatibility


One way to avoid breaking changes is to provide full
backward compatibility for the entire duration of a library’s
lifespan (as shown in Figure 3.11). To guarantee it, the
library must be well covered by automated tests. When new
functionality is added, all previously created tests must stay
intact and continue running in regression testing.

Figure 3.11: Shared library with full backward compatibility
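As a minimal illustration (the class and method names are hypothetical, not from a real library), a shared library can preserve full backward compatibility by adding new methods and deprecating, rather than removing or changing, the old ones:
public class PriceCalculator {
    // The original method is kept unchanged so existing callers continue to compile and work.
    @Deprecated
    public double calculate(double basePrice) {
        return calculate(basePrice, 0.0);
    }

    // New functionality is added as an overload instead of altering the old signature.
    public double calculate(double basePrice, double discount) {
        return basePrice * (1.0 - discount);
    }
}
The automated tests written for the old method stay intact and keep running in regression testing, as described above.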

Following are the pros and cons of full backward compatibility:
Pros:
Microservices are not affected and do not need
changes when they are upgraded to a new version of a
shared library.
Works with transitive dependencies.

Cons:
Requires careful design and implementation.
Forced to carry forward old (obsolete)
implementations.

Namespace versioning
An alternative approach is to treat every major release of a
shared library with breaking changes as a new library (as
shown in Figure 3.12). This can be done by placing it as a
separate component with a new name that has a version
number in it. And to prevent name conflicts between
overlapping classes when both libraries are used in transitive
dependencies, their namespaces and, sometimes, class
names should also be versioned. The following example
shows how this can be done:
shared-library:
  /com.orgname.sharedlibrary
    SharedClass.java
shared-library2:
  /com.orgname.sharedlibrary2
    SharedClass[V2].java
Figure 3.12: Major releases of shared library V1 and V2 implemented by two
separate libraries with different namespaces to avoid naming conflicts
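With versioned namespaces, a microservice can even use both major versions side by side without conflicts. A hypothetical sketch, based on the naming above:
// Both libraries can coexist on the classpath because their packages (and class names) differ.
import com.orgname.sharedlibrary.SharedClass;       // major version 1
import com.orgname.sharedlibrary2.SharedClassV2;    // major version 2

public class MigrationExample {
    public void run() {
        SharedClass legacy = new SharedClass();       // still used by an older client library
        SharedClassV2 current = new SharedClassV2();  // used by newly written code
        // Callers can be migrated from legacy to current incrementally.
    }
}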

Following are the pros and cons of Namespace versioning:

Pros:
Works with transitive dependencies.
No need to carry forward obsolete implementations.

Cons:
Microservices code needs changes when they are
upgraded to a new version of a shared library.
It may increase the number of microservice
dependencies.

Minimalist documentation
Code documentation refers to the process of describing the
functions, variables, classes, and modules of code with the
aim of making it easier for other developers to understand
and use the code. Within this section, we will investigate
various approaches for achieving efficient code
documentation.
Problem
Developers often overlook documentation, viewing it as a
low-value and laborious task. This can result in insufficient
documentation or even no documentation at all.
Unfortunately, this can cause a significant time investment
for anyone unfamiliar with the code who needs to use or
modify it.

Handwritten documentation
To alleviate this burden, require only the bare minimum of
handwritten documentation and provide pre-made templates
for developers to use. The following documents are
particularly critical:

Readme
README files are presented by the majority of the popular
Version Control Systems when a repository or source code
folder is opened (as shown in Figure 3.13). It is
recommended to add the following information to them:
Name of the component and a brief description of it.
Quick links to the relevant documentation, including
requirements, architecture, CHANGELOG, and API
documentation.
List of key features of the component.
A description of how to get access to the component
binaries and source code. Provide links to the code and
binary repositories, and required credentials for access
control.
Guidelines on how to develop the component. Describe
the steps to set up the development environment and
commands to build and test the component.
A description of how to use the component. Describe a
few main usage scenarios and demonstrate code
snippets.
And finally, a list of the people responsible for the
component with their contact details.

Figure 3.13: Example of a README.md file

Changelog
The CHANGELOG file shows a history of changes
implemented in the component. It tells what features were
implemented, when, and who did it. Also, it lists fixed defects
and any breaking changes. Its recommended structure is:
File title with the component name
Sections for each component release with a version
number, date, and the person who made the changes
New features
Fixed defects
Breaking changes
The following figure is an example of a typical Changelog
file:

Figure 3.14: Example of a CHANGELOG.md file

Todo
The TODO file is an option used to keep a list of planned
changes in the component: technical debt or code
improvements (as shown in Figure 3.15). Also, it is a good
place to list corrective actions from code reviewers. Its
recommended structure is:
File title with the component name
Sections for todo items with the type of change, date,
and the person who suggested the changes
List of todo items clearly stating what needs to be done
and why

Figure 3.15: Example of TODO.md file

Commit messages
Well-written commit messages can greatly increase the
transparency of the work and tell code reviewers what was
changed and why. Good commit messages can even be used
to automatically generate a CHANGELOG file.
The recommended structure of a commit message is:
Subject line: <type>[(scope)]: description, where:
Type (mandatory) is one of the standard commit
types listed below
Scope (optional) is an area of code affected by the
change or simply an epic name
Description (mandatory) is a short description of the
change containing 50 characters or less
Body (optional): a detailed explanation of the change
Footer (optional): a list of additional descriptors with
information about who designed, reviewed, or
approved the change and associated ticket numbers.
For instance: Designed-by: AlexN, Reviewed-by: JohnK,
Jira-ticket: INS-124
Standard commit types may include:
feat: introduces a new feature.
fix: patches a bug in your codebase (bug fix or hotfix).
build: introduces changes that affect the build system
or external dependencies.
chore: updates dependencies and does not relate to fix
or feat and does not modify src or test files.
ci: introduces changes that affect the continuous
integration process.
docs: updates the documentation or introduces
documentation.
style: updates the formatting of code; removes white
spaces, adds missing spaces, removes unnecessary
newlines.
refactor: introduces refactored code segments to
optimize readability without changing behavior.
perf: improves performance.
test: adds, removes, or updates tests.
revert: reverts one or many previous commits.
The requirement to specify one of the standard change types
forces developers to work in short increments focusing on
the clear outcome, instead of bundling too many changes in
a single commit that could be hard to review and fix if they
break anything.
An example of a simple feature commit:
feat(customers): Send an email to the customer when
a product is shipped
An example of a bug fix commit with additional references:
fix: Fixed NullPointerException when product
description is not set
Reviewed-by: AntonK
Jira-Bug: PROD-1123
An example of code refactoring that caused a breaking
change:
refactor: Updated code to use features from Java 19
BREAKING CHANGE: The component now requires Java 19
or higher

Auto code documentation


Auto code documentation is the process of automatically
generating documentation for software code, without
requiring manual effort from developers. In this section, we
will explore some of the tools and techniques used to
automatically generate documentation from source code.

Problem
When a software component exposes a public API that can
be used by other developers, it requires good
documentation. In situations where an API represents a
formal product used by thousands of developers,
writing good documentation may require a professional
technical writer. However, in most cases, it can be generated
automatically from comments in the source code.

JavaDoc generation
JavaDoc is a standard tool that uses specially formatted
comments to generate API documentation in HTML format.
The comments can be associated with classes, their fields,
and methods. They commonly consist of two parts:
The description of what we are commenting on
The standalone block tags (marked with the “@”
symbol), which describe specific meta-data
An example of a comment for a class method (Code snippet
3.1):
/**
 * <p>This is a simple description of the method. . .
 * <a href="http://www.bestcars.com">Get a car!</a>
 * </p>
 * @param incomingSpeed the amount of incoming car speed
 * @return the amount of driven distance
 * @see <a href="http://www.link_to_jira/CAR-777">CAR-777</a>
 * @since 1.0
 */
public int drive(int incomingSpeed) {
    // do things
    return 0;
}

Autogenerated documentation for the class of the example
above would look like this:

Figure 3.16: Example of autogenerated JavaDoc API documentation

Auto-generated comments
The software development landscape is undergoing a
transformative shift with the introduction of advanced
Language Model-based tools. Notable examples include
Codeium, a state-of-the-art code generation platform, and
GitHub Copilot, an AI-powered coding assistant. These tools
harness the capabilities of Large Language Models
(LLMs) to automate various aspects of development,
including code, comments, and testing, streamlining the
entire process.
In terms of code documentation, both Codeium and GitHub
Copilot play a vital role by automatically generating
comments, explanations, and inline documentation. They
achieve this by understanding the context and offering
relevant and informative documentation, making it easier for
developers and others to comprehend the code.
Furthermore, AI technology has given rise to dedicated
documentation tools like Mintlify, which automatically
generates code comments for various programming
languages, including Java (example in Figure 3.17):

Figure 3.17: Use of Mintlify plugin to automatically generate comments in Visual Studio Code

Code reviews
Code reviews are one of the most effective techniques to
improve code quality. They catch issues missed by automated
tests and prevent costly downstream fixes. They also enforce
developers' adherence to recommended standards, patterns,
and practices.

Problem
Poorly organized reviews waste time and offer little value
when reviewers lack knowledge of the components, coding
skills, clear guidelines, or when the review process causes
lengthy delays in the development cycle.

Pull request reviews


Pull Request Reviews are a popular technique supported by
most VCSs. They suspend merging of the submitted changes
until one or a few reviewers approve them. If issues are found,
the submission is rejected and sent back to the developer for
corrective actions and resubmission.
The Pull Request Review dialog on GitHub allows reviewers to
enter their review results and reject the submission if issues
are found, as depicted in Figure 3.18:
Figure 3.18: Pull Request Review dialog in GitHub

This process works well for colocated development teams
when reviews are treated with priority. However, in
distributed teams, developers can wait for hours while they
are blocked by reviewers.
Following are the pros and cons of pull request reviews:

Pros:
A formal process that is strictly enforced.
A clear indication of what changes were made.
Clear records of who did the review and what they
found.
Requested corrective actions are automatically
captured and passed to developers.

Cons:
Can introduce significant delays in the development
process and lower developers’ productivity.
It can consume significant time from senior developers,
interrupt their flow, and put them into low-productive
multitasking mode.

Periodic reviews
Periodic reviews, scheduled before a release or special
event, are an alternative to just-in-time reviews. Reviewers
can examine commit histories to identify changes since the
last review, and track corrective actions in a TODO file. This
approach suits distributed teams and is less onerous for
senior developers, but requires team maturity and is usually
not strictly enforced.
Following are the pros and cons of periodic reviews:

Pros:
Do not disrupt senior developers responsible for
reviews.
Do not block developers, and do not lower their
productivity.

Cons:
Are not formally enforced.
Require a special mechanism to collect and track
corrective actions.

Code review checklist


Development teams can improve the effectiveness of code
reviews by using formal checklists that reviewers must
follow. These checklists typically include some of the
following sections:
Functionality
Does it implement the required features correctly?
Does it handle boundary or error conditions?
Readability, code syntax, and formatting
Is the code clear and concise?
Does it follow the recommended naming
conventions?
Is the code formatted properly?
Are identifiers clear and meaningful?
Design principles
Is the code properly planned and designed?
Does it have a clear Separation of Concerns?
Is the code properly structured?

Auto code checks


Automated checks can improve the review process, including
code formatting and verification of style, structure, and
documentation with linters. Java has several linters available,
such as Checkstyle (https://checkstyle.org/), which can
verify code style and documentation. Checkstyle offers
preconfigured Sun and Google styles, as well as the
option to define custom rules implemented in Java. Here is a
typical configuration fragment for Checkstyle (Code snippet 3.2):
<module name="Checker">
  <module name="JavadocPackage"/>
  <module name="TreeWalker">
    <module name="AvoidStarImport"/>
    <module name="ConstantName"/>
    <module name="EmptyBlock"/>
  </module>
</module>

To run Checkstyle from the command line using the Google style,
execute the following command (Code snippet 3.3):
> java com.puppycrawl.tools.checkstyle.Main -c /google_checks.xml src/

Or, add this reporting plugin configuration to pom.xml (under the
<reporting>/<plugins> section) to run Checkstyle with Maven
(Code snippet 3.4):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>3.2.1</version>
  <reportSets>
    <reportSet>
      <reports>
        <report>checkstyle</report>
      </reports>
    </reportSet>
  </reportSets>
</plugin>

Then, run the following Maven command (Code snippet 3.5):
> mvn checkstyle:checkstyle

In the realm of static code analysis, SonarQube stands as a
prominent tool, offering a robust platform that thoroughly
evaluates code quality and security by identifying issues,
measuring code complexity, and ensuring adherence to
coding standards. It provides developers with actionable
feedback and allows teams to continuously monitor and
improve their code, making it an invaluable asset in software
development and maintenance.
In addition, there is a growing number of tools that
incorporate cutting-edge AI-based solutions. These AI-based
tools contribute to more efficient and effective code reviews,
fostering higher code quality and reducing the likelihood of
defects and vulnerabilities in software projects. As software
development practices continue to evolve, these tools
represent a crucial step toward more automated, intelligent,
and comprehensive code analysis.
For example, "PR-Agent" by Codium-ai employs advanced
artificial intelligence and machine learning techniques to
enhance code reviews. It intelligently examines code
changes, identifies potential issues, and offers suggestions
for improvement, significantly streamlining the review
process.

Microservice Chassis / Microservice template


Microservice Chassis is a pattern that gives teams a reusable
foundation of boilerplate code and recommended patterns for
building new microservices. This section covers its aspects,
benefits, and implementation.

Problem
To improve efficiency, development teams often create a
Microservice Chassis pattern - a set of templates with
boilerplate code containing essential pieces and
recommended design patterns - to use when developing new
microservices. This is necessary as microservices systems
can grow quickly and have numerous components.

Solution
A basic template that can be used as a starting point for any
new microservice should include the following recommended
elements and patterns (as shown in Figure 3.19):
A standard component structure (See Code Structure
pattern)
Minimal documentation (See Minimal Documentation
pattern)
Build scripts and definition of CICD pipeline (See
Dockerized build scripts pattern)
Externalized configuration (See configuration patterns)
Synchronous and/or asynchronous communication
(See Synchronous Communication and Asynchronous
Communication patterns)
Communication interface versioning (See Interface
Versioning pattern)
Error handling and propagation (See Error
Propagation pattern)
Observability (See Logging and Performance Metrics
patterns)
Automated tests (See Unit tests and Integration Tests
patterns)
Packaging into a deployment component such as
docker image, Jakarta servlet, or a serverless function
(See microservice packaging patterns)
Figure 3.19: Patterns recommended for a basic microservice template

Template engines like T4 or StringTemplate can be used by
teams to generate microservices with specified parameters,
but they require frequent updates and become outdated
quickly. An alternative approach is to select one or a few
golden standard microservices and use them to create new
microservices through basic search and replace operations.
This method reduces the need for template maintenance and
ensures that the latest patterns and standards are reflected.
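As a rough sketch of the template-engine approach, generating a boilerplate class with StringTemplate (ST4) could look like this; the template text and names are illustrative only:
import org.stringtemplate.v4.ST;

public class ChassisGenerator {
    public static void main(String[] args) {
        // A trivial template; a real chassis would use full file templates.
        ST template = new ST("public class <name>Controller { /* boilerplate */ }");
        template.add("name", "Orders");
        String source = template.render();
        System.out.println(source); // public class OrdersController { /* boilerplate */ }
    }
}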

Antipatterns
These are a number of antipatterns that are common in code
organization and documentation of microservices systems:
Poorly structured monorepos: Deeply nested folders and
fuzzy component boundaries can obscure the structure of a
codebase, making it difficult to see all components, their
ownership, type, and purpose. This can lead to pockets of
unmanaged code and cause management to lose control of
the codebase, relying on developers for information on its
structure and status.
Code sharing in monorepo: Monorepo makes it easy to
reference shared code without formal versioned
dependencies. This creates coupling at the code level and
turns the system into a distributed monolith.
Monolithic build and deployment: Teams may choose to
use a single build and deployment process to handle multiple
software components instead of setting up individual
pipelines for each component. This can be done manually or
with advanced tools, but it leads to tightly coupled
microservice lifecycles and the creation of a distributed
monolith.
Broad code changes: Monorepos let developers modify
multiple components at once and deploy them with one
commit, but this contradicts incremental delivery in
microservices. Instead of changing, testing, and releasing one
microservice at a time, broad changes destabilize the system
without proper versioning. This creates extra work for the
team and slows down release cycles. Such practices reflect a
monolithic mindset and are unsuitable for microservices.
As seen above, most antipatterns occur when code is
stored in a mono-repo. In contrast, a multi-repo creates physical
boundaries between microservices that are hard to violate.

Conclusion
In this chapter, we learned to organize code repositories,
define our microservice code structure, write good
documentation with minimum effort, share code between
microservices properly, ensure its backward compatibility,
and organize code reviews to get the maximum value out of
them. All those patterns help to effectively manage the
codebase and do not let it turn into a ball of mud. At the end
of the chapter, we discussed the Microservice Chassis
pattern that represents a microservice template used by a
development team as a starting point to create new
microservices. This pattern contains an implementation of
many patterns that we will discuss in the following chapters.
We finished with a description of several common
antipatterns.
Further reading
1. R. Vargas. Monorepo vs Polyrepo. Medium. Oct 21, 2021. Available at https://medium.com/avenue-tech/monorepo-vs-polyrepo-4e5ccf2b4362
2. F. Kuzman. Domain-Driven Instead of Technology-Based Project Structure. Medium. Dec 29, 2022. Available at https://medium.com/@f.s.a.kuzman/domain-driven-instead-of-technology-based-project-structure-db3b34c3fd2d
3. Agrawal. Library vs Service vs Sidecar. Medium. Jan 23, 2022. Available at https://atul-agrawal.medium.com/library-vs-service-vs-sidecar-ff5a20b50cad
4. J.M. Carroll. 1998. Minimalism Beyond the Nurnberg Funnel. The MIT Press. London, England.
5. C. Wu. Microservices start here: Chassis Pattern. Medium. Feb 13, 2023. Available at https://medium.com/starbugs/microservices-start-here-chassis-pattern-f1be783c522b
6. M. Šušteršič. How to Code Review. Medium. Feb 11, 2023. Available at https://betterprogramming.pub/how-to-code-review-34607e4a96ab
7. Mintlify. Documentation. Available at https://mintlify.com/docs/quickstart
CHAPTER 4
Configuring Microservices

Introduction
This chapter covers different ways to configure microservices
at various stages of their lifecycle. By using these
configurations, microservices can connect to the right
infrastructure services, message brokers, databases, and
other microservices in the system. Additionally, they can be
utilized to alter microservice composition to accommodate
diverse deployment scenarios or customize their behavior to
meet the requirements of customers.
This chapter starts with an explanation of three types of
configurations:
Day 0 (development-time)
Day 1 (deployment time)
Day 2 (run time)
Then it describes general patterns to address hardcoded,
static, and dynamic configurations. At the end, it reviews
patterns for specific scenarios.

Structure
In this chapter, we will cover the following topics:
Configuration types
Day 0 configuration
Day 1 configuration
Day 2 configuration
Hardcoded configuration
Static configuration
Environment variables
Config file
Configuration templates / Consul
Dynamic configuration
Generic configuration service
Specialized data microservice
Environment configuration
Connection configurations
DNS registrations
Discovery service
Client-side registrations
Feature flag
Deployment-time composition
Antipatterns

Objectives
By the end of this chapter, you will gain an understanding of
the various types of microservice configurations. You will be
able to differentiate between Day 0, Day 1, and Day 2
configurations and determine the appropriate strategies to
implement them and achieve a simple and robust
configuration process. Furthermore, you will have a good
knowledge of patterns that are designed to tackle specific
configuration problems.

Configuration types
Configurations cannot be generalized as a single process.
They must be implemented for various parts of a system at
different stages of the lifecycle and may necessitate different
tools and techniques. To address this complex subject, in this
section, we will learn how to categorize configurations into a
few major groups and establish a structure for them.

Problem
Microservices systems are usually vast and may consist of
numerous moving parts, ranging from tens to even
hundreds. Consequently, extensive and complex
configurations are usually necessary to ensure that these
components work cohesively. Additionally, managing
deployment and operational processes within such a
structure can become overly complicated, leading to lengthy
and error-prone procedures.

Solution
To simplify the process, configurations can be categorized in
various ways. One common approach is to divide them into
Day 0 (development-time), Day 1 (deployment time), and
Day 2 (run-time) configurations, as depicted in Figure 4.1:
Figure 4.1: Three types of system configurations

Day 0 Configuration
Development-time configuration, also known as Day 0
configuration, involves configuring parameters that can only
be modified during the development phase. This may include
settings like timeouts, retries, and test switches.
Certain Day 0 configuration parameters can be set as
hardcoded values, which are directly coded by developers
and cannot be changed after the component's release.
However, there may be instances where configurations need
to be altered during testing to enable certain testing hooks
or flows in the code. These configurations can either be
static or dynamic (See Static Configurations and Dynamic
Configurations patterns below).
It's important to note that testing configurations and Day
1/Day 2 configurations may use similar mechanisms, which
can lead to confusion and expose deployment and support
specialists to testing configurations. Moreover, changes
made to production parameters can result in significant
problems. Therefore, it's recommended to implement test
parameters with caution and keep them hidden in production
whenever possible.

Day 1 Configuration
The Day 1 configuration includes a group of parameters set
during the initial deployment. These parameters shouldn't be
changed unless a new version is released or there is a
change in the deployment scenario. Typically, these
parameters involve connection settings for internal
components, modes of operation, and feature flags, as
shown in Figure 4.2. This kind of configuration is commonly
known as deployment-time configuration. Since these
parameters do not change unless the system is redeployed,
they are usually implemented using static configurations
(see the patterns below).

Figure 4.2: Deployment-time configuration of MySQL

Day 2 Configuration
Day 2 configurations are changes made after deployment,
using dynamic configuration patterns and a special
command line or a GUI tool. System components should
have built-in mechanisms to reconfigure themselves without
restarting. The following figure shows the runtime
configuration of webhooks in GitHub:

Figure 4.3: Webhooks configuration in GitHub

Some Day 2 configuration parameters are required from the
start, so some products also include them in the Day 1
configuration that is set during deployment. This increases
the size of the deployment-time configuration, prolongs
deployment processes, and increases the risk of human
error.
An alternative approach is to automatically set the Day 2
configuration to default values and allow the system to run
in degraded mode. When the deployment is completed, an
administrator can use the provided tools to complete the
system configuration and move it from a non-functional into
a working state.

Hardcoded configuration
The term hardcoded configuration describes the inclusion
of specific settings or values directly into a software system
or application.

Problem
When development-time configuration parameters are set as
hardcoded numbers in the code, it becomes extremely
difficult to find and change them. Placing them as constants
at the top of a class where they are used is a better option.
But in large components, locating them may present
difficulties.

Solution
A recommended approach is to collect all Day 0
configuration parameters in a microservice and put them
into a GlobalConstants class. Place the class where it can
be easily found. Give each parameter a meaningful name
and add comments to explain the parameter's purpose and
acceptable values. The following code shows how to do this
(code snippet 4.1):
public class GlobalConstants {
    public static final int HTTP_PORT = 8080;
    public static final int HTTP_TIMEOUT = 1000; // in milliseconds
    public static final int HTTP_MAX_RETRIES = 3; // do not set more than 3!
    …
}
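Elsewhere in the microservice, the constants can then be referenced by name instead of scattering magic numbers through the code (a hypothetical usage):
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ProductsClient {
    public HttpURLConnection openConnection(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        // Timeouts come from GlobalConstants instead of being hardcoded at the call site.
        connection.setConnectTimeout(GlobalConstants.HTTP_TIMEOUT);
        connection.setReadTimeout(GlobalConstants.HTTP_TIMEOUT);
        return connection;
    }
}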
Following are the pros and cons of hardcoded configuration:

Pros:
Extremely simple implementation.
Values are completely separated and hidden during
deployment.

Cons:
Modification of the parameters requires code changes.

Static configuration
Static configuration refers to settings that are established
once during system deployment and remain unaltered until
the next deployment.

Problem
Static configurations should only be established once when
microservices start. The critical aspect is that all
configuration parameters must be externally defined and
injected into microservices during deployment.
Consequently, patterns for static configurations are shaped
around the injection mechanisms that deployment packages
and platforms support.

Environment variables
The simplest and most universal method to inject static
configurations is Environment Variables. It is supported for
microservices deployed as system processes, containers in
Docker or Kubernetes, or as serverless functions. All built
servers and deployment tools also support environment
variables.
To safeguard sensitive parameters such as passwords or
keys, some technologies utilize what are known as secrets
(See environment configuration section for an example on
how to utilize secrets). These constructs are akin to
environment variables, but their values are not shown by
default and necessitate additional steps to access.
In Java, environment variables can be obtained directly (code
snippet 4.2):
String databaseHost = System.getenv("DATABASE_HOST");
int databasePort = Integer.parseInt(System.getenv("DATABASE_PORT"));

Spring allows injecting environment variables into the
application.yml or application.properties file by using
placeholders like ${ENV:default_value}. An example
of a configuration variable declaration in an
application.properties file is (code snippet 4.3):
database.host: ${DATABASE_HOST:localhost}

Which can be injected into our code with the @Value
annotation (code snippet 4.4):
@Value("${database.host}")
private String databaseHost;

Or by getting its value from Spring's Environment (code snippet 4.5):
@Autowired
private Environment environment;

Then, we can read the parameter anywhere within our code
(code snippet 4.6):
environment.getProperty("database.host")
Following are the pros and cons:

Pros:
Very simple to implement.
Supported virtually everywhere.
Values stored in environment variables can be relatively
large (on the order of tens of kilobytes, depending on the
platform). This allows passing fairly large objects as JSON
or binary arrays encoded as Base64.

Cons:
Could be challenging to pass complex configurations
consisting of tens or hundreds of parameters.
Not very secure: Anyone accessing a running container
can access the configurations.

Config file
The second most popular method to inject static
configurations is via config files. The most common formats
used for configuration files are JSON, YAML, or property files.
Java provides simple mechanisms to read from any of these
formats.
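For instance, a plain properties file can be read with the standard java.util.Properties class; the file path below is just an assumption for illustration:
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class ConfigFileReader {
    public static Properties load() throws java.io.IOException {
        Properties props = new Properties();
        // The file itself is injected during deployment, e.g., via volume mapping.
        try (InputStream in = Files.newInputStream(Path.of("config/config.properties"))) {
            props.load(in);
        }
        return props;
    }
}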
Configuration files are usually deployed alongside the rest of the code.
For example, when using Docker, configuration files can be
placed inside a microservice container via volume mapping
(code snippet 4.7):
$ docker run -v ./config/config.yml:/app/config/config.yml ...

Another common deployment method used for serverless
functions or traditional deployments is to package all
deployment binaries into a zip archive or similar formats
such as war or jar. In this case, configuration files can be
injected inside the archive using a standard zip utility (code
snippet 4.8):
$ zip -u service-products-java.zip ./config.yml

Spring Boot natively supports application.properties and
application.yml configuration files and the @Value annotation to
automatically inject configuration parameters directly into
class properties (see the example in the Environment Variables
pattern).
Following are the pros and cons:

Pros:
Suitable for very complex configurations.

Cons:
Requires extra effort to inject configurations into
microservice deployments.
Not very secure. Anyone who gets access to a running
container, may get access to the sensitive
configurations.
May require manual editing of config files to inject
deployment parameters.
Configurations may get out of sync.

Configuration template / consul


A less common approach for static configurations is Consul
Template. This is a tool that enables the dynamic
configuration of applications based on data stored in
HashiCorp Consul, a distributed key-value store. Consul
Template uses a template language to define dynamic
configuration files that can be updated in real time as
services and applications change.
This pattern blurs the line between static and dynamic
configuration as it requires regeneration of configurations
and restart of microservices, but it is still considered a static
configuration method.
Its usage is quite simple. For example, a sample template
like the one below will list all IP addresses for each registered
service (code snippet 4.9 - all-services.tpl file):
all-services.tpl:
{{ range services -}}
# {{ .Name }}
{{- range service .Name }}
{{ .Address }}
{{- end }}
{{ end -}}

To see the results, the following command can be used to
generate the output (code snippet 4.10):
$ consul-template -template="all-services.tpl:all-services.txt" -once
Which, depending on service registrations inside Consul,
may look like this (code snippet 4.11 - all-services.txt file):
all-services.txt:
# consul
104.131.121.232

# redis
104.131.86.92
104.131.109.224

Following are the pros and cons:


Pros:
Automates generation of config files.
Able to update config files in real-time.

Cons:
Requires extra effort to inject configurations into
microservice deployments.
Not very secure. Anyone accessing a running
microservice container may get access to the
configurations.

Dynamic configuration
Dynamic configuration refers to the process of adjusting
system settings, variables, or parameters during runtime,
without requiring a system restart or manual reconfiguration.
It allows a system to adapt to changing circumstances, such
as increased traffic, changing user requirements, or new
hardware additions.

Problem
As dynamic configurations can be changed at any time, they
require three key mechanisms:
A persistent storage that can hold configuration
parameters
A command line or GUI tool to change that parameter
by an administrator
A method to read configuration parameters in real time
At the same time, microservices must be implemented in a
way that they can obtain the latest configuration version
after value changes. This can be done via notifications,
periodic pulls, or rereading configurations every time before
use.
Runtime configuration changes can also affect production, and it
may be difficult to determine the current state or version of the
runtime configuration. To facilitate incident response, debugging,
and audits, these changes must be managed and logged appropriately.
Two important patterns used to solve this problem are
Generic Configuration Service and Specialized Data
Microservice.

Generic configuration service


One way to store and manage runtime configurations is to
use one of the available generic services that keep
configuration parameters in key-value format. There are
several options:
Etcd: a distributed key-value store (https://etcd.io/). It
is used by Kubernetes itself to store configurations.
Consul: a service implemented by HashiCorp
(https://www.consul.io/) and described in the previous
Configuration Template / Consul pattern. It is often
used as a configuration or discovery service and
supports many interesting use cases.
HashiCorp Vault: another service from HashiCorp
(https://www.vaultproject.io/). It is an encrypted key-
value store primarily used to keep sensitive data like
keys or passwords, but many teams adopt it as a
general-purpose configuration store.
SpringBoot Configuration Service: a service
implemented by VMware Tanzu, creators of
SpringBoot, with out-of-the-box integration for
SpringBoot microservices. It is primarily used for Day 1
configurations but can also store and retrieve
configuration parameters during runtime.
Since Day 2 configuration may change, microservices must
have mechanisms to reread and update their configurations.
Unfortunately, most generic configuration services do not
offer asynchronous notifications. To overcome this limitation,
microservices should use periodic pulls to retrieve a setting,
determine if it was modified since the last attempt, and
then reconfigure the code. The example below shows how to
read and write configuration parameters using Spring Boot
Config Service (code snippet 4.12):
@Component
@ConfigurationProperties(prefix = "component")
@RefreshScope
public class UrlConfigs {
    // This is for storing application properties in a Map
    public Map<String, String> configurations;
    ...
    // Get property value using key
    public String getUrlProps(String key) {
        return configurations.get(key);
    }
}

Which reads from the following properties file (code snippet 4.13):
spring.application.name=example-service
spring.cloud.consul.discovery.instanceId=${spring.application.name}:${random.value}
spring.cloud.config.watch.delay=1000
spring.config.import="optional:consul:localhost:8500"
server.port=8080
management.security.enabled=false

Similarly, we can obtain the values from Consul (code snippet 4.14 - Application.java):
@EnableDiscoveryClient
@RestController
public class Application {

    @Value("${param1}") // param1=value_from_consul_1
    String value1;
    @Value("${param2}") // param2=value_from_consul_2
    String value2;

    @RequestMapping(value = "/consul_configs", method = RequestMethod.GET)
    public String getValue() {
        return value1 + " | " + value2; // returns: "value_from_consul_1 | value_from_consul_2"
    }
    ...
}
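As noted above, most generic configuration services do not push change notifications, so a microservice can poll for changes itself. The following is a minimal, illustrative sketch (class and parameter names are assumptions) that periodically rereads a value and reconfigures the component only when the value has changed:
import java.util.Objects;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class ConfigPoller {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private volatile String lastValue;

    // reader pulls the current value from the configuration service;
    // onChange reconfigures the microservice when the value differs from the last one seen.
    public void start(Supplier<String> reader, Consumer<String> onChange, long periodSeconds) {
        scheduler.scheduleAtFixedRate(() -> {
            String current = reader.get();
            if (!Objects.equals(current, lastValue)) {
                lastValue = current;
                onChange.accept(current);
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdown();
    }
}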

Following are the pros and cons:

Pros:
Generic implementation can be used to store virtually
any type of configuration.
Small, robust, and highly efficient. Able to handle
significant load using minimum resources.
Available out-of-the-box. Does not require coding.
Often available with some client tools to view and
manage the configurations.

Cons:
Mostly suitable for simple key-value parameters and
may not be convenient for complex configurations.

Specialized data microservice


One alternative for storing dynamic configurations involves
creating a dedicated data microservice that utilizes a
conventional CRUD interface for managing configuration
data objects, such as user accounts and roles, edge device
configurations, email message templates, and more.
This approach offers several benefits. First, configuration
microservices can be implemented in the same manner as
standard data microservices, seamlessly integrated into the
system and leveraging the same deployment platform
(including utilizing the same database and logging and
performance monitoring tools). Additionally, these
microservices can employ asynchronous messages to inform
other microservices of configuration changes, just like any
other microservice would.
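For illustration, such a microservice might expose a simple CRUD interface along these lines (the entity and method names are hypothetical):
import java.util.List;

// Minimal configuration entity (illustrative only)
record EmailTemplate(String id, String subject, String body) {}

// Hypothetical interface of a configuration data microservice managing email templates
interface EmailTemplateConfigService {
    List<EmailTemplate> getTemplates();
    EmailTemplate getTemplateById(String id);
    EmailTemplate createTemplate(EmailTemplate template);
    EmailTemplate updateTemplate(EmailTemplate template);
    void deleteTemplateById(String id);
}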
Following are the pros and cons:

Pros:
Implementation consistent with other data
microservices.
Fully integrated into the system and utilizes the
deployment platform (database, logging, messaging).
Able to store complex configuration data in a well-
structured format.
Can support asynchronous notifications if needed.

Cons:
Similar to any other microservice, it requires
implementation effort.
Requires command-line or GUI tools for administrators
to manage configurations.

Environment configuration
Environment configuration refers to the specific settings and
configurations of a software application or system in a
particular environment. An environment can refer to a
development, testing, staging, or production environment,
each with its own unique set of settings.

Problem
Most Day 1 configuration parameters represent IP addresses,
port numbers, and credentials to connect to infrastructure
services: databases, blob storages, message brokers,
caching services, etc. When the deployment platform is
created manually or/and managed by a separate group,
collecting all configuration parameters and including them in
Day 1 configuration during product deployment may
represent a significant challenge.

Solution
One possible way to solve the problem is by following these
steps:
1. Create environment provisioning scripts that gather
connection parameters immediately after the resources
are created. These parameters should be stored in a file
or configuration storage.
2. In the process of product deployment, the connection
parameters that were collected in step 1 can be
incorporated into the Day 1 configuration, or they can
be passed alongside manually defined configuration
parameters (see Figure 4.4).

Figure 4.4: Automatic collection of connection parameters during environment provisioning and injecting it into Day 1 configuration

One common scenario is the deployment of dockerized
microservices into Kubernetes. In order to run, microservices
need infrastructure services like databases, distributed
caches, message brokers, distributed logging, and others. In
simple test or on-prem deployments, infrastructure services
can run inside Kubernetes.
However, in large-scale production deployments, they might
be provisioned as services managed by a cloud provider. In
this scenario, environment provisioning scripts collect
connection parameters for infrastructure services and place
them into Kubernetes (see Figure 4.5).
Figure 4.5: Injection of connection parameters for infrastructure services via
Config Maps and Secrets for Kubernetes deployments

The code below exemplifies this case. In it, sensitive data like
usernames, passwords, or keys is stored in an Environment
Secret (code snippet 4.15 - env-secrets),
1. apiVersion: v1
2. kind: Secret
3. metadata:
4.   name: env-secrets
5. type: Opaque
6. stringData:
7.   db_user: "mysql"
8.   db_pass: "p4ssw0rd"
Host names, ports, and other non-sensitive values are kept in
the Environment Config Map (code snippet 4.16 - env-config).
1. apiVersion: v1
2. kind: ConfigMap
3. metadata:
4.   name: env-config
5. data:
6.   db_type: "mysql"
7.   db_host: "10.1.12.123"
8.   db_port: "3306"

Then, when the product deployment package deploys the
microservice pods, their environment variables are mapped
to get connection parameters from the Environment
Config Map and Secret (code snippet 4.17):
1. apiVersion: apps/v1
2. kind: Deployment
3. ...
4.       containers:
5.       ...
6.         env:
7.         - name: DB_TYPE
8.           valueFrom:
9.             configMapKeyRef:
10.               key: db_type
11.               name: env-config
12.         - name: DB_HOST
13.           valueFrom:
14.             configMapKeyRef:
15.               key: db_host
16.               name: env-config
17.         ...
18.         - name: DB_PASS
19.           valueFrom:
20.             secretKeyRef:
21.               key: db_pass
22.               name: env-secrets
23.         ...
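On the microservice side, the mapped variables can simply be read from the process environment at startup. The sketch below is illustrative only and assumes that the entries elided with "..." in the manifest above map DB_PORT and DB_USER in the same way:

public class DatabaseConnectionConfig {
    public static void main(String[] args) {
        // Connection parameters injected by Kubernetes from the Config Map and Secret
        String dbType = System.getenv().getOrDefault("DB_TYPE", "mysql");
        String dbHost = System.getenv().getOrDefault("DB_HOST", "localhost");
        String dbPort = System.getenv().getOrDefault("DB_PORT", "3306");
        String dbUser = System.getenv("DB_USER");
        String dbPass = System.getenv("DB_PASS");

        // Assemble a JDBC URL from the injected values
        String jdbcUrl = "jdbc:" + dbType + "://" + dbHost + ":" + dbPort + "/products";
        System.out.println("Connecting to " + jdbcUrl + " as " + dbUser);
        // ... pass jdbcUrl, dbUser, and dbPass to the DataSource / repository components
    }
}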

Connection configuration
Connection configuration refers to the settings and
information required to connect two devices or systems.
These parameters are typically used in networking protocols
to ensure that devices can communicate with each other
effectively.

Problem
Connection parameters are a very common type of
configuration. If they point to infrastructure services that are
part of the product deployment environment and never
change, they are included in Day 1 configuration. Otherwise,
if a service location or credentials may change after the
deployment, then they are part of the Day 2 configuration.
Because connection configuration is such a common
problem, there are several solutions exclusively focused on
it. Some of these approaches are:

DNS registrations
The Domain Name System (DNS) is the oldest and the most
standard method to discover host addresses by their domain
names. Initially, DNS was meant for the discovery of Internet
hosts that were registered manually and changed at a slow
pace. Nowadays, DNS is deeply integrated into the IP stack
and requires no additional effort from users. As a result,
newer technologies have leveraged DNS for other
purposes, such as service wiring in computing systems.
The following figure explains how DNS registration works:

Figure 4.6: Discovery of IP addresses using DNS

For example, Docker and Kubernetes use DNS as a built-in
discovery service. Docker registers running containers in
DNS under their names, and Kubernetes does the same for
pods. Additionally, Kubernetes uses DNS-registered reverse
proxies called "services". While pods are temporary and can
come and go anytime, Kubernetes services are static and
provide stable and reliable connection points between pods
and their callers. The following example shows how to deploy
a pod and a service in Kubernetes (code snippet 4.18 -
config.yml):

1. apiVersion: v1
2. kind: Pod
3. metadata:
4.   name: service-products-java
5.   labels:
6.     app: service-products-java
7. spec:
8.   containers:
9.   - name: service-products-java
10.     image: service-products-java:1.0.0
11.     ports:
12.     - containerPort: 8080
13. ---
14. apiVersion: v1
15. kind: Service
16. metadata:
17.   name: products-java
18. spec:
19.   ports:
20.   - port: 8080
21.     protocol: TCP
22.   selector:
23.     app: service-products-java
24.   type: ClusterIP
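Once such a service is deployed, a caller in the same cluster (and namespace) can rely on the cluster DNS alone; a minimal sketch:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsDiscoveryExample {
    public static void main(String[] args) throws UnknownHostException {
        // Kubernetes registers the "products-java" service in the cluster DNS,
        // so resolving the name returns the stable ClusterIP of the service
        InetAddress[] addresses = InetAddress.getAllByName("products-java");
        for (InetAddress address : addresses) {
            System.out.println("Resolved products-java to " + address.getHostAddress());
        }
        // An HTTP client can then simply call http://products-java:8080/...
    }
}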

Following are the pros and cons:

Pros:
Integrated into IP stack, no extra effort to perform
service discovery.
Supports basic round-robin algorithm to distribute load
across multiple service instances.

Cons:
Requires registration to update naming records (could
be integrated into the deployment platform).
Limited to IP addresses only (no port numbers or
additional information).
No health checks. When a service becomes
unavailable, it stays in the list until naming records are
updated.

Discovery services
To overcome the limitations of DNS, multiple vendors offer
their implementations of discovery services. In addition to IP
addresses, they can store port numbers and additional
service information. They also support automated health
checks, advanced load-balancing scenarios, and other
features. Unfortunately, no standards exist in this area, and
every Discovery Service has a unique method to perform the
task (see Figure 4.7).
Figure 4.7: Discovery of connection information using a Discovery Service

There are a number of Discovery Services available on the
market:
Eureka: a REST-based service
(https://github.com/spring-cloud/spring-cloud-
netflix) primarily used in the AWS cloud for
locating services for the purpose of load balancing and
failover of middle-tier servers.
Etcd: a distributed key-value store (https://etcd.io/). It
is used by Kubernetes itself to store configurations.
Consul: a service implemented by HashiCorp
(https://www.consul.io/). It is often used as a
configuration or discovery service and supports many
interesting use cases.
Apache Zookeeper: a centralized service
(https://zookeeper.apache.org/) for maintaining
configuration information, naming, providing
distributed synchronization, and group services.
Nomad: a highly available, distributed job scheduler
and deployment system by HashiCorp
(https://www.nomadproject.io/) that includes support
for service discovery and health checking.
The example below shows how to set up a Eureka discovery
server, which needs the following declarations in the
application.properties file (code snippet
4.19):
1. eureka.client.registerWithEureka=false

2. eureka.client.fetchRegistry=false

And the addition of the @EnableEurekaServer annotation in


the Application.java class (code snippet 4.20):
1. @SpringBootApplication

2. @EnableEurekaServer

3. public class Application {

4. public static void main(String[] args) {

5. SpringApplication.run(Application.class, args);

6. }

7. }
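On the consumer side, registered instances can then be looked up through Spring Cloud's DiscoveryClient abstraction. The sketch below is a simplified illustration and assumes the spring-cloud-starter-netflix-eureka-client dependency and a service registered under the name "service-products":

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class ProductServiceLocator {

    @Autowired
    private DiscoveryClient discoveryClient;

    public String findProductServiceUrl() {
        // Query the discovery service for all registered instances of "service-products"
        List<ServiceInstance> instances = discoveryClient.getInstances("service-products");
        if (instances.isEmpty()) {
            throw new IllegalStateException("No instances of service-products are registered");
        }
        // Naive selection; a real client would add load balancing and failover
        ServiceInstance instance = instances.get(0);
        return instance.getUri().toString();
    }
}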

Following are the pros and cons:

Pros:
Store a complete set of connection metadata.
Support advanced load-balancing scenarios.
Support automated health checks and remove
unresponsive services from the list.

Cons:
May require registration of the services.
Require special steps to discover a service.
Non-standard - tie clients to a particular Discovery
Service.

Client-side registrations
The biggest inconvenience of Discovery Services is the
need to keep service registrations up to date. The solution to this
problem is simple: services can register themselves during
startup. This ties the services to a particular infrastructure
and can be an issue for consumers that use a different
method of service discovery. However, for systems where
microservices and their consumers are developed by the
same team, it is usually not a problem. The following figure
explains this pattern:

Figure 4.8: Service discovery with automated client registrations

The code below demonstrates the automated registration of


a microservice in Eureka.
To register your service with SpringBoot, you need to add the
Eureka client dependency and some additional
configurations (see code snippet 4.21 below).
1. eureka.client.serviceUrl.defaultZone=http://localhost:8761/eureka

2. eureka.client.instance.preferIpAddress=true

3. spring.application.name=myeurekaclient
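With the Eureka client dependency on the classpath, self-registration typically needs nothing more than a regular Spring Boot application class; the sketch below adds the optional @EnableDiscoveryClient annotation for clarity (in recent Spring Cloud versions, registration happens automatically once the dependency and the properties above are present):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableDiscoveryClient // optional in recent Spring Cloud versions, shown here for clarity
public class MyEurekaClientApplication {
    public static void main(String[] args) {
        // On startup the service registers itself with Eureka under "myeurekaclient"
        SpringApplication.run(MyEurekaClientApplication.class, args);
    }
}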

Once your application is running, the following interface can


be accessed (see Figure 4.9 below):

Figure 4.9: Registered Eureka client

Following are the pros and cons:

Pros:
Stores a complete set of connection metadata.
Automated registrations of services. No need for
special tools or support from the infrastructure.

Cons:
Requires special steps to register and discover a
service.
Non-standard - ties both services and clients to a
particular Discovery Service.
Deployment-time composition
Deployment-time composition refers to a situation where a
microservice is deployed in different environments and
needs to adjust its implementation accordingly.

Problem
A typical scenario includes a microservice that needs to opt
for a persistence implementation that suits the connected
database, select suitable message queues, loggers, and
performance counters, and also enable or disable external
interfaces as required by the environment.

Solution
The first step to enable deployment-time composition is
implementing a microservice using well-defined, loosely
coupled components. A typical microservice may include the
following types of components:
Repositories to store and retrieve persistent data
Services that implement microservice business logic
Controllers that expose call interfaces to microservice
consumers
Data objects (or DTOs) to define data that is handled
by the microservice
Message queues for asynchronous messaging
Clients to communicate with other microservices
Distributed locks to handle concurrency
Loggers, Traces, and Performance Counters for
observability
The following figure illustrates a typical composition of a
microservice:
Figure 4.10: Typical component composition of microservices

And each component type may have a common interface


and a few implementations (code snippet 4.22):
1. public interface IProductRepository {
2.     …
3. }
4.
5. public class MySqlProductRepository implements IProductRepository {
6.     …
7. }
8.
9. public class OracleProductRepository implements IProductRepository {
10.     …
11. }

Deployment-time composition can be done in several ways.


For example, we can use plain Java code. In the example
below, we assume that the configuration
parameters are passed as environment variables:
1. public class ProductMicroservice {

2. private IProductRepository _repository;

3. …

4. }

5.
6. public class ProductMicroserviceBuilder {
7.     private IProductRepository _repository;
8.
9.     public ProductMicroserviceBuilder withRepositoryFor(String databaseType) throws ConfigurationException {
10.         // Creation logic...
11.         return this;
12.     }
13.
14.     public ProductMicroservice build() {
15.         return new ProductMicroservice(this._repository, ...);
16.     }
17. }

18. ...

19. public static void main(String[] args) {
20.     try {
21.         ProductMicroservice microservice = new ProductMicroserviceBuilder()
22.             .withRepositoryFor(System.getenv("DATABASE_TYPE")).build();
23.         microservice.run();
24.     } catch (Exception ex) {
25.         System.err.println(ex.getMessage());
26.     }
27. }

Even better, Spring Boot provides some features that make
deployment-time composition very easy. The following
example shows how this can be done:
First, we need to inject the environment variables into the
application.yml (or application.properties) file. In the example
below, we define a variable called DATABASE_TYPE that specifies
the chosen database (code snippet 4.23 - application.yml):
1. database:

2.   type: ${DATABASE_TYPE}

Then, we use the @ConditionalOnProperty annotation to include
the components corresponding to the selected database. For
example, we can select the appropriate implementation like this
(code snippet 4.24):
1. @ConditionalOnProperty(value = "database.type", havingValue = "mysql")
2. public class MySqlProductRepository implements IProductRepository {
3.     ...
4. }

Feature flag
A feature flag, also known as a feature toggle or feature
switch, is a technique used in microservices architecture to
selectively enable or disable certain features or functionality
of an application.

Problem
Traditionally, when a development team implements new
functionality, it may choose between two different approaches. The
first one is a “safe approach”, which requires long and
rigorous testing and involves “user representatives” in
usability testing. Only when everything is verified can a
product be released into production. Another approach is to
“throw sh**t against the wall” and hope the old functionality
is not broken and the new one will not fall apart.

Solution
Recently, high-velocity teams have started to adopt a new pattern
called Feature Flag. It means that old functionality is kept
intact while new features are implemented on the side and
enabled by a special flag that can be set during the
deployment or changed during runtime.
Feature Flags have a few significant advantages over the
traditional approaches. Their implementation:
Speeds up innovation as teams can put out features
faster.
Shortens the test cycle for new features and reduces
required test resources as some are delegated to beta-
users / early adopters.
Allows developers to quickly receive feedback from
beta-users, improve implementation, and nail
remaining defects before the prime time.
Keeps existing functionality intact for regular users,
while early adopters can get access to the latest and
greatest features.
Some companies build their entire releases around feature
flags by publicly announcing newly released features,
actively involving users in beta-testing and idea generation
and creating a community of raving fans. For example,
GoHighLevel’s marketing system uses them to invite users to
join the beta community:

Figure 4.11: Labs interface of GoHighLevel marketing system that opens users’
ability to join a beta-testing program

Feature flags are implemented as Day 1 (deployment-time)


or Day 2 (runtime) configurations and fall into a few
categories:
Release toggles: simple on/off switches that enable
new features and are removed after the feature is
released.
Operational toggles: switches that control
operational flow at the backend. They can switch to an
alternative branch in business logic and enable or
disable old/ new APIs. They have a short lifespan and
are typically removed after a task that requires an
alternative flow is complete.
Experimental toggles: are used for A/B or
multivariate testing to collect user data. They stay as
long as needed and are removed after testing is
complete.
Permission toggles: used to control access to
particular features at the interface level.
Depending on the scope of a feature toggle, it can be
implemented at different levels.
When a feature is small and isolated, it can be implemented
as a branch inside the code (code snippet 4.25):
1. public class SearchService {

2. private boolean _newAlgorithm = false;

3.
4. public SearchService(boolean newAlgorithm) {

5. this._newAlgorithm = newAlgorithm;

6. …

7. }

8.
9. public SearchResult[] search() {

10. if (this._newAlgorithm) {

11. // New search algorithm


12. …

13. } else {

14. // Old search algorithm

15. …

16. }

17. }

18. }

However, when the feature implementation is extensive, it


may be more practical to have different versions of
components and select the appropriate one by using the
deployment-time composition pattern (code snippet 4.26):
1. public interface ISearchService {

2. public SearchResult[] search();

3. …

4. }

5.
6. // Old search algorithm

7. public class OldSearchService implements ISearchService {

8. …

9. }

10.
11. // New search algorithm

12. public class NewSearchService implements ISearchService {

13. …

14. }

If a feature flag is static and set during deployment time,


then the microservice can instantiate one of the alternative
component implementations during startup by using the
Deployment-Time Composition pattern described earlier (see
Figure 4.12).

Figure 4.12: Choosing a component implementation on microservice startup

When a microservice is assembled statically, it can choose
an appropriate version of a component based on the
deployment-time configuration (code snippet 4.27):
1. public class SearchMicroservice {

2. public ISearchService _searchService;

3. …

4.
5. public void build(Configuration config) {

6. …

7. this._searchService = config.newSearchAlgorithm

8. ? new NewSearchService() : new OldSearchService();

9. …

10. }

11. }

For feature flags that are changed during runtime, we can


use a Proxy pattern. The following figure shows how this
pattern works:
Figure 4.13: Switching component implementation in runtime using a Proxy

And the following code (code snippet 4.28) exemplifies how
to implement this pattern:
1. public class ProxySearchService implements ISearchService {

2. private ISearchService _service;

3.
4. public void configure(Configuration config) {

5. this._service = config.newSearchAlgorithm

6. ? new NewSearchService() : new OldSearchService();

7. }

8.
9. public SearchResult[] search() {

10. return this._service.search();

11. }

12. }

In addition to feature flags, software development and


release management employ several other techniques to
enhance the control and optimization of product
deployments. Percentage rollout allows for a gradual release
of new features to a subset of users, enabling teams to
monitor performance and gather user feedback before a full-
scale launch. A/B testing, on the other hand, facilitates the
comparison of two or more variants of a feature to determine
which one performs better in terms of user engagement and
conversion. Segmentation allows for tailoring feature
availability to specific user groups, providing a more
personalized user experience. Sticky sessions, often used in
load balancing, help ensure that a user consistently interacts
with the same server, which can be essential for maintaining
session-based application states. These techniques, while
distinct from feature flags, play a crucial role in the broader
arsenal of tools available to developers and product
managers for effective software delivery and optimization.
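As a simple illustration of a percentage rollout, the decision of whether a user sees the new feature can be derived from a stable hash of the user identifier; the sketch below is a simplified, hypothetical example rather than a complete rollout framework:

public class PercentageRollout {

    private final int rolloutPercentage; // 0..100

    public PercentageRollout(int rolloutPercentage) {
        this.rolloutPercentage = rolloutPercentage;
    }

    // Returns true if the new feature should be enabled for this user.
    // Hashing the stable user ID keeps the decision consistent across requests.
    public boolean isEnabledFor(String userId) {
        int bucket = Math.abs(userId.hashCode() % 100);
        return bucket < rolloutPercentage;
    }

    public static void main(String[] args) {
        PercentageRollout rollout = new PercentageRollout(10); // roll out to ~10% of users
        System.out.println("user-42 enabled: " + rollout.isEnabledFor("user-42"));
    }
}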

Antipatterns
The most common anti-pattern found in
microservices implementations is mixing Day 0, Day 1, and
Day 2 configurations and exposing them all at deployment
time. Some of the most frequent cases are:
Opening development time configurations in
production. Developers expose configuration
parameters intended for modification during
development and carry them forward to deployment,
increasing the number of parameters and cognitive
load for deployment engineers. Moreover, mistakenly
altering development time configurations can cause
further complications.
Setting runtime configurations during
deployment. Proper configuration is crucial for system
functionality but configuring all settings at once can be
lengthy and error-prone. A better approach is to
separate into Day 1 and Day 2 phases. During
deployment, only essential Day 1 configurations should
be set, and non-essential Day 2 configurations disabled
or set to defaults. After the system has been started,
engineers can use tools to implement necessary Day 2
configurations.
Manual wiring of internal services. Microservices
systems typically consist of numerous components,
sometimes ranging from tens to hundreds. Manually
configuring internal interconnections during
deployment can make the process overly complicated
and error-prone. To mitigate this, teams should explore
automating the process. This can be achieved by
scripting deployment environments and utilizing
available patterns to automate connection
configurations.
Hardcoded environment profiles. A very common
anti-pattern is the practice of hardcoding deployment
time configurations in "environments" or "profiles,"
which are then selected during deployment. While this
may seem like a convenient approach, it results in
coupling between code and deployments. Any changes
made to deployment configurations require changes
throughout the entire codebase, which can occur
frequently. In optimal microservices implementations,
all Day 1 configurations should be external.
Microservices should remain unchanged for extended
periods and be deployable in any environment.

Conclusion
Throughout this chapter, we have gained knowledge about
three distinct configuration types - Day 0, Day 1, and Day 2 -
which correspond to configurations applied during the
development, deployment, and runtime stages. It is crucial
to distinguish between these configurations to ensure
simplicity and clarity in the configuration process.
Furthermore, we also explored various patterns that can be
employed to handle diverse configuration scenarios in
microservices systems and some antipatterns that describe
situations that should be avoided.

Further reading
1. L. G. De O. Costa. Testcontainers and Java: Using properties instead of hardcoded configuration. Medium. Jun 13, 2022. Available at https://medium.com/@luizcostatech/testcontainers-and-java-using-properties-instead-of-hardcoded-configuration-19320665c822
2. S. P. Sarkar. Service Discovery & why it is so important in Microservices. Medium. Dec 26, 2021. Available at https://blog.devgenius.io/service-discovery-importance-in-microservices-17970569685

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 5
Implementing
Communication

Introduction
In a microservices architecture, communication between
services is a crucial aspect of building robust, efficient, and
maintainable systems. Each service interacts with others to
collectively fulfill a broader set of functionalities. Therefore,
establishing effective communication channels is essential
for enabling a seamless exchange of information and
ensuring the overall success of a microservices-based
application.
There are two primary communication styles: synchronous
and asynchronous. Synchronous communication requires
both the sender and receiver to be available simultaneously,
whereas asynchronous communication allows them to
operate independently. Each style has its pros and cons, and
the choice between them depends on the specific
requirements of a system.
The design patterns presented in this chapter address
various aspects of communication in microservices. These
include methods for sending and receiving information,
making communication reliable and high-performing, and
adding documentation and implementing versioning.
Understanding and implementing these patterns will help
you build a well-structured and efficient microservices
architecture in Java.

Structure
In this chapter, we will cover the following topics:
Synchronous calls
HTTP/REST
gRPC
Asynchronous messaging
Point-to-point
Publish/subscribe
API versioning
Versioned channels
Versioned routing
API documentation
Open API / Swagger
Protobuf Definitions
AsyncAPI
Streaming
Continuous streaming
Transferring blob IDs
Chunking
Commandable API
Reliability
Timeout
Retries
Rate limiter
Circuit breaker
Client library

Objectives
After studying this chapter, you should be able to understand
and apply different communication patterns, choose
between synchronous and asynchronous communication,
make interfaces versioned and reliable, and implement client
libraries to simplify the use of services for consumers.

Synchronous calls
Synchronous communication in microservices is a simple and
widely used inter-process communication approach,
involving simultaneous availability of sender and receiver.
However, while providing real-time feedback and easy error
handling, it may introduce latency and coupling between
services. There are many ways to implement synchronous
calls between microservices. Among them, two of the most
popular protocols are HTTP/REST and gRPC.

Problem
In microservices architectures, breaking a monolithic
application into smaller, independent services leads to
multiple interactions between these services. When relying
on synchronous communication, the sender must wait for the
receiver's response before proceeding (see Figure 5.1):
Figure 5.1: Synchronous call between microservices

While simple to implement, synchronous communication can


increase latency, reduce performance, and create potential
bottlenecks in the system. Furthermore, it can result in tight
coupling between services due to their direct dependency on
each other's availability and response times. Addressing
these challenges is crucial for ensuring the efficiency and
resilience of a microservices-based system.

HTTP/REST
Hypertext Transfer Protocol (HTTP) is one of the oldest
Internet protocols. It was created to communicate with web
servers and later adopted for transmitting data between
clients and servers in distributed systems, including
microservices architectures.
HTTP provides standardized methods, such as GET, POST,
PUT, and DELETE, which facilitate communication between
services. By leveraging the ubiquity and interoperability of
the HTTP protocol, microservices can efficiently exchange
information and interact with each other across various
platforms and languages.
Representational State Transfer (REST) is an
architectural style that builds upon HTTP, outlining a set of
conventions and best practices for designing networked
applications. RESTful services use HTTP methods and URIs to
identify resources while employing standard media types,
such as JSON or XML, for data exchange. The stateless
nature of REST simplifies the management of services,
promotes loose coupling, and ensures scalability.
Examples of REST Interface:
GET /products: Retrieve a list of products
GET /products/{id}: Retrieve a specific product by its
ID
POST /products: Create a new product
PUT /products/{id}: Update an existing product by its
ID
DELETE /products/{id}: Delete a product by its ID
While REST is not a standard but rather a set of guidelines, it
remains a popular choice for HTTP-based communication in
microservices. However, it is worth noting that some
systems may not strictly adhere to its conventions. Non-
RESTful HTTP communication still leverages the same HTTP
methods and data exchange formats, but it may not fully
conform to REST principles.
For a similar example of a product management
microservice, a non-RESTful interface might use custom or
non-standardized URIs and methods:
GET /getAllProducts: Retrieve a list of products
GET /getProduct?id={id}: Retrieve a specific
product by its ID
POST /createProduct: Create a new product
POST /updateProduct?id={id}: Update an existing
product by its ID
POST /deleteProduct?id={id}: Delete a product by
its ID
Error handling in the HTTP protocol is managed through the
use of standardized status codes. These three-digit codes
indicate the outcome of an HTTP request and can be
categorized into five classes based on their first digit:
1xx (Informational): Request received, server is
continuing to process it
2xx (Successful): Request was successfully received,
understood, and accepted
3xx (Redirection): Further action is needed to
complete the request
4xx (Client Error): Request contains bad syntax or
cannot be fulfilled by the server
5xx (Server Error): Server failed to fulfill a valid
request
For example, commonly used HTTP status codes include:
200 OK: The request has succeeded. The information
returned with the response is dependent on the method
used in the request.
301 Moved Permanently: This status code indicates
that the requested resource has been permanently
moved to a new location, and future requests should
use the new URL.
400 Bad Request: The server cannot process the
request due to invalid syntax.
401 Unauthorized: The request requires
authentication, and the provided credentials are
missing or incorrect.
403 Forbidden: The client does not have permission to
access the requested resource.
404 Not Found: The requested resource could not be
found on the server.
410 Gone: This status code indicates that the
requested resource is no longer available at the server
and no forwarding address is known. It is considered a
permanent condition.
500 Internal Server Error: The server encountered
an error while processing the request.
In addition to status codes, it is recommended to send error
responses that contain detailed information about the error:
its type, stack trace, and correlation (trace) ID, so they can be
used for logging and troubleshooting.
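As an illustration, the sketch below shows one possible shape of such an error response in Spring Boot; the field names and exception mapping are assumptions, and in a real system the correlation ID would normally come from the tracing context rather than being generated on the spot:

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

import java.time.Instant;
import java.util.UUID;

// Illustrative error body: code, message, correlation ID, and timestamp for troubleshooting
record ErrorResponse(String code, String message, String correlationId, Instant timestamp) {}

@RestControllerAdvice
public class GlobalErrorHandler {

    @ExceptionHandler(IllegalArgumentException.class)
    public ResponseEntity<ErrorResponse> handleBadRequest(IllegalArgumentException ex) {
        ErrorResponse body = new ErrorResponse(
                "BAD_REQUEST", ex.getMessage(), UUID.randomUUID().toString(), Instant.now());
        return ResponseEntity.status(HttpStatus.BAD_REQUEST).body(body);
    }

    @ExceptionHandler(Exception.class)
    public ResponseEntity<ErrorResponse> handleInternalError(Exception ex) {
        ErrorResponse body = new ErrorResponse(
                "INTERNAL_ERROR", ex.getMessage(), UUID.randomUUID().toString(), Instant.now());
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(body);
    }
}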
Most microservices frameworks have support for HTTP/REST
synchronous communication and make their implementation
very easy. The example below shows a simple REST
controller implemented in Java/SpringBoot.
1. First, create a new Spring Boot project with the “web”
dependency (Code snippet 5.1).
1. <?xml version="1.0" encoding="UTF-8"?>

2. <project …>

3. <modelVersion>4.0.0</modelVersion>

4. <parent>

5. <groupId>org.springframework.boot</groupId>

6. <artifactId>spring-boot-starter-parent</artifactId>
7. <version>3.0.6</version>

8. <relativePath/>

9. </parent>

10. <groupId>com.example.client</groupId>

11. <artifactId>sample</artifactId>

12. <version>0.0.1-SNAPSHOT</version>

13. <properties>

14. <java.version>17</java.version>

15. </properties>

16. <dependencies>

17. <dependency>

18. <groupId>org.springframework.boot</groupId>

19. <artifactId>spring-boot-starter-web</artifactId>

20. </dependency>

21.
22. <dependency>

23. <groupId>org.springframework.boot</groupId>

24. <artifactId>spring-boot-starter-test</artifactId>

25. <scope>test</scope>

26. </dependency>

27. </dependencies>

28. …

29. </project>

2. Create a SampleRestService class (Code snippet 5.2):

1. package com.example.samples.http;
2.
3. import org.springframework.web.bind.annotation.PostMapping;
4. import org.springframework.web.bind.annotation.RequestMapping;
5. import org.springframework.web.bind.annotation.RestController;
6.
7. @RestController
8. @RequestMapping("/sample")
9. public class SampleRestService {
10.
11.     @PostMapping("/do-something")
12.     public String doSomething() {
13.         return "Hello, I just did something!";
14.     }
15. }

3. Finally, create the main Spring Boot application class (Code snippet 5.3):

1. package com.example.samples;
2.
3. import org.springframework.boot.SpringApplication;
4. import org.springframework.boot.autoconfigure.SpringBootApplication;
5.
6. @SpringBootApplication
7. public class SampleRestServiceApplication {
8.
9.     public static void main(String[] args) {
10.         SpringApplication.run(SampleRestServiceApplication.class, args);
11.     }
12. }

When you run the application, it will start an embedded


Tomcat server on port 8080 (by default) and expose the
doSomething method at the /sample/do-something
endpoint.
The following code invokes this microservice (Code snippet
5.4):
1. import org.springframework.boot.CommandLineRunner;
2. import org.springframework.boot.SpringApplication;
3. import org.springframework.boot.WebApplicationType;
4. import org.springframework.boot.autoconfigure.SpringBootApplication;
5. import org.springframework.http.HttpEntity;
6. import org.springframework.http.HttpMethod;
7. import org.springframework.http.HttpStatus;
8. import org.springframework.http.ResponseEntity;
9. import org.springframework.web.client.RestTemplate;
10.
11. @SpringBootApplication
12. public class SampleClientApplication implements CommandLineRunner {
13.
14.     private final RestTemplate restTemplate = new RestTemplate();
15.
16.     public static void main(String[] args) {
17.         var application = new SpringApplication(SampleClientApplication.class);
18.         application.setWebApplicationType(WebApplicationType.NONE);
19.         application.run(args);
20.     }
21.
22.     @Override
23.     public void run(String... args) {
24.         String serviceUrl = "http://localhost:8080/sample/do-something";
25.         String requestBody = "This is a sample request body";
26.         HttpEntity<String> request = new HttpEntity<>(requestBody);
27.
28.         ResponseEntity<String> response = restTemplate.exchange(serviceUrl, HttpMethod.POST, request, String.class);
29.
30.         if (response.getStatusCode() == HttpStatus.OK) {
31.             System.out.println("Response from server: " + response.getBody());
32.         } else {
33.             System.out.println("Error occurred: " + response.getStatusCode());
34.         }
35.     }
36. }

Pros of the HTTP/REST Pattern:


Easy to implement and understand
Comprehensive platform and language support
Stateless nature promotes loose coupling
Scalable and maintainable architecture
Standardized HTTP methods and status codes
Interoperable with various tools and libraries

Cons of the HTTP/REST Pattern:


Increased network overhead due to text-based data
exchange
Latency concerns in high-performance scenarios
It may be overly verbose for simple interactions
Strict adherence to conventions can be limiting
Less efficient compared to binary protocols (e.g.,
gRPC)

gRPC
gRPC stands out as a contemporary and efficient Remote
Procedure Call (RPC) framework, designed to facilitate
seamless communication between distributed systems.
Unlike traditional RPC implementations, gRPC leverages the
advanced capabilities of the HTTP/2 protocol for transport,
ensuring enhanced performance and scalability. Additionally,
gRPC adopts Protocol Buffers as its interface description
language, enabling concise and platform-independent
service definitions. By embracing these modern
technologies, gRPC not only streamlines communication
between microservices but also offers significant
improvements in latency and throughput, making it an ideal
choice for building large-scale, distributed systems. This
distinction highlights gRPC's specialization as an
implementation of RPC tailored to harness the capabilities of
the HTTP/2 protocol for efficient communication.
gRPC leverages HTTP/2 features like header compression,
multiplexing, and streaming to minimize overhead and
optimize communication. With built-in support for bi-
directional streaming, gRPC facilitates real-time data
exchange between services. Additionally, gRPC's use of
Protocol Buffers ensures strong data typing and efficient
binary serialization, which leads to faster processing times
and smaller payloads compared to text-based formats like
JSON. As a language-agnostic framework, gRPC offers broad
support for various programming languages, making it easy
to integrate into diverse microservice architectures.
While gRPC allows creating dynamic services and clients, it is
more common to define an interface in a proto file, then
generate stubs for client and service.
The following example shows a simple gRPC service:
1. Define the gRPC service and method in a .proto file (Code snippet 5.5: sample.proto):

1. syntax = "proto3";
2.
3. option java_multiple_files = true;
4. option java_package = "com.example.samples.grpc";
5. option java_outer_classname = "ServiceProto";
6.
7. package com.example.grpc;
8.
9. service SampleGRPCService {
10.     rpc DoSomething (DoSomethingRequest) returns (DoSomethingResponse);
11. }
12.
13. message DoSomethingRequest {
14.     string input = 1;
15. }
16.
17. message DoSomethingResponse {
18.     string result = 1;
19. }

2. Generate the Java code using the Protocol Buffers


compiler (Code snippet 5.6).
1. protoc --java_out=src/main/java sample.proto

3. Implement the server with the generated code (Code Snippet 5.7: SampleGRPCServer.java):

1. package com.example.samples;
2.
3. import com.example.samples.grpc.SampleGRPCServiceImpl;
4. import io.grpc.Server;
5. import io.grpc.ServerBuilder;
6.
7. import java.io.IOException;
8.
9. public class SampleGRPCServer {
10.     public static void main(String[] args) throws IOException, InterruptedException {
11.         int port = 8090;
12.         Server server = ServerBuilder.forPort(port)
13.             .addService(new SampleGRPCServiceImpl())
14.             .build()
15.             .start();
16.         System.out.println("Server started on port " + port);
17.         server.awaitTermination();
18.     }
19. }
20.
21. package com.example.samples.grpc;
22.
23.
24. import io.grpc.stub.StreamObserver;
25.
26. public class SampleGRPCServiceImpl extends SampleGRPCServiceGrpc.SampleGRPCServiceImplBase {
27.
28.     @Override
29.     public void doSomething(DoSomethingRequest request,
30.             StreamObserver<DoSomethingResponse> responseObserver) {
31.         var req = request.getInput();
32.         var res = "Request: " + req + "\n" + "Hello, I just did something!";
33.
34.         responseObserver.onNext(DoSomethingResponse.newBuilder().setResult(res).build());
35.         responseObserver.onCompleted();
36.     }
37. }

The following code invokes the service (Code snippet 5.8).
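A minimal client sketch is shown below; it assumes the stub and message classes generated from sample.proto and the server from code snippet 5.7 listening on localhost:8090:

package com.example.samples;

import com.example.samples.grpc.DoSomethingRequest;
import com.example.samples.grpc.DoSomethingResponse;
import com.example.samples.grpc.SampleGRPCServiceGrpc;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class SampleGRPCClient {
    public static void main(String[] args) {
        // Open a plaintext channel to the server started in the previous listing
        ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 8090)
                .usePlaintext()
                .build();
        try {
            // Blocking stub generated by the Protocol Buffers / gRPC compiler
            SampleGRPCServiceGrpc.SampleGRPCServiceBlockingStub stub =
                    SampleGRPCServiceGrpc.newBlockingStub(channel);

            DoSomethingRequest request = DoSomethingRequest.newBuilder()
                    .setInput("Hello from the client")
                    .build();
            DoSomethingResponse response = stub.doSomething(request);
            System.out.println(response.getResult());
        } finally {
            channel.shutdown();
        }
    }
}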



Pros of the gRPC Pattern:


High performance and low latency due to binary
serialization
HTTP/2 transport enables efficient communication
Supports bi-directional streaming for real-time data
exchange
Language-agnostic with broad support for various
programming languages
Strongly typed and schema-based through Protocol
Buffers
Compact message sizes compared to text-based
formats
Cons of the gRPC Pattern:
Requires a steep learning curve to understand Protocol
Buffers and gRPC concepts
Less human-readable compared to text-based formats
like JSON
Limited browser support and not suitable for RESTful
APIs
Not ideal for simple use cases where REST might
suffice
It may require additional tooling and libraries for full
functionality

Asynchronous messaging
Asynchronous messaging is a pattern that allows services
within a microservices architecture to interact without
waiting for a response, thus decoupling their execution and
promoting scalability and responsiveness. In contrast to
synchronous communication, where the calling service must
wait for a response before proceeding, asynchronous
messaging enables services to continue their tasks
independently.
Asynchronous messaging can be categorized into point-to-
point communication, which typically utilizes queues for
managing message exchanges between two services, and
publish/subscribe communication, which leverages topics to
broadcast messages to multiple subscribers.
Messaging and event-driven microservices architectural
styles require asynchronous messaging. They tend to
differentiate messages by their types (commands, events,
notifications, and callbacks) and use them in different
communication scenarios. For more information, see Chapter
2, Communication Styles.
Problem
In microservices architectures, achieving efficient, scalable,
and resilient inter-service communication is a critical
concern. Relying on synchronous communication alone can
lead to performance bottlenecks, reduced scalability, and an
increased risk of cascading failures due to the calling service
waiting for a response before proceeding. The asynchronous
messaging pattern can mitigate these challenges while
providing a more robust and flexible solution for service
interactions.
Asynchronous messaging is a communication method that
enables services to exchange information without waiting for
an immediate response. It considers the following elements:
The sender, also known as the producer or publisher,
is a microservice that creates and sends messages to
be consumed by other services or components.
The receiver, also referred to as the consumer or
subscriber, is a microservice that listens for, retrieves,
and processes messages sent by the sender.
A message is a discrete unit of data or an event
transmitted between the sender and receiver. It
typically contains information, such as commands or
updates, required for the receiver to perform a specific
task or action.
The message envelope is a wrapper or container that
encapsulates the actual message and its metadata.
Metadata can include information like routing details,
message priority, and timestamp, which is useful for
the messaging system to handle and process the
message correctly (a simple sketch follows this list).
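A simplified sketch of a message envelope as a plain Java data structure is shown below; the field names are illustrative and not mandated by any particular messaging technology:

import java.time.Instant;
import java.util.Map;
import java.util.UUID;

// Illustrative envelope: wraps the message payload together with its metadata
record MessageEnvelope(
        String messageId,
        String messageType,     // e.g. "command", "event", "notification"
        String correlationId,   // used to trace the message across services
        Instant sentAt,
        Map<String, String> headers,
        String payload) {       // serialized message body, e.g. JSON

    static MessageEnvelope wrap(String messageType, String correlationId, String payload) {
        return new MessageEnvelope(UUID.randomUUID().toString(), messageType,
                correlationId, Instant.now(), Map.of("priority", "normal"), payload);
    }
}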
Message brokers play a pivotal role in asynchronous
communication within microservices architectures, providing
a centralized platform for managing message exchanges
between services. The key capabilities of message brokers
include:
Decoupling: Message brokers facilitate the separation
of concerns between services by acting as an
intermediary, enabling services to communicate
without direct knowledge of each other's existence,
location, or implementation details.
Message routing and delivery: Message brokers
handle the routing of messages from senders to the
appropriate receivers based on predefined rules or
configurations. They ensure reliable message delivery,
even in scenarios where receivers are temporarily
unavailable, by persisting messages until they are
successfully consumed.
Publish/Subscribe and Point-to-Point messaging:
Message brokers support both one-to-many
communication using publish/subscribe mechanisms,
where a message is broadcasted to multiple
subscribers, and one-to-one communication using
point-to-point messaging, where messages are sent
directly to a specific receiver using queues.
Scalability and fault tolerance: Message brokers
can be deployed in a distributed manner, providing
horizontal scalability and ensuring fault tolerance
through replication and automatic failover
mechanisms.
Message prioritization and expiration: Message
brokers can manage message priorities and expiration
times, ensuring that high-priority messages are
processed before lower-priority ones, and preventing
the processing of stale messages.
Monitoring and observability: Message brokers
often come equipped with built-in monitoring and
observability tools, allowing for the tracking of
message flow, performance metrics, and potential
issues within the system.

Point-to-point
Point-to-point communication utilizes queues as a dedicated
channel for exchanging messages between a single sender
and a specific receiver. This pattern ensures message
delivery in the correct order, even when the receiver is
temporarily unavailable. In a microservices architecture,
point-to-point communication allows for better handling of
service-specific interactions, such as a user registration
process where one service generates a unique identifier and
another service stores the user's details. By enabling
services to operate independently and asynchronously, this
pattern promotes efficient resource management and
prevents bottlenecks in scenarios that require direct and
secure communication between two distinct services (See
Figure 5.2):

Figure 5.2: Point-to-point asynchronous communication between microservices

Point-to-point communication is typically used to send
command, notification, and callback messages in event-
driven microservices.
Here is an example of sending a message using the Jakarta
Messaging (JMS) API (Code snippet 5.9):
1. package com.example.samples;
2.
3. import javax.jms.Connection;
4. import javax.jms.ConnectionFactory;
5. import javax.jms.MessageProducer;
6. import javax.jms.Queue;
7. import javax.jms.Session;
8. import javax.jms.TextMessage;
9. import javax.naming.Context;
10. import javax.naming.InitialContext;
11.
12.
13. public class JmsProducerExample {
14.
15.     public static void main(String[] args) {
16.         try {
17.             // Set up JNDI context and connection factory
18.             Context jndiContext = new InitialContext();
19.             ConnectionFactory connectionFactory = (ConnectionFactory) jndiContext.lookup("ConnectionFactory");
20.
21.             // Look up the destination (queue)
22.             Queue queue = (Queue) jndiContext.lookup("ExampleQueue");
23.
24.             // Create connection, session, and producer
25.             Connection connection = connectionFactory.createConnection();
26.             Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
27.             MessageProducer producer = session.createProducer(queue);
28.
29.             // Create and send a text message
30.             TextMessage message = session.createTextMessage("Hello, JMS!");
31.             producer.send(message);
32.
33.             System.out.println("Message sent successfully!");
34.
35.             // Clean up resources
36.             producer.close();
37.             session.close();
38.             connection.close();
39.             jndiContext.close();
40.         } catch (Exception e) {
41.             System.err.println("Exception occurred: " + e.getMessage());
42.             e.printStackTrace();
43.         }
44.     }
45. }
The message receiver would be the following (Code snippet
5.10):
1. package com.example.samples;
2.
3. import javax.jms.Connection;
4. import javax.jms.ConnectionFactory;
5. import javax.jms.Message;
6. import javax.jms.MessageConsumer;
7. import javax.jms.Queue;
8. import javax.jms.Session;
9. import javax.jms.TextMessage;
10. import javax.naming.Context;
11. import javax.naming.InitialContext;
12.
13. public class JmsConsumerExample {
14.
15.     public static void main(String[] args) {
16.         try {
17.             // Set up JNDI context and connection factory
18.             Context jndiContext = new InitialContext();
19.             ConnectionFactory connectionFactory = (ConnectionFactory) jndiContext.lookup("ConnectionFactory");
20.
21.             // Look up the destination (queue)
22.             Queue queue = (Queue) jndiContext.lookup("ExampleQueue");
23.
24.             // Create connection, session, and consumer
25.             Connection connection = connectionFactory.createConnection();
26.             Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
27.             MessageConsumer consumer = session.createConsumer(queue);
28.
29.             // Start the connection and receive a message
30.             connection.start();
31.             Message message = consumer.receive();
32.
33.             // Process the received message
34.             if (message instanceof TextMessage) {
35.                 TextMessage textMessage = (TextMessage) message;
36.                 System.out.println("Received message: " + textMessage.getText());
37.             } else {
38.                 System.err.println("Received message is not a TextMessage");
39.             }
40.
41.             // Clean up resources
42.             consumer.close();
43.             session.close();
44.             connection.close();
45.             jndiContext.close();
46.         } catch (Exception e) {
47.             System.err.println("Exception occurred: " + e.getMessage());
48.             e.printStackTrace();
49.         }
50.     }
51. }

Following are the pros and cons of the Point-to-Point Pattern:

Pros:
Senders and receivers operate independently,
improving system resilience.
Queues distribute messages evenly across multiple
receiver instances.
Messages are persisted in queues, ensuring delivery
even if the receiver is unavailable.
Queues maintain the order of messages, enabling
proper processing.

Cons:
Requires a messaging infrastructure and management
of queues.
Asynchronous nature may introduce delays in
processing messages.
Tracing and monitoring message flow between services
can be complex.
System performance and reliability depend on the
underlying messaging infrastructure.

Publish/subscribe
Publish/subscribe communication, also known as pub/sub, is
a messaging pattern widely used in distributed systems,
allowing a single sender, or publisher, to broadcast
messages to multiple receivers, or subscribers, without
direct knowledge of their identities. In this pattern, messages
are organized into topics or channels, which subscribers
express interest in by registering with the messaging
system. The messaging system then ensures that each
subscriber receives messages related to their subscribed
topics, allowing consumers to consume them at their own
pace. The pub/sub pattern fosters decoupling, as publishers
and subscribers can evolve independently without impacting
one another. Additionally, it enables efficient dissemination
of information and facilitates event-driven architectures,
where subscribers react to specific events or updates.
However, the pub/sub pattern also introduces complexities in
message ordering and consistency, requiring careful design
and implementation to ensure reliable message handling
(See Figure 5.3).

Figure 5.3: Publish/subscribe asynchronous communication between microservices

Publish/subscribe communication is typically used to send
event messages in event-driven microservices and is
essential to implement a choreographic architectural style.
To implement pub/sub using Spring Boot JMS, do the following:
1. Set up a Spring Boot project with JMS dependencies in
your build configuration (pom.xml or build.gradle).
2. Configure the JMS connection factory and topic in your
application.properties or application.yml file.
3. Create a JmsPublisher class to publish messages to
the topic.
4. Autowire the JmsTemplate and use it to publish a
message to the topic.
The following code shows how to do this (Code snippet 5.11):
1. package com.example.samples.jakartaspring;
2.
3. import org.springframework.beans.factory.annotation.Autowired;
4. import org.springframework.jms.core.JmsTemplate;
5. import org.springframework.stereotype.Component;
6.
7. @Component
8. public class JmsPublisher {
9.
10.     @Autowired
11.     JmsTemplate jmsTemplate;
12.
13.     public void publishMessage(String message) {
14.         jmsTemplate.convertAndSend("myQueue", message);
15.         System.out.println("Message published: " + message);
16.     }
17. }
In this example, we create a JmsPublisher component that
utilizes the JmsTemplate from Spring JMS. The
JmsTemplate simplifies the process of publishing messages.
In this case, we publish messages to the "myQueue"
destination, which should be configured as a topic (pub/sub
domain) in the JMS configuration for true publish/subscribe behavior.
To publish a message, you can call the publishMessage
method of the JmsPublisher (Code snippet 5.12):
1. package com.example.samples.jakartaspring;
2.
3. import org.springframework.boot.SpringApplication;
4. import org.springframework.boot.autoconfigure.SpringBootApplication;
5. import org.springframework.context.ConfigurableApplicationContext;
6.
7. @SpringBootApplication
8. public class JmsExampleApplication {
9.
10.     public static void main(String[] args) throws InterruptedException {
11.         ConfigurableApplicationContext context = SpringApplication.run(JmsExampleApplication.class, args);
12.
13.         JmsPublisher jmsPublisher = context.getBean(JmsPublisher.class);
14.         jmsPublisher.publishMessage("Hello, JMS Pub/Sub!");
15.
16.         // Close the application context
17.         context.close();
18.     }
19. }
The subscriber component can be the following (Code
snippet 5.13):
1. package com.example.samples.jakartaspring;
2.
3.
4. import org.springframework.jms.annotation.JmsListener;
5. import org.springframework.stereotype.Component;
6.
7. import javax.jms.MessageListener;
8.
9. @Component
10. public class JmsSubscriber {
11.
12.     @JmsListener(destination = "myQueue")
13.     public void receiveMessage(String message) {
14.         System.out.println("Received message1: " + message);
15.     }
16. }

Following are the pros and cons of the Publish/Subscribe Pattern:

Pros:
Publishers and subscribers operate independently,
improving system resilience.
Pub/sub efficiently broadcasts messages to multiple
subscribers.
Enables reactive programming and real-time updates.
Subscribers can join or leave topics at runtime,
providing flexibility.
Publishers only need to send messages to a topic, not
individual subscribers.

Cons:
Ensuring correct order of messages can be
challenging.
Asynchronous nature may lead to temporary
inconsistencies between subscribers.
Requires a messaging infrastructure and management
of topics or channels.
Tracing and monitoring message flow between services
can be complex.
System performance and reliability depend on the
underlying messaging infrastructure.
API versioning
Application Programming Interfaces (APIs) play a
crucial role in microservices architecture by enabling
communication between services. Over time, APIs may
require changes or enhancements to accommodate new
features, bug fixes, or performance improvements. API
versioning is a strategy to manage these changes without
affecting existing clients or breaking functionality.

Problem
Correct implementation of microservices architecture
demands that every microservice has a totally independent
lifecycle. They shall be developed and deployed
independently from other microservices. However,
microservices must communicate with each other, and
changes in call signatures or message formats in a
microservice API can break consumers. To prevent that from
happening, all microservice APIs must be versioned.
Unfortunately, API versioning is often overlooked by
inexperienced developers and has become one of the top reasons
why microservices systems turn into distributed monoliths.
Adding versioning as an afterthought is hard, so it is
recommended to implement API versioning from day 1.
When that is done, fast evolution of microservices in a system
should not be a problem, and incremental delivery should be
possible.
To implement API versioning, a microservice, in addition to
the most recent API version, must also expose one or a few
older versions for backward compatibility with existing
consumers. Migration of consumers to a newer version
should not be forced. Instead, it should happen as a part of
natural evolution (See Figure 5.4):
Figure 5.4: Evolution of microservice API

It is important to understand that not every change in an API
has to be "breaking". Modern communication technologies
allow adding optional fields and optional methods without
breaking backward compatibility. Those changes are allowed
in "minor" microservice releases and do not require
developing a whole new API version.

Versioned channels
One approach to add a new version to a microservice API is
to expose it via a new endpoint. In the case of HTTP or gRPC
this can be a controller exposed via a separate TCP port. For
async messaging, it can be a separate queue (refer to Figure
5.5):

Figure 5.5: API versions exposed via separate TCP endpoints or message
queues
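For gRPC, versioned channels can be implemented by starting the versioned service implementations on separate ports; the sketch below assumes the SampleGRPCServiceV1Impl and SampleGRPCServiceV2Impl classes and illustrative port numbers:

import io.grpc.Server;
import io.grpc.ServerBuilder;

public class VersionedChannelsServer {
    public static void main(String[] args) throws Exception {
        // Each API version gets its own endpoint (TCP port)
        Server v1Server = ServerBuilder.forPort(8081)
                .addService(new SampleGRPCServiceV1Impl())
                .build()
                .start();
        Server v2Server = ServerBuilder.forPort(8082)
                .addService(new SampleGRPCServiceV2Impl())
                .build()
                .start();

        System.out.println("v1 API on port 8081, v2 API on port 8082");
        // Block until both servers are terminated
        v1Server.awaitTermination();
        v2Server.awaitTermination();
    }
}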
Following are the pros and cons of the Versioned Channels Pattern:

Pros:
Simpler microservice implementation with clean code
separation between versions
Easier to add retroactively

Cons:
Higher deployment complexity caused by increased
number of end points
Bigger surface area that may cause security issues

Versioned routing
In an alternative approach, the microservice receives
requests from all old and new consumers on the same
channel (endpoint or queue), but uses an internal routing
mechanism to separate versions using version information
included in routes or messages (see Figure 5.6):

Figure 5.6: API versions on shared TCP port or message queue, separated by
versioned routes/messages.
gRPC allows adding multiple service implementations to a single
server, as shown below (Code snippet 5.14):
1. int port = 8080;
2. Server server = ServerBuilder.forPort(port)
3.     .addService(new SampleGRPCServiceV1Impl())
4.     .addService(new SampleGRPCServiceV2Impl())
5.     .build()
6.     .start();
HTTP offers three common methods to implement versioned routing.
URI versioning: In URI versioning, a version identifier is
included in the URI or endpoint of the API. For example, a
service with a version 1 API may have a URI like
https://<ip>/sample/v1/do-something, and when the API
is updated to version 2, the URI will be
https://<ip>/sample/v2/do-something. This approach is
straightforward and easily understandable, but it violates
REST principles, as URIs should ideally represent resources,
not versions of the API (Code snippet 5.15):
1. @RestController
2. @RequestMapping("/sample/v1")
3. public class SampleRestServiceV1 {
4.
5. @PostMapping("/do-something")
6. public String doSomething() {
7. return "Hello, I just did something!";
8. }
9. }
Query parameter versioning: In this method, the version
information is added as a query parameter in the API
request. For example, https://<ip>/sample?version=1.
When the API is updated to version 2, the new request would
be https://<ip>/sample?version=2. This technique
maintains the RESTful nature of the API by keeping the
version information separated from the resource URI.
However, it can be less intuitive than URI versioning and
may lead to inconsistencies if not handled carefully (Code
snippet 5.16):
1. @RestController
2. @RequestMapping(value = "/sample", params = "version=1")
3. public class SampleRestService {
4.
5.     @PostMapping("/do-something")
6.     public String doSomething(@RequestParam("version") String version) {
7.         return "Version: " + version + " | Hello, I just did something!";
8.     }
9. }

Header-based versioning: Header-based versioning involves adding version information in the HTTP headers of API requests and responses. For example, a custom header like X-API-Version: 1 would indicate version 1 of the API. When the API is updated to version 2, clients will update the custom header value accordingly. This method is considered more compliant with REST principles, as it keeps the API resource URIs clean and free from version information. However, it can be less discoverable for clients, as version information is not directly visible in the URI (Code snippet 5.17).
@RestController
@RequestMapping(path = "/sample", headers = "X-API-Version=1")
public class SampleRestServiceV1 {

    @PostMapping("/do-something")
    public String doSomething() {
        return "Hello, I just did something!";
    }
}
When it comes to asynchronous messaging, unfortunately, most technologies do not provide a standard versioning mechanism. Developers must include version numbers in messages and implement routing logic inside subscribers after extracting the version number from received messages.
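A minimal sketch of such subscriber-side routing, assuming the producer includes a numeric version field in a JSON payload (the class and field names are illustrative):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class VersionedMessageRouter {
    private final ObjectMapper mapper = new ObjectMapper();

    public void onMessage(String payload) throws Exception {
        JsonNode message = mapper.readTree(payload);
        // The producer is assumed to put a "version" field into every message.
        int version = message.path("version").asInt(1);

        switch (version) {
            case 1 -> handleV1(message);
            case 2 -> handleV2(message);
            default -> throw new IllegalArgumentException("Unsupported version: " + version);
        }
    }

    private void handleV1(JsonNode message) { /* v1-specific handling */ }
    private void handleV2(JsonNode message) { /* v2-specific handling */ }
}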

API Documentation
API documentation is a vital component in ensuring the
successful adoption and use of an API in a microservices
ecosystem. Comprehensive, accurate, and up-to-date API
documentation enables developers to understand the
functionality, usage, and constraints of the API, facilitating a
seamless integration with other services.

Problem
Despite the importance of API documentation, manual
writing consumes significant time, and, usually, requires the
involvement of technical writers as most developers are not
good at writing clear and concise documentation that can be
given to customer’s hands. Also, as API evolves the
documentation tends to get out of sync.
To address this issue, the development community and technology vendors introduced interface definition languages (IDLs) to describe APIs formally. An API IDL can be used in a model-first approach, where the service code is automatically generated from the IDL by tools, or in a code-first approach, where the IDL is automatically generated from the running service. The IDL can then serve developers as formal API documentation and/or be used to automatically generate microservice clients (refer to Figure 5.7):

Figure 5.7: Use of API IDL definition

OpenAPI
OpenAPI (formerly known as Swagger) is a widely adopted
specification for describing and documenting RESTful APIs. It
uses a standard format, typically YAML or JSON, to define the
API's endpoints, request and response parameters, data
types, authentication methods, and other relevant
information. OpenAPI also enables the generation of
interactive documentation using tools like Swagger UI, which
allows developers to explore, test, and understand the API's
functionality in a user-friendly manner.
To enable OpenAPI for the SampleRestService from the Synchronous Calls pattern, add the following SwaggerConfig class to the project (Code snippet 5.18):
package com.example.samples.http;

import io.swagger.v3.oas.models.OpenAPI;
import io.swagger.v3.oas.models.info.Info;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SwaggerConfig {
    @Value("${springdoc.swagger-ui.swagger-name}")
    private String swaggerName;
    @Value("${springdoc.swagger-ui.swagger-description}")
    private String swaggerDescription;

    @Bean
    public OpenAPI customOpenAPI() {
        return new OpenAPI().info(
            new Info()
                .title(swaggerName)
                .description(swaggerDescription)
                .termsOfService("http://swagger.io/terms/")
        );
    }
}
The resulting API definition can be represented in JSON format as follows (Code snippet 5.19):
1. {
2. "openapi": "3.0.0",
3. "info": {
4. "title": "Sample API",
5. "description": "This is a sample API generated with
Swagger",
6. "termsOfService": "http://swagger.io/terms/"
7. },
8. "paths": {},
9. "components": {}
10. }
Now, you can run your Spring Boot application and navigate to http://localhost:8080/swagger/swagger-ui/index.html to see your API documentation (see Figure 5.8):
Figure 5.8: Swagger page for the service

ProtoBuf
gRPC uses Protocol Buffers (Protobuf) as the Interface
Definition Language (IDL). Protobuf definitions serve as
the source of truth for gRPC services, specifying the service
methods, request and response messages, and their
respective data types. These definitions can be compiled into
various programming languages using gRPC tools, which
generate client and server stubs. Protobuf definitions offer a
concise and efficient way to document gRPC APIs while
providing strong typing, backward compatibility, and
performance benefits.
Unfortunately, unlike REST APIs with Swagger, there's no
standard mechanism to retrieve the Protobuf definition
(.proto files) or the service definition from a running gRPC
service. This is mainly due to the fact that the protocol
buffers binary format doesn't contain the full schema
information. To enable automated generation of .proto files
from running gRPC services, you need to add a reflection
service into the gRPC server.
The code snippet below shows how to modify
SampleGRPCService from gRPC Synchronous Calls pattern
to enable reflection (Code snippet 5.20):
public class SampleGRPCServer {
    public static void main(String[] args) throws IOException, InterruptedException {
        int port = 8080;
        Server server = ServerBuilder.forPort(port)
                .addService(new SampleGRPCServiceImplV1())
                .addService(ProtoReflectionService.newInstance()) // Enable reflection
                .build()
                .start();
        System.out.println("Server started on port " + port);
        server.awaitTermination();
    }
}
Once your server is running with the reflection service, you can use grpcurl to retrieve the service definition. If you do not have grpcurl, you can install it from GitHub (https://github.com/fullstorydev/grpcurl).
To list all the services (Code snippet 5.21):
grpcurl -plaintext localhost:8080 list
To describe a particular service, replace your.service.Name with your actual service name (Code snippet 5.22):
grpcurl -plaintext localhost:8080 describe com.example.grpc.SampleGRPCServiceV1
Unfortunately, grpcurl does not have a built-in way to
generate .proto files. It can describe services, methods, and
message types, but it does not output this information in
.proto file syntax. However, you can piece together the
descriptions provided by grpcurl to manually write a .proto
file. Once you have it, you can automatically generate clients
using protoc compiler.

AsyncAPI
AsyncAPI is an open-source initiative that provides a
specification and a suite of tools to help developers work
with event-driven architectures. It's often described as the
"OpenAPI for event-driven APIs" and is used to define APIs
that use asynchronous messaging protocols, such as MQTT,
AMQP, Kafka, WebSockets, etc.
Just as OpenAPI allows developers to describe RESTful APIs in
a machine-readable format, AsyncAPI allows developers to
describe event-driven APIs in a similar way. The AsyncAPI
specification is a contract for your APIs and includes details
about your server and information about all the channels
(topics, routing keys, event types, etc.) your application is
able to publish or subscribe to. It also includes information
about messages, security schemes, and other important
aspects of your event-driven APIs.
AsyncAPI can be used in several ways:
Documentation: Like OpenAPI for HTTP APIs,
AsyncAPI can generate human-readable documentation
for your APIs. This makes it easier for other developers
to understand how your API works and how to interact
with it.
Code Generation: Using the AsyncAPI specification,
you can generate boilerplate code in various
programming languages. This can significantly speed
up the development process and help ensure that your
API behaves as described in the specification.
Testing and Validation: The AsyncAPI specification
can be used to create mock servers for testing and to
validate that your API is behaving as described.
Event-driven Architecture (EDA) Design and
Planning: AsyncAPI can be used during the design
phase of an EDA to plan out your APIs. It provides a
way to visualize how your APIs will interact with each
other and helps ensure that your APIs are well-
designed before you begin the implementation.
Here is a simple AsyncAPI specification example for a service
that sends a text message to a queue (Code snippet 5.23):
1. asyncapi: '2.0.0'
2. info:
3. title: 'Message Service'
4. version: '1.0.0'
5. channels:
6. messageQueue:
7. publish:
8. message:
9. contentType: text/plain
10. payload:
11. type: string
12. example: 'Hello, world!'
In this example, asyncapi field specifies the AsyncAPI
version. The info field provides metadata about the API. The
channels field describes the paths available in the API. In this
case, we have a single messageQueue channel where
messages can be published. The message field describes the
type of data that can be published to this channel. We've
specified that the contentType is plain text, and the
payload is a string.
To generate code from the AsyncAPI document, you can use
the AsyncAPI Generator, a command line tool that can
generate code in various languages. For example, here is a
command you could use to generate code for a Java
application (Code snippet 5.24):
ag asyncapi.yaml @asyncapi/java-spring-template -o output -p 'generator=java'

Blob streaming
In a microservices system, the need to transfer large binary
objects such as images, videos, or other sizable data blobs is
not uncommon. This process, however, can be challenging
due to network constraints, memory usage, and the potential
for latency in sending and receiving large amounts of data.

Problem
Microservices systems face significant challenges when
transferring large binary objects (BLOBs). This is due to the
inherent complexities related to the size of the data,
networking overhead, bandwidth utilization, latency, and
failure recovery. Transferring these large BLOBs can disrupt
system performance and affect service reliability.
When using synchronous communication methods, such as
HTTP or gRPC, the problems are magnified. Synchronous
protocols wait for a response after sending a request,
meaning they block further execution until the response is
received. This can lead to long wait times when sending or
receiving large binary objects due to the time it takes to
serialize, transfer, and deserialize the data. Additionally,
issues such as network latency and timeouts become
significant problems, causing system disruptions and
failures.
Asynchronous communication can alleviate some of these
issues as it allows the system to continue execution without
waiting for the response. However, it introduces its own
complexities, such as limitation of message size, managing
message order and ensuring data consistency.

Continuous streaming
This approach involves sending the large binary objects as a
continuous stream of data. This can reduce the latency
associated with transferring the entire object at once. It is
especially beneficial when used with protocols designed for
streaming, such as gRPC or HTTP/2. However, streaming
requires careful handling to ensure data consistency and
order, especially in unreliable network conditions.
Here is a sample of how you could use Spring Boot to upload
and download BLOBs using the HTTP/2 protocol.
Firstly, ensure that you have enabled HTTP/2 in your
application.properties (Code snippet 5.25):
1. server.http2.enabled=true
Next, create a simple REST controller which handles file
upload and download (Code snippet 5.26):
package com.example.samples.http;

import com.example.samples.storage.BlobStorage;
import org.springframework.http.HttpHeaders;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;
import org.springframework.http.MediaType;
import java.io.IOException;

@RestController
public class BlobController {

    @PostMapping("/upload")
    public ResponseEntity<String> uploadFile(@RequestParam("file") MultipartFile file) {
        try {
            // Assuming you're storing the file in-memory for simplicity.
            // In production, you'd probably store it in a database or a blob store.
            byte[] data = file.getBytes();
            BlobStorage.put(file.getOriginalFilename(), data);
            return ResponseEntity.ok().body("File uploaded successfully.");
        } catch (IOException e) {
            return ResponseEntity.status(500).body("Failed to upload the file.");
        }
    }

    @GetMapping("/download/{fileName}")
    public ResponseEntity<Resource> downloadFile(@PathVariable String fileName) {
        byte[] data = BlobStorage.get(fileName);
        if (data == null) {
            return ResponseEntity.notFound().build();
        } else {
            ByteArrayResource resource = new ByteArrayResource(data);
            return ResponseEntity.ok()
                    .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=" + fileName)
                    .contentType(MediaType.APPLICATION_OCTET_STREAM)
                    .body(resource);
        }
    }
}

For the sake of simplicity, we will use a static BlobStorage class to hold the BLOB data. In a real-world application, you would want to use a database, a cloud-based blob store, or some other form of durable storage (Code snippet 5.27).
import java.util.concurrent.ConcurrentHashMap;

public class BlobStorage {
    private static final ConcurrentHashMap<String, byte[]> storage = new ConcurrentHashMap<>();

    public static void put(String key, byte[] data) {
        storage.put(key, data);
    }

    public static byte[] get(String key) {
        return storage.get(key);
    }
}

The client code to upload or download blobs from the service is presented as follows (Code snippet 5.28):
package com.example.samples.http;

import org.springframework.core.io.FileSystemResource;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.http.MediaType;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.util.StreamUtils;
import org.springframework.web.client.RequestCallback;
import org.springframework.web.client.ResponseExtractor;
import org.springframework.web.client.RestTemplate;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BlobClient {

    private final RestTemplate restTemplate;

    public BlobClient() {
        HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory();
        restTemplate = new RestTemplate(requestFactory);
    }

    public void upload(String url, String filePath) throws IOException {
        Path path = Paths.get(filePath);

        MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
        body.add("file", new FileSystemResource(path));

        var res = restTemplate.postForEntity(url, body, String.class);
        System.out.println(res.getBody());
    }

    public void download(String url, String filePath) throws IOException {
        RequestCallback requestCallback = request -> {
            HttpHeaders headers = request.getHeaders();
            headers.setContentType(MediaType.APPLICATION_OCTET_STREAM);
        };

        ResponseExtractor<byte[]> responseExtractor = response -> {
            try (InputStream inputStream = response.getBody()) {
                var byteArray = StreamUtils.copyToByteArray(inputStream);
                try (FileOutputStream stream = new FileOutputStream(filePath)) {
                    stream.write(byteArray);
                }
                return byteArray;
            }
        };

        restTemplate.execute(URI.create(url), HttpMethod.GET, requestCallback, responseExtractor);
    }
}

gRPC has built-in support for server-, client-, and bi-directional streaming. To upload and download blobs using gRPC streaming in Java, we first need to define a service in the .proto file, specifying a stream for the upload request and for the download response (Code snippet 5.29):
syntax = "proto3";

package blob;

option java_multiple_files = true;
option java_package = "com.example.samples.grpc";
option java_outer_classname = "ServiceProto";

service BlobService {
  rpc Upload(stream UploadRequest) returns (UploadResponse) {}
  rpc Download(DownloadRequest) returns (stream DownloadResponse) {}
}

message UploadRequest {
  string name = 1;
  bytes chunk = 2;
}

message UploadResponse {
  string message = 1;
}

message DownloadRequest {
  string name = 1;
}

message DownloadResponse {
  bytes chunk = 1;
}
Next, let us implement the server-side logic for the service
(Code snippet 5.30):
package com.example.samples.grpc;

import io.grpc.stub.StreamObserver;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

public class BlobServiceImpl extends BlobServiceGrpc.BlobServiceImplBase {
    private final ConcurrentHashMap<String, byte[]> storage = new ConcurrentHashMap<>();

    @Override
    public StreamObserver<UploadRequest> upload(final StreamObserver<UploadResponse> responseObserver) {
        return new StreamObserver<UploadRequest>() {
            private String fileName;
            private ByteArrayOutputStream currentFile;

            @Override
            public void onNext(UploadRequest value) {
                if (currentFile == null) {
                    currentFile = new ByteArrayOutputStream();
                }
                try {
                    currentFile.write(value.getChunk().toByteArray());
                    fileName = value.getName();
                } catch (IOException e) {
                    responseObserver.onError(e);
                }
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
            }

            @Override
            public void onCompleted() {
                storage.put(fileName, currentFile.toByteArray());
                responseObserver.onNext(UploadResponse.newBuilder().setMessage("Upload completed.").build());
                responseObserver.onCompleted();
            }
        };
    }

    @Override
    public void download(DownloadRequest request, StreamObserver<DownloadResponse> responseObserver) {
        byte[] file = storage.get(request.getName());
        ByteArrayInputStream byteStream = new ByteArrayInputStream(file);

        byte[] buffer = new byte[1024];
        int len;
        try {
            while ((len = byteStream.read(buffer)) != -1) {
                responseObserver.onNext(DownloadResponse.newBuilder()
                        .setChunk(com.google.protobuf.ByteString.copyFrom(buffer, 0, len))
                        .build());
            }
        } catch (IOException e) {
            responseObserver.onError(e);
        }

        responseObserver.onCompleted();
    }
}

Here is a basic Java client that interacts with the gRPC BlobService defined earlier. It includes methods to upload and download BLOBs using gRPC streaming (Code snippet 5.31).
package com.example.samples.grpc;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.stub.StreamObserver;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BlobGrpcClient {
    private final BlobServiceGrpc.BlobServiceStub asyncStub;

    public BlobGrpcClient(String host, int port) {
        this(ManagedChannelBuilder.forAddress(host, port).usePlaintext());
    }

    public BlobGrpcClient(ManagedChannelBuilder<?> channelBuilder) {
        ManagedChannel channel = channelBuilder.build();
        asyncStub = BlobServiceGrpc.newStub(channel);
    }

    public void upload(String name, String filePath) throws IOException, InterruptedException {
        CountDownLatch finishLatch = new CountDownLatch(1);
        StreamObserver<UploadRequest> requestObserver = asyncStub.upload(new StreamObserver<UploadResponse>() {
            @Override
            public void onNext(UploadResponse response) {
                System.out.println("Upload completed");
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
                finishLatch.countDown();
            }

            @Override
            public void onCompleted() {
                System.out.println("Upload finished");
                finishLatch.countDown();
            }
        });

        byte[] buffer = new byte[1024];
        int len;
        try (FileInputStream fis = new FileInputStream(filePath)) {
            while ((len = fis.read(buffer)) != -1) {
                requestObserver.onNext(UploadRequest.newBuilder()
                        .setName(name)
                        .setChunk(com.google.protobuf.ByteString.copyFrom(buffer, 0, len))
                        .build());
            }
        }

        requestObserver.onCompleted();
        if (!finishLatch.await(1, TimeUnit.MINUTES)) {
            System.out.println("Upload can not finish within 1 minute");
        }
    }

    public void download(String name, String filePath) throws IOException, InterruptedException {
        final CountDownLatch finishLatch = new CountDownLatch(1);
        FileOutputStream fos = new FileOutputStream(filePath);

        asyncStub.download(DownloadRequest.newBuilder().setName(name).build(), new StreamObserver<>() {
            @Override
            public void onNext(DownloadResponse value) {
                try {
                    fos.write(value.getChunk().toByteArray());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

            @Override
            public void onError(Throwable t) {
                t.printStackTrace();
                finishLatch.countDown();
            }

            @Override
            public void onCompleted() {
                System.out.println("Download completed");
                finishLatch.countDown();
            }
        });

        if (!finishLatch.await(1, TimeUnit.MINUTES)) {
            System.out.println("Download can not finish within 1 minute");
        }
        fos.close();
    }
}

Continuous streaming is a real-time, efficient, and scalable approach to data processing, but it can be complex to implement and manage, and mistakes can propagate quickly across the system.

Following are the pros and cons of Blob Streaming Pattern:

Pros:
Allows for immediate data processing and analysis.
Utilizes resources better by processing data as it
arrives.
Accommodates increasing data volumes by design.
Provides built-in mechanisms for fault tolerance and
recovery.

Cons:
Implementing a streaming architecture can be
technically challenging.
Handling state in a distributed streaming system can
be difficult.
It might consume more resources compared to batch
processing due to constant processing.
Mistakes can propagate rapidly across the system,
impacting downstream components.
Requires point-to-point communication and is not
suitable for asynchronous messaging.

Transferring blob IDs


Instead of transferring the entire binary object, the systems
can exchange unique identifiers (IDs) representing these
objects. The receiver can then fetch the object when
required. This significantly reduces the data transferred
during inter-service communication. However, this pattern
requires a robust and efficient object storage and retrieval
system, adding complexity to the architecture (See Figure
5.9):

Figure 5.9: Transferring blobs via passing their IDs

Transferring blobs by their IDs can be implemented using basic synchronous or asynchronous communication described by the patterns above.
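The following minimal sketch reuses the in-memory BlobStorage from Code snippet 5.27; the producer and consumer class names and the idea of carrying only the generated blob ID in a message are illustrative assumptions rather than a prescribed API:

// Producer: store the blob once and hand out only its ID.
public class DocumentProducer {

    public String publishDocument(byte[] document) {
        String blobId = java.util.UUID.randomUUID().toString();
        BlobStorage.put(blobId, document);
        // send a small message such as {"documentBlobId": "<blobId>"} to consumers
        return blobId;
    }
}

// Consumer: fetch the blob only when it is actually needed.
class DocumentConsumer {

    public byte[] onMessage(String documentBlobId) {
        return BlobStorage.get(documentBlobId);
    }
}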

Following are the pros and cons of Transferring Blob IDs Pattern:

Pros:
Transferring IDs is significantly more efficient than
transferring large blobs themselves, especially over a
network.
The process is straightforward - once the blob is
stored, its ID can be easily shared across
microservices.
Multiple services can access the blob simultaneously
using its ID without any data inconsistency.
As the blob is stored in a central location, there's a
single source of truth.
Suitable for implementation via basic synchronous calls
or asynchronous messaging.

Cons:
Requires a reliable blob storage service, adding an
extra dependency to the system.
Additional latency can be introduced due to the need to
fetch the blob from a central location using the ID.
Ensuring appropriate access control may become
complex, as the blob must be securely accessible to
different services.
Requires a special clean-up logic to remove blobs left
by failed transactions.

Chunking
Here, large binary objects are divided into smaller chunks,
and these chunks are transferred independently. This
approach improves network utilization, and in case of a
failure during transmission, only the failed chunk needs to be
retransmitted, not the entire object. However, this method
requires complex logic to manage chunk order, verify
integrity, and handle missing or corrupted chunks (See
Figure 5.10).

Figure 5.10: A basic API for Blobs service that transfers blobs in small chunks

Similar to transferring blobs by their IDs, chunking can also be implemented using basic synchronous or asynchronous communication.
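The core of the pattern is the splitting and reassembly logic itself. The sketch below shows only that part, under the assumption that transport, ordering, and integrity checks are handled by the surrounding code (the chunk size is arbitrary):

import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class Chunker {
    private static final int CHUNK_SIZE = 1024;

    // Split a blob into fixed-size chunks that can be sent independently.
    public static List<byte[]> split(byte[] blob) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < blob.length; offset += CHUNK_SIZE) {
            int length = Math.min(CHUNK_SIZE, blob.length - offset);
            byte[] chunk = new byte[length];
            System.arraycopy(blob, offset, chunk, 0, length);
            chunks.add(chunk);
        }
        return chunks;
    }

    // Reassemble the original blob; chunks must arrive in order.
    public static byte[] reassemble(List<byte[]> chunks) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            result.writeBytes(chunk);
        }
        return result.toByteArray();
    }
}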

Following are the pros and cons of Chunking Pattern:

Pros:
Chunking allows for efficient transfer of large data by
breaking it into manageable pieces.
If a transfer fails, only the failed chunk needs to be
resent, not the entire blob.
Different chunks can be processed in parallel,
potentially speeding up overall processing time.
Chunking large blobs reduces the memory footprint on
both client and server sides.
Suitable for implementation via basic synchronous calls
or asynchronous messaging.
Cons:
Implementing chunking can add complexity to the
system, as chunks need to be reassembled in the
correct order.
There can be added overhead due to managing the
chunks (tracking, ordering, ensuring integrity).
Depending on the network, there might be increased
latency due to the overhead of multiple requests.

Commandable API
The Commandable pattern is a design strategy frequently
employed in microservices systems, where it encapsulates
all the data required to perform an action or trigger a specific
behavior into a discrete command object. This command can
then be serialized, dispatched, and executed within another
process or service.
In the realm of handcrafted microservices APIs, considerable
effort is often expended in building and maintaining these
interfaces. However, the Commandable pattern can alleviate
much of this workload by enabling a generic implementation
of APIs. By transforming operations into command objects,
the pattern provides a unified interface for interacting with
the services, thus streamlining the development process and
reducing the complexity associated with API management.
Moreover, it fosters a more flexible and easily maintainable
system, as changes or additions to the services require
modifications to the command objects only, rather than the
entire API infrastructure.

Problem
In the context of microservices architecture, a myriad of
operations and tasks are required to ensure seamless
functionality of the system. Each service could expose
numerous endpoints, each requiring distinct data payloads
and operational logic. This heterogeneity in API design can
lead to an increase in complexity, making the system
difficult to maintain, extend, or scale.
Moreover, individual services often need to communicate
with each other to fulfill an operation. This inter-service
communication can become cumbersome to manage,
especially when services are distributed across different
environments or regions. It can lead to significant latency
and inconsistencies, particularly when the system needs to
handle a substantial number of requests concurrently.
Furthermore, in a high-load scenario, the system needs to
efficiently manage the incoming requests without
overloading any single service or the network. Asynchronous
processing and request batching become vital for system
stability, but the implementation of these aspects can be
challenging with a traditional request-response model.
Finally, tracing and auditing of the operations performed by
various services become crucial for system debugging and
monitoring. However, with diverse API designs, it can be
challenging to maintain a standardized logging mechanism
across the system.
The Commandable pattern offers a solution to these
challenges by encapsulating all the information needed to
perform a specific operation into a single command object.
This pattern provides a generic approach to implement APIs,
enabling efficient inter-service communication, facilitating
asynchronous processing and request batching, and ensuring
the standardization of auditing and tracing mechanisms.
However, the applicability and effectiveness of the
Commandable pattern depend on the specific use case and
the overall system design.

Solution
A conceptual design of the Commandable is presented in
Figure 5.11:

Figure 5.11: Commandable pattern

The communication scenario via Commandable API has the following steps:
1. Services that expose operations via public microservice
API register commands in the
CommandableController by specifying the command
name and action that should be called when a client
invokes a command.
2. When a client needs to communicate with a
microservice API, it converts its request into a generic
InvokeRequest object and specifies a command name
and arguments. Then, it calls the invoke method of the
microservice API.
3. The CommandableController receives the call and
dispatches it to a responsible Service using actions in
previously registered commands. Then, it gets the
result or an error, packages it into an InvokeResponse
object, and sends it back to the client.
4. The client receives the InvokeResponse, unpackages
it and returns the command result or error.
As we can see from the pattern description, microservice developers using the Commandable pattern are able to implement a few generic CommandableControllers for all supported communication protocols. Those controllers can then be reused across all microservices, providing robust communication, interoperability, error handling, and traceability.
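A minimal sketch of this idea is shown below. The InvokeRequest and InvokeResponse envelopes and the registry-based controller are hypothetical illustrations of the pattern, not the book's reference implementation:

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical generic controller: services register named commands,
// and a single invoke() entry point dispatches requests to them.
public class CommandableController {
    private final Map<String, Function<Map<String, Object>, Object>> commands = new HashMap<>();

    public void register(String name, Function<Map<String, Object>, Object> action) {
        commands.put(name, action);
    }

    public InvokeResponse invoke(InvokeRequest request) {
        var action = commands.get(request.command());
        if (action == null) {
            return new InvokeResponse(null, "Unknown command: " + request.command());
        }
        try {
            return new InvokeResponse(action.apply(request.args()), null);
        } catch (Exception ex) {
            return new InvokeResponse(null, ex.getMessage());
        }
    }

    // Illustrative request/response envelopes.
    public record InvokeRequest(String command, Map<String, Object> args) {}
    public record InvokeResponse(Object result, String error) {}
}

A service would register its commands once at startup, for example register("sample.do_something", args -> ...), and every protocol-specific endpoint would simply forward incoming InvokeRequest objects to invoke().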
The main drawback of the Commandable pattern is inefficiency: generic serialization of arguments and results plus dispatching of commands add overhead. However, that only becomes a problem for critical transactions, where operations are called frequently or their payload is large. The good news is that over 80% of calls in microservices systems are not critical. If those are implemented using the Commandable pattern, developers free up their time to handcraft and optimize the critical APIs.

Following are the pros and cons of Commandable Pattern:

Pros:
Provides a uniform way to invoke any functionality,
simplifying client interactions.
Decouples the sender and receiver, promoting system
flexibility and scalability.
Enables efficient asynchronous processing and
batching of commands.
Encapsulated command logic can be reused across the
system and modified independently, enhancing
maintainability.
Facilitates easy logging and auditing, as all command
data can be recorded.

Cons:
Adds complexity, especially in simpler systems where
commands might be overkill.
Serialized command objects might add to latency,
especially for time-sensitive operations.
In case of failures, debugging can be difficult as the
failure point can be anywhere in the command
execution pipeline.

Reliability
Reliability is a paramount concern when designing and
implementing microservices architectures. As systems grow
in size and complexity, ensuring the consistent performance
and availability of services becomes increasingly
challenging. Microservices Reliability patterns provide a suite
of strategies aimed at enhancing the robustness and
resilience of such systems. These patterns address key
aspects of reliability, including fault tolerance, service
degradation, and redundancy, among others. By applying
these patterns, developers can build systems capable of
effectively managing failures, maintaining service quality,
and delivering a seamless user experience, even in the face
of unexpected issues or high demand.

Problem
In microservices systems, independent services often need
to communicate with each other to fulfill an operation. Given
the distributed nature of these systems, such inter-service
communication is prone to various types of failures, such as
network issues, service unavailability, or high latency. These
failures can significantly degrade the performance of the
system, leading to poor user experience, or in the worst
case, complete system shutdown.
For instance, in a scenario where a service is temporarily
unavailable or responding slowly, a naive approach might be
to continuously retry the request. However, this can lead to
further strain on the already struggling service, exacerbating
the problem, a phenomenon known as the retry storm.
Moreover, without proper control mechanisms in place, a
sudden spike in requests (e.g., due to increased user
demand or a bug in the system) can overwhelm a service,
causing it to slow down or even crash. This is known as the
thundering herd problem.
In addition, when a service is facing an issue, it is important
that the failure is isolated and does not cascade to other
services in the system, preventing a single point of failure
from taking down the entire system.
To address these and other similar challenges, a set of patterns known as Communication Reliability Patterns has emerged, including Timeout, Retries, Rate Limiting, and Circuit Breaker. These patterns, when implemented correctly, can significantly improve the reliability of communication between microservices.

Timeout
The Timeout pattern involves setting a specific waiting
period for a response from another service. If the response is
not received within this period, the operation is deemed
failed, freeing up system resources and maintaining
responsiveness. This pattern mitigates the issues of
indefinitely waiting for responses from slow or unresponsive
services. However, the timeout duration must be carefully
determined to avoid unnecessary retries or system
unresponsiveness (See Figure 5.12):
Figure 5.12: Invocation of a microservice with or without a timeout

Timeouts offer a fail-fast mechanism, but the underlying issues causing slow responses should be resolved for the overall system health.
Here is how you can set the timeout for a WebClient
introduced in Spring 5 (Code snippet 5.32).
HttpClient httpClient = HttpClient.create()
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 3000)
        .responseTimeout(Duration.ofMillis(3000));

var client = WebClient.builder()
        .clientConnector(new ReactorClientHttpConnector(httpClient))
        .build();

Following are the pros and cons of Timeout Pattern:

Pros:
Timeouts prevent a system from waiting indefinitely for
a response, freeing up resources to handle other
requests, which improves system performance and
responsiveness.
It enhances system resilience by allowing the system to
fail fast and recover from slow or unresponsive
services.
It provides predictability in terms of maximum
response time from a service.

Cons:
Setting the right timeout can be challenging. A short
timeout might lead to unnecessary failures, while a
long timeout might lead to resource exhaustion.
Timeouts are a fail-safe mechanism, but they do not
address the root cause of the slow or unresponsive
service.
If not combined with other strategies like exponential
backoff or circuit breaker, a high number of timeouts
can lead to an increased error rate, as the system
might repeatedly timeout while trying to access a
failing service.

Retries
The retries pattern in microservices is a resilience strategy
where a failed operation is automatically re-attempted, often
used for transient failures. This pattern enhances the
system's reliability by increasing the chances of an
operation's eventual success. A common approach is to
implement retries with exponential backoff, where the delay
between retries doubles with each failed attempt, preventing
overloading of the service. However, care must be taken to
avoid 'retry storms' and retries should only be used for
operations that are safe to retry, such as idempotent
operations. Figure 5.13 illustrates the invocation of a
microservice with and without retries:

Figure 5.13: Invocation of a microservice with or without retries


Let us say we are using Spring's WebClient to create a REST client that retrieves data from an external API. We can use Project Reactor's retryBackoff method to easily add retries with backoff to our client (Code snippet 5.33).
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;
import org.springframework.http.HttpStatus;
import org.springframework.web.reactive.function.client.ClientResponse;

import java.time.Duration;

public class MyWebClient {

    private WebClient webClient;

    public MyWebClient(String url) {
        this.webClient = WebClient.builder()
                .baseUrl(url)
                .build();
    }

    public Mono<String> getData(String endpoint) {
        return webClient.get()
                .uri(endpoint)
                .retrieve()
                .onStatus(HttpStatus::is5xxServerError, ClientResponse::createException)
                .bodyToMono(String.class)
                .retryBackoff(3, Duration.ofSeconds(1), Duration.ofSeconds(10));
    }
}

Following are the pros and cons of Retries Pattern:

Pros:
Retries can improve the reliability of a system by
automatically re-attempting operations in the face of
temporary failures.
It can help in achieving graceful degradation of the
system by ensuring that transient faults do not result in
immediate operation failure.
The pattern can be customized with various strategies
such as exponential backoff and jitter, providing better
control over how and when retries are performed.

Cons:
Frequent retries can put a strain on the system
resources, especially in high load scenarios or in case
of prolonged failures.
It can increase the overall latency of the system, as it
adds waiting time before retries.
Without proper control mechanisms, retries can lead to
"retry storms", where repeated retries amplify the load
on a struggling service, making it even harder for the
service to recover.

Rate limiter
The rate Limiter pattern controls the number of requests a
client can make to a service within a set timeframe, crucial
for preventing service abuse, protecting resources, and
ensuring fair usage. When a client exceeds the limit, the
server responds with an error and denies further requests
until the limit resets. Rate limiter can be implemented at
various levels (for example, API gateway, service level) using
different algorithms such as the token bucket or the leaky
bucket algorithm. While essential for maintaining service
availability, rate limiter must be managed well to avoid
negatively impacting the client experience (See Figure 5.14):

Figure 5.14: Invocation of a microservice with or without rate limiting

In Spring Boot, you can use the Bucket4j library to implement rate limiting. Bucket4j is a powerful Java library that provides an efficient, lock-free, thread-safe implementation of the Token-Bucket algorithm.
First, add the Bucket4j Spring Boot Starter dependency to
your pom.xml (Code snippet 5.34):
1. <dependency>
2. <groupId>com.giffing.bucket4j.spring.boot.starter<
/groupId>
3. <artifactId>bucket4j-spring-boot-
starter</artifactId>
4. <version>0.9.0</version>
5. </dependency>
Then, configure Bucket4j in your application.properties
(Code snippet 5.35):
1. bucket4j.filters[0].filter-method=servlet
2. bucket4j.filters[0].url=/*
3. bucket4j.filters[0].rate-
limits[0].bandwidths[0].capacity=50
4. bucket4j.filters[0].rate-limits[0].bandwidths[0].time=1
5. bucket4j.filters[0].rate-
limits[0].bandwidths[0].unit=seconds
This configuration sets a limit of 50 requests per second for
all endpoints.
Now, let us create a simple REST service (Code snippet
5.36):
1. import org.springframework.web.bind.annotation.Get
Mapping;
2. import org.springframework.web.bind.annotation.Rest
Controller;
3.
4. @RestController
5. public class SampleRestService {
6.
7. @GetMapping("/doSomething")
8. public String doSomething() {
9. // ... your code here ...
10. return "Done";
11. }
12. }
With this setup, the /doSomething endpoint is now rate
limited to 50 requests per second. If a client makes more
than 50 requests in a single second, the server will respond
with a "429 Too Many Requests" HTTP status code.

Following are the pros and cons of Rate Limiter Pattern:

Pros:
Rate limiting safeguards system resources from being
overwhelmed by a high number of requests, thereby
protecting the availability and performance of the
service.
It helps to prevent service abuse or misuse, whether
accidental or malicious.
By limiting the number of requests a single client can
make, rate limiting ensures that resources are
distributed fairly among multiple clients.

Cons:
Overly restrictive rate limits can negatively impact the
client experience, leading to frustration if legitimate
requests are being blocked.
Implementing a sophisticated rate limiting algorithm
that can adapt to the system's current state can add
complexity to the system design.
Fixed rate limits might not work well in a scaling
scenario where the number of requests can fluctuate
dramatically.

Circuit breaker
The Circuit Breaker pattern in microservices architecture
prevents a network or service failure from cascading to other
services by halting the flow of calls to a failing service. It
works by wrapping a protected function call, monitoring for
failures. When failures surpass a threshold, the circuit
breaker 'trips', and further calls return an error without
making the protected call. This pattern safeguards against
cascading failures and helps maintain system resilience by
allowing the faulty service to recover (See Figure 5.15):

Figure 5.15: Invocation of a microservice with or without circuit breaker

The circuit breaker maintains a state that changes based on the results of the function it calls (See Figure 5.16).

Figure 5.16: Circuit Breaker states

The typical states are Closed, Open, and Half-open:


Closed: The circuit breaker allows calls to the
function. If calls fail more than the defined threshold, it
moves to the Open state.
Open: The circuit breaker blocks calls to the function
and returns an error immediately. After a set recovery
period, it moves to the Half-Open state.
Half-Open: The circuit breaker allows a limited
number of calls to go through. If those calls succeed, it
moves back to the Closed state. If they fail, it returns to
the Open state.
The Circuit Breaker pattern can be implemented in Java applications using libraries such as Netflix Hystrix, Resilience4j, or Spring Cloud Circuit Breaker.
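As an illustration, here is a minimal sketch using Resilience4j; the threshold and timing values are arbitrary, and callRemoteService() is a placeholder for the protected remote call:

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

import java.time.Duration;

public class SampleCircuitBreakerClient {

    private final CircuitBreaker circuitBreaker;

    public SampleCircuitBreakerClient() {
        // Trip the breaker when 50% of recent calls fail and keep it
        // open for 30 seconds before moving to the half-open state.
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .waitDurationInOpenState(Duration.ofSeconds(30))
                .slidingWindowSize(10)
                .build();
        circuitBreaker = CircuitBreakerRegistry.of(config).circuitBreaker("sample-service");
    }

    public String callProtected() {
        // While the breaker is open, this fails immediately instead of
        // invoking the struggling service.
        return circuitBreaker.executeSupplier(this::callRemoteService);
    }

    private String callRemoteService() {
        // Placeholder for the actual remote invocation (HTTP, gRPC, etc.).
        return "response";
    }
}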

Following are the pros and cons of Circuit Breaker Pattern:

Pros:
The circuit breaker pattern increases the resilience of a
system by preventing cascading failures and giving
failing services time to recover.
By preventing further damage from a failing service,
the pattern contributes to overall system stability and
availability.
Clients are not left hanging for a response from a
failing service. The pattern promotes the fail-fast
approach, reducing the impact on system performance.

Cons:
Implementing a circuit breaker can add complexity to
the system. Developers need to decide on the threshold
for failures, the timeout period, and manage the state
transitions.
Circuit breakers depend on external services to reset
their state. If these services are not reliable, it can lead
to problems.
If not monitored correctly, a tripped circuit breaker
can mask the underlying problem of a consistently
failing service.

Client library
The client library pattern simplifies the integration
between clients and services in microservice systems. It
provides a pre-packaged code library that clients can include
in their applications, abstracting away the complexities of
interacting with the service's API. By using a client library,
developers can focus on their application logic while
ensuring consistent and correct usage of the service.
However, maintaining and versioning the library for different
languages and managing coupling between clients and the
library can pose challenges.

Problem
In order to call a microservice, consumers need to develop a client and implement a communication protocol. Since a microservice usually has more than one consumer, it is impractical to develop that code multiple times. Also, microservice developers have intimate knowledge of the API and are able to develop an optimal and reliable client much faster. That code can then be packaged as a client library, released, and shared by all consumers.
Critics of the pattern often say that it can tightly couple client applications to the library, making it difficult to switch or update the library, as well as to adapt to changes in the service's API. To mitigate those issues, you should follow a few recommendations:
Follow the smart endpoints and dumb pipes principle. Do not place complex logic into a client. The client should only be responsible for implementing the communication protocol and, optionally, offering convenience methods and optimizations.
Version the client library.
Use only versioned dependencies for the client library
and minimize their number.

Solution
A client library typically contains data objects, an abstract client interface, and one or more implementations that call the microservice using supported protocols (See Figure 5.17):

Figure 5.17: Typical structure of a client library

To increase value for consumers, a client library, in addition to implementation of clients for inter-process communication protocols, can include the following:
Random data generators to enable generation of
realistic pseudo-random data for non-functional
testing.
Mock clients to cut dependencies and enable unit
testing for consumer code.
Direct clients to call microservice services directly
packaged together with consumer code in the same
process. This feature allows developers to develop
systems following microservices architecture
principles, but package and deploy them as monoliths.
Read more about this in Chapter 14, Assembling and Deploying Products.
Example of Mock client (Code snippet 5.37):
1. package com.example.client;
2.
3.
4. public class MockClient implements IClient {
5.
6. @Override
7. public String doSomething(String correlationId) {
8. return "Do something method called!";
9. }
10. }
Example of a Direct client (Code snippet 5.38):
package com.example.client;

import com.example.service.logic.IController;
import org.pipservices3.commons.refer.Descriptor;

public class DirectClient extends org.pipservices3.rpc.clients.DirectClient<IController> implements IClient {

    public DirectClient() {
        super();
        this._dependencyResolver.put("controller",
                new Descriptor("sample-service", "controller", "*", "*", "1.0"));
    }

    @Override
    public String doSomething(String correlationId) {
        return this._controller.doSomething(correlationId);
    }
}
Example of a REST client (Code snippet 5.39):
package com.example.client;

import jakarta.ws.rs.HttpMethod;
import org.pipservices3.commons.errors.ApplicationException;

public class RestClient extends org.pipservices3.rpc.clients.RestClient implements IClient {

    public RestClient() {
        super("someservice/v1");
    }

    @Override
    public String doSomething(String correlationId) {
        try {
            return this.call(String.class, "example_client", HttpMethod.GET, "/doSomething", null);
        } catch (ApplicationException ex) {
            throw new RuntimeException(ex);
        }
    }
}

Following are the pros and cons of Client Library Pattern:

Pros:
Client libraries provide pre-packaged code that
simplifies the integration process between clients and
services, abstracting away the complexities of
interacting with the service's API.
By using a client library, clients can ensure consistent
usage of the service's API across different applications
and teams, as the library enforces the API contract.
Client libraries save development time and effort by
providing ready-made solutions for interacting with
services, allowing developers to focus on their
application logic rather than low-level integration
details.

Cons:
The use of client libraries can introduce coupling
between client applications and the library itself,
making it challenging to switch or update the library or
adapt to changes in the service's API.
Maintaining and versioning client libraries for different
languages and frameworks requires dedicated effort
and can introduce complexities in managing
compatibility and backward compatibility.
Client libraries may not cater to every specific need or
use case of clients, limiting their flexibility in
implementing custom behavior or making adjustments
to suit unique requirements.

Conclusion
In this chapter, we learned about different synchronous and
asynchronous patterns, API documentation and versioning
techniques, error handling strategies, and client libraries.
With these patterns at your disposal, you can build a resilient
and scalable microservices-based application that is both
easy to maintain and extend. The next chapter Working with
Data introduces you to the fundamental patterns for
managing data in microservices.

Further reading
Vinci, A. Microservices Communication Architecture Patterns. Medium. Jan 12, 2023. Available at https://medium.com/@vinciabhinav7/microservices-communication-architecture-patterns-a8e77e614c2c
Yusanif, I. Best practices to communicate between microservices. Medium. Dec 2, 2021. Available at https://irfanyusanif.medium.com/how-to-communicate-between-microservices-7956ed68a99a
Jeong, S. Handling Exceptions with RestControllerAdvice, ExceptionHandler. Medium. Nov 5, 2022. Available at https://medium.com/@Seonggil/handling-exceptions-with-restcontrolleradvice-exceptionhandler-e7c95216da8d
Radar, F. Handling Microservices with gRPC and REST API. Medium. Nov 4, 2022. Available at https://fonradar.medium.com/ali-okan-kara-a3d0b61610d
Sharma, K. Principles & Best practices of REST API Design. Medium. Nov 21, 2021. Available at https://blog.devgenius.io/best-practice-and-cheat-sheet-for-rest-api-design-6a6e12dfa89f
Khaitan, N. gRPC vs REST — Comparing API Architecture. Medium. Dec 11, 2022. Available at https://medium.com/towards-polyglot-architecture/grpc-vs-rest-comparing-api-architecture-4be9b1cdc703
Kogut, O.S. An Introduction to Request-Reply Pattern and Its Uses. Medium. Dec 3, 2021. Available at https://aws.plainenglish.io/an-introduction-to-request-reply-pattern-and-its-uses-2a0bb74ff7d8?gi=3edf4f8cf33b
Jeong, S. Develop event-driven Applications using Spring Events. Medium. Jan 8, 2023. Available at https://medium.com/@Seonggil/develop-event-driven-applications-using-spring-events-5da5ef7bc02c
DevChris. SpringBoot — API Versioning — Fast&Easy. Medium. April 6, 2023. Available at https://medium.com/@DevChris01/springboot-api-versioning-fast-easy-c3ef2c87452f
Soma. What is Circuit Breaker Design Pattern in Microservices? Spring Cloud Netflix Hystrix Example in Java? Medium. Mar 25, 2023. Available at https://medium.com/javarevisited/what-is-circuit-breaker-design-pattern-in-microservices-java-spring-cloud-netflix-hystrix-example-f285929d7f68
The AsyncAPI Initiative. Building the future of Event-Driven Architectures (EDA). Available at https://www.asyncapi.com/en
Gupta, L. REST API Tutorial. Available at https://restfulapi.net/
Vins. Rate Limiter Pattern – Microservice Design Patterns. VinsGuru. November 12, 2020. Available at https://www.vinsguru.com/rate-limiter-pattern/
CHAPTER 6
Working with Data

Introduction
This chapter introduces you to the fundamental patterns for
defining, processing, and storing data in microservices. We
will explore a range of topics from data objects, keys, and
data management strategies to data schema, data integrity,
and data deletion. Each section will delve into the specifics,
discussing the rationale, advantages, disadvantages, and
best practices associated with each pattern. By the end of
this chapter, you will be well-equipped with the knowledge to
design and implement effective data strategies in a
microservices architecture.

Structure
In this chapter, we will cover the following topics:
Data objects
Static data
Dynamic data
Object ID
Natural key
Generated key
GUID
Data management
CRUD
CQRS
Event sourcing
Materialized view
Dynamic query
Filter
Pagination
Sorting
Projection
Database architecture
Database per service
Database sharding
Data migration
Disruptive migration
Versioned tables
Schemaless
Antipatterns
Static queries
Shared databases

Objectives
After studying this chapter, you should be able to understand
and apply data management patterns such as CRUD, CQRS,
event sourcing, and materialized views in microservices. You
will learn to implement dynamic queries for filtering,
pagination, sorting, and projection. Additionally, you will gain
insights into database architecture choices like database per
service and database sharding. You will also explore data
migration strategies, including disruptive migration,
versioned tables, and schemaless approaches. Lastly, you
will recognize and avoid antipatterns related to static queries
and shared databases.

Data objects
Before we dive into more advanced patterns, let's start with
the basics of data representation in microservices: Data
Objects or Data Transfer Objects (DTOs). DTOs form the
foundation of how data is structured, processed,
communicated, and stored within our services.

Problem
In a microservices architecture, data is a primary concern.
Services need to process data, communicate it to other
services, and store it effectively. However, the raw data is
often unstructured, complex, and difficult to manage. This
complexity can lead to inefficient code, difficulties in data
serialization and deserialization, and increased chances of
errors during data processing and transmission. Moreover,
maintaining the integrity and consistency of data as it
travels across services or layers within a service can be a
significant challenge.

Static data
A static data object typically represents a particular business
entity and includes properties to store the entity's data along
with methods to manipulate that data. These objects are
designed with a known, fixed schema, which allows for
straightforward serialization and deserialization of the data,
making them a good fit for use cases where the data
structure is well-understood and not likely to change
frequently.
Here is an example of a User data object that supports
serialization (Code snippet 6.1):
1. import java.io.Serializable;

2.
3. public class User implements Serializable {

4. private static final long serialVersionUID = 1L;

5.
6. private String id;

7. private String name;

8. private String email;

9.
10. public User() {}

11.
12. public User(String id, String name, String email) {

13. this.id = id;

14. this.name = name;

15. this.email = email;

16. }

17.
18. // getters and setters

19. public String getId() {


20. return id;

21. }

22.
23. public void setId(String id) {

24. this.id = id;

25. }

26.
27. public String getName() {

28. return name;

29. }

30.
31. public void setName(String name) {

32. this.name = name;

33. }

34.
35. public String getEmail() {

36. return email;

37. }

38.
39. public void setEmail(String email) {

40. this.email = email;

41. }

42.
43. @Override
44. public String toString() {

45. return "User{" +

46. "id='" + id + '\'' +

47. ", name='" + name + '\'' +

48. ", email='" + email + '\'' +

49. '}';

50. }

51. }

Leveraging static data representations significantly simplifies the process of writing code that interacts with that data.
However, such representations can add complexity when
implementing generic patterns. To facilitate generic
implementation, a portion of the data can adhere to specific
conventions.
This adherence can either be achieved implicitly, where
fields are accessed via reflection (although this method
tends to be slower and less efficient), or explicitly, by
extending a base class or implementing specific interfaces
that necessitate certain fields.
For example, an interface named Identifiable could require a
unique object ID field. A versioned interface might mandate a
version number and a trackable interface could be designed
to monitor the state of an object (Code snippet 6.2):
1. import java.time.LocalDateTime;

2.
3. public interface Identifiable {

4. String getId();

5. }
6.
7. public interface Versioned {

8. long getVersion();

9. }

10.
11. public interface Trackable {

12. LocalDateTime getCreatedAt();

13. LocalDateTime getUpdatedAt();

14. boolean isDeleted();

15. }

16.
17. public class User implements Identifiable, Versioned, Trackable {

18. private String id;

19. private String name;

20. private String email;

21. private long version;

22. private LocalDateTime createdAt;

23. private LocalDateTime updatedAt;

24. private boolean deleted;

25.
26. // Constructor, getters and setters…

27. …

28.
29. @Override
30. public String getId() {

31. return this.id;

32. }

33.
34. @Override

35. public long getVersion() {

36. return this.version;

37. }

38.
39. @Override

40. public LocalDateTime getCreatedAt() {

41. return this.createdAt;

42. }

43.
44. @Override

45. public LocalDateTime getUpdatedAt() {

46. return this.updatedAt;

47. }

48.
49. @Override

50. public boolean isDeleted() {

51. return this.deleted;

52. }

53.
54. // Other getters and setters...

55. }

Following are the pros and cons of Static Data Pattern:

Pros:
Static data objects provide a fixed, well-defined
structure for data, enhancing code readability and
maintainability.
The predictable structure of static data objects enables
efficient data processing, serialization, and
deserialization.
Static data objects safeguard data integrity and
consistency throughout its lifecycle.
The consistent structure of static data objects
simplifies the codebase and makes it easier to
understand.

Cons:
Any changes to the data structure require modifying
the object's schema, affecting all system components
reliant on the object.
Static data objects may not scale well for applications
with evolving or complex data models.
In scenarios with intricate or deeply nested data
models, static data objects can become unwieldy and
cumbersome.
Defining all fields and their types upfront for static
data objects can lead to additional development time
and overhead.
Dynamic data
Dynamic data objects offer a more flexible alternative to
static data objects, trading off some of the structural rigidity
for adaptability. Unlike static data objects that require a
predefined schema, dynamic data objects can be modified
and extended more freely as the application evolves.
These objects are particularly advantageous when working
with generic processing algorithms, as they allow data to be
manipulated without adhering to a strict, predefined
structure. Additionally, dynamic data objects can facilitate
partial updates to the data, eliminating the need to handle or
transmit the entire data object when only a subset of the
data is relevant. They also excel in situations where queries
with projections are used, as they can easily adapt to return
only the specific fields requested in the projection. This
flexibility can lead to improved efficiency and reduced
network traffic in distributed systems.
In Java, we can use a Map as a dynamic data object. A Map
allows us to add, remove, and modify key-value pairs
dynamically at runtime. Here is an example (Code snippet
6.3):
1. Map<String, Object> user = new HashMap<>();

2.
3. user.put("id", "123");

4. user.put("name", "John Doe");

5. user.put("email", "[email protected]");

Following are the pros and cons of Dynamic Data Pattern:

Pros:
Dynamic data objects can adapt to changes in the data
structure without requiring modifications to the
object's schema, which can be especially beneficial in
rapidly evolving applications.
They are well-suited to applications with complex or
changing data models, as they can easily accommodate
additional data fields.
Dynamic data objects can facilitate partial updates and
queries with projections, which can lead to improved
efficiency and reduced network traffic.
They are ideal for generic processing algorithms,
enabling data manipulation without adhering to a rigid,
predefined structure.

Cons:
The lack of a fixed structure can make the code harder
to understand and maintain. Additionally, developers
must handle the possibility of missing or unexpected
data fields.
Accessing data in dynamic data objects can be slower
than in static data objects, especially in strongly typed
languages like Java.
Without a fixed schema, it is harder to ensure data
consistency and integrity across the application.
Dynamic data objects often sacrifice type safety,
meaning that developers must be careful to handle
data of the correct type.

Object ID
Object identifiers, or IDs, serve as a fundamental attribute in
the realm of data management, providing a unique tag to
each data object within a system. The method by which
these IDs are generated is crucial, as it influences the
performance, scalability, and reliability of the system.

Problem
Creating unique object IDs in a distributed microservices
system is a complex task. These IDs are critical for managing
data across various services, but their generation can
become challenging when multiple service instances
generate objects concurrently. Traditional methods like
sequential or auto-incremented IDs can cause conflicts, while
natural keys, such as email addresses or usernames, are not
always applicable. Coordinating ID generation across
services can also reduce system performance and scalability.
Hence, effective strategies are necessary for generating
universally unique IDs in such an environment.

Natural key
Natural keys, also known as business or domain keys, are
unique identifiers derived from the data itself, like a user's
email address or a product's SKU. They align closely with
real-world identities, making them intuitive and user-friendly.
However, their use requires guaranteed uniqueness of the
business data, which may not always be possible. Changes in
business rules could necessitate modifications to the natural
key, adding complexity. Also, as they're based on business
data, natural keys might inadvertently expose sensitive
information. Despite these challenges, in contexts where
certain data attributes are assuredly unique, natural keys
can serve as effective object identifiers.
In this example, a User object will be defined in Java, where
the email attribute serves as the natural key (Code snippet
6.4):
1. public class User implements Identifiable {
2. private String email; // natural key

3. private String name;

4. private String address;

5.
6. public String getId() {

7. return email;

8. }

9.
10. …

11. }

Following are the pros and cons of Natural Key Pattern:

Pros:
Natural keys align with real-world identities and
attributes, making them intuitive and easily
understandable.
Natural keys often correspond to meaningful data
elements, such as email addresses or product SKUs,
which can be more user-friendly.
In some cases, natural keys can guarantee uniqueness
within the context of the business domain, reducing the
risk of conflicts.
Natural keys eliminate the need for additional
generated key management, simplifying the data model
and reducing overhead.

Cons:
Natural keys may not be universally applicable, as not
all objects possess attributes that can serve as unique
identifiers.
Alterations to business rules may require modifying the
natural key, potentially leading to complex updates and
data integrity issues.
Natural keys might inadvertently expose sensitive
information, since they are derived from business data.
Ensuring the uniqueness of natural keys across the
entire system can be challenging, as it relies on the
accuracy and consistency of the data.

Generated key
Generated keys are a common method for creating unique
identifiers in a system, often implemented as auto-
incremented keys generated by databases. When a new
object is inserted into the database, the database
automatically assigns a unique, incrementing value as the
object's key. However, this approach requires a roundtrip to
the database, which can introduce latency.
In distributed systems, a potential solution is to utilize a
dedicated key generator microservice. This service is
responsible for producing unique IDs that are distributed
across the system. The service can generate a single ID or a
batch of IDs on demand, reducing the number of calls
needed and helping to improve performance. However, this
approach can introduce its own complexities, such as
ensuring the high availability and performance of the key
generator service itself. In summary, while generated keys
can provide a reliable method for creating unique IDs, their
implementation needs to be carefully considered to ensure
efficiency and scalability in a distributed system.
In Java, a key generator interface might look something like
this (Code snippet 6.5):
1. public interface KeyGenerator {

2. String generateKey();

3. List<String> generateKeys(int count);

4. }

In this interface, there are two methods: `generateKey()` for generating a single unique key, and `generateKeys(int count)` for generating a batch of unique keys. The actual implementation of these methods will depend on your specific key generation strategy.
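For illustration, here is one possible in-memory implementation of the KeyGenerator interface above. The class name BlockKeyGenerator and the prefix-plus-counter scheme are assumptions made for this sketch; a production key generator service would back the counter with durable storage, such as a database sequence, rather than an AtomicLong:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory implementation of the KeyGenerator interface.
public class BlockKeyGenerator implements KeyGenerator {

    private final String nodePrefix;               // distinguishes generator instances
    private final AtomicLong counter = new AtomicLong();

    public BlockKeyGenerator(String nodePrefix) {
        this.nodePrefix = nodePrefix;
    }

    @Override
    public String generateKey() {
        // Combine the node prefix with a monotonically increasing counter
        return nodePrefix + "-" + counter.incrementAndGet();
    }

    @Override
    public List<String> generateKeys(int count) {
        // Reserve a whole block with a single counter update to reduce contention
        long end = counter.addAndGet(count);
        List<String> keys = new ArrayList<>(count);
        for (long value = end - count + 1; value <= end; value++) {
            keys.add(nodePrefix + "-" + value);
        }
        return keys;
    }
}

Handing out keys in blocks is what allows callers to avoid a round trip to the generator for every new object.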
Following are the pros and cons of Generated Key Pattern:

Pros:
Generated object keys, such as auto-incremented or
specialized key generator microservices, can provide a
high level of uniqueness, ensuring that each object is
assigned a distinct identifier.
Generated keys often have predictable ordering,
making them suitable for efficient indexing and
querying in databases.
Generated keys can simplify the data model by
providing a standardized approach for generating
unique IDs, eliminating the need to rely on potentially
complex natural keys.
Key generators can be designed to handle the
scalability demands of distributed systems, generating
unique IDs across multiple instances without conflicts.

Cons:
Generated keys, especially when obtained from a
centralized key generator service, introduce a
dependency on external resources, which can impact
system availability and performance.
In distributed systems, coordinating the generation and
assignment of unique keys across multiple instances
may introduce synchronization overhead and potential
bottlenecks.
Generated keys typically lack inherent meaning or relevance to the business domain, as they are usually numerical or alphanumeric sequences, which may make them less user-friendly.
While generated keys can guarantee uniqueness within
the system, conflicts may still arise when integrating
data from external sources that use different key
generation mechanisms.

GUID
Globally Unique Identifiers (GUIDs), also known as
Universally Unique Identifiers (UUIDs), offer a robust
solution for generating unique IDs, especially in distributed
systems. GUIDs are produced using specific algorithms
designed to ensure their global uniqueness without the need
for central coordination. This makes them particularly
suitable for distributed environments where such
coordination is expensive or unfeasible.
While GUIDs can be represented in binary form, using their
string representation often provides more versatility. The
string representation of a GUID, typically in a 36-character
format like "550e8400-e29b-41d4-a716-446655440000", is
widely recognized and compatible across different systems
and programming languages. This makes it easier to store,
transmit, and use these IDs across various parts of a system,
regardless of the underlying technology. However, it's worth
noting that compared to shorter, numerical IDs, GUIDs can
take up more storage space and be less efficient to index
and query in databases, which is a trade-off to consider in
their use.
UUIDs are designed to have a high degree of uniqueness by
employing various strategies in their generation algorithms.
One commonly used version of UUIDs is based on random
number generation. In this approach, a UUID is generated
using a combination of random bits, typically from a
cryptographically secure pseudo-random number generator.
These random bits are then combined with other
components, such as timestamps, MAC addresses, or other
unique identifiers, depending on the specific version of UUID
being used. By incorporating these different elements, UUID
generation algorithms ensure that the likelihood of
generating duplicate UUIDs is extremely low, even when
generating large numbers of UUIDs across distributed
systems or over long periods of time. Additionally, UUID
standards specify formats and guidelines to further minimize
the chance of collisions, ensuring that UUIDs remain globally
unique.
To generate a string GUID (UUID) object ID in Java, you can
use the java.util.UUID class (Code snippet 6.6):
1. UUID uuid = UUID.randomUUID();

2. String objectId = uuid.toString();

Following are the pros and cons of GUID Pattern:

Pros:
GUIDs offer a high probability of being globally unique,
regardless of the system or environment in which they
are generated. This eliminates the need for
coordination across distributed systems.
GUIDs can be generated without relying on a central
authority or coordination, making them suitable for
decentralized and distributed systems.
GUIDs can be represented as strings, making them
compatible with various systems, databases, and
programming languages. The string representation is
more universal than binary, facilitating
interoperability.
The probability of collision between two independently
generated GUIDs is extremely low, reducing the
chances of ID conflicts.

Cons:
GUIDs are typically represented as 128-bit values,
which can consume more storage space than shorter
numerical IDs, especially when dealing with large
datasets.
The string representation of GUIDs can be less human-
readable and less intuitive than other types of IDs,
making it harder to work with and debug.
Compared to shorter, numerical IDs, the complexity
and size of GUIDs can have performance implications,
especially when used as indexed keys in databases.
When using GUIDs as indexed keys in databases,
insertions can lead to higher index fragmentation due
to the non-sequential nature of GUIDs, potentially
impacting query performance.

Data management
In the context of microservices architecture, data
management patterns emerge as vital strategies for
orchestrating data flow, manipulation, and storage. These
patterns, including Create, Read, Update, Delete
(CRUD), Command Query Responsibility Segregation
(CQRS), Event Sourcing, and others, offer a foundation for
building scalable, maintainable, and robust systems. They
provide abstracted structures that guide developers in
modeling data, separating concerns, and managing state
changes over time. Understanding and correctly
implementing these patterns is pivotal in navigating the
complexities of data in distributed systems, fostering data
consistency, integrity, and facilitating an efficient data
exchange between services. As we delve deeper into these
patterns, we will gain insights into their principles,
advantages, and potential challenges, along with their
practical applications in Java-based microservices.

Problem
In the vast expanse of microservices architecture, a pivotal
challenge lies in designing an effective data management
strategy. This strategy encompasses how updates are
transmitted, data is stored, and queries are retrieved across
distributed services. The complexity of maintaining data
integrity, consistency, and performance underlines the
criticality of a well-architected design. Without a thoughtful
design backed by appropriate data management patterns
like CRUD, CQRS, or Event Sourcing, developers may
encounter problems such as data inconsistency,
performance degradation, and scalability issues. Thus, the
challenge is not just about understanding and selecting
these patterns, but also architecting a system that
seamlessly integrates these patterns to facilitate efficient
data transmission, storage, and retrieval. An incorrect or ill-
fitted design choice can potentially lead to increased
complexity and suboptimal system performance. Therefore,
the problem centers on designing a robust architecture that
employs the most suitable data management patterns for a
specific microservices system.

CRUD
The CRUD (Create, Read, Update, Delete) pattern, a
cornerstone of data management, outlines the four
fundamental operations performed on any persistent data
storage:
Create: This operation represents the insertion of new
data into the storage system. This typically equates to
the instantiation of a new entity or object.
Read: This operation refers to the retrieval of existing
data from the storage. It is an operation that allows
viewing of data without making alterations.
Update: This operation signifies modifications to
existing data within the storage, including alteration of
attribute values or relationships.
Delete: This operation denotes the elimination of
existing data from the storage system.
One of the defining characteristics of the CRUD pattern is
that it employs the same data model for both read and write
operations, ensuring consistency in data representation.
Moreover, the CRUD pattern presupposes a unified storage
system, with all data contained within a single repository,
refer to Figure 6.1:
Figure 6.1: CRUD data management

The inherent simplicity makes the CRUD pattern widely adaptable and fundamental to data manipulation in
microservices architecture, where each service typically
implements these operations for its own specific data.
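To make the four operations concrete, the following minimal sketch shows how they typically surface as a Java interface. The name CrudOperations and its method signatures are illustrative assumptions rather than the API of any specific framework:

import java.util.List;
import java.util.Optional;

// Illustrative CRUD contract: the same data model (T) is used for reads and
// writes, and all operations target a single underlying store.
public interface CrudOperations<T, ID> {

    T create(T entity);              // Create: insert a new record

    Optional<T> read(ID id);         // Read: fetch a record by its identifier

    List<T> readAll();               // Read: fetch all records

    T update(ID id, T entity);       // Update: modify an existing record

    void delete(ID id);              // Delete: remove a record
}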
Following are the pros and cons of CRUD Pattern:

Pros:
CRUD provides a straightforward and intuitive
approach to data manipulation. It covers the basic
operations needed for most applications.
CRUD operations can be applied to any database or
persistent storage mechanism, making the pattern
adaptable across different technologies and platforms.
By using the same data model for read and write
operations, CRUD ensures uniformity and predictability
in handling data.
The operations in CRUD map directly to low-level
database operations, making it easier for developers to
understand and implement.

Cons:
CRUD might not be suitable for complex operations or
business logic. It is often not sufficient when there is a
significant divergence between the models used for
reading and writing data.
With a single storage system, scalability can become an
issue for large applications as all operations are
directed towards a unified data source.
CRUD operations can lead to inefficiencies in fetching
data. Read operations might fetch more data than
needed (over-fetching) or not enough data (under-
fetching), leading to additional database requests.
The CRUD pattern typically doesn't store historical
data. Once a record is updated or deleted, the previous
state is lost unless specifically programmed to retain
that information.

CQRS
Command Query Responsibility Segregation (CQRS) is
a design pattern that diverges from the traditional CRUD
approach by segregating the operations for reading and
writing into separate models. This segregation allows each
model to be designed, developed, and scaled independently
based on specific requirements, which can result in
significant performance benefits and greater flexibility. In a
CQRS-based system, the command model deals with the
update operations (Create, Update, Delete), altering the
state of the system, whereas the query model handles the
read operations, retrieving data without causing any state
changes. This separation allows each side to be optimized
according to its own needs. For example, you might optimize
the command side for write performance while the query
side could be optimized for read performance. CQRS is
particularly effective in complex domain-driven design
scenarios where the business rules for read and write
operations differ significantly.
In the CQRS pattern, commands, which denote state-
changing operations, may not necessarily be persisted
directly in a write database; instead, they are typically stored
in an Event Store. These stored events are then utilized to
populate one or more read databases, thus ensuring data
consistency across the system, refer to Figure 6.2:

Figure 6.2: CQRS Data Management

Utilizing the history of commands stored in an Event Store
provides valuable opportunities for operations such as
logging and auditing, allowing for a detailed examination of
system activity over time. Moreover, this retained sequence
of commands can also facilitate undo operations, enabling a
system to revert to a previous state by replaying or reverting
these commands.
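As a minimal sketch of this separation, the hypothetical interfaces below split the User operations from earlier examples into a command side and a query side. The names UserCommandHandler, UserQueryHandler, and UserView are assumptions for illustration, not part of a specific CQRS framework:

import java.util.List;
import java.util.Optional;

// Command side: state-changing operations only; nothing is returned beyond acknowledgement.
interface UserCommandHandler {
    void createUser(String id, String name, String email);
    void changeEmail(String id, String newEmail);
    void deleteUser(String id);
}

// Query side: read-only operations served from a model optimized for reads.
interface UserQueryHandler {
    Optional<UserView> findById(String id);
    List<UserView> findByName(String name);
}

// Read model shaped for queries; it does not need to mirror the write model.
class UserView {
    String id;
    String displayName;
    String email;
}

Because the two sides share no interface, each can be deployed, scaled, and optimized independently.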
Following are the pros and cons of CQRS Pattern:

Pros:
Separate read/write models allow for workload-specific
scaling, improving overall system scalability.
Individual tuning of read/write sides enhances
performance.
CQRS can streamline complex systems where
read/write models significantly differ, improving
maintainability.
Stored commands provide an inherent audit trail,
aiding logging, debugging, and tracing system
behavior.

Cons:
Synchronization between separate read/write sides can
introduce system complexity.
Updates on the read side may be delayed, posing issues
for applications requiring real-time data.
Separate models, databases, and synchronization logic
require more planning and development.
Separate databases for read/write operations can lead
to data duplication.

Event Sourcing
Event Sourcing is a design pattern where all changes to the
application state are stored as a sequence of events. Instead
of storing just the current state of the data in a domain, it
uses an append-only store to record the full series of actions
taken on the data. Each action corresponds to an event
object that is appended to the event store.
A significant advantage of the Event Sourcing pattern is its
ability to reconstruct the past application state by replaying
the events. This capability opens up several possibilities like
providing an audit trail, debugging by understanding the
sequence of events leading to a certain state, and even
system versioning by taking the system back to a previous
state.
Event Sourcing can also work in conjunction with the CQRS
pattern. Here, the "write" model can correspond to the event
store, while the "read" model can be implemented using a
Materialized View pattern (see below), keeping the read side
optimized for query operations.
When it comes to implementing event sourcing, there are a
few key strategies to consider (refer to Figure 6.3):
Event Store: A fundamental requirement of Event
Sourcing is an Event Store - a database built
specifically for storing events. This can be built using
existing database technologies or specialized event
storage solutions.
Event Objects: Events should be designed as
immutable objects that capture a single change to the
domain data.
Event Replay: The ability to replay events to
reconstruct a particular state of the application is
crucial. This may involve building a replay mechanism
and maintaining snapshots of certain points in time to
speed up the reconstruction.
Event Handlers: On the read side, events can be
handled by specific components designed to update the
read model or trigger other events.
Figure 6.3: Event sourcing pattern

When implementing the event sourcing pattern, you have a
variety of options for the underlying technologies, including
Apache Kafka, relational databases, NoSQL databases, and
more specialized event sourcing platforms.
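The following minimal sketch illustrates these building blocks with a hypothetical immutable event and an append-only event store interface. The names and signatures are assumptions for illustration, not the API of a particular event sourcing platform:

import java.time.Instant;
import java.util.List;

// Immutable event capturing a single change to a User aggregate.
final class UserEmailChanged {
    private final String userId;
    private final String newEmail;
    private final Instant occurredAt;

    UserEmailChanged(String userId, String newEmail, Instant occurredAt) {
        this.userId = userId;
        this.newEmail = newEmail;
        this.occurredAt = occurredAt;
    }

    String getUserId() { return userId; }
    String getNewEmail() { return newEmail; }
    Instant getOccurredAt() { return occurredAt; }
}

// Append-only store: events are never updated or deleted, only appended and replayed.
interface EventStore {
    void append(String streamId, Object event);
    List<Object> readStream(String streamId);   // used to replay and rebuild state
}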
Following are the pros and cons of Event Sourcing Pattern:

Pros:
Event Sourcing provides a comprehensive record of all
system changes, facilitating auditing requirements.
The ability to replay events aids in debugging by
reconstructing past states and tracing event
sequences.
The state of the system at any given point in time can
be examined by reviewing stored events.
Events can be replayed to synchronize other systems,
populate read models, or recover states.
Event Sourcing enables independent scaling of write
and read operations, beneficial for high-performance
systems.
Cons:
Evolving event structures can create issues with
versioning and compatibility with older events.
Implementing Event Sourcing, especially in
combination with other patterns like CQRS, introduces
system complexity.
Storing all state transitions, including sensitive data,
can pose challenges in complying with regulations like
GDPR.
Replaying all events to reconstruct the current state
can be slow, necessitating the use of snapshots, adding
further complexity.
Event Sourcing represents a different paradigm,
requiring teams to adapt and grasp the concept.

Materialized View
The Materialized View pattern is a design pattern used in
applications that require complex queries or computations
on data that are difficult or time-consuming to perform. The
pattern works by precomputing and storing the result of a
query in a separate database table, which is then updated as
the underlying data changes.
In this pattern, the Materialized View acts as a cache for the
results of complex queries or computations. Instead of
executing a complex query every time the data is requested,
the application can fetch the precomputed results from the
materialized view, providing a significant performance
improvement, refer to Figure 6.4:
Figure 6.4: Materialized View Pattern

Moreover, the Materialized View pattern also helps in
situations where it is advantageous to segregate the read
model from the write model, as is often the case with CQRS
based systems. The write model, which captures all changes
to the system's state, typically maintains an event store,
while the read model, optimized for specific queries, uses
materialized views to efficiently serve data.
When implementing the Materialized View pattern, it is
important to consider how and when the view gets updated.
It can be updated in real-time as changes occur, or in a
deferred manner at regular intervals depending on the
system requirements and performance trade-offs.
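As a simple sketch of keeping a materialized view current, the hypothetical updater below refreshes an assumed employee_summary read table whenever the write side reports a salary change, using Spring's JdbcTemplate. The table and column names are placeholders for illustration:

import org.springframework.jdbc.core.JdbcTemplate;

// Illustrative updater that keeps a denormalized read table in sync with the write side.
public class EmployeeSummaryUpdater {

    private final JdbcTemplate jdbcTemplate;

    public EmployeeSummaryUpdater(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Called by the write side (or an event handler) after each change.
    public void onSalaryChanged(long employeeId, double newSalary) {
        jdbcTemplate.update(
            "UPDATE employee_summary SET salary = ? WHERE employee_id = ?",
            newSalary, employeeId);
    }
}

Whether this update runs synchronously with the write or asynchronously from an event stream is exactly the real-time versus deferred trade-off described above.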
Following are the pros and cons of Materialized View Pattern:

Pros:
Materialized Views enhance read performance by
precomputing and storing complex query results.
Computation is shifted to data-update time, reducing the computational load during query execution.
Materialized Views can be tailored and optimized for
specific queries, enhancing data organization and
retrieval.
In a CQRS-based system, Materialized Views help
maintain data consistency between the read and write
models.

Cons:
Every change to the underlying data requires updating
the Materialized View, potentially incurring significant
computational costs for complex computations or large
datasets.
Depending on the update frequency, there may be a
delay in reflecting changes in the underlying data,
leading to stale data in the Materialized View.
Maintaining and updating Materialized Views
introduces additional complexity to the system.
Storing precomputed data results in additional storage
requirements.

Dynamic query
Dynamic queries are a flexible and powerful data access
pattern that allow you to construct and execute database
queries at runtime, based on user input or application needs.
This adaptability facilitates data retrieval according to
various criteria and conditions that are not known in
advance, enhancing the system's ability to accommodate
diverse and changing requirements.

Problem
Designing and implementing data access patterns in
applications often requires dealing with diverse and
unpredictable query requirements. The challenge arises
when the specifics of these queries, such as the filtering
criteria, sorting order, or the fields to be returned, are not
known at design time but are determined by user input or
business requirements at runtime. Additionally, the system
must be capable of executing these queries efficiently to
ensure optimal performance. Statically defined queries lack
the flexibility to meet these changing demands, and writing
individual static queries for every possible variation would
quickly become unmanageable and inefficient. The issue
then becomes how to architect the data access layer in such
a way that it can dynamically construct and execute queries
based on various input parameters, while maintaining the
system's performance, security, and integrity.
For instance, an Employee microservice may need to support
multiple ways to retrieve employees. The example below
(Code snippet 6.7) shows an interface in Java using Spring
Data JPA:
1. import org.springframework.data.jpa.repository.JpaRepository;

2. import org.springframework.data.jpa.repository.Query;

3. import org.springframework.stereotype.Repository;

4. import java.util.List;

5.
6. @Repository

7. public interface EmployeeRepository extends JpaRepository<Employee, Long> {

8.
9. // Query using Spring Data JPA's Query Creation feature

10. List<Employee> findByFirstName(String firstName);

11.
12. // Query using @Query annotation with JPQL

13. @Query("SELECT e FROM Employee e WHERE e.lastName = :las


tName")

14. List<Employee> findEmployeesByLastName(String lastName);

15.
16. // Query using @Query annotation with native SQL

17. @Query(value = "SELECT * FROM Employee WHERE salary > :salary", nativeQuery = true)

18. List<Employee> findEmployeesWithSalaryAbove(double salary);

19.
20. ...

21. }

Supporting multiple query operations bloats the microservice interface, adds development time, and increases the chances of errors. Moreover, whenever new functionality requires another query operation, it triggers another round of microservice development.

Filtering
Filter parameters in dynamic queries enable selective data
retrieval based on runtime criteria, providing enhanced
flexibility to meet diverse requirements. These parameters
can range from simple attributes like names or dates to more
complex criteria like numerical thresholds or Boolean
conditions. They shape the conditions of the query,
efficiently filtering the dataset. Implementing such a system
necessitates attention to performance, security, and
potential input errors, even as it eliminates the need for
altering underlying code or database structures.
For example, to pass a set of filter parameters to a
microservice, we can use a simple HashMap (Code snippet 6.8):
1. public class FilterParams extends HashMap<String, Object> {

2.
3. public FilterParams() {

4. super();

5. }

6.
7. }

Then, the filter can be translated into a JPA Specification and applied in the EmployeeRepository, as shown below (Code snippet 6.9):
1. public class EmployeeSpecification {
2.
3.     public static Specification<Employee> matchesEmployeeCriteria(FilterParams filter) {
4.
5.         return (root, query, cb) -> {
6.
7.             Predicate firstNamePredicate = !filter.containsKey("firstName")
8.                 ? cb.conjunction()
9.                 : cb.like(root.get("firstName"), "%" + filter.get("firstName") + "%");
10.             Predicate lastNamePredicate = !filter.containsKey("lastName")
11.                 ? cb.conjunction()
12.                 : cb.like(root.get("lastName"), "%" + filter.get("lastName") + "%");
13.
14.             Predicate salaryPredicate = cb.conjunction();
15.             if (filter.containsKey("aboveSalary"))
16.                 salaryPredicate = cb.greaterThan(root.get("salary"), (double) filter.get("aboveSalary"));
17.
18.             return cb.and(firstNamePredicate, lastNamePredicate, salaryPredicate);
19.         };
20.     }
21. }
22. import org.springframework.data.domain.Page;
23. import org.springframework.data.domain.Pageable;
24. import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
25. import org.springframework.data.repository.PagingAndSortingRepository;
26.
27. public interface EmployeeRepository extends PagingAndSortingRepository<Employee, Long>, JpaSpecificationExecutor<Employee> {
28.     Page<Employee> findAll(Specification<Employee> spec, Pageable pageable);
29. }
Pagination
Paging parameters in dynamic queries, often represented as
'page size' and 'page number', allow for efficient navigation
through large datasets by segmenting data into manageable
'pages'. This approach optimizes performance by reducing
data transfer volumes and memory usage, and enhances
user experience by enabling incremental exploration of data.
Implementing such a system requires due attention to
performance implications and user interface design.
Here is an example of a PagingParams class in Java, which can
be serialized and used to pass paging parameters (Code
snippet 6.10):
1. import java.io.Serializable;

2.
3. public class PagingParams implements Serializable {

4.
5. private int pageNumber;

6. private int pageSize;

7.
8. public PagingParams() {

9. }

10.
11. public PagingParams(int pageNumber, int pageSize) {

12. this.pageNumber = pageNumber;

13. this.pageSize = pageSize;

14. }

15.
16. // getters and setters

17.
18. public int getPageNumber() {

19. return pageNumber;

20. }

21.
22. public void setPageNumber(int pageNumber) {

23. this.pageNumber = pageNumber;

24. }

25.
26. public int getPageSize() {

27. return pageSize;

28. }

29.
30. public void setPageSize(int pageSize) {

31. this.pageSize = pageSize;

32. }

33. }

Here is an example of how you might create an EmployeeService that calls EmployeeRepository.findAll and converts PagingParams
into a Pageable (Code snippet 6.11):
1. import org.springframework.data.domain.Page;

2. import org.springframework.data.domain.PageRequest;

3. import org.springframework.data.domain.Pageable;
4. import org.springframework.stereotype.Service;

5.
6. @Service

7. public class EmployeeService {

8.
9. private final EmployeeRepository employeeRepository;

10.
11. public EmployeeService(EmployeeRepository employeeRepository) {

12. this.employeeRepository = employeeRepository;

13. }

14.
15. public Page<Employee> findAllEmployees(FilterParams filterParams, PagingParams pagingParams) {

16. Pageable pageable = PageRequest.of(pagingParams.getPageNumber(), pagingParams.getPageSize());

17. Specification<Employee> specification = EmployeeSpecification.matchesEmployeeCriteria(filterParams);

18.
19. return employeeRepository.findAll(specification, pageable);

20. }

21. }

Sorting
Sorting in dynamic queries refers to the process of arranging
data in a certain order to enhance the readability or to better
understand the data. The sorting order can either be
ascending or descending and can be applied to one or more
fields in the data. The primary advantage of dynamic sorting
is the ability to change the sorting criteria at runtime,
meaning the end-users or services can dictate the order of
the results based on their specific needs or preferences.
In a microservices architecture, sorting parameters are
typically passed to the service through a request object or
parameters, and the service uses these parameters to build
a query. The sorted data is particularly useful for end-users
when exploring and analyzing large volumes of data, and
can also be essential when applying pagination.
For implementation, a field name or names, along with the
sort direction (ascending or descending), are passed to the
service. The service will then construct a sort expression
based on these parameters, which is used to fetch sorted
data from the database.
For example, here is an implementation of SortParams as an
array of SortField objects (Code snippet 6.12):
1. import java.io.Serializable;

2.
3. public class SortField implements Serializable {

4. private String name;

5. private boolean ascending;

6.
7. public SortField() {

8. }

9.
10. public SortField(String name, boolean ascending) {
11. this.name = name;

12. this.ascending = ascending;

13. }

14.
15. // getters and setters

16.
17. public String getName() {

18. return name;

19. }

20.
21. public void setName(String name) {

22. this.name = name;

23. }

24.
25. public boolean isAscending() {

26. return ascending;

27. }

28.
29. public void setAscending(boolean ascending) {

30. this.ascending = ascending;

31. }

32. }

33.
34. public class SortParams extends ArrayList<SortField> implements Serializable {

35. }

And here is how you might modify the EmployeeRepository to allow sorting by selected fields (Code snippet 6.13):
1. @Repository

2. public interface EmployeeRepository extends JpaRepository<Employee, Long>, JpaSpecificationExecutor<Employee> {

3.
4. List<Employee> findAll(Specification<Employee> spec, Pageable pageable, Sort sort);

5. }

Before calling EmployeeRepository, you should convert SortParams into a Sort object (Code snippet 6.14):
1. import org.springframework.data.domain.Sort;

2.
3. public class SortParamsConverter {

4. public static Sort convertToSort(SortParams sortParams) {

5. List<Sort.Order> orders = new ArrayList<>();

6.
7. for (SortField sortField : sortParams) {

8. orders.add(sortField.isAscending() ? Sort.Order.asc(sortField.getName()) : Sort.Order.desc(sortField.getName()));

9. }

10.
11. return Sort.by(orders);
12. }

13. }

Projection
Projection in the context of dynamic queries refers to the
ability to selectively choose which fields of the data model
are returned in a query response. Instead of returning
complete data records, a projection will return only those
fields that are explicitly specified by the query.
This pattern is immensely useful in scenarios where only a
subset of data is required, potentially leading to significant
performance improvements by reducing the amount of data
fetched and transmitted.
In a microservices architecture, projection parameters are
typically passed to the service as part of the request object
or parameters. The service then uses these parameters to
construct a database query that fetches only the specified
fields.
In dynamic queries, when the projection fields can be
defined by users, it's essential to return the result using
dynamic objects, typically implemented as Maps, to
accommodate the variability and adaptability required by the
user-defined data structures (see Dynamic data pattern
above).
Here is a simple implementation of ProjectionParams as a List of
field names (Code snippet 6.15):
1. import java.io.Serializable;

2. import java.util.ArrayList;

3.
4. public class ProjectionParams extends ArrayList<String> implements Serializable {
5. }

Regrettably, numerous technology stacks have limited support for dynamically projected fields. This constraint often
compels developers to generate custom query statements
on the fly that are subsequently sent to the database. While
this approach can offer greater flexibility, it also necessitates
careful attention to potential security vulnerabilities, such as
SQL injection attacks, and can lead to higher maintenance
complexity due to the need for manual query construction
and management.
Here is an example of how it can be done in Spring JPA (Code
snippet 6.16):
1. @Repository

2. public interface EmployeeRepository extends JpaRepository<Employee, Long>, JpaSpecificationExecutor<Employee> {

3. ...

4.
5. @Query(nativeQuery = true)

6. Page<Map<String, Object>> findAllProjected(Specification<Employee> spec, Pageable pageable, Sort sort, @Param("fields") List<String> fields);

7. }

Database architecture
Database architecture is crucial to the functionality and
performance of microservices. This section focuses on the
exploration of key database patterns, primarily emphasizing
on the practices that lend towards a robust, scalable
microservices environment.
Problem
Designing an effective database architecture for
microservices is a complex task. Traditional approaches like
the Shared Database pattern often fall short, primarily due to
their inherent service coupling and lack of isolation,
rendering it an anti-pattern in this context.
The focus should instead shift towards more appropriate and
robust database architecture patterns such as the Database
per Service and Database Sharding patterns. The Database
per Service pattern offers an effective way to maintain
service autonomy, a principle at the heart of microservices.
Each service has its own exclusive database, reinforcing
isolation and reducing dependencies. However, as the data
and demand grow, the need for a scalable solution becomes
evident.
This is where Database Sharding comes into play. Sharding
allows data to be distributed across multiple databases,
improving the system's ability to scale horizontally while also
enhancing performance. The challenge lies in implementing
these patterns effectively and understanding the trade-offs
involved to make informed decisions that suit specific
system requirements.

Database per service


The database per service pattern is one of the most widely
adopted database patterns in the realm of microservices,
ensuring that each microservice has its dedicated database.
This pattern stands in direct contrast to the shared database
pattern, eliminating issues associated with service coupling
and lack of fault isolation.
The main principle behind the database per service pattern
is to maintain service autonomy. Each service owns its
database, making it the only entity that can access and
modify the data in it. This practice encapsulates the data
within each service, fostering a higher degree of data
integrity and isolation (refer to Figure 6.5):

Figure 6.5: Database per Service

The benefits are significant. First, it avoids the problem of
schema evolution in shared databases, as each service has
the liberty to manage and update its schema independently.
Second, it enforces the boundaries between services,
reducing the risk of cascading failures. Third, it allows each
service to use the type of database that is best suited to its
needs, paving the way for polyglot persistence.
However, this pattern does introduce new challenges,
including managing data consistency across services and
implementing complex queries that span multiple services. It
also raises questions about how to handle joint operations
that involve multiple services, which leads us to the concept
of distributed transactions and the eventual consistency
model.
Following are the pros and cons of Database per Service:

Pros:
Independence in schema evolution and development.
Strict control over individual service data.
Faults are limited to affected services, increasing
system resilience.
Freedom to choose the best-suited database type per
service.

Cons:
Challenges in managing distributed data and achieving
eventual consistency.
Situations may arise requiring data synchronization
and duplication across services.
Difficulty in implementing queries spanning across
multiple services.
Implementing distributed transactions can be complex.

Database sharding
As a pivotal component in any microservices architecture,
databases are frequently identified as the primary bottleneck
when it comes to performance and handling large data
volumes. To address this, the database sharding pattern is
often employed, which segments a large database into
smaller, more manageable pieces known as shards. Each
shard operates as a separate database, hosting a fraction of
the total data, thus distributing the load across multiple
databases (refer to Figure 6.6):

Figure 6.6: Database sharding


The selection of the sharding key, a specific data attribute
dictating data distribution, is a critical aspect of this process.
The sharding key directly influences how evenly data is
distributed, the efficiency of queries, and overall scalability
of the system.
There are several methods to define sharding keys, which
include:
Range-based sharding: Data is partitioned based on
ranges defined by the sharding key. For example,
customers could be sharded based on the alphabetical
range of their last names.
Hash-based sharding: The sharding key is hashed
and the resulting hash value determines the shard.
This method often provides a uniform distribution of
data.
List-based sharding: Each shard is assigned a list of
values. If the sharding key matches one of these values,
the corresponding shard is chosen.
Geographic sharding: Sharding is based on
geographical considerations. For instance, users might
be sharded based on their geographical location.
Implementing database sharding can significantly boost
performance and scalability. However, it introduces
additional complexity, potentially complicating query
handling, transaction management, and schema
maintenance.
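As a minimal sketch of hash-based sharding, the hypothetical ShardRouter below maps a sharding key to one of a fixed number of shards. A production system would more likely use consistent hashing so that shards can be added without remapping most keys:

// Minimal sketch of hash-based shard selection: the shard index is derived
// directly from the sharding key, so no central lookup is needed.
public class ShardRouter {

    private final int shardCount;

    public ShardRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    public int shardFor(String shardingKey) {
        // Math.floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(shardingKey.hashCode(), shardCount);
    }
}

For example, new ShardRouter(4).shardFor(customerId) returns a value between 0 and 3 that the data access layer can use to pick the target database.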
Following are the pros and cons of Database Sharding:

Pros:
Sharding can handle increased data growth by
distributing it across multiple databases.
By reducing the size of individual datasets and
distributing the load, sharding can improve query
response times and overall system performance.
With data spread across multiple shards, a failure in
one does not render the entire database unavailable.
Shards can be placed close to users' locations,
decreasing latency and improving user experience.

Cons:
Sharding requires careful planning and introduces
significant complexity into the database design.
Keeping data consistent across multiple shards can be
challenging.
Queries that span multiple shards can be difficult to
handle, potentially decreasing performance.
Once chosen, changing the sharding key can be a
daunting and resource-intensive task.
Managing multiple databases can increase
administrative overhead and cost.

Data migration
Data migration patterns are essential strategies in the
domain of data management, particularly in the context of
distributed systems such as microservices. They provide
methods for moving and transforming data from one storage
system to another, whether for reasons of system upgrades,
changes in data models, transitioning to different databases,
or distributing data across various microservices. Effectively
employing data migration patterns can ensure data integrity,
minimize downtime, and facilitate a seamless transition
during the process of migration.
Problem
Data migration presents a significant challenge in the realm
of software architecture. Its complexity arises primarily from
the potential disruption it can cause to routine operations.
When migration is performed during the process of system
upgrades, it can cause significant interruptions, affecting
both system availability and user experience. Conversely,
executing a migration in parallel with normal operations,
often referred to as a non-disruptive migration, also
introduces its own complexities. It requires a thoughtful
strategy for managing data consistency, concurrency, and
synchronization between the old and new systems while
ensuring that regular operations remain unaffected.
Consequently, the design of an efficient and reliable data
migration strategy that minimizes disruption while
maintaining data integrity is a critical problem faced by
architects in the realm of data management.

Disruptive migration
The disruptive migration pattern involves carrying out a data
migration during a system upgrade. This pattern is called
disruptive because it usually requires the system to go
offline or become unavailable during the migration process.
Particularly for large datasets, this process can be quite
lengthy, taking several hours or even days to complete (refer
to Figure 6.7):
Figure 6.7: Disruptive data migration

In relational databases, a disruptive migration typically
involves transforming the data schema, often accomplished
through SQL commands such as ALTER TABLE. These
commands modify the structure of tables to conform to the
new schema, and can either be written manually or
generated automatically using specialized tools.
Several technologies exist that facilitate automated schema
migration, including but not limited to:
Flyway: An open-source database migration tool that
emphasizes simplicity and convention over
configuration.
Liquibase: A powerful open-source tool for managing
and tracking database schema changes.
Before initiating a disruptive migration, it is highly
recommended to create a comprehensive backup of the
database. This serves as a safeguard against potential errors
or issues during migration. If something goes wrong during
the migration, having a recent backup allows for data
recovery and minimizes data loss, offering a valuable safety
net in the high-stakes process of disruptive migration.
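For example, a disruptive migration can be triggered programmatically with Flyway's Java API, as in the minimal sketch below. The connection details are placeholders, and the SQL migration scripts are assumed to live in Flyway's default classpath:db/migration location:

import org.flywaydb.core.Flyway;

public class MigrationRunner {
    public static void main(String[] args) {
        // Placeholder connection details; real values would come from configuration.
        Flyway flyway = Flyway.configure()
            .dataSource("jdbc:postgresql://localhost:5432/appdb", "app_user", "secret")
            .load();

        // Applies pending SQL migrations while the application is offline,
        // as in a disruptive migration window.
        flyway.migrate();
    }
}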
Following are the pros and cons of Disruptive Migration
Pattern:

Pros:
Compared to other methods, disruptive migration is
straightforward and relatively easy to understand.
This method allows for greater control over the
migration process, as it happens in a single, managed
window.
Since the system is offline, data consistency is ensured
as no new data is being written during the migration.

Cons:
The most notable drawback is the inevitable system
downtime, affecting the availability and user
experience.
For large datasets, migration can take several hours or
days to complete.
If a problem arises during migration, it can lead to
extended downtime.
Since the system is offline, no new data can be backed
up during the process.

Versioned tables
Versioned tables is a non-disruptive data migration pattern
specifically designed for use with relational databases. In this
pattern, instead of transforming the schema of existing
tables, a new table is created with the updated schema.
Thus, you essentially have two versions of the same table:
the original (old schema) and the new one (new schema),
refer to Figure 6.8:
Figure 6.8: Versioned tables migration

The new version of the table is populated with data
transformed from the old table, and during this process, both
tables coexist. Updates made to the old table (inserts,
updates, deletes) are also applied to the new table to keep
them in sync.
Once the new table is fully populated and in sync with the
original table, a switch is made at the application level to
start using the new table. This could be as simple as
changing a configuration setting or modifying the SQL
statements in the application.
One of the main benefits of this pattern is that it allows for
the migration of data without causing disruption to the
running application. If any issues arise with the new schema
during the migration process, the application can simply
continue using the old table until the issues are resolved.
On the downside, this pattern can double the storage
requirements during the migration and may increase
complexity as updates need to be made to both versions of
the tables until the migration is complete. Additionally,
careful planning and testing are necessary to ensure a
smooth transition from the old table to the new one.
Following are the pros and cons of Versioned Tables Pattern:

Pros:
The process allows for migration without disruption to
the running application.
If any issues arise with the new schema, the
application can continue using the old table.
It enables a controlled transition to the new schema,
with the ability to switch back if necessary.

Cons:
This pattern can double the storage requirements
during the migration.
Complexity increases as updates need to be managed
in both tables until the migration is complete.
Rigorous planning and testing are necessary to ensure
a smooth transition.
The migration process generates additional load on the
database and can significantly slow down normal
operations.

Schemaless
The schemaless data migration pattern is a unique approach
used predominantly in NoSQL databases that inherently lack
rigid schemas. This pattern is particularly useful for
microservices architecture due to its flexibility and
adaptability to evolving data structures.
In a schemaless data migration, instead of creating separate
tables or updating the existing schema, the database is
designed to accommodate data in different versions
simultaneously. Each stored object contains a version field
that indicates its current schema version. This design allows
for a smoother transition to new data versions, without
disrupting normal operation or requiring significant data
movement (refer to Figure 6.9):

Figure 6.9: Schemaless data migration

When microservices read the data, they check the version
field of the data object. If the data is an older version, the
microservice will convert it to the latest version on-the-fly.
This conversion logic needs to be implemented in the
microservice to handle all existing versions of the data.
When data is written, it is always stored as the latest
version. Over time, as data is read and written, older
versions of data are naturally converted to the latest version,
completing the migration process.
This pattern offers a non-disruptive way to handle data
migration, particularly in systems where schema changes
are frequent. However, it does require more sophisticated
handling in the application code to manage different versions
of the data, and this complexity increases with each new
version introduced.
The following JSON records show how this pattern works
(Code snippet 6.17):
1. [

2. {

3. "_v": "1",

4. "type": "person",

5. "name": "John Doe",

6. "age": 30,

7. "city": "New York"

8. },

9. {

10. "_v": "2",

11. "type": "person",

12. "first_name": "Jane",

13. "last_name": "Smith",

14. "age": 25,

15. "address": {

16. "street": "123 Main St",

17. "city": "Los Angeles",

18. "zip": "90001"

19. }

20. },

21. {

22. "_v": "1",


23. "type": "product",

24. "name": "Smartphone",

25. "brand": "Apple",

26. "price": 999

27. },

28. {

29. "_v": "2",

30. "type": "product",

31. "product_name": "Laptop",

32. "brand": "Dell",

33. "price": 1299,

34. "specs": {

35. "processor": "Intel Core i7",

36. "ram": "16GB",

37. "storage": "512GB SSD"

38. }

39. }

40. ]

In this collection, there are objects representing both people and products, with different schemas for each _v (version)
value. For _v="1", the schema includes name, age, city (for
people) and name, brand, price (for products). For _v="2", the
schema includes first_name, last_name, age, address (for people)
and product_name, brand, price, specs (for products).
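As a minimal sketch of the on-the-fly conversion, the hypothetical upgrader below promotes a version 1 person record (held as a dynamic Map) to the version 2 shape before the microservice works with it. The field mapping is an assumption based on the JSON above:

import java.util.HashMap;
import java.util.Map;

// Illustrative on-read upgrade for "person" records: version 1 documents are
// converted to the version 2 shape before the service uses them.
public class PersonSchemaUpgrader {

    public Map<String, Object> toLatest(Map<String, Object> record) {
        if ("2".equals(record.get("_v"))) {
            return record; // already the latest version
        }

        Map<String, Object> upgraded = new HashMap<>();
        upgraded.put("_v", "2");
        upgraded.put("type", record.get("type"));

        // Version 1 stored a single "name" field; split it into first/last name.
        String[] parts = String.valueOf(record.getOrDefault("name", "")).split(" ", 2);
        upgraded.put("first_name", parts[0]);
        upgraded.put("last_name", parts.length > 1 ? parts[1] : "");

        upgraded.put("age", record.get("age"));

        // Version 1 kept a flat "city"; version 2 nests it inside an address object.
        Map<String, Object> address = new HashMap<>();
        address.put("city", record.get("city"));
        upgraded.put("address", address);

        return upgraded;
    }
}

On write, the service always persists the upgraded form, so old records migrate gradually as they are touched.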
Following are the pros and cons of Schemaless Pattern:
Pros:
This pattern allows for smooth and continuous
operation of services during migration.
It accommodates different versions of data
simultaneously, offering adaptability to evolving data
structures.
Older data versions are naturally converted to the
latest version over time during regular read and write
operations.

Cons:
Handling different versions of data requires more
sophisticated application code. This complexity grows
with each new version introduced.
The on-the-fly conversion of old data versions to the
latest version may lead to additional processing
overhead.
Each new version requires rigorous testing and
increases the maintenance burden.

Antipatterns
In the context of microservices, there are several
antipatterns associated with data handling and storing that
could lead to less optimal or even detrimental system
performance, scalability, and maintainability. Two such antipatterns are relying on static queries and using a shared database among microservices:
Static queries: Static queries, which are hardcoded
database queries, can lead to several challenges in a
microservices architecture. Since they are not
dynamic, static queries lack the flexibility to adapt to
evolving requirements or changes in the structure of
the data. As a consequence, they often require changes in the codebase whenever a minor modification to the query is needed. This defeats one of the key benefits of microservices: the ability to independently evolve and scale different parts of a system.
Shared database: A shared database across multiple
microservices is another common antipattern. In
theory, sharing a database might seem like a good way
to reduce data duplication and ensure consistency.
However, it violates the principle of service autonomy,
a core tenet of microservices. Each microservice
should own and control its own data to maintain loose
coupling and high cohesion.
Sharing databases leads to tight coupling between
microservices and creates a single point of failure. If one
service makes a change to the shared database schema, it
can inadvertently break other services that depend on the
same schema. Also, when a shared database goes down, it
affects all the services using it, leading to system-wide
failure.
Moreover, sharing databases can lead to performance
bottlenecks, as all services contend for the same database
resources. It also complicates the task of scaling individual
services, as scaling decisions must take into account all
services sharing the database.

Conclusion
In this chapter, we have navigated the diverse landscape of
data handling in microservices. We have dissected the
fundamental elements of data objects, keys, and data
management strategies. We delved into the complexities of
data schema, data integrity, and data deletion, and explored
the subtleties of data migration strategies. Each of these
components plays a crucial role in defining, processing, and
storing data in a microservice-oriented architecture. With a
solid grasp of these patterns, you are now better equipped to
design and implement robust and efficient data strategies in
your microservices ecosystem. As we continue to explore the
world of microservices, remember that data is the foundation
upon which every service is built, and mastering its
manipulation is key to building scalable, resilient, and
efficient microservices. In the next chapter Handling
Complex Business Transactions, we will learn key patterns
that are essential for executing and managing complex
business transactions in microservices.

Further reading
1. Coulston. J. The Self-Identifying Identifier. Medium. Dec
6, 2022. Available at https://blog.devgenius.io/the-
self-identifying-identifier-5ede360d0313
2. Marcos. How about dynamic queries with Spring Data
Jpa? Medium. Oct 23, 2022. Available at
https://medium.com/@mmarcosab/how-about-
dynamic-queries-with-spring-data-jpa-
ec62b3e80b50
3. Verma, T. Microservices Design Principles and patterns.
Medium. Dec 10, 2021. Available at
https://medium.com/@tushar.msit27/microservice
s-design-principles-and-patterns-b2023ba264a9.
4. Richardson, C. Pattern: Database per service.
Microservice Architecture. 2023.
https://microservices.io/patterns/data/database-
per-service.html
5. Karnatakapu, K. 5 Important Microservices Design
Patterns. Medium. May 12, 2023.
https://medium.com/javarevisited/5-important-
microservices-design-patterns-c4d636b0051
6. Hatoum, S. Event Sourcing: the Future-Proof Design
Pattern for 2023 and Beyond. Medium. Jan 5, 2023.
Available at https://medium.xolv.io/event-sourcing-
the-future-proof-design-pattern-for-2023-and-
beyond-b42bc12ad268
7. Dafer, S.M. In a nutshell: What are DDD and CQRS
(Domain Driven Design and Command Query
Responsibility Segregation)? Medium. Jan 2.
https://medium.com/@stevedafer/in-a-nutshell-
what-are-ddd-and-cqrs-domain-driven-design-
and-command-query-responsibility-
dd2460d9a89a
8. Sadakath, S. Developing Microservices with CQRS and
Event Sourcing Patterns using GraphQL + Spring Boot
+ Kafka. Medium. Jul 16, 2022. Available at
https://shazinsadakath.medium.com/developing-
microservices-with-cqrs-and-event-sourcing-
patterns-using-graphql-spring-boot-kafka-
19f259a7aaa5
9. Venkatesh, D.S. Part 3: A comparison of CRUD and
CQRS patterns using shopping cart application.
Medium. Jan 5, 2022. Available at
https://medium.com/@suryasai.venkatesh/part-3-
a-comparison-of-crud-and-cqrs-patterns-using-
shopping-cart-application-6e2810b915aa
10. Hoxha, D. Sharing Data Between Microservices.
Medium. Oct 24, 2022. Available at
https://medium.com/@denhox/sharing-data-
between-microservices-fe7fb9471208

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 7
Handling Complex
Business Transactions

Introduction
This chapter presents key patterns essential for executing
and managing complex business transactions in
microservices. It covers a broad spectrum of topics, including
state management, process flow, transaction management,
delayed execution, and reliability. Each section analyzes
specifics, explaining the reasons, pros, cons, and application
guidelines of each pattern. Upon chapter completion, readers
will possess substantial knowledge to design and execute
effective transaction strategies within a microservices
architecture.

Structure
In this chapter, we will cover the following topics:
Concurrency and coordination
Distributed cache
Partial updates
Optimistic lock
Distributed lock
State management
Process flow
Aggregator
Chain of responsibility
Branch
Transaction management
Orchestrated Saga
Choreographic Saga
Workflow
Reliability
Backpressure
Bulkhead
Outbox
Delayed execution
Job queue
Background worker
Antipatterns

Objectives
After completing this chapter, you should be equipped to
comprehend and utilize patterns for managing complex
transactions within microservices, including state
management and process flow strategies. You will be adept
at implementing transaction management mechanisms,
employing delayed execution techniques, and ensuring
reliability in microservices. Additionally, you will be capable
of identifying and circumventing common transactional
antipatterns.

Concurrency and coordination


Microservices architectures are fundamentally distributed
systems, typically involving multiple instances of each
service. This approach is integral for bolstering scalability
and reliability. Scalability is achieved because each service instance can independently respond to increased load, while reliability improves because, if one instance fails, others can continue to provide the required service.
However, the concurrent operation of these instances
necessitates robust coordination mechanisms. Effective
coordination and concurrency control strategies, such as
state management, distributed lock, optimistic lock, and
caching, ensure that despite the distributed and concurrent
nature of the services, the overall system functions in a
consistent and reliable manner.

Problem
Within a distributed microservices system, coordination and
concurrency control are critical for smooth operation. Parallel
execution of distributed logic forms an essential aspect of
concurrent operation, enhancing the overall system
responsiveness. However, this simultaneous processing can
introduce risks such as data overwrites or other conflicts.
When two service instances attempt to alter the same data
simultaneously, race conditions may arise, leading to
inconsistencies. Therefore, proper management of these
parallel operations is essential to prevent conflicts and
maintain data integrity.
In addition, the nature of microservices demands that
transactional states be shared across multiple instances or
between subsequent calls. Maintaining this shared state is
pivotal to achieving consistency across services. However,
given the dynamic landscape where instances can appear
and disappear, or network issues can disrupt
communication, ensuring a consistent state across all
instances becomes challenging.
Adding to the complexity, microservices, by their very
nature, can crash unexpectedly. This volatility necessitates
robust transaction recovery mechanisms. In the event of a
crash, the system should be capable of recovering the
transaction and preserving data integrity. Without such
efficient recovery mechanisms, crashes can induce data loss
or inconsistencies, thereby affecting the overall system
reliability and accuracy.
Furthermore, inter-service communication is integral to a
functioning microservices architecture. But these
communication efforts can be resource-intensive, leading to
performance degradation. Hence, it becomes crucial to
optimize inter-service calls to maintain high performance and
system responsiveness. Caching techniques can serve as a
powerful tool to reduce redundant calls, but their
implementation in a distributed environment is a non-trivial
task that calls for careful planning and design.

Distributed cache
In a distributed microservices environment, the instances of
a service are usually spread across various nodes for load
balancing or fault tolerance. A distributed cache comes into
play here, providing a shared, high-speed, in-memory data
store accessible by all nodes. This reduces the dependency
on costly database operations or network calls for data
access.
A crucial part of implementing a distributed cache is
choosing the appropriate caching strategy. Some common
strategies include:
Least Recently Used (LRU): This strategy evicts the
least recently used entries first. It is based on the
assumption that data accessed recently will likely be
accessed in the near future.
Least Frequently Used (LFU): LFU evicts the least
frequently used entries first. It assumes that an entry
with a high access frequency in the past will likely have
a high access frequency in the future.
Time to Live (TTL): This strategy automatically
removes entries after a specified duration or 'life span'.
Write-Through Cache: In this strategy, data is
simultaneously written into the cache and the backing
store to ensure consistency.
Write-Behind Cache: This strategy introduces a delay
before writing data into the backing store, aiming to
condense multiple write operations.
Choosing the right strategy depends on the specific
requirements and the nature of the data being cached.
Alongside this, managing cache invalidation and ensuring
cache synchronization is important for maintaining data
consistency. Prominent examples of distributed cache
systems used in a microservices environment include Redis,
Memcached, and Hazelcast.
Here is an example of a class (UserAccountService) that uses distributed caching, implemented with the low-level Jedis client library, to optimize frequent getAccount operations (Code snippet 7.1):
import com.example.samples.data.UserAccount;
import com.example.samples.repositories.UserAccountRepository;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import com.fasterxml.jackson.databind.ObjectMapper;
import redis.clients.jedis.Jedis;
import java.io.IOException;

@Service
public class UserAccountService {
    private final UserAccountRepository userAccountRepository;
    private final Jedis jedis;
    private final ObjectMapper objectMapper;

    public UserAccountService(UserAccountRepository userAccountRepository,
            @Value("${spring.redis.host}") String redisHost,
            @Value("${spring.redis.port}") int redisPort) {
        this.userAccountRepository = userAccountRepository;
        this.jedis = new Jedis(redisHost, redisPort);
        this.objectMapper = new ObjectMapper();
    }

    public UserAccount getAccount(Long id) throws IOException {
        // Try to get the account from the cache
        String accountJson = jedis.get(id.toString());

        // If not in cache, fetch from the repository and store in the cache
        if (accountJson == null || "nil".equals(accountJson)) {
            UserAccount account = userAccountRepository.findById(id)
                    .orElseThrow(() -> new RuntimeException("Account not found"));
            accountJson = objectMapper.writeValueAsString(account);
            jedis.setex(id.toString(), 3600, accountJson); // Cache for 1 hour
        }

        return objectMapper.readValue(accountJson, UserAccount.class);
    }

    public UserAccount updateAccount(UserAccount userAccount) {
        UserAccount updatedAccount = userAccountRepository.save(userAccount);

        // If updated, remove the account from the cache
        if (updatedAccount != null) {
            jedis.del(userAccount.getId().toString());
        }

        return updatedAccount;
    }

    …
}
Spring Boot offers built-in support for distributed caching and
several well-known caching technologies, including Redis,
Memcached, Hazelcast, and Ehcache, through its caching
abstraction and auto-configuration features. For example,
when using Redis, enable caching by adding the following lines to the application.properties file (Code snippet 7.2):
spring.cache.type=redis
spring.cache.redis.time-to-live=3600000 # TTL in milliseconds
spring.cache.redis.key-prefix=userAccount
spring.cache.redis.cache-null-values=false
spring.cache.redis.use-key-prefix=true
Then you can use simple annotations to add distributed
caching into your service (Code snippet 7.3):
import com.example.samples.data.UserAccount;
import com.example.samples.repositories.UserAccountRepository;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class UserAccountServiceNative {
    private final UserAccountRepository userAccountRepository;

    public UserAccountServiceNative(UserAccountRepository userAccountRepository) {
        this.userAccountRepository = userAccountRepository;
    }

    @Cacheable(value = "userAccount", key = "#id")
    public UserAccount getAccount(Long id) {
        return userAccountRepository.findById(id)
                .orElseThrow(() -> new RuntimeException("Account not found"));
    }

    @CacheEvict(value = "userAccount", key = "#userAccount.id")
    public UserAccount updateAccount(UserAccount userAccount) {
        return userAccountRepository.save(userAccount);
    }

    …
}
Following are the pros and cons:
Pros:
Caching data reduces the need for expensive database
or service calls, thereby enhancing performance.
As requests can be satisfied from the cache, the
number of network calls decreases, leading to less
network congestion.
A distributed cache can be designed for high
availability, making data always accessible, even if one
cache node fails.

Cons:
Managing distributed caches brings an additional level
of complexity to the system, including handling data
consistency issues.
If data is updated in the primary store but not in the
cache, it can result in serving outdated information.
Caching requires additional memory resources. For
large datasets, this can become significant and
expensive.
In a distributed caching system, network problems can
give rise to a situation where data becomes
inconsistent.

Partial updates
Partial updates is a pattern used in microservices
architecture to prevent overriding data during concurrent
updates. This pattern is mainly used when there is a need to
update only certain fields of a data object without affecting
other fields that might have been changed concurrently.
Traditionally, when updating a data object, the client
retrieves the entire object, modifies the fields it needs to
change, and then sends the entire object back to the server.
This "read-modify-write" cycle can lead to lost updates if
another client modifies the same resource between the read
and write operations.
Partial updates mitigate this issue by allowing clients to send
only the properties they want to change. On the server side,
the update operation only modifies these properties, leaving
the others untouched.
Here is a simple example of a partial update in a Java Spring
Boot application (Code snippet 7.4):
import org.springframework.web.bind.annotation.*;
import java.util.Map;

@RestController
@RequestMapping("/accounts")
public class UserAccountController {
    private final UserAccountService userAccountService;

    public UserAccountController(UserAccountService userAccountService) {
        this.userAccountService = userAccountService;
    }

    // ... other handlers

    @PatchMapping("/{id}")
    public UserAccount patchAccount(@PathVariable Long id, @RequestBody Map<String, Object> updates) {
        return userAccountService.patchAccount(id, updates);
    }
}


import org.springframework.stereotype.Service;
import java.util.Map;

@Service
public class UserAccountService {
    private final UserAccountRepository userAccountRepository;

    public UserAccountService(UserAccountRepository userAccountRepository) {
        this.userAccountRepository = userAccountRepository;
    }

    // ... other methods

    public UserAccount patchAccount(Long id, Map<String, Object> updates) {
        UserAccount account = getAccount(id); // Load the existing account

        // Apply updates
        for (Map.Entry<String, Object> entry : updates.entrySet()) {
            switch (entry.getKey()) {
                case "name":
                    account.setFirstName((String) entry.getValue());
                    break;
                // Add more case blocks for other updatable fields
            }
        }

        return userAccountRepository.save(account); // Save the updated account
    }
}

This approach minimizes the chances of concurrent updates overriding each other. However, it still does not eliminate the risk entirely. If multiple clients try to update the same field concurrently, one of the updates may be lost. Additional techniques such as optimistic locking can be used to prevent this.
Following are the pros and cons:

Pros:
Only the necessary data is transmitted over the
network, reducing bandwidth usage.
In scenarios with concurrent modifications, it helps to
prevent entire resource overwrite by only updating
specific fields.
Provides a more intuitive API for clients who only need
to change certain fields.

Cons:
Implementation could be more complex than a
straightforward full update, as you need to manage
individual fields.
Partial updates could lead to inconsistent data states if
not carefully managed, especially in distributed
systems.
If multiple clients are updating the same fields
simultaneously, one update may overwrite another,
unless further concurrency control measures are taken.
Validation rules may become more complex if they
depend on multiple fields, as a partial update can
change a subset of fields that may affect validation.

Optimistic lock
Optimistic Locking is a concurrency control pattern used to
ensure the consistency of data during concurrent operations.
The fundamental idea behind optimistic locking is to allow
multiple clients to read data simultaneously, and control is
enforced only when changes are committed, thereby
reducing lock contention.
In the optimistic locking strategy, a version identifier
(version ID) is associated with each modifiable entity. The
version ID is updated each time the entity is updated. The
two common types of version IDs are:
Incremented Numbers: In this approach, the version
ID is an integer that is incremented every time the
entity is updated. It starts from a base value, often
zero, and is increased by one (or by any constant) with
each update.
Timestamps: In this approach, the version ID is a
timestamp indicating the last modification time of the
entity. It could be the actual time of the update or a
logical timestamp maintained by the application.
When a client reads an entity, it also reads the version ID.
When the client later tries to update the entity, it provides
the version ID it previously read. The server checks the
version ID from the client against the current version ID of
the entity:
If the version IDs match, the update is performed and
the version ID is incremented or updated with a new
timestamp.
If the version IDs do not match, the update is rejected
because the entity has been modified by another client
since it was last read. This is called a write conflict.
Implementing optimistic locking without using JPA or a
similar ORM tool involves manually handling the version
checks and updates. Let us use a simple JDBC-based
implementation for the example.
Consider an `Account` class (Code snippet 7.5):
1. public class Account {
2. private Long id;
3. private double balance;
4. private Long version;
5.
6. // getters and setters...
7. }
In your `AccountDao` (Data Access Object), you might have a
method for updating the account like this (Code snippet 7.6):
import java.sql.*;
import javax.sql.DataSource;

public class AccountDao {

    private final DataSource dataSource;

    public AccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void updateAccountBalance(Account account, double amount) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            String sql = "UPDATE accounts SET balance = ?, version = version + 1 WHERE id = ? AND version = ?";
            try (PreparedStatement statement = connection.prepareStatement(sql)) {
                statement.setDouble(1, account.getBalance() - amount);
                statement.setLong(2, account.getId());
                statement.setLong(3, account.getVersion());

                int rowsAffected = statement.executeUpdate();

                if (rowsAffected == 0) {
                    // OptimisticLockException is assumed to be a RuntimeException,
                    // for example jakarta.persistence.OptimisticLockException or a small custom class
                    throw new OptimisticLockException("Account was updated by another transaction");
                }
            }
        }
    }
}

In this `updateAccountBalance` method, we are manually checking the version. If the version in the database does not match the version of the account object, the `WHERE` clause of the SQL statement will not match any row, so the update will affect zero rows. We detect this case and throw an `OptimisticLockException`.
Note that this is a lower-level, more verbose way of doing
what JPA's `@Version` annotation does for us automatically.
However, it does illustrate clearly what happens under the
hood when you use optimistic locking.
In optimistic concurrency control, the read flow method also
plays a pivotal role in ensuring data consistency and
integrity within a database system. This method begins with
the retrieval of data from the database, where a transaction
reads the current state of the data without acquiring any
locks. Subsequently, the transaction performs its operations
based on the assumption that no other transactions will
interfere. However, before committing any changes, the read
flow method necessitates a verification step to ensure that
the data has not been modified by other concurrent
transactions since it was initially read. This verification
typically involves comparing timestamps or version numbers
associated with the data. If the verification is successful,
indicating that the data has not been altered by other
transactions, the transaction can proceed with committing its
changes. If conflicts are detected during the verification
process, appropriate measures such as aborting the
transaction or retrying it may be taken to maintain data
consistency and resolve concurrency issues. Thus, the read
flow method in optimistic concurrency control optimistically
assumes the absence of conflicts while still providing
mechanisms to detect and resolve them when they occur,
thereby promoting efficient and concurrent access to shared
data resources.
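The read-verify-commit loop described above can be sketched on top of the AccountDao from Code snippet 7.6. The SELECT statement, the retry limit, and the service class itself are illustrative assumptions; OptimisticLockException is the same runtime exception thrown by the DAO:

import java.sql.*;
import javax.sql.DataSource;

public class AccountService {
    private final AccountDao accountDao;
    private final DataSource dataSource;

    public AccountService(AccountDao accountDao, DataSource dataSource) {
        this.accountDao = accountDao;
        this.dataSource = dataSource;
    }

    // Read the current state, including the version column, without taking any locks
    public Account readAccount(long id) throws SQLException {
        String sql = "SELECT id, balance, version FROM accounts WHERE id = ?";
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, id);
            try (ResultSet rs = statement.executeQuery()) {
                if (!rs.next()) throw new SQLException("Account not found");
                Account account = new Account();
                account.setId(rs.getLong("id"));
                account.setBalance(rs.getDouble("balance"));
                account.setVersion(rs.getLong("version"));
                return account;
            }
        }
    }

    // Optimistically withdraw: re-read and retry when a write conflict is detected
    public void withdraw(long id, double amount) throws SQLException {
        for (int attempt = 0; attempt < 3; attempt++) {
            Account account = readAccount(id);
            try {
                accountDao.updateAccountBalance(account, amount);
                return; // committed without a conflict
            } catch (OptimisticLockException e) {
                // Another transaction changed the row; retry with fresh data
            }
        }
        throw new SQLException("Could not update the account after 3 attempts");
    }
}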
Following are the pros and cons:

Pros:
It allows multiple clients to read data simultaneously,
reducing lock contention.
It ensures that one client does not inadvertently
overwrite changes made by another client.
It eliminates the need for explicit lock management,
reducing the overhead of acquiring and releasing locks.

Cons:
In high-contention scenarios, write conflicts can
become common, and resolving these conflicts can add
complexity.
Clients must be prepared to retry their updates when
write conflicts occur, which can add overhead and
complexity.
If a client reads data and another client updates the
same data before the first client completes its
operation, the first client will be working with stale
data.

Distributed lock
In the context of microservices, where services interact with
shared resources, there arises a need to prevent concurrent
access and maintain the consistency of data. The distributed
lock pattern provides a solution to this need, facilitating
exclusive access to a particular resource across multiple
services.
The pattern works by having a service request a lock before
using a shared resource. If the lock is available, the service is
granted the lock, performs the necessary operations, and
then releases the lock for other services to use. However, if
the lock isn't available, the service can either retry after
some time or simply fail the operation, depending on the
specific business requirements (refer to Figure 7.1).

Figure 7.1: Distributed locking of a shared resource

However, it is crucial to mention the dangers of using global locks. Global locks, which lock down large segments or all
the shared resources, can create significant contention and
lead to scalability and performance issues. This is considered
an anti-pattern in microservices architecture. Therefore, a
lock's scope should be minimized as much as possible,
ideally related to a specific transaction or object.
Moreover, the time period of a lock is equally significant. For
real-time transactions, locks should be held for a very short
duration to ensure high throughput. For long-running
transactions, it is advisable to use asynchronous execution in
the background to avoid holding the lock for an extended
period.
Distributed locks have to come with timeouts. A timeout
mechanism helps automatically release the lock if the
microservice that acquired the lock crashes or is unable to
complete the operation within a specified timeframe. This
helps to prevent deadlocks and ensures the smooth
operation of the overall system.
In-memory caching services like Redis or Memcached come
with a locking feature that can be used to implement a
distributed lock. Here is an example of how to implement a
DistributedLock service using Memcached (Code snippet 7.7):

import net.spy.memcached.MemcachedClient;

import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class DistributedLock {
    private MemcachedClient client;
    private String key;

    public DistributedLock(String key, String memcachedServer, int memcachedPort) throws IOException {
        this.key = key;
        this.client = new MemcachedClient(new InetSocketAddress(memcachedServer, memcachedPort));
    }

    public boolean acquire(int lockExpireTime, TimeUnit unit) {
        boolean success = false;
        try {
            Future<Boolean> future = client.add(key, (int) unit.toSeconds(lockExpireTime), "LOCKED");
            // Wait for the operation to complete
            success = future.get();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return success;
    }

    public boolean tryAcquire(int lockExpireTime, TimeUnit unit) {
        Future<Boolean> future = client.add(key, (int) unit.toSeconds(lockExpireTime), "LOCKED");
        boolean success = false;
        try {
            // Try to acquire the lock but do not wait indefinitely
            success = future.get(1, TimeUnit.SECONDS);
        } catch (Exception e) {
            future.cancel(false);
        }
        return success;
    }

    public void release() {
        client.delete(key);
    }
}
Using the distributed lock looks like the following (Code snippet 7.8):
// Create a distributed lock
DistributedLock lock = new DistributedLock("testLock", "localhost", 11211);

// Acquire the lock
if (lock.tryAcquire(10, TimeUnit.SECONDS)) {
    try {
        // Perform some critical section of code here
        System.out.println("Performing critical section of code...");
    } finally {
        // Release the lock
        lock.release();
    }
} else {
    // Handle failure to acquire lock
    System.out.println("Unable to acquire lock, try again later...");
}
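As a point of comparison, the same lock can be sketched with Redis using the Jedis client: the SET command with the NX and PX options atomically creates the lock key with an expiration, and releasing the lock deletes the key. The key name and timeout are illustrative:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisDistributedLock {
    private final Jedis jedis;
    private final String key;

    public RedisDistributedLock(Jedis jedis, String key) {
        this.jedis = jedis;
        this.key = key;
    }

    // Returns true if the lock key was created: NX = only if absent, PX = expire after ttlMillis
    public boolean tryAcquire(long ttlMillis) {
        String result = jedis.set(key, "LOCKED", SetParams.setParams().nx().px(ttlMillis));
        return "OK".equals(result);
    }

    public void release() {
        jedis.del(key);
    }
}

In production code, the lock value is usually a unique token, and release deletes the key only if the token still matches; libraries such as Redisson provide this behavior out of the box.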
Following are the pros and cons:
Pros:
Ensures that only one service can access a shared
resource at a time, maintaining the consistency of the
data.
Allows for management of access to resources in a
concurrent environment, which is critical for a system
with multiple services interacting with the same
resources.
Helps prevent conflicts that could arise from
simultaneous access to the same resource.

Cons:
Adds a layer of complexity to the system architecture.
Can potentially limit the scalability of the system,
especially when used excessively or improperly.
If not handled properly, there is a risk of deadlocks
where a service holding a lock fails to release it.
Locks, especially if held for a long time, can reduce
system performance by causing other services to wait.
Requires coordination between services, which can add
to the network and computational overhead.
If the service providing the lock goes down, it could
disrupt the operations of other services relying on it.

State management
When designing microservices, one key principle is that they
should be stateless. Stateless microservices allow multiple
running instances that can distribute client requests between
them. Requests from a client can be routed to any available
instance, making the system robust and scalable.
However, in some cases, microservices need to process
complex transactions and maintain state between requests,
turning them into stateful microservices. With stateful
services, managing the state becomes more challenging. For
instance, when stateful microservices utilize in-memory
storage for transaction state, client sessions need to be
sticky — routed consistently to the same microservice
instance — to ensure the client's state is maintained. This
requirement limits scalability and introduces a single point of
failure; if the microservice instance fails, the in-memory
state is lost, and the transaction cannot recover.
The State Management pattern provides a solution to these problems by storing the transaction state in external storage instead of in memory. Microservices can retrieve and update the state as needed, effectively becoming stateless (refer to Figure 7.2):

Figure 7.2: Management of distributed state between microservice calls

The state can be updated partially or protected by a locking mechanism. This pattern provides flexibility, as the external
storage can range from in-memory caching services like
Redis or Memcached, suitable for short transactions, to more
persistent storage solutions for long-running transactions.
An in-memory implementation of state management for short transactions is similar to the DistributedLock presented in the previous pattern. Below is an implementation of persistent state management with a locking mechanism, using MongoDB as the storage (Code snippet 7.9):
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.FindOneAndUpdateOptions;
import com.mongodb.client.model.ReturnDocument;
import com.mongodb.client.model.UpdateOptions;
import com.mongodb.client.model.Updates;

import org.bson.Document;

import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

public class StateManagementService {
    private MongoCollection<Document> collection;

    public StateManagementService(String connectionString, String dbName, String collectionName) {
        MongoClient mongoClient = MongoClients.create(connectionString);
        this.collection = mongoClient.getDatabase(dbName).getCollection(collectionName);
    }

    public Map<String, Object> retrieve(String id, int lockTimeoutInSeconds) {
        Instant lockUntil = Instant.now().plusSeconds(lockTimeoutInSeconds);
        Document updatedDocument = collection.findOneAndUpdate(
            Filters.and(
                Filters.eq("_id", id),
                Filters.or(
                    Filters.exists("lockUntil", false),
                    Filters.lt("lockUntil", Instant.now())
                )
            ),
            Updates.set("lockUntil", lockUntil),
            new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER)
        );
        return updatedDocument != null ? new HashMap<>(updatedDocument) : null;
    }

    public void store(String id, Map<String, Object> newState) {
        var update = new Document("$set", newState);
        update.append("$unset", new Document("lockUntil", false));
        collection.updateOne(new Document("_id", id),
            update,
            new UpdateOptions().upsert(true)
        );
    }
}

Following are the pros and cons:

Pros:
This pattern enables microservices to be stateless,
which significantly increases their scalability.
By moving the state to external storage, the pattern
reduces the risk of state loss in the event of a
microservice instance failure.
By using external locking mechanisms, it allows
concurrent access to shared resources without conflict.
It simplifies the service instance, as it does not need to
maintain its state internally.
It allows the state to be retrieved and used by any
instance of the microservice.

Cons:
Requires careful design to ensure atomicity and
consistency of operations, especially when dealing with
concurrency.
External state management can introduce latency due
to network and storage overhead.
Depending on the type and size of the state data, it can
increase storage costs.
Requires mechanisms to ensure that data is up-to-date
across all microservice instances.
Microservices become dependent on the availability
and performance of the external storage system.

Process flow
The term process flow refers to the sequence of steps that
define the interaction and communication between these
microservices. A process flow can range from a simple
request-response sequence between two services, to a
complex, multi-step business transaction involving several
services.

Problem
When building microservice systems, some business
transactions require the invocation of more than one
microservice. As each microservice is designed to perform a
specific business function and operate independently,
orchestrating interactions and coordinating tasks between
these services can become complex. Simple request-
response communication may not suffice when a transaction
involves multiple services, each needing to process a part of
the request. It may be necessary to collect and aggregate
data from several services, pass a request through a
sequence of services, or split a task into multiple concurrent
subtasks. Additionally, considerations such as maintaining
data consistency, handling failures gracefully, and ensuring
transaction integrity across multiple services add to the
complexity. Thus, managing the process flow effectively and
efficiently in a microservices architecture poses a significant
problem that requires systematic solutions.

Aggregator
The aggregator (or service aggregator) pattern is used to
manage interactions between various services and clients,
where a client’s request requires data from multiple services.
Instead of having the client make numerous individual calls,
an aggregator service steps in to coordinate these
interactions (refer to Figure 7.3):

Figure 7.3: The Aggregator pattern

The aggregator service, typically implemented in facades or Backends for Frontends (BFFs), receives the client's
single request and makes parallel calls asynchronously to the
necessary services. By making these calls in parallel, the
aggregator ensures better performance and efficiency,
reducing the overall response time.
After receiving responses from the involved services, the
aggregator processes them. It may apply specific business
rules, transform the data, or simply combine the responses,
before returning a single aggregated response to the client.
This pattern can be illustrated using an e-commerce
application. Assume the application has different
microservices for user profiles, product catalogs, and order
history. When the client needs a dashboard view composed
of data from all these services, it directs a single request to
the aggregator. The aggregator then makes parallel requests
to the necessary services, collects the responses, combines
the data, and finally sends back a unified dashboard view to
the client.
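A minimal, framework-agnostic sketch of such a dashboard aggregator is shown below; the client interfaces, the DashboardView record, and the use of CompletableFuture for parallel fan-out are assumptions rather than a prescribed implementation:

import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class DashboardAggregator {

    // Hypothetical clients for the underlying microservices
    public interface UserProfileClient { Map<String, Object> getProfile(String userId); }
    public interface ProductCatalogClient { List<Map<String, Object>> getRecommendations(String userId); }
    public interface OrderHistoryClient { List<Map<String, Object>> getRecentOrders(String userId); }

    // Aggregated response returned to the client in a single call
    public record DashboardView(Map<String, Object> profile,
                                List<Map<String, Object>> recommendations,
                                List<Map<String, Object>> recentOrders) { }

    private final UserProfileClient profiles;
    private final ProductCatalogClient catalog;
    private final OrderHistoryClient orders;

    public DashboardAggregator(UserProfileClient profiles,
                               ProductCatalogClient catalog,
                               OrderHistoryClient orders) {
        this.profiles = profiles;
        this.catalog = catalog;
        this.orders = orders;
    }

    public DashboardView getDashboard(String userId) {
        // Fan the three calls out in parallel instead of calling the services one by one
        CompletableFuture<Map<String, Object>> profileCall =
                CompletableFuture.supplyAsync(() -> profiles.getProfile(userId));
        CompletableFuture<List<Map<String, Object>>> recommendationCall =
                CompletableFuture.supplyAsync(() -> catalog.getRecommendations(userId));
        CompletableFuture<List<Map<String, Object>>> orderCall =
                CompletableFuture.supplyAsync(() -> orders.getRecentOrders(userId));

        // Wait for all responses and combine them into one view for the client
        return CompletableFuture.allOf(profileCall, recommendationCall, orderCall)
                .thenApply(ignored -> new DashboardView(
                        profileCall.join(), recommendationCall.join(), orderCall.join()))
                .join();
    }
}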
Following are the pros and cons:

Pros:
Clients make a single request, irrespective of the
number of services involved, making client-side code
cleaner and easier to manage.
Asynchronous, parallel calls to different services can
improve performance and reduce overall response
time.
Changes in one service do not affect the client directly,
leading to less fragile systems.
Aggregators can apply business logic or transform the
data before sending it to the client.

Cons:
Introduces an extra layer to manage and another point
of potential failure.
If not managed effectively, the aggregator can become
a performance bottleneck.
Despite parallel calls, the aggregator service might
introduce some latency due to the aggregation process.
The client becomes highly dependent on the
aggregator service. If the aggregator service is down,
it affects all the clients relying on it.

Chain of Responsibility
The chain of responsibility (or simply Chain) pattern is a
behavioral design pattern that allows an event to be
processed by one of many handlers, in a decoupled manner.
In the context of microservices, this pattern provides a way
to orchestrate and coordinate the processing of a request
across multiple services (refer to Figure 7.4):
Figure 7.4: The Chain of Responsibility pattern

In this pattern, a request is passed along a "chain" of potential handler services. Each service in the chain either
handles the request or forwards it to the next service in the
chain. The process continues until a service handles the
request or it reaches the end of the chain without being
handled.
This pattern promotes loose coupling as it avoids binding the
sender of the request to its receiver by giving more than one
service a chance to handle the request. It can be beneficial
in situations where a request can be processed by any one of
multiple services, and the specific handler isn't known at
compile time or should be dynamically determined.
For instance, in a payment processing system, you might
have different microservices to handle different payment
methods (credit card, PayPal, bank transfer, etc.). When a
payment request comes in, it can be passed along the chain
until it reaches a service that can handle the specified
payment method.
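A minimal in-process sketch of the payment example follows; the handler interface and the concrete handlers are assumptions, and in a distributed deployment each handler would typically be a separate service that forwards the request over HTTP or messaging:

import java.util.List;

public class PaymentChain {

    // Simplified request; a real one would carry currency, customer, and order details
    public record PaymentRequest(String method, double amount) { }

    public interface PaymentHandler {
        // Returns true if the request was handled, false to pass it further along the chain
        boolean handle(PaymentRequest request);
    }

    public static class CreditCardHandler implements PaymentHandler {
        public boolean handle(PaymentRequest request) {
            if (!"credit_card".equals(request.method())) return false;
            System.out.println("Charging credit card for " + request.amount());
            return true;
        }
    }

    public static class PayPalHandler implements PaymentHandler {
        public boolean handle(PaymentRequest request) {
            if (!"paypal".equals(request.method())) return false;
            System.out.println("Charging PayPal account for " + request.amount());
            return true;
        }
    }

    private final List<PaymentHandler> handlers;

    public PaymentChain(List<PaymentHandler> handlers) {
        this.handlers = handlers;
    }

    public void process(PaymentRequest request) {
        // Pass the request along the chain until one handler accepts it
        for (PaymentHandler handler : handlers) {
            if (handler.handle(request)) {
                return;
            }
        }
        throw new IllegalStateException("No handler for payment method: " + request.method());
    }
}

The order of the handlers passed to the constructor defines the order of the chain, which is one of the aspects listed in the cons below.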
Following are the pros and cons:

Pros:
The pattern decouples the sender and receiver of a
request, promoting loose coupling of services.
Provides a flexible way to distribute the handling of
requests across multiple services.
Allows for dynamic determination of which service
should handle a request based on runtime conditions.
New handlers can be added, or existing handlers can
be changed without affecting other handlers or the
client.

Cons:
The request might have to travel through multiple
services before it is processed, adding latency.
Due to increased complexity and dynamic behavior,
debugging issues can be more challenging.
There is no guarantee that a request will be handled unless the chain ends with a catch-all handler.
The order of handlers in the chain matters, adding
another aspect to manage.

Branch
The branch pattern is another structural design pattern used
within the context of microservices architecture. This pattern
comes into play when a request necessitates simultaneous
processing by multiple independent services.
In essence, the branch pattern splits the process flow into
multiple parallel flows. When a request is received, instead
of calling one service after another in a sequence, the
branch pattern "branches out" the request to several
services at the same time. Each branch represents a path for
a different service to process the request in parallel with the
others (refer to Figure 7.5):
Figure 7.5: The Branch pattern

For example, consider an online retail application that receives an order request. Using the branch pattern, this
request could be simultaneously sent to several
microservices: one for checking inventory, another for
validating payment, and a third for updating the customer's
order history.
Once each service has completed its processing, the
responses can be aggregated and returned to the client. This
pattern significantly improves the system's responsiveness
and efficiency, especially in scenarios where the individual
processing times of services do not depend on each other.
However, the branch pattern can increase complexity due to
the requirement of synchronizing and coordinating responses
from multiple services. It also requires careful error and
exception handling to ensure that a failure in one branch
does not lead to overall transaction failure.
Following are the pros and cons:

Pros:
By processing parts of a request in parallel, the pattern
can significantly reduce the overall response time.
Concurrent processing can lead to better system
utilization and increased throughput.
Services can operate independently, promoting
modularity and loose coupling.
Allows for horizontal scaling by adding more instances
of each service.

Cons:
Managing, synchronizing, and coordinating responses
from multiple services can be complex.
Requires careful handling of errors and exceptions to
avoid failure in one branch affecting the entire
transaction.
Not all tasks can be effectively split into independent
sub-tasks suitable for parallel execution.
Introduces communication overhead due to the
coordination and data exchange between branches.

Transaction management
Transaction management in a microservices architecture
presents unique challenges due to transactions spanning
multiple services and databases. Various patterns like Two-
Phase Commit (2PC), Saga Pattern, and Workflow Engine
help manage these transactions, ensuring data consistency
and recovery from failures. Despite introducing complexity,
these strategies are crucial for efficient transaction handling
in distributed systems.

Problem
Within the microservices architecture, transaction
management is a complex challenge that requires
specialized strategies. Unlike monolithic systems, where a
single database transaction manages changes, microservices
transactions often span across multiple services and
databases (refer to Figure 7.6):
Figure 7.6: Distributed transaction executed by multiple microservices

This distributed nature leads to a series of issues that need careful handling:
Data consistency: Maintaining data consistency
across various independent services becomes a critical
issue. Traditional ACID transactions, designed for
single database systems, fail to address this need in a
distributed microservices architecture.
Performance impact: The need for communication
between services during a transaction can lead to
increased network latency, potentially affecting the
system's performance.
Failure handling: In a distributed environment,
failures are inevitable. When a part of the transaction
fails, determining the system's next course of action
becomes complex. Retrying the operation, ignoring the
failure, or aborting the transaction altogether - each of
these options needs to be evaluated based on the
specific context and potential impact.
Distributed deadlocks: Coordinating transactions
across multiple services can potentially lead to
distributed deadlocks, which are difficult to detect and
resolve.
Increased complexity: The handling of distributed
transactions significantly increases the complexity of
the system, both in terms of development and
operations.
Service coupling: Managing distributed transactions
can inadvertently lead to tighter coupling between
services, defeating the purpose of a microservices
architecture which aims for loosely coupled services.
To navigate these issues, architects need to employ patterns
that align with the nature of microservices, like the Saga
pattern or the Two-Phase Commit (2PC), among others.
The chosen patterns should be implemented with care,
considering the context and specific requirements of the
architecture.

Orchestrated Saga
The orchestrated saga pattern is a common approach to
handling transactions in a microservices architecture,
particularly in traditional systems that rely heavily on
synchronous calls.
In this pattern, a single orchestrator service (also known as a
Saga Orchestrator) is responsible for managing the
execution of the entire transaction. The orchestrator initiates
and monitors each step of the transaction, making requests
for the appropriate services and managing responses (refer
to Figure 7.7):

Figure 7.7: Orchestrated Saga

The responsibilities of the saga orchestrator include:


Defining the transaction steps: The orchestrator
defines the sequence of steps that make up the
transaction.
Service coordination: It coordinates and initiates
calls to the various microservices involved in the
transaction.
State management: It maintains the state of the
transaction, keeping track of which steps have been
completed and which are pending.
Failure management: In case of a failure in any of the
steps, the orchestrator is responsible for handling it,
which could involve initiating compensating
transactions, or rolling back the entire operation.
The orchestrated saga pattern is particularly useful for
managing long-running transactions, which might involve
multiple steps that occur over a long period of time. The
orchestrator maintains the transaction state, allowing the
overall operation to be paused and resumed as needed.
Consider an example of an e-commerce platform. When a
customer places an order, several steps need to occur in
sequence:
1. Check inventory to ensure the product is available.
2. Reserve the product to prevent it from being sold to
another customer.
3. Charge the customer's credit card.
4. Initiate the delivery process.
In this case, the Saga Orchestrator would be responsible for
initiating each step, keeping track of the progress, and
handling any failures. If the credit card charge fails, for
instance, the orchestrator could cancel the reservation and
update the inventory. This approach allows for a consistent,
reliable transaction process, even in a complex, distributed
environment.
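The orchestration logic for this order flow can be sketched as follows. The service interfaces, their compensating methods, and the use of plain synchronous calls are assumptions; a production orchestrator would also persist the saga state between steps (see the State Management pattern):

public class OrderSagaOrchestrator {

    // Hypothetical clients for the participating services
    public interface InventoryService {
        void reserveProduct(String orderId);
        void cancelReservation(String orderId); // compensating action
    }
    public interface PaymentService {
        void chargeCard(String orderId);
        void refund(String orderId); // compensating action
    }
    public interface DeliveryService {
        void scheduleDelivery(String orderId);
    }

    private final InventoryService inventory;
    private final PaymentService payments;
    private final DeliveryService delivery;

    public OrderSagaOrchestrator(InventoryService inventory,
                                 PaymentService payments,
                                 DeliveryService delivery) {
        this.inventory = inventory;
        this.payments = payments;
        this.delivery = delivery;
    }

    public void placeOrder(String orderId) {
        inventory.reserveProduct(orderId);
        try {
            payments.chargeCard(orderId);
        } catch (RuntimeException e) {
            // The charge failed: undo the previous step and abort the saga
            inventory.cancelReservation(orderId);
            throw e;
        }
        try {
            delivery.scheduleDelivery(orderId);
        } catch (RuntimeException e) {
            // Roll back in the reverse order of the completed steps
            payments.refund(orderId);
            inventory.cancelReservation(orderId);
            throw e;
        }
    }
}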
The Saga Pattern is fundamentally different from the
traditional Two-Phase Commit (2PC) used in monolithic
systems. Here are the key differences:
ACID vs. BASE: 2PC is based on Atomicity,
Consistency, Isolation, Durability (ACID)
properties, which ensure strict consistency at all times.
Sagas, on the other hand, operate under Basically
Available, Soft state, and Eventually consistent
(BASE) properties, trading off immediate consistency
for availability and performance.
Locking: 2PC uses a coordinator that locks the
resources involved in a transaction, keeping them
inaccessible to other transactions until the current one
is completed. This approach can lead to performance
bottlenecks and does not scale well in a distributed
system like microservices. In contrast, Sagas do not
lock resources for the duration of the transaction,
allowing other transactions to proceed, thereby
enhancing performance and scalability.
Failure Handling: In 2PC, if a single transaction fails,
the entire process is rolled back. This strict consistency
can be expensive and unnecessary in a distributed
system. Sagas instead uses compensating transactions
to handle failures, allowing for more flexibility and
resilience.
Synchronous vs. Asynchronous: 2PC is usually
implemented in a synchronous manner, which can lead
to reduced performance in a distributed environment
due to waiting times. Sagas, especially in a
choreographed saga pattern, can be implemented
asynchronously, improving performance by allowing
other processes to proceed while waiting for
responses.
Given these differences, the saga pattern is generally
preferable to 2PC in a microservices architecture. It aligns
more closely with the principles of distributed systems,
offering better performance, scalability, and resilience.
However, it is important to note that the choice between
these patterns depends on the specific needs and contexts
of the application. Each pattern has its own strengths and
trade-offs, and the best choice depends on the specific
requirements around consistency, performance, and
complexity.
Following are the pros and cons:

Pros:
The orchestrator handles failures, isolating the rest of
the services from their impact.
The orchestrator provides a centralized point for
managing transaction state and coordinating steps.
The pattern excels at managing long-running
transactions, allowing operations to be paused and
resumed.
Unlike 2PC, it avoids resource locking for the duration
of a transaction, thereby improving performance and
scalability.

Cons:
The orchestrator can become a single point of failure
and a potential performance bottleneck.
The pattern adds significant complexity to the system,
both in orchestrator implementation and managing
compensating transactions.
The orchestrator introduces some degree of coupling
between services.
It provides eventual consistency, which may not be
suitable for applications that require immediate
consistency.

Choreographic Saga
The choreographic saga pattern is a prominent strategy for
managing transactions in event-driven microservices
architectures, particularly those founded on asynchronous
messaging.
In this pattern, each microservice participating in the
transaction is self-aware and knowledgeable of its own role
and the subsequent step in the process. Essentially, each
service independently determines its next action based on
the received events, contributing to a self-orchestrating,
choreographed sequence (refer to Figure 7.8):

Figure 7.8: Choreographic Saga

Although there is no centralized logic, there can be a shared common state between the microservices involved in a
transaction (see State Management pattern).
Contrasting this with the orchestrated saga pattern, key
differences include:
Decentralized management: In a choreographic
saga, no single orchestrator dictates the flow of the
transaction. Instead, each service is self-directing,
knowing which step to initiate next.
Loose coupling: The services in a choreographic saga
are more loosely coupled compared to an orchestrated
saga. There is no need for services to be aware of an
orchestrator; they only need to know the event to listen
for and the subsequent action.
Asynchronous communication: The pattern largely
relies on asynchronous messaging, enhancing
performance by enabling processes to continue without
having to wait for previous steps to complete.
Failure management: Management of failures can be
more intricate in a Choreographic Saga. Each service
is individually responsible for handling failures during
its step and initiating compensatory or recovery
actions.
We can illustrate this pattern with the previous e-commerce
platform’s example:
The order service, after validating product availability
and reserving the item, emits a 'ProductReserved' event.
The payment service, upon receiving this event,
charges the customer's credit card. On successful
charge, it broadcasts a 'PaymentSuccessful' event.
The delivery service, listening for the 'PaymentSuccessful'
event, kicks off the delivery process.
If any step is unsuccessful, the respective service emits a
failure event, prompting the prior service to execute a
compensating transaction. Thus, despite the lack of a central
orchestrator, the transaction smoothly progresses as each
service understands its role and the subsequent step in the
process.
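A minimal sketch of the payment service's side of this choreography using Spring for Apache Kafka is shown below; the topic names, the String payload carrying the order identifier, and the chargeCard method are assumptions:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class PaymentEventHandler {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public PaymentEventHandler(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // React to the order service's event; no central orchestrator is involved
    @KafkaListener(topics = "product-reserved", groupId = "payment-service")
    public void onProductReserved(String orderId) {
        try {
            chargeCard(orderId); // hypothetical local business logic
            kafkaTemplate.send("payment-successful", orderId);
        } catch (RuntimeException e) {
            // Emit a failure event so the order service can compensate (un-reserve the product)
            kafkaTemplate.send("payment-failed", orderId);
        }
    }

    private void chargeCard(String orderId) {
        // Charge the customer's card for the given order (omitted)
    }
}

The delivery service would follow the same shape, listening on payment-successful and emitting its own success or failure event.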
Following are the pros and cons:

Pros:
No single point of failure or bottleneck, as there is no
central orchestrator.
Services only need to know the next step and not the
entire transaction flow, reducing dependencies.
Asynchronous messaging can lead to better
performance, as processes can continue without
waiting for responses.
The pattern scales well with the growth of the system
due to the lack of a central orchestrator.

Cons:
Ensuring the correct order of operations and handling
failures can be complex without a central coordinator.
Tracing and debugging a transaction through multiple
independent services can be challenging.
Although there is no centralized logic, managing a
shared state across microservices can be complicated.
Like the orchestrated saga, it provides eventual
consistency, which may not be suitable for applications
requiring immediate consistency.

Compensating transaction
The compensating transaction pattern (or simply
compensation) is a crucial mechanism for managing failures
in the saga patterns, be it orchestrated or choreographed.
This pattern is centered on the idea that for each operation
that modifies the data, there should be a corresponding
compensating operation that can undo the changes made by
the initial operation. The compensating transaction is
designed to leave the system in a consistent state without
violating business rules (refer to Figure 7.9):

Figure 7.9: Compensating transaction

In the context of a saga, if an operation fails during the execution of a transaction, a series of compensating
transactions are invoked to rollback the changes made by
the previous operations in the transaction. The
compensating transactions are executed in the reverse order
of the operations.
Let us reconsider our e-commerce example. Suppose the
payment service successfully charges the customer's card,
but the delivery service later fails to initiate the delivery
process. In such a scenario, the compensating transactions
would be as follows:
The delivery service, upon encountering a failure,
would emit a 'DeliveryFailed' event.
The payment service, listening to this event, would
initiate its compensating transaction, which could
involve refunding the charged amount to the
customer's card and emit a 'PaymentRefunded' event.
Finally, the order service, on receiving the
'PaymentRefunded' event, would execute its own
compensating transaction, possibly involving un-
reserving the product.
This way, the system is returned to a consistent state. It is
important to note that these compensating transactions
should be idempotent to ensure the system's consistency,
regardless of how many times they are executed.
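For example, the payment service's refund compensation can be made idempotent by recording which orders have already been refunded, so that a replayed 'DeliveryFailed' event does not trigger a second refund. The repository and gateway interfaces below are assumptions:

public class RefundCompensation {

    // Hypothetical collaborators
    public interface RefundRepository {
        boolean alreadyRefunded(String orderId);
        void markRefunded(String orderId);
    }

    public interface PaymentGateway {
        void refund(String orderId);
    }

    private final RefundRepository repository;
    private final PaymentGateway gateway;

    public RefundCompensation(RefundRepository repository, PaymentGateway gateway) {
        this.repository = repository;
        this.gateway = gateway;
    }

    // Safe to call any number of times for the same order
    public void compensate(String orderId) {
        if (repository.alreadyRefunded(orderId)) {
            return; // duplicate event, nothing to do
        }
        gateway.refund(orderId);
        repository.markRefunded(orderId);
    }
}

In practice, the check-and-mark step should itself be atomic, for example by relying on a unique constraint on the order identifier in the refunds table.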

Workflow
Implementing distributed transactions using the saga pattern
can be a complex task, with challenges ranging from
orchestrating communication between services, handling
failures, and managing transaction state, to ensuring data
consistency. However, such a process is often systematic and
can be represented as a state machine, making the use of a
workflow engine a fitting solution. Instead of implementing a
saga orchestration logic from scratch, developers can
employ a workflow engine or a complete workflow service
(as shown in Figure 7.10):

Figure 7.10: Workflow Engine (Service) used as Saga Orchestrator


Workflow engines allow developers to model their
transaction flow as a state machine, abstracting away much
of the complexity involved in managing distributed
transactions. The engines take care of orchestrating the
services, handling failures, and managing the transaction
state. This allows developers to focus more on the business
logic of their services rather than the intricacies of
transaction management.
Here are some workflow engines suitable for orchestrating
sagas in Java-based microservices:
Camunda Business Process Management (BPM):
A lightweight, open-source platform for BPM. It's
written in Java and supports the creation of complex
business logic workflows and decision tables.
Activiti: An open-source workflow and BPM platform
that is designed to run in any Java application, on a
server, cluster or in the cloud.
jBPM: An open-source Business Process Management
Suite that includes the capability to design, execute
and monitor business processes throughout their life
cycle.
Flowable: A light-weight business process engine
written in Java. Flowable offers a set of cloud-native
building blocks designed to run on scalable
infrastructures.
Zeebe: Developed by the creators of Camunda, Zeebe
is a workflow engine for microservices orchestration. It
can scale horizontally to handle very high throughput.
Netflix Conductor: An orchestration engine that runs
in the cloud, it provides a control plane for workflows,
tasks, and event machinery.
Temporal: Originally developed at Uber, Temporal is a workflow engine that provides a coding framework for reliably executing long-running business logic in a distributed environment; Java applications use it through its Java SDK.
Aside from workflow engines, cloud platforms offer
specialized workflow management services that developers
can leverage to orchestrate their distributed transactions:
AWS Step Functions provides a serverless function
orchestrator that makes it easy to sequence AWS
Lambda functions and multiple AWS services into
business-critical applications.
Azure Logic Apps allows developers to design
workflows visually for orchestrating data across
services, automate EAI, B2B/EDI, and business
processes.
Google Workflows offers serverless workflow
orchestration, allowing developers to develop, deploy,
and manage workflows connecting Google Cloud
services and APIs.
By employing the Workflow Pattern, developers can
streamline their transaction management process, reduce
complexity, and improve the reliability of their microservices-
based applications.
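As a small illustration, starting the order saga as a process instance in an embedded Camunda engine might look like the following sketch; it assumes that a BPMN process with the key orderSaga has been deployed and that a default engine is configured (for example, via camunda.cfg.xml or the Spring Boot starter):

import java.util.HashMap;
import java.util.Map;

import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.ProcessEngines;

public class OrderSagaStarter {

    public void startOrderSaga(String orderId) {
        // Obtain the default, already configured engine
        ProcessEngine engine = ProcessEngines.getDefaultProcessEngine();

        // Variables passed to the process; service tasks in the BPMN model read them
        Map<String, Object> variables = new HashMap<>();
        variables.put("orderId", orderId);

        // From here on, the engine orchestrates the saga steps and their compensations
        engine.getRuntimeService().startProcessInstanceByKey("orderSaga", variables);
    }
}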
Following are the pros and cons:

Pros:
Workflow engines abstract away the complexity of
managing distributed transactions.
Developers can focus more on implementing the
business logic rather than the intricacies of transaction
management.
The pattern allows transaction flow to be modeled as a
state machine, making it easier to visualize and
understand.
Workflow engines handle failures, reducing the risk of
an entire transaction failing due to a single point of
failure.
Many workflow engines and services are designed to
scale horizontally, making them suitable for high-
throughput environments.

Cons:
Introducing a workflow engine into the architecture
can add its own layer of complexity and learning curve.
Workflow engines may introduce a performance
overhead due to the additional layer of communication
and processing.
Using a workflow engine or a cloud service introduces
a new dependency in the system.
Commercial workflow services from cloud providers
come with their own costs.
Incorporating a workflow engine in an existing system
might require significant refactoring.

Reliability
Reliability is a cornerstone in microservices, ensuring
consistent and correct functionality in diverse scenarios.
Design patterns play an essential role in enhancing system
reliability, shielding it from various types of failures. These
include patterns such as Bulkhead, Outbox, and
Backpressure. Collectively, they aid in isolating failures,
ensuring data consistency, and controlling system load.
Adopting such patterns is crucial for building resilient
microservice architectures.

Problem
In the world of microservices, ensuring the reliability of
business logic is a complex and vital task. Given their
distributed nature, microservices are susceptible to various
challenges that can disrupt the execution of business logic,
affecting the system's overall functionality and performance.
Here are some common issues faced in maintaining
reliability in microservices' business logic:
Network unreliability: Network issues, such as
latency or disconnections, can cause failure or delay in
communication between services, disrupting the
execution of business logic.
Service overload: High demand or unexpected spikes
can lead to service overload, affecting the service's
ability to execute business logic efficiently and reliably.
Data consistency: Ensuring data consistency across
microservices can be a challenge, with discrepancies
leading to inaccurate execution of business logic.
Error propagation: In a tightly coupled system, an
error in a single service can quickly propagate to other
services, leading to a widespread failure.
Resource exhaustion: In scenarios of high demand,
services can exhaust their resources, leading to a
complete halt in the execution of business logic.
Faulty business logic: Bugs or unhandled exceptions
in the business logic can cause services to fail,
affecting the reliability of the whole system.

Backpressure
The backpressure pattern is used to manage load and
increase the resilience of a system. Backpressure, as a term,
is derived from fluid dynamics, where it represents a
resistance or force opposing the desired flow of fluid in a
pipe.
In the realm of microservices, the backpressure pattern is
about controlling the rate of incoming requests to prevent
resource exhaustion. If a service is overwhelmed by too
many incoming requests, it can leverage backpressure to
push back on the caller or drop requests when necessary.
Here is how it works:
Buffering and queueing: Incoming requests are
temporarily stored in a buffer or queue. However, this
buffer has a limit to prevent indefinite growth and
potential memory exhaustion.
Load shedding: When the buffer reaches its limit, the
service can start refusing new requests. This technique
is also known as load shedding.
Flow control: Another approach is to apply flow
control, signaling the upstream services to slow down
the request rate. This can be achieved through
mechanisms like TCP flow control in the transport
layer or application-level flow control.
Adaptive modeling: Some implementations of
backpressure involve adaptive models where the
service adjusts its request handling capacity based on
current system load and performance metrics.
By applying the Backpressure pattern, a microservice can
effectively protect itself from becoming overloaded and
ensure that it continues to function optimally under high
demand.
In the realm of distributed systems and resource
management, rate limiting, throttling, and the backpressure
pattern serve as essential techniques to regulate the flow of
data or requests, albeit with distinct approaches. Rate
limiting involves setting a predefined maximum rate at which
requests or data can be processed, typically based on a time
window (e.g., requests per second). Throttling, on the other
hand, involves dynamically adjusting the processing rate in
response to system load or resource availability, slowing
down or pausing processing when necessary to prevent
overload. Both rate limiting (see it in Chapter 5,
Implementing Communication) and throttling focus on
controlling the influx of requests or data to prevent system
degradation or failure under heavy loads. In contrast, the
backpressure pattern addresses congestion in systems by
propagating information about overload or congestion
upstream, allowing the sender to adjust its rate of
transmission accordingly. While rate limiting and throttling
act as proactive measures to regulate the flow of data,
backpressure serves as a reactive mechanism for handling
congestion and maintaining system stability. Despite their
differences, these patterns all aim to ensure system
reliability and performance by managing the flow of requests
or data in distributed environments.
Backpressure can be implemented in various ways, one of
the commonly used methods in Java is via the
`java.util.concurrent.Flow` class introduced in Java 9, which
includes support for reactive streams and backpressure.
Here is a simple example (Code snippet 7.10):
1. import java.util.concurrent.Flow;
2. import java.util.concurrent.SubmissionPublisher;
3.
4. public class ExampleBackpressure {
5.
6.   static class MySubscriber<T> implements Flow.Subscriber<T> {
7.     private Flow.Subscription subscription;
8.
9.     @Override
10.     public void onSubscribe(Flow.Subscription subscription) {
11.       this.subscription = subscription;
12.       subscription.request(1); // requesting the first item
13.     }
14.
15.     @Override
16.     public void onNext(T item) {
17.       System.out.println("Received: " + item);
18.       // after processing the item, we request the next one
19.       subscription.request(1);
20.     }
21.
22.     @Override
23.     public void onError(Throwable throwable) {
24.       throwable.printStackTrace();
25.     }
26.
27.     @Override
28.     public void onComplete() {
29.       System.out.println("Completed");
30.     }
31.   }
32.
33.   public static void main(String[] args) throws InterruptedException {
34.     SubmissionPublisher<String> publisher = new SubmissionPublisher<>();
35.
36.     MySubscriber<String> subscriber = new MySubscriber<>();
37.     publisher.subscribe(subscriber);
38.
39.     System.out.println("Publishing items...");
40.     String[] items = {"item1", "item2", "item3", "item4", "item5"};
41.     for (String item : items) {
42.       publisher.submit(item);
43.     }
44.
45.     publisher.close();
46.     Thread.sleep(1000); // waiting for all items to be processed
47.   }
48. }

In this example, `MySubscriber` is implementing backpressure using the `Flow.Subscriber` interface. In `onSubscribe`, it
requests the first item. After processing each item in `onNext`,
it requests the next one. This way, it only pulls items as fast
as it can process them, effectively implementing
backpressure.
Following are the pros and cons:

Pros:
By limiting the rate of incoming requests, the pattern
protects a system from being overwhelmed.
By shedding excess load, the system can maintain
optimal performance and prevent cascading failures.
Can be made adaptive to changing system load,
ensuring efficient use of resources.

Cons:
Implementing backpressure adds complexity to the
system and requires careful tuning.
If requests are dropped during high load, important
data might be lost.
Slowing down request processing can have cascading
effects on overall system latency and responsiveness.

Bulkhead
The bulkhead pattern is a design strategy employed in
microservices architectures to increase the resilience and
fault tolerance of a system. The name "Bulkhead" is
borrowed from ship construction, where a ship's hull is
compartmentalized into watertight sections. If a leak occurs
in one section, it doesn't flood the entire ship (as shown in
Figure 7.11):
Figure 7.11: Bulkheads in ship design

In microservices, the bulkhead pattern involves isolating elements of an application into pools so that if one fails, the
others will continue to function. It is like having several small
isolated systems instead of one large system.
Each microservice or operation gets its separate thread-pool
(or other resources like database connections), limiting the
scope of any potential failure to that single pool.
This way, even if one microservice is overloaded or failing, it
doesn't affect the whole system. Only the operations or
requests routed to that particular microservice are impacted,
while others continue to function normally.
In Spring Boot, you can implement the Bulkhead pattern with
resilience4j, a fault tolerance library. Here is an example of
how to do this:
In your `application.yml` file, define the bulkhead
configuration (Code snippet 7.11):
1. resilience4j:
2. bulkhead:
3. instances:
4. myService:
5. maxConcurrentCalls: 5
6. maxWaitDuration: 500ms
This configuration creates a bulkhead instance named
`myService` that allows a maximum of 5 concurrent calls and a
maximum wait duration of 500 milliseconds.
Now, in your service class, use the `@Bulkhead` annotation to
apply the bulkhead (Code snippet 7.12):
1. import io.github.resilience4j.bulkhead.annotation.Bulk
head;
2. import org.springframework.stereotype.Service;
3.
4. @Service
5. public class MyService {
6.
7. @Bulkhead(name = "myService", fallbackMethod =
"fallbackForMyService")
8. public String processRequest() {
9. return "OK"; // Your business logic here
10. }
11.
12. // Fallback method to be executed if the bulkhead is f
ull
13. public String fallbackForMyService(Throwable e) {
14. return "Fallback response";
15. }
16. }
In this example, the `processRequest()` method is protected by
the bulkhead. If there are already five concurrent calls being
executed, additional calls will wait up to 500 milliseconds for
a spot to free up. If the wait exceeds this limit, or if an
exception is thrown within the method, the
`fallbackForMyService()` method will be executed.
The bulkhead pattern is particularly useful for high-load
systems and those with critical operations that should be
highly available. It is often used together with the circuit
breaker pattern (see it in Chapter 5, Implementing
communication) to enhance the fault tolerance of the system
further.
Bulkhead and Backpressure patterns (described above) are
distinct system load management patterns. The bulkhead
pattern isolates system components, restricting failures to
confined areas and preventing them from spreading across
the entire system. Backpressure, on the other hand,
dynamically manages data flow based on system capacity,
regulating the flow of incoming work to match what the service can handle.
Following are the pros and cons:

Pros:
If one microservice fails, the failure is limited to that
service and doesn't cascade to others.
The system can continue operating even in the face of
individual service failures.
Resources are allocated on a per-service basis,
preventing one service from consuming all resources.

Cons:
The pattern introduces more complexity to the system,
as it requires careful design and resource allocation.
Dedicated resource pools for each microservice might
lead to under-utilization of resources if not managed
properly.
More services mean more monitoring and
administration efforts.
Too many isolated segments might lead to an overly
complex and fragmented system.

Outbox
The outbox pattern aims to solve the data consistency
problem in distributed systems, particularly around the
implementation of transactions that span multiple services.
The core idea is that instead of a service directly producing
events or messages for other services to consume, it stores
these outgoing messages in a local, transactional "outbox"
database table. This table acts as a temporary storage for
messages that are yet to be dispatched (as shown in Figure
7.12).

Figure 7.12: Outbox pattern

The outbox pattern leverages local ACID transactions to ensure that changes to the database and the event
messages in the outbox are committed or rolled back
together, preventing inconsistency between the state of the
service and the messages it produces.
Once the local transaction is successfully committed, a
separate message relay process reads the messages from
the outbox and publishes them to the message broker. This
can be done through polling or database triggers, depending
on the specific technology in use.
In the event of a failure in the message relay process, the
messages remain in the outbox and can be retried, ensuring
eventual consistency. When the relay process is successful,
the messages are removed from the outbox.
This pattern is often used in combination with the
"Transactional Outbox" pattern, which ensures that the
changes in the database and the messages being published
are part of a single local transaction to maintain consistency.
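As a rough sketch of the write path and the relay, the service below inserts the business record and the outbox message within one local transaction using Spring's JdbcTemplate, and a scheduled method publishes pending rows; the orders and outbox table layouts and the MessageBrokerClient are illustrative assumptions rather than a prescribed design:

  import java.util.UUID;

  import org.springframework.jdbc.core.JdbcTemplate;
  import org.springframework.scheduling.annotation.Scheduled;
  import org.springframework.stereotype.Service;
  import org.springframework.transaction.annotation.Transactional;

  @Service
  public class OrderOutboxService {

      private final JdbcTemplate jdbc;
      private final MessageBrokerClient broker; // hypothetical wrapper around the message broker

      public OrderOutboxService(JdbcTemplate jdbc, MessageBrokerClient broker) {
          this.jdbc = jdbc;
          this.broker = broker;
      }

      @Transactional // the order row and the outbox row commit or roll back together
      public void placeOrder(String orderId, String eventPayloadJson) {
          jdbc.update("INSERT INTO orders (id, status) VALUES (?, ?)", orderId, "PLACED");
          jdbc.update("INSERT INTO outbox (id, type, payload) VALUES (?, ?, ?)",
                  UUID.randomUUID().toString(), "OrderPlaced", eventPayloadJson);
      }

      // Message relay: periodically publishes pending outbox rows and removes them
      // (requires @EnableScheduling on the application).
      @Scheduled(fixedDelay = 1000)
      public void relayOutbox() {
          var rows = jdbc.queryForList("SELECT id, type, payload FROM outbox LIMIT 100");
          for (var row : rows) {
              broker.publish((String) row.get("type"), (String) row.get("payload"));
              jdbc.update("DELETE FROM outbox WHERE id = ?", row.get("id"));
          }
      }
  }

Because the relay may publish a message and then fail before deleting the row, consumers of these events should themselves be idempotent.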
Compared to other similar approaches such as direct
messaging or distributed transactions, the outbox pattern
provides several advantages. Unlike direct messaging, which
can introduce coupling between components and make it
challenging to maintain consistency across distributed
transactions, the outbox pattern ensures that message
emission is transactionally coupled with the business logic
that generates it, preserving data consistency. Additionally,
unlike distributed transactions, which can suffer from
performance issues and scalability limitations when spanning
multiple services or data stores, the outbox pattern allows
for more scalable and fault-tolerant message processing by
decoupling the emission of messages from the underlying
data transactions. This separation of concerns enables better
scalability, fault tolerance, and maintainability in distributed
systems while still ensuring reliable message delivery and
transactional integrity. Overall, the outbox pattern is well-
suited for environments where maintaining consistency and
reliability in asynchronous communication is essential,
offering a pragmatic and scalable approach to message
processing in distributed architectures.
Following are the pros and cons:
Pros:
By using local transactions, the pattern ensures that
database changes and message dispatching are
consistent.
If message dispatching fails, messages can be retried,
ensuring eventual consistency.
Separation of the business logic from the message
dispatching logic can lead to cleaner, more
maintainable code.

Cons:
Introducing an Outbox adds another layer to manage,
increasing the complexity of the system.
There might be a delay between the transaction
commit and the message dispatching, especially under
high load or failure scenarios.
Depending on the implementation, the pattern may
require database triggers or polling mechanisms,
which can have their own challenges and performance
considerations.
It might take some time for other services to see the
changes, especially in case of delays or failures.

Delayed execution
The delayed execution pattern in a microservices architecture addresses
the handling of complex or long-running business logic.
Instead of immediate execution, this pattern involves
queuing requests and processing them later by background
workers. Upon receiving a request, the service adds it to a
job queue and quickly responds, deferring the actual
processing. This method helps in managing system load,
enhancing scalability, and ensuring efficient resource use. It
also aids in reliable transaction recovery and prevents issues
like double-processing. This pattern is practical for systems
where immediate response is less critical, but efficient and
reliable handling of requests is paramount.

Problem
In microservices architecture, managing the execution of
business logic upon receiving user requests is crucial for
system efficiency and reliability. In most simple cases,
business logic is executed immediately, either synchronously
or asynchronously, upon the arrival of a request. However,
this immediate execution approach, despite its simplicity and
directness, presents significant challenges, particularly in
high-load scenarios.
The following figure explains this approach:

Figure 7.13: Synchronous and asynchronous immediate execution of business logic in microservices

Synchronous execution: Here, the business logic is executed on the same thread as the incoming request.
While this allows for straightforward programming
models and immediate feedback, it poses challenges in
handling concurrent requests. These requests run on
separate threads, requiring complex synchronization
mechanisms to avoid collisions and maintain data
integrity.
Asynchronous execution: In this model, the business
logic is executed in separate threads, while the original
request thread awaits responses. This approach can
handle more concurrent requests than synchronous
execution but often leads to increased complexity in
managing asynchronous workflows and handling
potential errors or timeouts.
Both synchronous and asynchronous immediate executions,
while effective in simpler scenarios, struggle under heavy
load conditions. Large volumes of concurrent requests can
lead to system overload, resulting in significant spikes in
system load. This not only degrades system responsiveness
but also increases the likelihood of transaction failures due to
timeouts. Also, this method is not suitable for long-running
transactions, as the request can fail on timeout before the
transaction is complete. Furthermore, in cases where the
business logic is complex, recovery from failures becomes
cumbersome, as simple retries might not suffice.
The delayed execution (also known as Task Scheduling)
offers an alternative approach. Instead of executing the
business logic immediately, the microservice places the
incoming request, along with its parameters, into a job
queue. The service then promptly responds, indicating that
the request has been queued. Later, stored jobs are
distributed among one or more workers that handle their
execution (see Figure 7.14).
Figure 7.14: Delayed execution of business logic in microservices

Delayed execution offers a few key benefits:


Load management: By queuing requests and
processing them asynchronously, this pattern
effectively spreads out the load, minimizing peaks in
system demand.
Scalability: Delayed execution allows for more
efficient use of system resources, leading to better
scalability, especially under high-load conditions.
Reliability and recovery: This approach simplifies the
recovery process for failed transactions. It ensures that
each request is processed exactly once, addressing
issues like double-processing and other concurrency-
related problems.

Job Queue
The job queue is the key component in the delayed
execution pattern. It serves as a temporary storage for tasks
or 'jobs' that need to be processed. The primary function of a
job queue is to manage these jobs efficiently, ensuring
smooth execution and handling of tasks.
The key features of the job queue include:
Adding and removing jobs: The queue allows for the
addition of new jobs as they arrive and the removal of
jobs once they are processed. This ensures a
continuous and organized flow of task handling.
Locking jobs during execution: When a job is being
processed, it is 'locked' to prevent multiple workers
from processing the same job simultaneously. This
locking mechanism is crucial for maintaining data
integrity and preventing duplicate processing.
Releasing locks on failure: If a job fails during
processing, the lock is released. This allows for the job
to be retried or handled according to the system's
failure management strategy.
Distributing jobs among multiple workers: The job
queue efficiently distributes jobs among available
worker services or threads. This distribution is key to
leveraging parallel processing and optimizing resource
utilization.
Scalability and load management: The job queue
inherently supports scalability, as it can manage an
increasing number of jobs by scaling out to more
workers. It also helps in balancing the load across the
system.
Monitoring and visibility: It provides mechanisms for
monitoring the status of jobs, which is essential for
managing long-running or complex tasks and for
debugging issues.
Priority handling: Some job queues may support
prioritization of tasks, allowing critical jobs to be
processed ahead of others.
When implementing a job queue in microservices
architecture, there are several options available, each
catering to different requirements and scenarios:
Persistent message queues: These are robust
solutions for job queuing and are particularly suited for
ensuring reliability and data persistence. Examples
include:
RabbitMQ: A widely-used open-source message
broker that supports complex routing and ensures
message delivery.
Apache Kafka: Known for handling high-throughput
data streams, Kafka is ideal for large-scale,
distributed systems.
Amazon SQS (Simple Queue Service): A managed
service offered by AWS, suitable for cloud-based
architectures, offering scalability and integration
with other AWS services.
Spring batch: This is a lightweight, comprehensive
batch processing framework designed for the
development of robust batch applications. It is a good
choice for applications already using the Spring
ecosystem. Spring Batch provides advanced job
processing capabilities and can be integrated with
persistent message queues for even more complex
scenarios.
Specialized queue services: These are cloud-based
or standalone services specifically designed for job
queuing and task scheduling. Examples include:
AWS Batch: Automates batch processing and job
scheduling, allowing you to efficiently run hundreds
of thousands of batch computing jobs on AWS.
Google Cloud tasks: A fully managed service within
Google Cloud Platform for managing the execution,
dispatch, and delivery of a large number of
distributed tasks.
Azure Queue storage: Offers cloud-based queuing
for storing large numbers of messages accessible
from anywhere in the world.
Each of these options has its unique strengths and is suited
for different use cases, ranging from simple, lightweight
queuing needs to complex, high-throughput scenarios
requiring advanced features like distributed processing, fault
tolerance, and integration with broader ecosystem tools. The
choice largely depends on the specific requirements of the
system, existing technology stack, and scalability needs.
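To make the flow concrete, here is a minimal sketch using RabbitMQ through Spring AMQP (spring-boot-starter-amqp); the report-jobs queue name and the JSON string payload are illustrative assumptions:

  import org.springframework.amqp.rabbit.annotation.RabbitListener;
  import org.springframework.amqp.rabbit.core.RabbitTemplate;
  import org.springframework.stereotype.Service;

  @Service
  public class ReportJobQueue {

      private final RabbitTemplate rabbitTemplate;

      public ReportJobQueue(RabbitTemplate rabbitTemplate) {
          this.rabbitTemplate = rabbitTemplate;
      }

      // Called by the request-handling code: enqueue the job and return immediately.
      public void enqueue(String jobPayloadJson) {
          rabbitTemplate.convertAndSend("report-jobs", jobPayloadJson);
      }

      // A worker consumes jobs from the queue; messages that are not acknowledged
      // (for example, after an exception) are redelivered, which provides the
      // retry-on-failure behavior described above.
      @RabbitListener(queues = "report-jobs")
      public void process(String jobPayloadJson) {
          // execute the long-running business logic here
      }
  }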
Following are the pros and cons of the job queue pattern:
Pros:

Allows systems to handle large volumes of requests efficiently.
Distributes workload evenly across available resources.
Enables handling of tasks without blocking main
application flow.
Provides resilience against system failures, ensuring
tasks are not lost.
Queued jobs can be retried in case of processing
failures.
More efficient use of system resources, processing jobs
as capacity allows.
Allows setting priorities for different jobs.
Facilitates tracking and management of task status and
performance.

Cons:
Adds additional architectural and operational
complexity.
Can introduce delays in processing, especially if the
queue is long.
Requires careful monitoring and management of
resources.
Poorly managed queues can become bottlenecks in the
system.
Reliant on the underlying queue infrastructure's
reliability and performance.
Ensuring consistency across distributed systems can be
complex.
Additional overhead in maintaining and monitoring the
queue system.
Requires robust mechanisms to handle and recover
from job failures.

Background worker
Background workers are components or microservices
responsible for executing tasks that have been placed into a
job queue. They operate in the background, processing jobs
independently from the primary request-handling workflow.
This separation allows for more efficient handling of
complex, long-running, or resource-intensive tasks without
impacting the responsiveness of the main service.
Execution of jobs in background workers can be triggered in
a few different ways:
On job arrival: Background workers can be
configured to activate when a new job arrives in the
queue. This immediate response ensures that tasks are
addressed promptly, optimizing the processing time.
Based on schedule: Alternatively, workers can be
activated based on a timer, where they periodically
check the queue for new jobs. This approach can be
more efficient in scenarios where job arrival is less
frequent or predictable.
Mixed: The combination of two previously described
methods.
In some cases, background workers can be implemented as
subcomponents inside microservices. This approach is
simpler and may be sufficient for smaller-scale applications
or services with limited background processing needs.
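For the embedded case, the sketch below shows a schedule-triggered worker inside a Spring Boot service; the JobQueue and Job types are hypothetical abstractions standing in for whichever queue technology is actually used:

  import org.springframework.scheduling.annotation.Scheduled;
  import org.springframework.stereotype.Component;

  @Component
  public class ReportWorker {

      private final JobQueue jobQueue; // hypothetical: hands out jobs with a lock held

      public ReportWorker(JobQueue jobQueue) {
          this.jobQueue = jobQueue;
      }

      // Wakes up every five seconds and drains whatever jobs are waiting
      // (requires @EnableScheduling on the application).
      @Scheduled(fixedDelay = 5000)
      public void pollAndProcess() {
          Job job; // hypothetical job type
          while ((job = jobQueue.lockNext()) != null) {
              try {
                  execute(job);
                  jobQueue.complete(job); // removes the job from the queue
              } catch (Exception e) {
                  jobQueue.release(job);  // releases the lock so the job can be retried
              }
          }
      }

      private void execute(Job job) {
          // long-running business logic goes here
      }
  }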
For greater flexibility and scalability, background workers can
be implemented as independent microservices. This
separation allows for more granular control over scaling, as
the background processing capacity can be adjusted
independently of the main service. It also enhances the
system's resilience, as the failure in a background worker
does not directly impact the main service operations.
An independent background worker microservice is
particularly advantageous in systems with variable
processing loads. It allows the architecture to be more
responsive to changing demands, scaling up or down as
required. This setup not only optimizes resource utilization
but also provides a more robust framework for handling
diverse and unpredictable workloads.
Following are the pros and cons of the background worker
pattern:
Pros:

Enhances system performance by handling time-consuming tasks separately.
Can scale independently based on workload, enhancing
overall system scalability.
Errors in background tasks don’t directly impact the
clients.
More efficient use of system resources by distributing
tasks.
Can be tailored for specific tasks like data processing
or batch jobs.
Enables concurrent processing of multiple tasks.

Cons:
Adds architectural and operational complexity.
Can introduce delays in task processing and
completion.
Needs additional monitoring for performance and error
handling.
Requires strategies for handling failures and retries of
tasks.
Ensuring data consistency between main and
background processes can be challenging.
Additional overhead in deployment and management of
separate worker services.

Antipatterns
In the process of handling complex business transactions in a
microservices architecture, several practices can prove to be
counterproductive. These antipatterns can hinder scalability,
maintainability, and the overall efficiency of the system.
Here are a few notable ones:
Distributed monolith: This antipattern emerges when
services are tightly coupled, with one service directly
invoking another to perform its operation. The system
becomes a distributed monolith as services can't be
deployed, scaled, or updated independently.
Global transactions: Also known as the two-phase
commit (2PC), this pattern ensures that a transaction
is either fully committed across all services or fully
rolled back. However, this approach can become an
antipattern due to its negative impact on availability
and performance as it requires all involved services to
be available and introduces synchronous locks across
services.
Synchronization across datacenters:
Transactions or synchronization and locking across
multiple data centers can have a severe impact on
overall system performance compared to a single
location. It is crucial to avoid such scenarios at all
costs by partitioning and redirecting calls to the
designated data center where they belong.
Shared database: In this antipattern, multiple
services share the same database. While this approach
allows for easy sharing of data, it creates tight coupling
at the database level, violating the principle of data
encapsulation in a microservices architecture.
Inappropriate use of sagas: While sagas are
powerful tools for managing distributed transactions,
misusing them can lead to issues. For instance,
creating extremely long sagas can reduce system
performance and increase the chances of failures.
Lack of idempotency: In distributed systems, the
same request may be processed more than once due to
network retries or message duplication. Failure to
handle these situations idempotently (i.e., ensuring the
multiple executions have the same effect as a single
execution) can lead to incorrect business operations
and data inconsistency.
Avoiding these antipatterns requires careful planning, an
understanding of the domain, and adherence to the
principles of loose coupling, high cohesion, data
encapsulation, and autonomy of services.

Conclusion
In this chapter, we delved into the essential patterns for
handling complex business transactions within a
microservices architecture. We unpacked a wide range of
topics, from state management and process flow to
transaction management, delayed execution, and reliability.
As a result, we gained insights into how to leverage these
patterns to enhance the robustness, flexibility, and efficiency
of our microservices-oriented systems. The knowledge and
skills acquired in this chapter provide a solid foundation for
designing and implementing effective transaction strategies,
empowering us to construct highly responsive, scalable, and
resilient microservices. In the next chapter, we will expand
our knowledge of microservices by learning how to expose
external APIs.

Further reading
1. Huseynli, O. Caching as a part Software Architecture.
Medium. Nov 23, 2021. Available at
https://orkhanscience.medium.com/upgrade-
performance-via-caching-5-min-read-
19fafd56d704
2. Owusu, K.A. Summary #001 — Caching at Netflix: The
Hidden Microservice. Medium. Aug 30, 2021. Available
at https://medium.com/@thelyss/summary-001-
caching-at-netflix-the-hidden-microservice-
f28700b0e7a9
3. Srivastava, J. Distributed locks. Medium. Jan 11, 2021.
Available at https://medium.com/system-design-
concepts/distributed-locks-9ed116145a47
4. An, J. Distributed Lock Implementation With Redis and
Python. Medium. Jun 23, 2021. Available at
https://medium.com/geekculture/distributed-
lock-implementation-with-redis-and-python-
22ae932e10ee
5. Koshy. Distributed State — Challenges and Options.
Medium. Oct 19, 2020. Available at
https://medium.com/swlh/distributed-state-
management-80c8100bb563
6. Kirekov, S. Chain of Responsibility Pattern in Spring
Application. Medium. Oct 2, 2022. Available at
https://medium.com/javarevisited/chain-of-
responsibility-pattern-in-spring-application-
f79b35f341e5
7. Nur, P.A. How to Handle Distributed Transaction on
Microservices. Medium. Dec 11, 2021.
https://pramuditya.medium.com/how-to-handle-
distributed-transaction-on-microservices-
97a861dd7b11
8. Deng, D. Building Resilient Microservice Workflows with
Temporal: A Next-Gen Workflow Engine. Medium. Feb
14, 2023. Available at
https://medium.com/safetycultureengineering/bui
lding-resilient-microservice-workflows-with-
temporal-a-next-gen-workflow-engine-
a9637a73572d
9. Ahmad, A. 5 Must-Know Distributed Systems Design
Patterns for Event-Driven Architectures. Medium. May
26, 2023. https://levelup.gitconnected.com/stay-
ahead-of-the-curve-5-must-know-distributed-
systems-design-patterns-for-event-driven-
7515121a28ae

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 8
Exposing External APIs

Introduction
This chapter introduces you to the world of external APIs,
highlighting their role as gateways for external software
interactions. We will delve into the contrast between internal
communication and the broader, more intricate landscape of
external APIs. Emphasis will be placed on the importance of
security, the need for interoperability with varied systems,
and the art of versioning to ensure smooth functionality
across updates.

Structure
In this chapter, we will cover the following topics:
External interface
API gateways
Public Facades
Backend for Frontend
API management
Synchronous request/response
HTTP/REST
GraphQL
Push notifications and callbacks
Webhooks
WebSockets
Authentication
Basic authentication
API key
OpenID connect
Multi-factor authentication
Authorization
Session tracking
JWS token
OAuth 2.0
SSL/TLS encryption

Objectives
After studying this chapter, you should be able to build
robust external APIs tailored for diverse users, implement
security measures specific to external interfaces, engineer
APIs for seamless interoperability with various systems, and
apply effective versioning techniques to ensure backward
compatibility during API evolution.

External interface
In a microservices system, an external interface refers to the
point of interaction or communication between the
microservices-based application and the outside world. This
could be other applications, third-party services, or even
end-users. It serves as a bridge, ensuring that internal
services remain decoupled and isolated while providing a
unified, cohesive interface to external entities.

Problem
Microservices evolve rapidly, posing challenges for external
clients who prioritize interface stability and interoperability.
External interfaces act as stable Facades, ensuring
compatibility across diverse client technologies, enhancing
security, and enabling tailored optimization. Despite
underlying changes, the external interface remains a beacon
of consistency and adaptability (as shown in Figure 8.1):

Figure 8.1: External interfaces and their responsibilities

External interfaces in microservices systems come with a suite of features designed to optimize, secure, and facilitate
smooth communication with external clients:
Interoperable synchronous and asynchronous
APIs: Ensure smooth communication across diverse
systems, regardless of their interaction patterns, be it
real-time synchronous or event-driven asynchronous.
Versioning: Allows clients to specify the API version
they want to use, safeguarding backward compatibility.
Authentication mechanisms: Features like OAuth,
API keys, JWT, or basic authentication validate the
identity of external entities.
Authorization layers: Define permissions and roles to
ascertain what an authenticated entity can access or
execute.
Rate limiting: Control the request frequency from
clients within a set timeframe to prevent potential
system abuse.
Caching: Temporarily stores frequently accessed data
or responses for enhanced response speeds.
Error handling and reporting: Offer uniform error
codes and messages, guiding external clients without
exposing system internals.
Request and response transformation: Adapt
incoming and outgoing data to relevant formats,
ensuring effective communication across different
systems.
Logging and monitoring tools: Chronicle
interactions, oversee performance metrics, and
monitor for potential anomalies or threats.
Encryption: Ensure the confidentiality and integrity of
data in transit using encryption mechanisms.
Load balancing: Distribute incoming interactions over
multiple service instances for balanced resource use
and optimal response times.
Circuit breaker: Recognize service failures and
swiftly redirect or fall back to maintain system
resilience.
Compression: Minimize response data size to increase
transmission speed and reduce bandwidth
consumption.
Documentation and discovery: Utilize tools such as
Swagger or OpenAPI to auto-generate comprehensive
API documentation, aiding in client integration.
Cross-Origin Resource Sharing (CORS): Regulate
and specify which external domains can interface with
the microservices, especially vital for web applications.
These features, embedded within external interfaces,
underpin the mission of delivering reliable, efficient, and
secure communication between the microservices and the
diverse array of external clients.

API gateway
In microservices, the API gateway pattern simplifies external
interfaces using pre-built solutions. These gateways rely on
configuration, not custom code, allowing developers to
define routing and transformations.
Major cloud providers (AWS, Azure, Google Cloud) offer
integrated solutions within their ecosystems, ideal for cases
using a specific platform. Independent vendors like Kong or
Apigee offer versatile, cross-platform API gateways for
broader compatibility.
However, over-reliance on platform-specific gateways may
pose challenges. They can lock a system into a single
platform, potentially complicating migration. Moreover, some
of these gateways cannot be used in local development,
increasing development and testing complexity.
API gateways excel with simple to moderately complex APIs,
streamlining common tasks like rate limiting, caching,
authentication, and logging. Yet, for interfaces requiring
intricate business logic, complex data transformations,
optimization strategies, or advanced request compositions,
the configuration-centric nature of API gateways may prove
limiting.
Despite offering configuration-driven ease, API gateways
have a learning curve. Efficiently configuring, optimizing, and
managing one requires specialized knowledge. Teams must
adjust their processes for building, deploying, and
maintaining gateways effectively (as shown in Figure 8.2):

Figure 8.2: Exposing external interface via API Gateway

Here are the primary features of API gateways:


Routing: Directs incoming requests to the appropriate
backend service based on pre-defined rules.
Request and response transformation: Adapts the
format, structure, or content of API messages to cater
to the needs of clients or services.
Authentication: Validates the identity of API clients,
typically using tokens, API keys, or other credentials.
Authorization: Determines and enforces what
authenticated clients can or cannot access.
Rate limiting: Restricts the number of requests a
client can make within a specified timeframe.
Error handling: Provides standardized and
informative feedback to clients during exceptions or
service disruptions.
Service aggregation: Combines data or
functionalities from multiple services into unified
responses.
Version management: Supports multiple versions of
an API, ensuring backward compatibility and staged
feature releases.
Security enhancements: Incorporates measures like
SSL/TLS encryption, CORS policies, and threat
detection mechanisms.
Monitoring and logging: Tracks API activity and
performance metrics, enabling debugging, analytics,
and operational oversight.
Circuit breaking: Adds resilience by detecting service
failures and rerouting or halting requests to avert
cascading failures.
Throttling: Controls and shapes API traffic based on
policies, which can be client-specific or global.
Latency reduction: Minimizes response times via
strategies like data compression or adaptive request
processing.
Several off-the-shelf solutions, ranging from cloud-native
offerings to open-source and commercial tools, are available
for API gateway needs. Here's a list of some of the most
popular API gateway solutions:
NGINX: While primarily known as a web server,
NGINX (especially in its Plus version) offers robust API
gateway capabilities.
Apache APISIX: A dynamic, real-time API gateway
from the Apache Foundation, offering extensibility
through plugins and integrating with the CNCF
ecosystem.
AWS API gateway: A managed service offered by
Amazon Web Services for creating, publishing, and
maintaining APIs.
Azure API management: Microsoft's cloud-based
solution offering API publishing, management, and
analytics.
Google Cloud endpoints: A Google Cloud service that
offers API development, deployment, and monitoring.
The API gateway pattern is a foundational component in
microservices architectures, centralizing external access to
various services. Like any architectural pattern, it brings with
it a set of advantages and disadvantages:

Pros:
Aggregates routing, authentication, rate limiting, and
more.
Single entry point simplifies client interactions.
Centralizes authentication, authorization, and threat
detection.

Cons:
Risk of becoming a system choke point without high
availability.
Extra layer may introduce minor overhead.
Might not be ideal for intricate logic or data
transformations.
Facade
The Facade pattern, rooted in object-oriented design,
simplifies complex subsystem access for clients by offering a
unified interface. In microservices, this pattern transforms
into a specialized microservice, presenting a streamlined
external interface for intricate systems (refer to Figure 8.3):

Figure 8.3: Exposing external interface via Facade Microservice

In contrast to API gateways using third-party tools, a Facade is custom-built within the same tech stack and processes as
other microservices. This approach offers exceptional
flexibility, letting developers handle complex scenarios
without third-party constraints. While it demands more
development effort, it yields tailored optimizations, intricate
data transformations, and custom requests for highly
customized systems.
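As a minimal sketch, the Facade below exposes a single external endpoint and aggregates two internal microservices behind it; the internal service URLs and the OrderView shape are illustrative assumptions:

  import java.util.Map;

  import org.springframework.web.bind.annotation.GetMapping;
  import org.springframework.web.bind.annotation.PathVariable;
  import org.springframework.web.bind.annotation.RestController;
  import org.springframework.web.client.RestTemplate;

  @RestController
  public class OrderFacadeController {

      private final RestTemplate rest = new RestTemplate();

      record OrderView(Object order, Object shipment) {} // simplified external view

      @GetMapping("/api/orders/{id}")
      public OrderView getOrder(@PathVariable String id) {
          // Call internal microservices and combine their responses into a single,
          // client-friendly payload.
          Object order = rest.getForObject("http://order-service/orders/" + id, Map.class);
          Object shipment = rest.getForObject("http://shipping-service/shipments/" + id, Map.class);
          return new OrderView(order, shipment);
      }
  }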
Following are the pros and cons:

Pros:
Streamlines client interactions by providing a single
point of access.
Custom-built, allowing tailored solutions.
Designed to grow based on the specific needs of the
system.

Cons:
Requires more effort to implement compared to out-of-
the-box solutions.
As the system grows, maintaining the facade can
become complex.
Concentrates risk, needing robust failover strategies.

Backend for Frontend


The Backend for Frontend (BFF) pattern tailors external
interfaces to match specific frontend requirements, like
mobile apps or web interfaces. Unlike Facades for all clients,
BFF creates dedicated backends for each client or group.
This optimizes interactions, letting developers boost
performance, trim overhead, and swiftly adapt to each
frontend's unique needs (refer to Figure 8.4):

Figure 8.4: Exposing external interface via backend for frontends

Facades and BFF patterns vary in design focus and development roles. Facades, usually created by backend
teams, abstract complex services for a consistent client
interface. Conversely, frontend teams often orchestrate
BFFs, aligning backends with specific frontend requirements.
As a result, a BFF's lifecycle aligns with its corresponding
frontend.
Following are the pros and cons:

Pros:
Tailored backends ensure optimal performance for
specific frontends.
Frontend-aligned lifecycles allow rapid iterations and
releases.
Offers freedom to choose tech stacks suitable for each
frontend's needs.

Cons:
Managing multiple BFFs can complicate backend
infrastructure.
Deploying multiple BFFs might require intricate
orchestration.
Different teams may have varying levels of backend
expertise.

API management
The API management pattern is a sophisticated evolution of
the API gateway pattern, catering specifically to the
comprehensive needs of software vendors, particularly those
offering APIs as services. Beyond the foundational routing,
security, and transformation functionalities of a standard
gateway, API management platforms encompass a broader
ecosystem designed to nurture and support external
developers (refer to Figure 8.5):
Figure 8.5: Exposing external interface via API management platform

Features often include developer portals, API documentation, analytics dashboards, developer tools, and billing
capabilities.
There are several leading API management solutions
available in the market, catering to various business needs
and scales. Some of the prominent ones include:
Apigee (Google Cloud): A comprehensive platform
offering API analytics, developer portal, and lifecycle
management. Google Cloud's acquisition of Apigee has
integrated it seamlessly with the GCP ecosystem.
MuleSoft anypoint platform: Acquired by Salesforce,
MuleSoft offers API management alongside a range of
integration services.
3scale (Red Hat): Acquired by Red Hat, 3scale offers
API traffic control, security policies, and developer
portal capabilities.
TIBCO Cloud Mashery: Offers API design, packaging,
testing, and analytics with a strong focus on API
strategy and monetization.
Following are the pros and cons:
Pros:
Unified dashboard for management, monitoring, and
analytics.
Integrated authentication, authorization, and threat
detection.
Dedicated portals, documentation, and tools for
developers.

Cons:
Can introduce additional complexity in deployment and
management.
Premium features and scalability might lead to high
costs.
Tied to specific platforms or vendors.

Synchronous request/response
In the realm of external API communication, the
request/response model is fundamental. This synchronous
pattern involves a client sending a request and awaiting an
immediate server response, providing a clear and
predictable data exchange. As businesses rely more on real-
time feedback and instant data retrieval, understanding this
mechanism becomes crucial.

Problem
In the evolving world of external API development, a crucial
challenge arises: How to serve diverse clients while ensuring
stability and consistent performance? As services grow and
client ecosystems diversify, APIs must rely on widely
accepted internet standards. These standards should
promote interoperability, include versioning, and robust
security. Balancing these needs while maintaining
synchronous request/response flow is a key concern for
modern API-centric systems (refer to Figure 8.6):

Figure 8.6: Request/response interactions in external APIs

The tenets governing request/response interactions in external APIs bear a resemblance to those guiding
synchronous communications within internal microservices.
However, certain requirements, which might be discretionary
for internal exchanges, become indispensable when
orchestrating external API interactions, they are:
Standard protocol adoption: Utilize widely accepted
communication protocols, like HTTP/HTTPS.
Interoperability: Ensure compatibility across various
platforms, languages, and systems.
Versioning: Implement version control mechanisms to
support backward and forward compatibility.
Data format standards: Adopt universally recognized
data formats, such as JSON or XML.
Error handling: Provide clear, informative error
messages with standardized error codes.
Security: Incorporate robust encryption,
authentication, and authorization mechanisms.
Documentation: Provide comprehensive, up-to-date
API documentation for developers.
Scalability: Design the API to handle an increasing
number of requests without performance degradation.
Resilience: Implement strategies to handle failures
gracefully and maintain availability.

HTTP/REST
The HTTP/REST protocol, widely used for internal
microservices communication, is the primary standard for
external APIs. Its dominance in web development
emphasizes its intuitive and standardized nature, ensuring
consistency for API developers and consumers. Adhering
strictly to REST is crucial for external APIs, emphasizing
uniformity and predictability.
REST architecture revolves around treating entities as
resources, each uniquely identifiable through a Uniform
Resource Locator (URL). This principle keeps your API
organized and intuitive. For instance, a user resource could
have a URL like /users for a collection of users or /users/123 for
a specific user with ID 123.
Every RESTful service is built upon the foundation of
standard HTTP verbs. These verbs indicate the type of action
you wish to perform on a resource:
GET: Fetch data from a specified resource.
POST: Add a new resource.
PUT/PATCH: Update an existing resource. While PUT
typically updates the whole resource, PATCH modifies
only specific parts.
DELETE: Remove the specified resource.
One of the strengths of the HTTP protocol is its extensive list
of status codes, each offering a glimpse into the result of an
operation. They are divided into five classes:
1xx informational response – the request was received,
continuing process
2xx successful – the request was successfully received,
understood, and accepted
3xx redirection – further action needs to be taken in
order to complete the request
4xx client error – the request contains bad syntax or
cannot be fulfilled
5xx server error – the server failed to fulfil an
apparently valid request
Some examples include:
200 OK: Successfully processed the request.
201 Created: Successfully created a new resource.
404 Not Found: The resource was not found.
Documentation is the bridge between your API and its
consumers. OpenAPI, previously known as Swagger, provides
a structured way to describe RESTful services. With OpenAPI,
developers can auto-generate interactive documentation,
derive client SDKs, and conduct API testing.
A secure API is paramount in today's digital world. Here are
the standard security mechanisms for HTTP/REST:
HTTPS: Guarantees encrypted communication,
safeguarding data in transit.
Authentication: Often achieved through tokens (such
as JWT) or API keys.
Authorization: Ensures that clients access only
permissible resources, typically through role-based
access controls. For example, OAuth, a widely adopted
authorization mechanism, allows for third-party apps
controlled access to user resources without directly
exposing user credentials.
In the example below, a client is sending an HTTP GET
request to retrieve information about a product with the ID of
123. The server then responds with a 200 OK status and
provides the requested product details in JSON format (Code snippets 8.1 and 8.2).
Request (Code snippet 8.1):
1. GET /products/123 HTTP/1.1
2. Host: example.com
3. Authorization: Bearer <token>
4. Accept: application/json
Response (Code snippet 8.2):
1. HTTP/1.1 200 OK
2. Content-Type: application/json
3.
4. {"id": "123", "name": "Laptop", "price": 999.99}
For more information on implementation of HTTP/REST
services, refer to Chapter 5, Implementing Communication.

GraphQL
GraphQL, an API query language, has gained traction as an
alternative to the conventional HTTP/REST method for
external API development. In contrast to REST's rigid
structure, GraphQL offers flexibility, allowing clients to define
the response structure they require, potentially streamlining
data retrieval.
Initially conceived by Facebook in 2012 to address
inefficiencies in their REST API for mobile apps, GraphQL
became an open-source project in 2015. Its adoption has
surged in response to its benefits and the evolving demands
of modern applications.
While HTTP/REST has been the de facto standard for APIs, it
is not without its shortcomings, some of which GraphQL aims
to address:
Over-fetching and under-fetching: In REST, fixed
data structures can lead to over-fetching or under-
fetching of information. GraphQL solves this by
enabling clients to request precisely the data they
require.
Multiple requests: REST often necessitates multiple
requests to access related resources, whereas
GraphQL streamlines this by allowing a single query to
retrieve all needed data, reducing overhead.
Versioning: With REST, changes to the API often
result in new versions, which can be challenging to
manage. GraphQL avoids this by having a flexible
schema that allows for additive changes without
breaking existing queries.
Rapid iteration and frontend independence: With
GraphQL, front-end teams can iterate more quickly and
independently as they can fetch exactly what they need
without relying on backend changes for every
modification.
This example of GraphQL request/response is similar to the
one presented for the HTTP/REST protocol. The client sends a
POST request containing a GraphQL query to fetch details of
a product with the ID of 123. The server then responds with a 200 OK status and provides the requested product details in a nested structure under the data field in JSON format.
Request (Code snippet 8.3):
1. POST /graphql HTTP/1.1
2. Host: example.com
3. Authorization: Bearer <token>
4. Content-Type: application/json
5. Accept: application/json
6.
7. {
8. "query": "
9. {
10. product(id: 123) {
11. id
12. name
13. price
14. }
15. }
16. "
17. }
Response (Code snippet 8.4):
1. HTTP/1.1 200 OK
2. Content-Type: application/json
3.
4. {
5. "data": {
6. "product": {
7. "id": "123",
8. "name": "Laptop",
9. "price": 999.99
10. }
11. }
12. }
GraphQL Federation combines multiple distinct GraphQL
services into a single API. In microservices, it unifies data
without merging schemas, enabling independent
microservice development. This offers consumers a
consolidated view, connecting diverse data sources while
preserving microservice independence and scalability.
Here is a concise list of Java technologies commonly used to
implement GraphQL-based external APIs:
GraphQL Java: The main Java library for GraphQL that
provides tools for schema definition and query
execution.
Spring Boot GraphQL: Facilitates the integration of
GraphQL Java with the Spring Boot framework,
simplifying the creation of GraphQL services in Spring
applications.
GraphQL SPQR: Generates GraphQL schemas based
on Java code, using annotations to define schema
objects and operations.
DGS (Domain Graph Service) Framework: A
GraphQL server framework by Netflix, specifically
tailored for Spring Boot applications, emphasizing type
safety and ease of testing.
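For illustration, a resolver built with Spring Boot GraphQL (spring-boot-starter-graphql) might look like the sketch below; it assumes a schema declaring type Query { product(id: ID!): Product }, and the in-memory Product record is purely illustrative:

  import org.springframework.graphql.data.method.annotation.Argument;
  import org.springframework.graphql.data.method.annotation.QueryMapping;
  import org.springframework.stereotype.Controller;

  @Controller
  public class ProductGraphQLController {

      record Product(String id, String name, double price) {} // illustrative type

      @QueryMapping // resolves the "product" field of the Query type
      public Product product(@Argument String id) {
          // A real resolver would delegate to business logic or another microservice.
          return new Product(id, "Laptop", 999.99);
      }
  }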
Following are the pros and cons:
Pros:
Clients specify exactly what data they need.
Only requested data is sent.
Evolutionary, not versioned. Additions do not break
existing queries.

Cons:
Steeper learning curve than REST.
Traditional HTTP caching might not work out-of-the-
box.
Makes general error tracking harder.

Push notifications and callbacks


Push notifications and callbacks deliver data without direct
requests, crucial for timely updates in areas like stock
trading, real-time analytics, and gaming.

Problem
Supporting diverse clients while ensuring stability is a
challenge in external APIs, particularly with push
notifications and callbacks. These APIs require widely
accepted standards, robust versioning, and robust security.
Seamless interactions across platforms demand a solution
harmonizing these strict demands (refer to Figure 8.7):
Figure 8.7: Push notifications and callbacks in external APIs

Horizontal scalability, while instrumental in accommodating growing loads in distributed systems, introduces unique
challenges to asynchronous APIs. With several instances of
external interfaces running concurrently, a client might
establish a connection with any given instance.
In this dynamic setting, routing presents challenges. When
push notifications or callbacks trigger, the system must
intelligently direct messages to the client's current instance.
System resilience is crucial during instance or client failures,
requiring quick failure detection and subscription
reestablishment to maintain a functional routing fabric.

Webhooks
Webhooks enhance asynchronous external API
communication, especially with publicly accessible, stable
endpoints. Unlike the typical client-server model, webhooks
make the external interface act as an HTTP client, initiating
HTTP/REST calls to registered external endpoints when
specific events occur in the system.
Essentially, webhooks hook onto events, triggering outbound
HTTP calls to inform external systems of occurrences. This
departure from traditional flow enables real-time
notifications, eliminating the need for constant polling and
improving efficiency and timeliness.
A webhook process flow can be described in a series of
systematic steps:
1. Endpoint Setup: Before anything else, the receiving
system (subscriber) must set up an endpoint to listen
for incoming webhook requests. This endpoint is a
specific URL that can process POST requests.
2. Webhook registration: The subscriber then registers
its endpoint with the sending system (publisher). This
registration tells the publisher where to send event
notifications. Often, the subscriber will also specify
which specific events it is interested in.
3. Event occurrence: Within the publisher's system,
certain events are monitored. When one of these
monitored events occurs, it triggers the webhook
mechanism.
4. Event notification: The publisher packages details
about the event into an HTTP POST request payload,
typically structured as JSON or XML. This request is
then sent to the registered endpoint of the subscriber.
5. Subscriber response: Upon receiving the POST
request, the subscriber system processes the data. It
then sends back an HTTP status code to acknowledge
receipt.
6. Payload processing: After acknowledging the
webhook, the subscriber decodes and processes the
data in the payload as required. This could involve
updating databases, alerting users, triggering other
processes, and so on.
7. Error handling: If the webhook POST request fails (for
example, because the subscriber's endpoint is down),
the publisher often has retry logic. Depending on the
configuration, the publisher might attempt to resend
the data after a set period or use an exponential
backoff strategy.
8. Deregistration/Updates: If the subscriber no longer
wants to receive certain event notifications or if the
endpoint changes, it must update or deregister the
webhook with the publisher.
Webhook security is vital because webhooks communicate with external systems.
Security measures include HTTPS for endpoint URL
protection, ensuring data confidentiality in transit. Data
authenticity is verified with HMAC, where a unique hash,
created using a shared secret key, is sent with the message.
The recipient validates authenticity by recomputing the hash
with the shared key. Additional security measures include IP
whitelisting and recipient acknowledgment for enhanced
security.
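As an illustration of HMAC verification on the subscriber side, the following sketch recomputes the signature over the raw payload and compares it with the value received in a header. The header name (X-Webhook-Signature), hex encoding, and HmacSHA256 algorithm are assumptions; real providers document their own conventions. It uses only the JDK (Java 17+ for HexFormat):

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

public class WebhookSignatureVerifier {

    private final byte[] sharedSecret;

    public WebhookSignatureVerifier(String sharedSecret) {
        this.sharedSecret = sharedSecret.getBytes(StandardCharsets.UTF_8);
    }

    // Recomputes the HMAC over the raw payload and compares it, in constant time,
    // with the signature carried in the (assumed) X-Webhook-Signature header.
    public boolean isValid(String payload, String signatureHeaderHex) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
        byte[] expected = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        byte[] received = HexFormat.of().parseHex(signatureHeaderHex);
        return MessageDigest.isEqual(expected, received);
    }
}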
Following are the pros and cons:

Pros:
Immediate event notifications.
Often easier to implement than complex protocols.
Can be tailored to specific event interests.

Cons:
Risk of exposing endpoints.
Handling failures can be complex.
Relies on the subscriber's endpoint availability.

WebSockets
WebSockets provide a full-duplex communication channel
over a single, long-lived connection, making them a popular
choice for implementing real-time functionality in web
applications, such as push notifications and live updates.
Applications of WebSockets for push notifications and
updates in async external APIs include:
Persistent connection: Once the server accepts the
upgrade, a full-duplex communication channel is
established. This connection remains open, allowing
data to be sent in both directions as frames without the
overhead of re-establishing connections.
Push notifications: With the persistent connection,
the server can now send real-time notifications to the
client as soon as an event occurs. For example, if a new
message is posted in a chat application, the server can
instantly push this message to all connected clients.
Live updates: Similarly, WebSockets can be used to
send live data updates. This is especially useful for
applications like online games, financial trading
platforms, or any application where data is frequently
updated.
Security: WebSockets can be secured using the wss://
protocol, which establishes a connection similar to
HTTPS. Additionally, proper authentication and
authorization mechanisms should be in place to ensure
that only valid clients can connect.
Integration with other systems: When using
WebSockets in async external APIs, there may be
situations where the data being pushed to clients
originates from other internal systems. Proper
integration and data flow management are crucial in
such cases to ensure real-time updates.
Java provides several technologies and libraries to
implement WebSocket functionality in applications. Here is a
list of notable Java technologies to implement WebSockets:
Java API for WebSocket (JSR 356): This is the
standard API introduced in Java EE7 to build
WebSocket-driven applications. It offers both
annotated and programmatic means to create
WebSocket endpoints.
Spring WebSocket: Spring offers comprehensive
WebSocket support in its portfolio. The Spring
WebSocket module provides the necessary features for
WebSocket-based applications, handling messaging,
security, and more. It also provides a fallback option
for browsers that do not support WebSocket.
Vert.x: Though not strictly a WebSocket-only library,
Vert.x is a toolkit for building reactive applications on
the JVM. It offers robust support for WebSockets and
can handle a large number of concurrent connections.
Atmosphere: Atmosphere is a framework for building
asynchronous web applications with support for
WebSockets, Server-Sent Events (SSE), and long
polling. It is designed to handle the complexities of
different browser behaviors and fallbacks.
Tyrus: Tyrus is the reference implementation of the
Java API for WebSocket (JSR 356). It provides a
straightforward way to create and deploy WebSocket
endpoints.
SockJS: While not exclusively a Java technology,
SockJS is a browser JavaScript library that provides a
WebSocket-like object. The Spring Framework provides
a SockJS server-side counterpart to enable fallback
options when WebSocket is not available.
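For illustration, here is a minimal sketch of a JSR 356 annotated server endpoint that tracks connected clients and pushes messages to them. The /notifications path and the broadcast method are assumptions, and on pre-Jakarta stacks the imports would be javax.websocket rather than jakarta.websocket:

import jakarta.websocket.OnClose;
import jakarta.websocket.OnOpen;
import jakarta.websocket.Session;
import jakarta.websocket.server.ServerEndpoint;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

@ServerEndpoint("/notifications")
public class NotificationEndpoint {

    // Sessions of all currently connected clients.
    private static final Set<Session> sessions = new CopyOnWriteArraySet<>();

    @OnOpen
    public void onOpen(Session session) {
        sessions.add(session);
    }

    @OnClose
    public void onClose(Session session) {
        sessions.remove(session);
    }

    // Called by application code when an event occurs, pushing it to every client.
    public static void broadcast(String message) {
        for (Session session : sessions) {
            if (session.isOpen()) {
                try {
                    session.getBasicRemote().sendText(message);
                } catch (IOException e) {
                    sessions.remove(session);
                }
            }
        }
    }
}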
Following are the pros and cons:
Pros:
Real-time interactions are ideal for live chats, gaming,
and real-time feeds.
Supported by all modern browsers and many server
platforms.
Libraries like SockJS offer fallback options for
unsupported environments.

Cons:
More complex to implement than traditional request-
response.
If a connection drops, reconnection must be handled
explicitly.
Older browsers may not support WebSockets.

Authentication
Authentication is crucial in external APIs, confirming the
caller's identity, be it a user or a service. Using the provided credentials, the system verifies the caller's legitimacy, enabling interaction monitoring and determination of access rights. In the diverse
client landscape, a secure authentication mechanism is
fundamental for system security, integrity, and proper
operation.

Problem
External APIs are vital entry points, often exposing sensitive
data. Verifying caller identity is crucial for data security and
system integrity. This challenge grows with diverse clients,
each with unique security needs.
Authentication is key for system security, ensuring only
authorized entities access it. Yet, the authentication process
can face security threats, including:
Brute force attacks: Attackers use trial and error to
guess authentication credentials. Systems without rate-
limiting or account lockout features are particularly
vulnerable.
Phishing: Attackers deceive users into providing their
authentication credentials. This is usually achieved by
mimicking legitimate websites or communication
channels.
Man-in-the-middle (MitM) attacks: Attackers
intercept communication between the user and the
authentication system to steal credentials or
manipulate authentication data.
Social engineering: Tricking individuals into
revealing their credentials through manipulative
interactions.
To counteract these threats, robust security measures, like
multi-factor authentication, encryption, continuous
monitoring, and user education, should be implemented.

Basic authentication
Basic authentication is one of the simplest methods used for
HTTP authentication in external APIs. In this method, a client
sends a combination of a username and password with each
HTTP request. The combination is base64 encoded (not
encrypted) and sent in the HTTP header (as shown in Figure
8.8):
Figure 8.8: Basic authentication dialog in a web browser

How it works:
1. Request: The client sends an HTTP request to the
server.
2. Server response: If no authentication header is
present, or if the server does not recognize the
credentials, it responds with a 401 Unauthorized
status code and a WWW-Authenticate: Basic header.
3. Client response: The client resends the HTTP request
with an Authorization header containing the word
Basic followed by a space and a base64 encoded string
username:password.

4. Server verification: The server decodes the base64 string to retrieve the username and password, verifies
them against its data store, and then either grants or
denies access based on the validity of the credentials.
While Basic Authentication is straightforward and easy to
implement, its security concerns mean it is often used in
conjunction with other security practices or replaced by more
secure authentication methods when dealing with external
APIs.
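On the client side, Basic authentication amounts to adding a single header. The sketch below uses Java's built-in HTTP client (Java 11+); the URL and credentials are placeholders:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class BasicAuthClient {
    public static void main(String[] args) throws Exception {
        // Base64-encoded (not encrypted) credentials, so HTTPS is essential.
        String credentials = Base64.getEncoder()
                .encodeToString("username:password".getBytes());

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/orders"))
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}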
Following are the pros and cons:

Pros:
Simple to implement.
Widely supported in HTTP clients/servers.
No extra libraries are needed.

Cons:
Not encrypted, only base64 encoded.
Performance overhead for frequent verifications.
Lacks advanced security features.

API key
API key authentication is a method where the client
application includes a unique and secret key with every
request to the server. This key is pre-generated on the server
and shared with the client, either upon application
registration or through some other secure method (refer to
Figure 8.9):

Figure 8.9: API Key generation page in AWS API Gateway


When the server receives a request with an API key, it
checks the validity of the key. If valid, the request proceeds;
otherwise, it is rejected. The key helps identify the calling
application and, optionally, might provide information about
privileges, rate limits, and other access controls.

Typical flow:
1. Registration: A developer or client application
registers with the service provider and receives a
unique API key.
2. Request: For every API call, the client sends this key,
typically in the request header (for example, `x-api-
key`).
3. Validation: The server receives the request, validates
the API key, and processes the request if the key is
valid.
4. Response: The server then sends back the appropriate
response, whether it is the requested data for valid
keys or an error for invalid keys.
API key authentication is often combined with other security
measures like HTTPS to prevent eavesdropping.
Security mechanisms may also include restrictions such as IP
whitelisting or setting expiration dates for keys. In some
systems, different keys may have different access levels or
rate limits associated with them.
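On the server side, API key validation can be implemented as a request filter. The following sketch assumes Spring Boot 3 (jakarta.servlet imports), the x-api-key header mentioned above, and a hypothetical in-memory key set standing in for a real key store:

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
import java.util.Set;

@Component
public class ApiKeyFilter extends OncePerRequestFilter {

    // In production these would come from a database or secret store.
    private static final Set<String> VALID_KEYS = Set.of("demo-key-123");

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain)
            throws ServletException, IOException {
        String apiKey = request.getHeader("x-api-key");
        if (apiKey == null || !VALID_KEYS.contains(apiKey)) {
            // Reject the request before it reaches any controller.
            response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "Invalid API key");
            return;
        }
        filterChain.doFilter(request, response);
    }
}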
Following are the pros and cons:

Pros:
Easy to set up and integrate.
Quick validation process with minimal overhead.
Can be used in conjunction with other authentication
methods.
Cons:
Not inherently secure. If exposed, anyone with the key
has access.
Risk of exposure if sent over non-encrypted channels.
Does not inherently distinguish between different end-
users using the same client application.

OpenID Connect
OpenID Connect (OIDC) is an identity layer built on top of
the OAuth 2.0 protocol, providing a way for clients to verify
the identity of the end-user and obtain basic profile
information about them in a standardized manner. It is
commonly used in external APIs to facilitate user
authentication, particularly in single sign-on (SSO)
scenarios (as shown in Figure 8.10):

Figure 8.10: Sign in with Google OpenID

OIDC is based on OAuth 2.0 and offers several authentication flows, with the Authorization Code Flow being the most common. The Implicit Flow and Hybrid Flow cater to different client types and scenarios. See the OAuth 2 authorization pattern later in this chapter for details.
Security mechanisms include:
Nonce parameter: To prevent replay attacks.
PKCE (Proof Key for Code Exchange): To enhance
security for mobile and native applications.
ID token validation: To ensure the token's integrity
and authenticity.
OIDC has gained significant traction, with several renowned
providers offering its implementation. Some notable OIDC
providers are:
Google Identity Platform: Allows developers to
integrate Google's OAuth 2.0 authentication system,
which supports OIDC.
Facebook: While primarily known for its OAuth 2.0
implementation, Facebook's login system shares many
similarities with OIDC. Developers can utilize
Facebook's authentication mechanism for their
applications.
Microsoft Azure Active Directory (Azure AD):
Microsoft's cloud-based identity service which supports
OIDC for web, desktop, and mobile apps.
Amazon Cognito: Enables developers to establish a
secure user directory with integrated OIDC support.
GitHub: Developers can use GitHub's OAuth-based
system, which, similar to Facebook, resembles OIDC
functionalities.
Twitter: Like Facebook and GitHub, Twitter's
authentication is more OAuth-focused but shares
several characteristics with OIDC.
It is important to note that while OIDC is built on top of the
OAuth 2.0 protocol, it serves different purposes. OIDC
specifically addresses authentication, while OAuth 2.0 is
more about delegation and authorization. However, many
providers, like Facebook, GitHub, and Twitter, are often
colloquially associated with OIDC due to the similarity in
their authentication flows, even if they are more rooted in
OAuth.
Java offers several technologies and libraries to implement
and integrate OIDC in applications. Here is a list of popular
Java technologies and libraries to facilitate OpenID Connect
authentication:
Spring Security OAuth: An extension of the Spring
Security project that handles integration with OAuth
2.0 and OpenID Connect.
Keycloak: An open-source identity and access
management (IAM) solution that supports OIDC out
of the box.
Pac4j: A comprehensive security library for Java which
supports authentication for numerous protocols,
including OIDC.
MitreID Connect: An open-source OpenID Connect
reference implementation in Java, developed by MITRE
Corporation. It is built on Spring Boot.
Connect2id server: A commercial product that offers
an OpenID Connect server. They provide SDKs for Java
to facilitate the integration process.
JBoss AeroGear: Provides a range of libraries for
mobile and web development, including support for
OpenID Connect on the server side.
Nimbus JOSE + JWT: A popular library for encoding,
decoding, and verifying JSON Web Tokens (JWT).
JWT is an essential component of the OpenID Connect
protocol.
OIDC Java Spring Boot Starter: A ready-to-use
package that simplifies the integration of OIDC into
Spring Boot applications.
OAuth 2.0 and OpenID Connect SDK for Java: A
generic SDK for adding OIDC support to Java
applications, not tied to any specific framework.
Apache Oltu: While it is more focused on OAuth 2.0, it
can be used as a foundation for an OpenID Connect
implementation.
When choosing a technology or library, it is important to
consider factors like the application's existing stack,
developer familiarity with the technology, and the level of
community or commercial support available.
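As a minimal illustration of the Spring route, the sketch below assumes the spring-boot-starter-oauth2-client dependency and a client registration (client ID, client secret, provider) declared in application.yml; enabling oauth2Login() switches on the OIDC Authorization Code Flow with Spring Security's defaults:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class OidcLoginConfig {

    // Requires every request to be authenticated and enables the OIDC login flow;
    // unauthenticated users are redirected to the configured identity provider.
    @Bean
    public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(authorize -> authorize.anyRequest().authenticated())
            .oauth2Login(Customizer.withDefaults());
        return http.build();
    }
}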
Following are the pros and cons:

Pros:
Widely accepted and recognized protocol built on top
of OAuth 2.0.
Supports various client types, including mobile and
JavaScript clients.
Allows third-party login (for example, Google,
Facebook) reducing the need for password
management.

Cons:
Might be overkill for simple use cases or smaller
systems.
Takes time to understand, especially if unfamiliar with
OAuth 2.0.
Relying on third-party identity providers could pose
risks or outages.

Multi-factor authentication
Multi-factor authentication (MFA), often referred to as
two-factor authentication (2FA) when it involves two
steps, is a security enhancement that requires users to
present two or more verification factors to gain access to a
resource, such as an external API. The primary principle
behind MFA is that a user provides multiple forms of
identification to ensure a more robust verification process
(refer to Figure 8.11):

Figure 8.11: Multi-factor authentication using a personal mobile device

For external APIs, MFA is especially critical due to the exposure these interfaces have to the external world. Here is
how it commonly works:
Initial registration: The user first registers their
additional authentication methods (for example, a
mobile number to receive SMS codes or a biometric
method).
API call initiation: When a user or a system wants to
access a secured endpoint of the external API, they
provide their primary authentication method (for
example, a password or API key).
MFA challenge: After the initial authentication is
verified, the system challenges the user for a second
form of authentication. This could be a code sent via
SMS, an authentication prompt from a mobile app, or a
request for a biometric verification.
Access granted: Once the system successfully verifies
both authentication methods, the user is granted
access to the API, and the desired operation is allowed.
Some of the most common MFA methods include:
SMS text messages: Users receive a one-time code
via an SMS text message, which they must enter to
authenticate.
Authentication applications: Apps like Google
Authenticator, Authy, or Microsoft Authenticator
generate time-based one-time passwords (TOTPs)
for users to input.
Hardware tokens: Physical devices, often key fobs,
that generate secure codes at timed intervals.
Email-based codes: One-time codes sent to the user's
registered email address.
Biometrics: Utilizes the user's unique biological
characteristics, such as fingerprints, facial recognition,
retina scans, or voice recognition.
Smart cards: Physical cards with embedded chips that
store cryptographic information, which can be
combined with a PIN for additional security.
USB security keys: Devices like YubiKey can be
inserted into a USB port to serve as an authentication
factor.
Backup codes: Pre-generated one-time use codes that
users can print or write down and keep for situations
when their primary MFA method is unavailable.
When implementing MFA for external APIs using Java, there
are several technologies and libraries available. Here is a list
of some widely used ones:
Spring Security: One of the most popular security
frameworks for Java applications. Spring Security
supports MFA through its numerous extensions and
community plugins.
Apache Shiro: Another security framework that can be
configured to support MFA with custom modules.
Keycloak: An open-source identity and access
management solution that provides out-of-the-box
support for MFA.
Java Authentication and Authorization Service
(JAAS): While JAAS is more general-purpose, it can be
extended to support MFA.
Yubico's Java libraries: For integrating with YubiKey
hardware tokens.
Google Authenticator: While it is primarily an app,
there are Java libraries available (like otplib-java) that
can help integrate its TOTP generation mechanism into
an application.
FreeOTP: A free and open-source soft token solution
that also provides Java libraries for integration.
JavaMail: For email-based MFA, you will often need a
way to send emails.
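To show what authenticator-app verification involves, here is a minimal RFC 6238 TOTP sketch using only the JDK. A production system would typically rely on one of the libraries above and Base32-decoded shared secrets; the 30-second step, 6-digit codes, and one-window clock drift tolerance are common defaults, not requirements:

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.time.Instant;

public class TotpVerifier {

    // Generates the 6-digit code for a given 30-second time window.
    static String generateCode(byte[] secret, long timeWindow) throws Exception {
        byte[] counter = ByteBuffer.allocate(8).putLong(timeWindow).array();
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        byte[] hash = mac.doFinal(counter);

        int offset = hash[hash.length - 1] & 0x0F;            // dynamic truncation
        int binary = ((hash[offset] & 0x7F) << 24)
                | ((hash[offset + 1] & 0xFF) << 16)
                | ((hash[offset + 2] & 0xFF) << 8)
                | (hash[offset + 3] & 0xFF);
        return String.format("%06d", binary % 1_000_000);
    }

    // Accepts the current window plus one window of clock drift in each direction.
    public static boolean verify(byte[] secret, String submittedCode) throws Exception {
        long window = Instant.now().getEpochSecond() / 30;
        for (long w = window - 1; w <= window + 1; w++) {
            if (generateCode(secret, w).equals(submittedCode)) {
                return true;
            }
        }
        return false;
    }
}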
Following are the pros and cons:

Pros:
Meets requirements for certain regulations (for
example, GDPR, HIPAA).
Increases user trust in the platform's security
measures.
Offers various authentication methods to cater to user
preferences and security levels.

Cons:
Additional steps can frustrate users.
Can be challenging to implement correctly.
Implementing robust MFA may have associated costs.

Authorization
In external APIs, once caller identity is confirmed through
authentication, the crucial next step is authorization. This
process ensures that users or systems access only permitted
actions or resources, upholding system security and integrity.
Authorization sets user boundaries and offers a structured
means to manage permissions based on roles, policies, or
criteria.

Problem
In external APIs, after authentication, restricting callers to
authorized actions or resources is vital. This ensures data
privacy, system integrity, smooth user experiences, and
scalability. Establishing a strong, adaptable authorization
mechanism presents challenges. Various methodologies to
consider include:
Role-based authorization: Here, permissions are
assigned to specific roles, and users or entities are
assigned these roles. For instance, an admin role might
have broader access compared to a user role. While it
simplifies management, it may lack granularity in
complex systems.
Permission-based authorization: In this approach,
specific permissions are granted directly to users or
entities, allowing for more granular control. It can
cater to complex scenarios but might become
cumbersome to manage at scale.
Attribute-based authorization: Decisions are made
based on attributes of the user, the resource being
accessed, and other contextual factors. This dynamic
method can be highly adaptive but also requires
sophisticated logic.
Policy-based authorization: Defined policies, often in
a declarative manner, dictate the access control. These
policies can be comprehensive and context-aware but
need regular updating and auditing.
Addressing the issue of authorization in external APIs
requires an understanding of the system's unique needs,
potential growth, user base, and security requirements.
Choosing the right mix of these methods, ensuring
scalability, and maintaining ease of management becomes
crucial to the successful and secure operation of the API
ecosystem.

Session tracking
Session tracking authorization, commonly used in web apps,
manages user access during an active session. This
technique, linked to stateful web apps, also matters in
microservices, balancing statefulness and statelessness.
In systems, a key challenge is preserving a user's state
across stateless HTTP requests. Session tracking
authorization tackles this by creating a session post-login,
housing user-specific data (identity, roles, permissions). It is
vital for user experience and security throughout their app
interaction.

How it works:
1. Login: When a user logs in, the system validates their
credentials.
2. Session creation: On successful login, the system
creates a session, typically represented by a unique
session ID.
3. Session storage: This session ID can be stored in
various ways - as a cookie on the client's browser, a
URL parameter, or even in the page itself as a hidden
field.
4. Subsequent requests: For subsequent requests, the
system retrieves the session ID and fetches the
corresponding session data to determine user identity
and permissions.
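The steps above can be sketched with the servlet HttpSession, which Spring backs with a JSESSIONID cookie by default. The credential check, attribute names, and roles below are placeholders, and the jakarta.servlet imports assume Spring Boot 3:

import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpSession;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SessionAuthController {

    @PostMapping("/login")
    public ResponseEntity<String> login(@RequestParam String user,
                                        @RequestParam String password,
                                        HttpServletRequest request) {
        if (!"secret".equals(password)) {                   // placeholder validation
            return ResponseEntity.status(401).body("Invalid credentials");
        }
        HttpSession session = request.getSession(true);     // session creation
        session.setAttribute("user", user);                 // session storage
        session.setAttribute("roles", "USER");
        return ResponseEntity.ok("Logged in");
    }

    @GetMapping("/profile")
    public ResponseEntity<String> profile(HttpServletRequest request) {
        HttpSession session = request.getSession(false);    // subsequent requests
        if (session == null || session.getAttribute("user") == null) {
            return ResponseEntity.status(401).body("Not authenticated");
        }
        return ResponseEntity.ok("Hello, " + session.getAttribute("user"));
    }
}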
In microservices, session tracking presents challenges. If a
session is in one service instance, and the next request goes
to a different one (due to load balancing), session data may
not be readily accessible. Solutions like centralized session
stores or sticky sessions (which route requests to the same
service instance) can tackle this. Alternatively, including
session data in each request ensures constant access but
may raise security and performance concerns.

Security considerations:
Session hijacking: If a malicious actor gains access to
a session ID, they can impersonate the user.
Techniques like regenerating session IDs post-login or
after a set duration can mitigate this risk.
Session timeout: Sessions should have an expiration
time to reduce the window of opportunity for
unauthorized access.
Secure storage: If session data contains sensitive
information, it should be encrypted.
In modern architectures, especially with the rise of stateless
APIs and microservices, session tracking has evolved.
Techniques like token-based authentication (for example,
JWT) often replace or complement traditional session-based
approaches, offering more scalability and ease of use across
distributed systems.
Following are the pros and cons:

Pros:
Consistent user experience across requests.
Centralized control over user sessions.
Well-understood, widely adopted mechanism.

Cons:
Potential for session hijacking.
Scalability challenges in distributed systems.
Centralized session store can be a bottleneck.

JWT token
JSON Web Token (JWT) is a compact, URL-safe means of
representing claims to be transferred between two parties.
The claims in a JWT are encoded as a JSON object that is
used as the payload of a JSON Web Signature (JWS)
structure or as the plaintext of a JSON Web Encryption
(JWE) structure, enabling the claims to be digitally signed or
integrity protected with a Message Authentication Code
(MAC) and/or encrypted (as shown in Figure 8.12):
Figure 8.12: Structure of JWT token

When JWTs are utilized for authorization in external APIs, consider the following:
Issuance: After successful authentication, a JWT is
issued to the client. This token contains a set of claims,
which typically include the user's identity and
roles/permissions, among other metadata.
Token structure: A JWT typically consists of three
parts: Header, payload, and signature. The header
specifies the algorithm used for the signature. The
payload contains the claims and other metadata. The
signature ensures the token's integrity.
Stateless verification: On receiving a request with a
JWT, the server can verify the token's validity without
needing to maintain a session state. This is done using
the signature and a secret key or a public key,
depending on the signing algorithm used.
Authorization: Once the token is verified, claims
within the JWT, like roles or permissions, are used to
determine whether the user has the right to access the
requested resource or perform a specific action.
Expiration: JWTs often include an expiration claim
(exp). This makes them self-contained regarding
validity, reducing the risk of long-lived sessions. Once
expired, the token needs to be refreshed.
Security: JWTs can be signed and optionally
encrypted, ensuring data integrity and confidentiality.
However, the payload of a JWT is merely encoded and
can be decoded by anyone who has the token. Thus,
sensitive data should not be stored directly in a JWT
unless it is encrypted.
When it comes to implementing JWT tokens in Java, there are
several popular libraries and frameworks that developers can
leverage:
Java JWT (JJWT): This is one of the most popular
libraries for creating and verifying JSON Web Tokens
(JWT) in Java. Its fluent API and comprehensive
feature set make it a go-to choice for many Java
developers.
Auth0 Java JWT: Auth0 provides a JWT library for
Java that is intuitive and easy to use. It is not only
limited to Auth0's service but can be used with any JWT
tokens.
Spring Security JWT: If you are working within the
Spring ecosystem, Spring Security provides support for
JWT. This integrates smoothly with Spring Security's
authentication and authorization mechanisms.
Keycloak: This is an open-source identity and access
management solution that supports JWT and OpenID
Connect (OIDC). It is especially useful if you are
looking for a comprehensive solution that includes user
management, single sign-on, and more.
Nimbus JOSE+JWT: This library provides robust
support for creating, parsing, and verifying JWTs. It
also supports other JavaScript Object Signing and
Encryption (JOSE) structures like JWS and JWE.
MicroProfile JWT: For developers working with
MicroProfile (especially in the context of
microservices), this is a specification for using JWT
tokens for security.
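Here is a minimal sketch with JJWT (io.jsonwebtoken, 0.11.x API) that issues a signed token after authentication and later verifies it statelessly. The claim name, 15-minute expiry, and in-memory key are illustrative; real deployments manage the signing key outside the code:

import io.jsonwebtoken.Claims;
import io.jsonwebtoken.Jwts;
import io.jsonwebtoken.SignatureAlgorithm;
import io.jsonwebtoken.security.Keys;
import javax.crypto.SecretKey;
import java.util.Date;

public class JwtExample {

    private static final SecretKey KEY = Keys.secretKeyFor(SignatureAlgorithm.HS256);

    // Issued after successful authentication.
    public static String issueToken(String userId, String role) {
        return Jwts.builder()
                .setSubject(userId)
                .claim("role", role)
                .setExpiration(new Date(System.currentTimeMillis() + 15 * 60 * 1000))
                .signWith(KEY)
                .compact();
    }

    // Verifies the signature and expiration, then reads the claims for authorization.
    public static String extractRole(String token) {
        Claims claims = Jwts.parserBuilder()
                .setSigningKey(KEY)
                .build()
                .parseClaimsJws(token)
                .getBody();
        return claims.get("role", String.class);
    }
}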
Following are the pros and cons:

Pros:
Server does not need to store session information, as
JWTs contain all the required information.
Without server-side sessions, systems can easily scale
horizontally.
Defined by the RFC 7519 standard, ensuring consistent
structure and handling.

Cons:
Typically larger than traditional session tokens,
impacting bandwidth.
If leaked, JWTs can expose user information since they
are not encrypted but only encoded.
Decoding and verifying JWTs on every request can add
processing overhead.

OAuth 2
OAuth 2.0, the leading authorization standard for external
APIs, employs token-based access that is seamlessly
integrated with HTTP. It allows applications to access user
resources without revealing user credentials.
OAuth 2.0's architecture revolves around grant types or
flows, tailored for specific scenarios like authorization code
for web apps, implicit for SPAs, and client credentials for
application access, ensuring protocol flexibility.
OAuth 2.0's core entities include the client (seeking access),
the resource owner (usually the end-user), and the
authorization server, which manages authentication and
token issuance (as shown in Figure 8.13):

Figure 8.13: Abstract OAuth2 authorization flow (from Wikipedia)

OAuth 2.0's authorization flow prioritizes security and user control by separating client, authorization server, and
resource server roles. It ensures user credentials remain
distant from potentially less-trusted apps. The six primary
steps of this flow include:
Authorization request to resource owner: The
process kicks off with the client (often an application)
requesting permission from the resource owner,
usually an end-user. This request is typically in the
form of a redirect to an authorization server's
authorization endpoint, where the user is prompted to
grant or deny access.
Authorization grant from resource owner: If the
resource owner agrees to the application's request,
they grant it authorization. The form of this grant can
vary: It might be an authorization code, an implicit
grant, or other grant types defined by the protocol.
Authorization grant to authorization server: The
client then proceeds to present this authorization grant
to the authorization server, validating itself in the
process. This validation usually involves presenting its
client ID and secret.
Access token from authorization server: Upon
successful validation of the client and the authorization
grant, the authorization server issues an access token
to the client. This token serves as a proof of
authorization, which the client can use to access
protected resources on behalf of the user.
Request with access token to resource server:
Armed with the access token, the client can now make
requests to the resource server to access the protected
resources. The client includes the access token in its
request to demonstrate that it possesses the necessary
permissions.
Protected resource from resource server: If the
access token is valid, the resource server returns the
requested protected resource to the client. If the token
has expired or is invalid, the resource server denies the
request.
OAuth 2.0 defines several authorization flows or grant types
to cater to various client types and use cases. Here are the
primary OAuth 2.0 grant types:
Authorization code flow: Used by server-side
applications where the source code is not exposed to
the end-user. Involves redirecting the user to the
authorization server and then exchanging an
authorization code for an access token.
Implicit flow: Intended for clients that are
implemented entirely using JavaScript and run in the
resource owner's browser. Unlike the authorization
code flow, it receives the access token directly without
an intermediary code exchange step. Less secure than
the Authorization Code Flow, and its usage has been
declining in favor of other mechanisms.
Resource owner password credentials flow (or
password flow): Suitable for clients that are highly
trusted, like user-agent-based and trusted first-party
applications. Involves the direct exchange of username
and password for an access token.
Client credentials flow: Used when the application
needs to access its own service account and not on
behalf of a user. Typically used for machine-to-machine
authentication.
Refresh token flow: Aids in obtaining a new access
token when the current one expires, without requiring
re-authentication by the resource owner. Used in
conjunction with other flows (like Authorization Code
Flow).
Device code flow: Suitable for devices with limited
input capabilities, like smart TVs or IoT devices. The
device displays a code, and the user verifies it on
another device, leading to the device obtaining an
access token.
Extension grants: For situations not covered by the
standard grant types, allowing organizations to define
custom grant types.
OAuth 2.0 relies on access tokens instead of user credentials,
known for their temporary nature and specificity, often tied
to actions or data via scopes.
To address short-lived access tokens and minimize re-
authentication, OAuth 2.0 introduced refresh tokens. They
enable apps to acquire new access tokens, ensuring a
smoother user experience.
OAuth 2.0 follows the principle of least privilege using scopes
to limit authorization. This granularity enhances security and
user control by granting applications access only to specific
resources.
While OAuth 2.0 is a significant advancement in
authorization, it relies on SSL/TLS for data integrity and
confidentiality. For user identity verification, it is often paired
with OpenID Connect, ensuring standardized user identity
retrieval and verification atop OAuth 2.0.
To implement OAuth 2.0 authorization in Java, there are
several libraries and frameworks available. Some of the most
popular and widely-used options include:
Spring Security OAuth: An extension to the Spring
Security framework that provides features for
OAuth1(a), OAuth2, and OpenID Connect.
Keycloak: An open-source identity and access
management solution that provides OAuth 2.0, SSO,
and JWT token capabilities.
Pac4j: A Java security engine that supports multiple
clients and multiple protocols, including OAuth 2.0.
ScribeJava: A simple OAuth library for Java that
supports OAuth1(a) and OAuth2.
Nimbus OAuth 2.0 SDK with OpenID Connect
extensions: A comprehensive library for building,
parsing, and verifying OAuth 2.0 and OpenID Connect
messages.
Apache Oltu (formerly Amber): An OAuth protocol
implementation in Java maintained by the Apache
Software Foundation.
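As one common usage, a microservice can act as an OAuth 2.0 resource server that validates incoming bearer tokens. The sketch below assumes the spring-boot-starter-oauth2-resource-server dependency and the spring.security.oauth2.resourceserver.jwt.issuer-uri property pointing at the authorization server; the /api/orders/** path and orders.read scope are hypothetical:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class ResourceServerConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(authorize -> authorize
                // Scopes granted by the authorization server map to SCOPE_ authorities.
                .requestMatchers("/api/orders/**").hasAuthority("SCOPE_orders.read")
                .anyRequest().authenticated())
            // Validates incoming bearer access tokens (JWTs) on every request.
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(Customizer.withDefaults()));
        return http.build();
    }
}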
Following are the pros and cons:

Pros:
Widely-accepted protocol with clear specifications.
Multiple flows cater to different use cases (web,
mobile, server-to-server).
Reduces risk by not requiring apps to store user
credentials.

Cons:
Can be overcomplicated for simple use cases.
Some services implement OAuth slightly differently.
For secure communication, it necessitates an SSL/TLS
setup.

SSL/TLS encryption
In external APIs, data security during transit is critical.
Secure Sockets Layer/Transport Layer Security
(SSL/TLS) encryption, fundamental security protocols,
enable encrypted communication between clients and
servers, protecting data from eavesdropping and tampering.
SSL/TLS not only strengthens data integrity and
confidentiality but also builds user trust, enhancing API
service credibility and reliability.
Problem
In the ever-changing realm of external APIs, transmitting
sensitive data without safeguards carries significant risks.
Data breaches, eavesdropping, and man-in-the-middle
attacks are concerns. SSL/TLS encryption ensures secure,
reliable communication. Inadequate SSL/TLS can expose APIs
to interception, tampering, and impersonation, risking data
and trust. Implementing SSL/TLS in external APIs
comprehensively and robustly is crucial.

Solution
SSL is a cryptographic protocol developed to secure
communications over a computer network. While the
protocol itself is outdated and has been superseded by
Transport Layer Security (TLS), the term SSL is still
popularly used in the industry to refer to the security layer,
which encompasses both SSL and TLS (as shown in Figure
8.14):

Figure 8.14: SSL/TLS handshake


Key concepts within SSL:
Encryption: At the heart of SSL is encryption. It
ensures that data transmitted between two systems—
such as a web user's computer and a website's server—
is scrambled and indecipherable to any unauthorized
parties who might intercept it. This is especially crucial
when transmitting sensitive data, like personal
information or financial details.
Authentication: SSL certificates, issued by
certificate authorities (CAs), are used to
authenticate the identity of a website. This means that
users can be confident they are communicating with a
legitimate website and not a malicious impostor.
Data integrity: SSL ensures that the data sent
between two parties is not tampered with during
transmission. It uses cryptographic checksums to
ensure the data's integrity.
Key exchange: For encryption to work, both parties
need a way to exchange cryptographic keys in a secure
manner. SSL facilitates this through a process known
as the SSL handshake.
Before encrypted data can be exchanged between a client
(for example, a browser) and a server, an SSL handshake
must occur. This process involves:
1. Negotiating the version of SSL/TLS to be used.
2. Choosing the encryption ciphers.
3. Exchanging cryptographic keys via asymmetric
encryption.
4. Once the handshake is completed, a secure connection
is established, and data can be exchanged using
symmetric encryption, which is faster.
When a server and client establish an SSL connection, the
server offers an SSL certificate, validated by a CA. Trusted
CAs are listed in browsers and operating systems. If a
certificate is from a trusted CA, the connection is secure, a
fundamental aspect of web SSL.
To enable SSL in a Spring Boot app and secure a REST
controller with HTTPS, obtain an SSL certificate and configure
Spring Boot. Here is a basic guide:
1. Obtaining a self-signed certificate: For
demonstration purposes, generate a self-signed
certificate. For production, obtain a certificate from a
legitimate CA.
Use the Java keytool utility to generate a self-signed
certificate (Code snippet 8.5):
keytool -genkeypair -alias mycertificate -keyalg RSA -keysize 2048 -storetype PKCS12 -keystore keystore.p12 -validity 3650

This will prompt you for details about the certificate and a password. Remember the password; you will need it for the next step.
2. Spring Boot configuration: In your
application.properties or application.yml file, add the
following configurations (Code snippet 8.6):
# The location of the keystore containing the SSL certificate
server.ssl.key-store=classpath:keystore.p12

# The password you used when creating the certificate
server.ssl.key-store-password=yourpassword

# The type of the keystore, PKCS12 in our case
server.ssl.key-store-type=PKCS12

# The alias mapped to the certificate
server.ssl.key-alias=mycertificate

# Enable HTTPS
server.port=8443

3. REST controller: Now, create a simple REST controller (Code snippet 8.7):

@RestController
@RequestMapping("/api")
public class MyController {

    @GetMapping("/hello")
    public ResponseEntity<String> hello() {
        return ResponseEntity.ok("Hello, SSL!");
    }
}

4. Run and test: Start your Spring Boot application. Once it is up and running, navigate to https://localhost:8443/api/hello in your web browser.
In SSL service calls, Java's standard library offers native SSL
support. Instead of HttpURLConnection, you can use
HttpsURLConnection. This class handles SSL handshake and
encryption, simplifying secure communication compared to
non-secure HTTP (Code snippet 8.8):
String https_url = "https://localhost:8443/api/hello";
URL url = new URL(https_url);
HttpsURLConnection con = (HttpsURLConnection) url.openConnection();

// If you're working with self-signed certificates or certificates not in the
// Java truststore, additional configuration may be required here to establish trust.

BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream()));

String inputLine;
while ((inputLine = br.readLine()) != null) {
    System.out.println(inputLine);
}
br.close();

Following are the pros and cons:

Pros:
Encrypts data, ensuring data integrity and
confidentiality during transmission.
Validates identity of a website or server, preventing
impersonation.
Search engines may give preference to HTTPS sites.
Cons:
Initial setup, especially for custom configurations, can
be complex.
Purchasing certificates from trusted certificate
authorities can be expensive.
Certificates need to be renewed, possibly leading to
site downtime if neglected.

Conclusion
In this chapter, we explored external interfaces, emphasizing
API gateways, Public Facades, and Backend for Frontend
designs. We covered API management, synchronous
methods like HTTP/REST and GraphQL, and asynchronous
methods including push notifications, webhooks, and
WebSockets. Security topics included authentication
techniques (Basic, OpenID Connect, MFA), authorization
approaches (session tracking, JWT tokens, OAuth 2.0), and
SSL/TLS encryption's vital role in secure API communications.
The next chapter will explain strategies and techniques for
effectively monitoring microservices in a distributed system.

References
1. Shah, B. Microservices Design - API Gateway Pattern. Medium. Jul 4, 2020. Available at https://medium.com/dev-genius/microservices-design-api-gateway-pattern-980e8d02bdd5
2. Webhooks dos and don'ts: What we learned after integrating 100+ APIs. The RESTful Web, Medium. Available at https://medium.com/the-restful-web/webhooks-dos-and-dont-s-what-we-learned-after-integrating-100-apis-d567405a3671
3. Jackson, T. New to webhooks? Start here. Medium. Jul 3, 2018. Available at https://medium.com/codeburst/whats-a-webhook-1827b07a3ffa
4. Rana, V. Understanding Websockets In depth. Medium. Feb 16, 2023. Available at https://vishalrana9915.medium.com/understanding-websockets-in-depth-6eb07ab298b3
5. Koff, D. Multi-Factor Authentication For The Masses. Medium. May 17, 2017. Available at https://medium.com/s/the-firewall/episode-3-multifactor-authentication-b25e9e1d2c18

CHAPTER 9
Monitoring Microservices

Introduction
This chapter introduces you to the essential strategies and
techniques for effectively monitoring microservices in a
distributed system. We will explore a range of topics from
trace IDs, error propagation, and various logging techniques
to log aggregation, application metrics, distributed tracing,
and health checks. Each section will delve into the specifics,
discussing the rationale, significance, and best practices
associated with each topic. By the end of this chapter, you
will be well-equipped with the knowledge to design,
implement, and manage effective monitoring strategies in a
microservices environment.

Structure
In this chapter, we will cover the following topics:
Trace IDs
Error propagation
Logging
Triple-layered logging
Log aggregation
Application metrics
Distributed tracing
Health checks

Objectives
This chapter equips you with effective monitoring practices
in microservices, including trace IDs, error propagation, and
various logging techniques. You will learn to interpret
application, operational, and audit logs, explore log
aggregation, metrics collection, distributed tracing, and
health checks. Strategies for addressing errors and
performance bottlenecks are covered, along with the
importance of proactive monitoring to detect, diagnose, and
recover from faults promptly.

Trace ID
The trace ID pattern enhances observability and debugging
in microservices. This section explores practices for a robust
and traceable environment, enabling tracking of transactions
across microservices and issue resolution.

Problem
In distributed microservices, transactions often span across
services and even data centers, causing difficulty in tracing
and isolating logs. Debugging and performance monitoring
become challenging without connecting related log
messages. A systematic approach is needed to correlate logs
from multiple microservices during a transaction.

Solution
The trace ID pattern, also known as the correlation ID or
request ID, offers a solution to the problem of tracing the
progression of a single transaction across multiple
microservices. It does so by assigning a unique identifier, a
trace ID, to each transaction. This ID, generated at the
initiation of a transaction, is passed along every subsequent
request throughout the microservices involved in the
transaction (refer to Figure 9.1):

Figure 9.1: Passing trace ID throughout the chain of calls in microservices

For example, if a user initiates a request that passes through Microservice A, then B, and finally C, a unique trace ID is
generated when the request enters Microservice A. This ID
is then included in every log entry related to this transaction
within each microservice and is passed along to
Microservice B and Microservice C. As a result, every log
message associated with the transaction, regardless of the
microservice generating it, includes this unique identifier.
This approach makes it possible to aggregate and filter log
data based on the trace ID, effectively grouping together all
log entries corresponding to a specific transaction. This
provides a consolidated view of a transaction's execution
path and makes debugging and performance monitoring
tasks considerably more manageable.
It is also good practice to include the trace ID as a formal
parameter in all operation or business methods, ensuring it
is propagated throughout the entire transaction. Some
developers take this a step further by using an execution
context passed through the call chain. This context object
can carry additional valuable information such as user details
or client application data along with the trace ID, enhancing
the comprehensibility of logs and easing the process of
performance tuning and debugging.
Here are a few common methods for generating trace
IDs:
Universally Unique Identifier (UUID): Popular for
generating trace IDs, UUIDs ensure high uniqueness
across space and time. UUIDv4, the most common
version, creates a 128-bit random number as a 32-
character hexadecimal string.
Timestamp-based IDs: This method uses high
precision timestamps, possibly combined with random
bits for uniqueness. However, in high throughput
systems, using timestamps alone may increase collision
chances.
Combination of user/client and timestamp: This
method combines timestamp with the name of calling
user or client application, providing a higher degree of
uniqueness.
Sequential numbers (with system identifier): In
this method, a system-specific identifier is combined
with a monotonically increasing number to create the
trace ID. Caution is advised as it requires calls to a
global key generator for ID uniqueness.
Hashed values: A unique trace ID can be generated
by hashing system parameters such as the current
timestamp, process ID, and other distinct values.
Snowflake ID: An open-source ID generator used by
Twitter, creating unique 64-bit integer IDs by
combining timestamp, worker number, and sequence
number.
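A common way to apply this pattern at the service boundary is a filter that reuses an incoming trace ID or generates a new UUID, and stores it in the logging context. The X-Trace-Id header name is an assumption, and the jakarta.servlet imports assume Spring Boot 3; SLF4J's MDC makes the ID available to every log entry written during the request:

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
import java.util.UUID;

@Component
public class TraceIdFilter extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain chain)
            throws ServletException, IOException {
        // Reuse the incoming trace ID if an upstream service already created one.
        String traceId = request.getHeader("X-Trace-Id");
        if (traceId == null || traceId.isBlank()) {
            traceId = UUID.randomUUID().toString();
        }
        MDC.put("traceId", traceId);
        response.setHeader("X-Trace-Id", traceId);
        try {
            chain.doFilter(request, response);
        } finally {
            MDC.remove("traceId");      // avoid leaking the ID into other requests
        }
    }
}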

Error propagation
The error propagation pattern allows services to carry meaningful
error information up the call chain, thus reducing the time
and complexity involved in troubleshooting and ensuring
quick problem resolution. This fosters a resilient
microservices environment where system issues are
promptly identified and addressed, leading to improved
system uptime and user satisfaction.

Problem
In microservices, transactions spanning multiple services can surface as generic
errors without detailed information, making diagnosis and
resolution time-consuming. Handling errors is further
complicated by diverse microservices implemented in
different languages, where exceptions may not be easily
interpreted across services.

Solution
The error propagation pattern conveys meaningful error
information from the originating service to invoking services
and the client application. It ensures effective
communication of errors throughout the request chain,
allowing upstream services and client applications to handle
exceptions appropriately, refer to Figure 9.2:
Figure 9.2: Error propagation through a chain of calls in microservices

Here is a breakdown of the pattern:


Capture original error: When an error or an
exception occurs in a microservice, it should be
captured immediately at the point of failure.
Create an error object: The captured error or
exception is wrapped into a standardized error object,
like ApplicationException, which includes essential details
such as the error message, stack trace, error type,
trace ID, and relevant data.
Propagate error: The error object is propagated
upstream to the calling service or client application,
serialized in a universally understandable format like
JSON, enabling communication between microservices
to be implemented in different languages.
Deserialize and handle error: The calling service or
client application deserializes the error object, extracts
the error information, and handles the error
appropriately.
To handle and propagate errors in microservices, capturing
and wrapping them in a specialized exception is crucial. This
encapsulation facilitates the transfer of error information
across microservices and language boundaries. An
ApplicationException class can be utilized to wrap original
exceptions with additional details and propagate them using
various protocols like HTTP, gRPC, or asynchronous
messaging (Code snippet 9.1):
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.Serializable;
import java.util.Arrays;

public class ApplicationException extends Exception implements Serializable {

    @JsonProperty
    private String type;
    @JsonProperty
    private String message;
    @JsonProperty
    private String traceId;
    @JsonProperty
    private String stackTrace;

    // ObjectMapper for JSON serialization/deserialization
    private static final ObjectMapper objectMapper = new ObjectMapper();

    // No-args constructor required for JSON deserialization
    public ApplicationException() {
    }

    public ApplicationException(String type, String message, String traceId, String stackTrace) {
        super(message);
        this.type = type;
        this.message = message;
        this.traceId = traceId;
        this.stackTrace = stackTrace;
    }

    public static ApplicationException wrapError(Exception e, String traceId) {
        // The original exception's message and stack trace are used;
        // the type is based on the exception class.
        return new ApplicationException(e.getClass().getSimpleName(), e.getMessage(),
                traceId, Arrays.toString(e.getStackTrace()));
    }

    // Convert this object to JSON
    public String toJson() throws JsonProcessingException {
        return objectMapper.writeValueAsString(this);
    }

    // Create an object from JSON
    public static ApplicationException fromJson(String json) throws JsonProcessingException {
        return objectMapper.readValue(json, ApplicationException.class);
    }

    // getters and setters...
}

Here is a sample Spring Boot REST controller with a doSomething() method that wraps an exception and sends it over as a JSON error response (Code snippet 9.2):

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api")
public class SampleController {

    @GetMapping("/do-something")
    public ResponseEntity<String> doSomething() {
        try {
            // Perform some operation...
            // Let's simulate an exception for demonstration
            throw new Exception("Simulated exception");
        } catch (Exception e) {
            String traceId = "12345"; // Replace with actual trace ID

            ApplicationException appException = ApplicationException.wrapError(e, traceId);
            String errorJson = null;
            try {
                errorJson = appException.toJson();
            } catch (Exception jsonException) {
                jsonException.printStackTrace();
            }

            return new ResponseEntity<>(errorJson, HttpStatus.INTERNAL_SERVER_ERROR);
        }
    }
}

Here is an example of how to extract the ApplicationException from the response (Code snippet 9.3):

RestTemplate restTemplate = new RestTemplate();

try {
    ResponseEntity<String> response = restTemplate.getForEntity(
            "http://localhost:8080/api/do-something", String.class);

    // handle the response, e.g., response.getBody()

} catch (HttpServerErrorException e) {
    if (e.getStatusCode() == HttpStatus.INTERNAL_SERVER_ERROR) {
        String errorBody = e.getResponseBodyAsString();
        throw ApplicationException.fromJson(errorBody);
    }
    ...
}

Implementing the error propagation pattern can significantly improve the robustness and reliability of your microservices
architecture. It can help in quick and easy identification and
debugging of errors, which can otherwise be a daunting task
in a distributed system.

Logging
Logging is a foundational pattern in microservices, providing
transparency for troubleshooting, system monitoring, and
user activity tracking. It categorizes logs, emphasizing their
aggregated management in distributed architectures.
Mastery of this pattern enables improved manageability and
scalability in microservices environments.

Problem
In microservices, services in different languages and
locations interact, generating data and encountering errors.
Understanding interactions, monitoring the system,
debugging, and tracing transactions across services
becomes challenging due to their distributed nature.
Traditional logging methods designed for monolithic
architectures are inadequate in the distributed environment
of microservices.

Triple-layered logging
In the world of computing and information systems, logs play
a vital role in capturing and documenting important events
and activities. Logs provide a valuable source of information
for various purposes, ranging from troubleshooting and
debugging to business analysis and security investigations.
Among the various types of logs, three major categories
stand out: Application logs, operational logs, and audit logs.
Each type serves distinct purposes and offers unique insights
into different aspects of a system's behavior.
1. Application logs, also known as technical logs, are
generated by applications during their runtime. These
logs contain information about the behavior and state
of the system, offering valuable insights into the inner
workings of an application. Application logs are
primarily used for troubleshooting and debugging
purposes, allowing developers to identify and resolve
issues that may arise during the application's
execution. By examining these logs, developers can
gain visibility into error messages, exceptions,
warnings, and other relevant details that aid in
diagnosing and rectifying problems. Moreover,
application logs serve as a valuable source of
performance metrics, including response times and
resource utilization, facilitating optimization endeavors.
Developers further enhance these logs by
implementing alerts, enabling real-time notifications
through platforms like PagerDuty.
2. Operational logs, also referred to as event logs or
business logs, capture important business events within
a system. These logs record crucial information for
understanding the behavior of a system in its
operational context. Operational logs are valuable for
various purposes, including business analysis, activity
tracking, auditing, and compliance. By analyzing these
logs, organizations can gain insights into operational
patterns, identify bottlenecks, and make informed
decisions to optimize their processes. Moreover,
operational logs provide a means to track activities and
monitor system behavior for compliance with
regulatory standards and internal policies.
3. Audit logs, also known as user activity logs, have a
primary focus on tracking and documenting user
activities within a system. These logs play a crucial role
in ensuring security, compliance, and facilitating
forensic investigations if necessary. Audit logs record a
comprehensive set of user actions, including logins,
access attempts, file modifications, and other critical
operations. They are designed to be immutable and
tamper-resistant, providing an accurate and reliable
record of user activity. Due to their significance in legal
disputes and compliance requirements, audit logs are
typically stored for extended periods. These logs serve
as an essential source of evidence and are often
subject to strict retention policies.
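One simple way to keep the three layers separate in code is to use distinct named loggers, which the logging configuration can then route to different appenders and retention policies. The logger names and messages below are illustrative:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderService {

    private static final Logger appLog = LoggerFactory.getLogger(OrderService.class);
    private static final Logger opsLog = LoggerFactory.getLogger("OPERATIONAL");
    private static final Logger auditLog = LoggerFactory.getLogger("AUDIT");

    public void placeOrder(String userId, String orderId, double amount) {
        appLog.debug("Entering placeOrder for order {}", orderId);                // application log

        // ... business logic ...

        opsLog.info("Order placed: orderId={}, amount={}", orderId, amount);      // operational log
        auditLog.info("user={} action=PLACE_ORDER orderId={}", userId, orderId);  // audit log
    }
}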
Logs are an integral part of modern systems, offering
valuable insights into the behavior and operation of
applications. Understanding the different types of logs—
application logs, operational logs, and audit logs—provides
organizations with a comprehensive view of their systems
and enables them to address various needs, including
troubleshooting, business analysis, compliance, and security.
By harnessing the power of logs, businesses can optimize
their operations, ensure regulatory compliance, and enhance
the overall security posture of their systems.

Log aggregation
The log aggregation pattern involves the gathering of log
data from various sources and consolidating it into a single,
centralized location. This becomes crucial in a microservices
architecture due to the dispersed nature of the system,
where each microservice produces its own logs. Without
centralized log management, it would be tremendously
difficult to monitor system operations and troubleshoot
issues effectively (See Figure 9.3):
Figure 9.3: Centralized logging in a microservices system

Log aggregation has multiple benefits. It centralizes log entries for easier searching, analysis, and real-time
visualization. It enables analysis across various aspects like
time, service, or log level. It also facilitates alert creation for
enhanced system monitoring and error detection.
The process of log aggregation can occur through either pull-
based or push-based methods:
Pull-based aggregation (log scraping): Regularly
scanning or scraping microservice outputs and pulling
relevant log data into a centralized log management
system. This method works well in environments where
logs are output to standard interfaces like stdout or
stderr.

Push-based aggregation: In this approach, logs are actively sent or pushed from your microservices to the centralized logging system. This often requires specialized logging libraries or loggers that have built-in support for remote log servers.
Several technologies support log aggregation in a
microservices architecture:
Elastic Stack (ELK Stack): An open-source stack
featuring Elasticsearch for search, Logstash for
centralized logging and log enrichment, and Kibana for
visualization.
Fluentd or Fluent Bit: Open-source data collectors
that unify data and, when combined with Elasticsearch
and Kibana, offer a comprehensive log aggregation
solution.
Graylog: An open-source log management platform
supporting multiple data sources and providing log
normalization, enrichment, and alerting features.
Splunk: A commercial product offering robust
capabilities for searching, monitoring, and analyzing
machine-generated big data.
Cloud-based solutions: Providers like Amazon CloudWatch,
Google Cloud Logging, and Azure Monitor offer managed log
aggregation services.
In choosing a log aggregation solution, you should consider
system-specific requirements, the complexity of your
microservices environment, and the resources you have
available for managing your log aggregation infrastructure.

Application metrics
The application metrics pattern focuses on operational
health and performance of microservices. It helps to
understand key execution patterns, placing importance on
the practices that foster a highly efficient, scalable, and
reliable microservices environment. By collecting and
analyzing these metrics, we can gain valuable insights into
the behavior of our services, leading to more informed
decisions and more effective system tuning.

Problem
Microservices architecture faces challenges with
performance and reliability due to interdependencies and
distribution. Monitoring and optimizing these aspects require
visibility into non-functional performance characteristics.
Traditional methods like logging are inadequate in identifying
and addressing performance issues and service failures.
Microservices operate in isolated environments with diverse
performance characteristics, making system-wide visibility
challenging. Interactions within and across microservices can
hide performance attributes. The lack of an application
metrics pattern delays performance awareness. A pattern for
capturing, aggregating, and analyzing metrics is needed to
monitor the system, detect issues early, and ensure
reliability and efficiency throughout the software lifecycle.

Solution
The application metrics pattern in microservices is a
comprehensive method for the consistent collection,
aggregation, and analysis of application performance data.
The goal is to create a granular and real-time view of the
system's behavior, enabling developers to identify potential
problems early, respond proactively, and optimize the
system's overall performance (refer to Figure 9.4):

Figure 9.4: Performance monitoring in a microservices system


Typical application metrics collected in a microservices
architecture include:
Number of requests: The total count of requests that
a service receives over a period.
Error rates: The ratio of failed requests to the total
number of requests. This can be further classified into
different types of errors.
Response time: Service response time includes
average, percentile (for example, P95, P99), maximum,
and minimum response times.
Throughput: The number of requests that a service
can handle per unit of time.
To collect and analyze these metrics, several technologies
can be used:
Prometheus: An open-source monitoring and alerting
toolkit originally built at SoundCloud.
Grafana: A cross-platform open-source analytics and
visualization web app that supports data visualization
from Prometheus and various other sources.
Micrometer: A metrics instrumentation library for
JVM-based applications.
Datadog: A monitoring service for cloud-scale
applications that brings together data from servers,
containers, databases, and third-party services.
Amazon CloudWatch: A monitoring service for AWS
resources and the applications running on AWS. It can
collect metrics, set and manage alarms, and
automatically react to changes in AWS resources.
Google Cloud Monitoring: Part of Google Cloud's
operations suite, it allows you to understand your
service behavior by ingesting, storing, analyzing, and
viewing metrics.
Azure Monitor: A service in Microsoft Azure used to
track performance metrics and logs for resources in a
subscriber's Azure account.
New Relic: An observability platform that helps
engineers see across their entire software stack for
improved uptime, performance, and troubleshooting.
Splunk: A platform for searching, monitoring, and
examining machine-generated big data.
Each of these tools and technologies has its own strengths,
and the choice often depends on the specific needs and
existing infrastructure of the organization.
In a Spring Boot application, you can use the Micrometer library
to collect and expose application metrics. Here is a simple
example:
In the application.properties or application.yaml file, enable the /actuator/prometheus endpoint and configure the metrics exporting (Code snippet 9.4):

management.endpoints.web.exposure.include=health,info,prometheus
management.metrics.export.prometheus.enabled=true
Then, in your code, you can use MeterRegistry to create and
manage your custom metrics. Here is an example of how to
create a simple counter (Code snippet 9.5):
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SampleController {

    private final MeterRegistry meterRegistry;

    public SampleController(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    @GetMapping("/do-something")
    public String doSomething() {
        // Increment the custom counter each time this endpoint is called
        meterRegistry.counter("my.counter").increment();
        // ... the rest of the handler logic and return value
    }
}

This example increments a counter each time the /do-something endpoint is accessed. This metric (and any others you register) will be exported to the /actuator/prometheus endpoint, and can be collected by a Prometheus server.

Distributed tracing
While logging and metrics provide generic insights, the
distributed tracing pattern focuses on understanding
interactions between microservices in a distributed system.
It traces individual request journeys, revealing performance
bottlenecks and failing service interactions in complex
microservices collaborations.

Problem
In microservices, logging and metrics provide insights into
individual services but lack a holistic view of end-to-end
transactions. Distributed systems introduce latency,
reliability issues, and debugging challenges. Traditional
logging and metrics are inadequate for visualizing and
monitoring cross-service transaction pathways, hampering
troubleshooting and performance optimization. The
distributed tracing pattern aims to resolve this issue by
providing a comprehensive view of transactions as they
traverse through multiple services.

Solution
The distributed tracing pattern is a critical component of
managing and maintaining microservices, as it provides a
comprehensive, real-time overview of requests as they
traverse through different services. This allows for in-depth
performance monitoring, quick identification of bottlenecks,
and efficient troubleshooting (refer to Figure 9.5):
Figure 9.5: Analysis of traces in DataDog

Various technologies facilitate distributed tracing in Java microservices:
OpenTelemetry: A unifying standard in distributed
tracing, consolidating OpenTracing and OpenCensus. It
provides APIs, libraries, and agents for capturing
distributed traces and metrics, while integrating with
diverse trace analysis backends in a vendor-agnostic
manner. It is emerging as a key standard in the
distributed tracing domain.
Jaeger: Open-source distributed tracing system for
monitoring and troubleshooting microservices.
Compatible with OpenTelemetry.
Zipkin: Distributed tracing system for troubleshooting
latency issues in service architectures. Compatible with
OpenTelemetry via a bridge library.
Elastic APM: This application performance monitoring
suite is ideal for tracing, and it integrates well with the
rest of the Elastic Stack (Elasticsearch, Logstash,
Kibana).
Azure Application Insights: A feature of Azure
Monitor, Application Insights collects telemetry data
from applications, including traces. It is compatible
with OpenTelemetry via a bridge library.
DataDog: This offers APM and distributed tracing with
a focus on visualizing performance bottlenecks in your
code or services.
By leveraging these technologies, developers can ensure
efficient and smooth operations of their microservices,
leading to improved system performance and a superior end-
user experience.
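To illustrate what tracing instrumentation looks like in code, here is a minimal sketch using the OpenTelemetry API. It assumes the OpenTelemetry SDK and an exporter (for example, to Jaeger) are configured elsewhere in the application; the tracer name, span name, and attribute are illustrative:

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class OrderProcessor {

    // Obtain a tracer from the globally registered OpenTelemetry instance
    private final Tracer tracer = GlobalOpenTelemetry.getTracer("order-service");

    public void processOrder(String orderId) {
        // Start a span representing this unit of work
        Span span = tracer.spanBuilder("process-order").startSpan();
        try (Scope scope = span.makeCurrent()) {
            span.setAttribute("order.id", orderId);
            // Business logic goes here; instrumented downstream calls made
            // while this span is current are linked into the same trace
        } finally {
            span.end();
        }
    }
}

In practice, much of this instrumentation can be added automatically by the OpenTelemetry Java agent, so manual spans are usually reserved for business-level operations that should appear explicitly in the trace.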
Health checks
Microservices face operational challenges and potential
failures, impacting performance. The health checks pattern
detects and addresses anomalies by regularly checking
microservices' health. This pattern uses special endpoints to
assess microservice health and ensure system robustness.

Problem
Detecting service failures promptly in a distributed system
such as microservices is challenging. Traditional logging and
monitoring might not be sufficient to identify and isolate
issues promptly. Moreover, when a microservice is
unresponsive or underperforming, it can negatively impact
the entire application's performance. Hence, there is a need
for a mechanism to perform constant health checks and
ensure the system's high availability and reliability.

Solution
The health checks pattern works by exposing dedicated
endpoints in a microservice, which can be frequently polled
to check the microservice's health status. These health
checks are generally categorized into:
Startup checks: Verify if the microservice has started
correctly and all its components are initialized
properly.
Liveness checks: Validate if the microservice is
running and capable of processing requests. If a
microservice fails the liveness check, it can be
restarted by the system.
Readiness checks: Check microservice readiness to
accept requests. If it fails, temporarily remove it from
the load balancer rotation until it is ready again.
These health checks play a crucial role in maintaining the
high availability of a microservices-based application,
allowing for quick detection and recovery of service failures,
thereby enhancing the system's resilience.
Here is a Spring Boot REST controller with a GET method that
returns the current time as a string. This controller can be
added to microservices to perform basic health checks (Code
snippet 9.6):
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.time.LocalDateTime;

@RestController
public class HealthCheckController {

    @GetMapping("/heartbeat")
    public String heartbeat() {
        // Returning the current time confirms the service is alive and responsive
        return LocalDateTime.now().toString();
    }
}

The following shows how to configure Kubernetes to use this endpoint for health checks, with a liveness probe for a microservice pod (Code snippet 9.7):
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /heartbeat
        port: 8080
      initialDelaySeconds: 3
      periodSeconds: 3
In this configuration, the livenessProbe is set to perform an HTTP GET request on the /heartbeat endpoint of our application every 3 seconds, starting 3 seconds after the pod has been launched. If the probe fails, Kubernetes will restart the container.
Similarly, the readiness probe configuration can be added to
use the /heartbeat endpoint.
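A matching readiness probe could look like the following sketch, added alongside the livenessProbe in the container spec (the timing values simply mirror the liveness example):

readinessProbe:
  httpGet:
    path: /heartbeat
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3

In a more realistic setup, the readiness endpoint would typically verify dependencies such as database connections rather than only returning the current time.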

Conclusion
This chapter explored strategies and techniques for
monitoring microservices in distributed systems. Topics
covered include trace IDs, error propagation, logging
techniques, log aggregation, application metrics, distributed
tracing, and health checks. These insights enhance
observability, maintainability, and reliability in microservices
systems. The knowledge gained establishes a foundation for
designing and implementing effective monitoring strategies,
enabling the creation of responsive, scalable, and resilient
microservices.
In the next chapter, we will expand our knowledge on
microservices by learning about the diverse world of
packaging microservices.

Further reading
1. Jhamukul. Spring boot: Setting a unique id per request. Medium. Sep 6, 2022. Available at https://jhamukul007.medium.com/spring-boot-setting-a-unique-id-per-request-c83f7a811b77
2. Cheng, J. Microservice — Tracing Log in the Distributed System. Medium. Jun 26, 2022. Available at https://betterprogramming.pub/microservice-tracing-log-in-the-distributed-system-96f49bcb7bd
3. Yogev, G. Writing useful logs. Medium. Feb 12, 2023. Available at https://medium.com/@guyogev/writing-useful-logs-2b8eda4d8318
4. Gupta, N. 5 Design Patterns for Building Observable Services. Salesforce Engineering. Jan 13, 2023. Available at https://engineering.salesforce.com/5-design-patterns-for-building-observable-services-d56e7a330419/

CHAPTER 10
Packaging Microservices

Introduction
This chapter introduces the diverse world of packaging Java
microservices for various deployment platforms. It covers
everything from simple self-managed solutions to advanced
Docker orchestrators and serverless platforms. Additionally,
it discusses deploying microservices on JEE servers. The
concept of the micromonolith is explored, where the modular
development of microservices merges with the consolidated
deployment of monoliths. To address platform constraints,
the chapter also highlights patterns for externally activating
microservices, enabling them to function actively in different
environments.

Structure
In this chapter, we will cover the following topics:
Microservice packaging
System process
Docker container
JEE bean
Serverless function
Cross-platform deployment
Micromonolith
External activation
Cron jobs
Cron service
JEE timer

Objectives
After this chapter, you will understand Java microservices
packaging for various platforms, including self-managed
solutions, Docker orchestrators (for example, Kubernetes,
Docker Swarm), and serverless environments (like AWS
Lambda, Azure Functions, and Google Cloud Functions). You
will also learn about micromonoliths, which blend
microservices' flexibility with monolithic robustness. Plus,
you will gain strategies to overcome platform constraints and
optimize microservice performance in any deployment.

Microservice packaging
The world of microservices offers a plethora of deployment
platforms, each boasting unique features—from lifecycle
management to adaptive scaling. However, with these
benefits come specific challenges. Each platform has its own
rules for how services should be packaged, and many exert
control over component interactions and thread
management. In this section, we navigate the complexities
of microservice packaging to optimize deployment across
various platforms and understand their inherent challenges
and advantages.

Problem
As the realm of microservices continues to evolve,
professionals are presented with an array of deployment
platforms to choose from, including:
On-premises solutions in VMs: Traditional virtual
machines offering encapsulated environments.
Dockerized environments: Containers that provide a
lightweight and consistent setup.
Cloud-based PaaS: Platforms like Heroku, AWS
Elastic Beanstalk or Azure App Service that manage
infrastructure while allowing application deployment.
Serverless frameworks: Such as AWS Lambda,
Google Cloud Functions or Azure Functions, where
infrastructure management is abstracted entirely.
Java-centric JEE servers: Servers like Tomcat or WildFly that cater to Java applications.
These platforms come fortified with a rich set of capabilities
to ease the deployment process:
Lifecycle management: Ensuring services are
instantiated, maintained, and decommissioned
smoothly.
Health monitoring: Constantly checking the health
status of services to prevent unforeseen disruptions.
Dynamic scaling: Adjusting resources based on traffic
and demand.
Recovery mechanisms: Automatically restoring
services in the event of failures.
Yet, while these features are undeniably beneficial, the
platforms are not without their challenges. They frequently
impose certain constraints on microservices, such as:
Packaging formats: Each platform might demand its
unique packaging, be it Docker containers, JAR/WAR
files for Java applications, or standalone executables.
Component communication control: Some
platforms may restrict or dictate how individual
services communicate, which can affect inter-service
workflows.
Lifecycle controls: These can limit how a service
starts, runs, and stops, demanding adherence to
specific protocols set by the platform.
Thread management: Platforms might enforce
policies on how threading is managed, affecting
performance and concurrency strategies.
Facing this complex matrix of opportunities and challenges,
the primary dilemma emerges: How can developers and
architects effectively navigate the packaging labyrinth,
ensuring that their microservices are not only compliant with
platform-specific constraints but also optimized for
performance, reliability, and scalability across various
deployment contexts?

System process
System process packaging is a simple and versatile choice in
the microservices ecosystem. It suits self-managed
deployments in VM environments and PaaS platforms like
Heroku, AWS Elastic Beanstalk, or Azure App Service. Its
minimal restrictions make it appealing for various
microservices scenarios.
At its core, a system process is a running instance of an
executable program. In the context of microservices, it
means that each service is packaged as its standalone
executable. It offers a number of benefits:
Flexibility: Processes can manage their own lifecycles
and can be started, stopped, or restarted
independently.
Concurrency: Processes can maintain active threads,
facilitating efficient multitasking.
Communication: They support a variety of
synchronous and asynchronous inter-process
communication methods, enhancing the interactivity
among services.
The primary constraint for this method revolves around the
executable format. Beyond that, the flexibility it offers is
unparalleled:
Packaging requirement: The microservice must be
packaged into an executable format suitable for the
host OS.
Environment dependencies: Traditional Java
microservices would necessitate a Java Runtime
Environment (JRE). However, advancements in Java
allow compilation into native binaries, as we will
explore further.
While the approach of packaging microservices as system
processes offers undeniable flexibility and independence, it
also brings its own set of complexities.
Monitoring: While system processes offer
independence, monitoring tools must be implemented
to ensure each microservice's health and performance.
Recovery: System processes need external
orchestration or watchdog mechanisms to ensure they
recover from failures.
Scalability: Scaling, especially in non-containerized
environments, requires more manual intervention
compared to other packaging methods.
Let us look at a simple example using Java and the GraalVM
project, which provides the capability to compile Java
applications into standalone native executables.
Here are the prerequisites you need to compile Java
executables:
JDK (preferably Java 11 or newer)
GraalVM
Native Image tool for GraalVM
Before you begin, you should have both the JDK and GraalVM
installed. Once GraalVM is installed, you can install the
Native Image tool using (Code snippet 10.1):
gu install native-image

For simplicity, we will use a basic Hello World program (Code snippet 10.2):
// HelloWorld.java
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}

First, compile the program using the standard Java compiler (Code snippet 10.3):

javac HelloWorld.java

Now, use the Native Image tool from GraalVM to compile the Java class into a native executable (Code snippet 10.4):

native-image HelloWorld
After running the above command, you will find an
executable named helloworld (or helloworld.exe on Windows) in
the current directory.
Following are the pros and cons of system process:

Pros:
Less complexity compared to containers.
Enables multitasking and concurrency.
Efficient resource and system integrations.

Cons:
Possible OS-level conflicts.
Manual restarts or interventions.
Potential discrepancies across development, testing,
and production.

Docker container
A Docker container is a portable, self-contained software
package that includes an app, its environment, and
dependencies. It communicates through specific ports for
smooth internal and external interactions. You can customize
its behavior with environment variables, making it adaptable
across development, testing, and production. It also ensures
data persistence and sharing via file system volumes. In
today's software world, Docker containers are the go-to for
packaging and deploying microservices.
Docker containers offer a few benefits:
Consistency: The same Docker container can run
unchanged across various environments.
Isolation: Containers encapsulate applications,
ensuring that they do not interfere with one another.
Efficiency: Containers share the host OS kernel,
reducing the overhead compared to traditional VMs.
Docker orchestration platforms enable the management of
multiple containers, handling tasks like scaling, load
balancing, and recovery. Here are some notable
orchestrators:
Docker Swarm: Docker's native clustering and
orchestration tool. It integrates seamlessly with the
Docker CLI and API.
Kubernetes: Originally developed by Google, it is an
open-source container orchestration platform that
offers robust features for container deployment,
scaling, and management.
Amazon ECS (Elastic Container Service): AWS's
container management service that supports Docker
containers. It integrates deeply with other AWS
services.
Azure Kubernetes Service (AKS): Microsoft Azure's
managed Kubernetes service.
OpenShift: Developed by Red Hat, OpenShift is a
Kubernetes-based container platform that offers
developer and operational tools.
Apache Mesos with Marathon: Mesos is a scalable
cluster manager, while Marathon is a framework for
Mesos that provides a platform for hosting containers.
Rancher: An open-source platform that provides a full
set of infrastructure services for containers, from
orchestration to networking and storage.
Nomad: Developed by HashiCorp, Nomad is a flexible
orchestrator to deploy and manage containers and non-
containerized applications.
Google Kubernetes Engine (GKE): Google Cloud's
managed Kubernetes service.
Amazon EKS (Elastic Kubernetes Service): AWS's
managed Kubernetes service.
Docker orchestration platforms offer a rich set of features:
Automatic scaling: Dynamically adjusts the number
of running containers based on demand.
Load balancing: Distributes traffic to containers,
ensuring efficient utilization and responsiveness.
Service discovery: Allows containers to locate each
other and communicate.
Health checks: Monitors the state of containers and
replaces unhealthy ones.
Rolling updates and rollbacks: Ensures seamless
updates without downtime, with the capability to revert
to previous versions.
Data volumes: Manages persistent data storage
across container instances.
Here is an example of how a Java microservice can be packaged into a production-grade Docker image. Suppose you have a compiled JAR of your microservice named HelloWorldService-0.0.1-SNAPSHOT.jar.

In the directory containing your JAR, create a file named Dockerfile with the following content (Code snippet 10.5):

# Use an official Java Runtime Environment (JRE) base image
FROM eclipse-temurin:17-jre-jammy

# Set the application directory inside the container
WORKDIR /app

# Copy the JAR file into the container at /app
COPY ./HelloWorldService-0.0.1-SNAPSHOT.jar /app/HelloWorldService.jar

# Specify the entry point. This will run the JAR when the container starts.
ENTRYPOINT ["java", "-jar", "/app/HelloWorldService.jar"]

# Expose the port the app runs on
EXPOSE 8080
Note: The Dockerfile uses a lightweight JRE image for Java 17. Adjust the Java version according to your needs.

Navigate to the directory containing your Dockerfile and JAR, then run (Code snippet 10.6):

docker build -t helloworldservice:latest .

The trailing dot tells Docker to use the current directory as the build context.

This creates a Docker image named helloworldservice with the latest tag.
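To verify the image locally, you might run something like the following, mapping the exposed port to the host:

docker run -p 8080:8080 helloworldservice:latest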

Note: This example provides a foundational setup. For advanced production deployments, consider adding features like logging, health checks, and further optimizing the Docker image size. Always follow best practices for container security.

Docker containers combine microservices' agility with container technology's consistency and efficiency. When used with orchestration platforms, they enhance the microservices ecosystem, guaranteeing scalable, maintainable, and resilient deployments.
Following are the pros and cons of Docker container:
Pros:
Same environment from development to production.
Easily multiply services as needed.
Many tools are available for orchestration and
management.

Cons:
It can be tricky to set up and manage.
Potential vulnerabilities if misconfigured.
The number of tools to choose from can be overwhelming.

JEE bean
Jakarta EE (previously known as Java Enterprise Edition) has
long stood as the bedrock of enterprise-level Java
applications. While originally designed in a pre-microservice
era, the JEE platform is quite capable of supporting the
microservices paradigm, provided we navigate its standards
and constraints adeptly.
Some of the prominent JEE servers include:
WildFly (formerly JBoss): An open-source application
server.
GlassFish: The reference implementation for JEE is
open-source and versatile.
IBM WebSphere: A proprietary solution offered by
IBM.
Oracle WebLogic: Another proprietary server, known
for its robustness.
Payara: Derived from GlassFish, it offers additional
features and tools.
TomEE: Apache Tomcat with JEE features integrated.
JEE servers provide a rich feature set while offering a simple
yet comprehensive programming model:
Integrated APIs: JEE offers a plethora of APIs for
different tasks like messaging, database connectivity,
web services, and so on.
Security: It provides a built-in security model.
Concurrency control: Enables efficient multi-
threading.
Load balancing: Distributed applications benefit from
inherent load distribution features.
Transaction management: Ensures that business
operations are completed or rolled back properly.
Microservices can be encapsulated as different types of JEE
beans:
Servlets: Stateless, and suitable for request-response
model services.
Stateless session beans: For services where no state
is maintained between method calls.
Stateful session beans: Useful when the state needs
to persist between method calls.
Message-driven beans: Ideal for services that act in
response to messages, like those in an asynchronous
architecture.
Entity beans: Represent data in a database; however,
their use is often discouraged in favor of JPA.
While JEE offers simplicity, it also poses constraints. Beans
must be constructed to adhere to the JEE standards,
including specific annotations, interfaces, and deployment
descriptors. Moreover, the platform dictates the lifecycle and
communication between beans, sometimes resulting in less
flexibility.
Here is a basic example of a “Hello World” microservice
implemented as a Stateless Session Bean (Code snippet
10.7):
// Import necessary packages
import javax.ejb.Stateless;

@Stateless // Annotate the class as a Stateless Session Bean
public class HelloWorldBean implements HelloWorldRemote {

    // A simple method to return a greeting message
    @Override
    public String sayHello() {
        return "Hello, World!";
    }
}

// Remote interface for the Stateless Session Bean (defined in its own source file)
import javax.ejb.Remote;

@Remote // Annotate the interface as a Remote interface
public interface HelloWorldRemote {
    String sayHello();
}

In this example:
We define a Stateless Session Bean HelloWorldBean with a
method sayHello() that returns the string "Hello, World!".
The @Stateless annotation denotes that this is a Stateless
Session Bean.
HelloWorldRemote is a remote interface that HelloWorldBean implements. It is marked as @Remote, indicating that it is a remote interface, which means clients can call this bean's methods remotely.
For deploying this microservice, package it into an EJB JAR
file and then deploy it to a JEE application server. After
deployment, this Stateless Session Bean can be looked up
and invoked by client applications.
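As a sketch of such a client, the bean could be looked up through JNDI. The exact JNDI name depends on the application server and on the deployed application and module names, so the values below are illustrative assumptions:

import javax.naming.InitialContext;
import javax.naming.NamingException;

public class HelloWorldClient {

    public static void main(String[] args) throws NamingException {
        InitialContext context = new InitialContext();
        // Portable JNDI name format: java:global/<app>/<module>/<bean>!<interface>
        HelloWorldRemote bean = (HelloWorldRemote) context.lookup(
                "java:global/hello-app/hello-ejb/HelloWorldBean!HelloWorldRemote");
        System.out.println(bean.sayHello());
    }
}

A remote client also needs the server's client libraries and JNDI provider properties on its classpath, which vary by application server.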
Eclipse MicroProfile emerged to streamline microservices creation on JEE. It optimizes Enterprise Java for microservices, offering common APIs and lightweight, fast-moving specifications, and it links traditional enterprise setups with the evolving microservices landscape.
JEE, with its robust features, suits microservices deployment.
Packaging as JEE beans and using tools like Eclipse
MicroProfile enables developers to merge JEE's strength with
modern microservices practices.
Following are the pros and cons of JEE bean:

Pros:
Tested and proven over time.
Offers a plethora of built-in services.
Consistent environment for all microservices.

Cons:
Requires understanding of JEE specifications.
JEE standards may lag behind newer technologies.
Requires adaptations for cloud-native patterns.

Serverless function
Serverless computing is a compelling paradigm for many
modern applications, providing the simplicity of a function-
based approach coupled with the power of cloud-native
scalability and reliability. Yet, it is not without its intricacies.
Almost every major cloud provider offers its own platform for
serverless computing. Here are the most noticeable ones:
AWS Lambda: Amazon Web Services' event-driven,
serverless computing platform.
Azure Functions: Microsoft Azure's solution for event-
driven, serverless compute.
Google Cloud Functions: Google Cloud Platform's
lightweight compute solution for developers.
Alibaba Cloud Function Compute: Alibaba's
serverless compute service for running code without
provisioning servers.
IBM Cloud Functions: IBM's Function-as-a-Service
(FaaS) based on Apache OpenWhisk.
Serverless platforms commonly offer a number of features:
Auto-scaling: Scales automatically with the size of the
workload.
No server management: Removes the need to
provision or manage servers.
Event-driven: Executes code in response to events.
Integrated security: Security at the level of the
function, role, and execution.
Cost-efficient: Pay only for the compute time
consumed.
Statelessness: Each function execution is
independent.
Serverless functions often favor nano-service architecture,
but not all apps suit this granularity. Enter the Commandable
pattern: Consolidate operations within a unified
microservice, invoked with a command in requests. This
enriches the microservice structure while staying serverless-
compliant.
Here is a simple example of implementing a "Hello World"
microservice using Java for AWS Lambda with the
Commandable pattern (see Chapter 6, Working with Data for
details) (Code snippet 10.8):
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Map;

public class HelloWorldCommandable implements RequestHandler<Map<String, String>, String> {

    @Override
    public String handleRequest(Map<String, String> input, Context context) {
        if (input.containsKey("command")) {
            switch (input.get("command")) {
                case "sayHello": return sayHello();
                // You can add more commands here as needed
                default: return "Unknown command";
            }
        } else {
            return "No command provided";
        }
    }

    private String sayHello() {
        return "Hello, World!";
    }
}

In the above example, we check the input map for a “command” key. If it contains the command “sayHello”, we return the greeting message. Additional commands can easily be added to the switch statement.
After compiling and deploying the Lambda function, you can
test it using the following JSON. As a result, you should see
the “Hello, World!” response (Code snippet 10.9):
1. {"command": "sayHello"}

One challenge with serverless is the potential delay when a function is invoked after being idle, often referred to as a cold start. To mitigate this, some platforms allow for provisioned concurrency or reserved instances, where a defined number of function instances are kept warm, reducing the initialization latency.
Java is well-equipped to handle serverless functions. Major
cloud platforms offer Java SDKs to facilitate this:
AWS Lambda Java SDK: Allows Java functions to
respond to Amazon S3 or DynamoDB events.
Azure Functions Java SDK: A set of annotations and
tools to streamline serverless Java on Azure.
Google Cloud Functions Java Framework: Java
functions can be triggered by HTTP requests or cloud
events.
Apache OpenWhisk Java Runtime: Use Java to write
actions for OpenWhisk, the backbone of IBM Cloud
Functions.
Quarkus: A Kubernetes-native Java stack that offers a
serverless sub-framework, making it suitable for
building lightweight and high-performance serverless
applications.
Micronaut: A modern JVM-based full-stack framework
which includes built-in support for writing serverless
applications, especially targeting AWS Lambda and
Azure Functions.
Spring Cloud Function: An extension of the Spring
Boot project, which simplifies the writing and invoking
of functions. It can be adapted for AWS Lambda, Azure
Functions, and others.
Vert.x: An event-driven application framework that can
be used to build serverless applications, especially with
its integration with Quarkus.
Fn Project: An open-source container-native
serverless platform that can be run anywhere,
including in any cloud or on-premises. It has support
for Java.
Serverless computing provides advantages such as cost
efficiency through pay-per-use pricing, automatic scaling to
handle fluctuating workloads, reduced operational overhead
by offloading infrastructure management to the cloud
provider, support for event-driven architectures, suitability
for microservices, and effectiveness for short-lived tasks like
data processing or backend logic. This model allows
developers to focus on writing code rather than managing
infrastructure, making it particularly advantageous for
applications with unpredictable traffic patterns or those
requiring rapid scaling and responsiveness.
However, the serverless model, though simple and cost-
efficient, is not one-size-fits-all. It demands thoughtful
design to ensure scalability without fragmentation. Utilizing
patterns like Commandable and harnessing the power of
Java SDKs can enable robust microservices in a serverless
environment.
Following are the pros and cons of serverless function:

Pros:
Pay only for what you use.
Quicker time-to-market for features.
Ideal for reactive architectures.

Cons:
Execution time is capped by the platform (for example,
15 minutes on AWS Lambda).
Memory and storage may be limited.
Vendor Lock-in: Tightly bound to platform-specific
configurations and services.

Cross-platform deployment
In today's tech landscape, many vendors adopt cross-
platform deployment to cater to diverse client needs. This
approach ensures a product's availability on multiple
platforms, enhancing flexibility and market reach. As cloud,
on-premises, and hybrid environments converge, cross-
platform deployment is not just a trend, but a business
necessity, enabling vendors to deliver consistent services
across different infrastructures.

Problem
Cross-platform deployments involve balancing unique
platform traits to ensure consistent microservice
functionality. Developers must choose between a generic
lowest common denominator approach, sacrificing platform-
specific benefits, or tailoring services for each platform,
complicating development. Achieving this balance is the core
challenge in cross-platform deployment while maintaining
scalability, reliability, and user experience.

Symmetric deployments
Symmetric deployments offer a unified application delivery
solution, spanning on-premises and various cloud
environments. Using a single deployment platform ensures
consistency, streamlines processes, and potentially cuts
costs. However, it may not fully utilize each environment's
unique features. Prominent platforms for symmetric
deployments include:
System processes within virtual machines (VMs)
Docker containers, especially when orchestrated with
Kubernetes
JEE servers
These options are notable for their ability to be deployed
seamlessly on nearly any cloud or on-premises setting.
Following are the pros and cons of symmetric deployments:

Pros:
Same behavior across environments.
One deployment strategy to oversee.
Reduced variation leads to lower costs.

Cons:
Misses out on platform-specific features.
One-size approach may not be optimal.
Harder to adapt to platform innovations.

Platform abstraction
Platform abstraction in the context of serverless computing
means employing frameworks or tools that let developers
code once, but deploy to various serverless platforms
without major modifications. This approach enables the
leverage of distinct platform strengths while simplifying the
development process.
There are a number of Java frameworks that offer an
abstraction layer for multiple deployment platforms:
Quarkus: Tailored for Kubernetes and serverless
workloads, it provides a consistent development model
and has the capability to output optimized native code
suitable for serverless.
Micronaut: Built for microservices and serverless
applications, Micronaut supports cloud functions,
offering a streamlined serverless experience across
major cloud providers.
Vert.x: While primarily known as an event-driven
framework, Vert.x's reactive nature makes it an
excellent choice for building serverless functions that
can be deployed across various platforms.
Fn Project: An open-source container-native
serverless platform that can be run on any cloud or on-
premises. It provides a Java FDK (Function
Development Kit) to ease the building of serverless
applications.
These frameworks facilitate the development of cross-
platform microservices in Java, abstracting away the
complexities of individual cloud providers and ensuring
consistent functionality across them.
Following are the pros and cons of platform abstraction:

Pros:
Easier migration between platforms.
Code once, deploy everywhere.
Common codebase for all platforms.

Cons:
Potential performance penalties.
Cannot always use platform-specific features.
Another layer to learn and manage.

Repackaging
Usually, the codebase predominantly covers persistence and
core business logic, with less dedicated to communication
and packaging. To optimize development, a practical
approach concentrates on repackaging microservices for
diverse platforms.
Inversion of Control (IoC) containers are pivotal for this
repackaging approach. IoC flips control flow, injecting
dependencies into applications at runtime, fostering
modular, loosely coupled microservices. This enables easy
component swapping or reconfiguration without system-wide
changes. When repackaging for a new platform, only the
container and communication layers require modification,
preserving core business components (refer to Figure 10.1):

Figure 10.1: Repackaging microservice components for various deployment platforms

The Java development stack has two primary technologies to implement dependency injection in microservices:
Spring: Spring Boot, a prominent Java framework,
simplifies creating production-ready apps. It relies on
the Spring IoC container, managing object lifecycle and
configuration, making dependency injection smoother.
CDI (Context and Dependency Injection): CDI,
initially part of Java EE, offers type-safe dependency
injection. It integrates well with Java EE and is widely
used in Jakarta EE apps. CDI allows injecting beans
into different components, promoting loose coupling
and improved modularity.
Using IoC and powerful dependency injection from Spring
Boot and CDI, developers can efficiently repackage
microservices. This preserves most of the codebase,
protecting the initial investment while allowing the use of
native platform capabilities.
Following are the pros and cons of repackaging:

Pros:
Core business logic remains unchanged across
platforms.
Reduces development and maintenance costs.
Enables full access to native platform APIs.

Cons:
Variations might arise in behavior across platforms.
Each repackaged service might need thorough testing.
Keeping track of multiple packaged versions can be
cumbersome.

Micromonolith
Micromonolith architecture merges traditional monolithic and
microservice paradigms. Initially, for startups and simpler
systems, monolithic simplicity prevails. However, as systems
grow complex, the agility and scalability of microservices
become appealing. Micromonolith allows microservice-style
design while deploying as a monolith. It offers flexibility,
letting teams embrace monolithic ease and transition
smoothly to distributed microservices when needed.

Problem
Starting a new project poses a dilemma: opt for the
simplicity of a monolithic system or the scalability of
microservices? Initially, monoliths are cost-effective and
straightforward, but they can become cumbersome and
costly to maintain as the system grows. Conversely,
microservices offer long-term flexibility and scalability but
demand significant upfront investments and expertise,
potentially overwhelming smaller teams or startups (refer to
Figure 10.2):

Figure 10.2: Comparison of the cost of development of monolithic systems vs microservices
Micromonolithic deployment is a compromise. Teams create
modular components with a microservices approach, but for
deployment, they bundle these into a monolithic unit or a
few. It blends monolithic benefits with readiness for future
microservices transition. As projects grow, shifting from
monolithic deployment becomes smoother, making
micromonolithic deployment a pragmatic choice for evolving
projects.

Solution
Micromonolith packaging merges microservices' structured
design with monolithic deployment's simplicity. It
emphasizes componentized, loosely coupled microservice
design. Services are initially designed as independent units
but are later interwoven into a cohesive single unit for
deployment, referred to as a micromonolith (refer to Figure
10.3):

Figure 10.3: Packaging microservices versus micromonolith

This design strategy relies on client components that abstract microservice communication. Initially, Direct Clients enable in-process communication, simplifying interactions, especially in micromonolithic setups.
As scalability demands arise, transitioning from Direct
Clients to inter-process communication is seamless. The
loosely coupled design allows easy extraction and
independent deployment of microservices, shifting from
micromonolith to genuine microservices architecture.
In the Java ecosystem, frameworks like Spring and CDI foster
componentized, loosely coupled designs. They offer tools
and conventions that promote modularity, simplifying the
development of systems prepared for micromonolithic
deployment and scalable microservices.
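A minimal single-file sketch of this client abstraction in Spring might look like the following. The interface name, class names, endpoint URL, and the use of profiles to switch implementations are illustrative assumptions, not a prescribed API:

import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Communication-agnostic client used by other components
public interface GreetingClient {
    String greet(String name);
}

// Direct client: runs in-process (micromonolith deployment)
@Component
@Profile("micromonolith")
class DirectGreetingClient implements GreetingClient {
    @Override
    public String greet(String name) {
        // In a real system this would delegate to the locally wired service component
        return "Hello, " + name + "!";
    }
}

// Remote client: calls the extracted microservice over HTTP (distributed deployment)
@Component
@Profile("distributed")
class HttpGreetingClient implements GreetingClient {
    private final RestTemplate restTemplate = new RestTemplate();

    @Override
    public String greet(String name) {
        return restTemplate.getForObject(
                "http://greeting-service/greet?name=" + name, String.class);
    }
}

Because consumers depend only on GreetingClient, switching from the micromonolith to separately deployed microservices becomes a configuration change rather than a rewrite.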
Following are the pros and cons of micromonolith:

Pros:
Easier startup compared to full microservices.
Designed for easy transition to microservices.
Less hardware and infrastructure are needed initially.

Cons:
Shifting to true microservices may still require effort.
More setup than a pure monolith.
Scalability is restricted to a single deployment unit.

External activation
Deployment platforms often manage component lifecycles,
restricting active threads in microservices. This makes
microservices reactive rather than proactive. However,
systems frequently need active control. To address this,
developers use external activation, where specialized
components externally trigger microservice logic, balancing
platform constraints with operational needs.

Problem
Certain platforms like JEE servers, serverless functions, Azure
Fabric Actors, and cross-platform microservices limit active
threads. This challenges continuous or periodic logic
execution, sacrificing microservices' autonomy for robust
lifecycle management.
The external activation technique solves this. An external
timer or scheduler triggers passive microservices. The
activation component runs at intervals, checks conditions,
and prompts microservices to execute.
This approach mimics active microservice behavior, reacting
to external cues. It preserves periodic/task-based execution
in restricted environments, maintaining the essence of
functionality.

Cron jobs
External activation can be effectively implemented using
cron jobs, which are time-based job schedulers in Unix-like
operating systems. By leveraging the predictable and
periodic nature of cron jobs, developers can awaken a
passive microservice at specified intervals, ensuring its logic
gets executed even in environments that do not support
active threads.
Suppose you have a passive microservice that needs to
clean up old data from a database every night at 2 AM. Here
is how you can set up external activation using a cron job:
Crontab entry: Add the following line to the crontab file to
schedule the cron job (Code snippet 10.10):
0 2 * * * /path/to/your_script.sh

Script content (your_script.sh) (Code snippet 10.11):

#!/bin/bash

# Preliminary checks, e.g., checking database connection
if [check database connection here]; then
    # If all checks pass, make an HTTP request to the microservice endpoint
    curl -X POST http://your_microservice_endpoint/cleanup
else
    echo "Database connection failed!" >> /path/to/your_log_file.log
fi

With this setup, the microservice will be externally activated each day at 2 AM to perform its cleanup task.
Following are the pros and cons of cron jobs:

Pros:
Simple setup and management.
Predictable execution times.
Time-tested reliability.

Cons:
Not native to Windows.
Complex management for many jobs.
No dynamic rescheduling.

Cron service
In Kubernetes, scheduled tasks are managed using the
CronJob service. A CronJob creates Jobs on a time-based
schedule, which follows the cron format. It allows users to
run scheduled tasks with the reliability and scalability
provided by Kubernetes. This can be especially beneficial for
activating certain logic or microservices in a cloud-native
manner.
Suppose you have a microservice that needs to process data
every morning at 5 AM. Here is how you could set up a
CronJob in Kubernetes to achieve this:
1. Create a YAML configuration for the CronJob (Code snippet 10.12):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-processor-job
spec:
  schedule: "0 5 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: data-processor-container
            image: data-processor-image:v1
          restartPolicy: OnFailure

2. Apply the CronJob configuration (Code snippet 10.13):

kubectl apply -f data-processor-job.yaml

In the example above, the CronJob named data-processor-job will run the container data-processor-container using the image data-processor-image:v1 every day at 5 AM. If the job fails, Kubernetes will retry it based on the restartPolicy.
Note: It is essential to ensure that the microservice (or logic) you are
activating with the CronJob is idempotent because there might be
occasions where the CronJob might overlap or run multiple times, and
you would not want duplicate or erroneous operations.

Following are the pros and cons of cron service:

Pros:
Leverages Kubernetes' robust orchestration
capabilities.
Automatic retries based on defined policies.
Supports complex scheduling with cron syntax.

Cons:
Requires understanding of Kubernetes objects and
YAML.
Tied to Kubernetes ecosystem.
Containers might have a slight delay in starting up.

JEE Timer
In the Jakarta EE (previously known as Java EE) platform, the
Timer Service provides a way to allow applications to be
notified of events at set intervals, effectively allowing for
external activation of logic within the application. It is a
simple, straightforward method to schedule future timed
notifications in a Java EE container.
First, you will define a Stateless Session Bean that contains
the timer logic (Code snippet 10.14):
import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.ejb.Timeout;
import javax.ejb.Timer;
import javax.ejb.TimerConfig;
import javax.ejb.TimerService;

@Stateless
public class MyTimerServiceBean {

    @Resource
    TimerService timerService;

    public void initializeTimer(long duration) {
        TimerConfig config = new TimerConfig();
        config.setInfo("MyTimerInfo");
        // Schedule a single-action timer that fires once after the given duration (milliseconds)
        timerService.createSingleActionTimer(duration, config);
    }

    @Timeout
    public void timeoutHandler(Timer timer) {
        System.out.println("Timer Service : " + timer.getInfo());
        // Implement the timed logic here
    }
}

When you need to start the timer (for example, during application start-up or based on some event), you can call the initializeTimer method of the above bean (Code snippet 10.15):
@EJB
MyTimerServiceBean timerBean;

public void someMethod() {
    // Starts a timer which will invoke `timeoutHandler` after 10 seconds
    timerBean.initializeTimer(10000);
}

In the example, the timeoutHandler method gets invoked once the timer duration (10 seconds in this case) has expired.
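For recurring work, the same mechanism supports calendar-based timers through the standard @Schedule annotation. Here is a minimal sketch; the bean name and the schedule (every night at 2 AM) are illustrative:

import javax.ejb.Schedule;
import javax.ejb.Stateless;

@Stateless
public class NightlyCleanupBean {

    // Automatically created timer: fires every day at 02:00 server time
    @Schedule(hour = "2", minute = "0", persistent = false)
    public void cleanUp() {
        // Periodic logic, for example removing stale data
    }
}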
Following are the pros and cons of JEE timer:

Pros:
Directly integrated into the Java EE platform.
No need for external services or libraries.
Supports single-action and interval timers.

Cons:
Tied to the Java EE ecosystem.
Might not be as feature-rich as external scheduling
systems.
Timers can consume system resources if not managed
properly.

Conclusion
In this chapter, we covered various microservice packaging
methods like system processes, Docker containers, JEE
beans, and serverless functions. We also addressed cross-
platform deployment challenges and introduced the
micromonolith approach, blending monolithic and
microservice concepts. We explored external microservice
activation techniques, such as cron jobs, Kubernetes cron
services, and JEE timers, highlighting their pros and cons.
The next chapter will discuss microservices testing
automation patterns.
Further reading
1. Bachina, B. How To Dockerize Java REST API. Medium. Apr 16, 2020. Available at https://medium.com/bb-tutorials-and-thoughts/how-to-dockerize-java-rest-api-3d55ad36b914
2. Wiener, G. How to Schedule a Python Script Cron Job. Medium. Jan 24, 2018. Available at https://medium.com/@gavinwiener/how-to-schedule-a-python-script-cron-job-dea6cbf69f4e
3. Manavian, C. The Power of Kubernetes Cron Jobs. Medium. Aug 14, 2019. Available at https://medium.com/swlh/the-power-of-kubernetes-cron-jobs-d7f550958de8
4. Fong, G. How to code in Java Spring like a Pro — Dependency Injection. Medium. Nov 14, 2022. Available at https://medium.com/dev-genius/how-to-code-in-java-spring-like-a-pro-dependency-injection-69249fdb68
CHAPTER 11
Testing Microservices

Introduction
Automated testing is essential because it allows developers to
verify that their code behaves as expected, catching bugs and
issues early in the development process, ultimately leading to
higher-quality software products. This chapter introduces key
patterns in automating microservices testing. Starting with smart
planning for clear objectives, the chapter explores patterns for
functional and non-functional testing. For functionality, it teaches
methodologies for unit, integration, system, contract, and
acceptance testing, complemented by insights into effective
mocking. For non-functional testing, it describes, in depth, tools for benchmarks, simulators, and data generators, which are crucial for assessing performance, scalability, and more, and for ensuring microservices excel in diverse conditions. Some additional insights on testing automation can be found in the “Better Testing” program (https://www.entinco.com/programs/better-testing).

Structure
In this chapter, we will cover the following topics:
Test planning
Functional testing
Unit test
Integration test
End-to-end test
Contract test
Acceptance test
Initial state
Non-functional testing
Benchmark
Simulator
Data generator
Mocks
Problem
Solution
Chaos Monkey
Problem
Solution

Objectives
In this chapter, we aim to elucidate the critical role of automated
testing in microservices development and to outline clear
strategies for planning and executing effective testing practices.
By understanding the significance of automated testing, defining
clear testing objectives, exploring functional and non-functional
testing patterns, and examining tools and techniques for
comprehensive testing coverage, you will be equipped with the
necessary knowledge and methodologies to ensure the
robustness and quality of your microservices in diverse
conditions.

Test planning
Navigating test automation without a clear strategy can lead to
repetitive cycles of writing and fixing tests, with bugs still
emerging. The key to transcending this challenge is not
necessarily more tests, but smarter testing. Effective test
planning, grounded in clear quality goals, ensures that tests are
strategically positioned for maximum impact. This section will
guide you through the essentials of planning, paving the way for
streamlined and potent test automation.

Problem
In software development, prioritizing quality through test
automation is crucial. Yet, challenges arise due to undefined
goals and strategies. Without them, even diligent automation can
result in higher costs, longer development times, and lower
software quality.
Why: Test automation's purpose is to optimize software,
satisfying functional and non-functional needs. Without
knowing why, testing lacks direction and may miss vital
problems.
Who: Identifying test planning responsibility is crucial. Is it
QA, developers, or both? Ambiguity may lead to missed
bugs and inefficiencies.
What: Our focus lies in determining the test's scope and
type. Is it for a single microservice, the whole system, or
both? And is it functional, non-functional, or
creative/manual? Clarifying the what enriches the test
strategy, ensuring it thoroughly aligns with the software's
needs.
Where: Where will the testing take place? Different
environments can yield varied results; thus, clarity here
ensures consistent testing outcomes.
When: Should testing be continuous, at the end of every
sprint, or only before major releases? The when impacts
resource allocation and software delivery timelines.
How: The methodologies and tools employed for testing
significantly influence the outcomes. Without a chosen
strategy, teams might opt for ineffective tests.
In test automation, this 5WH model is a blueprint, not just a
guideline. It directs teams to set ambitious quality objectives and
align efforts strategically to exceed these goals. The real
challenge is not just recognizing the importance of test
automation but setting measurable goals and using effective
strategies to reach them.

Solution
To grasp the essence of testing, one must delve into its
foundational elements, each playing a crucial role in ensuring
quality and reliability (refer to Figure 11.1):

Figure 11.1: Elements of test planning

As the figure shows, these elements can be understood by asking ourselves the questions why, what, who, where, when, and how to test:
Why: Reasons for testing have several aspects,
namely:
Purpose: Understanding testing reasons sets the
process foundation. It defines specific targets, like
functionality, user specs, performance, contracts, and
reliability.
Specific Targets: Clearly defined metrics or benchmarks
allow for objective assessment and ensure that tests have
tangible, meaningful outcomes.
Alignment with Business Goals and Stakeholder
Expectations: Testing ensures that software
development efforts are aligned with broader business
objectives by ensuring that the software meets strategic
goals, not just technical requirements.
Identification and Mitigation of Risks in
Microservices Development: Microservices
architecture introduces complexities and risks such as
communication failures, scalability issues, and security
vulnerabilities. Thorough testing allows teams to identify
and mitigate these risks early in the development
process.
What: Involves the scope and type of testing:
Scope: This involves identifying whether the testing is
focused on individual microservices or the broader
system.
Type: Determines the nature of the testing, whether it is
functional (ensuring specific features work as expected),
non-functional (checking system performance,
scalability, and so on), or creative/manual (exploratory
testing based on intuition and experience).
Who: Defines the actors and their responsibilities.
Distributing testing responsibilities across roles brings different
perspectives and expertise, contributing to a more thorough and
comprehensive testing process:
Developers: Typically responsible for the functionality of
individual components. They can utilize tools such as
JUnit for unit testing, Mockito for mocking dependencies,
and TestNG for more advanced testing scenarios.
Testers: Focus on end-to-end scenarios, ensuring use-
case validity and consistency. Testers can leverage tools
like Selenium for automated browser testing, Cucumber
for behavior-driven development (BDD) testing, and
Apache JMeter for performance testing.
DevOps engineers: Ensure integration of system
components and overall integrity. DevOps engineers can
employ tools like Jenkins for continuous integration
and continuous deployment (CI/CD) pipelines, Docker
for containerization of test environments, and SonarQube
for code quality analysis.
Product managers: Oversee the development and
launch of a product, ensuring it meets market needs and
business objectives. Product managers can benefit from
tools like JIRA for project management and tracking,
Confluence for documentation and collaboration, and
Zephyr for test management within JIRA.
Requirement analysts: Gather and document user
needs and system requirements to guide the
development process effectively. Requirement analysts
can use tools like Enterprise Architect for modeling
system requirements, Balsamiq for wireframing and
prototyping, and ReqView for requirement management
and traceability.
End users/customers: Validate the final product
through acceptance testing. End users and customers
can utilize tools such as Selenium IDE for recording and
playback of browser interactions, Apache JMeter for load
testing from an end-user perspective, and AssertJ for
writing expressive assertions in acceptance tests.
Where: Depending on the phase of development and the
specific testing objectives, tests may be run in varied
environments like development, continuous integration
(CI) pipelines, staging platforms, or even production
settings.
When: Tests can be categorized based on their timing and
purpose:
Smoke tests: These are initial checks performed to
ensure that basic functionalities of the software are
working correctly and that the system is stable enough
for further testing.
Regression tests: These tests are conducted to verify
that bug fixes or new enhancements have not caused
unintended side effects or regressions in previously
working parts of the software.
Alpha and beta tests: Alpha tests are conducted
internally by the development team to identify and fix
any issues before releasing the software to a limited
group of external users (beta testers) who provide
feedback on usability, performance, and functionality
before the official release.
Acceptance tests: These tests are aimed at validating
that the software meets the specified requirements and
is ready for deployment, typically involving stakeholders
or end-users to ensure that the software aligns with their
expectations and needs.
How: Testing is based on methodologies, tools, and
frameworks:
Methodologies: The strategic approach to testing,
which could be manual, automated, or a mix.
Tools and frameworks: Teams use varied tools based
on expertise: developers favor language-specific
frameworks, testers often work in Java or Python, DevOps
engineers rely on Bash, Python, or PowerShell, and end-users
prefer no/low-code platforms.
A test plan can be captured as a formal, detailed document,
which provides clear directives and can be shared across teams.
Alternatively, for agile or fast-paced projects, it can be described
informally, ensuring flexibility and adaptability.

Functional testing
Functional testing of microservices checks whether services meet
their specifications, prioritizing what they do over how they do it.
Developers test individual services and their interactions with
familiar tools, testers use Java or Python for system-level checks,
DevOps engineers manage system-wide integration, and end-users
perform acceptance tests, often with no-code tools. Effective
functional tests cover boundary and negative cases and stay
specific, isolated, and resilient, rather than exercising only ideal
scenarios.

Problem
In microservices, assuring individual service functionality and
system cohesion is tough. Different scopes demand unique
strategies and responsibilities. Without clear delineation and
robust practices, hidden defects can hurt system efficiency.
Functional testing checks if software follows design/specs. Here
are some functional testing best practices:
Prioritize tests: Not all tests are of equal importance.
Based on factors such as user impact or the likelihood of
failure, prioritize tests to make the best use of available
resources.
Keep tests isolated: Ensure that one test's outcome does
not depend on another. Each test should set up its own data
and, if necessary, clean up after itself.
Ensure idempotency: The same set of inputs should
provide the same outputs every time, ensuring that tests
can be run multiple times without side effects.
Test the Happy Path first: Before diving into edge cases,
ensure that the application's main functionalities work as
expected.
Include negative test scenarios: Beyond just testing for
what should happen, test for what should not happen. For
example, verify that proper error messages are displayed
for invalid inputs.
Maintain test data: Set up the initial dataset prior to each
test. Do not assume a certain state of the data storage and/or
test environment. Regularly review and update test data to
ensure it remains relevant, especially if the application's
data structures change.
Automation is key: Automated functional tests can be run
frequently and consistently, ensuring rapid feedback and
early detection of issues.
Regularly review and update test cases: As the
application evolves, test cases should be revisited and
revised to ensure they remain relevant and comprehensive.
Use version control: Just as with your application code,
test scripts should be kept under version control to track
changes and ensure consistency across different testing
environments.
Feedback loops: Incorporate continuous feedback loops.
The faster the development team gets feedback, the
quicker the issues can be addressed.
Parallel execution: To speed up testing, especially in
larger projects, consider executing tests in parallel.
Keep tests maintainable and readable: As tests grow in
number and complexity, clear naming conventions,
comments, and modular design become crucial to
understand what each test does.
In microservices functional testing, the key lies in strategic
planning, clear role allocation, and unwavering adherence to best
practices to ensure a solid, efficient microservices ecosystem.
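To make a few of these practices concrete, here is a minimal, hypothetical JUnit 5 sketch of an isolated, idempotent functional test that prepares its own data before every run and checks both a happy-path and a negative scenario; the in-memory order list merely stands in for whatever persistence the service really uses:

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

public class OrderQueryFunctionalTest {

    // Hypothetical in-memory stand-in for the service's data storage
    private List<String> orders;

    @BeforeEach
    public void setUpInitialData() {
        // Each test creates its own known state; nothing is assumed about
        // leftovers from previous tests or a shared environment
        orders = new ArrayList<>();
        orders.add("ORDER-1");
        orders.add("ORDER-2");
    }

    @Test
    public void testHappyPathReturnsAllSeededOrders() {
        assertEquals(2, orders.size(), "Expected exactly the two seeded orders");
    }

    @Test
    public void testNegativeScenarioUnknownOrderIsAbsent() {
        // Negative case: verify what should NOT be present
        assertFalse(orders.contains("ORDER-999"));
    }
}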

Unit test
Unit testing in microservices ensures software quality. It validates
small, testable parts, usually individual classes. Unit tests are
granular, evaluating components in isolation and ensuring they
function independently. They avoid external dependencies, like
databases, which makes them fast and reliable
(refer to Figure 11.2):

Figure 11.2: Unit testing of microservice components

In the broader spectrum of software testing, unit testing emerges
as the frontline, safeguarding against potential defects in a
microservices system. By confirming the reliability of each
system component, it paves the way for more expansive tests,
anchoring the foundation for a robust microservices ecosystem.
Java boasts a rich ecosystem of frameworks for unit testing. Here
are some of the most popular and widely-used ones:
JUnit: Perhaps the most renowned unit testing framework
for Java, JUnit has been instrumental in the practice of test-
driven development. The latest version, JUnit 5, comes with
a lot of new features and a modular architecture.
TestNG: Inspired by JUnit, TestNG (where NG stands for
Next Generation) extends its capabilities and introduces
new functionalities, making it suitable for more extensive
test configurations and parallel execution.
Hamcrest: While primarily known as a matcher framework
(providing assertThat style assertions), it is often used in
conjunction with JUnit to write more expressive unit tests.
AssertJ: A library providing rich and fluent assertions for
enhancing the readability of unit tests.
JUnitParams: An extension to JUnit that allows
parameterized tests. It provides an easy way to run the
same test multiple times with different parameters.
Truth: Asserting library from Google that aims to provide
more readable and fluent assertions for unit tests.
Let us consider a simple class Calculator that has a method to add
two integers. We will then create a unit test using JUnit to test
this method.
Here is the Calculator class (Code snippet 11.1):
public class Calculator {
    public int add(int a, int b) {
        return a + b;
    }
}

Now, let us write a JUnit test for the add method (Code snippet
11.2):
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class CalculatorTest {

    @Test
    public void testAdd() {
        Calculator calculator = new Calculator();

        // Test case: Adding positive numbers
        int result = calculator.add(3, 4);
        assertEquals(7, result, "Expected addition of 3 and 4 to be 7");

        // Test case: Adding negative numbers
        result = calculator.add(-3, -4);
        assertEquals(-7, result, "Expected addition of -3 and -4 to be -7");

        // Test case: Adding a positive number with a negative number
        result = calculator.add(3, -4);
        assertEquals(-1, result, "Expected addition of 3 and -4 to be -1");

        // Test case: Adding zero with a number
        result = calculator.add(0, 4);
        assertEquals(4, result, "Expected addition of 0 and 4 to be 4");

        // Test case: Adding a number with zero
        result = calculator.add(3, 0);
        assertEquals(3, result, "Expected addition of 3 and 0 to be 3");
    }
}

Test-driven development (TDD) is a software development
approach highly compatible with unit testing in Java. It begins by
writing failing unit tests that describe the desired behavior of a
specific unit of code. Developers then proceed to write the
minimum amount of code necessary to pass these tests. Once
the tests pass, the code is refactored to enhance its design while
ensuring all unit tests continue to pass. TDD emphasizes early
and continuous testing, promoting code quality by encouraging
developers to focus on requirements and design upfront. In the
Java ecosystem, TDD with unit tests provides a structured
methodology for producing robust and maintainable code, while
also serving as a safety net during refactoring processes.
Ultimately, TDD and unit testing in Java facilitate the creation of
reliable software systems by fostering a disciplined and
systematic development approach.
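As a minimal sketch of the TDD cycle, the test below would be written first and fail (red); the hypothetical PriceCalculator is then given just enough code to make it pass (green) before any refactoring:

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

public class PriceCalculatorTest {

    // Step 1 (red): written before PriceCalculator exists, so it fails at first
    @Test
    public void testDiscountIsApplied() {
        PriceCalculator calculator = new PriceCalculator();
        assertEquals(90.0, calculator.applyDiscount(100.0, 10), 0.001);
    }
}

// Step 2 (green): the simplest implementation that makes the test pass;
// step 3 (refactor) would improve the design while keeping the test green
class PriceCalculator {
    public double applyDiscount(double price, int percent) {
        return price - (price * percent / 100.0);
    }
}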
Following are the pros and cons of unit test:

Pros:
Identifies issues at initial stages.
Encourages modularity and separation.
Safely change and improve code.

Cons:
Can lead to unrealistic tests.
Hard to capture all edge cases.
Does not test complex user interactions.

Integration test
Microservices integration testing checks how a microservice
interacts with external dependencies, ensuring they work
together as planned. Unlike unit tests for single components,
integration testing assesses a microservice in a real environment,
connecting it to actual instances of dependencies like other
microservices, databases, caches, message brokers, or logging
services (refer to Figure 11.3):
Figure 11.3: Integration testing of microservice with dependencies

Developers transition from unit to integration testing using
familiar frameworks, but the setup is more complex. Accurate
testing needs a production-like environment with all
dependencies. Automation tools often script this setup for
consistency.
Connection configurations, managed via environment variables,
provide flexibility for testing in different environments (dev, test,
staging, prod) by adjusting configurations. Defaulting to the
development environment is recommended for quicker
onboarding of new developers. They only need to set up the dev
environment, check out, and run the code.
The example below illustrates how integration test can be
configured using environment variables to connect to external
dependencies in test environments (Code snippet 11.3):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

public class UserPersistenceIntegrationTest {

    private String databaseUrl;
    private Connection connection;

    @BeforeEach
    public void init() throws Exception {
        databaseUrl = System.getenv("DATABASE_URL");
        if (databaseUrl == null || databaseUrl.isEmpty()) {
            databaseUrl = "jdbc:postgresql://localhost:5432/userdb?user=test&password=test";
        }
        connection = DriverManager.getConnection(databaseUrl);
    }

    @Test
    public void testUserPersistence() throws Exception {
        // ... Set up some sample user data ...

        // Persist user data to the database
        // ...

        // Fetch user data from the database
        ResultSet resultSet = connection.createStatement()
            .executeQuery("SELECT * FROM users WHERE username = 'testUser'");

        // Assertions to verify persisted data is as expected
        // ...
    }
}

Mock services or service virtualization are invaluable tools for
simulating external dependencies when real instances are
unavailable or impractical. They allow developers and testers to
replicate the behavior of external systems in controlled
environments, facilitating testing and development processes.
However, challenges like flaky tests often arise due to the
dynamic nature of external systems or inconsistencies in their
behavior. To mitigate these challenges, teams can employ
several strategies. First, they can establish stable mock services
that closely mimic the behavior of actual dependencies.
Additionally, implementing retry mechanisms and timeouts in
test suites can handle transient errors and timeouts caused by
unreliable external systems. Regular monitoring and updating of
mock services to reflect changes in the actual dependencies are
also crucial. Finally, incorporating end-to-end testing to validate
the integration of mock services with the rest of the system can
ensure the reliability of the entire application under test. By
employing these strategies, teams can effectively manage flaky
tests and maintain the integrity of their testing processes.
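One simple way to apply the retry-and-timeout strategy mentioned above is a small helper that re-runs a check against an external dependency a few times before giving up. This is a generic sketch, not tied to any particular framework:

import java.util.concurrent.Callable;

public final class RetrySupport {

    private RetrySupport() {
    }

    /**
     * Re-invokes the check until it returns true or the attempts are exhausted,
     * pausing between attempts to ride out transient failures of external systems.
     */
    public static boolean retry(Callable<Boolean> check, int maxAttempts, long delayMillis)
            throws Exception {
        Exception lastError = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (Boolean.TRUE.equals(check.call())) {
                    return true;
                }
            } catch (Exception e) {
                lastError = e; // remember the failure and try again
            }
            Thread.sleep(delayMillis);
        }
        if (lastError != null) {
            throw lastError;
        }
        return false;
    }
}

An integration test could then wrap an eventually consistent check, for example RetrySupport.retry(() -> userExistsInDatabase("testUser"), 5, 1000), where userExistsInDatabase is a hypothetical query against the real dependency.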
In essence, integration testing for microservices ensures that as
units come together, the bigger picture, a harmonious and
interacting system, emerges as designed.
Following are the pros and cons of integration test:

Pros:
Identify communication problems between components.
Ensure data consistency across boundaries.
Test environment-specific configurations.
Validate databases, caching, and messaging systems.

Cons:
Requires detailed environment configuration.
Takes longer than unit tests.
Rely on external systems and components.

End-to-end test
End-to-end testing, in the context of microservices, is a
comprehensive testing approach that verifies a system's overall
functionality, behavior, and integration across all its components.
Rather than focusing on individual microservices or internal
processes, end-to-end testing ensures that the entirety of the
system, from the user interface down to the database layers and
everything in-between, operates harmoniously and meets the
defined specifications (refer to Figure 11.4):

Figure 11.4: End-to-end testing of a microservice system

In complex microservices, end-to-end testing is vital to ensure
correct data flow, expected service communication, and the
intended user experience.
DevOps engineers (product assemblers) verify system integrity,
mainly using bash, Python, and PowerShell to automate
deployment and ensure system cohesion. They may use
JavaScript for UI layer tests.
Prior to conducting end-to-end testing, the microservices system
must be deployed in an environment closely resembling the
production setup, be it a test, staging, or even the production
environment. Importantly, these tests should be externally
configurable, ensuring seamless adaptation to different
environments. This configurability guarantees flexibility, enabling
tests to accurately simulate real-world scenarios under various
conditions and setups.
Here are the most commonly used frameworks for end-to-end
testing:
Selenium: The most popular framework for automating
browsers, Selenium has bindings for Java and Python, which
makes it ideal for end-to-end UI testing.
Rest-Assured: A fluent Java library you can use to test
HTTP-based REST services.
Cucumber: Used for behavior-driven development, it
integrates well with Java for writing end-to-end tests based
on business scenarios.
Requests: A simple yet powerful HTTP library, perfect for
testing RESTful services.
Behave: For behavior-driven development, similar to
Cucumber but for Python.
PyTest: A robust framework that can be used for both unit
and end-to-end testing.
Pester: It is a test and mock framework for PowerShell,
useful for both unit testing and infrastructure testing.
Bash or PowerShell + Curl: For API testing, you can use
Bash or PowerShell scripts in conjunction with tools like
Curl.
Cypress: An end-to-end testing framework that makes it
simple to set up, write, run, and debug tests in the browser.
Let us create a simple end-to-end test for a login page using
Selenium with Python.
Scenario: We want to validate that a user can access the
dashboard after successfully logging in.
Before running this test, ensure you have Selenium installed
(Code snippet 11.4):
1. pip install selenium

Additionally, download the appropriate driver for the browser you
wish to test with. For this example, we will use the ChromeDriver for
the Chrome browser (Code snippet 11.5):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Set up the driver for Chrome
driver = webdriver.Chrome(executable_path='/path/to/chromedriver')  # Replace with your path to chromedriver

# Navigate to the login page
driver.get("http://example.com/login")  # Replace with your login page URL

# Find the username and password input fields by their element IDs
username = driver.find_element(By.ID, "usernameField")  # Replace 'usernameField' with the appropriate ID
password = driver.find_element(By.ID, "passwordField")  # Replace 'passwordField' with the appropriate ID

# Input test credentials
username.send_keys("testUser")
password.send_keys("testPassword")
password.send_keys(Keys.RETURN)  # Simulate pressing the Enter key

# Wait for the dashboard to load and check for an element unique to the dashboard
driver.implicitly_wait(10)  # Wait up to 10 seconds for the element to appear
dashboard_element = driver.find_element(By.ID, "dashboardElement")  # Replace 'dashboardElement' with the appropriate ID

# Assert that the dashboard element is displayed
assert dashboard_element.is_displayed()

# Clean up and close the browser window
driver.quit()

This script initiates a browser, navigates to the login page, inputs
test credentials, and verifies that after logging in, the user is
taken to the dashboard.
Note: In a real-world scenario, you would probably use a more robust
method to wait for elements (for example, using WebDriverWait) and may
also include error handling mechanisms. This example is meant to be
straightforward and illustrative.

In essence, end-to-end testing in microservices acts as an
encompassing validation step. When individual services integrate
to form a complete system, this testing ensures they function
cohesively and reliably.
Following are the pros and cons of end-to-end test:

Pros:
Ensures the entire system works together.
Confirms data integrity across services.
Verifies service interactions.

Cons:
Setup and maintenance can be challenging, especially in
microservices environments where end-to-end test
environments are complex and resource-intensive.
Typically slower than other tests.
More prone to intermittent failures due to the dynamic
nature of dependencies and the complexity of maintaining
mock services, so managing these aspects carefully is
crucial.

Contract test
Contract testing in microservices ensures that individual services
adhere to their defined contracts or interfaces, vital for seamless
interactions in distributed microservices. These contracts specify
a service's expected behavior, documenting its inputs, outputs,
and side-effects. Honoring these expectations promotes
smoother integration with other services and consumers (refer to
Figure 11.5).
There are three primary motivations behind contract testing in
microservices:
Behavior verification: The primary goal is to ensure that
a service behaves as expected. It checks the correctness of
service responses for a given set of inputs, as well as
possible error states and their handling.
Compliance with specifications: Contract testing
enforces adherence to defined specs, ensuring that a
service meets the stipulated requirements. This ensures
that any service consuming it gets a predictable response.
Backward compatibility: As microservices evolve, it is
important that changes do not inadvertently disrupt
existing consumers. Contract testing validates that new
iterations of a service still align with previous contracts,
ensuring uninterrupted service to existing clients.
Figure 11.5: Contract testing in a microservice system

In the context of microservices, contract testing can be done for
internal and external interfaces.
External interfaces: These endpoints are exposed to external
clients, including third-party developers, applications, and front-
end systems. Ensuring stability and reliability is critical, as
failures can have far-reaching effects. Testers, often specialists in
this domain, manage contract testing to assure quality for
external consumers.
Internal interfaces: These represent inter-microservice
communication within the system. While contract testing can
ensure stability, it may not always be the most efficient method.
Integration and system tests often detect inconsistencies or
failures in these interfaces. However, when used, software
developers typically create and maintain these tests to preserve
their service's contract during changes.
Contract testing tools and frameworks play a pivotal role in
ensuring that microservices are reliable and cohesive. Here is a
list of some of the most popular tools and frameworks for
contract testing:
Pact: Pact is one of the most popular frameworks for
consumer-driven contract testing. It enables consumers to
set expectations, and providers can then verify that they
meet these expectations.
Spring Cloud contract: An offering from the Spring Cloud
ecosystem, this tool ensures that REST and messaging
applications work well together by providing support for
consumer-driven contract tests.
Postman: Though primarily an API development tool,
Postman can be used for contract testing by verifying the
response schema against expected outputs.
Swagger and OpenAPI: Using tools like Swagger
Codegen or Dredd, the Swagger/OpenAPI specifications
can be turned into a series of tests to ensure adherence to
the defined contract.
Apicurio and Microcks: While Apicurio is a tool to design
APIs, Microcks (pronounced mikes) is a tool that leverages
Apicurio designs to create mock endpoints and run tests to
ensure services adhere to their contract.
Karate: Though initially designed for API testing, Karate
has capabilities for contract testing as well.
Rest-Assured: A fluent Java library you can use to test
HTTP-based REST services.
Requests: A simple yet powerful HTTP library, perfect for
testing RESTful services.
Let us consider a simple example of contract testing using Spring
Cloud Contract, which is commonly used in Java-based
microservice architectures.
Suppose we have a service called User Service which provides
user details based on user ID. The contract for this service can be
defined in a Groovy DSL, like this one:
Contract definition
(src/test/resources/contracts/userService.groovy) (Code snippet
11.6):
Contract.make {
    request {
        method 'GET'
        urlPath(value: '/users/123')
    }

    response {
        status 200
        body([id: 123, name: 'John Doe', email: '[email protected]'])
        headers {
            header('Content-Type': 'application/json;charset=UTF-8')
        }
    }
}

This contract defines that when a `GET` request is made to
`/users/123`, the service should respond with a 200 status, a JSON
body containing user details, and the specified headers.
Test Base Class: It is used by the auto-generated tests by
Spring Cloud Contract. (Code snippet 11.7):
1. public abstract class ContractVerifierBase {

2.
3. @Before

4. public void setup() {

5. MockitoAnnotations.initMocks(this);

6. RestAssuredMockMvc.standaloneSetup(new UserController());

7. }

8. }

UserController: It is a simple controller for the sake of the
example (Code snippet 11.8):
1. @RestController

2. public class UserController {

3.
4. @GetMapping("/users/{id}")

5. public User getUser(@PathVariable int id) {

6. return new User(id, "John Doe", "[email protected]"); // Dummy


data for the sake of example

7. }

8. }

When you run the build (typically via Maven or Gradle), Spring
Cloud Contract will generate tests based on the contract you
have defined and will verify that the “User Service” adheres to this
contract.
Contract testing stands as a pillar of reliability in microservices
architecture, offering assurances that services, whether
interfacing internally or externally, adhere to their defined
contracts, promoting a cohesive and dependable system
environment.
Managing and updating contracts presents several challenges,
particularly as services evolve over time. One significant
challenge is ensuring that changes made to service contracts do
not break existing integrations or dependencies. Best practices in
this regard include establishing clear communication channels
between teams responsible for different services, documenting
contracts comprehensively, and versioning contracts to track
changes effectively. Unlike integration tests, which focus on
testing the interactions between components within the system,
contract tests specifically verify the agreements made between
services. This means that contract tests are more targeted and
can catch integration issues early in the development process.
Practical examples of contract tests include verifying the
structure and format of API responses, ensuring that required
fields are present and correctly formatted, and validating error
handling mechanisms between services. By incorporating
contract tests into the testing strategy, teams can improve the
reliability and stability of their microservices architecture while
facilitating smoother service evolution.
Following are the pros and cons of contract test:

Pros:
Catches inter-service communication problems early.
Suitable for CI/CD pipelines.
Limits the need for extensive integration tests.

Cons:
Does not replace the need for other test types.
Keeping contracts updated can be laborious.
Initial configuration can be intricate.

Acceptance test
Acceptance testing within microservices systems focuses on
ensuring that the services, both individually and as an integrated
whole, align with user specifications or requirements. Created by
testers (test developers) or sometimes directly by customers
(end-users), these tests serve as a final validation step before a
product is released, confirming that the system behaves as
intended and meets user expectations (refer to Figure 11.6):

Figure 11.6: Acceptance testing against requirements or user specifications

The approach to acceptance testing in microservices can be
bifurcated into UI-based and External Interface testing. UI-based
tests primarily evaluate the user interface, examining elements
like data coherence, navigation workflows, and overall user
experience. External Interface tests, meanwhile, target the
system's APIs or other communication touchpoints, ensuring they
adhere to the expected behaviors and output.
Central to the microservices' acceptance testing toolkit is the use
of Domain Specific Languages (DSL). Such languages
streamline the translation of non-technical user requirements
into actionable test commands, enhancing the readability and
accuracy of the tests.
Given the diverse stakeholders involved in designing and using
microservices systems, acceptance tests typically incorporate
input from individuals such as product managers, business
analysts, and end-users. This inclusivity ensures that the
developed tests accurately reflect business needs and user
expectations.
Here is a list of popular frameworks and tools commonly used for
acceptance testing:
Cucumber: A tool that supports Behavior Driven
Development (BDD). It allows the execution of plain-text
functional descriptions as automated tests.
JBehave: Another BDD tool that enables writing stories in
Java.
FitNesse: An open-source tool that supports acceptance
tests by enabling customers, testers, and developers to
collaboratively create test cases on a wiki.
Gherkin: A language used to write tests in Cucumber. It
uses plain language to describe use cases.
Selenium: While often associated with functional testing,
Selenium is also widely used in acceptance testing,
particularly for web applications.
Robot Framework: A keyword-driven test automation
framework, it is employed for acceptance testing and
Acceptance Test Driven Development (ATDD).
Gauge: Created by the makers of Selenium, Gauge
supports the creation of readable and maintainable tests.
Behat: A BDD tool for PHP which helps in writing human-
readable stories that describe the behavior of an
application.
Protractor: An end-to-end test framework developed for
Angular and AngularJS applications. It can also be used for
acceptance testing.
Postman: While primarily a tool for API development and
testing, it is also widely used for acceptance tests of APIs.
UFT (Unified Functional Testing): Previously known as
QuickTest Professional (QTP), it is a commercial tool
used for acceptance testing of desktop, mobile, and web
applications.
Let us create a simple acceptance test for a hypothetical login
page of a web application using Gauge:
1. Setup:
Firstly, make sure you have Gauge and Gauge plugins (especially
gauge-java and selenium) installed. You can install them using the
command line (Code snippet 11.9):
1. gauge init java_maven_selenium

2. Write the Spec:
Create a specification file named LoginTest.spec (Code snippet 11.10):
1. # Login Test

2.
3. ## Successful Login

4. * Navigate to "https://example.com/login"

5. * Enter username "admin" and password "password123"

6. * Click on the login button

7. * Verify the dashboard page is displayed

8.
9. ## Failed Login due to incorrect password

10. * Navigate to "https://example.com/login"

11. * Enter username "admin" and password "wrongPassword"

12. * Click on the login button

13. * Verify the error message "Invalid credentials" is displayed

3. Implement the steps:
The above spec references steps that need to be implemented.
Here is how you might implement those steps using Java and
Selenium (Code snippet 11.11):
import com.thoughtworks.gauge.Step;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class LoginTest {

    WebDriver driver = new ChromeDriver();

    @Step("Navigate to <url>")
    public void navigateTo(String url) {
        driver.get(url);
    }

    @Step("Enter username <username> and password <password>")
    public void enterCredentials(String username, String password) {
        WebElement userField = driver.findElement(By.id("username"));
        WebElement passField = driver.findElement(By.id("password"));

        userField.sendKeys(username);
        passField.sendKeys(password);
    }

    @Step("Click on the login button")
    public void clickLogin() {
        WebElement loginButton = driver.findElement(By.id("loginButton"));
        loginButton.click();
    }

    @Step("Verify the dashboard page is displayed")
    public void verifyDashboard() {
        WebElement dashboard = driver.findElement(By.id("dashboard"));
        assert(dashboard.isDisplayed());
    }

    @Step("Verify the error message <message> is displayed")
    public void verifyErrorMessage(String message) {
        WebElement errorMessage = driver.findElement(By.id("errorMessage"));
        assert(errorMessage.getText().equals(message));
    }
}

Note: This is a simple example, and in a real-world scenario, there might be
a need for better structure, setup, teardown, and error handling. The key is
to have clear, human-readable specifications in Gauge that map directly to
test steps implemented with a combination of Gauge and Selenium
commands.

In a microservices environment, characterized by its modularity
and distributed nature, it is paramount for acceptance testing
environments to closely mirror production setups. This ensures
consistency and validates that the system, as tested, is prepared
for real-world deployment and use. In essence, acceptance
testing in microservices guarantees that, irrespective of the
complexities beneath, the system delivers the desired business
outcomes and user experiences.
Following are the pros and cons of acceptance test:

Pros:
Validates functionality against requirements.
Ensures user satisfaction.
Provides clear criteria for completion.

Cons:
May not cover all possible scenarios.
Requires a stable environment.
Can be resource-intensive.

Initial state
Managing the initial state of functional tests is crucial to ensure
that tests produce consistent and accurate results. The state acts
as the starting point for the tests, creating a known environment
from which variations can be measured (refer to Figure 11.7):

Figure 11.7: Managing test data

There are two common approaches to setting this state:


Full state: This involves completely clearing the database
and populating it with a predefined set of data. This method
offers simplicity and predictability, making it ideal for
component-level testing or system-level tests where the test
environment is isolated. By starting from a clean slate, it
ensures no residual data from prior tests or activities can
skew the results.
Partial state: This method is more intricate. Instead of
clearing the entire database, it only adjusts a segment of it.
This is particularly useful in environments where tests run
concurrently, or in production settings where tests operate
alongside regular system functions. To prevent interference
with other data, the state created is often randomized.
However, the manner in which this state is established is as
important as the state itself. It is advisable to use formal
interfaces to set the system state, rather than accessing the
database directly. This practice ensures encapsulation within
microservices, a key principle in their architecture. Direct
database access can lead to a multitude of issues:
Breaking encapsulation: Directly manipulating the
database could bypass the standard checks and balances
that an interface might enforce, potentially leading to data
corruption or inconsistencies.
Test fragility: Should there be a change in the
microservice's data model, tests relying on direct database
access would fail. Conversely, if data is set using a formal
interface that maintains backward compatibility, tests
remain robust amidst changes.
Some teams might consider using separate "test databases", but
this is generally discouraged. It not only breaks encapsulation but
also risks the divergence of the test database structure from the
actual production one over time.
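The sketch below illustrates the recommended approach of establishing the initial state through the service's formal interface rather than by writing to its database. The /users endpoint, JSON payload, and expected status code are hypothetical and would need to match your service's actual API:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TestStateSetup {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    /**
     * Seeds a known user through the public API, preserving the microservice's
     * encapsulation instead of writing rows into its database directly.
     */
    public static void seedTestUser(String baseUrl) throws Exception {
        String payload = "{\"username\":\"testUser\",\"email\":\"test@example.com\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/users"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 201) {
            throw new IllegalStateException("Failed to seed test data: HTTP " + response.statusCode());
        }
    }
}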
Following are the pros and cons of initial state:

Pros:
Ensures test consistency.
Predictable test outcomes.
Reduces test interference.

Cons:
Complex partial state setup.
Might not reflect real-world scenarios.
Needs regular maintenance.

Non-functional testing
Non-functional testing ensures not just what the software does,
but how it accomplishes tasks. With the distributed nature of
microservices, they inherently face challenges related to
performance and integrity. Overlooking these aspects can
escalate into significant development roadblocks down the line.
Thus, proactive non-functional testing, especially focusing on
performance and system integrity, becomes indispensable in
building robust microservices architectures.

Problem
Microservices often require rigorous non-functional testing to
ensure their robustness and reliability in diverse scenarios.
Conducting these tests in a production-like environment under
realistic loads is paramount for accurate validation. Some
commonly validated non-functional requirements in
microservices include:
Performance: This gauges the system's response times
and throughput. Targets are typically defined by
benchmarking against industry standards or historical data.
Tests measure response times of various service endpoints
under defined load conditions.
Capacity: This assesses the maximum workload a system
can handle. Targets might be set based on expected user
counts or transaction volumes. Tests simulate these
volumes to check if the system can handle them without
degradation in performance.
Availability: Refers to the system's uptime. Targets, often
expressed as percentages (for example, 99.9% uptime),
represent the system's operational availability. Tests
typically involve monitoring tools to track system uptime
over periods.
Scalability: Examines how well the system handles
increased demands. Targets are set by projecting expected
growth rates in user or transaction volumes. Tests involve
ramping up loads to see how the system responds and if it
can scale up (or down) efficiently.
Reliability: This gauges consistent system functionality
over time. Targets might be set for acceptable error rates
or uptime ratios. Tests typically monitor system operations
over extended durations to capture any failures or
interruptions.
Security: Ensures protection against unauthorized access
and threats. Targets are defined by industry standards or
specific organizational security policies. Tests include
penetration testing, vulnerability scanning, and other
security assessments.
Latency: Represents the delay before a data transfer starts
after an instruction. Targets are often defined by
application needs (for example, a maximum of 50ms for a
particular service). Tests measure the actual delay times in
data processing or transmission.
For each of these characteristics, non-functional tests aim to
validate if the set targets are consistently met, ensuring the
microservice's efficiency and reliability in real-world conditions.
Performing these tests in an environment that mirrors production
settings ensures that any findings are immediately relevant and
actionable.

Benchmark
Benchmarks are specialized software routines designed to
evaluate the performance and responsiveness of a system under
various conditions. Their purpose is to mimic real-world requests,
thereby allowing testers to obtain an analytical view of the
system's capabilities (refer to Figure 11.8):
Figure 11.8: Non-functional testing using benchmarks

Benchmarks typically do three things:


Create a diverse range of requests that mirror typical user
interactions.
Execute a specific business transaction based on the
generated request.
Analyze the system's response to ensure it is accurate and
meets expectations.
When incorporated into a benchmarking framework or tool, these
routines can be executed systematically to obtain quantitative
data on system performance. The framework typically provides:
Rate control: Executes benchmarks at a predefined rate
or pushes the system to its limit.
Concurrency: Runs multiple benchmark threads
simultaneously, simulating real-world concurrent user
requests.
Duration control: Dictates how long the benchmarks
should run, ensuring consistent testing periods.
Metrics collection: Gathers data on key performance
indicators such as response times, error rates, and
throughput.
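To show how these elements fit together, here is a minimal, framework-agnostic benchmark harness sketched in Java: it runs a placeholder business transaction concurrently for a fixed duration and collects basic throughput, latency, and error metrics. Dedicated tools such as JMeter or Gatling provide far richer control and reporting, but the moving parts are the same:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SimpleBenchmark {

    public static void main(String[] args) throws InterruptedException {
        int threads = 10;                 // concurrency: simulated parallel callers
        long durationMillis = 10_000;     // duration control: how long to run
        AtomicLong requests = new AtomicLong();
        AtomicLong errors = new AtomicLong();
        AtomicLong totalLatencyNanos = new AtomicLong();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long deadline = System.currentTimeMillis() + durationMillis;

        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                while (System.currentTimeMillis() < deadline) {
                    long start = System.nanoTime();
                    try {
                        executeBusinessTransaction();   // the routine under test
                    } catch (Exception e) {
                        errors.incrementAndGet();
                    } finally {
                        totalLatencyNanos.addAndGet(System.nanoTime() - start);
                        requests.incrementAndGet();
                    }
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(durationMillis + 5_000, TimeUnit.MILLISECONDS);

        // Metrics collection: throughput, average latency, and error count
        long count = Math.max(1, requests.get());
        System.out.printf("Requests: %d, errors: %d, avg latency: %.2f ms, throughput: %.1f req/s%n",
                requests.get(), errors.get(),
                totalLatencyNanos.get() / 1_000_000.0 / count,
                count / (durationMillis / 1000.0));
    }

    // Placeholder for the real request; here it just simulates some work
    private static void executeBusinessTransaction() throws InterruptedException {
        Thread.sleep(5);
    }
}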
It is possible to use benchmarks to test various non-functional
characteristics by running them in different scenarios:
Performance testing: The system is tested for response
times at peak request rates. The observed times are then
compared against target benchmarks.
Volume testing: The environment is primed to expected
conditions, and benchmarks are executed at anticipated
request volumes. The goal is to ensure response times
remain within acceptable limits.
Stress testing: The system is pushed to its breaking point
by running maximum benchmarks. The peak performance
level before system failure is recorded.
Scalability testing: By altering the environment's
configuration, one can assess how resource changes (like
additional servers) affect system performance.
Reliability testing: Benchmarks are run at expected rates
over extended durations to determine if the system
maintains a low error rate over time.
Availability testing: Similar to reliability, but the focus is
on measuring the duration between system failures or
downtimes.
Non-functional testing is most effective in environments similar
to real-world production. It is important to mimic real challenges:
Having the actual amount of data and handling varied user
requests at once. Testing in such conditions helps benchmarks
give a true picture of how the system will perform. This approach
finds problems that might be missed in a more controlled setting,
ensuring the system is truly ready for real use.
Benchmarks can be executed at both the component level (like a
class or microservice) and the system level. While developers
often run performance benchmarks at the component level to
fine-tune their code, the majority of benchmarking is typically
carried out at the system level to gauge meeting non-functional
requirements.
There are a number of tools that allow to perform various types
of benchmarking as well as focus on specific types of non-
functional testing:
Apache JMeter: A load testing tool that measures the
performance of various services, including web
applications.
LoadRunner: A performance testing tool used to test
applications, measuring system behavior and performance
under load.
Gatling: A high-performance open-source load testing
framework based on Scala, Akka, and Netty.
Locust.io: An open-source load testing tool that allows you
to define user behavior with Python code and simulate
millions of simultaneous users.
Stress-ng: A tool that generates various computer
stressors to validate system robustness under heavy load.
Siege: An HTTP/HTTPS stress tester that's used to test the
strength and analyze the performance of web servers.
New Relic: A cloud-based platform offering application
performance monitoring and end-to-end transaction
tracing.
Chaos Monkey: A tool that randomly terminates instances
in production to ensure that engineers implement their
services to be resilient.
Gremlin: A chaos engineering platform used to
intentionally cause failures and test system reliability.
OWASP ZAP: A security tool used for finding
vulnerabilities in web applications during automated or
manual testing.
Burp Suite: An integrated platform for performing security
testing of web applications.
Database Benchmark: A tool that tests the performance
and capabilities of databases.
HammerDB: A load testing tool for databases, providing
performance testing for both transactional and analytical
workloads.
Pip.Benchmarks: A benchmarking framework designed
for comprehensive performance and reliability testing of
various system components and configurations, providing
actionable insights to optimize software performance.
Following is a basic walkthrough on how to use JMeter to measure
the performance of a critical business transaction:
1. Setting up JMeter:
a. Download and install JMeter from the Apache website.
b. Start JMeter using the jmeter.bat (for Windows) or jmeter.sh (for Linux/Mac) script.
2. Creating a test plan: Open JMeter and right-click on the Test Plan node, then choose Add | Threads (Users) | Thread Group (refer to Figure 11.9):

Figure 11.9: JMeter

3. Configuring users: In the Thread Group, you can set the number of users, ramp-up period, and the number of times to execute the test.
4. Adding an HTTP request:
a. Right-click on the Thread Group, then choose Add | Sampler | HTTP Request.
b. Enter the details of your HTTP request, such as server name, path, method (GET, POST, and so on), and any required parameters or headers.
5. Adding listeners for results: Right-click on the Test Plan node, then choose Add | Listener | View Results in Table (or any other listener based on your needs, like Graph Results or Summary Report).
6. Running the test: Click the green Play button. JMeter will start sending requests based on your configuration.
7. Analyzing results: Once the test is completed, the chosen listeners will display the results. For performance testing, you would be interested in metrics like average response time, max response time, throughput, and error percentage.
8. Fine-tuning and repeating: Based on the results, you can tweak the configuration, increase the number of users, adjust the request parameters, and so on, and run the tests again for further analysis.
Following are the pros and cons of benchmark:

Pros:
Objective assessment.
Repeatable measurements.
Pinpoints performance bottlenecks.

Cons:
Might not cover all use cases.
Time-consuming to set up.
Requires production-like environment.

Simulators
Simulators are indispensable in non-functional testing, providing
a means to recreate real-world system interactions without the
involvement of actual devices or users. Their primary objective is
to emulate the behavior of genuine system components,
guaranteeing that tests occur under realistic conditions (refer to
Figure 11.10):
Figure 11.10: Simulators generating load to recreate realistic conditions for non-
functional testing

Tools simulate user actions and device interactions accurately,
handling requests, system prompts, and diverse usage patterns.
Simulators excel in scalability, efficiently replicating hundreds or
thousands of interactions. They adapt to various testing
scenarios, from typical to peak loads. Simulators are cost-
effective and versatile, generating pseudo-random requests for
complex testing. They work across different environments
seamlessly.
When it comes to creating simulations for non-functional testing,
several technologies and tools are available. Here is a list of
popular ones:
JMeter: An open-source tool primarily used for load
testing. It can simulate multiple users with concurrent
threads, create a heavy load against web or application
servers, and analyze performance metrics.
Gatling: An open-source load testing tool which uses Scala
scripts for simulation definitions. It provides detailed
metrics and is known for its efficient performance.
Locust: An open-source load testing tool where test
scenarios are written in Python. It allows developers to
simulate millions of simultaneous users.
Selenium: While primarily used for web application
testing, it can be used in combination with other tools to
simulate user behavior on web interfaces.
Shadow: A network simulation/emulation tool mainly used
for networking applications and protocols.
Simulink: From MathWorks, it is primarily used for
modeling, simulating, and analyzing dynamic systems. It
supports simulation, automatic code generation, and
continuous testing.
NS-3: A discrete-event network simulator. Used widely in
research and development to simulate various networking
protocols and architectures.
Artillery: A modern, powerful, and flexible open-source
load testing toolkit. It allows for a quick ramp-up of
requests and supports both HTTP and WebSocket
protocols.
Simulators are essential tools for validating that systems can
meet real-world demands before deployment in actual
environments.
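As a rough illustration of the idea, the sketch below simulates a number of virtual devices that send pseudo-random requests with randomized think time. The sendTelemetry method is a hypothetical stand-in for a real call into the system under test; the tools listed above do the same job at far larger scale and with proper reporting:

import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DeviceSimulator {

    private static final Random RANDOM = new Random();

    public static void main(String[] args) {
        int devices = 100;  // number of simulated devices
        ExecutorService pool = Executors.newFixedThreadPool(devices);

        for (int i = 0; i < devices; i++) {
            final int deviceId = i;
            pool.submit(() -> {
                for (int request = 0; request < 50; request++) {
                    // Pseudo-random payload mimicking varied real-world usage
                    double temperature = 15 + RANDOM.nextDouble() * 20;
                    sendTelemetry(deviceId, temperature);
                    sleepQuietly(RANDOM.nextInt(500));  // randomized think time
                }
            });
        }
        pool.shutdown();
    }

    // Hypothetical stand-in for a call to the system under test
    private static void sendTelemetry(int deviceId, double temperature) {
        System.out.printf("device-%d sent temperature %.1f%n", deviceId, temperature);
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}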
Following are the pros and cons of simulators:

Pros:
Reduce the need for physical equipment or resources.
Simulate thousands of users or devices concurrently.
Consistently replicate test conditions.
Customize scenarios to match diverse requirements.

Cons:
Require expertise to configure accurately.
Need regular updates to stay relevant.
May affect system performance during tests.
Data generator
Effective non-functional testing demands realistic emulation of
real-world conditions, including user load and substantial data
volume representing long-term use by thousands of users and
devices. Creating such extensive, authentic, and structurally
sound datasets poses challenges, often involving millions of
records or terabytes of data (refer to Figure 11.11):

Figure 11.11: Data generation to fill persistent storage with realistic dataset

Development teams use data generation tools to address these
challenges. While some tools upload directly to the database, it
can breach microservices encapsulation. However, for infrequent,
supervised uploads, it may be acceptable for efficiency. A
balanced approach is creating dedicated bulk-upload features in
the interface for large-scale data integration.
Several prominent data generation solutions used for generating
large volumes of random data, especially for non-functional
testing are:
Mockaroo: A flexible, user-friendly tool that allows users
to generate custom datasets in various formats, including
CSV, JSON, SQL, and Excel. It offers a plethora of
predefined data types and lets users define their schemas.
JFairy: A Java library for generating fake data like names,
addresses, and more. It provides a fluent API and is perfect
for those who prefer to code their data generation logic.
Faker: Available for several programming languages like
Python, Ruby, PHP, and more. It is a library that helps
create massive amounts of fake, yet plausible, data.
GenerateData: An open-source script that quickly
produces large amounts of customizable data in CSV, XML,
SQL, and other formats.
Talend Data Fabric: An integrated suite of apps to help
manage data, including a robust data generator.
DataGenerator: An open-source Java tool that lets users
define a template for their data and then produce as much
of that data as they require.
TurboData: A tool specifically designed for populating
databases with large volumes of data. It can reverse
engineer an existing database schema or work with a
predefined schema.
Random User Generator: This tool generates random
user data, including names, addresses, and pictures.
DBMonster: An open-source tool that populates databases
with large, complex datasets for the purpose of testing or
benchmarking.
RedGate SQL Data Generator: Focused on SQL
databases, this tool fills your databases with realistic test
data.
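As an example of the coding approach, the sketch below uses the Java Faker library (assuming the com.github.javafaker artifact is on the classpath) to produce plausible randomized records and then hands them to a hypothetical bulk-upload interface instead of writing to the database directly:

import com.github.javafaker.Faker;
import java.util.ArrayList;
import java.util.List;

public class TestDataGenerator {

    public static void main(String[] args) {
        Faker faker = new Faker();
        List<String> userRecords = new ArrayList<>();

        // Generate a large volume of plausible, randomized user records
        for (int i = 0; i < 100_000; i++) {
            String record = String.format("{\"name\":\"%s\",\"email\":\"%s\",\"city\":\"%s\"}",
                    faker.name().fullName(),
                    faker.internet().emailAddress(),
                    faker.address().city());
            userRecords.add(record);
        }

        // Hand the batch to a bulk-upload endpoint rather than writing to the
        // database directly, preserving the microservice's encapsulation
        bulkUpload(userRecords);
    }

    // Hypothetical placeholder for the service's bulk-upload interface
    private static void bulkUpload(List<String> records) {
        System.out.println("Uploading " + records.size() + " generated records...");
    }
}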
Following are the pros and cons of the data generator pattern:
Pros:

Quickly populate databases/systems.


Produce repeatable datasets for testing.
Generate various types of data, structures, and formats.

Cons:
Synthetic data might not reflect real-world anomalies.
Changes in systems or requirements can necessitate
generator updates.
Some tools might not be compatible with all systems or
databases.

Mock
Setting up external dependencies in testing can be intricate and
time-consuming. Developers use mocks to simplify this process,
replacing genuine systems with simulated stand-ins. These
mocks can range from basic placeholders returning empty results
to advanced simulations mimicking real-time system behavior.
In unit testing, where the focus is on scrutinizing individual units
of code in isolation from external dependencies, mocks play a
pivotal role. They enable developers to mimic the behavior of
these dependencies, facilitating targeted testing of the logic
within the unit being examined.

Problem
Testing in complex environments with many external
dependencies poses challenges. Setting up and handling these
dependencies is time-consuming and can lead to unreliable tests.
Real services can be unpredictable, causing inconsistent results
that make it hard to spot real problems. Plus, changes in
dependencies can affect the entire testing process.

Solution
Using mocks to cut external dependencies provides a controlled
environment for testing, especially in microservices architectures
where services often rely on multiple external systems. These
mock implementations allow developers to simulate the behavior
of external components, ensuring that tests are not only faster
but also more reliable (as shown in Figure 11.12):
Figure 11.12: Cutting external dependencies with mocks

There are three primary types of mocks used for this purpose:
Null (or Dummy) implementations: These are the
simplest forms of mocks that essentially do nothing. They
might return null, an empty object, or a default value.
Advantages: Speeds up testing by quickly bypassing the
external system. Great for tests where the specific
behavior of the dependency is not under examination.
Limitations: Does not simulate the actual behavior of an
external system, so it may miss potential integration
issues or side effects.
Hard-coded implementations: These mocks return
predefined responses based on a specific input. They do not
have any processing logic but serve static responses that
are set up in advance.
Advantages: Allows for controlled testing by simulating
specific scenarios, especially useful for testing edge
cases or expected behaviors.
Limitations: Since responses are static, they might not
cover all potential interactions, and there is a risk of
them becoming outdated if not maintained.
In-Memory (simulated) implementations: These are
sophisticated mock implementations that simulate the
behavior of the actual external system. They might involve
in-memory databases, simulated processing logic, or mimic
actual service behaviors. Their advantages and limitations
are:
Advantages: Offers a closer-to-real testing environment
without connecting to the actual external systems. It can
handle a variety of scenarios dynamically and is
especially useful when the external system's behavior is
complex.
Limitations: Requires more effort to set up and
maintain. Might still have discrepancies compared to the
real external system.
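For contrast with the framework-generated mocks discussed next, an in-memory implementation can simply be written by hand against the service's own interface. Here is a small sketch for a hypothetical UserRepository:

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface that the real microservice implements with a database
interface UserRepository {
    void save(String id, String name);
    Optional<String> findName(String id);
}

// In-memory (simulated) implementation used only by tests
class InMemoryUserRepository implements UserRepository {

    private final Map<String, String> users = new HashMap<>();

    @Override
    public void save(String id, String name) {
        users.put(id, name);
    }

    @Override
    public Optional<String> findName(String id) {
        return Optional.ofNullable(users.get(id));
    }
}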
Here are some popular Java mocking frameworks:
Mockito: One of the most popular Java mocking
frameworks. It provides simple and powerful APIs for
stubbing and spying on Java classes and interfaces.
PowerMock: This framework is often used in conjunction
with Mockito. It allows for mocking static methods,
constructors, and final classes, which are typically hard to
mock.
EasyMock: This is another widely-used framework that
provides mock objects for interfaces in JUnit tests by
generating them on the fly using Java's proxy mechanism.
JMock: JMock focuses on explicitly specifying the behavior
of the mocks using a domain-specific language (DSL)
contained within JMock itself.
Mockachino: A simpler and more lightweight framework
with easy-to-read error messages.
Spock: While primarily a Groovy testing framework, Spock
provides mocking capabilities that can also be used with
Java code.
WireMock: Useful for stubbing and mocking web services,
it allows you to set up standalone HTTP-based stub servers.
Let us use Mockito as it is one of the most popular Java mocking
frameworks. Suppose we have an external microservice
PaymentService that we want to mock in our tests (Code snippet
11.12):
public interface PaymentService {
    boolean processPayment(double amount, String accountId);
}

public class OrderService {
    private final PaymentService paymentService;

    public OrderService(PaymentService paymentService) {
        this.paymentService = paymentService;
    }

    public boolean placeOrder(double amount, String accountId) {
        // ... other business logic ...
        return paymentService.processPayment(amount, accountId);
    }
}
Now, let us mock the PaymentService in our test (Code snippet 11.13):
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;
import static org.mockito.Mockito.verify;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

public class OrderServiceTest {

    @Test
    public void testPlaceOrder() {
        // Create mock of PaymentService
        PaymentService mockPaymentService = mock(PaymentService.class);

        // Define behavior of the mock
        when(mockPaymentService.processPayment(100.0, "12345")).thenReturn(true);

        // Use the mock in OrderService
        OrderService orderService = new OrderService(mockPaymentService);

        // Assert the behavior
        assertTrue(orderService.placeOrder(100.0, "12345"));

        // Verify that the mock method was called
        verify(mockPaymentService).processPayment(100.0, "12345");
    }
}

In this example, we have mocked the PaymentService so that we do not actually make a call to the external service during our tests. Instead, the mock just returns true when it is called with specific arguments.
While using mocks offers various advantages in testing, it is
essential to supplement mock-based tests with integration and
end-to-end tests to ensure that the microservice interacts
correctly with real external systems.
Following are the pros and cons of mock:
Pros:
Test components in isolation without real dependencies.
Faster test execution compared to real interactions.
Simulate various scenarios, including edge cases.

Cons:
Additional code to create and maintain mocks.
Mocks may not behave exactly like the actual implementations; significant differences can appear in error handling, concurrency, response times, and other behaviors or non-functional characteristics.
Passing tests with mocks does not guarantee real-world
functionality.

Chaos Monkey
Ensuring system reliability is a cornerstone of modern
technology. To bolster this, a defensive coding strategy, where
potential failures are anticipated, becomes essential. The Chaos
Monkey pattern exemplifies this approach. In microservices
architectures, deliberate component failures are introduced,
compelling the system to adapt and recover. Though initially
challenging, this method cultivates the development of resilient
systems that hold up under adverse conditions.

Problem
Systems can break down unexpectedly; as Werner Vogels once
pointed out, everything fails all the time. To make
software strong against such breakdowns, developers need to
write protective code. But even with regular checks and tests,
some problems might still go unnoticed. Discovering these issues
later makes them harder to fix. The Chaos Monkey approach
helps by purposely causing problems to see if the system can
handle them. So, the big question is: How can developers
regularly find and fix these hidden issues to make their software
more reliable?

Solution
The Chaos Monkey pattern, inspired by Netflix's strategy,
disrupts a system with planned failures. Rather than waiting for
unplanned issues, it actively introduces disruptions, even in live
production. The idea is that a strong system should recover from
these disruptions. If not, fixing the vulnerability becomes a
priority. Initially challenging for teams, it encourages robust
system design and improves reliability with time.
To simulate the Chaos Monkey behavior in a Spring Boot
application, you can use an application event listener coupled
with Spring's environment to read configuration properties. Here
is a simple illustration:

ChaosMonkey configuration:
Using application.properties or application.yml for external
configuration (Code snippet 11.14):
# Mean time to failure in milliseconds (60 seconds in this example)
chaosmonkey.mttf=60000
# 10% failure probability
chaosmonkey.probability=0.1

ChaosMonkey component (Code snippet 11.15):
import java.util.Random;
import java.util.Timer;
import java.util.TimerTask;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class ChaosMonkey {

    @Value("${chaosmonkey.mttf}")
    private long mttf;

    @Value("${chaosmonkey.probability}")
    private double probability;

    private final Random random = new Random();

    @EventListener(ApplicationReadyEvent.class)
    public void initiateChaos() {
        new Timer().schedule(new TimerTask() {
            @Override
            public void run() {
                if (random.nextDouble() < probability) {
                    System.exit(1); // Crash the microservice
                }
            }
        }, mttf, mttf);
    }
}

With this approach, every mttf milliseconds there is a probability chance of the application crashing. You can adjust the properties to control the frequency and likelihood of these induced failures. Remember, while this is a fun exercise, deliberately crashing production applications without a clear strategy or monitoring can have unintended consequences. Always exercise caution.
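One simple way to exercise that caution is to make the chaos behavior opt-in per environment. The sketch below assumes an additional chaosmonkey.enabled property that is not part of the snippet above; it is an illustrative variation, not a prescribed implementation:
import java.util.Random;
import java.util.Timer;
import java.util.TimerTask;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class GuardedChaosMonkey {

    @Value("${chaosmonkey.enabled:false}") // hypothetical flag; chaos stays off unless explicitly enabled
    private boolean enabled;

    @Value("${chaosmonkey.mttf:60000}")
    private long mttf;

    @Value("${chaosmonkey.probability:0.1}")
    private double probability;

    private final Random random = new Random();

    @EventListener(ApplicationReadyEvent.class)
    public void initiateChaos() {
        if (!enabled) {
            return; // keep development and test environments free of induced crashes
        }
        new Timer().schedule(new TimerTask() {
            @Override
            public void run() {
                if (random.nextDouble() < probability) {
                    System.exit(1); // Crash the microservice
                }
            }
        }, mttf, mttf);
    }
}
With such a guard, the same build can run everywhere, and chaos experiments are switched on only where monitoring and recovery procedures are in place.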
Following are the pros and cons of Chaos Monkey:

Pros:
Forces robustness.
Validates recovery mechanisms.
Encourages defensive programming.
Cons:
Initial disruption and chaos.
Can impact users if not managed.
Potential added cost (for example, auto-scaling reactions).

Conclusion
This chapter covers microservices testing, starting with test
planning. It explores functional testing types (unit, integration,
end-to-end, contract, and acceptance) and emphasizes setting
an initial state. Non-functional testing is discussed, including
benchmarks, simulators, and data generators. The chapter
introduces mocks to reduce external dependencies and presents
the Chaos Monkey pattern for handling issues. The next chapter
will teach you about scripting environments.

Further reading
1. Öztürk, M. How to Write Software Test Planning for
Successful Projects. Medium. Feb 18, 2020. Available at
https://medium.com/javascript-in-plain-english/how-
to-write-software-test-planning-for-successful-
projects-f2df2b9412a0
2. Knoldus Inc. Java Unit Testing with JUnit 5. Medium. Sep 24,
2021. https://medium.com/@knoldus/java-unit-
testing-with-junit-5-28192830704a
3. Peck, N. Microservice Testing: Unit Tests. Medium. Sep 27,
2017.
https://medium.com/@nathankpeck/microservice-
testing-unit-tests-d795194fe14e
4. Choudary, A. Functional Testing versus Non-Functional
Testing — What are the Differences? Medium. Mar 28, 2019.
Available at https://medium.com/edureka/functional-
testing-vs-non-functional-testing-a08bc732fbdd

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and Sessions
with the Authors:
https://discord.bpbonline.com
CHAPTER 12
Scripting Environments

Introduction
This chapter explains Platform Engineering's role in modern
software systems, where manual configurations are slow and
error-prone, and the solution is automation. Throughout this
chapter, we will explore the foundational patterns of scripted
environments, emphasizing the importance of consistency,
speed, and accuracy in today's ever-evolving digital
landscape.

Structure
In this chapter, we will cover the following topics:
Scripted environment
Production environment
Test environment
Development environment
Cross-platform deployment
Symmetric environment
Asymmetric environment
Dockerized environment
Deployment security
IP access lists
Traffic control rules
Management station
Environment verification
Environment test
Infrastructure certification

Objectives
This chapter guides you in using scripted environments for
efficient and reliable microservices delivery. It does not cover
all platform engineering but offers essential patterns for
faster and consistent delivery. You will understand the
drawbacks of manual infrastructure setups, value
automation in infrastructure management, recognize
scripted environment patterns, differentiate software
infrastructure components, and apply automation best
practices with confidence.

Scripted environment
Microservices systems require meticulous setup,
encompassing elements such as networks, computers,
databases, and messaging tools. Manual setup is time-
consuming and prone to errors. The Scripted Environments
pattern advocates for using automated scripts to expedite
and ensure the accuracy of the setup process.

Problem
Navigating the complexities of modern application
deployment often feels akin to traversing a digital labyrinth.
Central to this complexity is the need to understand the
distinctions between deployment platform, deployment
infrastructure, and deployment environment, especially
when deploying applications. Here is what they mean:
Deployment platform: Envision this as the overarching
umbrella, encapsulating every piece of hardware and
software required to deploy and run applications.
Deployment infrastructure: This zeroes in on the physical
realm, predominantly addressing hardware components such
as servers, storage devices, and network subnets and
routers.
Deployment environment: Picture this as a specific
configuration of a deployment platform. It is a unique mix
tailored for specific needs or stages, including the deployed
applications themselves, which are pivotal for understanding
the environment's function and context.
Finally, platform engineering emerges as a key discipline
here, devoted to meticulously crafting these deployment
platforms, ensuring they are robust, adaptive, and efficient
(refer to Figure 12.1):
Figure 12.1: Key components of a deployment environment

A deployment environment consists of a few key parts:
Hardware infrastructure: This foundational layer
houses networking components like subnets and
routers, storage, as well as computing instances
complete with their respective operating systems.
Software platform: A layer deployed on top of the
hardware infrastructure to provide infrastructure
services for deployed systems: databases, message
brokers, logging and monitoring tools, caching
systems, API gateways, and more.
Deployed applications: Backend services and
frontend applications that sit on top of the
hardware infrastructure and software platform and
serve end-users.
Manual setup of these environments has a few common
pitfalls:
Time: Crafting environments manually is a labor-
intensive endeavor, consuming vast amounts of
valuable time.
Errors: The manual approach is inherently error-
prone, susceptible to oversight or misconfigurations.
Inconsistencies: Without a standardized process,
there is a heightened risk of discrepancies between
setups, leading to unpredictable behavior.
The remedy lies in automation, embodied by the
Infrastructure as Code philosophy. A set of automated
scripts is used to provision deployment environments. These
scripts follow the same practices as regular code, including
versioning, quality control, and automated verification.
Scripting deployment environments brings a number of
benefits:
Consistency: Automation ensures uniform
deployments, mitigating inconsistency risks.
Efficiency: Scripts expedite processes, curbing setup
durations.
Precision: Automated setups drastically reduce error
margins.
Adaptability: Scripts can be fine-tuned to
accommodate evolving requirements.
Scalability: As demands surge, scripts can be
recalibrated to meet those growing needs.

Production environment
Production environments are the backbone for running
applications in real-world scenarios. Beyond hosting the final
product, they are also invaluable for non-functional testing,
which demands an authentic setup (refer to Figure 12.2):
Figure 12.2: Provisioning production environments
Automated scripts configure environments based on key properties like instance type and quantity, storage size, and so on. They typically have three basic commands:
Create: Constructs the environment from scratch,
resembling building a new structure with various
resources and services configured according to
specifications.
Update: Modifies the existing environment to
accommodate new needs or alterations, crucial for non-
destructive updates in production environments to
maintain deployed applications and data.
Delete: Systematically dismantles the environment
when it is no longer needed or requires a reset, freeing
up resources.
The provisioning scripts can output parameters of created
resources, such as IP addresses and ports, to be used later
for application day 0 configuration and maintenance.
A well-architected production environment incorporates
three key aspects:
Security: Shields against unauthorized access and
potential breaches.
Scalability: Adapts gracefully to growing demands or
loads.
Reliability: Consistently delivers top-notch
performance with minimal disruptions.
There is a toolkit of technologies that are pivotal when
orchestrating production environments.
For provisioning hardware infrastructure and cloud native
services:
AWS CloudFormation: A tool within the Amazon Web
Services arsenal, permitting users to design and
establish AWS resources.
Azure Resource Manager (ARM): Within the
Microsoft Azure ecosystem, ARM templates assist in
deploying resources for collective utilization.
Google Cloud Deployment Manager: Specific for
Google Cloud Platform, it allows users to specify all the
resources needed for an application in a declarative
format using YAML.
Terraform: An open-source instrument allowing users
to provision environments across multiple deployment
platforms. It supports AWS, Google Cloud, Azure and
OpenStack.
Oracle Cloud: A cloud computing service offered by
Oracle Corporation providing servers, storage,
network, applications, and services through a global
network of managed data centers.
The mentioned tools have limited capabilities in configuring
computing instances and deploying self-managed
infrastructure services and applications. To augment them,
teams employ Configuration Management tools. Some
notable ones include:
Ansible: A tool perfect for IT automation, from system
configuration to software deployment and advanced
task orchestration.
Puppet: Automates the provisioning and management
of servers, ensuring consistent desired states.
Chef: Manages infrastructure through code, keeping
nodes consistently configured.
Following is a basic example of provisioning an environment
on AWS using Terraform. This Terraform configuration sets up
three main components on AWS:
Elastic Kubernetes Service (EKS) Cluster: The
eks module provisions an EKS cluster named "my-eks-cluster"
in the specified region (in this case, "us-west-2").
It configures two subnets for the cluster and
sets up a node group with the specified capacity and
instance type.
Relational Database Service (RDS) with MySQL:
The aws_db_instance resource provisions an RDS MySQL
database instance named "mydb". It uses a db.m4.large
instance class with allocated storage of 20 GB, sets up
the database in the specified subnets, and sets a
username/password for access.
Kafka Cluster: The kafka module provisions a Kafka
cluster with an instance count of 3, uses the "m5.large"
instance type for Kafka brokers, and specifies Kafka
version 2.4.0.
Configuration file (terraform.tfvars, Code snippet 12.1):
region                = "us-west-2"
eks_cluster_name      = "my-eks-cluster"
eks_cluster_version   = "1.21"
subnets               = ["subnet-abcde012", "subnet-bcde012a"]
eks_desired_capacity  = 2
eks_max_capacity      = 3
eks_min_capacity      = 1
eks_instance_type     = "m5.large"
key_name              = "my-key"
rds_allocated_storage = 20
rds_instance_class    = "db.m4.large"
rds_db_name           = "mydb"
rds_db_username       = "admin"
rds_db_password       = "yourpassword"
kafka_instance_count  = 3
kafka_instance_type   = "m5.large"
kafka_version         = "2.4.0"
AWS Provider and Terraform configuration (main.tf - Code snippet 12.2):
provider "aws" {
  region = var.region
}

# EKS Configuration
module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = var.eks_cluster_name
  cluster_version = var.eks_cluster_version
  subnets         = var.subnets

  node_groups = {
    eks_nodes = {
      desired_capacity = var.eks_desired_capacity
      max_capacity     = var.eks_max_capacity
      min_capacity     = var.eks_min_capacity

      instance_type = var.eks_instance_type
      key_name      = var.key_name
    }
  }
}

# RDS Configuration
resource "aws_db_instance" "default" {
  allocated_storage    = var.rds_allocated_storage
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "5.7"
  instance_class       = var.rds_instance_class
  name                 = var.rds_db_name
  username             = var.rds_db_username
  password             = var.rds_db_password
  parameter_group_name = "default.mysql5.7"
  skip_final_snapshot  = true
}

# Kafka Configuration
module "kafka" {
  source         = "SuperQueuer/kafka/aws"
  instance_count = var.kafka_instance_count
  instance_type  = var.kafka_instance_type
  kafka_version  = var.kafka_version
  ...
}

# Variables Declaration
variable "region" {}
variable "eks_cluster_name" {}
... # and so on for each variable
The steps are:
1. Create the environment:
a. Initialize Terraform modules and plugins (Code snippet 12.3):
terraform init
b. Apply the configuration to provision resources (Code snippet 12.4):
terraform apply
2. Update the environment:
a. Make the desired changes to the main.tf configuration, then apply the updated configuration (Code snippet 12.5):
terraform apply
3. Delete the environment:
a. To destroy or delete all the resources (Code snippet 12.6):
terraform destroy

Provisioning scripts for production environments offer an efficient means of deploying and managing applications. Supported by precise configurations, these scripts guarantee that the environment remains in sync with the dynamic requirements of modern software systems.
Following are the pros and cons of production environment:

Pros:
Easily adjust based on demand or requirements.
Scripts act as a record of environment specifications.
Enable the creation and deletion of environments as
needed, thereby freeing up resources and lowering
development expenses.

Cons:
Need expertise to write and troubleshoot scripts.
Some scripts might not handle unforeseen scenarios
flexibly.
Reliance on specific scripting or provisioning tools may
lead to lock-in.

Test environment
Test environments are essential for verifying service
functionality without the complexity of production setups.
Customized for testing, they closely mimic production
behavior but incorporate key differences to align with testing
goals and cost-efficiency.
There are a few key differences between production and test
environments:
Cost efficiency: Production environments ensure
robustness, scalability, and user experience, often with
redundancy and higher costs. Test environments
prioritize cost efficiency, removing redundancies and
scaling down for functional accuracy.
Infrastructure scale: Test environments serve
smaller user bases compared to production, allowing
them to be less performant and scalable. In contrast to
multiple replicas in production for load balancing, test
environments typically use a single instance.
Life cycle: Production environments are long-lived,
built to operate indefinitely. In contrast, test
environments exhibit diverse lifetimes. Some are
permanent, resembling scaled-down production setups,
while others are ephemeral, created for specific tests
or development cycles and dismantled afterward.
Test environments adapt to development needs and team
choices, being either long-lived or on-demand:
Permanent test environments: These are
continuously active, offering a stable platform for
continuous integration and regular testing, ensuring
consistent validation of new changes or features.
On-demand test environments: These are temporary
and created when necessary. They are beneficial for
specific test cases, feature branches, or unique
scenarios. After fulfilling their testing purpose, they
can be dismantled, reducing costs.
The same set of tools that breathe life into production
environments is employed for test environments. Tools like
Terraform, Ansible, and Kubernetes are agnostic to the
nature of the environment. They can script, provision, and
manage both production and test setups with equal ease.
The difference lies in the configuration, where test
environments might have reduced resources, fewer
instances, or even mock services.
Following are the pros and cons of test environment:

Pros:
Reduced infrastructure needs compared to production.
Reproducible setups ensure uniform testing conditions.
Easily adjustable configurations for varied test cases.

Cons:
Requires time to script and configure initially.
Need to update scripts with changing testing
requirements.
Differences from production might miss some issues.

Development environment
In an organized development process, developers focus on
individual components, running them with their
dependencies. Occasionally, they may need to test
component integration or troubleshoot complex system-wide
problems. Creating individual test environments is costly,
and sharing them can cause issues when developers
overlap.
A better solution is to create scripts to provision
development environments for individual use. They are even
more optimized than test environments to require minimum
resources, which can even fit a development machine.
Common technologies for local development environments
include:
Vagrant: Vagrant manages virtualized development
environments. It allows for the creation and
management of virtual machines (VMs) that can
mimic server setups, ensuring that developers have a
consistent and isolated environment for their tasks.
Minikube: A local Kubernetes environment. Minikube
lets developers run Kubernetes locally, providing a
platform that closely mirrors production setups but in a
lightweight, containerized manner suitable for personal
machines.
Docker (Docker-Compose): While Minikube brings
Kubernetes into play, plain Docker can also be used to
create containerized environments for individual
services and their dependencies.
When implementing microservices with a serverless
architecture, developers might require cloud-based personal
development environments, similar to test setups. To control costs,
it is crucial to optimize these environments for savings and
shut them down when they are not in use.
The example below demonstrates provisioning of a local
development environment similar to the production
environment shown above. It uses a Kubernetes YAML file to
provision a MySQL database and a Kafka message broker. The
steps are as follows:

Prerequisites:
Following are the prerequisites:
Ensure Minikube and `kubectl` are installed. If not, install
Minikube (https://minikube.sigs.k8s.io/docs/start/) and
kubectl (https://kubernetes.io/docs/tasks/tools/).
1. Creating the environment:
a. Start Minikube (Code snippet 12.7):
minikube start
b. Create a YAML file named `dev-environment.yaml` (Code snippet 12.8):
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
spec:
  selector:
    app: mysql
  ports:
    - protocol: TCP
      port: 3306
      targetPort: 3306
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "password"
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
spec:
  selector:
    app: kafka
  ports:
    - protocol: TCP
      port: 9092
      targetPort: 9092
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: confluentinc/cp-kafka:latest

c. Apply the configuration (Code snippet 12.9):
kubectl apply -f dev-environment.yaml
This command will create both the MySQL and Kafka services, along with their respective deployments.
2. Deleting the environment:
For development environments, it is often a good practice to clean up resources post usage (Code snippet 12.10):
kubectl delete -f dev-environment.yaml
To stop Minikube (Code snippet 12.11):
minikube stop
Note: In development environments, non-destructive updates are seldom needed. If modifications are necessary, it is usually better to delete and recreate the environment.

In essence, a scripted development environment captures the essentials of the system while remaining minimalistic, personalized, and locally deployable. Striking this balance between scale and functionality allows developers to work efficiently, reducing costs and enhancing productivity.
Following are the pros and cons of development
environment:
Pros:
Ensures every developer works within a uniform
environment.
New developers can get started quickly without
lengthy setup processes.
Environments can be versioned alongside code,
ensuring compatibility.
Cons:
Developers may need to learn new tools or scripting
languages.
Local machines might lack the resources for certain
configurations.
Differences between development and production can
lead to overlooked issues.

Cross-platform deployments
In today's diverse multi-platform landscape, applications
span on-premises and various clouds. Scripting ensures
consistent deployments, eliminating differences and
ensuring a uniform application experience across hosting
platforms. In essence, scripting is crucial for consistency in
modern cross-platform deployments.

Problem
Organizations supporting cross-platform deployments face a
dilemma: Prioritize optimizing applications for each
platform's strengths or prioritize uniformity for simplicity in
development and deployment. This choice can affect
application performance and deployment efficiency.
However, at its core, this decision hinges on the need for
scripts that ensure a consistent deployment experience
across all platforms. The challenge lies in creating robust and
adaptable scripts that enable seamless deployments,
whether accommodating platform nuances or enforcing
standardization.

Asymmetric environment
Asymmetric environments involve setups that offer similar
yet distinct infrastructure services customized for the
strengths of each deployment platform. This approach aims
to maximize the native capabilities of each platform. For
example, on AWS, EKS, SQS queues, and RDS are
preferred, while on Azure, AKS, Storage Queues, and Azure
SQL may be used to achieve a functionally similar
configuration (refer to Figure 12.3):

Figure 12.3: Provisioning and application development in asymmetric environments
An essential element facilitating smooth operation in
asymmetric environments is the adapter architecture within
applications. This design enables applications to adjust
during deployment to match the services of the chosen
platform. In essence, the core of the application remains
constant, but specific connectors or "adapters" can be
swapped to align with the platform's offerings.
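A minimal sketch of this adapter idea is shown below; the MessageQueue interface and the adapter class names are illustrative assumptions and do not wrap real SDK calls:
public interface MessageQueue {
    void publish(String topic, String payload);
}

// Adapter selected when deploying to AWS (would delegate to the SQS SDK).
public class SqsMessageQueue implements MessageQueue {
    public void publish(String topic, String payload) {
        // call the AWS SQS client here
    }
}

// Adapter selected when deploying to Azure (would delegate to the Storage Queues SDK).
public class StorageQueueMessageQueue implements MessageQueue {
    public void publish(String topic, String payload) {
        // call the Azure Storage Queues client here
    }
}

// The core of the application depends only on the abstraction and never changes per platform.
public class OrderEvents {
    private final MessageQueue queue;

    public OrderEvents(MessageQueue queue) {
        this.queue = queue; // the concrete adapter is chosen at deployment time
    }

    public void orderPlaced(String orderId) {
        queue.publish("orders", orderId);
    }
}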
While applications may need to adapt to their environment,
deployment scripts should strive for consistency across all
platforms. This uniformity simplifies the experience for
developers and operations teams, reducing the learning
curve and potential errors. To achieve this, scripts can utilize
cross-platform tools like Terraform for a unified scripting
experience. Alternatively, platform-specific tools like AWS's
CloudFormation or Azure's Resource Manager can be
integrated into custom scripts, ensuring a consistent
deployment process regardless of the underlying platform.
Following are the pros and cons of asymmetric environment:

Pros:
Access to unique features exclusive to specific
platforms.
Can leverage platform-specific scaling mechanisms.
Might reduce costs by using the most cost-effective
services on each platform.

Cons:
Requires more intricate setup and configuration.
Teams need to understand multiple platforms' services.
Different services may not integrate seamlessly with
each other.
Symmetric environment
Symmetric deployments aim for uniformity across multiple
platforms. The core of this strategy is utilizing infrastructure
services with compatible APIs, ensuring consistent operation
irrespective of the underlying deployment platform.
There are two ways to achieve this uniformity:
Self-managed services: Here, organizations deploy
their own software services on bare computing
instances for more control, albeit with increased
management effort.
Cloud-native managed services: These are out-of-
the-box services provided by cloud providers, removing
the hassle of setup and maintenance.
Here are the examples of compatible infrastructure services
available on most cloud platforms that can be used in
symmetric environments:
Kubernetes (Container orchestration)
AWS: Amazon Elastic Kubernetes Service
(EKS)
Azure: Azure Kubernetes Service (AKS)
Google Cloud: Google Kubernetes Engine (GKE)
Relational databases
MySQL:
AWS: Amazon RDS for MySQL
Azure: Azure Database for MySQL
Google Cloud: Cloud SQL for MySQL

PostgreSQL:
AWS: Amazon RDS for PostgreSQL
Azure: Azure Database for PostgreSQL
Google Cloud: Cloud SQL for PostgreSQL

NoSQL databases
MongoDB:
AWS: Amazon DocumentDB (with MongoDB compatibility)
Azure: Azure Cosmos DB (with MongoDB API)
Google Cloud: MongoDB Atlas (partner service on GCP
marketplace)

Message Brokers/Event Streaming


Kafka:
AWS: Amazon Managed Streaming for Apache Kafka (MSK)
Azure: Azure Event Hubs (with Kafka compatibility)
Google Cloud: Cloud Pub/Sub (with Kafka compatibility)

Message Queuing Telemetry Transport (MQTT):
AWS: AWS IoT Core
Azure: Azure IoT Hub
Google Cloud: Cloud IoT Core

In-memory data stores


Redis:
AWS: Amazon ElastiCache for Redis
Azure: Azure Cache for Redis
Google Cloud: Cloud Memorystore for Redis

Memcached:
AWS: Amazon ElastiCache for Memcached
Azure: Azure Managed Cache Service (previously offered, now
encourages using Redis)
Google Cloud: Cloud Memorystore for Memcached

Elasticsearch (Search and analytics)


AWS: Amazon Elasticsearch Service
Azure: Azure Cognitive Search (supports
Elasticsearch API)
Google Cloud: Elastic Cloud on Google Cloud
(partner service)
Symmetric deployments excel in simplicity. With uniform
APIs, application components are less complex, and testing
becomes easier. Standardized deployment reduces variations
and potential challenges (refer to Figure 12.4):

Figure 12.4: Provisioning and application development in symmetric environments

Scripting for symmetric environments involves using compatible services, ensuring scripts remain consistent across different deployment platforms. This approach may limit the use of platform-specific features but guarantees a streamlined and predictable deployment process.
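Because the services expose compatible APIs, the application code itself can stay identical across clouds; only externalized configuration changes. Below is a minimal sketch, assuming hypothetical environment variables for the connection details and the MySQL JDBC driver on the classpath:
import java.sql.Connection;
import java.sql.DriverManager;

public class SymmetricMySqlClient {

    public static Connection connect() throws Exception {
        // On AWS the URL points at RDS for MySQL, on Azure at Azure Database for MySQL,
        // and on Google Cloud at Cloud SQL for MySQL - the JDBC code is identical.
        String url = System.getenv("MYSQL_URL");          // e.g. jdbc:mysql://<host>:3306/mydb
        String user = System.getenv("MYSQL_USER");
        String password = System.getenv("MYSQL_PASSWORD");
        return DriverManager.getConnection(url, user, password);
    }
}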
Following are the pros and cons of symmetric environment:

Pros:
Uniform APIs and services ensure a consistent
experience across platforms.
Developers can target a single, consistent
infrastructure, reducing variations and special cases.
Common tooling and processes can be used for updates
and troubleshooting across different clouds.

Cons:
Using a lowest common denominator approach might
mean missing out on platform-specific advanced
features.
Ensuring symmetry can introduce unnecessary services
or complexity, leading to added costs.
Initially setting up symmetric environments can be
more complex due to the need to match services across
platforms.

Dockerized environment
Recently, Docker deployments have gained popularity as a
preferred method of delivering software. Kubernetes, a key container
orchestration tool, is central to this shift and is now a staple, with
managed services on major clouds.
Docker simplifies the deployment of application components, but
infrastructure services were initially excluded. These operated
outside Kubernetes, as cloud-native or self-managed services, due
to early concerns about Kubernetes potentially disrupting
mission-critical operations.
As Kubernetes evolved, the landscape changed. Now,
deploying robust infrastructure services in Kubernetes is not
only doable but efficient. Kubernetes features like
statefulsets, affinities, and resource limits have improved its
capabilities. You can run clusters on high-performance
storage-equipped nodes (refer to Figure 12.5):

Figure 12.5: Provisioning and application development in dockerized environment in Kubernetes

Here is a list of advanced Kubernetes features that enable production deployment of mission-critical infrastructure services:
StatefulSets: Ordered, unique pod management.
Persistent Volumes (PV) and Persistent Volume
Claims (PVC): Independent storage life-cycles.
High-performance storages: Fast I/O with cloud
storage solutions.
Node Affinity/Anti-Affinity: Direct workloads to
specific nodes.
Taints and tolerations: Reserve nodes for specific
tasks.
PodDisruptionBudgets: Maintain minimum service
availability.
Topology spread: Distribute workloads across
zones/nodes.
QoS classes: Prioritize resource allocation.
Resource Limits/Requests: Set CPU/memory
bounds.
Local Persistent Volumes: Mount local storage to
pods.
Readiness/Liveness Probes: Check pod health and
readiness.
Network policies: Strict pod communication rules.
StorageClass: On-demand storage provisioning.
Volume snapshots: Backup and restore functionality.
Init containers: Run tasks before main containers
start.
Mature Kubernetes simplifies environment provisioning
significantly. The deployment platform mainly requires a
Kubernetes cluster. Within this framework, all infrastructure
services and app components deploy and manage
seamlessly. This uniformity enhances cohesion, making
environment setups efficient and straightforward, a notable
advantage.
Following are the pros and cons of dockerized environment:

Pros:
Same container runs on any platform.
Containers share the host OS, using less memory.
Easily scale services up or down.

Cons:
Requires understanding of containers and
orchestration.
Requires specialized tools and practices.
Can be complex in large deployments.
Deployment security
Securing deployments is a delicate balance of art and
science. Constantly evolving threats make neglecting
security costly. Misconceptions cloud security, and a poorly
executed strategy can be as weak as no strategy.
Understanding common attack vectors and applying proven
practices creates a resilient defense against most threats.

Problem
In the complex world of system security, persistent myths
can misguide even experienced developers. One common
misconception is that hackers mainly target application APIs
to gain full control. However, while attacking public APIs can
lead to threats like DoS attacks or unauthorized access,
these usually provide limited control. SQL injection attacks
pose a greater threat by compromising database integrity
but have limited scope (refer to Figure 12.6):
Figure 12.6: Common vectors of attacks on deployed systems

Two underestimated and misunderstood attack vectors are
more dangerous. Firstly, maintenance windows grant remote
access to deployment engineers, potentially giving hackers
full system control if exploited. Secondly, binary repositories,
if compromised, let adversaries insert and spread trojans,
granting extensive capabilities beyond typical application-
level access. Recognizing and addressing these overlooked
vulnerabilities is crucial for building a truly robust security
infrastructure.

IP access lists
IP access lists serve as gatekeepers, regulating the flow of
traffic into a system based on predetermined rules tied to IP
addresses. By defining who can or cannot access a system,
these lists play a crucial role in bolstering security measures.
There are two primary types:
White list: This list is inclusive, allowing specified IP
addresses access while blocking all others. It is crucial for
restricting access to trusted entities like support
administrators or specific software repositories. In internal
corporate systems, white lists can be configured to permit
access only from designated corporate subnets, ensuring
users can connect securely within the corporate network and
preventing external unauthorized access.
Black list: This list works by excluding specific IP addresses
denying them access to the system. Black lists are used
when certain IPs are identified as potential threats, often due
to suspicious activities or known malicious sources. By
adding these IPs to the black list, all incoming connections
from them are blocked, reducing potential risks.
The white list follows the "deny all, allow specific" principle,
while the black list operates on "allow all, deny specific."
Using these lists strategically, organizations can strengthen
their systems, maintaining restricted and controlled access.
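Access lists are normally enforced at the network layer (security groups, firewalls, load balancers), but the same "deny all, allow specific" principle can be illustrated in Java as a simple servlet filter. The addresses below and the use of the Jakarta Servlet API are assumptions made only for this sketch:
import java.io.IOException;
import java.util.Set;

import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletResponse;

public class IpWhiteListFilter implements Filter {

    // Hypothetical trusted addresses, e.g. a corporate subnet gateway and a management station
    private static final Set<String> WHITE_LIST = Set.of("10.0.0.15", "10.0.0.16");

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        if (WHITE_LIST.contains(request.getRemoteAddr())) {
            chain.doFilter(request, response); // allow specific
        } else {
            ((HttpServletResponse) response).sendError(HttpServletResponse.SC_FORBIDDEN); // deny all others
        }
    }
}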
Following are the pros and cons of IP access lists:

Pros:
Simple to implement and manage.
Effective immediate barrier against unauthorized IPs.
Customizable based on specific security needs.

Cons:
Might accidentally block legitimate users.
Does not provide protection against attacks from
allowed IPs.
Can be bypassed using IP spoofing.
Traffic control rules
Traffic control rules offer a more granular approach to system
security by regulating data flow based on specific criteria.
This ensures that only the right data reaches its intended
destination, further hardening the system's security
perimeter. These rules can be classified as:
Direction (Inbound/Outbound):
Inbound: These rules manage incoming traffic to
the system. By specifying allowed sources, one can
ensure that only authorized requests are
entertained.
Outbound: Regulate the data that leaves the
system, ensuring sensitive data is not sent to
unintended or potentially harmful destinations.
Source and destination IP addresses: Define which
IPs can send or receive data. This filters out unwanted
or potentially harmful traffic at the IP level.
Source and destination port numbers: Ports act as
endpoints for communication. By specifying allowed
ports, one can ensure that only specific services or
applications communicate as intended.
To define the secure perimeter of a system, consider putting the
following rules in place:
Application access: Only allows specific external
interfaces to access the application by filtering based
on ports and IP addresses.
Inter-system communication: Facilitates safe
communication between systems. Specific IPs and
ports ensure data flow only between trusted systems.
Maintenance Windows: Provides secure entry points
for system administrators. By whitelisting specific IPs
and hosts, administrators can safely access the system
without risking intrusion.
Software installations: Secures the software
installation process by allowing downloads only from
trusted repositories. By specifying IPs and port
numbers, one can ensure software components are
fetched from genuine sources.
Traffic control rules function as a refined sieve, allowing only
the necessary data to pass through while blocking potential
threats, ensuring a robust and tailored defense mechanism.
Following are the pros and cons of traffic control rules:

Pros:
Precise regulation of incoming and outgoing traffic.
Reduces exposure to potential threats.
Allows specific services or applications to
communicate.

Cons:
Can be challenging to set up and maintain.
Processing rules can slightly slow traffic.
Legitimate traffic might occasionally be blocked.

Management station
In the realm of deployment security, the maintenance
window stands as a critical period where systems are often
most vulnerable. This window is primarily used by
deployment engineers and system administrators for remote
access to computer instances, facilitating software
installation and system maintenance tasks. Given the high
risks associated with these periods, a specific technique has
been developed to mitigate potential threats (refer to Figure
12.7):

Figure 12.7: Use of a Management Station to limit the number of maintenance windows into the system
The core idea is to establish a dedicated instance within the deployment environment, termed the Management Station. This station acts as a secure gateway or a bridge. It is the only instance openly accessible for remote connections from outside the system. Here is how it works:
Isolated access: Instead of allowing remote access
directly to critical servers or instances, system
administrators connect first to the Management
Station.
Internal connection: Once authenticated on the
Management Station, administrators can then access
other components within the secure environment.
Centralized control: By funneling all remote access
through a single point, it becomes far simpler to
monitor, log, and secure these connections.
Cost-efficient: Utilizing a small, inexpensive
computing instance (often at a marginal cost) ensures
that security is maintained without a significant
overhead.
Reduced attack surface: By limiting external connections
to just one instance, potential vulnerabilities are significantly
reduced, making it harder for unauthorized entities to
penetrate the system.
The management station acts as a protective barrier,
ensuring that while essential tasks can be performed, the
integrity of the system remains uncompromised. This
approach not only simplifies security protocols but also
ensures that as personnel or system components change,
the risk of vulnerabilities creeping in is minimized.
Following are the pros and cons of management station:

Pros:
Simplifies monitoring and logging.
Fewer points of entry for unauthorized access.
Easy to update or modify without affecting the entire
system.

Cons:
Requires separate upkeep and updates.
Even if minimal, consumes resources continuously.
Training needed for proper use and understanding.

Environment verification
In Infrastructure as Code, automated checks ensure
alignment with blueprints, boosting consistency, reducing
discrepancies, and speeding up delivery, enhancing
reliability and streamlining development to production.
Problem
Despite the automation provided by provisioning scripts,
variations in hardware characteristics and infrastructure
service versions can emerge. Such disparities risk both the
functional and non-functional attributes of deployed
environments, jeopardizing the seamless operation of
software systems. Furthermore, mismatches between
deployment platforms, or among production, test, and
development environments can catalyze frequent and severe
collisions. Thus, there is a pressing need for automated
testing to guarantee the consistency and adherence of all
provisioned environments to the intended specifications.

Environment testing
Automated environment tests act as a quality gate to ensure
that the deployed environment aligns with both its functional
and non-functional specifications. Deployed directly within
the environment they are testing, these assessments
emulate how applications interact with platform services (as
shown in Figure 12.8):
Figure 12.8: Automated environment testing

Automated environment tests can perform a number of checks. If any of those checks is unsuccessful, the tests fail:
Service accessibility verification: These tests ensure
that the infrastructure services are not only set up on
the designated hosts and ports but are also accessible
(day 0 configuration), mimicking how actual
applications would interface with them.
API compatibility checks: With constantly evolving
services and APIs, these tests validate that the versions
of APIs in the environment are consistent and
compatible, preventing any unexpected application
failures.
Functional validation: Beyond just accessibility and
compatibility, the tests delve deeper into the core
functionalities of services, confirming that key
operations and functionalities perform as expected.
Performance benchmarking: Ensuring the
environment not only functions correctly but also
efficiently, these tests measure various performance
metrics to confirm they are hitting the desired targets,
ensuring applications run optimally.
The automated environment test pattern provides a
comprehensive assessment, verifying that all facets of the
environment, from its base configuration to its performance,
are in line with the defined expectations.
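A small environment test might look like the sketch below. It assumes JUnit 5 and the MySQL JDBC driver on the classpath, and borrows the mysql-service host, mydb database, and admin user from the earlier snippets purely for illustration; real tests would read these values from the environment's day 0 configuration:
import java.net.InetSocketAddress;
import java.net.Socket;
import java.sql.Connection;
import java.sql.DriverManager;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;

public class EnvironmentTest {

    private static final String DB_HOST = System.getenv().getOrDefault("DB_HOST", "mysql-service");
    private static final int DB_PORT = 3306;

    @Test
    public void mysqlPortIsAccessible() throws Exception {
        // Service accessibility verification: the designated host and port must accept connections
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(DB_HOST, DB_PORT), 2000);
            assertTrue(socket.isConnected());
        }
    }

    @Test
    public void mysqlAnswersQueries() throws Exception {
        // Functional validation: a trivial query must succeed against the provisioned database
        String url = "jdbc:mysql://" + DB_HOST + ":" + DB_PORT + "/mydb";
        try (Connection connection = DriverManager.getConnection(url, "admin", System.getenv("DB_PASSWORD"))) {
            assertTrue(connection.createStatement().executeQuery("SELECT 1").next());
        }
    }
}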
Following are the pros and cons of environment testing:

Pros:
Guarantees uniformity across different deployment
platforms and environments.
Validates both functional and non-functional aspects of
the environment.
Pinpoints the source of issues, streamlining the
debugging process.

Cons:
Initial setup can be time-consuming and might delay
deployment.
Can sometimes flag issues that aren't critical or
relevant.
Running tests, especially performance ones, can
consume significant resources.

Infrastructure certification
In production deployments, the customer IT team handles
hardware provisioning, while the vendor deploys software.
Inadequate infrastructure setup can lead to deployment
failures or later issues. Infrastructure certification tests are
crucial to confirm proper setup and adherence to
requirements, preventing client-vendor conflicts.
The infrastructure certification performs a few checks:
Network configuration and integrity:
Connectivity checks: Ensure that all network
components, such as routers, switches, and firewalls,
are appropriately configured and functional.
Bandwidth and latency tests: Verify that the
network meets required bandwidth and latency
levels suitable for the software application's
demands.
Host accessibility and configuration:
Host reachability: Test the ability to reach all the
defined hosts within the specified network.
Port accessibility: Ensure that necessary ports on
these hosts are open and ready for connections,
adhering to the provided specifications.
Computing instance checks:
OS verification: Confirm that each computing
instance runs the correct version and
configuration of the operating system.
Service configuration: Check for essential services,
their versions, and configurations based on the
software's requirements.
Security and patch level: Assess the security
configurations, patches, and updates to make sure
they align with best practices and software demands.
Resource validation:
CPU assessment: Evaluate the processor's speed,
core count, and other vital attributes to ensure it
meets or surpasses the specified requirements.
Memory analysis: Check that each computing
instance possesses the required amount of RAM,
ensuring swift and efficient software operations.
By running these infrastructure certification tests, both
clients and vendors can have confidence that the
foundational elements are in place and ready for the
software deployment. This proactive approach minimizes
potential bottlenecks, optimizes the deployment process,
and fosters a collaborative environment between all involved
parties.
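Many of these checks can be automated as plain tests executed on the target instances. The sketch below uses JUnit 5, and the thresholds (four cores, 8 GB of memory, a Linux operating system) are illustrative assumptions rather than requirements from the text:
import java.lang.management.ManagementFactory;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;

public class InfrastructureCertificationTest {

    @Test
    public void cpuMeetsMinimumCoreCount() {
        // CPU assessment: the instance must expose at least the required number of cores
        assertTrue(Runtime.getRuntime().availableProcessors() >= 4);
    }

    @Test
    public void physicalMemoryMeetsMinimum() {
        // Memory analysis: total physical memory must meet the requirement
        // (getTotalPhysicalMemorySize is deprecated on newer JDKs in favor of getTotalMemorySize)
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        assertTrue(os.getTotalPhysicalMemorySize() >= 8L * 1024 * 1024 * 1024);
    }

    @Test
    public void operatingSystemIsExpected() {
        // OS verification: confirm the expected operating system family
        assertTrue(System.getProperty("os.name").toLowerCase().contains("linux"));
    }
}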
Following are the pros and cons of infrastructure
certification:

Pros:
Confirms that the infrastructure is set up according to
specifications before software deployment.
Ensures optimal software functioning by validating that
hardware meets requirements.
Creates a foundation for transparent communication
between involved parties.

Cons:
Additional time, tools, and expertise are needed to
develop and run tests.
Relying heavily on specific testing tools might lead to
overlooked configurations outside the tool’s purview.
Needs knowledgeable personnel to understand,
interpret, and act on test results.

Conclusion
In this chapter, we explored scripted environments, including
production, test, and development setups. We discussed
cross-platform deployment types: symmetric, asymmetric,
and dockerized frameworks. Security in deployment was
covered with IP access lists, traffic control rules, and
centralized management. We also emphasized the importance of
environment verification, including environment tests and
infrastructure certification for deployment readiness. The next
chapter introduces automating CI/CD pipelines for microservices.

Further readings
1. Mike Tyson of the Cloud (MToC). Infrastructure-as-Code
for Startups: FAQs. Medium. Aug 8, 2023. Available at
https://medium.com/@mike_tyson_cloud/infrastru
cture-as-code-for-startups-faqs-a8f682d2cdf2
2. Kejser, P. N. Setup AWS Load Balancer Controller inside
AWS EKS with AWS CDK — Infrastructure as Code.
Medium. Aug 24, 2023. Available at
https://medium.com/devops-techable/setup-aws-
load-balancer-controller-inside-aws-eks-with-
aws-cdk-infrastructure-as-code-31b05a05ab80
3. Mike Tyson of the Cloud (MToC). Why Use Terraform for
Your Infrastructure-as-Code Projects? Medium. Jul 28.
2023. Available at
https://medium.com/@mike_tyson_cloud/why-use-
terraform-for-your-infrastructure-as-code-
projects-40aa9fed1979
4. Abdurrachman. Starting with Minikube. Medium. Mar
16. Available at
https://medium.com/@empeje/starting-with-
minikube-7cb5ec2ae54a

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 13
Automating CI/CD
Pipelines

Introduction
This chapter introduces you to the essentials of automating
continuous integration and continuous delivery
(CI/CD) pipelines tailored for microservices. We will explore
the core of automated pipelines, emphasizing incremental
delivery, handling multiple deployments, and the intricacies
of product packaging. A clear demarcation between
development and DevOps will be highlighted, revealing the
synergy between them. The significance of Docker takes
center stage as we dive into its transformative influence on
the development lifecycle, from building to testing and
packaging. Quality assurance remains paramount, and we
will touch upon the delicate balance between automated and
manual quality gates. Concluding the chapter, we address
the imperative of secure delivery in software deployment. By
the end, you will be equipped with key insights to refine your
microservices' CI/CD process, merging speed with security.
Some additional insights on CI/CD automation can be found
in the “Better Delivery”
(https://www.entinco.com/programs/better-delivery)
program.

Structure
In this chapter, we will cover the following topics:
CI/CD pipeline
Incremental delivery
Multiple deployments
Application platform
Product packaging
Development/DevOps delineation
Virtualized build process
Quality gate
Automated gate
Manual gate
Secure delivery
Environment provisioning
Branching strategy

Objectives
After studying this chapter, you will understand the core
elements of automated pipelines and the distinction between
Development and DevOps roles. You will appreciate Docker's
integral role in the CI/CD process, from development to
deployment. Additionally, you will learn to strike a balance
between automated and manual checks in quality gates,
ensuring microservice excellence. The chapter will conclude
by emphasizing the criticality of Secure Delivery, underlining
the need to deploy securely. This knowledge will prime you
to refine your microservices' CI/CD processes efficiently.

CI/CD pipeline
Automated CI/CD pipelines are indispensable when
delivering microservices systems, given the inherent
complexity of managing numerous components, often
ranging from tens to hundreds. Each microservice functions
as an individual unit, demanding precise coordination during
deployment. Building and deploying software with such an
intricate web of moving parts becomes virtually impossible
without deep-rooted automation. By streamlining and
automating the processes, CI/CD pipelines ensure that each
component of a microservices system is consistently
integrated, tested, and delivered, enhancing the overall
system's quality and reliability. In the context of
microservices, an efficient automated CI/CD pipeline is not
just an advantage; it is necessary to manage the complexity
and ensure robust system delivery.

Problem
CI/CD, an acronym for continuous integration and continuous
delivery/deployment, embodies the modern approach to
automated software delivery. A CI/CD pipeline orchestrates
the software development process, automating steps from
code integration to product delivery. The primary goal of this
system is to facilitate rapid, consistent, and reliable software
releases.
CI/CD represents a modern approach to software
development that emphasizes rapid, consistent, and
automated transitions from development to deployment. The
CD in the abbreviation can refer to either continuous
delivery, which relates to handing over a packaged product
to customers, generally favored by software vendors
catering to a vast customer base, or continuous deployment,
a process that entails automatic installation or updates of
products for end-users, predominantly employed in the
realm of SaaS or for internal products within an organization.
By distinguishing between these two, organizations can
effectively align their deployment strategies with their
business models, ensuring a smoother path from
development to the user (refer to Figure 13.1):

Figure 13.1: Structure of CI/CD pipeline

A CI/CD pipeline integrates various components to streamline the software delivery process. These fundamental elements include:
Triggers: Events initiating pipeline processes.
Stages (Phases): Major pipeline sections, like build or
test.
Steps (Tasks): Individual operations within stages.
Actions (Commands): Specific commands executed in
each task.
Artifacts: Compiled binaries and libraries for
deployment.
Repositories: Central hubs that store and manage
different versions of code in VCS/code repositories or
compiled binaries in binary/release repositories,
serving as a nexus for collaborative and streamlined
development initiatives.
Environments: Specific settings where the software
runs, for example, development, staging, or production.
To maximize efficiency and security in software delivery,
adhering to the following best practices for CI/CD pipelines is
indispensable, especially in the context of complex
microservices systems:
Incremental delivery: Deploy one component at a
time to ensure stability.
Immutable infrastructure: Use automated scripts to
create deployment environments.
Immutable artifacts: Build components once and use
them throughout the pipeline.
Reproducibility: Ensure process consistency during
development, build, staging, and production
deployment.
Reliability: Incorporate automated quality gates for
consistent quality assurance at every stage.
Trackability: Clearly associate every pipeline run with
a change that triggers it.
Reversibility: Ensure the pipeline and associated
environments can revert to a stable state when
component delivery fails.
High availability: Upgrades should be seamless
without introducing system downtimes.
Maintainability: Promote shared tasks and best
practices across pipelines.
Security: Shield the pipeline from potential code or
configuration injection threats.
A well-defined CI/CD pipeline, accompanied by best
practices, is pivotal to meeting the accelerating demands of
modern software delivery, especially in microservices
architectures.
Moreover, error handling in pipelines is a critical aspect of
ensuring the reliability and robustness of data processing
workflows. Effective error handling mechanisms are essential
for detecting, reporting, and managing errors that may occur
at various stages of the pipeline, such as data ingestion,
transformation, and output. Implementing strategies like
logging, alerting, and automatic retries can help mitigate
errors and prevent pipeline failures. Additionally,
incorporating error monitoring and tracking tools allows for
proactive identification and resolution of issues, ultimately
contributing to the stability and efficiency of the pipeline.

Incremental delivery
The incremental delivery pipeline is a simplified, yet highly
efficient strategy utilized in the continuous deployment of
software, where products are ushered into production one
component at a time. This approach demands a mature
development process and robust automated quality gates to
ensure each change is seamlessly integrated without
disruptions (as shown in Figure 13.2):
Figure 13.2: Incremental CI/CD pipeline

Within this pipeline, every modification undergoes four stages:
1. Build: In this initial phase, individual components are
developed and assembled, ready for integration into
the existing system.
2. Test: Following the build stage, components are
subjected to tests to identify and rectify potential
issues, ensuring their reliability and compatibility with
the more extensive system.
3. Release: Once tested, the components are packaged
into a release format, ready for deployment into the
production environment.
4. Deploy: In the final stage, the packaged components
are deployed into the production setup, making them
accessible to end-users.
A classic example of applying an incremental delivery
pipeline can be seen at Netflix, a company renowned for its
agile and innovative software practices. Netflix meticulously
deploys updates, one component at a time, rather than
overhauling the entire system simultaneously. This strategy
minimizes the risk of system-wide failures and allows for
quicker identification and rectification of issues, ensuring a
stable and continually improving service for its customers.
Here are some well-known build servers commonly utilized in
automating CI/CD pipelines:
Jenkins: An open-source automation server offering
various plugins to support building and automating any
project.
Travis CI: A cloud-based CI/CD service that integrates
seamlessly with GitHub repositories.
CircleCI: A robust platform allowing quick, safe automation
of the development process at scale.
GitLab CI/CD: Part of the GitLab ecosystem, it offers
functionalities to automate the entire software
development lifecycle.
Bamboo: A product by Atlassian offering continuous
integration and deployment options, integrating well
with JIRA and Bitbucket.
TeamCity: A Java-based build management and
continuous integration server from JetBrains.
GitHub Actions: A tool by GitHub that allows
automation of workflows directly within the GitHub
repository, providing a platform for CI/CD processes
and more.
Azure DevOps: Provides development services to
support teams to plan work, collaborate on code
development, and build and deploy applications.
AWS CodeBuild: A fully managed build service by
Amazon Web Services.
In this hypothetical scenario, we will employ Jenkins to
orchestrate an incremental CI/CD pipeline for a mono
repository hosting several microservices. Here is how this
scenario could unfold (Code snippet 13.1):
node {
    stage('Build') {
        // Source Code Checkout: fetch the latest code from the repository
        checkout scm

        // Dependency Installation: automate the installation of necessary
        // dependencies for the impacted microservices
        sh 'npm install' // or appropriate command

        // Build Microservices: build the respective modified microservices
        // using appropriate build tools
        sh 'npm run build' // or appropriate command

        // Unit Testing: execute unit tests for the modified microservices
        sh 'npm run test' // or appropriate command

        // Package Microservice: create a deployment package for the microservice
        sh 'npm run package' // or appropriate command
    }

    stage('Test') {
        // Integration Testing: conduct integration tests to verify the interaction
        // between the modified microservices and other components
        sh 'npm run integration-test' // or appropriate command

        // Test Deployment: deploy modified microservices to a test environment
        // mirroring production to test integration with other services
        sh 'npm run deploy-test' // or appropriate command
    }

    stage('Release') {
        // Release Microservice: push the packaged microservice into a release
        // repository that is used for production deployments
        sh 'npm run release' // or appropriate command

        // Release Notification: notify customers about a new release
        sh 'npm run notify-release' // or appropriate command
    }

    stage('Deploy') {
        // Production Deployment: trigger the deployment of modified microservices
        // to the production environment without affecting other components
        sh 'npm run deploy-prod' // or appropriate command

        // Monitoring & Logging: integrate with monitoring tools to track the
        // performance and logs of the newly deployed services for anomalies
        sh 'npm run monitor' // or appropriate command
    }
}
Following are the pros and cons of incremental delivery:

Pros:
Fewer components changed per update, reducing
potential errors.
Quicker deployment times by focusing on individual
components.
Simplified identification and correction of issues.

Cons:
Potential for dependency conflicts between
components.
Requires mature and sophisticated testing processes.
Necessitates meticulous tracking and coordination.

Multiple deployments
The multiple deployments pipeline is an advanced variation
of the incremental delivery pipeline. It helps software
vendors continuously deploy products for their clients, and
large SaaS companies install and update their systems
across multiple locations (as shown in Figure 13.3):

Figure 13.3: Incremental pipeline with multiple deployments

How Amazon uses this approach with its AWS services
provides a concrete example. Amazon maintains data
centers in many corners of the world, housing vast networks
of their computer systems. When they roll out a new update
or feature, they do not do it everywhere all at once. Instead,
they use a multiple deployments pipeline to gradually
introduce these updates region by region, tailoring the
deployment to suit each area’s specific needs and
regulations.
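A minimal sketch of this idea in a Jenkins scripted pipeline is shown below; the region names and the deploy and smoke-test scripts are assumptions used only to illustrate rolling a release out region by region:

// Roll the release out one region at a time (region names and scripts are illustrative)
def regions = ['us-east', 'eu-west', 'ap-south']

node {
    stage('Deploy per region') {
        for (region in regions) {
            echo "Deploying release to ${region}"
            sh "./deploy.sh --region ${region}"     // apply region-specific configuration here
            sh "./smoke-test.sh --region ${region}" // verify the deployment before moving to the next region
        }
    }
}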
Following are the pros and cons of multiple deployments:

Pros:
Tailoring deployments to meet local demands or
regulations.
Less downtime as issues can be isolated to individual
deployments.
Simplifies extending services to new areas or clients
incrementally.

Cons:
Handling multiple deployments can be administratively
intensive.
Requires more resources for individualized
deployments.
Risk of having varied service experiences across
different regions.

Application platform
Companies that build multiple products often create an
application platform that houses shared services and
components. The entire software delivery process can be
organized as several connected pipelines (refer to Figure
13.4):
Figure 13.4: Automated software delivery with application platform and
product pipelines

The first pipeline is responsible for delivering the application
platform. During the delivery phase, it updates the
configuration of related products. It may also activate the
delivery processes for those products, triggering a domino
effect that promotes synchronized updates throughout the
system.
Through this cohesive approach, the application platform
pipeline aids in reducing inconsistencies and fostering a
smooth workflow across all projects. It is a crucial strategy
for companies aiming to maintain organized and efficient
product updates, especially when juggling multiple products.
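The domino effect described above can be expressed with the standard Jenkins build step, as in the hedged sketch below; the downstream job names, the PLATFORM_VERSION parameter, and the delivery script are hypothetical:

node {
    stage('Deliver platform') {
        // Build, test, and release the shared application platform (placeholder script)
        sh './deliver-platform.sh'
    }
    stage('Trigger product pipelines') {
        // Downstream product pipelines pick up the new platform version (job names are hypothetical)
        build job: 'product-a-delivery',
              parameters: [string(name: 'PLATFORM_VERSION', value: '2.3.0')],
              wait: false
        build job: 'product-b-delivery',
              parameters: [string(name: 'PLATFORM_VERSION', value: '2.3.0')],
              wait: false
    }
}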
Following are the pros and cons of the application platform:

Pros:
Easier tracking and control of shared components
across multiple products.
Shared services and components reduce the need for
duplicate efforts and save resources.
Ensures uniform updates, minimizing the risk of
compatibility issues between different products.

Cons:
Increased complexity in managing dependencies and
versioning across multiple products.
Dependency on the application platform may create a
single point of failure, impacting the entire ecosystem
if issues arise.
Introducing changes or updates to the application
platform may require coordination and communication
efforts across teams, potentially leading to delays or
conflicts in development schedules.
Product integration
In complex product development, teams are divided either
vertically, focusing on specific feature sets, or horizontally,
with responsibilities for frontend, backend, and edge
components. In either case, skillful integration of each
team's components is crucial to create a cohesive product
for customer delivery (refer to Figure 13.5):

Figure 13.5: Automated software delivery with product integration

To facilitate this, a system of automated pipelines can be
implemented. In essence, each team has its own pipeline
that manages the development and delivery of their
respective components. These pipelines converge into what
is known as the product integration pipeline. This
overarching pipeline is responsible for integrating the
different components, validating their coherence, and
ensuring a streamlined delivery of the final product to
customers.
An alternate variation of this approach can be observed
when a single team delivers a packaged product, bypassing
the incremental delivery of individual components.
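One possible shape of such a product integration pipeline is sketched below; it assumes the Copy Artifact plugin, and the upstream job names, artifact locations, and assembly scripts are placeholders:

node {
    stage('Collect components') {
        // Pull the latest successful artifacts produced by the team pipelines
        copyArtifacts projectName: 'frontend-pipeline', selector: lastSuccessful(), target: 'artifacts/frontend'
        copyArtifacts projectName: 'backend-pipeline',  selector: lastSuccessful(), target: 'artifacts/backend'
    }
    stage('Integrate and validate') {
        sh './assemble-product.sh artifacts/' // merge components into a single product package
        sh './run-integration-suite.sh'       // validate that the components work together
    }
    stage('Deliver product') {
        sh './publish-product.sh'             // push the validated product to the release repository
    }
}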
The following are the pros and cons of product integration:
Pros:
Facilitates organized and seamless merging of different
components.
Enables quicker transitions from development to
delivery.
Encourages teamwork and clear division of
responsibilities.

Cons:
Setting up and managing the pipeline can be intricate.
Delays or problems in one pipeline can affect the entire
product delivery.
Integration points might become congestion zones,
slowing down the process.

Development/DevOps delineation
A shift in roles is proposed to address inefficiencies and
reduce cognitive load. Here, developers handle coding,
testing, packaging, and building of individual components,
while DevOps engineers focus on integrating these
components into the delivery pipeline and creating a
cohesive system. This new approach streamlines
development, promoting a more efficient workflow.

Problem
In today's software development, defining roles between
developers and DevOps engineers can be unclear.
Traditionally, developers start by writing code, and then
DevOps engineers automate build processes. However,
developers often step in to build the system, a process
mirrored by DevOps engineers who then script the
deployments (refer to Figure 13.6):
Figure 13.6: Traditional interactions between developers and DevOps

The circular responsibility flow forces developers and DevOps
engineers to have extensive knowledge of the entire system.
DevOps engineers must intimately understand individual
components for proper integration and building. This
complexity arises because code changes in components may
affect build processes, leading to frequent interactions
between the roles.
The iterative interaction often leads to miscommunication
and friction, hindering development progress. Both parties
needing broad and deep knowledge increases cognitive load,
raising the risk of errors and misunderstandings, and
affecting efficiency. To tackle this, a reevaluation and
potential restructuring of traditional roles are needed for a
more collaborative and efficient workflow.

Solution
To address friction and communication barriers, roles will be
redefined. Developers will expand their responsibilities to
include overseeing individual components comprehensively.
This involves developing, testing, packaging, and automating
the build processes. Focusing on specific elements allows
them to become experts in their domains, gaining deep
expertise in both the component and its dependencies.
Conversely, DevOps engineers will pivot toward a broader
but less detailed oversight of the entire system. They
manage the delivery pipeline, orchestrating the integration
of individual components into a cohesive system. Their role
involves synthesizing well-defined, self-contained component
units into a unified system. They also automate deployment
and operational processes for system-wide robustness and
efficiency (refer to Figure 13.7):

Figure 13.7: Optimal delineation between development and DevOps

This new dynamic offers multiple advantages. Firstly, it
lessens the cognitive load for developers and DevOps
engineers, allowing specialization in depth or breadth of
knowledge. Secondly, it enhances communication efficiency:
Developers excel at micro-level problem-solving, while
DevOps engineers oversee the macro perspective for
seamless system integration.
Treating components as black boxes with clear interfaces,
dependencies, and configurations enables DevOps engineers
to avoid getting stuck in component details, reducing
conflicts and miscommunications. This model promotes
teamwork, innovation, agility, and productivity. Focusing on
defined roles and responsibilities enhances efficiency and
ensures timely delivery of high-quality products.
Following are the pros and cons of development/DevOps
delineation:

Pros:
Team members specialize in narrower fields, reducing
overall complexity.
Allows for rapid responses to specific component issues
or system-wide changes.
Concentration on specialized areas can increase
product quality.

Cons:
Teams might work in isolation, potentially creating
knowledge gaps.
Developers might have a narrow view, lacking a
system-wide perspective.
Risk of too much specialization, possibly hindering
flexibility in roles.

Virtualized build process


The virtualized build process pattern suggests using
virtualized (dockerized) environments for both development
and CI/CD. This aligns IDEs with pipeline processes, ensuring
consistency and reducing elusive issues caused by toolchain
variations. Embracing this pattern streamlines workflows and
improves product reliability.

Problem
Development teams face inconsistencies due to the use of
different development environments, IDEs, and CI/CD
pipelines. Toolchain setup affects build and test results,
causing delays and cognitive load. Consequently, there is an
urgent need to implement a system that can ensure
uniformity and reliability throughout the development cycle,
eliminating unforeseen discrepancies and fostering a
smoother, more efficient development process.

Solution
In addressing the prevalent inconsistencies in the software
development cycle, the Virtualized Build Process pattern
proposes the implementation of virtualized or dockerized
environments during the building and testing phases. This
approach ensures that a consistent set of tools and
environments are used both in development and CI/CD
pipelines, fundamentally eliminating the disparities that
occur due to different configurations and setups (as shown in
Figure 13.8):

Figure 13.8: Using virtualized build processes during development and automated CI/CD pipeline

By adopting this pattern, developers will utilize environments
with pre-installed toolchains, alongside scripts that automate
the build processes, facilitating seamless transitions between
different stages of the development cycle. Even though
developers might lean towards using IDEs for development,
they are encouraged to employ the virtualized build and test
process at least once before committing their code to the
repository. This strategy not only fosters uniformity but also
curtails the occurrence of unpredictable issues, thereby
streamlining the overall development trajectory.
A number of technologies can be leveraged to virtualize
component build processes:
Docker: A platform used to containerize applications,
ensuring that they run consistently across multiple
environments.
Docker compose: A tool for defining and managing
multi-container Docker applications.
Vagrant: Provides a framework and configuration
format to create and manage complete portable
development environments.
Terraform: An open-source tool that allows building,
changing, and versioning infrastructure safely and
efficiently using a declarative configuration.
The following code provides an example that demonstrates
the dockerized build process for a Java microservice
implemented using Spring Boot:
component.json: Contains description of a software
component that allows the rest of the scripts to stay generic
and be reused across other microservices without
modifications. Build property can be automatically assigned
by the CI/CD pipeline to produce artifacts with unique names
tied to the pipeline run (Code snippet 13.2):
1. {
2. "name": "service-basic-springboot",
3. "type": "microservice",
4. "language": "java",
5. "version": "1.0.0",
6. "build": 0
7. }
docker/Dockerfile.build: A Docker image that contains the
toolchain for microservice build (Code snippet 13.3):
1. FROM maven:3.8.5-openjdk-18-slim
2.
3. # set working directory
4. WORKDIR /app
5.
6. # Copy project file
7. COPY pom.xml ./
8.
9. # install dependencies
10. RUN mvn dependency:go-offline
11.
12. # copy all project
13. COPY . .
14.
15. # compile source code
16. RUN mvn package -DskipTests
build.ps1: Script in PowerShell that executes the dockerized
microservice build. Error code 0 signifies successful
execution (Code snippet 13.4):
1. #!/usr/bin/env pwsh
2.
3. Set-StrictMode -Version latest
4. $ErrorActionPreference = "Stop"
5.
6. # Get component data and set necessary variables
7. $component = Get-Content -
Path "component.json" | ConvertFrom-Json
8.
9. $buildImage = "$($component.registry)/$($component
.name):$($component.version)-$($component.build)-
build"
10. $container = $component.name
11.
12. # Remove build files
13. if (Test-Path "$PSScriptRoot/target") {
14. Remove-Item -Recurse -Force -
Path "$PSScriptRoot/target"
15. }
16.
17. # Build docker image
18. docker build -f "$PSScriptRoot/docker/Dockerfile.
build" -t $buildImage $PSScriptRoot
19.
20. # Create and copy compiled files, then destroy
21. docker create --name $container $buildImage
22. docker cp "$($container):/app/target" "$PSScriptRoot/
target"
23. docker rm $container
24.
25. # Verify build
26. if (-not (Test-Path "$PSScriptRoot/target")) {
27. Write-Error "'target' folder doesn't exist in root dir.
Build failed. Watch the logs above."
28. }
docker/Dockerfile.test: A Docker image that contains
toolchain for microservice test (Code snippet 13.5):
1. FROM maven:3.8.5-openjdk-18-slim
2.
3. # set working directory
4. WORKDIR /app
5.
6. # Copy project file
7. COPY pom.xml ./
8.
9. # install dependencies
10. RUN mvn dependency:go-offline
11.
12. # copy all project
13. COPY . .
14.
15. # run unit tests
16. CMD ["mvn", "clean", "test"]
docker/docker-compose.test.yml: Dockerized
environment that contains the microservice and its
dependencies for integration testing (Code snippet 13.6):
1. version: '3.3'
2.
3. services:
4.
5. test:
6. build:
7. context: ..
8. dockerfile: docker/Dockerfile.test
9. image: ${IMAGE:-test}
10. ports:
11. - "8080:8080"
12. environment:
13. - HTTP_PORT=8080
test.ps1: Script in PowerShell that executes the dockerized
microservice test. Typically, that includes unit and
integration testing at the component level. Error code 0
signifies successful execution (Code snippet 13.7):
1. #!/usr/bin/env pwsh
2. Set-StrictMode -Version latest
3. $ErrorActionPreference = "Stop"
4.
5. # Get component data and set necessary variables
6. $component = Get-Content -
Path "component.json" | ConvertFrom-Json
7. $testImage = "$($component.registry)/$($component.
name):$($component.version)-$($component.build)-
test"
8.
9. # Set environment variables
10. $env:IMAGE = $testImage
11.
12. try {
13. docker-compose -f "$PSScriptRoot/docker/docker-
compose.test.yml" up --
build --abort-on-container-exit --exit-code-from test
14.
15. # Save the result to avoid overwriting it with the "do
wn"
command below
16. $exitCode = $LastExitCode
17. } finally {
18. # Workaround to remove dangling images
19. docker-compose -f "$PSScriptRoot/docker/docker-
compose.test.yml" down
20. }
21.
22. # Return the exit code of the "docker-
compose.test.yml up" command
23. exit $exitCode
docker/Dockerfile: A production Docker image that runs
the microservice (Code snippet 13.8):
1. FROM maven:3.8.5-openjdk-18-slim
2.
3. # set working directory
4. WORKDIR /app
5.
6. ARG PACKAGE_NAME
7. ENV PACKAGE=$PACKAGE_NAME
8.
9. # copy all project
10. COPY /target/$PACKAGE .
11.
12. RUN echo $PACKAGE
13. # run the microservice
14. CMD java -jar $PACKAGE
docker/docker-compose.yml: A dockerized environment
that runs the microservice with its dependencies (Code
snippet 13.9):
1. version: '3.3'
2.
3. services:
4.
5. app:
6. image: ${IMAGE:-app}
7. build:
8. context: ..
9. dockerfile: docker/Dockerfile
10. ports:
11. - "8080:8080"
12. environment:
13. - HTTP_PORT=8080
14. - PACKAGE_NAME=${PACKAGE_NAME}
package.ps1: Script in PowerShell that packages the
microservice into a Docker image and performs basic
verification. Error code 0 signifies successful execution (Code
snippet 13.10):
1. #!/usr/bin/env pwsh
2.
3. Set-StrictMode -Version latest
4. $ErrorActionPreference = "Stop"
5.
6. # Generate image names using the data in the "$PSScri
ptRoot/component.json" file
7. $component = Get-Content -
Path "$PSScriptRoot/component.json" | ConvertFrom-
Json
8. $rcImage = "$($component.registry)/$($component.na
me):$($component.version)-$($component.build)"
9. $latestImage = "$($component.registry)/$($componen
t.name):latest"
10.
11. # Build docker image
12. docker build -f "$PSScriptRoot/docker/Dockerfile" -
t $rcImage -t $latestImage $PSScriptRoot
13. if ($LastExitCode -eq 0) {
14. Write-
Host "`nBuilt run images:`n$rcImage`n$latestImage`n
"
15. }
16.
17. # Set environment variables
18. $env:IMAGE = $rcImage
19.
20. # Set docker ip
21. if ($null -ne $env:DOCKER_IP) {
22. $dockerMachineIp = $env:DOCKER_IP
23. } else {$dockerMachineIp = "localhost"}
24.
25. # Set http port if default value overwritten
26. if ($null -ne $env:HTTP_PORT) {
27. $httpPort = $env:HTTP_PORT
28. } else {$httpPort = "8080"}
29.
30. # Set http route to test container
31. if ($null -ne $env:HTTP_ROUTE) {
32. $httpRoute = $env:HTTP_ROUTE
33. } else {$httpRoute = "/actuator/health"}
34.
35. try {
36. docker-compose -f "$PSScriptRoot/docker/docker-
compose.yml" up -d
37.
38. # Give the service time to start and then check that it
's
responding to requests
39. Start-Sleep -Seconds 10
40. Invoke-WebRequest -
Uri "http://$dockerMachineIp`:$httpPort$httpRoute"
41.
42. if ($LastExitCode -eq 0) {
43. Write-
Host "The run container was successfully built and test
ed."
44. }
45. }
46. catch {
47. # Output container logs if web request failed
48. $containersStatuses = docker-compose -
f "$PSScriptRoot/docker/docker-compose.yml" ps
49. # Parse docker-compose list of containers
50. foreach ($containerStatus in $containersStatuses | S
elect-Object -Skip 1) {
51. $containerName = $containerStatus.split(" ")[0]
52. Write-
Host "`nLogs of '$containerName' container:"
53. docker logs $containerName
54. }
55.
56. Write-Error "Error on testing run container. See
logs above for more information"
57. }
58. finally {
59. docker-compose -f "$PSScriptRoot/docker/docker-
compose.yml" down
60. }
During development, before committing code into the
automated CI/CD pipeline, the following sequence of steps
shall be executed locally. The produced microservice Docker
image can be further tested and released into production
(Code snippet 13.11):
1. ./build.ps1
2. ./test.ps1
3. ./package.ps1
Following are the pros and cons of virtualized build process:

Pros:
Ensures uniform behavior across different
environments.
Simplifies and standardizes deployment processes.
Facilitates team collaboration by avoiding the "it works
on my machine" problem.

Cons:
Containers and VMs, in general, demand greater CPU
and RAM resources on development/build machines
compared to native processes.
Requires knowledge and expertise to set up and
manage effectively.
Issues in virtual environments can sometimes be more
challenging to debug.

Quality gate
The quality gate pattern serves as a strategic approach to
achieving quality while conserving resources and time. It
accomplishes this by dividing the testing process into clear
phases or gates, each with its defined roles and measurable
criteria. This structured approach promotes accuracy and
efficiency. Positioned at specific points within the CI/CD
pipeline, these gates act as checkpoints, offering concrete
metrics that indicate when a component is ready to progress.

Problem
Without a structured testing approach, outcomes suffer.
Random tests miss critical issues, strain resources, and slow
deployment. To address this, we need the quality gate
pattern, an analytical, phased approach to improve software
testing’s reliability and efficiency in agile development
cycles.

Automated gate
The automated quality gate pattern is a CI/CD strategy that
maintains high-quality standards, avoids resource waste, and
shortens execution times. It divides testing into specific
quality gates, each with a defined scope and a measurable
quality bar. This bar, typically a ratio of identified issues to
tests passed, provides a tangible measure of product quality
at various stages (as shown in Figure 13.9):

Figure 13.9: Automated quality gates integrated into CI/CD pipeline

Here is a closer look at the typical gates utilized in this pattern:
Component quality (Build stage): This initial gate
focuses on individual components, verifying that each
works as designed. Although it ensures individual
functionality, it does not guarantee the component’s
efficacy within the complete system. The quality bar
here would focus on the performance and stability of
individual components.
System integrity (Test stage): At this stage, the
interaction of different components within a system is
evaluated. The objective is to ensure that they work
harmoniously together. This gate does not necessarily
confirm adherence to functional and non-functional
specifications but verifies integrative functionality. The
quality bar could assess the number of integration
issues detected versus successful interactions.
Functional verification (Test stage): This gate
ascertains that the system complies with all the
delineated functional requirements, ensuring that the
system performs all expected functions effectively.
Here, the quality bar might involve metrics on
functional adherence and the detection of functional
discrepancies.
Non-functional verification (Test stage):
Concentrating on the system’s adherence to non-
functional parameters like performance, capacity, and
scalability, this gate is critical to ensuring the system’s
readiness for real-world deployment. The quality bar
here might encompass benchmarks regarding system
performance and reliability under various conditions.
Release availability (Release stage): This gate
confirms that the product is all set and ready for
production deployment, facilitating a seamless
transition from development to release. The quality bar
at this stage could be gauged by the readiness and
completeness of the product for release, evaluating
criteria like feature completeness and stability.
Post-production verification (Deploy stage): The
final gate verifies that the installed product is
operational and ready for use, easing its transition to
the live environment. The quality bar here would
involve assessing the live performance and user
experience, ensuring it meets the expected standards.
By organizing tests into these quality gates, each with a
measurable quality bar, the process ensures a rigorous,
comprehensive, and efficient path to achieving a high-quality
end product.
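As a rough illustration, the sketch below implements one automated gate in a Jenkins scripted pipeline: it collects JUnit results and fails the run when the failure ratio exceeds a quality bar. The 2% threshold, the Maven wrapper, and the report path are examples only, not recommended values:

node {
    stage('Component quality gate') {
        sh './mvnw test' // run component-level tests (assumes a Maven wrapper)

        // Collect JUnit results and compute the quality bar as a failure ratio
        def results = junit(testResults: 'target/surefire-reports/*.xml')
        def failureRatio = results.totalCount > 0 ? results.failCount / (double) results.totalCount : 0.0

        // Fail the pipeline when the quality bar is not met (2% is an example threshold)
        if (failureRatio > 0.02) {
            error "Quality bar not met: ${results.failCount} of ${results.totalCount} tests failed"
        }
    }
}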
Following are the pros and cons of the automated gate
pattern:
Pros:
Focuses on relevant tests, reducing overall time.
Identifies issues early in the development cycle,
facilitating timely resolutions.
Easily adaptable to larger, more complex projects.

Cons:
Requires careful planning and set-up.
A pass at a quality gate might create a false sense of
security about the overall project status.
Needs skilled personnel for setup and maintenance.

Manual gate
In various scenarios, development teams might lack the time
or resources to construct a robust test suite suitable for
automated quality gates. In such cases, a manual
verification quality gate can be a viable alternative. This
manual quality gate is generally situated at the culmination
of the Test phase, preceding the release stage (refer to
Figure 13.10):
Figure 13.10: Manual quality gate

Given that manual testing can be time-intensive, the CI/CD
pipeline is halted before reaching the manual gate and is
only resumed once the manual verification has been
successfully completed. This approach safeguards against
discrepancies and alterations during manual testing, which
might invalidate the current system configuration.
Creating a separate environment, often referred to as the
stage environment, is recommended for smooth manual
testing. Deploying the system in this environment
safeguards against changes made during manual testing
affecting the existing setup.
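In Jenkins, such a pause can be implemented with the input step, as in the minimal sketch below; the stage deployment script, the prompt message, and the qa-team submitter group are assumptions:

node {
    stage('Deploy to stage') {
        sh './deploy.sh --env stage' // deploy the release candidate to the stage environment (placeholder)
    }
    stage('Manual verification') {
        // The pipeline halts here until manual testing is completed and approved
        input message: 'Manual testing passed on the stage environment?', submitter: 'qa-team'
    }
    stage('Release') {
        sh './release.sh'            // resume automated delivery after approval (placeholder)
    }
}

Note that pausing inside a node block keeps an executor busy; in practice, many teams place the input step outside any node so the agent is freed while the pipeline waits.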
For smaller teams, pausing development during manual
testing may maintain consistency, but larger teams might
find this costly. Consequently, larger teams need strategies
to minimize disruption from manual testing while
development continues.
Following are the pros and cons of manual gate:

Pros:
Allows for human insight and expertise.
Can identify unanticipated issues.
Flexible to adapt to different types of testing needs.
Cons:
Prone to human error.
Costly and time consuming.
Difficult to scale with the growth of the project.
Disrupts automated flow of CI/CD pipelines.

Secure delivery
In the ever-changing tech landscape, cybersecurity involves
more than just securing deployed products. Equally crucial is
protecting the CI/CD pipeline from potential breaches, where
hackers can insert malicious code through source code
changes or compromised external dependencies during
delivery. Securing the delivery process is vital for functional
and secure software. This underscores the importance of
adopting a secure delivery pattern as a fundamental
strategy for maintaining software integrity throughout
development and deployment.

Problem
In the modern software development landscape, the CI/CD
pipeline is becoming a prime target for cyber-attacks. As
software undergoes the journey from development to
deployment, several vulnerabilities can be exploited by
malicious entities to compromise the software's integrity and
security (refer to Figure 13.11):
Figure 13.11: Attack vectors during software delivery

The problem of secure delivery revolves around identifying
and mitigating potential attack vectors that can occur during
the software delivery process. These vectors can include:
Committing malicious code into a version control
system: This can happen when unauthorized users
gain access to the version control system, introducing
harmful code that can later be integrated into the final
product.
Injecting malicious code via compromised
external dependencies: External dependencies, often
necessary for the software, can be tampered with,
thereby serving as a channel to introduce malicious
code into the software.
Modifying the build process: During the build
process, attackers can infiltrate and alter the steps
involved, potentially introducing vulnerabilities or
backdoors into the software.
Overriding released artifacts: After the build
process, during the release stage, attackers can
potentially replace or override the genuine artifacts
with compromised versions containing malicious
elements.
Addressing these potential attack vectors is imperative to
safeguard the CI/CD pipeline and ensure the secure delivery
of software products. It necessitates implementing robust
security measures at every stage of the delivery process to
prevent unauthorized alterations and maintain the software's
integrity and reliability.

Solution
Implementing stringent security measures is crucial at every
phase of software development to ensure a secure delivery
pipeline and thwart potential attacks. The secure delivery
pattern encompasses the following key security strategies:
Restricted access to code repositories: Access to
write in the code repositories should only be granted to
designated developers. Moreover, their access should
be limited to the components they are responsible for
to prevent unauthorized changes and potential
malicious injections.
Malicious code detection: Incorporate systematic
reviews to check for malicious code during peer
reviews and employ automated linters to scan and flag
potential security threats in the code base.
External dependencies vetting: Before integrating
external dependencies into the system, thoroughly vet
them for security vulnerabilities. Utilize specialized
tools to perform security audits and analyses to ensure
the dependencies’ safety.
Development repository for external
dependencies: Store external dependencies in a
secure development repository with restricted access
to prevent unauthorized modifications. This eliminates
the reliance on public repositories where dependencies
can be tampered with easily.
Secure build infrastructure: Restrict access to build
infrastructure, including build servers and build
runners, to only a group of trusted DevOps engineers.
Furthermore, this access should be allowed only from
secured hosts, generally within the corporate network,
to prevent external intrusions.
Limited write access to development repositories:
Write access to development repositories should
exclusively be granted to build servers or runners.
Developers should have read-only access to prevent
unauthorized alterations or injection into the code
base.
Controlled access to release repositories: Similarly,
maintain stringent control over write access to release
repositories, allowing only build servers or runners to
make modifications. Deployment engineers and
customers should have read-only access to avoid any
unauthorized changes and ensure the security of the
final product.
By adopting these measures, development teams can
significantly enhance the security of the delivery process,
protecting the pipeline from potential vulnerabilities and
ensuring the reliable and secure delivery of the software
product.
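A hedged sketch of how two of these measures might surface inside a Jenkins pipeline is shown below: a dependency vulnerability scan and a credential-guarded publish step. The OWASP Dependency-Check invocation, the credential ID, and the publish script are assumptions, not a prescribed toolchain:

node {
    stage('Vet dependencies') {
        // Audit third-party dependencies for known vulnerabilities
        sh './mvnw org.owasp:dependency-check-maven:check'
    }
    stage('Publish release') {
        // Only the build server holds write credentials to the release repository
        withCredentials([usernamePassword(credentialsId: 'release-repo-creds',
                                          usernameVariable: 'REPO_USER',
                                          passwordVariable: 'REPO_PASS')]) {
            sh './publish.sh' // placeholder script that reads REPO_USER/REPO_PASS from the environment
        }
    }
}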
Following are the pros and cons of secure delivery:

Pros:
Ensures the reliability and security of external
dependencies used in the project.
Helps in adhering to industry standards and
regulations regarding software security.
Builds and maintains trust with customers by
delivering secure products.

Cons:
It can make the delivery process more complex,
requiring additional tools and protocols.
Requires skilled personnel to manage and operate a
secure delivery pipeline.
Automated security tools sometimes flag false
positives, requiring additional time to investigate.

Environment provisioning
In CI/CD pipelines, the method of environment provisioning
plays an important role in balancing resource efficiency, cost,
and development speed. There are two primary patterns:
static environments, which favor simplicity and consistency
at the expense of scalability; and spin-off (dynamic)
environments, which offer flexibility and parallel processing
capabilities, albeit with increased complexity and potential
for missed conflicts. Understanding these patterns is
essential for development teams to make informed decisions
that align with their specific project needs and resource
constraints.

Problem
The provisioning of test environments within CI/CD pipelines
presents a challenge in software development. Essential for
validating software components, this process significantly
impacts resource use, costs, development speed, and overall
efficiency. The dilemma centers on choosing between static
and spin-off (dynamic) environments, each with distinct
advantages and drawbacks. Static environments, though
simpler and more cost-effective for small teams, can cause
delays and cost overruns in larger teams with frequent
concurrent updates. Conversely, spin-off environments
provide scalability and enable parallel testing but increase
complexity and the risk of missing conflicts in simultaneous
changes.
Deciding on an environment provisioning strategy
necessitates careful consideration of factors like commit
frequency, team size, project complexity, and budget.
Inappropriate or inefficient choices can lead to resource
wastage, escalated costs, prolonged development periods,
and degraded software quality. To address these challenges,
there is an imperative need for a solid understanding of
these patterns. Teams must adapt their strategies to their
project's unique needs, balancing technical, financial, and
resource management aspects to achieve an efficient,
effective, and economically viable approach in their CI/CD
pipeline.

Static environment
The static environment pattern represents a straightforward
approach to provision test environments in CI/CD pipelines.
This pattern entails setting up a test environment that
remains consistent and is reused across multiple iterations of
the development cycle. The fundamental characteristic of a
static environment is its persistence; once provisioned, it is
not dismantled after each use but rather maintained for
ongoing testing (refer to Figure 13.12):
Figure 13.12: Static test environment in CI/CD pipeline

In practice, when code commits are pushed into the CI/CD
pipeline, they are sequentially integrated and tested in this
stable environment. This sequential processing means that if
multiple commits arrive simultaneously, they are queued
and tested one after another.
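A minimal sketch of this queuing behavior in Jenkins is shown below; it assumes the Lockable Resources plugin, and the resource name and scripts are placeholders:

node {
    stage('Test on the static environment') {
        // Concurrent pipeline runs queue here and execute one after another
        lock(resource: 'shared-test-environment') {
            sh './deploy.sh --env test'    // deploy the new build to the shared environment (placeholder)
            sh './run-tests.sh --env test' // execute the test suite against it (placeholder)
        }
    }
}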
This approach offers several advantages and disadvantages:

Pros:
Easy to establish and maintain due to its unchanging
nature.
Consistent environment ensures reliable test results.
Economical for projects with infrequent code updates.
Less complexity in managing infrastructure and
configurations.
Known environment variables aid in predictable testing
outcomes.

Cons:
May not be suitable for large teams or projects with
high concurrency in updates.
Leads to delays as commits are processed sequentially.
Potential for idle resources during low activity periods,
increasing overhead.
Adapting to new testing requirements can be
challenging.
High risk of becoming a bottleneck in continuous
integration processes.

Spin-off environment
The spin-off environment pattern, in contrast to the static
environment, provisions a test environment on demand
whenever there is a new code commit or batch of
commits to test. This is akin to setting up a temporary
laboratory for each new experiment (refer to Figure 13.13):

Figure 13.13: Spin-off (dynamic) test environments in CI/CD pipeline

Provisioning of the test environment can be performed from
scratch using automation scripts (see Chapter 12, Scripting
Environments). However, to speed up the process, many
teams use pre-provisioned snapshots.
When multiple commits occur simultaneously, each of them
gets its own environment and tests are performed in parallel.
This is especially useful in large teams or projects with
frequent code changes, as it avoids the queuing delays seen
in static environments.
After tests are complete, the environment is dismantled,
which can be more efficient in terms of resource usage
compared to maintaining a constant test environment.
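The sketch below shows one way to express a spin-off environment in a Jenkins scripted pipeline, reusing the dockerized environment idea from the previous section; the compose file path, test script, and project naming scheme are illustrative assumptions:

node {
    stage('Integration test in a spin-off environment') {
        def project = "test-${env.BUILD_NUMBER}" // isolate environments of parallel runs
        try {
            sh "docker-compose -p ${project} -f docker/docker-compose.test.yml up -d"
            sh './run-integration-tests.sh'      // placeholder test command
        } finally {
            // Dismantle the environment regardless of the test outcome
            sh "docker-compose -p ${project} -f docker/docker-compose.test.yml down -v"
        }
    }
}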
Following are the pros and cons of the spin-off environment:

Pros:
Efficiently manages multiple code updates
simultaneously.
Resources are used only when needed, thereby
reducing waste.
Enables quick turnaround times for testing individual
updates.
Reduces the risk of interferences between different
code changes.
Each environment can be tailored to specific testing
requirements.
Facilitates a faster and more dynamic development
process.

Cons:
Managing multiple environments requires
sophisticated orchestration.
High demand for resources during environment
provisioning.
Isolated testing may overlook conflicts that occur in
integrated environments.
Time and resources needed to create and dismantle
environments.
Heavily relies on automated processes for efficiency.
Can be more expensive due to the need for advanced
infrastructure and tools.

Branching strategy
In microservice development, branching strategies are
essential for managing and integrating code changes
efficiently. These strategies dictate how code modifications,
feature developments, and fixes are handled within the
repository, significantly impacting workflow, collaboration,
and deployment. From Feature Branching, focusing on
isolated features, to Trunk-Based Development, emphasizing
rapid integration, each strategy offers a unique approach to
handling code changes. Selecting the right strategy is
crucial, as it influences team dynamics, release cycles, and
overall software quality. Understanding these strategies
enables teams to optimize development processes and
maintain a stable, continuously evolving codebase.

Problem
Selecting an appropriate branching strategy has a significant
impact on how code changes for various microservices are
managed and integrated. Each strategy, from feature
branching to trunk-based development, carries distinct
implications for team collaboration, integration frequency,
and overall project risk.
The primary issue lies in aligning the branching strategy with
the team's workflow, project scale, and release frequency.
For instance, strategies that favor isolated development (like
feature branching) contrast sharply with those that promote
frequent integration (such as trunk-based development),
each influencing the project's dynamics in unique ways.
Moreover, maintaining code quality and stability becomes
increasingly complex with multiple active branches,
elevating the risk of conflicts and integration challenges. This
is compounded in distributed teams, where asynchronous
work necessitates a strategy that supports effective
collaboration and streamlined tracking of changes.
Furthermore, within continuous integration and delivery
(CI/CD) environments, the branching strategy must enable
rapid and reliable deployments. The wrong choice can lead
to bottlenecks, hindering the deployment process and
delaying delivery timelines.
Thus, the core challenge is to identify a branching strategy
that not only suits the specific needs of microservice
architecture but also strikes a balance between development
agility, codebase stability, and effective team collaboration.
This requires a nuanced understanding of various strategies
and a flexible approach to adapt to the changing demands of
the software development lifecycle.

No-branching or continuous deployment


The no-branching strategy, also known as continuous
deployment, represents a streamlined approach in the
software development process, particularly within the agile
framework. Distinct from conventional methodologies that
employ multiple branches for features, releases, or fixes, this
strategy involves developers committing changes directly to
the main branch of the codebase.
Central to this approach is a heavy reliance on automated
testing and continuous integration tools, which rigorously
test each commit to ensure stability and functionality.
Successfully tested changes are immediately deployed to the
production environment, epitomizing rapid development and
delivery cycles. This strategy necessitates an exceptionally
high standard of testing rigor to mitigate the risk of
introducing faults directly into the production environment.
Implementing the No-Branching Strategy requires not only
robust tooling for continuous integration, automated testing,
and deployment but also a cultural shift within the
development team. This shift places a premium on shared
responsibility for production-ready code and a commitment
to producing high-quality, testable code. Despite its
simplicity in reducing branch management complexity, the
No-Branching Strategy demands a high level of discipline
and maturity in development practices, making it a fitting
choice for teams that are adept at managing the accelerated
pace and responsibility it entails (refer to Figure 13.14):

Figure 13.14: No-branching development

Following are the pros and cons of no-branching or continuous deployment:

Pros:
Accelerates the delivery of new features and fixes to
production.
Eliminates the complexity of managing multiple
branches.
Facilitates a seamless, ongoing integration process.
Allows for quick user feedback and iteration.
Reduces overhead, thus streamlining the development
process.
Cons:
Direct commits to production increase the risk of
introducing bugs.
Requires a robust and comprehensive suite of
automated tests.
Difficult to isolate and manage individual feature
developments.
Places high responsibility on developers for code
quality.
Can become challenging to manage as the project size
increases.
Needs continuous monitoring to quickly address any
production issues.

Feature branching
The feature branching strategy is a widely-used approach in
software development, particularly effective in managing
and isolating new features within a project. In this strategy,
each new feature is developed in its own separate branch,
diverging from the main codebase.
This separation allows developers to work on new features or
fixes without impacting the main branch, typically reserved
for stable, deployable code. Once a feature is complete and
thoroughly tested within its branch, it is then merged back
into the main branch. This merge typically occurs after a
code review process, ensuring that the new addition adheres
to the project's standards and does not introduce any
conflicts with the existing code.
Feature Branching enables a clean and organized workflow,
especially in team environments where multiple features are
being developed concurrently. It allows for easier tracking of
changes and more focused development, as each branch
encapsulates all the changes pertaining to a specific feature.
This strategy is particularly beneficial in maintaining the
integrity and stability of the main codebase, as only fully
developed and tested features make their way into it,
reducing the likelihood of introducing bugs or errors into the
primary line of development (refer to Figure 13.15):

Figure 13.15: Feature branching development
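In a CI/CD context, feature branching is often paired with a Jenkins multibranch pipeline, as in the hedged sketch below; the branch name, Maven wrapper, and release script are placeholders:

pipeline {
    agent any
    stages {
        stage('Build and test') {
            steps {
                sh './mvnw verify'     // runs for the main branch and for every feature branch
            }
        }
        stage('Release') {
            when { branch 'main' }     // feature branches stop after build and test
            steps {
                sh './release.sh'      // placeholder release script
            }
        }
    }
}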

Following are the pros and cons of feature branching:

Pros:
Allows individual features to be developed in isolation,
reducing interference.
Each branch provides a clear, dedicated space for
specific features or fixes.
Keeps the main branch stable by isolating new
developments.
Facilitates targeted code reviews and quality checks
for each feature.
Enables multiple features to be developed concurrently
without conflict.
Developers can experiment and iterate within branches
without immediate impact on the main codebase.
Cons:
Merging branches back into the main codebase can
become complex, especially if they diverge
significantly.
Features may be developed in silos, leading to potential
integration challenges later.
Prolonged development in branches increases the risk
of conflicts during merging.
Branches can become outdated if not regularly
synchronized with the main branch.
Managing multiple branches can add overhead and
complexity.
Variations in development environments and
dependencies across branches can lead to
inconsistencies.

Trunk-based development
Trunk-based development is a branching strategy that
emphasizes a single, shared branch — often referred to as
the ‘trunk’. This strategy minimizes the use of long-lived
branches, encouraging developers to integrate their changes
frequently, usually more than once a day, directly into the
trunk. Short-lived feature branches may be employed, but
these are merged back into the trunk quickly, often within a
day or two. This continuous integration into the main branch
ensures that it always contains the most recent code,
reducing the chances of significant divergence or conflicts
that can occur with longer-lived branches.
Trunk-based development fosters a collaborative and
dynamic development environment, where the focus is on
maintaining a single source of truth for the codebase and
ensuring that it is always in a releasable state. By frequently
integrating changes, teams can detect integration issues
early, making them easier to address. This strategy is
particularly effective in supporting continuous delivery and
deployment practices, as it ensures that the codebase is
always ready for release to production. Trunk-based
development requires a rigorous approach to testing and a
culture of collective code ownership, as all team members
contribute to and are responsible for the health of the trunk
(refer to Figure 13.16):

Figure 13.16: Trunk-based branching development

Following are the pros and cons of trunk-based development:

Pros:
Frequent merges to the trunk ensure quick integration
of changes.
Regular integration reduces the likelihood of
significant merge conflicts.
The trunk is always in a releasable state, supporting
continuous delivery.
Regular merging helps in identifying and resolving
issues early.
Promotes teamwork and collective ownership of the
code.
With fewer branches, the repository management is
simpler.

Cons:
Demands a strong automated testing environment to
maintain code quality.
Can be difficult to manage with a large number of
developers.
Short-lived branches offer less isolation for complex
features.
Frequent changes to the trunk can introduce instability
if not managed carefully.
Requires robust CI practices to handle frequent
commits effectively.
May be challenging for teams new to agile or
continuous integration practices.

Release branching
The release branching strategy is a methodical approach in
software development that focuses on managing the release
process effectively. In this strategy, a new branch, commonly
referred to as a 'release branch', is created from the main
branch (often the 'trunk' or 'master') to prepare for a new
release. This branch serves as a freeze-point for the features
that are to be included in the upcoming release, allowing any
final polishing, bug fixes, and stabilization efforts to be
concentrated in this isolated environment.
The main branch remains active for ongoing development of
features that are not part of the current release cycle. Once
the release branch is thoroughly tested and deemed stable,
it is then merged into the main branch and subsequently
deployed to production. This separation ensures that the
development of new features can continue without
disrupting the stabilization of the release version. It also
allows for a more controlled and focused approach to
preparing a release, as only specific changes relevant to that
release are addressed in the release branch.
The Release Branching strategy is particularly useful in
projects with scheduled releases or those that require a more
rigorous quality assurance process before deployment. It
strikes a balance between ongoing development and the
need for a stable, reliable release process (refer to Figure
13.17):

Figure 13.17: Release branching development

Following are the pros and cons of release branching:

Pros:
Concentrates on polishing and bug fixing specific to the
release.
Allows ongoing development without affecting release
preparation.
Facilitates a more structured and predictable release
cycle.
Keeps the main branch free from last-minute release
changes.
Defines distinct cutoff points for determining which
features are included in a release.

Cons:
Merging back into the main branch can be complex,
especially after long stabilization.
New features developed concurrently may wait until
after the release to be integrated.
The release branch can significantly diverge from the
main branch over time.
Requires additional resources for managing and testing
separate branches.
Managing multiple release branches can add
complexity to the workflow.
Integrating fixes from the release branch back into the
main branch can be delayed.

Gitflow
The Gitflow branching strategy is a robust and structured
approach to software development, particularly designed to
enhance the management of larger projects. It establishes a
clear hierarchy and sequence of branches, making it easier
to track the progress of features, prepare for releases, and
maintain the overall codebase. Central to Gitflow are two
primary branches: the 'master' branch, which holds the
official release history, and the 'develop' branch, which
serves as an integration branch for features.
In Gitflow, feature branches are created from the 'develop'
branch for new features. These branches are dedicated to
specific features and are merged back into 'develop' once
the feature is complete. When it is time to release a version,
a 'release' branch is created from 'develop'. This branch
allows for final adjustments and bug fixes before the release.
After the release is complete and deemed stable, it is
merged into both 'develop' and 'master', with 'master'
representing the latest stable release.
In addition to feature and release branches, Gitflow utilizes
'hotfix' branches to address urgent bugs in the production
code. These branches are created directly from 'master' and,
once the fix is implemented, are merged back into both
'master' and 'develop', ensuring that the fixes are
incorporated into the ongoing development work.
This structured approach, with clearly defined roles for
different branches, facilitates a more organized development
process, especially beneficial for projects that require regular
releases and maintenance of a stable production version.
Gitflow's explicit branch naming and the specific purpose of
each branch type make it easier for teams to collaborate and
manage complex software development tasks effectively
(refer to Figure 13.18):

Figure 13.18: Gitflow development


Following are the pros and cons of gitflow:

Pros:
Offers a clear and systematic approach to branching
and merging.
Each branch type has a specific purpose, reducing
confusion.
Facilitates parallel development of features, releases,
and hotfixes.
Separates release preparation from ongoing
development work.
The master branch remains stable, hosting only
released code.
Hotfix branches provide a quick way to address issues
in production.

Cons:
Can be overly complex for small projects or teams.
Managing multiple branches requires more effort and
organization.
Multiple active branches increase the risk of conflicts.
Features developed in isolation may face integration
challenges later.
The structured approach requires time to learn and
understand.
The develop branch can significantly diverge from
master, complicating merges.

Delivery metrics
The delivery metrics pattern is essential in microservices
development, focusing on measuring and analyzing key
performance indicators to refine and optimize delivery
processes. It emphasizes the importance of a delivery
dashboard, providing clear and actionable data to the
development team.

Problem
In the dynamic and complex domain of microservices
development, one of the critical challenges is effectively
measuring and improving the delivery process. This
challenge is rooted in the necessity to have a clear, empirical
understanding of the efficiency and effectiveness of the
software delivery pipeline.
Standard build servers typically provide basic metrics, but
they often fall short in offering a comprehensive analysis,
particularly in correlating pipeline activities with specific
components or features and in evaluating across various
pipelines. Without detailed and relevant metrics, teams
struggle to identify where the issues lie, what needs to be
fixed, and how to enhance their processes (refer to Figure
13.19):
Figure 13.19: Measurements, analysis and continuous improvement of
software delivery

This lack of precise, actionable data hinders the ability to
manage and optimize the delivery of microservices
effectively. The problem, therefore, is the need for a more
refined approach to gathering and analyzing delivery
metrics, one that can be tailored to the unique combination
of tools and processes in a given development environment.
Such an approach should not only provide clear insights but
also align with best practices like those recommended by
DORA, thereby enabling teams to make informed decisions
and continuous improvements in their software delivery
process.

Detail metrics
Detail metrics in software delivery are comprehensive data
points that offer deep insights into various aspects of the
software development process. These metrics are typically
categorized into three key measurements and analyzed
across five critical dimensions, providing a multidimensional
view of the software delivery lifecycle.
Key measurements:
Size: This measurement involves quantifying aspects
such as the size of software components or the number
of features developed. It helps in understanding the
scale and complexity of the software being developed.
Timing: Timing metrics focus on the duration aspects,
such as the time taken to initiate development (time to
start), the overall development duration (development
time), and the time from completion to release (time to
release). These metrics are crucial for tracking
efficiency and identifying bottlenecks in the
development process.
Quality: This includes metrics related to the number of
defects or issues discovered. It's a direct indicator of
the software's reliability and the effectiveness of the
development and testing processes.
Key dimensions:
Scope of work: This dimension covers the range of
work items, including features, defects, and tasks to be
completed. It helps in tracking progress and workload
distribution.
Product design: Metrics in this dimension focus on
how the product is decomposed into components, the
dependencies between these components, and their
impact on release planning and execution.
People (Team): This involves metrics related to team
composition, roles, and the operations performed by
team members. It provides insights into team
efficiency, skills distribution, and collaboration
effectiveness.
Process: These metrics evaluate the different stages of
development and the quality gates passed. They are
essential for assessing the efficacy of the development
process and methodologies used.
Quality control: This dimension focuses on testing,
encompassing metrics related to test cases, their
execution, and the results. It is crucial for gauging the
effectiveness of quality assurance measures.
Different combinations of measurements and dimensions can
yield dozens of useful metrics, each providing a unique lens
through which the software delivery process can be assessed
and optimized. By analyzing these detailed metrics, teams gain
valuable insight into every facet of software delivery, enabling
informed decision-making, targeted improvements, and a
deeper understanding of the overall health and progress of
software development projects.

DORA metrics
The DevOps Research and Assessment (DORA) metrics
provide a focused and effective approach to evaluating and
improving software delivery and operational performance.
These metrics, distilled from extensive research, concentrate
on four key areas that are critical to the success of DevOps
practices:
Deployment frequency: This metric measures how
often an organization successfully releases code to
production. High deployment frequency is indicative of
an agile, responsive development process. It reflects
the team's ability to implement and deliver new
features, updates, and fixes quickly and reliably.
Lead time for changes: Lead time is the duration
from the initiation of a code change (such as a commit)
to its successful deployment in production. This metric
is a gauge of the efficiency and speed of the
development process. Shorter lead times suggest a
more streamlined and efficient pipeline, enabling
quicker realization of value from new features or
changes.
Change failure rate: This metric assesses the
percentage of deployments causing a failure in the
production environment, requiring immediate remedy
(like a hotfix or rollback). A lower change failure rate
indicates higher reliability and quality of the
deployment processes and the code being released. It
reflects the effectiveness of the development, testing,
and deployment practices in preventing disruptions in
the production environment.
Time to restore service: This measures the time it
takes for an organization to recover from a failure in
the production environment. It's a critical indicator of
the team's ability to rapidly address and rectify issues,
maintaining operational resilience and minimizing
downtime's impact on users.
By focusing on these four DORA metrics, organizations can
gain valuable insights into their DevOps effectiveness. These
metrics help in identifying strengths and weaknesses in the
software delivery and operational processes, guiding teams
towards targeted improvements and enhancing overall
performance. They are universally applicable across various
development environments, making them a valuable toolset
for any organization striving to achieve excellence in their
DevOps practices.
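If the pipeline already records when changes are committed, deployed, and (if necessary) restored, the four DORA metrics reduce to simple arithmetic. The following Java sketch is purely illustrative; the Deployment record and its fields are hypothetical placeholders for whatever data your build server and incident tracker actually expose:

import java.time.Duration;
import java.time.Instant;
import java.util.List;

public class DoraMetrics {

    // Hypothetical record of a single production deployment, assembled from
    // whatever data the build server and incident tracker actually provide.
    public record Deployment(Instant commitTime, Instant deployTime,
                             boolean failed, Duration restoreTime) {}

    // Deployment frequency: successful deployments per day over the observed period.
    public static double deploymentFrequency(List<Deployment> deployments, Duration period) {
        long successful = deployments.stream().filter(d -> !d.failed()).count();
        return (double) successful / Math.max(1, period.toDays());
    }

    // Lead time for changes: average time from commit to successful deployment.
    public static Duration averageLeadTime(List<Deployment> deployments) {
        List<Duration> leadTimes = deployments.stream()
                .filter(d -> !d.failed())
                .map(d -> Duration.between(d.commitTime(), d.deployTime()))
                .toList();
        if (leadTimes.isEmpty()) return Duration.ZERO;
        return leadTimes.stream().reduce(Duration.ZERO, Duration::plus)
                .dividedBy(leadTimes.size());
    }

    // Change failure rate: share of deployments that caused a production failure.
    public static double changeFailureRate(List<Deployment> deployments) {
        if (deployments.isEmpty()) return 0.0;
        long failed = deployments.stream().filter(Deployment::failed).count();
        return (double) failed / deployments.size();
    }

    // Time to restore service: average recovery time across failed deployments.
    public static Duration averageTimeToRestore(List<Deployment> deployments) {
        List<Duration> restores = deployments.stream()
                .filter(Deployment::failed)
                .map(Deployment::restoreTime)
                .toList();
        if (restores.isEmpty()) return Duration.ZERO;
        return restores.stream().reduce(Duration.ZERO, Duration::plus)
                .dividedBy(restores.size());
    }
}

Figures computed this way can then feed the delivery dashboard discussed in the next pattern.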

Delivery dashboard
The delivery dashboard is a visualization tool, specifically
designed to present delivery metrics in an accessible, clear,
and concise manner. Its primary function is to provide a
snapshot of the team's current status regarding software
delivery, compare this status against predefined targets, and
identify areas that may require attention or improvement. By
highlighting critical metrics and trends, the delivery
dashboard acts as a catalyst for informed decision-making
and action, making it an invaluable asset during team status
meetings or scrums.
The dashboard's ability to drill down into data across
different dimensions and time intervals is crucial for a
deeper analysis of issues. This functionality allows teams to
not only identify problems at a surface level but also
understand their root causes over time and across various
aspects of the development process.
There are a few implementation options to create a delivery
dashboard for a team:
Built-in analytical tools in existing systems:
Examples: Jira, GitLab, Azure DevOps.
Characteristics: These are analytical tools integrated
into project management software or build servers.
Limitations: They often lack comprehensive data
aggregation capabilities as they don't incorporate data
from external systems.
Off-the-shelf analytical and dashboarding tools:
Examples: Prometheus + Grafana.
Characteristics: These tools specialize in metrics
collection and dashboard visualization. They offer
simplicity in configuration, possibly requiring minimal
coding, to tailor dashboards to specific needs.
Limitations: While flexible, they might be constrained
in terms of advanced features or specific data
integration needs.
Home-grown solution:
Description: Custom-developed software tailored to the
unique requirements of a particular organization or
team.
Characteristics: These solutions are built from scratch,
offering the highest level of customization and
integration with an organization's specific tools and
processes.
Considerations: While potentially offering the best fit,
this option requires significant investment in
development, maintenance, and scaling.
A delivery dashboard is a critical tool for modern software
development teams, providing clear, measurable insights
into delivery processes. The choice of implementation
depends on the team's specific needs, existing
infrastructure, and resource availability, balancing the trade-
offs between customization, cost, and integration
capabilities. Each option offers distinct advantages and
limitations, and the decision should align with the
organization's overall strategy and goals in software delivery
and DevOps practices.
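As an illustration of the off-the-shelf route, a team could publish its delivery metrics through Micrometer and scrape them with Prometheus for visualization in Grafana. The sketch below shows only one possible shape; the class name, metric names, and the idea of a pipeline webhook calling recordRelease are assumptions rather than a prescribed design:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

import java.time.Duration;

// Sketch of publishing delivery metrics to a Micrometer registry (for example,
// one backed by Prometheus) so they can be charted in Grafana.
public class DeliveryMetricsPublisher {

    private final Counter deployments;
    private final Timer leadTime;

    public DeliveryMetricsPublisher(MeterRegistry registry, String service) {
        this.deployments = Counter.builder("delivery.deployments")
                .tag("service", service)
                .register(registry);
        this.leadTime = Timer.builder("delivery.lead.time")
                .tag("service", service)
                .register(registry);
    }

    // Called by the CI/CD pipeline (for example, via a small webhook endpoint)
    // whenever a release of the service reaches production.
    public void recordRelease(Duration commitToProduction) {
        deployments.increment();
        leadTime.record(commitToProduction);
    }
}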

Conclusion
In this chapter, we thoroughly explore strategies to
streamline and secure CI/CD pipelines. We begin by
examining incremental delivery, considering its impact on
multiple deployments, application platforms, and product
packaging methods. Next, we discuss the evolving roles in
development and DevOps, aiming for greater efficiency and
collaboration.
We then delve into the virtualized build process, which
ensures consistency and reduces discrepancies during
building and testing. Following that, we explain the quality
gate concept, covering both automated and manual gate
strategies to maintain a defined quality standard throughout
the development lifecycle.
In the last patterns, we emphasize the importance of secure
delivery by providing strategies to strengthen the delivery
process against security threats and malicious attacks. We
also address environment provisioning, offering essential
strategies for balancing resources. Additionally, we discuss
branching strategy, focusing on managing and integrating
code changes effectively. Finally, we analyze delivery metrics
used to optimize the delivery process. Building on this, the
next chapter shows how to put together and launch complex
microservices-based products.

Further reading
1. Rakrha. Healthy CICD Pipeline. Medium. May 20, 2023. Available at https://medium.com/@chetxn/healthy-cicd-pipeline-b985f56be18
2. Learn. A CICD Pipeline Example. Medium. Jun 3, 2013. Available at https://medium.com/@cinish/a-cicd-pipeline-example-c72e8198ad31
3. Shaik, S. Build a CICD Pipeline Using Gitlab, Terraform and AWS. Medium. Nov 14, 2022. Available at https://medium.com/@jaffarshaik/build-a-cicd-pipeline-using-gitlab-terraform-and-aws-24e782b551ba
4. Batra, R. Building your Docker Images with Docker Files. Medium. May 16, 2023. Available at https://medium.com/@rishab07/building-your-docker-images-with-docker-files-3804ee22e19a
5. Nair, P.G. Creating Docker Images in Spring Boot Using Build Packs. Medium. Jun 26, 2020. Available at https://medium.com/@praveeng-nair/creating-docker-images-in-spring-boot-using-build-packs-4ecc853f5732

Join our book’s Discord space


Join the book's Discord Workspace for Latest updates, Offers,
Tech happenings around the world, New Release and
Sessions with the Authors:
https://discord.bpbonline.com
CHAPTER 14
Assembling and Deploying
Products

Introduction
In this chapter, we will provide a step-by-step guide to
assembling and launching sophisticated products built on
microservices architecture. We will cover various aspects
such as simplifying product packaging, effectively managing
component versions within a product, and deploying
products securely using techniques like blue/green, rolling, or
canary deployment. In essence, this chapter aims to assist
you in efficiently and successfully overseeing and launching
your microservices projects with the help of automation.

Structure
In this chapter, we will cover the following topics:
Product packaging
Kubernetes YAML manifests
Helm chart
EAR archive
Cloud resource templates
Custom scripts
Baseline management
Development branch
System deployment
Updates from the CI/CD pipeline
Deployment strategy
Blue/green deployment
Rolling deployment
Canary deployment

Objectives
After studying this chapter, you should be able to understand
the varied techniques of product packaging, including the
utilization of Kubernetes YAML manifests, Helm charts, EAR
archives, or cloud resource templates. You will learn to
manage system baselines to ensure proper configuration of
product components. Furthermore, you will gain insights into
executing deployment strategies such as blue/green, rolling,
and canary deployments effectively, helping you to
streamline the process and ensure a smoother transition
during the product release phases.

Product packaging
Product packaging is vital in delivering microservices
systems, which can have many parts working together. It
helps assemble microservices, necessary external
dependencies, and automated scripts into one package. This
makes it easier to move through different stages of quality
checks and finally get it ready for use. In this section, we will
learn about some essential tools like Helm charts, EAR
archives, and cloud resource templates that help make this
process smooth and hassle-free.

Problem
In the rapidly evolving sphere of software development,
microservices systems have emerged as a complex
composition of tens or even hundreds of individual
microservices. While companies like Netflix have pioneered
incremental delivery to mitigate the necessity of redeploying
the entire system, most development teams grapple with
deploying systems in new environments for testing and
subsequent production. The large number of individual
components that make up a microservices system, and the
correspondingly high complexity of deployment, necessitate
automation of the deployment process.
Drawing a parallel from manufacturing, we find the Bill of
Material (BOM) concept playing a vital role. The BOM
serves as a comprehensive document that delineates the
assembly instructions, listing vital materials and required
tools for creating a specific version of a product (refer to
Figure 14.1):
Figure 14.1: Structure of product deployment package

Similarly, in software development, the product package
serves as the software counterpart to the BOM. As a critical
element in the deployment pipeline, a well-constructed
product package should include:
Product metadata: Name, version, and additional
information about the product.
Compatible environments: A comprehensive list
specifying the environments where the product can be
installed successfully.
External dependencies: A detailed list of external
dependencies along with their specific versions that
need to be present in the environment before the
initiation of the deployment process.
Component inventory: A meticulous inventory
detailing the components along with their respective
versions required to be installed as part of the product
setup.
Deployment actions: A set of actions outlining the
necessary steps to populate initial data, set up required
configurations, and accomplish other tasks vital for
completing the product installation.
Furthermore, this product package should be adept at
facilitating the installation, upgrade, and uninstallation of the
system, thereby ensuring a seamless and efficient
deployment process. The challenge lies in crafting a product
package that encapsulates all these elements efficiently,
paving the way for streamlined and hassle-free deployments.
The deployment process requires a tool that takes a product
package and a Day 0 configuration that contains key
deployment parameters and deploys them into a specified
environment (refer to Figure 14.2). You can read more about
Day 0 configuration in Chapter 5, Configuring Microservices:

Figure 14.2: Deploying product package into an environment

Deployment tools should typically support three types of
actions:
Install: Installation of a new product into an
environment
Upgrade: Replacing an older version of a product with
a newer one
Uninstall: Uninstallation of a previously installed
product from an environment
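To make these elements concrete, here is a purely hypothetical product package descriptor. None of the packaging formats discussed below use this exact layout; it only illustrates the kind of information a BOM-like package needs to carry:

# package.yaml -- hypothetical descriptor, for illustration only
name: myproduct
version: 1.2.0
environments:                  # compatible environments
  - kubernetes >= 1.27
dependencies:                  # external dependencies expected in the environment
  - postgresql: 15.x
  - rabbitmq: 3.12.x
components:                    # component inventory with versions
  - orders-service: 1.2.0
  - billing-service: 1.1.4
actions:                       # deployment actions to run after installation
  - seed-initial-data
  - register-api-routes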

Kubernetes YAML manifests


Packaging microservices into Docker containers and
orchestrating their deployment through Kubernetes has
emerged as the foremost strategy for deploying microservice
systems. This method is not only efficient but also promotes
scalability and manageability. To streamline this process,
various elements of the Kubernetes deployment are defined
in YAML (YAML Ain't Markup Language) manifests,
which, when combined with relevant files, form the blueprint
of the microservice product package. This collection of files
can be compressed into a single zipped file for simplicity and
ease of delivery.
The product package defined through YAML manifests
encompasses the following components:
Product metadata: Not supported. Inferred through
the naming convention adopted for the archive
containing the product package.
Compatible environments: Not supported.
External dependencies: Not supported.
Component inventory: A comprehensive collection of
deployment entities, including pods, services, config
maps, secrets, and so forth, delineated in the YAML
configurations. Pod configurations reference the
Docker containers housing the microservices.
Deployment actions: Implicitly characterized within
the functionalities of the deployment components.
Day 0 configuration is typically defined as a ConfigMap YAML
file that is processed together with other YAML manifests
containing definitions of system components.
Day0-config.yaml (Code snippet 14.1):
1. apiVersion: v1
2. kind: ConfigMap
3. metadata:
4. name: day0-config
5. data:
6. param1: "value1"
7. param2: "value2"
8. param3: "value3"
Then key-values are injected into pod configurations,
exposing configuration parameters to microservices (Code
snippet 14.2):
1. apiVersion: v1
2. kind: Pod
3. metadata:
4. name: my-microservice-pod
5. spec:
6. containers:
7. - name: my-microservice-container
8. image: my-microservice-image
9. env:
10. - name: PARAM1
11. valueFrom:
12. configMapKeyRef:
13. name: day0-config
14. key: param1
15. - name: PARAM2
16. valueFrom:
17. configMapKeyRef:
18. name: day0-config
19. key: param2
20. - name: PARAM3
21. valueFrom:
22. configMapKeyRef:
23. name: day0-config
24. key: param3
25. restartPolicy: Always
During the deployment process, users will use the kubectl
command-line tool to process the YAML configurations
sequentially, utilizing specific commands for installation
(create), upgrade (apply), and uninstallation (delete) of the
product components. See Day 0 configuration in Chapter 4,
Configuration Types for a more detailed explanation.
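For example, assuming all manifests are collected in a manifests/ directory (a hypothetical layout), the three operations map onto standard kubectl commands:

kubectl create -f manifests/     # install the product components
kubectl apply -f manifests/      # upgrade to a newer version
kubectl delete -f manifests/     # uninstall the product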
Assembling a product package often involves extra steps to
complete the setup, such as populating an initial dataset or
setting system configurations. Unfortunately, Kubernetes
YAML manifests do not support custom deployment actions.
A simple workaround is to package these actions as scripts in
a Docker image and run them in a container during
deployment, for example as a Kubernetes Job, as shown in
the sketch below.
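One way to express such an action, assuming the script is baked into a hypothetical my-product-seed-job image, is a Job that reuses the Day 0 ConfigMap shown earlier:

apiVersion: batch/v1
kind: Job
metadata:
  name: seed-initial-data              # hypothetical action name
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: seed
        image: my-product-seed-job:1.2.0   # hypothetical image containing the script
        envFrom:
        - configMapRef:
            name: day0-config              # exposes Day 0 parameters to the script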
Packaging microservices into Kubernetes YAML manifests
involves defining components such as pods, services, and
configurations in YAML files. This method facilitates
automated deployment and management within a
Kubernetes environment but does not natively support
custom deployment actions or external dependency checks.
It is a popular choice for container orchestration due to its
scalability and community support.
Following are the pros and cons of Kubernetes YAML
manifests:

Pros:
Relatively easy to read and write.
Facilitates tracking changes and maintaining version
history.
It can be seamlessly integrated with various CI/CD
pipelines.

Cons:
Cannot natively handle complex deployment actions.
It can get complex and hard to manage with the growth
of components.
A steep learning curve that can demand substantial
time and effort to master.

Helm chart
Deploying dockerized microservices in Kubernetes can be
significantly streamlined using Helm charts, a more
sophisticated packaging method. These charts utilize
Kubernetes YAML manifests, which we discussed previously,
as templates, injecting values sourced from a values.yaml file
or defined via command-line arguments.
A helm package has a particular structure:
myproduct/
  Chart.yaml     # A YAML file containing information about the chart
  LICENSE        # OPTIONAL: A plain text file containing the license
  README.md      # OPTIONAL: A human-readable README file
  values.yaml    # The default configuration values for this chart
  charts/        # A folder containing any charts upon which this chart depends
  crds/          # Custom Resource Definitions
  templates/     # A folder of templates that, when combined with values,
                 # will generate valid Kubernetes manifest files
The chart's content is packaged into a .tgz file, which can be
placed in a Helm chart repository, facilitating seamless
distribution.
The product package created as a Helm chart encompasses
the following elements:
Product metadata: Detailed within the Chart.yaml file.
Compatible environments: Not supported.
External dependencies: These are articulated in the
Chart.yaml file, specifically within the dependencies
section. The dependent charts can be placed under
/charts directory to be distributed together with the
product package.
Component inventory: A vast assortment of
deployment elements, such as pods, services, config
maps, secrets, and more, are detailed in the YAML
manifests housed in the /templates directory. References
to microservices Docker containers can be found in the
pod configurations.
Deployment actions: These are implicitly articulated
through the functionalities encapsulated by the
deployment components.
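For illustration, a minimal Chart.yaml for a hypothetical myproduct chart, declaring one external dependency, could look like this:

apiVersion: v2
name: myproduct
description: Product package for the myproduct microservices system
version: 1.2.0            # version of the product package (chart)
appVersion: "1.2.0"       # version of the product itself
dependencies:
  - name: postgresql      # external dependency resolved from a chart repository
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami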
Unfortunately, Helm charts do not support custom
deployment actions. However, given that Helm charts
fundamentally operate using Kubernetes YAML manifests,
custom actions can be executed employing the technique
described in the preceding pattern.
The helm command-line tool facilitates creating and
deploying packages within a given environment. The package
command generates a new package, while the install
command facilitates the integration of a new chart within an
environment. Similarly, the upgrade command enables
substituting an older chart with a newer version, and the
uninstall command assists in removing a previously deployed
chart from an environment.
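Assuming the chart layout shown earlier, a typical command sequence might look as follows (release and file names are illustrative):

helm package myproduct/                                   # produces myproduct-1.2.0.tgz
helm install myproduct ./myproduct-1.2.0.tgz -f day0-values.yaml
helm upgrade myproduct ./myproduct-1.3.0.tgz
helm uninstall myproduct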
In summary, Helm charts are an advanced tool for packaging
dockerized microservices systems, offering a structured and
customizable approach to manage and deploy applications in
a Kubernetes environment efficiently.
Following are the pros and cons of Helm charts:

Pros:
Simplifies the deployment process of microservices.
Allows for the reuse of pre-defined chart templates
across different environments.
Facilitates versioning and rollback functionalities,
enhancing management and control.
Cons:
Requires understanding of both Helm and Kubernetes,
presenting a steeper learning curve.
It can lead to intricate configurations that are hard to
debug.
Managing dependencies between charts can sometimes
become complex.

EAR archive
Microservice systems implemented on the JEE platform are
generally encapsulated within EAR archives. Typically, the
microservices that form a system are packaged as WAR files
and housed within the EAR archive. This archive might also
encompass dependencies, presented as Java libraries and
WAR files.
The product package, as established through the EAR
archive, comprises the following components:
Product metadata: Defined inside application.xml
deployment descriptor.
Compatible environments: Not facilitated.
External dependencies: The verification for external
dependencies is not facilitated. It is advised to
incorporate dependencies directly within the EAR
archive.
Component inventory: Every component must be
encapsulated as either JAR or WAR files and positioned
within the EAR archive. The roster of integrated
components is depicted in the deployment descriptor,
housed in the application.xml file.
Deployment actions: These are inherently delineated
by the configurations established in the web.xml
deployment descriptor of each WAR file.
EAR archives can be created using Java build tools like Ant,
Maven, or Gradle. Alternatively, you can simply assemble the
necessary contents into a ZIP archive using any zip archiving tool.
deployment of the EAR archive is facilitated through tools
associated with each specific JEE server.
Day 0 configuration can be applied using environment
variables or by setting context parameters within the
web.xml deployment descriptors found in the microservices'
WAR files.
Web.xml (Code snippet 14.3):
1. <web-app version="3.1">
2. <context-param>
3. <param-name>param1</param-name>
4. <param-value>value1</param-value>
5. </context-param>
6. <context-param>
7. <param-name>param2</param-name>
8. <param-value>value2</param-value>
9. </context-param>
10. <context-param>
11. <param-name>param3</param-name>
12. <param-value>value3</param-value>
13. </context-param>
14.
15. <!-- Other entries like servlets, filters, listeners, etc. -->
16.
17. </web-app>
The reading of context parameters inside microservices can
be done in the following way (Code snippet 14.4):
1. String param1 = getServletContext().getInitParameter("param1");
Unfortunately, there is no standard method to override
context parameters during deployment. Depending on the
specific server you are using, there may be server-specific
methods to override context-param values, possibly through
administrative consoles or server-specific configuration files.
Implementing custom deployment actions in Enterprise
Archive (EAR) packaging generally involves utilizing
application server-specific features or mechanisms within
your code to execute specific actions upon deployment.
Packaging microservices into EAR archives is usually
employed in JEE platforms. This approach encapsulates
microservices, commonly packaged as WAR files and their
respective dependencies as JAR files, within an EAR archive.
This package structure is orchestrated through a descriptor
file, application.xml, within the archive, outlining the
components and configurations. However, this structure
lacks explicit support for product metadata and
environmental compatibility specifications. Deployment is
typically executed through tools associated with specific JEE
servers, and initial configurations can be defined within
web.xml files inside WAR files for each microservice.
Following are the pros and cons of EAR archive:

Pros:
Simplifies deployment by bundling multiple
components into a single package.
Allows sharing of libraries and resources among
various microservices.
Supported by well-established build tools like Maven,
Ant, and Gradle.
JEE servers offer comprehensive management and
monitoring capabilities.

Cons:
Deployment descriptors can be complex and verbose.
Sometimes tied to specific JEE server vendors, possibly
leading to vendor lock-in.

Resource template
In the scenario where microservices are packaged as
serverless functions or other cloud-native solutions, resource
templates can serve as the solution for packaging and
deployment. Every cloud platform provides own provisioning
tools with their respective resource templates. Some well-
known cloud platforms are:
Amazon Web Services (AWS):
AWS CloudFormation Templates
AWS Cloud Development Kit (CDK)
Microsoft Azure:
Azure Resource Manager (ARM) Templates
Bicep (a domain-specific language for deploying
Azure resources)
Google Cloud Platform (GCP):
Deployment Manager Templates
IBM Cloud:
IBM Cloud Schematics
Oracle Cloud:
Oracle Resource Manager
Alibaba Cloud:
Alibaba Cloud Resource Orchestration Service
(ROS) Templates
Nevertheless, for a cross-platform deployment mechanism
that transcends individual cloud platforms, Terraform
emerges as a viable choice, especially when microservices
are crafted in a non-platform-specific manner, such as
processes operating in virtual machines.
A product package defined as a cloud resource template has
the following components:
Product metadata: Not supported. Inferred through
the naming convention adopted for the archive
containing the product package.
Compatible environments: Generally not supported
by resource templates. If this information is important,
developers must implement their own mechanism to
work around it.
External dependencies: Not supported, same as above.
Component inventory: Constituted as entities within
the resource templates. The entities include references
to the microservice packages.
Deployment actions: Implicitly defined by the
elements facilitating deployment.
A common approach to setting up Day 0 configurations for
cloud-native microservices is using environment variables.
During deployment, these values can be seamlessly
incorporated as parameters within the resource templates. If
you need custom deployment actions, you can write them as
scripts and run them on a small computing instance.
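As one platform-specific illustration, an AWS CloudFormation template can accept Day 0 values as parameters and pass them to a serverless microservice as environment variables. All resource names and artifact locations below are hypothetical:

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  Param1:                      # Day 0 configuration value supplied at deployment time
    Type: String
  FunctionRoleArn:             # IAM role for the function, provided by the environment
    Type: String
Resources:
  OrdersFunction:              # hypothetical microservice packaged as a Lambda function
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: orders-service
      Runtime: java17
      Handler: com.example.orders.Handler::handleRequest
      Role: !Ref FunctionRoleArn
      Code:
        S3Bucket: my-product-artifacts
        S3Key: orders-service-1.2.0.jar
      Environment:
        Variables:
          PARAM1: !Ref Param1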
Using resource templates for packaging microservices
facilitates a streamlined approach to defining and deploying
system components via coded templates. Compatible with
popular infrastructure as code tools like Terraform, it offers a
structured way to set up microservices, specifying necessary
resources and attributes in platform-specific templates. This
method makes scaling and reproducing deployments easier,
with the initial configurations typically established through
environment variables. However, it does not inherently
support product metadata or external dependencies,
necessitating separate setups or integrations within the
package. It serves as a flexible yet platform-centric solution
for microservices packaging.
Following are the pros and cons of resource template:

Pros:
Facilitates organized and automated deployments.
Easy to replicate environments and services.
Efficient scaling of services in cloud environments.
Seamless compatibility with Infrastructure as Code
(IaC) tools like Terraform.

Cons:
Limited to platform-specific templates and
configurations.
Does not inherently manage or document external
dependencies.
It can become complex and unwieldy for larger, more
intricate systems.
Custom script
If the existing solutions don't meet your requirements, you
always have the option to package and deploy your products
using custom scripts. This approach can also cater to cross-
platform deployments, masking platform variances and
ensuring a consistent user experience. Notably, this method
can incorporate the methods outlined in the preceding
patterns.
Following are the pros and cons of custom script:

Pros:
Tailored to specific needs
Can integrate with various platforms
Can encapsulate other packaging methods

Cons:
Higher maintenance complexity
May lack standardization
It may require deeper knowledge and expertise

Baseline management
Baseline management is crucial in overseeing the fluid
dynamics of microservices systems, which may undergo
several updates daily due to their independent life cycles. It
facilitates the automatic tracking and integration of varied
component versions, removing the need for manual updates
in a product package. This tool is essential in maintaining a
harmonized, current, and functioning microservice
ecosystem, seamlessly adapting to rapid changes, and
ensuring system coherence.
Problem
In microservices, baseline management is a crucial process
that helps manage and synchronize a system’s different
components effectively. It involves maintaining an agreed
standard or baseline of the system's components at different
stages of development, which helps in tracking changes and
managing dependencies more effectively.
Typically, a microservices system contains numerous
individual microservices that can change rapidly. This makes
tracking each microservice’s status and configuration a
complex task. However, the challenge becomes more
manageable when we introduce automated baseline
management. It takes over the laborious task of manually
tracking changes, ensuring the product package always
contains the correct versions of each component, which work
harmoniously together.
Interestingly, while the external dependencies of the system
tend to change rarely, the microservices themselves undergo
frequent changes, sometimes several times within a day.
Manually maintaining these changes is not feasible as it can lead to
errors and inconsistencies. Therefore, automating the
process ensures that the baseline is always up-to-date,
reducing the potential for error and improving efficiency.

Development branch
Baseline management employing a development branch
serves as a streamlined approach to keep track of various
versions and updates pertaining to different components of a
product package.
Here is how it works:
1. A mono repository (a single, centralized repository) holds
multiple microservices.
2. The development team continually integrates new
changes and updates into it.
3. The repository houses all the necessary components in
one place, making it a hub of sorts for all microservices
related to the project (as shown in Figure 14.3):

Figure 14.3: Baseline management using a VCS development branch

To create a new product package, the system scans the
development branch of the mono repository by using a
specific tag that identifies the pertinent microservices for a
particular product release. Through this method, it
accurately pinpoints and aligns the correct versions of the
microservices, ensuring a seamless and error-free integration
into the new product package.
Following are the pros and cons of the development branch:

Pros:
All microservices are housed in one location, making it
easier to track changes and updates.
Helps in maintaining a uniform versioning system for
all microservices, reducing inconsistencies.
Facilitates automated integration of various
components for new releases, saving time.

Cons:
Potential for merge conflicts if multiple teams are
working on a shared codebase simultaneously.
A monorepo is prone to creating coupling between
services in code as well as build and test processes if
not managed carefully.
As everything is centralized, product packaging might
take longer, especially if the repository is large.

System deployment
When using a multi-repo strategy or sourcing components
from diverse origins, a single development branch for
baseline establishment is impractical. Under such
circumstances, a potent alternative emerges in the form of
utilizing a specific test environment as the source of truth,
where the microservices system is deployed (refer to Figure
14.4):
Figure 14.4: Baseline management using a system deployment

As this method pivots around the concept of real-time
verification, it requires the development stage to attain a
level of stability before initiation. Once this stability is
achieved, the baselining procedure scans the versions of the
microservices deployed within that environment.
Subsequently, these versions are integrated into the product
package, thereby aligning the deployed components'
versions with the product package's blueprint.
However, the successful implementation of this approach
hinges heavily on the accessibility to easily discern the
versions of the deployed components. In environments such
as Kubernetes, this can be facilitated by listing pods and
extracting versions used as tags in the respective
microservice Docker images.
This method embodies a realistic and dynamic approach to
baseline management, essentially mirroring the deployed
system's state at a particular time, which assists in capturing
the most stable and recent configurations for packaging.
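In a Kubernetes environment, for instance, the deployed image versions can be captured with a single command (the namespace is hypothetical):

kubectl get pods -n my-product \
  -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.spec.containers[*].image}{'\n'}{end}"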
Following are the pros and cons of system deployment:
Pros:
Aligns with the live environment, reflecting the most
current state of deployed microservices.
Can accommodate components sourced from multiple
teams or vendors.
Allows for the management of microservices housed in
various repositories without centralization.

Cons:
Requires a stable development stage, which might
delay the baselining process.
Relies heavily on specific environmental setups, which
allow for easy version determination.
The constantly changing nature of deployed systems
might result in frequent alterations to the baseline.

Updates from CI/CD pipeline


Another baseline management method leverages the
capabilities of CI/CD pipelines and can prove to be both
efficient and time-saving (refer to Figure 14.5):

Figure 14.5: Baseline management using updates from the CI/CD pipeline

This approach hinges on the automatic updating of
component versions within the product package each time a
microservice is successfully released through the pipeline.
However, this method requires a robust and reliable delivery
process. This is to ensure that each new version of a
component rolled out is fully compatible with other
components in the system, maintaining a correct
configuration.
Despite its efficiency, this method might present challenges,
particularly when there is a desynchronization in the product
configurations. In such instances, pinpointing the correct
configuration becomes a complex task. Due to this, it is
advisable to couple this baseline method with one of the
previously discussed approaches. This strategic combination
would work towards realigning the baseline, bringing it back
in sync, and ensuring smooth operations in the long run.
Following are the pros and cons of updates from the CI/CD
pipeline:

Pros:
Allows for immediate incorporation of new changes.
Reduces manual intervention, minimizing human error.
Ensures that the product is always in a deployable
state.

Cons:
New updates might not always be compatible with
existing components.
Can make identifying the correct configuration
challenging in the case of desynchronization.
Requires a highly reliable delivery process to function
effectively.

Deployment strategy
In the fast-paced world of product deployment, ensuring
continuous operations is pivotal. Despite stringent measures,
errors can occur, potentially leading to significant financial
and reputational losses. However, utilizing strategic
deployment approaches can help curb these issues,
facilitating swift issue identification and enabling a seamless
rollback to a functional state if necessary. These strategies
stand as a bulwark, safeguarding operations from disruptions
during product version updates.

Problem
In the dynamic landscape of business, consistent and
efficient product updates are key to maintaining a
competitive edge. However, the deployment process,
especially concerning microservices, can be complex and
prone to errors, potentially leading to service disruptions and
consequent financial and reputational losses.
Despite rigorous planning, unforeseen errors can infiltrate
the deployment pipeline, sometimes going unnoticed until
they cause significant operational setbacks. It is essential to
have strategies in place that can swiftly identify and rectify
these issues, preventing larger-scale disruptions and losses.
Furthermore, a crucial component of a robust deployment
strategy is the ability to revert to a previous, stable state
swiftly if a new deployment fails. This ensures minimal
service disruptions and safeguards both the company's
reputation and its financial health.
Therefore, the development of effective and agile
deployment strategies is vital. Such strategies should
facilitate a seamless deployment process while also
providing mechanisms to identify potential issues quickly
and guarantee a smooth transition back to a stable state if
necessary, securing uninterrupted service and sustaining
market reputation and stability.
Blue/green deployment
Blue/green deployment is a strategy designed to reduce
downtime and risk by running two identical production
environments named blue and green.
At any time, only one of the environments is live, with the
live environment serving all production traffic. For example,
the blue environment is live initially, and the green
environment is used to deploy the new version of the
application (refer to Figure 14.6, where LB means load
balancer):

Figure 14.6: System deployment using blue/green strategy

The process works as follows:


Initial setup: Start with two environments, the blue is
the current production environment running the
existing version of the application, while the green is
idle.
New version deployment: Deploy the new version of
the application in the green environment, ensuring all
necessary databases and back-end systems are
correctly linked.
Testing: Once the new version is deployed in the green
environment, it undergoes thorough testing to ensure
functionality and stability.
Traffic shifting: Upon successful testing, the traffic is
gradually shifted from the blue environment to the
green environment, either all at once or incrementally,
to monitor the performance and functionality in the
real world.
Rollback plan: In case of any issues during the
transition, a rollback plan is in place to quickly revert
back to the blue environment, minimizing disruption.
Final transition: If no issues are detected, all traffic is
completely redirected to the green environment,
making it the new production environment.
Retiring the old environment: The blue environment
then becomes idle and can serve as a standby for the
next update, restarting the cycle.
When implementing the blue/green deployment strategy,
special attention should be paid to data management to
ensure continuity and integrity.
There are primarily two approaches to handle this:
Database replication
Database sharing
In the case of replication, the data is duplicated across both
environments, providing an additional layer of isolation and
reducing the risk of data corruption. On the other hand,
sharing a database between environments necessitates
meticulous schema management. It is crucial that the new
deployment is capable of operating without altering the
existing schema, preventing potential disruptions during the
transition. This allows for a seamless shift of operations from
the old to the new environment without jeopardizing data
integrity and facilitating a smoother rollback, if necessary.
Ultimately, the chosen approach should align with the
specific requirements and constraints of the system being
deployed.
This strategy allows for seamless transitions between
versions, quick rollbacks in case of errors, and virtually no
downtime during deployments, thereby ensuring a reliable
and uninterrupted service.
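In a Kubernetes setting, one simple way to implement the traffic switch is to keep a single stable Service and flip its label selector between the two environments. The slot label below is a hypothetical convention, not a Kubernetes standard:

apiVersion: v1
kind: Service
metadata:
  name: my-product              # stable entry point used by clients
spec:
  selector:
    app: my-product
    slot: green                 # change to "blue" to shift traffic back
  ports:
  - port: 80
    targetPort: 8080

Because the switch is a single label change, rollback is equally fast.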
Following are the pros and cons of blue/green deployment:

Pros:
Enables virtually zero-downtime deployments.
Ensures clear isolation between the new (green) and
old (blue) environments.
Allows for controlled and staged rollout, facilitating
extensive testing and validation.

Cons:
Requires substantial resources as two parallel
environments need to be maintained.
Might incur higher costs due to the necessity of
maintaining two fully functional environments.
Risk of configuration drift, where the two environments
start to diverge in setup over time.

Rolling deployment
A rolling deployment is a strategy commonly employed to
minimize downtime while updating the components of a
microservice system. In this strategy, the newer version of a
microservice is gradually rolled out while simultaneously
phasing out the older versions.
During the process, the system is systematically updated
one unit at a time, which could be a server, a container, or a
virtual machine. This ensures that a portion of the system is
always functional to serve user requests, thereby minimizing
service disruptions (refer to Figure 14.7):

Figure 14.7: System deployment using the rolling strategy

This is how it works step by step:


Preparation: Before deployment, ensure that the new
version is compatible with existing data schemas and
configurations to prevent breakdowns during the
transition.
Partial deployment: Initially, a fraction of the
instances are updated with the new version. The
remaining instances continue to operate with the old
version.
Monitoring and verification: After updating a
portion of the instances, monitor the system's
performance and functionality to ensure that there are
no errors or issues.
Incremental rollout: If no issues are detected, the
next batch of instances is updated. This process
continues incrementally until all instances are running
the new version.
Reversion option: At any stage, if a problem is
detected, there is an option to roll back the updates to
return to a stable state.
Completion: Once all instances have been successfully
updated and are functioning correctly, the rolling
deployment is considered complete.
Post deployment testing: Conduct further tests to
ensure the entire system is working as expected with
the new version fully deployed.
If the system includes a database, it may either be shared
across versions or replicated. If shared, the new version must
maintain compatibility with the old database schema
throughout the deployment process.
It is worth noting that the system must be able to operate
with a mixture of old and new versions for the duration of the
deployment to prevent any downtime. Also, the deployment
might involve several stages, including updating the
application code and possibly adapting the database
schema.
This deployment method is particularly advantageous
because it does not require duplicating the production
environment like the blue/green deployment method,
making it less resource-intensive. It, however, requires
careful orchestration to ensure a smooth transition and to
quickly identify and correct issues that may arise during the
deployment phase.
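Kubernetes supports this pattern natively through the Deployment object's RollingUpdate strategy. The sketch below, built around a hypothetical orders-service, replaces instances one at a time whenever the image tag changes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1          # at most one instance is taken down at a time
      maxSurge: 1                # at most one extra instance is started during the update
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      containers:
      - name: orders-service
        image: orders-service:1.3.0   # updating this tag triggers the rolling update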
Following are the pros and cons of rolling deployment:

Pros:
Ensures service availability during the deployment.
Does not require duplicating the production
environment, thus saving resources.
Allows for pausing or rolling back the deployment if
issues are detected.
Cons:
Requires backward compatibility to allow simultaneous
operations of mixed versions of system components.
Can take a considerable amount of time to complete,
particularly for large systems.
Needs careful handling of database schemas to prevent
conflicts or data issues during the transition.

Canary deployment
Canary deployment is a strategy used to reduce the risk of
introducing a new software version in production by slowly
rolling out the change to a small subset of users before
rolling it out to the entire infrastructure (refer to Figure 14.8):

Figure 14.8: System deployment using the canary strategy

Here is how it generally unfolds:


Initial release: In the initial phase, the new version of
the software is deployed to a limited number of
servers, and only a small percentage of the user base,
known as the canary group, is directed to this new
version. This allows teams to monitor how the update
performs in a live environment but on a smaller scale.
Monitoring performance: During this phase, system
metrics and user feedback are closely monitored to
evaluate the performance and potential issues of the
new version. This data helps in identifying any
unforeseen problems that were not caught during
testing stages.
Gradual rollout: If the new version performs well
without any significant issues, it is gradually rolled out
to larger segments of users and infrastructure in
stages. Each stage allows for further monitoring and
adjustments before proceeding to the next.
Rollback strategy: At any point during the canary
deployment, if a major issue is detected, the changes
can be rolled back quickly to the previous stable
version, minimizing the impact on the user base.
Finalization: Once the new version proves stable and
all identified issues are resolved, it is eventually rolled
out to the entire user base, replacing the old version
entirely.
Feedback and adjustments: After full deployment,
ongoing monitoring and user feedback help in making
any necessary tweaks or improvements.
Similar to other deployment strategies, database
management is a crucial aspect. It can either be shared
between the versions or replicated. If shared, the new
version should be compatible with the existing database
schema to avoid disruptions.
Through this method, organizations can mitigate risks
associated with deploying new versions by carefully
monitoring their performance and swiftly reacting to any
issues encountered. It facilitates a safer and more controlled
environment for software updates.
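A rough, Kubernetes-native approximation of a canary is to run two Deployments behind a Service that selects only the shared app label, so traffic splits roughly in proportion to the replica counts. The names and the track label are hypothetical; finer-grained routing typically requires a service mesh or an ingress controller:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service-stable
spec:
  replicas: 9                        # receives roughly 90% of the traffic
  selector:
    matchLabels: {app: orders-service, track: stable}
  template:
    metadata:
      labels: {app: orders-service, track: stable}
    spec:
      containers:
      - name: orders-service
        image: orders-service:1.2.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service-canary
spec:
  replicas: 1                        # roughly 10% of the traffic goes to the new version
  selector:
    matchLabels: {app: orders-service, track: canary}
  template:
    metadata:
      labels: {app: orders-service, track: canary}
    spec:
      containers:
      - name: orders-service
        image: orders-service:1.3.0

Promoting the canary then amounts to updating the stable Deployment's image and removing the canary Deployment.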
Following are the pros and cons of canary deployment:
Pros:
Isolates potential issues to a small user group.
Allows for staged, controlled deployment.
Facilitates swift reversal to a previous version if issues
are detected.

Cons:
Requires a sophisticated setup for routing and
monitoring.
Different users might experience different versions of
the system during the deployment phase.
Can complicate database schema management,
especially with shared databases.

Conclusion
In this chapter, we explore the diverse methodologies of
product packaging and deployment in the microservices
domain. We delve into product packaging using various
scripts and templates, including Kubernetes YAML manifests,
Helm charts, EAR archives, and cloud resource templates.
Furthermore, we elucidate the crucial aspect of baseline
management emphasizing development branch strategies,
system deployment, and incremental updates through CI/CD
pipelines. Lastly, we navigate through advanced deployment
strategies, detailing the benefits and considerations of
blue/green, rolling, and canary deployments, to ensure
seamless and efficient product rollouts.
We have now reached the end of our journey. In this book,
we have explored key aspects of designing and
implementing microservices architecture. Beginning with
chapters on defining business vision, organization structure,
and architecting microservices systems, we progressed to
cover essential topics including configuring microservices,
implementing effective communication, managing data, and
handling complex business transactions. Additionally, we
delved into exposing external APIs, monitoring
microservices, packaging and testing strategies, scripting
environments, and automating continuous
integration/continuous deployment (CI/CD) pipelines.
With a comprehensive approach, the book guided readers
through the entire lifecycle of microservices development,
fostering a deep understanding of microservices principles
and practices from development to deployment.

Further reading
1. Rajhi, S. An in-depth guide to building a Helm chart from the ground up. Medium. Feb 20, 2023. Available at https://medium.com/@seifeddinerajhi/an-in-depth-guide-to-building-a-helm-chart-from-the-ground-up-9eb8a1bbff21
2. Roper, J. Kubernetes Deployment Strategies. Medium. Aug 17, 2022. Available at https://medium.com/faun/kubernetes-deployment-strategies-f36e7e4d2be
3. Nanayakkara, C. Application Deployment Strategies. Medium. Aug 5, 2021. Available at https://medium.com/@crishantha/application-deployment-strategies-665d79617eac
4. Belagatti, P. How to Create a Simple Helm Chart. Medium. Sep 19, 2022. Available at https://medium.com/@pavanbelagatti/how-to-create-a-simple-helm-chart-8211dcbaedc2
5. Pandei, I. R. What is basically JAR, WAR, EAR file in java. Medium. Sep 7, 2021. Available at https://medium.com/p/b7beeb51bebb
Index
Symbols
3scale (Red Hat) 265

A
Acceptance Test Driven Development (ATDD) 354
acceptance testing 339, 353-357
Activiti 239
Agile Workshop 8
characteristics 8, 9
cons 9
pros 9
alpha and beta tests 339
Amazon CloudWatch 306
Amazon Cognito 279
Ansible 379
antipatterns 19, 20, 81, 111
broad code changes 82
code sharing in monorepo 81
monolithic build and deployment 81
poorly structured monorepos 81
Apache APISIX 262
Apache JMeter 361
Apache Oltu 280
Apache Oltu (formerly Amber) 289
Apache OpenWhisk Java Runtime 324
Apache shiro 282
Apicurio 351
API documentation 137
problem 137
API gateway 260-262
Apigee (Google Cloud) 265
API key authentication 277
API management pattern 264
API versioning 133
problem 133
application metrics 304
problem 304, 305
solution 305-307
Application Programming Interfaces (APIs) 133, 134
versioned channels 134
architectural decomposition 26
data decomposition 28
domain-driven design (DDD) 29, 30
functional decomposition 27, 28
layered architecture 31, 32
problem 26, 27
Artillery 365
AssertJ 342
asymmetric deployments 47
cons 48
pros 48
asymmetric environment 388, 389
AsyncAPI 141, 142
asynchronous execution 250
asynchronous messaging 124, 125
problem 125
Atmosphere 274
Attribute-Based Access Control (ABAC) 43
Auth0 Java JWT 286
authentication 275
API key authentication 277
basic authentication 275, 276
multi-factor authentication (MFA) 280-282
OpenID Connect (OIDC) 278-280
problem 275
typical flow 277, 278
authorization 283
attribute-based authorization 283
JWT token 285, 286
OAuth 2.0 287-289
permission-based authorization 283
policy-based authorization 283
problem 283
role-based authorization 283
security considerations 284, 285
session tracking 283, 284
Authorization Code Flow 278
auto code documentation 73
auto-generated comments 75, 76
JavaDoc generation 74, 75
problem 74
automated testing 335
AWS API gateway 262
AWS CloudFormation 379
AWS CodeBuild 410
AWS Lambda 52
AWS Lambda Java SDK 324
AWS Step Functions 240
Azure API management 262
Azure Application Insights 309
Azure DevOps 410
Azure Functions 52
Azure Functions Java SDK 324
Azure Logic Apps 240
Azure Monitor 306
Azure Resource Manager (ARM) 379

B
Backends for Frontend (BFF) 228, 263, 264
backpressure pattern 242-245
Bamboo 410
baseline management 460
development branch 461, 462
problem 460, 461
system deployment 462, 463
updates, from CI/CD pipeline 464
Bash 348
Behat 354
Behave 347
Behavior Driven Development (BDD) 354
Bill of Material (BOM) concept 450
blob IDs
transferring 154, 155
BlobStorage class 144
blob streaming 142
continuous streaming 143-154
problem 142
blue/green deployment 465-467
branching strategy 435
continuous deployment 436
feature branching 437, 438
Gitflow branching strategy 441-443
no-branching strategy 436, 437
problem 436
release branching 440, 441
trunk-based development 439, 440
bulkhead pattern 245-247
Burp Suite 362
business logic coordination and control flow 37, 38
orchestration 38
problem 38

C
Camunda Business Process Management (BPM) 239
canary deployment 469, 470
castle and moat approach 44
Chaos Monkey 361, 371
configuration 371
problem 371
solution 371
Chef 380
choreographic saga pattern 236, 237
compensating transaction 237, 238
workflow 238-240
choreography 39, 40
cons 40
pros 40
chunking 155
cons 156
pros 155
CI/CD pipelines 405, 406
application platform 414, 415
incremental delivery 409-413
multiple deployments 413, 414
problem 406-408
product integration 415, 416
CircleCI 410
Circuit Breaker pattern 164, 165
client library pattern 166
cons 169
problem 166
pros 169
solution 166-168
coarse-grained microservices 33
Code Backward Compatibility patterns 65
code compatibility 67
cons 68
full backward compatibility 67
namespace versioning 68
problem 67
pros 67
Codeium 75
code repositories 56
mono-repo 57
multi-repo 59, 60
problem 57
code reviews 76
auto code checks 78-80
checklist 78
periodic reviews 77, 78
problem 76
Pull Request Review 76, 77
code sharing 63
problem 63
shared libraries / versioned dependencies 65
code structure 62
functional / domain-driven code structure 62
problem 62
type / technology-based code structure 63
Commandable API 156
cons 158
problem 156, 157
pros 158
solution 157, 158
Command Query Responsibility Segregation (CQRS) 184, 185
cons 185
pros 185
common low denominator approach 47
Communication Reliability Patterns 159
communication style 34
event-driven communication 37
message-driven microservices 35, 36
problem 34
synchronous communication 34
concurrency and coordination 210
distributed cache 211-215
distributed locking 221-224
optimistic locking 218-221
partial updates 215-218
problem 211
state management 224-227
configuration types 84
Day 0 configuration 85
Day 1 configuration 85, 86
Day 2 configuration 86, 87
problem 84, 85
solution 85
Connect2id server 280
connection configuration 98
client-side registrations 102, 103
discovery services 100-102
DNS registrations 98-100
problem 98
Consul 92
containerized infrastructure approach 47
contract testing 350-353
correlation ID 296
CronJob service 331
cross-platform deployment 325
platform abstraction 326
problem 325
repackaging 327, 328
symmetric deployments 325
cross-platform deployments 45, 46, 388
asymmetric deployments 47, 48
asymmetric environment 388, 389
dockerized environment 393, 394
problem 46, 388
symmetric deployments 46, 47
symmetric environment 390, 392
cross-platform frameworks
Fn Project 53
Micronaut 52
OpenFaaS (Functions as a Service) 53
Quarkus 52
CRUD (Create, Read, Update, Delete) pattern 183, 184
Cucumber 347, 354
custom script 460
Cypress 348

D
database architecture 197
database per service pattern 197, 198
database sharding 199, 200
problem 197
Database Benchmark 362
data decomposition 28
cons 29
pros 29
Datadog 306, 309
DataGenerator 366
data management 182
problem 182, 183
data migration 200
disruptive migration 201, 202
problem 200
schemaless 203-206
versioned tables 202, 203
data objects 172
dynamic data objects 177, 178
problem 173
static data objects 173-177
Data Transfer Objects (DTOs) 172
Day 0 configuration 85
Day 1 configuration 85, 86
Day 2 configuration 86, 87
DBMonster 366
decoupling 125
delayed execution 249
background workers 253, 254
job queue 251-253
problem 249-251
delivery dashboard 446, 447
delivery metrics 443
delivery dashboard 446, 447
detail metrics 444, 445
DORA metrics 445, 446
problem 443, 444
deployment security 394, 395
IP access lists 395, 396
management station 398, 399
problem 395
traffic control rules 396, 397
deployment strategy 465
blue/green deployment 465-467
canary deployment 469, 470
problem 465
rolling deployment 467-469
deployment-time composition 103
problem 103
solution 104-106
deployment-time configuration 86
detail metrics 444, 445
development environments 384
development model 8
problem 8
development stacks 51
cross-platform frameworks 52
platform-specific frameworks 52
problem 51
DevOps delineation 416
problem 416, 417
solution 417, 418
DevOps Research and Assessment (DORA) metrics 445, 446
DGS (Domain Graph Service) Framework 271
Discovery Services 102
Distributed Application Runtime (DAPR) 53
distributed monolith 26
distributed tracing 307
problem 307
solution 307, 308
divide and conquer strategy 2
Docker 385
Docker container 317
benefits 317
dockerized environment 393, 394
Docker orchestration
features 318
domain-driven design (DDD) 29
cons 31
principles 30
pros 31
Domain Specific Languages (DSL) 354
dynamic configuration 92
generic configuration service 92-94
problem 92
specialized data microservice 94, 95
dynamic query 189
filtering 190
pagination 192, 193
problem 189, 190
projection 196, 197
sorting 194, 195

E
EAR archive 456-458
EasyMock 368
Elastic APM 308
Elastic Kubernetes Service (EKS) Cluster 380
Elastic Stack (ELK Stack) 304
end-to-end testing 346, 347
frameworks, using 347
environment configuration 95
problem 95
solution 95-97
environment provisioning 432
problem 432, 433
spin-off environment 434, 435
static environment 433, 434
environment variables 88, 89
config file 90
configuration template / consul 90, 91
cons 89
pros 89
environment verification 399
environment testing 399-401
infrastructure certification 401, 402
problem 399
error propagation 297
problem 298
solution 298-301
Etcd 92
event-driven microservices 37
cons 37
pros 37
Event Handlers 186
Event Objects 186
Event Replay 186
Event Sourcing 182, 186
cons 187
pros 187
Event Store 186
external activation 330
cron jobs 330, 331
CronJob service 331, 332
JEE Timer 333, 334
problem 330
external interface 258-260
API gateway 260-262
API management 264, 265
Backend for Frontend 263, 264
Facade pattern 262, 263

F
Facade pattern 262, 263
Facebook 279
Faker 366
feature branching 437, 438
feature delivery teams 14, 15
cons 15
pros 15
feature flag 106
experimental toggles 108
operational toggles 108
permission toggles 108
problem 106
release toggles 108
solution 106-110
fine-grained microservices 33
FitNesse 354
Flowable 239
Fn Project 53
FreeOTP 282
functional decomposition 27
cons 28
pros 28
functional / domain-driven code structure 62
cons 62
pros 62
functional testing 340
acceptance testing 353-357
contract testing 350-353
end-to-end testing 346-349
full state 357
initial state 357, 358
integration testing 344-346
partial state 358
problem 340, 341
unit testing 341-344

G
Gatling 361, 364
Gauge 354
GenerateData 366
Gherkin 354
Gitflow branching strategy 441, 442
GitHub 279
GitHub Actions 410
GitHub Copilot 75
GitLab CI/CD 410
GlobalConstants class 88
Globally Unique Identifiers (GUIDs) 181
Google Authenticator 282
Google Cloud Deployment Manager 379
Google Cloud Endpoints 262
Google Cloud Functions 52
Google Cloud Functions Java Framework 324
Google Cloud Monitoring 306
Google Identity Platform 279
Google Workflows 240
Grafana 306
GraphQL 269, 270
GraphQL Java 270
GraphQL SPQR 270
Graylog 304
Gremlin 361
gRPC 121-123
cons 124
pros 124
grpcurl 140

H
Hamcrest 342
HammerDB 362
handwritten documentation 69
Changelog 70, 71
Readme 69, 70
TODO file 71
hardcoded configuration 87
cons 88
problem 87
pros 88
solution 88
HashiCorp Vault 92
header-based versioning 136, 137
health checks 309
problem 309
solution 309-311
Helm chart 454-456
HTTP/REST synchronous communication 117-119
cons 121
pros 120
Hybrid Flow 278
Hypertext Transfer Protocol (HTTP) 115

I
implemented interface definition languages (IDL) 137
Implicit Flow 278
incremental delivery 6, 26
problem 6
solution 7
infrastructure certification 401
Integration Teams 16
cons 17
pros 17
integration testing 344-346
intellectual property (IP) 57
Inversion of Control (IoC) containers 327

J
Jaeger 308
Jakarta EE 52
Java API for WebSocket (JSR 356) 274
Java Authentication and Authorization Service (JAAS) 282
JavaDoc 74
Java FDK 326
Java JWT (JJWT) 286
JavaMail 282
Java Runtime Environment (JRE) 316
JavaScript Object Signing and Encryption (JOSE) 286
Java Virtual Machine (JVM) 52
JBehave 354
JBoss AeroGear 280
jBPM 239
JEE beans 320
JEE Timer 333, 334
Jenkins 409
JFairy 366
JMeter 364
setting up 362
JMock 368
JmsTemplate 132
job queue 251
JSON Web Encryption (JWE) 285
JSON Web Signature (JWS) 285
JSON Web Token (JWT) 280, 285
JUnit 342
JUnitParams 342

K
Kafka Cluster 380
Karate 351
Keycloak 279, 282, 289
Kubernetes YAML manifests 452-454

L
Large Language Models (LLMs) 75
layered architecture 31
cons 32
domain layer 31
infrastructure layer 31
interface layer 31
pros 32
Least Frequently Used (LFU) 211
Least Recently Used (LRU) 211
LoadRunner 361
Locust 364
Locust.io 361
logging 302
log aggregation 303, 304
problem 302
triple-layered logging 302, 303

M
Macro-services 33
Materialized View pattern 187, 188
cons 188
pros 188
measurable quality bar 427
message 125
Message Authentication Code (MAC) 285
message-driven microservices 35, 36
cons 36
pros 36
message envelope 125
Microcks 351
Micrometer 306
micromonolith packaging 328
problem 328
solution 329, 330
Micronaut 52, 324
MicroProfile JWT 286
Microservice Chassis 80
problem 80
solution 80, 81
microservice packaging 314
Docker container 317-319
JEE bean 320-322
problem 314, 315
serverless function 322-325
system process 315-317
microservices 1
microservices adoption goals 2
innovation 5, 6
problem 3
productivity 4, 5
scalability 3, 4
time to market 5
microservices adoption process 17
problem 17, 18
solution 18, 19
microservices architecture
definition 24
problem 24, 25
solution 26
microservice sizing 32
problem 32, 33
solution 33
Microsoft Azure Active Directory (Azure AD) 279
minikube 384
minimalist documentation 69
commit messages 72, 73
handwritten documentation 69
problem 69
Mini-Monoliths 33
Minimum Viable Product (MVP) 19
MitreID Connect 279
mock 367
problem 367
solution 367-370
Mockachino 369
Mockaroo 366
Mockito 368
mono-repo 57
cons 57
pros 57
MuleSoft Anypoint Platform 265
multi-factor authentication (MFA) 43, 280, 281
implementing 282
methods 281
multi-repo 59
cons 59, 60
pros 59
multi-tenant architecture 50, 51
cons 51
pros 51

N
namespace versioning 68
cons 69
pros 68
nano-services 33
Netflix Conductor 239
New Relic 306, 361
NGINX 262
Nimbus JOSE + JWT 280, 286
no code sharing 64
cons 65
pros 64
non-functional testing 358
availability testing 361
benchmark 359-361
data generator 365, 366
performance testing 360
problem 359
reliability testing 361
scalability testing 361
simulators 363-365
stress testing 360
volume testing 360
NS-3 365

O
OAuth 2.0 287
Object ID 178
generated key 180, 181
GUID 181, 182
natural key 178, 179
problem 178
OIDC Java Spring Boot Starter 280
OpenAPI 138, 139, 351
OpenFaaS (Functions as a Service) 53
OpenID Connect (OIDC) 278
OpenTelemetry 308
Oracle Cloud 379
orchestrated saga pattern 233, 234
orchestration 38
cons 39
pros 39
orchestrator 38
organizational structure 13
feature delivery teams 14, 15
Integration Teams 16, 17
Platform Teams 15, 16
problem 13
outbox pattern 247, 248
OWASP ZAP 362

P
Pac4j 279, 289
Pact 351
perimeter-based security model 44, 45
cons 45
pros 45
periodic reviews 77
cons 78
pros 78
Pester 347
Pip.Benchmarks 362
Pip.Services toolkit 53
platform abstraction 326
Platform Engineer 12
platform engineering 377
platform-specific frameworks
AWS Lambda 52
Azure Functions 52
Google Cloud Functions 52
Jakarta EE 52
Vert.x 52
Platform Teams 15, 16
cons 16
pros 16
point-to-point communication 126-128
cons 130
pros 130
polyglot and cross-platform frameworks 53
Distributed Application Runtime (DAPR) 53
Pip.Services toolkit 53
Spring Boot 53
Postman 351, 354
PowerMock 368
Process Engineer 11
process flow 227
aggregator 228, 229
branch pattern 230-232
chain of responsibility 229, 230
problem 228
Product Assembler 11
product integration pipeline 416
product packaging 450
custom script 460
EAR archive 456-458
Helm chart 454-456
Kubernetes YAML manifests 452-454
problem 450-452
resource template 458, 459
Prometheus 305
ProtoBuf 139, 140
Protractor 354
publish/subscribe communication 130-132
cons 133
pros 133
Pull Request Review 76, 77
cons 77
pros 77
Puppet 380
push notifications and callbacks 271
problem 271, 272
webhooks 272, 273
WebSockets 273, 274
PyTest 347

Q
quality gate 427
automated gate 427, 428
manual gate 429
problem 427
Quarkus 52, 324
query parameter versioning 136
QuickTest Professional (QTP) 354

R
Random User Generator 366
Rate Limiter pattern 162, 163
cons 164
pros 164
receiver 125
RedGate SQL Data Generator 366
regression tests 339
Relational Database Service (RDS) with MySQL 380
release branching 440, 441
reliability 158, 241
problem 159, 241
timeout pattern 159, 160
Remote Procedure Call (RPC) 121
Representational State Transfer (REST) 116
resource templates 458, 459
Rest-Assured 347, 351
retries pattern 160-162
retry storm 159
Robot Framework 354
Role-Based Access Control (RBAC) 43
rolling deployment 467-469

S
saga orchestrator 234, 235
SampleRestService 138
ScribeJava 289
scripted environment 376
deployment environment 377, 378
development environment 384-388
problem 376, 377
production environment 378-383
test environment 383, 384
secrets 89
secure delivery 430
problem 430, 431
solution 431
security model 40
perimeter-based security model 44, 45
problem 40, 41
zero trust security model 42, 43
Selenium 347, 354, 364
sender 125
serverless computing 322
Server-Sent Events (SSE) 274
Shadow 364
sharding keys
geographic sharding 199
hash-based sharding 199
list-based sharding 199
range-based sharding 199
shared libraries / versioned dependencies 65
cons 66
pros 66
Sidecar pattern 66
cons 67
pros 66
Siege 361
Simulink 364
Single Responsibility Principle (SRP) 33
single sign-on (SSO) 278
single-tenant architecture 49
cons 50
pros 50
smoke tests 339
SockJS 274
Software Architecture 26
Software Factory model 10
characteristics 10-12
cons 12
pros 12
Splunk 304, 306
Spock 369
Spring Boot 53
Spring Boot Configuration Service 92
Spring Boot GraphQL 270
Spring Cloud Contract 351
Spring Cloud Function 324
Spring Security 282
Spring Security JWT 286
Spring Security OAuth 279, 289
Spring WebSocket 274
SSL handshake 291
SSL/TLS encryption 290
problem 290
solution 290-293
Standard Operating Procedure (SOP) 11
state management 224-227
static configuration 88
problem 88
Stress-ng 361
StringTemplate 81
Swagger 351
symmetric deployments 46, 325
cons 47
pros 47
symmetric environment 390-392
synchronous communication 115
HTTP 115
problem 115
REST 116
synchronous execution 250
synchronous microservices 34
cons 35
pros 34
synchronous request/response 266, 267
HTTP/REST protocol 267, 268

T
T4 template engines 81
Talend Data Fabric 366
Task Scheduling 250
TeamCity 410
Technologist 11
Temporal 240
tenancy 48
multi-tenancy 50, 51
problem 49
single-tenancy 49, 50
Terraform 379
Test-driven development (TDD) 343
TestNG 342
test planning 336
problem 336, 337
solution 337-340
thundering herd problem 159
TIBCO Cloud Mashery 265
time-based one-time passwords (TOTPs) 281
Timeout pattern 159, 160
Time to Live (TTL) 212
TODO file 71
trace ID 296
problem 296
solution 296, 297
Trainer 12
transaction management 232
problem 232, 233
Travis CI 410
triple-layered logging 302, 303
trunk-based development 439
Truth 342
TurboData 366
Twitter 279
two-factor authentication (2FA) 280
Two-Phase Commit (2PC) 232-234
type / technology-based code structure 63
cons 63
pros 63
Tyrus 274

U
UFT (Unified Functional Testing) 354
Uniform Resource Locator (URL) 267
unit testing 341-344
Universally Unique Identifiers (UUIDs) 181
URI versioning 135

V
Vagrant 384
Version Control System (VCS) 58
Versioned Channels Pattern 134
cons 135
pros 135
versioned routing 135
Vert.x 52, 274
virtualized build process 418
problem 418
solution 419

W
WebSockets 273
WireMock 369
workspace 60
problem 61
solution 61
Write-Behind Cache 212
Write-Through Cache 212

Y
Yet Another Markup Language (YAML) 452
Yubico's Java libraries 282

Z
Zeebe 239
zero trust security model 42
cons 43
key principles 42, 43
pros 43
Zipkin 308
