CloudNative III
Cloud Native –
Resiliency, Monitoring & DevOps
Outline
• Introduction
• Pillars of Cloud Native
• Cloud Native Applications
• Cloud-native Communication Patterns
• Cloud-native Data Patterns
• Cloud-native Resiliency
• Monitoring & Health
• DevOps
Cloud-native Resiliency
• Resiliency is the ability of a system to react to failure and still remain functional.
• It's not about avoiding failure, but accepting failure and constructing cloud-native services to
respond to it.
• The system should return to a fully functioning state as quickly as possible.
• Unlike traditional monolithic applications, where everything runs together in a single
process, cloud-native systems embrace a distributed architecture.
• In the figure, a Circuit Breaker pattern has been added to the original retry pattern.
• Note how, after 100 failed requests, the circuit breaker opens and no longer allows calls to the service.
• The CheckCircuit value, set at 30 seconds, specifies how often the library allows one request to proceed
to the service.
• If that call succeeds, the circuit closes and the service is once again available to traffic.
• The intent of the Circuit Breaker pattern is different from that of the Retry pattern.
• The Retry pattern enables an application to retry an operation in the expectation that it will succeed.
• The Circuit Breaker pattern prevents an application from doing an operation that is likely to fail.
• Typically, an application will combine these two patterns by using the Retry pattern to invoke an operation
through a circuit breaker.
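The combination described above can be sketched in Python. This is a minimal illustration, not the library shown in the figure: the names `CircuitBreaker`, `CircuitOpenError`, and `call_with_retry` are hypothetical, the failure count is assumed to mean consecutive failures, and the `clock` parameter exists only so the 30-second interval can be simulated.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the call is not attempted."""


class CircuitBreaker:
    # Defaults mirror the slide: open after 100 failures, probe every 30 seconds.
    def __init__(self, failure_threshold=100, check_circuit_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.check_circuit_seconds = check_circuit_seconds
        self.clock = clock
        self.failures = 0        # consecutive failures (an assumption)
        self.opened_at = None    # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.check_circuit_seconds:
                raise CircuitOpenError("circuit is open; call skipped")
            # The check interval has elapsed: let one trial request through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed trial request.
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retry(breaker, operation, attempts=3):
    """Retry pattern that invokes the operation through the circuit breaker."""
    last_error = None
    for _ in range(attempts):
        try:
            return breaker.call(operation)
        except CircuitOpenError:
            raise  # retrying is pointless while the circuit is open
        except Exception as error:
            last_error = error
    raise last_error
```

The retry loop handles transient faults, while the breaker stops the loop from hammering a service that is likely to keep failing.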
Logging
• No matter how careful we are, applications almost always behave in unexpected ways in
production.
• When users report problems with an application, it's useful to be able to see what was
going on with the app when the problem occurred.
• One of the most tried and true ways of capturing information about what an application is
doing while it's running is to have the application write down what it's doing.
• This process is known as logging.
• Anytime failures or problems occur in production, the goal should be to reproduce the
conditions under which the failures occurred, in a non-production environment.
• Having good logging in place provides a roadmap for developers to follow in order to
duplicate problems in an environment that can be tested and experimented with.
Centralized Logging
• Because of the challenges associated with using file-based logs in cloud-native
apps, centralized logs are preferred.
• Logs are collected by the applications and shipped to a central logging application
which indexes and stores the logs.
• This class of system can ingest tens of gigabytes of logs every day.
• It's also helpful to follow some standard practices when building logging that spans
many services.
• For instance, generating a correlation ID at the start of a lengthy interaction, and
then logging it in each message that is related to that interaction, makes it easier to
search for all related messages.
• One need only find a single message and extract the correlation ID to find all the
related messages.
• Another example is ensuring that the log format is the same for every service,
whatever the language or logging library it uses.
• This standardization makes reading logs much easier.
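Both practices above (a correlation ID and one shared log format) can be sketched with Python's standard `logging` module. The service names, the JSON field names, and the helpers `make_logger` and `find_related` are assumptions made for illustration; an in-memory stream stands in for the central log store.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Emit every record in the same JSON shape, whichever service wrote it."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "service": record.name,
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })


def make_logger(service_name, stream):
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def find_related(log_lines, correlation_id):
    """Return every log entry that carries the given correlation ID."""
    entries = (json.loads(line) for line in log_lines)
    return [entry for entry in entries if entry["correlation_id"] == correlation_id]


# Stand-in for the central log store.
stream = io.StringIO()
orders = make_logger("orders", stream)
payments = make_logger("payments", stream)

# The ID is generated once, at the start of the interaction, and passed along.
cid = uuid.uuid4().hex
orders.info("order received", extra={"correlation_id": cid})
payments.info("payment authorized", extra={"correlation_id": cid})
payments.info("nightly job", extra={"correlation_id": uuid.uuid4().hex})

related = find_related(stream.getvalue().splitlines(), cid)
```

Here `related` holds the two entries from the same interaction, one from each service, and excludes the unrelated one.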
Logstash
• The first component of the ELK stack (Elasticsearch, Logstash, Kibana) is Logstash.
• This tool is used to gather log information from a large variety of different
sources.
• For instance, Logstash can read logs from disk and also receive messages
from logging libraries.
• Logstash can do some basic filtering and expansion on the logs as they
arrive.
• For instance, if the logs contain IP addresses then Logstash may be
configured to do a geographical lookup and obtain a country/region or even
city of origin for that message.
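The enrichment step described above can be illustrated in Python. Logstash itself does this through configured filters rather than application code, so the following is only a conceptual sketch: the `enrich` function, the field names, and the lookup table (built on reserved documentation address ranges with made-up locations) are all hypothetical.

```python
import ipaddress
import re

# Hypothetical lookup table; the ranges are reserved for documentation
# and the locations are invented for the example.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): {"country": "Exampleland", "city": "Sample City"},
    ipaddress.ip_network("198.51.100.0/24"): {"country": "Testonia", "city": "Demo Town"},
}

IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def enrich(message):
    """Turn a raw log line into an event, adding location data when possible."""
    event = {"message": message}
    match = IP_PATTERN.search(message)
    if match is None:
        return event
    try:
        address = ipaddress.ip_address(match.group())
    except ValueError:
        return event  # looked like an IP address but was not a valid one
    event["client_ip"] = str(address)
    for network, location in GEO_TABLE.items():
        if address in network:
            event["geo"] = location
            break
    return event


event = enrich("GET /cart from 192.0.2.44")
```

The original message is kept intact and the extracted fields are added alongside it, which is what later makes the fields searchable on their own.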
Elasticsearch
• Elasticsearch is a powerful search engine that can index logs as they arrive.
• It makes running queries against the logs quick.
• Elasticsearch can handle huge quantities of logs and, in extreme cases, can be
scaled out across many nodes.
• Log messages that have been crafted to contain parameters, or that have had
parameters split from them through Logstash processing, can be queried
directly because Elasticsearch preserves this information.
Kibana
• The final component of the stack is Kibana.
• This tool is used to provide interactive visualizations in a web dashboard.
• Dashboards can be built even by non-technical users.
• Most data that resides in the Elasticsearch index can be included in the
Kibana dashboards.
• Individual users may have different dashboard needs, and Kibana supports this
customization by allowing user-specific dashboards.
DevOps
• Cloud-native applications have clear advantages in terms of speed of development, stability,
and scalability, but managing them can be quite a bit more difficult.
• Years ago, it wasn't uncommon for the process of moving an application from development to
production to take a month, or even more.
• Companies released software on a 6-month or even yearly cadence.
• It's now fairly well established that being able to release software rapidly gives fast-moving
companies a huge market advantage over their more sloth-like competitors.
• The patterns and practices that enable faster, more reliable releases to deliver value to the
business are collectively known as DevOps.
• They consist of a wide range of ideas spanning the entire software development life cycle
from specifying an application all the way up to delivering and operating that application.
• Donovan Brown (cloud advocate and DevOps evangelist): "DevOps is the union of people,
process, and products to enable continuous delivery of value to our end users."
GitHub Actions
• Founded in 2008, GitHub is a widely popular web-based repository for
hosting projects, documentation, and code.
• GitHub uses the open-source, distributed version control system named Git
as its foundation.
• On top of Git, it adds its own set of features, including defect tracking, feature
and pull requests, task management, and wikis for each code base.
• As GitHub evolves, it too is adding DevOps features.
• For example, GitHub has its own continuous integration/continuous delivery
(CI/CD) pipeline, called GitHub Actions.
• GitHub Actions is a community-powered workflow automation tool.
• It lets DevOps teams integrate with their existing tooling, mix and match new
products, and hook into their software lifecycle, including existing CI/CD
partners.
• GitHub has over 40 million users, making it the largest host of source code in
the world.
Source Control
• Organizing the code for a cloud-native application can be challenging.
• Instead of a single giant application, the cloud-native applications tend
to be made up of a web of smaller applications that talk with one
another.
• The best arrangement of code remains an open question.
• There are examples of successful applications using different kinds of
layouts, but two variants seem to have the most popularity.
Single Repository
Frequently the argument for using a single repository boils down to an argument that
Facebook or Google use this method for source code arrangement.
• If the approach is good enough for these companies, then, surely, it's the correct approach for all
companies.
• The truth of the matter is that few companies operate on anything like the scale of Facebook or
Google.
• The problems that occur at those scales are different from those most developers will face.
In the end, either solution can be used to host the source code for microservices.
However, in most cases, the management and engineering overhead of operating in a
single repository isn't worth the meager advantages.
Splitting code up over multiple repositories encourages better separation of concerns
and encourages autonomy among development teams.
CI/CD Pipelines
• Almost no change in the software development life cycle has been so revolutionary as
the advent of continuous integration (CI) and continuous delivery (CD).
• Building and running automated tests against the source code of a project as soon as a
change is checked in catches mistakes early.
• Prior to the advent of continuous integration builds, it wouldn't be uncommon to pull
code from the repository and find that it didn't pass tests or couldn't even be built.
• This resulted in tracking down the source of the breakage.
• Traditionally, shipping software to the production environment required extensive
documentation and a list of steps.
• Each of these steps had to be completed manually, in a very error-prone
process.
• Continuous integration is followed by continuous delivery in which the freshly built
packages are deployed to an environment.
• The manual process can't scale to match the speed of development so automation
becomes more important.
• Checklists are replaced by scripts that can execute the same tasks faster and more
accurately than any human.
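The idea of replacing a checklist with a script can be sketched in Python. The `run_pipeline` function and the step names are hypothetical, and the commands are placeholders that only print a message; a real pipeline would invoke the project's own build, test, and deploy tools.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order and stop at the first failure.

    Returns the names of the completed steps and the name of the failed
    step, or None if every step succeeded.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return completed, name
        completed.append(name)
    return completed, None


# Placeholder commands standing in for real build, test, and deploy tools.
steps = [
    ("build", [sys.executable, "-c", "print('compiling')"]),
    ("test", [sys.executable, "-c", "print('running tests')"]),
    ("deploy", [sys.executable, "-c", "print('deploying package')"]),
]

completed, failed = run_pipeline(steps)
```

Because a failing step halts the run, a broken build or a failing test never reaches the deploy step.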
CI/CD Pipelines
• The environment to which continuous delivery delivers might be a test environment or,
as is being done by many major technology companies, it could be the production
environment.
• The latter requires an investment in high-quality tests that can give confidence that a
change isn't going to break production for users. In the same way that continuous
integration caught issues in the code early, continuous delivery catches issues in the
deployment process early.
• The importance of automating the build and delivery process is accentuated by cloud-
native applications.
• Deployments happen more frequently and to more environments, so manual
deployment borders on impossible.
• There's no cost to configuring many build pipelines, so it's advantageous to have at
least one build pipeline per microservice.
• Ideally, microservices are independently deployable to any environment so having each
one able to be released via its own pipeline without releasing a mass of unrelated code
is perfect.
• Each pipeline can have its own set of approvals allowing for variations in build process
for each service.
Feature Flags
• Cloud native is about speed and agility.
• Users expect rapid responsiveness, innovative features, and zero downtime.
• Feature flags are a modern deployment technique that helps increase agility for cloud-
native applications.
• They enable you to deploy new features into a production environment but restrict their
availability.
• You can activate a new feature for specific users without restarting the app or deploying
new code.
• They separate the release of new features from their code deployment.
• Feature flags are built upon conditional logic that controls the visibility of functionality for
users at run time.
• In modern cloud-native systems, it's common to deploy new features into production
early, but test them with a limited audience.
• As confidence increases, the feature can be incrementally rolled out to wider
audiences.
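An incremental rollout of this kind can be sketched in Python. The `FeatureFlags` class and the flag name `new-checkout` are hypothetical, and hashing the user ID into a percentage bucket is one common approach assumed here, not something the slides prescribe.

```python
import hashlib


class FeatureFlags:
    """In-memory flag store; a real system would read flags from a shared service."""

    def __init__(self):
        self._rollout = {}  # flag name -> percentage of users (0 to 100)

    def set_rollout(self, flag, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        # Unknown flags default to 0 percent: deployed code stays hidden.
        percent = self._rollout.get(flag, 0)
        # Hashing gives each user a stable bucket from 0 to 99, so a user who
        # sees the feature at 10 percent still sees it at 50 percent.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new-checkout", 10)   # limited audience first

if flags.is_enabled("new-checkout", "user-42"):
    page = "new checkout"
else:
    page = "existing checkout"
```

Raising the percentage with `set_rollout` widens the audience at run time, with no redeployment or restart.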
Feature Flags
Other use cases for feature flags include:
• Restrict premium functionality to specific customer groups willing to pay higher subscription fees.
• Stabilize a system by quickly deactivating a problem feature, avoiding the risks of a rollback or
immediate hotfix.
• Disable an optional feature with high resource consumption during peak usage periods.
• Conduct experimental feature releases to small user segments to validate feasibility and
popularity.
Feature flags also promote trunk-based development.
• It's a source-control branching model where developers collaborate on features in a single
branch.
• The approach minimizes the risk and complexity of merging large numbers of long-running
feature branches.
• Features are unavailable until activated.
Infrastructure as Code
• Cloud-native systems embrace microservices, containers, and
modern system design to achieve speed and agility.
• They provide automated build and release stages to ensure
consistent, high-quality code.
• Modern cloud-native applications also embrace the widely accepted
practice of Infrastructure as Code, or IaC.
• With IaC, platform provisioning is automated.
• Software engineering practices such as testing and versioning are
applied to DevOps practices.
• Infrastructure and deployments are automated, consistent, and
repeatable.
• Just as continuous delivery automated the traditional model of manual
deployments, Infrastructure as Code (IaC) is evolving how application
environments are managed.
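Why IaC makes environments repeatable can be illustrated with a small Python sketch. It assumes a declarative model in which the desired environment is described as data and a tool reconciles the real environment against it; the slides do not specify this model, and the functions `plan` and `apply` and the resource names are hypothetical.

```python
def plan(desired, actual):
    """List the changes needed to make the actual environment match the desired one."""
    changes = []
    for name, spec in desired.items():
        if name not in actual:
            changes.append(("create", name))
        elif actual[name] != spec:
            changes.append(("update", name))
    for name in actual:
        if name not in desired:
            changes.append(("delete", name))
    return changes


def apply(desired, actual):
    """Carry out the planned changes against the (in-memory) environment."""
    changes = plan(desired, actual)
    for action, name in changes:
        if action == "delete":
            del actual[name]
        else:
            actual[name] = dict(desired[name])
    return changes


# The desired state is plain data, so it can be versioned and reviewed like code.
desired = {
    "web-app": {"size": "small", "instances": 2},
    "database": {"tier": "standard"},
}
environment = {"web-app": {"size": "small", "instances": 1}, "old-cache": {}}

first_run = apply(desired, environment)
second_run = apply(desired, environment)
```

The first run updates `web-app`, creates `database`, and deletes `old-cache`; the second run finds nothing to change, which is the repeatability the bullets describe.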
Terraform
• Cloud-native applications are often constructed to be cloud agnostic.
• Being so means the application isn't tightly coupled to a particular cloud vendor and
can be deployed to any public cloud.
• Terraform is a commercial templating tool that can provision cloud-native applications
across all the major cloud players: Azure, Google Cloud Platform, AWS, and AliCloud.
• Instead of using JSON as the template definition language, it uses the slightly more
terse HCL (HashiCorp Configuration Language).
• Terraform also provides intuitive error messages for problem templates.
• There's also a handy validate task that can be used in the build phase to catch
template errors early.
• Command-line tools are available to deploy Terraform templates.
• Sometimes Terraform outputs meaningful values, such as a connection string to a newly
created database.
• This information can be captured in the build pipeline and used in subsequent tasks.
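Capturing outputs in a pipeline step can be sketched in Python around the `terraform output -json` command, which prints each output as an object with a `value` field. The helper names, the output name `db_connection_string`, and the sample value are assumptions for illustration; `read_outputs` only works where Terraform is installed and a configuration has already been applied.

```python
import json
import subprocess


def parse_outputs(raw_json):
    """Map each output name to its value, given `terraform output -json` text."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def read_outputs(working_dir):
    """Run `terraform output -json` in a directory with an applied configuration."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=working_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_outputs(result.stdout)


# Sample text in the shape the command prints (the values are made up).
sample = """
{
  "db_connection_string": {
    "sensitive": true,
    "type": "string",
    "value": "Server=example;Database=demo"
  }
}
"""

outputs = parse_outputs(sample)
connection_string = outputs["db_connection_string"]
```

A later task in the pipeline could then receive `connection_string` as configuration, for example through an environment variable.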
Terraform - Example
Bundle.json
• The core of a CNAB (Cloud Native Application Bundle) is a file called bundle.json.
• This file defines the contents of the bundle, be
they Terraform or images or anything else.
• The figure defines a CNAB that invokes some
Terraform.
• Notice, however, that it actually defines an
invocation image that is used to invoke the
Terraform.
• When packaged up, the Dockerfile that is located
in the cnab directory is built into a Docker image,
which will be included in the bundle.
• Having Terraform installed inside a Docker
container in the bundle means that users don't
need to have Terraform installed on their machine
to run the bundle.
• The bundle.json also defines a set of parameters
that are passed down into the Terraform.
• Parameterization of the bundle allows for
installation in different environments.
Readings
• Architecting Cloud Native .NET Applications for Azure.
https://dotnet.microsoft.com/en-us/download/e-book/cloud-native-azure/pdf