
ELL 887 - CLOUD COMPUTING

Cloud Native –
Resiliency, Monitoring & DevOps
2

Outline
• Introduction
• Pillars of Cloud Native
• Cloud Native Applications
• Cloud-native Communication Patterns
• Cloud-native Data Patterns
• Cloud-native Resiliency
• Monitoring & Health
• DevOps
3

Cloud-native Resiliency
• Resiliency is the ability of a system to react to failure and still remain functional.
• It's not about avoiding failure, but accepting failure and constructing cloud-native services to
respond to it.
• The system should return to a fully functioning state as quickly as possible.
• Unlike traditional monolithic applications, where everything runs together in a single process, cloud-native systems embrace a distributed architecture.

Distributed cloud-native environment: each microservice and cloud-based backing service executes in a separate process, across server infrastructure, communicating via network-based calls.
4

Cloud-native Distributed Environment


 Operating in this environment, a service must be sensitive to many different
challenges:
• Unexpected network latency - the time for a service request to travel to the receiver and
back.
• Transient faults - short-lived network connectivity errors.
• Blockage by a long-running synchronous operation.
• A host process that has crashed and is being restarted or moved.
• An overloaded microservice that can't respond for a short time.
• An in-flight orchestrator operation such as a rolling upgrade or moving a service from one
node to another.
• Hardware failures.
 Cloud platforms can detect and mitigate many of these infrastructure issues.
 The platform may restart, scale out, and even redistribute a service to a different node.
 However, to take full advantage of this built-in protection, services must be
designed to react to it and thrive in this dynamic environment.
5

Application Resiliency - Retry Pattern


 In a distributed cloud-native environment, calls to services and cloud resources can fail because of transient
(short-lived) failures, which typically correct themselves after a brief period of time.
 Implementing a retry strategy helps a cloud-native service mitigate these scenarios.
 The Retry pattern enables a service to retry a failed request operation a (configurable) number of times with an
exponentially increasing wait time.

 A retry pattern has been implemented for a request operation.


 It's configured to allow up to four retries before failing with a backoff interval (wait time) starting at two
seconds, which exponentially doubles for each subsequent attempt.
• The first invocation fails and returns an HTTP status code of 500. The application waits for two seconds and retries the call.
• The second invocation also fails and returns an HTTP status code of 500. The application now doubles the backoff interval to four seconds and retries the call.
• Finally, the third call succeeds.
• In this scenario, the retry operation would have attempted up to four retries while doubling the backoff duration before failing the call.
• Had the 4th retry attempt failed, a fallback policy would be invoked to gracefully handle the problem.
 It's a best practice to implement an exponentially increasing backoff (doubling the period on each retry)
to allow adequate correction time.
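
The slides don't name a specific resilience library, so the following is a minimal sketch in Python of the scenario above: up to four retries, with a two-second backoff that doubles after each failed attempt. The endpoint URL and the use of the requests library are illustrative assumptions.

```python
import random
import time

import requests  # assumed HTTP client; any client with timeouts works


def call_with_retry(url, max_retries=4, base_delay=2.0):
    """Call a service, retrying transient failures with exponential backoff.

    Mirrors the scenario above: up to four retries, starting with a
    two-second wait that doubles after each failed attempt.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            response = requests.get(url, timeout=5)
            if response.status_code < 500:
                return response  # success, or a non-transient client error
        except requests.RequestException:
            pass  # transient network fault; fall through to the retry logic
        if attempt == max_retries:
            # a fallback policy would take over here instead of raising
            raise RuntimeError("Request failed after all retry attempts")
        time.sleep(delay + random.uniform(0, 0.5))  # jitter avoids retry storms
        delay *= 2  # exponential backoff


# Example (hypothetical endpoint):
# call_with_retry("https://example.com/api/orders")
```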
6

Application Resiliency – Circuit Breaker Pattern


• While the retry pattern can help salvage a request entangled in a partial failure, there are
situations where failures can be caused by unanticipated events that will require longer
periods of time to resolve.
• These faults can range in severity from a partial loss of connectivity to the complete failure
of a service.
• In these situations, it's pointless for an application to continually retry an operation that is
unlikely to succeed.
• To make things worse, executing continual retry operations against a non-responsive service can create a self-imposed denial-of-service scenario: you flood your service with continual calls, exhausting resources such as memory, threads, and database connections and causing failures in unrelated parts of the system that use the same resources.
• In these situations, it would be preferable for the operation to fail immediately and only
attempt to invoke the service if it's likely to succeed.
• The Circuit Breaker pattern can prevent an application from repeatedly trying to execute
an operation that's likely to fail.
• After a pre-defined number of failed calls, it blocks all traffic to the service.
• Periodically, it will allow a trial call to determine whether the fault has resolved.
7


Application Resiliency – Circuit Breaker Pattern

• In the figure, a Circuit Breaker pattern has been added to the original retry pattern.
• Note how after 100 failed requests, the circuit breaker opens and no longer allows calls to the service.
• The CheckCircuit value, set at 30 seconds, specifies how often the library allows one request to proceed
to the service.
• If that call succeeds, the circuit closes and the service is once again available to traffic.
• The intent of the Circuit Breaker pattern is different than that of the Retry pattern.
• The Retry pattern enables an application to retry an operation in the expectation that it will succeed.
• The Circuit Breaker pattern prevents an application from doing an operation that is likely to fail.
• Typically, an application will combine these two patterns by using the Retry pattern to invoke an operation
through a circuit breaker.
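
A minimal sketch of the circuit breaker idea in Python. The 100-failure threshold and the 30-second CheckCircuit interval come from the figure described above; the class and method names are illustrative, and real services would normally use an existing resilience library rather than hand-rolling this.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: fail fast after too many failures, then
    allow one trial call per check interval to see if the fault cleared."""

    def __init__(self, failure_threshold=100, check_interval=30.0):
        self.failure_threshold = failure_threshold  # failures before opening
        self.check_interval = check_interval        # seconds (the CheckCircuit value)
        self.failure_count = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.check_interval:
                raise RuntimeError("Circuit open: failing fast")
            # interval elapsed: let a single trial call through ("half-open")
        try:
            result = operation()
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.opened_at = time.monotonic()   # open (or keep open) the circuit
            raise
        self.failure_count = 0                      # success: close the circuit
        self.opened_at = None
        return result


# In the arrangement described above, each retry attempt invokes the
# operation through the breaker, for example:
# breaker = CircuitBreaker()
# def guarded_request():
#     return breaker.call(lambda: requests.get("https://example.com/api/orders", timeout=5))
```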
9

Outline
• Introduction
• Pillars of Cloud Native
• Cloud Native Applications
• Cloud-native Communication Patterns
• Cloud-native Data Patterns
• Cloud-native Resiliency
• Monitoring & Health
• DevOps
10

Monitoring & Health


• To gauge the health of an application in production, it's necessary to
monitor the various logs and metrics being produced from the
servers, hosts, and the application proper.
• The number of different services running in support of a cloud-native
application makes monitoring the health of individual components
and the application as a whole a critical challenge.
• Just as patterns have been developed to aid in the layout of code in
applications, there are patterns for operating applications in a reliable
way.
• Three useful patterns in maintaining applications have
emerged: logging, monitoring, and alerts.
11

Logging
• No matter how careful we are, applications almost always behave in unexpected ways in
production.
• When users report problems with an application, it's useful to be able to see what was
going on with the app when the problem occurred.
• One of the most tried and true ways of capturing information about what an application is
doing while it's running is to have the application write down what it's doing.
• This process is known as logging.
• Anytime failures or problems occur in production, the goal should be to reproduce the
conditions under which the failures occurred, in a non-production environment.
• Having good logging in place provides a roadmap for developers to follow in order to
duplicate problems in an environment that can be tested and experimented with.

In traditional applications, log files are typically stored on the local machine.
12

Challenges of Logging in Scaled-Out Monolithic Applications


• The usefulness of logging to a flat file on a single machine is vastly reduced
in a cloud environment.
• Applications producing logs may not have access to the local disk or the
local disk may be highly transient as containers are shuffled around physical
machines.
• Even simple scaling up of monolithic applications across multiple nodes can
make it challenging to locate the appropriate log file.
13

Challenges of Logging in Cloud Native Applications


• Cloud-native applications developed using a microservices architecture also pose some
challenges for file-based loggers.
• User requests may now span multiple services that are run on different machines and may
include serverless functions with no access to a local file system at all.
• It would be very challenging to correlate the logs from a user or a session across these many
services and machines.
• Moreover, the number of users in some cloud-native applications is high.
• Imagine that each user generates a hundred lines of log messages when they log into an
application.
• In isolation, that is manageable, but multiply that over 100,000 users and the volume of logs
becomes large enough that specialized tools are needed to support effective use of the logs.
14

Centralized Logging
• Because of the challenges associated with using file-based logs in cloud-native
apps, centralized logs are preferred.
• Logs are collected by the applications and shipped to a central logging application
which indexes and stores the logs.
• This class of system can ingest tens of gigabytes of logs every day.
• It's also helpful to follow some standard practices when building logging that spans
many services.
• For instance, generating a correlation ID at the start of a lengthy interaction, and
then logging it in each message that is related to that interaction, makes it easier to
search for all related messages.
• One need only find a single message and extract the correlation ID to find all the
related messages.
• Another example is ensuring that the log format is the same for every service,
whatever the language or logging library it uses.
• This standardization makes reading logs much easier.
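
A hedged sketch of the two practices above: a shared JSON log format and a correlation ID generated once per interaction and attached to every related message. The field names and the use of Python's standard logging module are illustrative assumptions, not a prescribed standard.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON line so every service shares a log format."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": getattr(record, "service", "unknown"),
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Generate the correlation ID once at the start of the interaction, then
# attach it to every message related to that interaction.
correlation_id = str(uuid.uuid4())
extra = {"service": "orders", "correlation_id": correlation_id}
logger.info("Order received", extra=extra)
logger.info("Payment authorized", extra=extra)
```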
15

Implementing Centralized Logging

Logs from various sources are ingested into a centralized log store.
16

Challenges with detecting and responding to potential app health issues


 Some applications aren't mission critical.
 Maybe they're only used internally, and when a problem occurs, the user can
contact the team responsible and the application can be restarted.
 However, customers often have higher expectations for the applications they
consume.
 You should know when problems occur with your application before users do, or
before users notify you.
 Otherwise, the first you know about a problem may be when you notice an angry
deluge of social media posts deriding your application or even your organization.
 Some scenarios you may need to consider include:
• One service in your application keeps failing and restarting, resulting in intermittent slow
responses.
• At some times of the day, your application's response time is slow.
• After a recent deployment, load on the database has tripled.
 Implemented properly, monitoring can let you know about conditions that will lead
to problems, letting you address underlying conditions before they result in any
significant user impact.
17

Monitoring Cloud-native Apps


• Some centralized logging systems take on an additional role of collecting telemetry outside of pure logs.
• They can collect metrics, such as time to run a database query, average response time from a web server, and
even CPU load averages and memory pressure as reported by the operating system.
• In conjunction with the logs, these systems can provide a holistic view of the health of nodes in the system
and the application as a whole.
• The metric-gathering capabilities of the monitoring tools can also be fed manually from within the
application.
• Business flows that are of particular interest, such as new users signing up or orders being placed, may be instrumented such that they increment a counter in the central monitoring system (see the sketch below).
• This capability allows the monitoring tools to track not only the health of the application but also the health of the business.
• Queries can be constructed in the log aggregation tools to look for certain statistics or patterns, which can
then be displayed in graphical form, on custom dashboards.
• Frequently, teams will invest in large, wall-mounted displays that rotate through the statistics related to an
application.
• This way, it's simple to see the problems as they occur.
• Cloud-native monitoring tools provide real-time telemetry and insight into apps regardless of whether
they're single-process monolithic applications or distributed microservice architectures.
• They include tools that allow collection of data from the app as well as tools for querying and displaying
information about the app's health.
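
The slides don't prescribe a particular monitoring system. As one hedged illustration, the sketch below uses the Prometheus Python client to expose business-level counters (sign-ups, orders placed) that a central monitoring system can scrape; the metric names and port are assumptions.

```python
import time

# The Prometheus Python client is one possible metrics backend (an assumption,
# not something the slides prescribe).
from prometheus_client import Counter, start_http_server

# Business-level counters incremented from inside the application.
signups = Counter("user_signups_total", "New users signing up")
orders_placed = Counter("orders_placed_total", "Orders placed")


def handle_signup(user):
    # ... application logic for the sign-up flow ...
    signups.inc()  # feed the business metric to the monitoring system


def handle_order(order):
    # ... application logic for placing the order ...
    orders_placed.inc()


if __name__ == "__main__":
    start_http_server(8000)  # metrics exposed at http://localhost:8000/metrics
    while True:
        time.sleep(60)  # keep the process alive so the endpoint stays up
```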
18

Challenges with reacting to critical problems in cloud-native apps


 If you need to react to problems with your application, you need some way to alert the right
personnel.
 This depends on logging and monitoring.
 Your application needs to have logging in place to allow problems to be diagnosed, and in
some cases to feed into monitoring tools.
 It needs monitoring to aggregate application metrics and health data in one place.
 Once this has been established, rules can be created that will trigger alerts when certain
metrics fall outside of acceptable levels.
 Generally, alerts are layered on top of monitoring such that certain conditions trigger
appropriate alerts to notify team members of urgent problems.
 Some scenarios that may require alerts include:
• One of your application's services is not responding after 1 minute of downtime.
• Your application is returning unsuccessful HTTP responses to more than 1% of requests.
• Your application's average response time for key endpoints exceeds 2000 ms.
19

Alerts in Cloud-native Apps


 You can craft queries against the monitoring tools to look for known failure conditions.
 For instance, queries could search through the incoming logs for indications of HTTP
status code 500, which indicates a problem on a web server.
 As soon as one of these is detected, then an e-mail or an SMS could be sent to the
owner of the originating service who can begin to investigate.
 Typically, though, a single 500 error isn't enough to determine that a problem has occurred.
• It could mean that a user mistyped their password or entered some malformed data.
• The alert queries can be crafted to only fire when a larger-than-average number of 500 errors is detected (the sketch below illustrates this thresholding idea).
 One of the most damaging patterns in alerting is to fire too many alerts for humans to investigate.
• Service owners will rapidly become desensitized to errors that they've previously investigated and found to be benign.
• Then, when true errors occur, they'll be lost in the noise of hundreds of false positives.
 It's important to ensure that the alerts that do fire are indicative of a real problem.
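
In practice such rules are written in the monitoring tool's own query or alerting language. The following Python sketch only illustrates the thresholding idea: alert when 5xx responses exceed a share of recent requests rather than on every individual 500. The 1% threshold, the five-minute window, and the notify_owner hook are hypothetical.

```python
import time
from collections import deque


class ErrorRateAlert:
    """Fire only when 5xx responses exceed a share of recent requests,
    rather than alerting on every individual 500."""

    def __init__(self, threshold=0.01, window_seconds=300):
        self.threshold = threshold          # e.g. more than 1% of requests failing
        self.window_seconds = window_seconds
        self.events = deque()               # (timestamp, is_error) pairs

    def record(self, status_code):
        now = time.time()
        self.events.append((now, status_code >= 500))
        # drop events that have fallen outside the sliding window
        while self.events and self.events[0][0] < now - self.window_seconds:
            self.events.popleft()

    def should_alert(self):
        if not self.events:
            return False
        errors = sum(1 for _, is_error in self.events if is_error)
        return errors / len(self.events) > self.threshold


# alert = ErrorRateAlert()
# alert.record(500)
# if alert.should_alert():
#     notify_owner()  # hypothetical hook that sends an e-mail or SMS
```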
20

Elastic Stack (ELK)


• There are many good centralized logging tools and they vary in cost from
being free, open-source tools, to more expensive options.
• In many cases, the free tools are as good as or better than the paid offerings.
• One such tool is a combination of three open-source components:
Elasticsearch, Logstash, and Kibana.
• Collectively these tools are known as the Elastic Stack or ELK stack.
• Elastic Stack provides centralized logging in a low-cost, scalable, cloud-
friendly manner.
• Its user interface streamlines data analysis so you can spend your time
gleaning insights from your data instead of fighting with a clunky interface.
• It supports a wide variety of inputs so as your distributed application spans
more and different kinds of services, you can expect to continue to be able to
feed log and metric data into the system.
• The Elastic Stack also supports fast searches even across large data sets,
making it possible even for large applications to log detailed data and still be
able to have visibility into it in a performant fashion.
21

Logstash
• The first component of ELK is Logstash.
• This tool is used to gather log information from a large variety of different
sources.
• For instance, Logstash can read logs from disk and also receive messages
from logging libraries.
• Logstash can do some basic filtering and expansion on the logs as they
arrive.
• For instance, if the logs contain IP addresses then Logstash may be
configured to do a geographical lookup and obtain a country/region or even
city of origin for that message.
22

Elasticsearch
• Elasticsearch is a powerful search engine that can index logs as they arrive.
• It makes running queries against the logs quick.
• Elasticsearch can handle huge quantities of logs and, in extreme cases, can be
scaled out across many nodes.
• Log messages that have been crafted to contain parameters or that have had
parameters split from them through Logstash processing, can be queried
directly as Elasticsearch preserves this information.
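
A minimal sketch of indexing a structured log document into Elasticsearch over its REST API, assuming a local, unsecured node on the default port. In a real deployment, Logstash or a logging-library shipper would normally do this; the index name and field names here are illustrative.

```python
import datetime

import requests  # assumed HTTP client


def index_log(message, level="INFO", service="orders",
              es_url="http://localhost:9200", index="app-logs"):
    """Index one structured log document so its fields can be queried directly."""
    doc = {
        "@timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "level": level,
        "service": service,
        "message": message,
    }
    # POST /<index>/_doc is Elasticsearch's document-indexing endpoint.
    response = requests.post(f"{es_url}/{index}/_doc", json=doc, timeout=5)
    response.raise_for_status()


# index_log("Order 42 placed")
```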
23

Kibana
• The final component of the stack is Kibana.
• This tool is used to provide interactive visualizations in a web dashboard.
• Dashboards may be crafted even by users who are non-technical.
• Most data that is resident in the Elasticsearch index, can be included in the
Kibana dashboards.
• Individual users may have different dashboard needs, and Kibana enables this
customization by allowing user-specific dashboards.
24

Outline
• Introduction
• Pillars of Cloud Native
• Cloud Native Applications
• Cloud-native Communication Patterns
• Cloud-native Data Patterns
• Cloud-native Resiliency
• Monitoring & Health
• DevOps
25

DevOps
• Cloud-native applications have clear advantages in terms of speed of development, stability,
and scalability, but managing them can be quite a bit more difficult.
• Years ago, it wasn't uncommon for the process of moving an application from development to
production to take a month, or even more.
• Companies released software on a six-month or even yearly cadence.
• It's now fairly well established that being able to release software rapidly gives fast-moving companies a huge market advantage over their more sloth-like competitors.
• The patterns and practices that enable faster, more reliable releases to deliver value to the
business are collectively known as DevOps.
• They consist of a wide range of ideas spanning the entire software development life cycle
from specifying an application all the way up to delivering and operating that application.
• Donovan Brown (cloud advocate and DevOps evangelist): "DevOps is the union of people, process, and products to enable continuous delivery of value to our end users."
26

DevOps and Cloud-native


• Microservices and cloud-native applications go hand in hand with good DevOps practices.
• DevOps emerged before microservices, and it's likely that the movement towards smaller, more fit-for-purpose services wouldn't have been possible without DevOps to make releasing and operating not just one but many applications in production easier.
• Through good DevOps practices, it's possible to realize the advantages of cloud-native
applications without suffocating under a mountain of work actually operating the applications.
27

GitHub Actions
• Founded in 2009, GitHub is a widely popular web-based repository for
hosting projects, documentation, and code.
• GitHub uses the open-source, distributed version control system named Git
as its foundation.
• On top, it then adds its own set of features, including defect tracking, feature requests, pull requests, task management, and wikis for each code base.
• As GitHub evolves, it too is adding DevOps features.
• For example, GitHub has its own continuous integration/continuous delivery
(CI/CD) pipeline, called GitHub Actions.
• GitHub Actions is a community-powered workflow automation tool.
• It lets DevOps teams integrate with their existing tooling, mix and match new products, and hook into their software lifecycle, including existing CI/CD partners.
• GitHub has over 40 million users, making it the largest host of source code in
the world.
28

Source Control
• Organizing the code for a cloud-native application can be challenging.
• Instead of a single giant application, cloud-native applications tend to be made up of a web of smaller applications that talk with one another.
• The best arrangement of code remains an open question.
• There are examples of successful applications using different kinds of
layouts, but two variants seem to have the most popularity.
29

Repository per Microservice - Advantages


 At first glance, this approach seems like the most logical approach to splitting up the source code for
microservices.
 Each repository can contain the code needed to build the one microservice.
 The advantages to this approach are readily visible:
• Instructions for building and maintaining the application can be added to a README file at the root of each
repository.
• When flipping through the repositories, it's easy to find these instructions, reducing spin-up time for developers.
• Every service is located in a logical place, easily found by knowing the name of the service.
• Builds can easily be set up such that they're only triggered when a change is made to the owning repository.
• The number of changes coming into a repository is limited to the small number of developers working on the
project.
• Security is easy to set up by restricting the repositories to which developers have read and write permissions.
• Repository level settings can be changed by the owning team with a minimum of discussion with others.
 One of the key ideas behind microservices is that services should be siloed and separated from
each other.
• When using Domain-Driven Design to decide on the boundaries for services, the services act as transactional boundaries.
• Database updates shouldn't span multiple services.
• This collection of related data is referred to as a bounded context.
• This idea is reflected by the isolation of microservice data to a database separate and autonomous from the rest
of the services.
• It makes a great deal of sense to carry this idea all the way through to the source code.
30

Repository per Microservice - Limitations


 However, this approach isn't without its issues.
 One of the more challenging development problems is managing
dependencies.
• If a dependency is updated, then downstream packages must also update this
dependency.
• Unfortunately, that takes development work so, invariably, each code directory ends
up with multiple versions of a single package, each one a dependency of some
other package that is versioned at a slightly different cadence.
• When deploying an application, which version of a dependency should be used?
− The version that is currently in production?
− The version that is currently in Beta but is likely to be in production by the time the consumer
makes it to production?
− Difficult problems that aren't resolved by just using microservices.
31

Repository per Microservice - Limitations


 Another disadvantage presents itself when moving code between services.
• Although it would be nice to believe that the first division of an application into microservices is 100% correct, the reality is that we're rarely so prescient as to make no service-division mistakes.
• Thus, functionality and the code that drives it will need to move from service to service:
repository to repository.
• When leaping from one repository to another, the code loses its history.
• There are many cases, especially in the event of an audit, where having full history on a piece of
code is invaluable.
 The final and most important disadvantage is coordinating changes.
• In a true microservices application, there should be no deployment dependencies between
services.
• It should be possible to deploy services A, B, and C in any order as they have loose coupling.
• In reality, however, there are times when it's desirable to make a change that crosses multiple repositories at the same time.
• To do a cross-repository change requires a commit to each repository be made in succession.
• Each change in each repository will need to be pull-requested and reviewed separately.
• This activity can be difficult to coordinate.
32

Single Repository - Advantages


 An alternative to using many repositories is to put all the source code together in a giant, all-knowing, single repository.
 In this approach, sometimes referred to as a monorepository, all the source code for every
service is put into the same repository.
 At first, this approach seems like a terrible idea likely to make dealing with source code
unwieldy.
 There are, however, some marked advantages to working this way.
 The first advantage is that it's easier to manage dependencies between projects.
• Instead of relying on some external artifact feed, projects can directly import one another.
• This means that updates are instant, and conflicting versions are likely to be found at compile time on the
developer's workstation.
• In effect, shifting some of the integration testing left.
• When moving code between projects, it's now easier to preserve the history as the files will be detected
as having been moved rather than being rewritten.
 Another advantage is that wide ranging changes that cross service boundaries can be made in
a single commit.
• This activity reduces the overhead of having potentially dozens of changes to review individually.
• There are many tools that can perform static analysis of code to detect insecure programming practices or
problematic use of APIs.
• In a multi-repository world, each repository will need to be iterated over to find the problems in them.
• The single repository allows running the analysis all in one place.
33

Single Repository - Limitations


 There are also many disadvantages to the single repository approach.
 One of the most worrying ones is that having a single repository raises security concerns.
• If the contents of a repository are leaked in a repository-per-service model, the amount of code lost is minimal.
• With a single repository, everything the company owns could be lost.
• Having multiple repositories exposes less surface area, which is a desirable trait in most security practices.
 The size of the single repository is likely to become unmanageable rapidly.
• This presents some interesting performance implications.
• It may become necessary to use specialized tools such as Virtual File System for Git, which was originally designed to improve the experience for developers on the Windows team.
34

Single Repository
 Frequently the argument for using a single repository boils down to an argument that
Facebook or Google use this method for source code arrangement.
• If the approach is good enough for these companies, then, surely, it's the correct approach for all
companies.
• The truth of the matter is that few companies operate on anything like the scale of Facebook or
Google.
The problems that occur at those scales are different from those most developers will face.
 In the end, either solution can be used to host the source code for microservices.
 However, in most cases, the management and engineering overhead of operating in a
single repository isn't worth the meager advantages.
 Splitting code up over multiple repositories encourages better separation of concerns
and encourages autonomy among development teams.
35

Standard Directory Structure


• Regardless of the single versus multiple repositories debate,
each service will have its own directory.
• One of the best optimizations to allow developers to cross
between projects quickly is to maintain a standard directory
structure.
• Whenever a new project is created, a template that puts in
place the correct structure should be used.
• This template can also include such useful items as a
skeleton README file.
• In any microservice architecture, a high degree of variance
between projects makes bulk operations against the services
more difficult.
• There are many tools that can provide templating for an
entire directory, containing several source code directories.
• GitHub has recently released Repository Templates, which
provide much of the same functionality.
36

CI/CD Pipelines
• Almost no change in the software development life cycle has been so revolutionary as
the advent of continuous integration (CI) and continuous delivery (CD).
• Building and running automated tests against the source code of a project as soon as a
change is checked in catches mistakes early.
• Prior to the advent of continuous integration builds, it wouldn't be uncommon to pull
code from the repository and find that it didn't pass tests or couldn't even be built.
• This resulted in tracking down the source of the breakage.
• Traditionally shipping software to the production environment required extensive
documentation and a list of steps.
• Each one of these steps needed to be manually completed in a very error-prone
process.
• Continuous integration is followed by continuous delivery in which the freshly built
packages are deployed to an environment.
• The manual process can't scale to match the speed of development so automation
becomes more important.
• Checklists are replaced by scripts that can execute the same tasks faster and more
accurately than any human.
37

CI/CD Pipelines
• The environment to which continuous delivery delivers might be a test environment or,
as is being done by many major technology companies, it could be the production
environment.
• The latter requires an investment in high-quality tests that can give confidence that a change isn't going to break production for users. In the same way that continuous integration caught issues in the code early, continuous delivery catches issues in the deployment process early.
• The importance of automating the build and delivery process is accentuated by cloud-
native applications.
• Deployments happen more frequently and to more environments so manually
deploying borders on impossible.
• There's no cost to configuring many build pipelines, so it's advantageous to have at
least one build pipeline per microservice.
• Ideally, microservices are independently deployable to any environment so having each
one able to be released via its own pipeline without releasing a mass of unrelated code
is perfect.
• Each pipeline can have its own set of approvals allowing for variations in build process
for each service.
38

Feature Flags
• Cloud native is about speed and agility.
• Users expect rapid responsiveness, innovative features, and zero downtime.
• Feature flags are a modern deployment technique that helps increase agility for cloud-
native applications.
• They enable you to deploy new features into a production environment but restrict their
availability.
• You can activate a new feature for specific users without restarting the app or deploying
new code.
• They separate the release of new features from their code deployment.
• Feature flags are built upon conditional logic that controls visibility of functionality for users at run time, as sketched below.
• In modern cloud-native systems, it's common to deploy new features into production
early, but test them with a limited audience.
• As confidence increases, the feature can be incrementally rolled out to wider
audiences.
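
A hedged sketch of the conditional logic behind a feature flag: the flag store, flag names, and user-targeting rule below are illustrative assumptions. Managed feature-flag services or libraries typically provide this, but the run-time check looks much the same.

```python
# Illustrative in-memory flag store; a real system would load flags from a
# configuration service so they can change at run time without a deployment.
FLAGS = {
    # enabled for a small test audience first, widened as confidence grows
    "new-checkout": {"enabled": True, "allowed_users": {"alice", "bob"}},
    # None means the feature is visible to everyone
    "premium-reports": {"enabled": True, "allowed_users": None},
}


def is_enabled(flag_name, user):
    flag = FLAGS.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False
    allowed = flag["allowed_users"]
    return allowed is None or user in allowed


def new_checkout_flow(user):
    return f"{user}: new checkout"      # deployed code, released only to some users


def legacy_checkout_flow(user):
    return f"{user}: legacy checkout"   # everyone else keeps the existing path


def checkout(user):
    # The run-time conditional that separates release from deployment.
    if is_enabled("new-checkout", user):
        return new_checkout_flow(user)
    return legacy_checkout_flow(user)
```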
39

Feature Flags
 Other use cases for feature flags include:
• Restrict premium functionality to specific customer groups willing to pay higher subscription fees.
• Stabilize a system by quickly deactivating a problem feature, avoiding the risks of a rollback or
immediate hotfix.
• Disable an optional feature with high resource consumption during peak usage periods.
• Conduct experimental feature releases to small user segments to validate feasibility and
popularity.
 Feature flags also promote trunk-based development.
• It's a source-control branching model where developers collaborate on features in a single
branch.
• The approach minimizes the risk and complexity of merging large numbers of long-running
feature branches.
• Features are unavailable until activated.
40

Infrastructure as Code
• Cloud-native systems embrace microservices, containers, and
modern system design to achieve speed and agility.
• They provide automated build and release stages to ensure
consistent and quality code.
• Modern cloud-native applications also embrace the widely accepted
practice of Infrastructure as Code, or IaC.
• With IaC, platform provisioning is automated.
• Software engineering practices such as testing and versioning are
applied to DevOps practices.
• Infrastructure and deployments are automated, consistent, and
repeatable.
• Just as continuous delivery automated the traditional model of manual
deployments, Infrastructure as Code (IaC) is evolving how application
environments are managed.
41

Terraform
• Cloud-native applications are often constructed to be cloud agnostic.
• Being so means the application isn't tightly coupled to a particular cloud vendor and
can be deployed to any public cloud.
• Terraform is a commercial templating tool that can provision cloud-native applications
across all the major cloud players: Azure, Google Cloud Platform, AWS, and AliCloud.
• Instead of using JSON as the template definition language, it uses the slightly more
terse HCL (Hashicorp Configuration Language).
• Terraform also provides intuitive error messages for problem templates.
• There's also a handy validate task that can be used in the build phase to catch
template errors early.
• Command-line tools are available to deploy Terraform templates.
• Sometimes Terraform outputs meaningful values, such as a connection string to a newly
created database.
• This information can be captured in the build pipeline and used in subsequent tasks.
42

Terraform - Example
43

Cloud Native Application Bundles


 A key property of cloud-native applications is that they leverage the capabilities of the cloud
to speed up development.
 This design often means that a full application uses different kinds of technologies.
 Applications may be shipped in Docker containers, some services may use Azure Functions,
while other parts may run directly on virtual machines allocated on large metal servers with
hardware GPU acceleration.
 No two cloud-native applications are the same, so it's been difficult to provide a single
mechanism for shipping them.
• The Docker containers may run on Kubernetes using a Helm Chart for deployment.
• The Azure Functions may be allocated using Terraform templates.
• Finally, the virtual machines may be allocated using Terraform but built out using Ansible.
 This is a large variety of technologies and there has been no way to package them all
together into a reasonable package.
44

Cloud Native Application Bundles


• Cloud Native Application Bundles (CNABs) are a joint effort by many community-minded
companies such as Microsoft, Docker, and HashiCorp to develop a specification to package
distributed applications.
• The effort was announced in December of 2018.
• The CNABs can contain different kinds of installation technologies.
• This aspect allows things like Helm Charts, Terraform templates, and Ansible Playbooks to
coexist in the same package.
• Once built, the packages are self-contained and portable; they can be installed from a USB
stick.
• The packages are cryptographically signed to ensure they originate from the party they
claim.
• The CNAB format is also flexible, allowing it to be used against any cloud.
It can even be used against on-premises solutions.
45

Bundle.json
• The core of a CNAB is a file called bundle.json.
• This file defines the contents of the bundle, be
they Terraform or images or anything else.
• The figure defines a CNAB that invokes some
Terraform.
• Notice, however, that it actually defines an
invocation image that is used to invoke the
Terraform.
• When packaged up, the Dockerfile that is located
in the cnab directory is built into a Docker image,
which will be included in the bundle.
• Having Terraform installed inside a Docker
container in the bundle means that users don't
need to have Terraform installed on their machine
to run the bundle.
• The bundle.json also defines a set of parameters
that are passed down into the Terraform.
• Parameterization of the bundle allows for
installation in various different environments.
46

Readings
• Architecting Cloud Native .NET Applications for Azure.
https://dotnet.microsoft.com/en-us/download/e-book/cloud-native-azure/pdf
