Feature Flag Best Practices
1. Introduction
2. The Moving Parts of a Feature-Flagging System
a. Performance
b. Configuration Lag
c. Security
d. Implementation Complexity
a. Code First
b. Data First
c. Big Bang
d. Expand-Contract Migrations
e. Duplicate Writes and Dark Reads
f. Working with Databases in a Feature-Flagged World
i. Inference
ii. Causality
b. Categories of Feedback
12. Summary
Chapter 1. Introduction
At its core, feature flagging is about your software being able to choose
between two or more different execution paths, based upon a flag
configuration, often taking into account runtime context (i.e., which user is
making the current web request). A toggle router decides the execution
path based on runtime context and flag configuration.
To achieve this, we modify our checkout page rendering code so that there
are two different execution paths available at a specific toggle point:
renderCheckoutButton() {
  if (
    features
      .for({ user: currentUser })
      .isEnabled("showReallyBigCheckoutButton")
  ) {
    return renderReallyBigCheckoutButton();
  } else {
    return renderRegularCheckoutButton();
  }
}
Every time the checkout page is rendered, our software will use that if
statement (the toggle point) to select an execution path. It does this by
asking the feature-flagging system’s toggle router whether the
showReallyBigCheckoutButton feature is enabled for the current
user requesting the page (the current user is our runtime context). The
toggle router uses that flag’s configuration to decide whether to enable
that feature for each user.
Let’s assume that the configuration says to show the really big checkout
button to 10% of users. The router would first bucket the user, randomly
assigning that individual to one of 100 different buckets. The router would
then report that the feature is enabled if the current user has landed in
buckets 0 through 9, and disabled if they've landed in any of the
remaining buckets (10 through 99).
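To make this concrete, here's a minimal sketch of how a toggle router might implement percentage-based bucketing. The hash function and API shape are assumptions for illustration; real systems typically use a stable hash of the user ID and flag name so that each user always lands in the same bucket for a given flag.

// Sketch of percentage-based bucketing (illustrative, not a real library).
// Hashing the user ID together with the flag name gives each user a
// stable bucket per flag, so re-evaluating the flag yields the same answer.
function bucketFor(userId, flagName) {
  const key = flagName + ":" + userId;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // simple 32-bit string hash
  }
  return hash % 100; // buckets 0 through 99
}

function isEnabled(userId, flagName, rolloutPercentage) {
  // Enabled if the user's bucket falls below the rollout percentage:
  // at 10%, buckets 0 through 9 see the feature.
  return bucketFor(userId, flagName) < rolloutPercentage;
}

Because the bucket depends only on the user and the flag, raising the rollout percentage from 10 to 60 simply raises the threshold: users in buckets 0 through 9 stay enabled, which is why existing users keep seeing the feature as it ramps up.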
You, as the person in charge of the rollout, set up this new feature to
initially be seen by 10% of the user population. Exposing the feature from
zero to a subset of the population is often called ramping up a feature.
Here, a feature was ramped to 10%.
Increasing the exposure to a broader user population should not affect the
current exposure of variations to users—if a user experienced a feature
when it ramped to 40%, that user should continue to see it as it ramps to
60%, 80%, and so on. In other words, existing allocations should remain intact.
A particular case occurs if you were to “de-ramp” (reduce exposure of) the
feature; for example, reducing exposure from 10% to 5%, as in Figure 3-2.
We know that user A was part of the “on” group in the 10% sample.
Unless your feature-flagging system has the notion of “memory” to
remember the prior allocation of user A, there is little you can do to keep
user A in the "on" group when reducing exposure, simply because we don't
know a priori whether user A will land in the "on" or "off" group.
Figure 3-2. Flag consistency during feature de-ramp
Chapter 4. Best Practice #2:
Bridge the “Anonymous” to
“Logged-In” Transition
When dealing with an anonymous user, you first need to decide whether
it’s important to maintain feature-flag consistency during the transition
from visitor ID to user ID.
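If you do decide consistency matters, one common shape for the solution is sketched below: evaluate flags against the visitor ID before login, and at login time tell the flagging system that the visitor and the user are the same person. The features.alias call is hypothetical; it stands in for whatever identity-linking mechanism your flagging system provides.

// Sketch: keeping flag allocations consistent across the
// anonymous-to-logged-in transition. `features.alias` is hypothetical.
function onLogin(visitorId, user) {
  // Link the two identities so the user inherits the bucket
  // allocations made while they were an anonymous visitor.
  features.alias({ visitorId: visitorId, userId: user.id });
}

// Before login, flags are evaluated against the visitor ID...
features.for({ visitorId: visitorId }).isEnabled("showReallyBigCheckoutButton");
// ...and after login, against the user ID, with the alias preserving
// the allocation made while the user was anonymous.
features.for({ user: currentUser }).isEnabled("showReallyBigCheckoutButton");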
Performance
By moving flagging decisions to the server side, you can improve user-perceived
performance. Single-page applications already make a server-side
call to fetch the data needed to render the UI. That same round trip can
also call the feature-flag service, so a single network request returns all
feature-flag evaluations along with the server-side data.
Configuration Lag
One way in which engineers improve performance of an application is to
cache data locally, thereby reducing network latency. This has an impact
on where the feature flag decision should be made. You could opt to
proactively request all flagging decisions for a specific runtime context
(e.g., current user, browser, and geolocation) from the server. Or, you
could just request the current feature-flagging configuration and make
flagging decisions using a client-side toggle router. In both approaches,
you are at risk of Configuration Lag.
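As an illustration, here's a minimal sketch of the second approach: a client-side toggle router working from a periodically refreshed, cached copy of the flag configuration. The fetchFlagConfig call is hypothetical; the point is that any cached copy can lag behind the server by up to the refresh interval.

// Sketch of a client-side toggle router using a cached configuration.
// `fetchFlagConfig` is a hypothetical call to your flagging service; a
// server-side flag change may take up to the refresh interval (here,
// 60 seconds) to be reflected in decisions made by this client.
let cachedConfig = {};

async function refreshConfig() {
  cachedConfig = await fetchFlagConfig();
}

setInterval(refreshConfig, 60 * 1000);

function isEnabled(flagName) {
  const flag = cachedConfig[flagName];
  if (!flag) return false; // unknown flags default to off
  return flag.enabled === true; // a real router would also apply rollout rules
}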
Security
Whenever you move a feature-flagging decision to the client, you’re
exposing information about the existence of those decisions—anyone
who’s able to log in to your product can potentially see what product
features are under active management and can also manipulate which
variants they experience. If you’re concerned about industrial espionage or
particularly nosy tech journalists, this might be relevant, but that’s
unlikely to be the case for the typical feature-flag practitioner.
Implementation Complexity
Most delivery teams working with feature flags need the ability to make a
server-side toggling decision. If a team also begins making toggling
decisions on the client side, it significantly increases the complexity of its
feature-flagging system. There will now be two parallel implementations,
which are likely to be implemented in multiple languages (unless you’ve
opted to implement your backend in JavaScript, in addition to your
frontend). These parallel implementations need to remain synchronized
and make consistent toggling decisions. And, as discussed earlier, if you
begin adding client-side caching into the mix, things can get quite
complicated.
Code First
We can perform the code deployment first, shown in Figure 6-2, making
sure that the new version of our code is backward-compatible with the
existing database schema.
Data First
Alternatively, we can perform our database migration first, as shown in
Figure 6-3, which means that we must ensure that the new schema is
backward-compatible with the existing code.
Figure 6-3. Data-first approach
Big Bang
In simple systems, there’s a third option (Figure 6-4): update data and
code simultaneously in a lockstep deployment in which you stop your
system, update your data to support your code change, and then restart the
system with your new code.
With the Big-Bang approach, you don’t need to worry about backward or
forward compatibility, because you’re updating both data and code in
concert. New code will never see old data, and old code will never see
new data.
Expand-Contract Migrations
When a feature-flagged code change requires a corresponding data schema
migration, this migration must be performed as a series of backward- or
forward-compatible changes, sometimes referred to as an Expand-
Contract migration, or a Parallel Change. The technique is called
Expand-Contract because the series of changes will consist of an initial
data-first change that “expands” the schema to accommodate your code
change, followed by a code-first change that “contracts” the schema to
remove aspects that are no longer needed.
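As a sketch, here's what the middle of an Expand-Contract migration might look like in application code, assuming we're migrating a single address column into structured fields. The field names and the ordersRepository helper are invented for illustration; during this phase the expanded schema holds both representations so that old and new code can coexist.

// Sketch of the duplicate-write step of an Expand-Contract migration:
// the schema has been expanded with new structured address fields, and
// the new code writes both the old and new representations so either
// version of the code reads consistent data. Names are illustrative.
function saveShippingAddress(order, address) {
  order.address = formatAsSingleLine(address); // old column, contracted later
  order.addressStreet = address.street;        // new expanded columns
  order.addressCity = address.city;
  order.addressPostcode = address.postcode;
  return ordersRepository.save(order);         // hypothetical persistence helper
}

function formatAsSingleLine(address) {
  return `${address.street}, ${address.city} ${address.postcode}`;
}

Once every reader has switched to the new columns, a final code-first change stops writing the old column, and the schema can then be contracted to drop it.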
When a user is done filling their cart and is ready to check out, they make
a request to the Web App, which in turn asks the Checkout service to
create a new checkout. The Checkout service does this and returns
information about the checkout back to the Web App, including details
about the shipping costs associated with the items in this checkout (at
Acme, shipping costs are not calculated until the user finalizes the order
and checks out).
Figure 7-2. Placing the “free shipping” flag decision in the Checkout service
But is that really the best place to make this flagging decision? Suppose
that we initially want to roll out free shipping for internal testing (an
example of the virtual UAT technique that we discussed earlier). This
means that we’d need to take into account which user is requesting the
checkout when deciding whether to allow free shipping. The user would
need to be part of the runtime context for the flagging decision. However,
our Checkout service doesn’t know anything about users—its sole
responsibility is creating and managing checkouts.
One way to solve this would be to have the Web App pass the necessary
runtime context through to the Checkout service, as demonstrated in
Figure 7-3, by telling it which user is requesting to check out. This would
allow the Checkout service to make the flagging decision locally.
Figure 7-3. Web App passes the necessary runtime context through to the Checkout
service
On the other hand, the Web App is already deeply aware of the concept of
users and already has the context of which user is requesting a checkout.
The Web App is actually in a better position to make our “free shipping”
flagging decision, as shown in Figure 7-4.
Figure 7-4. Placing the “free shipping” feature decision in the Web App
Now, every time a user proceeds to checkout, the web app will use the
feature-flagging system to decide whether this user should receive free
shipping. That decision is then passed to the Checkout service as an extra
parameter when creating the checkout. Note that the Checkout service still
has the responsibility of determining whether the order is eligible for free
shipping; that is, whether the total value of the order is more than $50. We've
avoided having to change the scope of our various components just to
support a flagging decision, and we’ve kept the underlying business logic
that’s powering the feature in the right place.
We could create a new feature flag called “free shipping banner,” and use
that to manage the display of the banner ad. But we’d need to make sure
that this banner wasn’t ever on when the other “free shipping” feature was
off; otherwise, we'd be back to grumpy customers unhappy that they
aren't getting free shipping.
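One lightweight way to enforce that dependency, sketched below, is to have the banner's toggle point consult both flags, so the banner can never show while the underlying feature is off. The flag names are carried over from the example above.

// Sketch: the banner toggle point checks both flags, so the banner
// can never appear while the "free shipping" feature itself is off.
function shouldShowFreeShippingBanner(currentUser) {
  const userFlags = features.for({ user: currentUser });
  return (
    userFlags.isEnabled("freeShipping") &&
    userFlags.isEnabled("freeShippingBanner")
  );
}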
Although this might seem like an obvious best practice, it’s sometimes not
so obvious when flags are grouped by team, and the team implementing
the shipping calculation code is very disconnected from the team
implementing the banner ad. Ideally, your product delivery teams are
already oriented around product features (e.g., the product detail page
team, the search team, the home page team) rather than technology or
projects (e.g., the frontend team, the performance optimization team). This
reduces the number of features that require cross-team collaboration, but
does not eliminate it. Some features will always require changes across
multiple teams. When that happens, it’s OK to bend the rules and have a
flag that’s being used by multiple teams. You should, however, still
always aim to have each flag owned by a clearly identified team. That
team is responsible for driving rollout of the feature, monitoring
performance, and so on.
Chapter 9. Best Practice #7:
Consider Testability
There are two approaches here. On one end, we have high-level testing,
often called end-to-end testing or black-box testing. This approach tests
the functionality from the outermost layer and ensures that the expected
behavior is working without caring how the underlying components
operate. When using feature flags, high-level testing must ensure that the
application produces the expected behavior when the feature is turned
on.
At the other end, when pursuing lower-level unit testing, you should try
to isolate the functionality the flag is gating and write tests that target
both behaviors: the functionality when the flag is on and when the
flag is off.
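As a sketch, here's how a unit test might pin the flag to each state and exercise both paths of the toggle point from Chapter 1 (treating renderCheckoutButton as a standalone function). The features.withOverride helper is hypothetical, standing in for whatever your flagging SDK provides for forcing a flag's value in tests; the test syntax is Jest-style.

// Sketch: unit tests covering both flag states. `features.withOverride`
// is a hypothetical helper that pins a flag's value for the duration of
// the callback; describe/test/expect are Jest-style.
describe("renderCheckoutButton", () => {
  test("renders the really big button when the flag is on", () => {
    features.withOverride("showReallyBigCheckoutButton", true, () => {
      expect(renderCheckoutButton()).toEqual(renderReallyBigCheckoutButton());
    });
  });

  test("renders the regular button when the flag is off", () => {
    features.withOverride("showReallyBigCheckoutButton", false, () => {
      expect(renderCheckoutButton()).toEqual(renderRegularCheckoutButton());
    });
  });
});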
Last, it’s easy to fall into the temptation of writing tests for all possible
combinations of flags. We advise reducing the scope of the test
components to span only a handful of flags or isolate the test so that the
tester can test the main target flag with the rest of the flags turned on.
Chapter 10. Best Practice #8:
Have a Plan for Working with
Flags at Scale
As engineers from different teams create, change, and roll out feature flags
across the application stack, tracking and cleanup can get out of hand over
time. Maintaining the who/what/why of feature flags and establishing a
basic process for identification and tracking will reduce complexity down
the road.
The feature name example that follows has three parts. The first part is
the name of the section the feature is gating; in this example, the feature is
gating functionality in the admin section. The second part indicates what
the feature does, named self-explanatorily: make the new invite flow
visible to users. And the last part indicates where in the stack the feature
is located: here it belongs to the backend layer of the application.
This naming pattern looks like this:
section_featurepurpose_layer
admin_panel_new_invite_flow_back_end
Here are a couple of other examples of the same feature but in different
layers of the stack:
admin_panel_new_invite_flow_front_end
admin_panel_new_invite_flow_batch
An alternative naming structure can include the team that created and
owns the flag; for instance, Data Science, Growth, Data Infrastructure.
You might also want to include the name of the service for which the flag
is used: web app, router, data writer, and so on.
One useful technique is to add a flag retirement task to the team’s work
backlog whenever a flag is created. However, these tasks can have a nasty
tendency of being perpetually deprioritized, always one or two weeks
away from being tackled.
Assigning an exact expiry date to every flag when it is created can help
break through this tendency to prioritize the urgent over the important.
Your feature-flagging system should have some way to communicate this
information, ideally with a highly visible warning if a flag has expired.
You can also opt to place a limit on the number of active flags a given
team has under management. This incentivizes the removal of old flags to
make room for a new flag.
Flagging Metadata
Attaching information like expiration dates and ownership to a flag is a
specific example of a more general capability: the ability to associate
metadata with your flags.
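For example, a flag definition carrying such metadata might look like the following sketch; the exact fields are illustrative and will depend on your flagging system.

// Sketch of flag metadata; the field names and values are illustrative.
const flagDefinition = {
  name: "admin_panel_new_invite_flow_back_end",
  owner: "growth-team",
  createdBy: "jsmith",
  createdAt: "2019-03-01",
  expiresAt: "2019-06-01", // past this date, the system flags it for cleanup
  description: "Gates the new invite flow in the admin panel (backend)",
};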
Inference
If we publish changes to our feature-flagging configuration as a separate
stream of instrumentation events, we can use the timing of these changes
as a way to correlate a feature-flag change with its effect. For example, if
we see 10% of servers having an increase in CPU utilization at the same
time as a feature flag change that rolled out to 10% of servers, we can
pretty easily infer a correlation.
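A sketch of what publishing those configuration changes might look like follows; the telemetry.emit call and the event shape are assumptions for illustration.

// Sketch: publish each flag-configuration change as an event so that
// monitoring tools can line it up against other time series.
// `telemetry.emit` and the event shape are illustrative.
function onFlagConfigChange(flagName, oldConfig, newConfig) {
  telemetry.emit("feature_flag_config_changed", {
    flag: flagName,
    previousRollout: oldConfig.rolloutPercentage,
    newRollout: newConfig.rolloutPercentage,
    timestamp: Date.now(),
  });
}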
This approach enables correlation in most simple scenarios but has some
drawbacks. In the case of a 50% feature rollout, it will be difficult to
know which effect is caused by the flag being on and which by it being
off. It's also more difficult to draw correlations when the impact of a
feature flag takes some time to appear; for example, a change that causes a
slow memory leak, or a change that affects a user's behavior in later stages
of a conversion funnel. The fundamental issue is that we're only inferring
the correlation between a feature change and its effects.
Causality
A more sophisticated approach is to include contextual information about
feature-flag state within an analytics event. This approach, most
commonly described as experimentation, ties metrics to a feature flag to
measure the specific impact of each change. For example, the analytics
event that reports whether a user clicked a button can also include
metadata about the current state of your feature flags. This allows a much
richer correlation, as you can easily segment your analytics based on flag
state to detect significant changes in behavior. Conversely, given some
change in behavior, you can look for any statistically significant
correlation to the state of flags. Experimentation establishes a feedback
loop from the end user back to development and test teams for feature
iteration and quality control.
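As a sketch, the enriched analytics event might look like this; analytics.track and features.allFlagsFor are illustrative stand-ins for your analytics pipeline and flagging SDK.

// Sketch: an analytics event enriched with the current flag state, so
// that behavior can later be segmented by flag. `analytics.track` and
// `features.allFlagsFor` are illustrative stand-ins.
function onCheckoutButtonClicked(currentUser) {
  analytics.track("checkout_button_clicked", {
    userId: currentUser.id,
    featureFlags: features.allFlagsFor({ user: currentUser }),
    // e.g., { showReallyBigCheckoutButton: true, freeShipping: false }
  });
}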
Categories of Feedback
You might already have noticed from the discussion so far that a feature
change can have a broad variety of effects. We might observe an impact
on technical metrics like CPU usage, memory utilization, or request
latency. Alternatively, we might also be looking for changes in higher-
level business metrics like bounce rate, conversion rate, or average order
value.
Ideally, we will look across both categories when looking for the impact
of a feature change. One thing to note when looking to correlate a feature
change with its effects is that business analytics is typically oriented
around the user, at least for a typical B2C product, whereas technical
metrics usually focus on things like servers and processes. You’ll likely
want to use different contextual information for these different categories
of feedback.
Chapter 12. Summary
In this book, we’ve offered advanced users of feature flags some best
practices for working with feature flags. Following tips such as
maintaining flag consistency in different scenarios, development and
testing with feature flags, and working with feature flags at scale, will help
you to manage a growing feature-flag practice.