10 Best Practices For Infrastructure As Code With Terraform
Table of Contents
Overview
Conclusion
References
Overview
Terraform has become a de facto standard for Infrastructure as Code. In this article, we
present 10 best practices for using Terraform at scale in production. These practices are
opinionated and based on how we’ve seen successful organizations adopt Terraform.
Below is an example of connecting your VCS to Terraform Enterprise (TFE) or Terraform Cloud
(TFC).
It is very important to store Terraform state files securely. Some organizations have
standardized on an S3 bucket in AWS or Blob Storage in Azure. However, care must be taken to
make sure that these storage locations are private and that the state files are encrypted at rest.
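As a minimal sketch of the S3 option, the backend block below asks Terraform to keep its state in a bucket with server-side encryption turned on. The bucket name, key, and region are placeholder assumptions, and keeping the bucket private (blocking public access, restricting IAM permissions) has to be configured on the bucket itself.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-org-terraform-state" # hypothetical bucket name
    key     = "network/terraform.tfstate"   # hypothetical path of the state object
    region  = "us-east-1"                   # assumed region
    encrypt = true                          # server-side encryption of the state object
  }
}
```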
An alternative is to adopt TFE or TFC for secure state file storage. TFE and TFC securely
manage your state files, which helps prevent:
● Leakage of credentials that are embedded in your state files
● Corruption and/or loss of state files
When it comes to provisioning VMs, you could choose to use a vanilla image from your favourite
cloud provider or build your own. Packer is a great solution to create VM images for cloud
providers and for vSphere on-premises.
● Put into your image any dependencies your app needs to run: libraries, Windows
security updates, etc.
● Do NOT put into your image things like passwords or hardcoded configuration that
may change dynamically. Terraform will deal with these.
● Static elements that take a long time to install belong in Packer.
● Dynamic elements that need to differ per instance should be handled at provisioning
time with Terraform.
● Use Packer Build Pipelines to build several specialized images.
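The split between static and dynamic elements can be sketched as follows, assuming AWS as the cloud provider. The image name pattern, the `environment` variable, and the instance type are illustrative assumptions, not values from the original article. Terraform looks up an image that Packer has already built and supplies only the per-instance settings at boot.

```hcl
# Look up the most recent image built by Packer in this account.
data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["app-base-*"] # hypothetical image naming pattern
  }
}

variable "environment" {
  type        = string
  description = "Environment name supplied at provisioning time"
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.app.id
  instance_type = "t3.micro" # assumed instance size

  # Dynamic, per-instance configuration is passed at boot
  # instead of being baked into the image.
  user_data = <<-EOT
    #!/bin/bash
    echo "APP_ENV=${var.environment}" >> /etc/environment
  EOT
}
```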
More on Packer Build Pipelines:
● When creating Packer templates, build images step by step. Do not try to do
everything in a single Packer build.
● Instead, create several Packer build.json files and “chain” them together, so that each
build starts from the image produced by the previous one.
● Keep templates as generic as possible and use User Variables.
● For Windows images, keep WinRM disabled or blocked by the firewall until the system
has had its final boot after sysprep. Check out this article on best practices with Packer
and Windows.
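One link in such a chain might look like the sketch below. The article refers to JSON templates (build.json); this example instead uses Packer's HCL2 template syntax, where input variables play the role of User Variables. The AWS builder, region, instance type, SSH user, and installed package are all assumptions chosen for illustration.

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.0.0"
    }
  }
}

# Input variables keep the template generic.
variable "source_ami" {
  type        = string
  description = "Image produced by the previous build in the chain"
}

variable "region" {
  type    = string
  default = "us-east-1" # assumed region
}

source "amazon-ebs" "app" {
  region        = var.region
  source_ami    = var.source_ami
  instance_type = "t3.micro" # assumed instance size
  ssh_username  = "ubuntu"   # assumes an Ubuntu-based source image
  ami_name      = "app-layer-${formatdate("YYYYMMDDhhmmss", timestamp())}"
}

build {
  sources = ["source.amazon-ebs.app"]

  # This stage adds one specialized layer on top of the base image.
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx", # hypothetical dependency
    ]
  }
}
```

The image ID from the earlier build would be supplied when this stage runs, for example through the `-var` option of `packer build`.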
Terraform manages collections of infrastructure resources (networking, infrastructure, app A,
app B, etc.). TFC and TFE organize these collections into “workspaces”. A workspace
contains everything Terraform needs to manage a given collection of infrastructure.
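As a minimal sketch, a configuration can be tied to a single TFC workspace with a `cloud` block (available in Terraform 1.1 and later). The organization and workspace names here are hypothetical, and a TFE installation would additionally need its own `hostname` setting.

```hcl
terraform {
  cloud {
    organization = "example-org" # hypothetical organization name

    workspaces {
      name = "networking-prod" # hypothetical workspace for one infrastructure collection
    }
  }
}
```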
Functional Workspaces
Workspace per Infrastructure Unit
A single workspace is used to manage a standalone deployable infrastructure unit. Below are
three examples of how to organize workspaces.
5. Integrate with a Secrets Management Solution
As organizations move to hybrid cloud environments, they realize the importance of a secrets
management solution. Our recommendation is to use HashiCorp Vault to centralize all your
secrets in one location.
Terraform integrates well with Vault; two common examples follow.
From a Workspace with the Terraform Vault Provider
The Terraform Vault Provider may be used to dynamically retrieve credentials from Vault for the
workspace to use during a run. In this scenario, Terraform must be able to authenticate to Vault
and have permission to generate these credentials. The generated credentials are used by the
workspace to authenticate to the cloud provider.
These credentials will appear in the state file, so it is good practice to create them with a short
TTL and to restrict user access to the state file.
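The pattern can be sketched as follows, assuming AWS as the cloud provider and an AWS secrets engine already enabled in Vault. The mount path, role name, and region are hypothetical. The credential TTL is governed by the Vault role and lease configuration, not by this Terraform code.

```hcl
provider "vault" {
  # Address and authentication are typically supplied through the
  # VAULT_ADDR and VAULT_TOKEN environment variables.
}

# Ask Vault to generate short-lived AWS credentials.
data "vault_aws_access_credentials" "deploy" {
  backend = "aws"    # assumed mount path of the AWS secrets engine
  role    = "deploy" # hypothetical Vault role
}

# Use the generated credentials to authenticate to the cloud provider.
provider "aws" {
  region     = "us-east-1" # assumed region
  access_key = data.vault_aws_access_credentials.deploy.access_key
  secret_key = data.vault_aws_access_credentials.deploy.secret_key
}
```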
Notes:
● Steps 1 & 2 happen only when a workspace is onboarded.
● Cloud credentials will be saved in the Terraform state file. To mitigate exposure risk,
restrict access to the state file and/or minimize the credential TTL.
6. Use Continuous Integration (CI) Pipelines
This is a detailed workflow in which we use Jenkins as an example of a CI tool that does the
following:
● Grabs configuration files from VCS.
● Dynamically creates a workspace (if it does not exist).
● Injects the necessary variables into the Terraform workspace before executing the run.
● In this scenario, Terraform does not pull the code directly from VCS (even though that is
possible); instead, Jenkins gets the code from VCS and uploads it to the Terraform
workspace.
With this approach, Jenkins is able to promote the same code between different environments.
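For the same code to be promoted across environments, anything that differs per environment has to arrive as a variable. Below is a minimal sketch of the Terraform side, with hypothetical variable names and environment values. The CI job would set these as workspace variables or as `TF_VAR_environment`-style environment variables before the run.

```hcl
variable "environment" {
  type        = string
  description = "Target environment, set by the CI job"

  # Reject values the pipeline is not expected to send.
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be dev, staging, or prod."
  }
}

variable "instance_count" {
  type    = number
  default = 1
}

locals {
  # The same code yields differently named resources per environment.
  name_prefix = "app-${var.environment}"
}

output "name_prefix" {
  value = local.name_prefix
}
```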
7. Modules and Variables for Code Reuse
As the adoption of Terraform at your organization increases, start thinking in terms of a
producer-consumer model. This allows the adoption of a self-service framework. Producers
create modules, whereas consumers don’t need to know the specifics within each module.
Consumers treat modules as black boxes with inputs and outputs. This makes it easier to reuse
code and enables consumers to serve themselves without the need to submit tickets to a
central team of producers.
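From the consumer's side, the black-box view might look like the sketch below. The module address, its inputs, and its `vpc_id` output are hypothetical and stand in for whatever a producer team publishes to a private registry.

```hcl
module "network" {
  source  = "app.terraform.io/example-org/network/aws" # hypothetical private registry module
  version = "~> 1.0"

  # Inputs: the only things the consumer needs to know about.
  cidr_block  = "10.0.0.0/16"
  environment = "dev"
}

# Outputs: the values the module hands back for use elsewhere.
output "vpc_id" {
  value = module.network.vpc_id # hypothetical module output
}
```

Pinning `version` lets producers release new module versions without breaking existing consumers.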
Create Standard Modules for Commonly Deployed Infrastructure
● Accelerate time to value by offering standard modules for use.
● Further improve productivity by designing well-tested configurations from standard
modules and offering them through self-service channels.
When to use Modules
It’s not always clear when to create a module and when to keep your configuration directly in a
configuration file. The flowchart above walks through the decision process.
As organizations look for ways to speed up application development, they start to think of
moving away from ticketing systems to more self-service models. It's important to enforce
governance when adopting a self-service framework.
There are multiple policy-as-code solutions available. The Open Policy Agent (OPA) is a popular
one for generic policy as code. Specific to Terraform, however, HashiCorp Sentinel is a
policy-as-code solution that ships with TFE and TFC.
Benefits of Sentinel:
● Reduce mistakes, make tracking easier and reduce coordination with stakeholders of the
provisioning process.
● Leverage Terraform Foundational Policies to accelerate your policy rule development.
● Tightly integrated with all HashiCorp products.
Moreover, take a look at the diagram below for an end-to-end view of testing.
Terraform Cloud or Enterprise offers the following checks:
● Syntax - use VS Code Plugin
● Plan - use provider checks
● Sentinel - use governance checks
● Apply - use cloud checks
Consider using Terraform Enterprise or Terraform Cloud from HashiCorp if you need any of
these:
The above diagram shows how to use Terraform Cloud or Terraform Enterprise for your
workflow.
Conclusion
In this article we reviewed 10 best practices for using Terraform in production environments.
This was an opinionated view based on what we see with successful organizations in their
adoption of Terraform. If you have questions or comments, feel free to message us here.
References
● The Core Terraform Workflow - Guides
● Workspaces - Terraform Cloud and Terraform Enterprise
● Naming - Workspaces - Terraform Cloud and Terraform Enterprise
● Terraform Configurations - Workspaces - Terraform Cloud and Terraform Enterprise