Jet Devops Cookbook (Revised)
Table of Contents
Chapter 1: Scope and Audience for This Cookbook
- Introduction
- Who Should Read This Cookbook?
- Scope of This Cookbook
- Key Takeaways
Chapter 2: Introduction to Infrastructure as Code Pipelines
- What is Infrastructure as Code (IaC)
- Benefits of using Infrastructure as Code (IaC)
- Importance of CI/CD Pipelines in IaC
- Key Concepts in Building IaC Pipelines
- Benefits of a Well-Designed IaC Pipeline
Chapter 3: Best Practices for Infrastructure as Code
- Organizing Terraform Code for Scalability and Maintainability
- State Management and Remote Backends
- Security Best Practices in IaC (e.g., Secrets Management)
- Documentation and Code Reviews
- Managing Dependencies and Module Reusability
Chapter 4: Designing a Universal IaC Pipeline
- Abstracting the Pipeline Process
- Reusable Jobs and Workflows
- Environment Segregation and Deployment Strategies
- Notifications and Monitoring in Pipelines
- Error Handling and Rollbacks
Chapter 5: Implementing the Pipeline with GitHub Actions and Atlantis
- Introduction to Atlantis and Its Role in IaC Pipelines
- Configuring Atlantis for Pull Request Automation
- Setting Up GitHub Actions and Writing Workflows for Terraform
- Summary
Chapter 6: Troubleshooting, Tips, and Future Trends
- Common Challenges in IaC Pipelines
- Emerging Trends in IaC and CI/CD
- Final Thoughts and Next Steps
# Jan Tymiński
I am a Senior DevOps Engineer with expertise in building resilient, scalable,
and cost-efficient cloud infrastructures. With 5 AWS certifications, I specialize
in designing and implementing high-availability architectures using
Terraform, containerization, and CI/CD practices to streamline development
processes and optimize operational performance.
# Chapter 1: Scope and Audience for This Cookbook
Introduction
Infrastructure as Code (IaC) has transformed the way organizations manage
and deploy their infrastructure. By treating infrastructure configuration as
code, teams can ensure consistency, scalability, and efficiency in their
environments. This cookbook is designed to guide you through best practices
for building, maintaining, and optimizing an IaC pipeline that is robust,
scalable, and secure.
Key Takeaways
By the end of this cookbook, readers will:
- Understand the critical components of a well-structured IaC pipeline
- Learn best practices that ensure security, maintainability, and efficiency
- Gain insights into the latest industry trends and emerging technologies
- Be equipped with the knowledge to implement and continuously improve their infrastructure automation strategies
This book serves as a practical resource for those looking to master the
discipline of Infrastructure as Code and build pipelines that stand the test of
time.
# Chapter 2: Introduction to Infrastructure as Code Pipelines
How It Works
1. Define Infrastructure: Use a supported language or tool (e.g.,
Terraform) to write configuration files that describe the desired state of
your infrastructure.
2. Store in Version Control: Save these files in a version control
system like Git to enable tracking, collaboration, and auditing.
3. Execute: Use IaC tools to parse the configuration files and create or
update the infrastructure automatically.
4. Maintain State: Track the current state of resources to ensure
updates are incremental and idempotent.
Example
Consider setting up a web server. Without IaC, you might manually provision
a virtual machine, install software, and configure the network. With IaC, you
write a file that describes the virtual machine, its software, and its
configuration. Running this file through an IaC tool creates the server
automatically, and it can be repeated for different environments. With the
manual approach, configuration is prone to errors; people may forget to
configure something when provisioning another server. IaC also makes
continuous improvement of the configuration easy - you can start from the
same well-established base and gradually improve it over time, easily
introducing improvements to already running infrastructure.
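For illustration, a minimal Terraform sketch of such a definition might look like the following (the AMI ID is a placeholder, and the user data assumes an Amazon Linux image):
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"

  # Install and start a web server on first boot
  user_data = <<-EOF
    #!/bin/bash
    yum install -y httpd
    systemctl enable --now httpd
  EOF

  tags = {
    Name = "web-server"
  }
}
Applying the same definition again, for another environment or after a teammate's review, produces the same server every time.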
Improved Collaboration
IaC facilitates collaboration between development and operations teams by
enabling infrastructure definitions to be versioned and reviewed like
application code. This shared approach reduces silos and fosters teamwork.
Enhanced Scalability
Scaling infrastructure to meet demand becomes straightforward with IaC.
Scripts can dynamically adjust the number of resources, ensuring seamless
performance during traffic spikes or peak usage periods.
Cost Management
IaC can help optimize costs by automating the deprovisioning of unused
resources. Policies and scripts can enforce cost-saving measures, such as
turning off non-essential resources during off-hours, or help prevent
costly misconfigurations.
Reusability
Infrastructure code can be modularized and reused across projects, reducing
duplication and promoting standardization. Teams can leverage modules,
templates, or libraries to build new environments faster.
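As a sketch, a team-maintained module (the local path and inputs here are hypothetical) can be instantiated per environment with only its inputs changing:
module "network_dev" {
  source = "./modules/network" # hypothetical shared module
  cidr   = "10.10.0.0/16"
}

module "network_prod" {
  source = "./modules/network"
  cidr   = "10.20.0.0/16"
}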
Vendor-Agnostic Flexibility
IaC tools support multiple cloud providers and on-premise environments,
enabling teams to adopt a vendor-agnostic approach. This flexibility prevents
lock-in and allows for easier migration between platforms. The configuration
of each platform may differ, but the tool's concepts remain the same. Not
every tool supports this, so it is important to choose one that fulfills
your needs.
Seamless Collaboration
CI/CD pipelines provide a unified workflow for developers and operations
teams. Pull requests trigger the pipeline, automating tasks like code
validation, infrastructure provisioning, and deployment approvals. This
ensures smoother onboarding and handoffs, and better alignment between
teams.
Enhanced Security
By embedding security checks into the CI/CD pipeline, teams can enforce
policies, scan for vulnerabilities and misconfigurations (e.g., overly
permissive networking), and ensure sensitive data like secrets are handled
securely.
This reduces the risk of misconfigurations and data breaches.
Cost Optimization
CI/CD pipelines can include tools like Infracost to provide cost estimates for
infrastructure changes based on the code being deployed. This allows teams
to understand the financial implications of changes before they are applied,
helping to prevent unexpected costs. While pipelines do not analyze actual
usage patterns for underutilized resources, they can support enforcing
tagging and policy compliance, enabling downstream cost analysis through
dedicated cloud management tools.
Continuous Testing
Continuous Testing involves automating tests to validate infrastructure code
at every stage of development and deployment. In IaC, it serves to:
- Identify Errors Early: Catch misconfigurations or invalid code before they reach production.
- Ensure Compliance: Validate that infrastructure adheres to security and organizational policies.
- Maintain Reliability: Test that changes to infrastructure won't disrupt existing environments.
Secrets and Configuration Management
- Securely handle sensitive information such as API keys, credentials, and environment variables.
- Use tools like HashiCorp Vault, AWS Secrets Manager, or encrypted variables in CI/CD tools (see the sketch below).
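For example, Terraform can read a secret at plan time instead of hardcoding it. A minimal sketch, assuming an existing secret (the secret name is hypothetical, and note the value still ends up in the Terraform state):
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password" # hypothetical secret name
}

locals {
  # Use the secret value without committing it to version control
  db_password = data.aws_secretsmanager_secret_version.db_password.secret_string
}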
Approval Workflows
- Implement approval steps in the pipeline to control deployments to critical environments.
- Configure pipelines to require manual reviews for pull requests before applying changes (a sketch follows below).
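One way to implement this in GitHub Actions is to bind the apply job to a protected environment; required reviewers are configured on that environment in the repository settings. A minimal sketch (the environment name and job are illustrative):
on:
  push:
    branches: [main]

jobs:
  apply:
    runs-on: ubuntu-latest
    environment: production # reviewers required on this environment must approve first
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # terraform apply would run here once a reviewer approves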
Error Handling and Rollback Mechanisms
- Design pipelines to detect errors early and provide actionable feedback.
- Include rollback strategies, such as reapplying previous versions of IaC, to recover from failed deployments.
Cost Awareness
- Incorporate tools like Infracost to assess the cost implications of changes.
- Enforce tagging and resource policies to track and optimize spending.
Monitoring and Observability
- Monitor pipeline execution and deployed infrastructure to detect and address issues promptly.
- Integrate alerts and logging into your pipeline to enhance visibility.
Compliance and Governance
- Enforce organizational policies through automated checks in the pipeline.
- Use tools like Open Policy Agent (OPA) or Checkov to ensure configurations comply with security and regulatory standards.
Scalability of Pipelines
- Design pipelines that scale with the growth of the infrastructure.
- Implement parallel processing where possible to reduce execution time.
By understanding and applying these concepts, teams can build IaC pipelines
that are robust, maintainable, and aligned with DevOps best practices. Each
of these key ideas will be elaborated further in subsequent chapters, with
examples and implementations to illustrate their application.
Seamless Collaboration
Pipelines enable smooth collaboration by integrating with version control
systems and enabling review workflows. Developers and operations teams
can contribute to infrastructure code, while automated pipelines handle
validation and deployment, fostering a DevOps culture.
Summary
These benefits illustrate how a well-designed IaC pipeline not only
streamlines infrastructure management but also supports strategic goals
such as faster delivery, improved quality, and reduced operational risks.
# Chapter 3: Best Practices for Infrastructure as Code
Imagine a world where infrastructure changes are predictable, teams
collaborate seamlessly, and scaling your environment feels effortless. This is
the promise of Infrastructure as Code (IaC)—but only when done right. Best
practices aren’t just guidelines; they’re the foundation for building
infrastructure that’s secure, scalable, and resilient in the face of change.
In this chapter, we dive into the essential practices that ensure your IaC
delivers on its promise. Whether you’re writing code, reviewing it, or guiding
your teams from the management level, these practices will empower you to
achieve efficiency, prevent costly mistakes, and future-proof your
infrastructure. Embracing these principles isn’t just a technical choice; it’s a
strategic one for long-term success.
environments/
├── dev/
│   ├── main.tf
│   └── variables.tf
├── staging/
│   ├── main.tf
│   └── variables.tf
└── prod/
    ├── main.tf
    └── variables.tf
components/
├── networking/
│   ├── main.tf
│   └── variables.tf
├── compute/
│   ├── main.tf
│   └── variables.tf
└── storage/
    ├── main.tf
    └── variables.tf
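The two layouts can also be combined: an environment entry point composes the shared components. A sketch (paths and inputs are illustrative):
# environments/dev/main.tf
module "networking" {
  source = "../../components/networking"
  # environment-specific inputs (CIDR ranges, tags, etc.) go here
}

module "compute" {
  source = "../../components/compute"
  # wire outputs of one component into another as needed, e.g.
  # subnet_ids = module.networking.private_subnets (hypothetical)
}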
Summary
Effective dependency management and module reusability are foundational
to scalable and maintainable IaC. Whether adopting a centralized,
decentralized, or hybrid module management approach, organizations
should focus on reducing complexity, ensuring stability, and leveraging best
practices to drive long-term success. By implementing structured
dependency strategies and designing reusable modules, teams can
streamline infrastructure deployment, minimize risks, and enhance
collaboration.
As the organization grows, the strategy may shift - don't focus too much on
choosing the right one, and don't try to foresee a future you cannot
predict. You can cross that bridge when you get there.
# Chapter 4: Designing a Universal IaC Pipeline
In the fast-paced world of modern infrastructure, efficiency and reliability are
not optional—they are essential. Infrastructure as Code (IaC) pipelines
provide the foundation for automating deployments, eliminating manual
errors, and enforcing best practices at scale. While different organizations
may use different IaC tools, CI/CD platforms, and cloud providers, the core
principles of a well-designed pipeline remain the same. It’s not about the
tools—it’s about the process.
By focusing on universal best practices, we can build pipelines that are
adaptable, scalable, and resilient to change. Instead of being constrained by
specific technologies, organizations can design workflows that integrate
seamlessly with their infrastructure strategy. Whether you’re an engineer
striving for efficiency, a manager ensuring compliance, or a leader shaping
operational strategy, mastering the core elements of an IaC pipeline will help
drive long-term success. Let’s dive into the principles that make an IaC
pipeline not just functional, but truly universal.
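# Chapter 5: Implementing the Pipeline with GitHub Actions and Atlantis
We start by provisioning an S3 bucket to store the Terraform states.
providers.tf: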
provider "aws" {
region = "eu-west-1"
default_tags {
tags = {
Terraform = "true"
Cost = "iac-pipelines-cookbook"
}
}
}
terraform.tf:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
main.tf:
module "s3_bucket" {
  source = "terraform-aws-modules/s3-bucket/aws"

  versioning = {
    enabled = true
  }
}
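Next, we create a DynamoDB table that Terraform can use for state locking.
providers.tf: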
provider "aws" {
region = "eu-west-1"
default_tags {
tags = {
Terraform = "true"
Cost = "iac-pipelines-cookbook"
}
}
}
terraform.tf:
terraform {
  backend "s3" {
    bucket       = "s3-bucket-for-states"
    key          = "cookbook/states_dynamodb_table/terraform.tfstate"
    region       = "eu-west-1"
    use_lockfile = true
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
main.tf:
resource "aws_dynamodb_table" "terraform_states_lock" {
  name         = "terraform-states-lock" # illustrative name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # Terraform's S3 backend locking expects this key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
Next we need a VPC ready - you can (and probably should) use an existing
one, or consider tailoring the configuration below appropriately.
The VPC configuration for the purposes of preparing this cookbook is as
follows:
Directory structure:
├── shared_vpc
│ ├── outputs.tf
│ ├── providers.tf
│ ├── terraform.tf
│ └── vpc.tf
providers.tf:
provider "aws" {
  region = "eu-west-1"

  default_tags {
    tags = {
      Terraform = "true"
      Cost      = "iac-pipelines-cookbook"
    }
  }
}
terraform.tf:
terraform {
  backend "s3" {
    bucket       = "s3-bucket-for-states" # Your S3 bucket for Terraform states
    key          = "cookbook/shared_vpc"
    region       = "eu-west-1"
    use_lockfile = true # Terraform 1.10+ only
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
vpc.tf:
module "shared_vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "shared-vpc"
  cidr = "10.0.0.0/16"

  # AZs and subnet CIDRs are illustrative; adjust them to your needs
  azs             = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = false
  # single NAT gateway is used to reduce costs of the PoC
  single_nat_gateway = true
}
outputs.tf:
output "vpc_id" {
  value = module.shared_vpc.vpc_id
}

output "private_subnets" {
  value = module.shared_vpc.private_subnets
}

output "public_subnets" {
  value = module.shared_vpc.public_subnets
}
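With the networking in place, we can set up Atlantis itself.
providers.tf: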
provider "aws" {
region = "eu-west-1"
default_tags {
tags = {
Terraform = "true"
Cost = "iac-pipelines-cookbook"
}
}
}
terraform.tf:
terraform {
  backend "s3" {
    bucket       = "s3-bucket-for-states" # Your S3 bucket for Terraform states
    key          = "cookbook/atlantis_setup" # each state needs its own key
    region       = "eu-west-1"
    use_lockfile = true # Terraform 1.10+ only
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
module "secrets_manager" {
source = "terraform-aws-modules/secrets-manager/aws"
version = "~> 1.0"
for_each = {
github-token = {
secret_string = var.github_token
}
github-webhook-secret = {
secret_string = random_password.webhook_secret.result
}
atlantis-web-username = {
secret_string = var.atlantis_web_username
}
atlantis-web-password = {
secret_string = var.atlantis_web_password
}
}
# Secret
name_prefix = each.key
recovery_window_in_days = 0 # For example only
secret_string = each.value.secret_string
}
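The webhook secret referenced above is generated within the configuration itself; a minimal definition (the length is an assumption) could look like:
resource "random_password" "webhook_secret" {
  length  = 32 # illustrative length
  special = false
}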
variable "github_token" {
description = "Github token to use when creating webhook"
type = string
}
variable "github_owner" {
description = "Github owner to use when creating webhook"
type = string
}
variable "domain" {
description = "Route53 domain name to use for ACM certificate.
Route53 zone for this domain should be created in advance"
type = string
}
variable "atlantis_github_user" {
description = "GitHub user or organization name"
type = string
}
variable "atlantis_repo_allowlist" {
description = "List of GitHub repositories that Atlantis will be
allowed to access"
type = list(string)
}
variable "repositories" {
description = "List of GitHub repositories to create webhooks for.
This is just the name of the repository, excluding the user or
organization"
type = list(string)
}
variable "atlantis_web_username" {
description = "Atlantis web username"
type = string
}
variable "atlantis_web_password" {
description = "Atlantis web password"
type = string
}
github_token         = "ghp_PAToftheusertointegratewithatlantis123"
github_owner         = "dedicated-user"
domain               = "example.com"
atlantis_github_user = "your-github-organization"
# Format is {hostname}/{owner}/{repo}
# https://www.runatlantis.io/docs/server-configuration.html#repo-allowlist
atlantis_repo_allowlist = ["github.com/your-github-organization/repository-to-orchestrate-with-atlantis"]
repositories            = ["repository-to-orchestrate-with-atlantis"]
atlantis_web_username   = "basic_auth_username"
atlantis_web_password   = "basic_auth_password"
locals {
  atlantis_domain = "atlantis.${var.domain}"
}

module "atlantis" {
  source = "terraform-aws-modules/atlantis/aws"

  name = "atlantis"

  # ECS Service
  service = {
    task_exec_secret_arns = [
      module.secrets_manager["github-token"].secret_arn,
      module.secrets_manager["github-webhook-secret"].secret_arn,
      module.secrets_manager["atlantis-web-username"].secret_arn,
      module.secrets_manager["atlantis-web-password"].secret_arn,
    ]
    # Provide Atlantis permission necessary to create/destroy resources
    tasks_iam_role_policies = {
      AdministratorAccess = "arn:aws:iam::aws:policy/AdministratorAccess"
    }
  }
  service_subnets = data.terraform_remote_state.network.outputs.private_subnets
  vpc_id          = data.terraform_remote_state.network.outputs.vpc_id

  # ALB
  alb_subnets             = data.terraform_remote_state.network.outputs.public_subnets
  certificate_domain_name = local.atlantis_domain
  route53_zone_id         = data.aws_route53_zone.atlantis.zone_id
}
Data sources used below are for reading the remote state of the network and
for fetching the zone attributes from AWS. The network state is used to
configure VPC and subnets for Atlantis. The zone attributes contain zone ID,
which is used later for configuring the ACM certificate for Load Balancer.
data.tf:
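data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    # Matches the backend of the shared VPC state created earlier
    bucket = "s3-bucket-for-states"
    key    = "cookbook/shared_vpc"
    region = "eu-west-1"
  }
}

data "aws_route53_zone" "atlantis" {
  # The hosted zone for the domain is assumed to exist already
  name = var.domain
}
outputs.tf: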
output "atlantis_url" {
description = "URL of Atlantis"
value = module.atlantis.url
}
output "webhook_secret" {
description = "Webhook secret"
value = module.secrets_manager["github-webhook-
secret"].secret_string
sensitive = true
}
################################################################################
# Load Balancer
################################################################################
# output "alb" {
#   description = "ALB created and all of its associated outputs"
#   value       = module.atlantis.alb
# }

################################################################################
# ECS
################################################################################
# output "cluster" {
#   description = "ECS cluster created and all of its associated outputs"
#   value       = module.atlantis.cluster
# }
# output "service" {
#   description = "ECS service created and all of its associated outputs"
#   value       = module.atlantis.service
# }

################################################################################
# EFS
################################################################################
# output "efs" {
#   description = "EFS created and all of its associated outputs"
#   value       = module.atlantis.efs
# }
[Figure: Atlantis web UI]
atlantis.yaml
Atlantis automation on a per-repository basis is handled by the atlantis.yaml
file stored at the root of each repository we want to orchestrate with
Atlantis. For the states described above, we exclude them from the
automation using the ignore_paths list, which takes directories to be
excluded from the orchestration. To avoid a chicken-and-egg problem, we
don't want Atlantis to orchestrate itself.
version: 3
autodiscover:
  mode: auto
  ignore_paths:
    - atlantis_setup
    - shared_vpc
    - states_dynamodb_table
    - states_s3_bucket
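Static checks for pull requests are handled by a GitHub Actions workflow, stored in the repository under .github/workflows/: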
on:
  pull_request:

jobs:
  lint:
    name: Check Terraform Code formatting
    runs-on: ubuntu-latest
    container: hashicorp/terraform:latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # Fail the check if any file is not formatted
      - name: Check formatting
        run: terraform fmt -check -recursive

  tflint:
    name: Lint Terraform Code
    runs-on: ubuntu-latest
    container: ghcr.io/terraform-linters/tflint:latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Run tflint
        run: tflint --recursive

  security_scan:
    name: Run Security Scan
    runs-on: ubuntu-latest
    container: aquasec/trivy:latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # Scan Terraform files for misconfigurations
      - name: Run trivy
        run: trivy config .

  infracost:
    name: Run Infracost
    runs-on: ubuntu-latest
    container: infracost/infracost:latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # Requires an Infracost API key stored as a repository secret
      - name: Run infracost
        run: infracost breakdown --path=.
        env:
          INFRACOST_API_KEY: ${{ secrets.INFRACOST_API_KEY }}
3. In Target branches, select main or whichever default branch PRs will be
merged into.
[Figure: GitHub ruleset target branches]
4. Enable the following:
- Require a pull request before merging
- Require status checks to pass (adding the workflow jobs above as required checks)
Summary
In this chapter, we configured a Terraform automation pipeline using
Atlantis and GitHub Actions, ensuring infrastructure changes are
reviewed, validated, and applied in a controlled manner. We deployed
Atlantis on AWS, integrated it with GitHub, and set up workflows to enforce
security, compliance, and approval requirements before applying changes.
This implementation serves as a structured approach, but it is not the only
solution. It can be amended, extended, or completely restructured
based on specific requirements, organizational policies, or alternative
tooling. The principles outlined here remain applicable regardless of the
chosen tools.
# Chapter 6: Troubleshooting, Tips, and Future Trends
By now, you’ve built a solid Infrastructure as Code (IaC) pipeline, automating
your deployments and enforcing best practices. But as with any complex
system, challenges will arise. Whether it’s debugging a failing pipeline,
scaling your workflows across multiple teams, or staying ahead of evolving
best practices, there’s always room for refinement and growth.
In this chapter, we’ll explore some of the most common challenges in IaC
pipelines and how to overcome them. We’ll share insights on optimizing
performance, scaling pipelines for large organizations, and keeping up with
the ever-changing landscape of infrastructure automation. Lastly, we’ll take
a forward-looking approach, discussing emerging trends that could shape the
future of IaC and CI/CD.
Building and maintaining a robust pipeline is not just about writing code —
it’s about understanding the ecosystem, avoiding common pitfalls, and
continuously improving your processes. Let’s dive in and ensure your
pipelines are not just functional but truly rock-solid.