DOCKER - SWARM - ECS Notes
1. Introduction to Docker
1.1 What is Docker?
Docker is an open-source platform that enables developers to build, package, and distribute
applications using containerization. A Docker container is a lightweight, portable, and isolated
environment that encapsulates an application along with its dependencies, ensuring consistency
across different computing environments.
Docker allows developers to:
Package applications and their dependencies together.
Run applications consistently across multiple environments (development, testing, and
production).
Reduce conflicts between different software versions.
Improve resource efficiency by sharing the underlying OS kernel.
Example:
Suppose you have a Python application that requires Python 3.10 and specific libraries. Without
Docker, a developer might install dependencies manually, leading to inconsistencies between
environments.
With Docker, you create a Docker image containing the necessary Python version and dependencies,
ensuring that the application runs the same way everywhere.
Simple Dockerfile for a Python Application:
FROM python:3.10
WORKDIR /app
COPY requirements.txt .
# Install dependencies
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Example:
Imagine running 3 applications with different environments:
Using VMs: You need three separate VMs, each with its own OS, consuming a lot of memory.
Using Docker: You need three containers, each sharing the same OS kernel, making it
lightweight and efficient.
Diagram:
VM Approach:                       Docker Approach:
----------------------             ------------------------------------------
| Host OS            |             | Host OS                                |
| Hypervisor         |             | Docker Engine                          |
| ------------------ |             | -------------------------------------- |
| VM1 | VM2 | VM3    |             | Container1 | Container2 | Container3   |
----------------------             ------------------------------------------
Types of Registries
1. Public Registry – Docker Hub (default)
2. Private Registry – Self-hosted or cloud-based (AWS ECR, Azure ACR, Google GCR)
sudo chmod a+r /etc/apt/keyrings/docker.asc (grants read permission to all users on the machine)
sudo apt update (updates the APT package index so it recognizes the new repository)
3. Docker Networking
3.1 Docker Networking Basics
How Does Docker Manage Networking?
Docker provides a flexible networking model that allows containers to communicate with each other,
the host system, and the internet. By default, Docker creates different types of networks for various
use cases.
Container-to-Container Communication
Containers can communicate with each other using different networking approaches.
The bridge network also has its own gateway IP, used to route traffic between containers; this IP comes from the same subnet as the containers.
The CIDR block (e.g., 192.168.1.0/24) defines the range of IPs available to the bridge network. Both the container IPs and the bridge gateway IP come from this range.
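As a rough sketch of this subnet arithmetic, Python's ipaddress module can show how the gateway and container IPs all fall inside one CIDR block. This assumes the common default bridge subnet 172.17.0.0/16; verify yours with docker network inspect bridge.

```python
import ipaddress

# Typical default Docker bridge subnet (an assumption; check your host)
subnet = ipaddress.ip_network("172.17.0.0/16")

# The bridge gateway is conventionally the first usable address in the range
gateway = next(subnet.hosts())
print(gateway)                  # 172.17.0.1

# Any container IP must fall inside the same CIDR block as the gateway
container_ip = ipaddress.ip_address("172.17.0.2")
print(container_ip in subnet)   # True
print(subnet.num_addresses)     # 65536 addresses in a /16
```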
Checking IPs
Explanation:
eth0: Primary network interface with IP 192.168.1.100; this is usually the interface on the same subnet as the default gateway.
lo: Loopback interface (127.0.0.1).
RX/TX packets show data received/sent.
Docker Network Details: docker network inspect bridge => It shows all the network
information, all the ips included and all the containers in the network.
Container IP: docker container inspect <container-id> => In this we can see detail of specific
container, Hence also ip of that specific container and the network it belongs to.
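To illustrate what those inspect commands return, here is a minimal sketch that parses a trimmed, hypothetical excerpt of docker network inspect output (the container IDs, names, and addresses below are made up for illustration):

```python
import json

# Hypothetical, heavily trimmed `docker network inspect bridge` output
sample = json.loads("""
[{"Name": "bridge",
  "IPAM": {"Config": [{"Subnet": "172.17.0.0/16", "Gateway": "172.17.0.1"}]},
  "Containers": {
    "abc123": {"Name": "web", "IPv4Address": "172.17.0.2/16"},
    "def456": {"Name": "db",  "IPv4Address": "172.17.0.3/16"}
  }}]
""")

net = sample[0]
print(net["IPAM"]["Config"][0]["Subnet"])        # 172.17.0.0/16
# List every container on the network with its IP (strip the /16 suffix)
for c in net["Containers"].values():
    print(c["Name"], c["IPv4Address"].split("/")[0])
```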
Example:
Here’s a step-by-step example of how a Docker bridge network works, including commands and
expected outputs.
Command:
docker network ls
Expected Output:
NETWORK ID     NAME     DRIVER   SCOPE
abcdef123456   bridge   bridge   local
ghijkl789012   host     host     local
mnopqr345678   none     null     local
The bridge network is the default network for containers.
2. Host Network
Removes the network isolation of the container, making it use the host’s network stack
directly.
The container does not get its own IP; it shares the host’s IP.
Useful for performance optimization but eliminates container networking isolation.
Example:
docker run -d --name web_host --network host nginx
curl http://localhost
In host networking, the container uses the host machine's network stack directly, bypassing Docker's bridge. Any container on this network therefore needs no port binding: the host's IP and ports are the container's IP and ports.
Use Case
Used when a container needs direct access to the host network, such as an antivirus container that
needs to monitor all traffic.
Checking IPs
Container IP: Since it shares the host network, the container does not get a separate IP.
Example:
Here’s a step-by-step example of how a Docker host network works, including commands and
expected outputs.
3. None Network
Disables all networking for a container.
The container only has a loopback interface.
Used for highly restricted, isolated workloads.
Example:
docker run -d --name isolated_container --network none ubuntu sleep 1000
docker exec -it isolated_container ping google.com # Should fail
How It Works
A container with none network mode is fully isolated and does not have any network interfaces
except for the loopback interface.
Use Case
Useful for containers that should not have any network access, such as security-sensitive
applications.
Example:
Here’s a step-by-step example of how the Docker None network works, including commands,
outputs, and explanations.
4. Overlay Network
Enables communication between containers across multiple Docker hosts in a Swarm cluster.
Used in Docker Swarm or Kubernetes deployments.
Provides secure, encrypted communication.
Example:
docker network create --driver overlay my_overlay
5. Macvlan Network
Assigns each container a unique MAC address, making it appear as a physical device on the
network.
Allows direct communication with other devices in the network.
Example:
docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=eth0 my_macvlan
docker run -d --name mac_container --network my_macvlan nginx
6. IPvlan Network
Assigns IP addresses directly from the physical network, similar to Macvlan but with lower
overhead.
Provides better performance by reducing network translation layers.
Example:
docker network create -d ipvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=eth0 my_ipvlan
docker run -d --name ipvlan_container --network my_ipvlan nginx
Creating Networks
To create a new Docker network, use the docker network create command. This allows the creation
of custom bridge networks, overlay networks, and more.
Example: Creating a User-Defined Bridge Network
docker network create my_custom_bridge
Inspecting Networks
To get detailed information about a network, use docker network inspect.
Example:
docker network inspect my_custom_bridge (See what containers are part of this network,
what’s ip block of this network)
Removing Networks
Unused networks can be removed using docker network rm.
Example:
docker network rm my_custom_bridge
EXTRA NOTES:
-p vs -P Port Binding
Docker provides two options for exposing container ports to the host:
-p <host_port>:<container_port> (explicit mapping): Maps a specific host port to a
container port.
-P (automatic mapping): Randomly assigns available host ports to the container’s
exposed ports.
Example: Manual Port Mapping with -p
docker run -d -p 8080:80 nginx
To access the Nginx server, navigate to:
http://localhost:8080
This maps port 80 inside the container to port 8080 on the host.
Example: Random Port Mapping with -P
Step 1: Run a Container with -P
docker run -d --name web_random -P nginx
The -P flag assigns a random available host port for each exposed port inside
the container.
Since Nginx exposes port 80 by default, Docker will map it to a random host
port.
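The idea behind -P, letting the system pick any free port, can be sketched without Docker at all: binding a socket to port 0 asks the OS for an ephemeral port, which is conceptually what Docker does when choosing a random host port.

```python
import socket

# Binding to port 0 asks the kernel for any free ephemeral port --
# conceptually what `docker run -P` does when picking a random host port.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
host, port = s.getsockname()
print(f"kernel assigned port {port}")  # typically 32768-60999 on Linux
s.close()
```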
Expected Output (pinging the db container from another container on a user-defined network):
PING db (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: icmp_seq=1 ttl=64 time=0.064 ms
This shows that the app container can resolve db using Docker's internal DNS.
1. Docker Volume
Why use Docker Volumes?
Managed by Docker (safe, portable).
Persist data even after the container is removed.
Work across multiple containers.
Best for databases, logs, and application data.
Check Volumes
docker volume ls
Expected Output
DRIVER VOLUME NAME
local my_volume
Inspect Volume
docker volume inspect my_volume
Expected Output
[
{
"Name": "my_volume",
"Mountpoint": "/var/lib/docker/volumes/my_volume/_data",
"Driver": "local",
"CreatedAt": "2025-03-26T10:00:00Z"
}
]
Note: Because volumes are managed by Docker, they live in the Docker area of the host file system (/var/lib/docker/volumes), unlike bind mounts, which can be at any location on the host file system.
3. Listing Volumes
docker volume ls
Output
DRIVER VOLUME NAME
local my_volume
4. Inspecting Volumes
docker volume inspect my_volume
Output
[
{
"Name": "my_volume",
"Mountpoint": "/var/lib/docker/volumes/my_volume/_data"
}
]
5. Removing Volumes
docker volume rm my_volume
Output
my_volume
By default, Docker does not remove volumes when a container is deleted, even when the container is removed with docker rm. This prevents data loss. However, anonymous volumes can be removed automatically: start the container with --rm, or remove it with docker rm -v.
6.1. Using --rm for Temporary Containers (does not work for named volumes)
docker run --rm -v /app/data nginx
🔹 The container and its anonymous volume will be removed as soon as it stops.
❗ Limitation: This does not work with named volumes.
Docker Commit
docker commit my_container my_new_image
🔹 Saves live container changes to an image (not best practice for reproducibility).
Example:
# Stage 1: Build
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp
# Stage 2: Runtime
FROM alpine:latest
COPY --from=builder /app/myapp /usr/local/bin/myapp
ENTRYPOINT ["myapp"]
🔹 This results in a smaller image by keeping only the built binary.
6. Docker Compose
6.1 Introduction to Docker Compose
Docker Compose is a tool for defining and running multi-container (multi-service) Docker applications on a single machine. It uses a YAML file to configure the application's services, networks, and volumes, making it easier to manage and deploy complex environments.
services:
  db:
    image: mysql:latest
    container_name: mysql-db
    environment:
      MYSQL_ROOT_PASSWORD: root
    ports:
      - "3306:3306"
  backend:
    image: my-backend-image
    container_name: backend-app
    depends_on:
      - db
    ports:
      - "5000:5000"
  frontend:
    image: my-frontend-image
    container_name: frontend-app
    depends_on:
      - backend
    ports:
      - "3000:3000"
🔹 Understanding docker-compose.yml
A docker-compose.yml file is a configuration file written in YAML format that defines:
✔️Services (containers)
✔️Networks
✔️Volumes
✔️Environment variables
✔️Restart policies
✔️Dependencies between services
version: "3.8"
services:
  web:
    image: nginx:latest
    ports:
      - "8080:80"
    depends_on:
      - db # Web service depends on the database
    restart: always
    networks:
      - my_network
  db:
    image: postgres:latest
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    volumes:
      - db_data:/var/lib/postgresql/data
    networks:
      - my_network
volumes:
  db_data:
networks:
  my_network:
✅ What Happens Here?
depends_on: Ensures db starts before web, but it does NOT wait for the DB to be
ready.
restart: always: Ensures the container restarts automatically if it crashes.
Networking: Both containers communicate via my_network (a bridge network by default) instead of relying on ports exposed to the host.
Volume db_data stores the database’s persistent data.
Example:
services:
  app:
    image: my-app
    depends_on:
      - db # Ensures db service starts before app
  db:
    image: mysql:latest
⚠ Limitation: depends_on does not wait for the database to be fully initialized. To ensure
readiness, use a health check.
Solution: Using healthcheck for Readiness
services:
  app:
    image: my-app
    depends_on:
      db:
        condition: service_healthy # Waits for db health check to pass
  db:
    image: mysql:latest
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      retries: 5
      start_period: 30s
✅ Now, app will wait until db is actually ready!
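The retry-until-healthy behaviour that interval and retries configure can be sketched in a few lines of Python. The fake_db_ping stub below is a stand-in for a real readiness probe such as mysqladmin ping.

```python
import time

def wait_until_healthy(check, retries=5, interval=0.01):
    """Poll `check()` up to `retries` times, sleeping `interval` seconds
    between attempts -- the same idea as Compose's healthcheck settings."""
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        time.sleep(interval)
    raise TimeoutError("service never became healthy")

# Simulated database that only answers from the third probe onwards
state = {"probes": 0}
def fake_db_ping():
    state["probes"] += 1
    return state["probes"] >= 3

print(wait_until_healthy(fake_db_ping))  # 3
```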
🔹 Attaching Volumes
Volumes persist data even when a container is removed.
1️ Named Volumes (Preferred)
Docker manages these volumes.
services:
  db:
    image: postgres:latest
    volumes:
      - db_data:/var/lib/postgresql/data
volumes:
  db_data: # Declare the named volume
🟢 Data remains even after docker-compose down.
🔹 Attaching Networks
By default, Docker Compose creates a default network for all services to communicate.
1️ User-Defined Bridge Network
networks:
  my_network:
    driver: bridge
services:
  app:
    image: my-app
    networks:
      - my_network
  db:
    image: postgres
    networks:
      - my_network
✔️Containers in the same network can talk using service names.
✔️app can reach db using db:5432.
services:
  backend:
    image: my-backend
    networks:
      - my_overlay
✔️Used in Swarm Mode to connect services across multiple hosts.
Stop Containers
docker-compose down
🚀 Stops and removes all services.
View Logs
docker-compose logs -f
Rebuild After Code Changes
docker-compose up -d --build
Example Output:
Creating network my_app_default ... done
Creating volume my_app_data ... done
Creating my_app_db ... done
Creating my_app_web ... done
✅ Use case: Start the whole multi-container application in the background.
Example Output:
Stopping my_app_web ... done
Stopping my_app_db ... done
Removing my_app_web ... done
Removing my_app_db ... done
Removing network my_app_default
✅ Use case: Clean up the environment after development or testing.
🔹 2. Checking Logs
📜 docker-compose logs (View Container Logs)
Fetches logs for all containers.
docker-compose logs
Options:
-f → Follows logs in real time (like tail -f).
--tail=50 → Shows the last 50 log lines.
Example Output:
web | Starting Nginx...
db | Database initialized.
✅ Use case: Debugging application behavior.
🔹 4. Managing Services
🚀 docker-compose restart (Restart Containers)
Example Output:
Name         Command                  State   Ports
-------------------------------------------------------------
my_app_web   "nginx -g daemon off;"   Up      80/tcp
my_app_db    "docker-entrypoint.sh"   Up      5432/tcp
✅ Use case: Verify which containers are running.
🔹 7. Updating Services
🔄 docker-compose pull (Pull New Image Versions)
🔹 Summary of Commands
Command Description
docker-compose up -d Start all services in detached mode
docker-compose down Stop and remove containers and networks (add -v to also remove volumes)
docker-compose logs -f View real-time logs
docker-compose ps List running services
docker-compose stop Stop containers without removing them
docker-compose restart Restart all containers
docker-compose start Start previously stopped containers
docker-compose rm -f Remove stopped containers
docker-compose exec service_name command Run commands inside a running container
docker-compose run service_name command Run a command in a new temporary container
docker-compose pull Pull latest images from the registry
docker-compose build Build images from a Dockerfile
docker-compose up --force-recreate Recreate containers without reusing old ones
docker volume ls List all Docker volumes
docker network ls List all Docker networks
docker-compose config Validate docker-compose.yml file
Example Challenge
If a database container crashes, orchestration tools restart it. But if persistent data storage
isn’t set up properly, data may be lost.
Example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
✅ Best for: Large-scale production deployments.
Expected Output:
Swarm initialized: current node (abcd1234) is now a manager.
Expected Output:
ID NAME NODE STATUS
xyz123 web_service.1 worker-1 Running
abc456 web_service.2 worker-2 Running
def789 web_service.3 worker-3 Running
Swarm distributes containers across multiple nodes for high availability.
2️ What is a Task?
1 task = 1 container running on a Swarm node.
If a service is set to 5 replicas, Swarm creates 5 tasks distributed across nodes.
If a task fails, Swarm automatically replaces it on an available node.
Comparison Example:
In Swarm, to run 2 different containers, you need 2 separate services.
In ECS, a single task definition can run multiple containers together.
🔹 Scaling Services
Swarm supports two types of scaling:
1. Replicated Scaling: Defines a fixed number of replicas. Uses --replicas=5.
2. Global Scaling: Ensures exactly one task runs per node. Each time a node is added, the service is scheduled onto that node. This type of scaling is used, for example, to run an antivirus container on all machines.
Example: docker service create --name antivirus --mode global -dt ubuntu
🔹 Draining a Node
When maintaining a Swarm node, you can drain it so tasks are rescheduled to other nodes.
$ docker node update --availability drain worker-1
✅ This removes tasks from worker-1.
To reactivate the node:
$ docker node update --availability active worker-1
Use case: We want to update something on a node. Then we can drain that node and update
it and then later make it active again so that tasks can be assigned to it by orchestrator
To inspect a node:
$ docker node inspect worker-1 --pretty
✅ Provides a human-readable summary.
Or
services:
  frontend:
    image: nginx:latest
    ports:
      - "8080:80"
    networks:
      - app_network
    deploy:
      replicas: 2
      restart_policy:
        condition: any
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: mydatabase
    volumes:
      - db_data:/var/lib/mysql
    networks:
      - app_network
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
networks:
  app_network:
volumes:
  db_data:
Explanation:
frontend (NGINX web server)
o Runs on port 8080 (maps to container’s port 80).
o Uses a custom network (app_network) to communicate with the database.
o Runs two replicas for load balancing.
db (MySQL database)
o Uses environment variables to set the database name and password.
o Uses a Docker volume (db_data) to persist database data.
o Runs only one replica (databases should not have multiple replicas unless
explicitly configured).
networks: Creates a custom network for inter-container communication.
volumes: Ensures MySQL data is persisted.
Example output:
{"region":"us-east","storage":"ssd"}
This means the node is labeled as:
region=us-east
storage=ssd
Example:
docker node update --label-add storage=ssd worker-1
This assigns the label storage=ssd to the node named worker-1.
Example output:
[region == us-east storage == ssd]
✅ This confirms that the service was restricted to nodes with these labels.
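The label-matching behind placement constraints can be sketched as a simple filter. The node names and labels below are hypothetical, mirroring the region/storage example above.

```python
# Hypothetical node metadata, mimicking labels from `docker node inspect`
nodes = {
    "worker-1": {"region": "us-east", "storage": "ssd"},
    "worker-2": {"region": "us-west", "storage": "hdd"},
    "worker-3": {"region": "us-east", "storage": "ssd"},
}

def eligible(constraints, labels):
    """True if every `key == value` constraint matches the node's labels."""
    return all(labels.get(k) == v for k, v in constraints.items())

constraints = {"region": "us-east", "storage": "ssd"}
matches = [name for name, labels in nodes.items() if eligible(constraints, labels)]
print(matches)  # ['worker-1', 'worker-3']
```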
Step 3: Enter the App Container and Ping the Web Container
Find a running container from the app service:
docker ps
Example output:
CONTAINER ID IMAGE COMMAND CREATED STATUS NAMES
ab12cd34ef56 alpine "sleep 10000" 1 minute ago Up 1 minute app.1.xxxx
7. Verifying Encryption
To check if encryption is working, capture network packets and ensure traffic is encrypted.
On One of the Swarm Nodes, Run:
sudo tcpdump -i eth0 port 4789
🔹 Port 4789 is used for VXLAN encapsulation. If encrypted, no plain text data will be visible.
Ensuring high availability and fault tolerance in a Docker Swarm cluster is critical for preventing disruptions and maintaining service uptime. In this section, we'll cover key topics like quorum, split-brain problems, manager node high availability, and disaster recovery strategies.
1. Quorum & Split-Brain Problem
What is Quorum?
Quorum is the minimum number of manager nodes required for the cluster to function properly.
✅ Swarm uses Raft consensus algorithm to maintain consistency across manager nodes.
✅ A quorum is required for leader elections and state changes.
✅ If quorum is lost, the cluster cannot accept new updates, even if some nodes are running.
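The quorum arithmetic can be sketched directly: Raft requires a strict majority of managers, so a cluster of N managers tolerates floor((N-1)/2) failures.

```python
def quorum(managers):
    """Raft needs a strict majority of managers to agree."""
    return managers // 2 + 1

def tolerated_failures(managers):
    """How many managers can fail before quorum is lost."""
    return (managers - 1) // 2

for n in (1, 3, 5, 7):
    print(n, "managers -> quorum", quorum(n),
          ", tolerates", tolerated_failures(n), "failures")
# 3 managers tolerate 1 failure; 5 tolerate 2; 7 tolerate 3.
```

This is why manager counts are usually odd: going from 3 to 4 managers raises the quorum without tolerating any extra failures.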
Split-Brain Problem
If two or more partitions of a cluster believe they are the leader due to a network failure, it's called a
split-brain problem.
🔹 This can lead to data inconsistency and conflicting changes.
🔹 Swarm prevents split-brain by requiring a majority (quorum) of managers to make decisions.
🔹 If quorum is lost, Swarm stops making decisions until the quorum is restored.
container_definitions = jsonencode([
  {
    name      = "nginx"
    image     = "nginx"
    cpu       = 256
    memory    = 512
    essential = true
    portMappings = [
      {
        containerPort = 80
        hostPort      = 80
      }
    ]
  }
])
}
terraform apply -auto-approve
✔️Output:
Apply complete! Resources: 1 added.
network_configuration {
  subnets          = aws_subnet.public[*].id
  security_groups  = [aws_security_group.ecs_sg.id]
  assign_public_ip = true
}
}
terraform apply -auto-approve
✔️Output:
Apply complete! Resources: 1 added.
services:
  web:
    image: nginx:latest
    container_name: web_container
    ports:
      - "80:80"
    networks:
      - app_network
    depends_on:
      - backend
  backend:
    image: node:14
    container_name: backend_container
    build: ./backend
    networks:
      - app_network
    environment:
      DB_HOST: db
      DB_PORT: 3306
    depends_on:
      - db
  db:
    image: mysql:5.7
    container_name: db_container
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: appdb
    networks:
      - app_network
    volumes:
      - db_data:/var/lib/mysql
networks:
  app_network:
    driver: bridge
volumes:
  db_data:
    driver: local
Scaling in Kubernetes
In Kubernetes, you would use a Deployment to scale the backend service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: backend_image
          ports:
            - containerPort: 8080
This command will deploy all services defined in the docker-compose.yml to the
Swarm cluster.
3. Verify Deployment:
docker stack services my_stack
This command will list the services deployed in the stack and their current status.
7. Continuous Integration and Continuous Deployment (CI/CD)
To maintain consistent deployments and enable automated testing and release,
integrating Docker with a CI/CD pipeline is crucial. Popular tools like Jenkins, GitLab
CI, and CircleCI can be used to automate building and deploying multi-service
applications.
build:
  stage: build
  script:
    - docker build -t my-app .
deploy:
  stage: deploy
  script:
    - docker-compose up -d
This pipeline:
Builds the Docker image.
Deploys the application using docker-compose up -d.
For example, in Docker Compose, you can integrate Fluentd as a logging driver:
services:
  web:
    image: nginx
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224
scrape_configs:
  - job_name: 'docker'
    static_configs:
      - targets: ['host.docker.internal:9323']
📌 This tells Prometheus to scrape Docker metrics from
host.docker.internal:9323 (Docker's built-in metric endpoint).
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    networks:
      - monitoring
    depends_on:
      - prometheus
networks:
  monitoring:
    driver: bridge
📌 Explanation:
Prometheus runs on port 9090 and scrapes data.
Grafana runs on port 3000 and fetches metrics from Prometheus.
Both services share a network (monitoring) for communication.
5️ Analyzing Bottlenecks
📌 Look for:
✅ CPU Spikes → High CPU usage means container may need more resources.
✅ High Memory Usage → Containers may be leaking memory.
✅ Slow Response Times → If response time increases under load, optimize code/config.
✅ Container Restarts → If containers restart under load, scale replicas or debug.
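A tiny sketch of flagging bottlenecks from resource samples, in the spirit of reading docker stats output. The container names, numbers, and thresholds below are invented for illustration.

```python
# Hypothetical samples, as if parsed from `docker stats --no-stream`:
# (container name, CPU %, memory MiB)
stats = [
    ("web.1", 92.5, 310),
    ("web.2", 12.0, 120),
    ("db.1",  45.0, 480),
]

CPU_LIMIT, MEM_LIMIT = 80.0, 450  # example thresholds, tune per workload

# Flag any container exceeding either threshold
flagged = [name for name, cpu, mem in stats
           if cpu > CPU_LIMIT or mem > MEM_LIMIT]
print(flagged)  # ['web.1', 'db.1']
```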
ping <server-ip>
ssh user@server-ip
docker ps -a
✅ If a container has exited, check logs:
📌 Solution:
services:
  my_service:
    mem_limit: 512m
✅ Solution:
df -h
📌 If /var/lib/docker is full:
du -sh /var/lib/docker/*
✅ Solution:
Prune Unused Resources:
Add:
"data-root": "/mnt/docker-data"
docker network ls
docker stats
✅ Solution:
services:
  my_service:
    mem_limit: "512m"
    cpu_shares: 512
✅ Check logs:
docker node ls
📌 Possible Fixes:
Restart Docker:
✅ Summary of Fixes
Problem                  Check                        Fix
Docker Down              systemctl status docker      Restart with systemctl restart docker
Container Crashes        docker logs <container_id>   Restart, fix config, increase memory
Overlay Network Corrupt  docker network inspect       Remove & recreate network
🚀 With these steps, you can troubleshoot and recover from most server downtime issues in
Docker!
Configuration mistakes can cause Docker containers to fail at startup, crash unexpectedly, or behave
incorrectly. These can stem from:
✅ Incorrect environment variables
✅ Missing or incorrect volume mounts
✅ Wrong network configuration
✅ Errors in docker-compose.yml or Dockerfile
✅ Incorrect permissions
docker ps
If your container is not running, check all containers:
docker ps -a
Permission denied
✅ Solution:
services:
  myapp:
    environment:
      - DB_HOST=db.example.com
      - DB_USER=root
cat .env
Containers may fail if the mounted volume does not exist or lacks permissions.
✅ Solution:
Verify if the host directory exists:
ls -l /data
services:
  myapp:
    volumes:
      - /data:/app/data
✅ Solution:
services:
  db:
    networks:
      - my_network
  app:
    networks:
      - my_network
networks:
  my_network:
    driver: bridge
docker-compose config
✅ Solution:
Incorrect indentation (keys not nested under the service):
services:
app:
image: myapp
ports:
- 8080:80
✅ Correct indentation:
services:
  app:
    image: myapp
    ports:
      - 8080:80
Common Issues
✅ Correct:
WORKDIR /app
CMD ["./app"]
ENV DB_HOST=localhost
ARG DB_HOST
ENV DB_HOST=${DB_HOST}
✅ Solution:
If a container lacks root privileges, it may fail to access files or execute commands.
✅ Solution:
USER 1000
✅ Solution:
✅ Summary of Fixes
Wrong Volume Mounts    docker inspect <container_id> | jq '.[0].Mounts'
Health Check Failures  docker inspect --format='{{json .State.Health.Status}}' <container_id>   Fix HEALTHCHECK, ensure app is running
📌 With these checks, you can fix most Docker configuration issues and ensure smooth container
operation! 🚀