Docker Guide
Table of Contents

About the author

Sponsors
Materialize
DigitalOcean

DevDojo
The DevDojo is a resource to learn all things web development and web design. Learn on your lunch break or wake up and enjoy a cup of coffee with us to learn something new. Join this developer community, and we can all learn together, build together, and grow together.

Ebook PDF Generation Tool

Book Cover
Prayag Parmar

License
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Chapter 1: Introduction to Docker

What is Docker?

Key Concepts:

Why Use Docker?

Docker Architecture
┌──────────────┐      ┌──────────────────────────────────┐
│  Docker CLI  │      │           Docker Host            │
│   (docker)   │◄────►│  ┌────────────┐  ┌────────────┐  │
└──────────────┘      │  │   Docker   │  │ Containers │  │
                      │  │   Daemon   │◄►│    and     │  │
                      │  │  (dockerd) │  │   Images   │  │
                      │  └────────────┘  └────────────┘  │
                      └──────────────────────────────────┘
                                      ▲
                                      │
                                      ▼
                      ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
                      │         Docker Registry          │
                      │          (Docker Hub)            │
                      └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
Containers vs. Virtual Machines
While both containers and virtual machines (VMs) are used for isolating
applications, they differ in several key aspects:
Basic Docker Workflow
# Build an image
docker build -t myapp:v1 .
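The workflow typically continues by running the image and publishing it; a sketch using the myapp image from above (the myuser registry account is a placeholder):

# Run a container from the image
docker run -d -p 8080:80 myapp:v1

# Tag and push the image to a registry
docker tag myapp:v1 myuser/myapp:v1
docker push myuser/myapp:v1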
Docker Components

Use Cases for Docker

Conclusion

Chapter 2: Installing Docker

Docker Editions
Installing Docker on Linux
Docker runs natively on Linux, making it the ideal platform for Docker
containers. There are two main methods to install Docker on Linux:
using the convenience script or manual installation for specific
distributions.
This method is ideal for quick setups and testing environments.
However, for production environments, you might want to consider the
manual installation method for more control over the process.
For more control over the installation process or if you prefer to follow
distribution-specific steps, you can manually install Docker. Here are
instructions for popular Linux distributions:
Ubuntu

2. Install prerequisites:

4. Set up the stable repository:

6. Install Docker:
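The commands for these steps were lost to page breaks in this copy; a sketch of the apt-based sequence from Docker's documentation that they describe:

# 1. Update the apt package index
sudo apt-get update

# 2. Install prerequisites
sudo apt-get install -y ca-certificates curl gnupg lsb-release

# 3. Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
    sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

# 4. Set up the stable repository
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] \
    https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# 5. Update the package index again
sudo apt-get update

# 6. Install Docker Engine
sudo apt-get install -y docker-ce docker-ce-cli containerd.io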
CentOS

3. Install Docker:
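A sketch of the yum-based steps from Docker's documentation:

# Add Docker's repository
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo

# Install Docker Engine
sudo yum install -y docker-ce docker-ce-cli containerd.io

# Start and enable the daemon
sudo systemctl start docker
sudo systemctl enable docker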
Installing Docker on macOS
1. Download Docker Desktop for Mac from the official Docker website: https://www.docker.com/products/docker-desktop
2. Double-click the downloaded .dmg file and drag the Docker icon to your Applications folder.
Installing Docker on Windows
Post-Installation Steps
After installing Docker, there are a few steps you should take:
docker version
docker run hello-world
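To run Docker without sudo, add your user to the docker group; this is the change the following note refers to:

sudo usermod -aG docker $USER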
Note: You'll need to log out and back in for this change to take effect.
Docker Desktop vs Docker Engine
Troubleshooting Common Installation Issues

Updating Docker
To update Docker:

Uninstalling Docker

Conclusion

Chapter 3: Working with Docker Containers

Running Your First Container
Basic Docker Commands
Here are some essential Docker commands for working with containers:
Listing Containers
docker ps
docker ps -a
To restart a container:
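The restart subcommand takes the container's ID or name:

docker restart <container_id_or_name>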
Removing Containers
docker rm <container_id_or_name>
docker rm -f <container_id_or_name>
Running Containers in Different Modes
Detached Mode
Interactive Mode
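Both modes are flags on docker run; a minimal sketch using stock images:

# Detached: run in the background and print the container ID
docker run -d nginx

# Interactive: allocate a TTY and attach stdin
docker run -it ubuntu /bin/bash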
Port Mapping
Example:
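The -p flag maps a host port to a container port; for instance, exposing nginx on host port 8080:

docker run -d -p 8080:80 nginx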
Working with Container Logs
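The core command is docker logs; -f follows the output in real time:

docker logs <container_id_or_name>
docker logs -f <container_id_or_name>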
Executing Commands in Running Containers
Example:
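docker exec runs a command inside a running container, most commonly an interactive shell:

docker exec -it <container_id_or_name> /bin/bash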
Practical Example: Running an Apache Container
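A sketch of this example using the official httpd image (the container name and host port are illustrative):

docker run -d --name my-apache -p 8080:80 httpd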
docker ps
Container Resource Management
Limiting Memory
Limiting CPU
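Both limits are docker run flags:

# Cap memory at 512 MB
docker run -d --memory=512m nginx

# Cap CPU at half a core
docker run -d --cpus=0.5 nginx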
Container Networking
Listing Networks
docker network ls
Creating a Network
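For example:

docker network create my_network

# Attach a container to it
docker run -d --network my_network nginx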
Data Persistence with Volumes
Creating a Volume
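For example:

docker volume create my_volume

# Mount it into a container
docker run -d -v my_volume:/data nginx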
Container Health Checks
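Health checks can be configured at run time with the --health flags; a sketch (assumes curl exists in the image):

docker run -d \
    --health-cmd="curl -f http://localhost/ || exit 1" \
    --health-interval=30s \
    nginx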
Cleaning Up

Conclusion

Chapter 4: What are Docker Images

Key Concepts
Working with Docker Images
Listing Images
docker images
docker image ls
Example:
docker run <image_name>:<tag>
Example:
Image Information
Removing Images
To remove an image:

docker rmi <image_name>:<tag>

or

docker image rm <image_name>:<tag>
Building Custom Images
Using a Dockerfile
Example Dockerfile:
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y nginx
COPY ./my-nginx.conf /etc/nginx/nginx.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
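To build an image from the Dockerfile above (the my-nginx:v1 tag is illustrative):

docker build -t my-nginx:v1 .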
Image Tagging
Example:
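docker tag creates an additional name for an existing image; the myuser account below is a placeholder:

docker tag myapp:v1 myuser/myapp:v1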
Pushing Images to Docker Hub
docker login
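After logging in, push the tagged image under your Docker Hub account (myuser is a placeholder):

docker push myuser/myapp:v1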
Image Layers and Caching
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y nginx
COPY ./static-files /var/www/html
COPY ./config-files /etc/nginx
Multi-stage Builds
Example:
# Build stage
FROM golang:1.16 AS build
WORKDIR /app
COPY . .
RUN go build -o myapp
# Production stage
FROM alpine:3.14
COPY --from=build /app/myapp /usr/local/bin/myapp
CMD ["myapp"]
Image Scanning and Security

Best Practices for Working with Images

Image Management and Cleanup

Conclusion

Chapter 5: What is a Dockerfile
Anatomy of a Dockerfile
Let's dive deep into each of these components and the instructions
used to implement them.
Dockerfile Instructions
FROM
The FROM instruction initializes a new build stage and sets the base
image for subsequent instructions.
FROM ubuntu:20.04
LABEL
ENV
ENV sets environment variables that persist in the built image.
WORKDIR
WORKDIR /app
Both COPY and ADD instructions copy files from the host into the image.
COPY package.json .
ADD https://example.com/big.tar.xz /usr/src/things/
COPY is generally preferred for its simplicity. ADD has some extra
features like tar extraction and remote URL support, but these can
make build behavior less predictable.
RUN
RUN executes commands in a new layer on top of the current image and
commits the results.
It's a best practice to chain commands with && and clean up in the same
RUN instruction to keep layers small.
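For example, installing a package and cleaning the apt cache in a single layer:

RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*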
CMD

ENTRYPOINT
ENTRYPOINT is often used in combination with CMD, where ENTRYPOINT defines the executable and CMD supplies default arguments.
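A common pattern:

ENTRYPOINT ["ping"]
CMD ["localhost"]

Running the image with no arguments pings localhost; passing a different host on docker run overrides only the CMD.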
EXPOSE
EXPOSE 80 443
VOLUME
VOLUME /data
ARG
ARG defines a variable that users can pass at build-time to the builder
with the docker build command.
ARG VERSION=latest
Best Practices for Writing Dockerfiles
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
6. Use specific tags: Avoid the latest tag for base images to ensure reproducible builds.

7. Use WORKDIR: Prefer WORKDIR over proliferating instructions like RUN cd … && do-something.

8. Use COPY instead of ADD: Unless you explicitly need the extra functionality of ADD, use COPY for transparency.
Advanced Dockerfile Concepts
Health Checks
You can use the HEALTHCHECK instruction to tell Docker how to test a
container to check that it's still working.
The exec form is preferred as it's more explicit and avoids issues with
shell string munging.
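A typical instruction (assumes curl is available in the image):

HEALTHCHECK --interval=30s --timeout=3s \
    CMD curl -f http://localhost/ || exit 1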
BuildKit
export DOCKER_BUILDKIT=1
Conclusion
Chapter 6: Docker Networking

Docker Network Drivers
Working with Docker Networks
Listing Networks
docker network ls
This command shows the network ID, name, driver, and scope for each
network.
Inspecting Networks
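For example, to see the default bridge network's configuration and attached containers:

docker network inspect bridge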
Creating a Network
Example:
docker network create --driver bridge my_custom_network
You can specify additional options like subnet, gateway, IP range, etc.:
Example:
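A sketch with illustrative values:

docker network create --driver bridge \
    --subnet=172.18.0.0/16 \
    --gateway=172.18.0.1 \
    my_custom_network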
Removing Networks
To remove a network:
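Using the network created earlier as an example:

docker network rm my_custom_network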
Deep Dive into Network Drivers
Bridge Networks
Bridge networks are the most commonly used network type in Docker.
They are suitable for containers running on the same Docker daemon
host.
Host Networks
Example:
docker run --network host -d nginx
Overlay Networks
Then, when creating a service in swarm mode, you can attach it to this
network:
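An overlay network is created with the overlay driver; a sketch, with my_overlay as a placeholder name:

docker network create -d overlay my_overlay
docker service create --network my_overlay --name web nginx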
MacVLAN Networks
Example:
docker network create -d macvlan \
--subnet=192.168.0.0/24 \
--gateway=192.168.0.1 \
-o parent=eth0 my_macvlan_net
Network Troubleshooting
Best Practices
Advanced Topics
Network Encryption
Network Plugins
Service Discovery
Conclusion

Chapter 7: Docker Volumes

Why Use Docker Volumes?
Types of Docker Volumes
1. Named Volumes
2. Anonymous Volumes
3. Bind Mounts
Bind mounts map a specific path of the host machine to a path in the
container. They're useful for development environments.
Using a bind mount:
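For example, mounting the current directory into the container:

docker run -d -v $(pwd):/app nginx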
Working with Docker Volumes
Listing Volumes
docker volume ls
Inspecting Volumes
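For example:

docker volume inspect my_volume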
Removing Volumes
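Volumes are removed by name; prune clears all unused ones:

docker volume rm my_volume
docker volume prune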
Backing Up Volumes
To backup a volume:
docker run --rm -v my_volume:/source -v /path/on/host:/backup \
    ubuntu tar cvf /backup/backup.tar /source
Restoring Volumes
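A restore that mirrors the backup command above (the flag choices are one reasonable option):

docker run --rm -v my_volume:/target -v /path/on/host:/backup \
    ubuntu tar xvf /backup/backup.tar -C /target --strip-components=1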
Volume Drivers
Local (default)
NFS
AWS EBS
Azure File Storage
Best Practices for Using Docker Volumes
6. Use volume labels: Labels can help you organize and manage your volumes.
Advanced Volume Concepts
1. Read-Only Volumes
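Appending :ro to the mount makes it read-only inside the container:

docker run -d -v my_volume:/data:ro nginx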
2. Tmpfs Mounts
Tmpfs mounts are stored in the host system's memory only, which can
be useful for storing sensitive information:
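For example:

docker run -d --tmpfs /tmp nginx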
4. Volume Plugins
docker plugin install <plugin_name>
docker volume create -d <plugin_name> my_volume
Troubleshooting Volume Issues
Conclusion
Chapter 8: Docker Compose
Note: Docker Compose is now integrated into Docker CLI. The new
command is docker compose instead of docker-compose. We'll use the
new command throughout this chapter.
Key Benefits of Docker Compose
The docker-compose.yml File
version: '3.8'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/code
    environment:
      FLASK_ENV: development
  redis:
    image: "redis:alpine"
Key Concepts in Docker Compose
Basic Docker Compose Commands
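The day-to-day commands, using the docker compose syntax noted at the start of this chapter:

# Start services in the background
docker compose up -d

# List running services
docker compose ps

# Follow logs
docker compose logs -f

# Stop and remove containers and networks
docker compose down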
Advanced Docker Compose Features
1. Environment Variables
You can use .env files or set them directly in the compose file:
version: '3.8'
services:
  web:
    image: "webapp:${TAG}"
    environment:
      - DEBUG=1
2. Extending Services
version: '3.8'
services:
  web:
    extends:
      file: common-services.yml
      service: webapp
3. Healthchecks
version: '3.8'
services:
  web:
    image: "webapp:latest"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 40s
Practical Examples
version: '3.8'
services:
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: somewordpress
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    ports:
      - "8000:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress
      WORDPRESS_DB_NAME: wordpress
volumes:
  db_data: {}
2. Services: We define two services: db and wordpress.

a. db service:

b. wordpress service:

3. Volumes: db_data: {}: This creates a named volume that Docker manages. It's used to persist the MySQL data.
version: '3.8'
services:
  flask:
    build: ./flask
    environment:
      - FLASK_ENV=development
    volumes:
      - ./flask:/code
  redis:
    image: "redis:alpine"
  nginx:
    image: "nginx:alpine"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "80:80"
    depends_on:
      - flask
networks:
  frontend:
  backend:
volumes:
  db-data:
a. flask service:
environment: Sets FLASK_ENV=development, which enables
debug mode in Flask.
volumes: Mounts the local ./flask directory to /code in the
container. This is useful for development as it allows you to
make changes to your code without rebuilding the container.
b. redis service:
c. nginx service:
1. The flask service is built from ./flask, so that directory needs a Dockerfile to build it.
2. You need an nginx.conf file in the same directory as your docker-compose.yml.
3. Run docker compose up -d to start the services.
The use of Alpine-based images for Redis and Nginx helps to keep the
overall image size small, which is beneficial for deployment and scaling.
Best Practices for Docker Compose
Scaling Services
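Scaling is a flag on up; web here refers to the service name from the earlier example:

docker compose up -d --scale web=3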
Networking in Docker Compose
version: '3.8'
services:
  web:
    networks:
      - frontend
      - backend
  db:
    networks:
      - backend
networks:
  frontend:
  backend:
Volumes in Docker Compose
Compose also lets you create named volumes that can be reused
across multiple services:
version: '3.8'
services:
  db:
    image: postgres
    volumes:
      - data:/var/lib/postgresql/data
volumes:
  data:
Conclusion
Chapter 9: Docker Security Best Practices

1. Keep Docker Updated
Always use the latest version of Docker to benefit from the most recent security patches.
2. Use Official Images
version: '3.8'
services:
  web:
    image: nginx:latest  # Official Nginx image
3. Scan Images for Vulnerabilities
Use tools like Docker Scout or Trivy to scan your images for known
vulnerabilities.
4. Limit Container Resources
version: '3.8'
services:
  web:
    image: nginx:latest
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 50M
5. Use Non-Root Users
FROM node:14
RUN groupadd -r myapp && useradd -r -g myapp myuser
USER myuser
6. Use Secret Management
For sensitive data like passwords and API keys, use Docker secrets:
version: '3.8'
services:
  db:
    image: mysql
    secrets:
      - db_password
secrets:
  db_password:
    external: true
7. Enable Content Trust
export DOCKER_CONTENT_TRUST=1
docker push myrepo/myimage:latest
8. Use Read-Only Containers
version: '3.8'
services:
  web:
    image: nginx
    read_only: true
    tmpfs:
      - /tmp
      - /var/cache/nginx
9. Implement Network Segmentation
version: '3.8'
services:
  frontend:
    networks:
      - frontend
  backend:
    networks:
      - backend
networks:
  frontend:
  backend:
10. Regular Security Audits
Regularly audit your Docker environment using tools like Docker Bench
for Security:
docker run -it --net host --pid host --userns host \
    --cap-add audit_control \
    -e DOCKER_CONTENT_TRUST=$DOCKER_CONTENT_TRUST \
    -v /var/lib:/var/lib \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /usr/lib/systemd:/usr/lib/systemd \
    -v /etc:/etc --label docker_bench_security \
    docker/docker-bench-security
11. Use Security-Enhanced Linux (SELinux) or AppArmor
12. Implement Logging and Monitoring
version: '3.8'
services:
  web:
    image: nginx
    logging:
      driver: "json-file"
      options:
        max-size: "200k"
        max-file: "10"
Conclusion
Chapter 10: Docker in Production: Orchestration with Kubernetes
Kubernetes is a topic of its own, but here are some key concepts and
best practices for using Kubernetes with Docker in production
environments.
Key Kubernetes Concepts
Setting Up a Kubernetes Cluster
minikube start
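Once Minikube is up, you can verify the cluster with kubectl:

kubectl cluster-info
kubectl get nodes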
Deploying a Docker Container to Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
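Saving the manifest and applying it (the filename is illustrative):

kubectl apply -f nginx-deployment.yaml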
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
Scaling in Kubernetes
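Using the nginx-deployment from the previous section as an example:

kubectl scale deployment/nginx-deployment --replicas=5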
Rolling Updates
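A sketch, again using the nginx-deployment example:

# Update the image; Kubernetes rolls pods over gradually
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1

# Watch the rollout
kubectl rollout status deployment/nginx-deployment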
Monitoring and Logging
Kubernetes Dashboard
Persistent Storage in Kubernetes
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Kubernetes Networking
Kubernetes Secrets
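Creating the my-secret secret referenced in the Pod spec below (the literal value is a placeholder):

kubectl create secret generic my-secret --from-literal=password=S3cr3t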
Use in a Pod:
spec:
  containers:
    - name: myapp
      image: myapp
      env:
        - name: SECRET_PASSWORD
          valueFrom:
            secretKeyRef:
              name: my-secret
              key: password
Helm: The Kubernetes Package Manager
Best Practices for Kubernetes in Production
Conclusion
Chapter 11: Docker Performance Optimization
1. Optimizing Docker Images
Multi-stage builds can significantly reduce the size of your final Docker
image:
# Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .
# Final stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
CMD ["./main"]
Use .dockerignore
Create a .dockerignore file to exclude unnecessary files from the build context:

.git
*.md
*.log
2. Container Resource Management
version: '3'
services:
  app:
    image: myapp
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
3. Networking Optimization
If you're experiencing slow DNS resolution, you can use the --dns
option:
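For example (the resolver address is illustrative):

docker run -d --dns 8.8.8.8 nginx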
4. Storage Optimization
version: '3'
services:
  db:
    image: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
5. Logging and Monitoring
version: '3'
services:
  app:
    image: myapp
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

version: '3'
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
6. Docker Daemon Optimization
{
  "storage-driver": "overlay2"
}

{
  "live-restore": true
}
7. Application-Level Optimization
FROM alpine:3.14
RUN apk add --no-cache python3
8. Benchmarking and Profiling
docker stats
9. Orchestration-Level Optimization
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 50

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 3
Conclusion
Chapter 12: Docker Troubleshooting and Debugging
Even with careful planning and best practices, issues can arise when
working with Docker. This chapter covers common problems you might
encounter and provides strategies for effective troubleshooting and
debugging.
1. Container Lifecycle Issues
2. Networking Issues
3. Storage and Volume Issues
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect <volume_name>
4. Resource Constraints
5. Image-related Issues
6. Docker Daemon Issues
7. Debugging Techniques
Interactive Debugging
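Common starting points:

# Get a shell inside a running container
docker exec -it <container_id> /bin/sh

# Dump a container's full configuration
docker inspect <container_id>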
docker events
Logging
8. Performance Debugging
9. Docker Compose Troubleshooting
Conclusion
Chapter 13: Advanced Docker Concepts and Features
1. Multi-stage Builds
# Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
RUN go build -o main .
# Final stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
CMD ["./main"]
This approach reduces the final image size by only including necessary
artifacts from the build stage.
2. Docker BuildKit
export DOCKER_BUILDKIT=1
3. Custom Bridge Networks
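A sketch; containers on a user-defined bridge can reach each other by name:

docker network create my_bridge
docker run -d --name web --network my_bridge nginx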
4. Docker Contexts
# List contexts
docker context ls
# Switch context
docker context use my-remote
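The my-remote context referenced above would be created along these lines (the host is a placeholder):

docker context create my-remote --docker "host=ssh://user@remote-host"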
5. Docker Content Trust (DCT)
# Enable DCT
export DOCKER_CONTENT_TRUST=1
6. Docker Secrets
# Create a secret
echo "mypassword" | docker secret create my_secret -
7. Docker Health Checks
8. Docker Plugins
# Install a plugin
docker plugin install vieux/sshfs
9. Docker Experimental Features
{
"experimental": true
}
10. Container Escape Protection
11. Custom Dockerfile Instructions
12. Docker Manifest
13. Docker Buildx
Buildx is a CLI plugin that extends the docker build command with the
full support of the features provided by BuildKit:
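A sketch of a multi-platform build (the image name is a placeholder):

docker buildx create --use
docker buildx build --platform linux/amd64,linux/arm64 \
    -t myuser/myapp:latest --push .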
14. Docker Compose Profiles
services:
  frontend:
    image: frontend
    profiles: ["frontend"]
  backend:
    image: backend
    profiles: ["backend"]
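Profiles are then selected at start time:

docker compose --profile frontend up -d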
Conclusion
Chapter 14: Docker in CI/CD Pipelines
1. Docker in Continuous Integration
# .gitlab-ci.yml example
build_and_test:
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker build -t myapp:${CI_COMMIT_SHA} .
    - docker run myapp:${CI_COMMIT_SHA} npm test
Parallel Testing
2. Docker in Continuous Deployment
customImage.push()

For Kubernetes:
# Kubernetes deployment in CircleCI
deployment:
  kubectl:
    command: |
      kubectl set image deployment/myapp myapp=myrepo/myapp:${CIRCLE_SHA1}
3. Docker Compose in CI/CD
# Travis CI example
services:
  - docker
before_install:
  - docker-compose up -d
  - docker-compose exec -T app npm install
script:
  - docker-compose exec -T app npm test
after_success:
  - docker-compose down
4. Security Scanning
5. Performance Testing
6. Environment-Specific Configurations
ARG CONFIG_FILE=default.conf
COPY config/${CONFIG_FILE} /app/config.conf

build:
  script:
    - docker build --build-arg CONFIG_FILE=${ENV}.conf -t myapp:${CI_COMMIT_SHA} .
7. Caching in CI/CD
- uses: actions/cache@v2
  with:
    path: /tmp/.buildx-cache
    key: ${{ runner.os }}-buildx-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-buildx-
- name: Build and push
  uses: docker/build-push-action@v2
  with:
    push: true
    tags: user/app:latest
    cache-from: type=local,src=/tmp/.buildx-cache
    cache-to: type=local,dest=/tmp/.buildx-cache
8. Blue-Green Deployments with Docker
9. Monitoring and Logging in CI/CD
Conclusion
Integrating Docker into your CI/CD pipeline can greatly enhance your
development and deployment processes. It provides consistency across
environments, improves testing efficiency, and streamlines
deployments. By leveraging Docker in your CI/CD workflows, you can
achieve faster, more reliable software delivery.
Chapter 15: Docker and Microservices Architecture
1. Principles of Microservices
2. Dockerizing Microservices
FROM node:14-alpine
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
3. Inter-service Communication
REST API
// Express.js example
const express = require('express');
const app = express();

// Endpoint matching the /api/data test used later in this chapter
app.get('/api/data', (req, res) => {
  res.json({ message: 'Hello from service-a' });
});

app.listen(3000);
Message Queues
Using RabbitMQ:
# Dockerfile
FROM node:14-alpine
RUN npm install amqplib
COPY . .
CMD ["node", "consumer.js"]
// consumer.js
const amqp = require('amqplib');

async function consume() {
  // Connection URL and queue name are illustrative
  const conn = await amqp.connect('amqp://rabbitmq');
  const channel = await conn.createChannel();
  await channel.assertQueue('tasks');
  channel.consume('tasks', (msg) => {
    console.log("Received:", msg.content.toString());
    channel.ack(msg);
  });
}

consume();
4. Service Discovery
Using Consul:
version: '3'
services:
  consul:
    image: consul:latest
    ports:
      - "8500:8500"
  service-a:
    build: ./service-a
    environment:
      - CONSUL_HTTP_ADDR=consul:8500
  service-b:
    build: ./service-b
    environment:
      - CONSUL_HTTP_ADDR=consul:8500
5. API Gateway
http {
    upstream service_a {
        server service-a:3000;
    }
    upstream service_b {
        server service-b:3000;
    }
    server {
        listen 80;
        location /api/service-a {
            proxy_pass http://service_a;
        }
        location /api/service-b {
            proxy_pass http://service_b;
        }
    }
}
6. Data Management
version: '3'
services:
  service-a:
    build: ./service-a
    depends_on:
      - db-a
  db-a:
    image: postgres:13
    environment:
      POSTGRES_DB: service_a_db
      POSTGRES_PASSWORD: password
  service-b:
    build: ./service-b
    depends_on:
      - db-b
  db-b:
    image: mysql:8
    environment:
      MYSQL_DATABASE: service_b_db
      MYSQL_ROOT_PASSWORD: password
7. Monitoring Microservices
version: '3'
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
8. Scaling Microservices
# Initialize swarm
docker swarm init
# Deploy stack
docker stack deploy -c docker-compose.yml myapp
# Scale a service
docker service scale myapp_service-a=3
9. Testing Microservices
Unit Testing
// Jest example (assumes supertest and the Express app module are available)
const request = require('supertest');
const app = require('./app');

test('API returns correct data', async () => {
  const response = await request(app).get('/api/data');
  expect(response.statusCode).toBe(200);
  expect(response.body).toHaveProperty('message');
});
Integration Testing
version: '3'
services:
  app:
    build: .
    depends_on:
      - test-db
  test-db:
    image: postgres:13
    environment:
      POSTGRES_DB: test_db
      POSTGRES_PASSWORD: test_password
  test:
    build:
      context: .
      dockerfile: Dockerfile.test
    depends_on:
      - app
      - test-db
    command: ["npm", "run", "test"]
10. Deployment Strategies
Blue-Green Deployment
Conclusion
Chapter 16: Docker for Data Science and Machine Learning
Docker has become an essential tool in the data science and machine
learning ecosystem, providing reproducibility, portability, and scalability
for complex data processing and model training workflows.
1. Setting Up a Data Science Environment
FROM python:3.8
RUN pip install jupyter pandas numpy matplotlib scikit-learn
WORKDIR /notebooks
EXPOSE 8888
CMD ["jupyter", "notebook", "--ip='*'", "--port=8888", "--no-browser", "--allow-root"]
2. Managing Dependencies with Docker
FROM continuumio/miniconda3
COPY environment.yml .
RUN conda env create -f environment.yml
SHELL ["conda", "run", "-n", "myenv", "/bin/bash", "-c"]
3. GPU Support for Machine Learning
FROM nvidia/cuda:11.0-base
RUN pip install tensorflow-gpu
COPY train.py .
CMD ["python", "train.py"]
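Running such an image with GPU access (requires the NVIDIA Container Toolkit on the host):

docker run --gpus all my-ml-image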
4. Distributed Training with Docker Swarm
version: '3'
services:
  trainer:
    image: my-ml-image
    deploy:
      replicas: 4
    command: ["python", "distributed_train.py"]
5. MLOps with Docker
from flask import Flask, request, jsonify
import pickle

app = Flask(__name__)
model = pickle.load(open('model.pkl', 'rb'))

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json
    prediction = model.predict([data['features']])
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
FROM python:3.8
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
COPY model.pkl .
EXPOSE 5000
CMD ["python", "app.py"]
6. Data Pipeline with Apache Airflow
version: '3'
services:
  webserver:
    image: apache/airflow
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
    command: webserver
  scheduler:
    image: apache/airflow
    volumes:
      - ./dags:/opt/airflow/dags
    command: scheduler
7. Reproducible Research with Docker
FROM rocker/rstudio
RUN R -e "install.packages(c('ggplot2', 'dplyr'))"
COPY analysis.R .
CMD ["R", "-e", "source('analysis.R')"]
8. Big Data Processing with Docker
Spark Cluster
version: '3'
services:
  spark-master:
    image: bitnami/spark:3
    environment:
      - SPARK_MODE=master
    ports:
      - "8080:8080"
  spark-worker:
    image: bitnami/spark:3
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
    depends_on:
      - spark-master
9. Automated Machine Learning (AutoML) with Docker
FROM python:3.8
RUN pip install auto-sklearn
COPY automl_script.py .
CMD ["python", "automl_script.py"]
10. Hyperparameter Tuning at Scale
version: '3'
services:
  optuna-worker:
    image: my-optuna-image
    deploy:
      replicas: 10
    command: ["python", "optimize.py"]
  optuna-dashboard:
    image: optuna/optuna-dashboard
    ports:
      - "8080:8080"
Conclusion
What is Docker Swarm mode
The manager nodes dispatch tasks to the worker nodes, while the worker nodes simply execute those tasks. For high availability, it is recommended to have 3 or 5 manager nodes.
Docker Services
In order to have Docker services, you must first have your Docker
swarm and nodes ready.
Building a Swarm
Then once you've got that ready, install Docker just as we did in the Introduction to Docker Part 1, and then follow the steps here:
Step 1
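Step 1 initializes the swarm on your first manager node (the IP is a placeholder for that node's address):

docker swarm init --advertise-addr <your_node_ip>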
Step 2
Then to get the command that you need to join the rest of the
managers simply run this:
docker swarm join-token manager
Note: This would provide you with the exact command that you need to
run on the rest of the swarm manager nodes. Example:
Step 3
To get the command that you need for joining workers just run:
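As with the managers, Docker prints the exact join command:

docker swarm join-token worker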
The command for workers is quite similar to the one for joining managers, but the token is a bit different.
The output that you would get when joining a manager would look like
this:
Step 4
Then once you have your join commands, ssh to the rest of your
nodes and join them as workers and managers accordingly.
Managing the cluster
After you've run the join commands on all of your workers and
managers, in order to get some information for your cluster status you
could use these commands:
docker node ls
docker info
Output:
Promote a worker to manager
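Promotion is a single command run from a manager node:

docker node promote <node_name>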
Also note that each manager also acts as a worker, so from your docker
info output you should see 6 workers and 3 manager nodes.
Using Services
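A service is created with docker service create; a sketch matching this example's bobby-web service with 5 replicas (the image is illustrative):

docker service create --name bobby-web -p 80:80 --replicas 5 nginx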
docker service ls
Output:
Then in order to get a list of the running containers you need to use the
following command:
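That command is docker service ps, which lists the tasks of a service:

docker service ps bobby-web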
Output:
Then you can visit the IP address of any of your nodes and you should be able to see the service! We can visit any node in the swarm and still reach the service.
Scaling a service
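Scaling uses docker service scale; for the bobby-web service in this example:

docker service scale bobby-web=5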
We could try shutting down one of the nodes and see how the swarm
would automatically spin up a new process on another node so that it
matches the desired state of 5 replicas.
Output:
In the screenshot above, you can see how I've shutdown the droplet
called worker-2 and how the replica bobby-web.2 was instantly started
again on another node called worker-01 to match the desired state of 5
replicas.
Output:
This would automatically spin up 2 more containers; you can check this with the docker service ps command:
Then, as a test, try starting the node that we shut down and check whether it picked up any tasks.
Tip: Bringing new nodes to the cluster does not automatically distribute
running tasks.
Deleting a service
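Removal is by service name, again using bobby-web from this example:

docker service rm bobby-web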
Output:
Now you know how to initialize and scale a Docker Swarm cluster! For more information, make sure to go through the official Docker documentation.
Docker Swarm Knowledge Check
Once you've read this post, make sure to test your knowledge with this Docker Swarm Quiz:
https://quizapi.io/predefined-quizzes/common-docker-swarm-interview-questions
Conclusion
As a next step make sure to spin up a few servers, install Docker and
play around with all of the commands that you've learnt from this
eBook!
Other eBooks