Kubernetes 101 Foundational Guide - 2024
Introduction to Containers
    Runtime Environments
    Images
    Registries
Introduction to Kubernetes
    Architecture
    Concepts & Terms
        Nodes
            Control Plane Nodes
            Compute / Worker Nodes
            Infrastructure Nodes (Optional)
        Pods
        Deployments
        ReplicaSets
        Services
Unlike virtual machines, containers operate at the process level. Virtual machines
can be large, cumbersome, and slow to provision; additionally, they require the
overhead of a hypervisor to emulate and isolate virtual machine instances from one
another. Containers instead require only a shared Linux kernel and an OCI-compliant
runtime (e.g. CRI-O, Docker, etc.) to run the process on top of the underlying
operating system. Isolation is then applied through the use of Linux namespaces
and cgroups.
The process of managing and deploying containers has been standardized over the
last several years, making it safe, consistent, and easy to use. When working with
containers, three main concepts need to be understood:
● Runtime Environments
● Images
● Registries
Runtime Environments
The container image format and runtime have been standardized under the Open
Container Initiative (OCI), leading to the emergence of multiple easy-to-use
container engine implementations (e.g. Docker, Podman, CRI-O, etc.).
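As a minimal illustration (assuming Podman or Docker is installed locally, and using the public nginx image purely as an example), a single command is enough to pull and run a containerized process:

# podman run -d --name web -p 8080:80 nginx
# podman ps
# curl http://localhost:8080

The same commands work with docker in place of podman.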
Images
Container images are tar files bundled with an associated JSON manifest, often
referred to as an image bundle. The image contains the process to run, all of the
required installed packages, and any needed configuration details in a standardized
OCI format. This format allows containers to be easily ported across various
systems regardless of the container runtime implementation (e.g. Docker, CRI-O,
rkt, etc.).
These images are portable across various distributions of Linux including Fedora,
Debian, RHEL, CentOS, SUSE, etc. The only caveat to portability is the expected
underlying Linux kernel and system architecture. All container images require the
running host to provide a compatible kernel (for API purposes) and system
architecture (e.g. x86/x86-64, ARM, Power, etc.). If the Linux kernel version variance
is too great and/or the system architecture differs, the container image will not run
in the target environment.
Example Dockerfile
# vi /location/of/image/content/Dockerfile
FROM node:12-alpine
# install the build tools needed by the application's native dependencies
RUN apk add --no-cache python2 g++ make
WORKDIR /app
# copy the application source into the image and install its packages
COPY . .
RUN yarn install --production
# define the process the container runs and the port it listens on
CMD ["node", "src/index.js"]
EXPOSE 3000
Once a Dockerfile has been created, a single command builds and tags the
corresponding image, as depicted below:
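A minimal sketch of that build step, assuming the Dockerfile above sits in the current working directory (the image name getting-started and the tag 1.0 are illustrative):

# docker build -t getting-started:1.0 .

The equivalent Podman command is podman build -t getting-started:1.0 .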
Registries
Container registries are standardized file servers designed to host and distribute
container images across various systems. Registries allow end users to
authenticate, upload, tag, and download images, and to control who is authorized to
do so.
Many public registries are operated by product companies to host curated,
standardized, and supported images of content provided by the given vendor (ex.
Red Hat Container Catalog). Additionally, multiple registry offerings exist to privately
and/or publicly host a user's own container images (ex. Docker Hub). Curated
registries are good for partners who want to deliver solutions together, while
cloud-based registries are good for end users collaborating on work.
Once a target registry has been selected and a container image has been created
locally, the image can be tagged for, uploaded to, and downloaded from that registry.
After an image has been uploaded to a given registry, it can be downloaded by any
container runtime that is compatible with the image:
Example Podman & Docker Commands
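The commands below are a sketch of that workflow; quay.io/myorg/myapp is a hypothetical repository used only to illustrate tagging, pushing, and pulling:

# podman login quay.io
# podman tag getting-started:1.0 quay.io/myorg/myapp:1.0
# podman push quay.io/myorg/myapp:1.0
# podman pull quay.io/myorg/myapp:1.0

Docker provides the equivalent docker login, docker tag, docker push, and docker pull commands.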
The need to orchestrate containers at scale warranted the creation of various
orchestration platforms over time, including Kubernetes, Mesosphere, Docker
Swarm, and many home-grown solutions. Originally created by Google and now
maintained by various large IT organizations (including Red Hat), Kubernetes has
long since become the de facto standard for container orchestration.
Architecture
Abstracting away the underlying infrastructure, Kubernetes acts as a Software
Defined Datacenter for containerized workloads. By providing a series of tightly
coupled technology solutions, Kubernetes enables workloads to dynamically scale
in a standardized way across various Linux systems. Various distributions of
Kubernetes have been created including Amazon EKS, Google GKE, Azure AKS, Red
Hat OpenShift, SUSE Rancher and more; however, the baseline functionality, system
constructs and API definitions have been standardized to create a consistent
experience regardless of the selected distribution. The baseline functionality of
Kubernetes includes (but is not limited to) the following:
Nodes
Nodes refer to any host operating system instance (whether virtual or physical)
included in the Kubernetes cluster that is capable of running a containerized
workload. By default, there are two types of Kubernetes nodes included in every
cluster, each responsible for different tasks (with additional custom node types
being optional).
Pods
Pods are the smallest object construct in Kubernetes. Traditionally, a pod refers to a
single containerized process running on a specified node in a Kubernetes cluster.
However, there are cases where multiple containerized processes run inside the
same individual pod. Running multiple containers in the same pod is appropriate
only when the container workloads MUST run on the exact same node at all times.
An example of this would be if every instance of a containerized workload required a
web proxy running on the same host as the workload itself. In all other instances, a
pod should be equivalent to one running container instance.
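As an illustration of the multi-container case described above, the following sketch (image names and ports are illustrative) runs an application container and a web proxy side by side in a single pod, guaranteeing they always share the same node:

# vi pod-with-sidecar-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  containers:
  # main application container (hypothetical image)
  - name: app
    image: images.my-company.example/app:v4
    ports:
    - containerPort: 8080
  # web proxy that must always run on the same host as the application
  - name: web-proxy
    image: nginx:1.14.2
    ports:
    - containerPort: 80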
Deployments
Deployments are standardized Kubernetes constructs which outline the desired
state of a user-defined workload. Deployment configurations provide a variety of
options including the associated containerized images to deploy, references to
backend persistent storage, number of replicas to run, network port access
requirements, and more.
# vi deployment-ex.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
          name: http-web-svc
# kubectl apply -f deployment-ex.yml
# kubectl get deployments
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           18s
ReplicaSets
ReplicaSets are a Kubernetes construct representing the current number of running
instances of a given series of pods and the desired number that should be running.
If at any point the number of running instances varies from the desired count, the
ReplicaSet is responsible for correcting this by either increasing or decreasing the
number of pods actively scheduled.
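For example, deleting one pod belonging to the nginx-deployment created earlier should cause its ReplicaSet to schedule a replacement almost immediately (the hash suffixes in the names below are illustrative):

# kubectl get replicasets
NAME                          DESIRED   CURRENT   READY   AGE
nginx-deployment-66b6c48dd5   3         3         3       2m
# kubectl delete pod nginx-deployment-66b6c48dd5-abcde
# kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-66b6c48dd5-fghij   1/1     Running   0          4s
nginx-deployment-66b6c48dd5-klmno   1/1     Running   0          2m
nginx-deployment-66b6c48dd5-pqrst   1/1     Running   0          2m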
Services
Services provide a means to expose a series of pods as an internal DNS name, IP
address, and port number that can be addressed globally within a cluster. The set of
pods targeted by a Service is usually determined by a selector. The selector looks for
pods with a particular label (ex. the "app: nginx" label used in the Deployment
example above) and directs traffic to these downstream pods accordingly. Once
deployed, a Service proxies traffic from itself to one of the selected backend pods.
This is especially useful because pods receive new IP addresses every time they are
recreated, and the Service endpoint automatically and transparently tracks these
network changes in real time.
# vi service-example.yml
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - name: name-of-service-port
    protocol: TCP
    port: 80
    targetPort: http-web-svc
# kubectl apply -f service-example.yml
https://kubernetes.io/docs/concepts/services-networking/service/#configuration
Containerized Workload Scheduling
Kubernetes is a Software Defined Datacenter for containerized workloads providing
a pool of shared compute resources for all deployed workloads. Many configurable
parameters exist which can impact how/where workloads get scheduled as well as
actions that trigger deployment. As such it is important to understand the basic
mechanics of how Kubernetes allocates & schedules deployments to ensure
workloads are designed in a resilient and organized manner.
The following topic areas cover some of the most common considerations used
during workload scheduling. Understanding these optional configuration settings
provides greater context for how workloads should be designed and deployed.
The topic areas presented are not an exhaustive list for all the optional
scheduling considerations but are the most commonly used and core
concepts leveraged.
# vi pod-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  # allow this pod to be scheduled onto nodes carrying a matching key1 taint
  tolerations:
  - key: "key1"
    operator: "Exists"
    effect: "NoSchedule"
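The toleration above only has an effect when a matching taint exists on a node. A minimal sketch of applying such a taint (the node name node1 and value value1 are illustrative) would be:

# kubectl taint nodes node1 key1=value1:NoSchedule

With the taint in place, only pods that tolerate key1 with the NoSchedule effect (such as the example above, which uses the Exists operator) can be scheduled onto node1.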
# vi pod-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      # requests are used for scheduling; limits cap actual consumption
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
It is important to note that limits and requests help prevent system resources
from being overcommitted or exhausted on a target node. However, depending
on the nature and language of the underlying containerized process, it is still
possible for memory leaks to occur, which can affect other workloads running
on the same system.
# vi pod-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      # hard requirement: only schedule onto Linux nodes
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      # soft preference: favor nodes carrying the given label when possible
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
Affinity rules are not the only method for targeting a selected system. The
simplest method for targeting a specific system (as a hard requirement) is the
node selector, which can target systems based on node labels (ex. whether the
target node is running on SSDs). For more information on node selectors,
please review the following:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/#create-a-pod-that-gets-scheduled-to-your-chosen-node
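A minimal nodeSelector sketch, assuming the target node has been given a hypothetical disktype=ssd label:

# kubectl label nodes node1 disktype=ssd
# vi pod-nodeselector-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd
spec:
  # hard requirement: only schedule onto nodes labeled disktype=ssd
  nodeSelector:
    disktype: ssd
  containers:
  - name: nginx
    image: nginx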
In addition to node affinity rules, Kubernetes provides the concept of inter-pod
affinity & anti-affinity. These rules take the form "this Pod should (or, in the case of
anti-affinity, should not) run in an X environment if X is already running one or more
Pods that meet rule Y." What makes these rules unique is that the X environment can
be any topology domain, including a specific node, rack, cloud provider availability
zone, or region.
Without these rules, it is possible that for a given user-defined workload, all pods (or
at least a subset) will end up running on one individual node rather than being
distributed across various nodes. This can be troublesome and can increase
downtime during node updates or failures, since multiple instances of a
user-defined workload will be taken offline at the same time.
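As an illustration of such an anti-affinity rule (the pod name and image are illustrative), the following sketch declares that a pod labeled app: nginx should not be scheduled onto a node (topologyKey kubernetes.io/hostname) that is already running another pod with the same label, spreading replicas across nodes:

# vi pod-antiaffinity-example.yml
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-spread
  labels:
    app: nginx
spec:
  affinity:
    podAntiAffinity:
      # do not co-locate with other pods labeled app: nginx on the same node
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: nginx
        topologyKey: kubernetes.io/hostname
  containers:
  - name: nginx
    image: nginx:1.14.2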
The number and description of all configuration options possible for
affinity & anti-affinity rules go well beyond the scope of this document.
For more information, please review the following:
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity
Liveness & Readiness Probes
Defined in a pod specification, liveness and readiness probes are checked by the
kubelet process on the node running the pod to ensure the pod is running and ready
to accept traffic. By default, once the main process of every container within a given
pod is fully running, the pod is considered functional and will start accepting traffic.
However, in many instances a process can be running yet deadlocked or
malfunctioning. It is therefore important to define these checks so that
redeployment of containerized services can proceed as expected should an issue
occur.
# vi deployment-example.yml
---
…
    livenessProbe:
      httpGet:
        path: /healthz
        # liveness-port refers to a named containerPort defined on the container
        port: liveness-port
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Furthermore, pod autoscaling can cause the number of pods to increase or decrease
based on various metrics (e.g. memory, CPU, etc.). Because these changes can
theoretically occur at any time, it is important to ensure deployments are designed
to operate in this type of environment.
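A minimal sketch of a HorizontalPodAutoscaler targeting the nginx-deployment shown earlier (the 50% CPU target and the replica bounds are illustrative, and a metrics source such as metrics-server must be available in the cluster):

# vi hpa-example.yml
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
# kubectl apply -f hpa-example.yml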
Workloads on Kubernetes vs Traditional VMs
Kubernetes deployments have very distinct behavioral characteristics which need to
be considered during software development and deployment alike. Unlike
traditional virtual machine workloads, which are long-lived and typically
point-solution specific, Kubernetes deployments are subject to the dynamic
scheduling, scaling, and restart behaviors described above, as the following
real-world scenarios illustrate.
In a scenario typical of many customers, a 10-year-old web application and a series
of backend web services deployed on JBoss EAP 7.2 were set to be migrated to
Kubernetes. These components had gone through various iterations over the years,
but the overall architecture remained the same. The front-end application required
sticky sessions from a front-end load balancer so that user connections would not
be interrupted, and many of the backend services required stateful data sharing.
Without making any changes to the underlying application and backend services,
migration began. Each component (i.e. front end application, backend services,
messaging brokers, databases, etc) was converted into individual Kubernetes
deployments with their own series of pods. Immediate challenges were noticed as
user traffic would rotate to different deployed pods in the environment and no
shared caching mechanism was being leveraged. This led to inconsistent behavior
(frequent request retries and re-logins required) when user requests were attempted
against the frontend application. Furthermore, when an instance of a pod died or
additional pods were added, transactional information was frequently lost.
One situation that frequently comes up is the usage of CICD as it relates to
Kubernetes. In most organizations, the CICD solution (ex. Jenkins) is maintained by
Operations, while the pipelines themselves are built and maintained by Development
teams. In one particular instance, an organization had a Kubernetes cluster running in
AWS with Jenkins contributing CICD jobs to the environment. The Development
team created a pipeline which was responsible for deploying a workload, running a
series of integration tests and then cleaning up the workload when completed.
Unfortunately, this pipeline took numerous nodes in a given cluster offline causing
the cluster to be unstable.
The pipeline began by creating a temporary NFS instance, deploying the workload
to be tested, mounting the storage and then running the integration test. The
problem in this situation was the clean up job which deleted the temporary NFS
instance prior to deleting the workload. Since Kubernetes nodes are Linux instances,
when the NFS server was removed, the nodes running the pipeline’s application
workload had their file systems locked up indefinitely, waiting for the NFS share to
reconnect. The only way to fix the situation was to either log in as root on the host
and forcibly remove the mount, or restart the nodes.
The worst problem in this entire scenario is that it took weeks to debug the issue,
because the Operations team was unaware the pipeline job was running in the first
place and therefore could not correlate the job with the environment instability. This
created a chaotic situation which crippled workloads at seemingly random times.
About Shadow-Soft
Visit shadow-soft.com to learn more about what we do and how we help clients.