1. You submit the application manifest to the Kubernetes API. The API
Server writes the objects defined in the manifest to etcd.
2. A controller notices the newly created objects and creates several new
objects, one for each application instance.
3. The Scheduler assigns a node to each instance.
4. The Kubelet notices that an instance is assigned to the Kubelet’s node. It
runs the application instance via the Container Runtime.
5. The Kube Proxy notices that the application instances are ready to
accept connections from clients and configures a load balancer for them.
6. The Kubelets and the Controllers monitor the system and keep the
applications running.
The procedure is explained in more detail in the following sections, but the
complete explanation is given in chapter 14, after you have familiarized
yourself with all the objects and controllers involved.
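To make these steps more concrete, here’s a rough sketch of what such a manifest might look like. The two object types used here, a Deployment that requests three instances of the application and a Service that exposes them to clients, are explained in later chapters; the names and the container image are made up:

apiVersion: apps/v1
kind: Deployment             # a controller creates one Pod object per requested instance (step 2)
metadata:
  name: my-app               # hypothetical application name
spec:
  replicas: 3                # run three instances of the application
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: example/my-app:1.0    # hypothetical container image
---
apiVersion: v1
kind: Service                # the Kube Proxy load-balances client connections across the instances (step 5)
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080

Note how the file describes only the desired end state; the six steps above are how Kubernetes makes it happen.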
After you’ve created your YAML or JSON file(s), you submit the file to the
API, usually via the Kubernetes command-line tool called kubectl.
Kubectl splits the file into individual objects and creates each of them by
sending an HTTP PUT or POST request to the API, as is usually the case
with RESTful APIs. The API Server validates the objects and stores them in
the etcd datastore. In addition, it notifies all interested components that these
objects have been created. Controllers, which are explained next, are one of
these components.
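In practice, this exchange looks something like the following (the filename is hypothetical):

$ kubectl apply -f app.yaml         # kubectl creates each object defined in the file
$ kubectl get deployments           # the objects are now stored in etcd and can be read back through the API
$ kubectl apply -f app.yaml --v=6   # at verbosity level 6, kubectl also logs the HTTP requests it sends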
The Kubelet that runs on each worker node is also a type of controller. Its
task is to wait for application instances to be assigned to the node on which it
is located and run the application. This is done by instructing the Container
Runtime to start the application’s container.
Once the application is up and running, the Kubelet keeps the application
healthy by restarting it when it terminates. It also reports the status of the
application by updating the object that represents the application instance.
The other controllers monitor these objects and ensure that applications are
moved to healthy nodes if their nodes fail.
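If you want to watch the Kubelet and the other controllers doing this work, the following commands are a good starting point (replace <pod-name> with the name of one of your application instances, as reported by the first command):

$ kubectl get pods -o wide          # the NODE column shows the node each instance was assigned to
$ kubectl get pods --watch          # watch the STATUS and RESTARTS columns as instances are kept running
$ kubectl describe pod <pod-name>   # the Events section at the bottom shows the Scheduler and Kubelet at work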
When the number of workloads decreases and some worker nodes are left
without running workloads, Kubernetes can ask the cloud provider to destroy
the virtual machines of these nodes to reduce your operational costs. This
elasticity of the cluster is certainly one of the main benefits of running
Kubernetes in the cloud.
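On Google Kubernetes Engine, for example, this elasticity is provided by the cluster autoscaler, which you can turn on with flags roughly like these (the cluster name, node pool and zone are made up):

$ gcloud container clusters update my-cluster \
    --enable-autoscaling --min-nodes=1 --max-nodes=5 \
    --node-pool=default-pool --zone=europe-west3-a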
If your use-case requires it, you can also run a Kubernetes cluster across
multiple cloud providers or a combination of any of the options mentioned.
This can be done using a single control plane or one control plane in each
location.
If you already run applications on-premises and have enough hardware to run
a production-ready Kubernetes cluster, your first instinct is probably to
deploy and manage it yourself. If you ask anyone in the Kubernetes
community if this is a good idea, you’ll usually get a very definite “no”.
Using Kubernetes is ten times easier than managing it. Most major cloud
providers now offer Kubernetes-as-a-Service. They take care of managing
Kubernetes and its components while you simply use the Kubernetes API like
any of the other APIs the cloud provider offers.
The first half of this book focuses on just using Kubernetes. You’ll run the
exercises in a local development cluster and on a managed GKE cluster, as I
find it’s the easiest to use and offers the best user experience. The second part
of the book gives you a solid foundation for managing Kubernetes, but to
truly master it, you’ll need to gain additional experience.
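To give you a feel for how little a managed offering asks of you, creating a GKE cluster and pointing kubectl at it boils down to a couple of commands like these (the cluster name and zone are made up):

$ gcloud container clusters create my-cluster \
    --num-nodes=3 --zone=europe-west3-a         # GKE provisions the control plane and three worker nodes
$ gcloud container clusters get-credentials my-cluster --zone=europe-west3-a
$ kubectl get nodes                             # from here on, you interact only with the Kubernetes API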
The first thing you need to be honest about is whether you need to automate
the management of your applications at all. If your application is a large
monolith, you definitely don’t need Kubernetes.
Even if you deploy microservices, using Kubernetes may not be the best
option, especially if the number of your microservices is very small. It’s
difficult to give an exact number at which the scales tip, since other
factors also influence the decision. But if your system consists of fewer than
five microservices, throwing Kubernetes into the mix is probably not a good
idea. If your system has more than twenty microservices, you will most likely
benefit from the integration of Kubernetes. If the number of your
microservices falls somewhere in between, other factors, such as the ones
described next, should be considered.
Can you afford to invest your engineers’ time into learning Kubernetes?
It would be hard to tell your teams that you’re switching to Kubernetes and
expect only the operations team to start exploring it. Developers like shiny
new things. At the time of writing, Kubernetes is still a very shiny thing.
Although Kubernetes has been around for several years at the time of writing
this book, I can’t say that the hype phase is over. The initial excitement has
just begun to calm down, but many engineers may still be unable to make
rational decisions about whether the integration of Kubernetes is as necessary
as it seems.
1.4 Summary
In this introductory chapter, you’ve learned that:
Kubernetes is Greek for helmsman. As a ship’s captain oversees the ship
while the helmsman steers it, you oversee your computer cluster, while
Kubernetes performs the day-to-day management tasks.
Kubernetes is pronounced koo-ber-netties. Kubectl, the Kubernetes
command-line tool, is pronounced kube-control.
Kubernetes is an open-source project built upon Google’s vast
experience in running applications on a global scale. Thousands of
individuals now contribute to it.
Kubernetes uses a declarative model to describe application
deployments. After you provide a description of your application to
Kubernetes, it brings it to life.
Kubernetes is like an operating system for the cluster. It abstracts the
infrastructure and presents all computers in a data center as one large,
contiguous deployment area.
Microservice-based applications are more difficult to manage than
monolithic applications. The more microservices you have, the more
you need to automate their management with a system like Kubernetes.
Kubernetes helps both development and operations teams to do what
they do best. It frees them from mundane tasks and introduces a standard
way of deploying applications both on-premises and in any cloud.
Using Kubernetes allows developers to deploy applications without the
help of system administrators. It reduces operational costs through better
utilization of existing hardware, automatically adjusts your system to
load fluctuations, and heals itself and the applications running on it.
A Kubernetes cluster consists of master and worker nodes. The master
nodes run the Control Plane, which controls the entire cluster, while the
worker nodes run the deployed applications or workloads, and therefore
represent the Workload Plane.
Using Kubernetes is simple, but managing it is hard. An inexperienced
team should use a Kubernetes-as-a-Service offering instead of deploying
Kubernetes by itself.
So far, you’ve only observed the ship from the pier. It’s time to come aboard.
But before you leave the docks, you should inspect the shipping containers
it’s carrying. You’ll do this next.
2 Understanding containers
This chapter covers
Understanding what a container is
Differences between containers and virtual machines
Creating, running, and sharing a container image with Docker
Linux kernel features that make containers possible
Unlike VMs, which each run a separate operating system with several system
processes, a process running in a container runs within the existing host
operating system. Because there is only one operating system, no duplicate
system processes exist. Although all the application processes run in the same
operating system, their environments are isolated, though not as well as when
you run them in separate VMs. To the process in the container, this isolation
makes it look like no other processes exist on the computer. You’ll learn how
this is possible in the next few sections, but first let’s dive deeper into the
differences between containers and virtual machines.
Compared to VMs, containers are much lighter, because they don’t require a
separate resource pool or any additional OS-level processes. While each VM
usually runs its own set of system processes, which consume computing
resources on top of those used by the application’s own process, a container
is nothing more than an isolated process running in the existing host OS that
consumes only the resources the app itself consumes. Containers therefore
have virtually no overhead.
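You can see this for yourself with a quick experiment, assuming Docker is installed and running directly on a Linux host (the names used here are arbitrary):

$ docker run -d --name demo alpine sleep 9999   # start a process inside a container
$ ps aux | grep 'sleep 9999'                    # the containerized process shows up in the host's process list
$ docker exec demo ps                           # inside the container, only the container's own processes are visible

With Docker Desktop on macOS or Windows, containers run inside a hidden Linux virtual machine, so the second command won’t show the process in your computer’s own process list.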
Figure 2.1 shows two bare metal computers, one running two virtual
machines and the other running containers instead. The latter has room for
additional containers because it runs only one operating system, while the
former runs three: the host OS and two guest OSes.
Figure 2.1 Using VMs to isolate groups of applications vs. isolating individual apps with
containers
Because of the resource overhead of VMs, you often group multiple
applications into each VM. You may not be able to afford dedicating a whole
VM to each app. But containers introduce no overhead, which means you can
afford to create a separate container for each application. In fact, you should
never run multiple applications in the same container, as this makes
managing the processes in the container much more difficult. Moreover, all
existing software dealing with containers, including Kubernetes itself, is
designed under the premise that there’s only one application in a container.
But as you’ll learn in the next chapter, Kubernetes provides a way to run
related applications together, yet still keep them in separate containers.
You’ll agree that containers are clearly better when it comes to the use of
resources, but there’s also a disadvantage. When you run applications in
virtual machines, each VM runs its own operating system and kernel.
Underneath those VMs is the hypervisor (and possibly an additional
operating system), which splits the physical hardware resources into smaller
sets of virtual resources that the operating system in each VM can use. As
figure 2.2 shows, applications running in these VMs make system calls
(syscalls) to the guest OS kernel in the VM, and the machine instructions that the
kernel then executes on the virtual CPUs are then forwarded to the host’s
physical CPU via the hypervisor.
Figure 2.2 How apps use the hardware when running in a VM vs. in a container
Containers, on the other hand, all make system calls on the single kernel
running in the host OS. This single kernel is the only one that executes
instructions on the host’s CPU. The CPU doesn’t need to handle any kind of
virtualization the way it does with VMs.
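You can confirm this with another small experiment, again assuming Docker on a Linux host:

$ uname -r                          # prints the kernel version of the host OS
$ docker run --rm alpine uname -r   # prints the exact same version, because the container uses the host's kernel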
Examine the following figure to see the difference between running three
applications on bare metal, running them in two separate virtual machines, or
running them in three containers.
Figure 2.3 The difference between running applications on bare metal, in virtual machines, and
in containers
In the first case, all three applications use the same kernel and aren’t isolated
at all. In the second case, applications A and B run in the same VM and thus
share the kernel, while application C is completely isolated from the other
two, since it uses its own kernel. It only shares the hardware with the first
two.
The third case shows the same three applications running in containers.
Although they all use the same kernel, they are isolated from each other and
completely unaware of the others’ existence. The isolation is provided by the
kernel itself. Each application sees only a part of the physical hardware and
perceives itself as the only process running in the OS, even though all the
applications run in the same OS.
The main advantage of using virtual machines over containers is the complete
isolation they provide, since each VM has its own Linux kernel, while
containers all use the same kernel. This can clearly pose a security risk. If
there’s a bug in the kernel, an application in one container might use it to
read the memory of applications in other containers. If the apps run in
different VMs and therefore share only the hardware, the probability of such
attacks is much lower. Of course, complete isolation is only achieved by
running applications on separate physical machines.
Additionally, all containers share the host’s memory, whereas each VM is
given its own chunk of memory. Therefore, if you don’t limit the amount of
memory that a container can use, it could cause other containers to run out of
memory or cause their data to be swapped out to disk.
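With Docker, for example, you can cap a container’s memory when you start it (the image and names are arbitrary):

$ docker run -d --name capped --memory=256m nginx   # the kernel won't let this container use more than 256 MB
$ docker stats --no-stream capped                   # the MEM USAGE / LIMIT column shows the cap being enforced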
Note
While virtual machines are enabled through virtualization support in the CPU
and by virtualization software on the host, containers are enabled by the
Linux kernel itself. You’ll learn about container technologies later when you
can try them out for yourself. You’ll need to have Docker installed for that,
so let’s learn how it fits into the container story.
Figure 2.4 The three main Docker concepts are images, registries and containers