Kubernetes Campaign Ebook
Table of Contents
PART THREE: How to Use Trident and Cloud Volumes ONTAP with Kubernetes
3.1 Using Cloud Manager for Kubernetes Deployment with NetApp Trident
3.3 How to Set Up MySQL Kubernetes Deployments with Cloud Volumes ONTAP
Conclusion
What Is Kubernetes?
TO UNDERSTAND KUBERNETES, YOU FIRST HAVE TO UNDERSTAND THE CORE
COMPONENT OF KUBERNETES ARCHITECTURE: CONTAINERS.
Containers are lightweight, independent units that software developers and DevOps engineers use to encapsulate applications.
On deployment, a container provides process and file system separation in much the same way as a virtual machine, but with
considerable improvements in server efficiency. That efficiency allows a much greater density of containers to be co-located on the
same host.
While container technology has been part of Unix-like operating systems since the turn of the century, it was only with the advent of
Docker that containers really came into the mainstream.
Docker has succeeded both by bringing standardization to container runtimes, for example through the Open Container Initiative,
and by creating a complete container management system around the raw technology, simplifying the process of creating and
deploying containers for end users. Docker, however, can only be used to execute a container on a single host machine. That's where
Kubernetes stepped in.
Docker vs. Kubernetes: Docker manages containers on a single host machine; Kubernetes orchestrates containers across a cluster of machines.
Kubernetes makes it possible to execute multiple instances of a container across a number of machines and achieve fault tolerance and horizontal scale-out at the same time. Kubernetes was created by Google after over a decade of using container orchestration internally to operate their public services. Google had been using containers for a long time and had developed their own proprietary solutions for data center container deployment and scaling. Kubernetes builds on those solutions as an open source platform, enabling the world-wide community of software developers to grow it.

And Kubernetes adoption is growing. A recent biannual survey of over 2,000 IT professionals from North America and Europe by the Cloud Native Computing Foundation found that 75% of respondents were using containers in production today, with the remainder planning to use them in the future. Kubernetes usage has remained very strong, with 83% of respondents, up from 77%, using the platform and 58% using it in production.
The ability to manage applications independently of infrastructure holds great value for cloud deployments. We can build out a cluster of machines in the cloud that provides the compute and storage resources for all of our applications, and then let Kubernetes ensure we get the best resource utilization. Kubernetes can also be configured to automatically scale the cluster up and down in response to changes in demand.

The flexibility of this service has driven Kubernetes adoption across the cloud, with all major cloud vendors offering a native Kubernetes service, for example Amazon Elastic Kubernetes Service (Amazon EKS) and Google Kubernetes Engine. Kubernetes is also the foundation of other container orchestration platforms, such as Red Hat OpenShift.

For deployed applications, Kubernetes offers many benefits, such as service discovery, load balancing, rolling updates, and much more. Kubernetes acts as an application server that is used to run all of the services, message queues, batch processes, database systems, caching services, etc. that make up an enterprise application deployment.
Key concepts in the Kubernetes architecture include pods, deployments, stateless applications, services, volumes, persistent volumes, persistent volume claims, storage classes, dynamic storage provisioning, and provisioners.

Pod: In the Kubernetes architecture, a set of containers may be deployed and scaled together. This is achieved by using pods, which are the minimum unit of deployment in a Kubernetes cluster, and allow more than one container to share the same resources, such as IP address, file systems, etc.

Deployment: A deployment is used to control Kubernetes pod creation, updates, and scaling within the cluster, and is normally used for stateless applications. A stateless application does not depend on maintaining its own client session information, allowing any instance of the application to be equally capable of serving client requests.

Services: When multiple, interchangeable pod replicas are active at the same time, clients need a simple way to find any active pod they can send requests to. Services solve this problem by acting as a gateway to a set of pods, which may even exist in different Kubernetes clusters.
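To make the service concept concrete, here is a minimal sketch of a Service manifest; the mysql name and labels are illustrative and match the MySQL deployment example shown later in this ebook:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql          # requests are routed to pods carrying these labels
    tier: database
  ports:
    - port: 3306        # port exposed by the service
      targetPort: 3306  # containerPort on the selected pods

Clients within the cluster can then reach any active MySQL pod through the stable DNS name mysql, regardless of which node the pod is running on.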
Provisioner: Each persistent volume is created by a provisioner that uses a plugin to interface with different types of backend storage, with support for backends such as Amazon EBS. The lifetime of a persistent volume is determined by its reclaim policy, which controls the action the cluster will take when a pod releases its ownership of the storage.
Volume: Storage provisioned directly to a pod that enables the containers within the pod to share information. Volumes are destroyed when their parent pod is deleted.
Persistent Volume: A volume that exists independently of any specific pod and with its own lifetime.
Can be used to support stateful applications, such as database services, enabling all components of an
enterprise solution to be deployed and managed by Kubernetes.
THERE ARE TWO CHOICES FOR HOW PERSISTENT VOLUMES ARE PROVISIONED BY
KUBERNETES: STATIC AND DYNAMIC.
Cluster administrators can pre-allocate persistent volumes for the cluster, a practice known as static provisioning; however, this requires prior knowledge of the cluster's storage requirements as a whole. Dynamic volume provisioning is an alternative model for managing storage provisioning in Kubernetes, and is used to automatically deploy persistent volumes based on the claims received by the cluster.
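As an illustration of static provisioning, a cluster administrator might pre-create a persistent volume like the one below. This is a minimal sketch: the name, size, and EBS volume ID are placeholders, and the in-tree awsElasticBlockStore volume source shown here simply reflects the Amazon EBS support discussed above.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain    # keep the volume when its claim is released
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0        # placeholder ID of a pre-created EBS volume
    fsType: ext4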
Storage Class: Storage classes add a further level of abstraction to storage provisioning by allowing persistent volume claims to only specify the type of storage they require. They encapsulate the details of the provisioner to be used, the type of volume to create, etc., and are generally used with dynamic storage provisioning.
Stateful applications, such as database services and message brokers, record and manage the information generated within an
enterprise platform. Though Kubernetes has always supported stateless applications—which are horizontally scalable due to the
interchangeability of each pod—stateful applications require stronger guarantees for the storage they use.
Why the need for extra guarantees when it comes to stateful vs. stateless? Whereas the storage used by stateless containerized applications can simply be re-initialized when a pod is rescheduled to a different node in the cluster, stateful applications record business-critical information that must be preserved at all costs, and that requires persistent storage with an independent lifetime.
Stateful Sets: For certain types of applications, such as database systems, it is crucial to maintain the
relationship between pods and data storage volumes. Stateful sets provide an alternative model to Kubernetes
deployments, and give each pod a unique and durable identity.
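The sketch below shows the shape of a stateful set; the names, image, and sizes are illustrative. The key difference from a deployment is volumeClaimTemplates, which gives each replica (db-0, db-1, db-2) its own persistent volume claim that survives rescheduling:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db            # assumes a headless Service named db exists
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: mysql:5.6
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:      # one claim is created per replica and keeps its identity
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi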
Stateless Applications: Stateless applications do not keep a private record of client session
information, which allows any running instance of the same application to process incoming requests.
Applications deployed to Kubernetes or to containers in general are typically stateless, and so are easier to
scale out horizontally across the cluster.
Stateful applications in production environments, such as database services, require access to redundant and highly available
data storage. Most stateless applications make use of stateful services in order to fulfill client requests, and therefore have
an indirect dependency on robust data storage services as well. Kubernetes provides a lot of flexibility when it comes to persistent data storage provisioning; however, each solution uses its own specific mechanisms for protecting data, which may also have limitations.
Kubernetes caters for persistent data storage through persistent volumes, which have a life-cycle that is independent of any particular container and that can be provisioned using a diverse range of storage platforms.

How do different storage solutions protect that data differently? For example, persistent volumes can be provisioned using Amazon EBS, which provides some level of data redundancy within an Availability Zone; however, this is not sufficient protection for all organizations, where end users are expected to build their own solutions to protect data across Availability Zones, or across regions. When it comes to Kubernetes workload DR requirements, investing in this type of data protection is not only mandatory for business continuity and regulatory requirements, it also pays huge dividends in the long run.

Another important requirement for protecting persistent data storage is the ability to create and restore backups. Examples of why you'd take regular backups include ensuring that previous versions of the data are available in case of user error and providing your Kubernetes deployment with security against malicious access, such as ransomware attacks. Due to the large size of production datasets, an efficient procedure is required not only to create backups, but also to restore them consistently.
Many times, large allocations of storage made for an application are never
used, and in other cases the data stored within a persistent volume is never
compressed, leading to extra storage space being used up unnecessarily. These
types of issues can be resolved using the storage efficiency technologies that
are part of Cloud Volumes ONTAP, which the NetApp Trident provisioner makes
available to Kubernetes.
There may also be situations where a large part of the production dataset
is cold, or infrequently accessed, but which cannot be moved to more
cost-effective storage without introducing a lot of complexity to the ways
application services rely on that data. These application services require a
uniform view of their data, with fast access to hot data and on-demand access
to cold data. This can again be a difficult problem to solve, resulting in large
allocations of high performance and costly storage.
Though storage efficiency is always important in the cloud, it can be even more
so in Kubernetes environments due to the inherent scalability of containers.
Spinning up new pods to deal with an increase in workload, or to provide
greater redundancy and availability, also requires allocating new persistent
volumes. The storage overhead for these persistent volumes can be brought
under control through efficient data storage.
Let’s take a look at the Cloud Volumes ONTAP efficiency technologies you can leverage via NetApp Trident to reduce the
storage required for persistent volumes in Kubernetes.
Thin Provisioning
Thin provisioning makes it possible to create persistent volumes that appear to pods as having the size they requested through a
persistent volume claim, but without needing to allocate all of that storage in advance. Cloud Volumes ONTAP will automatically
add storage capacity to the persistent volume as and when it is required, and also return freed-up storage to the common pool
when data is deleted. This ensures that storage space is only allocated when it's actually needed, which reduces storage usage costs
and drives up storage space utilization. Thin provisioning also makes it much easier to plan for the future storage requirements of
the cluster.
Data Deduplication
Cloud Volumes ONTAP is able to transparently apply transformations, which help to reduce storage space usage, to the data it
stores. Cloud Volumes ONTAP employs data deduplication, which collapses identical copies of a block into a single block, with
reference pointers inserted into every place the block is used. This dramatically reduces storage space requirements, with some
companies reporting savings of up to 70%. A small amount of storage space is consumed in order to maintain the metadata
required to support the block mappings.
Data Compaction
After applying inline data deduplication and compression, multiple blocks that
are not completely filled are combined together, removing the unused space
that would otherwise have been left in each block.
Storage Tiering
A major advantage of using Cloud Volumes ONTAP is the ability to automatically
balance data between a capacity storage tier for colder data and a performance
tier for fast access. Cloud Volumes ONTAP will use Amazon S3 for the capacity
tier for large amounts of cold or infrequently accessed data that must still be
available on-demand. When the data is required, it is quickly moved to the
performance tier, and will age back out to the capacity tier when it is no longer in
active use. This saves a significant amount on storage costs.
The NFS (Network File System) protocol is widely used for sharing files in enterprise environments, allowing many users to access the same files at the same time. This makes sharing data a lot easier. NFS can be used for a wide range of use cases, such as creating data lakes for building analytics solutions, database services, data archives, etc. If you're using Kubernetes, NFS can be used with Kubernetes pods to manage persistent storage requirements and share data and files between containers and other pods.

There are several reasons why NFS shared storage is preferable for Kubernetes deployments. For starters, there's an advantage to NFS over iSCSI because managing a large number of individual iSCSI storage allocations can add to administrative overhead. All those block-level persistent volume allocations can make storage management more difficult, especially when Kubernetes persistent volumes are statically allocated. In these cases, storage utilization may be lower, as a persistent volume is used only for a single pod and that pod may not make use of all of its storage. Also, sharing data between pods can be difficult when using block-level persistent volumes.

Unless containers are deployed to the same pod, sharing data between containers can also be difficult. Deploying containers in this way, i.e. to the same pod, should only be done when it makes sense for the application, such as when the containers work together directly to fulfill some function. Putting containers into the same pod when they simply need to share data can lead to scalability problems, as a pod will only be scheduled to a single node.

NFS solves these problems by allowing many hosts to mount the same file system at the same time, and all hosts to access files concurrently. With NFS, users don't need to format the storage volume using an operating system file system, such as ext4; the storage can simply be mounted and used straight away. This makes it much easier to attach storage to pods and reduces the administrative overhead of working with persistent storage. NFS storage volumes in Kubernetes can also be easily expanded without any client-side changes using Trident, which we'll discuss in more detail in the next section.
Creating test copies of production data is very often a requirement for DevOps
engineers implementing CI/CD pipelines or setting up staging clusters for pre-
production testing. Automated software test suites will often mutate the data
that they access, which means that a fresh copy of the data is required in order
to repeat the testing, with test cycles sometimes executed hundreds of times.
Ensuring faster TTM (Time To Market) and the delivery of high quality software
relies upon developers working in parallel and executing tests as often
as required.
Restoring data backups is not a scalable solution for creating test data sets:
with the size of the source data involved it just takes too much time to perform
a restore. This leads to reduced developer productivity when multiple, up-to-
date copies of persistent volumes are required, which also have to be refreshed
frequently. Using this approach also means that the amount of storage used to
develop and support an application will be many times greater than the storage
requirements of the production environment, causing a significant increase in
cloud storage costs. That’s not something any developer wants to get blamed for.
The time taken to perform a restore and the storage space it uses can both be
very wasteful, as usually most of the restored data remains unaffected by the
software testing and must be recreated simply to bring the data to a consistent
point or to get an up-to-date copy. In contrast, not only is NetApp FlexClone able
to instantly clone a source volume of any size, it also does so with zero storage
penalty for any storage size required. NetApp Trident helps you take advantage
of this for persistent volumes in a Kubernetes cluster.
In the third part of this ebook we’ll walk you through the steps of cloning your
persistent volumes in order to facilitate test environments and reduce your
storage footprints.
Cloud Manager is the central platform from which you can deploy and manage instances of Cloud Volumes ONTAP for both large
and small environments. The graphical, web-based user interface makes it easy to set up Cloud Volumes ONTAP storage services
and organize them across multiple tenants for better overall manageability. On-premises systems and AWS deployments can all be
controlled from a single dashboard, and NetApp SnapMirror replication relationships created between them with a simple drag-and-
drop.
Cloud Manager allows us to deploy NetApp Trident to a Kubernetes cluster and then relate the cluster to Cloud Volumes
ONTAP instances.
PREREQUISITES
Before you start, make sure that you have network access between Cloud Manager, the instances of Cloud Volumes
ONTAP to be used, and the actual Kubernetes cluster. Cloud Manager will also require internet access in order to download the
latest deployment packages for NetApp Trident.
When integrating with NFS Kubernetes deployments, Trident is also able to create the storage class and the NFS mount points.
Other benefits include cloning Kubernetes persistent volumes with FlexClone, using storage efficiencies to cut persistent volume
footprint and costs, storage tiering to Amazon S3, and data protection.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: basicclone
  annotations:
    trident.netapp.io/cloneFromPVC: basic
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: basic
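Assuming the manifest above is saved as clone-pvc.yaml (a filename chosen here for illustration), the clone can be created and checked with standard kubectl commands:

# Create the clone claim; Trident provisions it as a clone of the source volume
kubectl apply -f clone-pvc.yaml

# Confirm that the new claim has been bound to a volume
kubectl get pvc basicclone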
In the background, Trident will find the ONTAP volume that was used to provision
storage for the original persistent volume claim and then use NetApp FlexClone
to perform a data cloning operation. Clones are created by using a snapshot
on the parent volume. Both the clone and parent volumes are free to accept
changes to their data, which are written to new blocks using a redirect-on-write
mechanism that protects blocks locked by a snapshot from being overwritten.
Using this technique, the clone requires negligible storage space when it is first
created, and only grows with the changes that are made to it.
By using the default reclaim policy of delete, the clone will automatically be
deleted when the pod using it is destroyed. This can be used to conveniently
clean up any allocated storage at the end of a round of testing, for example.
Trident also supports an optional annotation to split the clone from the parent
volume it is associated with. This turns the clone into an independent copy of
the data, which means the space efficiencies discussed previously will be lost,
however, this can be more suitable in scenarios when the clone will undergo a lot
of changes, or when the state of a volume is needed to provide seed data for a
new requirement.
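As a sketch of that option, the manifest below uses the trident.netapp.io/splitOnClone annotation, the name Trident's documentation gives this feature; verify it against your Trident version before relying on it:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: basicclone-split
  annotations:
    trident.netapp.io/cloneFromPVC: basic
    trident.netapp.io/splitOnClone: "true"   # assumed annotation: split the clone into an independent copy
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: basic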
First, we will create a persistent volume claim for the storage we need. As Trident
uses dynamic provisioning, we will specify a storage class, which must have been
set up prior to executing this manifest. Each storage class defines the provisioner
to be used, along with any other provisioner-specific settings that will determine
how the storage is provisioned.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mysql-pvc
  annotations:
    trident.netapp.io/reclaimPolicy: "Retain"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: high-performance
When this claim is used by a pod, Trident will automatically create a 100 GiB
high-performance storage volume in Cloud Volumes ONTAP to fulfill the request.
This storage class may have been implemented to make use of Amazon EBS
Provisioned IOPS disks, for example. A reclaim policy of “retain” has also been
specified, which will prevent the persistent volume from being deleted if the pod
releases the claim.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-server
  labels:
    app: mysql
spec:
  selector:
    matchLabels:
      app: mysql
      tier: database
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
        tier: database
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-pass
                  key: password
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-pv
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-pv
          persistentVolumeClaim:
            claimName: mysql-pvc
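Note that this deployment reads the database root password from a Kubernetes secret named mysql-pass, which must exist before the pod can start. A minimal way to create it, with a placeholder password, is:

# Create the secret referenced by the deployment's secretKeyRef
kubectl create secret generic mysql-pass --from-literal=password='<choose-a-password>'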
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mysql-pvc-clone
  annotations:
    trident.netapp.io/cloneFromPVC: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: high-performance
We simply annotate our claim with the name of the persistent volume claim to clone from, and Trident takes care of the rest by
automatically finding the source storage volume in Cloud Volumes ONTAP and performing a NetApp FlexClone operation. The
default reclaim policy of delete will ensure that the clone volume is automatically deleted when it is no longer required.
With block-level Kubernetes persistent volumes, the ReadWriteOnce access mode must be used, which means that there is a 1-to-1 relationship between persistent volume claims and persistent volumes.
Here’s an example persistent volume claim (PVC) for a Kubernetes NFS volume:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-share
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: silver
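Because the claim requests the ReadWriteMany access mode, several pod replicas can mount the same NFS volume concurrently. The following sketch (the web server image and paths are illustrative) shows three replicas sharing the nfs-share claim:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                    # all three pods mount the same NFS volume
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: shared-data
              mountPath: /usr/share/nginx/html
      volumes:
        - name: shared-data
          persistentVolumeClaim:
            claimName: nfs-share   # the ReadWriteMany claim defined above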
New to provisioning in Kubernetes is the support for editing persistent volume claims to request more storage space. With this new capability, persistent volumes can be expanded without needing to re-create the pod that uses the storage. In order for that to happen, the underlying storage provisioner must support resize. The commonly used provisioners, such as the one for Amazon EBS storage, support this, but NFS was the exception. With Trident, NFS persistent volumes can be dynamically resized, with all the required back-end changes taken care of. By extending the support for resizing persistent volumes to NFS, Trident is a major value add.

Block-level storage, such as Amazon EBS, needs to have the file system extended when the underlying storage is expanded. Kubernetes will take care of this, but it will also require that the pod using the storage be restarted. That will cause downtime. With NFS, no file system expansion is required, and the pod can continue to work without interruption.

You can read more about resizing NFS volumes with Trident on GitHub.
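For example, the nfs-share claim shown earlier could be grown in place by patching its requested size; this assumes the storage class was created with allowVolumeExpansion enabled:

# Request more space on the existing claim; Trident resizes the backing NFS volume
kubectl patch pvc nfs-share -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'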
Separate storage classes can be created for different mount parameters and other requirements, for example using separate storage
classes for NFSv3 and NFSv4.
Storage classes allow users to control the type and configuration of the storage they are using when provisioning in Kubernetes.
That extends to the dynamic NFS storage provisioning that Trident provides.
Take this Kubernetes NFS storage class example. Here is the storage class manifest:
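This is a minimal sketch, assuming the Trident CSI provisioner name csi.trident.netapp.io (older Trident releases used netapp.io/trident) and an ontap-nas backend; the silver class name matches the PVC example above, and the parameters should be verified against your Trident release:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: silver
provisioner: csi.trident.netapp.io   # assumed provisioner name; see note above
mountOptions:
  - nfsvers=4                        # a separate class could pin NFSv3 instead
parameters:
  backendType: "ontap-nas"           # assumed Trident parameter selecting an NFS backend
allowVolumeExpansion: true           # enables the PVC resizing described earlier

With this class in place, the nfs-share claim shown above resolves to an NFS volume dynamically provisioned by Trident on Cloud Volumes ONTAP.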
Using NetApp Trident, Kubernetes storage requests are dynamically fulfilled by Cloud Volumes ONTAP, which does for storage what Kubernetes does for containers.
Now that you’ve read this Kubernetes guide, find out for yourself how Cloud Volumes ONTAP
transforms container storage management with a free 30-day trial on AWS.
Copyright Information
Copyright © 1994–2019 NetApp, Inc. All rights reserved. Printed in the U.S. No part of this document covered by copyright may be reproduced in any
form or by any means—graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system—
without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO
EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability
arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not
convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of
the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).
Trademark Information
NETAPP, the NETAPP logo, and the marks listed at http://www.netapp.com/TM are trademarks of NetApp, Inc. Other company and product names
may be trademarks of their respective owners.
NA-287-0420