Dell PowerProtect Data Manager: Protecting Kubernetes Workloads
October 2022
H18563.6
Revisions
Date Description
October 2020 Initial release
January 2021 Dell PowerProtect Data Manager 19.6 updates
Acknowledgments
Author: Vinod Kumar Kumaresan
The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this
publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other trademarks are trademarks of Dell
Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners. [10/27/2022] [White Paper] [H18563.6]
Table of contents
Revisions
Acknowledgments
Table of contents
Executive summary
Audience
1 Introduction
  1.1 Features of Kubernetes
  1.2 PowerProtect Data Manager capabilities for Kubernetes
    1.2.1 Efficient and flexible
    1.2.2 Built for Kubernetes
  1.3 Key components of PowerProtect Data Manager
    1.3.1 Cloud Native Data Manager (CNDM)
    1.3.2 PowerProtect controller
    1.3.3 VMware Velero
    1.3.4 cProxy
  1.4 Key components of Kubernetes
    1.4.1 Cluster
    1.4.2 Node
    1.4.3 Pods and containers
    1.4.4 Kubernetes API (kube-apiserver)
    1.4.5 Persistent volume and persistent volume claim
    1.4.6 Container storage interface
    1.4.7 Storage class
    1.4.8 Namespaces
    1.4.9 Custom resource
2 Deployment methods of Kubernetes
  2.1 Kubernetes on-premises
    2.1.1 Kubernetes running on virtual environment
    2.1.2 On-premises Kubernetes on bare metal
    2.1.3 Using external CSI
  2.2 Kubernetes on Cloud
    2.2.1 Kubernetes deployed on Infrastructure as a Service (IaaS)
    2.2.2 Kubernetes as a Service (KaaS)
3 Reference architecture
Executive summary
Traditionally, organizations used physical servers to run applications. There was no way to define resource
boundaries for applications in a physical server, and this caused resource allocation issues. As a solution,
virtualization was introduced, making it possible to run multiple virtual machines (VMs) on a single physical
server’s CPU. It also enabled applications to be isolated within VMs and provided a level of security.
Modern infrastructure is being transformed by containers. Containers are like virtual machines but have
relaxed isolation properties to share the operating system. The container has its own file system, CPU,
memory, and process space. Agile application creation, continuous development, environmental consistency
across development, application-centric management, efficient resource allocation, and resource isolation are
the key benefits of containers. Kubernetes is an open-source container management platform that unifies a
cluster of machines into a single pool of compute resources.
With a distributed container deployment, it is important to protect your workloads. Dell PowerProtect Data
Manager protects Kubernetes workloads and provides high availability together with consistent, reliable
backup and restore capabilities for Kubernetes workloads, including in a disaster recovery (DR) situation.
PowerProtect Data Manager offers centralized management, automation, multi-cloud options, and advanced
integration to simplify managing workloads.
Audience
This white paper is intended for customers, partners, and other users who want to understand how
PowerProtect Data Manager helps protect Kubernetes workloads.
Dell PowerProtect Data Manager protects existing and newly discovered workloads. It allows IT operations
and backup administrators to manage Kubernetes clusters and their protection through a single management
UI and define protection policies for Kubernetes workloads from Kubernetes APIs. The policy-driven
protection is defined by the protection policy mechanism. PowerProtect Data Manager discovers the
namespaces, labels, and pods in the environment, and helps protect them by providing cluster credentials
logging, monitoring, governance, and recovery.
• Kubernetes automates Linux container operations and eliminates many of the manual processes
involved in deploying and scaling containerized applications.
• Applications can be clustered together in groups of hosts running Linux containers, and Kubernetes
helps you easily and efficiently manage those clusters.
• Kubernetes is an ideal platform for hosting cloud-native applications that require rapid scaling.
• Provides flexible protection for Kubernetes clusters using the Kubernetes APIs
• Discovers, monitors, and protects Kubernetes resources—namespaces, labels, pods, and persistent
volumes
• Does not require installing a backup client container for each pod for the backup process
• Provides protection to controllers per node to avoid cross-node traffic
• Enables application consistency for MySQL and MongoDB databases
• Can restore assets to another cluster that is connected to PowerProtect Data Manager
• Provides protection for AWS-hosted Kubernetes clusters using PowerProtect Data Manager running
on AWS and protected to PowerProtect DDVE running on AWS
• Data in-flight encryption for Kubernetes
1.3.4 cProxy
The cProxy is a stateless containerized proxy. It is installed on the Kubernetes cluster
when the backup and restore process is initiated and is deleted after the process is completed. It is
responsible for managing persistent volume snapshots (snap copies), mounting snapshots, and moving the
data to the target storage. It is also responsible for restoring data into a persistent volume from target storage
and making the data available for attaching to pods.
1.4.1 Cluster
A Kubernetes cluster is a group of machines, called nodes, that run containerized applications. The cluster
has a desired state that defines which applications or workloads should be running. The cluster’s desired
state is defined with the Kubernetes API.
1.4.2 Node
A node is a virtual or physical machine, depending on the cluster. Each node contains the services
necessary to run pods and is managed by the control-plane components. There are two kinds of nodes:
control-plane node and worker node.
A persistent volume claim (PVC) is a request for storage by a user. It is similar to a pod: pods consume node
resources, and PVCs consume PV resources. Just as pods can request specific levels of resources (CPU and
memory), claims can request a specific size and access modes.
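As an illustration of what a claim looks like, the following sketch writes a minimal PVC manifest to a file. The claim name demo-pvc, the namespace test-namespace, the storage class csi-example-sc, and the 1Gi size are hypothetical placeholders chosen for this example, not values taken from a real deployment.

```shell
#!/bin/sh
# Sketch only: every name and size below is a hypothetical placeholder.
manifest="${TMPDIR:-/tmp}/demo-pvc.yaml"

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
  namespace: test-namespace
spec:
  accessModes:
    - ReadWriteOnce            # access mode requested by the claim
  volumeMode: Filesystem       # the Kubernetes default, stated explicitly here
  storageClassName: csi-example-sc
  resources:
    requests:
      storage: 1Gi             # size requested by the claim
EOF

echo "Wrote $manifest"
# To submit the claim to a cluster, one would typically run:
#   kubectl apply -f "$manifest"
```

The manifest is only written to disk; applying it is left as a commented step so that the sketch has no effect on a cluster unless that step is run deliberately.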
The container orchestrates the volume snapshot backup, in which the volume snapshot is taken, mounted, and
its block data is streamed to target storage. For application-level backups, pre- and post-hooks are used to
quiesce the database, flush data, and take snapshots of persistent volumes.
Also, the Kubernetes environment can be a generic upstream instance (the open-source version of
Kubernetes), a Kubernetes distribution (distro) such as Rancher or CoreOS that automates the provisioning of
the Kubernetes cluster using an installer script, or an OEM distribution that automates Kubernetes, such as
Red Hat OpenShift, Anthos, or Pivotal Container Service (PKS).
A Kubernetes cluster is a group of control-plane and worker nodes. The control-plane node manages the
worker nodes and schedules the pods across the nodes in the cluster. PowerProtect Data Manager integrates
with the Kubernetes cluster through the Kubernetes APIs to perform discovery. The CNDM is the component
of PowerProtect Data Manager that communicates with the kube-apiserver of the cluster. When the cluster is
discovered, it is added as a PowerProtect Data Manager asset source, and its namespaces become assets
that are available to be protected. Data is compressed and deduplicated at the source and sent to the target
storage. During discovery, PowerProtect Data Manager creates the following two namespaces in the cluster:
• velero-ppdm: Contains a Velero pod that backs up metadata and stages it to the target storage in a
bare-metal environment. In a VMware Cloud Native Storage (CNS) environment, it performs both PVC
and metadata backup.
• powerprotect: Contains a PowerProtect controller pod that drives persistent volume claim snapshots
and backups and pushes the backups to target storage using intermittently spawned cProxy pods.
With Kubernetes workloads, a BackupStorageLocation containing the storage unit (SU) information is also created on the
cluster. The PowerProtect controller running in the Kubernetes cluster creates a corresponding
BackupStorageLocation in the Velero namespace whenever a BackupStorageLocation is created in the
PowerProtect namespace.
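After discovery, the two namespaces and the Velero backup storage location can be inspected from any host with kubectl access to the cluster. The sketch below is one way to do that; it assumes the namespaces are named powerprotect and velero-ppdm as described here, and that the Velero resource is reachable under its standard name backupstoragelocations.velero.io. The function name is illustrative.

```shell
#!/bin/sh
# Sketch only: read-only inspection of the namespaces created during discovery.

show_ppdm_components() {
    for ns in powerprotect velero-ppdm; do
        echo "== Pods in namespace: $ns =="
        kubectl get pods --namespace "$ns" -o wide
    done
    echo "== Velero backup storage locations =="
    kubectl get backupstoragelocations.velero.io --namespace velero-ppdm
}

# Usage, from a host with kubectl access to the discovered cluster:
#   show_ppdm_components
```

Because cProxy pods are spawned only while a backup or restore is running, they would be expected to appear in this listing only during a job.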
Note: For general information about updating PowerProtect Data Manager, see the Dell PowerProtect Data
Manager Kubernetes User Guide.
Note: For more information about Dell PowerProtect Data Manager protecting VMware Tanzu Kubernetes
Clusters, see the document PowerProtect Data Manager: Protecting VMware Tanzu Kubernetes Clusters.
See the section “Prerequisites to Kubernetes cluster discovery” in the PowerProtect Data Manager
Kubernetes User Guide before adding a Kubernetes cluster as an asset source with Data Manager.
The Kubernetes asset source can be enabled from the PowerProtect Data Manager UI. Click Infrastructure
> Asset Sources, and click + (plus) to view the New Asset Source tab. In the pane for the asset source that
you want to add, click Enable Source. The Asset Sources window updates to display a tab for the new
asset source.
Note: Discovery of a Kubernetes cluster discovers namespaces that contain volumes from both container
storage interface (CSI) and non-CSI based storage. However, backup and recovery are supported only from
CSI-based storage. Also, only PVCs with the VolumeMode Filesystem are supported.
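Before adding PVCs to a policy, it can help to check which claims fall outside the supported volume mode. The following sketch lists every PVC with its storage class and volume mode and filters out the rows that are not Filesystem. The function names are illustrative; the kubectl options used (custom-columns output and --no-headers) are standard.

```shell
#!/bin/sh
# Sketch only: reports PVCs whose volumeMode is not Filesystem.

# Prints one row per PVC: namespace, name, storage class, volume mode.
list_pvc_modes() {
    kubectl get pvc --all-namespaces --no-headers \
        -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,SC:.spec.storageClassName,MODE:.spec.volumeMode'
}

# Reads that listing on stdin and keeps only the rows that are not Filesystem.
non_filesystem_pvcs() {
    awk '$4 != "Filesystem" { print $1 "/" $2 " (" $4 ")" }'
}

# Usage:
#   list_pvc_modes | non_filesystem_pvcs
```

Whether a storage class is CSI based can be judged separately from the provisioner shown by kubectl get storageclass.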
Port: Specify the port to use for communication when not using the default port, 443.
Note: The use of any port other than 443 or 6443 requires you to open the port on PowerProtect Data
Manager first to enable outgoing communication.
Host Credentials: The service account must have the following privileges:
• Get/Create/Update/List CustomResourceDefinitions
• Get/Create/Update ClusterRoleBinding for 'cluster-admin' role
Note: The admin-user service account in the kube-system namespace contains all these privileges. You can
provide the token of this account, or an existing similar service account. Alternatively, create a service
account that is bound to a cluster role that contains these privileges, and then provide the token of this service
account.
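For the case where a dedicated service account is created, a minimal sketch could look like the following. The account name ppdm-example-sa and the binding name are hypothetical; the sketch binds the account to the built-in cluster-admin cluster role, which is broader than the minimum privileges listed above.

```shell
#!/bin/sh
# Sketch only: the account and binding names are hypothetical placeholders.
SA_NAME="ppdm-example-sa"
SA_NAMESPACE="kube-system"

create_discovery_account() {
    # Create the service account.
    kubectl create serviceaccount "$SA_NAME" --namespace "$SA_NAMESPACE"
    # Bind it to the built-in cluster-admin cluster role.
    kubectl create clusterrolebinding "${SA_NAME}-binding" \
        --clusterrole=cluster-admin \
        --serviceaccount="${SA_NAMESPACE}:${SA_NAME}"
}

# Usage, as a cluster administrator:
#   create_discovery_account
```

The sketch stops at creating the account and binding; how the token is obtained depends on the Kubernetes version.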
If you do not want to provide a service account with cluster-admin privileges, download the yaml files from the
PowerProtect Data Manager UI Downloads window by clicking the System Settings icon and selecting
Downloads. These files define the cluster role with the privileges required for PowerProtect Data Manager.
Follow the instructions in the README.txt within the tar file to create the required clusterroles and
clusterrolebindings, and to provide the token of the service account created in the yaml files. The README.txt
file also provides instructions for manually creating the secret for ppdm-discovery-serviceaccount, which is
required in Kubernetes versions 1.24 and later.
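The README.txt remains the authoritative procedure. Purely as an illustration of the general Kubernetes mechanism, the sketch below generates a secret of type kubernetes.io/service-account-token that is tied to a service account through the kubernetes.io/service-account.name annotation. The secret name and the powerprotect namespace are assumptions made for this example and may differ from what the README.txt specifies.

```shell
#!/bin/sh
# Sketch only: SECRET_NAME and SA_NAMESPACE are assumptions, not values from README.txt.
SA_NAME="ppdm-discovery-serviceaccount"
SA_NAMESPACE="powerprotect"
SECRET_NAME="${SA_NAME}-token"

# Prints a Secret manifest that asks Kubernetes to populate a token for SA_NAME.
token_secret_manifest() {
    cat <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: ${SECRET_NAME}
  namespace: ${SA_NAMESPACE}
  annotations:
    kubernetes.io/service-account.name: ${SA_NAME}
EOF
}

# Usage:
#   token_secret_manifest | kubectl apply -f -
#   kubectl get secret "$SECRET_NAME" -n "$SA_NAMESPACE" -o jsonpath='{.data.token}' | base64 -d
```

The second usage line decodes the token that Kubernetes writes into the secret once the referenced service account exists.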
For more details, see the section “Add a Kubernetes cluster” in the PowerProtect Data Manager Kubernetes
User Guide.
The namespaces in the Kubernetes cluster appear in the Kubernetes tab of the Assets window.
1. Copy the root certificate of the Kubernetes cluster to the PowerProtect Data Manager server.
3. Under the Kubernetes tab, select the Kubernetes cluster asset source and click Edit.
4. Expand Advanced Options, and then copy the text of the root certificate (in Base64 format) into the Root
Certificate box.
5. Click Save.
On AWS EKS, run aws eks describe-cluster --region <region> --name <Kubernetes cluster name>
--query "cluster.certificateAuthority.data" --output text, and save the output to the certificate file.
For other distributions, run kubectl config view --flatten or its equivalent and obtain the Base64
encoded root certificate from the certificate-authority-data field for the cluster.
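One way to extract only the certificate field, rather than reading it out of the full kubeconfig, is sketched below. It uses kubectl config view with --raw and --minify (a variation on the --flatten form mentioned above) and a jsonpath expression, and it assumes the current kubeconfig context points at the cluster being added. The helper names are illustrative.

```shell
#!/bin/sh
# Sketch only: reads the CA data for the cluster of the current kubeconfig context.

# Prints the Base64-encoded root certificate.
get_root_cert_b64() {
    kubectl config view --raw --minify \
        -o jsonpath='{.clusters[0].cluster.certificate-authority-data}'
}

# Succeeds when the Base64 text passed as $1 decodes to a PEM certificate.
looks_like_pem_cert() {
    printf '%s' "$1" | base64 -d 2>/dev/null | grep -q 'BEGIN CERTIFICATE'
}

# Usage:
#   cert=$(get_root_cert_b64)
#   looks_like_pem_cert "$cert" && printf '%s\n' "$cert"
```

The sanity check only confirms that the value decodes to something shaped like a PEM certificate; it does not validate the certificate itself.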
When adding Network Interface Cards (NICs), setting DNS configuration for pods, or creating custom ports,
you might want to update the PowerProtect Controller, Velero, and cProxy pod configurations to apply
additional attributes or change existing attributes.
When adding the Kubernetes cluster as an asset source, the PowerProtect Data Manager UI provides the ability
to update the PowerProtect Controller configuration, Velero configuration, or cProxy configuration fields,
which can be used to add NICs or set the DNS configuration for pods.
Pod information is specified in “Advanced Options” when adding or editing the Kubernetes cluster asset
source in the PowerProtect Data Manager UI.
See the controller configuration section in the PowerProtect Data Manager Kubernetes User Guide for more
information.
PowerProtect Data Manager automatically uses the volumegroup snapshot extension when the following
conditions are met:
• The Kubernetes clusters are using PVCs provisioned by the CSI driver for PowerFlex
• The CSI driver for PowerFlex has the volumegroup snapshotter feature enabled, and the
volumegroupsnapshot CRD is present on the Kubernetes cluster
• The PVCs share the same volume-group label
When the volumegroup snapshotter feature is in use for a group of PVCs in a protection policy, an entry for
VolumeGroup appears in the Details pane of the PowerProtect Data Manager UI Jobs window for the
protection policy backup.
If you want PowerProtect Data Manager to use the volumesnapshot functionality instead of the PowerFlex
volumegroup snapshot extension, you can disable the volumegroup snapshot extension by performing the
following steps:
1. From the PowerProtect Data Manager UI, go to Infrastructure > Asset Sources.
2. In the Kubernetes tab, select the Kubernetes cluster asset source that is used to protect the PVCs
belonging to the volumegroup, and then click Edit.
3. In the Edit Kubernetes dialog box, click the down arrow to expand Advanced Settings.
4. Under Controller Configuration, set the value for the property “k8s.ppdm.support.volumeGroup” to
false.
Data Manager provides the following options when creating a Kubernetes cluster protection policy:
Also, the admin can select namespaces and associated PVCs statically or dynamically for inclusion or
exclusion in protection policies, along with schedules, retention, and other protection operations.
From the Jobs window, the progress of the new Kubernetes cluster protection policy backup and associated
tasks can be monitored.
For more information about creating a protection policy, see the section “Add a protection policy for
Kubernetes namespace protection” in the PowerProtect Data Manager Kubernetes User Guide.
PowerProtect Data Manager by default creates a cProxy pod in the powerprotect namespace when backing
up and restoring PVCs to a new namespace. The powerprotect namespace may not have access to all the
PowerScale access zones.
• When PVCs from multiple access zones are provisioned in the Kubernetes cluster
• When Kubernetes cluster firewall and networking are configured not to allow PowerProtect Data
Manager data mover pods running in the powerprotect namespace access to PVCs from all access
zones
For more information about configuring this feature to protect PVCs in PowerScale access zones, see the
section “Protecting PVCs in PowerScale access zones” in the PowerProtect Data Manager Kubernetes User
Guide.
To perform a manual backup, go to Protection > Protection Policies. Select the Kubernetes protection
policy and click Protect Now.
During the manual backup, select the backup type of either Full or Synthetic Full.
From the Jobs > Protection Jobs window, you can monitor the progress of the Kubernetes cluster backup.
For more information about performing backup and recovery of Kubernetes workloads using PowerProtect
Data Manager, see the PowerProtect Data Manager Kubernetes User Guide.
These backups are agentless, in that the PowerProtect Data Manager can take a snapshot of containers
without the need for software installation in the database application environment. Then, that snapshot is
backed up using the normal procedures for the Kubernetes environment.
The PowerProtect Data Manager provides a standardized way to quiesce a supported database, back up the
data from that database, and then return the database to operation. Application templates serve as a bridge
between a specific database environment and the Kubernetes backup architecture for the PowerProtect Data
Manager. Depending on the differences between database environments, each deployment may require a
different configuration file.
Application templates are typically deployed from customizable YAML files that come with the CLI package.
The CLI package exists on the PowerProtect Data Manager host at
/usr/local/brs/lib/cndm/misc/ppdmctl.tar.gz and is part of the PowerProtect Data Manager
deployment. Starting from PowerProtect Data Manager 19.12, the CLI package can be downloaded from the
PowerProtect Data Manager UI.
1. In the PowerProtect Data Manager UI, click the System Settings icon and then select Downloads.
2. In the left pane, select Kubernetes.
3. In the PPDMCTL box, click Download.
Because data syncs from the primary pods to secondary pods, the PowerProtect Data Manager backs up
secondary pods first.
Kubernetes namespaces are already discovered as assets with PowerProtect Data Manager. All namespaces
except the powerprotect and velero-ppdm namespaces are available for protection. To configure
application-consistent protection, a particular namespace is specified. In this example, test-namespace is
used.
cd /usr/local/brs/lib/cndm/misc
3. Run the list command (ls) to view the contents, which include the ppdmctl.tar.gz file.
ls -ltrh
4. Run the following command to transfer the file from PowerProtect Data Manager to the Kubernetes cluster
(K8s-cluster01 is used as a reference name for the Kubernetes cluster).
5. Log in to the Kubernetes cluster, run the list command to list the contents of the root directory, and
confirm that ppdmctl.tar.gz is available.
6. Run the tar command to extract the file. After the extraction is completed, the ppdmctl directory is created.
7. Run the change directory command to enter the ppdmctl directory, and run ls to list the contents.
cd ppdmctl
ls
8. Run the following command to enter the examples directory where all the MySQL and MongoDB
templates are stored.
cd examples
ls
9. Copy the mysqlapptemplatehelm.yaml file to create a template copy from the MySQL example with the
following command (tmedemomysqlapptemplatestadalone.yaml is the template associated with
test-namespace).
cp mysqlapptemplatehelm.yaml tmedemomysqlapptemplatestadalone.yaml
10. Run the following command to describe the MySQL stateful set.
- To change the name of the primary pod to match the pod description, edit:
selectorExpression: "mysql" (mysql is the new name)
- If no worker pods are used for MySQL, remove the selectorTerms
section under selectors for worker.
- To change the MySQL password environment variable, edit preHook and postHook. (Pod-based
hooks execute hook code in a new pod derived from the template in a deployment
configuration; preHook quiesces the database and postHook unquiesces it.)
12. Run cd .. to return from the examples directory to the ppdmctl directory.
13. To create a template for MySQL application-consistent protection from the edited (step 11) template
file in test-namespace, use the ppdmctl utility by running the following command.
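The file handling in steps 4 through 10 can be sketched as follows. This covers only the transfer, extraction, and template copy, not the ppdmctl command of step 13. The use of scp, the root user, the /root destination, and a stateful set named mysql are assumptions made for this example; the function name is illustrative.

```shell
#!/bin/sh
# Sketch only: host access method, remote user, and stateful set name are assumptions.

# Steps 6 to 9: extract the CLI package and copy the MySQL example template.
# $1 = directory that contains ppdmctl.tar.gz, $2 = file name for the template copy
prepare_mysql_template() {
    workdir="$1"
    copy_name="$2"
    tar -xzf "$workdir/ppdmctl.tar.gz" -C "$workdir" || return 1
    cp "$workdir/ppdmctl/examples/mysqlapptemplatehelm.yaml" \
       "$workdir/ppdmctl/examples/$copy_name"
}

# Step 4 (on the PowerProtect Data Manager host), assuming scp and root access:
#   scp /usr/local/brs/lib/cndm/misc/ppdmctl.tar.gz root@K8s-cluster01:/root/
# Steps 6 to 9 (on the Kubernetes cluster host):
#   prepare_mysql_template /root tmedemomysqlapptemplatestadalone.yaml
# Step 10, assuming the stateful set is named "mysql":
#   kubectl describe statefulset mysql --namespace test-namespace
```

Keeping the copy next to the original example makes it easy to compare the edited template against the shipped one.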
AppLabel: "app.kubernetes.io/name=postgresql-ha,app.kubernetes.io/component=postgresql"  # applabel to select pod
Type: "Postgresql"
AppActions:
  Pod:
    PreHook: "[\"/bin/bash\", \"-c\", \"REPMGR_PRIMARY_HOST=`echo $REPMGR_PRIMARY_HOST | cut -f1 -d '.'`; if [ $HOSTNAME = $REPMGR_PRIMARY_HOST ]; then PGPASSWORD=$POSTGRES_PASSWORD psql -U $POSTGRES_USER -c \\\"select pg_start_backup('ppdm-backup', true);\\\"; fi\"]"
    PostHook: "[\"/bin/bash\", \"-c\", \"REPMGR_PRIMARY_HOST=`echo $REPMGR_PRIMARY_HOST | cut -f1 -d '.'`; if [ $HOSTNAME = $REPMGR_PRIMARY_HOST ]; then PGPASSWORD=$POSTGRES_PASSWORD psql -U $POSTGRES_USER -c \\\"select pg_stop_backup();\\\"; fi\"]"
  Application:
    Kind: StatefulSet
    Selectors:
      SelectorTerms: {"field" : "Name", "selectorExpression": ".*-[1-9][0-9]*$"}  # Standby pods with index > 0
      SelectorTerms: {"field" : "Name", "selectorExpression": ".*-0$"}  # Primary pods with index = 0
AppLabel: "app=cassandra"
Type: "Cassandra"
Enable: true
AppActions:
  Pod:
    PreHook: "[\"/bin/bash\", \"-c\", \"nodetool flush\"]"
For more details, see the section “Application-Consistent Database Backups in Kubernetes” in the
PowerProtect Data Manager Kubernetes User Guide.
To view backup copies available for restore, select Restore > Assets on the PowerProtect Data Manager UI.
Select the asset and click Restore.
PowerProtect Data Manager provides options to recover the Kubernetes namespaces to the same or to an
alternate cluster.
Restore to Original Cluster: Select this option to restore to a new namespace on the original cluster.
Restore to an Alternate Cluster: Select this option to restore to a new namespace on a different cluster, and
then select the cluster from the list.
Note: When restoring to an alternate cluster, ensure that this Kubernetes cluster has been added and
discovered in the PowerProtect Data Manager UI Asset Sources window.
Resources that are scoped at the cluster level and not bound to any specific namespace are called cluster
scoped resources (for example, cluster roles, cluster role bindings, and custom resource definitions (CRDs)).
When a CRD is created, the Kubernetes API server creates a new RESTful resource path for each version
specified. The CRD can be either namespaced or cluster scoped, as specified in the scope field. This section
examines how you can use Kubernetes backup copies to restore cluster scoped resources such as service
accounts, cluster roles, and cluster role bindings. The Velero component performs the backup and restore of
the namespace and its associated cluster scoped resources, such as cluster roles and cluster role bindings.
Custom resource definitions are included as a part of each namespace backup.
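To see which resource types and CRDs on a given cluster are cluster scoped, the standard kubectl listing commands can be used. The sketch below wraps them in small helpers; the helper names are illustrative, and the awk filter simply selects rows whose scope column reads Cluster.

```shell
#!/bin/sh
# Sketch only: read-only listing of cluster scoped resource types and CRDs.

# Resource types that are not bound to a namespace.
list_cluster_scoped_types() {
    kubectl api-resources --namespaced=false -o name
}

# CRD names with their scope (Namespaced or Cluster), one per row.
list_crd_scopes() {
    kubectl get crd --no-headers \
        -o custom-columns='NAME:.metadata.name,SCOPE:.spec.scope'
}

# Reads "name scope" rows on stdin and prints the names of cluster scoped CRDs.
only_cluster_scoped() {
    awk '$2 == "Cluster" { print $1 }'
}

# Usage:
#   list_cluster_scoped_types
#   list_crd_scopes | only_cluster_scoped
```

Comparing this output before and after a restore is one way to confirm that the expected cluster scoped resources came back.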
Restoring cluster scoped resources is controlled by a check box in the UI named Include cluster scoped
resources, which is selected before the restore process is run.
Below are the steps to verify the cluster role, cluster role binding, and CRD.
1. Log in to the Kubernetes cluster and run the following command to view the cluster roles.
kubectl get clusterrole
4. Run the following command to view the service account associated with the namespace.
kubectl get serviceaccount -n <namespace>
5. To delete all the cluster scoped resources associated with the namespace, run the
After the Kubernetes cluster is protected by a protection policy, namespaces and PVCs can be restored from
individual namespace backups.
• Restore to original namespace: Restore to the original namespace on the original cluster.
• Restore to new namespace: Create a namespace and restore to this location on the original cluster
or a different cluster.
• Restore to existing namespace: Restore to an existing namespace in the original cluster or a
different cluster.
On the PVCs page, if the configuration of the namespace you want to restore is different from the
configuration in the target namespace, perform the following:
• Select Overwrite content of existing PVCs to overwrite existing PVCs in the target location with the
PVCs being restored if the PVCs have the same name.
• Select Skip restore of existing PVCs to restore selected PVCs without overwriting existing PVCs in
the target location if they have the same name.
• Select Change storage class for PVCs to compatible storage class. The PVCs that are part of the
restore display.
• Select the check box next to the PVCs for which you want to change the storage class on the target
cluster.
The storage class mapping feature with PowerProtect Data Manager 19.8 enables you to choose an alternate
storage class for PVCs with a certain provisioner type while restoring persistent volumes. Storage class
mapping enables restoring namespaces and PVCs from one cluster to another using different container
storage. It is also useful when migrating data from one storage class to another, or from
on-premises to cloud and conversely.
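When planning a storage class mapping, it can be useful to list the storage classes on the target cluster by provisioner. The sketch below does this with kubectl and a small awk filter. It treats "same provisioner" as a rough stand-in for compatibility, which is an assumption of this example; the restore wizard determines the actual list of supported classes. The function names and the example provisioner name are hypothetical.

```shell
#!/bin/sh
# Sketch only: lists target-cluster storage classes that use a given provisioner.

# Prints "name provisioner" for every storage class in the current context.
list_storage_classes() {
    kubectl get storageclass --no-headers \
        -o custom-columns='NAME:.metadata.name,PROVISIONER:.provisioner'
}

# Reads that listing on stdin and prints the names whose provisioner equals $1.
classes_for_provisioner() {
    awk -v want="$1" '$2 == want { print $1 }'
}

# Usage, with a hypothetical provisioner name:
#   list_storage_classes | classes_for_provisioner csi.example.com
```

Running the listing against both the source and target clusters shows whether a class with a matching provisioner exists before the restore is started.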
Note: The storage class is not modified for existing PVCs being overwritten.
If Change storage class for PVCs to compatible storage class is selected, the Storage
Class page appears with a list of supported storage classes on the target cluster.
1. Select the check box next to a PVC for which you want to change the storage class on the target
cluster. Alternately, select multiple PVCs to change all selections to the same storage class.
Note: When changing the PVC storage class on the target Kubernetes cluster, if you select more than one
PVC at a time on this page, only the storage classes that apply to all selected PVCs are displayed. To view
and select from all available storage classes, select one PVC at a time.
2. Click Target Storage Class to select from the available storage classes. The Select Storage Class
dialog appears.
From the Summary page, click Restore to initiate the restore job. An informational dialog box appears
indicating that the restore has started.
From the Jobs > Protection Jobs window, the restore progress can be monitored.
• Virtual machines
• Kubernetes
• File system
• NAS
Quick recovery sends metadata from the source system to the destination system, following the same flow of
backup copies. This metadata makes the replication destination aware of the copies and enables the recovery
view. You can recover your workloads at the remote site before you restore the source PowerProtect Data
Manager system.
The replicated copies will be available for restore to a Kubernetes cluster that is added to the remote
PowerProtect Data Manager.
Remote PowerProtect Data Manager displaying the replicated backup copies of the source PowerProtect
Data Manager asset for restore:
The Data Protection Info Hub provides expertise that helps to ensure customer success with Dell data
protection products.