Nutanix Kubernetes Platform v2.12
Platform Guide
Nutanix Kubernetes Platform 2.12
October 8, 2024
Contents
2. Downloading NKP....................................................................................... 16
AWS Installation........................................................................................................................... 156
AWS Air-gapped Installation........................................................................................................ 167
AWS with FIPS Installation.......................................................................................................... 181
AWS Air-gapped with FIPS Installation........................................................................................192
AWS with GPU Installation...........................................................................................................205
AWS Air-gapped with GPU Installation........................................................................................217
EKS Installation Options......................................................................................................................... 230
EKS Installation............................................................................................................................ 230
EKS: Minimal User Permission for Cluster Creation....................................................................231
EKS: Cluster IAM Policies and Roles.......................................................................................... 233
EKS: Create an EKS Cluster....................................................................................................... 237
EKS: Grant Cluster Access.......................................................................................................... 242
EKS: Retrieve kubeconfig for EKS Cluster.................................................................................. 243
EKS: Attach a Cluster.................................................................................................................. 244
vSphere Installation Options................................................................................................................... 249
vSphere Prerequisites: All Installation Types...............................................................................249
vSphere Installation...................................................................................................................... 254
vSphere Air-gapped Installation................................................................................................... 267
vSphere with FIPS Installation..................................................................................................... 281
vSphere Air-gapped FIPS Installation.......................................................................................... 294
VMware Cloud Director Installation Options........................................................................................... 309
Azure Installation Options....................................................................................................................... 309
Azure Installation.......................................................................................................................... 310
Azure: Creating an Image............................................................................................................ 311
Azure: Creating the Management Cluster....................................................................................312
Azure: Install Kommander............................................................................................................ 313
Azure: Verifying your Installation and UI Log in.......................................................................... 315
Azure: Creating Managed Clusters Using the NKP CLI.............................................................. 316
AKS Installation Options......................................................................................................................... 319
AKS Installation............................................................................................................................ 319
AKS: Create an AKS Cluster....................................................................................................... 322
AKS: Retrieve kubeconfig for AKS Cluster.................................................................................. 323
AKS: Attach a Cluster.................................................................................................................. 325
GCP Installation Options.........................................................................................................................327
GCP Installation............................................................................................................................328
Deleting a Workspace.................................................................................................................. 398
Workspace Applications............................................................................................................... 398
Workspace Catalog Applications...................................................406
Configuring Workspace Role Bindings.........................................................................................420
Multi-Tenancy in NKP...................................................................................................................421
Generating a Dedicated Login URL for Each Tenant.................................................................. 423
Projects.................................................................................................................................................... 423
Creating a Project Using the UI................................................................................................... 424
Creating a Project Using the CLI................................................................................................. 424
Project Applications...................................................................................................................... 425
Project Deployments.....................................................................................................................441
Project Role Bindings................................................................................................................... 447
Project Roles................................................................................................................................ 450
Project ConfigMaps...................................................................................................................... 453
Project Secrets............................................................................................................................. 454
Project Quotas and Limit Ranges................................................................................................ 455
Project Network Policies...............................................................................................................457
Cluster Management............................................................................................................................... 462
Creating a Managed Nutanix Cluster Through the NKP UI......................................................... 462
Creating a Managed Azure Cluster Through the NKP UI............................................................464
Creating a Managed vSphere Cluster Through the NKP UI........................................................ 465
Creating a Managed Cluster on VCD Through the NKP UI.........................................................470
Kubernetes Cluster Attachment....................................................................................................473
Platform Expansion: Conversion of an NKP Pro Cluster to an NKP Ultimate Managed Cluster.......515
Creating Advanced CLI Clusters..................................................................................................532
Custom Domains and Certificates Configuration for All Cluster Types........................................533
Disconnecting or Deleting Clusters.............................................................................................. 538
Management Cluster.................................................................................................................... 539
Cluster Statuses........................................................................................................................... 539
Cluster Resources........................................................................................................................ 540
NKP Platform Applications........................................................................................................... 541
Cluster Applications and Statuses............................................................................................... 541
Custom Cluster Application Dashboard Cards.............................................................................542
Kubernetes Cluster Federation (KubeFed).................................................................................. 543
Backup and Restore................................................................................................................................544
Velero Configuration..................................................................................................................... 544
Velero Backup.............................................................................................................................. 557
Logging.................................................................................................................................................... 561
Logging Operator..........................................................................................................................562
Logging Stack............................................................................................................................... 562
Admin-level Logs.......................................................................................................................... 565
Workspace-level Logging............................................................................................................. 565
Multi-Tenant Logging.................................................................................................................... 573
Fluent Bit.......................................................................................................................................578
Configuring Loki to Use AWS S3 Storage in NKP.......................................................................582
Customizing Logging Stack Applications..................................................................................... 584
Security.................................................................................................................................................... 585
OpenID Connect (OIDC).............................................................................................................. 585
Identity Providers.......................................................................................................................... 586
Login Connectors..........................................................................................................................586
Access Token Lifetime................................................................................................................. 587
Authentication............................................................................................................................... 587
Connecting Kommander to an IdP Using SAML..........................................................................588
Enforcing Policies Using Gatekeeper...........................................................................................589
Traefik-Forward-Authentication in NKP (TFA)..............................................................................592
Local Users...................................................................................................................................594
Networking............................................................................................................................................... 597
Networking Service.......................................................................................................................598
Required Domains........................................................................................................................ 602
Load Balancing............................................................................................................................. 602
Ingress.......................................................................................................................................... 603
Configuring Ingress for Load Balancing.......................................................................................604
Istio as a Microservice................................................................................................................. 606
GPUs....................................................................................................................................................... 607
Configuring GPU for Kommander Clusters.................................................................................. 608
Enabling the NVIDIA Platform Application on a Management Cluster.........................................608
Enabling the NVIDIA Platform Application on Attached or Managed Clusters.............................610
Validating the Application............................................................................................................. 612
NVIDIA GPU Monitoring...............................................................................................................612
Configuring MIG for NVIDIA.........................................................................................................612
Troubleshooting NVIDIA GPU Operator on Kommander.............................................................614
Disabling NVIDIA GPU Operator Platform Application on Kommander....................................... 615
GPU Toolkit Versions................................................................................................................... 615
Enabling GPU After Installing NKP.............................................................................................. 616
Monitoring and Alerts.............................................................................................................................. 617
Recommendations........................................................................................................................ 617
Grafana Dashboards.................................................................................................................... 619
Cluster Metrics..............................................................................................................................621
Alerts Using AlertManager........................................................................................................... 621
Centralized Monitoring..................................................................................................................626
Centralized Metrics....................................................................................................................... 627
Centralized Alerts......................................................................................................................... 627
Federating Prometheus Alerting Rules........................................................................................ 628
Centralized Cost Monitoring......................................................................................................... 628
Application Monitoring using Prometheus.................................................................................... 630
Setting Storage Capacity for Prometheus....................................................................................632
Storage for Applications.......................................................................................................................... 632
Rook Ceph in NKP.......................................................................................................................633
Bring Your Own Storage (BYOS) to NKP Clusters......................................................................637
Pre-provisioned Installation in a Non-air-gapped Environment.................................................... 714
Pre-provisioned Installation in an Air-gapped Environment......................................................... 723
Pre-Provisioned Management Tools............................................................................................ 736
AWS Infrastructure.................................................................................................................................. 742
AWS Prerequisites and Permissions........................................................................................... 743
AWS Installation in a Non-air-gapped Environment.....................................................................759
AWS Installation in an Air-gapped Environment.......................................................................... 774
AWS Management Tools............................................................................................................. 789
EKS Infrastructure................................................................................................................................... 806
EKS Introduction...........................................................................................................................806
EKS Prerequisites and Permissions............................................................................................ 807
Creating an EKS Cluster from the CLI........................................................................................ 814
Create an EKS Cluster from the UI............................................................................................. 820
Granting Cluster Access...............................................................................................................822
Exploring your EKS Cluster..........................................................................................................823
Attach an Existing Cluster to the Management Cluster............................................................... 825
Deleting the EKS Cluster from CLI.............................................................................................. 829
Deleting EKS Cluster from the NKP UI....................................................................................... 830
Manage EKS Node Pools............................................................................................................ 832
Azure Infrastructure................................................................................................................................. 834
Azure Prerequisites...................................................................................................................... 835
Azure Non-air-gapped Install........................................................................................................839
Azure Management Tools............................................................................................................ 850
AKS Infrastructure................................................................................................................................... 854
Use Nutanix Kubernetes Platform to Create a New AKS Cluster................................................855
Create a New AKS Cluster from the NKP UI.............................................................................. 857
Explore New AKS Cluster............................................................................................................ 859
Delete an AKS Cluster................................................................................................................. 861
vSphere Infrastructure............................................................................................................................. 862
vSphere Prerequisites.................................................................................................................. 864
vSphere Installation in a Non-air-gapped Environment................................................................872
vSphere Installation in an Air-Gapped Environment.................................................................... 886
vSphere Management Tools........................................................................................................ 902
VMware Cloud Director Infrastructure.....................................................................................................912
VMware Cloud Director Prerequisites.......................................................................................... 912
Cloud Director Configure the Organization.................................................................................. 916
Cloud Director Install NKP........................................................................................................... 925
Cloud Director Management Tools.............................................................................................. 935
Google Cloud Platform (GCP) Infrastructure.......................................................................................... 942
GCP Prerequisites........................................................................................................................ 943
GCP Installation in a Non-air-gapped Environment..................................................................... 946
GCP Management Tools..............................................................................................................957
Images Download into Your Registry: Air-gapped, Pre-provisioned Environments......................974
Installing Kommander in a Pre-provisioned, Non-Air-gapped Environment............................................977
Pro License: Installing Kommander in a Pre-provisioned, Non-Air-gapped Environment.............978
Ultimate License: Installing Kommander in Pre-provisioned, Non-Air-gapped with NKP Catalog Applications.......979
Installing Kommander in a Small Environment.......................................................................................979
Dashboard UI Functions......................................................................................................................... 981
Logging into the UI with Kommander..................................................................................................... 981
Default StorageClass...............................................................................................................................982
Identifying and Modifying Your StorageClass.............................................................................. 982
Installing Kommander............................................................................................................. 983
Installing Kommander with a Configuration File..................................................................................... 983
Configuring Applications After Installing Kommander.............................................................................984
Verifying Kommander Installation............................................................................................................985
Kommander Configuration Reference.....................................................................................................986
Configuring the Kommander Installation with a Custom Domain and Certificate................................... 990
Reasons for Setting Up a Custom Domain or Certificate.......................................................................990
Certificate Issuer and KommanderCluster Concepts..............................................................................991
Certificate Authority................................................................................................................................. 992
Certificate Configuration Options............................................................................................................ 992
Using an Automatically-generated Certificate with ACME and Required Basic Configuration..... 992
Using an Automatically-generated Certificate with ACME and Required Advanced Configuration.......993
Using a Manually-generated Certificate....................................................................................... 995
Advanced Configuration: ClusterIssuer...................................................................................................996
Configuring a Custom Domain Without a Custom Certificate.................................................................997
Verifying and Troubleshooting the Domain and Certificate Customization............................................. 998
DNS Record Creation with External DNS...............................................................................................998
Configuring External DNS with the CLI: Management or Pro Cluster..........................................999
Configuring the External DNS Using the UI...............................................................................1000
Customizing the Traefik Deployment Using the UI.................................................................... 1001
Verifying Your External DNS Configuration............................................................................... 1002
Verifying Whether the DNS Deployment Is Successful............................................................. 1002
Examining the Cluster’s Ingress.................................................................................................1003
Verifying the DNS Record.......................................................................................................... 1003
External Load Balancer.........................................................................................................................1004
Configuring Kommander to Use an External Load Balancer..................................................... 1005
Configuring the External Load Balancer to Target the Specified Ports.....................................1005
HTTP Proxy Configuration Considerations........................................................................................... 1006
Configuring HTTP proxy for the Kommander Clusters......................................................................... 1006
Enabling Gatekeeper.............................................................................................................................1007
Creating Gatekeeper ConfigMap in the Kommander Namespace........................................................1008
Installing Kommander Using the Configuration Files and ConfigMap...................................................1009
Configuring the Workspace or Project.................................................................................................. 1009
Configuring HTTP Proxy in Attached Clusters..................................................................................... 1009
Creating Gatekeeper ConfigMap in the Workspace Namespace......................................................... 1010
Configuring Your Applications............................................................................................................... 1010
Configuring Your Application Manually................................................................................................. 1011
NKP Catalog Applications Enablement after Installing NKP.................................................................1011
Configuring a Default Ultimate Catalog after Installing NKP................................................................ 1011
NKP Catalog Application Labels........................................................................................................... 1012
Infrastructure Requirements for FIPS 140-2 Mode.................................................................... 1014
Deploying Clusters in FIPS Mode.............................................................................................. 1014
FIPS 140 Images: Non-Air-Gapped Environments.................................................................... 1015
FIPS 140 Images: Air-gapped Environment...............................................................................1015
Validate FIPS 140 in Cluster......................................................................................................1016
FIPS 140 Mode Performance Impact.........................................................................................1017
Registry Mirror Tools.............................................................................................................................1017
Air-gapped vs. Non-air-gapped Environments........................................................................... 1018
Local Registry Tools Compatible with NKP............................................................................... 1018
Using a Registry Mirror.............................................................................................................. 1019
Seeding the Registry for an Air-gapped Cluster...................................................................................1021
Configure the Control Plane..................................................................................................................1022
Modifying Audit Logs.................................................................................................................. 1022
Viewing the Audit Logs.............................................................................................................. 1027
Updating Cluster Node Pools................................................................................................................1028
Cluster and NKP Installation Verification.............................................................................................. 1029
Checking the Cluster Infrastructure and Nodes......................................................................... 1029
Monitor the CAPI Resources......................................................................................................1030
Verify all Pods............................................................................................................................ 1030
Troubleshooting.......................................................................................................................... 1030
GPU for Konvoy.................................................................................................................................... 1031
Delete a NKP Cluster with One Command.......................................................................................... 1031
konvoy-image version.................................................................................................................1088
Enable the NKP AI Navigator Cluster Info Agent...................................................................... 1138
Customizing the AI Navigator Cluster Info Agent...................................................................... 1138
Data Privacy FAQs.....................................................................................................................1138
Related Information
For information on related topics or procedures, see the Kubernetes documentation.
Table 1: Nutanix
Columns: Operating System | Kernel | Default Config | FIPS | Air-gapped | FIPS Air-gapped | GPU Support | GPU Air-gapped | Nutanix Image Builder
Rocky Linux 9.1 (kernel 5.14.0-162.12.1.el9_1.0.2.x86_64): Yes, Yes, Yes

Table 7: vSphere
Columns: Operating System | Kernel | Default Config | FIPS | Air-gapped | FIPS Air-gapped | GPU Support | GPU Air-gapped | Konvoy Image Builder
Amazon Linux 2 v7.9 (kernel 5.10.199-190.747.amzn2.x86_64): Yes
Procedure
1. From the Nutanix Download Site, select the NKP binary for either Darwin (macOS) or Linux OS.
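After downloading, you typically extract the archive and place the nkp binary on your PATH. The following is a sketch only; the archive file name is illustrative and depends on the version and operating system you selected:
tar -xvf nkp_v2.12.0_linux_amd64.tar.gz   # illustrative file name; use the archive you downloaded
sudo mv nkp /usr/local/bin/
nkp version   # confirms the binary is installed and prints its version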
• Help you get started with Nutanix Kubernetes Platform (NKP) through the planning phase, which introduces definitions and concepts.
• Guide you through the NKP software installation and start-up with Basic Installations by Infrastructure on page 50.
• Guide you through Cluster Operations Management on page 339, which involves customizing applications and managing operations.
You can install in multiple ways:
• On Nutanix infrastructure.
• On a public cloud infrastructure, such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Azure.
• On an internal network, on-premises environment, or with a physical or virtual infrastructure.
• On an air-gapped environment.
• With or without Federal Information Processing Standards (FIPS) and graphics processing unit (GPU).
Before you install NKP:
Procedure
1. Complete the prerequisites (see Prerequisites for Installation on page 44) required to install NKP.
2. Determine the infrastructure (see Resource Requirements on page 38) on which you want to deploy NKP.
3. After you choose your environment, download NKP, and select the Basic Installations by Infrastructure on
page 50 for your infrastructure provider and environment.
The basic installations set up the cluster with the Konvoy component and then install the Kommander component
to access the dashboards through the NKP UI. The topics in the Basic Installations by Infrastructure on
page 50 chapter help you explore NKP and prepare clusters for production to deploy and enable the
applications that support Cluster Operations Management on page 339.
4. (Optional) After you complete the basic installation and are ready to customize, perform Custom Installation
and Additional Infrastructure Tools, if required.
5. To prepare the software, perform the steps described in the Cluster Operations Management chapter.
• Konvoy is the cluster life cycle manager component of NKP. Konvoy relies on Cluster API (CAPI), Calico, and other open-source and proprietary software to provide simple cluster life cycle
management for conformant Kubernetes clusters with networking and storage capabilities.
Konvoy uses industry-standard tools to provision certified Kubernetes clusters on multiple cloud providers,
vSphere, and on-premises hardware in connected and air-gapped environments. Konvoy contains the following
components:
Cluster Manager consists of Cluster API, Container Storage Interface (CSI), Container Network Interface (CNI),
Cluster AutoScaler, Cert Manager, and MetalLB.
For Networking, Kubernetes uses CNI (Container Network Interface) as an interface between network
infrastructure and Kubernetes pod networking. In NKP, the Nutanix provider uses the Cilium CNI. All other
providers use Calico CNI.
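As a quick check of which CNI a cluster is running, you can list the networking pods in the kube-system namespace; this is only a sketch, and the exact pod names depend on your provider and NKP version:
kubectl get pods -n kube-system --kubeconfig=${CLUSTER_NAME}.conf | grep -Ei 'cilium|calico'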
The Konvoy component is installed according to the cluster’s infrastructure. Remember:
1. To install NKP quickly and without much customization, see Basic Installations by Infrastructure on
page 50.
2. To choose more environments and cluster customizations, see Custom Installation and Additional
Infrastructure Tools.
• Kommander is the fleet management component of NKP. Kommander delivers centralized observability, control,
governance, unified policy, and better operational insights. With NKP Pro, Kommander manages a single
Kubernetes cluster.
In NKP Ultimate, Kommander supports attaching workload clusters and life cycle management of clusters using
Cluster API. NKP Ultimate also offers life cycle management of applications through FluxCD. Kommander
contains the following components:
Cluster Types
Cluster types such as Management clusters, Managed clusters, and Attached clusters are key concepts in
understanding and getting the most out of Nutanix Kubernetes Platform (NKP) Pro versus Ultimate environments.
Multi-cluster Environment
• Management Cluster: Is the cluster where you install NKP, and it is self-managed. In a multi-cluster
environment, the Management cluster also manages other clusters. Customers with an Ultimate license need to
run workloads on Managed and Attached clusters, not on the Management cluster. For more information, see
License Packaging.
• Managed Cluster: Also called an “NKP cluster,” this is a type of workload cluster that you can create with
NKP. The NKP Management cluster manages its infrastructure, its life cycle, and its applications.
• Attached Cluster: This is a type of workload cluster that is created outside of NKP but is then connected to the
NKP Management Cluster so that NKP can manage it. In these cases, the NKP Management cluster only manages
the attached cluster’s applications.
Single-cluster Environment
NKP Pro Cluster: Is the cluster where you install NKP. An NKP Pro cluster is a stand-alone cluster. It is self-
managed and, therefore, capable of provisioning itself. In this single-cluster environment, you cannot attach other
clusters; all workloads are run on your NKP Pro cluster. You can, however, have several separate NKP Pro instances,
each with its own license.
Customers with a Pro license can run workloads on their NKP Pro cluster.
Note: If you have not decided which license to get but plan on adding one or several clusters to your environment and
managing them centrally, Nutanix recommends obtaining an Ultimate license.
Self-Managed Cluster
In the Nutanix Kubernetes Platform (NKP) landscape, only NKP Pro and NKP Ultimate Management clusters are
self-managed. Self-managed clusters manage the provisioning and deployment of their own nodes through CAPI
controllers. The CAPI controllers are a managing entity that automatically manages the life cycle of a cluster’s nodes
based on a customizable definition of the resources.
A self-managed cluster is one in which the CAPI resources and controllers that describe and manage it run on the
same cluster they are managing. As part of the underlying processing using the --self-managed flag, the NKP CLI
does the following:
Network-Restricted Cluster
A network-restricted cluster is not the same as an air-gapped cluster.
A network-restricted or firewalled cluster is secured by a firewall, perimeter network, Network Address Translation (NAT)
gateway, or proxy server, and requires additional access information. Network-restricted clusters are usually located in
remote locations or at the edge and, therefore, are not in the same network as the Management cluster.
The main difference between network-restricted and air-gapped clusters is that network-restricted clusters can reach
external networks (like the Internet), but their services or ingresses cannot be accessed from outside. Air-gapped
clusters, however, do not allow ingress or egress traffic.
In a multi-cluster environment, NKP supports attaching a network-restricted cluster to an NKP Management cluster.
You can also enable a proxied access pipeline through the Management cluster, which allows you to access the
network-restricted cluster’s dashboards without being in the same network.
Related Information
For information on related topics or procedures, see:
Non-Air-Gapped Environments
In a non-air-gapped environment, two-way access to and from the Internet exists. You can create a non-air-gapped
cluster on pre-provisioned (on-premises) environments or any cloud infrastructure.
NKP in a non-air-gapped environment allows you to manage your clusters while facilitating connections and offering
integration with other tools and systems.
Common Industry Synonyms: Open, accessible (to the Internet), not restricted, Non-classified Internet Protocol
(IP) Router Network (NIPRNet), etc.
Pre-provisioned Infrastructure
The pre-provisioned infrastructure option allows Nutanix Kubernetes Platform (NKP) to deploy Kubernetes to pre-existing
machines. With other providers, such as vSphere, AWS, or Azure, NKP creates or provisions the machines before Kubernetes
is deployed: on most infrastructures (including vSphere and cloud providers), NKP provisions the nodes automatically as
part of deploying a cluster, creating the virtual machine (VM) using the appropriate image and then handling the
networking and installation of Kubernetes.
However, NKP can also work with pre-provisioned infrastructure, in which you provision the VMs for the nodes yourself.
You can pre-provision nodes for NKP on bare metal, vSphere, or cloud, and a pre-provisioned vSphere deployment can
combine physical (on-premises bare metal) and virtual servers (VMware vSphere). Pre-provisioned infrastructure is
typically used for:
• On-premises clusters.
• Cloud or Infrastructure as a Service (IaaS) environments that do not currently have a Nutanix-supported
infrastructure provider.
Related Information
For information on related topics or procedures, see Pre-provisioned Installation Options on page 65.
Licenses
This chapter describes the Nutanix Kubernetes Platform (NKP) licenses. The license type you subscribe to determines
what NKP features are available to you. Features compatible with all versions of NKP can be activated by purchasing
additional Add-on licenses.
The NKP licenses available are:
• NKP Starter
• NKP Pro
• NKP Ultimate
SECURITY
FIPS Compliant Build: X X
Konvoy Image Builder or Bring Your Own OS: X X
Nutanix provided OS Image (Rocky Linux): X X X
Air-Gapped Deployments: X X X
RBAC
RBAC - Admin role only: X X X
RBAC - Kubernetes: X X X
NKP RBAC: X X
Customize UI Banners: X X
Upload custom Logo: X
Purchase of a License
NKP Licenses are sold in units of cores. To learn more about licenses and to obtain a valid license:
Compatible Infrastructure
NKP Starter operates across Nutanix's entire range of cloud, on-premises, edge, and air-gapped infrastructures and
has support for various OSes, including immutable OSes. To view the complete list of compatible infrastructure, see
Supported Infrastructure Operating Systems on page 12.
For the basics on standing up an NKP Starter cluster in one of the listed environments, see Basic Installations by
Infrastructure on page 50 or Custom Installation and Infrastructure Tools on page 644.
Cluster Manager
Konvoy is the Kubernetes installer component of NKP that uses industry-standard tools to create a certified
Kubernetes cluster. These industry standard tools create a cluster management system that includes:
• Control Plane: Manages the worker nodes and pods in the cluster.
Builtin GitOps
NKP Starter is bundled with GitOps, an operating model for Kubernetes and other cloud native technologies,
providing a set of best practices that unify Git deployment, management, and monitoring for containerized clusters
and applications. GitOps uses Git as a single source of truth for declarative infrastructure and applications. With
GitOps, software agents can alert on any divergence between Git and what is running in a cluster; if there is a
difference, Kubernetes reconcilers automatically update or roll back the cluster as needed.
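To make the model concrete, the following is a minimal, generic Flux-style manifest of the kind a GitOps workflow applies. It is illustrative only: the repository URL, names, and path are placeholders, and the exact API versions available depend on the Flux components bundled with your NKP release.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-app-repo          # placeholder name
  namespace: default
spec:
  interval: 5m
  url: https://example.com/my-org/my-app.git   # placeholder repository
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: default
spec:
  interval: 10m
  path: ./deploy              # placeholder path within the repository
  prune: true                 # remove cluster resources that were deleted from Git
  sourceRef:
    kind: GitRepository
    name: my-app-repo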
Platform Applications
NKP Starter deploys only the required applications during installation by default. You can use the Kommander UI
to customize which Platform applications to deploy to the cluster in a workspace. For a list of available Platform
applications included with NKP, see Workspace Platform Application Resource Requirements on page 394.
Compatible Infrastructure
NKP Pro operates across Nutanix's entire range of cloud, on-premises, edge, and air-gapped infrastructures and has
support for various OSs, including immutable OSs. For a complete list of compatible infrastructure, see Supported
Infrastructure Operating Systems on page 12.
For instructions on standing up an NKP Pro cluster in one of the listed environments, see Basic Installations by
Infrastructure on page 50 or Custom Installation and Infrastructure Tools on page 644.
Cluster Manager
Konvoy is the Kubernetes installer component of NKP Pro that uses industry-standard tools to create a certified
Kubernetes cluster. These industry standard tools create a cluster management system that includes:
• Control Plane: Manages the worker nodes and pods in the cluster.
• Worker Nodes: Used to run containerized applications and handle networking to ensure that the traffic between
applications across and outside the cluster is facilitated correctly.
• Container Network Interface (CNI): Calico's open-source networking and network security solution for
containers, virtual machines, and native host-based workloads.
• Container Storage Interface (CSI): A common abstraction to container orchestrations for interacting with storage
subsystems of various types.
• Kubernetes Cluster API (CAPI): Cluster API uses Kubernetes-style APIs and patterns to automate cluster life
cycle management for platform operators. For more information on how CAPI is integrated into NKP Pro, see
CAPI Concepts and Terms on page 20.
• Cert Manager: A Kubernetes addon to automate the management and issuance of TLS certificates from various
issuing sources.
• Cluster Autoscaler: A component that automatically adjusts the size of a Kubernetes cluster so that all pods have a
location to run and there are no unwanted nodes.
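Most of these components run as pods on the cluster, so a quick way to confirm that they are present after installation is to filter the pod list. This is a sketch, assuming the kubeconfig setup described later in this guide; pod names can vary by release:
kubectl get pods -A --kubeconfig=${CLUSTER_NAME}.conf | grep -Ei 'cert-manager|cluster-autoscaler|calico|csi|metallb'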
Builtin GitOps
NKP Pro is bundled with GitOps, an operating model for Kubernetes and other cloud native technologies, providing
a set of best practices that unify Git deployment, management, and monitoring for containerized clusters and
applications. GitOps uses Git as a single source of truth for declarative infrastructure and applications. With GitOps,
software agents can alert on any divergence between Git and what is running in a cluster; if there is a difference,
Kubernetes reconcilers automatically update or roll back the cluster as needed.
Platform Applications
When creating a cluster, the application manager deploys specific platform applications on the newly created cluster.
You can deploy applications in any NKP managed cluster, with complete flexibility to operate across cloud, on-premises, edge, and air-gapped scenarios.
Compatible Infrastructure
NKP Ultimate operates across Nutanix's entire range of cloud, on-premises, edge, and air-gapped infrastructures and
has support for various OSs, including immutable OSs. See Supported Infrastructure Operating Systems on
page 12 for a complete list of compatible infrastructure.
For the basics on standing up an NKP Ultimate cluster in one of the listed environments of your choice, see Basic
Installations by Infrastructure on page 50 or Custom Installation and Infrastructure Tools on page 644.
Cluster Manager
Konvoy is the Kubernetes installer component of NKP that uses industry-standard tools to create a certified
Kubernetes cluster. These industry standard tools create a cluster management system that includes:
• Control Plane: Manages the worker nodes and pods in the cluster.
Builtin GitOps
NKP Ultimate is bundled with GitOps, an operating model for Kubernetes and other cloud native technologies,
providing a set of best practices that unify Git deployment, management, and monitoring for containerized clusters
and applications. GitOps uses Git as a single source of truth for declarative infrastructure and applications. With
GitOps, software agents can alert on any divergence between Git and what is running in a cluster; if there is a
difference, Kubernetes reconcilers automatically update or roll back the cluster as needed.
Platform Applications
When creating a cluster, the application manager deploys specific platform applications on the newly created cluster.
You can deploy applications in any NKP managed cluster, giving you complete flexibility to operate across cloud,
on-premises, edge, and air-gapped scenarios. With NKP Ultimate, you can use the Kommander UI to customize which
platform applications to deploy to the cluster in a given workspace. For a list of available platform applications
included with NKP, see Workspace Platform Application Resource Requirements on page 394.
Procedure
5. Paste your license token in the Enter License section inside the License Key field.
» For Nutanix licenses, paste your license token in the provided fields.
» For D2iQ licenses, paste the license token in the text box.
6. Select Save.
Procedure
4. Your existing licenses will be listed. Select Remove License on the license you would like to remove and
follow the prompts.
For kubectl and NKP commands to run, it is often necessary to specify the environment or cluster in which you
want to run them. This also applies to commands that create, delete, or update a cluster’s resources.
There are two options:
• Set the KUBECONFIG environment variable, which is better suited for single-cluster environments.
• Pass the --kubeconfig flag with each command, which is better suited for multi-cluster environments.
Single-cluster Environment
In a single-cluster environment, you do not need to switch between clusters to run commands and perform operations.
However, you still need to specify an environment for each terminal session. This ensures that the NKP CLI runs
operations on the NKP cluster and does not accidentally run them on, for example, the bootstrap cluster.
To set the environment for all your operations using the kubeconfig file:
• When you create a cluster, a kubeconfig file is generated automatically. Get the kubeconfig file and write it to
the ${CLUSTER_NAME}.conf file:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
• Set the context by exporting the kubeconfig file. Run this export in each terminal session; in any session where it
is not set, you must pass the --kubeconfig flag with each command:
export KUBECONFIG=${CLUSTER_NAME}.conf
Multi-cluster Environment
Having multiple clusters means switching between clusters to run operations. Nutanix recommends two
approaches:
• You can use a flag to reference the target cluster. The --kubeconfig=<CLUSTER_NAME>.conf flag defines the
configuration file for the cluster that you configure and try to access.
This is the easiest way to ensure you are working on the correct cluster when operating and using multiple
clusters. If you create additional clusters and do not store the name as an environment variable, you can enter the
cluster name followed by .conf to access your cluster.
Note: Ensure that you run nkp get kubeconfig for each cluster you create to generate a
kubeconfig file.
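Putting the two approaches together, a typical session might look like the following sketch (the cluster names are placeholders):
# Single-cluster workflow: export the kubeconfig once per terminal session
nkp get kubeconfig -c demo-cluster > demo-cluster.conf
export KUBECONFIG=demo-cluster.conf
kubectl get nodes
# Multi-cluster workflow: pass the kubeconfig explicitly with each command
nkp get kubeconfig -c edge-cluster > edge-cluster.conf
kubectl get nodes --kubeconfig=edge-cluster.conf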
Storage
This document describes the model used in Kubernetes for managing persistent, cluster-scoped storage for workloads
requiring access to persistent data.
A workload on Kubernetes typically requires two types of storage:
• Ephemeral Storage
• Persistent Volume
Ephemeral Storage
Ephemeral storage, as its name suggests, is cleaned up when the workload is deleted or the container crashes. The
following are examples of ephemeral storage provided by Kubernetes:
Kubernetes automatically manages ephemeral storage and typically does not require explicit settings. However, you
might need to express capacity requests for temporary storage so that kubelet can use that information to ensure that
each node has enough.
Persistent Volume
A persistent volume claim (PVC) is a storage request. A workload that requires persistent storage uses a PVC to
express its request. A PVC can request a specific size and access modes (for example, a volume can be mounted
once read/write or many times read-only).
Any workload can specify a PersistentVolumeClaim. For example, a Pod may need a volume that is at least 4Gi large
or a volume mounted under /data in the container’s filesystem. If a PersistentVolume (PV) satisfies the specified
requirements in the PersistentVolumeClaim (PVC), it will be bound to the PVC before the Pod starts.
Note: NKP uses the local static provisioner as the default storage provider for pre-provisioned clusters.
However, localvolumeprovisioner is not suitable for production use. Use a Kubernetes CSI driver (see https://
kubernetes.io/docs/concepts/storage/volumes/#volume-types) that is compatible with storage suitable for
production.
You can choose from any storage option available for Kubernetes (see https://kubernetes.io/docs/concepts/storage/
volumes/#volume-types). To disable the default that Konvoy deploys, mark the localvolumeprovisioner
StorageClass as non-default. Then, set the newly created StorageClass as the default by following the commands in
the Changing the default StorageClass topic in the Kubernetes documentation (see https://kubernetes.io/docs/tasks/
administer-cluster/change-default-storage-class/).
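The referenced Kubernetes topic essentially amounts to flipping the is-default-class annotation on the StorageClass objects. A sketch, where <your-storageclass> is a placeholder for the class you want to make the default:
kubectl get storageclass
kubectl patch storageclass localvolumeprovisioner -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
kubectl patch storageclass <your-storageclass> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'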
When a default StorageClass is specified, you can create PVCs without specifying the StorageClass. For
instance, to request a volume using the default provisioner, create a PVC with the following configurations:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pv-claim
spec:  # size and access mode below are illustrative
  accessModes: ["ReadWriteOnce"]
  resources:
    requests: {storage: 4Gi}
To start the provisioning of a volume, launch a pod that references the PVC:
...
    volumeMounts:
    - mountPath: /data
      name: persistent-storage
...
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: my-pv-claim
Note: To specify a StorageClass that references a storage policy when making a PVC (see https://kubernetes.io/
docs/concepts/storage/persistent-volumes/#class-1), specify its name in storageClassName. If left
blank, the default StorageClass is used.
• StorageClass types with specific configurations. You can change the default StorageClass using these steps
from the Kubernetes site: Changing the default storage class
Ceph can also be used as Container Storage Interface (CSI) storage. For information on how to use Rook Ceph, see
Rook Ceph in NKP on page 633.
Driver Information
The following are infrastructure provider CSI driver specifics.
• The purpose is to manage the life cycle of the load balancers as well as associate Kubernetes nodes with virtual
machines in the infrastructure. See Cloud provider component (CPI)
• Cluster API for VMware Cloud Director (CAPVCD) is a component that runs in a Kubernetes cluster that
connects to the VCD API. It uses the Cloud Provider Interface (CPI) to create and manage the infrastructure.
• Persistent volumes with a Filesystem volume mode are discovered if you mount them under /mnt/disks.
• Persistent volumes with a Block volume-mode are discovered if you create a symbolic link to the block device in
/mnt/disks.
For additional NKP documentation regarding StorageClass, see Default Storage Providers on page 33.
Note: When creating a pre-provisioned infrastructure cluster, NKP uses localvolumeprovisioner as the default storage provider (see https://d2iq.atlassian.net/wiki/spaces/DENT/pages/29919120). However, localvolumeprovisioner is not suitable for production use. Use a Kubernetes CSI-compatible storage provider (see https://kubernetes.io/docs/concepts/storage/volumes/#volume-types) that is suitable for production.
• You can access a Linux, macOS, or Windows computer with a supported OS version.
• You have a provisioned NKP cluster that uses the localvolumeprovisioner platform application but has not
added any other Kommander applications to the cluster yet.
This distinction between provisioning and deployment is important because some applications depend on the storage class provided by the localvolumeprovisioner component and can fail to start if it is not configured.
Procedure
1. Create a pre-provisioned cluster by following the steps outlined in the pre-provisioned infrastructure topic.
As volumes are created or mounted on the nodes, the local volume provisioner detects each volume in the /mnt/disks directory and adds it as a persistent volume with the localvolumeprovisioner StorageClass. For more information, see the documentation regarding Kubernetes Local Storage.
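For example, to make a volume discoverable by the local volume provisioner, mount it under /mnt/disks on the node. The device and directory names in this sketch are assumptions for your environment:
# On each worker node that provides local storage:
sudo mkdir -p /mnt/disks/disk1
# Mount a dedicated filesystem on the directory:
sudo mount /dev/sdb1 /mnt/disks/disk1
# ...or bind-mount an existing directory instead:
# sudo mkdir -p /var/lib/disk1 && sudo mount --bind /var/lib/disk1 /mnt/disks/disk1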
4. Claim the persistent volume using a PVC by running the following command.
cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: example-claim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Mi
storageClassName: localvolumeprovisioner
EOF
5. Reference the persistent volume claim in a pod by running the following command.
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
name: pod-with-persistent-volume
spec:
containers:
- name: frontend
image: nginx
volumeMounts:
- name: data
mountPath: "/var/www/html"
volumes:
- name: data
persistentVolumeClaim:
claimName: example-claim
EOF
6. Verify the persistent volume claim using the command kubectl get pvc.
Example output:
NAME            STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS             AGE
example-claim   Bound    local-pv-4c7fc8ba   3986Mi     RWO            localvolumeprovisioner   78s
Resource Requirements
To ensure a successful Nutanix Kubernetes Platform (NKP) installation, you must meet certain resource requirements
for the control plane nodes and worker nodes. These resource requirements can be slightly different for different
infrastructure providers and license types.
Section Contents
Table 15: General Resource Requirements for Pro and Ultimate Clusters
Root Volume: Disk usage must be below 85% (applies to both control plane and worker nodes).
Note: If you use the instructions to create a cluster using the NKP default settings, without any edits to configuration files or additional flags, your cluster is deployed on an Ubuntu 20.04 OS image (see Supported Infrastructure Operating Systems on page 12) with three control plane nodes and four worker nodes, which matches the requirements above.
Note: The Starter License is supported exclusively with the Nutanix Infrastructure.
Root Volume: Disk usage must be below 85% (applies to both control plane and worker nodes).
Non-Default Flags in the CLI:
--control-plane-vcpus 2 \
--control-plane-memory 6 \
--worker-replicas 2 \
--worker-vcpus 4 \
--worker-memory 4
Root Volume: Disk usage must be below 85% (applies to both control plane and worker nodes).
Non-Default Flags in the CLI:
--control-plane-vcpus 2 \
--control-plane-memory 4 \
--worker-replicas 2 \
--worker-vcpus 3 \
--worker-memory 3
Note: Four worker nodes are required to support upgrades to the rook-ceph platform application. rook-ceph
supports the logging stack, the velero backup tool, and NKP Insights. If you have disabled the rook-ceph platform
application, only three worker nodes are required.
• Default Storage Class and four volumes of 32GiB, 32GiB, 2GiB, and 100GiB or the ability to create those
volumes depending on the Storage Class:
$ kubectl get pv -A
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                          STORAGECLASS   REASON   AGE
pvc-08de8c06-bd66-40a3-9dd4-b0aece8ccbe8   32Gi       RWO            Delete           Bound    kommander-default-workspace/kubecost-cost-analyzer             ebs-sc                  124m
pvc-64552486-7f4c-476a-a35d-19432b3931af   32Gi       RWO            Delete           Bound    kommander-default-workspace/kubecost-prometheus-server         ebs-sc                  124m
pvc-972c3ee3-20bd-449b-84d9-25b7a06a6630   2Gi        RWO            Delete           Bound    kommander-default-workspace/kubecost-prometheus-alertmanager   ebs-sc                  124m
Note: Actual workloads might demand more resources depending on the usage.
Note: Applications with an asterisk ("*") are NKP Ultimate-only apps deployed by default for NKP Ultimate customers only.
• Currently, NKP only supports a single deployment of cert-manager per cluster. Because of this, cert-
manager cannot be installed on any Konvoy managed clusters or clusters with cert-manager pre-installed.
Note: Additional prerequisites are necessary for air-gapped environments; verify that all the non-air-gapped conditions are met and then add any additional air-gapped prerequisites listed.
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• kubectl (see https://kubernetes.io/docs/tasks/tools/#kubectl) for interacting with the cluster running on the
host where the NKP Konvoy CLI runs.
• Konvoy Image Builder (KIB) (see Konvoy Image Builder).
• Valid provider account with the credentials configured.
• If you follow the instructions in the Basic Installations by Infrastructure on page 50 topic for installing
NKP, use the --self-managed flag for a self-managed cluster. If you use the instructions in Custom
Installation and Additional Infrastructure Tools, ensure that you perform the self-managed process on your
new cluster:
• A self-managed AWS cluster.
• A self-managed Azure cluster.
• Pre-provisioned only:
• Linux machine (bastion) that has access to the existing Virtual Private Cloud (VPC) instead of an x86_64-based
Linux or macOS machine.
• Ability to download artifacts from the internet and then copy those onto your Bastion machine.
• Download the complete NKP air-gapped bundle NKP-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz
for this release (see Downloading NKP on page 16).
• The version of the CLI that matches the NKP version you want to install.
• Review the Management Cluster Application Requirements and Workspace Platform Application Defaults and Resource Requirements to ensure your cluster has sufficient resources.
• Ensure you have a default StorageClass configured (see Default Storage Providers on page 33); the Konvoy component is responsible for configuring one. A quick way to verify this is shown in the example after this list.
• A load balancer to route external traffic.
In cloud environments, this information is provided by your cloud provider. You can configure MetalLB for on-
premises and vSphere deployments. It is also possible to use Virtual IP. For more details, see Load Balancing on
page 602.
• Ensure you meet the storage requirements, have a default storage class, and review the Workspace Platform Application Defaults and Resource Requirements on page 42.
• Ensure you have added at least 40 GB of raw storage to your clusters' worker nodes.
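To confirm that a default StorageClass exists, list the StorageClasses on the cluster; the default is marked with (default). The output below is illustrative only; your StorageClass name and provisioner depend on your infrastructure:
kubectl get storageclass
NAME               PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
ebs-sc (default)   ebs.csi.aws.com   Delete          WaitForFirstConsumer   true                   10m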
Air-gapped Only (additional prerequisites)
In an air-gapped environment, your environment is isolated from unsecured networks, like the Internet, and therefore
requires additional considerations for installation. Below are the additional prerequisites required if installing in an
air-gapped environment:
• A local registry (see Registry Mirror Tools) containing all the necessary installation images, including the Kommander images, which were downloaded in the air-gapped bundle above (NKP-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz). To learn how to push the required images to this registry and load the registry for each provider, see the Basic Installations by Infrastructure on page 50 section.
• Connectivity with the clusters attached to the management cluster:
Note: If you want to customize your cluster’s domain or certificate, ensure you review the respective documentation
sections:
Installing NKP
This topic lists the basic package requirements for your environment to perform a successful installation of Nutanix Kubernetes Platform (NKP). Next, install NKP, and then you can begin any custom configurations based on your environment.
3. Check the supported Kubernetes versions after finding your version with the preceding command.
4. For air-gapped environments, create a bastion host for the cluster nodes to use within the air-gapped network. The bastion host needs access to a local registry instead of an Internet connection to export images. The recommended template naming pattern is ../folder-name/NKP-e2e-bastion-template or similar.
Each infrastructure provider has its own set of bastion host instructions. For specific details, see the respective provider's site: Azure (see https://learn.microsoft.com/en-us/azure/bastion/quickstart-host-portal), AWS (see https://aws.amazon.com/solutions/implementations/linux-bastion/), GCP (see https://blogs.vmware.com/cloud/2021/06/02/intro-google-cloud-vmware-engine-bastion-host-access-iap/), or vSphere (see https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.security.doc/GUID-6975426F-56D0-4FE2-8A58-580B40D2F667.html).
5. Create NKP machine images by downloading the Konvoy Image Builder and extracting it.
6. Download NKP. For more information, see Downloading NKP on page 16.
7. Verify that you have valid cloud provider security credentials to deploy the cluster.
Note: This step regarding the provider security credentials is not required if you install NKP on an on-premises
environment. For information about installing NKP in an on-premises environment, see Pre-provisioned
Infrastructure on page 695.
8. Install the Konvoy component depending on which infrastructure you have. For more information, see Basic
Installations by Infrastructure on page 50. To use customized YAML and other advanced features, see
Custom Installation and Infrastructure Tools on page 644.
9. Configure the Kommander component by initializing the configuration file under the Kommander Installer
Configuration File component of NKP.
10. (Optional) Test operations by deploying a sample application, customizing the cluster configuration, and
checking the status of cluster components.
11. Initialize the configuration file under the Kommander Installer Configuration File component of NKP.
What to do next
Here are some links to the NKP installation-specific information:
Note: For custom installation procedures, see Custom Installation and Additional Infrastructure Tools.
Production cluster configuration allows you to deploy and enable the cluster management applications and your
workload applications that you need for production operations. For more information, see Cluster Operations
Management on page 339.
For virtualized environments, NKP can provision the virtual machines necessary to run Kubernetes clusters. If you
want to allow NKP to manage your infrastructure, select your supported infrastructure provider installation choices
below.
Note: If you want to provision your nodes in a bare metal environment or manually, see Pre-provisioned
Infrastructure on page 695.
Section Contents
Scenario-based installation options:
Section Contents
NKP Prerequisites
Before using NKP to create a Nutanix cluster, verify that you have the following:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• A registry is required in your environment:
• Install kubectl 1.28.x to interact with the running cluster on the host where the NKP Konvoy CLI runs. For
more information, see https://kubernetes.io/docs/tasks/tools/#kubectl.
• A valid Nutanix account with credentials configured.
Note:
• NKP uses the Nutanix CSI Volume Driver (CSI) 3.0 as the default storage provider. For more
information on the default storage providers, see Default Storage Providers on page 33.
• For compatible storage suitable for production, choose from any of the storage options available for
Kubernetes. For more information, see https://kubernetes.io/docs/concepts/storage/volumes/
#volume-types.
• To turn off the default StorageClass that Konvoy deploys:
1. Set the default StorageClass as non-default.
2. Set your newly created StorageClass as default.
For information on changing the default storage class, see https://kubernetes.io/docs/tasks/
administer-cluster/change-default-storage-class/.
Nutanix Prerequisites
Before installing, verify that your environment meets the following basic requirements:
• Pre-designated subnets.
• A subnet with unused IP addresses. The number of IP addresses required is computed as follows:
• One IP address for each node in the Kubernetes cluster. The default cluster size has three control plane
nodes and four worker nodes. So, this requires seven IP addresses.
• One IP address in the same Classless Inter-Domain Routing (CIDR) as the subnet but not part of the
address pool for the Kubernetes API server (kubevip).
• One IP address in the same CIDR as the subnet but not part of an address pool for the Loadbalancer service
used by Traefik (metallb).
• Additional IP addresses may be assigned to accommodate other services such as NDK, that also need the
Loadbalancer service used by Metallb. For more information, see the Prerequisites and Limitations section
in the Nutanix Data Services for Kubernetes guide at https://portal.nutanix.com/page/documents/
details?targetId=Nutanix-Data-Services-for-Kubernetes-v1_1:top-prerequisites-k8s-c.html.
• For air-gapped environments, a bastion VM host template with access to a configured local registry. The recommended template naming pattern is ../folder-name/NKP-e2e-bastion-template or similar. Each infrastructure provider has its own bastion host instructions (see Creating a Bastion Host on page 652).
• Access to a bastion VM or other network-connected host running NKP Image Builder.
Note: Nutanix provides a full image built on Nutanix with base images if you do not want to create your own
from a BaseOS image.
• You must be able to reach the Nutanix endpoint where the Konvoy CLI runs.
• Note: For air-gapped, ensure you download the bundle nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz and extract the tar file to a local directory. For more
information, see Downloading NKP on page 16.
tar -xzvf nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz
Injected Credentials
By default, credentials are injected into the CAPX manager deployment when CAPX is initialized. For information
about getting started with CAPX, see Getting Started in https://opendocs.nutanix.com/capx/v1.5.x/
getting_started/.
Upon initialization, a nutanix-creds secret is automatically created in the capx-system namespace. This secret
contains the values specified in the NUTANIX_USER and NUTANIX_PASSWORD parameters.
The nutanix-creds secret is used for workload cluster deployments if no other credentials are supplied.
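For example (a minimal sketch; the values are placeholders), set the parameters before initializing CAPX and then confirm that the injected secret exists:
export NUTANIX_USER=<prism-central-username>
export NUTANIX_PASSWORD=<prism-central-password>
# After CAPX is initialized, verify the injected secret:
kubectl -n capx-system get secret nutanix-creds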
Users can override the credentials injected in CAPX manager deployment by supplying a credential specific to a
workload cluster. See Credentials injected into the CAPX manager deployment in https://opendocs.nutanix.com/
capx/v1.5.x/credential_management/#credentials-injected-into-the-capx-manager-deployment. The
credentials are provided by creating a secret in the same namespace as the NutanixCluster namespace.
The secret is referenced by adding a credentialRef inside the prismCentral attribute contained in
the NutanixCluster. See Prism Central Admin Center Guide. The secret is also deleted when the
NutanixCluster is deleted.
Note: There is a 1:1 relation between the secret and the NutanixCluster object.
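The following sketch, based on the CAPX documentation linked above, shows how a workload cluster can reference its own credential secret; all names and values are placeholders:
apiVersion: v1
kind: Secret
metadata:
  name: my-workload-cluster-pc-credentials
  namespace: my-workload-namespace
stringData:
  credentials: |
    [
      {
        "type": "basic_auth",
        "data": {
          "prismCentral": {
            "username": "<prism-central-username>",
            "password": "<prism-central-password>"
          }
        }
      }
    ]
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: NutanixCluster
metadata:
  name: my-workload-cluster
  namespace: my-workload-namespace
spec:
  prismCentral:
    address: <prism-central-address>
    port: 9440
    credentialRef:
      kind: Secret
      name: my-workload-cluster-pc-credentials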
This table contains the permissions that are pre-defined for the Kubernetes Infrastructure Provisions role in
Prism Central.
Role: AHV VM
Permissions:
• Create Virtual Machine
• Create Virtual Machine Disk
• Delete Virtual Machine
• Delete Virtual Machine Disk
• Update Virtual Machine
• Update Virtual Machine Project
Procedure
» Full Access: provides all existing users access to all entity types in the associated role.
» Configure Access: provides you with the option to configure the entity types and instances for the added
users in the associated role.
7. Click Next.
» From the dropdown list, select Local User to add a local user or group to the policy. Search a user or group
by typing the first few letters of the name in the text field.
» From the dropdown list, select the available directory to add a directory user or group. Search a user or group
by typing the first few letters of the name in the text field.
9. Click Save.
Note: To display role permissions for any built-in role, see Displaying Role Permissions in the Security Guide.
The authorization policy configurations are saved and the authorization policy is listed in the Authorization
Policies window.
• Configure Prism Central. For more information, see Prism Central Admin Center Guide.
• Upload the image downloaded with the NKP binary to the Prism Central images folder.
• Configure the network by downloading NKP Image Builder (NIB) and installing packages to activate the network.
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
The existing VLAN implementation is basic VLAN. However, advanced VLAN uses OVN as the control plane instead of Acropolis. The subnet creation workflow is from Prism Central (PC) rather than Prism Element (PE). Subnet creation can be done using the API or through the UI.
Procedure
2. Select the option next to Use the VLAN migrate workflow to convert VLAN Basic subnets to Network
Controller managed VLAN Subnets.
4. Under Advanced Configuration, remove the check from the checkbox next to VLAN Basic Networking to
change from Basic to Advanced OVN.
5. Modify the subnet specification in the control plane and worker nodes to use the new subnet: kubectl edit cluster <clustername>.
CAPX rolls out the new control plane and worker nodes in the new subnet and destroys the old ones.
Note: You can choose Basic or Advanced OVN when creating the subnet(s) you used during cluster creation. If
you created the cluster with basic, you can migrate to OVN.
To modify the service subnet, add or edit the configmap. See the topic Managing Subnets and Pods for more
details.
Note: For Virtual Private Cloud (VPC) installation, see the topic Nutanix with VPC Creating a New Cluster.
Note: NKP uses the Nutanix CSI driver as the default storage provider. For more information, see Default Storage Providers on page 33.
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
Procedure
1. Set the environment variable for the cluster name using the following command.
export CLUSTER_NAME=<my-nutanix-cluster>
3. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation. To change the Kubernetes subnets, you must do so at the cluster creation stage. The default subnets used in NKP are below.
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
5. Enter your Nutanix Prism Central details. Required fields are denoted with a red asterisk (*). Other fields are
optional.
a. Enter your Prism Central Endpoint in the following prompt: Prism Central Endpoint: >https://
b. > Prism Central Username: Enter your username. For example, admin.
c. > Prism Central Password: Enter your password.
d. Enter yes or no for the prompt Insecure Mode
e. (Optional) Enter trust information in the prompt for Additional Trust Bundle: A PEM file as
base64 encoded string
6. On the next screen, enter additional information on the Cluster Configuration screen. Required fields are
denoted with a red asterisk (*). Other fields are optional.
» Cluster Name*
» Control Plane Endpoint*
» VM Image*: A generated list appears from PC images where you select the desired image.
» Kubernetes Service Load Balancer IP Range*
» Pod Network
» Service Network
» Reclaim Policy
» File System
» Hypervisor Attached Volumes
» Storage Container*
» HTTP Proxy:
» HTTPS Proxy:
» No Proxy List:
» ( ) Create
» ( ) Dry Run
After the installation, the required components are installed, and the Kommander component deploys the
minimum applications needed by default. For more information, see NKP Concepts and Terms or NKP
Catalog Applications Enablement after Installing NKP .
Caution: You cannot use the NKP CLI to re-install the Kommander component on a cluster created using the
interactive prompt-based CLI. If you need to re-install or reconfigure Kommander to enable more applications,
contact Nutanix Support.
Procedure
Run the following command to check the installation status.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in an output as follows:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
What to do next
If an application fails to deploy, check the status of the HelmRelease using the following command.
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a broken release state, such as exhausted or another rollback/release in progress, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Logging In To the UI
Log in to the UI dashboard.
Procedure
1. By default, you can log in to the UI in Kommander with the credentials given using the following command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from both the bastion
machine and other machines that will be created for the Kubernetes cluster.
Procedure
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
• REGISTRY_URL: the address of an existing registry accessible in the VPC that the new cluster nodes will be configured to use as a mirror registry when pulling images.
• REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. Konvoy will configure the
cluster nodes to trust this CA. This value is only needed if the registry uses a self-signed certificate and the
images are not already configured to trust this CA.
• REGISTRY_USERNAME: optional, set to a user with pull access to this registry.
4. Execute the following command to load the air-gapped image bundle into your private registry using any relevant
flags to apply the above variables.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the network's
performance between the machine you are running the script on and the registry.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: For Virtual Private Cloud (VPC) installation, see the topic Nutanix with VPC Creating a New Air-
gapped Cluster.
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
Procedure
1. Enter a unique name for your cluster suitable for your environment.
2. Set the environment variable for the cluster name using the following command.
export CLUSTER_NAME=<my-nutanix-cluster>
4. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation. To change the Kubernetes subnets, you must do so at the cluster creation stage. The default subnets used in NKP are:
spec:
clusterNetwork:
pods:
cidrBlocks:
- 192.168.0.0/16
services:
cidrBlocks:
- 10.96.0.0/12
For more information, see Managing Subnets and Pods on page 651.
Note: If the cluster creation fails, check for issues with your environment, such as storage resources. If the
cluster becomes self-managed before it stalls, you can investigate what is running and what has failed to try to
resolve those issues independently. See Resource Requirements and Inspect Cluster Issues for more
information.
Procedure
Run the following command to check the status of the installation.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer that the CLI not wait for all applications to become available, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in an output as follows:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Logging In To the UI
Log in to the Dashboard UI.
Procedure
1. By default, you can log in to the UI in Kommander with the credentials given using the following command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use static credentials to access the UI for configuring an external identity provider (see Identity Providers on page 350). Treat them as backup credentials rather than using them for normal access.
• Worker machines:
Note: Swap must be disabled; kubelet does not support swap. Because the commands vary by operating system, see the respective operating system documentation.
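For example, on a Linux host that uses /etc/fstab, you can turn swap off with commands like the following (illustrative; adjust for your operating system):
sudo swapoff -a
# Comment out swap entries so swap stays disabled after a reboot:
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab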
Installation Scenarios
Select your installation scenario:
Procedure
1. Export the following environment variables, ensuring that all the control plane and worker nodes are included.
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
2. Use the following template to define your infrastructure. The environment variables that you set in the previous step automatically replace the variable names when the inventory YAML file is created.
cat <<EOF > preprovisioned_inventory.yaml
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: $CLUSTER_NAME-control-plane
  namespace: default
  labels:
    cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  hosts:
    # Create as many of these as needed to match your infrastructure.
    # Note that the command line parameter --control-plane-replicas determines how
    # many control plane nodes will actually be used.
    #
    - address: $CONTROL_PLANE_1_ADDRESS
    - address: $CONTROL_PLANE_2_ADDRESS
    - address: $CONTROL_PLANE_3_ADDRESS
  sshConfig:
    port: 22
    # This is the username used to connect to your infrastructure. This user must be
    # root or have the ability to use sudo without a password.
    user: $SSH_USER
    privateKeyRef:
      # This is the name of the secret you created in the previous step. It must
      # exist in the same namespace as this inventory object.
      name: $SSH_PRIVATE_KEY_SECRET_NAME
EOF
• External load balancer: Nutanix recommends that an external load balancer be the control plane endpoint. To
distribute request load among the control plane machines, configure the load balancer to send requests to all the
control plane machines. Configure the load balancer to send requests only to control plane machines that are
responding to API requests.
• Built-in virtual IP: If an external load balancer is not available, use the built-in virtual IP. The virtual IP is not
a load balancer; it does not distribute request load among the control plane machines. However, if the machine
receiving requests does not respond to them, the virtual IP automatically moves to another machine.
Note: Modify control plane audit log settings using the information in the Configure the Control Plane page. See
Configuring the Control Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before you create the cluster, ensure the port is available for use on each control plane machine.
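As a quick, illustrative check (assuming the SSH user and control plane address variables exported earlier), verify that nothing is already listening on the port on each control plane host:
ssh ${SSH_USER}@${CONTROL_PLANE_1_ADDRESS} 'sudo ss -tln | grep ":6443 " || echo "port 6443 is available"'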
Note: The cluster name can contain only the following characters: a-z, 0-9, ., and -. The cluster creation will fail if the name has capital letters. For more instructions on naming, see Object Names and IDs at https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Enter a unique name for your cluster that is suitable for your environment.
2. Set the environment variable for the cluster name using the following command.
export CLUSTER_NAME=preprovisioned-example
4.
What to do next
Before you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding nkp create cluster command.
In a pre-provisioned environment, use the Kubernetes CSI and third party drivers for local volumes and other storage
devices in your data center.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. Alternative: virtual IP. If you do not have an external LB and want to use a virtual IP provided by kube-vip, specify the flags shown in the example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
When the command completes, you will have a running Kubernetes cluster! For bootstrap and custom YAML cluster
creation, see the Additional Infrastructure Customization section of the documentation for Pre-provisioned: Pre-
provisioned Infrastructure
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to installing the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation, but before
production. See Calico encapsulation.
Audit Logs
To modify control plane audit log settings, use the information contained in the Configure the Control Plane page.
Further Steps
For more customized cluster creation, see the Pre-Provisioned Additional Configurations section. That section covers Pre-Provisioned Override Files, custom flags, and other options that specify the secret as part of the create cluster command. If these are not specified, the overrides for your nodes will not be applied.
MetalLB Configuration
Create a MetalLB configmap for your pre-provisioned infrastructure.
Nutanix recommends that an external load balancer be the control plane endpoint. To distribute request load among the control plane machines, configure the load balancer to send requests to all the control plane machines. Configure the load balancer to send requests only to control plane machines that are responding to API requests. If you do not have an external load balancer, you can use MetalLB; create a MetalLB configmap for your pre-provisioned infrastructure.
Choose one of the following two protocols you want to use to define service IPs. If your environment is not currently
equipped with a load balancer, use MetalLB. Otherwise, your load balancer will work, and you can continue with the
installation process. To use MetalLB, create a MetalLB configMap for your pre-provisioned infrastructure. MetalLB
uses one of two protocols for exposing Kubernetes services:
Layer 2 Configuration
Layer 2 mode is the simplest to configure. In many cases, you do not require any protocol-specific configuration, only
IP addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly to give the machine’s MAC address to clients.
• MetalLB IP address ranges or Classless Inter-Domain Routing (CIDR) needs to be within the node’s primary
network subnets. For more information, see Managing Subnets and Pods on page 651.
• MetalLB IP address ranges or CIDRs and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250 and
configures Layer 2 mode:
The following values are generic; enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
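After the file is created, apply it to the cluster. This usage example assumes MetalLB is already installed on the cluster:
kubectl apply -f metallb-conf.yaml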
Note:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands using the kubeconfig file.
• Applications take longer to deploy and sometimes time out the installation. Add the --wait-timeout
<time to wait> flag and specify a period (for example, 1 h) to allocate more time to the deployment
of applications.
• Ensure you review all the prerequisites required for the installation.
• Ensure you have a default StorageClass (see Identifying and Modifying Your StorageClass on page 982).
• Note the name of the cluster where you want to install Kommander. If you do not know the cluster name, use
kubectl get clusters -A to search and find it.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP's default configuration ships Ceph with PVC-based storage, which requires your CSI provider to support PVCs with volumeMode: Block. As this is not possible with the default local static provisioner, you can install Ceph in host storage mode and choose whether Ceph's object storage daemon (osd) pods can consume all or just some of the devices on your nodes. Include one of the following overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
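For illustration, an override that restricts the Ceph OSDs to a subset of devices might look like the following sketch; the deviceFilter regular expression is an assumption and must match the device names on your nodes:
rook-ceph-cluster:
  enabled: true
  values: |
    cephClusterSpec:
      storage:
        storageClassDeviceSets: []
        useAllNodes: true
        useAllDevices: false
        deviceFilter: "^sd[b-d]"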
6. To enable NKP catalog applications and install Kommander using the same kommander.yaml, add the following values (if you are enabling NKP catalog applications) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
gitRepositorySpec:
url: https://github.com/mesosphere/nkp-catalog-applications
ref:
tag: v2.12.0
Note: If you want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP catalog applications after installing NKP, see Configuring Applications
After Installing Kommander on page 984.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer that the CLI not wait for all applications to become available, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in an output as follows:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
What to do next
If an application fails to deploy, check the status of the HelmRelease using the following command.
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a broken release state, such as exhausted or another rollback/release in progress, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Logging In To the UI
Log in to the UI Dashboard. After you build the Konvoy cluster and install Kommander, verify your
installation. The cluster waits for all the applications to be ready by default.
Procedure
1. By default, you can log in to the UI in Kommander with the credentials provided in this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use static credentials to access the UI for configuring an external identity provider (see Identity Providers on page 350). Treat them as backup credentials rather than using them for normal access.
Note: When creating managed clusters, do not create and move CAPI objects or install the Kommander component.
Those tasks are only done on Management clusters.
Your new managed cluster must be part of a workspace under a management cluster. To make the new
managed cluster a part of a workspace, set that workspace's environment variable.
Procedure
1. If you have an existing workspace name, run this command to find the name.
kubectl get workspace -A
2. After you find the workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
If you need to create a new workspace, see Creating a Workspace on page 397.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Tip: Before you create a new Nutanix Kubernetes Platform (NKP) cluster below, choose an external load balancer (LB) or virtual IP and use the corresponding nkp create cluster command.
In a Pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage
devices in your datacenter.
Caution: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment.
However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible
storage that is suitable for production.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Change the default StorageClass.
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process, Konvoy Image Builder (KIB), which is built into NKP, automatically runs the machine configuration process (which KIB uses to build images for other providers) against the set of nodes that you defined. This results in your pre-existing or pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster Application Programming Interface (API) infrastructure
provider to initialize the Kubernetes control plane and worker nodes on the hosts defined in the inventory YAML
previously created.
Procedure
1. This command uses the default external load balancer (LB) option.
nkp create cluster pre-provisioned \
  --cluster-name ${CLUSTER_NAME} \
  --control-plane-endpoint-host <control plane endpoint host> \
  --control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
  --pre-provisioned-inventory-file preprovisioned_inventory.yaml \
  --ssh-private-key-file <path-to-ssh-private-key>
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI, link to attaching it to the workspace through the UI that was earlier, or
attach your cluster to the workspace you want in the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string into the data.value field of a secret, using the template below as a reference, and create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-copied-from-the-previous-step>
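You can then create the secret in the desired workspace namespace; for example (illustrative, using the workspace namespace variable set earlier):
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}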
10. You can now view this cluster in your Workspace in the UI, and you can confirm its status by running the
command below. It might take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally
administrated by a Management Cluster, refer to Platform Expansion:
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command:
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
NOTE: For FIPS, pass the flag: --fips
NOTE: For RHEL OS, pass your Red Hat subscription manager credentials: export RHSM_ACTIVATION_KEY
Example command:
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
Setup Process
1. The bootstrap image must be extracted and loaded onto the bastion host.
2. Artifacts must be copied onto cluster hosts for nodes to access.
3. If using a graphics processing unit (GPU), those artifacts must be positioned locally.
4. The registry must be seeded with images locally.
2. Verify the artifacts for your OS exist in the artifacts/ directory and export the appropriate variables:
$ ls artifacts/
1.29.6_centos_7_x86_64.tar.gz 1.29.6_redhat_8_x86_64_fips.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64.tar.gz containerd-1.6.28-d2iq.1-
rhel-8.6-x86_64_fips.tar.gz pip-packages.tar.gz
1.29.6_centos_7_x86_64_fips.tar.gz 1.29.6_rocky_9_x86_64.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-
rocky-9.0-x86_64.tar.gz
3. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_FILE="<private key file>"
SSH_PRIVATE_KEY_FILE must be either the name of the SSH private key file in your working directory or an
absolute path to the file in your user’s home directory.
4. Generate an inventory.yaml, which is automatically picked up by the konvoy-image upload command in the next step.
cat <<EOF > inventory.yaml
all:
vars:
ansible_user: $SSH_USER
ansible_port: 22
ansible_ssh_private_key_file: $SSH_PRIVATE_KEY_FILE
hosts:
$CONTROL_PLANE_1_ADDRESS:
ansible_host: $CONTROL_PLANE_1_ADDRESS
$CONTROL_PLANE_2_ADDRESS:
ansible_host: $CONTROL_PLANE_2_ADDRESS
$CONTROL_PLANE_3_ADDRESS:
ansible_host: $CONTROL_PLANE_3_ADDRESS
$WORKER_1_ADDRESS:
ansible_host: $WORKER_1_ADDRESS
$WORKER_2_ADDRESS:
ansible_host: $WORKER_2_ADDRESS
$WORKER_3_ADDRESS:
ansible_host: $WORKER_3_ADDRESS
$WORKER_4_ADDRESS:
ansible_host: $WORKER_4_ADDRESS
EOF
5. Upload the artifacts onto cluster hosts with the following command:
konvoy-image upload artifacts \
  --container-images-dir=./artifacts/images/ \
  --os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
  --containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
  --pip-packages-bundle=./artifacts/pip-packages.tar.gz
KIB uses variable overrides to specify the base image and container images to use in your new machine image.
The variable overrides files for NVIDIA and Federal Information Processing Standards (FIPS) can be ignored
unless an overlay feature is added.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from both the bastion
machine and either the Amazon Web Services (AWS) EC2 instances (if deploying to AWS) or other machines that
will be created for the Kubernetes cluster.
Procedure
2. The directory structure after extraction can be accessed in subsequent steps using commands that access files from the different directories. For example, for the bootstrap cluster, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location:
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
4. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply the variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username= --registry-mirror-password=.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Procedure
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
2. Use the following template to help you define your infrastructure. The environment variables that you set in the previous step automatically replace the variable names when the inventory YAML (YAML Ain't Markup Language) file is created.
cat <<EOF > preprovisioned_inventory.yaml
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer or a built-in virtual IP. At least one control plane node must always be running. Therefore, to upgrade a
cluster with one control plane node, a spare machine must be available in the control plane inventory. This machine
is used to provision the new node before the old node is deleted. When the API server endpoints are defined, you can
create the cluster using the link in the Next Step below.
Note: Modify Control Plane Audit logs settings using the information contained in the page Configure the Control
Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before you create the cluster, ensure the port is available for use on each control plane machine.
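As a quick sanity check, you can verify on each control plane machine that nothing is already listening on the port. This sketch assumes the ss utility (iproute2) is installed on the host; adjust the port if you use a non-default value.
ss -tln | grep ':6443' && echo "port 6443 is already in use" || echo "port 6443 is available"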
Procedure
What to do next
Create a Kubernetes Cluster
If your cluster is air-gapped or you have a local registry, you must provide additional arguments when creating the
cluster. These tell the cluster where to locate the local registry to use by defining the URL.
export REGISTRY_URL=<https/http>://<registry-address>:<registry-port>
export REGISTRY_CA=<path to the CA on the bastion>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
• REGISTRY_URL: the address of an existing registry accessible in the VPC that the new cluster nodes will be configured to use as a mirror registry when pulling images.
• REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. Konvoy will configure the cluster
nodes to trust this CA. This value is only needed if the registry is using a self-signed certificate and the AMIs are
not already configured to trust this CA.
• REGISTRY_USERNAME: optional, set to a user that has pull access to this registry.
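For illustration only, a filled-in set of exports might look like the following; the registry address, CA path, and credentials shown here are placeholders and must be replaced with values from your environment.
export REGISTRY_URL="https://registry.example.internal:5000"   # placeholder address
export REGISTRY_CA="/home/nkp-user/registry-ca.crt"            # placeholder CA path
export REGISTRY_USERNAME="registry-user"                       # placeholder
export REGISTRY_PASSWORD="registry-password"                   # placeholder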
Before you create a new NKP cluster below, you can choose an external load balancer or virtual IP and use the corresponding nkp create cluster command example from the linked pages below. Other customizations are available but require different flags on the nkp create cluster command. Refer to Pre-provisioned Cluster Creation Customization Choices for more cluster customizations.
When you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding nkp create cluster command.
In a pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage
devices in your datacenter.
Note: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment.
However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible
storage that is suitable for production.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Changing the Default Storage Class.
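As a minimal sketch of those commands, the following marks one StorageClass as non-default and another as default. The StorageClass names are assumptions; use the names reported by kubectl get storageclass in your cluster.
kubectl get storageclass
kubectl patch storageclass localvolumeprovisioner -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
kubectl patch storageclass <your-production-storageclass> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'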
Note: (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when pulling images that you previously pushed to that registry while defining your infrastructure. Instructions are in the expandable Custom Installation section. For registry mirror information, see the topics Using a Registry Mirror and Registry Mirror Tools.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. ALTERNATIVE Virtual IP - if you don’t have an external LB and want to use a VIRTUAL IP provided by kube-vip, specify these flags as in the example below:
nkp create cluster pre-provisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
When the command is complete, you will have a running Kubernetes cluster! For bootstrap and custom YAML
cluster creation, refer to the Additional Infrastructure Customization section of the documentation for Pre-
provisioned: Pre-provisioned Infrastructure.
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to install the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
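As a quick check that the kubeconfig works, you can list the nodes of the new cluster before continuing.
kubectl --kubeconfig=${CLUSTER_NAME}.conf get nodes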
Note: If changing the Calico encapsulation, Nutanix recommends doing so after cluster creation but before
production.
Audit Logs
To modify Control Plane Audit log settings, use the information contained on the page Configure the Control Plane.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly, to give the machine’s MAC address to clients.
• MetalLB IP address ranges or Classless Inter-Domain Routing (CIDR) need to be within the node’s primary
network subnet.
• MetalLB IP address ranges or CIDRs and node subnet must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250, and
configures Layer 2 mode:
The following values are generic, enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
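To verify the configuration was applied, you can inspect the ConfigMap and confirm the MetalLB pods are running; once a Service of type LoadBalancer exists (for example, kommander-traefik after Kommander is installed), its EXTERNAL-IP should come from the configured range.
kubectl -n metallb-system get configmap config -o yaml
kubectl -n metallb-system get pods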
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need 4 pieces of information:
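Those four pieces are typically the router's IP address, the router's AS number, the AS number MetalLB should use, and an IP address range expressed as a CIDR. A minimal sketch of such a ConfigMap, in the same style as the Layer 2 example above, is shown below; the peer address, AS numbers, and address range are placeholders you must replace with values from your environment.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1
      peer-asn: 64501
      my-asn: 64500
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.168.10.0/24
EOF
kubectl apply -f metallb-conf.yaml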
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy and can cause the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install
command to retry.
Prerequisites:
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PersistentVolumeClaim (PVC) based storage, which requires your CSI provider
to support PVC with type volumeMode: Block. As this is not possible with the default local static provisioner,
you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can
consume all or just some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and Certificates, HTTP proxy, and External Load Balancer.
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the command-line interface (CLI) to not wait for all applications to become ready, you can set the
--wait=false flag.
The command first waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials retrieved using this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Procedure
1. If you have an existing Workspace name, find the name using the command kubectl get workspace -A.
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable using the command
export WORKSPACE_NAMESPACE=<workspace_namespace>.
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Tip: Before you create a new NKP cluster below, choose an external load balancer (LB) or Pre-provisioned Built-in Virtual IP on page 706 and use the corresponding nkp create cluster command.
In a Pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage
devices in your data center.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in the Change the default
StorageClass section of the Kubernetes documentation. For more information, see https://kubernetes.io/docs/tasks/
administer-cluster/change-default-storage-class/
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process,
Konvoy Image Builder (KIB) is built into NKP and automatically runs the machine configuration process (which KIB
uses to build images for other providers) against the set of nodes that you defined. This results in your pre-existing or
pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes
control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Procedure
1. This command uses the default external load balancer (LB) option.
nkp create cluster preprovisioned --cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information,
see Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set using the command echo
${MANAGED_CLUSTER_NAME} .
Note: This is only necessary if you never set the workspace of your cluster upon creation.
3. Retrieve the workspace where you want to attach the cluster using the command kubectl get workspaces -
A.
5. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve
the kubeconfig secret value of your cluster using the command kubectl -n default get secret
${MANAGED_CLUSTER_NAME}-kubeconfig -o go-template='{{.data.value}}{{ "\n"}}'.
6. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference. Create
a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
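As an alternative to copying the value by hand, a small shell sketch (assuming a POSIX shell and that MANAGED_CLUSTER_NAME is already exported; KUBECONFIG_VALUE is just a scratch variable name for this sketch) can generate the file with the secret value filled in before you apply it in the next step.
# Scratch variable holding the base64-encoded kubeconfig value from the secret
export KUBECONFIG_VALUE=$(kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-template='{{.data.value}}')
# Render the Secret manifest with the value filled in
cat <<EOF > attached-cluster-kubeconfig.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ${MANAGED_CLUSTER_NAME}-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: ${MANAGED_CLUSTER_NAME}
type: cluster.x-k8s.io/secret
data:
  value: ${KUBECONFIG_VALUE}
EOF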
7. Create this secret in the desired workspace using the command kubectl apply -f attached-cluster-
kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}
9. You can now view this cluster in your Workspace in the UI, and you can confirm its status by using the command
kubectl get kommanderclusters -A.
It might take a few minutes to reach "Joined" status.
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered by a Management Cluster, refer to Platform Expansion.
Ensure Configuration
If not already done, see the documentation for:
Section Contents
Procedure
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
2. Use the following template to help you define your infrastructure. The environment variables that you set in the
previous step automatically replace the variable names when the inventory YAML file is created.
cat <<EOF > preprovisioned_inventory.yaml
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
name: $CLUSTER_NAME-control-plane
namespace: default
labels:
cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
clusterctl.cluster.x-k8s.io/move: ""
spec:
hosts:
# Create as many of these as needed to match your infrastructure
# Note that the command-line parameter --control-plane-replicas determines how many control plane nodes will actually be used.
#
- address: $CONTROL_PLANE_1_ADDRESS
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer, or a built-in virtual IP. At least one control plane node must always be running. Therefore, to upgrade a
cluster with one control plane node, a spare machine must be available in the control plane inventory. This machine
is used to provision the new node before the old node is deleted. When the API server endpoints are defined, you can
create the cluster using the link in the Next Step below.
Note: Modify Control Plane Audit logs settings using the information contained in the page Configure the Control
Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before you create the cluster, ensure the port is available for use on each control plane machine.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. ALTERNATIVE Virtual IP - if you don’t have an external LB and want to use a VIRTUAL IP provided by kube-vip, specify these flags as in the example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command. See Docker Hub's rate limit.
Note: If changing the Calico encapsulation, Nutanix recommends doing so after cluster creation, but before
production. See Calico encapsulation.
Configure MetalLB
Create a MetalLB configmap for your Pre-provisioned Infrastructure.
Choose one of the following two protocols you want to use to announce service IPs. If your environment is not
currently equipped with a load balancer, you can use MetalLB. Otherwise, your own load balancer will work, and you
can continue the installation process with Pre-provisioned: Install Kommander. To use MetalLB, create a MetalLB
configMap for your Pre-provisioned infrastructure. MetalLB uses one of two protocols for exposing Kubernetes
services:
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly and giving the machine’s MAC address to clients.
• MetalLB IP address ranges or CIDRs need to be within the node’s primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250, and
configures Layer 2 mode:
The following values are generic, enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need 4 pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy and can cause the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PersistentVolumeClaim (PVC) based storage which requires your CSI provider
to support PVC with type volumeMode: Block. As this is not possible with the default local static provisioner,
you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can
consume all or just some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and Certificates, HTTP proxy, and External Load Balancer.
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command first waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials retrieved using this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use the static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
To make the new managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Tip: Before you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the corresponding nkp create cluster command.
In a Pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage devices in your data center.
Caution: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment. However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible storage that is suitable for production.
After disabling localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Change the default StorageClass
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process,
Konvoy Image Builder (KIB) is built into NKP and automatically runs the machine configuration process (which
KIB uses to build images for other providers) against the set of nodes that you defined. This results in your pre-
existing or pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes
control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Procedure
1. This command uses the default external load balancer (LB) option.
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--kubernetes-version=v1.29.6+fips.0 \
--etcd-version=3.5.10+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI (see the earlier link for attaching a cluster to a workspace through the UI) or attach your cluster to the workspace you want using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered by a Management Cluster, refer to Platform Expansion.
Ensure Configuration
If not already done, see the documentation for:
Setup Process
1. The bootstrap image must be extracted and loaded onto the bastion host.
2. Artifacts must be copied onto cluster hosts for nodes to access.
3. If using GPU, those artifacts must be positioned locally.
4. The registry must be seeded with images locally.
2. Verify the artifacts for your OS exist in the artifacts/ directory and export the appropriate variables:
$ ls artifacts/
1.29.6_centos_7_x86_64.tar.gz 1.29.6_redhat_8_x86_64_fips.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64.tar.gz containerd-1.6.28-d2iq.1-
rhel-8.6-x86_64_fips.tar.gz pip-packages.tar.gz
1.29.6_centos_7_x86_64_fips.tar.gz 1.29.6_rocky_9_x86_64.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-
rocky-9.0-x86_64.tar.gz
1.29.6_redhat_7_x86_64.tar.gz 1.29.6_ubuntu_20_x86_64.tar.gz
containerd-1.6.28-d2iq.1-rhel-8.4-x86_64.tar.gz containerd-1.6.28-d2iq.1-
rocky-9.1-x86_64.tar.gz
1.29.6_redhat_7_x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-centos-7.9-
x86_64.tar.gz containerd-1.6.28-d2iq.1-rhel-8.4-x86_64_fips.tar.gz
containerd-1.6.28-d2iq.1-ubuntu-20.04-x86_64.tar.gz
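For example, if your cluster hosts run RHEL 8.6 with FIPS enabled, the exports used in the upload step might look like the following. The file names are taken from the listing above; match them to your own OS and FIPS selection.
export OS_PACKAGES_BUNDLE=1.29.6_redhat_8_x86_64_fips.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.28-d2iq.1-rhel-8.6-x86_64_fips.tar.gz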
3. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_FILE="<private key file>"
SSH_PRIVATE_KEY_FILE must be either the name of the SSH private key file in your working directory or an
absolute path to the file in your user’s home directory.
4. Generate an inventory.yaml file, which is automatically picked up by the konvoy-image upload command in the next step.
cat <<EOF > inventory.yaml
all:
vars:
ansible_user: $SSH_USER
ansible_port: 22
ansible_ssh_private_key_file: $SSH_PRIVATE_KEY_FILE
hosts:
$CONTROL_PLANE_1_ADDRESS:
ansible_host: $CONTROL_PLANE_1_ADDRESS
$CONTROL_PLANE_2_ADDRESS:
ansible_host: $CONTROL_PLANE_2_ADDRESS
$CONTROL_PLANE_3_ADDRESS:
ansible_host: $CONTROL_PLANE_3_ADDRESS
$WORKER_1_ADDRESS:
ansible_host: $WORKER_1_ADDRESS
$WORKER_2_ADDRESS:
ansible_host: $WORKER_2_ADDRESS
$WORKER_3_ADDRESS:
ansible_host: $WORKER_3_ADDRESS
$WORKER_4_ADDRESS:
ansible_host: $WORKER_4_ADDRESS
EOF
5. Upload the artifacts onto cluster hosts with the following command:
konvoy-image upload artifacts \
--container-images-dir=./artifacts/images/ \
--os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
--containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
--pip-packages-bundle=./artifacts/pip-packages.tar.gz
KIB uses variable overrides to specify base image and container images to use in your new machine image. The
variable overrides files for NVIDIA and FIPS can be ignored unless adding an overlay feature.
• Use the --overrides flag and reference either fips.yaml or offline-fips.yaml manifests located in the
overrides directory or see these pages in the documentation:
• FIPS Overrides
• Create FIPS 140 Images
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images is required. This registry must be accessible from both the bastion machine
and either the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes
cluster.
Procedure
2. After extraction, the directory structure can be accessed in subsequent steps by changing into the relevant directory. For example, for the bootstrap cluster, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location:
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
4. Execute the following command to load the air-gapped image bundle into your private registry, using the relevant flags to apply the variables you set above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase your Docker Hub rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username= --registry-mirror-password=
Procedure
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
2. Use the following template to help you define your infrastructure. The environment variables that you set in the
previous step automatically replace the variable names when the inventory YAML file is created.
cat <<EOF > preprovisioned_inventory.yaml
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
name: $CLUSTER_NAME-control-plane
namespace: default
labels:
cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
clusterctl.cluster.x-k8s.io/move: ""
spec:
hosts:
# Create as many of these as needed to match your infrastructure
# Note that the command line parameter --control-plane-replicas determines how many control plane nodes will actually be used.
#
- address: $CONTROL_PLANE_1_ADDRESS
- address: $CONTROL_PLANE_2_ADDRESS
- address: $CONTROL_PLANE_3_ADDRESS
sshConfig:
port: 22
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer, or a built-in virtual IP. At least one control plane node must always be running. Therefore, to upgrade a
cluster with one control plane node, a spare machine must be available in the control plane inventory. This machine
is used to provision the new node before the old node is deleted. When the API server endpoints are defined, you can
create the cluster using the link in Next Step below.
Note: Modify Control Plane Audit logs settings using the information contained in the page Configure the Control
Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before you create the cluster, ensure the port is available for use on each control plane machine.
Note: When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
What to do next
Create a Kubernetes Cluster
If your cluster is air-gapped or you have a local registry, you must provide additional arguments when creating the
cluster. These tell the cluster where to locate the local registry to use by defining the URL.
export REGISTRY_URL=<https/http>://<registry-address>:<registry-port>
export REGISTRY_CA=<path to the CA on the bastion>
export REGISTRY_USERNAME=<username>
• REGISTRY_URL: the address of an existing registry accessible in the VPC that the new cluster nodes will be configured to use as a mirror registry when pulling images.
• REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. Konvoy will configure the cluster
nodes to trust this CA. This value is only needed if the registry is using a self-signed certificate and the AMIs are
not already configured to trust this CA.
• REGISTRY_USERNAME: optional, set to a user that has pull access to this registry.
Before you create a new NKP cluster below, you can choose an external load balancer or virtual IP and use the corresponding nkp create cluster command example from the linked pages below. Other customizations are available but require different flags on the nkp create cluster command. Refer to Pre-provisioned Cluster Creation Customization Choices for more cluster customizations.
When you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding nkp create cluster command.
In a pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage devices in your data center.
Note: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment. However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible storage that is suitable for production.
After disabling localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Changing the Default Storage Class
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster, by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command. See Docker Hub's rate limit.
Note: (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when pulling images that you previously pushed to that registry while defining your infrastructure. Instructions are in the expandable Custom Installation section. For registry mirror information, see the topics Using a Registry Mirror and Registry Mirror Tools.
The create cluster command below includes the --self-managed flag. A self-managed cluster refers to one in
which the CAPI resources and controllers that describe and manage it are running on the same cluster they are
managing.
This command uses the default external load balancer (LB) option (see alternative Step 1 for virtual IP):
nkp create cluster preprovisioned --cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--self-managed
1. ALTERNATIVE Virtual IP - if you don’t have an external LB, and want to use a VIRTUAL IP provided by kube-vip, specify these flags as in the example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
When the command completes, you will have a running Kubernetes cluster! For bootstrap and custom YAML cluster
creation, refer to the Additional Infrastructure Customization section of the documentation for Pre-provisioned: Pre-
provisioned Infrastructure
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to installing the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation, but before
production.
Audit Logs
To modify Control Plane Audit log settings, use the information contained in the page Configure the Control Plane.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly, to give the machine’s MAC address to clients.
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need 4 pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy and can cause the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PVC based storage which requires your CSI provider to support PVC with type
volumeMode: Block. As this is not possible with the default local static provisioner, you can install Ceph in
host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can consume all or just
some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and Certificates, HTTP proxy, and External Load Balancer.
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
gitRepositorySpec:
url: https://github.com/mesosphere/nkp-catalog-applications
ref:
tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command first waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Procedure
1. By default, you can log in to the Kommander UI with the credentials retrieved using this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use the static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
To make the new managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Tip: Before you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the corresponding nkp create cluster command.
In a Pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage devices in your data center.
Caution: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment. However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible storage that is suitable for production.
After disabling localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Change the default StorageClass
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process,
Konvoy Image Builder (KIB) is built into NKP and automatically runs the machine configuration process (which
KIB uses to build images for other providers) against the set of nodes that you defined. This results in your pre-
existing or pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes
control plane and worker nodes on the hosts defined in the inventory YAML previously created.
1. This command uses the default external load balancer (LB) option.
nkp create cluster preprovisioned --cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI (see the earlier link for attaching a cluster to a workspace through the UI) or attach your cluster to the workspace you want using the CLI.
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered by a Management Cluster, refer to Platform Expansion.
Ensure Configuration
If not already done, see the documentation for:
Note: The NVIDIA driver version supported by NKP is 470.x. For more information, see NVIDIA driver.
Procedure
1. Create the secret that the GPU node pool uses. This secret is populated from the KIB overrides. The following is example content of a file named overrides/nvidia.yaml.
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
2. Create a secret on the bootstrap cluster that is populated from the above file. We will name it
${CLUSTER_NAME}-user-overrides
kubectl create secret generic ${CLUSTER_NAME}-user-overrides --from-
file=overrides.yaml=overrides/nvidia.yaml
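To confirm the secret contains the expected overrides, you can decode it; this sketch assumes a Linux bastion with base64 available.
kubectl get secret ${CLUSTER_NAME}-user-overrides -o jsonpath='{.data.overrides\.yaml}' | base64 -d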
3. Create an inventory and nodepool with the instructions below and use the ${CLUSTER_NAME}-user-overrides
secret.
a. Create an inventory object that has the same name as the node pool you’re creating, and the details of the pre-provisioned machines that you want to add to it. For example, to create a node pool named gpu-nodepool, an inventory object named gpu-nodepool must be present in the same namespace.
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
name: ${MY_NODEPOOL_NAME}
spec:
hosts:
- address: ${IP_OF_NODE}
sshConfig:
port: 22
user: ${SSH_USERNAME}
privateKeyRef:
name: ${NAME_OF_SSH_SECRET}
namespace: ${NAMESPACE_OF_SSH_SECRET}
b. (Optional) If your pre-provisioned machines have overrides, you must create a secret that includes all of the
overrides you want to provide in one file. Create an override secret using the instructions detailed on this page.
Note: Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally
or store in version control.
Note: For more information regarding this flag or others, see the nkp create nodepool section of the
documentation for either cluster or nodepool and select your provider.
Procedure
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
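If the SSH key secret referenced by SSH_PRIVATE_KEY_SECRET_NAME does not exist yet, it can be created from your
private key file. The following is a hedged sketch only; the data key name (ssh-privatekey) and the clusterctl move
label are assumptions based on common Cluster API conventions, so verify them against the pre-provisioned
prerequisites documentation for your release.
# Create the SSH key secret that the inventory objects reference (data key name is an assumption)
kubectl create secret generic ${CLUSTER_NAME}-ssh-key --from-file=ssh-privatekey=<path-to-ssh-private-key>
# Label it so clusterctl moves it together with the cluster objects (label shown is an assumption)
kubectl label secret ${CLUSTER_NAME}-ssh-key clusterctl.cluster.x-k8s.io/move=""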
2. Use the following template to help you define your infrastructure. The environment variables that you set in the
previous step automatically replace the variable names when the inventory YAML file is created.
cat <<EOF > preprovisioned_inventory.yaml
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: $CLUSTER_NAME-control-plane
  namespace: default
  labels:
    cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  hosts:
    # Create as many of these as needed to match your infrastructure
    # Note that the command line parameter --control-plane-replicas determines how many
    # control plane nodes will actually be used.
    #
    - address: $CONTROL_PLANE_1_ADDRESS
    - address: $CONTROL_PLANE_2_ADDRESS
    - address: $CONTROL_PLANE_3_ADDRESS
  sshConfig:
    port: 22
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer, or a built-in virtual IP. At least one control plane node must always be running. Therefore, to upgrade a
cluster with one control plane node, a spare machine must be available in the control plane inventory. This machine
is used to provision the new node before the old node is deleted. When the API server endpoints are defined, you can
create the cluster using the link in Next Step below.
Note: Modify Control Plane Audit logs settings using the information contained in the page Configure the Control
Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before you create the cluster, ensure the port is available for use on each control plane machine.
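For example, one way to confirm that port 6443 (or your chosen endpoint port) is not already in use on a control
plane machine, assuming the ss utility from iproute2 is installed:
# Lists any listener on 6443; if nothing is printed before the message, the port is free
ss -ltn | grep -w 6443 || echo "port 6443 is available"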
To use the built ami with Konvoy, specify it with the --ami flag when calling cluster create.
For GPU steps, see the Pre-provisioned section of the documentation to use the overrides/nvidia.yaml file.
Additional helpful information can be found in the NVIDIA Device Plug-in for Kubernetes instructions and the
Installation Guide of Supported Platforms.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. Alternative: Virtual IP. If you do not have an external load balancer (LB) and want to use a virtual IP provided by
kube-vip, specify these flags as shown in the example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
When the command completes, you will have a running Kubernetes cluster! For bootstrap and custom YAML cluster
creation, refer to the Additional Infrastructure Customization section of the documentation for Pre-provisioned: Pre-
Provisioned Infrastructure
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to installing the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
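As a quick sanity check before installing Kommander, you can, for example, list the nodes of the new cluster with
the retrieved kubeconfig:
# All control plane and worker nodes should eventually report a Ready status
kubectl --kubeconfig=${CLUSTER_NAME}.conf get nodes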
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation, but before
production.
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster, by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
Audit Logs
To modify Control Plane Audit log settings, use the information contained in the page Configure the Control
Plane.
Further Steps
For more customized cluster creation, access the Pre-Provisioned Infrastructure section. That section is for Pre-
Provisioned Override Files, custom flags, and more that specify the secret as part of the create cluster command. If
these are not specified, the overrides for your nodes will not be applied.
Cluster Verification: If you want to monitor or verify the installation of your clusters, refer to: Verify your Cluster
and NKP Installation.
• MetalLB IP address ranges or CIDRs need to be within the node's primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnet must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250, and
configures Layer 2 mode:
The following values are generic, enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
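Typically these are the peer router's address, the peer's AS number, the AS number MetalLB should use, and the IP
address range to advertise. The following is a sketch only, using the same legacy ConfigMap format as the Layer 2
example above with placeholder addresses and AS numbers; substitute your own values:
cat << EOF > metallb-bgp-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1
      peer-asn: 64501
      my-asn: 64500
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.168.10.0/24
EOF
kubectl apply -f metallb-bgp-conf.yaml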
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PersistentVolumeClaim (PVC) based storage which requires your CSI provider
to support PVC with type volumeMode: Block. As this is not possible with the default local static provisioner,
you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can
consume all or just some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy and External Load Balancer.
6. Enable NVIDIA platform services for GPU resources in the same kommander.yaml file.
apps:
  nvidia-gpu-operator:
    enabled: true
a. RHEL 8.4/8.6
If you’re using RHEL 8.4/8.6 as the base operating system for your GPU-enabled nodes, set the
toolkit.version parameter in your Kommander Installer Configuration file (kommander.yaml) to
the following:
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-ubi8
8. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
To make the new managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Procedure
• If a custom AMI was created using Konvoy Image Builder, use the --ami flag. The custom ami id is printed and
written to ./manifest.json. To use the built ami with Konvoy, specify it with the --ami flag when calling
cluster create in Step 1 in the next section where you create your Kubernetes cluster.
Tip: Before you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding NKP create cluster command.
In a Pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and
other storage devices in your data center.
Caution: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment.
However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible
storage that is suitable for production.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Change the default StorageClass.
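For example, marking a different StorageClass as the default uses the standard Kubernetes annotation. The
StorageClass name localvolumeprovisioner below is the name the local static provisioner typically registers; confirm
it first with kubectl get storageclass:
# Remove the default annotation from the local static provisioner's StorageClass
kubectl patch storageclass localvolumeprovisioner -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
# Mark your production-grade StorageClass as the default
kubectl patch storageclass <your-storage-class> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'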
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process,
Konvoy Image Builder(KIB) is built into NKP and automatically runs the machine configuration process (which
KIB uses to build images for other providers) against the set of nodes that you defined. This results in your pre-
existing or pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes
control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Procedure
1. This command uses the default external load balancer (LB) option.
nkp create cluster pre-provisioned --cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This step is only necessary if you did not set the workspace of your cluster upon creation.
You can now either attach the cluster to a workspace through the UI (see the earlier section on attaching a cluster
through the UI) or attach it to the workspace you want through the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster that is centrally administered
by a Management Cluster, refer to Platform Expansion.
Ensure Configuration
If not already done, see the documentation for:
Note: If the NVIDIA runfile installer has not been downloaded, retrieve it first by running the following commands.
The first command downloads the runfile, and the second places it in the artifacts directory.
curl -O https://download.nvidia.com/XFree86/Linux-x86_64/470.82.01/NVIDIA-Linux-
x86_64-470.82.01.run
mv NVIDIA-Linux-x86_64-470.82.01.run artifacts
The instructions below outline how to fulfill the requirements for using pre-provisioned infrastructure in an air-
gapped environment. In order to create a cluster, you must first set up the environment with the necessary artifacts.
All artifacts for Pre-provisioned Air-gapped need to get onto the bastion host. Artifacts needed by nodes must be
unpacked and distributed on the bastion before other provisioning will work in the absence of an internet connection.
There is an air-gapped bundle available to download. In previous NKP releases, the distro package bundles were
included in the downloaded air-gapped bundle. Currently, that air-gapped bundle contains the following artifacts, with
the exception of the distro packages:
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command:
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note:
Setup Process
1. The bootstrap image must be extracted and loaded onto the bastion host.
2. Artifacts must be copied onto cluster hosts for nodes to access.
3. If using GPU, those artifacts must be positioned locally.
4. The registry must be seeded with images locally.
2. Verify the artifacts for your OS exist in the artifacts/ directory and export the appropriate variables:
$ ls artifacts/
1.29.6_centos_7_x86_64.tar.gz 1.29.6_redhat_8_x86_64_fips.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64.tar.gz containerd-1.6.28-d2iq.1-
rhel-8.6-x86_64_fips.tar.gz pip-packages.tar.gz
1.29.6_centos_7_x86_64_fips.tar.gz 1.29.6_rocky_9_x86_64.tar.gz
containerd-1.6.28-d2iq.1-rhel-7.9-x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-
rocky-9.0-x86_64.tar.gz
1.29.6_redhat_7_x86_64.tar.gz 1.29.6_ubuntu_20_x86_64.tar.gz
containerd-1.6.28-d2iq.1-rhel-8.4-x86_64.tar.gz containerd-1.6.28-d2iq.1-
rocky-9.1-x86_64.tar.gz
1.29.6_redhat_7_x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-centos-7.9-
x86_64.tar.gz containerd-1.6.28-d2iq.1-rhel-8.4-x86_64_fips.tar.gz
containerd-1.6.28-d2iq.1-ubuntu-20.04-x86_64.tar.gz
1.29.6_redhat_8_x86_64.tar.gz containerd-1.6.28-d2iq.1-centos-7.9-
x86_64_fips.tar.gz containerd-1.6.28-d2iq.1-rhel-8.6-x86_64.tar.gz images
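For example, for RHEL 8.6 nodes, the exports could look like the following (filenames taken from the listing above;
use the _fips variants for FIPS builds):
export OS_PACKAGES_BUNDLE=1.29.6_redhat_8_x86_64.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.28-d2iq.1-rhel-8.6-x86_64.tar.gz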
3. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_FILE="<private key file>"
SSH_PRIVATE_KEY_FILE must be either the name of the SSH private key file in your working directory or an
absolute path to the file in your user’s home directory.
4. Generate an inventory.yaml which is automatically picked up by the konvoy-image upload in the next step.
cat <<EOF > inventory.yaml
all:
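A complete inventory for the hosts exported in step 3 can look like the following sketch. The Ansible variable names
(ansible_user, ansible_port, ansible_ssh_private_key_file) are standard Ansible inventory keys and are shown here as
an assumption of the expected layout; adjust the host list to match your environment.
cat <<EOF > inventory.yaml
all:
  vars:
    ansible_user: $SSH_USER
    ansible_port: 22
    ansible_ssh_private_key_file: $SSH_PRIVATE_KEY_FILE
  hosts:
    $CONTROL_PLANE_1_ADDRESS:
      ansible_host: $CONTROL_PLANE_1_ADDRESS
    $WORKER_1_ADDRESS:
      ansible_host: $WORKER_1_ADDRESS
    # Add the remaining control plane and worker addresses in the same way
EOF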
5. Upload the artifacts onto cluster hosts with the following command:
konvoy-image upload artifacts --inventory-file=gpu_inventory.yaml \
--container-images-dir=./artifacts/images/ \
--os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
--containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
--pip-packages-bundle=./artifacts/pip-packages.tar.gz \
--nvidia-runfile=./artifacts/NVIDIA-Linux-x86_64-470.82.01.run
The konvoy-image upload artifacts command copies all OS packages and other artifacts onto each of
the machines in your inventory. When you create the cluster, the provisioning process connects to each node and
runs commands to install those artifacts, which gets Kubernetes running. KIB uses variable overrides
to specify the base image and container images to use in your new machine image. The variable override files
for NVIDIA and FIPS can be ignored unless you are adding an overlay feature. Use the --overrides overrides/
fips.yaml,overrides/offline-fips.yaml flag with manifests located in the overrides directory.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from both the bastion
machine and either the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the
Kubernetes cluster.
Procedure
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
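Before pushing the bundles, it can help to confirm that the registry is reachable from the bastion. For example, with
curl against the standard Docker Registry v2 endpoint (assuming your registry implements that API):
# A 200 response (empty JSON body) indicates the registry accepted the credentials
curl -f --cacert "${REGISTRY_CA}" -u "${REGISTRY_USERNAME}:${REGISTRY_PASSWORD}" "${REGISTRY_URL}/v2/"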
4. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply the variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the network
performance between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster
by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: NKP supports NVIDIA driver version 470.x. For more information, see NVIDIA driver.
1. Create the secret that the GPU node pool uses. This secret is populated from the KIB overrides, for example from
a file named overrides/nvidia.yaml.
gpu:
  types:
    - nvidia
build_name_extra: "-nvidia"
2. Create a secret on the bootstrap cluster that is populated from the above file. We will name it
${CLUSTER_NAME}-user-overrides
kubectl create secret generic ${CLUSTER_NAME}-user-overrides --from-
file=overrides.yaml=overrides/nvidia.yaml
3. Create an inventory and nodepool with the instructions below and use the ${CLUSTER_NAME}-user-overrides
secret.
a. Create an inventory object that has the same name as the node pool you’re creating and the details of the pre-
provisioned machines that you want to add to it. For example, to create a node pool named gpu-nodepool an
inventory named gpu-nodepool must be present in the same namespace.
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: ${MY_NODEPOOL_NAME}
spec:
  hosts:
    - address: ${IP_OF_NODE}
  sshConfig:
    port: 22
    user: ${SSH_USERNAME}
    privateKeyRef:
      name: ${NAME_OF_SSH_SECRET}
      namespace: ${NAMESPACE_OF_SSH_SECRET}
b. (Optional) If your pre-provisioned machines have overrides, you must create a secret that includes all of the
overrides you want to provide in one file. Create an override secret using the instructions detailed on this page.
c. Once the PreprovisionedInventory object and overrides are created, create a node pool.
nkp create nodepool preprovisioned -c ${MY_CLUSTER_NAME} ${MY_NODEPOOL_NAME} --
override-secret-name ${MY_OVERRIDE_SECRET}
Note: Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally
or store in version control.
Note: For more information regarding this flag or others, see the nkp create nodepool section of the
documentation for either cluster or nodepool and select your provider.
Procedure
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
2. Use the following template to help you define your infrastructure. The environment variables that you set in the
previous step automatically replace the variable names when the inventory YAML file is created.
cat <<EOF > preprovisioned_inventory.yaml
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: $CLUSTER_NAME-control-plane
  namespace: default
  labels:
    cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  hosts:
    # Create as many of these as needed to match your infrastructure
    # Note that the command-line parameter --control-plane-replicas determines how many
    # control plane nodes will actually be used.
    #
    - address: $CONTROL_PLANE_1_ADDRESS
    - address: $CONTROL_PLANE_2_ADDRESS
    - address: $CONTROL_PLANE_3_ADDRESS
  sshConfig:
    port: 22
    # This is the username used to connect to your infrastructure. This user must be root or
    # have the ability to use sudo without a password
    user: $SSH_USER
    privateKeyRef:
      # This is the name of the secret you created in the previous step. It must exist in the same
      # namespace as this inventory object.
      name: $SSH_PRIVATE_KEY_SECRET_NAME
      namespace: default
---
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: $CLUSTER_NAME-md-0
  namespace: default
  labels:
    cluster.x-k8s.io/cluster-name: $CLUSTER_NAME
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  hosts:
    - address: $WORKER_1_ADDRESS
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer, or a built-in virtual IP. At least one control plane node must always be running. Therefore, to upgrade a
cluster with one control plane node, a spare machine must be available in the control plane inventory. This machine
is used to provision the new node before the old node is deleted. When the API server endpoints are defined, you can
create the cluster using the link in the Next Step below.
Note: Modify Control Plane Audit logs settings using the information contained in the page Configure the Control
Plane.
Procedure
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
What to do next
Create a Kubernetes Cluster
nkp create cluster pre-provisioned --cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--self-managed
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. Alternative: Virtual IP. If you do not have an external load balancer (LB) and want to use a virtual IP provided by
kube-vip, specify these flags as shown in the example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1
Note: Depending on the cluster size, it will take a few minutes to create.
When the command completes, you will have a running Kubernetes cluster! For bootstrap and custom YAML cluster
creation, refer to the Additional Infrastructure Customization section of the documentation for Pre-provisioned: Pre-
provisioned Infrastructure
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to installing the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation, but before
production.
Audit Logs
To modify Control Plane Audit log settings, use the information contained in the page Configure the Control
Plane.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly, giving the machine's MAC address to clients.
• MetalLB IP address ranges or CIDRs need to be within the node's primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250, and
configures Layer 2 mode:
The following values are generic; enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PersistentVolumeClaim (PVC) based storage which requires your CSI provider
to support PVC with type volumeMode: Block. As this is not possible with the default local static provisioner,
you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can
consume all or just some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
  enabled: true
  values: |
    cephClusterSpec:
      storage:
        storageClassDeviceSets: []
        useAllDevices: true
        useAllNodes: true
        deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy and External Load Balancer.
6. Enable NVIDIA platform services for GPU resources in the same kommander.yaml file.
apps:
  nvidia-gpu-operator:
    enabled: true
a. RHEL 8.4/8.6
If you’re using RHEL 8.4/8.6 as the base operating system for your GPU-enabled nodes, set the
toolkit.version parameter in your Kommander Installer Configuration file (kommander.yaml) to
the following:
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-ubi8
8. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the command kubectl -n kommander wait --for
condition=Ready helmreleases --all --timeout 15m.
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by the command nkp open
dashboard --kubeconfig=${CLUSTER_NAME}.conf.
2. Retrieve your credentials at any time using the command kubectl -n kommander get secret
nkp-credentials -o go-template='Username: {{.data.username|base64decode}}
{{ "\n"}}Password: {{.data.password|base64decode}}{{ "\n"}}'.
3. Retrieve the URL used for accessing the UI using the command kubectl -n kommander get svc
kommander-traefik -o go-template='https://{{with index .status.loadBalancer.ingress
0}}{{or .hostname .ip}}{{end}}/NKP/kommander/dashboard{{ "\n"}}'
Only use the static credentials to access the UI for configuring an external identity provider. Treat them as backup
credentials rather than using them for normal access.
a. Rotate the password using the command nkp experimental rotate dashboard-password.
The example output displays the new password:
Password: kqZ31lMBSCLcBjUKVwLJMQL2PxalipIzZw5Pjyw09wDqjWV3dz2wPSSBYi09JGJp
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the Kommander
component. Those tasks are only done on Management clusters!
To make the new managed cluster a part of a Workspace, set that workspace environment variable.
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
• If a custom AMI was created using Konvoy Image Builder, use the --ami flag. The custom ami id is printed and
written to ./manifest.json. To use the built ami with Konvoy, specify it with the --ami flag when calling
cluster create in Step 1 in the next section where you create your Kubernetes cluster.
Tip: Before you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding NKP create cluster command.
Caution: NKP uses local static provisioners as the Default Storage Providers on page 33 for a pre-provisioned
environment. However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI
compatible storage that is suitable for production.
After disabling localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes
documentation: Change or Manage Multiple StorageClasses on page 34
For Pre-provisioned environments, you define a set of nodes that already exist. During the cluster creation process,
Konvoy Image Builder(KIB) is built into NKP and automatically runs the machine configuration process (which KIB
uses to build images for other providers) against the set of nodes that you defined. This results in your pre-existing or
pre-provisioned nodes being configured properly.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes
control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Procedure
1. This command uses the default external load balancer (LB) option.
nkp create cluster pre-provisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
Note: Depending on the cluster size, it will take a few minutes to create.
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
3. Note: This step is only necessary if you did not set the workspace of your cluster upon creation.
You can now either attach the cluster to a workspace through the UI (see the earlier section on attaching a cluster
through the UI) or attach it to the workspace you want through the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
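The remaining steps apply that secret in the target workspace and reference it from a KommanderCluster object.
The following is a sketch only; the apiVersion and the kubeconfigRef field reflect Kommander's documented attach
flow but should be verified against the cluster attachment documentation for your release.
# Apply the kubeconfig secret in the workspace namespace you exported earlier
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}
# Attach the cluster by creating a KommanderCluster that references the secret
cat << EOF | kubectl apply -f -
apiVersion: kommander.mesosphere.io/v1beta1
kind: KommanderCluster
metadata:
  name: ${MANAGED_CLUSTER_NAME}
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  kubeconfigRef:
    name: ${MANAGED_CLUSTER_NAME}-kubeconfig
EOF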
• Control Plane Nodes - NKP on AWS defaults to deploying an m5.xlarge instance with an 80GiB root
volume for control plane nodes, which meets the above resource requirements.
• Worker Nodes - NKP on AWS defaults to deploying an m5.2xlarge instance with an 80GiB root volume for
worker nodes, which meets the above resource requirements.
Section Contents
Supported environment combinations:
AWS Installation
This installation provides instructions to install NKP in an AWS non-air-gapped environment.
Remember, there are always more options for custom YAML in the Custom Installation and Additional
Infrastructure Tools section, but this will get you operating with basic features.
If not already done, see the documentation for:
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. Ensure you have met the minimal set of permissions from the AWS Image Builder Book.
6. Minimal IAM permissions for KIB to create an image for an AWS account using Konvoy Image Builder.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note:
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using your credentials
including required IAM:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
If you have an override file to configure specific attributes of your AMI file, add it. Instructions for customizing
an override file are found on this page: Image Overrides
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
a. By default, it builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
command below.
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer config)
file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below. That is the
AMI you use in the nkp create cluster command.
{
"builds": [
{
"name": "kib_image",
"builder_type": "amazon-ebs",
"build_time": 1698086886,
"files": null,
"artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
"packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
"custom_data": {
"containerd_version": "",
"distribution": "RHEL",
"distribution_version": "8.6",
"kubernetes_cni_version": "",
"kubernetes_version": "1.26.6"
}
}
],
"last_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05"
}
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then perform the
export and name the custom AMI for use in the command nkp create cluster:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the Complete NKP
Air-gapped Bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load the
registry. See Downloading NKP on page 16.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
Procedure
• Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
• The format or string used to search for matching AMIs and ensure it references the Kubernetes version plus
the base OS name: --ami-base-os ubuntu-20.04
• The base OS information: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
Note:
• The AMI must be created with Konvoy Image Builder in order to use the registry mirror
feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the
address of an existing local registry accessible in the VPC; the new cluster nodes will be
configured to use it as a mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
4. Run this command to create your Kubernetes cluster by providing the image ID and using any relevant flags.
nkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--ami=${AWS_AMI_ID} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--self-managed
If providing the AMI path, use these flags in place of AWS_AMI_ID:
--ami-owner AWS_ACCOUNT_ID \
--ami-base-os ubuntu-20.04 \
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
» HTTP or HTTPS flags if you use proxies: --http-proxy, --https-proxy, and --no-proxy
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy and External Load Balancer.
5. (Required only if your cluster uses a custom AWS VPC and requires an internal load balancer) Set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Configuring NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command:
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing Konvoy component and building a cluster as well as successfully installing Kommander and logging
into the UI, you are now ready to customize configurations using the Day 2 Cluster Operations Management section
of the documentation. The majority of this customization such as attaching clusters and deploying applications will
take place in the dashboard or UI of NKP. The Day 2 section allows you to manage cluster operations and their
application workloads to optimize your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the Kommander
component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Execute this command to create your additional Kubernetes cluster using any relevant flags. This creates a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--kubeconfig=<management-cluster-kubeconfig-path> \
--namespace ${WORKSPACE_NAMESPACE}
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see
Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This step is only necessary if you did not set the workspace of your cluster upon creation.
You can now either attach the cluster to a workspace through the UI (see the earlier section on attaching a cluster
through the UI) or attach it to the workspace you want through the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
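The parallel procedures later in this topic complete this step by wrapping the copied value in a Secret manifest (attached-cluster-kubeconfig.yaml) and applying it to the workspace namespace; for reference:
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
Then apply the file to the workspace namespace:
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}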
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, see Platform Expansion.
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/installation.
For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
6. Minimal IAM permissions for KIB to create an image for an AWS account using Konvoy Image Builder. For
more information, see Creating Minimal IAM Permissions for KIB on page 1035.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml.
Example
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using credentials that have the
required IAM permissions:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
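For example (placeholder values shown):
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_DEFAULT_REGION=us-west-2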
If you have an override file to configure specific attributes of your AMI file, add it. For more information on
customizing an override file, see Image Overrides on page 1073.
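For example, a minimal sketch assuming your override file is saved as overrides.yaml; the --overrides flag is the usual KIB mechanism for applying it, but confirm the flag against your konvoy-image version:
konvoy-image build aws images/ami/rhel-86.yaml --overrides overrides.yaml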
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
a. By default, it builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
command below.
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below.
That is the AMI you use in the nkp create cluster command.
...
amazon-ebs.kib_image: Adding tag: "distribution_version": "8.6"
amazon-ebs.kib_image: Adding tag: "gpu_nvidia_version": ""
amazon-ebs.kib_image: Adding tag: "kubernetes_cni_version": ""
amazon-ebs.kib_image: Adding tag: "build_timestamp": "20231023182049"
amazon-ebs.kib_image: Adding tag: "gpu_types": ""
amazon-ebs.kib_image: Adding tag: "kubernetes_version": "1.28.7"
==> amazon-ebs.kib_image: Creating snapshot tags
amazon-ebs.kib_image: Adding tag: "ami_name": "konvoy-ami-
rhel-8.6-1.26.6-20231023182049"
==> amazon-ebs.kib_image: Terminating the source AWS instance...
==> amazon-ebs.kib_image: Cleaning up any extra volumes...
==> amazon-ebs.kib_image: No volumes to clean up, skipping
==> amazon-ebs.kib_image: Deleting temporary security group...
==> amazon-ebs.kib_image: Deleting temporary keypair...
==> amazon-ebs.kib_image: Running post-processor: (type manifest)
Build 'amazon-ebs.kib_image' finished after 26 minutes 52 seconds.
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then perform the
export and name the custom AMI for use in the command nkp create cluster:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
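For example, the exported variable is later passed to cluster creation with the --ami flag (illustrative fragment only; the remaining flags required for your environment are shown in those sections):
nkp create cluster aws --cluster-name=${CLUSTER_NAME} --ami=${AWS_AMI_ID}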
Related Information
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the bundle. Download the
Complete NKP Air-gapped Bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to
load the registry.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images is required. This registry must be accessible from both the bastion machine
and either the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes
cluster.
Procedure
2. The directory structure created by extraction is used in subsequent steps. For example, for the bootstrap cluster,
change to the nkp-<version> directory (adjusting the path for your current location):
cd nkp-v2.12.0
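If you have not yet set the registry variables used in the commands below, export them first (the same variables are defined in the parallel air-gapped procedure later in this guide):
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>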
4. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit use your Docker Hub credentials when creating the cluster,
by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Procedure
• AWS_VPC_ID: the VPC ID where the cluster will be created. The VPC requires the following AWS VPC
Endpoints to be already present (for an example of creating an interface endpoint with the AWS CLI, see the
sketch after this list):
• ec2 - com.amazonaws.{region}.ec2
• elasticloadbalancing - com.amazonaws.{region}.elasticloadbalancing
• secretsmanager - com.amazonaws.{region}.secretsmanager
• autoscaling - com.amazonaws.{region}.autoscaling
• ecr - com.amazonaws.{region}.ecr.dkr
For more details, see the AWS documentation on accessing an AWS service through an interface VPC endpoint
(https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) and the list of AWS services that
integrate with AWS PrivateLink (https://docs.aws.amazon.com/vpc/latest/privatelink/aws-services-privatelink-support.html).
• AWS_SUBNET_IDS: a comma-separated list of one or more private Subnet IDs with each one in a different
Availability Zone. The cluster control-plane and worker nodes will automatically be spread across these
Subnets.
• AWS_ADDITIONAL_SECURITY_GROUPS: a comma-separated list of one or more Security Group IDs to
use in addition to the ones automatically created by CAPA. For more information, see https://github.com/
kubernetes-sigs/cluster-api-provider-aws.
• AWS_AMI_ID: the AMI ID to use for control-plane and worker nodes. The AMI must be created by the
konvoy-image-builder.
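As referenced in the AWS_VPC_ID item above, a minimal sketch of creating one of the required interface endpoints with the AWS CLI (values are placeholders; repeat the command for each required service):
aws ec2 create-vpc-endpoint \
  --vpc-id ${AWS_VPC_ID} \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-west-2.ec2 \
  --subnet-ids <subnet-id-1> <subnet-id-2> \
  --security-group-ids <security-group-id>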
Note: In previous NKP releases, AMI images provided by the upstream CAPA project would be used if you did
not specify an AMI. However, the upstream images are not recommended for production and may not always be
available. Therefore, NKP now requires you to specify an AMI when creating a cluster. To create an AMI, use
Konvoy Image Builder on page 1032.
There are two approaches to supplying the ID of your AMI. Either provide the ID of the AMI or provide a way for
NKP to discover the AMI using location, format and OS information:
• Option One - Provide the ID of your AMI: Use the example command below leaving the existing flag that
provides the AMI ID: --ami AMI_ID.
• Option Two - Provide a path for your AMI with the information required for image discovery.
• Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
• The base OS information: --ami-base-os ubuntu-20.04
• The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus
the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
5. (Optional) Configure your cluster to use an existing container registry as a mirror when attempting to pull images.
The example below is for AWS ECR:
Warning: Ensure that a local registry is set up before proceeding, if you have not already done so.
Warning: The AMI must be created with Konvoy Image Builder in order to use the registry mirror feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror when
attempting to pull images. Below is an AWS ECR example:
export REGISTRY_URL=<ecr-registry-URI>
• REGISTRY_URL: the address of an existing registry accessible in the VPC that the new cluster nodes will be
configured to use as a mirror registry when pulling images.
• Other local registries (for example, JFrog) may use the options below:
• REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. This value is only
needed if the registry is using a self-signed certificate and the AMIs are not already configured to trust this
CA.
• REGISTRY_USERNAME: (optional) set to a user that has pull access to this registry.
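For example, for a registry that needs these options (placeholder values; the variables are consumed by the --registry-mirror-* flags shown in later commands in this guide):
export REGISTRY_CA=<path-to-registry-ca-on-bastion>
export REGISTRY_USERNAME=<username-with-pull-access>
export REGISTRY_PASSWORD=<password>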
6. Create a Kubernetes cluster. The following example shows a common configuration. For the complete list of
cluster creation options, see the nkp create cluster aws CLI command reference.
Note: NKP uses the AWS CSI driver as the default storage provider. You can use any Kubernetes CSI-compatible
storage solution that is suitable for production (see https://kubernetes.io/docs/concepts/storage/volumes/#volume-types).
For more information, see https://kubernetes.io/docs/tasks/administer-cluster/change-default-storage-class/ in the
Kubernetes documentation.
• Option 1 - Run this command to create your Kubernetes cluster using any relevant flags for Option One
explained above, providing the AMI ID (add any of the remaining flags shown in Option 2 that apply to your
environment):
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--ami=${AWS_AMI_ID} \
--self-managed
• Option 2 - Run the command as shown from the explanation above to allow discovery of your AMI:
nkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami-owner AWS_ACCOUNT_ID \
--ami-base-os ubuntu-20.04 \
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=<YOUR_ECR_URL> \
--self-managed
» HTTP or HTTPS flags if you use proxies: --http-proxy, --https-proxy, and --no-proxy
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
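For reference, a minimal sketch of the install command these flags apply to (assuming the installer configuration file is named kommander.yaml, as in the following steps; confirm the flag names against the nkp install kommander CLI reference):
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf --wait-timeout 1h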
Prerequisites:
• Ensure you have reviewed all the prerequisites required for the installation. For more information, see
Prerequisites for Installation on page 44.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. For customization options, see Additional Kommander Configuration on page 964. Some options include
Custom Domains and Certificates, HTTP proxy, and External Load Balancer.
5. (Optional) If your cluster uses a custom AWS VPC and requires an internal load-balancer, set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Configuring Applications
After Installing Kommander on page 984.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands.
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op":
"replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op":
"replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials given using this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI while configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
The majority of this customization, such as attaching clusters and deploying applications, takes place in the NKP
dashboard (UI). The Day 2 section helps you manage cluster operations and their application workloads to optimize
your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the Kommander
component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Execute this command to create your additional Kubernetes cluster using any relevant flags. This will create a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws --cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--kubeconfig=<management-cluster-kubeconfig-path> \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see
Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach the cluster to a workspace through the UI, as described earlier, or attach it to the
workspace you want through the CLI, as described in the following steps.
Note: This is only necessary if you did not set the workspace of your cluster at creation.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
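To place this secret in the desired workspace (shown as a later step in the parallel procedure in this guide), apply the file to the workspace namespace:
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}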
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, see Platform Expansion.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/installation.
For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. Ensure you have met the minimal set of permissions from the AWS Image Builder Book.
6. Minimal IAM permissions for KIB to create an image for an AWS account using Konvoy Image Builder.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using your credentials
including required IAM:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
If you have an override file to configure specific attributes of your AMI file, add it. Instructions for customizing
an override file are found on this page: Image Overrides
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
a. By default, it builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
command below.
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below.
That is the AMI you use in the nkp create cluster command.
{
  "builds": [
    {
      "name": "kib_image",
      "builder_type": "amazon-ebs",
      "build_time": 1698086886,
      "files": null,
      "artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
      "packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
      "custom_data": {
        "containerd_version": "",
        "distribution": "RHEL",
        "distribution_version": "8.6",
        "kubernetes_cni_version": "",
        "kubernetes_version": "1.26.6"
      }
    }
  ]
}
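If you want to script the export shown in the next step, a minimal sketch that parses the manifest above (assuming it is saved as manifest.json in your KIB output directory; the actual file name may differ):
# Strip the region prefix (for example, "us-west-2:") from the artifact_id field.
export AWS_AMI_ID=$(jq -r '.builds[-1].artifact_id' manifest.json | cut -d: -f2)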
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then perform the
export and name the custom AMI for use in the command nkp create cluster:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Related Information
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the bundle. See Downloading
NKP on page 16 for the Complete NKP Air-gapped Bundle for this release (that is, nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz), which is used to load the registry.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
1. Option One - Provide the ID of your AMI:
a. Use the example command below leaving the existing flag that provides the AMI ID: --ami AMI_ID
2. Option Two - Provide a path for your AMI with the information required for image discovery.
a. Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
b. The base OS information: --ami-base-os ubuntu-20.04
c. The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus the
base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
Note:
• The AMI must be created with Konvoy Image Builder in order to use the registry mirror feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror when
attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the address of an
existing local registry accessible in the VPC that the new cluster nodes will be configured to use as a
mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
3. Run this command to create your Kubernetes cluster using any relevant flags.
nkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--ami=${AWS_AMI_ID} \
--kubernetes-version=v1.29.6+fips.0 \
--etcd-version=3.5.10+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--self-managed
If providing the AMI path, use these flags in place of AWS_AMI_ID:
--ami-owner AWS_ACCOUNT_ID \
--ami-base-os ubuntu-20.04 \
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. (Required only if your cluster uses a custom AWS VPC and requires an internal load balancer) Set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the command kubectl -n kommander wait --for
condition=Ready helmreleases --all --timeout 15m.
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the commands kubectl -n kommander patch
helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/
suspend", "value": true}]' kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --
type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials given using the command nkp open
dashboard --kubeconfig=${CLUSTER_NAME}.conf.
2. Retrieve your credentials at any time using the command kubectl -n kommander get secret
NKP-credentials -o go-template='Username: {{.data.username|base64decode}}
{{ "\n"}}Password: {{.data.password|base64decode}}{{ "\n"}}'.
3. Retrieve the URL used for accessing the UI using the command kubectl -n kommander get svc
kommander-traefik -o go-template='https://{{with index .status.loadBalancer.ingress
0}}{{or .hostname .ip}}{{end}}/NKP/kommander/dashboard{{ "\n"}}'.
Only use these static credentials to access the UI while configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
The majority of this customization, such as attaching clusters and deploying applications, takes place in the NKP
dashboard (UI). The Day 2 section helps you manage cluster operations and their application workloads to optimize
your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Execute this command to create your additional Kubernetes cluster using any relevant flags. This will create a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--with-aws-bootstrap-credentials=true \
--kubernetes-version=v1.29.6+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--kubeconfig=<management-cluster-kubeconfig-path>
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see
Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach the cluster to a workspace through the UI, as described earlier, or attach it to the
workspace you want through the CLI, as described in the following steps.
Note: This is only necessary if you did not set the workspace of your cluster at creation.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, see Platform Expansion.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/installation.
For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. Ensure you have met the minimal set of permissions from the AWS Image Builder Book.
6. Minimal IAM permissions for KIB to create an image for an AWS account using Konvoy Image Builder.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the NKP command create-package-bundle. This builds an OS bundle using
the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using your credentials
including required IAM:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
If you have an override file to configure specific attributes of your AMI file, add it. Instructions for customizing
an override file are found on this page: Image Overrides
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
a. By default, it builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
command below.
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below.
That is the AMI you use in the nkp create cluster command.
{
  "builds": [
    {
      "name": "kib_image",
      "builder_type": "amazon-ebs",
      "build_time": 1698086886,
      "files": null,
      "artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
      "packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
      "custom_data": {
        "containerd_version": "",
        "distribution": "RHEL",
        "distribution_version": "8.6",
        "kubernetes_cni_version": "",
        "kubernetes_version": "1.26.6"
      }
    }
  ]
}
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then perform the
export and name the custom AMI for use in the command nkp create cluster:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Related Information
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the bundle. See Downloading
NKP on page 16 for the Complete NKP Air-gapped Bundle for this release (that is, nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz), which is used to load the registry.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images is required. This registry must be accessible from both the bastion machine
and either the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes
cluster.
Procedure
2. The directory structure created by extraction is used in subsequent steps. For example, for the bootstrap cluster,
change to the nkp-<version> directory (adjusting the path for your current location):
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
4. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit use your Docker Hub credentials when creating the cluster,
by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
4. There are two approaches to supplying the ID of your AMI. Either provide the ID of the AMI or provide a way for
NKP to discover the AMI using location, format, and OS information:
• Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
• The base OS information: --ami-base-os ubuntu-20.04
• The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus
the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
Note:
• The AMI must be created with Konvoy Image Builder in order to use the registry mirror
feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the
address of an existing local registry accessible in the VPC that the new cluster nodes will be
configured to use as a mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
5. Run this command to create your Kubernetes cluster by providing the image ID and using any relevant flags.
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
» HTTP or HTTPS flags if you use proxies: --http-proxy, --https-proxy, and --no-proxy
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. (Required only if your cluster uses a custom AWS VPC and requires an internal load balancer) Set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the commands kubectl -n kommander patch
helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/
suspend", "value": true}]' kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --
type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials given using this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI while configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
The majority of this customization, such as attaching clusters and deploying applications, takes place in the NKP
dashboard (UI). The Day 2 section helps you manage cluster operations and their application workloads to optimize
your organization’s productivity.
AWS Air-gapped FIPS: Creating Managed Clusters Using the NKP CLI
This topic explains how to continue using the CLI to create managed clusters rather than switching to the
UI dashboard.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the Kommander
component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Execute this command to create your additional Kubernetes cluster using any relevant flags. This will create a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see
Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set using the command echo
${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace using
the command nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach the cluster to a workspace through the UI, as described earlier, or attach it to the
workspace you want through the CLI, as described in the following steps.
Note: This is only necessary if you did not set the workspace of your cluster at creation.
4. Retrieve the workspace where you want to attach the cluster using the command kubectl get workspaces
-A.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve
the kubeconfig secret value of your cluster using the command kubectl -n default get secret
${MANAGED_CLUSTER_NAME}-kubeconfig -o go-template='{{.data.value}}{{ "\n"}}'.
7. This returns a lengthy base64-encoded value. Copy the entire string into the value field of a Secret, using the
template below as a reference, and save it in a new attached-cluster-kubeconfig.yaml file.
Example:
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
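If you prefer to script steps 6 and 7, the following is a minimal sketch that captures the secret value and writes the file in one pass; it reuses only the variables and commands already shown in this procedure.
# Capture the base64-encoded kubeconfig value from the default namespace (see step 6).
KUBECONFIG_VALUE=$(kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-template='{{.data.value}}')
# Write the Secret manifest shown above with the captured value filled in.
cat <<EOF > attached-cluster-kubeconfig.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ${MANAGED_CLUSTER_NAME}-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: ${MANAGED_CLUSTER_NAME}
type: cluster.x-k8s.io/secret
data:
  value: ${KUBECONFIG_VALUE}
EOF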
8. Create this secret in the desired workspace using the command kubectl apply -f attached-cluster-
kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}
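The step that completes the attachment is the creation of a KommanderCluster object in the workspace namespace that references this secret. The manifest below is a sketch only; the apiVersion and field names shown here are assumptions, so confirm them against the Attach a Cluster topic for this release before using it.
# Sketch only: apiVersion and field names are assumptions; verify against the Attach a Cluster topic.
apiVersion: kommander.mesosphere.io/v1beta1
kind: KommanderCluster
metadata:
  name: <your-managed-cluster-name>
  namespace: <your-workspace-namespace>
spec:
  kubeconfigRef:
    name: <your-managed-cluster-name>-kubeconfig
Apply the manifest with kubectl apply -f <filename>.yaml, then continue with the next step.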
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by using the
command kubectl get kommanderclusters -A.
It may take a few minutes to reach "Joined" status. If you have several Pro Clusters and want to turn one of them
into a Managed Cluster to be centrally administered by a Management Cluster, review Platform Expansion.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
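A quick way to confirm that the exported region and profile resolve to the expected AWS account (assuming the AWS CLI is installed) is:
# Shows the account and principal behind the exported AWS_PROFILE.
aws sts get-caller-identity
# Confirms the exported region is reachable with these credentials.
aws ec2 describe-availability-zones --region ${AWS_REGION} --query 'AvailabilityZones[].ZoneName'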
GPU Prerequisites
Before you begin, you must:
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. Ensure you have met the minimal set of permissions from the AWS Image Builder Book.
6. Ensure you have the Minimal IAM Permissions for KIB to create an image for an AWS account using Konvoy Image Builder.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note: For Red Hat Enterprise Linux (RHEL) OS, pass your Red Hat Subscription Manager credentials by exporting
RHSM_ACTIVATION_KEY and RHSM_ORG_ID. Example commands:
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using your credentials
including required IAM:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
If you have an override file to configure specific attributes of your AMI file, add it. Instructions for customizing
an override file are found on this page: Image Overrides
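If you do add an override file, it is passed to the build with the --overrides flag (the same flag shown later in this guide for KIB); the override file name below is a placeholder:
konvoy-image build aws images/ami/rhel-86.yaml --overrides overrides.yaml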
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
a. By default, the image builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
command below.
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer config)
file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below. That is the
AMI you use in the nkp create cluster command.
{
  "builds": [
    {
      "name": "kib_image",
      "builder_type": "amazon-ebs",
      "build_time": 1698086886,
      "files": null,
      "artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
      "packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
      "custom_data": {
        "containerd_version": "",
        "distribution": "RHEL",
        "distribution_version": "8.6",
        "kubernetes_cni_version": "",
        "kubernetes_version": "1.26.6"
      }
    }
  ],
  "last_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05"
}
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then export the
custom AMI ID so it can be used in the nkp create cluster command:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
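As a convenience, the AMI ID can be pulled out of the manifest and exported in one step. This is a sketch only; it assumes the JSON content shown above has been saved as manifest.json and that jq is installed.
# artifact_id has the form <region>:<ami-id>; keep only the AMI ID portion.
export AWS_AMI_ID=$(jq -r '.builds[-1].artifact_id' manifest.json | cut -d: -f2)
echo ${AWS_AMI_ID}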
Related Information
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the Complete NKP
Air-gapped Bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load the
registry. See Downloading NKP on page 16.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
Procedure
a. Option One
Use the example command below, leaving the existing flag that provides the AMI ID: --ami AMI_ID
b. Option Two - Provide a path for your AMI with the information required for image discovery.
• Where the AMI is published, using your AWS account ID: --ami-owner AWS_ACCOUNT_ID
• The base OS information: --ami-base-os ubuntu-20.04
• The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus
the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
Note:
• The AMI must be created with Konvoy Image Builder in order to use the registry mirror
feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the
address of an existing local registry accessible in the VPC that the new cluster nodes will use as a
mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
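For reference, an ECR registry URI generally follows the pattern below; the account ID and region are placeholders:
export REGISTRY_URL=123456789012.dkr.ecr.us-west-2.amazonaws.com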
4. Run this command to create your Kubernetes cluster by providing the image ID and using any relevant flags.
nkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--ami=${AWS_AMI_ID} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--self-managed
If providing the AMI path, use these flags in place of AWS_AMI_ID:
--ami-owner AWS_ACCOUNT_ID \
--ami-base-os ubuntu-20.04 \
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
» HTTP or HTTPS flags if you use proxies: --http-proxy, --https-proxy, and --no-proxy
5. After cluster creation, create a node pool.
nkp create nodepool aws -c ${CLUSTER_NAME} \
--instance-type p2.xlarge \
--ami-id=${AMI_ID_FROM_KIB} \
--replicas=1 ${NODEPOOL_NAME} \
--kubeconfig=${CLUSTER_NAME}.conf
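To confirm that the new node pool machines have joined the cluster, a simple check against the kubeconfig written above is:
# The p2.xlarge workers appear here once they are provisioned and Ready.
kubectl get nodes -o wide --kubeconfig=${CLUSTER_NAME}.conf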
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, which can time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy and External Load Balancer.
5. Required only if your cluster uses a custom AWS VPC and needs an internal load balancer: set the Traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
6. Enable NKP Catalog Applications and install Kommander. In the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
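For orientation, the install invocation that consumes this kommander.yaml typically looks like the sketch below; the exact command and flag names are assumptions here, so confirm them in the Install Kommander topic for this release.
# Sketch only: verify the command and flags against the Install Kommander topic.
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf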
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach their Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
The majority of this customization, such as attaching clusters and deploying applications, takes place in the
dashboard or UI of NKP. The Day 2 section allows you to manage cluster operations and their application
workloads to optimize your organization’s productivity.
AWS with GPU: Creating Managed Clusters Using the NKP CLI
This topic explains how to continue using the CLI to create managed clusters rather than switching to the
UI dashboard.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Execute this command to create your additional Kubernetes cluster using any relevant flags. This creates a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--with-aws-bootstrap-credentials=true \
--kubeconfig=<management-cluster-kubeconfig-path>
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information,
see Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach the cluster to a workspace in the UI (as described earlier), or attach it to the
workspace you want using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This returns a lengthy base64-encoded value. Copy the entire string into the value field of a Secret, using the
template below as a reference, and save it in a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, see Platform Expansion.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your own image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants in order to enforce security.
Section Contents
Procedure
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. Ensure you have met the minimal set of permissions from the AWS Image Builder Book.
6. Ensure you have the Minimal IAM Permissions for KIB to create an image for an AWS account using Konvoy Image Builder.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note: For Red Hat Enterprise Linux (RHEL) OS, pass your Red Hat Subscription Manager credentials by exporting
RHSM_ACTIVATION_KEY and RHSM_ORG_ID. Example commands:
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
Note: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Set the environment variables for AWS access. The following variables must be set using your credentials
including required IAM:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
If you have an override file to configure specific attributes of your AMI file, add it. Instructions for customizing
an override file are found on this page: Image Overrides
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build aws images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
What to do next
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer config)
file. This file has an artifact_id field whose value provides the AMI ID, as shown in the example below. That is the
AMI you use in the nkp create cluster command.
{
  "builds": [
    {
      "name": "kib_image",
      "builder_type": "amazon-ebs",
      "build_time": 1698086886,
      "files": null,
      "artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
      "packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
      "custom_data": {
        "containerd_version": "",
        "distribution": "RHEL",
        "distribution_version": "8.6",
        "kubernetes_cni_version": "",
        "kubernetes_version": "1.26.6"
      }
    }
  ],
  "last_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05"
}
What to do next
1. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then export the
custom AMI ID so it can be used in the nkp create cluster command:
export AWS_AMI_ID=ami-<ami-id-here>
Note: Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Related Information
Procedure
• To use a local registry even in a non-air-gapped environment, download and extract the Complete NKP
Air-gapped Bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load the
registry. See Downloading NKP on page 16.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images is required. This registry must be accessible from both the bastion machine
and either the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes
cluster.
Procedure
2. The directory structure after extraction is used in subsequent steps to access files from different directories.
For example, for the bootstrap cluster, change your directory to the nkp-<version> directory, similar to the
example below, depending on your current location.
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
4. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit use your Docker Hub credentials when creating the cluster,
by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
Procedure
• Where the AMI is published, using your AWS account ID: --ami-owner AWS_ACCOUNT_ID
• The base OS information: --ami-base-os ubuntu-20.04
• The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus
the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
Note:
• The AMI must be created with Konvoy Image Builder in order to use the registry mirror
feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the
address of an existing local registry accessible in the VPC that the new cluster nodes will use as a
mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
5. Run this command to create your Kubernetes cluster by providing the image ID and using any relevant flags.
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--kubernetes-version=v1.29.6+fips.0 \
--etcd-version=3.5.10+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--self-managed
If providing the AMI path, use these flags in place of AWS_AMI_ID:
--ami-owner AWS_ACCOUNT_ID \
--ami-base-os ubuntu-20.04 \
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, which can time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy and External Load Balancer.
5. Required only if your cluster uses a custom AWS VPC and needs an internal load balancer: set the Traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
6. Enable NKP Catalog Applications and install Kommander. In the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach their Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
The majority of this customization, such as attaching clusters and deploying applications, takes place in the
dashboard or UI of NKP. The Day 2 section allows you to manage cluster operations and their application
workloads to optimize your organization’s productivity.
AWS Air-gapped GPU: Creating Managed Clusters Using the NKP CLI
This topic explains how to continue using the CLI to create managed clusters rather than switching to the
UI dashboard.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Execute this command to create your additional Kubernetes cluster using any relevant flags. This creates a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster aws --cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--with-aws-bootstrap-credentials=true \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--kubeconfig=<management-cluster-kubeconfig-path>
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. For more information,
see Clusters with HTTP or HTTPS Proxy on page 647.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach the cluster to a workspace in the UI (as described earlier), or attach it to the
workspace you want using the CLI.
Note: This is only necessary if you never set the workspace of your cluster upon creation.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This returns a lengthy base64-encoded value. Copy the entire string into the value field of a Secret, using the
template below as a reference, and save it in a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Note: An EKS cluster cannot be a Management or Pro cluster. To install NKP on your EKS cluster, first, ensure
you have a Management cluster with NKP and the Kommander component installed that handles the life cycle of your
EKS cluster.
In order to install Kommander, you need to have CAPI components, cert-manager, etc on a self-managed cluster.
The CAPI components mean you can control the life cycle of the cluster, and other clusters. However, because EKS
is semi-managed by AWS, the EKS clusters are under AWS control and don’t have those components. Therefore,
Kommander will not be installed.
Section Contents
EKS Installation
This installation provides instructions to install NKP in an AWS non-air-gapped environment.
Note: Ensure that the KUBECONFIG environment variable is set to the Management cluster by running export
KUBECONFIG=<Management_cluster_kubeconfig>.conf.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must have:
1. A Management cluster with the Kommander component installed.
2. A valid AWS account with credentials configured that can manage CloudFormation Stacks, IAM
Policies, and IAM Roles.
3. The AWS CLI utility installed.
4. aws-iam-authenticator installed. This binary is used to access your cluster using kubectl.
Note: In order to install Kommander, you need to have CAPI components, cert-manager, etc on a self-managed cluster.
The CAPI components mean you can control the life cycle of the cluster, and other clusters. However, because EKS
is semi-managed by AWS, the EKS clusters are under AWS control and don’t have those components. Therefore,
Kommander will not be installed and these clusters will be attached to the management cluster.
If you are using AWS ECR as your local private registry, more information is available on the Registry Mirror
Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your image for the region.
Note: If your role is not named NKP-bootstrapper-role, change the parameter on line 6 of the file.
AWSTemplateFormatVersion: 2010-09-09
Parameters:
  existingBootstrapperRole:
    Type: CommaDelimitedList
    Description: 'Name of existing minimal role you want to add EKS cluster management permissions to'
    Default: NKP-bootstrapper-role
Resources:
  EKSMinimumPermissions:
    Properties:
      Description: Minimal user policy to manage eks clusters
      ManagedPolicyName: eks-bootstrapper
      PolicyDocument:
        Statement:
          - Action:
Roles
Below is a CloudFormation stack that includes IAM policies and roles required to set up EKS clusters.
Note: To create the resources in the CloudFormation stack, copy the contents above into a file and execute the
following command after replacing MYFILENAME.yaml and MYSTACKNAME with the intended values:
aws cloudformation create-stack --template-body=file://MYFILENAME.yaml --stack-name=MYSTACKNAME --capabilities CAPABILITY_NAMED_IAM
AWSTemplateFormatVersion: 2010-09-09
Parameters:
  existingControlPlaneRole:
    Type: CommaDelimitedList
    Description: 'Names of existing Control Plane Role you want to add to the newly created EKS Managed Policy for AWS cluster API controllers'
    Default: control-plane.cluster-api-provider-aws.sigs.k8s.io
  existingNodeRole:
    Type: CommaDelimitedList
    Description: 'ARN of the Nodes Managed Policy to add to the role for nodes'
    Default: nodes.cluster-api-provider-aws.sigs.k8s.io
Resources:
  AWSIAMManagedPolicyControllersEKS:
    Properties:
      Description: For the Kubernetes Cluster API Provider AWS Controllers
      ManagedPolicyName: controllers-eks.cluster-api-provider-aws.sigs.k8s.io
      PolicyDocument:
        Statement:
          - Action:
              - 'ssm:GetParameter'
            Effect: Allow
            Resource:
              - 'arn:*:ssm:*:*:parameter/aws/service/eks/optimized-ami/*'
          - Action:
              - 'iam:CreateServiceLinkedRole'
            Condition:
              StringLike:
1. Export the AWS region where you want to deploy the cluster.
export AWS_REGION=us-west-2
2. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster.
export AWS_PROFILE=<profile>
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Known Limitations
Procedure
• The Konvoy version used to create a workload cluster must match the Konvoy version used to delete a workload
cluster.
• EKS clusters cannot be Self-managed.
• Konvoy supports deploying one workload cluster. Konvoy generates a set of objects for one Node Pool.
• Konvoy does not validate edits to cluster objects.
Procedure
1. Set the environment variable to the name you assigned this cluster.
export CLUSTER_NAME=eks-example
2. Make sure your AWS credentials are up to date. Refreshing the credentials is only necessary if you are using
Access Keys; if you are using role-based authentication on a bastion host, proceed to step 3. For more
information, see Leverage the NKP Create Cluster Role on page 750.
nkp update bootstrap credentials aws
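The step that generates the EKS cluster objects is the nkp create cluster command for the EKS provider. A minimal sketch is shown below; flags beyond the cluster name are optional, and the exact set you need depends on your environment.
# Sketch only: add flags relevant to your environment as needed.
nkp create cluster eks \
--cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami)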
4. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects as edits
can prevent the cluster from deploying successfully. See Customizing CAPI Clusters.
6. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command
to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
NAME                                                                READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/eks-example                                                 True                     10m
##ControlPlane - AWSManagedControlPlane/eks-example-control-plane   True                     10m
##Workers
  ##MachineDeployment/eks-example-md-0                              True                     26s
    ##Machine/eks-example-md-0-78fcd7c7b7-66ntt                     True                     84s
    ##Machine/eks-example-md-0-78fcd7c7b7-b9qmc                     True                     84s
    ##Machine/eks-example-md-0-78fcd7c7b7-v5vfq                     True                     84s
    ##Machine/eks-example-md-0-78fcd7c7b7-zl6m2                     True                     84s
7. As they progress, the controllers also create Events. List the Events using this command.
kubectl get events | grep ${CLUSTER_NAME}
For brevity, the example uses grep. It is also possible to use separate commands to get Events for specific objects.
For example, kubectl get events --field-selector involvedObject.kind="AWSCluster" and
kubectl get events --field-selector involvedObject.kind="AWSMachine".
46m  Normal  SuccessfulCreateVPC               awsmanagedcontrolplane/eks-example-control-plane  Created new managed VPC "vpc-05e775702092abf09"
46m  Normal  SuccessfulSetVPCAttributes        awsmanagedcontrolplane/eks-example-control-plane  Set managed VPC attributes for "vpc-05e775702092abf09"
46m  Normal  SuccessfulCreateSubnet            awsmanagedcontrolplane/eks-example-control-plane  Created new managed Subnet "subnet-0419dd3f2dfd95ff8"
46m  Normal  SuccessfulModifySubnetAttributes  awsmanagedcontrolplane/eks-example-control-plane  Modified managed Subnet "subnet-0419dd3f2dfd95ff8" attributes
46m  Normal  SuccessfulCreateSubnet            awsmanagedcontrolplane/eks-example-control-plane  Created new managed Subnet "subnet-0e724b128e3113e47"
Note: More information about the configuration of the EKS control plane can be found on the EKS Cluster IAM
Policies and Roles page.
If the EKS cluster was created using a self-managed AWS cluster that uses IAM Instance Profiles, you
will need to modify the iamAuthenticatorConfig field in the AWSManagedControlPlane API object to allow
other IAM entities to access the EKS workload cluster. Follow the steps below:
Procedure
1. Run the following command with your KUBECONFIG configured to select the self-managed cluster
previously used to create the workload EKS cluster. Ensure you substitute ${CLUSTER_NAME} and
${CLUSTER_NAMESPACE} with their corresponding values for your cluster.
kubectl edit awsmanagedcontrolplane ${CLUSTER_NAME}-control-plane -n
${CLUSTER_NAMESPACE}
2. Edit the iamAuthenticatorConfig field to map the IAM Role to the corresponding Kubernetes Role. In
this example, the IAM role arn:aws:iam::111122223333:role/PowerUser is granted the cluster role
system:masters. Note that this example uses placeholder AWS resource ARNs; remember to substitute real values
from the corresponding AWS account.
iamAuthenticatorConfig:
  mapRoles:
    - groups:
        - system:bootstrappers
        - system:nodes
      rolearn: arn:aws:iam::111122223333:role/my-node-role
      username: system:node:{{EC2PrivateDNSName}}
    - groups:
        - system:masters
      rolearn: arn:aws:iam::111122223333:role/PowerUser
      username: admin
For further instructions on changing or assigning roles or clusterroles to which you can map IAM users or
roles, see Amazon Enabling IAM access to your cluster.
Procedure
1. Get a kubeconfig file for the workload cluster from the Secret, and write it to a file using this command.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: It may take a few minutes for the Status to move to Ready while the Pod network is deployed. The node
status will change to Ready soon after the calico-node DaemonSet Pods are Ready.
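To watch the workload cluster nodes come up using the kubeconfig file you just wrote, run:
# Nodes move to Ready after the calico-node DaemonSet Pods are running.
kubectl --kubeconfig=${CLUSTER_NAME}.conf get nodes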
• Install aws-iam-authenticator. This binary is used to access your cluster using kubectl.
Attach a Pre-existing EKS Cluster
Note: Ensure that the KUBECONFIG environment variable is set to the Management cluster before attaching by running:
export KUBECONFIG=<Management_cluster_kubeconfig>.conf
Procedure
1. Ensure you are connected to your EKS clusters. Enter the following commands for each of your clusters.
kubectl config get-contexts
kubectl config use-context <context for first eks cluster>
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-
sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-
template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}
{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if
eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-
data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}
{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
context:
cluster: ${CURRENT_CONTEXT}
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify the
kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach it in the UI, or attach your cluster to the workspace you want in the CLI. This is only
necessary if you never set the workspace of your cluster upon creation.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown list at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
stop following the steps below, and see the instructions on the page Attach a cluster WITH network restrictions.
5. Upload the kubeconfig file you created in the previous section (or copy its contents) into the Cluster
Configuration section.
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached in
the NKP UI. If this happens, ensure your system has sufficient resources for all pods.
vSphere Overview
vSphere is a more complex setup than some of the other providers and infrastructures, so an overview of steps has
been provided to help. To confirm that your OS is supported, see Supported Operating System.
The overall process for configuring vSphere and NKP together includes the following steps:
1. Configure vSphere to provide the needed elements described in the vSphere Prerequisites: All Installation
Types.
2. For air-gapped environments: Creating a Bastion Host on page 652.
3. Create a base OS image (for use in the OVA package containing the disk images packaged with the OVF).
4. Create a CAPI VM image template that uses the base OS image and adds the needed Kubernetes cluster
components.
5. Create a new self-managing cluster on vSphere.
6. Install Kommander.
7. Verify and log on to the UI.
Section Contents
Supported environment variable combinations:
1. NKP Prerequisites
Before using NKP to create a vSphere cluster, verify that you have:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• A registry needs to be installed on the host where the NKP Konvoy CLI runs. For example, if you are installing Konvoy
on your laptop, ensure the laptop has a supported version of Docker or another registry. On macOS, Docker runs in a
virtual machine. Configure this virtual machine with at least 8 GB of memory.
• CLI tool Kubectl 1.21.6 for interacting with the running cluster, installed on the host where the NKP Konvoy
command line interface (CLI) runs. For more information, see https://kubernetes.io/docs/tasks/tools/#kubectl.
• A valid VMware vSphere account with credentials configured.
Note: NKP uses the vsphere CSI driver as the default storage provider. Use a Kubernetes CSI-compatible storage
that is suitable for production. For more information, see https://kubernetes.io/docs/concepts/storage/
volumes/#volume-types.
Note: You can choose from any of the storage options available for Kubernetes. To turn off the default that
Konvoy deploys, set the default StorageClass as non-default. Then, set your newly created StorageClass to be the
default by following the commands in the Kubernetes documentation called Changing the Default Storage
Class.
• Access to a bastion VM or other network-connected host, running vSphere Client version
v6.7.x with Update 3 or a later version.
• You must be able to reach the vSphere API endpoint from where the Konvoy command line interface (CLI)
runs.
• vSphere account with credentials configured - this account must have Administrator privileges.
• A RedHat subscription with a username and password for downloading DVD ISOs.
• For air-gapped environments, a bastion VM host template with access to a configured local registry. The
recommended template naming pattern is ../folder-name/NKP-e2e-bastion-template or similar. Each
infrastructure provider has its own set of bastion host instructions. For more information on Creating a Bastion
Host on page 652, see your provider’s documentation:
• AWS: https://aws.amazon.com/solutions/implementations/linux-bastion/
• Azure: https://learn.microsoft.com/en-us/azure/bastion/quickstart-host-portal
• GCP: https://blogs.vmware.com/cloud/2021/06/02/intro-google-cloud-vmware-engine-bastion-host-
access-iap/
• vSphere: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.security.doc/
GUID-6975426F-56D0-4FE2-8A58-580B40D2F667.html
• VMware: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.security.doc/
GUID-6975426F-56D0-4FE2-8A58-580B40D2F667.html.
• Use of PersistentVolumes in your cluster depends on Cloud Native Storage (CNS), available in vSphere
v6.7.x with Update 3 and later versions. CNS depends on this shared Datastore’s configuration.
• Datastore URL from the datastore record for the shared datastore you want your cluster to use.
• You need this URL value to ensure that the correct Datastore is used when NKP creates VMs for your
cluster in vSphere.
• Folder name.
• Base template name, such as base-rhel-8 or base-rhel-7.
• Name of a Virtual Network that has DHCP enabled for both air-gapped and non-air-gapped environments.
• Resource Pools - at least one resource pool is needed, with every host in the pool having access to shared
storage, such as VSAN.
• Each host in the resource pool needs access to shared storage, such as NFS or VSAN, to make use of
machine deployments and high-availability control planes.
Section Contents
vSphere Roles
When provisioning Kubernetes clusters with the Nutanix Kubernetes Platform (NKP) vSphere provider, four
roles are needed for NKP to provide proper permissions.
Procedure
1. Open a vSphere Client connection to the vCenter Server, described in the Prerequisites.
3. Give the new Role a name from the four choices detailed in the next section.
4. Select the Privileges from the permissions directory tree dropdown list below each of the four roles.
• The list of permissions can be set so that the provider is able to create, modify, or delete resources or clone
templates, VMs, disks, attach network, etc.
• Cns: Searchable
• Datastore: Allocate space
• Host: Configuration
• Profile-driven storage
• Network: Assign network
• Resource
• Change Configuration - from the list in that section, select these permissions: Advanced configuration, Change Memory, Change Settings
• Edit inventory: Remove
• Interaction: Power off, Power on
• Provisioning: Clone template, Deploy template
• Session: ValidateSession
In the table below we describe the level at which these permissions get assigned.
vSphere Installation
This topic provides instructions on how to install NKP in a vSphere non-air-gapped environment.
Remember, there are always more options for custom YAML in the Custom Installation and Additional
Infrastructure Tools section, but this will get you operating with basic features.
If not already done, see the documentation for:
Section Contents
The workflow on the left shows the creation of a base OS image in the vCenter vSphere client using inputs from
Packer. The workflow on the right shows how NKP uses that same base OS image to create CAPI-enabled VM
images for your cluster.
After creating the base image, the NKP image builder uses it to create a CAPI-enabled vSphere template that includes
the Kubernetes objects for the cluster. You can use that resulting template with the NKP create cluster command
to create the VM nodes in your cluster directly on a vCenter server. From that point, you can use NKP to provision
and manage your cluster.
NKP communicates with the code in vCenter Server as the management layer for creating and managing virtual
machines after ESXi 6.7 Update 3 or later is installed and configured.
Next Step
vSphere Air-gapped: Create an Image
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP partition.
• Network configuration: as KIB must download and install packages, activating the network is required.
• Connect to Red Hat: if using Red Hat Enterprise Linux (RHEL), registering with Red Hat is required to configure
software repositories and install software packages.
• Software selection: Nutanix recommends choosing Minimal Install.
• NKP recommends installing with the packages provided by the operating system package managers. Use the
version that corresponds to the major version of your operating system.
Procedure
1. Users need to perform the steps in the topic vSphere: Creating an Image before starting this procedure.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
3. Create an image.yaml file and add the following variables for vSphere. NKP uses this file and these variables as
inputs in the next step. To customize your image.yaml file, refer to this section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer to
the example YAML files located here: OVA YAML.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-
vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
• Any additional configurations can be added to this command using --overrides flags as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
5. The Konvoy Image Builder (KIB) uses the values in image.yaml and the input base OS image to create a
vSphere template directly on the vCenter server. This template contains the required artifacts needed to create a
Kubernetes cluster. When KIB successfully provisions the OS image, it creates a manifest file. The artifact_id
field of this file contains the name of the AMI ID (AWS), template name (vSphere), or image name (GCP/Azure),
for example.
{
"name": "vsphere-clone",
"builder_type": "vsphere-clone",
"build_time": 1644985039,
"files": null,
"artifact_id": "konvoy-ova-vsphere-rhel-84-1.21.6-1644983717",
"packer_run_uuid": "260e8110-77f8-ca94-e29e-ac7a2ae779c8",
"custom_data": {
"build_date": "2022-02-16T03:55:17Z",
"build_name": "vsphere-rhel-84",
"build_timestamp": "1644983717",
[...]
}
}
Tip: Now that you can see the template created in your vCenter, it is best to rename it to
nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, for example nkp-2.4.0-k8s-1.24.6-ubuntu, to
keep templates organized.
Next Step
Procedure
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting
the following flag --registry-mirror-url=https://registry-1.docker.io --registry-
mirror-username=<username> --registry-mirror-password=<password> on the nkp create
cluster command.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Procedure
3. Use the following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI-compatible
storage provider that is suitable for production. See the Kubernetes documentation called Changing the
Default Storage Class for more information. If you’re not using the default, you cannot deploy an alternate
provider until after the nkp create cluster command is finished. However, this must be determined before the
installation.
4. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating
the cluster, by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command.
» Flatcar OS flag: For Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make some changes
related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
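For illustration only (the proxy address and exclusion list below are placeholder values, not taken from this guide), the flags are appended to the nkp create cluster vsphere command shown above, for example:
--http-proxy=http://proxy.example.com:3128 \
--https-proxy=http://proxy.example.com:3128 \
--no-proxy=127.0.0.1,localhost,.svc,.cluster.local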
Next Step
Procedure
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly and giving the machine’s MAC address to clients.
• MetalLB IP address ranges or CIDRs need to be within the node’s primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
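For example, the following configuration (the same sample used later in this guide) gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250 and configures Layer 2 mode; substitute an address range from your own environment:
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml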
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
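The four values are typically the peer router's address, the peer's AS number, the AS number MetalLB should use, and an IP address range; they are not listed at this point in the guide. As a rough sketch only, based on MetalLB's legacy ConfigMap format and using placeholder values that are not taken from this guide, a BGP configuration resembles the following:
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1
      peer-asn: 64501
      my-asn: 64500
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.168.10.0/24
EOF
kubectl apply -f metallb-conf.yaml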
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period (for example, 1h) to allocate more time to the deployment of
applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "NKP"
gitRepositorySpec:
url: https://github.com/mesosphere/NKP-catalog-applications
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Next Step
Procedure
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach their Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command.
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands.
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, successfully installing Kommander, and logging in
to the UI, you are ready to customize configurations using the Cluster Operations Management section of the
documentation. The majority of this customization, such as attaching clusters and deploying applications, takes
place in the NKP dashboard or UI. The Cluster Operations section allows you to manage cluster operations and
their application workloads to optimize your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Use the following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
2. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
--resource-pool <RESOURCE_POOL_NAME> \
--virtual-ip-interface <ip_interface_name> \
--vm-template <TEMPLATE_NAME> \
--kubeconfig=<management-cluster-kubeconfig-path> \
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
You can now either attach it in the UI (see the earlier link for attaching it to a workspace through the UI), or
attach your cluster to the workspace you want using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
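The intermediate numbered steps are not shown here. As a sketch of what typically follows (an assumption rather than the guide's exact step; the file name and namespace variable are the ones used earlier in this procedure), the secret is applied into the desired workspace namespace:
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}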
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Procedure
Section Contents
The workflow on the left shows the creation of a base OS image in the vCenter vSphere client using inputs from
Packer. The workflow on the right shows how NKP uses that same base OS image to create CAPI-enabled VM
images for your cluster.
After creating the base image, the NKP image builder uses it to create a CAPI-enabled vSphere template that includes
the Kubernetes objects for the cluster. You can use that resulting template with the NKP create cluster command
to create the VM nodes in your cluster directly on a vCenter server. From that point, you can use NKP to provision
and manage your cluster.
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP partition.
• Network configuration: as KIB must download and install packages, activating the network is required.
Procedure
1. The directory structure after extraction can be accessed in subsequent steps using commands to access files from
different directories. Example: For the bootstrap cluster, change your directory to the nkp-<version> directory,
similar to the example below, depending on your current location.
cd nkp-v2.12.0
2. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
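For illustration only, the values might look like the following; the registry address, credentials, and CA path are placeholders, not values from this guide:
export REGISTRY_URL="https://registry.example.com:5000"
export REGISTRY_USERNAME=admin
export REGISTRY_PASSWORD=example_password
export REGISTRY_CA=/etc/ssl/certs/registry-ca.crt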
3. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
1. Load the Kommander images into your private registry using the command below to load the image bundle.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
2. Optional Step for Ultimate License to load NKP Catalog Applications images.
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: Users need to perform the steps in the topic vSphere: Creating an Image before starting this procedure.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
5. Follow the instructions to build a vSphere template below and, if applicable, set the override flag --overrides
overrides/offline.yaml described in Step 4 below.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
3. Create an image.yaml file and add the following variables for vSphere. NKP uses this file and these variables as
inputs in the next step. To customize your image.yaml file, refer to this section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer to
the example YAML files located here: OVA YAML.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-
vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: "<VSPHERE_CLUSTER_NAME>"
datacenter: "<VSPHERE_DATACENTER_NAME>"
datastore: "<VSPHERE_DATASTORE_NAME>"
folder: "<VSPHERE_FOLDER>"
insecure_connection: "false"
network: "<VSPHERE_NETWORK>"
resource_pool: "<VSPHERE_RESOURCE_POOL>"
template: "os-qualification-templates/d2iq-base-Ubuntu-20.04" # change default
value with your base template name
vsphere_guest_os_type: "other4xLinux64Guest"
guest_os_type: "ubuntu2004-64"
# goss params
distribution: "ubuntu"
distribution_version: "20.04"
# Use the following overrides to select the authentication method that can be used with the base template
# ssh_username: "" # can be exported as environment variable 'SSH_USERNAME'
# ssh_password: "" # can be exported as environment variable 'SSH_PASSWORD'
# ssh_private_key_file: "" # can be exported as environment variable 'SSH_PRIVATE_KEY_FILE'
# ssh_agent_auth: false # if set to true, ssh_password and ssh_private_key will be ignored
• Any additional configurations can be added to this command using --overrides flags as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
Tip: Now that you can see the template created in your vCenter, it is best to rename it to
nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, such as nkp-2.4.0-k8s-1.24.6-ubuntu, to
keep templates organized.
6. The next step is to deploy an NKP cluster using your vSphere template.
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
3. Use the following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI-compatible
storage provider that is suitable for production. See the Kubernetes documentation called Changing the Default
Storage Class for more information. If you’re not using the default, you cannot deploy an alternate provider
until after the nkp create cluster command is finished. However, this must be determined before the installation.
5. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
--resource-pool <RESOURCE_POOL_NAME> \
--vm-template <TEMPLATE_NAME> \
--virtual-ip-interface <ip_interface_name> \
--self-managed
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating
the cluster, by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command.
» Flatcar OS flag: For Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make some changes
related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly, to give the machine’s MAC address to clients.
• MetalLB IP address ranges or CIDRs need to be within the node’s primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnet must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250, and
configures Layer 2 mode:
The following values are generic; enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach their Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command.
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands.
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
After installing the Konvoy component, building a cluster, successfully installing Kommander, and logging in
to the UI, you are ready to customize configurations using the Cluster Operations Management section of the
documentation. The majority of this customization, such as attaching clusters and deploying applications, takes
place in the NKP dashboard or UI. The Cluster Operations section allows you to manage cluster operations and
their application workloads to optimize your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Configure your cluster to use an existing local registry as a mirror when attempting to pull images. Important:
The image must be created by Konvoy Image Builder in order to use the registry mirror feature.
export REGISTRY_URL=<https/http>://<registry-address>:<registry-port>
export REGISTRY_CA=<path to the CA on the bastion>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
3. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--kubeconfig=<management-cluster-kubeconfig-path> \
--namespace ${WORKSPACE_NAMESPACE} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <CONTROL_PLANE_IP> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file </path/to/key.pub> \
--resource-pool <RESOURCE_POOL_NAME> \
--vm-template konvoy-ova-vsphere-os-release-k8s_release-vsphere-timestamp \
--virtual-ip-interface <ip_interface_name> \
--extra-sans "127.0.0.1" \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI (see the earlier link for attaching it to a workspace through the UI), or
attach your cluster to the workspace you want using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Next Step
Procedure
Section Contents
The workflow on the left shows the creation of a base OS image in the vCenter vSphere client using inputs from
Packer. The workflow on the right shows how NKP uses that same base OS image to create CAPI-enabled VM
images for your cluster.
After creating the base image, the NKP image builder uses it to create a CAPI-enabled vSphere template that includes
the Kubernetes objects for the cluster. You can use that resulting template with the NKP create cluster command
to create the VM nodes in your cluster directly on a vCenter server. From that point, you can use NKP to provision
and manage your cluster.
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP partition.
• Network configuration: as KIB must download and install packages, activating the network is required.
Procedure
1. Users need to perform the steps in the topic vSphere FIPS: Creating an Image before starting this procedure.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
3. Create an image.yaml file and add the following variables for vSphere. NKP uses this file and these variables as
inputs in the next step. To customize your image.yaml file, refer to this section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer to
the example YAML files located here: OVA YAML.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-
vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: "<VSPHERE_CLUSTER_NAME>"
datacenter: "<VSPHERE_DATACENTER_NAME>"
datastore: "<VSPHERE_DATASTORE_NAME>"
folder: "<VSPHERE_FOLDER>"
insecure_connection: "false"
network: "<VSPHERE_NETWORK>"
• Any additional configurations can be added to this command using --overrides flags as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
5. The Konvoy Image Builder (KIB) uses the values in image.yaml and the input base OS image to create a
vSphere template directly on the vCenter server. This template contains the required artifacts needed to create a
Kubernetes cluster. When KIB provisions the OS image successfully, it creates a manifest file. The artifact_id
field of this file contains the AMI ID (AWS), template name (vSphere), or image name (GCP or
Azure), as in the following example.
{
"name": "vsphere-clone",
"builder_type": "vsphere-clone",
"build_time": 1644985039,
"files": null,
"artifact_id": "konvoy-ova-vsphere-rhel-84-1.21.6-1644983717",
"packer_run_uuid": "260e8110-77f8-ca94-e29e-ac7a2ae779c8",
"custom_data": {
"build_date": "2022-02-16T03:55:17Z",
"build_name": "vsphere-rhel-84",
"build_timestamp": "1644983717",
[...]
}
}
Tip: Now that you can see the template created in your vCenter, it is best to rename it to
nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, such as nkp-2.4.0-k8s-1.24.6-ubuntu, to
keep templates organized.
6. The next step is to deploy an NKP cluster using your vSphere template.
Procedure
• The table below identifies the current FIPS and etcd versions for this release.
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail
if the name has capital letters. For more Kubernetes naming information, see https://kubernetes.io/docs/
concepts/overview/working-with-objects/names/.
• Name your cluster and give it a unique name suitable for your environment.
• Set the environment variable:
export CLUSTER_NAME=<my-vsphere-cluster>
Procedure
1. Use the following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
2. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting
the following flag --registry-mirror-url=https://registry-1.docker.io --registry-
mirror-username= --registry-mirror-password= on the nkp create cluster command.
» Flatcar OS flag: For Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make some changes
related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
Next Step
Procedure
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly, to give the machine’s MAC address to clients.
• MetalLB IP address ranges or CIDRs need to be within the node’s primary network subnet.
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you wish to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "NKP"
gitRepositorySpec:
url: https://github.com/mesosphere/NKP-catalog-applications
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach their Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Procedure
If an application fails to deploy, check the status of a HelmRelease using the following command.
kubectl -n kommander get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands.
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the UI in Kommander with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, successfully installing Kommander, and logging in
to the UI, you are ready to customize configurations using the Cluster Operations Management section of the
documentation. The majority of this customization, such as attaching clusters and deploying applications, takes
place in the NKP dashboard or UI. The Cluster Operations section allows you to manage cluster operations and
their application workloads to optimize your organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects, or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
1. Use the following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
2. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
--resource-pool <RESOURCE_POOL_NAME> \
--virtual-ip-interface <ip_interface_name> \
--vm-template <TEMPLATE_NAME> \
--kubeconfig=<management-cluster-kubeconfig-path> \
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Use HTTP or HTTPS Proxy with KIB Images on page 1076.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
You can now either attach it in the UI (see the earlier link for attaching it to a workspace through the UI), or
attach your cluster to the workspace you want using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
name: <your-managed-cluster-name>-kubeconfig
labels:
cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Procedure
Section Contents
The workflow on the left shows the creation of a base OS image in the vCenter vSphere client using inputs from
Packer. The workflow on the right shows how NKP uses that same base OS image to create CAPI-enabled VM
images for your cluster.
After creating the base image, the NKP image builder uses it to create a CAPI-enabled vSphere template that includes
the Kubernetes objects for the cluster. You can use that resulting template with the NKP create cluster command
to create the VM nodes in your cluster directly on a vCenter server. From that point, you can use NKP to provision
and manage your cluster.
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP partition.
Procedure
1. The directory structure after extraction can be accessed in subsequent steps using commands to access files from
different directories. Example: For the bootstrap cluster, change your directory to the nkp-<version> directory,
similar to the example below, depending on your current location.
cd nkp-v2.12.0
2. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
3. Execute the following command to load the air-gapped image bundle into your private registry using any of the
relevant flags to apply variables above.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the network
performance of the machine you are running the script on and the registry.
1. Load the Kommander images into your private registry using the command below to load the image bundle.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
2. Optional Step for Ultimate License to load NKP Catalog Applications images.
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: Users need to perform the steps in the topic vSphere: Creating an Image before starting this procedure.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
5. Follow the instructions to build a vSphere template below, and if applicable, set the override --overrides
overrides/offline.yaml flag described in Step 4 below.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
3. Create an image.yaml file and add the following variables for vSphere. NKP uses this file and these variables as
inputs in the next step. To customize your image.yaml file, refer to this section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer to
the example YAML files located here: OVA YAML.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-
vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: "<VSPHERE_CLUSTER_NAME>"
datacenter: "<VSPHERE_DATACENTER_NAME>"
datastore: "<VSPHERE_DATASTORE_NAME>"
folder: "<VSPHERE_FOLDER>"
insecure_connection: "false"
network: "<VSPHERE_NETWORK>"
resource_pool: "<VSPHERE_RESOURCE_POOL>"
template: "os-qualification-templates/d2iq-base-Ubuntu-20.04" # change default
value with your base template name
vsphere_guest_os_type: "other4xLinux64Guest"
guest_os_type: "ubuntu2004-64"
# goss params
distribution: "ubuntu"
distribution_version: "20.04"
# Use the following overrides to select the authentication method that can be used with the base template
# ssh_username: "" # can be exported as environment variable 'SSH_USERNAME'
# ssh_password: "" # can be exported as environment variable 'SSH_PASSWORD'
# ssh_private_key_file: "" # can be exported as environment variable 'SSH_PRIVATE_KEY_FILE'
# ssh_agent_auth: false # if set to true, ssh_password and ssh_private_key will be ignored
• Any additional configurations can be added to this command using --overrides flags as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
Tip: Now that you can see the template created in your vCenter, it is best to rename it to
nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, such as nkp-2.4.0-k8s-1.24.6-ubuntu, to
keep templates organized.
6. The next step is to deploy an NKP cluster using your vSphere template.
Procedure
• The table below identifies the current FIPS and etcd versions for this release.
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
• Name your cluster and give it a unique name suitable for your environment.
Procedure
1. Configure your cluster to use an existing local registry as a mirror when attempting to pull images. Important:
The image must be created by Konvoy Image Builder in order to use the registry mirror feature. Then use the
following command to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
3. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <CONTROL_PLANE_IP> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file </path/to/key.pub> \
--resource-pool <RESOURCE_POOL_NAME> \
--vm-template konvoy-ova-vsphere-os-release-k8s_release-vsphere-timestamp \
--virtual-ip-interface <ip_interface_name> \
--extra-sans "127.0.0.1" \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--kubernetes-version=v1.29.6+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--etcd-image-repository=docker.io/mesosphere --etcd-version=3.5.10+fips.0 \
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting
the following flag --registry-mirror-url=https://registry-1.docker.io --registry-
mirror-username= --registry-mirror-password= on the nkp create cluster command.
» Flatcar OS flag: For Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make some changes
related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
responding to ARP requests on your local network directly and giving the machine’s MAC address to clients.
• MetalLB IP address ranges or CIDR need to be within the node’s primary network subnet.
• MetalLB IP address ranges or CIDRs and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250 and
configures Layer 2 mode:
The following values are generic; enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "NKP"
gitRepositorySpec:
url: https://github.com/mesosphere/NKP-catalog-applications
ref:
tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Cluster Operations Management section of the documentation. Most of
this customization, such as attaching clusters and deploying applications, takes place in the NKP dashboard or UI.
The Cluster Operations section helps you manage cluster operations and their application workloads to optimize
your organization’s productivity.
vSphere Air-gapped FIPS: Creating Managed Clusters Using the NKP CLI
This topic explains how to continue using the CLI to create managed clusters rather than switching to the
UI dashboard.
Note: When creating Managed clusters, you do not need to create and move CAPI objects or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail
if the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Procedure
1. Configure your cluster to use an existing local registry as a mirror when attempting to pull images. IMPORTANT:
The image must be created by Konvoy Image Builder in order to use the registry mirror feature.
export REGISTRY_URL=<https/http>://<registry-address>:<registry-port>
export REGISTRY_CA=<path to the CA on the bastion>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
3. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <CONTROL_PLANE_IP> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file </path/to/key.pub> \
--resource-pool <RESOURCE_POOL_NAME> \
--vm-template konvoy-ova-vsphere-os-release-k8s_release-vsphere-timestamp \
--virtual-ip-interface <ip_interface_name> \
--extra-sans "127.0.0.1" \
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD} \
--kubernetes-version=v1.29.6+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--etcd-image-repository=docker.io/mesosphere \
--etcd-version=3.5.10+fips.0 \
--kubeconfig=<management-cluster-kubeconfig-path>
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
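If you want to watch the managed cluster come up from the management cluster, a quick status check such as the following can help. This is a sketch, not a step from the original guide: it assumes the management cluster kubeconfig is exported in KUBECONFIG and reuses the nkp describe cluster command shown elsewhere in this guide.
export KUBECONFIG=<management-cluster-kubeconfig-path>
nkp describe cluster -c ${MANAGED_CLUSTER_NAME}
# The CAPI Cluster object for the managed cluster lives in the workspace namespace.
kubectl get clusters -n ${WORKSPACE_NAMESPACE}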
Procedure
When you create a Managed Cluster with the Nutanix Kubernetes Platform (NKP) CLI, it attaches automatically to
the Management Cluster after a few moments. However, if you do not set a workspace, the attached cluster will be
created in the default workspace. To ensure that the attached cluster is created in your desired workspace namespace,
follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
You can now either attach it in the UI (see the earlier section on attaching a cluster to a workspace through the
UI) or attach your cluster to the desired workspace using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
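The intervening numbered steps are not shown in this extract. As a hedged sketch of what typically comes next, the secret is applied into the desired workspace namespace before the cluster is attached there; the file name matches the one created above:
kubectl apply -f attached-cluster-kubeconfig.yaml --namespace ${WORKSPACE_NAMESPACE}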
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Procedure
Next Step
Continue to the VMware Cloud Director Infrastructure Providers section of the Custom Install and
Infrastructure Tools chapter.
• Control plane nodes - NKP on Azure defaults to deploying a Standard_D4s_v3 virtual machine with a 128 GiB
volume for the OS and an 80GiB volume for etcd storage, which meets the above resource requirements.
• Worker nodes - NKP on Azure defaults to deploying a Standard_D8s_v3 virtual machine with an 80 GiB
volume for the OS, which meets the above resource requirements.
Section Contents
Azure Prerequisites
Before you begin using Konvoy with Azure, you must:
1. Sign in to Azure:
az login
[
{
"cloudName": "AzureCloud",
"homeTenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"id": "b1234567-abcd-11a1-a0a0-1234a5678b90",
"isDefault": true,
"managedByTenants": [],
"name": "Nutanix Developer Subscription",
"state": "Enabled",
"tenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"user": {
"name": "[email protected]",
"type": "user"
}
}
]
2. Run this command to ensure you are pointing to the correct Azure subscription ID:
az account set --subscription "Nutanix Developer Subscription"
4. Ensure you have an override file to configure specific attributes of your Azure image.
Note: The konvoy-image binary and all supporting folders are also extracted. When extracted, konvoy-image
bind mounts the current working directory (${PWD}) into the container to be used.
This procedure describes how to use the Konvoy Image Builder (KIB) to create a Cluster API compliant Azure
virtual machine image. KIB uses variable overrides to specify the base images and container images to use in your
new image.
The default Azure image is not recommended for use in production. To build a production-ready image and take
advantage of enhanced cluster operations, use KIB for Azure. Explore the Customize your Image topic for more
options.
For more information about using the image to create clusters, refer to the Azure Create a New Cluster section of
the documentation.
• Download the Konvoy Image Builder bundle for your version of NKP.
• Check the Supported Kubernetes Version for your Provider.
• Create a working Docker setup.
Extract the bundle and cd into the extracted konvoy-image-bundle-$VERSION_$OS folder. The bundled version
of konvoy-image contains an embedded docker image that contains all the requirements for building.
The konvoy-image binary and all supporting folders are also extracted. When extracted, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build azure --client-id ${AZURE_CLIENT_ID} --tenant-id ${AZURE_TENANT_ID}
--overrides override-source-image.yaml images/azure/ubuntu-2004.yaml
By default, the image builder builds in the westus2 location. To specify another location, set the --location flag
(the example below shows how to change the location to eastus):
konvoy-image build azure --client-id ${AZURE_CLIENT_ID} --tenant-id ${AZURE_TENANT_ID}
--location eastus --overrides override-source-image.yaml images/azure/centos-7.yaml
When the command is complete, the image ID is printed and written to the ./packer.pkr.hcl file. This file has
an artifact_id field whose value provides the name of the image. Specify this image ID when creating the
cluster.
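For example, you can pull that field out of the file before moving on. This is an optional helper, not a step from the original guide, and it assumes the file layout described above:
grep artifact_id ./packer.pkr.hcl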
Procedure
• To specify a specific Resource Group, Gallery, or Image Name, the following flags can be specified:
--gallery-image-locations string   a list of locations to publish the image (default same as location)
--gallery-image-name string        the gallery image name to publish the image to
--gallery-image-offer string       the gallery image offer to set (default "nkp")
--gallery-image-publisher string   the gallery image publisher to set (default "nkp")
--gallery-image-sku string         the gallery image sku to set
--gallery-name string              the gallery name to publish the image in (default "nkp")
--resource-group string            the resource group to create the image in (default "nkp")
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
Procedure
Base64 encode the Azure environment variables that you set in the Azure install prerequisites step.
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "${AZURE_SUBSCRIPTION_ID}" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "${AZURE_TENANT_ID}" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "${AZURE_CLIENT_ID}" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "${AZURE_CLIENT_SECRET}" | base64 | tr -d '\n')"
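To sanity-check that a value round-trips correctly before creating the cluster, you can decode one of the encoded variables. This is an optional check, not a step from the original guide:
echo "${AZURE_SUBSCRIPTION_ID_B64}" | base64 --decode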
Procedure
Run this command to create your Kubernetes cluster using any relevant flags.
nkp create cluster azure \
--cluster-name=${CLUSTER_NAME} \
--self-managed
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
If you want to monitor or verify the installation of your clusters, refer to the topic: Verify your Cluster and NKP
Installation
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
5. Only required if your cluster uses a custom AWS VPC and requires an internal load-balancer; set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
Most of this customization, such as attaching clusters and deploying applications, takes place in the NKP dashboard
or UI. The Day 2 section helps you manage cluster operations and their application workloads to optimize your
organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Procedure
Execute this command to create your additional Kubernetes cluster using any relevant flags. This creates a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster azure \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--namespace=${WORKSPACE_NAMESPACE} \
--additional-tags=owner=$(whoami) \
--kubeconfig=<management-cluster-kubeconfig-path>
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI (see the earlier section on attaching a cluster to a workspace through the
UI) or attach your cluster to the desired workspace using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the below
command. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally administered
by a Management Cluster, refer to Platform Expansion.
Next Step
Procedure
Note: An AKS cluster cannot be a Management or Pro cluster. Before installing NKP on your AKS cluster, first
ensure you have a Management cluster with NKP and the Kommander component installed that handles the life cycle of
your AKS cluster.
Installing Kommander requires you to have CAPI components, cert-manager, and other supporting components on a
self-managed cluster. The CAPI components mean you can control the life cycle of the cluster and other clusters.
However, because AKS is semi-managed by Azure, the AKS clusters are under Azure's control and don’t have those
components. Therefore, Kommander will not be installed.
Section Contents
AKS Installation
Nutanix Kubernetes Platform (NKP) installation on Azure Kubernetes Service (AKS).
AKS Prerequisites
Before you begin using Konvoy with AKS, you must:
1. Sign in to Azure using the command az login. For example:
[
{
"cloudName": "AzureCloud",
"homeTenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"id": "b1234567-abcd-11a1-a0a0-1234a5678b90",
"isDefault": true,
"managedByTenants": [],
"name": "Nutanix Developer Subscription",
"state": "Enabled",
"tenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"user": {
"name": "[email protected]",
"type": "user"
}
}
]
2. Create an Azure Service Principal (SP) by using the command az ad sp create-for-rbac --role
contributor --name "$(whoami)-konvoy" --scopes=/subscriptions/$(az account show --
query id -o tsv).
3. Set the Azure environment variables, including the client secret (AZURE_CLIENT_SECRET). For example:
export AZURE_CLIENT_SECRET="<azure_client_secret>"   # Z79yVstq_E.R0R7RUUck718vEHSuyhAB0C
export AZURE_CLIENT_ID="<client_id>"                 # 7654321a-1a23-567b-b789-0987b6543a21
export AZURE_TENANT_ID="<tenant_id>"                 # a1234567-b132-1234-1a11-1234a5678b90
export AZURE_SUBSCRIPTION_ID="<subscription_id>"     # b1234567-abcd-11a1-a0a0-1234a5678b90
5. Check to see what version of Kubernetes is available in your region. When deploying with AKS, you must pick
a version of Kubernetes that is available in AKS and use that version for subsequent steps. To find the list
of available Kubernetes versions in the Azure region you are using, run the following command, substituting
<your-location> for the Azure region you are deploying to:
az aks get-versions -o table --location <your-location>
6. Choose a version of Kubernetes for installation from the KubernetesVersion column of the output. The
example shows the selected version 1.29.0.
export KUBERNETES_VERSION=1.29.0
For the list of compatible supported Kubernetes versions, see Supported Kubernetes Versions.
NKP Prerequisites
Before starting the NKP installation, verify that you have:
Note: An AKS cluster cannot be a Management or Pro cluster. Before installing NKP on your AKS cluster, ensure
you have a Management cluster with NKP and the Kommander component installed, that handles the life cycle of
your AKS cluster.
• An x86_64-based Linux or macOS machine with a supported version of the operating system.
• A self-managed Azure cluster. If you used the Day 1 Basic Installation for Azure instructions, your cluster
was created using the --self-managed flag and is therefore already a self-managed cluster.
• Download the NKP binary for Linux or macOS. To check which version of NKP you installed for
compatibility reasons, run the nkp version command.
• Docker https://docs.docker.com/get-docker/ version 18.09.2 or later.
• kubectl https://kubernetes.io/docs/tasks/tools/#kubectl for interacting with the running cluster.
• The Azure CLI https://docs.microsoft.com/en-us/cli/azure/install-azure-cli.
• A valid Azure account used to sign in to the Azure CLI https://docs.microsoft.com/en-us/cli/azure/
authenticate-azure-cli?view=azure-cli-latest.
• All Resource requirements.
Note: Kommander installation requires you to have Cluster API (CAPI) components, cert-manager, etc on a self-
managed cluster. The CAPI components mean you can control the life cycle of the cluster, and other clusters. However,
because AKS is semi-managed by Azure, the AKS clusters are under Azure control and don’t have those components.
Therefore, Kommander will not be installed and these clusters will be attached to the management cluster.
To deploy a cluster with a custom image in a region where CAPI images https://cluster-api-aws.sigs.k8s.io/topics/
images/built-amis.html are not provided, you need to use Konvoy Image Builder to create your own image for the
region.
AKS best practices discourage building custom images. If the image is customized, it breaks some of the autoscaling
and security capabilities of AKS. Since custom virtual machine images are discouraged in AKS, Konvoy Image
Builder (KIB) does not include any support for building custom machine images for AKS.
Procedure
Use NKP to create a new AKS cluster
Ensure that the KUBECONFIG environment variable is set to the Management cluster by running:
export KUBECONFIG=<Management_cluster_kubeconfig>.conf
Procedure
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail
if the name has capital letters. See Kubernetes for more naming information.
Procedure
1. Set the environment variable to the name you assigned this cluster.
export CLUSTER_NAME=<aks-example>
2. Check to see what version of Kubernetes is available in your region. When deploying with AKS, you need
to declare the version of Kubernetes you want to use by running the following command, substituting <your-
location> for the Azure region you're deploying to.
az aks get-versions -o table --location <your-location>
4.
Note: Refer to the current release Kubernetes compatibility table for the correct version to use and choose an
available 1.27.x version. The version listed in the command is an example.
5. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as edits
can prevent the cluster from deploying successfully. See Customizing CAPI Clusters.
7. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command
to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
NAME READY SEVERITY REASON
SINCE MESSAGE
Cluster/aks-example True
48m
##ClusterInfrastructure - AzureManagedCluster/aks-example
##ControlPlane - AzureManagedControlPlane/aks-example
8. As they progress, the controllers also create Events. List the Events using this command.
kubectl get events | grep ${CLUSTER_NAME}
For brevity, the example uses grep. It is also possible to use separate commands to get Events for specific objects.
For example, kubectl get events --field-selector involvedObject.kind="AKSCluster" and
kubectl get events --field-selector involvedObject.kind="AKSMachine".
48m Normal SuccessfulSetNodeRefs machinepool/aks-
example-md-0 [{Kind: Namespace: Name:aks-mp6gglj-41174201-
vmss000000 UID:e3c30389-660d-46f5-b9d7-219f80b5674d APIVersion: ResourceVersion:
FieldPath:} {Kind: Namespace: Name:aks-mp6gglj-41174201-vmss000001 UID:300d71a0-
f3a7-4c29-9ff1-1995ffb9cfd3 APIVersion: ResourceVersion: FieldPath:} {Kind:
Namespace: Name:aks-mp6gglj-41174201-vmss000002 UID:8eae2b39-a415-425d-8417-
d915a0b2fa52 APIVersion: ResourceVersion: FieldPath:} {Kind: Namespace: Name:aks-
mp6gglj-41174201-vmss000003 UID:3e860b88-f1a4-44d1-b674-a54fad599a9d APIVersion:
ResourceVersion: FieldPath:}]
6m4s Normal AzureManagedControlPlane available azuremanagedcontrolplane/
aks-example successfully reconciled
48m Normal SuccessfulSetNodeRefs machinepool/aks-
example [{Kind: Namespace: Name:aks-mp6gglj-41174201-
vmss000000 UID:e3c30389-660d-46f5-b9d7-219f80b5674d APIVersion: ResourceVersion:
FieldPath:} {Kind: Namespace: Name:aks-mp6gglj-41174201-vmss000001 UID:300d71a0-
f3a7-4c29-9ff1-1995ffb9cfd3 APIVersion: ResourceVersion: FieldPath:} {Kind:
Namespace: Name:aks-mp6gglj-41174201-vmss000002 UID:8eae2b39-a415-425d-8417-
d915a0b2fa52 APIVersion: ResourceVersion: FieldPath:}]
Procedure
1. Get a kubeconfig file for the workload cluster. When the workload cluster is created, the cluster life cycle
services generate a kubeconfig file for the workload cluster and write it to a Secret. The kubeconfig file
Note:
It might take a few minutes for the Status to move to Ready while the Pod network is deployed. The
Node Status will change to Ready soon after the calico-node DaemonSet Pods are Ready.
3. List the Pods using the command kubectl --kubeconfig=${CLUSTER_NAME}.conf get --all-
namespaces pods.
Example output:
NAMESPACE       NAME                                       READY   STATUS    RESTARTS   AGE
calico-system   calico-kube-controllers-5dcd4b47b5-tgslm   1/1     Running   0          3m58s
calico-system   calico-node-46dj9                          1/1     Running   0          3m58s
calico-system   calico-node-crdgc                          1/1     Running   0          3m58s
calico-system   calico-node-m7s7x                          1/1     Running   0          3m58s
calico-system   calico-node-qfkqc                          1/1     Running   0          3m57s
calico-system   calico-node-sfqfm                          1/1     Running   0          3m57s
calico-system   calico-node-sn67x                          1/1     Running   0          3m53s
calico-system   calico-node-w2pvt                          1/1     Running   0          3m58s
calico-system   calico-typha-6f7f59969c-5z4t5              1/1     Running   0          3m51s
calico-system   calico-typha-6f7f59969c-ddzqb              1/1     Running   0          3m58s
calico-system   calico-typha-6f7f59969c-rr4lj              1/1     Running   0          3m51s
kube-system     azure-ip-masq-agent-4f4v6                  1/1     Running   0          4m11s
kube-system     azure-ip-masq-agent-5xfh2                  1/1     Running   0          4m11s
kube-system     azure-ip-masq-agent-9hlk8                  1/1     Running   0          4m8s
kube-system     azure-ip-masq-agent-9vsgg                  1/1     Running   0          4m16s
Note: Ensure that the KUBECONFIG environment variable is set to the Management cluster:
export KUBECONFIG=<Management_cluster_kubeconfig>.conf
Procedure
1. Ensure you are connected to your AKS clusters. Enter the following commands for each of your clusters.
kubectl config get-contexts
kubectl config use-context <context for first aks cluster>
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
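# The heredoc body is not shown in this extract. The lines below are a minimal sketch of a
# kubeconfig that uses the variables exported above; the field layout follows the standard
# kubeconfig structure and is an assumption, not the exact file from the original guide.
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
  context:
    cluster: ${CURRENT_CONTEXT}
    user: kommander-cluster-admin
    namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
  cluster:
    certificate-authority-data: ${CLUSTER_CA}
    server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
  user:
    token: ${USER_TOKEN_VALUE}
EOF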
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify the
kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
From the top menu bar, select your target workspace.
a. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
b. Select Attach Cluster.
c. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
stop following the steps below and see the Attach a cluster WITH network restrictions.
d. Upload the kubeconfig file you created in the previous section (or copy its contents) into the Cluster
Configuration section.
e. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig. You can edit
this field using the name you want for your cluster.
f. Add labels to classify your cluster as needed.
g. Select Create to attach your cluster.
Next Step
• Control plane nodes - NKP on GCP defaults to deploying an n2-standard-4 instance with an 80GiB root
volume for control plane nodes, which meets the above requirements.
• Worker nodes - NKP on GCP defaults to deploying a n2-standard-8 instance with an 80GiB root volume for
worker nodes, which meets the above requirements.
GCP Installation
This installation provides instructions to install NKP in a GCP non-air-gapped environment.
Remember, there are always more options for custom YAML in the Custom Installation and Additional
Infrastructure Tools section, but this will get you operating with basic features.
If not already done, see the documentation for:
GCP Prerequisites
Verify that your Google Cloud project does not have the Enable OS Login feature enabled.
Note:
The Enable OS Login feature is sometimes enabled by default in GCP projects. If the OS login feature is
enabled, KIB will not be able to ssh to the VM instances it creates and will not be able to create an image
successfully.
To check if it is enabled, use the commands on this page Set and remove custom metadata | Compute
Engine Documentation | Google Cloud to inspect the metadata configured in your project. If you find
the enable-oslogin flag set to TRUE, you must remove it (or set it to FALSE) to use KIB.
The user creating the Service Accounts needs privileges in addition to the Editor role.
1. Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS.
» Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
» Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
5. On Debian-based Linux distributions, install a version of the cri-tools package known to be compatible with both
the Kubernetes and container runtime versions.
6. Note: The Enable OS Login feature is sometimes enabled by default in GCP projects. If the OS login feature
is enabled, KIB will not be able to ssh to the VM instances it creates and will not be able to create an image
successfully.
To check if it is enabled, use the commands on this page Set and remove custom metadata |
Compute Engine Documentation | Google Cloud to inspect the metadata configured in your project.
If you find the enable-oslogin flag set to TRUE, you must remove (or set it to FALSE) to use KIB
successfully.
Verify that your Google Cloud project does not have the Enable OS Login feature enabled. See below for more
information.
Procedure
1. If you are creating your image on either a non-GCP instance or one that does not have the required roles, you must
either:
2. Export the static credentials that will be used to create the cluster.
export GCP_B64ENCODED_CREDENTIALS=$(base64 < "${GOOGLE_APPLICATION_CREDENTIALS}" | tr
-d '\n')
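The konvoy-image build invocation itself is not shown in this extract. As a hedged sketch modeled on the Azure example earlier in this chapter, a GCP build typically looks like the following; the image YAML path and the --project-id flag are assumptions about your KIB bundle and project setup:
konvoy-image build gcp --project-id ${GCP_PROJECT} images/gcp/ubuntu-2004.yaml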
2. KIB will run and print out the name of the created image; you will use this name when creating a Kubernetes
cluster. See the sample output below.
Note: Ensure you have named the correct YAML file for your OS in the konvoy-image build command.
...
==> ubuntu-2004-focal-v20220419: Deleting instance...
ubuntu-2004-focal-v20220419: Instance has been deleted!
==> ubuntu-2004-focal-v20220419: Creating image...
==> ubuntu-2004-focal-v20220419: Deleting disk...
ubuntu-2004-focal-v20220419: Disk has been deleted!
==> ubuntu-2004-focal-v20220419: Running post-processor: manifest
Build 'ubuntu-2004-focal-v20220419' finished after 7 minutes 46 seconds.
3. To find a list of images you have created in your account, run the following command.
gcloud compute images list --no-standard-images
Related Information
Procedure
• To use a local registry, even in a non-air-gapped environment, download and extract the Complete NKP
Air-gapped Bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load the
registry. See Downloading NKP on page 16.
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
Procedure
1. Create an image using Konvoy Image Builder (KIB) and then export the image name.
export IMAGE_NAME=projects/${GCP_PROJECT}/global/images/<image_name_from_kib>
2. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
If you need to change the Kubernetes subnets, you must do this at cluster creation. The default subnets used in
NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
» (Optional) Modify Control Plane Audit logs - Users can make modifications to the KubeadmControlplane
cluster-api object to configure different kubelet options. See the following guide if you wish to configure
your control plane beyond the existing options that are available from flags.
» (Optional) Determine what VPC Network to use. All GCP accounts come with a preconfigured VPC Network
named default, which will be used if you do not specify a different network. To use a different VPC network
for your cluster, create one by following these instructions for Create and Manage VPC Networks. Then
specify the --network <new_vpc_network_name> option on the create cluster command below. More
information is available on GCP Cloud Nat and network flag.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
If you want to monitor or verify the installation of your clusters, refer to the topic: Verify your Cluster and NKP
Installation
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install
command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Customizations page for customization options. Some options include Custom Domains and
Certificates, HTTP proxy, and External Load Balancer.
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see the topic Configuring NKP
Catalog Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach the Ready condition, eventually resulting in output
resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in
progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are
ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation.
Most of this customization, such as attaching clusters and deploying applications, takes place in the NKP dashboard
or UI. The Day 2 section helps you manage cluster operations and their application workloads to optimize your
organization’s productivity.
Note: When creating Managed clusters, you do not need to create and move CAPI objects or install the
Kommander component. Those tasks are only done on Management clusters!
Your new managed cluster needs to be part of a workspace under a management cluster. To make the new
managed cluster a part of a Workspace, set that workspace environment variable.
Procedure
1. If you have an existing Workspace name, run this command to find the name.
kubectl get workspace -A
2. When you have the Workspace name, set the WORKSPACE_NAMESPACE environment variable.
export WORKSPACE_NAMESPACE=<workspace_namespace>
Note: If you need to create a new Workspace, follow the instructions to Create a New Workspace
Note: The cluster name might only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
When specifying the cluster-name, you must use the same cluster-name as used when defining your
inventory objects.
Procedure
Note: Google Cloud Platform does not publish images. You must first build the image using Konvoy Image
Builder.
Procedure
1. Create an image using Konvoy Image Builder (KIB) and then export the image name.
export IMAGE_NAME=projects/${GCP_PROJECT}/global/images/<image_name_from_kib>
2. Execute this command to create your additional Kubernetes cluster using any relevant flags. This will create a new
non-self-managed cluster that can be managed by the management cluster you created in the previous section.
nkp create cluster gcp \
--cluster-name=${MANAGED_CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--namespace ${WORKSPACE_NAMESPACE} \
--project=${GCP_PROJECT} \
--image=${IMAGE_NAME} \
--kubeconfig=<management-cluster-kubeconfig-path>
Tip: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
Procedure
When you create a Managed Cluster with the NKP CLI, it attaches automatically to the Management Cluster after a
few moments. However, if you do not set a workspace, the attached cluster will be created in the default workspace.
To ensure that the attached cluster is created in your desired workspace namespace, follow these instructions:
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command.
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} >
${MANAGED_CLUSTER_NAME}.conf
3. Note: This is only necessary if you never set the workspace of your cluster upon creation.
You can now either attach it in the UI (see the earlier section on attaching a cluster to a workspace through the
UI) or attach your cluster to the desired workspace using the CLI.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-
template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>
type: cluster.x-k8s.io/secret
data:
  value: <value-you-copied-from-secret-above>
10. You can now view this cluster in your Workspace in the UI, and you can confirm its status by running the
command below. It may take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally
administered by a Management Cluster, refer to Platform Expansion.
Next Step
Procedure
Operations
You can manage your cluster and deployed applications using platform applications.
After you deploy an NKP cluster and the platform applications you want to use, you are ready to begin managing
cluster operations and their application workloads to optimize your organization’s productivity.
In most cases, a production cluster requires additional advanced configuration tailored for your environment, ongoing
maintenance, authentication and authorization, and other common activities. For example, it is important to monitor
cluster activity and collect metrics to ensure application performance and response time, evaluate network traffic
patterns, manage user access to services, and verify workload distribution and efficiency.
In addition to the configurations, you can also control the appearance of your NKP UI by adding banners and footers.
There are different options available depending on the NKP level that you license and install.
Workspace: Manages access to clusters in a specific workspace, for example, in the scope of multi-tenancy. See
Multi-Tenancy in NKP on page 421.
• Creates namespaced Roles on the management cluster in the workspace namespace.
• Federates ClusterRoles on all target clusters in the workspace.
Project: Manages access for clusters in a specific project, for example, in the scope of multi-tenancy. See
Multi-Tenancy in NKP on page 421.
• Creates namespaced Roles on the management cluster in the project namespace.
• Federates namespaced Roles on all target clusters in the project in the project namespace.
Creating the role bindings for each level and type creates RoleBindings or ClusterRoleBindings on the clusters
that apply to each category.
This approach gives you maximum flexibility over who has access to what resources, conveniently mapped to your
existing identity providers’ claims.
Groups
Access control groups are configured in the Groups tab of the Identity Providers page.
You can map group and user claims made by your configured identity providers to Kommander groups by selecting
Administration > Identity Providers in the left sidebar at the global workspace level, and then selecting the Groups tab.
Roles
ClusterRoles are named collections of rules defining which verbs can be applied to what resources.
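For example, a minimal ClusterRole granting read-only access to Pods looks like the following. This is a generic Kubernetes RBAC illustration, not a role shipped with NKP; the name pod-reader is arbitrary:
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
EOF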
Procedure
2. To prevent propagation of the kommander-workspace-view role, remove this annotation from the
KommanderWorkspaceRole resource.
kubectl annotate kommanderworkspacerole -n <WORKSPACE_NAMESPACE> kommander-workspace-view workspace.kommander.mesosphere.io/sync-to-project-
3. To enable propagation of the role, add this annotation to the relevant KommanderWorkspaceRole resource.
kubectl annotate kommanderworkspacerole -n <WORKSPACE_NAMESPACE> kommander-workspace-view workspace.kommander.mesosphere.io/sync-to-project=true
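To confirm which annotations are currently set on the resource, you can inspect it. This is an optional check, not a step from the original guide:
kubectl get kommanderworkspacerole -n <WORKSPACE_NAMESPACE> kommander-workspace-view -o yaml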
Role Bindings
Kommander role bindings, cluster role bindings, and project role bindings bind a Kommander group to any number of
roles. All groups defined in the Groups tab are present at the global, workspace, or project levels and are ready for
you to assign roles to them.
Groups
If your external identity provider supports group claims, you can also bind groups to roles. To make the engineering
LDAP group administrators of the production namespace, bind the group to the admin role:
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: engineering-admin
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: oidc:engineering
EOF
NKP UI Authorization
The NKP UI and other HTTP applications protected by Kommander forward authentication are also authorized
by the Kubernetes RBAC API. In addition to the Kubernetes API resources, it is possible to define rules that
map to HTTP URIs and HTTP verbs. Kubernetes RBAC refers to these as nonResourceURLs; Kommander forward
authentication uses these rules to grant or deny access to HTTP endpoints.
Default Roles
Roles are created to grant access to the dashboard and select applications that expose an HTTP server
through the ingress controller. The cluster-admin role is a system role that grants permission to all
actions (verbs) on any resource, including non-resource URLs. The default dashboard user is bound to this
role.
Note: Granting the user administrator privileges on /nkp/* grants admin privileges to all sub-resources, even if
bindings exist for sub-resources with fewer privileges.
User
To grant the user [email protected] administrative access to all Kommander resources, bind the user to the nkp-
admin role:
cat << EOF | kubectl apply -f -
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nkp-admin-mary
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nkp-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: [email protected]
EOF
If you inspect the role, you can see what access is now granted:
kubectl describe clusterroles nkp-admin
Name: nkp-admin
Labels: app.kubernetes.io/instance=kommander
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/version=v2.0.0
helm.toolkit.fluxcd.io/name=kommander
helm.toolkit.fluxcd.io/namespace=kommander
rbac.authorization.k8s.io/aggregate-to-admin=true
Annotations: meta.helm.sh/release-name: kommander
meta.helm.sh/release-namespace: kommander
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
[/nkp/*] [] [delete]
[/nkp] [] [delete]
[/nkp/*] [] [get]
[/nkp] [] [get]
[/nkp/*] [] [head]
[/nkp] [] [head]
[/nkp/*] [] [post]
[/nkp] [] [post]
[/nkp/*] [] [put]
[/nkp] [] [put]
The user can now use the HTTP verbs HEAD, GET, DELETE, POST, and PUT when accessing any URL at or under
/nkp. The downstream application follows REST conventions, so this effectively grants read, edit, and delete
privileges.
Note: To enable users to access the NKP UI, ensure they have the appropriate nkp-kommander role and the
Kommander roles granted in the NKP UI.
The following example allows members of logging-ops to view all the resources under /nkp and edit all the
resources under /nkp/logging/grafana.
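As an illustration of what such a role amounts to in Kubernetes RBAC terms, the following is a hedged sketch created with kubectl rather than the UI procedure below. The role and group names (logging-ops-nkp, oidc:logging-ops) and the exact verb mapping are assumptions, not values from this guide:
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: logging-ops-nkp
rules:
# Read-only access to everything under /nkp.
- nonResourceURLs: ["/nkp", "/nkp/*"]
  verbs: ["get", "head"]
# Full access to the Grafana logging dashboard.
- nonResourceURLs: ["/nkp/logging/grafana", "/nkp/logging/grafana/*"]
  verbs: ["get", "head", "post", "put", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: logging-ops-nkp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: logging-ops-nkp
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: oidc:logging-ops
EOF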
Procedure
2. Select the Cluster Roles tab, and then select + Create Role .
3. Enter a descriptive name for the role and ensure that Cluster Role is selected as the type.
Kubernetes Dashboard
The Kubernetes dashboard displays information that offloads authorization directly to the Kubernetes API server.
Once authenticated, all users may access the dashboard at /nkp/kubernetes/ without needing an nkp role.
However, the cluster RBAC policy protects access to the underlying Kubernetes resources exposed by the dashboard.
This topic describes some basic examples of operations that provide the building blocks for creating an access control
policy. For more information about creating your roles and advanced policies, see Using RBAC Authorization in the
Kubernetes documentation at https://kubernetes.io/docs/reference/access-authn-authz/rbac/. For information
on adding a user to a cluster as an administrator, see Onboarding a User to an NKP Cluster on page 348.
• https://dexidp.io/docs/connectors/oidc/
• https://dexidp.io/docs/connectors/saml/
• https://dexidp.io/docs/connectors/github/
To onboard a user:
Procedure
2. Add the connector and use the command kubectl apply -f ldap.yaml.
The following output is displayed.
secret/ldap-password created
connector.dex.mesosphere.io/ldap created
3. Add the appropriate role bindings and name the file new_user.yaml.
See the following examples for both Single User and Group Bindings.
4. Add the role binding(s) using the command kubectl apply -f new_user.yaml.
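A minimal sketch of what new_user.yaml can contain follows. It shows one single-user binding and one group binding, both to the nkp-admin role used earlier in this topic; the binding names, user email, and group name are placeholders.
# new_user.yaml -- illustrative single-user and group bindings
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nkp-admin-single-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nkp-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  # The user email reported by your identity provider.
  name: [email protected]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nkp-admin-group
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nkp-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  # A group name from your LDAP directory or identity provider.
  name: engineering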
Identity Providers
You can grant access to users in your organization.
NKP supports GitHub Identity Provider Configuration on page 351, Adding an LDAP Connector on
page 353, SAML, and standard OIDC identity providers such as Google. These identity management providers
support the login and authentication process for NKP and your Kubernetes clusters.
You can configure as many identity providers as you want, and users can select from any method when logging in. If
you have multiple workspaces in your environment, you can use a single identity provider to manage access for all of
them or choose to configure an identity provider per workspace.
Configuring a dedicated identity provider per workspace can be useful if you want to keep access to your
workspaces separate. In this case, users of a specific workspace have a dedicated login page with the identity
provider options configured for their workspace. This setup is particularly helpful if you have multiple tenants.
For more information, see Multi-Tenancy in NKP on page 421.
Access Limitations
• The GitHub provider allows you to specify any organizations and teams that are eligible for access.
• The LDAP provider allows you to configure search filters for either users or groups.
• The OIDC provider cannot limit users based on identity.
• The SAML provider allows users to log in using a single sign-on (SSO).
Procedure
1. Log into the Kommander UI. See Logging In To the UI on page 74.
7. Select the target workspace for the identity provider and complete the fields with the relevant details.
Note: You can configure an identity provider globally for your entire organization using the All Workspaces
option, or per workspace, enabling multi-tenancy.
8. Click Save.
To authorize all developers to access your clusters using their GitHub credentials, set up GitHub as an
identity provider login option.
Procedure
1. Start by creating a new OAuth Application in your GitHub organization by completing the registration form. To
view the form, see https://github.com/settings/applications/new.
4. In the Authorization callback URL field, enter your cluster URL followed by /dex/callback.
7. Log in to your NKP UI from the top menu bar, and select the Global workspace.
9. Select the Identity Providers tab and then click Add Identity Provider .
10. Select GitHub as the identity provider type, and select the target workspace.
11. Copy the Client ID and Client Secret values from GitHub into this form.
12. To configure dex to load all the groups configured in the user's GitHub identity, select the Load All Groups
checkbox.
This allows you to configure group-specific access to NKP and Kubernetes resources.
Note: Do not select the Enable Device Flow checkbox before selecting Register the Application.
You can map the identity provider groups to the Kubernetes groups.
Procedure
1. In the NKP UI, select the Groups tab from the Identity Provider screen, and then click Create Group.
3. Add the groups or teams from your GitHub provider under Identity Provider Groups.
For more information on finding the teams to which you are assigned in GitHub, see the Changing team visibility
section at https://docs.github.com/en/organizations/organizing-members-into-teams/changing-team-
visibility.
4. Click Save.
After defining a group, bind one or more roles to this group. This topic describes how to bind the group to
the View Only role.
Procedure
1. In the NKP UI, from the top menu bar, select Global or the target workspace.
2. Select the Cluster Role Bindings tab and then select Add roles.
3. Select View Only role from the Roles dropdown list and select Save.
For more information on granting users access to Kommander paths on your cluster, see Access to Kubernetes
and Kommander Resources on page 342.
When you check your attached clusters and log in as a user from one of the matched groups, every resource is listed,
but you cannot delete or edit them.
Each LDAP directory is set up differently, so these steps are important. Add the LDAP authentication
mechanism using the CLI or UI.
Note: This topic does not cover all possible configurations. For more information, see Dex LDAP connector reference
documentation on GitHub at https://github.com/dexidp/dex/blob/v2.22.0/Documentation/connectors/
ldap.md.
Procedure
Choose whether to establish an external LDAP globally or for a specific workspace.
» Global LDAP - identity provider serves all workspaces: Create and apply the following objects:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: ldap-password
  namespace: kommander
type: Opaque
stringData:
  password: password
---
apiVersion: dex.mesosphere.io/v1alpha1
kind: Connector
metadata:
  name: ldap
  namespace: kommander
spec:
  enabled: true
Note: The value of the LDAP connector spec:displayName (here, LDAP Test) appears on the Login button
for this identity provider in the NKP UI. Choose a name that is meaningful to your users.
» Workspace LDAP - identity provider serves a specific workspace: Create and apply the following objects:
Note: Establish LDAP for a specific workspace if you are operating with multiple tenants.
1. Obtain the workspace name for which you are establishing an LDAP authentication server.
kubectl get workspaces
Note down the value under the WORKSPACE NAMESPACE column.
2. Set the WORKSPACE_NAMESPACE environment variable to that namespace.
export WORKSPACE_NAMESPACE=<your-namespace>
Note: The value of the LDAP connector spec:displayName (here, LDAP Test) appears
on the Login button for this identity provider in the NKP UI. Choose a name that is meaningful to your users.
Procedure
Note: After LDAP authentication is enabled, you must configure additional access rights using the
Add Identity Provider page in the UI.
Procedure
1. Complete the steps in Generating a Dedicated Login URL for Each Tenant on page 423.
Note: After LDAP authentication is enabled, you must configure additional access rights using the
Add Identity Provider page in the UI.
LDAP Troubleshooting
If the Dex LDAP connector configuration is incorrect, debug the problem, and iterate on it. The Dex log
output contains helpful error messages, as indicated in the following examples:
Procedure
1. Use the kubectl logs -f dex-66675fcb7c-snxb8 -n kommander command to retrieve the Dex logs.
You may see an error similar to the following example:
error parse config file /etc/dex/cfg/config.yaml: error unmarshaling JSON: parse
connector config: illegal base64 data at input byte 0
2. If Dex does not start up correctly, https://<YOUR-CLUSTER-HOST>/token displays a 5xx HTTP error response
after timing out.
Most problems with the Dex LDAP connector configuration become apparent only after a login attempt. A login that
fails from misconfiguration results in an error displaying only Internal Server Error and Login error. You
can find the root cause by reading the Dex log, as shown in the following example.
kubectl logs -f dex-5d55b6b94b-9pm2d -n kommander
You can look for output similar to this example.
[...]
time="2019-07-29T13:03:57Z" level=error msg="Failed to login user: failed to connect:
LDAP Result Code 200 \"Network Error\": dial tcp: lookup freeipa.example.com on
10.255.0.10:53: no such host"
Here, the directory’s DNS name was misconfigured, which should be easy to address.
A more difficult problem occurs when a login through Dex using LDAP fails because Dex cannot find the specified
user unambiguously in the directory. This is the result of an invalid LDAP user search configuration. Here's an
example error message from the Dex log:
time="2019-07-29T14:21:27Z" level=info msg="performing ldap search
cn=users,cn=compat,dc=demo1,dc=freeipa,dc=org sub (&(objectClass=posixAccount)
(uid=employee))"
time="2019-07-29T14:21:27Z" level=error msg="Failed to login user: ldap: filter
returned multiple (2) results: \"(&(objectClass=posixAccount)(uid=employee))\""
Solving problems like this requires you to review the directory structures carefully. Directory structures can be very
different between different LDAP setups. You must carefully assemble a user search configuration matching the
directory structure.
For comparison, here are some sample log lines issued by Dex for a successful login:
time="2019-07-29T15:35:51Z" level=info msg="performing ldap search
cn=accounts,dc=demo1,dc=freeipa,dc=org sub (&(objectClass=posixAccount)
(uid=employee))"
time="2019-07-29T15:35:52Z" level=info msg="username \"employee\" mapped to entry
uid=employee,cn=users,cn=accounts,dc=demo1,dc=freeipa,dc=org"
time="2019-07-29T15:35:52Z" level=info msg="login successful: connector \"ldap\",
username=\"\", email=\"[email protected]\", groups=[]"
Enabling the Konvoy Async Plugin configures the plugin so that authentication is routed through Dex's OIDC flow
and the token is generated automatically. When the plugin is enabled, the user is routed to an additional login
procedure for authentication, but they no longer have to generate a token manually in the UI.
The instructions for either generating a token manually or enabling the Konvoy Async Plugin differ slightly
depending on whether you configured the identity provider globally for all workspaces or individually for a single
workspace.
4. Log in again.
Procedure
Warning: If you choose Method 2, the Set profile name field is not optional if you have multiple clusters in
your environment. Ensure you change the name of the profile for each cluster for which you want to generate a
kubeconfig file. Otherwise, all clusters will use the same token, which makes cluster authentication vulnerable
and can let users access clusters for which they do not have authorization.
Procedure
1. Open the login link you obtained from the global administrator, which they generated for your workspace or
tenant.
3. If there are several clusters in the workspace, select the cluster for which you want to generate a token.
Procedure
1. Open the login link you obtained from the global administrator, which they generated for your workspace or
tenant.
Warning: If you choose Method 2, the Set profile name field is not optional if you have multiple clusters in
your environment. Ensure you change the name of the profile for each cluster for which you want to generate a
kubeconfig file. Otherwise, all clusters will use the same token, which makes cluster authentication vulnerable
and can let users access clusters for which they do not have authorization.
Infrastructure Providers
Infrastructure providers, such as AWS, Azure, and vSphere, provide the infrastructure for your
Management clusters. You may have many accounts for a single infrastructure provider. To automate
cluster provisioning, NKP needs authentication keys for your preferred infrastructure provider.
To provision new clusters and manage them in the NKP UI, NKP also needs infrastructure provider credentials.
Currently, you can create infrastructure provider records for:
• AWS: Creating an AWS Infrastructure Provider with a User Role on page 360
• Azure: Creating an Azure Infrastructure Provider in the UI on page 372
• vSphere: Creating a vSphere Infrastructure Provider in the UI on page 373
Infrastructure provider credentials are configured in each workspace. The name you assign must be unique across all
the other namespaces in your cluster.
Procedure
• AWS:
  • Create an AWS Infrastructure Provider in the NKP UI.
  • Create a managed AWS cluster in the NKP UI.
• Azure:
  • Create an Azure Infrastructure Provider in the NKP UI.
  • Create a managed Azure cluster in the NKP UI.
• vSphere:
  • Create a vSphere Infrastructure Provider in the NKP UI.
  • Create a managed cluster on vSphere in the NKP UI.
• VMware Cloud Director (VCD):
  • Create a VMware Cloud Director Infrastructure Provider in the NKP UI.
  • Create a managed cluster on VCD in the NKP UI.
Procedure
To delete an infrastructure provider, first delete all the clusters created with that infrastructure provider.
This ensures that NKP has access to your infrastructure provider to remove all the resources created for a managed
cluster.
Important: Nutanix recommends using the role-based method as this is more secure.
Note: The role authentication method can only be used if your management cluster is running in AWS.
For more flexible credential configuration, we offer a role-based authentication method with an optional External
ID for third party access. For more information, see the IAM roles for Amazon EC2 in the AWS documentation at
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html.
Procedure
9. If you want to share the role with a third party, add an External ID. External IDs prevent your roles from being
assumed accidentally. For more information, see How to use an external ID when granting access to your
AWS resources to a third party in the AWS documentation at https://docs.aws.amazon.com/IAM/latest/
UserGuide/id_roles_create_for-user_externalid.html.
Create a role manually before configuring an AWS Infrastructure Provider with a User Role. The role requires permissions to manage the following resources:
• EC2 Instances
• VPC
• Subnets
• Elastic Load Balancer (ELB)
• Internet Gateway
• NAT Gateway
• Elastic Block Storage (EBS) Volumes
• Security Groups
• Route Tables
• IAM Roles
Procedure
1. The user you delegate from your role must have a minimum set of permissions. The following snippet is the
minimal IAM policy required.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AllocateAddress",
        "ec2:AssociateRouteTable",
        "ec2:AttachInternetGateway",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateInternetGateway",
2. Replace YOURACCOUNTRESTRICTION with the AWS Account ID that you want AssumeRole from.
Note: Never add a * wildcard here. This opens your account to the public.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com",
        "AWS": "arn:aws:iam::YOURACCOUNTRESTRICTION:root"
3. To use the role created, attach the following policy to the role which is already attached to your managed or
attached cluster. Replace YOURACCOUNTRESTRICTION with the AWS Account ID where the role AssumeRole is
saved. Also, replace THEROLEYOUCREATED with the AWS Role name.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeRoleKommander",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::YOURACCOUNTRESTRICTION:role/THEROLEYOUCREATED"
    }
  ]
}
Procedure
1. In NKP, select the workspace associated with the credentials that you are adding.
2. Navigate to Administration > Infrastructure Providers, and click Add Infrastructure Provider .
6. Enter the access key ID and secret key using the keys generated above.
You can use an existing AWS user with the credentials configured. The user requires permissions to manage the following resources:
• EC2 Instances
• VPC
• Subnets
• Elastic Load Balancer (ELB)
• Internet Gateway
• NAT Gateway
• Elastic Block Storage (EBS) Volumes
• Security Groups
• Route Tables
• IAM Roles
Procedure
The following is the minimal IAM policy required.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AllocateAddress",
        "ec2:AssociateRouteTable",
        "ec2:AttachInternetGateway",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateInternetGateway",
        "ec2:CreateNatGateway",
        "ec2:CreateRoute",
        "ec2:CreateRouteTable",
        "ec2:CreateSecurityGroup",
        "ec2:CreateSubnet",
        "ec2:CreateTags",
        "ec2:CreateVpc",
        "ec2:ModifyVpcAttribute",
        "ec2:DeleteInternetGateway",
        "ec2:DeleteNatGateway",
        "ec2:DeleteRouteTable",
        "ec2:DeleteSecurityGroup",
        "ec2:DeleteSubnet",
        "ec2:DeleteTags",
        "ec2:DeleteVpc",
        "ec2:DescribeAccountAttributes",
        "ec2:DescribeAddresses",
        "ec2:DescribeAvailabilityZones",
        "ec2:DescribeInstances",
Procedure
5. If you are already in a workspace, the provider is automatically created in that workspace.
• Copy the id output from the login command above and paste it into the Subscription ID field.
• Copy the tenant used in step 2 and paste it into the Tenant ID field.
• Copy the appId used in step 2 and paste it into the Client ID field.
• Copy the password used in step 2 and paste it into the Client Secret field.
9. Click Save.
Procedure
1. Log in to your NKP Ultimate UI and access the NKP home page.
4. If you are already in a workspace, the new infrastructure provider is automatically created in that workspace.
6. Click Save.
• Complete the VMware Cloud Director Prerequisites on page 912 for the VMware Cloud Director.
• Create a VCD infrastructure provider to contain your credentials.
6. Specify a Site URL value, which must begin with https://. For example, "https://vcd.example.com".
Do not use a trailing forward slash character.
Warning: Make a note of the Refresh Token, as it displays only one time and cannot be retrieved
afterward.
Note: Editing a VCD infrastructure provider means that you are changing the credentials under which NKP
connects to VMware Cloud Director. This can have negative effects on any existing clusters that use that
infrastructure provider record.
To prevent errors, NKP first checks if there are any existing clusters for the selected infrastructure
provider. If a VCD infrastructure provider has existing clusters, NKP displays an error message and
prevents you from editing the infrastructure provider.
Procedure
2. Select a general color range, and then select a specific shade or tint. The color input uses the style of your browser
for its color selection tool.
3. Select the eyedropper, move it to a sample of the color you want, and click once to select that color.
Procedure
2. Select a general color range from the slider bar, and then select a specific shade or tint with your mouse cursor.
3. Select the eyedropper, move it to a sample of the color you want, and click once to select that color.
Adding Your Organization’s Logo Using the Drag and Drop Option
When you license and install NKP Ultimate or Gov Advanced, you also have the option to add your
organization’s logo to the header. The width of the header banner automatically adjusts to contain your
logo. NKP automatically places your logo on the left side of the header and centers it vertically.
Note: To provide security against certain kinds of malicious activity, your browser has a same-origin policy for
accessing resources. When you upload a file, the browser creates a unique identifier for the file. This prevents you from
selecting a file more than once.
Procedure
1. Locate the required file in the MacOS Finder or Windows File Explorer.
2. Drag and drop an image of the appropriate file type into the shaded area to see a preview of the image and display
the file name.
You can select X on the upper-right or Remove on the lower-right to clear the image, if needed.
3. Click Save.
Warning: You cannot select a file for drag-and-drop if it does not have a valid image format.
Procedure
3. Click Save.
Procedure
1. Provide the name of a ConfigMap with the custom configuration in the AppDeployment.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kube-prometheus-stack
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: kube-prometheus-stack-46.8.0
    kind: ClusterApp
  configOverrides:
    name: kube-prometheus-stack-overrides-attached
EOF
2. Create the ConfigMap with the name provided in the previous step, which provides the custom configuration on
top of the default configuration.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: kube-prometheus-stack-overrides-attached
data:
  values.yaml: |
    prometheus:
      prometheusSpec:
        storageSpec:
          volumeClaimTemplate:
            spec:
Procedure
You can run the following commands to review AppDeployments.
» All AppDeployments in a workspace: To review the state of the AppDeployment resources for a specific
workspace, run the get command with the name of your workspace. Here's an example:
nkp get appdeployments -w kommander-workspace
The output displays a list of all your applications:
NAME APP CLUSTERS
[...]
kube-oidc-proxy kube-oidc-proxy-0.3.2 host-cluster
kube-prometheus-stack kube-prometheus-stack-46.8.0 host-cluster
kubecost kubecost-0.35.1 host-cluster
[...]
Note: For more information on how to create, or get an AppDeployment, see the CLI documentation.
Deployment Scope
In a single-cluster environment with a Starter license, AppDeployments enable customizing any platform
application. In a multi-cluster environment with a Starter license, AppDeployments enable workspace-level,
project-level, and per-cluster deployment and customization of workspace applications.
Note: When configuring storage for logging-operator-logging-overrides, ensure that you create a
ConfigMap in your workspace namespace for every cluster in that workspace.
extract_kubernetes_labels: true
configure_kubernetes_labels: true
buffer:
  disabled: true
  retry_forever: false
  retry_max_times: 5
  flush_mode: interval
  flush_interval: 10s
  flush_thread_count: 8
extra_labels:
  log_source: kubernetes_container
fluentbit:
  inputTail:
    Mem_Buf_Limit: 512MB
fluentd:
  bufferStorageVolume:
    emptyDir:
      medium: Memory
  disablePvc: true
  scaling:
    replicas: 10
  resources:
    requests:
      memory: 1000Mi
      cpu: 1000m
    limits:
      memory: 2000Mi
      cpu: 1000m
Loki:
  ingester:
    replicas: 10
  distributor:
    replicas: 2
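If you supply overrides like these through a ConfigMap (as the note above describes for logging-operator-logging-overrides), a minimal sketch looks like the following; the ConfigMap name and the values shown are illustrative and must match what your AppDeployment references.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  # Illustrative name; use the ConfigMap name referenced by your AppDeployment.
  name: logging-operator-logging-overrides
data:
  values.yaml: |
    fluentd:
      scaling:
        replicas: 10
      resources:
        requests:
          memory: 1000Mi
          cpu: 1000m
        limits:
          memory: 2000Mi
          cpu: 1000m
EOF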
Note: To use the CLI to deploy or uninstall applications, see Deploying Platform Applications Using CLI on
page 389.
This topic describes how to enable your platform applications from the UI.
Procedure
2. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
3. Select the three-dot button of the desired application card > Enable.
6. For customizations only: to override the default configuration values, in the sidebar, select Configuration.
Note: If there are customization Overrides at the workspace and cluster level, they are combined for
implementation. Cluster-level Overrides take precedence over Workspace Overrides.
a. To customize an application for all clusters in a workspace, copy your customized values into the text editor
under Workspace Application Configuration or upload your YAML file that contains the values.
someField: someValue
b. To add a customization per cluster, copy the customized values into the text editor of each cluster under
Cluster Application Configuration Override or upload your YAML file that contains the values.
someField: someValue
2. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
3. In the Application card you want to customize, select the three dot menu and Edit.
Note: If there are customization Overrides at the workspace and cluster levels, they are combined for
implementation. Cluster-level Overrides take precedence over Workspace Overrides.
a. To customize an application for all clusters in a workspace, copy your customized values into the text editor
under Workspace Application Configuration or upload your YAML file that contains the values.
someField: someValue
b. To add a customization per cluster, copy the customized values into the text editor of each cluster under
Cluster Application Configuration Override or upload your YAML file that contains the values.
someField: someValue
You can also customize an application for a specific cluster from the Clusters view:
Procedure
Procedure
a. Select Management Cluster if your target cluster is the Management Cluster Workspace.
b. Otherwise, select Clusters, and choose your target cluster.
4. If the application was deployed successfully, the status Deployed appears in the application card. Otherwise,
hover over the failed status to obtain more information on why the application failed to deploy.
Note: It can take several minutes for the application to deploy completely. If the Deployed or Failed status is not
displayed, the deployment process is not finished.
Procedure
2. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
3. Select the three-dot button of the desired application card > Uninstall.
4. Follow the instruction on the confirmation pop-up message and select Uninstall Application.
Note: To use the CLI to deploy or uninstall applications, see Deploying Platform Applications Using CLI on
page 389.
Procedure
1. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
2. Select the three-dot button of the desired application card > Enable.
3. If available, select a version from the dropdown list. This dropdown list is only visible if there is more than one
version to choose from.
a. To customize an application for all clusters in a workspace, copy your customized values into the text editor
under Workspace Application Configuration or upload your YAML file that contains the values.
someField: someValue
Procedure
1. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
2. In the Application card you want to customize, select the three dot menu and Edit.
Note: If there are customization Overrides at the workspace and cluster level, they are combined for
implementation. Cluster-level Overrides take precedence over Workspace Overrides.
a. To customize an application for all clusters in a workspace, copy your customized values into the text editor
under Workspace Application Configuration or upload your YAML file that contains the values.
someField: someValue
Procedure
2. Select the Applications tab and navigate to the application you want to verify.
3. If the application was deployed successfully, the status Deployed appears in the application card. Otherwise,
hover over the failed status to obtain more information on why the application failed to deploy.
Note: It can take several minutes for the application to deploy completely. If the Deployed or Failed status is not
displayed, the deployment process is not finished.
Procedure
1. From the sidebar, browse through the available applications from your configured repositories, and select
Applications.
2. Select the three-dot button of the desired application card > Uninstall.
3. Follow the instruction on the confirmation pop-up message, and select Uninstall Application.
Platform Applications
When attaching a cluster, NKP deploys certain platform applications on the newly attached cluster.
Operators can use the NKP UI to customize which platform applications to deploy to the attached clusters
in a given workspace. For more information, and to check the default applications and their current versions, see the
Nutanix Kubernetes Platform Release Notes.
• Cert Manager: Automates TLS certificate management and issuance. For more information, see https://cert-
manager.io/docs/.
• Reloader: A controller that watches changes on ConfigMaps and Secrets, and automatically triggers updates on
the dependent applications. For more information, see https://github.com/stakater/Reloader.
• traefik: Provides an HTTP reverse proxy and load balancer. Requires cert-manager and reloader. For more
information, see https://traefik.io/.
• Chart Museum: An open-source repository for Helm charts (a chart is a collection of files that describe a set of
Kubernetes resources). For more information, see https://chartmuseum.com/.
• Air-gapped environments only: ChartMuseum is used on air-gapped installations to store the Helm
Charts for air-gapped installations. In non-air-gapped installations, the charts are fetched from upstream
repositories and Chartmuseum is not installed.
1. To see which applications are enabled or disabled in each category, verify the status.
kubectl get apps,clusterapps,appdeployments -A
Logging
Collects logs over time from Kubernetes and applications deployed on managed clusters. Also provides the ability to
visualize and query the aggregated logs.
• Fluent-Bit: An open-source, multi-platform log processor tool that aims to be a generic Swiss Army knife for log
processing and distribution. For more information, see https://docs.fluentbit.io/manual.
• Grafana: Log in to the dashboard to view logs aggregated to Grafana Loki. For more information, see https://
grafana.com/oss/grafana/.
• Grafana Loki: A horizontally scalable, highly available, multi-tenant log aggregation system inspired by
Prometheus. For more information, see https://grafana.com/oss/loki/.
• Logging operator: Automates the deployment and configuration of a Kubernetes logging pipeline. For more
information, see https://banzaicloud.com/docs/one-eye/logging-operator/.
• Rook Ceph: A Kubernetes-native high performance object store with an S3-compatible API that supports
deploying into private and public cloud infrastructures. For more information, see https://rook.io/docs/rook/
v1.10/Helm-Charts/operator-chart/.
• Rook Ceph Cluster: Deploys the Ceph cluster managed by the Rook Ceph operator. For more information, see
https://rook.io/docs/rook/v1.10/Helm-Charts/ceph-cluster-chart/.
Note: Currently, the monitoring stack is deployed by default. The logging stack is not.
Monitoring
Provides monitoring capabilities by collecting metrics, including cost metrics for Kubernetes and applications
deployed on managed clusters. Also provides visualization of metrics and evaluates rule expressions to trigger alerts
when specific conditions are observed.
• Kubecost: provides real-time cost visibility and insights for teams using Kubernetes, helping you continuously
reduce your cloud costs. For more information, see https://kubecost.com/
• kubernetes-dashboard: A general purpose, web-based UI for Kubernetes clusters. It allows users to manage
applications running in the cluster, troubleshoot them and manage the cluster itself. For more information, see
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
Note: Prometheus, Prometheus Alertmanager, and Grafana are included in the bundled installation. For more
information, see https://prometheus.io/, https://prometheus.io/docs/alerting/latest/alertmanager and
https://grafana.com/.
• nvidia-gpu-operator: The NVIDIA GPU Operator manages NVIDIA GPU resources in a Kubernetes cluster and
automates tasks related to bootstrapping GPU nodes. For more information, see https://catalog.ngc.nvidia.com/
orgs/nvidia/containers/gpu-operator.
• prometheus-adapter: Provides cluster metrics from Prometheus. For more information, see https://github.com/
DirectXMan12/k8s-prometheus-adapter.
Security
Allows management of security constraints and capabilities for the clusters and users.
• gatekeeper: A policy Controller for Kubernetes. For more information, see https://github.com/open-policy-
agent/gatekeeper
• kube-oidc-proxy: A reverse proxy server that authenticates users using OIDC to Kubernetes API servers where
OIDC authentication is not available. For more information, see https://github.com/jetstack/kube-oidc-proxy
• traefik-forward-auth: Installs a forward authentication application providing Google OAuth based authentication
for Traefik. For more information, see https://github.com/thomseddon/traefik-forward-auth.
Backup
This platform application assists you with backing up and restoring your environment:
• Velero: An open source tool for safely backing up and restoring resources in a Kubernetes cluster, performing
disaster recovery, and migrating resources and persistent volumes to another Kubernetes cluster. For more
information, see https://velero.io/.
Review the Workspace Platform Application Defaults and Resource Requirements on page 42 to ensure that
the attached clusters have sufficient resources.
When deploying and upgrading applications, platform applications come as a bundle; they are tested as a single unit,
and you must deploy or upgrade them in a single process, for each workspace. This means all clusters in a workspace
have the same set and versions of platform applications deployed.
• Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace where the
cluster is attached:
export WORKSPACE_NAMESPACE=<workspace_namespace>
• Set the WORKSPACE_NAME environment variable to the name of the workspace where the cluster is attached:
export WORKSPACE_NAME=<workspace_name>
Note: From the CLI, you can enable applications to deploy in the workspace. Verify that the application has
successfully deployed through the CLI.
To create the AppDeployment, enable a supported application to deploy to your existing attached or managed cluster
with an AppDeployment resource (see AppDeployment Resources on page 396).
Procedure
1. Obtain the APP ID and version of the application from the "Components and Applications" section in the
Nutanix Kubernetes Platform Release Notes. You must add them in the <APP-ID>-<Version> format, for example,
istio-1.17.2.
Note:
• The --app flag must match the APP NAME from the list of available platform applications.
• Observe that the nkp create command must be run with the WORKSPACE_NAME instead of the
WORKSPACE_NAMESPACE flag.
This instructs Kommander to create and deploy the AppDeployment to the KommanderClusters in the
specified WORKSPACE_NAME.
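For example, the command takes roughly the following shape. It mirrors the kube-prometheus-stack example shown later in this guide; the istio name and version here are placeholders that must match an APP ID and version from the Release Notes.
nkp create appdeployment istio --app istio-1.17.2 --workspace ${WORKSPACE_NAME}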
Procedure
Connect to the attached cluster and watch the HelmReleases to verify the deployment. In this example, we are
checking whether istio is deployed correctly.
kubectl get helmreleases istio -n ${WORKSPACE_NAMESPACE} -w
HelmRelease must be marked as Ready.
NAMESPACE NAME READY STATUS AGE
workspace-test-vjsfq istio True Release reconciliation succeeded 7m3s
Some supported applications have dependencies on other applications. For more information, see Platform
Applications Dependencies on page 390.
fluent-bit           -                    -
grafana-logging      grafana-loki         -
grafana-loki         rook-ceph-cluster    -
logging-operator     -                    -
rook-ceph            -                    -
rook-ceph-cluster    rook-ceph            kube-prometheus-stack
Logging
Collects logs over time from Kubernetes and applications deployed on managed clusters. Also
provides the ability to visualize and query the aggregated logs.
• fluent-bit: Open source and multi-platform log processor tool which aims to be a generic Swiss knife for logs
processing and distribution. For more information, see https://docs.fluentbit.io/manual/.
• grafana-logging: Logging dashboard used to view logs aggregated to Grafana Loki. For more information, see
https://grafana.com/oss/grafana/.
• grafana-loki: A horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by
Prometheus. For more information, see https://grafana.com/oss/loki/.
• logging-operator: Automates the deployment and configuration of a Kubernetes logging pipeline. For more
information, see https://banzaicloud.com/docs/one-eye/logging-operator/.
• rook-ceph and rook-ceph-cluster: A Kubernetes-native high performance object store with an S3-compatible API
that supports deploying into private and public cloud infrastructures. For more information, see https://rook.io/
docs/rook/v1.10/Helm-Charts/operator-chart/ and https://rook.io/docs/rook/v1.10/Helm-Charts/ceph-
cluster-chart/.
Monitoring
Provides monitoring capabilities by collecting metrics, including cost metrics, for Kubernetes and applications
deployed on managed clusters. Also provides visualization of metrics and evaluates rule expressions to trigger alerts
when specific conditions are observed.
• Kubecost: Provides real-time cost visibility and insights for teams using Kubernetes, helping you continuously
reduce your cloud costs. For more information, see https://kubecost.com/.
• kubernetes-dashboard: A general purpose, web-based UI for Kubernetes clusters. It allows users to manage
applications running in the cluster, troubleshoot them and manage the cluster itself. For more information, see
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/.
• kube-prometheus-stack: A stack of applications that collect metrics and provide visualization and alerting
capabilities. For more information, see https://github.com/prometheus-community/helm-charts/tree/main/
charts/kube-prometheus-stack.
Note: Prometheus, Prometheus Alertmanager, and Grafana are included in the bundled installation. For more
information, see https://prometheus.io/, https://prometheus.io/docs/alerting/latest/alertmanager,
and https://grafana.com/.
• nvidia-gpu-operator: The NVIDIA GPU Operator manages NVIDIA GPU resources in a Kubernetes cluster and
automates tasks related to bootstrapping GPU nodes. For more information, see https://catalog.ngc.nvidia.com/
orgs/nvidia/containers/gpu-operator.
• prometheus-adapter: Provides cluster metrics from Prometheus. For more information, see https://github.com/
DirectXMan12/k8s-prometheus-adapter.
Security
Allows management of security constraints and capabilities for the clusters and users.
• kube-oidc-proxy: A reverse proxy server that authenticates users using OIDC to Kubernetes API servers
where OIDC authentication is not available. For more information, see https://github.com/jetstack/kube-oidc-
proxy.
• traefik-forward-auth: Installs a forward authentication application providing Google OAuth based authentication
for Traefik. For more information, see https://github.com/thomseddon/traefik-forward-auth.
Backup
This platform application assists you with backing up and restoring your environment:
• velero: An open source tool for safely backing up and restoring resources in a Kubernetes cluster, performing
disaster recovery, and migrating resources and persistent volumes to another Kubernetes cluster. For more
information, see https://velero.io/.
Service Mesh
Allows deploying service mesh on clusters, enabling the management of microservices in cloud-native applications.
Service mesh can provide a number of benefits, such as providing observability into communications, providing
secure connections, or automating retries and backoff for failed requests.
• istio: Addresses the challenges developers and operators face with a distributed or microservices architecture. For
more information, see https://istio.io/latest/about/service-mesh/.
• jaeger: A distributed tracing system used for monitoring and troubleshooting microservices-based distributed
systems. For more information, see https://www.jaegertracing.io/.
• kiali: A management console for an Istio-based service mesh. It provides dashboards, observability, and lets
you operate your mesh with robust configuration and validation capabilities. For more information, see https://
kiali.io/.
• ai-navigator-info-api: The collector of the application's API service, which performs all data
abstraction and data structuring services. This component is enabled by default and included in the AI Navigator.
• ai-navigator-info-agent: After you manually enable this platform application, the agent starts collecting Pro
or Management cluster data and injecting it into the Cluster Info Agent database.
• ai-navigator-info-api (included in ai-navigator-app)
1. Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace's namespace where the
cluster is attached.
export WORKSPACE_NAMESPACE=<your_workspace_namespace>
2. You are now able to copy the following commands without having to replace the placeholder with your
workspace namespace every time you run a command.
Follow these steps.
Note: Keep in mind that the overrides for each application appear differently and depend on how the
application's helm chart values are configured.
For more information about the helm chart values used in the NKP, see "Component and Applications"
section in the Nutanix Kubernetes Platform Release Notes.
Generally speaking, performing a search for the priorityClassName field allows you to find out how
you can set the priority class for a component.
In the example below, which uses the helm chart values of Grafana Loki, the referenced
priorityClassName field is nested under the ingester component. The priority class can also be set for
several other components, including distributor and ruler, and at a global level.
Procedure
1. Create a ConfigMap with custom priority class configuration values for Grafana Loki.
The following example sets the priority class of ingester component to the NKP critical priority class.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: grafana-loki-overrides
data:
  values.yaml: |
    ingester:
      priorityClassName: nkp-critical-priority
EOF
3. It takes a few minutes to reconcile, but you can check the ingester pod's priority class after reconciliation.
kubectl get pods -n ${WORKSPACE_NAMESPACE} -o custom-columns=NAME:.metadata.name,PRIORITY:.spec.priorityClassName,PRIORITY:.spec.priority | grep ingester
The results appear as follows:
NAME                                       PRIORITY                PRIORITY
grafana-loki-loki-distributed-ingester-0   nkp-critical-priority   100002000
AppDeployment Resources
Use AppDeployments to deploy and customize platform, NKP catalog, and custom applications.
An AppDeployment is a custom resource (see Custom Resources at https://kubernetes.io/docs/concepts/extend-
kubernetes/api-extension/custom-resources/) created by NKP with the purpose of deploying applications
(platform, NKP catalog, and custom applications) in the management cluster, managed clusters, or both. Customers of
both Pro and Ultimate products use AppDeployments, regardless of their setup (air-gapped, non-air-gapped, and so on)
and their infrastructure provider.
When installing NKP, an AppDeployment resource is created for each enabled Platform Application. This
AppDeployment resource references a ClusterApp, which then references the repository that contains a concrete
declarative and preconfigured setup of an application, usually in the form of a HelmRelease. ClusterApps are
cluster-scoped so that these platform applications are deployable to all workspaces or projects.
In the case of NKP catalog and custom applications, the AppDeployment references an App instead of a
ClusterApp, which also references the repository containing the installation and deployment information. Apps are
namespace-scoped and are meant to only be deployable to the workspace or project in which they have been created.
For example, this is the default AppDeployment for the Kube Prometheus Stack platform application:
apiVersion: apps.kommander.nutanix.io/v1alpha3
kind: AppDeployment
metadata:
  name: kube-prometheus-stack
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: kube-prometheus-stack-46.8.0
    kind: ClusterApp
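For comparison, an AppDeployment for an NKP catalog or custom application references a namespace-scoped App rather than a ClusterApp. The following is a minimal sketch only; the application name and version are hypothetical and must match an App that exists in your workspace namespace (for example, as listed by kubectl get apps -n ${WORKSPACE_NAMESPACE}).
apiVersion: apps.kommander.nutanix.io/v1alpha3
kind: AppDeployment
metadata:
  name: my-catalog-app
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    # Apps are namespace-scoped, so the kind is App rather than ClusterApp.
    name: my-catalog-app-1.0.0
    kind: App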
Workspaces
Allow teams or tenants to manage their own clusters using workspaces. Workspaces are a logical grouping
of clusters that maintain a similar configuration, with certain configurations automatically federated to those
clusters. Workspaces give you the flexibility to represent your organization in a way that makes sense
for your teams or tenants. For example, you can create workspaces to separate clusters according to
departments, products, or business functions.
Global or Workspace UI
The UI is designed to be accessible for different roles at different levels:
• Global: At the top level, IT administrators manage all clusters across all workspaces.
• Workspace: DevOps administrators manage multiple clusters within a workspace.
• Projects: DevOps administrators or developers manage configuration and services across multiple clusters.
Default Workspace
To get started immediately, you can use the default workspace deployed in NKP. However, take into account that you
cannot move clusters from one workspace to another after creating/attaching them.
Creating a Workspace
In NKP, you can create your own workspaces.
Procedure
1. From the workspace selection dropdown list in the top menu bar, select Create Workspace.
3. Click Save.
The workspace is now accessible from the workspace selection dropdown list.
Procedure
2. Select Actions from the dropdown list and click Edit.
3. Enter in new Key and Value labels for your workspace, or edit existing Key and Value labels.
Note: Labels that are added to a workspace are also applied to the kommanderclusters object and
to all the clusters in the workspace.
Note: Workspaces can only be deleted if all the clusters in the workspace have been deleted or detached.
Procedure
3. Select the three-dot button to the right of the workspace you want to delete, and then click Delete.
Workspace Applications
This topic describes the applications and application types that you can use with NKP.
Application types are either pre-packaged applications from the Nutanix Application Catalog or custom applications
that you maintain for your teams or organization.
Note: NKP Pro users are only able to configure and deploy applications to a single cluster within a workspace.
Selecting an application to deploy to a cluster skips cluster selection and takes you directly to the workspace
configuration overrides page.
• Amazon Web Services (AWS): Creating a New AWS Air-gapped Cluster on page 779
• Amazon Elastic Kubernetes Service (EKS): Create an EKS Cluster from the UI on page 820
• Microsoft Azure: Creating a Managed Azure Cluster Through the NKP UI on page 464
For more information, see the current list of Catalog and Platform Applications:
Procedure
1. From the left navigation pane, find the application you want to deploy to the cluster, and select Applications.
2. Select the three-dot menu in the desired application’s tile and select Enable.
Note: You can also access the Application Enablement by selecting the three-dot menu > View > Details.
Then, select Enable from the application’s details page.
3. Select the cluster(s) that you want to deploy the application to.
The available clusters are sorted by Name, Type, Provider and any Labels that you added.
4. In the top-right corner of the Application Enablement page, deploy the application to the clusters by selecting
Enable.
You are automatically redirected to either the Applications or View Details page.
To view the application enabled in your chosen cluster, navigate to the Clusters page on the left navigation bar.
The application appears in the Applications pane of the appropriate cluster.
Note: Once you enable an application at the workspace level, NKP automatically enables that app on any other
cluster you create or attach.
Procedure
1. Select the cluster(s) that you want to deploy the application to.
The available clusters can be sorted by Name, Type, Provider and any Labels you’ve added.
3. The Configuration tab contains two separate types of code editors, where you can enter your specified overrides
and configurations.
» Workspace Application Configuration: A workspace-level code editor that applies all configurations and
overrides to the entirety of the workspace and its clusters for this application.
» Cluster Application Configuration Override: A cluster-scoped code editor that applies configurations
and overrides to the cluster specified. These customizations will merge with the workspace application
configuration. If there is no cluster-scoped configuration, the workspace configuration applies.
4. If you already have a configuration to apply in a text or .yaml file, you can upload the file by selecting Upload
File. If you want to download the displayed set of configurations, select Download File.
Note:
Editing is disabled in the code boxes displayed in the application’s details page. To edit the
configuration, click Edit in the top right of the page and repeat the steps in this section.
Procedure
2. Select the three-dot menu in the application tile that you want and select Uninstall.
A prompt appears to confirm your decision to uninstall the application.
4. Refresh the page to confirm that the application has been removed from the cluster.
This process only removes the application from the specific cluster you’ve navigated to. To remove this
application from other clusters, navigate to the Clusters page and repeat the process.
• Any application you want to enable or customize at the cluster level must first be enabled at the workspace level
through an AppDeployment. See Deploying Platform Applications Using CLI on page 389 and Workplace
Catalog Applications on page 406.
• Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace where the
cluster is attached.
export WORKSPACE_NAMESPACE=<workspace_namespace>
• Set the WORKSPACE_NAME environment variable to the name of the workspace where the cluster is attached.
export WORKSPACE_NAME=<workspace_name>
When you enable an application on a workspace, it is deployed to all clusters in the workspace by default. If you want
to deploy it only to a subset of clusters when enabling it on a workspace for the first time, you can follow the steps:
To enable an application per cluster for the first time:
Procedure
1. Create an AppDeployment for your application, selecting a subset of clusters within the workspace to enable
it on. You can use the nkp get clusters --workspace ${WORKSPACE_NAME} command to see the list of
clusters in the workspace.
The following snippet is an example. Replace the application name, version, workspace name and cluster names
according to your requirements. For compatible versions, see the "Components and Applications" section in the
Nutanix Kubernetes Platforms Release Notes.
nkp create appdeployment kube-prometheus-stack --app kube-prometheus-stack-46.8.0 --
workspace ${WORKSPACE_NAME} --clusters attached-cluster1,attached-cluster2
2. (Optional) Check the current status of the AppDeployment to see the names of the clusters where the application
is currently enabled.
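One way to do this is to print the resource and review its spec and status. This is a sketch that assumes the AppDeployment name from the previous step and the WORKSPACE_NAMESPACE variable set earlier.
kubectl get appdeployment kube-prometheus-stack -n ${WORKSPACE_NAMESPACE} -o yaml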
You can enable or disable an application per cluster after it has been enabled at the workspace level.
Note: For clusters that are newly attached into the workspace, all applications enabled for the workspace are
automatically enabled on and deployed to the new clusters.
If you want to see which clusters your application is currently deployed on, print and review the current state of
your AppDeployment. For more information, see AppDeployment Resources on page 396.
Procedure
Edit the AppDeployment YAML by adding or removing the names of the clusters where you want to enable your
application in the clusterSelector section:
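A minimal sketch of that edit follows, reusing the kube-prometheus-stack example from this section; the cluster names are placeholders.
spec:
  clusterSelector:
    matchExpressions:
    - key: kommander.d2iq.io/cluster-name
      operator: In
      values:
      # Keep the clusters that should run the application; remove a name to
      # disable the application on that cluster.
      - attached-cluster1
      - attached-cluster2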
You can customize the application for each cluster occurrence of said application. If you want to customize
the application for a cluster that is not yet attached, refer to the instructions below, so the application is
deployed with the custom configuration during attachment.
Procedure
1. Reference the name of the ConfigMap to be applied per cluster in the spec.clusterConfigOverrides
fields. In this example, you have three different customizations specified in three different ConfigMaps for three
different clusters in one workspace.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kube-prometheus-stack
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: kube-prometheus-stack-46.8.1
    kind: ClusterApp
  clusterSelector:
    matchExpressions:
    - key: kommander.d2iq.io/cluster-name
      operator: In
      values:
      - attached-cluster1
      - attached-cluster2
      - attached-cluster3-new
  clusterConfigOverrides:
  - configMapName: kps-cluster1-overrides
    clusterSelector:
      matchExpressions:
      - key: kommander.d2iq.io/cluster-name
2. If you have not done so yet, create the ConfigMaps referenced in each clusterConfigOverrides entry.
Note:
• The changes are applied only if the YAML file has a valid syntax.
• Set up only one cluster override ConfigMap per cluster. If there are several ConfigMaps configured
for a cluster, only one will be applied.
• Cluster override ConfigMaps must be created on the Management cluster.
You can customize the application configuration for a cluster prior to its attachment, so that the application
is deployed with this custom configuration on attachment. This is preferable, if you do not want to
redeploy the application with an updated configuration after it has been initially installed, which may cause
downtime.
Procedure
1. Set the CLUSTER_NAME environment variable to the cluster name that you will give your to-be-attached cluster.
export CLUSTER_NAME=<your_attached_cluster_name>
Reference the name of the ConfigMap you want to apply to this cluster in the
spec.clusterConfigOverrides fields. You do not need to update the spec.clusterSelector field.
In this example, you have the kps-cluster1-overrides customization specified for attached-cluster-1
and a different customization (in kps-your-attached-cluster-overrides ConfigMap) for your to-be-
attached cluster.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kube-prometheus-stack
  namespace: ${WORKSPACE_NAMESPACE}
spec:
2. If you have not done so yet, create the ConfigMap referenced for your to-be-attached cluster.
Note:
• The changes are applied only if the YAML file has a valid syntax.
• Cluster override ConfigMaps must be created on the Management cluster.
Procedure
1. To verify whether the applications connect to the managed or attached cluster and check the status of the
deployments, see Workplace Catalog Applications on page 406.
2. If you want to know how the AppDeployment resource is currently configured, print and review the
state of your AppDeployments.
Procedure
2. To delete the customization, delete the configMapName entry of the cluster. This is located under
clusterConfigOverrides.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kube-prometheus-stack
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    kind: ClusterApp
    name: kube-prometheus-stack-46.8.0
  configOverrides:
    name: kube-prometheus-stack-ws-overrides
  clusterSelector:
    matchExpressions:
    - key: kommander.d2iq.io/cluster-name
      operator: In
      values:
      - attached-cluster1
  clusterConfigOverrides:
  - configMapName: kps-cluster1-overrides
    clusterSelector:
      matchExpressions:
      - key: kommander.d2iq.io/cluster-name
        operator: In
Note: Compare steps one and two for a reference of how an entry should be deleted.
3. Before deleting a ConfigMap that contains your customization, ensure you will NOT require it at a later time. It is
not possible to restore a deleted ConfigMap. If you choose to delete it, run:
kubectl delete configmap <name_configmap> -n ${WORKSPACE_NAMESPACE}
Note: It is not possible to delete a ConfigMap that is being actively used and referenced in the
configOverride of any AppDeployment.
• Ensure your clusters run on a supported Kubernetes version and that this Kubernetes version is also compatible
with your catalog application version.
• For customers with an NKP Ultimate License on page 28 and a multi-cluster environment, Nutanix recommends
keeping all clusters on the same Kubernetes version. This ensures your NKP catalog application can run on all
clusters in a given workspace.
• Ensure that your NKP Catalog application is compatible with:
• The Kubernetes version in all the Managed and Attached clusters of the workspace where you want to install
the catalog application.
• The range of Kubernetes versions supported in this release of NKP.
• If your current Catalog application version is not compatible, upgrade the application to a compatible version.
Note: With the latest NKP version, only the following versions of Catalog applications are supported. All the
previous versions and any other applications previously included in the Catalog are now deprecated.
1. If you are running in an air-gapped environment, install Kommander in an air-gapped environment. For more
information, see Installing Kommander in an Air-gapped Environment on page 965.
2. Set the WORKSPACE_NAMESPACE environment variable to the name of your workspace’s namespace.
export WORKSPACE_NAMESPACE=<workspace namespace>
4. Verify that you can see the NKP workspace catalog Apps available in the UI (in the Applications section in said
workspace), and in the CLI, using kubectl.
kubectl get apps -n ${WORKSPACE_NAMESPACE}
Warning: If you use a custom version of KafkaCluster with cruise.control, ensure you use the custom resource
image version 2.5.123 in the .cruiseControlConfig.image field for both air-gapped and non-air-gapped
environments.
To avoid the critical CVEs associated with the official Kafka image in version v0.25.1, specify a custom image
when creating a Kafka cluster.
Specify the following custom values in KafkaCluster CRD:
• .spec.clusterImage to ghcr.io/banzaicloud/kafka:2.13-3.4.1
• .spec.cruiseControlConfig.initContainers[*].image to ghcr.io/banzaicloud/cruise-control:2.5.123
This topic describes the Kafka operator running in a workspace namespace, and how to create and
manage Kafka clusters in any project namespaces.
Procedure
1. Follow the generic installation instructions for workspace catalog applications on the Application Deployment
page.
2. Within the AppDeployment, update the appRef to specify the correct kafka-operator App. You can find the
appRef.name by listing the available Apps in the workspace namespace.
kubectl get apps -n ${WORKSPACE_NAMESPACE}
For details on custom configuration for the operator, see the Kafka operator Helm Chart documentation at
https://github.com/banzaicloud/koperator/tree/master/charts/kafka-operator#configuration.
Uninstalling the Kafka operator does not affect existing KafkaCluster deployments. After uninstalling
the operator, you must manually remove any remaining Custom Resource Definitions (CRDs) from the
operator.
Procedure
Note: The CRDs are not finalized for deletion until you delete the associated custom resources.
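As a sketch of that cleanup, you can list the CRDs installed by the Kafka operator and delete them once their custom resources are gone. The API group shown is the one used by the banzaicloud Kafka operator; exact CRD names can vary by operator version.
# List CRDs left behind by the Kafka operator (API group kafka.banzaicloud.io).
kubectl get crds | grep kafka.banzaicloud.io
# Delete a leftover CRD after its associated custom resources have been removed.
kubectl delete crd <crd-name>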
This topic describes the ZooKeeper operator running in a workspace namespace, and how to create and
manage ZooKeeper clusters in any project namespaces.
Procedure
1. Follow the generic installation instructions for workspace catalog applications on the Application Deployment
page.
2. Within the AppDeployment, update the appRef to specify the correct zookeeper-operator App. You can
find the appRef.name by listing the available Apps in the workspace namespace.
kubectl get apps -n ${WORKSPACE_NAMESPACE}
For details on custom configuration for the operator, see ZooKeeper operator Helm Chart documentation at
https://github.com/pravega/zookeeper-operator/tree/master/charts/zookeeper-operator#configuration.
Uninstalling the ZooKeeper operator will not directly affect any running ZookeeperClusters. By default,
the operator waits for any ZookeeperClusters to be deleted before it will fully uninstall (you can set
hooks.delete: true in the application configuration to disable this behavior). After uninstalling the
operator, you need to manually clean up any leftover Custom Resource Definitions (CRDs).
Procedure
Warning: After you remove the CRDs, all deployed ZookeeperClusters will be deleted!
• A running cluster with Kommander installed. The cluster must be on a supported Kubernetes version for this
release of NKP and also compatible with the catalog application version you want to install.
• Attach an Existing Kubernetes Cluster section of the documentation completed. For more information, see
Kubernetes Cluster Attachment on page 473.
• Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace the attached
cluster exists in.
export WORKSPACE_NAMESPACE=<workspace_namespace>
After creating a GitRepository, use either the NKP UI or the CLI to enable your catalog applications.
Note: From within a workspace, you can enable applications to deploy. Verify that an application has successfully
deployed through the CLI.
Procedure
1. Ultimate only: From the top menu bar, select your target workspace.
2. From the sidebar menu, select Applications to browse the available applications from your configured
repositories.
3. Select the three dot button on the required application tile and select Enable.
5. (Optional) If you want to override the default configuration values, copy your customized values into the text
editor under Configure Service or upload your YAML file that contains the values.
someField: someValue
See Workspace Catalog Applications for the list of available applications that you can deploy on the
attached cluster.
Procedure
1. Enable a supported application to deploy to your attached Kubernetes cluster with an AppDeployment resource.
For more information, see Kubernetes Cluster Attachment on page 473.
2. Within the AppDeployment, define the appRef to specify which App to enable.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kafka-operator
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: kafka-operator-0.25.1
    kind: App
EOF
Note:
• The appRef.name must match the app name from the list of available catalog applications.
• Create the resource in the workspace you just created, which instructs Kommander to deploy the
AppDeployment to the KommanderClusters in the same workspace.
Enabling the Catalog Application With a Custom Configuration Using the CLI
Procedure
1. Provide the name of a ConfigMap in the AppDeployment, which provides custom configuration on top of the
default configuration.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: kafka-operator
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: kafka-operator-0.25.1
    kind: App
  configOverrides:
    name: kafka-operator-overrides
EOF
2. Create the ConfigMap with the name provided in the step above, with the custom configuration.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: kafka-operator-overrides
data:
  values.yaml: |
    operator:
      verboseLogging: true
EOF
Kommander waits for the ConfigMap to be present before deploying the AppDeployment to the managed or
attached clusters.
Procedure
Connect to the attached cluster and check the HelmReleases to verify the deployment.
kubectl get helmreleases -n ${WORKSPACE_NAMESPACE}
The result appears as follows.
NAMESPACE              NAME             READY   STATUS                             AGE
workspace-test-vjsfq   kafka-operator   True    Release reconciliation succeeded   7m3s
Procedure
3. Select the three dot button on the required application tile, and then select Edit.
4. Select the Version from the dropdown list and select a new version.
This dropdown list is only available if there is a newer version to upgrade to.
5. Click Save.
Note: The commands use the workspace name and not namespace.
You can retrieve the workspace name by running the following command.
nkp get workspaces
To view a list of the deployed apps to your workspace, run the following command.
nkp get appdeployments --workspace=<workspace-name>
Complete the upgrade prerequisites tasks. For more information, see Upgrade Prerequisites on page 1092.
Procedure
1. To see what app(s) and app versions are available to upgrade, run the following command.
Note: You can identify the app version from the app name, which uses the format <APP ID>-<APP VERSION>.
Note: Platform applications cannot be upgraded on a one-off basis, and must be upgraded in a single process for
each workspace. If you attempt to upgrade a platform application with these commands, you receive an error and
the application is not upgraded.
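As a convenient cross-check (not the NKP CLI command the step above refers to), you can also list the catalog Apps and their versions available in the workspace with kubectl, as shown elsewhere in this guide.
kubectl get apps -n ${WORKSPACE_NAMESPACE}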
Custom Applications
Custom applications are third-party applications you have added to the NKP Catalog.
Custom applications are any third-party applications that are not provided in the NKP Application Catalog. Custom
applications can leverage applications from the NKP Catalog or be fully-customized. There is no expectation of
support by Nutanix for a Custom application. Custom applications can be deployed on Konvoy clusters or on any
Nutanix supported 3rd party Kubernetes distribution.
Git repositories must be structured in a specific manner for defined applications to be processed by
Kommander.
You must structure your git repository based on the following guidelines, for your applications to be processed
properly by Kommander so that they can be deployed.
• Define application manifests, such as a HelmRelease, under each versioned directory services/<app name>/
<version>/, accompanied by a kustomization.yaml Kubernetes Kustomization file. For more information, see
The Kustomization File in the SIG CLI documentation.
• Define the default values ConfigMap for HelmReleases in the services/<app name>/<version>/
defaults directory, accompanied by a kustomization.yaml Kubernetes Kustomization file pointing to the
ConfigMap file.
• Define the metadata.yaml of each application under the services/<app name>/ directory. For more
information, see Workspace Application Metadata on page 416
For an example of how to structure custom catalog Git repositories, see https://github.com/mesosphere/nkp-catalog-applications.
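For orientation, a minimal repository layout that follows these guidelines might look like the following; the application name, version, repository name, and file names are placeholders.
├── helm-repositories
│   └── <helm repository 1>
│       └── kustomization.yaml
└── services
    └── <app name>
        ├── metadata.yaml
        └── <version>
            ├── kustomization.yaml
            ├── <app name>.yaml
            └── defaults
                ├── cm.yaml
                └── kustomization.yaml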
Helm Repositories
You must include the HelmRepository that is referenced in each HelmRelease's Chart spec.
Each services/<app name>/<version>/kustomization.yaml must include the path of the YAML file that
defines the HelmRepository. For example.
# services/<app name>/<version>/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- <app name>.yaml
- ../../../helm-repositories/<helm repository 1>
For more information, see Helm Repositories in the Flux documentation.
Substitution Variables
Some substitution variables are provided. For more information, see Kustomization in the Flux documentation.
• ${releaseName}: For each App deployment, this variable is set to the AppDeployment name. Use this
variable to prefix the names of any resources that are defined in the application directory in the Git repository
so that multiple instances of the same application can be deployed. If you create resources without using the
releaseName prefix (or suffix) in the name field, there can be conflicts if the same named resource is created in
that same namespace.
• ${releaseNamespace}: The namespace of the Workspace.
• ${workspaceNamespace}: The namespace of the Workspace.
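To illustrate the prefixing convention described above, a resource defined in the application directory might use the variable in its name. The following snippet is hypothetical and not part of any shipped application.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${releaseName}-settings
  namespace: ${releaseNamespace}
data:
  example: "value"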
Use the CLI to create the GitRepository resource and add a new repository to your Workspace.
1. If you are running in an air-gapped environment, complete the steps in Installing Kommander in an Air-gapped
Environment on page 965.
2. Set the WORKSPACE_NAMESPACE environment variable to the name of your workspace’s namespace.
export WORKSPACE_NAMESPACE=<workspace_namespace>
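The repository itself is registered as a Flux GitRepository resource in the workspace namespace. The following is a minimal sketch only: the name, URL, and branch are placeholders, the API version depends on the Flux release bundled with NKP, and NKP may expect additional labels on catalog repositories that are not shown here.
cat <<EOF | kubectl apply -f -
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: example-catalog
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  interval: 1m
  ref:
    branch: main
  url: https://github.com/<your-org>/<your-catalog-repo>
EOF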
Note: To troubleshoot issues with adding the GitRepository, review the following logs.
kubectl -n kommander-flux logs -l app=source-controller
[...]
kubectl -n kommander-flux logs -l app=kustomize-controller
[...]
kubectl -n kommander-flux logs -l app=helm-controller
[...]
You can define how custom applications display in the NKP UI by defining a metadata.yaml file for each
application in the git repository. You must define this file at services/<application>/metadata.yaml for
it to process correctly.
Note: To display more information about custom applications in the UI, define a metadata.yaml file for each
application in the Git repository.
None of these fields are required for the application to display in the UI.
Here is an example metadata.yaml file.
displayName: Prometheus Monitoring Stack
description: Stack of applications that collects metrics and provides visualization
  and alerting capabilities. Includes Prometheus, Prometheus Alertmanager and Grafana.
category:
  - monitoring
overview: >
  # Overview
  A stack of applications that collects metrics and provides visualization and alerting
  capabilities. Includes Prometheus, Prometheus Alertmanager and Grafana.

  ## Dashboards
  By deploying the Prometheus Monitoring Stack, the following platform applications and
  their respective dashboards are deployed. After deployment to clusters in a workspace,
  the dashboards are available to access from a respective cluster's detail page.

  ### Prometheus

  ### Grafana
  A monitoring dashboard from Grafana that can be used to visualize metrics collected
  by Prometheus.
Procedure
2. From the sidebar menu, select Applications to browse the available applications from your configured
repositories.
3. Select the three dot button on the required application tile and click Enable.
5. (Optional) If you want to override the default configuration values, copy your customized values into the text
editor under Configure Service or upload your YAML file that contains the values.
someField: someValue
• Determine the name of the workspace where you wish to perform the deployments. You can use the nkp get
workspaces command to see the list of workspace names and their corresponding namespaces.
• Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace where the
cluster is attached:
export WORKSPACE_NAMESPACE=<workspace_namespace>
Procedure
1. Get the list of available applications to enable using the following command.
kubectl get apps -n ${WORKSPACE_NAMESPACE}
2. Deploy one of the supported applications from the list with an AppDeployment resource.
3. Within the AppDeployment, define the appRef to specify which App to enable.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: my-custom-app
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: custom-app-0.0.1
    kind: App
EOF
Note:
• The appRef.name must match the app name from the list of available catalog applications.
• Create the resource in the workspace you just created, which instructs Kommander to deploy the
AppDeployment to the KommanderClusters in the same workspace.
Enabling the Custom Application With Custom Configuration Using the CLI
Procedure
1. Provide the name of a ConfigMap in the AppDeployment, which provides custom configuration on top of the
default configuration.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: my-custom-app
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: custom-app-0.0.1
    kind: App
  configOverrides:
    name: my-custom-app-overrides
EOF
2. Create the ConfigMap with the name provided in the step above, with the custom configuration.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: my-custom-app-overrides
data:
  values.yaml: |
    someField: someValue
EOF
Procedure
Connect to the attached cluster and check the HelmReleases to verify the deployment.
kubectl get helmreleases -n ${WORKSPACE_NAMESPACE}
The output is as follows.
NAMESPACE NAME READY STATUS AGE
workspace-test-vjsfq my-custom-app True Release reconciliation succeeded 7m3s
Note: The syntax for the Identity Provider groups you add to a NKP Group varies depending on the context for which
you have established an Identity Provider.
• For groups: Add an Identity Provider Group in the oidc:<IdP_user_group> format. For
example, oidc:engineering.
• For users: Add an Identity Provider User in the <user_email> format. For example,
[email protected].
• For users: Add an Identity Provider User in the <workspace_ID>:<user_email> format. For
example, tenant-z:[email protected].
Run kubectl get workspaces to obtain a list of all existing workspaces. The workspace_ID is listed
under the NAME column.
3. Select the Cluster Role Bindings tab, and then select Add Roles next to the group you want.
4. Select the Role, or Roles, you want from the dropdown list and click Save.
It will take a few minutes for the resource to be created.
Multi-Tenancy in NKP
You can use workspaces to manage your tenants' environments separately, while still maintaining central control
over clusters and environments. For example, if you operate as a Managed Service Provider
(MSP), you can manage your clients' cluster life cycles, resources, and applications. If you operate as an
environment administrator, you can manage these resources per department, division, employee group, and so on.
Here are some important concepts:
• Multi-tenancy in NKP is an architecture model where a single NKP Ultimate instance serves multiple
divisions, customers, or tenants of an organization. In NKP, each tenant system is represented by a workspace. Each
workspace and its resources can be isolated from other workspaces (by using separate Identity Providers), even
though they all fall under a single Ultimate license.
Multi-tenant environments have at least two participating parties: the Ultimate license administrator (for example,
an MSP), and one or several tenants.
• Managed Service Providers or MSPs are partner organizations that use NKP to facilitate cloud infrastructure
services to their customers or tenants.
• Tenants can be customers of Managed Service Provider partners. They outsource their cloud management
requirements to MSPs, so they can focus on the development of their products.
Tenants can also be divisions within an organization that require a strict isolation from other divisions, for
example, through differentiated access control.
In NKP, a workspace is assigned to a tenant.
• Workspaces: In a multi-tenant system, workspaces and tenants are synonymous. You can set up an identity
provider to control all workspaces, including the Management cluster’s kommander workspace. You can then set
up additional identity providers for each workspace/tenant, and generate a dedicated Login URL so each tenant
has its own user access.
For more information, see Generating a Dedicated Login URL for Each Tenant on page 423.
• Projects: After you set up an identity provider per workspace or tenant, the tenant can choose to further
narrow down access with an additional layer. A tenant can choose to organize clusters into projects and assign
differentiated access to user groups with Project Role Bindings.
For more information, see Project Role Bindings on page 447.
By assigning clusters to one or several projects, you can enable more complex user access.
Multi-Tenancy Enablement
To enable multi-tenancy, you must:
Procedure
1. Set an environment variable to point at the workspace for which you want to generate a URL:
Replace <name_target_workspace> with the workspace name. If you do not know the exact name of the
workspace, run kubectl get workspace to get a list of all workspace names.
export WORKSPACE_NAME=<name_target_workspace>
3. Share the output login URL with your tenant, so users can start accessing their workspace from the NKP UI.
Projects
Multi-cluster Configuration Management
Project Namespaces
Project Namespaces isolate configurations across clusters. Individual standard Kubernetes namespaces are
automatically created on all clusters belonging to the project. When creating a new project, you can customize
the Kubernetes namespace name that is created. It is the grouping of all of these individual standard Kubernetes
namespaces that make up the concept of a Project Namespace. A Project Namespace is a Kommander specific
concept.
Project Applications
This section documents the applications and application types that you can utilize with NKP.
Application types are:
• Workspace Catalog Applications on page 406 that are either pre-packaged applications from the Nutanix
Application Catalog or custom applications that you maintain for your teams or organization.
• NKP Applications on page 376 are applications that are provided by Nutanix and added to the Catalog.
• Custom Applications on page 414 are applications integrated into Kommander.
• Platform Applications on page 386
When deploying and upgrading applications, platform applications come as a bundle; they are tested as a single unit
and you must deploy or upgrade them in a single process, for each workspace. This means all clusters in a workspace
have the same set and versions of platform applications deployed. Whereas catalog applications are individual, so you
can deploy and upgrade them individually, for each project.
Platform Applications
Procedure
5. Select the three dot button from the bottom-right corner of the desired application tile, and then select Enable.
6. If you want to override the default configuration values, copy your customized values into the text editor under
Configure Service or upload your YAML file that contains the values.
someField: someValue
Warning: There may be dependencies between the applications, which are listed in Project Platform
Application Dependencies on page 428. Review them carefully prior to customizing to ensure that the
applications are deployed successfully.
Platform Applications within a Project are automatically upgraded when the Workspace that a Project
belongs to is upgraded.
For more information on how to upgrade these applications, see Ultimate: Upgrade Platform Applications on
Managed and Attached Clusters on page 1101.
• Set the WORKSPACE_NAMESPACE environment variable to the namespace of the above workspace.
export WORKSPACE_NAMESPACE=$(kubectl get namespace --selector="workspaces.kommander.mesosphere.io/workspace-name=${WORKSPACE_NAME}" -o jsonpath='{.items[0].metadata.name}')
• Set the PROJECT_NAME environment variable to the name of the project in which the cluster is included:
export PROJECT_NAME=<project_name>
• Set the PROJECT_NAMESPACE environment variable to the name of the above project's namespace:
export PROJECT_NAMESPACE=$(kubectl get project ${PROJECT_NAMESPACE} -n
${WORKSPACE_NAMESPACE} -o jsonpath='{.status.namespaceRef.name}')
Procedure
1. Deploy one of the supported applications to your existing attached cluster with an AppDeployment resource.
Provide the appRef and application version to specify which App is deployed.
nkp create appdeployment project-grafana-logging --app project-grafana-logging-6.38.1 --workspace ${WORKSPACE_NAME} --project ${PROJECT_NAME}
2. Create the resource in the project you just created, which instructs Kommander to deploy the AppDeployment to
the KommanderClusters in the same project.
Note:
• The appRef.name must match the app name from the list of available catalog applications.
• Observe that the nkp create command must be run with both the --workspace and --project
flags for project platform applications.
Deploying the Project Platform Application With Custom Configuration Using the CLI
1. Create the AppDeployment and provide the name of a ConfigMap, which provides custom configuration on top
of the default configuration.
nkp create appdeployment project-grafana-logging --app project-grafana-logging-6.38.1 --config-overrides project-grafana-logging-overrides --workspace ${WORKSPACE_NAME} --project ${PROJECT_NAME}
2. Create the ConfigMap with the name provided in the step above, with the custom configuration.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${PROJECT_NAMESPACE}
  name: project-grafana-logging-overrides
data:
  values.yaml: |
    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
          - name: Loki
            type: loki
            url: "http://project-grafana-loki-loki-distributed-gateway"
            access: proxy
            isDefault: false
EOF
Kommander waits for the ConfigMap to be present before deploying the AppDeployment to the managed or
attached clusters.
Procedure
2. Connect to the attached cluster and check the HelmReleases to verify the deployment.
kubectl get helmreleases -n ${PROJECT_NAMESPACE}
NAMESPACE            NAME                      READY   STATUS                             AGE
project-test-vjsfq   project-grafana-logging   True    Release reconciliation succeeded   7m3s
Note: Some of the supported applications have dependencies on other applications. See Project Platform
Application Dependencies on page 428 for that table.
Application Dependencies
When deploying or troubleshooting applications, it helps to understand how applications interact and may require
other applications as dependencies.
If an application’s dependency does not successfully deploy, the application requiring that dependency does not
successfully deploy.
The following sections detail information about the platform applications.
Logging
Collects logs over time from Kubernetes pods deployed in the project namespace. Also provides the ability to
visualize and query the aggregated logs.
• project-logging: Defines resources for the Logging Operator which uses them to direct the project’s logs to its
respective Grafana Loki application. For more information, see https://grafana.com/oss/grafana/.
• project-grafana-loki: A horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by
Prometheus. For more information, see https://grafana.com/oss/loki/.
• project-grafana-logging: Logging dashboard used to view logs aggregated to Grafana Loki. For more
information, see https://grafana.com/oss/grafana/.
Warning: The project logging applications depend on the workspace logging applications being deployed. For
more information, see Enabling Logging Applications Using the UI on page 566.
Before upgrading your catalog applications, verify the current and supported versions of the application.
Also, keep in mind the distinction between Platform applications and Catalog applications. Platform
applications are deployed and upgraded as a set for each cluster or workspace. Catalog applications are
deployed separately, so that you can deploy and upgrade them individually for each project.
Procedure
5. Select the three dot button from the bottom-right corner of the desired application tile, and then click Edit.
7. Click Save.
Procedure
1. To see what app(s) and app versions are available to upgrade, run the following command:
Note: The APP ID column displays the available apps and the versions available to upgrade.
2. Run the following command to upgrade an application from the NKP CLI.
nkp upgrade catalogapp <appdeployment-name> --workspace=my-workspace --project=my-project --to-version=<version.number>
As an example, the following command upgrades the Kafka Operator application, named
kafka-operator-abc, in a workspace to version 0.25.1.
nkp upgrade catalogapp kafka-operator-abc --workspace=my-workspace --to-version=0.25.1
Note: Platform applications cannot be upgraded on a one-off basis, and must be upgraded in a single process for
each workspace. If you attempt to upgrade a platform application with these commands, you receive an error and
the application is not upgraded.
NKP applications are catalog applications provided by Nutanix for use in your environment.
Some NKP workspace catalog applications will provision CustomResourceDefinitions, which allow you
to deploy Custom Resources to a Project. See your NKP workspace catalog application’s documentation for
instructions.
To get started with creating ZooKeeper clusters in your project namespace, you first need to deploy the
Zookeeper operator in the workspace where the project exists.
Note: If you need to manage these custom resources across all clusters in a project, it is recommended you use
Project Deployments on page 441 which enables you to leverage GitOps to deploy the resources. Otherwise, you
will need to create the resources manually in each cluster.
Procedure
1. Set the PROJECT_NAMESPACE environment variable to the name of your project’s namespace.
export PROJECT_NAMESPACE=<project namespace>
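After the operator is deployed and the project namespace variable is set, a ZooKeeper cluster is created by applying a ZookeeperCluster custom resource to the project namespace. The following is a minimal sketch based on the upstream Pravega ZooKeeper operator examples; the cluster name and replica count are illustrative.
cat <<EOF | kubectl apply -f -
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeeper
  namespace: ${PROJECT_NAMESPACE}
spec:
  replicas: 3
EOF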
Procedure
After you deploy the Kafka operator, you can create Kafka clusters by applying a KafkaCluster custom
resource on each attached cluster in a project’s namespace.
• Deploy the Kafka operator in the workspace where the project exists. See Kafka Operator in a Workspace on
page 407.
• Deploy the ZooKeeper operator in the workspace where the project exists. See Zookeeper Operator in
Workspace on page 408.
• Deploy a ZooKeeper cluster in the same project where you want to enable Kafka. See Deploying ZooKeeper in
a Project on page 431.
Note: If you need to manage these custom resources across all clusters in a project, it is recommended you use project
deployments which enables you to leverage GitOps to deploy the resources. Otherwise, you must create the custom
resources manually in each cluster.
Procedure
2. Set the PROJECT_NAMESPACE environment variable to the name of your project’s namespace.
export PROJECT_NAMESPACE=<project namespace>
4. Use the Kafka Operator version to download the simplekafkacluster.yaml file you require.
In the following URL, replace /v0.25.1/ with the Kafka version you obtained in the previous step and
download the file.
https://raw.githubusercontent.com/banzaicloud/koperator/v0.25.1/config/samples/simplekafkacluster.yaml
To use a CVE-free Kafka image, set the clusterImage value to ghcr.io/banzaicloud/kafka:2.13-3.4.1
(similar to the workspace installation).
5. Open and edit the downloaded file to use the correct Zookeeper Cluster address.
Replace <project_namespace> with the target project namespace.
zkAddresses:
- "zookeeper-client.<project_namespace>:2181"
Procedure
Some workspace catalog applications provision CustomResourceDefinitions, which allow you to
deploy Custom Resources. Refer to your workspace catalog application's documentation for instructions.
Custom applications are third-party applications you have added to the Kommander Catalog.
Custom applications are any third-party applications that are not provided in the NKP Application Catalog. Custom
applications can leverage applications from the NKP Catalog or be fully-customized. There is no expectation of
support by Nutanix for a Custom application. Custom applications can be deployed on Konvoy clusters or on any
Nutanix supported 3rd party Kubernetes distribution.
Git repositories must be structured in a specific manner for defined applications to be processed by
Kommander.
You must structure your git repository based on the following guidelines, for your applications to be processed
properly by Kommander so that they can be deployed.
• Define application manifests, such as a HelmRelease, under each versioned directory services/<app name>/
<version>/.
• Define the metadata.yaml of each application under the services/<app name>/ directory. For more
information, see Workspace Application Metadata on page 416
For an example of how to structure custom catalog Git repositories, see the NKP Catalog repository at https://github.com/mesosphere/nkp-catalog-applications.
Helm Repositories
You must include the HelmRepository that is referenced in each HelmRelease's Chart spec.
Each services/<app name>/<version>/kustomization.yaml must include the path of the YAML file that
defines the HelmRepository. For example.
# services/<app name>/<version>/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - <app name>.yaml
  - ../../../helm-repositories/<helm repository 1>
For more information, see the following Flux documentation:
• HelmRepositories: https://fluxcd.io/docs/components/source/helmrepositories/
• Manage Helm Releases: https://fluxcd.io/flux/guides/helmreleases/
Substitution Variables
Some substitution variables are provided. For more information, see https://fluxcd.io/docs/components/kustomize/kustomization/#variable-substitution.
• ${releaseName}: For each application deployment, this variable is set to the AppDeployment name. Use this
variable to prefix the names of any resources that are defined in the application directory in the Git repository
so that multiple instances of the same application can be deployed. If you create resources without using the
releaseName prefix (or suffix) in the name field, there can be conflicts if the same named resource is created in
that same namespace.
• ${releaseNamespace}: The namespace of the project.
• ${workspaceNamespace}: The namespace of the workspace that the project belongs to.
Use the CLI to create the GitRepository resource and add a new repository to your Project.
Procedure
1. If you are running in an air-gapped environment, complete the steps in Installing Kommander in an Air-gapped
Environment on page 965.
2. Set the PROJECT_NAMESPACE environment variable to the name of your project's namespace.
export PROJECT_NAMESPACE=<project_namespace>
Note: To troubleshoot issues with adding the GitRepository, review the following logs:
kubectl -n kommander-flux logs -l app=source-controller
[...]
kubectl -n kommander-flux logs -l app=kustomize-controller
[...]
kubectl -n kommander-flux logs -l app=helm-controller
[...]
You can define how custom applications display in the NKP UI by defining a metadata.yaml file for each
application in the git repository. You must define this file at services/<application>/metadata.yaml for
it to process correctly.
Note: To display more information about custom applications in the UI, define a metadata.yaml file for each
application in the Git repository.
None of these fields are required for the application to display in the UI.
Here is an example metadata.yaml file:
displayName: Prometheus Monitoring Stack
overview: >
  ## Dashboards
  By deploying the Prometheus Monitoring Stack, the following platform applications and
  their respective dashboards are deployed. After deployment to clusters in a workspace,
  the dashboards are available to access from a respective cluster's detail page.

  ### Prometheus

  ### Grafana
  A monitoring dashboard from Grafana that can be used to visualize metrics collected
  by Prometheus.
  - [Grafana Documentation](https://grafana.com/docs/)
icon:
  PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMDAgMzAwIiBzdHlsZT0iZW5hYm
Enable a Custom Application from the Project Catalog. After creating a GitRepository, you can either use
the NKP UI or the CLI to enable your custom applications.
Note: From within a project, you can enable applications to deploy. Verify that an application has successfully
deployed through the CLI.
Procedure
4. Select Applications from the sidebar menu to browse the available applications from your configured
repositories.
5. Select the three dot button from the bottom-right corner of the desired application tile, and then select Enable.
7. (Optional) If you want to override the default configuration values, copy your customized values into the text
editor under Configure Service or upload your YAML file that contains the values.
someField: someValue
Enabling a Custom Application From the Project Catalog Using the CLI
Enable a Custom Application from the Project Catalog. After creating a GitRepository, you can either use
the NKP UI or the CLI to enable your custom applications.
Note: From within a project, you can enable applications to deploy. Verify that an application has successfully
deployed through the CLI.
Procedure
1. Set the PROJECT_NAMESPACE environment variable to the name of the above project's namespace.
export PROJECT_NAMESPACE=<project_namespace>
2. Get the list of available applications to enable using the following command.
kubectl get apps -n ${PROJECT_NAMESPACE}
3. Enable one of the supported applications from the list with an AppDeployment resource.
4. Within the AppDeployment resource, provide the appRef and application version to specify which App is
deployed.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: my-custom-app
  namespace: ${PROJECT_NAMESPACE}
spec:
  appRef:
    name: custom-app-0.0.1
    kind: App
EOF
Note: The appRef.name must match the app name from the list of available catalog applications.
Procedure
1. Provide the name of a ConfigMap in the AppDeployment, which provides custom configuration on top of the
default configuration.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: my-custom-app
  namespace: ${PROJECT_NAMESPACE}
spec:
  appRef:
    name: custom-app-0.0.1
    kind: App
  configOverrides:
    name: my-custom-app-overrides
EOF
2. Create the ConfigMap with the name provided in the step above, with the custom configuration.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${PROJECT_NAMESPACE}
  name: my-custom-app-overrides
data:
  values.yaml: |
    someField: someValue
EOF
Kommander waits for the ConfigMap to be present before deploying the AppDeployment to the attached clusters
in the Project.
Procedure
Connect to the attached cluster and check the HelmReleases to verify the deployment.
kubectl get helmreleases -n ${PROJECT_NAMESPACE}
The output looks similar to this:
NAMESPACE NAME READY STATUS AGE
project-test-vjsfq my-custom-app True Release reconciliation succeeded 7m3s
Project AppDeployments
An AppDeployment is a Custom Resource created by NKP with the purpose of deploying applications
(platform, NKP catalog and custom applications).
For more information about these Custom Resources and how to customize them, see Printing and Reviewing the
Current State of an AppDeployment Resource on page 377 section of this guide.
Project Deployments
Use Project Deployments to manage GitOps based Continuous Deployments.
You can configure Kommander Projects with GitOps-based Continuous Deployments for federation of your
Applications to associated clusters of the project. This is backed by Flux, which enables software and applications
to be continuously deployed (CD) using GitOps processes. GitOps enables the application to be deployed as per a
manifest that is stored in a Git repository. This ensures that the application deployment can be automated, audited,
and declaratively deployed to the infrastructure.
GitOps
GitOps is a modern software deployment strategy. The configuration that describes how your application is
deployed to a cluster is stored in a Git repository. The configuration is continuously synchronized from the
Git repository to the cluster, ensuring that the specified state of the cluster always matches what is defined
in the "GitOps" Git repository.
The benefits of using a GitOps deployment strategy are:
• Familiar, collaborative change and review process. Engineers are intimately familiar with Git-based workflows:
branches, pull requests, code reviews, etc. GitOps leverages this experience to control the deployment of software
and updates to catch issues early.
• Clear change log and audit trail. The Git commit log serves as an audit trail to answer the question: “who changed
what, and when?” Having such information available, you can contact the right people when fixing or prioritizing
a production incident to determine the why and correctly resolve the issue as quickly as possible. Additionally,
Kommander’s CD component (Flux CD) maintains a separate audit trail in the form of Kubernetes Events, as
changes to a Git repository don’t include exactly when those changes were deployed.
• Avoid configuration drift. The scope of manual changes made by operators expands over time. It soon becomes
difficult to know which cluster configuration is critical and which is left over from temporary workarounds or
live debugging. Over time, changing a project configuration or replicating a deployment to a new environment
becomes a daunting task. GitOps supports simple, reproducible deployment to multiple different clusters by
having a single source of truth for cluster and application configuration.
That said, there are some cases when live debugging is necessary in order to resolve an incident in the minimum
amount of time. In such cases, pull-request-based workflow adds precious time to resolution for critical production
outages. Kommander’s CD strategy supports this scenario by letting you disable the auto sync feature. After auto sync
is disabled, Flux will stop synchronizing the cluster state from the GitOps git repository. This lets you use kubectl,
helm, or whichever tool you need to resolve the issue.
Note: This procedure was run on an AWS cluster with NKP installed.
Procedure
1. Ensure you are on the Default Workspace (or other workspace you have access to) so that you can create a
project.
2. Create a project, as described in Projects on page 423. In the working example, we name the project pod-info.
When you create a namespace, Kommander appends five alphanumeric characters. You can opt to select a
target cluster for this project from one of the available attached clusters, and then this namespace (pod-info-xxxxx)
is used for deployments under the project.
3. (Optional) For private repositories, create a secret that is used to pull from the repository.
a. Select the Secrets tab and set up your secret according to the Continuous Deployment on page 444
documentation.
b. Add a key and value pair for the GitHub personal access token and then select Create.
4. Verify that the secret podinfo-secret is created on the project namespace in the managed or attached cluster.
kubectl get secrets -n pod-info-xt2sz --kubeconfig=${CLUSTER_NAME}.conf
NAME TYPE DATA AGE
default-token-k685t kubernetes.io/service-account-token 3 94m
6. Add a GitOps Source, complete the required fields, and then Save.
There are several configurable options such as selecting the Git Ref Type but in this example we use the master
branch. The Path value should contain where the manifests are located. Additionally, the Primary Git Secret
is the secret (podinfo-secret) that you created in the previous step, if you need to access private repositories.
This can be disregarded for public repositories.
7. Do the following.
a. Verify the status of the gitrepository creation with the following command (on the attached or managed
cluster), and confirm that READY is marked as True.
kubectl get gitrepository -A --kubeconfig=${CLUSTER_NAME}.conf
NAMESPACE        NAME             URL                                                                                                              AGE    READY   STATUS
kommander-flux   management       https://git-operator-git.git-operator-system.svc/repositories/kommander/kommander.svc/kommander/kommander.git   134m   True    stored artifact for revision 'main/4fbee486076778c85e14f3196e49b8766e50e6ce'
pod-info-xt2sz   podinfo-source   https://github.com/stefanprodan/podinfo                                                                          116m   True    stored artifact for revision 'master/b3b00fe35424a45d373bf4c7214178bc36fd7872'
8. Verify the Kustomization with the following command (on the attached or managed cluster), and confirm that
READY is marked as True.
kubectl get kustomizations -n pod-info-xt2sz --kubeconfig=${CLUSTER_NAME}.conf
NAME                  AGE    READY   STATUS
originalpodinfo       10m    True    Applied revision: master/b3b00fe35424a45d373bf4c7214178bc36fd7872
podinfo-source        113m   True    Applied revision: master/b3b00fe35424a45d373bf4c7214178bc36fd7872
project               116m   True    Applied revision: main/4fbee486076778c85e14f3196e49b8766e50e6ce
project-tls-root-ca   117m   True    Applied revision: main/4fbee486076778c85e14f3196e49b8766e50e6ce
9. Note the port so that you can use it to verify whether the app is deployed correctly (on the attached or managed
cluster).
kubectl get deployments,services -n pod-info-xt2sz --kubeconfig=${CLUSTER_NAME}.conf
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/podinfo 2/2 2 2 118m
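The guide does not show the forwarding step explicitly; one way to reach the noted port locally, assuming the service is named podinfo and serves on 9898, is to port-forward it from the attached or managed cluster.
kubectl port-forward -n pod-info-xt2sz svc/podinfo 9898:9898 --kubeconfig=${CLUSTER_NAME}.conf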
10. Open a browser and type in localhost:9898. A successful deployment of the podinfo app gives you this page.
Continuous Deployment
After installing Kommander and configuring your project and its clusters, navigate to the Continuous
Deployment (CD) tab under your Project.
Here you create a GitOps source which is a source code management (SCM) repository hosting the application
definition. Nutanix recommends that you create a secret first then create a GitOps source accessed by the secret.
You can create a secret that Kommander uses to deploy the contents of your GitOps repository.
Note: This dialog box creates a types.kubefed.io/v1beta1, Kind=FederatedSecret and this is not yet
supported by NKP CLI. Use the GUI, as described above, to create a federated secret or create a FederatedSecret
manifest and apply it to the project namespace. For more information about secrets, see Project Secrets on
page 454
Kommander secrets (for CD) can be configured to support any of the following three authentication methods:
Procedure
If you are using a GitHub personal access token, you do not need to have a key:value pair of username.
1. If you are using GitOps with a GitHub repository as your source, you can create your secret with a personal access
token. Then, in the NKP UI, in your project, create a Secret with a key:value pair of password: <your-token-created-on-github>.
If you are using a GitHub personal access token, you do not need to have a key:value pair of
username: <your-github-username>.
Note: If you have multi-factor authentication turned on in your GitHub account, this will not work.
Note: Using a token without a username is valid for GitHub, but other providers (such as GitLab) require both
username and tokens.
Warning: If you are using a public GitHub repository, you do not need to use a secret.
After the secret is created, you can view it in the Secrets tab. Configure the GitOps source accessed by
the secret.
Note: If using an SSH secret, the SCM repo URL needs to be an SSH address. It does not support SCP syntax. The
URL format is ssh://user@host:port/org/repository.
It takes a few moments for the GitOps Source to be reconciled and the manifests from the SCM repository at the
given path to be federated to attached clusters. After the sync is complete, manifests from GitOps source are created
in attached clusters.
After a GitOps Source is created, there are various commands that can be executed from the CLI to check the various
stages of syncing the manifests.
Procedure
On the management cluster, check your GitopsRepository to ensure that the CD manifests have been created
successfully.
kubectl describe gitopsrepositories.dispatch.d2iq.io -n<PROJECT_NAMESPACE> gitopsdemo
Name: gitopsdemo
Namespace: <PROJECT_NAMESPACE>
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ManifestSyncSuccess 1m7s GitopsRepositoryController manifests synced to
bootstrap repo
...
On the attached cluster, check for your Kustomization and GitRepository resources. The status field reflects
the syncing of manifests.
kubectl get kustomizations.kustomize.toolkit.fluxcd.io -n<PROJECT_NAMESPACE>
<GITOPS_SOURCE_NAME> -oyaml
...
status:
conditions:
- reason: ReconciliationSucceeded
status: "True"
type: Ready
...
...
Procedure
There may be times when you need to suspend the auto-sync between the GitOps repository and the
associated clusters. This live debugging may be necessary to resolve an incident in the minimum amount
of time without the overhead of pull request based workflows.
Procedure
5. Select the three dot button to the right of the desired GitOps Source.
• Ensure your GitOps repository does not contain any HelmRelease and Kustomization resources that are
targeting a different namespace than the project namespace.
Procedure
Note: If an error occurs with the Helm Releases charts deployment, an "Install Failed" error status appears in the
Kommander Host field.
Select the error status to open a screen that details specific issues related to the error.
Procedure
2. Select the Role Bindings tab, then select Add Roles next to the group you want.
3. Select the Role you want from the dropdown list, and then click Save.
Procedure
A Project role binding can also be created using kubectl.
cat << EOF | kubectl create -f -
apiVersion: workspaces.kommander.mesosphere.io/v1alpha1
kind: VirtualGroupProjectRoleBinding
metadata:
  generateName: projectpolicy-
  namespace: ${projectns}
spec:
  projectRoleRef:
    name: ${projectrole}
  virtualGroupRef:
    name: ${virtualgroup}
EOF
Procedure
1. To list the WorkspaceRoles that you can bind to a Project, run the following command.
kubectl get workspaceroles -n ${workspacens} -o=jsonpath="{.items[?(@.metadata.annotations.workspace\.kommander\.d2iq\.io\/project-default-workspace-role-for==\"${projectns}\")].metadata.name}"
You can bind to any of the above WorkspaceRoles by setting spec.workspaceRoleRef in the project role
binding.
cat << EOF | kubectl create -f -
apiVersion: workspaces.kommander.mesosphere.io/v1alpha1
kind: VirtualGroupProjectRoleBinding
metadata:
  generateName: projectpolicy-
  namespace: ${projectns}
spec:
  workspaceRoleRef:
    name: ${workspacerole}
  virtualGroupRef:
    name: ${virtualgroup}
EOF
In order to define which VirtualGroup(s) is assigned to one of these roles, administrators can create corresponding
role bindings such as VirtualGroupClusterRoleBinding, VirtualGroupWorkspaceRoleBinding, and
VirtualGroupProjectRoleBinding.
Note that for WorkspaceRole and ProjectRole, the referenced VirtualGroup and corresponding role and role
binding objects need to be in the same namespace. If they are not in the same namespace, the role will not bind to the
VirtualGroup since it is assumed that the rules set in the role apply to objects that live in that namespace. Whereas
for ClusterRole which is cluster-scoped, the VirtualGroupClusterRoleBinding is also cluster-scoped, even
though it references a namespace-scoped VirtualGroup.
Project Roles
Project Roles are used to define permissions at the namespace level.
Procedure
Create a Project Role with a single rule.
Procedure
Note: Ensure the projectns variable is set before executing the command.
3. Then, if you run the following command on a Kubernetes cluster associated with the Project, you see a Kubernetes
Role object in the corresponding namespace.
kubectl -n ${projectns} get role admin-dbfpj-l6s9g -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: "2020-06-04T11:54:26Z"
  labels:
Project ConfigMaps
Use Project ConfigMaps to automate ConfigMap creation on your clusters.
Project ConfigMaps can be created to make sure Kubernetes ConfigMaps are automatically created on all Kubernetes
clusters associated with the Project, in the corresponding namespace.
For reference, a ConfigMap is a key-value pair used to store non-confidential data, such as "name=bob" or
"state=CA". For more information, see the Kubernetes documentation on ConfigMaps at
https://kubernetes.io/docs/concepts/configuration/configmap/.
Procedure
6. Enter an ID, Description and Data for the ConfigMap, and click Create.
Procedure
1. A Project ConfigMap is simply a Kubernetes FederatedConfigMap and can be created using kubectl with YAML.
cat << EOF | kubectl create -f -
apiVersion: types.kubefed.io/v1beta1
kind: FederatedConfigMap
metadata:
  generateName: cm1-
  namespace: ${projectns}
spec:
  placement:
    clusterSelector: {}
  template:
    data:
      key: value
EOF
Note: Ensure the projectns variable is set before executing the command. This variable is the project
namespace (the Kubernetes Namespace associated with the project) that was defined/created when the project itself
was initially created.
projectns=$(kubectl -n ${workspacens} get projects.workspaces.kommander.mesosphere.io -o jsonpath='{.items[?(@.metadata.generateName=="project1-")].status.namespaceRef.name}')
2. Then, if you run the following command on a Kubernetes cluster associated with the Project, you’ll see a
Kubernetes ConfigMap Object, in the corresponding namespace.
kubectl -n ${projectns} get configmap cm1-8469c -o yaml
apiVersion: v1
data:
  key: value
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-04T16:37:10Z"
  labels:
    kubefed.io/managed: "true"
  name: cm1-8469c
  namespace: project1-5ljs9-lhvjl
  resourceVersion: "131844"
  selfLink: /api/v1/namespaces/project1-5ljs9-lhvjl/configmaps/cm1-8469c
  uid: d32acb98-3d57-421f-a677-016da5dab980
Project Secrets
Project Secrets can be created to make sure Kubernetes Secrets are automatically created on all the
Kubernetes clusters associated with the Project, in the corresponding namespace.
Procedure
1. Select the workspace your project was created in from the workspace selection dropdown in the header.
Procedure
1. A Project Secret is simply a Kubernetes FederatedSecret and can also be created using kubectl.
cat << EOF | kubectl create -f -
apiVersion: types.kubefed.io/v1beta1
kind: FederatedSecret
metadata:
  generateName: secret1-
  namespace: ${projectns}
spec:
  placement:
    clusterSelector: {}
  template:
    data:
      key: dmFsdWU=
EOF
Ensure the projectns variable is set before executing the command.
projectns=$(kubectl -n ${workspacens} get projects.workspaces.kommander.mesosphere.io -o jsonpath='{.items[?(@.metadata.generateName=="project1-")].status.namespaceRef.name}')
2. If you run the following command on a Kubernetes cluster associated with the Project, you see a Kubernetes
Secret Object, in the corresponding namespace.
kubectl -n ${projectns} get secret secret1-r9vk2 -o yaml
apiVersion: v1
data:
  key: dmFsdWU=
kind: Secret
metadata:
  creationTimestamp: "2020-06-04T16:51:59Z"
  labels:
    kubefed.io/managed: "true"
  name: secret1-r9vk2
  namespace: project1-5ljs9-lhvjl
  resourceVersion: "137215"
  selfLink: /api/v1/namespaces/project1-5ljs9-lhvjl/secrets/secret1-r9vk2
  uid: e5c6fc1d-93e7-47fe-ae1e-f418f8e35d72
type: Opaque
Procedure
1. Select the workspace your project was created in from the workspace selection dropdown in the header.
4. Select the Quotas & Limit Ranges tab, and then select Edit.
Kommander provides a set of default resources for which you can set Quotas. You can also define Quotas for
custom resources. We recommend that you set Quotas for CPU and Memory. By using Limit Ranges, you can
restrict the resource consumption of individual Pods, Containers, and Persistent Volume Claims in the project
namespace. You can also constrain memory and CPU resources consumed by Pods and Containers, and storage
resources consumed by Persistent Volume Claims.
5. To add a custom quota, scroll to the bottom of the form and select Add Quota.
Procedure
1. All the Project Quotas are defined using a Kubernetes FederatedResourceQuota called kommander which you can
also create/update using kubectl.
cat << EOF | kubectl apply -f -
apiVersion: types.kubefed.io/v1beta1
kind: FederatedResourceQuota
metadata:
  name: kommander
  namespace: ${projectns}
spec:
  placement:
    clusterSelector: {}
  template:
    spec:
      hard:
        limits.cpu: "10"
        limits.memory: 1024.000Mi
EOF
Ensure the projectns variable is set before executing the command.
projectns=$(kubectl -n ${workspacens} get projects.workspaces.kommander.mesosphere.io -o jsonpath='{.items[?(@.metadata.generateName=="project1-")].status.namespaceRef.name}')
2. Then, if you run the following command on a Kubernetes cluster associated with the Project, you’ll see a
Kubernetes Resource Quota in the corresponding namespace.
kubectl -n ${projectns} get resourcequota kommander -o yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  creationTimestamp: "2020-06-05T08:04:37Z"
  labels:
    kubefed.io/managed: "true"
  name: kommander
  namespace: project1-5ljs9-lhvjl
  resourceVersion: "470822"
  selfLink: /api/v1/namespaces/project1-5ljs9-lhvjl/resourcequotas/kommander
  uid: 925b61b4-134b-4c45-915c-96a05b63d3c3
spec:
  hard:
Network Plugins
Network plugins ensure that Kubernetes networking requirements are met and surface features needed by network
administrators, such as enforcing network policies. Common Network Plugins include Flannel, Calico, and Weave
among many others. As an example, Nutanix Konvoy uses the Calico CNI plugin by default, but can support others.
Nutanix AHV uses Cilium instead of Calico.
Since pods are short-lived, the cluster needs a way to configure the network dynamically as pods are created and
destroyed. Plugins provision and manage IP addresses to interfaces and let administrators manage IPs and their
assignments to containers, in addition to connections to more than one host, when needed.
Network Policies
A network policy consists of three main parts:
• General information
• Ingress rules
• Egress rules
The policy types are:
• Default - automatically includes ingress, and egress is set only if the network policy defines egress rules.
• Ingress - this policy applies to ingress traffic for the selected pods, to namespaces using the options you define
below, or both.
• Egress - this policy applies to egress traffic for the selected pods, to namespaces using the options you define
below, or both.
If the Default policy type is too rigid or does not offer what you need, you can select the Ingress or Egress type, or
both, and explicitly define the policy with the options that follow. For example, if you do not want this policy to apply
to ingress traffic, you only select Egress, and then define the policy.
To deny all ingress traffic, select the Ingress option here and then leave the ingress rules empty.
To deny all egress traffic, select the Egress option here and then leave the egress rules empty.
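In raw Kubernetes terms, a deny-all-ingress policy of this kind corresponds roughly to the following manifest. This is shown only for illustration; the UI creates the policy for you, and the policy name and namespace are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: <project_namespace>
spec:
  podSelector: {}
  policyTypes:
    - Ingress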
Procedure
Procedure
3. Type “Allow Users microservice clients to reach the APIs provided in this namespace” in the Description field.
4. Select Add under Pod Selector and then select Match Label.
Procedure
3. Select + Add Port, and set the Port to "8080" and the Protocol to TCP.
Procedure
1. Select + Add Source and mark the Select All Namespaces check box.
4. Set the Key value to “service.corp/users-api-role” and set the Value to “client”.
• Use Port 3306 to receive incoming TCP traffic for pods that have the label, tier: database
• Refuse traffic from pods unless they have the label, tier: api
Procedure
3. Type “Allow MySQL access only from API pods in this namespace” in the Description field.
4. Select Add under Pod Selector and then select Match Label.
Procedure
3. Select + Add Port, and set the Port to "3306" and the Protocol to TCP.
Procedure
4. Set the Key value to “tier” and set the Value to “database”.
First, you need to create a network policy with one or more ingress rules. Follow one of the preceding procedures.
Then, edit the policy to match the following example:
Procedure
In the table row belonging to your network policy, click the context menu at the right of the row and select Edit.
Procedure
1. Update the Policy Types so that only Egress is selected. If you don’t want to deny all egress traffic, ensure
that you add an egress rule that suits your preferred level of access. You can add an empty rule to allow all egress
traffic.
Procedure
3. Type “Deny egress traffic from restricted pods” in the Description field.
4. Select Add under Pod Selector and then select Match Label.
Procedure
1. Update the Policy Types so that only Egress is selected. Do not add any egress rules.
Cluster Management
View clusters created with Kommander or any connected Kubernetes cluster
Kommander allows you to monitor and manage very large numbers of clusters. Use the features described in this area
to connect existing clusters, or to create new clusters whose life cycle is managed by Konvoy. You can view clusters
from the Clusters tab in the navigation pane on the left. You can see the details for a cluster by selecting the View
Details link at the bottom of the cluster card or the cluster name in either the card or the table view.
Select the View Details link (on the cluster card’s bottom left corner) to see additional information about this cluster.
Procedure
Open the NKP UI.
What to do next
See the topic Specifying Nutanix Cluster Information.
Procedure
1. In the selected workspace Dashboard, select the Add Cluster button at the top right to display the Add Cluster
page.
1. Provide the control plane node pool name and resource sizing information.
• Node Pool Name: NKP sets this field's value (control plane), and you cannot change it.
• Disk: Enter the amount of disk space allocated for each control plane node. The default value is 80 GB. The
specified custom disk size must be equal to, or larger than, the size of the base OS image root file system. This
is because a root file system cannot be reduced automatically when a machine first boots.
• Memory: The amount of memory for each control plane node in GB. The default value is 16 GB.
• Number of CPUs: Enter the number of virtual processors in each control plane node. The default value is 4
CPUs per control plane node.
• Replicas: Enter the number of control plane nodes to create for your new cluster.
Valid values for production clusters are 3 or 5. You can enter one if creating a test cluster, but a single control
plane is not a valid production configuration. You must enter an odd number to allow internal leader selection
processes to provide proper failover for high availability. The default value is three control plane nodes.
Note: When you select a project, AOS cluster, subnets, and images in the control plane section, these selections
will automatically populate the worker node pool section. This eliminates the need to input the same information
twice manually. However, if desired, you can modify these selections for the worker node pool.
2. Provide the worker node pool name and resource sizing information.
• Node Pool Name: Enter a node pool name for the worker nodes. NKP sets this field’s default value to
worker-0.
• Disk: Enter the amount of disk space allotted for each worker node. The default value is 80GB. The specified
custom disk size must be equal to, or larger than, the size of the base OS image root file system. This is
because a root file system cannot be reduced automatically when a machine first boots.
• Memory: The amount of memory for each worker node in GB. The default value is 32 GB.
• Number of CPUs: Enter the number of virtual processors in each worker node. The default value is 8 CPUs
per node.
• Replicas: Enter the number of worker nodes to create for your new cluster. The default value is 4 worker
nodes.
3. The AOS Cluster dropdown is empty by default. When you select a project, the dropdown is filtered to display only the AOS clusters that are part of that project. If the project is empty, the UI displays all AOS clusters.
Procedure
5. From Select Infrastructure Provider, choose the provider created in the prerequisites section.
6. If available, choose a Kubernetes Version. Otherwise, review the supported Kubernetes versions; the default version installs.
8. Edit your worker Node Pools as necessary. You can choose the Number of Nodes, the Machine Type,
and for the worker nodes, you can choose a Worker Availability Zone.
10. Review your inputs to ensure they meet the predefined criteria, and select Create.
Note: It can take up to 15 minutes for your cluster to appear in the Provisioned status.
You are then redirected to the Clusters page, where you’ll see your new cluster in the Provisioning status.
Hover over the status to view the details.
Note:
You must also create a vSphere infrastructure provider before you can create additional vSphere
clusters.
To Provision a vSphere Cluster.
Procedure
Complete these procedures to provision a vSphere cluster.
What to do next
Select the View Details link (on the cluster card’s bottom left corner) to see additional information about this
cluster.
Procedure
1. In the selected workspace Dashboard, select the Add Cluster button at the top right to display the Add Cluster
page.
1. Provide the following values for the Resources that are specific to vSphere.
2. Enter the values for the network information in the lower half of this section.
• Network: Enter an existing network name you want the new cluster to use.
You need to create required network resources, such as port groups or distributed port groups, in the vSphere
Client or using the vSphere API before you use NKP to create a new cluster.
• Resource Pool: Enter the name of a logical resource pool for the cluster’s resources.
In vSphere, resource pools are a logical abstraction that allows you to allocate and manage computing
resources, such as CPU and memory, for a group of virtual machines. Use resource pools only when needed,
as they can add complexity to your environment.
• Virtual Machine Template: Enter the name of the virtual machine template to use for the managed cluster's
virtual machines.
In vSphere, a virtual machine (VM) template is a pre-configured virtual machine that you can use to create
new virtual machines with identical configurations quickly. The template contains the basic configuration
settings for the VM, such as the operating system, installed software, and hardware configurations.
• Storage Policy: Enter the name of a valid vSphere storage policy. This field is optional.
A storage policy in vSphere specifies the storage requirements for virtual machine disks and files. It consists
of a rule set that defines the storage capabilities required, tags to identify them, profiles that collect settings
and requirements, and storage requirements that include storage performance, capacity, redundancy, and other
attributes necessary for the virtual machine to function properly. By creating and applying a storage policy to
a specific datastore or group of datastores, you can ensure that virtual machines using that datastore meet the
specified storage requirements.
1. Provide the control plane node pool name and resource sizing information.
• Node Pool Name: NKP sets this field's value (control plane), and you cannot change it.
• Disk: Enter the amount of disk space allocated for each control plane node. The default value is 80 GB. The
specified custom disk size must be equal to, or larger than, the size of the base OS image root file system. This
is because a root file system cannot be reduced automatically when a machine first boots.
• Memory: The amount of memory for each control plane node in GB. The default value is 16 GB.
• Number of CPUs: Enter the number of virtual processors in each control plane node. The default value is 4
CPUs per control plane node.
• Replicas: Enter the number of control plane nodes to create for your new cluster.
Valid values for production clusters are 3 or 5. You can enter one if you are creating a test cluster, but a single
control plane is not a valid production configuration. You must enter an odd number to allow for internal
leader selection processes to provide proper failover for high availability. The default value is three control
plane nodes.
2. Provide the worker node pool name and resource sizing information.
• Node Pool Name: Enter a node pool name for the worker nodes. NKP sets this field’s default value to
worker-0.
• Disk: Enter the amount of disk space allotted for each worker node. The default value is 80GB. The specified
custom disk size must be equal to, or larger than, the size of the base OS image root file system. This is
because a root file system cannot be reduced automatically when a machine first boots.
• Memory: The amount of memory for each worker node in GB. The default value is 32 GB.
• Number of CPUs: Enter the number of virtual processors in each worker node. The default value is 8 CPUs
per node.
• Replicas: Enter the number of worker nodes to create for your new cluster. The default value is four worker
nodes.
Procedure
Provide the Virtual IP information needed for managing this cluster with NKP.
• Interface: Enter the name of the network used for the virtual IP control plane endpoint.
This value is specific to your environment and cannot be inferred by NKP. An example value is eth0 or ens5.
• Host: Enter the control plane endpoint address.
To use an external load balancer, set this value to the load balancer’s IP address or hostname. To use the built-in
virtual IP, set to a static IPv4 address in the Layer 2 network of the control plane machines.
• Port: Enter the control plane’s endpoint port.
The default port value is 6443. To use an external load balancer, set this value to the load balancer's listening port.
Procedure
The MetalLB load balancer is needed for cluster installation, and requires these values.
• Provide a Starting IP address range value for the load balancing allocation.
• Provide an Ending IP address range value for the load balancing allocation.
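NKP configures MetalLB from these two values; you do not create the objects yourself. For illustration only, a start/end range like the one entered in the UI corresponds to an upstream MetalLB address pool similar to the following (the CRD shown is the upstream metallb.io API; the name and addresses are examples):
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: example-pool            # example name
  namespace: metallb-system
spec:
  addresses:
  - 10.0.50.100-10.0.50.150     # example starting and ending IP addresses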
Procedure
1. Select Datastore URL if it is not already highlighted, and then in the Datastore URL field, enter a unique
identifier in URL format used by vSphere to access specific storage locations. A typical example of the field’s
format is ds:///vmfs/volumes/<datastore_uuid>/.
2. Select Storage Policy Name if it is not already highlighted, and then in the Storage Policy Name field, enter
the name of the storage policy to use with the cluster’s StorageClass.
Configuring CIDR Values for the Pod Network and Kubernetes Services
Procedure
Specify the following values.
• Enter a CIDR value for the Pod network in the Pod Network CIDR field. The default value is 192.168.0.0/16.
• Enter a CIDR value for Kubernetes Services in the Service CIDR field. The default value is 10.96.0.0/12.
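NKP applies these CIDRs during cluster creation. Conceptually, they map to the upstream kubeadm ClusterConfiguration networking fields shown below; this sketch only illustrates what each value controls and is not a file you edit in NKP:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 192.168.0.0/16     # Pod Network CIDR: addresses assigned to pods
  serviceSubnet: 10.96.0.0/12   # Service CIDR: virtual IPs for Kubernetes Services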
Procedure
Configure the image registry mirror.
Procedure
Select the Create button (at the page’s top right corner) to begin provisioning the cluster.
Note: You must also create a VCD infrastructure provider before you can create additional VCD clusters.
Procedure
Complete these procedures to provision a VCD cluster:
Procedure
Provide these cluster details in the form:
Procedure
2. Copy and paste an entire, valid SSH Public Key to go with the SSH Username in the previous field.
Procedure
Provide the following values for the Resources that are specific to VCD.
• Datacenter: Select an existing Organization's Virtual Datacenter (VDC) name where you want to deploy the cluster.
• Network: Select the Organization's virtual datacenter Network that the new cluster uses.
• Organization: The Organization name under which you want to deploy the cluster.
• Catalog: The name of the VCD Catalog that hosts the virtual machine templates used for cluster creation.
Procedure
1. Provide the control plane node pool name and resource sizing information.
• Node Pool Name: NKP sets this field's value (control plane), and you cannot change it.
• Number of Nodes: Enter the number of control plane nodes to create for your new cluster.
Valid values for production clusters are three or five. You can enter one if you are creating a test cluster, but
a single control plane is not a valid production configuration. You must enter an odd number to allow for
internal leader selection processes to provide proper failover for high availability. The default value is three
control plane nodes.
• Placement Policy: The placement policy to use for the control plane machines.
A VM placement policy defines the placement of a virtual machine on a host or group of hosts. It is a
mechanism for cloud provider administrators to create a named group of hosts within a provider VDC. The
named group of hosts is a subset of hosts within the provider VDC clusters that might be selected based on any
criteria such as performance tiers or licensing. You can expand the scope of a VM placement policy to more
than one provider VDC.
• Sizing Policy: The sizing policy to use for the control plane machines.
A VM sizing policy defines the computing resource allocation for virtual machines within an organization's
VDC. The compute resource allocation includes CPU and memory allocation, reservations, limits, and shares.
2. Provide the worker node pool name and resource sizing information.
• Node Pool Name: Enter a node pool name for the worker nodes. NKP sets this field’s default value to
worker-0.
• Replicas: Enter the number of worker nodes to create for your new cluster. The default value is four worker
nodes.
• Placement Policy: The placement policy to use for the worker machines.
A VM placement policy defines the placement of a virtual machine on a host or group of hosts. It is a
mechanism for cloud provider administrators to create a named group of hosts within a provider VDC. The
named group of hosts is a subset of hosts within the provider VDC clusters that might be selected based on any
criteria such as performance tiers or licensing. You can expand the scope of a VM placement policy to more
than one provider VDC.
• Sizing Policy: The sizing policy to use for the worker machines.
A VM sizing policy defines the computing resource allocation for virtual machines within an organization's
VDC. The compute resource allocation includes CPU and memory allocation, reservations, limits, and shares.
Procedure
Procedure
2. Type the Username for the account to use to authenticate to the registry mirror.
3. Type the Password for the account to authenticate to the registry mirror.
4. Copy and paste a CA Certificate chain to use while communicating with the registry mirror using TLS.
Procedure
1. Type the Host name as either the control plane endpoint IP or a hostname.
2. Enter a Port value for the control plane endpoint port. The default value is 6443. To use an external load balancer,
set this port value to the load balancer’s listening port number.
Note: Starting with NKP 2.6.0, NKP supports the attachment of all Kubernetes Conformant clusters, but only x86-64
architecture is supported, not ARM.
Platform applications extend the functionality of Kubernetes and provide ready-to-use logging and monitoring stacks.
Platform applications are deployed when a cluster is attached to Kommander.
Basic Requirements
To attach an existing cluster in the UI, the Application Management cluster must be able to reach the services and the
api-server of the target cluster.
The cluster you want to attach can be an NKP-CLI-created cluster (which becomes a Managed cluster upon attachment) or another Kubernetes cluster, such as AKS, EKS, or GKE (which becomes an Attached cluster upon attachment).
Procedure
2. Before you attach clusters, you need to create one or more Workspaces, and we recommend that you also create Projects within your Workspaces. Workspaces give you a logical way to represent your teams and specific configurations. Projects let you define one or more clusters as a group to which Kommander pushes a common configuration.
Note: Do not attach a cluster in the "Management Cluster Workspace" workspace. This workspace is reserved for your Application Management cluster only.
In addition to the basic cluster requirements, the platform services you want NKP to manage on those clusters will
impact the total cluster requirements. The specific combinations of platform applications will affect the requirements
for the cluster nodes and their resources (CPU, memory, and storage).
To view a list of platform applications that NKP provides by default, see Platform Applications on page 386.
Attaching an existing AWS cluster requires that the cluster be fully configured and running. You must create a
separate service account when attaching existing AKS, EKS, or Google GKE Kubernetes clusters. This is necessary
because the kubeconfig files generated from those clusters are not usable out-of-the-box by Kommander. The
kubeconfig files call CLI commands, such as azure, aws, or gcloud, and use locally obtained authentication tokens. Having a separate service account also allows you to keep access to the cluster specific to, and isolated from, Kommander.
The suggested default cluster configuration includes a control plane pool containing three m5.xlarge nodes and a
worker pool containing four m5.2xlarge nodes.
Consider the additional resource requirements for running the platform services you want NKP to manage and ensure
that your existing clusters comply.
To attach an existing EKS cluster, see EKS Cluster Attachment on page 478.
To attach an existing GKE cluster, see GKE Cluster Attachment on page 484.
If you are attaching clusters that already have cert-manager installed, the cert-manager HelmRelease provided
by NKP will fail to deploy, due to the existing cert-manager installation. As long as the pre-existing cert-manager
functions as expected, you can ignore this failure. It will have no impact on the operation of the cluster.
3. Verify that the serviceaccount token is ready by running the kubectl -n kube-system get secret
kommander-cluster-admin-sa-token -oyaml command.
Verify that the data.token field is populated. The output should be similar to this example:
apiVersion: v1
data:
  ca.crt: LS0tLS1CRUdJTiBDR...
  namespace: ZGVmYXVsdA==
  token: ZXlKaGJHY2lPaUpTVX...
kind: Secret
metadata:
  annotations:
    kubernetes.io/service-account.name: kommander-cluster-admin
    kubernetes.io/service-account.uid: b62bc32e-b502-4654-921d-94a742e273a8
  creationTimestamp: "2022-08-19T13:36:42Z"
  name: kommander-cluster-admin-sa-token
  namespace: default
  resourceVersion: "8554"
  uid: 72c2a4f0-636d-4a70-9f1c-55a75f15e520
type: kubernetes.io/service-account-token
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
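Optionally, confirm the variables are populated before generating the kubeconfig (a quick sanity check, not part of the official procedure):
echo ${CURRENT_CONTEXT}
echo ${CURRENT_CLUSTER}
echo ${CLUSTER_SERVER}
# The token and CA values are long; just confirm they are non-empty.
[ -n "${USER_TOKEN_VALUE}" ] && [ -n "${CLUSTER_CA}" ] && echo "token and CA are set"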
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
  context:
    cluster: ${CURRENT_CONTEXT}
    user: kommander-cluster-admin
    namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
  cluster:
    certificate-authority-data: ${CLUSTER_CA}
    server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
  user:
    token: ${USER_TOKEN_VALUE}
EOF
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster.
Before importing this configuration, verify the kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
What to do next
Use this kubeconfig to:
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached to the
NKP UI. If this happens, check if any pods are not getting the resources required.
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
perform the steps in Cluster Attachment with Networking Restrictions on page 488.
5. In the Cluster Configuration section, paste your kubeconfig file into the field, or select Upload
kubeconfig File to specify the file.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig. You can edit this field with the name you want for your cluster.
7. The Context select list is populated from the kubeconfig. Select the desired context with admin privileges from
the Context select list.
• A fully configured and running Amazon EKS cluster with administrative privileges.
• The current version of NKP Ultimate is installed on your cluster.
• Ensure you have installed kubectl in your Management cluster.
• To attach Amazon EKS clusters, ensure that the KUBECONFIG environment variable is set to the Management cluster's kubeconfig before attaching, by running:
export KUBECONFIG=<Management_cluster_kubeconfig>.conf
Procedure
Ensure you are connected to your EKS clusters. Enter the following commands for each of your clusters.
kubectl config get-contexts
kubectl config use-context <context for first eks cluster>
Confirm kubectl can access the EKS cluster:
kubectl get nodes
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
  context:
    cluster: ${CURRENT_CONTEXT}
    user: kommander-cluster-admin
    namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
  cluster:
    certificate-authority-data: ${CLUSTER_CA}
    server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
  user:
    token: ${USER_TOKEN_VALUE}
EOF
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify the
kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
follow the steps in Cluster Attachment with Networking Restrictions on page 488.
5. Upload the kubeconfig file you created in the previous section (or copy its contents) into the Cluster
Configuration section.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig file. You can
edit this field using the name you want for your cluster.
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached to
the NKP UI. If this happens, ensure your system has sufficient resources for all pods.
• A fully configured and running Azure AKS cluster with administrative privileges.
• The current version of NKP Ultimate is installed on your cluster.
• Ensure you have installed kubectl in your Management cluster.
Note:
This procedure assumes you have one or more existing Azure AKS clusters with administrative privileges. For information on Azure AKS setup and configuration, see https://azure.microsoft.com/en-us/products/kubernetes-service/.
Ensure you have access to your AKS clusters.
Procedure
1. Ensure you are connected to your AKS clusters. Enter the following commands for each of your clusters.
kubectl config get-contexts
kubectl config use-context <context for first AKS cluster>
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
  context:
    cluster: ${CURRENT_CONTEXT}
    user: kommander-cluster-admin
    namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
  cluster:
    certificate-authority-data: ${CLUSTER_CA}
    server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
  user:
    token: ${USER_TOKEN_VALUE}
EOF
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify the
kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
follow the steps in Cluster Attachment with Networking Restrictions on page 488.
5. Upload the kubeconfig file you created in the previous section (or copy its contents) into the Cluster
Configuration section.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig file. You can
edit this field using the name you want for your cluster.
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached in
the NKP UI. If this happens, ensure your system has sufficient resources for all pods.
• A fully configured and running GKE cluster with a supported Kubernetes version and administrative privileges.
• The current version of NKP Ultimate is installed on your cluster.
• Ensure you have installed kubectl in your Management cluster.
Note: This procedure assumes you have an existing and spun-up GKE cluster with administrator privileges.
Note:
Ensure you have access to your GKE clusters.
Procedure
1. Ensure you are connected to your GKE clusters. Enter the following commands for each of your clusters.
kubectl config get-contexts
kubectl config use-context <context for first GKE cluster>
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
export CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
export CLUSTER_CA=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}')
export CLUSTER_SERVER=$(kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}{{ .cluster.server }}{{end}}{{ end }}')
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
  context:
    cluster: ${CURRENT_CONTEXT}
    user: kommander-cluster-admin
    namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
  cluster:
    certificate-authority-data: ${CLUSTER_CA}
    server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
  user:
    token: ${USER_TOKEN_VALUE}
EOF
8. This process produces a file in your current working directory called kommander-cluster-admin-config. The
contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify the
kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
follow the steps in Cluster Attachment with Networking Restrictions on page 488.
5. Upload the kubeconfig file you created in the previous section (or copy its contents) into the Cluster
Configuration section.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig file. You can
edit this field using the name you want for your cluster.
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached to
the NKP UI. If this happens, ensure your system has sufficient resources for all pods.
After the attachment request is accepted and the connection between clusters is established, both clusters will allow
bilateral communication.
• Gain more understanding of this approach by reviewing UI: Attaching a Network-Restricted Cluster Using a
Tunnel Through the UI on page 490
• Ensure you have reviewed the general https://d2iq.atlassian.net/wiki/spaces/DENT/pages/29920407 .
• Firewall Rules:
Procedure
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
5. Establish the configuration parameters for the attachment: Enter the Cluster Name of the cluster
you’re attaching.
7. Select the hostname that is the Ingress for the cluster from the Load Balancer Hostname dropdown menu.
The hostname must match the Kommander Host cluster to which you are attaching your existing cluster with
network restrictions.
8. Specify the URL Path Prefix for your Load Balancer Hostname. This URL path will serve as the prefix for the
specific tunnel services you want to expose on the Kommander management cluster. If no value is specified, the
value defaults to /nkp/tunnel.
Kommander uses Traefik 2 ingress, which requires the explicit definition of strip prefix middleware as
a Kubernetes API object, as opposed to a simple annotation. Kommander provides default middleware
that supports creating tunnels only on the /nkp/tunnel URL prefix. This is indicated by using the extra
annotation, traefik.ingress.kubernetes.io/router.middlewares: kommander-stripprefixes-
kubetunnel@kubernetescrd as shown in the code sample that follows. If you want to expose a tunnel on a
different URL prefix, you must manage your own middleware configuration.
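The referenced code sample is not reproduced in this extract. For orientation, the annotation above points at a Traefik strip-prefix Middleware in the kommander namespace; a sketch of what such an object typically looks like follows (the exact object NKP ships may differ):
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: stripprefixes-kubetunnel   # referenced by the annotation as kommander-stripprefixes-kubetunnel@kubernetescrd
  namespace: kommander
spec:
  stripPrefix:
    prefixes:
    - /nkp/tunnel                  # the default URL prefix for tunnel services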
10. Provide a secret for your certificate in the Root CA Certificate dropdown list.
a. For environments where the Management cluster uses a publicly-signed CA (like ZeroSSL or Let’s Encrypt),
select Use Publicly Trusted CA.
b. If you manually created a secret in advance, select it from the dropdown list.
c. For all other cases, select Create a new secret. Then, run the following command on the Management
cluster to obtain the caBundle key:
kubectl get kommandercluster -n kommander host-cluster -o go-template='{{ .status.ingress.caBundle }}'
Copy and paste the output into the Root CA Certificate field.
12. (Optional) Enable Proxied Access to allow kubectl access and dashboard observability for the network-restricted cluster from the Management cluster. For more information, see Proxied Access to Network-Restricted Clusters on page 505.
Select Show Advanced.
Note:
• If you previously configured a domain wildcard for your cluster, a Cluster Proxy Domain is
suggested automatically based on your cluster name. Replace the suggestion if you want to assign a
different domain for the proxied cluster.
• If you want to use the external-dns service, specify a Cluster Proxy Domain that is within
the zones specified in the --domain-filter argument of the external-dns deployment manifest
stored on the Management cluster.
For example, if the filter is set to example.com, a possible domain for the
TUNNEL_PROXY_EXTERNAL_DOMAIN is myclusterproxy.example.com.
Custom settings (check box unchecked):
• DNS: Manually create a DNS record. The record's A/CNAME value must point to the Management cluster's Traefik IP address, URL, or domain.
• Certificate: Select an existing TLS certificate, OR select an existing Issuer or ClusterIssuer.
15. Select Save & Generate kubeconfig to generate a file required to finish attaching the cluster.
A new window appears with instructions on how to finalize attaching the cluster.
What to do next
UI: Finishing Attaching the Existing Cluster on page 492.
How to apply the kubeconfig file to create the network tunnel to attach a network-restricted cluster.
Procedure
1. Select the Download Manifest link to download the file you generated previously.
2. Copy the kubectl apply command from the UI and paste it into your terminal session. Do not run it yet.
3. Ensure you substitute the actual name of the file for the variable. Also ensure you use the --kubeconfig=<managed_cluster_kubeconfig.conf> flag to run the command on the Attached or Managed cluster. Run the command.
Running this command starts the attachment process, which might take several minutes to complete. The Cluster details page appears automatically when the cluster attachment process completes.
4. (Optional) Select Verify Connection to Cluster to send a request to Kommander to refresh the connection
information. You can use this option to check to see if the connection is complete, though the Cluster Details page
displays automatically when the connection is complete.
Note: After the initial connection is made and your cluster becomes viewable as attached in the NKP UI, the attachment, federated add-ons, and platform services still need to be completed. This might take several minutes.
What to do next
CLI: Using the Network-restricted Cluster on page 498
Procedure
» Obtain the desired workspace namespace on the Management cluster for the tunnel gateway:
namespace=$(kubectl get workspace default-workspace -o jsonpath="{.status.namespaceRef.name}")
» Alternatively, you can create a new workspace instead of using an existing workspace: Run the following
command, and replace the <workspace_name> with the new workspace name:
workspace=<workspace_name>
Finish creating the workspace:
namespace=${workspace}
Note: Kommander uses Traefik 2 ingress, which requires explicit definition of strip prefix middleware as a
Kubernetes API object, as opposed to a simple annotation. Kommander provides default middleware that supports
creating tunnels only on the /nkp/tunnel URL prefix. This is indicated by using the extra annotation,
traefik.ingress.kubernetes.io/router.middlewares: kommander-stripprefixes-
kubetunnel@kubernetescrd
as shown in the code sample that follows. If you want to expose a tunnel on a different URL prefix, you must
manage your own middleware configuration.
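The referenced code sample is not included in this extract. As a hedged illustration, an Ingress that exposes a tunnel on the default prefix attaches the middleware through the annotation quoted above (the Ingress and Service names here are hypothetical):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kubetunnel-example            # hypothetical name
  namespace: kommander
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: kommander-stripprefixes-kubetunnel@kubernetescrd
spec:
  rules:
  - http:
      paths:
      - path: /nkp/tunnel
        pathType: Prefix
        backend:
          service:
            name: tunnel-server       # hypothetical tunnel service
            port:
              number: 443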
a. Establish variables for the certificate secret and gateway. Replace the <gateway_name> placeholder with the
name of the gateway.
cacert_secret=kubetunnel-ca
gateway=<gateway_name>
Procedure
1. Establish a variable for the connector. Provide the name of the connector, by replacing the <connector_name>
placeholder:
connector=<connector_name>
4. Wait for the tunnel connector to reach the Listening state and export the agent manifest.
while [ "$(kubectl get tunnelconnector -n ${namespace} ${connector} -o
jsonpath="{.status.state}")" != "Listening" ]
do
sleep 5
done
Note: When attaching several clusters, ensure that you fetch the manifest.yaml of the cluster you are
attempting to attach. Using the wrong combination of manifest.yaml and cluster will cause the attachment to
fail.
Procedure
1. Apply the manifest.yaml file to the Attached or Managed cluster and deploy the tunnel agent.
kubectl apply --kubeconfig=<managed_cluster_kubeconfig.conf> -f manifest.yaml
When you create a cluster using the NKP CLI, it does not attach automatically.
Procedure
1. On the Management cluster, wait for the tunnel to be connected by the tunnel agent.
while [ "$(kubectl get tunnelconnector -n ${namespace} ${connector} -o
jsonpath="{.status.state}")" != "Connected" ]
do
sleep 5
done
2. Establish variables for the managed cluster. Replace the <private-cluster> placeholder with the name of the managed cluster:
managed=<private-cluster>
display_name=${managed}
Procedure
1. Apply a network policy that restricts tunnel access to specific namespaces and IP blocks.
The following example permits connections from:
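The example policy itself is not included in this extract. A rough sketch of such a policy follows; the tunnel-server pod label and the IP block are assumptions you must adapt, while the namespace label matches the one described in the next step:
cat << EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ${connector}-tunnel-access         # illustrative name
  namespace: ${namespace}
spec:
  podSelector:
    matchLabels:
      app: tunnel-server                    # assumption: label on the tunnel server pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubetunnel.d2iq.io/networkpolicy: ${connector}-${namespace}
    - ipBlock:
        cidr: 10.0.0.0/8                    # example IP block; adjust to your environment
EOF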
2. To enable applications running in another namespace to access the attached cluster, add the label
kubetunnel.d2iq.io/networkpolicy=${connector}-${namespace} to the target namespace.
kubectl label ns ${namespace} kubetunnel.d2iq.io/networkpolicy=${connector}-${namespace}
All pods in the target namespace can now reach the attached cluster services.
What to do next
• (Optional): If you want to access the network-restricted attached cluster from the Management cluster, Enabling
Proxied Access Using the CLI on page 507
• (Alternatively), start CLI: Using the Network-restricted Cluster on page 498
» If the client program supports the use of a kubeconfig file, use the network-restricted cluster’s kubeconfig.
» If the client program supports SOCKS5 proxies, use the proxy directly.
» Otherwise, deploy a proxy server on the Management cluster.
2. Network-restricted cluster service: these sections require a service to run on the Attached or Managed network-restricted cluster.
As an example, start the following service.
service_namespace=test
service_name=webserver
service_port=8888
service_endpoint=${service_name}.${service_namespace}.svc.cluster.local:${service_port}
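If no such service exists yet, one way to stand up a matching test service on the Attached or Managed (network-restricted) cluster is the following; the nginx image and port mapping are illustrative:
kubectl create namespace ${service_namespace}
kubectl create deployment ${service_name} --image=nginx --port=80 -n ${service_namespace}
kubectl expose deployment ${service_name} --port=${service_port} --target-port=80 -n ${service_namespace}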
Procedure
2. After setting service_namespace and service_name to the service resource, run this command on the
Management cluster.
cat > get-service.yaml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: get-service
spec:
  template:
    spec:
      containers:
      - name: kubectl
        image: bitnami/kubectl:1.19
        command: ["kubectl", "get", "service", "-n", "${service_namespace}", "${service_name}"]
        env:
        - name: KUBECONFIG
          value: /tmp/kubeconfig/kubeconfig
        volumeMounts:
        - name: kubeconfig
          mountPath: /tmp/kubeconfig
      volumes:
      - name: kubeconfig
        secret:
          secretName: "${kubeconfig_secret}"
      restartPolicy: Never
  backoffLimit: 4
EOF
Procedure
1. To use the SOCKS5 proxy directly, obtain the SOCKS5 proxy endpoint using:
proxy_service=$(kubectl get tunnelconnector -n ${namespace} ${connector} -o jsonpath='{.status.tunnelServer.serviceRef.name}')
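The remaining steps of this procedure are not shown in this extract. Later commands reference ${socks_proxy}; as an assumption, it is typically the in-cluster address and port of that proxy service, for example:
proxy_port=$(kubectl get service -n ${namespace} ${proxy_service} -o jsonpath='{.spec.ports[0].port}')   # assumption: the first listed port is the SOCKS5 port
socks_proxy=${proxy_service}.${namespace}.svc.cluster.local:${proxy_port}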
Procedure
1. To deploy a proxy on the Management cluster, obtain the SOCKS5 proxy endpoint using:
proxy_service=$(kubectl get tunnelconnector -n ${namespace} ${connector} -o jsonpath='{.status.tunnelServer.serviceRef.name}')
2. Provide the value of ${socks_proxy} as the SOCKS5 proxy to a proxy deployed on the Management cluster. After setting service_endpoint to the service endpoint, run the following on the Management cluster:
cat > nginx-proxy.yaml <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: nginx-proxy-crt
spec:
  secretName: nginx-proxy-crt-secret
  dnsNames:
  - nginx-proxy-service.${namespace}.svc.cluster.local
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: kubernetes-ca
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-proxy
  labels:
    app: nginx-proxy-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-proxy-app
  template:
    metadata:
      labels:
        app: nginx-proxy-app
    spec:
      containers:
      - name: nginx-proxy
        image: mesosphere/ghostunnel:v1.5.3-server-backend-proxy
        args:
        - "server"
        - "--listen=:443"
        - "--target=${service_endpoint}"
        - "--cert=/etc/certs/tls.crt"
        - "--key=/etc/certs/tls.key"
        - "--cacert=/etc/certs/ca.crt"
        - "--unsafe-target"
        - "--disable-authentication"
        env:
        - name: ALL_PROXY
          value: socks5://${socks_proxy}
        ports:
        - containerPort: 443
        volumeMounts:
        - name: certs
          mountPath: /etc/certs
      volumes:
      - name: certs
        secret:
          secretName: nginx-proxy-crt-secret
---
apiVersion: v1
3. Any client running on the Management cluster can now access the service running on the Attached or Managed cluster using the proxy service endpoint. Note that the curl job runs in the same namespace as the proxy to provide access to the CA certificate secret.
cat > curl.yaml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: curl
spec:
  template:
    spec:
      containers:
      - name: curl
        image: curlimages/curl:7.76.0
        command:
        - curl
        - --silent
        - --show-error
        - --cacert
        - /etc/certs/ca.crt
        - https://nginx-proxy-service.${namespace}.svc.cluster.local:${proxy_port}
        volumeMounts:
        - name: certs
          mountPath: /etc/certs
      volumes:
      - name: certs
        secret:
          secretName: nginx-proxy-crt-secret
      restartPolicy: Never
  backoffLimit: 4
EOF
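Because the curl job must run in the same namespace as the proxy (see the note above), apply the manifest there and check its logs; a brief usage sketch:
kubectl apply -n ${namespace} -f curl.yaml
kubectl logs -n ${namespace} job/curl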
Enabling a proxied access allows you to access Attached and Managed clusters that are network-restricted, in a
private network, firewalled, or at the edge.
Note: This section only applies to clusters with networking restrictions that were attached through a secure tunnel
You can attach clusters that are in a private network (clusters that have networking restrictions or are at the edge).
Nutanix provides the option of using a secure tunnel or a tunneled attachment to attach a Kubernetes cluster to
the Management cluster. To access these attached clusters through kubectl or monitor its resources through the
Management cluster, you have to be in the same network, or enable a proxied access.
Enabling the proxied access for a network-restricted cluster makes it possible for NKP to authenticate user requests
(regardless of the identity provider) through the Management cluster's authentication proxy. This is helpful when the cluster you are trying to reach is in a different network. The proxied access allows you to:
• Access and observe the cluster’s monitoring and logging services from the Management cluster, for example:
• Access the cluster’s Grafana, Kubernetes, and Kubecost dashboards from the Management cluster.
• Use the CLI to print a cluster's service URL so that you can access the cluster's dashboards.
• Access and perform operations on the network-restricted cluster from the Management cluster, for example:
• Generate an API token (Generate Token option, from the upper right corner of the UI) that allows you to
authenticate to the network-restricted cluster.
• Upon authentication, use kubectl to manage the network-restricted cluster.
You can perform the previous actions without being in the same network as the network-restricted cluster.
Procedure
a. If you have not installed the Kommander component yet, initialize the configuration file, so you can edit it in
the following steps.
Warning: Initialize this file only once. Otherwise, you will overwrite previous customizations.
b. If you have installed the Kommander component already, open the existing kommander.yaml with the editor
of your choice.
2. Adjust the apps section of your kommander.yaml file to include these values.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  [...]
  kubetunnel:
    enabled: true
    values: |
      proxiedAccess:
        clusterDomainWildcard: "*.example.com"
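After saving the file, apply the configuration by installing (or re-running the installation of) the Kommander component with it. The flag below reflects the standard Kommander installation workflow; if the installation chapter of this guide uses a different invocation, follow that instead:
# Assumption: kommander.yaml is the installer configuration file edited above.
nkp install kommander --installer-config kommander.yaml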
Enabling proxied access allows you to access Attached and Managed clusters that are network-restricted, in a private network, firewalled, or at the edge.
Procedure
1. Set the WORKSPACE_NAMESPACE environment variable to the name of your network-restricted cluster’s workspace
namespace.
export WORKSPACE_NAMESPACE=<workspace namespace>
4. Given that each cluster can only have one proxy domain, reuse the name of the network-restricted cluster for the
proxy object.
TUNNEL_PROXY_NAME=${NETWORK_RESTRICTED_CLUSTER}
Procedure
1. In the Management cluster, create a TunnelProxy object for your proxied cluster and assign it a unique domain.
This domain forwards all user authentication requests through the Management cluster and is used to generate a
URL that exposes the cluster's dashboards (clusterProxyDomain).
» To back the domain, you require both a certificate and a DNS record. If you choose the default configuration,
NKP will handle the certificate creation (self-signed certificate), but you must create a DNS record manually.
» Alternatively, you can set up a different Certificate Authority to handle the certificate creation and rotation for
your domain. You can also set up the external-dns service to automatically create a DNS record.
Note:
• Example 1: Domain with Default Certificate and Automatic DNS Record Creation (Requires External DNS) on page 509
• Example 2: Domain with Default Certificate and Default DNS Setup (Requires Manually-created DNS) on page 509
• Example 3: Domain with Auto-generated ACME Certificate and Automatic DNS Record Creation (Requires External DNS) on page 510
• Example 4: Domain with Custom Certificate (Requires Certificate Secret) and Automatic DNS Record Creation (Requires External DNS) on page 511
Example 1: Domain with Default Certificate and Automatic DNS Record Creation (Requires External DNS)
Example 2: Domain with Default Certificate and Default DNS Setup (Requires Manually-created DNS)
Example 3: Domain with Auto-generated ACME Certificate and Automatic DNS Record Creation (Requires
External DNS)
• Certificate - The domain uses cert-manager to enable an ACME-based Certificate Authority (CA). This CA automatically issues and rotates your certificates. By default, NKP uses Let's Encrypt.
• DNS record -
The external-dns manages the creation of a DNS record automatically. For it to work, ensure you have enabled
Configuring External DNS with the CLI: Management or Pro Cluster on page 999.
1. Set the environment variable for your issuing object:
This can be a ClusterIssuer or Issuer. For more information, see Advanced Configuration: ClusterIssuer on
page 996.
ISSUER_KIND=ClusterIssuer
For more information, see DNS Record Creation with External DNS on page 998.
Example 4: Domain with Custom Certificate (Requires Certificate Secret) and Automatic DNS Record
Creation (Requires External DNS)
• Certificate - The domain uses a custom certificate created manually. Ensure you reference the
<certificate_secret_name>.
• DNS record -
The external-dns manages the creation of a DNS record automatically. For it to work, ensure you have enabled
Configuring External DNS with the CLI: Management or Pro Cluster on page 999.
1. Set an environment variable for the name of your custom certificate.
For more information, see Configuring the Kommander Installation with a Custom Domain and Certificate
on page 990.
CERTIFICATE_SECRET_NAME=<custom_certificate_secret_name>
2. (Optional): If you do not have a secret yet and want to create one pointing at the certificate, run the following
command.
kubectl create secret tls ${CERTIFICATE_SECRET_NAME} -n ${WORKSPACE_NAMESPACE} --key="tls.key" --cert="tls.crt"
For more information, see Configure Custom Domains or Custom Certificates post Kommander Installation on
page 534.
• Ensure you set the required variables as described in Enabling Proxied Access Using the CLI on page 507 and create a TunnelProxy object as described in Creating a TunnelProxy Object on page 508 before you run the following commands.
• Ensure you run the following command on the Management cluster.
For more information on switching cluster contexts, see Commands within a kubeconfig File on page 31.
To enable the TunnelProxy, reference the object in the KommanderCluster object:
Procedure
On the Management cluster, patch the KommanderCluster object with the name of the TunnelProxy you created
on the previous page.
kubectl patch --type merge kommanderclusters -n ${WORKSPACE_NAMESPACE} ${NETWORK_RESTRICTED_CLUSTER} --patch "{\"spec\": {\"clusterTunnelProxyConnectorRef\": { \"name\": \"${TUNNEL_PROXY_NAME}\"}}}"
Procedure
1. Verify that the following conditions for the TunnelProxy configuration are met.
kubectl wait --for=condition=ClientAuthReady=true --timeout=300s -n ${WORKSPACE_NAMESPACE} tunnelproxy/${TUNNEL_PROXY_NAME}
kubectl wait --for=condition=ReverseProxyReady=true --timeout=300s -n ${WORKSPACE_NAMESPACE} tunnelproxy/${TUNNEL_PROXY_NAME}
kubectl wait --for=condition=available -n ${WORKSPACE_NAMESPACE} deploy -l control-plane=${TUNNEL_PROXY_NAME}-kubetunnel-reverse-proxy-rp
The output should look like this.
tunnelproxy.kubetunnel.d2iq.io/test condition met
tunnelproxy.kubetunnel.d2iq.io/test condition met
deployment.apps/${TUNNEL_PROXY_NAME}-kubetunnel-reverse-proxy-rp condition met
2. Verify that the TunnelProxy is correctly assigned and connected to your cluster.
curl -Lk -s -o /dev/null -w "%{http_code}" https://${TUNNEL_PROXY_EXTERNAL_DOMAIN}/nkp/grafana
The output should return a successful HTTP response status.
200
You can access the network-restricted cluster dashboards and use kubectl to manage its resources from the
Management cluster.
Note: These steps are only applicable if you do not set a WORKSPACE_NAMESPACE when creating a cluster. If you
already set a WORKSPACE_NAMESPACE, then you do not need to perform these steps since the cluster is already
attached to the workspace.
Starting with NKP 2.6, when you create a Managed Cluster with the NKP CLI, it attaches automatically to the
Management Cluster after a few moments.
However, if you do not set a workspace, the attached cluster will be created in the default workspace. To ensure
that the attached cluster is created in your desired workspace namespace, follow these instructions:
Procedure
1. Confirm you have your MANAGED_CLUSTER_NAME variable set with the following command:
echo ${MANAGED_CLUSTER_NAME}
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace.
nkp get kubeconfig --cluster-name ${MANAGED_CLUSTER_NAME} > ${MANAGED_CLUSTER_NAME}.conf
3. You can now either attach it in the UI (see the earlier section on attaching a cluster to a workspace through the UI) or attach your cluster to the desired workspace using the CLI.
Note: This is only necessary if you never set the workspace of your cluster upon creation.
6. You need to create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster.
kubectl -n default get secret ${MANAGED_CLUSTER_NAME}-kubeconfig -o go-template='{{.data.value}}{{ "\n"}}'
7. This will return a lengthy value. Copy this entire string for a secret using the template below as a reference.
Create a new attached-cluster-kubeconfig.yaml file.
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
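The template above is truncated in this extract. As a hypothetical sketch of the remaining fields (the field names and labels here are assumptions; verify against the complete attachment documentation before using):
apiVersion: v1
kind: Secret
metadata:
  name: <your-managed-cluster-name>-kubeconfig
  namespace: <workspace_namespace>                               # assumption: the workspace you are attaching to
  labels:
    cluster.x-k8s.io/cluster-name: <your-managed-cluster-name>   # assumption
data:
  value: <the kubeconfig secret value copied in the previous step>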
10. You can now view this cluster in your Workspace in the UI and you can confirm its status by running the
command below. It might take a few minutes to reach "Joined" status.
kubectl get kommanderclusters -A
If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally
administrated by a Management Cluster, refer to Platform Expansion: Conversion of an NKP Pro Cluster to
an NKP Ultimate Managed Cluster on page 515.
Procedure
1. Select the username in the top right corner, and then select Generate Token.
2. Select the cluster name and follow the instructions to assemble a kubeconfig for accessing its Kubernetes API.
Note: If the UI prompts you to log in, use the credentials you normally use to access the UI.
You can also retrieve a custom kubeconfig file by visiting the /token endpoint on the Kommander cluster domain (example URL: https://your-server-name.your-region-2.elb.service.com/token/).
Selecting the cluster’s name displays the instructions to assemble a kubeconfig for accessing its Kubernetes
API.
Downtime Considerations
Your NKP Pro cluster will not be accessible externally for several minutes during the expansion process. Any
configuration of the cluster’s Ingress that requires traefik-forward-auth authentication will be affected.
Note: Access from within the cluster through the Kubernetes service hostname (for example, http://SERVICE_NAME.NAMESPACE:PORT) is not affected.
• nkp-ceph-cluster-dashboard
• grafana-logging
• kube-prometheus-stack-alertmanager
• kube-prometheus-stack-grafana
• kube-prometheus-stack-prometheus
• kubernetes-dashboard
• traefik-dashboard
Other Services
To verify if your services are affected by traefik-forward-auth's downtime, run the following command:
kubectl get ingress -n <namespace> <your_customized_ingress_name>
Look for the traefik.ingress.kubernetes.io/router.middlewares field in the output. If this field contains
the value kommander-forwardauth@kubernetescrd, your service will be affected by the downtime.
SSO Configuration
After attachment, the SSO configuration of the Management cluster applies to the Managed (formerly Pro) cluster.
Any SSO configuration of the former Pro cluster will be deleted.
• If the Pro cluster has SSO configured but the Management cluster does not, you can copy your Pro cluster’s SSO
configuration (dex-controller resources) to the Management cluster before conversion.
• If your Management cluster has SSO configured and the Pro cluster has another SSO configuration, you can
choose to keep one or both. To keep the configuration of your Pro cluster, manually copy the dex-controller
resources to the Management cluster before conversion. NKP maintains the SSO configuration of your
Management cluster automatically unless you manually delete it.
Warning:
After conversion, any domain or certificate customizations you want to apply to your Managed cluster must
be done through the KommanderCluster. This object is now stored in the Management cluster.
Note: All NKP Platform Applications will be migrated from the NKP Pro Cluster to the NKP Ultimate Managed
Cluster.
Procedure
1. Prior to turning an NKP Pro cluster to an NKP Ultimate Managed Cluster, clone the Management Git Repository
using the following command:
nkp experimental gitops clone
2. Verify that the Git Repository has been successfully cloned to your local environment.
cd kommander
git remote -v
# The output from git remote -v looks like:
# origin https://<YOUR_CLUSTER_INGRESS_HOSTNAME>/nkp/kommander/git-operator/repositories/kommander/kommander.git (fetch)
Note: Back up and restore your cluster's applications before attempting to convert your NKP Pro cluster into an NKP Ultimate Managed cluster.
The instructions differ depending on the infrastructure provider of your NKP Pro cluster.
For AWS, see AWS Cluster Backup on page 517.
For Azure, vSphere, GCP, and pre-provisioned environments, see Azure, vSphere, GCP, or Pre-
provisioned Cluster Backup on page 521.
This section contains the instructions necessary to back up and restore the NKP Pro cluster on the AWS environment.
• Ensure Velero is installed on your Pro cluster. Use at least Velero CLI version 1.10.1.
For more information, see Velero Installation Using CLI on page 557.
• Ensure kubectl is installed.
For more information, see https://kubernetes.io/docs/tasks/tools/.
Procedure
Prepare your cluster.
Run the following commands in the NKP Pro cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
Preparing Velero
Procedure
3. Verify the configuration has been updated before proceeding with the next section.
kubectl -n kommander wait --for=condition=Ready kustomization velero
The output should look similar to this.
kustomization.kustomize.toolkit.fluxcd.io/velero condition met
Procedure
Add the AmazonEBSCSIDriverPolicy policy (see https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonEBSCSIDriverPolicy.html) to the control plane role control-plane.cluster-api-provider-aws.sigs.k8s.io.
aws iam attach-role-policy \
--role-name control-plane.cluster-api-provider-aws.sigs.k8s.io \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
This gives the EBS CSI driver, a volume manager, sufficient permissions to create volume snapshots.
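As an optional check (not part of the original procedure), you can confirm the policy is attached by listing the role's attached policies with the AWS CLI:
aws iam list-attached-role-policies \
  --role-name control-plane.cluster-api-provider-aws.sigs.k8s.io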
Procedure
Configure a VolumeSnapshotClass object on the cluster so that Velero can create a volume snapshot:
cat << EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: aws
  labels:
    velero.io/csi-volumesnapshot-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete
parameters:
EOF
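As an optional check (not part of the original procedure), confirm that the class exists and carries the Velero label:
kubectl get volumesnapshotclass aws --show-labels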
Warning: Run the following commands in the NKP Pro cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
2. Create a backup with Velero. Use the following flags to reduce the scope of the backup and only include the
applications that are affected during the expansion.
velero backup create pre-expansion \
  --include-namespaces="kommander,kommander-default-workspace,kommander-flux,kubecost" \
  --include-cluster-resources \
  --wait
After completion, the output should look similar to this.
Backup request "pre-expansion" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
................................................................................................
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe pre-expansion` and `velero backup logs pre-expansion`.
3. Verify the backup. Confirm that the backup has completed successfully.
velero backup describe pre-expansion
The following example output will vary depending on your cloud provider. Verify that it shows no errors and the
Phase is Completed.
Name: pre-expansion
Namespace: kommander
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.25.5
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=25
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: kommander, kommander-default-workspace, kommander-flux, kubecost
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
Hooks: <none>
This section contains the instructions necessary to back up and restore an NKP Pro cluster on Azure, vSphere, GCP or
Pre-provisioned environments.
This section describes how to prepare your cluster on AWS, Azure, vSphere, Google Cloud, or pre-
provisioned environment, so it can be backed up.
• Ensure Velero is installed on your Pro cluster. Use at least Velero CLI version 1.10.1.
For more information, see Velero Installation Using CLI on page 557.
• Ensure kubectl is installed.
For more information, see https://kubernetes.io/docs/tasks/tools/.
• Ensure you have admin rights to the NKP Pro cluster.
Procedure
Prepare your cluster.
Run the following commands in the NKP Pro cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
Preparing Velero
Procedure
1. Create an Override with a custom configuration for Velero. This custom configuration deploys the node-agent
service, which enables restic.
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: velero-overrides
  namespace: kommander
data:
  values.yaml: |
    ---
2. Reference the created Override in Velero’s AppDeployment to apply the new configuration.
cat << EOF | kubectl -n kommander patch appdeployment velero --type='merge' --patch-file=/dev/stdin
spec:
  configOverrides:
    name: velero-overrides
EOF
4. Verify the configuration has been updated before proceeding with the next section.
kubectl -n kommander wait --for=condition=Ready kustomization velero
The output should look similar to this.
kustomization.kustomize.toolkit.fluxcd.io/velero condition met
Annotate the pod to ensure restic backs up the Persistent Volumes (PVs) of the pods that will be affected
during the expansion process. These volumes contain the Git repository information of your NKP Pro
cluster.
Warning: Run the following commands in the NKP Pro cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
Procedure
Run the following command.
kubectl -n git-operator-system annotate pod git-operator-git-0 backup.velero.io/backup-volumes=data
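As an optional check (not part of the original procedure), confirm the annotation was applied:
kubectl -n git-operator-system get pod git-operator-git-0 \
  -o jsonpath='{.metadata.annotations.backup\.velero\.io/backup-volumes}'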
With this workflow, you can back up and restore your cluster’s applications. This backup contains
Kubernetes objects and the Persistent Volumes (PVs) of Git Operator pods. Given that Git Operator’s PVs
store information on your cluster’s state, you will be able to restore your cluster if required.
Warning: Run the following commands in the NKP Pro cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
1. Create a backup with Velero. Use the following flags to reduce the scope of the backup and only include the
applications that are affected during the expansion.
velero backup create pre-expansion \
  --include-namespaces="git-operator-system,kommander,kommander-default-workspace,kommander-flux,kubecost" \
  --include-cluster-resources \
  --snapshot-volumes=false --wait \
  --namespace kommander
After completion, the output should look similar to this.
Backup request "pre-expansion" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
................................................................................................
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe pre-expansion` and `velero backup logs pre-expansion`.
2. Verify the backup. Confirm that the backup has completed successfully.
velero backup describe pre-expansion --namespace kommander
The following example output will vary depending on your cloud provider. Verify that it shows no errors and the
Phase is Completed.
Name: pre-expansion
Namespace: kommander
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.25.5
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=25
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: git-operator-system, kommander, kommander-default-workspace, kommander-flux, kubecost
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: included
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
Hooks: <none>
Procedure
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown list on the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
skip the following steps and perform the steps in Cluster Attachment with Networking Restrictions on
page 488.
5. In the Cluster Configuration section, paste your kubeconfig file into the field or select Upload
kubeconfig File to specify the file.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig. You can edit this field to give your cluster a different name.
Warning: Run the following commands in the Management cluster. For general guidelines on setting the
context, see Commands within a kubeconfig File on page 31.
Note: After conversion, all Platform Applications will be in the Kommander Namespace in the Managed
Cluster.
What to do next
Post Conversion: Cleaning Clusters Running on Different Cloud Platforms on page 528
Warning:
Ingress configured with Traefik-Forward-Authentication in NKP (TFA) on page 592 will not be available during the expansion process; therefore, your NKP Pro cluster will not be accessible externally for several minutes. Access from within the cluster through the Kubernetes service hostname (for example, http://SERVICE_NAME.NAMESPACE:PORT) is not affected.
For more information, see Downtime Considerations on page 515.
Procedure
Warning: Run the following commands in the Management cluster. For general guidelines on setting the context,
see Commands within a kubeconfig File on page 31.
Note: After conversion, all Platform Applications will be in the Kommander Namespace in the Managed
Cluster.
What to do next
Post Conversion: Cleaning Up Cluster Autoscaler Configuration on page 526
Note: Run the following commands in the Management cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
Procedure
4. To check that the status of the deployment has the expected AVAILABLE count of 1, run the following command
and verify that the output is similar.
$ kubectl get deployment -n $WORKSPACE_NAMESPACE cluster-autoscaler-$CLUSTER_NAME
NAME READY UP-TO-DATE AVAILABLE AGE
cluster-autoscaler-<cluster-name> 1/1 1 1 1m
What to do next
Post Conversion: Cleaning Clusters Running on Different Cloud Platforms on page 528.
NKP Enterprise Management cluster host provider | NKP Enterprise Management cluster IAM permissions | NKP Pro cluster host provider
AWS | https://cluster-api-aws.sigs.k8s.io/topics/iam-permissions.html | AWS, GCP, vSphere, Pre-provisioned
GCP | https://cloud.google.com/iam/docs/overview | AWS, GCP, vSphere, Pre-provisioned
vSphere | https://docs.vmware.com/en/vRealize-Operations/Cloud/com.vmware.vcom.config.doc/GUID-F85638E3-937E-4E31-90D0-9D4A5E479292.html | AWS, GCP, vSphere, Pre-provisioned
Procedure
1. Following the conversion into an NKP Enterprise managed cluster, run the following command to move the CAPI
Objects.
nkp move capi-resources --from-kubeconfig <essential_cluster_kubeconfig> \
  --to-kubeconfig <enterprise_cluster_kubeconfig> --to-namespace ${WORKSPACE_NAMESPACE}
3. After moving the resources, run the following command to remove the CAPI controller manager deployments.
nkp delete capi-components --kubeconfig <essential_cluster_kubeconfig>
Note: Run the following commands in the Management cluster. For general guidelines on how to set the context, see
Commands within a kubeconfig File on page 31.
3. If the condition is not met yet, you can observe the conversion process.
4. Export the environment variable for the workspace namespace:
export WORKSPACE_NAMESPACE=<workspace_namespace>
5. Print the state of your cluster's KommanderCluster object through the CLI and observe the cluster conversion process.
kubectl get kommandercluster -n ${WORKSPACE_NAMESPACE} <essential_cluster_name> \
  -o go-template='{{range .status.conditions }}type: {{.type}} {{"\n"}}status: {{.status}} {{"\n"}}reason: {{.reason}} {{"\n"}}lastTxTime: {{.lastTransitionTime}} {{"\n"}}message: {{.message}} {{"\n\n"}}{{end}}'
type: IngressCertificateReady
status: True
reason: Ready
lastTxTime: 2023-02-24T14:49:09Z
message: Certificate is up to date and has not expired
type: CAPIResourceMoved
status: True
reason: Succeeded
lastTxTime: 2023-02-24T14:50:56Z
message: Moved CAPI resources from the attached cluster to management cluster
type: PreAttachmentCleanup
status: True
reason: Succeeded
lastTxTime: 2023-02-24T14:54:47Z
message: pre-attach cleanup succeeded
# [...]
• No errors in Output: If your output shows no errors, the error message is not related to a kubeconfig.
• Errors in Output: If the output shows an error, delete the KommanderCluster object through CLI:
kubectl delete kommandercluster -n <WORKSPACE_NAMESPACE> <WRONG_KOMMANDER_CLUSTER>
4. At this particular stage and in the context of converting your cluster, deleting your KommanderCluster will not
affect your environment. However, DO NOT delete your KommanderCluster in other scenarios, as it detaches
the referenced cluster from the Management cluster.
Finally, restart the cluster conversion process with the UI. For more information, see Converting a Pro Cluster
Into a Managed Cluster Using the UI on page 524.
The pro cluster has more than one instance of v1beta1/clusters.cluster.x-k8s.io
Warning: Nutanix does not support converting Pro clusters that contain the Cluster API resources of more than one
cluster.
Ensure your Pro cluster only contains its own CAPI resources and does not contain the CAPI resources of other
clusters.
Warning: Switch between NKP Ultimate Management and NKP Pro clusters for the following commands. For general
guidelines on how to set the context, see Commands within a kubeconfig File on page 31.
2. Disable the Flux controllers on the NKP Pro cluster to interrupt the expansion process:
kubectl -n kommander-flux delete deployment -l app.kubernetes.io/instance=kommander-flux
1. Retrieve your Managed cluster’s kubeconfig and write it to the <pro.conf> file:
KUBECONFIG=management.conf ./nkp get kubeconfig -n <WORKSPACE_NAMESPACE> -c <CAPI_CLUSTER_NAME> > pro.conf
Phase: Completed
Total items to be restored: 2411
Items restored: 2411
...
Warning: This feature is for advanced users and users in unique environments only. We highly recommend using
other documented methods to create clusters whenever possible.
Procedure
2. In the selected workspace Dashboard, select the Add Cluster option in the Actions dropdown list on the top-
right.
» Workspace: The workspace where this cluster belongs (if within the Global workspace).
» Cluster YAML: Paste or upload your customized set of cluster objects into this field. Only valid YAML is
accepted.
» Add Labels: By default, your cluster has labels that reflect the infrastructure provider provisioning. For
example, your AWS cluster might have a label for the datacenter region and provider: aws. Cluster labels
are matched to the selectors created for Projects on page 423. Changing a cluster label might add or
remove the cluster from projects.
4. To begin provisioning the NKP CLI cluster, click Create. It might take a few minutes for the cluster to become ready and fully deploy its components. The cluster automatically attempts to join and should resolve after it is fully provisioned.
NKP supports configuring a custom domain name for accessing the UI and other platform services, as well as setting
up manual or automatic certificate renewal or rotation. This section provides instructions and examples on how to
configure a customized domain and certificate on your Pro, Managed, or attached clusters.
• Configure a custom domain and certificate as part of the cluster’s installation process. This is only possible for
your Management/Pro cluster.
• Update your cluster’s current domain and certificate configuration as part of your cluster management operations.
For information, see Cluster Operations Management on page 339. You can do this for any cluster type in
your environment.
KommanderCluster Object
The KommanderCluster resource is an object that contains key information for all types of clusters that are part of
your environment, such as:
Location of the KommanderCluster and Issuer Objects: Management, Managed or Attached Cluster
In the Management or Pro cluster, both the KommanderCluster and issuer objects are stored on the same cluster.
The issuer can be referenced as an Issuer, ClusterIssuer, or certificateSecret.
In the Managed and attached clusters, the KommanderCluster object is stored on the Management cluster. The Issuer, ClusterIssuer, or certificateSecret is stored on the Managed or Attached cluster.
NKP supports configuring a custom domain name for accessing the UI and other platform services, as well as setting
up manual or automatic certificate renewal or rotation. This section provides instructions and examples on how to
configure a customized domain and certificate on Pro, Management, Managed, or Attached clusters.
NKP supports configuring a custom domain name for accessing the UI and other platform services, as well as setting
up manual or automatic certificate renewal or rotation. This section provides instructions and examples on how to
configure the NKP installation to add a customized domain and certificate on your Pro cluster or your Management
cluster.
Configuration Options
After you have installed the Kommander component of NKP, you can configure a custom domain and certificate by
modifying the KommanderCluster object of your cluster. You have several options to establish a custom domain
and certificate.
Note: If you want the cert-manager to automatically handle certificate renewal and rotation, choose an ACME-
supported Certificate Authority.
• Update the KommanderCluster by referencing the name of the created Issuer or ClusterIssuer in the
spec.ingress.issuerRef field. Enter the custom domain name in the spec.ingress.hostname field:
Procedure
1. Create an Issuer or ClusterIssuer with your certificate provider information. Store this object in the cluster
where you want to customize the certificate and domain.
a. If you want to use NKP’s default certificate authority, see Configuring a Custom Certificate With Let’s
Encrypt on page 536.
b. For an advanced configuration example, select I want to use an automatically-generated certificate
with ACME and require advanced configuration.
2. Update the KommanderCluster by referencing the name of the created Issuer or ClusterIssuer in the
spec.ingress.issuerRef field.
Enter the custom domain name in the spec.ingress.hostname field.
cat <<EOF | kubectl -n <workspace_namespace> --kubeconfig <management_cluster_kubeconfig> patch \
  kommandercluster <cluster_name> --type='merge' --patch-file=/dev/stdin
spec:
  ingress:
    hostname: <cluster_hostname>
    issuerRef:
      name: <issuer_name>
      kind: Issuer # or ClusterIssuer
EOF
Warning: Certificates issued by another Issuer: You can also configure a certificate issued by another Certificate Authority. In this case, the CA determines which information to include in the configuration.
Procedure
1. Obtain or create a certificate that is customized for your hostname. Store this object in the workspace namespace
of the target cluster.
2. Create a secret with the certificate in the cluster’s namespace. Give it a name by replacing
<certificate_secret_name>:
kubectl create secret generic -n "${WORKSPACE_NAMESPACE}" <certificate_secret_name> \
--from-file=ca.crt=$CERT_CA_PATH \
--from-file=tls.crt=$CERT_PATH \
--from-file=tls.key=$CERT_KEY_PATH \
--type=kubernetes.io/tls
Note: For Kommander to access the secret containing the certificate, it must be located in the workspace
namespace of the target cluster.
2. Configure the Management cluster to use custom-domain.example.com with a certificate issued by Let's Encrypt by referencing the created ClusterIssuer.
cat <<EOF | kubectl -n kommander --kubeconfig <management_cluster_kubeconfig> patch \
  kommandercluster host-cluster --type='merge' --patch-file=/dev/stdin
spec:
  ingress:
    hostname: custom-domain.example.com
    issuerRef:
      name: custom-acme-issuer
      kind: ClusterIssuer
EOF
Procedure
2. If the ingress is still being provisioned, the output looks similar to this.
[...]
Conditions:
Last Transition Time: 2022-06-24T07:48:31Z
Message: Ingress service object was not found in the cluster
Reason: IngressServiceNotFound
Status: False
Type: IngressAddressReady
Warning: After successfully detaching the cluster, manually disconnect the attached cluster's Flux installation from the management Git repository. Otherwise, changes to apps in the managed cluster's workspace will still be reflected on the cluster you just detached. Ensure your nkp configuration references the target cluster by setting the KUBECONFIG environment variable to the appropriate kubeconfig file location. For more information, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/. Alternatively, instead of setting the KUBECONFIG environment variable, use the --kubeconfig=cluster_name.conf flag. Then, run kubectl -n kommander-flux patch gitrepo management -p '{"spec":{"suspend":true}}' --type merge so that the cluster's workloads are no longer managed by Kommander.
If you created a managed cluster with Kommander, you cannot disconnect it, but you can delete it. This completely
removes the cluster and all of its cloud assets.
We recommend deleting a managed cluster through the NKP UI.
Warning: If you delete the Management (Konvoy) cluster, you cannot use Kommander to delete any Managed clusters created by Kommander. If you want to delete all clusters, ensure you delete any Managed clusters before finally deleting the Management cluster.
Statuses: For a list of possible states a cluster can have when it is getting disconnected or deleted, see Cluster
Statuses on page 539.
Troubleshooting: I cannot detach an attached cluster that is “Pending,” OR the cluster I deleted through the CLI
still appears in the UI with an “Error” state.
Procedure
1. Determine the KommanderCluster resource backing the cluster you tried to attach/detach.
kubectl -n WORKSPACE_NAMESPACE get kommandercluster
Replace WORKSPACE_NAMESPACE with the actual current workspace name. You can find this name by going to
https://YOUR_CLUSTER_DOMAIN_OR_IP_ADDRESS/nkp/kommander/dashboard/workspaces in your
browser.
3. If the resource is not removed after a short time, remove its finalizers.
kubectl -n WORKSPACE_NAMESPACE patch kommandercluster CLUSTER_NAME --type json -p
'[{"op":"remove", "path":"/metadata/finalizers"}]'
This removes the cluster from the NKP UI.
Management Cluster
A guide for the Management Cluster and the Management Cluster Workspace
When you install Kommander, the host cluster is attached to the Management Cluster Workspace, which is called
Management Cluster in the Global workspace dashboard, and Kommander Host inside the Management Cluster
Workspace. This allows the Management Cluster to be included in Projects on page 423 and enables the
management of its Platform Applications on page 386 from the Management Cluster Workspace.
Note: Do not attach a cluster in the "Management Cluster Workspace" workspace. This workspace is reserved for your
Kommander Management cluster only.
Editing
Because the Management Cluster is attached like any other cluster, you can edit it to add or remove Labels and then use these labels to include the Management Cluster in Projects inside the Management Cluster Workspace.
Disconnecting
The Management Cluster cannot be disconnected from the GUI like other attached clusters. Because of this, the Management Cluster Workspace cannot be deleted from the GUI, as it always contains the Management Cluster.
Cluster Statuses
A cluster card’s status line displays both the current status and the version of Kubernetes running in the
cluster.
These statuses only appear on Managed clusters.
Status | Description
Pending | This is the initial state when a cluster is created or connected.
Cluster Resources
The Resource graphs on a cluster card show you a cluster's resource requests, limits, and usage. This allows a quick, visual scan of cluster health. Hover over each resource to get details for that specific cluster resource.
Resource | Description
CPU Requests | The requested portion of the total allocatable CPU resource for the cluster. Measured in number of cores, such as 0.5 cores.
CPU Limits | The portion of the total allocatable CPU resource to which the cluster is limited. Measured in number of cores, such as 0.5 cores.
CPU Usage | The amount of the allocatable CPU resource being consumed. It cannot be higher than the configured CPU limit. Measured in number of cores, such as 0.5 cores.
Memory Requests | The requested portion of the cluster's total allocatable memory resource. Measured in bytes, such as 64 GiB.
Memory Limits | The portion of the allocatable memory resource to which the cluster is limited. Measured in bytes, such as 64 GiB.
Memory Usage | The amount of the allocatable memory resource being consumed. It cannot be higher than the configured memory limit. Measured in bytes, such as 64 GiB.
Disk Requests | The requested portion of the allocatable ephemeral storage resource for the cluster. Measured in bytes, such as 64 GiB.
Disk Limits | The portion of the allocatable ephemeral storage resource to which the cluster is limited. Measured in bytes, such as 64 GiB.
Status | Description
Enabled | The application is enabled, but the status on the cluster is not available.
Pending | The application is waiting to be deployed.
Deploying | The application is currently being deployed to the cluster.
Deployed | The application has successfully been deployed to the cluster.
Deploy Failed | The application failed to deploy to the cluster.
• https://github.com/kubernetes-sigs/kubefed/blob/master/docs/concepts.md
Velero Configuration
For default installations, NKP deploys Velero integrated with Rook Ceph, operating inside the same cluster.
For more information on Velero, see https://velero.io/. For more information on Rook Ceph, see https://rook.io/.
For production use cases, Nutanix advises providing an external storage class to use with Rook Ceph. For more information, see Rook Ceph in NKP on page 633.
• Ensure you have installed Velero (included in the default NKP installation).
• Ensure you have installed the Velero CLI. For more information, see Velero Installation Using CLI on
page 557.
• Ensure that you have created an S3 bucket with AWS. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html.
a. Set the BUCKET environment variable to the name of the S3 bucket you want to use as backup storage.
export BUCKET=<aws-bucket-name>
b. Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace. Replace
<workspace_namespace> with the name of the target workspace.
export WORKSPACE_NAMESPACE=<workspace_namespace>
This can be the kommander namespace for the Management cluster or any other additional workspace
namespace for attached or managed clusters. To list all available workspace namespaces, use the kubectl
get kommandercluster -A command.
c. Set the CLUSTER_NAME environment variable. Replace <target_cluster> with the name of the cluster where you want to set up Velero.
export CLUSTER_NAME=<target_cluster>
b. Create a secret on the cluster where you are installing and configuring Velero by referencing the file created in
the previous step. This can be the Management, a Managed, or an Attached cluster.
In this example, the secret’s name is velero-aws-credentials.
kubectl create secret generic -n ${WORKSPACE_NAMESPACE} velero-aws-credentials \
  --from-file=aws=aws-credentials --kubeconfig=${CLUSTER_NAME}.conf
Procedure
1. Create a ConfigMap to enable Velero to use AWS S3 buckets as backup storage location.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: velero-overrides
data:
  values.yaml: |
    configuration:
      backupStorageLocation:
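For reference, a complete override that points Velero at an S3 backup location might look like the following sketch. The value keys mirror the apps.velero example shown later in this topic; the bucket, region, and secret name are placeholders for your environment.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: velero-overrides
data:
  values.yaml: |
    configuration:
      backupStorageLocation:
        bucket: ${BUCKET}
        config:
          region: <AWS_REGION> # such as us-west-2
          # profile should be set to the AWS profile name mentioned in the secret
          profile: default
    credentials:
      # Name of a pre-existing secret in the Velero namespace that holds the IAM credentials
      existingSecret: velero-aws-credentials
EOF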
2. Patch the Velero AppDeployment to reference the created ConfigMap with the Velero overrides.
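The following sketch of the patch follows the same pattern used earlier in this guide for the velero-overrides ConfigMap; it assumes the AppDeployment is named velero and lives in the workspace namespace (kommander on the Management cluster).
cat <<EOF | kubectl -n ${WORKSPACE_NAMESPACE} patch appdeployment velero --type='merge' --patch-file=/dev/stdin
spec:
  configOverrides:
    name: velero-overrides
EOF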
Alternatively, on Management or Pro clusters, you can configure Velero by editing the kommander.yaml and rerunning the installation. The following section describes this alternative configuration path.
Procedure
Warning: Before running this command, ensure the kommander.yaml is the configuration file you are currently
using for your environment. Otherwise, your previous configuration will be lost.
2. Configure NKP to load the plugins and to include the secret in the apps.velero section.
This process has been tested to work with plugins for AWS v1.1.0 and Azure v1.5.1. More recent versions of
these plugins can be used, but have not been tested by Nutanix.
...
  velero:
    values: |
      configuration:
        backupStorageLocation:
          bucket: ${BUCKET}
          config:
            region: <AWS_REGION> # such as us-west-2
            s3ForcePathStyle: "false"
            insecureSkipTLSVerify: "false"
            s3Url: ""
            # profile should be set to the AWS profile name mentioned in the secret
            profile: default
      credentials:
        # With the proper IAM permissions with access to the S3 bucket,
        # you can attach the EC2 instances using the IAM Role, OR fill in "existingSecret" OR "secretContents" below.
        #
        # Name of a pre-existing secret (if any) in the Velero namespace
        # that should be used to get IAM account credentials.
        existingSecret: velero-aws-credentials
        # The key must be named "cloud", and the value corresponds to the entire content of your IAM credentials file.
        # For more information, consult the documentation for the velero plugin for AWS at:
        # [AWS] https://github.com/vmware-tanzu/velero-plugin-for-aws/blob/main/README.md
        secretContents:
        # cloud: |
        #   [default]
        #   aws_access_key_id=<REDACTED>
        #   aws_secret_access_key=<REDACTED>
...
Procedure
b. Check that the backup storage location is Available and that it references the correct S3 bucket.
kubectl get backupstoragelocations -n ${WORKSPACE_NAMESPACE} -oyaml
Note: If the BackupStorageLocation is not Available, view any error events by using: kubectl describe
backupstoragelocations -n ${WORKSPACE_NAMESPACE}
a. Create a test backup that is stored in the location you created in the previous section.
velero backup create aws-velero-testbackup -n ${WORKSPACE_NAMESPACE} \
  --kubeconfig=${CLUSTER_NAME}.conf \
  --storage-location <aws-backup-location-name> \
  --snapshot-volumes=false
• Ensure you have installed Velero (included in the default NKP installation).
• Ensure you have installed the Velero CLI. For more information, see Velero Installation Using CLI on
page 557.
Procedure
e. Set the AZURE_STORAGE_ACCOUNT_ID variable to the unique identifier of the storage account you want to use
for the backup.
To obtain the ID, get the resource ID for a storage account. For more information, see https://learn.microsoft.com/en-us/azure/storage/common/storage-account-get-info?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json&tabs=azure-cli#get-the-resource-id-for-a-storage-account. The output shows the entire location path of the storage account. You only need the last part, or storage account name, to set the variable.
AZURE_STORAGE_ACCOUNT_ID=<storage-account-name>
f. Set the AZURE_BACKUP_SUBSCRIPTION_ID variable to the unique identifier of the subscription you want to
use for the backup.
To obtain the ID and the Azure account list, see https://learn.microsoft.com/en-us/cli/azure/account?view=azure-cli-latest#az-account-list.
AZURE_BACKUP_SUBSCRIPTION_ID=<azure-subscription-id>
g. Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace.
Replace <workspace_namespace> with the name of the target workspace:
export WORKSPACE_NAMESPACE=<workspace_namespace>
This can be the kommander namespace for the Management cluster or any other additional workspace
namespace for Attached or Managed clusters. To list all available workspace namespaces, use the kubectl
get kommandercluster -A command.
h. Set the CLUSTER_NAME environment variable. Replace <target_cluster> with the name of the cluster
where you want to set up Velero.
export CLUSTER_NAME=<target_cluster>
a. Create a credentials-velero file with the information required to create a secret. Use the same credentials
that you employed when creating the cluster.
These credentials should not be Base64 encoded because Velero will not read them properly.
Replace the variables in <...> with your environment's information. See your Microsoft Azure account to
look up the values.
cat << EOF > ./credentials-velero
AZURE_SUBSCRIPTION_ID=${AZURE_BACKUP_SUBSCRIPTION_ID}
AZURE_TENANT_ID=<AZURE_TENANT_ID>
AZURE_CLIENT_ID=<AZURE_CLIENT_ID>
AZURE_CLIENT_SECRET=<AZURE_CLIENT_SECRET>
AZURE_BACKUP_RESOURCE_GROUP=${AZURE_BACKUP_RESOURCE_GROUP}
AZURE_CLOUD_NAME=AzurePublicCloud
EOF
Procedure
1. Create a ConfigMap to enable Velero to use Azure blob containers as backup storage location.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: velero-overrides
data:
  values.yaml: |
    initContainers:
    - name: velero-plugin-for-microsoft-azure
      image: velero/velero-plugin-for-microsoft-azure:v1.5.1
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - mountPath: /target
          name: plugins
    credentials:
      extraSecretRef: velero-azure-credentials
EOF
2. Patch the Velero AppDeployment to reference the created ConfigMap with the Velero overrides.
Alternatively, on Management or Pro clusters, you can configure Velero by editing the kommander.yaml and rerunning the installation. The following section describes this alternative configuration path.
Procedure
Warning: Before running this command, ensure the kommander.yaml is the configuration file you are currently
using for your environment. Otherwise, your previous configuration will be lost.
2. Configure NKP to load the plugins and to include the secret in the apps.velero section.
This process has been tested to work with plugin Azure v1.5.1. More recent versions of these plugins can be used,
but Nutanix has not tested them.
...
  velero:
    values: |
      initContainers:
      - name: velero-plugin-for-microsoft-azure
        image: velero/velero-plugin-for-microsoft-azure:v1.5.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
          - mountPath: /target
            name: plugins
      credentials:
        extraSecretRef: velero-azure-credentials
...
Procedure
b. Check that the backup storage location is Available and that it references the correct Azure blob container.
kubectl get backupstoragelocations -n ${WORKSPACE_NAMESPACE} -oyaml
a. Create a test backup that is stored in the location you created in the previous section.
velero backup create azure-velero-testbackup -n ${WORKSPACE_NAMESPACE} \
--kubeconfig=${CLUSTER_NAME}.conf \
--storage-location <azure-backup-location-name> \
--snapshot-volumes=false
Note: If your backup wasn’t created, Velero might have had an issue installing the plugin.
• Ensure you have installed Velero (included in the default NKP installation).
• Ensure you have installed the Velero CLI. For more information, see Velero Installation Using CLI on
page 557.
• Ensure you have installed the gcloud CLI. For more information, see https://cloud.google.com/sdk/docs/install.
• (Optional) You can install the gsutil CLI or opt to create buckets through the GCS Console. For more information,
see https://cloud.google.com/storage/docs/gsutil_install.
• Ensure you have created a GCS bucket. For more information, see https://cloud.google.com/storage/docs/creating-buckets.
• Ensure you have sufficient access rights to the bucket you want to use for backup. For more information on GCP-related access control, see https://cloud.google.com/storage/docs/access-control.
Procedure
a. Set the BUCKET environment variable to the name of the GCS bucket you want to use as backup storage.
export BUCKET=<GCS-bucket-name>
b. Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace.
Replace <workspace_namespace> with the name of the target workspace:
export WORKSPACE_NAMESPACE=<workspace_namespace>
This can be the kommander namespace for the Management cluster or any other additional workspace namespace for Attached or Managed clusters. To list all available workspace namespaces, use the kubectl get kommandercluster -A command.
a. Create a credentials-velero file with the information required to create a secret. Use the same credentials that you employed when creating the cluster.
Replace <service-account-email> with the email address you used to grant permissions to your bucket. The address usually follows the format <service-account-user>@<gcp-project>.iam.gserviceaccount.com.
gcloud iam service-accounts keys create credentials-velero \
--iam-account <service-account-email>
Procedure
1. Create a ConfigMap to enable Velero to use GCS buckets as backup storage location.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: velero-overrides
data:
  values.yaml: |
    initContainers:
    - name: velero-plugin-for-gcp
      image: velero/velero-plugin-for-gcp:v1.5.0
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - mountPath: /target
          name: plugins
    credentials:
      extraSecretRef: velero-gcp-credentials
EOF
2. Patch the Velero AppDeployment to reference the created ConfigMap with the Velero overrides.
Alternatively, on Management or Pro clusters, you can configure Velero by editing the kommander.yaml and rerunning the installation. The following section describes this alternative configuration path.
Procedure
Warning: Before running this command, ensure the kommander.yaml is the configuration file you are currently
using for your environment. Otherwise, your previous configuration will be lost.
2. Configure NKP to load the plugins and to include the secret in the apps.velero section.
This process has been tested to work with plugin GCP v1.5.0. More recent versions of these plugins can be used,
but Nutanix has not tested them.
...
  velero:
    values: |
      initContainers:
      - name: velero-plugin-for-gcp
        image: velero/velero-plugin-for-gcp:v1.5.0
        imagePullPolicy: IfNotPresent
        volumeMounts:
          - mountPath: /target
            name: plugins
      credentials:
        extraSecretRef: velero-gcp-credentials
...
Procedure
b. Check that the backup storage location is Available and that it references the correct GCS bucket.
kubectl get backupstoragelocations -n ${WORKSPACE_NAMESPACE} -oyaml
a. Create a test backup that is stored in the location you created in the previous section.
velero backup create gcp-velero-testbackup -n ${WORKSPACE_NAMESPACE} \
--kubeconfig=${CLUSTER_NAME}.conf \
--storage-location <gcp-backup-location-name> \
--snapshot-volumes=false
Note: If your backup wasn’t created, Velero might have had an issue installing the plugin.
Velero Backup
If you do not want to use Rook Ceph to store Velero backups, you can configure Velero to use the default
cloud provider storage.
• By default, NKP sets up Velero to use Rook Ceph over TLS using a self-signed certificate.
• As a result, when using certain commands, you might be asked to use the --insecure-skip-tls-verify flag.
Again, the default setup is not suitable for production use cases.
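For example, commands that read backup data from the object store might need the flag; the backup name here is a placeholder:
velero backup logs <backup-name> -n kommander --insecure-skip-tls-verify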
Install the Velero command-line interface. For more information, see https://velero.io/docs/v1.5/basic-install/#install-the-cli.
In NKP, the Velero platform application is installed in the kommander namespace instead of velero. Thus, after
installing the CLI, we recommend that you set the Velero CLI namespace config option so that subsequent Velero
CLI invocations will use the correct namespace:
velero client config set namespace=kommander
Backup Operations
Velero provides the following basic administrative functions to back up production clusters:
Note:
• If you want to back up your cluster in the scope of Platform Expansion: Conversion of an NKP
Pro Cluster to an NKP Ultimate Managed Cluster on page 515, that is, from NKP Pro cluster to
an NKP Ultimate Managed cluster, see Cluster Applications and Persistent Volumes Backup on
page 517.
• If you require a custom backup location, see how to create one for Velero with AWS: Establishing
a Backup Location on page 548, Velero with Azure: Establishing a Backup Location on
page 552, and Velero with GCP: Establishing a Backup Location on page 556.
Procedure
1. Specify the workspace namespace of the cluster for which you want to configure the backup.
export WORKSPACE_NAMESPACE=<workspace_namespace>
2. Specify the cluster for which you want to create the backup.
export CLUSTER_NAME=<target_cluster_name>
Procedure
Warning: NKP default backups do not support the creation of Volume Snapshots.
These default settings take effect after the cluster is created. If you install NKP with the default platform services
deployed, the initial backup starts after the cluster is successfully provisioned and ready for use.
Procedure
Run the following command.
velero create schedule <backup-schedule-name> -n ${WORKSPACE_NAMESPACE} \
--kubeconfig=${CLUSTER_NAME}.conf \
--snapshot-volumes=false \
--schedule="@every 8h"
Procedure
Backing Up on Demand
Procedure
Create a backup by running the following command.
velero backup create <backup-name> -n ${WORKSPACE_NAMESPACE} \
--kubeconfig=${CLUSTER_NAME}.conf \
--snapshot-volumes=false
Procedure
a. Ensure the following ResourceQuota is not configured on your cluster (this ResourceQuota will be automatically restored).
kubectl -n kommander delete resourcequota one-kommandercluster-per-kommander-workspace
b. Turn off the Workspace validation webhooks. Otherwise, you will not be able to restore Workspaces with pre-configured namespaces. If the validation webhook named kommander-validating is present, it must be modified with this command.
kubectl patch validatingwebhookconfigurations kommander-validating \
a. To list the available backup archives for your cluster, run the following command.
velero backup get
b. Check your deployment to verify that the configuration change was applied correctly.
helm get values -n kommander velero
c. If restoring your cluster from a backup, use Read Only Backup Storage. To restore cluster data on demand
from a selected backup snapshot available in the cluster, run a command similar to the following.
velero restore create --from-backup <BACKUP-NAME>
If you are restoring using Velero from the default setup (and not using an external bucket or blob to store your
backups), you might see an error when describing or viewing the logs of your backup restore. This is a known
issue when restoring from an object store that is not accessible from outside your cluster. However, you can
review the success of the backup restore by confirming the Phase is Completed and not in error, and viewing
the logs by running this kubectl command:
kubectl logs -l name=velero -n kommander --tail -1
Logging
Nutanix Kubernetes Platform (NKP) comes with a pre-configured logging stack that allows you to collect and
visualize pod and admin log data at the Workspace level. The logging stack is also multi-tenant capable, and multi-
tenancy is enabled at the Project level through role-based access control (RBAC).
By default, logging is disabled on managed and attached clusters. You need to enable the logging stack applications
explicitly on the workspace to make use of these capabilities.
The primary components of the logging stack include these platform services:
• BanzaiCloud Logging-operator
• Grafana and Grafana Loki
• Fluent Bit and Fluentd
In addition to these platform services, logging relies on other software and system facilities, including the container
runtime, the journal facility, and system configuration, to collect logs and messages from all the machines in the
cluster.
The following diagram illustrates how different components of the logging stack collect log data and provide
information about clusters:
The NKP logging stack aggregates logs from applications and nodes running inside your cluster.
Logging Operator
This section contains information about setting up the logging operator to manage the Fluent Bit and Fluentd
resources.
Fluent Bit can be configured to use a hostPath volume to store the buffer information, so it can be picked up again
when Fluent Bit restarts.
For more information on Fluent Bit and the Fluent Bit log collector, see https://kube-logging.dev/docs/logging-infrastructure/fluentbit/#hostpath-volumes-for-buffers-and-positions.
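A minimal sketch of what such a configuration can look like on the Logging resource, based on the upstream kube-logging documentation linked above. The paths are example values, and how these values are applied in NKP (for example, through logging-operator overrides) can vary by version.
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: logging-with-hostpath-buffers
spec:
  controlNamespace: ${WORKSPACE_NAMESPACE}
  fluentbit:
    # Persist the Fluent Bit buffer and position database on the node
    # so they survive Fluent Bit restarts.
    bufferStorageVolume:
      hostPath:
        path: /var/log/fluentbit-buffers
    positiondb:
      hostPath:
        path: /var/log/fluentbit-positiondb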
For more information on Logging in relation to how it is used in NKP, refer to these pages in our Help Center:
Logging Stack
Depending on the application workloads you run on your clusters, you might find that the default settings for the NKP
logging stack do not meet your needs. In particular, if your workloads produce lots of log traffic, you might find you
need to adjust the logging stack components to capture all the log traffic properly. Follow the suggestions below to
tune the logging stack components as needed.
Grafana Dashboard
In NKP, if the Prometheus Monitoring (kube-prometheus-stack) platform application is enabled, you can view
the Logging Operator dashboard in the Grafana UI.
You can also improve Fluentd throughput by turning off the buffering for loki clusterOutput.
Example Configuration
You can see an example configuration of the logging operator in Logging Stack Application Sizing
Recommendations on page 377.
For more information on performance tuning for Fluentd 1.0, see https://docs.fluentd.org/deployment/performance-tuning-single-process.
Grafana Loki
NKP deploys Loki in Microservice mode. This provides you with the highest flexibility in terms of scaling.
For more information on Microservice mode, see https://grafana.com/docs/loki/latest/get-started/deployment-modes/#microservices-mode.
In a high log traffic environment, Nutanix recommends the following:
• Keep the number of Distributor pods much lower than the number of Ingester pods (see the sketch below).
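The following sketch shows one way to express this through a Grafana Loki override. The value keys follow the upstream loki-distributed Helm chart and the override name is illustrative; confirm the exact keys for the grafana-loki app version in your environment.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-loki-overrides   # illustrative override name
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ingester:
      replicas: 3    # scale Ingesters with log volume
    distributor:
      replicas: 1    # keep Distributors well below the Ingester count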
Grafana Dashboard
In NKP, if Prometheus Monitoring (kube-prometheus-stack) platform app is enabled, you can view the Loki
dashboards in Grafana UI.
Example Configuration
You can see an example configuration of Grafana Loki in Logging Stack Application Sizing Recommendations on page 377.
For more information:
Rook Ceph
Ceph is the default S3 storage provider. In NKP, a Rook Ceph Operator and a Rook Ceph Cluster are deployed together to provide a Ceph Cluster.
Example Configuration
You can see an example configuration in Rook Ceph Cluster Sizing Recommendations on page 380.
Storage
The default configuration of the Rook Ceph Cluster in NKP has a 33% overhead in data storage for redundancy. This means that if you allocate 1000 GB of data disks to your Rook Ceph Cluster, approximately 750 GB is available to store your data. Account for this overhead when planning the capacity of your data disks to prevent issues.
Audit Log
Overhead
Enabling audit logging requires additional computing and storage resources.
When you enable audit logging by enabling the kommander-fluent-bit AppDeployment, inbound traffic to the logging stack increases by approximately 3-4 times. Thus, when enabling the audit log, consider scaling up all components in the logging stack mentioned above.
Warning: On the Management cluster, the Fluent Bit application is disabled by default. The amount of admin logs ingested to Loki requires additional disk space to be configured on the rook-ceph-cluster. Enabling admin logs might use around 2 GB/day per node. For more details on how to configure the Ceph Cluster, see Rook Ceph in NKP on page 633.
Workspace-level Logging
How to enable Workspace-level Logging for use with NKP.
Logging is disabled by default on managed and attached clusters. You will need to enable logging features explicitly
at the Workspace level if you want to capture and view log data.
Warning: You must perform these procedures to enable multi-tenant logging at the Project level as well.
Logging Architecture
The NKP logging stack architecture provides a comprehensive logging solution for the NKP platform. It combines
Fluent Bit, Fluentd, Loki, and Grafana components to collect, process, store, and visualize log data. The architecture
establishes a robust logging solution by assigning specific roles to each of those components.
Components:
• Fluent Bit - Fluent Bit is a lightweight log processor and forwarder that collects log data from various sources,
such as application logs or Kubernetes components. It forwards the collected logs to Fluentd for further
processing.
• Fluentd - Fluentd is a powerful and flexible log aggregator that receives log data from Fluent Bit, processes it, and
forwards it to the Loki Distributor. Fluentd can handle various log formats and enrich the log data with additional
metadata before forwarding it.
• Distributor: Receives log streams, partitions them into chunks, and forwards these chunks to the Loki Ingester
component.
• Gateway: Acts as a single access point to various Loki components, routing requests between the Distributor,
Query Frontend, and other components as needed.
• Ingester: Compresses, indexes, and persists received log chunks.
• Querier: Fetches log chunks from the Ingester, decompresses and filters them based on the query, and returns
the results to the Query Frontend.
• Query Frontend: Splits incoming queries into smaller parts, forwards these to the Loki Querier component, and
combines the results from all Queriers before returning the final result.
• Grafana: Grafana is a visualization and analytics platform that supports Loki as one of its data sources.
Grafana provides a user-friendly interface for querying and visualizing log data based on user-defined
dashboards and panels.
Workflow
• Write Path:
• Fluent Bit instances running on each node collect log data from various sources, like application logs or
Kubernetes components.
• Fluent Bit forwards the collected log data to the Fluentd instance.
• Fluentd processes the received log data and forwards it to the Loki Distributor through the Loki Gateway.
• The Loki Distributor receives the log streams, partitions them into chunks, and forwards these chunks to the
Loki Ingester component.
• Loki Ingesters are responsible for compressing, indexing, and persisting the received log chunks.
• Read Path:
• When a user queries logs through Grafana, the request goes to the Loki Gateway, which routes it to the Loki
Query Frontend.
• The Query Frontend splits the query into smaller parts and forwards these to the Loki Querier component.
• Loki Queriers fetch the log chunks from the Loki Ingester, decompress and filter them based on the query, and
return the results to the Query Frontend.
• The Query Frontend combines the results from all Queriers and returns the final result to Grafana through the
Loki Gateway.
• Grafana visualizes the log data based on the user's dashboard configuration.
Procedure
3. Ensure traefik and cert-manager are enabled on your cluster. These are deployed by default unless you
modify your configuration.
5. Select the three-dot button from the bottom-right corner of the cards for Rook Ceph and Rook Ceph Cluster,
then click Enable. On the Enable Workspace Platform Application page, you can add a customized
configuration for settings that best fit your organization. You can leave the configuration settings unchanged to
enable with default settings.
7. Repeat the process for the Grafana Loki, Logging Operator, and Grafana Logging applications.
8. You can verify the cluster logging stack installation by waiting until the cards have a Deployed checkmark on the Cluster Application page, or you can verify the Cluster Logging Stack installation through the CLI.
Warning: We do not recommend installing Fluent Bit, which is responsible for collecting admin logs, unless you
have configured the Grafana Loki Ceph Cluster Bucket with sufficient storage space. The amount of admin logs
ingested to Loki requires additional disk space to be configured on the rook-ceph-cluster. Enabling admin
logs might use around 2GB/day per node. For details on how to configure the Ceph Cluster, see Rook Ceph in
NKP on page 633.
Procedure
1. Execute the following command to get the name and namespace of your workspace.
nkp get workspaces
Then copy the values under the NAME and NAMESPACE columns for your workspace.
4. Ensure that Cert-Manager and Traefik are enabled in the workspace. To find out if the applications are enabled on the management cluster workspace, run the following command.
nkp get appdeployments --workspace ${WORKSPACE_NAME}
5. You can confirm that the applications are deployed on the managed or attached cluster by running this kubectl
command in that cluster.
Ensure you switch to the correct context or kubeconfig of the attached cluster for the following kubectl
command. For more information, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/.
kubectl get helmreleases -n ${WORKSPACE_NAMESPACE}
6. Copy these commands and run them on the management cluster from a command line to create the Logging Operator, Rook Ceph, Rook Ceph Cluster, Grafana Loki, and Grafana Logging AppDeployments.
nkp create appdeployment logging-operator --app logging-operator-3.17.9 --workspace ${WORKSPACE_NAME}
nkp create appdeployment rook-ceph --app rook-ceph-1.10.3 --workspace ${WORKSPACE_NAME}
nkp create appdeployment rook-ceph-cluster --app rook-ceph-cluster-1.10.3 --workspace ${WORKSPACE_NAME}
nkp create appdeployment grafana-loki --app grafana-loki-0.69.16 --workspace ${WORKSPACE_NAME}
nkp create appdeployment grafana-logging --app grafana-logging-6.57.4 --workspace ${WORKSPACE_NAME}
Then, you can verify the cluster logging stack installation. For more information, see Verifying the Cluster
Logging Stack Installation on page 571.
To deploy the applications to selected clusters within the workspace, refer to the Cluster-scoped Application
Configuration from the NKP UI on page 398.
Warning: We do not recommend installing Fluent Bit, which is responsible for collecting admin logs unless you
have configured the Rook Ceph Cluster with sufficient storage space. Enabling admin logs through Fluent Bit might
use around 2GB/day per node. For more information on how to configure the Rook Ceph Cluster, see Rook Ceph
in NKP on page 633.
Procedure
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
4. Create a file named logging-operator-logging-overrides.yaml and paste the following YAML code into
it to create the overrides configMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ---
    clusterFlows:
      - name: cluster-containers
        spec:
          globalOutputRefs:
            - loki
          match:
            - exclude:
                namespaces:
                  - <your-namespace>
                  - <your-other-namespace>
8. Perform actions that generate log data, both in the specified namespaces and the namespaces you mean to exclude.
9. Verify that the log data contains only the data you expected to receive.
Procedure
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
Procedure
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
Run the following commands on the managed or attached cluster. Ensure you switch to the correct context or
kubeconfig of the attached cluster for the following kubectl commands. For more information, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/.
3. Check the deployment status using this command on the attached cluster.
kubectl get helmreleases -n ${WORKSPACE_NAMESPACE}
Note: It may take some time for these changes to take effect, based on the duration configured for the Flux
GitRepository reconciliation.
When the logging stack is successfully deployed, you will see output that includes the following HelmReleases:
NAME READY STATUS AGE
grafana-logging True Release reconciliation succeeded 15m
logging-operator True Release reconciliation succeeded 15m
logging-operator-logging True Release reconciliation succeeded 15m
grafana-loki True Release reconciliation succeeded 15m
rook-ceph True Release reconciliation succeeded 15m
rook-ceph-cluster True Release reconciliation succeeded 15m
What to do next
Viewing Cluster Log Data on page 572
Procedure
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
Run the following commands on the attached cluster to access the Grafana UI.
Ensure you switch to the correct context or kubeconfig of the attached cluster for the following kubectl
commands. For more information, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/.
Warning: Cert-Manager and Traefik must be deployed in the attached cluster to be able to access the Grafana UI.
These are deployed by default on the workspace.
• BanzaiCloud logging-operator
• Grafana Loki
• Grafana
Note:
You must perform the Workspace-level Logging on page 565 procedures as a prerequisite to enable
multi-tenant logging at the Project level as well.
Access to log data is done at the namespace level through the use of Projects within Kommander, as shown in the
diagram:
Each Project namespace has its own logging-operator Flow that sends its pod logs to its own Loki server. A custom
controller deploys a corresponding Loki and Grafana server pair in each namespace and defines the logging-operator
Flow in each namespace that forwards its pod logs to its respective Loki server. Each namespace therefore has its own
Grafana server for visualizations.
For the convenience of cluster Administrators, a cluster-scoped Loki/Grafana instance pair is deployed with a
corresponding logging-operator ClusterFlow that directs pod logs from all namespaces to the pair.
Note: Cluster Administrators will need to monitor and adjust resource usage to prevent operational difficulties or
excessive use on a per-namespace basis.
• Enable workspace-level logging before you can configure multi-tenant logging. For more information, see
Workspace-level Logging on page 565.
• Be a cluster administrator with permissions to configure cluster-level platform services.
Multi-tenant Logging Enablement Process
The steps required to enable multi-tenant logging include:
Procedure
1. Get started with multi-tenant logging by Creating a Project for Logging on page 574.
Procedure
2. Then, you can create project-level AppDeployments for use in multi-tenant logging.
1. Determine the namespace of the workspace that your project is in. You can use the nkp get workspaces
command to see the list of workspace names and their corresponding namespaces.
nkp get workspaces
Copy the value under the NAMESPACE column for your workspace. This might NOT be identical to the Display
Name of the Workspace.
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
4. Set the PROJECT_NAMESPACE variable to the namespace copied in the previous step.
export PROJECT_NAMESPACE=<PROJECT_NAMESPACE>
6. Run the following command on the management cluster to reference the configOverrides in the
project-grafana-loki AppDeployment.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: project-grafana-loki
  namespace: ${PROJECT_NAMESPACE}
spec:
  appRef:
    name: project-grafana-loki-0.48.6
    kind: ClusterApp
  configOverrides:
    name: project-grafana-loki-custom-overrides
EOF
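The AppDeployment above references a ConfigMap named project-grafana-loki-custom-overrides. As a minimal skeleton only, assuming the ConfigMap lives in the project namespace and carries Grafana Loki overrides under a values.yaml key, it could look like this (the override values themselves depend on your environment):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: project-grafana-loki-custom-overrides
  namespace: ${PROJECT_NAMESPACE}
data:
  values.yaml: |
    # <insert project-level Grafana Loki overrides here>
EOF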
Procedure
1. Determine the namespace of the workspace that your project is in. You can use the nkp get workspaces
command to see the list of workspace names and their corresponding namespaces.
nkp get workspaces
Copy the value under the NAMESPACE column for your workspace.
Note: It may take some time for these changes to take effect, based on the duration configured for the Flux
GitRepository reconciliation.
When the logging stack is successfully deployed, you will see output that includes the following HelmReleases:
NAMESPACE               NAME                                READY   STATUS                             AGE
${PROJECT_NAMESPACE}    project-grafana-logging             True    Release reconciliation succeeded   15m
${PROJECT_NAMESPACE}    project-grafana-loki                True    Release reconciliation succeeded   11m
${PROJECT_NAMESPACE}    project-loki-object-bucket-claims   True    Release reconciliation succeeded   11m
What to do next
Viewing Project Log Data on page 577
Procedure
1. Determine the namespace of the workspace that your project is in. You can use the nkp get workspaces
command to see the list of workspace names and their corresponding namespaces.
nkp get workspaces
Copy the value under the NAMESPACE column for your workspace.
Warning: Cert-Manager and Traefik must be deployed in the attached cluster to be able to access the Grafana UI.
These are deployed by default on the workspace.
You can configure the workspace policy to restrict access to the Project logging Grafana UI. For more
information, see Logging on page 561.
Each Grafana instance in a Project has a unique URL at the cluster level. Consider creating a
WorkspaceRoleBinding that maps to a ClusterRoleBinding, on attached cluster(s), for each Project-level
Grafana instance. For example, if you have a group named sample-group and two projects named first-project
and second-project in the sample-workspace workspace, then the Role Bindings will look similar to
the following image:
Select the correct role bindings for each group for a project at the workspace level.
Fluent Bit
Fluent Bit is the NKP choice of open-source log collection and forwarding tool.
For more information, see https://fluentbit.io/.
Warning: The Fluent Bit application is disabled by default on the management cluster. To ingest the required
amount of admin logs into Loki, additional disk space must be configured on the rook-ceph-cluster. Enabling
admin logs might use around 2 GB/day per node.
For more details on how to configure the Rook Ceph Cluster, see Rook Ceph on page 563.
Procedure
1. To get the namespace of the workspace where you want to configure Fluent Bit, run the following command.
nkp get workspaces
Copy the value under the NAMESPACE column for your workspace.
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
3. Identify the systemd-journald log data storage path on the nodes of the clusters in the workspace by using
the OS documentation and examining the systemd configuration.
Usually, it will be either /var/log/journal (typically used when systemd-journald is configured to
store logs permanently; in this case, the default Fluent Bit configuration should work) or /run/log/journal
(typically used when systemd-journald is configured to use volatile storage).
4. Extract the default Helm values used by the Fluent Bit App.
kubectl get -n ${WORKSPACE_NAMESPACE} configmaps fluent-bit-0.20.9-d2iq-defaults -
o=jsonpath='{.data.values\.yaml}' > fluent-bit-values.yaml
5. Edit the resulting file fluent-bit-values.yaml by removing all sections except for extraVolumes,
extraVolumeMounts and config.inputs. The result should look similar to this.
extraVolumes:
  # we create this to have a persistent tail-db directory on all nodes
  # otherwise a restarted fluent-bit would rescrape all tails
  - name: tail-db
    hostPath:
      path: /var/log/tail-db
      type: DirectoryOrCreate
  # we create this to get rid of error messages that would appear on non control-plane nodes
  - name: kubernetes-audit
    hostPath:
      path: /var/log/kubernetes/audit
      type: DirectoryOrCreate
  # needed for kmsg input plugin
  - name: uptime
    hostPath:
      path: /proc/uptime
      type: File
  - name: kmsg
    hostPath:
extraVolumeMounts:
  - name: tail-db
    mountPath: /tail-db
  - name: kubernetes-audit
    mountPath: /var/log/kubernetes/audit
  - name: uptime
    mountPath: /proc/uptime
  - name: kmsg
    mountPath: /dev/kmsg
config:
  inputs: |
    # Collect audit logs, systemd logs, and kernel logs.
    # Pod logs are collected by the fluent-bit deployment managed by logging-operator.
    [INPUT]
        Name              tail
        Alias             kubernetes_audit
        Path              /var/log/kubernetes/audit/*.log
        Parser            kubernetes-audit
        DB                /tail-db/audit.db
        Tag               audit.*
        Refresh_Interval  10
        Rotate_Wait       5
        Mem_Buf_Limit     135MB
        Buffer_Chunk_Size 5MB
        Buffer_Max_Size   20MB
        Skip_Long_Lines   Off
    [INPUT]
        Name              systemd
        Alias             kubernetes_host
        DB                /tail-db/journal.db
        Tag               host.*
        Max_Entries       1000
        Read_From_Tail    On
        Strip_Underscores On
    [INPUT]
        Name              kmsg
        Alias             kubernetes_host_kernel
        Tag               kernel
6. Add the following item to the list under the extraVolumes key.
- name: kubernetes-host
  hostPath:
    path: <path to systemd logs on the node>
    type: Directory
7. Add the following item to the list under the extraVolumeMounts key.
- name: kubernetes-host
  mountPath: <path to systemd logs on the node>
These items will make Kubernetes mount logs into Fluent Bit pods.
8. Add the following line into the [INPUT] entry identified by Name systemd and Alias kubernetes_host.
Path <path to systemd logs on the node>
This is needed to make Fluent Bit actually collect the mounted logs.
extraVolumeMounts:
  - name: tail-db
    mountPath: /tail-db
  - name: kubernetes-audit
    mountPath: /var/log/kubernetes/audit
  - name: uptime
    mountPath: /proc/uptime
  - name: kmsg
    mountPath: /dev/kmsg
  - name: kubernetes-host
    mountPath: /run/log/journal
config:
  inputs: |
    # Collect audit logs, systemd logs, and kernel logs.
    # Pod logs are collected by the fluent-bit deployment managed by logging-operator.
    [INPUT]
        Name              tail
        Alias             kubernetes_audit
        Path              /var/log/kubernetes/audit/*.log
        Parser            kubernetes-audit
        DB                /tail-db/audit.db
        Tag               audit.*
        Refresh_Interval  10
        Rotate_Wait       5
        Mem_Buf_Limit     135MB
        Buffer_Chunk_Size 5MB
        Buffer_Max_Size   20MB
        Skip_Long_Lines   Off
    [INPUT]
        Name              systemd
        Alias             kubernetes_host
12. Edit the fluent-bit AppDeployment to set the value of spec.configOverrides.name to the name of
the created ConfigMap. You can use the steps in the procedure to deploy an application with a custom
configuration.
nkp edit appdeployment -n ${WORKSPACE_NAMESPACE} fluent-bit
After your editing is complete, the AppDeployment resembles this example.
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: fluent-bit
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    name: fluent-bit-0.20.9
    kind: ClusterApp
  configOverrides:
    name: fluent-bit-overrides
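If the fluent-bit-overrides ConfigMap referenced above has not been created yet, a command similar to the following creates it from the edited values file; this is a sketch that assumes the fluent-bit-values.yaml file produced in step 5 is used as-is.
kubectl create configmap fluent-bit-overrides -n ${WORKSPACE_NAMESPACE} \
  --from-file=values.yaml=fluent-bit-values.yaml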
13. Log in to the Grafana logging UI of your workspace and verify that logs with a label
log_source=kubernetes_host are now present in Loki.
2. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
3. Create a secret containing the static AWS S3 credentials. The secret is then mounted into each of the Grafana Loki
pods as environment variables.
kubectl create secret generic nkp-aws-s3-creds -n ${WORKSPACE_NAMESPACE} \
  --from-literal=AWS_ACCESS_KEY_ID=<key id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret key>
Note: This can also be added to the installer configuration if you are configuring Grafana Loki on the Management
Cluster.
Note: If you use the Kommander CLI installation configuration file, you don't need this step.
Procedure
a. On the management cluster, run the following command to get the namespace of your workspace.
nkp get workspaces
b. Copy the value under the NAMESPACE column for your workspace.
c. Set the WORKSPACE_NAMESPACE variable to the namespace copied in the previous step.
export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>
a. On the Attached or Managed Cluster, retrieve the kubeconfig for the cluster.
b. Apply the ConfigMap directly to the managed/attached cluster using the name, logging-operator-
logging-overrides.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    <insert config here>
EOF
This is an example of a ConfigMap that contains customized resource requests and limit values for fluentd:
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
  namespace: kommander
data:
  values.yaml: |
    fluentd:
      resources:
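A complete version of such an override, with illustrative request and limit values only (size these for your own log volume), might look like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
  namespace: kommander
data:
  values.yaml: |
    fluentd:
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 1000m
          memory: 2Gi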
Security
Details on distributed authentication and authorization between clusters
Authentication
NKP UI comes with a pre-configured Dex identity broker and provider for authentication.
Warning: Kubernetes, Kommander, and Dex do not store any user identities. The Kommander installation comes with
default admin static credentials. These credentials should only be used to access the NKP UI for configuring an external
identity provider. Currently, there is no way to update these credentials, so they should be treated as backup credentials
and not used for normal access.
The NKP UI admin credentials are stored as a secret. They never leave the boundary of the UI cluster and are never
shared with any other cluster.
The Dex service issues an OIDC ID token (see https://openid.net/specs/openid-connect-core-1_0.html#IDToken)
on successful user authentication. Other platform services use ID tokens as proof of authentication. User identity
to the Kubernetes API server is provided by the kube-oidc-proxy platform service, which reads the identity from
an ID token. Web requests to the NKP UI are authenticated by the traefik-forward-auth platform service (see
https://github.com/mesosphere/traefik-forward-auth).
A user identity is shared across a UI cluster and all other attached clusters.
Attached Clusters
A newly attached cluster has federated kube-oidc-proxy, dex-k8s-authenticator, and traefik-forward-auth
platform applications. These platform applications are configured to accept the ID tokens that the Dex service of the
Management or Pro cluster issues (see Cluster Types on page 19).
When traefik-forward-auth is used as a Traefik Ingress authenticator (see https://doc.traefik.io/traefik/
v2.4/providers/kubernetes-ingress/), it checks that the user identity was issued by the Management/Pro cluster Dex
service, which issues the user identities for the attached clusters. On the Management/Pro cluster, use the static
admin credentials or an external identity provider (IDP) to authenticate.
Authorization
Kommander does not have a centralized authorization component; each service makes its own authorization
decisions based on the user identity.
Identity Providers
An Identity Provider (IdP) is a service that lets you manage identity information for users, including groups.
A cluster created in Kommander uses Dex as its IdP. Dex, in turn, delegates to one or more external IdPs.
If you already use one or more of the following IdPs, you can configure Dex to use them:
Note:
These are the Identity Providers supported by Dex 2.22.0, the version used by NKP.
Login Connectors
Kommander uses Dex to provide OpenID Connect single sign-on (SSO) to the cluster. Dex can be configured to
use multiple connectors, including GitHub, LDAP, and SAML 2.0. The Dex Connector documentation describes
how to configure different connectors. You can add the configuration as the values field in the Dex application. An
example Dex configuration provided to the Kommander CLI’s install command is similar to this:
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  dex:
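As an illustration only, a GitHub connector added through the values field might look similar to the following sketch. The client ID, client secret, and redirect URI are placeholders, and the exact structure of your connector block depends on the connector type you choose (see the Dex Connector documentation):
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  dex:
    values: |
      config:
        connectors:
          - type: github
            id: github
            name: GitHub
            config:
              clientID: <your-client-id>
              clientSecret: <your-client-secret>
              redirectURI: https://<CLUSTER_URL>/dex/callback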
Authentication
OpenID Connect is an extension of the OAuth2 authentication protocol. As required by OAuth2, the
client must be registered with Dex. Do this by passing the name of the application and a callback/redirect
URI. These handle the processing of the OpenID token after the user authenticates successfully. After
registration, Dex returns a client_id and a secret. Authentication requests use these between the client
and Dex to identify the client.
For more information on OAuth2 authentication protocol, https://oauth.net/2/.
Users access Kommander in two ways:
5. Select Launch Console and follow the authentication steps to complete the procedure.
• You must have access to a Linux, macOS, or Windows computer with a supported operating system version.
• You must have a properly deployed and running cluster. For information about deploying Kubernetes with
default settings on different types of infrastructures, see the Custom Installation and Infrastructure Tools on
page 644.
• If you install Kommander with a custom configuration, make sure you enable Gatekeeper.
Warning: If you intend to disable Gatekeeper, keep in mind that the app is deployed pre-configured with constraint
templates that enforce multi-tenancy in projects.
Procedure
2. Define the Constraint. Constraints are then used to inform Gatekeeper that the admin wants to enforce a
ConstraintTemplate, and how.
Create the host path volume policy constraint psp-host-filesystem by running the following command to
only allow /foo to be mounted as a host path volume.
cat <<EOF | kubectl apply -f -
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostFilesystem
metadata:
  name: psp-host-filesystem
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    allowedHostPaths:
      - readOnly: true
        pathPrefix: "/foo"
EOF
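To verify that the constraint is enforced, you can try to create a Pod that mounts a host path outside /foo; Gatekeeper should reject the request. The following test Pod is purely illustrative:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-test
spec:
  containers:
    - name: test
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
          readOnly: true
  volumes:
    - name: data
      hostPath:
        path: /etc
EOF
Because /etc is not under the allowed prefix /foo, the psp-host-filesystem constraint denies the admission request.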
• The cookies are securely encrypted, so they cannot be modified by users.
• The cookies contain the RBAC username.
Local Users
Nutanix recommends configuring an external identity provider to manage access to your cluster.
For an overview of the benefits, supported providers, and instructions on how to configure an external identity
provider, see Identity Providers.
However, you can create local users as an alternative, for example, when you want to test RBAC policies quickly.
NKP automatically creates a unique local user, the admin user, but you can create additional local users and assign
them other RBAC roles.
This section shows how to create and manage local users to access your Pro or Management cluster.
There are two ways in which you can create local users:
Warning: Nutanix does not recommend creating local users for production clusters. See Identity Providers for
instructions on how to configure an external identity provider to manage your users.
1. Open the Kommander Installer Configuration File or kommander.yaml file. If you do not have the
kommander.yaml file, initialize the configuration file so you can edit it in the subsequent steps.
Note: Initialize this file only one time, otherwise you will overwrite previous customizations.
2. In that file, add the following customization for dex and create local users by establishing their credentials.
Warning: You have created a user that does not have any permissions to see or manage your NKP cluster yet.
Warning: Nutanix does not recommend creating local users for production clusters. See Identity Providers for
instructions on how to configure an external identity provider to manage your users.
Procedure
1. Create a ConfigMap resource with the credentials of the new local user.
3. Copy the following values and paste them in a location in the file where they are nested in the spec field.
configOverrides:
  name: dex-overrides
Example:
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  ...
spec:
  appRef:
    kind: ClusterApp
    name: dex-2.11.1
  clusterConfigOverrides:
    - clusterSelector:
        matchExpressions:
          - key: kommander.d2iq.io/cluster-name
            operator: In
            values:
              - host-cluster
      configMapName: dex-kommander-overrides
  configOverrides: # Copy and paste this section.
    name: dex-overrides
status:
  ...
Editing the AppDeployment restarts the HelmRelease for Dex. The new users will be created after the
reconciliation. However, user creation is not complete until you assign permissions to the user.
Note: You have created a user that does not have any permissions to see or manage your NKP cluster yet.
To complete the configuration, see Adding RBAC Roles to Local Users on page 596.
Procedure
Create the following ClusterRoleBinding resource:
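The exact binding depends on the role you want to grant. As an illustration only, a ClusterRoleBinding that grants the built-in cluster-admin role to a local user identified by email (the name and email below are placeholders) could look like this:
cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: <local-user>-cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: <local-user-email>
EOF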
Note: The Login page and cluster URL are the same for the default admin user and the local users you create with this
method.
For more information on RBAC resources in NKP and granting access to Kubernetes and Kommander resources, see
Access to Kubernetes and Kommander Resources on page 342.
For general information on RBAC as a Kubernetes resource, see https://kubernetes.io/docs/reference/access-
authn-authz/rbac/.
Procedure
2. If you change the email address or username of a user, ensure you update any RoleBindings or RBAC resources
associated with this user.
Procedure
To delete local users, edit the dex-overrides ConfigMap and remove the email and hash fields for the user. Also,
ensure you delete any RoleBindings or RBAC resources associated with this user.
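Assuming the dex-overrides ConfigMap was created in the kommander namespace of the management cluster (adjust the namespace if you created it elsewhere), you can open it for editing with:
kubectl edit configmap dex-overrides -n kommander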
Networking
Configure networking for the Konvoy cluster.
Networking Service
A Service is an API resource that defines a logical set of pods and a policy by which to access them, and is
an abstracted manner to expose applications as network services.
For more information on Services, see https://kubernetes.io/docs/concepts/services-networking/service/#service-resource.
Kubernetes gives pods their own IP addresses and a single DNS name for a set of pods. Services are used as entry
points to load-balance the traffic across the pods. A selector determines the set of Pods targeted by a Service.
For example, if you have a set of pods that each listen on TCP port 9191 and carry a label app=MyKonvoyApp, as
configured in the following:
apiVersion: v1
kind: Service
metadata:
  name: my-konvoy-service
  namespace: default
spec:
  selector:
    app: MyKonvoyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9191
This specification creates a new Service object named "my-konvoy-service" that targets TCP port 9191 on any
pod with the app=MyKonvoyApp label.
Kubernetes assigns this Service an IP address. In particular, the kube-proxy implements a form of virtual IP for
Services of type other than ExternalName.
Service Topology
A Service Topology is a mechanism in Kubernetes that routes traffic based on the cluster's Node topology.
For example, you can configure a Service to route traffic to endpoints on specific nodes or even based on
the region or availability zone of the node’s location.
To enable this feature in your Kubernetes cluster, use the --feature-gates="ServiceTopology=true,EndpointSlice=true"
flag. After enabling it, you can control Service traffic routing by defining the topologyKeys field in the Service
API object.
In the following example, a Service defines topologyKeys to be routed to endpoints only in the same zone:
apiVersion: v1
kind: Service
metadata:
  name: my-konvoy-service
  namespace: default
spec:
  selector:
    app: MyKonvoyApp
  ports:
    - protocol: TCP
Note: If the value of the topologyKeys field does not match any pattern, the traffic is rejected.
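The example above is abbreviated. A complete sketch of a Service that only routes to endpoints in the client's own zone, reusing the selector and ports shown earlier, could look like this:
apiVersion: v1
kind: Service
metadata:
  name: my-konvoy-service
  namespace: default
spec:
  selector:
    app: MyKonvoyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9191
  topologyKeys:
    - "topology.kubernetes.io/zone"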
EndpointSlices
EndpointSlices are an API resource that appears as a scalable and more manageable solution to network
endpoints within a Kubernetes cluster. They allow for distributing network endpoints across multiple
resources with a limit of 100 endpoints per EndpointSlice.
For more information, see https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#endpointslice-resource.
An EndpointSlice contains references to a set of endpoints, and the control plane takes care of creating
EndpointSlices for any service that has a selector specified. These EndpointSlices include references to all the pods
that match the Service selector.
Like Services, the name of an EndpointSlice object must be a valid DNS subdomain name.
Here is a sample EndpointSlice resource for the example Kubernetes Service:
apiVersion: discovery.k8s.io/v1beta1
kind: EndpointSlice
metadata:
  name: konvoy-endpoint-slice
  namespace: default
  labels:
    kubernetes.io/service-name: my-konvoy-service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80
endpoints:
  - addresses:
      - "192.168.126.168"
    conditions:
      ready: true
    hostname: ip-10-0-135-39.us-west-2.compute.internal
    topology:
      kubernetes.io/hostname: ip-10-0-135-39.us-west-2.compute.internal
      topology.kubernetes.io/zone: us-west2-b
Ingress Controllers
In contrast with the controllers in the Kubernetes control plane, Ingress controllers are not started with a cluster so
you need to choose the desired Ingress controller.
An Ingress controller has to be deployed in a cluster for the Ingress definitions to work.
Kubernetes, as a project, currently supports and maintains GCE and nginx controllers.
Some of the best-known Ingress controllers are:
• HAProxy Ingress is a highly customizable, community-driven Ingress controller for HAProxy. See https://haproxy-ingress.github.io/.
• NGINX offers support and maintenance for the NGINX Ingress Controller for Kubernetes. See https://www.nginx.com/products/nginx-ingress-controller.
• Traefik is a fully featured Ingress controller (Let's Encrypt, secrets, http2, websocket) and has commercial support. See https://github.com/containous/traefik.
Network Policies
NetworkPolicy is an API resource that controls traffic flow at the IP address or port level (OSI layer 3 or 4). It lets you
define constraints on how a pod communicates with various network services, such as endpoints and services.
A Pod can be restricted to talk to other network services through a selection of the following identifiers:
• Namespaces that have to be accessed. There can be pods that are not allowed to talk to other namespaces.
• Other allowed IP blocks regardless of the node or IP address assigned to the targeted Pod.
• Other allowed Pods.
An example of a NetworkPolicy specification is:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: network-konvoy-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - ipBlock:
            cidr: 172.17.0.0/16
            except:
              - 172.17.1.0/24
        - namespaceSelector:
            matchLabels:
              app: MyKonvoyApp
        - podSelector:
            matchLabels:
              app: MyKonvoyApp
      ports:
        - protocol: TCP
          port: 6379
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.0.0/24
      ports:
        - protocol: TCP
          port: 5978
As shown in the example, when defining a pod or namespace based NetworkPolicy, you use a selector to specify what
traffic is allowed to and from the Pod(s).
Required Domains
This section describes the required domains for NKP.
You must have access to the following domains through the customer networking rules so that Kommander can
download all required images:
• docker.io
• gcr.io
• k8s.gcr.io
• mcr.microsoft.com
• nvcr.io
• quay.io
• us.gcr.io
• registry.k8s.io
Load Balancing
In a Kubernetes cluster, depending on the flow of traffic direction, there are two kinds of load balancing:
Note: In NKP environments, the external load balancer must be configured without TLS termination.
In cloud deployments, the load balancer is provided by the cloud provider. For example, in AWS, the service
controller communicates with the AWS API to provision an ELB service that targets the cluster nodes.
For an on-premises Pre-provisioned deployment, NKP ships with MetalLB (see https://metallb.universe.tf/concepts/),
which provides load-balancing services. The environments that use MetalLB are Pre-provisioned and vSphere
infrastructures. For more information on how to configure MetalLB for these environments, see the following:
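As an illustration of what a Layer 2 MetalLB address pool can look like, the legacy ConfigMap-based configuration below uses a placeholder address range; consult the MetalLB documentation and the environment-specific configuration pages for the method supported by your NKP version:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.0.50.25-10.0.50.50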
Ingress
Kubernetes Ingress resources expose HTTP and HTTPS routes from outside the cluster to services within the cluster.
In Kommander, the Traefik Ingress controller is installed by default and provides access to the NKP UI.
An Ingress performs the following:
Traefik v2.4
Traefik is a modern HTTP reverse proxy and load balancer that deploys microservices with ease. Kommander
currently installs Traefik v2.4 by default on every cluster. Traefik creates a service of type LoadBalancer. In the
cloud, the cloud provider creates the appropriate load balancer. In an on-premises deployment, by default, it uses
MetalLB.
• Have access to a Linux, macOS, or Windows computer with a supported operating system version.
• Have a properly deployed and running cluster.
To expose a pod using an Ingress (L7)
Procedure
1. Deploy two web application Pods on your Kubernetes cluster by running the following command.
kubectl run --restart=Never --image hashicorp/http-echo --labels app=http-echo-1 --port 80 http-echo-1 -- -listen=:80 --text="Hello from http-echo-1"
kubectl run --restart=Never --image hashicorp/http-echo --labels app=http-echo-2 --port 80 http-echo-2 -- -listen=:80 --text="Hello from http-echo-2"
2. Expose the Pods with a service type of ClusterIP by running the following commands.
kubectl expose pod http-echo-1 --port 80 --target-port 80 --name "http-echo-1"
kubectl expose pod http-echo-2 --port 80 --target-port 80 --name "http-echo-2"
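Before checking the load balancer in the next step, the Pods need to be reachable through Traefik. The curl checks below assume an Ingress that routes /echo1 and /echo2 to the two Services; a sketch of such an Ingress follows. The ingress class name kommander-traefik is an assumption, so use the class created by your Traefik deployment:
cat <<EOF | kubectl create -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-ingress
  namespace: default
spec:
  ingressClassName: kommander-traefik
  rules:
    - http:
        paths:
          - path: /echo1
            pathType: Prefix
            backend:
              service:
                name: http-echo-1
                port:
                  number: 80
          - path: /echo2
            pathType: Prefix
            backend:
              service:
                name: http-echo-2
                port:
                  number: 80
EOF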
4. Run the following command to get the URL of the load balancer created on AWS for the Traefik service.
kubectl get svc kommander-traefik -n kommander
This command displays the internal and external IP addresses for the exposed service. (Note that IP addresses and
host names are for illustrative purposes. Always use the information from your own cluster)
NAME                TYPE           CLUSTER-IP    EXTERNAL-IP                                                              PORT(S)                                     AGE
kommander-traefik   LoadBalancer   10.0.24.215   abf2e5bda6ca811e982140acb7ee21b7-37522315.us-west-2.elb.amazonaws.com   80:31169/TCP,443:32297/TCP,8080:31923/TCP   4h22m
5. Validate that you can access the web application Pods by running the following commands: (Note that IP
addresses and host names are for illustrative purposes. Always use the information from your own cluster)
curl -k https://abf2e5bda6ca811e982140acb7ee21b7-37522315.us-west-2.elb.amazonaws.com/echo1
curl -k https://abf2e5bda6ca811e982140acb7ee21b7-37522315.us-west-2.elb.amazonaws.com/echo2
Procedure
1. Review the list of available applications to obtain the current APP ID and version for Istio and its dependencies
(see Platform Applications Dependencies on page 390). You need this information to run the
following commands.
3. Install Istio.
Replace <APPID-version> with the version of Istio you want to deploy.
nkp create appdeployment istio --app <APPID-version> --workspace ${WORKSPACE_NAME}
Note:
• Create the resource in the workspace you just created, which instructs Kommander to deploy the
AppDeployment to the KommanderClusters in the same workspace.
• Observe that the nkp create command must be run with the WORKSPACE_NAME instead of the
WORKSPACE_NAMESPACE flag.
Procedure
2. Change to the Istio directory and set your PATH environment variable by running the following commands.
cd istio*
export PATH=$PWD/bin:$PATH
Procedure
1. Deploy the sample bookinfo application on the Kubernetes cluster by running the following commands.
Important: Ensure your nkp configuration references the cluster where you deployed Istio by setting the
KUBECONFIG environment variable, or using the --kubeconfig flag, in accordance with Kubernetes
conventions (see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/).
2. Get the URL of the load balancer created on AWS for this service by running the following command.
kubectl get svc istio-ingressgateway -n istio-system
The command displays output similar to the following:
NAME                   TYPE           CLUSTER-IP    EXTERNAL-IP                                                                PORT(S)                                                                                                                                       AGE
istio-ingressgateway   LoadBalancer   10.0.29.241   a682d13086ccf11e982140acb7ee21b7-2083182676.us-west-2.elb.amazonaws.com   15020:30380/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP,15029:30756/TCP,15030:31420/TCP,15031:31948/TCP,15032:32061/TCP,15443:31232/TCP   110s
3. Open a browser and navigate to the external IP address for the load balancer to access the application.
For example, the external IP address in the sample output is
a682d13086ccf11e982140acb7ee21b7-2083182676.us-west-2.elb.amazonaws.com, enabling you to access the
application using the following URL:
http://a682d13086ccf11e982140acb7ee21b7-2083182676.us-west-2.elb.amazonaws.com/productpage
4. To understand the different Istio features, follow the steps in the Istio BookInfo Application.
For more information on Istio, see https://istio.io/latest/docs/.
GPUs
In NKP, the nodes with NVIDIA GPUs are configured with nvidia-gpu-operator (see https://docs.nvidia.com/
datacenter/cloud-native/gpu-operator/latest/index.html) and NVIDIA drivers to support the container runtime.
For more information, see Nutanix GPU Passthrough.
Warning: Specific instructions must be followed for enabling nvidia-gpu-operator, depending on whether you
want to deploy the app on a Management cluster or on an Attached or Managed cluster.
• For instructions on enabling the NVIDIA platform application on a Management cluster, follow the
instructions in the NVIDIA Platform Application Management Cluster section. See Enabling the
NVIDIA Platform Application on a Management Cluster on page 608.
• For instructions on enabling the NVIDIA platform application on attached or managed clusters, follow
the instructions in the NVIDIA Platform Application Attached or Managed Cluster section. See
Enabling the NVIDIA Platform Application on Attached or Managed Clusters on page 610.
After nvidia-gpu-operator has been enabled for your cluster type, proceed to the Select the Correct Toolkit
Version for Your NVIDIA GPU Operator section on each of those pages.
Procedure
2. Append the following to the apps section in the install.yaml file to enable NVIDIA platform services.
apps:
  nvidia-gpu-operator:
    enabled: true
4. Proceed to the Selecting the Correct Toolkit Version for Your NVIDIA GPU Operator on page 609 section.
Tip: Sometimes, applications require a longer period to deploy, which causes the installation to time out. Add the
--wait-timeout <time to wait> flag and specify a period (for example, 1h) to allocate more time to the
deployment of applications.
Selecting the Correct Toolkit Version for Your NVIDIA GPU Operator
Procedure
• Centos 7.9/RHEL 7.9: If you’re using Centos 7.9 or RHEL 7.9 as the base operating system for your GPU
enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or
kommander.yaml to the following.
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-centos7
• RHEL 8.4/8.6 : If you’re using RHEL 8.4/8.6 as the base operating system for your GPU enabled nodes, set
the toolkit.version parameter in your Kommander Installer Configuration file or kommander.yaml to the
following.
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-ubi8
Tip: Sometimes, applications require a longer period to deploy, which causes the installation to time out. Add the
--wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more
time to the deployment of applications.
Note:
• To use the UI to enable the application, see Platform Applications on page 386.
• To use the CLI, see Deploying Platform Applications Using CLI on page 389.
• If only a subset of the attached or managed clusters in the workspace are utilizing GPUs, you can enable the
nvidia-gpu-operator only on those clusters. For more information, see Enabling or Disabling an Application
Per Cluster on page 401.
Procedure
After you have enabled the nvidia-gpu-operator app in the workspace on the necessary clusters, proceed to the
next section.
Selecting the Correct Toolkit Version for Your NVIDIA GPU Operator
Note: For how to use the CLI to customize the platform application on a workspace, see AppDeployment
Resources on page 396.
Procedure
1. Select the correct Toolkit version based on your OS and create a ConfigMap with these configuration override
values:
• Centos 7.9/RHEL 7.9: If you’re using Centos 7.9 or RHEL 7.9 as the base operating system for your GPU
enabled nodes, set the toolkit.version parameter in your install.yaml to the following:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: nvidia-gpu-operator-overrides-attached
data:
  values.yaml: |
    toolkit:
      version: v1.10.0-centos7
EOF
• RHEL 8.4/8.6 : If you’re using RHEL 8.4/8.6 as the base operating system for your GPU enabled nodes, set
the toolkit.version parameter in your install.yaml to the following:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: nvidia-gpu-operator-overrides-attached
data:
  values.yaml: |
    toolkit:
      version: v1.10.0-ubi8
EOF
• Ubuntu 18.04 and 20.04: If you’re using Ubuntu 18.04 or 20.04 as the base operating system for your GPU
enabled nodes, set the toolkit.version parameter in your install.yaml to the following:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: nvidia-gpu-operator-overrides-attached
data:
  values.yaml: |
    toolkit:
      version: v1.11.0-ubuntu20.04
EOF
2. Note the name of this ConfigMap (nvidia-gpu-operator-overrides-attached) and use it to set the
necessary nvidia-gpu-operator AppDeployment spec fields depending on the scope of the override.
Alternatively, you can also use the UI to pass in the configuration overrides for the app per workspace or per
cluster.
Procedure
Run the following command to validate that your application has started correctly.
kubectl get pods -A | grep nvidia
Output example:
nvidia-container-toolkit-daemonset-7h2l5 1/1 Running 0 150m
nvidia-container-toolkit-daemonset-mm65g 1/1 Running 0 150m
nvidia-container-toolkit-daemonset-mv7xj 1/1 Running 0 150m
nvidia-cuda-validator-pdlz8 0/1 Completed 0 150m
nvidia-cuda-validator-r7qc4 0/1 Completed 0 150m
nvidia-cuda-validator-xvtqm 0/1 Completed 0 150m
nvidia-dcgm-exporter-9r6rl 1/1 Running 1 (149m ago) 150m
nvidia-dcgm-exporter-hn6hn 1/1 Running 1 (149m ago) 150m
nvidia-dcgm-exporter-j7g7g 1/1 Running 0 150m
nvidia-dcgm-jpr57 1/1 Running 0 150m
nvidia-dcgm-jwldh 1/1 Running 0 150m
nvidia-dcgm-qg2vc 1/1 Running 0 150m
nvidia-device-plugin-daemonset-2gv8h 1/1 Running 0 150m
nvidia-device-plugin-daemonset-tcmgk 1/1 Running 0 150m
nvidia-device-plugin-daemonset-vqj88 1/1 Running 0 150m
nvidia-device-plugin-validator-9xdqr 0/1 Completed 0 149m
nvidia-device-plugin-validator-jjhdr 0/1 Completed 0 149m
nvidia-device-plugin-validator-llxjk 0/1 Completed 0 149m
nvidia-operator-validator-9kzv4 1/1 Running 0 150m
nvidia-operator-validator-fvsr7 1/1 Running 0 150m
nvidia-operator-validator-qr9cj 1/1 Running 0 150m
If you are seeing errors, ensure that you set the container toolkit version appropriately based on your OS, as described
in the previous section.
Note: MIG is only available for the following NVIDIA devices: H100, A100, and A30.
• mig.strategy should be set to mixed when MIG mode is not enabled on all GPUs on a node.
• mig.strategy should be set to single when MIG mode is enabled on all GPUs on a node and they have the
same MIG device types across all of them.
For the Management Cluster, this can be set at install time by modifying the Kommander configuration file to add
configuration for the nvidia-gpu-operator application:
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  nvidia-gpu-operator:
    values: |
      mig:
        strategy: single
...
Or by modifying the clusterPolicy object for the GPU operator once it has already been installed.
2. Set the MIG profile for the GPU you are using. In this example, we are using the A30 GPU, which supports the
following MIG profiles.
4 GPU instances @ 6GB each
2 GPU instances @ 12GB each
1 GPU instance @ 24GB
Set the MIG profile by labeling the node ${NODE} with the profile, as in the example below:
kubectl label nodes ${NODE} nvidia.com/mig.config=all-1g.6gb --overwrite
3. Check the node labels to see if the changes were applied to your MIG-enabled GPU node.
kubectl get no -o json | jq .items[0].metadata.labels
"nvidia.com/mig.config": "all-1g.6gb",
"nvidia.com/mig.config.state": "success",
"nvidia.com/mig.strategy": "single"
Schedule a test workload that selects the MIG device with the following nodeSelector (see the sketch after this step):
nodeSelector:
  "nvidia.com/gpu.product": NVIDIA-A30-MIG-1g.6gb
If the workload successfully finishes, then your GPU has been properly MIG partitioned.
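The nodeSelector snippet above comes from such a test workload. A sketch of one, with an illustrative CUDA image and command, that schedules onto one of the MIG slices could look like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: mig-smoke-test
spec:
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        "nvidia.com/gpu.product": NVIDIA-A30-MIG-1g.6gb
      containers:
        - name: cuda-test
          image: nvcr.io/nvidia/cuda:12.2.2-base-ubi8
          command: ["nvidia-smi", "-L"]
          resources:
            limits:
              nvidia.com/gpu: 1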
Procedure
1. Connect (using SSH or similar) to your GPU enabled nodes and run the nvidia-smi command. Your output
should be similar to the following example:
[ec2-user@ip-10-0-0-241 ~]$ nvidia-smi
Thu Nov 3 22:52:59 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06 Driver Version: 535.183.06 CUDA Version: 12.2.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 54C P8 11W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
2. Another common issue is having a misconfigured toolkit version, resulting in NVIDIA pods in a bad state. For
example.
nvidia-container-toolkit-daemonset-jrqt2   1/1   Running                 0             29s
nvidia-dcgm-exporter-b4mww                 0/1   Error                   1 (9s ago)    16s
nvidia-dcgm-pqsz8                          0/1   CrashLoopBackOff        1 (13s ago)   27s
nvidia-device-plugin-daemonset-7fkzr       0/1   Init:0/1                0             14s
nvidia-operator-validator-zxn4w            0/1   Init:CrashLoopBackOff   1 (7s ago)    11s
To modify the toolkit version, run the following commands to modify the AppDeployment for the nvidia gpu
operator application.
a. Provide the name of a ConfigMap with the custom configuration in the AppDeployment.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: nvidia-gpu-operator
  namespace: kommander
spec:
  appRef:
    name: nvidia-gpu-operator-1.11.1
    kind: ClusterApp
  configOverrides:
    name: nvidia-gpu-operator-overrides
EOF
b. Create the ConfigMap with the name provided in the previous step, which provides the custom configuration
on top of the default configuration in the config map, and set the version appropriately.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: kommander
  name: nvidia-gpu-operator-overrides
data:
  values.yaml: |
    toolkit:
      version: v1.10.0-centos7
EOF
3. If a node has an NVIDIA GPU installed and the nvidia-gpu-operator application is enabled on the cluster,
but the node is still not accepting GPU workloads, it's possible that the nodes do not have a label that indicates
there is an NVIDIA GPU present.
By default, the GPU operator will attempt to configure nodes with the following labels present, which are usually
applied by the node feature discovery component.
"feature.node.kubernetes.io/pci-10de.present": "true",
"feature.node.kubernetes.io/pci-0302_10de.present": "true",
"feature.node.kubernetes.io/pci-0300_10de.present": "true",
If these labels are not present on a node that you know contains an NVIDIA GPU, you can manually label the
node using the following command:
kubectl label node ${NODE} feature.node.kubernetes.io/pci-0302_10de.present=true
Procedure
1. Delete all GPU workloads on the GPU nodes where the NVIDIA GPU Operator platform application is present.
2. Delete the existing NVIDIA GPU Operator AppDeployment using the following command:
kubectl delete appdeployment -n kommander nvidia-gpu-operator
3. Wait for all NVIDIA related resources in the Terminating state to be cleaned up. You can check pod status with
the following command.
kubectl get pods -A | grep nvidia
For information on how to delete node pools, see Deleting Pre-provisioned Node Pools on page 741.
• RHEL 8.4/8.6 : If you’re using RHEL 8.4/8.6 as the base operating system for your GPU enabled nodes, set
the toolkit.version parameter in your Kommander Installer Configuration file or kommander.yaml to the
following.
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-ubi8
• Ubuntu 18.04 and 20.04: If you’re using Ubuntu 18.04 or 20.04 as the base operating system for your
GPU enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or
kommander.yaml to the following.
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.14.6-ubuntu20.04
Note: If you have not installed the Kommander component of NKP yet, set the Toolkit version in the Kommander
Installer Configuration file (see GPU Toolkit Versions on page 615) and skip this section.
Procedure
1. Create a ConfigMap with the necessary configuration overrides to set the correct Toolkit version. For example,
if you’re using Centos 7.9 or RHEL 7.9 as the base operating system for your GPU-enabled nodes, set the
toolkit.version parameter.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: kommander
  name: nvidia-gpu-operator-overrides
data:
  values.yaml: |
2. Update the nvidia-gpu-operator AppDeployment in the kommander namespace to reference the ConfigMap
you created.
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: nvidia-gpu-operator
  namespace: kommander
spec:
  appRef:
    name: nvidia-gpu-operator-1.11.1
    kind: ClusterApp
  configOverrides:
    name: nvidia-gpu-operator-overrides
EOF
Recommendations
Recommended settings for monitoring and collecting metrics for Kubernetes, platform services, and
applications deployed on the cluster
Nutanix conducts routine performance testing of Kommander. The following table provides recommended settings,
based on cluster size and increasing workloads, that maintain a healthy Prometheus monitoring deployment.
Note: The resource settings reflect some settings but do not represent the exact structure to be used in the platform
service configuration.
25     1k    250    resources:
                      limits:
                        cpu: 2
                        memory: 6Gi
                      requests:
                        cpu: 1
                        memory: 3Gi
                      storage: 60Gi
100    3k    1k     resources:
                      limits:
                        cpu: 12
                        memory: 50Gi
                      requests:
                        cpu: 10
                        memory: 48Gi
                      storage: 100Gi
• Kubernetes Components: API Server, Nodes, Pods, Kubelet, Scheduler, StatefulSets and Persistent Volumes
• Kubernetes USE method: Cluster and Nodes
• Calico
• etcd
• Prometheus
Find the complete list of default-enabled dashboards on GitHub. For more information, see
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/templates/grafana/dashboards-1.14.
Procedure
1. Create a file named kube-prometheus-stack-overrides.yaml and paste the following YAML code into it
to create the overrides ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: <your-workspace-namespace>
data:
  values.yaml: |
    ---
    grafana:
      defaultDashboardsEnabled: false
Procedure
1. One method is to use ConfigMaps to import dashboards. Below are steps on how to create a ConfigMap with your
dashboard definition.
For more information, see https://github.com/grafana/helm-charts/tree/main/charts/grafana#sidecar-for-dashboards.
For simplicity, this section assumes the desired dashboard definition is in json format.
{
"annotations": {
"list": []
},
"description": "etcd sample Grafana dashboard with Prometheus",
"editable": true,
"gnetId": null,
"hideControls": false,
"id": 6,
"links": [],
"refresh": false,
...
}
2. After creating your custom dashboard json, insert it into a ConfigMap and save it as etcd-custom-
dashboard.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: etcd-custom-dashboard
  labels:
    grafana_dashboard: "1"
data:
  etcd.json: |
    {
      "annotations": {
        "list": []
      },
      "description": "etcd sample Grafana dashboard with Prometheus",
      "editable": true,
      "gnetId": null,
      "hideControls": false,
      "id": 6,
      "links": [],
      "refresh": false,
      ...
Cluster Metrics
The kube-prometheus-stack is deployed by default on the management cluster and attached clusters. This stack
deploys the following Prometheus components to expose metrics from nodes, Kubernetes units, and running apps:
Note: NKP has a listener on the metrics.k8s.io/v1beta1/nodes resource, which updates your backend store
when that value changes. We then poll that backend store every 5 seconds, so the metrics are updated in real time every
5 seconds without the need to refresh your view.
Kommander is configured with pre-defined alerts to monitor specific events. You receive alerts related to:
• CPUThrottlingHigh
• TargetDown
• KubeletNotReady
• KubeAPIDown
• CoreDNSDown
• KubeVersionMismatch
Procedure
1. Create a file named kube-prometheus-stack-overrides.yaml and paste the following YAML code into it
to create the overrides ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ---
    defaultRules:
      rules:
        node: false
5. Alert rules for the Velero platform service are turned off by default. You can enable them with the following
overrides ConfigMap. Enable them only if the velero platform service is enabled; if it is disabled, keep the alert
rules disabled to avoid alert misfires.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ---
    mesosphereResources:
      rules:
        velero: true
6. To create a custom alert rule named my-rule-name, create the overrides ConfigMap with this YAML code.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ---
    additionalPrometheusRulesMap:
      my-rule-name:
        groups:
          - name: my_group
            rules:
              - record: my_record
                expr: 100 * my_record
After you set up your alerts, you can manage each alert using the Prometheus web console to mute or unmute
firing alerts, and perform other operations. For more information on configuring alertmanager, see
https://prometheus.io/docs/alerting/latest/configuration/.
To access the Prometheus Alertmanager UI, browse to the landing page and then search for the Prometheus
Alertmanager dashboard, for example https://<CLUSTER_URL>/nkp/alertmanager.
1. The following file, named alertmanager.yaml, configures alertmanager to use the Incoming Webhooks
feature of Slack (slack_api_url: https://hooks.slack.com/services/<HOOK_ID>) to fire all the alerts
to a specific channel #MY-SLACK-CHANNEL-NAME.
global:
  resolve_timeout: 5m
  slack_api_url: https://hooks.slack.com/services/<HOOK_ID>
route:
  group_by: ['alertname']
  group_wait: 2m
  group_interval: 5m
  repeat_interval: 1h
receivers:
  - name: "null"
  - name: slack_general
    slack_configs:
      - channel: '#MY-SLACK-CHANNEL-NAME'
        icon_url: https://avatars3.githubusercontent.com/u/3380462
        send_resolved: true
        color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
        title: '{{ template "slack.default.title" . }}'
        title_link: '{{ template "slack.default.titlelink" . }}'
        pretext: '{{ template "slack.default.pretext" . }}'
        text: '{{ template "slack.default.text" . }}'
        fallback: '{{ template "slack.default.fallback" . }}'
        icon_emoji: '{{ template "slack.default.iconemoji" . }}'
templates:
  - '*.tmpl'
2. The following file, named notification.tmpl, is a template that defines a pretty format for the fired
notifications.
{{ define "__titlelink" }}
{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}
{{ end }}
{{ define "__title" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}
{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }}
{{ end }}
{{ define "__text" }}
{{ range .Alerts }}
{{ range .Labels.SortedPairs }}*{{ .Name }}*: `{{ .Value }}`
{{ end }} {{ range .Annotations.SortedPairs }}*{{ .Name }}*: {{ .Value }}
{{ end }} *source*: {{ .GeneratorURL }}
{{ end }}
{{ end }}
3. Finally, apply these changes to alertmanager as follows. Set ${WORKSPACE_NAMESPACE} to the workspace
namespace that kube-prometheus-stack is deployed in.
kubectl create secret generic -n ${WORKSPACE_NAMESPACE} \
alertmanager-kube-prometheus-stack-alertmanager \
--from-file=alertmanager.yaml \
--from-file=notification.tmpl \
--dry-run=client --save-config -o yaml | kubectl apply -f -
Procedure
kubectl create secret generic -n ${WORKSPACE_NAMESPACE} \
  alertmanager-kube-prometheus-stack-alertmanager \
  --from-file=alertmanager.yaml \
  --dry-run=client --save-config -o yaml | kubectl apply -f -
3. Allow some time for the configuration to take effect. You can then use the following command to verify that the
configuration took effect.
kubectl exec -it alertmanager-kube-prometheus-stack-alertmanager-0 -n kommander -- cat /etc/alertmanager/config_out/alertmanager.env.yaml
For more information on configuring email alerting, see https://prometheus.io/docs/alerting/latest/configuration/.
Centralized Monitoring
Monitor clusters, created with Kommander, on any attached cluster
Kommander provides centralized monitoring, in a multi-cluster environment, using the monitoring stack running on
any attached clusters. Centralized monitoring is provided by default in every managed or attached cluster.
Managed or attached clusters are distinguished by a monitoring ID. The monitoring ID corresponds to the kube-
system namespace UID of the cluster. To find a cluster’s monitoring ID, you can go to the Clusters tab on the NKP
UI (in the relevant workspace), or go to the Clusters page in the Global workspace:
https://<CLUSTER_URL>/nkp/kommander/dashboard/clusters
Select the View Details link on the attached cluster card, and then select the Configuration tab, and find the
monitoring ID under Monitoring ID (clusterId).
You might also search or filter by monitoring IDs on the Clusters page, linked above.
You can also run this kubectl command, using the correct cluster’s context or kubeconfig, to look up the
cluster’s kube-system namespace UID to determine which cluster the metrics and alerts correspond to:
kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
Procedure
3. Apply the ConfigMap, which will automatically get imported to Grafana through the Grafana dashboard sidecar.
kubectl apply -f some-dashboard.yaml
Centralized Metrics
The management cluster collects and presents metrics from all managed and attached clusters remotely using
Thanos. You can visualize these metrics in Grafana using a set of provided dashboards.
The Thanos Query (see https://thanos.io/v0.5/components/query/#query) component is installed on attached
and managed clusters. Thanos Query queries the Prometheus instances on the attached clusters, using a Thanos
sidecar running alongside each Prometheus container. Grafana is configured with Thanos Query as its data source and
comes with pre-installed dashboards for a global view of all attached clusters. The Thanos Query dashboard is also
installed, by default, to monitor the Thanos Query component.
Note: Cluster metrics are read remotely from Kommander; they are not backed up. If an attached cluster goes down,
Kommander no longer collects or presents its metrics, including past data.
Note: This is a separate Grafana instance from the one installed on all attached clusters. It is dedicated specifically to
components related to centralized monitoring.
Optionally, if you want to access the Thanos Query UI (essentially the Prometheus UI), the UI is accessible at:
https://<CLUSTER_URL>/nkp/kommander/monitoring/query/
You can also check that the attached cluster’s Thanos sidecars are successfully added to Thanos Query by going to:
https://<CLUSTER_URL>/nkp/kommander/monitoring/query/stores
The preferred method to view the metrics for a specific cluster is to go directly to that cluster’s Grafana UI.
Centralized Alerts
A centralized view of alerts from attached clusters is provided using an alert dashboard called Karma (see
https://github.com/prymitive/karma). Karma aggregates all alerts from the Alertmanagers running in the attached
clusters, allowing you to visualize these alerts on one page. Using the Karma dashboard, you can get an overview of
each alert and filter by alert type, cluster, and more.
Note: When there are no attached clusters, the Karma UI displays an error message Get
https://placeholder.invalid/api/v2/status: dial tcp: lookup placeholder.invalid on
10.0.0.10:53: no such host. This is expected, and the error disappears when clusters are connected.
Procedure
5. Ensure that the cluster selection (status.clusters) is appropriately set for your desired federation strategy and
check the propagation status.
kubectl get federatedprometheusrules kube-prometheus-stack-alertmanager.rules -n kommander -oyaml
Note: By default, up to 15 days of cost metrics are retained, with no backup to an external store.
Warning: The license key must be applied to the Centralized Kubecost application running on the Management
cluster.
Note: Considerations when Adding a License Key. Until you add a license key, Kubecost caches the context
of the cluster you first navigate from. This means that if you navigate to the Kubecost UI through the dashboard in the
Application Dashboards tab of any cluster, you will not be able to access the centralized Kubecost UI until you clear
your browser cookies/cache.
The following instructions provide information on how you can add a Kubecost license key to your clusters if you
have purchased one.
Procedure
3. From the dashboard, select the Settings icon, then select "Add License Key". Alternatively, the dashboard
can be accessed through the following link: https://<CLUSTER_URL>/nkp/kommander/kubecost/frontend/settings
5. After the license key has been added, the licensed features become available in the Kubecost UI.
Note: The Kubecost Enterprise plan offers a free trial that allows you to preview Ultimate features for 30 days. To
access this, select “Start Free Trial” in the Settings pane.
Centralized Costs
Using Thanos, the management cluster collects cost metrics remotely from each attached cluster. Costs from the last
day and the last seven days are displayed for each cluster, workspace, and project in the respective NKP UI pages.
Further cost analysis and details can be found in the centralized Kubecost UI running on Kommander at:
https://<CLUSTER_URL>/nkp/kommander/kubecost/frontend/detail.html#&agg=cluster
For more information on cost allocation metrics and how to navigate this view in the Kubecost UI, see https://
docs.kubecost.com/using-kubecost/getting-started/cost-allocation.
To identify the clusters in Kubecost, use the cluster’s monitoring ID. The monitoring ID corresponds to the kube-
system namespace UID of the cluster. To find the cluster’s monitoring ID, select the Clusters tab on the NKP UI in
the relevant workspace, or go to the Clusters page in the Global workspace.
https://<CLUSTER_URL>/nkp/kommander/dashboard/clusters
Select View Details on the attached cluster card. Select the Configuration tab, and find the monitoring ID under
Monitoring ID (clusterId).
You can also search or filter by monitoring ID on the Clusters page.
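If you prefer the command line, you can also read this UID directly from the attached cluster. The following is a quick verification sketch, run with the attached cluster's kubeconfig:
kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'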
Kubecost
Kubecost integrates directly with the Kubernetes API and cloud billing APIs to give you real-time visibility into your
Kubernetes spending and cost allocation. By monitoring your Kubernetes spending across clusters, you can avoid
overspend caused by uncaught bugs or oversights. With a cost monitoring solution in place, you can realize the full
potential and cost of these resources and avoid over-provisioning resources.
To customize pricing and out-of-cluster costs for AWS, you must apply these settings using the Kubecost UI
running on each attached cluster. You can access the attached cluster's Kubecost Settings page at:
https://<MANAGED_CLUSTER_URL>/nkp/kubecost/frontend/settings.html
Warning: Make sure you access the cluster's Kubecost UI linked above, not the centralized Kubecost UI running on
the Kommander management cluster.
AWS
To configure a data feed for the AWS Spot instances and a more accurate AWS Spot pricing, follow the steps at
https://docs.kubecost.com/using-kubecost/getting-started/spot-checklist#implementing-spot-nodes-in-your-
cluster.
To allocate out-of-cluster costs for AWS, see https://docs.kubecost.com/install-and-configure/advanced-
configuration/cloud-integration/aws-cloud-integrations/aws-out-of-cluster.
Grafana Dashboards
A set of Grafana dashboards providing visualization of cost metrics is provided in the centralized Grafana UI:
https://<CLUSTER_URL>/nkp/kommander/monitoring/grafana
These dashboards give a global view of accumulated costs from all attached clusters. From the navigation in Grafana,
you can find these dashboards by selecting those tagged with cost, metrics, and utilization.
• kube-apiserver
• kube-scheduler
• kube-controller-manager
• etcd
Note:
Official documentation about using a ServiceMonitor to monitor an app with the Prometheus-operator
on Kubernetes can be found on the GitHub repository.
Procedure
When defining the requirements of a cluster, you can specify the capacity and resource requirements of Prometheus
by modifying the settings in the overrides ConfigMap definition, as shown below.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-prometheus-stack-overrides
  namespace: ${WORKSPACE_NAMESPACE}
data:
  values.yaml: |
    ---
    prometheus:
      prometheusSpec:
        resources:
          limits:
            cpu: "4"
            memory: "8Gi"
          requests:
            cpu: "2"
            memory: "6Gi"
        storageSpec:
          volumeClaimTemplate:
            spec:
              resources:
                requests:
                  storage: "100Gi"
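For example, assuming you saved the definition above to a file named kube-prometheus-stack-overrides.yaml (an illustrative file name), one way to create the ConfigMap is:
kubectl apply -f kube-prometheus-stack-overrides.yaml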
Note: The Ceph instance installed by NKP is intended only for use by Nutanix Kubernetes Platform Insights and the
logging stack and velero platform applications. For more information, see the Nutanix Kubernetes Platform
Insights Guide on page 1143.
If you have an instance of Ceph that is managed outside of the NKP life cycle, see Bring Your Own
Storage (BYOS) to NKP Clusters on page 637.
Note: The Ceph instance installed by NKP is intended only for use by the logging stack and velero platform
applications.
If you have an instance of Ceph that is managed outside of the NKP life cycle, see Bring Your Own
Storage (BYOS) to NKP Clusters on page 637
If you do not plan on using any of the logging stack components such as grafana-loki or velero
for backups, then you do not need Rook Ceph for your installation and you can disable it by adding the
following to your installer config file:
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  ...
  grafana-loki:
    enabled: false
  ...
  rook-ceph:
    enabled: false
  rook-ceph-cluster:
    enabled: false
  ...
  velero:
    enabled: false
  ...
You must enable rook-ceph and rook-ceph-cluster if any of the following is true:
• grafana-loki is enabled.
• velero is enabled. If you applied config overrides for velero to use storage that is external to your cluster
(see Usage of Velero with AWS S3 Buckets on page 544), then you do not need Ceph to be installed.
For more information on Rook Ceph, see https://rook.io/docs/rook/v1.10/Getting-Started/intro/.
Ceph supports creating clusters in different modes as listed in CephCluster CRD Rook Ceph Documentation (see
https://rook.io/docs/rook/v1.10/CRDs/Cluster/ceph-cluster-crd/). NKP, specifically, is shipped with a PVC
Cluster, as documented in PVC Storage Cluster Rook Ceph Documentation (see https://rook.io/docs/rook/v1.10/
CRDs/Cluster/pvc-cluster/#pvc-storage-only-for-monitors). It is recommended that the PVC mode be used to
keep the deployment and upgrades simple and agnostic to technicalities with node draining.
Ceph cannot be your CSI provisioner when installed in PVC mode, because Ceph relies on an existing CSI provisioner to
bind the PVCs it creates. It is possible to use Ceph as your CSI provisioner, but that is outside the scope of this
document. If you have an instance of Ceph that acts as the CSI provisioner, then it is possible to reuse it for your NKP
storage needs.
When you create AppDeployments for the rook-ceph and rook-ceph-cluster platform applications, NKP deploys
various components, as listed in the following diagram.
Your default StorageClass should support the creation of PersistentVolumes that satisfy
volumeMode: Block.
• For general information on how to configure Object Storage Daemons (OSDs), see https://www.rook.io/docs/
rook/v1.11/Storage-Configuration/Advanced/ceph-osd-mgmt/.
• For information on how to set up auto-expansion of OSDs, see https://www.rook.io/docs/rook/v1.11/Storage-
Configuration/Advanced/ceph-configuration/?h=expa#auto-expansion-of-osds.
Replication and Erasure Coding are the two primary methods for storing data in a durable fashion in any
distributed system.
Replication
• For a replication factor of N, data has N copies (including the original copy).
• The smallest possible replication factor is 2 (usually this means two storage nodes).
• With a replication factor of 2, data has 2 copies, and this tolerates the loss of one copy of the data.
• Storage efficiency: (1/N) * 100 percent. For example, a replication factor of 2 gives 50 percent storage efficiency,
and a replication factor of 3 gives about 33 percent.
Erasure Coding
• Slices an object into k data fragments and computes m parity fragments. The erasure coding scheme guarantees
that data can be recreated using any k fragments out of the k+m fragments.
• The k + m = n fragments are spread across (>= n) storage nodes to offer durability.
• Since k out of n fragments (parity or data fragments) are needed to recreate the data, at most m fragments can
be lost without loss of data.
• The smallest possible count is k = 2, m = 1, that is, n = k + m = 3. This works only if there are at least n =
3 storage nodes.
• If k=3, m=1, then at most 1 out of 4 nodes can be lost without data loss.
• If k=4, m=2, then at most 2 out of 6 nodes can be lost without data loss, and so on (a configuration sketch
follows this list).
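As an illustration only, the k and m values above map directly onto the erasureCoded fields used in the Rook Ceph object store example later in this section. The sketch below shows k=3 data chunks and m=1 coding chunk, which requires at least 4 OSDs, tolerates the loss of any one of them, and yields 3/4 = 75 percent storage efficiency.
erasureCoded:
  dataChunks: 3     # k: data fragments
  codingChunks: 1   # m: parity fragments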
Note: This guide assumes you have a Ceph cluster that is not managed by NKP.
For information on how to configure the Ceph instance installed by NKP for use by NKP platform
applications, see Rook Ceph: Configuration on page 633.
Procedure
Run the following command.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  rook-ceph:
    enabled: false
  rook-ceph-cluster:
    enabled: false
  ...
The NKP instances of velero and grafana-loki rely on the storage provided by Ceph. Before installing the
Kommander component of NKP, be sure to configure appropriate Ceph resources for their usage as detailed in the
next section.
Note: This guide assumes your Ceph instance is installed in the rook-ceph namespace. In subsequent steps,
configure the CEPH_NAMESPACE variable as it applies to your environment.
1. Create a CephObjectStore.
There are two ways to install Ceph:
» Using Helm Charts (For more information on Helm Chart, see https://www.rook.io/docs/rook/v1.10/Helm-
Charts/operator-chart/#release).
This section is relevant if you have installed Ceph using helm install or some other managed Helm
resource mechanism.
If you have applied any configuration overrides to your Rook Ceph operator, ensure it was deployed with
currentNamespaceOnly set to false (this is the default value, so unless you have overridden it, it is
false). This ensures that the Ceph operator in the rook-ceph namespace is able to monitor and manage
resources in other namespaces, such as kommander.
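If you maintain the operator chart values yourself, the relevant setting is a single top-level value. The following is a minimal sketch of a values override, assuming the upstream rook-ceph operator Helm chart:
# rook-ceph operator chart values (sketch): keep the operator watching all namespaces
currentNamespaceOnly: false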
2. You must enable the following configuration overrides for the rook-ceph-cluster. See
https://www.rook.io/docs/rook/v1.10/Helm-Charts/ceph-cluster-chart/#ceph-object-
stores.
cephObjectStores:
  - name: nkp-object-store
    # see https://github.com/rook/rook/blob/master/Documentation/CRDs/Object-Storage/ceph-object-store-crd.md#object-store-settings for available configuration
    spec:
      metadataPool:
        # The failure domain: osd/host/(region or zone if available) - technically also any type in the crush map
        failureDomain: osd
        # Must use replicated pool ONLY. Erasure coding is not supported.
        replicated:
          size: 3
      dataPool:
        # The failure domain: osd/host/(region or zone if available) - technically also any type in the crush map
        failureDomain: osd
        # Data pool can use either replication OR erasure coding. Consider the following example scenarios:
        # Erasure Coding is used here with 3 data chunks and 1 parity chunk, which assumes 4 OSDs exist.
        # Configure this according to your CephCluster specification.
        erasureCoded:
          dataChunks: 3
          codingChunks: 1
      preservePoolsOnDelete: false
      gateway:
        port: 80
        instances: 2
        priorityClassName: system-cluster-critical
        resources:
          limits:
» By directly applying Kubernetes manifests (for more information, see Bring Your Own Storage (BYOS) to
NKP Clusters on page 637):
1. Set a variable to refer to the namespace the AppDeployments are created in.
export CEPH_NAMESPACE=rook-ceph
export NAMESPACE=kommander
2. Create ObjectBucketClaims.
After connecting the Object Store, create the ObjectBucketClaim in the same namespace as velero and
grafana-loki.
This results in the creation of ObjectBuckets, which then create Secrets that are consumed by velero and
grafana-loki.
a. For grafana-loki.
cat <<EOF | kubectl apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: nkp-loki
  namespace: ${NAMESPACE}
spec:
  additionalConfig:
    maxSize: 80G
  bucketName: nkp-loki
  storageClassName: nkp-object-store
EOF
b. For velero.
cat <<EOF | kubectl apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
Procedure
Run the following command.
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: base64EncodedValue
  AWS_SECRET_ACCESS_KEY: base64EncodedValue
kind: Secret
metadata:
  name: nkp-loki # If you want to configure a custom name here, also use it in the step below
  namespace: kommander
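The base64EncodedValue placeholders above are the base64 encodings of your storage access key ID and secret access key. As an illustration only, assuming you already have the plain-text keys, you can generate the encoded values as follows:
# Illustrative only: replace the placeholder strings with your actual keys
echo -n 'ACCESS_KEY_VALUE' | base64
echo -n 'SECRET_KEY_VALUE' | base64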
Procedure
Run the following command.
cat <<EOF | kubectl apply -f -
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  grafana-loki:
    enabled: true
    values: |
      loki:
        structuredConfig:
          storage_config:
            aws:
Procedure
export CEPH_NAMESPACE=rook-ceph
export NAMESPACE=my-project
Section Contents
• A container engine or runtime is required to install NKP and bootstrap a cluster. You can select one of these
supported container engines:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or macOS. On macOS, Docker runs
in a virtual machine, which must be configured with at least 8 GB of memory. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• Bootstrap cluster
• CAPI components
• NKP Kommander component
When you create a NKP cluster using the --self-managed flag, the bootstrap cluster and Cluster API (CAPI)
components are created for you automatically, and use the HTTP and HTTPS proxy settings you specify in the NKP
create cluster <provider>... command.
You can also create the bootstrap cluster and CAPI components manually, using the appropriate commands: nkp
create bootstrap and nkp create capi-components, respectively, combined with the command line flags to
include your HTTP or HTTPS proxy information.
You can also specify HTTP or HTTPS proxy information in an override file when using Konvoy Image Builder
(KIB). For more information, see Use Override Files with Konvoy Image Builder on page 1067.
Without these values provided as part of the relevant NKP create command, NKP cannot create the requisite parts
of your new cluster correctly. This is true of both management and managed clusters alike.
Note: For NKP installation, create the bootstrap cluster from within the same network where the new cluster will run.
Using a bootstrap cluster on a laptop with different proxy settings, for example, or residing in a different network, can
cause problems.
Section Contents
You can define HTTP or HTTPS proxy information using the steps on these pages:
The following is an example of the nkp create bootstrap command's syntax, with the HTTP proxy settings
included.
nkp create bootstrap --http-proxy <<http proxy list>> --https-proxy <<https proxy
list>> --no-proxy <<no proxy list>>
The following is an example of the nkp create capi-components command's syntax with the HTTP proxy
settings included:
nkp create capi-components --http-proxy <<http proxy list>> --https-proxy
<<https proxy list>> --no-proxy <<no proxy list>>
Procedure
1. If an HTTP proxy is required, locate the values to use for the http_proxy, https_proxy, and no_proxy flags.
They will be built into the CAPI components during their creation.
2. Create CAPI components using this command syntax and any other flags you might need.
nkp create capi-components --kubeconfig $HOME/.kube/config \
--http-proxy <string> \
--https-proxy <string> \
--no-proxy <string>
This code sample shows the command with example values for the proxy settings:
nkp create capi-components \
--http-proxy 10.0.0.15:3128 \
--https-proxy 10.0.0.15:3128 \
If you use an HTTP override file, you must apply the same configuration to any custom machine images built with the
Konvoy Image Builder (KIB). For more information, see Image Overrides on page 1073.
Configure the Control Plane and Worker Nodes to Use HTTP/S Proxy
This method uses environment variables to configure the HTTP proxy values. (You are not required to use this
method.)
Review this sample code to configure environment variables for the control plane and worker nodes, and review the
considerations that follow the sample.
export CONTROL_PLANE_HTTP_PROXY=http://example.org:8080
export CONTROL_PLANE_HTTPS_PROXY=http://example.org:8080
export
CONTROL_PLANE_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.1
export WORKER_HTTP_PROXY=http://example.org:8080
export WORKER_HTTPS_PROXY=http://example.org:8080
export
WORKER_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/
HTTP proxy configuration considerations to ensure the core components work correctly
• kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.d
is the internal Kubernetes kube-apiserver service
• The entries .svc,.svc.cluster,.svc.cluster.local are the internal Kubernetes services
• .elb.amazonaws.com is for the worker nodes to allow them to communicate directly to the kube-apiserver
ELB
Note: The NO_PROXY variable contains the Kubernetes Services CIDR. This example uses the default CIDR,
10.96.0.0/12. If your cluster's CIDR differs, update the value in the NO_PROXY field.
Set the httpProxy and httpsProxy environment variables to the address of the HTTP and HTTPS proxy servers,
respectively. (Frequently, environments use the same values for both.) Set the noProxy environment variable to the
addresses that can be accessed directly and not through the proxy.
For the Kommander component of Nutanix Kubernetes Platform (NKP), refer to more HTTP Proxy information in
Additional Kommander Configurations.
Example:
nkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml
The “>” redirects the command output to the file named ${CLUSTER_NAME}.yaml. To edit that
YAML Ain't Markup Language (YAML) file, you need to understand the CAPI components to avoid a failed
cluster deployment. The objects are Custom Resources defined by Cluster API components, and they belong to three
different categories:
• Cluster
A Cluster object references the infrastructure-specific and control plane objects. Because this is an Amazon
Web Services (AWS) cluster, an AWS Cluster object describes the infrastructure-specific cluster properties.
This means the AWS region, the VPC ID, subnet IDs, and security group rules required by the Pod network
implementation.
• Control Plane
A KubeadmControlPlane object describes the control plane, the group of machines that run the Kubernetes
control plane components. Those include the etcd distributed database, the API server, the core controllers,
and the scheduler. The object describes the configuration for these components and refers to an infrastructure-
specific object that represents the properties of all control plane machines. For AWS, the object references an
AWSMachineTemplate object, which means the instance type, the type of disk used, and the disk size, among
other properties.
• Node Pool
A Node Pool is a collection of machines with identical properties. For example, a cluster might have one Node
Pool with large memory capacity and another Node Pool with graphics processing unit (GPU) support. Each Node
Pool is described by three objects: The MachinePool references an object that represents the configuration of
Kubernetes components (kubelet) deployed on each node pool machine, and an infrastructure-specific object that
describes the properties of all node pool machines. For AWS, it references a KubeadmConfigTemplate and an
AWSMachineTemplate object, which represents the instance type, the type of disk used, and the disk size, among
other properties.
For more information on the objects, see the Cluster API book https://cluster-api.sigs.k8s.io/user/concepts.html
or Custom Resources in https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-
resources/.
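For orientation, the following is a minimal sketch of how a generated Cluster object ties the control plane and infrastructure objects together. The names, CIDRs, and API versions here are illustrative, taken from upstream Cluster API conventions; the file produced by your own --dry-run output is authoritative.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: aws-example
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: aws-example-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: aws-example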
• Registry Mirrors are local copies of images from a public registry that follow (or mirror) the file structure of
a public registry. If you need to set up a private registry with a registry mirror or details on using the flag(s), see
Using a Registry Mirror on page 1019.
• Container registries are collections of container repositories and can also offer API paths and access rules.
• Container repositories are a collection of related container images. The container image has everything the
software might need to run, including code, resources, and tools. Container repositories store container images for
setup and deployment, and you use the repositories to manage, pull, and push images during cluster operations.
Kubernetes does not natively provide a registry for hosting the container images you will use to run the applications
you want to deploy on Kubernetes. Instead, Kubernetes requires you to use an external solution to store and share
container images. A variety of Kubernetes-compatible registry options are compatible with NKP.
• --bundle <bundle>: specifies the group of images. The example below is for the NKP air-gapped environment bundle.
Related Information
For information on related topics or procedures, see Registry Mirror Tools on page 1017.
Changing the Kubernetes subnets must be done during cluster creation. To change the subnets, perform the following
steps:
Procedure
1. Generate the YAML Ain't Markup Language (YAML) manifests for the cluster using the --dry-run and -o
yaml flags, along with the desired nkp create cluster command.
Example:
nkp create cluster preprovisioned --cluster-name ${CLUSTER_NAME} --control-plane-
endpoint-host <control plane endpoint host> --control-plane-endpoint-port <control
plane endpoint port, if different than 6443> --dry-run -o yaml > cluster.yaml
Note: MetalLB IP address ranges, Classless Inter-Domain Routing (CIDR), and node subnet should not conflict
with the Kubernetes cluster pod and service subnets.
2. To modify the service subnet, add or edit the spec.clusterNetwork.services.cidrBlocks field of the
Cluster object.
Example:
kind: Cluster
spec:
  clusterNetwork:
    services:
      cidrBlocks:
        - 10.96.0.0/12
3. To modify the pod subnet, edit the Cluster and calico-cni ConfigMap resources. Cluster: Add or edit the
spec.clusterNetwork.pods.cidrBlocks field.
Example:
kind: Cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.16.0.0/16
• Create a bastion VM host template for the cluster nodes to use within the air-gapped network. This bastion VM
host also needs access to a local registry instead of an Internet connection to pull images.
• Find and record the bastion VM’s IP or hostname.
• Download the required NKP Konvoy binaries and installation bundles, which are discussed in step 5 below. To
access the download bundles, see Downloading NKP on page 16.
• A local registry or Docker version 18.09.2 or later installed. You must install Docker on the host where the
NKP Konvoy CLI runs. For example, if you install Konvoy on your laptop, ensure the computer has a supported
version of Docker. On macOS, Docker runs in a virtual machine that you configure with at least 8GB of memory.
Procedure
1. Open an ssh terminal to the bastion host and install the tools and packages using the command sudo yum
install -y yum-utils bzip2 wget.
2. Install kubectl.
RHEL example:
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
sudo yum install -y kubectl
3. Install Docker (only on the bastion host) and add the repo for upstream Docker using the command
sudo yum-config-manager --add-repo https://download.docker.com/linux/rhel/docker-
ce.repo
Docker Install example:
sudo yum install -y docker-ce docker-ce-cli containerd.io
5. Set the following environment variables to enable connection to an existing Docker or other registry, using the
commands export REGISTRY_ADDRESS=<https/http>://<registry-address>:<registry-port> and
export REGISTRY_CA=<path to the CA on the bastion host>. An illustrative example follows the notes below.
Note: You must create the VM template with the Konvoy Image Builder to use the registry mirror feature.
• REGISTRY_ADDRESS: the address of an existing registry accessible in the environment. The new cluster
nodes will be configured to use this registry as a mirror when pulling images.
• REGISTRY_CA: (Optional) the path on the bastion host to the registry CA. Konvoy configures the cluster nodes
to trust this CA. This value is only needed if the registry uses a self-signed certificate and the VMs are not
already configured to trust this CA.
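For illustration only, with a hypothetical registry address and CA path, the variables described above might be set as follows:
export REGISTRY_ADDRESS=https://registry.example.internal:5000   # illustrative registry address
export REGISTRY_CA=/home/admin/registry-ca.crt                   # illustrative CA path on the bastion host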
Note: For provisioning Nutanix Kubernetes Platform (NKP) on Flatcar, NKP configures cluster nodes to use Control
Groups (cgroups) version 1. In versions before Flatcar 3033.3.x, a restart is required to apply the changes to the kernel.
Also note that once Ignition runs, it is not available on reboot.
For more information on Flatcar usage, see:
Load Balancers
Kubernetes has a basic load balancer solution internally but does not offer an external load balancing component
directly. You must provide one, or you can integrate your Kubernetes cluster with a cloud provider. In a Kubernetes
cluster, depending on the flow of traffic direction, there are two kinds of load balancing:
Built-in virtual IP
If an external load balancer is unavailable, use the built-in virtual IP. The virtual IP is not a load balancer. It does
not distribute the request load among the control plane machines. However, if the machine receiving requests
becomes unavailable, the virtual IP moves to another control plane machine.
MetalLB
MetalLB is an external load balancer (LB) and is recommended as the control plane endpoint. To distribute request
load among the control plane machines, configure the load balancer to send requests to all the control plane machines.
Configure the load balancer to send requests only to control plane machines that are responding to API requests. If
you do not have a load balancer, you can use MetalLB and create a MetalLB ConfigMap for your infrastructure.
Choose one of the two protocols you want to use to announce service IPs. If your environment is not currently
equipped with a load balancer, you can use MetalLB. Otherwise, your own load balancer will work, and you can
continue the installation process with Pre-provisioned: Install Kommander. To use MetalLB, create a MetalLB
ConfigMap for your infrastructure. MetalLB uses one of two protocols for exposing Kubernetes services (a Layer 2
example and a BGP sketch follow below):
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses.
Layer 2 mode does not require the IPs to be bound to the network interfaces of your worker nodes. It works by
giving the machine's MAC address to clients.
• MetalLB IP address ranges or CIDRs must be within the node’s primary network subnet.
• MetalLB IP address ranges, CIDRs, and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250 and
configures Layer 2 mode:
The following values are generic. Enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
kubectl apply -f metallb-conf.yaml
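If you choose the BGP protocol instead of Layer 2, the equivalent ConfigMap announces service IPs to a BGP peer. The following is only a minimal sketch with illustrative peer addresses and autonomous system numbers; consult the MetalLB documentation for the values that apply to your network.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address: 10.0.0.1   # illustrative BGP router address
      peer-asn: 64501          # illustrative router ASN
      my-asn: 64500            # illustrative MetalLB ASN
    address-pools:
    - name: default
      protocol: bgp
      addresses:
      - 192.168.10.0/24
EOF
kubectl apply -f metallb-conf.yaml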
• Check your pods to see if any are not running, and investigate those pods. You can view your pods by checking
their status with the command:
kubectl get pods -A
• You can check the logs in the Cluster API pod for your chosen cluster infrastructure provider. The example below
uses Nutanix infrastructure; replace it with your infrastructure provider.
kubectl logs -l cluster.x-k8s.io/provider=infrastructure-nutanix --namespace capx-
system --kubeconfig ${CLUSTER_NAME}.conf
• If your bootstrap cluster is still running, you can check your CAPI logs from the bootstrap cluster with the following
command. The example below uses Nutanix infrastructure; replace it with your CAPI driver and infrastructure name.
kubectl logs -l cluster.x-k8s.io/provider=infrastructure-nutanix --namespace capx-
system --kubeconfig ${CLUSTER_NAME}-bootstrap.conf
Nutanix Infrastructure
Configuration types for installing Nutanix Kubernetes Platform (NKP) on a Nutanix Infrastructure.
For an environment on the Nutanix infrastructure, installation options based on your environment are provided
in this section.
Nutanix Overview
The overall process for configuring Nutanix and NKP together includes the following steps:
1. Configure Nutanix to provide the elements described in the Nutanix Prerequisites.
2. For air-gapped environments, create a bastion VM host. For more information see, Creating a Bastion
Host on page 652.
3. Create a base OS image. For more information, see Nutanix Base OS Image Requirements on page 663.
4. Create a new cluster.
5. Verify and log on to the UI.
After creating the base OS image, the NKP image builder uses it to make a custom image if you are not using the
provided pre-built, out-of-the-box Rocky Linux 9.4 image. You can use the resulting image with the nkp create
cluster nutanix command to create the VM nodes in your cluster directly on a server. You can use NKP to
provision and manage your cluster from that point.
Section Contents
NKP Prerequisites
Before using NKP to create a Nutanix cluster, verify that you have the following:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• A registry is needed in your environment.
• kubectl 1.28.x for interacting with the running cluster, installed on the host where the NKP Konvoy command line
interface (CLI) runs. For more information, see https://kubernetes.io/docs/tasks/tools/#kubectl.
• A valid Nutanix account with credentials configured.
• NKP uses the Nutanix CSI driver 3.0 as the default storage provider. For more information on the
default storage providers, see Default storage providers.
• For compatible storage suitable for production, you can choose from any of the storage options available
for Kubernetes. For more information, see https://kubernetes.io/docs/concepts/storage/volumes/
#volume-types.
• To turn off the default StorageClass that Konvoy deploys (see the sketch after this list):
1. Set the existing default StorageClass as non-default.
2. Set your newly created StorageClass to be the default.
For more information on changing the default storage class, see https://kubernetes.io/docs/tasks/
administer-cluster/change-default-storage-class/.
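A minimal sketch of those two steps, assuming kubectl access to the cluster and placeholder StorageClass names, uses the standard is-default-class annotation described at the link above:
# Mark the current default StorageClass as non-default (class names are placeholders)
kubectl patch storageclass <current-default-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
# Mark your newly created StorageClass as the default
kubectl patch storageclass <your-new-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'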
Nutanix Prerequisites
Before installing, verify that your environment meets the following basic requirements:
• Nutanix Prism Central version 2024.1 with role credentials configured that allow Administrator privileges.
• AOS 6.5, 6.8+
• Configure Prism Central Settings. For more information, see Prism Central Settings (Infrastructure).
• Pre-designated subnets.
• A subnet with unused IP addresses. The number of IP addresses required is computed as follows:
• One IP address for each node in the Kubernetes Cluster. The default cluster size has three control plane
nodes and four worker nodes, which require seven IP addresses.
• One IP address, not part of an address pool, for the Kubernetes API server (kubevip).
• One IP address in the same CIDR as the subnet, but not part of an address pool, for the load balancer
service used by Traefik (metallb).
• For air-gapped environments, a bastion VM host template with access to a configured local registry. The
recommended template naming pattern is ../folder-name/NKP-e2e-bastion-template or similar.
Each infrastructure provider has its own set of bastion host instructions. For more information see, Creating a
Bastion Host on page 652.
• Access to a bastion VM or other network-connected host running NKP Image Builder.
Note: Nutanix provides a complete image built on Nutanix-provided base images if you do not want to create your own
from a base OS image.
• You must reach the Nutanix endpoint where the Konvoy Command Line Interface (CLI) runs.
• Verify that your OS is supported. For more information, see Supported Infrastructure Operating Systems on
page 12.
• If not already complete, review the NKP installation prerequisites. For more information, see Prerequisites for
Installation on page 44.
Section Contents
1. To manage the cluster, such as listing subnets and other infrastructure, and to create VMs in Prism Central, which
the Cluster API Provider Nutanix Cloud Infrastructure (CAPX) infrastructure provider uses.
2. To manage persistent storage used by Nutanix CSI providers.
3. To discover node metadata used by the Nutanix Cloud Cost Management (CCM) provider.
PC credentials are required to authenticate to the Prism Central APIs. CAPX currently supports two mechanisms for
supplying the required credentials.
Injected Credentials
By default, credentials will be injected into the CAPX manager deployment when CAPX is initialized. See the
Getting Started Guide topic for information about getting started with Cluster API Provider Nutanix Cloud
Infrastructure (CAPX).
Upon initialization, a nutanix-creds secret will automatically be created in the capx-system namespace. This
secret will contain the values supplied through the NUTANIX_USER and NUTANIX_PASSWORD parameters.
The nutanix-creds secret will be used for workload cluster deployment if no other credential is provided.
Caution: If the credentials change after a cluster has been deployed, update them in both places: the credentials on the
workload cluster (used by CCM, CSI, and other add-ons) and the credentials on the management cluster (used by
CAPX). Keep the CCM and CSI secrets in sync.
Note: There is a 1:1 relation between the secret and the NutanixCluster object.
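To confirm that the injected credentials secret exists, you can check for it on the cluster where CAPX runs. A simple verification sketch:
kubectl get secret nutanix-creds -n capx-system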
Procedure
4. To create an authorization policy, select Create New Authorization Policy. The Create New Authorization
Policy window appears.
5. In the Choose Role step, enter a role name by typing in the Select the role to add to this policy field, and
select Next. You can enter any built-in or custom role.
» Full Access - which gives all added users access to all entity types in the associated role.
» Configure Access - which provides you with the option to configure the entity types and instances for the
added users in the associated role.
7. Select Next.
» From the dropdown list, select Local User to add a local user or group to the policy. Search a user or group by
typing the first few letters in the text field.
» From the dropdown list, select the available directory to add a directory user or group. Search a user or group
by typing the first few letters in the text field.
9. Click Save.
The authorization policy configurations are saved, and the authorization policy is listed in the Authorization
Policies window.
Note: To display role permissions for any built-in role, see the Nutanix AOS Security documentation topic
Displaying Role Permissions.
Category
• Create Or Update Name Category
• Create Or Update Value Category
• Delete Name Category
• Delete Value Category
• View Name Category
• View Value Category
Category Mapping
• Create Category Mapping
• Delete Category Mapping
• Update Category Mapping
• View Category Mapping
Cluster
• View Cluster
Image
• Create Image
• Delete Image
• View Image
Project
• View Project
Subnet
• View Subnet
• Prism Element (PE) cluster name. For more information, see Modifying Cluster Details in the Prism Element
Web Console Guide.
• Subnet name.
• OS image name from creating the OS Image in the previous topic.
• Docker Hub credentials, or you will encounter Docker Hub rate limiting, and the cluster will not be created.
• Find an available control plane endpoint IP that is not assigned to any VMs.
You will enter this information in Prism Central.
Procedure
2. Select the option next to Use the VLAN migrate workflow to convert VLAN Basic subnets to Network
Controller managed VLAN Subnets.
4. Under Advanced Configuration, remove the check from the checkbox next to VLAN Basic Networking to
change from Basic to Advanced OVN.
5. Modify the subnet specification in the control plane and worker nodes to use the new subnet. kubectl edit
cluster <clustername>.
CAPX will roll out the new control plane and worker nodes in the new Subnet and destroy the old ones.
Note: You can choose Basic or Advanced OVN when creating the subnet(s) you used during cluster creation. If
you created the cluster with basic, you can migrate to OVN.
To modify the service subnet, add or edit the configmap. See the topic Managing Subnets and Pods for more
details.
• Prism Central configuration. For more information, see Prism Central Admin Center Guide.
• Choose to use or build a pre-built Rocky Linux 9.4 image. This Base OS Image is later used with NKP Image
Builder during installation and cluster creation.
• If using a pre-built image, ensure it has been uploaded to the Prism Central images folder.
• If creating a custom image, NIB will place the new image in the Prism Central images folder upon creation.
Note: Out-of-the-box image: Nutanix provides a complete image built on Nutanix-provided base images.
• Network configuration is required because NIB must download and install packages, so the network must
be active.
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
Disk Size
For each cluster you create using this base OS image, ensure you establish the disk size of the root file system based
on the following:
• The minimum NKP Resource Requirements. For more information, see Resource Requirements.
• The minimum storage requirements for your organization.
• Clusters are created with a default disk size of 80 GB.
• For clusters created with the default disk size, the base OS image root file system must be precisely 80 GB. The
root file system cannot be reduced automatically when a machine first boots.
Customization
You can also specify a custom disk size when you create a cluster (see the flags available for use with the Nutanix
Create Cluster command). This allows you to use one base OS image to create multiple clusters with different storage
requirements.
Before specifying a disk size when you create a cluster, take into account the following:
• For some base OS images, the custom disk size option does not affect the size of the root file system. This is
because some root file systems, for example, those contained in a Logical Volume Manager (LVM) logical
volume, cannot be resized automatically when a machine first boots.
• The specified custom disk size must be equal to, or larger than, the size of the base OS image root file system.
This is because a root file system cannot be reduced automatically when a machine first boots.
• Configure Prism Central. For more information, see Prism Central Admin Center Guide.
• Ensure you have Docker or Podman installed. For more information, see Nutanix Base OS Image
Requirements.
• You will need:
2. Build Rocky 9.4 image using the command nkp create image nutanix rocky-9.4.
Example:
nkp create image nutanix rocky-9.4 \
--cluster <PE_CLUSTER_NAME> \
--endpoint <PC_ENDPOINT_WITHOUT_PORT_EX_prismcentral.foobar.example.com> \
--subnet <NAME_OR_UUID_OF_SUBNET>
a. To specify the name of the base image, use the flag --source-image <name of base image>.
The output will have the name of the image created. Take note of the image name for use in
cluster creation. For example, nutanix.kib_image: Image successfully created: nkp-
rocky-9.3-1.29.6-20240612181040 (db03feec-66f5-4c4d-85b1-79797a2aecc5).
• Configure Prism Central. For more information, see Prism Central Admin Center Guide.
• Ensure you have Docker or Podman installed. For more information, see Nutanix Base OS Image
Requirements.
• You will need:
Procedure
2. You must fetch the distro packages and other artifacts. You get the latest security fixes available at machine image
build time by fetching the distro packages from Rocky Linux or Ubuntu repositories.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for
a particular OS. To create it, run the Nutanix Kubernetes Platform (NKP) command create-package-
bundle. This builds an OS bundle using the Kubernetes version defined in ansible/group_vars/all/
defaults.yaml.
For example:
./nkp create package-bundle --os ubuntu-22.04 --artifacts-directory </path/to/save/os-package-bundles>
Other supported air-gapped operating systems can be specified in place of --os ubuntu-22.04 by using the
corresponding OS name, for example rocky-9.4.
5. Build Rocky 9.4 image using the command nkp create image nutanix rocky-9.4.
For example:
nkp create image nutanix rocky-9.4 \
  --cluster <PE_CLUSTER_NAME> \
  --endpoint <PC_ENDPOINT_WITHOUT_PORT_EX_prismcentral.foobar.example.com> \
  --subnet <NAME_OR_UUID_OF_SUBNET> \
  --artifacts-directory </path/to/saved/os-package-bundles> \
  --source-image <name of base image>
Tip: A local registry can also be used in a non-air-gapped environment for speed and security if desired. To do so, add
the following steps to your non-air-gapped installation process. See the topic Registry Mirror Tools.
Section Contents
Bootstrapping Nutanix
To create Kubernetes clusters, NKP uses Cluster API (CAPI) controllers. These controllers run on a
Kubernetes cluster.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
3. Decide your Base OS image selection. See BaseOS Image Requirements in the Prerequisites section.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices and
begin bootstrapping. For more information, see Universal Configurations for all Infrastructure Providers.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: Use --http-proxy, --https-proxy, and --no-proxy and their related values in this command for
it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. Ensure that the Cluster API Provider Nutanix Cloud Infrastructure (CAPX) controllers are present using the
command kubectl get pods -n capx-system.
Output example:
NAME READY STATUS RESTARTS AGE
capx-controller-manager-785c5978f-nnfns 1/1 Running 0 13h
6. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Output example:
NAMESPACE                       NAME                                         READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                     capa-controller-manager                      1/1     1            1           1h
capg-system                     capg-controller-manager                      1/1     1            1           1h
capi-kubeadm-bootstrap-system   capi-kubeadm-bootstrap-controller-manager    1/1     1            1           1h
Note: NKP uses the Nutanix CSI driver as the default storage provider. For more information see, Default Storage
Providers on page 33.
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
2. Set the environment variable for cluster name using the command export CLUSTER_NAME=<my-nutanix-
cluster>
4. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
The default subnets used in NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
» (Optional) Modify control plane audit logs. Users can modify the KubeadmControlPlane Cluster API
object to configure different kubelet options. If you need to configure your control plane beyond the
existing options available from flags, see Configuring your control plane.
» (Optional) Determine what VPC Network to use. Nutanix accounts have a default preconfigured VPC
Network, which will be used if you do not specify a different network. To use a different VPC network for
your cluster, create one by following these instructions for Create and Manage VPC Networks. Then select
the --network $new_vpc_network_name option on the create cluster command below.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
6. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully.
Note: If the cluster creation fails, check issues with your environment such as storage resources. If the cluster
becomes self-managed before it stalls, you can investigate what is running and what has failed to try to resolve
those issues independently. See Resource Requirements and Inspect Cluster Issues for more
information.
Note: A warning will appear in the console if the resource already exists, requiring you to remove the resource or
update your YAML.
Note: If you used the --output-directory flag in your NKP create .. --dry-run step above,
create the cluster from the objects you created by specifying the directory:
9. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                   True                     13h
├─ClusterInfrastructure - NutanixCluster/nutanix-e2e-cluster_name-1  True                     13h
├─ControlPlane - KubeadmControlPlane/nutanix-control-plane           True                     13h
│ ├─Machine/nutanix--control-plane-7llgd                             True                     13h
│ ├─Machine/nutanix--control-plane-vncbl                             True                     13h
│ └─Machine/nutanix--control-plane-wbgrm                             True                     13h
└─Workers
  └─MachineDeployment/nutanix--md-0                                  True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-67rv4                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-n2skc                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-nkftv                         True                     13h
    └─Machine/nutanix--md-0-74c849dc8c-sqklv                         True                     13h
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command. See Docker Hub's rate limit.
12. Describe the kubeadm control plane and check its status and events with the command.
kubectl describe kubeadmcontrolplane
Note: NKP uses the Nutanix CSI driver as the default storage provider. For more information see, Default Storage
Providers on page 33.
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
• Ensure you have a VPC associated with the external subnet on PCVM with connectivity outside the cluster.
Note: You must configure the default route (0.0.0.0/0) to the external subnet as the next hop for connectivity
outside the cluster (north-south connectivity).
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Note: Additional steps are required before and during cluster deployment if you use VPC instead of other
environments.
Procedure
2. Set the environment variable for cluster name using the command export CLUSTER_NAME=<my-nutanix-
cluster>.
4. Create a VM (bastion) inside a VPC subnet where you want to create the cluster. See Creating a VM through
Prism Central.
5. Associate a Floating IP with the Bastion VM. See Assigning Secondary IP Addresses to Floating IPs.
7. Extract the package onto the Bastion VM: tar -xzvf nkp-bundle_v2.12.0_linux_amd64.tar.gz.
8. Select the desired scenario between accessing the cluster inside or outside the VPC subnet.
» Inside the VPC: Proceed directly to the Create a Kubernetes cluster step.
Note: To access the cluster in the VPC, use the Bastion VM or any other VM in the same VPC.
» Outside the VPC: If access is needed from outside the VPC, link the floating IP to an internal IP used as
CONTROL_PLANE_ENDPOINT_IP while deploying the cluster. For information on Floating IP, see the topic
Request Floating IPs in Flow Virtual Networking.
Note: Access the cluster in the VPC from outside using updated kubeconfig after creating the cluster.
Note: To access the UI outside the VPC, you need to request three floating IPs.
9. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
The default subnets used in NKP are.
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
If you need to change the Kubernetes subnets, you must do this at cluster creation. See the topic Subnets for
more information.
10. Create a Kubernetes cluster using the example, populating the fields starting with $ as required.
Note: Use the second floating IP from step 8 for passing as --extra-sans.
11. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully.
Note: If the cluster creation fails, check issues with your environment such as storage resources. If the cluster
becomes self-managed before it stalls, you can investigate what is running and what has failed to try to resolve
those issues independently. See Resource Requirements and Inspect Cluster Issues for more
information.
12. Create the cluster from the objects generated from the dry run.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: A warning will appear in the console if the resource already exists, requiring you to remove the resource or
update your YAML.
Note: If you used the --output-directory flag in your NKP create .. --dry-run step above,
create the cluster from the objects you created by specifying the directory:
14. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                   True                     13h
├─ClusterInfrastructure - NutanixCluster/nutanix-e2e-cluster_name-1  True                     13h
├─ControlPlane - KubeadmControlPlane/nutanix-control-plane           True                     13h
│ ├─Machine/nutanix--control-plane-7llgd                             True                     13h
│ ├─Machine/nutanix--control-plane-vncbl                             True                     13h
│ └─Machine/nutanix--control-plane-wbgrm                             True                     13h
└─Workers
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command. See Docker Hub's rate limit.
15. If the cluster needs to be accessed from outside the VPC, get the kubeconfig of the cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
To access the cluster from outside the VPC, update the server field in ${CLUSTER_NAME}.conf:
server: https://<CONTROL_PLANE_ENDPOINT_IP>:6443
Replace CONTROL_PLANE_ENDPOINT_IP with the floating IP passed as the --extra-sans IP during Kubernetes
cluster creation.
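As an illustration, assuming placeholder values for the internal endpoint and the floating IP, you could make that substitution with sed:
# Illustrative only: replace the internal control plane endpoint with the floating IP
sed -i "s/<CONTROL_PLANE_ENDPOINT_IP>/<FLOATING_IP>/g" ${CLUSTER_NAME}.conf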
17. Verify that the kubeadm control plane is ready with the command.
kubectl get kubeadmcontrolplane
Output is similar to:
NAME                                  CLUSTER                 INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
nutanix-e2e-cluster-1-control-plane   nutanix-e2e-cluster-1   true          true                   3          3       3         0             14h   v1.29.6
18. Describe the kubeadm control plane and check its status and events with the command.
kubectl describe kubeadmcontrolplane
19. As they progress, the controllers also create Events, which you can list using the command
kubectl get events | grep ${CLUSTER_NAME}
For brevity, this example uses grep. You can also use separate commands to get Events for specific objects,
such as kubectl get events --field-selector involvedObject.kind="NutanixCluster".
By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will
reside in a single zone. You might create additional node pools in other zones with the nkp create nodepool
command.
Availability zones (AZs) are isolated locations within datacenter regions where public cloud services originate
and operate. Because all the nodes in a node pool are deployed in a single Availability Zone, you may wish to
create additional node pools to ensure your cluster has nodes deployed in multiple Availability Zones.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment or a
free-standing Pro Cluster.
Procedure
1. Deploy cluster life cycle services on the workload cluster using the command nkp create capi-components
--kubeconfig ${CLUSTER_NAME}.conf.
Output example:
# Initializing new CAPI components
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is also called a Pivot. For more
information, see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster using the command nkp move capi-
resources --to-kubeconfig ${CLUSTER_NAME}.conf.
Output example:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=gcp-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
3. Wait for the cluster control plane to be ready using the command kubectl --kubeconfig
${CLUSTER_NAME}.conf wait --for=condition=ControlPlaneReady "clusters/
${CLUSTER_NAME}" --timeout=20m
Output example:
cluster.cluster.x-k8s.io/gcp-example condition met
5. Remove the bootstrap cluster because the workload cluster is now self-managed using the command nkp delete
bootstrap --kubeconfig $HOME/.kube/config.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Prerequisites:
Procedure
1. Set the environment variable for your cluster using the command export CLUSTER_NAME=<your-
management-cluster-name>.
2. Copy the kubeconfig file of your Management cluster to your local directory using the command nkp get
kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf.
3. Create a configuration file for the deployment using the command nkp install kommander --init >
kommander.yaml.
5. To enable NKP Catalog Applications and install Kommander using the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
Example:
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Configuring a Default
Ultimate Catalog after Installing NKP on page 1011.
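The install step itself is not shown in this excerpt. A hedged example of installing Kommander with this
configuration file, assuming the --installer-config and --kubeconfig flags used elsewhere in this guide:
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf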
Procedure
You can check the status of the installation using the command kubectl -n kommander wait --for
condition=Ready helmreleases --all --timeout 15m.
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually producing output resembling
the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>
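If you find any HelmReleases in a "broken" release state, such as "exhausted" or "another rollback/release in
progress", trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'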
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by the command nkp open
dashboard --kubeconfig=${CLUSTER_NAME}.conf.
2. Retrieve your credentials at any time if necessary, using the command kubectl -n kommander get
secret nkp-credentials -o go-template='Username: {{.data.username|base64decode}}
{{ "\n"}}Password: {{.data.password|base64decode}}{{ "\n"}}'
3. Retrieve the URL used for accessing the UI using the command kubectl -n kommander get svc
kommander-traefik -o go-template='https://{{with index .status.loadBalancer.ingress
0}}{{or .hostname .ip}}{{end}}/NKP/kommander/dashboard{{ "\n"}}'.
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as
backup credentials rather than using them for normal access.
a. Rotate the password using the command nkp experimental rotate dashboard-password.
Example output displaying the new password:
Password: kqZ31lMBSCLcBjUKVwLJMQL2PxalipIzZw5Pjyw09wDqjWV3dz2wPSSBYi09JGJp
Further Prerequisites
In an environment with access to the Internet, you retrieve artifacts from specialized repositories dedicated to them,
such as Docker images contained in DockerHub and Helm Charts that come from a dedicated Helm Chart repository.
However, in an air-gapped environment, you need local repositories to store Helm charts, Docker images, and other
artifacts. Tools such as JFrog, Harbor, and Nexus handle multiple types of artifacts in one local repository.
Section Contents
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from both the bastion
machine and other machines that will be created for the Kubernetes cluster.
Procedure
2. After extraction, you can access files from the resulting directories in subsequent steps. For example, for the
bootstrap cluster, change to the nkp-<version> directory, similar to the example below, depending on your
current location:
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
• REGISTRY_URL: the address of an existing registry accessible in the VPC, which the new cluster nodes will be
configured to use as a mirror registry when pulling images.
• REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. Konvoy will configure the
cluster nodes to trust this CA. This value is only needed if the registry uses a self-signed certificate and the
images are not already configured to trust this CA.
• REGISTRY_USERNAME: optional, set to a user with pull access to this registry.
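For illustration, a filled-in example of these variables; the registry address, credentials, and CA path below are
hypothetical placeholders:
export REGISTRY_URL="https://registry.example.internal:5000"
export REGISTRY_USERNAME="nkp-push"
export REGISTRY_PASSWORD="example-password"
export REGISTRY_CA="/home/nutanix/registry-ca.crt"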
4. Execute the following command to load the air-gapped image bundle into your private registry using any relevant
flags to apply variables from Step 3.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the network performance
of the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster
by setting the following flags: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=
--registry-mirror-password= on the nkp create cluster command. See Docker Hub's rate limit.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
3. Decide on your base OS image. See BaseOS Image Requirements in the Prerequisites section.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful. For more information, see
Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
5. Ensure that the Cluster API Provider Nutanix Cloud Infrastructure (CAPX) controllers are present using the
command kubectl get pods -n capx-system.
Output example:
NAME READY STATUS RESTARTS AGE
capx-controller-manager-785c5978f-nnfns 1/1 Running 0 13h
6. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Output example:
NAMESPACE                            NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                          capa-controller-manager                         1/1     1            1           1h
capg-system                          capg-controller-manager                         1/1     1            1           1h
capi-kubeadm-bootstrap-system        capi-kubeadm-bootstrap-controller-manager       1/1     1            1           1h
capi-kubeadm-control-plane-system    capi-kubeadm-control-plane-controller-manager   1/1     1            1           1h
capi-system                          capi-controller-manager                         1/1     1            1           1h
cappp-system                         cappp-controller-manager                        1/1     1            1           1h
capv-system                          capv-controller-manager                         1/1     1            1           1h
capx-system                          capx-controller-manager                         1/1     1            1           1h
capz-system                          capz-controller-manager                         1/1     1            1           1h
cert-manager                         cert-manager                                    1/1     1            1           1h
cert-manager                         cert-manager-cainjector                         1/1     1            1           1h
cert-manager                         cert-manager-webhook                            1/1     1            1           1h
Note: NKP uses the Nutanix CSI driver as the default storage provider. For more information see, Default Storage
Providers on page 33.
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
• Ensure you have completed Nutanix Air-gapped: Loading the Registry on page 61.
• Named your cluster.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Procedure
2. Set the environment variable for cluster name using the command export CLUSTER_NAME=<my-nutanix-
cluster>.
4. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
The default subnets used in NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
If you need to change the Kubernetes subnets, you must do this at cluster creation. See the topic Subnets for
more information.
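If you do need different subnets, a minimal sketch of the edited clusterNetwork block, assuming you modify the
generated cluster YAML before applying it; the CIDR values below are illustrative placeholders only:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.16.0.0/16
    services:
      cidrBlocks:
        - 10.100.0.0/16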
» (Optional) Modify Control Plane Audit logs - Users can modify the KubeadmControlPlane Cluster API
object to configure different kubelet options. If you need to configure your control plane beyond the
existing options available from flags, see Configuring your control plane.
6. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully.
Note: If the cluster creation fails, check issues with your environment such as storage resources. If the cluster
becomes self-managed before it stalls, you can investigate what is running and what has failed to try to resolve
those issues independently. See Resource Requirements and Inspect Cluster Issues for more
information.
7. Create the cluster from the objects generated from the dry run.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: A warning will appear in the console if the resource already exists, requiring you to remove the resource or
update your YAML.
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above, create
the cluster from the objects you created by specifying the directory: kubectl create -f <existing-directory>/
9. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                        READY   SEVERITY   REASON   SINCE   MESSAGE
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command. See Docker Hub's rate limit.
11. Verify that the kubeadm control plane is ready with the command.
kubectl get kubeadmcontrolplane
Output is similar to:
NAME                                  CLUSTER                 INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
nutanix-e2e-cluster-1-control-plane   nutanix-e2e-cluster-1   true          true                   3          3       3         0             14h   v1.29.6
12. Describe the kubeadm control plane and check its status and events with the command.
kubectl describe kubeadmcontrolplane
13. As they progress, the controllers also create Events, which you can list using the command
kubectl get events | grep ${CLUSTER_NAME}
For brevity, this example uses grep. You can also use separate commands to get Events for specific objects,
such as kubectl get events --field-selector involvedObject.kind="NutanixCluster".
Note: NKP uses a CSI storage container on your Prism Element (PE). The CSI Storage Container image names must
be the same for every PE environment in which you deploy an NKP cluster.
• Ensure you have a VPC associated with the external subnet on PCVM with connectivity outside the cluster.
Note: You must configure the default route (0.0.0.0/0) to the external subnet as the next hop for connectivity
outside the cluster (north-south connectivity).
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Note: Additional steps are required before and during cluster deployment if you use VPC instead of other
environments.
Procedure
2. Set the environment variable for cluster name using the command export CLUSTER_NAME=<my-nutanix-
cluster>.
4. Create a VM (Bastion) inside a VPC subnet where you want to create the cluster. See Creating a VM through
Prism Central.
5. Associate a Floating IP with the Bastion VM. See Assigning Secondary IP Addresses to Floating IPs.
7. Extract the package onto the Bastion VM: tar -xzvf nkp-bundle_v2.12.0_linux_amd64.tar.gz.
» Inside the VPC: Proceed directly to the Create a Kubernetes cluster step.
Note: To access the cluster in the VPC, use the Bastion VM or any other VM in the same VPC.
» Outside the VPC: If access is needed from outside the VPC, link the floating IP to an internal IP used as
CONTROL_PLANE_ENDPOINT_IP while deploying the cluster. For information on Floating IP, see the topic
Request Floating IPs in Flow Virtual Networking.
Note: After creating the cluster, access the cluster in the VPC from outside by using the updated kubeconfig.
Note: To access the UI outside the VPC, you need to request three floating IPs.
9. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
The default subnets used in NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
If you need to change the Kubernetes subnets, you must do this at cluster creation. For more information, see the
topic Subnets.
10. Create a Kubernetes cluster using the example, populating the fields starting with $ as required.
Note: Pass the second floating IP from step 8 as --extra-sans.
11. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully.
Note: If the cluster creation fails, check issues with your environment such as storage resources. If the cluster
becomes self-managed before it stalls, you can investigate what is running and what has failed to try to resolve
those issues independently. See Resource Requirements and Inspect Cluster Issues for more
information.
12. Create the cluster from the objects generated from the dry run.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: A warning will appear in the console if the resource already exists, requiring you to remove the resource or
update your YAML.
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above, create
the cluster from the objects you created by specifying the directory: kubectl create -f <existing-directory>/
14. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                  READY   SEVERITY   REASON   SINCE   MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                    True                         13h
##ClusterInfrastructure - NutanixCluster/nutanix-e2e-cluster_name-1   True                         13h
##ControlPlane - KubeadmControlPlane/nutanix-control-plane            True                         13h
# ##Machine/nutanix--control-plane-7llgd                              True                         13h
# ##Machine/nutanix--control-plane-vncbl                              True                         13h
# ##Machine/nutanix--control-plane-wbgrm                              True                         13h
##Workers
##MachineDeployment/nutanix--md-0                                     True                         13h
##Machine/nutanix--md-0-74c849dc8c-67rv4                              True                         13h
##Machine/nutanix--md-0-74c849dc8c-n2skc                              True                         13h
##Machine/nutanix--md-0-74c849dc8c-nkftv                              True                         13h
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username= --registry-mirror-password= on the nkp create
cluster command. See Docker Hub's rate limit.
15. If the cluster needs to be accessed from outside the VPC, get the kubeconfig of the cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
To access the cluster from outside the VPC, update the server field in ${CLUSTER_NAME}.conf:
server: https://CONTROL_PLANE_ENDPOINT_IP:6443
where CONTROL_PLANE_ENDPOINT_IP is replaced with the floating IP passed as --extra-sans during Kubernetes
cluster creation.
17. Verify that the kubeadm control plane is ready with the command.
kubectl get kubeadmcontrolplane
Output is similar to:
NAME                                  CLUSTER                 INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
nutanix-e2e-cluster-1-control-plane   nutanix-e2e-cluster-1   true          true                   3          3       3         0             14h   v1.29.6
18. Describe the kubeadm control plane and check its status and events with the command.
kubectl describe kubeadmcontrolplane
19. As they progress, the controllers also create Events, which you can list using the command
kubectl get events | grep ${CLUSTER_NAME}
For brevity, this example uses grep. You can also use separate commands to get Events for specific objects,
such as kubectl get events --field-selector involvedObject.kind="NutanixCluster".
By default, the control-plane nodes are created in three different zones, while the default worker nodes reside
in a single zone. You can create additional node pools in other zones with the nkp create nodepool
command.
Availability zones (AZs) are isolated locations within datacenter regions where public cloud services originate
and operate. Because all the nodes in a node pool are deployed in a single Availability Zone, you may wish to
create additional node pools to ensure your cluster has nodes deployed in multiple Availability Zones.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=gcp-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME                                        READY   SEVERITY   REASON   SINCE   MESSAGE
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. To enable NKP Catalog Applications and install Kommander using the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually producing output resembling
the following:
helmrelease.helm.toolkit.fluxcd.io/chartmuseum condition met
helmrelease.helm.toolkit.fluxcd.io/cluster-observer-2360587938 condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/gatekeeper condition met
helmrelease.helm.toolkit.fluxcd.io/gatekeeper-proxy-mutations condition met
helmrelease.helm.toolkit.fluxcd.io/karma-traefik-certs condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-operator condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-ui condition met
helmrelease.helm.toolkit.fluxcd.io/kube-oidc-proxy condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-traefik-certs condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-traefik-certs condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander
get helmrelease <HELMRELEASE_NAME>
If you find any HelmReleases in a "broken" release state, such as "exhausted" or "another rollback/release in
progress", trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp get dashboard --kubeconfig=${CLUSTER_NAME}.conf
Section Contents
Note: The Cluster Autoscaler reads the following annotations on a node pool's MachineDeployment to determine the
minimum and maximum size of that node group:
cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size
The full list of command line arguments to the Cluster Autoscaler controller is on the Kubernetes public GitHub
repository.
For more information on how Cluster Autoscaler works, see these documents:
Procedure
1. Locate the MachineDeployment for the worker nodes whose autoscaling you want to adjust. You can retrieve the
name of your node pool by running the command nkp get nodepool --cluster-name ${CLUSTER_NAME}.
Example:
nkp get nodepool --cluster-name ${CLUSTER_NAME}
Locate the node pool name under spec.topology.workers.machineDeployments.metadata.name and select
the node pool you want to scale.
2. To enable autoscaling on your Nutanix cluster, edit the cluster object using kubectl; a sketch of the annotations
to add follows the example below.
Example:
kubectl edit cluster ${CLUSTER_NAME}
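A minimal sketch of the autoscaler annotations applied to a node pool inside the cluster object; the node pool name
and size values below are placeholders, and the exact placement can vary with your cluster topology:
spec:
  topology:
    workers:
      machineDeployments:
        - name: md-0    # node pool name located in step 1
          metadata:
            annotations:
              cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "3"
              cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "6"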
Pre-provisioned Infrastructure
Configuration types for installing the Nutanix Kubernetes Platform (NKP) on a Pre-provisioned
Infrastructure.
Create a Kubernetes cluster on pre-provisioned nodes in a bare metal infrastructure.
The following procedure describes creating an NKP cluster on a pre-provisioned infrastructure using SSH. For more
information on a Pre-provisioned environment, see Pre-provisioned Infrastructure on page 22
Completing this procedure results in a Kubernetes cluster that includes a Container Networking Interface (CNI) and a
Local Persistence Volume Static Provisioner that is ready for workload deployment.
Before moving to a production environment, you might add applications for logging and monitoring, storage,
security, and other functions. You can use NKP to select and deploy applications or deploy your own. For more
information, see Deploying Platform Applications Using CLI on page 389.
For more information, see:
Section Contents
• Installing an operating system, device drivers, and partitioning and setup tools
• Installing enterprise software and applications
• Setting parameters such as IP addresses
• Performing partitioning or installation of virtualization software
• Connectivity, whether an air-gapped or non-air-gapped environment, meaning it is connected to the internet
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• To use a local registry, whether air-gapped or non-air-gapped environment, download and extract the
bundle. Download the Complete NKP Air-gapped Bundle for this release (for example, nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz) to load the registry.
Machine Specifications
You need to have at least three Control Plane Machines.
Each control plane machine must have the following:
• 4 cores
• 16 GB memory
• Approximately 80 GB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• 15% free space on the root file system.
Note: Swap must be disabled. The kubelet does not have generally available support for swap. Because the commands
to disable swap vary, refer to your operating system documentation.
Worker Machines
You need to have at least four worker machines. The specific number of worker machines required for your
environment can vary depending on the cluster workload and size of the machines.
Each worker machine must have the following:
• 8 cores
• 32 GiB memory
• Around 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd
• 15% free space on the root file system
• If you plan to use local volume provisioning to provide persistent volumes for your workloads, you must mount at
least four volumes to the /mnt/disks/ mount point on each machine. Each volume must have at least 55 GiB of
capacity.
• Ensure your disk meets the resource requirements for Rook Ceph in Block mode for ObjectStorageDaemons as
specified in the requirements table.
• Multiple ports open, as described in NKP Ports.
• firewalld systemd service disabled. If it exists and is enabled, use the commands systemctl stop
firewalld then systemctl disable firewalld, so that firewalld remains disabled after the machine
restarts.
• For a Pre-provisioned environment using Ubuntu 20.04, ensure the machine has the /run directory mounted with
exec permissions.
Note: Swap must be disabled. The kubelet does not have generally available support for swap. Because the commands
to disable swap vary, refer to your operating system documentation.
1. Export the following environment variables, ensuring that all control plane and worker nodes are included:
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_SECRET_NAME="$CLUSTER_NAME-ssh-key"
The environment variables you set in this step automatically replace the variable names when the inventory
YAML file is created.
3. To tell the bootstrap cluster which nodes you want to be control plane nodes and which nodes are worker
nodes, apply the inventory file to the bootstrap cluster using the command kubectl apply -f
preprovisioned_inventory.yaml (a sketch of such an inventory file follows the example output below).
Example:
preprovisionedinventory.infrastructure.cluster.konvoy.nutanix.io/preprovisioned-
example-control-plane created
preprovisionedinventory.infrastructure.cluster.konvoy.nutanix.io/preprovisioned-
example-md-0 created
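The structure of preprovisioned_inventory.yaml is defined in the Defining Cluster Hosts and Infrastructure topic.
As a hedged sketch only, the apiVersion, field names, and values below are assumptions based on the pre-provisioned
provider and the environment variables exported earlier, and may differ from your version:
apiVersion: infrastructure.cluster.konvoy.nutanix.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: preprovisioned-example-control-plane
spec:
  hosts:
    - address: $CONTROL_PLANE_1_ADDRESS
    - address: $CONTROL_PLANE_2_ADDRESS
    - address: $CONTROL_PLANE_3_ADDRESS
  sshConfig:
    port: 22
    user: $SSH_USER
    privateKeyRef:
      name: $SSH_PRIVATE_KEY_SECRET_NAME
      namespace: default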
What to do next
Pre-provisioned Cluster Creation Customization Choices
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from the bastion machine and
the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes cluster.
Procedure
2. After extraction, you can access files from the resulting directories. For example, change to the nkp-<version>
directory, as shown below.
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
4. Load the air-gapped image bundle into your private registry, applying the relevant flags with the variables set
above, using the command nkp push bundle --bundle ./container-images/konvoy-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the network's
performance between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your credentials to create the cluster by setting the --
registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=
--registry-mirror-password= flag when using the command nkp create cluster.
5. Load the Kommander component images to your private registry using the command nkp push bundle
--bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-registry=
${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-password=
${REGISTRY_PASSWORD}.
Optional step, required only if you have an Ultimate license: load the nkp-catalog-applications image bundle
into your private registry using the following command.
Example:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Replacing the Pre-provisioned Driver with the Azure Disk CSI Driver
After your bootstrap is running and your cluster is created, you will need to install the Azure Disk CSI Driver
on your pre-provisioned Azure Kubernetes cluster.
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• CLI tool Kubectl is used to interact with the running cluster. https://kubernetes.io/docs/tasks/tools/#kubectl
• Azure CLI. For more information, see https://docs.microsoft.com/en-us/cli/azure/install-azure-cli.
• A valid Azure account with credentials configured. For more information, see https://github.com/kubernetes-
sigs/cluster-api-provider-azure/blob/master/docs/book/src/topics/getting-started.md#prerequisites.
• Ability to download artifacts from the internet and then copy those onto your Bastion machine.
• Download the Complete NKP Air-gapped Bundle for this release - nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz.
• An existing local registry to seed the air-gapped environment. For more information, see Registry and
Registry Mirrors on page 650.
Procedure
2. Create an Azure Service Principal (SP) by using the command az ad sp create-for-rbac --role
contributor --name "$(whoami)-konvoy" --scopes=/subscriptions/$(az account show --
query id -o tsv).
This command will rotate the password if an SP with the name exists.
Example output:
{
"appId": "7654321a-1a23-567b-b789-0987b6543a21",
"displayName": "azure-cli-2021-03-09-23-17-06",
"password": "Z79yVstq_E.R0R7RUUck718vEHSuyhAB0C",
"tenant": "a1234567-b132-1234-1a11-1234a5678b90"
}
• For air-gapped environments, you need to create a resource management private link with a private
endpoint to ensure the Azure CSI driver will run correctly in further steps. Private links enable you to
access Azure services over a private endpoint in your virtual network. For more information, see https://
learn.microsoft.com/en-us/azure/azure-resource-manager/management/create-private-link-access-
portal.
To set up a private link resource, use the following process.
1. Create a private resource management link using Azure CLI. For more information, see https://
learn.microsoft.com/en-us/azure/azure-resource-manager/management/create-private-link-
access-commands?tabs=azure-cli#create-resource-management-private-link
4. Set your KUBECONFIG environment variable using the command export KUBECONFIG=
${CLUSTER_NAME}.conf
5. Create the Secret with the Azure credentials. The Azure CSI driver will use this.
b. Create the Secret using the command kubectl create secret generic azure-cloud-provider --
namespace=kube-system --type=Opaque --from-file=cloud-config=azure.json.
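Sub-step a, which creates the azure.json file referenced above, is not shown in this excerpt. As a hedged sketch
only, assuming the standard Azure cloud provider configuration fields and the Service Principal values returned in
step 2, the file might contain:
{
  "cloud": "AzurePublicCloud",
  "tenantId": "<tenant from the SP output>",
  "subscriptionId": "<your subscription ID>",
  "aadClientId": "<appId from the SP output>",
  "aadClientSecret": "<password from the SP output>",
  "resourceGroup": "<your resource group>",
  "location": "<your Azure region>"
}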
6. Install the Azure Disk CSI driver using the command curl -skSL https://
raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/v1.26.2/deploy/
install-driver.sh | bash -s v1.26.2 snapshot --.
7. Check whether the driver is ready for use using the commands kubectl -n kube-system get pod
-o wide --watch -l app=csi-azuredisk-controller and kubectl -n kube-system get pod -o
wide --watch -l app=csi-azuredisk-node.
Kubernetes now recognizes the Azure Disk CSI driver and can provision disks on Azure.
8. Create the StorageClass for the Azure Disk CSI Driver using the command kubectl create -f https://
raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/deploy/
example/storageclass-azuredisk-csi.yaml.
9. Change the default storage class to this new StorageClass so that every new disk will be created in the Azure
environment, using the commands kubectl patch sc/localvolumeprovisioner -p '{"metadata":
{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' and kubectl
patch sc/managed-csi -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/
is-default-class":"true"}}}'.
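To confirm the change, list the storage classes and verify that managed-csi is now marked as (default):
kubectl get storageclass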
What to do next
For more information about Azure Disk CSI for persistent storage and changing the default StorageClass,
see Default Storage Providers in NKP.
• Pre-provisioned Customizing CAPI Clusters : Familiarize yourself with the Cluster API before editing the
cluster objects, as edits can prevent the cluster from deploying successfully.
• Pre-provisioned Registry Mirrors: In an air-gapped environment, you need a local repository to store Helm
charts, Docker images, and other artifacts. In an environment with access to the Internet, you can retrieve artifacts
from specialized repositories dedicated to them, such as Docker images contained in DockerHub and Helm Charts
that come from a dedicated Helm Chart repository.
• Pre-provisioned Create Secrets and Overrides: Create necessary secrets and overrides for pre-provisioned
clusters. Most applications deployed through Kubernetes https://kubernetes.io/docs/concepts/configuration/
secret/ require access to databases, services, and other external resources. The easiest way to manage the login
information necessary to access those resources is using secrets, which help organize and distribute sensitive
information across a cluster while minimizing the risk of sensitive information exposure.
• Pre-provisioned Define Control Plane Endpoint: A control plane needs to have three, five, or seven nodes
to remain available if one, two, or three nodes fail. A control plane with one node is not for production use.
• Pre-provisioned Configure MetalLB: An external load balancer (LB) is recommended to be the control
plane endpoint. To distribute request load among the control plane machines, configure the load balancer to
send requests to all the control plane machines. Configure the load balancer to send requests only to control
plane machines responding to API requests. If you do not have one, you can use MetalLB to create a MetalLB
configmap for your Pre-provisioned infrastructure.
• Pre-provisioned Modify the Calico Installation: Calico is a networking and security solution that enables
Kubernetes and non-Kubernetes/legacy workloads to communicate seamlessly and securely. Sometimes, changes
are needed, so use the information on this Pre-provisioned Modify the Calico Installation page.
• Pre-provisioned Built-in Virtual IP: As explained in Define the Control Plane Endpoint, we recommend
using an external load balancer for the control plane endpoint but provide a built-in virtual IP when one is not
available.
• Pre-provisioned Use HTTP Proxy: When you require HTTP proxy configurations, you can apply them
during the create operation by adding the appropriate flags to the nkp create cluster command.
• Pre-provisioned Use Alternate Pod or Service Subnets: Some subnets are reserved by Kubernetes and
can prevent proper cluster deployment if you unknowingly configure NKP so that the Node subnet collides with
either the Pod or Service subnet.
A control plane with one node can use its single node as the endpoint, so you will not require an external load
balancer or a built-in virtual IP. At least one control plane node must always be running. Therefore, a spare machine
must be available in the control plane inventory to upgrade a cluster with one control plane node. This machine is
used to provision the new node before the old node is deleted.
When the API server endpoints are defined, you can create the cluster.
Note: For more information on modifying Control Plane Audit logs settings, see Configuring the Control Plane.
Section Contents
• Built-in virtual IP
You can use the built-in virtual IP if an external load balancer is unavailable. The virtual IP is not a load balancer; it
does not distribute request load among the control plane machines. However, if the machine receiving requests does
not respond, the virtual IP automatically moves to another machine.
Note: Modify Control Plane Audit log settings using the information on the page Configure the Control Plane.
Known Limitations
The control plane endpoint port is also used as the API server port on each control plane machine. The default port is
6443. Before creating the cluster, ensure the port is available on each control plane machine.
Layer 2 Configuration
Layer 2 mode is the simplest to configure: in many cases, you don’t need any protocol-specific configuration, only IP
addresses. It does not require the IPs to be bound to the network interfaces of your worker nodes. It responds to ARP
requests on your local network directly and gives clients the machine’s MAC address.
• MetalLB IP address ranges or CIDRs must be within the node’s primary network subnet. For more information,
see Managing Subnets and Pods on page 651.
• MetalLB IP address ranges, CIDRs, and node subnets must not conflict with the Kubernetes cluster pod and
service subnets.
For example, the following configuration gives MetalLB control over IPs from 192.168.1.240 to 192.168.1.250 and
configures Layer 2 mode:
The following values are generic; enter your specific values into the fields where applicable.
cat << EOF > metallb-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF
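After creating metallb-conf.yaml, apply it to the workload cluster; a hedged example, assuming the workload cluster
kubeconfig used elsewhere in this guide:
kubectl apply -f metallb-conf.yaml --kubeconfig ${CLUSTER_NAME}.conf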
BGP Configuration
For a basic configuration featuring one BGP router and one IP address range, you need four pieces of information:
--virtual-ip-interface string: the network interface to use for the virtual IP. It must exist on all control plane
machines.
--control-plane-endpoint string: an IPv4 address reserved for use by the cluster.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See Kubernetes for more naming information.
Procedure
Set the environment variable to be used throughout this procedure using the command export
CLUSTER_NAME=<preprovisioned-example>.
(Optional) If you want to create a unique cluster name, use the command export
CLUSTER_NAME=preprovisioned-example-$(LC_CTYPE=C tr -dc 'a-z0-9' </dev/urandom | fold -
w 5 | head -n1) and then run echo $CLUSTER_NAME to display it.
Note: This creates a unique name every time you run it, so use it carefully.
Create a Secret
Procedure
3. Create the secret using the command kubectl create secret generic
${SSH_PRIVATE_KEY_SECRET_NAME} --from-file=ssh-privatekey=${SSH_PRIVATE_KEY_FILE}
Create Overrides
Procedure
1. Example CentOS7 and Docker - If you want to provide an override with Docker credentials and a different source
for EPEL on a CentOS7 machine, create a file like this.
cat > overrides.yaml << EOF
image_registries_with_auth:
  - host: "registry-1.docker.io"
    username: "my-user"
    password: "my-password"
    auth: ""
    identityToken: ""
epel_centos_7_rpm: https://my-rpm-repostory.org/epel/epel-release-latest-7.noarch.rpm
EOF
You can then create the related secret by using the command kubectl create secret generic
$CLUSTER_NAME-user-overrides --from-file=overrides.yaml=overrides.yaml and then label it with
kubectl label secret $CLUSTER_NAME-user-overrides clusterctl.cluster.x-k8s.io/move=.
2. When using Oracle 7 OS, you might wish to deploy the RHCK kernel instead of the default UEK kernel. To do so,
add the following text to your overrides.yaml.
cat > overrides.yaml << EOF
---
oracle_kernel: RHCK
EOF
You can then create the related secret by using the command kubectl create secret generic
$CLUSTER_NAME-user-overrides --from-file=overrides.yaml=overrides.yaml and then label it with
kubectl label secret $CLUSTER_NAME-user-overrides clusterctl.cluster.x-k8s.io/move=.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
Set the environment variable to be used throughout this procedure using the command export
CLUSTER_NAME=<preprovisioned-example>.
(Optional) If you want to create a unique cluster name, use the command export
CLUSTER_NAME=preprovisioned-example-$(LC_CTYPE=C tr -dc 'a-z0-9' </dev/urandom | fold -
w 5 | head -n1) and then run echo $CLUSTER_NAME to display it.
Note: This creates a unique name every time you run it, so use it carefully.
Create a Secret
Procedure
3. Create the secret using the command kubectl create secret generic
${SSH_PRIVATE_KEY_SECRET_NAME} --from-file=ssh-privatekey=${SSH_PRIVATE_KEY_FILE} and then
label it with kubectl label secret ${SSH_PRIVATE_KEY_SECRET_NAME} clusterctl.cluster.x-k8s.io/
move=.
Example output:
secret/preprovisioned-example-ssh-key created
secret/preprovisioned-example-ssh-key labeled
Note: Konvoy Image Builder (KIB) can produce images containing FIPS-140 compliant binaries. Use the
fips.yaml FIPS Override Non-air-gapped files provided with the image bundles. To locate the available
Override files in the Konvoy Image Builder repo, see https://github.com/mesosphere/konvoy-image-
builder/tree/main/overrides.
Create Overrides
1. Create an overrides file that includes the customization Overrides for FIPS compliance.
cat > overrides.yaml << EOF
---
k8s_image_registry: docker.io/mesosphere
fips:
  enabled: true
  build_name_extra: -fips
  kubernetes_build_metadata: fips.0
default_image_repo: hub.docker.io/mesosphere
kubernetes_rpm_repository_url: "https://packages.nutanix.com/konvoy/stable/linux/repos/el/kubernetes-v{{ kubernetes_version }}-fips/x86_64"
docker_rpm_repository_url: "\
https://containerd-fips.s3.us-east-2.amazonaws.com\
/{{ ansible_distribution_major_version|int }}\
/x86_64"
EOF
2. If your pre-provisioned machines need customization with alternate package libraries, Docker image or other
container registry image repos, or other Custom Override Files, add more lines to the same Overrides file.
a. Example One - If you want to provide an override with Docker credentials and a different source for EPEL on
a CentOS7 machine, create a file like this.
cat > overrides.yaml << EOF
---
# fips configuration
k8s_image_registry: docker.io/mesosphere
fips:
  enabled: true
  build_name_extra: -fips
  kubernetes_build_metadata: fips.0
default_image_repo: hub.docker.io/mesosphere
kubernetes_rpm_repository_url: "https://packages.nutanix.com/konvoy/stable/linux/repos/el/kubernetes-v{{ kubernetes_version }}-fips/x86_64"
docker_rpm_repository_url: "\
https://containerd-fips.s3.us-east-2.amazonaws.com\
/{{ ansible_distribution_major_version|int }}\
/x86_64"
# custom configuration
image_registries_with_auth:
  - host: "registry-1.docker.io"
    username: "my-user"
    password: "my-password"
    auth: ""
    identityToken: ""
epel_centos_7_rpm: https://my-rpm-repostory.org/epel/epel-release-latest-7.noarch.rpm
EOF
fips:
  enabled: true
  build_name_extra: -fips
  kubernetes_build_metadata: fips.0
default_image_repo: hub.docker.io/mesosphere
kubernetes_rpm_repository_url: "https://packages.nutanix.com/konvoy/stable/linux/repos/el/kubernetes-v{{ kubernetes_version }}-fips/x86_64"
docker_rpm_repository_url: "\
https://containerd-fips.s3.us-east-2.amazonaws.com\
/{{ ansible_distribution_major_version|int }}\
/x86_64"
# custom configuration
oracle_kernel: RHCK
EOF
3. Create the related secret by using the command kubectl create secret generic $CLUSTER_NAME-
user-overrides --from-file=overrides.yaml=overrides.yaml and then label it with kubectl label
secret $CLUSTER_NAME-user-overrides clusterctl.cluster.x-k8s.io/move=.
Note: Azure does not set the interface. Proceed to the Change the Encapsulation Type section below.
Procedure
1. Follow the steps outlined in this section to modify Calico’s configuration. In this example, all cluster nodes use
ens192 as the interface name. Get the pods running on your cluster with this command.
kubectl get pods -A --kubeconfig ${CLUSTER_NAME}.conf
Output:
NAMESPACE       NAME                                       READY   STATUS    RESTARTS   AGE
calico-system   calico-kube-controllers-57fbd7bd59-vpn8b   1/1     Running   0          16m
calico-system   calico-node-5tbvl                          1/1     Running   0          16m
calico-system   calico-node-nbdwd                          1/1     Running   0          4m40s
Note: If a calico-node pod is not ready on your cluster, you must edit the default Installation resource.
To edit the Installation resource, run the command: kubectl edit installation default --
kubeconfig ${CLUSTER_NAME}.conf
3. Save this resource. If that pod has failed, you might need to delete the node feature discovery worker pod in
the node-feature-discovery namespace. After you delete it, Kubernetes replaces the pod as part of its normal
reconciliation.
Procedure
Note: Azure only supports VXLAN encapsulation type. Therefore, if you install on Azure pre-provisioned VMs, you
must set the encapsulation mode to VXLAN.
Nutanix recommends using the method below to change the encapsulation type after cluster creation but before
production. To change the encapsulation type, follow these steps:
Procedure
1. First, remove the existing default-ipv4-ippool IPPool resource from the cluster. After you edit the
Installation resource, the IPPool resource must be deleted so that it can be recreated. Run the command below to
delete it.
kubectl delete ippool default-ipv4-ippool
Note: VXLAN is a tunneling protocol that encapsulates Layer 2 Ethernet frames in UDP packets, enabling you to
create virtualized Layer 2 subnets that span Layer 3 networks. It has a slightly larger header than IP-in-IP, which
slightly reduces performance over IP-in-IP.
Note: IP-in-IP (IPIP) is an IP tunneling protocol that encapsulates one IP packet in another IP packet. An outer
packet header is added with the tunnel entry and exit points. The Calico implementation of this protocol uses BGP to
determine the exit point, which makes this protocol unusable on networks that don’t pass BGP.
Tip: If using Windows, see this documentation on the Calico site regarding limitations: Calico for Windows
VXLAN
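The encapsulation itself is set in the default Installation resource, which you can edit with kubectl edit
installation default --kubeconfig ${CLUSTER_NAME}.conf as shown earlier. As a hedged sketch only, assuming the
Calico operator's Installation API, the relevant section might look like the following after switching to VXLAN
(the CIDR shown is the default pod subnet from this guide):
spec:
  calicoNetwork:
    ipPools:
    - cidr: 192.168.0.0/16
      encapsulation: VXLAN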
Note: If desired, a local registry can also be used in a non-air-gapped environment for speed and security. To do so,
add the Pre-provisioned Air-gapped Define Environment steps to your non-air-gapped installation process.
Section Contents
Bootstrap Pre-provisioned
To create Kubernetes clusters, NKP uses Cluster API (CAPI) controllers. These controllers run on a
Kubernetes cluster.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful. For more information, see
Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Example output:
NAMESPACE                            NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                          capa-controller-manager                         1/1     1            1           1h
capg-system                          capg-controller-manager                         1/1     1            1           1h
capi-kubeadm-bootstrap-system        capi-kubeadm-bootstrap-controller-manager       1/1     1            1           1h
capi-kubeadm-control-plane-system    capi-kubeadm-control-plane-controller-manager   1/1     1            1           1h
capi-system                          capi-controller-manager                         1/1     1            1           1h
cappp-system                         cappp-controller-manager                        1/1     1            1           1h
capv-system                          capv-controller-manager                         1/1     1            1           1h
capz-system                          capz-controller-manager                         1/1     1            1           1h
cert-manager                         cert-manager                                    1/1     1            1           1h
cert-manager                         cert-manager-cainjector                         1/1     1            1           1h
cert-manager                         cert-manager-webhook                            1/1     1            1           1h
Procedure
Note: The cluster name may only contain the following characters: a-z, 0-9, and -. Cluster creation will fail if the
name has capital letters. See Kubernetes for more naming information.
What to do next
Create a Kubernetes Cluster
Before you create a new Nutanix Kubernetes Platform (NKP) cluster below, you may choose an external load
balancer or virtual IP and use the corresponding nkp create cluster command example from that page in the
docs from the links below. Other customizations are available but require different flags during the nkp create
cluster command. Refer to Pre-provisioned Cluster Creation Customization Choices for more cluster
customizations.
When you create a new NKP cluster below, choose an external load balancer (LB) or virtual IP and use the
corresponding nkp create cluster command.
In a pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage
devices in your datacenter.
NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment. However,
localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible storage that is
suitable for production.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands in this section of the Kubernetes
documentation: Changing the Default Storage Class
For Pre-provisioned environments, you define a set of existing nodes. During the cluster creation process, Konvoy
Image Builder(KIB) is built into NKP and automatically runs the machine configuration process (which KIB uses to
build images for other providers) against the set of nodes that you defined. This results in your pre-existing or pre-
provisioned nodes being appropriately configured.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the
Kubernetes control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Note: (Optional) If you have overrides for your clusters, specify the secret as part of the create cluster command,
for example --override-secret-name=$CLUSTER_NAME-user-overrides. If this is not specified, the overrides
for your nodes will not be applied. See the topic Custom Overrides for details.
Note: (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when
attempting to pull images previously pushed to your registry when defining your infrastructure. Instructions in the
expandable Custom Installation section. For registry mirror information, see topics Using a Registry Mirror and
Registry Mirror Tools.
Note: When creating the cluster, specify the cluster name. Use the same cluster name that you used when defining
your inventory objects. See the topic Defining Cluster Hosts and Infrastructure for more details.
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
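For example, a sketch of the proxy flags as they would be appended to any of the nkp create cluster commands on this page; the addresses and no-proxy list are placeholders you must replace with your own values:
--http-proxy http://<proxy-address>:<proxy-port> \
--https-proxy https://<proxy-address>:<proxy-port> \
--no-proxy 127.0.0.1,localhost,<internal-subnets>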
1. ALTERNATIVE Virtual IP - if you don’t have an external LB and want to use a VIRTUAL IP provided by
kube-vip, specify these flags example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1 \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
2. Inspect or edit the cluster objects and familiarize yourself with Cluster API before editing them, as edits can
prevent the cluster from deploying successfully.
3. Create the cluster from the objects generated in the dry run. A warning will appear in the console if the resource
already exists, requiring you to remove the resource or update your YAML.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your nkp create .. --dry-run step above, create
the cluster from the objects you created by specifying the directory:
kubectl create -f <existing-directory>/
Note: It will take a few minutes to create, depending on the cluster size.
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation but before
production.
Important: If you need to increase Docker Hub's rate limit, use your Docker Hub credentials when creating the
cluster by setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username= --registry-mirror-password= on the nkp create cluster
command.
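For example, a hedged sketch of the Docker Hub flags from the note above appended to the create command; the credentials are placeholders, and the other required flags shown earlier on this page are omitted for brevity:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--registry-mirror-url=https://registry-1.docker.io \
--registry-mirror-username=<docker-hub-username> \
--registry-mirror-password=<docker-hub-password>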
Audit Logs
To modify Control Plane audit log settings, use the information on the Configure the Control Plane page.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is still on the bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot.
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME READY
SEVERITY REASON SINCE MESSAGE
Cluster/preprovisioned-example True
2m31s
├─ClusterInfrastructure - PreprovisionedCluster/preprovisioned-example
└─MachineDeployment/preprovisioned-example-md-0 True
2m34s
  └─Machine/preprovisioned-example-md-0-77f667cd9-tnctd True
2m33s
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Output:
# Deleting bootstrap cluster
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Installing Kommander
This section provides installation instructions for the Kommander component of NKP in a non-air-gapped
Pre-provisioned environment.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to retry.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default configuration ships Ceph with PersistentVolumeClaim (PVC) based storage, which requires your CSI provider to support PVCs with volumeMode: Block. As this is not possible with the default local static provisioner, you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can consume all or just some of the devices on your nodes. Include one of the following overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
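For illustration only, a hypothetical cluster-wide override that restricts Ceph to devices matching a name pattern; the deviceFilter value below is an assumption and must be adjusted to your hardware (see the Storage Selection Settings reference for the exact syntax):
rook-ceph-cluster:
  enabled: true
  values: |
    cephClusterSpec:
      storage:
        storageClassDeviceSets: []
        useAllNodes: true
        useAllDevices: false
        deviceFilter: "^sd[bc]"  # assumption: consume only /dev/sdb and /dev/sdc on each node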
a. See the Kommander Customizations page for customization options. Some options include Custom Domains and Certificates, HTTP proxy, and External Load Balancer.
6. Enable NKP Catalog Applications and install Kommander. In the same kommander.yaml from the previous section, add the following values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
kommander.nutanix.io/project-default-catalog-repository: "true"
kommander.nutanix.io/workspace-default-catalog-repository: "true"
kommander.nutanix.io/gitapps-gitrepository-type: "nkp"
gitRepositorySpec:
url: https://github.com/mesosphere/nkp-catalog-applications
ref:
tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by the following command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation. The majority of this customization, such as attaching clusters and deploying applications, takes place in the Nutanix Kubernetes Platform (NKP) dashboard or UI. The Day 2 section allows you to manage cluster operations and their application workloads to optimize your organization’s productivity.
Section Contents
Follow these steps to deploy NKP in a Pre-provisioned, Non-air-gapped environment:
• Local repositories to store Helm charts, Docker images, and other artifacts. Tools such as ECR, jFrog, Harbor,
and Nexus handle multiple types of artifacts in one local repository.
• Bastion Host - If you have not set up a Bastion Host yet, refer to that Documentation section.
• The complete NKP air-gapped bundle, which contains all the NKP components needed for an air-gapped environment installation and for using a local registry in a non-air-gapped environment. See Pre-provisioned: Loading the Registry.
Copy Air-gapped Artifacts onto Cluster Hosts
Procedure
2. You must fetch the distro packages and other artifacts. You get the latest security fixes available at machine image
build time by fetching the distro packages from distro repositories.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note:
a. The bootstrap image must be extracted and loaded onto the bastion host.
b. Artifacts must be copied onto cluster hosts for nodes to access.
c. If using GPUs, the GPU artifacts must be positioned locally.
d. The registry must be seeded with images locally.
5. Load the bootstrap image on your bastion machine from the air-gapped bundle you downloaded (nkp-air-
gapped-bundle_v2.12.0_linux_amd64.tar.gz)
docker load -i konvoy-bootstrap-image-v2.12.0.tar
6. Copy air-gapped artifacts onto cluster hosts. Using the Konvoy Image Builder, you can copy the required
artifacts onto your cluster hosts. The Kubernetes image bundle will be located in kib/artifacts/images and
you will want to verify the image and artifacts.
b. Verify the artifacts for your OS exist in the artifacts/ directory and export the appropriate variables.
$ ls kib/artifacts/
1.29.6_centos_7_x86_64.tar.gz 1.29.6_redhat_8_x86_64_fips.tar.gz
containerd-1.6.28-nutanix.1-rhel-7.9-x86_64.tar.gz containerd-1.6.28-nutanix.1-
rhel-8.6-x86_64_fips.tar.gz pip-packages.tar.gz
1.29.6_centos_7_x86_64_fips.tar.gz 1.29.6_rocky_9_x86_64.tar.gz
containerd-1.6.28-nutanix.1-rhel-7.9-x86_64_fips.tar.gz containerd-1.6.28-
nutanix.1-rocky-9.0-x86_64.tar.gz
1.29.6_redhat_7_x86_64.tar.gz 1.29.6_ubuntu_20_x86_64.tar.gz
containerd-1.6.28-nutanix.1-rhel-8.4-x86_64.tar.gz containerd-1.6.28-nutanix.1-
rocky-9.1-x86_64.tar.gz
1.29.6_redhat_7_x86_64_fips.tar.gz containerd-1.6.28-nutanix.1-centos-7.9-
x86_64.tar.gz containerd-1.6.28-nutanix.1-rhel-8.4-x86_64_fips.tar.gz
containerd-1.6.28-nutanix.1-ubuntu-20.04-x86_64.tar.gz
1.29.6_redhat_8_x86_64.tar.gz containerd-1.6.28-nutanix.1-centos-7.9-
x86_64_fips.tar.gz containerd-1.6.28-nutanix.1-rhel-8.6-x86_64.tar.gz images
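For example, if your cluster hosts run RHEL 8.6 without FIPS, the exports might look like the following; pick the file names from the listing above that match your OS:
export OS_PACKAGES_BUNDLE=1.29.6_redhat_8_x86_64.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.28-nutanix.1-rhel-8.6-x86_64.tar.gz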
7. Export the following environment variables, ensuring that all control plane and worker nodes are included.
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_FILE="<private key file>"
SSH_PRIVATE_KEY_FILE must be either the name of the SSH private key file in your working directory or an
absolute path to the file in your user’s home directory.
9. Upload the artifacts onto cluster hosts with the following command.
konvoy-image upload artifacts \
--container-images-dir=./kib/artifacts/images/ \
--os-packages-bundle=./kib/artifacts/$OS_PACKAGES_BUNDLE \
--containerd-bundle=./kib/artifacts/$CONTAINERD_BUNDLE \
--pip-packages-bundle=./kib/artifacts/pip-packages.tar.gz
Flags: Use the overrides flag (for example, --overrides overrides/fips.yaml) and reference either the fips.yaml or offline-fips.yaml manifest located in the overrides directory. You can also see these pages in the documentation. Add GPU flags if needed: --nvidia-runfile=./artifacts/NVIDIA-Linux-x86_64-470.82.01.run
The konvoy-image upload artifacts command copies all OS packages and other artifacts onto each machine in your inventory. When you create the cluster, provisioning connects to each node and runs commands to install those artifacts so that Kubernetes can run. KIB uses variable overrides to specify the base image and container images for your new machine image. The variable overrides files for NVIDIA and FIPS can be ignored unless you are adding those features. To apply them, use the --overrides overrides/fips.yaml,overrides/offline-fips.yaml flag with the manifests located in the overrides directory, as shown in the sketch below.
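A sketch of the upload command with the FIPS overrides applied, assuming the directory layout shown above:
konvoy-image upload artifacts \
--container-images-dir=./kib/artifacts/images/ \
--os-packages-bundle=./kib/artifacts/$OS_PACKAGES_BUNDLE \
--containerd-bundle=./kib/artifacts/$CONTAINERD_BUNDLE \
--pip-packages-bundle=./kib/artifacts/pip-packages.tar.gz \
--overrides overrides/fips.yaml,overrides/offline-fips.yaml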
• Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: If your environment uses HTTP or HTTPS proxies, include the --http-proxy, --https-proxy, and --no-proxy flags and their related values in this command for it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
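If your environment requires the proxy flags mentioned in the note above, a sketch of the bootstrap command might look like the following; the proxy values are placeholders:
nkp create bootstrap --kubeconfig $HOME/.kube/config \
--http-proxy http://<proxy-address>:<proxy-port> \
--https-proxy https://<proxy-address>:<proxy-port> \
--no-proxy 127.0.0.1,localhost,<internal-subnets>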
4. NKP then deploys the following Cluster API providers on the cluster.
5. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Output example:
NAMESPACE NAME
READY UP-TO-DATE AVAILABLE AGE
capa-system capa-controller-manager
1/1 1 1 1h
capg-system capg-controller-manager
1/1 1 1 1h
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager
1/1 1 1 1h
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager
1/1 1 1 1h
capi-system capi-controller-manager
1/1 1 1 1h
cappp-system cappp-controller-manager
1/1 1 1 1h
capv-system capv-controller-manager
1/1 1 1 1h
Procedure
Note: The cluster name may only contain the following characters: a-z, 0-9, and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
What to do next
Create a Kubernetes Cluster
Before you create a new Nutanix Kubernetes Platform (NKP) cluster below, choose an external load balancer (LB) or virtual IP and use the corresponding nkp create cluster command example from the linked pages below. Other customizations are available but require different flags on the nkp create cluster command. Refer to Pre-provisioned Cluster Creation Customization Choices for more cluster customizations.
In a pre-provisioned environment, use the Kubernetes CSI and third-party drivers for local volumes and other storage
devices in your datacenter.
Note: NKP uses a local static provisioner as the default storage provider for a pre-provisioned environment.
However, localvolumeprovisioner is not suitable for production use. Use Kubernetes CSI compatible
storage that is suitable for production.
After turning off localvolumeprovisioner, you can choose from any of the storage options available for
Kubernetes. To make that storage the default storage, use the commands in this section of the Kubernetes
documentation: Changing the Default Storage Class
For Pre-provisioned environments, you define a set of existing nodes. During the cluster creation process, Konvoy Image Builder (KIB), which is built into NKP, automatically runs the machine configuration process (the same process KIB uses to build images for other providers) against the set of nodes you defined. This results in your pre-existing or pre-provisioned nodes being appropriately configured.
The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the
Kubernetes control plane and worker nodes on the hosts defined in the inventory YAML previously created.
Note: (Optional) If you have overrides for your clusters, specify the secret in the create cluster command using --override-secret-name=$CLUSTER_NAME-user-overrides. If the secret is not specified, the overrides for your nodes will not be applied. See the topic Custom Overrides for details.
Note: (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when
attempting to pull images previously pushed to your registry when defining your infrastructure. Instructions in the
expandable Custom Installation section. For registry mirror information, see topics Using a Registry Mirror and
Registry Mirror Tools.
Note: When creating the cluster, specify the cluster-name. It is best to use the same cluster-name when defining your inventory objects. See the topic Defining Cluster Hosts and Infrastructure for more details.
Ensure your subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If you need to change the Kubernetes subnets, you must do so at cluster creation. See the topic Subnets. The default subnets used in NKP are:
spec:
clusterNetwork:
pods:
cidrBlocks:
- 192.168.0.0/16
services:
cidrBlocks:
- 10.96.0.0/12
This command uses the default external load balancer (LB) option (see alternative Step 1 for virtual IP):
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host <control plane endpoint host> \
--control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
--pre-provisioned-inventory-file preprovisioned_inventory.yaml \
--ssh-private-key-file <path-to-ssh-private-key> \
--registry-mirror-url=${REGISTRY_URL} \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644.
1. ALTERNATIVE Virtual IP - if you don’t have an external LB and want to use a VIRTUAL IP provided by
kube-vip, specify these flags example below:
nkp create cluster preprovisioned \
--cluster-name ${CLUSTER_NAME} \
--control-plane-endpoint-host 196.168.1.10 \
--virtual-ip-interface eth1 \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your nkp create .. --dry-run step above, create
the cluster from the objects you created by specifying the directory:
kubectl create -f <existing-directory>/
Note: It will take a few minutes to create, depending on the cluster size.
When the command completes, you will have a running Kubernetes cluster. For bootstrap and custom YAML cluster creation, refer to the Additional Infrastructure Customization section of the documentation for Pre-provisioned Infrastructure.
Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to install the NKP
Kommander UI:
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: If changing the Calico encapsulation, Nutanix recommends changing it after cluster creation but before
production.
Audit Logs
To modify Control Plane audit log settings, use the information on the Configure the Control Plane page.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=gcp-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME READY
SEVERITY REASON SINCE MESSAGE
Cluster/preprovisioned-example True
2m31s
├─ClusterInfrastructure - PreprovisionedCluster/preprovisioned-example
└─MachineDeployment/preprovisioned-example-md-0 True
2m34s
  └─Machine/preprovisioned-example-md-0-77f667cd9-tnctd True
2m33s
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Output:
# Deleting bootstrap cluster
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Prerequisites:
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster. NKP’s default
configuration ships Ceph with PersistentVolumeClaim (PVC) based storage, which requires your CSI provider
to support PVC with type volumeMode: Block. As this is impossible with the default local static provisioner,
you can install Ceph in host storage mode. You can choose whether Ceph’s object storage daemon (osd) pods can
consume all or just some of the devices on your nodes. Include one of the following Overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, refer to
Specific Nodes and Devices. For general information on the deviceFilter value, refer to Storage
Selection Settings.
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
6. Enable NKP Catalog Applications and install Kommander. In the same kommander.yaml from the previous section, add the following values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
repositories:
- name: NKP-catalog-applications
labels:
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI to not wait for all applications to become ready, you can set the --wait=false flag.
The command waits for each of the Helm charts to reach its Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by the following command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are ready to customize configurations using the Day 2 Cluster Operations Management section of the documentation. The majority of this customization, such as attaching clusters and deploying applications, takes place in the Nutanix Kubernetes Platform (NKP) dashboard or UI. The Day 2 section allows you to manage cluster operations and their application workloads to optimize your organization’s productivity.
Section Content
Procedure
1. The bootstrap cluster will host the Cluster API controllers that reconcile the cluster objects marked for deletion.
Create a bootstrap cluster. To avoid using the wrong kubeconfig, the following steps use explicit kubeconfig paths
and contexts.
nkp create bootstrap --kubeconfig $HOME/.kube/config --with-aws-bootstrap-
credentials=true
2. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to the
bootstrap cluster. This process is also called a Pivot.
nkp move capi-resources \
--from-kubeconfig ${CLUSTER_NAME}.conf \
--from-context ${CLUSTER_NAME}-admin@${CLUSTER_NAME} \
--to-kubeconfig $HOME/.kube/config \
--to-context kind-konvoy-capi-bootstrapper
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig $HOME/.kube/config get nodes
Procedure
1. To delete a cluster, use nkp delete cluster and pass the name of the cluster you want to delete with the --cluster-name flag. Use kubectl get clusters to get the details (--cluster-name and --namespace) of the Kubernetes cluster to delete.
kubectl get clusters
Note: Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer on the cluster. To skip this step, use the flag --delete-kubernetes-resources=false. Do not skip this step if NKP manages the VPC. When NKP deletes the cluster, it also deletes the VPC. If the VPC has any AWS Classic ELBs, AWS does not allow the VPC to be deleted, and NKP cannot delete the cluster.
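For example, a sketch of the delete command, assuming the cluster objects were moved back to the bootstrap cluster as described above:
nkp delete cluster --cluster-name=${CLUSTER_NAME} --kubeconfig $HOME/.kube/config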
Procedure
Delete the bootstrap cluster.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Output:
# Deleting bootstrap cluster
Section Contents
Creating a node pool is useful when you need to run workloads that require machines with specific
resources, such as a GPU, additional memory, or specialized network or storage hardware.
Procedure
1. Create an inventory object with the same name as the node pool you’re creating and the details of the pre-
provisioned machines you want to add to it. For example, to create a node pool named gpu-nodepool, an inventory
named gpu-nodepool must be present in the same namespace.
apiVersion: infrastructure.cluster.konvoy.nutanix.io/v1alpha1
kind: PreprovisionedInventory
metadata:
name: ${MY_NODEPOOL_NAME}
spec:
hosts:
- address: ${IP_OF_NODE}
sshConfig:
port: 22
user: ${SSH_USERNAME}
2. Once the PreprovisionedInventory object and overrides are created, create a node pool.
nkp create nodepool preprovisioned -c ${MY_CLUSTER_NAME} ${MY_NODEPOOL_NAME} --
override-secret-name ${MY_OVERRIDE_SECRET}
3. Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally or store
in version control.
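For example, a sketch of generating the node pool objects without creating them, using the flags described above:
nkp create nodepool preprovisioned -c ${MY_CLUSTER_NAME} ${MY_NODEPOOL_NAME} \
--override-secret-name ${MY_OVERRIDE_SECRET} \
--dry-run \
--output=yaml \
> ${MY_NODEPOOL_NAME}.yaml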
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
• You must have the bootstrap node running with the SSH key or secrets created.
• The export values in the environment variables section need to contain the addresses of the nodes that you need to
add Pre-provisioned: Define Infrastructure.
• Update the preprovisioned_inventory.yaml with the new host addresses.
• Run the kubectl apply command, as shown in the example below.
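A sketch of that apply step, assuming the updated inventory is saved as preprovisioned_inventory.yaml:
kubectl apply -f preprovisioned_inventory.yaml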
Scale Up Node Pools
Procedure
2. Edit the preprovisioned_inventory to add additional IPs needed for additional worker nodes in the
spec.hosts: section.
$ kubectl edit preprovisionedinventory <preprovisioned_inventory> -n default
4. Scale the worker node to the required number. In this example, we scale from 4 to 6 worker nodes.
$ kubectl --kubeconfig ${CLUSTER_NAME}.conf scale --replicas=6 machinedeployment
machinedeployment-md-0
machinedeployment.cluster.x-k8s.io/machinedeployment-md-0 scaled
5. Monitor the scaling with this command by adding the -w option to watch.
$ kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment -w
6. Also, you can check the machine deployment to see if it is already scaled.
Example output
$ kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment
7. Alternatively, you can use this command to check the NODENAME column and see the additional worker nodes added and in the Running state.
$ kubectl --kubeconfig ${CLUSTER_NAME}.conf get machines -o wide
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
Procedure
Deleting a node pool deletes the Kubernetes nodes and the underlying infrastructure.
Procedure
1. Delete a node pool from a managed cluster using the command nkp delete nodepool ${NODEPOOL_NAME}
--cluster-name=${CLUSTER_NAME}.
2. Delete an invalid node pool using the command nkp delete nodepool ${CLUSTER_NAME}-md-invalid --
cluster-name=${CLUSTER_NAME}.
Example output:
MachineDeployments or MachinePools.infrastructure.cluster.x-k8s.io "no
MachineDeployments or MachinePools found for cluster aws-example" not found
For pre-provisioned environments, Nutanix Kubernetes Platform (NKP) provides the nvidia-runfile flag for air-gapped Pre-provisioned environments.
• If the NVIDIA runfile installer has not been downloaded, retrieve it and move it into the artifacts directory by running the following commands:
curl -O https://download.nvidia.com/XFree86/Linux-x86_64/470.82.01/NVIDIA-Linux-x86_64-470.82.01.run
mv NVIDIA-Linux-x86_64-470.82.01.run artifacts
Note: For using GPUs in an air-gapped on-premises environment, Nutanix recommends setting up Pod Disruption
Budget before Update Cluster Nodepools. For more information, see https://kubernetes.io/docs/
concepts/workloads/pods/disruptions/ and https://docs.nvidia.com/datacenter/tesla/tesla-
installation-notes/index.html#runfile.
1. In your overrides/nvidia.yaml file, add the following to enable GPU builds. You can also access and use the overrides repo. This file is used to create the secret that the GPU node pool uses; the secret is populated from the KIB overrides. This example uses a file called overrides/nvidia.yaml.
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
2. Create a secret on the bootstrap cluster populated from the above file. We will name it ${CLUSTER_NAME}-
user-overrides
kubectl create secret generic ${CLUSTER_NAME}-user-overrides --from-
file=overrides.yaml=overrides/nvidia.yaml
3. Create an inventory and node pool with the instructions below and use the $CLUSTER_NAME-user-overrides
secret.
Follow these steps.
a. Create an inventory object with the same name as the node pool you’re creating and the details of the pre-
provisioned machines you want to add to it. For example, to create a node pool named gpu-nodepool an
inventory named gpu-nodepool must be present in the same namespace.
apiVersion: infrastructure.cluster.konvoy.nutanix.io/v1alpha1
kind: PreprovisionedInventory
metadata:
name: ${MY_NODEPOOL_NAME}
spec:
hosts:
- address: ${IP_OF_NODE}
sshConfig:
port: 22
user: ${SSH_USERNAME}
privateKeyRef:
name: ${NAME_OF_SSH_SECRET}
namespace: ${NAMESPACE_OF_SSH_SECRET}
b. (Optional) If your pre-provisioned machines have overrides, you must create a secret that includes all the
overrides you want to provide in one file. Create an override secret using the instructions detailed on this page.
c. Once the PreprovisionedInventory object and overrides are created, create a node pool.
nkp create nodepool preprovisioned -c ${MY_CLUSTER_NAME} ${MY_NODEPOOL_NAME} --
override-secret-name ${MY_OVERRIDE_SECRET}
• Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally
or store in version control.
AWS Infrastructure
Configuration types for installing Nutanix Kubernetes Platform (NKP) on AWS Infrastructure.
For an environment on the AWS Infrastructure, install options based on those environment variables are provided for
you in this location.
If not already done, see the documentation for:
Section Contents
NKP Prerequisites
Before you begin using Konvoy, you must have:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements
• kubectl for interacting with the running cluster.
• A valid AWS account with credentials configured.
• For a local registry, whether in an air-gapped or non-air-gapped environment, download and extract the bundle. Download the complete NKP air-gapped bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load the registry.
Note: On macOS, Docker runs in a virtual machine. Configure this virtual machine with at least 8GB of memory.
Control Plane Nodes
Each control plane node needs to have at least the following:
• 4 cores
• 16 GiB memory
• Approximately 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• Disk usage must be below 85% on the root volume.
NKP on AWS defaults to deploying an m5.xlarge instance with an 80GiB root volume for control plane nodes,
which meets the above requirements.
Worker Nodes
You must have at least four worker nodes. The specific number of worker nodes required for your environment can
vary depending on the cluster workload and size of the nodes. Each worker node needs to have at least the following:
• 8 cores
• 32 GiB memory
• Around 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• Disk usage must be below 85% on the root volume.
NKP on AWS defaults to deploying a m5.2xlarge instance with an 80GiB root volume for worker nodes, which
meets the above requirements.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
Note: For multi-tenancy, every tenant must be in a different AWS account to ensure tenants are truly independent of each other and to enforce security.
Section Contents
Note: For more information and compatible KIB versions, see Konvoy Image Builder on page 1032.
AMI images contain configuration information and software to create a specific, pre-configured operating
environment. For example, you can create an AMI image of your computer system settings and software. The AMI
image can then be replicated and distributed, creating your computer system for other users. You can use override
files to customize components installed on your machine image, such as having the FIPS versions of the Kubernetes
components installed by KIB components.
Depending on which Nutanix Kubernetes Platform (NKP) version you are running, steps and flags will differ. To
deploy in a region where CAPI images are not provided, you need to use KIB to create your image for the region. For
a list of supported Amazon Web Services (AWS) regions, refer to the Published AMI information from AWS. To
begin image creation:
Procedure
Tip: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build command. For more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/images/ami
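For example, a minimal sketch of the build command, assuming you run it from the konvoy-image-builder directory and substitute the AMI YAML file from the linked images/ami directory that matches your OS and KIB version:
konvoy-image build aws images/ami/<your-os-file>.yaml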
After KIB provisions the image successfully, the ami id is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value provides the name of the AMI ID, as shown in the
example below. That is the ami you use in the create cluster command:
{
"builds": [
{
"name": "kib_image",
"builder_type": "amazon-ebs",
"build_time": 1698086886,
2. To use a custom Amazon Machine Images (AMI) when creating your cluster, you must first create that AMI using
KIB. Then perform the export and name the custom AMI for use in the command nkp create cluster
For example:
export AWS_AMI_ID=ami-<ami-id-here>
3. Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for how to apply custom images.
Note: For an air-gapped AMI, there are special air-gapped bundle instructions.
• You have a valid AWS account with credentials configured that can manage CloudFormation Stacks, IAM
Policies, IAM Roles, and IAM Instance Profiles.
• The AWS CLI utility is installed.
If you use the AWS STS (short-lived token) to create the cluster, you must run ./nkp update bootstrap credentials aws --kubeconfig=<kubeconfig file> before you update the nodes of the management cluster or managed clusters.
The following is an AWS CloudFormation stack that creates:
• A policy named nkp-bootstrapper-policy that enumerates the minimal permissions for a user to create NKP AWS clusters.
• A role named nkp-bootstrapper-role that uses the nkp-bootstrapper-policy with a trust policy to allow IAM users and EC2 instances from MYAWSACCOUNTID to use the role through STS.
Procedure
1. To create the resources in the CloudFormation stack, copy the following contents into a file.
AWSTemplateFormatVersion: 2010-09-09
Resources:
AWSIAMInstanceProfileNKPBootstrapper:
Properties:
InstanceProfileName: NKPBootstrapInstanceProfile
Roles:
- Ref: NKPBootstrapRole
Type: AWS::IAM::InstanceProfile
AWSIAMManagedPolicyNKPBootstrapper:
Properties:
Description: Minimal policy to create NKP clusters in AWS
ManagedPolicyName: nkp-bootstrapper-policy
PolicyDocument:
Statement:
- Action:
- ec2:AllocateAddress
- ec2:AssociateRouteTable
- ec2:AttachInternetGateway
- ec2:AuthorizeSecurityGroupIngress
- ec2:CreateInternetGateway
- ec2:CreateNatGateway
- ec2:CreateRoute
- ec2:CreateRouteTable
- ec2:CreateSecurityGroup
- ec2:CreateSubnet
- ec2:CreateTags
- ec2:CreateVpc
- ec2:ModifyVpcAttribute
- ec2:DeleteInternetGateway
- ec2:DeleteNatGateway
- ec2:DeleteRouteTable
- ec2:DeleteSecurityGroup
- ec2:DeleteSubnet
- ec2:DeleteTags
- ec2:DeleteVpc
- ec2:DescribeAccountAttributes
- ec2:DescribeAddresses
- ec2:DescribeAvailabilityZones
- ec2:DescribeInstanceTypes
- ec2:DescribeInternetGateways
- ec2:DescribeImages
- ec2:DescribeNatGateways
- ec2:DescribeNetworkInterfaces
- ec2:DescribeNetworkInterfaceAttribute
- ec2:DescribeRouteTables
- ec2:DescribeSecurityGroups
- ec2:DescribeSubnets
- ec2:DescribeVpcs
- ec2:DescribeVpcAttribute
- ec2:DescribeVolumes
- ec2:DetachInternetGateway
- ec2:DisassociateRouteTable
- ec2:DisassociateAddress
- ec2:ModifyInstanceAttribute
- ec2:ModifyInstanceMetadataOptions
Procedure
export AWS_SECRET_ACCESS_KEY=(.Credentials.SecretAccessKey)
export AWS_SESSION_TOKEN=(.Credentials.SessionToken)
Note: These credentials are short-lived and need to be updated in the bootstrap cluster
Procedure
• The created nkp-bootstrapper-role can be assumed by an EC2 instance from which a user then runs Nutanix Kubernetes Platform (NKP) create cluster commands. To do this, specify the IAM Instance Profile NKPBootstrapInstanceProfile when creating the instance.
Procedure
Note: Regarding Access Keys usage, Nutanix recommends that a system administrator always consider AWS’s
Best practices.
If your organization uses encrypted AMIs (see Use encryption with EBS-backed AMIs in the Amazon Elastic Compute Cloud documentation), you must add additional permissions to the control plane policy to allow access to the Amazon Key Management Service.
Return to EKS Cluster IAM Permissions and Roles or proceed to the next AWS step below.
For more information see:
• You have a valid AWS account with credentials configured that can manage CloudFormation Stacks, IAM
Policies, IAM Roles, and IAM Instance Profiles. For more information, see https://docs.aws.amazon.com/cli/
latest/userguide/cli-configure-files.html.
• You have the AWS CLI utility installed, see https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-
install.html.
Policies
1. AWSIAMManagedPolicyCloudProviderControlPlane enumerates the Actions required by the workload
cluster control plane machines. It is attached to the AWSIAMRoleControlPlane Role.
2. AWSIAMManagedPolicyCloudProviderNodes enumerates the Actions required by the workload cluster worker
machines. It is attached to the AWSIAMRoleNodes Role.
3. AWSIAMManagedPolicyControllers enumerates the Actions required by the Cluster API Provider AWS controllers. It is attached to the AWSIAMRoleControlPlane Role.
Roles
1. AWSIAMRoleControlPlane is the Role associated with the AWSIAMInstanceProfileControlPlane Instance
Profile.
2. AWSIAMRoleNodes is the Role associated with the AWSIAMInstanceProfileNodes Instance Profile.
For more information on how to grant cluster access to IAM users and roles, see https://docs.aws.amazon.com/
eks/latest/userguide/add-user-role.html.
Procedure
» Important: If the name is changed from the default used below, it must be passed to nkp create cluster with the --control-plane-iam-instance-profile flag.
» If the name is changed from the default used below, it must be passed to nkp create cluster with the --worker-iam-instance-profile flag.
AWSTemplateFormatVersion: 2010-09-09
Resources:
AWSIAMInstanceProfileControlPlane:
Properties:
InstanceProfileName: control-plane.cluster-api-provider-aws.sigs.k8s.io
Roles:
- Ref: AWSIAMRoleControlPlane
Type: AWS::IAM::InstanceProfile
AWSIAMInstanceProfileNodes:
Properties:
InstanceProfileName: nodes.cluster-api-provider-aws.sigs.k8s.io
Roles:
- Ref: AWSIAMRoleNodes
Type: AWS::IAM::InstanceProfile
AWSIAMManagedPolicyCloudProviderControlPlane:
Properties:
Description: For the Kubernetes Cloud Provider AWS Control Plane
ManagedPolicyName: control-plane.cluster-api-provider-aws.sigs.k8s.io
3. To create the resources in the CloudFormation stack, copy the contents above into a file, replacing MYFILENAME.yaml and MYSTACKNAME with the intended values. Then, execute the following command.
aws cloudformation create-stack --template-body=file://MYFILENAME.yaml --stack-
name=MYSTACKNAME --capabilities CAPABILITY_NAMED_IAM
Caution: If your organization uses encrypted AMIs, you must add additional permissions to the control plane
policy control-plane.cluster-api-provider-aws.sigs.k8s.io to allow access to the Amazon
Key Management Services. The code snippet shows how to add a particular key ARN to encrypt and decrypt AMIs.
---
Action:
- kms:CreateGrant
- kms:DescribeKey
- kms:Encrypt
- kms:Decrypt
- kms:ReEncrypt*
- kms:GenerateDataKey*
Resource:
- arn:aws:kms:us-west-2:111122223333:key/key-arn
Effect: Allow
Caution: If your organization uses Flatcar, you will need to add additional permissions to the control plane policy control-plane.cluster-api-provider-aws.sigs.k8s.io.
For Flatcar, when using the default object storage, add the following permissions to the IAM Role control-plane.cluster-api-provider-aws.sigs.k8s.io:
PolicyDocument:
Statement:
...
- Action:
- 's3:CreateBucket'
- 's3:DeleteBucket'
- 's3:PutObject'
- 's3:DeleteObject'
- 's3:PutBucketPolicy'
- 's3:PutBucketTagging'
- 'ec2:CreateVpcEndpoint'
- 'ec2:ModifyVpcEndpoint'
- 'ec2:DeleteVpcEndpoints'
- 'ec2:DescribeVpcEndpoints'
Effect: Allow
Resource:
• Ensure you have followed the steps to create proper permissions in AWS Minimal Permissions and Role to
Create Clusters
• Ensure you have created AWS Cluster IAM Policies, Roles, and Artifacts
Upload the Air-gapped Image Bundle to the Local ECR Registry.
Procedure
A cluster administrator uses NKP CLI commands to upload the image bundle to ECR with parameters.
nkp push bundle --bundle <bundle> --to-registry=<ecr-registry-address>/<ecr-registry-
name>
Parameter definitions:
» --bundle <bundle>: the group of images to push. The example below uses the NKP air-gapped environment bundle.
» --to-registry=<ecr-registry-address>/<ecr-registry-name>: the registry location to push the bundle to.
nkp push bundle --bundle container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=333000009999.dkr.ecr.us-west-2.amazonaws.com/can-test
» You can also set an environment variable with your registry address for ECR.
export REGISTRY_URL=<ecr-registry-URI>
Note: The cluster administrator uses existing NKP CLI commands to create the cluster and refer to their internal
ECR for the image repository. The administrator does not need to provide static ECR registry credentials. See Use
a Registry Mirror and Create an EKS Cluster from the CLI for more details.
Tip:
export REGISTRY_URL=<ecr-registry-URI>
Procedure
• REGISTRY_URL: the address of an existing local registry accessible in the VPC; the new cluster nodes will be configured to use it as a mirror registry when pulling images.
• Note: Other local registries might use the options below.
a. JFrog - REGISTRY_CA: (optional) the path on the bastion host (see Creating a Bastion Host on page 652) to the registry CA. This value is only needed if the registry uses a self-signed certificate and the AMIs are not already configured to trust this CA.
b. REGISTRY_USERNAME: optional, set to a user with pull access to this registry.
c. REGISTRY_PASSWORD: optional if username is not set.
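A sketch of exporting these values with placeholder registry details; the CA path is only needed for a self-signed certificate:
export REGISTRY_URL=<registry-address>:<registry-port>
export REGISTRY_CA=<path-to-registry-CA-on-bastion>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>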
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your image for the region.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants and enforce security.
Section Contents
Procedure
Tip: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command. For more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/
images/ami
After KIB provisions the image successfully, the ami id is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value provides the name of the AMI ID, as shown in the
example below. That is the ami you use in the create cluster command.
{
"builds": [
{
"name": "kib_image",
"builder_type": "amazon-ebs",
"build_time": 1698086886,
"files": null,
"artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
"packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
"custom_data": {
"containerd_version": "",
2. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then perform the
export and name the custom AMI for use in the command nkp create cluster
For example:
export AWS_AMI_ID=ami-<ami-id-here>
3. Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Bootstrapping AWS
To create Kubernetes clusters, NKP uses Cluster API (CAPI) controllers. These controllers run on a
Kubernetes cluster.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: If your environment uses HTTP or HTTPS proxies, include the --http-proxy, --https-proxy, and --no-proxy flags and their related values in this command for it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Output example:
NAMESPACE NAME
READY UP-TO-DATE AVAILABLE AGE
capa-system capa-controller-manager
1/1 1 1 1h
capg-system capg-controller-manager
1/1 1 1 1h
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager
1/1 1 1 1h
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager
1/1 1 1 1h
capi-system capi-controller-manager
1/1 1 1 1h
cappp-system cappp-controller-manager
1/1 1 1 1h
capv-system capv-controller-manager
1/1 1 1 1h
capz-system capz-controller-manager
1/1 1 1 1h
cert-manager cert-manager
1/1 1 1 1h
cert-manager cert-manager-cainjector
1/1 1 1 1h
cert-manager cert-manager-webhook
1/1 1 1 1h
Warning: In previous NKP releases, AMI images provided by the upstream CAPA project were used if you did not
specify an AMI. However, the upstream images are not recommended for production and may not always be available.
Therefore, NKP requires you to specify an AMI when creating a cluster. To create an AMI, use Konvoy Image Builder.
Note: NKP uses the AWS CSI driver as the Provider. Use a Kubernetes CSI-compatible storage that is suitable for
production. For more information, see https://kubernetes.io/docs/concepts/storage/volumes/#volume-
types.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. For Kubernetes naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
Procedure
a. If you are using Static Credentials, refresh the credentials using the command nkp update bootstrap
credentials aws.
3. Set the environment variables for the cluster name you selected and for the custom AMI using the commands
export CLUSTER_NAME=<aws-example> and export AWS_AMI_ID=<ami-...>.
» Option One - Provide the ID of your AMI using the --ami AMI_ID flag in the create cluster command.
» Option Two - Provide the information required for NKP to discover the AMI using location, format, and OS
information (see the sketch after these options):
• Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
• The format or string used to search for matching AMIs, ensuring it references the Kubernetes version:
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
• The base OS information: --ami-base-os ubuntu-20.04
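As a sketch, an AMI lookup using Option Two might combine these flags as follows; the AWS account ID shown is a placeholder:
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
  --ami-owner 123456789012 \
  --ami-base-os ubuntu-20.04 \
  --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'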
Note:
• The AMI must be created with Konvoy Image Builder to use the registry mirror feature.
Example:
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images.
In the AWS ECR example below, REGISTRY_URL is the address of an existing local registry,
accessible in the VPC, that the new cluster nodes will be configured to use as a mirror registry
when pulling images.
Example:
export REGISTRY_URL=<ecr-registry-URI>
5. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
If you need to change the Kubernetes subnets, you must do this at cluster creation. To review the default subnets
used in NKP, see Managing Subnets and Pods on page 651.
Example output:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
• (Optional) Modify control plane audit logs: Modify the KubeadmControlplane cluster-API object to
configure different kubelet options. For information beyond the existing options available from flags, see
Configure the Control Plane on page 1022.
• (Optional) Configure your cluster to use an existing local registry as a mirror when pulling images
previously pushed to your registry. Set an environment variable with your registry address for ECR using
the command export REGISTRY_URL=<ecr-registry-URI>. For more information, see Registry
Mirror Tools on page 1017.
Note: More flags can be added to the nkp create cluster command. See the optional choices below or
the complete list in the Universal Configurations for all Infrastructure Providers on page 644:
• If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy,
--https-proxy, and --no-proxy and their related values for it to be successful. For more
information, see Configuring an HTTP or HTTPS Proxy.
• FIPS flags - To create a cluster in FIPS mode, inform the controllers of the appropriate image
repository and version tags of the official Nutanix FIPS builds of Kubernetes by adding these flags to the
nkp create cluster command (see the example after this note): --kubernetes-version=v1.29.6+fips.0 \
--etcd-version=3.5.10+fips.0
• You can create individual manifest files with different smaller manifests for ease in editing
using the --output-directory flag. For more information, see Output Directory Flag on
page 649.
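For reference, a minimal dry-run command assembled from the variables and flags discussed above; this is a sketch that mirrors the fuller example shown later in this guide:
nkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --ami=${AWS_AMI_ID} \
  --registry-mirror-url=${REGISTRY_URL} \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml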
7. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully. For more information, see Customization of Cluster
CAPI Components on page 649.
8. Create the cluster from the objects generated from the dry run using the command kubectl create -f
${CLUSTER_NAME}.yaml
If the resource already exists, a warning appears in the console, requiring you to remove it or update your
YAML.
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above,
create the cluster from the objects you created by specifying the directory:
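A minimal sketch, assuming the manifests were written with --output-directory=<existing-directory>:
kubectl create -f <existing-directory>/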
Note: NKP uses the AWS CSI driver as the default storage provider. Use a Kubernetes CSI-compatible
storage that is suitable for production. If you’re not using the default, you cannot deploy an alternate provider
until after the nkp create cluster is finished. However, this must be determined before the Kommander
installation. For more information, see:
• Kubernetes CSI:https://kubernetes.io/docs/concepts/storage/volumes/#volume-types
• Changing the Default Storage Class: https://kubernetes.io/docs/tasks/administer-cluster/
change-default-storage-class/
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by
setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username=<username> --registry-mirror-password=<password> on
the nkp create cluster command. For more information, see https://docs.docker.com/docker-hub/
download-rate-limit/.
A guide for creating Nutanix Kubernetes Platform (NKP) clusters on AWS using the NKP UI.
1. In the selected workspace Dashboard, select the Add Cluster option in the Actions dropdown menu at the top
right.
a. Workspace: The workspace where this cluster belongs (if within the Global workspace).
b. Kubernetes Version: The initial Kubernetes version will be installed on the cluster.
c. Name: A valid Kubernetes name for the cluster.
d. Add Labels: By default, your cluster has labels that reflect the infrastructure provider provisioning. For
example, your AWS cluster has a label for the datacenter region and provider: aws. Cluster labels are
matched to the selectors created for Projects. Changing a cluster label may add or remove the cluster from
projects.
4. Select the pre-configured AWS Infrastructure or AWS role credentials to display the remaining options specific
to AWS.
5. Select Create to begin provisioning the cluster. This step may take a few minutes while the cluster becomes
ready and fully deploys its components. The cluster automatically tries to join and resolve after it is fully
provisioned.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=aws-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Output:
NAME STATUS ROLES AGE VERSION
aws-example-control-plane-9z77w Ready control-plane,master 4m44s v1.27.6
aws-example-control-plane-rtj9h Ready control-plane,master 104s v1.27.6
aws-example-control-plane-zbf9w Ready control-plane,master 3m23s v1.27.6
aws-example-md-0-88c46 Ready <none> 3m28s v1.27.6
aws-example-md-0-fp8s7 Ready <none> 3m28s v1.27.6
aws-example-md-0-qvnx7 Ready <none> 3m28s v1.27.6
aws-example-md-0-wjdrg Ready <none> 3m27s v1.27.6
Prerequisites:
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. Only required if your cluster uses a custom AWS VPC and requires an internal load-balancer; set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous
section, add the following values (if you are enabling NKP Catalog Apps) for nkp-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
1. Follow the steps to create permissions and roles on the Minimal Permissions and Role to Create Clusters page.
2. Create Cluster IAM Policies and Roles
3. Export the AWS region where you want to deploy the cluster:
export AWS_REGION=us-west-2
4. Export the AWS profile with the credentials you want to use to create the Kubernetes cluster:
export AWS_PROFILE=<profile>
If using AWS ECR as your local private registry, more information can be found on the Registry Mirror Tools page.
To deploy a cluster with a custom image in a region where CAPI images are not provided, you need to use Konvoy
Image Builder to create your image for the region. For information on CAPI images, see https://cluster-api-
aws.sigs.k8s.io/topics/images/built-amis.html.
Note: For multi-tenancy, every tenant needs to be in a different AWS account to ensure they are truly independent of
other tenants to enforce security.
Section Contents
Explore the Customize your Image topic for more options. Using KIB, you can build an AMI without
requiring access to the internet by providing an additional --override flag.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create it, run the new NKP command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml.
For example:
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
Note:
Tip: The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
• Set environment variables for AWS access. The following variables must be set using your credentials,
including the required IAM permissions (see the sketch after this list):
• export AWS_ACCESS_KEY_ID
• export AWS_SECRET_ACCESS_KEY
• export AWS_DEFAULT_REGION
• If you have an override file to configure specific attributes of your AMI file, add it. Instructions for
customizing an override file are found on this page: Image Overrides.
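As a sketch, the variables from the list above might be exported as follows; the values are placeholders for your own credentials and region:
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_DEFAULT_REGION=us-west-2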
Procedure
Tip: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command.
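A minimal sketch of such a build command; the override file name and OS YAML path are placeholders, and the flag usage mirrors the GPU build example later in this guide:
konvoy-image build --region us-west-2 --overrides overrides.yaml images/ami/<your-os>.yaml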
After KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an artifact_id field whose value contains the AMI ID, as shown in the example
below. That is the AMI you use in the create cluster command:
{
  "builds": [
    {
      "name": "kib_image",
      "builder_type": "amazon-ebs",
      "build_time": 1698086886,
      "files": null,
      "artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
      "packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
      "custom_data": {
        "containerd_version": "",
        "distribution": "RHEL",
        "distribution_version": "8.6",
        "kubernetes_cni_version": "",
        "kubernetes_version": "1.29.6"
      }
    }
  ],
  "last_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05"
}
2. To use a custom AMI when creating your cluster, you must create that AMI using KIB first. Then export the
custom AMI ID for use in the nkp create cluster command when creating your cluster.
For example:
export AWS_AMI_ID=ami-<ami-id-here>
3. Inside the sections for either Non-air-gapped or Air-gapped cluster creation, you will find the instructions for
how to apply custom images.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation
images, including the Kommander images, is required. This registry must be accessible from the bastion machine and
the AWS EC2 instances (if deploying to AWS) or other machines that will be created for the Kubernetes cluster.
Procedure
2. After extraction, the directory structure can be accessed in subsequent steps using commands that reference files
in the different directories. For example, for the bootstrap cluster, change to the nkp-<version> directory,
similar to the example below, depending on your current location.
cd nkp-v2.12.0
• REGISTRY_URL: the address of an existing local registry, accessible in the VPC, that the new cluster nodes will
be configured to use as a mirror registry when pulling images
• The environment where you are running the nkp push command must be authenticated with AWS to load
your images into ECR.
• Other registry variables:
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
Before creating or upgrading a Kubernetes cluster, you must load the required images in a local registry if
operating in an air-gapped environment.
Note: It may take some time to push all the images to your image registry, depending on the network's performance
between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster
by setting the following flags --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username=<username> --registry-mirror-password=<password> on the
nkp create cluster command.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: Use --http-proxy, --https-proxy, and --no-proxy and their related values in this command for
it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the Cluster API providers on the cluster.
5. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io
Output example:
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                         capa-controller-manager                         1/1     1            1           1h
capg-system                         capg-controller-manager                         1/1     1            1           1h
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           1h
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           1h
capi-system                         capi-controller-manager                         1/1     1            1           1h
cappp-system                        cappp-controller-manager                        1/1     1            1           1h
capv-system                         capv-controller-manager                         1/1     1            1           1h
capz-system                         capz-controller-manager                         1/1     1            1           1h
cert-manager                        cert-manager                                    1/1     1            1           1h
cert-manager                        cert-manager-cainjector                         1/1     1            1           1h
cert-manager                        cert-manager-webhook                            1/1     1            1           1h
Warning: In previous NKP releases, Amazon Machine Image (AMI) images provided by the upstream CAPA project
were used if you did not specify an AMI. However, the upstream images are not recommended for production and may
not always be available. Therefore, NKP requires you to specify an AMI when creating a cluster. To create an AMI, use
Konvoy Image Builder.
Note: NKP uses the AWS CSI driver as the default storage provider. Use Kubernetes CSI-compatible storage
that is suitable for production. For more information, see https://kubernetes.io/docs/concepts/storage/
volumes/#volume-types.
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. See the Kubernetes documentation for more naming information at https://kubernetes.io/
docs/concepts/overview/working-with-objects/names/.
Procedure
2. Ensure your AWS credentials are up to date. If you use Static Credentials, refresh the credentials using the
following command. Otherwise, proceed to the next step.
nkp update bootstrap credentials aws
3. Set the environment variable to the name you assigned this cluster:
export CLUSTER_NAME=<aws-example>
4. Export the variables, such as custom AMI and existing infrastructure details, for later use with the nkp create
cluster command.
export AWS_AMI_ID=<ami-...>
export AWS_VPC_ID=<vpc-...>
export AWS_SUBNET_IDS=<subnet-...,subnet-...,subnet-...>
export AWS_ADDITIONAL_SECURITY_GROUPS=<sg-...>
• Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
• The format or string used to search for matching AMIs, ensuring it references the Kubernetes version
plus the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
• The base OS information: --ami-base-os ubuntu-20.04
Note:
• The AMI must be created with Konvoy Image Builder to use the registry mirror feature.
export AWS_AMI_ID=<ami-...>
• (Optional) Registry Mirror - Configure your cluster to use an existing local registry as a mirror
when attempting to pull images. Below is an AWS ECR example, where REGISTRY_URL is the
address of an existing local registry, accessible in the VPC, that the new cluster nodes will be
configured to use as a mirror registry when pulling images:
export REGISTRY_URL=<ecr-registry-URI>
6. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation.
If you need to change the Kubernetes subnets, you must do this at cluster creation. The default subnets used in
NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
» Additional options for your environment; otherwise, proceed to the next step to create your cluster.
(Optional) Modify Control Plane Audit logs - You can modify the KubeadmControlplane cluster-API
object to configure different kubelet options. See the Configure the Control Plane guide if you wish to
configure your control plane beyond the existing options available from flags.
7. Create a Kubernetes cluster object with a dry run output for customizations. The following example shows a
common configuration.
nkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--vpc-id=${AWS_VPC_ID} \
--ami=${AWS_AMI_ID} \
--subnet-ids=${AWS_SUBNET_IDS} \
--internal-load-balancer=true \
--additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
--registry-mirror-url=${REGISTRY_URL} \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
Note: More flags can be added to the nkp create cluster command for more options. See Choices below
or refer to the topic Universal Configurations:
» If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information
is available in Configuring an HTTP or HTTPS Proxy
» FIPS flags -
To create a cluster in FIPS mode, inform the controllers of the appropriate image repository and version
tags of the official Nutanix FIPS builds of Kubernetes by adding these flags to the nkp create cluster
command: --kubernetes-version=v1.29.6+fips.0 \ --etcd-version=3.5.10+fips.0
» You can create individual manifest files with different smaller manifests for ease in editing using the --
output-directory flag.
» Flatcar OS uses this flag to instruct the bootstrap cluster to make some changes related to the installation
paths: --os-hint flatcar
8. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully:
kubectl get clusters,kubeadmcontrolplanes,machinedeployments
9. Create the cluster from the objects generated from the dry run. A warning will appear in the console if the
resource already exists, requiring you to remove the resource or update your YAML.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above,
create the cluster from the objects you created by specifying the directory:
11. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                              READY   SEVERITY   REASON   SINCE   MESSAGE
Cluster/aws-example                                               True                         60s
├─ClusterInfrastructure - AWSCluster/aws-example                  True                         5m23s
└─ControlPlane - KubeadmControlPlane/aws-example-control-plane    True                         60s
Note: NKP uses the AWS CSI driver as the default storage provider. Use Kubernetes CSI-compatible
storage that is suitable for production. For more information, see the Kubernetes documentation on volume
types at https://kubernetes.io/docs/concepts/storage/volumes/#volume-types and Changing the Default
Storage Class at https://kubernetes.io/docs/tasks/administer-cluster/change-default-storage-class/. If you're
not using the default, you cannot deploy an alternate provider until after nkp create cluster is finished.
However, this must be determined before the Kommander installation.
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by
setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username=<username> --registry-mirror-password=<password> on
the nkp create cluster command. For more information see https://docs.docker.com/docker-hub/
download-rate-limit/.
12. As they progress, the controllers also create Events. List the Events using the kubectl get events | grep
${CLUSTER_NAME} command.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=aws-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME                                                              READY   SEVERITY   REASON   SINCE   MESSAGE
Cluster/aws-example                                               True                         109s
├─ClusterInfrastructure - AWSCluster/aws-example                  True                         112s
├─ControlPlane - KubeadmControlPlane/aws-example-control-plane    True                         109s
│ ├─Machine/aws-example-control-plane-55jh4                       True                         111s
│ ├─Machine/aws-example-control-plane-6sn97                       True                         111s
│ └─Machine/aws-example-control-plane-nx9v5                       True                         110s
└─Workers
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Output:
NAME STATUS ROLES AGE VERSION
aws-example-control-plane-9z77w Ready control-plane,master 4m44s v1.27.6
aws-example-control-plane-rtj9h Ready control-plane,master 104s v1.27.6
aws-example-control-plane-zbf9w Ready control-plane,master 3m23s v1.27.6
aws-example-md-0-88c46 Ready <none> 3m28s v1.27.6
aws-example-md-0-fp8s7 Ready <none> 3m28s v1.27.6
aws-example-md-0-qvnx7 Ready <none> 3m28s v1.27.6
aws-example-md-0-wjdrg Ready <none> 3m27s v1.27.6
Prerequisites:
Procedure
1. Set the environment variable for your cluster using the command export CLUSTER_NAME=<your-
management-cluster-name>.
2. Copy the kubeconfig file of your Management cluster to your local directory using the command nkp get
kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf.
3. Create a configuration file for the deployment using the command nkp install kommander --init >
kommander.yaml
a. See Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. Only required if your cluster uses a custom AWS VPC and requires an internal load-balancer; set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Section Contents
Procedure
Procedure
1. Make sure your AWS credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials aws --kubeconfig $HOME/.kube/config
3. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to the
bootstrap cluster. This process is also called a Pivot.
nkp move capi-resources \
--from-kubeconfig ${CLUSTER_NAME}.conf \
--from-context ${CLUSTER_NAME}-admin@${CLUSTER_NAME} \
--to-kubeconfig $HOME/.kube/config \
--to-context kind-konvoy-capi-bootstrapper
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig $HOME/.kube/config get nodes
4. Use the cluster life cycle services on the bootstrap cluster to check the workload cluster’s status.
nkp describe cluster --kubeconfig $HOME/.kube/config -c ${CLUSTER_NAME}
Output:
NAME                                                              READY   SEVERITY   REASON   SINCE   MESSAGE
Cluster/aws-example                                               True                         91s
├─ClusterInfrastructure - AWSCluster/aws-example                  True                         103s
├─ControlPlane - KubeadmControlPlane/aws-example-control-plane    True                         91s
│ ├─Machine/aws-example-control-plane-55jh4                       True                         102s
│ ├─Machine/aws-example-control-plane-6sn97                       True                         102s
│ └─Machine/aws-example-control-plane-nx9v5                       True                         102s
└─Workers
  └─MachineDeployment/aws-example-md-0                            True                         108s
    ├─Machine/aws-example-md-0-cb9c9bbf7-hcl8z                    True                         102s
    ├─Machine/aws-example-md-0-cb9c9bbf7-rtdqw                    True                         102s
    ├─Machine/aws-example-md-0-cb9c9bbf7-td29r                    True                         102s
    └─Machine/aws-example-md-0-cb9c9bbf7-w64kg                    True                         102s
Procedure
1. Make sure your AWS credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials aws --kubeconfig $HOME/.kube/config
Note:
Persistent Volumes (PVs) are not deleted automatically by design to preserve your data. However,
the PVs take up storage space if not deleted. You must delete PVs manually. For more information on
backing up a cluster and PVs, see Back up your Cluster's Applications and Persistent Volumes.
2. To delete a cluster, use the command nkp delete cluster and pass in the name of the cluster you are deleting
with the --cluster-name flag, as sketched after the note below. Use the command kubectl get clusters to
get the details (--cluster-name and --namespace) of the Kubernetes cluster you want to delete.
kubectl get clusters
Note: Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer
on the cluster. An AWS Classic ELB backs each Service. Deleting the Service deletes the ELB that backs it. To
skip this step, use the flag --delete-kubernetes-resources=false. Do not skip this step if NKP manages
the VPC: when NKP deletes the cluster, it deletes the VPC, and if the VPC still has any AWS Classic ELBs, AWS
does not allow the VPC to be deleted, so NKP cannot delete the cluster.
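A minimal sketch of the delete command described above, assuming CLUSTER_NAME is still set from cluster creation; add the --kubeconfig flag as in the credential-refresh command if needed:
nkp delete cluster --cluster-name=${CLUSTER_NAME}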
Procedure
1. Make sure your AWS credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials aws --kubeconfig $HOME/.kube/config
• Implementing isolation between environment tiers such as development, testing, acceptance, and production.
• Implementing separation of concerns between management clusters, and workload clusters.
• Reducing the impact of security events and incidents.
For additional benefits of using multiple AWS accounts, see the following white paper: https://
docs.aws.amazon.com/whitepapers/latest/organizing-your-aws-environment/benefits-of-using-multiple-
aws-accounts.html
This document describes how to leverage the Nutanix Kubernetes Platform (NKP) to deploy a management
cluster, and multiple workload clusters, leveraging multiple AWS accounts. This guide assumes you have some
understanding of Cluster API concepts and basic NKP provisioning workflows on AWS.
Procedure
1. Deploy a management cluster in your AWS source account. NKP leverages the Cluster API provider for AWS
(CAPA) to provision Kubernetes clusters in a declarative way. Customers declare the desired state of the cluster
through a cluster configuration YAML file, which is generated using the following command.
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
2. Configure a trusted relationship between the source and target accounts.
a. Go to your target (workload) account.
b. Search for the role control-plane.cluster-api-provider-aws.sigs.k8s.io.
c. Navigate to the Trust Relationship tab and select Edit Trust Relationship.
d. Add the following relationship.
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::${mgmt-aws-account}:role/control-plane.cluster-api-provider-aws.sigs.k8s.io"
  },
  "Action": "sts:AssumeRole"
}
4. Modify the management cluster configuration file and update the AWSCluster object with the following details.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster
metadata:
spec:
  identityRef:
    kind: AWSClusterRoleIdentity
    name: cross-account-role
  …
  …
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSClusterRoleIdentity
metadata:
  name: cross-account-role
spec:
  allowedNamespaces: {}
  roleARN: "arn:aws:iam::${workload-aws-account}:role/control-plane.cluster-api-provider-aws.sigs.k8s.io"
  sourceIdentityRef:
    kind: AWSClusterControllerIdentity
    name: default
After performing the above steps, your Management cluster will be configured to create new managed clusters in
the target AWS workload account.
Note:
cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size
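These annotations are typically set on a node pool's MachineDeployment. For illustration only, they might be applied with kubectl annotate; the object name and the min/max values shown here are assumptions:
kubectl --kubeconfig=${CLUSTER_NAME}.conf annotate machinedeployment <nodepool-machinedeployment-name> \
  cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size=2 \
  cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size=6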
The full list of command line arguments to the Cluster Autoscaler controller is on the Kubernetes public GitHub
repository at https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-
the-parameters-to-ca
For more information on how Cluster Autoscaler works, see:
Procedure
1. Ensure the Cluster Autoscaler controller is up and running (no restarts and no errors in the logs)
kubectl --kubeconfig=${CLUSTER_NAME}.conf logs deployments/cluster-autoscaler
cluster-autoscaler -n kube-system -f
3. To demonstrate that it is working properly, create a large deployment that triggers pending pods. (This
example uses Amazon Web Services (AWS) m5.2xlarge worker nodes. If you have larger worker nodes, the
recommendation is to scale up the number of replicas accordingly.)
cat <<EOF | kubectl --kubeconfig=${CLUSTER_NAME}.conf apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
5. Cluster Autoscaler starts to scale down the number of Worker Nodes after the default timeout of 10 minutes.
Procedure
1. Create a secret with a kubeconfig file of the primary cluster in the managed cluster with limited user permissions
to only modify resources for the given cluster.
3. Add the following flag to the cluster-autoscaler command, where /mnt//masterconfig/value is the path
where the primary cluster’s kubeconfig is loaded through the secret created.
--cloud-config=/mnt//masterconfig/value
Section Contents
Creating a node pool is useful when you need to run workloads that require machines with specific
resources, such as a GPU, additional memory, or specialized network or storage hardware.
Procedure
1. Set the environment variable to the name you assigned this cluster.
export CLUSTER_NAME=aws-example
2. If your workload cluster is self-managed, as described in Make the New Cluster Self-Managed, configure kubectl
to use the kubeconfig for the cluster.
export KUBECONFIG=${CLUSTER_NAME}.conf
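The commands below reference ${NODEPOOL_NAME}. As a sketch, you might set it to match the sample output later in this section; the name example is only an illustration:
export NODEPOOL_NAME=example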
Procedure
Create a new AWS node pool with 3 replicas using this command.
nkp create nodepool aws ${NODEPOOL_NAME} \
--cluster-name=${CLUSTER_NAME} \
--replicas=3
This example uses default values for brevity. Use flags to define custom instance types, AMIs, and other properties.
Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally or store in
version control.
List the node pools of a given cluster. This returns specific properties of each node pool so that you can
see the names of the Machine Deployments.
Procedure
To list all node pools for a managed cluster, run:
nkp get nodepools --cluster-name=${CLUSTER_NAME} --kubeconfig=${CLUSTER_NAME}.conf
The expected output is similar to the following example, indicating the desired size of the node pool, the number of
replicas ready in the node pool, and the Kubernetes version those nodes are running:
NODEPOOL           DESIRED   READY   KUBERNETES VERSION
example            3         3       v1.29.6
aws-example-md-0   4         4       v1.29.6
Attached Clusters
• KUBECONFIG for the management cluster - To find the KUBECONFIG for a cluster from the UI, refer to this
section in the documentation: Access a Managed or Attached Cluster
• The CLUSTER_NAME of the attached cluster
• The NAMESPACE of the attached cluster
Procedure
To list all node pools for an attached cluster, run:
nkp get nodepools --cluster-name=${ATTACHED_CLUSTER_NAME} --kubeconfig=
${CLUSTER_NAME}.conf -n ${ATTACHED_CLUSTER_NAMESPACE}
The expected output is similar to below:
NODEPOOL            DESIRED   READY   KUBERNETES VERSION
aws-attached-md-0   4         4       v1.29.6
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
aws-attached-md-0 5 5 v1.29.6
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
Procedure
aws-example-md-0 4 4 v1.29.6
3. In a default cluster, the nodes to delete are selected at random. This behavior is controlled by CAPI’s delete
policy. For more information, see https://github.com/kubernetes-sigs/cluster-api/blob/v0.4.0/api/v1alpha4/
machineset_types.go#L85-L105. However, when using the Nutanix Kubernetes Platform (NKP) CLI to scale
down a node pool, it is also possible to specify the Kubernetes Nodes you want to delete.
To do this, set the flag --nodes-to-delete with a list of nodes as below. This adds an annotation cluster.x-
k8s.io/delete-machine=yes to the matching Machine object that contains status.NodeRef with the node
names from --nodes-to-delete.
nkp scale nodepools ${NODEPOOL_NAME} --replicas=3 --nodes-to-delete=<> --cluster-
name=${CLUSTER_NAME}
Output:
# Scaling node pool example to 3 replicas
Procedure
Note: Similarly, scaling down to a number of replicas less than the configured min-size also returns an error.
Deleting a node pool deletes the Kubernetes nodes and the underlying infrastructure.
Procedure
Note: For using GPUs in an air-gapped on-premises environment, Nutanix recommends setting up Pod Disruption
Budget before Update Cluster Nodepools. For more information, see https://kubernetes.io/docs/concepts/
workloads/pods/disruptions/.
Procedure
1. In your overrides/nvidia.yaml file, add the following to enable GPU builds. You can also access and use the
overrides repo. For more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/
overrides.
gpu:
  type:
    - nvidia
3. By default, your image builds in the us-west-2 region. To specify another region, set the --region flag.
konvoy-image build --region us-east-1 --overrides override-source-ami.yaml images/ami/<Your OS>.yaml
4. When the command is complete, the AMI ID is printed and written to packer.pkr.hcl. To use the AMI built
with Konvoy Image Builder, specify it with the --ami flag when calling nkp create cluster. For example:
nkp create cluster aws --cluster-name=$(whoami)-aws-cluster --region us-west-2 --ami
<ami>
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
Procedure
2. If the default detection does not work, you can manually change the DaemonSet that the GPU operator creates by
adding the following nodeSelector.
nodeSelector:
  feature.node.kubernetes.io/pci-<class>_<vendor>.present: "true"
where class is any four-digit PCI class code starting with 03 and the vendor ID for NVIDIA is 10de. If this is
already deployed, you can always edit the DaemonSet and change the nodeSelector field so that it deploys to
the right nodes.
Upgrading a node pool involves draining the existing nodes in the node pool and replacing them with new
nodes.
Procedure
1. Deploy a Pod Disruption Budget for your critical applications. If your application can tolerate only one replica
being unavailable at a time, you can set the Pod Disruption Budget as shown in the following example. The example
below is for NVIDIA GPU node pools, but the process is the same for all node pools.
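The following is a minimal sketch of such a Pod Disruption Budget; the name, namespace, and label selector are placeholders you would replace with your application's values:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gpu-workload-pdb          # hypothetical name
  namespace: default              # replace with your application's namespace
spec:
  maxUnavailable: 1               # tolerate only one unavailable replica at a time
  selector:
    matchLabels:
      app: gpu-workload           # hypothetical label on your application's Pods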
• This feature requires Python 3.5 or greater to be installed on all control plane hosts.
• Complete the Bootstrap Cluster topic.
To create a cluster with automated certificate renewal, create a Konvoy cluster using the certificate-renew-
interval flag. The certificate-renew-interval is the number of days after which Kubernetes-managed PKI
certificates are renewed. For example, a certificate-renew-interval value of 60 means the certificates
will be renewed every 60 days.
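As a sketch, assuming the flag is passed in the usual --flag=value form on cluster creation as described above, a cluster whose certificates renew every 60 days might be created like this:
nkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --certificate-renew-interval=60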
Technical Details
The following manifests are modified on the control plane hosts and are located at /etc/kubernetes/manifests.
Modifications to these files require SUDO access.
kube-controller-manager.yaml
To debug the automatic certificate renewal feature, a cluster administrator can look at several components
to see if the certificates were renewed.
Procedure
2. Administrators who want more details on the execution of the systemd service can use ssh to connect to the
control plane hosts and then use the systemctl and journalctl commands that follow to help diagnose
potential issues.
systemctl list-timers
4. To get the logs of the last run of the service, use the command.
journalctl -u renew-certs
2. Export a variable with the node name for the next steps. This example uses the name ip-10-0-100-85.us-
west-2.compute.internal.
export NAME_NODE_TO_DELETE="<ip-10-0-100-85.us-west-2.compute.internal>"
4. Observe that the Machine resource is being replaced using this command.
kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment
Output:
NAME               CLUSTER       REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE     VERSION
aws-example-md-0   aws-example   4          3       4         1             ScalingUp   7m53s   v1.26.3
Procedure
6. Enter a name for your infrastructure provider. Select a name that matches the AWS user.
8. You can add an External ID if you share the Role with a third party. External IDs protect your environment from
roles being used accidentally.
9. Click Save.
Procedure
1. In Kommander, select the Workspace associated with the credentials you are adding.
2. Navigate to Administration > Infrastructure Providers and click Add Infrastructure Provider.
5. Enter a name for your infrastructure provider for later reference. Consider choosing a name that matches the AWS
user.
7. Click Save.
EKS Infrastructure
Configuration types for installing NKP on Amazon Elastic Kubernetes Service (EKS) Infrastructure.
When installing Nutanix Kubernetes Platform (NKP) on your EKS infrastructure, you must also set up various
permissions through AWS Infrastructure.
If not already done, see the documentation for:
Section Contents
EKS Introduction
Nutanix Kubernetes Platform (NKP) brings value to EKS customers by providing all components needed for a
production-ready Kubernetes environment. NKP provides the capability to provision EKS clusters using the NKP UI.
It also provides the ability to upgrade your EKS clusters using the NKP platform, making it possible to manage the
complete life cycle of EKS clusters from a centralized platform.
NKP adds value to Amazon EKS through:
• Time to Value in hours or days to get to production, instead of weeks or months, or even failure. Particularly in
complex environments like air-gapped, customers have tried various options and, despite spending millions, did not
succeed or reached Day 2 later than expected. We deliver results in hours or days.
• Less Risk
• Cloud-Native Expertise eliminates the issue of a lack of skills. Our industry-leading expertise closes skill
gaps on the customer side, avoids costly mistakes, transfers skills, and improves project success rates while
shortening timelines.
• Simplicity mitigates operational complexity. We focus on a great user experience and automate parts of
cloud-native operations to get customers to Day 2 faster and meet all Day 2 operational challenges. This frees
up customer time to build what differentiates them instead of reinventing the wheel for Kubernetes operations.
• Military-Grade Security alleviates security concerns. The Nutanix Kubernetes Platform can be configured
to meet NSA Kubernetes security hardening guidelines. Nutanix Kubernetes Platform and supported add-on
components are security scanned and secure out of the box—encryption of data-at-rest, FIPS compliance, and
fully supported air-gapped deployments round out Nutanix offerings.
• Lower TCO with operational insights and a more straightforward platform that curates needed capabilities from
Amazon EKS and the open source community, reducing the time and cost of consulting engagements and
ongoing support costs.
• Ultimate-grade Kubernetes - comes with a curated list of Day 2 applications necessary for running
Kubernetes in production.
• One platform for all - Single platform to manage multiple clusters on any infrastructure cloud, on-premises,
and edge.
• Projects deliver applications through FluxCD's built-in GitOps—just provide a Git repository, and NKP does
the rest.
• Through integration with Kubecost, NKP monitors the utilization of project resources and provides real-time
reporting for performance and cost optimization.
• Project security is defined through the integration of customer authentication methods by NKP and
enforced through several application security layers.
• Cluster Life cycle Management through CAPI - Through cluster API, NKP gives customers complete life
cycle management of their EKS clusters with the ability to instantiate new EKS clusters through a unified API.
This allows administrators to deploy new EKS clusters through code and deliver consistent cluster configurations.
• Time to application value is significantly reduced by minimizing the steps necessary to provision a cluster
and segment clusters through integrated permissions.
• Secure and reliable cluster deployments.
• Automatic day 2 operations of EKS clusters (Monitoring, Logging, Central Management, Security, Cost
Optimization).
• Day 2 GitOps integration with every EKS cluster.
Konvoy Prerequisites
Before you begin using Konvoy, you must have:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• kubectl for interacting with the running cluster.
• A valid AWS account with credentials configured.
• For a local registry, whether in an air-gapped or non-air-gapped environment, download and extract the
bundle. Download the Complete NKP Air-gapped Bundle for this release (that is, nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz) to load the registry.
Note: On macOS, Docker runs in a virtual machine. Configure this virtual machine with at least 8GB of memory.
Worker Nodes
You need at least four worker nodes. The specific number of worker nodes required for your environment can vary
depending on the cluster workload and size of the nodes. Each worker node needs to have at least the following:
• 8 cores
• 32 GiB memory
• Around 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• Disk usage must be below 85% on the root volume.
NKP on AWS defaults to deploying an m5.2xlarge instance with an 80GiB root volume for worker nodes, which
meets the above requirements.
If you use these instructions to create a cluster on AWS using the NKP default settings without any edits to
configuration files or additional flags, your cluster is deployed on an Ubuntu 20.04 operating system image with 3
control plane nodes, and 4 worker nodes, which match the requirements above.
AWS Prerequisites
Before you begin using Konvoy with AWS, you must:
• Export the AWS profile with the credentials you want to use to create the Kubernetes cluster.
export AWS_PROFILE=<profile>
Section Contents
Note: If your role is not named nkp-bootstrapper-role , change the parameter in line 6 of the file.
• The user you delegate from your role must have a minimum set of permissions, see User Roles and Instance
Profiles page for AWS.
• Create the Cluster IAM Policies in your AWS account.
Roles
eks-controlplane.cluster-api-provider-aws.sigs.k8s.io is the Role associated with EKS cluster
control planes.
To create the resources in the CloudFormation stack, copy the contents above into a file and execute the following
command after replacing MYFILENAME.yaml and MYSTACKNAME with the intended values.
aws cloudformation create-stack --template-body=file://MYFILENAME.yaml --stack-name=MYSTACKNAME --capabilities CAPABILITY_NAMED_IAM
Note: Ensure that the KUBECONFIG environment variable is set to the Management cluster by running export
KUBECONFIG=<Management_cluster_kubeconfig>.conf.
Note: By default, the control-plane Nodes will be created in 3 different Availability Zones. However, the default
worker Nodes will reside in a single zone. You may create additional node pools in other Availability Zones with the
nkp create nodepool command.
To create an EKS cluster from the CLI, perform the following tasks.
Procedure
1. Set the environment variable to the name you assigned this cluster using the command export
CLUSTER_NAME=eks-example.
2. Make sure your AWS credentials are up-to-date. Refresh the credentials using the command nkp
update bootstrap credentials aws.
This is only necessary if using Static Credentials (Access Keys). For more information on access keys, see Using
AWS Minimal Permissions and Role to Create Clusters on page 746. If you use role-based authentication
on a bastion host, proceed to step 3.
3. Create the cluster using the command nkp create cluster eks --cluster-name=${CLUSTER_NAME} --
additional-tags=owner=$(whoami).
Example:
nkp create cluster eks \
--cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami)
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Clusters with HTTP or HTTPS Proxy on page 647.
Note:
Optional flag for ECR: Configure your cluster to use an existing local registry as a mirror when
attempting to pull images. Below is an AWS ECR example.
--registry-mirror-url=<YOUR_ECR_URL>
• YOUR_ECR_URL: the address of an existing local registry accessible in the VPC that the new
cluster nodes will be configured to use as a mirror registry when pulling images. Users can still pull their own
images from ECR directly or use ECR as a mirror. For more information, review the --registry-mirror
flags of the nkp create cluster eks NKP CLI command.
Generating cluster resources
cluster.cluster.x-k8s.io/eks-example created
awsmanagedcontrolplane.controlplane.cluster.x-k8s.io/eks-example-control-plane
created
machinedeployment.cluster.x-k8s.io/eks-example-md-0 created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/eks-example-md-0 created
eksconfigtemplate.bootstrap.cluster.x-k8s.io/eks-example-md-0 created
clusterresourceset.addons.cluster.x-k8s.io/calico-cni-installation-eks-example
created
configmap/calico-cni-installation-eks-example created
configmap/tigera-operator-eks-example created
clusterresourceset.addons.cluster.x-k8s.io/cluster-autoscaler-eks-example created
configmap/cluster-autoscaler-eks-example created
clusterresourceset.addons.cluster.x-k8s.io/node-feature-discovery-eks-example created
Note: Editing the cluster objects requires some understanding of Cluster API. Edits can prevent the cluster from
deploying successfully.
The objects are Custom Resources defined by Cluster API components, and they belong to three
different categories:
• Cluster
A Cluster object references the infrastructure-specific and control plane objects. Because this is an AWS
cluster, an AWSCluster object describes the infrastructure-specific cluster properties. This means the AWS
region, the VPC ID, subnet IDs, and security group rules required by the Pod network implementation.
• Control Plane
An AWSManagedControlPlane object describes the control plane, which is the group of machines that run the
Kubernetes control plane components, which include the etcd distributed database, the API server, the core
controllers, and the scheduler. The object describes the configuration for these components. The object also
refers to an infrastructure-specific object that describes the properties of all control plane machines.
• Node Pool
A Node Pool is a collection of machines with identical properties. For example, a cluster might have one Node
Pool with large memory capacity and another Node Pool with GPU support. Each Node Pool is described
by three objects: The MachinePool references an object that describes the configuration of Kubernetes
components (kubelet) deployed on each node pool machine and an infrastructure-specific object that
describes the properties of all node pool machines. Here, it references a KubeadmConfigTemplate, and an
AWSMachineTemplate object, which describes the instance type, the type of disk used, and the disk size,
among other properties.
For more information, see https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-
resources/.
For more information about the objects, see https://kubernetes.io/docs/concepts/overview/working-with-
objects/.
5. To wait for the cluster control plane to be ready, use the command kubectl wait --
for=condition=ControlPlaneReady "clusters/${CLUSTER_NAME}" --timeout=20m
Example:
cluster.cluster.x-k8s.io/eks-example condition met
The READY status will become True after the cluster control plane becomes ready in one of the following steps.
6. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. To describe the current status
of the cluster in Konvoy, use the command nkp describe cluster -c ${CLUSTER_NAME}.
Example:
NAME                                                                READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/eks-example                                                 True                     10m
└─ControlPlane - AWSManagedControlPlane/eks-example-control-plane   True                     10m
7. As they progress, the controllers also create Events. To list the Events, use the command kubectl get events
| grep ${CLUSTER_NAME}.
For brevity, the example uses grep. Using separate commands to get Events for specific objects is also possible.
For example, kubectl get events --field-selector involvedObject.kind="AWSCluster" and
kubectl get events --field-selector involvedObject.kind="AWSMachine".
Example:
46m Normal SuccessfulCreateVPC
awsmanagedcontrolplane/eks-example-control-plane Created new managed VPC
"vpc-05e775702092abf09"
46m Normal SuccessfulSetVPCAttributes
awsmanagedcontrolplane/eks-example-control-plane Set managed VPC attributes for
"vpc-05e775702092abf09"
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-0419dd3f2dfd95ff8"
46m Normal SuccessfulModifySubnetAttributes
awsmanagedcontrolplane/eks-example-control-plane Modified managed Subnet
"subnet-0419dd3f2dfd95ff8" attributes
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-0e724b128e3113e47"
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-06b2b31ea6a8d3962"
46m Normal SuccessfulModifySubnetAttributes
awsmanagedcontrolplane/eks-example-control-plane Modified managed Subnet
"subnet-06b2b31ea6a8d3962" attributes
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-0626ce238be32bf98"
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-0f53cf59f83177800"
46m Normal SuccessfulModifySubnetAttributes
awsmanagedcontrolplane/eks-example-control-plane Modified managed Subnet
"subnet-0f53cf59f83177800" attributes
46m Normal SuccessfulCreateSubnet
awsmanagedcontrolplane/eks-example-control-plane Created new managed Subnet
"subnet-0878478f6bbf153b2"
46m Normal SuccessfulCreateInternetGateway
awsmanagedcontrolplane/eks-example-control-plane Created new managed Internet
Gateway "igw-09fb52653949d4579"
46m Normal SuccessfulAttachInternetGateway
awsmanagedcontrolplane/eks-example-control-plane Internet Gateway
"igw-09fb52653949d4579" attached to VPC "vpc-05e775702092abf09"
46m Normal SuccessfulCreateNATGateway
awsmanagedcontrolplane/eks-example-control-plane Created new NAT Gateway
"nat-06356aac28079952d"
• The Konvoy version used to create a workload cluster must match the Konvoy version used to delete a
workload cluster.
• EKS clusters cannot be Self-managed.
Section Contents
Procedure
4. Choose a workspace. If you are already in a workspace, the provider is automatically created in that workspace.
6. Add a Name for your Infrastructure Provider and include the Role ARN from Step 1 above.
7. Click Save.
Procedure
6. If available, choose a Kubernetes Version. Otherwise, the default Kubernetes version installs.
8. Edit your worker Node Pools as necessary. You can choose the Number of Nodes, the Machine Type,
and your IAM Instance Profile. You can also choose a Worker Availability Zone for the worker pool.
What to do next
For more information on AWS IAM ARNs, see https://docs.aws.amazon.com/IAM/latest/UserGuide/
reference_identifiers.html#identifiers-arns.
Enabling IAM-Based Cluster Access
Configure your EKS cluster so admins can monitor all actions.
Procedure
1. Download the kubeconfig by selecting the Download kubeconfig button on the top section of the UI.
2. Using that kubeconfig, edit the config map with a command similar to the example.
kubectl --kubeconfig=MYCLUSTER.conf edit cm -n kube-system aws-auth
3. Modify the mapRoles and mapUsers objects as needed for your permissions. The following example
shows mapping the arn:aws:iam::MYAWSACCOUNTID:role/PowerUser role to system:masters on
the Kubernetes cluster.
apiVersion: v1
data:
mapRoles: |
- groups:
- system:bootstrappers
- system:nodes
rolearn: arn:aws:iam::MYAWSACCOUNTID:role/nodes.cluster-api-provider-
aws.sigs.k8s.io
username: system:node:{{EC2PrivateDNSName}}
- groups:
- system:masters
rolearn: arn:aws:iam::MYAWSACCOUNTID:role/PowerUser
username: admin
4. From your management cluster, run the nkp get kubeconfig command to fetch a kubeconfig that uses IAM-
based permissions.
nkp get kubeconfig -c ${EKS_CLUSTER_NAME} -n ${KOMMANDER_WORKSPACE_NAMESPACE} >>
${EKS_CLUSTER_NAME}.conf
Note: More information about the configuration of the EKS control plane can be found on the EKS Cluster IAM
Policies and Roles page.
If the EKS cluster was created using a self-managed AWS cluster that uses IAM Instance Profiles, you
must modify the IAMAuthenticatorConfig field in the AWSManagedControlPlane API object
to allow other IAM entities to access the EKS workload cluster. Follow the steps below:
Procedure
1. Run the following command with your KUBECONFIG configured to select the self-managed cluster previously used
to create the workload EKS cluster. Ensure you substitute ${CLUSTER_NAME} and ${CLUSTER_NAMESPACE}
with their corresponding values for your cluster.
kubectl edit awsmanagedcontrolplane ${CLUSTER_NAME}-control-plane -n
${CLUSTER_NAMESPACE}
2. Edit the IamAuthenticatorConfig field to map the IAM Role to the corresponding Kubernetes Role. In
this example, the IAM role arn:aws:iam::111122223333:role/PowerUser is granted the cluster role
system:masters. Note that this example uses example AWS resource ARNs, so remember to substitute
real values for the corresponding AWS account.
iamAuthenticatorConfig:
mapRoles:
- groups:
- system:bootstrappers
- system:nodes
rolearn: arn:aws:iam::111122223333:role/my-node-role
username: system:node:{{EC2PrivateDNSName}}
- groups:
- system:masters
rolearn: arn:aws:iam::111122223333:role/PowerUser
username: admin
Procedure
1. Get the kubeconfig file from the Secret, and write it to a file, using the nkp get kubeconfig -c
${CLUSTER_NAME} > ${CLUSTER_NAME}.conf command.
When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator.
2. List the nodes using the kubectl --kubeconfig=${CLUSTER_NAME}.conf get nodes command.
Example output:
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-122-211.us-west-2.compute.internal   Ready    <none>   35m   v1.23.9-eks-ba74326
ip-10-0-127-74.us-west-2.compute.internal    Ready    <none>   35m   v1.23.9-eks-ba74326
ip-10-0-71-155.us-west-2.compute.internal    Ready    <none>   35m   v1.23.9-eks-ba74326
ip-10-0-93-47.us-west-2.compute.internal     Ready    <none>   35m   v1.23.9-eks-ba74326
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
3. List the Pods using the kubectl --kubeconfig=${CLUSTER_NAME}.conf get --all-namespaces pods
command.
NAMESPACE NAME READY
STATUS RESTARTS AGE
calico-system calico-kube-controllers-7d6749878f-ccsx9 1/1
Running 0 34m
calico-system calico-node-2r6l8 1/1
Running 0 34m
calico-system calico-node-5pdlb 1/1
Running 0 34m
calico-system calico-node-n24hh 1/1
Running 0 34m
calico-system calico-node-qrh7p 1/1
Running 0 34m
calico-system calico-typha-7bbcb87696-7pk45 1/1
Running 0 34m
calico-system calico-typha-7bbcb87696-t4c8r 1/1
Running 0 34m
What to do next
Attach an Existing Cluster to the Management Cluster on page 825
Note: This procedure assumes you have one or more existing, running Amazon EKS clusters and administrative
privileges. For more information, see Amazon EKS at https://aws.amazon.com/eks/.
Section Contents
Procedure
1. Ensure that the KUBECONFIG environment variable is set to the Management cluster.
Procedure
1. Ensure you are connected to your EKS clusters. List the available contexts using the command kubectl config
get-contexts, and run kubectl config use-context <context-for-eks-cluster> for each of your clusters.
2. Confirm kubectl can access the EKS cluster using the command kubectl get nodes.
Procedure
5. Set up the following environment variables with the access data that is needed for producing a new kubeconfig
file.
export USER_TOKEN_VALUE=$(kubectl -n kube-system get secret/kommander-cluster-admin-
sa-token -o=go-template='{{.data.token}}' | base64 --decode)
export CURRENT_CONTEXT=$(kubectl config current-context)
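This extract omits the step that sets the CLUSTER_CA and CLUSTER_SERVER variables referenced by the kubeconfig template in the next step. A hedged sketch of one way to derive them from the current context with standard kubectl flags (an assumption, not necessarily the guide's prescribed method):
export CLUSTER_CA=$(kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
export CLUSTER_SERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')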
7. Generate a kubeconfig file that uses the environment variable values from the previous step.
cat << EOF > kommander-cluster-admin-config
apiVersion: v1
kind: Config
current-context: ${CURRENT_CONTEXT}
contexts:
- name: ${CURRENT_CONTEXT}
context:
cluster: ${CURRENT_CONTEXT}
user: kommander-cluster-admin
namespace: kube-system
clusters:
- name: ${CURRENT_CONTEXT}
cluster:
certificate-authority-data: ${CLUSTER_CA}
server: ${CLUSTER_SERVER}
users:
- name: kommander-cluster-admin
user:
token: ${USER_TOKEN_VALUE}
EOF
8. This process produces a file in your current working directory called kommander-cluster-admin-config.
The contents of this file are used in Kommander to attach the cluster. Before importing this configuration, verify
that the kubeconfig file can access the cluster.
kubectl --kubeconfig $(pwd)/kommander-cluster-admin-config get all --all-namespaces
Procedure
1. From the top menu bar, select your target workspace.
2. On the Dashboard page, select the Add Cluster option in the Actions dropdown menu at the top right.
4. Select the No additional networking restrictions card. Alternatively, if you must use network restrictions,
stop following the steps below and see the instructions on the page Cluster Attachment with Networking
Restrictions on page 488.
6. The Cluster Name field automatically populates with the name of the cluster in the kubeconfig. You can edit
this field to give your cluster a different name.
Note: If a cluster has limited resources to deploy all the federated platform services, it will fail to stay attached to
the NKP UI. If this happens, ensure your system has sufficient resources for all pods.
Note: These steps only apply if you do not set a WORKSPACE_NAMESPACE when creating a cluster. If you already
set a WORKSPACE_NAMESPACE, then you do not need to perform these steps since the cluster is already attached to
the workspace.
Procedure
1. Confirm you have your MANAGED_CLUSTER_NAME variable set using the command echo
${MANAGED_CLUSTER_NAME}.
2. Retrieve your kubeconfig from the cluster you have created without setting a workspace using the command nkp
get kubeconfig --cluster-name $MANAGED_CLUSTER_NAME > $MANAGED_CLUSTER_NAME.conf
3. You can now attach the cluster in the UI (see the earlier steps for attaching a cluster to a workspace through the
UI) or attach your cluster to the workspace you want using the CLI.
Note: This is only necessary if you did not set the workspace of your cluster upon creation.
4. Retrieve the workspace where you want to attach the cluster using the command kubectl get workspaces
-A .
6. Create a secret in the desired workspace before attaching the cluster to that workspace. Retrieve the
kubeconfig secret value of your cluster using the command kubectl -n default get secret
$MANAGED_CLUSTER_NAME-kubeconfig -o go-template='{{.data.value}}{{"\n"}}'
7. This will return a lengthy value. Copy this entire string; it becomes the data.value field of the Secret.
Create a new attached-cluster-kubeconfig.yaml file using the template below as a reference.
apiVersion: v1
kind: Secret
metadata:
  name: your-managed-cluster-name-kubeconfig
  labels:
    cluster.x-k8s.io/cluster-name: your-managed-cluster-name
data:
  value: <paste-the-copied-kubeconfig-value-here>
8. Create this secret in the desired workspace using the command kubectl apply -f attached-cluster-
kubeconfig.yaml --namespace $WORKSPACE_NAMESPACE.
10. You can now view this cluster in your Workspace in the UI, and you can confirm its status by using the
command kubectl get kommanderclusters -A.
It may take a few minutes to reach "Joined" status.
11. If you have several Pro Clusters and want to turn one of them into a Managed Cluster to be centrally
administrated by a Management Cluster, see Platform Expansion: Conversion of an NKP Pro Cluster to an
NKP Ultimate Managed Cluster on page 515.
What to do next
Cluster Management on page 462.
For more information on related topics, see:
Note: Ensure that the KUBECONFIG environment variable is set to the self-managed cluster by running export
KUBECONFIG={SELF_MANAGED_AWS_CLUSTER}.conf.
If you prefer to continue working in the terminal or shell using the CLI, the steps for deleting the cluster are listed
below. If you are in the NKP UI, you can also delete the cluster from the UI using the steps on this page: Delete EKS
Cluster from the NKP UI
Follow these steps for deletion from the CLI:
1. Ensure your AWS credentials are up to date. If you use user profiles, refresh the credentials using the command
below. Otherwise, proceed to step 2.
nkp update bootstrap credentials aws
2. Important: Do not skip this step if the VPC is managed by Nutanix Kubernetes Platform (NKP). When NKP
deletes the cluster, it deletes the VPC. If the VPC has any EKS Classic ELBs, EKS does not allow the VPC to be
deleted, and NKP cannot delete the cluster.
Delete the Kubernetes cluster and wait a few minutes. Before deleting the cluster, nkp deletes all Services of
type LoadBalancer on the cluster. Each Service is backed by an AWS Classic ELB, and deleting the Service
deletes the ELB that backs it. To skip this step, use the flag --delete-kubernetes-resources=false.
nkp delete cluster --cluster-name=${CLUSTER_NAME}
3. Example output.
# Deleting Services with type LoadBalancer for Cluster default/eks-example
# Deleting ClusterResourceSets for Cluster default/eks-example
# Deleting cluster resources
# Waiting for cluster to be fully deleted
Deleted default/eks-example cluster
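To skip deleting the LoadBalancer Services, as described in step 2, the same command can be combined with the flag named there. A minimal sketch:
nkp delete cluster --cluster-name=${CLUSTER_NAME} --delete-kubernetes-resources=false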
Known Limitations in the current release of NKP:
• The NKP version used to create the workload cluster must match the NKP version used to delete the workload
cluster.
What to do next
Day 2 - Cluster Operations Management
Procedure
2. Select the cluster you wish to delete and click the dotted icon in the bottom right corner.
4. When the next screen appears, copy the name of your cluster and paste it into the empty box.
6. You will see the status as “Deleting” in the top left corner of the cluster you selected for deletion.
What to do next
For a generic overview of deleting clusters within the UI and troubleshooting, see the instructions in
Disconnecting or Deleting Clusters on page 538.
Note: Ensure that the KUBECONFIG environment variable is set to the self-managed cluster by running export
KUBECONFIG={SELF_MANAGED_AWS_CLUSTER}.conf.
Note: Konvoy implements node pools using Cluster API MachineDeployments. For more information, see https://
cluster-api.sigs.k8s.io/developer/architecture/controllers/machine-deployment.html.
Section Contents
Note: By default, the first Availability Zone in the region is used for the nodes in the node pool. To create the nodes
in a different Availability Zone, set the appropriate --availability-zone, as shown in the example following the
creation command below. For more information, see https://aws.amazon.com/about-aws/global-infrastructure/regions_az/.
Procedure
To create a new EKS node pool with 3 replicas, run:
nkp create nodepool eks ${NODEPOOL_NAME} \
--cluster-name=${CLUSTER_NAME} \
--replicas=3
machinedeployment.cluster.x-k8s.io/example created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/example created
eksconfigtemplate.bootstrap.cluster.x-k8s.io/example created
# Creating default/example nodepool resources
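To place the node pool in a different Availability Zone, as mentioned in the note above, the same creation command can take the --availability-zone flag. A sketch, where the zone value is a hypothetical example:
nkp create nodepool eks ${NODEPOOL_NAME} \
  --cluster-name=${CLUSTER_NAME} \
  --replicas=3 \
  --availability-zone us-west-2b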
Advanced users can use a combination of the --dry-run and --output=yaml flags to get a complete set of node
pool objects to modify locally or store in version control.
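A minimal sketch of that workflow, combining the flags named above with the creation command shown earlier (the output file name is illustrative):
nkp create nodepool eks ${NODEPOOL_NAME} \
  --cluster-name=${CLUSTER_NAME} \
  --replicas=3 \
  --dry-run \
  --output=yaml > ${NODEPOOL_NAME}.yaml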
Procedure
4. Example output showing the number of DESIRED and READY replicas increased to 5.
NODEPOOL             DESIRED   READY   KUBERNETES VERSION
example              5         5       v1.23.6
eks-example-md-0     4         4       v1.23.6
Procedure
1. To delete a node pool from a managed cluster, use the command nkp delete nodepool.
nkp delete nodepool ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME}
The expected output will be similar to the following example, indicating the node pool is being deleted.
Azure Infrastructure
Configuration types for installing NKP on Azure Infrastructure.
For an environment on Azure infrastructure, installation options based on your environment are provided in this section.
If you have not already done so, complete this guide's Getting Started with NKP section. For more information, see the
following topics in Getting Started with NKP on page 17.
• Resource Requirements
• Installing NKP on page 47
• Prerequisites for Install
Otherwise, proceed to the Azure Prerequisites and Permissions topic to begin your custom installation.
Section Contents
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• kubectl for interacting with the running cluster.
• Install the Azure CLI
• A valid Azure account with credentials configured.
• Create a custom Azure image using KIB.
Control Plane Nodes
• 4 cores
• 16 GiB memory
• Approximately 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• Disk usage must be below 85% on the root volume.
NKP on Azure defaults to deploying a Standard_D4s_v3 virtual machine with a 128 GiB volume for the OS and
an 80 GiB volume for etcd storage, which meets the above requirements.
Worker Nodes
You must have at least four worker nodes. The specific number of worker nodes required for your environment can
vary depending on the cluster workload and size of the nodes. Each worker node needs to have at least the following:
• 8 cores
• 32 GiB memory
• Around 80 GiB of free space for the volume used for /var/lib/kubelet and /var/lib/containerd.
• Disk usage must be below 85% on the root volume.
NKP on Azure defaults to deploying a Standard_D4s_v3 instance with an 80 GiB root volume for the OS, which
meets the above requirements.
Azure Prerequisites
In Azure, application registration, application objects, and service principals in Azure Active Directory (Azure
AD) are used for access. An application must be registered with an Azure AD tenant to delegate identity and access
management functions to Azure AD. An Azure AD application is defined by its one and only application object, which
resides in the Azure AD tenant where the application was registered. To access resources secured by an Azure AD tenant,
the entity that requires access must be represented by a security principal.
Section Contents
Procedure
1. Sign in to Azure.
az login
Output
[
{
"cloudName": "AzureCloud",
"homeTenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"id": "b1234567-abcd-11a1-a0a0-1234a5678b90",
"isDefault": true,
"managedByTenants": [],
"name": "Mesosphere Developer Subscription",
"state": "Enabled",
"tenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"user": {
"name": "[email protected]",
"type": "user"
}
}
]
4. Ensure you have an override file to configure specific attributes of your Azure image. Otherwise, edit the YAML
file for your OS directly. For more information, see https://github.com/mesosphere/konvoy-image-builder/
tree/7447074a6d910e71ad2e61fc4a12820d073ae5ae/images/azure.
Note: The konvoy-image binary and all supporting folders are also extracted. When extracted, konvoy-image
bind mounts the current working directory (${PWD}) into the container to be used.
This procedure describes using the Konvoy Image Builder (KIB) to create a Cluster API compliant Azure image.
KIB uses variable overrides to specify the base and container images for your new image.
The default Azure image is not recommended for use in production. We suggest using KIB for Azure to build the
image and take advantage of enhanced cluster operations. Explore the Customize your Image topic for more
options.
For more information regarding using the image in creating clusters, refer to the Azure Create a New Cluster
section of the documentation.
• Download the Konvoy Image Builder bundle for your version of Nutanix Kubernetes Platform (NKP).
• Check the Supported Kubernetes Version for your Provider.
• Create a working Docker setup.
Extract the bundle and cd into the extracted konvoy-image-bundle-$VERSION_$OS folder. The bundled version
of konvoy-image contains an embedded docker image that contains all the requirements for building.
The konvoy-image binary and all supporting folders are also extracted. When extracted, konvoy-image bind
mounts the current working directory (${PWD}) into the container to be used.
Procedure
Run the konvoy-image command to build and validate the image.
konvoy-image build azure --client-id ${AZURE_CLIENT_ID} --tenant-id ${AZURE_TENANT_ID}
--overrides override-source-image.yaml images/azure/ubuntu-2004.yaml
By default, the image builder builds in the westus2 location. To specify another location, set the --location flag.
The example below shows how to change the location to eastus:
konvoy-image build azure --client-id ${AZURE_CLIENT_ID} --tenant-id ${AZURE_TENANT_ID}
--location eastus --overrides override-source-image.yaml images/azure/centos-7.yaml
Image Gallery
Procedure
• To specify a specific Resource Group, Gallery, or Image Name, the following flags may be specified:
--gallery-image-locations string a list of locations to publish the image
(default same as location)
--gallery-image-name string the gallery image name to publish the image to
--gallery-image-offer string the gallery image offer to set (default "nkp")
--gallery-image-publisher string the gallery image publisher to set (default
"nkp")
--gallery-image-sku string the gallery image sku to set
--gallery-name string the gallery name to publish the image in
(default "nkp")
--resource-group string the resource group to create the image in
(default "nkp")
When creating your cluster, you will add this flag for your custom image during the creation process: --
compute-gallery-id "<Managed Image Shared Image Gallery Id>". See Create a New Azure
Cluster for the specific commands that consume the image.
The SKU and Image Name default to the values in the image YAML file.
Ensure you have named the correct YAML file for your OS in the konvoy-image build command at https://
github.com/mesosphere/konvoy-image-builder/tree/main/images/gcp.
Azure Marketplace
To allow Nutanix Kubernetes Platform (NKP) to create a cluster with Marketplace-based images, such as Rocky
Linux, you must specify them with flags.
If these fields were specified in the override file during image creation, the same flags must be used during cluster creation.
If you see a similar error to "Creating a virtual machine from Marketplace image or a custom image sourced from a
Marketplace image requires Plan information in the request." when creating a cluster, you must also set the following
flags --plan-offer, --plan-publisher, --plan-sku. For example, when creating a cluster with Rocky Linux
VMs, add the following flags to your nkp create cluster azure command:
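The flag values below are placeholders rather than the original example; a hedged sketch of how the flags named above might be combined with the cluster creation command (substitute the plan details of your Marketplace image):
nkp create cluster azure \
  --cluster-name=${CLUSTER_NAME} \
  --plan-offer <marketplace-plan-offer> \
  --plan-publisher <marketplace-plan-publisher> \
  --plan-sku <marketplace-plan-sku>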
Azure Prerequisites
Before you begin using Konvoy with Azure, you must:
1. Sign in to Azure:
az login
[
{
"cloudName": "AzureCloud",
"homeTenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"id": "b1234567-abcd-11a1-a0a0-1234a5678b90",
"isDefault": true,
"managedByTenants": [],
"name": "Nutanix Developer Subscription",
"state": "Enabled",
"tenantId": "a1234567-b132-1234-1a11-1234a5678b90",
"user": {
"name": "[email protected]",
"type": "user"
}
}
]
2. Run this command to ensure you are pointing to the correct Azure subscription ID:
az account set --subscription "Nutanix Developer Subscription"
4. Ensure you have an override file to configure specific attributes of your Azure image.
Section Contents
Bootstrapping Azure
To create Kubernetes clusters, NKP uses Cluster API (CAPI) controllers. These controllers run on a
Kubernetes cluster.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: Use --http-proxy, --https-proxy, and --no-proxy and their related values in this command for
it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io
Output example:
NAMESPACE NAME
READY UP-TO-DATE AVAILABLE AGE
capa-system capa-controller-manager
1/1 1 1 1h
capg-system capg-controller-manager
1/1 1 1 1h
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager
1/1 1 1 1h
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager
1/1 1 1 1h
capi-system capi-controller-manager
1/1 1 1 1h
cappp-system cappp-controller-manager
1/1 1 1 1h
capv-system capv-controller-manager
1/1 1 1 1h
capz-system capz-controller-manager
1/1 1 1 1h
cert-manager cert-manager
1/1 1 1 1h
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Procedure
Base64 encode the Azure environment variables set in the Azure install prerequisites step.
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "${AZURE_SUBSCRIPTION_ID}" | base64 | tr -d
'\n')"
export AZURE_TENANT_ID_B64="$(echo -n "${AZURE_TENANT_ID}" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "${AZURE_CLIENT_ID}" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "${AZURE_CLIENT_SECRET}" | base64 | tr -d
'\n')"
1. To use a custom Azure Image when creating your cluster, you must create that Azure Image using KIB first and
then use the flag --compute-gallery-id to apply the image.
...
--compute-gallery-id "<Managed Image Shared Image Gallery Id>"
Note:
The --compute-gallery-id image will be in the format
--compute-gallery-id /subscriptions/<subscription id>/resourceGroups/
<resource group
name>/providers/Microsoft.Compute/galleries/<gallery name>/images/
<image definition
name>/versions/<version id>
2. Ensure your subnets do not overlap with your host subnet because they cannot be changed after cluster creation. If
you need to change the Kubernetes subnets, you must do so at cluster creation. The default subnets used in NKP
are:
spec:
clusterNetwork:
pods:
cidrBlocks:
- 192.168.0.0/16
services:
cidrBlocks:
- 10.96.0.0/12
» Additional options for your environment; otherwise, proceed to the next step to create your cluster. (Optional)
Modify Control Plane Audit logs: users can modify the KubeadmControlPlane Cluster API object to
configure different kubelet options. See the following guide if you wish to configure your control plane
beyond the existing options available from flags.
» (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when
attempting to pull images previously pushed to your registry.
3. Run this command to generate your Kubernetes cluster objects, adding any other relevant flags.
nkp create cluster azure \
--cluster-name=${CLUSTER_NAME} \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
Note: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --
https-proxy, and --no-proxy and their related values in this command for it to be successful. More
information is available in Configuring an HTTP or HTTPS Proxy on page 644.
4. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as edits
can prevent the cluster from deploying successfully.
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above, create
the cluster from the objects you created by specifying the directory:
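The command itself is not included in this extract; a hedged sketch of creating the cluster objects with plain kubectl, assuming either the single ${CLUSTER_NAME}.yaml file from step 3 or a directory produced with --output-directory:
kubectl apply -f ${CLUSTER_NAME}.yaml
# or, if a directory was generated with --output-directory:
kubectl create -f <output-directory>/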
7. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command
to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is still on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap cluster to the workload cluster. This process is also called a Pivot.
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
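The move command for this step is not shown in this extract. A hedged sketch, mirroring the nkp move capi-resources command and context names used later in this guide for the reverse pivot (the flag values for this direction are assumptions):
nkp move capi-resources \
  --from-kubeconfig $HOME/.kube/config \
  --from-context kind-konvoy-capi-bootstrapper \
  --to-kubeconfig ${CLUSTER_NAME}.conf \
  --to-context ${CLUSTER_NAME}-admin@${CLUSTER_NAME}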
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME                                                               READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/azure-example                                              True                     55s
├─ClusterInfrastructure - AzureCluster/azure-example               True                     67s
├─ControlPlane - KubeadmControlPlane/azure-example-control-plane   True                     55s
│ ├─Machine/azure-example-control-plane-67f47                      True                     58s
│ ├─Machine/azure-example-control-plane-7pllh                      True                     65s
│ └─Machine/azure-example-control-plane-jtfgv                      True                     65s
└─Workers
  └─MachineDeployment/azure-example-md-0                           True                     67s
    ├─Machine/azure-example-md-0-f9cb9c79b-6nsb9                   True                     59s
    ├─Machine/azure-example-md-0-f9cb9c79b-jxwl6                   True                     58s
    ├─Machine/azure-example-md-0-f9cb9c79b-ktg7z                   True                     59s
    └─Machine/azure-example-md-0-f9cb9c79b-nxcm2                   True                     66s
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Example output:
NAME STATUS ROLES AGE VERSION
azure-example-control-plane-7ffnl Ready control-plane,master 6m18s v1.28.7
azure-example-control-plane-l4bv8 Ready control-plane,master 14m v1.28.7
azure-example-control-plane-n4g4l Ready control-plane,master 18m v1.28.7
azure-example-md-0-mpctb Ready <none> 15m v1.28.7
azure-example-md-0-qglp9 Ready <none> 15m v1.28.7
azure-example-md-0-sgrd6 Ready <none> 16m v1.28.7
azure-example-md-0-wzbkl Ready <none> 16m v1.28.7
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. Only required if your cluster uses a custom AWS VPC and requires an internal load-balancer; set the traefik
annotation to create an internal-facing ELB.
apps:
traefik:
enabled: true
values: |
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Section Contents
Procedure
4. Observe that the Machine resource is being replaced using this command.
kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment
Output:
NAME                 CLUSTER         REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE     VERSION
azure-example-md-0   azure-example   4          3       4         1             ScalingUp   7m30s   v1.28.7
long-running-md-0    long-running    4          4       4         0             Running     7m28s   v1.28.7
Procedure
1. Make sure your AWS credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials aws --kubeconfig $HOME/.kube/config
2. The bootstrap cluster will host the Cluster API controllers that reconcile the cluster objects marked for deletion.
Create a bootstrap cluster. To avoid using the wrong kubeconfig, the following steps use explicit kubeconfig paths
and contexts.
nkp create bootstrap --kubeconfig $HOME/.kube/config --with-aws-bootstrap-
credentials=true
3. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to the
bootstrap cluster. This process is also called a Pivot.
nkp move capi-resources \
--from-kubeconfig ${CLUSTER_NAME}.conf \
--from-context ${CLUSTER_NAME}-admin@${CLUSTER_NAME} \
--to-kubeconfig $HOME/.kube/config \
--to-context kind-konvoy-capi-bootstrapper
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig $HOME/.kube/config get nodes
4. Use the cluster life cycle services on the workload cluster to check the workload cluster’s status.
nkp describe cluster --kubeconfig $HOME/.kube/config -c ${CLUSTER_NAME}
Output:
102s
Procedure
1. To delete a cluster, use nkp delete cluster and pass in the name of the cluster with the --cluster-name
flag. Use kubectl get clusters to get the details (--cluster-name and --namespace) of the Kubernetes
cluster you want to delete.
kubectl get clusters
Note: Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer
on the cluster. To skip this step, use the flag --delete-kubernetes-resources=false.
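A minimal sketch of the deletion command, combining the details returned by kubectl get clusters with the flag described in the note (use of a namespace flag here is an assumption based on the text above):
nkp delete cluster --cluster-name=${CLUSTER_NAME} --namespace=${CLUSTER_NAMESPACE}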
Procedure
Delete the bootstrap cluster.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Output:
# Deleting bootstrap cluster
• This feature requires Python 3.5 or greater to be installed on all control plane hosts.
• Complete the Bootstrap Cluster topic.
To create a cluster with automated certificate renewal, you create a Konvoy cluster using the certificate-renew-
interval flag. The certificate-renew-interval is the number of days after which Kubernetes-managed PKI
certificates will be renewed. For example, a certificate-renew-interval value of 60 means the certificates
will be renewed every 60 days.
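A hedged sketch of passing the flag at cluster creation time; the provider subcommand and other flags follow the Azure examples in this section, and 60 is the illustrative value from the text above:
nkp create cluster azure \
  --cluster-name=${CLUSTER_NAME} \
  --certificate-renew-interval=60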
Technical Details
The following manifests are modified on the control plane hosts and are located at /etc/kubernetes/manifests.
Modifications to these files require SUDO access.
kube-controller-manager.yaml
kube-apiserver.yaml
kube-scheduler.yaml
kube-proxy.yaml
The following annotation indicates the time each component was reset:
metadata:
annotations:
konvoy.nutanix.io/restartedAt: $(date +%s)
This only occurs when the PKI certificates are older than the interval given at cluster creation time. This is activated
by a systemd timer called renew-certs.timer that triggers an associated systemd service called renew-
certs.service that runs on all of the control plane hosts.
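To confirm the timer is active on a control plane host, standard systemd tooling can be used; a sketch, assuming SSH and sudo access to the host:
sudo systemctl status renew-certs.timer
sudo systemctl list-timers 'renew-certs*'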
AKS Infrastructure
Configuration types for installing NKP on Azure Kubernetes Service (AKS) infrastructure.
You can choose from multiple configuration types when installing NKP on Azure Kubernetes Service (AKS)
infrastructure. If not already done, see the documentation for:
Note: An AKS cluster cannot be a Management or Pro cluster. When installing NKP on your AKS cluster, first ensure
you have a Management cluster with NKP and the Kommander component installed that handles the life cycle of your
AKS cluster.
Section Contents
Section Contents
• The cluster name may only contain the following characters: a-z, 0-9, ., and -.
• Cluster creation will fail if the name includes capital letters.
• For more naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/
names/.
Procedure
2. Check to see what version of Kubernetes is available in your region. When deploying with Azure Kubernetes
Service (AKS), you need to declare the version of Kubernetes you wish to use by running the following command,
substituting <your-location> with the Azure region you're deploying to.
az aks get-versions -o table --location <your-location>
Note: Refer to the current release Kubernetes compatibility table for the correct version to use and select an
available 1.27.x version. The version listed in the command is an example.
export KUBERNETES_VERSION=1.27.6
Note: Editing the cluster objects requires some understanding of Cluster API. Edits can prevent the cluster from
deploying successfully.
The objects are Custom Resources defined by Cluster API components, and they belong to three different categories:
• Cluster A Cluster object references the infrastructure-specific and control plane objects.
• Control Plane
• Node Pool A Node Pool is a collection of machines with identical properties. For example, a cluster might
have one Node Pool with large memory capacity and another Node Pool with GPU support. Each Node Pool is
described by three objects: The MachinePool references an object that describes the configuration of Kubernetes
components (kubelet) deployed on each node pool machine and an infrastructure-specific object that describes the
properties of all node pool machines. Here, it references a KubeadmConfigTemplate.
For more information about the objects, see Concepts in the Cluster API Book https://cluster-api.sigs.k8s.io/user/
concepts.html.
Procedure
1. Wait for the cluster control-plane to be ready using the command kubectl wait --
for=condition=ControlPlaneReady "clusters/$CLUSTER_NAME" --timeout=20m
Example:
cluster.cluster.x-k8s.io/aks-example condition met
The READY status will become True after the cluster control-plane becomes ready.
3. As they progress, the controllers also create Events. To list the events, use the command kubectl get events
| grep $CLUSTER_NAME
For brevity, the example uses grep. Using separate commands to get Events for specific objects is also possible.
For example, kubectl get events --field-selector involvedObject.kind="AKSCluster" and
kubectl get events --field-selector involvedObject.kind="AKSMachine".
Example:
48m Normal SuccessfulSetNodeRefs machinepool/aks-
example-md-0 [{Kind: Namespace: Name:aks-mp6gglj-41174201-
vmss000000 UID:e3c30389-660d-46f5-b9d7-219f80b5674d APIVersion: ResourceVersion:
FieldPath:} {Kind: Namespace: Name:aks-mp6gglj-41174201-vmss000001 UID:300d71a0-
f3a7-4c29-9ff1-1995ffb9cfd3 APIVersion: ResourceVersion: FieldPath:} {Kind:
Namespace: Name:aks-mp6gglj-41174201-vmss000002 UID:8eae2b39-a415-425d-8417-
d915a0b2fa52 APIVersion: ResourceVersion: FieldPath:} {Kind: Namespace: Name:aks-
mp6gglj-41174201-vmss000003 UID:3e860b88-f1a4-44d1-b674-a54fad599a9d APIVersion:
ResourceVersion: FieldPath:}]
6m4s Normal AzureManagedControlPlane available azuremanagedcontrolplane/
aks-example successfully reconciled
48m Normal SuccessfulSetNodeRefs machinepool/aks-
example [{Kind: Namespace: Name:aks-mp6gglj-41174201-
vmss000000 UID:e3c30389-660d-46f5-b9d7-219f80b5674d APIVersion: ResourceVersion:
FieldPath:} {Kind: Namespace: Name:aks-mp6gglj-41174201-vmss000001 UID:300d71a0-
f3a7-4c29-9ff1-1995ffb9cfd3 APIVersion: ResourceVersion: FieldPath:} {Kind:
Namespace: Name:aks-mp6gglj-41174201-vmss000002 UID:8eae2b39-a415-425d-8417-
d915a0b2fa52 APIVersion: ResourceVersion: FieldPath:}]
Known Limitations
Limitations for using NKP to create a new Azure Kubernetes Service (AKS) cluster.
The following are known limitations:
• The Nutanix Kubernetes Platform (NKP) version used to create a workload cluster must match the NKP version
used to delete that workload cluster.
• NKP supports deploying one workload cluster.
• NKP generates a single node pool deployed by default; adding additional node pools is supported.
• NKP does not validate edits to cluster objects.
Section Contents
Procedure
5. Choose a workspace. If you are already in a workspace, the provider is automatically created in that workspace.
8. Take the ID output from the login command above and put it into the Subscription ID field.
9. Take the tenant used in Step 2 and put it into the Tenant ID field.
10. Take the appId used in Step 2 and put it into the Client ID field.
11. Take the password used in Step 2 and put it into the Client Secret field.
Procedure
5. From Select Infrastructure Provider, choose the provider created in the prerequisites section.
6. To choose a Kubernetes Version, run the command below in the az CLI, and then select the version of AKS
you want to use.
az aks get-versions -o table --location <location>
8. Edit your worker Node Pools, as necessary. You can choose the Number of Nodes, the Machine Type,
and for the worker nodes, you can choose a Worker Availability Zone.
Note: Your cluster can take up to 15 minutes to appear in the Provisioned status.
You are then redirected to the Clusters page, where you’ll see your new cluster in the Provisioning status.
Hover over the status to view the details.
Section Contents
Procedure
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready after the calico-node DaemonSet Pods are Ready.
3. List the Pods using the command kubectl --kubeconfig=$CLUSTER_NAME.conf get --all-namespaces
pods
NAMESPACE NAME READY
STATUS RESTARTS AGE
calico-system calico-kube-controllers-5dcd4b47b5-tgslm 1/1
Running 0 3m58s
calico-system calico-node-46dj9 1/1
Running 0 3m58s
calico-system calico-node-crdgc 1/1
Running 0 3m58s
calico-system calico-node-m7s7x 1/1
Running 0 3m58s
calico-system calico-node-qfkqc 1/1
Running 0 3m57s
calico-system calico-node-sfqfm 1/1
Running 0 3m57s
calico-system calico-node-sn67x 1/1
Running 0 3m53s
calico-system calico-node-w2pvt 1/1
Running 0 3m58s
calico-system calico-typha-6f7f59969c-5z4t5 1/1
Running 0 3m51s
calico-system calico-typha-6f7f59969c-ddzqb 1/1
Running 0 3m58s
calico-system calico-typha-6f7f59969c-rr4lj 1/1
Running 0 3m51s
kube-system azure-ip-masq-agent-4f4v6 1/1
Running 0 4m11s
kube-system azure-ip-masq-agent-5xfh2 1/1
Running 0 4m11s
kube-system azure-ip-masq-agent-9hlk8 1/1
Running 0 4m8s
kube-system azure-ip-masq-agent-9vsgg 1/1
Running 0 4m16s
kube-system azure-ip-masq-agent-b9wjj 1/1
Running 0 3m57s
kube-system azure-ip-masq-agent-kpjtl 1/1
Running 0 3m53s
kube-system azure-ip-masq-agent-vr7hd 1/1
Running 0 3m57s
kube-system cluster-autoscaler-b4789f4bf-qkfk2 0/1
Init:0/1 0 3m28s
kube-system coredns-845757d86-9jf8b 1/1
Running 0 5m29s
kube-system coredns-845757d86-h4xfs 1/1
Running 0 4m
kube-system coredns-autoscaler-5f85dc856b-xjb5z 1/1
Running 0 5m23s
kube-system csi-azuredisk-node-4n4fx 3/3
Running 0 3m53s
kube-system csi-azuredisk-node-8pnjj 3/3
Running 0 3m57s
kube-system csi-azuredisk-node-sbt6r 3/3
Running 0 3m57s
Section Contents
Procedure
Delete the Kubernetes cluster and wait a few minutes.
Before deleting the cluster, NKP deletes all Services of type LoadBalancer on the cluster. Deleting the Service deletes
the Azure LoadBalancer that backs it. To skip this step, use the flag --delete-kubernetes-resources=false.
Caution: Do not skip this step. If NKP manages the Azure Network, then when NKP deletes the cluster, it also deletes the
Network.
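The delete command for AKS is not shown in this extract; a hedged sketch, mirroring the nkp delete cluster command used for EKS earlier in this guide:
nkp delete cluster --cluster-name=${CLUSTER_NAME}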
What to do next
To view your dashboard and continue your customization, complete the Kommander installation. For more
information, see Kommander Installation Based on Your Environment on page 964.
Known Limitations
Limitations for deleting an Azure Kubernetes Service (AKS) cluster.
The following limitations apply to the current NKP release.
• The NKP version used to create the workload cluster must match the NKP version used to delete the workload
cluster.
vSphere Infrastructure
Configuration types for installing NKP on vSphere Infrastructure.
For an environment on vSphere infrastructure, installation options based on your environment are provided in this section.
If not already done, see the documentation for:
The workflow on the left shows the creation of a base OS image in the vCenter vSphere client using inputs from
Packer. The workflow on the right shows how NKP uses that same base OS image to create CAPI-enabled VM
images for your cluster.
After creating the base image, the NKP image builder uses it to create a CAPI-enabled vSphere template that includes
the Kubernetes objects for the cluster. You can use that resulting template with the NKP create cluster command
to create the VM nodes in your cluster directly on a vCenter server.
Section Contents
vSphere Prerequisites
This section contains all the prerequisite information specific to VMware vSphere infrastructure. These are above and
beyond all of the NKP prerequisites for Install. Fulfilling the prerequisites involves completing these two areas:
1. Nutanix Kubernetes Platform (NKP) prerequisites
2. vSphere prerequisites - vCenter Server + vSphere AHV
1. NKP Prerequisites
Before using NKP to create a vSphere cluster, verify that you have:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• A container engine needs to be installed on the host where the NKP Konvoy CLI runs. For example, if you install Konvoy
on your laptop, ensure the laptop has a supported version of Docker or another container engine. On macOS, Docker runs
in a virtual machine. Configure this virtual machine with at least 8 GB of memory.
• The kubectl CLI tool, version 1.21.6, for interacting with the running cluster, installed on the host where the NKP Konvoy
command line interface (CLI) runs. For more information, see https://kubernetes.io/docs/tasks/tools/#kubectl
• A valid VMware vSphere account with credentials configured.
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI-compatible storage
that is suitable for production. For more information, see https://kubernetes.io/docs/concepts/storage/
volumes/#volume-types.
Note: You can choose from any of the storage options available for Kubernetes. To disable the default that Konvoy
deploys, set the default StorageClass localvolumeprovisioner as non-default. Then, set your newly created
StorageClass as the default by following the commands in the Kubernetes documentation called Changing the Default
Storage Class.
• Access to a bastion VM or other network-connected host running vSphere Client version 6.7.x with Update 3 or
later.
• You must reach the vSphere API endpoint from where the Konvoy command line interface (CLI) runs.
• AWS: https://aws.amazon.com/solutions/implementations/linux-bastion/
• Azure: https://learn.microsoft.com/en-us/azure/bastion/quickstart-host-portal
• GCP: https://blogs.vmware.com/cloud/2021/06/02/intro-google-cloud-vmware-engine-bastion-host-
access-iap/
• vSphere: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.security.doc/
GUID-6975426F-56D0-4FE2-8A58-580B40D2F667.html
• Valid values for the following:
• The use of Persistent Volumes in your cluster depends on Cloud Native Storage (CNS), which is available
in vSphere v6.7.x with Update 3 and later versions. CNS depends on this shared datastore’s configuration.
• Datastore URL from the datastore record for the shared datastore you want your cluster to use.
• You need this URL value to ensure the correct Datastore is used when NKP creates VMs for your cluster in
vSphere.
• Folder name.
• Base template names, such as base-rhel-8 or base-rhel-7.
• Name of a Virtual Network with DHCP enabled for air-gapped and non-air-gapped environments.
• Resource Pools - at least one resource pool is needed, with every host in the pool having access to shared
storage, such as VSAN.
• Each host in the resource pool needs access to shared storage, such as NFS or VSAN, to use
MachineDeployments and high-availability control planes.
Section Contents
Procedure
1. Open a vSphere Client connection to the vCenter Server, described in the Prerequisites.
3. Give the new Role a name from the four choices detailed in the next section.
4. Select the Privileges from the permissions directory tree dropdown list below each of the four roles.
• The list of permissions can be set so the provider can create, modify, or delete resources or clone templates,
VMs, disks, attach network, etc.
The following four Roles need to be created for proper Nutanix Kubernetes Platform (NKP) access to the
required Resource(s) on the correct level of vCenter and resource pools.
Open vCenter:
• 1. vcenter root
1. Resource: View
2. Cns: Searchable
3. Profile-driven storage: Profile-driven storage view
4. Network - Session: ValidateSession
2. nkp-datacenter This role allows CAPV to create resources and assign networks. It is the most extensive
permission, but it is only assigned to folders, resource pools, data stores, and networks, so it can easily be
separated from other environments. It applies to the Resources.
• Do not propagate them because it gives the user view privileges on all folders and resource pools:
1. datacenter
2. cluster
3. esx host 1
4. esx host 2
• Resource: View
• Cluster: View
• ESX Host 1: View
• ESX Host 2: View
3. nkp-k8srole - This role allows CAPV to create resources and assign networks. It is the most extensive
permission, but it is only assigned to folders, resource pools, data stores, and networks, so it can easily be
separated from other environments. It applies to the Resources.
• Resource: View
• Datastore: Allocate space, Delete File, File Management
• Global
• Network: Assign network
• Resource: Assign vApp to Pool, Assign VM to Pool
• Scheduled Task: Create, Delete, Edit, Run
• Session: ValidateSession
• Storage Profile: View
• Storage Views: View
4. nkp-readonly - This optional role allows the role to be cloned from templates in other folders and data stores
but does not have write access. It applies to the Resources.
a. templates folder
b. templates data store
• Datastore: View
• Folder: View
• vApp: Clone, Export
• Clone, Clone template, Deploy template
• Cns: Searchable
• Datastore: Allocate space
• Host: Configuration
• Network: Assign network
• Resource
• Virtual machine:
  • Change Configuration (from the list in that section, select these permissions): Change Memory, Change Settings
  • Edit inventory: Remove
  • Interaction: Power off, Power on
  • Provisioning: Clone template, Deploy template
• Session: ValidateSession
The table below describes the level at which these permissions are assigned.
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP partition.
• Network configuration: as KIB must download and install packages, activating the network is required.
• Connect to Red Hat: if using Red Hat Enterprise Linux (RHEL), registering with Red Hat is required to configure
software repositories and install software packages.
• Software selection: Nutanix recommends choosing Minimal Install.
• NKP recommends installing with the packages provided by the operating system package managers. Use the
version that corresponds to the major version of your operating system.
Disk Size
For each cluster you create using this base OS image, ensure you establish the disk size of the root file system based
on the following:
• For some base OS images, the custom disk size option does not affect the size of the root file system. This is
because some root file systems, for example, those contained in an LVM Logical Volume, cannot be resized
automatically when a machine first boots.
• The specified custom disk size must be equal to, or larger than, the size of the base OS image root file system.
This is because a root file system cannot be reduced automatically when a machine first boots.
• In VMware Cloud Director Infrastructure on page 912, the base image determines the minimum storage
available for the VM.
This Base OS Image is later used to create the CAPI-enabled image during installation and cluster creation.
If using Flatcar, note that in a vSphere or Pre-provisioned environment, anyone with access to the console of a Virtual
Machine (VM) has access to the core operating system user. This is called autologin. To disable autologin, you add
parameters to your base Flatcar image. For more information on disabling or enabling autologin and on using Flatcar,
see the Flatcar documentation.
Tip: A local registry can also be used in a non-air-gapped environment for speed and security if desired. To do so, add
the following steps to your non-air-gapped installation process. See the topic Registry Mirror Tools.
Section Contents
Procedure
1. Users must perform the steps in the topic vSphere: Creating an Image before starting this procedure.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
3. Create an image.yaml file and add the following variables for vSphere. Nutanix Kubernetes Platform (NKP)
uses this file and these variables as inputs in the next step. To customize your image.yaml file, refer to this
section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer
to the example YAML files located here: OVA YAML: https://github.com/mesosphere/konvoy-image-
builder/tree/main/images/ova.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/{{guestinfo_datasource_ref}}/install.sh"
packer:
  cluster: "<VSPHERE_CLUSTER_NAME>"
  datacenter: "<VSPHERE_DATACENTER_NAME>"
  datastore: "<VSPHERE_DATASTORE_NAME>"
  folder: "<VSPHERE_FOLDER>"
  insecure_connection: "false"
  network: "<VSPHERE_NETWORK>"
  resource_pool: "<VSPHERE_RESOURCE_POOL>"
  template: "os-qualification-templates/nutanix-base-Ubuntu-20.04" # change default value to your base template name
  vsphere_guest_os_type: "other4xLinux64Guest"
  guest_os_type: "ubuntu2004-64"
  # goss params
  distribution: "ubuntu"
  distribution_version: "20.04"
  # Use the following overrides to select the authentication method that can be used with the base template
  # ssh_username: ""          # can be exported as environment variable 'SSH_USERNAME'
  # ssh_password: ""          # can be exported as environment variable 'SSH_PASSWORD'
  # ssh_private_key_file: ""  # can be exported as environment variable 'SSH_PRIVATE_KEY_FILE'
  # ssh_agent_auth: false     # if set to true, ssh_password and ssh_private_key will be ignored
• Any additional configurations can be added to this command using --overrides flags, as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
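The build step itself is not shown above. A minimal sketch of what it typically looks like, assuming konvoy-image build consumes the image.yaml you created and that the override paths match your checkout; verify the exact syntax against your KIB version:
konvoy-image build image.yaml \
  --overrides overrides.yaml \
  --overrides overrides/fips.yaml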
5. The Konvoy Image Builder (KIB) uses the values in image.yaml and the input base OS image (https://github.com/mesosphere/konvoy-image-builder/tree/main/images/ova) to create a vSphere template directly on the vCenter server. This template contains the required artifacts needed to create a Kubernetes cluster. When KIB provisions the OS image successfully, it creates a manifest file; the artifact_id field of this file contains the name of the created template.
Tip: Now that you can see the template created in your vCenter, it is best to rename it to nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, for example nkp-2.4.0-k8s-1.24.6-ubuntu, to keep templates organized.
6. The next steps deploy an NKP cluster using your vSphere template.
Bootstrapping vSphere
To create Kubernetes clusters, NKP uses Cluster API (CAPI) controllers. These controllers run on a
Kubernetes cluster.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: If your environment uses HTTP or HTTPS proxies, include --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. Ensure that the CAPV controllers are present using the command kubectl get pods -n capv-system.
Output example:
NAME READY STATUS RESTARTS AGE
capv-controller-manager-785c5978f-nnfns 1/1 Running 0 13h
6. NKP waits until the controller-manager and webhook deployments of these providers are ready. List
these deployments using the command kubectl get --all-namespaces deployments -
l=clusterctl.cluster.x-k8s.io.
Output example:
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                         capa-controller-manager                         1/1     1            1           1h
capg-system                         capg-controller-manager                         1/1     1            1           1h
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           1h
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           1h
capi-system                         capi-controller-manager                         1/1     1            1           1h
cappp-system                        cappp-controller-manager                        1/1     1            1           1h
capv-system                         capv-controller-manager                         1/1     1            1           1h
capz-system                         capz-controller-manager                         1/1     1            1           1h
cert-manager                        cert-manager                                    1/1     1            1           1h
cert-manager                        cert-manager-cainjector                         1/1     1            1           1h
cert-manager                        cert-manager-webhook                            1/1     1            1           1h
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
Procedure
3. Set the environment variables for vSphere using the following commands.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI compatible storage that is suitable for production. For more information, see the Kubernetes documentation topic "Changing the Default Storage Class." If you are not using the default, you cannot deploy an alternate provider until after nkp create cluster is finished. However, this must be determined before the Kommander installation.
4. Ensure your vSphere credentials are up-to-date by refreshing the credentials with the command.
nkp update bootstrap credentials vsphere
5. Ensure your subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If you need to change the Kubernetes subnets, you must do so at cluster creation. The default subnets used in NKP are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
6. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
--resource-pool <RESOURCE_POOL_NAME> \
--virtual-ip-interface <ip_interface_name> \
--vm-template <TEMPLATE_NAME>
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<username> --registry-mirror-password=<password>. For more information, see https://docs.docker.com/docker-hub/download-rate-limit/.
» Flatcar OS flag: for Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make changes related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP or HTTPS Proxy on page 644.
» FIPS flags: To create a cluster in FIPS mode, inform the controllers of the appropriate image repository and version tags of the official Nutanix FIPS builds of Kubernetes by adding those flags to the nkp create cluster command.
» You can create individual manifest files with smaller manifests to ease editing using the --output-directory flag.
7. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as
edits can prevent the cluster from deploying successfully.
8. Create the cluster from the objects generated from the dry run. A warning will appear in the console if the
resource already exists, requiring you to remove the resource or update your YAML.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your NKP create .. --dry-run step above,
create the cluster from the objects you created by specifying the directory:
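For example, a sketch assuming the manifests were written to a directory named cluster-manifests (the directory name is illustrative):
kubectl create -f cluster-manifests/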
10. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a
command to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                   True                     13h
├─ClusterInfrastructure - VSphereCluster/nutanix-e2e-cluster_name-1  True                     13h
├─ControlPlane - KubeadmControlPlane/nutanix-control-plane           True                     13h
│ └─Machine/nutanix--control-plane-7llgd                             True                     13h
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI compatible storage that is suitable for production. For more information, see the Kubernetes documentation topic "Changing the Default Storage Class." If you are not using the default, you cannot deploy an alternate provider until after nkp create cluster is finished. However, this must be determined before the Kommander installation. For more information, see https://kubernetes.io/docs/concepts/storage/volumes/#volume-types and https://kubernetes.io/docs/tasks/administer-cluster/change-default-storage-class/.
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by
setting the following flag --registry-mirror-url=https://registry-1.docker.io --
registry-mirror-username=<username> --registry-mirror-password=<password> on
the nkp create cluster command. For more information, see https://docs.docker.com/docker-hub/
download-rate-limit/.
12. Verify that the kubeadm control plane is ready with the command.
kubectl get kubeadmcontrolplane
Output is similar to:
NAME                                  CLUSTER                 INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
nutanix-e2e-cluster-1-control-plane   nutanix-e2e-cluster-1   true          true                   3          3       3         0             14h   v1.29.6
13. Describe the kubeadm control plane and check its status and events with the command.
kubectl describe kubeadmcontrolplane
14. As they progress, the controllers also create Events, which you can list using the command
kubectl get events | grep ${CLUSTER_NAME}
For brevity, this example uses grep. You can also use separate commands to get Events for specific objects,
such as kubectl get events --field-selector involvedObject.kind="VSphereCluster" and
kubectl get events --field-selector involvedObject.kind="VSphereMachine".
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=gcp-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster, or vice-
versa.
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster, and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
a. Access the Datastore tab in the vSphere client and select a datastore by name.
b. Copy the URL of that datastore from the information dialog that displays.
c. Return to the Nutanix Kubernetes Platform (NKP) CLI, and delete the existing StorageClass with the
command: kubectl delete storageclass vsphere-raw-block-sc
d. Run the following command to create a new StorageClass, supplying the correct values for your environment.
cat <<EOF > vsphere-raw-block-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: vsphere-raw-block-sc
provisioner: csi.vsphere.vmware.com
parameters:
  datastoreurl: "<url>"
volumeBindingMode: WaitForFirstConsumer
EOF
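The heredoc above only writes the manifest to vsphere-raw-block-sc.yaml. A minimal follow-up sketch, assuming you then create the StorageClass on the workload cluster using the kubeconfig retrieved earlier:
kubectl create -f vsphere-raw-block-sc.yaml --kubeconfig ${CLUSTER_NAME}.conf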
Note: It may take a few minutes for the Status to move to Ready while the Pod network is deployed. The Nodes'
Status will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Output:
NAME STATUS ROLES AGE VERSION
aws-example-control-plane-9z77w Ready control-plane,master 4m44s v1.27.6
aws-example-control-plane-rtj9h Ready control-plane,master 104s v1.27.6
aws-example-control-plane-zbf9w Ready control-plane,master 3m23s v1.27.6
aws-example-md-0-88c46 Ready <none> 3m28s v1.27.6
aws-example-md-0-fp8s7 Ready <none> 3m28s v1.27.6
aws-example-md-0-qvnx7 Ready <none> 3m28s v1.27.6
aws-example-md-0-wjdrg Ready <none> 3m27s v1.27.6
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Provide Context for Commands with a kubeconfig File.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment
of applications.
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy and External Load Balancer.
5. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous section, add these values for nkp-catalog-applications (if you are enabling NKP Catalog Apps).
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Procedure
You can check the status of the installation using the following command.
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
Note: If you prefer the CLI not to wait for all applications to become ready, you can set the --wait=false flag.
The command first waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a "broken" release state, such as "exhausted" or "another rollback/release in progress", trigger a reconciliation of the HelmRelease by suspending and then resuming it:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are ready to customize configurations using the Cluster Operations Management section of the documentation. Most of this customization, such as attaching clusters and deploying applications, takes place in the dashboard or UI of Nutanix Kubernetes Platform (NKP). The Cluster Operations section allows you to manage cluster operations and their application workloads to optimize your organization's productivity.
Section Contents
Note: Users need to perform the steps in the topic vSphere: Creating an Image before starting this procedure.
Procedure
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for
a particular OS. To create it, run the new Nutanix Kubernetes Platform (NKP) command create-package-
bundle. This builds an OS bundle using the Kubernetes version defined in ansible/group_vars/all/
defaults.yaml. Example command.
./konvoy-image create-package-bundle --os redhat-8.4 --output-directory=artifacts
5. Follow the instructions below to build a vSphere template and, if applicable, set the --overrides overrides/offline.yaml flag described in Step 4 below.
Procedure
3. Create an image.yaml file and add the following variables for vSphere. NKP uses this file and these variables as
inputs in the next step. To customize your image.yaml file, refer to this section: Customize your Image.
Note: This example is Ubuntu 20.04. You will need to replace the OS name below based on your OS. Also, refer to the example YAML files located here: OVA YAML: https://github.com/mesosphere/konvoy-image-builder/tree/main/images/ova.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/{{guestinfo_datasource_ref}}/install.sh"
packer:
  cluster: "<VSPHERE_CLUSTER_NAME>"
  datacenter: "<VSPHERE_DATACENTER_NAME>"
  datastore: "<VSPHERE_DATASTORE_NAME>"
  folder: "<VSPHERE_FOLDER>"
  insecure_connection: "false"
  network: "<VSPHERE_NETWORK>"
  resource_pool: "<VSPHERE_RESOURCE_POOL>"
  template: "os-qualification-templates/d2iq-base-Ubuntu-20.04" # change default value to your base template name
  vsphere_guest_os_type: "other4xLinux64Guest"
  guest_os_type: "ubuntu2004-64"
  # goss params
  distribution: "ubuntu"
  distribution_version: "20.04"
  # Use the following overrides to select the authentication method that can be used with the base template
  # ssh_username: ""          # can be exported as environment variable 'SSH_USERNAME'
  # ssh_password: ""          # can be exported as environment variable 'SSH_PASSWORD'
  # ssh_private_key_file: ""  # can be exported as environment variable 'SSH_PRIVATE_KEY_FILE'
  # ssh_agent_auth: false     # if set to true, ssh_password and ssh_private_key will be ignored
• Any additional configurations can be added to this command using --overrides flags, as shown below:
1. Any credential overrides: --overrides overrides.yaml
2. For FIPS, add this flag: --overrides overrides/fips.yaml
3. For air-gapped, add this flag: --overrides overrides/offline-fips.yaml
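As in the non-air-gapped procedure, the build invocation itself is not shown. A hedged sketch, assuming konvoy-image build consumes the image.yaml above together with the air-gapped override referenced in the earlier step; confirm the syntax against your KIB version:
konvoy-image build image.yaml \
  --overrides overrides/offline.yaml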
5. The Konvoy Image Builder (KIB) uses the values in image.yaml and the input base OS image to create a
vSphere template directly on the vCenter server. This template contains the required artifacts needed to create a
Kubernetes cluster. When KIB provisions the OS image successfully, it creates a manifest file. The artifact_id
field of this file contains the name of the AMI ID (AWS), template name (vSphere), or image name (GCP/Azure),
for example. For more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/
images/ova.
Tip: Now that you can see the template created in your vCenter, it is best to rename it to nkp-<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, for example nkp-2.4.0-k8s-1.24.6-ubuntu, to keep templates organized.
6. The next steps deploy an NKP cluster using your vSphere template.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
If you are operating in an air-gapped environment, a local container registry containing all the necessary installation images, including the Kommander images, is required. This registry must be accessible from both the bastion machine and the other machines that will be created for the Kubernetes cluster.
Procedure
2. After extraction, you access files from the resulting directory structure in subsequent steps. For example, for the bootstrap cluster, change to the nkp-<version> directory (adjusting for your current location):
cd nkp-v2.12.0
• REGISTRY_URL: the address of an existing local registry accessible in the VPC; the new cluster nodes will be configured to use it as a mirror registry when pulling images.
• The environment where you are running the nkp push command must be authenticated with AWS in order to
load your images into ECR.
• Other registry variables:
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
Before creating or upgrading a Kubernetes cluster, you need to load the required images in a local registry if
operating in an air-gapped environment.
4. Execute the following command to load the air-gapped image bundle into your private registry, applying any of the relevant variables above.
If you are not using ECR, use the relevant flags as shown in the example below: --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-password=${REGISTRY_PASSWORD}
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Important: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<username> --registry-mirror-password=<password>.
5. Load the Kommander component images to your private registry using the command.
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Optional: This step is required only if you have an Ultimate license.
For NKP Catalog Applications available with the Ultimate license, perform this image load by running the
following command to load the nkp-catalog-applications image bundle into your private registry:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-
v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME}
--to-registry-password=${REGISTRY_PASSWORD}
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note:
• If your environment uses HTTP or HTTPS proxies, include --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
• For Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make changes related to the installation paths.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
5. Ensure that the CAPV controllers are present using the command kubectl get pods -n capv-system
Output example:
NAME READY STATUS RESTARTS AGE
capv-controller-manager-785c5978f-nnfns 1/1 Running 0 13h
Note: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
Procedure
3. Use the following commands to set the environment variables for vSphere.
export VSPHERE_SERVER=example.vsphere.url
export [email protected]
export VSPHERE_PASSWORD=example_password
5. Generate the Kubernetes cluster objects by copying and editing this command to include the correct values,
including the VM template name you assigned in the previous procedure.
Note: NKP uses the vSphere CSI driver as the default storage provider. Use a Kubernetes CSI compatible storage that is suitable for production. For more information, see the Kubernetes documentation topic "Changing the Default Storage Class." If you are not using the default, you cannot deploy an alternate provider until after nkp create cluster is finished. However, this must be determined before the Kommander installation. For more information, see https://kubernetes.io/docs/concepts/storage/volumes/#volume-types and Changing the Default Storage Class.
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<username> --registry-mirror-password=<password>. For more information, see https://docs.docker.com/docker-hub/download-rate-limit/.
» Flatcar OS flag: for Flatcar OS, use --os-hint flatcar to instruct the bootstrap cluster to make changes related to the installation paths.
» HTTP: If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP or HTTPS Proxy on page 644.
» FIPS flags: To create a cluster in FIPS mode, inform the controllers of the appropriate image repository and version tags of the official Nutanix FIPS builds of Kubernetes by adding those flags to the nkp create cluster command.
» You can create individual manifest files with smaller manifests to ease editing using the --output-directory flag.
7. Create the cluster from the objects generated from the dry run. A warning will appear in the console if the
resource already exists, requiring you to remove the resource or update your YAML.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your NKP create .. --dry-run step above, create
the cluster from the objects you created by specifying the directory:
9. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command
to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                   True                     13h
├─ClusterInfrastructure - VSphereCluster/nutanix-e2e-cluster_name-1  True                     13h
├─ControlPlane - KubeadmControlPlane/nutanix-control-plane           True                     13h
│ ├─Machine/nutanix--control-plane-7llgd                             True                     13h
│ ├─Machine/nutanix--control-plane-vncbl                             True                     13h
│ └─Machine/nutanix--control-plane-wbgrm                             True                     13h
└─Workers
  └─MachineDeployment/nutanix--md-0                                  True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-67rv4                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-n2skc                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-nkftv                         True                     13h
    └─Machine/nutanix--md-0-74c849dc8c-sqklv                         True                     13h
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
You can now view resources in the moved cluster by using the --kubeconfig flag with
kubectl. For example: kubectl --kubeconfig=gcp-example.conf get nodes
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME READY SEVERITY
REASON SINCE MESSAGE
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Known Limitations
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
a. Access the Datastore tab in the vSphere client and select a datastore by name.
b. Copy the URL of that datastore from the information dialog displayed.
c. Return to the Nutanix Kubernetes Platform (NKP) CLI, and delete the existing StorageClass with the
command: kubectl delete storageclass vsphere-raw-block-sc
d. Run the following command to create a new StorageClass, supplying the correct values for your environment.
cat <<EOF > vsphere-raw-block-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: vsphere-raw-block-sc
provisioner: csi.vsphere.vmware.com
parameters:
  datastoreurl: "<url>"
volumeBindingMode: WaitForFirstConsumer
EOF
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The Nodes' Status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Prerequisites:
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. Enable NKP Catalog Applications and install Kommander: in the same kommander.yaml from the previous section, add these values for nkp-catalog-applications (if you are enabling NKP Catalog Apps).
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Note: If the Kommander installation fails or you wish to reconfigure applications, you can rerun the install command
to retry the installation.
Note: If you prefer the CLI not to wait for all applications to become ready, you can set the --wait=false flag.
The command first waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
Failed HelmReleases
Procedure
If an application fails to deploy, check the status of a HelmRelease using the command kubectl -n kommander get helmrelease <HELMRELEASE_NAME>.
If you find any HelmReleases in a "broken" release state, such as "exhausted" or "another rollback/release in progress", trigger a reconciliation of the HelmRelease by suspending and then resuming it:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Log in to the UI
Procedure
1. By default, you can log in to the Kommander UI with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. Retrieve the URL used for accessing the UI with the following.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/NKP/kommander/
dashboard{{ "\n"}}'
Only use these static credentials to access the UI for configuring an external identity provider. Treat them as backup credentials rather than using them for normal access.
Dashboard UI Functions
Procedure
After installing the Konvoy component, building a cluster, installing Kommander, and logging in to the UI, you are ready to customize configurations using the Cluster Operations Management section of the documentation. Most of this customization, such as attaching clusters and deploying applications, takes place in the dashboard or UI of Nutanix Kubernetes Platform (NKP). The Cluster Operations section allows you to manage cluster operations and their application workloads to optimize your organization's productivity.
Section Contents
Creating a node pool is useful when you need to run workloads that require machines with specific
resources, such as a GPU, additional memory, or specialized network or storage hardware.
Procedure
1. Set the environment variable to the name you assigned this cluster.
export CLUSTER_NAME=<my-vsphere-cluster>
2. If your workload cluster is self-managed, as described in Make the New Cluster Self-Managed, configure kubectl
to use the kubeconfig for the cluster.
export KUBECONFIG=${CLUSTER_NAME}.conf
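The node pool commands that follow also reference ${NODEPOOL_NAME}, which is not set above. A minimal sketch, using an arbitrary example name:
export NODEPOOL_NAME=example-nodepool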
Procedure
Create a new node pool with three replicas using this command.
nkp create nodepool vsphere ${NODEPOOL_NAME} \
--cluster-name=${CLUSTER_NAME} \
--network=example_network \
--data-center=example_datacenter \
--data-store=example_datastore \
--folder=example_folder \
--server=example_vsphere_api_server_url \
--resource-pool=example_resource_pool \
--vm-template=example_vm_template \
--replicas=3
Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally or store in
version control.
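For example, a sketch of that approach, reusing the flags from the command above and writing the objects to an existing directory (the directory name is illustrative):
nkp create nodepool vsphere ${NODEPOOL_NAME} \
  --cluster-name=${CLUSTER_NAME} \
  --network=example_network \
  --data-center=example_datacenter \
  --data-store=example_datastore \
  --folder=example_folder \
  --server=example_vsphere_api_server_url \
  --resource-pool=example_resource_pool \
  --vm-template=example_vm_template \
  --replicas=3 \
  --dry-run \
  --output-directory=nodepool-manifests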
List the node pools of a given cluster. This returns specific properties of each node pool so that you can
see the name of the MachineDeployments.
Procedure
To list all node pools for a managed cluster, run:
nkp get nodepools --cluster-name=${CLUSTER_NAME} --kubeconfig=${CLUSTER_NAME}.conf
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
Procedure
While running Cluster Autoscaler, you can manually scale your node pools up or down when you need
finite control over your environment.
Procedure
3. In a default cluster, the nodes to delete are selected at random. CAPI’s delete policy controls this behavior.
However, when using the Nutanix Kubernetes Platform (NKP) CLI to scale down a node pool, it is also possible
to specify the Kubernetes Nodes you want to delete.
To do this, set the flag --nodes-to-delete with a list of nodes as below. This adds an annotation cluster.x-
k8s.io/delete-machine=yes to the matching Machine object that contains status.NodeRef with the node
names from --nodes-to-delete.
nkp scale nodepools ${NODEPOOL_NAME} --replicas=3 --nodes-to-delete=<> --cluster-
name=${CLUSTER_NAME}
Output:
# Scaling node pool example to 3 replicas
Deleting a node pool deletes the Kubernetes nodes and the underlying infrastructure.
• This feature requires Python 3.5 or greater to be installed on all control plane hosts.
• Complete the Bootstrap Cluster topic.
To create a cluster with automated certificate renewal, you create a Konvoy cluster using the certificate-renew-interval flag. The certificate-renew-interval is the number of days after which Kubernetes-managed PKI certificates will be renewed. For example, a certificate-renew-interval value of 60 means the certificates will be renewed every 60 days.
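A sketch of such a command, assuming the interval is passed as the --certificate-renew-interval flag on nkp create cluster and reusing the vSphere flags from the earlier example:
nkp create cluster vsphere \
  --cluster-name ${CLUSTER_NAME} \
  --network <NETWORK_NAME> \
  --control-plane-endpoint-host <xxx.yyy.zzz.000> \
  --data-center <DATACENTER_NAME> \
  --data-store <DATASTORE_NAME> \
  --folder <FOLDER_NAME> \
  --server <VCENTER_API_SERVER_URL> \
  --ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
  --resource-pool <RESOURCE_POOL_NAME> \
  --virtual-ip-interface <ip_interface_name> \
  --vm-template <TEMPLATE_NAME> \
  --certificate-renew-interval=60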
Note:
cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size
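For context, these annotations are typically set on the MachineDeployment backing a node pool so that Cluster Autoscaler knows the allowed size range. A minimal sketch with illustrative bounds (the node pool name and values are examples, not defaults):
kubectl --kubeconfig=${CLUSTER_NAME}.conf annotate machinedeployment ${NODEPOOL_NAME} \
  cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size="2" \
  cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size="6"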
The full list of command line arguments to the Cluster Autoscaler controller is on the Kubernetes public GitHub
repository.
For more information about how Cluster Autoscaler works, see these documents:
1. Ensure the Cluster Autoscaler controller is up and running (no restarts and no errors in the logs)
kubectl --kubeconfig=${CLUSTER_NAME}.conf logs deployments/cluster-autoscaler
cluster-autoscaler -n kube-system -f
3. To demonstrate that it is working properly, create a large deployment that will trigger pending pods (For this
example, we used AWS m5.2xlarge worker nodes. If you have larger worker-nodes, you need to scale up the
number of replicas accordingly).
cat <<EOF | kubectl --kubeconfig=${CLUSTER_NAME}.conf apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-deployment
  labels:
    app: busybox
spec:
  replicas: 600
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: busybox:latest
          command:
            - sleep
            - "3600"
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
EOF
Cluster Autoscaler will scale up the number of Worker Nodes until there are no pending pods.
5. Cluster Autoscaler starts to scale down the number of Worker Nodes after the default timeout of 10 minutes.
Procedure
1. Create a Secret in the managed cluster containing a kubeconfig file for the master cluster, with user permissions limited to modifying resources only for the given cluster.
3. Add the following flag to the cluster-autoscaler command, where /mnt//masterconfig/value is the path at which the master cluster's kubeconfig is mounted through the Secret you created.
--cloud-config=/mnt//masterconfig/value
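A heavily hedged sketch of step 1, assuming the Secret is mounted at /mnt//masterconfig/ with the kubeconfig stored under the key value; the secret name, key, and namespace are assumptions chosen to match the mount path above, not values prescribed by NKP:
# Run against the managed cluster.
kubectl create secret generic masterconfig \
  --from-file=value=<master-cluster-kubeconfig> \
  --namespace kube-system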
Procedure
1. The bootstrap cluster will host the Cluster API controllers that reconcile the cluster objects marked for deletion.
Create a bootstrap cluster. To avoid using the wrong kubeconfig, the following steps use explicit kubeconfig paths
and contexts.
nkp create bootstrap --kubeconfig $HOME/.kube/config --with-vSphere-bootstrap-
credentials=true
2. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to the
bootstrap cluster. This process is also called a Pivot.
nkp move capi-resources \
--from-kubeconfig ${CLUSTER_NAME}.conf \
3. Use the cluster life cycle services on the workload cluster to check the workload cluster’s status.
nkp describe cluster --kubeconfig $HOME/.kube/config -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/nutanix-e2e-cluster_name-1                                   True                     13h
├─ClusterInfrastructure - VSphereCluster/nutanix-e2e-cluster_name-1  True                     13h
├─ControlPlane - KubeadmControlPlane/nutanix-control-plane           True                     13h
│ ├─Machine/nutanix--control-plane-7llgd                             True                     13h
│ ├─Machine/nutanix--control-plane-vncbl                             True                     13h
│ └─Machine/nutanix--control-plane-wbgrm                             True                     13h
└─Workers
  └─MachineDeployment/nutanix--md-0                                  True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-67rv4                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-n2skc                         True                     13h
    ├─Machine/nutanix--md-0-74c849dc8c-nkftv                         True                     13h
    └─Machine/nutanix--md-0-74c849dc8c-sqklv                         True                     13h
Procedure
1. Make sure your vSphere credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials vsphere --kubeconfig $HOME/.kube/config
Note:
Persistent Volumes (PVs) are not deleted automatically by design to preserve your data. However, the
PVs take up storage space if not deleted. You must delete PVs manually. Information for backup of a
cluster and PVs is on the Back up your Cluster's Applications and Persistent Volumes page.
2. To delete a cluster, use nkp delete cluster and pass in the name of the cluster you are trying to delete with the --cluster-name flag. Use kubectl get clusters to get the details (--cluster-name and --namespace) of the Kubernetes cluster to delete it.
kubectl get clusters
Note: Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer on the cluster. A load balancer backs each Service, and deleting the Service deletes the load balancer that backs it. To skip this step, use the flag --delete-kubernetes-resources=false. Do not skip this step if NKP manages the VPC: when NKP deletes the cluster, it deletes the VPC, and if the VPC still contains load balancers, the VPC cannot be deleted and NKP cannot delete the cluster.
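For example, a sketch of the deletion described in step 2, assuming the bootstrap cluster kubeconfig used earlier in this procedure:
nkp delete cluster --cluster-name=${CLUSTER_NAME} --kubeconfig $HOME/.kube/config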
Procedure
1. Make sure your vSphere credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials vsphere --kubeconfig $HOME/.kube/config
VM Placement Policy
The VCD Provider (SP) determines the VM Placement Policy. For more information, see https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-236A070E-83E6-4648-8F2F-557248C9735D.html. This policy defines the infrastructure where the VM runs, that is, the placement of a VM on a host. When you assign a VM Placement Policy to a virtual machine, the placement engine adds this virtual machine to the corresponding VM group of the cluster on which it resides.
Storage
For storage, follow Create a Base OS image in vSphere vCenter when creating the KIB base image. In VCD, the
base image determines the minimum storage available for the VM.
Section Contents
• Ensure you have met VMware Cloud Director (VCD) requirements: VMware Cloud Director 10.4 Release
Notes
• For CPU and Memory, the VCD Provider creates the appropriate VM Sizing Policies.
• The Provider (or tenant user with proper permissions) references these VM Sizing Policies when creating the
cluster, using the --control-plane-sizing-policy and --worker-sizing-policy flags.
• See Attributes of VM Sizing Policies regarding parameters like vCPU Speed or CPU Reservation Guarantee
that require consideration in VM Sizing Policy. For example, the recommended vCPU minimum speed is at
least 3 GHz.
• Considerations for Storage during configuration:
• For storage, follow Base OS image in vSphere when creating the KIB base image. In VCD, the base image
determines the minimum storage available for the VM.
Section Contents
VCD Overview
VMware Cloud Director (VCD) provides a private cloud-like experience in data centers. VCD allows the creation of virtual data centers for more self-service administration tasks.
VMware Cloud Director is a platform that turns physical data centers into virtual data centers and runs on one or more vCenters. After you have vSphere vCenter, the general workflow covered in this section is as follows:
• Creating the Organization (tenant) and its users. For more information, see https://docs.vmware.com/
en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/
GUID-2F217F99-48C1-42F3-BF06-5ABBACADE2BA.html
• Creating Roles, Rights, and other related permissions necessary for the tenant users and other software
components
• Creating base images and VM templates that will be uploaded to the vApp catalog of the VCD tenant Organization. For more information, see https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-D5737821-C3A4-4C73-8959-CA293C12A7DE.html
• Use Nutanix Kubernetes Platform (NKP) to create and manage clusters and node pools
An overview of the steps has been provided to help. The overall process for configuring VCD and NKP together
includes the following steps:
1. Configure vSphere to provide the elements described in the vSphere Prerequisites.
2. For air-gapped environments: create a bastion host. See Creating a Bastion Host on page 652.
3. Create a base OS image (for use in the OVA package containing the disk images packaged with the OVF).
4. Create a CAPI VM image template that uses the base OS image and adds the needed Kubernetes cluster
components.
5. Create a Tenant Organization
6. Configure Virtual Data Centers (VDCs)
7. MSP will upload the appropriate OVA to the tenant organization’s catalog
1. MSP uses konvoy-image build vsphere to create an OVA
2. MSP creates a vApp from the OVA
3. Before the MSP creates a cluster in a VCD tenant, it makes this vApp available to that tenant
Note: To see a visual architecture structure related to Tenants, Organization (or Tenant) VDCs, and Provider VDCs for VMware Cloud Director, refer to the VMware Cloud Director documentation under the title New to VMware Cloud Director. For more information, see https://docs.vmware.com/en/VMware-Cloud-Director/index.html
• Provider: Service Provider(SP) that administrates the data centers and provisions virtual infrastructure for
Organizations(tenants).
• Organizations (Tenants): An administration unit for users, groups, and computing resources. Tenant users are
managed at the Organization level.
• System Administrators: This role exists only in the provider organization (SP) and can create and provision tenant organizations and the portal. The System Administrator role has all VMware Cloud Director rights by default.
• Organization Administrators: This role creates user groups and service catalogs. It is a predefined tenant role that can manage users in their Organization and assign them roles.
• Rights: Each right provides view or manage access to a particular object type in VCD.
• Rights Bundle: A collection of rights for the Organization.
• Roles: A role is a set of rights assigned to one or more users and groups. When you create or import a user or
group, you must assign it a role.
• Users and Groups: Administrators can create users manually, programmatically, or integrate with a directory
service like LDAP to import user accounts and groups at scale.
• Virtual Data Centers (VDC): An isolated environment provided to a cloud user in which they can provision
resources, deploy, store, and operate applications.
• Organization VMware Cloud Director Networks: Similar to the Amazon concept of a Virtual Private Cloud, an Organization VMware Cloud Director network is available only to a specific Organization and to all vApps in that Organization. It can be connected to external networks as needed.
• vApp Networks: Similar to the concept of a subnet, a vApp network is an isolated network within a VMware
Cloud Director network that allows specific vApps to communicate with each other.
• vApp: One or more virtual machines (VMs) that come preconfigured to provide a specific cloud service. A vApp is a virtual app that defines computing, storage, and networking metadata.
• Media Files and Catalogs: VMware Cloud Director organizes deployable resources through media files.
Virtual machines and vApp templates (machine images) can be used as an initial boot program for a VM. The
Organization Administrator organizes these files into catalogs, allowing users within the Organization to
provision the resources they need.
• Storage Profiles: A VCD concept to organize storage. For more information, see https://docs.vmware.com/en/VMware-Cloud-Director/10.5/VMware-Cloud-Director-Tenant-Guide/GUID-17DDD3AC-0ABB-49EC-
Port Permissions
There are various VCD abstractions, such as Edge Gateway Firewall Rules, IP sets, and Security Groups, that work together. Many Kubernetes components in your Nutanix Kubernetes Platform (NKP) cluster require network access. Allow access to ports 6443 and 443 from outside the cluster. Other ports listed on the NKP ports page must be allowed among the machines in the cluster.
Section Contents
Procedure
In the System Administration menu:
• Under the Data Center tab, select Virtual Data Center. This is the location where you can define CPU size,
memory, and storage.
• Under the Data Center tab - Networking - Edges: select Configuration-External Networks and supply the publicly
accessible IP address range under the Subnet column.
Note: The Service Provider (SP) can create a shared Catalog where items are placed to be automatically imported.
• Under the Networking tab - Networks: the tenant Organization Administrator will configure a network. Select the
network name to be taken to its General properties, such as Gateway CIDR address, where all VMs will receive an
IP address from the private IP space.
Note: The LoadBalancer (LB) will use the routable external network, which is automatically created using the
CAPVCD controller.
• Under the Resources tab - Edge Gateway - Services - NAT - External IP: Specify the IP address that allows VMs
to access external networks. Either create an SNAT rule or provide Egresses to the VMs.
• Edge Gateway Firewall
Important: Allow port access from outside the cluster for TCP 6443 to the control plane endpoint load balancers and TCP 443 to Kubernetes load balancers (for example, to reach the Nutanix Kubernetes Platform (NKP) dashboard). Other ports must be allowed access among the cluster machines. The tenant must obtain one public IP to create an Edge Gateway. The Service Provider (SP) allocates the pool of IPs from which the tenant pulls. After you have associated an external network through a gateway, the tenant can request IPs. Otherwise, if you have chosen an IP address, you can specify it.
Note: After the tenant Organization is in production, various policies will need to be defined for storage and resources.
The Configure Organization Policy Section of VMware documentation will provide more detail. For more information,
see https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-
Provider-Admin-Portal-Guide/GUID-DBA11D01-E102-47A4-926C-BFDB681B75F2.html
• Rights: Provide view or manage access to a particular object type in VMware Cloud Director and belong to
different categories depending on the objects to which they relate, such as vApp, Catalog, or an Organization. For
more information, see https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-
Tenant-Portal-Guide/GUID-9B462637-E350-43B6-989A-621F226A56D4.html.
• Roles: A collection of rights assigned to a user that defines what that individual user can access. For more information,
see https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-Tenant-Portal-
Guide/GUID-9B462637-E350-43B6-989A-621F226A56D4.html.
• Rights Bundles: A collection of rights for the tenant Organization as a whole and defines what a tenant
Organization can access. For more information, see https://blogs.vmware.com/cloudprovider/2019/12/
effective-rights-bundles.html.
Various Rights are common to multiple predefined Global Roles. These Rights are granted by default to all new
organizations and are available for use in other Roles created by the tenant Organization Administrator. The VMware
documentation explains some predefined Roles for both Provider and Tenant.
• vCenter/NXT/AVI Infrastructure System Administrator: Manages the physical infrastructure for vCenter, the NXT network fabric,
and AVI load balancers (Nutanix SRE or Service Provider (OVH cloud)).
• Create Users: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-1CACBB2E-FE35-4662-A08D-D2BCB174A43C.html
• Managing System Administrators and Roles: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-9DFB2238-23FB-4D07-B563-144AC4E9EDAF.html
• Manage Users: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-A358E190-BFC0-4187-9406-66E82C92564A.html
• Organization Administrator: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-BC504F6B-3D38-4F25-AACF-ED584063754F.html#GUID-BC504F6B-3D38-4F25-AACF-ED584063754F
• Predefined Roles and Their Rights: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-BC504F6B-3D38-4F25-AACF-ED584063754F.html
Related Information
The CAPVCD provider uses a related component called CSE. Some of the permissions necessary to create a VCD
cluster are defined using this component. Note that the term Role Based Access Control (RBAC) used in the CSE
documentation refers ONLY to the VCD rights and permissions necessary to perform life cycle management of
Kubernetes clusters using VCD. It has no impact on the RBAC configuration of any clusters created using VCD.
Role Based Access Control (RBAC) from GitHub: The RBAC on that page refers to the roles and rights required for
the tenants to perform the life cycle management of Kubernetes clusters. It has nothing to do with the RBAC
inside the Kubernetes cluster itself.
For more information, see the related RBAC links:
• CSE: https://vmware.github.io/container-service-extension/cse3_1/INTRO.html
• Role Based Access Control (RBAC): https://vmware.github.io/container-service-extension/cse3_1/
RBAC.html#additional-required-rights
The System Administrator role exists only in the provider organization. For more information, see https://
docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-
Guide/GUID-438B2F8C-65B0-4895-AF40-6506E379A89D.html
As a Service Provider (SP), you will have vSphere vCenter and VMware Cloud Director (VCD) roles.
• vCenter, NXT or AVI Infrastructure System Administrator: Manages physical infra for vCenter, NXT
network fabric, AVI load balancers (Nutanix SRE or Service provider (OVH cloud))
• System Administrator - Provider (SP): Manages Virtual infra in VCD that uses vCenter(s), NXT(s), AVI(s),
etc. (Nutanix SRE)
• Organization (Tenant) Administrator: Manages the virtual infrastructure (org, orgvdc, network, catalogs, templates,
users, and so on) for a tenant. These users can create Kubernetes clusters.
As an Organization Administrator, from the tenant portal you can create, edit, import, and delete users. The
tenant organization administrator manages the virtual infrastructure that includes the organization itself, including the
related network, catalogs, templates, and such for the tenant organization. Important predefined role information is
below:
• Tenant Organization Administrator is predefined and can use the VCD tenant portal to create and manage
users in their organization and assign them roles.
• The vApp Author role can use catalogs and create vApps.
A tenant Organization Administrator can access roles if allowed. They can only view the global tenant roles
that a System Administrator has published to the organization but cannot modify them. The Organization
Administrator can create custom tenant roles with similar rights and assign them to the users within their tenant
Organization.
There are some predefined Global Tenant Roles as well, which are explained in the VMware documentation: https://
docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-
Guide/GUID-BC504F6B-3D38-4F25-AACF-ED584063754F.html as well as the https://docs.vmware.com/
en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-
AE42A8F6-868C-4FC0-B224-87CA0F3D6350.html.
Procedure
1. Create a Rights Bundle (a collection of rights) that includes all required Rights. See Creating
Tenant Roles and Rights on page 921. For more information, see https://docs.vmware.com/en/
VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-
CFB0EFEE-0D4C-498D-A937-390811F11B8E.html.
2. Create a Tenant Role that uses all these Rights. For more information, see https://docs.vmware.com/
en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/
GUID-0D991FCF-3800-461D-B123-FAE7CFF34216.html.
• The CAPVCD provider uses a related component called CSE; see https://vmware.github.io/container-service-
extension/cse3_1/INTRO.html.
• Permissions necessary to create a VCD cluster are defined using this component. Note that the term Role Based
Access Control (RBAC) used in the CSE documentation refers ONLY to the VCD rights and permissions
necessary to perform life cycle management of Kubernetes clusters using VCD. It does not impact the RBAC
configuration of any clusters created using VCD. For more information, see https://vmware.github.io/container-
service-extension/cse3_1/RBAC.html#additional-required-rights.
• Role Based Access Control (RBAC) from GitHub: The RBAC on that page refers to the roles and rights
required for the tenants to manage Kubernetes clusters' life cycle. It does not have anything to do with the RBAC
inside the Kubernetes cluster itself; see https://vmware.github.io/container-service-extension/cse3_1/
RBAC.html#additional-required-rights.
CAPVCD User
CAPVCD uses the credentials (username, password, or API token) of a VCD User to manage the cluster. The VCD
Cloud Provider Interface (CPI) and CSI controllers also use the same credentials. This same user needs specific API
permissions, known as Rights, for the CAPVCD, CPI, and CSI controllers to work correctly. For a User to be granted
a Right, the User must be associated with a Role that consumes this Right, meaning the Role grants Rights to the
User.
If the User belongs to an Organization, the Provider must publish the Right to the Tenant Organization in a Rights
Bundle. The Rights Bundle grants Rights to the tenant Organization.
For more information, see https://github.com/vmware/cloud-provider-for-cloud-director/tree/main#terminology.
This page describes the Rights and Rights Bundles that the CAPVCD User requires.
1. A provider administrator creates a rights bundle that enumerates all the rights listed below. We recommend the
name Nutanix Kubernetes Platform (NKP) Cluster Admin for the Rights Bundle.
2. A Provider administrator creates a Global Role that enumerates all the below rights. We recommend the name
NKP Cluster Admin for the Global Role.
3. A Provider administrator publishes both the Rights Bundle and Global Role to every Organization that will deploy
NKP clusters.
4. An Organization administrator creates a User and associates it with the Global Role.
List of Rights
Cluster API for VMware Cloud Director (CAPVCD) requires the following Rights:
vSphere Prerequisites
• Resource Requirements
• Installing NKP on page 47
• Prerequisites for Install
Section Contents
Refer to the VMware documentation for specifics on creating and deploying OVAs for your tenants: Deploying OVF
and OVA Templates and Working with vApp Templates.
If you have not already done so, create your base image and template in vSphere and import them into the VCD
tenant catalog to make them available for tenant use.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: Use --http-proxy, --https-proxy, and --no-proxy and their related values in this command for
it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
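If your environment sits behind a proxy, as described in the note above, the same command with the proxy flags looks like this sketch (the proxy address and no-proxy list are placeholders):
nkp create bootstrap --kubeconfig $HOME/.kube/config \
  --http-proxy=http://proxy.example.internal:3128 \
  --https-proxy=http://proxy.example.internal:3128 \
  --no-proxy=127.0.0.1,localhost,.svc,.cluster.local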
4. NKP then deploys the following Cluster API providers on the cluster.
5. Ensure that the CAPV controllers are present using the command kubectl get pods -n capv-system.
Output example:
NAME READY STATUS RESTARTS AGE
capv-controller-manager-785c5978f-nnfns 1/1 Running 0 13h
6. NKP waits until these providers' controller-manager and webhook deployments are ready. List these deployments
using the command kubectl get --all-namespaces deployments -l=clusterctl.cluster.x-
k8s.io.
Output example:
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capa-system                         capa-controller-manager                         1/1     1            1           1h
capg-system                         capg-controller-manager                         1/1     1            1           1h
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           1h
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           1h
capi-system                         capi-controller-manager                         1/1     1            1           1h
cappp-system                        cappp-controller-manager                        1/1     1            1           1h
capv-system                         capv-controller-manager                         1/1     1            1           1h
capz-system                         capz-controller-manager                         1/1     1            1           1h
cert-manager                        cert-manager                                    1/1     1            1           1h
cert-manager                        cert-manager-cainjector                         1/1     1            1           1h
cert-manager                        cert-manager-webhook                            1/1     1            1           1h
When creating a VCD cluster, CPU and Memory flags are needed:
• For CPU and Memory, the VCD Provider creates the appropriate VM Sizing Policies. Then, the Provider
references these VM Sizing Policies when creating the cluster, using the flags:
• --control-plane-sizing-policy
• --worker-sizing-policy
If the Service Provider (SP) has given a tenant user the permissions to create clusters inside their own Organization,
that tenant user needs to reference those flags as well.
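For illustration, a sketch of how the sizing-policy flags are passed on cluster creation; only the two flags named above are shown, the policy names are placeholders, and any other flags your VCD environment requires are omitted:
nkp create cluster vcd \
  --cluster-name=${CLUSTER_NAME} \
  --control-plane-sizing-policy=<control-plane-vm-sizing-policy> \
  --worker-sizing-policy=<worker-vm-sizing-policy>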
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/vcd-example                                                  True                     14s
├─ClusterInfrastructure - vcdCluster/vcd-example
├─ControlPlane - KubeadmControlPlane/vcd-example-control-plane       True                     14s
│ ├─Machine/vcd-example-control-plane-6fbzn                          True                     17s
│ │ └─MachineInfrastructure - vcdMachine/vcd-example-control-plane-62g6s
│ ├─Machine/vcd-example-control-plane-jf6s2                          True                     17s
│ │ └─MachineInfrastructure - vcdMachine/vcd-example-control-plane-bsr2z
│ └─Machine/vcd-example-control-plane-mnbfs                          True                     17s
│   └─MachineInfrastructure - vcdMachine/vcd-example-control-plane-s8xsx
└─Workers
  └─MachineDeployment/vcd-example-md-0                               True                     17s
    ├─Machine/vcd-example-md-0-68b86fddb8-8glsw                      True                     17s
    │ └─MachineInfrastructure - vcdMachine/vcd-example-md-0-zls8d
    ├─Machine/vcd-example-md-0-68b86fddb8-bvbm7                      True                     17s
    │ └─MachineInfrastructure - vcdMachine/vcd-example-md-0-5zcvc
    ├─Machine/vcd-example-md-0-68b86fddb8-k9499                      True                     17s
    │ └─MachineInfrastructure - vcdMachine/vcd-example-md-0-k8h5p
    └─Machine/vcd-example-md-0-68b86fddb8-l6vfb                      True                     17s
      └─MachineInfrastructure - vcdMachine/vcd-example-md-0-9h5vn
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
a. Access the Datastore tab in the vSphere client and select a datastore by name.
b. Copy the URL of that datastore from the information dialog displayed.
c. Return to the Nutanix Kubernetes Platform (NKP) CLI, and delete the existing StorageClass with the
command: kubectl delete storageclass vsphere-raw-block-sc
d. Run the following command to create a new StorageClass, supplying the correct values for your environment.
cat <<EOF > vsphere-raw-block-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: vsphere-raw-block-sc
provisioner: csi.vsphere.vmware.com
parameters:
  datastoreurl: "<url>"
volumeBindingMode: WaitForFirstConsumer
EOF
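To apply the StorageClass after writing the file, a follow-up step implied by the procedure, you can create it from the manifest:
kubectl create -f vsphere-raw-block-sc.yaml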
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The Nodes' Status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Output:
NAME STATUS ROLES AGE VERSION
aws-example-control-plane-9z77w Ready control-plane,master 4m44s v1.27.6
aws-example-control-plane-rtj9h Ready control-plane,master 104s v1.27.6
aws-example-control-plane-zbf9w Ready control-plane,master 3m23s v1.27.6
Procedure
1. To begin configuring Kommander, run the following command to initialize a default configuration file.
2. After the initial deployment of Kommander, you can find the application Helm Charts by checking the
spec.chart.spec.sourceRef field of the associated HelmRelease.
kubectl get helmreleases <application> -o yaml -n kommander
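For example, a sketch of inspecting the chart source for the centralized-grafana application used below (the grep filter is just one convenient way to narrow the output):
kubectl get helmreleases centralized-grafana -o yaml -n kommander | grep -A 3 "sourceRef:"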
Inline configuration (using values):
In this example, you configure the centralized-grafana application with resource limits by defining the Helm
Chart values in the Kommander configuration file.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  centralized-grafana:
    values: |
3. If you have an Ultimate License, you can enable NKP Catalog Applications and install Kommander in the same
kommander.yaml from the previous section. Add these values (if you are enabling NKP Catalog Apps) for NKP-
catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
        url: https://github.com/mesosphere/NKP-catalog-applications
        ref:
          tag: v2.12.0
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Section Contents
Procedure
1. Ensure the Cluster Autoscaler controller is up and running (no restarts and no errors in the logs).
kubectl --kubeconfig=${CLUSTER_NAME}.conf logs deployments/cluster-autoscaler
cluster-autoscaler -n kube-system -f
3. The Cluster Autoscaler logs will show that the worker nodes are associated with node groups and that pending
pods are being watched.
5. Cluster Autoscaler will scale up the number of Worker Nodes until there are no pending pods.
7. Cluster Autoscaler starts to scale down the number of Worker Nodes after the default timeout of 10 minutes.
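To exercise the scale-up and scale-down behavior described in the steps above, one sketch is to create a throwaway deployment and scale it beyond current capacity so that pods become Pending (the deployment name and image are illustrative, not part of NKP):
kubectl --kubeconfig=${CLUSTER_NAME}.conf create deployment autoscaler-test --image=nginx
kubectl --kubeconfig=${CLUSTER_NAME}.conf scale deployment autoscaler-test --replicas=50
# Watch the Pending pods shrink as the Cluster Autoscaler adds worker nodes; delete the deployment afterward to trigger scale-down.
kubectl --kubeconfig=${CLUSTER_NAME}.conf get pods -l app=autoscaler-test --watch
kubectl --kubeconfig=${CLUSTER_NAME}.conf delete deployment autoscaler-test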
Section Contents
Procedure
1. Set the environment variable to the name you assigned this cluster using the command export
CLUSTER_NAME=vcd-example.
2. If your workload cluster is self-managed, as described in Make the New Cluster Self-Managed, configure
kubectl to use the kubeconfig for the cluster.
export KUBECONFIG=${CLUSTER_NAME}.conf
Procedure
Set the --zone flag to a zone in the same region as your cluster.
nkp create nodepool vcd ${NODEPOOL_NAME} \
--cluster-name=${CLUSTER_NAME} \
--image $IMAGE_NAME \
--zone us-west1-b \
--replicas=3
This example uses default values for brevity. Use flags to define custom instance types and other properties.
machinedeployment.cluster.x-k8s.io/example created
## Creating default/example nodepool resources
vcdmachinetemplate.infrastructure.cluster.x-k8s.io/example created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/example created
# Creating default/example nodepool resources
Advanced users can use a combination of the --dry-run and --output=yaml or --output-directory=<existing-directory> flags
to get a complete set of node pool objects to modify locally or store in version control.
Node pools are part of a cluster and managed as a group, and you can use a node pool to manage a group of machines
using the same common properties. When Konvoy creates a new default cluster, there is one node pool for the worker
nodes, and all nodes in that new node pool have the same configuration. You can create additional node pools for
more specialized hardware or configuration. For example, suppose you want to tune your memory usage on a cluster
where you need maximum memory for some machines and minimal memory on other machines. In that case, you
create a new node pool with those specific resource needs.
While you can run Cluster Autoscaler, you can also manually scale your node pools up or down when you need more
finite control over your environment. For example, if you require 10 machines to run a process, you can
manually set the scaling to run exactly those 10 machines. When using the Cluster Autoscaler, however, you must stay within your
minimum and maximum bounds.
Procedure
1. To scale up a node pool in a cluster, use the command nkp scale nodepool ${NODEPOOL_NAME} --
replicas=5 --cluster-name=${CLUSTER_NAME}.
2. After a few minutes, you can list the node pools using the command nkp get nodepool --cluster-name=
${CLUSTER_NAME}.
Example output showing that the number of DESIRED and READY replicas increased to 5:
NODEPOOL           DESIRED   READY   KUBERNETES VERSION
example            5         5       v1.28.7
vcd-example-md-0   4         4       v1.28.7
Procedure
1. To scale down a node pool in a cluster, use the command nkp scale nodepool ${NODEPOOL_NAME} --
replicas=4 --cluster-name=${CLUSTER_NAME}.
Example output:
# Scaling node pool example to 4 replicas
2. After a few minutes, you can list the node pools using the command nkp get nodepool --cluster-name=
${CLUSTER_NAME}.
Example output showing the number of DESIRED and READY replicas decreased to 4:
NODEPOOL   DESIRED   READY   KUBERNETES VERSION
example    4         4       v1.28.7
a. To do this, set the flag --nodes-to-delete with a list of nodes, as shown below. This adds an annotation
cluster.x-k8s.io/delete-machine=yes to the matching Machine object that contains
status.NodeRef with the node names from --nodes-to-delete.
nkp scale nodepool ${NODEPOOL_NAME} --replicas=3 --nodes-to-delete=<> --cluster-name=${CLUSTER_NAME}
# Scaling node pool example to 3 replicas
Procedure
1. To delete a node pool from a managed cluster, run the command nkp delete nodepool ${NODEPOOL_NAME}
--cluster-name=${CLUSTER_NAME}.
Section Contents
Procedure
1. Create a bootstrap cluster using the command nkp create bootstrap --vcd-bootstrap-
credentials=true --kubeconfig $HOME/.kube/config.
The bootstrap cluster will host the Cluster API controllers that reconcile the cluster objects marked for deletion.
Note: To avoid using the wrong kubeconfig, the following steps use explicit kubeconfig paths and contexts.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
2. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to
the bootstrap cluster. This process is called a Pivot. For more information, see https://cluster-api.sigs.k8s.io/
reference/glossary.html?highlight=pivot#pivot.
nkp move capi-resources \
--from-kubeconfig ${CLUSTER_NAME}.conf \
--to-kubeconfig $HOME/.kube/config
3. Use the cluster life cycle services on the bootstrap cluster to check the workload cluster status.
nkp describe cluster --kubeconfig $HOME/.kube/config -c ${CLUSTER_NAME}
NAME READY SEVERITY REASON SINCE MESSAGE
Note: After moving the cluster life cycle services to the bootstrap cluster, remember to use nkp with the bootstrap
cluster kubeconfig.
Procedure
Note: Persistent Volumes (PVs) are not deleted automatically by design to preserve your data. However, they take
up storage space if not deleted. You must delete PVs manually. Information for the backup of a cluster and PVs is
in the documentation called Cluster Applications and Persistent Volumes Backup on page 517.
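For example, a sketch of cleaning up leftover Persistent Volumes manually after backing up any data you need (the PV names are illustrative):
kubectl get pv
kubectl delete pv <pv-name-1> <pv-name-2>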
2. Use nkp with the bootstrap cluster to delete the workload cluster. Delete the Kubernetes cluster and wait a few
minutes.
Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer on the
cluster. To skip this step, use the flag --delete-kubernetes-resources=false.
nkp delete cluster --kubeconfig $HOME/.kube/config --cluster-name ${CLUSTER_NAME}
# Deleting Services with type LoadBalancer for Cluster default/vcd-example
# Deleting ClusterResourceSets for Cluster default/vcd-example
# Deleting cluster resources
# Waiting for cluster to be fully deleted
Procedure
Delete the bootstrap cluster using the command nkp delete bootstrap --kubeconfig $HOME/.kube/
config.
Example output:
# Deleting bootstrap cluster
Procedure
2. Get all of the PVCs that you need to delete from the Applications deployed by Kommander.
kubectl get pvc --namespace ${WORKSPACE_NAMESPACE}
3. Delete those PVCs and the Managed cluster deletion process will continue.
kubectl delete pvc --namespace ${WORKSPACE_NAMESPACE} <pvc-name-1> <pvc-name-2>
Known Limitations
• The Nutanix Kubernetes Platform (NKP) version used to create the workload cluster must match the NKP version
used to delete the workload cluster.
Section Contents
GCP Prerequisites
Before beginning a Nutanix Kubernetes Platform (NKP) installation, verify that you have the following:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• kubectl for interacting with the running cluster.
• Install the GCP gcloud CLI by following the Install the gcloud CLI instructions in the Google Cloud CLI documentation.
Worker Nodes
You must have at least four worker nodes. The specific number of worker nodes required for your environment can
vary depending on the cluster workload and size of the nodes. Each worker node needs to have at least the following:
GCP Prerequisites
• Verify that your Google Cloud project does not enable the OS Login feature.
GCP projects may have the Enable OS Login feature enabled by default. If this feature is enabled, KIB cannot ssh to
the VM instances it creates, and the image creation fails.
• The user creating the Service Accounts needs additional privileges besides the Editor role. For more information,
see GCP Roles.
Section Contents
GCP Roles
Service accounts are a special type of Google account that grants permissions to virtual machines instead of end users.
The primary purpose of Service accounts is to ensure safe, managed connections to APIs and Google Cloud services.
These roles are needed when creating an image using Konvoy Image Builder.
Role Options
• Either create a new GCP service account or retrieve credentials from an existing one.
• (Option 1) Create a GCP Service Account using the following gcloud commands:
export GCP_PROJECT=<your GCP project ID>
export GCP_SERVICE_ACCOUNT_USER=<some new service account user>
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.gcloud/credentials.json"
gcloud iam service-accounts create "$GCP_SERVICE_ACCOUNT_USER" --project=$GCP_PROJECT
gcloud projects add-iam-policy-binding $GCP_PROJECT \
  --member="serviceAccount:$GCP_SERVICE_ACCOUNT_USER@$GCP_PROJECT.iam.gserviceaccount.com" \
  --role=roles/editor
gcloud iam service-accounts keys create $GOOGLE_APPLICATION_CREDENTIALS \
  --iam-account="$GCP_SERVICE_ACCOUNT_USER@$GCP_PROJECT.iam.gserviceaccount.com"
• (Option 2) Retrieve the credentials for an existing service account using the following gcloud commands:
export GCP_PROJECT=<your GCP project ID>
export GCP_SERVICE_ACCOUNT_USER=<existing service account user>
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.gcloud/credentials.json"
gcloud iam service-accounts keys create $GOOGLE_APPLICATION_CREDENTIALS \
  --iam-account="$GCP_SERVICE_ACCOUNT_USER@$GCP_PROJECT.iam.gserviceaccount.com"
To create a GCP Service Account with the Editor role, the user creating the GCP Service Account needs
the Editor, RoleAdministrator, and SecurityAdmin roles. However, those pre-defined roles grant more
permissions than the minimum set needed to create a Nutanix Kubernetes Platform (NKP) cluster. Granting
unnecessary permissions can lead to potential security risks and should be avoided.
Note: For NKP cluster creation, a minimal set of roles and permissions needed for the user creating the GCP Service
Account is the Editor role plus the following additional permissions:
• compute.disks.setIamPolicy
• compute.instances.setIamPolicy
• iam.roles.create
• iam.roles.delete
• iam.roles.update
• iam.serviceAccounts.setIamPolicy
• resourcemanager.projects.setIamPolicy
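As a sketch, the additional permissions listed above could be bundled into a custom role with gcloud and granted, together with Editor, to the user who creates the Service Account (the role ID and title here are illustrative):
gcloud iam roles create nkpServiceAccountCreator \
  --project=${GCP_PROJECT} \
  --title="NKP Service Account Creator" \
  --permissions=compute.disks.setIamPolicy,compute.instances.setIamPolicy,iam.roles.create,iam.roles.delete,iam.roles.update,iam.serviceAccounts.setIamPolicy,resourcemanager.projects.setIamPolicy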
Note: Google Cloud Platform does not publish images. You must first build the image using Konvoy Image Builder.
Explore the Customize your Image topic for more options. For more information regarding using the image in creating
clusters, refer to the GCP Infrastructure section of the documentation.
NKP Prerequisites
Before you begin, you must:
• Download the KIB bundle for your Nutanix Kubernetes Platform (NKP) version.
• Check the Supported Infrastructure Operating Systems.
• Check the Supported Kubernetes version for your provider in Upgrade NKP on page 1089.
• Create a working Docker or other registry setup.
• On Debian-based Linux distributions, install a version of the cri-tools package compatible with the Kubernetes
and container runtime versions. For more information, see https://github.com/kubernetes-sigs/cri-tools.
Note: GCP projects may have the Enable OS Login feature enabled by default. If this feature is enabled, KIB cannot
ssh to the VM instances it creates, and the image creation fails.
To check if it is enabled, use the Google commands to inspect the metadata configured in your project. If
you find the enable-oslogin flag set to TRUE, you must remove or set it to FALSE to use KIB. For
more information on Set and Remove Custom Metadata, see https://cloud.google.com/compute/docs/
metadata/setting-custom-metadata#console_2
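A sketch of checking and clearing the flag with the gcloud CLI, assuming GCP_PROJECT is set and you are authenticated:
# Inspect project-wide metadata for the enable-oslogin key.
gcloud compute project-info describe --project ${GCP_PROJECT} \
  --format="value(commonInstanceMetadata.items)"
# If enable-oslogin is TRUE, remove the key (or set it to FALSE) so KIB can ssh to its build instances.
gcloud compute project-info remove-metadata --project ${GCP_PROJECT} --keys=enable-oslogin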
GCP Prerequisites
Verify that your Google Cloud project does not enable the OS Login feature.
Note:
GCP projects may have the Enable OS Login feature enabled by default. If this feature is enabled, KIB
cannot ssh to the VM instances it creates, and the image creation fails.
To check if it is enabled, use the Google commands to inspect the metadata configured in your project. If
you find the enable-oslogin flag set to TRUE, you must remove or set it to FALSE to use KIB. For
more information on Set and Remove Custom Metadata, see https://cloud.google.com/compute/docs/
metadata/setting-custom-metadata#console_2
• The user creating the Service Accounts needs additional privileges besides the Editor role. For more
information, see GCP Roles.
Bootstrapping GCP
NKP creates Kubernetes clusters using Cluster API (CAPI) controllers, which run on a Kubernetes cluster.
Procedure
1. Complete the Nutanix Infrastructure Prerequisites. For more information, see Nutanix Infrastructure
Prerequisites on page 657.
Procedure
1. Review Universal Configurations for all Infrastructure Providers regarding settings, flags, and other choices
and then begin bootstrapping.
2. Create a bootstrap cluster using the command nkp create bootstrap --kubeconfig $HOME/.kube/
config.
Note: Use --http-proxy, --https-proxy, and --no-proxy and their related values in this command for
it to be successful. For more information, see Configuring an HTTP or HTTPS Proxy on page 644.
Example output:
# Creating a bootstrap cluster
# Initializing new CAPI components
4. NKP then deploys the following Cluster API providers on the cluster.
Note: The cluster name can only contain the following characters: a-z, 0-9, ., and -. Cluster creation fails if
the name has capital letters. For more naming information, see https://kubernetes.io/docs/concepts/overview/
working-with-objects/names/.
Note: NKP uses the GCP CSI driver as the default storage provider. Use a Kubernetes CSI compatible storage
that is suitable for production.
Procedure
Note: To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by
setting the following flags on the nkp create cluster command: --registry-mirror-url=https://registry-1.docker.io
--registry-mirror-username=<username> --registry-mirror-password=<password>.
For more information, see https://docs.docker.com/docker-hub/download-rate-limit/
Note: Google Cloud Platform does not publish images. You must first build the image using Konvoy Image
Builder.
Procedure
1. Create an image using Konvoy Image Builder (KIB) and then export the image name.
export IMAGE_NAME=projects/${GCP_PROJECT}/global/images/<image_name_from_kib>
2. Ensure your subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If
you need to change the Kubernetes subnets, you must do so at cluster creation. The default subnets used in NKP
are:
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
» Configure Additional Options for your environment if needed; otherwise, proceed to the next step to create your cluster.
(Optional) Modify Control Plane Audit logs - Users can modify the KubeadmControlplane cluster-API
object to configure different kubelet options. See the following guide if you wish to configure your control
plane beyond the existing options available from flags.
» (Optional) Determine what VPC Network to use. All GCP accounts come with a default preconfigured VPC
Network, which will be used if you do not specify a different network. To use a different VPC network for
your cluster, create one by following these instructions for Create and Manage VPC Networks. Then select the
--network <new_vpc_network_name> option on the create cluster command below. More information is
available on GCP Cloud Nat and network flag.
» (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when
pulling images previously pushed to your registry.
3. Create a Kubernetes cluster object with a dry run output for customizations. The following example shows a
common configuration.
nkp create cluster gcp \
--cluster-name=${CLUSTER_NAME} \
--additional-tags=owner=$(whoami) \
--with-gcp-bootstrap-credentials=true \
--project=${GCP_PROJECT} \
Note: More flags can be added to the nkp create cluster command for more options. See Choices below or
refer to the topic Universal Configurations:
» If your environment uses HTTP or HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP or HTTPS Proxy on page 644
» You can create individual, smaller manifest files for easier editing by using the --
output-directory flag.
4. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing the cluster objects, as edits
can prevent the cluster from deploying successfully.
5. Create the cluster from the objects generated from the dry run. A warning will appear in the console if the
resource already exists, requiring you to remove the resource or update your YAML.
kubectl create -f ${CLUSTER_NAME}.yaml
Note: If you used the --output-directory flag in your nkp create ... --dry-run step above, create
the cluster from the objects you created by specifying the directory:
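For example, assuming the manifests were written to a hypothetical ./gcp-cluster-objects directory:
kubectl create -f ./gcp-cluster-objects/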
7. After the objects are created on the API server, the Cluster API controllers reconcile them. They create
infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command
to describe the current status of the cluster.
nkp describe cluster -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/gcp-example                                                  True                     52s
├─ClusterInfrastructure - GCPCluster/gcp-example
├─ControlPlane - KubeadmControlPlane/gcp-example-control-plane       True                     52s
│ ├─Machine/gcp-example-control-plane-6fbzn                          True                     2m32s
│ │ └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-62g6s
│ ├─Machine/gcp-example-control-plane-jf6s2                          True                     7m36s
│ │ └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-bsr2z
│ └─Machine/gcp-example-control-plane-mnbfs                          True                     54s
│   └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-s8xsx
└─Workers
  └─MachineDeployment/gcp-example-md-0                               True                     78s
    └─Machine/gcp-example-md-0-68b86fddb8-8glsw                      True                     2m49s
Note: NKP uses the GCP CSI driver as the default storage provider. Use a Kubernetes CSI compatible
storage that is suitable for production. For more information, see the Kubernetes documentation Changing the
Default Storage Class. If you’re not using the default, you cannot deploy an alternate provider until after the
nkp create cluster is finished. However, this must be determined before the Kommander installation.
Note: If you already have a self-managed or Management cluster in your environment, skip this page.
Follow these steps to turn your new cluster into a Management Cluster for an Ultimate license environment (or a
free-standing Pro Cluster):
Procedure
Note: If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-
proxy, and --no-proxy and their related values in this command for it to be successful. More information is
available in Configuring an HTTP/HTTPS Proxy.
2. The cluster life cycle services on the workload cluster are ready, but the workload cluster configuration is on the
bootstrap cluster. The move command moves the configuration, which takes the form of Cluster API Custom
Resource objects, from the bootstrap to the workload cluster. This process is called a Pivot. For more information,
see https://cluster-api.sigs.k8s.io/reference/glossary.html?highlight=pivot#pivot.
Move the Cluster API objects from the bootstrap to the workload cluster:
nkp move capi-resources --to-kubeconfig ${CLUSTER_NAME}.conf
Output:
# Moving cluster resources
Note: To ensure only one set of cluster life cycle services manages the workload cluster, NKP first pauses the
reconciliation of the objects on the bootstrap cluster, then creates the objects on the workload cluster. As NKP
copies the objects, the cluster life cycle services on the workload cluster reconcile the objects. The workload cluster
becomes self-managed after NKP creates all the objects. If it fails, the move command can be safely retried.
4. Use the cluster life cycle services on the workload cluster to check the workload cluster status. After moving the
cluster life cycle services to the workload cluster, remember to use NKP with the workload cluster kubeconfig.
nkp describe cluster --kubeconfig ${CLUSTER_NAME}.conf -c ${CLUSTER_NAME}
Output:
NAME                                                                 READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/gcp-example                                                  True                     14s
├─ClusterInfrastructure - GCPCluster/gcp-example
├─ControlPlane - KubeadmControlPlane/gcp-example-control-plane       True                     14s
│ ├─Machine/gcp-example-control-plane-6fbzn                          True                     17s
│ │ └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-62g6s
│ ├─Machine/gcp-example-control-plane-jf6s2                          True                     17s
│ │ └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-bsr2z
│ └─Machine/gcp-example-control-plane-mnbfs                          True                     17s
│   └─MachineInfrastructure - GCPMachine/gcp-example-control-plane-s8xsx
└─Workers
  └─MachineDeployment/gcp-example-md-0                               True                     17s
    ├─Machine/gcp-example-md-0-68b86fddb8-8glsw                      True                     17s
    │ └─MachineInfrastructure - GCPMachine/gcp-example-md-0-zls8d
    ├─Machine/gcp-example-md-0-68b86fddb8-bvbm7                      True                     17s
    │ └─MachineInfrastructure - GCPMachine/gcp-example-md-0-5zcvc
    ├─Machine/gcp-example-md-0-68b86fddb8-k9499                      True                     17s
    │ └─MachineInfrastructure - GCPMachine/gcp-example-md-0-k8h5p
    └─Machine/gcp-example-md-0-68b86fddb8-l6vfb                      True                     17s
      └─MachineInfrastructure - GCPMachine/gcp-example-md-0-9h5vn
5. Remove the bootstrap cluster because the workload cluster is now self-managed.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
# Deleting bootstrap cluster
Procedure
• NKP only supports moving all namespaces in the cluster; NKP does not support migration of individual
namespaces.
• Konvoy supports moving only one set of cluster objects from the bootstrap cluster to the workload cluster or vice-
versa.
Procedure
1. When the workload cluster is created, the cluster life cycle services generate a kubeconfig file for the workload
cluster and write it to a Secret. The kubeconfig file is scoped to the cluster administrator. Get a kubeconfig
file for the workload cluster.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
Note: The Status may take a few minutes to move to Ready while the Pod network is deployed. The node status
will change to Ready soon after the calico-node DaemonSet Pods are Ready.
Output:
NAME STATUS ROLES AGE VERSION
gcp-example-control-plane-9z77w Ready control-plane,master 4m44s v1.27.6
gcp-example-control-plane-rtj9h Ready control-plane,master 104s v1.27.6
gcp-example-control-plane-zbf9w Ready control-plane,master 3m23s v1.27.6
gcp-example-md-0-88c46 Ready <none> 3m28s v1.27.6
gcp-example-md-0-fp8s7 Ready <none> 3m28s v1.27.6
gcp-example-md-0-qvnx7 Ready <none> 3m28s v1.27.6
gcp-example-md-0-wjdrg Ready <none> 3m27s v1.27.6
Prerequisites:
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} >> ${CLUSTER_NAME}.conf
a. See the Kommander Customizations page for customization options. Some options include Custom Domains
and Certificates, HTTP proxy, and External Load Balancer.
5. Only required if your cluster uses a custom AWS VPC and requires an internal load-balancer; set the traefik
annotation to create an internal-facing ELB.
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
6. To enable NKP Catalog Applications and install Kommander in the same kommander.yaml from the previous
section, add these values (if you are enabling NKP Catalog Apps) for NKP-catalog-applications.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
catalog:
  repositories:
    - name: NKP-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "NKP"
      gitRepositorySpec:
Note: If you only want to enable catalog applications to an existing configuration, add these values to an existing
installer configuration file to maintain your Management cluster’s settings.
If you want to enable NKP Catalog applications after installing NKP, see Enable NKP Catalog
Applications after Installing NKP.
Section Contents
Procedure
1. Make sure your AWS credentials are up to date. Refresh the credentials using this command.
nkp update bootstrap credentials aws --kubeconfig $HOME/.kube/config
2. The bootstrap cluster will host the Cluster API controllers that reconcile the cluster objects marked for deletion.
Create a bootstrap cluster. To avoid using the wrong kubeconfig, the following steps use explicit kubeconfig paths
and contexts.
nkp create bootstrap --kubeconfig $HOME/.kube/config --with-aws-bootstrap-
credentials=true
3. Move the Cluster API objects from the workload to the bootstrap cluster: The cluster life cycle services on the
bootstrap cluster are ready, but the workload cluster configuration is on the workload cluster. The move command
moves the configuration, which takes the form of Cluster API Custom Resource objects, from the workload to the
bootstrap cluster.
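Based on the equivalent step in the VCD workflow earlier in this guide, the move command takes this general form (a sketch; confirm the flags against your NKP CLI version):
nkp move capi-resources \
  --from-kubeconfig ${CLUSTER_NAME}.conf \
  --to-kubeconfig $HOME/.kube/config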
4. Use the cluster life cycle services on the workload cluster to check the workload cluster’s status.
nkp describe cluster --kubeconfig $HOME/.kube/config -c ${CLUSTER_NAME}
Output:
102s
Procedure
1. To delete a cluster, use nkp delete cluster and pass in the name of the cluster you are trying to delete
with the --cluster-name flag. Use kubectl get clusters to get the details (--cluster-name and --
namespace) of the Kubernetes cluster to delete.
kubectl get clusters
Note: Before deleting the cluster, Nutanix Kubernetes Platform (NKP) deletes all Services of type LoadBalancer
on the cluster. An AWS Classic ELB backs each Service. Deleting the Service deletes the ELB that backs it. To
skip this step, use the --delete-kubernetes-resources=false flag. Do not skip this step if NKP
manages the VPC. When NKP deletes the cluster, it deletes the VPC. If the VPC has any AWS Classic ELBs, AWS
does not allow the VPC to be deleted, and NKP cannot delete the cluster.
Procedure
Delete the bootstrap cluster.
nkp delete bootstrap --kubeconfig $HOME/.kube/config
Output:
# Deleting bootstrap cluster
Section Contents
Nutanix Kubernetes Platform (NKP) implements node pools using Cluster API MachineDeployments. For more
information on node pools, see these sections:
Creating a node pool is useful when you need to run workloads that require machines with specific
resources, such as a GPU, additional memory, or specialized network or storage hardware.
Procedure
1. Set the environment variable to the name you assigned this cluster.
export CLUSTER_NAME=gcp-example
2. If your workload cluster is self-managed, as described in Make the New Cluster Self-Managed, configure kubectl
to use the kubeconfig for the cluster.
export KUBECONFIG=${CLUSTER_NAME}.conf
Procedure
Set the --zone flag to a zone in the same region as your cluster. Create a new node pool with three replicas using
this command.
nkp create nodepool gcp ${NODEPOOL_NAME} \
--cluster-name=${CLUSTER_NAME} \
--image $IMAGE_NAME \
--zone us-west1-b \
--replicas=3
This example uses default values for brevity. Use flags to define custom instance types, images, and other properties.
Advanced users can use a combination of the --dry-run and --output=yaml or --output-
directory=<existing-directory> flags to get a complete set of node pool objects to modify locally or store in
version control.
List the node pools of a given cluster. This returns specific properties of each node pool so that you can
see the name of the MachineDeployments.
Procedure
To list all node pools for a managed cluster, run:
nkp get nodepools --cluster-name=${CLUSTER_NAME} --kubeconfig=${CLUSTER_NAME}.conf
gcp-example-md-0 4 4 v1.29.6
While you can run Cluster Autoscaler, you can also manually scale your node pools up or down when you
need finite control over your environment.
Procedure
gcp-attached-md-0 5 5 v1.29.6
While you can run Cluster Autoscaler, you can also manually scale your node pools down when you need
more finite control over your environment.
Procedure
gcp-example-md-0 4 4 v1.29.6
3. In a default cluster, the nodes to delete are selected at random. CAPI's delete policy controls this behavior.
However, when using the Nutanix Kubernetes Platform (NKP) CLI to scale down a node pool, it is also possible
to specify the Kubernetes Nodes you want to delete.
To do this, set the flag --nodes-to-delete with a list of nodes as below. This adds an annotation
cluster.x-k8s.io/delete-machine=yes to the matching Machine object that contains status.NodeRef with the node
names from --nodes-to-delete.
nkp scale nodepools ${NODEPOOL_NAME} --replicas=3 --nodes-to-delete=<> --cluster-name=${CLUSTER_NAME}
Output:
# Scaling node pool example to 3 replicas
Deleting a node pool deletes the Kubernetes nodes and the underlying infrastructure.
In an air-gapped environment, your environment is isolated from unsecured networks, like the Internet. In a
non-air-gapped environment, your environment has two-way access to and from the Internet.
For more information, see Installing Kommander in an Air-gapped Environment on page 965 and
Installing Kommander in a Non-Air-gapped Environment on page 969.
• Your License Type
NKP Pro and NKP Government Pro are self-managed, single-cluster Kubernetes solutions that give you a
feature-rich, easy-to-deploy, and easy-to-manage entry-level cloud container platform. The NKP Pro and NKP
Gov licenses give the user access to the entire Konvoy cluster environment, as well as the Kommander
platform application manager.
NKP Ultimate and NKP Government Advanced are multi-cluster solutions centered around a management
cluster that manages multiple attached or managed Kubernetes clusters through a centralized management
dashboard. For this license type, you will determine whether or not to use the NKP Catalog applications.
For more information, see Licenses on page 23.
• Whether you want to enable NKP Insights. For more information, see Nutanix Kubernetes Platform Insights
Guide on page 1143.
NKP Insights is a predictive analytics capability that detects anomalies that occur either in the present or future
and generates an alert in the NKP UI.
Note: For security purposes, AI Navigator is not installed for air-gapped environments.
• Ensure you have reviewed all the Prerequisites for Installation on page 44.
• Ensure you have a default StorageClass.
• Ensure you have loaded all necessary images for your configuration. For more information, see Images
Download into Your Registry: Air-gapped Environments on page 967.
• Note down the name of the cluster where you want to install Kommander. If you do not know it, use kubectl
get clusters -A to display it.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
5. If required: If your cluster uses a custom AWS VPC and requires an internal load-balancer, set the traefik
annotation to create an internal-facing ELB.
...
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
...
6. Expand one of the following sets of instructions, depending on your license and application environments.
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful.
Tips and recommendations:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Commands within a kubeconfig File on page 31.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time to
wait> flag and specify a period (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to
retry.
What to do next
See Verifying Kommander Installation on page 985.
If you want to enable a solution that detects current and future anomalies in workload configurations or Kubernetes
clusters, see Nutanix Kubernetes Platform Insights Guide on page 1143.
Procedure
With the kommander.yaml file in place, run the following command.
nkp install kommander \
  --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf \
  --kommander-applications-repository ./application-repositories/kommander-applications-v2.12.0.tar.gz \
Note: If you want to enable NKP Catalog applications after installing NKP, see NKP Catalog Applications
Enablement after Installing NKP on page 1011.
Procedure
1. In the same kommander.yaml of the previous section, add the following values to enable NKP Catalog
Applications.
...
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      path: ./nkp-catalog-applications-v2.12.0.tar.gz
...
Warning: If you only want to enable catalog applications to an existing configuration, add these values to an
existing installer configuration file to maintain your Management cluster’s settings.
Procedure
1. Download the NKP air-gapped bundle for this release to load registry images as explained below.
For more information, see Downloading NKP on page 16.
nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz
• Both management and attached clusters must be able to connect to the local registry.
• The management cluster must be able to connect to all the attached cluster API servers.
• The management cluster must be able to connect to any load balancers created for platform services on the
management cluster.
Procedure
2. The directory structure after extraction is used in subsequent steps to access files from
different directories. For example, for the bootstrap step, change your directory to the nkp-<version> directory, similar to the
example below, depending on your current location.
cd nkp-v2.12.0
Warning: If you do not already have a local registry, set up one. For more information, see Registry Mirror Tools
on page 1017.
Procedure
Run the following command to load the air-gapped image bundle into your private registry.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar \
  --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} \
  --to-registry-password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Procedure
Run the following command to load the nkp-catalog-applications image bundle into your private registry.
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-v2.12.0.tar \
  --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} \
  --to-registry-password=${REGISTRY_PASSWORD}
It can take a while to push all the images to your image registry, depending on the performance of the network
between the machine you are running the script on and the registry.
• Ensure you have reviewed Identifying and Modifying Your StorageClass on page 982.
• Ensure you have a default StorageClass. See Identifying and Modifying Your StorageClass on page 982.
• Note down the name of the cluster where you want to install Kommander. If you do not know it, use kubectl
get clusters -A to display it.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
5. If required: If your cluster uses a custom AWS VPC and requires an internal load-balancer, set the traefik
annotation to create an internal-facing ELB.
...
apps:
  traefik:
    enabled: true
    values: |
      service:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
...
6. Expand one of the following sets of instructions, depending on your license and application environments.
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful. More information is available
in Configuring an HTTP or HTTPS Proxy on page 644.
Tips and recommendations:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives and to Provide Context for Commands with a kubeconfig File, see Commands
within a kubeconfig File on page 31.
• Applications can take longer to deploy and time out the installation. Add the --wait-timeout <time to
wait> flag and specify a period (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to
retry.
What to do next
See Verifying Kommander Installation on page 985.
If you want to enable a solution that detects current and future anomalies in workload configurations or Kubernetes
clusters, see Nutanix Kubernetes Platform Insights Guide on page 1143.
Note: If you want to enable NKP Catalog applications after installing NKP, see NKP Catalog Applications
Enablement after Installing NKP on page 1011.
Procedure
1. In the same kommander.yaml of the previous section, add the following values to enable NKP Catalog
Applications.
...
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
      gitRepositorySpec:
        url: https://github.com/mesosphere/nkp-catalog-applications
        ref:
          tag: v2.12.0
...
Warning: If you only want to enable catalog applications to an existing configuration, add these values to an
existing installer configuration file to maintain your Management cluster’s settings.
• Ensure you have reviewed Identifying and Modifying Your StorageClass on page 982.
• Ensure you have a default StorageClass. See Identifying and Modifying Your StorageClass on page 982.
• Note down the name of the cluster where you want to install Kommander. If you do not know it, use kubectl get clusters -A to display it.
• Ensure you have completed all the Prerequisites for Installation on page 44.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
4. Edit the installer file to include configuration overrides for the rook-ceph-cluster.
NKP's default configuration ships Ceph with PVC-based storage (see https://rook.io/docs/rook/v1.10/CRDs/Cluster/pvc-cluster/), which requires your CSI provider to support PVCs with volumeMode: Block. Because this is not possible with the default local static provisioner (see Default Storage Providers on page 33), you can install Ceph in host storage mode (see https://rook.io/docs/rook/v1.10/CRDs/Cluster/host-cluster/).
You can choose whether Ceph's object storage daemon (OSD) pods should consume all or just some of the devices on your nodes. Include one of the following overrides.
a. To automatically assign all raw storage devices on all nodes to the Ceph cluster.
...
rook-ceph-cluster:
enabled: true
values: |
cephClusterSpec:
storage:
storageClassDeviceSets: []
useAllDevices: true
useAllNodes: true
deviceFilter: "<<value>>"
...
Note: If you want to assign specific devices to specific nodes using the deviceFilter option, see https://rook.io/docs/rook/v1.10/CRDs/Cluster/host-cluster/#specific-nodes-and-devices.
6. If required: If your cluster uses a custom AWS VPC and requires an internal load-balancer, set the traefik
annotation to create an internal-facing ELB.
...
apps:
...
traefik:
enabled: true
values: |
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
...
7. Expand one of the following sets of instructions, depending on your license and application environments.
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful.
Tips and recommendations:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Commands within a kubeconfig File on page 31.
• Applications can take longer to deploy and time out the installation. Add the --wait-timeout <time to
wait> flag and specify a period (for example, 1h) to allocate more time to the deployment of applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to
retry.
What to do next
See Verifying Kommander Installation on page 985.
If you want to enable a solution that detects current and future anomalies in workload configurations or Kubernetes
clusters, see Nutanix Kubernetes Platform Insights Guide on page 1143.
Procedure
Using the kommander.yaml file, run the following command.
nkp install kommander --installer-config kommander.yaml --kubeconfig=
${CLUSTER_NAME}.conf \
--kommander-applications-repository ./application-repositories/kommander-applications-
v2.12.0.tar.gz \
--charts-bundle ./application-charts/nkp-kommander-charts-bundle-v2.12.0.tar.gz
Note: If you want to enable NKP Catalog applications after installing NKP, see NKP Catalog Applications
Enablement after Installing NKP on page 1011.
Procedure
1. In the same kommander.yaml of the previous section, add the following values to enable NKP Catalog
Applications.
...
catalog:
repositories:
- name: nkp-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
path: ./nkp-catalog-applications-v2.12.0.tar.gz
...
Warning: If you only want to enable catalog applications to an existing configuration, add these values to an
existing installer configuration file to maintain your Management cluster’s settings.
Procedure
1. Download the NKP air-gapped bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load registry images as explained below. See Downloading NKP on page 16.
• Both management and attached clusters must be able to connect to the local registry.
• The management cluster must be able to connect to all the attached cluster’s API servers.
• The management cluster must be able to connect to any load balancers created for platform services on the
management cluster.
Procedure
2. The directory structure after extraction is used in subsequent steps to access files from different directories. For example, for the bootstrap, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location.
cd nkp-v2.12.0
Procedure
1. The Kubernetes image bundle is located in kib/artifacts/images; verify the images and artifacts.
b. Verify the artifacts for your OS exist in the kib/artifacts/ directory and export the appropriate variables.
$ ls kib/artifacts/
c. Set the bundle values with the name from the private registry location.
export OS_PACKAGES_BUNDLE=<name_of_the_OS_package>
export CONTAINERD_BUNDLE=<name_of_the_containerd_bundle>
For example, for RHEL 8.4, set the following.
export OS_PACKAGES_BUNDLE=1.29.6_redhat_8_x86_64.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.28-d2iq.1-rhel-8.4-x86_64.tar.gz
Warning: If you do not already have a local registry, set up one. For more information, see Registry Mirror Tools
on page 1017.
Procedure
Run the following command to load the air-gapped image bundle into your private registry.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-
registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-
password=${REGISTRY_PASSWORD}
Note: It might take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
• Ensure you have reviewed all the prerequisites for installation (see Prerequisites for Installation on page 44).
• Ensure you have a default StorageClass. For more information, see Identifying and Modifying Your StorageClass on page 982.
• Note down the name of the cluster where you want to install Kommander. If you do not know it, use kubectl
get clusters -A to display it.
Warning: You must modify the Kommander installer configuration file (kommander.yaml) before installing the
Kommander component of NKP in a pre-provisioned environment.
Procedure
2. Copy the kubeconfig file of your Management cluster to your local directory.
nkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
5. If required: If your cluster uses a custom AWS VPC and requires an internal load-balancer, set the traefik
annotation to create an internal-facing ELB.
...
apps:
traefik:
enabled: true
values: |
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
...
6. Expand one of the following sets of instructions, depending on your license and application environments.
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy,
and --no-proxy and their related values in this command for it to be successful. More information is available in
Configuring an HTTP/HTTPS Proxy.
Tips and recommendations:
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Commands within a kubeconfig File on page 31.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of
applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to
retry.
What to do next
See Verifying Kommander Installation on page 985.
If you want to enable a solution that detects current and future anomalies in workload configurations or Kubernetes
clusters, see Nutanix Kubernetes Platform Insights Guide on page 1143.
Note: If you want to enable NKP Catalog applications after installing NKP, see NKP Catalog Applications
Enablement after Installing NKP on page 1011.
Procedure
1. In the same kommander.yaml of the previous section, add the following values to enable NKP Catalog
Applications.
...
catalog:
repositories:
- name: nkp-catalog-applications
labels:
kommander.d2iq.io/project-default-catalog-repository: "true"
kommander.d2iq.io/workspace-default-catalog-repository: "true"
kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
gitRepositorySpec:
url: https://github.com/mesosphere/nkp-catalog-applications
ref:
tag: v2.12.0
...
Warning: If you only want to enable catalog applications to an existing configuration, add these values to an
existing installer configuration file to maintain your Management cluster’s settings.
Warning: Some applications depend on other applications to work properly. To find out which other applications you need to enable to test the target application, see Platform Applications on page 386.
Warning: Ultimate considerations: Nutanix recommends performing testing and demo tasks in a single-cluster
environment. The Ultimate license is designed for multi-cluster environments and fleet management, which require a
minimum number of resources. Applying an Ultimate license key to the previous installation adds modifications to your
environment that can exhaust a small environment’s resources.
Procedure
Tip: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out.
Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate
more time to the deployment of applications.
If the Kommander installation fails, or you want to reconfigure applications, you can rerun the install command and view the progress by increasing the log verbosity with the -v 2 flag.
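For example, a retry with increased verbosity might look like the following (a sketch that reuses the kommander.yaml file and cluster kubeconfig from this procedure):
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf -v 2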
Dashboard UI Functions
After installing the Konvoy component, building a basic cluster, successfully installing Kommander, and logging in to the UI, you are ready to customize configurations.
For more information, see Basic Installations by Infrastructure on page 50. The majority of the customization, such as attaching clusters and deploying applications, takes place in the NKP dashboard or UI, which allows you to manage cluster operations and their application workloads to optimize your organization's productivity.
If you want to enable a solution that detects current and future anomalies in workload configurations or Kubernetes
clusters, see Nutanix Kubernetes Platform Insights Guide on page 1143.
Procedure
1. By default, you can log in to the UI in Kommander with the credentials provided by this command.
nkp open dashboard --kubeconfig=${CLUSTER_NAME}.conf
3. You can also retrieve the URL used for accessing the UI using the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='https://{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}/nkp/kommander/
dashboard{{ "\n"}}'
You should only use these static credentials to access the UI and configure an external identity provider. For more information, see Identity Providers on page 350. Treat them as backup credentials rather than using them for normal access.
Default StorageClass
Kommander requires a default StorageClass.
For the supported cloud providers, the Konvoy component handles the creation of a default StorageClass.
For pre-provisioned environments, the Konvoy component handles the creation of a StorageClass in the form of a
local volume provisioner, which is not suitable for production use. Before installing the Kommander component,
you should identify and install a Kubernetes CSI (see https://kubernetes.io/docs/concepts/storage/volumes/
#volume-types) compatible storage provider that is suitable for production, and then ensure it is set as the default, as
shown below. For more information, see Provisioning a Static Local Volume on page 36.
For infrastructure driver specifics, see Default Storage Providers on page 33.
Procedure
Installing Kommander
You can configure the Kommander component of NKP during the initial installation and post-installation
using the NKP CLI.
• To ensure your cluster has enough resources, review the Management Cluster Application Requirements on
page 41.
• Ensure you have a default StorageClass, as shown in Identifying and Modifying Your StorageClass on
page 982.
Initialize a Kommander Installer Configuration File as follows:
Procedure
To begin configuring Kommander, run the following command to initialize a default configuration file.
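A sketch of that initialization command, matching the one used later in this guide for the Gatekeeper configuration:
nkp install kommander --init > kommander.yaml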
Procedure
1. Add the --installer-config flag to the kommander install command to use a custom configuration file, as shown in the example after the following tip.
Tip: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out.
Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate
more time to the deployment of applications.
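For example, using the custom configuration file from step 1 (a sketch; the file name, kubeconfig, and timeout are placeholders based on the tips in this guide):
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf --wait-timeout 1h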
Procedure
After the initial deployment of Kommander, you can find the application Helm Charts by checking the
spec.chart.spec.sourceRef field of the associated HelmRelease.
kubectl get helmreleases <application> -o yaml -n kommander
Inline configuration (using values):
In this example, you configure the centralized-grafana application with resource limits by defining the Helm
Chart values in the Kommander configuration file.
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
centralized-grafana:
values: |
grafana:
resources:
limits:
cpu: 150m
memory: 100Mi
requests:
cpu: 100m
memory: 50Mi
...
Reference another YAML file (using valuesFrom):
Alternatively, you can create another YAML file containing the configuration for centralized-grafana and
reference that using valuesFrom. You can point to this file by using either a relative path (from the configuration file
location) or by using an absolute path.
cat > centralized-grafana.yaml <<EOF
grafana:
resources:
limits:
cpu: 150m
memory: 100Mi
requests:
cpu: 100m
memory: 50Mi
EOF
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
centralized-grafana:
valuesFrom: centralized-grafana.yaml
Note: If the Kommander installation fails or you want to reconfigure applications, you can rerun the install command to retry the installation.
Procedure
» If you prefer the CLI not to wait for all applications to become ready, you can set the --wait=false flag.
» If you choose not to wait through the NKP CLI, you can check the status of the installation using the following
command:
kubectl -n kommander wait --for condition=Ready helmreleases --all --timeout 15m
This waits for each of the Helm charts to reach their Ready condition, eventually resulting in output resembling the following:
helmrelease.helm.toolkit.fluxcd.io/centralized-grafana condition met
helmrelease.helm.toolkit.fluxcd.io/dex condition met
helmrelease.helm.toolkit.fluxcd.io/dex-k8s-authenticator condition met
helmrelease.helm.toolkit.fluxcd.io/fluent-bit condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-logging condition met
helmrelease.helm.toolkit.fluxcd.io/grafana-loki condition met
helmrelease.helm.toolkit.fluxcd.io/karma condition met
helmrelease.helm.toolkit.fluxcd.io/kommander condition met
helmrelease.helm.toolkit.fluxcd.io/kommander-appmanagement condition met
helmrelease.helm.toolkit.fluxcd.io/kube-prometheus-stack condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost condition met
helmrelease.helm.toolkit.fluxcd.io/kubecost-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/kubefed condition met
helmrelease.helm.toolkit.fluxcd.io/kubernetes-dashboard condition met
helmrelease.helm.toolkit.fluxcd.io/kubetunnel condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator condition met
helmrelease.helm.toolkit.fluxcd.io/logging-operator-logging condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-adapter condition met
helmrelease.helm.toolkit.fluxcd.io/prometheus-thanos-traefik condition met
helmrelease.helm.toolkit.fluxcd.io/reloader condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph condition met
helmrelease.helm.toolkit.fluxcd.io/rook-ceph-cluster condition met
helmrelease.helm.toolkit.fluxcd.io/thanos condition met
helmrelease.helm.toolkit.fluxcd.io/traefik condition met
helmrelease.helm.toolkit.fluxcd.io/traefik-forward-auth-mgmt condition met
helmrelease.helm.toolkit.fluxcd.io/velero condition met
» If you find any HelmReleases in a “broken” release state, such as “exhausted” or “another rollback/release in progress”, trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -
p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -
p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
What to do next
After building the Konvoy cluster and installing Kommander, you can verify your Kommander installation. See Logging into the UI with Kommander on page 981.
Configuration Parameters
For additional information about configuring the Kommander component of NKP during initial installation, see
Installing Kommander with a Configuration File on page 983.
AppConfig Parameters
values: Contains the values that are passed to the application's HelmRelease.
IngressCertificate
Air-gapped Parameters
Next Step:
Configuring HTTP proxy for the Kommander Clusters on page 1006
NKP supports configuring a custom domain name for accessing the UI and other platform services, as well as setting
up manual or automatic certificate renewal or rotation. This section provides instructions and examples on how to
configure the NKP installation to add a customized domain and certificate to your Pro cluster or Management cluster.
Note: Using Let’s Encrypt or other public ACME certificate authorities does not work in air-gapped scenarios, as these
services require a connection to the Internet for their setup. For air-gapped environments, you can either use self-signed
certificates issued by the cluster (the default configuration) or a certificate created manually using a trusted Certificate
Authority.
KommanderCluster Object
The KommanderCluster resource is an object that contains key information for all types of clusters that are part of
your environment, such as:
Issuer Objects
Issuer, ClusterIssuer or certificateSecret?
If you use a certificate issued and managed automatically by cert-manager, you need an Issuer or ClusterIssuer that you reference in your KommanderCluster resource. The referenced object must contain the information of your certificate provider.
If you want to use a manually-created certificate, you need a certificateSecret that you reference in your
KommanderCluster resource.
Note:
If you are enabling proxied access (see Proxied Access to Network-Restricted Clusters on page 505) for a network-restricted cluster, this configuration is restricted to DNS.
Warning: Certificates issued by another Issuer: You can also configure a certificate issued by another
Certificate Authority. In this case, the CA will determine which information to include in the configuration.
Next Step:
Verifying and Troubleshooting the Domain and Certificate Customization on page 998
a. If you do not have the kommander.yaml file, initialize the configuration file so that you can edit it in the following steps.
Warning: Initialize this file only ONE time; otherwise, you will overwrite previous customizations.
b. If you have initialized the configuration file already, open the <kommander.yaml> with the editor of your
choice.
3. Enable ACME by adding the acme value, the issuer's server, and your email. If you don't provide a server, NKP sets up Let's Encrypt as your certificate provider.
acme:
email: <your_email>
server: <your_server>
[...]
Note: Basic configuration in this topic refers to the ACME server without EAB (External Account Bindings) and
HTTP solver.
Procedure
1. Create a ClusterIssuer and store it in the target cluster. It must be called kommander-acme-issuer:
a. If you require an HTTP solver, adapt the following example with the properties required for your certificate
and run the command:
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: kommander-acme-issuer # This part is important
spec:
  acme:
    email: <your_email>
    server: <https://acme.server.example>
    privateKeySecretRef:
      name: kommander-acme-issuer-account # Set this to <name>-account
    solvers:
    - http01: # example HTTP-01 solver; adjust or replace for your environment
        ingress:
          ingressClassName: <your_ingress_class>
EOF
b. If you require a DNS solver, adapt the following example with the properties required for your certificate and
run the command:
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: kommander-acme-issuer # This part is important
spec:
acme:
email: <your_email>
server: <https://acme.server.example>
privateKeySecretRef:
name: kommander-acme-issuer-account # Set this to <name>-account
solvers:
- dns01:
route53:
region: us-east-1
role: arn:aws:iam::YYYYYYYYYYYY:role/dns-manager
EOF
2. (Optional) If you require External Account Bindings to link your ACME account to an external database, see
https://cert-manager.io/docs/configuration/acme/#external-account-bindings.
3. (Optional): Create a DNS record by setting up the external-dns service. For more information, see DNS Record Creation with External DNS on page 998. This way, the external-dns service takes care of pointing the DNS record to the ingress of the cluster automatically.
You can also manually create a DNS record that maps your domain name or IP address to the cluster ingress.
If you choose to create a DNS record manually, finish installing the Kommander component, and then manually
create a DNS record that points to the load balancer address.
a. If you do not have the kommander.yaml file, initialize the configuration file, so you can edit it in the
following steps.
Warning: Initialize this file only ONCE; otherwise, you will overwrite previous customizations.
b. If you have initialized the configuration file already, open the kommander.yaml with the editor of your
choice.
• Certificate
• certificate’s private key
• CA bundle (containing the root and intermediate certificates)
Procedure
a. If you do not have the kommander.yaml file, initialize the configuration file so that you can edit it in the
following steps.
Warning: Initialize this file only ONCE; otherwise, you will overwrite previous customizations.
b. If you have initialized the configuration file already, open the kommander.yaml with the editor of your choice.
2. In the Kommander Installer Configuration file, provide your custom domain and the paths to the PEM files of
your certificate.
[...]
clusterHostname: <mycluster.example.com>
ingressCertificate:
certificate: <certs/cert.pem>
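The certificate's private key and the CA bundle listed above belong in the same block. A hedged sketch, assuming the installer schema names these fields private_key and ca (verify the exact field names in your initialized kommander.yaml):
[...]
clusterHostname: <mycluster.example.com>
ingressCertificate:
  certificate: <certs/cert.pem>
  private_key: <certs/key.pem> # assumed field name for the certificate's private key
  ca: <certs/cacert.pem> # assumed field name for the CA bundle
[...]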
Warning: If you require a ClusterIssuer, you must create it before you run the Kommander installation.
Warning: If you need to make changes in the configuration of your domain or certificate after you have installed NKP, or if you want to set up a custom domain and certificate for Attached or Managed clusters, modify the ingress in the KommanderCluster object as shown in the Custom domains and certificates configuration section.
Procedure
a. If you do not have the kommander.yaml file, see Installing Kommander with a Configuration File on
page 983, so you can edit it in the following steps.
Warning: Initialize this file only ONCE; otherwise, you will overwrite previous customizations.
b. If you have initialized the configuration file already, open the kommander.yaml with the editor of your
choice.
2. In that file, configure the custom domain for your cluster by adding this line.
[...]
clusterHostname: <mycluster.example.com>
[...]
3. This configuration can be used when installing or reconfiguring Kommander by passing it to the nkp install
kommander command.
nkp install kommander --installer-config <kommander.yaml> --kubeconfig=
${CLUSTER_NAME}.conf
Note: To ensure Kommander is installed on the right cluster, use the --kubeconfig=cluster_name.conf
flag as an alternative to KUBECONFIG.
4. After the command completes, obtain the cluster ingress IP address or hostname using the following command.
kubectl -n kommander get svc kommander-traefik -o go-template='{{with
index .status.loadBalancer.ingress 0}}{{or .hostname .ip}}{{end}}{{ "\n"}}'
If required, create a DNS record (for example, by using external-dns) for your custom hostname that resolves to
the cluster ingress load balancer hostname or IP address. If the previous command returns a hostname, you should
create a CNAME DNS entry that resolves to that hostname. If the cluster ingress is an IP address, create a DNS A
record.
Warning: The domain must be resolvable by the client (your browser) and from the cluster. If you set up an external-dns service, it will take care of pointing the DNS record to the ingress of the cluster automatically. If you are manually creating a DNS record, you have to install Kommander first to obtain the load balancer address required for the DNS record.
For more details and examples on how and when to set up the DNS record, see Configuring the Kommander
Installation with a Custom Domain and Certificate on page 990.
Procedure
2. If the ingress is still being provisioned, the output looks similar to this.
[...]
Conditions:
Last Transition Time: 2022-06-24T07:48:31Z
Message: Ingress service object was not found in the cluster
Reason: IngressServiceNotFound
Status: False
Type: IngressAddressReady
[...]
If the provisioning has been completed, the output looks similar to this.
[...]
Conditions:
Last Transition Time: 2022-06-28T13:43:33Z
Message: Ingress service address has been provisioned
Reason: IngressServiceAddressFound
Status: True
Type: IngressAddressReady
Last Transition Time: 2022-06-28T13:42:24Z
Message: Certificate is up to date and has not expired
Reason: Ready
Status: True
Type: IngressCertificateReady
[...]
The same command also prints the actual customized values for the KommanderCluster.Status.Ingress.
Here is an example.
[...]
ingress:
address: 172.20.255.180
caBundle: LS0tLS1CRUdJTiBD...<output has been shortened>...DQVRFLS0tLS0K
[...]
• Configuring External DNS with the CLI: Management or Pro Cluster on page 999
• Configuring the External DNS Using the UI on page 1000
• Verifying Your External DNS Configuration on page 1002
If you choose to create a DNS record manually, finish installing the Kommander component and then manually create
a DNS record that points to the load balancer address.
Procedure
a. If you do not have the kommander.yaml file, see Installing Kommander with a Configuration File on page 983 so that you can edit it in the following steps.
Warning: Initialize this file only ONCE; otherwise, you will overwrite previous customizations.
b. If you have installed the Kommander component already, open the existing kommander.yaml with the editor
of your choice.
2. Adjust the app section of your kommander.yaml file to include these values.
AWS Example: Replace the placeholders <...> with your environment's information.
The following example shows how to configure external-dns to manage DNS records in AWS Route 53
automatically:
apps:
external-dns:
enabled: true
values: |
aws:
credentials:
secretKey: <secret-key>
accessKey: <access-key>
region: <provider-region>
preferCNAME: true
policy: upsert-only
txtPrefix: local-
domainFilters:
- <example.com>
Azure Example: Replace the placeholders <...> with your environment's information.
apps:
external-dns:
enabled: true
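The Azure values block mirrors the example shown later in this section for the UI-based configuration; a sketch using the same placeholders:
apps:
  external-dns:
    enabled: true
    values: |
      azure:
        cloud: AzurePublicCloud
        resourceGroup: <resource-group>
        tenantId: <tenant-id>
        subscriptionId: <your-subscription-id>
        aadClientId: <client-id>
        aadClientSecret: <client-secret>
      domainFilters:
        - <example.com>
      txtPrefix: txt-
      policy: sync
      provider: azure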
3. In the same app section, adjust the traefik section to include the following.
traefik:
enabled: true
values: |
service:
annotations:
external-dns.alpha.kubernetes.io/hostname: <mycluster.example.com>
What to do next
Verifying Your External DNS Configuration on page 1002
Procedure
6. Copy and paste the following contents into the code editor and replace the placeholders <...> with your
environment’s information.
Here is an example configuration.
AWS Example: Replace the placeholders <...> with your environment's information.
The following example shows how to configure external-dns to manage DNS records in AWS Route 53
automatically.
aws:
credentials:
secretKey: <secret-key>
accessKey: <access-key>
region: <provider-region>
preferCNAME: true
policy: upsert-only
txtPrefix: local-
domainFilters:
- <example.com>
Azure Example: Replace the placeholders <...> with your environment's information.
azure:
cloud: AzurePublicCloud
resourceGroup: <resource-group>
tenantId: <tenant-id>
subscriptionId: <your-subscription-id>
aadClientId: <client-id>
aadClientSecret: <client-secret>
domainFilters:
- <example.com>
txtPrefix: txt-
policy: sync
provider: azure
For more configuration options, see https://artifacthub.io/packages/helm/bitnami/external-dns.
Procedure
Warning: Ensure you set up a domain per cluster, for example: <mycluster1.example.com>,
<mycluster2.example.com> and <mycluster3.example.com>.
What to do next
Verifying Your External DNS Configuration on page 1002
Procedure
1. Set the environment variable to the Management/Pro cluster by exporting the kubeconfig file in your terminal
window or using the --kubeconfig=${CLUSTER_NAME}.conf as explained in Commands within a
kubeconfig File on page 31.
Procedure
1. Set the environment variable to the target cluster (where you enabled external-dns) by exporting the kubeconfig file in your terminal window or using the --kubeconfig=${CLUSTER_NAME}.conf flag as explained in Commands within a kubeconfig File on page 31.
Procedure
1. Set the environment variable to the target cluster (where you enabled external-dns) by exporting the kubeconfig file in your terminal window or using the --kubeconfig=${CLUSTER_NAME}.conf flag as explained in Commands within a kubeconfig File on page 31.
2. Verify that the cluster’s ingress contains the correct hostname annotation.
Replace <target_WORKSPACE_NAMESPACE> in the namespace -n flag with the target cluster’s workspace
namespace.
kubectl get services -n <target_WORKSPACE_NAMESPACE> kommander-traefik -o yaml
The output looks like this.
Ensure that the service object contains the external-dns.alpha.kubernetes.io/hostname:
<mycluster.example.com> annotation.
apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: kommander-traefik
meta.helm.sh/release-namespace: kommander
external-dns.alpha.kubernetes.io/hostname: <mycluster.example.com>
creationTimestamp: "2023-06-21T04:52:49Z"
finalizers:
[...]
The external-dns service has been linked to the cluster correctly.
Procedure
1. Set the environment variable to the target cluster (where you enabled external-dns) by exporting the kubeconfig file in your terminal window or using the --kubeconfig=${CLUSTER_NAME}.conf flag as explained in Commands within a kubeconfig File on page 31.
3. Use nslookup to check your domain and see the record.
Replace <mycluster.example.com> with the domain you assigned to your target cluster.
nslookup <mycluster.example.com>
The output should look like this.
Server: 192.168.178.1
Address: 192.168.178.1#53
Non-authoritative answer:
Name: <mycluster.example.com>
Address: 134.568.789.12
The external-dns service is working, and the DNS provider recognizes the record created by the service. If the command displays an error, the configuration is failing on the DNS provider's end.
Note: If your deployment has not succeeded and the previous steps have not helped you identify the issue, you can
also check the logs for the external-dns deployment:
• 1. Set the environment variable to the target cluster (where you enabled external-dns) by exporting the kubeconfig file in your terminal window or using the --kubeconfig=${CLUSTER_NAME}.conf flag as explained in Commands within a kubeconfig File on page 31.
2. Verify the external-dns logs:
Replace <target_WORKSPACE_NAMESPACE> in the namespace -n flag with the target cluster’s
workspace namespace.
kubectl logs -n kommander deployment/external-dns
The output displays the pod’s logs for the external-dns deployment. Here is an example:
...
time="2023-07-04T06:56:35Z" level=info msg="Instantiating new Kubernetes
client"
time="2023-07-04T06:56:35Z" level=info msg="Using inCluster-config based
on serviceaccount-token"
time="2023-07-04T06:56:35Z" level=info msg="Created Kubernetes client
https://10.96.0.1:443"
time="2023-07-04T06:56:35Z" level=error msg="records retrieval failed:
failed to list hosted zones:
...
Note: In NKP environments, the external load balancer must be configured without TLS termination.
Procedure
a. If you do not have the kommander.yaml file, see Installing Kommander with a Configuration File on page 983 so that you can edit it in the following steps.
Warning: Initialize this file only ONCE. Otherwise you will overwrite previous customizations.
b. If you have installed the Kommander component already, open the existing kommander.yaml with the editor
of your choice.
2. In that file, add the following line for the IP address or DNS name:
Warning: ACME does not support the automatic creation of a certificate if you select an IP address for your
clusterHostname.
[...]
clusterHostname: <mycluster.example.com OR IP_address>
[...]
3. (Optional): If you require a custom certificate for your clusterHostname, see Configuring the Kommander
Installation with a Custom Domain and Certificate on page 990.
4. In the same Kommander Installer Configuration File, configure Kommander to use the NodePort service
by adding a custom configuration under traefik.
Warning: You can specify the nodePort entry points for the load balancer. Ensure the port is within the Kubernetes default NodePort range (30000-32767). If not specified, Kommander assigns a port dynamically.
traefik:
enabled: true
values: |-
ports:
web:
nodePort: 32080 #if not specified, will be assigned dynamically
websecure:
nodePort: 32443 #if not specified, will be assigned dynamically
service:
type: NodePort
Warning:
• The NO_PROXY variable contains the Kubernetes Services CIDR. This example uses the default CIDR,
10.96.0.0/12. If your cluster's CIDR is different, update the value in the NO_PROXY field.
• Based on the order in which the Gatekeeper Deployment is Ready (in relation to other Deployments), not all the core services are guaranteed to be mutated with the proxy environment variables. Only user-deployed workloads are guaranteed to be mutated with the proxy environment variables. If you need a core service to be mutated with your proxy environment variables, you can restart the AppDeployment for that core service.
Note: Kommander follows a common convention for using an HTTP proxy server. The convention is based on three environment variables and is supported by many, though not all, applications:
• HTTP_PROXY: the address of the proxy server for HTTP requests
• HTTPS_PROXY: the address of the proxy server for HTTPS requests
• NO_PROXY: a list of IPs and domain names that are not subject to the proxy settings
Procedure
1. Verify the cluster nodes can access the Internet through the proxy server.
Enabling Gatekeeper
Gatekeeper acts as a Kubernetes mutating webhook.
Procedure
1. Create (if necessary) or update the Kommander installation configuration file. If one does not already exist, create it using the following command.
nkp install kommander --init > kommander.yaml
Note:
Only pods created after applying this setting will be mutated. Also, this will only affect pods in the
namespace with the "gatekeeper.d2iq.com/mutate=pod-proxy" label.
apps:
gatekeeper:
values: |
disableMutation: false
mutations:
enablePodProxy: true
podProxySettings:
noProxy:
"127.0.0.1,192.168.0.0/16,10.0.0.0/16,10.96.0.0/12,169.254.169.254,169.254.0.0/24,localhost,kub
prometheus-server.kommander,logging-operator-logging-
fluentd.kommander.svc.cluster.local,elb.amazonaws.com"
httpProxy: "http://proxy.company.com:3128"
httpsProxy: "http://proxy.company.com:3128"
excludeNamespacesFromProxy: []
namespaceSelectorForProxy:
"gatekeeper.d2iq.com/mutate": "pod-proxy"
3. Create the kommander and kommander-flux namespaces, or the namespace where Kommander will be
installed. Label the namespaces to activate the Gatekeeper mutation on them.
kubectl create namespace kommander
kubectl label namespace kommander gatekeeper.d2iq.com/mutate=pod-proxy
Procedure
Run the following command.
export NAMESPACE=kommander
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: gatekeeper-overrides
namespace: ${NAMESPACE}
data:
values.yaml: |
---
# enable mutations
disableMutation: false
mutations:
enablePodProxy: true
podProxySettings:
noProxy:
"127.0.0.1,192.168.0.0/16,10.0.0.0/16,10.96.0.0/12,169.254.169.254,169.254.0.0/24,localhost,kubern
Note: To ensure Kommander is installed on the workload cluster, use the --kubeconfig=cluster_name.conf flag.
Procedure
Run the following command.
nkp install kommander --installer-config kommander.yaml
Procedure
1. To have Gatekeeper mutate the manifests, create the Workspace (or Project) with the following label.
labels:
gatekeeper.d2iq.com/mutate: "pod-proxy"
2. This can be done when creating the Workspace (or Project) from the UI OR by running the following command
from the CLI after creating the namespace.
kubectl label namespace <NAMESPACE> "gatekeeper.d2iq.com/mutate=pod-proxy"
Procedure
1. Execute the following command in the attached cluster before attaching it to the host cluster.
kubectl create namespace <NAMESPACE>
1. To configure Gatekeeper so that these environment variables are mutated in the pods, create the following
gatekeeper-overrides ConfigMap in the Workspace Namespace.
export NAMESPACE=<NAMESPACE>
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: gatekeeper-overrides
namespace: ${NAMESPACE}
data:
values.yaml: |
---
# enable mutations
disableMutation: false
mutations:
enablePodProxy: true
podProxySettings:
noProxy:
"127.0.0.1,192.168.0.0/16,10.0.0.0/16,10.96.0.0/12,169.254.169.254,169.254.0.0/24,localhost,kub
prometheus-server.kommander,logging-operator-logging-
fluentd.kommander.svc.cluster.local,elb.amazonaws.com"
httpProxy: "http://proxy.company.com:3128"
httpsProxy: "http://proxy.company.com:3128"
excludeNamespacesFromProxy: []
namespaceSelectorForProxy:
"gatekeeper.d2iq.com/mutate": "pod-proxy"
EOF
Set the httpProxy and httpsProxy environment variables to the address of the HTTP and HTTPS proxy
servers, respectively. Set the noProxy environment variable to the addresses that should be accessed directly,
not through the proxy. To view the list of the recommended settings, see HTTP Proxy Configuration
Considerations on page 1006.
2. Label the namespace to activate the Gatekeeper mutation on it by running the following command.
kubectl label namespace <NAMESPACE> "gatekeeper.d2iq.com/mutate=pod-proxy"
Note: If Gatekeeper is not installed and you need to use an HTTP proxy, you must manually configure your
applications.
Procedure
1. Some applications follow the convention of the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables.
2. Set the environment variables for the container in the Pod specification, as shown in the example below. For more information, see https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/#define-an-environment-variable-for-a-container.
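A minimal sketch of such a Pod specification; the image name and proxy addresses are placeholders that follow the examples used earlier in this section:
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: example-app
    image: <your-image>
    env:
    - name: HTTP_PROXY
      value: "http://proxy.company.com:3128"
    - name: HTTPS_PROXY
      value: "http://proxy.company.com:3128"
    - name: NO_PROXY
      value: "127.0.0.1,10.96.0.0/12,169.254.169.254,localhost,.svc,.cluster.local"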
What to do next
Next Steps: Now select your environment, and finish your Kommander Installation using one of the
following:
1. Locate the Kommander Installer Configuration file you are using for your current deployment. This file is stored
locally on your computer.
The default file name is kommander.yaml, but you can provide a different name for your file.
3. Enable the default catalog repository by adding the following values to your existing and already configured
Kommander Installer Configuration file.
...
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.d2iq.io/project-default-catalog-repository: "true"
        kommander.d2iq.io/workspace-default-catalog-repository: "true"
        kommander.d2iq.io/gitapps-gitrepository-type: "nkp"
...
4. Reconfigure NKP by reinstalling the Kommander component. In the following example, replace
kommander.yaml with the name of your Kommander Installer Configuration file.
Warning: Ensure you are using the correct name for the Kommander Installer Configuration file to maintain your
cluster settings. Installing a different kommander.yaml than the one your environment is using overwrites all of
your previous configurations and customizations.
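The reinstall command referenced in step 4 might look like the following (a sketch that reuses the flags shown elsewhere in this guide):
nkp install kommander --installer-config kommander.yaml --kubeconfig=${CLUSTER_NAME}.conf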
• The --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you install Kommander on the correct
cluster. For alternatives, see Commands within a kubeconfig File on page 31.
• Applications can take longer to deploy, and time out the installation. Add the --wait-timeout <time
to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of
applications.
• If the Kommander installation fails, or you want to reconfigure applications, rerun the install command to
retry.
Label Description
kommander.d2iq.io/project-default-catalog-repository: Indicates this acts as a Catalog Repository in all projects.
kommander.d2iq.io/workspace-default-catalog-repository: Indicates this acts as a Catalog Repository in all workspaces.
kommander.d2iq.io/gitapps-gitrepository-type: Indicates this Catalog Repository (and all its Applications) are certified to run on NKP.
8
ADDITIONAL KONVOY CONFIGURATIONS
When installing Nutanix Kubernetes Platform (NKP) for a project, line-of-business, or enterprise, the first step is to determine the infrastructure on which you want to deploy. The infrastructure you select then determines the specific requirements for a successful installation.
For basic recommended installations by infrastructure, see Basic Installations by Infrastructure on page 50.
For custom or advanced installations by infrastructure, see Custom Installation and Infrastructure Tools on
page 644.
If you have decided to uninstall NKP, see the same infrastructure documentation you have selected in the Basic
Installations by Infrastructure on page 50.
Section Contents
The different Infrastructures and components for further configuration are listed below:
Section Contents
Note: You cannot apply FIPS mode to an existing cluster. You must create a new cluster with FIPS enabled. Similarly,
a FIPS-mode cluster must remain a FIPS-mode cluster; you cannot change its FIPS status after creating it.
AWS Example
When creating a cluster, use the following command line options:
• --kubernetes-version <version>+fips.<build>
• --etcd-version <version>+fips.<build>
• --kubernetes-image-repository docker.io/mesosphere
• --etcd-image-repository docker.io/mesosphere
nkp create cluster aws --cluster-name myFipsCluster \
--ami=ami-03dcaa75d45aca36f \
--kubernetes-version=v1.29.6+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--etcd-image-repository=docker.io/mesosphere \
--etcd-version=3.5.10+fips.0
vSphere Example
nkp create cluster vsphere \
--cluster-name ${CLUSTER_NAME} \
--network <NETWORK_NAME> \
--control-plane-endpoint-host <xxx.yyy.zzz.000> \
--data-center <DATACENTER_NAME> \
--data-store <DATASTORE_NAME> \
--folder <FOLDER_NAME> \
--server <VCENTER_API_SERVER_URL> \
--ssh-public-key-file <SSH_PUBLIC_KEY_FILE> \
--resource-pool <RESOURCE_POOL_NAME> \
--vm-template <TEMPLATE_NAME> \
--kubernetes-version=v1.29.6+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--etcd-image-repository=docker.io/mesosphere \
--etcd-version=3.5.10+fips.0
Note: To locate the available Override files in the Konvoy Image Builder repo, see https://github.com/
mesosphere/konvoy-image-builder/tree/main/overrides.
Examples
The following snippets show the creation of FIPS-compliant Kubernetes components. If you need the underlying OS
to be FIPS-compliant, provide the specific FIPS-compliant OS image using the --source-ami flag for AWS.
• A non-air-gapped environment example of override file use is the following command, which produces a FIPS-compliant image on RHEL 8.4 for AWS. Replace ami with your infrastructure provisioner.
konvoy-image build --overrides overrides/fips.yaml images/ami/rhel-84.yaml
Note: The Using Override Files with Konvoy Image Builder topic lists the available override files and how to apply them.
Note: To locate the available Override files in the Konvoy Image Builder repo, see https://github.com/
mesosphere/konvoy-image-builder/tree/main/overrides.
Examples:
The following snippets show the creation of FIPS-compliant Kubernetes components. If you need the underlying OS
to be FIPS-compliant, provide the specific FIPS-compliant OS image using the --source-ami flag for AWS.
• An air-gapped environment example of override file use is the command below, which produces an AWS FIPS-
compliant image on RHEL 8.4:
konvoy-image build --overrides offline-fips.yaml --overrides overrides/fips.yaml images/ami/rhel-84.yaml
Procedure
2. Create a secret on the bootstrap cluster with the contents from the fips.yaml override file and any other user overrides you wish to provide.
kubectl create secret generic $CLUSTER_NAME-fips-overrides --from-
file=overrides.yaml=overrides.yaml
kubectl label secret $CLUSTER_NAME-fips-overrides clusterctl.cluster.x-k8s.io/move=
To view the complete list of FIPS Override Files, see FIPS Override Non-air-gapped Files on page 1068.
Signature Files
The following signature files are embedded in the nkp executable. This information is for reference only. You do not
need to download them to run the FIPS check.
The aggregate impact on a stable control plane seems to be an increase of around 10% CPU utilization over default
operation. Workloads that do not directly interact with the control plane are not affected.
For more information, see https://github.com/golang/go/issues/21525
Section Contents
AWS ECR
AWS ECR (Elastic Container Registry) is supported as your air-gapped image registry or a non-air-gapped registry
mirror. Nutanix Kubernetes Platform (NKP) added support for using AWS ECR as a default registry when uploading
image bundles in AWS.
Prerequisites
• Ensure you have followed the steps to create proper permissions in AWS Minimal Permissions and Role to
Create Clusters
• Ensure you have created AWS Cluster IAM Policies, Roles, and Artifacts
• --bundle <bundle>: the group of images to push. The example below uses the NKP air-gapped environment bundle.
An example command:
nkp push bundle --bundle container-images/konvoy-image-bundle-v2.12.0.tar --to-registry=333000009999.dkr.ecr.us-west-2.amazonaws.com/can-test
Note: You can also set an environment variable with your registry address for ECR:
export REGISTRY_URL=<ecr-registry-URI>
• REGISTRY_URI: the address of an existing local registry accessible in the VPC; the new cluster nodes will be configured to use it as a mirror registry when pulling images.
JFrog Artifactory
JFrog Artifactory can function as a container registry and an automated management tool for binaries and artifacts of
all types. If you use JFrog Artifactory or JFrog Container Registry, you must update to a new software version. Use a
build newer than version 7.11; older versions are not compatible.
For more information, see https://jfrog.com/artifactory/.
Nexus Registry
Nexus Repository is a package registry for your Docker images and Helm Chart repositories and supports Proxy,
Hosted, and Group repositories. It can be used as a single registry for all your Kubernetes deployments.
For more information, see https://www.nexusregistry.com/info/.
Harbor Registry
Install Harbor, configure any required HTTP access and the system-level parameters in the harbor.yml file, and then run the installer script. If you upgrade from a previous version of Harbor, you must update the configuration file and migrate your data to fit the database schema of the later version. For information about upgrading, see https://goharbor.io/docs/2.0.0/administration/upgrade/ and https://goharbor.io/docs/2.0.0/install-config/download-installer/. Harbor Registry versions later than v2.1.1-5f52168e support OCI images.
While seeding, you may see error messages such as the following:
2023/09/12 20:01:18 retrying without mount: POST https://harbor-registry.daclusta/v2/
harbor-registry
/mesosphere/kube-proxy/blobs/uploads/?from=mesosphere%2Fkube-
proxy&mount=sha256%3A9fd5070b83085808ed850ff84acc
98a116e839cd5dcfefa12f2906b7d9c6e50d&origin=REDACTED: UNAUTHORIZED: project not found,
name: mesosphere
: project not found, name: mesosphere
This message indicates that the image was not successfully pushed to your Harbor Docker registry, but it is a false positive. It only affects versions of the Nutanix Kubernetes Platform (NKP) binary newer than NKP 2.4.0 and does not affect any other local registry solution, such as Nexus or Artifactory. You can safely ignore these error messages.
Bastion Host
If you have not set up a Bastion Host yet, refer to that section of the documentation.
Related Information
If you need to configure a private registry with a registry mirror, see Use a Registry Mirror.
Export Variables
Adding your registry information to the environment variable.
Set the environment variable with your registry information.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
Definitions:
REGISTRY_URL: the address of an existing local registry accessible in the VPC; the new cluster nodes will be configured to use it as a mirror registry when pulling images.
• JFrog - REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. This value is only needed if
the registry uses a self-signed certificate and the AMIs are not already configured to trust this CA.
• To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the flags --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<your-username> --registry-mirror-password=<your-password> when running nkp create cluster, as shown in the sketch below.
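A sketch of such a command, assuming an AWS cluster and placeholder credentials:
nkp create cluster aws --cluster-name=${CLUSTER_NAME} \
  --registry-mirror-url=https://registry-1.docker.io \
  --registry-mirror-username=<your-username> \
  --registry-mirror-password=<your-password>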
This is useful when using an internal registry and when Internet access is unavailable, such as in an air-gapped
environment. However, registry mirrors can also be used in non-air-gapped environments for security and speed.
You can deploy and test workloads when the cluster is up and running.
Note: If you do not already have a local registry set up, see the Local Registry Tools page for more information.
Procedure
2. The directory structure after extraction is used in subsequent steps to access files from different directories. Change your directory to the nkp-<version> directory for the bootstrap cluster, for example:
cd nkp-v2.12.0
3. Set an environment variable with your registry address and any other needed variables using this command.
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
4. Execute the following command to load the air-gapped image bundle into your private registry using any relevant
flags to apply the above variables.
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-registry=${REGISTRY_URL}
Note: It may take some time to push all the images to your image registry, depending on the network's performance between the machine you are running the script on and the registry.
Note: To use Elastic Container Registry (ECR), set an environment variable with your registry address for ECR:
export REGISTRY_URL=<ecr-registry-URI>
• REGISTRY_URL: the address of an existing local registry accessible in the Virtual Private Cloud (VPC); the new cluster nodes will be configured to use it as a mirror registry when pulling images.
• The environment where you are running the NKP push command must be authenticated with AWS in order to
load your images into ECR.
You are now ready to create an air-gapped bootstrap cluster for a custom cluster for your infrastructure provider, or create an air-gapped cluster from the Day 1 - Basic Installs section for your provider.
Prerequisites
Before you begin, make sure you have created your cluster using a bootstrap cluster from the respective
Infrastructure Providers section.
Section Contents
Note: The following example uses AWS, but can be used for gcp, azure, preprovisioned, and vsphere
clusters.
nkp create cluster aws -c {MY_CLUSTER_NAME} -o yaml --dry-run >> {MY_CLUSTER_NAME}.yaml
Procedure
1. When you open {MY_CLUSTER_NAME}.yaml with your favorite text editor, look for the KubeadmControlPlane
object for your cluster. For example.
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: my-cluster-control-plane
namespace: default
spec:
kubeadmConfigSpec:
clusterConfiguration:
SECONDS=0
until crictl info
do
if (( SECONDS > 60 ))
then
echo "Containerd is not running. Giving up..."
exit 1
fi
echo "Containerd is not running yet. Waiting..."
sleep 5
done
path: /run/konvoy/restart-containerd-and-wait.sh
permissions: "0700"
- contentFrom:
secret:
key: value
name: my-cluster-etcd-encryption-config
owner: root:root
path: /etc/kubernetes/pki/encryption-config.yaml
permissions: "0640"
format: cloud-config
initConfiguration:
localAPIEndpoint: {}
Note: If you use the previous example as-is, update the Kubernetes version number on the final line by replacing
the x with your version.
2. Now, you can configure the fields below for the log backend. The log backend writes audit events to a file in JSON format. You can configure the log audit backend using the following kube-apiserver flags, shown in the example below.
• audit-log-maxage
• audit-log-maxbackup
• audit-log-maxsize
• audit-log-path
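A minimal sketch of how these flags might be set in the KubeadmControlPlane object above, under the API server's extra arguments; the values are illustrative only:
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          audit-log-maxage: "30" # days to retain old audit log files
          audit-log-maxbackup: "10" # maximum number of retained audit log files
          audit-log-maxsize: "100" # maximum size in MB before rotation
          audit-log-path: /var/log/kubernetes/audit.log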
3. After modifying the values appropriately, you can create the cluster by running the kubectl create -f
{MY_CLUSTER_NAME}.yaml command.
kubectl create -f {MY_CLUSTER_NAME}.yaml
4. Once the cluster is created, users can get the corresponding kubeconfig for the cluster by running the nkp get kubeconfig -c {MY_CLUSTER_NAME} >> {MY_CLUSTER_NAME}.conf command.
nkp get kubeconfig -c {MY_CLUSTER_NAME} >> {MY_CLUSTER_NAME}.conf
What to do next
For information on related topics or procedures, see Fluent bit.
• Deploy Pod Disruption Budget (PDB). For more information, see https://kubernetes.io/docs/concepts/
workloads/pods/disruptions/
• Konvoy Image Builder (KIB)
Procedure
1. Deploy Pod Disruption Budget for your critical applications. If your application can tolerate only one replica to be
unavailable at a time, then you can set the Pod disruption budget as shown in the following example. The example
below is for NVIDIA GPU node pools, but the process is the same for all.
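A minimal sketch of such a file (pod-disruption-budget-nvidia.yaml, the name used in the next step); the namespace and selector labels are placeholders for your critical application:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nvidia-critical-app
  namespace: <your-namespace>
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: <your-critical-app-label>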
3. Apply the YAML file above using the command kubectl create -f pod-disruption-budget-
nvidia.yaml.
4. Prepare OS image for your node pool using the Konvoy Image Builder.
What to do next
For information on related topics or procedures, see Upgrade NKP Pro.
Section Contents
Procedure
1. If you have not done so already, set the environment variable for your cluster name, substituting NKP-example
with the name of your cluster.
export CLUSTER_NAME=NKP-example
2. Then, run this command to check the health of the cluster infrastructure.
nkp describe cluster --cluster-name=${CLUSTER_NAME}
Note: For more details on the NKP describe cluster command, see https://docs.d2iq.com/dkp/2.8/
dkp-describe-cluster.
##MachineDeployment/NKP-example-md-0 True
121m
##Machine/NKP-example-md-0-88488cb74-2vxjq True
121m
##Machine/NKP-example-md-0-88488cb74-84xsd True
121m
##Machine/NKP-example-md-0-88488cb74-9xmc6 True
121m
##Machine/NKP-example-md-0-88488cb74-mjf6s True
121m
3. Use this kubectl command to see if all cluster nodes are ready.
kubectl get nodes
Example output showing all statuses set to Ready.
NAME                                         STATUS   ROLES                  AGE    VERSION
ip-10-0-112-116.us-west-2.compute.internal   Ready    <none>                 135m   v1.21.6
ip-10-0-122-142.us-west-2.compute.internal   Ready    <none>                 135m   v1.21.6
ip-10-0-186-214.us-west-2.compute.internal   Ready    control-plane,master   133m   v1.21.6
ip-10-0-231-82.us-west-2.compute.internal    Ready    control-plane,master   135m   v1.21.6
ip-10-0-71-114.us-west-2.compute.internal    Ready    <none>                 135m   v1.21.6
ip-10-0-71-207.us-west-2.compute.internal    Ready    <none>                 135m   v1.21.6
ip-10-0-85-253.us-west-2.compute.internal    Ready    control-plane,master   137m   v1.21.6
Troubleshooting
If any pod is not in Running or Completed status, investigate further. If something has not been
deployed properly, run the nkp diagnose command, which collects information from the pods and the
infrastructure.
Note: You need to delete the attachment for any clusters attached in Kommander before running the delete
command.
Important: Persistent Volumes (PVs) are not automatically deleted by design, to preserve your data. However, they
take up storage space if not deleted, and you must delete PVs manually. For information about backing up a cluster and its PVs,
see Cluster Applications and Persistent Volumes Backup on page 517. With
vSphere clusters, nkp delete does not delete the virtual disks backing the PVs for NKP add-ons. Therefore, the
internal VMware cluster eventually runs out of storage. These PVs are only visible if vSAN is installed, which gives users a
Container Native Storage tab.
Procedure
• Ansible is used to install software and to configure and sanitize systems for running Konvoy.
• Packer is used to build images for cloud environments.
• Goss is used to validate that systems are capable of running Konvoy.
This section describes using KIB to create Cluster API compliant machine images. Machine images contain
configuration information and software to create a specific, pre-configured operating environment. For example,
you can create an image of your computer system settings and software. The machine image can then be replicated
and distributed to other systems. KIB uses variable overrides to specify the base image and
container images to use in your new machine image. Variable override files are provided for NVIDIA and FIPS; they can be
ignored unless you add the corresponding overlay feature.
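For example, a build that applies more than one override at once might look like the following sketch (the image YAML and override file names are illustrative; the --overrides flag accepts a comma-separated list of files):
konvoy-image build aws images/ami/rhel-86.yaml --overrides overrides/nvidia.yaml,overrides/fips.yaml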
Prerequisites
Before you begin, you must ensure your versions of KIB and NKP are compatible:
• Download the Konvoy Image Builder bundle (prefixed with konvoy-image-bundle) for your operating system from the
KIB Version column of the chart below, matching your version of NKP.
• Check the Supported Infrastructure Operating Systems and the Supported Kubernetes Version for your
provider.
• An x86_64-based Linux or macOS machine.
• A container engine or runtime installed, which is required to install NKP and bootstrap:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or macOS. On macOS, Docker runs
in a virtual machine, which needs to be configured with at least 8 GB of memory. For more information, see
https://docs.docker.com/get-docker/.
• Podman version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• For air-gapped environments only, a local registry.
• Override files
You can use override files to customize some of the components installed on this machine image. The KIB base
override files are located in this GitHub repository https://github.com/mesosphere/konvoy-image-builder/
tree/main/overrides. For more information on using override flags, see Use Override Files with Konvoy
Image Builder on page 1067
• Customize image YAML
Begin creating an image, then interrupt the process so that the manifest.json gets built; you can then open and edit keys
in the YAML. For more information, see Customize your Image YAML or Manifest File.
• Using HTTP or HTTPS Proxies
In some networked environments, the machines used for building images can reach the Internet, but only through
an HTTP or HTTPS proxy. For NKP to operate in these networks, you need a way to specify what proxies to use.
Further explanation is found in Use HTTP or HTTPS Proxy with KIB Images on page 1076
Note: The konvoy-image binary and all supporting folders are also extracted, and a bind mount places the current
working directory (${PWD}) into the container to be used. For more information, see https://docs.docker.com/
storage/bind-mounts/.
Set the environment variables for AWS access. The following variables must be set using your credentials, including
the required IAM permissions. For more information, see https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html.
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
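For example (placeholder values shown; substitute your own credentials and preferred region):
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_DEFAULT_REGION=us-west-2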
Next Steps
Either return to the Basic Install or Custom Install instructions, or, for more KIB-specific provider information,
continue to the provider links below:
• Basic Install
• Custom Install instructions
Section Contents
Note:
To modify repository templates for RHEL, these files are required for operating systems that will use
NVIDIA GPU drivers.
Procedure
Note: RPMs are packaged into the KIB create-package-bundle workflow for RHEL and Rocky. For other distros
the current behavior remains the same.
2. You will need to fetch the distro packages as well as other artifacts. By fetching the distro packages from distro
repositories, you get the latest security fixes available at machine image build time.
• redhat-8.6
• redhat-8.8
• ubuntu-20.04
• ubuntu-22.04
Note: For RHEL OS, pass your Red Hat subscription or licensing manager credentials.
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
OR
export RHSM_USER=""
export RHSM_PASS=""
4. Run the konvoy-image command to build and validate the image. Ensure you have named the correct AMI
image YAML file for your OS in the konvoy-image build command.
konvoy-image build aws images/ami/rhel-86.yaml
Note:
• 1. To enable EUS, add the --enable-eus-repos flag which fetches packages from EUS
repositories during RHEL package bundles creation. It is disabled by default.
2. If kernel headers are required for GPU, add the --fetch-kernel-headers flag which
fetches kernel headers for the target operating system. To modify the version, edit the file at
bundles/{OS_NAME} {VERSION}
5. Use the custom image with NKP by specifying it with the image flag (--ami for AWS) in the nkp create cluster command.
Section Contents
• You have a valid AWS account with credentials configured that can manage CloudFormation Stacks, IAM
Policies, IAM Roles, and IAM Instance Profiles. For more information, see https://docs.aws.amazon.com/cli/
latest/userguide/cli-configure-profiles.html
• The AWS CLI utility is installed. For more information, see https://docs.aws.amazon.com/cli/latest/
userguide/cli-configure-profiles.html.
Procedure
Note: In previous Nutanix Kubernetes Platform (NKP) releases, AMI images provided by the upstream CAPA project
were used if you did not specify an AMI. However, the upstream images are not recommended for production and may
not always be available. Therefore, NKP requires you to specify an AMI when creating a cluster. To create an AMI, use
Konvoy Image Builder. A customized image requires the Konvoy Image Builder tool to be downloaded. You can use
variable overrides to specify the base image and container images for your new custom AMI.
Procedure
1. Build and validate the image by running the konvoy-image build command.
By default, it builds in the us-west-2 region. To specify another region, set the --region flag on the command.
konvoy-image build aws images/ami/centos-79.yaml
2. Once KIB provisions the image successfully, the AMI ID is printed and written to the packer.pkr.hcl file.
Example output showing an artifact_id field whose value provides the name of the AMI ID.
{
"name": "rhel-7.9-fips",
"builder_type": "amazon-ebs",
"build_time": 1659486130,
"files": null,
"artifact_id": "us-west-2:ami-0f2ef742482e1b829",
"packer_run_uuid": "0ca500d9-a5f0-815c-6f12-aceb4d46645b",
"custom_data": {
"containerd_version": "",
"distribution": "RHEL",
"distribution_version": "7.9",
"kubernetes_cni_version": "",
Important: In previous Nutanix Kubernetes Platform (NKP) releases, AMI images provided by the upstream CAPA
project were used if you did not specify an AMI. However, the upstream images are not recommended for production
and may not always be available. Therefore, NKP requires you to specify an AMI when creating a cluster. To create an
AMI, use Konvoy Image Builder. Explore the Customize your Image topic for more options about overrides.
Prerequisites
Before you begin, you must:
• Download the KIB bundle for your version of NKP prefixed with konvoy-image-bundle for your OS. To
locate KIB download links, see the Compatible NKP to KIB Versions table in KIB.
• Check the Supported Infrastructure Operating Systems.
• Check the Supported Kubernetes Version for your Provider.
• A working container engine or runtime installed:
• Docker container engine version 18.09.2 or 20.10.0 installed for Linux or MacOS. For more information, see
https://docs.docker.com/get-docker/.
• Podman Version 4.0 or later for Linux. For more information, see https://podman.io/getting-started/
installation. For host requirements, see https://kind.sigs.k8s.io/docs/user/rootless/#host-requirements.
• For information on Cluster APIs, see https://cluster-api.sigs.k8s.io/.
• Ensure you have met the minimal set of permissions from the AWS Image Builder Book. For more information,
see https://image-builder.sigs.k8s.io/capi/providers/aws.html#required-permissions-to-build-the-aws-
amis.
• Minimal IAM permissions for KIB to create an image for an AWS account using Konvoy Image Builder. For
more information, see Creating Minimal IAM Permissions for KIB on page 1035.
Section Contents
Procedure
• The bundles directory is located in your downloads location and contains all the steps to create an OS package
bundle for a particular OS.
• For FIPS, pass the flag: --fips.
• For RHEL OS, pass your Red Hat subscription manager credentials: export RHSM_ACTIVATION_KEY.
Example:
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
4. Build an AMI.
The konvoy-image binary and all supporting folders are also extracted. When run, konvoy-image bind mounts
the current working directory (${PWD}) into the container to be used.
• Set environment variables for AWS access. The following variables must be set using your credentials,
including the permissions described in Creating Minimal IAM Permissions for KIB on page 1035.
Example:
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION
• If you have an override file to configure specific attributes of your AMI file, add it. For more information on
customizing an override file, see Image Overrides on page 1073.
• Minimal IAM Permissions for KIB to create an image for an Amazon Web Services (AWS) account using
Konvoy Image Builder.
Important: In previous Nutanix Kubernetes Platform (NKP) releases, AMI images provided by the upstream CAPA
project were used if you did not specify an AMI. However, the upstream images are not recommended for production
and may not always be available. Therefore, NKP requires you to specify an AMI when creating a cluster. To create an
AMI, use Konvoy Image Builder (KIB).
Procedure
1. Build and validate the image using the command konvoy-image build aws images/ami/rhel-86.yaml.
2. Set the --region flag to specify a region other than the us-west-2 default region.
Example:
konvoy-image build aws --region us-east-1 images/ami/rhel-86.yaml
Note: Ensure you have named the correct AMI image YAML file for your OS in the konvoy-image build
command. For more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/
images/ami.
3. After KIB provisions the image successfully, locate the artifact_id field in the packer.pkr.hcl (Packer
config) file.
The artifact_id field value provides the name of the AMI ID. Use this ami value in the next step.
Example output:
{
"builds": [
{
"name": "kib_image",
"builder_type": "amazon-ebs",
"build_time": 1698086886,
"files": null,
"artifact_id": "us-west-2:ami-04b8dfef8bd33a016",
"packer_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05",
"custom_data": {
"containerd_version": "",
"distribution": "RHEL",
"distribution_version": "8.6",
"kubernetes_cni_version": "",
"kubernetes_version": "1.26.6"
}
}
],
"last_run_uuid": "80f8296c-e975-d394-45f9-49ef2ccc6e05"
}
6. Specify the custom AMI in the nkp create cluster command.
For more information, see Creating a New AWS Cluster on page 762.
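As a sketch, using the AMI ID from the example output above (the cluster name and remaining flags are placeholders; see the linked topic for the complete command):
nkp create cluster aws --cluster-name=${CLUSTER_NAME} --ami=ami-04b8dfef8bd33a016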
» Non-air-gapped
» Air-gapped
Procedure
1. Download nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz.
2. Extract the tarball to a local directory using the command tar -xzvf nkp-air-gapped-
bundle_v2.12.0_linux_amd64.tar.gz && cd nkp-v2.12.0/kib.
3. Fetch the distro packages from distro repositories; these include the latest security fixes available at machine
image build time.
• centos-7.9
• redhat-7.9
• redhat-8.6
• redhat-8.8
• rocky-9.1
• ubuntu-20.04
Note:
• For Federal Information Processing Standards (FIPS), pass the flag: --fips.
• For Red Hat Enterprise Linux (RHEL) OS, pass your Red Hat subscription manager credentials:
export RHSM_ACTIVATION_KEY. Example command:
export RHSM_ACTIVATION_KEY="-ci"
export RHSM_ORG_ID="1232131"
or
export RHSM_USER=""
export RHSM_PASS=""
Continue to the next step, which is to build an AMI. Depending on which version of NKP you are running, steps
and flags will be different. To deploy in a region where CAPI images are not provided, you need to use KIB to
create your image for the region. For a list of supported AWS regions, see the published AMI information from
AWS at https://cluster-api-aws.sigs.k8s.io/topics/images/built-amis.html.
5. Run the konvoy-image command to build and validate the image. Ensure you have named the correct AMI
image YAML file for your OS in the konvoy-image build command. For more information, see https://
github.com/mesosphere/konvoy-image-builder/tree/main/images/ami.
./konvoy-image build aws images/ami/centos-79.yaml --overrides overrides/offline.yaml
By default, it builds in the us-west-2 region. To specify another region, set the --region flag as shown in the
example.
./konvoy-image build aws --region us-east-1 images/ami/centos-79.yaml --
overrides overrides/offline.yaml
To customize an override file, see Image Overrides.
After KIB provisions the image successfully, the ami id is printed and written to the packer.pkr.hcl (Packer
config) file. This file has an amazon-ebs.kib_image field whose value provides the name of the AMI ID as
shown in the example below. That is the ami you use in the NKP create cluster command.
...
amazon-ebs.kib_image: Adding tag: "distribution_version": "8.6"
amazon-ebs.kib_image: Adding tag: "gpu_nvidia_version": ""
amazon-ebs.kib_image: Adding tag: "kubernetes_cni_version": ""
What to do next
Explore the Customize your Image topic for more options.
Note: If using AWS ECR as your local private registry, more information can be found on the Registry Mirror
Tools page.
packer:
ami_filter_name: "CentOS Linux 7"
packer:
ami_filter_name: ""
ami_filter_owners: ""
distribution: "CentOS"
distribution_version: "7.9"
source_ami: "ami-123456789"
ssh_username: "centos"
root_device_name: "/dev/sda1"
...
When you're done selecting your source_ami, you can build your KIB image as usual.
konvoy-image build aws path/to/ami/centos-79.yaml
Note: The default Azure image is not recommended for use in production. We suggest using KIB for Azure to build
the image to take advantage of enhanced cluster operations. For more options, see the Customize your Image topic.
For more information regarding using the image in creating clusters, see Creating a New Azure Cluster
on page 842.
Prerequisites
Before you begin, you must:
• Download the Konvoy Image Builder bundle for your version of Nutanix Kubernetes Platform (NKP).
• Check the Supported Infrastructure Operating Systems.
2. Create an Azure Service Principal (SP) by using the command az ad sp create-for-rbac --role
contributor --name "$(whoami)-konvoy" --scopes=/subscriptions/$(az account show --
query id -o tsv) --query "{ client_id: appId, client_secret: password, tenant_id:
tenant }".
Note: The command will rotate the password if an SP with the name exists.
Example:
{
"client_id": "7654321a-1a23-567b-b789-0987b6543a21",
"client_secret": "Z79yVstq_E.R0R7RUUck718vEHSuyhAB0C",
"tenant_id": "a1234567-b132-1234-1a11-1234a5678b90"
}
4. Ensure you have an override file to configure specific attributes of your Azure image.
By default, the image builder builds in the westus2 location. To specify another location, set the --location flag
(the example below changes the location to eastus).
Example:
konvoy-image build azure --client-id $AZURE_CLIENT_ID --tenant-id ${AZURE_TENANT_ID}
--location eastus --overrides override-source-image.yaml images/azure/centos-7.yaml
When the command is complete, the image ID is printed and written to the ./packer.pkr.hcl file. This file has an
artifact_id field whose value provides the name of the image. Specify this image ID when creating the cluster.
Image Gallery
By default, Konvoy Image Builder creates a Resource Group, Gallery, and Image Name to store the resulting
image. To specify a particular Resource Group, Gallery, or Image Name, use the following flags:
Example:
--gallery-image-locations string a list of locations to publish the image (default
same as location)
--gallery-image-name string the gallery image name to publish the image to
--gallery-image-offer string the gallery image offer to set (default "nkp")
--gallery-image-publisher string the gallery image publisher to set (default "nkp")
--gallery-image-sku string the gallery image sku to set
--gallery-name string the gallery name to publish the image in (default
"nkp")
--resource-group string the resource group to create the image in (default
"nkp")
When creating your cluster, add the --compute-gallery-id "<Managed Image Shared Image Gallery
Id>" flag to consume your custom image. See Creating a New Azure
Cluster on page 842 for the specific image consumption commands.
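As a sketch of consuming the image at cluster creation (the cluster name and gallery image ID are placeholders; see the linked topic for the exact command):
nkp create cluster azure --cluster-name=${CLUSTER_NAME} --compute-gallery-id "<Managed Image Shared Image Gallery Id>"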
The SKU and Image Name will default to the values in the YAML image.
Ensure you have named the correct YAML file for your OS in the konvoy-image build command. For more
information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/images/gcp.
• If you use these fields in the override file when you create a machine image with KIB, you must also set the
corresponding flags when you create your cluster with NKP (a sketch of these flags follows the example below).
packer:
distribution: "rockylinux-9" # Offer
distribution_version: "rockylinux-9" # SKU
# Azure Rocky linux official image: https://portal.azure.com/#view/
Microsoft_Azure_Marketplace/
GalleryItemDetailsBladeNopdl/id/
erockyenterprisesoftwarefoundationinc1653071250513.rockylinux-9
image_publisher: "erockyenterprisesoftwarefoundationinc1653071250513"
image_version: "latest"
ssh_username: "azureuser"
plan_image_sku: "rockylinux-9" # SKU
plan_image_offer: "rockylinux-9" # offer
plan_image_publisher: "erockyenterprisesoftwarefoundationinc1653071250513" #
publisher
build_name: "rocky-90-az"
packer_builder_type: "azure"
python_path: ""
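For the Rocky Linux image above, a sketch of the corresponding cluster-creation flags might look like the following (the flag names mirror the plan fields above and are assumptions; verify them against nkp create cluster azure --help for your NKP version):
nkp create cluster azure --cluster-name=${CLUSTER_NAME} \
  --plan-offer rockylinux-9 \
  --plan-publisher erockyenterprisesoftwarefoundationinc1653071250513 \
  --plan-sku rockylinux-9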
Section Contents
Note: Google Cloud Platform does not publish images. You must first build the image using Konvoy Image Builder.
Explore the Customize your Image topic for more options. For more information regarding using the image in creating
clusters, refer to the GCP Infrastructure section of the documentation.
Note: The Enable OS Login feature is sometimes enabled by default in GCP projects. If the OS login feature is
enabled, KIB will be unable to ssh to the VM instances it creates and cannot successfully create an image.
Inspect the metadata configured in your project to check if it is enabled. If you find the enable-oslogin
flag set to TRUE, you must remove it (or set it to FALSE) to use KIB successfully. For more information,
see https://cloud.google.com/compute/docs/metadata/setting-custom-metadata#console_2
• Create a GCP service account. For more information, see GCP service account.
• If you have already created a service account, retrieve the credentials for an existing service account.
• Export the static credentials that will be used to create the cluster using the command export
GCP_B64ENCODED_CREDENTIALS=$(base64 < "$GOOGLE_APPLICATION_CREDENTIALS" | tr -d '\n')
Section Contents
Procedure
2. KIB runs and prints out the name of the created image; use this name when creating a Kubernetes cluster. See
the sample output below.
...
==> ubuntu-2004-focal-v20220419: Deleting instance...
ubuntu-2004-focal-v20220419: Instance has been deleted!
==> ubuntu-2004-focal-v20220419: Creating image...
==> ubuntu-2004-focal-v20220419: Deleting disk...
ubuntu-2004-focal-v20220419: Disk has been deleted!
==> ubuntu-2004-focal-v20220419: Running post-processor: manifest
Build 'ubuntu-2004-focal-v20220419' finished after 7 minutes 46 seconds.
==> Wait completed after 7 minutes 46 seconds
==> Builds finished. The artifacts of successful builds are:
--> ubuntu-2004-focal-v20220419: A disk image was created: konvoy-
ubuntu-2004-1-23-7-1658523168
Note: Ensure you have named the correct YAML file for your OS in the konvoy-image build command. For
more information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/images/gcp
3. To find a list of images you have created in your account, use the command gcloud compute images list
--no-standard-images
What to do next
Refer to Nutanix Kubernetes Platform (NKP) Documentation regarding roles and minimum permission for
use with Konvoy Image Builder: GCP Roles.
Procedure
1. Set your GCP Project ID for your gcp account using the command export GCP_PROJECT=your GCP project
ID.
2. Create a new network using the command export NETWORK_NAME=kib-ssh-network gcloud compute
networks create "$NETWORK_NAME" --project="$GCP_PROJECT" --subnet-mode=auto --
mtu=1460 --bgp-routing-mode=regional
3. Create the firewall rule to allow Ingress access on port 22 using the command gcloud compute firewall-
rules create "$NETWORK_NAME-allow-ssh" --project="$GCP_PROJECT" --network="projects/
$GCP_PROJECT/global/networks/$NETWORK_NAME" --description="Allows TCP connections
from any source to any instance on the network using port 22." --direction=INGRESS --
priority=65534 --source-ranges=0.0.0.0/0 --action=ALLOW --rules=tcp:22
With your KIB image created, you can now move on to creating your cluster and setting up your Cluster API
(CAPI) controllers.
What to do next
For more information on related topics, see
• Network: https://cloud.google.com/vpc/docs/vpc
• Cluster API:https://cluster-api.sigs.k8s.io/
Note: The NVIDIA driver requires a specific Linux kernel version. Ensure the base image for the OS version has the
required kernel version.
If the NVIDIA runfile installer has not been downloaded, retrieve it first by running the
following commands. The first command downloads the runfile, and the second command
places it in the artifacts directory (create an artifacts directory if you do not already have one). For more
information, see https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html#runfile.
curl -O https://download.nvidia.com/XFree86/Linux-x86_64/535.183.06/NVIDIA-Linux-x86_64-535.183.06.run
mv NVIDIA-Linux-x86_64-535.183.06.run artifacts
Note: The Nutanix Kubernetes Platform (NKP) supported NVIDIA driver version is 470.x.
Procedure
1. Add the following to your override file to enable GPU builds (a minimal sketch appears after the note below). Alternatively, you can use the overrides
in our repo or in the documentation under Nvidia GPU Override File or Offline Nvidia Override File. For more
information, see https://github.com/mesosphere/konvoy-image-builder/tree/main/overrides and https://
www.nvidia.com/Download/Find.aspx
Note: For RHEL Pre-provisioned Override Files used with KIB, see specific note for GPU.
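A minimal sketch of the NVIDIA override content, matching the nvidia override shown later in this guide:
gpu:
  types:
    - nvidia
build_name_extra: "-nvidia"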
2. Build your image using the Konvoy Image Builder command, including the --instance-type flag, which specifies an
AWS instance type with an available GPU.
AWS Example:
konvoy-image build --region us-east-1 --instance-type=p2.xlarge --source-ami=ami-12345abcdef images/ami/centos-7.yaml --overrides overrides/nvidia.yaml
In this example, we chose an instance type with an NVIDIA GPU using the --instance-type flag, and we
provided the NVIDIA overrides using the --overrides flag. See Using KIB with AWS for more information on
creating an AMI.
3. When the Konvoy Image Builder image is complete, the custom AMI ID is printed and written to the packer.pkr.hcl (Packer config) file. To use the
built AMI with Konvoy, specify it with the --ami flag when calling cluster create.
Verification
Procedure
Connect to the node and execute the nvidia-smi command.
When drivers are successfully installed, the display looks like the following sample output:
Fri Jun 11 09:05:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06 Driver Version: 535.183.06 CUDA Version: 12.2.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:00:1E.0 Off | 0 |
| N/A 35C P0 73W / 149W | 0MiB / 11441MiB | 99% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Section Contents
• Storage configuration: Nutanix recommends customizing disk partitions and not configuring a SWAP
partition.
• Network configuration: as KIB must download and install packages, activating the network is required.
• Connect to Red Hat: If using Red Hat Enterprise Linux (RHEL), registering with Red Hat is required to
configure software repositories and install software packages.
• Software selection: Nutanix recommends choosing Minimal Install.
• NKP recommends installing with the packages provided by the operating system package managers. Use the
version that corresponds to the primary version of your operating system.
Disk Size
For each cluster you create using this base OS image, ensure you establish the disk size of the root file system
based on the following:
Defaults
Important: The base OS image root file system must be 80 GB for clusters created with the default disk size. The root
file system cannot be reduced automatically when a machine first boots.
Customization
You can also specify a custom disk size when you create a cluster (see the flags available
for use with the Creating a New vSphere Cluster on page 875 command). This allows you to use one base OS
image to create multiple clusters with different storage requirements.
Before specifying a disk size when you create a cluster, take into account:
1. For some base OS images, the custom disk size option does not affect the size of the root file
system. This is because some root file systems, for example, those contained in an LVM Logical Volume,
cannot be resized automatically when a machine first boots.
2. The specified custom disk size must be equal to, or larger than, the size of the base OS image
root file system. This is because a root file system cannot be reduced automatically when a machine first boots.
3. In VMware Cloud Director Infrastructure on page 912, the base image determines the minimum storage
available for the VM.
For additional information and examples of KIB vSphere overrides, see Image Overrides.
If you are using Flatcar, see the Flatcar documentation about disabling or enabling autologin in the base OS image.
In a vSphere or Pre-provisioned environment, anyone with access to the console of a Virtual Machine could use an automatic login session, so disabling autologin is recommended.
Prerequisites
• Users must create a base OS image in their vSphere client before starting this procedure.
• Konvoy Image Builder (KIB) downloaded and extracted.
Section Contents
Procedure
1. Download nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz.
3. Fetch the distro packages as well as other artifacts. You get the latest security fixes available at machine image
build time by fetching the distro packages from distro repositories.
• centos-7.9
• redhat-7.9
• redhat-8.6
• redhat-8.8
• rocky-9.1
• ubuntu-20.04
Note:
5. Follow the instructions below to build a vSphere template and set the --overrides overrides/
offline.yaml flag.
Procedure
2. Copy the base OS image file created in the vSphere Client to your desired location on the bastion VM host and
make a note of the path and file name.
Important: Use the YAML file that matches your OS name. The following example is for Ubuntu 20.04. See the
Additional YAML File Examples on page 1057 for alternate copy and paste options for the last step.
---
download_images: true
build_name: "ubuntu-2004"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-
vmware-guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: "<VSPHERE_CLUSTER_NAME>"
datacenter: "<VSPHERE_DATACENTER_NAME>"
datastore: "<VSPHERE_DATASTORE_NAME>"
folder: "<VSPHERE_FOLDER>"
insecure_connection: "false"
network: "<VSPHERE_NETWORK>"
resource_pool: "<VSPHERE_RESOURCE_POOL>"
template: "os-qualification-templates/nutanix-base-Ubuntu-20.04" # change default
value with your base template name
vsphere_guest_os_type: "other4xLinux64Guest"
guest_os_type: "ubuntu2004-64"
# goss params
distribution: "ubuntu"
distribution_version: "20.04"
# Use following overrides to select the authentication method that can be used with
the base template
# ssh_username: "" # can be exported as environment variable 'SSH_USERNAME'
# ssh_password: "" # can be exported as environment variable 'SSH_PASSWORD'
# ssh_private_key_file = "" # can be exported as environment variable
'SSH_PRIVATE_KEY_FILE'
# ssh_agent_auth: false # if set to true, ssh_password and ssh_private_key will be
ignored
5. The Konvoy Image Builder (KIB) uses the values in image.yaml and the input base OS image to create a vSphere
template directly on the vCenter server. This template contains the required artifacts needed to create a Kubernetes
cluster.
When KIB successfully provisions the OS image, it creates a manifest file. The artifact_id field of this file
contains the AMI ID (AWS), template name (vSphere), or image name (GCP/Azure). For the GitHub image
folder location, see https://github.com/mesosphere/konvoy-image-builder/tree/main/images/ova
Example:
{
"name": "vsphere-clone",
Tip: Now that you can see the template created in your vCenter, it is best to rename it to nkp-
<NKP_VERSION>-k8s-<K8S_VERSION>-<DISTRO>, for example nkp-2.4.0-k8s-1.24.6-ubuntu, to keep
templates organized.
RHEL 8.6
---
download_images: true
build_name: "rhel-86"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-vmware-
guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: ""
datacenter: ""
datastore: ""
folder: ""
insecure_connection: "false"
network: ""
resource_pool: ""
template: "os-qualification-templates/nutanix-base-RHEL-86" # change default value
with your base template name
vsphere_guest_os_type: "rhel8_64Guest"
guest_os_type: "rhel8-64"
# goss params
distribution: "RHEL"
distribution_version: "8.6"
# Use following overrides to select the authentication method that can be used with
base template
# ssh_username: "" # can be exported as environment variable 'SSH_USERNAME'
# ssh_password: "" # can be exported as environment variable 'SSH_PASSWORD'
# ssh_private_key_file = "" # can be exported as environment variable
'SSH_PRIVATE_KEY_FILE'
# ssh_agent_auth: false # if set to true, ssh_password and ssh_private_key will be
ignored
Flatcar
---
download_images: true
build_name: "flatcar-3033.3.16"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-vmware-
guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: ""
datacenter: ""
datastore: ""
folder: ""
insecure_connection: "false"
network: ""
resource_pool: ""
template: "nutanix-base-templates/nutanix-base-Flatcar-3033.3.16"
vsphere_guest_os_type: "flatcar_64Guest"
guest_os_type: "flatcar-64"
# goss params
distribution: "flatcar"
distribution_version: "3033.3.16"
# Use following overrides to select the authentication method that can be used with
base template
# ssh_username: "" # can be exported as environment variable 'SSH_USERNAME'
# ssh_password: "" # can be exported as environment variable 'SSH_PASSWORD'
# ssh_private_key_file = "" # can be exported as environment variable
'SSH_PRIVATE_KEY_FILE'
• In a Pre-provisioned environment, you have existing machines, and NKP consumes them to form a cluster.
• When you have another provisioner (for example, cloud providers such as AWS, vSphere, and others), you build
images with KIB, and NKP consumes the images to provision machines and form a cluster.
Before NKP 2.6, you had to specify the HTTP proxy in the KIB override setup and then again in the nkp create
cluster command. After NKP 2.6, an HTTP proxy gets created from the Konvoy flags for the control plane
proxy and workers proxy values. The flags in the NKP command for Pre-provisioned clusters populate a Secret
automatically in the bootstrap cluster. That Secret has a known name that the Pre-provisioned controller finds and
applies when it runs the KIB provisioning job.
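As a sketch, the proxy values are passed as flags at cluster creation (the flag names shown are illustrative assumptions; see Clusters with HTTP or HTTPS Proxy on page 647 for the exact flags):
nkp create cluster preprovisioned --cluster-name=${CLUSTER_NAME} \
  --control-plane-http-proxy=http://example.org:8080 \
  --control-plane-https-proxy=http://example.org:8081 \
  --worker-http-proxy=http://example.org:8080 \
  --worker-https-proxy=http://example.org:8081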
Note: For a Pre-provisioned air-gapped environment, you must build the OS packages after fetching the packages from
the distro repositories.
In previous NKP releases, the distro package bundles were included in the downloaded air-gapped bundle. Currently,
that air-gapped bundle contains the following artifacts, except for the distro packages:
Procedure
2. You must fetch the distro packages and other artifacts. You get the latest security fixes available at machine image
build time by fetching the distro packages from distro repositories.
3. In your download location, there is a bundles directory with all the steps to create an OS package bundle for a
particular OS. To create the bundle, run the new nkp command create-package-bundle. This builds an OS bundle
using the Kubernetes version defined in ansible/group_vars/all/defaults.yaml.
For Example:
Note:
Section Contents
packer:
ami_filter_name: "RHEL-8.6.0_HVM-*"
ami_filter_owners: "309956199498"
distribution: "RHEL"
distribution_version: "8.6"
source_ami: ""
ssh_username: "ec2-user"
root_device_name: "/dev/sda1"
volume_size: "15"
build_name: "rhel-8.6"
packer_builder_type: "amazon"
python_path: ""
Once the YAML for the image is edited, create your image using the customized YAML by running the
command konvoy-image build aws images/ami/rhel-86.yaml.
For more information about the default YAML files, see https://github.com/mesosphere/konvoy-image-builder/
tree/7447074a6d910e71ad2e61fc4a12820d073ae5ae/images
Section Contents
Procedure
Note: To avoid fully creating the image and reduce image charges, interrupt the command right after running it.
This will generate the needed files but avoid a full image creation charge.
a. Overrides for FIPS: You can also use FIPS overrides during initial image creation.
konvoy-image build aws images/ami/ubuntu-2004.yaml --overrides overrides/fips.yaml
2. A work directory is generated (for example, work/ubuntu-20-3486702485-iulvY), which contains the following
files; open them in an editor and add or change the values as needed.
• ansible_vars.yaml
• packer_vars.json
• packer.pkr.hcl
a. For a complete list of KIB flags, run the command: konvoy-image build aws --help.
5. Once the YAML for the image is edited, create another image using your customized YAML by applying
it with the --packer-manifest flag. Provision the new image by applying the recently edited
packer.pkr.hcl manifest.
AWS example:
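A representative invocation, reusing the work directory generated in step 2 (the path is an example):
./konvoy-image build aws images/ami/ubuntu-2004.yaml --packer-manifest=work/ubuntu-20-3486702485-iulvY/packer.pkr.hcl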
Using Flags
One option is to execute KIB with specific flags to override the following values:
Note: While CLI flags can be combined with override files, CLI flags take priority over any override files.
build_name: "centos-7"
packer_builder_type: "azure"
python_path: ""
vSphere Example:
A vSphere example shows the current base image description at images/vsphere/rhel-79.yaml, similar to the
following example.
---
download_images: true
build_name: "rhel-79"
packer_builder_type: "vsphere"
guestinfo_datasource_slug: "https://raw.githubusercontent.com/vmware/cloud-init-vmware-
guestinfo"
guestinfo_datasource_ref: "v1.4.0"
guestinfo_datasource_script: "{{guestinfo_datasource_slug}}/
{{guestinfo_datasource_ref}}/install.sh"
packer:
cluster: "zone1"
datacenter: "dc1"
Procedure
1. Generate manifest files for KIB by executing the following command. The following Amazon Web Services
(AWS) example shows the CentOS 7.9 image being built:
./konvoy-image build images/ami/centos-79.yaml
Note: Other provider commands are located in the Konvoy Image Builder.
2. Send SIGINT to the process to halt it after seeing the output ...writing new packer configuration to
work/centos-7-xxxxxxx-yyyyy. During the attempt to build the image, a packer.pkr.hcl is generated
under the path work/centos-7-xxxxxxx-yyyyy/packer.pkr.hcl.
3. Edit the packer.pkr.hcl file by adding the "run_tags" parameter, as seen in the
AWS example below:
"builders": [
{
"name": "{{(user `distribution`) | lower}}-{{user `distribution_version`}}{{user
`build_name
_extra`}}",
"type": "amazon-ebs",
"run_tags": {"my_custom_tag": "tag"},
"instance_type": "{{user `aws_instance_type`}}",
"source_ami_filter": {
"filters": {
"virtualization-type": "hvm",
"name": "{{user `ami_filter_name`}}",
"root-device-type": "ebs",
"architecture": "x86_64"
},
4. Use the --packer-manifest flag to apply it when you build the image.
./konvoy-image build aws images/ami/centos-79.yaml --packer-manifest=work/centos-7-
1658174984-TycMM/packer.pkr.hcl
2022/09/07 18:23:33 writing new packer configuration to work/centos-7-1662575013-
zJUhP
What to do next
Using Override Files with Konvoy Image Builder
Ansible Variables
Ansible variables are generally preset when using KIB. To change the variables, you change the preprogrammed
Ansible playbooks.
Example ansible_vars.yaml
build_name: ubuntu-20
download_images: true
kubernetes_full_version: 1.29.6
kubernetes_version: 1.29.6
packer:
ami_filter_name: ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server*
ami_filter_owners: "099720109477"
distribution: Ubuntu
distribution_version: "20.04"
dry_run: false
goss_arch: amd64
goss_entry_file: goss/goss.yaml
goss_format: json
goss_format_options: pretty
goss_inspect_mode: false
goss_tests_dir: goss
goss_url: null
goss_vars_file: ansible/group_vars/all/system.yaml
goss_version: 0.3.16
root_device_name: /dev/sda1
source_ami: ""
ssh_username: ubuntu
volume_size: "15"
packer_builder_type: amazon
python_path: ""
Section Contents
Section Contents
Add the following FIPS override file to your environment by passing the flag --overrides overrides/
fips.yaml.
Example:
---
k8s_image_registry: docker.io/mesosphere
fips:
enabled: true
build_name_extra: -fips
kubernetes_build_metadata: fips.0
default_image_repo: hub.docker.io/mesosphere
kubernetes_rpm_repository_url: "https://packages.nutanix.com/konvoy/stable/linux/repos/
el/kubernetes-v{{ kubernetes_version }}-fips/x86_64"
docker_rpm_repository_url: "\
https://containerd-fips.s3.us-east-2.amazonaws.com\
/{{ ansible_distribution_major_version|int }}\
/x86_64"
To locate the available override files in the Konvoy Image Builder repo, see https://github.com/mesosphere/
konvoy-image-builder/tree/main/overrides.
Procedure
1. If your pre-provisioned machines need a default Override file like FIPS, create a secret that includes the overrides
in a file.
Example:
cat > fips.yaml << EOF
---
k8s_image_registry: docker.io/mesosphere
fips:
enabled: true
build_name_extra: -fips
kubernetes_build_metadata: fips.0
default_image_repo: hub.docker.io/mesosphere
kubernetes_rpm_repository_url: "https://packages.nutanix.com/konvoy/stable/linux/
repos/el/kubernetes-v{{ kubernetes_version }}-fips/x86_64"
docker_rpm_repository_url: "\
https://containerd-fips.s3.us-east-2.amazonaws.com\
/{{ ansible_distribution_major_version|int }}\
/x86_64"
EOF
2. Create the related secret and label it for clusterctl move by running the following commands.
kubectl create secret generic $CLUSTER_NAME-user-overrides --from-file=fips.yaml=fips.yaml
kubectl label secret $CLUSTER_NAME-user-overrides clusterctl.cluster.x-k8s.io/move=
Section Contents
Note: All available override files are in the Konvoy Image Builder repo. For more information, see https://
github.com/mesosphere/konvoy-image-builder/tree/main/overrides.
Perform this task to add the FIPS override file to your air-gapped pre-provisioned environment.
Procedure
1. If your pre-provisioned machines need a default Override file like FIPS, create a secret that includes the overrides
in a file.
cat > fips.yaml << EOF
# fips os-packages
os_packages_local_bundle_file: "{{ playbook_dir }}/../artifacts/
{{ kubernetes_version }}_{{ ansible
_distribution|lower }}_{{ ansible_distribution_major_version }}_x86_64_fips.tar.gz"
containerd_local_bundle_file: "{{ playbook_dir }}/../artifacts/
{{ containerd_tar_file }}"
pip_packages_local_bundle_file: "{{ playbook_dir }}/../artifacts/pip-packages.tar.gz"
images_local_bundle_dir: "{{ playbook_dir}}/../artifacts/images"
EOF
Related Information
For information on related topics or procedures, see Private Registry in Air-gapped Override.
You can find these offline (air-gapped) override files in the Konvoy Image Builder repo at https://github.com/
mesosphere/konvoy-image-builder/tree/main/overrides.
--overrides overrides/offline.yaml
os_packages_local_bundle_file: "{{ playbook_dir }}/../artifacts/
{{ kubernetes_version }}_
{{ ansible_distribution|
lower }}_{{ ansible_distribution_major_version }}_x86_64.tar.gz"
containerd_local_bundle_file: "{{ playbook_dir }}/../artifacts/
{{ containerd_tar_file }}"
pip_packages_local_bundle_file: "{{ playbook_dir }}/../artifacts/pip-packages.tar.gz"
images_local_bundle_dir: "{{ playbook_dir}}/../artifacts/images"
Note: For Ubuntu 20.04, when Konvoy Image Builder runs, it will temporarily disable all defined Debian repositories
by appending a .disabled suffix. Each repository will revert to its original name at the end of installation. In case of
failures, the files will not be renamed back.
For the GPU Offline Override File, you can use the nvidia-runfile flag for GPU support if you have downloaded the
runfile installer at https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html#runfile.
Section Contents
This override file is used to create images with Nutanix Kubernetes Platform (NKP) that use GPU
hardware. These override files can also be located in the Konvoy Image Builder repo at https://github.com/
mesosphere/konvoy-image-builder/tree/main/overrides.
--overrides overrides/offline-nvidia.yaml
# Use this file when building a machine image, not as a override secret for
preprovisioned environments
nvidia_runfile_local_file: "{{ playbook_dir}}/../artifacts/
{{ nvidia_runfile_installer }}"
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
Perform this task to add the Nvidia GPU override file to your Air-gapped Pre-provisioned environment.
Procedure
1. If your pre-provisioned machines need a default Override file like FIPS, create a secret that includes the overrides
in a file.
cat > nvidia.yaml << EOF
---
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
EOF
Offline Nvidia GPU override file and task to add the Nvidia GPU override file to your air-gapped pre-
provisioned environment.
Offline Nvidia GPU override file options:
Section Contents
This override file is used to create images with Nutanix Kubernetes Platform (NKP) that use GPU
hardware. These override files can also be located in the Konvoy Image Builder repo at https://github.com/
mesosphere/konvoy-image-builder/tree/main/overrides.
--overrides overrides/offline-nvidia.yaml
# Use this file when building a machine image, not as a override secret for
preprovisioned environments
nvidia_runfile_local_file: "{{ playbook_dir}}/../artifacts/
{{ nvidia_runfile_installer }}"
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
Perform this task to add the Nvidia GPU override file to your Air-gapped Pre-provisioned environment.
Procedure
1. If your pre-provisioned machines need a default Override file like GPU, create a secret that includes the overrides
in a file.
cat > nvidia.yaml << EOF
---
nvidia_runfile_local_file: "{{ playbook_dir}}/../artifacts/
{{ nvidia_runfile_installer }}"
gpu:
types:
- nvidia
build_name_extra: "-nvidia"
EOF
You can find these override files in the Konvoy Image Builder repo at https://github.com/mesosphere/konvoy-
image-builder/tree/main/overrides.
--overrides overrides/rhck.yaml
---
oracle_kernel: RHCK
Section Contents
Image Overrides
When using KIB to create an OS image that is compliant with Nutanix Kubernetes Platform (NKP), the parameters
for building and configuring the image are included in the file located at images/<builder-type>/<os>-<version>.yaml,
where <builder-type> is infrastructure-provider specific, such as ami or ova.
Although several parameters are specified by default in the Packer templates for each provider, it is possible to
override the default values with flags and override files.
Run the ./konvoy-image build images/<builder-type>/<os>-<version>.yaml command for those files:
Azure Usage
Azure also requires the --client-id and --tenant-id flags. For more information, see Using KIB with Azure.
Note: To locate the YAML Ain't Markup Language (YAML) for each provider, see https://github.com/
mesosphere/konvoy-image-builder/tree/7447074a6d910e71ad2e61fc4a12820d073ae5ae/images.
Using Override Files
Another option is to create a file with the parameters to be overridden and specify it with the --overrides flag, as
shown in the example.
./konvoy-image build images/<builder-type>/<os>-<version>.yaml --overrides
overrides.yaml
Note: While CLI flags can be combined with override files, CLI flags take priority over any override files.
Procedure
2. After creating the override file for your source_ami, you can pass your override file by using the --overrides flag
when building your image:
./konvoy-image build aws images/ami/centos-7.yaml --overrides override-source-
ami.yaml
vSphere Example
Procedure
1. Create your vSphere overrides file overrides/vcenter.yaml and fill in the relevant details for your vSphere
environment.
---
packer:
vcenter_server: <FQDN of your vcenter>
vsphere_username: <username for vcenter e.g. [email protected]>
vsphere_password: <password for vcenter>
ssh_username: builder
ssh_password: <password for your VMs builder user>
linked_clone: false
cluster: <vsphere cluster to use>
datacenter: <vsphere datacenter to use>
datastore: <vsphere datastore to use for templates>
folder: <vsphere folder to store the template in>
insecure_connection: "true"
2. After creating the override file for your source_ova, pass your override file by using the --overrides flag
when building your image:
konvoy-image build images/ova/ubuntu-2004.yaml \
--overrides overrides/kube-version.yaml \
--overrides overrides/vcenter.yaml
Pre-provisioned environments require specific override files to work correctly. The override files can contain HTTP
Proxy info and other factors, including whether you want the image to be FIPS-enabled, NVIDIA optimized, and
so on.
Those override files must also have a Secret that includes all of the overrides you wish to provide in one file.
Before Nutanix Kubernetes Platform 2.6, you had to specify the proxy in the KIB override setup and then again in the
nkp create cluster command, even though both always use the same proxy setting. As of NKP
2.6, an HTTP proxy gets created from the Konvoy flags for the control plane proxy and workers' proxy values. The
flags in the NKP command for Pre-provisioned clusters populate a Secret automatically in the bootstrap cluster. That
Secret has a known name that the Pre-provisioned controller finds and applies when it runs the KIB provisioning job.
More information about these flags is on the Clusters with HTTP or HTTPS Proxy on page 647 page.
The nodes that get Kubernetes installed on them through CAPP automatically receive the HTTP proxy Secrets that you set using
the flags. You no longer have to put the proxy information in both the overrides and the nkp create cluster command
as an argument.
For example, if you wish to provide an override with Docker credentials and a different source for EPEL on a
CentOS7 machine, you can create a file like this:
image_registries_with_auth:
- host: "registry-1.docker.io"
username: "my-user"
password: "my-password"
auth: ""
identityToken: ""
Note: For Red Hat Enterprise Linux (RHEL) Pre-provisioned using GPU, provide the following additional lines of
information in your override file:
rhsm_user: ""
rhsm_password: ""
Example:
image_registries_with_auth:
  - host: "registry-1.docker.io"
    username: "my-user"
    password: "my-password"
    auth: ""
    identityToken: ""
rhsm_user: ""
rhsm_password: ""
In some networked environments, the machines used for building images can reach the Internet, but only through
an HTTP or HTTPS proxy. For Nutanix Kubernetes Platform (NKP) to operate in these networks, you need a way
to specify what proxies to use. You can use an HTTP proxy override file to specify that proxy. When KIB tries
installing a particular OS package, it uses that proxy to reach the Internet to download it.
Important: The proxy setting specified here is NOT “baked into” the image - it is only used while the image is being
built. The settings are removed before the image is finalized.
While it might seem logical to include the proxy information in the image, the reality is that many companies have
multiple proxies - one perhaps for each geographical region or maybe even a proxy per datacenter or office. All
network traffic to the Internet goes through the proxy. If you were in Germany, you probably would not want to send
all your traffic to a U.S.-based proxy. Doing that slows traffic down and consumes too many network resources.
If you bake the proxy settings into the image, you must create a separate image for each region. Creating an image
without a proxy makes more sense, but remember that you still need a proxy to access the Internet. Thus, when
creating the cluster (and installing the Kommander component of NKP), you must specify the correct proxy settings
for the network environment into which you install the cluster. You will use the same base image for that cluster as
one installed in an environment with different proxy settings.
See HTTP Proxy Override Files.
Next Step
Either navigate to the main Konvoy Image Builder section of the documentation or back to your installation section:
• If you are using the Day 1 Basic Install instructions, proceed (or return) to that section, Basic Installs by Infrastructure
combinations, to install and set up NKP based on your infrastructure environment provider.
• If you are using the Custom Install instructions, proceed (or return) to that section and select the infrastructure provider
you are using: Custom Installation and Additional Infrastructure Tools
Related Information
For information on related topics or procedures, see Pre-Provisioned Override Files.
Section Contents
You can use an HTTP proxy configuration when creating your image. The Ansible playbooks create systemd drop-
in files for containerd and kubelet to configure the http_proxy, https_proxy, and no_proxy environment
variables for the service from the file /etc/konvoy_http_proxy.conf.
To configure a proxy for use during image creation, create a new override file and specify the following:
# Example override-proxy.yaml
---
http_proxy: http://example.org:8080
https_proxy: http://example.org:8081
You can use the NKP command to configure the KubeadmConfigTemplate object so that this file is created at machine
start-up with values supplied during the NKP invocation. This enables using different proxy settings for image
creation and runtime.
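For illustration, assuming the override above, the environment file and a containerd drop-in created on the machine might look similar to the following sketch (the drop-in path and variable names are assumptions; only /etc/konvoy_http_proxy.conf is named in the description above):
# /etc/konvoy_http_proxy.conf
HTTP_PROXY=http://example.org:8080
HTTPS_PROXY=http://example.org:8081
NO_PROXY=localhost,127.0.0.1
# /etc/systemd/system/containerd.service.d/konvoy-http-proxy.conf
[Service]
EnvironmentFile=/etc/konvoy_http_proxy.conf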
Next Steps
Pre-Provisioned Override Files or return to cluster creation in the Custom Installation and Additional
Infrastructure Tools under your Infrastructure Provider.
To create an air-gapped registry override for Docker Hub, use the following configuration:
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://my-local-registry.local/v2/harbor-registry","https://
registry-1.docker.io"]
Tell the override how to create a wildcard mirror with this configuration:
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."*"]
endpoint = ["https://my-local-registry.local/v2/harbor-registry"]
Additional Resources:
Section Contents
konvoy-image build
Reference pages for the konvoy-image build command.
Build and provision images:
Synopsis
Build and Provision images. Specifying AWS arguments is deprecated and will be removed in a future version. Use
the aws subcommand instead.
konvoy-image build <image.yaml> [flags]
Options
--ami-groups stringArray a list of AWS groups which are allowed use the
image, using 'all' result in a public image
--ami-regions stringArray a list of regions to publish amis
--ami-users stringArray a list AWS user accounts which are allowed use
the image
--containerd-version string the version of containerd to install
--dry-run do not create artifacts, or delete them after
creating. Recommended for tests.
--extra-vars strings flag passed Ansible's extra-vars
Section Contents
Examples
aws --region us-west-2 --source-ami=ami-12345abcdef images/ami/centos-79.yaml
Options
--ami-groups stringArray a list of AWS groups which are allowed use the
image, using 'all' result in a public image
--ami-regions stringArray a list of regions to publish amis
--ami-users stringArray a list AWS user accounts which are allowed use
the image
--containerd-version string the version of containerd to install
--dry-run do not create artifacts, or delete them after
creating. Recommended for tests.
--extra-vars strings flag passed Ansible's extra-vars
-h, --help help for aws
--instance-type string instance type used to build the AMI; the type
must be present in the region in which the AMI is built
--kubernetes-version string The version of kubernetes to install. Example:
1.21.6
--overrides strings a comma separated list of override YAML files
--packer-manifest string provide the path to a custom packer manifest
--packer-on-error string [advanced] set error strategy for packer.
strategies [cleanup, abort, run-cleanup-provisioner]
--packer-path string the location of the packer binary (default
"packer")
--region string the region in which to build the AMI
See Also
Examples
azure --location westus2 --subscription-id <sub_id> images/azure/centos-79.yaml
Options
--client-id string the client id to use for the build
--cloud-endpoint string Azure cloud endpoint. Which can be one of
[Public USGovernment China] (default "Public")
--containerd-version string the version of containerd to install
--dry-run do not create artifacts, or delete them
after creating. Recommended for tests.
--extra-vars strings flag passed Ansible's extra-vars
--gallery-image-locations stringArray a list of locations to publish the image
(default [westus])
--gallery-image-name string the gallery image name to publish the
image to
--gallery-image-offer string the gallery image offer to set (default
"nkp")
--gallery-image-publisher string the gallery image publisher to set
(default "nkp")
--gallery-image-sku string the gallery image sku to set
--gallery-name string the gallery name to publish the image in
(default "nkp")
-h, --help help for azure
--instance-type string the Instance Type to use for the build
(default "Standard_D2s_v3")
--kubernetes-version string The version of kubernetes to install.
Example: 1.21.6
--location string the location in which to build the image
(default "westus2")
--overrides strings a comma separated list of override YAML
files
--packer-manifest string provide the path to a custom packer
manifest
--packer-on-error string [advanced] set error strategy for packer.
strategies [cleanup, abort, run-cleanup-provisioner]
See Also
Examples
gcp ... images/gcp/centos-79.yaml
Options
--containerd-version string the version of containerd to install
--dry-run do not create artifacts, or delete them after
creating. Recommended for tests.
--extra-vars strings flag passed Ansible's extra-vars
-h, --help help for gcp
--kubernetes-version string The version of kubernetes to install. Example:
1.21.6
--network string the network to use when creating an image
--overrides strings a comma separated list of override YAML files
--packer-manifest string provide the path to a custom packer manifest
--packer-on-error string [advanced] set error strategy for packer.
strategies [cleanup, abort, run-cleanup-provisioner]
--packer-path string the location of the packer binary (default
"packer")
--project-id string the project id to use when storing created image
--region string the region in which to launch the instance (default
"us-west1")
--work-dir string path to custom work directory generated by the
generate command
See Also
Examples
vsphere --datacenter dc1 --cluster zone1 --datastore nfs-store1 --network public --template=nutanix-base-templates/nutanix-base-CentOS-7.9 images/ami/centos-79.yaml
Options
--cluster string vSphere cluster to be used. Alternatively set
host. Required value: you can pass the cluster name through an override file or image
definition file.
--containerd-version string the version of containerd to install
--datacenter string The vSphere datacenter. Required value: you can
pass the datacenter name through an override file or image definition file.
--datastore string vSphere datastore used to build and store the
image template. Required value: you can pass the datastore name through an override
file or image definition file.
--dry-run do not create artifacts, or delete them after
creating. Recommended for tests.
--extra-vars strings flag passed Ansible's extra-vars
--folder string vSphere folder to store the image template
-h, --help help for vsphere
--host string vSphere host to be used. Alternatively set
cluster. Required value: you can pass the host name through an override file or image
definition file.
--kubernetes-version string The version of kubernetes to install. Example:
1.21.6
--network string vSphere network used to build image template.
Ensure the host running the command has access to this network. Required value: you
can pass the network name through an override file or image definition file.
--overrides strings a comma separated list of override YAML files
--packer-manifest string provide the path to a custom packer manifest
--packer-on-error string [advanced] set error strategy for packer.
strategies [cleanup, abort, run-cleanup-provisioner]
--packer-path string the location of the packer binary (default
"packer")
--resource-pool string vSphere resource pool to be used to build image
template
--ssh-privatekey-file string Path to ssh private key which will be used to log
into the base image template
--ssh-publickey string Path to SSH public key which will be copied to the
image template. Ensure to set ssh-privatekey-file or load the private key into ssh-
agent
--ssh-username string username to be used with the vSphere image
template
--template string Base template to be used. Can include folder.
<templatename> or <folder>/<templatename>. Required value: you can pass the template
name through an override file or image definition file.
--work-dir string path to custom work directory generated by the
generate command
konvoy-image completion
Konvoy-image completion command pages.
Synopsis
Generate the autocompletion script for the konvoy-image for the specified shell. See each sub-command's help for
details on how to use the generated script.
Options
-h, --help help for completion
Section Contents
Synopsis
Generate the autocompletion script for the bash shell.
This script depends on the 'bash-completion' package. You can install it through your OS's package manager if it has
not been installed yet.
To load completions in your current shell session:
source <(konvoy-image completion bash)
To load completions for every new session, execute once:
Linux:
konvoy-image completion bash > /etc/bash_completion.d/konvoy-image
macOS
konvoy-image completion bash > $(brew --prefix)/etc/bash_completion.d/konvoy-image
You will need to start a new shell for this setup to take effect.
konvoy-image completion bash
Options
-h, --help help for bash
--no-descriptions disable completion descriptions
• konvoy-image completion - Generate the autocompletion script for the specified shell
Synopsis
Generate the autocompletion script for the fish shell.
To load completions in your current shell session:
konvoy-image completion fish | source
To load completions for every new session, execute once:
konvoy-image completion fish > ~/.config/fish/completions/konvoy-image.fish
You must start a new shell for this setup to take effect.
konvoy-image completion fish [flags]
Options
-h, --help help for fish
--no-descriptions disable completion descriptions
Synopsis
Generate the autocompletion script for the powershell.
To load completions in your current shell session:
konvoy-image completion powershell | Out-String | Invoke-Expression
To load completions for every new session, add the output of the above command to your powershell profile.
konvoy-image completion powershell [flags]
Options
-h, --help help for powershell
--no-descriptions disable completion descriptions
Synopsis
Generate the autocompletion script for the zsh shell.
Linux
konvoy-image completion zsh > "${fpath[1]}/_konvoy-image"
macOS
konvoy-image completion zsh > $(brew --prefix)/share/zsh/site-functions/_konvoy-image
Options
-h, --help help for zsh
--no-descriptions disable completion descriptions
konvoy-image generate-docs
Konvoy-image generate-docs command page.
Generate docs in the path
Examples
generate-docs /tmp/docs
Options
-h, --help help for generate-docs
konvoy-image generate
Konvoy-image generate command pages.
Generate files relating to building images.
Synopsis
Generate files relating to building images. Specifying AWS arguments is deprecated and will be removed in a future
version. Use the aws subcommand instead.
konvoy-image generate <image.yaml> [flags]
Options
--ami-groups stringArray a list of AWS groups which are allowed use the
image, using 'all' result in a public image
Section Contents
Examples
aws --region us-west-2 --source-ami=ami-12345abcdef images/ami/centos-79.yaml
Options
Examples
azure --location westus2 --subscription-id <sub_id> images/azure/centos-79.yaml
Options
--client-id string the client id to use for the build
--cloud-endpoint string Azure cloud endpoint. Which can be one of
[Public USGovernment China] (default "Public")
See Also
konvoy-image provision
Konvoy-image provision command page.
Provision to an inventory.yaml or hostname. Note the comma at the end of the hostname.
konvoy-image provision <inventory.yaml|hostname,> [flags]
Examples
provision --inventory-file inventory.yaml
Options
--containerd-version string the version of containerd to install
--extra-vars strings flag passed Ansible's extra-vars
-h, --help help for provision
--inventory-file string an ansible inventory defining your infrastructure
--kubernetes-version string The version of kubernetes to install. Example:
1.21.6
--overrides strings a comma separated list of override YAML files
--provider string specify a provider if you wish to install provider
specific utilities
--work-dir string path to custom work directory generated by the
generate command
konvoy-image upload
Konvoy-image upload command pages.
Upload one of the [artifacts].
Options
-h, --help help for upload
Section Contents
Options
--container-images-dir string path to container images for install on remote
hosts.
--containerd-bundle string path to Containerd tar file for install on remote
hosts.
--extra-vars strings flag passed Ansible's extra-vars
-h, --help help for artifacts
--inventory-file string an ansible inventory defining your infrastructure
(default "inventory.yaml")
--nvidia-runfile string path to nvidia runfile to place on remote hosts.
--os-packages-bundle string path to os-packages tar file for install on
remote hosts.
--overrides strings a comma separated list of override YAML files
--pip-packages-bundle string path to pip-packages tar file for install on
remote hosts.
--work-dir string path to custom work directory generated by the
command
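As a hedged illustration only (no example appears in the original reference for this subcommand), an artifacts upload combining the flags above might look like the following; all paths are placeholders:
konvoy-image upload artifacts \
  --inventory-file inventory.yaml \
  --container-images-dir ./artifacts/images \
  --os-packages-bundle ./artifacts/<os-packages-bundle>.tar.gz \
  --containerd-bundle ./artifacts/<containerd-bundle>.tar.gz \
  --pip-packages-bundle ./artifacts/<pip-packages-bundle>.tar.gz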
Section Contents
konvoy-image validate
Konvoy-image validate command pages.
Options
--apiserver-port int apiserver port (default 6443)
-h, --help help for validate
--inventory-file string an ansible inventory defining your infrastructure
(default "inventory.yaml")
--pod-subnet string ip addresses used for the pod subnet (default
"192.168.0.0/16")
--service-subnet string ip addresses used for the service subnet (default
"10.96.0.0/12")
konvoy-image version
Konvoy-image version command pages.
Version for konvoy-image
konvoy-image version [flags]
Options
-h, --help help for version
Prerequisites
Review the following before starting your upgrade:
Upgrade Order
Perform the upgrade sequentially, beginning with the Kommander component and then moving on to the clusters and CAPI components in the Konvoy component. The process for upgrading your entire NKP product differs depending on your license and environment.
To get started, select the section that matches your environment and license:
Section Contents
Important:
• Attaching Kubernetes clusters with versions greater than n-1 is not recommended.
• Deploy Pod Disruption Budget (PDB). For more information, see https://kubernetes.io/docs/concepts/
workloads/pods/disruptions/.
• Konvoy Image Builder (KIB)
Procedure
1. Deploy a Pod Disruption Budget (PDB) for your critical applications. If your application can tolerate only one replica being unavailable at a time, you can set a PDB as shown in the sketch that follows this procedure. That example targets NVIDIA graphics processing unit (GPU) node pools, but the process is the same for all node pools.
3. Apply the YAML Ain't Markup Language (YAML) file using the command kubectl create -f pod-
disruption-budget-nvidia.yaml.
4. Prepare an OS image for your node pool using Konvoy Image Builder.
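The PDB example referenced in step 1 is not reproduced in this extract; the following is a minimal sketch only. The resource and file name follow step 3, but the namespace and the selector label (app: gpu-workload) are placeholders you must replace with the namespace and labels of your own critical application.
# Example pod-disruption-budget-nvidia.yaml (sketch; namespace and selector are placeholders)
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pod-disruption-budget-nvidia
  namespace: default # replace with your application's namespace
spec:
  maxUnavailable: 1 # allow at most one replica to be unavailable at a time
  selector:
    matchLabels:
      app: gpu-workload # placeholder; use your application's labels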
What to do next
For information on related topics or procedures, see Ultimate: Upgrade the Management Cluster Kubernetes
Version.
• Before upgrading, we strongly recommend reading the release notes and verifying your current setup against
any possible breaking changes.
• Review the Components and Application versions that are part of this upgrade.
• Ensure the applications you are running will be compatible with Kubernetes v1.27.x.
• If you have attached clusters, ensure your applications will be compatible with Kubernetes v1.27.x.
• Review the list of major Kubernetes changes that may affect your system.
• Ensure you are attempting to run a supported NKP upgrade. For more information, see Upgrade NKP on
page 1089.
• REQUIRED: Create a backup of your current configuration with Velero before upgrading. For more information, see Backup and Restore on page 544.
• Download and install this release's supported Nutanix Kubernetes Platform (NKP) CLI binary on your computer.
The remaining prerequisites are necessary if you have one of the following environments or additions to basic
Kommander. Otherwise, proceed to the prerequisites for the Konvoy component.
• There are several sets of images you will need to push to your local registry. For more information, see Upgrade: For Air-gapped Environments Only on page 1094.
• For air-gapped environments with catalog applications: Ensure you have updated your catalog repository before upgrading. The catalog repository contains the Docker registry with the NKP catalog application images (see Workspace Catalog Application Upgrade on page 412) and a charts bundle file containing the NKP catalog application charts (see Downloading NKP on page 16).
Note: If your current Catalog application version is incompatible with this release’s Kubernetes version, upgrade
the application to a compatible version BEFORE upgrading the Konvoy component on Managed clusters or
BEFORE upgrading the Kubernetes version on Attached clusters.
Related Topics
Additional information resources for upgrading NKP.
For more information on supported components and applications, see the NKP Release Notes at https://portal.nutanix.com/page/documents/details?targetId=Release-Notes-Nutanix-Kubernetes-Platform-v2_12:rel-release-notes-nkp-v2_12-r.html.
• Verify your current Nutanix Kubernetes Platform (NKP) version using the CLI command nkp version.
• Set the environment variables:
• For Air-gapped environments, download the required bundles from Nutanix at https://support.d2iq.com/hc/en-us.
• For Azure, set the required environment variables. For more information, see Azure Prerequisites on
page 835.
• For AWS and EKS, set the required AWS infrastructure. For more information, see AWS Installation on page 156.
• For vSphere, see vSphere Prerequisites: All Installation Types on page 249.
• For GCP, set the required Google Cloud Platform (GCP) Infrastructure.
• vSphere only: If you want to resize your disk, ensure you have reviewed Create a vSphere Base OS Image.
Topic: List of third-party and open source attributions.
Link: https://d2iq.com/legal/3rd
Next Steps
Links to your next step after completing the Nutanix Kubernetes Platform (NKP) upgrade prerequisites.
Depending on your license type, you will follow the relevant link:
Section Contents
Procedure
1. Download the complete Nutanix Kubernetes Platform (NKP) air-gapped bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz) to load registry images as explained below. For the download link, see Downloading NKP on page 16.
• Both management and attached clusters must be able to connect to the local registry.
• The management cluster must be able to connect to all attached cluster’s Application Programming Interface
(API) servers.
• The management cluster must be able to connect to any load balancers created for platform services on the
management cluster.
Procedure
1. Extract the tarball to a local directory using the command tar -xzvf nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz.
2. Subsequent steps use commands that access files from different directories in the extracted directory structure.
For example, for the bootstrap, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location.
cd nkp-v2.12.0
3. Set environment variables with your registry address and credentials using the following commands:
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
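For example, assuming a private registry reachable at registry.example.com on port 5000 (the hostname, port, and credentials are illustrative only):
export REGISTRY_URL="https://registry.example.com:5000"
export REGISTRY_USERNAME=admin
export REGISTRY_PASSWORD=example-password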
Procedure
1. The Kubernetes image bundle is located in kib/artifacts/images. Verify that the image and artifacts exist.
b. To verify that the artifacts for your OS exist in the directory, use the following command.
$ ls kib/artifacts/
c. Set the bundle values to the artifact file names for your OS by exporting the OS_PACKAGES_BUNDLE and CONTAINERD_BUNDLE variables, as shown in the example.
export OS_PACKAGES_BUNDLE=name_of_the_OS_package
export CONTAINERD_BUNDLE=name_of_the_containerd_bundle
For example, for RHEL 8.4 set the following:
export OS_PACKAGES_BUNDLE=1.29.6_redhat_8_x86_64.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.28-nutanix.1-rhel-8.4-x86_64.tar.gz
Important: If you do not already have a local registry set up, refer to the Local Registry Tools page for more information.
Execute the following command to load the air-gapped image bundle into your private registry:
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-registry=$REGISTRY_URL --to-registry-username=$REGISTRY_USERNAME --to-registry-password=$REGISTRY_PASSWORD
It may take some time to push all the images to your image registry, depending on the performance of the network
between the machine you are running the script on and the registry.
Next Step
Depending on your license type, you will follow the relevant link:
Note: Before you begin the upgrade, ensure you review the Upgrade Prerequisites.
This section describes how to upgrade your Nutanix Kubernetes Platform (NKP) clusters to the latest NKP version
in an Ultimate, multi-cluster, environment. The NKP upgrade represents an important step of your environment’s life
cycle, as it ensures that you are up-to-date with the latest features and can benefit from the most recent improvements,
enhanced cluster management, and better performance. If this scenario of the upgrade documentation applies to you, you have an Ultimate license. If you have an NKP Pro license, refer to the NKP Upgrade Pro section instead.
Important: Ensure that all platform applications in the management cluster have been upgraded to avoid compatibility
issues with the Kubernetes version included in this release. This is done automatically when upgrading
Kommander, so ensure that you follow these sections in order to upgrade the Kommander component prior to
upgrading the Konvoy component.
Overview of Steps
Perform the following steps in the order presented.
1. Ultimate: For Air-gapped Environments Only
2. Ultimate: Upgrade the Management Cluster and Platform Applications
3. Ultimate: Upgrade Platform Applications on Managed and Attached Clusters
4. Ultimate: Upgrade Workspace NKP Catalog Applications
5. Ultimate: Upgrade Project Catalog Applications
6. Ultimate: Upgrade Custom Applications
7. Ultimate: Upgrade the Management Cluster CAPI Components
8. Ultimate: Upgrade the Management Cluster Core Addons
Next Step:
• If you are in an air-gapped environment, first load your local registry. For more information, see Ultimate: For Air-gapped Environments Only.
• For all other environments, your next step is to Upgrade the Management Cluster and Platform Applications.
Section Contents
• Download the complete NKP air-gapped bundle for this release (nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz)
• Both management and attached clusters must be able to connect to the local registry.
• The management cluster must be able to connect to all attached cluster’s API servers.
• The management cluster must be able to connect to any load balancers created for platform services on the
attached cluster.
Note: You can also attach clusters when there are networking restrictions between the management cluster and
attached cluster. See Attach a Cluster with Networking Restrictions for more information.
Section Contents
Procedure
1. Extract the tarball to a local directory using the command tar -xzvf nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz.
2. Subsequent steps use commands that access files from different directories in the extracted directory structure. For example, for the bootstrap, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location.
cd nkp-v2.12.0
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Kommander
To load the air-gapped Kommander image bundle into your private registry, use the following command:
nkp push bundle --bundle ./container-images/kommander-image-bundle-v2.12.0.tar --to-registry=$REGISTRY_URL --to-registry-username=$REGISTRY_USERNAME --to-registry-password=$REGISTRY_PASSWORD
To load the NKP Catalog Applications image bundle into your private registry, use the following command:
nkp push bundle --bundle ./container-images/nkp-catalog-applications-image-bundle-v2.12.0.tar --to-registry=$REGISTRY_URL --to-registry-username=$REGISTRY_USERNAME --to-registry-password=$REGISTRY_PASSWORD
Konvoy
Before creating or upgrading a Kubernetes cluster, you need to load the required images in a local registry if
operating in an air-gapped environment. This registry must be accessible from both the Bastion Host on page 1019
and either the AWS EC2 instances or other machines that will be created for the Kubernetes cluster.
Note: If you do not already have a local registry set up, refer to Registry Mirror Tools on page 1017 page for more
information.
To load the air-gapped Konvoy image bundle into your private registry, use the following command:
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-registry=$REGISTRY_URL --to-registry-username=$REGISTRY_USERNAME --to-registry-password=$REGISTRY_PASSWORD
Note: It is important you upgrade Kommander BEFORE upgrading the Kubernetes version (or Konvoy version for
Managed Konvoy clusters) in attached clusters. This ensures that any changes required for new or changed Kubernetes
API’s are already present.
Important: Authentication token changes were made. In previous releases, you used the same token against clusters attached to the management cluster. In this release, users will be logged out of attached clusters until the upgrade process is complete. The kubeconfig must then be retrieved from the endpoint and shared with users again.
Prerequisites
• Use the --kubeconfig=${CLUSTER_NAME}.conf flag or set the KUBECONFIG environment variable to ensure
that you upgrade Kommander on the right cluster. For alternatives and recommendations around setting your
context, refer to Provide Context for Commands with a kubeconfig File.
• If you have NKP Insights installed, ensure you uninstall it before upgrading NKP. For more information, see
Upgrade to 1.0.0 (DKP 2.7.0)
• If you have configured a custom domain, running the upgrade command may make your services inaccessible through your custom domain for a few minutes.
• The NKP UI and other Application Programming Interfaces (API) may be inconsistent or unavailable
until the upgrade is complete.
Section Contents
Non-air-gapped Environments
To upgrade Kommander and all the Platform Applications in the Management Cluster, use the command nkp upgrade kommander.
Note: If you want to disable the Artificial Intelligence (AI) Navigator, add the flag --disable-appdeployments
ai-navigator-app to this command.
Example output:
# Ensuring upgrading conditions are met
# Ensuring application definitions are updated
# Ensuring helm-mirror implementation is migrated to ChartMuseum
...
After the upgrade, if you have Nutanix Kubernetes Platform (NKP) Catalog Applications deployed, proceed to
Update the NKP Catalog Applications GitRepository on page 1104.
Air-gapped Environments
To upgrade Kommander and all the Platform Applications in the Management Cluster, use the following command:
nkp upgrade kommander \
  --charts-bundle ./application-charts/nkp-kommander-charts-bundle-v2.7.1.tar.gz \
  --kommander-applications-repository ./application-repositories/kommander-applications-v2.7.1.tar.gz
Example output:
# Ensuring upgrading conditions are met
# Ensuring application definitions are updated
# Ensuring helm-mirror implementation is migrated to ChartMuseum
...
Troubleshooting
Troubleshooting tips for upgrading Ultimate management cluster and platform applications.
If the upgrade fails, perform the troubleshooting commands as needed.
• If the upgrade fails, run the nkp upgrade kommander -v 6 command to get more information on the upgrade
process.
• If you find any HelmReleases in a “broken” release state such as “exhausted” or “another rollback/release in progress”, you can trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease HELMRELEASE_NAME --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease HELMRELEASE_NAME --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Important: If you are upgrading your Platform applications as part of the NKP upgrade, upgrade your Platform
applications on any additional Workspaces before proceeding with the Konvoy upgrade. Some applications in the
previous release are not compatible with the Kubernetes version of this release, and upgrading Kubernetes is part of the
Nutanix Kubernetes Platform (NKP) Konvoy upgrade process.
Prerequisites
Before you begin, you must:
• Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace where the
cluster is attached using the command export WORKSPACE_NAMESPACE=workspace_namespace.
• Set the WORKSPACE_NAME environment variable to the name of the workspace where the cluster is attached using
the command export WORKSPACE_NAME=workspace_name
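For example, assuming a workspace named my-workspace whose namespace is my-workspace-namespace (both values are illustrative):
export WORKSPACE_NAMESPACE=my-workspace-namespace
export WORKSPACE_NAME=my-workspace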
Section Contents
Troubleshooting
If the upgrade fails or times out, retry the command with higher verbosity to get more information on the upgrade process:
• If the upgrade fails, run the nkp upgrade kommander -v 6 command to get more information on the upgrade
process.
• If you find any HelmReleases in a “broken” release state such as “exhausted” or “another rollback/release in progress”, you can trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease HELMRELEASE_NAME --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease HELMRELEASE_NAME --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
Considerations for Upgrading NKP Catalog Applications with Spark Operator Post NKP 2.7
Important: Starting in Nutanix Kubernetes Platform (NKP) 2.7, the Spark operator app is removed, resulting in the
complete uninstallation of the app after upgrading the NKP Catalog Applications GitRepository.
This section provides instructions on how you can continue using Spark in NKP 2.7 and after.
If you do not plan on using Spark, or if you are content with having Spark uninstalled automatically, you can skip this section and proceed to Update the NKP Catalog Applications GitRepository on page 1104.
Section Contents
Procedure
1. Set the WORKSPACE_NAMESPACE variable to the namespace where the spark-operator is deployed using the command export WORKSPACE_NAMESPACE=<WORKSPACE_NAMESPACE>.
5. Unsuspend the cluster Kustomization using the command kubectl -n ${WORKSPACE_NAMESPACE} patch
kustomization cluster --type='json' -p='[{"op" : "replace", "path": "/spec/suspend",
"value": false}]'
Note: If the Kustomization is left suspended, Kommander will be unable to function properly.
Procedure
3. Select the three dot menu from the bottom-right corner of the Spark application tile, and then select Uninstall.
4. Click Save.
Procedure
1. To get the WORKSPACE_NAMESPACE of your workspace, use the command nkp get workspaces.
Copy the values under the NAMESPACE column for your workspace.
3. To delete the spark-operator AppDeployment, use the command kubectl delete AppDeployment <spark operator appdeployment name> -n ${WORKSPACE_NAMESPACE}.
This results in the spark-operator Kustomizations being deleted, but the HelmRelease and default
ConfigMap remains in the cluster. From here, you can continue to manage the spark-operator through the
HelmRelease.
Section Contents
Note: For the following section, ensure you modify the most recent kommander.yaml configuration file. It
must be the file that reflects the current state of your environment. Reinstalling Kommander with an outdated
kommander.yaml overwrites the list of platform applications that are currently running in your cluster.
Procedure
1. In the kommander.yaml you are currently using for your environment, update the NKP Catalog Applications by
setting the correct NKP version.
Example:
...
# The list of enabled/disabled apps here should reflect the current state of the
# environment, including configuration overrides!
...
catalog:
  repositories:
    - name: nkp-catalog-applications
      labels:
        kommander.nutanix.io/project-default-catalog-repository: "true"
        kommander.nutanix.io/workspace-default-catalog-repository: "true"
        kommander.nutanix.io/gitapps-gitrepository-type: "nkp"
      path: ./nkp-catalog-applications-v2.7.1.tar.gz # modify this version to match the NKP upgrade version
...
2. Refresh the kommander.yaml to apply the updated tarball using the command nkp install kommander --installer-config kommander.yaml.
Note: Ensure the kommander.yaml is the Kommander Installer Configuration file you are currently using for your environment. Otherwise, your configuration will be overwritten and the previous configuration lost.
Procedure
1. Update the GitRepository with the tag of your updated Nutanix Kubernetes Platform (NKP) version on the kommander workspace using the command kubectl patch gitrepository -n kommander nkp-catalog-applications --type merge --patch '{"spec": {"ref":{"tag":"v2.7.1"}}}'.
Note: This command updates the catalog application repositories for all workspaces.
2. For any additional Catalog GitRepository resources created outside the kommander.yaml configuration (such as in a project or workspace namespace), set the WORKSPACE_NAMESPACE environment variable to the namespace of the workspace using the command export WORKSPACE_NAMESPACE=<workspace namespace>.
3. Update the GitRepository for the additional workspace using the command kubectl patch gitrepository -n ${WORKSPACE_NAMESPACE} nkp-catalog-applications --type merge --patch '{"spec": {"ref":{"tag":"v2.7.1"}}}'.
What to do next
Cleaning up Spark on page 1105
Cleaning up Spark
Procedure
1. To locate the WORKSPACE_NAMESPACE of your workspace, use the command nkp get workspaces.
Copy the values under the NAMESPACE column for your workspace.
3. To delete the spark-operator AppDeployment, use the command kubectl delete AppDeployment <spark operator appdeployment name> -n ${WORKSPACE_NAMESPACE}.
Upgrading with UI
Procedure
3. Select the three dot menu from the bottom-right corner of the desired application tile, and then select Edit.
4. Select the Version dropdown list, and select a new version. This dropdown list will only be available if there is a
newer version to upgrade to.
5. Click Save.
Note: The following commands are using the workspace name and not namespace.
• You can retrieve the workspace name by running the command nkp get workspaces.
• To view a list of the deployed apps to your workspace, run the command nkp get appdeployments
--workspace=.
Procedure
1. To see what app(s) and app versions are available to upgrade, use the command kubectl get apps -n
${WORKSPACE_NAMESPACE}
Note: You can reference the app version by going into the app name (e.g. <APP ID>-<APP VERSION>)
You can also use this command to display the apps and app versions, for example:
kubectl get apps -n ${WORKSPACE_NAMESPACE} -o jsonpath='{range .items[*]}{@.spec.appId}{"----"}{@.spec.version}{"\n"}{end}'
Example output:
kafka-operator----0.20.0
kafka-operator----0.20.2
kafka-operator----0.23.0-dev.0
zookeeper-operator----0.2.13
zookeeper-operator----0.2.14
The following example upgrades the Kafka Operator application, named kafka-operator-abc, in a workspace to version 0.25.1:
nkp upgrade catalogapp kafka-operator-abc --workspace=${WORKSPACE_NAME} --to-version=0.25.1
3. Repeat this process on each workspace where you have deployed the application.
Note: Platform applications cannot be upgraded on a one-off basis, and must be upgraded in a single process for
each workspace. If you attempt to upgrade a platform application with these commands, you receive an error and
the application is not upgraded.
Important: As of NKP 2.7, these are the supported versions of Catalog applications; all others have been deprecated:
• kafka-operator-0.25.1
• zookeeper-operator-0.2.16-nkp.1
If you plan on upgrading to NKP 2.7 or later, ensure that you upgrade these applications to the latest
compatible version.
• To find what versions of applications are available for upgrade, use the command kubectl get apps -n ${WORKSPACE_NAMESPACE}.
Important: To ensure you do not install images with known CVEs, specify a custom image for kafka and
zookeeper by following these instructions:
Note: Catalog applications must be upgraded to the latest version BEFORE upgrading the Konvoy components for
Managed clusters or Kubernetes version for attached clusters.
Section Contents
Upgrading with UI
Procedure
5. Select the three dot menu from the bottom-right corner of the desired application tile, and then select Edit.
6. Select the Version dropdown list, and select a new version. This dropdown list will only be available if there is a
newer version to upgrade to.
7. Click Save.
Procedure
1. To see what app(s) and app versions are available to upgrade use the command kubectl get apps -n
${PROJECT_NAMESPACE}.
Note: You can reference the app version by going into the app name (e.g. <APP ID>-<APP VERSION>)
You can also use this command to display the apps and app versions, for example:
kubectl get apps -n ${PROJECT_NAMESPACE} -o jsonpath='{range .items[*]}{@.spec.appId}{"----"}{@.spec.version}{"\n"}{end}'
Example output:
zookeeper-operator----0.2.13
zookeeper-operator----0.2.14
zookeeper-operator----0.2.15
2. Upgrade an application from the Nutanix Kubernetes Platform (NKP) CLI using the command nkp upgrade catalogapp <appdeployment-name> --workspace=my-workspace --project=my-project --to-version=<version.number>.
The following example shows the upgrade of the Zookeeper Operator application, named zookeeper-operator-abc, in a workspace to version 0.2.15:
nkp upgrade catalogapp zookeeper-operator-abc --workspace=my-workspace --to-version=0.2.15
Note: Platform applications cannot be upgraded on a one-off basis, and must be upgraded in a single process for
each workspace. If you attempt to upgrade a platform application with these commands, you receive an error and
the application is not upgraded.
Note: Ensure you validate any Custom Applications you run for compatibility issues against the Kubernetes version
in the new release. If the Custom Application’s version is not compatible with the Kubernetes version, do not continue
with the Konvoy upgrade. Otherwise, your custom Applications may stop running.
Note: For a Pre-provisioned air-gapped environment only, ensure you have reviewed: Ultimate: For Air-gapped
Environments Only on page 1098.
Note: Ensure your NKP configuration references the management cluster where you want to run the upgrade by
setting the KUBECONFIG environment variable, or using the --kubeconfig flag, in accordance with Kubernetes
conventions
Note: If you created CAPI components using flags to specify values, use those same flags during Upgrade to preserve
existing values while setting additional values.
• Refer to nkp create cluster aws CLI for flag descriptions for --with-aws-bootstrap-credentials
and --aws-service-endpoints
• Refer to the HTTP section for details: Clusters with HTTP or HTTPS Proxy on page 647
nkp upgrade capi-components
The output resembles the following:
# Upgrading CAPI components
# Waiting for CAPI components to be upgraded
# Initializing new CAPI components
If the upgrade fails, review the prerequisites section and ensure that you’ve followed the steps in the Upgrade NKP
on page 1089 overview. Furthermore, ensure you have adhered to the Prerequisites at the top of this page.
Note: If you have modified any of the ClusterResourceSet definitions, these changes are not preserved when running the command nkp upgrade addons <provider>. You must specify the cloud provider and use the --dry-run -o yaml options to save the new configuration to a file, and then reapply the same changes upon each upgrade.
Your cluster comes preconfigured with a few different core addons that provide functionality to your cluster upon
creation. These include: CSI, Container Network Interface (CNI), Cluster Autoscaler, and Node Feature Discovery.
New versions of NKP may come pre-bundled with newer versions of these addons.
Perform the following steps to update these addons:
1. If you have any additional managed clusters, you will need to upgrade the core addons and Kubernetes version for
each one.
2. Ensure your NKP configuration references the management cluster where you want to run the upgrade by
setting the KUBECONFIG environment variable, or using the --kubeconfig flag, in accordance with Kubernetes
conventions.
3. Upgrade the core addons in a cluster using the nkp upgrade addons command, specifying the cluster infrastructure (choose aws, azure, vsphere, vcd, eks, gcp, or preprovisioned) and the name of the cluster, as shown in the example after the note below.
Note: If you need to verify or discover your cluster name to use with this example, first run the kubectl get
clusters command.
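For example, for a cluster on AWS (a hedged sketch; substitute your own provider and cluster name, which you can confirm as described in the note above):
nkp upgrade addons aws --cluster-name=${CLUSTER_NAME}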
Additional References:
For more information, see:
Note: If you have FIPS clusters, review the additional considerations for FIPS configurations:
Procedure
3. If you have any additional managed clusters, you need to upgrade the core addons and Kubernetes version for
each one. Attached clusters need the Kubernetes version upgraded to a supported Nutanix Kubernetes Platform
(NKP) version using the tool that created that cluster.
4. During an upgrade, you need to create new AMIs or images with the Kubernetes version to which you want to upgrade. This requires selecting your OS and ensuring you have the currently supported version. Build a new image if applicable.
• If an AMI was specified when initially creating a cluster for AWS, you must build a new one with Create a Custom AMI on page 1039 and set the flag(s) in the update commands: either the AMI ID flag --ami AMI_ID, or the lookup image flags --ami-owner AWS_ACCOUNT_ID, --ami-base-os ubuntu-20.04, and --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'.
Important:
The AMI lookup method will return an error if the lookup uses the upstream CAPA account ID.
• If an Azure Machine Image was specified for Azure, you must build a new one with Using KIB with Azure on page 1045.
• If a vSphere template Image was specified for vSphere, you must build a new one Using KIB with vSphere
on page 1052.
• You must build a new GCP image Using KIB with GCP on page 1048.
5. Upgrade the Kubernetes version of the control plane. Each cloud provider has distinctive commands. Select the
drop down list next to your provider for compliant CLI.
AWS Example:
nkp update controlplane aws --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.28.7
Or, for an AMI ID lookup example:
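The lookup-based example itself is not reproduced in this extract; a hedged sketch using the lookup flags listed in step 4 above (the account ID, base OS, and format values are illustrative) might look like the following:
nkp update controlplane aws --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.28.7 \
  --ami-owner AWS_ACCOUNT_ID --ami-base-os ubuntu-20.04 \
  --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'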
Note: If you created your initial cluster with a custom AMI using the --ami flag, it is required to set the --ami
flag during the Kubernetes upgrade.
Note: Some advanced options are available for various providers. To see all the options for your particular provider, use the command nkp update controlplane aws|vsphere|preprovisioned|azure|gcp|eks --help. For example, for AWS the advanced options include --ami and --instance-type.
Tip: The command nkp update controlplane {provider} has a 30 minute default timeout for the update process to finish. If you see the error "timed out waiting for the condition", you can check the control plane nodes version using the command kubectl get machines -o wide --kubeconfig $KUBECONFIG before trying again.
6. Upgrade the Kubernetes version of your node pools. Upgrading a nodepool involves draining the existing nodes
in the nodepool and replacing them with new nodes. In order to ensure minimum downtime and maintain high
availability of the critical application workloads during the upgrade process, we recommend deploying Pod
Disruption Budget (Disruptions) for your critical applications. For more information, see Updating Cluster Node
Pools on page 1028.
a. List all node pools available in your cluster by using the command nkp get nodepool --cluster-name
${CLUSTER_NAME}.
b. Select the nodepool you want to upgrade using the command export NODEPOOL_NAME=my-nodepool.
c. Update the selected nodepool using the command for your cloud provider as shown in the examples. The
first example command shows AWS language, so select the dropdown list for your provider for the correct
command. Execute the update command for each of the node pools listed in the previous command.
AWS Example:
If you created your initial cluster with a custom AMI using the --ami flag, it is required to set the --ami flag
during the Kubernetes upgrade.
nkp update nodepool aws ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6
Azure Example:
nkp update nodepool azure ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --compute-gallery-id <Azure Compute Gallery built by KIB for Kubernetes v1.29.6>
If the --plan-offer, --plan-publisher and --plan-sku fields were specified in the override file during
image creation, the flags must be used in upgrade:
Example:
--plan-offer rockylinux-9
--plan-publisher erockyenterprisesoftwarefoundationinc1653071250513
--plan-sku rockylinux-9
vSphere Example:
nkp update nodepool vsphere ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --vm-template <vSphere template built by KIB for Kubernetes v1.29.6>
VCD Example:
nkp update nodepool vcd ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6
When all nodepools have been updated, your upgrade is complete. For the overall process for upgrading
to the latest version of NKP, refer back to Upgrade NKP on page 1089 for more details.
Procedure
1. Using the kubeconfig of your management cluster, find your cluster name and be sure to copy the information for
all of your clusters.
Example:
kubectl get clusters -A
3. Set your cluster's workspace variable using the command export CLUSTER_WORKSPACE=<your-workspace-
namespace>.
4. Then, upgrade the core addons (replacing aws with the infrastructure provider you are using) using the command nkp upgrade addons aws --cluster-name=${CLUSTER_NAME} -n ${CLUSTER_WORKSPACE}.
Note: First complete the upgrade of your Kommander Management Cluster before upgrading any managed clusters.
Procedure
1. To begin upgrading the Kubernetes version, use the command nkp update controlplane and review the example that applies to your cloud provider.
AWS Example:
nkp update controlplane aws --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 -n ${CLUSTER_WORKSPACE}
EKS Example:
nkp update controlplane eks --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.25.11 -n ${CLUSTER_WORKSPACE}
Azure Example:
nkp update controlplane azure --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --compute-gallery-id <Azure Compute Gallery built by KIB for Kubernetes v1.29.6> -n ${CLUSTER_WORKSPACE}
vSphere Example:
nkp update controlplane vsphere --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --vm-template <vSphere template built by KIB for Kubernetes v1.29.6> -n ${CLUSTER_WORKSPACE}
VCD Example:
nkp update controlplane vcd --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --catalog <tenant catalog image> --vapp-template <vApp template built in vSphere KIB for Kubernetes v1.29.6> -n ${CLUSTER_WORKSPACE}
GCP Example:
nkp update controlplane gcp --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --image=projects/${GCP_PROJECT}/global/images/<GCP image built by KIB for Kubernetes v1.29.6> -n ${CLUSTER_WORKSPACE}
Pre-provisioned Example:
nkp update controlplane preprovisioned --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 -n ${CLUSTER_WORKSPACE}
2. Get a list of all node pools available in your cluster by running the command nkp get nodepools -c ${CLUSTER_NAME} -n ${CLUSTER_WORKSPACE}, and then set the node pool you want to upgrade, for example export NODEPOOL_NAME=my-nodepool (see the sketch after this step for updating each node pool).
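A hedged AWS sketch of that node-pool update, combining the provider command shown earlier in this section with the workspace namespace flag used throughout this procedure (adjust the provider and Kubernetes version to your environment):
nkp update nodepool aws ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 -n ${CLUSTER_WORKSPACE}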
What to do next
Upgrade Attached Cluster
Since any attached clusters you have are managed by their corresponding cloud provider, none of the components are
upgraded using the Nutanix Kubernetes Platform (NKP) process. The tool used to create the cluster is used to upgrade
it. Ensure that it has the Kubernetes version that is compatible with the versions we support.
Procedure
1. Ensure that your workloads that use this Zookeeper cluster are compatible with Zookeeper 3.7.2.
2. Set the spec.image.repository and spec.image.tag fields of each ZookeeperCluster custom resource to reference the new image. For example, you can use kubectl patch to upgrade the image in a ZookeeperCluster named zk-3 that has been manually created in a namespace ws-1, as follows.
kubectl patch zookeepercluster -n ws-1 zk-3 --type='json' -p='[{"op": "replace", "path": "/spec/image/repository", "value": "ghcr.io/mesosphere/zookeeper"}, {"op": "replace", "path": "/spec/image/tag", "value": "0.2.15-nutanix"}]'
What to do next
For more information, refer to the Zookeeper Operator documentation: https://github.com/pravega/zookeeper-operator/blob/master/README.md#trigger-the-upgrade-manually
Procedure
1. Ensure that all workloads that use this Kafka cluster are compatible with this version of Kafka.
2. Identify the current version of Kafka, which is specified in the .spec.clusterImage field of the
KafkaCluster resource.
4. Wait until the kafka-operator reconciles the broker protocol change. The expected result is for the kubectl
get -A kafkaclusters to report a ClusterRunning status for the KafkaCluster.
6. Wait until kafka-operator fully applies the image change. After that, the cluster will run a new version of
Kafka, but will still use the old protocol version.
7. Verify the behavior and performance of this Kafka cluster and workloads that use it.
Important: Do not postpone this verification. The next step makes a rollback to the previous Kafka version
impossible.
8. Bump the protocol version to the current one by modifying the spec.readOnlyConfig field so that
inter.broker.protocol.version equals the new Kafka version.
Example KafkaCluster manifest excerpt:
...
spec:
  clusterImage: "ghcr.io/banzaicloud/kafka:2.13-3.4.1"
  ...
  readOnlyConfig: |
    ...
    inter.broker.protocol.version=3.4.1
    ...
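One hedged way to make the edits described above (the spec.clusterImage field and the inter.broker.protocol.version line inside spec.readOnlyConfig) is to open the resource interactively; kubectl edit is a generic alternative to patching and is not prescribed by this guide. The namespace and resource name are placeholders:
kubectl edit kafkacluster -n <namespace> <kafkacluster-name>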
9. Wait until kafka-operator fully applies the protocol version change. The upgrade of the Kafka image for this
cluster is now complete.
Prerequisites
Review the following before starting your upgrade:
• Upgrade Prerequisites.
• If you are in an air-gapped environment, also see, NKP Pro: For Air-gapped Environments Only on
page 1119.
Proceed to the NKP Pro: Upgrade the Cluster and Platform Applications on page 1120 section and begin your
upgrade.
Section Contents
• Download the complete Nutanix Kubernetes Platform (NKP) air-gapped bundle for this release (that is, nkp-air-gapped-bundle_v2.12.0_linux_amd64.tar.gz). For more information, see Downloading NKP on page 16.
• Connectivity: Your Pro cluster must be able to connect to the local registry.
Section Contents
Procedure
2. Subsequent steps use commands that access files from different directories in the extracted directory structure. For example, for the bootstrap, change your directory to the nkp-<version> directory, similar to the example below, depending on your current location.
Example:
cd nkp-v2.12.0
3. Set environment variables with your registry address and credentials using the following commands:
export REGISTRY_URL="<https/http>://<registry-address>:<registry-port>"
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
Note: If you do not already have a local registry set up, see Local Registry Tools Compatible with NKP on page 1018 for more information.
To load the air-gapped image bundle into your private registry, use the following command:
nkp push bundle --bundle ./container-images/konvoy-image-bundle-v2.12.0.tar --to-registry=${REGISTRY_URL} --to-registry-username=${REGISTRY_USERNAME} --to-registry-password=${REGISTRY_PASSWORD}
Note: It may take some time to push all the images to your image registry, depending on the performance of the
network between the machine you are running the script on and the registry.
Note: It is important you upgrade Kommander BEFORE upgrading the Kubernetes version. This ensures that any
changes required for new or changed Kubernetes API’s are already present.
Upgrade Kommander
Prerequisites
• Use the --kubeconfig=${CLUSTER_NAME}.conf flag or set the KUBECONFIG environment variable to ensure
that you upgrade Kommander on the right cluster. For alternatives and recommendations around setting your
context, see Commands within a kubeconfig File on page 31.
• If you have Nutanix Kubernetes Platform (NKP) Insights installed, ensure you uninstall it before upgrading NKP.
For more information, see Upgrade to 1.0.0 (DKP 2.7.0).
• If you have configured a custom domain, running the upgrade command can make your services inaccessible through your custom domain for a few minutes.
• The NKP UI and other APIs may be inconsistent or unavailable until the upgrade is complete.
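The upgrade command itself is not shown at this point in the extract; based on the equivalent Ultimate procedure earlier in this guide, upgrading Kommander and all the Platform Applications in the cluster is run as follows (air-gapped environments add the charts-bundle and applications-repository flags shown in the Ultimate section):
nkp upgrade kommander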
Note: If you want to disable the Artificial Intelligence (AI) Navigator, add the flag --disable-appdeployments
ai-navigator-app to this command.
Example output:
# Ensuring upgrading conditions are met
# Ensuring application definitions are updated
# Ensuring helm-mirror implementation is migrated to ChartMuseum
...
Troubleshooting
If the upgrade fails, get more information on the upgrade process using the command nkp upgrade kommander -v 6.
If you find any HelmReleases in a “broken” release state such as “exhausted” or “another rollback/release in
progress”, you can trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n kommander patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
• For a Pre-provisioned air-gapped environment only, ensure you have uploaded the artifacts.
• For air-gapped environments, ensure you have created the air-gapped bundle correctly.
Note: Ensure your NKP configuration references the management cluster where you want to run the upgrade by setting the KUBECONFIG environment variable, or using the --kubeconfig flag. For more information, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/.
Execute the upgrade command for the CAPI components using the command nkp upgrade capi-components
Example output:
# Upgrading CAPI components
# Waiting for CAPI components to be upgraded
Note:
If you created CAPI components using flags to specify values, use those same flags during Upgrade to
preserve existing values while setting additional values.
• Refer to nkp create cluster aws CLI for flag descriptions for --with-aws-bootstrap-
credentials and --aws-service-endpoints.
• Refer to the HTTP or HTTPS section for details: Configuring an HTTP or HTTPS Proxy on
page 644.
If the upgrade fails, review the prerequisites section and ensure that you’ve followed the steps in the Upgrade NKP
on page 1089 overview. Furthermore, ensure you have adhered to the Prerequisites at the top of this page.
Section Contents
If you modify any of the ClusterResourceSet definitions, these changes are not preserved when running the command nkp upgrade addons. You must use the --dry-run -o yaml options to save the new configuration to a file, and reapply the same changes upon each upgrade.
Your cluster comes preconfigured with a few different core addons that provide functionality to your cluster upon
creation. These include: CSI, Container Network Interface (CNI), Cluster Autoscaler, and Node Feature Discovery.
New versions of NKP may come pre-bundled with newer versions of these addons.
Important: If you have more than one Pro cluster, ensure your nkp configuration references the cluster where you
want to run the upgrade by setting the KUBECONFIG environment variable, or using the --kubeconfig flag, in
accordance with Kubernetes conventions. For information on Kubernetes conventions, see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/
Procedure
1. Ensure your nkp configuration references the cluster where you want to run the upgrade by setting the
KUBECONFIG environment variable, or use the --kubeconfig flag.
What to do next
For more NKP CLI command help, see NKP upgrade addons CLI.
Procedure
1. Upgrade the control plane first using the infrastructure specific command.
2. Upgrade the node pools second using the infrastructure specific command.
• If an AMI was specified when initially creating a cluster for AWS, you must build a new one with Using Konvoy Image Builder on page 745 and set the flag(s) in the update commands: either the AMI ID flag --ami AMI_ID, or the lookup image flags --ami-owner AWS_ACCOUNT_ID, --ami-base-os ubuntu-20.04, and --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'.
Important: The AMI lookup method will return an error if the lookup uses the upstream CAPA account ID.
• If an Azure Machine Image was specified for Azure, you must build a new one with Using KIB with Azure
on page 1045.
• If a vSphere template Image was specified for vSphere, you must build a new one with Using KIB with
vSphere on page 1052.
• You must build a new GCP image with Using KIB with GCP on page 1048.
4. Upgrade the Kubernetes version of the control plane. Each cloud provider has distinctive commands. Below is the
AWS command example. Select the dropdown list menu next to your provider for compliant CLI.
Examples:
• AWS
Note: The first example below is for AWS. If you created your initial cluster with a custom AMI using the --
ami flag, it is required to set the --ami flag during the Kubernetes upgrade.
• Azure
nkp update controlplane azure --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --compute-gallery-id <Azure Compute Gallery built by KIB for Kubernetes v1.29.6>
If these fields were specified in the Default Override Files on page 1068 during Azure: Creating an Image on page 311, the flags must be used in upgrade: --plan-offer, --plan-publisher, and --plan-sku.
--plan-offer rockylinux-9
--plan-publisher erockyenterprisesoftwarefoundationinc1653071250513
--plan-sku rockylinux-9
• vSphere
nkp update controlplane vsphere --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --vm-template <vSphere template built by KIB for Kubernetes v1.29.6>
• VCD
nkp update controlplane vcd --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --catalog <tenant catalog image> --vapp-template <vApp template built in vSphere KIB for Kubernetes v1.29.6>
• GCP
nkp update controlplane gcp --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 --image=projects/${GCP_PROJECT}/global/images/<GCP image built by KIB for Kubernetes v1.29.6>
• Pre-provisioned
nkp update controlplane preprovisioned --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6
• EKS
nkp update controlplane eks --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.26.6
• Example output, showing the provider name corresponding to the CLI you executed from the choices above.
Updating control plane resource controlplane.cluster.x-k8s.io/v1beta1, Kind=KubeadmControlPlane default/my-aws-cluster-control-plane
Waiting for control plane update to finish.
# Updating the control plane
Tip: Some advanced options are available for various providers. To see all the options for your particular provider,
run the command nkp update controlplane aws|vsphere|preprovisioned|azure|gcp|eks
--help
Note: The command nkp update controlplane {provider} has a 30 minute default
timeout for the update process to finish. If you see the error "timed out waiting for the
condition“, you can check the control plane nodes version using the command kubectl get
machines -o wide --kubeconfig $KUBECONFIG before trying again.
5. Upgrade the Kubernetes version of your node pools. Upgrading a node pool involves draining the existing nodes in the node pool and replacing them with new nodes. To ensure minimum downtime and maintain high availability of critical application workloads during the upgrade, we recommend deploying Pod Disruption Budgets for your critical applications (see the example manifest below). For more information, see Updating Cluster Node Pools on page 1028.
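As an illustration only (not an NKP-specific manifest; the name, namespace, label, and minAvailable value are placeholders), a minimal Pod Disruption Budget for a critical workload could look like this:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: critical-app-pdb
  namespace: default
spec:
  minAvailable: 1              # keep at least one replica available while nodes drain
  selector:
    matchLabels:
      app: critical-app        # match the labels on your critical workload's pods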
a. First, get a list of all node pools available in your cluster by using the command nkp get nodepool --
cluster-name ${CLUSTER_NAME}
b. Select the nodepool you want to upgrade using the command export NODEPOOL_NAME=my-nodepool.
c. Then update the selected node pool using the command below. Upgrading a node pool involves draining the existing nodes in the node pool and replacing them with new nodes. We recommend deploying Pod Disruption Budgets for your critical applications; refer to Updating Cluster Node Pools on page 1028 for more information. The first example below is for AWS; use the command that corresponds to your provider.
Note: If you created your initial cluster with a custom AMI using the --ami flag, it is required to set the --ami flag during the Kubernetes upgrade.
• AWS Example:
nkp update nodepool aws ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6
• Azure Example:
nkp update nodepool azure ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 \
  --compute-gallery-id <Azure Compute Gallery built by KIB for Kubernetes v1.29.6>
If these fields were specified in the Use Override Files with Konvoy Image Builder on page 1067 during Azure: Creating an Image on page 311, the following flags must also be used in the upgrade: --plan-offer, --plan-publisher, and --plan-sku. For example:
--plan-offer rockylinux-9
--plan-publisher erockyenterprisesoftwarefoundationinc1653071250513
--plan-sku rockylinux-9
• vSphere Example:
nkp update nodepool vsphere ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 \
  --vm-template <vSphere template built by KIB for Kubernetes v1.29.6>
• VCD Example:
nkp update nodepool vcd ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 \
  --catalog <tenant catalog vApp template> --vapp-template <vApp template built in vSphere KIB for Kubernetes v1.29.6>
• GCP Example:
nkp update nodepool gcp ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6 \
  --image=projects/${GCP_PROJECT}/global/images/<GCP image built by KIB for Kubernetes v1.29.6>
• Pre-provisioned Example:
nkp update nodepool preprovisioned ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.29.6
• EKS Example:
nkp update nodepool eks ${NODEPOOL_NAME} --cluster-name=${CLUSTER_NAME} --kubernetes-version=v1.26.6
Note: AI Navigator is not supported in air-gapped implementations to help support heightened security in public
sector, defense, and network-restricted customer environments.
Section Content
AI Navigator Guidelines
General usage guidelines for AI Navigator.
In addition to applying any of your own organization's specific rules about the kinds of prompts you should enter,
Nutanix recommends that you NOT enter sensitive information, including:
Installing AI Navigator
When installing this version of NKP for the first time, you can choose whether to install AI Navigator.
To install the chatbot application, perform the installation according to the procedures for your chosen infrastructure provider; NKP installs AI Navigator by default.
Disabling AI Navigator
Disable AI Navigator app using CLI.
Procedure
After generating the Kommander configuration file, modify the ai-navigator-app entry in the apps: section of the file, as sketched below.
For more information on modifying the Kommander configuration file, see Installing Kommander in an Air-gapped Environment on page 965
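As a rough sketch only (the entries in your generated kommander.yaml will differ; the kube-prometheus-stack line below is illustrative), removing or commenting out the ai-navigator-app entry prevents the chatbot from being installed:
apiVersion: config.kommander.mesosphere.io/v1alpha1
kind: Installation
apps:
  # ai-navigator-app: {}        # commented out so AI Navigator is not installed
  kube-prometheus-stack: {}     # illustrative entry; keep the other apps as generated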
Procedure
2. Follow the instructions to add a CLI flag to the listed nkp upgrade kommander command to disable the
chatbot.
Related Information
Links to related AI Navigator topics.
The following provides additional information on the installation and upgrade of AI Navigator.
• For information about the Kommander configuration file in the topic, see Kommander Configuration
Reference on page 986
• If you want AI Navigator to include live cluster data, see NKP AI Navigator Cluster Info Agent: Obtain Live Cluster Information.
1. Click the Artificial Intelligence (AI) Navigator icon in the lower right corner.
Note: AI Navigator maintains your query history for the duration of your browser session, whether or not you close
the AI Navigator application.
AI Navigator Queries
Prompt engineering is the process of crafting prompts so that a chatbot returns the data you really want. For AI chatbots trained on very large language models, there are some tips you can use to get better information with fewer prompt attempts.
Some of those techniques are useful with the Artificial Intelligence (AI) Navigator as well.
When you create a prompt, an AI chatbot breaks down your entry into discrete parts to help it search its model. The
more precise you can make your prompt, the better the chatbot is at returning the information you wanted.
Another technique for getting to the desired information is called fine-tuning. This involves adjusting the parameters
and database model that the chatbot has to search. Nutanix Kubernetes Platform (NKP) fine-tunes the AI Navigator
model by adding both the NKP and Insights documentation, and the NKP Support Knowledge Base. This helps to
keep answers focused and fast.
In this specific YAML, the PDB is configured with the following properties:
selector: This specifies the labels used to select the pods that are part of
the "nvidia-critical-app". In this case, it selects the pods with the label
app: nvidia-critical-app.
Overall, this YAML is creating a Pod Disruption Budget that allows only one
pod of the "nvidia-critical-app" to be unavailable at a time during disruptions.
Links
-------
[Update Cluster Nodepools](https://beta-docs.nutanix.com/nkp/2.6/update-nvidia-gpu-
cluster-nodepools)
[Setting Priority Classes in NKP Applications](https://beta-docs.nutanix.com/nkp/2.6/
setting-priority-classes-in-nkp-applications)
[Project Quotas & Limit Ranges](https://beta-docs.nutanix.com/nkp/2.6/project-quotas-
limit-ranges)
[Configure the Control Plane](https://beta-docs.nutanix.com/nkp/2.6/configuring-the-
control-plane)
Replace <CLUSTER_NAME>, <TLS_THUMBPRINT>, and <BASE64 Encoded> with the actual values
specific to your environment.
Links
------
Updating vCenter Server TLS Certificate Thumbprint in NKP – Nutanix
Note: To help support heightened security in public sector, defense, and network-restricted customer environments, AI
Navigator is not offered for air-gapped implementations.
• Customers with an Ultimate license must ensure they have selected the Management Cluster Workspace before
deploying the Agent.
• The Cluster Info Agent requires a few hours to index the complete cluster dataset. Until then, you can still use the
NKP AI Navigator, but it will not include the entire cluster-specific dataset when providing answers.
Procedure
2. Ultimate only: Select the Management Cluster Workspace from the top navigation bar.
3. Select Applications from the sidebar and search for AI Navigator Cluster Info Agent.
4. Using the three-dot menu in the application card, select Edit > Configuration.
6. Uninstall the Cluster Info Agent: Uninstall the Cluster Info Agent application like any other application as
explained in Ultimate: Disabling an Application Using the UI on page 384
What Happens to My Data During an AI Navigator Query if the Cluster Info Agent is Enabled?
The Nutanix Kubernetes Platform does not store any type of data related to your queries, nor Pro or Management
cluster data. Further, no data is stored by the Azure OpenAI Service. Your data is not available to OpenAI, Nutanix,
or any other customers, and your data is not used to improve the OpenAI model. To learn more about the data
• nodes
• pods
• services
• events
• endpoints
• deployments
• statefulsets
• daemonsets
• replicasets
• ingresses
• jobs
• cronjobs
• helm.toolkit.fluxcd.io/helmreleases
Supported Documentation
You can access all n-2 supported documentation at
Archived Documentation
In accordance with our , we regularly archive older, unsupported versions of our documentation. At this time, this
includes documentation for:
Scanning Policy
Our procedure for managing CVEs is explained in the sections below.
• Our primary objective is to provide software that is free from critical security vulnerabilities (CVEs) at the time of
delivery.
• We conduct regular scans of our software components, including:
• Kubernetes
• Nutanix Platform applications such as Traefik, Istio, and so on.
• Nutanix Catalog applications (only versions that are compatible with the default Kubernetes version supported with a respective NKP release) are listed in Workspace Catalog Applications on page 406.
• NKP Insights Add-on
• Scans are performed every 24 hours using the latest CVE database to identify and address potential vulnerabilities promptly. When results are published, the CVE identifier, criticality, and the release tied to a mitigation or remediation are included with those results.
• Security Advisories are published for discovered Critical CVEs.
Shipping Policy
• Our objective is to ship software releases that do not have Critical CVEs for which no mitigation or remediation is available.
• For major and minor releases, our objective is to ship only when there are no known Critical CVEs without an available mitigation.
• A patch for a critical CVE might be provided in a minor release or a patch release dependent on the component.
• We prioritize resolving these issues in the next minor release to maintain our commitment to security.
• In the event that we discover a critical CVE for a Generally Available (GA) version of our Software, a mitigation
or patch release will be targeted for release within 45 days from the date of publication or development, as
applicable.
More Information
For a list of mitigated CVEs, please refer to https://downloads.d2iq.com/dkp/cves/v2.12.0/nkp/mitigated.csv
• Project name
• Cluster name
• Description
• Type
From the NKP Insights Alert table, you can also toggle by each Severity level:
• Critical
• Warning
• Notice
Prerequisites
Note: The Management/Pro cluster comes with Rook Ceph, Rook Ceph Cluster, and Kube Prometheus Stack by
default. Deploy Rook Ceph and Rook Ceph Cluster on any Managed or Attached clusters using the UI or CLI.
Procedure
1. Enable the Insights Engine or nkp-insights on the clusters you want to monitor. There are several options:
» You can enable NKP Insights per cluster or workspace using the UI.
» You can enable NKP Insights per workspace using the CLI. This enables nkp-insights in all clusters in the
target workspace.
» You can enable NKP Insights per cluster using the CLI. This enables nkp-insights in selected clusters
within a workspace.
2. (Optional) Nutanix recommends enabling the Insights Engine Application on the kommander workspace to
monitor your Management cluster.
Note: If you only want to monitor workload clusters, skip this step.
a. Enable NKP Insights on the Management Cluster Workspace using the UI.
b. Create an AppDeployment for NKP Insights in the kommander namespace (the workspace name is
kommander-workspace). Specify the correct application version in the latest Release Notes from the
applications table.
For example:
nkp create appdeployment nkp-insights --app nkp-insights-1.2.1 --workspace
kommander-workspace
3. Apply a valid NKP Insights license key to allow NKP Insights to start analyzing your environment.
NKP Insights now displays alerts for clusters where you have installed the Insights Engine (nkp-insights).
Note: Access control to Insights summary cards and Insights Alert Details is performed via Kubernetes RBAC based
on the namespace to which the Insight Alert is tied.
Admin task for creating a workspace-based role with view rights to Nutanix Kubernetes Platform Insights
summary cards.
Procedure
1. Select the Management Cluster Workspace. The workspace selector is located at the top navigation bar.
(Option available for Ultimate customers only)
Field Value
Resources insights
9. Assign the roles you created to a user group as explained in Workspace Role Bindings in the Nutanix Kubernetes
Platform Guide.
Admin task for creating a project-based role with view rights to Nutanix Kubernetes Platform Insights
details.
Procedure
1. Select the workspace you want to grant view rights to. The workspace selector is located at the top navigation
bar. (Option available for Ultimate customers only)
2. Select Projects in the sidebar menu. Select or create a Project for which you want to create a role.
Field Value
Select Rule Type Resources
Resources insights, rca, solutions
Verbs get
8. Select Save to exit the rule configuration window and Save again to create the new role.
9. Assign the role you created to a user group as explained in Workspace Role Bindings.
10. If you want to grant view rights to the alert details for clusters in another Workspace, repeat the same procedure
on a per-workspace basis.
Note:
Admin task for creating a project-based role with view rights to Nutanix Kubernetes Platform Insights
summary cards.
Procedure
1. Select the Management Cluster Workspace. The workspace selector is located at the top navigation bar.
(Option available for Ultimate customers only)
5. Select NKP Role, as you are providing access to NKP UI resources, and add a Role Name
Field Value
Resources insights
10. Assign the roles you created to a user group as explained in Workspace Role Bindings in the Nutanix
Kubernetes Platform Guide.
Admin task for creating a workspace-based role with view rights to Nutanix Kubernetes Platform Insights
details.
Procedure
1. Select the workspace you want to grant view rights to. The workspace selector is located at the top navigation
bar. (Option available for Ultimate customers only)
Field Value
Select Rule Type Resources
Resources insights, rca, solutions
Verbs get
8. Select Save to exit the rule configuration window and Save again to create the new role.
9. Assign the role you created to a user group as explained in Workspace Role Bindings.
10. If you want to grant view rights to the alert details for clusters in another Workspace, repeat the same procedure
on a per-workspace basis.
Note:
Procedure
6. Wait until the application is removed completely before you continue deleting persistent volume claims.
Note: The NKP Insights Engine can take several minutes to delete completely.
Procedure
1. Disable the NKP Insights Engine application on the Management/Pro cluster by deleting the nkp-insights
AppDeployment:
kubectl delete appdeployment -n kommander nkp-insights
Note: The Insights Engine can take several minutes to delete completely.
Procedure
6. Wait until the application is entirely removed before you continue deleting persistent volume claims.
7. Verify that the application has been removed entirely from a Managed or Attached cluster.
a. Select the target Workspace > Clusters > View Details > Applications tab.
b. Ensure Insights is no longer deployed.
Note: The Insights Engine can take several minutes to delete completely.
Procedure
1. List all workspaces and their namespaces using the command kubectl get workspaces, and then set the target workspace namespace:
export WORKSPACE_NAMESPACE=<target_workspace_namespace>
2. Disable the Nutanix Kubernetes Platform Insights (NKP Insights) Engine application on all attached/managed
clusters by deleting the nkp-insights AppDeployment:
kubectl delete appdeployment -n ${WORKSPACE_NAMESPACE} nkp-insights
Note: The Insights Engine can take several minutes to delete completely.
Procedure
1. Set the environment variable for the additional cluster using the command export KUBECONFIG=<attached/managed_cluster_kubeconfig>
2. Delete all remaining data from the Engine clusters on any managed or attached clusters by executing the following command:
kubectl delete pvc \
Note:
• Ensure your configuration references the cluster where the Insights Engine is installed. For
more information, see the Provide Context for Commands with a kubeconfig File in the Nutanix
Kubernetes Platform Guide.
• Ensure your configuration references the correct ${WORKSPACE_NAMESPACE}.
3. Delete Insights-related data using the command kubectl delete insights --all -A.
Procedure
Disable the Nutanix Kubernetes Platform Insights (NKP Insights) Management application on the Management/Pro
cluster by deleting the nkp-insights-management AppDeployment:
kubectl delete appdeployment -n kommander nkp-insights-management
Note: This guide assumes you have a Ceph cluster that NKP does not manage.
For information on configuring the Ceph instance installed by NKP for use by NKP platform applications,
see the Rook Ceph Configuration chapter in the Nutanix Kubernetes Platform Guide.
This guide also assumes that you have already disabled NKP Managed Ceph. For more information, see the
BYOS (Bring Your Own Storage) to NKP Clusters | Disable-NKP-Managed-Ceph chapter in the Nutanix
Kubernetes Platform Guide.
Requirements
The following are required for using Nutanix Kubernetes Platform Insights with your storage:
• You only need to disable the Object Bucket Claim if you use an S3 provider that does not use an object bucket claim.
• If you disable the Object Bucket Claim, then an S3 bucket needs to be created.
Procedure
Create a secret in the same namespace where you installed NKP Insights
# Set to the workspace namespace insights is installed in
export WORKSPACE_NAMESPACE=kommander
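The secret name and key names below are hypothetical placeholders (not a documented schema); use the names your S3 provider and the Insights storage configuration expect:
# Hypothetical example; adjust the secret name and keys to your S3 provider.
kubectl create secret generic insights-s3-credentials \
  --namespace "${WORKSPACE_NAMESPACE}" \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>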
Note: Object bucket claims (OBC) are a custom resource that declares object storage.
Ceph is one provider that uses custom resource definitions (CRD).
If you are using Ceph or another provider that supports object bucket claims, you want to leave it on. This
creates an OBC as part of the installation. If you're going to use S3 directly or create the storage container
manually, you should turn it off.
Procedure
Add the following in the UI:
Note: For more information on editing the values, see the Enable Nutanix Kubernetes Platform Insights (NKP Insights)
Engine in an Air-gapped Environment chapter.
backend:
  s3:
    port: 80
    region: "us-east-1"
    endpoint: "rook-ceph-rgw-nkp-object-store"
    bucketSize: "1G"
    storageClassName: nkp-object-store
    enableObjectBucketClaim: true
cleanup:
  insightsTTL: "168h"
Procedure
1. Create the ConfigMap with the name provided in the step above, with the custom configuration:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: nkp-insights-overrides
data:
  values.yaml: |
    # helm values here
    backend:
      s3:
        port: 80
        region: "us-east-1"
        endpoint: "rook-ceph-rgw-nkp-object-store"
        bucketSize: "1G"
        storageClassName: nkp-object-store
        enableObjectBucketClaim: true
    cleanup:
      insightsTTL: "168h"
EOF
Note: Kommander waits for the ConfigMap to be present before deploying the AppDeployment to the
managed or attached clusters.
2. Provide the name of a ConfigMap in the AppDeployment, which provides a custom configuration on top of the
default configuration:
cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.nutanix.io/v1alpha2
kind: AppDeployment
metadata:
  name: nkp-insights
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    kind: App
    name: nkp-insights-0.4.1
  configOverrides:
    name: nkp-insights-overrides
EOF
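To confirm the override is referenced by the AppDeployment, you can inspect it with standard kubectl (the jsonpath shown is just one way to read the field):
kubectl get appdeployment nkp-insights -n ${WORKSPACE_NAMESPACE} \
  -o jsonpath='{.spec.configOverrides.name}{"\n"}'
# Expected output: nkp-insights-overrides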
Procedure
In kommander.yaml, enable NKP Insights and NKP Catalog Applications by setting the following:
apiVersion: config.kommander.mesosphere.io/v1alpha1
What to do next
For more information, see the Install Air-gapped Kommander with NKP Insights and NKP Catalog Applications in
the Nutanix Kubernetes Platform Guide.
Procedure
If needed, an ObjectBucketClaim can be created manually in the same namespace as nkp-insights. This results in the creation of an ObjectBucket, which creates a Secret consumed by nkp-insights.
For nkp-insights:
cat <<EOF | kubectl apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: nkp-insights
  namespace: ${NAMESPACE}
spec:
  additionalConfig:
    maxSize: 1G
  bucketName: nkp-insights
  storageClassName: nkp-object-store
EOF
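To check that the claim was bound and the generated Secret exists (a sketch; with Rook Ceph the ObjectBucketClaim typically produces a Secret with the same name as the claim, but verify the names in your environment):
kubectl get objectbucketclaim nkp-insights -n ${NAMESPACE}
kubectl get secret nkp-insights -n ${NAMESPACE}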
Note:
After enabling the NKP Insights Engine in a Managed or Attached cluster, select a workspace in the NKP UI that
includes the cluster to view the NKP Insights Alerts. An Insights summary card in the Dashboard tab displays the
most recent Insight Alerts, as well as the number of insights within each severity level of Critical, Warning, and
Notices.
Select View All from the NKP Insights summary card or Insights from the sidebar to access the NKP Insights Alert
table. This table provides an overview of all the NKP Insights Alerts. You can filter these NKP Insights Alerts in
several different ways:
Note: Muted and Resolved status are manually set, as described below.
• Open
• Muted
• Resolved
• Toggle your view by the following NKP Insights types:
• All types
• Availability
• Best Practices
• Configuration
• Security
• Select All Clusters or an individual cluster.
• Select All Projects or an individual project.
• Toggle between All, Critical, Warning, and Notices.
To clear filters and reset your view to all NKP Insights items, select Clear All.
Note: When you resolve an alert, it is impossible to move it back to Open or Muted. Ensure you only resolve alerts
once you fix the issue.
Procedure
1. From the Insight Alert table, filter and check the boxes for the alerts.
3. A confirmation prompt for the status change appears once you resolve or mute an Insight Alert.
Note: Once you set an alert to Resolved or Muted, it does not appear in the Open Insight Alert table view.
Procedure
From the Insights Alert field, select the desired filter from the drop-down list.
• Severity
• Last Detected
• Description
• Types
• Cluster
• Project, if applicable
Solutions
This section contains recommended steps to resolve the anomaly.
Note: You can configure NKP Insights to send notifications to other communication platforms like PagerDuty or e-
mail. However, we have only included examples for Slack and Microsoft Teams.
Why Should I Send Notifications for NKP Insights Alerts?
Activating this feature eliminates the need to check your cluster’s health manually. NKP Insights, combined with Alertmanager, can automatically warn users about critical issues. They can then take measures to keep your environment healthy and avoid possible downtime.
How Do NKP Insights and Alertmanager Work Together?
Alertmanager acts as a central component for managing and routing alerts. It is available by default in your NKP installation and automatically monitors several NKP-defined alerts. By enabling NKP Insights to route alerts to Alertmanager, you add another source of alerting. In the examples provided in this section, you use an AlertmanagerConfig YAML file to enable Alertmanager to group and filter NKP Insights alerts according to rules and send notifications to a communication platform.
What Type of Configuration Options Are Possible?
In the AlertmanagerConfig object, you can define the following parameters:
Routes: Routes define which alert types generate notifications and which do not. In the provided examples, we configure Alertmanager to send notifications for all Critical and Warning NKP Insights based on Severity.
Receivers: Receivers define the communication platform where you want to receive the notifications. The provided examples show how to configure notifications for Slack and Microsoft Teams.
Message content and format: The receiver configuration also defines the display format for the alert message. The examples provide message formatting designed for Slack and Microsoft Teams. The provided notification examples display all the informational fields you can find when looking at an alert in the NKP UI.
Prerequisites
• Kube Prometheus Stack installed on the Management cluster (included in the default configuration)
• A Slack Incoming Webhook created by a Slack workspace admin. For more information, see https://
api.slack.com/messaging/webhooks#create_a_webhook.
• Nutanix Kubernetes Platform Insights installed. For more information, see Nutanix Kubernetes Platform
Insights Setup on page 1145.
Procedure
2. Set the Slack Webhook variable to the URL you obtained from Slack for this purpose: The webhook format is
similar to https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX.
export SLACK_WEBHOOK=<endpoint_URL>
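The AlertmanagerConfig below reads the webhook URL from a Secret named slack-webhook with the key slack-webhook-url in the kommander namespace. If that Secret does not already exist in your environment, a minimal sketch for creating it is:
kubectl create secret generic slack-webhook \
  --namespace kommander \
  --from-literal=slack-webhook-url="${SLACK_WEBHOOK}"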
Procedure
Note: Replace <#target_slack_channel> with the name of the Slack channel where you want to receive
the notifications.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: slack-config
  namespace: kommander
spec:
  route:
    groupBy: ['source', 'insightClass', 'severity', 'cluster']
    groupWait: 3m
    groupInterval: 15m
    repeatInterval: 1h
    receiver: 'slack'
    routes:
      - receiver: 'slack'
        matchers:
          - name: source
            value: Insights
            matchType: =
          - name: severity
            value: Critical
            matchType: =
        continue: true
  receivers:
    - name: 'slack'
      slackConfigs:
        - apiURL:
            name: slack-webhook
            key: slack-webhook-url
          channel: '#<target_slack_channel>'
          username: Insights Slack Notifier
          iconURL: https://avatars3.githubusercontent.com/u/3380462
          title: |-
            {{ .Status | toUpper -}}{{ if eq .Status "firing" }}: {{ .Alerts.Firing | len }} {{- end}} Insights Alert{{ if gt (len .Alerts.Firing) 1 }}s{{ end }} ({{ .CommonLabels.insightClass }})
          titleLink: 'https://{{ (index .Alerts 0).Annotations.detailsURL }}'
          text: |-
            {{- if (index .Alerts 0).Labels.namespace }}
            {{- "\n" -}}
            *Namespace:* `{{ (index .Alerts 0).Labels.namespace }}`
            {{- end }}
            {{- if (index .Alerts 0).Labels.severity }}
            {{- "\n" -}}
            *Severity:* `{{ (index .Alerts 0).Labels.severity }}`
            {{- end }}
            {{- if (index .Alerts 0).Labels.cluster }}
            {{- "\n" -}}
            *Cluster:* `{{ (index .Alerts 0).Labels.cluster }}`
            {{- end }}
            {{- if (index .Alerts 0).Annotations.description }}
            {{- "\n" -}}
            *Description:* {{ (index .Alerts 0).Annotations.description }}
            {{- end }}
            {{- if (index .Alerts 0).Annotations.categories }}
Prerequisites
• Kube Prometheus Stack installed on the Management cluster (included in the default configuration)
• A Microsoft Teams Incoming Webhook. For more information, see https://learn.microsoft.com/en-us/
microsoftteams/platform/webhooks-and-connectors/how-to/add-incoming-webhook?tabs=newteams
%2Cdotnet.
• Nutanix Kubernetes Platform Insights installed. For more information, see Nutanix Kubernetes Platform
Insights Setup on page 1145.
Procedure
2. Set the Microsoft Teams Webhook variable to the URL you obtained from Microsoft Teams for this purpose: The
webhook format is similar to: https://xxxx.webhook.office.com/xxxxxxxxx.
export TEAMS_WEBHOOK=<endpoint_URL>
3. Create a custom display format for your Microsoft Teams message, and name the file custom-card.tmpl:
{{ define "teams.card" }}
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "Firing" -}}
{{- if eq .CommonLabels.severity "Critical" -}}8C1A1A
{{- else if eq .CommonLabels.severity "Warning" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{- if eq .CommonAnnotations.description "" -}}
{{- if eq .CommonLabels.insightClass "" -}}
{{- if eq .CommonLabels.alertname "" -}}
Prometheus Alert
{{- else -}}
{{- .CommonLabels.alertname -}}
{{- end -}}
{{- else -}}
{{- .CommonLabels.insightClass -}}
{{- end -}}
{{- else -}}
{{- .CommonAnnotations.description -}}
{{- end -}}",
"title": "{{ .Status | toUpper -}}{{ if eq .Status "firing" }}: {{ .Alerts.Firing
| len }} {{- end}} Insights Alert{{ if gt (len .Alerts.Firing) 1 }}s{{ end }}
({{ .CommonLabels.insightClass }})",
"sections": [ {{$externalUrl := (index .Alerts 0).Annotations.detailsURL }}
{
"activityTitle": "[{{ (index .Alerts 0).Annotations.description }}]
({{ $externalUrl }})",
"facts": [
{{- if (index .Alerts 0).Labels.namespace }}
Prerequisite
• Slack: Send Nutanix Kubernetes Platform Insights Alert Notifications to a Channel on page 1163
• Microsoft Teams: Send NKP Insights Alert Notifications to a Channel on page 1165
Procedure
Troubleshooting
Note: You can distinguish NKP Insights alerts from other default NKP alerts because the alert severity tags are capitalized. For example, an NKP Insights alert is Critical, whereas other non-Insights alerts are critical.
Procedure
3. Select Application Dashboards, and look for the Prometheus Alert Manager application card.
Procedure
Verify the deployment logs using the command kubectl -n kommander logs alertmanager-kube-
prometheus-stack-alertmanager-0.
If the output is blank, the configuration has been successful. The output displays errors if the deployment has failed.
Note: NKP Insights displays alerts related to DiskFull and PVCFull, regardless of whether they are rooted in your
environment’s underlying NKP resources, Kubernetes resources, or one of your production workloads. Ensure you have
Procedure
2. Ultimate only: Select the target workspace from the top navigation bar.
3. Select Applications from the sidebar and search for NKP Insights.
4. Select the three-dot menu in the application card, and Edit > Configuration.
Procedure
1. Copy the following customization and paste it into the code editor:
backend:
engineConfig:
dkpIdentification:
enabled: false
3. Repeat the configuration steps included in this page for each workspace.
Configuration Anomalies
In Kubernetes, a class of problems arises from incorrect or insufficient configuration in workload and Kubernetes
cluster deployments. We refer to them as configuration anomalies.
We integrated third-party open-source components into the Nutanix Kubernetes Platform Insights (NKP Insights) Engine that handle specific classes of configuration anomalies:
Polaris
Polaris by Fairwinds is an open-source project that identifies Kubernetes deployment configuration errors. Polaris
runs over a dozen checks to help users discover Kubernetes misconfigurations that frequently cause security
vulnerabilities, outages, scaling limitations, and more. Using Polaris, you can avoid problems and ensure you’re using
Kubernetes best practices.
Polaris checks configurations against a set of best practices for workloads and Kubernetes cluster deployments, such
as:
• Health Checks
• Images
Procedure
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Procedure
1. Polaris Audits run by default every 37 minutes and use Cron syntax. You can change the default by editing the
Service configuration with the following values:
polaris:
schedule: "@every 37m"
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
• Security:https://polaris.docs.fairwinds.com/checks/security/
• Efficiency: https://polaris.docs.fairwinds.com/checks/efficiency/
Procedure
1. You can change these defaults by modifying the Service configuration with the following values:
polaris:
config:
# See https://github.com/FairwindsOps/polaris/blob/master/examples/config.yaml
checks:
# reliability
deploymentMissingReplicas: warning
priorityClassNotSet: ignore
tagNotSpecified: danger
pullPolicyNotAlways: warning
readinessProbeMissing: warning
livenessProbeMissing: warning
metadataAndNameMismatched: ignore
pdbDisruptionsIsZero: warning
missingPodDisruptionBudget: ignore
# efficiency
cpuRequestsMissing: warning
cpuLimitsMissing: warning
memoryRequestsMissing: warning
memoryLimitsMissing: warning
# security
hostIPCSet: danger
hostPIDSet: danger
notReadOnlyRootFilesystem: warning
privilegeEscalationAllowed: danger
runAsRootAllowed: danger
runAsPrivileged: danger
dangerousCapabilities: danger
insecureCapabilities: warning
hostNetworkSet: danger
hostPortSet: warning
tlsSettingsMissing: warning
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Note: When you mark a Polaris Audit Insight alert as Not-Useful, newly generated alerts are set to the lowest
Notice severity.
Procedure
1. You can exclude a particular workload from a Polaris Audit via its Exemptions. This example shows how to exempt the workload dummy-deployment, which currently has an issue where CPU Limits are Missing. Change the exemptions list by modifying the Service configuration with the following values (a sketch of a complete exemptions entry follows the snippet below):
polaris:
config:
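The snippet above is truncated; as a sketch of a complete exemptions entry (assuming Polaris's exemptions schema with controllerNames and rules keys, and using the cpuLimitsMissing check name from the list above):
polaris:
  config:
    exemptions:
      - controllerNames:
          - dummy-deployment        # workload to exempt
        rules:
          - cpuLimitsMissing        # check to skip for this workload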
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Pluto
Pluto by Fairwinds is a tool that scans Live Helm releases running in your cluster for deprecated Kubernetes API
versions. It sends an alert about deprecated apiVersions deployed in your Helm releases.
In Nutanix Kubernetes Platform Insights(NKP Insights), Pluto scans Live Helm releases running in your cluster for
deprecated API versions and sends an alert about any deprecated apiVersions deployed in your Helm releases.
To know which Pluto version is included in this release, see the Nutanix Kubernetes Platform Insights(NKP Insights)
Release Notes.
For more information on Pluto, see https://pluto.docs.fairwinds.com/
Procedure
1. Enable or disable Helm release scanning with Pluto Insights by editing the Service configuration with the
following values:
pluto:
enabled: true
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Procedure
1. Pluto scans run by default every 41 minutes and uses Cron syntax. You can change the default by editing the
values of the Service configuration:
pluto:
schedule: "@every 41m"
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Nova
Nova by Fairwinds adds the ability for the Insights engine to check the helm chart version of the current workload
deployment. It scans the latest helm chart version available from the configured Helm repositories and then sends a
structural Insight alert if there is an issue. The alert details show an RCA and a solution to resolve the problem.
Nova adds the ability for the Insights engine to check the helm chart version of the current workload deployment. It
scans the latest helm chart version available from the helm repository and then sends a structural insight alert if there
is an issue. The alert details show an RCA and a solution to resolve the problem.
To know which Nova version is included in this release, see the Nutanix Kubernetes Platform Insights(NKP Insights)
Release Notes.
For more information on Nova, see https://nova.docs.fairwinds.com/.
Procedure
2. Set the helmRepositoryURLs to the URLs for the Helm repositories used by your workloads where you want
Helm chart versions to be scanned.
nova:
enabled: true
helmRepositoryURLs:
- https://charts.bitnami.com/bitnami/
- https://charts.jetstack.io
3. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
1. Nova runs every 37 minutes by default and uses the Cron syntax. You can change the default by editing the
Service configuration with the following values:
nova:
schedule: "@every 34m"
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Trivy
Note: This function is disabled in the default configuration of Nutanix Kubernetes Platform Insights (NKP Insights).
This and later versions of Insights come with CVE scanning functionality for customer-deployed workload clusters
and deployments.
CVE/CIS databases are updated every couple of hours. When enabled, the CVE scanning feature scans these
databases and runs an analysis against your workloads to flag any potential security issues.
Trivy is an open-source vulnerability and misconfiguration scanner that scans to detect vulnerabilities in:
• Container Images
• Rootfs
• Filesystems
To know which Trivy version is included in this release, see the NKP Insights Release Notes.
For more information on Trivy, see https://aquasecurity.github.io/trivy/v0.44/docs/scanner/vulnerability/.
Procedure
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
1. Trivy scans run by default every 2 hours and uses Cron syntax. You can change the default by editing the values
of the Service configuration:
trivy:
schedule: "@every 2h"
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Trivy Severity Level    Insights Alert Level    Example (depends on the categorization of the source database)
CRITICAL                Critical                Denial of crucial service
MEDIUM
UNKNOWN
Prerequisites
Procedure
1. Clone the Nutanix Kubernetes Platform Insights(NKP Insights) - Trivy Bundles repository to your local machine
using the command git clone https://github.com/mesosphere/trivy-bundles.git
2. Specify the Trivy Version included in this version of NKP Insights using the command export
TRIVY_VERSION=
docker:default
=> [internal] load build definition from Dockerfile
0.0s
=> => transferring dockerfile: 534B
0.0s
=> [internal] load .dockerignore
0.0s
=> => transferring context: 2B
0.0s
=> [internal] load metadata for docker.io/aquasec/trivy:0.42.1
0.3s
=> [1/7] FROM docker.io/aquasec/
trivy:0.42.1@sha256:49a0b08589b7577f3e21a7d479284c69dc4d27cbb86bd07ad36773f075581313
0.0s
=> CACHED [2/7] RUN mkdir /trivy_cache
0.0s
0.0s
=> [4/7] RUN echo 20230908T185308Z
0.3s
=> [5/7] RUN trivy image --download-db-only --cache-dir /trivy_cache
4.5s
=> [6/7] RUN ls -Rl /trivy_cache
0.3s
=> exporting to image
1.8s
=> => exporting layers
1.8s
=> => writing image
sha256:62f71725212e5b680a3cef771bcb312e931e05445c50632fa4495e216793c9cf
0.0s
=> => naming to docker.io/mesosphere/trivy-bundles:0.42.1-20230908T185308Z
0.0s
Executing target: create-airgapped-image-bundle
4. Transfer the created bundle to the air-gapped bastion host or node you used to install NKP.
Procedure
1. Go to the air-gapped bastion host or node you used for installing NKP.
2. Export the environment variables for your registry. For more information, see the Local Registry.
export REGISTRY_ADDRESS=<registry-address>:<registry-port>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
4. Update Nutanix Kubernetes Platform Insights (NKP Insights) in the air-gapped environment to use the refreshed
database. Edit the service configuration on each workspace by providing the path to the Docker image. To
modify an existing installation, select Workspace, Applications, NKP-Insights, and then Edit. Replace <docker-image-path> with the path to the Docker image; it looks similar to docker.io/mesosphere/trivy-bundles:0.42.1-20230908T185308Z
trivy:
enabled: true
image:
imageFull: <docker-image-path>
After Insights has completed deploying, check the currently used Trivy database shown in Verifying the Trivy
Version on page 1176 to ensure the configuration has been deployed correctly.
Kube-bench
Kube-bench by Aqua Security is a tool that verifies that Kubernetes clusters run securely. This tool checks against
the best practices and guidelines specified in the CIS Kubernetes Benchmark developed by the Center for Internet
Security to ensure that your clusters comply with the latest security configuration standards.
Whenever a standard is not met during a scan, an Insights alert is created with comprehensive information. For more
information about this application, refer to the official documentation from Kube-bench.
Kube-bench adds the ability to ensure that Kubernetes clusters run securely. This tool checks against the best
practices and guidelines specified in the CIS Kubernetes Benchmark.
Whenever a security standard is not met during a scan, an Insights alert is created with comprehensive information.
To know which Kube-bench version is included in this release, see the Nutanix Kubernetes Platform Insights(NKP
Insights) Release Notes.
For more information on Kube-bench, see https://www.aquasec.com/products/kubernetes-security/ and https://
aquasecurity.github.io/kube-bench/v0.6.12/. For more information on the Center for Internet Security, see https://
www.cisecurity.org/.
Procedure
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Procedure
1. Kube-bench scans run by default every 35 minutes and uses Cron syntax. You can change the default by editing
the Service configuration with the following values:
kubeBench:
schedule: "@every 35m"
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Procedure
1. You can change this default behavior and define a CIS benchmark version to check against by editing the service configuration with the following values. The example configures Kube-bench to check against the cis-1.15 benchmark regardless of the Kubernetes version.
kubeBench:
config:
instances:
defaultSetup:
additionalArgs: ["--version", "cis-1.15"]
2. To modify an installation, select Workspace > Applications > NKP-Insights > Edit.
Severity Levels
• If the validation runs correctly and does not detect any anomalies, no Insight is created.
• If the validation runs and fails due to a detected anomaly, an Insight is created with the alert level Warning.
• If the validation check cannot run or is incomplete, an Insight is created with the alert level Warning.
kube-bench analyses security-related aspects of your cluster and creates alerts when your Kubernetes cluster is not
compliant with the best practices established in the CIS benchmark.
Some issue alerts relate to cluster elements created with Konvoy, NKP’s provisioning tool.
For customers who require CIS Benchmark compliance, this page provides an overview of mitigating these known
alerts or why addressing the issue is not feasible.
• For issues that can be mitigated, create patch files with the mitigations, then create a cluster kustomization that
references these patch files, and, lastly, create a new cluster based on the kustomization file as shown in Mitigate
Issues by Creating Custom Clusters on page 1181.
• For issues that cannot be mitigated, see the List of CIS Benchmark Explanations at https://docs.d2iq.com/
dins/2.7/list-of-cis-benchmark-explanations.
For issues that can be mitigated, create patch files with the mitigations, then create a cluster kustomization that
references these patch files, and, lastly, create a new cluster based on the kustomization file
Creating Patch Files with CIS Benchmark Mitigations
This is a Nutanix task.
Note: All files you create in this and the following sections must be present in the same directory.
Procedure
1. Establish a name for the cluster you will create by setting the CLUSTER_NAME environment variable: Replace the
placeholder <name_of_the_cluster> with the actual name you want to use.
export CLUSTER_NAME=<name_of_the_cluster>
2. Create CIS patch files for the issues you want to mitigate. These are the issues that you can mitigate:
CIS 1.2.12
This is a Nutanix reference.
ID: 1.2.12
Text: Ensure that the admission control plugin AlwaysPullImages is set (Manual).
Remediation: Edit the API server pod specification file $apiserverconf on the control plane node and set the --enable-admission-plugins parameter to include AlwaysPullImages: --enable-admission-plugins=...,AlwaysPullImages,...
NKP Mitigation
Create a file called cis-1.2.12-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-1.2.12-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
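  # The rest of this patch is not shown above; the following is a sketch
  # reconstructed from the remediation text. The plugin list other than
  # AlwaysPullImages is illustrative; keep the plugins your cluster already uses.
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          enable-admission-plugins: "NodeRestriction,AlwaysPullImages"
EOF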
NKP Mitigation
Create a file called cis-1.2.18-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-1.2.18-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
clusterConfiguration:
apiServer:
extraArgs:
profiling: "false"
EOF
CIS 1.2.32
This is a Nutanix reference.
ID: 1.2.32
Text: Ensure that the API Server only makes use of Strong Cryptographic Ciphers (Manual)
Remediation: Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml on the control plane node and set the below parameter. --tls-cipher-suites=TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,TLS_R
NKP Mitigation
Create a file called cis-1.2.32-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-1.2.32-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
clusterConfiguration:
apiServer:
extraArgs:
tls-cipher-suites:
"TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_A
ID: 1.3.1
Text: Ensure that the --terminated-pod-gc-threshold argument is set as appropriate (Manual).
Remediation: Edit the Controller Manager pod specification file $controllermanagerconf on the control plane node and set the --terminated-pod-gc-threshold to an appropriate threshold, for example: --terminated-pod-gc-threshold=10
NKP Mitigation
Create a file called cis-1.3.1-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-1.3.1-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
clusterConfiguration:
controllerManager:
extraArgs:
terminated-pod-gc-threshold: "12500"
EOF
CIS 1.3.2
This is a Nutanix reference.
ID: 1.3.2
Text: Ensure that the --profiling argument is set to false (Automated).
Remediation: Edit the Controller Manager pod specification file $controllermanagerconf on the control plane node and set the below parameter: --profiling=false
NKP Mitigation
Create a file called cis-1.3.2-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-1.3.2-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
clusterConfiguration:
controllerManager:
extraArgs:
profiling: "false"
EOF
CIS 1.4.1
This is a Nutanix reference.
ID: 1.4.1
Text: Ensure that the --profiling argument is set to false (Automated).
Remediation: Edit the Controller Manager pod specification file $schedulerconf on the control plane node and set the below parameter: --profiling=false
ID: 4.2.6
Text: Ensure that the --protect-kernel-defaults argument is set to true (Automated).
Remediation: If using a Kubelet config file, edit the file to set protectKernelDefaults to true. If using command line arguments, edit the kubelet service file $kubeletsvc on each worker node and set the below parameter in the KUBELET_SYSTEM_PODS_ARGS variable: --protect-kernel-defaults=true. Based on your system, restart the kubelet service. For example: systemctl daemon-reload; systemctl restart kubelet.service
NKP Mitigation
Create a file called cis-4.2.6-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-4.2.6-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
initConfiguration:
nodeRegistration:
kubeletExtraArgs:
protect-kernel-defaults: "true"
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
protect-kernel-defaults: "true"
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
name: ${CLUSTER_NAME}-md-0
spec:
template:
spec:
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
protect-kernel-defaults: "true"
ID: 4.2.9
Text: Ensure that the eventRecordQPS argument is set to a level that ensures appropriate event capture (Manual).
Remediation: If using a Kubelet config file, edit the file to set eventRecordQPS to an appropriate level. If using command line arguments, edit the kubelet service file $kubeletsvc on each worker node and set the parameter below in the KUBELET_SYSTEM_PODS_ARGS variable. Based on your system, restart the kubelet service. For example: systemctl daemon-reload; systemctl restart kubelet.service
NKP Mitigation
eventRecordQPS can also be configured with the --event-qps argument on the kubelet’s arguments.
Create a file called cis-4.2.9-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-4.2.9-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
initConfiguration:
nodeRegistration:
kubeletExtraArgs:
event-qps: "0"
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
event-qps: "0"
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
name: ${CLUSTER_NAME}-md-0
spec:
template:
spec:
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
event-qps: "0"
EOF
CIS 4.2.13
This is a Nutanix reference.
NKP Mitigation
Create a file called cis-4.2.13-patches.yaml with the following in the same folder as kustomization.yaml:
cat <<EOF > cis-4.2.13-patches.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: ${CLUSTER_NAME}-control-plane
spec:
kubeadmConfigSpec:
initConfiguration:
nodeRegistration:
kubeletExtraArgs:
tls-cipher-suites:
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WIT
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
tls-cipher-suites:
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WIT
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
name: ${CLUSTER_NAME}-md-0
spec:
template:
spec:
joinConfiguration:
nodeRegistration:
kubeletExtraArgs:
tls-cipher-suites:
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WIT
EOF
Create a Cluster Kustomization
Note: The kustomization.yaml file you create in this section must be in the same directory as the CIS patch
files.
Prerequisites
Refer to Customizing CAPI Components for a Cluster to familiarize yourself with the customization procedure and
options. We will use similar terms on this page.
For more information, see Customizing CAPI Components for a Cluster at https://docs.d2iq.com/dkp/2.8/
customizing-capi-components-for-a-cluster
Creating a Kustomization YAML File
This is a Nutanix task.
Procedure
2. Create a kustomization.yaml file that includes patches for each of the CIS mitigations.
We use the CIS 1.2.18 patch in this example, but you can include all mitigation files you created in the first section.
cat <<EOF > kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ${CLUSTER_NAME}.yaml
patches:
- cis-1.2.18-patches.yaml
#- Add more CIS patch files here.
EOF
Note: The CIS patch, kustomization.yaml, and ${CLUSTER_NAME}.yaml files must be in the same
directory.
Procedure
1. Create a Bootstrap Cluster. Ensure that the bootstrap cluster has been created for the desired provider.
Note: Supported providers include AWS, Azure, GCP, Pre-Provisioned, and vSphere.
2. To apply the customizations and create a new cluster, run kubectl create -k . from the directory that contains kustomization.yaml.
ID: 1.1.10
Text: Ensure that the Container Network Interface file ownership is set to root:root (Manual)
Remediation: Run the below command (based on the file location on your system) on the control plane node. For example, chown root:root <path/to/cni/files>
NKP Explanation
The kubelet config --cni-config-dir has been deprecated and removed since Kubernetes v1.24. Calico, which is used for CNI, stores its configuration at /etc/cni/net.d with ownership set to root:root.
CIS 1.1.9
This is a Nutanix reference.
ID: 1.1.9
Text: Ensure that the Container Network Interface file permissions are set to 644 or more restrictive (Manual)
Remediation: Run the below command (based on the file location on your system) on the control plane node. For example, chmod 644 <path/to/cni/files>
NKP Explanation
The kubelet config --cni-config-dir has been deprecated and removed since Kubernetes v1.24. Calico, which is
used for CNI, stores its configuration at /etc/cni/net.d and has permissions set to 644.
CIS 1.1.12
This is a Nutanix reference.
ID: 1.1.12
Text: Ensure that the etcd data directory ownership is set to etcd:etcd (Automated)
Remediation: On the etcd server node, get the etcd data directory, passed as an argument --data-dir, from the command 'ps -ef | grep etcd'. Run the below command (based on the etcd data directory found above). For example, chown etcd:etcd /var/lib/etcd
NKP Explanation
etcd files are owned by root. Creating another user adds additional attack vectors. On previous STIGs, this has been
acceptable to leave as root:root.
CIS 1.2.1
This is a Nutanix reference.
ID: 1.2.1
Text: Ensure that the --anonymous-auth argument is set to false (Manual)
Remediation: Edit the API server pod specification file $apiserverconf on the control plane node and set the below parameter. --anonymous-auth=false
ID: 1.2.6
Text: Ensure that the --kubelet-certificate-authority argument is set as appropriate (Automated)
Remediation: Follow the Kubernetes documentation and set up the TLS connection between the apiserver and kubelets. Then, edit the API server pod specification file $apiserverconf on the control plane node and set the --kubelet-certificate-authority parameter to the certificate authority's path to the cert file. --kubelet-certificate-authority=<ca-string>
NKP Explanation
The --kubelet-certificate-authority flag needs to be set on each API Server after the cluster has been fully
provisioned; adding it earlier causes issues with the creation and adding of worker nodes via CAPI and kubeadm.
CIS 4.2.10
This is a Nutanix reference.
ID: 4.2.10
Text: Ensure that the --tls-cert-file and --tls-private-key-file arguments are set as appropriate (Manual)
Remediation: If using a Kubelet config file, edit the file to set tlsCertFile to the location of the certificate file to identify this Kubelet and tlsPrivateKeyFile to the location of the corresponding private key file. If using command line arguments, edit the kubelet service file $kubeletsvc on each worker node and set the below parameters in the KUBELET_CERTIFICATE_ARGS variable: --tls-cert-file=<path/to/tls-certificate-file> --tls-private-key-file=<path/to/tls-key-file>. Based on your system, restart the kubelet service. For example, systemctl daemon-reload; systemctl restart kubelet.service
NKP Explanation
This remediation refers to a serving certificate on the kubelet, where the https endpoint on the kubelet is used.
By default, a self-signed certificate is used here. Connecting to a kubelet’s https endpoint should only be used for
diagnostic or debugging purposes where applying a provided key and certificate isn’t expected.
For more information, see Client and serving certificates at https://kubernetes.io/docs/reference/access-authn-
authz/kubelet-tls-bootstrapping/#client-and-serving-certificates.
CIS 1.2.13
This is a Nutanix reference.
NKP Explanation
The Kubernetes Project recommends not using this admission controller, as it is deprecated and will be removed in a
future release. For more information, see Admission Controllers Reference https://kubernetes.io/docs/reference/
access-authn-authz/admission-controllers/#securitycontextdeny.
CIS 4.2.8
This is a Nutanix reference.
ID: 4.2.8
Text: Ensure that the --hostname-override argument is not set (Manual)
Remediation: Edit the kubelet service file $kubeletsvc on each worker node and remove the --hostname-override argument from the KUBELET_SYSTEM_PODS_ARGS variable. Based on your system, restart the kubelet service. For example, systemctl daemon-reload; systemctl restart kubelet.service
NKP Explanation
The hostname-override argument is used by various infrastructure providers to provision nodes; removing this
argument will impact how CAPI works with the infrastructure provider.
CIS 1.2.10
This is a Nutanix reference.
ID: 1.2.10
Text: Ensure that the admission control plugin EventRateLimit is set (Manual)
Remediation: Follow the Kubernetes documentation and set the desired limits in a configuration file. Then, edit the API server pod specification file $apiserverconf and set the below parameters. --enable-admission-plugins=...,EventRateLimit,... --admission-control-config-file=<path/to/configuration/file>
NKP Explanation
Kubernetes recommends the use of API Priority and Fairness using the --max-requests-inflight and --max-
mutating-requests-inflight flags to control how the Kubernetes API Server behaves in overload situations.
The APIPriorityAndFairness Feature Gate has been enabled by default since Kubernetes v1.20.
For more information, see: