AKS Doc
Azure Kubernetes Service (AKS) simplifies deploying a managed Kubernetes cluster in Azure by offloading the
operational overhead to Azure. As a hosted Kubernetes service, Azure handles critical tasks, like health
monitoring and maintenance. Because the Kubernetes masters are managed by Azure, you manage and maintain
only the agent nodes. As a result, AKS itself is free; you pay only for the agent nodes within your clusters, not for the masters.
You can create an AKS cluster using:
The Azure CLI
The Azure portal
Azure PowerShell
Template-driven deployment options, like Azure Resource Manager templates, Bicep, and Terraform.
When you deploy an AKS cluster, the Kubernetes master and all nodes are deployed and configured for you.
Advanced networking, Azure Active Directory (Azure AD) integration, monitoring, and other features can be
configured during the deployment process.
For more information on Kubernetes basics, see Kubernetes core concepts for AKS.
NOTE
This service supports Azure Lighthouse, which lets service providers sign in to their own tenant to manage subscriptions
and resource groups that customers have delegated.
Kubernetes certification
AKS has been CNCF-certified as Kubernetes conformant.
Regulatory compliance
AKS is compliant with SOC, ISO, PCI DSS, and HIPAA. For more information, see Overview of Microsoft Azure
compliance.
Next steps
Learn more about deploying and managing AKS with the Azure CLI Quickstart.
Deploy an AKS Cluster using Azure CLI
Quotas, virtual machine size restrictions, and region
availability in Azure Kubernetes Service (AKS)
All Azure services set default limits and quotas for resources and features, including usage restrictions for
certain virtual machine (VM) SKUs.
This article details the default resource limits for Azure Kubernetes Service (AKS) resources and the availability
of AKS in Azure regions.
Maximum nodes per cluster with Virtual Machine Scale Sets and Standard Load Balancer SKU: 1,000 (across all node pools)
Maximum pods per node with basic networking (kubenet): maximum 250; Azure CLI default 110; Azure Resource Manager template default 110; Azure portal deployment default 30
Maximum pods per node with advanced networking (Azure Container Networking Interface): maximum 250; default 30
Open Service Mesh (OSM) AKS add-on: Kubernetes cluster version 1.19+; 1 OSM controller per cluster; 500 pods per OSM controller; 50 Kubernetes service accounts managed by OSM
Provisioned infrastructure
All other network, compute, and storage limitations apply to the provisioned infrastructure. For the relevant
limits, see Azure subscription and service limits.
IMPORTANT
When you upgrade an AKS cluster, extra resources are temporarily consumed. These resources include available IP
addresses in a virtual network subnet or virtual machine vCPU quota.
For Windows Server containers, you can perform an upgrade operation to apply the latest node updates. If you don't
have the available IP address space or vCPU quota to handle these temporary resources, the cluster upgrade process will
fail. For more information on the Windows Server node upgrade process, see Upgrade a node pool in AKS.
Supported VM sizes
The list of supported VM sizes in AKS is evolving with the release of new VM SKUs in Azure. Please follow the
AKS release notes to stay informed of new supported SKUs.
Restricted VM sizes
VM sizes with fewer than two vCPUs may not be used with AKS.
Each node in an AKS cluster contains a fixed amount of compute resources, such as vCPU and memory. If an AKS
node has insufficient compute resources, pods might fail to run correctly. To ensure the required kube-system
pods and your applications can be reliably scheduled, AKS requires that nodes use VM sizes with at least two
vCPUs.
For more information on VM types and their compute resources, see Sizes for virtual machines in Azure.
Region availability
For the latest list of where you can deploy and run clusters, see AKS region availability.
Standard: Best if you're not sure what to choose. Works well with most applications.
Hardened access: Best for large enterprises that need full control of security and stability.
Next steps
You can increase certain default limits and quotas. If your resource supports an increase, request the increase
through an Azure support request (for Issue type, select Quota).
Supported Kubernetes versions in Azure Kubernetes
Service (AKS)
The Kubernetes community releases minor versions roughly every three months. Recently, the Kubernetes
community has increased the support window for each version from 9 months to 12 months, starting with
version 1.19.
Minor version releases include new features and improvements. Patch releases are more frequent (sometimes
weekly) and are intended for critical bug fixes within a minor version. Patch releases include fixes for security
vulnerabilities or major bugs.
Kubernetes versions
Kubernetes uses the standard Semantic Versioning scheme for each version:
[major].[minor].[patch]
Example:
1.17.7
1.17.8
Each number in the version indicates general compatibility with the previous version:
Major versions change when incompatible API updates are made or backwards compatibility may be broken.
Minor versions change when functionality updates are made that are backwards compatible with the other
minor releases.
Patch versions change when backwards-compatible bug fixes are made.
Aim to run the latest patch release of the minor version you're running. For example, if your production cluster is
on 1.17.7 and 1.17.8 is the latest available patch version for the 1.17 series, you should upgrade to 1.17.8 as soon
as possible to ensure your cluster is fully patched and supported.
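For example, a patch upgrade can be applied with the az aks upgrade command (a sketch; the resource group and cluster names are illustrative):
az aks upgrade \
--resource-group myResourceGroup \
--name myAKSCluster \
--kubernetes-version 1.17.8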
Azure Kubernetes Service allows you to create a cluster without specifying the exact patch version. When
creating a cluster without designating a patch, the cluster will run the minor version's latest GA patch. For
example, if you create a cluster with 1.21, your cluster will be running 1.21.7, which is the latest GA patch
version of 1.21.
When upgrading by alias minor version, only a higher minor version is supported. For example, upgrading from
1.14.x to 1.14 will not trigger an upgrade to the latest GA 1.14 patch, but upgrading to 1.15 will trigger an
upgrade to the latest GA 1.15 patch.
To see which patch version your cluster is running, run the az aks show --resource-group myResourceGroup --name myAKSCluster
command. The currentKubernetesVersion property shows the whole Kubernetes version.
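For example:
az aks show --resource-group myResourceGroup --name myAKSCluster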
{
"apiServerAccessProfile": null,
"autoScalerProfile": null,
"autoUpgradeProfile": null,
"azurePortalFqdn": "myaksclust-myresourcegroup.portal.hcp.eastus.azmk8s.io",
"currentKubernetesVersion": "1.21.7",
}
NOTE
AKS uses safe deployment practices which involve gradual region deployment. This means it may take up to 10 business
days for a new release or a new version to be available in all regions.
The supported window of Kubernetes versions on AKS is known as "N-2": (N (Latest release) - 2 (minor
versions)).
For example, if AKS introduces 1.17.a today, support is provided for the following versions:
1.17.a
1.17.b
1.16.c
1.16.d
1.15.e
1.15.f
When AKS releases 1.18.*, all the 1.15.* versions go out of support 30 days later.
NOTE
If customers are running an unsupported Kubernetes version, they will be asked to upgrade when requesting support for
the cluster. Clusters running unsupported Kubernetes releases are not covered by the AKS support policies.
In addition to the above, AKS supports a maximum of two patch releases of a given minor version. So given the
following supported versions:
If AKS releases 1.17.9 and 1.16.11 , the oldest patch versions are deprecated and removed, and the supported
version list becomes:
NOTE
To find out who your subscription administrators are, or to change them, see Manage Azure subscriptions.
Users have 30 days from version removal to upgrade to a supported minor version release to continue
receiving support.
For new patch versions of Kubernetes:
Because of the urgent nature of patch versions, they can be introduced into the service as they become
available. Once available, patches will have a two month minimum lifecycle.
In general, AKS does not broadly communicate the release of new patch versions. However, AKS constantly
monitors and validates available CVE patches to support them in AKS in a timely manner. If a critical patch is
found or user action is required, AKS will notify users to upgrade to the newly available patch.
Users have 30 days from a patch release's removal from AKS to upgrade into a supported patch and
continue receiving support. However, you will no longer be able to create clusters or node pools once
the version is deprecated/removed.
Supported versions policy exceptions
AKS reserves the right to add or remove new/existing versions with one or more critical production-impacting
bugs or security issues without advance notice.
Specific patch releases may be skipped or rollout accelerated, depending on the severity of the bug or security
issue.
To find out what versions are currently available for your subscription and region, use the az aks get-versions
command. The following example lists the available Kubernetes versions for the EastUS region:
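A call along these lines lists the available versions in table form (the exact output columns may vary):
az aks get-versions --location eastus --output table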
FAQ
How does Microsoft notify me of new Kubernetes versions?
The AKS team publishes pre-announcements with planned dates of new Kubernetes versions in our
documentation and on our GitHub, as well as in emails to subscription administrators who own clusters that are going to
fall out of support. In addition to announcements, AKS uses Azure Advisor to alert users inside the Azure portal
when they are out of support, and to warn them of deprecated APIs that will affect their application or
development process.
How often should I expect to upgrade Kubernetes versions to stay in support?
Starting with Kubernetes 1.19, the open source community has expanded support to 1 year. AKS commits to
enabling patches and support matching the upstream commitments. For AKS clusters on 1.19 and greater, you
will be able to upgrade at a minimum of once a year to stay on a supported version.
What happens when a user upgrades a Kubernetes cluster with a minor version that isn't supported?
If you're on the n-3 version or older, it means you're outside of support and will be asked to upgrade. When your
upgrade from version n-3 to n-2 succeeds, you're back within our support policies. For example:
If the oldest supported AKS version is 1.15.a and you are on 1.14.b or older, you're outside of support.
When you successfully upgrade from 1.14.b to 1.15.a or higher, you're back within our support policies.
Downgrades are not supported.
What does 'Outside of Support' mean?
'Outside of Support' means that:
The version you're running is outside of the supported versions list.
You'll be asked to upgrade the cluster to a supported version when requesting support, unless you're within
the 30-day grace period after version deprecation.
Additionally, AKS doesn't make any runtime or other guarantees for clusters outside of the supported versions
list.
What happens when a user scales a Kubernetes cluster with a minor version that isn't supported?
For minor versions not supported by AKS, scaling in or out should continue to work. Since there are no Quality
of Service guarantees, we recommend upgrading to bring your cluster back into support.
Can a user stay on a Kubernetes version forever?
If a cluster has been out of support for more than three (3) minor versions and has been found to carry security
risks, Azure proactively contacts you to upgrade your cluster. If you do not take further action, Azure reserves
the right to automatically upgrade your cluster on your behalf.
What version does the control plane support if the node pool is not in one of the supported AKS versions?
The control plane must be within a window of versions from all node pools. For details on upgrading the control
plane or node pools, visit documentation on upgrading node pools.
Can I skip multiple AKS versions during cluster upgrade?
When you upgrade a supported AKS cluster, Kubernetes minor versions cannot be skipped. The Kubernetes
control plane's version skew policy does not support minor version skipping. For example, upgrades between:
1.12.x -> 1.13.x: allowed.
1.13.x -> 1.14.x: allowed.
1.12.x -> 1.14.x: not allowed.
To upgrade from 1.12.x -> 1.14.x:
1. Upgrade from 1.12.x -> 1.13.x.
2. Upgrade from 1.13.x -> 1.14.x.
Skipping multiple versions can only be done when upgrading from an unsupported version back into the
minimum supported version. For example, you can upgrade from an unsupported 1.10.x to a supported 1.15.x if
1.15 is the minimum supported minor version.
Can I create a new 1.xx.x cluster during its 30-day support window?
No. Once a version is deprecated/removed, you cannot create a cluster with that version. As the change rolls out,
you will start to see the old version removed from your version list. This process may take up to two weeks from
announcement, progressively by region.
I am on a freshly deprecated version. Can I still add new node pools, or will I have to upgrade?
No. You will not be allowed to add node pools of the deprecated version to your cluster. You can add node pools
of a new version. However, this may require you to update the control plane first.
How often do you update patches?
Patches have a two month minimum lifecycle. To keep up to date when new patches are released, follow the AKS
Release Notes.
Next steps
For information on how to upgrade your cluster, see Upgrade an Azure Kubernetes Service (AKS) cluster.
Add-ons, extensions, and other integrations with
Azure Kubernetes Service
Azure Kubernetes Service (AKS) provides additional, supported functionality for your cluster using add-ons and
extensions. There are also many more integrations provided by open-source projects and third parties that are
commonly used with AKS. These open-source and third-party integrations are not covered by the AKS support
policy.
Add-ons
Add-ons provide extra capabilities for your AKS cluster, and their installation and configuration are managed by
Azure. Use az aks addon to manage all add-ons for your cluster.
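As a sketch, the following commands list the add-ons enabled on a cluster and enable one of them (the resource group and cluster names are illustrative):
az aks addon list --resource-group myResourceGroup --name myAKSCluster
az aks enable-addons --resource-group myResourceGroup --name myAKSCluster --addons azure-policy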
The following table shows the available add-ons.
virtual-node: Use virtual nodes with your AKS cluster. For more information, see Use virtual nodes.
azure-policy: Use Azure Policy for AKS, which enables at-scale enforcements and safeguards on your clusters in a centralized, consistent manner. For more information, see Understand Azure Policy for Kubernetes clusters.
open-service-mesh: Use Open Service Mesh with your AKS cluster. For more information, see Open Service Mesh AKS add-on.
azure-keyvault-secrets-provider: Use the Azure Key Vault Provider for Secrets Store CSI Driver add-on. For more information, see Use the Azure Key Vault Provider for Secrets Store CSI Driver in an AKS cluster.
Extensions
Cluster extensions build on top of certain Helm charts and provide an Azure Resource Manager-driven
experience for installation and lifecycle management of different Azure capabilities on top of your Kubernetes
cluster. For more details on the specific cluster extensions for AKS, see Deploy and manage cluster extensions for
Azure Kubernetes Service (AKS). For more details on the currently available cluster extensions, see Currently
available extensions.
Couchbase: A distributed NoSQL cloud database. See Install Couchbase and the Operator on AKS.
Apache Spark: An open source, fast engine for large-scale data processing. Running Apache Spark jobs requires a minimum node size of Standard_D3_v2. See Running Spark on Kubernetes for more details on running Spark jobs on Kubernetes.
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this quickstart, you will:
Deploy an AKS cluster using the Azure CLI.
Run a sample multi-container application with a web front-end and a Redis instance in the cluster.
This quickstart assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes
core concepts for Azure Kubernetes Service (AKS).
If you don't have an Azure subscription, create an Azure free account before you begin.
To learn more about creating a Windows Server node pool, see Create an AKS cluster that supports Windows
Server containers.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires version 2.0.64 or later of the Azure CLI. If using Azure Cloud Shell, the latest version
is already installed.
The identity you are using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
If you have multiple Azure subscriptions, select the appropriate subscription ID in which the resources
should be billed using the Az account command.
Verify Microsoft.OperationsManagement and Microsoft.OperationalInsights are registered on your
subscription. To check the registration status:
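A sketch of those checks, with the register commands needed only if a provider shows as NotRegistered:
az provider show --namespace Microsoft.OperationsManagement --output table
az provider show --namespace Microsoft.OperationalInsights --output table
az provider register --namespace Microsoft.OperationsManagement
az provider register --namespace Microsoft.OperationalInsights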
NOTE
Run the commands with administrative privileges if you plan to run the commands in this quickstart locally instead of in
Azure Cloud Shell.
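Create a resource group using the az group create command; a minimal sketch, with a name and location matching the example output below:
az group create --name myResourceGroup --location eastus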
The following output example resembles successful creation of the resource group:
{
"id": "/subscriptions/<guid>/resourceGroups/myResourceGroup",
"location": "eastus",
"managedBy": null,
"name": "myResourceGroup",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null
}
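Next, create an AKS cluster with the az aks create command. The following is a sketch only; the node count, monitoring add-on, and SSH key generation flags are assumptions consistent with this quickstart:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--enable-addons monitoring \
--generate-ssh-keys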
After a few minutes, the command completes and returns JSON-formatted information about the cluster.
NOTE
When you create an AKS cluster, a second resource group is automatically created to store the AKS resources. For more
information, see Why are two resource groups created with AKS?
az aks install-cli
2. Configure kubectl to connect to your Kubernetes cluster using the az aks get-credentials command. The
following command:
Downloads credentials and configures the Kubernetes CLI to use them.
Uses ~/.kube/config, the default location for the Kubernetes configuration file. Specify a different
location for your Kubernetes configuration file using the --file argument.
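For example:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster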
3. Verify the connection to your cluster using the kubectl get command. This command returns a list of the
cluster nodes.
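For example:
kubectl get nodes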
The following output example shows the single node created in the previous steps. Make sure the node
status is Ready:
NAME STATUS ROLES AGE VERSION
aks-nodepool1-31718369-0 Ready agent 6m44s v1.12.8
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-back
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-back
template:
metadata:
labels:
app: azure-vote-back
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-back
image: mcr.microsoft.com/oss/bitnami/redis:6.0.8
env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 6379
name: redis
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-back
spec:
ports:
- port: 6379
selector:
app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-front
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-front
template:
metadata:
labels:
app: azure-vote-front
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 80
env:
- name: REDIS
value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-front
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: azure-vote-front
3. Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
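A sketch, assuming the manifest shown above was saved as azure-vote.yaml (the file name is illustrative):
kubectl apply -f azure-vote.yaml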
The following example output shows the successfully created deployments and services:
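deployment.apps/azure-vote-back created
service/azure-vote-back created
deployment.apps/azure-vote-front created
service/azure-vote-front created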
The EXTERNAL-IP output for the azure-vote-front service will initially show as pending.
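Monitor progress with the kubectl get service command and the --watch argument, for example:
kubectl get service azure-vote-front --watch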
Once the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the Azure Vote app in action, open a web browser to the external IP address of your service.
NOTE
Because the AKS cluster was created with a system-assigned managed identity (the default identity option used in this quickstart), the
identity is managed by the platform and does not require removal.
Next steps
In this quickstart, you deployed a Kubernetes cluster and then deployed a simple multi-container application to
it.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
This quickstart is for introductory purposes. For guidance on creating full solutions with AKS for production,
see AKS solution guidance.
Quickstart: Deploy an Azure Kubernetes Service
cluster using PowerShell
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this quickstart, you will:
Deploy an AKS cluster using PowerShell.
Run a sample multi-container application with a web front-end and a Redis instance in the cluster.
This quickstart assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes
core concepts for Azure Kubernetes Service (AKS).
Prerequisites
If you don't have an Azure subscription, create an Azure free account before you begin.
If you're running PowerShell locally, install the Az PowerShell module and connect to your Azure account
using the Connect-AzAccount cmdlet. For more information about installing the Az PowerShell module,
see Install Azure PowerShell.
The identity you are using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
If you have multiple Azure subscriptions, select the appropriate subscription ID in which the resources
should be billed using the Set-AzContext cmdlet.
The following output example resembles successful creation of the resource group:
ResourceGroupName : myResourceGroup
Location : eastus
ProvisioningState : Succeeded
Tags :
ResourceId : /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup
After a few minutes, the command completes and returns information about the cluster.
NOTE
When you create an AKS cluster, a second resource group is automatically created to store the AKS resources. For more
information, see Why are two resource groups created with AKS?
Install-AzAksKubectl
2. Configure kubectl to connect to your Kubernetes cluster using the Import-AzAksCredential cmdlet. The
following cmdlet downloads credentials and configures the Kubernetes CLI to use them.
3. Verify the connection to your cluster using the kubectl get command. This command returns a list of the
cluster nodes.
The following output example shows the single node created in the previous steps. Make sure the node
status is Ready:
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-back
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-back
template:
metadata:
labels:
app: azure-vote-back
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-back
image: mcr.microsoft.com/oss/bitnami/redis:6.0.8
env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 6379
name: redis
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-back
spec:
ports:
- port: 6379
selector:
app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-front
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-front
template:
metadata:
labels:
app: azure-vote-front
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 80
env:
- name: REDIS
value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-front
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: azure-vote-front
3. Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
The following example resembles output showing the successfully created deployments and services:
deployment.apps/azure-vote-back created
service/azure-vote-back created
deployment.apps/azure-vote-front created
service/azure-vote-front created
The EXTERNAL-IP output for the azure-vote-front service will initially show as pending.
Once the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
azure-vote-front LoadBalancer 10.0.37.27 52.179.23.131 80:30572/TCP 2m
To see the Azure Vote app in action, open a web browser to the external IP address of your service.
NOTE
Because the AKS cluster was created with a system-assigned managed identity (the default identity option used in this quickstart), the
identity is managed by the platform and does not require removal.
Next steps
In this quickstart, you deployed a Kubernetes cluster and then deployed a sample multi-container application to
it.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
Quickstart: Deploy an Azure Kubernetes Service
(AKS) cluster using the Azure portal
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this quickstart, you will:
Deploy an AKS cluster using the Azure portal.
Run a sample multi-container application with a web front-end and a Redis instance in the cluster.
This quickstart assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes
core concepts for Azure Kubernetes Service (AKS).
Prerequisites
If you don't have an Azure subscription, create an Azure free account before you begin.
If you are unfamiliar with using the Bash environment in Azure Cloud Shell, review Overview of Azure
Cloud Shell.
The identity you are using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
NOTE
To perform these operations in a local shell installation:
1. Verify Azure CLI is installed.
2. Connect to Azure via the az login command.
2. Configure kubectl to connect to your Kubernetes cluster using the az aks get-credentials command. The
following command downloads credentials and configures the Kubernetes CLI to use them.
3. Verify the connection to your cluster using kubectl get to return a list of the cluster nodes.
Output shows the nodes created in the previous steps. Make sure the node status is Ready:
NAME STATUS ROLES AGE VERSION
aks-agentpool-12345678-vmss000000 Ready agent 23m v1.19.11
aks-agentpool-12345678-vmss000001 Ready agent 24m v1.19.11
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-back
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-back
template:
metadata:
labels:
app: azure-vote-back
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-back
image: mcr.microsoft.com/oss/bitnami/redis:6.0.8
env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 6379
name: redis
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-back
spec:
ports:
- port: 6379
selector:
app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-front
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-front
template:
metadata:
labels:
app: azure-vote-front
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 80
env:
- name: REDIS
value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-front
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: azure-vote-front
3. Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
The EXTERNAL-IP output for the azure-vote-front service will initially show as pending.
Once the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the Azure Vote app in action, open a web browser to the external IP address of your service.
Delete cluster
To avoid Azure charges, if you don't plan on going through the tutorials that follow, clean up your unnecessary
resources. Select the Delete button on the AKS cluster dashboard. You can also use the az aks delete command
in the Cloud Shell:
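For example (the cluster and resource group names are illustrative):
az aks delete --resource-group myResourceGroup --name myAKSCluster --yes --no-wait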
NOTE
When you delete the cluster, the system-assigned managed identity used by the cluster is managed by the platform and does not require
removal.
Next steps
In this quickstart, you deployed a Kubernetes cluster and then deployed a sample multi-container application to
it.
To learn more about AKS by walking through a complete example, including building an application, deploying
from Azure Container Registry, updating a running application, and scaling and upgrading your cluster, continue
to the Kubernetes cluster tutorial.
AKS tutorial
Quickstart: Deploy an Azure Kubernetes Service
(AKS) cluster using an ARM template
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this quickstart, you will:
Deploy an AKS cluster using an Azure Resource Manager template.
Run a sample multi-container application with a web front-end and a Redis instance in the cluster.
An ARM template is a JavaScript Object Notation (JSON) file that defines the infrastructure and configuration for
your project. The template uses declarative syntax. In declarative syntax, you describe your intended deployment
without writing the sequence of programming commands to create the deployment.
This quickstart assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes
core concepts for Azure Kubernetes Service (AKS).
If your environment meets the prerequisites and you're familiar with using ARM templates, select the Deploy to
Azure button. The template will open in the Azure portal.
If you don't have an Azure subscription, create an Azure free account before you begin.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires version 2.0.64 or later of the Azure CLI. If using Azure Cloud Shell, the latest version
is already installed.
To create an AKS cluster using a Resource Manager template, you provide an SSH public key. If you need
this resource, see the following section; otherwise skip to the Review the template section.
The identity you are using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
To deploy a Bicep file or ARM template, you need write access on the resources you're deploying and
access to all operations on the Microsoft.Resources/deployments resource type. For example, to deploy a
virtual machine, you need Microsoft.Compute/virtualMachines/write and
Microsoft.Resources/deployments/* permissions. For a list of roles and permissions, see Azure built-in
roles.
Create an SSH key pair
To access AKS nodes, you connect using an SSH key pair (public and private), which you generate using the
ssh-keygen command. By default, these files are created in the ~/.ssh directory. Running the ssh-keygen
command will overwrite any SSH key pair with the same name already existing in the given location.
1. Go to https://shell.azure.com to open Cloud Shell in your browser.
2. Run the ssh-keygen command. The following example creates an SSH key pair using RSA encryption and
a bit length of 4096:
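ssh-keygen -t rsa -b 4096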
For more information about creating SSH keys, see Create and manage SSH keys for authentication in Azure.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"metadata": {
"_generator": {
"name": "bicep",
"version": "0.5.6.12127",
"templateHash": "4935111768494834723"
}
},
"parameters": {
"clusterName": {
"type": "string",
"type": "string",
"defaultValue": "aks101cluster",
"metadata": {
"description": "The name of the Managed Cluster resource."
}
},
"location": {
"type": "string",
"defaultValue": "[resourceGroup().location]",
"metadata": {
"description": "The location of the Managed Cluster resource."
}
},
"dnsPrefix": {
"type": "string",
"metadata": {
"description": "Optional DNS prefix to use with hosted Kubernetes API server FQDN."
}
},
"osDiskSizeGB": {
"type": "int",
"defaultValue": 0,
"maxValue": 1023,
"minValue": 0,
"metadata": {
"description": "Disk size (in GB) to provision for each of the agent pool nodes. This value ranges
from 0 to 1023. Specifying 0 will apply the default disk size for that agentVMSize."
}
},
"agentCount": {
"type": "int",
"defaultValue": 3,
"maxValue": 50,
"minValue": 1,
"metadata": {
"description": "The number of nodes for the cluster."
}
},
"agentVMSize": {
"type": "string",
"defaultValue": "Standard_Ds_v3",
"metadata": {
"description": "The size of the Virtual Machine."
}
},
"linuxAdminUsername": {
"type": "string",
"metadata": {
"description": "User name for the Linux Virtual Machines."
}
},
"sshRSAPublicKey": {
"type": "string",
"metadata": {
"description": "Configure all linux machines with the SSH RSA public key string. Your key should
include three parts, for example 'ssh-rsa AAAAB...snip...UcyupgH azureuser@linuxvm'"
}
}
},
"resources": [
{
"type": "Microsoft.ContainerService/managedClusters",
"apiVersion": "2020-09-01",
"name": "[parameters('clusterName')]",
"location": "[parameters('location')]",
"identity": {
"type": "SystemAssigned"
},
"properties": {
"dnsPrefix": "[parameters('dnsPrefix')]",
"dnsPrefix": "[parameters('dnsPrefix')]",
"agentPoolProfiles": [
{
"name": "agentpool",
"osDiskSizeGB": "[parameters('osDiskSizeGB')]",
"count": "[parameters('agentCount')]",
"vmSize": "[parameters('agentVMSize')]",
"osType": "Linux",
"mode": "System"
}
],
"linuxProfile": {
"adminUsername": "[parameters('linuxAdminUsername')]",
"ssh": {
"publicKeys": [
{
"keyData": "[parameters('sshRSAPublicKey')]"
}
]
}
}
}
}
],
"outputs": {
"controlPlaneFQDN": {
"type": "string",
"value": "[reference(resourceId('Microsoft.ContainerService/managedClusters',
parameters('clusterName'))).fqdn]"
}
}
}
For more AKS samples, see the AKS quickstart templates site.
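The template can also be deployed from the command line with az deployment group create. The following is a sketch only; it assumes the template was saved locally as azuredeploy.json, and the parameter values are illustrative:
az deployment group create \
--resource-group myResourceGroup \
--template-file azuredeploy.json \
--parameters clusterName=myAKSCluster dnsPrefix=myakscluster linuxAdminUsername=azureuser sshRSAPublicKey="$(cat ~/.ssh/id_rsa.pub)"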
az aks install-cli
2. Configure kubectl to connect to your Kubernetes cluster using the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
The following output example shows the single node created in the previous steps. Make sure the node
status is Ready:
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-back
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-back
template:
metadata:
labels:
app: azure-vote-back
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-back
image: mcr.microsoft.com/oss/bitnami/redis:6.0.8
env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 6379
name: redis
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-back
spec:
ports:
- port: 6379
selector:
app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-front
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-front
template:
metadata:
labels:
app: azure-vote-front
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 80
env:
- name: REDIS
value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-front
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: azure-vote-front
3. Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
The following example resembles output showing the successfully created deployments and services:
deployment "azure-vote-back" created
service "azure-vote-back" created
deployment "azure-vote-front" created
service "azure-vote-front" created
The EXTERNAL-IP output for the azure-vote-front service will initially show as pending.
Once the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the Azure Vote app in action, open a web browser to the external IP address of your service.
Clean up resources
To avoid Azure charges, if you don't plan on going through the tutorials that follow, clean up your unnecessary
resources. Use the az group delete command to remove the resource group, container service, and all related
resources.
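For example:
az group delete --name myResourceGroup --yes --no-wait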
Next steps
In this quickstart, you deployed a Kubernetes cluster and then deployed a sample multi-container application to
it.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
Create a Windows Server container on an Azure
Kubernetes Service (AKS) cluster using the Azure
CLI
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this article, you deploy an AKS cluster that runs Windows Server 2019 containers using the Azure
CLI. You also deploy an ASP.NET sample application in a Windows Server container to the cluster.
This article assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes core
concepts for Azure Kubernetes Service (AKS).
If you don't have an Azure subscription, create an Azure free account before you begin.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires version 2.0.64 or later of the Azure CLI. If using Azure Cloud Shell, the latest version
is already installed.
The identity you're using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
If you have multiple Azure subscriptions, select the appropriate subscription ID in which the resources
should be billed using the Az account command.
Limitations
The following limitations apply when you create and manage AKS clusters that support multiple node pools:
You can't delete the first node pool.
The following additional limitations apply to Windows Server node pools:
The AKS cluster can have a maximum of 10 node pools.
The AKS cluster can have a maximum of 100 nodes in each node pool.
The Windows Server node pool name has a limit of 6 characters.
NOTE
This article uses Bash syntax for the commands in this tutorial. If you're using Azure Cloud Shell, ensure that the
dropdown in the upper-left of the Cloud Shell window is set to Bash .
The following example output shows the resource group created successfully:
{
"id": "/subscriptions/<guid>/resourceGroups/myResourceGroup",
"location": "eastus",
"managedBy": null,
"name": "myResourceGroup",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null,
"type": null
}
NOTE
To ensure your cluster operates reliably, you should run at least 2 (two) nodes in the default node pool.
Create a username to use as administrator credentials for the Windows Server nodes on your cluster. The
following commands prompt you for a username and set it to WINDOWS_USERNAME for use in a later
command (remember that the commands in this article are entered into a Bash shell).
echo "Please enter the username to use as administrator credentials for Windows Server nodes on your
cluster: " && read WINDOWS_USERNAME
Create your cluster, ensuring you specify the --windows-admin-username parameter. The following example command
creates a cluster using the value from WINDOWS_USERNAME you set in the previous command. Alternatively,
you can provide a different username directly in the parameter instead of using WINDOWS_USERNAME. The
following command will also prompt you to create a password for the administrator credentials for the
Windows Server nodes on your cluster. Alternatively, you can use the --windows-admin-password parameter and
specify your own value there.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 2 \
--enable-addons monitoring \
--generate-ssh-keys \
--windows-admin-username $WINDOWS_USERNAME \
--vm-set-type VirtualMachineScaleSets \
--network-plugin azure
NOTE
If you get a password validation error, verify the password you set meets the Windows Server password requirements. If
your password meets the requirements, try creating your resource group in another region. Then try creating the cluster
with the new resource group.
If you do not specify an administrator username and password when setting --vm-set-type VirtualMachineScaleSets
and --network-plugin azure , the username is set to azureuser and the password is set to a random value.
The administrator username can't be changed, but you can change the administrator password your AKS cluster uses for
Windows Server nodes using az aks update . For more details, see Windows Server node pools FAQ.
After a few minutes, the command completes and returns JSON-formatted information about the cluster.
Occasionally the cluster can take longer than a few minutes to provision. Allow up to 10 minutes in these cases.
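A Windows Server node pool can then be added with the az aks nodepool add command. The following is a sketch; the node count is an assumption, while the pool name npwin matches the description below:
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--os-type Windows \
--name npwin \
--node-count 1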
The above command creates a new node pool named npwin and adds it to the myAKSCluster. The above
command also uses the default subnet in the default vnet created when running az aks create .
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
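A sketch of that check (the preview feature name itself isn't shown in this excerpt, so the namespace filter below is illustrative):
az feature list --namespace Microsoft.ContainerService --output table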
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
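For example:
az provider register --namespace Microsoft.ContainerService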
Use az aks nodepool add command to add a Windows Server 2022 node pool:
Use the az aks nodepool add command to add a node pool that can run Windows Server containers with the
containerd runtime.
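A sketch of such a command follows; the node count is an assumption, the Kubernetes version matches the one used later in this article, and the custom header and pool name npwcd match the description below:
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--os-type Windows \
--name npwcd \
--node-count 1 \
--kubernetes-version 1.20.7 \
--aks-custom-headers WindowsContainerRuntime=containerd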
NOTE
If you do not specify the WindowsContainerRuntime=containerd custom header, the node pool will use Docker as the
container runtime.
The above command creates a new Windows Server node pool using containerd as the runtime named npwcd
and adds it to the myAKSCluster. The above command also uses the default subnet in the default vnet created
when running az aks create .
Upgrade an existing Windows Server node pool to containerd
Use the az aks nodepool upgrade command to upgrade a specific node pool from Docker to containerd .
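A sketch, using the Kubernetes version from elsewhere in this article and the pool name referenced in the next sentence:
az aks nodepool upgrade \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name npwd \
--kubernetes-version 1.20.7 \
--aks-custom-headers WindowsContainerRuntime=containerd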
The above command upgrades a node pool named npwd to the containerd runtime.
To upgrade all existing node pools in a cluster to use the containerd runtime for all Windows Server node
pools:
az aks upgrade \
--resource-group myResourceGroup \
--name myAKSCluster \
--kubernetes-version 1.20.7 \
--aks-custom-headers WindowsContainerRuntime=containerd
The above command upgrades all Windows Server node pools in the myAKSCluster to use the containerd
runtime.
NOTE
After upgrading all existing Windows Server node pools to use the containerd runtime, Docker will still be the default
runtime when adding new Windows Server node pools.
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
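For example (the -o wide flag adds the CONTAINER-RUNTIME column referenced in the note below):
kubectl get nodes -o wide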
The following example output shows all the nodes in the cluster. Make sure that the status of all nodes is
Ready:
NOTE
The container runtime for each node pool is shown under CONTAINER-RUNTIME. Notice aksnpwin987654 begins with
docker:// which means it is using Docker for the container runtime. Notice aksnpwcd123456 begins with
containerd:// which means it is using containerd for the container runtime.
apiVersion: apps/v1
kind: Deployment
metadata:
name: sample
labels:
app: sample
spec:
replicas: 1
template:
metadata:
name: sample
labels:
app: sample
spec:
nodeSelector:
"kubernetes.io/os": windows
containers:
- name: sample
image: mcr.microsoft.com/dotnet/framework/samples:aspnetapp
resources:
limits:
cpu: 1
memory: 800M
ports:
- containerPort: 80
selector:
matchLabels:
app: sample
---
apiVersion: v1
kind: Service
metadata:
name: sample
spec:
type: LoadBalancer
ports:
- protocol: TCP
port: 80
selector:
app: sample
Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
The following example output shows the Deployment and Service created successfully:
deployment.apps/sample created
service/sample created
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the sample app in action, open a web browser to the external IP address of your service.
NOTE
If you receive a connection timeout when trying to load the page, verify the sample app is ready with the
kubectl get pods --watch command. Sometimes the Windows container will not have started by the time your
external IP address is available.
Delete cluster
To avoid Azure charges, if you don't plan on going through the tutorials that follow, use the az group delete
command to remove the resource group, container service, and all related resources.
NOTE
Because the AKS cluster was created with a system-assigned managed identity (the default identity option used in this quickstart), the
identity is managed by the platform and does not require removal.
Next steps
In this article, you deployed a Kubernetes cluster and deployed an ASP.NET sample application in a Windows
Server container to it.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
Create a Windows Server container on an Azure
Kubernetes Service (AKS) cluster using PowerShell
Azure Kubernetes Service (AKS) is a managed Kubernetes service that lets you quickly deploy and manage
clusters. In this article, you deploy an AKS cluster running Windows Server 2019 containers using PowerShell.
You also deploy an ASP.NET sample application in a Windows Server container to the cluster.
This article assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes core
concepts for Azure Kubernetes Service (AKS).
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
The identity you are using to create your cluster has the appropriate minimum permissions. For more
details on access and identity for AKS, see Access and identity options for Azure Kubernetes Service
(AKS).
If you choose to use PowerShell locally, you need to install the Az PowerShell module and connect to your
Azure account using the Connect-AzAccount cmdlet. For more information about installing the Az
PowerShell module, see Install Azure PowerShell.
You also must install the Az.Aks PowerShell module:
Install-Module Az.Aks
Use Azure Cloud Shell
Azure hosts Azure Cloud Shell, an interactive shell environment that you can use through your browser. You can
use either Bash or PowerShell with Cloud Shell to work with Azure services. You can use the Cloud Shell
preinstalled commands to run the code in this article, without having to install anything on your local
environment.
To start Azure Cloud Shell:
Limitations
The following limitations apply when you create and manage AKS clusters that support multiple node pools:
You can't delete the first node pool.
The following additional limitations apply to Windows Server node pools:
The AKS cluster can have a maximum of 10 node pools.
The AKS cluster can have a maximum of 100 nodes in each node pool.
The Windows Server node pool name has a limit of 6 characters.
NOTE
This article uses PowerShell syntax for the commands in this tutorial. If you are using Azure Cloud Shell, ensure that the
dropdown in the upper-left of the Cloud Shell window is set to PowerShell.
The following example output shows the resource group created successfully:
ResourceGroupName : myResourceGroup
Location : eastus
ProvisioningState : Succeeded
Tags :
ResourceId : /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup
NOTE
To ensure your cluster operates reliably, you should run at least 2 (two) nodes in the default node pool.
$Username = Read-Host -Prompt 'Please create a username for the administrator credentials on your Windows
Server containers: '
$Password = Read-Host -Prompt 'Please create a password for the administrator credentials on your Windows
Server containers: ' -AsSecureString
New-AzAksCluster -ResourceGroupName myResourceGroup -Name myAKSCluster -NodeCount 2 -NetworkPlugin azure -
NodeVmSetType VirtualMachineScaleSets -WindowsProfileAdminUserName $Username -
WindowsProfileAdminUserPassword $Password
NOTE
If you are unable to create the AKS cluster because the version is not supported in this region then you can use the
Get-AzAksVersion -Location eastus command to find the supported version list for this region.
After a few minutes, the command completes and returns information about the cluster. Occasionally the cluster
can take longer than a few minutes to provision. Allow up to 10 minutes in these cases.
The above command creates a new node pool named npwin and adds it to the myAKSCluster . When creating
a node pool to run Windows Server containers, the default value for VmSize is Standard_D2s_v3 . If you
choose to set the VmSize parameter, check the list of restricted VM sizes. The minimum recommended size is
Standard_D2s_v3 . The previous command also uses the default subnet in the default vnet created when
running New-AzAksCluster .
Install-AzAksKubectl
To configure kubectl to connect to your Kubernetes cluster, use the Import-AzAksCredential cmdlet. This
command downloads credentials and configures the Kubernetes CLI to use them.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
The following example output shows all the nodes in the cluster. Make sure that the status of all nodes is Ready :
Deploy the application using the kubectl apply command and specify the name of your YAML manifest:
The following example output shows the Deployment and Service created successfully:
deployment.apps/sample created
service/sample created
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the sample app in action, open a web browser to the external IP address of your service.
NOTE
If you receive a connection timeout when trying to load the page then you should verify the sample app is ready with the
following command kubectl get pods --watch . Sometimes the Windows container will not be started by the time your
external IP address is available.
Delete cluster
To avoid Azure charges, if you don't plan on going through the tutorials that follow, use the Remove-
AzResourceGroup cmdlet to remove the resource group, container service, and all related resources.
Next steps
In this article, you deployed a Kubernetes cluster and deployed an ASP.NET sample application in a Windows
Server container to it.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
Quickstart: Develop on Azure Kubernetes Service
(AKS) with Helm
Helm is an open-source packaging tool that helps you install and manage the lifecycle of Kubernetes
applications. Similar to Linux package managers like APT and Yum, Helm manages Kubernetes charts, which are
packages of pre-configured Kubernetes resources.
In this quickstart, you'll use Helm to package and run an application on AKS. For more details on installing an
existing application using Helm, see the Install existing applications with Helm in AKS how-to guide.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
Helm v3 installed.
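An Azure Container Registry (ACR) is used to store the application image. A sketch of creating one, using names consistent with the example output below (the resource group, registry name, location, and SKU are illustrative):
az group create --name MyResourceGroup --location eastus
az acr create --resource-group MyResourceGroup --name MyHelmACR --sku Basic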
Output will be similar to the following example. Take note of your loginServer value for your ACR since you'll
use it in a later step. In the below example, myhelmacr.azurecr.io is the loginServer for MyHelmACR .
{
"adminUserEnabled": false,
"creationDate": "2019-06-11T13:35:17.998425+00:00",
"id":
"/subscriptions/<ID>/resourceGroups/MyResourceGroup/providers/Microsoft.ContainerRegistry/registries/MyHelmA
CR",
"location": "eastus",
"loginServer": "myhelmacr.azurecr.io",
"name": "MyHelmACR",
"networkRuleSet": null,
"provisioningState": "Succeeded",
"resourceGroup": "MyResourceGroup",
"sku": {
"name": "Basic",
"tier": "Basic"
},
"status": null,
"storageAccount": null,
"tags": {},
"type": "Microsoft.ContainerRegistry/registries"
}
Create an AKS cluster that is attached to the ACR so the cluster can pull images from it:
az aks create --resource-group MyResourceGroup --name MyAKS --location eastus --attach-acr MyHelmACR --generate-ssh-keys
1. Install kubectl locally, if needed, using the az aks install-cli command:
az aks install-cli
2. Configure kubectl to connect to your Kubernetes cluster using the az aks get-credentials command.
The following command example gets credentials for the AKS cluster named MyAKS in the
MyResourceGroup:
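For example:
az aks get-credentials --resource-group MyResourceGroup --name MyAKS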
NOTE
In addition to importing container images into your ACR, you can also import Helm charts into your ACR. For more
information, see Push and pull Helm charts to an Azure container registry.
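Before updating the chart, the sample image referenced later needs to exist in the registry; a sketch assuming the sample application source (with its Dockerfile) is in the current directory:
az acr build --registry MyHelmACR --image azure-vote-front:v1 --file Dockerfile .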
Update azure-vote-front/Chart.yaml to add a dependency for the redis chart from the
https://charts.bitnami.com/bitnami chart repository and update appVersion to v1 . For example:
NOTE
The container image versions shown in this guide have been tested to work with this example but may not be the latest
version available.
apiVersion: v2
name: azure-vote-front
description: A Helm chart for Kubernetes
dependencies:
- name: redis
version: 14.7.1
repository: https://charts.bitnami.com/bitnami
...
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application.
appVersion: v1
Update azure-vote-front/values.yaml:
Add a redis section to set the image details, container port, and deployment name.
Add a backendName for connecting the frontend portion to the redis deployment.
Change image.repository to <loginServer>/azure-vote-front .
Change image.tag to v1 .
Change service.type to LoadBalancer.
For example:
replicaCount: 1
backendName: azure-vote-backend-master
redis:
image:
registry: mcr.microsoft.com
repository: oss/bitnami/redis
tag: 6.0.8
fullnameOverride: azure-vote-backend
auth:
enabled: false
image:
repository: myhelmacr.azurecr.io/azure-vote-front
pullPolicy: IfNotPresent
tag: "v1"
...
service:
type: LoadBalancer
port: 80
...
Add an env section to azure-vote-front/templates/deployment.yaml for passing the name of the redis
deployment.
...
containers:
- name: {{ .Chart.Name }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
env:
- name: REDIS
value: {{ .Values.backendName }}
...
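With the chart, values, and template updated, the chart can be deployed to the cluster; a minimal sketch, assuming the chart directory is named azure-vote-front:
helm dependency update azure-vote-front
helm install azure-vote-front azure-vote-front/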
It takes a few minutes for the service to return a public IP address. Monitor progress using the
kubectl get service command with the --watch argument.
$ kubectl get service azure-vote-front --watch
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
azure-vote-front LoadBalancer 10.0.18.228 <pending> 80:32021/TCP 6s
...
azure-vote-front LoadBalancer 10.0.18.228 52.188.140.81 80:32021/TCP 2m6s
Navigate to your application's load balancer in a browser using the <EXTERNAL-IP> to see the sample application.
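When you're finished with the quickstart, you can remove the resource group and all related resources; a sketch assuming the resource group name used above:
az group delete --name MyResourceGroup --yes --no-wait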
NOTE
If the AKS cluster was created with system-assigned managed identity (default identity option used in this quickstart), the
identity is managed by the platform and does not require removal.
If the AKS cluster was created with service principal as the identity option instead, then when you delete the cluster, the
service principal used by the AKS cluster is not removed. For steps on how to remove the service principal, see AKS
service principal considerations and deletion.
Next steps
For more information about using Helm, see the Helm documentation.
Helm documentation
Quickstart: Deploy an application using the Dapr
cluster extension for Azure Kubernetes Service
(AKS) or Arc-enabled Kubernetes
6/15/2022 • 4 minutes to read • Edit Online
In this quickstart, you will get familiar with using the Dapr cluster extension in an AKS or Arc-enabled
Kubernetes cluster. You will be deploying a hello world example, consisting of a Python application that
generates messages and a Node application that consumes and persists them.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
An AKS or Arc-enabled Kubernetes cluster with the Dapr cluster extension enabled
You will also need to add the following two lines below redisPassword to enable connection over TLS:
- name: redisPassword
secretKeyRef:
name: redis
key: redis-password
- name: enableTLS
value: true
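Apply the state store component to the cluster; a sketch assuming the component manifest is saved as redis.yaml:
kubectl apply -f redis.yaml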
And verify that your state store was successfully configured in the output:
component.dapr.io/statestore created
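Next, deploy the Node.js app and its Dapr sidecar; a sketch assuming the manifest is the node.yaml file referenced below:
kubectl apply -f node.yaml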
NOTE
Kubernetes deployments are asynchronous. This means you'll need to wait for the deployment to complete before
moving on to the next steps. You can do so with the following command:
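A sketch of such a wait, assuming the Node.js deployment is named nodeapp (matching the app-id shown below):
kubectl rollout status deploy/nodeapp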
This will deploy the Node.js app to Kubernetes. The Dapr control plane will automatically inject the Dapr sidecar
to the Pod. If you take a look at the node.yaml file, you will see how Dapr is enabled for that deployment:
dapr.io/enabled: true - this tells the Dapr control plane to inject a sidecar to this deployment.
dapr.io/app-id: nodeapp - this assigns a unique ID or name to the Dapr application, so it can be sent messages to and communicated with by other Dapr apps.
To access your service, obtain and make note of the EXTERNAL-IP via kubectl :
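A sketch, assuming the service is named nodeapp:
kubectl get svc nodeapp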
curl $EXTERNAL_IP/ports
curl $EXTERNAL_IP/order
{ "orderId": "42" }
TIP
This is a good time to get acquainted with the Dapr dashboard- a convenient interface to check status, information and
logs of applications running on Dapr. The following command will make it available on http://localhost:8080/ :
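One way to start it, assuming the Dapr CLI is installed locally and your kubectl context points at the cluster (the -k flag targets Kubernetes, -p sets the local port):
dapr dashboard -k -p 8080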
import time
import requests

# dapr_url is defined earlier in the sample app; the value below assumes the
# default Dapr HTTP port (3500) and the nodeapp's "neworder" method.
dapr_url = "http://localhost:3500/v1.0/invoke/nodeapp/method/neworder"

n = 0
while True:
    n += 1
    message = {"data": {"orderId": n}}
    try:
        response = requests.post(dapr_url, json=message)
    except Exception as e:
        print(e)
    time.sleep(1)
If the deployments were successful, you should see logs like this:
Call the Node.js app's order endpoint to get the latest order. Grab the external IP address that you saved before,
append "/order", and perform a GET request against it (enter it into your browser, use Postman, or curl it!):
curl $EXTERNAL_IP/order
{"orderID":"42"}
Clean up resources
Use the az group delete command to remove the resource group, the cluster, the namespace, and all related
resources.
Next steps
After successfully deploying this sample application:
Learn more about other cluster extensions
Quickstart: Subscribe to Azure Kubernetes Service
(AKS) events with Azure Event Grid (Preview)
6/15/2022 • 3 minutes to read • Edit Online
Azure Event Grid is a fully managed event routing service that provides uniform event consumption using a
publish-subscribe model.
In this quickstart, you'll create an AKS cluster and subscribe to AKS events.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
Register the EventgridPreview preview feature
To use the feature, you must also enable the EventgridPreview feature flag on your subscription.
Register the EventgridPreview feature flag by using the az feature register command, as shown in the following
example:
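A sketch of the registration command (the flag values follow the standard preview-feature pattern used by AKS):
az feature register --namespace "Microsoft.ContainerService" --name "EventgridPreview"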
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
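For instance:
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/EventgridPreview')].{Name:name,State:properties.state}"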
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
It might take a moment for the registration to finish. To check the status, run:
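A sketch of the refresh and status check:
az provider register --namespace Microsoft.ContainerService
az provider show --namespace Microsoft.ContainerService --query registrationState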
NOTE
The name of your namespace must be unique.
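A sketch of creating the event hub and the subscription (the names match the output below; the AKS cluster and event hub resource IDs are placeholders you must supply, for example via az aks show and az eventhubs eventhub show with --query id):
az eventhubs namespace create --resource-group MyResourceGroup --name <unique-namespace-name> --location eastus
az eventhubs eventhub create --resource-group MyResourceGroup --namespace-name <unique-namespace-name> --name MyEventGridHub
az eventgrid event-subscription create --name MyEventGridSubscription --source-resource-id <aks-cluster-resource-id> --endpoint-type eventhub --endpoint <event-hub-resource-id>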
The following example output shows you're subscribed to events from the MyAKS cluster and those events are
delivered to the MyEventGridHub event hub:
[
{
"deadLetterDestination": null,
"deadLetterWithResourceIdentity": null,
"deliveryWithResourceIdentity": null,
"destination": {
"deliveryAttributeMappings": null,
"endpointType": "EventHub",
"resourceId":
"/subscriptions/SUBSCRIPTION_ID/resourceGroups/MyResourceGroup/providers/Microsoft.EventHub/namespaces/MyNam
espace/eventhubs/MyEventGridHub"
},
"eventDeliverySchema": "EventGridSchema",
"expirationTimeUtc": null,
"filter": {
"advancedFilters": null,
"enableAdvancedFilteringOnArrays": null,
"includedEventTypes": [
"Microsoft.ContainerService.NewKubernetesVersionAvailable"
],
"isSubjectCaseSensitive": null,
"subjectBeginsWith": "",
"subjectEndsWith": ""
},
"id":
"/subscriptions/SUBSCRIPTION_ID/resourceGroups/MyResourceGroup/providers/Microsoft.ContainerService/managedC
lusters/MyAKS/providers/Microsoft.EventGrid/eventSubscriptions/MyEventGridSubscription",
"labels": null,
"name": "MyEventGridSubscription",
"provisioningState": "Succeeded",
"resourceGroup": "MyResourceGroup",
"retryPolicy": {
"eventTimeToLiveInMinutes": 1440,
"maxDeliveryAttempts": 30
},
"systemData": null,
"topic":
"/subscriptions/SUBSCRIPTION_ID/resourceGroups/MyResourceGroup/providers/microsoft.containerservice/managedc
lusters/MyAKS",
"type": "Microsoft.EventGrid/eventSubscriptions"
}
]
When AKS events occur, you'll see those events appear in your event hub. For example, when the list of available
Kubernetes versions for your clusters changes, you'll see a
Microsoft.ContainerService.NewKubernetesVersionAvailable event. For more information on the events AKS
emits, see Azure Kubernetes Service (AKS) as an Event Grid source.
Next steps
In this quickstart, you deployed a Kubernetes cluster and then subscribed to AKS events in Azure Event Hubs.
To learn more about AKS, and walk through a complete code to deployment example, continue to the
Kubernetes cluster tutorial.
AKS tutorial
Tutorial: Prepare an application for Azure
Kubernetes Service (AKS)
6/15/2022 • 3 minutes to read • Edit Online
In this tutorial, part one of seven, a multi-container application is prepared for use in Kubernetes. Existing
development tools such as Docker Compose are used to locally build and test an application. You learn how to:
Clone a sample application source from GitHub
Create a container image from the sample application source
Test the multi-container application in a local Docker environment
Once completed, the sample application runs in your local development environment.
In later tutorials, the container image is uploaded to an Azure Container Registry, and then deployed into an AKS
cluster.
To complete this tutorial, you need a local Docker development environment running Linux containers. Docker
provides packages that configure Docker on a Mac, Windows, or Linux system.
NOTE
Azure Cloud Shell does not include the Docker components required to complete every step in these tutorials. Therefore,
we recommend using a full Docker development environment.
Get application code
The sample application used in this tutorial is a basic voting app consisting of a front-end web component and a
back-end Redis instance. The web component is packaged into a custom container image. The Redis instance
uses an unmodified image from Docker Hub.
Use git to clone the sample application to your development environment:
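For example, using the Azure-Samples repository for this tutorial series:
git clone https://github.com/Azure-Samples/azure-voting-app-redis.git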
cd azure-voting-app-redis
Inside the directory is the application source code, a pre-created Docker compose file, and a Kubernetes
manifest file. These files are used throughout the tutorial set. The contents and structure of the directory are as
follows:
azure-voting-app-redis
│ azure-vote-all-in-one-redis.yaml
│ docker-compose.yaml
│ LICENSE
│ README.md
│
├───azure-vote
│ │ app_init.supervisord.conf
│ │ Dockerfile
│ │ Dockerfile-for-app-service
│ │ sshd_config
│ │
│ └───azure-vote
│ │ config_file.cfg
│ │ main.py
│ │
│ ├───static
│ │ default.css
│ │
│ └───templates
│ index.html
│
└───jenkins-tutorial
config-jenkins.sh
deploy-jenkins-vm.sh
Use the docker-compose.yaml file to create the container image, download the Redis image, and start the application:
docker-compose up -d
When completed, use the docker images command to see the created images. Three images have been
downloaded or created. The azure-vote-front image contains the front-end application and uses the nginx-flask
image as a base. The redis image is used to start a Redis instance.
$ docker images
To see the running containers, use the docker ps command:
$ docker ps
Clean up resources
Now that the application's functionality has been validated, the running containers can be stopped and removed.
Do not delete the container images - in the next tutorial, the azure-vote-front image is uploaded to an
Azure Container Registry instance.
Stop and remove the container instances and resources with the docker-compose down command:
docker-compose down
When the local application has been removed, you have a Docker image that contains the Azure Vote
application, azure-vote-front, for use with the next tutorial.
Next steps
In this tutorial, an application was tested and container images created for the application. You learned how to:
Clone a sample application source from GitHub
Create a container image from the sample application source
Test the multi-container application in a local Docker environment
Advance to the next tutorial to learn how to store container images in Azure Container Registry.
Push images to Azure Container Registry
Tutorial: Deploy and use Azure Container Registry
6/15/2022 • 5 minutes to read • Edit Online
Azure Container Registry (ACR) is a private registry for container images. A private container registry lets you
securely build and deploy your applications and custom code. In this tutorial, part two of seven, you deploy an
ACR instance and push a container image to it. You learn how to:
Create an Azure Container Registry (ACR) instance
Tag a container image for ACR
Upload the image to ACR
View images in your registry
In later tutorials, this ACR instance is integrated with a Kubernetes cluster in AKS, and an application is deployed
from the image.
This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Create a resource group with the az group create command. In the following example, a resource group named
myResourceGroup is created in the eastus region:
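A minimal example matching that description:
az group create --name myResourceGroup --location eastus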
Create an Azure Container Registry instance with the az acr create command and provide your own registry
name. The registry name must be unique within Azure, and contain 5-50 alphanumeric characters. In the rest of
this tutorial, <acrName> is used as a placeholder for the container registry name. Provide your own unique
registry name. The Basic SKU is a cost-optimized entry point for development purposes that provides a balance
of storage and throughput.
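A sketch of the command, with <acrName> as your unique registry name:
az acr create --resource-group myResourceGroup --name <acrName> --sku Basic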
To use the ACR instance, you must first log in. Use the az acr login command and provide the unique name
given to the container registry in the previous step.
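For example:
az acr login --name <acrName>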
docker images
The above command's output shows a list of your current local images:
To use the azure-vote-front container image with ACR, the image needs to be tagged with the login server
address of your registry. This tag is used for routing when pushing container images to an image registry.
To get the login server address, use the az acr list command and query for the loginServer as follows:
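For example (the same query is used later in this tutorial series):
az acr list --resource-group myResourceGroup --query "[].{acrLoginServer:loginServer}" --output table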
Now, tag your local azure-vote-front image with the acrLoginServer address of the container registry. To
indicate the image version, add :v1 to the end of the image name:
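A sketch, assuming the local image still carries its original mcr.microsoft.com name and <acrLoginServer> is the value returned above:
docker tag mcr.microsoft.com/azuredocs/azure-vote-front:v1 <acrLoginServer>/azure-vote-front:v1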
Run docker images again to verify the tags are applied:
docker images
An image is tagged with the ACR instance address and a version number.
REPOSITORY                                        TAG         IMAGE ID       CREATED          SIZE
mcr.microsoft.com/azuredocs/azure-vote-front      v1          84b41c268ad9   16 minutes ago   944MB
mycontainerregistry.azurecr.io/azure-vote-front   v1          84b41c268ad9   16 minutes ago   944MB
mcr.microsoft.com/oss/bitnami/redis               6.0.8       3a54a920bb6c   2 days ago       103MB
tiangolo/uwsgi-nginx-flask                        python3.6   a16ce562e863   6 weeks ago      944MB
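Push the tagged image to the registry; a sketch using the same <acrLoginServer> placeholder:
docker push <acrLoginServer>/azure-vote-front:v1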
To return a list of images that have been pushed to your ACR instance, use the az acr repository list command.
Provide your own <acrName> as follows:
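For example:
az acr repository list --name <acrName> --output table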
The following example output lists the azure-vote-front image as available in the registry:
Result
----------------
azure-vote-front
To see the tags for a specific image, use the az acr repository show-tags command as follows:
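For instance:
az acr repository show-tags --name <acrName> --repository azure-vote-front --output table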
The following example output shows the v1 image tagged in a previous step:
Result
--------
v1
You now have a container image that is stored in a private Azure Container Registry instance. This image is
deployed from ACR to a Kubernetes cluster in the next tutorial.
Next steps
In this tutorial, you created an Azure Container Registry and pushed an image for use in an AKS cluster. You
learned how to:
Create an Azure Container Registry (ACR) instance
Tag a container image for ACR
Upload the image to ACR
View images in your registry
Advance to the next tutorial to learn how to deploy a Kubernetes cluster in Azure.
Deploy Kubernetes cluster
Tutorial: Deploy an Azure Kubernetes Service (AKS)
cluster
6/15/2022 • 4 minutes to read • Edit Online
Kubernetes provides a distributed platform for containerized applications. With AKS, you can quickly create a
production ready Kubernetes cluster. In this tutorial, part three of seven, a Kubernetes cluster is deployed in AKS.
You learn how to:
Deploy a Kubernetes AKS cluster that can authenticate to an Azure container registry
Install the Kubernetes CLI (kubectl)
Configure kubectl to connect to your AKS cluster
In later tutorials, the Azure Vote application is deployed to the cluster, scaled, and updated.
This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Create an AKS cluster using az aks create. The following example creates a cluster named myAKSCluster in the
resource group named myResourceGroup. This resource group was created in the previous tutorial in the eastus
region. The following example does not specify a region, so the AKS cluster is also created in the eastus region.
For more information about resource limits and region availability for AKS, see Quotas, virtual machine size
restrictions, and region availability in Azure Kubernetes Service (AKS).
To allow an AKS cluster to interact with other Azure resources, a cluster identity is automatically created, since
you did not specify one. Here, this cluster identity is granted the right to pull images from the Azure Container
Registry (ACR) instance you created in the previous tutorial. To execute the command successfully, you're
required to have an Owner or Azure account administrator role on the Azure subscription.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 2 \
--generate-ssh-keys \
--attach-acr <acrName>
To avoid needing an Owner or Azure account administrator role, you can also manually configure a service
principal to pull images from ACR. For more information, see ACR authentication with service principals or
Authenticate from Kubernetes with a pull secret. Alternatively, you can use a managed identity instead of a
service principal for easier management.
After a few minutes, the deployment completes, and returns JSON-formatted information about the AKS
deployment.
NOTE
To ensure your cluster operates reliably, you should run at least two (2) nodes.
If you use the Azure Cloud Shell, kubectl is already installed. You can also install it locally using the az aks
install-cli command:
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. The
following example gets credentials for the AKS cluster named myAKSCluster in the myResourceGroup:
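For example:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster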
To verify the connection to your cluster, run the kubectl get nodes command to return a list of the cluster nodes:
Next steps
In this tutorial, a Kubernetes cluster was deployed in AKS, and you configured kubectl to connect to it. You
learned how to:
Deploy a Kubernetes AKS cluster that can authenticate to an Azure container registry
Install the Kubernetes CLI (kubectl)
Configure kubectl to connect to your AKS cluster
Advance to the next tutorial to learn how to deploy an application to the cluster.
Deploy application in Kubernetes
Tutorial: Run applications in Azure Kubernetes
Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
Kubernetes provides a distributed platform for containerized applications. You build and deploy your own
applications and services into a Kubernetes cluster, and let the cluster manage the availability and connectivity.
In this tutorial, part four of seven, a sample application is deployed into a Kubernetes cluster. You learn how to:
Update a Kubernetes manifest file
Run an application in Kubernetes
Test the application
In later tutorials, this application is scaled out and updated.
This quickstart assumes a basic understanding of Kubernetes concepts. For more information, see Kubernetes
core concepts for Azure Kubernetes Service (AKS).
TIP
AKS clusters can use GitOps for configuration management. This enables declarations of your cluster's state, which are
pushed to source control, to be applied to the cluster automatically. To learn how to use GitOps to deploy an application
with an AKS cluster, see the tutorial Use GitOps with Flux v2 and follow the prerequisites for Azure Kubernetes Service
clusters.
This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Get the ACR login server name using the az acr list command as follows:
az acr list --resource-group myResourceGroup --query "[].{acrLoginServer:loginServer}" --output table
The sample manifest file from the git repo cloned in the first tutorial uses the images from Microsoft Container
Registry (mcr.microsoft.com). Make sure that you're in the cloned azure-voting-app-redis directory, then open
the manifest file with a text editor, such as vi :
vi azure-vote-all-in-one-redis.yaml
Replace mcr.microsoft.com with your ACR login server name. The image name is found on line 60 of the
manifest file. The following example shows the default image name:
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
Provide your own ACR login server name so that your manifest file looks like the following example:
containers:
- name: azure-vote-front
image: <acrName>.azurecr.io/azure-vote-front:v1
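Deploy the application with the edited manifest; a sketch using the manifest file name from this tutorial:
kubectl apply -f azure-vote-all-in-one-redis.yaml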
The following example output shows the resources successfully created in the AKS cluster:
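To test the application, watch the service and wait for an external IP address; a sketch using the front-end service name:
kubectl get service azure-vote-front --watch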
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To see the application in action, open a web browser to the external IP address of your service:
If the application didn't load, it might be due to an authorization problem with your image registry. To view the
status of your containers, use the kubectl get pods command. If the container images can't be pulled, see
Authenticate with Azure Container Registry from Azure Kubernetes Service.
Next steps
In this tutorial, a sample Azure vote application was deployed to a Kubernetes cluster in AKS. You learned how
to:
Update a Kubernetes manifest file
Run an application in Kubernetes
Test the application
Advance to the next tutorial to learn how to scale a Kubernetes application and the underlying Kubernetes
infrastructure.
Scale Kubernetes application and infrastructure
Tutorial: Scale applications in Azure Kubernetes
Service (AKS)
6/15/2022 • 5 minutes to read • Edit Online
If you've followed the tutorials, you have a working Kubernetes cluster in AKS and you deployed the sample
Azure Voting app. In this tutorial, part five of seven, you scale out the pods in the app and try pod autoscaling.
You also learn how to scale the number of Azure VM nodes to change the cluster's capacity for hosting
workloads. You learn how to:
Scale the Kubernetes nodes
Manually scale Kubernetes pods that run your application
Configure autoscaling pods that run the app front-end
In later tutorials, the Azure Vote application is updated to a new version.
This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
To see the pods in your cluster, use the kubectl get pods command. The following example output shows one front-end pod and one back-end pod:
To manually change the number of pods in the azure-vote-front deployment, use the kubectl scale command.
The following example increases the number of front-end pods to 5:
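For example:
kubectl scale --replicas=5 deployment/azure-vote-front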
Run kubectl get pods again to verify that AKS successfully creates the additional pods. After a minute or so, the
pods are available in your cluster:
kubectl get pods
Autoscale pods
Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on
CPU utilization or other select metrics. The Metrics Server is used to provide resource utilization to Kubernetes,
and is automatically deployed in AKS clusters versions 1.10 and higher. To see the version of your AKS cluster,
use the az aks show command, as shown in the following example:
az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion --output table
NOTE
If your AKS cluster version is lower than 1.10, the Metrics Server is not automatically installed. Metrics Server installation manifests
are available as a components.yaml asset on Metrics Server releases, which means you can install them via a URL. To learn
more about these YAML definitions, see the Deployment section of the readme.
Example installation:
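A sketch of installing from the latest release asset (pin a specific release in production; the URL pattern is the one documented by the Metrics Server project):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml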
To use the autoscaler, all containers in your pods and your pods must have CPU requests and limits defined. In
the azure-vote-front deployment, the front-end container already requests 0.25 CPU, with a limit of 0.5 CPU.
These resource requests and limits are defined for each container as shown in the following example snippet:
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
ports:
- containerPort: 80
resources:
requests:
cpu: 250m
limits:
cpu: 500m
The following example uses the kubectl autoscale command to autoscale the number of pods in the azure-vote-
front deployment. If average CPU utilization across all pods exceeds 50% of their requested usage, the
autoscaler increases the pods up to a maximum of 10 instances. A minimum of 3 instances is then defined for
the deployment:
kubectl autoscale deployment azure-vote-front --cpu-percent=50 --min=3 --max=10
Alternatively, you can create a manifest file to define the autoscaler behavior and resource limits. The following
is an example of a manifest file named azure-vote-hpa.yaml .
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: azure-vote-back-hpa
spec:
maxReplicas: 10 # define max replica count
minReplicas: 3 # define min replica count
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: azure-vote-back
targetCPUUtilizationPercentage: 50 # target CPU utilization
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: azure-vote-front-hpa
spec:
maxReplicas: 10 # define max replica count
minReplicas: 3 # define min replica count
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: azure-vote-front
targetCPUUtilizationPercentage: 50 # target CPU utilization
Use kubectl apply to apply the autoscaler defined in the azure-vote-hpa.yaml manifest file.
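For example:
kubectl apply -f azure-vote-hpa.yaml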
To see the status of the autoscaler, use the kubectl get hpa command as follows:
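For instance:
kubectl get hpa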
After a few minutes, with minimal load on the Azure Vote app, the number of pod replicas decreases
automatically to three. You can use kubectl get pods again to see the unneeded pods being removed.
NOTE
For additional examples on using the horizontal pod autoscaler, see HorizontalPodAutoscaler Walkthrough.
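You can also manually scale the number of nodes in the cluster with the az aks scale command; a minimal sketch, assuming the cluster and resource group names used in these tutorials and a target of three nodes:
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3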
When the cluster has successfully scaled, the output is similar to the following example:
"agentPoolProfiles": [
{
"count": 3,
"dnsPrefix": null,
"fqdn": null,
"name": "myAKSCluster",
"osDiskSizeGb": null,
"osType": "Linux",
"ports": null,
"storageProfile": "ManagedDisks",
"vmSize": "Standard_D2_v2",
"vnetSubnetId": null
}
Next steps
In this tutorial, you used different scaling features in your Kubernetes cluster. You learned how to:
Manually scale Kubernetes pods that run your application
Configure autoscaling pods that run the app front-end
Manually scale the Kubernetes nodes
Advance to the next tutorial to learn how to update an application in Kubernetes.
Update an application in Kubernetes
Tutorial: Update an application in Azure Kubernetes
Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
After an application has been deployed in Kubernetes, it can be updated by specifying a new container image or
image version. An update is staged so that only a portion of the deployment is updated at the same time. This
staged update enables the application to keep running during the update. It also provides a rollback mechanism
if a deployment failure occurs.
In this tutorial, part six of seven, the sample Azure Vote app is updated. You learn how to:
Update the front-end application code
Create an updated container image
Push the container image to Azure Container Registry
Deploy the updated container image
This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Update an application
Let's make a change to the sample application, then update the version already deployed to your AKS cluster.
Make sure that you're in the cloned azure-voting-app-redis directory. The sample application source code can
then be found inside the azure-vote directory. Open the config_file.cfg file with an editor, such as vi :
vi azure-vote/azure-vote/config_file.cfg
Change the values for VOTE1VALUE and VOTE2VALUE to different values, such as colors. The following example
shows the updated values:
# UI Configurations
TITLE = 'Azure Voting App'
VOTE1VALUE = 'Blue'
VOTE2VALUE = 'Purple'
SHOWHOST = 'false'
Save and close the file. In vi , use :wq .
To re-create the front-end image and test the updated application, run docker-compose:
docker-compose up --build -d
The updated values provided in the config_file.cfg file are displayed in your running application.
To correctly use the updated image, tag the azure-vote-front image with the login server name of your ACR
registry. Get the login server name with the az acr list command:
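For example:
az acr list --resource-group myResourceGroup --query "[].{acrLoginServer:loginServer}" --output table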
Use docker tag to tag the image. Replace <acrLoginServer> with your ACR login server name or public registry
hostname, and update the image version to :v2 as follows:
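A sketch, assuming the rebuilt local image still carries its original mcr.microsoft.com name:
docker tag mcr.microsoft.com/azuredocs/azure-vote-front:v1 <acrLoginServer>/azure-vote-front:v2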
Now use docker push to upload the image to your registry. Replace <acrLoginServer> with your ACR login
server name.
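For example:
docker push <acrLoginServer>/azure-vote-front:v2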
NOTE
If you experience issues pushing to your ACR registry, make sure that you are still logged in. Run the az acr login
command using the name of your Azure Container Registry that you created in the Create an Azure Container Registry
step. For example, az acr login --name <azure container registry name> .
If you don't have multiple front-end pods, scale the azure-vote-front deployment as follows:
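A sketch, using three replicas:
kubectl scale --replicas=3 deployment/azure-vote-front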
To update the application, use the kubectl set command. Update <acrLoginServer> with the login server or host
name of your container registry, and specify the v2 application version:
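A sketch of the update, assuming the v2 tag pushed above:
kubectl set image deployment azure-vote-front azure-vote-front=<acrLoginServer>/azure-vote-front:v2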
To monitor the deployment, use the kubectl get pod command. As the updated application is deployed, your
pods are terminated and re-created with the new container image.
The following example output shows pods terminating and new instances running as the deployment
progresses:
Next steps
In this tutorial, you updated an application and rolled out this update to your AKS cluster. You learned how to:
Update the front-end application code
Create an updated container image
Push the container image to Azure Container Registry
Deploy the updated container image
Advance to the next tutorial to learn how to upgrade an AKS cluster to a new version of Kubernetes.
Upgrade Kubernetes
Tutorial: Upgrade Kubernetes in Azure Kubernetes
Service (AKS)
6/15/2022 • 6 minutes to read • Edit Online
As part of the application and cluster lifecycle, you may wish to upgrade to the latest available version of
Kubernetes and use new features. An Azure Kubernetes Service (AKS) cluster can be upgraded using the Azure
CLI.
In this tutorial, part seven of seven, a Kubernetes cluster is upgraded. You learn how to:
Identify current and available Kubernetes versions
Upgrade the Kubernetes nodes
Validate a successful upgrade
This tutorial requires that you are running the Azure CLI version 2.0.53 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Before you upgrade a cluster, use the az aks get-upgrades command to check which Kubernetes releases are
available for upgrade:
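For example:
az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster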
In the following example, the current version is 1.18.10, and the available versions are shown under upgrades.
{
"agentPoolProfiles": null,
"controlPlaneProfile": {
"kubernetesVersion": "1.18.10",
...
"upgrades": [
{
"isPreview": null,
"kubernetesVersion": "1.19.1"
},
{
"isPreview": null,
"kubernetesVersion": "1.19.3"
}
]
},
...
}
Upgrade a cluster
To minimize disruption to running applications, AKS nodes are carefully cordoned and drained. In this process,
the following steps are performed:
1. The Kubernetes scheduler prevents additional pods being scheduled on a node that is to be upgraded.
2. Running pods on the node are scheduled on other nodes in the cluster.
3. A node is created that runs the latest Kubernetes components.
4. When the new node is ready and joined to the cluster, the Kubernetes scheduler begins to run pods on it.
5. The old node is deleted, and the next node in the cluster begins the cordon and drain process.
NOTE
If no patch is specified, the cluster will automatically be upgraded to the specified minor version's latest GA patch. For
example, setting --kubernetes-version to 1.21 will result in the cluster upgrading to 1.21.9 .
When upgrading by alias minor version, only a higher minor version is supported. For example, upgrading from 1.20.x
to 1.20 will not trigger an upgrade to the latest GA 1.20 patch, but upgrading to 1.21 will trigger an upgrade to
the latest GA 1.21 patch.
az aks upgrade \
--resource-group myResourceGroup \
--name myAKSCluster \
--kubernetes-version KUBERNETES_VERSION
NOTE
You can only upgrade one minor version at a time. For example, you can upgrade from 1.14.x to 1.15.x, but cannot
upgrade from 1.14.x to 1.16.x directly. To upgrade from 1.14.x to 1.16.x, first upgrade from 1.14.x to 1.15.x, then perform
another upgrade from 1.15.x to 1.16.x.
The following condensed example output shows the result of upgrading to 1.19.1. Notice the kubernetesVersion
now reports 1.19.1:
{
"agentPoolProfiles": [
{
"count": 3,
"maxPods": 110,
"name": "nodepool1",
"osType": "Linux",
"storageProfile": "ManagedDisks",
"vmSize": "Standard_DS1_v2",
}
],
"dnsPrefix": "myAKSClust-myResourceGroup-19da35",
"enableRbac": false,
"fqdn": "myaksclust-myresourcegroup-19da35-bd54a4be.hcp.eastus.azmk8s.io",
"id": "/subscriptions/<Subscription
ID>/resourcegroups/myResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster",
"kubernetesVersion": "1.19.1",
"location": "eastus",
"name": "myAKSCluster",
"type": "Microsoft.ContainerService/ManagedClusters"
}
The following condensed example output shows some of the Kubernetes events surfaced during an upgrade; you can view them with the kubectl get events command.
...
default 2m1s Normal Drain node/aks-nodepool1-96663640-vmss000001 Draining node: [aks-nodepool1-96663640-
vmss000001]
...
default 9m22s Normal Surge node/aks-nodepool1-96663640-vmss000002 Created a surge node [aks-nodepool1-
96663640-vmss000002 nodepool1] for agentpool %!s(MISSING)
...
Validate an upgrade
Confirm that the upgrade was successful using the az aks show command as follows:
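For example:
az aks show --resource-group myResourceGroup --name myAKSCluster --output table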
As this tutorial is the last part of the series, you may want to delete the AKS cluster. As the Kubernetes nodes run
on Azure virtual machines (VMs), they continue to incur charges even if you don't use the cluster. Use the az
group delete command to remove the resource group, container service, and all related resources.
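For instance:
az group delete --name myResourceGroup --yes --no-wait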
NOTE
When you delete the cluster, the Azure Active Directory service principal used by the AKS cluster is not removed. For
steps on how to remove the service principal, see AKS service principal considerations and deletion. If you used a
managed identity, the identity is managed by the platform and does not require you to provision or rotate any secrets.
Next steps
In this tutorial, you upgraded Kubernetes in an AKS cluster. You learned how to:
Identify current and available Kubernetes versions
Upgrade the Kubernetes nodes
Validate a successful upgrade
For more information on AKS, see AKS overview. For guidance on a creating full solutions with AKS, see AKS
solution guidance.
Kubernetes core concepts for Azure Kubernetes
Service (AKS)
6/15/2022 • 14 minutes to read • Edit Online
Application development continues to move toward a container-based approach, increasing our need to
orchestrate and manage resources. As the leading platform, Kubernetes provides reliable scheduling of fault-
tolerant application workloads. Azure Kubernetes Service (AKS), a managed Kubernetes offering, further
simplifies container-based application deployment and management.
This article introduces:
Core Kubernetes infrastructure components:
control plane
nodes
node pools
Workload resources:
pods
deployments
sets
How to group resources into namespaces.
What is Kubernetes?
Kubernetes is a rapidly evolving platform that manages container-based applications and their associated
networking and storage components. Kubernetes focuses on the application workloads, not the underlying
infrastructure components. Kubernetes provides a declarative approach to deployments, backed by a robust set
of APIs for management operations.
You can build and run modern, portable, microservices-based applications, using Kubernetes to orchestrate and
manage the availability of the application components. Kubernetes supports both stateless and stateful
applications as teams progress through the adoption of microservices-based applications.
As an open platform, Kubernetes allows you to build your applications with your preferred programming
language, OS, libraries, or messaging bus. Existing continuous integration and continuous delivery (CI/CD) tools
can integrate with Kubernetes to schedule and deploy releases.
AKS provides a managed Kubernetes service that reduces the complexity of deployment and core management
tasks, like upgrade coordination. The Azure platform manages the AKS control plane, and you only pay for the
AKS nodes that run your applications.
COMPONENT | DESCRIPTION
kube-apiserver | The API server is how the underlying Kubernetes APIs are exposed. This component provides the interaction for management tools, such as kubectl or the Kubernetes dashboard.
AKS provides a single-tenant control plane, with a dedicated API server, scheduler, etc. You define the number
and size of the nodes, and the Azure platform configures the secure communication between the control plane
and nodes. Interaction with the control plane occurs through Kubernetes APIs, such as kubectl or the
Kubernetes dashboard.
While you don't need to configure components (like a highly available etcd store) with this managed control
plane, you can't access the control plane directly. Kubernetes control plane and node upgrades are orchestrated
through the Azure CLI or Azure portal. To troubleshoot possible issues, you can review the control plane logs
through Azure Monitor logs.
To configure or directly access a control plane, deploy a self-managed Kubernetes cluster using Cluster API
Provider Azure.
For associated best practices, see Best practices for cluster security and upgrades in AKS.
The Azure VM size for your nodes defines CPUs, memory, size, and the storage type available (such as high-
performance SSD or regular HDD). Plan the node size around whether your applications may require large
amounts of CPU and memory or high-performance storage. Scale out the number of nodes in your AKS cluster
to meet demand.
In AKS, the VM image for your cluster's nodes is based on Ubuntu Linux or Windows Server 2019. When you
create an AKS cluster or scale out the number of nodes, the Azure platform automatically creates and configures
the requested number of VMs. Agent nodes are billed as standard VMs, so any VM size discounts (including
Azure reservations) are automatically applied.
If you need advanced configuration and control on your Kubernetes node container runtime and OS, you can
deploy a self-managed cluster using Cluster API Provider Azure.
Resource reservations
AKS uses node resources to help the node function as part of your cluster. This usage can create a discrepancy
between your node's total resources and the allocatable resources in AKS. Remember this information when
setting requests and limits for user deployed pods.
To find a node's allocatable resources, run:
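A sketch, where <node-name> is one of the names returned by kubectl get nodes; the Allocatable section of the output lists the resources available to pods:
kubectl describe node <node-name>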
To maintain node performance and functionality, AKS reserves resources on each node. As a node grows larger
in resources, the resource reservation grows due to a higher need for management of user-deployed pods.
NOTE
Using AKS add-ons such as Container Insights (OMS) will consume additional node resources.
CPU
The CPU reserved by AKS depends on the number of CPU cores on the host (1, 2, 4, 8, 16, 32, or 64 cores); larger nodes reserve more CPU for Kubernetes components.
Memory
Memory utilized by AKS includes the sum of two values.
1. kubelet daemon
The kubelet daemon is installed on all Kubernetes agent nodes to manage container creation and
termination.
By default on AKS, the kubelet daemon has the memory.available<750Mi eviction rule, ensuring a
node always has at least 750 Mi allocatable. When a host drops below that available
memory threshold, the kubelet terminates one of the running pods to free up
memory on the host machine.
2. A regressive rate of memor y reser vations for the kubelet daemon to properly function
(kube-reserved).
25% of the first 4 GB of memory
20% of the next 4 GB of memory (up to 8 GB)
10% of the next 8 GB of memory (up to 16 GB)
6% of the next 112 GB of memory (up to 128 GB)
2% of any memory above 128 GB
Memory and CPU allocation rules:
Keep agent nodes healthy, including some hosting system pods critical to cluster health.
Cause the node to report less allocatable memory and CPU than it would if it were not part of a Kubernetes
cluster.
The above resource reservations can't be changed.
For example, if a node offers 7 GB, it will report 34% of memory not allocatable including the 750Mi hard
eviction threshold.
0.75 + (0.25*4) + (0.20*3) = 0.75GB + 1GB + 0.6GB = 2.35GB / 7GB = 33.57% reserved
In addition to reservations for Kubernetes itself, the underlying node OS also reserves an amount of CPU and
memory resources to maintain OS functions.
For associated best practices, see Best practices for basic scheduler features in AKS.
Node pools
Nodes of the same configuration are grouped together into node pools. A Kubernetes cluster contains at least
one node pool. The initial number of nodes and size are defined when you create an AKS cluster, which creates a
default node pool. This default node pool in AKS contains the underlying VMs that run your agent nodes.
NOTE
To ensure your cluster operates reliably, you should run at least two (2) nodes in the default node pool.
You scale or upgrade an AKS cluster against the default node pool. You can choose to scale or upgrade a specific
node pool. For upgrade operations, running containers are scheduled on other nodes in the node pool until all
the nodes are successfully upgraded.
For more information about how to use multiple node pools in AKS, see Create and manage multiple node
pools for a cluster in AKS.
Node selectors
In an AKS cluster with multiple node pools, you may need to tell the Kubernetes Scheduler which node pool to
use for a given resource. For example, ingress controllers shouldn't run on Windows Server nodes.
Node selectors let you define various parameters, like node OS, to control where a pod should be scheduled.
The following basic example schedules an NGINX instance on a Linux node using the node selector
"kubernetes.io/os": linux:
kind: Pod
apiVersion: v1
metadata:
name: nginx
spec:
containers:
- name: myfrontend
image: mcr.microsoft.com/oss/nginx/nginx:1.15.12-alpine
nodeSelector:
"kubernetes.io/os": linux
For more information on how to control where pods are scheduled, see Best practices for advanced scheduler
features in AKS.
Pods
Kubernetes uses pods to run an instance of your application. A pod represents a single instance of your
application.
Pods typically have a 1:1 mapping with a container. In advanced scenarios, a pod may contain multiple
containers. Multi-container pods are scheduled together on the same node, and allow containers to share
related resources.
When you create a pod, you can define resource requests to request a certain amount of CPU or memory
resources. The Kubernetes Scheduler tries to meet the request by scheduling the pods to run on a node with
available resources. You can also specify maximum resource limits to prevent a pod from consuming too much
compute resource from the underlying node. Best practice is to include resource limits for all pods to help the
Kubernetes Scheduler identify necessary, permitted resources.
For more information, see Kubernetes pods and Kubernetes pod lifecycle.
A pod is a logical resource, but application workloads run on the containers. Pods are typically ephemeral,
disposable resources. Individually scheduled pods miss some of the high availability and redundancy
Kubernetes features. Instead, pods are deployed and managed by Kubernetes Controllers, such as the
Deployment Controller.
More complex applications can be created by including services (such as load balancers) within the YAML
manifest.
For more information, see Kubernetes deployments.
Package management with Helm
Helm is commonly used to manage applications in Kubernetes. You can deploy resources by building and using
existing public Helm charts that contain a packaged version of application code and Kubernetes YAML manifests.
You can store Helm charts either locally or in a remote repository, such as an Azure Container Registry Helm
chart repo.
To use Helm, install the Helm client on your computer, or use the Helm client in the Azure Cloud Shell. Search for
or create Helm charts, and then install them to your Kubernetes cluster. For more information, see Install existing
applications with Helm in AKS.
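A minimal sketch of that workflow, using the public Bitnami repository referenced elsewhere in this documentation and an illustrative release name:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm search repo nginx
helm install my-nginx bitnami/nginx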
NOTE
If using the Virtual Nodes add-on, DaemonSets will not create pods on the virtual node.
Namespaces
Kubernetes resources, such as pods and deployments, are logically grouped into a namespace to divide an AKS
cluster and restrict create, view, or manage access to resources. For example, you can create namespaces to
separate business groups. Users can only interact with resources within their assigned namespaces.
When you create an AKS cluster, the following namespaces are available: default, where pods and deployments are created when none is provided; kube-system, where core resources exist, such as network features like DNS and proxy, or the Kubernetes dashboard; and kube-public, typically not used, but which can be used for resources to be visible across the whole cluster.
Next steps
This article covers some of the core Kubernetes components and how they apply to AKS clusters. For more
information on core Kubernetes and AKS concepts, see the following articles:
Kubernetes / AKS access and identity
Kubernetes / AKS security
Kubernetes / AKS virtual networks
Kubernetes / AKS storage
Kubernetes / AKS scale
Security concepts for applications and clusters in
Azure Kubernetes Service (AKS)
6/15/2022 • 8 minutes to read • Edit Online
Container security protects the entire end-to-end pipeline from build to the application workloads running in
Azure Kubernetes Service (AKS).
The Secure Supply Chain includes the build environment and registry.
Kubernetes includes security components, such as pod security standards and Secrets. Meanwhile, Azure
includes components like Active Directory, Microsoft Defender for Containers, Azure Policy, Azure Key Vault,
network security groups and orchestrated cluster upgrades. AKS combines these security components to:
Provide a complete Authentication and Authorization story.
Leverage AKS Built-in Azure Policy to secure your applications.
End-to-End insight from build through your application with Microsoft Defender for Containers.
Keep your AKS cluster running the latest OS security updates and Kubernetes releases.
Provide secure pod traffic and access to sensitive credentials.
This article introduces the core concepts that secure your applications in AKS:
Security concepts for applications and clusters in Azure Kubernetes Service (AKS)
Build security
Registry security
Cluster security
Node security
Compute isolation
Cluster upgrades
Cordon and drain
Network security
Azure network security groups
Application Security
Kubernetes Secrets
Next steps
Build Security
As the entry point for the Supply Chain, it is important to conduct static analysis of image builds before they are
promoted down the pipeline. This includes vulnerability and compliance assessment. It is not about failing a
build because it has a vulnerability, as that will break development. It is about looking at the "Vendor Status" to
segment based on vulnerabilities that are actionable by the development teams. Also leverage "Grace Periods"
to allow developers time to remediate identified issues.
Registry Security
Assessing the vulnerability state of the image in the Registry will detect drift and will also catch images that
didn't come from your build environment. Use Notary V2 to attach signatures to your images to ensure
deployments are coming from a trusted location.
Cluster security
In AKS, the Kubernetes master components are part of the managed service provided, managed, and
maintained by Microsoft. Each AKS cluster has its own single-tenanted, dedicated Kubernetes master to provide
the API Server, Scheduler, etc.
By default, the Kubernetes API server uses a public IP address and a fully qualified domain name (FQDN). You
can limit access to the API server endpoint using authorized IP ranges. You can also create a fully private cluster
to limit API server access to your virtual network.
You can control access to the API server using Kubernetes role-based access control (Kubernetes RBAC) and
Azure RBAC. For more information, see Azure AD integration with AKS.
Node security
AKS nodes are Azure virtual machines (VMs) that you manage and maintain.
Linux nodes run an optimized Ubuntu distribution using the containerd or Docker container runtime.
Windows Server nodes run an optimized Windows Server 2019 release using the containerd or Docker
container runtime.
When an AKS cluster is created or scaled up, the nodes are automatically deployed with the latest OS security
updates and configurations.
NOTE
AKS clusters using:
Kubernetes version 1.19 and greater for Linux node pools use containerd as the container runtime. Using
containerd with Windows Server 2019 node pools is currently in preview. For more details, see Add a Windows
Server node pool with containerd.
Kubernetes prior to v1.19 for Linux node pools use Docker as the container runtime. For Windows Server 2019 node
pools, Docker is the default container runtime.
Cluster upgrades
Azure provides upgrade orchestration tools to upgrade an AKS cluster and its components, maintain security
and compliance, and access the latest features. This upgrade orchestration includes both the Kubernetes master
and agent components.
To start the upgrade process, specify one of the listed available Kubernetes versions. Azure then safely cordons
and drains each AKS node and performs the upgrade.
Cordon and drain
During the upgrade process, AKS nodes are individually cordoned from the cluster to prevent new pods from
being scheduled on them. The nodes are then drained and upgraded as follows:
1. A new node is deployed into the node pool.
This node runs the latest OS image and patches.
2. One of the existing nodes is identified for upgrade.
3. Pods on the identified node are gracefully terminated and scheduled on the other nodes in the node pool.
4. The emptied node is deleted from the AKS cluster.
5. Steps 1-4 are repeated until all nodes are successfully replaced as part of the upgrade process.
For more information, see Upgrade an AKS cluster.
Network security
For connectivity and security with on-premises networks, you can deploy your AKS cluster into existing Azure
virtual network subnets. These virtual networks connect back to your on-premises network using Azure Site-to-
Site VPN or Express Route. Define Kubernetes ingress controllers with private, internal IP addresses to limit
services access to the internal network connection.
Azure network security groups
To filter virtual network traffic flow, Azure uses network security group rules. These rules define the source and
destination IP ranges, ports, and protocols allowed or denied access to resources. Default rules are created to
allow TLS traffic to the Kubernetes API server. You create services with load balancers, port mappings, or ingress
routes. AKS automatically modifies the network security group for traffic flow.
If you provide your own subnet for your AKS cluster (whether using Azure CNI or Kubenet), do not modify the
NIC-level network security group managed by AKS. Instead, create more subnet-level network security groups
to modify the flow of traffic. Make sure they don't interfere with necessary traffic managing the cluster, such as
load balancer access, communication with the control plane, and egress.
Kubernetes network policy
To limit network traffic between pods in your cluster, AKS offers support for Kubernetes network policies. With
network policies, you can allow or deny specific network paths within the cluster based on namespaces and
label selectors.
Application Security
To protect pods running on AKS, use Microsoft Defender for Containers to detect and restrict cyber attacks
against the applications running in your pods. Run continual scanning to detect drift in the vulnerability state
of your application, and implement a "blue/green/canary" process to patch and replace vulnerable images.
Kubernetes Secrets
With a Kubernetes Secret, you inject sensitive data into pods, such as access credentials or keys.
1. Create a Secret using the Kubernetes API.
2. Define your pod or deployment and request a specific Secret.
Secrets are only provided to nodes with a scheduled pod that requires them.
The Secret is stored in tmpfs, not written to disk.
3. When you delete the last pod on a node requiring a Secret, the Secret is deleted from the node's tmpfs.
Secrets are stored within a given namespace and can only be accessed by pods within the same
namespace.
Using Secrets reduces the sensitive information defined in the pod or service YAML manifest. Instead, you
request the Secret stored in Kubernetes API Server as part of your YAML manifest. This approach only provides
the specific pod access to the Secret.
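A sketch of creating a Secret from literal values (the secret and key names here are illustrative):
kubectl create secret generic db-creds --from-literal=username=admin --from-literal=password='<your-password>'
The Secret can then be referenced from a pod or deployment manifest, for example through a secretKeyRef environment variable or a mounted volume.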
NOTE
The raw secret manifest files contain the secret data in base64 format (see the official documentation for more details).
Treat these files as sensitive information, and never commit them to source control.
Kubernetes secrets are stored in etcd, a distributed key-value store. Etcd store is fully managed by AKS and data
is encrypted at rest within the Azure platform.
Next steps
To get started with securing your AKS clusters, see Upgrade an AKS cluster.
For associated best practices, see Best practices for cluster security and upgrades in AKS and Best practices for
pod security in AKS.
For more information on core Kubernetes and AKS concepts, see:
Kubernetes / AKS clusters and workloads
Kubernetes / AKS identity
Kubernetes / AKS virtual networks
Kubernetes / AKS storage
Kubernetes / AKS scale
Azure Policy Regulatory Compliance controls for
Azure Kubernetes Service (AKS)
6/15/2022 • 18 minutes to read • Edit Online
Regulatory Compliance in Azure Policy provides initiative definitions (built-ins) created and managed by
Microsoft, for the compliance domains and security controls related to different compliance standards. This page
lists the Azure Kubernetes Service (AKS) compliance domains and security controls.
You can assign the built-ins for a security control individually to help make your Azure resources compliant
with the specific standard.
The title of each built-in policy definition links to the policy definition in the Azure portal. Use the link in the
Policy Version column to view the source on the Azure Policy GitHub repo.
IMPORTANT
Each control below is associated with one or more Azure Policy definitions. These policies may help you assess compliance
with the control; however, there often is not a one-to-one or complete match between a control and one or more policies.
As such, Compliant in Azure Policy refers only to the policies themselves; this doesn't ensure you're fully compliant with
all requirements of a control. In addition, the compliance standard includes controls that aren't addressed by any Azure
Policy definitions at this time. Therefore, compliance in Azure Policy is only a partial view of your overall compliance status.
The associations between controls and Azure Policy Regulatory Compliance definitions for these compliance standards
may change over time.
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
SWIFT Environment Protection | SWIFT CSCF v2021 1.1 | SWIFT Environment Protection | Authorized IP ranges should be defined on Kubernetes Services | 2.0.1
SWIFT Environment Protection | SWIFT CSCF v2021 1.4 | Restriction of Internet Access | Authorized IP ranges should be defined on Kubernetes Services | 2.0.1
Reduce Attack Surface and Vulnerabilities | SWIFT CSCF v2021 2.1 | Internal Data Flow Security | Kubernetes clusters should be accessible only over HTTPS | 6.1.0
Detect Anomalous Activity to Systems or Transaction Records | SWIFT CSCF v2021 6.2 | Software Integrity | Both operating systems and data disks in Azure Kubernetes Service clusters should be encrypted by customer-managed keys | 1.0.0
Detect Anomalous Activity to Systems or Transaction Records | SWIFT CSCF v2021 6.5A | Intrusion Detection | Both operating systems and data disks in Azure Kubernetes Service clusters should be encrypted by customer-managed keys | 1.0.0
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
Posture and Vulnerability Management | PV-2 | Audit and enforce secure configurations | Azure Policy Add-on for Kubernetes service (AKS) should be installed and enabled on your clusters | 1.0.2
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
Other Security Considerations | CIS Microsoft Azure Foundations Benchmark recommendation 8.5 | Enable role-based access control (RBAC) within Azure Kubernetes Services | Role-Based Access Control (RBAC) should be used on Kubernetes Services | 1.0.2
CMMC Level 3
To review how the available Azure Policy built-ins for all Azure services map to this compliance standard, see
Azure Policy Regulatory Compliance - CMMC Level 3. For more information about this compliance standard, see
Cybersecurity Maturity Model Certification (CMMC).
FedRAMP High
To review how the available Azure Policy built-ins for all Azure services map to this compliance standard, see
Azure Policy Regulatory Compliance - FedRAMP High. For more information about this compliance standard,
see FedRAMP High.
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
System and Communications Protection | SC-28 (1) | Cryptographic Protection | Temp disks and cache for agent node pools in Azure Kubernetes Service clusters should be encrypted at host | 1.0.0
FedRAMP Moderate
To review how the available Azure Policy built-ins for all Azure services map to this compliance standard, see
Azure Policy Regulatory Compliance - FedRAMP Moderate. For more information about this compliance
standard, see FedRAMP Moderate.
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
System and Communications Protection | SC-28 (1) | Cryptographic Protection | Temp disks and cache for agent node pools in Azure Kubernetes Service clusters should be encrypted at host | 1.0.0
RMIT Malaysia
To review how the available Azure Policy built-ins for all Azure services map to this compliance standard, see
Azure Policy Regulatory Compliance - RMIT Malaysia. For more information about this compliance standard, see
RMIT Malaysia.
DOMAIN | CONTROL ID | CONTROL TITLE | POLICY (AZURE PORTAL) | POLICY VERSION (GITHUB)
Patch and End-of-Life System Management | RMiT 10.65 | Patch and End-of-Life System Management - 10.65 | Kubernetes Services should be upgraded to a non-vulnerable Kubernetes version | 1.0.2
Control Measures on Cybersecurity | RMiT Appendix 5.5 | Control Measures on Cybersecurity - Appendix 5.5 | Kubernetes cluster services should only use allowed external IPs | 3.1.0
Control Measures on Cybersecurity | RMiT Appendix 5.6 | Control Measures on Cybersecurity - Appendix 5.6 | Kubernetes cluster pods should only use approved host network and port range | 4.2.0
Control Measures on Cybersecurity | RMiT Appendix 5.6 | Control Measures on Cybersecurity - Appendix 5.6 | Kubernetes cluster services should listen only on allowed ports | 6.2.0
Control Measures on Cybersecurity | RMiT Appendix 5.6 | Control Measures on Cybersecurity - Appendix 5.6 | Kubernetes clusters should be accessible only over HTTPS | 6.1.0
Next steps
Learn more about Azure Policy Regulatory Compliance.
See the built-ins on the Azure Policy GitHub repo.
Center for Internet Security (CIS) Kubernetes
benchmark
6/15/2022 • 11 minutes to read • Edit Online
As a secure service, Azure Kubernetes Service (AKS) complies with SOC, ISO, PCI DSS, and HIPAA standards. This
article covers the security hardening applied to AKS based on the CIS Kubernetes benchmark. For more
information about AKS security, see Security concepts for applications and clusters in Azure Kubernetes Service
(AKS). For more information on the CIS benchmark, see Center for Internet Security (CIS) Benchmarks.
CIS ID | RECOMMENDATION DESCRIPTION | SCORING TYPE | LEVEL | STATUS
1 | Control Plane Components
1.4 | Scheduler
2 | etcd
3 | Control Plane Configuration
3.2 | Logging
4 | Worker Nodes
4.2 | Kubelet
5 | Policies
NOTE
In addition to the Kubernetes CIS benchmark, there is an AKS CIS benchmark available as well.
Additional notes
The security hardened OS is built and maintained specifically for AKS and is not supported outside of the
AKS platform.
To further reduce the attack surface area, some unnecessary kernel module drivers have been disabled in the
OS.
Next steps
For more information about AKS security, see the following articles:
Azure Kubernetes Service (AKS)
AKS security considerations
AKS best practices
Azure Kubernetes Service (AKS) Ubuntu image
alignment with Center for Internet Security (CIS)
benchmark
6/15/2022 • 11 minutes to read • Edit Online
As a secure service, Azure Kubernetes Service (AKS) complies with SOC, ISO, PCI DSS, and HIPAA standards. This
article covers the security OS configuration applied to the Ubuntu image used by AKS. This security configuration
is based on the Azure Linux security baseline, which aligns with the CIS benchmark. For more information about
AKS security, see Security concepts for applications and clusters in Azure Kubernetes Service (AKS). For more
information on the CIS benchmark, see Center for Internet Security (CIS) Benchmarks. For more information on
the Azure security baselines for Linux, see Linux security baseline.
NOTE
Unrelated to the CIS benchmarks, Azure applies daily patches, including security patches, to AKS virtual machine hosts.
The goal of the secure configuration built into the host OS is to reduce the surface area of attack and optimize
for the deployment of containers in a secure manner.
The following are the results from the CIS Ubuntu 18.04 LTS Benchmark v2.1.0 recommendations.
Recommendations that aren't applied can have one of the following reasons:
Potential Operation Impact - Recommendation was not applied because it would have a negative effect on the service.
Covered Elsewhere - Recommendation is covered by another control in Azure cloud compute.
The following CIS rules are implemented:
CIS PARAGRAPH NUMBER | RECOMMENDATION DESCRIPTION | STATUS | REASON
1 | Initial Setup
1.1.22 | Ensure sticky bit is set on all world-writable directories | Fail | Potential Operation Impact
2 | Services
2.1.1.2 | Ensure systemd-timesyncd is configured | Not Applicable | AKS uses ntpd for timesync
3 | Network Configuration
3.5.1 | Configure UncomplicatedFirewall
6 | System Maintenance
Next steps
For more information about AKS security, see the following articles:
Azure Kubernetes Service (AKS)
AKS security considerations
AKS best practices
Access and identity options for Azure Kubernetes
Service (AKS)
6/15/2022 • 15 minutes to read • Edit Online
You can authenticate, authorize, secure, and control access to Kubernetes clusters in a variety of ways.
Using Kubernetes role-based access control (Kubernetes RBAC), you can grant users, groups, and service
accounts access to only the resources they need.
With Azure Kubernetes Service (AKS), you can further enhance the security and permissions structure via
Azure Active Directory and Azure RBAC.
Kubernetes RBAC and AKS help you secure your cluster access and provide only the minimum required
permissions to developers and operators.
This article introduces the core concepts that help you authenticate and assign permissions in AKS.
Kubernetes RBAC
Kubernetes RBAC provides granular filtering of user actions. With this control mechanism:
You assign users or user groups permission to create and modify resources or view logs from running
application workloads.
You can scope permissions to a single namespace or across the entire AKS cluster.
You create roles to define permissions, and then assign those roles to users with role bindings.
For more information, see Using Kubernetes RBAC authorization.
Roles and ClusterRoles
Roles
Before assigning permissions to users with Kubernetes RBAC, you'll define user permissions as a Role. Grant
permissions within a namespace using roles.
NOTE
Kubernetes roles grant permissions; they don't deny permissions.
To grant permissions across the entire cluster or to cluster resources outside a given namespace, you can
instead use ClusterRoles.
ClusterRoles
A ClusterRole grants and applies permissions to resources across the entire cluster, not a specific namespace.
RoleBindings and ClusterRoleBindings
Once you've defined roles to grant permissions to resources, you assign those Kubernetes RBAC permissions
with a RoleBinding. If your AKS cluster integrates with Azure Active Directory (Azure AD), RoleBindings grant
permissions to Azure AD users to perform actions within the cluster. See how in Control access to cluster
resources using Kubernetes role-based access control and Azure Active Directory identities.
RoleBindings
Assign roles to users for a given namespace using RoleBindings. With RoleBindings, you can logically segregate
a single AKS cluster, only enabling users to access the application resources in their assigned namespace.
To bind roles across the entire cluster, or to cluster resources outside a given namespace, you instead use
ClusterRoleBindings.
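As a minimal sketch (the namespace, role name, and user below are hypothetical), a Role granting read access to pods in a namespace and a RoleBinding assigning it to an Azure AD user might look like:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader               # hypothetical Role name
  namespace: dev                 # hypothetical namespace
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: dev
subjects:
- kind: User
  name: "developer@contoso.com"  # placeholder Azure AD user principal name
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io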
ClusterRoleBinding
With a ClusterRoleBinding, you bind roles to users and apply to resources across the entire cluster, not a specific
namespace. This approach lets you grant administrators or support engineers access to all resources in the AKS
cluster.
NOTE
Microsoft/AKS performs any cluster actions with user consent under a built-in Kubernetes role aks-service and built-in
role binding aks-service-rolebinding .
This role enables AKS to troubleshoot and diagnose cluster issues, but it can't modify permissions, create roles or
role bindings, or perform other high-privilege actions. Role access is only enabled under active support tickets with just-in-time (JIT)
access. Read more about AKS support policies.
Azure AD integration
Enhance your AKS cluster security with Azure AD integration. Built on decades of enterprise identity
management, Azure AD is a multi-tenant, cloud-based directory and identity management service that
combines core directory services, application access management, and identity protection. With Azure AD, you
can integrate on-premises identities into AKS clusters to provide a single source for account management and
security.
With Azure AD-integrated AKS clusters, you can grant users or groups access to Kubernetes resources within a
namespace or across the cluster.
1. To obtain a kubectl configuration context, a user runs the az aks get-credentials command.
2. When a user interacts with the AKS cluster with kubectl , they're prompted to sign in with their Azure AD
credentials.
This approach provides a single source for user account management and password credentials. The user can
only access the resources as defined by the cluster administrator.
Azure AD authentication is provided to AKS clusters with OpenID Connect. OpenID Connect is an identity layer
built on top of the OAuth 2.0 protocol. For more information on OpenID Connect, see the Open ID connect
documentation. From inside of the Kubernetes cluster, Webhook Token Authentication is used to verify
authentication tokens. Webhook token authentication is configured and managed as part of the AKS cluster.
Webhook and API server
The API server calls the AKS webhook server and performs the following steps:
1. kubectl uses the Azure AD client application to sign in users with OAuth 2.0 device authorization grant flow.
2. Azure AD provides an access_token, id_token, and a refresh_token.
3. The user makes a request to kubectl with an access_token from kubeconfig .
4. kubectl sends the access_token to API Server.
5. The API Server is configured with the Auth WebHook Server to perform validation.
6. The authentication webhook server confirms the JSON Web Token signature is valid by checking the Azure
AD public signing key.
7. The server application uses user-provided credentials to query group memberships of the logged-in user
from the MS Graph API.
8. A response is sent to the API Server with user information such as the user principal name (UPN) claim of the
access token, and the group membership of the user based on the object ID.
9. The API performs an authorization decision based on the Kubernetes Role/RoleBinding.
10. Once authorized, the API server returns a response to kubectl .
11. kubectl provides feedback to the user.
Learn how to integrate AKS with Azure AD with our AKS-managed Azure AD integration how-to guide.
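As a sketch, assuming a recent Azure CLI and using hypothetical resource names and a placeholder group object ID, AKS-managed Azure AD integration can be enabled when the cluster is created:
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-aad \
  --aad-admin-group-object-ids <azure-ad-group-object-id>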
Azure RBAC
With Azure RBAC, you create a role definition that outlines the permissions to be applied. You then assign a user
or group this role definition via a role assignment for a particular scope. The scope can be an individual
resource, a resource group, or across the subscription.
For more information, see What is Azure role-based access control (Azure RBAC)?
There are two levels of access needed to fully operate an AKS cluster:
Access the AKS resource in your Azure subscription.
Control scaling or upgrading your cluster using the AKS APIs.
Pull your kubeconfig .
Access to the Kubernetes API. This access is controlled by either:
Kubernetes RBAC (traditionally).
Integrating Azure RBAC with AKS for Kubernetes authorization.
Azure RBAC to authorize access to the AKS resource
With Azure RBAC, you can provide your users (or identities) with granular access to AKS resources across one or
more subscriptions. For example, you could use the Azure Kubernetes Service Contributor role to scale and
upgrade your cluster. Meanwhile, another user with the Azure Kubernetes Service Cluster Admin role only has
permission to pull the Admin kubeconfig .
Alternatively, you could give your user the general Contributor role. With the general Contributor role, users can
perform the above permissions and every action possible on the AKS resource, except managing permissions.
Use Azure RBAC to define access to the Kubernetes configuration file in AKS.
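For example (a sketch using hypothetical names and a placeholder user), you could let a user pull the non-admin kubeconfig by assigning the Azure Kubernetes Service Cluster User Role scoped to the AKS resource:
AKS_ID=$(az aks show \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --query id -o tsv)

az role assignment create \
  --assignee "developer@contoso.com" \
  --role "Azure Kubernetes Service Cluster User Role" \
  --scope $AKS_ID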
Azure RBAC for Kubernetes Authorization
With the Azure RBAC integration, AKS will use a Kubernetes Authorization webhook server so you can manage
Azure AD-integrated Kubernetes cluster resource permissions and assignments using Azure role definition and
role assignments.
When using the Azure RBAC integration, all requests to the Kubernetes API follow the same authentication flow as explained in the Azure Active Directory integration section.
If the identity making the request exists in Azure AD, Azure RBAC teams with Kubernetes RBAC to authorize the
request. If the identity exists outside of Azure AD (for example, a Kubernetes service account), authorization defers to
the normal Kubernetes RBAC.
In this scenario, you use Azure RBAC mechanisms and APIs to assign users built-in roles or create custom roles,
just as you would with Kubernetes roles.
With this feature, you not only give users permissions to the AKS resource across subscriptions, but you also
configure the roles and permissions inside each of those clusters to control Kubernetes API access. For
example, you can grant the Azure Kubernetes Service RBAC Reader role on the subscription scope. The role
recipient will be able to list and get all Kubernetes objects from all clusters without modifying them.
IMPORTANT
You need to enable Azure RBAC for Kubernetes authorization before using this feature. For more details and step by step
guidance, follow our Use Azure RBAC for Kubernetes Authorization how-to guide.
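A minimal sketch of enabling the feature on an existing Azure AD-enabled cluster and granting a built-in role (names are hypothetical, and the namespace-scope format shown is an assumption based on Azure RBAC for Kubernetes authorization):
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-azure-rbac

AKS_ID=$(az aks show -g myResourceGroup -n myAKSCluster --query id -o tsv)

az role assignment create \
  --assignee "developer@contoso.com" \
  --role "Azure Kubernetes Service RBAC Reader" \
  --scope "$AKS_ID/namespaces/dev"      # or use $AKS_ID alone for cluster-wide scope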
Built-in roles
AKS provides the following four built-in roles. They are similar to the Kubernetes built-in roles with a few
differences, like supporting CRDs. See the full list of actions allowed by each Azure built-in role.
ROLE | DESCRIPTION
Azure Kubernetes Service RBAC Reader | Allows read-only access to see most objects in a namespace. Doesn't allow viewing roles or role bindings. Doesn't allow viewing Secrets. Reading the Secrets contents enables access to ServiceAccount credentials in the namespace, which would allow API access as any ServiceAccount in the namespace (a form of privilege escalation).
Azure Kubernetes Service RBAC Writer | Allows read/write access to most objects in a namespace. Doesn't allow viewing or modifying roles, or role bindings. Allows accessing Secrets and running pods as any ServiceAccount in the namespace, so it can be used to gain the API access levels of any ServiceAccount in the namespace.
Azure Kubernetes Service RBAC Admin | Allows admin access, intended to be granted within a namespace. Allows read/write access to most resources in a namespace (or cluster scope), including the ability to create roles and role bindings within the namespace. Doesn't allow write access to resource quota or to the namespace itself.
Azure Kubernetes Service RBAC Cluster Admin | Allows super-user access to perform any action on any resource. Gives full control over every resource in the cluster and in all namespaces.
Summary
View the table for a quick summary of how users can authenticate to Kubernetes when Azure AD integration is
enabled. In all cases, the user's sequence of commands is:
1. Run az login to authenticate to Azure.
2. Run az aks get-credentials to download credentials for the cluster into .kube/config .
3. Run kubectl commands.
The first kubectl command may trigger browser-based authentication to authenticate to the cluster, as
described in the following table.
In the Azure portal, you can find:
The Role Grant (Azure RBAC role grant) referred to in the second column is shown on the Access Control
tab.
The Cluster Admin Azure AD Group is shown on the Configuration tab.
Also found with parameter name --aad-admin-group-object-ids in the Azure CLI.
Azure AD with manual (Cluster)RoleBindings
Role grant required: Azure Kubernetes User Role. The "User" role allows az aks get-credentials to be used without the --admin flag. (This is the only purpose of "Azure Kubernetes User Role".) The result, on an Azure AD-enabled cluster, is the download of an empty entry into .kube/config, which triggers browser-based authentication when it's first used by kubectl.
Cluster admin Azure AD group(s): User is not in any of these groups. Because the user is not in any Cluster Admin groups, their rights will be controlled entirely by any RoleBindings or ClusterRoleBindings that have been set up by cluster admins. The (Cluster)RoleBindings nominate Azure AD users or Azure AD groups as their subjects. If no such bindings have been set up, the user will not be able to execute any kubectl commands.
When to use: If you want fine-grained access control, and you're not using Azure RBAC for Kubernetes Authorization. Note that the user who sets up the bindings must log in by one of the other methods listed in this table.
Azure AD by member of admin group
Role grant required: Same as above.
Cluster admin Azure AD group(s): User is a member of one of the groups listed here. AKS automatically generates a ClusterRoleBinding that binds all of the listed groups to the cluster-admin Kubernetes role. So users in these groups can run all kubectl commands as cluster-admin.
When to use: If you want to conveniently grant users full admin rights, and are not using Azure RBAC for Kubernetes authorization.
Azure AD with Azure RBAC for Kubernetes Authorization
Role grant required: Two roles: first, Azure Kubernetes User Role (as above); second, one of the "Azure Kubernetes Service RBAC ..." roles listed above, or your own custom alternative.
Cluster admin Azure AD group(s): The admin roles field on the Configuration tab is irrelevant when Azure RBAC for Kubernetes Authorization is enabled.
When to use: You are using Azure RBAC for Kubernetes authorization. This approach gives you fine-grained control, without the need to set up RoleBindings or ClusterRoleBindings.
Next steps
To get started with Azure AD and Kubernetes RBAC, see Integrate Azure Active Directory with AKS.
For associated best practices, see Best practices for authentication and authorization in AKS.
To get started with Azure RBAC for Kubernetes Authorization, see Use Azure RBAC to authorize access within
the Azure Kubernetes Service (AKS) Cluster.
To get started securing your kubeconfig file, see Limit access to cluster configuration file
For more information on core Kubernetes and AKS concepts, see the following articles:
Kubernetes / AKS clusters and workloads
Kubernetes / AKS security
Kubernetes / AKS virtual networks
Kubernetes / AKS storage
Kubernetes / AKS scale
Network concepts for applications in Azure
Kubernetes Service (AKS)
6/15/2022 • 9 minutes to read • Edit Online
Kubernetes basics
To allow access to your applications or between application components, Kubernetes provides an abstraction
layer to virtual networking. Kubernetes nodes connect to a virtual network, providing inbound and outbound
connectivity for pods. The kube-proxy component runs on each node to provide these network features.
In Kubernetes:
Services logically group pods to allow for direct access on a specific port via an IP address or DNS name.
You can distribute traffic using a load balancer.
More complex routing of application traffic can also be achieved with Ingress Controllers.
Security and filtering of the network traffic for pods is possible with Kubernetes network policies.
The Azure platform also simplifies virtual networking for AKS clusters. When you create a Kubernetes load
balancer, you also create and configure the underlying Azure load balancer resource. As you open network ports
to pods, the corresponding Azure network security group rules are configured. For HTTP application routing,
Azure can also configure external DNS as new ingress routes are configured.
Services
To simplify the network configuration for application workloads, Kubernetes uses Services to logically group a
set of pods together and provide network connectivity. The following Service types are available:
Cluster IP
Creates an internal IP address for use within the AKS cluster. Good for internal-only applications that
support other workloads within the cluster.
NodePort
Creates a port mapping on the underlying node that allows the application to be accessed directly with
the node IP address and port.
LoadBalancer
Creates an Azure load balancer resource, configures an external IP address, and connects the requested
pods to the load balancer backend pool. To allow customers' traffic to reach the application, load
balancing rules are created on the desired ports.
For extra control and routing of the inbound traffic, you may instead use an Ingress controller.
ExternalName
Creates a specific DNS entry for easier application access.
The IP address for load balancers and services can be dynamically assigned, or you can specify an existing
static IP address. You can assign both internal and external static IP addresses. Existing static IP addresses are
often tied to a DNS entry.
You can create both internal and external load balancers. Internal load balancers are only assigned a private IP
address, so they can't be accessed from the Internet.
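As a minimal sketch (the names and ports are hypothetical), a LoadBalancer Service that exposes a set of pods on port 80 might look like:
apiVersion: v1
kind: Service
metadata:
  name: myapp-service              # hypothetical Service name
  # To create an internal load balancer instead, add:
  # annotations:
  #   service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: myapp                     # must match the labels on the target pods
  ports:
  - protocol: TCP
    port: 80                       # port exposed on the Azure load balancer
    targetPort: 8080               # port the pods listen on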
For more information, see Configure Azure CNI for an AKS cluster.
Compare network models
Both kubenet and Azure CNI provide network connectivity for your AKS clusters. However, there are advantages
and disadvantages to each. At a high level, the following considerations apply:
kubenet
Conserves IP address space.
Uses Kubernetes internal or external load balancer to reach pods from outside of the cluster.
You manually manage and maintain user-defined routes (UDRs).
Maximum of 400 nodes per cluster.
Azure CNI
Pods get full virtual network connectivity and can be directly reached via their private IP address from
connected networks.
Requires more IP address space.
The following behavior differences exist between kubenet and Azure CNI:
CAPABILITY | KUBENET | AZURE CNI
Pod-VM connectivity; VM in the same virtual network | Works when initiated by pod | Works both ways
Pod-VM connectivity; VM in peered virtual network | Works when initiated by pod | Works both ways
On-premises access using VPN or Express Route | Works when initiated by pod | Works both ways
Regarding DNS, with both the kubenet and Azure CNI plugins, DNS is provided by CoreDNS, a deployment running
in AKS with its own autoscaler. For more information on CoreDNS on Kubernetes, see Customizing DNS Service.
CoreDNS by default is configured to forward unknown domains to the DNS functionality of the Azure Virtual
Network where the AKS cluster is deployed. Hence, Azure DNS and Private Zones will work for pods running in
AKS.
Support scope between network models
Whatever network model you use, both kubenet and Azure CNI can be deployed in one of the following ways:
The Azure platform can automatically create and configure the virtual network resources when you create an
AKS cluster.
You can manually create and configure the virtual network resources and attach to those resources when you
create your AKS cluster.
Although capabilities like service endpoints or UDRs are supported with both kubenet and Azure CNI, the
support policies for AKS define what changes you can make. For example:
If you manually create the virtual network resources for an AKS cluster, you're supported when configuring
your own UDRs or service endpoints.
If the Azure platform automatically creates the virtual network resources for your AKS cluster, you can't
manually change those AKS-managed resources to configure your own UDRs or service endpoints.
Ingress controllers
When you create a LoadBalancer-type Service, you also create an underlying Azure load balancer resource. The
load balancer is configured to distribute traffic to the pods in your Service on a given port.
The LoadBalancer only works at layer 4. At layer 4, the Service is unaware of the actual applications, and can't
make any more routing considerations.
Ingress controllers work at layer 7, and can use more intelligent rules to distribute application traffic. Ingress
controllers typically route HTTP traffic to different applications based on the inbound URL.
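As a sketch (the hostname, backend service, and ingress class are placeholders, and an ingress controller such as NGINX is assumed to be installed), an Ingress rule that routes HTTP traffic might look like:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller
  rules:
  - host: myapp.contoso.com          # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-service      # hypothetical backend Service
            port:
              number: 80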
Next steps
To get started with AKS networking, create and configure an AKS cluster with your own IP address ranges using
kubenet or Azure CNI.
For associated best practices, see Best practices for network connectivity and security in AKS.
For more information on core Kubernetes and AKS concepts, see the following articles:
Kubernetes / AKS clusters and workloads
Kubernetes / AKS access and identity
Kubernetes / AKS security
Kubernetes / AKS storage
Kubernetes / AKS scale
Storage options for applications in Azure
Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
Applications running in Azure Kubernetes Service (AKS) may need to store and retrieve data. While some
application workloads can use local, fast storage on unneeded, emptied nodes, others require storage that
persists on more regular data volumes within the Azure platform.
Multiple pods may need to:
Share the same data volumes.
Reattach data volumes if the pod is rescheduled on a different node.
Finally, you may need to inject sensitive data or application configuration information into pods.
This article introduces the core concepts that provide storage to your applications in AKS:
Volumes
Persistent volumes
Storage classes
Persistent volume claims
Volumes
Kubernetes typically treats individual pods as ephemeral, disposable resources. Applications have different
approaches available to them for using and persisting data. A volume represents a way to store, retrieve, and
persist data across pods and through the application lifecycle.
Traditional volumes are created as Kubernetes resources backed by Azure Storage. You can manually create data
volumes to be assigned to pods directly, or have Kubernetes automatically create them. Data volumes can use:
Azure Disks, Azure Files, Azure NetApp Files, or Azure Blobs.
Azure Disks
Use Azure Disks to create a Kubernetes DataDisk resource. Disk types include:
Ultra Disks
Premium SSDs
Standard SSDs
Standard HDDs
TIP
For most production and development workloads, use Premium SSD.
Since Azure Disks are mounted as ReadWriteOnce, they're only available to a single pod. For storage volumes
that can be accessed by multiple pods simultaneously, use Azure Files.
Azure Files
Use Azure Files to mount an SMB 3.1.1 share or NFS 4.1 share backed by an Azure storage account to pods.
Files let you share data across multiple nodes and pods and can use:
Azure Premium storage backed by high-performance SSDs
Azure Standard storage backed by regular HDDs
Azure NetApp Files
Ultra Storage
Premium Storage
Standard Storage
Azure Blob Storage
Block Blobs
Volume types
Kubernetes volumes represent more than just a traditional disk for storing and retrieving information.
Kubernetes volumes can also be used as a way to inject data into a pod for use by the containers.
Common volume types in Kubernetes include:
emptyDir
Commonly used as temporary space for a pod. All containers within a pod can access the data on the volume.
Data written to this volume type persists only for the lifespan of the pod. Once you delete the pod, the volume is
deleted. This volume typically uses the underlying local node disk storage, though it can also exist only in the
node's memory.
secret
You can use secret volumes to inject sensitive data into pods, such as passwords.
1. Create a Secret using the Kubernetes API.
2. Define your pod or deployment and request a specific Secret.
Secrets are only provided to nodes with a scheduled pod that requires them.
The Secret is stored in tmpfs, not written to disk.
3. When you delete the last pod on a node requiring a Secret, the Secret is deleted from the node's tmpfs.
Secrets are stored within a given namespace and can only be accessed by pods within the same
namespace.
configMap
You can use configMap to inject key-value pair properties into pods, such as application configuration
information. Define application configuration information as a Kubernetes resource, easily updated and applied
to new instances of pods as they're deployed.
Like using a Secret:
1. Create a ConfigMap using the Kubernetes API.
2. Request the ConfigMap when you define a pod or deployment.
ConfigMaps are stored within a given namespace and can only be accessed by pods within the same
namespace.
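As with the Secret example earlier, a minimal sketch (the names and keys are hypothetical) of a ConfigMap and a pod that consumes it as environment variables:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config                 # hypothetical ConfigMap name
data:
  LOG_LEVEL: "info"
  FEATURE_FLAG: "true"
---
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
    envFrom:
    - configMapRef:
        name: app-config           # injects all keys as environment variables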
Persistent volumes
Volumes defined and created as part of the pod lifecycle only exist until you delete the pod. Pods often expect
their storage to remain if a pod is rescheduled on a different host during a maintenance event, especially in
StatefulSets. A persistent volume (PV) is a storage resource created and managed by the Kubernetes API that can
exist beyond the lifetime of an individual pod.
You can use Azure Disks or Files to provide the PersistentVolume. As noted in the Volumes section, the choice of
Disks or Files is often determined by the need for concurrent access to the data or the performance tier.
A PersistentVolume can be statically created by a cluster administrator, or dynamically created by the Kubernetes
API server. If a pod is scheduled and requests currently unavailable storage, Kubernetes can create the
underlying Azure Disk or Files storage and attach it to the pod. Dynamic provisioning uses a StorageClass to
identify what type of Azure storage needs to be created.
Storage classes
To define different tiers of storage, such as Premium and Standard, you can create a StorageClass.
The StorageClass also defines the reclaimPolicy. When you delete the pod and the persistent volume is no
longer required, the reclaimPolicy controls the behavior of the underlying Azure storage resource. The
underlying storage resource can either be deleted or kept for use with a future pod.
For clusters using the Container Storage Interface (CSI) drivers, extra StorageClasses are created.
Unless you specify a StorageClass for a persistent volume, the default StorageClass will be used. Ensure volumes
use the appropriate storage you need when requesting persistent volumes.
IMPORTANT
Starting in Kubernetes version 1.21, AKS uses CSI drivers only and by default. The default class is the same as managed-csi.
You can create a StorageClass for additional needs using kubectl . The following example uses Premium
Managed Disks and specifies that the underlying Azure Disk should be retained when you delete the pod:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: managed-premium-retain
provisioner: disk.csi.azure.com
parameters:
skuName: Premium_LRS
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
NOTE
AKS reconciles the default storage classes and will overwrite any changes you make to those storage classes.
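The following example manifests define a persistent volume claim that requests 5 GiB of storage from the managed-premium-retain StorageClass created above, and a pod that mounts the resulting volume at /mnt/azure: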
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: azure-managed-disk
spec:
accessModes:
- ReadWriteOnce
storageClassName: managed-premium-retain
resources:
requests:
storage: 5Gi
kind: Pod
apiVersion: v1
metadata:
name: nginx
spec:
containers:
- name: myfrontend
image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
volumeMounts:
- mountPath: "/mnt/azure"
name: volume
volumes:
- name: volume
persistentVolumeClaim:
claimName: azure-managed-disk
For mounting a volume in a Windows container, specify the drive letter and path. For example:
...
volumeMounts:
- mountPath: "d:"
name: volume
- mountPath: "c:\k"
name: k-dir
...
Next steps
For associated best practices, see Best practices for storage and backups in AKS.
To see how to use CSI drivers, see the following how-to articles:
Enable Container Storage Interface(CSI) drivers for Azure disks and Azure Files on Azure Kubernetes
Service(AKS)
Use Azure disk Container Storage Interface(CSI) drivers in Azure Kubernetes Service(AKS)
Use Azure Files Container Storage Interface(CSI) drivers in Azure Kubernetes Service(AKS)
Integrate Azure NetApp Files with Azure Kubernetes Service
For more information on core Kubernetes and AKS concepts, see the following articles:
Kubernetes / AKS clusters and workloads
Kubernetes / AKS identity
Kubernetes / AKS security
Kubernetes / AKS virtual networks
Kubernetes / AKS scale
Scaling options for applications in Azure Kubernetes
Service (AKS)
6/15/2022 • 6 minutes to read • Edit Online
As you run applications in Azure Kubernetes Service (AKS), you may need to increase or decrease the amount of
compute resources. As the number of application instances you need changes, the number of underlying
Kubernetes nodes may also need to change. You also might need to quickly provision a large number of
additional application instances.
This article introduces the core concepts that help you scale applications in AKS:
Manually scale
Horizontal pod autoscaler (HPA)
Cluster autoscaler
Azure Container Instance (ACI) integration with AKS
When you configure the horizontal pod autoscaler for a given deployment, you define the minimum and
maximum number of replicas that can run. You also define the metric to monitor and base any scaling decisions
on, such as CPU usage.
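A minimal sketch of such a configuration (the target deployment name and thresholds are illustrative; the autoscaling/v2 API is available in recent Kubernetes versions, while older clusters may need autoscaling/v2beta2):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp                    # hypothetical deployment to scale
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # add replicas when average CPU use exceeds 70%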
To get started with the horizontal pod autoscaler in AKS, see Autoscale pods in AKS.
Cooldown of scaling events
As the horizontal pod autoscaler checks the Metrics API every 30 seconds, previous scale events may not have
successfully completed before another check is made. This behavior could cause the horizontal pod autoscaler
to change the number of replicas before the previous scale event could receive application workload and the
resource demands to adjust accordingly.
To minimize race events, a delay value is set. This value defines how long the horizontal pod autoscaler must
wait after a scale event before another scale event can be triggered. This behavior allows the new replica count
to take effect and the Metrics API to reflect the distributed workload. There is no delay for scale-up events as of
Kubernetes 1.12; however, the delay on scale-down events defaults to 5 minutes.
Currently, you can't tune these cooldown values from the default.
Cluster autoscaler
To respond to changing pod demands, Kubernetes has a cluster autoscaler that adjusts the number of nodes
based on the requested compute resources in the node pool. By default, the cluster autoscaler checks the
Metrics API server every 10 seconds for any required changes in node count. If the cluster autoscaler determines
that a change is required, the number of nodes in your AKS cluster is increased or decreased accordingly. The
cluster autoscaler works with Kubernetes RBAC-enabled AKS clusters that run Kubernetes 1.10.x or higher.
Cluster autoscaler is typically used alongside the horizontal pod autoscaler. When combined, the horizontal pod
autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler
adjusts the number of nodes as needed to run those additional pods accordingly.
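A sketch of enabling the cluster autoscaler on an existing cluster, with hypothetical names and node-count bounds:
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5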
To get started with the cluster autoscaler in AKS, see Cluster Autoscaler on AKS.
Scale out events
If a node doesn't have sufficient compute resources to run a requested pod, that pod can't progress through the
scheduling process. The pod can't start unless additional compute resources are available within the node pool.
When the cluster autoscaler notices pods that can't be scheduled because of node pool resource constraints, the
number of nodes within the node pool is increased to provide the additional compute resources. When those
additional nodes are successfully deployed and available for use within the node pool, the pods are then
scheduled to run on them.
If your application needs to scale rapidly, some pods may remain in a state waiting to be scheduled until the
additional nodes deployed by the cluster autoscaler can accept the scheduled pods. For applications that have
high burst demands, you can scale with virtual nodes and Azure Container Instances.
Scale in events
The cluster autoscaler also monitors the pod scheduling status for nodes that haven't recently received new
scheduling requests. This scenario indicates the node pool has more compute resources than are required, and
the number of nodes can be decreased.
A node that passes a threshold for no longer being needed for 10 minutes by default is scheduled for deletion.
When this situation occurs, pods are scheduled to run on other nodes within the node pool, and the cluster
autoscaler decreases the number of nodes.
Your applications may experience some disruption as pods are scheduled on different nodes when the cluster
autoscaler decreases the number of nodes. To minimize disruption, avoid applications that use a single pod
instance.
ACI lets you quickly deploy container instances without additional infrastructure overhead. When you connect
with AKS, ACI becomes a secured, logical extension of your AKS cluster. The virtual nodes component, which is
based on Virtual Kubelet, is installed in your AKS cluster that presents ACI as a virtual Kubernetes node.
Kubernetes can then schedule pods that run as ACI instances through virtual nodes, not as pods on VM nodes
directly in your AKS cluster.
Your application requires no modification to use virtual nodes. Deployments can scale across AKS and ACI
with no delay while the cluster autoscaler deploys new nodes in your AKS cluster.
Virtual nodes are deployed to an additional subnet in the same virtual network as your AKS cluster. This virtual
network configuration allows the traffic between ACI and AKS to be secured. Like an AKS cluster, an ACI instance
is a secure, logical compute resource that is isolated from other users.
Next steps
To get started with scaling applications, first follow the quickstart to create an AKS cluster with the Azure CLI.
You can then start to manually or automatically scale applications in your AKS cluster:
Manually scale pods or nodes
Use the horizontal pod autoscaler
Use the cluster autoscaler
For more information on core Kubernetes and AKS concepts, see the following articles:
Kubernetes / AKS clusters and workloads
Kubernetes / AKS access and identity
Kubernetes / AKS security
Kubernetes / AKS virtual networks
Kubernetes / AKS storage
Azure Kubernetes Service (AKS) node auto-repair
6/15/2022 • 2 minutes to read • Edit Online
AKS continuously monitors the health state of worker nodes and performs automatic node repair if they
become unhealthy. The Azure virtual machine (VM) platform performs maintenance on VMs experiencing
issues.
AKS and Azure VMs work together to minimize service disruptions for clusters.
In this document, you'll learn how automatic node repair functionality behaves for both Windows and Linux
nodes.
If AKS identifies an unhealthy node that remains unhealthy for 10 minutes, AKS takes the following actions:
1. Reboot the node.
2. If the reboot is unsuccessful, reimage the node.
3. If the reimage is unsuccessful, redeploy the node.
Alternative remediations are investigated by AKS engineers if auto-repair is unsuccessful.
If AKS finds multiple unhealthy nodes during a health check, each node is repaired individually before another
repair begins.
Node Autodrain
Scheduled Events can occur on the underlying virtual machines (VMs) in any of your node pools. For spot node
pools, scheduled events may cause a preempt node event for the node. Certain node events, such as preempt,
cause AKS node autodrain to attempt a cordon and drain of the affected node, which allows for a graceful
reschedule of any affected workloads on that node. When this happens, you might notice that the node receives a
taint with "remediator.aks.microsoft.com/unschedulable" because of "kubernetes.azure.com/scalesetpriority:
spot".
The following table shows the node events, and the actions they cause for AKS node autodrain.
EVENT | DESCRIPTION | ACTION
Limitations
In many cases, AKS can determine if a node is unhealthy and attempt to repair the issue, but there are cases
where AKS either can't repair the issue or can't detect that there is an issue. For example, AKS can't detect issues
if a node's status isn't being reported due to an error in network configuration, or if a node failed to initially register as a
healthy node.
Next steps
Use Availability Zones to increase high availability with your AKS cluster workloads.
Multi-instance GPU Node pool
6/15/2022 • 3 minutes to read • Edit Online
Nvidia's A100 GPU can be divided into up to seven independent instances. Each instance has its own memory
and Stream Multiprocessor (SM). For more information on the Nvidia A100, see Nvidia A100 GPU.
This article will walk you through how to create a multi-instance GPU node pool on Azure Kubernetes Service
clusters and schedule tasks.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
PROFILE NAME | FRACTION OF SM | FRACTION OF MEMORY | NUMBER OF INSTANCES CREATED
As an example, the GPU Instance Profile of MIG 1g.5gb indicates that each GPU instance will have 1g of
SM (compute resources) and 5 GB of memory. In this case, the GPU will be partitioned into seven instances.
The GPU Instance Profiles available for this instance size are MIG1g, MIG2g, MIG3g, MIG4g, and MIG7g.
IMPORTANT
The applied GPU Instance Profile cannot be changed after node pool creation.
az aks create \
--resource-group myresourcegroup \
--name migcluster \
--node-count 1
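You can then add a multi-instance GPU node pool to the cluster. The following sketch assumes the --gpu-instance-profile parameter of az aks nodepool add and reuses the resource names from the example above:
az aks nodepool add \
  --resource-group myresourcegroup \
  --cluster-name migcluster \
  --name mignode \
  --node-vm-size Standard_ND96asr_v4 \
  --gpu-instance-profile MIG1g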
HTTP request
If you're using http request, you can place GPU instance profile in the request body:
{
"properties": {
"count": 1,
"vmSize": "Standard_ND96asr_v4",
"type": "VirtualMachineScaleSets",
"gpuInstanceProfile": "MIG1g"
}
}
nvidia.com/gpu: 1
Mixed Strategy
The mixed strategy will expose the GPU instances and the GPU instance profile. If you use this strategy,
the GPU resource will be displayed as:
nvidia.com/mig1g.5gb: 1
export MIG_STRATEGY=single
or
export MIG_STRATEGY=mixed
Install the Nvidia device plugin and GPU feature discovery using helm
helm install \
--version=0.7.0 \
--generate-name \
--set migStrategy=${MIG_STRATEGY} \
nvdp/nvidia-device-plugin
helm install \
--version=0.2.0 \
--generate-name \
--set migStrategy=${MIG_STRATEGY} \
nvgfd/gpu-feature-discovery
Confirm the GPU resources are visible on the node, for example with kubectl describe node. With the single strategy, the allocatable resources show:
Allocatable:
  nvidia.com/gpu: 56
With the mixed strategy, the GPU instance profile is exposed instead:
Allocatable:
  nvidia.com/mig-1g.5gb: 56
Schedule work
Use the kubectl run command to schedule work using single strategy:
kubectl run -it --rm \
--image=nvidia/cuda:11.0-base \
--restart=Never \
--limits=nvidia.com/gpu=1 \
single-strategy-example -- nvidia-smi -L
Use the kubectl run command to schedule work using mixed strategy:
Troubleshooting
If you do not see multi-instance GPU capability after the node pool has been created, confirm the API version
is not older than 2021-08-01.
About service meshes
6/15/2022 • 2 minutes to read • Edit Online
A service mesh provides capabilities like traffic management, resiliency, policy, security, strong identity, and
observability to your workloads. Your application is decoupled from these operational capabilities and the
service mesh moves them out of the application layer, and down to the infrastructure layer.
Scenarios
These are some of the scenarios that can be enabled for your workloads when you use a service mesh:
Encrypt all traffic in cluster - Enable mutual TLS between specified services in the cluster. This can be
extended to ingress and egress at the network perimeter, and provides a secure-by-default option with no
changes needed for application code and infrastructure.
Canary and phased rollouts - Specify conditions for a subset of traffic to be routed to a set of new
services in the cluster. On a successful test of the canary release, remove the conditional routing and
gradually increase the percentage of traffic routed to the new service, until eventually all traffic is directed to it.
Traffic management and manipulation - Create a policy on a service that will rate limit all traffic to a
version of a service from a specific origin, or a policy that applies a retry strategy to classes of failures
between specified services. Mirror live traffic to new versions of services during a migration or to debug
issues. Inject faults between services in a test environment to test resiliency.
Observability - Gain insight into how your services are connected and the traffic that flows between
them. Obtain metrics, logs, and traces for all traffic in cluster, including ingress/egress. Add distributed
tracing abilities to your applications.
Selection criteria
Before you select a service mesh, ensure that you understand your requirements and the reasons for installing a
service mesh. Ask the following questions:
Is an Ingress Controller sufficient for my needs? - Sometimes having a capability like A/B testing
or traffic splitting at the ingress is sufficient to support the required scenario. Don't add complexity to
your environment with no upside.
Can my workloads and environment tolerate the additional overheads? - All the additional
components required to support the service mesh require additional resources like CPU and memory. In
addition, all the proxies and their associated policy checks add latency to your traffic. If you have
workloads that are very sensitive to latency, or can't provide the additional resources to cover the
service mesh components, then reconsider.
Is this adding additional complexity unnecessarily? - If the reason for installing a service mesh is
to gain a capability that is not necessarily critical to the business or operational teams, then consider
whether the additional complexity of installation, maintenance, and configuration is worth it.
Can this be adopted in an incremental approach? - Some of the service meshes that provide a lot
of capabilities can be adopted in a more incremental approach. Install just the components you need to
ensure your success. Once you are more confident and additional capabilities are required, then explore
those. Resist the urge to install everything from the start.
Next steps
Open Service Mesh (OSM) is a supported service mesh that runs on Azure Kubernetes Service (AKS):
Learn more about OSM ...
There are also service meshes provided by open-source projects and third parties that are commonly used with
AKS. These open-source and third-party service meshes are not covered by the AKS support policy.
Istio
Linkerd
Consul Connect
For more details on the service mesh landscape, see Layer 5's Service Mesh Landscape.
For more details on service mesh standardization efforts, see:
Service Mesh Interface (SMI)
Service Mesh Federation
Service Mesh Performance (SMP)
Sustainable software engineering principles in Azure
Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
The sustainable software engineering principles are a set of competencies to help you define, build, and run
sustainable applications. The overall goal is to reduce the carbon footprint in every aspect of your application.
The Principles of Sustainable Software Engineering has an overview of the principles of sustainable software
engineering.
Sustainable software engineering is a shift in priorities and focus. In many cases, the way most software is
designed and run highlights fast performance and low latency. Meanwhile, sustainable software engineering
focuses on reducing as much carbon emission as possible. Consider:
Applying sustainable software engineering principles can give you faster performance or lower latency, such
as by lowering total network travel.
Reducing carbon emissions may cause slower performance or increased latency, such as delaying low-
priority workloads.
Before applying sustainable software engineering principles to your application, review the priorities, needs, and
trade-offs of your application.
IMPORTANT
When considering changing the resources in your cluster, verify your system pools have enough resources to maintain
the stability of your cluster's core system components. Never reduce your cluster's resources to the point where your
cluster may become unstable.
After reviewing your cluster's utilization, consider using the features offered by multiple node pools:
Node sizing
Use node sizing to define node pools with specific CPU and memory profiles, allowing you to tailor your
nodes to your workload needs. By sizing your nodes to those needs, you can run fewer nodes at
higher utilization.
Cluster scaling
Configure how your cluster scales. Use the horizontal pod autoscaler and the cluster autoscaler to scale
your cluster automatically based on your configuration. Control how your cluster scales to keep all your
nodes running at a high utilization while staying in sync with changes to your cluster's workload.
Spot pools
For cases where a workload is tolerant to sudden interruptions or terminations, you can use spot pools.
Spot pools take advantage of idle capacity within Azure. For example, spot pools may work well for batch
jobs or development environments.
NOTE
Increasing utilization can also reduce excess nodes, which reduces the energy consumed by resource reservations on each
node.
Finally, review the CPU and memory requests and limits in the Kubernetes manifests of your applications.
As you lower memory and CPU values, more memory and CPU are available to the cluster to run other
workloads.
As you run more workloads with lower CPU and memory, your cluster becomes more densely allocated,
which increases your utilization.
When reducing the CPU and memory for your applications, your applications' behavior may become degraded
or unstable if you set CPU and memory values too low. Before changing the CPU and memory requests and
limits, run some benchmarking tests to verify if the values are set appropriately. Never reduce these values to
the point of application instability.
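As a sketch (the values shown are illustrative only), requests and limits are set per container in the pod spec:
...
    containers:
    - name: myapp
      image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
      resources:
        requests:
          cpu: 100m              # amount the scheduler reserves for the container
          memory: 128Mi
        limits:
          cpu: 250m              # the container is throttled above this CPU value
          memory: 256Mi          # the container is terminated if it exceeds this memory value
...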
IMPORTANT
When considering making changes to your cluster's networking, never reduce network travel at the cost of meeting
workload requirements. For example, while using availability zones causes more network travel on your cluster, availability
zones may be necessary to handle workload requirements.
Demand shaping
Where possible, consider shifting demand for your cluster's resources to times or regions where you can use
excess capacity. For example, consider:
Changing the time or region for a batch job to run.
Using spot pools.
Refactoring your application to use a queue to defer running workloads that don't need immediate
processing.
Next steps
Learn more about the features of AKS mentioned in this article:
Multiple node pools
Node sizing
Scaling a cluster
Horizontal pod autoscaler
Cluster autoscaler
Spot pools
System pools
Resource reservations
Proximity placement groups
Availability Zones
GitOps Flux v2 configurations with AKS and Azure
Arc-enabled Kubernetes
6/15/2022 • 5 minutes to read • Edit Online
Azure provides configuration management capability using GitOps in Azure Kubernetes Service (AKS) and Azure
Arc-enabled Kubernetes clusters. You can easily enable and use GitOps in these clusters.
With GitOps, you declare the desired state of your Kubernetes clusters in files in Git repositories. The Git
repositories may contain the following files:
YAML-formatted manifests that describe Kubernetes resources (such as Namespaces, Secrets, Deployments,
and others)
Helm charts for deploying applications
Kustomize files to describe environment-specific changes
Because these files are stored in a Git repository, they're versioned, and changes between versions are easily
tracked. Kubernetes controllers run in the clusters and continually reconcile the cluster state with the desired
state declared in the Git repository. These operators pull the files from the Git repositories and apply the desired
state to the clusters. The operators also continuously assure that the cluster remains in the desired state.
GitOps on Azure Arc-enabled Kubernetes or Azure Kubernetes Service uses Flux, a popular open-source tool set.
Flux provides support for common file sources (Git and Helm repositories, Buckets) and template types (YAML,
Helm, and Kustomize). Flux also supports multi-tenancy and deployment dependency management, among
other features.
NOTE
The microsoft.flux extension is installed in the flux-system namespace and has cluster-wide scope. The option to
install this extension at namespace scope is not available, and attempts to install it at namespace scope will fail with a
400 error.
Flux configurations
Each fluxConfigurations resource in Azure will be associated in a Kubernetes cluster with one Flux
GitRepository or Bucket custom resource and one or more Kustomization custom resources. When you
create a fluxConfigurations resource, you'll specify, among other information, the URL to the source (Git
repository or Bucket) and the sync target in the source for each Kustomization . You can configure dependencies
between Kustomization custom resources to control deployment sequencing. Also, you can create multiple
namespace-scoped fluxConfigurations resources on the same cluster for different applications and app teams.
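For example, a hedged sketch of creating a namespace-scoped Flux configuration with the Azure CLI (this assumes the k8s-configuration CLI extension is installed, and the resource group, cluster, configuration name, and repository URL are placeholders):
az k8s-configuration flux create \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --cluster-type managedClusters \
  --name cluster-config \
  --namespace cluster-config \
  --scope namespace \
  --url https://github.com/<org>/<repo> \
  --branch main \
  --kustomization name=apps path=./apps prune=true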
NOTE
fluxconfig-agent monitors for new or updated fluxConfiguration resources in Azure. The agent requires
connectivity to Azure for the desired state of the fluxConfiguration to be applied to the cluster. If the agent is
unable to connect to Azure, there will be a delay in making the changes in the cluster until the agent can connect. If
the cluster is disconnected from Azure for more than 48 hours, then the request to the cluster will time-out, and the
changes will need to be re-applied in Azure.
Sensitive customer inputs like private key and token/password are stored for less than 48 hours in the Kubernetes
Configuration service. If you update any of these values in Azure, assure that your clusters connect with Azure within
48 hours.
Data residency
The Azure GitOps service (Azure Kubernetes Configuration Management) stores/processes customer data. By
default, customer data is replicated to the paired region. For the regions Singapore, East Asia, and Brazil South,
all customer data is stored and processed in the region.
Next steps
Advance to the next tutorial to learn how to enable GitOps on your AKS or Azure Arc-enabled Kubernetes
clusters
Enable GitOps with Flux
Cluster operator and developer best practices to
build and manage applications on Azure
Kubernetes Service (AKS)
6/15/2022 • 2 minutes to read • Edit Online
Building and running applications successfully in Azure Kubernetes Service (AKS) requires understanding and
implementing some key considerations, including:
Multi-tenancy and scheduler features.
Cluster and pod security.
Business continuity and disaster recovery.
The AKS product group, engineering teams, and field teams (including global black belts [GBBs]) contributed to,
wrote, and grouped the following best practices and conceptual articles. Their purpose is to help cluster
operators and developers understand the considerations above and implement the appropriate features.
Next steps
If you need to get started with AKS, see the AKS quickstart using the Azure CLI, using Azure PowerShell, or using
the Azure portal.
Best practices for authentication and authorization
in Azure Kubernetes Service (AKS)
6/15/2022 • 8 minutes to read • Edit Online
As you deploy and maintain clusters in Azure Kubernetes Service (AKS), you implement ways to manage access
to resources and services. Without these controls:
Accounts could have access to unnecessary resources and services.
Tracking which set of credentials were used to make changes could be difficult.
This best practices article focuses on how a cluster operator can manage access and identity for AKS clusters. In
this article, you learn how to:
Authenticate AKS cluster users with Azure Active Directory.
Control access to resources with Kubernetes role-based access control (Kubernetes RBAC).
Use Azure RBAC to granularly control access to the AKS resource, the Kubernetes API at scale, and the
kubeconfig .
Use a managed identity to authenticate pods themselves with other services.
Your Kubernetes cluster developers and application owners need access to different resources. Kubernetes lacks
an identity management solution for you to control the resources with which users can interact. Instead, you
typically integrate your cluster with an existing identity solution. Enter Azure AD: an enterprise-ready identity
management solution that integrates with AKS clusters.
With Azure AD-integrated clusters in AKS, you create Roles or ClusterRoles defining access permissions to
resources. You then bind the roles to users or groups from Azure AD. Learn more about Kubernetes RBAC
in the next section. Azure AD integration and how you control access to resources can be seen in the following
diagram:
In Kubernetes, you provide granular access control to cluster resources. You define permissions at the cluster
level, or to specific namespaces. You determine what resources can be managed and with what permissions. You
then apply these roles to users or groups with a binding. For more information about Roles, ClusterRoles, and
Bindings, see Access and identity options for Azure Kubernetes Service (AKS).
As an example, you create a role with full access to resources in the namespace named finance-app, as shown in
the following example YAML manifest:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: finance-app-full-access-role
  namespace: finance-app
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["*"]
You then create a RoleBinding and bind the Azure AD user [email protected] to the RoleBinding, as
shown in the following YAML manifest:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: finance-app-full-access-role-binding
  namespace: finance-app
subjects:
- kind: User
  name: [email protected]
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: finance-app-full-access-role
  apiGroup: rbac.authorization.k8s.io
When [email protected] is authenticated against the AKS cluster, they have full permissions to
resources in the finance-app namespace. In this way, you logically separate and control access to resources. Use
Kubernetes RBAC in conjunction with Azure AD-integration.
To see how to use Azure AD groups to control access to Kubernetes resources using Kubernetes RBAC, see
Control access to cluster resources using role-based access control and Azure Active Directory identities in AKS.
Use Azure RBAC
Best practice guidance
Use Azure RBAC to define the minimum required user and group permissions to AKS resources in one or
more subscriptions.
There are two levels of access needed to fully operate an AKS cluster:
1. Access the AKS resource on your Azure subscription.
This access level allows you to:
Control scaling or upgrading your cluster using the AKS APIs
Pull your kubeconfig .
To see how to control access to the AKS resource and the kubeconfig , see Limit access to cluster
configuration file.
2. Access to the Kubernetes API.
This access level is controlled either by:
Kubernetes RBAC (traditionally) or
By integrating Azure RBAC with AKS for kubernetes authorization.
To see how to granularly give permissions to the Kubernetes API using Azure RBAC, see Use Azure RBAC
for Kubernetes authorization.
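As a brief, hedged sketch (resource names are placeholders and the built-in role shown is one of several AKS RBAC roles), you can enable Azure RBAC for Kubernetes authorization on an existing Azure AD-enabled cluster and grant a user or group read access:
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-azure-rbac

az role assignment create \
  --assignee <azure-ad-user-or-group-object-id> \
  --role "Azure Kubernetes Service RBAC Reader" \
  --scope $(az aks show --resource-group myResourceGroup --name myAKSCluster --query id --output tsv)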
NOTE
Pod identities are intended for use with Linux pods and container images only. Pod-managed identities support for
Windows containers is coming soon.
To access other Azure services, like Cosmos DB, Key Vault, or Blob Storage, the pod needs access credentials. You
could define access credentials with the container image or inject them as a Kubernetes secret. Either way, you
would need to manually create and assign them. Usually, these credentials are reused across pods and aren't
regularly rotated.
With pod-managed identities for Azure resources, you automatically request access to services through Azure
AD. Pod-managed identities are currently in preview for AKS. To get started, see the Use Azure Active Directory
pod-managed identities in Azure Kubernetes Service (Preview) documentation.
Azure Active Directory Pod Identity supports 2 modes of operation:
1. Standard Mode: In this mode, the following 2 components are deployed to the AKS cluster:
Managed Identity Controller(MIC): A Kubernetes controller that watches for changes to pods,
AzureIdentity and AzureIdentityBinding through the Kubernetes API Server. When it detects a relevant
change, the MIC adds or deletes AzureAssignedIdentity as needed. Specifically, when a pod is
scheduled, the MIC assigns the managed identity on Azure to the underlying VMSS used by the node
pool during the creation phase. When all pods using the identity are deleted, it removes the identity
from the VMSS of the node pool, unless the same managed identity is used by other pods. The MIC
takes similar actions when AzureIdentity or AzureIdentityBinding are created or deleted.
Node Managed Identity (NMI): a pod that runs as a DaemonSet on each node in the AKS cluster.
NMI intercepts security token requests to the Azure Instance Metadata Service on each node, redirects
them to itself, validates whether the pod has access to the identity it's requesting a token for, and fetches
the token from the Azure Active Directory tenant on behalf of the application.
2. Managed Mode: In this mode, there is only NMI. The identity needs to be manually assigned and
managed by the user. For more information, see Pod Identity in Managed Mode. In this mode, when you
use the az aks pod-identity add command to add a pod identity to an Azure Kubernetes Service (AKS)
cluster, it creates the AzureIdentity and AzureIdentityBinding in the namespace specified by the
--namespace parameter, while the AKS resource provider assigns the managed identity specified by the
--identity-resource-id parameter to the virtual machine scale set (VMSS) of each node pool in the AKS
cluster (see the example command below).
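For example, a hedged sketch of adding a pod identity in managed mode with the Azure CLI (this assumes the pod identity add-on is already enabled on the cluster; the names and identity resource ID are placeholders):
az aks pod-identity add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --namespace my-app \
  --name my-pod-identity \
  --identity-resource-id <managed-identity-resource-id>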
NOTE
If you instead decide to install the Azure Active Directory Pod Identity using the AKS cluster add-on, the setup will use the
managed mode.
Managed mode provides the following advantages over standard mode:
1. Identity assignment on the VMSS of a node pool can take 40-60 seconds. For cron jobs or applications that
require access to the identity and can't tolerate the assignment delay, it's best to use managed mode, as the
identity is pre-assigned to the VMSS of the node pool, either manually or via the az aks pod-identity add command.
2. In standard mode, MIC requires write permissions on the VMSS used by the AKS cluster and
Managed Identity Operator permission on the user-assigned managed identities. In managed mode,
since there is no MIC, these role assignments aren't required.
Instead of manually defining credentials for pods, pod-managed identities request an access token in real time,
using it to access only their assigned services. In AKS, there are two components that handle the operations to
allow pods to use managed identities:
The Node Management Identity (NMI) server is a pod that runs as a DaemonSet on each node in the
AKS cluster. The NMI server listens for pod requests to Azure services.
The Azure Resource Provider queries the Kubernetes API server and checks for an Azure identity
mapping that corresponds to a pod.
When pods request a security token from Azure Active Directory to access an Azure service, network rules
redirect the traffic to the NMI server.
1. The NMI server:
Identifies pods requesting access to Azure services based on their remote address.
Queries the Azure Resource Provider.
2. The Azure Resource Provider checks for Azure identity mappings in the AKS cluster.
3. The NMI server requests an access token from Azure AD based on the pod's identity mapping.
4. Azure AD provides access to the NMI server, which is returned to the pod.
This access token can be used by the pod to then request access to services in Azure.
In the following example, a developer creates a pod that uses a managed identity to request access to Azure SQL
Database:
1. Cluster operator creates a service account to map identities when pods request access to services.
2. The NMI server is deployed to relay any pod requests, along with the Azure Resource Provider, for access
tokens to Azure AD.
3. A developer deploys a pod with a managed identity that requests an access token through the NMI server.
4. The token is returned to the pod and used to access Azure SQL Database (see the sketch below).
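A minimal sketch of the pod side of this flow, assuming a pod identity was created with the name my-pod-identity (as in the earlier az aks pod-identity add example); the pod references the identity through the aadpodidbinding label:
apiVersion: v1
kind: Pod
metadata:
  name: demo-sql-client
  labels:
    aadpodidbinding: my-pod-identity  # must match the pod identity's binding selector (defaults to its name)
spec:
  containers:
  - name: demo
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command: [ "sh", "-c", "sleep 1h" ]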
NOTE
Pod-managed identities are currently in preview.
To use Pod-managed identities, see Use Azure Active Directory pod-managed identities in Azure Kubernetes
Service (Preview).
Next steps
This best practices article focused on authentication and authorization for your cluster and resources. To
implement some of these best practices, see the following articles:
Integrate Azure Active Directory with AKS
Use Azure Active Directory pod-managed identities in Azure Kubernetes Service (Preview)
For more information about cluster operations in AKS, see the following best practices:
Multi-tenancy and cluster isolation
Basic Kubernetes scheduler features
Advanced Kubernetes scheduler features
Best practices for cluster security and upgrades in
Azure Kubernetes Service (AKS)
6/15/2022 • 10 minutes to read • Edit Online
As you manage clusters in Azure Kubernetes Service (AKS), workload and data security is a key consideration.
When you run multi-tenant clusters using logical isolation, you especially need to secure resource and workload
access. Minimize the risk of attack by applying the latest Kubernetes and node OS security updates.
This article focuses on how to secure your AKS cluster. You learn how to:
Use Azure Active Directory and Kubernetes role-based access control (Kubernetes RBAC) to secure API server
access.
Secure container access to node resources.
Upgrade an AKS cluster to the latest Kubernetes version.
Keep nodes up to date and automatically apply security patches.
You can also read the best practices for container image management and for pod security.
The Kubernetes API server provides a single connection point for requests to perform actions within a cluster. To
secure and audit access to the API server, limit access and provide the lowest possible permission levels. While
this approach isn't unique to Kubernetes, it's especially important when you've logically isolated your AKS
cluster for multi-tenant use.
Azure AD provides an enterprise-ready identity management solution that integrates with AKS clusters. Since
Kubernetes doesn't provide an identity management solution, you may be hard-pressed to granularly restrict
access to the API server. With Azure AD-integrated clusters in AKS, you use your existing user and group
accounts to authenticate users to the API server.
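For example, a hedged sketch of creating an Azure AD-integrated cluster and granting cluster admin access to an Azure AD group (the resource names and group object ID are placeholders):
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-aad \
  --aad-admin-group-object-ids <azure-ad-group-object-id>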
Using Kubernetes RBAC and Azure AD-integration, you can secure the API server and provide the minimum
permissions required to a scoped resource set, like a single namespace. You can grant different Azure AD users
or groups different Kubernetes roles. With granular permissions, you can restrict access to the API server and
provide a clear audit trail of actions performed.
The recommended best practice is to use groups to provide access to files and folders instead of individual
identities. For example, use an Azure AD group membership to bind users to Kubernetes roles rather than
individual users. As a user's group membership changes, their access permissions on the AKS cluster change
accordingly.
Meanwhile, let's say you bind the individual user directly to a role and their job function changes. While the
Azure AD group memberships update, their permissions on the AKS cluster would not. In this scenario, the user
ends up with more permissions than they require.
For more information about Azure AD integration, Kubernetes RBAC, and Azure RBAC, see Best practices for
authentication and authorization in AKS.
For example, the following network policy blocks pod egress traffic to the Azure Instance Metadata Service endpoint
(169.254.169.254) while allowing other egress within the example CIDR:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-instance-metadata
spec:
  podSelector:
    matchLabels: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 10.10.0.0/0 # example; replace with your network's address space
        except:
        - 169.254.169.254/32
NOTE
Alternatively, you can use Pod Identity, though this is in public preview. It has a pod (NMI) that runs as a DaemonSet on
each node in the AKS cluster. NMI intercepts security token requests to the Azure Instance Metadata Service on each
node, redirects them to itself, validates whether the pod has access to the identity it's requesting a token for, and fetches
the token from the Azure AD tenant on behalf of the application.
In the same way that you should grant users or groups the minimum privileges required, you should also limit
containers to only necessary actions and processes. To minimize the risk of attack, avoid configuring
applications and containers that require escalated privileges or root access.
For example, set allowPrivilegeEscalation: false in the pod manifest. These built-in Kubernetes pod security
contexts let you define additional permissions, such as the user or group to run as, or the Linux capabilities to
expose. For more best practices, see Secure pod access to resources.
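For instance, a minimal sketch (the container name and image are placeholders) of a pod that runs as a non-root user, blocks privilege escalation, and drops a Linux capability:
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  containers:
  - name: app
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command: [ "sh", "-c", "sleep 1h" ]
    securityContext:
      runAsUser: 1000
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["NET_RAW"]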
For even more granular control of container actions, you can also use built-in Linux security features such as
AppArmor and seccomp.
1. Define Linux security features at the node level.
2. Implement features through a pod manifest.
Built-in Linux security features are only available on Linux nodes and pods.
NOTE
Currently, Kubernetes environments aren't completely safe for hostile multi-tenant usage. Additional security features, like
AppArmor, seccomp, Pod Security Policies, or Kubernetes RBAC for nodes, efficiently block exploits.
For true security when running hostile multi-tenant workloads, only trust a hypervisor. The security domain for
Kubernetes becomes the entire cluster, not an individual node.
For these types of hostile multi-tenant workloads, you should use physically isolated clusters.
AppArmor
To limit container actions, you can use the AppArmor Linux kernel security module. AppArmor is available as
part of the underlying AKS node OS, and is enabled by default. You create AppArmor profiles that restrict read,
write, or execute actions, or system functions like mounting filesystems. Default AppArmor profiles restrict
access to various /proc and /sys locations, and provide a means to logically isolate containers from the
underlying node. AppArmor works for any application that runs on Linux, not just Kubernetes pods.
To see AppArmor in action, the following example creates a profile that prevents writing to files.
1. SSH to an AKS node.
2. Create a file named deny-write.profile.
3. Paste the following content:
#include <tunables/global>
profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>

  file,

  # Deny all file writes.
  deny /** w,
}
4. Parse the profile using the apparmor_parser command:
sudo apparmor_parser deny-write.profile
If the profile is correctly parsed and applied to AppArmor, you won't see any output and you'll be
returned to the command prompt.
5. From your local machine, create a pod manifest named aks-apparmor.yaml. This manifest:
Defines an annotation for container.apparmor.security.beta.kubernetes.io.
References the deny-write profile created in the previous steps.
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
  - name: hello
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]
6. With the pod deployed, verify that the hello-apparmor pod shows as blocked:
$ kubectl get pods
Secure computing (seccomp)
While AppArmor works for any Linux application, secure computing (seccomp) works at the process level and lets
you limit the system calls a process can make. To see seccomp in action, create a filter that prevents changing
permissions on a file:
1. SSH to an AKS node.
2. Create a seccomp filter named prevent-chmod (for example, under /var/lib/kubelet/seccomp/ so a pod can
reference it as localhost/prevent-chmod).
3. Paste the following content:
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "name": "chmod",
      "action": "SCMP_ACT_ERRNO"
    },
    {
      "name": "fchmodat",
      "action": "SCMP_ACT_ERRNO"
    },
    {
      "name": "chmodat",
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
For Kubernetes version 1.19 and later, the system calls are grouped in a names array instead:
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["chmod","fchmodat","chmodat"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
4. From your local machine, create a pod manifest named aks-seccomp.yaml and paste the following
content. This manifest:
Defines an annotation for seccomp.security.alpha.kubernetes.io.
References the prevent-chmod filter created in the previous step.
apiVersion: v1
kind: Pod
metadata:
  name: chmod-prevented
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: localhost/prevent-chmod
spec:
  containers:
  - name: chmod
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command:
    - "chmod"
    args:
    - "777"
    - /etc/hostname
  restartPolicy: Never
For Kubernetes version 1.19 and later, specify the seccomp profile in the pod's securityContext instead of the annotation:
apiVersion: v1
kind: Pod
metadata:
  name: chmod-prevented
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: prevent-chmod
  containers:
  - name: chmod
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command:
    - "chmod"
    args:
    - "777"
    - /etc/hostname
  restartPolicy: Never
For more information about available filters, see Seccomp security profiles for Docker.
Kubernetes releases new features at a quicker pace than more traditional infrastructure platforms. Kubernetes
updates include:
New features
Bug or security fixes
New features typically move through alpha and beta status before they become stable. Once stable, features are
generally available and recommended for production use. The Kubernetes feature release cycle lets you
update Kubernetes without regularly encountering breaking changes or adjusting your deployments and
templates.
AKS supports three minor versions of Kubernetes. Once a new minor version is introduced, the oldest supported
minor version and its patch releases are retired. Minor Kubernetes updates happen on a periodic basis.
To stay within support, ensure you have a governance process to check for necessary upgrades. For more
information, see Supported Kubernetes versions AKS.
To check the versions that are available for your cluster, use the az aks get-upgrades command as shown in the
following example:
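A hedged sketch of that command with placeholder resource group and cluster names:
az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --output table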
You can then upgrade your AKS cluster using the az aks upgrade command. The upgrade process safely:
Cordons and drains one node at a time.
Schedules pods on remaining nodes.
Deploys a new node running the latest OS and Kubernetes versions.
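For example, a sketch of upgrading to a specific version returned by the previous command (the names and version are placeholders):
az aks upgrade \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --kubernetes-version <KUBERNETES_VERSION>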
IMPORTANT
Test new minor versions in a dev test environment and validate that your workload remains healthy with the new
Kubernetes version.
Kubernetes may deprecate APIs (like in version 1.16) that your workloads rely on. When bringing new versions into
production, consider using multiple node pools on separate versions and upgrade individual pools one at a time to
progressively roll the update across a cluster. If running multiple clusters, upgrade one cluster at a time to progressively
monitor for impact or changes.
For more information about upgrades in AKS, see Supported Kubernetes versions in AKS and Upgrade an AKS
cluster.
Container and container image security is a major priority while you develop and run applications in Azure
Kubernetes Service (AKS). Containers with outdated base images or unpatched application runtimes introduce a
security risk and possible attack vector.
Minimize risks by integrating and running scan and remediation tools in your containers at build and runtime.
The earlier you catch the vulnerability or outdated base image, the more secure your cluster.
In this article, "containers" means both:
The container images stored in a container registry.
The running containers.
This article focuses on how to secure your containers in AKS. You learn how to:
Scan for and remediate image vulnerabilities.
Automatically trigger and redeploy container images when a base image is updated.
You can also read the best practices for cluster security and for pod security.
You can also use Container security in Defender for Cloud to help scan your containers for vulnerabilities. Azure
Container Registry integration with Defender for Cloud helps protect your images and registry from
vulnerabilities.
When adopting container-based workloads, you'll want to verify the security of images and runtime used to
build your own applications. How do you avoid introducing security vulnerabilities into your deployments?
Include in your deployment workflow a process to scan container images using tools such as Twistlock or
Aqua.
Only allow verified images to be deployed.
For example, you can use a continuous integration and continuous deployment (CI/CD) pipeline to automate the
image scans, verification, and deployments. Azure Container Registry includes these vulnerability scanning
capabilities.
Each time a base image is updated, you should also update any downstream container images. Integrate this
build process into validation and deployment pipelines such as Azure Pipelines or Jenkins. These pipelines make
sure that your applications continue to run on the updated base images. Once your application container
images are validated, the AKS deployments can then be updated to run the latest, secure images.
Azure Container Registry Tasks can also automatically update container images when the base image is updated.
With this feature, you build a few base images and keep them updated with bug and security fixes.
For more information about base image updates, see Automate image builds on base image update with Azure
Container Registry Tasks.
Next steps
This article focused on how to secure your containers. To implement some of these areas, see the following
articles:
Automate image builds on base image update with Azure Container Registry Tasks
Best practices for cluster isolation in Azure
Kubernetes Service (AKS)
6/15/2022 • 3 minutes to read • Edit Online
As you manage clusters in Azure Kubernetes Service (AKS), you often need to isolate teams and workloads. AKS
provides flexibility in how you can run multi-tenant clusters and isolate resources. To maximize your investment
in Kubernetes, first understand and implement AKS multi-tenancy and isolation features.
This best practices article focuses on isolation for cluster operators. In this article, you learn how to:
Plan for multi-tenant clusters and separation of resources
Use logical or physical isolation in your AKS clusters
With logical isolation, a single AKS cluster can be used for multiple workloads, teams, or environments.
Kubernetes Namespaces form the logical isolation boundary for workloads and resources.
Logical separation of clusters usually provides a higher pod density than physically isolated clusters, with less
excess compute capacity sitting idle in the cluster. When combined with the Kubernetes cluster autoscaler, you
can scale the number of nodes up or down to meet demands. This best practice approach to autoscaling
minimizes costs by running only the number of nodes required.
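As a sketch, a cluster with a single node pool might enable the cluster autoscaler like this (names and node counts are placeholders; clusters with multiple node pools use az aks nodepool update instead):
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5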
Currently, Kubernetes environments aren't completely safe for hostile multi-tenant usage. In a multi-tenant
environment, multiple tenants are working on a common, shared infrastructure. If all tenants cannot be trusted,
you will need extra planning to prevent tenants from impacting the security and service of others.
Additional security features, like Kubernetes RBAC for nodes, efficiently block exploits. For true security when
running hostile multi-tenant workloads, you should only trust a hypervisor. The security domain for Kubernetes
becomes the entire cluster, not an individual node.
For these types of hostile multi-tenant workloads, you should use physically isolated clusters.
Physically separating AKS clusters is a common approach to cluster isolation. In this isolation model, teams or
workloads are assigned their own AKS cluster. While physical isolation might look like the easiest way to isolate
workloads or teams, it adds management and financial overhead. Now, you must maintain these multiple
clusters and individually provide access and assign permissions. You'll also be billed for each individual
node.
Physically separate clusters usually have a low pod density. Since each team or workload has their own AKS
cluster, the cluster is often over-provisioned with compute resources. Often, a small number of pods are
scheduled on those nodes. Unclaimed node capacity can't be used for applications or services in development
by other teams. These excess resources contribute to the additional costs in physically separate clusters.
Next steps
This article focused on cluster isolation. For more information about cluster operations in AKS, see the following
best practices:
Basic Kubernetes scheduler features
Advanced Kubernetes scheduler features
Authentication and authorization
Best practices for basic scheduler features in Azure
Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
As you manage clusters in Azure Kubernetes Service (AKS), you often need to isolate teams and workloads. The
Kubernetes scheduler lets you control the distribution of compute resources, or limit the impact of maintenance
events.
This best practices article focuses on basic Kubernetes scheduling features for cluster operators. In this article,
you learn how to:
Use resource quotas to provide a fixed amount of resources to teams or workloads
Limit the impact of scheduled maintenance using pod disruption budgets
Resource requests and limits are placed in the pod specification. The Kubernetes scheduler uses resource requests at
deployment time to find an available node in the cluster, while limits cap what a container can consume at runtime.
Limits and requests work at the individual pod level.
For more information about how to define these values, see Define pod resource requests and limits
To provide a way to reserve and limit resources across a development team or project, you should use resource
quotas. These quotas are defined on a namespace, and can be used to set quotas on the following basis:
Compute resources, such as CPU and memory, or GPUs.
Storage resources, including the total number of volumes or amount of disk space for a given storage
class.
Object count, such as the maximum number of secrets, services, or jobs that can be created.
Kubernetes doesn't overcommit resources. Once your cumulative resource request total passes the assigned
quota, all further deployments will be unsuccessful.
When you define resource quotas, all pods created in the namespace must provide limits or requests in their
pod specifications. If they don't provide these values, you can reject the deployment. Instead, you can configure
default requests and limits for a namespace.
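One way to set those namespace defaults is with a LimitRange; the following is a minimal sketch (the name, namespace, and values are illustrative):
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resource-limits
  namespace: dev-apps
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    default:
      cpu: 500m
      memory: 512Mi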
The following example YAML manifest named dev-app-team-quotas.yaml sets a hard limit of a total of 10 CPUs,
20Gi of memory, and 10 pods:
apiVersion: v1
kind: ResourceQuota
metadata:
name: dev-app-team
spec:
hard:
cpu: "10"
memory: 20Gi
pods: "10"
This resource quota can be applied by specifying the namespace, such as dev-apps:
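A sketch of that command, assuming the manifest was saved as dev-app-team-quotas.yaml and the namespace already exists:
kubectl apply -f dev-app-team-quotas.yaml --namespace dev-apps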
Work with your application developers and owners to understand their needs and apply the appropriate
resource quotas.
For more information about available resource objects, scopes, and priorities, see Resource quotas in
Kubernetes.
The following pod disruption budget YAML manifest requires that a minimum of three pods matching the
app: nginx-frontend label remain available during disruptions:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 3
  selector:
    matchLabels:
      app: nginx-frontend
You can also define a percentage, such as 60%, which allows you to automatically compensate for the replica set
scaling up the number of pods.
You can define a maximum number of unavailable instances in a replica set. Again, a percentage for the
maximum unavailable pods can also be defined. The following pod disruption budget YAML manifest defines
that no more than two pods in the replica set be unavailable:
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  maxUnavailable: 2
  selector:
    matchLabels:
      app: nginx-frontend
Once your pod disruption budget is defined, you create it in your AKS cluster as with any other Kubernetes
object:
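For example, assuming the manifest above was saved as nginx-pdb.yaml:
kubectl apply -f nginx-pdb.yaml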
Work with your application developers and owners to understand their needs and apply the appropriate pod
disruption budgets.
For more information about using pod disruption budgets, see Specify a disruption budget for your application.
Next steps
This article focused on basic Kubernetes scheduler features. For more information about cluster operations in
AKS, see the following best practices:
Multi-tenancy and cluster isolation
Advanced Kubernetes scheduler features
Authentication and authorization
Best practices for advanced scheduler features in
Azure Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
As you manage clusters in Azure Kubernetes Service (AKS), you often need to isolate teams and workloads.
Advanced features provided by the Kubernetes scheduler let you control:
Which pods can be scheduled on certain nodes.
How multi-pod applications can be appropriately distributed across the cluster.
This best practices article focuses on advanced Kubernetes scheduling features for cluster operators. In this
article, you learn how to:
Use taints and tolerations to limit what pods can be scheduled on nodes.
Give preference to pods to run on certain nodes with node selectors or node affinity.
Split apart or group together pods with inter-pod affinity or anti-affinity.
When you create your AKS cluster, you can deploy nodes with GPU support or a large number of powerful
CPUs. You can use these nodes for large data processing workloads such as machine learning (ML) or artificial
intelligence (AI).
Since this node resource hardware is typically expensive to deploy, limit the workloads that can be scheduled on
these nodes. Instead, you'd dedicate some nodes in the cluster to run ingress services and prevent other
workloads.
This support for different nodes is provided by using multiple node pools. An AKS cluster provides one or more
node pools.
The Kubernetes scheduler uses taints and tolerations to restrict what workloads can run on nodes.
Apply a taint to a node to indicate only specific pods can be scheduled on them.
Then apply a toleration to a pod, allowing them to tolerate a node's taint.
When you deploy a pod to an AKS cluster, Kubernetes only schedules pods on nodes whose taint aligns with the
toleration. For example, assume you added a node pool in your AKS cluster for nodes with GPU support. You
define the taint with a key and value, such as sku=gpu, and then an effect for scheduling. Setting the effect to
NoSchedule restricts the Kubernetes scheduler from placing pods without a matching toleration on the node.
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name taintnp \
--node-taints sku=gpu:NoSchedule \
--no-wait
With a taint applied to nodes in the node pool, you'll define a toleration in the pod specification that allows
scheduling on the nodes. The following example defines the sku: gpu and effect: NoSchedule to tolerate the
taint applied to the node pool in the previous step:
kind: Pod
apiVersion: v1
metadata:
  name: tf-mnist
spec:
  containers:
  - name: tf-mnist
    image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  tolerations:
  - key: "sku"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
When this pod is deployed using kubectl apply -f gpu-toleration.yaml , Kubernetes can successfully schedule
the pod on the nodes with the taint applied. This logical isolation lets you control access to resources within a
cluster.
When you apply taints, work with your application developers and owners to allow them to define the required
tolerations in their deployments.
For more information about how to use multiple node pools in AKS, see Create and manage multiple node
pools for a cluster in AKS.
Behavior of taints and tolerations in AKS
When you upgrade a node pool in AKS, taints and tolerations follow a set pattern as they're applied to new
nodes:
Default clusters that use VM scale sets
You can taint a node pool from the AKS API so that newly scaled-out nodes receive the API-specified node taints.
Let's assume:
1. You begin with a two-node cluster: node1 and node2.
2. You upgrade the node pool.
3. Two additional nodes are created: node3 and node4.
4. The taints are passed on respectively.
5. The original node1 and node2 are deleted.
Clusters without VM scale set support
Again, let's assume:
1. You have a two-node cluster: node1 and node2.
2. You upgrade the node pool.
3. An additional node is created: node3.
4. The taints from node1 are applied to node3.
5. node1 is deleted.
6. A new node1 is created to replace the original node1.
7. The node2 taints are applied to the new node1.
8. node2 is deleted.
In essence node1 becomes node3, and node2 becomes the new node1.
When you scale a node pool in AKS, taints and tolerations do not carry over by design.
Taints and tolerations logically isolate resources with a hard cut-off. If the pod doesn't tolerate a node's taint, it
isn't scheduled on the node.
Alternatively, you can use node selectors. For example, you label nodes to indicate locally attached SSD storage
or a large amount of memory, and then define in the pod specification a node selector. Kubernetes schedules
those pods on a matching node.
Unlike tolerations, pods without a matching node selector can still be scheduled on labeled nodes. This behavior
allows unused resources on the nodes to be consumed, but prioritizes pods that define the matching node selector.
Let's look at an example of nodes with a high amount of memory. These nodes prioritize pods that request a
high amount of memory. To ensure the resources don't sit idle, they also allow other pods to run. The following
example command adds a node pool with the label hardware=highmem to the myAKSCluster in the
myResourceGroup. All nodes in that node pool will have this label.
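A sketch of that command (the node pool name and node count are illustrative):
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name labelnp \
  --node-count 1 \
  --labels hardware=highmem \
  --no-wait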
A pod specification then adds the nodeSelector property to define a node selector that matches the label set on
a node:
kind: Pod
apiVersion: v1
metadata:
  name: tf-mnist
spec:
  containers:
  - name: tf-mnist
    image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  nodeSelector:
    hardware: highmem
When you use these scheduler options, work with your application developers and owners to allow them to
correctly define their pod specifications.
For more information about using node selectors, see Assigning Pods to Nodes.
Node affinity
A node selector is a basic solution for assigning pods to a given node. Node affinity provides more flexibility,
allowing you to define what happens if the pod can't be matched with a node. You can:
Require that Kubernetes scheduler matches a pod with a labeled host. Or,
Prefer a match but allow the pod to be scheduled on a different host if no match is available.
The following example sets the node affinity to requiredDuringSchedulingIgnoredDuringExecution. This affinity
requires the Kubernetes scheduler to use a node with a matching label. If no node is available, the pod has to wait
for scheduling to continue. To allow the pod to be scheduled on a different node, you can instead set the value to
preferredDuringSchedulingIgnoredDuringExecution:
kind: Pod
apiVersion: v1
metadata:
  name: tf-mnist
spec:
  containers:
  - name: tf-mnist
    image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
    resources:
      requests:
        cpu: 0.5
        memory: 2Gi
      limits:
        cpu: 4.0
        memory: 16Gi
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - highmem
The IgnoredDuringExecution part of the setting indicates that the pod shouldn't be evicted from the node if the
node labels change. The Kubernetes scheduler only uses the updated node labels for new pods being scheduled,
not pods already scheduled on the nodes.
For more information, see Affinity and anti-affinity.
Inter-pod affinity and anti-affinity
One final approach for the Kubernetes scheduler to logically isolate workloads is using inter-pod affinity or anti-
affinity. These settings define that pods either shouldn't or should be scheduled on a node that has an existing
matching pod. By default, the Kubernetes scheduler tries to schedule multiple pods in a replica set across nodes.
You can define more specific rules around this behavior.
For example, you have a web application that also uses an Azure Cache for Redis.
1. You use pod anti-affinity rules to request that the Kubernetes scheduler distributes replicas across nodes.
2. You use affinity rules to ensure each web app component is scheduled on the same host as a corresponding
cache, as shown in the sketch below.
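A hedged sketch of the web deployment described above (the label names and image are illustrative): it spreads its own replicas across nodes with pod anti-affinity and co-locates each replica with a pod labeled app: cache using pod affinity:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: webapp
            topologyKey: kubernetes.io/hostname
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: cache
            topologyKey: kubernetes.io/hostname
      containers:
      - name: webapp
        image: mcr.microsoft.com/dotnet/runtime-deps:6.0
        command: [ "sh", "-c", "sleep 1h" ]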
The distribution of pods across nodes looks like the following example:
NODE 1                NODE 2                NODE 3
webapp-1              webapp-2              webapp-3
cache-1               cache-2               cache-3
Inter-pod affinity and anti-affinity provide a more complex deployment than node selectors or node affinity.
With the deployment, you logically isolate resources and control how Kubernetes schedules pods on nodes.
For a complete example of this web application with Azure Cache for Redis example, see Co-locate pods on the
same node.
Next steps
This article focused on advanced Kubernetes scheduler features. For more information about cluster operations
in AKS, see the following best practices:
Multi-tenancy and cluster isolation
Basic Kubernetes scheduler features
Authentication and authorization
Best practices for network connectivity and security
in Azure Kubernetes Service (AKS)
6/15/2022 • 10 minutes to read • Edit Online
As you create and manage clusters in Azure Kubernetes Service (AKS), you provide network connectivity for
your nodes and applications. These network resources include IP address ranges, load balancers, and ingress
controllers. To maintain a high quality of service for your applications, you need to strategize and configure
these resources.
This best practices article focuses on network connectivity and security for cluster operators. In this article, you
learn how to:
Compare the kubenet and Azure Container Networking Interface (CNI) network modes in AKS.
Plan for required IP addressing and connectivity.
Distribute traffic using load balancers, ingress controllers, or a web application firewall (WAF).
Securely connect to cluster nodes.
Virtual networks provide the basic connectivity for AKS nodes and customers to access your applications. There
are two different ways to deploy AKS clusters into virtual networks:
Azure CNI networking
Deploys into a virtual network and uses the Azure CNI Kubernetes plugin. Pods receive individual IPs that
can route to other network services or on-premises resources.
Kubenet networking
Azure manages the virtual network resources as the cluster is deployed and uses the kubenet Kubernetes
plugin.
For production deployments, both kubenet and Azure CNI are valid options.
CNI Networking
The Container Networking Interface (CNI) is a vendor-neutral protocol that lets the container runtime make requests to
a network provider. Azure CNI assigns IP addresses to pods and nodes, and provides IP address management (IPAM)
features as you connect to existing Azure virtual networks. Each node and pod resource receives an IP address in the
Azure virtual network - no need for extra routing to communicate with other resources or services.
Notably, Azure CNI networking for production allows for separation of control and management of resources.
From a security perspective, you often want different teams to manage and secure those resources. With Azure
CNI networking, you connect to existing Azure resources, on-premises resources, or other services directly via IP
addresses assigned to each pod.
When you use Azure CNI networking, the virtual network resource is in a separate resource group to the AKS
cluster. Delegate permissions for the AKS cluster identity to access and manage these resources. The cluster
identity used by the AKS cluster must have at least Network Contributor permissions on the subnet within your
virtual network.
If you wish to define a custom role instead of using the built-in Network Contributor role, the following
permissions are required:
Microsoft.Network/virtualNetworks/subnets/join/action
Microsoft.Network/virtualNetworks/subnets/read
By default, AKS uses a managed identity for its cluster identity. However, you are able to use a service principal
instead. For more information about:
AKS service principal delegation, see Delegate access to other Azure resources.
Managed identities, see Use managed identities.
As each node and pod receives its own IP address, plan out the address ranges for the AKS subnets. Keep in
mind:
The subnet must be large enough to provide IP addresses for every node, pods, and network resource that
you deploy.
With both kubenet and Azure CNI networking, each node has a default limit on the number of pods it
can run.
Each AKS cluster must be placed in its own subnet.
Avoid using IP address ranges that overlap with existing network resources; non-overlapping ranges are
necessary to allow connectivity to on-premises or peered networks in Azure.
To handle scale out events or cluster upgrades, you need extra IP addresses available in the assigned subnet.
This extra address space is especially important if you use Windows Server containers, as those node
pools require an upgrade to apply the latest security patches. For more information on Windows
Server nodes, see Upgrade a node pool in AKS.
To calculate the IP address required, see Configure Azure CNI networking in AKS.
When creating a cluster with Azure CNI networking, you specify other address ranges for the cluster, such as the
Docker bridge address, DNS service IP, and service address range. In general, make sure these address ranges:
Don't overlap each other.
Don't overlap with any networks associated with the cluster, including any virtual networks, subnets, on-
premises and peered networks.
For the specific details around limits and sizing for these address ranges, see Configure Azure CNI networking in
AKS.
Kubenet networking
Although kubenet doesn't require you to set up the virtual networks before the cluster is deployed, there are
disadvantages to waiting:
Since nodes and pods are placed on different IP subnets, User Defined Routing (UDR) and IP forwarding
routes traffic between pods and nodes. This extra routing may reduce network performance.
Connections to existing on-premises networks or peering to other Azure virtual networks can be complex.
Since you don't create the virtual network and subnets separately from the AKS cluster, Kubenet is ideal for:
Small development or test workloads.
Simple websites with low traffic.
Lifting and shifting workloads into containers.
For most production deployments, you should plan for and use Azure CNI networking.
You can also configure your own IP address ranges and virtual networks using kubenet. Like Azure CNI
networking, these address ranges shouldn't overlap each other and shouldn't overlap with any networks
associated with the cluster (virtual networks, subnets, on-premises and peered networks).
For the specific details around limits and sizing for these address ranges, see Use kubenet networking with your
own IP address ranges in AKS.
While an Azure load balancer can distribute customer traffic to applications in your AKS cluster, it's limited in
understanding that traffic. A load balancer resource works at layer 4, and distributes traffic based on protocol or
ports.
Most web applications using HTTP or HTTPS should use Kubernetes ingress resources and controllers, which
work at layer 7. Ingress can distribute traffic based on the URL of the application and handle TLS/SSL
termination. Ingress also reduces the number of IP addresses you expose and map.
With a load balancer, each application typically needs a public IP address assigned and mapped to the service in
the AKS cluster. With an ingress resource, a single IP address can distribute traffic to multiple applications.
apiVersion: networking.k8s.io/v1beta1 # assumed; the serviceName/servicePort fields correspond to the v1beta1 Ingress schema
kind: Ingress
metadata:
  name: myapp-ingress
  annotations:
    kubernetes.io/ingress.class: "PublicIngress"
spec:
  tls:
  - hosts:
    - myapp.com
    secretName: myapp-secret
  rules:
  - host: myapp.com
    http:
      paths:
      - path: /blog
        backend:
          serviceName: blogservice
          servicePort: 80
      - path: /store
        backend:
          serviceName: storeservice
          servicePort: 80
Ingress controller
An ingress controller is a daemon that runs on an AKS node and watches for incoming requests. Traffic is then
distributed based on the rules defined in the ingress resource. While the most common ingress controller is
based on NGINX, AKS doesn't restrict you to a specific controller. You can use Contour, HAProxy, Traefik, etc.
Ingress controllers must be scheduled on a Linux node. Indicate that the resource should run on a Linux-based
node using a node selector in your YAML manifest or Helm chart deployment. For more information, see Use
node selectors to control where pods are scheduled in AKS.
NOTE
Windows Server nodes shouldn't run the ingress controller.
There are many scenarios for ingress, including the following how-to guides:
Create a basic ingress controller with external network connectivity
Create an ingress controller that uses an internal, private network and IP address
Create an ingress controller that uses your own TLS certificates
Create an ingress controller that uses Let's Encrypt to automatically generate TLS certificates with a dynamic
public IP address or with a static public IP address
For that extra layer of security, a web application firewall (WAF) filters the incoming traffic. The Open Web
Application Security Project (OWASP) provides a set of rules to watch for attacks like cross-site scripting or cookie
poisoning. Azure Application Gateway (currently in preview in AKS) is a WAF that integrates with AKS clusters,
locking in these security features before the traffic reaches your AKS cluster and applications.
Since other third-party solutions also perform these functions, you can continue to use existing investments or
expertise in your preferred product.
Load balancer or ingress resources continually run in your AKS cluster and refine the traffic distribution. App
Gateway can be centrally managed as an ingress controller with a resource definition. To get started, create an
Application Gateway Ingress controller.
Network policy is a Kubernetes feature available in AKS that lets you control the traffic flow between pods. You
allow or deny traffic to the pod based on settings such as assigned labels, namespace, or traffic port. Network
policies are a cloud-native way to control the flow of traffic for pods. As pods are dynamically created in an AKS
cluster, required network policies can be automatically applied.
To use network policy, enable the feature when you create a new AKS cluster. You can't enable network policy on
an existing AKS cluster. Plan ahead to enable network policy on the necessary clusters.
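For example, a sketch of enabling the Azure network policy engine at cluster creation (Calico is another supported option; the names are placeholders):
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --network-plugin azure \
  --network-policy azure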
NOTE
Network policy should only be used for Linux-based nodes and pods in AKS.
You create a network policy as a Kubernetes resource using a YAML manifest. Policies are applied to defined
pods, with ingress or egress rules defining traffic flow.
The following example applies a network policy to pods with the app: backend label applied to them. The ingress
rule only allows traffic from pods with the app: frontend label:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: backend-policy
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
To get started with policies, see Secure traffic between pods using network policies in Azure Kubernetes Service
(AKS).
You can complete most operations in AKS using the Azure management tools or through the Kubernetes API
server. AKS nodes are only available on a private network and aren't connected to the public internet. To connect
to nodes and provide maintenance and support, route your connections through a bastion host, or jump box.
Verify this host lives in a separate management virtual network that is securely peered to the AKS cluster virtual
network.
The management network for the bastion host should be secured, too. Use an Azure ExpressRoute or VPN
gateway to connect to an on-premises network, and control access using network security groups.
Next steps
This article focused on network connectivity and security. For more information about network basics in
Kubernetes, see Network concepts for applications in Azure Kubernetes Service (AKS)
Best practices for storage and backups in Azure
Kubernetes Service (AKS)
6/15/2022 • 5 minutes to read • Edit Online
As you create and manage clusters in Azure Kubernetes Service (AKS), your applications often need storage.
Make sure you understand pod performance needs and access methods so that you can select the best storage
for your application. The AKS node size may impact your storage choices. Plan for ways to back up and test the
restore process for attached storage.
This best practices article focuses on storage considerations for cluster operators. In this article, you learn:
What types of storage are available.
How to correctly size AKS nodes for storage performance.
Differences between dynamic and static provisioning of volumes.
Ways to back up and secure your data volumes.
Applications often require different types and speeds of storage. Determine the most appropriate storage type
by asking the following questions.
Do your applications need storage that connects to individual pods?
Do your applications need storage shared across multiple pods?
Is the storage for read-only access to data?
Will the storage be used to write large amounts of structured data?
The following table outlines the available storage types and their capabilities:
USE CASE    VOLUME PLUGIN    READ/WRITE ONCE    READ-ONLY MANY    READ/WRITE MANY    WINDOWS SERVER CONTAINER SUPPORT
AKS provides two primary types of secure storage for volumes backed by Azure Disks or Azure Files. Both use
the default Azure Storage Service Encryption (SSE) that encrypts data at rest. Disks cannot be encrypted using
Azure Disk Encryption at the AKS node level.
Both Azure Files and Azure Disks are available in Standard and Premium performance tiers:
Premium disks
Backed by high-performance solid-state disks (SSDs).
Recommended for all production workloads.
Standard disks
Backed by regular spinning disks (HDDs).
Good for archival or infrequently accessed data.
Understand the application performance needs and access patterns to choose the appropriate storage tier. For
more information about Managed Disks sizes and performance tiers, see Azure Managed Disks overview
Create and use storage classes to define application needs
Define the type of storage you want using Kubernetes storage classes. The storage class is then referenced in the
pod or deployment specification. Storage class definitions work together to create the appropriate storage and
connect it to pods.
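As an illustrative sketch (the class name and parameter values are assumptions), a custom storage class backed by Premium SSD managed disks that retains the disk after the pod is deleted might look like:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: managed-premium-retain
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
reclaimPolicy: Retain
allowVolumeExpansion: true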
For more information, see Storage classes in AKS.
AKS nodes run as various Azure VM types and sizes. Each VM size provides:
A different amount of core resources such as CPU and memory.
A maximum number of disks that can be attached.
Storage performance also varies between VM sizes for the maximum local and attached disk IOPS (input/output
operations per second).
If your applications require Azure Disks as their storage solution, strategize an appropriate node VM size.
Storage capabilities and CPU and memory amounts play a major role when deciding on a VM size.
For example, while both the Standard_B2ms and Standard_DS2_v2 VM sizes include a similar amount of CPU
and memory resources, their potential storage performance is different:
NODE TYPE AND SIZE    VCPU    MEMORY (GIB)    MAX DATA DISKS    MAX UNCACHED DISK IOPS    MAX UNCACHED THROUGHPUT (MBPS)
Standard_DS2_v2       2       7               8                 6,400                     96
In this example, the Standard_DS2_v2 offers twice as many attached disks, and three to four times the amount of
IOPS and disk throughput. If you only compared core compute resources and costs, you might have chosen the
Standard_B2ms VM size, with its poorer storage performance and limitations.
Work with your application development team to understand their storage capacity and performance needs.
Choose the appropriate VM size for the AKS nodes to meet or exceed their performance needs. Regularly
baseline applications to adjust VM size as needed.
For more information about available VM sizes, see Sizes for Linux virtual machines in Azure.
To attach storage to pods, use persistent volumes. Persistent volumes can be created manually or dynamically.
Creating persistent volumes manually adds management overhead and limits your ability to scale. Instead,
provision persistent volumes dynamically to simplify storage management and allow your applications to grow
and scale as needed.
A persistent volume claim (PVC) lets you dynamically create storage as needed. Underlying Azure disks are
created as pods request them. In the pod definition, request a volume to be created and attached to a designated
mount path.
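A minimal sketch (assuming the built-in managed-csi storage class and placeholder names) of a claim and a pod that mounts it:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-csi
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
  - name: app
    image: mcr.microsoft.com/dotnet/runtime-deps:6.0
    command: [ "sh", "-c", "sleep 1h" ]
    volumeMounts:
    - mountPath: /mnt/azure
      name: volume
  volumes:
  - name: volume
    persistentVolumeClaim:
      claimName: azure-managed-disk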
For the concepts on how to dynamically create and use volumes, see Persistent Volumes Claims.
To see these volumes in action, see how to dynamically create and use a persistent volume with Azure Disks or
Azure Files.
As part of your storage class definitions, set the appropriate reclaimPolicy. This reclaimPolicy controls the
behavior of the underlying Azure storage resource when the pod is deleted. The underlying storage resource
can either be deleted or retained for future pod use. Set the reclaimPolicy to Retain or Delete.
Understand your application needs, and implement regular checks for retained storage to minimize the amount
of unused and billed storage.
For more information about storage class options, see storage reclaim policies.
When your applications store and consume data persisted on disks or in files, you need to take regular backups
or snapshots of that data. Azure Disks can use built-in snapshot technologies. Your applications may need to
flush writes to disk before you perform the snapshot operation. Velero can back up persistent volumes along
with additional cluster resources and configurations. If you can't remove state from your applications, back up
the data from persistent volumes and regularly test the restore operations to verify data integrity and the
processes required.
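As an illustration of the disk-based approach, a snapshot of the managed disk behind a persistent volume can be taken with the Azure CLI; the resource group below is a placeholder for your cluster's node resource group, and the query simply picks a dynamically created PVC disk:

# Find the disk that backs a dynamically provisioned persistent volume.
DISK_ID=$(az disk list \
  --resource-group MC_myResourceGroup_myAKSCluster_eastus \
  --query "[?contains(name,'pvc')].id | [0]" -o tsv)

# Snapshot it so the data can be restored or copied to another region.
az snapshot create \
  --resource-group MC_myResourceGroup_myAKSCluster_eastus \
  --name pvcSnapshot \
  --source "$DISK_ID"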
Understand the limitations of the different approaches to data backups and if you need to quiesce your data
prior to snapshot. Data backups don't necessarily let you restore your application environment or cluster
deployment. For more information about those scenarios, see Best practices for business continuity and disaster
recovery in AKS.
Next steps
This article focused on storage best practices in AKS. For more information about storage basics in Kubernetes,
see Storage concepts for applications in AKS.
Best practices for business continuity and disaster
recovery in Azure Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
As you manage clusters in Azure Kubernetes Service (AKS), application uptime becomes important. By default,
AKS provides high availability by using multiple nodes in a Virtual Machine Scale Set (VMSS). But these multiple
nodes don’t protect your system from a region failure. To maximize your uptime, plan ahead to maintain
business continuity and prepare for disaster recovery.
This article focuses on how to plan for business continuity and disaster recovery in AKS. You learn how to:
Plan for AKS clusters in multiple regions.
Route traffic across multiple clusters by using Azure Traffic Manager.
Use geo-replication for your container image registries.
Plan for application state across multiple clusters.
Replicate storage across multiple regions.
An AKS cluster is deployed into a single region. To protect your system from region failure, deploy your
application into multiple AKS clusters across different regions. When planning where to deploy your AKS cluster,
consider:
AKS region availability
Choose regions close to your users.
AKS continually expands into new regions.
Azure paired regions
For your geographic area, choose two regions paired together.
AKS platform updates (planned maintenance) are serialized with a delay of at least 24 hours between
paired regions.
Recovery efforts for paired regions are prioritized where needed.
Service availability
Decide whether your paired regions should be hot/hot, hot/warm, or hot/cold.
Do you want to run both regions at the same time, with one region ready to start serving traffic? Or,
Do you want to give one region time to get ready to serve traffic?
AKS region availability and paired regions are a joint consideration. Deploy your AKS clusters into paired
regions designed to manage region disaster recovery together. For example, AKS is available in East US and
West US. These regions are paired. Choose these two regions when you're creating an AKS BC/DR strategy.
When you deploy your application, add another step to your CI/CD pipeline to deploy to these multiple AKS
clusters. Updating your deployment pipelines prevents applications from deploying into only one of your
regions and AKS clusters. In that scenario, customer traffic directed to a secondary region won't receive the
latest code updates.
Use Azure Traffic Manager to route traffic
Best practice
Azure Traffic Manager can direct you to your closest AKS cluster and application instance. For the best
performance and redundancy, direct all application traffic through Traffic Manager before it goes to your
AKS cluster.
If you have multiple AKS clusters in different regions, use Traffic Manager to control traffic flow to the
applications running in each cluster. Azure Traffic Manager is a DNS-based traffic load balancer that can
distribute network traffic across regions. Use Traffic Manager to route users based on cluster response time or
based on geography.
If you have a single AKS cluster, you typically connect to the service IP or DNS name of a given application. In a
multi-cluster deployment, you should connect to a Traffic Manager DNS name that points to the services on
each AKS cluster. Define these services by using Traffic Manager endpoints. Each endpoint is the service load
balancer IP. Use this configuration to direct network traffic from the Traffic Manager endpoint in one region to
the endpoint in a different region.
Traffic Manager performs DNS lookups and returns your most appropriate endpoint. Nested profiles can
prioritize a primary location. For example, you should connect to your closest geographic region. If that region
has a problem, Traffic Manager directs you to a secondary region. This approach ensures that you can connect to
an application instance even if your closest geographic region is unavailable.
For information on how to set up endpoints and routing, see Configure the geographic traffic routing method
by using Traffic Manager.
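A sketch of that configuration with the Azure CLI, assuming performance-based routing and the service load balancer IP of each cluster (profile name, DNS prefix, and IP are placeholders):

# Create a Traffic Manager profile that routes users to the lowest-latency endpoint.
az network traffic-manager profile create \
  --resource-group myResourceGroup \
  --name myAKSTrafficManager \
  --routing-method Performance \
  --unique-dns-name myaks-tm-demo

# Add each cluster's service load balancer IP as an external endpoint (repeat per region).
az network traffic-manager endpoint create \
  --resource-group myResourceGroup \
  --profile-name myAKSTrafficManager \
  --name eastus-cluster \
  --type externalEndpoints \
  --target 20.50.60.70 \
  --endpoint-location eastus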
Application routing with Azure Front Door Service
Using split TCP-based anycast protocol, Azure Front Door Service promptly connects your end users to the
nearest Front Door POP (Point of Presence). More features of Azure Front Door Service:
TLS termination
Custom domain
Web application firewall
URL Rewrite
Session affinity
Review the needs of your application traffic to understand which solution is the most suitable.
Interconnect regions with global virtual network peering
Connect both virtual networks to each other through virtual network peering to enable communication between
clusters. Virtual network peering interconnects virtual networks, providing high bandwidth across Microsoft's
backbone network - even across different geographic regions.
Before peering virtual networks with running AKS clusters, use the standard Load Balancer in your AKS cluster.
This prerequisite makes Kubernetes services reachable across the virtual network peering.
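A sketch of creating one side of the peering with the Azure CLI (the peering must also be created in the reverse direction, and all names and IDs below are placeholders):

az network vnet peering create \
  --resource-group myResourceGroup-eastus \
  --vnet-name myAKSVnet-eastus \
  --name peerEastUsToWestUs \
  --remote-vnet /subscriptions/<subscription-id>/resourceGroups/myResourceGroup-westus/providers/Microsoft.Network/virtualNetworks/myAKSVnet-westus \
  --allow-vnet-access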
When you use Container Registry geo-replication to pull images from the same region, the results are:
Faster : Pull images from high-speed, low-latency network connections within the same Azure region.
More reliable : If a region is unavailable, your AKS cluster pulls the images from an available container
registry.
Cheaper : No network egress charge between datacenters.
Geo-replication is a Premium SKU container registry feature. For information on how to configure geo-
replication, see Container Registry geo-replication.
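A sketch of enabling geo-replication with the Azure CLI (registry name and regions are illustrative; the registry must use the Premium SKU):

# Create a Premium registry, then replicate it into the paired region.
az acr create --resource-group myResourceGroup --name myContainerRegistry --sku Premium
az acr replication create --registry myContainerRegistry --location westus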
Service state refers to the in-memory or on-disk data required by a service to function. State includes the data
structures and member variables that the service reads and writes. Depending on how the service is architected,
the state might also include files or other resources stored on the disk. For example, the state might include the
files a database uses to store data and transaction logs.
State can be either externalized or co-located with the code that manipulates the state. Typically, you externalize
state by using a database or other data store that runs on different machines over the network or that runs out
of process on the same machine.
Containers and microservices are most resilient when the processes that run inside them don't retain state.
Since applications almost always contain some state, use a PaaS solution, such as:
Azure Cosmos DB
Azure Database for PostgreSQL
Azure Database for MySQL
Azure SQL Database
To build portable applications, see the following guidelines:
The 12-factor app methodology
Run a web application in multiple Azure regions
Your applications might use Azure Storage for their data. If so, your applications are spread across multiple AKS
clusters in different regions. You need to keep the storage synchronized. Here are two common ways to replicate
storage:
Infrastructure-based asynchronous replication
Application-based asynchronous replication
Infrastructure-based asynchronous replication
Your applications might require persistent storage even after a pod is deleted. In Kubernetes, you can use
persistent volumes to persist data storage. Persistent volumes are mounted to a node VM and then exposed to
the pods. Persistent volumes follow pods even if the pods are moved to a different node inside the same cluster.
The replication strategy you use depends on your storage solution. The following common storage solutions
provide their own guidance about disaster recovery and replication:
Gluster
Ceph
Rook
Portworx
Typically, you provide a common storage point where applications write their data. This data is then replicated
across regions and accessed locally.
If you use Azure Managed Disks, you can use Velero on Azure or Kasten to handle replication and disaster
recovery. These options are backup solutions native to, but unsupported by, Kubernetes.
Application-based asynchronous replication
Kubernetes currently provides no native implementation for application-based asynchronous replication. Since
containers and Kubernetes are loosely coupled, any traditional application or language approach should work.
Typically, the applications themselves replicate the storage requests, which are then written to each cluster's
underlying data storage.
Next steps
This article focuses on business continuity and disaster recovery considerations for AKS clusters. For more
information about cluster operations in AKS, see these articles about best practices:
Multitenancy and cluster isolation
Basic Kubernetes scheduler features
Best practices for application developers to manage
resources in Azure Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
As you develop and run applications in Azure Kubernetes Service (AKS), there are a few key areas to consider.
How you manage your application deployments can negatively impact the end-user experience of services that
you provide. To succeed, keep in mind some best practices you can follow as you develop and run applications
in AKS.
This article focuses on running your cluster and workloads from an application developer perspective. For
information about administrative best practices, see Cluster operator best practices for isolation and resource
management in Azure Kubernetes Service (AKS). In this article, you learn:
Pod resource requests and limits.
Ways to develop and deploy applications with Bridge to Kubernetes and Visual Studio Code.
Use pod requests and limits to manage the compute resources within an AKS cluster. Pod requests and limits
inform the Kubernetes scheduler which compute resources to assign to a pod.
Pod CPU/Memory requests
Pod requests define a set amount of CPU and memory that the pod needs regularly.
In your pod specifications, it's best practice and very important to define these requests and limits based on
the above information. If you don't include these values, the Kubernetes scheduler can't account for the
resources your applications require when making scheduling decisions.
Monitor the performance of your application to adjust pod requests.
If you underestimate pod requests, your application may receive degraded performance due to over-
scheduling a node.
If requests are overestimated, your application may have increased difficulty getting scheduled.
Pod CPU/Memory limits
Pod limits set the maximum amount of CPU and memory that a pod can use.
Memory limits define which pods should be killed when nodes are unstable due to insufficient resources.
Without proper limits set, pods will be killed until resource pressure is lifted.
While a pod may exceed the CPU limit periodically, the pod will not be killed for exceeding the CPU limit.
Pod limits define when a pod has lost control of resource consumption. When it exceeds the limit, the pod is
marked for killing. This behavior maintains node health and minimizes impact to pods sharing the node. Not
setting a pod limit defaults it to the highest available value on a given node.
Avoid setting a pod limit higher than your nodes can support. Each AKS node reserves a set amount of CPU and
memory for the core Kubernetes components. Your application may try to consume too many resources on the
node for other pods to successfully run.
Monitor the performance of your application at different times during the day or week. Determine peak demand
times and align the pod limits to the resources required to meet maximum needs.
IMPORTANT
In your pod specifications, define these requests and limits based on the above information. Failing to include these values
prevents the Kubernetes scheduler from accounting for the resources your applications require when making scheduling decisions.
If the scheduler places a pod on a node with insufficient resources, application performance will be degraded.
Cluster administrators can set resource quotas on a namespace that require you to set resource requests and
limits. For more information, see resource quotas on AKS clusters.
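For illustration, a namespace quota that forces workloads to declare requests and limits might look like the following sketch (namespace and values are hypothetical):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"        # total CPU requested across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi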
When you define a CPU request or limit, the value is measured in CPU units.
1.0 CPU equates to one underlying virtual CPU core on the node.
The same measurement is used for GPUs.
You can define fractions measured in millicores. For example, 100m is 0.1 of an underlying vCPU core.
In the following basic example for a single NGINX pod, the pod requests 100m of CPU time, and 128Mi of
memory. The resource limits for the pod are set to 250m CPU and 256Mi memory:
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
  - name: mypod
    image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
For more information about resource measurements and assignments, see Managing compute resources for
containers.
With Bridge to Kubernetes, you can develop, debug, and test applications directly against an AKS cluster.
Developers within a team collaborate to build and test throughout the application lifecycle. You can continue to
use existing tools such as Visual Studio or Visual Studio Code with the Bridge to Kubernetes extension.
Using an integrated development and test process with Bridge to Kubernetes reduces the need for local test
environments like minikube. Instead, you develop and test against an AKS cluster, even secured and isolated
clusters.
NOTE
Bridge to Kubernetes is intended for use with applications that run on Linux pods and nodes.
Use the Visual Studio Code (VS Code) extension for Kubernetes
Best practice guidance
Install and use the VS Code extension for Kubernetes when you write YAML manifests. You can also use the
extension as an integrated deployment solution, which may help application owners who infrequently interact
with the AKS cluster.
The Visual Studio Code extension for Kubernetes helps you develop and deploy applications to AKS. The
extension provides:
Intellisense for Kubernetes resources, Helm charts, and templates.
Browse, deploy, and edit capabilities for Kubernetes resources from within VS Code.
An intellisense check for resource requests or limits being set in the pod specifications.
Next steps
This article focused on how to run your cluster and workloads from an application developer perspective. For
information about administrative best practices, see Cluster operator best practices for isolation and resource
management in Azure Kubernetes Service (AKS).
To implement some of these best practices, see the following articles:
Develop with Bridge to Kubernetes
Best practices for pod security in Azure Kubernetes
Service (AKS)
6/15/2022 • 6 minutes to read • Edit Online
As you develop and run applications in Azure Kubernetes Service (AKS), the security of your pods is a key
consideration. Your applications should be designed around the principle of least privilege.
Keeping private data secure is top of mind for customers. You don't want credentials like database connection
strings, keys, or secrets and certificates exposed to the outside world where an attacker could take advantage of
those secrets for malicious purposes. Don't add them to your code or embed them in your container images.
This approach would create a risk of exposure and limit your ability to rotate those credentials, since the
container images would need to be rebuilt.
This best practices article focuses on how to secure pods in AKS. You learn how to:
Use pod security context to limit access to processes and services or privilege escalation
Authenticate with other Azure resources using pod managed identities
Request and retrieve credentials from a digital vault such as Azure Key Vault
You can also read the best practices for cluster security and for container image management.
Work with your cluster operator to determine what security context settings you need. Try to design your
applications to minimize additional permissions and access the pod requires. There are additional security
features to limit access using AppArmor and seccomp (secure computing) that can be implemented by cluster
operators. For more information, see Secure container access to resources.
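As a minimal sketch, a pod security context that drops root and blocks privilege escalation could look like this (the pod name, image, and UID are illustrative):

kind: Pod
apiVersion: v1
metadata:
  name: security-context-demo
spec:
  containers:
    - name: security-context-demo
      image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
      command: ["sleep", "3600"]        # harmless command for the demo
      securityContext:
        runAsUser: 1000                 # run as a non-root user
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]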
IMPORTANT
Associated AKS open source projects are not supported by Azure technical support. They are provided for users to self-
install into clusters and gather feedback from our community.
The following associated AKS open source projects let you automatically authenticate pods or request
credentials and keys from a digital vault. These projects are maintained by the Azure Container Compute
Upstream team and are part of a broader list of projects available for use.
Azure Active Directory Pod Identity
Azure Key Vault Provider for Secrets Store CSI Driver
Use pod managed identities
A managed identity for Azure resources lets a pod authenticate itself against Azure services that support it, such
as Storage or SQL. The pod is assigned an Azure Identity that lets it authenticate to Azure Active Directory
and receive a digital token. This digital token can be presented to other Azure services that check if the pod is
authorized to access the service and perform the required actions. This approach means that no secrets are
required for database connection strings, for example. The simplified workflow for pod managed identity is
shown in the following diagram:
With a managed identity, your application code doesn't need to include credentials to access a service, such as
Azure Storage. Each pod authenticates with its own identity, so you can audit and review access. If your
application connects with other Azure services, use managed identities to limit credential reuse and risk of
exposure.
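A sketch of wiring this up with the Azure CLI, assuming the pod-identity commands in the aks-preview extension that were current when this article was written (identity, namespace, and binding names are placeholders):

# Create a user-assigned identity and bind it to pods in a namespace.
az identity create --resource-group myResourceGroup --name myPodIdentity

IDENTITY_ID=$(az identity show --resource-group myResourceGroup \
  --name myPodIdentity --query id -o tsv)

az aks pod-identity add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --namespace my-app \
  --name my-pod-identity \
  --identity-resource-id "$IDENTITY_ID"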
For more information about pod identities, see Configure an AKS cluster to use pod managed identities and use
them with your applications.
Use Azure Key Vault with Secrets Store CSI Driver
Using the pod identity project enables authentication against supporting Azure services. For your own services
or applications without managed identities for Azure resources, you can still authenticate using credentials or
keys. A digital vault can be used to store these secret contents.
When applications need a credential, they communicate with the digital vault, retrieve the latest secret contents,
and then connect to the required service. Azure Key Vault can be this digital vault. The simplified workflow for
retrieving a credential from Azure Key Vault using pod managed identities is shown in the following diagram:
With Key Vault, you store and regularly rotate secrets such as credentials, storage account keys, or certificates.
You can integrate Azure Key Vault with an AKS cluster using the Azure Key Vault provider for the Secrets Store
CSI Driver. The Secrets Store CSI driver enables the AKS cluster to natively retrieve secret contents from Key
Vault and securely provide them only to the requesting pod. Work with your cluster operator to deploy the
Secrets Store CSI Driver onto AKS worker nodes. You can use a pod managed identity to request access to Key
Vault and retrieve the secret contents needed through the Secrets Store CSI Driver.
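A minimal sketch of a SecretProviderClass for the Azure provider follows; the vault name, tenant ID, and secret name are placeholders, and the exact apiVersion depends on the driver release you deploy:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv-secrets
spec:
  provider: azure
  parameters:
    keyvaultName: "myKeyVault"       # placeholder vault name
    tenantId: "<tenant-id>"          # placeholder Azure AD tenant
    objects: |
      array:
        - |
          objectName: db-connection-string
          objectType: secret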
Next steps
This article focused on how to secure your pods. To implement some of these areas, see the following articles:
Use managed identities for Azure resources with AKS
Integrate Azure Key Vault with AKS
Migrate to Azure Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
To help you plan and execute a successful migration to Azure Kubernetes Service (AKS), this guide provides
details for the current recommended AKS configuration. While this article doesn't cover every scenario, it
contains links to more detailed information for planning a successful migration.
This document helps support the following scenarios:
Containerizing certain applications and migrating them to AKS using Azure Migrate.
Migrating an AKS Cluster backed by Availability Sets to Virtual Machine Scale Sets.
Migrating an AKS cluster to use a Standard SKU load balancer.
Migrating from Azure Container Service (ACS), which retired on January 31, 2020, to AKS.
Migrating from AKS engine to AKS.
Migrating from non-Azure based Kubernetes clusters to AKS.
Moving existing resources to a different region.
When migrating, ensure your target Kubernetes version is within the supported window for AKS. Older versions
may not be within the supported range and will require a version upgrade to be supported by AKS. For more
information, see AKS supported Kubernetes versions.
If you're migrating to a newer version of Kubernetes, review Kubernetes version and version skew support
policy.
Several open-source tools can help with your migration, depending on your scenario:
Velero (Requires Kubernetes 1.7+)
Azure Kube CLI extension
ReShifter
In this article we will summarize migration details for:
Containerizing applications through Azure Migrate
AKS with Standard Load Balancer and Virtual Machine Scale Sets
Existing attached Azure Services
Ensure valid quotas
High Availability and business continuity
Considerations for stateless applications
Considerations for stateful applications
Deployment of your cluster configuration
AKS with Standard Load Balancer and Virtual Machine Scale Sets
AKS is a managed service offering unique capabilities with lower management overhead. Since AKS is a
managed service, you must select from a set of regions which AKS supports. You may need to modify your
existing applications to keep them healthy on the AKS-managed control plane during the transition from your
existing cluster to AKS.
We recommend using AKS clusters backed by Virtual Machine Scale Sets and the Azure Standard Load Balancer
to ensure you get features such as:
Multiple node pools,
Availability Zones,
Authorized IP ranges,
Cluster Autoscaler,
Azure Policy for AKS, and
Other new features as they are released.
AKS clusters backed by Virtual Machine Availability Sets lack support for many of these features.
The following example creates an AKS cluster with single node pool backed by a virtual machine (VM) scale set.
The cluster:
Uses a standard load balancer.
Enables the cluster autoscaler on the node pool for the cluster.
Sets a minimum of 1 and maximum of 3 nodes.
# Now create the AKS cluster and enable the cluster autoscaler
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
Azure Front Door Service is another option for routing traffic for AKS clusters. With Azure Front Door Service,
you can define, manage, and monitor the global routing for your web traffic by optimizing for best performance
and instant global failover for high availability.
Considerations for stateless applications
Stateless application migration is the most straightforward case:
1. Apply your resource definitions (YAML or Helm) to the new cluster.
2. Ensure everything works as expected.
3. Redirect traffic to activate your new cluster.
Considerations for stateful applications
Carefully plan your migration of stateful applications to avoid data loss or unexpected downtime.
If you use Azure Files, you can mount the file share as a volume into the new cluster. See Mount Static Azure
Files as a Volume.
If you use Azure Managed Disks, you can only mount the disk if unattached to any VM. See Mount Static
Azure Disk as a Volume.
If neither of those approaches works, you can use backup and restore options. See Velero on Azure.
Azure Files
Unlike disks, Azure Files can be mounted to multiple hosts concurrently. In your AKS cluster, Azure and
Kubernetes don't prevent you from creating a pod that uses a file share your existing cluster still uses. To prevent
data loss and unexpected behavior, ensure that the clusters don't write to the same files simultaneously.
If your application can host multiple replicas that point to the same file share, follow the stateless migration
steps and deploy your YAML definitions to your new cluster.
If not, one possible migration approach involves the following steps:
1. Validate your application is working correctly.
2. Point your live traffic to your new AKS cluster.
3. Disconnect the old cluster.
If you want to start with an empty share and make a copy of the source data, you can use the
az storage file copy commands to migrate your data.
IMPORTANT
If you choose not to quiesce writes, you'll need to replicate data to the new deployment. Otherwise you'll miss the data
that was written after you took the disk snapshots.
Some open-source tools can help you create managed disks and migrate volumes between Kubernetes clusters:
Azure CLI Disk Copy extension copies and converts disks across resource groups and Azure regions.
Azure Kube CLI extension enumerates ACS Kubernetes volumes and migrates them to an AKS cluster.
Deployment of your cluster configuration
We recommend that you use your existing Continuous Integration (CI) and Continuous Delivery (CD) pipeline to
deploy a known-good configuration to AKS. You can use Azure Pipelines to build and deploy your applications
to AKS. Clone your existing deployment tasks and ensure that kubeconfig points to the new AKS cluster.
If that's not possible, export resource definitions from your existing Kubernetes cluster and then apply them to
AKS. You can use kubectl to export objects. For example:
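A sketch of such an export (the resource types and file name are illustrative):

kubectl get deployment --all-namespaces -o yaml > deployments.yaml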
Be sure to examine the output and remove any unnecessary live data fields.
Moving existing resources to another region
You may want to move your AKS cluster to a different region supported by AKS. We recommend that you create
a new cluster in the other region, then deploy your resources and applications to your new cluster.
In addition, if you have any services running on your AKS cluster, you will need to install and configure those
services on your cluster in the new region.
In this article, we summarized migration details for:
AKS with Standard Load Balancer and Virtual Machine Scale Sets
Existing attached Azure Services
Ensure valid quotas
High Availability and business continuity
Considerations for stateless applications
Considerations for stateful applications
Deployment of your cluster configuration
Java web app containerization and migration to
Azure Kubernetes Service
6/15/2022 • 15 minutes to read • Edit Online
In this article, you'll learn how to containerize Java web applications (running on Apache Tomcat) and migrate
them to Azure Kubernetes Service (AKS) using the Azure Migrate: App Containerization tool. The
containerization process doesn’t require access to your codebase and provides an easy way to containerize
existing applications. The tool works by using the running state of the applications on a server to determine the
application components and helps you package them in a container image. The containerized application can
then be deployed on Azure Kubernetes Service (AKS).
The Azure Migrate: App Containerization tool currently supports -
Containerizing Java Web Apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on AKS.
Containerizing Java Web Apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on App Service. Learn more
Containerizing ASP.NET apps and deploying them on Windows containers on AKS. Learn more
Containerizing ASP.NET apps and deploying them on Windows containers on App Service. Learn more
The Azure Migrate: App Containerization tool helps you to -
Discover your application : The tool remotely connects to the application servers running your Java web
application (running on Apache Tomcat) and discovers the application components. The tool creates a
Dockerfile that can be used to create a container image for the application.
Build the container image : You can inspect and further customize the Dockerfile as per your application
requirements and use that to build your application container image. The application container image is
pushed to an Azure Container Registry you specify.
Deploy to Azure Kubernetes Service : The tool then generates the Kubernetes resource definition YAML
files needed to deploy the containerized application to your Azure Kubernetes Service cluster. You can
customize the YAML files and use them to deploy the application on AKS.
NOTE
The Azure Migrate: App Containerization tool helps you discover specific application types (ASP.NET and Java web apps on
Apache Tomcat) and their components on an application server. To discover servers and the inventory of apps, roles, and
features running on on-premises machines, use Azure Migrate: Discovery and assessment capability. Learn more
While all applications won't benefit from a straight shift to containers without significant rearchitecting, some of
the benefits of moving existing apps to containers without rewriting include:
Improved infrastructure utilization: With containers, multiple applications can share resources and be
hosted on the same infrastructure. This can help you consolidate infrastructure and improve utilization.
Simplified management: By hosting your applications on a modern managed platform like AKS and App
Service, you can simplify your management practices. You can achieve this by retiring or reducing the
infrastructure maintenance and management processes that you'd traditionally perform with owned
infrastructure.
Application portability: With increased adoption and standardization of container specification formats
and platforms, application portability is no longer a concern.
Adopt modern management with DevOps: Helps you adopt and standardize on modern practices for
management and security and transition to DevOps.
In this tutorial, you'll learn how to:
Set up an Azure account.
Install the Azure Migrate: App Containerization tool.
Discover your Java web application.
Build the container image.
Deploy the containerized application on AKS.
NOTE
Tutorials show you the simplest deployment path for a scenario so that you can quickly set up a proof-of-concept.
Tutorials use default options where possible, and don't show all possible settings and paths.
Prerequisites
Before you begin this tutorial, you should:
Identify a machine to install the tool : A Windows machine to install and run the Azure Migrate: App Containerization tool. The Windows machine could be a server (Windows Server 2016 or later) or client (Windows 10) operating system, meaning that the tool can run on your desktop as well.
Application servers : Enable Secure Shell (SSH) connection on port 22 on the server(s) running the Java application(s) to be containerized.
2. In the Subscriptions page, select the subscription in which you want to create an Azure Migrate project.
3. Select Access control (IAM) .
4. Select Add > Add role assignment to open the Add role assignment page.
5. Assign the following role. For detailed steps, see Assign Azure roles using the Azure portal.
SETTING    VALUE
Role       Owner
9. In case the 'App registrations' setting is set to 'No', request the tenant/global admin to assign the
required permission. Alternately, the tenant/global admin can assign the Application Developer role to
an account to allow the registration of Azure Active Directory App. Learn more.
.\AppContainerizationInstaller.ps1
Sign in to Azure
Click Sign in to log in to your Azure account.
1. You'll need a device code to authenticate with Azure. Clicking on sign in will open a modal with the device
code.
2. Click on Copy code & sign in to copy the device code and open an Azure sign in prompt in a new
browser tab. If it doesn't appear, make sure you've disabled the pop-up blocker in the browser.
3. On the new tab, paste the device code and complete sign in using your Azure account credentials. You can
close the browser tab after sign in is complete and return to the App Containerization tool's web
interface.
4. Select the Azure tenant that you want to use.
5. Specify the Azure subscription that you want to use.
2. Review the Dockerfile : The Dockerfiles needed to build the container images for each selected
application are generated at the beginning of the build step. Click Review to review the Dockerfile. You
can also add any necessary customizations to the Dockerfile in the review step and save the changes
before starting the build process.
3. Configure Application Insights : You can enable monitoring for your Java apps running on App
Service without instrumenting your code. The tool will install the Java standalone agent as part of the
container image. Once configured during deployment, the Java agent will automatically collect a
multitude of requests, dependencies, logs, and metrics for your application that can be used for
monitoring with Application Insights. This option is enabled by default for all Java applications.
4. Trigger build process : Select the applications to build images for and click Build . Clicking build will
start the container image build for each application. The tool keeps monitoring the build status
continuously and will let you proceed to the next step upon successful completion of the build.
5. Track build status : You can also monitor progress of the build step by clicking the Build in Progress
link under the status column. The link takes a couple of minutes to be active after you've triggered the
build process.
6. Once the build is completed, click Continue to specify deployment settings.
If you don’t have an AKS cluster or would like to create a new AKS cluster to deploy the application to,
you can choose to create one from the tool by clicking Create new AKS cluster .
The AKS cluster created using the tool will be created with a Linux node pool. The cluster will be
configured to allow it to pull images from the Azure Container Registry that was created earlier
(if create new registry option was chosen).
Click Continue after selecting the AKS cluster.
2. Specify secret store and monitoring workspace : If you had opted to parameterize application
configurations, then specify the secret store to be used for the application. You can choose Azure Key
Vault or Kubernetes Secrets for managing your application secrets.
If you've selected Kubernetes secrets for managing secrets, then click Continue .
If you'd like to use an Azure Key Vault for managing your application secrets, then specify the Azure
Key Vault that you'd want to use.
If you don’t have an Azure Key Vault or would like to create a new Key Vault, you can choose to
create one from the tool by clicking Create new .
The tool will automatically assign the necessary permissions for managing secrets through the
Key Vault.
Monitoring workspace : If you'd selected to enable monitoring with Application Insights, then
specify the Application Insights resource that you'd want to use. This option won't be visible if you had
disabled monitoring integration.
If you don’t have an Application Insights resource or would like to create a new resource, you
can choose to create one from the tool by clicking Create new .
3. Specify Azure file share : If you had added more folders and selected the Persistent Volume option,
then specify the Azure file share that should be used by Azure Migrate: App Containerization tool during
the deployment process. The tool will create new directories in this Azure file share to copy over the
application folders that are configured for Persistent Volume storage. Once the application deployment is
complete, the tool will clean up the Azure file share by deleting the directories it had created.
If you don't have an Azure file share or would like to create a new Azure file share, you can choose to
create one from the tool by clicking Create new Storage Account and file share .
4. Application deployment configuration : Once you've completed the steps above, you'll need to
specify the deployment configuration for the application. Click Configure to customize the deployment
for the application. In the configure step you can provide the following customizations:
Prefix string : Specify a prefix string to use in the name for all resources that are created for the
containerized application in the AKS cluster.
Replica Sets : Specify the number of application instances (pods) that should run inside the
containers.
Load balancer type : Select External if the containerized application should be reachable from public
networks.
Application Configuration : For any application configurations that were parameterized, provide the
values to use for the current deployment.
Storage : For any application folders that were configured for Persistent Volume storage, specify
whether the volume should be shared across application instances or should be initialized individually
with each instance in the container. By default, all application folders on Persistent Volumes are
configured as shared.
Click Apply to save the deployment configuration.
Click Continue to deploy the application.
5. Deploy the application : Once the deployment configuration for the application is saved, the tool will
generate the Kubernetes deployment YAML for the application.
Click Review to review and customize the Kubernetes deployment YAML for the applications.
Select the application to deploy.
Click Deploy to start deployments for the selected applications.
Once the application is deployed, you can click the Deployment status column to track the
resources that were deployed for the application.
Next steps
Containerizing Java web apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on App Service. Learn more
Containerizing ASP.NET web apps and deploying them on Windows containers on AKS. Learn more
Containerizing ASP.NET web apps and deploying them on Windows containers on Azure App Service. Learn
more
ASP.NET app containerization and migration to
Azure Kubernetes Service
6/15/2022 • 15 minutes to read • Edit Online
In this article, you'll learn how to containerize ASP.NET applications and migrate them to Azure Kubernetes
Service (AKS) using the Azure Migrate: App Containerization tool. The containerization process doesn’t require
access to your codebase and provides an easy way to containerize existing applications. The tool works by using
the running state of the applications on a server to determine the application components and helps you
package them in a container image. The containerized application can then be deployed on Azure Kubernetes
Service (AKS).
The Azure Migrate: App Containerization tool currently supports -
Containerizing ASP.NET apps and deploying them on Windows containers on Azure Kubernetes Service.
Containerizing ASP.NET apps and deploying them on Windows containers on Azure App Service. Learn more
Containerizing Java Web Apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on AKS. Learn more
Containerizing Java Web Apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on App Service. Learn more
The Azure Migrate: App Containerization tool helps you to -
Discover your application : The tool remotely connects to the application servers running your ASP.NET
application and discovers the application components. The tool creates a Dockerfile that can be used to
create a container image for the application.
Build the container image : You can inspect and further customize the Dockerfile as per your application
requirements and use that to build your application container image. The application container image is
pushed to an Azure Container Registry you specify.
Deploy to Azure Kubernetes Service : The tool then generates the Kubernetes resource definition YAML
files needed to deploy the containerized application to your Azure Kubernetes Service cluster. You can
customize the YAML files and use them to deploy the application on AKS.
NOTE
The Azure Migrate: App Containerization tool helps you discover specific application types (ASP.NET and Java web apps on
Apache Tomcat) and their components on an application server. To discover servers and the inventory of apps, roles, and
features running on on-premises machines, use Azure Migrate: Discovery and assessment capability. Learn more
While all applications won't benefit from a straight shift to containers without significant rearchitecting, some of
the benefits of moving existing apps to containers without rewriting include:
Improved infrastructure utilization: With containers, multiple applications can share resources and be
hosted on the same infrastructure. This can help you consolidate infrastructure and improve utilization.
Simplified management: By hosting your applications on a modern managed platform like AKS and App
Service, you can simplify your management practices. You can achieve this by retiring or reducing the
infrastructure maintenance and management processes that you'd traditionally perform with owned
infrastructure.
Application portability: With increased adoption and standardization of container specification formats
and platforms, application portability is no longer a concern.
Adopt modern management with DevOps: Helps you adopt and standardize on modern practices for
management and security and transition to DevOps.
In this tutorial, you'll learn how to:
Set up an Azure account.
Install the Azure Migrate: App Containerization tool.
Discover your ASP.NET application.
Build the container image.
Deploy the containerized application on AKS.
NOTE
Tutorials show you the simplest deployment path for a scenario so that you can quickly set up a proof-of-concept.
Tutorials use default options where possible, and don't show all possible settings and paths.
Prerequisites
Before you begin this tutorial, you should:
Identify a machine to install the tool : A Windows machine to install and run the Azure Migrate: App Containerization tool. The Windows machine could be a server (Windows Server 2016 or later) or client (Windows 10) operating system, meaning that the tool can run on your desktop as well.
Application servers : Enable PowerShell remoting on the application servers: log in to the application server and follow these instructions to turn on PowerShell remoting.
2. In the Subscriptions page, select the subscription in which you want to create an Azure Migrate project.
3. Select Access control (IAM) .
4. Select Add > Add role assignment to open the Add role assignment page.
5. Assign the following role. For detailed steps, see Assign Azure roles using the Azure portal.
SETTING    VALUE
Role       Owner
6. Your Azure account also needs permissions to register Azure Active Directory apps.
7. In the Azure portal, navigate to Azure Active Directory > Users > User Settings .
8. In User settings , verify that Azure AD users can register applications (set to Yes by default).
9. In case the 'App registrations' setting is set to 'No', request the tenant/global admin to assign the
required permission. Alternately, the tenant/global admin can assign the Application Developer role to
an account to allow the registration of Azure Active Directory App. Learn more.
.\AppContainerizationInstaller.ps1
Sign in to Azure
Click Sign in to log in to your Azure account.
1. You'll need a device code to authenticate with Azure. Clicking on sign in will open a modal with the device
code.
2. Click on Copy code & sign in to copy the device code and open an Azure sign in prompt in a new
browser tab. If it doesn't appear, make sure you've disabled the pop-up blocker in the browser.
3. On the new tab, paste the device code and complete sign in using your Azure account credentials. You can
close the browser tab after sign in is complete and return to the App Containerization tool's web
interface.
4. Select the Azure tenant that you want to use.
5. Specify the Azure subscription that you want to use.
2. Review the Dockerfile : The Dockerfiles needed to build the container images for each selected
application are generated at the beginning of the build step. Click Review to review the Dockerfile. You
can also add any necessary customizations to the Dockerfile in the review step and save the changes
before starting the build process.
3. Trigger build process : Select the applications to build images for and click Build . Clicking build will
start the container image build for each application. The tool keeps monitoring the build status
continuously and will let you proceed to the next step upon successful completion of the build.
4. Track build status : You can also monitor progress of the build step by clicking the Build in Progress
link under the status column. The link takes a couple of minutes to be active after you've triggered the
build process.
5. Once the build is completed, click Continue to specify deployment settings.
Deploy the containerized app on AKS
Once the container image is built, the next step is to deploy the application as a container on Azure Kubernetes
Service (AKS).
1. Select the Azure Kubernetes Service Cluster : Specify the AKS cluster that the application should be
deployed to.
The selected AKS cluster must have a Windows node pool.
The cluster must be configured to allow pulling of images from the Azure Container Registry that was
selected to store the images.
Run the following Azure CLI command to attach the AKS cluster to the ACR, as shown in the sketch after this step.
If you don’t have an AKS cluster or would like to create a new AKS cluster to deploy the application to,
you can choose to create one from the tool by clicking Create new AKS cluster .
The AKS cluster created using the tool will be created with a Windows node pool. The cluster
will be configured to allow it to pull images from the Azure Container Registry that was created
earlier (if create new registry option was chosen).
Click Continue after selecting the AKS cluster.
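A sketch of attaching the registry referenced in the step above (cluster, resource group, and registry names are placeholders):

az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --attach-acr myContainerRegistry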
2. Specify secret store : If you had opted to parameterize application configurations, then specify the
secret store to be used for the application. You can choose Azure Key Vault or App Service application
settings for managing your application secrets. Learn more
If you've selected App Service application settings for managing secrets, then click Continue .
If you'd like to use an Azure Key Vault for managing your application secrets, then specify the Azure
Key Vault that you'd want to use.
If you don’t have an Azure Key Vault or would like to create a new Key Vault, you can choose to
create one from the tool by clicking Create new Azure Key Vault .
The tool will automatically assign the necessary permissions for managing secrets through the
Key Vault.
3. Specify Azure file share : If you had added more folders and selected the Persistent Volume option,
then specify the Azure file share that should be used by Azure Migrate: App Containerization tool during
the deployment process. The tool will create new directories in this Azure file share to copy over the
application folders that are configured for Persistent Volume storage. Once the application deployment is
complete, the tool will clean up the Azure file share by deleting the directories it had created.
If you don't have an Azure file share or would like to create a new Azure file share, you can choose to
create one from the tool by clicking Create new Storage Account and file share .
4. Application deployment configuration : Once you've completed the steps above, you'll need to
specify the deployment configuration for the application. Click Configure to customize the deployment
for the application. In the configure step you can provide the following customizations:
Prefix string : Specify a prefix string to use in the name for all resources that are created for the
containerized application in the AKS cluster.
SSL cer tificate : If your application requires an https site binding, specify the PFX file that contains the
certificate to be used for the binding. The PFX file shouldn't be password protected and the original
site shouldn't have multiple bindings.
Replica Sets : Specify the number of application instances (pods) that should run inside the
containers.
Load balancer type : Select External if the containerized application should be reachable from public
networks.
Application Configuration : For any application configurations that were parameterized, provide the
values to use for the current deployment.
Storage : For any application folders that were configured for Persistent Volume storage, specify
whether the volume should be shared across application instances or should be initialized individually
with each instance in the container. By default, all application folders on Persistent Volumes are
configured as shared.
Click Apply to save the deployment configuration.
Click Continue to deploy the application.
5. Deploy the application : Once the deployment configuration for the application is saved, the tool will
generate the Kubernetes deployment YAML for the application.
Click Review to review and customize the Kubernetes deployment YAML for the applications.
Select the application to deploy.
Click Deploy to start deployments for the selected applications.
Once the application is deployed, you can click the Deployment status column to track the
resources that were deployed for the application.
Troubleshoot issues
To troubleshoot any issues with the tool, you can look at the log files on the Windows machine running the App
Containerization tool. Tool log files are located in the C:\ProgramData\Microsoft Azure Migrate App
Containerization\Logs folder.
Next steps
Containerizing ASP.NET web apps and deploying them on Windows containers on App Service. Learn more
Containerizing Java web apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on AKS. Learn more
Containerizing Java web apps on Apache Tomcat (on Linux servers) and deploying them on Linux containers
on App Service. Learn more
Scale the node count in an Azure Kubernetes
Service (AKS) cluster
6/15/2022 • 2 minutes to read • Edit Online
If the resource needs of your applications change, you can manually scale an AKS cluster to run a different
number of nodes. When you scale down, nodes are carefully cordoned and drained to minimize disruption to
running applications. When you scale up, AKS waits until nodes are marked Ready by the Kubernetes cluster
before pods are scheduled on them.
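First, look up the node pool name and current count; a sketch, assuming the cluster and resource group names used elsewhere in this documentation:

az aks show --resource-group myResourceGroup --name myAKSCluster --query agentPoolProfiles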
[
  {
    "count": 1,
    "maxPods": 110,
    "name": "nodepool1",
    "osDiskSizeGb": 30,
    "osType": "Linux",
    "storageProfile": "ManagedDisks",
    "vmSize": "Standard_DS2_v2"
  }
]
Use the az aks scale command to scale the cluster nodes. The following example scales a cluster named
myAKSCluster to a single node. Provide your own --nodepool-name from the previous command, such as
nodepool1:
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 1 --nodepool-name <your node pool name>
The following example output shows the cluster has successfully scaled to one node, as shown in the
agentPoolProfiles section:
{
  "aadProfile": null,
  "addonProfiles": null,
  "agentPoolProfiles": [
    {
      "count": 1,
      "maxPods": 110,
      "name": "nodepool1",
      "osDiskSizeGb": 30,
      "osType": "Linux",
      "storageProfile": "ManagedDisks",
      "vmSize": "Standard_DS2_v2",
      "vnetSubnetId": null
    }
  ],
  [...]
}
You can also manually scale a User node pool to 0 nodes:

az aks nodepool scale --name <your node pool name> --cluster-name myAKSCluster --resource-group myResourceGroup --node-count 0
You can also autoscale User node pools to 0 nodes, by setting the --min-count parameter of the Cluster
Autoscaler to 0.
Next steps
In this article, you manually scaled an AKS cluster to increase or decrease the number of nodes. You can also use
the cluster autoscaler to automatically scale your cluster.
Use Scale-down Mode to delete/deallocate nodes
in Azure Kubernetes Service (AKS)
6/15/2022 • 2 minutes to read • Edit Online
By default, scale-up operations performed manually or by the cluster autoscaler require the allocation and
provisioning of new nodes, and scale-down operations delete nodes. Scale-down Mode allows you to decide
whether you would like to delete or deallocate the nodes in your Azure Kubernetes Service (AKS) cluster upon
scaling down.
When an Azure VM is in the Stopped (deallocated) state, you will not be charged for the VM compute resources.
However, you'll still need to pay for any OS and data storage disks attached to the VM. This also means that the
container images will be preserved on those nodes. For more information, see States and billing of Azure Virtual
Machines. This behavior allows for faster operation speeds, as your deployment uses cached images. Scale-
down Mode removes the need to pre-provision nodes and pre-pull container images, saving you compute cost.
This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart
using the Azure CLI, using Azure PowerShell, or using the Azure portal.
Limitations
Ephemeral OS disks aren't supported. Be sure to specify managed OS disks via --node-osdisk-type Managed
when creating a cluster or node pool.
NOTE
Previously, while Scale-down Mode was in preview, spot node pools were unsupported. Now that Scale-down Mode is
Generally Available, this limitation no longer applies.
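The scaling example that follows assumes a node pool of 20 nodes created with Scale-down Mode set to deallocate; a sketch of such a command (pool name and node count are illustrative):

az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name nodepool2 \
  --node-count 20 \
  --scale-down-mode Deallocate \
  --node-osdisk-type Managed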
By scaling the node pool and changing the node count to 5, we'll deallocate 15 nodes.
az aks nodepool scale --node-count 5 --name nodepool2 --cluster-name myAKSCluster --resource-group myResourceGroup
To switch the node pool back to deleting nodes when it scales down, update its scale-down mode:

az aks nodepool update --scale-down-mode Delete --name nodepool2 --cluster-name myAKSCluster --resource-group myResourceGroup
NOTE
Changing your scale-down mode from Deallocate to Delete then back to Deallocate will delete all deallocated
nodes while keeping your node pool in Deallocate scale-down mode.
Next steps
To learn more about upgrading your AKS cluster, see Upgrade an AKS cluster
To learn more about the cluster autoscaler, see Automatically scale a cluster to meet application demands on
AKS
Stop and Start an Azure Kubernetes Service (AKS)
cluster
6/15/2022 • 4 minutes to read • Edit Online
Your AKS workloads may not need to run continuously, for example a development cluster that is used only
during business hours. This leads to times where your Azure Kubernetes Service (AKS) cluster might be idle,
running no more than the system components. You can reduce the cluster footprint by scaling all the User
node pools to 0, but your System pool is still required to run the system components while the cluster is
running. To optimize your costs further during these periods, you can completely turn off (stop) your cluster.
This action will stop your control plane and agent nodes altogether, allowing you to save on all the compute
costs, while maintaining all your objects (except standalone pods) and cluster state stored for when you start it
again. You can then pick up right where you left off after a weekend, or have your cluster running only while
you run your batch jobs.
You can use the az aks stop command to stop a running AKS cluster's nodes and control plane. The following
example stops a cluster named myAKSCluster:
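Assuming the resource group used elsewhere in this documentation:

az aks stop --name myAKSCluster --resource-group myResourceGroup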
You can verify when your cluster is stopped by using the az aks show command and confirming the powerState
shows Stopped, as in the following output:
{
  [...]
  "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_westus2",
  "powerState": {
    "code": "Stopped"
  },
  "privateFqdn": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "myResourceGroup",
  [...]
}
If the provisioningState shows Stopping, your cluster hasn't fully stopped yet.
IMPORTANT
If you are using Pod Disruption Budgets the stop operation can take longer as the drain process will take more time to
complete.
It is important that you don't repeatedly start/stop your cluster. Repeatedly starting/stopping your cluster may
result in errors. Once your cluster is stopped, you should wait 15-30 minutes before starting it up again.
You can use the az aks start command to start a stopped AKS cluster's nodes and control plane. The cluster is
restarted with the previous control plane state and number of agent nodes. The following example starts a
cluster named myAKSCluster:
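A minimal sketch of the start command:

az aks start --name myAKSCluster --resource-group myResourceGroup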
You can verify when your cluster has started by using the az aks show command and confirming the
powerState shows Running, as in the following output:
{
  [...]
  "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_westus2",
  "powerState": {
    "code": "Running"
  },
  "privateFqdn": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "myResourceGroup",
  [...]
}
If the provisioningState shows Starting, your cluster hasn't fully started yet.
NOTE
When you start your cluster back up, the following is expected behavior:
The IP address of your API server may change.
If you are using the cluster autoscaler, when you start your cluster back up, your current node count may not be between
the min and max range values you set. The cluster starts with the number of nodes it needs to run its workloads,
which isn't impacted by your autoscaler settings. When your cluster performs scaling operations, the min and max
values will impact your current node count, and your cluster will eventually enter and remain in that desired range
until you stop your cluster.
Next steps
To learn how to scale User pools to 0, see Scale User pools to 0.
To learn how to save costs using Spot instances, see Add a spot node pool to AKS.
To learn more about the AKS support policies, see AKS support policies.
Use Planned Maintenance to schedule maintenance
windows for your Azure Kubernetes Service (AKS)
cluster (preview)
6/15/2022 • 4 minutes to read • Edit Online
Your AKS cluster has regular maintenance performed on it automatically. By default, this work can happen at any
time. Planned Maintenance allows you to schedule weekly maintenance windows that will update your control
plane as well as your kube-system Pods on a VMSS instance and minimize workload impact. Once scheduled, all
your maintenance will occur during the window you selected. You can schedule one or more weekly windows on
your cluster by specifying a day or time range on a specific day. Maintenance Windows are configured using the
Azure CLI.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Limitations
When using Planned Maintenance, the following restrictions apply:
AKS reserves the right to break these windows for unplanned/reactive maintenance operations that are
urgent or critical.
Currently, performing maintenance operations are considered best-effort only and are not guaranteed to
occur within a specified window.
Updates cannot be blocked for more than seven days.
Install aks-preview CLI extension
You also need the aks-preview Azure CLI extension version 0.5.4 or later. Install the aks-preview Azure CLI
extension by using the az extension add command. Or install any available updates by using the az extension
update command.
# Install the aks-preview extension
az extension add --name aks-preview

# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Allow maintenance on every Monday at 1:00am to 2:00am
To add a maintenance window, you can use the az aks maintenanceconfiguration add command.
IMPORTANT
At this time, you must set default as the value for --name . Using any other name will cause your maintenance
window to not run.
Planned Maintenance windows are specified in Coordinated Universal Time (UTC).
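The window itself can be added with a command along these lines (a sketch; the --weekday and --start-hour parameters come from the aks-preview extension, and the configuration name must be default as noted above):

az aks maintenanceconfiguration add -g MyResourceGroup --cluster-name myAKSCluster --name default --weekday Monday --start-hour 1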
The following example output shows the maintenance window from 1:00am to 2:00am every Monday.
{
  "id": "/subscriptions/<subscriptionID>/resourcegroups/MyResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/maintenanceConfigurations/default",
  "name": "default",
  "notAllowedTime": null,
  "resourceGroup": "MyResourceGroup",
  "systemData": null,
  "timeInWeek": [
    {
      "day": "Monday",
      "hourSlots": [
        1
      ]
    }
  ],
  "type": null
}
To allow maintenance any time during a day, omit the start-hour parameter. For example, the following
command sets the maintenance window for the full day every Monday:
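A sketch of that command, again assuming the default configuration name:

az aks maintenanceconfiguration add -g MyResourceGroup --cluster-name myAKSCluster --name default --weekday Monday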
You can also define maintenance windows in a JSON file. For example, a test.json file can specify maintenance
windows every Tuesday at 1:00am - 3:00am and every Wednesday at 1:00am - 2:00am and at 6:00am - 7:00am, plus an
exception from 2021-05-26T03:00:00Z to 2021-05-30T12:00:00Z where maintenance isn't allowed even if it overlaps
with a maintenance window. The following command adds the maintenance windows from test.json .
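One possible form of that command (assuming test.json is in the current directory and that the extension's --config-file parameter is used):

az aks maintenanceconfiguration add -g MyResourceGroup --cluster-name myAKSCluster --name default --config-file ./test.json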
In the output below, you can see that there are two maintenance windows configured for myAKSCluster. One
window is on Mondays at 1:00am and another window is on Friday at 4:00am.
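The configurations can be listed with something like the following (a sketch using the az aks maintenanceconfiguration list command):

az aks maintenanceconfiguration list -g MyResourceGroup --cluster-name myAKSCluster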
[
  {
    "id": "/subscriptions/<subscriptionID>/resourcegroups/MyResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/maintenanceConfigurations/default",
    "name": "default",
    "notAllowedTime": null,
    "resourceGroup": "MyResourceGroup",
    "systemData": null,
    "timeInWeek": [
      {
        "day": "Monday",
        "hourSlots": [
          1
        ]
      }
    ],
    "type": null
  },
  {
    "id": "/subscriptions/<subscriptionID>/resourcegroups/MyResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/maintenanceConfigurations/testConfiguration",
    "name": "testConfiguration",
    "notAllowedTime": null,
    "resourceGroup": "MyResourceGroup",
    "systemData": null,
    "timeInWeek": [
      {
        "day": "Friday",
        "hourSlots": [
          4
        ]
      }
    ],
    "type": null
  }
]
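To view a single maintenance configuration, a show command along these lines can be used (a sketch):

az aks maintenanceconfiguration show -g MyResourceGroup --cluster-name myAKSCluster --name default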
The following example output shows the maintenance window for default:
{
  "id": "/subscriptions/<subscriptionID>/resourcegroups/MyResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/maintenanceConfigurations/default",
  "name": "default",
  "notAllowedTime": null,
  "resourceGroup": "MyResourceGroup",
  "systemData": null,
  "timeInWeek": [
    {
      "day": "Monday",
      "hourSlots": [
        1
      ]
    }
  ],
  "type": null
}
Next steps
To get started with upgrading your AKS cluster, see Upgrade an AKS cluster
Enable Cloud Controller Manager
6/15/2022 • 2 minutes to read • Edit Online
As a Cloud Provider, Microsoft Azure works closely with the Kubernetes community to support our
infrastructure on behalf of users.
Previously, Cloud provider integration with Kubernetes was "in-tree", where any changes to Cloud specific
features would follow the standard Kubernetes release cycle. When issues were fixed or enhancements were
rolled out, they would need to be within the Kubernetes community's release cycle.
The Kubernetes community is now adopting an "out-of-tree" model where the Cloud providers will control their
releases independently of the core Kubernetes release schedule through the cloud-provider-azure component.
As part of this cloud-provider-azure component, we are also introducing a cloud-node-manager component,
which is a component of the Kubernetes node lifecycle controller. This component is deployed by a DaemonSet
in the kube-system namespace. To view this component, use
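For example, you can list the DaemonSets in the kube-system namespace and look for the cloud-node-manager entry (a sketch; the exact DaemonSet name may vary by AKS version):

kubectl get daemonsets --namespace kube-system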
We recently rolled out the Container Storage Interface (CSI) drivers to be the default in Kubernetes version 1.21 and
above.
NOTE
When enabling Cloud Controller Manager on your AKS cluster, this will also enable the out of tree CSI drivers.
The Cloud Controller Manager is the default controller from Kubernetes 1.22, as supported by AKS. If you're running a
version earlier than 1.22, follow the instructions below.
Prerequisites
You must have the following resources installed:
The Azure CLI
Kubernetes version 1.20.x or above
The aks-preview extension version 0.5.5 or later
Register the EnableCloudControllerManager feature flag
To use the Cloud Controller Manager feature, you must enable the EnableCloudControllerManager feature flag on
your subscription.
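A sketch of the registration command, using the feature flag name given above:

az feature register --namespace Microsoft.ContainerService --name EnableCloudControllerManager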
You can check on the registration status by using the az feature list command:
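For example, a typical query for this feature flag looks something like the following (a sketch):

az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/EnableCloudControllerManager')].{Name:name,State:properties.state}"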
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
az provider register --namespace Microsoft.ContainerService
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Create a new AKS cluster with Cloud Controller Manager with version
<1.22
To create a cluster using the Cloud Controller Manager, pass EnableCloudControllerManager=True as a custom
header to the Azure API using the Azure CLI.
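A sketch of such a command (assuming the --aks-custom-headers parameter; the version shown is just an example of a pre-1.22 release):

az aks create -g myResourceGroup -n myAKSCluster --kubernetes-version 1.21.9 --aks-custom-headers EnableCloudControllerManager=True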
Next steps
For more information on CSI drivers, and the default behavior for Kubernetes versions above 1.21, please
see our documentation.
You can find more information about the Kubernetes community direction regarding Out of Tree
providers on the community blog post.
Upgrade an Azure Kubernetes Service (AKS) cluster
6/15/2022 • 9 minutes to read • Edit Online
Part of the AKS cluster lifecycle involves performing periodic upgrades to the latest Kubernetes version. It’s
important you apply the latest security releases, or upgrade to get the latest features. This article shows you
how to check for, configure, and apply upgrades to your AKS cluster.
For AKS clusters that use multiple node pools or Windows Server nodes, see Upgrade a node pool in AKS.
WARNING
An AKS cluster upgrade triggers a cordon and drain of your nodes. If you have a low compute quota available, the
upgrade may fail. For more information, see increase quotas
NOTE
When you upgrade a supported AKS cluster, Kubernetes minor versions can't be skipped. All upgrades must be performed
sequentially by minor version number. For example, upgrades between 1.14.x -> 1.15.x or 1.15.x -> 1.16.x are allowed,
however 1.14.x -> 1.16.x is not allowed.
Skipping multiple versions can only be done when upgrading from an unsupported version back to a supported version.
For example, an upgrade from an unsupported 1.10.x -> a supported 1.15.x can be completed if available.
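To check which versions are available for your cluster, the az aks get-upgrades command can be used (a sketch with the resource names used throughout this article):

az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --output table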
The following example output shows that the cluster can be upgraded to versions 1.19.1 and 1.19.3. If no upgrades
are available, you'll instead see an error similar to the following:

ERROR: Table output unavailable. Use the --query option to specify an appropriate query. Use --debug for
more info.
IMPORTANT
If no upgrade is available, create a new cluster with a supported version of Kubernetes and migrate your workloads from
the existing cluster to the new cluster. Attempting to upgrade a cluster to a newer Kubernetes version when
az aks get-upgrades shows no upgrades available is not supported.
By default, AKS configures upgrades to surge with one extra node. A default value of one for the max surge
settings will enable AKS to minimize workload disruption by creating an extra node before the cordon/drain of
existing applications to replace an older versioned node. The max surge value may be customized per node pool
to enable a trade-off between upgrade speed and upgrade disruption. By increasing the max surge value, the
upgrade process completes faster, but setting a large value for max surge may cause disruptions during the
upgrade process.
For example, a max surge value of 100% provides the fastest possible upgrade process (doubling the node
count) but also causes all nodes in the node pool to be drained simultaneously. You may wish to use a higher
value such as this for testing environments. For production node pools, we recommend a max_surge setting of
33%.
AKS accepts both integer values and a percentage value for max surge. An integer such as "5" indicates five extra
nodes to surge. A value of "50%" indicates a surge value of half the current node count in the pool. Max surge
percent values can be a minimum of 1% and a maximum of 100%. A percent value is rounded up to the nearest
node count. If the max surge value is lower than the current node count at the time of upgrade, the current node
count is used for the max surge value.
During an upgrade, the max surge value can be a minimum of 1 and a maximum value equal to the number of
nodes in your node pool. You can set larger values, but the maximum number of nodes used for max surge
won't be higher than the number of nodes in the pool at the time of upgrade.
IMPORTANT
The max surge setting on a node pool is persistent. Subsequent Kubernetes upgrades or node version upgrades will use
this setting. You may change the max surge value for your node pools at any time. For production node pools, we
recommend a max-surge setting of 33%.
Use the following commands to set max surge values for new or existing node pools.
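For example (a sketch; the --max-surge parameter is available on both az aks nodepool add and az aks nodepool update):

# Set max surge for a new node pool
az aks nodepool add -n mynodepool -g myResourceGroup --cluster-name myAKSCluster --max-surge 33%

# Update max surge for an existing node pool
az aks nodepool update -n mynodepool -g myResourceGroup --cluster-name myAKSCluster --max-surge 5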
NOTE
If no patch is specified, the cluster will automatically be upgraded to the specified minor version's latest GA patch. For
example, setting --kubernetes-version to 1.21 will result in the cluster upgrading to 1.21.9 .
When upgrading by alias minor version, only a higher minor version is supported. For example, upgrading from 1.20.x
to 1.20 will not trigger an upgrade to the latest GA 1.20 patch, but upgrading to 1.21 will trigger an upgrade to
the latest GA 1.21 patch.
az aks upgrade \
--resource-group myResourceGroup \
--name myAKSCluster \
--kubernetes-version KUBERNETES_VERSION
It takes a few minutes to upgrade the cluster, depending on how many nodes you have.
IMPORTANT
Ensure that any PodDisruptionBudgets (PDBs) allow for at least 1 pod replica to be moved at a time; otherwise, the
drain/evict operation will fail. If the drain operation fails, the upgrade operation will fail by design to ensure that the
applications are not disrupted. Please correct what caused the operation to stop (incorrect PDBs, lack of quota, and so on)
and re-try the operation.
To confirm that the upgrade was successful, use the az aks show command:
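For example (a sketch):

az aks show --resource-group myResourceGroup --name myAKSCluster --output table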
The following example output shows that the cluster now runs 1.18.10:
The following example output shows some of the events listed while an upgrade is in progress.
...
default 2m1s Normal Drain node/aks-nodepool1-96663640-vmss000001 Draining node: [aks-nodepool1-96663640-vmss000001]
...
default 9m22s Normal Surge node/aks-nodepool1-96663640-vmss000002 Created a surge node [aks-nodepool1-96663640-vmss000002 nodepool1] for agentpool %!s(MISSING)
...
CHANNEL: none
ACTION: Disables auto-upgrades and keeps the cluster at its current version of Kubernetes.
EXAMPLE: Default setting if left unchanged.

CHANNEL: node-image
ACTION: Automatically upgrades the node image to the latest version available.
EXAMPLE: Microsoft provides patches and new images for image nodes frequently (usually weekly), but your running
nodes won't get the new images unless you do a node image upgrade. Turning on the node-image channel will
automatically update your node images whenever a new version is available.
NOTE
Cluster auto-upgrade only updates to GA versions of Kubernetes and will not update to preview versions.
Automatically upgrading a cluster follows the same process as manually upgrading a cluster. For more
information, see Upgrade an AKS cluster.
To set the auto-upgrade channel when creating a cluster, use the auto-upgrade-channel parameter, similar to the
following example.
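A sketch of such a command, using the node-image channel described above:

az aks create --resource-group myResourceGroup --name myAKSCluster --auto-upgrade-channel node-image --generate-ssh-keys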
To set the auto-upgrade channel on an existing cluster, update the auto-upgrade-channel parameter, similar to the
following example.
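For example (a sketch):

az aks update --resource-group myResourceGroup --name myAKSCluster --auto-upgrade-channel node-image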
Uptime SLA is a tier to enable a financially backed, higher SLA for an AKS cluster. Clusters with Uptime SLA, also
regarded as Paid tier in AKS REST APIs, come with greater amount of control plane resources and automatically
scale to meet the load of your cluster. Uptime SLA guarantees 99.95% availability of the Kubernetes API server
endpoint for clusters that use Availability Zones and 99.9% of availability for clusters that don't use Availability
Zones. AKS uses master node replicas across update and fault domains to ensure SLA requirements are met.
AKS recommends use of Uptime SLA in production workloads to ensure availability of control plane
components. Clusters on free tier by contrast come with fewer replicas and limited resources for the control
plane and are not suitable for production workloads.
Customers can still create an unlimited number of free clusters with a service level objective (SLO) of 99.5% and
opt for the preferred SLO.
IMPORTANT
For clusters with egress lockdown, see limit egress traffic to open appropriate ports.
Region availability
Uptime SLA is available in public regions and Azure Government regions where AKS is supported.
Uptime SLA is available for private AKS clusters in all public regions where AKS is supported.
Use the az aks create command to create an AKS cluster. The following example creates a cluster named
myAKSCluster with one node. This operation takes several minutes to complete:
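A sketch of that command, assuming the --uptime-sla flag used for this tier:

az aks create --resource-group myResourceGroup --name myAKSCluster --uptime-sla --node-count 1 --generate-ssh-keys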
The following JSON snippet shows the paid tier for the SKU, indicating your cluster is enabled with Uptime SLA:
  },
  "sku": {
    "name": "Basic",
    "tier": "Paid"
  },
Clean up
To avoid charges, clean up any resources you created. To delete the cluster, use the az group delete command
to delete the AKS resource group:
az group delete --name myResourceGroup --yes --no-wait
Next steps
Use Availability Zones to increase high availability with your AKS cluster workloads.
Configure your cluster to limit egress traffic.
Draft for Azure Kubernetes Service (AKS) (preview)
6/15/2022 • 3 minutes to read • Edit Online
How it works
Draft has the following commands to help ease your development on Kubernetes:
draft create : Creates the Dockerfile and the proper manifest files.
draft setup-gh : Sets up your GitHub OIDC.
draft generate-workflow : Generates the GitHub Action workflow file for deployment onto your cluster.
draft up : Sets up your GitHub OIDC and generates a GitHub Action workflow file, combining the previous
two commands.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
Install the latest version of the Azure CLI and the aks-preview extension.
If you don't have one already, you need to create an AKS cluster and an Azure Container Registry instance.
Install the aks-preview Azure CLI extension
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
# Install the aks-preview extension
az extension add --name aks-preview

# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
For example, run az aks draft up to set up GitHub OIDC and generate the GitHub Action workflow file in one step:

az aks draft up

You can also run the command on a specific directory using the --destination flag:
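For example (a sketch; the path is a placeholder):

az aks draft up --destination /path/to/application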
NOTE
When using proximity placement groups on AKS, colocation only applies to the agent nodes. Node to node and the
corresponding hosted pod to pod latency is improved. The colocation does not affect the placement of a cluster's control
plane.
When deploying your application in Azure, spreading Virtual Machine (VM) instances across regions or
availability zones creates network latency, which may impact the overall performance of your application. A
proximity placement group is a logical grouping used to make sure Azure compute resources are physically
located close to each other. Some applications like gaming, engineering simulations, and high-frequency trading
(HFT) require low latency and tasks that complete quickly. For high-performance computing (HPC) scenarios
such as these, consider using proximity placement groups (PPG) for your cluster's node pools.
NOTE
While proximity placement groups require a node pool to use at most one availability zone, the baseline Azure VM SLA of
99.9% is still in effect for VMs in a single zone.
Proximity placement groups are a node pool concept and associated with each individual node pool. Using a
PPG resource has no impact on AKS control plane availability. This can impact how a cluster should be designed
with zones. To ensure a cluster is spread across multiple zones the following design is recommended.
Provision a cluster with the first system pool using 3 zones and no proximity placement group associated.
This ensures the system pods land in a dedicated node pool which will spread across multiple zones.
Add additional user node pools with a unique zone and proximity placement group associated to each pool.
An example is nodepool1 in zone 1 and PPG1, nodepool2 in zone 2 and PPG2, nodepool3 in zone 3 with
PPG3. This ensures at a cluster level, nodes are spread across multiple zones and each individual node pool is
colocated in the designated zone with a dedicated PPG resource.
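The group itself can be created with az ppg create; the values below match the example output that follows (a sketch):

az ppg create --name myPPG --resource-group myResourceGroup --location centralus --type standard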
The command produces output, which includes the id value you need for upcoming CLI commands:
{
  "availabilitySets": null,
  "colocationStatus": null,
  "id": "/subscriptions/yourSubscriptionID/resourceGroups/myResourceGroup/providers/Microsoft.Compute/proximityPlacementGroups/myPPG",
  "location": "centralus",
  "name": "myPPG",
  "proximityPlacementGroupType": "Standard",
  "resourceGroup": "myResourceGroup",
  "tags": {},
  "type": "Microsoft.Compute/proximityPlacementGroups",
  "virtualMachineScaleSets": null,
  "virtualMachines": null
}
Use the proximity placement group resource ID for the myPPGResourceID value in the below command:
# Create an AKS cluster that uses a proximity placement group for the initial system node pool only. The PPG
has no effect on the cluster control plane.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--ppg myPPGResourceID
# Add a new node pool that uses a proximity placement group, use a --node-count = 1 for testing
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool \
--node-count 1 \
--ppg myPPGResourceID
Clean up
To delete the cluster, use the az group delete command to delete the AKS resource group:
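For example, as shown earlier in this document:

az group delete --name myResourceGroup --yes --no-wait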
Next steps
Learn more about proximity placement groups.
Azure Kubernetes Service (AKS) node image
upgrade
6/15/2022 • 3 minutes to read • Edit Online
AKS supports upgrading the images on a node so you're up to date with the newest OS and runtime updates.
AKS regularly provides new images with the latest updates, so it's beneficial to upgrade your node's images
regularly for the latest AKS features. Linux node images are updated weekly, and Windows node images are
updated monthly. Although customers will be notified of image upgrades via the AKS release notes, it might
take up to a week for updates to be rolled out in all regions. This article shows you how to upgrade AKS cluster
node images and how to update node pool images without upgrading the version of Kubernetes.
For more information about the latest images provided by AKS, see the AKS release notes.
For information on upgrading the Kubernetes version for your cluster, see Upgrade an AKS cluster.
NOTE
The AKS cluster must use virtual machine scale sets for the nodes.
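To find the newest available node image version for a node pool, the az aks nodepool get-upgrades command can be used (a sketch assuming a node pool named nodepool1, which matches the output below):

az aks nodepool get-upgrades --nodepool-name nodepool1 --cluster-name myAKSCluster --resource-group myResourceGroup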
In the output, you can see the latestNodeImageVersion, as in the example below:
{
  "id": "/subscriptions/XXXX-XXX-XXX-XXX-XXXXX/resourcegroups/myResourceGroup/providers/Microsoft.ContainerService/managedClusters/myAKSCluster/agentPools/nodepool1/upgradeProfiles/default",
  "kubernetesVersion": "1.17.11",
  "latestNodeImageVersion": "AKSUbuntu-1604-2020.10.28",
  "name": "default",
  "osType": "Linux",
  "resourceGroup": "myResourceGroup",
  "type": "Microsoft.ContainerService/managedClusters/agentPools/upgradeProfiles",
  "upgrades": null
}
So for nodepool1 the latest node image available is AKSUbuntu-1604-2020.10.28 . You can now compare it with
the current node image version in use by your node pool by running:
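A sketch of that query:

az aks nodepool show --resource-group myResourceGroup --cluster-name myAKSCluster --name nodepool1 --query nodeImageVersion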
"AKSUbuntu-1604-2020.10.08"
So in this example you could upgrade from the current AKSUbuntu-1604-2020.10.08 image version to the latest
version AKSUbuntu-1604-2020.10.28 .
az aks upgrade \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-image-only
During the upgrade, check the status of the node images with the following kubectl command to get the labels
and filter out the current node image information:
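One way to do this is to add the node image version label as a column (a sketch; it assumes the kubernetes.azure.com/node-image-version label that AKS applies to its nodes):

kubectl get nodes -L kubernetes.azure.com/node-image-version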
When the upgrade is complete, use az aks show to get the updated node pool details. The current node image
is shown in the nodeImageVersion property.
az aks show \
--resource-group myResourceGroup \
--name myAKSCluster
During the upgrade, check the status of the node images with the following kubectl command to get the labels
and filter out the current node image information:
When the upgrade is complete, use az aks nodepool show to get the updated node pool details. The current
node image is shown in the nodeImageVersion property.
az aks nodepool show \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool
During the upgrade, check the status of the node images with the following kubectl command to get the labels
and filter out the current node image information:
Use az aks nodepool show to get the updated node pool details. The current node image is shown in the
nodeImageVersion property.
Next steps
See the AKS release notes for information about the latest node images.
Learn how to upgrade the Kubernetes version with Upgrade an AKS cluster.
Automatically apply cluster and node pool upgrades with GitHub Actions
Learn more about multiple node pools and how to upgrade node pools with Create and manage multiple
node pools.
Apply security updates to Azure Kubernetes Service
(AKS) nodes automatically using GitHub Actions
6/15/2022 • 6 minutes to read • Edit Online
Security updates are a key part of maintaining your AKS cluster's security and compliance with the latest fixes
for the underlying OS. These updates include OS security fixes or kernel updates. Some updates require a node
reboot to complete the process.
Running az aks upgrade gives you a zero downtime way to apply updates. The command handles applying the
latest updates to all your cluster's nodes, cordoning and draining traffic to the nodes, and restarting the nodes,
then allowing traffic to the updated nodes. If you update your nodes using a different method, AKS will not
automatically restart your nodes.
NOTE
The main difference between az aks upgrade when used with the --node-image-only flag is that, when it's used, only
the node images will be upgraded. If omitted, both the node images and the Kubernetes control plane version will be
upgraded. You can check the docs for managed upgrades on nodes and the docs for cluster upgrades for more in-depth
information.
All Kubernetes' nodes run in a standard Azure virtual machine (VM). These VMs can be Windows or Linux-
based. The Linux-based VMs use an Ubuntu image, with the OS configured to automatically check for updates
every night.
When you use the az aks upgrade command, Azure CLI creates a surge of new nodes with the latest security
and kernel updates. These nodes are initially cordoned to prevent any apps from being scheduled to them until
the update is finished. After completion, Azure cordons (makes the node unavailable for scheduling of new
workloads) and drains (moves the existent workloads to another node) the older nodes and uncordons the new
ones, effectively transferring all the scheduled applications to the new nodes.
This process is better than updating Linux-based kernels manually because Linux requires a reboot when a new
kernel update is installed. If you update the OS manually, you also need to reboot the VM, manually cordoning
and draining all the apps.
This article shows you how you can automate the update process of AKS nodes. You'll use GitHub Actions and
Azure CLI to create an update task based on cron that runs automatically.
5. Create a new job using the following example. This job is named upgrade-node , runs on an Ubuntu agent, and will
connect to your Azure CLI account to execute the needed steps to upgrade the nodes.
on:
  schedule:
    - cron: '0 3 */15 * *'

jobs:
  upgrade-node:
    runs-on: ubuntu-latest
on:
  schedule:
    - cron: '0 3 */15 * *'

jobs:
  upgrade-node:
    runs-on: ubuntu-latest
    steps:
      - name: Azure Login
        uses: Azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
5. From the Azure CLI, run the following command to generate a new username and password.
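A sketch of that command (in practice you would typically also scope the service principal with --role and --scopes):

az ad sp create-for-rbac -o json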
{
  "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "displayName": "azure-cli-xxxx-xx-xx-xx-xx-xx",
  "name": "http://azure-cli-xxxx-xx-xx-xx-xx-xx",
  "password": "xXxXxXxXx",
  "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
6. In a new browser window navigate to your GitHub repository and open the Settings tab of the
repository. Click Secrets, then click New Repository Secret.
7. For Name, use AZURE_CREDENTIALS .
8. For Value, add the entire contents from the output of the previous step where you created a new
username and password.
2. Click the copy button on the GitHub marketplace result and paste the contents of the action in the main
editor, below the Azure Login step, similar to the following:
name: Upgrade cluster node images

on:
  schedule:
    - cron: '0 3 */15 * *'

jobs:
  upgrade-node:
    runs-on: ubuntu-latest
    steps:
      - name: Azure Login
        uses: Azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Upgrade node images
        uses: Azure/cli@v1
        with:
          inlineScript: az aks upgrade -g {resourceGroupName} -n {aksClusterName} --node-image-only --yes
TIP
You can decouple the -g and -n parameters from the command by adding them to secrets similar to the
previous steps. Replace the {resourceGroupName} and {aksClusterName} placeholders by their secret
counterparts, for example ${{secrets.RESOURCE_GROUP_NAME}} and ${{secrets.AKS_CLUSTER_NAME}}
NOTE
To upgrade a single node pool instead of all node pools on the cluster, add the --name parameter to the
az aks nodepool upgrade command to specify the node pool name. For example:
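A sketch of such a command (placeholders as in the workflow above):

az aks nodepool upgrade -g {resourceGroupName} --cluster-name {aksClusterName} --name {nodePoolName} --node-image-only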
on:
  schedule:
    - cron: '0 3 */15 * *'
  workflow_dispatch:

jobs:
  upgrade-node:
    runs-on: ubuntu-latest
    steps:
      - name: Azure Login
        uses: Azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
Next steps
See the AKS release notes for information about the latest node images.
Learn how to upgrade the Kubernetes version with Upgrade an AKS cluster.
Learn more about multiple node pools and how to upgrade node pools with Create and manage multiple
node pools.
Learn more about system node pools
To learn how to save costs using Spot instances, see add a spot node pool to AKS
Apply security and kernel updates to Linux nodes in
Azure Kubernetes Service (AKS)
6/15/2022 • 5 minutes to read • Edit Online
To protect your clusters, security updates are automatically applied to Linux nodes in AKS. These updates include
OS security fixes or kernel updates. Some of these updates require a node reboot to complete the process. AKS
doesn't automatically reboot these Linux nodes to complete the update process.
The process to keep Windows Server nodes up to date is a little different. Windows Server nodes don't receive
daily updates. Instead, you perform an AKS upgrade that deploys new nodes with the latest base Windows Server
image and patches. For AKS clusters that use Windows Server nodes, see Upgrade a node pool in AKS.
This article shows you how to use the open-source kured (KUbernetes REboot Daemon) to watch for Linux
nodes that require a reboot, then automatically handle the rescheduling of running pods and node reboot
process.
NOTE
Kured is an open-source project by Weaveworks. Please direct issues to the kured GitHub. Additional support can be
found in the #weave-community Slack channel.
Some security updates, such as kernel updates, require a node reboot to finalize the process. A Linux node that
requires a reboot creates a file named /var/run/reboot-required. This reboot process doesn't happen
automatically.
You can use your own workflows and processes to handle node reboots, or use kured to orchestrate the
process. With kured , a DaemonSet is deployed that runs a pod on each Linux node in the cluster. These pods in
the DaemonSet watch for existence of the /var/run/reboot-required file, and then initiate a process to reboot the
nodes.
Node image upgrades
Unattended upgrades apply updates to the Linux node OS, but the image used to create nodes for your cluster
remains unchanged. If a new Linux node is added to your cluster, the original image is used to create the node.
This new node will receive all the security and kernel updates available during the automatic check every day
but will remain unpatched until all checks and restarts are complete.
Alternatively, you can use node image upgrade to check for and update node images used by your cluster. For
more details on node image upgrade, see Azure Kubernetes Service (AKS) node image upgrade.
Node upgrades
There is an additional process in AKS that lets you upgrade a cluster. An upgrade is typically to move to a newer
version of Kubernetes, not just apply node security updates. An AKS upgrade performs the following actions:
A new node is deployed with the latest security updates and Kubernetes version applied.
An old node is cordoned and drained.
Pods are scheduled on the new node.
The old node is deleted.
You can't remain on the same Kubernetes version during an upgrade event. You must specify a newer version of
Kubernetes. To upgrade to the latest version of Kubernetes, you can upgrade your AKS cluster.
# Create a dedicated namespace where you would like to deploy kured into
kubectl create namespace kured

# Add the kured Helm repository first if you haven't already (see the kured Helm chart documentation for the repository URL)

# Install kured in that namespace with Helm 3 (only on Linux nodes; kured doesn't run on Windows nodes)
helm install kured kured/kured --namespace kured --set nodeSelector."kubernetes\.io/os"=linux
You can also configure additional parameters for kured , such as integration with Prometheus or Slack. For
more information about additional configuration parameters, see the kured Helm chart.
Once the update process is complete, you can view the status of the nodes using the kubectl get nodes
command with the --output wide parameter. This additional output lets you see a difference in KERNEL-
VERSION of the underlying nodes, as shown in the following example output. The aks-nodepool1-28993262-0
was updated in a previous step and shows kernel version 4.15.0-1039-azure. The node aks-nodepool1-
28993262-1 that hasn't been updated shows kernel version 4.15.0-1037-azure.
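A minimal sketch of the command described above:

kubectl get nodes --output wide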
Next steps
This article detailed how to use kured to reboot Linux nodes automatically as part of the security update
process. To upgrade to the latest version of Kubernetes, you can upgrade your AKS cluster.
For AKS clusters that use Windows Server nodes, see Upgrade a node pool in AKS.
Configure an AKS cluster
6/15/2022 • 9 minutes to read • Edit Online
As part of creating an AKS cluster, you may need to customize your cluster configuration to suit your needs. This
article introduces a few options for customizing your AKS cluster.
OS configuration
AKS supports Ubuntu 18.04 as the default node operating system (OS) in general availability (GA) for clusters.
Containerd works on every GA version of Kubernetes in AKS, and in every upstream kubernetes version above
v1.19, and supports all Kubernetes and AKS features.
IMPORTANT
Clusters with Linux node pools created on Kubernetes v1.19 or greater default to containerd for its container runtime.
Clusters with node pools on earlier supported Kubernetes versions receive Docker for their container runtime. Linux
node pools will be updated to containerd once the node pool Kubernetes version is updated to a version that supports
containerd . You can still use Docker node pools and clusters on older supported versions until those fall off support.
Using containerd with Windows Server 2019 node pools is generally available, although the default for node pools
created on Kubernetes v1.22 and earlier is still Docker. For more details, see Add a Windows Server node pool with
containerd .
It is highly recommended to test your workloads on AKS node pools with containerd prior to using clusters with a
Kubernetes version that supports containerd for your node pools.
Containerd limitations/differences
For containerd , we recommend using crictl as a replacement CLI instead of the Docker CLI for
troubleshooting pods, containers, and container images on Kubernetes nodes (for example, crictl ps ).
It doesn't provide the complete functionality of the docker CLI. It's intended for troubleshooting only.
crictl offers a more kubernetes-friendly view of containers, with concepts like pods, etc. being
present.
Containerd sets up logging using the standardized cri logging format (which is different from what you
currently get from docker's json driver). Your logging solution needs to support the cri logging format (like
Azure Monitor for Containers).
You can no longer access the docker engine, /var/run/docker.sock , or use Docker-in-Docker (DinD).
If you currently extract application logs or monitoring data from Docker Engine, please use something
like Azure Monitor for Containers instead. Additionally AKS doesn't support running any out of band
commands on the agent nodes that could cause instability.
Even when using Docker, building images and directly leveraging the Docker engine via the methods
above is strongly discouraged. Kubernetes isn't fully aware of those consumed resources, and those
approaches present numerous issues detailed here and here, for example.
Building images - You can continue to use your current docker build workflow as normal, unless you are
building images inside your AKS cluster. In this case, please consider switching to the recommended
approach for building images using ACR Tasks, or a more secure in-cluster option like docker buildx.
Ephemeral OS
By default, Azure automatically replicates the operating system disk for a virtual machine to Azure storage to
avoid data loss should the VM need to be relocated to another host. However, since containers aren't designed
to have local state persisted, this behavior offers limited value while providing some drawbacks, including
slower node provisioning and higher read/write latency.
By contrast, ephemeral OS disks are stored only on the host machine, just like a temporary disk. This provides
lower read/write latency, along with faster node scaling and cluster upgrades.
Like the temporary disk, an ephemeral OS disk is included in the price of the virtual machine, so you incur no
additional storage costs.
IMPORTANT
When a user does not explicitly request managed disks for the OS, AKS will default to ephemeral OS if possible for a given
node pool configuration.
When using ephemeral OS, the OS disk must fit in the VM cache. The sizes for VM cache are available in the
Azure documentation in parentheses next to IO throughput ("cache size in GiB").
Using the AKS default VM size Standard_DS2_v2 with the default OS disk size of 100GB as an example, this VM
size supports ephemeral OS but only has 86GB of cache size. This configuration would default to managed disks
if the user does not specify explicitly. If a user explicitly requested ephemeral OS, they would receive a validation
error.
If a user requests the same Standard_DS2_v2 with a 60GB OS disk, this configuration would default to
ephemeral OS: the requested size of 60GB is smaller than the maximum cache size of 86GB.
Using Standard_D8s_v3 with 100GB OS disk, this VM size supports ephemeral OS and has 200GB of cache
space. If a user does not specify the OS disk type, the node pool would receive ephemeral OS by default.
Ephemeral OS requires at least version 2.15.0 of the Azure CLI.
Use Ephemeral OS on new clusters
Configure the cluster to use Ephemeral OS disks when the cluster is created. Use the --node-osdisk-type flag to
set Ephemeral OS as the OS disk type for the new cluster.
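A sketch of such a command (the VM size is chosen to satisfy the cache-size requirement discussed below):

az aks create --name myAKSCluster --resource-group myResourceGroup -s Standard_DS3_v2 --node-osdisk-type Ephemeral --generate-ssh-keys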
If you want to create a regular cluster using network-attached OS disks, you can do so by specifying
--node-osdisk-type=Managed . You can also choose to add more ephemeral OS node pools as per below.
IMPORTANT
With ephemeral OS you can deploy VM and instance images up to the size of the VM cache. In the AKS case, the default
node OS disk configuration uses 128GB, which means that you need a VM size that has a cache larger than 128GB. The
default Standard_DS2_v2 has a cache size of 86GB, which is not large enough. The Standard_DS3_v2 has a cache size of
172GB, which is large enough. You can also reduce the default size of the OS disk by using --node-osdisk-size . The
minimum size for AKS images is 30GB.
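For example, an ephemeral OS node pool can be added along these lines (a sketch):

az aks nodepool add --name mynodepool --cluster-name myAKSCluster --resource-group myResourceGroup -s Standard_DS3_v2 --node-osdisk-type Ephemeral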
If you want to create node pools with network-attached OS disks, you can do so by specifying
--node-osdisk-type Managed .
Custom resource group name
When you deploy an Azure Kubernetes Service cluster in Azure, a second resource group gets created for the
worker nodes. By default, AKS will name the node resource group MC_resourcegroupname_clustername_location ,
but you can also provide your own name.
To specify your own resource group name, install the aks-preview Azure CLI extension version 0.3.2 or later.
Using the Azure CLI, use the --node-resource-group parameter of the az aks create command to specify a
custom name for the resource group. If you use an Azure Resource Manager template to deploy an AKS cluster,
you can define the resource group name by using the nodeResourceGroup property.
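A sketch of the CLI form (the node resource group name is a placeholder):

az aks create --name myAKSCluster --resource-group myResourceGroup --node-resource-group myNodeResourceGroup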
The secondary resource group is automatically created by the Azure resource provider in your own subscription.
You can only specify the custom resource group name when the cluster is created.
As you work with the node resource group, keep in mind that you can't:
Specify an existing resource group for the node resource group.
Specify a different subscription for the node resource group.
Change the node resource group name after the cluster has been created.
Specify names for the managed resources within the node resource group.
Modify or delete Azure-created tags of managed resources within the node resource group.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
WARNING
Enabling or disabling the OIDC issuer changes the current service account token issuer to a new value, which causes some
downtime and restarts the API server. If your application pods that use service account tokens remain in a failed state
after you enable or disable the OIDC issuer, it's recommended to restart the pods manually.
You can check on the registration status by using the az feature list command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Next steps
Learn how to upgrade the node images in your cluster.
See Upgrade an Azure Kubernetes Service (AKS) cluster to learn how to upgrade your cluster to the latest
version of Kubernetes.
Read more about containerd and Kubernetes
See the list of Frequently asked questions about AKS to find answers to some common AKS questions.
Read more about Ephemeral OS disks.
Customize node configuration for Azure Kubernetes
Service (AKS) node pools
6/15/2022 • 9 minutes to read • Edit Online
Customizing your node configuration allows you to configure or tune your operating system (OS) settings or
the kubelet parameters to match the needs of the workloads. When you create an AKS cluster or add a node
pool to your cluster, you can customize a subset of commonly used OS and kubelet settings. To configure
settings beyond this subset, use a daemon set to customize your needed configurations without losing AKS
support for your nodes.
PARAMETER: cpuCfsQuotaPeriod
ALLOWED VALUES/INTERVAL: Interval in milliseconds (ms)
DEFAULT: 100ms
DESCRIPTION: Sets CPU CFS quota period value.
SETTING: net.ipv4.tcp_keepalive_probes
ALLOWED VALUES/INTERVAL: 1 - 15
DEFAULT: 9
DESCRIPTION: How many keepalive probes TCP sends out, until it decides that the connection is broken.

SETTING: net.ipv4.ip_local_port_range
ALLOWED VALUES/INTERVAL: First: 1024 - 60999 and Last: 32768 - 65000
DEFAULT: First: 32768 and Last: 60999
DESCRIPTION: The local port range that is used by TCP and UDP traffic to choose the local port. Comprised of two
numbers: the first number is the first local port allowed for TCP and UDP traffic on the agent node, the second is
the last local port number.

SETTING: net.ipv4.neigh.default.gc_thresh1
ALLOWED VALUES/INTERVAL: 128 - 80000
DEFAULT: 4096
DESCRIPTION: Minimum number of entries that may be in the ARP cache. Garbage collection won't be triggered if the
number of entries is below this setting.

SETTING: net.ipv4.neigh.default.gc_thresh2
ALLOWED VALUES/INTERVAL: 512 - 90000
DEFAULT: 8192
DESCRIPTION: Soft maximum number of entries that may be in the ARP cache. This setting is arguably the most
important, as ARP garbage collection will be triggered about 5 seconds after reaching this soft maximum.

SETTING: net.ipv4.neigh.default.gc_thresh3
ALLOWED VALUES/INTERVAL: 1024 - 100000
DEFAULT: 16384
DESCRIPTION: Hard maximum number of entries in the ARP cache.

SETTING: net.netfilter.nf_conntrack_max
ALLOWED VALUES/INTERVAL: 131072 - 1048576
DEFAULT: 131072
DESCRIPTION: nf_conntrack is a module that tracks connection entries for NAT within Linux. The nf_conntrack module
uses a hash table to record the established connection record of the TCP protocol. nf_conntrack_max is the maximum
number of nodes in the hash table, that is, the maximum number of connections supported by the nf_conntrack module
or the size of the connection tracking table.

SETTING: net.netfilter.nf_conntrack_buckets
ALLOWED VALUES/INTERVAL: 65536 - 147456
DEFAULT: 65536
DESCRIPTION: nf_conntrack is a module that tracks connection entries for NAT within Linux. The nf_conntrack module
uses a hash table to record the established connection record of the TCP protocol. nf_conntrack_buckets is the size
of the hash table.
Worker limits
Like file descriptor limits, the number of workers or threads that a process can create are limited by both a
kernel setting and user limits. The user limit on AKS is unlimited.
Virtual memory
The settings below can be used to tune the operation of the virtual memory (VM) subsystem of the Linux kernel
and the writeout of dirty data to disk.
IMPORTANT
For ease of search and readability the OS settings are displayed in this document by their name but should be added to
the configuration json file or AKS API using camelCase capitalization convention.
{
  "transparentHugePageEnabled": "madvise",
  "transparentHugePageDefrag": "defer+madvise",
  "swapFileSizeMB": 1500,
  "sysctls": {
    "netCoreSomaxconn": 163849,
    "netIpv4TcpTwReuse": true,
    "netIpv4IpLocalPortRange": "32000 60000"
  }
}
Create a new cluster specifying the kubelet and OS configurations using the JSON files created in the previous
step.
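A sketch of such a command (the JSON file names are placeholders for the files you created):

az aks create --name myAKSCluster --resource-group myResourceGroup --kubelet-config ./kubeletconfig.json --linux-os-config ./linuxosconfig.json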
NOTE
When you create a cluster, you can specify the kubelet configuration, OS configuration, or both. If you specify a
configuration when creating a cluster, only the nodes in the initial node pool will have that configuration applied. Any
settings not configured in the JSON file will retain the default value.
Add a new node pool specifying the Kubelet parameters using the JSON file you created.
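For example (a sketch; the file name is a placeholder):

az aks nodepool add --name mynodepool --cluster-name myAKSCluster --resource-group myResourceGroup --kubelet-config ./kubeletconfig.json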
NOTE
When you add a node pool to an existing cluster, you can specify the kubelet configuration, OS configuration, or both. If
you specify a configuration when adding a node pool, only the nodes in the new node pool will have that configuration
applied. Any settings not configured in the JSON file will retain the default value.
Other configuration
The settings below can be used to modify other Operating System settings.
Message of the Day
Pass the --message-of-the-day flag with the location of the file to replace the Message of the Day on Linux
nodes at cluster creation or node pool creation.
Cluster creation
Nodepool creation
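Sketches of both forms (the motd.txt path is a placeholder):

# Cluster creation
az aks create --name myAKSCluster --resource-group myResourceGroup --message-of-the-day ./motd.txt

# Node pool creation
az aks nodepool add --name mynodepool --cluster-name myAKSCluster --resource-group myResourceGroup --message-of-the-day ./motd.txt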
Next steps
Learn how to configure your AKS cluster.
Learn how to upgrade the node images in your cluster.
See Upgrade an Azure Kubernetes Service (AKS) cluster to learn how to upgrade your cluster to the latest
version of Kubernetes.
See the list of Frequently asked questions about AKS to find answers to some common AKS questions.
Authenticate with Azure Container Registry from
Azure Kubernetes Service
6/15/2022 • 4 minutes to read • Edit Online
When you're using Azure Container Registry (ACR) with Azure Kubernetes Service (AKS), an authentication
mechanism needs to be established. This operation is implemented as part of the CLI, PowerShell, and Portal
experience by granting the required permissions to your ACR. This article provides examples for configuring
authentication between these two Azure services.
You can set up the AKS to ACR integration in a few simple commands with the Azure CLI or Azure PowerShell.
This integration assigns the AcrPull role to the managed identity associated to the AKS Cluster.
NOTE
This article covers automatic authentication between AKS and ACR. If you need to pull an image from a private external
registry, use an image pull secret.
Owner, Azure account administrator, or Azure co-administrator role on the Azure subscription
Azure CLI version 2.7.0 or later
To avoid needing an Owner, Azure account administrator, or Azure co-administrator role, you can use an
existing managed identity to authenticate ACR from AKS. For more information, see Use an Azure managed
identity to authenticate to an Azure container registry.
# set this to the name of your Azure Container Registry. It must be globally unique
MYACR=myContainerRegistry
# Run the following line to create an Azure Container Registry if you do not already have one
az acr create -n $MYACR -g myContainerRegistryResourceGroup --sku basic
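To create a new AKS cluster with the registry attached, something like the following can be used (a sketch; $MYACR comes from the snippet above):

az aks create -n myAKSCluster -g myResourceGroup --generate-ssh-keys --attach-acr $MYACR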
NOTE
If you are using an ACR that is located in a different subscription from your AKS cluster, use the ACR resource ID when
attaching or detaching from an AKS cluster.
Integrate an existing ACR with existing AKS clusters by supplying valid values for acr-name or acr-resource-id,
as shown below.
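Sketches of both forms:

# Attach using the registry name
az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acr-name>

# Attach using the registry resource ID
az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acr-resource-id>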
NOTE
Running az aks update --attach-acr uses the permissions of the user running the command to create the role ACR
assignment. This role is assigned to the kubelet managed identity. For more information on the AKS managed identities,
see Summary of managed identities.
You can also remove the integration between an ACR and an AKS cluster with the following commands.
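For example (a sketch using the --detach-acr parameter):

# Detach using the registry name
az aks update -n myAKSCluster -g myResourceGroup --detach-acr <acr-name>

# Detach using the registry resource ID
az aks update -n myAKSCluster -g myResourceGroup --detach-acr <acr-resource-id>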
Create a file called acr-nginx.yaml that contains the following. Substitute the resource name of your registry
for acr-name . Example: myContainerRegistry.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx0-deployment
  labels:
    app: nginx0-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx0
  template:
    metadata:
      labels:
        app: nginx0
    spec:
      containers:
        - name: nginx
          image: <acr-name>.azurecr.io/nginx:v1
          ports:
            - containerPort: 80
Troubleshooting
Run the az aks check-acr command to validate that the registry is accessible from the AKS cluster.
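For example (a sketch; substitute your registry's login server):

az aks check-acr --resource-group myResourceGroup --name myAKSCluster --acr <acr-name>.azurecr.io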
Learn more about ACR Monitoring
Learn more about ACR Health
Create and configure an Azure Kubernetes Services
(AKS) cluster to use virtual nodes
6/15/2022 • 2 minutes to read • Edit Online
To rapidly scale application workloads in an AKS cluster, you can use virtual nodes. With virtual nodes, you have
quick provisioning of pods, and only pay per second for their execution time. You don't need to wait for
Kubernetes cluster autoscaler to deploy VM compute nodes to run the additional pods. Virtual nodes are only
supported with Linux pods and nodes.
The virtual nodes add-on for AKS, is based on the open source project Virtual Kubelet.
This article gives you an overview of the region availability and networking requirements for using virtual
nodes, as well as the known limitations.
Regional availability
All regions, where ACI supports VNET SKUs, are supported for virtual nodes deployments. For more details, see
Resource availability for Azure Container Instances in Azure regions.
For available CPU and memory SKUs in each region, see Resource availability for Azure Container Instances in
Azure regions - Linux container groups.
Network requirements
Virtual nodes enable network communication between pods that run in Azure Container Instances (ACI) and the
AKS cluster. To provide this communication, a virtual network subnet is created and delegated permissions are
assigned. Virtual nodes only work with AKS clusters created using advanced networking (Azure CNI). By default,
AKS clusters are created with basic networking (kubenet).
Pods running in Azure Container Instances (ACI) need access to the AKS API server endpoint, in order to
configure networking.
Known limitations
Virtual Nodes functionality is heavily dependent on ACI's feature set. In addition to the quotas and limits for
Azure Container Instances, the following scenarios are not yet supported with Virtual nodes:
Using service principal to pull ACR images. Workaround is to use Kubernetes secrets
Virtual Network Limitations including VNet peering, Kubernetes network policies, and outbound traffic to the
internet with network security groups.
Init containers
Host aliases
Arguments for exec in ACI
DaemonSets will not deploy pods to the virtual nodes
Virtual nodes support scheduling Linux pods. You can manually install the open source Virtual Kubelet ACI
provider to schedule Windows Server containers to ACI.
Virtual nodes require AKS clusters with Azure CNI networking.
Using api server authorized ip ranges for AKS.
Volume mounting of Azure Files shares supports General-purpose V1 storage accounts. Follow the instructions for
mounting a volume with an Azure Files share.
Using IPv6 is not supported.
Next steps
Configure virtual nodes for your clusters:
Create virtual nodes using Azure CLI
Create virtual nodes using the portal in Azure Kubernetes Services (AKS)
Virtual nodes are often one component of a scaling solution in AKS. For more information on scaling solutions,
see the following articles:
Use the Kubernetes horizontal pod autoscaler
Use the Kubernetes cluster autoscaler
Check out the Autoscale sample for Virtual Nodes
Read more about the Virtual Kubelet open source library
Create and configure an Azure Kubernetes Services
(AKS) cluster to use virtual nodes using the Azure
CLI
6/15/2022 • 8 minutes to read • Edit Online
This article shows you how to use the Azure CLI to create and configure the virtual network resources and AKS
cluster, then enable virtual nodes.
IMPORTANT
Before using virtual nodes with AKS, review both the limitations of AKS virtual nodes and the virtual networking
limitations of ACI. These limitations affect the location, networking configuration, and other configuration details of both
your AKS cluster and the virtual nodes.
If you have not previously used ACI, register the service provider with your subscription. You can check the
status of the ACI provider registration using the az provider list command, as shown in the following example:
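A sketch of such a check:

az provider list --query "[?contains(namespace,'Microsoft.ContainerInstance')]" -o table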
The Microsoft.ContainerInstance provider should report as Registered, as shown in the following example
output:
If the provider shows as NotRegistered, register the provider using the az provider register as shown in the
following example:
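For example:

az provider register --namespace Microsoft.ContainerInstance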
Now create an additional subnet for virtual nodes using the az network vnet subnet create command. The
following example creates a subnet named myVirtualNodeSubnet with the address prefix of 10.241.0.0/16.
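A sketch of that command, using the virtual network name referenced later in this article:

az network vnet subnet create \
    --resource-group myResourceGroup \
    --vnet-name myVnet \
    --name myVirtualNodeSubnet \
    --address-prefixes 10.241.0.0/16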
az ad sp create-for-rbac
Make a note of the appId and password. These values are used in the following steps.
To grant the correct access for the AKS cluster to use the virtual network, create a role assignment using the az
role assignment create command. Replace <appId > and <vnetId> with the values gathered in the previous two
steps.
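A sketch of the role assignment (the Contributor role is an assumption; scope it to the virtual network ID):

az role assignment create --assignee <appId> --scope <vnetId> --role Contributor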
az network vnet subnet show --resource-group myResourceGroup --vnet-name myVnet --name myAKSSubnet --query
id -o tsv
Use the az aks create command to create an AKS cluster. The following example creates a cluster named
myAKSCluster with one node. Replace <subnetId> with the ID obtained in the previous step, and then <appId>
and <password> with the values gathered in the previous section.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--network-plugin azure \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--docker-bridge-address 172.17.0.1/16 \
--vnet-subnet-id <subnetId> \
--service-principal <appId> \
--client-secret <password>
After several minutes, the command completes and returns JSON-formatted information about the cluster.
To enable virtual nodes, use the az aks enable-addons command with the virtual-node addon and specify the subnet created for virtual nodes:
az aks enable-addons \
--resource-group myResourceGroup \
--name myAKSCluster \
--addons virtual-node \
--subnet-name myVirtualNodeSubnet
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
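The command itself isn't shown above; a minimal form is:

kubectl get nodes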
The following example output shows the single VM node created and then the virtual node for Linux, virtual-
node-aci-linux:
Use the kubectl get pods command with the -o wide argument to output a list of pods and the scheduled node.
Notice that the aci-helloworld pod has been scheduled on the virtual-node-aci-linux node.
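The command isn't shown above; a minimal form is:

kubectl get pods -o wide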
The pod is assigned an internal IP address from the Azure virtual network subnet delegated for use with virtual
nodes.
NOTE
If you use images stored in Azure Container Registry, configure and use a Kubernetes secret. A current limitation of virtual
nodes is that you can't use integrated Azure AD service principal authentication. If you don't use a secret, pods scheduled
on virtual nodes fail to start and report the error: HTTP response status code 400, error code "InaccessibleImage".
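The step that creates a test pod for running curl from inside the cluster isn't shown above. A minimal sketch, assuming a throwaway Debian pod (the pod name and image are illustrative, not from the original article):

# Start an interactive test pod (illustrative name and image)
kubectl run -it --rm virtual-node-test --image=debian:stable -- /bin/bash
# Inside the pod, install curl before testing the address below
apt-get update && apt-get install -y curl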
Now access the address of your pod using curl , such as http://10.241.0.4. Provide your own internal IP address
shown in the previous kubectl get pods command:
curl -L http://10.241.0.4
The demo application is displayed, as shown in the following condensed example output:
<html>
<head>
<title>Welcome to Azure Container Instances!</title>
</head>
[...]
Close the terminal session to your test pod with exit . When your session is ended, the pod is deleted.
Next steps
In this article, a pod was scheduled on the virtual node and assigned a private, internal IP address. You could
instead create a service deployment and route traffic to your pod through a load balancer or ingress controller.
For more information, see Create a basic ingress controller in AKS.
Virtual nodes are often one component of a scaling solution in AKS. For more information on scaling solutions,
see the following articles:
Use the Kubernetes horizontal pod autoscaler
Use the Kubernetes cluster autoscaler
Check out the Autoscale sample for Virtual Nodes
Read more about the Virtual Kubelet open source library
Create and configure an Azure Kubernetes Services
(AKS) cluster to use virtual nodes in the Azure
portal
6/15/2022 • 5 minutes to read • Edit Online
This article shows you how to use the Azure portal to create and configure the virtual network resources and an
AKS cluster with virtual nodes enabled.
NOTE
This article assumes you have reviewed the region availability and limitations of using virtual nodes.
The Microsoft.ContainerInstance provider should report as Registered, as shown in the following example
output:
If the provider shows as NotRegistered, register the provider using the az provider register as shown in the
following example:
Sign in to Azure
Sign in to the Azure portal at https://portal.azure.com.
By default, a cluster identity is created. This cluster identity is used for cluster communication and integration
with other Azure services. By default, this cluster identity is a managed identity. For more information, see Use
managed identities. You can also use a service principal as your cluster identity.
The cluster is also configured for advanced networking. The virtual nodes are configured to use their own Azure
virtual network subnet. This subnet has delegated permissions to connect Azure resources between the AKS
cluster and ACI. If you don't already have a delegated subnet, the Azure portal creates and configures the Azure
virtual network and subnet for use with the virtual nodes.
Select Review + create . After the validation is complete, select Create .
It takes a few minutes to create the AKS cluster and to be ready for use.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
The following example output shows the single VM node created and then the virtual node for Linux, virtual-
node-aci-linux:
To test the virtual node, deploy a sample application by creating a file (for example, virtual-node.yaml; the filename here is illustrative) and copying in the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aci-helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aci-helloworld
  template:
    metadata:
      labels:
        app: aci-helloworld
    spec:
      containers:
      - name: aci-helloworld
        image: mcr.microsoft.com/azuredocs/aci-helloworld
        ports:
        - containerPort: 80
      nodeSelector:
        kubernetes.io/role: agent
        beta.kubernetes.io/os: linux
        type: virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
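Apply the manifest with kubectl apply (the filename is the illustrative one suggested above):

kubectl apply -f virtual-node.yaml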
Use the kubectl get pods command with the -o wide argument to output a list of pods and the scheduled node.
Notice that the aci-helloworld pod has been scheduled on the virtual-node-aci-linux node.
The pod is assigned an internal IP address from the Azure virtual network subnet delegated for use with virtual
nodes.
NOTE
If you use images stored in Azure Container Registry, configure and use a Kubernetes secret. A current limitation of virtual
nodes is that you can't use integrated Azure AD service principal authentication. If you don't use a secret, pods scheduled
on virtual nodes fail to start and report the error: HTTP response status code 400, error code "InaccessibleImage".
Now access the address of your pod using curl , such as http://10.241.0.4. Provide your own internal IP address
shown in the previous kubectl get pods command:
curl -L http://10.241.0.4
The demo application is displayed, as shown in the following condensed example output:
<html>
<head>
<title>Welcome to Azure Container Instances!</title>
</head>
[...]
Close the terminal session to your test pod with exit . When your session is ended, the pod is deleted.
Next steps
In this article, a pod was scheduled on the virtual node and assigned a private, internal IP address. You could
instead create a service deployment and route traffic to your pod through a load balancer or ingress controller.
For more information, see Create a basic ingress controller in AKS.
Virtual nodes are one component of a scaling solution in AKS. For more information on scaling solutions, see
the following articles:
Use the Kubernetes horizontal pod autoscaler
Use the Kubernetes cluster autoscaler
Check out the Autoscale sample for Virtual Nodes
Read more about the Virtual Kubelet open source library
Automatically scale a cluster to meet application
demands on Azure Kubernetes Service (AKS)
6/15/2022 • 12 minutes to read • Edit Online
To keep up with application demands in Azure Kubernetes Service (AKS), you may need to adjust the number of
nodes that run your workloads. The cluster autoscaler component can watch for pods in your cluster that can't
be scheduled because of resource constraints. When issues are detected, the number of nodes in a node pool is
increased to meet the application demand. Nodes are also regularly checked for a lack of running pods, with the
number of nodes then decreased as needed. This ability to automatically scale up or down the number of nodes
in your AKS cluster lets you run an efficient, cost-effective cluster.
This article shows you how to enable and manage the cluster autoscaler in an AKS cluster.
Both the horizontal pod autoscaler and cluster autoscaler can also decrease the number of pods and nodes as
needed. The cluster autoscaler decreases the number of nodes when there has been unused capacity for a
period of time. Pods on a node to be removed by the cluster autoscaler are safely scheduled elsewhere in the
cluster. The cluster autoscaler may be unable to scale down if pods can't move, such as in the following
situations:
A pod is directly created and isn't backed by a controller object, such as a deployment or replica set.
A pod disruption budget (PDB) is too restrictive and doesn't allow the number of pods to fall below a
certain threshold.
A pod uses node selectors or anti-affinity that can't be honored if scheduled on a different node.
For more information about how the cluster autoscaler may be unable to scale down, see What types of pods
can prevent the cluster autoscaler from removing a node?
The cluster autoscaler uses startup parameters for things like time intervals between scale events and resource
thresholds. For more information on what parameters the cluster autoscaler uses, see Using the autoscaler
profile.
The cluster and horizontal pod autoscalers can work together, and are often both deployed in a cluster. When
combined, the horizontal pod autoscaler is focused on running the number of pods required to meet application
demand. The cluster autoscaler is focused on running the number of nodes required to support the scheduled
pods.
NOTE
Manual scaling is disabled when you use the cluster autoscaler. Let the cluster autoscaler determine the required number
of nodes. If you want to manually scale your cluster, disable the cluster autoscaler.
IMPORTANT
The cluster autoscaler is a Kubernetes component. Although the AKS cluster uses a virtual machine scale set for the
nodes, don't manually enable or edit settings for scale set autoscale in the Azure portal or using the Azure CLI. Let the
Kubernetes cluster autoscaler manage the required scale settings. For more information, see Can I modify the AKS
resources in the node resource group?
The following example creates an AKS cluster with a single node pool backed by a virtual machine scale set. It
also enables the cluster autoscaler on the node pool for the cluster and sets a minimum of 1 and maximum of 3
nodes:
# Now create the AKS cluster and enable the cluster autoscaler
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
It takes a few minutes to create the cluster and configure the cluster autoscaler settings.
The following example updates an existing AKS cluster to enable the cluster autoscaler on the node pool for the
cluster and sets a minimum of 1 and maximum of 3 nodes:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
It takes a few minutes to update the cluster and configure the cluster autoscaler settings.
In the previous step to create an AKS cluster or update an existing node pool, the cluster autoscaler minimum
node count was set to 1, and the maximum node count was set to 3. As your application demands change, you
may need to adjust the cluster autoscaler node count.
To change the node count, use the az aks update command.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--update-cluster-autoscaler \
--min-count 1 \
--max-count 5
The above example updates cluster autoscaler on the single node pool in myAKSCluster to a minimum of 1 and
maximum of 5 nodes.
NOTE
The cluster autoscaler will enforce the minimum count in cases where the actual count drops below the minimum due to
external factors, such as during a spot eviction or when changing the minimum count value from the AKS API.
Monitor the performance of your applications and services, and adjust the cluster autoscaler node counts to
match the required performance.
IMPORTANT
The cluster autoscaler profile affects all node pools that use the cluster autoscaler. You can't set an autoscaler profile per
node pool.
The cluster autoscaler profile requires version 2.11.1 or greater of the Azure CLI. If you need to install or upgrade, see
Install Azure CLI.
To set the cluster autoscaler profile, use the az aks update command with the cluster-autoscaler-profile parameter. The following example sets the scan interval to 30 seconds:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scan-interval=30s
When you enable the cluster autoscaler on node pools in the cluster, those node pools also use the cluster
autoscaler profile. For example:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3 \
--cluster-autoscaler-profile scan-interval=30s
The above command creates an AKS cluster and defines the scan interval as 30 seconds for the cluster-wide
autoscaler profile. The command also enables the cluster autoscaler on the initial node pool, sets the minimum
node count to 1 and the maximum node count to 3.
Reset cluster autoscaler profile to default values
Use the az aks update command to reset the cluster autoscaler profile on your cluster.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile ""
To disable the cluster autoscaler on a cluster, use the az aks update command with the --disable-cluster-autoscaler parameter:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--disable-cluster-autoscaler
You can manually scale your cluster after disabling the cluster autoscaler by using the az aks scale command. If
you use the horizontal pod autoscaler, that feature continues to run with the cluster autoscaler disabled, but
pods may end up unable to be scheduled if all node resources are in use.
To view the cluster autoscaler logs from your Log Analytics workspace, run the following query:
AzureDiagnostics
| where Category == "cluster-autoscaler"
You should see logs similar to the following example as long as there are logs to retrieve.
The cluster autoscaler will also write out health status to a configmap named cluster-autoscaler-status . To
retrieve these logs, execute the following kubectl command. A health status will be reported for each node
pool configured with the cluster autoscaler.
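The kubectl command isn't shown above; a sketch, assuming the configmap lives in the kube-system namespace:

kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml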
To learn more about what is logged from the autoscaler, read the FAQ on the Kubernetes/autoscaler GitHub
project.
The cluster autoscaler can be disabled with az aks nodepool update and passing the
--disable-cluster-autoscaler parameter.
If you wish to re-enable the cluster autoscaler on an existing cluster, you can re-enable it using the az aks
nodepool update command, specifying the --enable-cluster-autoscaler , --min-count , and --max-count
parameters.
NOTE
If you are planning on using the cluster autoscaler with node pools that span multiple zones and leverage scheduling
features related to zones, such as volume topological scheduling, the recommendation is to have one node pool per zone
and to enable --balance-similar-node-groups through the autoscaler profile. This ensures that the autoscaler can
scale up successfully and keeps the sizes of the node pools balanced.
Next steps
This article showed you how to automatically scale the number of AKS nodes. You can also use the horizontal
pod autoscaler to automatically adjust the number of pods that run your application. For steps on using the
horizontal pod autoscaler, see Scale applications in AKS.
Create an Azure Kubernetes Service (AKS) cluster
that uses availability zones
6/15/2022 • 7 minutes to read • Edit Online
An Azure Kubernetes Service (AKS) cluster distributes resources such as nodes and storage across logical
sections of underlying Azure infrastructure. When using availability zones, this deployment model ensures that
nodes in a given availability zone are physically separated from those defined in another availability zone. AKS
clusters deployed with multiple availability zones configured across a cluster provide a higher level of
availability to protect against a hardware failure or a planned maintenance event.
By defining node pools in a cluster to span multiple zones, nodes in a given node pool are able to continue
operating even if a single zone has gone down. Your applications can remain available even during a physical
failure in a single datacenter, provided they're orchestrated to tolerate the failure of a subset of nodes.
This article shows you how to create an AKS cluster and distribute the node components across availability
zones.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--generate-ssh-keys \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--node-count 3 \
--zones 1 2 3
Next, use the kubectl describe command to list the nodes in the cluster and filter on the
topology.kubernetes.io/zone value. The following example is for a Bash shell.
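The command isn't shown above; the same filter used later in this article is:

kubectl describe nodes | grep -e "Name:" -e "topology.kubernetes.io/zone"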
The following example output shows the three nodes distributed across the specified region and availability
zones, such as eastus2-1 for the first availability zone and eastus2-2 for the second availability zone:
Name: aks-nodepool1-28993262-vmss000000
topology.kubernetes.io/zone=eastus2-1
Name: aks-nodepool1-28993262-vmss000001
topology.kubernetes.io/zone=eastus2-2
Name: aks-nodepool1-28993262-vmss000002
topology.kubernetes.io/zone=eastus2-3
As you add additional nodes to an agent pool, the Azure platform automatically distributes the underlying VMs
across the specified availability zones.
Note that in newer Kubernetes versions (1.17.0 and later), AKS uses the newer label
topology.kubernetes.io/zone in addition to the deprecated failure-domain.beta.kubernetes.io/zone . You can get
the same result as above by running the following script:
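The script itself isn't shown above; one sketch that produces the same mapping using the deprecated label:

kubectl describe nodes | grep -e "Name:" -e "failure-domain.beta.kubernetes.io/zone"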
To verify that new nodes are also spread across the availability zones, scale the cluster from 3 to 5 nodes with the az aks scale command:
az aks scale \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 5
When the scale operation completes after a few minutes, the command
kubectl describe nodes | grep -e "Name:" -e "topology.kubernetes.io/zone" in a Bash shell should give an
output similar to this sample:
Name: aks-nodepool1-28993262-vmss000000
topology.kubernetes.io/zone=eastus2-1
Name: aks-nodepool1-28993262-vmss000001
topology.kubernetes.io/zone=eastus2-2
Name: aks-nodepool1-28993262-vmss000002
topology.kubernetes.io/zone=eastus2-3
Name: aks-nodepool1-28993262-vmss000003
topology.kubernetes.io/zone=eastus2-1
Name: aks-nodepool1-28993262-vmss000004
topology.kubernetes.io/zone=eastus2-2
We now have two additional nodes in zones 1 and 2. You can deploy an application consisting of three replicas.
We will use NGINX as an example:
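The deployment commands aren't shown above; a sketch that reuses the NGINX image referenced later in this document (--replicas requires a recent kubectl; the image tag is taken from the toleration example further down):

kubectl create deployment nginx --image=mcr.microsoft.com/oss/nginx/nginx:1.15.9-alpine --replicas=3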
By viewing nodes where your pods are running, you see pods are running on the nodes corresponding to three
different availability zones. For example, with the command
kubectl describe pod | grep -e "^Name:" -e "^Node:" in a Bash shell you would get an output similar to this:
Name: nginx-6db489d4b7-ktdwg
Node: aks-nodepool1-28993262-vmss000000/10.240.0.4
Name: nginx-6db489d4b7-v7zvj
Node: aks-nodepool1-28993262-vmss000002/10.240.0.6
Name: nginx-6db489d4b7-xz6wj
Node: aks-nodepool1-28993262-vmss000004/10.240.0.8
As you can see from the previous output, the first pod is running on node 0, which is located in the availability
zone eastus2-1 . The second pod is running on node 2, which corresponds to eastus2-3 , and the third one in
node 4, in eastus2-2 . Without any additional configuration, Kubernetes is spreading the pods correctly across
all three availability zones.
Next steps
This article detailed how to create an AKS cluster that uses availability zones. For more considerations on highly
available clusters, see Best practices for business continuity and disaster recovery in AKS.
Azure Kubernetes Service (AKS) node pool
snapshot
6/15/2022 • 3 minutes to read • Edit Online
AKS releases a new node image weekly, and every new cluster, new node pool, or cluster upgrade always
receives the latest image. This can make it hard to keep your environments consistent and repeatable.
Node pool snapshots allow you to take a configuration snapshot of your node pool and then create new node
pools or new clusters based off that snapshot, for as long as that configuration and Kubernetes version are
supported. For more information on the supportability windows, see Supported Kubernetes versions in AKS.
The snapshot is an Azure resource that contains the configuration information from the source node pool,
such as the node image version, Kubernetes version, OS type, and OS SKU. You can then reference this snapshot
resource and the respective values of its configuration to create any new node pool or cluster based off of it.
IMPORTANT
Your AKS node pool must be created or upgraded after Nov 10th, 2021 in order for a snapshot to be taken from it. If you
are using the aks-preview Azure CLI extension version 0.5.59 or newer, the commands for node pool snapshot have
changed. For updated commands, see the Node Pool Snapshot CLI reference.
Now, to take a snapshot from the previous node pool you'll use the az aks snapshot CLI command.
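The snapshot commands aren't shown above; a sketch, assuming the aks-preview az aks nodepool snapshot create command and a source node pool named nodepool1 (both assumptions, modeled on the snapshot show command used below):

# Get the ID of the source node pool (pool name is an assumption)
NODEPOOL_ID=$(az aks nodepool show --name nodepool1 --cluster-name myAKSCluster --resource-group myResourceGroup --query id -o tsv)
# Create the snapshot (command and parameter names are assumptions)
az aks nodepool snapshot create --name MySnapshot --resource-group myResourceGroup --nodepool-id $NODEPOOL_ID --location eastus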
SNAPSHOT_ID=$(az aks nodepool snapshot show --name MySnapshot --resource-group myResourceGroup --query id -o
tsv)
Now, we can use the command below to add a new node pool based off of this snapshot.
az aks nodepool add --name np2 --cluster-name myAKSCluster --resource-group myResourceGroup --snapshot-id
$SNAPSHOT_ID
SNAPSHOT_ID=$(az aks nodepool snapshot show --name MySnapshot --resource-group myResourceGroup --query id -o
tsv)
Now, we can use this command to upgrade this node pool to this snapshot configuration.
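The upgrade command isn't shown above; a sketch, assuming az aks nodepool upgrade accepts a --snapshot-id parameter and targeting the np2 pool created earlier (both assumptions):

az aks nodepool upgrade --name np2 --cluster-name myAKSCluster --resource-group myResourceGroup --snapshot-id $SNAPSHOT_ID --no-wait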
NOTE
Your node pool image version will be the same as the one contained in the snapshot, and will remain the same throughout every
scale operation. However, if this node pool is upgraded or a node image upgrade is performed without providing a
snapshot-id, the node image will be upgraded to the latest.
SNAPSHOT_ID=$(az aks nodepool snapshot show --name MySnapshot --resource-group myResourceGroup --query id -o
tsv)
Now, we can use this command to create this cluster off of the snapshot configuration.
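The create command isn't shown above; a sketch, assuming az aks create accepts a --snapshot-id parameter (an assumption) and using an illustrative cluster name:

az aks create --name myAKSCluster2 --resource-group myResourceGroup --snapshot-id $SNAPSHOT_ID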
Next steps
See the AKS release notes for information about the latest node images.
Learn how to upgrade the Kubernetes version with Upgrade an AKS cluster.
Learn how to upgrade your node image version with Node Image Upgrade
Learn more about multiple node pools and how to upgrade node pools with Create and manage multiple
node pools.
Add Azure Dedicated Host to an Azure Kubernetes
Service (AKS) cluster (Preview)
6/15/2022 • 5 minutes to read • Edit Online
Azure Dedicated Host is a service that provides physical servers - able to host one or more virtual machines -
dedicated to one Azure subscription. Dedicated hosts are the same physical servers used in our data centers,
provided as a resource. You can provision dedicated hosts within a region, availability zone, and fault domain.
Then, you can place VMs directly into your provisioned hosts, in whatever configuration best meets your needs.
Using Azure Dedicated Hosts for nodes with your AKS cluster has the following benefits:
Hardware isolation at the physical server level. No other VMs will be placed on your hosts. Dedicated hosts
are deployed in the same data centers and share the same network and underlying storage infrastructure as
other, non-isolated hosts.
Control over maintenance events initiated by the Azure platform. While most maintenance events have little
to no impact on your virtual machines, there are some sensitive workloads where each second of pause can
have an impact. With dedicated hosts, you can opt in to a maintenance window to reduce the impact to your
service.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature
list command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
Limitations
The following limitations apply when you integrate Azure Dedicated Host with Azure Kubernetes Service:
An existing agent pool can't be converted from non-ADH to ADH, or from ADH to non-ADH.
Updating an agent pool from host group A to host group B isn't supported.
Using ADH across subscriptions isn't supported.
az vm host create \
--host-group myHostGroup \
--name myHost \
--sku DSv3-Type1 \
--platform-fault-domain 0 \
-g myDHResourceGroup
az role assignment create --assignee <id> --role "Contributor" --scope <Resource id>
Next steps
In this article, you learned how to create an AKS cluster with a Dedicated host, and to add a dedicated host to an
existing cluster. For more information about Dedicated Hosts, see Azure Dedicated Hosts.
Create and manage multiple node pools for a
cluster in Azure Kubernetes Service (AKS)
6/15/2022 • 26 minutes to read • Edit Online
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools.
These node pools contain the underlying VMs that run your applications. The initial number of nodes and their
size (SKU) is defined when you create an AKS cluster, which creates a system node pool. To support applications
that have different compute or storage demands, you can create additional user node pools. System node pools
serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. User node pools
serve the primary purpose of hosting your application pods. However, application pods can be scheduled on
system node pools if you wish to only have one pool in your AKS cluster. User node pools are where you place
your application-specific pods. For example, use these additional user node pools to provide GPUs for compute-
intensive applications, or access to high-performance SSD storage.
NOTE
This feature enables more granular control over how to create and manage multiple node pools. As a result, separate commands
are required for create/update/delete operations. Previously, cluster operations through az aks create or az aks update used
the managedCluster API and were the only option to change your control plane and a single node pool. This feature
exposes a separate operation set for agent pools through the agentPool API and requires use of the az aks nodepool
command set to execute operations on an individual node pool.
This article shows you how to create and manage multiple node pools in an AKS cluster.
Limitations
The following limitations apply when you create and manage AKS clusters that support multiple node pools:
See Quotas, virtual machine size restrictions, and region availability in Azure Kubernetes Service (AKS).
You can delete system node pools, provided you have another system node pool to take its place in the AKS
cluster.
System pools must contain at least one node, and user node pools may contain zero or more nodes.
The AKS cluster must use the Standard SKU load balancer to use multiple node pools; the feature isn't
supported with Basic SKU load balancers.
The AKS cluster must use virtual machine scale sets for the nodes.
You can't change the VM size of a node pool after you create it.
The name of a node pool may only contain lowercase alphanumeric characters and must begin with a
lowercase letter. For Linux node pools the length must be between 1 and 12 characters, for Windows node
pools the length must be between 1 and 6 characters.
All node pools must reside in the same virtual network.
When creating multiple node pools at cluster create time, all Kubernetes versions used by node pools must
match the version set for the control plane. This can be updated after the cluster has been provisioned by
using per node pool operations.
Create an AKS cluster
IMPORTANT
If you run a single system node pool for your AKS cluster in a production environment, we recommend you use at least
three nodes for the node pool.
To get started, create an AKS cluster with a single node pool. The following example uses the az group create
command to create a resource group named myResourceGroup in the eastus region. An AKS cluster named
myAKSCluster is then created using the az aks create command.
NOTE
The Basic load balancer SKU is not supported when using multiple node pools. By default, AKS clusters are created with
the Standard load balancer SKU from the Azure CLI and Azure portal.
NOTE
To ensure your cluster operates reliably, you should run at least 2 (two) nodes in the default node pool, as essential
system services are running across this node pool.
When the cluster is ready, use the az aks get-credentials command to get the cluster credentials for use with
kubectl :
To see the status of your node pools, use the az aks nodepool list command and specify your resource group
and cluster name:
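The list command isn't shown above; a sketch using the names from this article:

az aks nodepool list --resource-group myResourceGroup --cluster-name myAKSCluster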
The following example output shows that mynodepool has been successfully created with three nodes in the
node pool. When the AKS cluster was created in the previous step, a default nodepool1 was created with a node
count of 2.
[
{
...
"count": 3,
...
"name": "mynodepool",
"orchestratorVersion": "1.15.7",
...
"vmSize": "Standard_DS2_v2",
...
},
{
...
"count": 2,
...
"name": "nodepool1",
"orchestratorVersion": "1.15.7",
...
"vmSize": "Standard_DS2_v2",
...
}
]
TIP
If no VmSize is specified when you add a node pool, the default size is Standard_D2s_v3 for Windows node pools and
Standard_DS2_v2 for Linux node pools. If no OrchestratorVersion is specified, it defaults to the same version as the
control plane.
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
Limitations
All subnets assigned to node pools must belong to the same virtual network.
System pods must have access to all nodes/pods in the cluster to provide critical functionality such as DNS
resolution and tunneling kubectl logs/exec/port-forward proxy.
If you expand your VNet after creating the cluster, you must update your cluster (perform any managed
cluster operation; node pool operations don't count) before adding a subnet outside the original CIDR. Although
AKS originally allowed it, adding an agent pool with a subnet outside the original CIDR now errors out. The aks-preview Azure CLI
extension (version 0.5.66+) now supports running az aks update -g <resourceGroup> -n <clusterName>
without any optional arguments. This command will perform an update operation without making any
changes, which can recover a cluster stuck in a failed state.
In clusters with Kubernetes version < 1.23.3, kube-proxy will SNAT traffic from new subnets, which can cause
Azure Network Policy to drop the packets.
Windows nodes will SNAT traffic to the new subnets until the node pool is reimaged.
Internal load balancers default to one of the node pool subnets (usually the first subnet of the node pool at
cluster creation). To override this behavior, you can specify the load balancer's subnet explicitly using an
annotation.
To create a node pool with a dedicated subnet, pass the subnet resource ID as an additional parameter when
creating a node pool.
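A sketch of such a command, using an illustrative pool name and a placeholder subnet ID:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool2 \
    --node-count 3 \
    --vnet-subnet-id <subnet-resource-id>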
The commands in this section explain how to upgrade a single specific node pool. The relationship between
upgrading the Kubernetes version of the control plane and the node pool is explained in the section below.
NOTE
The node pool OS image version is tied to the Kubernetes version of the cluster. You will only get OS image upgrades,
following a cluster upgrade.
Since there are two node pools in this example, we must use az aks nodepool upgrade to upgrade a node pool.
To see the available upgrades, use az aks get-upgrades:
az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster
Let's upgrade the mynodepool. Use the az aks nodepool upgrade command to upgrade the node pool, as shown
in the following example:
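The upgrade command isn't shown above; a sketch:

az aks nodepool upgrade \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --kubernetes-version KUBERNETES_VERSION \
    --no-wait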
List the status of your node pools again using the az aks nodepool list command. The following example shows
that mynodepool is in the Upgrading state to KUBERNETES_VERSION:
[
{
...
"count": 3,
...
"name": "mynodepool",
"orchestratorVersion": "KUBERNETES_VERSION",
...
"provisioningState": "Upgrading",
...
"vmSize": "Standard_DS2_v2",
...
},
{
...
"count": 2,
...
"name": "nodepool1",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Succeeded",
...
"vmSize": "Standard_DS2_v2",
...
}
]
An AKS cluster has two cluster resource objects with Kubernetes versions associated.
1. A cluster control plane Kubernetes version.
2. A node pool with a Kubernetes version.
A control plane maps to one or many node pools. The behavior of an upgrade operation depends on which
Azure CLI command is used.
Upgrading an AKS control plane requires using az aks upgrade . This command upgrades the control plane
version and all node pools in the cluster.
Issuing the az aks upgrade command with the --control-plane-only flag upgrades only the cluster control
plane. None of the associated node pools in the cluster are changed.
Upgrading individual node pools requires using az aks nodepool upgrade . This command upgrades only the
target node pool with the specified Kubernetes version.
Validation rules for upgrades
The valid Kubernetes upgrades for a cluster's control plane and node pools are validated by the following sets of
rules.
Rules for valid versions to upgrade node pools:
The node pool version must have the same major version as the control plane.
The node pool minor version must be within two minor versions of the control plane version.
The node pool version can't be greater than the control plane major.minor.patch version.
Rules for submitting an upgrade operation:
You can't downgrade the control plane or a node pool Kubernetes version.
If a node pool Kubernetes version isn't specified, behavior depends on the client being used.
Declaration in Resource Manager templates falls back to the existing version defined for the node pool
if one is set; if none is set, the control plane version is used.
You can either upgrade or scale a control plane or a node pool at a given time; you can't submit
multiple operations on a single control plane or node pool resource simultaneously.
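The scale command referenced in the next paragraph isn't shown in this section; a sketch of scaling mynodepool to five nodes with az aks nodepool scale:

az aks nodepool scale \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --node-count 5 \
    --no-wait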
List the status of your node pools again using the az aks nodepool list command. The following example shows
that mynodepool is in the Scaling state with a new count of 5 nodes:
[
{
...
"count": 5,
...
"name": "mynodepool",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Scaling",
...
"vmSize": "Standard_DS2_v2",
...
},
{
...
"count": 2,
...
"name": "nodepool1",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Succeeded",
...
"vmSize": "Standard_DS2_v2",
...
}
]
When you delete a node pool, AKS doesn't perform cordon and drain, and there are no recovery options for
data loss that may occur when you delete a node pool. If pods can't be scheduled on other node pools, those
applications become unavailable. Make sure you don't delete a node pool when in-use applications don't have
data backups or the ability to run on other node pools in your cluster. To minimize the disruption of rescheduling
pods currently running on the node pool you are going to delete, perform a cordon and drain on all nodes in the
node pool before deleting. For more information, see cordon and drain node pools.
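The delete command isn't shown above; a sketch:

az aks nodepool delete \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --no-wait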
The following example output from the az aks nodepool list command shows that mynodepool is in the
Deleting state:
[
{
...
"count": 5,
...
"name": "mynodepool",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Deleting",
...
"vmSize": "Standard_DS2_v2",
...
},
{
...
"count": 2,
...
"name": "nodepool1",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Succeeded",
...
"vmSize": "Standard_DS2_v2",
...
}
]
It takes a few minutes to delete the nodes and the node pool.
As your application workload demands change, you may associate node pools with capacity reservation groups created
prior. This ensures guaranteed capacity is allocated for your node pools.
For more information on capacity reservation groups, please refer to Capacity Reservation Groups.
Associating a node pool with an existing capacity reservation group can be done using the az aks nodepool add
command and specifying a capacity reservation group with the --capacityReservationGroup flag. The capacity
reservation group should already exist; otherwise, the node pool will be added to the cluster with a warning and
no capacity reservation group gets associated.
Associating a system node pool with an existing capacity reservation group can be done using az aks create
command. If the capacity reservation group specified doesn't exist, then a warning is issued and the cluster gets
created without any capacity reservation group association.
Deleting a node pool implicitly dissociates that node pool from any associated capacity reservation
group before the node pool is deleted.
Deleting a cluster implicitly dissociates all node pools in the cluster from their associated capacity
reservation groups.
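The command that adds the GPU pool referenced below isn't shown in this section; a sketch, where the VM size is an assumption (any GPU-enabled size works):

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name gpunodepool \
    --node-count 1 \
    --node-vm-size Standard_NC6s_v3 \
    --no-wait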
The following example output from the az aks nodepool list command shows that gpunodepool is Creating
nodes with the specified VmSize:
IMPORTANT
Adding taints, labels, or tags to nodes should be done for the entire node pool using az aks nodepool . Applying taints,
labels, or tags to individual nodes in a node pool using kubectl is not recommended.
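The command that creates the tainted pool referenced below isn't shown above; a sketch using the sku=gpu:NoSchedule taint applied later in this section:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --node-count 1 \
    --node-taints sku=gpu:NoSchedule \
    --no-wait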
The following example output from the az aks nodepool list command shows that taintnp is Creating nodes with
the specified nodeTaints:
The taint information is visible in Kubernetes for handling scheduling rules for nodes. The Kubernetes scheduler
can use taints and tolerations to restrict what workloads can run on nodes.
A taint is applied to a node to indicate that only specific pods can be scheduled on it.
A toleration is then applied to a pod to allow it to tolerate a node's taint.
For more information on how to use advanced Kubernetes scheduler features, see Best practices for advanced
scheduler features in AKS.
In the previous step, you applied the sku=gpu:NoSchedule taint when you created your node pool. The
following basic example YAML manifest uses a toleration to allow the Kubernetes scheduler to run an NGINX
pod on a node in that node pool.
Create a file named nginx-toleration.yaml and copy in the following example YAML:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - image: mcr.microsoft.com/oss/nginx/nginx:1.15.9-alpine
    name: mypod
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 1
        memory: 2G
  tolerations:
  - key: "sku"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
It takes a few seconds to schedule the pod and pull the NGINX image. Use the kubectl describe pod command to
view the pod status. The following condensed example output shows the sku=gpu:NoSchedule toleration is
applied. In the events section, the scheduler has assigned the pod to the aks-taintnp-28993262-vmss000000
node:
[...]
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
sku=gpu:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m48s default-scheduler Successfully assigned default/mypod to aks-taintnp-28993262-
vmss000000
Normal Pulling 4m47s kubelet pulling image "mcr.microsoft.com/oss/nginx/nginx:1.15.9-
alpine"
Normal Pulled 4m43s kubelet Successfully pulled image
"mcr.microsoft.com/oss/nginx/nginx:1.15.9-alpine"
Normal Created 4m40s kubelet Created container
Normal Started 4m40s kubelet Started container
Only pods that have this toleration applied can be scheduled on nodes in taintnp. Any other pod would be
scheduled in the nodepool1 node pool. If you create additional node pools, you can use additional taints and
tolerations to limit what pods can be scheduled on those node resources.
Setting nodepool labels
For more information on using labels with node pools, see Use labels in an Azure Kubernetes Service (AKS)
cluster.
Setting nodepool Azure tags
For more information on using Azure tags with node pools, see Use Azure tags in Azure Kubernetes Service
(AKS).
To create a FIPS-enabled node pool, use az aks nodepool add with the --enable-fips-image parameter when
creating a node pool.
NOTE
You can also use the --enable-fips-image parameter with az aks create when creating a cluster to enable FIPS on the
default node pool. When adding node pools to a cluster created in this way, you still must use the --enable-fips-image
parameter when adding node pools to create a FIPS-enabled node pool.
To verify your node pool is FIPS-enabled, use az aks show to check the enableFIPS value in agentPoolProfiles.
The following example output shows the fipsnp node pool is FIPS-enabled and nodepool1 isn't.
Name enableFips
--------- ------------
fipsnp True
nodepool1 False
You can also verify deployments have access to the FIPS cryptographic libraries using kubectl debug on a node
in the FIPS-enabled node pool. Use kubectl get nodes to list the nodes:
In the above example, the nodes starting with aks-fipsnp are part of the FIPS-enabled node pool. Use
kubectl debug to run a deployment with an interactive session on one of those nodes in the FIPS-enabled node
pool.
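The debug command isn't shown above; a sketch against one of the FIPS nodes listed earlier (the debug image is an illustrative assumption; any Linux image with cat works):

kubectl debug node/aks-fipsnp-12345678-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0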
From the interactive session, you can verify the FIPS cryptographic libraries are enabled:
root@aks-fipsnp-12345678-vmss000000:/# cat /proc/sys/crypto/fips_enabled
1
FIPS-enabled node pools also have a kubernetes.azure.com/fips_enabled=true label, which can be used by
deployments to target those node pools.
Deploy this template using the az deployment group create command, as shown in the following example.
You're prompted for the existing AKS cluster name and location:
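The deployment command isn't shown above; a sketch, where the template filename is a placeholder assumption:

az deployment group create \
    --resource-group myResourceGroup \
    --template-file aks-agentpools.json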
TIP
You can add a tag to your node pool by adding the tag property in the template, as shown in the following example.
...
"resources": [
  {
    ...
    "properties": {
      ...
      "tags": {
        "name1": "val1"
      },
      ...
    }
  }
...
It may take a few minutes to update your AKS cluster depending on the node pool settings and operations you
define in your Resource Manager template.
Create a new AKS cluster and attach a public IP for your nodes. Each of the nodes in the node pool receives a
unique public IP. You can verify this by looking at the Virtual Machine Scale Set instances.
For existing AKS clusters, you can also add a new node pool, and attach a public IP for your nodes.
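The commands for these steps aren't shown above; a sketch, using illustrative cluster names and the resource group referenced in the prefix output below:

# New cluster whose nodes each get a public IP (names are illustrative)
az aks create -g MyResourceGroup2 -n MyManagedCluster -l eastus --enable-node-public-ip
# Or add a node pool with node public IPs to an existing cluster
az aks nodepool add -g MyResourceGroup2 --cluster-name MyManagedCluster -n nodepool2 --enable-node-public-ip
# Create a public IP prefix to assign node IPs from
az network public-ip prefix create --resource-group myResourceGroup3 --name MyPublicIPPrefix --location eastus --length 28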
View the output, and take note of the id for the prefix:
{
...
"id": "/subscriptions/<subscription-
id>/resourceGroups/myResourceGroup3/providers/Microsoft.Network/publicIPPrefixes/MyPublicIPPrefix",
...
}
Finally, when creating a new cluster or adding a new node pool, use the flag node-public-ip-prefix and pass in
the prefix's resource ID:
IMPORTANT
The node resource group contains the nodes and their public IPs. Use the node resource group when executing
commands to find the public IPs for your nodes.
Clean up resources
In this article, you created an AKS cluster that includes GPU-based nodes. To reduce unnecessary cost, you may
want to delete the gpunodepool, or the whole AKS cluster.
To delete the GPU-based node pool, use the az aks nodepool delete command as shown in the following example:
To delete the cluster itself, use the az group delete command to delete the AKS resource group:
You can also delete the additional cluster you created for the public IP for node pools scenario.
A spot node pool is a node pool backed by a spot virtual machine scale set. Using spot VMs for nodes with your
AKS cluster allows you to take advantage of unutilized capacity in Azure at a significant cost savings. The
amount of available unutilized capacity will vary based on many factors, including node size, region, and time of
day.
When deploying a spot node pool, Azure will allocate the spot nodes if there's capacity available. But there's no
SLA for the spot nodes. A spot scale set that backs the spot node pool is deployed in a single fault domain and
offers no high availability guarantees. At any time when Azure needs the capacity back, the Azure infrastructure
will evict spot nodes.
Spot nodes are great for workloads that can handle interruptions, early terminations, or evictions. For example,
workloads such as batch processing jobs, development and testing environments, and large compute workloads
may be good candidates to be scheduled on a spot node pool.
In this article, you add a secondary spot node pool to an existing Azure Kubernetes Service (AKS) cluster.
This article assumes a basic understanding of Kubernetes and Azure Load Balancer concepts. For more
information, see Kubernetes core concepts for Azure Kubernetes Service (AKS).
If you don't have an Azure subscription, create a free account before you begin.
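The add command referenced in the next paragraph isn't shown above; a sketch of adding a spot node pool (the pool name is illustrative):

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name spotnodepool \
    --priority Spot \
    --eviction-policy Delete \
    --spot-max-price -1 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3 \
    --no-wait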
By default, you create a node pool with a priority of Regular in your AKS cluster when you create a cluster with
multiple node pools. The above command adds an auxiliary node pool to an existing AKS cluster with a priority
of Spot. The priority of Spot makes the node pool a spot node pool. The eviction-policy parameter is set to
Delete in the above example, which is the default value. When you set the eviction policy to Delete, nodes in the
underlying scale set of the node pool are deleted when they're evicted. You can also set the eviction policy to
Deallocate. When you set the eviction policy to Deallocate, nodes in the underlying scale set are set to the
stopped-deallocated state upon eviction. Nodes in the stopped-deallocated state count against your compute
quota and can cause issues with cluster scaling or upgrading. The priority and eviction-policy values can only be
set during node pool creation. Those values can't be updated later.
The command also enables the cluster autoscaler, which is recommended to use with spot node pools. Based on
the workloads running in your cluster, the cluster autoscaler scales up and scales down the number of nodes in
the node pool. For spot node pools, the cluster autoscaler will scale up the number of nodes after an eviction if
additional nodes are still needed. If you change the maximum number of nodes a node pool can have, you also
need to adjust the maxCount value associated with the cluster autoscaler. If you do not use a cluster autoscaler,
upon eviction, the spot pool will eventually decrease to zero and require a manual operation to receive any
additional spot nodes.
IMPORTANT
Only schedule workloads on spot node pools that can handle interruptions, such as batch processing jobs and testing
environments. It is recommended that you set up taints and tolerations on your spot node pool to ensure that only
workloads that can handle node evictions are scheduled on a spot node pool. For example, the above command by
default adds a taint of kubernetes.azure.com/scalesetpriority=spot:NoSchedule so only pods with a corresponding
toleration are scheduled on this node.
spec:
  containers:
  - name: spot-example
  tolerations:
  - key: "kubernetes.azure.com/scalesetpriority"
    operator: "Equal"
    value: "spot"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "kubernetes.azure.com/scalesetpriority"
            operator: In
            values:
            - "spot"
   ...
When a pod with this toleration and node affinity is deployed, Kubernetes will successfully schedule the pod on
the nodes with the taint and label applied.
Next steps
In this article, you learned how to add a spot node pool to an AKS cluster. For more information about how to
control pods across node pools, see Best practices for advanced scheduler features in AKS.
Manage system node pools in Azure Kubernetes
Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools.
Node pools contain the underlying VMs that run your applications. System node pools and user node pools are
two different node pool modes for your AKS clusters. System node pools serve the primary purpose of hosting
critical system pods such as CoreDNS and metrics-server . User node pools serve the primary purpose of
hosting your application pods. However, application pods can be scheduled on system node pools if you wish to
only have one pool in your AKS cluster. Every AKS cluster must contain at least one system node pool with at
least one node.
IMPORTANT
If you run a single system node pool for your AKS cluster in a production environment, we recommend you use at least
three nodes for the node pool.
Limitations
The following limitations apply when you create and manage AKS clusters that support system node pools.
See Quotas, virtual machine size restrictions, and region availability in Azure Kubernetes Service (AKS).
The AKS cluster must be built with virtual machine scale sets as the VM type and the Standard SKU load
balancer.
The name of a node pool may only contain lowercase alphanumeric characters and must begin with a
lowercase letter. For Linux node pools, the length must be between 1 and 12 characters. For Windows node
pools, the length must be between 1 and 6 characters.
An API version of 2020-03-01 or greater must be used to set a node pool mode. Clusters created on API
versions older than 2020-03-01 contain only user node pools, but can be migrated to contain system node
pools by following update pool mode steps.
The mode of a node pool is a required property and must be explicitly set when using ARM templates or
direct API calls.
Use the az aks create command to create an AKS cluster. The following example creates a cluster named
myAKSCluster with one dedicated system pool containing one node. For your production workloads, ensure you
are using system node pools with at least three nodes. This operation may take several minutes to complete.
You can add one or more system node pools to existing AKS clusters. It's recommended to schedule your
application pods on user node pools, and dedicate system node pools to only critical system pods. This prevents
rogue application pods from accidentally killing system pods. Enforce this behavior with the
CriticalAddonsOnly=true:NoSchedule taint for your system node pools.
The following command adds a dedicated node pool of mode type system with a default count of three nodes.
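A sketch of such a command, using the systempool name shown in the example output below:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name systempool \
    --mode System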
A mode of type System is defined for system node pools, and a mode of type User is defined for user node
pools. For a system pool, verify the taint is set to CriticalAddonsOnly=true:NoSchedule , which will prevent
application pods from being scheduled on this node pool.
{
"agentPoolType": "VirtualMachineScaleSets",
"availabilityZones": null,
"count": 1,
"enableAutoScaling": null,
"enableNodePublicIp": false,
"id":
"/subscriptions/yourSubscriptionId/resourcegroups/myResourceGroup/providers/Microsoft.ContainerService/manag
edClusters/myAKSCluster/agentPools/systempool",
"maxCount": null,
"maxPods": 110,
"minCount": null,
"mode": "System",
"name": "systempool",
"nodeImageVersion": "AKSUbuntu-1604-2020.06.30",
"nodeLabels": {},
"nodeTaints": [
"CriticalAddonsOnly=true:NoSchedule"
],
"orchestratorVersion": "1.16.10",
"osDiskSizeGb": 128,
"osType": "Linux",
"provisioningState": "Failed",
"proximityPlacementGroupId": null,
"resourceGroup": "myResourceGroup",
"scaleSetEvictionPolicy": null,
"scaleSetPriority": null,
"spotMaxPrice": null,
"tags": null,
"type": "Microsoft.ContainerService/managedClusters/agentPools",
"upgradeSettings": {
"maxSurge": null
},
"vmSize": "Standard_DS2_v2",
"vnetSubnetId": null
}
You can change modes for both system and user node pools. You can change a system node pool to a user pool
only if another system node pool already exists on the AKS cluster.
This command changes a system node pool to a user node pool.
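The command isn't shown above; a sketch, assuming az aks nodepool update accepts the --mode parameter and using an illustrative pool name:

az aks nodepool update \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --mode User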
You must have at least two system node pools on your AKS cluster before you can delete one of them.
Clean up resources
To delete the cluster, use the az group delete command to delete the AKS resource group:
Next steps
In this article, you learned how to create and manage system node pools in an AKS cluster. For more information
about how to use multiple node pools, see use multiple node pools.
Create WebAssembly System Interface (WASI) node
pools in Azure Kubernetes Service (AKS) to run
your WebAssembly (WASM) workload (preview)
6/15/2022 • 6 minutes to read • Edit Online
WebAssembly (WASM) is a binary format that is optimized for fast download and maximum execution speed in
a WASM runtime. A WASM runtime is designed to run on a target architecture and execute WebAssemblies in a
sandbox, isolated from the host computer, at near-native performance. By default, WebAssemblies can't access
resources on the host outside of the sandbox unless it is explicitly allowed, and they can't communicate over
sockets to access things like environment variables or HTTP traffic. The WebAssembly System Interface (WASI)
standard defines an API for WASM runtimes to provide access to WebAssemblies to the environment and
resources outside the host using a capabilities model. Krustlet is an open-source project that allows WASM
modules to be run on Kubernetes. Krustlet creates a kubelet that runs on nodes with a WASM/WASI runtime.
AKS allows you to create node pools that run WASM assemblies using nodes with WASM/WASI runtimes and
Krustlets.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
This article uses Helm 3 to install the nginx chart on a supported version of Kubernetes. Make sure that you are
using the latest release of Helm and have access to the bitnami Helm repository. The steps outlined in this article
may not be compatible with previous versions of the Helm chart or Kubernetes.
You must also have the following resource installed:
The latest version of the Azure CLI.
The aks-preview extension version 0.5.34 or later
Register the WasmNodePoolPreview preview feature
To use the feature, you must also enable the WasmNodePoolPreview feature flag on your subscription.
Register the WasmNodePoolPreview feature flag by using the az feature register command, as shown in the
following example:
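The register command isn't shown above; a minimal form is:

az feature register --namespace "Microsoft.ContainerService" --name "WasmNodePoolPreview"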
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
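The list command isn't shown above; a sketch filtering on the feature name:

az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/WasmNodePoolPreview')].{Name:name,State:properties.state}"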
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
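The refresh command isn't shown above; a minimal form is:

az provider register --namespace Microsoft.ContainerService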
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Limitations
You can't run WebAssemblies and containers in the same node pool.
Only the WebAssembly(WASI) runtime is available, using the Wasmtime provider.
The WASM/WASI node pools can't be used for system node pools.
The os-type for WASM/WASI node pools must be Linux.
Krustlet doesn't work with Azure CNI at this time. For more information, see the CNI Support for Krustlet
GitHub issue.
Krustlet doesn't provide networking configuration for WebAssemblies. The WebAssembly manifest must
provide the networking configuration, such as IP address.
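The command that adds the WASI pool referenced in the note and output below isn't shown above; a sketch using the workload-runtime parameter:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mywasipool \
    --node-count 1 \
    --workload-runtime WasmWasi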
NOTE
The default value for the workload-runtime parameter is ocicontainer. To create a node pool that runs container
workloads, omit the workload-runtime parameter or set the value to ocicontainer.
Verify the workloadRuntime value using az aks nodepool show . For example:
az aks nodepool show -g myResourceGroup --cluster-name myAKSCluster -n mywasipool
The following example output shows the mywasipool has the workloadRuntime type of WasmWasi.
{
...
"name": "mywasipool",
..
"workloadRuntime": "WasmWasi"
}
For a WASM/WASI node pool, verify the taint is set to kubernetes.io/arch=wasm32-wagi:NoSchedule and
kubernetes.io/arch=wasm32-wagi:NoExecute , which will prevent container pods from being scheduled on this
node pool. Also, you should see nodeLabels set to kubernetes.io/arch: wasm32-wagi , which prevents WASM pods
from being scheduled on regular container (OCI) node pools.
NOTE
The taints for a WASI node pool are not visible using az aks nodepool list . Use kubectl to verify the taints are set
on the nodes in the WASI node pool.
Configure kubectl to connect to your Kubernetes cluster using the az aks get-credentials command.
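A sketch of the command, using the cluster and resource group names from this article:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster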
Name: aks-mywasipool-12456878-vmss000000
Roles: agent
Labels: agentpool=mywasipool
...
kubernetes.io/arch=wasm32-wagi
...
Taints: kubernetes.io/arch=wasm32-wagi:NoExecute
kubernetes.io/arch=wasm32-wagi:NoSchedule
...
spec:
  nodeSelector:
    kubernetes.io/arch: "wasm32-wagi"
  tolerations:
  - key: "node.kubernetes.io/network-unavailable"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "kubernetes.io/arch"
    operator: "Equal"
    value: "wasm32-wagi"
    effect: "NoExecute"
  - key: "kubernetes.io/arch"
    operator: "Equal"
    value: "wasm32-wagi"
    effect: "NoSchedule"
...
To run a sample deployment, create a wasi-example.yaml file using the following YAML definition:
apiVersion: v1
kind: Pod
metadata:
  name: krustlet-wagi-demo
  labels:
    app: krustlet-wagi-demo
  annotations:
    alpha.wagi.krustlet.dev/default-host: "0.0.0.0:3001"
    alpha.wagi.krustlet.dev/modules: |
      {
        "krustlet-wagi-demo-http-example": {"route": "/http-example", "allowed_hosts": ["https://api.brigade.sh"]},
        "krustlet-wagi-demo-hello": {"route": "/hello/..."},
        "krustlet-wagi-demo-error": {"route": "/error"},
        "krustlet-wagi-demo-log": {"route": "/log"},
        "krustlet-wagi-demo-index": {"route": "/"}
      }
spec:
  hostNetwork: true
  nodeSelector:
    kubernetes.io/arch: wasm32-wagi
  containers:
  - image: webassembly.azurecr.io/krustlet-wagi-demo-http-example:v1.0.0
    imagePullPolicy: Always
    name: krustlet-wagi-demo-http-example
  - image: webassembly.azurecr.io/krustlet-wagi-demo-hello:v1.0.0
    imagePullPolicy: Always
    name: krustlet-wagi-demo-hello
  - image: webassembly.azurecr.io/krustlet-wagi-demo-index:v1.0.0
    imagePullPolicy: Always
    name: krustlet-wagi-demo-index
  - image: webassembly.azurecr.io/krustlet-wagi-demo-error:v1.0.0
    imagePullPolicy: Always
    name: krustlet-wagi-demo-error
  - image: webassembly.azurecr.io/krustlet-wagi-demo-log:v1.0.0
    imagePullPolicy: Always
    name: krustlet-wagi-demo-log
  tolerations:
  - key: "node.kubernetes.io/network-unavailable"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "kubernetes.io/arch"
    operator: "Equal"
    value: "wasm32-wagi"
    effect: "NoExecute"
  - key: "kubernetes.io/arch"
    operator: "Equal"
    value: "wasm32-wagi"
    effect: "NoSchedule"
NOTE
The pod for the example deployment may stay in the Registered status. This behavior is expected, and you can proceed to the next step.
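The reverse proxy configured in the next step needs the internal IP address of the WASI node. One way to retrieve it (a sketch, assuming the node name from the earlier output) is:

WASINODE_IP=$(kubectl get node aks-mywasipool-12456878-vmss000000 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
echo $WASINODE_IP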
Create values.yaml using the example yaml below, replacing WASINODE_IP with the value from the earlier step.
serverBlock: |-
  server {
    listen 0.0.0.0:8080;
    location / {
      proxy_pass http://WASINODE_IP:3001;
    }
  }
Using Helm, add the bitnami repository and install the nginx chart with the values.yaml file you created in the
previous step. Installing NGINX with the above values.yaml creates a reverse proxy to the example deployment,
allowing you to access it using an external IP address.
NOTE
The following example pulls a public container image from Docker Hub. We recommend that you set up a pull secret to
authenticate using a Docker Hub account instead of making an anonymous pull request. To improve reliability when
working with public content, import and manage the image in a private Azure container registry. Learn more about
working with public images.
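A sketch of these Helm commands, assuming the release name hello-wasi and the values.yaml file from the previous step:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install hello-wasi bitnami/nginx -f values.yaml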
Use kubectl get service to display the external IP address of the hello-wasi-nginx service.
Verify the example deployment is running by running the curl command against the /hello path of EXTERNAL_IP.
curl EXTERNAL_IP/hello
$ curl EXTERNAL_IP/hello
hello world
NOTE
To publish the service on your own domain, see Azure DNS and the external-dns project.
Clean up
To remove NGINX, use helm delete .
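For example, assuming the hello-wasi release name used above:

helm delete hello-wasi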
Your AKS workloads may not need to run continuously, for example a development cluster that has node pools
running specific workloads. To optimize your costs, you can completely turn off (stop) your node pools in your
AKS cluster, allowing you to save on compute costs.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
You also need the aks-preview Azure CLI extension. Install the aks-preview Azure CLI extension by using the az
extension add command. Or install any available updates by using the az extension update command.
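For example:

# Install the aks-preview extension
az extension add --name aks-preview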
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
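For example (a sketch; replace the placeholder with the feature flag name referenced by this preview):

az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/<FeatureFlagName>')].{Name:name,State:properties.state}"

az provider register --namespace Microsoft.ContainerService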
Use az aks nodepool stop to stop a running AKS node pool. The following example stops the testnodepool node
pool:
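A sketch, assuming a cluster named myAKSCluster in myResourceGroup:

az aks nodepool stop --resource-group myResourceGroup --cluster-name myAKSCluster --nodepool-name testnodepool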
You can verify your node pool is stopped by using the az aks nodepool show command and confirming the powerState shows Stopped.
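For example, a sketch assuming the testnodepool node pool stopped above; the output resembles the JSON that follows:

az aks nodepool show --resource-group myResourceGroup --cluster-name myAKSCluster --nodepool-name testnodepool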
{
[...]
"osType": "Linux",
"podSubnetId": null,
"powerState": {
"code": "Stopped"
},
"provisioningState": "Succeeded",
"proximityPlacementGroupId": null,
[...]
}
NOTE
If the provisioningState shows Stopping , your node pool hasn't fully stopped yet.
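To start a stopped node pool, use the az aks nodepool start command. For example, a sketch assuming the testnodepool node pool from above:

az aks nodepool start --resource-group myResourceGroup --cluster-name myAKSCluster --nodepool-name testnodepool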
You can verify your node pool has started by using az aks nodepool show and confirming the powerState shows Running. For example:
{
[...]
"osType": "Linux",
"podSubnetId": null,
"powerState": {
"code": "Running"
},
"provisioningState": "Succeeded",
"proximityPlacementGroupId": null,
[...]
}
NOTE
If the provisioningState shows Starting , your node pool hasn't fully started yet.
Next steps
To learn how to scale User pools to 0, see Scale User pools to 0.
To learn how to stop your cluster, see Cluster start/stop.
To learn how to save costs using Spot instances, see Add a spot node pool to AKS.
To learn more about the AKS support policies, see AKS support policies.
Resize node pools in Azure Kubernetes Service
(AKS)
6/15/2022 • 7 minutes to read • Edit Online
Due to an increasing number of deployments or to run a larger workload, you may want to change the virtual
machine scale set plan or resize AKS instances. However, as per support policies for AKS:
AKS agent nodes appear in the Azure portal as regular Azure IaaS resources. But these virtual machines are
deployed into a custom Azure resource group (usually prefixed with MC_*). You cannot do any direct
customizations to these nodes using the IaaS APIs or resources. Any custom changes that are not done via
the AKS API will not persist through an upgrade, scale, update or reboot.
This lack of persistence also applies to the resize operation, thus, resizing AKS instances in this manner isn't
supported. In this how-to guide, you'll learn the recommended method to address this scenario.
IMPORTANT
This method is specific to virtual machine scale set-based AKS clusters. When using virtual machine availability sets, you
are limited to only one node pool per cluster.
Example resources
Suppose you want to resize an existing node pool, called nodepool1 , from SKU size Standard_DS2_v2 to
Standard_DS3_v2. To accomplish this task, you'll need to create a new node pool using Standard_DS3_v2, move
workloads from nodepool1 to the new node pool, and remove nodepool1 . In this example, we'll call this new
node pool mynodepool .
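A sketch of creating the new node pool, assuming the myResourceGroup and myAKSCluster names used elsewhere in this guide (adjust the mode, node count, and size to match nodepool1):

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --node-count 3 \
    --node-vm-size Standard_DS3_v2 \
    --mode System \
    --no-wait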
When resizing, be sure to consider other requirements and configure your node pool accordingly. You may need
to modify the above command. For a full list of the configuration options, see the az aks nodepool add reference
page.
After a few minutes, the new node pool has been created:
Next, using kubectl cordon <node-names> , specify the desired nodes in a space-separated list:
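For example, using the nodepool1 node names shown in the output that follows:

kubectl cordon aks-nodepool1-31721111-vmss000000 aks-nodepool1-31721111-vmss000001 aks-nodepool1-31721111-vmss000002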
node/aks-nodepool1-31721111-vmss000000 cordoned
node/aks-nodepool1-31721111-vmss000001 cordoned
node/aks-nodepool1-31721111-vmss000002 cordoned
Draining nodes will cause pods running on them to be evicted and recreated on the other, schedulable nodes.
To drain nodes, use kubectl drain <node-names> --ignore-daemonsets --delete-emptydir-data , again using a
space-separated list of node names:
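For example:

kubectl drain aks-nodepool1-31721111-vmss000000 aks-nodepool1-31721111-vmss000001 aks-nodepool1-31721111-vmss000002 --ignore-daemonsets --delete-emptydir-data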
IMPORTANT
Using --delete-emptydir-data is required to evict the AKS-created coredns and metrics-server pods. If this flag
isn't used, an error is expected. For more information, see the documentation on emptydir.
After the drain operation finishes, all pods other than those controlled by daemon sets are running on the new
node pool:
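You can confirm pod placement with a command such as:

kubectl get pods -o wide --all-namespaces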
Troubleshooting
You may see an error like the following:
Error when evicting pods/[podname] -n [namespace] (will retry after 5s): Cannot evict pod as it would
violate the pod's disruption budget.
By default, your cluster has AKS-managed pod disruption budgets (such as coredns-pdb or konnectivity-agent) with a MinAvailable of 1. If, for example, there are two coredns pods and one of them is being recreated and is unavailable, the other can't be evicted because of the pod disruption budget. This resolves itself after the initial coredns pod is scheduled and running, allowing the second pod to be properly evicted and recreated.
TIP
Consider draining nodes one-by-one for a smoother eviction experience and to avoid throttling. For more information,
see:
Plan for availability using a pod disruption budget
Specifying a Disruption Budget for your Application
Disruptions
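Once the workloads are running on the new node pool, remove the original node pool. A minimal sketch, assuming the names used above:

az aks nodepool delete \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name nodepool1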
After completion, the final result is the AKS cluster having a single, new node pool with the new, desired SKU
size and all the applications and pods properly running:
Next steps
After resizing a node pool by cordoning and draining, learn more about using multiple node pools.
Access Kubernetes resources from the Azure portal
6/15/2022 • 4 minutes to read • Edit Online
The Azure portal includes a Kubernetes resource view for easy access to the Kubernetes resources in your Azure
Kubernetes Service (AKS) cluster. Viewing Kubernetes resources from the Azure portal reduces context switching
between the Azure portal and the kubectl command-line tool, streamlining the experience for viewing and
editing your Kubernetes resources. The resource viewer currently includes multiple resource types, such as
deployments, pods, and replica sets.
The Kubernetes resource view from the Azure portal replaces the AKS dashboard add-on, which is deprecated.
Prerequisites
To view Kubernetes resources in the Azure portal, you need an AKS cluster. Any cluster is supported, but if using
Azure Active Directory (Azure AD) integration, your cluster must use AKS-managed Azure AD integration. If your
cluster uses legacy Azure AD, you can upgrade your cluster in the portal or with the Azure CLI. You can also use
the Azure portal to create a new AKS cluster.
Deploy an application
In this example, we'll use our sample AKS cluster to deploy the Azure Vote application from the AKS quickstart.
1. Select Add from any of the resource views (Namespace, Workloads, Services and ingresses, Storage, or
Configuration).
2. Paste the YAML for the Azure Vote application from the AKS quickstart.
3. Select Add at the bottom of the YAML editor to deploy the application.
Once the YAML file is added, the resource viewer shows both Kubernetes services that were created: the internal
service (azure-vote-back), and the external service (azure-vote-front) to access the Azure Vote application. The
external service includes a linked external IP address so you can easily view the application in your browser.
Edit YAML
The Kubernetes resource view also includes a YAML editor. A built-in YAML editor means you can update or
create services and deployments from within the portal and apply changes immediately.
After editing the YAML, changes are applied by selecting Review + save , confirming the changes, and then
saving again.
WARNING
Performing direct production changes via UI or CLI is not recommended, you should leverage continuous integration (CI)
and continuous deployment (CD) best practices. The Azure Portal Kubernetes management capabilities and the YAML
editor are built for learning and flighting new deployments in a development and testing setting.
Troubleshooting
This section addresses common problems and troubleshooting steps.
Unauthorized access
To access the Kubernetes resources, you must have access to the AKS cluster, the Kubernetes API, and the
Kubernetes objects. Ensure that you're either a cluster administrator or a user with the appropriate permissions
to access the AKS cluster. For more information on cluster security, see Access and identity options for AKS.
NOTE
The kubernetes resource view in the Azure Portal is only supported by managed-AAD enabled clusters or non-AAD
enabled clusters. If you are using a managed-AAD enabled cluster, your AAD user or identity needs to have the respective
roles/role bindings to access the kubernetes API, in addition to the permission to pull the user kubeconfig .
TIP
The AKS feature for API server authorized IP ranges can be added to limit API server access to only the firewall's public endpoint. Another option for such clusters is updating --api-server-authorized-ip-ranges to include access for a local client computer or IP address range (from which the portal is being browsed). To allow this access, you need the computer's public IPv4 address. You can find this address with the following commands or by searching "what is my IP address" in an internet browser.
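A sketch of retrieving your public IPv4 address and adding it to the authorized ranges (the resolver service and cluster names are assumptions):

CURRENT_IP=$(dig +short "myip.opendns.com" "@resolver1.opendns.com")

az aks update \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --api-server-authorized-ip-ranges $CURRENT_IP/32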
Next steps
This article showed you how to access Kubernetes resources for your AKS cluster. See Deployments and YAML
manifests for a deeper understanding of cluster resources and the YAML files that are accessed with the
Kubernetes resource viewer.
Use Azure tags in Azure Kubernetes Service (AKS)
6/15/2022 • 6 minutes to read • Edit Online
With Azure Kubernetes Service (AKS), you can set Azure tags on an AKS cluster and its related resources by
using Azure Resource Manager, through the Azure CLI. For some resources, you can also use Kubernetes
manifests to set Azure tags. Azure tags are a useful tracking resource for certain business processes, such as
chargeback.
This article explains how to set Azure tags for AKS clusters and related resources.
NOTE
Azure Private DNS supports only 15 tags. For more information, see Use tags to organize your Azure resources.
To create a cluster and assign Azure tags, run az aks create with the --tags parameter, as shown in the
following command. Running the command creates a myAKSCluster in the myResourceGroup with the tags
dept=IT and costcenter=9999.
NOTE
To set tags on the initial node pool, the node resource group, the virtual machine scale set, and each virtual machine scale
set instance that's associated with the initial node pool, also set the --nodepool-tags parameter.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--tags dept=IT costcenter=9999 \
--generate-ssh-keys
IMPORTANT
If you're using existing resources when you're creating a new cluster, such as an IP address or route table,
az aks create overwrites the set of tags. If you delete that cluster later, any tags set by the cluster will be removed.
Verify that the tags have been applied to the cluster and related resources.
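For example, a sketch of querying the cluster tags for myAKSCluster and the expected output (mirroring the verification shown later in this article):

$ az aks show -g myResourceGroup -n myAKSCluster --query '[tags]'
{
  "clusterTags": {
    "costcenter": "9999",
    "dept": "IT"
  }
}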
To update the tags on an existing cluster, run az aks update with the --tags parameter. Running the command
updates the myAKSCluster with the tags team=alpha and costcenter=1234.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--tags team=alpha costcenter=1234
Verify that the tags have been applied to the cluster. For example:
$ az aks show -g myResourceGroup -n myAKSCluster --query '[tags]'
{
"clusterTags": {
"costcenter": "1234",
"team": "alpha"
}
}
IMPORTANT
Setting tags on a cluster by using az aks update overwrites the set of tags. For example, if your cluster has the tags
dept=IT and costcenter=9999 and you use az aks update with the tags team=alpha and costcenter=1234, the new
list of tags would be team=alpha and costcenter=1234.
Verify that the tags have been applied to the tagnodepool node pool.
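A sketch of verifying node pool tags with az aks nodepool show (the tagnodepool name comes from the surrounding example):

az aks nodepool show -g myResourceGroup --cluster-name myAKSCluster -n tagnodepool --query '[tags]'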
IMPORTANT
Setting tags on a node pool by using az aks nodepool update overwrites the set of tags. For example, if your node
pool has the tags abtest=a and costcenter=5555, and you use az aks nodepool update with the tags
appversion=0.0.2 and costcenter=4444, the new list of tags would be appversion=0.0.2 and costcenter=4444.
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/azure-pip-tags: costcenter=3333,team=beta
spec:
  ...
For files and disks, use tags under parameters. For example:
---
apiVersion: storage.k8s.io/v1
...
parameters:
...
tags: costcenter=3333,team=beta
...
IMPORTANT
Setting tags on files, disks, and public IPs by using Kubernetes updates the set of tags. For example, if your disk has the
tags dept=IT and costcenter=5555, and you use Kubernetes to set the tags team=beta and costcenter=3333, the new
list of tags would be dept=IT, team=beta, and costcenter=3333.
Any updates that you make to tags through Kubernetes will retain the value that's set through Kubernetes. For example,
if your disk has tags dept=IT and costcenter=5555 set by Kubernetes, and you use the portal to set the tags team=beta
and costcenter=3333, the new list of tags would be dept=IT, team=beta, and costcenter=5555. If you then remove the
disk through Kubernetes, the disk would have the tag team=beta.
Use labels in an Azure Kubernetes Service (AKS)
cluster
6/15/2022 • 4 minutes to read • Edit Online
If you have multiple node pools, you may want to add a label during node pool creation. These labels are visible in Kubernetes for handling scheduling rules for nodes. You can add labels to a node pool anytime, and they'll be set on all nodes in the node pool.
In this how-to guide, you'll learn how to use labels in an AKS cluster.
Prerequisites
You need the Azure CLI version 2.2.0 or later installed and configured. Run az --version to find the version. If
you need to install or upgrade, see Install Azure CLI.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 2 \
--nodepool-labels dept=IT costcenter=9000
Verify the labels were set by running kubectl get nodes --show-labels .
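To add a new node pool with labels, use az aks nodepool add with the --labels parameter. A sketch matching the output below (the resource group and cluster names are assumptions):

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name labelnp \
    --node-count 1 \
    --labels dept=HR costcenter=5000 \
    --no-wait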
The following example output from the az aks nodepool list command shows that the labelnp node pool is creating nodes with the specified nodeLabels:
az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
{
...
"count": 1,
...
"name": "labelnp",
"orchestratorVersion": "1.15.7",
...
"provisioningState": "Creating",
...
"nodeLabels": {
"costcenter": "5000",
"dept": "HR"
},
...
},
...
]
Verify the labels were set by running kubectl get nodes --show-labels .
Unavailable labels
Reserved system labels
Since the 2021-08-19 AKS release, Azure Kubernetes Service (AKS) no longer allows changes to AKS reserved labels. Attempting to change these labels results in an error message.
The following labels are reserved for use by AKS. Virtual node usage specifies if these labels could be a
supported system feature on virtual nodes.
Some properties that these system features change aren't available on the virtual nodes, because they require
modifying the host.
Label | Value | Example/options | Virtual node usage
Same is shown in places where the expected values for the labels don't differ between a standard node pool and a virtual node pool. Because virtual node pods don't expose any underlying virtual machine (VM), the VM SKU values are replaced with the SKU Virtual.
Virtual node version refers to the current version of the virtual Kubelet-ACI connector release.
Virtual node subnet name is the name of the subnet where virtual node pods are deployed into Azure
Container Instance (ACI).
Virtual node virtual network is the name of the virtual network, which contains the subnet where virtual
node pods are deployed on ACI.
Reserved prefixes
The following prefixes are reserved for use by AKS and can't be used for any node.
kubernetes.azure.com/
kubernetes.io/
For additional reserved prefixes, see Kubernetes well-known labels, annotations, and taints.
Deprecated labels
The following labels are planned for deprecation with the release of Kubernetes v1.24. Customers should
change any label references to the recommended substitute.
*Newly deprecated. For more information, see Release Notes on when these labels will no longer be maintained.
Next steps
Learn more about Kubernetes labels at the Kubernetes labels documentation.
Overview of Microsoft Defender for Containers
6/15/2022 • 5 minutes to read • Edit Online
Microsoft Defender for Containers is the cloud-native solution for securing your containers so you can improve,
monitor, and maintain the security of your clusters, containers, and their applications.
How does Defender for Containers work in each Kubernetes platform?
Required roles and permissions: • To auto provision the required components, see the
permissions for each of the components
• Security admin can dismiss alerts
• Security reader can view vulnerability assessment
findings
See also Azure Container Registry roles and permissions
Clouds: Azure :
Commercial clouds
National clouds (Azure Government, Azure China
21Vianet) (except for preview features)
Non-Azure :
Connected AWS accounts (Preview)
Connected GCP projects (Preview)
On-prem/IaaS supported via Arc enabled Kubernetes
(Preview).
Hardening
Continuous monitoring of your Kubernetes clusters - wherever they're hosted
Defender for Cloud continuously assesses the configurations of your clusters and compares them with the
initiatives applied to your subscriptions. When it finds misconfigurations, Defender for Cloud generates security
recommendations. Use Defender for Cloud's recommendations page to view recommendations and
remediate issues. For details of the relevant Defender for Cloud recommendations that might appear for this
feature, see the compute section of the recommendations reference table.
For Kubernetes clusters on EKS, you'll need to connect your AWS account to Microsoft Defender for Cloud. Then
ensure you've enabled the CSPM plan.
When reviewing the outstanding recommendations for your container-related resources, whether in asset
inventory or the recommendations page, you can use the resource filter:
Vulnerability assessment
Scanning images in ACR registries
Defender for Containers includes an integrated vulnerability scanner for scanning images in Azure Container
Registry registries. The vulnerability scanner runs on an image:
When you push the image to your registry
Weekly on any image that was pulled within the last 30 days
When you import the image to your Azure Container Registry
Continuously in specific situations
Learn more in Vulnerability assessment.
View vulnerabilities for running images
The recommendation Running container images should have vulnerability findings resolved shows
vulnerabilities for running images by using the scan results from ACR registries and information on running
images from the Defender security profile/extension. Images that are deployed from a non-ACR registry will appear under the Not applicable tab.
Learn More
If you would like to learn more from the product manager about Microsoft Defender for Containers, check out
Microsoft Defender for Containers.
You can also check out the following blogs:
How to demonstrate the new containers features in Microsoft Defender for Cloud
Introducing Microsoft Defender for Containers
Next steps
In this overview, you learned about the core elements of container security in Microsoft Defender for Cloud. To
enable the plan, see:
Enable Defender for Containers
Enable Microsoft Defender for Containers
6/15/2022 • 23 minutes to read • Edit Online
Microsoft Defender for Containers is the cloud-native solution for securing your containers.
Defender for Containers protects your clusters whether they're running in:
Azure Kubernetes Service (AKS) - Microsoft's managed service for developing, deploying, and
managing containerized applications.
Amazon Elastic Kubernetes Service (EKS) in a connected Amazon Web Services (AWS)
account - Amazon's managed service for running Kubernetes on AWS without needing to install,
operate, and maintain your own Kubernetes control plane or nodes.
Google Kubernetes Engine (GKE) in a connected Google Cloud Platform (GCP) project -
Google’s managed environment for deploying, managing, and scaling applications using GCP
infrastructure.
Other Kubernetes distributions (using Azure Arc-enabled Kubernetes) - Cloud Native Computing
Foundation (CNCF) certified Kubernetes clusters hosted on-premises or on IaaS. For more information,
see the On-prem/IaaS (Arc) section of Supported features by environment.
Learn about this plan in Overview of Microsoft Defender for Containers.
NOTE
Defender for Containers' support for Arc-enabled Kubernetes clusters, AWS EKS, and GCP GKE is a preview feature.
The Azure Preview Supplemental Terms include additional legal terms that apply to Azure features that are in beta,
preview, or otherwise not yet released into general availability.
Network requirements
Validate the following endpoints are configured for outbound access so that the Defender profile can connect to
Microsoft Defender for Cloud to send security data and events:
See the required FQDN/application rules for Microsoft Defender for Containers.
By default, AKS clusters have unrestricted outbound (egress) internet access.
Network requirements
Validate the following endpoints are configured for outbound access so that the Defender extension can connect
to Microsoft Defender for Cloud to send security data and events:
For Azure public cloud deployments:
DOMAIN | PORT
*.ods.opinsights.azure.com | 443
*.oms.opinsights.azure.com | 443
login.microsoftonline.com | 443
You will also need to validate the Azure Arc-enabled Kubernetes network requirements.
TIP
If the subscription already has Defender for Kubernetes and/or Defender for container registries enabled, an
update notice is shown. Otherwise, the only option will be Defender for Containers .
3. By default, when enabling the plan through the Azure portal, Microsoft Defender for Containers is
configured to auto provision (automatically install) required components to provide the protections
offered by plan, including the assignment of a default workspace.
If you want to disable auto provisioning during the onboarding process, select Edit configuration for the
Containers plan. This opens the Advanced options, where you can disable auto provisioning for each
component.
In addition, you can modify this configuration from the Defender plans page or from the Auto
provisioning page on the Microsoft Defender for Containers components (preview) row:
NOTE
If you choose to disable the plan at any time after enabling it through the portal as shown above, you'll need to
manually remove Defender for Containers components deployed on your clusters.
NOTE
Microsoft Defender for Containers is configured to defend all of your clouds automatically when you install all of the required prerequisites and enable all of the auto provisioning capabilities.
If you choose to disable all of the auto provisioning configuration options, no agents or components will be deployed to your clusters. Protection will be limited to the agentless features only. Learn which features are agentless in the availability section for Defender for Containers.
Use the fix button from the Defender for Cloud recommendation
A streamlined, frictionless, process lets you use the Azure portal pages to enable the Defender for Cloud plan
and setup auto provisioning of all the necessary components for defending your Kubernetes clusters at scale.
A dedicated Defender for Cloud recommendation provides:
Visibility about which of your clusters has the Defender profile deployed
Fix button to deploy it to those clusters without the profile
1. From Microsoft Defender for Cloud's recommendations page, open the Enable enhanced security
security control.
2. Use the filter to find the recommendation named Azure Kubernetes Service clusters should have Defender profile enabled.
TIP
Notice the Fix icon in the actions column
3. Select the clusters to see the details of the healthy and unhealthy resources - clusters with and without
the profile.
4. From the unhealthy resources list, select a cluster and select Remediate to open the pane with the
remediation confirmation.
5. Select Fix [x] resources .
3. By default, when enabling the plan through the Azure portal, Microsoft Defender for Containers is
configured to auto provision (automatically install) required components to provide the protections
offered by plan, including the assignment of a default workspace.
If you want to disable auto provisioning during the onboarding process, select Edit configuration for the
Containers plan. This opens the Advanced options, where you can disable auto provisioning for each
component.
In addition, you can modify this configuration from the Defender plans page or from the Auto
provisioning page on the Microsoft Defender for Containers components (preview) row:
NOTE
If you choose to disable the plan at any time after enabling it through the portal as shown above, you'll need to
manually remove Defender for Containers components deployed on your clusters.
Prerequisites
Before deploying the extension, ensure you:
Connect the Kubernetes cluster to Azure Arc
Complete the pre-requisites listed under the generic cluster extensions documentation.
Use the fix button from the Defender for Cloud recommendation
A dedicated Defender for Cloud recommendation provides:
Visibility about which of your clusters has the Defender for Kubernetes extension deployed
Fix button to deploy it to those clusters without the extension
1. From Microsoft Defender for Cloud's recommendations page, open the Enable enhanced security
security control.
2. Use the filter to find the recommendation named Azure Arc-enabled Kubernetes clusters should
have Defender for Cloud's extension installed .
TIP
Notice the Fix icon in the actions column
3. Select the extension to see the details of the healthy and unhealthy resources - clusters with and without
the extension.
4. From the unhealthy resources list, select a cluster and select Remediate to open the pane with the
remediation options.
5. Select the relevant Log Analytics workspace and select Remediate x resource .
Use Defender for Cloud recommendation to verify the status of your extension
1. From Microsoft Defender for Cloud's recommendations page, open the Enable Microsoft Defender
for Cloud security control.
2. Select the recommendation named Azure Arc-enabled Kubernetes clusters should have
Microsoft Defender for Cloud's extension installed .
3. Check that the cluster on which you deployed the extension is listed as Healthy .
Protect Amazon Elastic Kubernetes Service clusters
IMPORTANT
If you haven't already connected an AWS account, do so now using the instructions in Connect your AWS accounts to
Microsoft Defender for Cloud.
To protect your EKS clusters, enable the Containers plan on the relevant account connector:
1. From Defender for Cloud's menu, open Environment settings .
2. Select the AWS connector.
4. (Optional) To change the retention period for your audit logs, select Configure , enter the required
timeframe, and select Save .
NOTE
If you disable this configuration, then the Threat detection (control plane) feature will be disabled. Learn
more about features availability.
5. Continue through the remaining pages of the connector wizard.
6. Azure Arc-enabled Kubernetes, the Defender extension, and the Azure Policy extension should be installed
and running on your EKS clusters. There are 2 dedicated Defender for Cloud recommendations to install
these extensions (and Azure Arc if necessary):
EKS clusters should have Microsoft Defender's extension for Azure Arc installed
EKS clusters should have the Azure Policy extension installed
For each of the recommendations, follow the steps below to install the required extensions.
To install the required extensions :
a. From Defender for Cloud's Recommendations page, search for one of the recommendations by
name.
b. Select an unhealthy cluster.
IMPORTANT
You must select the clusters one at a time.
Don't select the clusters by their hyperlinked names: select anywhere else in the relevant row.
c. Select Fix .
d. Defender for Cloud generates a script in the language of your choice: select Bash (for Linux) or
PowerShell (for Windows).
e. Select Download remediation logic .
f. Run the generated script on your cluster.
g. Repeat steps "a" through "f" for the second recommendation.
To view the alerts and recommendations for your EKS clusters, use the filters on the alerts, recommendations,
and inventory pages to filter by resource type AWS EKS cluster .
To protect your GKE clusters, you will need to enable the Containers plan on the relevant GCP project.
To protect Google Kubernetes Engine (GKE) clusters :
1. Sign in to the Azure portal.
2. Navigate to Microsoft Defender for Cloud > Environment settings .
3. Select the relevant GCP connector
4. Select the Next: Select plans > button.
5. Ensure that the Containers plan is toggled to On .
IMPORTANT
You must select the clusters one at a time.
Don't select the clusters by their hyperlinked names: select anywhere else in the relevant row.
3. Select Definitions .
4. Search for policy ID 64def556-fbad-4622-930e-72d1d5589bf5 .
5. Select [Preview]: Configure Azure Kubernetes Service clusters to enable Defender profile.
6. Select Assign .
7. In the Parameters tab, deselect the Only show parameters that need input or review option.
8. Enter the LogAnalyticsWorkspaceResource ID.
9. Select Review + create .
10. Select Create .
https://management.azure.com/subscriptions/{{SubscriptionId}}/resourcegroups/{{ResourceGroup}}/providers/Microsoft.ContainerService/managedClusters/{{ClusterName}}?api-version={{ApiVersion}}
Request body:
{
"location": "{{Location}}",
"properties": {
"securityProfile": {
"azureDefender": {
"enabled": false
}
}
}
}
FAQ
How can I use my existing Log Analytics workspace?
Can I delete the default workspaces created by Defender for Cloud?
I deleted my default workspace, how can I get it back?
Where is the default Log Analytics workspace located?
How can I use my existing Log Analytics workspace?
You can use your existing Log Analytics workspace by following the steps in the Assign a custom workspace section of this article.
Can I delete the default workspaces created by Defender for Cloud?
We do not recommend deleting the default workspace. Defender for Containers uses the default workspaces to collect security data from your clusters. If you delete the default workspace, Defender for Containers will be unable to collect data, and some security recommendations and alerts will become unavailable.
I deleted my default workspace, how can I get it back?
To recover your default workspace, you need to remove the Defender profile/extension, and reinstall the agent.
Reinstalling the Defender profile/extension creates a new default workspace.
Where is the default Log Analytics workspace located?
Depending on your region the default Log Analytics workspace located will be located in various locations. To
check your region see Where is the default Log Analytics workspace created?
Learn More
Learn more from the product manager about Microsoft Defender for Containers in a multicloud environment.
You can also learn how to Protect Containers in GCP with Defender for Containers.
You can also check out the following blogs:
Protect your Google Cloud workloads with Microsoft Defender for Cloud
Introducing Microsoft Defender for Containers
A new name for multicloud security: Microsoft Defender for Cloud
Next steps
Use Defender for Containers to scan your ACR images for vulnerabilities.
Use a service principal with Azure Kubernetes
Service (AKS)
6/15/2022 • 9 minutes to read • Edit Online
To access other Azure Active Directory (Azure AD) resources, an AKS cluster requires either an Azure Active
Directory (AD) service principal or a managed identity. A service principal or managed identity is needed to
dynamically create and manage other Azure resources such as an Azure load balancer or container registry
(ACR).
Managed identities are the recommended way to authenticate with other resources in Azure, and is the default
authentication method for your AKS cluster. For more information about using a managed identity with your
cluster, see Use a system-assigned managed identity.
This article shows how to create and use a service principal for your AKS clusters.
Prerequisites
Azure CLI version 2.0.59 or later. Run az --version to find the version. If you need to install or upgrade, see
Install Azure CLI.
Azure PowerShell version 5.0.0 or later. Run Get-InstalledModule -Name Az to find the version. If you need to
install or upgrade, see Install the Azure Az PowerShell module.
To manually create a service principal with the Azure CLI, use the az ad sp create-for-rbac command.
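For example (the service principal name is an assumption matching the output below):

az ad sp create-for-rbac --name myAKSClusterServicePrincipal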
The output is similar to the following example. Copy the values for appId and password . These values are used
when you create an AKS cluster in the next section.
{
"appId": "559513bd-0c19-4c1a-87cd-851a26afd5fc",
"displayName": "myAKSClusterServicePrincipal",
"name": "http://myAKSClusterServicePrincipal",
"password": "e763725a-5eee-40e8-a466-dc88d980f415",
"tenant": "72f988bf-86f1-41af-91ab-2d7cd011db48"
}
To use an existing service principal when you create an AKS cluster using the az aks create command, use the
--service-principal and --client-secret parameters to specify the appId and password from the output of
the az ad sp create-for-rbac command:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--service-principal <appId> \
--client-secret <password>
NOTE
If you're using an existing service principal with a customized secret, ensure the secret is no longer than 190 bytes.
To delegate permissions, create a role assignment using the az role assignment create command. Assign the
appId to a particular scope, such as a resource group or virtual network resource. A role then defines what
permissions the service principal has on the resource, as shown in the following example:
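A sketch of the role assignment (replace the appId and scope with your own values):

az role assignment create \
    --assignee <appId> \
    --scope <resourceScope> \
    --role Contributor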
NOTE
If you have removed the Contributor role assignment from the node resource group, the operations below may fail.
Permissions granted to a cluster using a system-assigned managed identity may take up to 60 minutes to populate.
The following sections detail common delegations that you may need to assign.
Azure Container Registry
Azure CLI
Azure PowerShell
If you use Azure Container Registry (ACR) as your container image store, you need to grant permissions to the
service principal for your AKS cluster to read and pull images. Currently, the recommended configuration is to
use the az aks create or az aks update command to integrate with a registry and assign the appropriate role for
the service principal. For detailed steps, see Authenticate with Azure Container Registry from Azure Kubernetes
Service.
Networking
You may use advanced networking where the virtual network and subnet or public IP addresses are in another
resource group. Assign the Network Contributor built-in role on the subnet within the virtual network.
Alternatively, you can create a custom role with permissions to access the network resources in that resource
group. For more information, see AKS service permissions.
Storage
If you need to access existing disk resources in another resource group, assign one of the following set of role
permissions:
Create a custom role and define the following role permissions:
Microsoft.Compute/disks/read
Microsoft.Compute/disks/write
Or, assign the Storage Account Contributor built-in role on the resource group
Azure Container Instances
If you use Virtual Kubelet to integrate with AKS and choose to run Azure Container Instances (ACI) in resource
group separate from the AKS cluster, the AKS cluster service principal must be granted Contributor permissions
on the ACI resource group.
Other considerations
Azure CLI
Azure PowerShell
When using AKS and an Azure AD service principal, consider the following:
The service principal for Kubernetes is a part of the cluster configuration. However, don't use this identity to
deploy the cluster.
By default, the service principal credentials are valid for one year. You can update or rotate the service
principal credentials at any time.
Every service principal is associated with an Azure AD application. The service principal for a Kubernetes
cluster can be associated with any valid Azure AD application name (for example:
https://www.contoso.org/example). The URL for the application doesn't have to be a real endpoint.
When you specify the service principal Client ID , use the value of the appId .
On the agent node VMs in the Kubernetes cluster, the service principal credentials are stored in the file
/etc/kubernetes/azure.json
When you use the az aks create command to generate the service principal automatically, the service
principal credentials are written to the file ~/.azure/aksServicePrincipal.json on the machine used to run
the command.
If you don't specify a service principal with AKS CLI commands, the default service principal located at
~/.azure/aksServicePrincipal.json is used.
You can optionally remove the aksServicePrincipal.json file, and AKS creates a new service principal.
When you delete an AKS cluster that was created by az aks create, the service principal created automatically
isn't deleted.
To delete the service principal, query for your clusters servicePrincipalProfile.clientId and then
delete it using the az ad sp delete command. Replace the values for the -g parameter for the
resource group name, and -n parameter for the cluster name:
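A sketch of this query-and-delete, assuming myResourceGroup and myAKSCluster:

az ad sp delete --id $(az aks show -g myResourceGroup -n myAKSCluster --query servicePrincipalProfile.clientId -o tsv)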
Troubleshoot
Azure CLI
Azure PowerShell
The service principal credentials for an AKS cluster are cached by the Azure CLI. If these credentials have expired,
you encounter errors during deployment of the AKS cluster. The following error message when running az aks
create may indicate a problem with the cached service principal credentials:
Check the age of the credentials file by running the following command:
ls -la $HOME/.azure/aksServicePrincipal.json
The default expiration time for the service principal credentials is one year. If your aksServicePrincipal.json file is
older than one year, delete the file and retry deploying the AKS cluster.
Next steps
For more information about Azure Active Directory service principals, see Application and service principal
objects.
For information on how to update the credentials, see Update or rotate the credentials for a service principal in
AKS.
Use a managed identity in Azure Kubernetes
Service
6/15/2022 • 10 minutes to read • Edit Online
An Azure Kubernetes Service (AKS) cluster requires an identity to access Azure resources like load balancers and
managed disks. This identity can be either a managed identity or a service principal. By default, when you create
an AKS cluster, a system-assigned managed identity is automatically created. The identity is managed by the Azure
platform and doesn't require you to provision or rotate any secrets. For more information about managed
identities in Azure AD, see Managed identities for Azure resources.
To use a service principal, you have to create one, as AKS does not create one automatically. Clusters using a service principal eventually reach a state in which the service principal must be renewed to keep the cluster working. Managing service principals adds complexity, so it's easier to use managed identities instead. The same permission requirements apply to both service principals and managed identities.
Managed identities are essentially a wrapper around service principals, and make their management simpler.
Managed identities use certificate-based authentication, and each managed identity's credential expires after 90 days and is rolled after 45 days. AKS uses both system-assigned and user-assigned managed
identity types, and these identities are immutable.
Prerequisites
Azure CLI version 2.23.0 or later. Run az --version to find the version. If you need to install or upgrade, see
Install Azure CLI.
Limitations
Moving or migrating a managed identity-enabled cluster to a different tenant isn't supported.
If the cluster has aad-pod-identity enabled, Node-Managed Identity (NMI) pods modify the nodes' iptables
to intercept calls to the Azure Instance Metadata endpoint. This configuration means any request made to the
Metadata endpoint is intercepted by NMI even if the pod doesn't use aad-pod-identity .
AzurePodIdentityException CRD can be configured to inform aad-pod-identity that any requests to the
Metadata endpoint originating from a pod that matches labels defined in CRD should be proxied without any
processing in NMI. The system pods with kubernetes.azure.com/managedby: aks label in kube-system
namespace should be excluded in aad-pod-identity by configuring the AzurePodIdentityException CRD. For
more information, see Disable aad-pod-identity for a specific pod or application. To configure an exception,
install the mic-exception YAML.
Identity: Control plane. Name: AKS Cluster Name. Use case: used by AKS control plane components to manage cluster resources, including ingress load balancers and AKS managed public IPs, the Cluster Autoscaler, and the Azure Disk & File CSI drivers. Default permissions: Contributor role for the Node resource group. Bring your own identity: Supported.
You can create an AKS cluster using a system-assigned managed identity by running the following CLI
command.
First, create an Azure resource group:
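A sketch of creating the resource group and then the cluster with a system-assigned managed identity (the names are assumptions matching later examples):

az group create --name myResourceGroup --location eastus

az aks create \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-managed-identity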
Once the cluster is created, you can then deploy your application workloads to the new cluster and interact with
it just as you've done with service-principal-based AKS clusters.
Finally, get credentials to access the cluster:
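For example:

az aks get-credentials --resource-group myResourceGroup --name myManagedCluster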
NOTE
An update will only work if there is an actual VHD update to consume. If you are running the latest VHD, you'll need to
wait until the next VHD is available in order to perform the update.
NOTE
After updating, your cluster's control plane and addon pods use the managed identity, but kubelet will continue using a service principal until you upgrade your agentpool. Perform an az aks nodepool upgrade --node-image-only on your nodes to complete the update to a managed identity.
If your cluster was using --attach-acr to pull images from Azure Container Registry, after updating your cluster to a managed identity, you need to rerun az aks update --attach-acr <ACR Resource ID> to let the newly created kubelet managed identity get permission to pull from ACR. Otherwise, you won't be able to pull from ACR after the upgrade.
The Azure CLI ensures your addon's permissions are correctly set after migrating. If you're not using the Azure CLI to perform the migration, you'll need to handle the addon identity's permissions yourself. Here is one example using an Azure Resource Manager template.
WARNING
A nodepool upgrade will cause downtime for your AKS cluster as the nodes in the nodepools will be cordoned/drained
and then reimaged.
NOTE
If you are not using the CLI, but are using your own VNet, attached Azure disk, static IP address, route table, or user-assigned kubelet identity that is outside of the worker node resource group, it's recommended to use a user-assigned control plane identity. For a system-assigned control plane identity, the identity ID isn't available before the cluster is created, which delays the role assignments from taking effect.
{
"clientId": "<client-id>",
"id":
"/subscriptions/<subscriptionid>/resourcegroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssi
gnedIdentities/myIdentity",
"location": "eastus",
"name": "myIdentity",
"principalId": "<principal-id>",
"resourceGroup": "myResourceGroup",
"tags": {},
"tenantId": "<tenant-id>",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}
Example:
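A sketch only; the required role and scope depend on which resource lives outside the worker node resource group (for example, a custom virtual network):

az role assignment create \
    --assignee <control-plane-identity-principal-id> \
    --role "Network Contributor" \
    --scope <custom-vnet-resource-id>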
For a user-assigned kubelet identity that is outside the default worker node resource group, you need to assign the Managed Identity Operator role on the kubelet identity.
Example:
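A sketch of that role assignment, using the control plane identity's principal ID and the kubelet identity's resource ID:

az role assignment create \
    --assignee <control-plane-identity-principal-id> \
    --role "Managed Identity Operator" \
    --scope <kubelet-identity-resource-id>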
NOTE
Permissions granted to your cluster's managed identity used by Azure may take up to 60 minutes to populate.
NOTE
USDOD Central, USDOD East, USGov Iowa regions in Azure US Government cloud aren't currently supported.
AKS will create a system-assigned kubelet identity in the Node resource group if you do not specify your own kubelet
managed identity.
If you don't have a managed identity, you should create one by running the az identity create command.
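For example, a sketch matching the output below:

az identity create --name myIdentity --resource-group myResourceGroup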
{
"clientId": "<client-id>",
"clientSecretUrl": "<clientSecretUrl>",
"id":
"/subscriptions/<subscriptionid>/resourcegroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssi
gnedIdentities/myIdentity",
"location": "westus2",
"name": "myIdentity",
"principalId": "<principal-id>",
"resourceGroup": "myResourceGroup",
"tags": {},
"tenantId": "<tenant-id>",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}
Run the following command to create a cluster with your existing identity:
az aks create \
--resource-group myResourceGroup \
--name myManagedCluster \
--network-plugin azure \
--vnet-subnet-id <subnet-id> \
--docker-bridge-address 172.17.0.1/16 \
--dns-service-ip 10.2.0.10 \
--service-cidr 10.2.0.0/24 \
--enable-managed-identity \
--assign-identity <identity-resource-id>
A successful cluster creation using your own managed identity should resemble the following
userAssignedIdentities profile information:
"identity": {
"principalId": null,
"tenantId": null,
"type": "UserAssigned",
"userAssignedIdentities": {
"/subscriptions/<subscriptionid>/resourcegroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssi
gnedIdentities/myIdentity": {
"clientId": "<client-id>",
"principalId": "<principal-id>"
}
}
},
If you don't have a kubelet managed identity, you can create one by running the following az identity create
command:
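For example, a sketch matching the output below:

az identity create --name myKubeletIdentity --resource-group myResourceGroup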
{
"clientId": "<client-id>",
"clientSecretUrl": "<clientSecretUrl>",
"id":
"/subscriptions/<subscriptionid>/resourcegroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssi
gnedIdentities/myKubeletIdentity",
"location": "westus2",
"name": "myKubeletIdentity",
"principalId": "<principal-id>",
"resourceGroup": "myResourceGroup",
"tags": {},
"tenantId": "<tenant-id>",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}
az aks create \
--resource-group myResourceGroup \
--name myManagedCluster \
--network-plugin azure \
--vnet-subnet-id <subnet-id> \
--docker-bridge-address 172.17.0.1/16 \
--dns-service-ip 10.2.0.10 \
--service-cidr 10.2.0.0/24 \
--enable-managed-identity \
--assign-identity <identity-resource-id> \
--assign-kubelet-identity <kubelet-identity-resource-id>
A successful AKS cluster creation using your own kubelet managed identity should resemble the following
output:
"identity": {
"principalId": null,
"tenantId": null,
"type": "UserAssigned",
"userAssignedIdentities": {
"/subscriptions/<subscriptionid>/resourcegroups/resourcegroups/providers/Microsoft.ManagedIdentity/userAssig
nedIdentities/myIdentity": {
"clientId": "<client-id>",
"principalId": "<principal-id>"
}
}
},
"identityProfile": {
"kubeletidentity": {
"clientId": "<client-id>",
"objectId": "<object-id>",
"resourceId":
"/subscriptions/<subscriptionid>/resourcegroups/resourcegroups/providers/Microsoft.ManagedIdentity/userAssig
nedIdentities/myKubeletIdentity"
}
},
WARNING
Updating the kubelet managed identity upgrades the node pool, which causes downtime for your AKS cluster as the nodes in the node pools are cordoned/drained and then reimaged.
NOTE
If your cluster was using --attach-acr to pull images from Azure Container Registry, after updating your cluster's kubelet identity, you need to rerun az aks update --attach-acr <ACR Resource ID> to let the newly created kubelet managed identity get permission to pull from ACR. Otherwise, you won't be able to pull from ACR after the upgrade.
Get the current control plane identity for your AKS cluster
Confirm your AKS cluster is using user-assigned control plane identity with the following CLI command:
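For example, a sketch assuming the cluster names used earlier in this article:

az aks show -g myResourceGroup -n myManagedCluster --query "servicePrincipalProfile"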
If the cluster is using a managed identity, the output shows clientId with a value of msi . A cluster using a
service principal shows an object ID. For example:
{
"clientId": "msi"
}
After verifying the cluster is using a managed identity, you can find the control plane identity's resource ID by
running the following command:
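For example:

az aks show -g myResourceGroup -n myManagedCluster --query "identity"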
For user-assigned control plane identity, the output should look like:
{
  "principalId": null,
  "tenantId": null,
  "type": "UserAssigned",
  "userAssignedIdentities": {
    "<identity-resource-id>": {
      "clientId": "<client-id>",
      "principalId": "<principal-id>"
    }
  }
},
{
"clientId": "<client-id>",
"clientSecretUrl": "<clientSecretUrl>",
"id":
"/subscriptions/<subscriptionid>/resourcegroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssi
gnedIdentities/myKubeletIdentity",
"location": "westus2",
"name": "myKubeletIdentity",
"principalId": "<principal-id>",
"resourceGroup": "myResourceGroup",
"tags": {},
"tenantId": "<tenant-id>",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}
Now you can use the following command to update your cluster with your existing identities. Provide the
control plane identity resource ID via assign-identity and the kubelet managed identity via
assign-kubelet-identity :
az aks update \
--resource-group myResourceGroup \
--name myManagedCluster \
--enable-managed-identity \
--assign-identity <identity-resource-id> \
--assign-kubelet-identity <kubelet-identity-resource-id>
A successful cluster update using your own kubelet managed identity contains the following output:
"identity": {
"principalId": null,
"tenantId": null,
"type": "UserAssigned",
"userAssignedIdentities": {
"/subscriptions/<subscriptionid>/resourcegroups/resourcegroups/providers/Microsoft.ManagedIdentity/userAssig
nedIdentities/myIdentity": {
"clientId": "<client-id>",
"principalId": "<principal-id>"
}
}
},
"identityProfile": {
"kubeletidentity": {
"clientId": "<client-id>",
"objectId": "<object-id>",
"resourceId":
"/subscriptions/<subscriptionid>/resourcegroups/resourcegroups/providers/Microsoft.ManagedIdentity/userAssig
nedIdentities/myKubeletIdentity"
}
},
Next steps
Use Azure Resource Manager templates to create a managed identity-enabled cluster.
Use Azure role-based access control to define
access to the Kubernetes configuration file in Azure
Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
You can interact with Kubernetes clusters using the kubectl tool. The Azure CLI provides an easy way to get the
access credentials and configuration information to connect to your AKS clusters using kubectl . To limit who
can get that Kubernetes configuration (kubeconfig) information and to limit the permissions they then have, you
can use Azure role-based access control (Azure RBAC).
This article shows you how to assign Azure roles that limit who can get the configuration information for an AKS
cluster.
IMPORTANT
In some cases, the user.name in the account is different than the userPrincipalName, such as with Azure AD guest users:
In this case, set the value of ACCOUNT_UPN to the userPrincipalName from the Azure AD user. For example, if your
account user.name is [email protected]:
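A minimal sketch of that step (the UPN value is a hypothetical placeholder for the one shown in Azure AD):
# Hypothetical placeholder for the guest account's userPrincipalName from Azure AD
ACCOUNT_UPN='<userPrincipalName-from-Azure-AD>'
ACCOUNT_ID=$(az ad user show --id $ACCOUNT_UPN --query objectId -o tsv)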
TIP
If you want to assign permissions to an Azure AD group, update the --assignee parameter shown in the previous
example with the object ID for the group rather than a user. To obtain the object ID for a group, use the az ad group
show command. The following example gets the object ID for the Azure AD group named appdev:
az ad group show --group appdev --query objectId -o tsv
You can change the previous assignment to the Cluster User Role as needed.
The following example output shows the role assignment has been successfully created:
{
"canDelegate": null,
"id":
"/subscriptions/<guid>/resourcegroups/myResourceGroup/providers/Microsoft.ContainerService/managedClusters/m
yAKSCluster/providers/Microsoft.Authorization/roleAssignments/b2712174-5a41-4ecb-82c5-12b8ad43d4fb",
"name": "b2712174-5a41-4ecb-82c5-12b8ad43d4fb",
"principalId": "946016dd-9362-4183-b17d-4c416d1f8f61",
"resourceGroup": "myResourceGroup",
"roleDefinitionId": "/subscriptions/<guid>/providers/Microsoft.Authorization/roleDefinitions/0ab01a8-8aac-
4efd-b8c2-3ee1fb270be8",
"scope":
"/subscriptions/<guid>/resourcegroups/myResourceGroup/providers/Microsoft.ContainerService/managedClusters/m
yAKSCluster",
"type": "Microsoft.Authorization/roleAssignments"
}
You can then use the kubectl config view command to verify that the context for the cluster shows that the
admin configuration information has been applied:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://myaksclust-myresourcegroup-19da35-4839be06.hcp.eastus.azmk8s.io:443
  name: myAKSCluster
contexts:
- context:
    cluster: myAKSCluster
    user: clusterAdmin_myResourceGroup_myAKSCluster
  name: myAKSCluster-admin
current-context: myAKSCluster-admin
kind: Config
preferences: {}
users:
- name: clusterAdmin_myResourceGroup_myAKSCluster
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
    token: e9f2f819a4496538b02cefff94e61d35
Next steps
For enhanced security on access to AKS clusters, integrate Azure Active Directory authentication.
Secure access to the API server using authorized IP
address ranges in Azure Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
In Kubernetes, the API server receives requests to perform actions in the cluster such as to create resources or
scale the number of nodes. The API server is the central way to interact with and manage a cluster. To improve
cluster security and minimize attacks, the API server should only be accessible from a limited set of IP address
ranges.
This article shows you how to use API server authorized IP address ranges to limit which IP addresses and
CIDRs can access the control plane.
IMPORTANT
By default, your cluster uses the Standard SKU load balancer which you can use to configure the outbound gateway.
When you enable API server authorized IP ranges during cluster creation, the public IP for your cluster is also allowed by
default in addition to the ranges you specify. If you specify "" or no value for --api-server-authorized-ip-ranges , API
server authorized IP ranges will be disabled. Note that if you're using PowerShell, use
--api-server-authorized-ip-ranges="" (with equals sign) to avoid any parsing issues.
The following example creates a single-node cluster named myAKSCluster in the resource group named
myResourceGroup with API server authorized IP ranges enabled. The allowed IP address range is
73.140.245.0/24:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--api-server-authorized-ip-ranges 73.140.245.0/24 \
--generate-ssh-keys
NOTE
You should add these ranges to an allow list:
The firewall public IP address
Any range that represents networks that you'll administer the cluster from
The upper limit for the number of IP ranges you can specify is 200.
The rules can take up to two minutes to propagate. Allow up to that time when testing the connection.
Specify the outbound IPs for the Standard SKU load balancer
When creating an AKS cluster, if you specify the outbound IP addresses or prefixes for the cluster, those
addresses or prefixes are allowed as well. For example:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--api-server-authorized-ip-ranges 73.140.245.0/24 \
--load-balancer-outbound-ips <publicIpId1>,<publicIpId2> \
--generate-ssh-keys
In the above example, all IPs provided in the --load-balancer-outbound-ips parameter are allowed along
with the IPs in the --api-server-authorized-ip-ranges parameter.
Instead, you can specify the --load-balancer-outbound-ip-prefixes parameter to allow outbound load balancer
IP prefixes.
Allow only the outbound public IP of the Standard SKU load balancer
When you enable API server authorized IP ranges during cluster creation, the outbound public IP for the
Standard SKU load balancer for your cluster is also allowed by default in addition to the ranges you specify. To
allow only the outbound public IP of the Standard SKU load balancer, use 0.0.0.0/32 when specifying the
--api-server-authorized-ip-ranges parameter.
In the following example, only the outbound public IP of the Standard SKU load balancer is allowed, and you can
only access the API server from the nodes within the cluster.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--api-server-authorized-ip-ranges 0.0.0.0/32 \
--generate-ssh-keys
The following example updates API server authorized IP ranges on the cluster named myAKSCluster in the
resource group named myResourceGroup. The IP address range to authorize is 73.140.245.0/24:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--api-server-authorized-ip-ranges 73.140.245.0/24
You can also use 0.0.0.0/32 when specifying the --api-server-authorized-ip-ranges parameter to allow only the
public IP of the Standard SKU load balancer.
To disable authorized IP ranges, use az aks update and specify an empty range:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--api-server-authorized-ip-ranges ""
To find the currently authorized IP ranges, use az aks show and query apiServerAccessProfile.authorizedIpRanges:
az aks show \
--resource-group myResourceGroup \
--name myAKSCluster \
--query apiServerAccessProfile.authorizedIpRanges
Update, disable, and find authorized IP ranges using Azure portal
The above operations of adding, updating, finding, and disabling authorized IP ranges can also be performed in
the Azure portal. To access, navigate to Networking under Settings in the menu blade of your cluster resource.
Another option is to run the following command on Windows systems to get your public IPv4 address, or you can
use the steps in Find your IP address.
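A sketch (ifconfig.me is one of several public IP echo services, and curl.exe is available on recent Windows releases):
curl -4 ifconfig.me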
You can also find this address by searching "what is my IP address" in an internet browser.
Next steps
In this article, you enabled API server authorized IP ranges. This approach is one part of how you can run a
secure AKS cluster.
For more information, see Security concepts for applications and clusters in AKS and Best practices for cluster
security and upgrades in AKS.
Add KMS etcd encryption to an Azure Kubernetes
Service (AKS) cluster (Preview)
6/15/2022 • 3 minutes to read • Edit Online
This article shows you how to enable encryption at rest for your Kubernetes data in etcd using Azure Key Vault
with Key Management Service (KMS) plugin. The KMS plugin allows you to:
Use a key in Key Vault for etcd encryption
Bring your own keys
Provide encryption at rest for secrets stored in etcd
For more information on using the KMS plugin, see Encrypting Secret Data at Rest.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/AzureKeyVaultKmsPreview')].
{Name:name,State:properties.state}"
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the az
provider register command:
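A sketch of that refresh:
az provider register --namespace Microsoft.ContainerService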
Limitations
The following limitations apply when you integrate KMS etcd encryption with AKS:
Disabling of the KMS etcd encryption feature.
Changing of key ID, including key name and key version.
Deletion of the key, Key Vault, or the associated identity.
KMS etcd encryption doesn't work with system-assigned managed identity. The key vault access policy must be
set before the feature is enabled, but a system-assigned managed identity isn't available until after cluster
creation, so there's a cyclic dependency.
Using more than 2000 secrets in a cluster.
Bring your own (BYO) Azure Key Vault from another tenant.
export KEY_ID=$(az keyvault key show --name MyKeyName --vault-name MyKeyVault --query 'key.kid' -o tsv)
echo $KEY_ID
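Use az identity show to get the identity object ID; a minimal sketch, assuming an identity named MyIdentity in MyResourceGroup:
export IDENTITY_OBJECT_ID=$(az identity show --name MyIdentity --resource-group MyResourceGroup --query 'principalId' -o tsv)
echo $IDENTITY_OBJECT_ID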
The above example stores the value of the Identity Object ID in IDENTITY_OBJECT_ID.
Use az identity show to get Identity Resource ID.
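A sketch with the same placeholder names:
export IDENTITY_RESOURCE_ID=$(az identity show --name MyIdentity --resource-group MyResourceGroup --query 'id' -o tsv)
echo $IDENTITY_RESOURCE_ID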
The above example stores the value of the Identity Resource ID in IDENTITY_RESOURCE_ID.
Use the following command to update all secrets. Otherwise, the old secrets aren't encrypted by KMS.
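A common way to rewrite every secret so it passes through the new encryption path (run it with credentials that can read and replace secrets cluster-wide):
kubectl get secrets --all-namespaces -o json | kubectl replace -f -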
Update or rotate the credentials for Azure Kubernetes Service (AKS)
AKS clusters created with a service principal have a one-year expiration time. As you near the expiration date,
you can reset the credentials to extend the service principal for an additional period of time. You may also want
to update, or rotate, the credentials as part of a defined security policy. This article details how to update these
credentials for an AKS cluster.
You may also have integrated your AKS cluster with Azure Active Directory (Azure AD) and used it as an
authentication provider for your cluster. In that case, two more identities are created for your cluster, the
Azure AD Server App and the Azure AD Client App, and you can also reset those credentials.
Alternatively, you can use a managed identity for permissions instead of a service principal. Managed identities
are easier to manage than service principals and do not require updates or rotations. For more information, see
Use managed identities.
WARNING
If you choose to create a new service principal, wait around 30 minutes for the service principal permission to propagate
across all regions. Updating a large AKS cluster to use these credentials may take a long time to complete.
With a variable set that contains the service principal ID, now reset the credentials using az ad sp credential
reset. The following example lets the Azure platform generate a new secure secret for the service principal. This
new secure secret is also stored as a variable.
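A minimal sketch of those two steps, assuming the cluster names used elsewhere in this article (newer Azure CLI versions may expect --id instead of --name on the credential reset):
SP_ID=$(az aks show --resource-group myResourceGroup --name myAKSCluster \
    --query servicePrincipalProfile.clientId -o tsv)
# Generate a new secret and capture it for the update step
SP_SECRET=$(az ad sp credential reset --name "$SP_ID" --query password -o tsv)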
Now continue on to update the AKS cluster with the new service principal credentials. This step is necessary for
the service principal changes to reflect on the AKS cluster.
Create a new service principal
If you chose to update the existing service principal credentials in the previous section, skip this step. Continue
to update AKS cluster with new service principal credentials.
To create a service principal and then update the AKS cluster to use these new credentials, use the az ad sp
create-for-rbac command.
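A minimal sketch (role assignments can be added separately or with additional parameters, depending on your CLI version):
az ad sp create-for-rbac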
The output is similar to the following example. Make a note of your own appId and password . These values are
used in the next step.
{
"appId": "7d837646-b1f3-443d-874c-fd83c7c739c5",
"name": "7d837646-b1f3-443d-874c-fd83c7c739c",
"password": "a5ce83c9-9186-426d-9183-614597c7f2f7",
"tenant": "a4342dc8-cd0e-4742-a467-3129c469d0e5"
}
Now define variables for the service principal ID and client secret using the output from your own az ad sp
create-for-rbac command, as shown in the following example. The SP_ID is your appId, and the SP_SECRET is
your password:
SP_ID=7d837646-b1f3-443d-874c-fd83c7c739c5
SP_SECRET=a5ce83c9-9186-426d-9183-614597c7f2f7
Now continue on to update the AKS cluster with the new service principal credentials. This step is necessary for
the service principal changes to reflect on the AKS cluster.
Regardless of whether you chose to update the credentials for the existing service principal or create a service
principal, you now update the AKS cluster with your new credentials using the az aks update-credentials
command. The variables for the --service-principal and --client-secret are used:
az aks update-credentials \
--resource-group myResourceGroup \
--name myAKSCluster \
--reset-service-principal \
--service-principal "$SP_ID" \
--client-secret "${SP_SECRET:Q}"
NOTE
${SP_SECRET:Q} escapes any special characters in SP_SECRET , which can cause the command to fail. The above
example works for Azure Cloud Shell and zsh terminals. For BASH terminals, use ${SP_SECRET@Q} .
For small and midsize clusters, it takes a few moments for the service principal credentials to be updated in
AKS.
If you use the legacy Azure AD integration, you can also reset the Azure AD Server and Client App credentials with the --reset-aad parameters:
az aks update-credentials \
--resource-group myResourceGroup \
--name myAKSCluster \
--reset-aad \
--aad-server-app-id <SERVER APPLICATION ID> \
--aad-server-app-secret <SERVER APPLICATION SECRET> \
--aad-client-app-id <CLIENT APPLICATION ID>
Next steps
In this article, the service principal for the AKS cluster itself and the Azure AD Integration Applications were
updated. For more information on how to manage identity for workloads within a cluster, see Best practices for
authentication and authorization in AKS.
AKS-managed Azure Active Directory integration
6/15/2022 • 11 minutes to read • Edit Online
AKS-managed Azure AD integration simplifies the Azure AD integration process. Previously, users were required
to create a client and server app, and required the Azure AD tenant to grant Directory Read permissions. In the
new version, the AKS resource provider manages the client and server apps for you.
Limitations
AKS-managed Azure AD integration can't be disabled
Changing an AKS-managed Azure AD integrated cluster to legacy AAD is not supported
Clusters without Kubernetes RBAC enabled aren't supported for AKS-managed Azure AD integration
Prerequisites
The Azure CLI version 2.29.0 or later
Kubectl with a minimum version of 1.18.1 or kubelogin
If you're using Helm, a minimum Helm version of 3.3.
IMPORTANT
You must use Kubectl with a minimum version of 1.18.1 or kubelogin. The difference between the minor versions of
Kubernetes and kubectl should not be more than 1 version. If you don't use the correct version, you will notice
authentication issues.
To create a new Azure AD group for your cluster administrators, use the following command:
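A sketch, assuming a group name of myAKSAdminGroup:
az ad group create --display-name myAKSAdminGroup --mail-nickname myAKSAdminGroup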
Create an AKS cluster, and enable administration access for your Azure AD group
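A minimal sketch, with placeholder names and the admin group's object ID from the previous step:
az aks create \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-aad \
    --aad-admin-group-object-ids <admin-group-object-id> \
    --aad-tenant-id <tenant-id>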
A successful creation of an AKS-managed Azure AD cluster has the following section in the response body
"AADProfile": {
"adminGroupObjectIds": [
"5d24****-****-****-****-****afa27aed"
],
"clientAppId": null,
"managed": true,
"serverAppId": null,
"serverAppSecret": null,
"tenantId": "72f9****-****-****-****-****d011db47"
}
Configure Azure role-based access control (Azure RBAC) to configure additional security groups for your
clusters.
If you're permanently blocked by not having access to a valid Azure AD group with access to your cluster, you
can still obtain the admin credentials to access the cluster directly.
To do these steps, you'll need to have access to the Azure Kubernetes Service Cluster Admin built-in role.
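The admin credentials can be pulled directly in that case (a sketch; names are placeholders):
az aks get-credentials --resource-group myResourceGroup --name myManagedCluster --admin
AKS-managed Azure AD integration can also be enabled on an existing Kubernetes RBAC-enabled cluster with az aks update; a minimal sketch:
az aks update \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-aad \
    --aad-admin-group-object-ids <admin-group-object-id> \
    --aad-tenant-id <tenant-id>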
A successful activation of an AKS-managed Azure AD cluster has the following section in the response body
"AADProfile": {
"adminGroupObjectIds": [
"5d24****-****-****-****-****afa27aed"
],
"clientAppId": null,
"managed": true,
"serverAppId": null,
"serverAppSecret": null,
"tenantId": "72f9****-****-****-****-****d011db47"
}
Download user credentials again to access your cluster by following the steps here.
"AADProfile": {
"adminGroupObjectIds": [
"5d24****-****-****-****-****afa27aed"
],
"clientAppId": null,
"managed": true,
"serverAppId": null,
"serverAppSecret": null,
"tenantId": "72f9****-****-****-****-****d011db47"
}
To update kubeconfig and access the cluster, follow the steps here.
NOTE
On clusters with Azure AD integration enabled, users belonging to a group specified by aad-admin-group-object-ids
will still be able to gain access via non-admin credentials. On clusters without Azure AD integration enabled and
properties.disableLocalAccounts set to true, obtaining both user and admin credentials will fail.
NOTE
After disabling local accounts on an existing AKS cluster where users might have used a local account, the admin
must rotate the cluster certificates to revoke certificates those users might have access to. If this is a new
cluster, no action is required.
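Local accounts can be disabled when the cluster is created by adding the --disable-local-accounts flag; a minimal sketch with placeholder names:
az aks create \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-aad \
    --aad-admin-group-object-ids <admin-group-object-id> \
    --disable-local-accounts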
In the output, confirm local accounts have been disabled by checking the field properties.disableLocalAccounts
is set to true:
"properties": {
...
"disableLocalAccounts": true,
...
}
Attempting to get admin credentials will fail with an error message indicating the feature is preventing access:
Operation failed with status: 'Bad Request'. Details: Getting static credential is not allowed because this
cluster is set to disable local accounts.
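On an existing cluster, the same flag can be applied with az aks update (a sketch):
az aks update \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --disable-local-accounts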
In the output, confirm local accounts have been disabled by checking the field properties.disableLocalAccounts
is set to true:
"properties": {
...
"disableLocalAccounts": true,
...
}
Attempting to get admin credentials will fail with an error message indicating the feature is preventing access:
Operation failed with status: 'Bad Request'. Details: Getting static credential is not allowed because this
cluster is set to disable local accounts.
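To re-enable local accounts, use az aks update with the --enable-local-accounts flag (a sketch):
az aks update \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-local-accounts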
In the output, confirm local accounts have been re-enabled by checking the field
properties.disableLocalAccounts is set to false:
"properties": {
...
"disableLocalAccounts": false,
...
}
NOTE
Azure AD Conditional Access is an Azure AD Premium capability.
To create an example Conditional Access policy to use with AKS, complete the following steps:
1. At the top of the Azure portal, search for and select Azure Active Directory.
2. In the menu for Azure Active Directory on the left-hand side, select Enterprise applications.
3. In the menu for Enterprise applications on the left-hand side, select Conditional Access.
4. In the menu for Conditional Access on the left-hand side, select Policies then New policy.
Follow the instructions to sign in again. Notice there is an error message stating you are successfully logged in,
but your admin requires the device requesting access to be managed by your Azure AD to access the resource.
In the Azure portal, navigate to Azure Active Directory, select Enterprise applications then under Activity select
Sign-ins. Notice an entry at the top with a Status of Failed and a Conditional Access of Success. Select the entry
then select Conditional Access in Details. Notice your Conditional Access policy is listed.
NOTE
PIM is an Azure AD Premium capability requiring a Premium P2 SKU. For more on Azure AD SKUs, see the pricing guide.
To integrate just-in-time access requests with an AKS cluster using AKS-managed Azure AD integration,
complete the following steps:
1. At the top of the Azure portal, search for and select Azure Active Directory.
2. Take note of the Tenant ID, referred to for the rest of these instructions as <tenant-id>
3. In the menu for Azure Active Directory on the left-hand side, under Manage select Groups then New Group.
4. Make sure a Group Type of Security is selected and enter a group name, such as myJITGroup. Under Azure
AD Roles can be assigned to this group (Preview), select Yes. Finally, select Create.
5. You will be brought back to the Groups page. Select your newly created group and take note of the Object ID,
referred to for the rest of these instructions as <object-id> .
6. Deploy an AKS cluster with AKS-managed Azure AD integration by using the <tenant-id> and <object-id>
values from earlier (see the sketch after this list):
7. Back in the Azure portal, in the menu for Activity on the left-hand side, select Privileged Access (Preview) and
select Enable Privileged Access.
9. Select a role of member, and select the users and groups to whom you wish to grant cluster access. These
assignments can be modified at any time by a group admin. When you're ready to move on, select Next.
10. Choose an assignment type of Active, the desired duration, and provide a justification. When you're ready to
proceed, select Assign. For more on assignment types, see Assign eligibility for a privileged access group
(preview) in Privileged Identity Management.
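For step 6, a minimal sketch of the deployment command, assuming placeholder resource names:
az aks create \
    --resource-group myResourceGroup \
    --name myManagedCluster \
    --enable-aad \
    --aad-admin-group-object-ids <object-id> \
    --aad-tenant-id <tenant-id>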
Once the assignments have been made, verify just-in-time access is working by accessing the cluster. For
example:
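A sketch of that check, assuming the placeholder cluster names from the earlier step:
az aks get-credentials --resource-group myResourceGroup --name myManagedCluster
kubectl get nodes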
Note the authentication requirement and follow the steps to authenticate. If successful, you should see output
similar to the following:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
AAAAAAAAA to authenticate.
NAME STATUS ROLES AGE VERSION
aks-nodepool1-61156405-vmss000000 Ready agent 6m36s v1.18.14
aks-nodepool1-61156405-vmss000001 Ready agent 6m42s v1.18.14
aks-nodepool1-61156405-vmss000002 Ready agent 6m33s v1.18.14
az role assignment create --role "Azure Kubernetes Service RBAC Reader" --assignee <AAD-ENTITY-ID> --scope
$AKS_ID/namespaces/<namespace-name>
3. Associate the group you just configured at the namespace level with PIM to complete the configuration.
Troubleshooting
If kubectl get nodes returns an error similar to the following:
Error from server (Forbidden): nodes is forbidden: User "aaaa11111-11aa-aa11-a1a1-111111aaaaa" cannot list
resource "nodes" in API group "" at the cluster scope
Make sure the admin of the security group has given your account an Active assignment.
Next steps
Learn about Azure RBAC integration for Kubernetes Authorization
Learn about Azure AD integration with Kubernetes RBAC.
Use kubelogin to access features for Azure authentication that aren't available in kubectl.
Learn more about AKS and Kubernetes identity concepts.
Use Azure Resource Manager (ARM) templates to create AKS-managed Azure AD enabled clusters.
Integrate Azure Active Directory with Azure
Kubernetes Service using the Azure CLI (legacy)
6/15/2022 • 8 minutes to read • Edit Online
WARNING
The feature described in this document, Azure AD Integration (legacy), will be deprecated on February 29th, 2024.
AKS has a new, improved AKS-managed Azure AD experience that doesn't require you to manage server or client
applications. If you want to migrate, follow the instructions here.
Azure Kubernetes Service (AKS) can be configured to use Azure Active Directory (AD) for user authentication. In
this configuration, you can log into an AKS cluster using an Azure AD authentication token. Cluster operators
can also configure Kubernetes role-based access control (Kubernetes RBAC) based on a user's identity or
directory group membership.
This article shows you how to create the required Azure AD components, then deploy an Azure AD-enabled
cluster and create a basic Kubernetes role in the AKS cluster.
For the complete sample script used in this article, see the Azure CLI samples for AKS integration with Azure AD.
aksname="myakscluster"
Now create a service principal for the server app using the az ad sp create command. This service principal is
used to authenticate itself within the Azure platform. Then, get the service principal secret using the az ad sp
credential reset command and assign to the variable named serverApplicationSecret for use in one of the
following steps:
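A minimal sketch, assuming $serverApplicationId was captured when the server application was registered (older Azure CLI versions accept --name on the credential reset; newer ones use --id):
az ad sp create --id $serverApplicationId
serverApplicationSecret=$(az ad sp credential reset \
    --name $serverApplicationId \
    --query password -o tsv)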
The Azure AD service principal needs permissions to perform the following actions:
Read directory data
Sign in and read user profile
Assign these permissions using the az ad app permission add command:
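A sketch of the shape of that command; the permission GUIDs are placeholders that must be looked up for the Microsoft Graph API (00000003-0000-0000-c000-000000000000):
az ad app permission add \
    --id $serverApplicationId \
    --api 00000003-0000-0000-c000-000000000000 \
    --api-permissions <sign-in-and-read-user-profile-permission-id>=Scope <read-directory-data-permission-id>=Role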
Finally, grant the permissions assigned in the previous step for the server application using the az ad app
permission grant command. This step fails if the current account is not a tenant admin. You also need to add
permissions for the Azure AD application to request information that may otherwise require administrative consent,
using az ad app permission admin-consent:
az ad app permission grant --id $serverApplicationId --api 00000003-0000-0000-c000-000000000000
az ad app permission admin-consent --id $serverApplicationId
Create a service principal for the client application using the az ad sp create command:
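A sketch, assuming $clientApplicationId was captured when the client application was registered:
az ad sp create --id $clientApplicationId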
Get the oAuth2 ID for the server app to allow the authentication flow between the two app components using
the az ad app show command. This oAuth2 ID is used in the next step.
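A sketch of that lookup (the JMESPath below matches older Azure CLI output; newer versions expose the value under api.oauth2PermissionScopes):
oAuthPermissionId=$(az ad app show --id $serverApplicationId --query "oauth2Permissions[0].id" -o tsv)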
Add the permissions for the client application and server application components to use the oAuth2
communication flow using the az ad app permission add command. Then, grant permissions for the client
application to communicate with the server application using the az ad app permission grant command:
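A sketch of those two commands, using the variables from the previous steps (newer CLI versions may also require a --scope on the grant):
az ad app permission add --id $clientApplicationId --api $serverApplicationId --api-permissions ${oAuthPermissionId}=Scope
az ad app permission grant --id $clientApplicationId --api $serverApplicationId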
Get the tenant ID of your Azure subscription using the az account show command. Then, create the AKS cluster
using the az aks create command. The command to create the AKS cluster provides the server and client
application IDs, the server application service principal secret, and your tenant ID:
tenantId=$(az account show --query tenantId -o tsv)
az aks create \
--resource-group myResourceGroup \
--name $aksname \
--node-count 1 \
--generate-ssh-keys \
--aad-server-app-id $serverApplicationId \
--aad-server-app-secret $serverApplicationSecret \
--aad-client-app-id $clientApplicationId \
--aad-tenant-id $tenantId
Finally, get the cluster admin credentials using the az aks get-credentials command. In one of the following
steps, you get the regular user cluster credentials to see the Azure AD authentication flow in action.
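A sketch of the admin credential pull, assuming the resource group name used elsewhere in this article:
az aks get-credentials --resource-group myResourceGroup --name $aksname --admin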
IMPORTANT
If the user you grant the Kubernetes RBAC binding for is in the same Azure AD tenant, assign permissions based on the
userPrincipalName. If the user is in a different Azure AD tenant, query for and use the objectId property instead.
Create a YAML manifest named basic-azure-ad-binding.yaml and paste the following contents. On the last line,
replace userPrincipalName_or_objectId with the UPN or object ID output from the previous command:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: contoso-cluster-admins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: userPrincipalName_or_objectId
Create the ClusterRoleBinding using the kubectl apply command and specify the filename of your YAML
manifest:
kubectl apply -f basic-azure-ad-binding.yaml
Now use the kubectl get pods command to view pods across all namespaces:
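For example, to list pods across all namespaces:
kubectl get pods --all-namespaces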
You receive a sign in prompt to authenticate using Azure AD credentials using a web browser. After you've
successfully authenticated, the kubectl command displays the pods in the AKS cluster, as shown in the
following example output:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
BYMK7UXVD to authenticate.
The authentication token received for kubectl is cached. You are only reprompted to sign in when the token has
expired or the Kubernetes config file is re-created.
If you see an authorization error message after you've successfully signed in using a web browser as in the
following example output, check the following possible issues:
You defined the appropriate object ID or UPN, depending on if the user account is in the same Azure AD
tenant or not.
The user is not a member of more than 200 groups.
The secret defined in the application registration for the server matches the value configured using
--aad-server-app-secret
Be sure that only one version of kubectl is installed on your machine at a time. Conflicting versions can cause
issues during authorization. To install the latest version, use az aks install-cli.
Next steps
For the complete script that contains the commands shown in this article, see the Azure AD integration script in
the AKS samples repo.
To use Azure AD users and groups to control access to cluster resources, see Control access to cluster resources
using Kubernetes role-based access control and Azure AD identities in AKS.
For more information about how to secure Kubernetes clusters, see Access and identity options for AKS.
For best practices on identity and resource control, see Best practices for authentication and authorization in
AKS.
Enable Group Managed Service Accounts (GMSA)
for your Windows Server nodes on your Azure
Kubernetes Service (AKS) cluster
6/15/2022 • 8 minutes to read • Edit Online
Group Managed Service Accounts (GMSA) is a managed domain account for multiple servers that provides
automatic password management, simplified service principal name (SPN) management and the ability to
delegate the management to other administrators. AKS provides the ability to enable GMSA on your Windows
Server nodes, which allows containers running on Windows Server nodes to integrate with and be managed by
GMSA.
Prerequisites
Enabling GMSA with Windows Server nodes on AKS requires:
Kubernetes 1.19 or greater.
Azure CLI version 2.35.0 or greater
Managed identities with your AKS cluster.
Permissions to create or update an Azure Key Vault.
Permissions to configure GMSA on Active Directory Domain Service or on-prem Active Directory.
The domain controller must have Active Directory Web Services enabled and must be reachable on port
9389 by the AKS cluster.
IMPORTANT
You must use either Active Directory Domain Service or on-prem Active Directory. At this time, you can't use Azure Active
Directory to configure GMSA with an AKS cluster.
NOTE
Use the Fully Qualified Domain Name for the Domain rather than the Partially Qualified Domain Name that may be used
on internal networks.
The above command escapes the value parameter for running the Azure CLI on a Linux shell. When running the Azure
CLI command on Windows PowerShell, you don't need to escape characters in the value parameter.
You can grant your kubelet identity access to your key vault before or after you create your cluster. The following
example uses az identity list to get the ID of the identity, sets it to MANAGED_ID, and then uses
az keyvault set-policy to grant the identity access to the MyAKSGMSAVault key vault.
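A minimal sketch of that flow (the identity name filter and vault name are assumptions based on this article's examples):
MANAGED_ID=$(az identity list --query "[?contains(name, 'MyAKS-agentpool')].principalId" -o tsv)
az keyvault set-policy --name MyAKSGMSAVault --object-id $MANAGED_ID --secret-permissions get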
echo "Please enter the username to use as administrator credentials for Windows Server nodes on your
cluster: " && read WINDOWS_USERNAME
Use az aks create to create an AKS cluster then az aks nodepool add to add a Windows Server node pool. The
following example creates a MyAKS cluster in the MyResourceGroup resource group, enables GMSA, and then
adds a new node pool named npwin.
NOTE
If you are using a custom vnet, you also need to specify the id of the vnet using vnet-subnet-id and may need to also add
docker-bridge-address, dns-service-ip, and service-cidr depending on your configuration.
If you created your own identity for the kubelet identity, use the assign-kubelet-identity parameter to specify your
identity.
az aks create \
--resource-group MyResourceGroup \
--name MyAKS \
--vm-set-type VirtualMachineScaleSets \
--network-plugin azure \
--load-balancer-sku standard \
--windows-admin-username $WINDOWS_USERNAME \
--enable-managed-identity \
--enable-windows-gmsa \
--gmsa-dns-server $DNS_SERVER \
--gmsa-root-domain-name $ROOT_DOMAIN_NAME
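A sketch of the node pool step that follows the create command:
az aks nodepool add \
    --resource-group MyResourceGroup \
    --cluster-name MyAKS \
    --os-type Windows \
    --name npwin \
    --node-count 1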
You can also enable GMSA on existing clusters that already have Windows Server nodes and managed identities
enabled using az aks update . For example:
az aks update \
--resource-group MyResourceGroup \
--name MyAKS \
--enable-windows-gmsa \
--gmsa-dns-server $DNS_SERVER \
--gmsa-root-domain-name $ROOT_DOMAIN_NAME
After creating your cluster or updating your cluster, use az keyvault set-policy to grant the identity access to
your key vault. The following example grants the kubelet identity created by the cluster access to the
MyAKSGMSAVault key vault.
NOTE
If you provided your own identity for the kubelet identity, skip this step.
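A sketch of that grant, reading the kubelet identity's object ID from the cluster:
MANAGED_ID=$(az aks show \
    --resource-group MyResourceGroup \
    --name MyAKS \
    --query "identityProfile.kubeletidentity.objectId" -o tsv)
az keyvault set-policy --name MyAKSGMSAVault --object-id $MANAGED_ID --secret-permissions get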
Create a gmsa-spec.yaml with the following, replacing the placeholders with your own values.
apiVersion: windows.k8s.io/v1alpha1
kind: GMSACredentialSpec
metadata:
  name: aks-gmsa-spec  # This name can be changed, but it will be used as a reference in the pod spec
credspec:
  ActiveDirectoryConfig:
    GroupManagedServiceAccounts:
    - Name: $GMSA_ACCOUNT_USERNAME
      Scope: $NETBIOS_DOMAIN_NAME
    - Name: $GMSA_ACCOUNT_USERNAME
      Scope: $DNS_DOMAIN_NAME
    HostAccountConfig:
      PluginGUID: '{CCC2A336-D7F3-4818-A213-272B7924213E}'
      PortableCcgVersion: "1"
      PluginInput: ObjectId=$MANAGED_ID;SecretUri=$SECRET_URI # SECRET_URI takes the form https://$akvName.vault.azure.net/secrets/$akvSecretName
  CmsPlugins:
  - ActiveDirectory
  DomainJoinConfig:
    DnsName: $DNS_DOMAIN_NAME
    DnsTreeName: $DNS_ROOT_DOMAIN_NAME
    Guid: $AD_DOMAIN_OBJECT_GUID
    MachineAccountName: $GMSA_ACCOUNT_USERNAME
    NetBiosName: $NETBIOS_DOMAIN_NAME
    Sid: $GMSA_SID
Use kubectl apply to apply the changes from gmsa-spec.yaml, gmsa-role.yaml, and gmsa-role-binding.yaml.
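A sketch of those apply commands (gmsa-role.yaml and gmsa-role-binding.yaml are the RBAC manifests referenced above; their contents aren't reproduced here):
kubectl apply -f gmsa-spec.yaml
kubectl apply -f gmsa-role.yaml
kubectl apply -f gmsa-role-binding.yaml
The sample application that follows can be saved, for example, as gmsa-demo.yaml and applied the same way to verify the configuration.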
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: gmsa-demo
  name: gmsa-demo
  namespace: default
data:
  run.ps1: |
    $ErrorActionPreference = "Stop"
    # Add required Windows features, since they are not installed by default.
    Install-WindowsFeature "Web-Windows-Auth", "Web-Asp-Net45"
    C:\ServiceMonitor.exe w3svc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: gmsa-demo
  name: gmsa-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gmsa-demo
  template:
    metadata:
      labels:
        app: gmsa-demo
    spec:
      securityContext:
        windowsOptions:
          gmsaCredentialSpecName: aks-gmsa-spec
      containers:
      - name: iis
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        imagePullPolicy: IfNotPresent
        command:
        - powershell
        args:
        - -File
        - /gmsa-demo/run.ps1
        volumeMounts:
        - name: gmsa-demo
          mountPath: /gmsa-demo
      volumes:
      - configMap:
          defaultMode: 420
          name: gmsa-demo
        name: gmsa-demo
      nodeSelector:
        kubernetes.io/os: windows
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: gmsa-demo
  name: gmsa-demo
  namespace: default
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: gmsa-demo
  type: LoadBalancer
Use kubectl get service to display the IP address of the example application.
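A sketch, assuming the service name from the sample manifest above:
kubectl get service gmsa-demo --watch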
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
To verify GMSA is working and configured correctly, open a web browser to the external IP address of gmsa-
demo service. Authenticate with $NETBIOS_DOMAIN_NAME\$AD_USERNAME and password and confirm you see
Authenticated as $NETBIOS_DOMAIN_NAME\$AD_USERNAME, Type of Authentication: Negotiate .
Troubleshooting
No authentication is prompted when loading the page
If the page loads but you are not prompted to authenticate, use kubectl logs POD_NAME to display the logs of
your pod and verify you see IIS with authentication is ready.
Connection timeout when trying to load the page
If you receive a connection timeout when trying to load the page, verify the sample app is running with
kubectl get pods --watch . Sometimes the external IP address for the sample app service is available before the
sample app pod is running.
Pod fails to start and a winapi error shows in the pod events
After running kubectl get pods --watch and waiting several minutes, if your pod does not start, run
kubectl describe pod POD_NAME . If you see a winapi error in the pod events, this is likely an error in your GMSA
cred spec configuration. Verify all the replacement values in gmsa-spec.yaml are correct, rerun
kubectl apply -f gmsa-spec.yaml , and redeploy the sample application.
Use Azure RBAC for Kubernetes Authorization
6/15/2022 • 6 minutes to read • Edit Online
Today you can already leverage integrated authentication between Azure Active Directory (Azure AD) and AKS.
When enabled, this integration allows customers to use Azure AD users, groups, or service principals as subjects
in Kubernetes RBAC, see more here. This feature frees you from having to separately manage user identities and
credentials for Kubernetes. However, you still have to set up and manage Azure RBAC and Kubernetes RBAC
separately. For more details on authentication and authorization with RBAC on AKS, see here.
This document covers a new approach that allows for the unified management and access control across Azure
Resources, AKS, and Kubernetes resources.
Create the AKS cluster with managed Azure AD integration and Azure RBAC for Kubernetes Authorization.
# Create an AKS-managed Azure AD cluster
az aks create -g MyResourceGroup -n MyManagedCluster --enable-aad --enable-azure-rbac
A successful creation of a cluster with Azure AD integration and Azure RBAC for Kubernetes Authorization has
the following section in the response body:
"AADProfile": {
"adminGroupObjectIds": null,
"clientAppId": null,
"enableAzureRbac": true,
"managed": true,
"serverAppId": null,
"serverAppSecret": null,
"tenantId": "****-****-****-****-****"
}
To add Azure RBAC for Kubernetes Authorization into an existing AKS cluster, use the az aks update command
with the --enable-azure-rbac flag.
To remove Azure RBAC for Kubernetes Authorization from an existing AKS cluster, use the az aks update
command with the --disable-azure-rbac flag.
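Minimal sketches of both operations:
az aks update -g MyResourceGroup -n MyManagedCluster --enable-azure-rbac
az aks update -g MyResourceGroup -n MyManagedCluster --disable-azure-rbac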
AKS built-in roles
Azure Kubernetes Service RBAC Reader: Allows read-only access to see most objects in a namespace. It doesn't allow viewing roles or role bindings. This role doesn't allow viewing Secrets, since reading the contents of Secrets enables access to ServiceAccount credentials in the namespace, which would allow API access as any ServiceAccount in the namespace (a form of privilege escalation).
Azure Kubernetes Service RBAC Writer: Allows read/write access to most objects in a namespace. This role doesn't allow viewing or modifying roles or role bindings. However, this role allows accessing Secrets and running Pods as any ServiceAccount in the namespace, so it can be used to gain the API access levels of any ServiceAccount in the namespace.
Azure Kubernetes Service RBAC Admin: Allows admin access, intended to be granted within a namespace. Allows read/write access to most resources in a namespace (or cluster scope), including the ability to create roles and role bindings within the namespace. This role doesn't allow write access to resource quota or to the namespace itself.
Azure Kubernetes Service RBAC Cluster Admin: Allows super-user access to perform any action on any resource. It gives full control over every resource in the cluster and in all namespaces.
Role assignments scoped to the entire AKS cluster can be done either on the Access Control (IAM) blade of
the cluster resource in the Azure portal or by using Azure CLI commands as shown below:
az role assignment create --role "Azure Kubernetes Service RBAC Admin" --assignee <AAD-ENTITY-ID> --scope
$AKS_ID
where <AAD-ENTITY-ID> could be a username (for example, [email protected]) or even the ClientID of a service
principal.
You can also create role assignments scoped to a specific namespace within the cluster:
az role assignment create --role "Azure Kubernetes Service RBAC Reader" --assignee <AAD-ENTITY-ID> --scope
$AKS_ID/namespaces/<namespace-name>
Today, role assignments scoped to namespaces need to be configured via Azure CLI.
Create custom roles definitions
Optionally you may choose to create your own role definition and then assign as above.
Below is an example of a role definition that allows a user to only read deployments and nothing else. You can
check the full list of possible actions here.
Copy the following JSON into a file called deploy-view.json .
{
  "Name": "AKS Deployment Reader",
  "Description": "Lets you view all deployments in cluster/namespace.",
  "Actions": [],
  "NotActions": [],
  "DataActions": [
    "Microsoft.ContainerService/managedClusters/apps/deployments/read"
  ],
  "NotDataActions": [],
  "assignableScopes": [
    "/subscriptions/<YOUR SUBSCRIPTION ID>"
  ]
}
Replace <YOUR SUBSCRIPTION ID> by the ID from your subscription, which you can get by running:
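For example (a sketch):
az account show --query id -o tsv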
Now we can create the role definition by running the below command from the folder where you saved
deploy-view.json :
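A sketch of that command:
az role definition create --role-definition @deploy-view.json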
Now that you have your role definition, you can assign it to a user or other identity by running:
az role assignment create --role "AKS Deployment Reader" --assignee <AAD-ENTITY-ID> --scope $AKS_ID
NOTE
Ensure you have the latest kubectl by running the below command:
az aks install-cli
Now that you have assigned your desired role and permissions, you can start calling the Kubernetes API, for
example, from kubectl.
For this purpose, let's first get the cluster's kubeconfig using the following command:
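A sketch, assuming the cluster created earlier in this article:
az aks get-credentials -g MyResourceGroup -n MyManagedCluster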
IMPORTANT
You'll need the Azure Kubernetes Service Cluster User built-in role to perform the step above.
Now, you can use kubectl to, for example, list the nodes in the cluster. The first time you run it you'll need to sign
in, and subsequent commands will use the respective access token.
kubectl get nodes
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
AAAAAAAAA to authenticate.
export KUBECONFIG=/path/to/kubeconfig
kubelogin convert-kubeconfig
The first time, you'll have to sign in interactively like with regular kubectl, but afterwards you'll no longer need
to, even for new Azure AD clusters (as long as your token is still valid).
Clean up
Clean up the role assignments
Copy the ID or IDs from all the assignments you did, and then delete them.
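A sketch of that cleanup, assuming $AKS_ID still holds the cluster resource ID:
az role assignment list --scope $AKS_ID --query "[].id" -o tsv
az role assignment delete --ids <assignment-id-1> <assignment-id-2>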
Next steps
Read more about AKS Authentication, Authorization, Kubernetes RBAC, and Azure RBAC here.
Read more about Azure RBAC here.
Read more about the all the actions you can use to granularly define custom Azure roles for Kubernetes
authorization here.
Control access to cluster resources using Kubernetes
role-based access control and Azure Active
Directory identities in Azure Kubernetes Service
6/15/2022 • 11 minutes to read • Edit Online
Azure Kubernetes Service (AKS) can be configured to use Azure Active Directory (AD) for user authentication. In
this configuration, you sign in to an AKS cluster using an Azure AD authentication token. Once authenticated,
you can use the built-in Kubernetes role-based access control (Kubernetes RBAC) to manage access to
namespaces and cluster resources based on a user's identity or group membership.
This article shows you how to control access using Kubernetes RBAC in an AKS cluster based on Azure AD group
membership. Example groups and users are created in Azure AD, then Roles and RoleBindings are created in the
AKS cluster to grant the appropriate permissions to create and view resources.
Create the first example group in Azure AD for the application developers using the az ad group create
command. The following example creates a group named appdev:
APPDEV_ID=$(az ad group create --display-name appdev --mail-nickname appdev --query objectId -o tsv)
Now, create an Azure role assignment for the appdev group using the az role assignment create command. This
assignment lets any member of the group use kubectl to interact with an AKS cluster by granting them the
Azure Kubernetes Service Cluster User Role.
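A minimal sketch, assuming the cluster names used later in this article:
AKS_ID=$(az aks show \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --query id -o tsv)
az role assignment create \
    --assignee $APPDEV_ID \
    --role "Azure Kubernetes Service Cluster User Role" \
    --scope $AKS_ID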
TIP
If you receive an error such as
Principal 35bfec9328bd4d8d9b54dea6dac57b82 does not exist in the directory a5443dcd-cd0e-494d-a387-
3039b419f0d5.
, wait a few seconds for the Azure AD group object ID to propagate through the directory then try the
az role assignment create command again.
Create a second example group, this one for SREs named opssre:
OPSSRE_ID=$(az ad group create --display-name opssre --mail-nickname opssre --query objectId -o tsv)
Again, create an Azure role assignment to grant members of the group the Azure Kubernetes Service Cluster
User Role:
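A sketch, reusing the $AKS_ID value from the previous assignment:
az role assignment create \
    --assignee $OPSSRE_ID \
    --role "Azure Kubernetes Service Cluster User Role" \
    --scope $AKS_ID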
echo "Please enter the UPN for application developers: " && read AAD_DEV_UPN
The following command prompts you for the password and sets it to AAD_DEV_PW for use in a later command.
echo "Please enter the secure password for application developers: " && read AAD_DEV_PW
Create the first user account in Azure AD using the az ad user create command.
The following example creates a user with the display name AKS Dev and the UPN and secure password using
the values in AAD_DEV_UPN and AAD_DEV_PW:
AKSDEV_ID=$(az ad user create \
--display-name "AKS Dev" \
--user-principal-name $AAD_DEV_UPN \
--password $AAD_DEV_PW \
--query objectId -o tsv)
Now add the user to the appdev group created in the previous section using the az ad group member add
command:
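A sketch of that command:
az ad group member add --group appdev --member-id $AKSDEV_ID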
Set the UPN and password for SREs. The following command prompts you for the UPN and sets it to
AAD_SRE_UPN for use in a later command (remember that the commands in this article are entered into a
BASH shell). The UPN must include the verified domain name of your tenant, for example [email protected] .
echo "Please enter the UPN for SREs: " && read AAD_SRE_UPN
The following command prompts you for the password and sets it to AAD_SRE_PW for use in a later command.
echo "Please enter the secure password for SREs: " && read AAD_SRE_PW
Create a second user account. The following example creates a user with the display name AKS SRE and the UPN
and secure password using the values in AAD_SRE_UPN and AAD_SRE_PW:
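A sketch that mirrors the first user account and then adds the user to the opssre group:
AKSSRE_ID=$(az ad user create \
    --display-name "AKS SRE" \
    --user-principal-name $AAD_SRE_UPN \
    --password $AAD_SRE_PW \
    --query objectId -o tsv)
az ad group member add --group opssre --member-id $AKSSRE_ID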
Create a namespace in the AKS cluster using the kubectl create namespace command. The following example
creates a namespace named dev:
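A sketch (run this with cluster admin credentials):
kubectl create namespace dev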
In Kubernetes, Roles define the permissions to grant, and RoleBindings apply them to desired users or groups.
These assignments can be applied to a given namespace, or across the entire cluster. For more information, see
Using Kubernetes RBAC authorization.
First, create a Role for the dev namespace. This role grants full permissions to the namespace. In production
environments, you can specify more granular permissions for different users or groups.
Create a file named role-dev-namespace.yaml and paste the following YAML manifest:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dev-user-full-access
  namespace: dev
rules:
- apiGroups: ["", "extensions", "apps"]
  resources: ["*"]
  verbs: ["*"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["*"]
Create the Role using the kubectl apply command and specify the filename of your YAML manifest:
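For example (a sketch):
kubectl apply -f role-dev-namespace.yaml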
Next, get the resource ID for the appdev group using the az ad group show command. This group is set as the
subject of a RoleBinding in the next step.
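A sketch of that lookup:
az ad group show --group appdev --query objectId -o tsv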
Now, create a RoleBinding for the appdev group to use the previously created Role for namespace access. Create
a file named rolebinding-dev-namespace.yaml and paste the following YAML manifest. On the last line, replace
groupObjectId with the group object ID output from the previous command:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dev-user-access
  namespace: dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dev-user-full-access
subjects:
- kind: Group
  namespace: dev
  name: groupObjectId
TIP
If you want to create the RoleBinding for a single user, specify kind: User and replace groupObjectId with the user principal
name (UPN) in the above sample.
Create the RoleBinding using the kubectl apply command and specify the filename of your YAML manifest:
kubectl apply -f rolebinding-dev-namespace.yaml
Create a file named role-sre-namespace.yaml and paste the following YAML manifest:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: sre-user-full-access
  namespace: sre
rules:
- apiGroups: ["", "extensions", "apps"]
  resources: ["*"]
  verbs: ["*"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["*"]
Create the Role using the kubectl apply command and specify the filename of your YAML manifest:
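For example (a sketch):
kubectl apply -f role-sre-namespace.yaml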
Get the resource ID for the opssre group using the az ad group show command:
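A sketch of that lookup:
az ad group show --group opssre --query objectId -o tsv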
Create a RoleBinding for the opssre group to use the previously created Role for namespace access. Create a file
named rolebinding-sre-namespace.yaml and paste the following YAML manifest. On the last line, replace
groupObjectId with the group object ID output from the previous command:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: sre-user-access
  namespace: sre
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: sre-user-full-access
subjects:
- kind: Group
  namespace: sre
  name: groupObjectId
Create the RoleBinding using the kubectl apply command and specify the filename of your YAML manifest:
kubectl apply -f rolebinding-sre-namespace.yaml
Schedule a basic NGINX pod using the kubectl run command in the dev namespace:
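A minimal sketch, first pulling user (non-admin) credentials and then scheduling the pod (the container image shown is a generic placeholder):
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing
kubectl run nginx-dev --image=nginx --namespace dev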
At the sign-in prompt, enter the credentials for your own [email protected] account created at the start of the
article. Once you're successfully signed in, the account token is cached for future kubectl commands. The
NGINX pod is successfully scheduled, as shown in the following example output:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
B24ZD6FP8 to authenticate.
pod/nginx-dev created
Now use the kubectl get pods command to view pods in the dev namespace.
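For example (a sketch):
kubectl get pods --namespace dev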
As shown in the following example output, the NGINX pod is successfully Running:
Now try to view pods outside of the dev namespace, such as across all namespaces. The user's group membership
does not have a Kubernetes Role that allows this action, as shown in the following example output:
$ kubectl get pods --all-namespaces
Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot list resource "pods" in
API group "" at the cluster scope
In the same way, try to schedule a pod in different namespace, such as the sre namespace. The user's group
membership does not align with a Kubernetes Role and RoleBinding to grant these permissions, as shown in the
following example output:
Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot create resource "pods" in
API group "" in the namespace "sre"
Try to schedule and view pods in the assigned sre namespace. When prompted, sign in with your own
[email protected] credentials created at the start of the article:
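A sketch of those commands (again with a placeholder image):
kubectl run nginx-sre --image=nginx --namespace sre
kubectl get pods --namespace sre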
As shown in the following example output, you can successfully create and view the pods:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
BM4RHP3FD to authenticate.
pod/nginx-sre created
These kubectl commands fail, as shown in the following example output. The user's group membership and
Kubernetes Role and RoleBindings don't grant permissions to create or manage resources in other namespaces:
$ kubectl get pods --all-namespaces
Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot list pods at the cluster
scope
Clean up resources
In this article, you created resources in the AKS cluster and users and groups in Azure AD. To clean up all these
resources, run the following commands:
# Get the admin kubeconfig context to delete the necessary cluster resources
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --admin
# Delete the dev and sre namespaces. This also deletes the pods, Roles, and RoleBindings
kubectl delete namespace dev
kubectl delete namespace sre
# Delete the Azure AD groups for appdev and opssre. This also deletes the Azure role assignments.
az ad group delete --group appdev
az ad group delete --group opssre
Next steps
For more information about how to secure Kubernetes clusters, see Access and identity options for AKS.
For best practices on identity and resource control, see Best practices for authentication and authorization in
AKS.
Custom certificate authority (CA) in Azure
Kubernetes Service (AKS) (preview)
6/15/2022 • 2 minutes to read • Edit Online
Custom certificate authorities (CAs) allow you to establish trust between your Azure Kubernetes Service (AKS)
cluster and your workloads, such as private registries, proxies, and firewalls. A Kubernetes secret is used to store
the certificate authority's information, then it's passed to all nodes in the cluster.
This feature is applied per nodepool, so new and existing nodepools must be configured to enable this feature.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
A base64 encoded certificate string.
Limitations
This feature isn't currently supported for Windows nodepools.
Install the aks-preview extension
You also need the aks-preview Azure CLI extensions version 0.5.72 or later. Install the aks-preview extension by
using the az extension add command, or install any available updates by using the az extension update
command.
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
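If the extension isn't installed yet, it can be added first (a sketch):
az extension add --name aks-preview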
It takes a few minutes for the status to show Registered. Verify the registration status by using the az feature list
command:
az feature list --query "[?contains(name, 'Microsoft.ContainerService/CustomCATrustPreview')].
{Name:name,State:properties.state}" -o table
Refresh the registration of the Microsoft.ContainerService resource provider by using the az provider register
command:
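A sketch of that refresh:
az provider register --namespace Microsoft.ContainerService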
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 2 \
--enable-custom-ca-trust
To update or remove a CA, edit and apply the YAML manifest. The cluster will poll for changes and update the
nodes accordingly. This process may take a couple of minutes before changes are applied.
Next steps
For more information on AKS security best practices, see Best practices for cluster security and upgrades in
Azure Kubernetes Service (AKS).
Certificate rotation in Azure Kubernetes Service
(AKS)
6/15/2022 • 4 minutes to read • Edit Online
Azure Kubernetes Service (AKS) uses certificates for authentication with many of its components. If you have an
RBAC-enabled cluster built after March 2022, it's enabled with certificate auto-rotation. Periodically, you may
need to rotate those certificates for security or policy reasons. For example, you may have a policy to rotate all
your certificates every 90 days.
NOTE
Certificate auto-rotation will not be enabled by default for non-RBAC enabled AKS clusters.
This article shows you how certificate rotation works in your AKS cluster.
NOTE
AKS clusters created prior to May 2019 have certificates that expire after two years. Any cluster created after May 2019
or any cluster that has its certificates rotated have Cluster CA certificates that expire after 30 years. All other AKS
certificates, which use the Cluster CA for signing, will expire after two years and are automatically rotated during an AKS
version upgrade which happened after 8/1/2021. To verify when your cluster was created, use kubectl get nodes to
see the Age of your node pools.
Additionally, you can check the expiration date of your cluster's certificate. For example, the following bash command
displays the client certificate details for the myAKSCluster cluster in resource group rg
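One way to do this from the kubeconfig (a sketch; the user entry name depends on your resource group and cluster, and openssl must be available):
kubectl config view --raw \
    -o jsonpath='{.users[?(@.name == "clusterUser_rg_myAKSCluster")].user.client-certificate-data}' \
    | base64 -d | openssl x509 -noout -enddate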
Check expiration date of certificate on one virtual machine scale set agent node
NOTE
If you have an existing cluster you have to upgrade that cluster to enable Certificate Auto-Rotation.
For any AKS clusters created or upgraded after March 2022 Azure Kubernetes Service will automatically rotate
non-CA certificates on both the control plane and agent nodes within 80% of the client certificate valid time,
before they expire with no downtime for the cluster.
How to check whether TLS bootstrapping is enabled on the current agent node pool
To verify whether TLS bootstrapping is enabled on your cluster, browse to the following paths:
On a Linux node: /var/lib/kubelet/bootstrap-kubeconfig
On a Windows node: C:\k\bootstrap-config
To access agent nodes, see Connect to Azure Kubernetes Service cluster nodes for maintenance or
troubleshooting for more information.
NOTE
The file path may change as Kubernetes version evolves in the future.
Once a region is configured, create a new cluster or upgrade an existing cluster with az aks upgrade to set that
cluster for auto-certificate rotation. A control plane and node pool upgrade is needed to enable this feature.
Limitation
Auto certificate rotation won't be enabled on a non-RBAC cluster.
Use az aks get-credentials to sign in to your AKS cluster. This command also downloads and configures the
kubectl client certificate on your local machine.
Use az aks rotate-certs to rotate all certificates, CAs, and SAs on your cluster.
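For example, using placeholder resource group and cluster names:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

az aks rotate-certs --resource-group myResourceGroup --name myAKSCluster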
IMPORTANT
It may take up to 30 minutes for az aks rotate-certs to complete. If the command fails before completing, use
az aks show to verify the status of the cluster is Certificate Rotating. If the cluster is in a failed state, rerun
az aks rotate-certs to rotate your certificates again.
Verify that the old certificates are no longer valid by running a kubectl command. Since you have not updated
the certificates used by kubectl , you will see an error. For example:
Verify the certificates have been updated by running a kubectl command, which will now succeed. For
example:
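A minimal sketch of this verification flow, assuming the same placeholder names as above:

# Fails with a certificate error until new credentials are downloaded
kubectl get nodes

# Download new credentials, overwriting the old ones
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing

# Succeeds with the rotated certificates
kubectl get nodes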
NOTE
If you have any services that run on top of AKS, you may need to update certificates related to those services as well.
Next steps
This article showed you how to automatically rotate your cluster's certificates, CAs, and SAs. You can see Best
practices for cluster security and upgrades in Azure Kubernetes Service (AKS) for more information on AKS
security best practices.
Secure your cluster with Azure Policy
6/15/2022 • 4 minutes to read • Edit Online
To improve the security of your Azure Kubernetes Service (AKS) cluster, you can apply and enforce built-in
security policies on your cluster using Azure Policy. Azure Policy helps to enforce organizational standards and
to assess compliance at-scale. After installing the Azure Policy Add-on for AKS, you can apply individual policy
definitions or groups of policy definitions called initiatives (sometimes called policysets) to your cluster. See
Azure Policy built-in definitions for AKS for a complete list of AKS policy and initiative definitions.
This article shows you how to apply policy definitions to your cluster and verify those assignments are being
enforced.
Prerequisites
This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart
using the Azure CLI, using Azure PowerShell, or using the Azure portal.
The Azure Policy Add-on for AKS installed on an AKS cluster. Follow these steps to install the Azure Policy
Add-on.
Custom policies allow you to define rules for using Azure. For example, you can enforce:
Security practices
Cost management
Organization-specific rules (like naming or locations)
Before creating a custom policy, check the list of common patterns and samples to see if your case is already
covered.
Custom policy definitions are written in JSON. To learn more about creating a custom policy, see Azure Policy
definition structure and Create a custom policy definition.
NOTE
Azure Policy now utilizes a new property known as templateInfo that allows users to define the source type for the
constraint template. By defining templateInfo in policy definitions, users don’t have to define constraintTemplate or
constraint properties. Users still need to define apiGroups and kinds. For more information on this, see Understanding
Azure Policy effects.
Once your custom policy definition has been created, see Assign a policy definition for a step-by-step
walkthrough of assigning the policy to your Kubernetes cluster.
NOTE
Policy assignments can take up to 20 minutes to sync into each cluster.
Create the pod with the kubectl apply command and specify the name of your YAML manifest:
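For example, assuming the manifest from the preceding step was saved as nginx-privileged.yaml (the file name is a placeholder):

kubectl apply -f nginx-privileged.yaml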
As expected, the pod fails to be scheduled; the admission request is denied by the assigned policy initiative.
The pod doesn't reach the scheduling stage, so there are no resources to delete before you move on.
Test creation of an unprivileged pod
In the previous example, the container image automatically tried to use root to bind NGINX to port 80. This
request was denied by the policy initiative, so the pod fails to start. Let's try now running that same NGINX pod
without privileged access.
Create a file named nginx-unprivileged.yaml and paste the following YAML manifest:
apiVersion: v1
kind: Pod
metadata:
name: nginx-unprivileged
spec:
containers:
- name: nginx-unprivileged
image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
Create the pod using the kubectl apply command and specify the name of your YAML manifest:
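Using the file created above:

kubectl apply -f nginx-unprivileged.yaml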
The pod is successfully scheduled. When you check the status of the pod using the kubectl get pods command,
the pod is Running:
Next steps
For more information about how Azure Policy works:
Azure Policy Overview
Azure Policy initiatives and policies for AKS
Remove the Azure Policy Add-on.
Understand Azure Policy for Kubernetes clusters
6/15/2022 • 24 minutes to read • Edit Online
Azure Policy extends Gatekeeper v3, an admission controller webhook for Open Policy Agent (OPA), to apply at-
scale enforcements and safeguards on your clusters in a centralized, consistent manner. Azure Policy makes it
possible to manage and report on the compliance state of your Kubernetes clusters from one place. The add-on
enacts the following functions:
Checks with Azure Policy service for policy assignments to the cluster.
Deploys policy definitions into the cluster as constraint template and constraint custom resources.
Reports auditing and compliance details back to Azure Policy service.
Azure Policy for Kubernetes supports the following cluster environments:
Azure Kubernetes Service (AKS)
Azure Arc enabled Kubernetes
AKS Engine
IMPORTANT
The add-ons for AKS Engine and Arc enabled Kubernetes are in preview. Azure Policy for Kubernetes only supports Linux
node pools and built-in policy definitions (custom policy definitions are a public preview feature). Built-in policy definitions
are in the Kubernetes category. The limited preview policy definitions with the EnforceOPAConstraint and
EnforceRegoPolicy effects and the related Kubernetes Service category are deprecated. Instead, use the effects audit
and deny with Resource Provider mode Microsoft.Kubernetes.Data.
Overview
To enable and use Azure Policy with your Kubernetes cluster, take the following actions:
1. Configure your Kubernetes cluster and install the add-on:
Azure Kubernetes Service (AKS)
Azure Arc enabled Kubernetes
AKS Engine
NOTE
For common issues with installation, see Troubleshoot - Azure Policy Add-on.
Limitations
The following general limitations apply to the Azure Policy Add-on for Kubernetes clusters:
Azure Policy Add-on for Kubernetes is supported on Kubernetes version 1.14 or higher.
Azure Policy Add-on for Kubernetes can only be deployed to Linux node pools.
Only built-in policy definitions are supported. Custom policy definitions are a public preview feature.
Maximum number of pods supported by the Azure Policy Add-on: 10,000
Maximum number of Non-compliant records per policy per cluster: 500
Maximum number of Non-compliant records per subscription: 1 million
Installations of Gatekeeper outside of the Azure Policy Add-on aren't supported. Uninstall any components
installed by a previous Gatekeeper installation before enabling the Azure Policy Add-on.
Reasons for non-compliance aren't available for the Microsoft.Kubernetes.Data Resource Provider mode. Use
Component details.
Component-level exemptions aren't supported for Resource Provider modes.
The following limitations apply only to the Azure Policy Add-on for AKS:
AKS Pod security policy and the Azure Policy Add-on for AKS can't both be enabled. For more information,
see AKS pod security limitation.
Namespaces automatically excluded by Azure Policy Add-on for evaluation: kube-system, gatekeeper-system,
and aks-periscope.
Recommendations
The following are general recommendations for using the Azure Policy Add-on:
The Azure Policy Add-on requires three Gatekeeper components to run: One audit pod and two webhook
pod replicas. These components consume more resources as the count of Kubernetes resources and
policy assignments increases in the cluster, which requires audit and enforcement operations.
For fewer than 500 pods in a single cluster with a max of 20 constraints: two vCPUs and 350 MB
memory per component.
For more than 500 pods in a single cluster with a max of 40 constraints: three vCPUs and 600 MB
memory per component.
Windows pods don't support security contexts. Thus, some of the Azure Policy definitions, such as
disallowing root privileges, can't be enforced in Windows pods and only apply to Linux pods.
The following recommendation applies only to AKS and the Azure Policy Add-on:
Use system node pool with CriticalAddonsOnly taint to schedule Gatekeeper pods. For more information,
see Using system node pools.
Secure outbound traffic from your AKS clusters. For more information, see Control egress traffic for cluster
nodes.
If the cluster has aad-pod-identity enabled, Node Managed Identity (NMI) pods modify the nodes' iptables
to intercept calls to the Azure Instance Metadata endpoint. This configuration means any request made to the
Metadata endpoint is intercepted by NMI even if the pod doesn't use aad-pod-identity .
AzurePodIdentityException CRD can be configured to inform aad-pod-identity that any requests to the
Metadata endpoint originating from a pod that matches labels defined in CRD should be proxied without any
processing in NMI. The system pods with kubernetes.azure.com/managedby: aks label in kube-system
namespace should be excluded in aad-pod-identity by configuring the AzurePodIdentityException CRD. For
more information, see Disable aad-pod-identity for a specific pod or application. To configure an exception,
install the mic-exception YAML.
3. If limited preview policy definitions were installed, remove the add-on with the Disable button on your
AKS cluster under the Policies page.
4. The AKS cluster must be version 1.14 or higher. Use the following script to validate your AKS cluster
version (a sketch is shown after this list):
5. Install version 2.12.0 or higher of the Azure CLI. For more information, see Install the Azure CLI.
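A minimal sketch of the version check mentioned in step 4; the resource group and cluster names are placeholders:

az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion -o tsv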
Once the above prerequisite steps are completed, install the Azure Policy Add-on in the AKS cluster you want to
manage.
Azure portal
1. Launch the AKS service in the Azure portal by selecting All services, then searching for and selecting Kubernetes services.
2. Select one of your AKS clusters.
3. Select Policies on the left side of the Kubernetes service page.
4. In the main page, select the Enable add-on button.
Azure CLI
To validate that the add-on installation was successful and that the azure-policy and gatekeeper pods are
running, run the following command:
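A sketch of this validation; the az aks show query for the add-on profile key is an assumption, and it should return output similar to the JSON shown below:

# azure-policy pod is installed in kube-system namespace
kubectl get pods -n kube-system

# gatekeeper pod is installed in gatekeeper-system namespace
kubectl get pods -n gatekeeper-system

# Confirm the add-on is enabled on the cluster
az aks show --resource-group myResourceGroup --name myAKSCluster --query addonProfiles.azurepolicy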
{
"config": null,
"enabled": true,
"identity": null
}
Note: Azure Policy for Arc extension is supported on the following Kubernetes distributions.
2. Ensure you have met all the common prerequisites for Kubernetes extensions listed here including
connecting your cluster to Azure Arc.
Note: Azure Policy extension is supported for Arc enabled Kubernetes clusters in these regions.
3. Open ports for the Azure Policy extension. The Azure Policy extension uses these domains and ports to
fetch policy definitions and assignments and report compliance of the cluster back to Azure Policy.
DOMAIN PORT
data.policy.core.windows.net 443
store.policy.core.windows.net 443
login.windows.net 443
dc.services.visualstudio.com 443
4. Before installing the Azure Policy extension or enabling any of the service features, your subscription
must enable the Microsoft.PolicyInsights resource provider.
Note: To enable the resource provider, follow the steps in Resource providers and types or run either
the Azure CLI or Azure PowerShell command:
Azure CLI
Azure PowerShell
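For example, with the Azure CLI:

az provider register --namespace 'Microsoft.PolicyInsights'

Or with Azure PowerShell:

Register-AzResourceProvider -ProviderNamespace 'Microsoft.PolicyInsights'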
To create an extension instance for your Arc enabled cluster, run the following command, substituting <> with
your values:
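A sketch of the create command; the placeholder values are yours to substitute, and the extension type and name match the example output below:

az k8s-extension create --cluster-type connectedClusters \
    --cluster-name <CLUSTER_NAME> --resource-group <RESOURCE_GROUP> \
    --extension-type Microsoft.PolicyInsights --name azurepolicy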
Example:
Example Output:
{
"aksAssignedIdentity": null,
"autoUpgradeMinorVersion": true,
"configurationProtectedSettings": {},
"configurationSettings": {},
"customLocationSettings": null,
"errorInfo": null,
"extensionType": "microsoft.policyinsights",
"id": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/my-test-
rg/providers/Microsoft.Kubernetes/connectedClusters/my-test-
cluster/providers/Microsoft.KubernetesConfiguration/extensions/azurepolicy",
"identity": {
"principalId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"tenantId": null,
"type": "SystemAssigned"
},
"location": null,
"name": "azurepolicy",
"packageUri": null,
"provisioningState": "Succeeded",
"releaseTrain": "Stable",
"resourceGroup": "my-test-rg",
"scope": {
"cluster": {
"releaseNamespace": "kube-system"
},
"namespace": null
},
"statuses": [],
"systemData": {
"createdAt": "2021-10-27T01:20:06.834236+00:00",
"createdBy": null,
"createdByType": null,
"lastModifiedAt": "2021-10-27T01:20:06.834236+00:00",
"lastModifiedBy": null,
"lastModifiedByType": null
},
"type": "Microsoft.KubernetesConfiguration/extensions",
"version": "1.1.0"
}
Example:
To validate that the extension installation was successful and that the azure-policy and gatekeeper pods are
running, run the following command:
# azure-policy pod is installed in kube-system namespace
kubectl get pods -n kube-system
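The gatekeeper pods can be checked the same way:

# gatekeeper pod is installed in gatekeeper-system namespace
kubectl get pods -n gatekeeper-system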
Install Azure Policy Add-on Using Helm for Azure Arc enabled
Kubernetes (preview)
NOTE
Azure Policy Add-on Helm model will soon begin deprecation. Please opt for the Azure Policy Extension for Azure Arc
enabled Kubernetes instead.
Before installing the Azure Policy Add-on or enabling any of the service features, your subscription must enable
the Microsoft.PolicyInsights resource provider and create a role assignment for the cluster service principal.
1. You need the Azure CLI version 2.12.0 or later installed and configured. Run az --version to find the
version. If you need to install or upgrade, see Install the Azure CLI.
2. To enable the resource provider, follow the steps in Resource providers and types or run either the Azure
CLI or Azure PowerShell command:
Azure CLI
Azure PowerShell
data.policy.core.windows.net 443
store.policy.core.windows.net 443
login.windows.net 443
dc.services.visualstudio.com 443
8. Assign 'Policy Insights Data Writer (Preview)' role assignment to the Azure Arc enabled Kubernetes
cluster. Replace <subscriptionId> with your subscription ID, <rg> with the Azure Arc enabled Kubernetes
cluster's resource group, and <clusterName> with the name of the Azure Arc enabled Kubernetes cluster.
Keep track of the returned values for appId, password, and tenant for the installation steps.
Azure CLI
Azure PowerShell
@{ appId=$sp.ApplicationId;
   password=[System.Runtime.InteropServices.Marshal]::PtrToStringAuto([System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($sp.Secret));
   tenant=(Get-AzContext).Tenant.Id } | ConvertTo-Json
{
"appId": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
"password": "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb",
"tenant": "cccccccc-cccc-cccc-cccc-cccccccccccc"
}
Once the above prerequisite steps are completed, install the Azure Policy Add-on in your Azure Arc enabled
Kubernetes cluster:
1. Add the Azure Policy Add-on repo to Helm:
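A sketch of the Helm commands; the repository URL is a placeholder, and the published URL is listed with the Helm Chart definition referenced below:

helm repo add azure-policy <azure-policy-add-on-chart-repo-URL>
helm repo update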
For more information about what the add-on Helm Chart installs, see the Azure Policy Add-on Helm
Chart definition on GitHub.
To validate that the add-on installation was successful and that the azure-policy and gatekeeper pods are
running, run the following command:
Azure PowerShell
Assign 'Policy Insights Data Writer (Preview)' role assignment to the cluster service principal app
ID (value aadClientID from previous step) with Azure CLI. Replace <subscriptionId> with your
subscription ID and <aks engine cluster resource group> with the resource group the AKS Engine
self-managed Kubernetes cluster is in.
az role assignment create --assignee <cluster service principal app ID> \
    --scope "/subscriptions/<subscriptionId>/resourceGroups/<aks engine cluster resource group>" \
    --role "Policy Insights Data Writer (Preview)"
Once the above prerequisite steps are completed, install the Azure Policy Add-on. The installation can be during
the creation or update cycle of an AKS Engine or as an independent action on an existing cluster.
Install during creation or update cycle
To enable the Azure Policy Add-on during the creation of a new self-managed cluster or as an update to
an existing cluster, include the addons property cluster definition for AKS Engine.
"addons": [{
"name": "azure-policy",
"enabled": true
}]
For more information, see the external guide AKS Engine cluster definition.
Install in existing cluster with Helm Charts
Use the following steps to prepare the cluster and install the add-on:
1. Install Helm 3.
2. Add the Azure Policy repo to Helm.
For more information about what the add-on Helm Chart installs, see the Azure Policy Add-on
Helm Chart definition on GitHub.
NOTE
Because of the relationship between Azure Policy Add-on and the resource group ID, Azure Policy
supports only one AKS Engine cluster for each resource group.
To validate that the add-on installation was successful and that the azure-policy and gatekeeper pods are
running, run the following command:
Policy language
The Azure Policy language structure for managing Kubernetes follows that of existing policy definitions. With a
Resource Provider mode of Microsoft.Kubernetes.Data , the effects audit and deny are used to manage your
Kubernetes clusters. Audit and deny must provide details properties specific to working with OPA Constraint
Framework and Gatekeeper v3.
As part of the details.templateInfo, details.constraint, or details.constraintTemplate properties in the policy
definition, Azure Policy passes the URI or Base64Encoded value of these CustomResourceDefinitions (CRD) to
the add-on. Rego is the language that OPA and Gatekeeper support to validate a request to the Kubernetes
cluster. By supporting an existing standard for Kubernetes management, Azure Policy makes it possible to reuse
existing rules and pair them with Azure Policy for a unified cloud compliance reporting experience. For more
information, see What is Rego?.
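As an illustration only (not a complete built-in definition), the then portion of such a policy rule might look like the following; the constraint template URL is a placeholder:

"then": {
    "effect": "[parameters('effect')]",
    "details": {
        "templateInfo": {
            "sourceType": "PublicURL",
            "url": "<URL-of-constraint-template-YAML>"
        },
        "apiGroups": [""],
        "kinds": ["Pod"]
    }
}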
NOTE
Custom policy definitions is a public preview feature.
Find the built-in policy definitions for managing your cluster using the Azure portal with the following steps. If
using a custom policy definition, search for it by name or the category that you created it with.
1. Start the Azure Policy service in the Azure portal. Select All services in the left pane and then search for
and select Policy .
2. In the left pane of the Azure Policy page, select Definitions .
3. From the Category dropdown list box, use Select all to clear the filter and then select Kubernetes .
4. Select the policy definition, then select the Assign button.
5. Set the Scope to the management group, subscription, or resource group of the Kubernetes cluster
where the policy assignment will apply.
NOTE
When assigning the Azure Policy for Kubernetes definition, the Scope must include the cluster resource. For an
AKS Engine cluster, the Scope must be the resource group of the cluster.
6. Give the policy assignment a Name and Description that you can use to identify it easily.
7. Set the Policy enforcement to one of the values below.
Enabled - Enforce the policy on the cluster. Kubernetes admission requests with violations are
denied.
Disabled - Don't enforce the policy on the cluster. Kubernetes admission requests with violations
aren't denied. Compliance assessment results are still available. When rolling out new policy
definitions to running clusters, Disabled option is helpful for testing the policy definition as
admission requests with violations aren't denied.
8. Select Next .
9. Set parameter values
To exclude Kubernetes namespaces from policy evaluation, specify the list of namespaces in parameter
Namespace exclusions . It's recommended to exclude: kube-system, gatekeeper-system, and azure-
arc.
10. Select Review + create .
Alternately, use the Assign a policy - Portal quickstart to find and assign a Kubernetes policy. Search for a
Kubernetes policy definition instead of the sample 'audit vms'.
IMPORTANT
Built-in policy definitions are available for Kubernetes clusters in category Kubernetes . For a list of built-in policy
definitions, see Kubernetes samples.
Policy evaluation
The add-on checks in with Azure Policy service for changes in policy assignments every 15 minutes. During this
refresh cycle, the add-on checks for changes. These changes trigger creates, updates, or deletes of the constraint
templates and constraints.
In a Kubernetes cluster, if a namespace has the cluster-appropriate label, the admission requests with violations
aren't denied. Compliance assessment results are still available.
Azure Arc-enabled Kubernetes cluster: admission.policy.azure.com/ignore
Azure Kubernetes Service cluster: control-plane
NOTE
While a cluster admin may have permission to create and update constraint template and constraint resources installed by
the Azure Policy Add-on, these aren't supported scenarios, as manual updates are overwritten. Gatekeeper continues to
evaluate policies that existed prior to installing the add-on and assigning Azure Policy policy definitions.
Every 15 minutes, the add-on calls for a full scan of the cluster. After gathering details of the full scan and any
real-time evaluations by Gatekeeper of attempted changes to the cluster, the add-on reports the results back to
Azure Policy for inclusion in compliance details like any Azure Policy assignment. Only results for active policy
assignments are returned during the audit cycle. Audit results can also be seen as violations listed in the status
field of the failed constraint. For details on Non-compliant resources, see Component details for Resource
Provider modes.
NOTE
Each compliance report in Azure Policy for your Kubernetes clusters includes all violations within the last 45 minutes. The
timestamp indicates when a violation occurred.
NOTE
Init containers may be included during policy evaluation. To see if init containers are included, review the CRD for the
following or a similar declaration:
input_containers[c] {
c := input.review.object.spec.initContainers[_]
}
Logging
As a Kubernetes controller/container, both the azure-policy and gatekeeper pods keep logs in the Kubernetes
cluster. The logs can be exposed in the Insights page of the Kubernetes cluster. For more information, see
Monitor your Kubernetes cluster performance with Azure Monitor for containers.
To view the add-on logs, use kubectl :
# Get the azure-policy pod name installed in kube-system namespace
kubectl logs <azure-policy pod name> -n kube-system
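To find the pod names, list the pods in each namespace first; the gatekeeper logs can be viewed the same way:

kubectl get pods -n kube-system
kubectl get pods -n gatekeeper-system
kubectl logs <gatekeeper pod name> -n gatekeeper-system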
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
annotations:
azure-policy-definition-id:
/subscriptions/<SUBID>/providers/Microsoft.Authorization/policyDefinitions/<GUID>
constraint-template-installed-by: azure-policy-addon
constraint-template: <URL-OF-YAML>
creationTimestamp: "2021-09-01T13:20:55Z"
generation: 1
managedFields:
- apiVersion: templates.gatekeeper.sh/v1beta1
fieldsType: FieldsV1
...
<SUBID> is the subscription ID and <GUID> is the ID of the mapped policy definition. <URL-OF-YAML> is the
source location of the constraint template that the add-on downloaded to install on the cluster.
View constraints related to a constraint template
Once you have the names of the add-on downloaded constraint templates, you can use the name to see the
related constraints. Use kubectl get <constraintTemplateName> to get the list. Constraints installed by the add-on
start with azurepolicy- .
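For example, using the constraint template name that appears in the constraint shown in the next section (a sketch):

kubectl get k8sazurecontainerallowedimages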
View constraint details
The constraint has details about violations and mappings to the policy definition and assignment. To see the
details, use kubectl get <CONSTRAINT-TEMPLATE> <CONSTRAINT> -o yaml . The results look similar to the following
output:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAzureContainerAllowedImages
metadata:
annotations:
azure-policy-assignment-id: /subscriptions/<SUB-ID>/resourceGroups/<RG-
NAME>/providers/Microsoft.Authorization/policyAssignments/<ASSIGNMENT-GUID>
azure-policy-definition-id: /providers/Microsoft.Authorization/policyDefinitions/<DEFINITION-GUID>
azure-policy-definition-reference-id: ""
azure-policy-setdefinition-id: ""
constraint-installed-by: azure-policy-addon
constraint-url: <URL-OF-YAML>
creationTimestamp: "2021-09-01T13:20:55Z"
spec:
enforcementAction: deny
match:
excludedNamespaces:
- kube-system
- gatekeeper-system
- azure-arc
parameters:
imageRegex: ^.+azurecr.io/.+$
status:
auditTimestamp: "2021-09-01T13:48:16Z"
totalViolations: 32
violations:
- enforcementAction: deny
kind: Pod
message: Container image nginx for container hello-world has not been allowed.
name: hello-world-78f7bfd5b8-lmc5b
namespace: default
- enforcementAction: deny
kind: Pod
message: Container image nginx for container hello-world has not been allowed.
name: hellow-world-89f8bfd6b9-zkggg
"addons": [{
"name": "azure-policy",
"enabled": false
}]
For more information, see AKS Engine - Disable Azure Policy Add-on.
If installed with Helm Charts, run the following Helm command:
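A minimal sketch, assuming the chart was installed with the release name azure-policy-addon:

helm uninstall azure-policy-addon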
Next steps
Review examples at Azure Policy samples.
Review the Policy definition structure.
Review Understanding policy effects.
Understand how to programmatically create policies.
Learn how to get compliance data.
Learn how to remediate non-compliant resources.
Review what a management group is with Organize your resources with Azure management groups.
Bring your own keys (BYOK) with Azure disks in
Azure Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
Azure Storage encrypts all data in a storage account at rest. By default, data is encrypted with Microsoft-
managed keys. For additional control over encryption keys, you can supply customer-managed keys to use for
encryption at rest for both the OS and data disks for your AKS clusters. Learn more about customer-managed
keys on Linux and Windows.
Limitations
Data disk encryption support is limited to AKS clusters running Kubernetes version 1.17 and above.
Encryption of OS disk with customer-managed keys can only be enabled when creating an AKS cluster.
Prerequisites
You must enable soft delete and purge protection for Azure Key Vault when using Key Vault to encrypt
managed disks.
You need the Azure CLI version 2.11.1 or later.
# Optionally retrieve Azure region short names for use on upcoming commands
az account list-locations
# Create a DiskEncryptionSet
az disk-encryption-set create -n myDiskEncryptionSetName -l myAzureRegionName -g myResourceGroup \
    --source-vault $keyVaultId --key-url $keyVaultKeyUrl
IMPORTANT
Ensure you create a new resource group for your AKS cluster.
When new node pools are added to the cluster created above, the customer-managed key provided during the
create is used to encrypt the OS disk.
Create a file called byok-azure-disk.yaml that contains the following information. Replace
myAzureSubscriptionId, myResourceGroup, and myDiskEncryptionSetName with your values, and apply the
yaml. Make sure to use the resource group where your DiskEncryptionSet is deployed. If you use the Azure
Cloud Shell, this file can be created using vi or nano as if working on a virtual or physical system:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: byok
provisioner: disk.csi.azure.com # replace with "kubernetes.io/azure-disk" if aks version is less than 1.21
parameters:
skuname: StandardSSD_LRS
kind: managed
diskEncryptionSetID: "/subscriptions/{myAzureSubscriptionId}/resourceGroups/{myResourceGroup}/providers/Microsoft.Compute/diskEncryptionSets/{myDiskEncryptionSetName}"
# Get credentials
az aks get-credentials --name myAksCluster --resource-group myResourceGroup --output table
# Create the storage class in the cluster
kubectl apply -f byok-azure-disk.yaml
With host-based encryption, the data stored on the VM host of your AKS agent nodes' VMs is encrypted at rest
and flows encrypted to the Storage service. This means the temp disks are encrypted at rest with platform-
managed keys. The cache of OS and data disks is encrypted at rest with either platform-managed keys or
customer-managed keys depending on the encryption type set on those disks.
By default, when using AKS, OS and data disks use server-side encryption with platform-managed keys. The
caches for these disks are also encrypted at rest with platform-managed keys. You can specify your own
managed keys following Bring your own keys (BYOK) with Azure disks in Azure Kubernetes Service. The cache
for these disks will then also be encrypted using the key that you specify in this step.
Host-based encryption is different from server-side encryption (SSE), which is used by Azure Storage. Azure-
managed disks use Azure Storage to automatically encrypt data at rest when saving data. Host-based encryption
uses the host of the VM to handle encryption before the data flows through Azure Storage.
NOTE
Host-based encryption is available in Azure regions that support server side encryption of Azure managed disks and only
with specific supported VM sizes.
Prerequisites
Ensure you have the CLI extension v2.23 or higher version installed.
Ensure you have the EncryptionAtHost feature flag under Microsoft.Compute enabled.
Register EncryptionAtHost feature
To create an AKS cluster that uses host-based encryption, you must enable the EncryptionAtHost feature flags
on your subscription.
Register the EncryptionAtHost feature flag using the az feature register command as shown in the following
example:
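For example:

az feature register --namespace "Microsoft.Compute" --name "EncryptionAtHost"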
It takes a few minutes for the status to show Registered. You can check on the registration status using the az
feature list command:
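For example:

az feature list -o table --query "[?contains(name, 'Microsoft.Compute/EncryptionAtHost')].{Name:name,State:properties.state}"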
When ready, refresh the registration of the Microsoft.Compute resource providers using the az provider register
command:
az provider register --namespace Microsoft.Compute
Limitations
Can only be enabled on new node pools.
Can only be enabled in Azure regions that support server-side encryption of Azure managed disks and only
with specific supported VM sizes.
Requires an AKS cluster and node pool based on Virtual Machine Scale Sets(VMSS) as VM set type.
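For example, a sketch of creating a cluster and adding a node pool with host-based encryption enabled; the names and VM size are placeholders:

az aks create --name myAKSCluster --resource-group myResourceGroup \
    --node-vm-size Standard_DS2_v2 --enable-encryption-at-host

az aks nodepool add --name hostencrypt --cluster-name myAKSCluster \
    --resource-group myResourceGroup --node-vm-size Standard_DS2_v2 --enable-encryption-at-host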
If you want to create clusters without host-based encryption, you can do so by omitting the
--enable-encryption-at-host parameter.
If you want to create new node pools without the host-based encryption feature, you can do so by omitting the
--enable-encryption-at-host parameter.
Next steps
Review best practices for AKS cluster security.
Read more about host-based encryption.
Use Azure Active Directory pod-managed identities
in Azure Kubernetes Service (Preview)
6/15/2022 • 8 minutes to read • Edit Online
Azure Active Directory (Azure AD) pod-managed identities use Kubernetes primitives to associate managed
identities for Azure resources and identities in Azure AD with pods. Administrators create identities and bindings
as Kubernetes primitives that allow pods to access Azure resources that rely on Azure AD as an identity provider.
NOTE
The feature described in this document, pod-managed identities (preview), will be replaced with Azure AD Workload
Identity .
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Create an AKS cluster with Azure CNI and pod-managed identity enabled. The following commands use az
group create to create a resource group named myResourceGroup and the az aks create command to create an
AKS cluster named myAKSCluster in the myResourceGroup resource group.
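A sketch of these commands; the location is a placeholder, and --network-plugin azure reflects the Azure CNI requirement described above:

az group create --name myResourceGroup --location eastus

az aks create --resource-group myResourceGroup --name myAKSCluster \
    --network-plugin azure --enable-pod-identity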
Use az aks get-credentials to sign in to your AKS cluster. This command also downloads and configures the
kubectl client certificate on your development computer.
Mitigation
To mitigate the vulnerability at the cluster level, you can use the Azure built-in policy "Kubernetes cluster
containers should only use allowed capabilities" to limit the CAP_NET_RAW attack.
Add NET_RAW to "Required drop capabilities"
If you are not using Azure Policy, you can use the Open Policy Agent admission controller together with the
Gatekeeper validating webhook. Provided you have Gatekeeper already installed in your cluster, add the ConstraintTemplate
of type K8sPSPCapabilities:
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/pod-security-policy/capabilities/template.yaml
Then add a constraint to limit the spawning of Pods with the NET_RAW capability:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPCapabilities
metadata:
name: prevent-net-raw
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
excludedNamespaces:
- "kube-system"
parameters:
requiredDropCapabilities: ["NET_RAW"]
Create an identity
IMPORTANT
You must have the relevant permissions (for example, Owner) on your subscription to create the identity.
Create an identity which will be used by the demo pod with az identity create and set the IDENTITY_CLIENT_ID
and IDENTITY_RESOURCE_ID variables.
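A sketch, using placeholder names for the identity and its resource group:

az identity create --resource-group myResourceGroup --name application-identity

export IDENTITY_CLIENT_ID="$(az identity show --resource-group myResourceGroup --name application-identity --query clientId -o tsv)"
export IDENTITY_RESOURCE_ID="$(az identity show --resource-group myResourceGroup --name application-identity --query id -o tsv)"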
export POD_IDENTITY_NAME="my-pod-identity"
export POD_IDENTITY_NAMESPACE="my-app"
az aks pod-identity add --resource-group myResourceGroup --cluster-name myAKSCluster \
    --namespace ${POD_IDENTITY_NAMESPACE} --name ${POD_IDENTITY_NAME} --identity-resource-id ${IDENTITY_RESOURCE_ID}
NOTE
The "POD_IDENTITY_NAME" has to be a valid DNS subdomain name as defined in RFC 1123.
NOTE
When you assign the pod identity by using pod-identity add , the Azure CLI attempts to grant the Managed Identity
Operator role over the pod identity (IDENTITY_RESOURCE_ID) to the cluster identity.
Azure will create an AzureIdentity resource in your cluster representing the identity in Azure, and an
AzureIdentityBinding resource which connects the AzureIdentity to a selector. You can view these resources with kubectl.
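For example:

kubectl get azureidentity -n ${POD_IDENTITY_NAMESPACE}
kubectl get azureidentitybinding -n ${POD_IDENTITY_NAMESPACE}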
NOTE
In the previous steps, you created the POD_IDENTITY_NAME, IDENTITY_CLIENT_ID, and IDENTITY_RESOURCE_GROUP
variables. You can use a command such as echo to display the value you set for variables, for example
echo $POD_IDENTITY_NAME .
apiVersion: v1
kind: Pod
metadata:
name: demo
labels:
aadpodidbinding: $POD_IDENTITY_NAME
spec:
containers:
- name: demo
image: mcr.microsoft.com/oss/azure/aad-pod-identity/demo:v1.6.3
args:
- --subscriptionid=$SUBSCRIPTION_ID
- --clientid=$IDENTITY_CLIENT_ID
- --resourcegroup=$IDENTITY_RESOURCE_GROUP
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: MY_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
nodeSelector:
kubernetes.io/os: linux
Notice the pod definition has an aadpodidbinding label with a value that matches the name of the pod identity
you created with az aks pod-identity add in the previous step.
Deploy demo.yaml to the same namespace as your pod identity using kubectl apply :
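For example:

kubectl apply -f demo.yaml --namespace ${POD_IDENTITY_NAMESPACE}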
Verify that the logs show a token is successfully acquired and the GET operation is successful.
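For example, using the pod name demo from the manifest above:

kubectl logs demo --namespace ${POD_IDENTITY_NAMESPACE}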
...
successfully doARMOperations vm count 0
successfully acquired a token using the MSI,
msiEndpoint(http://169.254.169.254/metadata/identity/oauth2/token)
successfully acquired a token, userAssignedID MSI,
msiEndpoint(http://169.254.169.254/metadata/identity/oauth2/token) clientID(xxxxxxxx-xxxx-xxxx-xxxx-
xxxxxxxxxxxx)
successfully made GET on instance metadata
...
Then set the aadpodidbinding field in your pod YAML to the binding selector you specified.
apiVersion: v1
kind: Pod
metadata:
name: demo
labels:
aadpodidbinding: myMultiIdentitySelector
...
Clean up
To remove an Azure AD pod-managed identity from your cluster, remove the sample application and the pod
identity from the cluster. Then remove the identity.
Next steps
For more information on managed identities, see Managed identities for Azure resources.
Secure traffic between pods using network policies
in Azure Kubernetes Service (AKS)
6/15/2022 • 14 minutes to read • Edit Online
When you run modern, microservices-based applications in Kubernetes, you often want to control which
components can communicate with each other. The principle of least privilege should be applied to how traffic
can flow between pods in an Azure Kubernetes Service (AKS) cluster. For example, you likely want to block traffic
directly to back-end applications. The Network Policy feature in Kubernetes lets you define rules for ingress and
egress traffic between pods in a cluster.
This article shows you how to install the network policy engine and create Kubernetes network policies to
control the flow of traffic between pods in AKS. Network policy should only be used for Linux-based nodes and
pods in AKS.
Supported networking options
  Azure: Azure CNI
  Calico: Azure CNI (Windows Server 2019 and Linux) and kubenet (Linux)
Compliance with Kubernetes specification
  Azure: All policy types supported
  Calico: All policy types supported
Support
  Azure: Supported by Azure support and Engineering team
  Calico: Calico community support. For more information on additional paid support, see Project Calico support options.
Logging
  Azure: Rules added / deleted in IPTables are logged on every host under /var/log/azure-npm.log
  Calico: For more information, see Calico component logs
IMPORTANT
The network policy feature can only be enabled when the cluster is created. You can't enable network policy on an existing
AKS cluster.
To use Azure Network Policy, you must use the Azure CNI plug-in and define your own virtual network and
subnets. For more detailed information on how to plan out the required subnet ranges, see configure advanced
networking. Calico Network Policy could be used with either this same Azure CNI plug-in or with the Kubenet
CNI plug-in.
The following example script:
Creates a virtual network and subnet.
Creates an Azure Active Directory (Azure AD) service principal for use with the AKS cluster.
Assigns Contributor permissions for the AKS cluster service principal on the virtual network.
Creates an AKS cluster in the defined virtual network and enables network policy.
The Azure Network policy option is used. To use Calico as the network policy option instead, use the
--network-policy calico parameter. Note: Calico could be used with either --network-plugin azure or
--network-plugin kubenet .
Note that instead of using a service principal, you can use a managed identity for permissions. For more
information, see Use managed identities.
Provide your own secure SP_PASSWORD. You can replace the RESOURCE_GROUP_NAME and CLUSTER_NAME
variables:
RESOURCE_GROUP_NAME=myResourceGroup-NP
CLUSTER_NAME=myAKSCluster
LOCATION=canadaeast
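The resource group, virtual network, and service principal creation steps are not shown above; a minimal sketch under assumed names and address ranges might look like the following (capture the service principal's appId and password from its output):

# Create a resource group
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION

# Create a virtual network and subnet
az network vnet create \
    --resource-group $RESOURCE_GROUP_NAME \
    --name myVnet \
    --address-prefixes 10.0.0.0/8 \
    --subnet-name myAKSSubnet \
    --subnet-prefixes 10.240.0.0/16

# Create a service principal; note the appId and password in its output
az ad sp create-for-rbac --output json

# Set these from the service principal output above (placeholder values)
SP_ID=<appId-from-output>
SP_PASSWORD=<password-from-output>

# Get the virtual network and subnet resource IDs
VNET_ID=$(az network vnet show --resource-group $RESOURCE_GROUP_NAME --name myVnet --query id -o tsv)
SUBNET_ID=$(az network vnet subnet show --resource-group $RESOURCE_GROUP_NAME --vnet-name myVnet --name myAKSSubnet --query id -o tsv)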
# Assign the service principal Contributor permissions to the virtual network resource
az role assignment create --assignee $SP_ID --scope $VNET_ID --role Contributor
az aks create \
--resource-group $RESOURCE_GROUP_NAME \
--name $CLUSTER_NAME \
--node-count 1 \
--generate-ssh-keys \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--docker-bridge-address 172.17.0.1/16 \
--vnet-subnet-id $SUBNET_ID \
--service-principal $SP_ID \
--client-secret $SP_PASSWORD \
--network-plugin azure \
--network-policy azure
It takes a few minutes to create the cluster. When the cluster is ready, configure kubectl to connect to your
Kubernetes cluster by using the az aks get-credentials command. This command downloads credentials and
configures the Kubernetes CLI to use them:
You can check on the registration status using the az feature list command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider using the az provider
register command:
IMPORTANT
At this time, using Calico network policies with Windows nodes is available on new clusters using Kubernetes version 1.20
or later with Calico 3.17.2 and requires using Azure CNI networking. Windows nodes on AKS clusters with Calico enabled
also have Direct Server Return (DSR) enabled by default.
For clusters with only Linux node pools running Kubernetes 1.20 with earlier versions of Calico, the Calico version will
automatically be upgraded to 3.17.2.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Create a username to use as administrator credentials for your Windows Server containers on your cluster. The
following commands prompt you for a username and set it to WINDOWS_USERNAME for use in a later command
(remember that the commands in this article are entered into a BASH shell).
echo "Please enter the username to use as administrator credentials for Windows Server containers on your
cluster: " && read WINDOWS_USERNAME
az aks create \
--resource-group $RESOURCE_GROUP_NAME \
--name $CLUSTER_NAME \
--node-count 1 \
--generate-ssh-keys \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--docker-bridge-address 172.17.0.1/16 \
--vnet-subnet-id $SUBNET_ID \
--service-principal $SP_ID \
--client-secret $SP_PASSWORD \
--windows-admin-username $WINDOWS_USERNAME \
--vm-set-type VirtualMachineScaleSets \
--kubernetes-version 1.20.2 \
--network-plugin azure \
--network-policy calico
It takes a few minutes to create the cluster. By default, your cluster is created with only a Linux node pool. If you
would like to use Windows node pools, you can add one. For example:
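For example; the node pool name and count are placeholders:

az aks nodepool add \
    --resource-group $RESOURCE_GROUP_NAME \
    --cluster-name $CLUSTER_NAME \
    --os-type Windows \
    --name npwin \
    --node-count 1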
When the cluster is ready, configure kubectl to connect to your Kubernetes cluster by using the az aks get-
credentials command. This command downloads credentials and configures the Kubernetes CLI to use them:
Create an example back-end pod that runs NGINX. This back-end pod can be used to simulate a sample back-
end web-based application. Create this pod in the development namespace, and open port 80 to serve web
traffic. Label the pod with app=webapp,role=backend so that we can target it with a network policy in the next
section:
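A sketch of one way to do this, assuming the development namespace already exists; the --expose flag also creates a ClusterIP service named backend:

kubectl run backend --image=mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine \
    --labels app=webapp,role=backend --namespace development --expose --port 80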
Create another pod and attach a terminal session to test that you can successfully reach the default NGINX
webpage:
kubectl run --rm -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0 network-policy --namespace development
Install wget :
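One way to install it inside the test pod, assuming the Debian-based mcr.microsoft.com/dotnet/runtime-deps:6.0 image used above:

apt-get update && apt-get install -y wget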
At the shell prompt, use wget to confirm that you can access the default NGINX webpage:
The following sample output shows that the default NGINX webpage returned:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
[...]
Exit out of the attached terminal session. The test pod is automatically deleted.
exit
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: backend-policy
namespace: development
spec:
podSelector:
matchLabels:
app: webapp
role: backend
ingress: []
Install wget :
At the shell prompt, use wget to see if you can access the default NGINX webpage. This time, set a timeout
value to 2 seconds. The network policy now blocks all inbound traffic, so the page can't be loaded, as shown in
the following example:
Exit out of the attached terminal session. The test pod is automatically deleted.
exit
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: backend-policy
namespace: development
spec:
podSelector:
matchLabels:
app: webapp
role: backend
ingress:
- from:
- namespaceSelector: {}
podSelector:
matchLabels:
app: webapp
role: frontend
NOTE
This network policy uses a namespaceSelector and a podSelector element for the ingress rule. The YAML syntax is
important for the ingress rules to be additive. In this example, both elements must match for the ingress rule to be
applied. Kubernetes versions prior to 1.12 might not interpret these elements correctly and restrict the network traffic as
you expect. For more about this behavior, see Behavior of to and from selectors.
Apply the updated network policy by using the kubectl apply command and specify the name of your YAML
manifest:
Install wget :
At the shell prompt, use wget to see if you can access the default NGINX webpage:
Because the ingress rule allows traffic with pods that have the labels app: webapp,role: frontend, the traffic from
the front-end pod is allowed. The following example output shows the default NGINX webpage returned:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
[...]
Exit out of the attached terminal session. The pod is automatically deleted.
exit
Install wget :
At the shell prompt, use wget to see if you can access the default NGINX webpage. The network policy blocks
the inbound traffic, so the page can't be loaded, as shown in the following example:
Exit out of the attached terminal session. The test pod is automatically deleted.
exit
Schedule a test pod in the production namespace that is labeled as app=webapp,role=frontend. Attach a
terminal session:
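For example, mirroring the command used later for the development namespace:

kubectl run --rm -it frontend --image=mcr.microsoft.com/dotnet/runtime-deps:6.0 --labels app=webapp,role=frontend --namespace production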
Install wget :
At the shell prompt, use wget to confirm that you can access the default NGINX webpage:
Because the labels for the pod match what is currently permitted in the network policy, the traffic is allowed. The
network policy doesn't look at the namespaces, only the pod labels. The following example output shows the
default NGINX webpage returned:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
[...]
Exit out of the attached terminal session. The test pod is automatically deleted.
exit
In more complex examples, you could define multiple ingress rules, like a namespaceSelector and then a
podSelector.
Apply the updated network policy by using the kubectl apply command and specify the name of your YAML
manifest:
Install wget :
At the shell prompt, use wget to see that the network policy now denies traffic:
exit
With traffic denied from the production namespace, schedule a test pod back in the development namespace
and attach a terminal session:
kubectl run --rm -it frontend --image=mcr.microsoft.com/dotnet/runtime-deps:6.0 --labels app=webapp,role=frontend --namespace development
Install wget :
At the shell prompt, use wget to see that the network policy allows the traffic:
Traffic is allowed because the pod is scheduled in the namespace that matches what's permitted in the network
policy. The following sample output shows the default NGINX webpage returned:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
[...]
Exit out of the attached terminal session. The test pod is automatically deleted.
exit
Clean up resources
In this article, we created two namespaces and applied a network policy. To clean up these resources, use the
kubectl delete command and specify the resource names:
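For example:

kubectl delete namespace production
kubectl delete namespace development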
Next steps
For more about network resources, see Network concepts for applications in Azure Kubernetes Service (AKS).
To learn more about policies, see Kubernetes network policies.
Preview - Secure your cluster using pod security
policies in Azure Kubernetes Service (AKS)
6/15/2022 • 14 minutes to read • Edit Online
WARNING
The feature described in this document, pod security policy (preview), will begin deprecation with
Kubernetes version 1.21, with its removal in version 1.25. You can now Migrate Pod Security Policy to Pod
Security Admission Controller ahead of the deprecation.
After pod security policy (preview) is deprecated, you must have already migrated to Pod Security Admission controller or
disabled the feature on any existing clusters using the deprecated feature to perform future cluster upgrades and stay
within Azure support.
To improve the security of your AKS cluster, you can limit what pods can be scheduled. Pods that request
resources you don't allow can't run in the AKS cluster. You define this access using pod security policies. This
article shows you how to use pod security policies to limit the deployment of pods in AKS.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
It takes a few minutes for the status to show Registered. You can check on the registration status using the az
feature list command:
When ready, refresh the registration of the Microsoft.ContainerService resource provider using the az provider
register command:
Installation
  Pod security policy: Enable pod security policy feature
  Azure Policy: Enable Azure Policy Add-on
Deploy policies
  Pod security policy: Deploy pod security policy resource
  Azure Policy: Assign Azure policies to the subscription or resource group scope. The Azure Policy Add-on is required for Kubernetes resource applications.
Default policies
  Pod security policy: When pod security policy is enabled in AKS, default Privileged and Unrestricted policies are applied.
  Azure Policy: No default policies are applied by enabling the Azure Policy Add-on. You must explicitly enable policies in Azure Policy.
Who can create and assign policies
  Pod security policy: Cluster admin creates a pod security policy resource
  Azure Policy: Users must have a minimum role of 'owner' or 'Resource Policy Contributor' permissions on the AKS cluster resource group. Through API, users can assign policies at the AKS cluster resource scope; the user should have a minimum of 'owner' or 'Resource Policy Contributor' permissions on the AKS cluster resource. In the Azure portal, policies can be assigned at the management group/subscription/resource group level.
Authorizing policies
  Pod security policy: Users and Service Accounts require explicit permissions to use pod security policies.
  Azure Policy: No additional assignment is required to authorize policies. Once policies are assigned in Azure, all cluster users can use these policies.
Policy applicability
  Pod security policy: The admin user bypasses the enforcement of pod security policies.
  Azure Policy: All users (admin & non-admin) see the same policies. There is no special casing based on users. Policy application can be excluded at the namespace level.
Policy scope
  Pod security policy: Pod security policies are not namespaced.
  Azure Policy: Constraint templates used by Azure Policy are not namespaced.
Deny/Audit/Mutation action
  Pod security policy: Pod security policies support only deny actions. Mutation can be done with default values on create requests. Validation can be done during update requests.
  Azure Policy: Azure Policy supports both audit & deny actions. Mutation is not supported yet, but planned.
Pod security policy compliance
  Pod security policy: There is no visibility on compliance of pods that existed before enabling pod security policy. Non-compliant pods created after enabling pod security policies are denied.
  Azure Policy: Non-compliant pods that existed before applying Azure policies would show up in policy violations. Non-compliant pods created after enabling Azure policies are denied if policies are set with a deny effect.
How to view policies on the cluster
  Pod security policy: kubectl get psp
  Azure Policy: kubectl get constrainttemplate (all policies are returned)
Pod security policy standard - Privileged
  Pod security policy: A privileged pod security policy resource is created by default when enabling the feature.
  Azure Policy: Privileged mode implies no restriction; as a result, it is equivalent to not having any Azure Policy assignment.
Pod security policy standard - Baseline/default
  Pod security policy: User installs a pod security policy baseline resource.
  Azure Policy: Azure Policy provides a built-in baseline initiative which maps to the baseline pod security policy.
Pod security policy standard - Restricted
  Pod security policy: User installs a pod security policy restricted resource.
  Azure Policy: Azure Policy provides a built-in restricted initiative which maps to the restricted pod security policy.
NOTE
For real-world use, don't enable the pod security policy until you have defined your own custom policies. In this article,
you enable pod security policy as the first step to see how the default policies limit pod deployments.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-pod-security-policy
The privileged pod security policy is applied to any authenticated user in the AKS cluster. This assignment is
controlled by ClusterRoles and ClusterRoleBindings. Use the kubectl get rolebindings command and search for
the default:privileged: binding in the kube-system namespace:
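A sketch of the command, using the binding name from the text above:

kubectl get rolebindings default:privileged -n kube-system -o yaml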
As shown in the following condensed output, the psp:privileged ClusterRole is assigned to any
system:authenticated users. This ability provides a basic level of privilege without your own policies being
defined.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
[...]
name: default:privileged
[...]
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: psp:privileged
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:masters
It's important to understand how these default policies interact with user requests to schedule pods before you
start to create your own pod security policies. In the next few sections, let's schedule some pods to see these
default policies in action.
Next, create a RoleBinding for the nonadmin-user to perform basic actions in the namespace using the kubectl
create rolebinding command:
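A sketch; the binding name and namespace are placeholders:

kubectl create rolebinding nonadmin-user-binding --clusterrole=edit --user=nonadmin-user --namespace <namespace>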
apiVersion: v1
kind: Pod
metadata:
name: nginx-privileged
spec:
containers:
- name: nginx-privileged
image: mcr.microsoft.com/oss/nginx/nginx:1.14.2-alpine
securityContext:
privileged: true
Create the pod using the kubectl apply command and specify the name of your YAML manifest:
Error from server (Forbidden): error when creating "nginx-privileged.yaml": pods "nginx-privileged" is
forbidden: unable to validate against any pod security policy: []
The pod doesn't reach the scheduling stage, so there are no resources to delete before you move on.
apiVersion: v1
kind: Pod
metadata:
name: nginx-unprivileged
spec:
containers:
- name: nginx-unprivileged
image: mcr.microsoft.com/oss/nginx/nginx:1.14.2-alpine
Create the pod using the kubectl apply command and specify the name of your YAML manifest:
Error from server (Forbidden): error when creating "nginx-unprivileged.yaml": pods "nginx-unprivileged" is
forbidden: unable to validate against any pod security policy: []
The pod doesn't reach the scheduling stage, so there are no resources to delete before you move on.
apiVersion: v1
kind: Pod
metadata:
name: nginx-unprivileged-nonroot
spec:
containers:
- name: nginx-unprivileged
image: mcr.microsoft.com/oss/nginx/nginx:1.14.2-alpine
securityContext:
runAsUser: 2000
Create the pod using the kubectl apply command and specify the name of your YAML manifest:
Error from server (Forbidden): error when creating "nginx-unprivileged-nonroot.yaml": pods "nginx-
unprivileged-nonroot" is forbidden: unable to validate against any pod security policy: []
The pod doesn't reach the scheduling stage, so there are no resources to delete before you move on.
Create the policy using the kubectl apply command and specify the name of your YAML manifest:
To view the policies available, use the kubectl get psp command, as shown in the following example. Compare
the psp-deny-privileged policy with the default privilege policy that was enforced in the previous examples to
create a pod. Only the use of PRIV escalation is denied by your policy. There are no restrictions on the user or
group for the psp-deny-privileged policy.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: psp-deny-privileged-clusterrole
rules:
- apiGroups:
- extensions
resources:
- podsecuritypolicies
resourceNames:
- psp-deny-privileged
verbs:
- use
Create the ClusterRole using the kubectl apply command and specify the name of your YAML manifest:
Now create a ClusterRoleBinding to use the ClusterRole created in the previous step. Create a file named
psp-deny-privileged-clusterrolebinding.yaml and paste the following YAML manifest:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: psp-deny-privileged-clusterrolebinding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: psp-deny-privileged-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts
Create a ClusterRoleBinding using the kubectl apply command and specify the name of your YAML manifest:
NOTE
In the first step of this article, the pod security policy feature was enabled on the AKS cluster. The recommended practice
was to only enable the pod security policy feature after you've defined your own policies. This is the stage where you
would enable the pod security policy feature. One or more custom policies have been defined, and user accounts have
been associated with those policies. Now you can safely enable the pod security policy feature and minimize problems
caused by the default policies.
The pod is successfully scheduled. When you check the status of the pod using the kubectl get pods command,
the pod is Running:
This example shows how you can create custom pod security policies to define access to the AKS cluster for different users or groups. The default AKS policies provide tight controls on what pods can run, so create your own custom policies to correctly define the restrictions you need.
Delete the NGINX unprivileged pod using the kubectl delete command and specify the name of your YAML
manifest:
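For example:
kubectl delete -f nginx-unprivileged.yaml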
Clean up resources
To disable pod security policy, use the az aks update command again. The following example disables pod security policy on the cluster named myAKSCluster in the resource group named myResourceGroup:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--disable-pod-security-policy
Delete the security policy using the kubectl delete command and specify the name of your YAML manifest:
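For example, assuming the policy manifest file name used earlier:
kubectl delete -f psp-deny-privileged.yaml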
Next steps
This article showed you how to create a pod security policy to prevent the use of privileged access. There are many features that a policy can enforce, such as the type of volume or the RunAs user. For more information on the available options, see the Kubernetes pod security policy reference docs.
For more information about limiting pod network traffic, see Secure traffic between pods using network policies
in AKS.
Use the Azure Key Vault Provider for Secrets Store
CSI Driver in an AKS cluster
6/15/2022 • 9 minutes to read • Edit Online
The Azure Key Vault Provider for Secrets Store CSI Driver allows for the integration of an Azure key vault as a
secrets store with an Azure Kubernetes Service (AKS) cluster via a CSI volume.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
Before you start, ensure that your version of the Azure CLI is 2.30.0 or later. If it's an earlier version, install the
latest version.
Supported Kubernetes versions
The minimum recommended Kubernetes version is based on the rolling Kubernetes version support window.
Ensure that you're running version N-2 or later.
Features
Mounts secrets, keys, and certificates to a pod by using a CSI volume
Supports CSI inline volumes
Supports mounting multiple secrets store objects as a single volume
Supports pod portability with the SecretProviderClass CRD
Supports Windows containers
Syncs with Kubernetes secrets
Supports auto rotation of mounted contents and synced Kubernetes secrets
Create an AKS cluster with Azure Key Vault Provider for Secrets Store
CSI Driver support
First, create an Azure resource group:
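For example (the location shown is an assumption; pick a region that suits you):
az group create --name myResourceGroup --location eastus2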
To create an AKS cluster with Azure Key Vault Provider for Secrets Store CSI Driver capability, use the az aks
create command with the azure-keyvault-secrets-provider add-on.
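A hedged example of that command, reusing the resource group created above; additional flags may be needed for your environment:
az aks create --name myAKSCluster --resource-group myResourceGroup --enable-addons azure-keyvault-secrets-provider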
A user-assigned managed identity, named azurekeyvaultsecretsprovider-* , is created by the add-on for the
purpose of accessing Azure resources. The following example uses this identity to connect to the Azure key vault
where the secrets will be stored, but you can also use other identity access methods. Take note of the identity's
clientId in the output:
...,
"addonProfiles": {
"azureKeyvaultSecretsProvider": {
...,
"identity": {
"clientId": "<client-id>",
...
}
}
Upgrade an existing AKS cluster with Azure Key Vault Provider for
Secrets Store CSI Driver support
To upgrade an existing AKS cluster with Azure Key Vault Provider for Secrets Store CSI Driver capability, use the
az aks enable-addons command with the azure-keyvault-secrets-provider add-on:
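For example:
az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup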
As mentioned in the preceding section, the add-on creates a user-assigned managed identity that you can use to
authenticate to your Azure key vault.
Verify the Azure Key Vault Provider for Secrets Store CSI Driver
installation
The preceding command installs the Secrets Store CSI Driver and the Azure Key Vault Provider on your nodes.
Verify that the installation is finished by listing all pods that have the secrets-store-csi-driver and
secrets-store-provider-azure labels in the kube-system namespace, and ensure that your output looks similar
to the output shown here:
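The listing command isn't shown in the extracted text; it would look something like the following, assuming the pods expose those values under the app label:
kubectl get pods -n kube-system -l 'app in (secrets-store-csi-driver, secrets-store-provider-azure)'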
Be sure that a Secrets Store CSI Driver pod and an Azure Key Vault Provider pod are running on each node in
your cluster's node pools.
Your Azure key vault can store keys, secrets, and certificates. In this example, you'll set a plain-text secret called
ExampleSecret :
az keyvault secret set --vault-name <keyvault-name> -n ExampleSecret --value MyAKSExampleSecret
Take note of the following properties for use in the next section:
The name of the secret object in the key vault
The object type (secret, key, or certificate)
The name of your Azure key vault resource
The Azure tenant ID that the subscription belongs to
To disable the Azure Key Vault Provider for Secrets Store CSI Driver capability in an existing cluster, use the az
aks disable-addons command with the azure-keyvault-secrets-provider flag:
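For example:
az aks disable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup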
NOTE
If the add-on is disabled, existing workloads will have no issues and will not see any updates in the mounted secrets. If the pod restarts or a new pod is created as part of a scale-up event, the pod will fail to start because the driver is no longer running.
NOTE
When the Azure Key Vault Provider for Secrets Store CSI Driver is enabled, it updates the pod mount and the Kubernetes
secret that's defined in the secretObjects field of SecretProviderClass . It does so by polling for changes periodically,
based on the rotation poll interval you've defined. The default rotation poll interval is 2 minutes.
NOTE
When a secret is updated in an external secrets store after initial pod deployment, the Kubernetes Secret and the pod
mount will be periodically updated depending on how the application consumes the secret data.
Mount the Kubernetes Secret as a volume: Use the auto rotation and Sync K8s secrets features of Secrets Store CSI Driver. The application will need to watch for changes from the mounted Kubernetes Secret volume. When the Kubernetes Secret is updated by the CSI Driver, the corresponding volume contents are automatically updated.
Application reads the data from the container's filesystem: Use the rotation feature of Secrets Store CSI Driver. The application will need to watch for the file change from the volume mounted by the CSI driver.
Use the Kubernetes Secret for an environment variable: Restart the pod to get the latest secret as an environment variable. Use a tool such as Reloader to watch for changes on the synced Kubernetes Secret and perform rolling upgrades on pods.
To enable autorotation of secrets, use the enable-secret-rotation flag when you create your cluster:
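A hedged example of a creation command with rotation enabled (names are placeholders):
az aks create --name myAKSCluster --resource-group myResourceGroup --enable-addons azure-keyvault-secrets-provider --enable-secret-rotation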
NOTE
The example here is incomplete. You'll need to modify it to support your chosen method of access to your key vault
identity.
The secrets will sync only after you start a pod to mount them. Relying solely on syncing with the Kubernetes secrets feature doesn't work: when all the pods that consume the secret are deleted, the Kubernetes secret is also deleted.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: azure-sync
spec:
provider: azure
secretObjects: # [OPTIONAL] SecretObjects defines the desired state of synced Kubernetes secret objects
- data:
- key: username # data field to populate
objectName: foo1 # name of the mounted content to sync; this could be the object name or the object alias
secretName: foosecret # name of the Kubernetes secret object
type: Opaque # type of Kubernetes secret object (for example, Opaque, kubernetes.io/tls)
NOTE
Make sure that the objectName in the secretObjects field matches the file name of the mounted content. If you use
objectAlias instead, it should match the object alias.
kind: Pod
apiVersion: v1
metadata:
name: busybox-secrets-store-inline
spec:
containers:
- name: busybox
image: k8s.gcr.io/e2e-test-images/busybox:1.29-1
command:
- "/bin/sleep"
- "10000"
volumeMounts:
- name: secrets-store01-inline
mountPath: "/mnt/secrets-store"
readOnly: true
env:
- name: SECRET_USERNAME
valueFrom:
secretKeyRef:
name: foosecret
key: username
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: "azure-sync"
Metrics
The Azure Key Vault Provider
Metrics are served via Prometheus from port 8898, but this port isn't exposed outside the pod by default.
Access the metrics over localhost by using kubectl port-forward :
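A hedged example of forwarding the port from one provider pod and scraping it locally (the pod name is a placeholder to fill in from the first command):
kubectl get pods -n kube-system -l app=secrets-store-provider-azure
kubectl port-forward -n kube-system <provider-pod-name> 8898:8898
curl localhost:8898/metrics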
The following table lists the metrics that are provided by the Azure Key Vault Provider for Secrets Store CSI
Driver:
The following table lists the metrics provided by the Secrets Store CSI Driver:
Next steps
Now that you've learned how to use the Azure Key Vault Provider for Secrets Store CSI Driver with an AKS
cluster, see Enable CSI drivers for Azure Disks and Azure Files on AKS.
Provide an identity to access the Azure Key Vault
Provider for Secrets Store CSI Driver
6/15/2022 • 6 minutes to read • Edit Online
The Secrets Store CSI Driver on Azure Kubernetes Service (AKS) provides a variety of methods of identity-based
access to your Azure key vault. This article outlines these methods and how to use them to access your key vault
and its contents from your AKS cluster. For more information, see Use the Secrets Store CSI Driver.
3. Create a SecretProviderClass by using the following YAML, using your own values for aadpodidbinding ,
tenantId , and the objects to retrieve from your key vault:
# This is a SecretProviderClass example using aad-pod-identity to access the key vault
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: azure-kvname-podid
spec:
provider: azure
parameters:
usePodIdentity: "true" # Set to true for using aad-pod-identity to access your key vault
keyvaultName: <key-vault-name> # Set to the name of your key vault
cloudName: "" # [OPTIONAL for Azure] if not provided, the Azure environment defaults to AzurePublicCloud
objects: |
array:
- |
objectName: secret1
objectType: secret # object types: secret, key, or cert
objectVersion: "" # [OPTIONAL] object versions, default to latest if empty
- |
objectName: key1
objectType: key
objectVersion: ""
tenantId: <tenant-Id> # The tenant ID of the key vault
5. Create a pod by using the following YAML, using the name of your identity:
# This is a sample pod definition for using SecretProviderClass and aad-pod-identity to access the key vault
kind: Pod
apiVersion: v1
metadata:
name: busybox-secrets-store-inline-podid
labels:
aadpodidbinding: <name> # Set the label value to the name of your pod identity
spec:
containers:
- name: busybox
image: k8s.gcr.io/e2e-test-images/busybox:1.29-1
command:
- "/bin/sleep"
- "10000"
volumeMounts:
- name: secrets-store01-inline
mountPath: "/mnt/secrets-store"
readOnly: true
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: "azure-kvname-podid"
Alternatively, you can create a new managed identity and assign it to your virtual machine (VM) scale set
or to each VM instance in your availability set:
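A hedged sketch of that alternative with placeholder names, using az identity create and az vmss identity assign:
az identity create --resource-group <resource-group> --name <identity-name>
az vmss identity assign --resource-group <node-resource-group> --name <vmss-name> --identities <identity-resource-id>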
2. To grant your identity permissions that enable it to read your key vault and view its contents, run the
following commands:
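A hedged example using Key Vault access policies with the identity's client ID (names are placeholders; adjust if your vault uses Azure RBAC instead of access policies):
az keyvault set-policy --name <keyvault-name> --secret-permissions get --spn <client-id>
az keyvault set-policy --name <keyvault-name> --key-permissions get --spn <client-id>
az keyvault set-policy --name <keyvault-name> --certificate-permissions get --spn <client-id>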
3. Create a SecretProviderClass by using the following YAML, using your own values for userAssignedIdentityID , keyvaultName , tenantId , and the objects to retrieve from your key vault:
# This is a SecretProviderClass example using user-assigned identity to access your key vault
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: azure-kvname-user-msi
spec:
provider: azure
parameters:
usePodIdentity: "false"
useVMManagedIdentity: "true" # Set to true for using managed identity
userAssignedIdentityID: <client-id> # Set the clientID of the user-assigned managed identity to use
keyvaultName: <key-vault-name> # Set to the name of your key vault
cloudName: "" # [OPTIONAL for Azure] if not provided, the Azure environment defaults to AzurePublicCloud
objects: |
array:
- |
objectName: secret1
objectType: secret # object types: secret, key, or cert
objectVersion: "" # [OPTIONAL] object versions, default to latest if empty
- |
objectName: key1
objectType: key
objectVersion: ""
tenantId: <tenant-id> # The tenant ID of the key vault
# This is a sample pod definition for using SecretProviderClass and the user-assigned identity to access your key vault
kind: Pod
apiVersion: v1
metadata:
name: busybox-secrets-store-inline-user-msi
spec:
containers:
- name: busybox
image: k8s.gcr.io/e2e-test-images/busybox:1.29-1
command:
- "/bin/sleep"
- "10000"
volumeMounts:
- name: secrets-store01-inline
mountPath: "/mnt/secrets-store"
readOnly: true
volumes:
- name: secrets-store01-inline
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: "azure-kvname-user-msi"
IMPORTANT
Before you begin this step, enable system-assigned managed identity in your AKS cluster's VMs or scale sets.
Usage
1. Verify that your virtual machine scale set or availability set nodes have their own system-assigned
identity:
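For example (resource names are placeholders; the node resource group is typically the MC_* group created with the cluster):
az vmss identity show --resource-group <node-resource-group> --name <vmss-name> -o yaml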
NOTE
The output should contain type: SystemAssigned . Make a note of the principalId .
IMDS looks for a system-assigned identity on the scale set first. If it doesn't find one, it looks for a user-assigned identity and uses it only if exactly one is assigned. If there are multiple user-assigned identities, IMDS throws an error because it doesn't know which identity to pull.
2. To grant your identity permissions that enable it to read your key vault and view its contents, run the
following commands:
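A hedged example using the principalId noted above with Key Vault access policies (placeholders; adjust for vaults that use Azure RBAC):
az keyvault set-policy --name <keyvault-name> --secret-permissions get --object-id <principal-id>
az keyvault set-policy --name <keyvault-name> --key-permissions get --object-id <principal-id>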
3. Create a SecretProviderClass by using the following YAML, using your own values for keyvaultName ,
tenantId , and the objects to retrieve from your key vault:
# This is a SecretProviderClass example using system-assigned identity to access your key vault
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: azure-kvname-system-msi
spec:
provider: azure
parameters:
usePodIdentity: "false"
useVMManagedIdentity: "true" # Set to true for using managed identity
userAssignedIdentityID: "" # If empty, then defaults to use the system assigned identity on the VM
keyvaultName: <key-vault-name>
cloudName: "" # [OPTIONAL for Azure] if not provided, the Azure environment defaults to AzurePublicCloud
objects: |
array:
- |
objectName: secret1
objectType: secret # object types: secret, key, or cert
objectVersion: "" # [OPTIONAL] object versions, default to latest if empty
- |
objectName: key1
objectType: key
objectVersion: ""
tenantId: <tenant-id> # The tenant ID of the key vault
Next steps
To validate that the secrets are mounted at the volume path that's specified in your pod's YAML, see Use the
Azure Key Vault Provider for Secrets Store CSI Driver in an AKS cluster.
Set up Secrets Store CSI Driver to enable NGINX
Ingress Controller with TLS
6/15/2022 • 7 minutes to read • Edit Online
This article walks you through the process of securing an NGINX Ingress Controller with TLS with an Azure
Kubernetes Service (AKS) cluster and an Azure Key Vault (AKV) instance. For more information, see TLS in
Kubernetes.
Importing the ingress TLS certificate to the cluster can be accomplished using one of two methods:
Application - The application deployment manifest declares and mounts the provider volume. The certificate is made available in the cluster only when the application is deployed, and when the application is removed the secret is removed as well. This scenario fits development teams that are responsible for the application's security infrastructure and its integration with the cluster.
Ingress Controller - The ingress deployment is modified to declare and mount the provider volume. The secret is imported when the ingress pods are created, and the application's pods have no access to the TLS certificate. This approach fits scenarios where one team (for example, IT) manages and creates infrastructure and networking components (including HTTPS TLS certificates) and other teams manage the application lifecycle. In this case, the ingress is specific to a single namespace/workload and is deployed in the same namespace as the application.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
Before you start, ensure your Azure CLI version is >= 2.30.0 , or install the latest version.
An AKS cluster with the Secrets Store CSI Driver configured.
An Azure Key Vault instance.
Deploy a SecretProviderClass
First, create a new namespace:
export NAMESPACE=ingress-basic
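For example, using the variable defined above:
kubectl create namespace $NAMESPACE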
Select a method to provide an access identity and configure your SecretProviderClass YAML accordingly.
Additionally:
Be sure to use objectType=secret , which is the only way to obtain the private key and the certificate from
AKV.
Set kubernetes.io/tls as the type in your secretObjects section.
See the following example of what your SecretProviderClass might look like:
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: azure-tls
spec:
provider: azure
secretObjects: # secretObjects defines the desired state of synced K8s secret objects
- secretName: ingress-tls-csi
type: kubernetes.io/tls
data:
- objectName: $CERT_NAME
key: tls.key
- objectName: $CERT_NAME
key: tls.crt
parameters:
usePodIdentity: "false"
useVMManagedIdentity: "true"
userAssignedIdentityID: <client id>
keyvaultName: $AKV_NAME # the name of the AKV instance
objects: |
array:
- |
objectName: $CERT_NAME
objectType: secret
tenantId: $TENANT_ID # the tenant ID of the AKV instance
NOTE
If not using Azure Active Directory (AAD) pod identity as your method of access, remove the line with
--set controller.podLabels.aadpodidbinding=$AAD_POD_IDENTITY_NAME
Make note of the tls section referencing the secret we've created earlier, and apply the file to your cluster:
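A hedged example of these two steps, assuming the ingress manifest is saved as hello-world-ingress.yaml (a file name assumption) and <EXTERNAL_IP> is your ingress controller's public IP:
kubectl apply -f hello-world-ingress.yaml -n $NAMESPACE
curl -v -k --resolve demo.azure.com:443:<EXTERNAL_IP> https://demo.azure.com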
No additional path was provided with the address, so the ingress controller defaults to the / route. The first
demo application is returned, as shown in the following condensed example output:
[...]
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="/static/default.css">
<title>Welcome to Azure Kubernetes Service (AKS)</title>
[...]
The -v parameter in our curl command outputs verbose information, including the TLS certificate received.
Half-way through your curl output, you can verify that your own TLS certificate was used. The -k parameter
continues loading the page even though we're using a self-signed certificate. The following example shows that
the issuer: CN=demo.azure.com; O=aks-ingress-tls certificate was used:
[...]
* Server certificate:
* subject: CN=demo.azure.com; O=aks-ingress-tls
* start date: Oct 22 22:13:54 2021 GMT
* expire date: Oct 22 22:13:54 2022 GMT
* issuer: CN=demo.azure.com; O=aks-ingress-tls
* SSL certificate verify result: self signed certificate (18), continuing anyway.
[...]
Now add the /hello-world-two path to the address, such as https://demo.azure.com/hello-world-two . The second demo application with the custom title is returned, as shown in the following condensed example output:
[...]
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="/static/default.css">
<title>AKS Ingress Demo</title>
[...]
Troubleshoot Azure Key Vault Provider for Secrets
Store CSI Driver
6/15/2022 • 2 minutes to read • Edit Online
This article lists common issues with using Azure Key Vault Provider for Secrets Store CSI Driver on Azure
Kubernetes Service (AKS) and provides troubleshooting tips for resolving them.
Logging
Azure Key Vault Provider logs are available in the provider pods. To troubleshoot issues with the provider, you
can look at logs from the provider pod that's running on the same node as your application pod. Run the
following commands:
# find the secrets-store-provider-azure pod running on the same node as your application pod
kubectl get pods -l app=secrets-store-provider-azure -n kube-system -o wide
kubectl logs -l app=secrets-store-provider-azure -n kube-system --since=1h | grep ^E
You can also access Secrets Store CSI Driver logs by running the following commands:
# find the secrets-store-csi-driver pod running on the same node as your application pod
kubectl get pods -l app=secrets-store-csi-driver -n kube-system -o wide
kubectl logs -l app=secrets-store-csi-driver -n kube-system --since=1h | grep ^E
Common issues
Failed to get key vault token: nmi response failed with status code: 404
Error message in logs/events:
Description: The Node Managed Identity (NMI) component in aad-pod-identity returned an error for a token
request. For more information about the error and to resolve it, check the NMI pod logs and refer to the Azure
AD pod identity troubleshooting guide.
NOTE
Azure Active Directory (Azure AD) is abbreviated as aad in the aad-pod-identity string.
4. Try getting a secret that's already created in your Azure key vault:
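A hedged example, assuming the busybox-secrets-store-inline pod shown earlier is running and the ExampleSecret object is mounted at /mnt/secrets-store:
kubectl exec busybox-secrets-store-inline -- cat /mnt/secrets-store/ExampleSecret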
Create a private Azure Kubernetes Service cluster
In a private cluster, the control plane or API server has internal IP addresses that are defined in RFC 1918 (Address Allocation for Private Internets). By using a private cluster, you can ensure network traffic between your API server and your node pools remains on the private network only.
The control plane or API server is in an Azure Kubernetes Service (AKS)-managed Azure subscription. A
customer's cluster or node pool is in the customer's subscription. The server and the cluster or node pool can
communicate with each other through the Azure Private Link service in the API server virtual network and a
private endpoint that's exposed in the subnet of the customer's AKS cluster.
When you provision a private AKS cluster, AKS by default creates a private FQDN with a private DNS zone and
an additional public FQDN with a corresponding A record in Azure public DNS. The agent nodes still use the A
record in the private DNS zone to resolve the private IP address of the private endpoint for communication to
the API server.
Region availability
Private cluster is available in public regions, Azure Government, and Azure China 21Vianet regions where AKS is
supported.
NOTE
Azure Government sites are supported; however, US Gov Texas isn't currently supported because of missing Private Link support.
Prerequisites
Azure CLI >= 2.28.0 or Azure CLI with aks-preview extension 0.5.29 or later.
If using ARM templates or the REST API, the AKS API version must be 2021-05-01 or later.
The Private Link service is supported on Standard Azure Load Balancer only. Basic Azure Load Balancer isn't
supported.
To use a custom DNS server, add the Azure DNS IP 168.63.129.16 as the upstream DNS server in the custom
DNS server.
az aks create \
--resource-group <private-cluster-resource-group> \
--name <private-cluster-name> \
--load-balancer-sku standard \
--enable-private-cluster \
--network-plugin azure \
--vnet-subnet-id <subnet-id> \
--docker-bridge-address 172.17.0.1/16 \
--dns-service-ip 10.2.0.10 \
--service-cidr 10.2.0.0/24
NOTE
If the Docker bridge address CIDR (172.17.0.1/16) clashes with the subnet CIDR, change the Docker bridge address
appropriately.
Create a private AKS cluster with Custom Private DNS Zone or Private DNS SubZone
Create a private AKS cluster with Custom Private DNS Zone and Custom Subdomain
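A hedged sketch of what these options look like on the CLI (resource IDs and the subdomain are placeholders; the --private-dns-zone and --fqdn-subdomain flags are assumed from az aks create):
az aks create --resource-group <resource-group> --name <private-cluster-name> --enable-private-cluster --private-dns-zone <custom-private-dns-zone-or-subzone-resource-id>
az aks create --resource-group <resource-group> --name <private-cluster-name> --enable-private-cluster --private-dns-zone <custom-private-dns-zone-resource-id> --fqdn-subdomain <subdomain>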
1. By default, when a private cluster is provisioned, a private endpoint (1) and a private DNS zone (2) are
created in the cluster-managed resource group. The cluster uses an A record in the private zone to
resolve the IP of the private endpoint for communication to the API server.
2. The private DNS zone is linked only to the VNet that the cluster nodes are attached to (3). This means that
the private endpoint can only be resolved by hosts in that linked VNet. In scenarios where no custom
DNS is configured on the VNet (default), this works without issue as hosts point at 168.63.129.16 for
DNS that can resolve records in the private DNS zone because of the link.
3. In scenarios where the VNet containing your cluster has custom DNS settings (4), cluster deployment
fails unless the private DNS zone is linked to the VNet that contains the custom DNS resolvers (5). This
link can be created manually after the private zone is created during cluster provisioning or via
automation upon detection of creation of the zone using event-based deployment mechanisms (for
example, Azure Event Grid and Azure Functions).
NOTE
Conditional Forwarding doesn't support subdomains.
NOTE
If you are using Bring Your Own Route Table with kubenet and Bring Your Own DNS with a private cluster, the cluster creation will fail. You will need to associate the RouteTable in the node resource group with the subnet after the cluster creation fails, in order to make the creation successful.
IMPORTANT
If the virtual network is configured with custom DNS servers, private DNS will need to be set up appropriately for the
environment. See the virtual networks name resolution documentation for more details.
1. On the Azure portal menu or from the Home page, select Create a resource .
2. Search for Private Endpoint and select Create > Private Endpoint .
3. Select Create .
4. On the Basics tab, set up the following options:
Project details :
Select an Azure Subscription .
Select the Azure Resource group where your virtual network is located.
Instance details :
Enter a Name for the private endpoint, such as myPrivateEndpoint.
Select a Region for the private endpoint.
IMPORTANT
Check that the region selected is the same as the virtual network where you want to connect from, otherwise you won't
see your virtual network in the Configuration tab.
IMPORTANT
When creating the A record, use only the name, and not the fully qualified domain name (FQDN).
Once the A record is created, link the private DNS zone to the virtual network that will access the private cluster.
1. Go to the private DNS zone created in previous steps.
2. In the left pane, select Virtual network links .
3. Create a new link to add the virtual network to the private DNS zone. It takes a few minutes for the DNS zone
link to become available.
WARNING
If the private cluster is stopped and restarted, the private cluster's original private link service is removed and re-created,
which breaks the connection between your private endpoint and the private cluster. To resolve this issue, delete and re-
create any user created private endpoints linked to the private cluster. DNS records will also need to be updated if the re-
created private endpoints have new IP addresses.
Limitations
IP authorized ranges can't be applied to the private API server endpoint; they only apply to the public API server.
Azure Private Link service limitations apply to private clusters.
No support for Azure DevOps Microsoft-hosted Agents with private clusters. Consider using Self-hosted
Agents.
If you need to enable Azure Container Registry to work with a private AKS cluster, set up a private link for the
container registry in the cluster virtual network or set up peering between the Container Registry virtual
network and the private cluster's virtual network.
No support for converting existing AKS clusters into private clusters
Deleting or modifying the private endpoint in the customer subnet will cause the cluster to stop functioning.
Use command invoke to access a private Azure
Kubernetes Service (AKS) cluster
6/15/2022 • 2 minutes to read • Edit Online
Accessing a private AKS cluster requires that you connect to that cluster either from the cluster virtual network, from a peered network, or via a configured private endpoint. These approaches require configuring a VPN or ExpressRoute, deploying a jumpbox within the cluster virtual network, or creating a private endpoint inside of another virtual network. Alternatively, you can use command invoke to access private clusters without having to configure a VPN or ExpressRoute. Using command invoke allows you to remotely invoke commands like kubectl and helm on your private cluster through the Azure API without directly connecting to the cluster.
Permissions for using command invoke are controlled through the
Microsoft.ContainerService/managedClusters/runcommand/action and
Microsoft.ContainerService/managedclusters/commandResults/read roles.
Prerequisites
An existing private cluster.
The Azure CLI version 2.24.0 or later.
Access to the Microsoft.ContainerService/managedClusters/runcommand/action and
Microsoft.ContainerService/managedclusters/commandResults/read roles on the cluster.
Limitations
The pod created by the run command provides the following binaries:
The latest compatible version of kubectl for your cluster with kustomize .
helm
In addition, command invoke runs the commands from your cluster so any commands run in this manner are
subject to networking and other restrictions you have configured on your cluster.
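The invocation itself isn't shown in the extracted text; for a single kubectl command it would look like this:
az aks command invoke --resource-group myResourceGroup --name myAKSCluster --command "kubectl get pods -n kube-system"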
The above example runs the kubectl get pods -n kube-system command on the myAKSCluster cluster in
myResourceGroup.
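A hedged sketch of chaining several helm commands in one invocation (the repository and chart shown are assumptions for illustration):
az aks command invoke --resource-group myResourceGroup --name myAKSCluster --command "helm repo add bitnami https://charts.bitnami.com/bitnami && helm repo update && helm install my-release bitnami/nginx"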
The above example runs three helm commands on the myAKSCluster cluster in myResourceGroup.
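A hedged example of attaching a single file with the --file parameter so it's available to the command:
az aks command invoke --resource-group myResourceGroup --name myAKSCluster --command "kubectl apply -f deployment.yaml -n default" --file deployment.yaml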
The above runs kubectl apply -f deployment.yaml -n default on the myAKSCluster cluster in
myResourceGroup. The deployment.yaml file used by that command is attached from the current directory on
the development computer where az aks command invoke was run.
You can also attach all files in the current directory. For example:
The above runs kubectl apply -f deployment.yaml configmap.yaml -n default on the myAKSCluster cluster in
myResourceGroup. The deployment.yaml and configmap.yaml files used by that command are part of the
current directory on the development computer where az aks command invoke was run.
Use kubenet networking with your own IP address
ranges in Azure Kubernetes Service (AKS)
6/15/2022 • 14 minutes to read • Edit Online
By default, AKS clusters use kubenet, and an Azure virtual network and subnet are created for you. With
kubenet, nodes get an IP address from the Azure virtual network subnet. Pods receive an IP address from a
logically different address space to the Azure virtual network subnet of the nodes. Network address translation
(NAT) is then configured so that the pods can reach resources on the Azure virtual network. The source IP
address of the traffic is NAT'd to the node's primary IP address. This approach greatly reduces the number of IP
addresses that you need to reserve in your network space for pods to use.
With Azure Container Networking Interface (CNI), every pod gets an IP address from the subnet and can be
accessed directly. These IP addresses must be unique across your network space, and must be planned in
advance. Each node has a configuration parameter for the maximum number of pods that it supports. The
equivalent number of IP addresses per node are then reserved up front for that node. This approach requires
more planning, and often leads to IP address exhaustion or the need to rebuild clusters in a larger subnet as
your application demands grow. You can configure the maximum pods deployable to a node at cluster create
time or when creating new node pools. If you don't specify maxPods when creating new node pools, you receive
a default value of 110 for kubenet.
This article shows you how to use kubenet networking to create and use a virtual network subnet for an AKS
cluster. For more information on network options and considerations, see Network concepts for Kubernetes and
AKS.
Prerequisites
The virtual network for the AKS cluster must allow outbound internet connectivity.
Don't create more than one AKS cluster in the same subnet.
AKS clusters may not use 169.254.0.0/16 , 172.30.0.0/16 , 172.31.0.0/16 , or 192.0.2.0/24 for the
Kubernetes service address range, pod address range or cluster virtual network address range.
The cluster identity used by the AKS cluster must have at least the Network Contributor role on the subnet within your virtual network. The Azure CLI performs the role assignment automatically. If you're using an ARM template or other clients, the role assignment needs to be done manually. You must also have the appropriate permissions, such as the subscription owner, to create a cluster identity and assign it permissions. If you wish to define a custom role instead of using the built-in Network Contributor role, the following permissions are required:
Microsoft.Network/virtualNetworks/subnets/join/action
Microsoft.Network/virtualNetworks/subnets/read
WARNING
To use Windows Server node pools, you must use Azure CNI. The use of kubenet as the network model is not available for
Windows Server containers.
Azure supports a maximum of 400 routes in a UDR, so you can't have an AKS cluster larger than 400 nodes. AKS
Virtual Nodes and Azure Network Policies aren't supported with kubenet. You can use Calico Network Policies,
as they are supported with kubenet.
With Azure CNI, each pod receives an IP address in the IP subnet, and can directly communicate with other pods
and services. Your clusters can be as large as the IP address range you specify. However, the IP address range
must be planned in advance, and all of the IP addresses are consumed by the AKS nodes based on the maximum
number of pods that they can support. Advanced network features and scenarios such as Virtual Nodes or
Network Policies (either Azure or Calico) are supported with Azure CNI.
Limitations & considerations for kubenet
An additional hop is required in the design of kubenet, which adds minor latency to pod communication.
Route tables and user-defined routes are required for using kubenet, which adds complexity to operations.
Direct pod addressing isn't supported for kubenet due to kubenet design.
Unlike Azure CNI clusters, multiple kubenet clusters can't share a subnet.
AKS doesn't apply Network Security Groups (NSGs) to its subnet and will not modify any of the NSGs
associated with that subnet. If you provide your own subnet and add NSGs associated with that subnet, you
must ensure the security rules in the NSGs allow traffic between the node and pod CIDR. For more details,
see Network security groups.
Features not supported on kubenet include:
Azure network policies, but Calico network policies are supported on kubenet
Windows node pools
Virtual nodes add-on
IP address availability and exhaustion
With Azure CNI, a common issue is that the assigned IP address range is too small to add additional nodes when you scale or upgrade a cluster. The network team may also not be able to issue a large enough IP address range to support your expected application demands.
As a compromise, you can create an AKS cluster that uses kubenet and connect to an existing virtual network
subnet. This approach lets the nodes receive defined IP addresses, without the need to reserve a large number
of IP addresses up front for all of the potential pods that could run in the cluster.
With kubenet, you can use a much smaller IP address range and be able to support large clusters and
application demands. For example, even with a /27 IP address range on your subnet, you could run a 20-25
node cluster with enough room to scale or upgrade. This cluster size would support up to 2,200-2,750 pods
(with a default maximum of 110 pods per node). The maximum number of pods per node that you can
configure with kubenet in AKS is 110.
The following basic calculations compare the difference in network models:
kubenet - a simple /24 IP address range can support up to 251 nodes in the cluster (each Azure virtual
network subnet reserves the first three IP addresses for management operations)
This node count could support up to 27,610 pods (with a default maximum of 110 pods per node with
kubenet)
Azure CNI - that same basic /24 subnet range could only support a maximum of 8 nodes in the cluster
This node count could only support up to 240 pods (with a default maximum of 30 pods per node
with Azure CNI)
NOTE
These maximums don't take into account upgrade or scale operations. In practice, you can't run the maximum number of
nodes that the subnet IP address range supports. You must leave some IP addresses available for use during scale or
upgrade operations.
If you don't have an existing virtual network and subnet to use, create these network resources using the az network vnet create command. In the following example, the virtual network is named myAKSVnet with the address prefix of 192.168.0.0/16. A subnet named myAKSSubnet is created with the address prefix 192.168.1.0/24.
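A hedged example of that command, assuming the myResourceGroup resource group already exists:
az network vnet create \
    --resource-group myResourceGroup \
    --name myAKSVnet \
    --address-prefixes 192.168.0.0/16 \
    --subnet-name myAKSSubnet \
    --subnet-prefixes 192.168.1.0/24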
az ad sp create-for-rbac
The following example output shows the application ID and password for your service principal. These values
are used in additional steps to assign a role to the service principal and then create the AKS cluster:
{
"appId": "476b3636-5eda-4c0e-9751-849e70b5cfad",
"displayName": "azure-cli-2019-01-09-22-29-24",
"name": "http://azure-cli-2019-01-09-22-29-24",
"password": "a1024cd7-af7b-469f-8fd7-b293ecbb174e",
"tenant": "72f998bf-85f1-41cf-92ab-2e7cd014db46"
}
To assign the correct delegations in the remaining steps, use the az network vnet show and az network vnet
subnet show commands to get the required resource IDs. These resource IDs are stored as variables and
referenced in the remaining steps:
NOTE
If you are using the Azure CLI, you can skip this step. With an ARM template or other clients, you need to do the role assignment shown below.
VNET_ID=$(az network vnet show --resource-group myResourceGroup --name myAKSVnet --query id -o tsv)
SUBNET_ID=$(az network vnet subnet show --resource-group myResourceGroup --vnet-name myAKSVnet --name myAKSSubnet --query id -o tsv)
Now assign the service principal for your AKS cluster Network Contributor permissions on the virtual network
using the az role assignment create command. Provide your own <appId> as shown in the output from the
previous command to create the service principal:
az role assignment create --assignee <appId> --scope $VNET_ID --role "Network Contributor"
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 3 \
--network-plugin kubenet \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--pod-cidr 10.244.0.0/16 \
--docker-bridge-address 172.17.0.1/16 \
--vnet-subnet-id $SUBNET_ID \
--service-principal <appId> \
--client-secret <password>
NOTE
If you wish to enable an AKS cluster to include a Calico network policy, you can use the following command:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 3 \
--network-plugin kubenet --network-policy calico \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--pod-cidr 10.244.0.0/16 \
--docker-bridge-address 172.17.0.1/16 \
--vnet-subnet-id $SUBNET_ID \
--service-principal <appId> \
--client-secret <password>
When you create an AKS cluster, a network security group and route table are automatically created. These
network resources are managed by the AKS control plane. The network security group is automatically
associated with the virtual NICs on your nodes. The route table is automatically associated with the virtual
network subnet. Network security group rules and route tables are automatically updated as you create and
expose services.
WARNING
Custom rules can be added to the custom route table and updated. However, the Kubernetes cloud provider also adds rules, which must not be updated or removed. Rules such as 0.0.0.0/0 must always exist on a given route table and map to the target of your internet gateway, such as an NVA or other egress gateway. When updating rules, take care that only your custom rules are being modified.
# Create a Kubernetes cluster with a custom subnet preconfigured with a route table
az aks create -g MyResourceGroup -n MyManagedCluster --vnet-subnet-id MySubnetID
Next steps
With an AKS cluster deployed into your existing virtual network subnet, you can now use the cluster as normal.
Get started with creating new apps using Helm or deploy existing apps using Helm.
Use dual-stack kubenet networking in Azure
Kubernetes Service (AKS) (Preview)
6/15/2022 • 7 minutes to read • Edit Online
AKS clusters can now be deployed in a dual-stack (using both IPv4 and IPv6 addresses) mode when using
kubenet networking and a dual-stack Azure virtual network. In this configuration, nodes receive both an IPv4
and IPv6 address from the Azure virtual network subnet. Pods receive both an IPv4 and IPv6 address from a
logically different address space to the Azure virtual network subnet of the nodes. Network address translation
(NAT) is then configured so that the pods can reach resources on the Azure virtual network. The source IP
address of the traffic is NAT'd to the node's primary IP address of the same family (IPv4 to IPv4 and IPv6 to
IPv6).
This article shows you how to use dual-stack networking with an AKS cluster. For more information on network
options and considerations, see Network concepts for Kubernetes and AKS.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Limitations
NOTE
Dual-stack kubenet networking is currently not available in sovereign clouds. This note will be removed when rollout is
complete.
Azure Route Tables have a hard limit of 400 routes per table. Because each node in a dual-stack cluster
requires two routes, one for each IP address family, dual-stack clusters are limited to 200 nodes.
During preview, service objects are only supported with externalTrafficPolicy: Local .
Dual-stack networking is required for the Azure Virtual Network and the pod CIDR - single stack IPv6-only
isn't supported for node or pod IP addresses. Services can be provisioned on IPv4 or IPv6.
Features not supported on dual-stack kubenet include:
Azure network policies
Calico network policies
NAT Gateway
Virtual nodes add-on
Windows node pools
Prerequisites
All prerequisites from configure kubenet networking apply.
AKS dual-stack clusters require Kubernetes version v1.21.2 or greater. v1.22.2 or greater is recommended to
take advantage of the out-of-tree cloud controller manager, which is the default on v1.22 and up.
Azure CLI with the aks-preview extension 0.5.48 or newer.
If using Azure Resource Manager templates, schema version 2021-10-01 is required.
Register the AKS-EnableDualStack preview feature
To create an AKS dual-stack cluster, you must enable the AKS-EnableDualStack feature flag on your subscription.
Register the AKS-EnableDualStack feature flag by using the az feature register command, as shown in the
following example:
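The registration command isn't shown in the extracted text; it would typically be:
az feature register --namespace "Microsoft.ContainerService" --name "AKS-EnableDualStack"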
It takes a few minutes for the status to show Registered. Verify the registration status by using the
az feature list command:
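A typical verification query (the JMESPath filter shown is an assumption):
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/AKS-EnableDualStack')].{Name:name,State:properties.state}"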
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the
az provider register command:
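For example:
az provider register --namespace Microsoft.ContainerService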
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Deploying a dual-stack cluster requires passing the --ip-families parameter with the parameter value of
ipv4,ipv6 to indicate that a dual-stack cluster should be created.
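A hedged example of the create command (region and names are placeholders; other flags may be needed for your environment):
az aks create -l <region> -g <resource-group> -n <cluster-name> --ip-families ipv4,ipv6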
Finally, after the cluster has been created, get the admin credentials:
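For example:
az aks get-credentials -g <resource-group> -n <cluster-name>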
The output from the kubectl get nodes command will show that the nodes have addresses and pod IP
assignment space from both IPv4 and IPv6.
Using the following kubectl get pods command will show that the pods have both IPv4 and IPv6 addresses
(note that the pods will not show IP addresses until they are ready):
IMPORTANT
There are currently two limitations pertaining to IPv6 services in AKS. These are both preview limitations and work is
underway to remove them.
Azure Load Balancer sends health probes to IPv6 destinations from a link-local address. This traffic cannot be routed to
a pod and thus traffic flowing to IPv6 services deployed with externalTrafficPolicy: Cluster will fail. During
preview, IPv6 services MUST be deployed with externalTrafficPolicy: Local , which causes kube-proxy to
respond to the probe on the node, in order to function.
Only the first IP address for a service will be provisioned to the load balancer, so a dual-stack service will only receive a
public IP for its first listed IP family. In order to provide a dual-stack service for a single deployment, please create two
services targeting the same selector, one for IPv4 and one for IPv6.
service/nginx-ipv4 exposed
service/nginx-ipv6 exposed
Once the deployment has been exposed and the LoadBalancer services have been fully provisioned,
kubectl get services will show the IP addresses of the services:
Next, we can verify functionality via a command-line web request from an IPv6 capable host (note that Azure
Cloud Shell is not IPv6 capable):
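A hedged example, using the nginx-ipv6 service exposed above (the jsonpath expression and curl flags are assumptions; -g disables curl's bracket globbing for the IPv6 literal):
SERVICE_IP=$(kubectl get services nginx-ipv6 -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -gs "http://[${SERVICE_IP}]"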
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
Configure Azure CNI networking in Azure
Kubernetes Service (AKS)
6/15/2022 • 18 minutes to read • Edit Online
By default, AKS clusters use kubenet, and a virtual network and subnet are created for you. With kubenet, nodes
get an IP address from a virtual network subnet. Network address translation (NAT) is then configured on the
nodes, and pods receive an IP address "hidden" behind the node IP. This approach reduces the number of IP
addresses that you need to reserve in your network space for pods to use.
With Azure Container Networking Interface (CNI), every pod gets an IP address from the subnet and can be
accessed directly. These IP addresses must be unique across your network space, and must be planned in
advance. Each node has a configuration parameter for the maximum number of pods that it supports. The
equivalent number of IP addresses per node are then reserved up front for that node. This approach requires
more planning, and often leads to IP address exhaustion or the need to rebuild clusters in a larger subnet as
your application demands grow.
This article shows you how to use Azure CNI networking to create and use a virtual network subnet for an AKS
cluster. For more information on network options and considerations, see Network concepts for Kubernetes and
AKS.
Prerequisites
The virtual network for the AKS cluster must allow outbound internet connectivity.
AKS clusters may not use 169.254.0.0/16 , 172.30.0.0/16 , 172.31.0.0/16 , or 192.0.2.0/24 for the
Kubernetes service address range, pod address range, or cluster virtual network address range.
The cluster identity used by the AKS cluster must have at least Network Contributor permissions on the
subnet within your virtual network. If you wish to define a custom role instead of using the built-in Network
Contributor role, the following permissions are required:
Microsoft.Network/virtualNetworks/subnets/join/action
Microsoft.Network/virtualNetworks/subnets/read
The subnet assigned to the AKS node pool cannot be a delegated subnet.
AKS doesn't apply Network Security Groups (NSGs) to its subnet and will not modify any of the NSGs
associated with that subnet. If you provide your own subnet and add NSGs associated with that subnet, you
must ensure the security rules in the NSGs allow traffic within the node CIDR range. For more details, see
Network security groups.
If you expect your nodes to run the maximum number of pods, and regularly destroy and deploy pods, you should also factor in some additional IP addresses per node. These additional IP addresses take into consideration that it may take a few seconds for a service to be deleted and the IP address released before a new service is deployed and acquires the address.
The IP address plan for an AKS cluster consists of a virtual network, at least one subnet for nodes and pods, and
a Kubernetes service address range.
Virtual network : The Azure virtual network can be as large as /8, but is limited to 65,536 configured IP addresses. Consider all your networking needs, including communicating with services in other virtual networks, before configuring your address space. For example, if you configure too large of an address space, you may run into issues with overlapping other address spaces within your network.
Kubernetes service address range : This range should not be used by any network element on or connected to this virtual network. Service address CIDR must be smaller than /12. You can reuse this range across different AKS clusters.
Kubernetes DNS service IP address : IP address within the Kubernetes service address range that will be used by cluster service discovery. Don't use the first IP address in your address range. The first address in your subnet range is used for the kubernetes.default.svc.cluster.local address.
Docker bridge address : The Docker bridge network address represents the default docker0 bridge network address present in all Docker installations. While docker0 bridge is not used by AKS clusters or the pods themselves, you must set this address to continue to support scenarios such as docker build within the AKS cluster. It is required to select a CIDR for the Docker bridge network address because otherwise Docker will pick a subnet automatically, which could conflict with other CIDRs. You must pick an address space that does not collide with the rest of the CIDRs on your networks, including the cluster's service CIDR and pod CIDR. Default of 172.17.0.1/16. You can reuse this range across different AKS clusters.
NETWORKING    MINIMUM    MAXIMUM
Kubenet       10         250
NOTE
The minimum value in the table above is strictly enforced by the AKS service. You can not set a maxPods value lower than
the minimum shown as doing so can prevent the cluster from starting.
Azure CLI : Specify the --max-pods argument when you deploy a cluster with the az aks create command.
The maximum value is 250.
Resource Manager template : Specify the maxPods property in the ManagedClusterAgentPoolProfile
object when you deploy a cluster with a Resource Manager template. The maximum value is 250.
Azure portal : Change the Max pods per node field in the node pool settings when creating a cluster or adding a new node pool.
Configure maximum - existing clusters
The maxPod per node setting can be defined when you create a new node pool. If you need to increase the
maxPod per node setting on an existing cluster, add a new node pool with the new desired maxPod count. After
migrating your pods to the new pool, delete the older pool. To delete any older pool in a cluster, ensure you are
setting node pool modes as defined in the system node pools document.
Deployment parameters
When you create an AKS cluster, the following parameters are configurable for Azure CNI networking:
Virtual network : The virtual network into which you want to deploy the Kubernetes cluster. If you want to
create a new virtual network for your cluster, select Create new and follow the steps in the Create virtual
network section. For information about the limits and quotas for an Azure virtual network, see Azure
subscription and service limits, quotas, and constraints.
Subnet : The subnet within the virtual network where you want to deploy the cluster. If you want to create a new
subnet in the virtual network for your cluster, select Create new and follow the steps in the Create subnet
section. For hybrid connectivity, the address range shouldn't overlap with any other virtual networks in your
environment.
Azure Network Plugin : When Azure network plugin is used, the internal LoadBalancer service with
"externalTrafficPolicy=Local" can't be accessed from VMs with an IP in clusterCIDR that does not belong to AKS
cluster.
Kubernetes service address range : This parameter is the set of virtual IPs that Kubernetes assigns to internal
services in your cluster. You can use any private address range that satisfies the following requirements:
Must not be within the virtual network IP address range of your cluster
Must not overlap with any other virtual networks with which the cluster virtual network peers
Must not overlap with any on-premises IPs
Must not be within the ranges 169.254.0.0/16 , 172.30.0.0/16 , 172.31.0.0/16 , or 192.0.2.0/24
Although it's technically possible to specify a service address range within the same virtual network as your
cluster, doing so is not recommended. Unpredictable behavior can result if overlapping IP ranges are used. For
more information, see the FAQ section of this article. For more information on Kubernetes services, see Services
in the Kubernetes documentation.
Kubernetes DNS service IP address : The IP address for the cluster's DNS service. This address must be
within the Kubernetes service address range. Don't use the first IP address in your address range. The first
address in your subnet range is used for the kubernetes.default.svc.cluster.local address.
Docker Bridge address : The Docker bridge network address represents the default docker0 bridge network
address present in all Docker installations. While docker0 bridge is not used by AKS clusters or the pods
themselves, you must set this address to continue to support scenarios such as docker build within the AKS
cluster. It is required to select a CIDR for the Docker bridge network address because otherwise Docker will pick
a subnet automatically which could conflict with other CIDRs. You must pick an address space that does not
collide with the rest of the CIDRs on your networks, including the cluster's service CIDR and pod CIDR.
/subscriptions/<guid>/resourceGroups/myVnet/providers/Microsoft.Network/virtualNetworks/myVnet/subnets/default
Use the az aks create command with the --network-plugin azure argument to create a cluster with advanced
networking. Update the --vnet-subnet-id value with the subnet ID collected in the previous step:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--network-plugin azure \
--vnet-subnet-id <subnet-id> \
--docker-bridge-address 172.17.0.1/16 \
--dns-service-ip 10.2.0.10 \
--service-cidr 10.2.0.0/24 \
--generate-ssh-keys
NOTE
When using dynamic allocation of IPs, exposing an application as a Private Link Service using a Kubernetes Load Balancer
Service is not supported.
The prerequisites already listed for Azure CNI still apply, but there are a few additional limitations:
Only linux node clusters and node pools are supported.
AKS Engine and DIY clusters are not supported.
Azure CLI version 2.37.0 or later.
Planning IP addressing
When using this feature, planning is much simpler. Since the nodes and pods scale independently, their address
spaces can also be planned separately. Since pod subnets can be configured to the granularity of a node pool,
customers can always add a new subnet when they add a node pool. The system pods in a cluster/node pool
also receive IPs from the pod subnet, so this behavior needs to be accounted for.
The planning of IPs for Kubernetes services and Docker bridge remain unchanged.
Maximum pods per node in a cluster with dynamic allocation of IPs and enhanced subnet support
The pods per node values when using Azure CNI with dynamic allocation of IPs have changed slightly from the
traditional CNI behavior:
All other guidance related to configuring the maximum pods per node remains the same.
Additional deployment parameters
The deployment parameters described above are all still valid, with one exception:
The subnet parameter now refers to the subnet related to the cluster's nodes.
An additional parameter pod subnet is used to specify the subnet whose IP addresses will be dynamically
allocated to pods.
Configure networking - CLI with dynamic allocation of IPs and enhanced subnet support
Using dynamic allocation of IPs and enhanced subnet support in your cluster is similar to the default method for
configuring a cluster Azure CNI. The following example walks through creating a new virtual network with a
subnet for nodes and a subnet for pods, and creating a cluster that uses Azure CNI with dynamic allocation of
IPs and enhanced subnet support. Be sure to replace variables such as $subscription with your own values:
First, create the virtual network with two subnets:
resourceGroup="myResourceGroup"
vnet="myVirtualNetwork"
location="westcentralus"
clusterName="myAKSCluster"
subscription="aaaaaaa-aaaaa-aaaaaa-aaaa"
az network vnet subnet create -g $resourceGroup --vnet-name $vnet --name node2subnet --address-prefixes 10.242.0.0/16 -o none
az network vnet subnet create -g $resourceGroup --vnet-name $vnet --name pod2subnet --address-prefixes 10.243.0.0/16 -o none
Then, create the cluster, referencing the node subnet using --vnet-subnet-id and the pod subnet using --pod-subnet-id :
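The cluster creation command itself isn't included in the extracted text (nor is the az network vnet create call that the subnet commands above assume already ran); a hedged sketch of the cluster creation, with flag values as assumptions:
az aks create \
    --name $clusterName \
    --resource-group $resourceGroup \
    --location $location \
    --max-pods 250 \
    --network-plugin azure \
    --vnet-subnet-id /subscriptions/$subscription/resourceGroups/$resourceGroup/providers/Microsoft.Network/virtualNetworks/$vnet/subnets/node2subnet \
    --pod-subnet-id /subscriptions/$subscription/resourceGroups/$resourceGroup/providers/Microsoft.Network/virtualNetworks/$vnet/subnets/pod2subnet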
Next steps
Learn more about networking in AKS in the following articles:
Use a static IP address with the Azure Kubernetes Service (AKS) load balancer
Use an internal load balancer with Azure Kubernetes Service (AKS)
Create a basic ingress controller with external network connectivity
Enable the HTTP application routing add-on
Create an ingress controller that uses an internal, private network and IP address
Create an ingress controller with a dynamic public IP and configure Let's Encrypt to automatically
generate TLS certificates
Create an ingress controller with a static public IP and configure Let's Encrypt to automatically generate
TLS certificates
Bring your own Container Network Interface (CNI)
plugin with Azure Kubernetes Service (AKS)
(PREVIEW)
6/15/2022 • 4 minutes to read • Edit Online
Kubernetes does not provide a network interface system by default; this functionality is provided by network
plugins. Azure Kubernetes Service provides several supported CNI plugins. Documentation for supported
plugins can be found from the networking concepts page.
While the supported plugins meet most networking needs in Kubernetes, advanced users of AKS may desire to
utilize the same CNI plugin used in on-premises Kubernetes environments or to make use of specific advanced
functionality available in other CNI plugins.
This article shows how to deploy an AKS cluster with no CNI plugin pre-installed, which allows for installation of
any third-party CNI plugin that works in Azure.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Support
BYOCNI has support implications - Microsoft support will not be able to assist with CNI-related issues in
clusters deployed with BYOCNI. For example, CNI-related issues would cover most east/west (pod to pod) traffic,
along with kubectl proxy and similar commands. If CNI-related support is desired, a supported AKS network
plugin can be used or support could be procured for the BYOCNI plugin from a third-party vendor.
Support will still be provided for non-CNI-related issues.
Prerequisites
For ARM/Bicep, use at least template version 2022-01-02-preview
For Azure CLI, use at least version 0.5.55 of the aks-preview extension
The virtual network for the AKS cluster must allow outbound internet connectivity.
AKS clusters may not use 169.254.0.0/16 , 172.30.0.0/16 , 172.31.0.0/16 , or 192.0.2.0/24 for the
Kubernetes service address range, pod address range, or cluster virtual network address range.
The cluster identity used by the AKS cluster must have at least Network Contributor permissions on the
subnet within your virtual network. If you wish to define a custom role instead of using the built-in Network
Contributor role, the following permissions are required:
Microsoft.Network/virtualNetworks/subnets/join/action
Microsoft.Network/virtualNetworks/subnets/read
The subnet assigned to the AKS node pool cannot be a delegated subnet.
AKS doesn't apply Network Security Groups (NSGs) to its subnet and will not modify any of the NSGs
associated with that subnet. If you provide your own subnet and add NSGs associated with that subnet, you
must ensure the security rules in the NSGs allow traffic within the node CIDR range. For more details, see
Network security groups.
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
Deploy a cluster
Azure CLI
Azure Resource Manager
Bicep
Deploying a BYOCNI cluster requires passing the --network-plugin parameter with the parameter value of
none .
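For example, a minimal sketch of such a deployment, assuming the resource group and cluster names used elsewhere in this article:
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--network-plugin none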
Next steps
Learn more about networking in AKS in the following articles:
Use a static IP address with the Azure Kubernetes Service (AKS) load balancer
Use an internal load balancer with Azure Kubernetes Service (AKS)
Create a basic ingress controller with external network connectivity
Enable the HTTP application routing add-on
Create an ingress controller that uses an internal, private network and IP address
Create an ingress controller with a dynamic public IP and configure Let's Encrypt to automatically
generate TLS certificates
Create an ingress controller with a static public IP and configure Let's Encrypt to automatically generate
TLS certificates
Use an internal load balancer with Azure
Kubernetes Service (AKS)
6/15/2022 • 6 minutes to read • Edit Online
To restrict access to your applications in Azure Kubernetes Service (AKS), you can create and use an internal load
balancer. An internal load balancer makes a Kubernetes service accessible only to applications running in the
same virtual network as the Kubernetes cluster. This article shows you how to create and use an internal load
balancer with Azure Kubernetes Service (AKS).
NOTE
Azure Load Balancer is available in two SKUs: Basic and Standard. By default, the Standard SKU is used when you create an AKS cluster. When you create a Service of type LoadBalancer, you get the same load balancer SKU as the one provisioned with the cluster. For more information, see Azure load balancer SKU comparison.
apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: internal-app
Deploy the internal load balancer using kubectl apply and specify the name of your YAML manifest:
kubectl apply -f internal-lb.yaml
An Azure load balancer is created in the node resource group and connected to the same virtual network as the
AKS cluster.
When you view the service details, the IP address of the internal load balancer is shown in the EXTERNAL-IP column. In this context, external refers to the external interface of the load balancer; it doesn't mean the service receives a public, external IP address. It may take a minute or two for the IP address to change from <pending> to an actual internal IP address, as shown in the following example:
Specify an IP address
If you would like to use a specific IP address with the internal load balancer, add the loadBalancerIP property to
the load balancer YAML manifest. In this scenario, the specified IP address must reside in the same subnet as the
AKS cluster but can't already be assigned to a resource. For example, an IP address in the range designated for
the Kubernetes subnet within the AKS cluster shouldn't be used.
apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  loadBalancerIP: 10.240.0.25
  ports:
  - port: 80
  selector:
    app: internal-app
When deployed and you view the service details, the IP address in the EXTERNAL-IP column reflects your
specified IP address:
For more information on configuring your load balancer in a different subnet, see Specify a different subnet.
apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-pls-create: "true"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: internal-app
Deploy the internal load balancer using kubectl apply and specify the name of your YAML manifest:
An Azure load balancer is created in the node resource group and connected to the same virtual network as the
AKS cluster.
When you view the service details, the IP address of the internal load balancer is shown in the EXTERNAL-IP column. In this context, external refers to the external interface of the load balancer; it doesn't mean the service receives a public, external IP address. It may take a minute or two for the IP address to change from <pending> to an actual internal IP address, as shown in the following example:
Additionally, a Private Link Service object will also be created that connects to the Frontend IP configuration of
the Load Balancer associated with the Kubernetes service. Details of the Private Link Service object can be
retrieved as shown in the following example:
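The retrieval command isn't included above; one likely way to list the Private Link Service details is shown in the following sketch, which assumes the object lives in the cluster's node resource group:
AKS_NODE_RG=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
az network private-link-service list --resource-group $AKS_NODE_RG --query "[].{Name:name,Alias:alias}" -o table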
Name Alias
-------- -------------------------------------------------------------------------
pls-xyz pls-xyz.abc123-defg-4hij-56kl-789mnop.eastus2.azure.privatelinkservice
NOTE
You may need to grant the cluster identity for your AKS cluster the Network Contributor role to the resource group where
your Azure virtual network resources are deployed. View the cluster identity with az aks show, such as
az aks show --resource-group myResourceGroup --name myAKSCluster --query "identity" . To create a role
assignment, use the az role assignment create command.
apiVersion: v1
kind: Service
metadata:
  name: internal-app
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "apps-subnet"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: internal-app
Next steps
Learn more about Kubernetes services at the Kubernetes services documentation.
Use a public Standard Load Balancer in Azure
Kubernetes Service (AKS)
6/15/2022 • 21 minutes to read • Edit Online
The Azure Load Balancer operates at layer 4 (L4) of the Open Systems Interconnection (OSI) model and supports both inbound and outbound scenarios. It distributes inbound flows that arrive at the load balancer's front end to the backend pool instances.
A public Load Balancer integrated with AKS serves two purposes:
1. To provide outbound connections to the cluster nodes inside the AKS virtual network. It achieves this objective by translating the nodes' private IP addresses to a public IP address that is part of its Outbound Pool.
2. To provide access to applications via Kubernetes services of type LoadBalancer . With it, you can easily scale
your applications and create highly available services.
An internal (or private) load balancer is used where only private IPs are allowed as frontend. Internal load
balancers are used to load balance traffic inside a virtual network. A load balancer frontend can also be accessed
from an on-premises network in a hybrid scenario.
This document covers the integration with Public Load balancer. For internal Load Balancer integration, see the
AKS Internal Load balancer documentation.
IMPORTANT
If you prefer not to leverage the Azure Load Balancer to provide outbound connection and instead have your own
gateway, firewall or proxy for that purpose you can skip the creation of the load balancer outbound pool and respective
frontend IP by setting the outbound type to UserDefinedRouting (UDR). The outbound type defines the egress method for a cluster, and it defaults to loadBalancer.
Deploy the public service manifest by using kubectl apply and specify the name of your YAML manifest:
The Azure Load Balancer will be configured with a new public IP that will front this new service. Since the Azure
Load Balancer can have multiple Frontend IPs, each new service deployed will get a new dedicated frontend IP
to be uniquely accessed.
You can confirm your service is created and the load balancer is configured by running for example:
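The verification command isn't shown above; a minimal sketch, assuming the service in your manifest is named azure-vote-front (the name used by the sample manifest later in this article):
kubectl get service azure-vote-front --watch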
When you view the service details, the public IP address created for this service on the load balancer is shown in
the EXTERNAL-IP column. It may take a minute or two for the IP address to change from <pending> to an actual
public IP address, as shown in the above example.
IMPORTANT
Only one outbound IP option (managed IPs, bring your own IP, or IP Prefix) can be used at a given time.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-managed-outbound-ip-count 2
The above example sets the number of managed outbound public IPs to 2 for the myAKSCluster cluster in
myResourceGroup.
You can also set the initial number of managed outbound public IPs when creating your cluster by appending the --load-balancer-managed-outbound-ip-count parameter and setting it to your desired value. The default number of managed outbound public IPs is 1.
Provide your own outbound public IPs or prefixes
When you use a Standard SKU load balancer, by default the AKS cluster automatically creates a public IP in the
AKS-managed infrastructure resource group and assigns it to the load balancer outbound pool.
A public IP created by AKS is considered an AKS managed resource. This means the lifecycle of that public IP is
intended to be managed by AKS and requires no user action directly on the public IP resource. Alternatively, you
can assign your own custom public IP or public IP prefix at cluster creation time. Your custom IPs can also be
updated on an existing cluster's load balancer properties.
Requirements for using your own public IP or prefix:
Custom public IP addresses must be created and owned by the user. Managed public IP addresses created by
AKS cannot be reused as a bring your own custom IP as it can cause management conflicts.
You must ensure the AKS cluster identity (Service Principal or Managed Identity) has permissions to access
the outbound IP. As per the required public IP permissions list.
Make sure you meet the pre-requisites and constraints necessary to configure Outbound IPs or Outbound IP
prefixes.
Update the cluster with your own outbound public IP
Use the az network public-ip show command to list the IDs of your public IPs.
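The command itself isn't included above; a sketch, assuming a public IP named myPublicIP in myResourceGroup:
az network public-ip show --resource-group myResourceGroup --name myPublicIP --query id -o tsv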
The above command shows the ID for the myPublicIP public IP in the myResourceGroup resource group.
Use the az aks update command with the load-balancer-outbound-ips parameter to update your cluster with
your public IPs.
The following example uses the load-balancer-outbound-ips parameter with the IDs from the previous
command.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-outbound-ips <publicIpId1>,<publicIpId2>
az network public-ip prefix show --resource-group myResourceGroup --name myPublicIPPrefix --query id -o tsv
The above command shows the ID for the myPublicIPPrefix public IP prefix in the myResourceGroup resource
group.
The following example uses the load-balancer-outbound-ip-prefixes parameter with the IDs from the previous
command.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-outbound-ip-prefixes <publicIpPrefixId1>,<publicIpPrefixId2>
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-outbound-ips <publicIpId1>,<publicIpId2>
Use the az aks create command with the load-balancer-outbound-ip-prefixes parameter to create a new cluster
with your public IP prefixes at the start.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-outbound-ip-prefixes <publicIpPrefixId1>,<publicIpPrefixId2>
By default, AKS sets AllocatedOutboundPorts on its load balancer to 0 , which enables automatic outbound port
assignment based on backend pool size when creating a cluster. For example, if a cluster has 50 or fewer nodes,
1024 ports are allocated to each node. As the number of nodes in the cluster is increased, fewer ports will be
available per node. To show the AllocatedOutboundPorts value for the AKS cluster load balancer, use
az network lb outbound-rule list . For example:
NODE_RG=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
az network lb outbound-rule list --resource-group $NODE_RG --lb-name kubernetes -o table
The following example output shows that automatic outbound port assignment based on backend pool size is
enabled for the cluster:
To configure a specific value for AllocatedOutboundPorts and outbound IP addresses when creating or updating a cluster, use load-balancer-outbound-ports and either load-balancer-managed-outbound-ip-count , load-balancer-outbound-ips , or load-balancer-outbound-ip-prefixes . Before setting a specific value or increasing an existing value for either outbound ports or outbound IP addresses, you must calculate the appropriate number of outbound ports and IP addresses. Use the following equation for this calculation, rounded to the nearest integer:
64,000 ports per IP / <outbound ports per node> * <number of outbound IPs> = <maximum number of nodes in the cluster>
When calculating the number of outbound ports and IPs and setting the values, remember:
The number of outbound ports is fixed per node based on the value you set.
The value for outbound ports must be a multiple of 8.
Adding more IPs does not add more ports to any node. It provides capacity for more nodes in the cluster.
You must account for nodes that may be added as part of upgrades, including the count of nodes specified
via maxSurge values.
The following examples show how the number of outbound ports and IP addresses are affected by the values
you set:
If the default values are used and the cluster has 48 nodes, each node will have 1024 ports available.
If the default values are used and the cluster scales from 48 to 52 nodes, each node will be updated from
1024 ports available to 512 ports available.
If outbound ports is set to 1,000 and outbound IP count is set to 2, then the cluster can support a maximum
of 128 nodes: 64,000 ports per IP / 1,000 ports per node * 2 IPs = 128 nodes .
If outbound ports is set to 1,000 and outbound IP count is set to 7, then the cluster can support a maximum
of 448 nodes: 64,000 ports per IP / 1,000 ports per node * 7 IPs = 448 nodes .
If outbound ports is set to 4,000 and outbound IP count is set to 2, then the cluster can support a maximum
of 32 nodes: 64,000 ports per IP / 4,000 ports per node * 2 IPs = 32 nodes .
If outbound ports is set to 4,000 and outbound IP count is set to 7, then the cluster can support a maximum
of 112 nodes: 64,000 ports per IP / 4,000 ports per node * 7 IPs = 112 nodes .
IMPORTANT
After calculating the number of outbound ports and IPs, verify that you have additional outbound port capacity to handle node surge during upgrades. It's critical to allocate sufficient excess ports for the additional nodes needed for upgrades and other operations. AKS defaults to one buffer node for upgrade operations. If you use maxSurge values, multiply the outbound ports per node by your maxSurge value to determine the number of ports required. For example, if you calculated that you need 4,000 ports per node with 7 IP addresses on a cluster with a maximum of 100 nodes and a max surge of 2:
2 surge nodes * 4000 ports per node = 8000 ports needed for node surge during upgrades.
100 nodes * 4000 ports per node = 400,000 ports required for your cluster.
7 IPs * 64000 ports per IP = 448,000 ports available for your cluster.
The above example shows the cluster has an excess capacity of 48,000 ports, which is sufficient to handle the 8000 ports
needed for node surge during upgrades.
Once the values have been calculated and verified, you can apply those values using
load-balancer-outbound-ports and either load-balancer-managed-outbound-ip-count , load-balancer-outbound-ips ,
or load-balancer-outbound-ip-prefixes when creating or updating a cluster. For example:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-managed-outbound-ip-count 7 \
--load-balancer-outbound-ports 4000
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-idle-timeout 4
If you expect numerous short-lived connections, and no long-lived connections with long idle periods (for example, when leveraging kubectl proxy or kubectl port-forward), consider using a low timeout value such as 4 minutes. Also, when using TCP keepalives, it's sufficient to enable them on one side of the connection. For example, enabling them on the server side only is enough to reset the idle timer of the flow; it's not necessary for both sides to send TCP keepalives. Similar concepts exist at the application layer, including database client-server configurations. Check the server side for what options exist for application-specific keepalives.
IMPORTANT
AKS enables TCP Reset on idle by default. We recommend that you keep this configuration on and leverage it for more predictable application behavior in your scenarios. TCP RST is only sent when a TCP connection is in the ESTABLISHED state. Read more about it here.
When setting IdleTimeoutInMinutes to a different value than the default of 30 minutes, consider how long your
workloads will need an outbound connection. Also consider the default timeout value for a Standard SKU load
balancer used outside of AKS is 4 minutes. An IdleTimeoutInMinutes value that more accurately reflects your
specific AKS workload can help decrease SNAT exhaustion caused by tying up connections no longer being used.
WARNING
Altering the values for AllocatedOutboundPorts and IdleTimeoutInMinutes can significantly change the behavior of the outbound rule for your load balancer and shouldn't be done lightly. Before updating these values, check the SNAT troubleshooting section below and review the Load Balancer outbound rules and outbound connections in Azure so you understand the tradeoffs, your application's connection patterns, and the full impact of your changes.
apiVersion: v1
kind: Service
metadata:
  name: azure-vote-front
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: azure-vote-front
  loadBalancerSourceRanges:
  - MY_EXTERNAL_IP_RANGE
This example updates the rule to allow inbound external traffic only from the MY_EXTERNAL_IP_RANGE range. If you
replace MY_EXTERNAL_IP_RANGE with the internal subnet IP address, traffic is restricted to only cluster internal IPs.
If traffic is restricted to cluster internal IPs, clients outside your Kubernetes cluster won't be able to access the
load balancer.
NOTE
Inbound, external traffic flows from the load balancer to the virtual network for your AKS cluster. The virtual network has a
Network Security Group (NSG) which allows all inbound traffic from the load balancer. This NSG uses a service tag of type
LoadBalancer to allow traffic from the load balancer.
service.beta.kubernetes.io/azure-load-balancer-internal-subnet
Value: Name of the subnet
Description: Specify which subnet the internal load balancer should be bound to. It defaults to the subnet configured in the cloud config file if not set.
service.beta.kubernetes.io/azure-dns-label-name
Value: Name of the DNS label on Public IPs
Description: Specify the DNS label name for the public service. If it is set to an empty string, the DNS entry in the Public IP will not be used.
service.beta.kubernetes.io/azure-load-balancer-resource-group
Value: Name of the resource group
Description: Specify the resource group of load balancer public IPs that aren't in the same resource group as the cluster infrastructure (node resource group).
service.beta.kubernetes.io/azure-allowed-service-tags
Value: List of allowed service tags
Description: Specify a list of allowed service tags separated by commas.
service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout
Value: TCP idle timeouts in minutes
Description: Specify the time, in minutes, for TCP connection idle timeouts to occur on the load balancer. The default and minimum value is 4. The maximum value is 30. Must be an integer.
Troubleshooting SNAT
If you know that you're starting many outbound TCP or UDP connections to the same destination IP address and
port, and you observe failing outbound connections or are advised by support that you're exhausting SNAT
ports (preallocated ephemeral ports used by PAT), you have several general mitigation options. Review these
options and decide what is available and best for your scenario. It's possible that one or more can help manage
this scenario. For detailed information, review the Outbound Connections Troubleshooting Guide.
Frequently, the root cause of SNAT exhaustion is an anti-pattern in how outbound connectivity is established or managed, or configurable timers changed from their default values. Review this section carefully.
Steps
1. Check if your connections remain idle for a long time and rely on the default idle timeout to release that port. If so, the default timeout of 30 minutes might need to be reduced for your scenario.
2. Investigate how your application is creating outbound connectivity (for example, code review or packet
capture).
3. Determine if this activity is expected behavior or whether the application is misbehaving. Use metrics and
logs in Azure Monitor to substantiate your findings. Use "Failed" category for SNAT Connections metric for
example.
4. Evaluate if appropriate patterns are followed.
5. Evaluate if SNAT port exhaustion should be mitigated with additional Outbound IP addresses + additional
Allocated Outbound Ports .
Design patterns
Always take advantage of connection reuse and connection pooling whenever possible. These patterns will
avoid resource exhaustion problems and result in predictable behavior. Primitives for these patterns can be
found in many development libraries and frameworks.
Atomic requests (one request per connection) are generally not a good design choice. Such anti-pattern
limits scale, reduces performance, and decreases reliability. Instead, reuse HTTP/S connections to reduce the
numbers of connections and associated SNAT ports. The application scale will increase and performance
improve because of reduced handshakes, overhead, and cryptographic operation cost when using TLS.
If you're using out-of-cluster or custom DNS, or custom upstream servers on coreDNS, keep in mind that DNS can introduce many individual flows at volume when the client isn't caching the DNS resolver's result. Make sure to customize coreDNS first instead of using custom DNS servers, and define a good caching value.
UDP flows (for example DNS lookups) allocate SNAT ports for the duration of the idle timeout. The longer
the idle timeout, the higher the pressure on SNAT ports. Use short idle timeout (for example 4 minutes). Use
connection pools to shape your connection volume.
Never silently abandon a TCP flow and rely on TCP timers to clean up flow. If you don't let TCP explicitly close
the connection, state remains allocated at intermediate systems and endpoints and makes SNAT ports
unavailable for other connections. This pattern can trigger application failures and SNAT exhaustion.
Don't change OS-level TCP close related timer values without expert knowledge of impact. While the TCP
stack will recover, your application performance can be negatively affected when the endpoints of a
connection have mismatched expectations. Wanting to change timers is usually a sign of an underlying design problem. Review the following recommendations.
Moving from a basic SKU load balancer to standard SKU
If you have an existing cluster with the Basic SKU Load Balancer, there are important behavioral differences to
note when migrating to use a cluster with the Standard SKU Load Balancer.
For example, making blue/green deployments to migrate clusters is a common practice given the
load-balancer-sku type of a cluster can only be defined at cluster create time. However, Basic SKU Load
Balancers use Basic SKU IP Addresses, which aren't compatible with Standard SKU Load Balancers as they
require Standard SKU IP Addresses. When migrating clusters to upgrade Load Balancer SKUs, a new IP address
with a compatible IP Address SKU will be required.
For more considerations on how to migrate clusters, visit our documentation on migration considerations to
view a list of important topics to consider when migrating. The below limitations are also important behavioral
differences to note when using Standard SKU Load Balancers in AKS.
Limitations
The following limitations apply when you create and manage AKS clusters that support a load balancer with the
Standard SKU:
At least one public IP or IP prefix is required for allowing egress traffic from the AKS cluster. The public IP or
IP prefix is also required to maintain connectivity between the control plane and agent nodes and to maintain
compatibility with previous versions of AKS. You have the following options for specifying public IPs or IP
prefixes with a Standard SKU load balancer:
Provide your own public IPs.
Provide your own public IP prefixes.
Specify a number up to 100 to allow the AKS cluster to create that many Standard SKU public IPs in
the same resource group created as the AKS cluster, which is usually named with MC_ at the
beginning. AKS assigns the public IP to the Standard SKU load balancer. By default, one public IP will
automatically be created in the same resource group as the AKS cluster, if no public IP, public IP prefix,
or number of IPs is specified. You also must allow public addresses and avoid creating any Azure
Policy that bans IP creation.
A public IP created by AKS cannot be reused as a custom bring your own public IP address. All custom IP
addresses must be created and managed by the user.
Defining the load balancer SKU can only be done when you create an AKS cluster. You can't change the load
balancer SKU after an AKS cluster has been created.
You can only use one type of load balancer SKU (Basic or Standard) in a single cluster.
Standard SKU Load Balancers only support Standard SKU IP Addresses.
Next steps
Learn more about Kubernetes services at the Kubernetes services documentation.
Learn more about using Internal Load Balancer for Inbound traffic at the AKS Internal Load Balancer
documentation.
Use a static public IP address and DNS label with
the Azure Kubernetes Service (AKS) load balancer
6/15/2022 • 4 minutes to read • Edit Online
By default, the public IP address assigned to a load balancer resource created by an AKS cluster is only valid for
the lifespan of that resource. If you delete the Kubernetes service, the associated load balancer and IP address
are also deleted. If you want to assign a specific IP address or retain an IP address for redeployed Kubernetes
services, you can create and use a static public IP address.
This article shows you how to create a static public IP address and assign it to your Kubernetes service.
NOTE
If you are using a Basic SKU load balancer in your AKS cluster, use Basic for the sku parameter when defining a public IP.
Only Basic SKU IPs work with the Basic SKU load balancer and only Standard SKU IPs work with Standard SKU load
balancers.
{
  "publicIp": {
    ...
    "ipAddress": "40.121.183.52",
    ...
  }
}
You can later get the public IP address using the az network public-ip show command. Specify the name of the node resource group and the public IP address you created, and query for the ipAddress, as shown in the following example:
$ az network public-ip show --resource-group myResourceGroup --name myAKSPublicIP --query ipAddress --output tsv
40.121.183.52
IMPORTANT
If you customized your outbound IP make sure your cluster identity has permissions to both the outbound public IP and
this inbound public IP.
To create a LoadBalancer service with the static public IP address, add the loadBalancerIP property and the
value of the static public IP address to the YAML manifest. Create a file named load-balancer-service.yaml and
copy in the following YAML. Provide your own public IP address created in the previous step. The following
example also sets the annotation to the resource group named myResourceGroup. Provide your own resource
group name.
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-resource-group: myResourceGroup
  name: azure-load-balancer
spec:
  loadBalancerIP: 40.121.183.52
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: azure-load-balancer
Create the service and deployment with the kubectl apply command.
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/azure-dns-label-name: myserviceuniquelabel
  name: azure-load-balancer
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: azure-load-balancer
NOTE
To publish the service on your own domain, see Azure DNS and the external-dns project.
Troubleshoot
If the static IP address defined in the loadBalancerIP property of the Kubernetes service manifest doesn't exist, or hasn't been created in the node resource group and no additional delegations are configured, the load balancer service creation fails. To troubleshoot, review the service creation events with the kubectl describe command.
Provide the name of the service as specified in the YAML manifest, as shown in the following example:
Information about the Kubernetes service resource is displayed. The Events at the end of the following example
output indicate that the user supplied IP Address was not found. In these scenarios, verify that you have created
the static public IP address in the node resource group and that the IP address specified in the Kubernetes
service manifest is correct.
Name: azure-load-balancer
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=azure-load-balancer
Type: LoadBalancer
IP: 10.0.18.125
IP: 40.121.183.52
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 32582/TCP
Endpoints: <none>
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreatingLoadBalancer 7s (x2 over 22s) service-controller Creating load balancer
Warning CreatingLoadBalancerFailed 6s (x2 over 12s) service-controller Error creating load balancer
(will retry): Failed to create load balancer for service default/azure-load-balancer: user supplied IP
Address 40.121.183.52 was not found
Next steps
For additional control over the network traffic to your applications, you may want to instead create an ingress
controller. You can also create an ingress controller with a static public IP address.
HTTP proxy support in Azure Kubernetes Service
6/15/2022 • 3 minutes to read • Edit Online
Azure Kubernetes Service (AKS) clusters, whether deployed into a managed or custom virtual network, have
certain outbound dependencies necessary to function properly. Previously, in environments requiring internet
access to be routed through HTTP proxies, this was a problem. Nodes had no way of bootstrapping the
configuration, environment variables, and certificates necessary to access internet services.
This feature adds HTTP proxy support to AKS clusters, exposing a straightforward interface that cluster
operators can use to secure AKS-required network traffic in proxy-dependent environments.
Some more complex solutions may require creating a chain of trust to establish secure communications across
the network. The feature also enables installation of a trusted certificate authority onto the nodes as part of
bootstrapping a cluster.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Latest version of Azure CLI installed.
{
  "httpProxy": "string",
  "httpsProxy": "string",
  "noProxy": [
    "string"
  ],
  "trustedCa": "string"
}
httpProxy : A proxy URL to use for creating HTTP connections outside the cluster. The URL scheme must be http.
httpsProxy : A proxy URL to use for creating HTTPS connections outside the cluster. If this isn't specified, httpProxy is used for both HTTP and HTTPS connections.
noProxy : A list of destination domain names, domains, IP addresses, or other network CIDRs to exclude from proxying.
trustedCa : A string containing the base64-encoded alternative CA certificate content. Currently only the PEM format is supported. For compatibility with the Go-based components that are part of the Kubernetes system, the certificate must support Subject Alternative Names (SANs) instead of the deprecated Common Name certs.
Example input: Note the CA cert should be the base64 encoded string of the PEM format cert content.
{
  "httpProxy": "http://myproxy.server.com:8080/",
  "httpsProxy": "https://myproxy.server.com:8080/",
  "noProxy": [
    "localhost",
    "127.0.0.1"
  ],
  "trustedCA": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUgvVENDQmVXZ0F3SUJB...b3Rpbk15RGszaWFyCkYxMFlscWNPbWVYMXVGbUtiZGkvWG9yR2xrQ29NRjNURHg4cm1wOURCaUIvCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0="
}
Create a file and provide values for httpProxy, httpsProxy, and noProxy. If your environment requires it, also
provide a trustedCa value. Next, deploy a cluster, passing in your filename via the http-proxy-config flag.
Your cluster will initialize with the HTTP proxy configured on the nodes.
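For example, a minimal sketch of the create step, assuming a configuration file named aks-proxy-config.json (the filename here is illustrative):
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--http-proxy-config aks-proxy-config.json \
--generate-ssh-keys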
"properties": {
...,
"httpProxyConfig": {
"httpProxy": "string",
"httpsProxy": "string",
"noProxy": [
"string"
],
"trustedCa": "string"
}
}
In your template, provide values for httpProxy, httpsProxy, and noProxy. If necessary, also provide a value for trustedCa. Deploy the template, and your cluster should initialize with your HTTP proxy configured on the nodes.
Handling CA rollover
Values for httpProxy, httpsProxy, and noProxy cannot be changed after cluster creation. However, to support
rolling CA certs, the value for trustedCa can be changed and applied to the cluster with the az aks update
command.
For example, assuming a new file has been created with the base64 encoded string of the new CA cert called
aks-proxy-config-2.json, the following action will update the cluster:
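A sketch of that update, using the filename mentioned above:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--http-proxy-config aks-proxy-config-2.json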
Next steps
For more on the network requirements of AKS clusters, see control egress traffic for cluster nodes in AKS.
Create an ingress controller in Azure Kubernetes
Service (AKS)
6/15/2022 • 12 minutes to read • Edit Online
An ingress controller is a piece of software that provides reverse proxy, configurable traffic routing, and TLS
termination for Kubernetes services. Kubernetes ingress resources are used to configure the ingress rules and
routes for individual Kubernetes services. When you use an ingress controller and ingress rules, a single IP
address can be used to route traffic to multiple services in a Kubernetes cluster.
This article shows you how to deploy the NGINX ingress controller in an Azure Kubernetes Service (AKS) cluster.
Two applications are then run in the AKS cluster, each of which is accessible over the single IP address.
NOTE
There are two open source ingress controllers for Kubernetes based on Nginx: one is maintained by the Kubernetes
community (kubernetes/ingress-nginx), and one is maintained by NGINX, Inc. (nginxinc/kubernetes-ingress). This article
will be using the Kubernetes community ingress controller.
This article also requires that you're running the Azure CLI version 2.0.64 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
In addition, this article assumes you have an existing AKS cluster with an integrated Azure Container Registry
(ACR). For more information on creating an AKS cluster with an integrated ACR, see Authenticate with Azure
Container Registry from Azure Kubernetes Service.
Basic configuration
To create a basic NGINX ingress controller without customizing the defaults, you'll use Helm.
Azure CLI
Azure PowerShell
NAMESPACE=ingress-basic
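The Helm commands for this step aren't included above; a minimal sketch, assuming the Kubernetes community ingress-nginx chart and the namespace variable defined above:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
--create-namespace \
--namespace $NAMESPACE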
The above command uses the default configuration for simplicity. You can add parameters to customize the deployment, for example --set controller.replicaCount=3 . The next section shows a highly customized example of the ingress controller.
Customized configuration
As an alternative to the basic configuration presented in the above section, the next set of steps will show how to
deploy a customized ingress controller. You'll have the option of using an internal static IP address, or using a
dynamic public IP address.
Import the images used by the Helm chart into your ACR
Azure CLI
Azure PowerShell
To control image versions, you'll want to import them into your own Azure Container Registry. The NGINX
ingress controller Helm chart relies on three container images. Use az acr import to import those images into
your ACR.
REGISTRY_NAME=<REGISTRY_NAME>
SOURCE_REGISTRY=k8s.gcr.io
CONTROLLER_IMAGE=ingress-nginx/controller
CONTROLLER_TAG=v1.2.1
PATCH_IMAGE=ingress-nginx/kube-webhook-certgen
PATCH_TAG=v1.1.1
DEFAULTBACKEND_IMAGE=defaultbackend-amd64
DEFAULTBACKEND_TAG=1.5
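The import commands themselves aren't shown above; a sketch using the variables defined above:
az acr import --name $REGISTRY_NAME --source $SOURCE_REGISTRY/$CONTROLLER_IMAGE:$CONTROLLER_TAG --image $CONTROLLER_IMAGE:$CONTROLLER_TAG
az acr import --name $REGISTRY_NAME --source $SOURCE_REGISTRY/$PATCH_IMAGE:$PATCH_TAG --image $PATCH_IMAGE:$PATCH_TAG
az acr import --name $REGISTRY_NAME --source $SOURCE_REGISTRY/$DEFAULTBACKEND_IMAGE:$DEFAULTBACKEND_TAG --image $DEFAULTBACKEND_IMAGE:$DEFAULTBACKEND_TAG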
NOTE
In addition to importing container images into your ACR, you can also import Helm charts into your ACR. For more
information, see Push and pull Helm charts to an Azure Container Registry.
This example assigns 10.224.0.42 to the loadBalancerIP resource. Provide your own internal IP address for use
with the ingress controller. Make sure that this IP address isn't already in use within your virtual network. Also, if
you're using an existing virtual network and subnet, you must configure your AKS cluster with the correct
permissions to manage the virtual network and subnet. For more information, see Use kubenet networking with
your own IP address ranges in Azure Kubernetes Service (AKS) or Configure Azure CNI networking in Azure
Kubernetes Service (AKS).
When you deploy the nginx-ingress chart with Helm, add the -f internal-ingress.yaml parameter.
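The contents of internal-ingress.yaml aren't included above; a minimal sketch of Helm chart values that assign the internal IP discussed here (value names assume the ingress-nginx chart):
controller:
  service:
    loadBalancerIP: 10.224.0.42
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"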
Azure CLI
Azure PowerShell
NOTE
If you would like to enable client source IP preservation for requests to containers in your cluster, add
--set controller.service.externalTrafficPolicy=Local to the Helm install command. The client source IP is stored
in the request header under X-Forwarded-For. When you're using an ingress controller with client source IP preservation
enabled, TLS pass-through won't work.
Azure CLI
Azure PowerShell
When the Kubernetes load balancer service is created for the NGINX ingress controller, an IP address is assigned
under EXTERNAL-IP, as shown in the following example output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
AGE SELECTOR
nginx-ingress-ingress-nginx-controller LoadBalancer 10.0.74.133 EXTERNAL_IP
80:32486/TCP,443:30953/TCP 44s app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-
ingress,app.kubernetes.io/name=ingress-nginx
No ingress rules have been created yet, so the NGINX ingress controller's default 404 page is displayed if you
browse to the external IP address. Ingress rules are configured in the following steps.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld-one
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld-one
  template:
    metadata:
      labels:
        app: aks-helloworld-one
    spec:
      containers:
      - name: aks-helloworld-one
        image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
        ports:
        - containerPort: 80
        env:
        - name: TITLE
          value: "Welcome to Azure Kubernetes Service (AKS)"
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld-one
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: aks-helloworld-one
Now add the /hello-world-two path to the IP address, such as EXTERNAL_IP/hello-world-two. The second demo
application with the custom title is displayed:
Now access the address of your Kubernetes ingress controller using curl , such as http://10.224.0.42. Provide
your own internal IP address specified when you deployed the ingress controller.
curl -L http://10.224.0.42
No path was provided with the address, so the ingress controller defaults to the / route. The first demo
application is returned, as shown in the following condensed example output:
$ curl -L http://10.224.0.42
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="/static/default.css">
<title>Welcome to Azure Kubernetes Service (AKS)</title>
[...]
Now add /hello-world-two path to the address, such as http://10.224.0.42/hello-world-two. The second demo
application with the custom title is returned, as shown in the following condensed example output:
$ curl -L -k http://10.224.0.42/hello-world-two
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="/static/default.css">
<title>AKS Ingress Demo</title>
[...]
Clean up resources
This article used Helm to install the ingress components and sample apps. When you deploy a Helm chart, many
Kubernetes resources are created. These resources include pods, deployments, and services. To clean up these
resources, you can either delete the entire sample namespace, or the individual resources.
Delete the sample namespace and all resources
To delete the entire sample namespace, use the kubectl delete command and specify your namespace name.
All the resources in the namespace are deleted.
Uninstall the releases with the helm uninstall command. The following example uninstalls the NGINX ingress
deployment.
Remove the ingress route that directed traffic to the sample apps:
Finally, you can delete the namespace itself. Use the kubectl delete command and specify your namespace name:
Next steps
To configure TLS with your existing ingress components, see Use TLS with an ingress controller.
To configure your AKS cluster to use HTTP application routing, see Enable the HTTP application routing add-on.
This article included some external components to AKS. To learn more about these components, see the
following project pages:
Helm CLI
NGINX ingress controller
Use TLS with an ingress controller on Azure
Kubernetes Service (AKS)
6/15/2022 • 13 minutes to read • Edit Online
Transport layer security (TLS) is a protocol for providing security in communication, such as encryption,
authentication, and integrity, by using certificates. Using TLS with an ingress controller on AKS allows you to
secure communication between your applications, while also having the benefits of an ingress controller.
You can bring your own certificates and integrate them with the Secrets Store CSI driver. Alternatively, you can
also use cert-manager, which is used to automatically generate and configure Let's Encrypt certificates. Finally,
two applications are run in the AKS cluster, each of which is accessible over a single IP address.
NOTE
There are two open source ingress controllers for Kubernetes based on Nginx: one is maintained by the Kubernetes
community (kubernetes/ingress-nginx), and one is maintained by NGINX, Inc. (nginxinc/kubernetes-ingress). This article
will be using the Kubernetes community ingress controller.
In addition, this article assumes you have an existing AKS cluster with an integrated Azure Container Registry
(ACR). For more information on creating an AKS cluster with an integrated ACR, see Authenticate with Azure
Container Registry from Azure Kubernetes Service.
This article also requires that you're running the Azure CLI version 2.0.64 or later. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Use TLS with your own certificates with Secrets Store CSI Driver
To use TLS with your own certificates with Secrets Store CSI Driver, you'll need an AKS cluster with the Secrets
Store CSI Driver configured, and an Azure Key Vault instance. For more information, see Set up Secrets Store CSI
Driver to enable NGINX Ingress Controller with TLS.
REGISTRY_NAME=<REGISTRY_NAME>
CERT_MANAGER_REGISTRY=quay.io
CERT_MANAGER_TAG=v1.8.0
CERT_MANAGER_IMAGE_CONTROLLER=jetstack/cert-manager-controller
CERT_MANAGER_IMAGE_WEBHOOK=jetstack/cert-manager-webhook
CERT_MANAGER_IMAGE_CAINJECTOR=jetstack/cert-manager-cainjector
NOTE
In addition to importing container images into your ACR, you can also import Helm charts into your ACR. For more
information, see Push and pull Helm charts to an Azure Container Registry.
First get the resource group name of the AKS cluster with the az aks show command:
Next, create a public IP address with the static allocation method using the az network public-ip create
command. The following example creates a public IP address named myAKSPublicIP in the AKS cluster resource
group obtained in the previous step:
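The commands for these two steps aren't shown above; a sketch, assuming the cluster and public IP names used elsewhere in this article:
RG_NODE=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
az network public-ip create \
--resource-group $RG_NODE \
--name myAKSPublicIP \
--sku Standard \
--allocation-method static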
Alternatively, you can create an IP address in a different resource group, which can be managed separately from
your AKS cluster. If you create an IP address in a different resource group, ensure the following are true:
The cluster identity used by the AKS cluster has delegated permissions to the resource group, such as
Network Contributor.
Add the
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-resource-group"="
<RESOURCE_GROUP>"
parameter. Replace <RESOURCE_GROUP> with the name of the resource group where the IP address resides.
When you update the ingress controller, you must pass a parameter to the Helm release so the ingress
controller is made aware of the static IP address of the load balancer to be allocated to the ingress controller
service. For the HTTPS certificates to work correctly, a DNS name label is used to configure an FQDN for the
ingress controller IP address.
1. Add the --set controller.service.loadBalancerIP="<EXTERNAL_IP>" parameter. Specify your own public IP
address that was created in the previous step.
2. Add the
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"="<DNS_LABEL>"
parameter. The DNS label can be set either when the ingress controller is first deployed, or it can be
configured later.
Azure CLI
Azure PowerShell
DNS_LABEL="demo-aks-ingress"
NAMESPACE="ingress-basic"
STATIC_IP=<STATIC_IP>
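The Helm command that applies these settings isn't included above; a sketch combining the two parameters described, using the variables just defined (the chart and release names assume the earlier ingress-nginx deployment):
helm upgrade --install nginx-ingress ingress-nginx/ingress-nginx \
--namespace $NAMESPACE \
--set controller.service.loadBalancerIP=$STATIC_IP \
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"=$DNS_LABEL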
For more information, see Use a static public IP address and DNS label with the AKS load balancer.
The example output shows the details about the ingress controller:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
AGE SELECTOR
nginx-ingress-ingress-nginx-controller LoadBalancer 10.0.74.133 EXTERNAL_IP
80:32486/TCP,443:30953/TCP 44s app.kubernetes.io/component=controller,app.kubernetes.io/instance=nginx-
ingress,app.kubernetes.io/name=ingress-nginx
If you're using a custom domain, you'll need to add an A record to your DNS zone. Otherwise, you'll need to
configure the public IP address with a fully qualified domain name (FQDN).
Add an A record to your DNS zone
Azure CLI
Azure PowerShell
Add an A record to your DNS zone with the external IP address of the NGINX service using az network dns
record-set a add-record.
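A sketch of that command, where the zone name, record name, and IP address are placeholders you replace with your own values:
az network dns record-set a add-record \
--resource-group myResourceGroup \
--zone-name MY_CUSTOM_DOMAIN \
--record-set-name "*" \
--ipv4-address MY_EXTERNAL_IP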
Azure CLI
Azure PowerShell
Azure CLI
Azure PowerShell
DNS_LABEL="demo-aks-ingress"
NAMESPACE="ingress-basic"
Install cert-manager
The NGINX ingress controller supports TLS termination. There are several ways to retrieve and configure
certificates for HTTPS. This article demonstrates using cert-manager, which provides automatic Lets Encrypt
certificate generation and management functionality.
To install the cert-manager controller:
Azure CLI
Azure PowerShell
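The install commands aren't included above; a minimal sketch using the public jetstack chart and the version variable defined earlier (the article's original steps pull the cert-manager images from your ACR instead):
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
--namespace ingress-basic \
--version $CERT_MANAGER_TAG \
--set installCRDs=true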
NOTE
If you configured an FQDN for the ingress controller IP address instead of a custom domain, use the FQDN instead of
hello-world-ingress.MY_CUSTOM_DOMAIN. For example if your FQDN is demo-aks-ingress.eastus.cloudapp.azure.com,
replace hello-world-ingress.MY_CUSTOM_DOMAIN with demo-aks-ingress.eastus.cloudapp.azure.com in
hello-world-ingress.yaml .
Create or update the hello-world-ingress.yaml file using the following example YAML. Update the spec.tls.hosts and spec.rules.host to the DNS name you created in a previous step.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/use-regex: "true"
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - hello-world-ingress.MY_CUSTOM_DOMAIN
    secretName: tls-secret
  rules:
  - host: hello-world-ingress.MY_CUSTOM_DOMAIN
    http:
      paths:
      - path: /hello-world-one(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: aks-helloworld-one
            port:
              number: 80
      - path: /hello-world-two(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: aks-helloworld-two
            port:
              number: 80
      - path: /(.*)
        pathType: Prefix
        backend:
          service:
            name: aks-helloworld-one
            port:
              number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world-ingress-static
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/rewrite-target: /static/$2
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - hello-world-ingress.MY_CUSTOM_DOMAIN
    secretName: tls-secret
  rules:
  - host: hello-world-ingress.MY_CUSTOM_DOMAIN
    http:
      paths:
      - path: /static(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: aks-helloworld-one
            port:
              number: 80
Clean up resources
This article used Helm to install the ingress components, certificates, and sample apps. When you deploy a Helm
chart, many Kubernetes resources are created. These resources include pods, deployments, and services. To
clean up these resources, you can either delete the entire sample namespace, or the individual resources.
Delete the sample namespace and all resources
To delete the entire sample namespace, use the kubectl delete command and specify your namespace name.
All the resources in the namespace are deleted.
List the Helm releases with the helm list command. Look for charts named nginx and cert-manager, as shown
in the following example output:
$ helm list --namespace ingress-basic
Uninstall the releases with the helm uninstall command. The following example uninstalls the NGINX ingress
and cert-manager deployments.
Remove the ingress route that directed traffic to the sample apps:
Finally, you can delete the namespace itself. Use the kubectl delete command and specify your namespace name:
Next steps
This article included some external components to AKS. To learn more about these components, see the
following project pages:
Helm CLI
NGINX ingress controller
cert-manager
You can also:
Enable the HTTP application routing add-on
HTTP application routing
6/15/2022 • 7 minutes to read • Edit Online
The HTTP application routing solution makes it easy to access applications that are deployed to your Azure
Kubernetes Service (AKS) cluster. When the solution's enabled, it configures an Ingress controller in your AKS
cluster. As applications are deployed, the solution also creates publicly accessible DNS names for application
endpoints.
When the add-on is enabled, it creates a DNS Zone in your subscription. For more information about DNS cost,
see DNS pricing.
Caution
The HTTP application routing add-on is designed to let you quickly create an ingress controller and access your
applications. This add-on is not currently designed for use in a production environment and is not
recommended for production use. For production-ready ingress deployments that include multiple replicas and
TLS support, see Create an HTTPS ingress controller.
Limitations
HTTP application routing doesn't currently work with AKS versions 1.22.6+
TIP
If you want to enable multiple add-ons, provide them as a comma-separated list. For example, to enable HTTP application
routing and monitoring, use the format --enable-addons http_application_routing,monitoring .
You can also enable HTTP routing on an existing AKS cluster using the az aks enable-addons command. To
enable HTTP routing on an existing cluster, add the --addons parameter and specify http_application_routing as
shown in the following example:
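A sketch of that command, assuming the resource group and cluster names used elsewhere in this article:
az aks enable-addons --resource-group myResourceGroup --name myAKSCluster --addons http_application_routing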
This name is needed to deploy applications to the AKS cluster and is shown in the following example output:
9f9c1fe7-21a1-416d-99cd-3543bb92e4c3.eastus.aksapp.io
After the cluster is deployed, browse to the auto-created AKS resource group and select the DNS zone. Take note
of the DNS zone name. This name is needed to deploy applications to the AKS cluster.
Connect to your AKS cluster
To connect to the Kubernetes cluster from your local computer, you use kubectl, the Kubernetes command-line
client.
If you use the Azure Cloud Shell, kubectl is already installed. You can also install it locally using the az aks
install-cli command:
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. The
following example gets credentials for the AKS cluster named MyAKSCluster in the MyResourceGroup:
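The command isn't shown above; based on the description, it would be:
az aks get-credentials --resource-group MyResourceGroup --name MyAKSCluster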
annotations:
  kubernetes.io/ingress.class: addon-http-application-routing
Create a file named samples-http-application-routing.yaml and copy in the following YAML. On line 43,
update <CLUSTER_SPECIFIC_DNS_ZONE> with the DNS zone name collected in the previous step of this article.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld
  template:
    metadata:
      labels:
        app: aks-helloworld
    spec:
      containers:
      - name: aks-helloworld
        image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
        ports:
        - containerPort: 80
        env:
        - name: TITLE
          value: "Welcome to Azure Kubernetes Service (AKS)"
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: aks-helloworld
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aks-helloworld
  annotations:
    kubernetes.io/ingress.class: addon-http-application-routing
spec:
  rules:
  - host: aks-helloworld.<CLUSTER_SPECIFIC_DNS_ZONE>
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: aks-helloworld
            port:
              number: 80
deployment.apps/aks-helloworld created
service/aks-helloworld created
ingress.networking.k8s.io/aks-helloworld created
When the HTTP application routing add-on is disabled, some Kubernetes resources may remain in the cluster.
These resources include configMaps and secrets, and are created in the kube-system namespace. To maintain a
clean cluster, you may want to remove these resources.
Look for addon-http-application-routing resources using the following kubectl get commands:
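For example (the add-on's resources are created in the kube-system namespace):

kubectl get deployments --namespace kube-system
kubectl get services --namespace kube-system
kubectl get configmaps --namespace kube-system
kubectl get secrets --namespace kube-system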
To delete resources, use the kubectl delete command. Specify the resource type, resource name, and namespace.
The following example deletes one of the previous configmaps:
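A sketch, assuming a configMap named addon-http-application-routing-nginx-configuration; substitute the name returned by the previous kubectl get command:

kubectl delete configmaps addon-http-application-routing-nginx-configuration --namespace kube-system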
Repeat the previous kubectl delete step for all addon-http-application-routing resources that remained in your
cluster.
Troubleshoot
Use the kubectl logs command to view the application logs for the External-DNS application. The logs should
confirm that an A and TXT DNS record were created successfully.
$ kubectl logs -f deploy/addon-http-application-routing-external-dns -n kube-system
These records can also be seen on the DNS zone resource in the Azure portal.
Use the kubectl logs command to view the application logs for the Nginx Ingress controller. The logs should
confirm the CREATE of an Ingress resource and the reload of the controller. All HTTP activity is logged.
$ kubectl logs -f deploy/addon-http-application-routing-nginx-ingress-controller -n kube-system
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.13.0
Build: git-4bc943a
Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------
Clean up
Remove the associated Kubernetes objects created in this article using kubectl delete .
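For example, assuming the objects were created from the samples-http-application-routing.yaml file used earlier:

kubectl delete -f samples-http-application-routing.yaml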
Next steps
For information on how to install an HTTPS-secured Ingress controller in AKS, see HTTPS Ingress on Azure
Kubernetes Service (AKS).
Tutorial: Enable Application Gateway Ingress
Controller add-on for an existing AKS cluster with
an existing Application Gateway
6/15/2022 • 6 minutes to read • Edit Online
You can use the Azure CLI or the Azure portal to enable the Application Gateway Ingress Controller (AGIC) add-on for an existing Azure Kubernetes Service (AKS) cluster. In this tutorial, you'll learn how to use the AGIC add-on to expose
your Kubernetes application in an existing AKS cluster through an existing Application Gateway deployed in
separate virtual networks. You'll start by creating an AKS cluster in one virtual network and an Application
Gateway in a separate virtual network to simulate existing resources. You'll then enable the AGIC add-on, peer
the two virtual networks together, and deploy a sample application that will be exposed through the Application
Gateway using the AGIC add-on. If you're enabling the AGIC add-on for an existing Application Gateway and
existing AKS cluster in the same virtual network, then you can skip the peering step below. The add-on provides
a much faster way of deploying AGIC for your AKS cluster than previously through Helm and also offers a fully
managed experience.
In this tutorial, you learn how to:
Create a resource group
Create a new AKS cluster
Create a new Application Gateway
Enable the AGIC add-on in the existing AKS cluster through Azure CLI
Enable the AGIC add-on in the existing AKS cluster through Portal
Peer the Application Gateway virtual network with the AKS cluster virtual network
Deploy a sample application using AGIC for Ingress on the AKS cluster
Check that the application is reachable through Application Gateway
If you don't have an Azure subscription, create an Azure free account before you begin.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
Create a resource group
In Azure, you allocate related resources to a resource group. Create a resource group by using az group create.
The following example creates a resource group named myResourceGroup in the canadacentral location
(region).
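For example:

az group create --name myResourceGroup --location canadacentral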
You can then create an AKS cluster with the az aks create command. To configure additional parameters for the az aks create command, visit references here.
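A minimal sketch of creating the cluster used in this tutorial, named myCluster; the network plugin and other parameters shown here are assumptions and can be adjusted for your environment:

az aks create --name myCluster --resource-group myResourceGroup --network-plugin azure --enable-managed-identity --generate-ssh-keys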
NOTE
Application Gateway Ingress Controller (AGIC) add-on only supports Application Gateway v2 SKUs (Standard and WAF),
and not the Application Gateway v1 SKUs.
Enable the AGIC add-on in existing AKS cluster through Azure CLI
If you'd like to continue using Azure CLI, you can continue to enable the AGIC add-on in the AKS cluster you
created, myCluster, and specify the AGIC add-on to use the existing Application Gateway you created,
myApplicationGateway.
appgwId=$(az network application-gateway show -n myApplicationGateway -g myResourceGroup -o tsv --query "id")
az aks enable-addons -n myCluster -g myResourceGroup -a ingress-appgw --appgw-id $appgwId
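Retrieve credentials for the cluster so kubectl can reach it. For example:

az aks get-credentials --resource-group myResourceGroup --name myCluster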
Once you have the credentials to the cluster you created, run the following command to set up a sample
application that uses AGIC for Ingress to the cluster. AGIC will update the Application Gateway you set up earlier
with corresponding routing rules to the new sample application you deployed.
kubectl apply -f https://raw.githubusercontent.com/Azure/application-gateway-kubernetes-ingress/master/docs/examples/aspnetapp.yaml
Check that the sample application you created is up and running by visiting the IP address of the Application Gateway or by checking with curl . You can find the Application Gateway address, for example, in the output of kubectl get ingress . It may take Application Gateway a minute to get the update, so if the Application Gateway is still in an "Updating" state in the portal, let it finish before trying to reach the IP address.
Clean up resources
When no longer needed, remove the resource group, application gateway, and all related resources.
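One way to remove everything created in this tutorial, assuming it all lives in myResourceGroup:

az group delete --name myResourceGroup --yes --no-wait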
Next steps
Learn more about disabling the AGIC add-on
Control egress traffic for cluster nodes in Azure
Kubernetes Service (AKS)
6/15/2022 • 26 minutes to read • Edit Online
This article provides the necessary details that allow you to secure outbound traffic from your Azure Kubernetes Service (AKS) cluster. It contains the cluster requirements for a base AKS deployment, and additional requirements for optional add-ons and features. An example at the end shows how to configure these requirements with Azure Firewall. However, you can apply this information to any outbound restriction method or appliance.
Background
AKS clusters are deployed on a virtual network. This network can be managed (created by AKS) or custom (pre-
configured by the user beforehand). In either case, the cluster has outbound dependencies on services outside
of that virtual network (the service has no inbound dependencies).
For management and operational purposes, nodes in an AKS cluster need to access certain ports and fully
qualified domain names (FQDNs). These endpoints are required for the nodes to communicate with the API
server, or to download and install core Kubernetes cluster components and node security updates. For example,
the cluster needs to pull base system container images from Microsoft Container Registry (MCR).
The AKS outbound dependencies are almost entirely defined with FQDNs, which don't have static addresses
behind them. The lack of static addresses means that Network Security Groups can't be used to lock down the
outbound traffic from an AKS cluster.
By default, AKS clusters have unrestricted outbound (egress) internet access. This level of network access allows
nodes and services you run to access external resources as needed. If you wish to restrict egress traffic, a limited
number of ports and addresses must be accessible to maintain healthy cluster maintenance tasks. The simplest
solution to securing outbound addresses lies in use of a firewall device that can control outbound traffic based
on domain names. Azure Firewall, for example, can restrict outbound HTTP and HTTPS traffic based on the
FQDN of the destination. You can also configure your preferred firewall and security rules to allow these
required ports and addresses.
IMPORTANT
This document covers only how to lock down the traffic leaving the AKS subnet. AKS has no ingress requirements by
default. Blocking internal subnet traffic using network security groups (NSGs) and firewalls is not supported. To control
and block the traffic within the cluster, use Network Policies .
If you choose to block/not allow these FQDNs, the nodes will only receive OS updates when you do a node
image upgrade or cluster upgrade.
Azure Policy
Required FQDN / application rules
The following FQDN / application rules are required for AKS clusters that have the Azure Policy enabled.
Cluster extensions
Required FQDN / application rules
The following FQDN / application rules are required for using cluster extensions on AKS clusters.
FQDN: <region>.dp.kubernetesconfiguration.azure.com
Port: HTTPS:443
Use: This address is used to fetch configuration information from the Cluster Extensions service and report extension status to the service.

FQDN: <region>.dp.kubernetesconfiguration.azure.us
Port: HTTPS:443
Use: This address is used to fetch configuration information from the Cluster Extensions service and report extension status to the service.
NOTE
The AzureKubernetesService FQDN tag contains all the FQDNs listed above and is kept automatically up to date.
We recommend having a minimum of 20 frontend IPs on the Azure Firewall for production scenarios to avoid SNAT port exhaustion issues.
Create a virtual network with two subnets to host the AKS cluster and the Azure Firewall; each component gets its own subnet. Let's start with the AKS network.
# Dedicated virtual network with AKS subnet
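# The commands below are a sketch of the elided steps; the address prefixes are assumptions,
# and the variable names match those used later in this article.
RG=myResourceGroup
LOC=eastus
VNET_NAME=aks-vnet
AKSSUBNET_NAME=aks-subnet
FWSUBNET_NAME=AzureFirewallSubnet   # Azure Firewall requires a subnet with this exact name

az network vnet create -g $RG -n $VNET_NAME -l $LOC \
  --address-prefixes 10.42.0.0/16 \
  --subnet-name $AKSSUBNET_NAME \
  --subnet-prefixes 10.42.1.0/24

# Dedicated subnet for Azure Firewall
az network vnet subnet create -g $RG --vnet-name $VNET_NAME -n $FWSUBNET_NAME \
  --address-prefixes 10.42.2.0/24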
IMPORTANT
If your cluster or application creates a large number of outbound connections directed to the same or small subset of
destinations, you might require more firewall frontend IPs to avoid maxing out the ports per frontend IP. For more
information on how to create an Azure firewall with multiple IPs, see here
Create a standard SKU public IP resource that will be used as the Azure Firewall frontend address.
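A sketch of creating the public IP and, as an assumed intermediate step, the firewall itself (which requires the azure-firewall CLI extension); the $FWPUBLICIP_NAME and $FWNAME variable names are assumptions:

az network public-ip create -g $RG -n $FWPUBLICIP_NAME -l $LOC --sku "Standard"

az extension add --upgrade --name azure-firewall
az network firewall create -g $RG -n $FWNAME -l $LOC --enable-dns-proxy true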
The IP address created earlier can now be assigned to the firewall frontend.
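For example, assuming the firewall and public IP created above:

az network firewall ip-config create -g $RG -f $FWNAME -n FW-config --public-ip-address $FWPUBLICIP_NAME --vnet-name $VNET_NAME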
NOTE
Assigning the public IP address to the Azure Firewall may take a few minutes. To use FQDNs in network rules, DNS proxy must be enabled. When enabled, the firewall listens on port 53 and forwards DNS requests to the DNS server specified above, which allows the firewall to translate FQDNs automatically.
When the previous command has succeeded, save the firewall frontend IP address for configuration later.
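A sketch of capturing both the public and private frontend addresses into variables for later use; the query paths are assumptions based on the resources created above:

FWPUBLIC_IP=$(az network public-ip show -g $RG -n $FWPUBLICIP_NAME --query "ipAddress" -o tsv)
FWPRIVATE_IP=$(az network firewall show -g $RG -n $FWNAME --query "ipConfigurations[0].privateIpAddress" -o tsv)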
NOTE
If you use secure access to the AKS API server with authorized IP address ranges, you need to add the firewall public IP
into the authorized IP range.
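The user-defined route itself can be created with a route table that sends all outbound traffic to the firewall's private IP. A sketch, where $FWROUTE_TABLE_NAME matches the association command later in this article and the route names are assumptions:

az network route-table create -g $RG -l $LOC --name $FWROUTE_TABLE_NAME
az network route-table route create -g $RG --name $FWROUTE_NAME --route-table-name $FWROUTE_TABLE_NAME --address-prefix 0.0.0.0/0 --next-hop-type VirtualAppliance --next-hop-ip-address $FWPRIVATE_IP
az network route-table route create -g $RG --name ${FWROUTE_NAME}-internet --route-table-name $FWROUTE_TABLE_NAME --address-prefix $FWPUBLIC_IP/32 --next-hop-type Internet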
See virtual network route table documentation about how you can override Azure's default system routes or
add additional routes to a subnet's route table.
Adding firewall rules
NOTE
For applications outside of the kube-system or gatekeeper-system namespaces that need to talk to the API server, an additional network rule allowing TCP communication to port 443 for the API server IP is required, in addition to the application rule for the AzureKubernetesService FQDN tag.
Below are three network rules you can use to configure your firewall; you may need to adapt these rules based on your deployment. The first rule allows access to port 9000 via TCP. The second rule allows access to ports 1194 and 123 via UDP (if you're deploying to Azure China 21Vianet, you might require more). Both of these rules only allow traffic destined to the Azure region CIDR that we're using, in this case East US. Finally, we'll add a third network rule opening port 123 to the ntp.ubuntu.com FQDN via UDP (adding an FQDN as a network rule is one of the specific features of Azure Firewall, so you'll need to adapt this approach when using your own options). After setting the network rules, we'll also add an application rule using the AzureKubernetesService FQDN tag, which covers all the needed FQDNs accessible through TCP port 443 and port 80.
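A sketch of these rules using the Azure CLI; the collection and rule names are assumptions, and AzureCloud.EastUS is used here to represent the East US region's address space:

az network firewall network-rule create -g $RG -f $FWNAME --collection-name 'aksfwnr' -n 'apitcp' --protocols 'TCP' --source-addresses '*' --destination-addresses "AzureCloud.EastUS" --destination-ports 9000 --action allow --priority 100
az network firewall network-rule create -g $RG -f $FWNAME --collection-name 'aksfwnr' -n 'apiudp' --protocols 'UDP' --source-addresses '*' --destination-addresses "AzureCloud.EastUS" --destination-ports 1194 123
az network firewall network-rule create -g $RG -f $FWNAME --collection-name 'aksfwnr' -n 'time' --protocols 'UDP' --source-addresses '*' --destination-fqdns 'ntp.ubuntu.com' --destination-ports 123
az network firewall application-rule create -g $RG -f $FWNAME --collection-name 'aksfwar' -n 'fqdn' --source-addresses '*' --protocols 'http=80' 'https=443' --fqdn-tags "AzureKubernetesService" --action allow --priority 100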
See Azure Firewall documentation to learn more about the Azure Firewall service.
Associate the route table to AKS
To associate the cluster with the firewall, the dedicated subnet for the cluster's subnet must reference the route
table created above. Association can be done by issuing a command to the virtual network holding both the
cluster and firewall to update the route table of the cluster's subnet.
# Associate route table with next hop to Firewall to the AKS subnet
az network vnet subnet update -g $RG --vnet-name $VNET_NAME --name $AKSSUBNET_NAME --route-table $FWROUTE_TABLE_NAME
az ad sp create-for-rbac -n "${PREFIX}sp"
Now replace the APPID and PASSWORD below with the service principal appid and service principal password
autogenerated by the previous command output. We'll reference the VNET resource ID to grant the permissions
to the service principal so AKS can deploy resources into it.
APPID="<SERVICE_PRINCIPAL_APPID_GOES_HERE>"
PASSWORD="<SERVICEPRINCIPAL_PASSWORD_GOES_HERE>"
VNETID=$(az network vnet show -g $RG --name $VNET_NAME --query id -o tsv)
az role assignment create --assignee $APPID --scope $VNETID --role "Network Contributor"
You can check the detailed permissions that are required here.
NOTE
If you're using the kubenet network plugin, you'll need to give the AKS service principal or managed identity permissions to the pre-created route table, since kubenet requires a route table to add necessary routing rules.
Deploy AKS
Finally, the AKS cluster can be deployed into the existing subnet we've dedicated for the cluster. The target
subnet to be deployed into is defined with the environment variable, $SUBNETID . We didn't define the $SUBNETID
variable in the previous steps. To set the value for the subnet ID, you can use the following command:
SUBNETID=$(az network vnet subnet show -g $RG --vnet-name $VNET_NAME --name $AKSSUBNET_NAME --query id -o tsv)
You'll define the outbound type to use the UDR that already exists on the subnet. This configuration will enable
AKS to skip the setup and IP provisioning for the load balancer.
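A sketch of the cluster creation; $AKSNAME is an assumed variable for the cluster name, and the other parameters can be adjusted for your environment:

az aks create -g $RG -n $AKSNAME -l $LOC \
  --node-count 3 \
  --network-plugin azure \
  --outbound-type userDefinedRouting \
  --vnet-subnet-id $SUBNETID \
  --service-principal $APPID \
  --client-secret $PASSWORD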
IMPORTANT
For more information on outbound type UDR including limitations, see egress outbound type UDR.
TIP
Additional features can be added to the cluster deployment such as Private Cluster .
The AKS feature for API server authorized IP ranges can be added to limit API server access to only the firewall's public endpoint. The authorized IP ranges feature is denoted in the diagram as optional. When enabling the authorized IP range feature to limit API server access, your developer tools must use a jumpbox from the firewall's virtual network or you must add all developer endpoints to the authorized IP range.
Use the az aks get-credentials command to configure kubectl to connect to your newly created Kubernetes cluster.
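For example, using the variables assumed above:

az aks get-credentials -g $RG -n $AKSNAME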
Deploy the Azure voting app by copying the YAML below to a file named example.yaml.
# voting-storage-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: voting-storage
spec:
replicas: 1
selector:
matchLabels:
app: voting-storage
template:
metadata:
labels:
app: voting-storage
spec:
containers:
- name: voting-storage
image: mcr.microsoft.com/aks/samples/voting/storage:2.0
args: ["--ignore-db-dir=lost+found"]
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 3306
name: mysql
volumeMounts:
- name: mysql-persistent-storage
mountPath: /var/lib/mysql
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_ROOT_PASSWORD
- name: MYSQL_USER
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_USER
- name: MYSQL_PASSWORD
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_PASSWORD
- name: MYSQL_DATABASE
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_DATABASE
volumes:
- name: mysql-persistent-storage
persistentVolumeClaim:
claimName: mysql-pv-claim
---
# voting-storage-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: voting-storage-secret
type: Opaque
data:
MYSQL_USER: ZGJ1c2Vy
MYSQL_PASSWORD: UGFzc3dvcmQxMg==
MYSQL_DATABASE: YXp1cmV2b3Rl
MYSQL_ROOT_PASSWORD: UGFzc3dvcmQxMg==
---
# voting-storage-pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pv-claim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
# voting-storage-service.yaml
apiVersion: v1
kind: Service
metadata:
name: voting-storage
labels:
app: voting-storage
spec:
ports:
- port: 3306
name: mysql
selector:
app: voting-storage
---
# voting-app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: voting-app
spec:
replicas: 1
selector:
matchLabels:
app: voting-app
template:
metadata:
labels:
app: voting-app
spec:
containers:
- name: voting-app
image: mcr.microsoft.com/aks/samples/voting/app:2.0
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
env:
- name: MYSQL_HOST
value: "voting-storage"
- name: MYSQL_USER
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_USER
- name: MYSQL_PASSWORD
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_PASSWORD
- name: MYSQL_DATABASE
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_DATABASE
- name: ANALYTICS_HOST
value: "voting-analytics"
---
# voting-app-service.yaml
apiVersion: v1
kind: Service
metadata:
name: voting-app
labels:
app: voting-app
spec:
type: LoadBalancer
ports:
- port: 80
targetPort: 8080
name: http
selector:
app: voting-app
---
# voting-analytics-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: voting-analytics
spec:
replicas: 1
selector:
matchLabels:
app: voting-analytics
version: "2.0"
template:
metadata:
labels:
app: voting-analytics
version: "2.0"
spec:
containers:
- name: voting-analytics
image: mcr.microsoft.com/aks/samples/voting/analytics:2.0
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
env:
- name: MYSQL_HOST
value: "voting-storage"
- name: MYSQL_USER
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_USER
- name: MYSQL_PASSWORD
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_PASSWORD
- name: MYSQL_DATABASE
valueFrom:
secretKeyRef:
name: voting-storage-secret
key: MYSQL_DATABASE
---
# voting-analytics-service.yaml
apiVersion: v1
kind: Service
metadata:
name: voting-analytics
labels:
app: voting-analytics
spec:
ports:
- port: 8080
name: http
selector:
app: voting-analytics
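One way to deploy the application, assuming the manifest above was saved as example.yaml as described earlier:

kubectl apply -f example.yaml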
IMPORTANT
When you use Azure Firewall to restrict egress traffic and create a user-defined route (UDR) to force all egress traffic, make
sure you create an appropriate DNAT rule in Firewall to correctly allow ingress traffic. Using Azure Firewall with a UDR
breaks the ingress setup due to asymmetric routing. (The issue occurs if the AKS subnet has a default route that goes to
the firewall's private IP address, but you're using a public load balancer - ingress or Kubernetes service of type:
LoadBalancer). In this case, the incoming load balancer traffic is received via its public IP address, but the return path goes
through the firewall's private IP address. Because the firewall is stateful, it drops the returning packet because the firewall
isn't aware of an established session. To learn how to integrate Azure Firewall with your ingress or service load balancer,
see Integrate Azure Firewall with Azure Standard Load Balancer.
To configure inbound connectivity, a DNAT rule must be written to the Azure Firewall. To test connectivity to your
cluster, a rule is defined for the firewall frontend public IP address to route to the internal IP exposed by the
internal service.
The destination address can be customized as it's the port on the firewall to be accessed. The translated address
must be the IP address of the internal load balancer. The translated port must be the exposed port for your
Kubernetes service.
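A sketch of such a rule; the collection and rule names are assumptions, and <SERVICE_INTERNAL_IP> stands for the internal load balancer IP retrieved in the next step:

az network firewall nat-rule create -g $RG -f $FWNAME --collection-name 'aksfwdnat' -n 'inboundrule' --protocols 'TCP' --source-addresses '*' --destination-addresses $FWPUBLIC_IP --destination-ports 80 --translated-address <SERVICE_INTERNAL_IP> --translated-port 80 --action Dnat --priority 100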
You'll need to specify the internal IP address assigned to the load balancer created by the Kubernetes service.
Retrieve the address by running:
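For example:

kubectl get services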
The IP address needed will be listed in the EXTERNAL-IP column, similar to the following.
Validate connectivity
Navigate to the Azure Firewall frontend IP address in a browser to validate connectivity.
You should see the AKS voting app. In this example, the Firewall public IP was 52.253.228.132 .
Clean up resources
To clean up Azure resources, delete the AKS resource group.
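For example, assuming the resource group name is still in $RG:

az group delete --name $RG --yes --no-wait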
Next steps
In this article, you learned what ports and addresses to allow if you want to restrict egress traffic for the cluster.
You also saw how to secure your outbound traffic using Azure Firewall.
If needed, you can generalize the steps above to forward the traffic to your preferred egress solution, following
the Outbound Type userDefinedRoute documentation.
If you want to restrict how pods communicate with each other and apply East-West traffic restrictions within the cluster, see Secure traffic between pods using network policies in AKS.
Customize cluster egress with a User-Defined Route
6/15/2022 • 3 minutes to read • Edit Online
Egress from an AKS cluster can be customized to fit specific scenarios. By default, AKS will provision a Standard
SKU Load Balancer to be set up and used for egress. However, the default setup may not meet the requirements
of all scenarios if public IPs are disallowed or additional hops are required for egress.
This article walks through how to customize a cluster's egress route to support custom network scenarios, such as those that disallow public IPs and require the cluster to sit behind a network virtual appliance (NVA).
Prerequisites
Azure CLI version 2.0.81 or greater
API version of 2020-01-01 or greater
Limitations
OutboundType can only be defined at cluster create time and can't be updated afterwards.
Setting outboundType requires AKS clusters with a vm-set-type of VirtualMachineScaleSets and
load-balancer-sku of Standard .
Setting outboundType to a value of UDR requires a user-defined route with valid outbound connectivity for
the cluster.
Setting outboundType to a value of UDR implies the ingress source IP routed to the load-balancer may not
match the cluster's outgoing egress destination address.
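A minimal sketch combining the requirements above; the names and subnet ID are illustrative:

az aks create -g myResourceGroup -n myAKSCluster \
  --vm-set-type VirtualMachineScaleSets \
  --load-balancer-sku standard \
  --outbound-type userDefinedRouting \
  --vnet-subnet-id <subnet-with-a-valid-udr>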
IMPORTANT
Outbound type impacts only the egress traffic of your cluster. For more information, see setting up ingress controllers.
NOTE
You can use your own route table with UDR and kubenet networking. Make sure your cluster identity (service principal or managed identity) has Contributor permissions to the custom route table.
NOTE
Using outbound type is an advanced networking scenario and requires proper network configuration.
If userDefinedRouting is set, AKS won't automatically configure egress paths. The egress setup must be done by
you.
The AKS cluster must be deployed into an existing virtual network with a previously configured subnet, because when you're not using standard load balancer (SLB) architecture, you must establish explicit egress. This architecture requires explicitly sending egress traffic to an appliance like a firewall, gateway, or proxy, or allowing Network Address Translation (NAT) to be done by a public IP assigned to the standard load balancer or appliance.
Load balancer creation with userDefinedRouting
AKS clusters with an outbound type of UDR receive a standard load balancer (SLB) only when the first
Kubernetes service of type 'loadBalancer' is deployed. The load balancer is configured with a public IP address
for inbound requests and a backend pool for inbound requests. Inbound rules are configured by the Azure cloud
provider, but no outbound public IP address or outbound rules are configured as a result of having an
outbound type of UDR. Your UDR will still be the only source for egress traffic.
Azure load balancers don't incur a charge until a rule is placed.
Next steps
See Azure networking UDR overview.
See how to create, change, or delete a route table.
Managed NAT Gateway
6/15/2022 • 2 minutes to read • Edit Online
While AKS customers are able to route egress traffic through an Azure Load Balancer, there are limits on the number of outbound traffic flows that are possible.
Azure NAT Gateway allows up to 64,512 outbound UDP and TCP traffic flows per IP address with a maximum of
16 IP addresses.
This article will show you how to create an AKS cluster with a Managed NAT Gateway for egress traffic.
az aks create \
--resource-group myResourceGroup \
--name natcluster \
--node-count 3 \
--outbound-type managedNATGateway \
--nat-gateway-managed-outbound-ip-count 2 \
--nat-gateway-idle-timeout 30
IMPORTANT
If no value for the outbound IP address count is specified, the default value is one.
az aks update \
--resource-group myresourcegroup \
--name natcluster \
--nat-gateway-managed-outbound-ip-count 5
2. Create a managed identity for network permissions and store the ID to $IDENTITY_ID for later use:
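A sketch, where the identity name is an assumption and the resource group matches the one used elsewhere in this article:

az identity create --resource-group myresourcegroup --name natClusterId
IDENTITY_ID=$(az identity show --resource-group myresourcegroup --name natClusterId --query id -o tsv)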
6. Create a subnet in the virtual network using the NAT gateway and store the ID to $SUBNET_ID for later
use:
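A sketch; the virtual network and NAT gateway names come from the elided earlier steps and are assumptions here:

SUBNET_ID=$(az network vnet subnet create \
  --resource-group myresourcegroup \
  --vnet-name myVnet \
  --name natcluster-subnet \
  --address-prefixes 10.0.0.0/22 \
  --nat-gateway myNatGateway \
  --query id -o tsv)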
az aks create \
--resource-group myresourcegroup \
--name natcluster \
--location southcentralus \
--network-plugin azure \
--vnet-subnet-id $SUBNET_ID \
--outbound-type userAssignedNATGateway \
--enable-managed-identity \
--assign-identity $IDENTITY_ID
Next Steps
For more information on Azure NAT Gateway, see Azure NAT Gateway.
Customize CoreDNS with Azure Kubernetes Service
6/15/2022 • 5 minutes to read • Edit Online
Azure Kubernetes Service (AKS) uses the CoreDNS project for cluster DNS management and resolution with all
1.12.x and higher clusters. Previously, the kube-dns project was used. This kube-dns project is now deprecated.
For more information about CoreDNS customization and Kubernetes, see the official upstream documentation.
As AKS is a managed service, you cannot modify the main configuration for CoreDNS (a CoreFile). Instead, you
use a Kubernetes ConfigMap to override the default settings. To see the default AKS CoreDNS ConfigMaps, use
the kubectl get configmaps --namespace=kube-system coredns -o yaml command.
This article shows you how to use ConfigMaps for basic customization options of CoreDNS in AKS. This
approach differs from configuring CoreDNS in other contexts such as using the CoreFile. Verify the version of
CoreDNS you are running as the configuration values may change between versions.
NOTE
kube-dns offered different customization options via a Kubernetes config map. CoreDNS is not backwards compatible
with kube-dns. Any customizations you previously used must be updated for use with CoreDNS.
What is supported/unsupported
All built-in CoreDNS plugins are supported. No add-on/third party plugins are supported.
Rewrite DNS
One scenario is to perform on-the-fly DNS name rewrites. In the following example, replace <domain to be rewritten> with your own fully qualified domain name. Create a file named corednsms.yaml and paste the following example configuration:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
test.server: | # you may select any name here, but it must end with the .server file extension
<domain to be rewritten>.com:53 {
log
errors
rewrite stop {
name regex (.*)\.<domain to be rewritten>.com {1}.default.svc.cluster.local
answer name (.*)\.default\.svc\.cluster\.local {1}.<domain to be rewritten>.com
}
forward . /etc/resolv.conf # you can redirect this to a specific DNS server such as 10.0.0.10, but that
server must be able to resolve the rewritten domain name
}
IMPORTANT
If you redirect to a DNS server, such as the CoreDNS service IP, that DNS server must be able to resolve the rewritten
domain name.
Create the ConfigMap using the kubectl apply configmap command and specify the name of your YAML
manifest:
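For example:

kubectl apply -f corednsms.yaml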
To verify the customizations have been applied, use the kubectl get configmaps and specify your coredns-
custom ConfigMap:
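For example:

kubectl get configmaps --namespace=kube-system coredns-custom -o yaml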
Now force CoreDNS to reload the ConfigMap. The kubectl delete pod command isn't destructive and doesn't cause downtime. The kube-dns pods are deleted, and the Kubernetes Scheduler then recreates them. These new pods pick up the updated configuration.
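One way to do this, using the label selector referenced in the note below:

kubectl delete pod --namespace kube-system -l k8s-app=kube-dns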
NOTE
The command above is correct. While we're changing coredns , the deployment is under the kube-dns label.
If you need CoreDNS to forward part of your DNS traffic to a custom DNS server, you can configure that with a ConfigMap as well. For example, the following configuration forwards queries for the puglife.local domain to a test DNS server. Create a file named corednsms.yaml and paste the following example configuration:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
puglife.server: | # you may select any name here, but it must end with the .server file extension
puglife.local:53 {
errors
cache 30
forward . 192.11.0.1 # this is my test/dev DNS server
}
As in the previous examples, create the ConfigMap using the kubectl apply configmap command and specify the
name of your YAML manifest. Then, force CoreDNS to reload the ConfigMap using the kubectl delete pod for the
Kubernetes Scheduler to recreate them:
Stub domains
CoreDNS can also be used to configure stub domains. In the following example, update the custom domains
and IP addresses with the values for your own environment. Create a file named corednsms.yaml and paste the
following example configuration:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
test.server: | # you may select any name here, but it must end with the .server file extension
abc.com:53 {
errors
cache 30
forward . 1.2.3.4
}
my.cluster.local:53 {
errors
cache 30
forward . 2.3.4.5
}
As in the previous examples, create the ConfigMap using the kubectl apply configmap command and specify the
name of your YAML manifest. Then, force CoreDNS to reload the ConfigMap using the kubectl delete pod for the
Kubernetes Scheduler to recreate them:
Hosts plugin
Because all built-in plugins are supported, the CoreDNS hosts plugin is available to customize as well:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom # this is the name of the configmap you can overwrite with your changes
namespace: kube-system
data:
test.override: | # you may select any name here, but it must end with the .override file extension
hosts {
10.0.0.1 example1.org
10.0.0.2 example2.org
10.0.0.3 example3.org
fallthrough
}
Troubleshooting
For general CoreDNS troubleshooting steps, such as checking the endpoints or resolution, see Debugging DNS
Resolution.
To enable DNS query logging, apply the following configuration in your coredns-custom ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
log.override: | # you may select any name here, but it must end with the .override file extension
log
After you apply the configuration changes, use the kubectl logs command to view the CoreDNS debug
logging. For example:
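A representative command, using the kube-dns label selector under which the CoreDNS pods run:

kubectl logs --namespace kube-system -l k8s-app=kube-dns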
Next steps
This article showed some example scenarios for CoreDNS customization. For information on the CoreDNS
project, see the CoreDNS upstream project page.
To learn more about core network concepts, see Network concepts for applications in AKS.
Dynamically create and use a persistent volume with
Azure disks in Azure Kubernetes Service (AKS)
6/15/2022 • 7 minutes to read • Edit Online
A persistent volume represents a piece of storage that has been provisioned for use with Kubernetes pods. A
persistent volume can be used by one or many pods, and can be dynamically or statically provisioned. This
article shows you how to dynamically create persistent volumes with Azure disks for use by a single pod in an
Azure Kubernetes Service (AKS) cluster.
NOTE
An Azure disk can only be mounted with Access mode type ReadWriteOnce, which makes it available to one node in AKS.
If you need to share a persistent volume across multiple nodes, use Azure Files.
For more information on Kubernetes volumes, see Storage options for applications in AKS.
For example, if you want to use a disk of size 4 TiB, you must create a storage class that defines
cachingmode: None because disk caching isn't supported for disks 4 TiB and larger.
For more information about storage classes and creating your own storage class, see Storage options for
applications in AKS.
Use the kubectl get sc command to see the pre-created storage classes. The following example shows the pre-created storage classes available within an AKS cluster:
$ kubectl get sc
NOTE
Persistent volume claims are specified in GiB but Azure managed disks are billed by SKU for a specific size. These SKUs range from 32GiB for S4 or P4 disks to 32TiB for S80 or P80 disks (in preview). The throughput and IOPS performance of a Premium managed disk depends on both the SKU and the instance size of the nodes in the AKS cluster. For more information, see Pricing and Performance of Managed Disks.
Create a file named azure-pvc.yaml and copy in the following manifest:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: azure-managed-disk
spec:
accessModes:
- ReadWriteOnce
storageClassName: managed-csi
resources:
requests:
storage: 5Gi
TIP
To create a disk that uses premium storage, use storageClassName: managed-csi-premium rather than managed-csi.
Create the persistent volume claim with the kubectl apply command and specify your azure-pvc.yaml file:
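For example:

kubectl apply -f azure-pvc.yaml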
persistentvolumeclaim/azure-managed-disk created
kind: Pod
apiVersion: v1
metadata:
name: mypod
spec:
containers:
- name: mypod
image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
volumeMounts:
- mountPath: "/mnt/azure"
name: volume
volumes:
- name: volume
persistentVolumeClaim:
claimName: azure-managed-disk
Create the pod with the kubectl apply command, as shown in the following example:
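A sketch, assuming the pod manifest above was saved as azure-pvc-disk.yaml (the file name is an assumption):

kubectl apply -f azure-pvc-disk.yaml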
pod/mypod created
You now have a running pod with your Azure disk mounted in the /mnt/azure directory. This configuration can
be seen when inspecting your pod via kubectl describe pod mypod , as shown in the following condensed
example:
[...]
Volumes:
volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: azure-managed-disk
ReadOnly: false
default-token-smm2n:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-smm2n
Optional: false
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned mypod to
aks-nodepool1-79590246-0
Normal SuccessfulMountVolume 2m kubelet, aks-nodepool1-79590246-0 MountVolume.SetUp succeeded for
volume "default-token-smm2n"
Normal SuccessfulMountVolume 1m kubelet, aks-nodepool1-79590246-0 MountVolume.SetUp succeeded for
volume "pvc-faf0f176-8b8d-11e8-923b-deb28c58d242"
[...]
This volume name forms the underlying Azure disk name. Query for the disk ID with az disk list and provide
your PVC volume name, as shown in the following example:
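A sketch of the query; the volume name comes from the pod description above, and the JMESPath filter is one way to match it:

az disk list --query "[?contains(name, 'pvc-faf0f176-8b8d-11e8-923b-deb28c58d242')].id" -o tsv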
/subscriptions/<guid>/resourceGroups/MC_MYRESOURCEGROUP_MYAKSCLUSTER_EASTUS/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-faf0f176-8b8d-11e8-923b-deb28c58d242
Use the disk ID to create a snapshot disk with az snapshot create. The following example creates a snapshot
named pvcSnapshot in the same resource group as the AKS cluster
(MC_myResourceGroup_myAKSCluster_eastus). You may encounter permission issues if you create snapshots
and restore disks in resource groups that the AKS cluster does not have access to.
$ az snapshot create \
--resource-group MC_myResourceGroup_myAKSCluster_eastus \
--name pvcSnapshot \
--source /subscriptions/<guid>/resourceGroups/MC_myResourceGroup_myAKSCluster_eastus/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-faf0f176-8b8d-11e8-923b-deb28c58d242
Depending on the amount of data on your disk, it may take a few minutes to create the snapshot.
To use the restored disk with a pod, specify the ID of the disk in the manifest. Get the disk ID with the az disk
show command. The following example gets the disk ID for pvcRestored created in the previous step:
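For example, using the node resource group from the snapshot step:

az disk show --resource-group MC_myResourceGroup_myAKSCluster_eastus --name pvcRestored --query id -o tsv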
Create a pod manifest named azure-restored.yaml and specify the disk URI obtained in the previous step. The
following example creates a basic NGINX web server, with the restored disk mounted as a volume at /mnt/azure:
kind: Pod
apiVersion: v1
metadata:
name: mypodrestored
spec:
containers:
- name: mypodrestored
image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
volumeMounts:
- mountPath: "/mnt/azure"
name: volume
volumes:
- name: volume
azureDisk:
kind: Managed
diskName: pvcRestored
diskURI: /subscriptions/<guid>/resourceGroups/MC_myResourceGroupAKS_myAKSCluster_eastus/providers/Microsoft.Compute/disks/pvcRestored
Create the pod with the kubectl apply command, as shown in the following example:
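For example:

kubectl apply -f azure-restored.yaml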
pod/mypodrestored created
You can use kubectl describe pod mypodrestored to view details of the pod, such as the following condensed
example that shows the volume information:
[...]
Volumes:
volume:
Type: AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)
DiskName: pvcRestored
DiskURI: /subscriptions/19da35d3-9a1a-4f3b-9b9c-
3c56ef409565/resourceGroups/MC_myResourceGroupAKS_myAKSCluster_eastus/providers/Microsoft.Compute/disks/pvcR
estored
Kind: Managed
FSType: ext4
CachingMode: ReadWrite
ReadOnly: false
[...]
Next steps
For associated best practices, see Best practices for storage and backups in AKS.
Learn more about Kubernetes persistent volumes using Azure disks.
Kubernetes plugin for Azure disks
Create a static volume with Azure disks in Azure
Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
Container-based applications often need to access and persist data in an external data volume. If a single pod
needs access to storage, you can use Azure disks to present a native volume for application use. This article
shows you how to manually create an Azure disk and attach it to a pod in AKS.
NOTE
An Azure disk can only be mounted to a single pod at a time. If you need to share a persistent volume across multiple
pods, use Azure Files.
For more information on Kubernetes volumes, see Storage options for applications in AKS.
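1. Find the name of the node resource group, since the disk must be created there. A sketch using the az aks show command, assuming a cluster named myAKSCluster in myResourceGroup; the example output follows:

az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv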
MC_myResourceGroup_myAKSCluster_eastus
2. Create a disk using the az disk create command. Specify the node resource group name obtained in the
previous command, and then a name for the disk resource, such as myAKSDisk. The following example
creates a 20GiB disk, and outputs the ID of the disk after it's created. If you need to create a disk for use
with Windows Server containers, add the --os-type windows parameter to correctly format the disk.
az disk create \
--resource-group MC_myResourceGroup_myAKSCluster_eastus \
--name myAKSDisk \
--size-gb 20 \
--query id --output tsv
NOTE
Azure disks are billed by SKU for a specific size. These SKUs range from 32GiB for S4 or P4 disks to 32TiB for S80
or P80 disks (in preview). The throughput and IOPS performance of a Premium managed disk depends on both
the SKU and the instance size of the nodes in the AKS cluster. See Pricing and Performance of Managed Disks.
The disk resource ID is displayed once the command has successfully completed, as shown in the
following example output. This disk ID is used to mount the disk in the next section.
/subscriptions/<subscriptionID>/resourceGroups/MC_myAKSCluster_myAKSCluster_eastus/providers/Microsoft.Compute/disks/myAKSDisk
2. Create a pvc-azuredisk.yaml file with a PersistentVolumeClaim that uses the PersistentVolume. For
example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc-azuredisk
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
volumeName: pv-azuredisk
storageClassName: managed-csi
3. Use the kubectl commands to create the PersistentVolume and PersistentVolumeClaim, referencing the
two YAML files created earlier:
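A sketch, assuming the PersistentVolume manifest from the earlier (elided) step was saved as pv-azuredisk.yaml:

kubectl apply -f pv-azuredisk.yaml
kubectl apply -f pvc-azuredisk.yaml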
4. To verify your PersistentVolumeClaim is created and bound to the PersistentVolume, run the following
command:
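For example:

kubectl get pvc pvc-azuredisk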
6. Run the following command to apply the configuration and mount the volume, referencing the YAML
configuration file created in the previous steps:
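A sketch, assuming the pod manifest from the preceding (elided) step was saved as azure-disk-pod.yaml:

kubectl apply -f azure-disk-pod.yaml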
Next steps
To learn about our recommended storage and backup practices, see Best practices for storage and backups in
AKS.
Dynamically create and use a persistent volume with
Azure Files in Azure Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
A persistent volume represents a piece of storage that has been provisioned for use with Kubernetes pods. A
persistent volume can be used by one or many pods, and can be dynamically or statically provisioned. If
multiple pods need concurrent access to the same storage volume, you can use Azure Files to connect using the
Server Message Block (SMB) protocol. This article shows you how to dynamically create an Azure Files share for
use by multiple pods in an Azure Kubernetes Service (AKS) cluster.
For more information on Kubernetes volumes, see Storage options for applications in AKS.
NOTE
Minimum premium file share is 100GB.
For more information on Kubernetes storage classes for Azure Files, see Kubernetes Storage Classes.
Create a file named azure-file-sc.yaml and copy in the following example manifest. For more information on
mountOptions, see the Mount options section.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: my-azurefile
provisioner: file.csi.azure.com # replace with "kubernetes.io/azure-file" if aks version is less than 1.21
allowVolumeExpansion: true
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=0
- gid=0
- mfsymlinks
- cache=strict
- actimeo=30
parameters:
skuName: Premium_LRS
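One way to create the storage class before defining the persistent volume claim shown next:

kubectl apply -f azure-file-sc.yaml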
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-azurefile
spec:
accessModes:
- ReadWriteMany
storageClassName: my-azurefile
resources:
requests:
storage: 100Gi
NOTE
If using the Premium_LRS sku for your storage class, the minimum value for storage must be 100Gi.
Create the persistent volume claim with the kubectl apply command:
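A sketch, assuming the claim manifest above was saved as azure-file-pvc.yaml (the file name is an assumption):

kubectl apply -f azure-file-pvc.yaml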
Once completed, the file share will be created. A Kubernetes secret is also created that includes connection
information and credentials. You can use the kubectl get command to view the status of the PVC:
$ kubectl get pvc my-azurefile
kind: Pod
apiVersion: v1
metadata:
name: mypod
spec:
containers:
- name: mypod
image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
volumeMounts:
- mountPath: "/mnt/azure"
name: volume
volumes:
- name: volume
persistentVolumeClaim:
claimName: my-azurefile
You now have a running pod with your Azure Files share mounted in the /mnt/azure directory. This
configuration can be seen when inspecting your pod via kubectl describe pod mypod . The following condensed
example output shows the volume mounted in the container:
Containers:
mypod:
Container ID: docker://053bc9c0df72232d755aa040bfba8b533fa696b123876108dec400e364d2523e
Image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
Image ID: docker-
pullable://nginx@sha256:d85914d547a6c92faa39ce7058bd7529baacab7e0cd4255442b04577c4d1f424
State: Running
Started: Fri, 01 Mar 2019 23:56:16 +0000
Ready: True
Mounts:
/mnt/azure from volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-8rv4z (ro)
[...]
Volumes:
volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-azurefile
ReadOnly: false
[...]
Mount options
The default value for fileMode and dirMode is 0777 for Kubernetes version 1.13.0 and above. If dynamically
creating the persistent volume with a storage class, mount options can be specified on the storage class object.
The following example sets 0777:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: my-azurefile
provisioner: file.csi.azure.com # replace with "kubernetes.io/azure-file" if aks version is less than 1.21
allowVolumeExpansion: true
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=0
- gid=0
- mfsymlinks
- cache=strict
- actimeo=30
parameters:
skuName: Premium_LRS
Next steps
For associated best practices, see Best practices for storage and backups in AKS.
For storage class parameters, see Dynamic Provision.
Learn more about Kubernetes persistent volumes using Azure Files.
Kubernetes plugin for Azure Files
Manually create and use a volume with Azure Files
share in Azure Kubernetes Service (AKS)
6/15/2022 • 4 minutes to read • Edit Online
Container-based applications often need to access and persist data in an external data volume. If multiple pods
need concurrent access to the same storage volume, you can use Azure Files to connect using the Server
Message Block (SMB) protocol. This article shows you how to manually create an Azure Files share and attach it
to a pod in AKS.
For more information on Kubernetes volumes, see Storage options for applications in AKS.
# Export the connection string as an environment variable, this is used when creating the Azure file share
export AZURE_STORAGE_CONNECTION_STRING=$(az storage account show-connection-string -n $AKS_PERS_STORAGE_ACCOUNT_NAME -g $AKS_PERS_RESOURCE_GROUP -o tsv)
Make a note of the storage account name and key shown at the end of the script output. These values are
needed when you create the Kubernetes volume in one of the following steps.
Create a Kubernetes secret
Kubernetes needs credentials to access the file share created in the previous step. These credentials are stored in
a Kubernetes secret, which is referenced when you create a Kubernetes pod.
Use the kubectl create secret command to create the secret. The following example creates a secret named
azure-secret and populates the azurestorageaccountname and azurestorageaccountkey from the previous step.
To use an existing Azure storage account, provide the account name and key.
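A sketch; $STORAGE_KEY is assumed to hold the account key retrieved earlier (for example with az storage account keys list):

kubectl create secret generic azure-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY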
To mount the Azure Files share into your pod, configure the volume in the container spec. Create a new file
named azure-files-pod.yaml with the following contents. If you changed the name of the Files share or secret
name, update the shareName and secretName. If desired, update the mountPath , which is the path where the
Files share is mounted in the pod. For Windows Server containers, specify a mountPath using the Windows path
convention, such as 'D:'.
apiVersion: v1
kind: Pod
metadata:
name: mypod
spec:
containers:
- image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
name: mypod
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
volumeMounts:
- name: azure
mountPath: /mnt/azure
volumes:
- name: azure
csi:
driver: file.csi.azure.com
volumeAttributes:
secretName: azure-secret # required
shareName: aksshare # required
mountOptions: "dir_mode=0777,file_mode=0777,cache=strict,actimeo=30" # optional
apiVersion: v1
kind: PersistentVolume
metadata:
name: azurefile
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Retain
storageClassName: azurefile-csi
csi:
driver: file.csi.azure.com
readOnly: false
volumeHandle: unique-volumeid # make sure this volumeid is unique in the cluster
volumeAttributes:
resourceGroup: EXISTING_RESOURCE_GROUP_NAME # optional, only set this when storage account is not in
the same resource group as agent node
shareName: aksshare
nodeStageSecretRef:
name: azure-secret
namespace: default
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=0
- gid=0
- mfsymlinks
- cache=strict
- nosharesock
- nobrl
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: azurefile
spec:
accessModes:
- ReadWriteMany
storageClassName: azurefile-csi
volumeName: azurefile
resources:
requests:
storage: 5Gi
Update your container spec to reference your PersistentVolumeClaim and update your pod. For example:
...
volumes:
- name: azure
persistentVolumeClaim:
claimName: azurefile
As the pod spec can't be updated in place, use kubectl commands to delete, and then re-create the pod:
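For example:

kubectl delete pod mypod
kubectl apply -f azure-files-pod.yaml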
Next steps
For Azure File CSI driver parameters, see CSI driver parameters.
For associated best practices, see Best practices for storage and backups in AKS.
Integrate Azure HPC Cache with Azure Kubernetes
Service
6/15/2022 • 6 minutes to read • Edit Online
Azure HPC Cache speeds access to your data for high-performance computing (HPC) tasks. By caching files in
Azure, Azure HPC Cache brings the scalability of cloud computing to your existing workflow. This article shows
you how to integrate Azure HPC Cache with Azure Kubernetes Service (AKS).
IMPORTANT
Your AKS cluster must be in a region that supports Azure HPC Cache.
You also need to install and configure Azure CLI version 2.7 or later. Run az --version to find the version. If you
need to install or upgrade, see Install Azure CLI. See hpc-cache-cli-prerequisites for more information about
using Azure CLI with HPC Cache.
You will also need to install the hpc-cache Azure CLI extension. Please do the following:
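For example:

az extension add --upgrade -n hpc-cache

You'll also need the name of the node resource group, which looks similar to the example output below. One way to retrieve it, assuming a cluster named myAKSCluster in myResourceGroup:

az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv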
MC_myResourceGroup_myAKSCluster_eastus
NOTE
The resource provider registration can take some time to complete.
NOTE
The HPC Cache takes approximately 20 minutes to be created.
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
VNET_NAME=$(az network vnet list --resource-group $RESOURCE_GROUP --query [].name -o tsv)
VNET_ID=$(az network vnet show --resource-group $RESOURCE_GROUP --name $VNET_NAME --query "id" -o tsv)
SUBNET_NAME=MyHpcCacheSubnet
SUBNET_ID=$(az network vnet subnet show --resource-group $RESOURCE_GROUP --vnet-name $VNET_NAME --name $SUBNET_NAME --query "id" -o tsv)
az hpc-cache create \
--resource-group $RESOURCE_GROUP \
--cache-size-gb "3072" \
--location eastus \
--subnet $SUBNET_ID \
--sku-name "Standard_2G" \
--name MyHpcCache
IMPORTANT
You need to select a unique storage account name. Replace 'uniquestorageaccount' with something that will be unique for
you.
Check that the storage account name that you have selected is available.
STORAGE_ACCOUNT_NAME=uniquestorageaccount
az storage account check-name --name $STORAGE_ACCOUNT_NAME
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
STORAGE_ACCOUNT_NAME=uniquestorageaccount
az storage account create \
-n $STORAGE_ACCOUNT_NAME \
-g $RESOURCE_GROUP \
-l eastus \
--sku Standard_LRS
STORAGE_ACCOUNT_NAME=uniquestorageaccount
STORAGE_ACCOUNT_ID=$(az storage account show --name $STORAGE_ACCOUNT_NAME --query "id" -o tsv)
AD_USER=$(az ad signed-in-user show --query objectId -o tsv)
CONTAINER_NAME=mystoragecontainer
az role assignment create --role "Storage Blob Data Contributor" --assignee $AD_USER --scope $STORAGE_ACCOUNT_ID
az storage container create --name $CONTAINER_NAME --account-name $STORAGE_ACCOUNT_NAME --auth-mode login
Provide permissions to the Azure HPC Cache service account to access your storage account and Blob container.
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
STORAGE_ACCOUNT_NAME=uniquestorageaccount
STORAGE_ACCOUNT_ID=$(az storage account show --name $STORAGE_ACCOUNT_NAME --query "id" -o tsv)
CONTAINER_NAME=mystoragecontainer
az hpc-cache blob-storage-target add \
--resource-group $RESOURCE_GROUP \
--cache-name MyHpcCache \
--name MyStorageTarget \
--storage-account $STORAGE_ACCOUNT_ID \
--container-name $CONTAINER_NAME \
--virtual-namespace-path "/myfilepath"
DNS_NAME="server"
PRIVATE_DNS_ZONE="myhpccache.local"
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
HPC_MOUNTS0=$(az hpc-cache show --name "MyHpcCache" --resource-group $RESOURCE_GROUP --query "mountAddresses[0]" -o tsv | tr --delete '\r')
HPC_MOUNTS1=$(az hpc-cache show --name "MyHpcCache" --resource-group $RESOURCE_GROUP --query "mountAddresses[1]" -o tsv | tr --delete '\r')
HPC_MOUNTS2=$(az hpc-cache show --name "MyHpcCache" --resource-group $RESOURCE_GROUP --query "mountAddresses[2]" -o tsv | tr --delete '\r')
az network private-dns record-set a add-record -g $RESOURCE_GROUP -z $PRIVATE_DNS_ZONE -n $DNS_NAME -a $HPC_MOUNTS0
az network private-dns record-set a add-record -g $RESOURCE_GROUP -z $PRIVATE_DNS_ZONE -n $DNS_NAME -a $HPC_MOUNTS1
az network private-dns record-set a add-record -g $RESOURCE_GROUP -z $PRIVATE_DNS_ZONE -n $DNS_NAME -a $HPC_MOUNTS2
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-nfs
spec:
capacity:
storage: 10000Gi
accessModes:
- ReadWriteMany
mountOptions:
- vers=3
nfs:
server: server.myhpccache.local
path: /
First, ensure that you have credentials for your Kubernetes cluster.
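For example, assuming a cluster named myAKSCluster in myResourceGroup:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster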
Update the server and path to the values of your NFS (Network File System) volume you created in the previous
step. Create the persistent volume with the kubectl apply command:
kubectl apply -f pv-nfs.yaml
Verify that the status of the persistent volume is Available by using the kubectl describe command:
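For example:

kubectl describe pv pv-nfs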
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc-nfs
spec:
accessModes:
- ReadWriteMany
storageClassName: ""
resources:
requests:
storage: 100Gi
Use the kubectl apply command to create the persistent volume claim:
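A sketch, assuming the claim manifest above was saved as pvc-nfs.yaml (the file name is an assumption):

kubectl apply -f pvc-nfs.yaml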
Verify that the status of the persistent volume claim is Bound using the kubectl describe command:
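For example:

kubectl describe pvc pvc-nfs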
kind: Pod
apiVersion: v1
metadata:
name: nginx-nfs
spec:
containers:
- image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
name: nginx-nfs
command:
- "/bin/sh"
- "-c"
- while true; do echo $(date) >> /mnt/azure/myfilepath/outfile; sleep 1; done
volumeMounts:
- name: disk01
mountPath: /mnt/azure
volumes:
- name: disk01
persistentVolumeClaim:
claimName: pvc-nfs
Verify that the pod is running by using the kubectl describe command:
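For example:

kubectl describe pod nginx-nfs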
Verify your volume has been mounted in the pod by using kubectl exec to connect to the pod, and then running df -h to check whether the volume is mounted.
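For example, to open a shell in the pod:

kubectl exec -it nginx-nfs -- sh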
/ # df -h
Filesystem Size Used Avail Use% Mounted on
...
server.myhpccache.local:/myfilepath 8.0E 0 8.0E 0% /mnt/azure/myfilepath
...
Next steps
For more information on Azure HPC Cache, see HPC Cache Overview.
For more information on using NFS with AKS, see Manually create and use an NFS (Network File System)
Linux Server volume with Azure Kubernetes Service (AKS).
Manually create and use a Linux NFS (Network File
System) Server with Azure Kubernetes Service
(AKS)
6/15/2022 • 4 minutes to read • Edit Online
Sharing data between containers is often a necessary component of container-based services and applications.
You usually have various pods that need access to the same information on an external persistent volume. While
Azure Files is an option, creating an NFS Server on an Azure VM is another form of persistent shared storage.
This article will show you how to create an NFS Server on an Azure Ubuntu virtual machine, and set up your
AKS cluster with access to this shared file system as a persistent volume.
EXPORT_DIRECTORY=${1:-/export/data}
DATA_DIRECTORY=${2:-/data}
AKS_SUBNET=${3:-*}
echo "Making new directory to be exported and linked to data directory: ${EXPORT_DIRECTORY}"
mkdir -p ${EXPORT_DIRECTORY}
parentdir="$(dirname "$EXPORT_DIRECTORY")"
echo "Giving 777 permissions to parent: ${parentdir} directory"
chmod 777 $parentdir
echo "Appending localhost and Kubernetes subnet address ${AKS_SUBNET} to exports configuration file"
echo "/export ${AKS_SUBNET}(rw,async,insecure,fsid=0,crossmnt,no_subtree_check)" >>
/etc/exports
echo "/export localhost(rw,async,insecure,fsid=0,crossmnt,no_subtree_check)" >> /etc/exports
The script initiates a restart of the NFS Server, and afterwards you can proceed with connecting to the
NFS Server from your AKS cluster.
2. After creating your Linux VM, copy the file created in the previous step from your local machine to the
VM using the following command:
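A sketch using scp; the admin user name and VM address are placeholders:

scp ~/nfs-server-setup.sh azureuser@<vm-public-ip>:~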
3. After the file is copied over, open a secure shell (SSH) connection to the VM and execute the following
command:
sudo ./nfs-server-setup.sh
If execution fails because of a permission denied error, set execution permission for all by running the
following command:
chmod +x ~/nfs-server-setup.sh
Connecting AKS cluster to NFS Server
You can connect the NFS Server to your AKS cluster by provisioning a persistent volume and persistent volume
claim that specifies how to access the volume. The two resources must be in the same virtual network or in peered virtual networks. To learn how to set up the cluster in the same VNet, see: Creating AKS Cluster in existing VNet.
Once both resources are in the same or a peered virtual network, provision a persistent volume and a persistent volume claim in your AKS cluster. The containers can then mount the NFS drive to their local directory.
1. Create a pv-azurefilesnfs.yaml file with a PersistentVolume. For example:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: NFS_NAME
  labels:
    type: nfs
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: NFS_INTERNAL_IP
    path: NFS_EXPORT_FILE_PATH
Replace the values for NFS_INTERNAL_IP , NFS_NAME and NFS_EXPORT_FILE_PATH with the actual
settings from your NFS Server.
2. Create a pvc-azurefilesnfs.yaml file with a PersistentVolumeClaim that uses the PersistentVolume. For
example:
IMPORTANT
The storageClassName value must remain an empty string, or the claim won't work.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: NFS_NAME
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: nfs
Replace the value for NFS_NAME with the actual setting from your NFS Server.
Troubleshooting
If you can't connect to the server from your AKS cluster, the issue might be that the exported directory, or its parent, doesn't have sufficient permissions on the NFS Server VM.
Check that both your export directory and its parent directory have 777 permissions.
You can check permissions by running the following command; the directories should have 'drwxrwxrwx' permissions:
ls -l
Next steps
For associated best practices, see Best practices for storage and backups in AKS.
To learn more on setting up your NFS Server or to help debug issues, see the following tutorial from the
Ubuntu community NFS Tutorial
Integrate Azure NetApp Files with Azure
Kubernetes Service
6/15/2022 • 11 minutes to read • Edit Online
A persistent volume represents a piece of storage that has been provisioned for use with Kubernetes pods. A
persistent volume can be used by one or many pods and can be dynamically or statically provisioned. This
article shows you how to create Azure NetApp Files volumes to be used by pods in an Azure Kubernetes Service
(AKS) cluster.
Azure NetApp Files is an enterprise-class, high-performance, metered file storage service running on Azure.
Kubernetes users have two options when it comes to using Azure NetApp Files volumes for Kubernetes
workloads:
Create Azure NetApp Files volumes statically: In this scenario, volumes are created outside of AKS; they are created with the Azure CLI or in the Azure portal, and are then exposed to Kubernetes by the creation of a PersistentVolume. Statically created Azure NetApp Files volumes have several limitations (for example, they can't be expanded and need to be over-provisioned) and are not recommended for most use cases.
Create Azure NetApp Files volumes on-demand , orchestrating through Kubernetes: This method is the
preferred mode of operation for creating multiple volumes directly through Kubernetes and is achieved
using Astra Trident. Astra Trident is a CSI-compliant dynamic storage orchestrator that helps provision
volumes natively through Kubernetes.
Using a CSI driver to directly consume Azure NetApp Files volumes from AKS workloads is highly
recommended for most use cases. This requirement is fulfilled using Astra Trident, an open-source dynamic
storage orchestrator for Kubernetes. Astra Trident is an enterprise-grade storage orchestrator purpose-built for
Kubernetes, fully supported by NetApp. It simplifies access to storage from Kubernetes clusters by automating
storage provisioning. You can take advantage of Astra Trident's Container Storage Interface (CSI) driver for Azure
NetApp Files to abstract underlying details and create, expand, and snapshot volumes on demand. Astra Trident also enables you to use Astra Control Service, built on top of Astra Trident, to back up, recover, move, and manage the application-data lifecycle of your AKS workloads across clusters, within and across Azure regions, to meet your business and service continuity needs.
IMPORTANT
Your AKS cluster must also be in a region that supports Azure NetApp Files.
You also need the Azure CLI version 2.0.59 or later installed and configured. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
Prerequisites
The following considerations apply when you use Azure NetApp Files:
Azure NetApp Files is only available in selected Azure regions.
After the initial deployment of an AKS cluster, you can choose to provision Azure NetApp Files volumes
statically or dynamically.
To use dynamic provisioning with Azure NetApp Files, install and configure Astra Trident version 19.07 or
later.
NOTE
This can take some time to complete.
When you create an Azure NetApp account for use with AKS, you need to create the account in the node
resource group. First, get the resource group name with the az aks show command and add the
--query nodeResourceGroup query parameter. The following example gets the node resource group for the AKS
cluster named myAKSCluster in the resource group name myResourceGroup:
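az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv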
MC_myResourceGroup_myAKSCluster_eastus
Create an Azure NetApp Files account in the node resource group and same region as your AKS cluster using az
netappfiles account create. The following example creates an account named myaccount1 in the
MC_myResourceGroup_myAKSCluster_eastus resource group and eastus region:
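az netappfiles account create \
    --resource-group MC_myResourceGroup_myAKSCluster_eastus \
    --location eastus \
    --account-name myaccount1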
Create a new capacity pool by using az netappfiles pool create. The following example creates a new capacity pool named mypool1 that is 4 TiB in size and uses the Premium service level:
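az netappfiles pool create \
    --resource-group MC_myResourceGroup_myAKSCluster_eastus \
    --location eastus \
    --account-name myaccount1 \
    --pool-name mypool1 \
    --size 4 \
    --service-level Premium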
Create a subnet to delegate to Azure NetApp Files using az network vnet subnet create. This subnet must be in
the same virtual network as your AKS cluster.
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
VNET_NAME=$(az network vnet list --resource-group $RESOURCE_GROUP --query [].name -o tsv)
VNET_ID=$(az network vnet show --resource-group $RESOURCE_GROUP --name $VNET_NAME --query "id" -o tsv)
SUBNET_NAME=MyNetAppSubnet
az network vnet subnet create \
--resource-group $RESOURCE_GROUP \
--vnet-name $VNET_NAME \
--name $SUBNET_NAME \
--delegations "Microsoft.NetApp/volumes" \
--address-prefixes 10.0.0.0/28
Volumes can either be provisioned statically or dynamically. Both options are covered in detail below.
RESOURCE_GROUP=MC_myResourceGroup_myAKSCluster_eastus
LOCATION=eastus
ANF_ACCOUNT_NAME=myaccount1
POOL_NAME=mypool1
SERVICE_LEVEL=Premium
VNET_NAME=$(az network vnet list --resource-group $RESOURCE_GROUP --query [].name -o tsv)
VNET_ID=$(az network vnet show --resource-group $RESOURCE_GROUP --name $VNET_NAME --query "id" -o tsv)
SUBNET_NAME=MyNetAppSubnet
SUBNET_ID=$(az network vnet subnet show --resource-group $RESOURCE_GROUP --vnet-name $VNET_NAME --name $SUBNET_NAME --query "id" -o tsv)
VOLUME_SIZE_GiB=100 # 100 GiB
UNIQUE_FILE_PATH="myfilepath2" # Note that file path needs to be unique within all ANF Accounts
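These variables feed the volume creation step. A hedged sketch using az netappfiles volume create follows; the volume name myvol1 is an assumption, so adjust it to your own naming:
az netappfiles volume create \
    --resource-group $RESOURCE_GROUP \
    --location $LOCATION \
    --account-name $ANF_ACCOUNT_NAME \
    --pool-name $POOL_NAME \
    --name "myvol1" \
    --service-level $SERVICE_LEVEL \
    --vnet $VNET_ID \
    --subnet $SUBNET_ID \
    --usage-threshold $VOLUME_SIZE_GiB \
    --file-path $UNIQUE_FILE_PATH \
    --protocol-types NFSv3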
Create a pv-nfs.yaml defining a PersistentVolume. Replace path with the creationToken and server with
ipAddress from the previous command. For example:
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - vers=3
  nfs:
    server: 10.0.0.4
    path: /myfilepath2
Update the server and path to the values of your NFS (Network File System) volume you created in the previous
step. Create the PersistentVolume with the kubectl apply command:
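kubectl apply -f pv-nfs.yaml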
Verify the Status of the PersistentVolume is Available using the kubectl describe command:
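kubectl describe pv pv-nfs
Next, create a file (for example, pvc-nfs.yaml) defining a PersistentVolumeClaim that binds to the PersistentVolume: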
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nfs
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
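Create the PersistentVolumeClaim with the kubectl apply command, assuming the manifest above is saved as pvc-nfs.yaml:
kubectl apply -f pvc-nfs.yaml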
Verify the Status of the PersistentVolumeClaim is Bound using the kubectl describe command:
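kubectl describe pvc pvc-nfs
Next, define a pod that uses the persistent volume claim. For example: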
kind: Pod
apiVersion: v1
metadata:
  name: nginx-nfs
spec:
  containers:
    - image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
      name: nginx-nfs
      command:
        - "/bin/sh"
        - "-c"
        - while true; do echo $(date) >> /mnt/azure/outfile; sleep 1; done
      volumeMounts:
        - name: disk01
          mountPath: /mnt/azure
  volumes:
    - name: disk01
      persistentVolumeClaim:
        claimName: pvc-nfs
Verify your volume has been mounted in the pod by using kubectl exec to connect to the pod then df -h to
check if the volume is mounted.
/ # df -h
Filesystem Size Used Avail Use% Mounted on
...
10.0.0.4:/myfilepath2 100T 384K 100T 1% /mnt/azure
...
See Deploying Trident to understand how each option works and identify the one that works best for you.
Download Astra Trident from its GitHub repository. Choose from the desired version and download the installer
bundle.
$ wget https://github.com/NetApp/trident/releases/download/v21.07.1/trident-installer-21.07.1.tar.gz
$ tar xzvf trident-installer-21.07.1.tar.gz
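A hedged example of installing the Trident operator from the extracted bundle (the manifest paths below are assumptions and can vary between releases; check the installer's documentation):
$ kubectl create -f trident-installer/deploy/bundle.yaml
$ kubectl create -f trident-installer/deploy/crds/tridentorchestrator_cr.yaml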
namespace/trident created
serviceaccount/trident-operator created
clusterrole.rbac.authorization.k8s.io/trident-operator created
clusterrolebinding.rbac.authorization.k8s.io/trident-operator created
deployment.apps/trident-operator created
podsecuritypolicy.policy/tridentoperatorpods created
tridentorchestrator.trident.netapp.io/trident created
The operator installs by using the parameters provided in the TridentOrchestrator spec. You can learn about the
configuration parameters and example backends from the extensive installation and backend guides.
Confirm Astra Trident was installed.
$ kubectl describe torc trident
Name: trident
Namespace:
Labels: <none>
Annotations: <none>
API Version: trident.netapp.io/v1
Kind: TridentOrchestrator
...
Spec:
Debug: true
Namespace: trident
Status:
Current Installation Params:
IPv6: false
Autosupport Hostname:
Autosupport Image: netapp/trident-autosupport:21.01
Autosupport Proxy:
Autosupport Serial Number:
Debug: true
Enable Node Prep: false
Image Pull Secrets:
Image Registry:
k8sTimeout: 30
Kubelet Dir: /var/lib/kubelet
Log Format: text
Silence Autosupport: false
Trident Image: netapp/trident:21.07.1
Message: Trident installed
Namespace: trident
Status: Installed
Version: v21.07.1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Installing 74s trident-operator.netapp.io Installing Trident
Normal Installed 67s trident-operator.netapp.io Trident installed
Create a backend
After Astra Trident is installed, create a backend that points to your Azure NetApp Files subscription.
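A minimal sketch, assuming the backend definition and its secret live in the provided backend-anf.yaml sample and that Trident runs in the trident namespace:
kubectl apply -f backend-anf.yaml -n trident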
secret/backend-tbc-anf-secret created
tridentbackendconfig.trident.netapp.io/backend-tbc-anf created
Before running the command, you need to update backend-anf.yaml to include details about the Azure NetApp Files subscription, such as:
subscriptionID for the Azure subscription with Azure NetApp Files enabled.
tenantID, clientID, and clientSecret from an App Registration in Azure Active Directory (AD) with sufficient permissions for the Azure NetApp Files service. The App Registration must carry the Owner or Contributor role that's predefined by Azure.
Azure location that contains at least one delegated subnet.
In addition, you can choose to provide a different service level. Azure NetApp Files provides three service levels:
Standard, Premium, and Ultra.
Create a StorageClass
A storage class is used to define how a unit of storage is dynamically created with a persistent volume. To
consume Azure NetApp Files volumes, a storage class must be created. Create a file named
anf-storageclass.yaml and copy in the manifest provided below.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-netapp-files
provisioner: csi.trident.netapp.io
parameters:
  backendType: "azure-netapp-files"
  fsType: "nfs"
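Create the storage class with the kubectl apply command and specify your anf-storageclass.yaml file:
kubectl apply -f anf-storageclass.yaml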
storageclass/azure-netapp-files created
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azure-netapp-files csi.trident.netapp.io Delete Immediate false 3s
Create a PersistentVolumeClaim
A PersistentVolumeClaim (PVC) is a request for storage by a user. Upon the creation of a PersistentVolumeClaim,
Astra Trident automatically creates an Azure NetApp Files volume and makes it available for Kubernetes
workloads to consume.
Create a file named anf-pvc.yaml and provide the following manifest. In this example, a 1-TiB volume is created
that is ReadWriteMany.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: anf-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Ti
  storageClassName: azure-netapp-files
Create the persistent volume claim with the kubectl apply command:
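kubectl apply -f anf-pvc.yaml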
persistentvolumeclaim/anf-pvc created
kind: Pod
apiVersion: v1
metadata:
  name: nginx-pod
spec:
  containers:
    - name: nginx
      image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 250m
          memory: 256Mi
      volumeMounts:
        - mountPath: "/mnt/data"
          name: volume
  volumes:
    - name: volume
      persistentVolumeClaim:
        claimName: anf-pvc
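Create the pod with the kubectl apply command, assuming the pod manifest above is saved as anf-nginx-pod.yaml:
kubectl apply -f anf-nginx-pod.yaml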
pod/nginx-pod created
Kubernetes has now created a pod with the volume mounted and accessible within the nginx container at
/mnt/data . Confirm by checking the event logs for the pod using kubectl describe :
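$ kubectl describe pod nginx-pod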
[...]
Volumes:
volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: anf-pvc
ReadOnly: false
default-token-k7952:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-k7952
Optional: false
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 15s default-scheduler Successfully assigned trident/nginx-pod to
brameshb-non-root-test
Normal SuccessfulAttachVolume 15s attachdetach-controller AttachVolume.Attach succeeded for volume
"pvc-bffa315d-3f44-4770-86eb-c922f567a075"
Normal Pulled 12s kubelet Container image
"mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine" already present on machine
Normal Created 11s kubelet Created container nginx
Normal Started 10s kubelet Started container nginx
Astra Trident supports many features with Azure NetApp Files, such as:
Expanding volumes
On-demand volume snapshots
Importing volumes
Next steps
For more information on Azure NetApp Files, see What is Azure NetApp Files.
Use Azure ultra disks on Azure Kubernetes Service
6/15/2022 • 4 minutes to read • Edit Online
Azure ultra disks offer high throughput, high IOPS, and consistent low latency disk storage for your stateful
applications. One major benefit of ultra disks is the ability to dynamically change the performance of the SSD
along with your workloads without the need to restart your agent nodes. Ultra disks are suited for data-
intensive workloads.
IMPORTANT
Azure ultra disks require node pools deployed in availability zones and regions that support these disks, and they are supported only by specific VM series. See the Ultra disks GA scope and limitations.
Limitations
See the Ultra disks GA scope and limitations
The supported size range for ultra disks is between 100 and 1,500 GiB.
If you want to create clusters without ultra disk support, you can do so by omitting the --enable-ultra-ssd parameter.
The following example adds a node pool with ultra disk support to an existing cluster:
az aks nodepool add --name ultradisk --cluster-name myAKSCluster --resource-group myResourceGroup --node-vm-size Standard_D2s_v3 --zones 1 2 --node-count 2 --enable-ultra-ssd
If you want to create new node pools without support for ultra disks, you can do so by omitting the --enable-ultra-ssd parameter.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ultra-disk-sc
provisioner: disk.csi.azure.com # replace with "kubernetes.io/azure-disk" if aks version is less than 1.21
volumeBindingMode: WaitForFirstConsumer # optional, but recommended if you want to wait until the pod that will use this disk is created
parameters:
  skuname: UltraSSD_LRS
  kind: managed
  cachingMode: None
  diskIopsReadWrite: "2000"  # minimum value: 2 IOPS/GiB
  diskMbpsReadWrite: "320"   # minimum value: 0.032/GiB
Create the storage class with the kubectl apply command and specify your azure-ultra-disk-sc.yaml file:
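kubectl apply -f azure-ultra-disk-sc.yaml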
storageclass.storage.k8s.io/ultra-disk-sc created
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ultra-disk
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ultra-disk-sc
  resources:
    requests:
      storage: 1000Gi
Create the persistent volume claim with the kubectl apply command and specify your azure-ultra-disk-pvc.yaml
file:
$ kubectl apply -f azure-ultra-disk-pvc.yaml
persistentvolumeclaim/ultra-disk created
kind: Pod
apiVersion: v1
metadata:
  name: nginx-ultra
spec:
  containers:
    - name: nginx-ultra
      image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 250m
          memory: 256Mi
      volumeMounts:
        - mountPath: "/mnt/azure"
          name: volume
  volumes:
    - name: volume
      persistentVolumeClaim:
        claimName: ultra-disk
Create the pod with the kubectl apply command, as shown in the following example:
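Assuming the pod manifest above is saved as nginx-ultra.yaml:
$ kubectl apply -f nginx-ultra.yaml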
pod/nginx-ultra created
You now have a running pod with your Azure disk mounted in the /mnt/azure directory. This configuration can
be seen when inspecting your pod via kubectl describe pod nginx-ultra , as shown in the following condensed
example:
$ kubectl describe pod nginx-ultra
[...]
Volumes:
volume:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: azure-managed-disk
ReadOnly: false
default-token-smm2n:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-smm2n
Optional: false
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned mypod to
aks-nodepool1-79590246-0
Normal SuccessfulMountVolume 2m kubelet, aks-nodepool1-79590246-0 MountVolume.SetUp succeeded for
volume "default-token-smm2n"
Normal SuccessfulMountVolume 1m kubelet, aks-nodepool1-79590246-0 MountVolume.SetUp succeeded for
volume "pvc-faf0f176-8b8d-11e8-923b-deb28c58d242"
[...]
Next steps
For more about ultra disks, see Using Azure ultra disks.
For more about storage best practices, see Best practices for storage and backups in Azure Kubernetes
Service (AKS)
Container Storage Interface (CSI) drivers on Azure
Kubernetes Service (AKS)
6/15/2022 • 5 minutes to read • Edit Online
The Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to
containerized workloads on Kubernetes. By adopting and using CSI, Azure Kubernetes Service (AKS) can write,
deploy, and iterate plug-ins to expose new or improve existing storage systems in Kubernetes without having to
touch the core Kubernetes code and wait for its release cycles.
The CSI storage driver support on AKS allows you to natively use:
Azure disks can be used to create a Kubernetes DataDisk resource. Disks can use Azure Premium Storage,
backed by high-performance SSDs, or Azure Standard Storage, backed by regular HDDs or Standard SSDs.
For most production and development workloads, use Premium Storage. Azure disks are mounted as ReadWriteOnce, which makes them available to only one node in AKS. For storage volumes that can be accessed by multiple pods simultaneously, use Azure Files.
Azure Files can be used to mount an SMB 3.0/3.1 share backed by an Azure storage account to pods. With
Azure Files, you can share data across multiple nodes and pods. Azure Files can use Azure Standard storage
backed by regular HDDs or Azure Premium storage backed by high-performance SSDs.
IMPORTANT
Starting with Kubernetes version 1.21, AKS only uses CSI drivers by default and CSI migration is enabled. Existing in-tree
persistent volumes will continue to function. However, internally Kubernetes hands control of all storage management
operations (previously targeting in-tree drivers) to CSI drivers.
In-tree drivers refers to the storage drivers that are part of the core Kubernetes code, as opposed to the CSI drivers, which are plug-ins.
NOTE
Azure disks CSI driver v2 (preview) improves scalability and reduces pod failover latency. It uses shared disks to provision
attachment replicas on multiple cluster nodes and integrates with the pod scheduler to ensure a node with an attachment
replica is chosen on pod failover. Azure disks CSI driver v2 (preview) also provides the ability to fine tune performance. If
you're interested in participating in the preview, submit a request: https://aka.ms/DiskCSIv2Preview. This preview version
is provided without a service level agreement, and you can occasionally expect breaking changes while in preview. The
preview version isn't recommended for production workloads. For more information, see Supplemental Terms of Use for
Microsoft Azure Previews.
NOTE
AKS provides the option to enable and disable the CSI drivers (preview) on new and existing clusters. CSI drivers are enabled by default on new clusters. Before running this command on an existing cluster, verify that there are no existing persistent volumes created by the Azure disks and Azure Files CSI drivers, and that there are no existing VolumeSnapshot, VolumeSnapshotClass, or VolumeSnapshotContent resources. This preview version is provided without a service level agreement, and you can occasionally expect breaking changes while in preview. The preview version isn't recommended for production workloads. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
Install the aks-preview Azure CLI
You also need the aks-preview Azure CLI extension version 0.5.78 or later. Install the aks-preview Azure CLI
extension by using the az extension add command. Or install any available updates by using the az extension
update command.
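# Install the aks-preview extension
az extension add --name aks-preview

# Update the extension to the latest available version
az extension update --name aks-preview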
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: custom-managed-premium
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
parameters:
  storageAccountType: Premium_LRS

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: custom-managed-premium
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
parameters:
  storageAccountType: Premium_LRS
The CSI storage system supports the same features as the In-tree drivers, so the only change needed would be
the provisioner.
Next steps
To use the CSI driver for Azure disks, see Use Azure disks with CSI drivers.
To use the CSI driver for Azure Files, see Use Azure Files with CSI drivers.
For more about storage best practices, see Best practices for storage and backups in Azure Kubernetes
Service.
For more information on CSI migration, see Kubernetes In-Tree to CSI Volume Migration.
Use the Azure disk Container Storage Interface
(CSI) driver in Azure Kubernetes Service (AKS)
6/15/2022 • 12 minutes to read • Edit Online
The Azure disk Container Storage Interface (CSI) driver is a CSI specification-compliant driver used by Azure
Kubernetes Service (AKS) to manage the lifecycle of Azure disks.
The CSI is a standard for exposing arbitrary block and file storage systems to containerized workloads on
Kubernetes. By adopting and using CSI, AKS can write, deploy, and iterate plug-ins to expose new or improve
existing storage systems in Kubernetes without having to touch the core Kubernetes code and wait for its
release cycles.
To create an AKS cluster with CSI driver support, see Enable CSI driver on AKS. This article describes how to use
the Azure disk CSI driver version 1.
NOTE
Azure disk CSI driver v2 (preview) improves scalability and reduces pod failover latency. It uses shared disks to provision
attachment replicas on multiple cluster nodes and integrates with the pod scheduler to ensure a node with an attachment
replica is chosen on pod failover. Azure disk CSI driver v2 (preview) also provides the ability to fine tune performance. If
you're interested in participating in the preview, submit a request: https://aka.ms/DiskCSIv2Preview. This preview version
is provided without a service level agreement, and you can occasionally expect breaking changes while in preview. The
preview version isn't recommended for production workloads. For more information, see Supplemental Terms of Use for
Microsoft Azure Previews.
NOTE
In-tree drivers refers to the current storage drivers that are part of the core Kubernetes code versus the new CSI drivers,
which are plug-ins.
The reclaim policy in both storage classes ensures that the underlying Azure disk is deleted when the respective
PV is deleted. The storage classes also configure the PVs to be expandable. You just need to edit the persistent
volume claim (PVC) with the new size.
To leverage these storage classes, create a PVC and respective pod that references and uses them. A PVC is used
to automatically provision storage based on a storage class. A PVC can use one of the pre-created storage
classes or a user-defined storage class to create an Azure-managed disk for the desired SKU and size. When you
create a pod definition, the PVC is specified to request the desired storage.
Create an example pod and respective PVC by running the kubectl apply command:
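Assuming the example manifests are saved locally as pvc-azuredisk.yaml and nginx-pod-azuredisk.yaml (the file names are assumptions; use the paths of your copies):
kubectl apply -f pvc-azuredisk.yaml
kubectl apply -f nginx-pod-azuredisk.yaml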
persistentvolumeclaim/pvc-azuredisk created
pod/nginx-azuredisk created
After the pod is in the running state, run the following command to create a new file called test.txt .
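A hedged example, assuming the example pod mounts the disk at /mnt/azuredisk (adjust the path if your manifest differs):
kubectl exec nginx-azuredisk -- touch /mnt/azuredisk/test.txt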
To validate the disk is correctly mounted, run the following command and verify you see the test.txt file in the
output:
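kubectl exec nginx-azuredisk -- ls /mnt/azuredisk  # same mount path assumption as above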
lost+found
outfile
test.txt
You can use a storage class with volumeBindingMode: Immediate, which guarantees that binding and provisioning occur as soon as the PVC is created. In cases where your node pools are topology constrained, for example when using availability zones, PVs would be bound or provisioned without knowledge of the pod's scheduling requirements (in this case, to be in a specific zone).
To address this scenario, you can use volumeBindingMode: WaitForFirstConsumer, which delays the binding and provisioning of a PV until a pod that uses the PVC is created. This way, the PV conforms to and is provisioned in the availability zone (or other topology) that's specified by the pod's scheduling constraints. The default storage classes use volumeBindingMode: WaitForFirstConsumer.
Create a file named sc-azuredisk-csi-waitforfirstconsumer.yaml, and then paste the following manifest. The storage class is the same as our managed-csi storage class, but with a different volumeBindingMode.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azuredisk-csi-waitforfirstconsumer
provisioner: disk.csi.azure.com
parameters:
  skuname: StandardSSD_LRS
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
Create the storage class by running the kubectl apply command and specify your
sc-azuredisk-csi-waitforfirstconsumer.yaml file:
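kubectl apply -f sc-azuredisk-csi-waitforfirstconsumer.yaml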
storageclass.storage.k8s.io/azuredisk-csi-waitforfirstconsumer created
Volume snapshots
The Azure disk CSI driver supports creating snapshots of persistent volumes. As part of this capability, the driver
can perform either full or incremental snapshots depending on the value set in the incremental parameter (by
default, it's true).
The following table provides details for all of the parameters.
volumesnapshotclass.snapshot.storage.k8s.io/csi-azuredisk-vsc created
Now let's create a volume snapshot from the PVC that we dynamically created at the beginning of this tutorial,
pvc-azuredisk .
volumesnapshot.snapshot.storage.k8s.io/azuredisk-volume-snapshot created
Name: azuredisk-volume-snapshot
Namespace: default
Labels: <none>
Annotations: API Version: snapshot.storage.k8s.io/v1
Kind: VolumeSnapshot
Metadata:
Creation Timestamp: 2020-08-27T05:27:58Z
Finalizers:
snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
Generation: 1
Resource Version: 714582
Self Link: /apis/snapshot.storage.k8s.io/v1/namespaces/default/volumesnapshots/azuredisk-volume-
snapshot
UID: dd953ab5-6c24-42d4-ad4a-f33180e0ef87
Spec:
Source:
Persistent Volume Claim Name: pvc-azuredisk
Volume Snapshot Class Name: csi-azuredisk-vsc
Status:
Bound Volume Snapshot Content Name: snapcontent-dd953ab5-6c24-42d4-ad4a-f33180e0ef87
Creation Time: 2020-08-31T05:27:59Z
Ready To Use: true
Restore Size: 10Gi
Events: <none>
persistentvolumeclaim/pvc-azuredisk-snapshot-restored created
pod/nginx-restored created
Finally, let's make sure it's the same PVC created before by checking the contents.
lost+found
outfile
test.txt
Clone volumes
A cloned volume is defined as a duplicate of an existing Kubernetes volume. For more information on cloning
volumes in Kubernetes, see the conceptual documentation for volume cloning.
The CSI driver for Azure disks supports volume cloning. To demonstrate, create a cloned volume of the
previously created azuredisk-pvc and a new pod to consume it.
persistentvolumeclaim/pvc-azuredisk-cloning created
pod/nginx-restored-cloning created
You can verify the content of the cloned volume by running the following command and confirming the file
test.txt is created.
lost+found
outfile
test.txt
You can request a larger volume for a PVC. Edit the PVC object, and specify a larger size. This change triggers the
expansion of the underlying volume that backs the PV.
NOTE
A new PV is never created to satisfy the claim. Instead, an existing volume is resized.
In AKS, the built-in managed-csi storage class already supports expansion, so use the PVC created earlier with
this storage class. The PVC requested a 10-Gi persistent volume. You can confirm by running the following
command:
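kubectl get pvc pvc-azuredisk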
Expand the PVC by increasing the spec.resources.requests.storage field running the following command:
$ kubectl patch pvc pvc-azuredisk --type merge --patch '{"spec": {"resources": {"requests": {"storage": "15Gi"}}}}'
persistentvolumeclaim/pvc-azuredisk patched
Run the following command to confirm the volume size has increased:
$ kubectl get pv
And after a few minutes, run the following commands to confirm the size of the PVC and inside the pod:
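kubectl get pvc pvc-azuredisk
kubectl exec -it nginx-azuredisk -- df -h  # the pod name comes from the earlier example; adjust if yours differs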
On-demand bursting
The on-demand bursting model allows a disk to burst whenever its needs exceed its current capacity. This model incurs additional charges any time the disk bursts. On-demand bursting is only available for premium SSDs larger than 512 GiB. For more information on premium SSD provisioned IOPS and throughput per disk, see Premium SSD size. Alternatively, with credit-based bursting, the disk bursts only if it has burst credits accumulated in its credit bucket. Credit-based bursting doesn't incur additional charges when the disk bursts. Credit-based bursting is only available for premium SSDs 512 GiB and smaller, and standard SSDs 1024 GiB and smaller. For more details, see On-demand bursting.
IMPORTANT
The default managed-csi-premium storage class has on-demand bursting disabled and uses credit-based bursting. Any
premium SSD dynamically created by a persistent volume claim based on the default managed-csi-premium storage
class also has on-demand bursting disabled.
To create a premium SSD persistent volume with on-demand bursting enabled, you can create a new storage class with the enableBursting parameter set to true, as shown in the following YAML template. For more details on enabling on-demand bursting, see On-demand bursting. For more details on building your own storage class with on-demand bursting enabled, see Create a Burstable Managed CSI Premium Storage Class.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: burstable-managed-csi-premium
provisioner: disk.csi.azure.com
parameters:
  skuname: Premium_LRS
  enableBursting: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Windows containers
The Azure disk CSI driver supports Windows nodes and containers. If you want to use Windows containers,
follow the Windows containers quickstart to add a Windows node pool.
After you have a Windows node pool, you can now use the built-in storage classes like managed-csi . You can
deploy an example Windows-based stateful set that saves timestamps into the file data.txt by running the
following kubectl apply command:
statefulset.apps/busybox-azuredisk created
2020-08-27 08:13:41Z
2020-08-27 08:13:42Z
2020-08-27 08:13:44Z
(...)
Next steps
To learn how to use CSI driver for Azure Files, see Use Azure Files with CSI driver.
For more information about storage best practices, see Best practices for storage and backups in Azure
Kubernetes Service.
Use Azure Files Container Storage Interface (CSI)
drivers in Azure Kubernetes Service (AKS)
6/15/2022 • 9 minutes to read • Edit Online
The Azure Files Container Storage Interface (CSI) driver is a CSI specification-compliant driver used by Azure
Kubernetes Service (AKS) to manage the lifecycle of Azure Files shares.
The CSI is a standard for exposing arbitrary block and file storage systems to containerized workloads on
Kubernetes. By adopting and using CSI, AKS now can write, deploy, and iterate plug-ins to expose new or
improve existing storage systems in Kubernetes. Using CSI drivers in AKS avoids having to touch the core
Kubernetes code and wait for its release cycles.
To create an AKS cluster with CSI drivers support, see Enable CSI drivers on AKS.
NOTE
In-tree drivers refers to the current storage drivers that are part of the core Kubernetes code versus the new CSI drivers,
which are plug-ins.
NOTE
Azure Files supports Azure Premium Storage. The minimum premium file share is 100 GB.
When you use storage CSI drivers on AKS, there are two more built-in StorageClasses that use the Azure Files
CSI storage drivers. The other CSI storage classes are created with the cluster alongside the in-tree default
storage classes.
azurefile-csi : Uses Azure Standard Storage to create an Azure Files share.
azurefile-csi-premium : Uses Azure Premium Storage to create an Azure Files share.
The reclaim policy on both storage classes ensures that the underlying Azure Files share is deleted when the respective PV is deleted. The storage classes also configure the file shares to be expandable; you just need to edit the persistent volume claim (PVC) with the new size.
To use these storage classes, create a PVC and respective pod that references and uses them. A PVC is used to
automatically provision storage based on a storage class. A PVC can use one of the pre-created storage classes
or a user-defined storage class to create an Azure Files share for the desired SKU and size. When you create a
pod definition, the PVC is specified to request the desired storage.
Create an example PVC and pod that prints the current date into an outfile by running the kubectl apply
commands:
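Assuming the example manifests are saved locally as pvc-azurefile.yaml and nginx-pod-azurefile.yaml (the file names are assumptions; use the paths of your copies):
kubectl apply -f pvc-azurefile.yaml
kubectl apply -f nginx-pod-azurefile.yaml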
persistentvolumeclaim/pvc-azurefile created
pod/nginx-azurefile created
After the pod is in the running state, you can validate that the file share is correctly mounted by running the
following command and verifying the output contains the outfile :
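A hedged example, assuming the example pod mounts the share at /mnt/azurefile (adjust the path if your manifest differs):
kubectl exec nginx-azurefile -- ls -l /mnt/azurefile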
total 29
-rwxrwxrwx 1 root root 29348 Aug 31 21:59 outfile
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-azurefile
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
mountOptions:
  - dir_mode=0640
  - file_mode=0640
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict # https://linux.die.net/man/8/mount.cifs
  - nosharesock
parameters:
  skuName: Standard_LRS
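Create the storage class with the kubectl apply command, assuming the manifest above is saved as azure-file-sc.yaml:
kubectl apply -f azure-file-sc.yaml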
storageclass.storage.k8s.io/my-azurefile created
The Azure Files CSI driver supports creating snapshots of persistent volumes and the underlying file shares.
NOTE
This driver only supports snapshot creation; restoring from a snapshot is not supported by this driver. Snapshots can be restored from the Azure portal or the Azure CLI. For more information about creating and restoring a snapshot, see Overview of share snapshots for Azure Files.
volumesnapshotclass.snapshot.storage.k8s.io/csi-azurefile-vsc created
Create a volume snapshot from the PVC we dynamically created at the beginning of this tutorial, pvc-azurefile .
volumesnapshot.snapshot.storage.k8s.io/azurefile-volume-snapshot created
Verify the snapshot was created correctly by running the following command:
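kubectl describe volumesnapshot azurefile-volume-snapshot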
Name: azurefile-volume-snapshot
Namespace: default
Labels: <none>
Annotations: API Version: snapshot.storage.k8s.io/v1beta1
Kind: VolumeSnapshot
Metadata:
Creation Timestamp: 2020-08-27T22:37:41Z
Finalizers:
snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
Generation: 1
Resource Version: 955091
Self Link: /apis/snapshot.storage.k8s.io/v1beta1/namespaces/default/volumesnapshots/azurefile-
volume-snapshot
UID: c359a38f-35c1-4fb1-9da9-2c06d35ca0f4
Spec:
Source:
Persistent Volume Claim Name: pvc-azurefile
Volume Snapshot Class Name: csi-azurefile-vsc
Status:
Bound Volume Snapshot Content Name: snapcontent-c359a38f-35c1-4fb1-9da9-2c06d35ca0f4
Ready To Use: false
Events: <none>
NOTE
A new PV is never created to satisfy the claim. Instead, an existing volume is resized.
In AKS, the built-in azurefile-csi storage class already supports expansion, so use the PVC created earlier with
this storage class. The PVC requested a 100Gi file share. We can confirm that by running:
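kubectl get pvc pvc-azurefile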
kubectl patch pvc pvc-azurefile --type merge --patch '{"spec": {"resources": {"requests": {"storage": "200Gi"}}}}'
persistentvolumeclaim/pvc-azurefile patched
Verify that both the PVC and the file system inside the pod show the new size:
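kubectl get pvc pvc-azurefile
kubectl exec -it nginx-azurefile -- df -h  # the pod name comes from the earlier example; adjust if yours differs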
Create a file named private-azure-file-sc.yaml, and then paste the following example manifest in the file. Replace
the values for <resourceGroup> and <storageAccountName> .
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: private-azurefile-csi
provisioner: file.csi.azure.com
allowVolumeExpansion: true
parameters:
  resourceGroup: <resourceGroup>
  storageAccount: <storageAccountName>
  server: <storageAccountName>.privatelink.file.core.windows.net
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict # https://linux.die.net/man/8/mount.cifs
  - nosharesock # reduce probability of reconnect race
  - actimeo=30 # reduce latency for metadata-heavy workload
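Create the storage class with the kubectl apply command and specify your private-azure-file-sc.yaml file:
kubectl apply -f private-azure-file-sc.yaml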
storageclass.storage.k8s.io/private-azurefile-csi created
Create a file named private-pvc.yaml, and then paste the following example manifest in the file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: private-azurefile-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: private-azurefile-csi
  resources:
    requests:
      storage: 100Gi
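Create the persistent volume claim with the kubectl apply command and specify your private-pvc.yaml file:
kubectl apply -f private-pvc.yaml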
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-csi-nfs
provisioner: file.csi.azure.com
allowVolumeExpansion: true
parameters:
  protocol: nfs
mountOptions:
  - nconnect=8
After editing and saving the file, create the storage class with the kubectl apply command:
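Assuming the manifest above is saved as azurefile-csi-nfs.yaml:
kubectl apply -f azurefile-csi-nfs.yaml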
storageclass.storage.k8s.io/azurefile-csi-nfs created
statefulset.apps/statefulset-azurefile created
Windows containers
The Azure Files CSI driver also supports Windows nodes and containers. To use Windows containers, follow the
Windows containers quickstart to add a Windows node pool.
After you have a Windows node pool, use the built-in storage classes like azurefile-csi or create a custom one.
You can deploy an example Windows-based stateful set that saves timestamps into a file data.txt by running
the kubectl apply command:
statefulset.apps/busybox-azurefile created
Validate the contents of the volume by running the following kubectl exec command:
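A hedged example, assuming the first replica is named busybox-azurefile-0 and the share is mounted at /mnt/azurefile (adjust the pod name and path to your manifest):
kubectl exec -it busybox-azurefile-0 -- cat /mnt/azurefile/data.txt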
2020-08-27 22:11:01Z
2020-08-27 22:11:02Z
2020-08-27 22:11:04Z
(...)
Next steps
To learn how to use CSI drivers for Azure disks, see Use Azure disks with CSI drivers.
For more about storage best practices, see Best practices for storage and backups in Azure Kubernetes
Service.
Monitoring Azure Kubernetes Service (AKS) with
Azure Monitor
6/15/2022 • 15 minutes to read • Edit Online
This scenario describes how to use Azure Monitor to monitor the health and performance of Azure Kubernetes
Service (AKS). It includes collection of telemetry critical for monitoring, analysis and visualization of collected
data to identify trends, and how to configure alerting to be proactively notified of critical issues.
The Cloud Monitoring Guide defines the primary monitoring objectives you should focus on for your Azure
resources. This scenario focuses on Health and Status monitoring using Azure Monitor.
NOTE
Azure Monitor was designed to monitor the availability and performance of cloud resources. While the operational data
stored in Azure Monitor may be useful for investigating security incidents, other services in Azure were designed to
monitor security. Security monitoring for AKS is done with Microsoft Sentinel and Microsoft Defender for Cloud. See
Monitor virtual machines with Azure Monitor - Security monitoring for a description of the security monitoring tools in
Azure and their relationship to Azure Monitor.
For information on using the security services to monitor AKS, see Microsoft Defender for Kubernetes - the benefits and
features and Connect Azure Kubernetes Service (AKS) diagnostics logs to Microsoft Sentinel.
Container insights
AKS generates platform metrics and resource logs, like any other Azure resource, that you can use to monitor its
basic health and performance. Enable Container insights to expand on this monitoring. Container insights is a
feature in Azure Monitor that monitors the health and performance of managed Kubernetes clusters hosted on
AKS in addition to other cluster configurations. Container insights provides interactive views and workbooks
that analyze collected data for a variety of monitoring scenarios.
Prometheus and Grafana are widely popular, CNCF-backed open-source tools for Kubernetes monitoring. AKS exposes many metrics in Prometheus format, which makes Prometheus a popular choice for monitoring. Container insights has native integration with AKS, collecting critical metrics and logs, alerting on identified issues, and providing visualization with workbooks. It also collects certain Prometheus metrics, and many native Azure Monitor insights are built on top of Prometheus metrics. Container insights complements and completes end-to-end monitoring of AKS, including log collection, which Prometheus as a stand-alone tool doesn't provide. Many customers use the Prometheus integration and Azure Monitor together for end-to-end monitoring.
Learn more about using Container insights at Container insights overview. Monitor layers of AKS with Container
insights below introduces various features of Container insights and the monitoring scenarios that they support.
Configure monitoring
The following sections describe the steps required to configure full monitoring of your AKS cluster using Azure
Monitor.
Create Log Analytics workspace
You require at least one Log Analytics workspace to support Container insights and to collect and analyze other
telemetry about your AKS cluster. There is no cost for the workspace, but you do incur ingestion and retention
costs when you collect data. See Azure Monitor Logs pricing details for details.
If you're just getting started with Azure Monitor, then start with a single workspace and consider creating
additional workspaces as your requirements evolve. Many environments will use a single workspace for all the
Azure resources they monitor. You can even share a workspace used by Microsoft Defender for Cloud and
Microsoft Sentinel, although many customers choose to segregate their availability and performance telemetry
from security data.
See Designing your Azure Monitor Logs deployment for details on logic that you should consider for designing
a workspace configuration.
Enable container insights
When you enable Container insights for your AKS cluster, it deploys a containerized version of the Log Analytics
agent that sends data to Azure Monitor. There are multiple methods to enable it depending whether you're
working with a new or existing AKS cluster. See Enable Container insights for prerequisites and configuration
options.
Configure collection from Prometheus
Container insights allows you to collect certain Prometheus metrics in your Log Analytics workspace without
requiring a Prometheus server. You can analyze this data using Azure Monitor features along with other data
collected by Container insights. See Configure scraping of Prometheus metrics with Container insights for
details on this configuration.
Collect resource logs
The logs for AKS control plane components are implemented in Azure as resource logs. Container insights
doesn't currently use these logs, so you do need to create your own log queries to view and analyze them. See
How to query logs from Container insights for details on the structure of these logs and how to write queries for
them.
You need to create a diagnostic setting to collect resource logs. Create multiple diagnostic settings to send
different sets of logs to different locations. See Create diagnostic settings to send platform logs and metrics to
different destinations to create diagnostic settings for your AKS cluster.
There is a cost for sending resource logs to a workspace, so you should only collect the log categories that you intend to use. Send logs to an Azure storage account to reduce costs if you need to retain the information but don't require it to be readily available for analysis. See Resource logs for a description of the categories that are available for AKS, and see Azure Monitor Logs pricing details for details on the cost of ingesting and retaining log data. Start by collecting a minimal number of categories, and then modify the diagnostic setting to collect additional categories as your needs increase and as you understand your associated costs.
If you're unsure about which resource logs to initially enable, use the recommendations in the following table
which are based on the most common customer requirements. Enable the other categories if you later find that
you require this information.
CATEGORY ENABLE? DESTINATION
kube-scheduler Disable
Metrics: Open metrics explorer with the scope set to the current cluster.
Diagnostic settings: Create diagnostic settings for the cluster to collect resource logs.
Logs: Open Log Analytics with the scope set to the current cluster to analyze log data and access prebuilt queries.
Use existing views and reports in Container Insights to monitor cluster level components. The Cluster view
gives you a quick view of the performance of the nodes in your cluster including their CPU and memory
utilization. Use the Nodes view to view the health of each node in addition to the health and performance of the
pods running on each. See Monitor your Kubernetes cluster performance with Container insights for details on
using this view and analyzing node health and performance.
Use Node workbooks in Container Insights to analyze disk capacity and IO in addition to GPU usage. See Node workbooks for a description of these workbooks.
For troubleshooting scenarios, you may need to access the AKS nodes directly for maintenance or immediate log collection. For security purposes, the AKS nodes aren't exposed to the internet, but you can use kubectl debug to SSH to the AKS nodes. See Connect with SSH to Azure Kubernetes Service (AKS) cluster nodes for maintenance or troubleshooting for details on this process.
Level 2 - Managed AKS components
Managed AKS level includes the following components.
COMPONENT MONITORING
API Server: Monitor the status of the API server, identifying any increase in request load and bottlenecks if the service is down.
Azure Monitor and Container insights don't yet provide full monitoring for the API server. You can use metrics explorer to view the Inflight Requests counter, but you should refer to metrics in Prometheus for a complete view of API server performance. This includes such values as request latency and workqueue processing time. A Grafana dashboard that provides views of the critical metrics for the API server is available at Grafana Labs. Use this dashboard on your existing Grafana server, or set up a new Grafana server in Azure by using Monitor your Azure services in Grafana.
Use the Kubelet workbook to view the health and performance of each kubelet. See Resource Monitoring workbooks for details on these workbooks. For troubleshooting scenarios, you can access kubelet logs using the process described at Get kubelet logs from Azure Kubernetes Service (AKS) cluster nodes.
Resource logs
Use log queries with resource logs to analyze control plane logs generated by AKS components.
Level 3 - Kubernetes objects and workloads
Kubernetes objects and workloads level include the following components.
Use existing views and reports in Container Insights to monitor containers and pods. Use the Nodes and
Controllers views to view the health and performance of the pods running on them and drill down to the
health and performance of their containers. View the health and performance for containers directly from the
Containers view. See Monitor your Kubernetes cluster performance with Container insights for details on using
this view and analyzing container health and performance.
Use the Deployment workbook in Container insights to view metrics collected for deployments. See
Deployment & HPA metrics with Container insights for details.
NOTE
Deployments view in Container insights is currently in public preview.
Live data
In troubleshooting scenarios, Container insights provides access to live AKS container logs (stdout/stderr), events, and pod metrics. See How to view Kubernetes logs, events, and pod metrics in real-time for details on using this feature.
Level 4 - Applications
The application level includes the application workloads running in the AKS cluster.
Application Insights provides complete monitoring of applications running on AKS and other environments. If
you have a Java application, you can provide monitoring without instrumenting your code following Zero
instrumentation application monitoring for Kubernetes - Azure Monitor Application Insights. For complete
monitoring though, you should configure code-based monitoring depending on your application.
ASP.NET Applications
ASP.NET Core Applications
.NET Console Applications
Java
Node.js
Python
Other platforms
See What is Application Insights?
Level 5 - External components
Components external to AKS include the following.
Monitor external components such as Service Mesh, Ingress, Egress with Prometheus and Grafana or other
proprietary tools. Monitor databases and other Azure resources using other features of Azure Monitor.
In addition to Container insights data, you can use log queries to analyze resource logs from AKS. For a list of
the log categories available, see AKS data reference resource logs. You must create a diagnostic setting to collect
each category as described in Configure monitoring before that data will be collected.
Alerts
Alerts in Azure Monitor proactively notify you of interesting data and patterns in your monitoring data. They
allow you to identify and address issues in your system before your customers notice them. There are no
preconfigured alert rules for AKS clusters, but you can create your own based on data collected by Container
insights.
IMPORTANT
Most alert rules have a cost that's dependent on the type of rule, how many dimensions it includes, and how frequently it's run. Refer to Alert rules in Azure Monitor pricing before you create any alert rules.
Next steps
See Monitoring AKS data reference for a reference of the metrics, logs, and other important values created
by AKS.
Monitoring AKS data reference
6/15/2022 • 3 minutes to read • Edit Online
See Monitoring AKS for details on collecting and analyzing monitoring data for AKS.
Metrics
The following table lists the platform metrics collected for AKS. Follow each link for a detailed list of the metrics
for each particular type.
For more information, see a list of all platform metrics supported in Azure Monitor.
Metric dimensions
The following table lists dimensions for AKS metrics.
Resource logs
The following table lists the resource log categories you can collect for AKS. These are the logs for AKS control
plane components. See Configure monitoring for information on creating a diagnostic setting to collect these
logs and recommendations on which to enable. See How to query logs from Container insights for query
examples.
For reference, see a list of all resource logs category types supported in Azure Monitor.
guard: Managed Azure Active Directory and Azure RBAC audits. For managed Azure AD, this includes token in and user info out. For Azure RBAC, this includes access reviews in and out.
kube-audit: Audit log data for every audit event including get, list, create, update, delete, patch, and post.
RESOURCE TYPE NOTES
Kubernetes services: Follow this link for a list of all tables used by AKS and a description of their structure.
For a reference of all Azure Monitor Logs / Log Analytics tables, see the Azure Monitor Log Table Reference.
Activity log
The following table lists a few example operations related to AKS that may be created in the Activity log. Use the
Activity log to track information such as when a cluster is created or had its configuration change. You can either
view this information in the portal or create an Activity log alert to be proactively notified when an event occurs.
For a complete list of possible log entries, see Microsoft.ContainerService Resource Provider options.
For more information on the schema of Activity Log entries, see Activity Log schema.
See also
See Monitoring Azure AKS for a description of monitoring Azure AKS.
See Monitoring Azure resources with Azure Monitor for details on monitoring Azure resources.
Get kubelet logs from Azure Kubernetes Service
(AKS) cluster nodes
6/15/2022 • 2 minutes to read • Edit Online
As part of operating an AKS cluster, you may need to review logs to troubleshoot a problem. Built-in to the
Azure portal is the ability to view logs for the AKS master components or containers in an AKS cluster.
Occasionally, you may need to get kubelet logs from an AKS node for troubleshooting purposes.
This article shows you how you can use journalctl to view the kubelet logs on an AKS node.
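A hedged example of opening a privileged session on a node with kubectl debug; the node name below is a placeholder (list yours with kubectl get nodes), and any small image you trust can be substituted:
kubectl get nodes -o wide
kubectl debug node/aks-nodepool1-12345678-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0
Once the debug pod is running, execute the following commands from within that session.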
chroot /host
journalctl -u kubelet -o cat
NOTE
You don't need to use sudo journalctl since you are already root on the node.
NOTE
For Windows nodes, the log data is in C:\k and can be viewed using the more command:
more C:\k\kubelet.log
Next steps
If you need additional troubleshooting information from the Kubernetes master, see view Kubernetes master
node logs in AKS.
How to view Kubernetes logs, events, and pod
metrics in real-time
6/15/2022 • 6 minutes to read • Edit Online
Container insights includes the Live Data feature, an advanced diagnostic feature that gives you direct access to your Azure Kubernetes Service (AKS) container logs (stdout/stderr), events, and pod metrics. It exposes direct access to kubectl logs -c, kubectl get events, and kubectl top pods. A console pane shows the logs, events, and metrics generated by the container engine to further assist in troubleshooting issues in real time.
This article provides a detailed overview and helps you understand how to use this feature.
For help setting up or troubleshooting the Live Data feature, review our setup guide. This feature directly
accesses the Kubernetes API, and additional information about the authentication model can be found here.
View logs
You can view real-time log data as they are generated by the container engine from the Nodes , Controllers ,
and Containers view. To view log data, perform the following steps.
1. In the Azure portal, browse to the AKS cluster resource group and select your AKS resource.
2. On the AKS cluster dashboard, under Monitoring on the left-hand side, choose Insights .
3. Select either the Nodes , Controllers , or Containers tab.
4. Select an object from the performance grid, and on the properties pane found on the right side, select
View live data option. If the AKS cluster is configured with single sign-on using Azure AD, you are
prompted to authenticate on first use during that browser session. Select your account and complete
authentication with Azure.
NOTE
When viewing the data from your Log Analytics workspace by selecting the View in analytics option from the
properties pane, the log search results will potentially show Nodes , Daemon Sets , Replica Sets , Jobs , Cron
Jobs , Pods , and Containers which may no longer exist. Attempting to search logs for a container which isn't
available in kubectl will also fail here. Review the How to query logs from Container insights feature to learn
more about viewing historical logs, events and metrics.
After successfully authenticating, the Live Data console pane will appear below the performance data grid where
you can view log data in a continuous stream. If the fetch status indicator shows a green check mark, which is on
the far right of the pane, it means data can be retrieved and it begins streaming to your console.
The pane title shows the name of the pod the container is grouped with.
View events
You can view real-time event data as it's generated by the container engine from the Nodes, Controllers,
Containers, and Deployments views when a container, pod, node, ReplicaSet, DaemonSet, job, CronJob, or
Deployment is selected. To view events, perform the following steps.
1. In the Azure portal, browse to the AKS cluster resource group and select your AKS resource.
2. On the AKS cluster dashboard, under Monitoring on the left-hand side, choose Insights .
3. Select either the Nodes , Controllers , Containers , or Deployments tab.
4. Select an object from the performance grid, and on the properties pane found on the right side, select
View live data option. If the AKS cluster is configured with single sign-on using Azure AD, you are
prompted to authenticate on first use during that browser session. Select your account and complete
authentication with Azure.
NOTE
When viewing the data from your Log Analytics workspace by selecting the View in analytics option from the
properties pane, the log search results will potentially show Nodes , Daemon Sets , Replica Sets , Jobs , Cron
Jobs , Pods , and Containers which may no longer exist. Attempting to search logs for a container which isn't
available in kubectl will also fail here. Review the How to query logs from Container insights feature to learn
more about viewing historical logs, events and metrics.
After successfully authenticating, the Live Data console pane appears below the performance data grid. A green
check mark in the fetch status indicator, on the far right of the pane, means data can be retrieved and is
streaming to your console.
If the object you selected was a container, select the Events option in the pane. If you selected a Node, Pod, or
controller, viewing events is automatically selected.
The pane title shows the name of the Pod the container is grouped with.
Filter events
While viewing events, you can additionally limit the results using the Filter pill found to the right of the search
bar. Depending on what resource you have selected, the pill lists a Pod, Namespace, or cluster to choose from.
View metrics
You can view real-time metric data as it's generated by the container engine from the Nodes or
Controllers view only when a Pod is selected. To view metrics, perform the following steps.
1. In the Azure portal, browse to the AKS cluster resource group and select your AKS resource.
2. On the AKS cluster dashboard, under Monitoring on the left-hand side, choose Insights .
3. Select either the Nodes or Controllers tab.
4. Select a Pod object from the performance grid, and on the properties pane found on the right side, select
View live data option. If the AKS cluster is configured with single sign-on using Azure AD, you are
prompted to authenticate on first use during that browser session. Select your account and complete
authentication with Azure.
NOTE
When viewing the data from your Log Analytics workspace by selecting the View in analytics option from the
properties pane, the log search results will potentially show Nodes , Daemon Sets , Replica Sets , Jobs , Cron
Jobs , Pods , and Containers which may no longer exist. Attempting to search logs for a container which isn't
available in kubectl will also fail here. Review How to query logs from Container insights to learn more about
viewing historical logs, events and metrics.
After successfully authenticating, the Live Data console pane will appear below the performance data grid.
Metric data is retrieved and begins streaming to your console for presentation in the two charts. The pane title
shows the name of the pod the container is grouped with.
IMPORTANT
No data is stored permanently during operation of this feature. All information captured during the session is deleted
when you close your browser or navigate away from it. Data remains available for visualization only within the five-minute
window of the metrics feature; any metrics older than five minutes are deleted. The Live Data console buffers query
results within reasonable memory usage limits.
Next steps
To continue learning how to use Azure Monitor and monitor other aspects of your AKS cluster, see View
Azure Kubernetes Service health.
View How to query logs from Container insights to see predefined queries and examples to create alerts,
visualizations, or perform further analysis of your clusters.
Connect with RDP to Azure Kubernetes Service
(AKS) cluster Windows Server nodes for
maintenance or troubleshooting
6/15/2022 • 5 minutes to read • Edit Online
Throughout the lifecycle of your Azure Kubernetes Service (AKS) cluster, you may need to access an AKS
Windows Server node. This access could be for maintenance, log collection, or other troubleshooting operations.
You can access the AKS Windows Server nodes using RDP. Alternatively, if you want to use SSH to access the
AKS Windows Server nodes and you have access to the same keypair that was used during cluster creation, you
can follow the steps in SSH into Azure Kubernetes Service (AKS) cluster nodes. For security purposes, the AKS
nodes are not exposed to the internet.
This article shows you how to create an RDP connection with an AKS node using their private IP addresses.
If you need to reset both the username and password, see Reset Remote Desktop Services or its administrator
password in a Windows VM .
You also need the Azure CLI version 2.0.61 or later installed and configured. Run az --version to find the
version. If you need to install or upgrade, see Install Azure CLI.
az vm create \
--resource-group myResourceGroup \
--name myVM \
--image win2019datacenter \
--admin-username azureuser \
--admin-password myP@ssw0rd12 \
--subnet $SUBNET_ID \
--query publicIpAddress -o tsv
The following example output shows the VM has been successfully created and displays the public IP address of
the virtual machine.
13.62.204.18
Record the public IP address of the virtual machine. You will use this address in a later step.
NOTE
The NSGs are controlled by the AKS service. Any change you make to the NSG will be overwritten at any time by the
control plane.
First, get the resource group name and the name of the NSG to add the rule to:
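The lookup commands aren't reproduced in this extract. A minimal sketch, assuming the NSG lives in the AKS-managed node resource group and that the resource group and cluster names used earlier in this article apply:

CLUSTER_RG=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
NSG_NAME=$(az network nsg list --resource-group $CLUSTER_RG --query "[].name" -o tsv)

Then add a temporary rule that allows RDP (port 3389) traffic to the nodes: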
az network nsg rule create --name tempRDPAccess --resource-group $CLUSTER_RG --nsg-name $NSG_NAME --priority 100 --destination-port-range 3389 --protocol Tcp --description "Temporary RDP access to Windows nodes"
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
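For example, assuming the resource group and cluster names used earlier in this article:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

# List the nodes and their internal IP addresses
kubectl get nodes -o wide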
The following example output shows the internal IP addresses of all the nodes in the cluster, including the Windows
Server nodes.
Record the internal IP address of the Windows Server node you wish to troubleshoot. You will use this address
in a later step.
After you've connected to your virtual machine, connect to the internal IP address of the Windows Server node
you want to troubleshoot using an RDP client from within your virtual machine.
You are now connected to your Windows Server node.
You can now run any troubleshooting commands in the cmd window. Since Windows Server nodes use
Windows Server Core, there's not a full GUI or other GUI tools when you connect to a Windows Server node
over RDP.
When you're finished, remove the temporary NSG rule to block RDP access to the nodes again:
az network nsg rule delete --resource-group $CLUSTER_RG --nsg-name $NSG_NAME --name tempRDPAccess
Next steps
If you need additional troubleshooting data, you can view the Kubernetes master node logs or Azure Monitor.
Use Windows HostProcess containers
6/15/2022 • 2 minutes to read • Edit Online
HostProcess / Privileged containers extend the Windows container model to enable a wider range of Kubernetes
cluster management scenarios. HostProcess containers run directly on the host and maintain behavior and
access similar to that of a regular process. HostProcess containers allow users to package and distribute
management operations and functionalities that require host access while retaining versioning and deployment
methods provided by containers.
A privileged DaemonSet can carry out changes or monitor a Linux host on Kubernetes but not Windows hosts.
HostProcess containers are the Windows equivalent of host elevation.
Limitations
HostProcess containers require Kubernetes 1.23 or greater.
HostProcess containers require a containerd 1.6 or later container runtime.
HostProcess pods can only contain HostProcess containers. This is a current limitation of the Windows
operating system. Non-privileged Windows containers can't share a vNIC with the host IP namespace.
HostProcess containers run as a process on the host. The only isolation those containers have from the host
is the resource constraints imposed on the HostProcess user account.
Filesystem isolation and Hyper-V isolation aren't supported for HostProcess containers.
Volume mounts are supported and are mounted under the container volume. See Volume Mounts.
A limited set of host user accounts are available for Host Process containers by default. See Choosing a User
Account.
Resource limits such as disk, memory, and CPU count work in the same fashion as they do for processes on the
host.
Named pipe mounts and Unix domain sockets are not directly supported, but can be accessed on their host
path, for example \\.\pipe\* .
spec:
  ...
  containers:
    ...
    securityContext:
      privileged: true
      windowsOptions:
        hostProcess: true
  ...
  hostNetwork: true
  ...
To run an example workload that uses HostProcess features on an existing AKS cluster with Windows nodes,
create hostprocess.yaml with the following:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: privileged-daemonset
  namespace: kube-system
  labels:
    app: privileged-daemonset
spec:
  selector:
    matchLabels:
      app: privileged-daemonset
  template:
    metadata:
      labels:
        app: privileged-daemonset
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      containers:
        - name: powershell
          image: mcr.microsoft.com/powershell:lts-nanoserver-1809
          securityContext:
            privileged: true
            windowsOptions:
              hostProcess: true
              runAsUserName: "NT AUTHORITY\\SYSTEM"
          command:
            - pwsh.exe
            - -command
            - |
              $AdminRights = ([Security.Principal.WindowsPrincipal][Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole]"Administrator")
              Write-Host "Process has admin rights: $AdminRights"
              while ($true) { Start-Sleep -Seconds 2147483 }
      hostNetwork: true
      terminationGracePeriodSeconds: 0
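Deploy the DaemonSet with kubectl apply, using the file created above:

kubectl apply -f hostprocess.yaml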
You can verify that your workload uses HostProcess features by viewing the pod's logs.
Use kubectl to find the name of the pod in the kube-system namespace.
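For example, filtering on the label defined in the manifest above:

kubectl get pods --namespace kube-system --selector app=privileged-daemonset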
Use kubectl logs to view the logs of the pod and verify the pod has administrator rights:
$ kubectl logs privileged-daemonset-12345 --namespace kube-system
InvalidOperation: Unable to find type [Security.Principal.WindowsPrincipal].
Process has admin rights:
Next steps
For more details on HostProcess containers and Microsoft's contribution to Kubernetes upstream, see the Alpha
in v1.22: Windows HostProcess Containers.
Frequently asked questions for Windows Server
node pools in AKS
6/15/2022 • 9 minutes to read • Edit Online
In Azure Kubernetes Service (AKS), you can create a node pool that runs Windows Server as the guest OS on the
nodes. These nodes can run native Windows container applications, such as those built on the .NET Framework.
There are differences in how the Linux and Windows OS provides container support. Some common Linux
Kubernetes and pod-related features are not currently available for Windows node pools.
This article outlines some of the frequently asked questions and OS concepts for Windows Server nodes in AKS.
NOTE
The updated Windows Server image will only be used if a cluster upgrade (control plane upgrade) has been performed
prior to upgrading the node pool.
az aks update \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--windows-admin-password $NEW_PW
IMPORTANT
Performing the az aks update operation upgrades only Windows Server node pools. Linux node pools are not affected.
When you're changing --windows-admin-password , the new password must be at least 14 characters and meet
Windows Server password requirements.
Can I use Azure Monitor for containers with Windows nodes and
containers?
Yes, you can. However, Azure Monitor is in public preview for gathering logs (stdout, stderr) and metrics from
Windows containers. You can also attach to the live stream of stdout logs from a Windows container.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--load-balancer-sku Standard \
--windows-admin-password 'Password1234$' \
--windows-admin-username azure \
--network-plugin azure \
--enable-ahub
To use Azure Hybrid Benefit on an existing AKS cluster, run the az aks update command and update the
cluster by using the --enable-ahub argument.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-ahub
To check if Azure Hybrid Benefit is set on the Windows nodes in the cluster, run the az vmss show command
with the --name and --resource-group arguments to query the virtual machine scale set. To identify the
resource group the scale set for the Windows node pool is created in, you can run the az vmss list -o table
command.
If the Windows nodes in the scale set have Azure Hybrid Benefit enabled, the output of az vmss show will be
similar to the following:
""hardwareProfile": null,
"licenseType": "Windows_Server",
"networkProfile": {
"healthProbe": null,
"networkApiVersion": null,
In the running container, use Set-TimeZone to set the time zone of the running container. For example:
To see the current time zone of the running container or an available list of time zones, use Get-TimeZone.
Next steps
To get started with Windows Server containers in AKS, see Create a node pool that runs Windows Server in AKS.
Install existing applications with Helm in Azure
Kubernetes Service (AKS)
6/15/2022 • 5 minutes to read • Edit Online
Helm is an open-source packaging tool that helps you install and manage the lifecycle of Kubernetes
applications. Similar to Linux package managers such as APT and Yum, Helm is used to manage Kubernetes
charts, which are packages of preconfigured Kubernetes resources.
This article shows you how to configure and use Helm in a Kubernetes cluster on AKS.
IMPORTANT
Helm is intended to run on Linux nodes. If you have Windows Server nodes in your cluster, you must ensure that Helm
pods are only scheduled to run on Linux nodes. You also need to ensure that any Helm charts you install are also
scheduled to run on the correct nodes. The commands in this article use node selectors to make
sure pods are scheduled to the correct nodes, but not all Helm charts may expose a node selector. You can also consider
using other options on your cluster, such as taints.
helm version
$ helm version
version.BuildInfo{Version:"v3.0.0", GitCommit:"e29ce2a54e96cd02ccfce88bee4f58bb6e2a28b6",
GitTreeState:"clean", GoVersion:"go1.13.4"}
The following condensed example output shows some of the Helm charts available for use:
To update the list of charts, use the helm repo update command.
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
Update Complete. Happy Helming!
Import the images used by the Helm chart into your ACR
This article uses the NGINX ingress controller Helm chart, which relies on three container images. Use
az acr import to import those images into your ACR.
REGISTRY_NAME=<REGISTRY_NAME>
CONTROLLER_REGISTRY=k8s.gcr.io
CONTROLLER_IMAGE=ingress-nginx/controller
CONTROLLER_TAG=v0.48.1
PATCH_REGISTRY=docker.io
PATCH_IMAGE=jettech/kube-webhook-certgen
PATCH_TAG=v1.5.1
DEFAULTBACKEND_REGISTRY=k8s.gcr.io
DEFAULTBACKEND_IMAGE=defaultbackend-amd64
DEFAULTBACKEND_TAG=1.5
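The import commands themselves aren't reproduced in this extract; a sketch using the variables defined above, assuming they're set in your shell:

az acr import --name $REGISTRY_NAME --source $CONTROLLER_REGISTRY/$CONTROLLER_IMAGE:$CONTROLLER_TAG --image $CONTROLLER_IMAGE:$CONTROLLER_TAG
az acr import --name $REGISTRY_NAME --source $PATCH_REGISTRY/$PATCH_IMAGE:$PATCH_TAG --image $PATCH_IMAGE:$PATCH_TAG
az acr import --name $REGISTRY_NAME --source $DEFAULTBACKEND_REGISTRY/$DEFAULTBACKEND_IMAGE:$DEFAULTBACKEND_TAG --image $DEFAULTBACKEND_IMAGE:$DEFAULTBACKEND_TAG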
TIP
The following example creates a Kubernetes namespace for the ingress resources named ingress-basic and is intended to
work within that namespace. Specify a namespace for your own environment as needed.
ACR_URL=<REGISTRY_URL>
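The helm install command itself isn't reproduced in this extract. A sketch consistent with the release name and namespace shown in the output below; the --set keys that point the chart at your ACR are assumptions and depend on the chart version you use:

helm install nginx-ingress ingress-nginx/ingress-nginx \
    --namespace ingress-basic \
    --create-namespace \
    --set controller.replicaCount=2 \
    --set controller.nodeSelector."kubernetes\.io/os"=linux \
    --set controller.image.registry=$ACR_URL \
    --set controller.image.image=$CONTROLLER_IMAGE \
    --set controller.image.tag=$CONTROLLER_TAG \
    --set controller.image.digest=""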
The following condensed example output shows the deployment status of the Kubernetes resources created by
the Helm chart:
NAME: nginx-ingress
LAST DEPLOYED: Wed Jul 28 11:35:29 2021
NAMESPACE: ingress-basic
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace ingress-basic get services -o wide -w nginx-ingress-ingress-nginx-controller'
...
Use the kubectl get services command to get the EXTERNAL-IP of your service.
kubectl --namespace ingress-basic get services -o wide -w nginx-ingress-ingress-nginx-controller
For example, the below command shows the EXTERNAL-IP for the nginx-ingress-ingress-nginx-controller
service:
List releases
To see a list of releases installed on your cluster, use the helm list command.
The following example shows the my-nginx-ingress release deployed in the previous step:
Clean up resources
When you deploy a Helm chart, a number of Kubernetes resources are created. These resources include pods,
deployments, and services. To clean up these resources, use the helm uninstall command and specify your
release name, as found in the previous helm list command.
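For example, assuming the release was installed into the ingress-basic namespace used earlier:

helm uninstall my-nginx-ingress --namespace ingress-basic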
The following example shows the release named my-nginx-ingress has been uninstalled:
To delete the entire sample namespace, use the kubectl delete command and specify your namespace name.
All the resources in the namespace are deleted.
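For example:

kubectl delete namespace ingress-basic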
Next steps
For more information about managing Kubernetes application deployments with Helm, see the Helm
documentation.
Helm documentation
Using OpenFaaS on AKS
6/15/2022 • 4 minutes to read • Edit Online
OpenFaaS is a framework for building serverless functions through the use of containers. As an open source
project, it has gained large-scale adoption within the community. This document details installing and using
OpenFaas on an Azure Kubernetes Service (AKS) cluster.
Prerequisites
In order to complete the steps within this article, you need the following.
Basic understanding of Kubernetes.
An Azure Kubernetes Service (AKS) cluster and AKS credentials configured on your development system.
Azure CLI installed on your development system.
Git command-line tools installed on your system.
Deploy OpenFaaS
As a good practice, OpenFaaS and OpenFaaS functions should be stored in their own Kubernetes namespace.
Create a namespace for the OpenFaaS system and functions:
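The namespace and credential commands aren't reproduced in this extract. A minimal sketch that creates the two namespaces and the basic-auth secret the chart consumes; the namespace and secret names follow common OpenFaaS conventions but are assumptions here:

# Namespaces for the OpenFaaS core services and for functions
kubectl create namespace openfaas
kubectl create namespace openfaas-fn

# Generate a random password and store it as the basic-auth secret used by the gateway
PASSWORD=$(head -c 12 /dev/urandom | shasum | cut -d ' ' -f 1)
kubectl -n openfaas create secret generic basic-auth \
    --from-literal=basic-auth-user=admin \
    --from-literal=basic-auth-password="$PASSWORD"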
You can get the value of the secret with echo $PASSWORD .
The password we create here will be used by the helm chart to enable basic authentication on the OpenFaaS
Gateway, which is exposed to the Internet through a cloud LoadBalancer.
A Helm chart for OpenFaaS is included in the cloned repository. Use this chart to deploy OpenFaaS into your
AKS cluster.
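The openfaas/openfaas chart reference in the following command assumes the OpenFaaS Helm repository has been added. If it hasn't, a sketch, with the repository URL as published by the OpenFaaS project:

helm repo add openfaas https://openfaas.github.io/faas-netes/
helm repo update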
helm upgrade openfaas --install openfaas/openfaas \
--namespace openfaas \
--set basic_auth=true \
--set functionNamespace=openfaas-fn \
--set serviceType=LoadBalancer
Output:
NAME: openfaas
LAST DEPLOYED: Wed Feb 28 08:26:11 2018
NAMESPACE: openfaas
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
prometheus-config 2 20s
alertmanager-config 1 20s
{snip}
NOTES:
To verify that openfaas has started, run:
```console
kubectl --namespace=openfaas get deployments -l "release=openfaas, app=openfaas"
A public IP address is created for accessing the OpenFaaS gateway. To retrieve this IP address, use the kubectl
get service command. It may take a minute for the IP address to be assigned to the service.
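For example (the external gateway service name shown here is the one the chart typically creates and may differ in your deployment):

kubectl get service --namespace openfaas gateway-external --watch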
Output.
To test the OpenFaaS system, browse to the external IP address on port 8080, http://52.186.64.52:8080 in this
example. You will be prompted to log in. To fetch your password, enter echo $PASSWORD .
Finally, install the OpenFaaS CLI. This example used brew, see the OpenFaaS CLI documentation for more
options.
export OPENFAAS_URL=http://52.186.64.52:8080
echo -n $PASSWORD | ./faas-cli login -g $OPENFAAS_URL -u admin --password-stdin
Output:
_ _ _ _ _
| | | | ___| | | ___ / \ _____ _ _ __ ___
| |_| |/ _ \ | |/ _ \ / _ \ |_ / | | | '__/ _ \
| _ | __/ | | (_) | / ___ \ / /| |_| | | | __/
|_| |_|\___|_|_|\___/ /_/ \_\/___|\__,_|_| \___|
Deploy a CosmosDB instance of kind MongoDB. The instance needs a unique name; update openfaas-cosmos to
something unique to your environment.
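A sketch of the creation command, assuming the resource group name used earlier in this article:

az cosmosdb create --resource-group myResourceGroup --name openfaas-cosmos --kind MongoDB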
Now populate the Cosmos DB with test data. Create a file named plans.json and copy in the following json.
{
"name" : "two_person",
"friendlyName" : "Two Person Plan",
"portionSize" : "1-2 Person",
"mealsPerWeek" : "3 Unique meals per week",
"price" : 72,
"description" : "Our basic plan, delivering 3 meals per week, which will feed 1-2 people.",
"__v" : 0
}
Use the mongoimport tool to load the CosmosDB instance with data.
If needed, install the MongoDB tools. The following example installs these tools using brew, see the MongoDB
documentation for other options.
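A sketch of the import, assuming the MongoDB connection string is read from the Cosmos DB account created above and that the target database and collection are both named plans (both names are assumptions):

# Fetch the MongoDB connection string for the Cosmos DB account
COSMOS=$(az cosmosdb keys list --type connection-strings \
    --resource-group myResourceGroup --name openfaas-cosmos \
    --query "connectionStrings[0].connectionString" -o tsv)

# Import the sample document
mongoimport --uri="$COSMOS" --db plans --collection plans --file plans.json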
Output:
Run the following command to create the function. Update the value of the -g argument with your OpenFaaS
gateway address.
Once deployed, you should see your newly created OpenFaaS endpoint for the function.
Test the function using curl. Update the IP address with the OpenFaaS gateway address.
curl -s http://52.186.64.52:8080/function/cosmos-query
Output:
[{"ID":"","Name":"two_person","FriendlyName":"","PortionSize":"","MealsPerWeek":"","Price":72,"Description":
"Our basic plan, delivering 3 meals per week, which will feed 1-2 people."}]
You can also test the function within the OpenFaaS UI.
Next Steps
You can continue to learn with the OpenFaaS workshop through a set of hands-on labs that cover topics such as
how to create your own GitHub bot, consuming secrets, viewing metrics, and auto-scaling.
Use GPUs for compute-intensive workloads on
Azure Kubernetes Service (AKS)
6/15/2022 • 10 minutes to read • Edit Online
Graphical processing units (GPUs) are often used for compute-intensive workloads such as graphics and
visualization workloads. AKS supports the creation of GPU-enabled node pools to run these compute-intensive
workloads in Kubernetes. For more information on available GPU-enabled VMs, see GPU optimized VM sizes in
Azure. For AKS node pools, we recommend a minimum size of Standard_NC6. Note that the NVv4 series (based
on AMD GPUs) are not yet supported with AKS.
NOTE
GPU-enabled VMs contain specialized hardware that is subject to higher pricing and region availability. For more
information, see the pricing tool and region availability.
Currently, using GPU-enabled node pools is only available for Linux node pools.
WARNING
You can use either of the above options, but you shouldn't manually install the NVIDIA device plugin daemon set with
clusters that use the AKS GPU image.
It might take several minutes for the status to show as Registered . You can check the registration status by
using the az feature list command:
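A sketch of the status check; the feature flag name for the AKS GPU image preview is assumed to be GPUDedicatedVHDPreview:

az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/GPUDedicatedVHDPreview')].{Name:name,State:properties.state}"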
When the status shows as registered, refresh the registration of the Microsoft.ContainerService resource
provider by using the az provider register command:
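For example:

az provider register --namespace Microsoft.ContainerService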
To install the aks-preview CLI extension, use the following Azure CLI commands:
To update the aks-preview CLI extension, use the following Azure CLI commands:
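Neither the extension commands nor the node pool creation command are reproduced in this extract. A sketch consistent with the description that follows; the resource group and cluster names are assumptions, and the custom header matches the one mentioned later in this article:

# Install or update the aks-preview extension
az extension add --name aks-preview
az extension update --name aks-preview

# Add a GPU node pool that uses the specialized AKS GPU image
az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name gpunp \
    --node-count 1 \
    --node-vm-size Standard_NC6 \
    --node-taints sku=gpu:NoSchedule \
    --aks-custom-headers UseGPUDedicatedVHD=true \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3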
The above command adds a node pool named gpunp to the myAKSCluster in the myResourceGroup resource
group. The command also sets the VM size for the nodes in the node pool to Standard_NC6, enables the cluster
autoscaler, configures the cluster autoscaler to maintain a minimum of one node and a maximum of three nodes
in the node pool, specifies a specialized AKS GPU image for the nodes in your new node pool, and specifies a
sku=gpu:NoSchedule taint for the node pool.
NOTE
A taint and VM size can only be set for node pools during node pool creation, but the autoscaler settings can be updated
at any time.
NOTE
If your GPU sku requires generation two VMs use --aks-custom-headers UseGPUDedicatedVHD=true,usegen2vm=true.
For example:
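A sketch only; the values mirror the node pool described next, and in practice you would pick a VM size that supports generation 2 VMs:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name gpunp \
    --node-count 1 \
    --node-vm-size Standard_NC6 \
    --node-taints sku=gpu:NoSchedule \
    --aks-custom-headers UseGPUDedicatedVHD=true,usegen2vm=true \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3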
The above command adds a node pool named gpunp to the myAKSCluster in the myResourceGroup resource
group. The command also sets the VM size for the nodes in the node pool to Standard_NC6, enables the cluster
autoscaler, configures the cluster autoscaler to maintain a minimum of one node and a maximum of three nodes
in the node pool, and specifies a sku=gpu:NoSchedule taint for the node pool.
NOTE
A taint and VM size can only be set for node pools during node pool creation, but the autoscaler settings can be updated
at any time.
Create a namespace using the kubectl create namespace command, such as gpu-resources:
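For example:

kubectl create namespace gpu-resources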
Create a file named nvidia-device-plugin-ds.yaml and paste the following YAML manifest. This manifest is
provided as part of the NVIDIA device plugin for Kubernetes project.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: gpu-resources
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
      # reserves resources for critical add-on pods so that they can be rescheduled after
      # a failure. This annotation works in tandem with the toleration below.
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
        # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
        # This, along with the annotation above marks this pod as a critical add-on.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
        - key: "sku"
          operator: "Equal"
          value: "gpu"
          effect: "NoSchedule"
      containers:
        - image: mcr.microsoft.com/oss/nvidia/k8s-device-plugin:1.11
          name: nvidia-device-plugin-ctr
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
Use kubectl apply to create the DaemonSet and confirm the NVIDIA device plugin is created successfully, as
shown in the following example output:
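The apply command and its confirmation aren't reproduced in this extract; a sketch using the manifest file created above:

kubectl apply -f nvidia-device-plugin-ds.yaml

# Expected confirmation:
# daemonset.apps/nvidia-device-plugin-daemonset created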
Now use the kubectl describe node command to confirm that the GPUs are schedulable. Under the Capacity
section, the GPU should list as nvidia.com/gpu: 1 .
The following condensed example shows that a GPU is available on the node named aks-gpunp-28993262-0:
Name: aks-gpunp-28993262-0
Roles: agent
Labels: accelerator=nvidia
[...]
Capacity:
[...]
nvidia.com/gpu: 1
[...]
NOTE
If you receive a version mismatch error when calling into drivers, such as "CUDA driver version is insufficient for CUDA
runtime version", review the NVIDIA driver compatibility matrix at https://docs.nvidia.com/deploy/cuda-compatibility/index.html.
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: samples-tf-mnist-demo
  name: samples-tf-mnist-demo
spec:
  template:
    metadata:
      labels:
        app: samples-tf-mnist-demo
    spec:
      containers:
        - name: samples-tf-mnist-demo
          image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
          args: ["--max_steps", "500"]
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              nvidia.com/gpu: 1
      restartPolicy: OnFailure
      tolerations:
        - key: "sku"
          operator: "Equal"
          value: "gpu"
          effect: "NoSchedule"
Use the kubectl apply command to run the job. This command parses the manifest file and creates the defined
Kubernetes objects:
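Assuming the Job manifest above was saved to a file (the file name here is hypothetical):

kubectl apply -f samples-tf-mnist-demo.yaml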
To look at the output of the GPU-enabled workload, first get the name of the pod with the kubectl get pods
command:
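For example, filtering on the label from the Job manifest and watching until the job completes:

kubectl get pods --selector app=samples-tf-mnist-demo --watch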
Now use the kubectl logs command to view the pod logs. The following example pod logs confirm that the
appropriate GPU device has been discovered, Tesla K80 . Provide the name for your own pod:
Clean up resources
To remove the associated Kubernetes objects created in this article, use the kubectl delete job command as
follows:
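For example:

kubectl delete jobs samples-tf-mnist-demo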
Next steps
To run Apache Spark jobs, see Run Apache Spark jobs on AKS.
For more information about running machine learning (ML) workloads on Kubernetes, see Kubeflow Labs.
For information on using Azure Kubernetes Service with Azure Machine Learning, see the following articles:
Deploy a model to Azure Kubernetes Service.
Deploy a deep learning model for inference with GPU.
High-performance serving with Triton Inference Server.
Tutorial: Deploy Django app on AKS with Azure
Database for PostgreSQL - Flexible Server
6/15/2022 • 8 minutes to read • Edit Online
NOTE
This quickstart assumes a basic understanding of Kubernetes concepts, Django and PostgreSQL.
Pre-requisites
If you don't have an Azure subscription, create an Azure free account before you begin.
Launch Azure Cloud Shell in a new browser window. You can install Azure CLI on your local machine too. If
you're using a local install, login with Azure CLI by using the az login command. To finish the authentication
process, follow the steps displayed in your terminal.
Run az version to find the version and dependent libraries that are installed. To upgrade to the latest version,
run az upgrade. This article requires the latest version of Azure CLI. If you're using Azure Cloud Shell, the
latest version is already installed.
NOTE
The location for the resource group is where resource group metadata is stored. It is also where your resources run in
Azure if you don't specify another region during resource creation.
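The creation command isn't shown in this extract; based on the output below, it would look like the following:

az group create --name django-project --location eastus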
The following example output shows the resource group created successfully:
{
"id": "/subscriptions/<guid>/resourceGroups/django-project",
"location": "eastus",
"managedBy": null,
"name": "django-project",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null
}
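The cluster creation command isn't reproduced in this extract. A minimal sketch (the cluster name is hypothetical):

az aks create --resource-group django-project --name djangoappcluster --node-count 1 --generate-ssh-keys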
After a few minutes, the command completes and returns JSON-formatted information about the cluster.
NOTE
When creating an AKS cluster a second resource group is automatically created to store the AKS resources. See Why are
two resource groups created with AKS?
NOTE
If running Azure CLI locally, please run the az aks install-cli command to install kubectl.
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
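For example, using the hypothetical cluster name from the sketch above:

az aks get-credentials --resource-group django-project --name djangoappcluster

kubectl get nodes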
The following example output shows the single node created in the previous steps. Make sure that the status of
the node is Ready:
└───my-djangoapp
└───views.py
└───models.py
└───forms.py
├───templates
. . . . . . .
├───static
. . . . . . .
└───my-django-project
└───settings.py
└───urls.py
└───wsgi.py
. . . . . . .
└─── Dockerfile
└─── requirements.txt
└─── manage.py
Update ALLOWED_HOSTS in settings.py to make sure the Django application uses the external IP that gets
assigned to the Kubernetes app.
ALLOWED_HOSTS = ['*']
Update the DATABASES={ } section in the settings.py file. The code snippet below reads the database host,
username, and password from environment variables set in the Kubernetes manifest file.
DATABASES={
'default':{
'ENGINE':'django.db.backends.postgresql_psycopg2',
'NAME':os.getenv('DATABASE_NAME'),
'USER':os.getenv('DATABASE_USER'),
'PASSWORD':os.getenv('DATABASE_PASSWORD'),
'HOST':os.getenv('DATABASE_HOST'),
'PORT':'5432',
'OPTIONS': {'sslmode': 'require'}
}
}
Django==2.2.17
postgres==3.0.0
psycopg2-binary==2.8.6
psycopg2-pool==1.1
pytz==2020.4
Create a Dockerfile
Create a new file named Dockerfile and copy the code snippet below. This Dockerfile sets up Python 3.8
and installs all the requirements listed in the requirements.txt file.
# Use the official Python image from the Docker Hub
IMPORTANT
If you are using Azure Container Registry (ACR), run the az aks update command to attach the ACR account to the
AKS cluster.
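A sketch of attaching an ACR instance; the registry name is a placeholder and the cluster name is the hypothetical one used above:

az aks update --resource-group django-project --name djangoappcluster --attach-acr <your-acr-name>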
apiVersion: apps/v1
kind: Deployment
metadata:
  name: django-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: django-app
  template:
    metadata:
      labels:
        app: django-app
    spec:
      containers:
        - name: django-app
          image: [DOCKER-HUB-USER-OR-ACR-ACCOUNT]/[YOUR-IMAGE-NAME]:[TAG]
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_HOST
              value: "SERVERNAME.postgres.database.azure.com"
            - name: DATABASE_USER
              value: "YOUR-DATABASE-USERNAME"
            - name: DATABASE_PASSWORD
              value: "YOUR-DATABASE-PASSWORD"
            - name: DATABASE_NAME
              value: "postgres"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - django-app
              topologyKey: "kubernetes.io/hostname"
---
apiVersion: v1
kind: Service
metadata:
  name: python-svc
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  selector:
    app: django-app
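Deploy the manifest with kubectl apply (the manifest file name here is hypothetical):

kubectl apply -f django-deploy.yaml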
The following example output shows the Deployments and Services created successfully:
A deployment named django-app describes details of your deployment, such as which image to use
for the app, the number of pods, and the pod configuration. A service named python-svc is created to expose the
application through an external IP.
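To watch for the external IP assignment, use kubectl get service with the --watch argument:

kubectl get service python-svc --watch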
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
Now open a web browser to the external IP address of your service (http://<service-external-ip-address>) and
view the Django application.
NOTE
Currently the Django site is not using HTTPS. It is recommended to ENABLE TLS with your own certificates.
You can enable HTTP routing for your cluster. When http routing is enabled, it configures an Ingress controller in your
AKS cluster. As applications are deployed, the solution also creates publicly accessible DNS names for application
endpoints.
Once the pod name has been found, you can run Django database migrations with the command
$ kubectl exec <pod-name> -- [COMMAND] . Note that /code/ is the working directory for the project, defined in the
Dockerfile above.
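For example, to apply the migrations (the pod name is a placeholder):

kubectl exec <pod-name> -- python manage.py migrate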
Operations to perform:
Apply all migrations: admin, auth, contenttypes, sessions
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying admin.0001_initial... OK
Applying admin.0002_logentry_remove_auto_add... OK
Applying admin.0003_logentry_add_action_flag_choices... OK
. . . . . .
If you run into issues, please run kubectl logs <pod-name> to see what exception is thrown by your application. If
the application is working successfully you would see an output like this when running kubectl logs .
You have 17 unapplied migration(s). Your project may not work properly until you apply the migrations for
app(s): admin, auth, contenttypes, sessions.
Run 'python manage.py migrate' to apply them.
December 08, 2020 - 23:24:14
Django version 2.2.17, using settings 'django_postgres_app.settings'
Starting development server at http://0.0.0.0:8000/
Quit the server with CONTROL-C.
NOTE
When you delete the cluster, the Azure Active Directory service principal used by the AKS cluster is not removed. For
steps on how to remove the service principal, see AKS service principal considerations and deletion. If you used a
managed identity, the identity is managed by the platform and does not require removal.
Next steps
Learn how to access the Kubernetes web dashboard for your AKS cluster
Learn how to enable continuous deployment
Learn how to scale your cluster
Learn how to manage your postgres flexible server
Learn how to configure server parameters for your database server.
Deploy a Java application with Azure Database for
PostgreSQL server to Open Liberty or WebSphere
Liberty on an Azure Kubernetes Service (AKS)
cluster
6/15/2022 • 9 minutes to read • Edit Online
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires at least version 2.31.0 of Azure CLI. If using Azure Cloud Shell, the latest version is
already installed.
If running the commands in this guide locally (instead of Azure Cloud Shell):
Prepare a local machine with Unix-like operating system installed (for example, Ubuntu, macOS,
Windows Subsystem for Linux).
Install a Java SE implementation (for example, AdoptOpenJDK OpenJDK 8 LTS/OpenJ9).
Install Maven 3.5.0 or higher.
Install Docker for your OS.
Create a user-assigned managed identity and assign Owner role or Contributor and
User Access Administrator roles to that identity by following the steps in Manage user-assigned
managed identities. Assign Directory readers role to the identity in Azure AD by following Assign
Azure AD roles to users. Return to this document after creating the identity and assigning it the
necessary roles.
RESOURCE_GROUP_NAME=java-liberty-project-postgresql
az group create --name $RESOURCE_GROUP_NAME --location eastus
TIP
To help ensure a globally unique name, prepend a disambiguation string such as your initials and the MMDD of
today's date.
export DB_NAME=youruniquedbname
export DB_ADMIN_USERNAME=myadmin
export DB_ADMIN_PASSWORD=<server_admin_password>
az postgres server create --resource-group $RESOURCE_GROUP_NAME --name $DB_NAME --location eastus --admin-user $DB_ADMIN_USERNAME --admin-password $DB_ADMIN_PASSWORD --sku-name GP_Gen5_2
3. Allow Azure Services, such as our Open Liberty and WebSphere Liberty application, to access the Azure
PostgreSQL server.
4. Allow your local IP address to access the Azure PostgreSQL server. This is necessary to allow the
liberty:devc to access the database.
If you don't want to use the CLI, you may use the Azure portal by following the steps in Quickstart: Create an
Azure Database for PostgreSQL server by using the Azure portal. You must also grant access to Azure services
by following the steps in Firewall rules in Azure Database for PostgreSQL - Single Server. Return to this
document after creating and configuring the database server.
javaee-app-db-using-actions/postgres
├─ src/main/
│ ├─ aks/
│ │ ├─ db-secret.yaml
│ │ ├─ openlibertyapplication.yaml
│ ├─ docker/
│ │ ├─ Dockerfile
│ │ ├─ Dockerfile-local
│ │ ├─ Dockerfile-wlp
│ │ ├─ Dockerfile-wlp-local
│ ├─ liberty/config/
│ │ ├─ server.xml
│ ├─ java/
│ ├─ resources/
│ ├─ webapp/
├─ pom.xml
The directories java, resources, and webapp contain the source code of the sample application. The code
declares and uses a data source named jdbc/JavaEECafeDB .
In the aks directory, we placed two deployment files. db-secret.yaml is used to create Kubernetes Secrets with DB
connection credentials. The file openlibertyapplication.yaml is used to deploy the application image.
In the docker directory, we placed four Dockerfiles. Dockerfile-local is used for local debugging, and Dockerfile is
used to build the image for an AKS deployment. These two files work with Open Liberty. Dockerfile-wlp-local
and Dockerfile-wlp are also used for local debugging and to build the image for an AKS deployment
respectively, but instead work with WebSphere Liberty.
In directory liberty/config, the server.xml is used to configure the DB connection for the Open Liberty and
WebSphere Liberty cluster.
Acquire necessary variables from AKS deployment
After the offer is successfully deployed, an AKS cluster will be generated automatically. The AKS cluster is
configured to connect to the ACR. Before we get started with the application, we need to extract the namespace
configured for the AKS.
1. Run the following command to print the current deployment file, using the
appDeploymentTemplateYamlEncoded you saved above. The output contains all the variables we need.
2. Save the metadata.namespace from this yaml output aside for later use in this article.
Build the project
Now that you have gathered the necessary properties, you can build the application. The POM file for the
project reads many properties from the environment.
cd <path-to-your-repo>/javaee-app-db-using-actions/postgres
cd <path-to-your-repo>/javaee-app-db-using-actions/postgres
3. Verify the application works as expected. You should see a message similar to
[INFO] [AUDIT] CWWKZ0003I: The application javaee-cafe updated in 1.930 seconds. in the command
output if successful. Go to http://localhost:9080/ in your browser and verify the application is
accessible and all functions are working.
4. Press Ctrl+C to stop liberty:devc mode.
Build image for AKS deployment
After successfully running the app in the Liberty Docker container, you can run the docker build command to
build the image.
cd <path-to-your-repo>/javaee-app-db-using-actions/postgres
# Fetch maven artifactId as image name, maven build version as image version
IMAGE_NAME=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.artifactId}' --non-recursive exec:exec)
IMAGE_VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec)
cd <path-to-your-repo>/javaee-app-db-using-actions/postgres/target
You should see output similar to the following to indicate that all the pods are running.
Clean up resources
To avoid Azure charges, you should clean up unnecessary resources. When the cluster is no longer needed, use
the az group delete command to remove the resource group, container service, container registry, and all
related resources.
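For example:

az group delete --name $RESOURCE_GROUP_NAME --yes --no-wait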
Next steps
Azure Kubernetes Service
Azure Database for PostgreSQL
Open Liberty
Open Liberty Operator
Open Liberty Server Configuration
Tutorial: Deploy WordPress app on AKS with Azure
Database for MySQL - Flexible Server
6/15/2022 • 8 minutes to read • Edit Online
NOTE
This quickstart assumes a basic understanding of Kubernetes concepts, WordPress and MySQL.
If you don't have an Azure subscription, create an Azure free account before you begin. With an Azure free
account, you can now try Azure Database for MySQL - Flexible Server for free for 12 months. For more details,
see Try Flexible Server for free.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires the latest version of Azure CLI. If using Azure Cloud Shell, the latest version is already
installed.
NOTE
If running the commands in this quickstart locally (instead of Azure Cloud Shell), ensure you run the commands as
administrator.
Create a resource group
An Azure resource group is a logical group in which Azure resources are deployed and managed. Let's create a
resource group, wordpress-project, using the az group create command in the eastus location.
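Based on the output below, the command would look like the following:

az group create --name wordpress-project --location eastus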
NOTE
The location for the resource group is where resource group metadata is stored. It is also where your resources run in
Azure if you don't specify another region during resource creation.
The following example output shows the resource group created successfully:
{
"id": "/subscriptions/<guid>/resourceGroups/wordpress-project",
"location": "eastus",
"managedBy": null,
"name": "wordpress-project",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null
}
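The cluster creation command isn't reproduced in this extract. A minimal sketch (the cluster name is hypothetical):

az aks create --resource-group wordpress-project --name myAKSCluster --node-count 1 --generate-ssh-keys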
After a few minutes, the command completes and returns JSON-formatted information about the cluster.
NOTE
When creating an AKS cluster a second resource group is automatically created to store the AKS resources. See Why are
two resource groups created with AKS?
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
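For example, using the hypothetical cluster name from the sketch above:

az aks get-credentials --resource-group wordpress-project --name myAKSCluster

kubectl get nodes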
The following example output shows the single node created in the previous steps. Make sure that the status of
the node is Ready:
// ** MySQL settings - You can get this info from your web host ** //
/** The name of the database for WordPress */
$connectstr_dbhost = getenv('DATABASE_HOST');
$connectstr_dbusername = getenv('DATABASE_USERNAME');
$connectstr_dbpassword = getenv('DATABASE_PASSWORD');
$connectstr_dbname = getenv('DATABASE_NAME');
/** SSL*/
define('MYSQL_CLIENT_FLAGS', MYSQLI_CLIENT_SSL);
Create a Dockerfile
Create a new Dockerfile and copy this code snippet. This Dockerfile sets up the Apache web server with PHP
and enables the mysqli extension.
FROM php:7.2-apache
COPY public/ /var/www/html/
RUN docker-php-ext-install mysqli
RUN docker-php-ext-enable mysqli
Build your docker image
Make sure you're in the directory my-wordpress-app in a terminal using the cd command. Run the following
command to build the image:
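A sketch of the build command, using the same image placeholder that appears in the deployment manifest later in this article:

docker build -t [DOCKER-HUB-USER-OR-ACR-ACCOUNT]/[YOUR-IMAGE-NAME]:[TAG] .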
IMPORTANT
If you are using Azure Container Registry (ACR), then run the az aks update command to attach the ACR account to the
AKS cluster.
IMPORTANT
Replace [DOCKER-HUB-USER/ACR ACCOUNT]/[YOUR-IMAGE-NAME]:[TAG] with your actual WordPress docker image
name and tag, for example docker-hub-user/myblog:latest .
Update env section below with your SERVERNAME , YOUR-DATABASE-USERNAME , YOUR-DATABASE-PASSWORD of your
MySQL flexible server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-blog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress-blog
  template:
    metadata:
      labels:
        app: wordpress-blog
    spec:
      containers:
        - name: wordpress-blog
          image: [DOCKER-HUB-USER-OR-ACR-ACCOUNT]/[YOUR-IMAGE-NAME]:[TAG]
          ports:
            - containerPort: 80
          env:
            - name: DATABASE_HOST
              value: "SERVERNAME.mysql.database.azure.com" #Update here
            - name: DATABASE_USERNAME
              value: "YOUR-DATABASE-USERNAME" #Update here
            - name: DATABASE_PASSWORD
              value: "YOUR-DATABASE-PASSWORD" #Update here
            - name: DATABASE_NAME
              value: "flexibleserverdb"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - wordpress-blog
              topologyKey: "kubernetes.io/hostname"
---
apiVersion: v1
kind: Service
metadata:
  name: php-svc
spec:
  type: LoadBalancer
  ports:
    - port: 80
  selector:
    app: wordpress-blog
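Deploy the manifest with kubectl apply (the manifest file name here is hypothetical):

kubectl apply -f wordpress-deploy.yaml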
The following example output shows the Deployments and Services created successfully:
When the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process. The following example output shows a valid public IP address assigned to the service:
Browse WordPress
Open a web browser to the external IP address of your service to see your WordPress installation page.
NOTE
Currently the WordPress site is not using HTTPS. It is recommended to ENABLE TLS with your own certificates.
You can enable HTTP routing for your cluster.
Next steps
Learn how to access the Kubernetes web dashboard for your AKS cluster
Learn how to scale your cluster
Learn how to manage your MySQL flexible server
Learn how to configure server parameters for your database server.
Deploy a Java application with Open Liberty or
WebSphere Liberty on an Azure Kubernetes Service
(AKS) cluster
6/15/2022 • 12 minutes to read • Edit Online
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Azure Cloud Shell Quickstart -
Bash.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you are running on Windows
or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the
Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish
the authentication process, follow the steps displayed in your terminal. For additional sign-in
options, see Sign in with the Azure CLI.
When you're prompted, install Azure CLI extensions on first use. For more information about
extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the
latest version, run az upgrade.
This article requires the latest version of Azure CLI. If using Azure Cloud Shell, the latest version is already
installed.
If running the commands in this guide locally (instead of Azure Cloud Shell):
Prepare a local machine with Unix-like operating system installed (for example, Ubuntu, macOS,
Windows Subsystem for Linux).
Install a Java SE implementation (for example, AdoptOpenJDK OpenJDK 8 LTS/OpenJ9).
Install Maven 3.5.0 or higher.
Install Docker for your OS.
Please make sure you have been assigned either the Owner role, or both the Contributor and User Access Administrator
roles, on the subscription. You can verify this by following the steps in List role assignments for a user or group.
RESOURCE_GROUP_NAME=java-liberty-project
az group create --name $RESOURCE_GROUP_NAME --location eastus
export REGISTRY_NAME=youruniqueacrname
az acr create --resource-group $RESOURCE_GROUP_NAME --name $REGISTRY_NAME --sku Basic --admin-enabled
After a short time, you should see a JSON output that contains:
"provisioningState": "Succeeded",
"publicNetworkAccess": "Enabled",
"resourceGroup": "java-liberty-project",
You should see Login Succeeded at the end of command output if you have logged into the ACR instance
successfully.
CLUSTER_NAME=myAKSCluster
az aks create --resource-group $RESOURCE_GROUP_NAME --name $CLUSTER_NAME --node-count 1 --generate-ssh-keys --enable-managed-identity
After a few minutes, the command completes and returns JSON-formatted information about the cluster,
including the following:
"nodeResourceGroup": "MC_java-liberty-project_myAKSCluster_eastus",
"privateFqdn": null,
"provisioningState": "Succeeded",
"resourceGroup": "java-liberty-project",
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This
command downloads credentials and configures the Kubernetes CLI to use them.
NOTE
The above command uses the default location for the Kubernetes configuration file, which is ~/.kube/config . You can
specify a different location for your Kubernetes configuration file using --file.
To verify the connection to your cluster, use the kubectl get command to return a list of the cluster nodes.
The following example output shows the single node created in the previous steps. Make sure that the status of
the node is Ready:
2. Once your database is created, open your SQL server > Firewalls and virtual networks. Set
Minimal TLS Version to > 1.0 and select Save.
3. Open your SQL database > Connection strings > Select JDBC. Write down the Port number
following the SQL server address. For example, 1433 is the port number in the example below.
Install Open Liberty Operator
After creating and connecting to the cluster, install the Open Liberty Operator by running the following
commands.
Follow the steps in this section to deploy the sample application on the Jakarta EE runtime. These steps use
Maven and the liberty-maven-plugin . To learn more about the liberty-maven-plugin , see Building a web
application with Maven.
Check out the application
Clone the sample code for this guide. The sample is on GitHub. There are three samples in the repository. We
will use javaee-app-db-using-actions/mssql. Here is the file structure of the application.
javaee-app-db-using-actions/mssql
├─ src/main/
│ ├─ aks/
│ │ ├─ db-secret.yaml
│ │ ├─ openlibertyapplication.yaml
│ ├─ docker/
│ │ ├─ Dockerfile
│ │ ├─ Dockerfile-local
│ │ ├─ Dockerfile-wlp
│ │ ├─ Dockerfile-wlp-local
│ ├─ liberty/config/
│ │ ├─ server.xml
│ ├─ java/
│ ├─ resources/
│ ├─ webapp/
├─ pom.xml
The directories java, resources, and webapp contain the source code of the sample application. The code
declares and uses a data source named jdbc/JavaEECafeDB .
In the aks directory, we placed two deployment files. db-secret.yaml is used to create Kubernetes Secrets with DB
connection credentials. The file openlibertyapplication.yaml is used to deploy the application image.
In the docker directory, we place four Dockerfiles. Dockerfile-local is used for local debugging, and Dockerfile is
used to build the image for an AKS deployment. These two files work with Open Liberty. Dockerfile-wlp-local
and Dockerfile-wlp are also used for local debugging and to build the image for an AKS deployment
respectively, but instead work with WebSphere Liberty.
In the liberty/config directory, the server.xml is used to configure the DB connection for the Open Liberty and
WebSphere Liberty cluster.
Build project
Now that you have gathered the necessary properties, you can build the application. The POM file for the
project reads many properties from the environment.
cd <path-to-your-repo>/javaee-app-db-using-actions/mssql
1. Verify the application works as expected. You should see a message similar to
[INFO] [AUDIT] CWWKZ0003I: The application javaee-cafe updated in 1.930 seconds. in the command
output if successful. Go to http://localhost:9080/ in your browser to verify the application is accessible
and all functions are working.
2. Press Ctrl+C to stop liberty:devc mode.
Build image for AKS deployment
After successfully running the app in the Liberty Docker container, you can run the docker build command to
build the image.
cd <path-to-your-repo>/javaee-app-db-using-actions/mssql
# Fetch maven artifactId as image name, maven build version as image version
IMAGE_NAME=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.artifactId}' --non-recursive exec:exec)
IMAGE_VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec)
cd <path-to-your-repo>/javaee-app-db-using-actions/mssql/target
with DB connection
without DB connection
Follow steps below to deploy the Liberty application on the AKS cluster.
1. Attach the ACR instance to the AKS cluster so that the AKS cluster is authenticated to pull image from the
ACR instance.
cd <path-to-your-repo>/javaee-app-db-using-actions/mssql
artifactId=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.artifactId}' --non-recursive exec:exec)
3. Apply the DB secret and deployment file by running the following command:
cd <path-to-your-repo>/javaee-app-db-using-actions/mssql/target
# Apply DB secret
kubectl apply -f <path-to-your-repo>/javaee-app-db-using-actions/mssql/target/db-secret.yaml
4. Wait until you see 3/3 under the READY column and 3 under the AVAILABLE column, then use CTRL-C
to stop the kubectl watch process.
Test the application
When the application runs, a Kubernetes load balancer service exposes the application front end to the internet.
This process can take a while to complete.
To monitor progress, use the kubectl get service command with the --watch argument.
Once the EXTERNAL-IP address changes from pending to an actual public IP address, use CTRL-C to stop the
kubectl watch process.
Open a web browser to the external IP address of your service ( 52.152.189.57 for the above example) to see the
application home page. You should see the pod name of your application replicas displayed at the top-left of the
page. Wait for a few minutes and refresh the page to see a different pod name displayed due to load balancing
provided by the AKS cluster.
NOTE
Currently the application is not using HTTPS. It is recommended to ENABLE TLS with your own certificates.
Next steps
You can learn more from references used in this guide:
Azure Kubernetes Service
Open Liberty
Open Liberty Operator
Open Liberty Server Configuration
Liberty Maven Plugin
Open Liberty Container Images
WebSphere Liberty Container Images
Use Azure API Management with microservices
deployed in Azure Kubernetes Service
6/15/2022 • 7 minutes to read • Edit Online
Microservices are perfect for building APIs. With Azure Kubernetes Service (AKS), you can quickly deploy and
operate a microservices-based architecture in the cloud. You can then leverage Azure API Management (API
Management) to publish your microservices as APIs for internal and external consumption. This article describes
the options of deploying API Management with AKS. It assumes basic knowledge of Kubernetes, API
Management, and Azure networking.
Background
When publishing microservices as APIs for consumption, it can be challenging to manage the communication
between the microservices and the clients that consume them. There is a multitude of cross-cutting concerns
such as authentication, authorization, throttling, caching, transformation, and monitoring. These concerns are
valid regardless of whether the microservices are exposed to internal or external clients.
The API Gateway pattern addresses these concerns. An API gateway serves as a front door to the microservices,
decouples clients from your microservices, adds an additional layer of security, and decreases the complexity of
your microservices by removing the burden of handling cross-cutting concerns.
Azure API Management is a turnkey solution to solve your API gateway needs. You can quickly create a
consistent and modern gateway for your microservices and publish them as APIs. As a full-lifecycle API
management solution, it also provides additional capabilities including a self-service developer portal for API
discovery, API lifecycle management, and API analytics.
When used together, AKS and API Management provide a platform for deploying, publishing, securing,
monitoring, and managing your microservices-based APIs. In this article, we will go through a few options of
deploying AKS in conjunction with API Management.
Pros:
Easy configuration on the API Management side because it does not need to be injected into the cluster VNet
and mTLS is natively supported
Centralizes protection for inbound cluster traffic at the Ingress Controller layer
Reduces security risk by minimizing publicly visible cluster endpoints
Cons:
Increases complexity of cluster configuration due to extra work to install, configure and maintain the Ingress
Controller and manage certificates used for mTLS
Security risk due to public visibility of Ingress Controller endpoint(s)
When you publish APIs through API Management, it's easy and common to secure access to those APIs by using
subscription keys. Developers who need to consume the published APIs must include a valid subscription key in
HTTP requests when they make calls to those APIs. Otherwise, the calls are rejected immediately by the API
Management gateway. They aren't forwarded to the back-end services.
To get a subscription key for accessing APIs, a subscription is required. A subscription is essentially a named
container for a pair of subscription keys. Developers who need to consume the published APIs can get
subscriptions. And they don't need approval from API publishers. API publishers can also create subscriptions
directly for API consumers.
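As an illustration (not part of the original article), a call through the API Management gateway typically passes the key in the Ocp-Apim-Subscription-Key header; the host name, API path, and key below are placeholders:

# Call a published API with a subscription key
curl -H "Ocp-Apim-Subscription-Key: <your-subscription-key>" https://<your-apim-instance>.azure-api.net/<api-path>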
Option 3: Deploy APIM inside the cluster VNet
In some cases, customers with regulatory constraints or strict security requirements may find Option 1 and 2
not viable solutions due to publicly exposed endpoints. In others, the AKS cluster and the applications that
consume the microservices might reside within the same VNet, hence there is no reason to expose the cluster
publicly as all API traffic will remain within the VNet. For these scenarios, you can deploy API Management into
the cluster VNet. API Management Developer and Premium tiers support VNet deployment.
There are two modes of deploying API Management into a VNet – External and Internal.
If API consumers do not reside in the cluster VNet, the External mode (Fig. 4) should be used. In this mode, the
API Management gateway is injected into the cluster VNet but accessible from public internet via an external
load balancer. It helps to hide the cluster completely while still allowing external clients to consume the
microservices. Additionally, you can use Azure networking capabilities such as Network Security Groups (NSG)
to restrict network traffic.
If all API consumers reside within the cluster VNet, then the Internal mode (Fig. 5) could be used. In this mode,
the API Management gateway is injected into the cluster VNet and accessible only from within this VNet via an
internal load balancer. There is no way to reach the API Management gateway or the AKS cluster from public
internet.
In both cases, the AKS cluster is not publicly visible. Compared to Option 2, the Ingress Controller may not be
necessary. Depending on your scenario and configuration, authentication might still be required between API
Management and your microservices. For instance, if a Service Mesh is adopted, it always requires mutual TLS
authentication.
Pros:
The most secure option because the AKS cluster has no public endpoint
Simplifies cluster configuration since it has no public endpoint
Ability to hide both API Management and AKS inside the VNet using the Internal mode
Ability to control network traffic using Azure networking capabilities such as Network Security Groups (NSG)
Cons:
Increases complexity of deploying and configuring API Management to work inside the VNet
Next steps
Learn more about Network concepts for applications in AKS
Learn more about How to use API Management with virtual networks
Dapr
6/15/2022 • 4 minutes to read • Edit Online
Distributed Application Runtime (Dapr) offers APIs that simplify microservice development and implementation.
Running as a sidecar process in tandem with your applications, Dapr APIs abstract away common complexities
developers regularly encounter when building distributed applications, such as service discovery, message
broker integration, encryption, observability, and secret management. Whether your inter-application
communication is direct service-to-service or pub/sub messaging, Dapr helps you write simple,
portable, resilient, and secured microservices.
Dapr is incrementally adoptable – the API building blocks can be used as the need arises. Use one, several, or all
to develop your application faster.
The following comparison contrasts secret management with Dapr and with the Secrets Store CSI driver:

Supported secrets stores
Dapr: Local environment variables (for Development); Local file (for Development); Kubernetes Secrets; AWS Secrets Manager; Azure Key Vault secret store; Azure Key Vault with Managed Identities on Kubernetes; GCP Secret Manager; HashiCorp Vault
Secrets Store CSI driver: Azure Key Vault secret store

Accessing secrets in application code
Dapr: Call the Dapr secrets API
Secrets Store CSI driver: Access the mounted volume or sync mounted content as a Kubernetes secret and set an environment variable

Secret rotation
Dapr: New API calls obtain the updated secrets
Secrets Store CSI driver: Polls for secrets and updates the mount at a configurable interval

Logging and metrics
Dapr: The Dapr sidecar generates logs, which can be configured with collectors such as Azure Monitor, emits metrics via Prometheus, and exposes an HTTP endpoint for health checks
Secrets Store CSI driver: Emits driver and Azure Key Vault provider metrics via Prometheus
For more information on the secret management in Dapr, see the secrets management building block overview.
For more information on the Secrets Store CSI driver and Azure Key Vault provider, see the Secrets Store CSI
driver overview.
How does the managed Dapr cluster extension compare to the open source Dapr offering?
The managed Dapr cluster extension is the easiest method to provision Dapr on an AKS cluster. With the
extension, you're able to offload management of the Dapr runtime version by opting into automatic upgrades.
Additionally, the extension installs Dapr with smart defaults (for example, provisioning the Dapr control plane in
high availability mode).
When installing Dapr OSS via helm or the Dapr CLI, runtime versions and configuration options are the
responsibility of developers and cluster maintainers.
Lastly, the Dapr extension is an extension of AKS, therefore you can expect the same support policy as other AKS
features.
How can I switch to using the Dapr extension if I’ve already installed Dapr via another method, such as Helm?
Recommended guidance is to completely uninstall Dapr from the AKS cluster and reinstall it via the cluster
extension.
If you install Dapr through the AKS extension, our recommendation is to continue using the extension for future
management of Dapr instead of the Dapr CLI. Combining the two tools can cause conflicts and result in
undesired behavior.
Next Steps
After learning about Dapr and some of the challenges it solves, try Deploying an application with the Dapr
cluster extension.
Dapr extension for Azure Kubernetes Service (AKS)
and Arc-enabled Kubernetes
6/15/2022 • 7 minutes to read • Edit Online
Dapr is a portable, event-driven runtime that makes it easy for any developer to build resilient, stateless and
stateful applications that run on the cloud and edge and embraces the diversity of languages and developer
frameworks. Leveraging the benefits of a sidecar architecture, Dapr helps you tackle the challenges that come
with building microservices and keeps your code platform agnostic. In particular, it helps with solving problems
around services calling other services reliably and securely, building event-driven apps with pub-sub, and
building applications that are portable across multiple cloud services and hosts (e.g., Kubernetes vs. a VM).
By using the Dapr extension to provision Dapr on your AKS or Arc-enabled Kubernetes cluster, you eliminate the
overhead of downloading Dapr tooling and manually installing and managing the runtime on your AKS cluster.
Additionally, the extension offers support for all native Dapr configuration capabilities through simple
command-line arguments.
NOTE
If you plan on installing Dapr in a Kubernetes production environment, please see the Dapr guidelines for production
usage documentation page.
How it works
The Dapr extension uses the Azure CLI to provision the Dapr control plane on your AKS or Arc-enabled
Kubernetes cluster. This will create:
dapr-operator : Manages component updates and Kubernetes services endpoints for Dapr (state stores,
pub/subs, etc.)
dapr-sidecar-injector : Injects Dapr into annotated deployment pods and adds the environment variables
DAPR_HTTP_PORT and DAPR_GRPC_PORT to enable user-defined applications to easily communicate with Dapr
without hard-coding Dapr port values.
dapr-placement : Used for actors only. Creates mapping tables that map actor instances to pods
dapr-sentry : Manages mTLS between services and acts as a certificate authority. For more information read
the security overview.
Once Dapr is installed on your cluster, you can begin to develop using the Dapr building block APIs by adding a
few annotations to your deployments. For a more in-depth overview of the building block APIs and how to best
use them, please see the Dapr building blocks overview.
WARNING
If you install Dapr through the AKS or Arc-enabled Kubernetes extension, our recommendation is to continue using the
extension for future management of Dapr instead of the Dapr CLI. Combining the two tools can cause conflicts and result
in undesired behavior.
Currently supported
Dapr versions
The Dapr extension support varies depending on how you manage the runtime.
Self-managed
For self-managed runtime, the Dapr extension supports:
The latest version of Dapr and 1 previous version (N-1)
Upgrading minor version incrementally (for example, 1.5 -> 1.6 -> 1.7)
Self-managed runtime requires manual upgrade to remain in the support window. To upgrade Dapr via the
extension, follow the Update extension instance instructions.
Auto-upgrade
Enabling auto-upgrade keeps your Dapr extension updated to the latest minor version. You may experience
breaking changes between updates.
Components
Azure + open source components are supported. Alpha and beta components are supported via best effort.
Clouds/regions
Global Azure cloud is supported with Arc support on the regions listed by Azure Products by Region.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
Install the latest version of the Azure CLI.
If you don't have one already, you need to create an AKS cluster or connect an Arc-enabled Kubernetes
cluster.
Set up the Azure CLI extension for cluster extensions
You will need the k8s-extension Azure CLI extension. Install this by running the following commands:
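A sketch of the install command:

az extension add --name k8s-extension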
If the k8s-extension extension is already installed, you can update it to the latest version using the following
command:
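A sketch of the update command:

az extension update --name k8s-extension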
Create the Dapr extension, which installs Dapr on your AKS or Arc-enabled Kubernetes cluster. For example, for
an AKS cluster:
az k8s-extension create --cluster-type managedClusters \
--cluster-name myAKSCluster \
--resource-group myResourceGroup \
--name myDaprExtension \
--extension-type Microsoft.Dapr
You have the option of allowing Dapr to auto-update its minor version by specifying the
--auto-upgrade-minor-version parameter and setting the value to true :
--auto-upgrade-minor-version true
Configuration settings
The extension enables you to set Dapr configuration options by using the --configuration-settings parameter.
For example, to provision Dapr with high availability (HA) enabled, set the global.ha.enabled parameter to
true :
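A sketch of such a command, reusing the cluster and extension names from the earlier example:

az k8s-extension create --cluster-type managedClusters \
--cluster-name myAKSCluster \
--resource-group myResourceGroup \
--name myDaprExtension \
--extension-type Microsoft.Dapr \
--configuration-settings "global.ha.enabled=true"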
NOTE
If configuration settings are sensitive and need to be protected, for example cert related information, pass the
--configuration-protected-settings parameter and the value will be protected from being read.
ha:
enabled: true
replicaCount: 3
disruption:
minimumAvailable: ""
maximumUnavailable: "25%"
prometheus:
enabled: true
port: 9090
mtls:
enabled: true
workloadCertTTL: 24h
allowedClockSkew: 15m
The same command-line argument is used for installing a specific version of Dapr or rolling back to a previous
version. Set --auto-upgrade-minor-version to false and --version to the version of Dapr you wish to install. If
the version parameter is omitted, the extension will install the latest version of Dapr. For example, to use Dapr
X.X.X:
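A sketch of pinning the version, reusing the names from the earlier example (replace X.X.X with the Dapr version you want):

az k8s-extension create --cluster-type managedClusters \
--cluster-name myAKSCluster \
--resource-group myResourceGroup \
--name myDaprExtension \
--extension-type Microsoft.Dapr \
--auto-upgrade-minor-version false \
--version X.X.X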
NOTE
High availability (HA) can be enabled at any time. However, once enabled, disabling it requires deletion and recreation of
the extension. If you aren't sure if high availability is necessary for your use case, we recommend starting with it disabled
to minimize disruption.
To update your Dapr configuration settings, simply recreate the extension with the desired state. For example,
assume we have previously created and installed the extension using the following configuration:
The below JSON is returned, and the error message is captured in the message property.
"statuses": [
{
"code": "InstallationFailed",
"displayStatus": null,
"level": null,
"message": "Error: {failed to install chart from path [] for release [dapr-1]: err [template:
dapr/charts/dapr_sidecar_injector/templates/dapr_sidecar_injector_poddisruptionbudget.yaml:1:17: executing
\"dapr/charts/dapr_sidecar_injector/templates/dapr_sidecar_injector_poddisruptionbudget.yaml\" at
<.Values.global.ha.enabled>: can't evaluate field enabled in type interface {}]} occurred while doing the
operation : {Installing the extension} on the config",
"time": null
}
],
Troubleshooting Dapr
Troubleshoot Dapr errors via the common Dapr issues and solutions guide.
Next Steps
Once you have successfully provisioned Dapr in your AKS cluster, try deploying a sample application.
Tutorial: Use GitOps with Flux v2 in Azure Arc-
enabled Kubernetes or AKS clusters
6/15/2022 • 29 minutes to read • Edit Online
GitOps with Flux v2 can be enabled in Azure Kubernetes Service (AKS) managed clusters or Azure Arc-enabled
Kubernetes connected clusters as a cluster extension. After the microsoft.flux cluster extension is installed, you
can create one or more fluxConfigurations resources that sync your Git repository sources to the cluster and
reconcile the cluster to the desired state. With GitOps, you can use your Git repository as the source of truth for
cluster configuration and application deployment.
NOTE
Eventually Azure will stop supporting GitOps with Flux v1, so begin using Flux v2 as soon as possible.
This tutorial describes how to use GitOps in a Kubernetes cluster. Before you dive in, take a moment to learn
how GitOps with Flux works conceptually.
IMPORTANT
Add-on Azure management services, like Kubernetes Configuration, are charged when enabled. Costs related to use of
Flux v2 will start to be billed on July 1, 2022. For more information, see Azure Arc pricing.
IMPORTANT
The microsoft.flux extension released major version 1.0.0. This includes the multi-tenancy feature. If you have existing
GitOps Flux v2 configurations that use a previous version of the microsoft.flux extension you can upgrade to the
latest extension manually using the Azure CLI: "az k8s-extension create -g <RESOURCE_GROUP> -c <CLUSTER_NAME> -
n flux --extension-type microsoft.flux -t <CLUSTER_TYPE>" (use "-t connectedClusters" for Arc clusters and "-t
managedClusters" for AKS clusters).
Prerequisites
To manage GitOps through the Azure CLI or the Azure portal, you need the following items.
For Azure Arc-enabled Kubernetes clusters
An Azure Arc-enabled Kubernetes connected cluster that's up and running.
Learn how to connect a Kubernetes cluster to Azure Arc. If you need to connect through an outbound
proxy, then assure you install the Arc agents with proxy settings.
Read and write permissions on the Microsoft.Kubernetes/connectedClusters resource type.
For Azure Kubernetes Service clusters
An MSI-based AKS cluster that's up and running.
IMPORTANT
Ensure that the AKS cluster is created with MSI (not SPN), because the microsoft.flux extension won't
work with SPN-based AKS clusters. For new AKS clusters created with “az aks create”, the cluster will be MSI-based
by default. For already created SPN-based clusters that need to be converted to MSI run “az aks update -g
$RESOURCE_GROUP -n $CLUSTER_NAME --enable-managed-identity”. For more information, refer to managed
identity docs.
az version
az upgrade
Registration of the following Azure service providers. (It's OK to re-register an existing provider.)
Registration is an asynchronous process and should finish within 10 minutes. Use the following code to
monitor the registration process:
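The registration commands weren't captured above; a sketch of the providers typically required for GitOps with Flux v2 (the provider names are assumptions based on the prerequisites described here), followed by a command to check the state of one of them:

az provider register --namespace Microsoft.Kubernetes
az provider register --namespace Microsoft.ContainerService
az provider register --namespace Microsoft.KubernetesConfiguration

# Monitor the registration state
az provider show --namespace Microsoft.KubernetesConfiguration -o table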
Supported regions
GitOps is currently supported in all regions that Azure Arc-enabled Kubernetes supports. See the supported
regions. GitOps is currently supported in a subset of the regions that AKS supports. The GitOps service is adding
new supported regions on a regular cadence.
Network requirements
The GitOps agents require outbound (egress) TCP to the repo source on either port 22 (SSH) or port 443
(HTTPS) to function. The agents also require the following outbound URLs:
Endpoint (DNS): https://<region>.dp.kubernetesconfiguration.azure.com
Description: Data plane endpoint for the agent to push status and fetch configuration information. Depends on <region> (the supported regions mentioned earlier).
To see the list of az CLI extensions installed and their versions, use the following command:
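A sketch of the command:

az extension list -o table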
TIP
For help resolving any errors, see the Flux v2 suggestions in Azure Arc-enabled Kubernetes and GitOps troubleshooting.
IMPORTANT
The demonstration repo is designed to simplify your use of this tutorial and illustrate some key principles. To keep up to
date, the repo can get breaking changes occasionally from version upgrades. These changes won't affect your new
application of this tutorial, only previous tutorial applications that have not been deleted. To learn how to handle these
changes please see the breaking change disclaimer.
In the following example:
The resource group that contains the cluster is flux-demo-rg .
The name of the Azure Arc cluster is flux-demo-arc .
The cluster type is Azure Arc ( -t connectedClusters ), but this example also works with AKS (
-t managedClusters ).
The name of the Flux configuration is cluster-config .
The namespace for configuration installation is cluster-config .
The URL for the public Git repository is https://github.com/Azure/gitops-flux2-kustomize-helm-mt .
The Git repository branch is main .
The scope of the configuration is cluster . This gives the operators permissions to make changes
throughout cluster. To use namespace scope with this tutorial, see the changes needed.
Two kustomizations are specified with names infra and apps . Each is associated with a path in the
repository.
The apps kustomization depends on the infra kustomization. (The infra kustomization must finish
before the apps kustomization runs.)
Set prune=true on both kustomizations. This setting assures that the objects that Flux deployed to the cluster
will be cleaned up if they're removed from the repository or if the Flux configuration or kustomizations are
deleted.
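The full command wasn't captured above; a sketch that matches the parameters just described. The kustomization paths (./infrastructure and ./apps/staging) and the depends_on syntax are assumptions and may need adjusting for your repository layout and CLI version:

az k8s-configuration flux create \
--resource-group flux-demo-rg \
--cluster-name flux-demo-arc \
--cluster-type connectedClusters \
--name cluster-config \
--namespace cluster-config \
--scope cluster \
--url https://github.com/Azure/gitops-flux2-kustomize-helm-mt \
--branch main \
--kustomization name=infra path=./infrastructure prune=true \
--kustomization name=apps path=./apps/staging prune=true depends_on=\["infra"\]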
If the microsoft.flux extension isn't already installed in the cluster, it'll be installed. When the flux configuration
is installed, the initial compliance state may be "Pending" or "Non-compliant" because reconciliation is still on-
going. After a minute you can query the configuration again and see the final compliance state.
'Microsoft.Flux' extension not found on the cluster, installing it now. This may take a few minutes...
'Microsoft.Flux' extension was successfully installed on the cluster
Creating the flux configuration 'cluster-config' in the cluster. This may take a few minutes...
{
"complianceState": "Pending",
... (not shown because of pending status)
}
NAME CREATED AT
alerts.notification.toolkit.fluxcd.io 2022-04-06T17:15:48Z
arccertificates.clusterconfig.azure.com 2022-03-28T21:45:19Z
azureclusteridentityrequests.clusterconfig.azure.com 2022-03-28T21:45:19Z
azureextensionidentities.clusterconfig.azure.com 2022-03-28T21:45:19Z
buckets.source.toolkit.fluxcd.io 2022-04-06T17:15:48Z
connectedclusters.arc.azure.com 2022-03-28T21:45:19Z
customlocationsettings.clusterconfig.azure.com 2022-03-28T21:45:19Z
extensionconfigs.clusterconfig.azure.com 2022-03-28T21:45:19Z
fluxconfigs.clusterconfig.azure.com 2022-04-06T17:15:48Z
gitconfigs.clusterconfig.azure.com 2022-03-28T21:45:19Z
gitrepositories.source.toolkit.fluxcd.io 2022-04-06T17:15:48Z
helmcharts.source.toolkit.fluxcd.io 2022-04-06T17:15:48Z
helmreleases.helm.toolkit.fluxcd.io 2022-04-06T17:15:48Z
helmrepositories.source.toolkit.fluxcd.io 2022-04-06T17:15:48Z
imagepolicies.image.toolkit.fluxcd.io 2022-04-06T17:15:48Z
imagerepositories.image.toolkit.fluxcd.io 2022-04-06T17:15:48Z
imageupdateautomations.image.toolkit.fluxcd.io 2022-04-06T17:15:48Z
kustomizations.kustomize.toolkit.fluxcd.io 2022-04-06T17:15:48Z
providers.notification.toolkit.fluxcd.io 2022-04-06T17:15:48Z
receivers.notification.toolkit.fluxcd.io 2022-04-06T17:15:48Z
volumesnapshotclasses.snapshot.storage.k8s.io 2022-03-28T21:06:12Z
volumesnapshotcontents.snapshot.storage.k8s.io 2022-03-28T21:06:12Z
volumesnapshots.snapshot.storage.k8s.io 2022-03-28T21:06:12Z
websites.extensions.example.com 2022-03-30T23:42:32Z
For an AKS cluster, use the same command but with -t managedClusters replacing -t connectedClusters .
Note that this action does not remove the Flux extension.
Delete the Flux cluster extension
You can delete the Flux extension by using either the CLI or the portal. The delete action removes both the
microsoft.flux extension resource in Azure and the Flux extension objects in the cluster.
If the Flux extension was created automatically when the Flux configuration was first created, the extension
name will be flux .
For an Azure Arc-enabled Kubernetes cluster, use this command:
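A sketch of the delete command, assuming the extension name is flux as described above:

az k8s-extension delete --resource-group <resource-group> --cluster-name <cluster-name> --cluster-type connectedClusters --name flux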
Here's an example for including the Flux image-reflector and image-automation controllers. If the Flux extension
was created automatically when a Flux configuration was first created, the extension name will be flux .
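The command itself wasn't captured above; a sketch, in which the configuration keys for enabling the image controllers are assumptions and should be verified against the extension documentation:

az k8s-extension create --resource-group <resource-group> --cluster-name <cluster-name> \
--cluster-type <connectedClusters or managedClusters> \
--name flux --extension-type microsoft.flux \
--config image-automation-controller.enabled=true image-reflector-controller.enabled=true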
NS="flux-system"
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:kustomize-controller
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:helm-controller
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:source-controller
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:notification-controller
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:image-automation-controller
oc adm policy add-scc-to-user nonroot system:serviceaccount:$NS:image-reflector-controller
For more information on OpenShift guidance for onboarding Flux, refer to the Flux documentation.
Group
az k8s-configuration flux : Commands to manage Flux v2 Kubernetes configurations.
Subgroups:
deployed-object : Commands to see deployed objects associated with Flux v2 Kubernetes
configurations.
kustomization : Commands to manage Kustomizations associated with Flux v2 Kubernetes
configurations.
Commands:
create : Create a Flux v2 Kubernetes configuration.
delete : Delete a Flux v2 Kubernetes configuration.
list : List all Flux v2 Kubernetes configurations.
show : Show a Flux v2 Kubernetes configuration.
update : Update a Flux v2 Kubernetes configuration.
Here are the parameters for the k8s-configuration flux create CLI command:
Command
az k8s-configuration flux create : Create a Flux v2 Kubernetes configuration.
Arguments
--cluster-name -c [Required] : Name of the Kubernetes cluster.
--cluster-type -t [Required] : Specify Arc connected clusters or AKS managed clusters.
Allowed values: connectedClusters, managedClusters.
--name -n [Required] : Name of the flux configuration.
--resource-group -g [Required] : Name of resource group. You can configure the default group
using `az configure --defaults group=<name>`.
--url -u [Required] : URL of the source to reconcile.
--bucket-insecure : Communicate with a bucket without TLS. Allowed values: false,
true.
--bucket-name : Name of the S3 bucket to sync.
--interval --sync-interval : Time between reconciliations of the source on the cluster.
--kind : Source kind to reconcile. Allowed values: bucket, git.
Default: git.
--kustomization -k : Define kustomizations to sync sources with parameters ['name',
'path', 'depends_on', 'timeout', 'sync_interval',
'retry_interval', 'prune', 'force'].
--namespace --ns : Namespace to deploy the configuration. Default: default.
--no-wait : Do not wait for the long-running operation to finish.
--scope -s : Specify scope of the operator to be 'namespace' or 'cluster'.
Allowed values: cluster, namespace. Default: cluster.
--suspend : Suspend the reconciliation of the source and kustomizations
associated with this configuration. Allowed values: false,
true.
--timeout : Maximum time to reconcile the source before timing out.
Auth Arguments
--local-auth-ref --local-ref : Local reference to a kubernetes secret in the configuration
namespace to use for communication to the source.
Global Arguments
--debug : Increase logging verbosity to show all debug logs.
--help -h : Show this help message and exit.
--only-show-errors : Only show errors, suppressing warnings.
--output -o : Output format. Allowed values: json, jsonc, none, table, tsv,
yaml, yamlc. Default: json.
--query : JMESPath query string. See http://jmespath.org/ for more
information and examples.
--subscription : Name or ID of subscription. You can configure the default
subscription using `az account set -s NAME_OR_ID`.
--verbose : Increase logging verbosity. Use --debug for full debug logs.
Examples
Create a Flux v2 Kubernetes configuration
az k8s-configuration flux create --resource-group my-resource-group \
--cluster-name mycluster --cluster-type connectedClusters \
--name myconfig --scope cluster --namespace my-namespace \
--kind git --url https://github.com/Azure/arc-k8s-demo \
--branch main --kustomization name=my-kustomization
For more information, see the Flux documentation on Git repository checkout strategies.
Public Git repository
Parameter: --ssh-private-key-file
Format: Full path to local file
Notes: Provide the full path to the local file that contains the PEM-format key.
For HTTPS authentication, you create a secret with the username and password :
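A sketch of creating such a secret with kubectl, assuming the secret is named my-custom-secret and lives in the flux-config namespace used in the example below:

kubectl create secret generic my-custom-secret -n flux-config \
--from-literal=username=<your-username> \
--from-literal=password=<your-password-or-PAT>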
For SSH authentication, you create a secret with the identity and known_hosts fields:
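And a sketch of the SSH variant, assuming local files for the private key and known hosts:

kubectl create secret generic my-custom-secret -n flux-config \
--from-file=identity=./id_rsa \
--from-file=known_hosts=./known_hosts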
For both cases, when you create the Flux configuration, use --local-auth-ref my-custom-secret in place of the
other authentication parameters:
az k8s-configuration flux create -g <cluster_resource_group> -c <cluster_name> -n <config_name> -t
connectedClusters --scope cluster --namespace flux-config -u <git-repo-url> --kustomization
name=kustomization1 --local-auth-ref my-custom-secret
Learn more about using a local Kubernetes secret with these authentication methods:
Git repository HTTPS authentication
Git repository HTTPS self-signed certificates
Git repository SSH authentication
Bucket static authentication
NOTE
If you need Flux to access the source through your proxy, you'll need to update the Azure Arc agents with the proxy
settings. For more information, see Connect using an outbound proxy server.
Git implementation
To support various repository providers that implement Git, Flux can be configured to use one of two Git
libraries: go-git or libgit2 . See the Flux documentation for details.
The GitOps implementation of Flux v2 automatically determines which library to use for public cloud
repositories:
For GitHub, GitLab, and BitBucket repositories, Flux uses go-git .
For Azure DevOps and all other repositories, Flux uses libgit2 .
You can also use az k8s-configuration flux kustomization to create, update, list, show, and delete
kustomizations in a Flux configuration:
Group
az k8s-configuration flux kustomization : Commands to manage Kustomizations associated with Flux
v2 Kubernetes configurations.
Commands:
create : Create a Kustomization associated with a Flux v2 Kubernetes configuration.
delete : Delete a Kustomization associated with a Flux v2 Kubernetes configuration.
list : List Kustomizations associated with a Flux v2 Kubernetes configuration.
show : Show a Kustomization associated with a Flux v2 Kubernetes configuration.
update : Update a Kustomization associated with a Flux v2 Kubernetes configuration.
Command
az k8s-configuration flux kustomization create : Create a Kustomization associated with a
Kubernetes Flux v2 Configuration.
Arguments
--cluster-name -c [Required] : Name of the Kubernetes cluster.
--cluster-type -t [Required] : Specify Arc connected clusters or AKS managed clusters.
Allowed values: connectedClusters, managedClusters.
--kustomization-name -k [Required] : Specify the name of the kustomization to target.
--name -n [Required] : Name of the flux configuration.
--resource-group -g [Required] : Name of resource group. You can configure the default
group using `az configure --defaults group=<name>`.
--dependencies --depends --depends-on : Comma-separated list of kustomization dependencies.
--force : Re-create resources that cannot be updated on the
cluster (i.e. jobs). Allowed values: false, true.
--interval --sync-interval : Time between reconciliations of the kustomization on the
cluster.
--no-wait : Do not wait for the long-running operation to finish.
--path : Specify the path in the source that the kustomization
should apply.
--prune : Garbage collect resources deployed by the kustomization
on the cluster. Allowed values: false, true.
--retry-interval : Time between reconciliations of the kustomization on the
cluster on failures, defaults to --sync-interval.
--timeout : Maximum time to reconcile the kustomization before
timing out.
Global Arguments
--debug : Increase logging verbosity to show all debug logs.
--help -h : Show this help message and exit.
--only-show-errors : Only show errors, suppressing warnings.
--output -o : Output format. Allowed values: json, jsonc, none,
table, tsv, yaml, yamlc. Default: json.
--query : JMESPath query string. See http://jmespath.org/ for more
information and examples.
--subscription : Name or ID of subscription. You can configure the
default subscription using `az account set -s
NAME_OR_ID`.
--verbose : Increase logging verbosity. Use --debug for full debug
logs.
Examples
Create a Kustomization associated with a Kubernetes v2 Flux Configuration
az k8s-configuration flux kustomization create --resource-group my-resource-group \
--cluster-name mycluster --cluster-type connectedClusters --name myconfig \
--kustomization-name my-kustomization-2 --path ./my/path --prune --force
TIP
Because of how Helm handles index files, processing Helm charts is an expensive operation and can have a very high
memory footprint. As a result, reconciling a large number of Helm charts in parallel can cause memory spikes and
OOMKilled errors. By default, the source-controller sets its memory limit at 1Gi and its memory request at 64Mi. If you
need to increase this limit and request because of a high number of large Helm chart reconciliations, you can do so by
running the following command after Microsoft.Flux extension installation.
az k8s-extension update -g <resource-group> -c <cluster-name> -n flux -t connectedClusters --config
source-controller.resources.limits.memory=2Gi source-controller.resources.requests.memory=300Mi
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: somename
namespace: somenamespace
annotations:
clusterconfig.azure.com/use-managed-source: "true"
spec:
...
By using this annotation, the HelmRelease that is deployed will be patched with the reference to the configured
source. Note that only GitRepository source is supported for this currently.
Multi-tenancy
Flux v2 supports multi-tenancy in version 0.26. This capability has been integrated into Azure GitOps with Flux
v2.
NOTE
For the multi-tenancy feature you need to know if your manifests contain any cross-namespace sourceRef for
HelmRelease, Kustomization, ImagePolicy, or other objects, or if you use a Kubernetes version less than 1.20.6. To prepare,
take these actions:
Upgrade to Kubernetes version 1.20.6 or greater.
In your Kubernetes manifests assure that all sourceRef are to objects within the same namespace as the GitOps
configuration.
If you need time to update your manifests, you can opt-out of multi-tenancy. However, you still need to
upgrade your Kubernetes version.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: nginx
namespace: nginx
spec:
releaseName: nginx-ingress-controller
chart:
spec:
chart: nginx-ingress-controller
sourceRef:
kind: HelmRepository
name: bitnami
namespace: flux-system
version: "5.6.14"
interval: 1h0m0s
install:
remediation:
retries: 3
# Default values
# https://github.com/bitnami/charts/blob/master/bitnami/nginx-ingress-controller/values.yaml
values:
service:
type: NodePort
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
name: bitnami
namespace: flux-system
spec:
interval: 30m
url: https://charts.bitnami.com/bitnami
By default, the Flux extension will deploy the fluxConfigurations by impersonating the flux-applier service
account that is deployed only in the cluster-config namespace. Using the above manifests, when multi-tenancy
is enabled the HelmRelease would be blocked. This is because the HelmRelease is in the nginx namespace and
is referencing a HelmRepository in the flux-system namespace. Also, the Flux helm-controller cannot apply the
HelmRelease, because there is no flux-applier service account in the nginx namespace.
To work with multi-tenancy, the correct approach is to deploy all Flux objects into the same namespace as the
fluxConfigurations . This avoids the cross-namespace reference issue, and allows the Flux controllers to get the
permissions to apply the objects. Thus, for a GitOps configuration created in the cluster-config namespace, the
above manifests would change to these:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: nginx
namespace: cluster-config
spec:
releaseName: nginx-ingress-controller
targetNamespace: nginx
chart:
spec:
chart: nginx-ingress-controller
sourceRef:
kind: HelmRepository
name: bitnami
namespace: cluster-config
version: "5.6.14"
interval: 1h0m0s
install:
remediation:
retries: 3
# Default values
# https://github.com/bitnami/charts/blob/master/bitnami/nginx-ingress-controller/values.yaml
values:
service:
type: NodePort
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
name: bitnami
namespace: cluster-config
spec:
interval: 30m
url: https://charts.bitnami.com/bitnami
You can also use the Azure portal to view and delete GitOps configurations in Azure Arc-enabled Kubernetes or
AKS clusters.
General information about migration from Flux v1 to Flux v2 is available in the fluxcd project: Migrate from Flux
v1 to v2.
Next steps
Advance to the next tutorial to learn how to implement CI/CD with GitOps.
Implement CI/CD with GitOps
Open Service Mesh AKS add-on
6/15/2022 • 2 minutes to read • Edit Online
Open Service Mesh (OSM) is a lightweight, extensible, cloud native service mesh that allows users to uniformly
manage, secure, and get out-of-the-box observability features for highly dynamic microservice environments.
OSM runs an Envoy-based control plane on Kubernetes and can be configured with SMI APIs. OSM works by
injecting an Envoy proxy as a sidecar container with each instance of your application. The Envoy proxy contains
and executes rules around access control policies, implements routing configuration, and captures metrics. The
control plane continually configures the Envoy proxies to ensure policies and routing rules are up to date and
ensures proxies are healthy.
The OSM project was originated by Microsoft and has since been donated to, and is now governed by, the Cloud Native
Computing Foundation (CNCF).
IMPORTANT
The OSM add-on installs version 1.0.0 of OSM on your cluster.
Example scenarios
OSM can be used to help your AKS deployments in many different ways. For example:
Encrypt communications between service endpoints deployed in the cluster.
Enable traffic authorization of both HTTP/HTTPS and TCP traffic.
Configure weighted traffic controls between two or more services for A/B testing or canary deployments.
Collect and view KPIs from application traffic.
Add-on limitations
The OSM AKS add-on has the following limitations:
Iptables redirection for port IP address and port range exclusion must be enabled using kubectl patch after
installation. For more details, see iptables redirection.
Pods that are onboarded to the mesh and that need access to IMDS, Azure DNS, or the Kubernetes API server
must have their IP addresses added to the global list of excluded outbound IP ranges using Global outbound IP
range exclusions.
Next steps
After enabling the OSM add-on using the Azure CLI or a Bicep template, you can:
Deploy a sample application
Onboard an existing application
Install the Open Service Mesh add-on by using the
Azure CLI
6/15/2022 • 3 minutes to read • Edit Online
This article shows you how to install the Open Service Mesh (OSM) add-on on an Azure Kubernetes Service
(AKS) cluster and verify that it's installed and running.
IMPORTANT
The OSM add-on installs version 1.0.0 of OSM on your cluster.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-addons open-service-mesh
For existing clusters, use az aks enable-addons . The following code shows an example.
IMPORTANT
You can't enable the OSM add-on on an existing cluster if an OSM mesh is already on your cluster. Uninstall any existing
OSM meshes on your cluster before enabling the OSM add-on.
az aks enable-addons \
--resource-group myResourceGroup \
--name myAKSCluster \
--addons open-service-mesh
The following example output shows version 0.11.1 of the OSM mesh:
To verify the status of the OSM components running on your cluster, use kubectl to show the status of the
app.kubernetes.io/name=openservicemesh.io deployments, pods, and services. For example:
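A sketch of the verification command, assuming the add-on components run in the kube-system namespace as shown elsewhere in this collection:

kubectl get deployments,pods,services -n kube-system --selector app.kubernetes.io/name=openservicemesh.io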
IMPORTANT
If any pods have a status other than Running , such as Pending , your cluster might not have enough resources to run
OSM. Review the sizing for your cluster, such as the number of nodes and the virtual machine's SKU, before continuing to
use OSM on your cluster.
To verify the configuration of your OSM mesh, use kubectl get meshconfig . For example:
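For example, assuming the default resource name osm-mesh-config in the kube-system namespace:

kubectl get meshconfig osm-mesh-config -n kube-system -o yaml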
The preceding example shows enablePermissiveTrafficPolicyMode: true , which means OSM has permissive
traffic policy mode enabled. With this mode enabled in your OSM mesh:
The SMI traffic policy enforcement is bypassed.
OSM automatically discovers services that are a part of the service mesh.
OSM creates traffic policy rules on each Envoy proxy sidecar to be able to communicate with these services.
Alternatively, you can uninstall the OSM add-on and the related resources from your cluster. For more
information, see Uninstall the Open Service Mesh add-on from your AKS cluster.
Next steps
This article showed you how to install the OSM add-on on an AKS cluster, and then verify that it's installed and
running. With the OSM add-on installed on your cluster, you can deploy a sample application or onboard an
existing application to work with your OSM mesh.
Deploy the Open Service Mesh add-on by using
Bicep
6/15/2022 • 5 minutes to read • Edit Online
This article shows you how to deploy the Open Service Mesh (OSM) add-on to Azure Kubernetes Service (AKS)
by using a Bicep template.
IMPORTANT
The OSM add-on installs version 1.0.0 of OSM on your cluster.
Bicep is a domain-specific language that uses declarative syntax to deploy Azure resources. You can use Bicep in
place of creating Azure Resource Manager templates to deploy your infrastructure-as-code Azure resources.
Prerequisites
Azure CLI version 2.20.0 or later
OSM version 0.11.1 or later
An SSH public key used for deploying AKS
Visual Studio Code with a Bash terminal
The Visual Studio Code Bicep extension
Install the OSM add-on for a new AKS cluster by using Bicep
For deployment of a new AKS cluster, you enable the OSM add-on at cluster creation. The following instructions
use a generic Bicep template that deploys an AKS cluster by using ephemeral disks and the kubenet container
network interface, and then enables the OSM add-on. For more advanced deployment scenarios, see What is
Bicep?.
Create a resource group
In Azure, you can associate related resources by using a resource group. Create a resource group by using az
group create. The following example creates a resource group named my-osm-bicep-aks-cluster-rg in a
specified Azure location (region):
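A sketch of the command, with the location left as a placeholder:

az group create --name my-osm-bicep-aks-cluster-rg --location <azure-region>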
mkdir bicep-osm-aks-addon
cd bicep-osm-aks-addon
Next, create both the main file and the parameters file, as shown in the following example:
touch osm.aks.bicep && touch osm.aks.parameters.json
Open the osm.aks.bicep file and copy the following example content to it. Then save the file.
// https://docs.microsoft.com/azure/aks/troubleshooting#what-naming-restrictions-are-enforced-for-aks-
resources-and-parameters
@minLength(3)
@maxLength(63)
@description('Provide a name for the AKS cluster. The only allowed characters are letters, numbers, dashes,
and underscore. The first and last character must be a letter or a number.')
param clusterName string
@minLength(3)
@maxLength(54)
@description('Provide a name for the AKS dnsPrefix. Valid characters include alphanumeric values and hyphens
(-). The dnsPrefix can\'t include special characters such as a period (.)')
param clusterDNSPrefix string
param k8Version string
param sshPubKey string
Open the osm.aks.parameters.json file and copy the following example content to it. Add the deployment-
specific parameters, and then save the file.
NOTE
The osm.aks.parameters.json file is an example template parameters file needed for the Bicep deployment. Update the
parameters specifically for your deployment environment. The specific parameter values in this example need the following
parameters to be updated: clusterName , clusterDNSPrefix , k8Version , and sshPubKey . To find a list of supported
Kubernetes versions in your region, use the az aks get-versions --location <region> command.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"clusterName": {
"value": "<YOUR CLUSTER NAME HERE>"
},
"clusterDNSPrefix": {
"value": "<YOUR CLUSTER DNS PREFIX HERE>"
},
"k8Version": {
"value": "<YOUR SUPPORTED KUBERNETES VERSION HERE>"
},
"sshPubKey": {
"value": "<YOUR SSH KEY HERE>"
}
}
}
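The deployment command wasn't captured above; a sketch that deploys the Bicep file with the parameters file, run from the bicep-osm-aks-addon directory:

az deployment group create \
--name my-bicep-osm-aks-cluster \
--resource-group my-osm-bicep-aks-cluster-rg \
--template-file osm.aks.bicep \
--parameters @osm.aks.parameters.json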
When the deployment finishes, you should see a message that says the deployment succeeded.
apiVersion: config.openservicemesh.io/v1alpha1
kind: MeshConfig
metadata:
creationTimestamp: "0000-00-00A00:00:00A"
generation: 1
name: osm-mesh-config
namespace: kube-system
resourceVersion: "2494"
uid: 6c4d67f3-c241-4aeb-bf4f-b029b08faa31
spec:
certificate:
serviceCertValidityDuration: 24h
featureFlags:
enableEgressPolicy: true
enableMulticlusterMode: false
enableWASMStats: true
observability:
enableDebugServer: true
osmLogLevel: info
tracing:
address: jaeger.osm-system.svc.cluster.local
enable: false
endpoint: /api/v2/spans
port: 9411
sidecar:
configResyncInterval: 0s
enablePrivilegedInitContainer: false
envoyImage: mcr.microsoft.com/oss/envoyproxy/envoy:v1.18.3
initContainerImage: mcr.microsoft.com/oss/openservicemesh/init:v0.9.1
logLevel: error
maxDataPlaneConnections: 0
resources: {}
traffic:
enableEgress: true
enablePermissiveTrafficPolicyMode: true
inboundExternalAuthorization:
enable: false
failureModeAllow: false
statPrefix: inboundExtAuthz
timeout: 1s
useHTTPSIngress: false
Notice that enablePermissiveTrafficPolicyMode is configured to true . In OSM, permissive traffic policy mode
bypasses SMI traffic policy enforcement. In this mode, OSM automatically discovers services that are a part of
the service mesh. The discovered services will have traffic policy rules programmed on each Envoy proxy
sidecar to allow communications between these services.
WARNING
Before you proceed, verify that your permissive traffic policy mode is set to true . If it isn't, change it to true by using
the following command:
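A sketch of the patch command, using the osm-mesh-config resource shown above:

kubectl patch meshconfig osm-mesh-config -n kube-system --type=merge -p '{"spec":{"traffic":{"enablePermissiveTrafficPolicyMode":true}}}'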
Clean up resources
When you no longer need the Azure resources, use the Azure CLI to delete the deployment's test resource
group:
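A sketch, using the resource group created earlier:

az group delete --name my-osm-bicep-aks-cluster-rg --yes --no-wait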
Alternatively, you can uninstall the OSM add-on and the related resources from your cluster. For more
information, see Uninstall the Open Service Mesh add-on from your AKS cluster.
Next steps
This article showed you how to install the OSM add-on on an AKS cluster and verify that it's installed and
running. With the OSM add-on installed on your cluster, you can deploy a sample application or onboard an
existing application to work with your OSM mesh.
Download and configure the Open Service Mesh
(OSM) client library
6/15/2022 • 3 minutes to read • Edit Online
This article discusses how to download the OSM client library used to operate and configure the OSM
add-on for AKS, and how to configure the binary for your environment.
Download and install the Open Service Mesh (OSM) client binary
In a bash-based shell on Linux or Windows Subsystem for Linux, use curl to download the OSM release and
then extract with tar as follows:
# Specify the OSM version that will be leveraged throughout these instructions
OSM_VERSION=v1.0.0
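A sketch of the download and extraction, following the release asset naming used for the Windows download later in this article (linux-amd64 is assumed for Linux and Windows Subsystem for Linux):

curl -sL "https://github.com/openservicemesh/osm/releases/download/$OSM_VERSION/osm-$OSM_VERSION-linux-amd64.tar.gz" | tar -vxzf -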
The osm client binary runs on your client machine and allows you to manage OSM in your AKS cluster. Use the
following commands to install the OSM osm client binary in a bash-based shell on Linux or Windows
Subsystem for Linux. These commands copy the osm client binary to the standard user program location in
your PATH .
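A sketch of the copy step, assuming the archive extracted into a linux-amd64 directory and that /usr/local/bin is on your PATH:

sudo mv ./linux-amd64/osm /usr/local/bin/osm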
You can verify the osm client library has been correctly added to your path and its version number with the
following command.
osm version
Download and install the Open Service Mesh (OSM) client binary
In a bash-based shell, use curl to download the OSM release and then extract with tar as follows:
# Specify the OSM version that will be leveraged throughout these instructions
OSM_VERSION=v1.0.0
The osm client binary runs on your client machine and allows you to manage OSM in your AKS cluster. Use the
following commands to install the OSM osm client binary in a bash-based shell. These commands copy the
osm client binary to the standard user program location in your PATH .
osm version
Download and install the Open Service Mesh (OSM) client binary
In a PowerShell-based shell on Windows, use Invoke-WebRequest to download the OSM release and then extract
with Expand-Archive as follows:
# Specify the OSM version that will be leveraged throughout these instructions
$OSM_VERSION="v1.0.0"
[Net.ServicePointManager]::SecurityProtocol = "tls12"
$ProgressPreference = 'SilentlyContinue'; Invoke-WebRequest -URI
"https://github.com/openservicemesh/osm/releases/download/$OSM_VERSION/osm-$OSM_VERSION-windows-amd64.zip" -
OutFile "osm-$OSM_VERSION.zip"
Expand-Archive -Path "osm-$OSM_VERSION.zip" -DestinationPath .
The osm client binary runs on your client machine and allows you to manage the OSM controller in your AKS
cluster. Use the following commands to install the OSM osm client binary in a PowerShell-based shell on
Windows. These commands copy the osm client binary to an OSM folder and then make it available both
immediately (in current shell) and permanently (across shell restarts) via your PATH . You don't need elevated
(Admin) privileges to run these commands and you don't need to restart your shell.
WARNING
Do not attempt to install OSM from the binary using osm install . This will result in an installation of OSM that is not
integrated as an add-on for AKS.
install:
kind: managed
distribution: AKS
namespace: kube-system
If the file is not created at $HOME/.osm/config.yaml , remember to set the OSM_CONFIG environment variable to
point to the path where the config file is created.
After setting OSM_CONFIG, the output of the osm env command should be the following:
$ osm env
---
install:
kind: managed
distribution: AKS
namespace: kube-system
Integrations with Open Service Mesh on Azure
Kubernetes Service (AKS)
6/15/2022 • 2 minutes to read • Edit Online
The Open Service Mesh (OSM) add-on integrates with features provided by Azure as well as open source
projects.
IMPORTANT
Integrations with open source projects aren't covered by the AKS support policy.
Ingress
Ingress allows for traffic external to the mesh to be routed to services within the mesh. With OSM, you can
configure most ingress solutions to work with your mesh, but OSM works best with Web Application Routing,
NGINX ingress, or Contour ingress. Open source projects integrating with OSM, including NGINX ingress and
Contour ingress, aren't covered by the AKS support policy.
Using Azure Gateway Ingress Controller (AGIC) for ingress with OSM isn't supported and not recommended.
Metrics observability
Observability of metrics allows you to view the metrics of your mesh and the deployments in your mesh. With
OSM, you can use Prometheus and Grafana for metrics observability, but those integrations aren't covered by
the AKS support policy.
You can also integrate OSM with Azure Monitor.
Before you can enable metrics on your mesh to integrate with Azure Monitor:
Enable Azure Monitor on your cluster
Enable the OSM add-on for your AKS cluster
Onboard your application namespaces to the mesh
To enable metrics for a namespace in the mesh use osm metrics enable . For example:
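A sketch, using the myappnamespace namespace referenced below:

osm metrics enable --namespace myappnamespace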
Create a ConfigMap in the kube-system namespace that enables Azure Monitor to monitor your namespaces.
For example, create a monitor-configmap.yaml with the following to monitor the myappnamespace :
kind: ConfigMap
apiVersion: v1
data:
schema-version: v1
config-version: ver1
osm-metric-collection-configuration: |-
# OSM metric collection settings
[osm_metric_collection_configuration]
[osm_metric_collection_configuration.settings]
# Namespaces to monitor
monitor_namespaces = ["myappnamespace"]
metadata:
name: container-azm-ms-osmconfig
namespace: kube-system
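A sketch of applying the ConfigMap you just created:

kubectl apply -f monitor-configmap.yaml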
To access your metrics from the Azure portal, select your AKS cluster, then select Logs under Monitoring. From
the Monitoring section, query the InsightsMetrics table to view metrics in the enabled namespaces. For
example, the following query shows the envoy metrics for the myappnamespace namespace.
InsightsMetrics
| where Name contains "envoy"
| extend t=parse_json(Tags)
| where t.app == "myappnamespace"
External authorization
External authorization allows you to offload authorization of HTTP requests to an external service. OSM can use
external authorization by integrating with Open Policy Agent (OPA), but that integration isn't covered by the AKS
support policy.
Certificate management
OSM has several types of certificates it uses to operate on your AKS cluster. OSM includes its own certificate
manager called Tresor, which is used by default. Alternatively, OSM allows you to integrate with Hashicorp Vault
and cert-manager, but those integrations aren't covered by the AKS support policy.
Open Service Mesh (OSM) AKS add-on
Troubleshooting Guides
6/15/2022 • 5 minutes to read • Edit Online
When you deploy the OSM AKS add-on, you might experience problems associated with the configuration of the
service mesh. The following guide will help you troubleshoot errors and resolve common problems.
NOTE
For the osm-controller services the CLUSTER-IP would be different. The service NAME and PORT(S) must be the same as
the example above.
Check for the service and the CA bundle of the Validating webhook
A well-configured Validating Webhook Configuration should look exactly like this:
{
"name": "osm-config-validator",
"namespace": "kube-system",
"path": "/validate-webhook",
"port": 9093
}
Check for the service and the CA bundle of the Mutating webhook
A well-configured Mutating Webhook Configuration should look exactly like this:
{
"name": "osm-injector",
"namespace": "kube-system",
"path": "/mutate-pod-creation",
"port": 9090
}
Check Namespaces
NOTE
The kube-system namespace will never participate in a service mesh and will never be labeled and/or annotated with the
key/values below.
We use the osm namespace add command to join namespaces to a given service mesh. When a k8s namespace is
part of the mesh (or for it to be part of the mesh) the following must be true:
View the annotations with
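A sketch of one way to view them, with the namespace name as a placeholder (assumes jq is installed):

kubectl get namespace <namespace> -o json | jq '.metadata.annotations'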
{
"openservicemesh.io/sidecar-injection": "enabled"
}
{
"openservicemesh.io/monitored-by": "osm"
}
NOTE
After osm namespace add is called only new pods will be injected with an Envoy sidecar. Existing pods must be restarted
with kubectl rollout restart deployment ...
Expected output:
To list the OSM controller pods for a mesh, please run the following command passing in the mesh's namespace
kubectl get pods -n <osm-mesh-namespace> -l app=osm-controller
This article shows you how to uninstall the OSM add-on and related resources from your AKS cluster.
az aks disable-addons \
--resource-group myResourceGroup \
--name myAKSCluster \
--addons open-service-mesh
The above example removes the OSM add-on from the myAKSCluster in myResourceGroup.
IMPORTANT
You must remove these additional resources after you disable the OSM add-on. Leaving these resources on your cluster
may cause issues if you enable the OSM add-on again in the future.
AKS release tracker
6/15/2022 • 2 minutes to read • Edit Online
AKS releases weekly rounds of fixes and feature and component updates that affect all clusters and customers.
However, these releases can take up to two weeks to roll out to all regions from the initial time of shipping due
to Azure Safe Deployment Practices (SDP). It is important for customers to know when a particular AKS release
is hitting their region, and the AKS release tracker provides these details in real time by versions and regions.
The bottom half of the tracker shows the SDP process. The table has two views: one shows the latest version and
status update for each grouping of regions and the other shows the status and region availability of each
currently supported version.
Simplified application autoscaling with Kubernetes
Event-driven Autoscaling (KEDA) add-on (Preview)
6/15/2022 • 2 minutes to read • Edit Online
Kubernetes Event-driven Autoscaling (KEDA) is a single-purpose and lightweight component that strives to
make application autoscaling simple and is a CNCF Incubation project.
It applies event-driven autoscaling to scale your application to meet demand in a sustainable and cost-efficient
manner with scale-to-zero.
The KEDA add-on makes this even easier by deploying a managed KEDA installation, providing a rich
catalog of 50+ KEDA scalers that you can use to scale your applications on your Azure Kubernetes Service (AKS)
cluster.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Architecture
KEDA provides two main components:
KEDA operator allows end-users to scale workloads in/out from 0 to N instances, with support for
Kubernetes Deployments, Jobs, StatefulSets, or any custom resource that defines a /scale subresource.
Metrics server exposes external metrics to the Horizontal Pod Autoscaler (HPA) in Kubernetes for autoscaling
purposes, such as the number of messages in a Kafka topic or the number of events in an Azure event hub. Due to upstream
limitations, KEDA must be the only installed metric adapter.
Learn more about how KEDA works in the official KEDA documentation.
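As an illustration of how these components work together, the following is a minimal ScaledObject sketch that scales a workload based on Azure Service Bus queue length. The deployment name, queue name, and connection environment variable are placeholders, not values from this article.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-scaler
spec:
  scaleTargetRef:
    name: orders-processor            # Deployment to scale (placeholder)
  minReplicaCount: 0                  # scale-to-zero when the queue is empty
  maxReplicaCount: 10
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: orders
      messageCount: "5"               # target messages per replica
      connectionFromEnv: SERVICEBUS_CONNECTION_STRING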
IMPORTANT
The KEDA add-on installs version 2.7.0 of KEDA on your cluster.
Next steps
Enable the KEDA add-on with an ARM template
Enable the KEDA add-on with the Azure CLI
Troubleshoot KEDA add-on problems
Autoscale a .NET Core worker processing Azure Service Bus Queue messages
Install the Kubernetes Event-driven Autoscaling
(KEDA) add-on by using ARM template
6/15/2022 • 2 minutes to read • Edit Online
This article shows you how to deploy the Kubernetes Event-driven Autoscaling (KEDA) add-on to Azure
Kubernetes Service (AKS) by using an ARM template.
IMPORTANT
The KEDA add-on installs version 2.7.0 of KEDA on your cluster.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
Firewall rules are configured to allow access to the Kubernetes API server. (learn more)
Register the AKS-KedaPreview feature flag
To use KEDA, you must enable the AKS-KedaPreview feature flag on your subscription.
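A sketch of registering the flag with the az feature register command (the feature and provider names are as referenced above):
az feature register --namespace "Microsoft.ContainerService" --name "AKS-KedaPreview"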
You can check on the registration status by using the az feature list command:
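For example (the JMESPath query shown is one way to filter for the flag):
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/AKS-KedaPreview')].{Name:name,State:properties.state}"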
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the
az provider register command:
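For example:
az provider register --namespace Microsoft.ContainerService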
"workloadAutoScalerProfile": {
"keda": {
"enabled": true
}
}
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. The
following example gets credentials for the AKS cluster named MyAKSCluster in the MyResourceGroup:
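For example (cluster and resource group names as referenced above):
az aks get-credentials --resource-group MyResourceGroup --name MyAKSCluster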
Example deployment
The following snippet is a sample deployment that creates a cluster with KEDA enabled, using a single node pool
of three Standard_D2S_v5 nodes.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources": [
{
"apiVersion": "2022-05-02-preview",
"dependsOn": [],
"type": "Microsoft.ContainerService/managedClusters",
"location": "westcentralus",
"name": "myAKSCluster",
"properties": {
"kubernetesVersion": "1.23.5",
"enableRBAC": true,
"dnsPrefix": "myAKSCluster",
"agentPoolProfiles": [
{
"name": "agentpool",
"osDiskSizeGB": 200,
"count": 3,
"enableAutoScaling": false,
"vmSize": "Standard_D2S_v5",
"osType": "Linux",
"storageProfile": "ManagedDisks",
"type": "VirtualMachineScaleSets",
"mode": "System",
"maxPods": 110,
"availabilityZones": [],
"nodeTaints": [],
"enableNodePublicIP": false
}
],
"networkProfile": {
"loadBalancerSku": "standard",
"networkPlugin": "kubenet"
},
"workloadAutoScalerProfile": {
"keda": {
"enabled": true
}
}
},
"identity": {
"type": "SystemAssigned"
}
}
]
}
Clean up
To remove the resource group and all related resources, use the az group delete Azure CLI command or the Remove-AzResourceGroup PowerShell cmdlet:
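A minimal sketch with the Azure CLI (the resource group name is a placeholder):
az group delete --name myResourceGroup --yes --no-wait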
This article shows you how to install the Kubernetes Event-driven Autoscaling (KEDA) add-on to Azure
Kubernetes Service (AKS) by using Azure CLI. The article includes steps to verify that it's installed and running.
IMPORTANT
The KEDA add-on installs version 2.7.0 of KEDA on your cluster.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
Firewall rules are configured to allow access to the Kubernetes API server. (learn more)
Install the extension aks-preview
Install the aks-preview Azure CLI extension to make sure you have the latest version of the AKS commands before
installing the KEDA add-on.
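For example:
# Install the aks-preview extension
az extension add --name aks-preview
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview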
You can check on the registration status by using the az feature list command:
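For example (one way to filter for the AKS-KedaPreview flag):
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/AKS-KedaPreview')].{Name:name,State:properties.state}"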
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the
az provider register command:
az provider register --namespace Microsoft.ContainerService
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-keda
For existing clusters, use az aks update with the --enable-keda option. The following code shows an example.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-keda
The following example shows the status of the KEDA add-on for myAKSCluster in myResourceGroup:
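A sketch using az aks show (the query path follows the workloadAutoScalerProfile property used earlier in this document; resource names are placeholders):
az aks show --resource-group myResourceGroup --name myAKSCluster --query "workloadAutoScalerProfile.keda.enabled"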
The following example output shows that the KEDA operator and metrics API server are installed in the AKS
cluster, along with their status.
kubectl get pods -n kube-system
To verify your KEDA version, use kubectl get crd/scaledobjects.keda.sh -o yaml . For example:
The following example output shows the configuration of KEDA in the app.kubernetes.io/version label:
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.8.0
creationTimestamp: "2022-06-08T10:31:06Z"
generation: 1
labels:
addonmanager.kubernetes.io/mode: Reconcile
app.kubernetes.io/component: operator
app.kubernetes.io/name: keda-operator
app.kubernetes.io/part-of: keda-operator
app.kubernetes.io/version: 2.7.0
name: scaledobjects.keda.sh
resourceVersion: "2899"
uid: 85b8dec7-c3da-4059-8031-5954dc888a0b
spec:
conversion:
strategy: None
group: keda.sh
names:
kind: ScaledObject
listKind: ScaledObjectList
plural: scaledobjects
shortNames:
- so
singular: scaledobject
scope: Namespaced
# Redacted for simplicity
While KEDA provides various customization options, the KEDA add-on currently provides basic, common
configuration.
If you require other custom configurations, such as changing which namespaces should be watched or tweaking the
log level, you may edit the KEDA YAML manually and deploy it.
However, when the installation is customized, no support is offered for custom configurations.
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--disable-keda
Next steps
This article showed you how to install the KEDA add-on on an AKS cluster using Azure CLI. The steps to verify
that KEDA add-on is installed and running are included. With the KEDA add-on installed on your cluster, you can
deploy a sample application to start scaling apps.
You can troubleshoot KEDA add-on problems in this article.
Integrations with Kubernetes Event-driven
Autoscaling (KEDA) on Azure Kubernetes Service
(AKS) (Preview)
6/15/2022 • 2 minutes to read • Edit Online
The Kubernetes Event-driven Autoscaling (KEDA) add-on integrates with features provided by Azure and open
source projects.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
IMPORTANT
Integrations with open source projects are not covered by the AKS support policy.
Next steps
Enable the KEDA add-on with an ARM template
Enable the KEDA add-on with the Azure CLI
Troubleshoot KEDA add-on problems
Autoscale a .NET Core worker processing Azure Service Bus Queue messages
Kubernetes Event-driven Autoscaling (KEDA) AKS
add-on Troubleshooting Guides
6/15/2022 • 2 minutes to read • Edit Online
When you deploy the KEDA AKS add-on, you could possibly experience problems associated with configuration
of the application autoscaler.
The following guide will assist you in troubleshooting errors and resolving common problems with the add-on, in
addition to the official KEDA FAQ & troubleshooting guide.
In the metrics server logs, you might notice that it isn't able to start up:
I0607 09:53:05.297924 1 main.go:147] keda_metrics_adapter "msg"="KEDA Version: 2.7.1"
I0607 09:53:05.297979 1 main.go:148] keda_metrics_adapter "msg"="KEDA Commit: "
I0607 09:53:05.297996 1 main.go:149] keda_metrics_adapter "msg"="Go Version: go1.17.9"
I0607 09:53:05.298006 1 main.go:150] keda_metrics_adapter "msg"="Go OS/Arch: linux/amd64"
E0607 09:53:15.344324 1 logr.go:279] keda_metrics_adapter "msg"="Failed to get API Group-Resources"
"error"="Get \"https://10.0.0.1:443/api?timeout=32s\": EOF"
E0607 09:53:15.344360 1 main.go:104] keda_metrics_adapter "msg"="failed to setup manager" "error"="Get
\"https://10.0.0.1:443/api?timeout=32s\": EOF"
E0607 09:53:15.344378 1 main.go:209] keda_metrics_adapter "msg"="making provider" "error"="Get
\"https://10.0.0.1:443/api?timeout=32s\": EOF"
E0607 09:53:15.344399 1 main.go:168] keda_metrics_adapter "msg"="unable to run external metrics adapter"
"error"="Get \"https://10.0.0.1:443/api?timeout=32s\": EOF"
This most likely means that the KEDA add-on isn't able to start up due to a misconfigured firewall.
To make sure it runs correctly, configure the firewall to meet the requirements.
Enabling add-on on clusters with self-managed open-source KEDA installations
While Kubernetes only allows one metric server to be installed, you can in theory install KEDA multiple times.
However, it isn't recommended given only one installation will work.
When the KEDA add-on is installed in an AKS cluster, the previous installation of open-source KEDA will be
overridden and the add-on will take over.
This means that the customization and configuration of the self-installed KEDA deployment will get lost and no
longer be applied.
While there's a possibility that the existing autoscaling will keep on working, it introduces a risk given it will be
configured differently and won't support features such as managed identity.
It's recommended to uninstall any existing KEDA installation before enabling the KEDA add-on, because the add-on
installation will succeed without any error even when open-source KEDA is already installed.
In order to determine which metrics adapter is being used by KEDA, use the kubectl command:
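A sketch, assuming the standard external metrics APIService name:
kubectl get APIService v1beta1.external.metrics.k8s.io -o custom-columns='NAME:.spec.service.name,NAMESPACE:.spec.service.namespace'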
An overview will be provided showing the service and namespace that Kubernetes will use to get metrics:
NAME NAMESPACE
keda-operator-metrics-apiserver kube-system
WARNING
If the namespace is not kube-system , then the AKS add-on is being ignored and another metric server is being used.
Web Application Routing (Preview)
6/15/2022 • 5 minutes to read • Edit Online
The Web Application Routing solution makes it easy to access applications that are deployed to your Azure
Kubernetes Service (AKS) cluster. When the solution's enabled, it configures an Ingress controller in your AKS
cluster, SSL termination, and Open Service Mesh (OSM) for E2E encryption of inter-cluster communication. As
applications are deployed, the solution also creates publicly accessible DNS names for application endpoints.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Limitations
Web Application Routing currently doesn't support named ports in ingress backend.
Prerequisites
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI installed.
An Azure Key Vault containing any application certificates.
A DNS solution.
Install the aks-preview Azure CLI extension
You also need the aks-preview Azure CLI extension version 0.5.75 or later. Install the aks-preview Azure CLI
extension by using the az extension add command. Or install any available updates by using the az extension
update command.
# Install the aks-preview extension
az extension add --name aks-preview
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
TIP
If you want to enable multiple add-ons, provide them as a comma-separated list. For example, to enable Web Application
Routing and monitoring, use the format --enable-addons web_application_routing,monitoring .
You can also enable Web Application Routing on an existing AKS cluster using the az aks enable-addons
command. To enable Web Application Routing on an existing cluster, add the --addons parameter and specify
web_application_routing as shown in the following example:
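For example (cluster and resource group names are placeholders):
az aks enable-addons --resource-group myResourceGroup --name myAKSCluster --addons web_application_routing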
az aks install-cli
To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. The
following example gets credentials for the AKS cluster named myAKSCluster in myResourceGroup:
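For example:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster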
We also need to add the application namespace to the OSM control plane:
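A sketch using the osm namespace add command referenced earlier in this document (the application namespace name is a placeholder):
osm namespace add <app-namespace>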
Grant GET permissions for Web Application Routing to retrieve certificates from Azure Key Vault:
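A sketch with az keyvault set-policy (the vault name matches the example below; the add-on identity's object ID is a placeholder you must look up):
az keyvault set-policy --name myapp-contoso --object-id <web-app-routing-identity-object-id> --secret-permissions get --certificate-permissions get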
These annotations in the service manifest direct Web Application Routing to create an ingress serving
myapp.contoso.com, connected to the key vault myapp-contoso, and to retrieve the certificate keyvault-certificate-name
with revision keyvault-certificate-name-revision.
Create a file named samples-web-app-routing.yaml and copy in the following YAML. On lines 29-31, update
<MY_HOSTNAME> with your DNS host name and <MY_KEYVAULT_URI> with the full certificate vault URI.
apiVersion: apps/v1
kind: Deployment
metadata:
name: aks-helloworld
spec:
replicas: 1
selector:
matchLabels:
app: aks-helloworld
template:
metadata:
labels:
app: aks-helloworld
spec:
containers:
- name: aks-helloworld
image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
ports:
- containerPort: 80
env:
- name: TITLE
value: "Welcome to Azure Kubernetes Service (AKS)"
---
apiVersion: v1
kind: Service
metadata:
name: aks-helloworld
annotations:
kubernetes.azure.com/ingress-host: <MY_HOSTNAME>
kubernetes.azure.com/tls-cert-keyvault-uri: <MY_KEYVAULT_URI>
spec:
type: ClusterIP
ports:
- port: 80
selector:
app: aks-helloworld
deployment.apps/aks-helloworld created
service/aks-helloworld created
Open a web browser to <MY_HOSTNAME>, for example myapp.contoso.com and verify you see the demo
application. The application may take a few minutes to appear.
The Web Application Routing add-on can be removed using the Azure CLI. To do so, run the following command,
substituting your AKS cluster and resource group names.
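For example:
az aks disable-addons --resource-group myResourceGroup --name myAKSCluster --addons web_application_routing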
When the Web Application Routing add-on is disabled, some Kubernetes resources may remain in the cluster.
These resources include configMaps and secrets, and are created in the app-routing-system namespace. To
maintain a clean cluster, you may want to remove these resources.
Clean up
Remove the associated Kubernetes objects created in this article using kubectl delete .
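For example, using the manifest created earlier:
kubectl delete -f samples-web-app-routing.yaml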
Cluster extensions provide an Azure Resource Manager-driven experience for installation and lifecycle
management of services like Azure Machine Learning (ML) on an AKS cluster. This feature enables:
Azure Resource Manager-based deployment of extensions, including at-scale deployments across AKS
clusters.
Lifecycle management of the extension (Update, Delete) from Azure Resource Manager.
In this article, you will learn about:
A conceptual overview of this feature is available in Cluster extensions - Azure Arc-enabled Kubernetes article.
Prerequisites
IMPORTANT
Ensure that your AKS cluster is created with a managed identity, as cluster extensions won't work with service principal-
based clusters.
For new clusters created with az aks create , managed identity is configured by default. For existing service principal-
based clusters that need to be switched over to managed identity, it can be enabled by running az aks update with the
--enable-managed-identity flag. For more information, see Use managed identity.
An Azure subscription. If you don't have an Azure subscription, you can create a free account.
Azure CLI version >= 2.16.0 installed.
NOTE
If you have enabled AAD-based pod identity on your AKS cluster, please add the following AzurePodIdentityException
to the release namespace of your extension instance on the AKS cluster:
apiVersion: aadpodidentity.k8s.io/v1
kind: AzurePodIdentityException
metadata:
name: k8s-extension-exception
namespace: <release-namespace-of-extension>
spec:
podLabels:
clusterconfig.azure.com/managedby: k8s-extension
You will also need the k8s-extension Azure CLI extension. Install this by running the following commands:
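For example:
az extension add --name k8s-extension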
If the k8s-extension extension is already installed, you can update it to the latest version using the following
command:
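For example:
az extension update --name k8s-extension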
Flux (GitOps): Use GitOps with Flux to manage cluster configuration and application deployment.
NOTE
The Cluster Extensions service is unable to retain sensitive information for more than 48 hours. If the cluster extension
agents don't have network connectivity for more than 48 hours and cannot determine whether to create an extension on
the cluster, then the extension transitions to Failed state. Once in Failed state, you will need to run
k8s-extension create again to create a fresh extension instance.
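A minimal sketch of creating an extension instance with az k8s-extension create (resource names and the chosen extension type are placeholders, not requirements):
az k8s-extension create --name azureml --extension-type Microsoft.AzureML.Kubernetes --cluster-type managedClusters --cluster-name myAKSCluster --resource-group myResourceGroup --scope cluster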
Required parameters
PARAMETER NAME DESCRIPTION
--extension-type The type of extension you want to install on the cluster. For
example: Microsoft.AzureML.Kubernetes
--cluster-name Name of the AKS cluster on which the extension instance has
to be created
Optional parameters
PARAMETER NAME DESCRIPTION
--configuration-settings Settings that can be passed into the extension to control its
functionality. They are to be passed in as space separated
key=value pairs after the parameter name. If this
parameter is used in the command, then
--configuration-settings-file can't be used in the
same command.
--configuration-settings-file Path to the JSON file having key value pairs to be used for
passing in configuration settings to the extension. If this
parameter is used in the command, then
--configuration-settings can't be used in the same
command.
--configuration-protected-settings These settings are not retrievable using GET API calls or
az k8s-extension show commands, and are thus used to
pass in sensitive settings. They are to be passed in as space
separated key=value pairs after the parameter name. If
this parameter is used in the command, then
--configuration-protected-settings-file can't be used
in the same command.
--configuration-protected-settings-file Path to the JSON file having key value pairs to be used for
passing in sensitive settings to the extension. If this
parameter is used in the command, then
--configuration-protected-settings can't be used in
the same command.
NOTE
Refer to the documentation of the extension type (for example, Azure ML) to learn about the specific settings under
ConfigurationSetting and ConfigurationProtectedSettings that are allowed to be updated. For
ConfigurationProtectedSettings, all settings are expected to be provided during an update of a single setting. If some
settings are omitted, those settings would be considered obsolete and deleted.
Update an existing extension instance with k8s-extension update , passing in values for the mandatory
parameters. The below command updates the auto-upgrade setting for an Azure Machine Learning extension
instance:
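A sketch (the auto-upgrade flag name is an assumption based on the az k8s-extension CLI; resource names are placeholders):
az k8s-extension update --name azureml --cluster-type managedClusters --cluster-name myAKSCluster --resource-group myResourceGroup --auto-upgrade-minor-version true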
Required parameters
PARAMETER NAME DESCRIPTION
--extension-type The type of extension you want to install on the cluster. For
example: Microsoft.AzureML.Kubernetes
--cluster-name Name of the AKS cluster on which the extension instance has
to be created
Optional parameters
PARAMETER NAME DESCRIPTION
--configuration-settings Settings that can be passed into the extension to control its
functionality. Only the settings that require an update need
to be provided. The provided settings would be replaced
with the provided values. They are to be passed in as space
separated key=value pairs after the parameter name. If
this parameter is used in the command, then
--configuration-settings-file can't be used in the
same command.
--configuration-settings-file Path to the JSON file having key value pairs to be used for
passing in configuration settings to the extension. If this
parameter is used in the command, then
--configuration-settings can't be used in the same
command.
--configuration-protected-settings These settings are not retrievable using GET API calls or
az k8s-extension show commands, and are thus used to
pass in sensitive settings. When updating a setting, all
settings are expected to be provided. If some settings are
omitted, those settings would be considered obsolete and
deleted. They are to be passed in as space separated
key=value pairs after the parameter name. If this
parameter is used in the command, then
--configuration-protected-settings-file can't be used
in the same command.
--configuration-protected-settings-file Path to the JSON file having key value pairs to be used for
passing in sensitive settings to the extension. If this
parameter is used in the command, then
--configuration-protected-settings can't be used in
the same command.
NOTE
The Azure resource representing this extension gets deleted immediately. The Helm release on the cluster associated with
this extension is only deleted when the agents running on the Kubernetes cluster have network connectivity and can
reach out to Azure services again to fetch the desired state.
Delete an extension instance on a cluster with k8s-extension delete , passing in values for the mandatory
parameters.
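For example (resource names are placeholders):
az k8s-extension delete --name azureml --cluster-type managedClusters --cluster-name myAKSCluster --resource-group myResourceGroup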
Azure DevOps Starter presents a simplified experience where you can bring your existing code and Git repo or
choose a sample application to create a continuous integration (CI) and continuous delivery (CD) pipeline to
Azure.
DevOps Starter also:
Automatically creates Azure resources, such as Azure Kubernetes Service (AKS).
Creates and configures a release pipeline in Azure DevOps that sets up a build and release pipeline for CI/CD.
Creates an Azure Application Insights resource for monitoring.
Enables Azure Monitor for containers to monitor performance for the container workloads on the AKS cluster
In this tutorial, you will:
Use DevOps Starter to deploy an ASP.NET Core app to AKS
Configure Azure DevOps and an Azure subscription
Examine the AKS cluster
Examine the CI pipeline
Examine the CD pipeline
Commit changes to Git and automatically deploy them to Azure
Clean up resources
Prerequisites
An Azure subscription. You can get one free through Visual Studio Dev Essentials.
You're now ready to collaborate with a team on your app by using a CI/CD process that automatically deploys
your latest work to your website. Each change to the Git repo starts a build in Azure DevOps, and a CD pipeline
executes a deployment to Azure. Follow the procedure in this section, or use another technique to commit
changes to your repo. For example, you can clone the Git repo in your favorite tool or IDE, and then push
changes to this repo.
1. In the Azure DevOps menu, select Code > Files , and then go to your repo.
2. Go to the Views\Home directory, select the ellipsis (...) next to the Index.cshtml file, and then select Edit .
3. Make a change to the file, such as adding some text within one of the div tags.
4. At the top right, select Commit , and then select Commit again to push your change. After a few
moments, a build starts in Azure DevOps and a release executes to deploy the changes. Monitor the build
status on the DevOps Starter dashboard or in the browser with your Azure DevOps organization.
5. After the release is completed, refresh your app to verify your changes.
Clean up resources
If you are testing, you can avoid accruing billing charges by cleaning up your resources. When they are no
longer needed, you can delete the AKS cluster and related resources that you created in this tutorial. To do so,
use the Delete functionality on the DevOps Starter dashboard.
IMPORTANT
The following procedure permanently deletes resources. The Delete functionality destroys the data that's created by the
project in DevOps Starter in both Azure and Azure DevOps, and you will be unable to retrieve it. Use this procedure only
after you've carefully read the prompts.
Next steps
You can optionally modify these build and release pipelines to meet the needs of your team. You can also use
this CI/CD pattern as a template for your other pipelines. In this tutorial, you learned how to:
Use DevOps Starter to deploy an ASP.NET Core app to AKS
Configure Azure DevOps and an Azure subscription
Examine the AKS cluster
Examine the CI pipeline
Examine the CD pipeline
Commit changes to Git and automatically deploy them to Azure
Clean up resources
To learn more about using the Kubernetes dashboard, see:
Use the Kubernetes dashboard
Deployment Center for Azure Kubernetes
6/15/2022 • 4 minutes to read • Edit Online
Deployment Center in Azure DevOps simplifies setting up a robust Azure DevOps pipeline for your application.
By default, Deployment Center configures an Azure DevOps pipeline to deploy your application updates to the
Kubernetes cluster. You can extend the default configured Azure DevOps pipeline and also add richer capabilities:
the ability to gain approval before deploying, provision additional Azure resources, run scripts, upgrade your
application, and even run more validation tests.
In this tutorial, you will:
Configure an Azure DevOps pipeline to deploy your application updates to the Kubernetes cluster.
Examine the continuous integration (CI) pipeline.
Examine the continuous delivery (CD) pipeline.
Clean up the resources.
Prerequisites
An Azure subscription. You can get one free through Visual Studio Dev Essentials.
An Azure Kubernetes Service (AKS) cluster.
GitHub : Authorize and select the repository for your GitHub account.
4. Deployment Center analyzes the repository and detects your Dockerfile. If you want to update the
Dockerfile, you can edit the identified port number.
If the repository doesn't contain the Dockerfile, the system displays a message to commit one.
5. Select an existing container registry or create one, and then select Finish . The pipeline is created
automatically and queues a build in Azure Pipelines.
Azure Pipelines is a cloud service that you can use to automatically build and test your code project and
make it available to other users. Azure Pipelines combines continuous integration and continuous
delivery to constantly and consistently test and build your code and ship it to any target.
6. Select the link to see the ongoing pipeline.
7. You'll see the successful logs after deployment is complete.
Clean up resources
You can delete the related resources that you created when you don't need them anymore. Use the delete
functionality on the DevOps Projects dashboard.
Next steps
You can modify these build and release pipelines to meet the needs of your team. Or, you can use this CI/CD
model as a template for your other pipelines.
GitHub Actions for deploying to Kubernetes service
6/15/2022 • 9 minutes to read • Edit Online
GitHub Actions gives you the flexibility to build an automated software development lifecycle workflow. You can
use multiple Kubernetes actions to deploy to containers from Azure Container Registry to Azure Kubernetes
Service with GitHub Actions.
Prerequisites
An Azure account with an active subscription. Create an account for free.
A GitHub account. If you don't have one, sign up for free.
A working Kubernetes cluster
Tutorial: Prepare an application for Azure Kubernetes Service
You can create a service principal by using the az ad sp create-for-rbac command in the Azure CLI. You can run
this command using Azure Cloud Shell in the Azure portal or by selecting the Try it button.
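A sketch of the command (the service principal name, subscription ID, and resource group are placeholders; --sdk-auth produces the JSON shape shown below):
az ad sp create-for-rbac --name "myAksGitHubApp" --role contributor --scopes /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP> --sdk-auth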
In the above command, replace the placeholders with your subscription ID, and resource group. The output is
the role assignment credentials that provide access to your resource. The command should output a JSON
object similar to this.
{
"clientId": "<GUID>",
"clientSecret": "<GUID>",
"subscriptionId": "<GUID>",
"tenantId": "<GUID>",
(...)
}
Copy this JSON object, which you can use to authenticate from GitHub.
2. Paste the JSON output of the above Azure CLI command as the value of the secret variable. For example,
AZURE_CREDENTIALS .
3. Similarly, define the following additional secrets for the container registry credentials and set them in
Docker login action.
REGISTRY_USERNAME
REGISTRY_PASSWORD
4. You will see the secrets as shown below once defined.
Build a container image and deploy to Azure Kubernetes Service
cluster
The build and push of the container images is done using azure/docker-login@v1 action.
env:
REGISTRY_NAME: {registry-name}
CLUSTER_NAME: {cluster-name}
CLUSTER_RESOURCE_GROUP: {resource-group-name}
NAMESPACE: {namespace-name}
APP_NAME: {app-name}
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@main
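    # The remaining build steps are not shown in this extract. A sketch follows; the
    # input names below are assumptions based on the azure/docker-login action.
    - uses: azure/docker-login@v1
      with:
        login-server: ${{ env.REGISTRY_NAME }}.azurecr.io
        username: ${{ secrets.REGISTRY_USERNAME }}
        password: ${{ secrets.REGISTRY_PASSWORD }}
    - name: Build and push image to ACR
      run: |
        docker build . -t ${{ env.REGISTRY_NAME }}.azurecr.io/${{ env.APP_NAME }}:${{ github.sha }}
        docker push ${{ env.REGISTRY_NAME }}.azurecr.io/${{ env.APP_NAME }}:${{ github.sha }}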
PARAMETER EXPLANATION
manifests (Required) Path to the manifest files that will be used for deployment
NOTE
The manifest files should be created manually by you. Currently there are no tools that generate such files in an
automated way. For more information, see this sample repository with example manifest files.
Before you can deploy to AKS, you'll need to set target Kubernetes namespace and create an image pull secret.
See Pull images from an Azure container registry to a Kubernetes cluster, to learn more about how pulling
images works.
Complete your deployment with the azure/k8s-deploy@v1 action. Replace the environment variables with values
for your application.
Service principal
Open ID Connect
on: [push]
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@main
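      # The deployment steps are not shown in this extract. A sketch follows; the input
      # names are assumptions based on the azure/aks-set-context, azure/k8s-create-secret,
      # and azure/k8s-deploy actions listed later in this article, and the env variables
      # mirror those defined in the earlier snippet.
      - uses: azure/aks-set-context@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
          cluster-name: ${{ env.CLUSTER_NAME }}
          resource-group: ${{ env.CLUSTER_RESOURCE_GROUP }}
      - uses: azure/k8s-create-secret@v1
        with:
          container-registry-url: ${{ env.REGISTRY_NAME }}.azurecr.io
          container-registry-username: ${{ secrets.REGISTRY_USERNAME }}
          container-registry-password: ${{ secrets.REGISTRY_PASSWORD }}
          secret-name: ${{ env.APP_NAME }}-pull-secret
          namespace: ${{ env.NAMESPACE }}
      - uses: azure/k8s-deploy@v1
        with:
          manifests: |
            manifests/deployment.yml
            manifests/service.yml
          images: ${{ env.REGISTRY_NAME }}.azurecr.io/${{ env.APP_NAME }}:${{ github.sha }}
          imagepullsecrets: ${{ env.APP_NAME }}-pull-secret
          namespace: ${{ env.NAMESPACE }}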
Clean up resources
When your Kubernetes cluster, container registry, and repository are no longer needed, clean up the resources
you deployed by deleting the resource group and your GitHub repository.
Next steps
Learn about Azure Kubernetes Service
Learn how to create multiple pipelines on GitHub Actions with AKS
More Kubernetes GitHub Actions
Kubectl tool installer ( azure/setup-kubectl ): Installs a specific version of kubectl on the runner.
Kubernetes set context ( azure/k8s-set-context ): Set the target Kubernetes cluster context which will be used
by other actions or run any kubectl commands.
AKS set context ( azure/aks-set-context ): Set the target Azure Kubernetes Service cluster context.
Kubernetes create secret ( azure/k8s-create-secret ): Create a generic secret or docker-registry secret in the
Kubernetes cluster.
Kubernetes deploy ( azure/k8s-deploy ): Bake and deploy manifests to Kubernetes clusters.
Setup Helm ( azure/setup-helm ): Install a specific version of Helm binary on the runner.
Kubernetes bake ( azure/k8s-bake ): Bake manifest file to be used for deployments using helm2, kustomize or
kompose.
Kubernetes lint ( azure/k8s-lint ): Validate/lint your manifest files.
Build and deploy to Azure Kubernetes Service with
Azure Pipelines
6/15/2022 • 12 minutes to read • Edit Online
Prerequisites
An Azure account with an active subscription. Create an account for free.
An Azure Resource Manager service connection. Create an Azure Resource Manager service connection.
A GitHub account. Create a free GitHub account if you don't have one already.
https://github.com/MicrosoftDocs/pipelines-javascript-docker
NOTE
If you're using a Microsoft-hosted agent, you must add the IP range of the Microsoft-hosted agent to your firewall. Get
the weekly list of IP ranges from the weekly JSON file, which is published every Wednesday. The new IP ranges become
effective the following Monday. For more information, see Microsoft-hosted agents. To find the IP ranges that are required
for your Azure DevOps organization, learn how to identify the possible IP ranges for Microsoft-hosted agents.
After the pipeline run is finished, explore what happened and then go see your app deployed. From the pipeline
summary:
1. Select the Environments tab.
2. Select View environment .
3. Select the instance of your app for the namespace you deployed to. If you stuck to the defaults we
mentioned above, then it will be the myapp app in the default namespace.
4. Select the Services tab.
5. Select and copy the external IP address to your clipboard.
6. Open a new browser tab or window and enter <IP address>:8080.
If you're building our sample app, then Hello world appears in your browser.
- stage: Build
displayName: Build stage
jobs:
- job: Build
displayName: Build job
pool:
vmImage: $(vmImageName)
steps:
- task: Docker@2
displayName: Build and push an image to container registry
inputs:
command: buildAndPush
repository: $(imageRepository)
dockerfile: $(dockerfilePath)
containerRegistry: $(dockerRegistryServiceConnection)
tags: |
$(tag)
- task: PublishPipelineArtifact@1
inputs:
artifactName: 'manifests'
path: 'manifests'
The deployment job uses the Kubernetes manifest task to create the imagePullSecret required by Kubernetes
cluster nodes to pull from the Azure Container Registry resource. Manifest files are then used by the Kubernetes
manifest task to deploy to the Kubernetes cluster.
- stage: Deploy
displayName: Deploy stage
dependsOn: Build
jobs:
- deployment: Deploy
displayName: Deploy job
pool:
vmImage: $(vmImageName)
environment: 'myenv.aksnamespace' #customize with your environment
strategy:
runOnce:
deploy:
steps:
- task: DownloadPipelineArtifact@2
inputs:
artifactName: 'manifests'
downloadPath: '$(System.ArtifactsDirectory)/manifests'
- task: KubernetesManifest@0
displayName: Create imagePullSecret
inputs:
action: createSecret
secretName: $(imagePullSecret)
namespace: $(k8sNamespace)
dockerRegistryEndpoint: $(dockerRegistryServiceConnection)
- task: KubernetesManifest@0
displayName: Deploy to Kubernetes cluster
inputs:
action: deploy
namespace: $(k8sNamespace)
manifests: |
$(System.ArtifactsDirectory)/manifests/deployment.yml
$(System.ArtifactsDirectory)/manifests/service.yml
imagePullSecrets: |
$(imagePullSecret)
containers: |
$(containerRegistry)/$(imageRepository):$(tag)
Clean up resources
Whenever you're done with the resources you created, you can use the following command to delete them:
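A sketch (the resource group name is a placeholder):
az group delete --name myResourceGroup --yes --no-wait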
Prerequisites
An Azure account with an active subscription. Create an account for free.
An Azure Resource Manager service connection. Create an Azure Resource Manager service connection.
A GitHub account. Create a free GitHub account if you don't have one already.
https://github.com/MicrosoftDocs/pipelines-javascript-docker
Configure authentication
When you use Azure Container Registry (ACR) with Azure Kubernetes Service (AKS), you must establish an
authentication mechanism. This can be achieved in two ways:
1. Grant AKS access to ACR. See Authenticate with Azure Container Registry from Azure Kubernetes Service. A CLI sketch is shown after this list.
2. Use a Kubernetes image pull secret. An image pull secret can be created by using the Kubernetes
deployment task.
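For the first option, a minimal sketch with the Azure CLI (cluster, resource group, and registry names are placeholders):
az aks update --name myAKSCluster --resource-group myResourceGroup --attach-acr myContainerRegistry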
7. Choose + in the Agent job and add another Package and deploy Helm charts task. Configure the
settings for this task as follows:
Kubernetes cluster : Enter or select the AKS cluster you created.
Namespace : Enter your Kubernetes cluster namespace where you want to deploy your
application. Kubernetes supports multiple virtual clusters backed by the same physical cluster.
These virtual clusters are called namespaces. You can use namespaces to create different
environments such as dev, test, and staging in the same cluster.
Command : Select upgrade as the Helm command. You can run any Helm command using this
task and pass in command options as arguments. When you select the upgrade , the task shows
some more fields:
Chart Type : Select File Path . Alternatively, you can specify Chart Name if you want to
specify a URL or a chart name. For example, if the chart name is stable/mysql , the task will
execute helm upgrade stable/mysql
Chart Path : This can be a path to a packaged chart or a path to an unpacked chart
directory. In this example, you're publishing the chart using a CI build, so select the file
package using file picker or enter $(System.DefaultWorkingDirectory)/**/*.tgz
Release Name : Enter a name for your release; for example, azuredevops
Recreate Pods : Tick this checkbox if there is a configuration change during the release and
you want to replace a running pod with the new configuration.
Reset Values : Tick this checkbox if you want the values built into the chart to override all
values provided by the task.
Force : Tick this checkbox if, should conflicts occur, you want to upgrade and rollback to
delete, recreate the resource, and reinstall the full release. This is useful in scenarios where
applying patches can fail (for example, for services because the cluster IP address is
immutable).
Arguments : Enter the Helm command arguments and their values; for this example
--set image.repository=$(imageRepoName) --set image.tag=$(Build.BuildId) See this section
for a description of why we're using these arguments.
Enable TLS : Tick this checkbox to enable strong TLS-based connections between Helm and
Tiller.
CA certificate : Specify a CA certificate to be uploaded and used to issue certificates for
Tiller and Helm client.
Certificate : Specify the Tiller certificate or Helm client certificate
Key : Specify the Tiller Key or Helm client key
8. In the Variables page of the pipeline, add a variable named imageRepoName and set the value to the
name of your Helm image repository. Typically, this is in the format example.azurecr.io/coderepository
9. Save the release pipeline.
The value of $(imageRepoName) was set in the Variables page (or the variables section of your YAML file).
Alternatively, you can directly replace it with your image repository name in the --set arguments value or
values.yaml file. For example:
image:
repository: VALUE_TO_BE_OVERRIDDEN
tag: latest
Another alternative is to set the Set Values option of the task to specify the argument values as comma-
separated key-value pairs.
When you create or manage Azure Kubernetes Service (AKS) clusters, you might occasionally come across
problems. This article details some common problems and troubleshooting steps.
You can mitigate this issue by creating new subnets. Permission to create a new subnet is required for the
mitigation, because an existing subnet's CIDR range can't be updated.
1. Rebuild a new subnet with a larger CIDR range sufficient for operation goals:
a. Create a new subnet with a new desired non-overlapping range.
b. Create a new node pool on the new subnet.
c. Drain pods from the old node pool residing in the old subnet to be replaced.
d. Delete the old subnet and old node pool.
I can't get logs by using kubectl logs or I can't connect to the API
server. I'm getting "Error from server: error dialing backend: dial
tcp…". What should I do?
Ensure ports 22, 9000 and 1194 are open to connect to the API server. Check whether the tunnelfront or
aks-link pod is running in the kube-system namespace using the kubectl get pods --namespace kube-system
command. If it isn't, force deletion of the pod and it will restart.
I'm receiving errors trying to use features that require virtual machine
scale sets
This troubleshooting assistance is directed from aka.ms/aks-vmss-enablement
You may receive errors that indicate your AKS cluster isn't on a virtual machine scale set, such as the following
example:
AgentPool <agentpoolname> has set auto scaling as enabled but isn't on Virtual Machine Scale Sets
Features such as the cluster autoscaler or multiple node pools require virtual machine scale sets as the
vm-set-type .
Follow the Before you begin steps in the appropriate doc to correctly create an AKS cluster:
Use the cluster autoscaler
Create and use multiple node pools
Service returned an error. Status=429 Code=\"OperationNotAllowed\" Message=\"The server rejected the request
because too many requests have been received for this subscription.\" Details=
[{\"code\":\"TooManyRequests\",\"message\":\"
{\\\"operationGroup\\\":\\\"HighCostGetVMScaleSet30Min\\\",\\\"startTime\\\":\\\"2020-09-
20T07:13:55.2177346+00:00\\\",\\\"endTime\\\":\\\"2020-09-
20T07:28:55.2177346+00:00\\\",\\\"allowedRequestCount\\\":1800,\\\"measuredRequestCount\\\":2208}\",\"target
\":\"HighCostGetVMScaleSet30Min\"}] InnerError={\"internalErrorCode\":\"TooManyRequestsReceived\"}"}
apiVersion: v1
kind: Pod
metadata:
name: security-context-demo
spec:
securityContext:
runAsUser: 0
fsGroup: 0
NOTE
By default, gid and uid are mounted as root (0). If gid or uid is set as non-root, for example 1000, Kubernetes will
use chown to change all directories and files under that disk. This operation can be time consuming and may make
mounting the disk very slow.
initContainers:
- name: volume-mount
image: mcr.microsoft.com/dotnet/runtime-deps:6.0
command: ["sh", "-c", "chown -R 100:100 /data"]
volumeMounts:
- name: <your data volume>
mountPath: /data
Mount options can be specified on the storage class object. The following example sets 0777:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: azurefile
provisioner: kubernetes.io/azure-file
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=1000
- gid=1000
- mfsymlinks
- nobrl
- cache=none
parameters:
skuName: Standard_LRS
initdb: could not change permissions of directory "/var/lib/postgresql/data": Operation not permitted
fixing permissions on existing directory /var/lib/postgresql/data
This error is caused by the Azure Files plugin using the cifs/SMB protocol. When using the cifs/SMB protocol,
file and directory permissions can't be changed after mounting.
To resolve this issue, use subPath together with the Azure Disk plugin.
NOTE
For ext3/4 disk type, there is a lost+found directory after the disk is formatted.
Azure Files has high latency compared to Azure Disk when handling many small files
In some cases, such as handling many small files, you may experience higher latency when using Azure Files
compared to Azure Disk.
Error when enabling "Allow access allow access from selected network" setting on storage account
If you enable allow access from selected network on a storage account that's used for dynamic provisioning in
AKS, you'll get an error when AKS creates a file share:
persistentvolume-controller (combined from similar events): Failed to provision volume with StorageClass
"azurefile": failed to create share kubernetes-dynamic-pvc-xxx in account xxx: failed to create file share,
err: storage: service returned error: StatusCode=403, ErrorCode=AuthorizationFailure, ErrorMessage=This
request is not authorized to perform this operation.
This error is because of the Kubernetes persistentvolume-controller not being on the network chosen when
setting allow access from selected network.
You can mitigate the issue by using static provisioning with Azure Files.
Azure Files mount fails because the storage account key changed
If your storage account key has changed, you may see Azure Files mount failures.
You can mitigate this by manually updating the azurestorageaccountkey field in an Azure file secret with
your base64-encoded storage account key.
To encode your storage account key in base64, you can use base64 . For example:
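A minimal sketch (the key itself is a placeholder):
echo -n '<storage-account-key>' | base64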
To update your Azure secret file, use kubectl edit secret . For example:
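A minimal sketch (the secret name and namespace are placeholders):
kubectl edit secret <azure-file-secret-name> --namespace <namespace>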
After a few minutes, the agent node will retry the Azure File mount with the updated storage key.
Cluster autoscaler fails to scale with error failed to fix node group sizes
If your cluster autoscaler isn't scaling up or down and you see an error like the one below in the cluster autoscaler
logs:
E1114 09:58:55.367731 1 static_autoscaler.go:239] Failed to fix node group sizes: failed to decrease aks-
default-35246781-vmss: attempt to delete existing nodes
This error is because of an upstream cluster autoscaler race condition. In such a case, cluster autoscaler ends
with a different value than the one that is actually in the cluster. To get out of this state, disable and re-enable the
cluster autoscaler.
Why do upgrades to Kubernetes 1.16 fail when using node labels with a kubernetes.io prefix
As of Kubernetes 1.16 only a defined subset of labels with the kubernetes.io prefix can be applied by the kubelet
to nodes. AKS cannot remove active labels on your behalf without consent, as it may cause downtime to
impacted workloads.
As a result, to mitigate this issue you can:
1. Upgrade your cluster control plane to 1.16 or higher
2. Add a new node pool on 1.16 or higher without the unsupported kubernetes.io labels
3. Delete the older node pool
AKS is investigating the capability to mutate active labels on a node pool to improve this mitigation.
Connect to Azure Kubernetes Service (AKS) cluster
nodes for maintenance or troubleshooting
6/15/2022 • 4 minutes to read • Edit Online
Throughout the lifecycle of your Azure Kubernetes Service (AKS) cluster, you may need to access an AKS node.
This access could be for maintenance, log collection, or other troubleshooting operations. You can access AKS
nodes using SSH, including Windows Server nodes. You can also connect to Windows Server nodes using
remote desktop protocol (RDP) connections. For security purposes, the AKS nodes aren't exposed to the internet.
To connect to the AKS nodes, you use kubectl debug or the private IP address.
This article shows you how to create a connection to an AKS node.
Use kubectl debug to run a container image on the node to connect to it.
This command starts a privileged container on your node and connects to it.
$ kubectl debug node/aks-nodepool1-12345678-vmss000000 -it --image=mcr.microsoft.com/dotnet/runtime-deps:6.0
Creating debugging pod node-debugger-aks-nodepool1-12345678-vmss000000-bkmmx with container debugger on node
aks-nodepool1-12345678-vmss000000.
If you don't see a command prompt, try pressing enter.
root@aks-nodepool1-12345678-vmss000000:/#
NOTE
You can interact with the node session by running chroot /host from the privileged container.
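The port-forward command referenced below isn't shown in this extract; a minimal sketch (the pod name is a placeholder) would be:
kubectl port-forward <pod-name> 2022:22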
The above example begins forwarding network traffic from port 2022 on your development computer to port
22 on the deployed pod. When using kubectl port-forward to open a connection and forward network traffic,
the connection remains open until you stop the kubectl port-forward command.
Open a new terminal and use kubectl get nodes to show the internal IP address of the Windows Server node:
$ kubectl get nodes -o wide
In the above example, 10.240.0.67 is the internal IP address of the Windows Server node.
Create an SSH connection to the Windows Server node using the internal IP address. The default username for
AKS nodes is azureuser. Accept the prompt to continue with the connection. You are then provided with the command
prompt of your Windows Server node:
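A sketch of such a connection, using the internal IP address from above and the locally forwarded port 2022 as a jump (the exact proxy setup may differ in your environment):
ssh -o 'ProxyCommand ssh -p 2022 -W %h:%p azureuser@127.0.0.1' azureuser@10.240.0.67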
[...]
azureuser@aksnpwin000000 C:\Users\azureuser>
The above example connects to port 22 on the Windows Server node through port 2022 on your development
computer.
NOTE
If you prefer to use password authentication, use -o PreferredAuthentications=password . For example:
Next steps
If you need more troubleshooting data, you can view the kubelet logs or view the Kubernetes master node logs.
Linux Performance Troubleshooting
6/15/2022 • 11 minutes to read • Edit Online
Resource exhaustion on Linux machines is a common issue and can manifest through a wide variety of
symptoms. This document provides a high-level overview of the tools available to help diagnose such issues.
Many of these tools accept an interval on which to produce rolling output. This output format typically makes
spotting patterns much easier. Where accepted, the example invocation will include [interval] .
Many of these tools have an extensive history and wide set of configuration options. This page provides only a
simple subset of invocations to highlight common problems. The canonical source of information is always the
reference documentation for each particular tool. That documentation will be much more thorough than what is
provided here.
Guidance
Be systematic in your approach to investigating performance issues. Two common approaches are USE
(utilization, saturation, errors) and RED (rate, errors, duration). RED is typically used in the context of services for
request-based monitoring. USE is typically used for monitoring resources: for each resource in a machine,
monitor utilization, saturation, and errors. The four main kinds of resources on any machine are cpu, memory,
disk, and network. High utilization, saturation, or error rates for any of these resources indicates a possible
problem with the system. When a problem exists, investigate the root cause: why is disk IO latency high? Are the
disks or virtual machine SKU throttled? What processes are writing to the devices, and to what files?
Some examples of common issues and indicators to diagnose them:
IOPS throttling: use iostat to measure per-device IOPS. Ensure no individual disk is above its limit, and the
sum for all disks is less than the limit for the virtual machine.
Bandwidth throttling: use iostat as for IOPS, but measuring read/write throughput. Ensure both per-device
and aggregate throughput are below the bandwidth limits.
SNAT exhaustion: this can manifest as high active (outbound) connections in SAR.
Packet loss: this can be measured by proxy via TCP retransmit count relative to sent/received count. Both
sar and netstat can show this information.
General
These tools are general purpose and cover basic system information. They are a good starting point for further
investigation.
uptime
$ uptime
19:32:33 up 17 days, 12:36, 0 users, load average: 0.21, 0.77, 0.69
uptime provides system uptime and 1, 5, and 15-minute load averages. These load averages roughly
correspond to threads doing work or waiting for uninterruptible work to complete. In absolute terms these numbers
can be difficult to interpret, but measured over time they can tell us useful information:
1-minute average > 5-minute average means load is increasing.
1-minute average < 5-minute average means load is decreasing.
uptime can also illuminate why information is not available: the issue may have resolved on its own or by a
restart before the user could access the machine.
Load averages higher than the number of CPU threads available may indicate a performance issue with a given
workload.
dmesg
$ dmesg | tail
$ dmesg --level=err | tail
dmesg dumps the kernel buffer. Events like OOMKill add an entry to the kernel buffer. Finding an OOMKill or
other resource exhaustion messages in dmesg logs is a strong indicator of a problem.
top
$ top
Tasks: 249 total, 1 running, 158 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.2 us, 1.3 sy, 0.0 ni, 95.4 id, 1.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 65949064 total, 43415136 free, 2349328 used, 20184600 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 62739060 avail Mem
top provides a broad overview of current system state. The headers provide some useful aggregate
information:
state of tasks: running, sleeping, stopped.
CPU utilization, in this case mostly showing idle time.
total, free, and used system memory.
top may miss short-lived processes; alternatives like htop and atop provide similar interfaces while fixing
some of these shortcomings.
CPU
These tools provide CPU utilization information. This is especially useful with rolling output, where patterns
become easy to spot.
mpstat
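The invocation that produced the output below isn't shown here; a typical form (as a sketch) is:
$ mpstat -P ALL [interval]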
19:49:03 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
19:49:04 all 1.01 0.00 0.63 2.14 0.00 0.13 0.00 0.00 0.00 96.11
19:49:04 0 1.01 0.00 1.01 17.17 0.00 0.00 0.00 0.00 0.00 80.81
19:49:04 1 1.98 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 97.03
19:49:04 2 1.01 0.00 0.00 0.00 0.00 1.01 0.00 0.00 0.00 97.98
19:49:04 3 0.00 0.00 0.99 0.00 0.00 0.99 0.00 0.00 0.00 98.02
19:49:04 4 1.98 0.00 1.98 0.00 0.00 0.00 0.00 0.00 0.00 96.04
19:49:04 5 1.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 98.00
19:49:04 6 1.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 98.00
19:49:04 7 1.98 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 97.03
mpstat prints similar CPU information to top, but broken down by CPU thread. Seeing all cores at once can be
useful for detecting highly imbalanced CPU usage, for example when a single threaded application uses one
core at 100% utilization. This problem may be more difficult to spot when aggregated over all CPUs in the
system.
vmstat
$ vmstat [interval]
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 43300372 545716 19691456 0 0 3 50 3 3 2 1 95 1 0
vmstat provides similar information to mpstat and top , enumerating the number of processes waiting on CPU (the r
column), memory statistics, and the percent of CPU time spent in each work state.
Memory
Memory is a very important, and thankfully easy, resource to track. Some tools can report both CPU and
memory, like vmstat . But tools like free may still be useful for quick debugging.
free
$ free -m
total used free shared buff/cache available
Mem: 64403 2338 42485 1 19579 61223
Swap: 0 0 0
free presents basic information about total memory as well as used and free memory. vmstat may be more
useful even for basic memory analysis due to its ability to provide rolling output.
Disk
These tools measure disk IOPS, wait queues, and total throughput.
iostat
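A typical invocation matching the flags described below:
$ iostat -x -y 1 1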
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await
svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00
sda 0.00 56.00 0.00 65.00 0.00 504.00 15.51 0.01 3.02 0.00 3.02
0.12 0.80
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00
iostat provides deep insights into disk utilization. This invocation passes -x for extended statistics, -y to
skip the initial output printing system averages since boot, and 1 1 to specify we want 1-second interval,
ending after one block of output.
iostat exposes many useful statistics:
r/s and w/s are reads per second and writes per second. The sum of these values is IOPS.
rkB/s and wkB/s are kilobytes read/written per second. The sum of these values is throughput.
await is the average iowait time in milliseconds for queued requests.
avgqu-sz is the average queue size over the provided interval.
On an Azure VM:
the sum of r/s and w/s for an individual block device may not exceed that disk's SKU limits.
the sum of rkB/s and wkB/s for an individual block device may not exceed that disk's SKU limits.
the sum of r/s and w/s for all block devices may not exceed the limits for the VM SKU.
the sum of rkB/s and wkB/s for all block devices may not exceed the limits for the VM SKU.
Note that the OS disk counts as a managed disk of the smallest SKU corresponding to its capacity. For example,
a 1024GB OS Disk corresponds to a P30 disk. Ephemeral OS disks and temporary disks do not have individual
disk limits; they are only limited by the full VM limits.
Non-zero values of await or avgqu-sz are also good indicators of IO contention.
Network
These tools measure network statistics like throughput, transmission failures, and utilization. Deeper analysis
can expose fine-grained TCP statistics about congestion and dropped packets.
sar
sar is a powerful tool for a wide range of analysis. While this example uses its ability to measure network stats,
it is equally powerful for measuring CPU and memory consumption. This example invokes sar with -n flag to
specify the DEV (network device) keyword, displaying network throughput by device.
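For example:
$ sar -n DEV [interval]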
The sum of rxkB/s and txkB/s is the total throughput for a given device. When this value exceeds the limit for
the provisioned Azure NIC, workloads on the machine will experience increased network latency.
%ifutil measures utilization for a given device. As this value approaches 100%, workloads will experience
increased network latency.
$ sar -n TCP,ETCP [interval]
Linux 4.15.0-1064-azure (aks-main-10212767-vmss000001) 02/10/20 _x86_64_ (8 CPU)
This invocation of sar uses the TCP,ETCP keywords to examine TCP connections. The third column of the last
row, "retrans", is the number of TCP retransmits per second. High values for this field indicate an unreliable
network connection. In the first and third rows, "active" means a connection originated from the local device,
while "remote" indicates an incoming connection. A common issue on Azure is SNAT port exhaustion, which
sar can help detect. SNAT port exhaustion would manifest as high "active" values, since the problem is caused by a
high rate of outbound, locally initiated TCP connections.
Because sar takes an interval, it prints rolling output and then a final set of rows containing the averages from
the invocation.
netstat
$ netstat -s
Ip:
71046295 total packets received
78 forwarded
0 incoming packets discarded
71046066 incoming packets delivered
83774622 requests sent out
40 outgoing packets dropped
Icmp:
103 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
destination unreachable: 103
412802 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 412802
IcmpMsg:
InType3: 103
OutType3: 412802
Tcp:
11487089 active connections openings
592 passive connection openings
1137 failed connection attempts
404 connection resets received
17 connections established
70880911 segments received
95242567 segments send out
176658 segments retransmited
3 bad segments received.
163295 resets sent
Udp:
164968 packets received
84 packets to unknown port received.
0 packet receive errors
165082 packets sent
UdpLite:
TcpExt:
5 resets received for embryonic SYN_RECV sockets
1670559 TCP sockets finished time wait in fast timer
95 packets rejects in established connections because of timestamp
756870 delayed acks sent
2236 delayed acks further delayed because of locked socket
Quick ack mode was activated 479 times
11983969 packet headers predicted
25061447 acknowledgments not containing data payload received
5596263 predicted acknowledgments
19 times recovered from packet loss by selective acknowledgements
Detected reordering 114 times using SACK
Detected reordering 4 times using time stamp
5 congestion windows fully recovered without slow start
1 congestion windows partially recovered using Hoe heuristic
5 congestion windows recovered without slow start by DSACK
111 congestion windows recovered without slow start after partial ack
73 fast retransmits
26 retransmits in slow start
311 other TCP timeouts
TCPLossProbes: 198845
TCPLossProbeRecovery: 147
480 DSACKs sent for old packets
175310 DSACKs received
316 connections reset due to unexpected data
272 connections reset due to early user close
5 connections aborted due to timeout
TCPDSACKIgnoredNoUndo: 8498
TCPSpuriousRTOs: 1
TCPSackShifted: 3
TCPSackMerged: 9
TCPSackShiftFallback: 177
IPReversePathFilter: 4
TCPRcvCoalesce: 1501457
TCPOFOQueue: 9898
TCPChallengeACK: 342
TCPSYNChallenge: 3
TCPSpuriousRtxHostQueues: 17
TCPAutoCorking: 2315642
TCPFromZeroWindowAdv: 483
TCPToZeroWindowAdv: 483
TCPWantZeroWindowAdv: 115
TCPSynRetrans: 885
TCPOrigDataSent: 51140171
TCPHystartTrainDetect: 349
TCPHystartTrainCwnd: 7045
TCPHystartDelayDetect: 26
TCPHystartDelayCwnd: 862
TCPACKSkippedPAWS: 3
TCPACKSkippedSeq: 4
TCPKeepAlive: 62517
IpExt:
InOctets: 36416951539
OutOctets: 41520580596
InNoECTPkts: 86631440
InECT0Pkts: 14
netstat can introspect a wide variety of network statistics; here it is invoked with -s for summary output. Many
fields are useful depending on the issue at hand. One useful field in the TCP section is "failed connection attempts",
which may indicate SNAT port exhaustion or other problems making outbound connections. A high rate of
retransmitted segments (also under the TCP section) may indicate issues with packet delivery.
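To pull out just those counters, a quick filter along these lines can help; the matched strings depend on the exact wording of the local netstat build:
$ netstat -s | grep -i -E 'retrans|failed connection'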
Check for Resource Health events impacting your
AKS cluster (Preview)
6/15/2022 • 2 minutes to read • Edit Online
When running your container workloads on AKS, you want to ensure you can troubleshoot and fix problems as
soon as they arise to minimize the impact on the availability of your workloads. Azure Resource Health gives
you visibility into various health events that may cause your AKS cluster to be unavailable.
IMPORTANT
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and
they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer
support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the
following support articles:
AKS support policies
Azure support FAQ
Troubleshooting Azure Kubernetes Service (AKS) cluster issues plays an important role in maintaining your
cluster, especially if your cluster is running mission-critical workloads. AKS Diagnostics is an intelligent, self-
diagnostic experience that:
Helps you identify and resolve problems in your cluster.
Is cloud-native.
Requires no extra configuration or billing cost.
This feature is now in public preview.
Cluster Insights
The following diagnostic checks are available in Cluster Insights.
Cluster Node Issues
Cluster Node Issues checks for node-related issues that cause your cluster to behave unexpectedly.
Node readiness issues
Node failures
Insufficient resources
Node missing IP configuration
Node CNI failures
Node not found
Node power off
Node authentication failure
Node kube-proxy stale
Create, read, update & delete (CRUD) operations
CRUD Operations checks for any CRUD operations that cause issues in your cluster.
In-use subnet delete operation error
Network security group delete operation error
In-use route table delete operation error
Referenced resource provisioning error
Public IP address delete operation error
Deployment failure due to deployment quota
Operation error due to organization policy
Missing subscription registration
VM extension provisioning error
Subnet capacity
Quota exceeded error
Identity and security management
Identity and Security Management detects authentication and authorization errors that prevent communication
to your cluster.
Node authorization failures
401 errors
403 errors
Next steps
Collect logs to help you further troubleshoot your cluster issues by using AKS Periscope.
Read the triage practices section of the AKS day-2 operations guide.
Post your questions or feedback at UserVoice by adding "[Diag]" in the title.
Azure Policy built-in definitions for Azure
Kubernetes Service
6/15/2022 • 15 minutes to read • Edit Online
This page is an index of Azure Policy built-in policy definitions for Azure Kubernetes Service. For additional
Azure Policy built-ins for other services, see Azure Policy built-in definitions.
The name of each built-in policy definition links to the policy definition in the Azure portal. Use the link in the
Version column to view the source on the Azure Policy GitHub repo.
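The index below can also be generated ad hoc with the Azure CLI; this query is an illustrative sketch, and the JMESPath filter may need adjusting for your environment:
$ az policy definition list --query "[?policyType=='BuiltIn' && contains(displayName, 'Kubernetes')].{name:displayName, version:metadata.version}" -o table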
Initiatives
NAME | DESCRIPTION | POLICIES | VERSION
Policy definitions
Microsoft.ContainerService
Name: Azure Kubernetes Service Private Clusters should be enabled
Description: Enable the private cluster feature for your Azure Kubernetes Service cluster to ensure network traffic between your API server and your node pools remains on the private network only. This is a common requirement in many regulatory and industry compliance standards.
Effect(s): Audit, Deny, Disabled
Version: 1.0.0

Name: Azure Policy Add-on for Kubernetes service (AKS) should be installed and enabled on your clusters
Description: Azure Policy Add-on for Kubernetes service (AKS) extends Gatekeeper v3, an admission controller webhook for Open Policy Agent (OPA), to apply at-scale enforcements and safeguards on your clusters in a centralized, consistent manner.
Effect(s): Audit, Disabled
Version: 1.0.2

Name: Both operating systems and data disks in Azure Kubernetes Service clusters should be encrypted by customer-managed keys
Description: Encrypting OS and data disks using customer-managed keys provides more control and greater flexibility in key management. This is a common requirement in many regulatory and industry compliance standards.
Effect(s): Audit, Deny, Disabled
Version: 1.0.0

Name: Deploy Azure Policy Add-on to Azure Kubernetes Service clusters
Description: Use Azure Policy Add-on to manage and report on the compliance state of your Azure Kubernetes Service (AKS) clusters. For more information, see https://aka.ms/akspolicydoc.
Effect(s): DeployIfNotExists, Disabled
Version: 4.0.0

Name: Ensure cluster containers have readiness or liveness probes configured
Description: This policy enforces that all pods have a readiness and/or liveness probes configured. Probe Types can be any of tcpSocket, httpGet and exec. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For instructions on using this policy, visit https://aka.ms/kubepolicydoc.
Effect(s): Audit, Deny, Disabled
Version: 1.1.0

Name: Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits
Description: Enforce container CPU and memory resource limits to prevent resource exhaustion attacks in a Kubernetes cluster. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 7.2.0

Name: Kubernetes cluster containers should not share host process ID or host IPC namespace
Description: Block pod containers from sharing the host process ID namespace and host IPC namespace in a Kubernetes cluster. This recommendation is part of CIS 5.2.2 and CIS 5.2.3 which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.2.0

Name: Kubernetes cluster containers should not use forbidden sysctl interfaces
Description: Containers should not use forbidden sysctl interfaces in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 5.1.0

Name: Kubernetes cluster containers should only use allowed AppArmor profiles
Description: Containers should only use allowed AppArmor profiles in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster containers should only use allowed capabilities
Description: Restrict the capabilities to reduce the attack surface of containers in a Kubernetes cluster. This recommendation is part of CIS 5.2.8 and CIS 5.2.9 which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster containers should only use allowed images
Description: Use images from trusted registries to reduce the Kubernetes cluster's exposure risk to unknown vulnerabilities, security issues and malicious images. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 7.1.0

Name: Kubernetes cluster containers should only use allowed ProcMountType
Description: Pod containers can only use allowed ProcMountTypes in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 6.3.0

Name: Kubernetes cluster containers should only use allowed seccomp profiles
Description: Pod containers can only use allowed seccomp profiles in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster containers should run with a read only root file system
Description: Run containers with a read only root file system to protect from changes at run-time with malicious binaries being added to PATH in a Kubernetes cluster. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster pod FlexVolume volumes should only use allowed drivers
Description: Pod FlexVolume volumes should only use allowed drivers in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.1.0

Name: Kubernetes cluster pod hostPath volumes should only use allowed host paths
Description: Limit pod HostPath volume mounts to the allowed host paths in a Kubernetes Cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster pods and containers should only run with approved user and group IDs
Description: Control the user, primary group, supplemental group and file system group IDs that pods and containers can use to run in a Kubernetes Cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster pods and containers should only use allowed SELinux options
Description: Pods and containers should only use allowed SELinux options in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 5.2.0

Name: Kubernetes cluster pods should only use allowed volume types
Description: Pods can only use allowed volume types in a Kubernetes cluster. This recommendation is part of Pod Security Policies which are intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.2.0

Name: Kubernetes cluster pods should only use approved host network and port range
Description: Restrict pod access to the host network and the allowable host port range in a Kubernetes cluster. This recommendation is part of CIS 5.2.4 which is intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes cluster pods should use specified labels
Description: Use specified labels to identify the pods in a Kubernetes cluster. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 6.2.0

Name: Kubernetes cluster services should listen only on allowed ports
Description: Restrict services to listen only on allowed ports to secure access to the Kubernetes cluster. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 6.2.0

Name: Kubernetes cluster services should only use allowed external IPs
Description: Use allowed external IPs to avoid the potential attack (CVE-2020-8554) in a Kubernetes cluster. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.1.0

Name: Kubernetes cluster should not allow privileged containers
Description: Do not allow privileged containers creation in a Kubernetes cluster. This recommendation is part of CIS 5.2.1 which is intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 7.2.0

Name: Kubernetes clusters should be accessible only over HTTPS
Description: Use of HTTPS ensures authentication and protects data in transit from network layer eavesdropping attacks. This capability is currently generally available for Kubernetes Service (AKS), and in preview for AKS Engine and Azure Arc enabled Kubernetes. For more info, visit https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 6.1.0

Name: Kubernetes clusters should disable automounting API credentials
Description: Disable automounting API credentials to prevent a potentially compromised Pod resource to run API commands against Kubernetes clusters. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 2.1.0

Name: Kubernetes clusters should not allow container privilege escalation
Description: Do not allow containers to run with privilege escalation to root in a Kubernetes cluster. This recommendation is part of CIS 5.2.5 which is intended to improve the security of your Kubernetes environments. This policy is generally available for Kubernetes Service (AKS), and preview for AKS Engine and Azure Arc enabled Kubernetes. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 4.2.0

Name: Kubernetes clusters should not grant CAP_SYS_ADMIN security capabilities
Description: To reduce the attack surface of your containers, restrict CAP_SYS_ADMIN Linux capabilities. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.3.0

Name: Kubernetes clusters should not use specific security capabilities
Description: Prevent specific security capabilities in Kubernetes clusters to prevent ungranted privileges on the Pod resource. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 3.2.0

Name: Kubernetes clusters should not use the default namespace
Description: Prevent usage of the default namespace in Kubernetes clusters to protect against unauthorized access for ConfigMap, Pod, Secret, Service, and ServiceAccount resource types. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 2.2.0

Name: Kubernetes clusters should use internal load balancers
Description: Use internal load balancers to make a Kubernetes service accessible only to applications running in the same virtual network as the Kubernetes cluster. For more information, see https://aka.ms/kubepolicydoc.
Effect(s): audit, Audit, deny, Deny, disabled, Disabled
Version: 6.1.0

Name: Temp disks and cache for agent node pools in Azure Kubernetes Service clusters should be encrypted at host
Description: To enhance data security, the data stored on the virtual machine (VM) host of your Azure Kubernetes Service nodes VMs should be encrypted at rest. This is a common requirement in many regulatory and industry compliance standards.
Effect(s): Audit, Deny, Disabled
Version: 1.0.0
Next steps
See the built-ins on the Azure Policy GitHub repo.
Review the Azure Policy definition structure.
Review Understanding policy effects.
Support policies for Azure Kubernetes Service
6/15/2022 • 9 minutes to read • Edit Online
This article provides details about technical support policies and limitations for Azure Kubernetes Service (AKS).
The article also details agent node management, managed control plane components, third-party open-source
components, and security or patch management.
Shared responsibility
When a cluster is created, you define the Kubernetes agent nodes that AKS creates. Your workloads are executed
on these nodes.
Because your agent nodes execute private code and store sensitive data, Microsoft Support can access them
only in a very limited way. Microsoft Support can't sign in to, execute commands in, or view logs for these nodes
without your express permission or assistance.
Any modification done directly to the agent nodes using any of the IaaS APIs renders the cluster unsupportable.
Any modification done to the agent nodes must be done using kubernetes-native mechanisms such as
Daemon Sets.
Similarly, while you may add any metadata to the cluster and nodes, such as tags and labels, changing any of the
system created metadata will render the cluster unsupported.
NOTE
Any cluster actions taken by Microsoft/AKS are made with user consent under a built-in Kubernetes role aks-service
and built-in role binding aks-service-rolebinding . This role enables AKS to troubleshoot and diagnose cluster issues,
but can't modify permissions nor create roles or role bindings, or other high privilege actions. Role access is only enabled
under active support tickets with just-in-time (JIT) access.
NOTE
Microsoft Support can advise on AKS cluster functionality, customization, and tuning (for example, Kubernetes
operations issues and procedures).
Third-party open-source projects that aren't provided as part of the Kubernetes control plane or deployed
with AKS clusters. These projects might include Istio, Helm, Envoy, or others.
NOTE
Microsoft can provide best-effort support for third-party open-source projects such as Helm. Where the third-
party open-source tool integrates with the Kubernetes Azure cloud provider or other AKS-specific bugs, Microsoft
supports examples and applications from Microsoft documentation.
Third-party closed-source software. This software can include security scanning tools and networking
devices or software.
Network customizations other than the ones listed in the AKS documentation.
Custom or 3rd-party CNI plugins used in BYOCNI mode.
NOTE
If an agent node is not operational, AKS might restart individual components or the entire agent node. These restart
operations are automated and provide auto-remediation for common issues. If you want to know more about the auto-
remediation mechanisms, see Node Auto-Repair
NOTE
AKS agent nodes appear in the Azure portal as regular Azure IaaS resources. But these virtual machines are deployed into
a custom Azure resource group (usually prefixed with MC_*). You cannot change the base OS image or do any direct
customizations to these nodes using the IaaS APIs or resources. Any custom changes that are not done via the AKS API
will not persist through an upgrade, scale, update or reboot. Also any change to the nodes' extensions like the
CustomScriptExtension one can lead to unexpected behavior and should be prohibited. Avoid performing changes to the
agent nodes unless Microsoft Support directs you to make changes.
AKS manages the lifecycle and operations of agent nodes on your behalf; modifying the IaaS resources
associated with the agent nodes is not suppor ted . An example of an unsupported operation is customizing a
node pool virtual machine scale set by manually changing configurations through the virtual machine scale set
portal or API.
For workload-specific configurations or packages, AKS recommends using Kubernetes daemon sets.
Using Kubernetes privileged daemon sets and init containers enables you to tune, modify, or install third-party
software on cluster agent nodes. Examples of such customizations include adding custom security scanning
software or updating sysctl settings.
While this path is recommended if the above requirements apply, AKS engineering and support cannot assist in
troubleshooting or diagnosing modifications that render the node unavailable due to a custom deployed
daemon set.
This article addresses frequent questions about Azure Kubernetes Service (AKS).
Can I provide my own name for the AKS node resource group?
Yes. By default, AKS will name the node resource group MC_resourcegroupname_clustername_location, but you
can also provide your own name.
To specify your own resource group name, install the aks-preview Azure CLI extension version 0.3.2 or later.
When you create an AKS cluster by using the az aks create command, use the --node-resource-group parameter
and specify a name for the resource group. If you use an Azure Resource Manager template to deploy an AKS
cluster, you can define the resource group name by using the nodeResourceGroup property.
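For example, assuming the aks-preview extension noted above is installed (the resource group and cluster names below are placeholders):
$ az aks create --resource-group myResourceGroup --name myAKSCluster --node-resource-group myNodeResourceGroup --node-count 2 --generate-ssh-keys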
The secondary resource group is automatically created by the Azure resource provider in your own
subscription.
You can specify a custom resource group name only when you're creating the cluster.
As you work with the node resource group, keep in mind that you can't:
Specify an existing resource group for the node resource group.
Specify a different subscription for the node resource group.
Change the node resource group name after the cluster has been created.
Specify names for the managed resources within the node resource group.
Modify or delete Azure-created tags of managed resources within the node resource group. (See additional
information in the next section.)
Can I modify tags and other properties of the AKS resources in the
node resource group?
If you modify or delete Azure-created tags and other resource properties in the node resource group, you could
get unexpected results such as scaling and upgrading errors. AKS allows you to create and modify custom tags
created by end users, and you can add those tags when creating a node pool. You might want to create or
modify custom tags, for example, to assign a business unit or cost center. This can also be achieved by creating
Azure Policies with a scope on the managed resource group.
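For example, to create a node pool with custom cost-center tags (the names and tag values below are placeholders):
$ az aks nodepool add --resource-group myResourceGroup --cluster-name myAKSCluster --name taggednp --node-count 1 --tags costcenter=1234 dept=research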
However, modifying any Azure-created tags on resources under the node resource group in the AKS cluster is
an unsupported action, which breaks the service-level objective (SLO). For more information, see Does AKS offer
a service-level agreement?
namespaceSelector:
  matchExpressions:
  - key: control-plane
    operator: DoesNotExist
AKS firewalls the API server egress so your admission controller webhooks need to be accessible from within
the cluster.
I ran an upgrade, but now my pods are in crash loops, and readiness
probes fail?
Confirm your service principal hasn't expired. See: AKS service principal and AKS update credentials.
Can I use the virtual machine scale set APIs to scale manually?
No, scale operations by using the virtual machine scale set APIs aren't supported. Use the AKS APIs
(az aks scale).
Can I use virtual machine scale sets to manually scale to zero nodes?
No, scale operations by using the virtual machine scale set APIs aren't supported. You can use the AKS API to
scale to zero non-system node pools or stop your cluster instead.
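For example, using the AKS CLI (the cluster and node pool names are placeholders):
$ az aks nodepool scale --resource-group myResourceGroup --cluster-name myAKSCluster --name userpool1 --node-count 0
$ az aks stop --resource-group myResourceGroup --name myAKSCluster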
Can I stop or de-allocate all my VMs?
While AKS has resilience mechanisms to withstand such a config and recover from it, this isn't a supported
configuration. Stop your cluster instead.
Does AKS store any customer data outside of the cluster's region?
The feature to enable storing customer data in a single region is currently only available in the Southeast Asia
Region (Singapore) of the Asia Pacific Geo and Brazil South (Sao Paulo State) Region of Brazil Geo. For all other
regions, customer data is stored in Geo.
default via 10.240.0.1 dev azure0 proto dhcp src 10.240.0.4 metric 100
10.240.0.0/12 dev azure0 proto kernel scope link src 10.240.0.4
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
root@k8s-agentpool1-20465682-1:/#
Transparent mode
Transparent mode takes a straight forward approach to setting up Linux networking. In this mode, Azure CNI
won't change any properties of eth0 interface in the Linux VM. This minimal approach of changing the Linux
networking properties helps reduce complex corner case issues that clusters could face with Bridge mode. In
Transparent Mode, Azure CNI will create and add host-side pod veth pair interfaces that will be added to the
host network. Intra VM Pod-to-Pod communication is through ip routes that the CNI will add. Essentially Pod-to-
Pod communication is over layer 3 and pod traffic is routed by L3 routing rules.
Below is an example ip route setup of transparent mode, each Pod's interface will get a static route attached so
that traffic with dest IP as the Pod will be sent directly to the Pod's host side veth pair interface.
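As a purely illustrative sketch (the pod IPs and host-side veth interface names below are placeholders, not captured output), such per-pod routes take the form:
10.240.0.20 dev azv0123456789a proto static
10.240.0.21 dev azvabcdef01234 proto static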
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 0
    fsGroup: 0
The issue has been resolved in Kubernetes v1.20; refer to Kubernetes 1.20: Granular Control of Volume Permission
Changes for more details.